This paper presents a learning-based method for low-level vision problems - estimating underlying scenes from images, which is a combination themes of scene estimation and statistical learning. The estimates of underlying scenes are important for various tasks in image analysis, database search, and robotics.
This approach is called VISTA - Vision by Image/Scene TrAining. It is as follows: one specifies prior probabilities on scenes by generating typical examples, creating a synthetic world of scenes and rendered images. It break the images and scenes into a Markov network, and learn the parameters of the network from the training data by applying belief propagation in the Markov network.
Solving a Markov network involves a learning phase, where the parameters of the network connections are learned from training data, and an inference phase, when the scene corresponding to particular image data is estimated.
This paper applies VISTA to the "super-resolution" problem (estimating high frequency details from a low-resolution image), showing good results.
I think the important thing in this paper is that the power of the VISTA approach lies in the large training database, allowing rich prior probabilities, the selection of scene candidates, which focuses the computation on scenes that render to the image, and the bayesian belief propagation, which allows efficient inference.
沒有留言:
張貼留言