2009年6月19日 星期五

[Reading] Support Vector Learning for Ordinal Regression

"Learning to rank" is automatically creating a ranking function that assigns scores to instances, then rank the instances by using the scores.

This paper formalizes learning to rank as a problem of binary classification, and uses SVM (support vector machine) to learn the binary classifier. This formulation minimizes pair-wise 0-1 loss.

The learned ranking function can be viewed as (1)Ranking function: given an example, output its ranking score. (2)Classifier: given a pair of instances, output their relative ranking.

[Reading] The Structure and Function of Complex Networks

This paper reviews recent work on the structure and function of networked systems such as the Internet, the World Wide Web, social networks, networks of citations between papers, and many others. The study of networks, in the form of mathematical graph theory, is one of the fundamental pillars of discrete mathematics. Networks have also been studied extensively in the social sciences.

This paper mainly describes three parts:
(1) Empirical studies of the structure of networks, including social networks, information networks, technological networks and biological networks.
(2) Some of the common properties that are observed in many of these networks, how they are measured, and why they are believed to be important for the functioning of networked systems.
(3) The mathematical modeling of networks, including random graph models and their generalizations, exponential random graphs, and Markov graphs, the small-world model and its variations, and models of growing graphs including preferential attachment models and their many variations.

[Reading] Lazy Snapping

This paper presents "Lazy Snapping", an interactive image cutout tool, also a novel coarse-to-fine UI design for image cutout. The task in image cutout is in specifying which parts of the image are "foreground" (the part you want to cut out) and which belong to the background.

Lazy Snapping consists of two steps, both are formulated as a graph cut problem:
(1) a quick object marking step
Object marking (at a coarse scale) specifies the object of interest by a few marking lines. This step is intuitive and quick for object context specification. An efficient graph cut algorithm is proposed by employing pre-computed over-segmentation so that the marking UI can provide instant visual feedback for users.
(2) a simple boundary editing step
Boundary editing (at a finer scale or on the zoomed-in image) allows the user to edit the object boundary by simply clicking and dragging polygon vertices, and use the polygon locations as soft constraints to improve snapping results around ambiguous or low contrast edges. This step is easy and efficient for accurate boundary control.

2009年6月5日 星期五

[Reading] Learning Low-Level Vision

This paper presents a learning-based method for low-level vision problems - estimating underlying scenes from images, which is a combination themes of scene estimation and statistical learning. The estimates of underlying scenes are important for various tasks in image analysis, database search, and robotics.

This approach is called VISTA - Vision by Image/Scene TrAining. It is as follows: one specifies prior probabilities on scenes by generating typical examples, creating a synthetic world of scenes and rendered images. It break the images and scenes into a Markov network, and learn the parameters of the network from the training data by applying belief propagation in the Markov network.

Solving a Markov network involves a learning phase, where the parameters of the network connections are learned from training data, and an inference phase, when the scene corresponding to particular image data is estimated.

This paper applies VISTA to the "super-resolution" problem (estimating high frequency details from a low-resolution image), showing good results.

I think the important thing in this paper is that the power of the VISTA approach lies in the large training database, allowing rich prior probabilities, the selection of scene candidates, which focuses the computation on scenes that render to the image, and the bayesian belief propagation, which allows efficient inference.

[Reading] An Introduction to Graphical Models

This technical report gives an introduction to graphical models. it says that graphical models are a marriage between probability theory and graph theory.

graphical models provide a natural tool for dealing with (1) uncertainty and (2) complexity. In particular, graph theory provides the notion of modularity, i.e. a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data.

This graphical model framework provides a way to view several systems as instances of a common underlying formalism, ex: mixture models, factor analysis, hidden Markovmodels, Kalman filters. Probabilistic graphical models are graphs in which nodes represent random variables, and the arcs represent conditional independence assumptions. They provide a compact representation of joint probability distributions.

There are two main kinds of graphical models: undirected and directed. In a directed graphical model (a Bayesian network), an arc from A to B can be informally interpreted as indicating that A causes B. We can see that the conditional independence relationships allow us to represent the joint more compactly.

Talking about inference, its goal is to estimate the values of hidden nodes, given the values of the observed nodes. In particular, we can use the conditional independence assumptions encoded in the graph to speed up exact inference. The key idea of the variable elimination algorithm (and many others) is to "push" the sums in as far as possible. Also, if we wish to compute several marginals at the same time, we can use dynamic programming to avoid the redundant computation that would be involved if we used variable elimination repeatedly. The reason why we use approximate inference is that the running time of these exact algorithms is exponential in the size of the largest cluster , and minimizing it is NP-hard. When it is large, it is necessary to use approximate inference.

2009年6月2日 星期二

[Reading] Rapid Object Detection using a Boosted Cascade of Simple Features

This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. It uses haar features for weak learners, by using the "integral image"
technique, those features can be computed very very quickly. It uses adaboost as learning algorithm. It selects a small number of most important features from a larger set and yields extremely efficient, discriminative classifiers. It propose a "cascade" framework for providing efficiently distiguishing between face and nonface. Overall, this paper propose an approach for object detection which minimizes computation time while achieving high detection accuracy.

[Reading] Normalized Cuts and Image Segmentation

The approach of this paper aims at extracting the global impression of an image and provides a hierarchical description of it. It is most related to the graph theoretic formulation of grouping. By treating the grouping problem (image segmentation) as a graph partitioning problem, this paper proposed the normalized cut criteria for segmenting the graph.

Normalized cut is an unbiased measure of disassociation between subgroups of a graph and it has the nice property that minimizing normalized cut leads directly to maximizing the normalized association, which is an unbiased measure for
total association within the subgroups. it also avoids the problem that unnatural bias for partitioning out small sets of points.

minimizing normalized cut exactly is NP-complete. This paper shows that, when it embed the normalized cut problem in the real value domain, an approximate
discrete solution can be found efficiently. it is formulated as a
generalized eigenvalue problem.