pLSA (probabilistic latent semantic analysis) is an approach to automated document indexing and information retrieval. It models each word in a document as a sample from a mixture model. Each word is generated from a single topic, but different words in the same document may be generated from different topics. Each document is therefore represented as a list of mixing proportions over the mixture components.
pLSA is based on the likelihood principle: it uses a statistical model called the aspect model to define a proper generative model of the data and directly minimizes word perplexity, which gives it a sounder statistical foundation than LSA. pLSA also outperforms LSA in the reported experiments. It uses the EM algorithm to identify the latent classes, and it can deal with both polysemy and synonymy.
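The EM fitting described above can be sketched in a few lines. This is only a minimal illustration, assuming a small dense count matrix and random initialization; the names `plsa`, `normalize`, and `counts` and the toy data are assumptions made for the example, not taken from the paper.

```python
import math
import random


def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit P(w|d) = sum_z P(z|d) P(w|z) by EM.

    counts[d][w] is how often word w occurs in document d.
    Returns (p_z_d, p_w_z, log_likelihoods), where log_likelihoods[i]
    is the log-likelihood of the parameters used in iteration i.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization of both conditional distributions.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    log_likelihoods = []
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z|d) P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                p_w_d = sum(joint)
                log_lik += n * math.log(p_w_d)
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    expected = n * joint[z] / p_w_d
                    new_z_d[d][z] += expected
                    new_w_z[z][w] += expected
        # M-step: renormalize the expected counts.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
        log_likelihoods.append(log_lik)
    return p_z_d, p_w_z, log_likelihoods


# Toy corpus: documents 0-1 use words 0-1, documents 2-3 use words 2-3.
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
p_z_d, p_w_z, log_likelihoods = plsa(counts, n_topics=2)
n_tokens = sum(sum(row) for row in counts)
print("perplexity:", math.exp(-log_likelihoods[-1] / n_tokens))
```

Each row of `p_z_d` holds one document's mixing proportions, and the perplexity printed at the end is the quantity that EM drives down as the log-likelihood rises.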
Tuesday, March 31, 2009
Wednesday, March 25, 2009
[Reading] Shape Matching and Object Recognition Using Shape Contexts
This paper proposes a simple, robust algorithm for finding correspondences between shapes and measuring their similarity, and applies it to object recognition. The approach is a 3-stage process: (1) find correspondences between points on the two shapes, (2) estimate an aligning transformation, and (3) measure similarity. To solve the correspondence problem, the paper introduces a descriptor called the shape context, which records the distribution of the positions of the other points relative to a given point. The transformation is estimated with a regularized thin plate spline model. The shape distance is a weighted sum of the shape context distance, the appearance distance, and the bending energy. Results are presented for handwritten digits, 3D objects, silhouettes, and trademarks.
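To make the descriptor concrete, here is a rough sketch of a shape context as a log-polar histogram of relative point positions. The bin counts, the radius limits, the clamping of far points into the outermost bin, and the chi-squared comparison in `chi2_cost` are assumptions chosen for this illustration, and it expects at least two distinct points.

```python
import math


def shape_context(points, index, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of where the other points lie relative to points[index].

    Returns an n_r x n_theta grid of counts (rows: radius bins, columns: angle bins).
    """
    # Normalize radii by the mean pairwise distance so the descriptor ignores scale.
    pair_dists = [
        math.hypot(q[0] - p[0], q[1] - p[1])
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]
    mean_dist = sum(pair_dists) / len(pair_dists)

    log_inner, log_outer = math.log(r_inner), math.log(r_outer)
    hist = [[0] * n_theta for _ in range(n_r)]
    px, py = points[index]
    for j, (qx, qy) in enumerate(points):
        if j == index:
            continue
        dx, dy = qx - px, qy - py
        # Clamp the normalized radius into [r_inner, r_outer] (a simplification).
        r = min(max(math.hypot(dx, dy) / mean_dist, r_inner), r_outer)
        r_bin = int((math.log(r) - log_inner) / (log_outer - log_inner) * n_r)
        r_bin = min(r_bin, n_r - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[r_bin][t_bin] += 1
    return hist


def chi2_cost(hist_a, hist_b):
    """Chi-squared distance between two shape contexts after normalizing each to sum 1."""
    a = [v for row in hist_a for v in row]
    b = [v for row in hist_b for v in row]
    total_a, total_b = sum(a), sum(b)
    cost = 0.0
    for x, y in zip(a, b):
        pa, pb = x / total_a, y / total_b
        if pa + pb > 0:
            cost += (pa - pb) ** 2 / (pa + pb)
    return 0.5 * cost


points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 5)]
print(chi2_cost(shape_context(points, 0), shape_context(points, 2)))
```

Because only relative positions enter the histogram, translating the whole shape leaves every descriptor unchanged; a matching cost like `chi2_cost` between all point pairs could then feed the correspondence stage.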
[Reading] Contour and Texture Analysis for Image Segmentation
This paper proposes a general algorithm for partitioning grayscale images into disjoint regions of coherent brightness and texture. It uses texture features for segmentation: a texture descriptor is a vector of filter bank outputs, and textons are found by clustering these vectors. Affinities are given by the similarity of texton histograms computed over windows whose size is set by the "local scale" of the texture. Having obtained this local measure, the algorithm uses the spectral graph theoretic framework of normalized cuts to find the partitions.
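As a small illustration of the texton-histogram affinity, the sketch below assumes the texton labels have already been computed (one integer label per pixel) and compares two windows with a chi-squared distance turned into an affinity by an exponential. The fixed square window, the chi-squared form, and the `sigma` value are assumptions for this example; the filter bank, the clustering, and the normalized cuts step are not shown.

```python
import math


def texton_histogram(labels, row, col, radius, n_textons):
    """Normalized histogram of texton labels in a square window centred on (row, col)."""
    hist = [0.0] * n_textons
    count = 0
    for r in range(max(0, row - radius), min(len(labels), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(labels[0]), col + radius + 1)):
            hist[labels[r][c]] += 1
            count += 1
    return [h / count for h in hist]


def chi_squared(h1, h2):
    """Chi-squared distance between two normalized histograms (0 = identical)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def texture_affinity(h1, h2, sigma=0.25):
    """Map the histogram distance to an affinity in (0, 1]."""
    return math.exp(-chi_squared(h1, h2) / sigma)


# Toy label map: texton 0 on the left half, texton 1 on the right half.
labels = [[0] * 4 + [1] * 4 for _ in range(6)]
left = texton_histogram(labels, 2, 1, radius=1, n_textons=2)
right = texton_histogram(labels, 2, 6, radius=1, n_textons=2)
print(texture_affinity(left, left), texture_affinity(left, right))
```

Pixels inside the same texture get an affinity near 1 and pixels across the boundary get one near 0, which is the kind of pairwise weight a graph partitioning method needs.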
Tuesday, March 10, 2009
[Reading] Nonlinear Dimensionality Reduction by Locally Linear Embedding
This paper introduces locally linear embedding (LLE), an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs.
By exploiting the local symmetries of linear reconstructions, LLE recovers global nonlinear structure from locally linear fits, and so it is able to learn the global structure of nonlinear manifolds.
LLE maps high-dimensional data into a single global coordinate system of lower dimensionality. It constructs a neighborhood-preserving mapping from constrained reconstruction weights. Because the weights are chosen to minimize the reconstruction errors, they reflect intrinsic geometric properties of the data that are invariant to rotations, rescalings, and translations.
This approach eliminates the need to estimate pairwise distances between widely separated data points. It also avoids the need to solve large dynamic programming problems.
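The reconstruction-weight step can be sketched as follows for a single point and its neighbors. This is an illustrative sketch only: the neighbors are assumed to be given and not all identical to the point, the regularization term `reg` is an assumption added to keep the small linear system well conditioned, and the final embedding step (an eigenvector problem) is not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def reconstruction_weights(x, neighbors, reg=1e-3):
    """Weights that best reconstruct x from its neighbors, constrained to sum to 1."""
    k = len(neighbors)
    diffs = [[xi - ni for xi, ni in zip(x, nb)] for nb in neighbors]
    # Local Gram matrix of the differences between x and each neighbor.
    gram = [[sum(a * b for a, b in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    for i in range(k):
        gram[i][i] += reg * trace / k  # scale-proportional regularization
    w = solve(gram, [1.0] * k)
    total = sum(w)
    return [wi / total for wi in w]


x = [1.0, 0.5]
neighbors = [[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]]
print(reconstruction_weights(x, neighbors))
```

Rotating, rescaling, or translating the point together with its neighbors leaves these weights essentially unchanged, which is the invariance the embedding relies on.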
[Reading] Eigenfaces for Recognition
Eigenfaces are a set of eigenvectors used in the computer vision problem of human face recognition. They are the eigenvectors of the covariance matrix computed from a large set of normalized face images, that is, the covariance matrix of the probability distribution over the high-dimensional vector space of possible human faces. This approach is an example of principal component analysis (PCA).
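A toy version of the computation, assuming each face is a short flattened list of pixel values: the covariance matrix is built directly and its leading eigenvector is found by power iteration. Power iteration, the random start, the `project` helper, and the 4-pixel "faces" are all assumptions made for illustration; real images would call for a proper linear algebra library.

```python
import math
import random


def mean_face(faces):
    """Pixel-wise mean of a list of flattened face images."""
    return [sum(pixel) / len(faces) for pixel in zip(*faces)]


def covariance(faces):
    """Covariance matrix of the face vectors around the mean face."""
    mu = mean_face(faces)
    centered = [[v - m for v, m in zip(face, mu)] for face in faces]
    dim = len(mu)
    return [
        [sum(c[i] * c[j] for c in centered) / len(faces) for j in range(dim)]
        for i in range(dim)
    ]


def top_eigenface(cov, n_iter=200, seed=0):
    """Leading eigenvector and eigenvalue of cov, found by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() for _ in cov]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    eigenvalue = 0.0
    for _ in range(n_iter):
        mv = [sum(m * x for m, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in mv))
        if norm == 0.0:
            break  # the covariance is zero: there is no variation to capture
        v = [x / norm for x in mv]
        eigenvalue = norm
    return v, eigenvalue


def project(face, mu, eigenface):
    """Coordinate of a face along one eigenface."""
    return sum((f - m) * e for f, m, e in zip(face, mu, eigenface))


# Toy "faces": only the first two pixels vary, and they vary together.
faces = [[1, 1, 5, 5], [2, 2, 5, 5], [3, 3, 5, 5], [4, 4, 5, 5]]
eigenface, eigenvalue = top_eigenface(covariance(faces))
print(eigenface, eigenvalue, project(faces[3], mean_face(faces), eigenface))
```

The leading eigenface points along the direction in which the training faces vary most, and projecting a face onto it gives a single coordinate that could be compared across faces.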
Monday, March 9, 2009
[Reading] Scale & Affine Invariant Interest Point Detectors
This paper proposes an approach for detecting interest points that are invariant to scale and affine transformations. The scale invariant detector computes a multi-scale representation for the Harris interest point detector and then selects points at which a local measure (the Laplacian) is maximal over scales, thereby combining the Harris detector with Laplacian-based scale selection. The paper then extends the scale invariant detector to affine invariance by estimating the affine shape of each point neighborhood. The method modifies the location, scale, and shape of every point neighborhood until it converges to affine invariant points.
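The scale-selection idea can be illustrated in isolation. The sketch below assumes a toy signal, a 2D Gaussian blob with unit peak, for which the scale-normalized Laplacian magnitude at the blob centre has the closed form used in `blob_response`; the function names and the geometric scale grid are assumptions for the example, and the Harris corner measure and the affine adaptation are not shown.

```python
def characteristic_scales(responses, scales):
    """Scales at which the response is a strict local maximum over neighboring scales."""
    return [
        scales[i]
        for i in range(1, len(scales) - 1)
        if responses[i] > responses[i - 1] and responses[i] > responses[i + 1]
    ]


def blob_response(sigma, blob_sigma):
    """Scale-normalized Laplacian magnitude at the centre of a Gaussian blob.

    For a unit-peak blob of standard deviation blob_sigma smoothed at scale
    sigma, the value sigma^2 * |Laplacian| at the centre is
    2 * blob_sigma^2 * sigma^2 / (blob_sigma^2 + sigma^2)^2.
    """
    return 2.0 * blob_sigma ** 2 * sigma ** 2 / (blob_sigma ** 2 + sigma ** 2) ** 2


# Geometric grid of scales; the blob size coincides with the fifth one.
scales = [1.2 ** k for k in range(10)]
responses = [blob_response(s, blob_sigma=scales[4]) for s in scales]
print(characteristic_scales(responses, scales))
```

The response peaks where the analysis scale matches the size of the structure, so the selected scale follows the structure when the image is resized.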
[Reading] Distinctive Image Features from Scale-Invariant Keypoints
This paper presents SIFT, a method for extracting distinctive invariant features from images that provide a basis for object and scene recognition. SIFT is a carefully designed procedure whose parameters were determined empirically to make the features invariant and distinctive.
SIFT has the following four stages (the first two act as a detector, the last two as a descriptor):
(1) Scale-space extrema detection
Use a difference-of-Gaussian (DoG) function to identify potential interest points that are invariant to scale.
(2) Keypoint localization
Detailed fitting for sub-pixel accuracy and further selection based on stability.
(3) Orientation assignment
In short, orientations are assigned from local gradient directions, so the features are orientation invariant.
(4) Keypoint descriptor
Create an array of orientation histograms.
The SIFT keypoints are invariant to image scale and rotation and robust across a substantial range of affine distortion, addition of noise, and change in illumination.
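As an illustration of stage (3), the sketch below builds a magnitude-weighted histogram of gradient directions over a small patch and returns the dominant bin. It is a simplified sketch: the 36-bin default (10 degrees per bin), the central-difference gradients, and the lack of any Gaussian weighting or peak interpolation are assumptions made for brevity, and the name `dominant_orientation` is hypothetical.

```python
import math


def dominant_orientation(patch, n_bins=36):
    """Return (best_bin, histogram) of gradient directions in a 2D patch.

    Each interior pixel votes for the bin of its gradient direction,
    weighted by its gradient magnitude.
    """
    hist = [0.0] * n_bins
    rows, cols = len(patch), len(patch[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences; rows increase downward in image coordinates.
            dx = patch[r][c + 1] - patch[r][c - 1]
            dy = patch[r + 1][c] - patch[r - 1][c]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle / 360.0 * n_bins) % n_bins] += magnitude
    best_bin = max(range(n_bins), key=hist.__getitem__)
    return best_bin, hist


# Brightness increases left to right: the gradient points along +x (bin 0).
ramp_x = [[c for c in range(5)] for _ in range(5)]
# Brightness increases along the diagonal: the gradient points at 45 degrees (bin 4).
ramp_diag = [[r + c for c in range(5)] for r in range(5)]
print(dominant_orientation(ramp_x)[0], dominant_orientation(ramp_diag)[0])
```

Describing the patch relative to its dominant bin is what makes the later descriptor insensitive to image rotation.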