
Latent variable models

A process in nature may be generated by a small set of independent degrees of freedom, but it will usually appear more complex for a number of reasons, including stochastic variation and the measurement process. We consider here the case where the underlying variables are mapped by a fixed transformation into a higher-dimensional variable space (the measurement procedure) and noise is added there (stochastic variation). Such models are commonly known as latent variable models (e.g. factor analysis) and have been used in a number of fields to explain a multivariate process in terms of a few independent variables.

Let us consider a sample of D-dimensional real vectors generated by an unknown distribution. In latent variable modelling [33] we assume that the distribution in data space is actually due to a small number L < D of variables acting in combination, called latent variables or hidden causes. A point in latent space is generated according to a prior distribution and is mapped onto data space by a smooth mapping, which results in an L-dimensional manifold in data space. To extend this to the whole D-dimensional data space we define a noise (error) model. A latent variable model is thus defined by three choices: the prior in latent space, the mapping from latent space to data space, and the noise model in data space. Its parameters are typically estimated by maximum likelihood, usually with the EM algorithm.
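
Concretely, writing x for a point in the L-dimensional latent space and t for a point in the D-dimensional data space (notation introduced here for illustration only), these three choices combine into a density in data space

    p(t) = integral over latent space of p(t|x) p(x) dx,

where p(x) is the prior in latent space and p(t|x) is the noise model centred on the image of x under the mapping. Maximum likelihood adjusts the parameters of the mapping and the noise model so that p(t) matches the observed sample.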

Dimensionality reduction is achieved by defining a reverse mapping from data space onto latent space, so that every data point is assigned a representative in latent space. In latent variable modelling, once the parameters are fixed, Bayes' theorem gives the posterior distribution in latent space given a data vector, i.e. the probability that each point in latent space was responsible for generating that data vector.
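
With the notation above, this posterior follows directly from Bayes' theorem:

    p(x|t) = p(t|x) p(x) / p(t),

and a single reduced-dimension representative of t can then be taken as, for example, the mean or the mode of this posterior.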

The different choices for the functional form of the prior in latent space, the smooth mapping and the noise model give rise to different latent variable models. Factor analysis assumes a Gaussian prior, a linear mapping and a Gaussian noise model with diagonal covariance. Principal component analysis (PCA) is a special case of factor analysis with an isotropic noise model in data space. The generative topographic mapping (GTM) [34] uses a generalised linear mapping (e.g. radial basis functions) together with a uniform prior over latent space, approximated by a finite sample of points, and a Gaussian noise model. Since this prior must be sampled on a grid whose size grows exponentially with dimension, GTM is limited for computational reasons to a low-dimensional (typically two-dimensional) latent space.
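
As an illustration, the following minimal sketch (not from the original text; it assumes NumPy and scikit-learn, whose FactorAnalysis and PCA classes fit these two models) recovers a two-dimensional latent representation of synthetic ten-dimensional data:

    import numpy as np
    from sklearn.decomposition import FactorAnalysis, PCA

    rng = np.random.default_rng(0)
    D, L, N = 10, 2, 500

    # Synthetic data from a linear latent variable model:
    # latent x ~ N(0, I_L), data t = W x + Gaussian noise.
    W = rng.normal(size=(D, L))
    X = rng.normal(size=(N, L))
    T = X @ W.T + 0.1 * rng.normal(size=(N, D))

    # Factor analysis: Gaussian prior, linear mapping, diagonal Gaussian noise.
    fa = FactorAnalysis(n_components=L).fit(T)
    Z_fa = fa.transform(T)    # posterior means of the latent variables

    # PCA: the special case with an isotropic noise model (computed here by SVD).
    pca = PCA(n_components=L).fit(T)
    Z_pca = pca.transform(T)

Here the transform step plays the role of the reverse mapping discussed above: each D-dimensional point is assigned its L-dimensional representative.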

Finite mixtures of latent variable models can be constructed in the usual way as a linear combination of component models, and maximum likelihood estimation can again be conveniently carried out with an EM algorithm. A reduced-dimension representative of a data point can then be obtained as the representative given by the mixture component with the highest responsibility for that point.
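
In symbols (notation again chosen here for illustration), a mixture of M component models has density

    p(t) = sum over m of pi_m p(t|m),    m = 1, ..., M,

with mixing proportions pi_m, and the responsibility of component m for a data point t is

    r_m(t) = pi_m p(t|m) / p(t).

The representative of t is then the one computed by the component maximising r_m(t).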

