Let us consider a sample of *D*-dimensional real vectors that has been
generated by an unknown distribution. In latent variable
modelling [33] we assume that
the distribution in data space is actually due to a small
number *L* < *D* of variables acting in combination, called *latent
variables* or *hidden causes*. Thus, a point in latent
space is generated according to a prior distribution
and it is mapped onto data space by a smooth mapping. This results in
an *L*-dimensional manifold in data space. In
order to extend this to the whole *D*-dimensional data space we define
a noise (error) model. A latent variable model is thus defined by the
prior in latent space, the mapping from latent space to data space and
the noise model in data space. The parameters of such a model are
typically optimised by maximum likelihood, conveniently via the EM
algorithm.
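As an illustrative sketch (not taken from the text), the generative process just described can be written out for the simplest linear-Gaussian case: a Gaussian prior over latent space, a linear mapping into data space, and additive diagonal Gaussian noise. All parameter values below are arbitrary, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

L, D, N = 2, 5, 1000           # latent dim, data dim, sample size

# Model parameters (arbitrary, for illustration only):
W = rng.normal(size=(D, L))    # linear mapping from latent to data space
mu = rng.normal(size=D)        # data-space offset
psi = 0.1 * np.ones(D)         # diagonal noise variances

# Generative process: latent prior -> (here linear) mapping -> noise.
z = rng.normal(size=(N, L))                      # prior: z ~ N(0, I_L)
noise = rng.normal(size=(N, D)) * np.sqrt(psi)   # noise: eps ~ N(0, diag(psi))
x = z @ W.T + mu + noise                         # data: x = W z + mu + eps
```

Without the noise term, the points `x` would lie exactly on an *L*-dimensional (here planar) manifold in data space; the noise model spreads the distribution over the full *D* dimensions.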

Dimensionality reduction is achieved by defining a reverse mapping from data space onto latent space, so that every data point is assigned a representative in latent space. In latent variable modelling, once the parameters are fixed, Bayes' theorem gives the posterior distribution in latent space given a data vector, i.e. the probability that each point in latent space was responsible for generating that data vector.
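For the linear-Gaussian case this posterior is available in closed form: if the data-space covariance implied by the model is C = W Wᵀ + Ψ, the posterior p(**z** | **x**) is Gaussian with mean Wᵀ C⁻¹(**x** − μ) and covariance I − Wᵀ C⁻¹ W. A minimal sketch (parameter values are again arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
L, D = 2, 5
W = rng.normal(size=(D, L))        # linear mapping
mu = rng.normal(size=D)            # data-space offset
Psi = np.diag(0.1 * np.ones(D))    # diagonal noise covariance

x = rng.normal(size=D)             # an observed data vector

# Data-space covariance implied by the model: C = W W^T + Psi.
C = W @ W.T + Psi

# Posterior p(z | x) is Gaussian; its mean is a natural
# reduced-dimension representative of x.
post_mean = W.T @ np.linalg.solve(C, x - mu)
post_cov = np.eye(L) - W.T @ np.linalg.solve(C, W)
```

The posterior mean is the usual choice of latent-space representative for a data point under such models.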

The different choices for the functional form of the prior in latent
space, the smooth mapping and the noise model give rise to
different latent variable models, including *factor analysis*,
which assumes a Gaussian prior, a linear mapping and a Gaussian noise
model (diagonal covariance); *principal component analysis* (PCA),
which is a special case of factor analysis with an isotropic data-space
noise model; and *GTM* (generative topographic
mapping) [34],
which uses a generalised linear mapping (e.g. radial basis functions)
together with a sampled uniform prior over latent space and a Gaussian
noise model. Because the prior must be sampled on a grid, GTM is for
computational reasons limited in practice to a two-dimensional latent
space.
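As an aside not taken from the text, scikit-learn provides implementations of the first two of these models; a minimal comparison on synthetic data with an underlying two-dimensional structure might look as follows:

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(2)

# Synthetic data: 5-dimensional vectors generated from 2 latent variables.
z = rng.normal(size=(500, 2))
W = rng.normal(size=(5, 2))
X = z @ W.T + 0.1 * rng.normal(size=(500, 5))

# Factor analysis: Gaussian prior, linear mapping, diagonal Gaussian noise.
fa = FactorAnalysis(n_components=2).fit(X)
Z_fa = fa.transform(X)     # posterior means as latent representatives

# PCA: the special case with an isotropic data-space noise model.
pca = PCA(n_components=2).fit(X)
Z_pca = pca.transform(X)
```

With near-isotropic noise, as here, the two models recover essentially the same two-dimensional subspace; they differ when the noise variances vary strongly across data dimensions.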

Finite mixtures of latent variable models can be constructed in the usual way as a linear combination of component models, and maximum-likelihood estimation can again be conveniently accomplished with an EM algorithm. A reduced-dimension representative of a data point can then be taken from the mixture component with the highest responsibility for that point.
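This last step can be sketched for a mixture of factor analysers. The code below assumes the mixture has already been fitted; all parameter names and values are hypothetical, chosen only to illustrate computing responsibilities and extracting the representative:

```python
import numpy as np

rng = np.random.default_rng(3)
L, D, K = 2, 5, 3                    # latent dim, data dim, no. of components

# Hypothetical parameters of an already-fitted mixture of factor analysers:
pis = np.ones(K) / K                 # mixing proportions
mus = rng.normal(size=(K, D))        # per-component offsets
Ws = rng.normal(size=(K, D, L))      # per-component linear mappings
Psi = np.diag(0.1 * np.ones(D))      # diagonal noise covariance (shared)

x = rng.normal(size=D)               # a data point to reduce

def log_gauss(x, mean, cov):
    """Log-density of a multivariate Gaussian N(x; mean, cov)."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)

# Responsibilities: r_k proportional to pi_k * N(x; mu_k, W_k W_k^T + Psi).
logp = np.array([
    np.log(pis[k]) + log_gauss(x, mus[k], Ws[k] @ Ws[k].T + Psi)
    for k in range(K)
])
resp = np.exp(logp - logp.max())
resp /= resp.sum()

# Representative: posterior mean under the most responsible component.
k = int(np.argmax(resp))
C = Ws[k] @ Ws[k].T + Psi
z_rep = Ws[k].T @ np.linalg.solve(C, x - mus[k])
```

Working with log-densities, as above, avoids numerical underflow when the component densities at `x` are very small.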