We prove rates of convergence in the statistical sense for kernel-based least
squares regression using a conjugate gradient algorithm, where regularization
against overfitting is obtained by early stopping. This method is directly
related to Kernel Partial Least Squares, a regression method that combines
supervised dimensionality reduction with least squares projection. The rates
depend on two key quantities: first, on the regularity of the target regression
function and second, on the intrinsic dimensionality of the data mapped into
the kernel space.
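
As a rough illustration of early stopping as regularization, the following minimal sketch runs plain conjugate gradient on the kernel system K alpha = y and stops after a fixed number of iterations. The Gaussian kernel, the function names, and the choice of this particular conjugate gradient variant are assumptions made for illustration; the variant analyzed in the work may use a different inner product.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=1.0):
    """Gaussian kernel matrix between rows of X and rows of Z (illustrative choice)."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_cg(K, y, n_iter):
    """Run at most n_iter conjugate gradient steps on K @ alpha = y.

    The iteration count n_iter plays the role of the regularization
    parameter: fewer steps give a smoother, more strongly regularized fit.
    """
    y = np.asarray(y, dtype=float)
    alpha = np.zeros_like(y)
    r = y.copy()          # residual y - K @ alpha for alpha = 0
    p = r.copy()          # current search direction
    rs = r @ r
    for _ in range(n_iter):
        if rs < 1e-12:    # residual is numerically zero, nothing left to fit
            break
        Kp = K @ p
        step = rs / (p @ Kp)
        alpha += step * p
        r -= step * Kp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return alpha
```

Predictions at new points X_new would then be `gaussian_kernel(X_new, X_train) @ alpha`. In practice the number of iterations would be selected by a data-driven rule such as a hold-out set, which this sketch does not include.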
The derivation of statistical properties for Partial Least Squares regression
can be a challenging task. The reason is that the construction of latent
components from the predictor variables also depends on the response variable.
While this typically leads to good performance and interpretable models in
practice, it makes the statistical analysis more involved. In this work, we
study the intrinsic complexity of Partial Least Squares Regression. Our
contribution is an unbiased estimate of its Degrees of Freedom.
We prove the statistical consistency of kernel Partial Least Squares
Regression applied to a bounded regression learning problem on a reproducing
kernel Hilbert space. Partial Least Squares differs from well-known classical
approaches such as Ridge Regression or Principal Components Regression in that
it is neither defined as the solution of a global cost minimization procedure
over a fixed model, nor is it a linear estimator. Instead, approximate solutions are
constructed by projections onto a nested set of data-dependent subspaces.
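
To make the nested, data-dependent subspaces concrete, the sketch below uses a standard characterization for the linear, single-response case: the Partial Least Squares fit with m components coincides with least squares restricted to the Krylov subspace spanned by b, Ab, ..., A^(m-1) b, where A = X^T X and b = X^T y. The function name and the linear (non-kernel) setting are assumptions chosen for brevity, and the explicit Krylov basis is only numerically reasonable for small m.

```python
import numpy as np

def krylov_least_squares(X, y, m):
    """Least squares restricted to the m-dimensional Krylov subspace of (X^T X, X^T y)."""
    A = X.T @ X
    b = X.T @ y
    # Build the Krylov vectors b, A b, ..., A^(m-1) b, rescaled for stability.
    V = np.empty((X.shape[1], m))
    v = b.copy()
    for j in range(m):
        V[:, j] = v / np.linalg.norm(v)
        v = A @ V[:, j]
    # Orthonormalize, then solve least squares within the subspace.
    Q, _ = np.linalg.qr(V)
    coef, *_ = np.linalg.lstsq(X @ Q, y, rcond=None)
    return Q @ coef
```

Because the subspace depends on y through b, the resulting estimator is not linear in the response, and because the subspaces are nested, the training residual cannot increase as m grows.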
Graphical Gaussian models are popular tools for the estimation of
(undirected) gene association networks from microarray data. A key issue when
the number of variables greatly exceeds the number of samples is the estimation
of the matrix of partial correlations. Since the (Moore-Penrose) inverse of the
sample covariance matrix leads to poor estimates in this scenario, standard
methods are inappropriate and adequate regularization techniques are needed.
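
For reference, the unregularized estimate mentioned above can be sketched as follows: partial correlations are obtained by rescaling the (Moore-Penrose) inverse of the sample covariance matrix. The function name is hypothetical, and this is the naive baseline that becomes unreliable when variables greatly outnumber samples, not a regularized method.

```python
import numpy as np

def partial_correlations(X):
    """Naive partial correlation matrix from the pseudo-inverse of the sample covariance.

    X has one row per sample and one column per variable.
    """
    S = np.cov(X, rowvar=False)          # sample covariance of the variables
    omega = np.linalg.pinv(S)            # (pseudo-)inverse, i.e. estimated precision matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)       # rescale and flip sign of off-diagonal entries
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

A regularized approach would replace `S` or its inverse by a better-conditioned estimate before the rescaling step.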