Recursive Least-Squares in Feature Space

In an offline scenario where $M$ input-output patterns are available, the standard kernel-based least-squares (LS) problem [1] can be defined as finding the coefficients $\alpha_i$ that minimize

$\displaystyle \min_{\vec\alpha} \Vert{\mathbf y}- {\mathbf K}\vec\alpha\Vert^2 + \lambda \vec\alpha^T {\mathbf K}\vec\alpha,$ (3)

where ${\mathbf y}\in \mathbb{R}^{M\times 1}$ contains the outputs $y_i$ of the training data, ${\mathbf K}\in \mathbb{R}^{M\times M}$ is the Gram matrix (or kernel matrix), with elements $K_{ij} = \kappa({\mathbf x}_i,{\mathbf x}_j)$, and $\lambda$ is a regularization parameter that penalizes the norm of the solution and thereby imposes smoothness. The solution of (3) is found as

$\displaystyle \vec{\alpha} = ({\mathbf K}+\lambda {\mathbf I})^{-1} {\mathbf y},$ (4)

where ${\mathbf I}$ denotes the identity matrix.
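As a concrete illustration of (3) and (4), the following minimal sketch builds the Gram matrix for a batch of training inputs and solves the regularized system directly. The Gaussian kernel, its width `sigma`, and the toy data are assumptions made for this example only, since the text leaves the kernel function $\kappa$ unspecified.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of input patterns.
    The kernel choice and width sigma are assumptions for this example."""
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def kernel_ls_solution(X, y, lam=1e-2, sigma=1.0):
    """Closed-form solution (4): alpha = (K + lambda*I)^{-1} y."""
    M = X.shape[0]
    K = gaussian_kernel(X, X, sigma)                 # M x M Gram matrix
    alpha = np.linalg.solve(K + lam * np.eye(M), y)  # solve instead of inverting
    return alpha, K

def predict(X_train, X_test, alpha, sigma=1.0):
    """Prediction for new inputs: f(x) = sum_i alpha_i * kappa(x_i, x)."""
    return gaussian_kernel(X_test, X_train, sigma) @ alpha

# Toy usage: M noisy samples of a sine function
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha, K = kernel_ls_solution(X, y, lam=0.1, sigma=0.5)
y_hat = predict(X, X, alpha, sigma=0.5)
```

Note that the linear system is solved directly rather than forming the matrix inverse explicitly, which is the numerically preferable way to evaluate (4).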

The goal of kernel recursive least-squares is to update this solution recursively as new data become available [15]. In contrast to linear RLS, which operates on the covariance matrix, KRLS operates on the Gram matrix ${\mathbf K}$, whose size depends on the number of input patterns rather than on their dimensionality. As a consequence, including new data in the solution (4) causes the kernel matrix to grow without bound.
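To make the growth issue concrete, here is a naive online sketch (this is not the KRLS algorithm of [15]; it simply re-solves the batch problem at each step). It shows how the stored Gram matrix gains one row and one column per incoming pattern, so memory and computation grow with the number of observed patterns unless the matrix size is somehow limited. The kernel function is passed in as a parameter, as its form is not fixed by the text.

```python
import numpy as np

def naive_online_kernel_ls(stream, kernel, lam=1e-2):
    """Naive online variant for illustration only: after each new pattern
    (x_t, y_t), the Gram matrix K is extended by one row and one column and
    the system (K + lambda*I) alpha = y is re-solved from scratch.  The
    actual KRLS recursions of [15] update the solution efficiently, but the
    matrix growth shown here is the same."""
    X, y_list = [], []
    K = np.zeros((0, 0))
    for x_t, y_t in stream:
        # kernel values between the new input and all stored inputs
        k_t = np.array([kernel(x_i, x_t) for x_i in X])
        k_tt = kernel(x_t, x_t)
        # grow K from t x t to (t+1) x (t+1)
        if len(X):
            K = np.block([[K, k_t[:, None]],
                          [k_t[None, :], np.array([[k_tt]])]])
        else:
            K = np.array([[k_tt]])
        X.append(x_t)
        y_list.append(y_t)
        alpha = np.linalg.solve(K + lam * np.eye(len(X)), np.array(y_list))
        yield alpha  # one coefficient per stored pattern so far

# Example usage with a Gaussian kernel (width assumed for illustration)
kern = lambda a, b: np.exp(-np.sum((a - b)**2) / (2 * 0.5**2))
data = ((np.array([0.1 * t]), np.sin(0.1 * t)) for t in range(100))
for t, alpha in enumerate(naive_online_kernel_ls(data, kern)):
    pass  # after step t, alpha has length t+1
```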
