Recursive Least-Squares in Feature Space

In an offline scenario where $M$ input-output patterns are available, the standard kernel-based least-squares (LS) problem [1] can be defined as finding the coefficients $\alpha_i$ that minimize

$\displaystyle \min_{\vec\alpha} \Vert{\mathbf y}- {\mathbf K}\vec\alpha\Vert^2 + \lambda \vec\alpha^T {\mathbf K}\vec\alpha,$ (3)

where ${\mathbf y}\in \mathbb{R}^{M\times 1}$ contains the outputs $y_i$ of the training data, ${\mathbf K}\in \mathbb{R}^{M\times M}$ is the Gram matrix (or kernel matrix), with elements $K_{ij} = \kappa({\mathbf x}_i,{\mathbf x}_j)$, and $\lambda$ is a regularization parameter that penalizes the norm of the solution and thereby imposes smoothness. The solution of (3) is found as

$\displaystyle \vec{\alpha} = ({\mathbf K}+\lambda {\mathbf I})^{-1} {\mathbf y},$ (4)

where ${\mathbf I}$ denotes the identity matrix.
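As a concrete illustration of (3) and (4), the following minimal sketch builds the Gram matrix for a batch of training inputs and solves the regularized system directly. The Gaussian kernel, its width `sigma`, and the toy data are assumptions made for this example only, since the text leaves the kernel function $\kappa$ unspecified.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of input patterns.
    The kernel choice and width sigma are assumptions for this example."""
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def kernel_ls_solution(X, y, lam=1e-2, sigma=1.0):
    """Closed-form solution (4): alpha = (K + lambda*I)^{-1} y."""
    M = X.shape[0]
    K = gaussian_kernel(X, X, sigma)                 # M x M Gram matrix
    alpha = np.linalg.solve(K + lam * np.eye(M), y)  # solve instead of inverting
    return alpha, K

def predict(X_train, X_test, alpha, sigma=1.0):
    """Prediction for new inputs: f(x) = sum_i alpha_i * kappa(x_i, x)."""
    return gaussian_kernel(X_test, X_train, sigma) @ alpha

# Toy usage: M noisy samples of a sine function
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha, K = kernel_ls_solution(X, y, lam=0.1, sigma=0.5)
y_hat = predict(X, X, alpha, sigma=0.5)
```

Note that the linear system is solved directly rather than forming the matrix inverse explicitly, which is the numerically preferable way to evaluate (4).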

The goal of kernel recursive least-squares is to update this solution recursively as new data become available [15]. In contrast to linear RLS, which operates on the covariance matrix, KRLS operates on the Gram matrix ${\mathbf K}$, whose size depends on the number of input patterns rather than on their dimensionality. As a consequence, including new data in the solution (4) causes the kernel matrix to grow without bound.
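To make the growth issue concrete, here is a naive online sketch (this is not the KRLS algorithm of [15]; it simply re-solves the batch problem at each step). It shows how the stored Gram matrix gains one row and one column per incoming pattern, so memory and computation grow with the number of observed patterns unless the matrix size is somehow limited. The kernel function is passed in as a parameter, as its form is not fixed by the text.

```python
import numpy as np

def naive_online_kernel_ls(stream, kernel, lam=1e-2):
    """Naive online variant for illustration only: after each new pattern
    (x_t, y_t), the Gram matrix K is extended by one row and one column and
    the system (K + lambda*I) alpha = y is re-solved from scratch.  The
    actual KRLS recursions of [15] update the solution efficiently, but the
    matrix growth shown here is the same."""
    X, y_list = [], []
    K = np.zeros((0, 0))
    for x_t, y_t in stream:
        # kernel values between the new input and all stored inputs
        k_t = np.array([kernel(x_i, x_t) for x_i in X])
        k_tt = kernel(x_t, x_t)
        # grow K from t x t to (t+1) x (t+1)
        if len(X):
            K = np.block([[K, k_t[:, None]],
                          [k_t[None, :], np.array([[k_tt]])]])
        else:
            K = np.array([[k_tt]])
        X.append(x_t)
        y_list.append(y_t)
        alpha = np.linalg.solve(K + lam * np.eye(len(X)), np.array(y_list))
        yield alpha  # one coefficient per stored pattern so far

# Example usage with a Gaussian kernel (width assumed for illustration)
kern = lambda a, b: np.exp(-np.sum((a - b)**2) / (2 * 0.5**2))
data = ((np.array([0.1 * t]), np.sin(0.1 * t)) for t in range(100))
for t, alpha in enumerate(naive_online_kernel_ls(data, kern)):
    pass  # after step t, alpha has length t+1
```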
