Introduction
In recent years a number of kernel methods, including support vector machines [1], kernel principal component analysis [2], kernel Fisher discriminant analysis [3] and kernel canonical correlation analysis [4,5], have been proposed and successfully applied to classification and nonlinear regression problems. In their original forms, most of these algorithms cannot operate online, since kernel methods introduce a number of difficulties, such as growing time and memory complexities (caused by the growing kernel matrix) and the need to avoid overfitting.
Recently a kernel RLS algorithm was proposed that deals with both difficulties [6]: a sparsification procedure limits the size of the kernel matrix and reduces the order of the problem. In this paper we present a different approach, combining a sliding window with conventional regularization. In this way the size of the kernel matrix is kept fixed rather than merely bounded, allowing the algorithm to operate online in time-varying environments.
The basic idea of kernel methods is to transform the data $\mathbf{x}_i$ from the input space to a high-dimensional feature space of vectors $\Phi(\mathbf{x}_i)$, in which the inner products can be calculated using a positive definite kernel function satisfying Mercer's condition [1]: $\kappa(\mathbf{x}_i,\mathbf{x}_j) = \langle \Phi(\mathbf{x}_i), \Phi(\mathbf{x}_j) \rangle$. This simple and elegant idea, also known as the ``kernel trick'', allows inner products in the feature space to be computed without making direct reference to the feature vectors.
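As a minimal numerical illustration of the kernel trick (using the degree-2 polynomial kernel $\kappa(\mathbf{x},\mathbf{y}) = (\mathbf{x}^T\mathbf{y})^2$ and its well-known explicit feature map for two-dimensional inputs, chosen here only for illustration and not taken from the paper), the following check shows that evaluating the kernel in the input space gives the same value as the inner product of the explicitly mapped feature vectors:

    import numpy as np

    def phi(x):
        # Explicit feature map of the degree-2 polynomial kernel in 2-d:
        # phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
        return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

    x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
    k_trick = np.dot(x, y) ** 2             # kernel evaluated in the input space
    k_explicit = np.dot(phi(x), phi(y))     # inner product in the feature space
    assert np.isclose(k_trick, k_explicit)  # identical values: the kernel trick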
A common nonlinear kernel is the Gaussian kernel
$$\kappa(\mathbf{x}_i,\mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i-\mathbf{x}_j\|^2}{2\sigma^2}\right). \qquad (1)$$
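As a concrete illustration of Eq. (1), the short sketch below builds the kernel matrix of a window of samples with the Gaussian kernel; the function names and the kernel width sigma are illustrative choices, not notation from the paper.

    import numpy as np

    def gaussian_kernel(x, y, sigma=1.0):
        # Gaussian kernel of Eq. (1) between two sample vectors x and y
        return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

    def kernel_matrix(X, sigma=1.0):
        # N x N kernel matrix for the N samples stored in the rows of X
        N = X.shape[0]
        K = np.empty((N, N))
        for i in range(N):
            for j in range(N):
                K[i, j] = gaussian_kernel(X[i], X[j], sigma)
        return K

    # Example: kernel matrix of a window of 4 two-dimensional samples
    X = np.random.randn(4, 2)
    K = kernel_matrix(X, sigma=0.5)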
This kernel will be used to calculate the elements of a
kernel matrix. In the sliding-window approach, updating
this kernel matrix means first removing its first row and first
column and then adding a new row and column at the end, based on
new observations. Calculation of its inverse is needed to update
the algorithm's solution. To this end, two matrix inversion formulas are derived in Appendix A. One of them has already been used in kernel methods [7], but in order to keep the dimensions of the kernel matrix fixed we also introduce a complementary formula.
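To make the window update described above concrete, the sketch below drops the oldest sample, appends a new one, and updates the inverse kernel matrix with the standard block-matrix (Schur complement) identities. This is only a sketch of the mechanics under our own naming; the specific formulas of Appendix A and the regularization used by the algorithm are not reproduced here.

    import numpy as np

    def downdate_inverse(Kinv):
        # Inverse of K after deleting its first row and column, obtained
        # from the inverse of the full K (standard block-matrix identity).
        e = Kinv[0, 0]
        f = Kinv[1:, :1]
        G = Kinv[1:, 1:]
        return G - (f @ f.T) / e

    def update_inverse(Dinv, k, knn):
        # Inverse of the enlarged matrix [[D, k], [k.T, knn]], given the
        # inverse of D, via the scalar Schur complement s.
        Dk = Dinv @ k                        # column vector D^{-1} k
        s = float(knn - k.T @ Dk)            # Schur complement of D
        top_left = Dinv + (Dk @ Dk.T) / s
        return np.block([[top_left,  -Dk / s],
                         [-Dk.T / s, np.array([[1.0 / s]])]])

    # Sliding-window step: drop the oldest sample, append a new one, and
    # keep the inverse kernel matrix up to date without a full inversion.
    def sliding_window_step(X_win, Kinv, x_new, sigma=1.0):
        kern = lambda a, b: np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))
        X_win = np.vstack([X_win[1:], x_new])          # shift the data window
        Dinv = downdate_inverse(Kinv)                  # remove oldest row/column
        k = np.array([[kern(x, x_new)] for x in X_win[:-1]])
        return X_win, update_inverse(Dinv, k, kern(x_new, x_new))

Updating the inverse through these identities costs only a quadratic number of operations per step in the window size, rather than the cubic cost of recomputing the inverse from scratch.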
The rest of this paper is organized as follows: in Section
2 a kernel transformation is introduced into linear
least-squares regression theory. A detailed description of the
proposed algorithm is given in Section 3, and in
Section 4 it is applied to a nonlinear
channel identification problem. Finally, Section
5 summarizes the main conclusions of this
work.