Note that it is necessary to limit the number of data vectors for which the kernel matrix is calculated. Contrary to standard linear RLS, for which the correlation matrices have fixed sizes determined by the (fixed) dimension of the input vectors, in an online scenario the size of the kernel matrix grows with the number of observations.
In [6], a kernel RLS algorithm is designed that limits the matrix sizes by means of a sparsification procedure, which maps the samples onto a (limited) dictionary. This allows both reducing the order of the feature space (which prevents overfitting) and keeping the complexity of the algorithm bounded. In our approach these two goals are achieved by two different mechanisms. On the one hand, regularization against overfitting is obtained by penalizing the solutions, as in (9). On the other hand, the complexity of the algorithm is reduced by considering only the observations within a window of fixed length. The advantage of the latter approach is that it can track time variations without any extra computational burden.
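The windowing idea can be sketched as follows. This is a minimal illustration, not the paper's exact recursive algorithm: at each time step only the last L samples are retained, so the kernel matrix never exceeds size L x L, and the regularized solution (cf. the penalized formulation in (9)) is recomputed on the window. The Gaussian kernel, the window length, and the regularization constant c are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Kernel matrix with entries K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class SlidingWindowKRLS:
    """Sliding-window kernel regularized LS (illustrative sketch).

    Keeping only the last `window` observations bounds the kernel
    matrix at window x window, independent of the total number of
    observations, which also lets the estimator track time variations.
    """

    def __init__(self, window=10, sigma=1.0, c=1e-2):
        self.window, self.sigma, self.c = window, sigma, c
        self.X, self.y = [], []

    def update(self, x, y):
        # Append the new sample; discard the oldest one once the
        # window is full, so matrix sizes stay bounded.
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(float(y))
        if len(self.X) > self.window:
            self.X.pop(0)
            self.y.pop(0)
        Xw = np.array(self.X)
        K = gaussian_kernel(Xw, Xw, self.sigma)
        # Regularized (penalized) solution: alpha = (K + c I)^{-1} y.
        self.alpha = np.linalg.solve(
            K + self.c * np.eye(len(Xw)), np.array(self.y)
        )

    def predict(self, x):
        Xw = np.array(self.X)
        k = gaussian_kernel(np.asarray(x, dtype=float)[None, :], Xw, self.sigma)[0]
        return float(k @ self.alpha)
```

Note that this sketch re-solves the L x L system at every step; the attraction of a recursive sliding-window formulation is precisely to update the inverse as one sample enters and one leaves, avoiding that cost.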