

Preprocessing

Some preprocessing steps are taken to facilitate the spectral clustering. In essence, the mixture samples are reduced, approximately, to those for which only one source was active at each instant. Besides supporting the overall effectiveness of the algorithm, this reduction also lowers the computational cost.

Central samples are removed because they correspond to instants at which no source is active and are almost unaffected by the nonlinearity. If $p_i = p$ for all $i$, the probability of having no active sources at all according to the sparse source model (1) is $p^n$, so the $\nu_1 = p^n N$ samples closest to the origin can be removed. In addition, ``non-sparse'' samples, which result from multiple sources being active at the same time, are also removed. Since exactly one source is active with probability $n(1-p)p^{n-1}$, they can be estimated as the $\nu_2 = \left[1 - n(1-p)p^{n-1} - p^n\right]N$ samples with the highest local scale.
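The two removal steps above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names `removal_counts` and `preprocess` are hypothetical, the local scale of a sample is assumed to be its distance to its `k`-th nearest neighbour (the value `k=7` is an arbitrary choice, not taken from the text), and the local scale is assumed to be computed after the central samples have been removed.

```python
import numpy as np

def removal_counts(p, n, N):
    """Return (nu1, nu2) for n sources sharing the same probability p."""
    p_none = p ** n                        # no source active
    p_one = n * (1.0 - p) * p ** (n - 1)   # exactly one source active
    nu1 = int(round(p_none * N))
    nu2 = int(round((1.0 - p_one - p_none) * N))
    return nu1, nu2

def preprocess(X, nu1, nu2, k=7):
    """Remove central and non-sparse samples from X, an (N, m) array.

    Assumption: local scale = distance to the k-th nearest neighbour.
    """
    # Step 1: drop the nu1 samples closest to the origin.
    norms = np.linalg.norm(X, axis=1)
    X = X[np.sort(np.argsort(norms)[nu1:])]

    # Step 2: drop the nu2 samples with the highest local scale.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbour.
    scale = np.sort(dists, axis=1)[:, k]
    n_keep = max(len(X) - nu2, 0)
    keep = np.argsort(scale)[:n_keep]
    return X[np.sort(keep)]
```

For instance, `removal_counts(0.9, 2, 1000)` gives `nu1 = 810` and `nu2 = 10`, leaving roughly 180 samples for clustering. The full pairwise distance matrix is used here for brevity; for large $N$ a nearest-neighbour search structure would likely be preferable.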

If the sources have different $p_i$-values, $\nu_1$ and $\nu_2$ follow from the same reasoning with the powers of $p$ replaced by products: $\nu_1 = N\prod_i p_i$ and $\nu_2 = \left[1 - \sum_i (1-p_i)\prod_{j \neq i} p_j - \prod_i p_i\right]N$. In practice, rough (over)estimates can be used for $\nu_1$ and $\nu_2$. Especially when the $p_i$ are unknown, $\nu_1$ and $\nu_2$ should be chosen so that, after preprocessing, the remaining samples can be clustered into non-overlapping clusters.
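When the $p_i$ differ, the counts can be computed along the following lines. The sketch assumes independent sources, so that the probability of no active source is the product of the $p_i$ and the probability of exactly one active source is a sum over the sources; the function name `removal_counts_general` is hypothetical.

```python
import numpy as np

def removal_counts_general(p, N):
    """Return (nu1, nu2) for sources with individual probabilities p[i].

    Assumption: the sources are independent.
    """
    p = np.asarray(p, dtype=float)
    p_none = np.prod(p)  # no source active
    # Exactly one source active: source i active, all others inactive.
    p_one = sum((1.0 - p[i]) * np.prod(np.delete(p, i))
                for i in range(len(p)))
    nu1 = int(round(p_none * N))
    nu2 = int(round((1.0 - p_one - p_none) * N))
    return nu1, nu2
```

With equal entries in `p` this reduces to the expressions for a common $p$. Since only rough estimates are needed, rounding the counts upwards is a reasonable alternative when overestimates are preferred.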



Steven Van Vaerenbergh
Last modified: 2006-04-05