The choice of the kernel size $\sigma$ in (3) has
a high impact on the clustering quality. It determines when two
points are considered similar, and should be of the same order as
the distance between similar points. Some rules of thumb have been
proposed to set a value for $\sigma$, whereas in other cases this
value is set manually.
When the data contains clusters with different local statistics,
there may not be a single value of $\sigma$ that works well for all
the data. In [12] a ``local'' scaling parameter $\sigma_i$
is proposed instead of this global parameter. It allows
self-tuning of the point-to-point distances by studying the local
statistics of the neighboring points of every point $\mathbf{x}_i$.
This leads to the following extension of (3):
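$$k(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{\sigma_i \sigma_j}\right) \tag{5}$$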
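As a rough illustration of locally scaled affinities, the following NumPy sketch builds a matrix whose entries divide the squared distance between two points by the product of their local scales. Several choices here are assumptions rather than details taken from the text: the function name `local_scaling_affinity` is hypothetical, the local scale of each point is taken to be the distance to its `k`-th nearest neighbor (one common choice in self-tuning approaches), and the default `k=7` is only a placeholder.

```python
import numpy as np


def local_scaling_affinity(X, k=7):
    """Affinity matrix with one kernel size per point (illustrative sketch).

    X : array of shape (n_points, n_features)
    k : which nearest neighbor defines the local scale (assumed choice)
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    dist = np.sqrt(sq_dist)

    # Each sorted row starts with the point itself (distance 0),
    # so column k is the distance to the k-th nearest neighbor.
    sigma = np.sort(dist, axis=1)[:, k]
    # Avoid division by zero when a point has k exact duplicates.
    sigma = np.maximum(sigma, np.finfo(float).eps)

    # Squared distance divided by the product of the two local scales.
    A = np.exp(-sq_dist / np.outer(sigma, sigma))
    return A, sigma


# Two synthetic clusters with very different spreads.
rng = np.random.default_rng(0)
tight = rng.normal(0.0, 0.1, size=(30, 2))
loose = rng.normal(5.0, 1.0, size=(30, 2))
X = np.vstack([tight, loose])

A, sigma = local_scaling_affinity(X, k=7)
print("mean local scale, tight cluster:", sigma[:30].mean())
print("mean local scale, loose cluster:", sigma[30:].mean())
```

In this toy data set the local scales of the tight cluster come out much smaller than those of the loose cluster, so points inside each cluster remain strongly connected without hand-tuning a single global kernel size.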
Steven Van Vaerenbergh