|
The project resulted in a publication at ICDM 2007: "Locally Constrained Support Vector Clustering".
Description (from the abstract of the paper): Support vector clustering transforms the data into a
high-dimensional feature space, where a decision function is
computed. In the original space, the function outlines the
boundaries of higher density regions, naturally splitting the
data into individual clusters. Although theoretically sound,
the method has certain drawbacks that make it
less appealing to the practitioner. Namely, it is unstable
in the presence of outliers and it is hard to control the
number of clusters that it identifies. Parametrizing the algorithm
incorrectly in noisy settings can either hide
clusters that are objectively present in the data or identify a
large number of small and nonintuitive clusters.
Here, we explore the properties of the data in small regions
by building a mixture of factor analyzers. The obtained
information is used to regularize the complexity of the outlined
cluster boundaries, by assigning suitable weighting to
each example. The approach is demonstrated to be less susceptible
to noise and to outline more interpretable clusters
than support vector clustering alone. |
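
To make the role of the per-example weights concrete, the following is a minimal sketch of the cluster-labelling step only; it is not the paper's implementation. With a Gaussian kernel, the support vector clustering contour can be written as a level set of a kernel expansion f(x) = sum_i w_i K(x_i, x). In the real method the coefficients come from solving a quadratic program, and the paper derives the weighting from a mixture of factor analyzers; here the weights are simply set by hand, which is an assumption made purely for illustration. The function names, the toy data, the kernel width q, and the rule for choosing the threshold are all hypothetical.

```python
import math

def rbf(a, b, q):
    """Gaussian kernel exp(-q * ||a - b||^2)."""
    return math.exp(-q * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def decision(x, points, weights, q):
    """Weighted kernel expansion f(x) = sum_i w_i * K(x_i, x)."""
    return sum(w * rbf(x, p, q) for p, w in zip(points, weights))

def inside_segment(a, b, points, weights, q, threshold, samples=10):
    """True if f stays at or above the threshold at interior samples of segment a-b."""
    for k in range(1, samples + 1):
        t = k / (samples + 1)
        x = [ai + t * (bi - ai) for ai, bi in zip(a, b)]
        if decision(x, points, weights, q) < threshold:
            return False
    return True

def label_clusters(points, weights, q):
    """Label points by connected components of the 'segment stays inside' graph.

    Assumption: the contour level is the smallest f value among points with
    positive weight, so every weighted point lies on or inside the contour.
    Points that fall outside the contour keep the label -1.
    """
    active = [i for i, w in enumerate(weights) if w > 0]
    threshold = min(decision(points[i], points, weights, q) for i in active)
    inside = [i for i in range(len(points))
              if decision(points[i], points, weights, q) >= threshold]
    labels = [-1] * len(points)
    next_label = 0
    for start in inside:
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first search over the connectivity graph
            i = stack.pop()
            for j in inside:
                if labels[j] == -1 and inside_segment(
                        points[i], points[j], points, weights, q, threshold):
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

# Toy data: two tight groups plus one isolated point between them.
blob_a = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
blob_b = [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (5.5, 5.5)]
outlier = [(2.75, 2.75)]
points = blob_a + blob_b + outlier

# Only the relative size of the weights matters here, because the
# threshold is derived from the same function f.
uniform = label_clusters(points, [1.0] * 9, q=1.0)
weighted = label_clusters(points, [1.0] * 8 + [0.0], q=1.0)
print(uniform)
print(weighted)
```

On this toy data, uniform weights should leave the isolated point as a singleton cluster of its own and pull the contour level down to accommodate it, while giving that point zero weight should leave it unlabeled (-1) and keep the contour tight around the two groups. This mirrors, in a very simplified form, the idea of regularizing the cluster boundaries by weighting individual examples.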