Abstract
Clustering is an important data mining task for grouping
similar objects. In high dimensional data, however, effects
attributed to the "curse of dimensionality" render clustering
in the full data space meaningless. For this reason, recent
years have seen research on subspace clustering, which
searches for clusters in relevant subspace projections of high
dimensional data. As the number of possible subspace projections
is exponential in the number of dimensions, the
number of possible subspace clusters can be overwhelming.
In this position paper, we present our work on identifying
non-redundant, relevant subspace clusters which reduce the
result set to a manageable size. We discuss techniques for
evaluating, visualizing and exploring subspace clusterings,
and propose some directions for future work.
Original language | English |
---|---|
Title | 1st International Workshop on Discovering, Summarizing and Using Multiple Clusterings (MultiClust 2010) in conjunction with 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA (2010) |
Publisher | Association for Computing Machinery |
Publication date | 2010 |
ISBN (Print) | 978-1-4503-0227-2 |
Status | Published - 2010 |