Unsupervised Feature Subset Selection

Nicolaj Søndberg-Madsen; C. Thomsen; Jose Pena

Department of Computer Science

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research

Abstract

This paper studies filter and hybrid filter-wrapper feature subset selection for unsupervised learning (data clustering). We constrain the search for the best feature subset by scoring the dependence of every feature on the rest of the features, conjecturing that these scores discriminate some irrelevant features. We report experimental results on artificial and real data for unsupervised learning of naive Bayes models. Both the filter and hybrid approaches perform satisfactorily.
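
The abstract does not say which dependence measure the authors use, so the following is only a minimal sketch of the filter idea under stated assumptions: features are discrete, dependence is measured by pairwise empirical mutual information, and a feature's score is its average mutual information with the remaining features. The function names and the threshold are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def dependence_scores(columns):
    """Score each feature by its average mutual information with the others.

    Assumption: `columns` holds at least two equally long discrete features.
    """
    k = len(columns)
    scores = []
    for i in range(k):
        others = [
            mutual_information(columns[i], columns[j])
            for j in range(k)
            if j != i
        ]
        scores.append(sum(others) / len(others))
    return scores


def filter_features(columns, threshold):
    """Keep the indices of features whose score reaches the threshold."""
    scores = dependence_scores(columns)
    return [i for i, s in enumerate(scores) if s >= threshold]


# Features 0 and 1 are copies of each other; feature 2 is independent of both.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = list(a)
c = [0, 1, 0, 1, 0, 1, 0, 1]
print(filter_features([a, b, c], threshold=0.1))
```

In this toy data the independent feature receives a score of zero and is filtered out, which illustrates the conjecture that low dependence on the other features can flag a feature as irrelevant for clustering. A hybrid filter-wrapper variant would use such a ranking only to restrict which subsets are then evaluated by fitting a model.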

Original language	English
Title of host publication	Proceedings on the Workshop on Probabilistic Graphical Models for Classification : (within ECML/PKDD 2003)
Number of pages	11
Publication date	2003
Pages	71-82
Publication status	Published - 2003
Event	ECML/PKDD - Cavtat-Dubrovnik, Croatia Duration: 22 Sept 2003 → 26 Sept 2003 Conference number: 14th / 7th

Conference

Conference	ECML/PKDD
Number	14th / 7th
Country/Territory	Croatia
City	Cavtat-Dubrovnik
Period	22/09/2003 → 26/09/2003

Keywords

feature selection
data-clustering
EM-algorithm
dependence measure

Cite this

@inproceedings{f4fcb050a78911da881a000ea68e967b,
  title = "Unsupervised Feature Subset Selection",
  abstract = "This paper studies filter and hybrid filter-wrapper feature subset selection for unsupervised learning (data clustering). We constrain the search for the best feature subset by scoring the dependence of every feature on the rest of the features, conjecturing that these scores discriminate some irrelevant features. We report experimental results on artificial and real data for unsupervised learning of naive Bayes models. Both the filter and hybrid approaches perform satisfactorily.",
  keywords = "feature selection, data-clustering, EM-algorithm, dependence measure",
  author = "Nicolaj S{\o}ndberg-Madsen and C. Thomsen and Jose Pena",
  year = "2003",
  language = "English",
  pages = "71--82",
  booktitle = "Proceedings on the Workshop on Probabilistic Graphical Models for Classification",
  note = "ECML/PKDD ; Conference date: 22-09-2003 Through 26-09-2003",
}

TY - GEN
T1 - Unsupervised Feature Subset Selection
AU - Søndberg-Madsen, Nicolaj
AU - Thomsen, C.
AU - Pena, Jose
N1 - Conference code: 14th / 7th
PY - 2003
Y1 - 2003
N2 - This paper studies filter and hybrid filter-wrapper feature subset selection for unsupervised learning (data clustering). We constrain the search for the best feature subset by scoring the dependence of every feature on the rest of the features, conjecturing that these scores discriminate some irrelevant features. We report experimental results on artificial and real data for unsupervised learning of naive Bayes models. Both the filter and hybrid approaches perform satisfactorily.
KW - feature selection
KW - data-clustering
KW - EM-algorithm
KW - dependence measure
M3 - Article in proceeding
SP - 71
EP - 82
BT - Proceedings on the Workshop on Probabilistic Graphical Models for Classification
T2 - ECML/PKDD
Y2 - 22 September 2003 through 26 September 2003
ER -
