ABSTRACT
In many data analysis tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find a relevant subset of the original features that can facilitate clustering, classification, and retrieval. In this paper, we consider the feature selection problem in the unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. Feature selection is essentially a combinatorial optimization problem and is therefore computationally expensive. Traditional unsupervised feature selection methods sidestep this issue by selecting the top-ranked features based on scores computed independently for each feature. These approaches neglect possible correlations between features and thus cannot produce an optimal feature subset. Inspired by recent developments in manifold learning and L1-regularized models for subset selection, we propose in this paper a new approach, called Multi-Cluster Feature Selection (MCFS), for unsupervised feature selection. Specifically, we select those features such that the multi-cluster structure of the data can be best preserved. The corresponding optimization problem can be solved efficiently, since it involves only a sparse eigen-problem and an L1-regularized least squares problem. Extensive experimental results on various real-life data sets demonstrate the superiority of the proposed algorithm.
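To make the two-stage pipeline described above concrete, here is a minimal sketch in Python using SciPy and scikit-learn. It assumes a k-nearest-neighbor affinity graph with binary weights, the symmetric normalized Laplacian, and scikit-learn's LARS solver for the L1-regularized least squares step; the function name `mcfs`, the neighborhood size, and the other default parameters are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of the MCFS pipeline: spectral embedding + L1-regularized
# regression to score features. Parameters below are illustrative.
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh
from sklearn.linear_model import Lars
from sklearn.neighbors import kneighbors_graph

def mcfs(X, n_selected, n_clusters=5, n_neighbors=5):
    # Step 1: k-nearest-neighbor affinity graph (binary weights here;
    # the paper could equally use heat-kernel weights) and its
    # symmetric normalized Laplacian.
    W = kneighbors_graph(X, n_neighbors, mode='connectivity',
                         include_self=False)
    W = 0.5 * (W + W.T)                    # symmetrize the adjacency matrix
    L = csgraph.laplacian(W, normed=True)

    # Step 2: the sparse eigen-problem -- the bottom eigenvectors of L
    # form a spectral embedding that encodes the multi-cluster structure.
    vals, vecs = eigsh(L, k=n_clusters + 1, which='SM')
    Y = vecs[:, np.argsort(vals)][:, 1:]   # drop the trivial eigenvector

    # Step 3: regress each embedding dimension on the original features
    # with an L1-regularized least squares fit (solved by LARS); each fit
    # yields a sparse coefficient vector over the features.
    scores = np.zeros(X.shape[1])
    for k in range(Y.shape[1]):
        coef = Lars(n_nonzero_coefs=n_selected).fit(X, Y[:, k]).coef_
        scores = np.maximum(scores, np.abs(coef))  # score = max_k |coef|

    # Step 4: keep the indices of the top-scoring features.
    return np.argsort(scores)[::-1][:n_selected]
```

Because LARS computes the whole regularization path, constraining the number of nonzero coefficients (here via `n_nonzero_coefs`) plays the role of the cardinality constraint in the L1-regularized subproblem, which is why the overall procedure stays efficient.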