DOI: 10.1145/1273496.1273562
Article

Adaptive dimension reduction using discriminant analysis and K-means clustering

Published: 20 June 2007

ABSTRACT

We combine linear discriminant analysis (LDA) and K-means clustering into a coherent framework that adaptively selects the most discriminative subspace. We use K-means clustering to generate class labels and LDA to select the subspace. The clustering process is thus integrated with the subspace selection process, so the data are clustered at the same time as the feature subspaces are selected. We show the rich structure of the general LDA-Km framework by examining its variants and their relationships to earlier approaches. The relations among PCA, LDA, and K-means are clarified. Extensive experimental results on real-world datasets show the effectiveness of our approach.
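The abstract describes the idea only at a high level, so the following is a minimal sketch of one way such an alternation could look, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names (`kmeans`, `lda_projection`, `lda_km`), the farthest-first initialization, the small ridge term `reg` added to the within-class scatter, the subspace dimension `k - 1`, and the stopping rule.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one integer label per row of X."""
    rng = np.random.default_rng(seed)
    # Farthest-first initialization (an assumption, chosen for stability).
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its old center.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def lda_projection(X, labels, dim, reg=1e-6):
    """Return a (d, dim) matrix spanning the LDA subspace for these labels."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for j in np.unique(labels):
        Xj = X[labels == j]
        mj = Xj.mean(axis=0)
        Sw += (Xj - mj).T @ (Xj - mj)
        diff = (mj - mean)[:, None]
        Sb += len(Xj) * (diff @ diff.T)
    # Ridge term keeps Sw invertible (an assumption of this sketch).
    Sw += reg * np.eye(d)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[:dim]
    return evecs[:, order].real


def lda_km(X, k, n_rounds=10, seed=0):
    """Alternate K-means (labels) and LDA (subspace) for a few rounds."""
    X = np.asarray(X, dtype=float)
    labels = kmeans(X, k, seed=seed)
    U = None
    for _ in range(n_rounds):
        # LDA step: pick the subspace that best separates current clusters.
        U = lda_projection(X, labels, dim=k - 1)
        # K-means step: re-cluster the data inside that subspace.
        new_labels = kmeans(X @ U, k, seed=seed)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, U
```

Calling `labels, U = lda_km(X, k=3)` on an `(n, d)` array returns one cluster label per row and a `(d, 2)` projection matrix. The cluster labels produced by K-means stand in for the class labels that LDA would normally need, which is the coupling the abstract describes.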

Published in

ICML '07: Proceedings of the 24th international conference on Machine learning
June 2007
1233 pages
ISBN: 9781595937933
DOI: 10.1145/1273496

Copyright © 2007 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 140 of 548 submissions, 26%
