DOI: 10.1145/2851141.2851152

A high-performance parallel algorithm for nonnegative matrix factorization

Published: 27 February 2016

ABSTRACT

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low-rank factors W and H, for a given input matrix A, such that A ≈ WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets.
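
For concreteness, the approximation A ≈ WH is commonly posed as a constrained optimization problem. The statement below is a sketch that assumes the usual Frobenius-norm objective; the dimension symbols m, n, and k are illustrative and do not appear in the abstract.

```latex
% Assumed notation: A is m x n, W is m x k, H is k x n, with k much smaller than min(m, n).
\min_{W \ge 0,\; H \ge 0} \; \lVert A - WH \rVert_F^2,
\qquad A \in \mathbb{R}^{m \times n},\quad
W \in \mathbb{R}^{m \times k},\quad
H \in \mathbb{R}^{k \times n}
```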

We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementations, our algorithm is also flexible: (1) it performs well for both dense and sparse matrices, and (2) it allows the user to choose any one of the multiple algorithms for solving the updates to low rank factors W and H within the alternating iterations. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements.
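
The alternating structure described above can be illustrated with a small serial sketch. It is not the paper's implementation: it runs on a single process with no MPI, uses plain Python lists, and plugs in the multiplicative update rule of Lee and Seung as the local update, which is a simple substitute for an exact NLS solver. All function names (`nmf`, `mu_update`, and the helpers) are hypothetical.

```python
import random


def matmul(X, Y):
    """Dense product of two list-of-lists matrices."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def frob_error(A, W, H):
    """Frobenius norm of the residual A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def mu_update(A, W, H, eps=1e-12):
    """One multiplicative update of H with W held fixed:
    H <- H * (W^T A) / (W^T W H), elementwise.
    Stands in for an NLS solve; entries stay non-negative."""
    Wt = transpose(W)
    num = matmul(Wt, A)
    den = matmul(matmul(Wt, W), H)
    return [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
            for rh, rn, rd in zip(H, num, den)]


def nmf(A, k, iters=200, seed=0, update=mu_update):
    """Alternate between updating H (W fixed) and W (H fixed).
    `update` is pluggable, mirroring the idea of swapping local solvers."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    At = transpose(A)
    for _ in range(iters):
        H = update(A, W, H)
        # The W step is the same subproblem on the transposed system A^T ~ H^T W^T.
        W = transpose(update(At, transpose(H), transpose(W)))
    return W, H


if __name__ == "__main__":
    A = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0]]
    W, H = nmf(A, k=2)
    print("residual:", frob_error(A, W, H))
```

Passing a different function as `update` changes the local solver without touching the outer loop, which is the kind of flexibility the abstract refers to in point (2). A distributed version would additionally partition A, W, and H across processes and exchange the small products with collective communication, which this sketch omits.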


      Published in

        PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
        February 2016
        420 pages
        ISBN: 9781450340922
        DOI: 10.1145/2851141

        Copyright © 2016 ACM

        © 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate: 230 of 1,014 submissions, 23%
