On the equivalence between nonnegative tensor factorization and tensorial probabilistic latent semantic analysis

Peng, Wei; Li, Tao

doi:10.1007/s10489-010-0220-9

Published: 24 March 2010

Volume 35, pages 285–295, (2011)

Applied Intelligence

Wei Peng¹ & Tao Li²

Abstract

Non-negative Matrix Factorization (NMF) and Probabilistic Latent Semantic Analysis (PLSA) are two widely used methods for non-negative decomposition of two-way data (e.g., document-term matrices). Studies have shown that PLSA and NMF (with the Kullback-Leibler divergence objective) are different algorithms optimizing the same objective function. Recently, the analysis of multi-way data (i.e., tensors) has attracted a lot of attention, as multi-way data have rich intrinsic structures and appear naturally in many real-world applications. In this paper, the relationships between the extensions of NMF and PLSA to multi-way data, namely NTF (Non-negative Tensor Factorization) and T-PLSA (Tensorial Probabilistic Latent Semantic Analysis), are studied. Two types of T-PLSA models are shown to be equivalent to two well-known non-negative factorization models: PARAFAC and Tucker3 (with the KL-divergence objective). NTF and T-PLSA are also compared empirically in terms of objective functions, decomposition results, clustering quality, and computational complexity on both synthetic and real-world datasets. Finally, we show that a hybrid method that runs NTF and T-PLSA alternately can jump out of each other's local minima and thus achieve better clustering performance.
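The claim that the two families optimize the same objective can be illustrated numerically. The following is a minimal sketch, not the authors' derivation or code: it assumes a PARAFAC-type model written as a joint distribution P(i,j,k) = Σ_r p(r) p(i|r) p(j|r) p(k|r), a data tensor normalized to sum to one, and the generalized KL divergence as the NTF objective. All function and variable names (`parafac_joint`, `generalized_kl`, `log_likelihood`, `random_model`) are hypothetical. Under these assumptions, the KL objective and the negative PLSA-style log-likelihood differ only by a term that depends on the data alone, so they share the same minimizers.

```python
import numpy as np

def parafac_joint(prior, A, B, C):
    """PARAFAC-type joint: P[i,j,k] = sum_r prior[r] * A[i,r] * B[j,r] * C[k,r]."""
    return np.einsum("r,ir,jr,kr->ijk", prior, A, B, C)

def generalized_kl(X, P):
    """Generalized KL divergence sum(X*log(X/P) - X + P), with 0*log(0) taken as 0."""
    mask = X > 0
    return np.sum(X[mask] * np.log(X[mask] / P[mask])) - X.sum() + P.sum()

def log_likelihood(X, P):
    """PLSA-style log-likelihood sum(X * log P) over the nonzero entries of X."""
    mask = X > 0
    return np.sum(X[mask] * np.log(P[mask]))

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 3, 2          # tensor dimensions and number of latent factors (assumed)

# Synthetic nonnegative data tensor, normalized to sum to 1 (an assumption of this sketch).
X = rng.random((I, J, K))
X /= X.sum()

def random_model(rng):
    """Draw random probabilistic factors; each column of A, B, C sums to 1."""
    prior = rng.dirichlet(np.ones(R))
    A = rng.dirichlet(np.ones(I), size=R).T
    B = rng.dirichlet(np.ones(J), size=R).T
    C = rng.dirichlet(np.ones(K), size=R).T
    return parafac_joint(prior, A, B, C)

P1, P2 = random_model(rng), random_model(rng)

# KL + log-likelihood should not depend on the model, only on X.
gap1 = generalized_kl(X, P1) + log_likelihood(X, P1)
gap2 = generalized_kl(X, P2) + log_likelihood(X, P2)
data_term = np.sum(X[X > 0] * np.log(X[X > 0]))
print(np.isclose(gap1, gap2), np.isclose(gap1, data_term))
```

Because both `X` and the model tensor sum to one here, the `- X + P` terms cancel and the gap reduces to Σ X log X. Without the normalization, the identity holds only up to the additional `P.sum() - X.sum()` term.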

Author information

Authors and Affiliations

Xerox Innovation Group, Xerox Corporation, Rochester, NY, 14580, USA
Wei Peng
School of Computer Science, Florida International University, Miami, FL, 33199, USA
Tao Li

Corresponding author

Correspondence to Tao Li.

About this article

Peng, W., Li, T. On the equivalence between nonnegative tensor factorization and tensorial probabilistic latent semantic analysis. Appl Intell 35, 285–295 (2011). https://doi.org/10.1007/s10489-010-0220-9

Issue Date: October 2011
DOI: https://doi.org/10.1007/s10489-010-0220-9
