DOI: 10.1145/3366423.3380102
Research Article

Graph Attention Topic Modeling Network

Published: 20 April 2020

ABSTRACT

Existing topic modeling approaches suffer from several issues, including the overfitting of Probabilistic Latent Semantic Indexing (pLSI), the failure of Latent Dirichlet Allocation (LDA) to capture rich correlations among topics, and high inference complexity. In this paper, we propose a new method that overcomes the overfitting issue of pLSI by using amortized inference with word embeddings as input, instead of the Dirichlet prior adopted by LDA. In generative topic models, the large number of free latent variables is the root cause of overfitting. To reduce the number of parameters, amortized inference replaces per-data-point latent-variable inference with a function that possesses shared (amortized) learnable parameters. The number of shared parameters is fixed and independent of the corpus size. Since vanilla amortized inference is limited to independent and identically distributed (i.i.d.) data, we propose a novel graph neural network, the Graph Attention TOpic Network (GATON), to model the topic structure of non-i.i.d. documents, based on the following two observations. First, pLSI can be interpreted as a stochastic block model (SBM) on a specific bipartite graph. Second, the graph attention network (GAT) can be explained as semi-amortized inference for the SBM, which relaxes the i.i.d. assumption of vanilla amortized inference. GATON provides a novel scheme, based on the graph convolution operation, to integrate word similarity and word co-occurrence structure. Specifically, the bag-of-words document representation is modeled as a bipartite graph topology; word embeddings, which capture word similarity, serve as the attributes of the word nodes, while term-frequency vectors are adopted as the attributes of the document nodes. Through the weighted (attention) graph convolution operation, word co-occurrence structure and word similarity patterns are seamlessly integrated for topic identification. Extensive experiments demonstrate that GATON's effectiveness at topic identification not only benefits document classification but also significantly refines the input word embeddings.
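To make the construction concrete: pLSI factorizes the joint document-word probability as P(d, w) = Σ_z P(z) P(d|z) P(w|z), which has the same form as a bipartite SBM likelihood with topics playing the role of blocks, and the attention mechanism acts as the (semi-)amortized inference function over that graph. The sketch below is a minimal, illustrative PyTorch rendering of this idea, not the authors' released implementation; all names here (BipartiteTopicAttention, n_topics, etc.) are hypothetical. Word nodes carry pretrained embeddings, document nodes carry term-frequency vectors, and a GAT-style attention-weighted convolution aggregates co-occurring word features into per-document topic distributions.

```python
# Minimal sketch (assumption-laden, not the paper's official code) of a
# GAT-style layer over a document-word bipartite graph for topic modeling.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BipartiteTopicAttention(nn.Module):
    """One attention-weighted graph convolution over a doc-word bipartite graph."""

    def __init__(self, word_dim: int, vocab_size: int, n_topics: int):
        super().__init__()
        # Shared (amortized) parameters: their number is fixed and independent
        # of corpus size, unlike the per-document latent variables of pLSI.
        self.word_proj = nn.Linear(word_dim, n_topics, bias=False)
        self.doc_proj = nn.Linear(vocab_size, n_topics, bias=False)
        self.attn = nn.Linear(2 * n_topics, 1, bias=False)

    def forward(self, word_emb, doc_tf):
        # word_emb: (V, word_dim) pretrained embeddings -> word-node attributes
        # doc_tf:   (D, V) term-frequency vectors       -> document-node attributes
        w = self.word_proj(word_emb)  # (V, K) word-topic features
        d = self.doc_proj(doc_tf)     # (D, K) document-topic features

        # Attention score for every (doc, word) pair, computed densely here
        # for clarity and masked by co-occurrence (doc_tf > 0); a real
        # implementation would operate on sparse edges.
        D, V, K = d.shape[0], w.shape[0], w.shape[1]
        pairs = torch.cat(
            [d.unsqueeze(1).expand(D, V, K), w.unsqueeze(0).expand(D, V, K)],
            dim=-1,
        )                                                     # (D, V, 2K)
        scores = F.leaky_relu(self.attn(pairs)).squeeze(-1)   # (D, V)
        scores = scores.masked_fill(doc_tf == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)  # attention over co-occurring words

        # Weighted (attention) graph convolution: aggregate word features into
        # each document node, then normalize into a topic distribution.
        doc_topics = torch.softmax(alpha @ w, dim=-1)         # (D, K)
        return doc_topics, alpha


# Toy usage: 3 documents, a 5-word vocabulary, 50-dim embeddings, 2 topics.
if __name__ == "__main__":
    torch.manual_seed(0)
    word_emb = torch.randn(5, 50)
    doc_tf = torch.tensor(
        [[2.0, 1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 3.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 2.0, 2.0]]
    )
    layer = BipartiteTopicAttention(word_dim=50, vocab_size=5, n_topics=2)
    topics, attn = layer(word_emb, doc_tf)
    print(topics.shape)  # torch.Size([3, 2])
```

A full model in the spirit of the paper would stack multiple attention heads and layers, as in standard GAT, and train the shared parameters end to end; the point of the sketch is only how topology (co-occurrence) and node attributes (embeddings, term frequencies) enter the same convolution.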


Published in

WWW '20: Proceedings of The Web Conference 2020
April 2020, 3143 pages
ISBN: 9781450370233
DOI: 10.1145/3366423
Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

