DOI: 10.1145/3366423.3380278

Discriminative Topic Mining via Category-Name Guided Text Embedding

Published: 20 April 2020


Mining a set of meaningful and distinctive topics automatically from massive text corpora has broad applications. Existing topic models, however, typically work in a purely unsupervised way, often generating topics that do not fit users' particular needs and that yield suboptimal performance on downstream tasks. We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora. This new task not only helps users understand clearly and distinctively the topics they are most interested in, but also directly benefits keyword-driven classification tasks. We develop CatE, a novel category-name guided text embedding method for discriminative topic mining, which effectively leverages minimal user guidance to learn a discriminative embedding space and discover category-representative terms in an iterative manner. We conduct a comprehensive set of experiments to show that CatE mines a high-quality set of topics guided by category names only, and benefits a variety of downstream applications, including weakly-supervised classification and lexical entailment direction identification.





          Published in: WWW '20: Proceedings of The Web Conference 2020, April 2020, 3143 pages

          Copyright © 2020 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


          Publisher: Association for Computing Machinery, New York, NY, United States





          • research-article
          • Research
          • Refereed limited

          Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
