DOI: 10.1145/2661829.2662005
research-article

Twitter Opinion Topic Model: Extracting Product Opinions from Tweets by Leveraging Hashtags and Sentiment Lexicon

Published: 03 November 2014

ABSTRACT

Aspect-based opinion mining is widely applied to review data to aggregate or summarize opinions of a product, and the current state of the art is achieved with Latent Dirichlet Allocation (LDA)-based models. Although social media data such as tweets are laden with opinions, their "dirty" nature (as natural language) has discouraged researchers from applying LDA-based opinion models to them for product review mining. Tweets are often informal and unstructured, and they lack labeled data such as categories and ratings, which makes product opinion mining challenging. In this paper, we propose an LDA-based opinion model named Twitter Opinion Topic Model (TOTM) for opinion mining and sentiment analysis. In its discovery process, TOTM leverages the hashtags, mentions, emoticons and strong sentiment words present in tweets. It improves opinion prediction by modeling the target-opinion interaction directly, thereby discovering target-specific opinion words, which existing approaches neglect. Moreover, we propose a new formulation for incorporating sentiment prior information into a topic model by utilizing an existing public sentiment lexicon. This formulation is novel in that it learns and updates with the data. We conduct experiments on 9 million tweets about electronic products and demonstrate the improved performance of TOTM in both quantitative evaluations and qualitative analysis. We show that aspect-based opinion analysis on a massive volume of tweets provides useful opinions on products.
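As a rough illustration of the tweet-level signals the abstract mentions (hashtags, mentions, emoticons and strong sentiment words), the following Python sketch pulls them out of a raw tweet. It is not the authors' preprocessing pipeline: the function name `extract_signals`, the regular expressions, and the tiny emoticon and word lists are all assumptions made for this example, and a real system would use a full public sentiment lexicon instead.

```python
import re

# Hypothetical patterns and word lists, chosen only for illustration.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}
STRONG_WORDS = {"love": 1, "awesome": 1, "hate": -1, "terrible": -1}


def extract_signals(tweet):
    """Return hashtags, mentions and two simple sentiment hints for a tweet."""
    tokens = tweet.split()
    hashtags = [h.lower() for h in HASHTAG_RE.findall(tweet)]
    mentions = [m.lower() for m in MENTION_RE.findall(tweet)]
    # +1 for each positive emoticon token, -1 for each negative one.
    emoticon_score = sum(
        (t in POSITIVE_EMOTICONS) - (t in NEGATIVE_EMOTICONS) for t in tokens
    )
    # Sum the polarity of every strong sentiment word found in the text.
    words = re.findall(r"[a-z']+", tweet.lower())
    lexicon_score = sum(STRONG_WORDS.get(w, 0) for w in words)
    return {
        "hashtags": hashtags,
        "mentions": mentions,
        "emoticon_score": emoticon_score,
        "lexicon_score": lexicon_score,
    }


if __name__ == "__main__":
    print(extract_signals("I love the camera on my #iPhone @Apple :)"))
```

In a topic-model setting, features like these could serve as weak supervision, for example by treating a hashtag as a candidate opinion target or an emoticon as a noisy sentiment label. How TOTM actually uses them is described in the paper itself, not in this sketch.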

    • Published in

      CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
      November 2014
      2152 pages
      ISBN: 9781450325981
      DOI: 10.1145/2661829

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      CIKM '14 Paper Acceptance Rate: 175 of 838 submissions, 21%
      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
