ABSTRACT
Aspect-based opinion mining is widely applied to review data to aggregate or summarize opinions of a product, and the current state of the art is achieved with Latent Dirichlet Allocation (LDA)-based models. Although social media data such as tweets are laden with opinions, their "dirty" nature (as natural language) has discouraged researchers from applying LDA-based opinion models to them for product review mining. Tweets are often informal and unstructured, and they lack labeled data such as categories and ratings, which makes product opinion mining challenging. In this paper, we propose an LDA-based opinion model named Twitter Opinion Topic Model (TOTM) for opinion mining and sentiment analysis. TOTM leverages the hashtags, mentions, emoticons and strong sentiment words present in tweets in its discovery process. It improves opinion prediction by modeling the target-opinion interaction directly, thus discovering target-specific opinion words, which are neglected in existing approaches. Moreover, we propose a new formulation for incorporating sentiment prior information into a topic model by utilizing an existing public sentiment lexicon. This formulation is novel in that it learns and updates with the data. We conduct experiments on 9 million tweets about electronic products, and demonstrate the improved performance of TOTM in both quantitative evaluations and qualitative analysis. We show that aspect-based opinion analysis on a massive volume of tweets provides useful opinions on products.
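To make the tweet-level signals named above concrete, the following is a minimal preprocessing sketch, not TOTM itself. It only shows how hashtags, mentions, emoticons and strong sentiment words might be pulled out of a raw tweet before any topic modeling. The regular expressions, the tiny emoticon and word lists, and the function name `extract_signals` are all assumptions made for illustration and do not come from the paper.

```python
import re

# Hypothetical patterns and word lists, for illustration only (not from the paper).
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}
STRONG_WORDS = {"love": 1, "awesome": 1, "hate": -1, "terrible": -1}


def extract_signals(tweet):
    """Collect hashtags, mentions, and a crude sentiment hint from one tweet."""
    hint = 0
    for tok in tweet.split():
        # Emoticons are matched as whole whitespace-separated tokens.
        hint += EMOTICONS.get(tok, 0)
        # Words are lowercased and stripped of trailing punctuation first.
        hint += STRONG_WORDS.get(tok.lower().strip(".,!?"), 0)
    return {
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(tweet)],
        "mentions": [m.lower() for m in MENTION_RE.findall(tweet)],
        "sentiment_hint": hint,
    }


signals = extract_signals("Love the battery on my new #iPhone :) thanks @Apple")
print(signals)
```

In a model of the kind described, such signals would presumably feed in as weak supervision or prior information rather than as final labels. The summed hint here is only a stand-in for that idea.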
Index Terms
- Twitter Opinion Topic Model: Extracting Product Opinions from Tweets by Leveraging Hashtags and Sentiment Lexicon