research-article

Twitter Opinion Topic Model: Extracting Product Opinions from Tweets by Leveraging Hashtags and Sentiment Lexicon

Authors:
Kar Wai Lim, Australian National University, Canberra, Australia
Wray Buntine, Monash University, Melbourne, Australia

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, November 2014, Pages 1319–1328
https://doi.org/10.1145/2661829.2662005

Published: 03 November 2014

ABSTRACT

Aspect-based opinion mining is widely applied to review data to aggregate or summarize opinions of a product, and the current state of the art is achieved with Latent Dirichlet Allocation (LDA)-based models. Although social media data such as tweets are laden with opinions, their "dirty" nature (as natural language) has discouraged researchers from applying LDA-based opinion models for product review mining. Tweets are often informal and unstructured, and they lack labeled data such as categories and ratings, which makes product opinion mining challenging. In this paper, we propose an LDA-based opinion model named Twitter Opinion Topic Model (TOTM) for opinion mining and sentiment analysis. TOTM leverages hashtags, mentions, emoticons and strong sentiment words that are present in tweets in its discovery process. It improves opinion prediction by modeling the target-opinion interaction directly, thus discovering target-specific opinion words, which are neglected in existing approaches. Moreover, we propose a new formulation for incorporating sentiment prior information into a topic model by utilizing an existing public sentiment lexicon. This formulation is novel in that it learns and updates with the data. We conduct experiments on 9 million tweets on electronic products, and demonstrate the improved performance of TOTM in both quantitative evaluations and qualitative analysis. We show that aspect-based opinion analysis on a massive volume of tweets provides useful opinions on products.
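To make the kinds of tweet signals named in the abstract concrete, the following is a minimal Python sketch of pulling hashtags, mentions, emoticons and strong sentiment words out of a single tweet. It is an illustration only and is not the preprocessing used in the paper: the regular expressions, the tiny `EMOTICONS` and `STRONG_WORDS` tables, and the function name `extract_signals` are all assumptions made for this example, and a real system would use a Twitter-aware tokenizer and a full sentiment lexicon.

```python
import re

# Hypothetical patterns; a simplification, not the paper's tokenizer.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")

# Toy polarity tables (assumed for illustration): +1 positive, -1 negative.
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}
STRONG_WORDS = {"love": 1, "awesome": 1, "hate": -1, "terrible": -1}


def extract_signals(tweet):
    """Return the hashtags, mentions, emoticons and strong sentiment
    words found in a tweet, plus a crude summed polarity score."""
    tokens = tweet.split()
    hashtags = [h.lower() for h in HASHTAG_RE.findall(tweet)]
    mentions = [m.lower() for m in MENTION_RE.findall(tweet)]
    # Emoticons are matched as whole whitespace-separated tokens.
    emoticons = [t for t in tokens if t in EMOTICONS]
    # Strip trailing punctuation and lowercase before the lexicon lookup.
    words = [t.strip(".,!?").lower() for t in tokens]
    strong = [w for w in words if w in STRONG_WORDS]
    score = sum(EMOTICONS[e] for e in emoticons) + sum(STRONG_WORDS[w] for w in strong)
    return {
        "hashtags": hashtags,
        "mentions": mentions,
        "emoticons": emoticons,
        "strong_words": strong,
        "score": score,
    }


if __name__ == "__main__":
    print(extract_signals("I love my new #iPhone :) thanks @Apple!"))
```

In a topic-model setting, signals like these would typically serve as weak supervision (for example, a hashtag hinting at the product being discussed and an emoticon hinting at polarity) rather than as final labels; how TOTM actually uses them is described in the paper itself.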

Index Terms

Computing methodologies → Artificial intelligence → Natural language processing → Language resources

Published in
CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
November 2014
2152 pages
ISBN: 9781450325981
DOI: 10.1145/2661829
General Chairs: Jianzhong Li (Harbin Inst. of Technology), X. Sean Wang (Fudan University)
Program Chairs: Minos Garofalakis (Technical University of Crete, Greece), Ian Soboroff (National Institute of Standards, USA), Torsten Suel (New York University, USA), Min Wang (Google Research, USA)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
emoticons
opinion mining
product review
sentiment analysis
sentiment lexicon
topic modeling
twitter
Qualifiers
- research-article
Conference

Acceptance Rates
CIKM '14 Paper Acceptance Rate: 175 of 838 submissions, 21%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%