Abstract
Sentiment analysis of documents aims to characterise the positive or negative sentiment they express. It has been formulated as a supervised classification problem, which requires large numbers of labelled documents. Semi-supervised sentiment classification, which uses a limited number of documents or words labelled with sentiment polarity, reduces the labelling cost of effective learning. Expectation Maximisation (EM) has been widely used in semi-supervised sentiment classification. A prominent problem with existing EM-based approaches is that the objective function of EM may not conform to the intended classification task, which can result in poor classification performance. In this paper we propose to augment EM with lexical knowledge of opinion words to mitigate this problem. Extensive experiments on diverse domains show that our lexical EM algorithm achieves significantly higher accuracy than existing standard EM-based semi-supervised learning approaches for sentiment classification, and that it also significantly outperforms alternative approaches that use lexical knowledge.
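To make the general idea concrete, the following is a minimal sketch of semi-supervised EM over a multinomial naive Bayes model in which a small sentiment lexicon is injected as pseudo-counts at every M-step. It is not the lexical EM algorithm of the paper, whose details the abstract does not give. The function names, the `lexicon_weight` parameter, the pseudo-count mechanism, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter

CLASSES = ("pos", "neg")


def m_step(docs, posteriors, lexicon, vocab, lexicon_weight=5.0, smoothing=1.0):
    """Re-estimate log priors and log word likelihoods from soft class labels."""
    prior_counts = {c: smoothing for c in CLASSES}
    word_counts = {c: Counter() for c in CLASSES}
    for words, post in zip(docs, posteriors):
        for c in CLASSES:
            prior_counts[c] += post[c]
            for w in words:
                word_counts[c][w] += post[c]
    # Assumption: lexical knowledge enters as extra counts for each opinion
    # word in its known polarity class, re-applied at every M-step.
    for w, c in lexicon.items():
        word_counts[c][w] += lexicon_weight
    total = sum(prior_counts.values())
    log_prior = {c: math.log(prior_counts[c] / total) for c in CLASSES}
    log_like = {}
    for c in CLASSES:
        denom = sum(word_counts[c].values()) + smoothing * len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + smoothing) / denom) for w in vocab
        }
    return log_prior, log_like


def e_step(words, log_prior, log_like):
    """Return the posterior over classes for one document (unknown words skipped)."""
    scores = {
        c: log_prior[c] + sum(log_like[c][w] for w in words if w in log_like[c])
        for c in CLASSES
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def lexicon_guided_em(labelled, unlabelled, lexicon, iterations=10):
    """Alternate E- and M-steps; labelled documents keep their hard labels."""
    docs = [d for d, _ in labelled] + list(unlabelled)
    vocab = {w for d in docs for w in d} | set(lexicon)
    fixed = [{c: 1.0 if c == y else 0.0 for c in CLASSES} for _, y in labelled]
    soft = [{c: 1.0 / len(CLASSES) for c in CLASSES} for _ in unlabelled]
    for _ in range(iterations):
        log_prior, log_like = m_step(docs, fixed + soft, lexicon, vocab)
        soft = [e_step(d, log_prior, log_like) for d in unlabelled]
    return m_step(docs, fixed + soft, lexicon, vocab)


# Toy data (hypothetical): two labelled and two unlabelled tokenised documents.
labelled = [(["great", "plot"], "pos"), (["boring", "plot"], "neg")]
unlabelled = [["excellent", "acting", "great"], ["awful", "boring", "acting"]]
lexicon = {"excellent": "pos", "awful": "neg"}

log_prior, log_like = lexicon_guided_em(labelled, unlabelled, lexicon)
posterior = e_step(["excellent", "acting"], log_prior, log_like)
```

In this sketch the lexicon keeps pulling the word distributions of each class towards known opinion words, which is one simple way to stop the likelihood objective of EM from drifting away from the sentiment distinction. Other ways of injecting lexical knowledge, such as constraining posteriors, are equally plausible readings of the abstract.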
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, X., Zhou, Y., Bailey, J., Ramamohanarao, K. (2012). Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge. In: Wang, X.S., Cruz, I., Delis, A., Huang, G. (eds) Web Information Systems Engineering - WISE 2012. WISE 2012. Lecture Notes in Computer Science, vol 7651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35063-4_3
DOI: https://doi.org/10.1007/978-3-642-35063-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35062-7
Online ISBN: 978-3-642-35063-4
eBook Packages: Computer Science (R0)