Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7651)

Abstract

Sentiment analysis aims to characterise the positive or negative sentiment expressed in documents. It has been formulated as a supervised classification problem, which requires large numbers of labelled documents. Semi-supervised sentiment classification, which uses a limited number of documents or words labelled with sentiment polarities, reduces labelling cost while still allowing effective learning. Expectation Maximisation (EM) has been widely used in semi-supervised sentiment classification. A prominent problem with existing EM-based approaches is that the objective function of EM may not conform to the intended classification task, which can result in poor classification performance. In this paper we propose to augment EM with lexical knowledge of opinion words to mitigate this problem. Extensive experiments on diverse domains show that our lexical EM algorithm achieves significantly higher accuracy than existing standard EM-based semi-supervised learning approaches for sentiment classification, and also significantly outperforms alternative approaches that use lexical knowledge.
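The abstract does not spell out how the lexical knowledge enters EM, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a multinomial Naive Bayes model fitted by EM on unlabelled text, with a tiny hypothetical opinion lexicon injected as extra pseudo-counts on the class-conditional word distributions so that the two mixture components are pulled towards "positive" and "negative". The word lists, the `strength` parameter, and all function names are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; a real system would load a resource such as
# the General Inquirer or SentiWordNet instead.
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "terrible"}
CLASSES = ("pos", "neg")


def lexicon_prior(vocab, strength=2.0):
    """Pseudo-counts per class: 1.0 for smoothing, plus `strength` (an
    assumed tuning knob) for lexicon words in their own class."""
    prior = {c: {} for c in CLASSES}
    for w in vocab:
        prior["pos"][w] = 1.0 + (strength if w in POSITIVE_WORDS else 0.0)
        prior["neg"][w] = 1.0 + (strength if w in NEGATIVE_WORDS else 0.0)
    return prior


def m_step(docs, resp, prior):
    """Re-estimate log class priors and log word probabilities from the
    soft assignments, adding the lexicon pseudo-counts."""
    class_mass = {c: 1.0 for c in CLASSES}
    word_mass = {c: dict(prior[c]) for c in CLASSES}
    for doc, r in zip(docs, resp):
        for c in CLASSES:
            class_mass[c] += r[c]
            for w, n in doc.items():
                word_mass[c][w] += r[c] * n
    total = sum(class_mass.values())
    log_prior = {c: math.log(class_mass[c] / total) for c in CLASSES}
    log_word = {}
    for c in CLASSES:
        z = sum(word_mass[c].values())
        log_word[c] = {w: math.log(m / z) for w, m in word_mass[c].items()}
    return log_prior, log_word


def e_step(doc, log_prior, log_word):
    """Posterior class probabilities for one bag-of-words document."""
    scores = {
        c: log_prior[c] + sum(n * log_word[c][w] for w, n in doc.items())
        for c in CLASSES
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def lexical_em(texts, iterations=10):
    """Run EM on unlabelled texts, starting from uniform assignments so
    that only the lexicon pseudo-counts break the symmetry."""
    docs = [Counter(t.lower().split()) for t in texts]
    vocab = {w for d in docs for w in d}
    prior = lexicon_prior(vocab)
    resp = [{c: 1.0 / len(CLASSES) for c in CLASSES} for _ in docs]
    for _ in range(iterations):
        log_prior, log_word = m_step(docs, resp, prior)
        resp = [e_step(d, log_prior, log_word) for d in docs]
    return resp


if __name__ == "__main__":
    reviews = [
        "great phone excellent screen",
        "terrible battery poor screen",
    ]
    for text, r in zip(reviews, lexical_em(reviews)):
        print(text, "->", max(r, key=r.get))
```

Without the pseudo-counts, the two components of an unsupervised mixture are interchangeable and EM has no reason to align them with sentiment, which is one simple way to picture the mismatch between the EM objective and the classification task that the abstract describes.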

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, X., Zhou, Y., Bailey, J., Ramamohanarao, K. (2012). Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge. In: Wang, X.S., Cruz, I., Delis, A., Huang, G. (eds) Web Information Systems Engineering - WISE 2012. WISE 2012. Lecture Notes in Computer Science, vol 7651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35063-4_3

  • DOI: https://doi.org/10.1007/978-3-642-35063-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35062-7

  • Online ISBN: 978-3-642-35063-4

  • eBook Packages: Computer Science, Computer Science (R0)
