DOI: 10.1145/2063576.2063635
research-article

Sentiment classification based on supervised latent n-gram analysis

Published: 24 October 2011

ABSTRACT

In this paper, we propose an efficient embedding for modeling higher-order (n-gram) phrases that projects the n-grams into a low-dimensional latent semantic space, where a classification function can be defined. We use a deep neural network to build a unified discriminative framework that estimates the parameters of the latent space and of the classification function jointly, biased toward the target classification task. We apply the framework to a large-scale sentiment classification task and present a comparative evaluation of the proposed method on two large benchmark data sets of online product reviews. The proposed method achieves superior performance in comparison to the state of the art.
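
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea under stated assumptions: each word has a learned vector, an n-gram is the concatenation of its word vectors, a linear map followed by tanh projects it into the latent space, the document is the average of its n-gram vectors, and a logistic function on top acts as the classifier. All names, dimensions, and the random initialization are hypothetical, and only the forward pass is shown; in a supervised setting the embedding, projection, and classifier weights would be trained jointly on labeled reviews.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Random matrix standing in for parameters that would normally be learned."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def ngram_windows(token_ids, n):
    """All contiguous n-grams of a token-id sequence."""
    return [token_ids[i:i + n] for i in range(len(token_ids) - n + 1)]


def embed_document(token_ids, word_emb, proj, n):
    """Average of tanh-projected n-gram vectors (assumed architecture)."""
    windows = ngram_windows(token_ids, n)
    if not windows:
        raise ValueError("document is shorter than the n-gram length")
    latent_dim = len(proj)
    doc = [0.0] * latent_dim
    for window in windows:
        # Concatenate the word vectors of the n-gram into one long vector.
        x = [v for tok in window for v in word_emb[tok]]
        # Project into the latent space and squash with tanh.
        for k in range(latent_dim):
            doc[k] += math.tanh(sum(w * xi for w, xi in zip(proj[k], x)))
    return [d / len(windows) for d in doc]


def classify(doc_vec, weights, bias):
    """Logistic classification function defined on the latent document vector."""
    score = sum(w * d for w, d in zip(weights, doc_vec)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical sizes: 50-word vocabulary, 4-dim word vectors, 3-dim latent space, bigrams.
rng = random.Random(0)
VOCAB, WORD_DIM, LATENT_DIM, N = 50, 4, 3, 2
word_emb = make_matrix(VOCAB, WORD_DIM, rng)
proj = make_matrix(LATENT_DIM, N * WORD_DIM, rng)
weights = [rng.uniform(-0.1, 0.1) for _ in range(LATENT_DIM)]

review = [3, 17, 8, 42, 5]  # token ids of a toy review
doc_vec = embed_document(review, word_emb, proj, N)
prob_positive = classify(doc_vec, weights, bias=0.0)
```

Because the projection matrix has `N * WORD_DIM` columns regardless of vocabulary size, the parameter count in this sketch grows linearly with n rather than with the number of distinct n-grams, which is one plausible reading of why such an embedding would be called efficient.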

          Published in

          CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
          October 2011
          2712 pages
          ISBN: 9781450307178
          DOI: 10.1145/2063576

          Copyright © 2011 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
