Abstract
We consider the task of semi-supervised classification: extending category labels from a small set of labeled examples to a much larger set of unlabeled ones. We show that, at least on our case-study task, unsupervised fuzzy clustering of the unlabeled examples helps in obtaining the hard clusters. Specifically, we used the membership values obtained with fuzzy clustering as additional features for hard clustering, and we also used these membership values to reduce the confusion set for the hard clustering. As a case study, we applied the proposed method to the task of constructing a large emotion lexicon by extending the emotion labels of the WordNet Affect lexicon using various features of words. Some of the features were extracted from the emotional statements of the freely available ISEAR dataset; the others were WordNet distance and similarity measured via the polarity scores in the SenticNet resource. The proposed method classified words by emotion label with high accuracy.
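The two uses of fuzzy membership values described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes standard fuzzy c-means (Bezdek's formulation) as the fuzzy clustering step, uses synthetic two-dimensional points in place of the word features, and the names `fuzzy_c_means`, `augment_and_restrict`, and `top_k` are hypothetical. The sketch appends each point's membership vector to its feature vector and keeps only the `top_k` clusters with the highest membership as that point's reduced confusion set.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means. Returns (centers, U), where U[i, k] is the
    membership of point i in cluster k and each row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # random initial memberships
    for _ in range(n_iter):
        W = U ** m                              # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)          # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def augment_and_restrict(X, U, top_k=2):
    """Append memberships as extra features and keep, per point, the indices
    of the top_k clusters by membership (an assumed form of confusion set)."""
    X_aug = np.hstack([X, U])
    confusion = np.argsort(-U, axis=1)[:, :top_k]
    return X_aug, confusion

# Synthetic stand-in for word feature vectors: three well-separated blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, (20, 2)) for c in (0.0, 5.0, 10.0)])

centers, U = fuzzy_c_means(X, n_clusters=3)
X_aug, confusion = augment_and_restrict(X, U, top_k=2)
print(X_aug.shape, confusion.shape)
```

In such a setup, a downstream hard classifier would be trained on `X_aug` and would only need to choose among the candidate classes listed in each row of `confusion`; how cluster indices map to emotion labels is left open here.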
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Poria, S., Gelbukh, A., Das, D., Bandyopadhyay, S. (2013). Fuzzy Clustering for Semi-supervised Learning – Case Study: Construction of an Emotion Lexicon. In: Batyrshin, I., González Mendoza, M. (eds) Advances in Artificial Intelligence. MICAI 2012. Lecture Notes in Computer Science, vol 7629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37807-2_7
DOI: https://doi.org/10.1007/978-3-642-37807-2_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37806-5
Online ISBN: 978-3-642-37807-2
eBook Packages: Computer Science, Computer Science (R0)