Abstract
Emotion detection in language has recently attracted increasing attention from NLP researchers, owing to the massive availability of people’s expressions, opinions, and emotions in comments on Web 2.0 platforms. Developing an automatic sentiment analysis system for Bengali is very challenging because of the scarcity of resources and the unavailability of standard corpora. The development of a standard dataset is therefore a prerequisite for analyzing emotional expressions in Bengali texts. This paper presents an emotion dataset (hereafter called ‘BEmoD’) for the analysis of emotion in Bengali texts and describes its development process, including data crawling, pre-processing, labeling, and verification. BEmoD contains 5200 texts labeled into six basic emotion categories: anger, fear, surprise, sadness, joy, and disgust. Dataset evaluation yields a Cohen’s \(\kappa \) score of 0.920, indicating high agreement among annotators. The evaluation also shows that the distribution of emotion words follows Zipf’s law.
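As a rough illustration of the agreement measure mentioned in the abstract, the sketch below computes Cohen’s \(\kappa \) for two annotators as \((p_o - p_e)/(1 - p_e)\), where \(p_o\) is the observed agreement and \(p_e\) is the agreement expected by chance. The function name `cohen_kappa` and the toy labels are assumptions made for illustration; they are not taken from BEmoD, and this is not presented as the authors' evaluation code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label
    # frequencies, summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy labels, invented for illustration only (not BEmoD data).
annotator_1 = ["joy", "anger", "fear", "joy",
               "sadness", "disgust", "surprise", "joy"]
annotator_2 = ["joy", "anger", "fear", "sadness",
               "sadness", "disgust", "surprise", "joy"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

In this toy case the annotators agree on 7 of 8 items, but \(\kappa \) is lower than 0.875 because part of that agreement would be expected by chance.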
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Das, A., Iqbal, M.A., Sharif, O., Hoque, M.M. (2021). BEmoD: Development of Bengali Emotion Dataset for Classifying Expressions of Emotion in Texts. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing and Optimization. ICO 2020. Advances in Intelligent Systems and Computing, vol 1324. Springer, Cham. https://doi.org/10.1007/978-3-030-68154-8_94
DOI: https://doi.org/10.1007/978-3-030-68154-8_94
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68153-1
Online ISBN: 978-3-030-68154-8
eBook Packages: Intelligent Technologies and Robotics (R0)