Sentiment Analysis of Colloquial Arabic Tweets with Emojis

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1339)

Abstract

Sentiment analysis is the process of classifying data according to its sentiment polarity, either as positive or negative or into multiple classes. In this paper, our goal is twofold: firstly, to experiment with and evaluate different approaches to dialectal Arabic sentiment analysis, including various classifiers and features; secondly, to curate a dataset of Arabic dialect tweets and validate its usefulness by using it in the experiments. We collected the Twitter Arabic Dialect dataset from Twitter; this paper uses its subset, the Twitter Arabic Dialect Emoji (TADE) dataset. TADE is automatically annotated for sentiment using the emojis encountered in the tweets. Our method favors the real-world usage of an emoji over its theoretical meaning. We use traditional (shallow) and deep learning classifiers for sentiment analysis of the TADE dataset. The shallow classifiers include Gradient Boosting, Logistic Regression, Nearest Centroid, Decision Tree, MultinomialNB, SVM, XGB, Random Forest, AdaBoost, and a voting classifier. For the deep learning classifiers, we use MLP and CNN classifiers. Further, we experiment with TF-IDF and word embeddings for feature extraction.
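
The emoji-based automatic annotation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `EMOJI_POLARITY` lexicon, the majority-vote rule, and the function name `label_tweet` are all assumptions made for illustration, and the abstract does not give the actual emoji-to-sentiment mapping used for TADE.

```python
from collections import Counter

# Hypothetical emoji-to-polarity lexicon (an assumption for illustration).
# The paper derives emoji sentiment from real-world usage; that mapping
# is not reproduced here.
EMOJI_POLARITY = {
    "😍": "positive",
    "😊": "positive",
    "😡": "negative",
    "😭": "negative",
    "💔": "negative",
}


def label_tweet(text):
    """Label a tweet by majority vote over the emojis it contains.

    Returns "positive" or "negative", or None when the tweet has no
    known emoji or the votes are tied (such tweets stay unlabeled).
    """
    votes = Counter(EMOJI_POLARITY[ch] for ch in text if ch in EMOJI_POLARITY)
    if not votes or votes["positive"] == votes["negative"]:
        return None
    return "positive" if votes["positive"] > votes["negative"] else "negative"


if __name__ == "__main__":
    for tweet in ["الجو جميل اليوم 😍😊", "الخدمة سيئة 😡", "مرحبا"]:
        print(label_tweet(tweet), "|", tweet)
```

Leaving tied or emoji-free tweets unlabeled is one possible design choice; it keeps ambiguous examples out of the training set at the cost of discarding some data.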

Author information

Corresponding author

Correspondence to Ashraf Elnagar.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Khalifa, Y., Elnagar, A. (2021). Sentiment Analysis of Colloquial Arabic Tweets with Emojis. In: Hassanien, AE., Chang, KC., Mincong, T. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2021. Advances in Intelligent Systems and Computing, vol 1339. Springer, Cham. https://doi.org/10.1007/978-3-030-69717-4_40
