Comparative Approach of Sentiment Analysis Algorithms to Classify Social Media Information Gathering in the Spanish Language

Data Science and Algorithms in Systems (CoMeSySo 2022)

Abstract

This research offers a perspective on the large volume of information shared by millions of people who express feelings and opinions on social networks. Data were gathered from Twitter using the API v2 during June 2022, on topics related to local economic indices in Peru, with the aim of identifying the precision of different classification algorithms for sentiment analysis (SA) in the Spanish language. The collected tweets, which formed the study's corpus, were processed using Natural Language Processing (NLP) and then vectorized with the Bag of Words (BOW) method. This produced a vocabulary of 4045 clean tokens, with the absence (0) or presence (1) of each word in each tweet marked with the support of software libraries in RStudio. Finally, three Machine Learning techniques for sentiment analysis were compared: Naïve Bayes, Support Vector Machine, and K-Nearest Neighbors. The results indicate an accuracy of 92.13% for Naïve Bayes (NB) (F1 score = 0.90026135), 93.11% for Support Vector Machine (SVM) (F1 score = 0.90026135), and 91.71% for K-Nearest Neighbors (KNN) (F1 score = 0.90026135). The results indicate that the best-performing classification technique was the SVM, with an accuracy of 95.15%, making it possible to identify the best technique for classifying sentiment in Spanish, applied to a thematic setting of economic indicators (e.g., inflation, unemployment, dollar) in the Peruvian context.
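The presence/absence encoding and the two reported metrics can be illustrated with a minimal sketch. It is written in Python rather than the R libraries the study used, and it is not the authors' implementation. The function names, the tokenization rule, and the three example tweets are all hypothetical, chosen only to show how a binary Bag of Words matrix is built and how accuracy and F1 are computed from predicted labels.

```python
import re

def tokenize(text):
    """Lowercase a tweet and keep alphabetic tokens, including Spanish accented letters.
    Assumption: a simple regex stands in for the paper's (unspecified) NLP cleaning."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())

def build_vocabulary(docs):
    """Return the sorted list of distinct tokens found across all documents."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})

def binary_bow(docs, vocab):
    """One row per document: 1 if the vocabulary word is present, 0 if it is absent."""
    rows = []
    for doc in docs:
        present = set(tokenize(doc))
        rows.append([1 if word in present else 0 for word in vocab])
    return rows

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example tweets (not taken from the study's corpus).
tweets = ["La inflación sube otra vez", "El dólar baja hoy", "Sube el dólar"]
vocab = build_vocabulary(tweets)
matrix = binary_bow(tweets, vocab)

# Hypothetical true and predicted sentiment labels (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Each row of `matrix` could then be passed as a feature vector to any of the three classifiers named in the abstract. The binary encoding ignores how often a word occurs in a tweet, which matches the 0/1 marking described above.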

Corresponding author

Correspondence to Juan J. Soria.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Soria, J.J., De la Cruz, G., Molina, T., Ramos-Sandoval, R. (2023). Comparative Approach of Sentiment Analysis Algorithms to Classify Social Media Information Gathering in the Spanish Language. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds) Data Science and Algorithms in Systems. CoMeSySo 2022. Lecture Notes in Networks and Systems, vol 597. Springer, Cham. https://doi.org/10.1007/978-3-031-21438-7_64
