Abstract
The Zika virus caused an epidemic in 2015–16 and remains a global health concern. The growing practice of sharing critical information on social networks such as Twitter motivated us to propose a classification model that identifies tweets related to Zika and thereby allows us to extract useful insights about the community. In this paper, we describe the collection of data from Twitter, the preprocessing of that data, and the construction of a model to fit it. We then compare the accuracy of support vector machines and the Naïve Bayes algorithm for text classification and explain why the support vector machine outperforms Naïve Bayes. We also present analytical tools such as word clouds, which offer a more sophisticated method of retrieving community support from social networks such as Twitter.
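As a rough illustration of the preprocessing and Naïve Bayes steps summarized above, the following is a minimal sketch, not the authors' implementation. It assumes a simple cleaning scheme (lowercasing, removal of URLs and @mentions, keeping hashtag words) and a multinomial Naïve Bayes classifier with Laplace smoothing. The names `preprocess`, `MultinomialNB`, and `TRAIN`, the labels `zika` and `other`, and the six example tweets are all hypothetical and are not taken from the paper's data set. The support vector machine side of the comparison is not shown.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(tweet):
    """Lowercase a tweet, strip URLs and @mentions, keep hashtag words, tokenize."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop '#'
    return re.findall(r"[a-z]+", text)


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokens:
                if word in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for illustration only.
TRAIN = [
    ("Zika virus cases rising in Brazil, health officials warn http://example.com", "zika"),
    ("Pregnant women advised to avoid mosquito bites #Zika", "zika"),
    ("@friend new mosquito control plan to stop the virus outbreak", "zika"),
    ("Great football match tonight, what a goal!", "other"),
    ("Loving the new phone, battery lasts all day", "other"),
    ("Traffic is terrible this morning, late for work again", "other"),
]

model = MultinomialNB().fit(
    [preprocess(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)
prediction = model.predict(
    preprocess("Mosquito spraying begins as #Zika virus spreads https://example.org")
)
print(prediction)
```

On a real corpus, the same cleaned tokens would feed both classifiers, so any difference in accuracy would reflect the learning algorithm rather than the preprocessing.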
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Tripathy, B.K., Thakur, S., Chowdhury, R. (2017). A Classification Model to Analyze the Spread and Emerging Trends of the Zika Virus in Twitter. In: Behera, H., Mohapatra, D. (eds) Computational Intelligence in Data Mining. Advances in Intelligent Systems and Computing, vol 556. Springer, Singapore. https://doi.org/10.1007/978-981-10-3874-7_61
DOI: https://doi.org/10.1007/978-981-10-3874-7_61
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3873-0
Online ISBN: 978-981-10-3874-7
eBook Packages: Engineering, Engineering (R0)