Abstract
Twitter, an online micro-blogging and social networking service, lets registered users write anything they wish in 140 characters, giving them the opportunity to express their opinions and sentiments on events taking place. Politically sentimental tweets are top-trending tweets; whenever an election is near, users tweet about their favorite candidates or political parties and at times give their reasons. In this study, we hybridize two n-gram models, apply Laplace smoothing to the Naïve Bayesian classifier, and apply Katz back-off to the model. The two models are the unigram and the n-gram: throughout this study, "unigram" refers to the lowest-order n-gram, and "n-gram" refers to the highest-order n-gram (the full sentence or tweet level). This was done to smooth the models and to address the limited accuracy, in terms of precision and recall, that n-gram models suffer because of the 'zero count problem.' Results from our baseline model show an increase of 6.05% in average F-Harmonic accuracy in comparison with the n-gram model and an increase of 1.75% in comparison with the semantic-topic model proposed in a previous study on the same dataset, i.e., the Obama–McCain dataset.
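The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `HybridNgramNB`, the whitespace tokenizer, the toy tweets, and the plain "back off when the tweet-level count is zero" rule are all assumptions made for illustration. In particular, the fallback below is a simplified stand-in for Katz back-off, which additionally discounts the higher-order counts. The sketch first looks for the whole tweet as a single highest-order n-gram; if that n-gram was never seen (the zero count problem), it falls back to a unigram Naïve Bayes score with Laplace (add-one) smoothing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class HybridNgramNB:
    """Illustrative hybrid of a tweet-level n-gram and a unigram Naive Bayes."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                    # Laplace smoothing constant
        self.class_docs = Counter()           # number of tweets per class
        self.unigrams = defaultdict(Counter)  # class -> token counts
        self.full = defaultdict(Counter)      # class -> whole-tweet n-gram counts
        self.vocab = set()

    def fit(self, tweets, labels):
        for text, label in zip(tweets, labels):
            tokens = tokenize(text)
            self.class_docs[label] += 1
            self.unigrams[label].update(tokens)
            self.full[label][tuple(tokens)] += 1
            self.vocab.update(tokens)
        return self

    def _log_prior(self, label):
        return math.log(self.class_docs[label] / sum(self.class_docs.values()))

    def _unigram_log_likelihood(self, tokens, label):
        counts = self.unigrams[label]
        # Add-alpha smoothing keeps unseen tokens from zeroing the product.
        denom = sum(counts.values()) + self.alpha * len(self.vocab)
        return sum(math.log((counts[t] + self.alpha) / denom) for t in tokens)

    def predict(self, text):
        tokens = tokenize(text)
        key = tuple(tokens)
        # Highest-order model: was this exact tweet seen in any class?
        seen = {c: self.full[c][key] for c in self.class_docs
                if self.full[c][key] > 0}
        if seen:
            return max(seen, key=seen.get)
        # Zero count at tweet level: back off to the smoothed unigram model.
        scores = {c: self._log_prior(c) + self._unigram_log_likelihood(tokens, c)
                  for c in self.class_docs}
        return max(scores, key=scores.get)


# Toy data, invented for illustration only.
tweets = ["obama is great", "love obama speech",
          "mccain is terrible", "hate this debate"]
labels = ["pos", "pos", "neg", "neg"]
model = HybridNgramNB().fit(tweets, labels)
print(model.predict("obama is great"))      # exact tweet-level match
print(model.predict("great speech obama"))  # unseen tweet, unigram fallback
```

Keeping the two models separate makes the back-off decision explicit: the unigram model is consulted only when the tweet-level count is zero, which is where smoothing matters most.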
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest with any person or institution regarding the publication of this paper.
Cite this article
Awwalu, J., Bakar, A.A. & Yaakub, M.R. Hybrid N-gram model using Naïve Bayes for classification of political sentiments on Twitter. Neural Comput & Applic 31, 9207–9220 (2019). https://doi.org/10.1007/s00521-019-04248-z
DOI: https://doi.org/10.1007/s00521-019-04248-z