DOI: 10.1145/3442381.3450111
Research article

The Surprising Performance of Simple Baselines for Misinformation Detection

Published: 03 June 2021

ABSTRACT

As social media becomes increasingly prominent in our day-to-day lives, it is ever more important to detect informative content and prevent the spread of disinformation and unverified rumours. While many sophisticated and successful models have been proposed in the literature, they are often compared against older NLP baselines such as SVMs, CNNs, and LSTMs. In this paper, we examine the performance of a broad set of modern transformer-based language models and show that, with basic fine-tuning, these models are competitive with and can even significantly outperform recently proposed state-of-the-art methods. We present our framework as a baseline for creating and evaluating new methods for misinformation detection. We further study a comprehensive set of benchmark datasets, and discuss potential data leakage and the need for careful experimental design and understanding of the datasets to account for confounding variables. As an extreme case, we show that classifying only on the basis of the first three digits of tweet IDs, which encode information about the posting date, gives state-of-the-art performance on Twitter16, a commonly used benchmark dataset for fake news detection. We provide a simple tool to detect this problem and suggest steps to mitigate it in future datasets.
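To make the tweet-ID leakage check concrete, here is a minimal sketch of the idea (not the authors' released tool): train a classifier that sees nothing but the first three digits of each tweet ID, which for Snowflake-era IDs are strongly correlated with the posting date. The toy IDs, labels, and scikit-learn pipeline below are illustrative assumptions; in practice the IDs and veracity labels would be read from the benchmark's own files and the model scored on a held-out split.

    # Minimal sketch (illustrative, not the authors' tool): can the veracity label
    # be predicted from the first three digits of the tweet ID alone?
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy data; real IDs and labels would come from the Twitter16 benchmark files.
    tweet_ids = ["768807011059519489", "766332114072461312", "544388259359387648",
                 "552784600502915073", "765138950723768320", "553587052570456064"]
    labels    = ["false", "true", "unverified", "non-rumor", "false", "true"]

    # Keep only the first three digits of each ID (a rough proxy for the date).
    prefixes = [tid[:3] for tid in tweet_ids]

    # Character n-grams over the prefix plus a simple linear classifier.
    model = make_pipeline(
        CountVectorizer(analyzer="char", ngram_range=(1, 3)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(prefixes, labels)

    # On real data, score on a held-out split: accuracy far above chance means the
    # ID prefixes (i.e., posting dates) leak label information into the benchmark.
    print(model.score(prefixes, labels))

If such a prefix-only classifier approaches the accuracy of content-based models on the benchmark's test split, the temporal structure of the data, rather than the tweet text, is doing most of the work.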


Published in

WWW '21: Proceedings of the Web Conference 2021
April 2021, 4054 pages
ISBN: 9781450383127
DOI: 10.1145/3442381
Copyright © 2021 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

