Abstract
The problems of online misinformation and fake news have gained increasing prominence in an age where user-generated content and social media platforms are key forces in the shaping and diffusion of news stories. Unreliable information and misleading content are often posted and widely disseminated through popular social media platforms such as Twitter and Facebook. As a result, journalists and editors need new tools that can help them speed up the verification of content sourced from social media. Motivated by this need, we present a system that supports the automatic classification of multimedia Twitter posts as credible or misleading. The system leverages credibility-oriented features extracted from the tweet and from the user who published it, and trains a two-step classification model based on a novel semisupervised learning scheme. This scheme uses the agreement between two independent pretrained models on new posts as a guiding signal for retraining the classification model. We analyze a large labeled dataset of tweets that shared debunked fake and confirmed real images and videos, and show that integrating the newly proposed features, using bagging in the initial classifiers, and applying the semisupervised learning scheme together significantly improve classification accuracy. Moreover, we present a Web-based application for visualizing and communicating the classification results to end users.
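The agreement-based retraining idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the system's actual implementation: the toy `CentroidClassifier` stands in for the bagged classifiers, the split into tweet-based and user-based feature views is one plausible way to obtain two independent models, the choice to retrain the tweet-based model is arbitrary, and the function name `agreement_retrain` and all feature values are hypothetical.

```python
from statistics import mean


class CentroidClassifier:
    """Toy nearest-centroid classifier (a stand-in, not the paper's model)."""

    def fit(self, X, y):
        # One centroid per label: the column-wise mean of that label's rows.
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
            for x in X
        ]


def agreement_retrain(tweet_train, user_train, y_train, tweet_new, user_new):
    """Label new posts, using model agreement as a retraining signal."""
    # Step 1: two independent models, one per (assumed) feature view.
    cl_tweet = CentroidClassifier().fit(tweet_train, y_train)
    cl_user = CentroidClassifier().fit(user_train, y_train)
    pred_tweet = cl_tweet.predict(tweet_new)
    pred_user = cl_user.predict(user_new)

    # Posts on which both models agree are treated as pseudo-labeled.
    agreed = [i for i, (a, b) in enumerate(zip(pred_tweet, pred_user)) if a == b]
    agreed_set = set(agreed)
    disagreed = [i for i in range(len(tweet_new)) if i not in agreed_set]

    # Step 2: retrain on the original data plus the agreed posts, then
    # use the retrained model to decide the posts the models disagreed on.
    X_aug = tweet_train + [tweet_new[i] for i in agreed]
    y_aug = y_train + [pred_tweet[i] for i in agreed]
    retrained = CentroidClassifier().fit(X_aug, y_aug)

    final = list(pred_tweet)
    resolved = retrained.predict([tweet_new[i] for i in disagreed])
    for i, label in zip(disagreed, resolved):
        final[i] = label
    return final, agreed


# Hypothetical feature values, for illustration only.
tweet_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
user_train = [[0.9], [0.7], [0.2], [0.1]]
y_train = ["credible", "credible", "misleading", "misleading"]
tweet_new = [[0.85, 0.15], [0.15, 0.85], [0.55, 0.45]]
user_new = [[0.8], [0.15], [0.1]]

final, agreed = agreement_retrain(
    tweet_train, user_train, y_train, tweet_new, user_new
)
print(final, agreed)
```

In this toy run the two models agree on the first two new posts, which are added to the training set with their predicted labels; the third post, on which they disagree, is decided by the retrained model.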
Notes
Using: github.com/socialsensor/geo-util.
The selection is based on their performance on the training set during cross-validation.
The VC has since been expanded with new data that was used as part of the VMU 2016 task. However, the bulk of the experiments reported here use the 2015 version of the data, so any reference to VC means the 2015 edition of the dataset, unless otherwise stated.
In MediaEval, each team can submit up to five runs.
Additional information
This work has been supported by the REVEAL and InVID projects, under Contract Nos. 610928 and 687786, respectively, funded by the European Commission.
About this article
Cite this article
Boididou, C., Papadopoulos, S., Zampoglou, M. et al. Detection and visualization of misleading content on Twitter. Int J Multimed Info Retr 7, 71–86 (2018). https://doi.org/10.1007/s13735-017-0143-x
DOI: https://doi.org/10.1007/s13735-017-0143-x