Abstract
Various research areas at the intersection of computer and social sciences require a ground truth of contextualized claims labelled with their truth values in order to facilitate supervision, validation or reproducibility of approaches dealing, for example, with fact-checking or analysis of societal debates. So far, no reasonably large, up-to-date and queryable corpus of structured information about claims and related metadata is publicly available. In an attempt to fill this gap, we introduce ClaimsKG, a knowledge graph of fact-checked claims, which facilitates structured queries about their truth values, authors, dates, journalistic reviews and other kinds of metadata. ClaimsKG is generated through a semi-automated pipeline, which harvests data from popular fact-checking websites on a regular basis, annotates claims with related entities from DBpedia, and lifts the data to RDF using an RDF/S model that makes use of established vocabularies. In order to harmonise data originating from diverse fact-checking sites, we introduce normalised ratings as well as a simple claims coreference resolution strategy. The current knowledge graph, extensible to new information, consists of 28,383 claims published since 1996, amounting to 6,606,032 triples.
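To make the idea of normalised ratings concrete, the following is a minimal sketch of how native labels from different fact-checking sites might be mapped onto one coarse scale. The category names, the site identifiers, and the label mappings are all assumptions made for illustration; they are not taken from the ClaimsKG pipeline itself.

```python
from typing import Dict

# Hypothetical coarse scale (assumed category names, for illustration only).
NORMALISED = ("TRUE", "FALSE", "MIXTURE", "OTHER")

# Hypothetical per-site mappings from native rating labels to the coarse scale.
# Real sites and their labels would be substituted here.
SITE_MAPPINGS: Dict[str, Dict[str, str]] = {
    "site_a": {"true": "TRUE", "mostly true": "MIXTURE", "pants on fire": "FALSE"},
    "site_b": {"correct": "TRUE", "incorrect": "FALSE", "unproven": "OTHER"},
}


def normalise_rating(site: str, label: str) -> str:
    """Map a site-specific label to the coarse scale.

    Unknown sites and unknown labels fall back to "OTHER" so that
    every claim still receives a value that can be queried uniformly.
    """
    mapping = SITE_MAPPINGS.get(site, {})
    return mapping.get(label.strip().lower(), "OTHER")
```

Falling back to a catch-all category, rather than raising an error, is one possible design choice: it keeps the graph queryable when a site introduces a label the mapping does not yet cover.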
Notes
- 4. We provide full correspondence tables here: https://goo.gl/Ykus98.
Acknowledgments
We thank Vinicius Woloszyn for providing support on the first step of the pipeline (claim extraction), as well as Josselin Alezot, Imran Meghazi and Elisa Gueneau for their work on the Web interface. We thank all fact-checking sites and the fact-checkers community for the laborious work of manual claim verification.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tchechmedjiev, A. et al. (2019). ClaimsKG: A Knowledge Graph of Fact-Checked Claims. In: Ghidini, C., et al. The Semantic Web – ISWC 2019. ISWC 2019. Lecture Notes in Computer Science, vol 11779. Springer, Cham. https://doi.org/10.1007/978-3-030-30796-7_20
DOI: https://doi.org/10.1007/978-3-030-30796-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30795-0
Online ISBN: 978-3-030-30796-7
eBook Packages: Computer Science, Computer Science (R0)