Propagation2Vec: Embedding partial propagation networks for explainable fake news early detection

https://doi.org/10.1016/j.ipm.2021.102618

Highlights

  • The propagation pattern of news on social media can facilitate fake news detection.

  • Propagation2Vec emphasises informative nodes/cascades to detect fake news.

  • Propagation2Vec reconstructs the knowledge of complete propagation networks from the partial networks available early.

  • Propagation2Vec achieves state-of-the-art performance for fake news early detection.

  • The underlying logic of Propagation2Vec is explainable by the attention weights.

Abstract

Many recent studies have demonstrated that the propagation patterns of news on social media can facilitate the detection of fake news. Most of these studies rely on complete propagation networks to build their models, but such networks are not fully available in the early stages and may take a long time to complete. Hence, relying on the complete propagation network is not ideal for fake news early detection. However, detecting fake news as early as possible is important due to its fast-spreading nature and the significant harm it can cause. In addition, most existing propagation network-based fake news detection techniques are not explicitly designed to jointly emphasise informative cascades and nodes in the propagation networks to detect fake news. To bridge these research gaps, this work proposes Propagation2Vec, a novel fake news early detection technique, which assigns varying levels of importance to the nodes and cascades in propagation networks, and reconstructs the knowledge of complete propagation networks based on their partial propagation networks at an early detection deadline. Our experiments show that our model can achieve state-of-the-art performance while only having access to the early stage propagation networks. Furthermore, we devise general explanations for the underlying logic of Propagation2Vec based on its attention weights assigned to different nodes and cascades, which improves the applicability of our approach and facilitates future research on propagation network-based fake news detection.

Introduction

While the growing popularity of social media has greatly facilitated the exchange of information, it also provides an ideal platform to spread fake news, especially intentional misinformation, which has already caused and will continue to cause significant damage. For example, it has been estimated that at least 800 people died and 5800 were admitted to hospital as a result of false information, spread during the COVID-19 pandemic, that alcohol-based cleaning products are a cure for the virus.

Even though many independent fact-checking organisations have emerged globally over recent years, the sheer volume of fake news makes it infeasible to rely entirely on human investigation. In addition, what makes the task even more challenging is that fake news needs to be detected at an early stage before it becomes widespread, since it is difficult to correct people’s perception of an issue once it is formed, even if the previous impression is inaccurate (keersmaecker & Roets, 2017). Therefore, this work focuses on early detection of fake news: verifying the validity of a news item within a certain time limit from when it is published online. Here we use the definition in Zhou and Zafarani (2018) that fake news is intentionally and verifiably false news published by a news outlet; similar definitions have also been used in previous studies on fake news detection (Monti et al., 2019, Ruchansky et al., 2017, Shu, Cui et al., 2019, Shu et al., 2017).

It has been demonstrated that the propagation pattern of news on social media, e.g., tweets and retweets of news on Twitter, can facilitate the detection of fake news (Liu and Wu, 2018, Ma et al., 2017, Shu, Mahudeswaran et al., 2019, Wu and Liu, 2018, Zhou and Zafarani, 2019), since the propagation pattern of fake news exhibits distinctive characteristics. Therefore, this work studies how the propagation patterns of news records can be effectively used to identify the veracity of a news record. Specifically, the propagation pattern of a news record refers to the corresponding tweets and retweets (see Section 3 for more details): as shown in Fig. 1, each propagation pattern can be considered as a tree, which consists of multiple cascades (sequences) and each cascade includes a sequence of tweets/retweets with the corresponding user profiles. Fig. 1 also shows that some nodes and cascades have distinctive characteristics, e.g., nodes corresponding to verified users, or the length of a cascade, which could be useful to identify fake news. To effectively exploit these types of knowledge, there should be a way to jointly emphasise informative nodes and cascades when identifying fake news using propagation patterns.
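To make the tree-and-cascade view concrete, the following is a minimal sketch that enumerates the cascades of a propagation tree as its root-to-leaf paths. The function name `extract_cascades`, the edge-list input format and the node ids are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def extract_cascades(edges, root):
    """Return every root-to-leaf path (cascade) of a propagation tree.

    edges: iterable of (parent_id, child_id) pairs, e.g. tweet -> retweet.
    root:  id of the node that starts the propagation.
    Both the input format and the ids are hypothetical.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    cascades = []
    # Iterative depth-first search; each stack entry carries the path so far.
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        kids = children.get(node, [])
        if not kids:
            # A leaf closes one cascade.
            cascades.append(path)
            continue
        # Push in reverse so children are visited in their original order.
        for kid in reversed(kids):
            stack.append((kid, path + [kid]))
    return cascades


# Toy propagation tree: one news record, two tweets, three retweets.
toy_edges = [
    ("news", "t1"), ("news", "t2"),
    ("t1", "r1"), ("t1", "r2"),
    ("r1", "r3"),
]
toy_cascades = extract_cascades(toy_edges, "news")
```

Each returned list is one cascade, so per-cascade features such as its length can be read off directly with `len(cascade)`.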

Another challenge is that the propagation network of a news record could take days or even weeks to complete. Hence, relying on the entire propagation network of a news record to identify its veracity is not ideal for fake news early detection. To address this challenge, models for fake news early detection should be able to recover the news label using a partially available propagation network at an early detection deadline.
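As an illustration of what a partially available propagation network looks like, the sketch below keeps only the nodes posted within a detection deadline after publication, together with the edges between them. The function name `truncate_at_deadline`, the timestamp units (seconds) and the data layout are assumptions, not details from the paper.

```python
def truncate_at_deadline(node_times, edges, publish_time, deadline):
    """Return the part of a propagation network visible at the deadline.

    node_times:   dict mapping node id -> posting timestamp (seconds).
    edges:        list of (parent_id, child_id) pairs.
    publish_time: timestamp at which the news record was published.
    deadline:     time allowed after publication (seconds).
    All names and units here are hypothetical.
    """
    cutoff = publish_time + deadline
    # Keep nodes posted no later than the cutoff, preserving input order.
    kept_nodes = {n: ts for n, ts in node_times.items() if ts <= cutoff}
    # Keep an edge only if both of its endpoints survived the cut.
    kept_edges = [(p, c) for p, c in edges if p in kept_nodes and c in kept_nodes]
    return kept_nodes, kept_edges


# Toy network: a source tweet and three retweets at different delays.
toy_times = {"t1": 0, "r1": 600, "r2": 7200, "r3": 9000}
toy_links = [("t1", "r1"), ("t1", "r2"), ("r2", "r3")]
early_nodes, early_edges = truncate_at_deadline(
    toy_times, toy_links, publish_time=0, deadline=3600
)
```

With a one-hour deadline, only the source tweet and the first retweet remain. An early detection model has to work from this reduced view.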

Another limitation that hinders the applicability of existing fake news detection models is the lack of explainability. Most propagation network-based fake news detection models adopt deep neural architectures such as Graph Neural Networks (GNN) and Graph Recurrent Neural Networks (GRNN), which are known as black-box models. Hence, how to explain the predictions made by a fake news detection model is another important problem that has not been well addressed in previous work.

Contributions. The contributions of our work are as follows.

  • We first conduct an extensive empirical study to highlight the importance of the aforementioned research gaps and to motivate our design decisions.

  • We propose Propagation2Vec, a novel propagation network-based fake news detection technique that is capable of:

    • assigning varying attention for the nodes and cascades in propagation patterns using a hierarchical attention mechanism to emphasise informative nodes and cascades for fake news detection;

    • reconstructing useful knowledge available in complete propagation networks using their early propagation networks, which enables early detection of fake news; and

    • explaining the underlying logic of the model, which provides useful insights for future research on propagation network-based fake news detection.

  • We evaluate our approach using two publicly available datasets. Our experimental results show that the proposed framework outperforms state-of-the-art fake news detection models by as much as 5.55% in F1-score, while revealing the fake news labels at an early detection deadline. In addition, we construct general explanations for the underlying logic of our model based on the attention weights assigned for the nodes and cascades in the propagation patterns.
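The idea of hierarchical attention over nodes and cascades can be sketched as two rounds of attention pooling: node vectors are pooled into one vector per cascade, and cascade vectors are then pooled into one vector per network. The code below is a simplified illustration under stated assumptions, not the authors' architecture: scores are plain dot products with fixed context vectors, whereas a trained model would learn the context vectors and would typically encode the nodes first. All function and parameter names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weighted average of `vectors`, weighted by softmax(dot(vector, context))."""
    scores = [sum(x * c for x, c in zip(vec, context)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def embed_network(cascades, node_context, cascade_context):
    """Two-level pooling: nodes -> cascade vectors -> one network vector.

    cascades: list of cascades, each a list of equal-length node feature vectors.
    The context vectors stand in for parameters that would be learned.
    """
    cascade_vectors, node_weights = [], []
    for cascade in cascades:
        vector, weights = attention_pool(cascade, node_context)
        cascade_vectors.append(vector)
        node_weights.append(weights)
    network_vector, cascade_weights = attention_pool(cascade_vectors, cascade_context)
    return network_vector, node_weights, cascade_weights


# Toy input: two cascades with 2-dimensional node features.
toy_input = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
net_vec, node_w, cascade_w = embed_network(
    toy_input, node_context=[1.0, 0.0], cascade_context=[0.0, 1.0]
)
```

The returned weights are what an attention-based explanation would inspect: `node_w[i][j]` indicates how much node `j` contributed to cascade `i`, and `cascade_w[i]` indicates how much cascade `i` contributed to the network embedding.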

Paper outline. The rest of the paper is structured as follows. In Section 2, we discuss previous work related to Propagation2Vec. Section 3 defines the problem statement. We conduct an empirical study in Section 4 to highlight the importance of our contribution. Section 5 provides the technical details of Propagation2Vec. We evaluate Propagation2Vec in Section 6 and conclude the manuscript in Section 7.

Section snippets

Related work

Detecting fake news on social media has been a popular research problem over recent years (Parikh and Atrey, 2018, Sharma et al., 2019, Shu et al., 2017). In this section, we review the prior work on this topic. Specifically, similar to Pierri and Ceri (2019) and Shu et al. (2017), we classify existing work into three categories: content-based approaches, context-based approaches and mixed approaches, the first two of which, as suggested by their names, mainly rely on news content and social

Problem statement

We define the problem of fake news early detection as follows: let $R^L$ be a set of labelled news records. Each record $r \in R^L$ is represented as a tuple $\langle t^r, W^r, G_t^r, y^r \rangle$, where (1) $t^r$ is the timestamp when $r$ is published online; (2) $W^r$ is the text content of $r$; (3) $G_t^r$ is the propagation network of $r$ at timestamp $t^r + t$ (further explained below); and (4) $y^r$ is the label: $y^r$ is 1 if $r$ is false and 0 otherwise.

Each propagation network $G_t^r$ is an attributed directed graph $(V_t^r, E_t^r, X_t^r)$, where:

  • $V_t^r$ is the

Quantitative analysis of propagation network features

This section first provides details of the datasets used in our experiments. Then, we analyse a wide range of node-level features of propagation networks, including temporal-based, text-based and user-based features, to identify their contributions to detecting fake news. Note that these features are not well studied in similar previous work (Shu, Mahudeswaran et al., 2019). Moreover, we analyse the importance of the features extracted from complete propagation networks and early

Propagation2Vec

This section provides the technical details of the proposed model Propagation2Vec. Motivated by the findings in Section 4.4, Propagation2Vec embeds propagation networks of news records as low-dimensional vectors such that these embeddings have two main properties that are useful to identify the veracity of news records. First, it is capable of assigning varying importance for the nodes and the information cascades of a propagation network. As found by the empirical study in Section 4.4, the

Experimental verification

In this section, we present our experimental results to demonstrate the performance of the proposed approach for fake news early detection.

Conclusion

In summary, this work proposed Propagation2Vec, a novel propagation-based fake news early detection technique. Propagation2Vec is designed to address two empirically verified research gaps in existing propagation-based detection methods. First, most existing techniques are unable to emphasise the informative nodes and cascades in propagation networks. To address this, we propose a hierarchical attention mechanism to encode propagation networks, which can assign varying levels of importance for

CRediT authorship contribution statement

Amila Silva: Conceptualization, Methodology, Software, Investigation, Validation, Data curation, Writing - original draft. Yi Han: Conceptualization, Data curation, Writing - original draft, Supervision. Ling Luo: Conceptualization, Writing - review & editing, Supervision. Shanika Karunasekera: Conceptualization, Writing - review & editing, Supervision. Christopher Leckie: Conceptualization, Writing - review & editing, Supervision.

Acknowledgements

This research was financially supported by the Melbourne Graduate Research Scholarship, Australia and the Rowden White Scholarship, Australia.

References (61)

  • Bian, T., et al. (2020). Rumor detection on social media with bi-directional graph convolutional networks.
  • Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling...
  • Cho, K., et al. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation.
  • Ciampaglia, G. L., et al. (2015). Computational fact checking from knowledge networks. PLOS ONE.
  • Cui, L., et al. (2020). Coaid: Covid-19 healthcare misinformation dataset.
  • Cui, L., Seo, H., Tabar, M., Ma, F., Wang, S., & Lee, D. (2020). DETERRENT: Knowledge guided graph attention network...
  • Guo, H., Cao, J., Zhang, Y., Guo, J., & Li, J. (2018). Rumor detection with hierarchical social attention network. In...
  • Han, Y., et al. (2020). Graph neural networks with continual learning for fake news detection from social media.
  • Horne, B. D., et al. (2017). This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news.
  • Horne, B. D., et al. (2019). Robust fake news detection over time and attack. ACM Transactions on Intelligent Systems and Technology (TIST).
  • Jin, Z., Cao, J., Zhang, Y., & Luo, J. (2016). News verification by exploiting conflicting social viewpoints in...
  • Jin, Z., et al. (2017). Novel visual and statistical image features for microblogs news verification. IEEE Transactions on Multimedia.
  • keersmaecker, J. D., et al. (2017). ‘Fake news’: Incorrect, but hard to correct. The role of cognitive ability on the impact of false information on social impressions. Intelligence.
  • Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proc. of...
  • Kochkina, E., Liakata, M., & Augenstein, I. (2017). Turing at SemEval-2017 task 8: Sequential approach to rumour stance...
  • Liu, Y., & Wu, Y.-f. B. (2018). Early detection of fake news on social media through propagation path classification...
  • Lu, Y.-J., et al. (2020). GCAN: Graph-aware co-attention networks for explainable fake news detection on social media.
  • Ma, J., Gao, W., Mitra, P., Kwon, S., Jansen, B. J., & Wong, K.-F., et al. (2016). Detecting rumors from microblogs...
  • Ma, J., Gao, W., & Wong, K.-F. (2017). Detect rumors in microblog posts using propagation structure via kernel...
  • Monti, F., et al. (2019). Fake news detection on social media using geometric deep learning.
  • Nickel, M., et al. (2016). A review of relational machine learning for knowledge graphs. IEEE.
  • Pan, J. Z., Pavlova, S., Li, C., Li, N., Li, Y., & Liu, J. (2018). Content based fake news detection using knowledge...
  • Parikh, S. B., & Atrey, P. K. (2018). Media-rich fake news detection: A survey. In Proc. of...
  • Parikh, S. B., Khedia, S. R., & Atrey, P. K. (2019). A framework to detect fake tweet images on social media. In Proc....
  • Pérez-Rosas, V., Kleinberg, B., Lefevre, A., & Mihalcea, R. (2018). Automatic detection of fake news. In Proc. of...
  • Pierri, F., et al. (2019). False news on social media: A data-driven survey. SIGMOD Record.
  • Popat, K., Mukherjee, S., Yates, A., & Weikum, G. (2018). DeClarE: Debunking fake news and false claims using...
  • Potthast, M., Kiesel, J., Reinartz, K., Bevendorff, J., & Stein, B. (2018). A stylometric inquiry into hyperpartisan...
  • Qian, F., Gong, C., Sharma, K., & Liu, Y. (2018). Neural user response generator: Fake news detection with collective...
  • Resource description framework (RDF) model and syntax specification. (1999).