Hierarchical Co-Attention Selection Network for Interpretable Fake News Detection

Abstract: With the growth of the internet, fake news on social media has become a pervasive and problematic issue. Recent studies have applied different artificial intelligence techniques to verify the truth of news and to provide explanations for the results, with remarkable success in interpretable fake news detection. However, individuals' judgments of news are usually hierarchical, prioritizing valuable words over essential sentences, which existing fake news detection models neglect. In this paper, we propose a novel interpretable neural network model, the hierarchical co-attention selection network (HCSN), which predicts whether a source post is fake and produces an explanation that emphasizes important comments and particular words. The key insight of HCSN is to incorporate the Gumbel-Max trick into a hierarchical co-attention selection mechanism that captures sentence-level and word-level information from the source post and comments, following the sequence words-sentences-words-event. In addition, HCSN enjoys the benefit of interpretability: it provides an explicit account of how it reaches its results by selecting comments and highlighting words. In experiments on real-world datasets, our model outperformed state-of-the-art methods and generated reasonable explanations.


Introduction
As a consequence of the booming growth of social media platforms, fake news on social media has become a pervasive problem in society [1]. Given the ease with which individuals can freely and swiftly share their thoughts and feelings on these platforms, fake news spreads quickly and can easily distort people's assessment of political [2,3], economic [4,5], and public health [6][7][8] events.
Substantial research on developing effective and automated frameworks for detecting fake news, drawing on data mining and machine learning techniques, has been conducted in recent years. Relying on classical machine learning approaches, researchers extract hand-crafted features and utilize supervised learning (e.g., support vector machines and random forests) for the detection of fake news [9][10][11]. With the advancement of deep learning, user profiles [12,13], user responses [14][15][16], and news propagation [17] have been used to learn hidden representations of news through neural networks (e.g., recurrent neural networks (RNNs) and graph neural networks (GNNs)). These methods improve detection performance, but it is challenging for them to provide a reasonable explanation of the detection results. To address this issue, interpretable fake news detection models provide explanations through user comments [15], user information [18], and retweet sequences [19].
While existing efforts in interpretable fake news detection have been excellent, grey areas remain. First, models based on user information and forwarding sequences [18] are costly, since collecting user information and forwarding sequences takes time, and they raise user-privacy concerns. Second, current interpretable models adopt hierarchical structures to achieve excellent classification performance, such as word-post-subevent-event [20] and word-sentence-event [21], but they ignore the relevance between the source post and user comments, which could otherwise increase trust in the explanation. Finally, although some hierarchical interpretable models do exploit the correlation between source posts and user comments, they have two shortcomings. On the one hand, some ignore the effect of post-related tokens. For example, the interpretable model dEFEND [15] only considers sentence-level relevance between source posts and user comments: in Figure 1, red marks the highlighted words that dEFEND uses to construct sentence vectors, but it ignores the words in green, which are post-related words that deserve more attention when detecting fake news and should also be part of the explanation. On the other hand, some models lack a selection process to reduce irrelevant information. The MAC model [22] uses multi-head attention to build a word-document hierarchy and considers word-level correlation, but it operates on the whole document and has no selection step for extracting important information and filtering out noise and unrelated sentences. To address these issues, we propose a novel hierarchical co-attention selection network (HCSN).
First, a recurrent neural network (RNN) was utilized to learn word representations for source posts and comments, and a sentence embedding was constructed for each sentence with a self-attention mechanism. Then, we adopted a sentence-level co-attention mechanism to learn the correlation between the source post and comments and utilized the Gumbel-Max trick to select informative sentences for the next layer. Next, we utilized a word-level co-attention mechanism to capture the word-level correlation between the source posts and the selected comments. Finally, the binary prediction was produced from the final learned embedding. In addition, the model provides two types of explanations for its predictions: sentence-level explanations and word-level explanations. To confirm the interpretability of the model, we also conducted a case study comparing the prediction results under three modification strategies.
The contributions are summarized as follows:
1. We propose an interpretable method, HCSN, to predict the veracity of news in a realistic social media scenario.
2. We present a hierarchical co-attention structure that incorporates the Gumbel-Max trick to select relevant comments and valuable words, following the order of human judgment of news (sentences to words), to facilitate veracity learning.
3. We compare HCSN with state-of-the-art models on real datasets. Besides competitive prediction results, HCSN provides reasonable sentence-level and word-level explanations, as shown by a case study with three modification strategies.
We organized this paper as follows. We discuss the related fake news detection approaches in Section 2. Then in Section 3, we describe the problem statement. Section 4 details the structure of our proposed HCSN model. The evaluation settings, results, and explanation analysis are in Section 5. We conclude our work and indicate future work in Section 6.

Related Work
This section provides an overview of the relevant research on interpretable deep learning and fake news detection.

Fake News Detection
Automatic fake news detection methods are usually divided into news content-based, social context-based, and hybrid feature-based methods according to the features used [23]. News content-based methods draw on two types of content, textual and visual. Many studies on textual content extract a large number of credibility-indicative features around language style [24], emotion [25,26], writing style [27], and semantics [28]. For example, Cui [24] proposed a novel framework to detect rumors by capturing differences in writing style, since rumors tend to favor capitalized words and nouns. In contrast, visual methods extract the distinguishing characteristics of fake news from images or videos [29]. Social context-based methods include user-based and propagation-based methods. The former model the characteristics of users who publish and retweet fake news [12,13], mainly including user sex, number of followers, and the user profile. The latter detect fake news through features of retweets or propagation structures in social networks [30][31][32][33]. In reference [32], for example, a bidirectional graph neural network was used to learn an embedding of the propagation structure for detecting fake news. Hybrid feature-based methods fuse multiple models or multiple features for fake news detection [26,34,35].

The Interpretation of Deep Learning
Machine learning (ML) and artificial intelligence (AI) models have gradually risen in complexity, accuracy, and other quality indicators over the years. However, this growth has often come at the expense of the interpretability of the models' final results. At the same time, academics and practitioners have come to realize that greater openness in artificial intelligence and deep learning engines is required if these techniques are to be used in practice [36]. In recent years, interpretable AI (IAI) and explainable AI (XAI) models have been applied in more domains [37], such as cybersecurity [38], recommender systems [39], healthcare [40], social networks [18], etc. Explanation of deep learning models generally refers to presenting model decisions in an understandable manner, which helps the user understand the inner workings of complex models and the reasons why models make specific decisions. Interpretable AI relies on intrinsic interpretability, achieved by adopting self-explanatory models that incorporate interpretability directly into their structure [15]. In contrast, explainable AI relies on post hoc explanations, which require building another model to explain the existing one [41]. Recent studies on detecting fake news have focused on identifying evidence to make the model interpretable or on analyzing results with interpretability tools. These explainable and interpretable methods mainly provide explanations by extracting relevant articles [15], user information [18], and retweet sequences [19].

Problem Statement
S = s_1, s_2, . . ., s_M is a source post, which contains M sentences, and each sentence s_m = w_1^m, w_2^m, . . ., w_p^m contains p words. In fact, a source post often contains only one sentence; we use multiple sentences in order to unify the notation and to cover long-text fake news. When a source post is published on a social network, some users share their views or opinions about it, forming a large number of comments. C = c_1, c_2, . . ., c_N is the set of N comments related to the source post S, where each comment c_n = w_1^n, w_2^n, . . ., w_q^n contains q words. We treated fake news detection as a binary classification task, with a binary label y ∈ {0, 1} indicating the truthfulness of each source post. In addition, the model selects certain sentences from the source post, certain comments from the user comments, and then certain words from both to interpret why a post was classified as fake news.
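To make the notation concrete, the inputs can be encoded as nested lists of tokens plus a binary label. This is purely illustrative; the variable names are ours, not the paper's.

```python
# Illustrative encoding of the problem inputs (names are ours, not the paper's).
source_post = [
    ["people", "are", "enraged", "that", "starbucks", "red", "cups",
     "are", "not", "christmasy", "enough"],   # sentence s_1 (so M = 1 here)
]
comments = [
    ["this", "is", "ridiculous"],             # comment c_1
    ["source", "please"],                     # comment c_2
]
label = 1  # y in {0, 1}; 1 = fake, 0 = true

M = len(source_post)   # number of sentences in the source post
N = len(comments)      # number of comments about it
```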

The Proposed HCSN Model
In this section, we introduce the details of utilizing source posts and user comments to detect fake news through the hierarchical co-attention selection network (HCSN) model. As shown in Figure 2, the HCSN consisted of the following four components: (1) input encoder, which generated the word-level representations of the source post and comments through the RNN and self-attention mechanism; (2) sentence-level co-attention, which selected informative comments through the sentence-level interaction of the source post and user responses; (3) word-level co-attention, which selected informative words or phrases through the word-level interaction of the source post and selected comments; and (4) fake news prediction, which produced the final prediction by concatenating the learned representations of the source posts and user comments.

Input Encoding
In fake news detection and text classification tasks, researchers often use self-attention mechanisms to learn word-level or sentence-level representations, and we adopted the same structure to learn sentence-level representations. In particular, we first obtained the word vector w_t ∈ R^d for each word through an embedding matrix. Since source posts and user comments on Twitter are usually short texts, we directly adopted a bidirectional GRU [42] to model the word sequence. Finally, we obtained the sentence vector through the self-attention mechanism.
For a source post sentence s_m consisting of p words, the forward and backward hidden states were obtained as

→h_t^m = →GRU(w_t^m), ←h_t^m = ←GRU(w_t^m), t ∈ [1, p].

By concatenating the forward hidden state →h_t^m and the backward hidden state ←h_t^m, we obtained the annotation h_t^m = [→h_t^m; ←h_t^m] for each word. In order to find the informative words in the sentence, the importance of each word was measured by the self-attention mechanism [43], which yielded the sentence vector s_m ∈ R^2d as follows:

s_m = Σ_{t=1}^{p} α_t^m h_t^m,

where α_t^m, the importance of the t-th word for the sentence s_m, was calculated as

u_t^m = tanh(W_w h_t^m + b_w), α_t^m = exp((u_t^m)^T u_w) / Σ_{t'} exp((u_{t'}^m)^T u_w),

with W_w, b_w, and u_w being trainable parameters. Similarly, given a comment c_n with q words, we obtained the word annotations h_t^n = [→h_t^n; ←h_t^n] and the comment vector c_n ∈ R^2d.
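The self-attention pooling step can be sketched in numpy. This is a minimal sketch under our own assumptions: the Bi-GRU outputs are replaced by a random matrix H, and W, b, v stand in for the trainable parameters W_w, b_w, u_w.

```python
import numpy as np

def self_attention_pool(H, W, b, v):
    """Pool word hidden states H (p, 2d) into one sentence vector (2d,).

    alpha_t = softmax_t( v^T tanh(W h_t + b) );  s = sum_t alpha_t h_t.
    W: (2d, 2d), b: (2d,), v: (2d,) are learned in practice, random here.
    """
    u = np.tanh(H @ W.T + b)            # (p, 2d) hidden attention space
    scores = u @ v                      # (p,) unnormalized word scores
    scores -= scores.max()              # numerical stability for exp
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ H, alpha             # sentence vector, attention weights

rng = np.random.default_rng(0)
p, two_d = 5, 8                         # 5 words, hidden size 2d = 8
H = rng.normal(size=(p, two_d))         # stand-in for Bi-GRU word annotations
W = rng.normal(size=(two_d, two_d))
b = np.zeros(two_d)
v = rng.normal(size=two_d)

s_m, alpha = self_attention_pool(H, W, b, v)
```

The attention weights alpha form a probability distribution over the words, so the sentence vector is a convex combination of the word annotations.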

Sentence-Level Co-Attention
Since social media platforms allow people to respond to the original post, many comments support or deny the source post, which can help fake news detection models confirm the authenticity of a piece of news. However, a large amount of noise is also present. In this section, we aim to select the most informative comments. Specifically, we utilized sentence-level co-attention to select comments according to the semantic affinity between the source post and the user comments. We therefore first constructed feature matrices for the source posts and user comments separately. Similar to the MPCN model [44], given the source post S ∈ R^{M×2d} and the corresponding user comments C ∈ R^{N×2d}, we can capture the similarity matrix X ∈ R^{M×N} as follows:

X = F(S) Q F(C)^T,

where Q ∈ R^{2d×2d} is a trainable matrix and F(·) is a feed-forward neural network. We calculated the row and column maxima of the affinity matrix X and utilized the results to weight the source posts and user comments.
In order to select particular sentences from the source posts and comments, we calculated the pointers to sentences as

p_s = G(max_col(X)), p_c = G(max_row(X)),

where G(·) denotes the Gumbel-Max operation described below, and max_col(X) and max_row(X) are the column-wise and row-wise maxima of X. We chose max-pooling here because it intuitively selects the most influential source post sentences and user comments. In this kind of process, the input vector is usually transformed into a probability distribution with the standard softmax function, and the resulting co-attention vector representation is fed to the next layer of the framework. However, we did not want to use these soft vector representations; instead, we wanted to perform further operations on the selected comments themselves. Therefore, we used the Gumbel-Max trick to learn pointers on top of the co-attention layer, since the Gumbel-Max trick [45] turns sampling from a multinomial distribution into a discrete optimization problem; that is, it transforms a sampling problem into an optimization problem.
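The affinity-and-pointer computation can be sketched as follows. This is a sketch under our own assumptions: F(·) is taken as the identity for brevity (in the model it is a feed-forward network), and the max-pooled scores are treated directly as the unnormalized log-probabilities fed to the Gumbel-Max step.

```python
import numpy as np

def affinity_pointers(S, C, Q, rng):
    """Sentence-level co-attention scores and a Gumbel-Max comment pointer.

    X[i, j] = F(s_i)^T Q F(c_j); F is taken as the identity here.
    Row/column maxima score each post sentence / comment.
    """
    X = S @ Q @ C.T                        # (M, N) affinity matrix
    post_scores = X.max(axis=1)            # best match per post sentence
    comment_scores = X.max(axis=0)         # best match per comment
    # Gumbel-Max: add Gumbel noise, take argmax -> discrete pointer index
    g = -np.log(-np.log(rng.uniform(size=comment_scores.shape)))
    p_c = int(np.argmax(comment_scores + g))
    return X, post_scores, p_c

rng = np.random.default_rng(1)
M, N, two_d = 2, 4, 6                      # 2 post sentences, 4 comments
S = rng.normal(size=(M, two_d))
C = rng.normal(size=(N, two_d))
Q = rng.normal(size=(two_d, two_d))
X, post_scores, p_c = affinity_pointers(S, C, Q, rng)
```

The returned index p_c points at the comment the model commits to, rather than a soft weighting over all comments.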
Consider a k-dimensional categorical distribution whose class probabilities are described in terms of unnormalized probabilities l_1, . . ., l_k. A one-hot sample e = (e_1, . . ., e_k) ∈ R^k from the distribution can be obtained as follows:

e_i = 1 if i = argmax_j (log(l_j) + g_j), and e_i = 0 otherwise,

where the g_j are i.i.d. samples of Gumbel noise, g_j = -log(-log(u_j)) with u_j ~ Uniform(0, 1). In this case, the argmax operation is equivalent to drawing a sample with probability proportional to l_i, and g_j signifies the Gumbel noise that perturbs each log(l_j) term.
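The Gumbel-Max trick above can be verified with a few lines of numpy: sampling by argmax over noise-perturbed log-probabilities reproduces the categorical distribution. A minimal sketch:

```python
import numpy as np

def gumbel_max_sample(log_l, rng):
    """Draw a one-hot sample from a categorical distribution given its
    log-probabilities log_l, via the Gumbel-Max trick:
    e = one_hot(argmax_i(log_l_i + g_i)), with g_i ~ Gumbel(0, 1)."""
    g = -np.log(-np.log(rng.uniform(size=log_l.shape)))  # Gumbel(0,1) noise
    e = np.zeros_like(log_l)
    e[np.argmax(log_l + g)] = 1.0
    return e

rng = np.random.default_rng(42)
log_l = np.log(np.array([0.1, 0.2, 0.7]))   # class probabilities 0.1/0.2/0.7

# Empirical check: sample frequencies should approach the probabilities.
counts = np.zeros(3)
for _ in range(20000):
    counts += gumbel_max_sample(log_l, rng)
freqs = counts / counts.sum()
```

Each draw is a hard one-hot vector, which is exactly what the pointer mechanism needs: a discrete selection rather than a soft attention distribution.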
By applying p_s and p_c to S and C, respectively, we obtained the selected source post sentences and user comments. The selected user comments C̄ and source post S̄ were then passed to the next layer, where rich word-level interactions were extracted between them.

Word-Level Co-Attention
In the previous section, we obtained multiple informative user comments through sentence-level co-attention, using pointers to select the most informative comments one by one. Although the self-attention in the input encoding already focuses on word information when building sentence vectors, the sentence representations still contain redundancy for predicting the veracity of the news. We therefore adopted a word-level co-attention mechanism to extract finer-grained information, which enables richer interactions. Following the computation of the sentence-level co-attention, we computed the affinity matrix Ȳ between the words of the selected source post S̄ and the selected user comments C̄ as follows:

Ȳ = F(S̄) Q_w F(C̄)^T,

where Q_w ∈ R^{2d×2d} and F(·) is the same function as in the sentence-level co-attention. Unlike the sentence-level co-attention, which relied on pointers to select sentences, we computed the word-level co-attention representations from the affinity matrix with mean pooling:

a_s = H(mean_col(Ȳ)), a_c = H(mean_row(Ȳ)),

where H(·) is the standard softmax function. Mean pooling and the softmax function directly give the word-level attention weights, and the weighted sums of the word annotations form the word-level representations obtained by the word-level co-attention.
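The word-level step can be sketched in the same style as the sentence-level one. Again this is a sketch under our own assumptions: F(·) is taken as the identity, and the orientation of the mean pooling (per-row vs. per-column) follows our reading of the affinity matrix layout.

```python
import numpy as np

def word_co_attention(S_bar, C_bar, Q_w):
    """Word-level co-attention: affinity Y[i, j] = F(s_i)^T Q_w F(c_j)
    (F taken as identity for brevity), then mean-pool each axis and
    softmax into attention weights that pool the word annotations."""
    def softmax(x):
        x = x - x.max()                    # numerical stability
        e = np.exp(x)
        return e / e.sum()

    Y = S_bar @ Q_w @ C_bar.T              # (p, q) word affinity matrix
    a_s = softmax(Y.mean(axis=1))          # weight per source-post word
    a_c = softmax(Y.mean(axis=0))          # weight per comment word
    return a_s @ S_bar, a_c @ C_bar        # pooled word-level representations

rng = np.random.default_rng(2)
p, q, two_d = 6, 9, 4
S_bar = rng.normal(size=(p, two_d))        # words of the selected post sentence
C_bar = rng.normal(size=(q, two_d))        # words of the selected comment
Q_w = rng.normal(size=(two_d, two_d))
s_rep, c_rep = word_co_attention(S_bar, C_bar, Q_w)
```

Unlike the sentence level, no hard selection happens here: soft attention weights suffice, since the goal is a single pooled vector per side.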

Prediction Layer
In this layer, the feature vectors of the source posts and the feature vectors of the comments were concatenated and fed into a multilayer perceptron (MLP) and a softmax layer for the final prediction of news veracity ŷ = [ŷ_0, ŷ_1], where ŷ_0 and ŷ_1 are the predicted probabilities of labels 0 and 1, respectively.
We minimized the cross-entropy loss

L(θ) = -[y log ŷ_1 + (1 - y) log ŷ_0],

where θ denotes all trainable parameters. During training, the Adam optimizer was utilized to learn θ, as it is well suited to large-scale data and parameter settings and is widely used in neural network training.
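The prediction layer and loss can be sketched together. This is a minimal sketch under our own assumptions: a single linear layer stands in for the MLP, and the parameter names (W_out, b_out) are ours.

```python
import numpy as np

def predict_and_loss(s_rep, c_rep, W_out, b_out, y):
    """Concatenate the two feature vectors, apply a linear layer plus
    softmax for y_hat = [y_hat_0, y_hat_1], and return the cross-entropy
    L = -(y log y_hat_1 + (1 - y) log y_hat_0). A real MLP would add
    hidden layers; one linear layer keeps the sketch short."""
    z = np.concatenate([s_rep, c_rep]) @ W_out + b_out   # (2,) logits
    z = z - z.max()                                      # stable softmax
    y_hat = np.exp(z) / np.exp(z).sum()                  # [y_hat_0, y_hat_1]
    loss = -(y * np.log(y_hat[1]) + (1 - y) * np.log(y_hat[0]))
    return y_hat, loss

rng = np.random.default_rng(3)
d = 4
s_rep, c_rep = rng.normal(size=d), rng.normal(size=d)    # pooled features
W_out = rng.normal(size=(2 * d, 2))
b_out = np.zeros(2)
y_hat, loss = predict_and_loss(s_rep, c_rep, W_out, b_out, y=1)
```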

Experiments
To demonstrate the detection performance and interpretability of the proposed model, in this section, we discuss the design of different experiments to validate and answer the following research questions: Q1: In terms of fake news detection performance, does our HCSN model outperform state-of-the-art methods? Q2: What is the performance of the HCSN without the different components? Q3: Is our model capable of providing a compelling explanation?

Datasets
We utilized two well-known datasets, twitter15 and twitter16, established by [46]. Both contain source posts, user comments, and user information. We used the source posts and user comments as input and kept only the "true" and "false" labels as the ground truth. Detailed statistics of the twitter15 and twitter16 datasets are shown in Table 1.

Compared Methods
We compared our HCSN with representative state-of-the-art fake news detection methods, listed below.
• RNN [47]: an RNN-based method that models social context information as a variable-length time series to learn continuous representations of microblog events. We utilized a bidirectional GRU variant of the RNN.
• text-CNN [48]: a convolutional neural network-based fake news detection model that utilizes multiple convolutional filters to capture textual features at different granularities.
• HAN [21]: a hierarchical attention network that utilizes word-level and sentence-level attention to learn source post representations for fake news detection.
• HPA-BLSTM [20]: a hierarchical attention network-based fake news detection model that learns source post representations at the word, post, and sub-event levels.
• dEFEND [15]: a model that utilizes a co-attention mechanism to learn correlations between source posts and user comments for fake news detection. Based on the co-attention weights, content from the source posts and user comments is extracted to explain the detection results.
• GCAN [18]: a graph-aware co-attention network that learns representations relating source posts, retweets, and user information through dual co-attention, and concatenates the source post features for fake news detection.
• PLAN [19]: an interpretable rumor detection model focusing on user interaction, which takes rumors and retweet comments as the input of a Transformer, uses positional embeddings instead of delay-time embeddings, and provides explanations over posts and tokens through attention.
• Dual emotion [26]: a fake news detection model based on dual emotional features, which obtains the emotional representation of the source post, the emotional representation of the user comments, and the emotional gap as features through an emotion dictionary, and concatenates the semantic features of the source post for fake news detection.

Experimental Results
To answer Q1, we compared our model with state-of-the-art models. The evaluation metrics included accuracy, precision, recall, and F1. We randomly chose 60% of the data for training, 20% for validation, and 20% for testing. Each experiment was performed five times, and the average was taken. We ran the source code of all compared methods, except for GCAN, whose results are cited from the original paper. The experimental results are shown in Tables 2 and 3. The results show that our model was highly competitive on both datasets. On twitter15 and twitter16, it compared favorably with three interpretable fake news detection models. Compared to dEFEND, our model achieved 6.4% and 15% improvements in F1 and 6.7% and 15% improvements in accuracy, respectively. Compared with GCAN, which uses a dual co-attention mechanism, our model achieved 7% and 4% improvements in F1 and 3.6% and -1.1% changes in accuracy, respectively. Compared to PLAN, our model achieved 6.6% and 4% improvements in F1 and 7.3% and 7.0% improvements in accuracy, respectively. Compared with dual emotion, a state-of-the-art fake news detection model, we achieved 6.0% and 8.5% improvements in F1 and 6.1% and 8.5% improvements in accuracy, respectively. Furthermore, our method (which uses both source posts and user comments) outperformed models that rely only on source posts or user comments, such as HPA-BLSTM.
Among the fake news detection models that only utilize source posts, HAN clearly outperformed RNN and text-CNN, indicating that the hierarchical attention mechanism captured semantic features well. The models based on source posts plus other information performed better, especially PLAN and GCAN; the former utilizes positional embeddings instead of delay-time embeddings, and the latter utilizes two co-attention mechanisms to fully capture source posts and the propagation structure of retweeting users. dEFEND utilizes co-attention to capture the relevance between source posts and user comments, but it is better suited to long articles as source posts. Dual emotion extracts multiple emotional features as supplementary features, but their relevance to the source post is not considered.

Ablation Analysis
We studied the contribution of each component of the model to answer Q2. We experimented with several variants in which different components were removed, and the results are shown in Figure 3. Removing the sentence encoding structure and the self-attention mechanism is denoted by "-se" and "-t", and removing the sentence-level and word-level co-attention by "-s" and "-w", respectively. Finally, the full model with all components (sentence encoding, self-attention mechanism, sentence-level co-attention, and word-level co-attention) is shown. Every part of the model proved important: whenever a component was removed, performance dropped, showing that all components are critical. In particular, performance without the sentence encoding dropped significantly, indicating that sentence encoding has a pronounced effect at the start of the pipeline, while the effect of the word-level co-attention at the end was relatively small. Removing the sentence-level co-attention also caused an obvious drop, which showed that the selection reduced redundancy and noise during the learning process.

Interpretability Case Study
In this subsection, to answer Q3, we present a case study illustrating the interpretability of the HCSN framework through the sentence-level and word-level co-attention, respectively (Figures 4 and 5). A prediction value larger than 0.5 indicates that the news is false; otherwise, it is true.
Interpretability case study on sentence-level co-attention.
Utilizing the one-hot vectors produced by the sentence-level co-attention, user comments related to the source post were selected as the explanation. To assess the interpretability of the selected comments, we adopted three types of sentence-level modification strategies and compared the prediction performance of the HCSN under each:
• -Keep: keep the selected comments.
• -Drop: delete the selected comments.
• -Change: replace the selected comments with randomly chosen comments from different data.
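The three strategies amount to simple manipulations of the comment list before re-running the model. A minimal sketch, with function and variable names of our own choosing:

```python
import random

def modify_comments(comments, selected_idx, strategy, pool=None, seed=0):
    """Apply one sentence-level modification strategy to the comment list.

    'keep'   -> return the comments unchanged;
    'drop'   -> delete the selected comments;
    'change' -> replace the selected comments with random ones from `pool`
                (comments drawn from a different post).
    `selected_idx` are the indices chosen by the one-hot pointers."""
    rng = random.Random(seed)
    if strategy == "keep":
        return list(comments)
    if strategy == "drop":
        return [c for i, c in enumerate(comments) if i not in selected_idx]
    if strategy == "change":
        return [rng.choice(pool) if i in selected_idx else c
                for i, c in enumerate(comments)]
    raise ValueError(f"unknown strategy: {strategy}")

comments = ["c0", "c1", "c2", "c3"]
selected = {1, 3}                      # indices picked by the pointers
other_pool = ["x0", "x1"]              # comments from a different post
```

Feeding each modified list back through the model and comparing prediction values is what the case study below reports.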
We selected a fake post and its related comments from the twitter15 test data to verify explainability. The source post had only one sentence ("People are enraged that Starbucks's red cups aren't christmasy enough URL URL"), and there were 12 comments in total. We utilized the one-hot vectors produced by the sentence-level co-attention to locate the relevant comments, as seen in Figure 4. We found that each selected comment was extremely closely related to the source post. We then examined the model's predictions under the different strategies: the prediction values changed when we dropped or changed the selected comments. In particular, when we replaced the selected comments with random comments from another post, the prediction value dropped. This case study shows that the selected comments played a significant role in the prediction process and can be regarded as the sentence-level interpretation.

Interpretability case study on word-level co-attention.

By examining the attention weights of the word-level co-attention, it was possible to determine the predictive power of informative words in detecting fake news. To test the interpretability of the highlighted words, we adopted three types of word-level modification strategies on news posts and comments, comparing the performance of HCSN when the highlighted tokens were dropped, masked, or randomly replaced. The word-level co-attention of our model further explains which words the model attends to. After obtaining the comments pointed to by the one-hot vectors, we continued to utilize the word-level co-attention to obtain the relevant words in each comment. As shown in Figure 5, among the five comments pointed to by the one-hot vectors, the model highlighted certain words in the word-level co-attention layer. We compared the true/false prediction values and found that when we dropped or masked these tokens, the prediction value dropped dramatically. Even when we randomly replaced these tokens with other tokens, the prediction result changed. This case study shows the importance of these tokens, which are the basis for the model's decisions, and confirms the word-level interpretability.

Conclusion and Future Work
In recent years, interpretable fake news detection has received increasing attention. However, few researchers directly filter out interpretable information; most focus only on one part of it. We addressed interpretable fake news detection by filtering user comments and the words within them. The goals were to (1) significantly improve detection performance and (2) select interpretable news sentences and user comments and locate the words that explain why a post was deemed false. We proposed a hierarchical co-attention selection network for fake news detection and explanatory sentence/comment discovery, and the experiments demonstrated that the model has satisfying detection performance and reasonable explainability. We believe our model can also be applied to other explainable classification tasks on social media, such as stance detection, hate speech detection, and malicious comment prediction. During the experiments, we found that the selected comments often contained emotional words, so in future work we will conduct interpretable fake news detection from the perspective of emotional features, in particular using the relationship between the emotions of user comments and the source posts to further improve detection performance and explainability.

Data Availability Statement: All relevant datasets are publicly available on the web and can also be obtained from our public GitHub repository; our model is available at: https://github.com/wj-gxy/HCSN.

Conflicts of Interest:
The authors declare no conflict of interest.