Article

EVOCA: Explainable Verification of Claims by Graph Alignment

by Carmela De Felice 1,†, Carmelo Fabio Longo 2,†, Misael Mongiovì 1,2,†, Daniele Francesco Santamaria 1,*,† and Giusy Giulia Tuccari 1,2,†
1 Dipartimento di Matematica e Informatica, Università di Catania, Viale Andrea Doria 6, 95125 Catania, Italy
2 Istituto di Scienze e Tecnologie della Cognizione (ISTC), Consiglio Nazionale delle Ricerche (CNR), Via Paolo Gaifami 18, 95126 Catania, Italy
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Information 2025, 16(7), 597; https://doi.org/10.3390/info16070597
Submission received: 20 May 2025 / Revised: 27 June 2025 / Accepted: 9 July 2025 / Published: 11 July 2025

Abstract

The paper introduces EVOCA—Explainable Verification Of Claims by Graph Alignment—a hybrid approach that combines NLP (Natural Language Processing) techniques with the structural advantages of knowledge graphs to manage and reduce the amount of evidence required to evaluate statements. The approach leverages the explicit and interpretable structure of semantic graphs, which naturally represent the semantic structure of a sentence—or a set of sentences—and explicitly encode the relationships among different concepts, thereby facilitating the extraction and manipulation of relevant information. The primary objective of the proposed tool is to condense the evidence into a short sentence that preserves only the information salient and relevant to the target claim. This process eliminates superfluous and redundant information, which could hamper the performance of the subsequent verification task, and provides useful information to explain the outcome. To achieve this, EVOCA generates a sub-graph in AMR (Abstract Meaning Representation) format, representing the tokens of the claim–evidence pair that exhibit high semantic similarity. The structured representation offered by the AMR graph not only aids in identifying the most relevant information but also improves the interpretability of the results. The resulting sub-graph is converted back into natural language with the SPRING AMR tool, producing a concise but meaning-rich “sub-evidence” sentence. The output can be processed by lightweight language models to determine whether the evidence supports, contradicts, or is neutral about the claim. The approach is tested on the 4297 sentence pairs of the Climate-BERT-fact-check dataset, and the promising results are discussed.

1. Introduction

Automatic fact-checking plays an important role in mitigating the spread of information that conflicts with prevailing narratives within a given domain—especially in contexts where news circulates rapidly and often without proper verification. This process involves analyzing claims against trusted reference sources to assess the truthfulness of the information presented.
Within this process, claim verification, which involves assessing the veracity of a claim based on a knowledge base—specifically by examining the provided evidence—is fundamental. This task requires analyzing the logical relationships between sentences or documents to determine whether a claim can be confirmed or contradicted or whether its validity remains undetermined.
Fact-checking can be structured as a natural language inference (NLI) task, where a pair of sentences (premise and hypothesis) is analyzed to determine whether the first supports, contradicts, or is neutral with respect to the second [1,2,3]. The relationship between the two parts can be of the following three types:
  • Entailment, when the meaning of the hypothesis can be inferred from the premise.
  • Contradiction, when the hypothesis contradicts the premise.
  • Neutral, when the relationship between the hypothesis and the premise is indeterminate.
An example for each type is illustrated in Table 1, which is extracted from the SNLI corpus [4].
The analyzed pairs are provided as input for language models, which in turn output a label specifying the type of logical connection between the hypothesis and the premise. There are numerous pre-trained language models based on the Transformer architecture [5], such as BERT [6] and RoBERTa [7], specifically designed for the task of natural language inference. However, the performance of these models still suffers from certain limitations [8], which are also influenced by the size of the input to be processed; in fact, as the input size increases, the performance of these models tends to decrease [9,10,11].
Therefore, many research works have focused on optimizing the retrieval and management of evidence [9,10,12,13], particularly in multi-evidence corpora, such as the FEVER corpus [14].
Several studies leverage the semantic structure of knowledge graphs for the selection of relevant features [15,16]. The graph structure permits explicitly representing the meaning of a sentence or a set of sentences in a clear and interpretable manner, illustrating the semantic relationships among its components. Specifically, the AMR (Abstract Meaning Representation) format [17] enables the concise capture of semantic relations, generating graphs that condense significant information.
The objective of this paper is to present EVOCA—Explainable Verification Of Claims by Graph Alignment—a graph-based approach to managing and reducing evidence, with the aim of condensing only the information most relevant and salient to the claim into a short sentence. The idea is to compare the similarity between the tokens of the pair of sentences and, after identifying the words that share a high degree of similarity, generate a sub-graph in the AMR (Abstract Meaning Representation) format [17,18] that is both meaningful and capable of demonstrating the factual quality of the evidence. Finally, the sub-graph is converted into natural language using the SPRING AMR tool [19], thus generating synthetic “sub-evidence”. The proposed approach is tested on the Climate-BERT-fact-check dataset [35] with promising results. The dataset is a simplified version of the Climate FEVER dataset [20], restructured for the natural language inference (NLI) task. The presence of a single piece of evidence for each statement, as opposed to the more complex structure of the original dataset, makes it particularly suitable for the development phase of the proposed tool, since it facilitates integration and supports experimentation with the current version of the system. Thanks to SPRING’s domain-agnostic parsing capabilities and the use of a general-purpose BERT model during the token-selection stage, the proposed pipeline can be effectively applied across various domains. However, for enhanced performance, domain-specific tokenizers and fine-tuned models should be employed in both the token-selection and verification stages.
The paper is structured as follows. Section 2 presents the related work by focusing on existing approaches for evidence management and the use of graph-based representations in the context of fact-checking. Section 3 provides a detailed description of the proposed methodology and the pipeline for evidence generation and reduction using AMR graphs, while Section 4 presents a case study on Climate-BERT fact-checking. The paper concludes in Section 5 with a final discussion and future research directions.

2. Related Work

Automated fact-checking and claim verification studies have received significant attention in recent years, primarily due to the relentless proliferation of false information, for instance, through social media [21,22,23,24]. Various approaches in the literature have addressed this challenge using different strategies. Recent approaches [14] have evaluated and selected the content of a sentence using neural architectures and graph-based methods. A significant contribution to the field of automated fact-checking was made by Thorne et al. (2018) [14] with the creation and release of FEVER, a claim verification dataset consisting of 185,445 claims classified as SUPPORTED, REFUTED, or NOTENOUGHINFO.
The dataset was built in two phases:
  • Claim generation based on information extracted from Wikipedia.
  • Claim annotation—labeling claims as supported, refuted, or not enough info—and selecting relevant evidence.
The authors introduced a pipeline composed of three main components: document retrieval, fact extraction, and claim verification. Numerous studies have adopted the framework proposed by the FEVER authors [25,26,27,28,29,30,31].
To address the document retrieval task, Nie et al. (2019) [27] introduced the Neural Semantic Matching Network (NSMN), a neural architecture “capable of directly performing advanced semantic matching from raw textual input, without relying on intermediate term representations or structured external knowledge bases”. This model was later adopted in subsequent works [26,28] for the same purpose.
Among the proposed models, Hanselowski et al. (2019) [31] developed a combined pipeline to address the tasks outlined by the FEVER dataset authors. Specifically, their approach incorporates entity linking for document retrieval by extracting entities from claims using AllenNLP and linking them to Wikipedia articles via the MediaWiki API. The methodology proposed by the UKP-Athene authors has been widely adopted in later research [25,29,30].
Recognizing the ability of graphs to enhance automated fact-checking by leveraging structured knowledge bases that explicitly represent relationships between concepts, Kim et al. (2023) introduced FACTKG (Fact Verification via Reasoning on Knowledge Graphs) [32], a dataset containing 108,000 natural language claims, categorized into five reasoning types: One-hop, Conjunction, Existence, Multi-hop, Negation.
Each record in the FACTKG dataset includes a claim and a list of entities that appear in both the claim and the knowledge graph (KG). In this context, the traditional document retrieval task is replaced by sub-graph retrieval.
To support this process, the authors trained two BERT models:
  • One model predicts the set of relations R that connect a claim c with an entity e.
  • The other estimates the maximum number of reasoning steps n required from e.
The resulting subgraph is extracted based on the predicted relations R and the entities found in the claim.
Inspired by Kim et al.’s work, Opsahl introduced a system extracting relevant sub-graphs using three training-free methods:
  • Retrieving triples where both nodes are present in the entity list.
  • Including relations between nodes that match lemmatized words in the claim.
  • Retrieving all triples that are reachable within one step from any node in the entity list.
Continuing along this line of research, the authors of FACE-KEG  [33] proposed an innovative architecture that combines textual analysis with structured knowledge extraction from external knowledge bases. By integrating recurrent neural networks with Graph Transformer Networks, their system not only verifies the truthfulness of claims but also generates natural language explanations, thereby enhancing the interpretability and reliability of the verification process.
Despite these advances, current automated fact-checking systems still face several limitations. Chief among them is the high computational cost required by sophisticated models, which demand considerable resources for both training and inference. Moreover, the lack of interpretability often hinders trust in the system’s decisions. Another significant challenge lies in the evidence-retrieval process, which may yield irrelevant or redundant information—thereby increasing data volume without necessarily improving verification quality.
To address these limitations, the proposed method EVOCA combines knowledge graphs and language models to optimize the selection of relevant evidence, while also improving the efficiency and comprehensibility of the verification process. A key innovation is the reduction in evidence, which condenses only the most salient information related to the claim. This reduces the input size for pre-trained language models, making the process faster and more efficient, lowers computational costs, and facilitates the management of large volumes of data.
An additional advantage is the transformation of the extracted sub-graph into natural language, allowing seamless integration with pre-trained language models.
The use of Abstract Meaning Representation (AMR) graphs further enhances this process by explicitly encoding semantic relationships between entities and actions. This contributes to greater system transparency and result interpretability. Overall, the proposed approach improves the effectiveness of fact-checking while making it more accessible and trustworthy.

3. Methodology

This section presents the methodology developed for the extraction of meaningful sub-graphs and their subsequent conversion into natural language for the generation of shorter, albeit meaningful, evidence statements. Specifically, the proposed method, called EVOCA—Explainable Verification Of Claims by Graph Alignment—employs NLP techniques for identifying and extracting salient information and graph theory for an explicit representation of the relationships between tokens.
Figure 1 illustrates the EVOCA pipeline. The process begins with dataset pre-processing, which includes data cleaning, normalization, and the conversion of selected evidence into Abstract Meaning Representation (AMR) graphs. Next, embeddings are extracted, and the similarity between claim and evidence tokens is computed to identify relevant concepts required for the subsequent sub-graph extraction step.
To achieve this, evidence tokens must first be aligned with their corresponding concepts in the AMR graph. Once alignment is complete, the pipeline proceeds to the graph-manipulation and reduction phase. This involves extracting the relevant sub-graph, converting it into the Penman representation, and finally generating a simplified textual version of the evidence. The resulting reduced evidence is designed to retain only the information essential to the claim.
For the implementation of our methodology, we leveraged the NetworkX library [34], one of the most widely adopted tools within the scientific community for graph representation and manipulation. It provides a comprehensive set of functionalities and advanced algorithms, along with extensive and well-maintained documentation that facilitates both development and reproducibility.
For the evaluation task, we leverage the Climate-BERT-fact-check dataset [35], a revised version of Climate FEVER [20], adapted for the natural language inference task. The key components required for the system to function are the claim, evidence, and label.
The following subsections detail the various steps of the presented methodology. Specifically, Section 3.1 describes the parsing preprocessing step, Section 3.2 covers the alignment, and Section 3.3 discusses the verbalization.

3.1. Parsing

A preprocessing step is necessary to improve data quality and, consequently, enhance the accuracy and reliability of the final results. The conversion of evidence into AMR graphs is a crucial prerequisite for enabling subsequent manipulation and reduction operations. This step involves the preprocessing of the dataset and the proper handling of missing and duplicate values. Hence, all the evidence in the dataset is converted into AMR graphs through the RESTful APIs of the SPRING (Symmetric ParsIng aNd Generation) AMR tool [19], a sequence-to-sequence model designed for parsing and generating sentences in AMR format and vice versa.
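For illustration, a call to such a service could look like the following sketch. The endpoint URL and the JSON field names are placeholders chosen for this example and depend entirely on how the SPRING service is deployed; they are not a documented SPRING API.

```python
import requests

# Placeholder endpoint: adjust to the actual SPRING deployment (assumption).
SPRING_PARSE_URL = "http://localhost:5000/text-to-amr"

def parse_to_amr(sentence: str) -> str:
    """Send a sentence to a SPRING text-to-AMR service and return the Penman string."""
    response = requests.post(SPRING_PARSE_URL, json={"sentence": sentence}, timeout=60)
    response.raise_for_status()
    return response.json()["penman"]  # assumed response field name

evidence_amr = parse_to_amr(
    "By August 2014 a three year drought was prompting changes "
    "to the agriculture industry in the valley."
)
```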
To identify the tokens in the evidence that support the information in the claim, a similarity calculation is performed for each evidence–claim token pair. Specifically, each token of the evidence (token_e) is compared with each token of the claim (token_c).
To transform the tokens into vectors (embeddings), we leverage BERT-base-uncased, a pre-trained version of the BERT (Bidirectional Encoder Representations from Transformers) model [6]. The model, which is case-insensitive, consists of 12 layers, 12 attention heads, a hidden size of 768, and has a total of 110M parameters.
BERT is widely used for semantic embedding extraction due to its ability to capture the contextual meaning of words. In this work, the BERT-base-uncased model is adopted as it offers a good balance between performance and computational efficiency. Compared to larger variants, the base model significantly reduces memory and computational requirements while maintaining sufficient embedding quality for the task at hand. The choice of the uncased version is motivated by the goal of minimizing sensitivity to letter casing.
Cosine similarity is used as the reference metric to compute the similarity between the two embedding tensors, as it is a well-established and efficient measure for assessing the semantic closeness of two vector representations. The computation has complexity O(mn), where m and n are the numbers of claim and evidence tokens, and can be parallelized, for instance on GPUs, to improve performance.
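As an illustration of this step, the following sketch extracts contextual token embeddings with the Hugging Face transformers implementation of BERT-base-uncased and computes the pairwise cosine-similarity matrix. Using the last hidden state as the token representation is an assumption on our part, and the example sentences are arbitrary.

```python
import torch
from torch.nn.functional import cosine_similarity
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def token_embeddings(sentence: str):
    """Return WordPiece tokens and their contextual embeddings (last hidden state)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]                  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return tokens, hidden  # includes [CLS]/[SEP], which would normally be skipped

claim_toks, claim_emb = token_embeddings("California has been parched by drought.")
evid_toks, evid_emb = token_embeddings(
    "By August 2014 a three year drought was prompting changes to the agriculture industry."
)

# Pairwise cosine similarity between every claim token and every evidence token: (m, n).
sim = cosine_similarity(claim_emb.unsqueeze(1), evid_emb.unsqueeze(0), dim=-1)
```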
Then, we reduce the number of tokens to be processed in the subsequent step by applying two filtering strategies. First, we retain only those token pairs whose cosine similarity exceeds a fixed threshold of 0.7. Second, for each token a_i ∈ A, we retain the token b_j ∈ B that has the highest similarity score with a_i. The final set of candidate pairs is given by the union of the results of both filters. This procedure is illustrated in Algorithm 1 (which has complexity O(mn)).
Algorithm 1: Filtering tokens
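A minimal Python sketch of this filtering procedure is given below, assuming the claim tokens, the evidence tokens, and their m × n similarity matrix have already been computed; the variable names are our own.

```python
import numpy as np

def filter_token_pairs(claim_tokens, evidence_tokens, sim, threshold=0.7):
    """Union of (i) all pairs above the similarity threshold and
    (ii) the best-matching evidence token for each claim token.
    `sim` is a (len(claim_tokens), len(evidence_tokens)) similarity matrix."""
    pairs = set()
    # Filter 1: pairs whose cosine similarity exceeds the threshold.
    for i, j in zip(*np.where(sim > threshold)):
        pairs.add((claim_tokens[int(i)], evidence_tokens[int(j)]))
    # Filter 2: for each claim token, its most similar evidence token.
    for i, tok in enumerate(claim_tokens):
        j = int(sim[i].argmax())
        pairs.add((tok, evidence_tokens[j]))
    return pairs
```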
The similarity threshold of 0.7 was introduced to avoid treating all claim–evidence token pairs as meaningful relationships. Without this constraint, the system would indiscriminately associate all token pairs, resulting in an excessive number of connections between nodes in the graph—including many that do not reflect relevant or substantial semantic relations.
During the phase of reducing the evidence graph, this would have produced an overly dense graph in which nearly every node—i.e., every concept in the evidence—was preserved. As a consequence, the graph would not have been meaningfully reduced but merely reproduced in a paraphrased form.
Following an in-depth analysis of the similarity matrix, we set the threshold at 0.7, as this value effectively distinguishes significant relationships between tokens from weaker or irrelevant ones; the goal is to select the strongest semantic relationships while avoiding weak or redundant connections. As shown in the distribution of similarity scores in Figure 2, most cosine values fall between 0.3 and 0.4, so a threshold of 0.7 retains only pairs with high semantic affinity. This threshold integrates seamlessly with the graph-reconstruction step, where only the above-threshold tokens (i.e., terminal nodes) are connected, including the minimum set of intermediate nodes necessary to ensure overall connectivity. This step is crucial—as discussed in Section 3.3—for enabling the proper conversion of the graph into the Penman format. As a result, the final graph is structured around strong semantic links, thus supporting genuine evidence reduction rather than mere paraphrasing. Token/node selection is currently based solely on semantic similarity—a limitation we plan to overcome by introducing syntactic validation via part-of-speech (POS) tagging.

3.2. Alignment

After identifying and selecting the relevant tokens from the evidence, the next step is the token-to-AMR alignment. In this phase, each token is matched to its corresponding AMR concept using a string matching algorithm. At present, we assume a straightforward alignment between tokens and AMR concepts. Nevertheless, addressing more complex linguistic instances is a key objective for future developments. When an exact match is not found—such as in the case where AMR concepts represent higher-level abstractions or where no one-to-one correspondence exists between the token and the concept—we leverage the SPRING tool [19] to convert the token into its AMR representation, which can then be matched accurately.
For instance, a direct string comparison between the token “man” and the AMR concept “person” would not yield a match, even though the two are semantically equivalent. SPRING permits resolving such cases by generating the appropriate AMR abstraction for the token.
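A simplified sketch of this alignment step is shown below: exact (case-insensitive) string matching against the concepts of the AMR graph, with unmatched tokens set aside for SPRING-based resolution. The helper functions and the sense-suffix handling are our own simplifying assumptions.

```python
def strip_sense(concept: str) -> str:
    """Drop a PropBank sense suffix, e.g. "prompt-01" -> "prompt"."""
    base, sep, suffix = concept.rpartition("-")
    return base if sep and suffix.isdigit() else concept

def align_tokens_to_concepts(tokens, amr_concepts):
    """Map each selected evidence token to an AMR concept by string matching.
    Unmatched tokens are returned separately so they can be converted to their
    AMR abstraction (e.g. "man" -> "person") with SPRING."""
    lookup = {strip_sense(c).lower(): c for c in amr_concepts}
    aligned, unmatched = {}, []
    for tok in tokens:
        if tok.lower() in lookup:
            aligned[tok] = lookup[tok.lower()]
        else:
            unmatched.append(tok)
    return aligned, unmatched

aligned, unmatched = align_tokens_to_concepts(
    ["drought", "man"], ["drought", "prompt-01", "person"]
)
# aligned -> {"drought": "drought"}; unmatched -> ["man"] (resolved via SPRING)
```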

3.3. Verbalization

In order to build and manipulate the graph for the content analysis and the subsequent fact verification, we proceed through several interconnected stages. These stages are essential for achieving an accurate and computationally efficient representation of the information.
Given the set of AMR concepts extracted from the evidence obtained so far, we construct a sub-graph as follows. First, we create an initial graph of the entire evidence with the NetworkX library; nodes, labels, and the semantic relationships are derived from the AMR graph of the sentence, which was previously obtained via an AMR format parsing tool. Specifically, we use ‘AMR Reader’, which allows AMR graphs or alignments to be loaded in various formats. By default, the AMR graphs loaded by AMR Reader are converted into the LDC/ISI representation.
Each node and its corresponding edges are mapped to a unique identifier, which is then used to correctly reconstruct the graph with the NetworkX library. An example of such a mapping is shown in Figure 3. The first column represents the type, namely, node (either root or internal) or edge. For nodes, the second column shows the identifier and then the label; for edges, the connected nodes follow. In the example, duration and unit are the edge names.
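The following sketch shows how such a mapping could be turned into a NetworkX graph; the toy node and edge lists are loosely modeled on the drought example of Section 4, and the attribute names ("label", "relation") are our own choices rather than a fixed convention.

```python
import networkx as nx

def build_evidence_graph(nodes, edges):
    """`nodes` maps AMR node identifiers to concept labels, e.g. {"z1": "drought"};
    `edges` lists (source_id, relation, target_id) triples."""
    graph = nx.Graph()  # undirected, as used in the pipeline
    for node_id, label in nodes.items():
        graph.add_node(node_id, label=label)
    for src, rel, tgt in edges:
        graph.add_edge(src, tgt, relation=rel)
    return graph

nodes = {"z0": "prompt-01", "z1": "drought", "z2": "temporal-quantity",
         "z3": "year", "q3": "3"}
edges = [("z0", ":ARG0", "z1"), ("z1", ":duration", "z2"),
         ("z2", ":unit", "z3"), ("z2", ":quant", "q3")]
evidence_graph = build_evidence_graph(nodes, edges)
```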
Subsequently, a sub-graph is extracted by applying the minimum Steiner tree algorithm [36] as implemented by the NetworkX library. The minimal Steiner tree algorithm connects only the relevant (terminal) nodes, minimizing the total number of intermediate nodes and arcs required. Unlike other reduction techniques that enforce full connectivity of the graph, this algorithm selects only those connections that are strictly necessary. As a result, it yields a more compact structure and significantly reduces information noise. This property is especially beneficial in scenarios involving the processing of large-scale graphs.
Our goal is to obtain a connected graph that includes all identified terminal nodes, admitting non-terminal nodes when necessary. The minimum Steiner tree algorithm is an effective solution for dealing with connection problems between sets of terminal nodes within a graph.
This step requires three inputs: a source graph, a set of terminal nodes, and the specific Steiner tree algorithm to use. The algorithm computes an approximation of the minimum Steiner tree induced by the terminal nodes. For this purpose, we employ the algorithm proposed by Mehlhorn [37], a variant of the algorithm in Kou et al. [38] and one of the two algorithms available in the NetworkX library.
We choose Mehlhorn’s approach for the sub-graph extraction since it takes advantage of non-terminal nodes by identifying the nearest terminal one for each of them. This knowledge is used to build a complete graph containing only terminal nodes, where the edge weights are determined by leveraging the shortest path distances between each pair of terminals.
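Continuing the toy graph from the previous sketch, the extraction itself reduces to a single NetworkX call; the method="mehlhorn" argument selects Mehlhorn's approximation and is available in the NetworkX 3.x approximation module.

```python
from networkx.algorithms.approximation import steiner_tree

# Terminal nodes: identifiers of the AMR concepts selected during alignment.
terminals = ["z0", "z1", "z3", "q3"]          # prompt-01, drought, year, 3

# Approximate minimum Steiner tree over the evidence graph (Mehlhorn's variant).
sub_graph = steiner_tree(evidence_graph, terminals, method="mehlhorn")

# The intermediate node "z2" (temporal-quantity) is added to keep the tree connected.
print(sorted(sub_graph.nodes))
```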
Once the sub-graph is constructed, it is converted into the Penman representation using the corresponding Python library. The Penman representation is a widely adopted notation for representing semantic graphs, such as AMR [18]. In the event of layout errors during the conversion, an alternative strategy is adopted: this involves the identification and inclusion of intermediate nodes and semantic relations that were not initially detected but are required for generating a well-formed Penman graph.
This corrective step is crucial for ensuring that complex semantic structures are translated into a coherent and interpretable format, thus enabling subsequent operations such as parsing, natural language generation, formal reasoning, and machine learning.
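Below is a minimal sketch of the conversion with the penman library, using hand-built triples that mirror the reduced drought example of Table 3; the re-rooting and repair strategy described above is only hinted at in the final comment.

```python
import penman

# (source, role, target) triples of the reduced example graph, rooted at "drought".
triples = [
    ("z1", ":instance", "drought"),
    ("z1", ":ARG0", "z0"),
    ("z0", ":instance", "prompt-01"),
    ("z1", ":duration", "z2"),
    ("z2", ":instance", "temporal-quantity"),
    ("z2", ":quant", "3"),
    ("z2", ":unit", "z3"),
    ("z3", ":instance", "year"),
]

graph = penman.Graph(triples, top="z1")
print(penman.encode(graph))
# Roughly: (z1 / drought :ARG0 (z0 / prompt-01)
#                        :duration (z2 / temporal-quantity :quant 3 :unit (z3 / year)))
# If encoding fails with a layout error, missing intermediate nodes and relations
# are reinserted and the graph is re-rooted before retrying, as described above.
```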

4. Case Study and Evaluation

We carried out a qualitative and quantitative experimental analysis to evaluate the effectiveness and performance of the proposed tool. We considered the 4297 sentence pairs of the ClimateBERT_factcheck dataset [35], which exhibit entailment, contradiction, or neutral relationships.
We implemented the proposed tool in Python 3.10, using the NetworkX (version 3.2) and Penman (version 1.3) libraries for the creation and manipulation of graphs and the SPRING tool [19] for the conversion of AMR graphs.
The ClimateBERT_factcheck dataset underwent a cleaning process that removes any inconsistencies or irrelevant data. Once the cleaning phase was completed, the relevant data, namely claim and evidence, were converted into AMR graphs using the SPRING tool, integrated via RESTful API. AMR graphs explicitly represent the semantic meaning of sentences, capturing conceptual relationships and the context of the sentences. Figure 4 shows an example of an AMR graph for the sentence, “By August 2014 a three year drought was prompting changes to the agriculture industry in the valley”. In the figure, the nodes of the graph correspond to the tokens of the sentence, and the edges represent the semantic relationships between concepts.
Then, we compute the semantic similarity between the tokens of the claim and those of the evidence. To identify the most relevant concepts, we extract embeddings with the pre-trained BERT model and compute the pairwise similarity: tokens with a similarity score equal to or greater than 0.7 are selected as relevant, as they represent a high degree of semantic connection between the pair of sentences. If no tokens exceed the similarity threshold, the tool returns an empty list as output.
Once the significant evidence tokens have been identified, the process proceeds with the token-to-AMR-concept alignment. To achieve this, the two strategies described in the previous section are applied: the string-matching algorithm and the conversion of the token with the SPRING tool. The combination of the proposed methods ensures a robust and accurate alignment, minimizing information loss during the token-to-graph mapping process, although this comes at the cost of increased execution time.
As shown in Figure 5, the selected AMR concepts are stored as a list, which in turn is composed of sub-lists representing the relational structure of the nodes. In the given example, the elements in position 0, referred to as source nodes (s), are connected to the nodes called target nodes (t) in position 1.
Once nodes are aligned, the creation and manipulation of the graph required to visualize and analyze the data can be carried out. Specifically, the AMR graph of the evidence is represented as an undirected graph by means of the NetworkX library. The AMR node IDs, extracted via the AMR reader, are used as unique identifiers for the graph nodes. The tokens extracted from the sentences, which represent the significant concepts, are instead associated with the graph nodes as labels. This representation is crucial because it allows for the easy manipulation required for targeted analyses. In particular, it enables the extraction of the sub-graph of interest by applying the minimum Steiner tree algorithm already implemented in NetworkX. The algorithm uses the entire evidence graph and the list of selected AMR concepts to return a subset of nodes and edges that constitute the minimum Steiner tree. The latter connects a specific set of nodes (the terminals) of a weighted graph. The goal is to minimize the total cost of the tree that connects these terminal nodes, allowing the inclusion of non-terminal nodes that are necessary to optimize the connection. An example of such a process is illustrated in Figure 6a. The figure illustrates the evidence before the application of the minimum Steiner tree algorithm. The nodes prompt-01 and drought are connected by the relation :ARG0, while the nodes 3 and year are orphan nodes.
The same evidence after the application of the minimum Steiner tree is illustrated in Figure 6b. The nodes 3 and year, which were orphans, are now connected to each other through the intermediate node temporal quantity.
Finally, the extracted sub-graphs are converted into the Penman representation using the eponymous Python library, which interprets the sub-graph nodes as concepts and the edges as relationships, thereby creating a readable, structured representation. In the case of our example, the conversion failed due to a semantic ambiguity that generated a layout error. As shown in Figure 7, some triples—namely, source node, relation, and target node—are poorly structured and do not properly express the relationships between concepts. In our case, the concept temporal quantity is used three times as a target, leaving its nature ambiguous and the relationship between the concept drought and its temporal duration of 3 years unclear.
Once the intermediate relationships are reconstructed and the root node is identified, the graph is restructured with significant changes in its overall configuration.
In Figure 8, it is possible to observe the reconstructed graph, where the issues described above are addressed. In this new shape, the triples are structured coherently, correctly expressing the relationships between the concepts, and removing any ambiguity, such as the one related to the temporal duration of the concept drought.
Finally, the graph is converted into natural language using the SPRING tool, which allows the graphical structure to be translated into natural language. The final result is the sentence “A three-year drought”.
Table 2 shows the differences between some evidence statements from the original dataset and the ones produced by EVOCA. Table 3 shows the graphs and the extracted sub-graphs of the examples in Table 2, where the reduction carried out by EVOCA is highlighted.
We are now ready to present the qualitative analysis carried out on the ClimateBERT_factcheck dataset. For an initial analysis of the synthesis capability of the proposed method, we consider the average sentence length, as it provides an immediate indication of the reduction in information density of the evidence. Although it is not sufficient to comprehensively assess the quality of the reduction, it represents a good starting point for the initial evaluation of the results, thus guiding future in-depth analyses.
Hence, the average length of the original evidence statements of the dataset was calculated and compared with that of the evidence statements produced by EVOCA: as shown in Figure 9, there is a significant difference—approximately 80%.
We now present the quantitative analysis of EVOCA. We compared the performance of the RoBERTa-large-MNLI model [7] using both the original evidence and the reduced evidence generated by our method. With a learning rate of 2 × 10⁻⁵ and the original evidence as input, the model achieved an accuracy of 67.6% and an F1 score of 68.1%. Using the reduced evidence, the accuracy slightly decreased to 67.0% and the F1 score decreased to 66.2%. At a lower learning rate of 1 × 10⁻⁶, the model reached 72.0% accuracy and a 72.4% F1 score with the original evidence, while the reduced evidence again led to a slight decrease in performance, with accuracy at 67.0% and the F1 score at 66.9%.
We evaluated the performance of the BART-large-MNLI model [39] using both the original and the reduced evidence sets (see Table 4). With a learning rate of 2 × 10⁻⁵ and the original evidence, the model achieved an accuracy of 64.3% and an F1-score of 65.1%. When provided with the reduced evidence, accuracy marginally increased to 64.8%, whereas the F1-score slightly declined to 64.2%. At a lower learning rate of 1 × 10⁻⁶, the model attained an accuracy of 70.3% and an F1-score of 70.6% with the original evidence. In contrast, the use of reduced evidence at this learning rate resulted in an accuracy of 66.4% and an F1-score of 66.5%.
Hence, NLI models trained on reduced evidence achieve results comparable to those trained on the original evidence. Accuracy, F1 score, precision, and recall remain at similar levels, with only a minimal decrease that is largely justified by the substantial reduction in information load.
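For reference, the sketch below shows how a single claim–evidence pair (original or EVOCA-reduced) can be classified with an off-the-shelf MNLI model from the Hugging Face hub. The fine-tuning setup (learning rates, epochs) reported above is not reproduced here, and the example pair is taken from Table 2.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

evidence = "A three-year drought."   # reduced evidence produced by EVOCA
claim = "Parts of the west, particularly California, have been parched by drought."

# MNLI convention: premise (evidence) first, hypothesis (claim) second.
inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
label = model.config.id2label[logits.argmax(dim=-1).item()]
print(label)   # one of CONTRADICTION / NEUTRAL / ENTAILMENT
```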
In conclusion, although this work is still in development, the results are promising and indicate that EVOCA will be especially effective for multi-evidence corpora, where the volume of information to analyze is substantial. In such scenarios, the capacity to reduce evidence effectively while maintaining high performance will be crucial to ensuring both efficiency and accuracy. The analysis shows that while EVOCA has the ability to extract and represent salient information, there is still room for improvement. The comparison between evidence statements and sub-evidence statements, as well as the corresponding graphs and sub-graphs, demonstrates the capability of the tool in reducing structural complexity, identifying, and isolating relevant fragments. However, the ability to manage linguistic structures needs further development. One of the long-term goals is to make the tool scalable for multi-evidence corpora, such as the Climate FEVER dataset. To carry this out, it is necessary to improve the token alignment and selection techniques, for instance, by leveraging Named Entity Recognition (NER), morphosyntactic analysis (POS), and co-reference resolution, in order to improve the handling of multiple evidence statements and the accuracy of verification.
It is worth noting that the development of EVOCA is still ongoing, but it constitutes a starting point that lays the foundation for future refinements, specifically in the enhancement and optimization of the overall processes of data extraction and representation.

5. Conclusions

EVOCA—the tool proposed in this paper, still in development—represents a potential advancement in the field of factual quality verification of texts by combining natural language processing techniques with knowledge graphs. The analysis of the results obtained so far highlights the strengths of the system but also some limitations that open the way to future improvements.
The main advantage offered by the tool is its ability to reduce the evidence associated with statements into a more concise representation, retaining only relevant and pertinent information. This process reduces the input size of language models and, consequently, the computational cost, while simplifying the fact-checking process without sacrificing the quality of the results. The proposed approach leverages the structural capabilities of knowledge graphs to efficiently extract and manipulate information by explicitly representing concepts and their relationships through Abstract Meaning Representation (AMR). This formalism enhances transparency by clearly encoding the semantic structure underlying a sentence. While AMR interpretation may not be immediately intuitive, its expressive power offers significant advantages for downstream processing. Moreover, our reduction process selectively retains only semantically relevant nodes and relations, thereby simplifying the resulting graph and improving its interpretability.
Despite the advantages described above, the tool has some limitations that are worth mentioning. One of the main limitations concerns the computational cost of the procedure, which is mainly derived from the SPRING tool. Another limitation regards the selection of tokens, which currently relies only on semantic similarity. This is generally insufficient in more complex scenarios, such as those that involve datasets with multiple trials. One of the planned future developments involves refining the alignment process between nodes to more effectively handle complex linguistic cases and significantly reduce the current computational overhead. To this end, we plan to systematically evaluate the performance of tools based on the expectation-maximization (EM) algorithm, as well as those leveraging attention mechanisms in neural models [40], in order to more deliberately inform subsequent methodological choices. In this work, we deliberately chose AMR verbalization over AMR linearization to simplify the training process with widely used pre-trained transformer models, which are primarily designed to process natural language input. We plan to compare the two approaches in future work to assess potential improvements in both accuracy and interpretability. The selection of tokens or nodes based solely on semantic similarity represents a limitation that we aim to address. To improve the syntactic fluency of the reduced sentences, we propose, as future work, the integration of syntactic validation through part-of-speech (POS) tagging. This would enable us to assess the grammatical consistency of the selected sequences, ensuring that the resulting outputs are not only semantically coherent but also syntactically well-formed.
Finally, to enhance computational efficiency and scalability, we plan to replace NetworkX with a more performant graph library such as NetworKit. While NetworkX offers flexibility and ease of use, NetworKit provides optimized data structures and parallel algorithms, resulting in significant speed-ups for large-scale graph processing tasks. At a later stage, we may also consider adhering to the PRISMA [41] guidelines.

Author Contributions

Conceptualization, C.D.F., C.F.L., M.M., G.G.T. and D.F.S.; methodology, C.D.F., C.F.L., M.M. and G.G.T.; software, C.D.F. and G.G.T.; validation, C.D.F., C.F.L., M.M., G.G.T. and D.F.S.; formal analysis, C.D.F., C.F.L., M.M., G.G.T. and D.F.S.; investigation, C.D.F., C.F.L., M.M., G.G.T. and D.F.S.; writing—original draft preparation, C.D.F.; writing—review and editing, C.D.F., C.F.L., M.M., G.G.T. and D.F.S.; supervision, C.F.L., M.M., G.G.T. and D.F.S.; project administration, M.M. and D.F.S.; funding acquisition, D.F.S. All authors have read and agreed to the published version of the manuscript.

Funding

Daniele Francesco Santamaria acknowledges the Research Program PIAno di inCEntivi per la Ricerca di Ateneo 2020/2022—Linea di Intervento 3 “Starting Grant”, University of Catania. Misael Mongiovì, Giusy Giulia Tuccari, and Carmelo Fabio Longo acknowledge FOSSR (Fostering Open Science in Social Science Research), funded by the European Union-NextGenerationEU under NRRP Grant Agreement n. MUR IR0000008.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author(s).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Muharram, A.P.; Purwarianti, A. Enhancing Natural Language Inference Performance with Knowledge Graph for COVID-19 Automated Fact-Checking in Indonesian Language. arXiv 2024, arXiv:2409.00061. [Google Scholar]
  2. Karkera, N.; Ghosh, S.; Escames, G.; Palaniappan, S.K. MelAnalyze: Fact-Checking Melatonin claims using LLMs and NLI. BioRxiv 2024. [Google Scholar]
  3. Martín, A.; Huertas-Tato, J.; Huertas-Garcia, A.; Villar-Rodríguez, G.; Camacho, D. FacTeR-Check: Semi-automated fact-checking through Semantic Similarity and Natural Language Inference. arXiv 2021, arXiv:2110.14532. [Google Scholar] [CrossRef]
  4. Bowman, S.R.; Angeli, G.; Potts, C.; Manning, C.D. A Large Annotated Corpus for Learning Natural Language Inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal, 17–21 September 2015. [Google Scholar]
  5. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.; Kaiser, L.; Polosukhin, I. Attention is All you Need. In Proceedings of the Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  6. Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL, Minneapolis, MN, USA, 2–7 June 2019. [Google Scholar]
  7. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692. [Google Scholar]
  8. Yang, Z.; Dong, S.; Hu, J. Explainable Natural Language Inference via Identifying Important Rationales. IEEE Trans. Artif. Intell. 2023, 4, 438–449. [Google Scholar] [CrossRef]
  9. Mongiovì, M.; Gangemi, A. Graph-based Retrieval for Claim Verification over Cross-document Evidence. In Complex Networks & Their Applications X; Springer: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
  10. Mongiovì, M.; Gangemi, A. GRAAL: Graph-Based Retrieval for Collecting Related Passages across Multiple Documents. Information 2024, 15, 318. [Google Scholar] [CrossRef]
  11. Levy, M.; Jacoby, A.; Goldberg, Y. Same Task, More Tokens: The Impact of Input Length on the Reasoning Performance of Large Language Models. In Proceedings of the ACL, Bangkok, Thailand, 11–16 August 2024. [Google Scholar]
  12. Chen, C.; Cai, F.; Hu, X.; Zheng, J.; Ling, Y.; Chen, H. An entity-graph based reasoning method for fact verification. Inf. Process. Manag. 2021, 58, 102472. [Google Scholar] [CrossRef]
  13. Lan, Y.; Liu, Z.; Gu, Y.; Yi, X.; Li, X.; Yang, L.; Yu, G. Multi-Evidence based Fact Verification via A Confidential Graph Neural Network. arXiv 2024, arXiv:2405.10481. [Google Scholar] [CrossRef]
  14. Thorne, J.; Vlachos, A.; Christodoulopoulos, C.; Mittal, A. FEVER: A Large-scale Dataset for Fact Extraction and VERification. arXiv 2018, arXiv:1803.05355. [Google Scholar]
  15. Giarelis, N.; Kanakaris, N.; Karacapilidis, N. An Innovative Graph-Based Approach to Advance Feature Selection from Multiple Textual Documents. In Proceedings of the AIAI 2020, IFIP AICT, Neos Marmaras, Greece, 5–7 June 2020; Springer: Cham, Switzerland, 2020; Volume 583. [Google Scholar]
  16. Jalil, Z.; Nasir, M.; Alazab, M.; Nasir, J.; Amjad, T.; Alqammaz, A. Grapharizer: A Graph-Based Technique for Extractive Multi-Document Summarization. Electronics 2023, 12, 1895. [Google Scholar] [CrossRef]
  17. Tosik, M. Abstract Meaning Representation—A Survey. 2015. Available online: https://www.melanietosik.com/files/amr.pdf (accessed on 19 May 2025).
  18. Banarescu, L.; Bonial, C.; Cai, S.; Georgescu, M.; Griffitt, K.; Hermjakob, U.; Knight, K.; Koehn, P.; Palmer, M.; Schneider, N. Abstract Meaning Representation for Sembanking. In Proceedings of the LAW VII, Sofia, Bulgaria, 8–9 August 2013. [Google Scholar]
  19. Bevilacqua, M.; Blloshmi, R.; Navigli, R. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. In Proceedings of the AAAI, Vancouver, BC, Canada, 20–27 February 2024. [Google Scholar]
  20. Diggelmann, T.; Boyd-Graber, J.; Bulian, J.; Ciaramita, M.; Leippold, M. CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims. arXiv 2021, arXiv:2012.00614. [Google Scholar] [CrossRef]
  21. Pew Research Center. News Consumption Across Social Media in 2021. Pew Research Center: Journalism & Media. 2021. Available online: https://www.pewresearch.org/journalism/2021/09/20/news-consumption-across-social-media-in-2021/ (accessed on 19 May 2025).
  22. Vosoughi, S.; Roy, D.; Aral, S. The spread of true and false news online. Science 2018, 359, 1146–1151. [Google Scholar] [CrossRef] [PubMed]
  23. Borges do Nascimento, I.J.; Pizarro, A.B.; Almeida, J.M.; Azzopardi-Muscat, N.; Gonçalves, M.A.; Björklund, M.; Novillo-Ortiz, D. Infodemics and health misinformation: A systematic review of reviews. Bull. World Health Organ. 2022, 100, 544–561. [Google Scholar] [CrossRef] [PubMed]
  24. Allcott, H.; Gentzkow, M. Social media and fake news in the 2016 election. J. Econ. Perspect. 2017, 31, 211–236. [Google Scholar] [CrossRef]
  25. Zhou, J.; Han, X.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. GEAR: Graph-Based Evidence Aggregating and Reasoning for Fact Verification. arXiv 2019, arXiv:1908.01843. [Google Scholar]
  26. Zhong, W.; Xu, J.; Tang, D.; Xu, Z.; Duan, N.; Zhou, M.; Wang, J.; Yin, J. Reasoning Over Semantic-Level Graph for Fact Checking. arXiv 2019, arXiv:1909.03745. [Google Scholar]
  27. Nie, Y.; Chen, H.; Bansal, M. Combining Fact Extraction and Verification with Neural Semantic Matching Networks. In Proceedings of the AAAI, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33. [Google Scholar]
  28. Ma, J.; Gao, W.; Joty, S.; Wong, K.F. Sentence-Level Evidence Embedding for Claim Verification with Hierarchical Attention Networks. In Proceedings of the ACL, Florence, Italy, 28 July–2 August 2019; pp. 2561–2571. [Google Scholar]
  29. Soleimani, A.; Monz, C.; Worring, M. BERT for Evidence Retrieval and Claim Verification. arXiv 2019, arXiv:1910.02655. [Google Scholar]
  30. Liu, Z.; Xiong, C.; Sun, M.; Liu, Z. Fine-Grained Fact Verification with Kernel Graph Attention Network. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Association for Computational Linguistics, Online, 5–10 July 2020; pp. 7342–7351. [Google Scholar]
  31. Hanselowski, A.; Zhang, H.; Li, Z.; Sorokin, D.; Schiller, B.; Schulz, C.; Gurevych, I. UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification. arXiv 2019, arXiv:1809.01479v5. [Google Scholar]
  32. Kim, J.; Park, S.; Kwon, Y.; Jo, Y.; Thorne, J.; Choi, E. FactKG: Fact Verification via Reasoning on Knowledge Graphs. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); Rogers, A., Boyd-Graber, J., Okazaki, N., Eds.; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 16190–16206. [Google Scholar]
  33. Vedula, N.; Parthasarathy, S. FACE-KEG: Fact Checking Explained Using KnowledgE Graphs. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Virtual Event, 8–12 March 2021; pp. 526–534. [Google Scholar]
  34. Hagberg, A.A.; Schult, D.A.; Swart, P.J. Exploring network structure, dynamics, and function using NetworkX. In Proceedings of the 7th Python in Science Conference (SciPy2008), Pasadena, CA USA, 19–24 August 2008; pp. 11–15. [Google Scholar]
  35. Kim, Y. ClimateBERT FactCheck Dataset, 2022. HuggingFace Datasets Repository. Available online: https://huggingface.co/datasets/Yoonseong/climatebert_factcheck (accessed on 19 May 2025).
  36. Robins, G.; Zelikovsky, A. Tighter Bounds for Graph Steiner Tree Approximation. SIAM J. Discrete Math. 2005, 19, 122–134. [Google Scholar] [CrossRef]
  37. Mehlhorn, K. A Faster Approximation Algorithm for the Steiner Problem in Graphs. Inf. Process. Lett. 1988, 27, 125–128. [Google Scholar] [CrossRef]
  38. Kou, L.; Markowsky, G.; Berman, L. A Fast Algorithm for Steiner Trees. Acta Inform. 1981, 15, 141–145. [Google Scholar] [CrossRef]
  39. Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv 2019, arXiv:1910.13461. [Google Scholar]
  40. Martínez Lorenzo, A.C.; Huguet Cabot, P.L.; Navigli, R. Cross-lingual AMR Aligner: Paying Attention to Cross-Attention. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 1726–1742. [Google Scholar]
  41. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, M.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Diagram of the EVOCA pipeline.
Figure 2. Distribution of cosine similarity scores between tokens.
Figure 3. Example of an LDC/ISI representation.
Figure 4. AMR graph of the sentence “By August 2014 a three year drought was prompting changes to the agriculture industry in the valley”.
Figure 5. The selected AMR nodes of the example evidence.
Figure 6. Graphical representation of the selected evidence before and after applying the minimum Steiner tree algorithm.
Figure 7. Penman representation of the example sentence before the reconstruction step.
Figure 8. Penman representation of the example sentence after the reconstruction step.
Figure 9. Comparison between the average length of the evidence statements of ClimateBERT and the average length of evidence statements computed by EVOCA.
Table 1. Examples of claim–evidence pairs annotated with their respective semantic category, used for the natural language inference task in the context of fact-checking.

Premise | Label | Hypothesis
Children smiling and waving at camera | entailment | There are children present
A Little League team tries to catch a runner sliding into a base in an afternoon game. | neutral | A team is trying to score the game’s winning out.
An older man is drinking orange juice at a restaurant. | contradiction | Two women are at a restaurant drinking wine.
Table 2. Examples of evidence–claim pairs and their corresponding sub-evidence.

Evidence | Sub-Evidence | Claim
By August 2014, a three-year drought was prompting changes to the agriculture industry in the valley. | A three-year drought. | While the north-east, midwest and upper great plains have experienced a 30% increase in heavy rainfall episodes—considered once-in-every-five year downpours—parts of the west, particularly California, have been parched by drought.
The IUGG concurs with scientific assessments stating that human activities are the primary cause of recent climate change. | Activities which are the primary evidence of recent climate change. | The IPCC was formed to build the scientific case for humanity being the primary cause of global warming.
This increase in acidity inhibits all marine life—having a greater impact on smaller organisms as well as shelled organisms (see scallops). | All marine life will be inhibited by this increase in acidity, which will have a great impact on the small shells and fishes. | More than half of the 44 studies selected for publication found that raised levels of CO2 had little or no impact on marine life, including crabs, limpets, sea urchins, and sponges.
Table 3. A comparison between the graphs and sub-graphs of the example evidence statements.

Subgraph Dimension | Subgraph | Original Graph Dimension | Original Graph
5 nodes, 4 edges | (drought :ARG0 prompt-01 :duration (temporal-quantity :quant 3 :unit year)) | 13 nodes, 12 edges | (z0 / prompt-01 :ARG0 (z1 / drought :duration (z2 / temporal-quantity :quant 3 :unit (z3 / year))) :ARG1 (z4 / change-01 :ARG1 (z5 / industry :mod (z6 / agriculture) :location (z7 / valley))) :time (z8 / by :op1 (z9 / date-entity :month 8 :year 2014)))
7 nodes, 6 edges | (cause-01 :ARG1 evidence-01 :ARG0 activity-06 :ARG1 (change-01 :ARG1-of climate :time recent) :mod primary) | 36 nodes, 37 edges | (z0 / concur-01 :ARG0 (z1 / organization :wiki “International_Panel_on_Climate_Change” :name (z2 / name :op1 “IUGG”)) :ARG1 (z3 / and :op1 (z4 / assess-01 :ARG0 (z5 / organization :wiki “Intergovernmental_Panel_on_Climate_Change” :name (z6 / name :op1 “Intergovernmental” :op2 “Panel” :op3 “on” :op4 “Climate” :op5 “Change”)) :mod (z7 / science) :mod (z8 / comprehensive) :ARG1-of (z9 / accept-01 :ARG1-of (z10 / wide-02)) :ARG1-of (z11 / endorse-01 :ARG1-of z10)) :op2 (z12 / establish-01 :ARG0 (z13 / and :op1 (z14 / body :mod (z15 / region)) :op2 (z16 / body :mod (z17 / nation))) :ARG1 (z18 / evidence-01 :ARG0 z7 :ARG1 (z19 / cause-01 :ARG0 (z20 / activity-06 :ARG0 (z21 / human)) :ARG1 (z22 / change-01 :ARG1 (z23 / climate) :time (z24 / recent)) :mod (z25 / primary))) :ARG1-of (z26 / firm-03))) :medium (z27 / it))
14 nodes, 13 edges | (inhibit-01 :ARG0 (increase-01 :ARG1 acidity :mod this) :ARG1 (life :mod all :mod marine :ARG0-of (impact-01 :ARG1 (and :op1 (organism :mod small :mod shell) :op2 organism) :mod great))) | 21 nodes, 20 edges | (z0 / multi-sentence :snt1 (z1 / inhibit-01 :ARG0 (z2 / increase-01 :ARG1 (z3 / acidity) :mod (z4 / this)) :ARG1 (z5 / life :mod (z6 / all) :mod (z7 / marine) :ARG0-of (z8 / impact-01 :ARG1 (z9 / and :op1 (z10 / organism :mod (z11 / small :degree (z12 / more))) :op2 (z13 / organism :mod (z14 / shell))) :mod (z15 / great :degree (z16 / more))))) :snt2 (z17 / see-01 :mode imperative :ARG0 (z18 / you) :ARG1 (z19 / scallop)))
Table 4. Performance comparison of EVOCA.

Evidence Type | Model | Learning Rate | F1-Score | Accuracy
Full evidence | RoBERTa-large-MNLI | 2 × 10⁻⁵ | 0.681 | 0.676
Reduced evidence | RoBERTa-large-MNLI | 2 × 10⁻⁵ | 0.662 | 0.670
Full evidence | RoBERTa-large-MNLI | 1 × 10⁻⁶ | 0.724 | 0.720
Reduced evidence | RoBERTa-large-MNLI | 1 × 10⁻⁶ | 0.669 | 0.670
Full evidence | BART-large-MNLI | 1 × 10⁻⁶ | 0.706 | 0.736
Reduced evidence | BART-large-MNLI | 1 × 10⁻⁶ | 0.665 | 0.664
Full evidence | BART-large-MNLI | 2 × 10⁻⁵ | 0.651 | 0.642
Reduced evidence | BART-large-MNLI | 2 × 10⁻⁵ | 0.642 | 0.648
Full evidence | BART-large-MNLI | 3 × 10⁻⁵ | 0.683 | 0.687
Reduced evidence | BART-large-MNLI | 3 × 10⁻⁵ | 0.679 | 0.687
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

