Knowledge Enhancement and Semantic Information-Fused Emotion–Cause Pair Extraction

Li, Shi; Wang, Yuqian

doi:10.3390/info17010042


by Shi Li and Yuqian Wang *

College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China

* Author to whom correspondence should be addressed.

Information 2026, 17(1), 42; https://doi.org/10.3390/info17010042

Submission received: 4 December 2025 / Revised: 28 December 2025 / Accepted: 31 December 2025 / Published: 4 January 2026

(This article belongs to the Section Artificial Intelligence)

Abstract

Emotion–cause pair extraction is a crucial task in natural language processing that identifies emotional expressions and their corresponding causes within text. Despite substantial progress, most current approaches depend on sequence modeling or standard attention mechanisms, which frequently overlook intricate inter-sentential relationships and fail to utilize causal commonsense knowledge to enhance semantic links between clauses. To address these limitations, this paper introduces KESIF, a novel emotion–cause pair extraction model that integrates knowledge enhancement with enriched semantic information for improved performance. The proposed model incorporates a graph attention network to capture semantic dependency relationships between sentences, integrates causal commonsense knowledge from the ATOMIC knowledge base to enrich semantic representations, and utilizes a bidirectional MRC mechanism for achieving effective bidirectional matching between emotions and causes. The model’s performance is assessed using core metrics, such as precision, recall, and F1 score. Experimental results on both Chinese and English datasets demonstrate that our method outperforms SOTA baselines.

Keywords: natural language processing; emotion–cause pair extraction; knowledge enhancement; graph attention network

1. Introduction

Identifying the causes behind emotions is a critical and challenging research topic in sentiment analysis that has received increasing attention in recent years [1]. Early studies primarily focused on the sentence-level emotion–cause extraction (ECE) task, which aims to retrieve cause clauses associated with emotion clauses. However, the ECE task faces limitations in practical applications due to its reliance on emotion annotation. To address this issue, recent research has introduced the emotion–cause pair extraction (ECPE) task [1], which aims to extract potential emotion–cause clause pairs from documents, thus broadening the scope of sentiment analysis applications. As illustrated in Figure 1, in the ECPE corpus example, blue and green clauses represent emotion clauses and cause clauses, respectively. Red phrases indicate commonsense reasoning knowledge (extracted from ATOMIC) associated with each clause. This document contains seven distinct clauses, among which two are emotion clauses: clause c1 (“Weary and agonized”) and clause c7 (“Strong and positive”). For emotion clause c1, the corresponding cause clauses are c2 and c3; for emotion clause c7, the corresponding cause clauses are c5 and c6. The output consists of all emotion–cause pairs in the document: (c1, c2), (c1, c3), (c7, c5), and (c7, c6).

Current research on the emotion–cause pair extraction task faces several critical challenges: the two-stage processing paradigm for subtasks hinders joint optimization, thereby limiting the effectiveness of information interaction and global modeling; most existing methods inadequately capture fine-grained lexical, semantic, and syntactic features—this limitation is particularly prominent in scenarios involving multiple causal relationships; task decomposition and sliding window mechanisms often fail to capture complex relationships, including one-to-many, many-to-one, and long-distance dependencies. As a result, these approaches may provide incomplete coverage and hinder generalization in real-world scenarios.

Most early methods for the emotion–cause pair extraction task adopted a two-stage processing paradigm to establish associations between emotions and causes. The core logic is to decompose ECPE into two independent subtasks executed sequentially. The first stage is clause classification, which identifies emotion clauses and cause clauses separately in the document; notably, this stage does not establish any association between emotions and causes. The second stage is pairing and association: based on the sets of emotion clauses and cause clauses extracted in the first stage, rule-based matching or simple classification models select the emotion–cause pairs that conform to causal logic. While this two-stage paradigm reduced modeling complexity in early work, it has become a core bottleneck in subsequent research. On the one hand, the independence of the two stages propagates classification errors from the first stage directly to the second, resulting in error accumulation. On the other hand, the optimization objectives of emotion extraction and cause extraction are inconsistent (the former emphasizes emotion word matching, while the latter focuses on capturing causal clues), which prevents global information interaction. In scenarios involving multiple causal relationships in particular, pairing accuracy and generalization ability are significantly constrained.

To address these challenges, this paper proposes a knowledge-enhanced and semantic information-fused emotion–cause pair extraction model (KESIF). KESIF leverages ATOMIC knowledge, graph attention, and bidirectional machine reading comprehension to better handle long-distance dependencies and implicit causality. The main contributions of this study are as follows:

(1) We propose a commonsense-enhanced semantic representation framework specifically designed for ECPE tasks. This study introduces the ATOMIC commonsense knowledge base, selecting emotion- and intention-related knowledge (xReact, xWant) to enhance clause representations. To address inaccurate knowledge matching, Sentence-BERT calculates semantic similarity between clauses and ATOMIC knowledge tuples, enabling precise retrieval and integration of relevant commonsense for causal reasoning. This design enables the model to recognize implicit causal relationships, a key limitation of traditional models that rely solely on explicit textual clues.

(2) We propose a graph attention network (GAT)-based global dependency modeling approach to handle complex inter-clause relationships. The GAT constructs a fully connected clause graph, dynamically adjusting edge weights via attention mechanisms based on contextual distance and commonsense similarity. This enables the model to comprehensively capture global semantic and causal associations among clauses, overcoming the limitations of local context modeling in previous methods.

(3) We design a bidirectional cross-validation mechanism with four custom query types to optimize emotion–cause matching accuracy. This study proposes two reasoning directions (emotion-to-cause and cause-to-emotion) and four query types (static: emotion extraction, cause extraction; dynamic: emotion-specific cause query, cause-specific emotion query) to cover all ECPE reasoning scenarios. This bidirectional framework reduces one-way reasoning bias and significantly improves extraction accuracy for documents with multiple emotion–cause pairs, where traditional models often underperform.

The KESIF model directly targets joint extraction of emotion–cause pairs by fusing the three core components above. This enables global information interaction and semantic modeling and also resolves key issues of error accumulation and mismatched optimization in two-stage tasks, improving the model’s robustness and generalization.

2. Related Work

2.1. Emotion–Cause Extraction

Lee et al. [2] introduced the emotion–cause extraction task, built a dataset, and defined it as a word-level tagging problem. Early studies adopted rule-based [3] and machine learning methods [4]. Chen et al. [5] proposed a clause-level approach. Gui et al. [6] expanded linguistic clues and released a benchmark corpus. In the deep learning era, Li et al. [7] used CNNs and attention mechanisms to optimize extraction by integrating positional information [8]. Xia and Ding [1] proposed the emotion–cause pair extraction (ECPE) task. Early step-wise methods suffered from error propagation. Subsequent end-to-end models [9,10] used ensemble frameworks but struggled to capture context. Hua et al. [11] addressed ECPE challenges in unsupervised domain adaptation. Hu et al. [12] split ECPE into emotion-oriented cause and cause-oriented emotion prediction modules. More recently, large language models (LLMs) such as LLaMA [13] have excelled at various NLP tasks [14] in zero- or few-shot settings without parameter updates, and their power and potential have motivated attempts to solve ECPE with LLMs. However, due to the generative paradigm, directly applying large models to information extraction often yields high recall but low precision [15]. Wang et al. [16] constrained prompts to require only one emotion–cause pair as output, an approach that fails to achieve satisfactory performance and conflicts with practical requirements. Most LLM-based methods also suffer from large parameter sizes and high inference costs, leaving room to optimize lightweight models for resource-limited scenarios.

2.2. Machine Reading Comprehension

Machine reading comprehension (MRC) focuses on text-based question-answering tasks, with answers presented in diverse formats. Numerous natural language processing tasks have been reformulated as MRC, such as relation extraction [17]. Multi-turn MRC has been applied to entity-relation extraction [18], where multi-turn QA improves handling of complex problems. Li et al. [19] addressed event extraction by framing trigger identification, classification, and argument extraction as multi-turn MRC in a pipeline. Liu et al. [20] explored unsupervised question generation for event extraction. Mao et al. [21] constructed two MRC questions and solved aspect-based sentiment analysis via end-to-end training. Chen et al. [22] proposed a bidirectional MRC framework for aspect sentiment triplet extraction.

2.3. ATOMIC Knowledge Retrieval and Query Formulation

External knowledge is widely used in sentiment analysis. Russo et al. [23] constructed a commonsense knowledge base; Neviarouskaya et al. [24] developed OCC-based rules; Ma et al. [25] used frameworks, lexicons, and knowledge paths for analysis. However, existing methods rely on explicit emotion words, while the ATOMIC knowledge graph demonstrates reasoning advantages in emotion detection [26]. ATOMIC is a large-scale commonsense knowledge base that contains event-centric sociocultural characteristics and “if-then” reasoning tuples. Of its nine relation types, xReact and xWant infer how an individual feels about a given event and what the individual wants as a result. Combined with explicit emotional and intent features, these inferred states help detect implicit causal expressions and reasoning clues. Thus, reaction- and intent-type commonsense is extracted for each clause from xReact and xWant in ATOMIC. A common knowledge acquisition approach links events in text to ATOMIC commonsense tuples through matching [27]. Sentence-BERT, which is suited to clause-level semantic similarity [28], is used for this matching: for each clause, Sentence-BERT scores the clause against each event in the knowledge graph, and the reaction and intent commonsense of the best-matching event is adopted as external knowledge for the clause.

As shown in Figure 2, this paper adopts four main query types: emotion extraction, cause identification, linking an emotion to its cause, and identifying the emotion triggered by a given cause. These are grouped into two categories: static and dynamic queries. Static queries use templates that need no target, covering emotion and cause extraction. Dynamic queries combine templates with specific target content, including emotion-specific cause queries and cause-specific emotion queries.
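The split between static and dynamic queries can be sketched as follows. This is only an illustration: the template wording and the names `STATIC_TEMPLATES`, `DYNAMIC_TEMPLATES`, and `build_query` are hypothetical, since the exact query text is defined in Figure 2 and not reproduced here.

```python
from typing import Optional

# Hypothetical template wording; the actual queries are those shown in Figure 2.
STATIC_TEMPLATES = {
    "emotion": "Which clauses express an emotion?",
    "cause": "Which clauses describe a cause?",
}
DYNAMIC_TEMPLATES = {
    "emotion_to_cause": "Which clauses cause the emotion in: {target}",
    "cause_to_emotion": "Which clauses express the emotion triggered by: {target}",
}


def build_query(query_type: str, target: Optional[str] = None) -> str:
    """Return the query string for one of the four query types."""
    if query_type in STATIC_TEMPLATES:
        # Static queries need no target clause.
        if target is not None:
            raise ValueError("static queries take no target clause")
        return STATIC_TEMPLATES[query_type]
    if query_type in DYNAMIC_TEMPLATES:
        # Dynamic queries combine the template with a specific target clause.
        if not target:
            raise ValueError("dynamic queries need a target clause")
        return DYNAMIC_TEMPLATES[query_type].format(target=target)
    raise KeyError(f"unknown query type: {query_type}")
```

For example, `build_query("emotion")` needs no argument, while `build_query("emotion_to_cause", "Weary and agonized")` embeds the emotion clause from Figure 1 into the query.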

3. Methodology

The architecture of the KESIF model for emotion–cause pair extraction is shown in Figure 3. It includes three parts: a document encoder for contextual representation, a graph attention module for semantic dependency, and a prediction module for joint extraction of emotion–cause pairs.

3.1. Task Definition

In the emotion–cause pair extraction task, the input is a document $d = [c_{1}, c_{2}, \dots, c_{N}]$ containing multiple clauses, where each clause $c_{i} = [t_{1}^{i}, t_{2}^{i}, \dots, t_{n_{i}}^{i}]$ consists of multiple tokens. Here, $N$ denotes the number of clauses in the document, and $n_{i}$ represents the number of tokens in clause $c_{i}$.

The goal of the ECPE task is to extract all clause-level emotion–cause pairs from the document $d$:

$$P = \{\dots, (c^{e}, c^{c}), \dots\}$$

where $c^{e}$ denotes an emotion clause, and $c^{c}$ denotes the corresponding cause clause.
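As a concrete illustration of this definition, the Figure 1 example from the Introduction can be written as a small data structure. The `Document` class and its method names are hypothetical, and only clauses c1 and c7 are quoted in the text, so the remaining clause strings are placeholders.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Document:
    """A document d = [c_1, ..., c_N] with its gold emotion-cause pairs P."""
    clauses: List[str]
    pairs: Set[Tuple[int, int]]  # (emotion clause index, cause clause index), 1-based

    def emotion_indices(self) -> List[int]:
        return sorted({e for e, _ in self.pairs})

    def cause_indices(self) -> List[int]:
        return sorted({c for _, c in self.pairs})


# Only c1 and c7 are quoted in the text; "<c2>" ... "<c6>" are placeholders.
doc = Document(
    clauses=["Weary and agonized", "<c2>", "<c3>", "<c4>", "<c5>", "<c6>",
             "Strong and positive"],
    pairs={(1, 2), (1, 3), (7, 5), (7, 6)},
)
```

Here `doc.emotion_indices()` gives the two emotion clauses (c1 and c7) and `doc.cause_indices()` gives the four cause clauses (c2, c3, c5, c6).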

3.2. Document Encoding

BERT-Base [29] serves as the base encoder to generate clause representations. By integrating external commonsense knowledge, the model effectively fuses implicit information. For each clause, relevant knowledge is retrieved and concatenated as a third sequence, separated by [SEP] tokens. The query, clause, and this external knowledge are then input into BERT. This third sequence helps BERT produce richer clause representations.

To generate the representation of clause $c_{i} = [t_{1}^{i}, t_{2}^{i}, \dots, t_{n_{i}}^{i}]$, [CLS] and [SEP] tokens are added at the beginning and end of the clause: $[\mathrm{CLS}], t_{1}^{i}, t_{2}^{i}, \dots, t_{n_{i}}^{i}, [\mathrm{SEP}]$. To distinguish different clauses in the document, alternating segment embeddings $(E_{A}, E_{B}, E_{A}, \dots)$ are assigned: clauses at odd positions use $E_{A}$, while those at even positions use $E_{B}$.
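The clause wrapping and alternating segment assignment can be sketched as below. This is a simplified illustration, not the model's actual preprocessing: the function name `build_inputs` is hypothetical, clauses are assumed to be pre-tokenized, segment ids 0 and 1 stand in for $E_{A}$ and $E_{B}$, and the query and knowledge sequences are omitted.

```python
from typing import List, Tuple


def build_inputs(clauses: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Wrap each clause in [CLS] ... [SEP] and alternate segment ids per clause."""
    tokens: List[str] = []
    segment_ids: List[int] = []
    for position, clause in enumerate(clauses, start=1):
        piece = ["[CLS]"] + clause + ["[SEP]"]
        # Odd clause positions use E_A (id 0); even positions use E_B (id 1).
        segment = 0 if position % 2 == 1 else 1
        tokens.extend(piece)
        segment_ids.extend([segment] * len(piece))
    return tokens, segment_ids
```

Each clause keeps its own [CLS] token, whose output vector can then serve as the clause representation.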

Semantic similarity between input clauses and all event tuples in ATOMIC is calculated with Sentence-BERT. For each clause, the xReact and xWant knowledge with the highest similarity is retrieved. To ensure knowledge quality and reduce noise, a similarity threshold (0.5 in experiments) is used. If similarity falls below this threshold, the clause is considered to have no highly relevant external knowledge, and no external sequence is introduced; instead, the model then relies solely on the original clause text for encoding.
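The thresholded retrieval step can be sketched as follows. It assumes clause and event embeddings have already been computed (for example by Sentence-BERT) and uses cosine similarity, which is an assumption about the similarity measure; the `Event` class, the function names, and any vectors passed in are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Event:
    vec: Sequence[float]  # precomputed embedding of the ATOMIC event
    x_react: str          # xReact knowledge attached to the event
    x_want: str           # xWant knowledge attached to the event


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_knowledge(clause_vec: Sequence[float], events: List[Event],
                       threshold: float = 0.5) -> Optional[Dict[str, str]]:
    """Return xReact/xWant of the most similar event, or None below the threshold."""
    best, best_sim = None, -1.0
    for event in events:
        sim = cosine(clause_vec, event.vec)
        if sim > best_sim:
            best, best_sim = event, sim
    if best is None or best_sim < threshold:
        return None  # no external sequence is added for this clause
    return {"xReact": best.x_react, "xWant": best.x_want}
```

Returning `None` corresponds to the fallback case in which the model encodes only the original clause text.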

3.3. Graph Attention Module

The BERT encoder takes the query, document, and ATOMIC external commonsense knowledge as inputs, generating a representation matrix $H = \{h_{c_{0}}, h_{c_{1}}, \dots, h_{c_{N}}\} \in \mathbb{R}^{(N + 1) \times m}$, where $N$ denotes the number of clauses in the document, and $m$ is the output representation dimension of BERT. Subsequently, a clause graph is constructed based on the representation matrix $H$ output by the BERT encoder, and a graph attention network is used to model the relationships between clauses. Each clause representation $h_{c_{i}}$ serves as a node in the graph. A fully connected graph is constructed using contextual distance and commonsense similarity between clauses, with edge weights dynamically computed by an attention mechanism. For each node $c_{i}$, the GAT updates its representation as follows:

$$z_{c_{i}} = W h_{c_{i}}$$

where $W \in \mathbb{R}^{d \times m}$ denotes a learnable weight matrix, and $z_{c_{i}} \in \mathbb{R}^{d}$ represents the transformed representation of clause $c_{i}$. The attention coefficient between clause $c_{i}$ and its neighboring clause $c_{j}$ is calculated as follows:

$$e_{c_{i} c_{j}} = \mathrm{LeakyReLU} (a^{\top} [z_{c_{i}} \oplus z_{c_{j}}])$$

where $a \in \mathbb{R}^{2 d}$ denotes the parameter vector of the attention mechanism, the symbol $\oplus$ represents vector concatenation, and $e_{c_{i} c_{j}}$ denotes the attention score between clause $c_{i}$ and clause $c_{j}$. The attention coefficients are normalized using the softmax function:

$$\alpha_{c_{i} c_{j}} = \frac{\exp (e_{c_{i} c_{j}})}{\sum_{k \in \mathcal{N} (c_{i})} \exp (e_{c_{i} c_{k}})}$$

where $\mathcal{N} (c_{i})$ denotes the set of neighboring clauses of $c_{i}$ (indexed by $j$ and $k$), and $\alpha_{c_{i} c_{j}}$ represents the normalized attention weight from clause $c_{i}$ to clause $c_{j}$. The representation of clause $c_{i}$ in the $l$-th graph attention layer is then updated by aggregating information from its neighboring nodes via weighted summation:

$$z_{c_{i}}^{\prime (l)} = \sigma \Big(\sum_{j \in \mathcal{N} (c_{i})} \alpha_{c_{i} c_{j}} W^{(l)} z_{c_{j}}^{\prime (l - 1)}\Big)$$

where $\sigma$ denotes the ReLU nonlinear activation function, $W^{(l)}$ is the learnable weight matrix of the $l$-th layer, $z_{c_{j}}^{\prime (l - 1)}$ represents the input representation of clause $c_{j}$ for the $l$-th layer (the output representation of the $(l - 1)$-th layer), and $z_{c_{i}}^{\prime (l)}$ denotes the updated representation of clause $c_{i}$ after processing by the $l$-th graph attention layer.
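The three steps above (linear transform, LeakyReLU attention scores with softmax normalization, and ReLU-activated weighted aggregation) can be sketched as a single-head layer in plain Python. This is a minimal sketch under stated assumptions, not the model's implementation: the function names are hypothetical, every clause attends to all clauses including itself, the LeakyReLU slope of 0.2 is an assumed default, and the contextual-distance and commonsense-similarity terms are omitted.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def gat_layer(H: Matrix, W: Matrix, a: Vector,
              slope: float = 0.2) -> Tuple[Matrix, Matrix]:
    """One single-head graph attention layer over a fully connected clause graph."""
    Z = [matvec(W, h) for h in H]  # transformed node representations
    outputs: Matrix = []
    attention: Matrix = []
    for z_i in Z:
        # Score every neighbour: LeakyReLU(a^T [z_i (+) z_j]); (+) is concatenation.
        scores = [leaky_relu(sum(ak * v for ak, v in zip(a, z_i + z_j)), slope)
                  for z_j in Z]
        # Softmax over neighbours (shifted by the max for numerical stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Weighted sum of neighbour representations, followed by ReLU.
        agg = [sum(alpha * z_j[k] for alpha, z_j in zip(alphas, Z))
               for k in range(len(z_i))]
        outputs.append([max(0.0, v) for v in agg])
        attention.append(alphas)
    return outputs, attention
```

Stacking several layers amounts to feeding the returned `outputs` into the next call with that layer's own `W` and `a`.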

By stacking $L$ graph attention layers, the global emotional and causal relationships between clauses are captured incrementally. The output of the $l$-th layer is formulated as:

$$Z^{(l)} = \mathrm{GAT}^{(l)} (Z^{(l - 1)})$$

where $Z^{(l)} = [z_{c_{1}}^{\prime (l)}; z_{c_{2}}^{\prime (l)}; \dots; z_{c_{N}}^{\prime (l)}] \in \mathbb{R}^{N \times d}$ denotes the output representation matrix of the $l$-th layer (the semicolon indicates vertical stacking of vectors), and $Z^{(0)} = H$ (the output of the BERT encoder).

After $L$ graph attention layers, the final clause representation matrix is obtained:

$$Z = Z^{(L)} = [z_{c_{1}}^{\prime\prime}; z_{c_{2}}^{\prime\prime}; \dots; z_{c_{N}}^{\prime\prime}] \in \mathbb{R}^{N \times d}$$

where $z_{c_{i}}^{\prime\prime} = z_{c_{i}}^{\prime (L)}$ (i.e., the output representation of clause $c_{i}$ from the $L$-th graph attention layer) represents the final representation of clause $c_{i}$, which integrates global semantic information and commonsense knowledge. The clause representations output by the graph attention network (GAT) are then fed into a bidirectional long short-term memory network (BiLSTM) to further optimize the contextual representations of the clauses.

Since the method involves four distinct query types, the corresponding outputs of BiLSTM are represented using the following sets:

$$H^{e} = \{h_{c_{0}}^{e}, h_{c_{1}}^{e}, \dots, h_{c_{N}}^{e}\}, \quad H^{c} = \{h_{c_{0}}^{c}, h_{c_{1}}^{c}, \dots, h_{c_{N}}^{c}\},$$

$$H^{e c} = \{h_{c_{0}}^{e c}, h_{c_{1}}^{e c}, \dots, h_{c_{N}}^{e c}\}, \quad H^{c e} = \{h_{c_{0}}^{c e}, h_{c_{1}}^{c e}, \dots, h_{c_{N}}^{c e}\} \in \mathbb{R}^{(N + 1) \times m},$$

where $h_{c_{i}}^{e}$, $h_{c_{i}}^{c}$, $h_{c_{i}}^{e c}$, and $h_{c_{i}}^{c e}$ denote the contextual representations of the $i$-th clause $c_{i}$ (corresponding to different query types).

BiLSTM is not used as a standalone sequential model, but as a complement to GAT for two key reasons. First, GAT captures non-sequential global dependencies but may miss temporal order and local context—BiLSTM’s bidirectional propagation compensates for this, preserving linear semantic coherence. Second, compared with complex sequential models (e.g., Transformer variants), BiLSTM is computationally efficient and uses fewer parameters, aiding a balance between performance and inference speed. Integrating BiLSTM also stabilizes performance on long documents, confirming its unique role in robust contextual representation.

3.4. Pair Prediction

After obtaining clause representations, our model makes predictions for the four query types. Although MRC has been applied to many NLP tasks, the bidirectional MRC framework proposed here differs from traditional single-directional designs in three ways. First, unlike existing MRC-based ECPE models that focus solely on “emotion-to-cause” reasoning, our framework covers both the “emotion-to-cause” and “cause-to-emotion” directions, enabling cross-validation that reduces false positives from one-way bias. Second, the four query types are tailored to ECPE: emotion and cause extraction queries support candidate screening, while bidirectional association queries focus on pair matching, integrating the subtasks with the main task. Third, unlike LLM-based generative MRC methods, our bidirectional MRC framework avoids high inference costs and low precision, and is more compatible with knowledge enhancement and graph modeling, enabling end-to-end optimization of knowledge fusion, dependency modeling, and pair prediction. A bidirectional loss function $L_{1}$ that combines the loss terms of both directions is then used to train the model:

$$L_{1} = L_{E C} + L_{C E}$$

$$L_{E C} = - \sum_{i = 1}^{N} y_{i}^{e} \log (p_{i}^{e}) - \sum_{j \in L^{e}} \sum_{i = 1}^{N} y_{i}^{e c_{j}} \log (p_{i}^{e c_{j}})$$

$$L_{C E} = - \sum_{i = 1}^{N} y_{i}^{c} \log (p_{i}^{c}) - \sum_{j \in L^{c}} \sum_{i = 1}^{N} y_{i}^{c e_{j}} \log (p_{i}^{c e_{j}})$$

where $y_{i}^{e}$ and $p_{i}^{e}$ denote the ground-truth label and predicted probability for the emotion query of clause $c_{i}$; $y_{i}^{e c_{j}}$ and $p_{i}^{e c_{j}}$ correspond to the ground-truth label and predicted probability for the emotion-to-cause query (whether clause $c_{i}$ is a cause of emotion clause $c_{j}$); $y_{i}^{c}$ and $p_{i}^{c}$ represent the ground-truth label and predicted probability for the cause query of clause $c_{i}$; and $y_{i}^{c e_{j}}$ and $p_{i}^{c e_{j}}$ denote the ground-truth label and predicted probability for the cause-to-emotion query (whether clause $c_{i}$ is the emotion clause triggered by cause clause $c_{j}$). $L^{e}$ and $L^{c}$ denote the index sets of the ground-truth emotion clauses and cause clauses, respectively.
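The emotion-to-cause term $L_{E C}$ can be sketched numerically as below: one log-loss over the emotion query plus one log-loss per emotion clause over its cause query. The function names and the dictionary layout (keyed by emotion clause index) are hypothetical. Following the equation as written, only the $y \log p$ term is summed; a full binary cross-entropy would also include the $(1 - y) \log (1 - p)$ term.

```python
import math
from typing import Dict, List


def positive_log_loss(labels: List[int], probs: List[float],
                      eps: float = 1e-12) -> float:
    """-sum_i y_i * log(p_i), with probabilities clipped away from zero."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(labels, probs))


def loss_ec(y_e: List[int], p_e: List[float],
            y_ec: Dict[int, List[int]], p_ec: Dict[int, List[float]]) -> float:
    """Emotion query loss plus one emotion-to-cause query loss per emotion clause j."""
    loss = positive_log_loss(y_e, p_e)
    for j, labels in y_ec.items():
        loss += positive_log_loss(labels, p_ec[j])
    return loss
```

The cause-to-emotion term $L_{C E}$ has the same shape with the roles of emotion and cause swapped, so the same helper can be reused for it.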

In training, the second round of machine reading comprehension constructs specific queries using actual emotion and cause clauses. In inference, queries are generated from the previous round’s outputs, causing a distribution mismatch between training and inference. This mismatch introduces exposure bias, which can lead to error accumulation. To address this, we follow [29] and define the loss function $L_{2}$ as:

$$L_{2} = - \sum_{j \in \hat{L}^{e}} \sum_{i = 1}^{N} y_{i}^{e c_{j}} \log (p_{i}^{e c_{j}}) - \sum_{j \in \hat{L}^{c}} \sum_{i = 1}^{N} y_{i}^{c e_{j}} \log (p_{i}^{c e_{j}})$$

where $\hat{L}^{e}$ denotes the index set of pseudo-emotion clauses, and $\hat{L}^{c}$ denotes the index set of pseudo-cause clauses. $y_{i}^{e c_{j}}$ is the label indicating whether clause $c_{i}$ is a cause clause under the condition of pseudo-emotion clause $c_{j}$; correspondingly, $y_{i}^{c e_{j}}$ is the label indicating whether clause $c_{i}$ is an emotion clause under the condition of pseudo-cause clause $c_{j}$.

The final loss of the KESIF model is the sum of $L_{1}$, $L_{2}$, and a regularization term:

$$L = L_{1} + L_{2} + \lambda \lVert \theta \rVert^{2}$$

where $\lambda$ is the coefficient of the $\ell_{2}$-norm regularization term, and $\theta$ denotes all parameters used in the model. During inference, ground-truth emotion and cause clauses are unavailable, so we construct the second-round MRC queries from the first-round results. To integrate the candidate sets, the model uses the Harmonics and Complementary strategies from [29]. After integration, we apply emotion filtering: a valid emotion clause must contain at least one emotion word from the lexicon to ensure extraction accuracy.
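The emotion filtering rule can be sketched as a simple post-processing step over candidate pairs. The function name and arguments are hypothetical, and whitespace tokenization is an assumption made for illustration; Chinese text would require word segmentation before the lexicon lookup.

```python
from typing import Dict, List, Set, Tuple


def filter_pairs(pairs: List[Tuple[int, int]], clauses: Dict[int, str],
                 lexicon: Set[str]) -> List[Tuple[int, int]]:
    """Keep (emotion, cause) pairs whose emotion clause contains a lexicon word."""
    kept = []
    for emotion_idx, cause_idx in pairs:
        # Lowercase and strip surrounding punctuation before the lookup.
        words = [w.strip(".,!?;:\"'").lower() for w in clauses[emotion_idx].split()]
        if any(word in lexicon for word in words):
            kept.append((emotion_idx, cause_idx))
    return kept
```

A candidate pair whose emotion clause contains no lexicon word is discarded even if both reasoning directions proposed it.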

4. Experiments

4.1. Datasets and Settings

We conducted experiments on two benchmark datasets: a Chinese dataset [1] and an English dataset [30]. The Chinese dataset consists of 1945 documents collected from SINA city news, while the English dataset contains 2843 documents gathered from novels. Each document in both datasets includes at least one emotion–cause pair. Detailed statistical data are presented in Table 1. For the Chinese dataset, we adopted the same data split as used by Xia and Ding [1], and the dataset is split into two parts: 90% for training and the remaining 10% for testing. For the English dataset, we used the same data split as Singh et al. [31], and the dataset is divided into a training/development/test set in a ratio of 8:1:1. To obtain statistically reliable results, we repeated the experiment for 20 rounds and report the average results to ensure robust outcomes across all settings.

In the experiments, the BERT-Base model [32] was used, with AdamW as the optimizer and a batch size of 2. The learning rate of BERT was set to $2 \times 10^{-5}$, while the learning rate of other parameters (e.g., BiLSTM and fully connected layers) was $1 \times 10^{-4}$. The hidden layer dimension of BiLSTM was 100, and the $L_{2}$ regularization coefficient was $1 \times 10^{-5}$. The value of $\alpha$ was set to 0.1. The learning rate adopted a linear warm-up in the first 10% of training steps, followed by linear decay. The training process employed an early stopping strategy, with a maximum of 15 training epochs.
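The learning-rate schedule described above can be sketched as a pure function of the training step. The function name `lr_at_step` is hypothetical, and decaying all the way to zero at the final step is an assumption, since the text only states that the decay is linear.

```python
def lr_at_step(step: int, total_steps: int, peak_lr: float,
               warmup_ratio: float = 0.1) -> float:
    """Linear warm-up over the first 10% of steps, then linear decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Ramp up linearly from 0 to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Decay linearly from the peak down to 0 at the last step (assumed endpoint).
    remaining = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / remaining)
```

With separate peak rates for BERT ($2 \times 10^{-5}$) and the other parameters ($1 \times 10^{-4}$), the same function would be evaluated once per parameter group.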

The emotion lexicon ANTUSD [33] and external knowledge ATOMIC-zh [27] were used. This study adopts an emotion classification scheme based on Ekman’s basic emotion model, with a null category designated to represent neutral or non-emotional content. The evaluation metrics included precision (P), recall (R), and F1-score (F1). In addition to the main emotion–cause pair extraction task, the performance of two key subtasks was also evaluated: emotion clause extraction (EE) and cause clause extraction (CE).
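Pair-level precision, recall, and F1 can be computed from sets of (emotion, cause) index pairs, as in the sketch below. It assumes exact matching of pairs and scores a single document; how counts are aggregated over the corpus is not specified here. The function name `pair_prf` and the predicted pairs in the usage note are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]  # (emotion clause index, cause clause index)


def pair_prf(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over exactly matching emotion-cause pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For the Figure 1 document, with gold pairs (c1, c2), (c1, c3), (c7, c5), (c7, c6), a hypothetical prediction that gets three of four pairs right and adds one wrong pair yields precision, recall, and F1 of 0.75 each.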

4.2. Baselines

Indep [1]: This model first extracts emotions and causes separately in the first step; the extracted emotions and causes are paired in subsequent steps, and invalid emotion–cause pairs are filtered out.

Inter-EC [1]: This model is similar to the Indep model, with the main difference in the first step: the results of emotion extraction are used to assist in improving the performance of cause extraction.

Inter-CE [1]: This model is similar to the Indep model, with the main difference in the first step: the results of cause extraction are used to assist in improving the performance of emotion extraction.

ECPE-2D [9]: This model adopts a joint modeling approach, integrating the representation, interaction, and prediction processes of emotion–cause pairs into a single model.

GANU [34]: This model leverages information at three different levels (word, clause, and document). Its architecture includes a base encoder, a multi-head attention module for processing emotional and causal clues, and a graph attention network with a cross-graph co-attention mechanism; finally, a neural classifier is integrated to implement the prediction function.

RANKCP [10]: This model uses a graph attention network to model the relationships between clauses, then ranks potential emotion–cause pairs to obtain the final results.

GAT-ECPE [35]: This model proposes a knowledge-aware graph network to facilitate interaction between different tasks. It aggregates the features of phrase pairs via an inter-sentence dependency graph and leverages the encoded relationship information to improve the performance of emotion–cause pair extraction.

MMN [36]: This model introduces a modular interaction network, which aims to capture the interactivity in relationships. It consists of a perception encoder, an interaction optimization module, and a double regularization predictor.

MGSAG [25]: This model is a multi-granularity semantic-aware graph model. It enriches clause representations at the semantic level by incorporating emotional information from external emotion lexicons.

JFTA [37]: This model integrates feature encoding and task alignment mechanisms. By synchronously processing the features of emotion pairs and cause pairs, it resolves the inconsistency in label prediction between related tasks, thereby improving the accuracy of emotion–cause pair extraction.

MEKIT [38]: This model is based on an instruction-tuned multi-source heterogeneous knowledge injection method. By integrating internal emotional knowledge and external causal knowledge, it simultaneously enhances the emotion perception and causal reasoning capabilities of LLMs.

LLM-MTLN [39]: The model enhances text representation by fine-tuning a large language model and integrating external knowledge, modeling inter-clause dependencies using a relational graph convolutional network, and achieving multi-task feature sharing and collaborative learning via a partition filtering network.

4.3. Main Results

As shown in Table 2, the KESIF model performs excellently in the emotion clause extraction (EE), cause clause extraction (CE), and emotion–cause pair extraction (ECPE), especially in emotion extraction and ECPE, where it outperforms other models. Compared to MGSAG, RANKCP, and GAT-ECPE, KESIF achieves a higher F1, demonstrating superior performance. In the ECPE task, the F1-score of KESIF reaches 77.61%, surpassing most models. Additionally, compared to GAT-ECPE and JFTA, KESIF better balances precision and recall, highlighting stronger generalization in multi-clause emotion–cause pair extraction.

In emotion extraction, KESIF achieves an F1 of 93.20%, outperforming MGSAG (92.09%) and RANKCP (90.57%). Compared to GAT-ECPE and JFTA, it better balances precision and recall, further demonstrating robustness in emotion extraction. Integrating ATOMIC commonsense knowledge and the graph attention network allows KESIF to comprehensively analyze context and semantic associations between clauses, significantly boosting performance in emotion–cause pair tasks.

As shown in Table 3, MEKIT adopts an instruction-tuned multi-source heterogeneous knowledge injection to integrate internal emotional and external causal knowledge, boosting LLMs’ emotion–cause pair extraction—but with many parameters and high inference cost. With a collaborative framework of knowledge enhancement, graph attention, and bidirectional machine reading comprehension, KESIF ensures strong performance (68.02% on the English dataset) while using fewer parameters and providing faster inference, making it preferable for resource-limited settings. KESIF also excels at handling multi-causal and implicit causal relationships, further proving its effectiveness in emotion–cause pair extraction.

RANKCP adopts a ranking-first coverage strategy, improving recall by keeping more candidate pairs, but its precision is just 71.19%. In contrast, KESIF balances precision and recall via commonsense filtering and bidirectional verification—raising precision to 82.26% with only a slight recall drop (within 3%). This trade-off better suits practical needs (e.g., public opinion sentiment analysis). In summary, KESIF’s strategy offers greater practical value.

4.4. The Impact of the Number of Emotion–Cause Pairs

Since documents with multiple pairs are harder to handle than those with only one, we further examine the impact of emotion–cause pair count on model performance. The test set is split: one subset has documents with a single pair, the other with two or more. As shown in Table 4, KESIF is compared to RANKCP. KESIF achieves the best results on both subsets, showing it performs well across document types. RANKCP’s recall exceeds KESIF’s for single-pair documents because RANKCP always extracts at least one pair, resulting in high recall but low precision. All models perform worse on documents with two or more pairs than on those with one, indicating that handling multi-pair documents remains a bottleneck for ECPE.

4.5. Ablation Study

As shown in Table 5, to further assess each component’s impact on emotion–cause pair extraction, we conducted ablation experiments using the F1 metric. These experiments analyze model performance by sequentially removing or replacing key modules: graph attention network, ATOMIC commonsense knowledge, bidirectional extraction (emotion-to-cause and cause-to-emotion), and emotion filtering.

Without ATOMIC, KESIF no longer acquires commonsense knowledge and relies only on query and clause context for emotion–cause pair extraction, causing the F1 to drop from 77.61% to 72.22%. These results show that commonsense knowledge is crucial for understanding clause semantics and capturing causal relationships.

Removing the graph attention network drops the F1-score from 77.61% to 74.77%. The model then relies solely on local information, failing to capture global causal relationships between clauses. This issue is especially evident in cross-clause causal relationship extraction, reducing the accuracy and robustness of extracted emotion–cause pairs.

Removing the emotion-driven cause or the cause-driven emotion direction lowers the F1-score from 77.61% to 75.15% and 75.99%, respectively, because with only one direction the model cannot jointly optimize the bidirectional relationship between emotion and cause. Removing the emotion lexicon used for emotion filtering reduces the F1-score from 77.61% to 77.01%; the model becomes more susceptible to noise during emotion clause extraction, leading to inaccurate matching between emotion and cause clauses and reducing overall extraction quality. The emotion filtering mechanism therefore improves the accuracy of emotion clause extraction, although its contribution to pair extraction (0.60 points) is the smallest among the ablated components.
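To make the relative contribution of each component easier to compare, the short sketch below ranks the F1 drops using the values quoted in the ablation discussion. The full-model baseline of 77.61% is the figure used in the text above. The variable names are ours.

```python
# ECPE F1 (%) values as quoted in the ablation discussion above.
FULL_MODEL_F1 = 77.61
ablated_f1 = {
    "w/o ATOMIC": 72.22,
    "w/o GAT": 74.77,
    "w/o emotion-driven cause": 75.15,
    "w/o cause-driven emotion": 75.99,
    "w/o emotion filtering": 77.01,
}

# Drop in F1 points when each component is removed, largest first.
drops = sorted(
    ((name, round(FULL_MODEL_F1 - f1, 2)) for name, f1 in ablated_f1.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, drop in drops:
    print(f"{name:<28} -{drop:.2f}")
```

Under these figures, removing ATOMIC causes the largest drop and removing emotion filtering the smallest, which is consistent with the ordering discussed in the text.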

4.6. Case Study

This paper presents a comparative case study, with results shown in Table 6, comparing predictions of the GAT-ECPE and KESIF models. The goal is to clearly demonstrate the advantages of the proposed KESIF model by directly comparing the two models’ outputs.

As shown in Table 6, the document has three emotion–cause pairs: $(c_{12}, c_{9})$, $(c_{12}, c_{10})$, and $(c_{12}, c_{11})$. The GAT-ECPE model identified two pairs correctly, $(c_{12}, c_{9})$ and $(c_{12}, c_{10})$, but missed $(c_{12}, c_{11})$. In contrast, the KESIF model identified all pairs accurately. This shows KESIF better captures causal relationships between emotions and their causes. The GAT-ECPE model’s error may result from biased selection of contextual window clauses, reducing emotion–cause pair extraction accuracy. The KESIF model is more accurate in causal and contextual reasoning, enabling successful extraction of all correct emotion–cause pairs.

To comprehensively evaluate the performance of the proposed model under complex contextual conditions, as shown in Table 7, we conduct comparative analyses with baseline methods, focusing particularly on long-distance dependencies and scenarios with multiple emotion–cause pairs.

The first case shows a long-distance dependency: emotion clause $c_{7}$ (“he has been living in fear”) is causally linked to the distant cause clause $c_{2}$ (“Chengwei Sun killed his uncle”). RANKCP’s wrong prediction $(c_{7}, c_{6})$ shows that it relies too much on local context and cannot handle long-range reasoning. KESIF solves this by using global semantic dependencies via the graph attention network (GAT) and adding commonsense knowledge from ATOMIC, correctly finding $(c_{7}, c_{2})$. The second case has intertwined pairs, including an implicit self-cause $(c_{7}, c_{7})$. RANKCP finds $(c_{7}, c_{7})$ but predicts $(c_{6}, c_{7})$ instead of the correct pair $(c_{6}, c_{5})$, showing that it struggles to tell apart adjacent clauses in multi-pair cases. KESIF’s success at finding both pairs shows that its bidirectional MRC and knowledge-augmented representations help with complex, multi-pair documents.

In summary, KESIF shows strong robustness in difficult cases, such as long-distance dependencies and documents with several emotion–cause pairs, thanks to its commonsense-augmented semantic representation, global dependency modeling, and bidirectional MRC verification. Compared to RANKCP, KESIF achieves higher recall on multi-pair documents (55.02% versus 51.46%, Table 4) and maintains a better balance between precision and recall through commonsense filtering and bidirectional cross-checking, making it more practical for real-world tasks like public opinion analysis.
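The bidirectional cross-checking idea mentioned above can be illustrated with a small sketch. This is written under our own assumptions and is not the authors' implementation: each direction is assumed to produce scored candidate pairs, a pair is kept only when both directions propose it, and the averaging rule and the `threshold` value are hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[int, int]  # (emotion clause index, cause clause index)


def cross_check(
    emotion_to_cause: Dict[Pair, float],
    cause_to_emotion: Dict[Pair, float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Dict[Pair, float]:
    """Keep pairs proposed by both directions whose mean score passes the threshold."""
    kept = {}
    for pair in emotion_to_cause.keys() & cause_to_emotion.keys():
        score = (emotion_to_cause[pair] + cause_to_emotion[pair]) / 2
        if score >= threshold:
            kept[pair] = score
    return kept


# Invented scores loosely modeled on the first case in Table 7.
e2c = {(7, 2): 0.81, (7, 6): 0.64}  # emotion-to-cause candidates
c2e = {(7, 2): 0.77}                # cause-to-emotion candidates

print(cross_check(e2c, c2e))
```

In this toy example, the locally plausible but wrong pair `(7, 6)` is discarded because only one direction proposes it. Requiring agreement in this way tends to trade some recall for precision.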

5. Conclusions

This paper presents KESIF, an emotion–cause pair extraction model that combines knowledge enhancement and semantic information. Using the ATOMIC knowledge base, the model enriches clauses with commonsense knowledge about emotions and causes, which strengthens the semantic links between clauses and improves extraction accuracy, especially for long-distance dependencies and implicit causal relationships. The model also uses a graph attention network to capture causal and knowledge associations among clauses. In addition, it uses a bidirectional machine reading comprehension (MRC) framework covering both the emotion-to-cause and cause-to-emotion directions and cross-checks the two directions to refine emotion–cause matching. Together, these components give KESIF clear advantages on tasks involving implicit causality, long-distance dependencies, and multi-clause extraction. Future work may further strengthen knowledge integration, improve computational efficiency, and apply the model to datasets in other languages to broaden its use.

Author Contributions

Conceptualization and methodology, S.L. and Y.W.; software, validation, and formal analysis, S.L. and Y.W.; writing—original draft preparation, Y.W.; writing—review and editing, S.L. and Y.W.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The code used in this manuscript is available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Appendix A.1. Sensitivity Analysis of Hyperparameter α

The hyperparameter $\alpha$ mainly controls the negative slope in the LeakyReLU activation, which affects the sparsity of attention weights. We used grid search on the development set with $\alpha \in \{0.01, 0.05, 0.1, 0.2, 0.5\}$ to find the best value. Results show $\alpha = 0.1$ gives the highest average F1-score on the validation set and keeps training stable.
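To illustrate how the negative slope can influence attention weights, the sketch below applies LeakyReLU to a few raw attention scores and normalizes them with a softmax. It assumes the standard graph attention formulation, in which raw scores pass through LeakyReLU before the softmax. The raw scores are invented, and the function names are ours.

```python
import math
from typing import List


def leaky_relu(x: float, alpha: float) -> float:
    """LeakyReLU: identity for x >= 0, slope alpha for x < 0."""
    return x if x >= 0 else alpha * x


def attention_weights(raw_scores: List[float], alpha: float) -> List[float]:
    """Softmax over LeakyReLU-activated raw attention scores."""
    activated = [leaky_relu(s, alpha) for s in raw_scores]
    max_a = max(activated)  # subtract the max for numerical stability
    exps = [math.exp(a - max_a) for a in activated]
    total = sum(exps)
    return [e / total for e in exps]


# Invented raw scores for one clause attending to three neighbours.
raw = [2.0, -1.0, -3.0]
for alpha in (0.01, 0.1, 0.5):
    weights = attention_weights(raw, alpha)
    print(alpha, [round(w, 3) for w in weights])
```

In this toy example, a larger slope leaves negative scores more negative, so the softmax concentrates more weight on the top-scoring neighbour. A very small slope flattens the negative scores toward zero and makes low-scoring neighbours harder to tell apart.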

We also conducted a sensitivity analysis of $\alpha$ on the Chinese dataset. Results in Table A1 (ECPE-F1 is the F1-score for emotion–cause pair extraction) show the model stays stable when $\alpha \in [0.05, 0.2]$, with $\alpha = 0.1$ giving the best result. If $\alpha$ is too small (like 0.01) or too large (like 0.5), performance drops, so a proper negative slope helps capture asymmetric inter-clause dependencies. On the English dataset, $\alpha = 0.1$ is also stable (F1 = 68.02%), showing this setting works for both Chinese and English.

Table A1. Experimental results of the sensitivity analysis of hyperparameter $\alpha$.

$α$	Dev ECPE-F1 (%)	Val ECPE-F1 (%)
0.01	75.32	74.89
0.05	76.81	76.12
0.1	77.61	76.95
0.2	76.93	76.40
0.5	75.04	74.58
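The selection step of the grid search can be reproduced directly from the rows of Table A1. The sketch below picks the value with the highest Dev ECPE-F1 and measures the spread inside the range described as stable. The variable names are ours, and the table values are copied as printed.

```python
# (alpha, Dev ECPE-F1 %, Val ECPE-F1 %) rows copied from Table A1.
TABLE_A1 = [
    (0.01, 75.32, 74.89),
    (0.05, 76.81, 76.12),
    (0.1, 77.61, 76.95),
    (0.2, 76.93, 76.40),
    (0.5, 75.04, 74.58),
]

# Grid-search selection: the row with the highest Dev F1.
best_alpha, best_dev, best_val = max(TABLE_A1, key=lambda row: row[1])

# Spread of Dev F1 inside the range the text calls stable, [0.05, 0.2].
stable = [dev for alpha, dev, _ in TABLE_A1 if 0.05 <= alpha <= 0.2]
stable_spread = round(max(stable) - min(stable), 2)

print(best_alpha, best_dev, stable_spread)
```

Within the stable range, the Dev F1 varies by less than one point, whereas the two extreme settings fall more than two points below the best value.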

Appendix A.2. Model Efficiency Comparison

To evaluate model efficiency and practicality, we analyze model scale and inference speed. As shown in Table A2, we compare base architectures, parameter sizes, and per-document inference times on GPU for MEKiT, GAT-ECPE, and KESIF.

For model scale, MEKiT is based on Gemma-2-9B-it and has about 9B parameters, much larger than the lightweight pre-trained models used by GAT-ECPE and KESIF. MEKiT may have more representational power, but it also needs more memory and computation.

For inference efficiency, KESIF, which has the fewest parameters, has the lowest estimated latency floor, processing each document in 60–250 ms. GAT-ECPE takes about 100–225 ms, while MEKiT, the largest model, takes 200–1000 ms; latency varies with input length and complexity. Overall, the lightweight models are more efficient and better suited to real-time applications.

In summary, this experiment compares model performance and, from a practical view of parameter count and inference time, shows the trade-off between expressiveness and efficiency in model selection. Future work can build on these results to explore lightweight methods that keep performance while cutting computational cost.
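Per-document latency figures such as those discussed above can be collected with a simple timing harness. The sketch below is a generic illustration, not the procedure used for Table A2: `dummy_predict` is a stand-in for a real model, and the function names are ours.

```python
import time
from statistics import mean
from typing import Callable, List, Sequence


def measure_latency_ms(predict: Callable[[str], object], docs: Sequence[str],
                       warmup: int = 1) -> List[float]:
    """Return per-document wall-clock latency in milliseconds."""
    for doc in docs[:warmup]:  # warm-up calls are not timed
        predict(doc)
    latencies = []
    for doc in docs:
        start = time.perf_counter()
        predict(doc)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def dummy_predict(doc: str) -> int:
    """Stand-in for a real model; cost grows with document length."""
    return sum(ord(ch) for ch in doc)


docs = ["short document", "a much longer document " * 50]
lat = measure_latency_ms(dummy_predict, docs)
print(f"min={min(lat):.3f} ms  mean={mean(lat):.3f} ms  max={max(lat):.3f} ms")
```

When timing a model on GPU, the device should be synchronized before reading the clock, since GPU kernels run asynchronously. Reporting a range rather than a single number, as Table A2 does, reflects the dependence of latency on input length.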

Table A2. Comparison of three ECPE models in terms of base model, parameter count, and inference time.

Model	Base Model	Parameter Count (Approx.) *	Inference Time per Doc (GPU est.) *
MEKiT	Gemma-2-9B-it	9 B	200 ms–1000 ms
GAT-ECPE	BERT-Base	120 M	100 ms–225 ms
KESIF	BERT-Base	115 M	60 ms–250 ms

* Parameter counts are approximate; inference times are estimated on GPU.

References

Xia, R.; Ding, Z. Emotion-cause pair extraction: A new task to emotion analysis in texts. arXiv 2019, arXiv:1906.01267.
Lee, S.Y.M.; Chen, Y.; Huang, C.R. A text-driven rule-based system for emotion cause detection. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, Los Angeles, CA, USA, 5 June 2010; pp. 45–53.
Gao, K.; Xu, H.; Wang, J. Emotion cause detection for Chinese micro-blogs based on ECOCC model. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Ho Chi Minh City, Vietnam, 19–22 May 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 3–14.
Ghazi, D.; Inkpen, D.; Szpakowicz, S. Detecting emotion stimuli in emotion-bearing sentences. In Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics, Cairo, Egypt, 14–20 April 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 152–165.
Gui, L.; Yuan, L.; Xu, R.; Liu, B.; Lu, Q.; Zhou, Y. Emotion cause detection with linguistic construction in Chinese Weibo text. In Proceedings of the CCF International Conference on Natural Language Processing and Chinese Computing, Shenzhen, China, 5–9 December 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 457–464.
Ahmad, F.; Abbasi, A.; Li, J.; Dobolyi, D.G.; Netemeyer, R.G.; Clifford, G.D.; Chen, H. A deep learning architecture for psychometric natural language processing. ACM Trans. Inf. Syst. (TOIS) 2020, 38, 1–29.
Chen, Y.; Hou, W.; Cheng, X.; Li, S. Joint learning for emotion classification and emotion cause detection. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 646–651.
Fan, C.; Yan, H.; Du, J.; Gui, L.; Bing, L.; Yang, M.; Xu, R.; Mao, R. A knowledge regularized hierarchical approach for emotion cause analysis. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 5618–5628.
Ding, Z.; Xia, R.; Yu, J. ECPE-2D: Emotion-cause pair extraction based on joint two-dimensional representation, interaction and prediction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 3161–3170.
Wei, P.; Zhao, J.; Mao, W. Effective inter-clause modeling for end-to-end emotion-cause pair extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 3171–3181.
Hua, Y.; Huang, Y.; Huang, S.; Feng, T.; Qu, L.; Bain, C.; Bassed, R.; Haf, R. Causal discovery inspired unsupervised domain adaptation for emotion-cause pair extraction. In Findings of the Association for Computational Linguistics: EMNLP 2024; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 8139–8156.
Hu, G.; Zhao, Y.; Lu, G. Unifying emotion-oriented and cause-oriented predictions for emotion-cause pair extraction. Neural Netw. 2024, 178, 106431.
Touvron, H.; Lavril, T.; Izacard, G.; Martinet, X.; Lachaux, M.; Lacroix, T.; Rozière, B.; Goyal, N.; Hambro, E.; Azhar, F.; et al. LLaMA: Open and efficient foundation language models. arXiv 2023, arXiv:2302.13971.
Qin, C.; Zhang, A.; Zhang, Z.; Chen, J.; Yasunaga, M.; Yang, D. Is ChatGPT a general-purpose natural language processing task solver? arXiv 2023, arXiv:2302.06476.
Han, R.; Peng, T.; Yang, C.; Wang, B.; Liu, L.; Wan, X. Is information extraction solved by ChatGPT? An analysis of performance, evaluation criteria, robustness and errors. arXiv 2023, arXiv:2305.14450.
Wang, Z.; Xie, Q.; Feng, Y.; Ding, Z.; Yang, Z.; Xia, R. Is ChatGPT a good sentiment analyzer? A preliminary study. arXiv 2023, arXiv:2304.04339.
Levy, O.; Seo, M.; Choi, E.; Zettlemoyer, L. Zero-shot relation extraction via reading comprehension. arXiv 2017, arXiv:1706.04115.
Li, X.; Yin, F.; Sun, Z.; Li, X.; Yuan, A.; Chai, D.; Zhou, M.; Li, J. Entity-relation extraction as multi-turn question answering. arXiv 2019, arXiv:1905.05529.
Li, F.; Peng, W.; Chen, Y.; Wang, Q.; Pan, L.; Lyu, Y.; Zhu, Y. Event extraction as multi-turn question answering. In Findings of the Association for Computational Linguistics: EMNLP 2020; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 829–838.
Liu, J.; Chen, Y.; Liu, K.; Bi, W.; Liu, X. Event extraction as machine reading comprehension. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 1641–1651.
Mao, Y.; Shen, Y.; Yu, C.; Cai, L. A joint training dual-MRC framework for aspect based sentiment analysis. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 13543–13551.
Chen, S.; Wang, Y.; Liu, J.; Wang, Y. Bidirectional machine reading comprehension for aspect sentiment triplet extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 12666–12674.
Russo, I.; Caselli, T.; Rubino, F.; Boldrini, E.; Martínez-Barco, P. Emocause: An Easy-Adaptable Approach to Emotion Cause Contexts; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2011.
Neviarouskaya, A.; Aono, M. Extracting causes of emotions from text. In Proceedings of the Sixth International Joint Conference on Natural Language Processing, Nagoya, Japan, 14–19 October 2013; pp. 932–936.
Bao, Y.; Ma, Q.; Wei, L.; Zhou, W.; Hu, S. Multi-granularity semantic aware graph model for reducing position bias in emotion-cause pair extraction. arXiv 2022, arXiv:2205.02132.
Ghosal, D.; Majumder, N.; Gelbukh, A.; Mihalcea, R.; Poria, S. COSMIC: Commonsense knowledge for emotion identification in conversations. arXiv 2020, arXiv:2010.02795.
Li, D.; Li, Y.; Zhang, J.; Li, K.; Wei, C.; Cui, J.; Wang, B. C3KG: A Chinese commonsense conversation knowledge graph. arXiv 2022, arXiv:2204.02549.
Sahu, S.K.; Christopoulou, F.; Miwa, M.; Ananiadou, S. Inter-sentence relation extraction with document-level graph convolutional neural network. arXiv 2019, arXiv:1906.04684.
Cheng, Z.; Jiang, Z.; Yin, Y.; Wang, C.; Ge, S.; Gu, Q. A consistent dual-MRC framework for emotion-cause pair extraction. ACM Trans. Inf. Syst. 2023, 41, 1–27.
Gao, Q.; Hu, J.; Xu, R.; Gui, L.; He, Y.; Wong, K.; Lu, Q. Overview of NTCIR-13 ECA Task. In Proceedings of the 13th NTCIR Conference on Evaluation of Information Access Technologies, Tokyo, Japan, 5–8 December 2017.
Singh, A.; Hingane, S.; Wani, S.; Modi, A. An end-to-end network for emotion-cause pair extraction. arXiv 2021, arXiv:2103.01544.
Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
Wang, S.-M.; Ku, L.-W. ANTUSD: A large Chinese sentiment dictionary. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portorož, Slovenia, 23–28 May 2016; pp. 2697–2702.
Chen, S.; Mao, K. A graph attention network utilizing multi-granular information for emotion-cause pair extraction. Neurocomputing 2023, 543, 126252.
Zhu, P.; Wang, B.; Tang, K.; Zhang, H.; Cui, X.; Wang, Z. A knowledge-guided graph attention network for emotion-cause pair extraction. Knowl.-Based Syst. 2024, 286, 111342.
Shang, X.; Chen, C.; Chen, Z.; Ma, Q. Modularized mutuality network for emotion-cause pair extraction. IEEE/ACM Trans. Audio Speech Lang. Process. 2022, 31, 539–549.
Li, S.; Sun, D. Joint Feature Encoding and Task Alignment Mechanism for Emotion-Cause Pair Extraction. Comput. Mater. Contin. 2025, 82, 1069–1086.
Mu, S.; Liu, Y.; Feng, S.; Yang, X.; Wang, D.; Zhang, Y. MEKiT: Multi-source Heterogeneous Knowledge Injection Method via Instruction Tuning for Emotion-Cause Pair Extraction. arXiv 2025, arXiv:2507.14887.
Liu, Z.; Wen, C.; Li, D.; Fu, Z.; Kong, X.; Liu, S. Large language model augmented multi-task learning network with inter-clause modeling for emotion-cause pair extraction. Inf. Fusion 2025, 117, 103583.

Figure 1. An illustration of the ECPE corpus.

Figure 2. Query formulation.

Figure 3. The architecture of the KESIF model.

Table 1. Statistical data of Chinese and English datasets.

Statistical Data	Chinese	English
Number of Documents	1945	2843
Number of Emotion–Cause Pairs	2167	3215
Average Clauses per Document	14.77	7.69
Documents with One Emotion–Cause Pair	1746	2537
Documents with Two Emotion–Cause Pairs	177	256
Documents with More Than Two Emotion–Cause Pairs	22	50

Table 2. Experimental results of Chinese dataset comparison.

Model	Emotion Extraction			Cause Extraction			Emotion–Cause Pair Extraction
Model	P (%)	R (%)	F1 (%)	P (%)	R (%)	F1 (%)	P (%)	R (%)	F1 (%)
Indep	83.75	80.71	82.10	69.02	56.73	62.05	68.32	50.82	58.18
Inter-EC	83.64	81.07	82.30	70.41	60.83	65.07	67.21	57.05	61.28
Inter-CE	84.94	81.22	83.00	68.09	56.34	61.51	69.02	51.35	59.01
ECPE-2D	86.27	92.21	89.10	73.36	69.34	71.23	72.92	65.44	68.89
GANU	86.34	89.72	87.91	74.10	74.64	74.33	72.95	71.02	71.89
RANKCP	91.23	89.99	90.57	74.61	77.88	76.15	71.19	76.30	73.60
GAT-ECPE	90.98	91.03	90.99	76.17	78.72	77.34	72.65	77.52	74.92
MMN	90.37	87.85	89.07	79.01	75.54	77.21	76.11	73.96	75.02
MGSAG	92.08	92.11	92.09	79.79	74.68	77.12	77.43	73.21	75.21
JFTA	90.13	88.80	89.44	78.78	77.01	77.83	76.41	75.81	76.05
LLM-MTLN	91.24	89.57	90.39	78.45	79.01	78.71	75.18	77.29	76.20
KESIF	96.68	89.91	93.20	78.94	77.96	78.45	82.26	73.45	77.61

Table 3. Experimental results of English dataset comparison.

Model	Emotion Extraction			Cause Extraction			Emotion–Cause Pair Extraction
Model	P (%)	R (%)	F1 (%)	P (%)	R (%)	F1 (%)	P (%)	R (%)	F1 (%)
Indep	71.73	68.11	69.66	61.64	51.74	56.11	49.48	40.42	44.32
Inter-EC	70.08	68.85	69.39	63.70	52.45	57.37	49.50	42.72	45.68
Inter-CE	72.38	67.45	69.80	62.53	51.25	56.18	50.27	40.72	44.83
ECPE-2D	74.35	69.68	71.89	64.91	53.53	58.55	60.49	43.84	50.73
RANKCP	61.32	66.17	63.64	54.82	57.32	56.01	44.94	48.32	46.52
MEKiT	–	–	–	–	–	–	65.04	58.31	61.49
KESIF	83.80	73.92	78.55	69.21	59.51	64.00	69.81	66.75	68.02

Table 4. Effect of the number of emotion–cause pairs in documents.

Pairs	Model	P (%)	R (%)	F1 (%)
One per doc	RANKCP	72.32	79.07	75.56
One per doc	KESIF	75.12	76.34	75.73
Two or more per doc	RANKCP	67.62	51.46	58.67
Two or more per doc	KESIF	72.69	55.02	62.78

Table 5. Ablation study on F1 score across ECPE, EE, and CE tasks.

Ablation Study	ECPE	EE	CE
Ablation Study	F1 (%)	F1 (%)	F1 (%)
KESIF	77.42	93.07	76.86
– w/o ATOMIC	72.22	89.08	73.42
– w/o GAT	74.77	92.69	75.98
– w/o Emotion-driven cause	75.15	92.64	74.81
– w/o Cause-driven emotion	75.99	92.36	76.64
– w/o Emotion filtering	77.01	92.50	76.10

Table 6. Case comparison of emotion–cause pair extraction results between GAT-ECPE and KESIF.

Document:
(c₁) To rescue the woman as soon as possible
(c₂) The commander immediately formulated a rescue plan
(c₃) The first team laid an air cushion downstairs
(c₄) And evacuated irrelevant personnel around
(c₅) Another team quickly climbed to the 6th floor
(c₆) To persuade the woman inside the building
(c₇) During the persuasion process
(c₈) The firefighter learned that
(c₉) The woman was owed wages
(c₁₀) The family was in urgent need of money
(c₁₁) The life pressure was very high
(c₁₂) She had to attempt suicide by jumping off the building out of helplessness

Golden	GAT-ECPE	KESIF
(c₁₂, c₁₀)	(c₁₂, c₁₀)	(c₁₂, c₁₀)
(c₁₂, c₉)	(c₁₂, c₉)	(c₁₂, c₉)
(c₁₂, c₁₁)	(c₁₂, c₁₀)	(c₁₂, c₁₁)

Table 7. Case comparison of emotion–cause pair extraction results between RANKCP and KESIF.

Document 1:
(c₁) 18 years ago
(c₂) Chengwei Sun killed his uncle
(c₃) and then began to flee
(c₄) He changed his name and got married
(c₅) but he could not escape the law
(c₆) He said that for 18 years
(c₇) he has been living in fear

Golden	RANKCP	KESIF
$(c_{7}, c_{2})$	$(c_{7}, c_{6})$	$(c_{7}, c_{2})$

Document 2:
(c₁) 25 years ago
(c₂) my mother went missing
(c₃) In April
(c₄) I got news from my friends
(c₅) and finally found my mother in Henan Province
(c₆) In addition to happiness
(c₇) I also worry about my mother’s difficulty in settling down

Golden	RANKCP	KESIF
$(c_{6}, c_{5})$	$(c_{6}, c_{7})$	$(c_{6}, c_{5})$
$(c_{7}, c_{7})$	$(c_{7}, c_{7})$	$(c_{7}, c_{7})$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
