Article

Progressive Prompt Generative Graph Convolutional Network for Aspect-Based Sentiment Quadruple Prediction

1 School of Intelligent Manufacturing and Information Engineering, Sichuan Technology & Business College, Chengdu 610000, China
2 School of Artificial Intelligence and Electronic Engineering, Sichuan Technology and Business University, Chengdu 611745, China
3 School of Computer and Software Engineering, Xihua University, Chengdu 610039, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(21), 4229; https://doi.org/10.3390/electronics14214229
Submission received: 26 September 2025 / Revised: 14 October 2025 / Accepted: 24 October 2025 / Published: 29 October 2025

Abstract

Aspect-based sentiment quadruple prediction has important application value in the current information age. Sentences often contain implicit expressions and multi-level semantic relationships, so accurate prediction remains a complex and challenging task for existing methods. To address these problems, this paper proposes the Progressive Prompt-Driven Generative Graph Convolutional Network for Aspect-Based Sentiment Quadruple Prediction (ProPGCN). Firstly, a progressive prompt module is proposed. The module uses progressive prompt templates to generate paradigm expressions of the corresponding orders and introduces third-order element prompt templates to associate high-order semantics in sentences, providing a bridge for modeling the final global semantics. Secondly, a graph convolutional relation-enhanced reasoning module is designed, which makes full use of contextual dependency information to enhance the recognition of implicit aspects and implicit opinions. In addition, a graph convolutional aggregation strategy is constructed, which uses graph convolutional networks to aggregate adjacent node information and correct conflicting implicit logical relationships. Finally, experimental results show that the ProPGCN model achieves state-of-the-art performance. Specifically, ProPGCN achieves overall F1 scores of 65.04% and 47.89% on the Restaurant and Laptop datasets, respectively, improvements of +0.83% and +0.61% over the previous strongest generative baseline.

1. Introduction

With the continuous advancement of global digitalization, the Internet has become an indispensable part of people's daily lives. The rapid development of e-commerce websites (such as Taobao, JD.com, eBay, and Amazon) and social media platforms has enabled users to post comments or share opinions anytime and anywhere. These comments not only reflect the personalized needs of users but also serve as an important data source for merchants, platforms, and research institutions seeking insights into user psychology and behavior. However, with the explosive growth of comment data, extracting useful information, especially implicit emotional information, has become one of the most pressing problems. Therefore, how to use sentiment analysis technology to extract the required information from such complex and obscure text data is a key question for the current academic community.
Sentiment analysis technology, such as aspect-based sentiment analysis (ABSA) [1,2], can mine implicit keywords and the emotional information they contain to help businesses better understand user needs. ABSA not only identifies the overall sentiment of a text but also classifies the sentiment of each specific aspect. Over time, the ABSA task has evolved into subtasks such as aspect term extraction (ATE) [3], opinion term extraction (OTE) [4], and aspect sentiment classification (ASC) [5]. To integrate the results of these subtasks, researchers proposed the aspect sentiment triplet extraction (ASTE) task [6]. ASTE focuses on extracting explicit aspects and opinions from text. Although some studies have tried to handle implicit aspects and implicit opinions, a unified framework for addressing these implicit cases is still lacking.
In order to solve this problem, relevant researchers have proposed the aspect sentiment quad prediction (ASQP) task to address the problem of implicit aspect and opinion extraction. The ASQP task involves multiple subtasks, including aspect identification, opinion extraction, aspect category detection, and sentiment classification. Although these subtasks are relatively simple, implicit expressions and multi-level semantic relationships often exist in sentences, making accurate prediction a complex and challenging task. In recent years, researchers have proposed a variety of methods to improve the performance of the ASQP task, which can be roughly divided into three categories. The first category is classification methods based on Bidirectional Encoder Representations from Transformers (BERT) [7]. The pre-trained language model BERT is used to capture contextual information, extract aspect words and opinion words, and then complete the prediction of the quadruple through the classifier. The second category is data enhancement methods, which expand and optimize the training data to improve the robustness of the model in the sample imbalance scenario.
The third category is methods based on the generative model T5 (Text-to-Text Transfer Transformer) [8]. By converting the ASQP task into a semantic generation problem, a generative pre-trained model predicts the sentiment quadruple in one pass, which generalizes well to complex sentences. Compared with BERT-based models, generative models show better performance, but they still have some problems: because generation depends on an autoregressive decoding process, the computational overhead at inference time is large; redundant generation may occur when processing short texts; and appropriate input and output formats must be designed for each task, which can limit generalization. Together, these three lines of work not only improve prediction accuracy but also lay the foundation for further research and application. The following sections introduce these methods in detail and discuss their advantages and disadvantages in ASQP tasks.
Current generative models consider only the semantic relationship between aspects and opinions in the target text and ignore the high-order semantics among the elements of the quadruple. As a result, they remain deficient in capturing the global semantics across quadruple elements. In addition, auxiliary sequence labeling tasks lead to a significant increase in training overhead. To address these problems, this paper proposes a progressive prompt-driven generative graph convolutional network (ProPGCN) for aspect–sentiment quadruple prediction. The main contributions are as follows:
  • ProPGCN uses progressive prompt templates to generate paradigm expressions of corresponding orders and introduces third-order element prompt templates to associate high-order semantics in sentences, providing a bridge for modeling the final global semantics.
  • A graph convolutional relational inference module (GRI) is designed. The module can make full use of the dependency information of the context to enhance the recognition of implicit aspects and implicit opinions.
  • A graph convolution aggregation module is constructed. The module uses the graph convolutional network to aggregate the information of adjacent nodes and correct the conflicting implicit logical relationships. The influence of multi-order cueing tasks on the model is adjusted by a weighted balancing loss function, and constrained decoding is used to generate the final quadruple.
The remainder of this paper is organized as follows. First, Section 2 presents related work on the techniques involved in the aspect–sentiment quadruple prediction method in this paper. Second, Section 3 presents a comprehensive overview of the general design of the methodology. Next, Section 4 describes experiments and result analyses regarding several aspects of the method in this paper. Finally, Section 5 summarizes the work.

2. Related Work

The aspect–sentiment quadruple prediction method based on BERT classification mainly uses the pre-trained BERT model to extract semantic information from the text, obtain aspect words and opinion words in the text, and predict sentiment polarity and aspect categories through the classification model. Compared with traditional sentiment analysis methods, the BERT-based model does not require the manual construction of sentiment dictionaries, nor does it rely on limited feature information. It can automatically learn rich semantic representations from large-scale corpora and process more complex texts.

2.1. BERT-Based Methods

Building on existing research, Cai et al. [9] proposed a unified framework to address implicit aspects and opinions by improving several existing frameworks, such as DP [9], JET [10], and Extract-Classify [11]. Their proposed ACOS task extracts entities and the aspect-describing opinion words from the text and judges the aspect categories and sentiment polarity. The BERT model can not only process explicit sentiment expressions but also identify implicit sentiments by encoding the contextual information of the text, thereby assigning corresponding opinion words to each aspect word and predicting the corresponding sentiment polarity and aspect category.
Yang et al. [12] introduced a suitable prompt strategy to predict the aspect category and sentiment polarity of the input text. The prompt strategy fully utilizes the semantic relationships between sentiment elements. Zhu et al. [13] designed a novel sentence-guided grid tagging scheme that can help to extract sentiment quadruplets from sentences and uses a grid that represents the overall meaning of the sentence to tag implicit elements. Li et al. [14] designed a linearization method to represent sentiment quadruplets as a sequence and use natural language tags to represent the sentiment elements of the quadruplets, which enhances the model’s learning of semantics and reduces the sequence length through unique tags, thereby improving the performance of the model.
In order to enhance the model’s understanding of the syntactic and semantic information of the input text, some researchers have enhanced the prediction aspect of the ASQP task by introducing graph neural networks (GNNs) [15]. Feng et al. [16] used an improved graph convolutional network (GCN) to extract syntactic information from sentences and designed a classifier based on a mutual assistance mechanism to classify aspect categories and sentiment polarity. At the same time, the semantic relationships between words were used to enhance the prediction of the quadruple. Li et al. [17] introduced syntactic and semantic information into the study of sentiment quadruples based on dialog, used graph convolutional networks to model syntactic dependency information, and constructed the interaction between discourses through a dual graph attention network (GAT) [18], thereby improving the accuracy of quadruple prediction.
Zhou et al. [19] proposed to divide the ASQP task into two simultaneous subtasks, using a shared encoder to perform triple extraction and aspect category detection tasks simultaneously, and then use a one-step decoding method to obtain the final quadruple extraction result. Chen et al. [20] introduced a bidirectional cross-attention mechanism to model explicit and implicit quadruple representations, enhanced the alignment of aspect words and opinion words, and introduced contrastive learning and self-attention mechanisms to capture the contextual associations of spans and finally infer the final prediction results through confidence. Zhang et al. [21] explored the role of graph attention networks in the ASQP task and proposed a prompt fine-tuning method based on opinion tree perception. By modeling emotional elements as a tree structure, the “one-to-many” dependency between elements can be accurately captured. Dynamic virtual templates and soft prompt modules are designed, and unique tags are used to identify implicit elements.
Su et al. [22] proposed a unified grid annotation scheme to represent implicit terms and designed an adaptive graph diffusion convolutional network that establishes associations between explicit and implicit sentiments using dependency trees and abstract semantic representations. They then integrated heterogeneous word pair relationships through the Triaffine mechanism to capture high-order interactions. Although BERT-based methods perform well in many sentiment analysis tasks, they still rely on large amounts of training data and can suffer from error propagation during feature transmission.

2.2. Data Augmentation-Based Methods

In recent years, data augmentation methods have been widely used in the task of aspect–sentiment quadruple prediction. Data augmentation generates new training samples manually or automatically to expand the original dataset, thereby improving the generalization ability and robustness of the model. By expanding text data, data augmentation methods can effectively alleviate the problem of insufficient data, especially in scenarios with small training sets or scarce labels. Due to the outstanding performance of data augmentation in the field of natural language processing, data augmentation technology has become a key means to improve model performance and expand datasets.
Mao et al. [23] proposed Seq2Path, which organizes the generated sentiment tuples into a tree structure in which each path corresponds to one tuple. Negative samples are constructed by appending a discriminative tag to each path: randomly replacing the sentiment tuple elements yields negative sample set D1, and beam search generates negative sample set D2. The negative samples and the original positive samples together form the augmented data. Hu et al. [24] analyzed the impact of different decoding orders of the quadruple on model performance and proposed the DLO and ILO frameworks, studying the effect of the quadruple output template order across datasets and single instances. DLO uses the pre-trained language model T5 to calculate the lowest entropy and select an appropriate template order. Zhang et al. [25] analyzed the ASQP datasets and found a data imbalance problem; through a heuristic conditional function, they adaptively augmented the data to address the imbalance of quadruple patterns and aspect categories, thereby improving the performance of the model. Wang et al. [26] proposed Q2T, a data augmentation method from quadruple labels to text. Q2T takes the quadruple labels of the ASQP training set as input, generates new training data in a sequence-to-sequence manner, and uses the AC-IDF strategy to keep the generated augmented samples balanced. The experimental results prove the effectiveness of the Q2T augmentation strategy, as the generated training data effectively improve the performance of the model. Li et al. [27] proposed a dual-sequence data augmentation method to address data scarcity in quadruple prediction tasks; by augmenting both the input and output data, it compensates for the high cost of quadruple annotation and improves the robustness of the model.
The above methods expand the diversity of the training data in different ways and have proven their effectiveness in sentiment analysis and text classification tasks. However, the effect of data augmentation mainly depends on the quality of the generated data. If the generated augmented samples are unnatural or semantically incorrect, these data will have a negative impact on model training. Therefore, when applying data augmentation, the augmentation strategy must be carefully designed to ensure that the generated data are both rich and diverse, while maintaining the semantic and structural consistency of the original data.

2.3. Generative Model-Based Methods

Compared with traditional rule-based or classification-based text analysis methods, generative models are more flexible and adaptable when processing text tasks. Their powerful text understanding and generation capabilities enable them to retain more contextual information when processing complex sentence structures and long texts, thereby improving the accuracy and generation quality of tasks. By pre-training on large-scale corpora, generative models can automatically learn the deep relationships and potential structures between texts, effectively reducing the reliance on artificial feature design.
The aspect sentiment quadruple prediction method based on the T5 generative model was first proposed by Zhang et al. [28]. They introduced two new datasets for the ASQP task and proposed the Paraphrase semantic modeling paradigm, which converts the ASQP task into a semantic generation problem. The Paraphrase paradigm outputs the sentiment quadruple of the input text in the form of natural sentences, fully mining the semantic information in the sentiment quadruple. Mao et al. [25] proposed the Seq2Path model, which organizes the generated sentiment quadruples into a tree structure in which different quadruples occupy different paths. In the reasoning stage, discriminative tokens enable constrained decoding with beam search, and pruning removes invalid tuple paths from the generated tree. Bao et al. [29] proposed a novel opinion tree generation strategy: OTG uses opinionated constrained decoding to guide model decoding, adopts a joint strategy consisting of multiple training tasks, integrates the semantic and grammatical features of the input text, and effectively utilizes the semantic structure of the sentiment analysis task. Joseph et al. [30] used supervised contrastive learning to enable the encoder–decoder to distinguish and represent the input text according to different features; this method also introduces a new structured generation format for the quadruple, which effectively reduces semantic parsing errors. Gao et al. [15] proposed E2H, a three-stage easy-to-hard information extraction framework that imitates human learning strategies: the easy stage determines the extraction order of each element for the ASQP task, the hard stage obtains more complex training samples, and the main stage trains the model, ultimately giving E2H stronger generalization ability. Wang et al. [31] simplified the quadruple prediction task into a triple extraction task by performing Cartesian product operations on the aspect category and sentiment polarity sets, and they semantically mapped implicit terms into pronoun representations, addressing the problem of missing aspect words and opinion words in the model's prediction input. Li et al. [32] proposed a two-stage framework to enhance the correlation between aspects and opinions, using a span tagging scheme to construct a machine reading comprehension task that extracts aspect–opinion pairs and generates sentiment elements by learning natural language. Xiong et al. [27] used a contrast-and-review network based on BART [33] to enhance the association between sentiment elements in quadruples through supervised contrastive learning and review learning modules. Nie et al. [9] explored the impact of non-autoregressive generative paradigms on quadruple prediction tasks: by treating aspects and opinions as latent variables, unsupervised aspect–opinion signals were used to enhance the reasoning of implicit terms, and aspect- and opinion-word-guided latent output modeling was introduced to dynamically guide quadruple output.
In E2TP [41], a two-stage prompting framework is proposed, with a step-by-step element-to-tuple prompting method that imitates the human step-by-step reasoning process and uses a diverse output paradigm design to enhance knowledge transfer from the source domain to the target domain and improve the robustness of the model. Qin et al. [35] explored the guiding role of chain-of-thought reasoning in the quadruple generation model, introduced step-by-step reasoning into the ASQP task for the first time, and used prefix hints and text masking strategies to enhance the understanding of the deep semantics of the text and reduce the possibility of overfitting on small datasets. Lai et al. [34] proposed a framework of step-by-step task enhancement and relationship learning that imitates the human divide-and-conquer reasoning method, enhancing the model's ability to capture complex relationships and its performance in implicit emotional expression and cross-domain scenarios. Bai et al. [36] constructed the first dataset designed specifically for few-shot learning and proposed a multi-template collaborative wide-view soft hint method, selecting the best template by quantifying template correlations through the Jensen–Shannon divergence. Zhu et al. [37] explored the application of diffusion models in the ASQP task and proposed a diffusion fuzzy learning strategy that simulates the noise diffusion and denoising process to reduce the distribution noise of sentiment elements.

3. Proposed Model

The proposed progressive prompt-driven generative graph convolutional network (ProPGCN) for aspect–sentiment quadruple prediction is introduced in detail below. The model mainly consists of four parts: data pre-processing, a progressive prompt module, a graph convolutional relational inference module, and quadruple generation and reasoning.

3.1. Definition

The problem description of the ASQP task is as follows. Given an input sentence $X$, the goal is to accurately predict all quadruples $Q = \{(a, c, o, s)\}$ in the sentence. Following existing methods, the sentiment elements are mapped to the corresponding semantic expressions $W_a$, $W_c$, $W_o$, and $W_s$, where $W$ denotes the mapping function. For example, the sentiment polarity "POS" is represented as "great" in the output target sequence, while the "NULL" label is represented as "it". At the same time, to distinguish the different sentiment elements in the quadruple, a corresponding prompt tag is added to each element: the prompt tags of $W_a$, $W_c$, $W_o$, and $W_s$ are [A], [C], [O], and [S], respectively. If an input text contains multiple sentiment quadruples, the symbol [SSEP] concatenates their corresponding semantic expressions to obtain the final target expression.
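As a concrete illustration of this target format, the following minimal sketch linearizes quadruples into the tagged output sequence. The helper name is ours, and the paper specifies only the "POS" to "great" and "NULL" to "it" mappings; the "NEU"/"NEG" word choices below are assumptions.

```python
# A minimal sketch of the target linearization of Section 3.1. Only the
# POS -> "great" and NULL -> "it" mappings are given in the paper; the
# NEU/NEG word choices are assumptions.
SENTIMENT_WORD = {"POS": "great", "NEU": "ok", "NEG": "bad"}

def linearize(quads):
    """Map (aspect, category, opinion, sentiment) quadruples to the tagged
    target sequence, joining multiple quadruples with [SSEP]."""
    parts = []
    for a, c, o, s in quads:
        a = "it" if a == "NULL" else a   # implicit aspect -> "it"
        o = "it" if o == "NULL" else o   # implicit opinion -> "it" (assumed symmetric)
        parts.append(f"[A] {a} [C] {c} [O] {o} [S] {SENTIMENT_WORD[s]}")
    return " [SSEP] ".join(parts)

print(linearize([("food", "food quality", "traditional", "POS")]))
# [A] food [C] food quality [O] traditional [S] great
```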

3.2. Model Architecture

The overall architecture of the ProPGCN model proposed in this work is shown in Figure 1. It mainly consists of four parts: data pre-processing, a progressive prompt module, a graph convolutional relational inference (GRI) module, and quadruple generation and reasoning. This work uses the encoder–decoder of the T5 pre-trained model as the core framework and uses the progressive prompt module to help the encoder learn dependency relationships from the local to the global level. At the same time, the introduction of the third-order element prompt template provides a bridge for the transition from low-order semantics to global semantics. In addition, an adjacency matrix is constructed according to the dependency relations of the input sentence, and a graph convolutional network (GCN) is introduced to reason about syntactic and semantic relationships and calculate the graph convolution loss. Constrained decoding is used in the reasoning process, and the final quadruple is aggregated by voting.

3.3. Data Preprocessing

In this section, we will discuss how to parse the syntactic dependency relations in the input text, obtain the corresponding syntactic dependency tree, and construct the corresponding adjacency matrix in the data pre-processing stage. The parsing of syntactic dependency relations will be introduced in detail below.
In the ASQP task, dependency trees and AMR graphs can be used to reveal subtle relationships between input text words for the pre-trained model, thereby reasoning about the deep semantics of the text. Building syntactic dependencies is particularly important when dealing with complex sentences containing implicit expressions or multiple quadruplets, because these sentences may contain multiple sentiment tuples and the boundaries between them may not be obvious. The construction of dependency trees helps to understand the implicit key terms in the text and the connections between their corresponding syntactic subjects, as well as to establish the corresponding adjacency matrix.
Firstly, the dependency relations in the pre-training data are modeled with the spaCy toolkit to obtain the syntactic dependency tree of each sentence. Secondly, the adjacency matrix is constructed from the syntactic dependency tree. The construction process is shown in Figure 1. The words in a sentence and the dependency relations between them can be represented by a directed graph $G = \{V, E\}$, where $V$ is the set of nodes and $E$ is the set of edges. Each node represents a word in the sentence, and an edge from node $i$ to node $j$ represents a dependency relation between the two words. Let the words of the sentence be $x_1, x_2, \ldots, x_n$, where $n$ is the number of words. The adjacency matrix is then established from the syntactic dependency tree, as shown in Equation (1):
A_{ij} = \begin{cases} 1, & \text{if word } x_i \text{ and word } x_j \text{ have a dependency} \\ 0, & \text{otherwise} \end{cases}
where $A \in \mathbb{R}^{n \times n}$ is the adjacency matrix and $A_{ij}$ is its entry for the word pair $(x_i, x_j)$. The processes of constructing the syntactic dependency tree and adjacency matrix are shown in Figure 2.
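As an illustration of this pre-processing step, the sketch below builds the adjacency matrix of Equation (1) from a spaCy parse. The paper only states that spaCy is used; the choice of the "en_core_web_sm" pipeline and the symmetric treatment of edges are assumptions consistent with Equation (1).

```python
# A minimal sketch of the dependency-based adjacency matrix of Equation (1),
# assuming the English "en_core_web_sm" spaCy pipeline is installed.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")

def dependency_adjacency(sentence: str) -> np.ndarray:
    """A[i, j] = 1 iff words i and j are linked by a dependency edge."""
    doc = nlp(sentence)
    n = len(doc)
    A = np.zeros((n, n), dtype=np.float32)
    for token in doc:
        if token.head.i != token.i:          # skip the root's self-reference
            A[token.i, token.head.i] = 1.0
            A[token.head.i, token.i] = 1.0   # symmetrize, matching Equation (1)
    return A

print(dependency_adjacency("The food was pretty traditional."))
```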

3.4. Progressive Prompt Enhancement Module

The design of element prompts can help the model to establish a complete semantic relationship for a sentiment quadruple. The step-by-step enhanced prompt design provides a bridge for the construction process, spanning from low-order semantics to high-order semantics. The following will introduce in detail the construction process from the low-order semantics of a single element to the high-order semantics of a complete quadruple.
(1) First-order element prompt learning
By adjusting the arrangement and combination of element tags in the prompt template, different combinations of prediction results can be generated. Specifically, in the first-order element prompt stage, different prompt tags are assigned to the elements $W_a$, $W_c$, $W_o$, and $W_s$ in the sentiment quadruple: [A], [C], [O], and [S]. These prompt tags indicate the prediction order of the sentiment elements and are appended to the input text. The predicted sentiment elements are then concatenated with their corresponding prompt tags to form the prediction target, thus realizing the construction of the first-order semantics of the sentiment quadruple. At the same time, the task prefix "First Order" is used to distinguish this task from the other auxiliary tasks. For example, given an input sentence X, the input and output structures are as follows:
Input          First Order: The food was pretty traditional. Rule: [A][C][O][S]
Output       [A] food [C] food quality [O] traditional [S] great
The prompt module provides effective cues for the generation order of sentiment tuples. Therefore, through the arrangement and combination of elements, the model is encouraged to model the semantic relationships between the elements of the sentiment tuple. To reduce the computational cost caused by the many possible arrangements, the generation scores of all 24 permutations of the quadruple sentiment elements are calculated on the training set, and the top-k performing combinations are selected. That is, all possible permutation orders $P$ are used together with the training dataset $D_t$ to prompt the T5 pre-trained model to generate sentiment tuples in the target order. For a permutation $p_i$ in $P$, the conditional generation score $p(y_{p_i} \mid x)$ is obtained from the pre-trained model, and the average generation score $S_{p_i}$ over the entire training set is used as the score of $p_i$, as shown in Equation (2):
S_{p_i} = \frac{1}{|D_t|} \sum_{(x, y) \in D_t} p(y_{p_i} \mid x)
Therefore, by sorting these scores, we can select the top-k candidate permutations for model training. For the first-order element prompting task, the loss function calculation process is as shown in Equation (3):
\mathcal{L}_F = - \sum_{i=1}^{J} \log p(y_i \mid x_i^{first})
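The sketch below illustrates this scoring and top-k selection with a Hugging Face T5 checkpoint. Here `train_set` and `build_target` (which linearizes the quadruples in a given tag order) are hypothetical placeholders, and the per-permutation score is approximated by the average negative training loss, a proxy for $p(y_{p_i} \mid x)$.

```python
# A sketch of the top-k permutation selection of Equation (2); train_set and
# build_target are hypothetical stand-ins for the dataset and the tag-ordered
# linearization of the quadruples.
import torch
from itertools import permutations
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def permutation_score(train_set, order):
    """Average log-likelihood proxy S_{p_i} of the order-specific targets."""
    total = 0.0
    with torch.no_grad():
        for x, quads in train_set:
            y = build_target(quads, order)          # e.g. order = "ACOS"
            enc = tok(x, return_tensors="pt")
            labels = tok(y, return_tensors="pt").input_ids
            total += -model(**enc, labels=labels).loss.item()  # loss is mean NLL
    return total / len(train_set)

scores = {p: permutation_score(train_set, p)
          for p in ("".join(q) for q in permutations("ACOS"))}  # all 24 orders
top_k = sorted(scores, key=scores.get, reverse=True)[:15]       # k = 15 in the paper
```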
(2) Second-order element prompt learning
To express the second-order semantic relationships between paired elements, different prompt tags are combined according to their causal relationships so that the expression of paired elements in the quadruple is more reasonable. The task prefix "Second Order" is added to the input text to predict the target sequence. To make the target sequence more consistent with human expression, the following four suitable paired element combinations are selected: [AO], [CS], [AS], [CO]. By arranging and combining these pairs, such as [AO][CS] and [CS][AO], 12 candidate arrangements of second-order element pairs can be generated. The input and output structures of the second-order element prompt are as follows:
Input       Second Order: The food was pretty traditional. Rule: [AO][CS]
Output       [AO] food is traditional [CS] food quality is great
For second-order element prompt learning, the loss function calculation process is as shown in Equation (4):
\mathcal{L}_S = - \sum_{i=1}^{K} \log p(y_i \mid x_i^{second})
(3) Third-order element prompt learning
After analyzing the semantic relationships between the first-order and second-order elements of the quadruple, it is found that, if the global semantics of the quadruple is modeled directly, information may be lost in the process of semantic information transmission. Therefore, third-order element prompt tags are introduced to capture the high-order semantic associations among three elements of the quadruple, providing a bridge from low-order semantics to global high-order semantics. The task prefix "Third Order" is used to predict the corresponding target sequence. To make the third-order sequences more consistent with the grammatical expressions and logical relationships of natural language, the following third-order element arrangement orders are defined: [ASO], [AOC]. By concatenating a three-element combination with a single element, different third-order prompt templates can be obtained, such as [ASO][C] and [AOC][S]. The input and output structures of the third-order element prompt are as follows:
Input       Third Order: The food was pretty traditional. Rule: [ASO][C]
Output       [ASO] food is great because traditional [C] food quality
For the third-order element prompt learning task, the loss function calculation process is as shown in Equation (5):
\mathcal{L}_T = - \sum_{i=1}^{M} \log p(y_i \mid x_i^{third})
(4) Global semantic module
To model the global semantics of the quadruple sentiment elements, the global semantic module represents the causal relationships between the sentiment elements in the quadruple to construct the final global semantic representation. This enables the model to capture the global semantics between the quadruple sentiment elements in a gradually enhanced manner, building on the support of the preceding prompt learning tasks, so that the semantic information between sentiment elements is captured more naturally and the overall semantic expression ability is improved. The input and output forms of global semantic modeling are as follows:
Input       Global Semantic: The food was pretty traditional. Rule: [CSAO]
Output       [CSAO] The food quality is great because food is traditional.
The loss function of the global semantic module is as shown in Equation (6):
\mathcal{L}_{GS} = - \sum_{i=1}^{N} \log p(y_i \mid x_i^{global})
In the pre-training stage, multi-task training is performed for progressive prompt learning. The top-k strategy selects the appropriate combinations of element permutation orders, and the corresponding element-order prompt templates are used to predict the quadruple. However, since the numbers of candidate permutations generated by the multi-order element prompt tasks differ, the corresponding loss functions carry different weights during training. To solve this problem, a weighted balanced loss (WBL) is introduced to adjust the influence of each order's prompt task on the model, as shown in Equation (7):
\mathcal{L}_{WB} = \frac{1}{J} \mathcal{L}_F + \frac{1}{K} \mathcal{L}_S + \frac{1}{M} \mathcal{L}_T + \frac{1}{N} \mathcal{L}_{GS}
where J, K, M, and N denote the numbers of candidate orderings contained in the first-order, second-order, third-order, and global semantic prompt learning tasks, respectively. The average over the combinations in each task is used as the balancing weight when minimizing the loss function, so that the model achieves a more balanced learning effect across semantic prompt tasks at different levels.
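Equation (7) translates directly into code. In the sketch below, the four task losses are assumed to be the summed negative log-likelihoods of Equations (3)-(6).

```python
# A minimal sketch of the weighted balanced loss of Equation (7). L_F, L_S,
# L_T, and L_GS are the summed NLL losses of Equations (3)-(6); J, K, M, N
# are the numbers of candidate template orderings per task.
def weighted_balanced_loss(L_F, L_S, L_T, L_GS, J, K, M, N):
    # Scaling each task loss by its number of orderings keeps tasks with
    # more permutations from dominating training.
    return L_F / J + L_S / K + L_T / M + L_GS / N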

3.5. Graph Convolutional Relational Inference (GRI) Module

In this section, we will introduce the graph convolutional relational inference (GRI) module in detail. GRI uses the dependency tree and adjacency matrix obtained in the pre-training phase to infer the implicit relationships between key terms in the input text.
For a sentence, the encoder extracts text features and obtains the hidden features $H_{en}$ of the text $T$; the encoder–decoder then obtains the high-order semantic information of the text through $H_{en}$. The hidden features of the text $T$ are computed as shown in Equation (8):
H_{en} = \mathrm{Encoder}(T)
In the decoding stage, the decoder of the pre-trained model generates the predicted quadruples based on the hidden features of the input text produced by the encoder. Specifically, the encoder first passes the semantic hidden features of the input text to the decoder. The decoder then combines the contextual hidden features $H_{en}$ of the input text with the previously generated target sequence $y_{tar}$ to obtain the corresponding output features, as shown in Equation (9):
H_{dec} = \mathrm{Decoder}(H_{en}, y_{tar})
As shown in Figure 2, the graph convolutional relational inference (GRI) module is mainly composed of three parts: a feature conversion layer, a GCN, and a graph convolution loss calculation layer. For the input text hidden features $H_{en}$ output by the encoder, a word vector converter obtains the word vector representation of each word in the input text. At the same time, using the dependency tree and adjacency matrix constructed in the pre-processing stage, the GCN helps the pre-trained model to further infer the implicit relationships in the text and strengthen implicit semantic associations. Finally, the quadruple predicted by the model is compared with the true label in the graph convolution loss calculation layer, and the model parameters are adjusted according to the difference.
(1) Feature conversion layer
To convert the hidden features $H_{en}$ obtained from the encoder into node features usable by the graph neural network (GNN), a feature conversion layer is designed in GRI. This layer converts the input text hidden features $H_{en}$ into the corresponding node features $H$. The node feature $H_i$ is obtained by max pooling over the encoded features of the positions associated with node $i$, as shown in Equation (10):
H_i = \max_{j : M_{ij} = 1} H_j^{enc}
where $H_j^{enc}$ represents the encoded feature of node $j$, and $M$ is a binary matrix whose entry $M_{ij}$ represents the relationship between node $i$ and node $j$; if $M_{ij} = 1$, there is a dependency relationship between the two words.
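A sketch of this max-pooling conversion under the definitions above follows; the tensor shapes are assumptions, since the paper does not specify them.

```python
# A sketch of the feature conversion of Equation (10): node feature H_i is the
# element-wise max over encoder states H_j^enc at positions j with M[i, j] = 1.
# Shapes are assumptions: H_enc is (seq_len, dim), M is (n_nodes, seq_len).
import torch

def to_node_features(H_enc: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
    mask = M.unsqueeze(-1).bool()                            # (n_nodes, seq_len, 1)
    expanded = H_enc.unsqueeze(0).expand(M.size(0), -1, -1)  # broadcast states per node
    neg_inf = torch.full_like(expanded, float("-inf"))
    masked = torch.where(mask, expanded, neg_inf)            # keep only related positions
    return masked.max(dim=1).values                          # (n_nodes, dim)
```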
(2) Graph Convolutional Neural Network (GCN)
By leveraging the characteristics of the GCN, the proposed model can better understand the deep semantics of ASQP and accurately identify implicit terms in the text. The model's understanding of implicit semantics is enhanced by combining word vectors with the adjacency matrix, as shown in Figure 3. Incorporating syntactic dependency information into the node representation $H$ not only improves the model's understanding of explicit semantics but also strengthens its ability to reason about implicit relationships in the text, improving prediction performance on implicit expressions.
In the GCN, the input word vector representation $H$ is first linearly transformed to obtain the intermediate node representation $\tilde{H}$, as shown in Equation (11):
\tilde{H} = H W
where $W$ is the weight matrix of the linear transformation, used to learn the feature mapping between nodes. The weighted feature sum of each node and its adjacent nodes is then computed through the adjacency matrix: the updated representation of node $i$ is the weighted sum of the features of its adjacent nodes, as shown in Equation (12):
H'_i = \sum_{j} A_{ij} \tilde{H}_j
At the same time, to make the node feature aggregation more balanced, the adjacency matrix is normalized. The normalized adjacency matrix $\hat{A}$ is shown in Equation (13):
\hat{A} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}
where $D$ is the degree matrix. Finally, the node features of the GCN are aggregated through a non-linear activation function, as shown in Equation (14):
H'_i = \sigma \left( \sum_{j \in N_i} \hat{A}_{ij} \tilde{H}_j \right)
where $\sigma$ is a non-linear activation function, $N_i$ is the set of neighbors of node $i$, and $H'_i$ is the updated feature representation of node $i$.
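A minimal PyTorch sketch of this propagation (Equations (11)-(14)) is given below. The self-loop addition, the ReLU choice for $\sigma$, and the layer size are assumptions, not details fixed by the paper.

```python
# A minimal sketch of the GCN propagation in Equations (11)-(14).
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        self.linear = nn.Linear(dim, dim)    # the weight matrix W of Eq. (11)

    def forward(self, H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # Normalize: A_hat = D^{-1/2} A D^{-1/2}, Eq. (13). Self-loops are
        # added here so isolated nodes keep a nonzero degree (an assumption).
        A = A + torch.eye(A.size(0), device=A.device)
        d_inv_sqrt = A.sum(dim=1).pow(-0.5)
        A_hat = d_inv_sqrt.unsqueeze(1) * A * d_inv_sqrt.unsqueeze(0)
        # Aggregate neighbors and apply the non-linearity, Eqs. (12) and (14).
        return torch.relu(A_hat @ self.linear(H))
```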
(3) Graph convolution loss function
A graph convolution loss function measures the similarity between aspect terms, opinion terms, and the hidden features $H_{dec}$ output by the decoder, as well as their relationships with other elements in the sentence. To obtain more accurate quadruple predictions, the intermediate representation $S$ is formed by combining the output $H'$ of the GCN with the decoder-side term features $D$, as shown in Equation (15):
S = \mathrm{ReLU}(\mathrm{Linear}(\mathrm{concat}(H', D)))
where $D$ is the feature representation corresponding to the terms related to the quadruple, $\mathrm{concat}(\cdot)$ is the concatenation operation, and $\mathrm{ReLU}$ is the activation function. The intermediate representation $S$ is then fed into the classifier, and the predicted probability distribution $\hat{Y}$ is obtained through a linear transformation, as shown in Equation (16):
\hat{Y} = \mathrm{softmax}(\mathrm{Linear}(S))
The $\mathrm{softmax}(\cdot)$ function converts the output into a probability distribution. The graph convolution loss is minimized by computing the difference between the predicted probability distribution $\hat{Y}$ and the true label $Y$, as shown in Equation (17):
\mathcal{L}_G = - \sum_{i=1}^{I} \sum_{c=1}^{C} Y_{ic} \log \hat{Y}_{ic}
where $I$ is the number of nodes in the GCN, $C$ is the number of node categories, $Y_{ic}$ is the true label of the $i$-th node, and $\hat{Y}_{ic}$ is the predicted probability that the $i$-th node belongs to the $c$-th category.
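The following sketch assembles Equations (15)-(17) into a small PyTorch head; the feature dimension, the number of classes, and the source of $D$ are assumptions.

```python
# A sketch of the graph convolution loss head of Equations (15)-(17).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLossHead(nn.Module):
    def __init__(self, dim: int = 768, num_classes: int = 4):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)      # Linear(...) over concat(H', D)
        self.cls = nn.Linear(dim, num_classes)   # classifier producing Y_hat

    def forward(self, H_gcn, D, labels):
        S = F.relu(self.proj(torch.cat([H_gcn, D], dim=-1)))  # Eq. (15)
        logits = self.cls(S)                                   # Eq. (16)
        # cross_entropy applies softmax internally, realizing Eq. (17)
        return F.cross_entropy(logits, labels)
```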

3.6. Inference Layer

At the inference layer, a constrained decoding strategy is first applied. Then, the predictions produced by the top-k templates selected in the progressive prompt module are collected. Finally, the final prediction is selected by majority voting: the aspects, opinions, aspect categories, and sentiment polarities across the collected quadruple predictions are counted, and the prediction with the most occurrences is selected as the final quadruple output, as shown in Equation (18):
\hat{p} = \arg\max_{p} \sum_{i=1}^{N} \mathbb{I}(\hat{p}_i = p)
where $\hat{p}_i$ represents the prediction result of the $i$-th sentiment quadruple and $p$ is the candidate quadruple value being voted on.
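A minimal sketch of this voting step follows, assuming the k candidate quadruples are available as tuples; each element position is counted independently, as the description above suggests.

```python
# A minimal sketch of the voting step of Equation (18): for each element
# (aspect, category, opinion, sentiment), the most frequent value across
# the top-k candidate quadruples is selected.
from collections import Counter

def vote(predictions):
    """predictions: list of quadruple tuples, one per prompt ordering."""
    return tuple(Counter(elems).most_common(1)[0][0]
                 for elems in zip(*predictions))

print(vote([("food", "food quality", "traditional", "POS"),
            ("food", "food quality", "traditional", "POS"),
            ("food", "food style", "traditional", "POS")]))
# ('food', 'food quality', 'traditional', 'POS')
```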

3.7. Model Training Loss

During the training of the ProPGCN model, the total loss is the sum of the weighted balanced loss of the progressive prompt tasks and the GRI module loss, as shown in Equation (19):
\mathcal{L}_{total} = \mathcal{L}_{WB} + \mathcal{L}_G

4. Experiments and Result Analysis

In this section, we mainly describe the experimental analysis of ProPGCN on four common datasets. The performance of the model is analyzed in detail from the perspectives of the experimental parameter settings, baseline model, result analysis, ablation analysis, and case analysis. The experimental results fully verify the effectiveness of the proposed model in the ASQP task.

4.1. Dataset and Parameter Settings

The following introduces several public datasets commonly used in ASQP tasks, including the Restaurant and Laptop datasets, shown in Table 1, and the Rest15 and Rest16 datasets, shown in Table 2.
The T5-base pre-trained model from the Hugging Face Transformers library is used to initialize the parameters of the model. T5-base consists of 12 encoder layers and 12 decoder layers; the hidden state dimension is set to 768, and the number of attention heads is set to 12. During training, AdamW is used as the optimizer, the initial learning rate is set to $1 \times 10^{-4}$, and the learning rate decays linearly. Training runs for 20 epochs with a batch size of 16 and a maximum token length of 128. For the top-k strategy in the quadruple prediction stage, k = 15 is used for all tasks and datasets in the main results. Each reported result is the average over five different random seeds. All experiments are conducted on an NVIDIA A40 GPU.
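These settings translate into the following configuration sketch using the Hugging Face Transformers API; the data loader and training loop are omitted, and `steps_per_epoch` is a placeholder rather than a value from the paper.

```python
# A minimal sketch of the stated training configuration; data loading and
# the training loop are omitted.
import torch
from transformers import (T5ForConditionalGeneration, T5Tokenizer,
                          get_linear_schedule_with_warmup)

EPOCHS, BATCH_SIZE, MAX_LEN, LR = 20, 16, 128, 1e-4

tokenizer = T5Tokenizer.from_pretrained("t5-base")   # 12 + 12 layers, hidden size 768
model = T5ForConditionalGeneration.from_pretrained("t5-base")

optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
steps_per_epoch = 500                                # placeholder: len(train_loader) in practice
scheduler = get_linear_schedule_with_warmup(         # linear learning-rate decay
    optimizer, num_warmup_steps=0,
    num_training_steps=EPOCHS * steps_per_epoch)
```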

4.2. Baseline Method

In this section, we introduce the BERT-based and generation-based baseline models used for comparison, together with the main contribution of each.
(1) BERT-based methods
  • Extract-Classify-ACOS [10]: A two-stage approach is designed to obtain quadruplets. The first stage extracts aspect–opinion pairs in sentences. The second stage classifies aspect categories and sentiments and determines whether there are implicit aspects or implicit opinions based on the obtained context-aware tokens and then predicts the final quadruplets.
  • One-ASQP [19]: The ASQP task is divided into two simultaneous subtasks; triple extraction and aspect category detection are performed simultaneously through a shared encoder, and a one-step decoding method then yields the final quadruple extraction result.
  • CACA [21]: By introducing a bidirectional cross-attention mechanism, explicit and implicit quadruple representations are modeled to enhance the alignment of aspect words and opinion words. At the same time, contrastive learning and self-attention mechanisms are introduced to capture the contextual association of the span; finally, the final prediction result is inferred through confidence.
  • OTPT [22]: The role of graph attention networks in the ASQP task is explored, and a prompt fine-tuning method based on opinion tree perception is proposed. By modeling emotional elements as a tree structure, the “one-to-many” dependency relationship between elements can be accurately captured. A dynamic virtual template and soft prompt module are designed, and unique tags are used to identify implicit elements.
  • UGTS [38]: A unified grid annotation scheme is proposed to represent implicit terms, and an adaptive graph diffusion convolutional network is designed to establish the association between explicit and implicit sentiments using dependency trees and abstract semantic representations. Subsequently, the Triaffine mechanism is used to integrate heterogeneous word pair relationships to capture high-order interactions.
  • MRCCLRI [39]: A novel end-to-end non-generative model is presented for ASQP, involving multi-task decomposition within a machine reading comprehension (MRC) framework.
(2) Generation-based methods
  • Paraphrase [40]: The ASQP task is regarded as a semantic generation problem. Two restaurant datasets are introduced for the ASQP task. The quadruple extraction task is converted into a paraphrase generation problem, and the Seq2Seq method is used to predict the quadruple.
  • DLO [24]: A model based on data enhancement is proposed. By pre-training the language model, the minimum entropy is calculated to select the most appropriate output template sequence, and multiple appropriate templates are combined for data enhancement.
  • UAUL [30]: For the first time, it is proposed to study the ASQP task from the perspective of “what not to generate” and to use negative samples to provide relevant information to the generative model, thereby reducing the inherent error of the generative model.
  • E2TP [41]: A two-stage prompting framework is proposed, and a step-by-step prompting method from elements to tuples is designed, which imitates the process of human step-by-step reasoning. The diversified output paradigm design is used to enhance knowledge transfer from the source domain to the target domain and improve the robustness of the model.
  • SIT [35]: The paper explores the guiding role of chain thinking in the quadruple generation model. It introduces step-by-step reasoning into the ASQP task for the first time and uses prefix hints and text masking strategies to enhance the understanding of the deep semantics of the text and reduce the possibility of overfitting on small data.
  • STAR [34]: A framework of step-by-step task enhancement and relationship learning is proposed, which imitates the human divide-and-conquer reasoning method, enhances the model’s ability to capture complex relationships, and enhances the model’s performance in implicit emotional expression and cross-domain scenarios.
  • BvSP [36]: The first dataset designed specifically for few-shot learning is constructed, and a multi-template collaborative wide-view soft hint method is proposed. The Jensen–Shannon divergence is used to quantify the template correlation and select the best template.
  • GDP [37]: The application of diffusion models in ASQP tasks is explored, and a diffusion fuzzy learning strategy is proposed to simulate the noise diffusion and denoising process to reduce the distribution noise of sentiment elements.
  • DuSR2 [42]: A straightforward and effective strategy-level approach is presented: a dual-system-based reasoning framework with intuitive reactions.

4.3. Experimental Results

In this section, the proposed ProPGCN model is compared with other strong baseline models. The specific experimental results on the four common datasets are shown in Table 3 and Table 4. As can be seen, on the Restaurant and Laptop datasets, the ProPGCN model achieves good performance in terms of both the P and F1 evaluation metrics. On the Rest15 dataset, the ProPGCN model achieves good performance in terms of both the R and F1 metrics. On the Rest16 dataset, although the ProPGCN model does not achieve the best F1 score, it outperforms most baseline models. The best value of each metric is marked in bold, and "-" indicates that the original paper does not report the corresponding value. In terms of F1 score, the ProPGCN model improves on the strongest baselines by 0.36%, 0.61%, and 0.64% on the Restaurant, Laptop, and Rest15 datasets, respectively, while trailing the best baseline by 1.36% on Rest16.
The experimental results show that the average performance of the proposed model is better than that of the previously proposed models in the study of aspect sentiment quadruple prediction. Especially when modeling the global semantics of quadruples, the proposed ProPGCN can achieve complete construction from low-level to global semantics through step-by-step enhancement. ProPGCN can obtain complete semantic information from the prompt task, reducing the possibility of errors in the process of semantic information transmission. In addition, ProPGCN is significantly better than other generation-based methods in its ability to reason about implicit contextual relations. The reason may be that the GCN’s modeling of syntactic dependency trees and adjacency matrices enhances the model’s understanding of contextual information, thereby improving the model’s ability to reason about implicit quadruple elements in sentences.

4.4. Ablation Analysis

In order to further verify the effectiveness of each module of the ProPGCN model, the effects of the second-order element prompt task module, the third-order element prompt task module, the weight balance loss module, the graph convolutional relational reasoning module, and the reasoning strategy module on the model are studied through ablation experiments. Here, “w/o Second-Order Prompt” means that the second-order element prompt task module is removed based on the ProPGCN model; “w/o Third-Order Prompt” means that the third-order element prompt task module is removed based on the ProPGCN model; “w/o WBL” means that the weight balance loss module is removed based on the ProPGCN model; “w/o GRI” means that the graph convolutional relational reasoning module is removed based on the ProPGCN model; and “w/o Inference” means that the reasoning strategy module is removed based on the ProPGCN model. The ablation experiment results for the four datasets, i.e., Restaurant, Laptop, Rest15, and Rest16, are shown in Table 5.
As can be seen from Table 5, in the absence of the second-order element prompt task module and the third-order element prompt task module, the performance of the model declined to varying degrees. In particular, in the absence of the third-order element prompt task module, the performance of ProPGCN was significantly reduced. The reason may be that the second-order element prompt task module and the third-order element prompt task module correct the erroneous semantic information that occurs in the process of semantic information transmission. The third-order element prompt task module plays a transitional role in the construction process of the model from low-order semantics to global semantics. At the same time, even without WBL, the step-by-step prompt task module can improve the performance of the model in most cases. However, when WBL was missing, the F1 scores in all datasets dropped significantly. This shows that, although the step-by-step prompt task module helps to alleviate the problem of errors in semantic information transmission, WBL effectively balances the weight of each order prompt task’s contribution to the model.
In addition, after removing GRI, the model performs poorly in reasoning about implicit relations. In order to obtain the contextual dependency information of the input text, the GRI module introduces a GCN to model the contextual dependency of the model. The contextual information can be used to reason about the implicit relations in the sentence. It should also be noted that, when the reasoning module is removed, the performance of the model also decreases. The reasoning module counts the elements of the quadruple generated by the multi-order prompt task and obtains the most recognized quadruple, which improves the accuracy of quadruple prediction.

4.5. Performance Analysis

In order to verify the effectiveness of the ProPGCN model in processing different types of quadruples, an experimental analysis was conducted on different types of sentiment quadruples. Similarly, the impact of the selection of the top-k strategies in different datasets on the model was explored, and the F1 scores in the cases of k = 10 and k = 15 were compared.
(1) Performance Effects
In Table 6, the ProPGCN model's scores on test subsets with different aspect and opinion types are shown. The experimental results show that the ProPGCN model achieves significant improvements across the four aspect and opinion type combinations. The BERT-based method Extract-Classify first extracts the aspect–opinion pairs in the text and then classifies the aspect categories and sentiment polarity, which may ignore the associations between subtasks. Although the grid annotation-based method UGTS constructs associations between explicit and implicit sentiment elements, it cannot handle the situation in which both aspects and opinions are implicitly expressed in the sentence. At the same time, ProPGCN also surpasses generative models such as GEN-SCL-NAT and Paraphrase. The experiments show that the proposed method can effectively aggregate the global semantic information between the quadruple elements and use the contextual information provided by the dependency tree to reason about implicit aspects and implicit opinions.
(2) Top-k effect
To analyze the impact of the top-k strategy on the model under different tasks, the F1 scores for k = 10 and k = 15 on the Restaurant and Laptop datasets are compared. As shown in Figure 4 and Figure 5, the performance of the model declines in the absence of WBL, and the decline grows as k increases. On both datasets, even with k = 10, the model outperforms MVP (k = 15), indicating the limitation of simply increasing the number of element orderings in quadruple prediction. The model achieves its highest F1 score when k = 15.

4.6. Case Study

The performance of the ProPGCN model and the current state-of-the-art strong baseline model UGTS on several different quadruple type cases is shown in Table 7.
In the second and third cases, when the opinion word in the sentence was implicit and the aspect word was explicit, both ProPGCN and UGTS could infer the implicit opinion corresponding to the explicit aspect. However, when the aspect word in the sentence was implicit, UGTS did not correctly predict the corresponding aspect expression, possibly because of suspected aspect word expressions in the sentence. The fourth case examines how the two models handle a sentence in which both the aspect word and the opinion word are implicitly expressed. UGTS did not predict the doubly implicit sentiment quadruple in the sentence, whereas the ProPGCN model successfully inferred it through the contextual information obtained by the GRI module. Finally, the performance of the two models on complex sentence expressions is analyzed. The rhetorical question in this case affected UGTS, causing errors in the prediction of the implicit opinion and sentiment polarity. The proposed ProPGCN correctly predicted the sentiment polarity corresponding to the explicit aspect "Ray's" through the progressive prompt enhancement module and inferred that the corresponding opinion word was implicit.

5. Conclusions

This paper introduces a progressive prompt-driven generative graph convolutional network model for aspect–sentiment quadruple prediction. In the data pre-processing stage, the syntactic dependency relations in the input text are parsed with a syntactic parsing toolkit to obtain the syntactic dependency tree and adjacency matrix corresponding to the input text. Subsequently, the global semantic information between the elements of the sentiment quadruple is aggregated through the progressive prompt enhancement module: the semantic information of single elements is constructed through the first-order element prompt task, and the second-order and third-order element semantics are then extracted step by step, providing a transition from low-order semantics to global semantics. At the same time, the top-k strategy selects appropriate element prompt templates, and the weighted balanced loss module balances the contribution of each order's prompt task. Finally, the contextual information obtained by the graph convolutional network is used to infer the implicit relationships in the sentence, and the inference module predicts the final quadruple. The experimental results show that the proposed method achieves better performance. A remaining limitation is that, when generating output text, the generative model is easily influenced by noise, which may cause repeated or even ungrammatical output.
While the graph convolutional network (GCN) module and progressive prompt learning improve performance, they also increase overall model complexity and training time: compared with the basic T5 model, our model requires additional syntactic parsing, adjacency matrix construction, and GCN forward propagation. In future work, we will explore lighter-weight graph neural network architectures (for example, with fewer GCN layers) and knowledge distillation to improve inference efficiency while maintaining performance, and we will incorporate the denoising mechanism of diffusion models into the generative model to enhance its robustness.

Author Contributions

Conceptualization, Y.F.; methodology, Y.F. and M.T.; software, Y.F. and M.T.; validation, Y.F.; formal analysis, Y.F.; data curation, Y.F.; writing—original draft preparation, Y.F. and M.T.; writing—review and editing, Y.F. and M.T.; visualization, Y.F.; funding acquisition, Y.F. and M.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Sichuan Science and Technology Program (No. 2025NSFSC2017), the National Natural Science Foundation of China (Nos. 62576295, 61902324), the Scientific Research Funds Project of the Science and Technology Department of Sichuan Province (Nos. 2019YFG0508, 2019GFW131, 25LHJJ0133), the Funds Project of Chengdu Science and Technology Bureau (No. 2017-RK00-00026-ZF), the Chinese Society for Technical and Vocational Education (CSTVE) (No. XHHWCJRH2024-02-05-01), and the Science and Technology Planning Project of Guizhou Province (No. QianKeHeJiChu-ZK[2021]YiBan319).

Data Availability Statement

The datasets generated and analyzed during the current study are available in the article.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Wang, P.; Tao, L.; Tang, M.; Wang, L.; Xu, Y.; Zhao, M. Incorporating syntax and semantics with dual graph neural networks for aspect-level sentiment analysis. Eng. Appl. Artif. Intell. 2024, 133, 108101.
  2. Zheng, Y.; Tang, M.; Yang, Z.; Hu, J.; Zhao, M. Semantic-enhanced relation modeling for fine-grained aspect-based sentiment analysis. Int. J. Mach. Learn. Cybern. 2025.
  3. Xu, Y.; Tian, J.; Tang, M.; Tao, L.; Wang, L. Document-level relation extraction with entity mentions deep attention. Comput. Speech Lang. 2024, 84, 101574.
  4. Wu, Z.; Ying, C.; Zhao, F.; Fan, Z.; Dai, X.; Xia, R. Grid Tagging Scheme for Aspect-oriented Fine-grained Opinion Extraction. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online, 16–20 November 2020; pp. 2576–2585.
  5. Zhang, W.; Deng, Y.; Li, X.; Bing, L.; Lam, W. Aspect-based sentiment analysis in question answering forums. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 4582–4591.
  6. Yang, K.; Zong, L.; Tang, M.; Hu, J.; Zheng, Y.; Chen, Y.; Zhao, M. MPGM: Multi-prompt generation model with self-supervised contrastive learning for aspect sentiment triplet extraction. Neural Netw. 2025, 192, 107894.
  7. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
  8. Xu, H.; Tang, M.; Cai, T.; Hu, J.; Zhao, M. Dual-enhanced generative model with graph attention network and contrastive learning for aspect sentiment triplet extraction. Knowl.-Based Syst. 2024, 301, 112342.
  9. Li, S.; Lin, N.; Wu, P.; Zhou, D.; Yang, A. Enhancing Aspect Sentiment Quad Prediction through Dual-Sequence Data Augmentation and Contrastive Learning. In Proceedings of the 16th Asian Conference on Machine Learning (Conference Track), Hanoi, Vietnam, 5–8 December 2024.
  10. Cai, H.; Xia, R.; Yu, J. Aspect-category-opinion-sentiment quadruple extraction with implicit aspects and opinions. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), Online, 1–6 August 2021; pp. 340–350.
  11. Wan, H.; Yang, Y.; Du, J.; Liu, Y.; Qi, K.; Pan, J.Z. Target-aspect-sentiment joint detection for aspect-based sentiment analysis. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 9122–9129.
  12. Yang, K.; Zong, L.; Tang, M.; Zheng, Y.; Chen, Y.; Zhao, M.; Jiang, Z. MPBE: Multi-perspective boundary enhancement network for aspect sentiment triplet extraction. Appl. Intell. 2025, 55, 301.
  13. Zhu, L.; Bao, Y.; Xu, M.; Li, J.; Zhu, Z.; Kong, X. Aspect sentiment quadruple extraction based on the sentence-guided grid tagging scheme. World Wide Web 2023, 26, 3303–3320.
  14. Li, S.; Zhang, Y.; Lan, Y.; Zhao, H.; Zhao, G. From implicit to explicit: A simple generative method for aspect-category-opinion-sentiment quadruple extraction. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–8.
  15. Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), Online, 5–10 July 2020; pp. 7871–7880.
  16. Li, B.; Li, Y.; Jia, S.; Ma, B.; Ding, Y.; Qi, Z.; Tan, X.; Guo, M.; Liu, S. Triple GNNs: Introducing Syntactic and Semantic Information for Conversational Aspect-Based Quadruple Sentiment Analysis. In Proceedings of the 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Tianjin, China, 8–10 May 2024; pp. 998–1003.
  17. Wang, J.; Yang, A.; Zhou, D.; Lin, N.; Wang, Z.; Huang, W.; Chen, B. Simplifying aspect-sentiment quadruple prediction with cartesian product operation. In Proceedings of the International Conference on Intelligent Computing, Zhengzhou, China, 10–13 August 2023; pp. 707–719.
  18. Tang, M.; Tang, W.; Gui, Q.; Hu, J.; Zhao, M. A vulnerability detection algorithm based on residual graph attention networks for source code imbalance (RGAN). Expert Syst. Appl. 2024, 238, 122216.
  19. Zhou, J.; Yang, H.; He, Y.; Mou, H.; Yang, J. A Unified One-Step Solution for Aspect Sentiment Quad Prediction. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 12249–12265.
  20. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph Attention Networks. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018; Volume 1050, p. 4.
  21. Chen, B.; Xu, H.; Luo, Y.; Xu, B.; Cai, R.; Hao, Z. CACA: Context-Aware Cross-Attention Network for Extractive Aspect Sentiment Quad Prediction. In Proceedings of the 31st International Conference on Computational Linguistics, Abu Dhabi, United Arab Emirates, 19–24 January 2025; pp. 9472–9484.
  22. Zhang, Z.; Yang, Z.; Li, Z. Opinion-Tree-aware Prompt Tuning for Aspect Sentiment Quadruple Prediction. arXiv 2024.
  23. Mao, Y.; Shen, Y.; Yang, J.; Zhu, X.; Cai, L. Seq2Path: Generating Sentiment Tuples as Paths of a Tree. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 2215–2225.
  24. Hu, M.; Wu, Y.; Gao, H.; Bai, Y.; Zhao, S. Improving Aspect Sentiment Quad Prediction via Template-Order Data Augmentation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 7889–7900.
  25. Zhang, W.; Zhang, X.; Cui, S.; Huang, K.; Wang, X.; Liu, T. Adaptive data augmentation for aspect sentiment quad prediction. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 11176–11180.
  26. Wang, A.; Jiang, J.; Ma, Y.; Liu, A.; Okazaki, N. Generative data augmentation for aspect sentiment quad prediction. J. Nat. Lang. Process. 2024, 31, 1523–1544.
  27. Nie, Y.; Fu, J.; Zhang, Y.; Li, C. Modeling implicit variable and latent structure for aspect-based sentiment quadruple extraction. Neurocomputing 2024, 586, 127642.
  28. Peper, J.; Wang, L. Generative Aspect-Based Sentiment Analysis with Contrastive Learning and Expressive Structure. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 6089–6095.
  29. Gao, C.; Zhang, W.; Lam, W.; Bing, L. Easy-to-Hard Learning for Information Extraction. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 11913–11930.
  30. Hu, M.; Bai, Y.; Wu, Y.; Zhang, Z.; Zhang, L.; Gao, H.; Zhao, S.; Huang, M. Uncertainty-Aware Unlikelihood Learning Improves Generative Aspect Sentiment Quad Prediction. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 13481–13494.
  31. Li, Z.; Yang, Z.; Li, X.; Li, Y. Two-stage aspect sentiment quadruple prediction based on MRC and text generation. In Proceedings of the 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Maui, HI, USA, 1–4 October 2023; pp. 2118–2125.
  32. Xiong, H.; Yan, Z.; Wu, C.; Lu, G.; Pang, S.; Xue, Y.; Cai, Q. BART-based contrastive and retrospective network for aspect-category-opinion-sentiment quadruple extraction. Int. J. Mach. Learn. Cybern. 2023, 14, 3243–3255.
  33. Wang, P.; Tao, L.; Tang, M.; Zhao, M.; Wang, L.; Xu, Y.; Tian, J.; Meng, K. A novel adaptive marker segmentation graph convolutional network for aspect-level sentiment analysis. Knowl.-Based Syst. 2023, 270, 110559.
  34. Lai, W.; Xie, H.; Xu, G.; Li, Q. STAR: Stepwise Task Augmentation and Relation Learning for Aspect Sentiment Quad Prediction. arXiv 2025, arXiv:2501.16093.
  35. Qin, Y.; Lv, S. Generative Aspect Sentiment Quad Prediction with Self-Inference Template. Appl. Sci. 2024, 14, 6017.
  36. Bai, Y.; Xie, Y.; Liu, X.; Zhao, Y.; Han, Z.; Hu, M.; Gao, H.; Cheng, R. BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), Bangkok, Thailand, 11–16 August 2024; pp. 8465–8482.
  37. Zhu, L.; Chen, X.; Guo, X.; Zhang, C.; Zhu, Z.; Zhou, Z.; Kong, X. Pinpointing Diffusion Grid Noise to Enhance Aspect Sentiment Quad Prediction. In Proceedings of the Findings of the Association for Computational Linguistics ACL 2024, Bangkok, Thailand, 11–16 August 2024; pp. 3717–3726.
  38. Su, G.; Zhang, Y.; Wang, T.; Wu, M.; Sha, Y. Unified Grid Tagging Scheme for Aspect Sentiment Quad Prediction. In Proceedings of the 31st International Conference on Computational Linguistics, Abu Dhabi, United Arab Emirates, 19–24 January 2025; pp. 3997–4010.
  39. Zhang, H.; Song, X.; Jia, X.; Yang, C.; Chen, Z.; Chen, B.; Jiang, B.; Wang, Y.; Feng, R. Query-induced multi-task decomposition and enhanced learning for aspect-based sentiment quadruple prediction. Eng. Appl. Artif. Intell. 2024, 133, 108609.
  40. Zhang, W.; Deng, Y.; Li, X.; Yuan, Y.; Bing, L.; Lam, W. Aspect Sentiment Quad Prediction as Paraphrase Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual, 7–11 November 2021; pp. 9209–9219.
  41. Mohammadkhani, M.G.; Ranjbar, N.; Momtazi, S. E2TP: Element to Tuple Prompting Improves Aspect Sentiment Tuple Prediction. arXiv 2024, arXiv:2405.06454.
  42. Bai, Z.; Sun, Y.; Min, C.; Lu, J.; Zhu, H.; Yang, L.; Lin, H. Intuition meets analytics: Reasoning implicit aspect-based sentiment quadruplets with a dual-system framework. Knowl.-Based Syst. 2025, 320, 113534.
Figure 1. The overall architecture of ProPGCN.
Figure 2. Processes of constructing syntactic dependency tree and adjacency matrix.
Figure 3. Architecture of GRI Module.
Figure 4. F1 scores for various top-k orders in the Restaurant dataset.
Figure 5. F1 scores for various top-k orders in the Laptop dataset.
Table 1. Statistics for Restaurant and Laptop datasets (S: sentences; Q: quadruples; N = 1 / N ≥ 2: sentences with one / multiple quadruples).

|         | Restaurant |      |       |       | Laptop |      |       |       |
| Dataset | S          | Q    | N = 1 | N ≥ 2 | S      | Q    | N = 1 | N ≥ 2 |
| Train   | 2934       | 4172 | 2100  | 834   | 1530   | 2484 | 920   | 610   |
| Dev     | 326        | 440  | 236   | 90    | 171    | 261  | 106   | 65    |
| Test    | 816        | 1161 | 580   | 236   | 583    | 916  | 370   | 213   |
Table 2. Statistics for Rest15 and Rest16 datasets (S: sentences; Q: quadruples; N = 1 / N ≥ 2: sentences with one / multiple quadruples).

|         | Rest15 |      |       |       | Rest16 |      |       |       |
| Dataset | S      | Q    | N = 1 | N ≥ 2 | S      | Q    | N = 1 | N ≥ 2 |
| Train   | 834    | 1354 | 499   | 335   | 1264   | 1989 | 784   | 480   |
| Dev     | 209    | 347  | 122   | 87    | 316    | 507  | 195   | 121   |
| Test    | 537    | 795  | 358   | 179   | 544    | 799  | 377   | 167   |
Table 3. Performance of different models on Restaurant and Laptop datasets. Cells marked “–” are not reported in the source.

|            |                            | Restaurant |       |       | Laptop |       |       |
| Method     | Comparison Model           | Pre        | Rec   | F1    | Pre    | Rec   | F1    |
| BERT       | Extract-Classify-ACOS [10] | 59.81      | 28.94 | 39.01 | 44.52  | 16.25 | 23.81 |
|            | One-ASQP [19]              | 62.60      | 57.21 | 59.78 | 42.83  | 40.00 | 41.37 |
|            | CACA [21]                  | 66.31      | 61.24 | 63.16 | 45.26  | 41.37 | 43.22 |
|            | UGTS [38]                  | 65.94      | 63.47 | 64.68 | 48.21  | 46.39 | 47.28 |
| MRC        | CLRI [39]                  | 61.04      | 64.30 | 62.63 | 44.93  | 45.30 | 45.11 |
| Generative | Paraphrase [40]            | 58.98      | 59.11 | 59.04 | 41.77  | 42.56 | 42.56 |
|            | DLO [24]                   | 60.02      | 59.84 | 59.93 | 43.40  | 43.80 | 43.60 |
|            | UAUL [30]                  | 61.03      | 60.55 | 60.78 | 43.78  | 43.53 | 43.65 |
|            | E2TP [41]                  | –          | –     | 61.89 | –      | –     | 45.00 |
|            | SIT [35]                   | 63.13      | 63.49 | 63.31 | 44.38  | 44.61 | 44.49 |
|            | STAR [34]                  | 61.79      | 60.37 | 61.07 | 45.53  | 44.78 | 45.15 |
|            | DuSR2 [42]                 | 61.86      | –     | 61.20 | 46.11  | –     | 45.75 |
|            | GDP [37]                   | 64.71      | 63.71 | 64.21 | 46.84  | 44.20 | 45.48 |
|            | ProPGCN                    | 66.52      | 63.15 | 65.04 | 48.65  | 45.73 | 47.89 |
Table 4. Performance of different models on Rest15 and Rest16 datasets. Cells marked “–” are not reported in the source.

|            |                            | Rest15 |       |       | Rest16 |       |       |
| Method     | Comparison Model           | Pre    | Rec   | F1    | Pre    | Rec   | F1    |
| BERT       | Extract-Classify-ACOS [10] | 35.64  | 37.25 | 36.42 | 38.40  | 50.93 | 43.77 |
|            | OTPT [22]                  | 51.01  | 52.26 | 51.63 | 59.30  | 62.02 | 60.63 |
|            | UGTS [38]                  | 52.76  | 52.43 | 52.59 | 65.72  | 64.50 | 65.10 |
| MRC        | CLRI [39]                  | 53.83  | 52.36 | 53.08 | 60.09  | 65.85 | 62.84 |
| Generative | Paraphrase [40]            | 46.16  | 47.72 | 46.93 | 56.63  | 59.30 | 57.93 |
|            | DLO [24]                   | 47.08  | 49.33 | 48.18 | 57.92  | 61.80 | 59.79 |
|            | UAUL [30]                  | 48.03  | 50.54 | 49.26 | 59.02  | 62.05 | 60.50 |
|            | E2TP [41]                  | –      | –     | 51.70 | –      | –     | 62.90 |
|            | SIT [35]                   | 47.89  | 50.13 | 48.98 | 58.98  | 61.60 | 60.26 |
|            | STAR [34]                  | 50.80  | 51.95 | 51.37 | 60.54  | 62.90 | 61.70 |
|            | BvSP [36]                  | 60.96  | 47.53 | 53.17 | 68.16  | 59.42 | 63.49 |
|            | DuSR2 [42]                 | 50.12  | –     | 50.90 | 59.71  | –     | 60.99 |
|            | GDP [37]                   | 49.20  | 50.31 | 49.75 | 61.16  | 62.08 | 61.61 |
|            | ProPGCN                    | 56.73  | 52.68 | 53.81 | 63.49  | 64.86 | 63.74 |
Table 5. Results of ablation experiments (F1) on the four datasets Restaurant, Laptop, Rest15, and Rest16.

| Model                    | Restaurant | Laptop | Rest15 | Rest16 |
| w/o Second-Order Prompt  | 63.87      | 46.64  | 54.28  | 63.91  |
| w/o Third-Order Prompt   | 63.72      | 45.95  | 53.89  | 63.80  |
| w/o WBL                  | 63.84      | 47.28  | 54.05  | 64.42  |
| w/o GRI                  | 64.51      | 47.09  | 54.12  | 63.98  |
| w/o Inference            | 64.09      | 47.42  | 53.97  | 63.83  |
| ProPGCN                  | 65.04      | 47.89  | 54.70  | 64.74  |
Table 6. F1 scores on testing subsets with different aspect and opinion types (EA/IA: explicit/implicit aspect; EO/IO: explicit/implicit opinion).

|                  | Restaurant-ACOS |       |       |       | Laptop-ACOS |       |       |       |
| Method           | EA&EO           | EA&IO | IA&EO | IA&IO | EA&EO       | EA&IO | IA&EO | IA&IO |
| Extract-Classify | 45.0            | 23.9  | 34.7  | 33.7  | 35.4        | 16.8  | 39.0  | 18.3  |
| Paraphrase       | 65.4            | 45.6  | 53.3  | 49.2  | 45.7        | 33.0  | 51.0  | 39.6  |
| GEN-SCL-NAT      | 66.5            | 45.2  | 56.5  | 50.7  | 45.8        | 34.3  | 54.0  | 39.6  |
| UGTS             | 67.81           | 47.52 | 60.13 | 52.65 | 50.71       | 36.82 | 57.29 | 43.50 |
| ProPGCN (ours)   | 68.23           | 48.60 | 60.94 | 54.06 | 51.37       | 38.04 | 58.01 | 44.79 |
Table 7. Comparison of the UGTS and ProPGCN models. In the examples, aspect words are annotated with cyan, and opinion words are annotated with orange. A ✓ marks a correctly predicted element, × an incorrect one, and (–, –, –, –) a quadruple the model failed to predict. The first example gives the simplest quadruple case, in which both the aspect word and the opinion word are explicit, so both models predict the quadruple accurately.

| # | Sentence | Ground Truth | UGTS | ProPGCN |
| 1 | The screen looked great | (screen, display general, POS, great) | (A✓, O✓, C✓, POS✓) | (A✓, O✓, C✓, POS✓) |
| 2 | Everything was fine and I went out for an hour. | (NULL, laptop general, POS, fine) | (A×, O✓, C✓, POS✓) | (A✓, O✓, C✓, POS✓) |
| 3 | The display stopped working within 2 months. | (display, display general, NEG, NULL) | (A✓, O✓, C✓, NEG✓) | (A✓, O✓, C✓, NEG✓) |
| 4 | We were seated right away, the table was private and nice. | (table, ambience general, POS, private); (table, ambience general, POS, nice); (NULL, service general, POS, NULL) | (A✓, O✓, C✓, POS✓); (A✓, O✓, C✓, POS✓); (–, –, –, –) | (A✓, O✓, C✓, POS✓); (A✓, O✓, C✓, POS✓); (A✓, O✓, C✓, POS✓) |
| 5 | Our server continued to be attentive throughout the night, but I did remain puzzled by one issue: who thinks that Ray’s is an appropriate place to take young children for dinner? | (server, service general, POS, attentive); (ray’s, restaurant miscellaneous, NEU, NULL) | (A✓, O✓, C✓, POS✓); (A✓, O×, C✓, NEG×) | (A✓, O✓, C✓, POS✓); (A✓, O✓, C✓, NEU✓) |