Article

Semantic Augmentation in Chinese Adversarial Corpus for Discourse Relation Recognition Based on Internal Semantic Elements

1 Key Laboratory of Computational Linguistics, Department of Chinese Language and Literature, Peking University, Beijing 100871, China
2 School of Information Science, Beijing Language and Culture University, Beijing 100083, China
3 Research Institute of International Chinese Language Education, Beijing Language and Culture University, Beijing 100083, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(10), 1944; https://doi.org/10.3390/electronics13101944
Submission received: 7 April 2024 / Revised: 7 May 2024 / Accepted: 13 May 2024 / Published: 15 May 2024
(This article belongs to the Special Issue Data Mining Applied in Natural Language Processing)

Abstract: This paper proposes incorporating linguistic semantic information into discourse relation recognition and constructs a Semantic Augmented Chinese Discourse Corpus (SACA) comprising 9546 adversative complex sentences. For adversative complex sentences, we propose a quadruple (P, Q, R, Q_β) of internal semantic elements, where the semantic opposition between Q and Q_β forms the basis of the adversative relationship, P denotes the premise, and R represents the adversative reason. The overall annotation approach of this corpus follows the Penn Discourse Treebank (PDTB), except for the classification of senses, where we combined insights from the Chinese Discourse Treebank (CDTB) and obtained eight sense categories for Chinese adversative complex sentences. Based on this corpus, we explore the relationship between sense classification and internal semantic elements within our newly proposed Chinese Adversative Discourse Relation Recognition (CADRR) task. Leveraging deep learning techniques, we constructed several classification models, including one that exploits internal semantic element features, demonstrating the effectiveness of these features and the applicability of our SACA corpus. By incorporating internal semantic element information, our model achieves state-of-the-art performance compared with pre-trained models.

1. Introduction

Discourse relation recognition (DRR) is a crucial element in discourse analysis [1], playing a significant role in downstream tasks such as causal reasoning [2], machine translation [3], and information extraction [4]. The Penn Discourse Treebank (PDTB) [5,6] and Rhetorical Structure Theory (RST) [7] serve as the mainstream discourse-level annotation frameworks for discourse analysis. PDTB targets localized relationships within a discourse, while RST is designed for analyzing overall discourse structure.
Despite the complexity of PDTB annotation and its high human resource costs, numerous datasets following PDTB annotation standards [5] have emerged in the past several decades. These datasets cover English [5,6], German [8], Portuguese [9], Persian [10], Chinese [11,12], and a multilingual resource corpus of six languages (English, Polish, German, Russian, European Portuguese, and Turkish) [13]. However, none of these datasets pay attention to the sentences’ linguistic semantic knowledge.
Currently, the research on DRR is focused on implicit discourse relation recognition (IDRR), and most of the methods mainly leverage human-annotated connectives to enhance their performance [14]. DRR can be regarded as a classification task, where the input consists of argument1 and argument2, and the goal is to determine the discourse relationship between the two [15]. Although most existing methods have already employed large language models to capture contextual features [16], we contend that these methods fall short of considering linguistic knowledge, especially neglecting the internal characteristics of sentences. This becomes particularly evident when dealing with sentences that involve adversative discourse relations.
Example 1 shows an adversative sentence (for brevity, we refer to an “adversative complex sentence” simply as an “adversative sentence”). Considering the context provided, we can infer that she stopped because “she realized she couldn’t waste this water when there are people in Watsonville who don’t have fresh water to drink”. However, the PDTB annotation does not address the reason. Therefore, we believe that exploring the linguistic features of sentences in texts with discourse relations will enhance discourse comprehension. Specifically, we think introducing additional linguistic features would improve the accuracy of recognizing discourse relations. Currently, numerous corpora are annotated for causal relations [17,18,19], but there is a lack of available data for adversative relations. Yet adversative relations are also essential in discourse, highlighting the necessity of developing a relevant corpus. The primary challenge is the difficulty of annotation, which demands the involvement of professionals and significant human resources. Additionally, leveraging semantic information also necessitates linguistic support.
Example 1.
Last Sunday, Ms. Johnson [finally got a chance to water her plants]Arg1, but [stopped abruptly.]Arg2 "I realized I couldn’t waste this water when there are people in Watsonville who don’t have fresh water to drink." [WSJ 0766] Sense: Comparison.Concession.Arg2-as-denier
Example 1 is an adversative sentence from the PDTB 3.0 corpus annotated with a Comparison sense. Concession is used when an expected causal relation is cancelled or denied by the situation described in one of the arguments. Arg2-as-denier is used when Arg1 raises an expectation of some consequence, while Arg2 denies it [6].
In this work, we constructed a Semantic Augmented Chinese Adversative corpus (SACA) to address the issue stated earlier. Unlike other corpora, the texts in SACA specifically focus on adversative sentences. Our analysis of the corpus concentrates on two dimensions: the overall sense relationship classifications and the internal semantic elements. For the overall sense relationship classifications, we basically follow the PDTB annotation guidelines [5,6]. However, considering the differences between Chinese and English, as well as the semantic characteristics of adversative sentences, we have also referenced the classification method of the CDTB [12,20], dividing the overall sense relation classification of adversative sentences into the eight categories presented in Section 3.1: cause, condition, direct contrast, indirect contrast, concession, expansion, progression, and coordination. Subsequently, to utilize linguistic features, we discuss the determination of the internal semantic elements and their symbolic representation. An adversative sentence is a grammatical structure commonly used to express a contrast with, or a situation opposite to, the viewpoints, plots, or situations mentioned in the preceding context [21]. Adversative sentences can make the text more vivid and specific, guiding readers to notice aspects or changes that differ from the previous content [22]. Zeng [23] suggests that adversative sentences indicate a contrast between the actual and expected results, and Yuan [24] argues that there is always a reason for the contrast leading to the opposition between the actual and expected results. At the same time, through observing numerous examples, we find that there is always a background for the entire context in adversative sentences. Therefore, in Section 3.2 we use a quadruple (P, Q, R, Q_β) to represent the internal semantic elements of an adversative sentence: premise, expected result, reason, and unexpected result.
The annotation in SACA basically follows PDTB annotation guidelines [5,6] and introduces the concept of internal semantic elements shown in Table 1. The corpus includes 9546 text segments, categorized into eight sense types, with each sentence labelled with internal semantic elements. Compared with the existing corpora mentioned in Section 2.1, our corpus has the following features: (1) It highlights eight overall sense relationships in adversative sentences. (2) It introduces internal semantic elements into adversative sentences. (3) It incorporates paragraph-level content for added contextual detail.
Based on this corpus, we can combine linguistic knowledge with deep learning techniques, enabling a more systematic and scientific exploration of DRR. We propose a task known as Chinese Adversative Discourse Relation Recognition (CADRR), which is analogous to the DRR task and aims to identify relations in Chinese adversative sentences. We have developed a model that skillfully integrates internal semantic information for CADRR tasks, demonstrating the usability and effectiveness of these semantic elements. This corpus provides a new perspective on discourse relation recognition by utilizing its internal semantic elements. By examining the connection between these elements and their sense types, we can enhance our grasp of discourse relations and language structure, potentially uncovering novel language principles.
Our contributions include:
  • We provide a relatively large-scale semantic augmented Chinese adversative discourse treebank. It follows PDTB annotations for sense types and annotates internal semantic elements of adversative complex sentences.
  • We analyze this corpus, exploring the connection between sense classification and internal semantic element classification.
  • We introduce a new task called CADRR (Chinese Adversative Discourse Relation Recognition), aimed at predicting discourse relationships for Chinese adversative sentences. We then classify the corpus using deep learning models and our proposed method utilizing internal semantic elements. Results indicate the effectiveness of our internal semantic features and the applicability of our SACA corpus.
The remainder of this paper is structured as follows: Section 2 reviews the PDTB format of discourse relation annotation corpora, discourse relation recognition models, and research related to adversative sentences. In Section 3, we discuss the details of corpus construction, including categorization, data sources, preprocessing, annotation process, and consistency checks. In Section 4, we conduct a detailed corpus analysis, especially focusing on the relevance between internal semantic elements and overall categorization. Section 5 introduces the CADRR task in Section 5.1, elaborates our classification model enhanced by internal semantic elements in Section 5.2, and presents and discusses the results in Section 5.4. Finally, in Section 6, we summarize the main conclusions of our study and discuss potential directions for future work.

2. Related Work

This section reviews the literature relevant to the study of discourse relations, particularly focusing on discourse datasets, adversative sentence analysis, and discourse relation recognition models. Section 2.1 presents the datasets following the PDTB annotation in Chinese and English. Section 2.2 delves into the linguistic analysis of adversative sentences. Section 2.3 explores advancements in discourse relation recognition, emphasizing the role of deep learning and the challenges of incorporating complex semantic elements.

2.1. PDTB Datasets in English and Chinese

The English corpora PDTB 2.0 (https://catalog.ldc.upenn.edu/LDC2008T05 (accessed on 15 January 2023)) and PDTB 3.0 (https://catalog.ldc.upenn.edu/LDC2019T05 (accessed on 15 January 2023)) are the primary benchmarks within the PDTB framework. PDTB 2.0 includes news articles from the Penn Treebank [25]. PDTB 3.0 has been improved and enhanced based on PDTB-2.0, with 53,631 tokens annotated. The Harbin Institute of Technology Chinese Discourse TreeBank (HIT-CDTB) (http://ir.hit.edu.cn/hit-cdtb/index.html (accessed on 15 January 2023)) is a small Chinese PDTB corpus comprising 525 annotated texts, analyzing and annotating Chinese corpora by distinguishing explicit and implicit discourse relations [20]. The Chinese Discourse Treebank 0.5 (CDTB-0.5) (https://catalog.ldc.upenn.edu/LDC2014T21 (accessed on 10 February 2023)) is a corpus based on PDTB annotation principles, studying the syntactic and statistical distribution of discourse connectives in the Chinese treebank and improving annotation strategies for Chinese corpora [12].

2.2. Research on Adversative Sentences

An adversative sentence is a crucial type of Chinese complex sentence. The logical foundation of adversative sentences has consistently been the focus of widespread scholarly interest. Fuyi Xing pointed out that establishing a Chinese adversative sentence depends on a contrasting relationship between things [21]. Ref. [22] stated that the characteristic of adversative clauses lies in the fact that the main clause does not follow the meaning of the subordinate clause; instead, it contrasts or opposes the meaning of the subordinate clause.
Many linguistic researchers have explored the internal semantic elements of adversative sentences. Longacre [26] analyzed adversative sentences in English and proposed that they consist of five internal semantic elements: P, Q, R, Q_β, and S.
Longacre suggested that certain internal semantic elements can be implicit and not explicitly stated in syntax. In Figure 1, the internal semantic elements R and S, bounded by dashed lines, can be omitted, and the sentence remains complete: “I went to look for it, but I couldn’t find it”. Yulin Yuan [24] stated that two sentences linked by 反而 (fǎn’ér) [on the contrary] can express an unexpected event scenario: given event P, the expected outcome Q should follow logically; however, contrary to expectations, outcome R emerges, surpassing in severity the expected opposite of Q. Based on these views, we hypothesize that recognizing adversative discourse relations in sentences must rely on internal semantic cues.

2.3. Discourse Relation Recognition Model

Early research utilized manually created features to classify discourse relations into four top-level senses [27]. The rapid progress of deep learning has led to the exploration of various methods using neural networks. Typical methods include shallow CNN [28], LSTM with multi-level attention [29], and knowledge-augmented LSTM [30]. With the power of language models, the prediction of connectives proves to be effective in relation recognition [31,32,33]. Another effective approach is incorporating hierarchical labels from PDTB into the model, which also enhances accuracy [15,34]. More recently, large pre-trained language models (PLMs) and prompting techniques have also improved the performance of DRR [16,35,36].
While the models discussed above have considered semantic exploration, such as using connectives and contextual features, they have consistently overlooked the potential existence of internal semantic elements discussed above. One reason is the lack of annotation in existing mainstream PDTB corpora, making it impossible to leverage them. Additionally, integrating internal semantic elements with existing deep learning models requires further exploration.

3. The Corpus Construction

This section outlines the development and annotation of the Semantic Augmented Chinese Discourse Corpus (SACA). Section 3.1 details our refined adversative discourse relation system, simplifying previous complex classification systems to better suit Chinese linguistic features. Section 3.2 defines the internal semantic elements essential for understanding adversative meanings. Section 3.3 describes the data collection and processing methods, ensuring a focused and high-quality corpus. Section 3.4 explains our annotation methods and processes, aimed at maintaining high consistency and accuracy. Finally, Section 3.5 presents the inter-annotator agreement, evaluating the reliability of our annotations.

3.1. Adversative Discourse Relation System

PDTB [5,6] adopts a hierarchical classification system for annotating relations. The first level comprises four major categories, each further divided into multiple subcategories. The Chinese corpus HIT-CDTB [20] inherits the hierarchical annotation method from PDTB. However, we encounter redundancy in many categories when applying this system to adversative sentences. The multi-level hierarchical system appears overly complex for the classification of adversative sentences. Consequently, after analyzing adversative sentences in PDTB3 and HIT-CDTB, we removed redundant senses and refined and merged certain senses. For example, through statistical analysis, we scarcely observed temporal senses in adversative sentences, so we eliminated this sense. As a result, we obtained the following eight senses for Chinese adversative sentences. This article uses Arg1 for the pre-located argument and Arg2 for the post-located argument, and we identify connectives with underscores.

3.1.1. Cause

There is a factual cause-and-effect relationship between the two arguments. This category exists in both PDTB and HIT-CDTB, but both annotation systems subdivide this sense further, for example by whether the cause precedes or follows the effect. For convenient annotation, we have unified sentences with causal relationships under the single category cause.
Example 2.
[没有热交换器]Arg1,即使[其它设备全部修好, 也得停产]Arg2. ([Without a heat exchanger]Arg1, [production would have to halt, even if all other equipment is fully repaired]Arg2).

3.1.2. Condition

One argument presents a condition or a hypothetical scenario, while the other details the expected outcome under this condition or scenario. Similar to the causal sense, the condition sense also appears in both PDTB and HIT-CDTB, and their annotation systems provide detailed subcategories of conditions. For ease of annotation, we have grouped sentences with conditional relationships under the single category condition.
Example 3.
[产品设计, 场所建设都要充分考虑保证安全]Arg1, 否则[不能生产]Arg2. ([Product design and facility construction must thoroughly consider ensuring safety]Arg1; otherwise, [production cannot proceed]Arg2).
The next three categories—direct contrast, indirect contrast, and concession—are refinements within the broader scope of sense comparison in PDTB. Given that most adversative sentences convey comparison, we have identified and specified these three subtypes.

3.1.3. Direct Contrast

In direct contrast, the elements within the two arguments are compared directly and stand in a contrasting relationship.
Example 4.
[欧元区仍深陷危机, 美国经济还难言真正复苏, 日本经济的前景也存在隐忧]Arg1, 而[新兴国家的经济总量却水涨船高]Arg2. ([With the Eurozone still mired in crisis, the U.S. economy finding it difficult to achieve a true recovery, and potential concerns about the future of the Japanese economy]Arg1, meanwhile [the economic output of emerging nations is on the upswing]Arg2).

3.1.4. Indirect Contrast

When two arguments lack a clear contrast but exhibit semantic shifts and transitions between topics, their sense is indirect contrast.
Example 5.
[随着科学技术的发展, 一些产业部门需要的劳动力可用技术替代]Arg1, 然而也要看到, [科学技术的发展必然导致新行业的开辟]Arg2. ([As science and technology advance, labor in certain industries can be replaced by technology]Arg1. However, it’s crucial to acknowledge that [technological progress inevitably results in the creation of new industries]Arg2).

3.1.5. Concession

Concession involves a logical contradiction where one argument introduces a constraint, condition, or opposing situation, while the other argument presents a relatively subdued or surprising result.
Example 6.
电力一直是煤炭的需求和消费大户, 也是煤炭的主市场, 即使[在煤炭市场疲软的情况下]Arg1, [电力方面对煤炭的需求也始终是稳定的]Arg2. (Electricity stands as the primary consumer and major market for coal, consistently driving demand. Even [during a downturn in the coal market]Arg1, [the demand for coal in the power sector remains steady]Arg2).

3.1.6. Expansion

The expansion sense supplements the sentence with additional content, such as elaborating on details or clarifying exceptional circumstances. In PDTB, expansion is one of the four top-level relations, with several subcategories beneath it, including exception, level of detail, and instantiation. We grouped these relationships under the single category expansion.
Example 7.
即使[发生衰退]Arg1, [也将属于西方所称的“增长性衰退”]Arg2, 持续时间不会很长. (Even [in the event of a downturn]Arg1, [it would be what the West calls a “growth recession”]Arg2, and its duration is not expected to be prolonged).

3.1.7. Progression

One argument outlines a particular issue, while the other delves deeper into its meaning, establishing a hierarchical relationship between them, known as progression. This category does not exist in PDTB but is present in HIT-CDTB, due to differences between Chinese and English. English emphasizes structure, and Chinese emphasizes semantics. Progression is very common in Chinese [20], especially in complex sentences with concessive clauses. Therefore, this category is treated separately.
Example 8.
实力和经验的欠缺注定了我们不可能走得太远.但[我们既不会因此而悲伤]Arg1, [更不会因此而放弃]Arg2. (The lack of strength and experience inevitably dictates that we may not go too far. However, [this will neither sadden us]Arg1 [nor will we give up]Arg2).

3.1.8. Coordination

Both arguments exhibit coordinate content and semantics, maintaining an equivalent status. This category does not exist in PDTB but is present in HIT-CDTB. English often employs clauses, while Chinese tends to use multiple sentences. In Chinese, it is common to encounter consecutive sentences with similar semantic roles, addressing various aspects of a problem [20]. In adversative sentences, we noticed that the two arguments linked by an adversative connective often have independent meanings, describing two aspects of a single problem. Therefore, this category is singled out separately.
Example 9.
[一方面, 他们表示愿意谈判]Arg1,但是[另一方面他们准备重新挑起战争, 并且进行威胁和施加压力]Arg2. ([On the one hand, they express willingness to negotiate]Arg1, but [on the other hand, they are prepared to reignite the war and employ threats and pressure] Arg2).

3.2. Definition of Internal Semantic Elements

Following the discussion in Section 2.2, we have decided to adopt a symbolic representation for our annotation. We select as our semantic elements those most likely to appear in Chinese adversative sentences. Through extensive observation and linguistic research on Chinese adversative sentences, the primary elements that can be determined are the expected proposition Q and the opposing proposition Q_β, since their opposition is the key condition for constituting adversative meaning. The reason contributing to the adversative meaning has been considered significant in many studies; our annotation identifies this element as R. Finally, we define the premise mentioned by Longacre as P, where P always results in Q.
These four elements are widely recognized in Chinese adversative sentences. On this basis, we define the internal elements as four labels, shown in Table 1, where P represents the premise, Q represents the expected result, R represents the adversative reason, and Q_β represents the unexpected result. The adversative meaning can be summarized as follows: the occurrence of premise P should, logically, lead to the expected result Q; however, the emergence of reason R results in the unexpected result Q_β.
Following Longacre’s [26] and Yuan’s [24] theory, these semantic elements may be present or absent. For instance, in the example sentence in Table 1, the sentence remains complete even if P does not appear. Therefore, in the annotation process, we specify that not all internal semantic elements must be labeled; annotation should be performed based on the actual circumstances.

3.3. Data Collection

3.3.1. Data Source

We have chosen the BLCU (Beijing Language and Culture University) Corpus Center (BCC) Chinese Corpus [37] as our data source. The BCC Chinese Corpus is a multilingual platform (https://bcc.blcu.edu.cn/ (accessed on 15 March 2023)) with a total of approximately 9.5 billion characters. It includes diverse corpora from multiple domains, such as newspapers (2 billion characters), literature (3 billion characters), general texts (1.9 billion characters), ancient Chinese (2 billion characters), and dialogues (600 million characters, from Weibo and film subtitles). This extensive corpus not only comprehensively captures the linguistic landscape of contemporary society but also provides advanced text-processing capabilities, such as regular expression searches, for analyzing both modern and ancient Chinese texts. The platform supports online retrieval and offers various textual content for users to select from, including categories such as ‘multi-domain’, ‘literature’, ‘newspapers’, and ‘Ancient Chinese’. To ensure data quality, we restricted our searches to the ‘newspapers’ category, specifically the ‘People’s Daily’ corpus from 1946 to 2018. Our corpus exclusively focuses on adversative sentences; therefore, we used keyword searches, selecting over a dozen keywords according to [21], such as 可是 (kěshì), 但是 (dànshì), 然而 (rán’ér), and 否则 (fǒuzé).

3.3.2. Data Processing

We selected sentences containing adversative connectives. However, sentences do not exist in isolation but are part of a larger discourse, meaning they are semantically connected to their context. To determine an appropriate text length for our research, we initially annotated 100 complete texts. Two annotators with linguistic backgrounds annotated each text and were free to extract what they considered the most relevant content based on the principles outlined in Section 3.2. From these 100 texts, the annotators extracted 82 segments, 56 of which showed consistent annotations. After careful review, we found that the sentence with the adversative connective, together with the sentences immediately before and after it (three sentences in total), covers all the meaning conveyed by the adversative sentence. Hence, we opted to extract such three-sentence segments for annotation; a minimal sketch of this selection step is given below. This streamlines annotation, since semantic elements (such as P and R) can be located in the neighboring sentences, and the longer texts support more comprehensive training. In total, we collected 9546 segments, with each text ranging from 100 to 250 words.
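The following Python sketch illustrates this segment-selection step. It is illustrative only: the connective list is abridged from the keywords mentioned in Section 3.3.1, the sentence splitter is simplistic, and the sample text is invented.

```python
import re

# Abridged connective list; the full keyword set follows Xing's inventory [21].
CONNECTIVES = ["可是", "但是", "然而", "否则", "即使", "反而"]
PATTERN = re.compile("|".join(CONNECTIVES))

def extract_segments(text):
    """Yield candidate segments: the adversative sentence plus its neighbors (three sentences)."""
    sentences = [s for s in re.split(r"(?<=[。！？])", text) if s.strip()]
    for i, sent in enumerate(sentences):
        if PATTERN.search(sent):
            yield "".join(sentences[max(0, i - 1): i + 2])

sample = "电力一直是煤炭的需求和消费大户。即使在煤炭市场疲软的情况下，电力方面对煤炭的需求也始终是稳定的。这保证了市场的平稳运行。"
for segment in extract_segments(sample):
    print(segment)
```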

3.4. Annotation Method

3.4.1. Annotation Guideline

When handling an adversative sentence, it is essential to categorize its sense and identify its internal semantic elements. When categorizing the sense, annotators first identify the key adversative connective in the sentence and mark it. Following PDTB standards [5], argument 1 and argument 2 are then annotated. Finally, annotators classify the data according to the eight simplified sense categories outlined in Section 3.1. For the annotation of internal semantic elements, we use the labels P, Q, R, and NQ to represent the logical internal semantic elements P, Q, R, and Q_β, as shown in Table 1. Initially, we provide annotators with batches of texts containing specific adversative connectives (100 texts per batch). Annotators determine, one by one, whether each sentence expresses only one adversative meaning. If so, they label the internal semantic element corresponding to each clause or phrase. For clauses without a corresponding internal semantic element, no annotation is necessary. Familiarity with the guidelines is ensured through annotation training and pilot annotations before formal annotation begins. The following outlines the annotation steps:
  • Read the sentence and assess whether it expresses only one adversative meaning.
    • Skip without annotation if there are no adversative semantics or multiple instances of adversative semantics.
    • If there is only one adversative meaning, move on to the second step.
  • Identify the connective, annotate argument 1 and argument 2, and classify the sense according to Section 3.1: cause, condition, direct contrast, indirect contrast, concession, expansion, progression, and coordination.
  • Annotate internal semantic elements for each clause with the four types available in Table 1: P, Q, R, NQ.
  • Pay attention to the following specific circumstances:
    • If there is more than one adversative connective, it is necessary to select one as the key adversative connective that is related to the adversative meaning.
    • When the annotator finds a connective that requires manual supplementation, we adopt the PDTB processing method. If necessary, we manually add connectives and mark them with curly braces for identification.
    • If the phrase does not align with any of the four semantic elements, do not mark the phrase.
  • The overall annotation format is: whether the sentence is adversative; the overall sense with (Arg1, connective, Arg2); and the internal semantic elements (P, Q, R, NQ).
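To make this format concrete, the following is a minimal illustrative record based on Example 6; the field names and the element assignment are our own reading and do not reproduce the released SACA format.

```python
# Illustrative only: field names and the element assignment are hypothetical,
# not the released SACA annotation format.
annotation_example = {
    "is_adversative": True,
    "sense": "concession",                     # one of the eight senses in Section 3.1
    "arg1": "在煤炭市场疲软的情况下",
    "connective": "即使",
    "arg2": "电力方面对煤炭的需求也始终是稳定的",
    "elements": {                              # internal semantic elements; absent ones are omitted
        "P": "电力一直是煤炭的需求和消费大户",
        "R": "在煤炭市场疲软的情况下",
        "NQ": "电力方面对煤炭的需求也始终是稳定的",
    },
}
```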

3.4.2. Annotation Process

We pre-labeled 1000 data entries, noting variables such as annotator gender, grade, profession, time of annotation, and duration spent on each annotation. Using a model to monitor the process, we found that annotation accuracy was not correlated with the annotator’s gender but did correlate with the grade, profession, time, and duration of annotation.
In our findings, we determined that annotation quality is not correlated with gender, but it does relate to age; younger students tend to have lower annotation quality compared to their senior peers. Additionally, students who have taken courses in linguistics generally produce higher-quality annotations. We also observed that annotation quality was higher during specific times of the day, particularly from 9 a.m. to 11 a.m. and from 3 p.m. to 5 p.m. Moreover, there appears to be a correlation between annotation quality and the duration of the annotation session; quality remains high during the first three hours, but there is a significant decline in the fourth hour.
To annotate internal semantic elements, we assembled a team of twenty-two annotators with linguistic expertise, divided into five groups. Each group annotated 7–12 sets of adversative word corpora, totaling nearly 2000 instances. Two annotators were assigned to each segment, and their annotations underwent dual evaluations, followed by a review by a proofreader. Any discrepancies were addressed through comments in the annotation tool, leading to collaborative discussions between the proofreader and annotators to reach a consensus on the final version.
We engaged six native Mandarin speakers to classify the overall sense of adversative sentences, with each annotator labeling around 1600 instances. An auditor then conducted a review to ensure annotation accuracy. We provided annotation training and pilot annotations for two tasks to ensure annotators were familiar with the standards. In the pilot annotations, 105 data samples were extracted from the original dataset, and each annotator followed the instructions to annotate them. After completion, a comprehensive review of the results was conducted. Inconsistencies were addressed through revisions, and annotation guidelines were clarified for ambiguous sections.

3.5. Inter-Annotator Agreement

To validate the consistency of the Chinese annotation system, this study treats semantic annotation as a classification problem. The standard Kappa values in Equation (1) are utilized to assess the annotation consistency among multiple annotators [38], and the instances with complete agreement across multiple annotators are counted.
K = (P(A) − P(E)) / (1 − P(E))     (1)
In Equation (1), K represents the Kappa coefficient, which ranges from −1 to 1. It is a statistical measure used to evaluate the consistency of agreement among annotators beyond chance. A higher Kappa value indicates greater agreement. P(A) denotes the observed agreement among annotators, reflecting the proportion of instances where all annotators agree. P(E) represents the expected agreement by chance, calculated based on the probability distributions of each annotator’s ratings. The denominator 1 − P(E) normalizes the Kappa coefficient to account for chance agreement, thereby quantifying the degree of consistency relative to what would be expected if annotations were made randomly.
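As a concrete reference, the following minimal Python sketch computes Equation (1) for two annotators under strict label matching; the multi-annotator aggregation used for Table 2 may be organized differently, and the toy labels are illustrative.

```python
from collections import Counter

def kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators under strict label matching (Equation (1))."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_a = sum(a == b for a, b in zip(labels_a, labels_b)) / n          # observed agreement P(A)
    dist_a, dist_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement P(E): probability that both annotators pick the same label at random.
    p_e = sum((dist_a[l] / n) * (dist_b[l] / n) for l in set(labels_a) | set(labels_b))
    return (p_a - p_e) / (1 - p_e)

# Toy usage with three sentences labeled with senses from Section 3.1.
print(kappa(["concession", "cause", "expansion"], ["concession", "cause", "condition"]))
```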
This paper employs a strict matching approach to calculate the Kappa values for annotation consistency among each group of annotators, which includes both overall semantic annotation and internal semantic element annotation. Only when multiple labels are fully refined to the lowest-level categories and the results are consistent is the outcome considered identical. Detailed results are presented in Table 2.
According to Table 2, the average Kappa value for the consistency of internal semantic element annotation is 63.84%, while the Kappa value for overall sense classification reaches 69.46%. The average consistency of internal semantic element annotation is lower than that of overall sense classification annotation. Due to the complex composition of corpus clauses and the susceptibility of internal semantic elements to individual subjective factors, different annotators may have varying judgments in internal semantic element annotation, resulting in lower consistency compared to overall sense classification. Meanwhile, the Kappa values for each annotation group are relatively high, indicating good consistency. Higher annotation consistency also suggests a well-coordinated SACA discourse relation system. Furthermore, instances where results differ completely among annotator groups are extremely rare, indicating that most instances can be appropriately categorized and confirming the completeness of the annotation system.
Additionally, it is worth noting that there are significant discrepancies in annotation consistency among annotator groups. In the annotation of internal semantic elements, Group 3 achieved a Kappa value of 69.58%, while Group 1 only reached 57.43%, illustrating a considerable difference in consistency between the two groups. Similarly, in the overall sense classification annotation, there are notable variations in consistency between Group 3 and Group 5. Overall, annotator Group 3 exhibits the highest consistency among the six groups, while the other groups show divergent levels of consistency for both tasks. This highlights the complexity of semantic issues, indicating that annotation results rely, to some extent, on the judgments of annotators. The intrinsic ambiguity of semantic issues, combined with the diverse textual variations in Chinese adversative sentences, makes it difficult to apply uniform judging criteria, thereby increasing the difficulty in annotation.

4. Analysis

4.1. Sense-Level Features

From Table 3, it can be seen that concession accounts for the largest proportion in the overall context at 54.01%, followed by direct contrast and indirect contrast at 13.85% and 10.87%, respectively. This indicates that concession relationships, direct contrast, and indirect contrast are very common in Chinese adversative sentences.

4.2. Internal Semantic Element Level Features

Through the semantic annotation task, SACA generated more than 60 arrangements of semantic elements. After excluding instances of individual category labeling errors and arrangements deemed irrelevant, we processed the data and obtained 19 internal semantic element arrangements (Table 4). The top five arrangements, with their respective proportions, are as follows: (P, Q_β) holds the most significant share (22.65% of the total), followed by (P, R, Q_β) (20.02%), (P, Q, R, Q_β) (11.76%), (P, Q_β, R) (10.62%), and (R, P, Q_β) (9.65%).
Through observation, we know that semantic elements are often missing in real language corpora. We found that an adversative meaning is often formed when two or more internal semantic elements appear together, and the arrangement of these elements tends to follow certain patterns. There are three arrangements in which all four semantic elements appear simultaneously: (P, Q, Q_β, R), (P, Q, R, Q_β), and (Q, P, R, Q_β). It is noted that the positions of P and Q can be interchanged, and the positions of R and Q_β can generally be interchanged as well. Additionally, P and Q often appear before R and Q_β. Q is the most easily omitted element, while P and Q_β are the most frequently occurring elements. For individual internal semantic elements, we found that P and R, as well as P and Q_β, often appear simultaneously, whereas Q and R have a relatively low co-occurrence.

4.3. Semantic–Sense Relation Analysis

We have obtained sense classifications and their internal semantic element annotations. Our objective is to investigate the potential correlations between them. We introduce the concept of mutual information (MI) for calculating correlations. MI is a measure used to quantify the degree of association or dependence between two random variables [39]. Specifically, it assesses how much knowing the value of one variable reduces uncertainty about the other. The mutual information between variables X and Y is calculated based on the probabilities of their joint occurrences compared to those expected under the assumption of independence. A higher mutual information value indicates a stronger relationship between the variables, implying that knowledge about one variable provides more information about the other.
MI(X, Y) = Σ_{x∈X} Σ_{y∈Y} P(x, y) log [ P(x, y) / (P(x) P(y)) ]     (2)
Equation (2) represents the calculation of mutual information between two discrete random variables, X and Y. In this formula, P(x, y) denotes the joint probability of X and Y taking values x and y simultaneously, while P(x) and P(y) represent the marginal probabilities of X and Y, respectively.
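For reference, a minimal Python sketch of Equation (2) is given below; it computes the overall MI between paired pattern and sense labels using the natural logarithm, whereas the per-combination values in Table 5 correspond to specific pattern-sense pairs, so the exact aggregation may differ. The toy data are illustrative.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (Equation (2)) between paired discrete variables, natural log."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Toy usage: internal-semantic-element patterns paired with overall senses.
patterns = ["(P,Qb)", "(P,R,Qb)", "(P,Qb)", "(P,Q,R,Qb)"]
senses = ["direct contrast", "concession", "indirect contrast", "expansion"]
print(mutual_information(patterns, senses))
```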
Table 5 presents the mutual information between specific combinations of internal semantic elements and overall senses. Internal semantic element patterns occurring at least 100 times were chosen for analysis. In the table, lower MI values suggest weaker correlations, while higher MI values indicate stronger associations.
The pattern (P, Q, R, Q_β) has relatively high MI values for expansion. The pattern (P, Q, Q_β, R) shows notably high MI values for cause, progression, and expansion. The pattern (P, Q_β, R) is closely associated with direct contrast, having an MI value of 5.2049. The connection between the pattern (P, Q, Q_β) and expansion is highly significant, with an MI value of 10.5881. The pattern (P, Q_β) is closely associated with both the direct contrast and the indirect contrast sense types. From the perspective of semantic classification, it is noted that condition, progression, and coordination are not highly correlated with specific internal semantic element patterns, whereas the remaining categories show a strong correlation with specific patterns.

5. Experiment

5.1. Problem Definition

In order to validate the applicability of our corpus, we first define the problem of Chinese Adversative Discourse Relation Recognition (CADRR). The CADRR prediction task is to establish a model that predicts the categories of Chinese adversative discourse relations.
Each sentence can be described by multiple internal semantic elements, including P, Q, R, and Q_β. The different combinations of these elements in a sentence can be described as different patterns, and the instantiation of an element serves as a feature. In the sequential CADRR model, the various elements and features are utilized as inputs:
CADRR = f(element_1, element_2, …, element_n)     (3)
The goal of CADRR prediction is to predict the probability, denoted as y, of a sentence S being one of the senses of a Chinese adversative sentence type, based on the aforementioned pattern features.

5.2. Method

In this section, we introduce our proposed method for CADRR, whose architecture is shown in Figure 2 and consists of four layers: input and embedding layer, internal semantic element layer, attention layer, and prediction layer.

5.2.1. Input and Embedding Layers

Unlike languages made up of letters, such as English, Dutch, and German, Chinese is a language made up of characters. The shape of the characters also carries rich semantic information. For example, Chinese characters such as “桦” (huà, a type of birch tree), “樟” (zhāng, a type of camphor tree), and “杨” (yáng, a type of poplar tree) all share the radical “木” (mù, wood or tree), indicating their semantic connection to trees. Therefore, we encode Chinese glyph information as part of the semantic representation, as discussed in [40,41,42,43,44].
For each Chinese character, we utilize three distinct font styles, represented as 24 × 24 pixel images with values ranging from 0 to 255. These styles include FangSong (I_F), characterized by slightly slanted and rounded strokes; XingKai (I_X), known for its flowing, cursive lines; and LiShu (I_L), noted for its thick, strong strokes. In Figure 2, three font renderings of a tree-related character are shown. These images are combined into a tensor I ∈ ℝ^{24×24×3} = (I_F, I_X, I_L). The tensor is then flattened and processed through an RNN layer to derive the glyph embedding, as shown in Equation (4).
e_i^(glyph) = RNN(I_i^F, I_i^X, I_i^L)     (4)
Then, we combine the pretrained word embedding e_i^(word) with the glyph-based embedding e_i^(glyph) to obtain X_i, as shown in Equation (5).
X_i = e_i^(word) + e_i^(glyph)     (5)
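A minimal PyTorch sketch of this glyph channel follows; the RNN type, hidden size, and normalization of pixel values are assumptions, since the paper does not fix them.

```python
import torch
import torch.nn as nn

class GlyphEmbedder(nn.Module):
    """Sketch of Equation (4): three 24x24 font renderings per character are flattened
    and read by an RNN as a length-3 sequence; dimensions are assumptions."""
    def __init__(self, glyph_dim=128):
        super().__init__()
        self.rnn = nn.GRU(input_size=24 * 24, hidden_size=glyph_dim, batch_first=True)

    def forward(self, glyphs):                  # glyphs: (batch, 3, 24, 24) for FangSong/XingKai/LiShu
        x = glyphs.flatten(start_dim=2)         # (batch, 3, 576): one flattened image per font style
        _, h = self.rnn(x)                      # final hidden state summarizes the three styles
        return h.squeeze(0)                     # (batch, glyph_dim) glyph embedding e^(glyph)

# Combine with a pretrained word embedding of the same size, as in Equation (5).
glyph_embedder = GlyphEmbedder(glyph_dim=128)
e_glyph = glyph_embedder(torch.rand(2, 3, 24, 24))   # pixel values normalized to [0, 1]
e_word = torch.randn(2, 128)                         # stand-in for pretrained word embeddings
x_i = e_word + e_glyph
```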

5.2.2. Internal Semantic Element Layer

Afterward, we feed these embeddings into a multi-layer BiLSTM to generate contextualized word representations. We then employ two single-layer feedforward neural networks (FNN) to generate two vectors, h_i^(start) and h_i^(end), for each word: one represents the word as the starting point of an internal semantic element, and the other as the ending point. Finally, we use one biaffine function to score each word pair and another to assign a label to each pair.
R_i = BiLSTM(X_i)
h_i^(start) = FNN^(start)(R_i)
h_i^(end) = FNN^(end)(R_i)
S_ij = (h_i^(start))^T W_1 h_j^(end) + b
where W_1 is a square matrix and b is a scalar.
We train the element recognizer by optimizing the cross-entropy loss:
L_element = −Σ_{i,j} [ y_ij log Sigmoid(S_ij) + (1 − y_ij) log(1 − Sigmoid(S_ij)) ]
where y_ij indicates whether the span (e_i, …, e_j) is indeed an element.
T_i = Softmax(W_2 S_ij + c)
where W_2 is a square matrix and c is a scalar.
L_label = −(1/N) Σ_{i=1}^{N} Σ_{k=1}^{K} l_{i,k} log(T_{i,k})
where T_i is the predicted label distribution of the ith element, L_label represents the label loss, l_{i,k} is the true label at the ith position, N is the length of the sequence, and K is the number of label categories.
Now, we have the internal semantic element loss function.
L_pattern = α L_element + (1 − α) L_label
where α is a hyperparameter between 0 and 1.
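A compact PyTorch sketch of this layer is given below; the hidden sizes, number of BiLSTM layers, and the form of the label head are assumptions, and the label head here scores labels from the scalar span score, which is one simplified reading of the equations above.

```python
import torch
import torch.nn as nn

class ElementLayer(nn.Module):
    """Sketch of Section 5.2.2: BiLSTM, start/end FNNs, and a biaffine span scorer."""
    def __init__(self, in_dim=128, hid=128, n_labels=4):        # labels: P, Q, R, NQ
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hid, num_layers=2, bidirectional=True, batch_first=True)
        self.fnn_start = nn.Linear(2 * hid, hid)
        self.fnn_end = nn.Linear(2 * hid, hid)
        self.W1 = nn.Parameter(torch.randn(hid, hid))            # biaffine weight for span scoring
        self.b = nn.Parameter(torch.zeros(1))
        self.label_head = nn.Linear(1, n_labels)                 # simplified label scorer over S_ij

    def forward(self, x):                                        # x: (batch, seq, in_dim)
        r, _ = self.bilstm(x)
        h_s = torch.relu(self.fnn_start(r))
        h_e = torch.relu(self.fnn_end(r))
        # S[b, i, j]: score that the span starting at word i and ending at word j is an element.
        S = torch.einsum("bid,de,bje->bij", h_s, self.W1, h_e) + self.b
        T = torch.softmax(self.label_head(S.unsqueeze(-1)), dim=-1)   # per-span label distribution
        return S, T

layer = ElementLayer()
S, T = layer(torch.randn(2, 10, 128))                             # S: (2, 10, 10), T: (2, 10, 10, 4)
```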

5.2.3. Attention Layer

After the above processing, we obtain N relevant source sequences X = [E_1, …, E_N] (2 ≤ N ≤ 4), where each sequence P_i = [x_{i,1}, …, x_{i,T}] (1 ≤ i ≤ N) and T is the length of the ith sequence.
The occurrence of premise P should, logically, lead to the expected result Q; however, the emergence of reason R yields the unexpected result Q_β. We therefore calculate an attention representation R_1 between R and P, and an attention representation R_2 between Q and Q_β. Suppose the premise is P = [x_{p,1}, …, x_{p,T_p}] and the reason is R = [x_{r,1}, …, x_{r,T_r}]. Then,
R_1 = Σ_{i=1}^{T_p} α_i^r x_{p,i}
where Z_{ij}^r = x_{p,i}^T W_3 x_{r,j} and α_i^r = exp(Z_{ij}^r) / Σ_{k=1}^{T_p} Σ_{l=1}^{T_r} exp(Z_{kl}^r).
Let the expected result be Q = [x_{q,1}, …, x_{q,T_q}] and the unexpected result be Q_β = [x_{qβ,1}, …, x_{qβ,T_{qβ}}]. Then,
R_2 = Σ_{i=1}^{T_q} α_i^{qβ} x_{q,i}
where Z_{ij}^{qβ} = x_{q,i}^T W_4 x_{qβ,j} and α_i^{qβ} = exp(Z_{ij}^{qβ}) / Σ_{k=1}^{T_q} Σ_{l=1}^{T_{qβ}} exp(Z_{kl}^{qβ}).
For the source sequence X = {x_1, …, x_n}, we calculate self-attention and obtain R_3 as follows.
R_3 = Σ_{i=1}^{n} α_i^s x_i
where Z_{ij}^s = x_i^T W_5 x_j and α_i^s = exp(Z_{ij}^s) / Σ_{i=1}^{n} Σ_{j=1}^{n} exp(Z_{ij}^s).
The global attention network is used to learn the weights among the feature representations from the three attention networks. The inputs are R_1, R_2, and R_3, and the output R_global is calculated as
R_global = Σ_{i=1}^{3} α_i^global R_i
where Z_{ij}^global = R_i^T W_5 R_j and α_i^global = exp(Z_{ij}^global) / Σ_{i=1}^{3} Σ_{j=1}^{3} exp(Z_{ij}^global).
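The sketch below gives one possible PyTorch realization of these attention computations; the joint softmax over all (i, j) pairs followed by summation over j is our reading of the normalization above, the dimensions are assumptions, and W_5 is reused for self-attention and global fusion exactly as in the equations.

```python
import torch

def bilinear_attention(A, B, W):
    """Pool sequence A with weights from a bilinear comparison against sequence B
    (one reading of the R_1 / R_2 / R_3 equations above)."""
    Z = torch.einsum("id,de,je->ij", A, W, B)                   # Z[i, j] = A_i^T W B_j
    alpha = torch.softmax(Z.flatten(), dim=0).view_as(Z).sum(dim=1)   # one weight per position of A
    return (alpha.unsqueeze(-1) * A).sum(dim=0)                 # weighted sum over A's positions

d = 128
W3, W4, W5 = (torch.randn(d, d) for _ in range(3))              # bilinear parameter matrices
P, R = torch.randn(5, d), torch.randn(4, d)                     # premise / reason token vectors
Q, Qb = torch.randn(6, d), torch.randn(3, d)                    # expected / unexpected result vectors
R1 = bilinear_attention(P, R, W3)                               # attention between R and P
R2 = bilinear_attention(Q, Qb, W4)                              # attention between Q and Q_beta
X = torch.cat([P, R, Q, Qb], dim=0)                             # full source sequence
R3 = bilinear_attention(X, X, W5)                               # self-attention over the sequence
# Global attention fuses R1, R2, R3 into a single representation R_global.
R_stack = torch.stack([R1, R2, R3])                             # (3, d)
Zg = torch.einsum("id,de,je->ij", R_stack, W5, R_stack)         # pairwise scores among R1..R3
alpha_g = torch.softmax(Zg.flatten(), dim=0).view_as(Zg).sum(dim=1)
R_global = (alpha_g.unsqueeze(-1) * R_stack).sum(dim=0)         # fused vector fed to prediction layer
```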

5.2.4. Prediction Layer

The prediction layer is a DNN composed of multiple fully connected layers, where the output of each layer is used as the input of the next. We take T_0 = R_global as the input of the first layer. The calculation of each fully connected layer is as follows:
T_i = FNN(T_{i−1})
After obtaining the output T_h of the last layer, we make the final prediction through the sigmoid function:
Q̂ = sigmoid(W_{h+1} T_h + b_{h+1})
We train the sense relation classifier by optimizing the cross-entropy loss:
L_relation = −Σ_{i=1}^{N} [ Q_i log Q̂_i + (1 − Q_i) log(1 − Q̂_i) ]
where Q_i indicates the true sense relation of the ith sentence and N is the number of sentences.
Our method for CADRR is trained with the following loss function using gradient descent. We use a tunable interpolation constant β ∈ (0, 1) to balance the element loss and the relation loss.
L_cadrr = β L_pattern + (1 − β) L_relation
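The following sketch shows how the prediction layer and the combined objective could be assembled; the network depth, widths, and the toy targets are assumptions.

```python
import torch
import torch.nn as nn

class PredictionLayer(nn.Module):
    """Sketch of Section 5.2.4: stacked fully connected layers over the fused
    representation, followed by a sigmoid output over the eight senses."""
    def __init__(self, in_dim=128, hidden=(64, 32), n_senses=8):
        super().__init__()
        dims = (in_dim,) + hidden
        layers = []
        for i in range(len(hidden)):
            layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU()]
        self.fnn = nn.Sequential(*layers)
        self.out = nn.Linear(hidden[-1], n_senses)

    def forward(self, r_global):                        # r_global: (batch, in_dim)
        return torch.sigmoid(self.out(self.fnn(r_global)))

# Combined objective L_cadrr: beta balances the element loss and the relation loss.
beta = 0.5
model = PredictionLayer()
q_hat = model(torch.randn(4, 128))                      # stand-in for R_global of four sentences
q_true = torch.zeros(4, 8).scatter_(1, torch.randint(0, 8, (4, 1)), 1.0)   # one-hot sense targets
l_relation = nn.functional.binary_cross_entropy(q_hat, q_true)
l_pattern = torch.tensor(0.37)                          # stand-in for the element-layer loss
l_cadrr = beta * l_pattern + (1 - beta) * l_relation
```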

5.3. Methods Based on PLMs

Pre-trained language models (PLMs) have revolutionized the field of natural language processing (NLP) by providing powerful, generalizable models that can be fine-tuned for a wide range of tasks [45,46,47]. These models leverage vast amounts of textual data, learning complex patterns and linguistic structures before being applied to specific NLP tasks, and effectively capture rich semantic and syntactic patterns from plain text [48,49]. In this section, we conduct exploratory experiments with three widely used PLMs, namely BERT [50], RoBERTa [51], and ALBERT [52], which have demonstrated significant effectiveness across various natural language inference (NLI) tasks.
  • BERT (Bidirectional Encoder Representations from Transformers): Introduced by Google in 2018, BERT is a breakthrough in leveraging bidirectional contexts in the text by training on both masked language modeling (MLM) and next-sentence prediction (NSP) tasks. Its full consideration of the surrounding context for each token distinguishes it in the field of NLP.
  • RoBERTa (Robustly Optimized BERT Approach): Developed by Facebook in 2019, RoBERTa optimizes BERT’s training regimen by removing the NSP task, using much larger datasets and training for longer periods. These changes, along with hyperparameter tuning, result in improved performance across many benchmarks.
  • ALBERT (A Lite BERT): Also from Google, ALBERT addresses the demand for efficiency by introducing parameter-sharing across layers and a factorization of the embedding matrix. This significantly reduces model size and increases training speed while maintaining comparable performance to its predecessors.
We first utilize these PLMs to obtain word embeddings e_i^(word). To incorporate internal semantic element information, we concatenate the sentence embeddings with internal semantic element features e_i^(element) directly. To facilitate comparison with the model proposed in Section 5.2, we also incorporate glyph information e_i^(glyph) following Equation (4).
X_i = e_i^(word) + e_i^(element) + e_i^(glyph)
This concatenated vector X_i is then fed into an FNN layer to predict the overall label y. This method leverages the rich contextual representations derived from PLMs, enhancing predictive capability by integrating specific semantic details and glyph information directly into the learning process.
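A minimal sketch of this PLM-based variant is given below; the checkpoint name, the use of the [CLS] vector as the sentence embedding, and the feature dimensions are assumptions, and the element and glyph features are stand-ins.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Sketch of Section 5.3: a PLM encoder whose sentence representation is concatenated
# with element and glyph features before a feed-forward classifier over the eight senses.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")   # assumed checkpoint
encoder = AutoModel.from_pretrained("bert-base-chinese")
classifier = nn.Linear(768 + 32 + 32, 8)                         # word + element + glyph dims

sentence = "即使在煤炭市场疲软的情况下, 电力方面对煤炭的需求也始终是稳定的."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    e_word = encoder(**inputs).last_hidden_state[:, 0]           # [CLS] vector, shape (1, 768)
e_element = torch.randn(1, 32)                                   # stand-in internal-element features
e_glyph = torch.randn(1, 32)                                     # stand-in glyph features (Equation (4))
logits = classifier(torch.cat([e_word, e_element, e_glyph], dim=-1))
```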

5.4. Results

To evaluate the computability of SACA and the effectiveness of our proposed method, we conducted preliminary experiments on the CADRR task. Our experiments use 7636 training instances and 1910 testing instances. For the sake of fairness, we set all hyperparameters to 0.5. Using a carefully curated dataset tailored for these experiments, we evaluate each method based on its accuracy in recovering the original connectives. To ensure the robustness and reliability of our findings, we conduct five runs for each method and report the average accuracy across these iterations. The results are shown in the following tables.
The PLMs-based methods, as described in Section 5.3, simply concatenate internal semantic elements. Our approach described in Section 5.2 focuses on using these internal semantic elements, employing biaffine and attention mechanisms to make effective use of them.
From Table 6, it is observed that our model achieves the highest accuracy and F1 performance when utilizing internal semantic elements as features. In particular, it reaches an accuracy of 67.53%, which demonstrates the effectiveness of our semantic features for the classification task. Meanwhile, the impact of data imbalance is evident, as all models exhibit relatively poor recall.
Additionally, all of the above methods utilize glyph information. To verify the effectiveness of this glyph information, we conducted an ablation experiment, the results of which are shown in Table 7.
From Table 7, it can be seen that without glyph encoding, all four models showed varying decreases in accuracy and F1 value. Although our model experienced a greater decline compared to the pre-trained language models, we still attained the best results. In future work, we will conduct more in-depth research with glyph encoding.
As this corpus is a new dataset, no previous research has been conducted on it. Our task, known as CADRR, is an initiative we have proposed based on the DRR task. To validate the reasonableness of our task and the applicability of our corpus, we compared it with the currently popular IDRR task, which is also based on DRR.
The F1 value of the IDRR task has improved from over 40% in earlier years [29,53,54] to over 60% in recent years [16,55,56]. The initial results of the IDRR task, which are similar to the F1 value achieved with our corpus, indicate the inherent difficulty of recognizing discourse relations. This similarity suggests the reasonableness of our corpus. However, the increase to an accuracy of over 60% in the IDRR task indicates that our model has potential for improvement.

6. Conclusions and Outlooks

In this study, we constructed a Semantic Augmented Chinese Adversative Corpus (SACA), aligning with PDTB standards for sense classification and incorporating information about internal semantic elements based on linguistic theories. Our experiments reveal a significant correlation between internal semantic elements and their corresponding sense classifications. We also defined the task of Chinese Adversative Discourse Relation Recognition (CADRR). Our proposed deep-learning-based method for CADRR demonstrates the effectiveness of internal semantic features and the applicability of our corpus. The scientific contribution of this research lies in enhancing the accuracy of discourse relation recognition through a systematic analysis of internal semantic elements, offering new perspectives and tools for the in-depth study and application of discourse relations in related downstream natural language processing tasks.
From a practical standpoint, although this research primarily focuses on Chinese data, the methods and theoretical frameworks are broadly applicable and can be extended to other languages and cultural contexts, thereby supporting the recognition and analysis of complex discourse relationships worldwide. The limitations of our study include the scope and diversity of the corpus, which currently focuses mainly on specific types of adversative relations. Additionally, although our method performs well on the current dataset, its generalizability and performance in different linguistic contexts require further validation. In the future, our goal is to broaden our corpus by incorporating a wider variety of complex Chinese sentences. We also aim to delve deeper into how sense classification and internal semantic elements are interconnected. This exploration will help improve both the precision and the broad applicability of recognizing discourse relations, thereby benefiting related downstream tasks in natural language processing.

Author Contributions

Methodology, Z.H.; Data curation, R.Y.; Writing—original draft, Z.H. and R.Y.; Writing—review & editing, Y.F. and X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

China Postdoctoral Science Foundation: No. 2022M72078.

Data Availability Statement

The data presented in this study are available in this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Marcu, D.; Echihabi, A. An unsupervised approach to recognizing discourse relations. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12 July 2002; pp. 368–375. [Google Scholar]
  2. Staliūnaitė, I.; Gorinski, P.J.; Iacobacci, I. Improving commonsense causal reasoning by adversarial training and data augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 13834–13842. [Google Scholar]
  3. Maruf, S.; Saleh, F.; Haffari, G. A survey on document-level neural machine translation: Methods and evaluation. ACM Comput. Surv. CSUR 2021, 54, 1–36. [Google Scholar] [CrossRef]
  4. Schick, T.; Schütze, H. It’s not just size that matters: Small language models are also few-shot learners. arXiv 2020, arXiv:2009.07118. [Google Scholar]
  5. Prasad, R.; Dinesh, N.; Lee, A.; Miltsakaki, E.; Robaldo, L.; Joshi, A.K.; Webber, B.L. The Penn Discourse TreeBank 2.0. In Proceedings of the LREC, Marrakech, Morocco, 26 May–1 June 2008. [Google Scholar]
  6. Webber, B.; Prasad, R.; Lee, A.; Joshi, A. The Penn Discourse Treebank 3.0 Annotation Manual; University of Pennsylvania: Philadelphia, PA, USA, 2019; Volume 35, p. 108. [Google Scholar]
  7. Carlson, L.; Marcu, D.; Okurowski, M.E. RST Discourse Treebank LDC2002T07; University of Pennsylvania: Philadelphia, PA, USA, 2002. [Google Scholar]
  8. Sluyter-Gaethje, H.; Bourgonje, P.; Stede, M. Penn Discourse Treebank Version 2.0—German Translation LDC2021T05; University of Pennsylvania: Philadelphia, PA, USA, 2021. [Google Scholar]
  9. Mendes, A.; Lejeune, P. CRPC-DB a Discourse Bank for Portuguese. In Proceedings of the International Conference on Computational Processing of the Portuguese Language, Fortaleza, Brazil, 21–23 March 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 79–89. [Google Scholar]
  10. Mirzaei, A.; Safari, P. Persian discourse treebank and coreference corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7–12 May 2018. [Google Scholar]
  11. Zhou, L.; Li, B.; Wei, Z.; Wong, K.F. The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank. In Proceedings of the LREC, Reykjavik, Iceland, 26–31 May 2014; pp. 942–949. [Google Scholar]
  12. Zhou, Y.; Xue, N. The Chinese Discourse TreeBank: A Chinese corpus annotated with discourse relations. Lang. Resour. Eval. 2015, 49, 397–431. [Google Scholar] [CrossRef]
  13. Zeyrek, D.; Mendes, A.; Grishina, Y.; Kurfalı, M.; Gibbon, S.; Ogrodniczuk, M. TED Multilingual Discourse Bank (TED-MDB): A parallel corpus annotated in the PDTB style. Lang. Resour. Eval. 2020, 54, 587–613. [Google Scholar] [CrossRef]
  14. Wu, H.; Zhou, H.; Lan, M.; Wu, Y.; Zhang, Y. Connective Prediction for Implicit Discourse Relation Recognition via Knowledge Distillation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; Volume 1, Long Papers. pp. 5908–5923. [Google Scholar]
  15. Jiang, Y.; Zhang, L.; Wang, W. Global and Local Hierarchy-aware Contrastive Framework for Implicit Discourse Relation Recognition. arXiv 2022, arXiv:2211.13873. [Google Scholar]
  16. Chan, C.; Liu, X.; Cheng, J.; Li, Z.; Song, Y.; Wong, G.Y.; See, S. DiscoPrompt: Path Prediction Prompt Tuning for Implicit Discourse Relation Recognition. arXiv 2023, arXiv:2305.03973. [Google Scholar]
  17. Mirza, P.; Sprugnoli, R.; Tonelli, S.; Speranza, M. Annotating causality in the TempEval-3 corpus. In Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL), Gothenburg, Sweden, 26–27 April 2014; pp. 10–19. [Google Scholar]
  18. Mostafazadeh, N.; Grealish, A.; Chambers, N.; Allen, J.; Vanderwende, L. CaTeRS: Causal and temporal relation scheme for semantic annotation of event structures. In Proceedings of the Fourth Workshop on Events, San Diego, CA, USA, 21–22 June 2016; pp. 51–61. [Google Scholar]
  19. Tan, F.A.; Hürriyetoğlu, A.; Caselli, T.; Oostdijk, N.; Nomoto, T.; Hettiarachchi, H.; Ameer, I.; Uca, O.; Liza, F.F.; Hu, T. The causal news corpus: Annotating causal relations in event sentences from news. arXiv 2022, arXiv:2204.11714. [Google Scholar]
  20. Zhang, M.; Qin, B.; Liu, T. Chinese discourse relation semantic taxonomy and annotation. J. Chin. Inf. Process. 2014, 28, 28–36. [Google Scholar]
  21. Fuyi, X. Modern Chinese Complex Sentences III: Adversative Type; Taylor & Francis: Abingdon, UK, 2023. [Google Scholar]
  22. Zhang, B. Implication, presupposition and the understanding of sentences. Chin. Teach. World 2002, 3, 5–9. [Google Scholar]
  23. Zeng, J.; Lu, F. From Counter-Expectation Marker to Discourse Marker: On the Pragmatic Function and Evolution of “danshi”. Linguist. Sci. 2016, 15, 391–400. [Google Scholar]
  24. Yuan, Y. Counter-expectation, additive relation and the types of pragmatic scale: The comparative analyses of the semantic function of shenzhi and faner. Contemp. Linguist. 2008, 10, 109–121. [Google Scholar]
  25. Marcus, M.P.; Santorini, B.; Marcinkiewicz, M.A. Treebank-3 LDC99T42; Web Download; University of Pennsylvania: Philadelphia, PA, USA, 1999. [Google Scholar]
  26. Longacre, R.E. The Grammar of Discourse; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
  27. Pitler, E.; Louis, A.; Nenkova, A. Automatic sense prediction for implicit discourse relations in text. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Singapore, 2–7 August 2009; pp. 683–691. [Google Scholar]
  28. Zhang, B.; Su, J.; Xiong, D.; Lu, Y.; Duan, H.; Yao, J. Shallow convolutional neural network for implicit discourse relation recognition. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 2230–2235. [Google Scholar]
  29. Liu, Y.; Li, S. Recognizing implicit discourse relations via repeated reading: Neural networks with multi-level attention. arXiv 2016, arXiv:1609.06380. [Google Scholar]
  30. Dai, Z.; Huang, R. Improving implicit discourse relation classification by modeling inter-dependencies of discourse units in a paragraph. arXiv 2018, arXiv:1804.05918. [Google Scholar]
  31. Van Ngo, L.; Than, K.; Nguyen, T.H. Employing the correspondence of relations and connectives to identify implicit discourse relations via label embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 4201–4207. [Google Scholar]
  32. Kishimoto, Y.; Murawaki, Y.; Kurohashi, S. Adapting bert to implicit discourse relation classification with a focus on discourse connectives. In Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 1152–1158. [Google Scholar]
  33. Kurfalı, M.; Östling, R. Let’s be explicit about that: Distant supervision for implicit discourse relation classification via connective prediction. arXiv 2021, arXiv:2106.03192. [Google Scholar]
  34. Wu, C.; Hu, C.; Li, R.; Lin, H.; Su, J. Hierarchical multi-task learning with CRF for implicit discourse relation recognition. Knowl. Based Syst. 2020, 195, 105637. [Google Scholar] [CrossRef]
  35. Zhao, H.; He, R.; Xiao, M.; Xu, J. Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; Volume 1, Long Papers. pp. 6477–6492. [Google Scholar]
  36. Wang, C.; Jian, P.; Huang, M. Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 687–699. [Google Scholar]
  37. Xun, E.; Rao, G.; Xiao, X.; Zang, J. The construction of the BCC Corpus in the age of Big Data. Corpus Linguist. 2016, 3, 93–109. [Google Scholar]
  38. Passonneau, R. Measuring Agreement on Set-Valued Items (MASI) for Semantic and Pragmatic Annotation; Columbia University Libraries: New York, NY, USA, 2006. [Google Scholar]
  39. Xu, Y.; Jones, G.J.; Li, J.; Wang, B.; Sun, C. A study on mutual information-based feature selection for text categorization. J. Comput. Inf. Syst. 2007, 3, 1007–1012. [Google Scholar]
  40. Meng, Y.; Wu, W.; Wang, F.; Li, X.; Nie, P.; Yin, F.; Li, M.; Han, Q.; Sun, X.; Li, J. Glyce: Glyph-vectors for chinese character representations. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 2746–2757. [Google Scholar]
  41. Sun, Z.; Li, X.; Sun, X.; Meng, Y.; Ao, X.; He, Q.; Wu, F.; Li, J. Chinesebert: Chinese pretraining enhanced by glyph and pinyin information. arXiv 2021, arXiv:2106.16038. [Google Scholar]
  42. Wang, Z.; Liu, X.; Zhang, M. Breaking the representation bottleneck of Chinese characters: Neural machine translation with stroke sequence modeling. arXiv 2022, arXiv:2211.12781. [Google Scholar]
  43. Gu, R.; Wang, T.; Deng, J.; Cheng, L. Improving Chinese named entity recognition by interactive fusion of contextual representation and glyph representation. Appl. Sci. 2023, 13, 4299. [Google Scholar] [CrossRef]
  44. Zhang, X.; Zheng, Y.; Yan, H.; Qiu, X. Investigating Glyph Phonetic Information for Chinese Spell Checking: What Works and What’s Next. arXiv 2022, arXiv:2212.04068. [Google Scholar]
  45. Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. Xlnet: Generalized autoregressive pretraining for language understanding. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 5753–5765. [Google Scholar]
  46. Joshi, M.; Chen, D.; Liu, Y.; Weld, D.S.; Zettlemoyer, L.; Levy, O. Spanbert: Improving pre-training by representing and predicting spans. Trans. Assoc. Comput. Linguist. 2020, 8, 64–77. [Google Scholar] [CrossRef]
  47. Clark, K.; Luong, M.T.; Le, Q.V.; Manning, C.D. Electra: Pre-training text encoders as discriminators rather than generators. arXiv 2020, arXiv:2003.10555. [Google Scholar]
  48. Clark, K.; Khandelwal, U.; Levy, O.; Manning, C.D. What does bert look at? An analysis of bert’s attention. arXiv 2019, arXiv:1906.04341. [Google Scholar]
  49. Rogers, A.; Kovaleva, O.; Rumshisky, A. A primer in BERTology: What we know about how BERT works. Trans. Assoc. Comput. Linguist. 2021, 8, 842–866. [Google Scholar] [CrossRef]
  50. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  51. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. Roberta: A robustly optimized bert pretraining approach. arXiv 2019, arXiv:1907.11692. [Google Scholar]
  52. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. Albert: A lite bert for self-supervised learning of language representations. arXiv 2019, arXiv:1909.11942. [Google Scholar]
  53. Lan, M.; Wang, J.; Wu, Y.; Niu, Z.Y.; Wang, H. Multi-task attention-based neural networks for implicit discourse relationship representation and identification. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 1299–1308. [Google Scholar]
  54. Lei, W.; Xiang, Y.; Wang, Y.; Zhong, Q.; Liu, M.; Kan, M.Y. Linguistic properties matter for implicit discourse relation recognition: Combining semantic interaction, topic continuity and attribution. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 4848–4855. [Google Scholar]
  55. Wu, C.; Cao, L.; Ge, Y.; Liu, Y.; Zhang, M.; Su, J. A label dependence-aware sequence generation model for multi-level implicit discourse relation recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 11486–11494. [Google Scholar]
  56. Long, W.; Webber, B. Facilitating contrastive learning of discourse relational senses by exploiting the hierarchy of sense relations. arXiv 2023, arXiv:2301.02724. [Google Scholar]
Figure 1. An illustration of the internal semantic elements in an adversative sentence, following Longacre’s theory. Normally, P results in Q, where Q β is an opposing proposition. R signifies the factors causing the emergence of Q β , while S represents an alternative behavior. Q is omitted here, and it should be completed as “I find it”.
Figure 2. Overall framework for our deep learning method for CADRR.
Table 1. The internal semantic element labels and an example.
Label | Meaning | Example
P | Premise | 所有前来应试的孩子条件都不错. (All the candidates had good qualifications.)
Q | Expected Result | 原本想招收20名, (Initially, we planned to enrol 20 of them,)
R | Reason | 但是为了能更好地集中精力培养精英, (but to concentrate on cultivating elites,)
Q β | Unexpected Result | 还是放弃了. (we abandoned it.)
Table 2. Inter-annotator agreement for task (1), overall sense annotation, and task (2), internal semantic element annotation.
Group | Overall Sense | Internal Semantic Elements
Group 1 | 70.54% | 57.43%
Group 2 | 71.28% | 63.10%
Group 3 | 73.41% | 69.58%
Group 4 | 67.29% | 59.86%
Group 5 | 64.80% | 67.91%
Group 6 | N/A | 65.14%
Average | 69.46% | 63.84%
N/A indicates that no agreement value is available for that group.
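The figures in Table 2 can be read as the proportion of items on which a group's two annotators assign the same label. The Python sketch below shows that simple percentage-agreement computation on invented labels; the annotation study itself cites MASI [38] for set-valued items, so the actual metric may be more elaborate than this.

    def percentage_agreement(labels_a, labels_b):
        """Share of items for which two annotators assign the same label."""
        assert len(labels_a) == len(labels_b)
        matches = sum(a == b for a, b in zip(labels_a, labels_b))
        return matches / len(labels_a)

    # Toy usage with invented sense labels:
    annotator_a = ["Concession", "Cause", "Condition", "Concession"]
    annotator_b = ["Concession", "Cause", "Expansion", "Concession"]
    print(f"{percentage_agreement(annotator_a, annotator_b):.2%}")  # 75.00%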
Table 3. Sense classification statistics.
Sense | Count | Contribution
Cause | 475 | 4.98%
Condition | 940 | 9.85%
Direct Contrast | 1322 | 13.85%
Indirect Contrast | 1038 | 10.87%
Concession | 5156 | 54.01%
Expansion | 422 | 4.42%
Progression | 84 | 0.88%
Coordination | 109 | 1.14%
Table 4. The arrangement of internal semantic elements.
Internal Semantic Element Type | Num * | Semantic Elements (P, Q, R, Q β) | Count
P, Q, R, Q β | 4/4 | 1, 2, 3, 4 | 1123
P, Q, Q β , R | 4/4 | 1, 2, 4, 3 | 204
Q, P, R, Q β | 4/4 | 2, 1, 3, 4 | 40
P, R, Q β | 3/4 | 1, N/A, 2, 3 | 1911
P, Q β , R | 3/4 | 1, N/A, 3, 2 | 1050
R, P, Q β | 3/4 | 2, N/A, 1, 3 | 921
P, Q, R | 3/4 | 1, 2, 3, N/A | 335
P, Q, Q β | 3/4 | 1, 2, N/A, 3 | 256
Q, R, Q β | 3/4 | N/A, 1, 2, 3 | 107
Q β , P, R | 3/4 | 2, N/A, 3, 1 | 50
Q, Q β , R | 3/4 | N/A, 1, 3, 2 | 31
Q, P, R | 3/4 | 2, 1, 3, N/A | 15
R, Q β , Q | 3/4 | N/A, 3, 1, 2 | 10
P, Q β | 2/4 | 1, N/A, N/A, 2 | 2162
P, R | 2/4 | 1, N/A, 2, N/A | 615
R, Q | 2/4 | N/A, 2, 1, N/A | 591
Q, Q β | 2/4 | N/A, 1, N/A, 2 | 63
Q, R | 2/4 | N/A, 1, 2, N/A | 44
R, Q β | 2/4 | N/A, N/A, 1, 2 | 18
* Num is the number of internal semantic elements present out of the four possible elements. The Semantic Elements column gives the linear position of P, Q, R, and Q β in the sentence; N/A indicates that the element is absent.
Table 5. Mutual information between internal semantic element arrangements and sense categories.
Element Arrangement | Cause | Condition | Direct Contrast | Indirect Contrast | Concession | Expansion | Progression | Coordination
P, Q, R, Q β | 0.8044 | 2.0547 | 2.3165 | −19.6749 | 1.5834 | 3.2951 | 1.1953 | 2.8227
P, Q, Q β , R | 9.0908 | 1.0000 | 1.3983 | 2.3853 | 8.1094 | 3.6484 | −0.1580 | −0.5577
P, R, Q β | 1.0001 | 1.0000 | 1.0003 | 1.0004 | 1.0003 | 1.0003 | 1.0003 | 1.0008
P, Q β , R | 0.6788 | 1.0630 | 5.2049 | −4.9579 | −4.8115 | −85.4876 | 2.2120 | 1.9229
R, P, Q β | 1.0001 | 1.0004 | 1.0001 | 1.0001 | 1.0004 | 1.0002 | 1.0003 | 1.0000
P, Q, R | 1.0003 | 1.0000 | 1.0006 | 1.0007 | 1.0002 | 1.0005 | 1.0011 | 1.0012
P, Q, Q β | −1.0162 | 1.0527 | 0.9400 | 1.4621 | 2.0928 | 10.5881 | 0.6616 | −0.2112
P, Q β | 1.2691 | 1.0436 | 10.7147 | 4.6576 | 1.6265 | −6.7003 | 2.0428 | 1.1283
P, R | 1.0001 | 1.0000 | 1.0004 | 1.0005 | 1.0003 | 1.0008 | 1.0010 | 1.0013
R, Q | 1.0000 | 0.4818 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000
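As a rough illustration of how the association between an element arrangement and a sense can be scored, the sketch below estimates mutual information from co-occurrence counts of a binary arrangement indicator and a sense label, following the standard definition used in feature selection [39]. It is a simplified stand-in on invented counts; the exact formulation (and any normalization) behind Table 5 may differ, which is also why this estimate, unlike some entries above, can never be negative.

    import math
    from collections import Counter

    def mutual_information(pairs):
        """Mutual information (in bits) between two discrete variables,
        estimated from a list of (x, y) observations."""
        n = len(pairs)
        joint = Counter(pairs)
        px = Counter(x for x, _ in pairs)
        py = Counter(y for _, y in pairs)
        mi = 0.0
        for (x, y), count in joint.items():
            # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
            mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
        return mi

    # Toy usage: co-occurrence of a hypothetical "P, Q β" arrangement indicator with two senses.
    observations = (
        [(True, "Direct Contrast")] * 30 + [(True, "Concession")] * 10
        + [(False, "Direct Contrast")] * 20 + [(False, "Concession")] * 40
    )
    print(f"{mutual_information(observations):.4f} bits")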
Table 6. The precision, recall, and F1-score of discourse relation recognition with our proposed method.
Method | Precision | Recall | F1
BERT | 56.89% | 33.78% | 42.39%
RoBERTa | 60.15% | 33.65% | 43.16%
ALBERT | 59.82% | 35.76% | 44.76%
Ours | 67.53% | 38.59% | 49.11%
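As a quick sanity check, the F1 column in Tables 6 and 7 is the harmonic mean of the corresponding precision and recall; the snippet below reproduces the BERT row of Table 6.

    def f1_score(precision, recall):
        """Harmonic mean of precision and recall."""
        return 2 * precision * recall / (precision + recall)

    print(f"{f1_score(0.5689, 0.3378):.2%}")  # 42.39%, matching the BERT row of Table 6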
Table 7. Results of methods without glyph encoders.
Method | Precision | Recall | F1
BERT | 54.32% | 33.08% | 41.12%
RoBERTa | 56.79% | 32.96% | 41.71%
ALBERT | 55.61% | 34.87% | 42.86%
Ours | 62.95% | 35.48% | 45.38%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
