3.1.1. Verb Extraction
Constructing a situation type classification system is a bottom-up process. The number of verbs at the foundational level must be sufficiently large, and their coverage sufficiently broad, to ensure that the resulting classification system is representative. The primary task is therefore to establish a large-scale verb lexicon. The verbs in this lexicon come from two sources.
First, we draw from the Verb Usage Dictionary by Meng [40]. The reasons for using this dictionary are threefold. First, the dictionary is dedicated to the detailed division of verb senses and the explanation of verb meanings, providing a reference for determining primary senses during manual annotation. Second, the dictionary contains extensive information describing verb functions, such as the ability of verbs to co-occur with aspectual markers like “LE”, “ZHE” and “GUO” and with temporal quantifiers, which can be used to verify whether our annotations of the co-occurrence of verbs and aspectual markers are accurate. Third, Guo [12] annotated the verbs in Meng’s dictionary, allowing a second verification of our annotation results to ensure their reliability.
Second, we extracted high-frequency verbs from the CCL corpus. The rationale for this choice is twofold. First, verbs in the corpus directly reflect actual language usage, so raw corpus data are more authentic and representative. Verbs in dictionaries, by contrast, have undergone manual standardization and may not fully reflect how the language is actually used. The corpus verbs thus complement those from Meng’s dictionary. Second, the corpus contains a larger number of verbs, providing broader coverage of co-occurrence features between verbs and aspectual markers, which helps mitigate the data sparsity problem in statistical validation.
3.1.2. Manual Annotation of Co-Occurrence Information
The classification of verb situation types is based on the co-occurrence features of verbs with different aspectual markers. To ensure the scientific rigor of the situation type classification system, we need to establish objective annotation criteria and manually judge and annotate the relevant features.
We use Guo’s principles for classifying verb situation types [12] as the basis for determining whether verbs can co-occur with aspectual markers. Guo [12,19] focused on the event structure exhibited by verbs across temporal stages, classifying Chinese verbs from the perspective of process structure, which essentially aligns with the mainstream concept of situation types. Guo [12] holds that verbs, as declarative components, signify an internal process that unfolds over time. This internal process consists of three elements: inception, finish and duration. The presence and intensity of these three elements determine the differences among verb situation types. On this basis, Guo classified Chinese verb situation types into five major categories and ten subcategories, viewing the subclasses as a complete gradual system with three prototypical categories, Va, Vc4 and Ve, representing state, activity and transition, respectively [12]. Later, Guo [19] revised this view, proposing that the subclasses of verb situation types stand in a hierarchical relationship. He introduced a three-tiered situation type hierarchy consisting of stative verbs, action verbs and change verbs, as shown in Figure 2.
The reason for adopting Guo’s annotation method [12] is that his annotation of Meng’s dictionary is comprehensive and meticulous. He annotated nearly all the verbs in the dictionary, totaling 1328 verbs and 2117 senses. Moreover, Guo’s classification of situation types is based primarily on the grammatical features of verbs (specifically, whether they can co-occur with various aspectual markers) rather than on their intrinsic semantics. This approach ensures both scientific rigor and practical operability, in line with our principles. We summarize the specific criteria in Table 1.
As shown in Table 1, classifying the situation type of a verb requires applying two criteria in sequence: first, which aspectual markers the verb can co-occur with; and second, what meaning the verb conveys when it co-occurs with these markers. Take the aspectual marker “LE” as an example. We first determine whether the verb can co-occur with “LE”. If it can, there are two possible readings: if the co-occurrence indicates the beginning of an action, the verb has both inception and duration; if the co-occurrence indicates the completion of an action, the verb has finish. A verb that allows both readings has all three elements, as in (3). For instance:
(1) hai4xiu1 le
feel_shy LE
start to feel shy
[+inception, +duration]
(2) shi2xian4 le
achieve LE
have achieved
[+finish]
(3) zhi2xing2 le
execute LE
start to execute/have executed
[+inception, +duration, +finish]
Furthermore, if a verb can co-occur with the progressive markers “ZHE” or “ZAI/ZHENGZAI”, then the verb possesses a duration; if a verb can co-occur with the experiential marker “GUO”, then the verb has the finish. It is important to note that if a verb cannot co-occur with any of these five aspectual markers but can indicate an ongoing action, then the verb still has a duration.
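The rules above can be written down as a small decision function. The following is only a minimal sketch under our own assumptions: the record type `CoOccurrence`, its field names and the function `process_elements` are hypothetical, and the sketch records only the presence of each element, not its intensity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoOccurrence:
    """Hypothetical annotation record for one verb sense (field names are ours)."""
    le_inception: bool = False      # verb + LE can mean the action has begun
    le_finish: bool = False         # verb + LE can mean the action is completed
    zhe: bool = False               # can co-occur with ZHE
    zai: bool = False               # can co-occur with ZAI/ZHENGZAI
    guo: bool = False               # can co-occur with GUO
    ongoing_unmarked: bool = False  # can denote an ongoing action without any marker


def process_elements(c: CoOccurrence) -> dict:
    """Derive the three process elements from the co-occurrence judgments."""
    return {
        "inception": c.le_inception,
        "duration": c.le_inception or c.zhe or c.zai or c.ongoing_unmarked,
        "finish": c.le_finish or c.guo,
    }


# Only the LE readings stated in examples (1)-(3) are encoded here;
# the other markers are left at their default (False) for illustration.
examples = {
    "hai4xiu1": CoOccurrence(le_inception=True),
    "shi2xian4": CoOccurrence(le_finish=True),
    "zhi2xing2": CoOccurrence(le_inception=True, le_finish=True),
}
for verb, record in examples.items():
    print(verb, process_elements(record))
```

With these inputs the function reproduces the feature bundles given for examples (1) to (3).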
Therefore, the elements of inception, duration and finish are crucial in determining the situation type of a verb. Based on the presence and intensity of these three elements, verbs can be classified into ten distinct subcategories.
There are several points that require special clarification:
(1) Many aspectual markers in Chinese have homographs. The aspectual marker “LE” can be divided into two separate words: “LE1”, which functions as a modal particle at the end of sentences indicating the emergence of a new situation, and “LE2”, which is an auxiliary word placed after the verb to indicate the completion of an action. Although Guo [12] did not explicitly distinguish between the two, he noted that the “LE” in the above table should form a direct constituent with the verb. This means that the aspectual marker “LE” includes two cases: “LE” functioning as “LE2” and “LE” combining both “LE1” and “LE2”. It does not include “LE” functioning solely as “LE1”.
(2) Many scholars argue that the time adverbs “ZAI” and “ZHENGZAI” differ in their semantics and in their ability to co-occur with verbs. However, since both can indicate the progression of an action or the continuation of a state, we follow Guo’s standard [12] and treat them as a single feature for the time being.
(3) The aspectual marker “GUO” here is an auxiliary word. Some scholars argue that it should be divided into “GUO1” and “GUO2”. “GUO1” follows a predicate to indicate the completion of an action, while “GUO2” follows a predicate to indicate a past experience [41,42]. We do not make this distinction here, but it is important to clarify that the aspectual marker “GUO” does not include other homographs such as the directional verb “GUO”.
Next, we manually annotate the ability of each verb to co-occur with the five aspectual markers. The annotators are required to make judgments based on intuition and introspection, so as to avoid interference from corpus data.
Additionally, to check the internal consistency of the data, we used systematic sampling to select 20 verbs from the lexicon as repeated items. The annotators re-annotated these verbs some time after the initial annotation. Because systematic sampling draws items at a fixed interval, the selected verbs are unlikely to cue one another's co-occurrence features, which would otherwise undermine the internal consistency check.
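Systematic sampling of the repeated items can be sketched as follows. This is an illustration under our own assumptions, not the exact procedure used: the function name `systematic_sample`, the random starting offset, the seed and the placeholder lexicon are all hypothetical; only the sample size of 20 comes from the text.

```python
import random


def systematic_sample(items, n, seed=None):
    """Pick n items at a fixed interval, starting from a random offset."""
    if not 0 < n <= len(items):
        raise ValueError("n must be between 1 and len(items)")
    step = len(items) / n                      # sampling interval (may be fractional)
    start = random.Random(seed).random() * step  # random offset in [0, step)
    last = len(items) - 1
    # min() guards against a floating-point overshoot at the very end
    return [items[min(int(start + i * step), last)] for i in range(n)]


# Placeholder lexicon; the real lexicon and its ordering are not reproduced here.
lexicon = [f"verb_{i:04d}" for i in range(1000)]
repeated_items = systematic_sample(lexicon, 20, seed=0)
print(len(repeated_items))
```

Because the interval is at least one position wide, the selected items are always distinct and spread evenly over the ordered lexicon.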
3.1.3. Consistency Check
The annotation task was completed by four annotators who had received specialized training in linguistics. We then conducted a multi-level cross-validation using Cohen’s Kappa consistency test [43]. The Kappa coefficient is a statistical measure of agreement that compares the agreement observed among annotators with the agreement expected by chance. The coefficient ranges from −1 to 1, with values closer to 1 indicating higher agreement between annotators. We set a threshold of 0.75: a Kappa value above this threshold indicated that the annotated data had passed the consistency verification.
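For reference, the standard two-rater definition of Cohen's Kappa can be stated as below. The symbols are our own notation: $N$ is the number of annotated items, $a_i$ and $b_i$ are the labels given to item $i$ by the two annotators, and $n_k^{(a)}$, $n_k^{(b)}$ are the numbers of items that each annotator assigned to category $k$.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_o = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}[a_i = b_i],
\qquad
p_e = \sum_{k} \frac{n_k^{(a)}}{N} \cdot \frac{n_k^{(b)}}{N}
```

Here $p_o$ is the observed agreement and $p_e$ the agreement expected by chance, so $\kappa = 1$ indicates perfect agreement and $\kappa = 0$ indicates agreement no better than chance.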
The consistency verification was conducted at three levels:
(1) Intra-annotator consistency: For the 20 repeated verbs, we performed intra-annotator consistency checks by comparing each annotator’s initial and subsequent annotations. This was performed to evaluate the stability of the judgments formed by each annotator through intuition and introspection.
(2) Inter-annotator consistency: We conducted inter-annotator consistency checks across the four annotators to measure the degree of agreement among them in performing the annotation task.
(3) Cross-validation with existing resources: After finalizing the annotation results, we performed consistency checks between our annotations and those of Guo [12]. By cross-validating with pre-annotated linguistic resources, we ensured the scientific rigor of the annotation results.
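The inter-annotator level of this check can be sketched as follows. Cohen's Kappa is defined for two raters, so this sketch assumes that agreement among four annotators is assessed pairwise; that pairing, the function names and the toy judgments are our assumptions, while the 0.75 threshold comes from the text. The same `cohen_kappa` function would apply unchanged to an annotator's first and second pass, or to our labels against those of an existing resource.

```python
from collections import Counter
from itertools import combinations

THRESHOLD = 0.75  # threshold stated in the text


def cohen_kappa(a, b):
    """Cohen's Kappa for two equally long sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n            # observed agreement
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)  # chance agreement
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


def pairwise_kappa(annotations):
    """Kappa for every pair of annotators (pairwise use is our assumption)."""
    return {
        (r1, r2): cohen_kappa(annotations[r1], annotations[r2])
        for r1, r2 in combinations(sorted(annotations), 2)
    }


# Toy binary judgments (1 = can co-occur with a marker, 0 = cannot); not real data.
toy = {
    "A1": [1, 1, 0, 0, 1, 0, 1, 1, 0, 1],
    "A2": [1, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    "A3": [1, 1, 0, 0, 1, 0, 1, 1, 0, 1],
}
for pair, kappa in pairwise_kappa(toy).items():
    print(pair, round(kappa, 3), "pass" if kappa > THRESHOLD else "fail")
```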
Only the annotations that passed the consistency checks were used in the hierarchical clustering experiments. These checks improve on the annotation methods used by previous researchers and, to some extent, reduce subjectivity in situation type classification.
3.1.4. Automatic Generation of Lexical Situation Type System
Based on the annotation results, verbs exhibit distinct distribution patterns in terms of aspectual marker features. Although Guo [12] classified the situation type system of verbs according to objective grammatical criteria, his approach remains insufficiently thorough. When determining the presence and intensity of the three elements (inception, finish and duration), he merely aggregated these features without employing a more systematic method to examine the impact of differences in element intensity. Therefore, to address the limitations of previous research and to further measure the similarity and structural relationships among verbs, we transformed the categorical variables in the original matrix into dummy variables. Based on these dummy variables, we performed hierarchical clustering on the verbs and visualized the clustering results using a dendrogram. Finally, we evaluated the clustering outcome using the Cophenetic Correlation Coefficient (CPCC) metric.
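The conversion of categorical annotations into dummy variables can be sketched as below. The column names and category values ("inception", "finish", "both", "yes", "no") are hypothetical placeholders, since the actual coding scheme of the annotation matrix is not reproduced here, and `to_dummies` is our own helper rather than a library function.

```python
def to_dummies(rows, columns):
    """One-hot encode categorical columns; returns (header, 0/1 matrix)."""
    # Collect the category levels observed in each column, in a stable order.
    levels = {col: sorted({row[col] for row in rows}) for col in columns}
    header = [f"{col}={lvl}" for col in columns for lvl in levels[col]]
    matrix = [
        [int(row[col] == lvl) for col in columns for lvl in levels[col]]
        for row in rows
    ]
    return header, matrix


# Hypothetical annotations for three verb senses.
annotations = [
    {"LE": "inception", "ZHE": "yes", "GUO": "no"},
    {"LE": "finish", "ZHE": "no", "GUO": "yes"},
    {"LE": "both", "ZHE": "yes", "GUO": "yes"},
]
header, matrix = to_dummies(annotations, ["LE", "ZHE", "GUO"])
print(header)
for row in matrix:
    print(row)
```

Each categorical column becomes one 0/1 column per level, so every row contains exactly one 1 per original column and distances between rows can be computed numerically.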
Prior to conducting hierarchical clustering, it is necessary to calculate the distance between each pair of samples and assemble these distances into a distance matrix, which determines the clustering structure built during the clustering process. Since there are multiple distance metrics and multiple linkage methods, different parameter choices can yield different clustering results and different dendrograms. To achieve a more desirable hierarchical clustering outcome, we first employed grid search to identify the optimal parameters for hierarchical clustering.
Grid search exhaustively evaluates every combination of candidate parameters and selects the optimal set for hierarchical clustering. The parameter set that yields the highest CPCC is selected, and the dendrogram built with it is chosen as the optimal dendrogram.
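A grid search of this kind can be sketched with SciPy as follows. The candidate metrics and linkage methods, the toy matrix and the function name `grid_search_cpcc` are our assumptions; the actual search space is not specified here. Linkage methods that presuppose Euclidean distances (such as Ward) are deliberately left out of this sketch so that every metric and method pairing is valid.

```python
from itertools import product

import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist


def grid_search_cpcc(X, metrics, methods):
    """Score every (metric, method) pair by CPCC and return the best one."""
    results = []
    for metric, method in product(metrics, methods):
        distances = pdist(X, metric=metric)    # condensed pairwise distance matrix
        Z = linkage(distances, method=method)  # agglomerative clustering
        cpcc, _ = cophenet(Z, distances)       # cophenetic correlation coefficient
        if not np.isnan(cpcc):
            results.append((float(cpcc), metric, method))
    best = max(results, key=lambda r: r[0])
    return best, results


# Toy 0/1 dummy-variable matrix (rows = verb senses); not real data.
X = np.array([
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
], dtype=float)

best, results = grid_search_cpcc(
    X,
    metrics=["hamming", "jaccard", "euclidean"],
    methods=["single", "complete", "average", "weighted"],
)
print("best CPCC %.3f with metric=%s, method=%s" % best)
```

The CPCC is the correlation between the original pairwise distances and the heights at which pairs are joined in the dendrogram, so a higher value means the tree distorts the distance matrix less.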
After determining the parameters, we employ agglomerative hierarchical clustering to process the matrix, automatically generating the hierarchical structure of situation types from the bottom up. In hierarchical clustering, each sample initially forms its own cluster. Based on the distance information in the distance matrix, the similarity between clusters is calculated, and at each step, the two closest clusters are merged. This process continues until all samples are merged into a single cluster. The system of situation types is then automatically generated based on the semantic distances between verbs and is presented in the form of a dendrogram.
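The merging procedure itself can be illustrated with a deliberately naive implementation. This is a teaching sketch, not the implementation used for the experiments: it assumes Hamming distance and lets the caller choose the linkage rule (`max` for complete linkage, `min` for single linkage), and the four toy vectors are invented.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions at which two feature vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def agglomerate(points, link=max):
    """Naive agglomerative clustering; returns the merge history.

    `link` aggregates the pairwise distances between two clusters:
    max gives complete linkage, min gives single linkage.
    """
    clusters = [(i,) for i in range(len(points))]  # every sample starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            d = link(hamming(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)                   # closest pair found so far
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Toy feature vectors for four verb senses; not real data.
points = [(1, 1, 0, 0), (1, 1, 0, 1), (0, 0, 1, 1), (0, 0, 1, 0)]
for left, right, dist in agglomerate(points):
    print(left, "+", right, "at distance", dist)
```

The merge history is exactly the information a dendrogram draws: which clusters were joined, in what order, and at what distance.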
Finally, based on the results of hierarchical clustering, the verb situation types are mapped to the senses of the verbs. We adhere to the principle of a one-to-one correspondence between verb senses and situation types, meaning that each sense of a verb is assigned to a specific situation type in the hierarchical clustering, thereby obtaining the corresponding situation type labels.
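One way to obtain such one-to-one labels is to cut the dendrogram into flat clusters and read off a label per sense. The sketch below makes several assumptions that are not stated in the text: the cut criterion (`maxclust` with three clusters), the average linkage with Hamming distance, the sense identifiers and the toy matrix are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical sense identifiers, one row of dummy variables per sense.
sense_ids = ["verb_a#1", "verb_a#2", "verb_b#1", "verb_c#1", "verb_d#1", "verb_d#2"]
X = np.array([
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
], dtype=float)

# Build the tree, then cut it so that at most three flat clusters remain.
Z = linkage(X, method="average", metric="hamming")
labels = fcluster(Z, t=3, criterion="maxclust")

# Each sense receives exactly one situation type label.
sense_to_type = {sense: int(label) for sense, label in zip(sense_ids, labels)}
for sense, label in sense_to_type.items():
    print(sense, "-> cluster", label)
```

Keying the mapping by sense rather than by verb keeps different senses of one verb free to fall into different situation types.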