Representing Aspectual Meaning in Sentence: Computational Modeling Based on Chinese

Liu, Hongchao; Liu, Bin

doi:10.3390/app15073720


by Hongchao Liu ¹ and Bin Liu ²,*

¹ School of Literature, Shandong University, Jinan 250100, China

² Research Center for Language and Language Education, Central China Normal University, Wuhan 430079, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(7), 3720; https://doi.org/10.3390/app15073720

Submission received: 21 February 2025 / Revised: 21 March 2025 / Accepted: 26 March 2025 / Published: 28 March 2025

(This article belongs to the Special Issue Application of Artificial Intelligence and Semantic Mining Technology)


Abstract

Situation types can be viewed as the foundation of representation of sentence meaning. Noting that situation types cannot be determined by verbs alone, recent studies often focus on situation type prediction in terms of the combination of different linguistic constituents at the sentence level instead of lexically marked situation types. However, in languages with a fully marked aspectual system, such as Mandarin Chinese, such an approach may miss the opportunity of leveraging lexical aspects as well as other distribution-based lexical cues of event types. Currently, there is a lack of resources and methods for the identification and validation of the lexical aspect, and this issue is particularly severe for Chinese. From a computational linguistics perspective, the main reason for this shortage stems from the absence of a verified lexical aspect classification system, and consequently, a gold-standard dataset annotated according to this classification system. Additionally, owing to the lack of such a high-quality dataset, it remains unclear whether semantic models, including large general-purpose language models, can actually capture this important yet complex semantic information. As a result, the true realization of lexical aspect analysis cannot be achieved. To address these two problems, this paper sets out two objectives. First, we aim to construct a high-quality lexical aspect dataset. Since the classification of the lexical aspect depends on how it interacts with aspectual markers, we establish a scientific classification and data construction process through the selection of vocabulary items, the compilation of co-occurrence frequency matrices, and hierarchical clustering. 
Second, based on the constructed dataset, we separately evaluate the ability of linguistic features and large language model word embeddings to identify lexical aspect categories in order to (1) verify the capacity of semantic models to infer complex semantics and (2) achieve high-accuracy prediction of lexical aspects. Our final classification accuracy is 72.05%, representing the best result reported thus far.

Keywords: lexical aspect; situation type; dataset evaluation

1. Introduction

1.1. Aspect

Aspect serves as a cornerstone in both linguistic theory and computational modeling. As the primary temporal representation mechanism cross-linguistically, this grammatical category has remained a central focus of linguistic inquiry for over five decades. Within contemporary computational paradigms, aspectual features—when combined with tense—are systematically analyzed to derive temporal metadata for essential natural language processing tasks, including (but not limited to) the following: neural machine translation architectures, temporal relation extraction modules, and abstractive text summarization systems [1,2].

Aspect and tense both carry some of the most important temporal information in a sentence across languages [3,4]. However, their interchangeable usage in some studies reflects a confusion in their definitions. In fact, they refer to different temporal facets of a sentence.

Tense concerns the relationship between the time at which the situation described by a sentence occurs and a reference time, which is usually the speech time, that is, the time at which the speaker utters the sentence. Past, present and future are the most common tense values.

Aspect—fundamentally encoded in its etymological grounding—encapsulates perceptual vantage points regarding event phase segmentation. This linguistic phenomenon manifests through predicates whose morphosyntactic realization spans lexical units to sentential constructions, thereby governing situational representations across granularity levels.

(1) a. He was running.

b. He has finished the assignment.

Taking (1) as an example, was, rather than is, is used, indicating that the past tense applies in (1a); be + v-ing, as a whole, expresses the viewpoint on the action denoted by the verb or verb phrase and has nothing to do with tense or reference time. The action of run has different phases, including inception, duration and termination, of which only duration is focused on by be + v-ing. The same holds for (1b), which contains a viewpoint focusing on the final phase of a situation.

The same situation also exists in Chinese, such as (2).

(2) a.

ta1	zheng4zai4	pao3bu4
he	ZHENGZAI	run
He is running.

b.

ta1	wan2cheng2	le	zuo4ye4
he	finish	LE	assignment
He has finished the assignment.

The differentiation between tense and aspect opens a broad area for the further research of aspectual information in sentences and dozens of studies have appeared since the beginning of modern aspectual studies conducted by Vendler [4] and Comrie [5].

It has been commonly accepted that aspect is divided into two parts, including the situation type and viewpoint aspect, mainly developed and made clear by Smith [3] and Smith [6] in her two-components theory. The interaction between viewpoint aspects and situation types is vitally important for aspectual studies since it plays a key role in the description and explanation of grammatical and semantic features of various aspectual structures.

Situation type classification derives from the temporal architecture inherent in predicative units, spanning verbal constituents to complete sentential expressions. This classification schema fundamentally interrogates the temporal contour of situations, specifically examining the presence or absence of inception, duration, and termination phases. At the lexical stratum, the situation type is instantiated through verb classification, constituting what is traditionally designated as the lexical aspect.

Consequently, the phenomenon of situation type classification and its lexical manifestation as aspectual properties represent universal linguistic features across diverse language systems. This classification framework serves as a critical analytical tool for predicate categorization, particularly in verb classification paradigms, given its predictive capacity for determining syntactic behavior and semantic combinatorial patterns within linguistic units.

Stative verbs inherently denote situations lacking both inception and termination points, exhibiting atelic properties with homogeneous temporal distribution. Taking know as a paradigmatic example, this verb encodes a continuous state maintained by the subject without temporal interruption. Due to their inherent durative and static nature, stative verbs exhibit low compatibility with progressive aspect marking, as evidenced by the ungrammaticality of constructions like “He is knowing the name”. This linguistic constraint extends to Mandarin Chinese, where zhi1dao4 (know) similarly resists combination with progressive markers zhe or zhengzai. The underlying rationale stems from the homogeneous internal structure of states: regardless of temporal observation points, the state remains constant and indivisible into internal phases. This directly conflicts with the progressive aspect’s core function of highlighting internal temporal segmentation within events.

Activity verbs denote situations lacking explicit inception or termination points, yet unlike stative verbs, they exhibit heterogeneous temporal distribution. This indicates that the activity process comprises dynamic phases while maintaining conceptual unity as a single event. Consequently, activity verbs demonstrate flexible compatibility with progressive aspect marking, as evidenced by English constructions like “be running” and Mandarin Chinese sequences with zhengzai preceding pao3bu4 (run).

Transition verbs, by contrast, inherently encode either inception or termination points in their temporal structure. While some transition verbs like build represent durative processes, others, such as die, denote punctual events. The common characteristic across transition verbs is their representation of state changes, either emphasizing the transition process or the resultant state. This dual nature is particularly evident in Chinese verb–verb compounds like ti2gao1 (improve), which explicitly marks the progression from one state to another. More frequently, however, transition verbs focus on the resultant state, as exemplified by both the English die and the Chinese si3 (die). The inherent state transition in these verbs results in strong compatibility with perfective (or realizative) aspect marking. Representative examples are systematically presented in Table 1.

Unlike situation types, which are denoted by content verbs or predicates, viewpoint aspects are grammaticalized markers. As is shown by its name, a viewpoint aspectual marker stands for a subject’s view or perspective of a situation. Imperfective viewpoints only focus on a phase of a situation, such as be + v-ing, which takes a snapshot of the duration of a situation, while perfective aspectual markers treat a situation as a whole, such as have + v-ed, which regards a situation as a complete and finished one.

Grammaticalized aspect markers are generally considered an important demarcation point between aspect languages and non-aspect languages [7]. As a representative of inflectional languages, English has developed multiple viewpoint aspectual markers (such as the progressive aspect and the perfective aspect) or inflectional means, which is also quite common among inflectional languages.

However, Chinese, a typical isolating language with almost no inflectional morphemes, is surprisingly an aspect language with abundant aspectual markers, including ZHE, ZAI, ZHENGZAI, LE1, LE2 and GUO. Verbs’ abilities to co-occur with these markers are an important part of describing their syntactic and semantic features in Chinese [3,6,8,9,10,11,12,13].

ZHE is the progressive aspectual marker in Chinese. It can only be attached to a verb, indicating that the action is in progress, thus making it an imperfective aspectual marker. For example, the ZHE in pao3 zhe (be running) views the duration period of a running process.

ZAI and ZHENGZAI, although there are semantic differences between them, are generally considered to indicate the progressive aspect as well. However, unlike ZHE, they have not been grammaticalized into aspectual markers or aspectual particles but rather function as adverbs, which means that they are usually positioned before the verbs as modifiers instead of being attached to the back of verbs, such as zheng4zai4 gou4wu4 (be shopping). Historically speaking, their usage in expressing the progressive aspect even predates that of ZHE in the same regard.

LE has two different usages. One of them, LE1, is considered an actualization aspect marker, indicating the realization or completion of an action, and it is attached to the end of a verb, as in chi1 LE1 fan4 (have eaten dinner). The other, LE2, is regarded as a change-of-state aspect marker, generally indicating the emergence of a new situation, as in shi2 san1 sui4 LE2 (become thirteen years old). It is attached to the end of the whole sentence rather than to the verb.

GUO is generally regarded as an experiential aspect marker, indicating that the subject has once experienced the event, such as qu4 guo4 China (have ever been to China).

1.2. Lexical Aspect

It should be noted that in this paper, the lexical aspect, as a term, is used to refer to the situation type at the lexical level. However, the lexical aspect is far from being studied comprehensively and reliably for several reasons.

While aspects at the phrase and sentence level gain researchers’ attention, lexical aspects are bypassed in many studies due to their rejection of the compositionality and recursiveness of aspectual information in sentences. This rejection mainly arises from the so-called ambiguity of verbs’ situation types.

Taking run and run a mile as examples, Vendler [4] treats run itself as an activity verb and run a mile as a transition or, more specifically, an accomplishment situation. Studies such as Smith [6] identify this phenomenon as an ambiguity of verbs’ situation types, since the situation type of run per se differs from that of the run in run a mile. To tackle this dilemma, she moves situation types to the phrase level and claims that verbs have no situation type, a position followed by dozens of researchers. However, this move raises two issues. On the one hand, raising the assignment of situation types to the phrasal level cannot solve the ambiguity problem: when new components such as arguments and adjuncts are added, the situation type keeps changing accordingly, so situation types must be raised to ever higher levels until they reach the sentence, the largest grammatical unit. On the other hand, there is no way to explore the interaction between the aspectual nature of verbs and aspectual markers because verbs are excluded from situation type assignment.

However, it is impossible to study aspect without discussing the interaction between verbs and aspectual markers, which serves as the foundation for situation types at higher levels. As a result, scholars who reject lexical aspects have to load all of the aspectual information of the whole structure onto the constituent verb, which gives the verb’s situation type a shifting nature. What is more, they have to create other terms to refer to the situation type at the lexical level because they have already claimed that verbs have no situation type. Aspectual value and parameter are used in Smith [6] and Verkuyl [14], respectively, which makes their rejection of situation type at the lexical level self-contradictory because there is no fundamental difference between aspectual values/parameters and situation types.

The ambiguity actually arises from their rejection of the compositional and recursive nature of aspectual information. The situation type is recursive because different linguistic levels, such as verb, verb phrase and sentence, can have their own situation types. The situation type is compositional because different constituents in a structure jointly form the situation type of the whole structure. It should be noted that the situation type of the constituent verb is independent of that of the whole structure. It is odd to insist that the run in run to the store is different from run per se. The so-called shifting of run from an activity to an accomplishment in fact imposes the meaning of to the store on run, which does not reflect how the language actually works.

Accepting the recursive and compositional nature of aspectual information helps to settle this problem. It also means that lexical aspects have to be studied before situation types at other levels, because verbs are the basic units of phrases and sentences and, thus, of the situation types at these levels. Many computational studies, such as Li et al. [15], Cao et al. [16], Zarcone and Lenci [17] and Xu [18], focus only on the prediction of situation types at the sentential level, following linguistic studies that reject the existence of lexical aspects, which leaves the prediction of lexical aspects a nearly unexplored area.

In conclusion, the rejection of the recursive and compositional nature of aspectual information has blocked further exploration of lexical aspects, which gives rise to this study.

1.3. Research Questions

This paper focuses on the lexical aspect in Chinese, which is neglected in previous studies, and we intend to cover the following research questions in later sections.

Firstly, how can situation types be classified more reliably than by intuition? The priority before studying the interaction between viewpoint aspects and situation types is to establish their classification. While viewpoint aspects, which are grammatical markers, are limited in number and thus need no classification, situation types are hard to determine objectively. Previous studies determined the situation type of a structure simply based on intuition, which is subjective and inconsistent. In this paper, each verb’s ability to co-occur with aspectual markers is annotated by the authors and then cross-validated against other language resources. Instead of classifying verbs into situation types by intuition from the top down, hierarchical clustering is applied to the co-occurrence matrix to construct the classification system from the bottom up by repeatedly merging verbs with similar distributions into one type.

Secondly, how can we validate the classification system in a corpus? Although cross-validation and hierarchical clustering are applied to guarantee the reliability of the situation type classification, statistical validation in a corpus is still needed. Multinomial logistic regression is used to test the dataset of verbs’ situation types based on verbs’ co-occurrence frequencies with various aspectual markers extracted from the SINICA corpus [19].

Thirdly, is it possible to predict the situation type of a verb in a sentence according to our validated model and annotated dataset of verbs’ situation types based on word embedding vectors? Previous computational studies are not able to conduct experiments based on word embedding vectors because of their rejection of verbs’ situation type. In this paper, word embedding vectors are trained on a combined corpus of SINICA and Chinese Gigaword [20], which are used for the prediction of verbs’ situation type in our dataset.

2. Materials and Methods

Based on the three research questions mentioned above, we designed different experimental procedures to conduct research on them. Each step involves different research methods, and all of these will be introduced in this section.

2.1. The Construction of Situation Type System and Its Dataset

This part mainly involves the construction of the situation type system. After the situation type classification system is established, the words will be annotated with their corresponding situation types so as to construct a gold standard dataset for the identification of lexical aspects.

2.1.1. Classification Criteria for Situation Types

The classification of situation types serves as the foundation to explore the interaction between lexical aspects, as well as aspects in other levels, and aspectual markers, which stand for different viewpoint aspects.

We have selected whether a verb can co-occur with aspectual markers as the basic criterion for classification, and the main reasons are as follows:

Firstly, constructing a situation type system according to semantic features is unreliable. Previous studies, especially linguistics studies, usually classify verbs into the four or five situation types which have been developed in Vendler [4] and other studies based on semantic features including [stative], [dynamic], [durative] and [telic]. These semantic features are extremely abstract and complex, and even scholars who specialize in researching this issue have significant disputes among themselves. For example, there has been an ongoing debate among scholars about what constitutes telic and what the distinction is between telic and boundary. Therefore, it is extremely difficult to determine which situation type a verb belongs to based on these features, and it is also highly subjective.

Secondly, it has widely been accepted that the interaction between verbs and aspectual markers always plays the most important role in depicting verbs’ situation types in linguistics studies [3,6,8,9,10,11,12,13]. Furthermore, it is extremely easy for native speakers to judge whether two words can be collocated. This does not require them to understand and judge complex semantic features; instead, they can make the judgment merely based on their own linguistic intuition. At the same time, it becomes much simpler and more feasible to conduct cross-validation among multiple people.

As a result, we have no intention to annotate the situation type of verbs directly. Only verbs’ abilities to co-occur with different aspectual markers are annotated, which will be used to cluster verbs into different types through hierarchical clustering according to their semantic similarity calculated on the matrix of co-occurrence.

2.1.2. Lexical Selection for Classification

Before proceeding with the selection of words, we first need to determine a basic premise, that is, the lexical aspects are attached to senses of verbs instead of verbs per se. Zhang [7] advises to attach the situation type of a verb to its typical meaning. Regardless of the difficulty in defining which meaning is typical, verbs can have different senses with different situation types.

(3) a.

Chuan1 Pu3	zuo4	zong3tong3
Trump	act as	president
Trump acts as the president.

b.

Li3 Yi4yang2	zheng4zai4	zuo4	zui4hou4	ping2gu1
Li Yiyang	ZHENGZAI	make	last	evaluation
Li Yiyang is making the last evaluation.

The zuo4 (act as) in (3a) is a stative verb, while the zuo4 (make) in (3b) is an activity verb.

Verbs such as posture and position verbs also show different situation types with different meanings, as in (4):

(4) a.

ta1	chuan1	zhe	yi1	jian4	fei1zhou1	da4gua4
he	wear	ZHE	one	CLA	African	gown
He wears an African gown.

b.

xiao3min3	xia4	qu4	chuan1	jian4	wai4tao4
Xiaomin	down	go	wear	CLA	coat
Xiaomin went downstairs to wear a coat.

It should be noted that this phenomenon is different from the so-called shifting nature of a verb’s situation type, which is caused by accumulating all the meanings of the other elements in a sentence on the single verb. These senses are their concept meanings.

As a result, we adopt the Module-Attribute Representation of Verbal Semantics (MARVS) of Huang et al. [21] and hold that the situation types (ST) of a verb are attached to its senses. This is illustrated in Figure 1.

Thus, the annotation of lexical aspects is not applied to verbs but to the senses of verbs.

Firstly, we choose Meng [22] as the source for verb selection due to the following reasons. On the one hand, the verbs in Meng [22] are organized by their senses, which enables our annotation to be conducted on the meanings of verbs following the first hypothesis. On the other hand, whether a sense of a verb is able to co-occur with different aspectual markers is annotated in Meng [22]. Most importantly, Guo [10] has taken the words and their meanings in this book as the items for annotation and carried out the annotation of the relationships between these items and aspectual markers. Therefore, cross-validation among these annotation sets can be conducted.

Then, we annotated whether the selected verbs and their senses could co-occur with aspectual markers. In this way, we have three sets of annotation data: those from Meng [22], those from Guo [10], and our own.

Finally, we conducted cross-validation among the three sets of data. Instead of using the method of consistency testing, we adopted a more stringent criterion; that is, we only selected the data for which the co-occurrence situations were completely identical among the three sets, and all the inconsistent data were deleted. In this way, it was ensured that the data used for the subsequent experiments were completely consistent.
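The strict agreement criterion described above can be sketched as follows. This is a minimal illustration rather than the authors' script: the function name `keep_unanimous`, the sense identifiers, and all annotation values are hypothetical placeholders, and each annotation is assumed to be a 0/1 vector over the six aspectual markers.

```python
# Marker order used for every annotation vector (1 = can co-occur, 0 = cannot).
MARKERS = ("ZHE", "ZAI", "ZHENGZAI", "LE1", "LE2", "GUO")


def keep_unanimous(*annotation_sets):
    """Return only the senses that all annotation sets label identically."""
    first, *rest = annotation_sets
    unanimous = {}
    for sense, vector in first.items():
        # A sense survives only if every other set contains it with the same vector.
        if all(other.get(sense) == vector for other in rest):
            unanimous[sense] = vector
    return unanimous


# Hypothetical toy annotations; sense IDs and values are placeholders, not real data.
meng = {"sense_a": (1, 1, 1, 1, 1, 1), "sense_b": (0, 0, 0, 0, 1, 0), "sense_c": (0, 0, 0, 1, 1, 1)}
guo = {"sense_a": (1, 1, 1, 1, 1, 1), "sense_b": (0, 0, 0, 0, 1, 1), "sense_c": (0, 0, 0, 1, 1, 1)}
ours = {"sense_a": (1, 1, 1, 1, 1, 1), "sense_b": (0, 0, 0, 0, 1, 0), "sense_c": (0, 0, 0, 1, 1, 1)}

consistent = keep_unanimous(meng, guo, ours)
print(sorted(consistent))  # sense_b is dropped because one set disagrees on GUO
```

Requiring identical vectors is stricter than reporting an agreement coefficient: any single disagreement on any marker removes the sense from the dataset.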

2.1.3. Semantic Distance Visualization and Hierarchical Clustering

After completing the first step, we successfully constructed a matrix of the co-occurrence between verbs (in fact, the senses of verbs) and various aspectual markers. That is, taking verbs as the index column in the first column, taking each aspectual marker as the header in the first row, and filling each cell in the middle with a specific value, indicating whether they can be collocated (generally, 0 represents that they cannot be collocated, and 1 represents that they can be collocated). Based on this matrix, we completed the construction of the situation type system in three steps.

First, combine the words with exactly the same co-occurrence situations. Verbs with exactly the same co-occurrence situations indicate that their syntactic and semantic distributions in terms of aspect are completely the same. Therefore, they can be represented by the same representative word, and their co-occurrence situations can also be combined and represented by the same multi-dimensional vector. The vector here can be simply regarded as a row in the co-occurrence matrix table mentioned above. A column represents one dimension, so the number of columns determines the number of different dimensions. There are six dimensions representing the six aspectual markers, respectively.
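The merging step can be illustrated with a short sketch. It assumes the matrix is held as a mapping from sense identifiers to six-dimensional 0/1 vectors; the helper `merge_identical_rows`, the sense identifiers, and the choice of the first member as the representative word are assumptions made for illustration only.

```python
from collections import defaultdict


def merge_identical_rows(matrix):
    """Group senses whose co-occurrence vectors are exactly the same."""
    groups = defaultdict(list)
    for sense, vector in matrix.items():
        groups[tuple(vector)].append(sense)
    # Assumption: the first member of each group serves as its representative word.
    return {members[0]: {"vector": vector, "members": members}
            for vector, members in groups.items()}


# Hypothetical rows (columns: ZHE, ZAI, ZHENGZAI, LE1, LE2, GUO); not real data.
matrix = {
    "sense_a": (1, 1, 1, 1, 1, 1),
    "sense_b": (0, 0, 0, 1, 1, 1),
    "sense_c": (1, 1, 1, 1, 1, 1),
    "sense_d": (0, 0, 0, 1, 1, 1),
    "sense_e": (0, 0, 0, 0, 1, 0),
}

merged = merge_identical_rows(matrix)
for representative, info in merged.items():
    print(representative, info["vector"], info["members"])
```

With six binary dimensions there are at most 64 distinct vectors, so this step reduces the matrix to a small number of representative rows.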

Then, visualize the co-occurrence matrix to observe the clustering tendency and semantic distance among each representative word. The classification system of lexical aspects has a prototype structure, which is shared by Huang [23], Smith [6], Zhang [7] and Guo [10]. Prototype theory believes that members within a certain category are distributed around the most typical one and other ones become increasingly untypical. The similarity between members of that category is called family resemblance, which is represented by the semantic distance between verbs calculated according to their position in the matrix of co-occurrence in this paper.

However, there are co-occurrences with six different aspectual markers for a verb, which means that a six-dimensional graph should be depicted. This is not possible for human eyes, and thus, we should reduce the dimensions. Dimensionality reduction is a technique used in data analysis and machine learning. The main idea is to simplify data without losing too much important information. Imagine you have a lot of data that have many features, like a long list of numbers that describe something. These features might be related to each other, or some of them might not be very useful. Dimensionality reduction helps to find a way to represent the data with fewer features but still keep most of the important information. This makes the data easier to understand, analyze, and most importantly, to be projected to lower dimensions, thus making it possible to observe.

We choose Principal Component Analysis (PCA) for the task as it is one of the most commonly used methods for dimensionality reduction. It is a statistical method designed to simplify complex datasets by identifying the most significant patterns of variation. It works by transforming a large set of correlated variables into a smaller set of uncorrelated variables called principal components (PCs). These PCs are linear combinations of the original variables, ordered by their ability to explain variance in the data. The first PC captures the direction of maximum variability, the second PC accounts for the next highest variance, and so on. By retaining only the top few PCs, researchers can reduce data dimensionality while preserving critical information.

Principal Component Analysis (PCA) was implemented with the following configurations using the scikit-learn library (v1.3.0) in Python:

Data preprocessing: Since the co-occurrence matrix consists entirely of dummy variables (i.e., yes/no or 0/1 variables), no standardization was applied, so as to avoid distorting their categorical nature. Redundant variables with perfect collinearity were removed.
n_components: The number of principal components was determined dynamically by retaining ≥80% cumulative explained variance (code: n_components = 0.80). Given the low original dimensionality (six features), this threshold balances dimensionality reduction and information preservation.
svd_solver: The exact solver (svd_solver = 'full') was selected to ensure numerical stability for low-dimensional data.
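The configuration listed above corresponds roughly to the following scikit-learn call. This is a sketch under stated assumptions: the matrix `X` is a hypothetical toy example, not the paper's data, and only the two parameters named in the text are set.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical binary rows (columns: ZHE, ZAI, ZHENGZAI, LE1, LE2, GUO); not real data.
X = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
], dtype=float)

# A float n_components keeps the fewest components that reach that share of
# explained variance; this mode works with svd_solver="full".
pca = PCA(n_components=0.80, svd_solver="full")
coords = pca.fit_transform(X)  # PCA centers the data but does not rescale it

cumulative = np.cumsum(pca.explained_variance_ratio_)
print(pca.n_components_, cumulative)
```

Because PCA only centers the columns, the 0/1 coding is left unscaled, which matches the decision not to standardize. The first two or three columns of `coords` can then be passed to a plotting library for visual inspection.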

Then, the dimensionality-reduced data are plotted in two or three dimensions using matplotlib in Python to visualize the spatial distribution and clustering tendencies of words, preparing for subsequent hierarchical clustering.

Hierarchical clustering is a robust unsupervised learning technique that constructs a hierarchy of clusters by either merging (agglomerative) or splitting (divisive) data points based on their similarity. This method is particularly advantageous for datasets with inherent hierarchical structures, as it provides a comprehensive dendrogram that visually represents the relationships between clusters. For our analysis, we applied hierarchical clustering to a dataset comprising over 1000 entries, each characterized by six dimensions. The choice of hierarchical clustering was driven by its ability to handle multi-dimensional data effectively while preserving the natural grouping tendencies of the data. Unlike partitioning methods such as k-means, hierarchical clustering does not require predefining the number of clusters, making it more flexible and interpretable. Additionally, the dendrogram offers valuable insights into the clustering process, enabling us to determine the optimal number of clusters based on the data’s intrinsic properties. This approach is particularly suitable for exploratory analysis as it facilitates a deeper understanding of the data’s structure and relationships.

Hierarchical clustering was performed using the scikit-learn library (v1.3.0) in Python. n_clusters and linkage parameters are set according to the Cophenetic Correlation Coefficient (CPCC).
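A CPCC-guided choice of linkage can be sketched as follows. Note the substitutions: this example uses SciPy rather than scikit-learn, because SciPy exposes the cophenetic correlation directly. The Hamming metric, the three candidate linkage methods, the cut into three clusters, and the toy matrix are all assumptions for illustration, not settings reported in the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical binary rows (columns: ZHE, ZAI, ZHENGZAI, LE1, LE2, GUO); not real data.
X = np.array([
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
])

# Condensed pairwise distances; Hamming is an assumed choice for 0/1 vectors.
distances = pdist(X, metric="hamming")

# Score each candidate linkage by how well its dendrogram preserves the distances.
scores = {}
for method in ("single", "complete", "average"):
    Z = linkage(distances, method=method)
    cpcc, _ = cophenet(Z, distances)
    scores[method] = cpcc

best_method = max(scores, key=scores.get)

# Cut the best dendrogram into at most three clusters (an assumed cluster count).
Z_best = linkage(distances, method=best_method)
labels = fcluster(Z_best, t=3, criterion="maxclust")
print(best_method, scores[best_method], labels)
```

The same loop can be extended to compare different cluster counts once a linkage method has been fixed.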

The Cophenetic Correlation Coefficient (CPCC) is preferred for hierarchical clustering on our dataset (binary co-occurrence matrices, six dimensions and 1000–2000 observations) because it quantifies how well the dendrogram preserves the original pairwise distances. CPCC calculates the Pearson correlation between the original distance matrix (D) and the cophenetic distance matrix (C), reflecting the fidelity of hierarchical structure reconstruction. For sparse binary data, silhouette coefficients are less reliable due to assumptions of dense, continuous clusters, while CPCC directly evaluates hierarchical relationships without requiring predefined clusters. Additionally, CPCC avoids the $O(n^{2})$ computational burden of silhouette scores, making it efficient for moderate-sized datasets. This ensures robust validation of term association hierarchies inherent in co-occurrence matrices.

The Cophenetic Correlation Coefficient (CPCC) is defined as follows:

$$CPCC = \frac{\sum_{i < j} (d_{i j} - \bar{D}) (c_{i j} - \bar{C})}{\sqrt{\sum_{i < j} (d_{i j} - \bar{D})^{2} \sum_{i < j} (c_{i j} - \bar{C})^{2}}}$$

where

$d_{i j}$ : Pairwise distance between observations i and j in the original space
$c_{i j}$ : Cophenetic distance (dendrogram height where i and j merge)
$\bar{D}, \bar{C}$ : Mean of all $d_{i j}$ and $c_{i j}$ values, respectively
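The definition can be checked on a small worked example. The sketch below implements the formula directly in plain Python; the function name `cpcc` and the three-point example are illustrative assumptions. For three observations with original distances 1, 4 and 5, single linkage merges the first pair at height 1 and then joins the third observation at height 4, so the cophenetic distances are 1, 4 and 4.

```python
import math


def cpcc(d, c):
    """Pearson correlation between original (d) and cophenetic (c) distances."""
    if len(d) != len(c) or len(d) < 2:
        raise ValueError("d and c must have the same length (at least 2 pairs)")
    mean_d = sum(d) / len(d)
    mean_c = sum(c) / len(c)
    numerator = sum((di - mean_d) * (ci - mean_c) for di, ci in zip(d, c))
    denominator = math.sqrt(
        sum((di - mean_d) ** 2 for di in d) * sum((ci - mean_c) ** 2 for ci in c)
    )
    return numerator / denominator


# Pairs are listed in the order (1,2), (1,3), (2,3).
d = [1.0, 4.0, 5.0]  # original pairwise distances
c = [1.0, 4.0, 4.0]  # dendrogram heights at which each pair is merged

score = cpcc(d, c)
print(round(score, 4))  # 7 / sqrt(52), approximately 0.9707
```

A value close to 1 indicates that the dendrogram distorts the original distances only slightly; here the only distortion is that the pair (2,3) is merged at height 4 instead of 5.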

2.2. Statistical Validation of Classification Model

The best way to verify a model is to apply it to a corpus containing real sentences produced naturally by speakers. Co-occurrence ability can be approximated by frequency: it is reasonable to expect a high frequency for more grammatical collocations of verbs and aspectual markers and a low frequency for less grammatical ones. In this section, the dataset of verbs with their situation types is used to train a multinomial logistic regression classifier based on the co-occurrence information of verbs and aspectual markers extracted from the SINICA corpus.
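For reference, the standard multinomial logistic (softmax) model behind such a classifier can be written as follows. The symbols are assumptions introduced here rather than notation from the paper: $\mathbf{x}$ stands for a verb's vector of co-occurrence frequencies with the aspectual markers, $K$ for the number of situation types, and $\mathbf{w}_{k}$ and $b_{k}$ for the weight vector and bias of type $k$.

$$P(y = k \mid \mathbf{x}) = \frac{\exp(\mathbf{w}_{k}^{\top} \mathbf{x} + b_{k})}{\sum_{j = 1}^{K} \exp(\mathbf{w}_{j}^{\top} \mathbf{x} + b_{j})}, \qquad \hat{y} = \arg\max_{k} P(y = k \mid \mathbf{x})$$

Under this formulation, the classification system is supported by the corpus to the extent that the predicted type $\hat{y}$ agrees with the annotated situation type on held-out verbs.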

2.2.1. Corpus Selection and Sense Mapping

We chose the SINICA Corpus for two main reasons. Firstly, the SINICA Corpus is of exceptionally high quality. Its word segmentation and part-of-speech tagging have been manually verified, and although its scale is not extensive, it contains very few errors. Secondly, and most importantly, the rich part-of-speech annotations in the SINICA Corpus allow for the identification of the primary senses of verbs.

One of the reasons why some linguistic and computational studies ignore or even deny the existence of lexical aspects is the difficulty of annotating verb senses. The smallest unit in a corpus is a word rather than a sense, which makes it impossible to study the interaction between verb senses and aspectual markers.

Fortunately, the SINICA corpus contains abundant tags representing different senses of a verb, which facilitates the identification of their situation types. For example, zuo4wei2 (serve as) is tagged as VG, which means a classification verb. It is clear that all verbs with VG can be treated as state verbs. As a result, it is possible to identify verb senses and thus match situation types with them by their tags in the SINICA corpus. The following procedures are conducted to map situation types to the verbs with different tags.

Firstly, all of the verbs, together with their tags, are extracted from the SINICA corpus and intersected with the verb list in our constructed dataset based on Meng [22].

The next step is to match the meaning of each verb–tag pair in the intersection with the corresponding sense in Meng [22]. Not all senses have a situation type that is clear enough to match directly, and some have to be checked in context; bi-grams and sentences containing the verbs in the intersection are therefore extracted as well to confirm their situation types. For example, chu2 has two tags, VC (active transitive verb) and VJ (stative transitive verb). By checking their bi-grams, chu2-VC is identified as “get rid of” while chu2-VJ is identified as “divided by”. The former has the situation type of transition, while the latter has the situation type of activity.

2.2.2. Multinomial Logistic Regression

After the sense mapping is completed, the co-occurrence frequency matrix is constructed with the word–POS pairs as row keys and the six aspectual markers as column headers. Each cell holds the co-occurrence frequency of a word and an aspectual marker in the SINICA corpus. This matrix is the input for statistical testing. We also applied min-max scaling for standardization, but it did not improve the final results, so the raw frequencies were ultimately used for the multinomial logistic regression testing.
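The construction of such a matrix can be sketched as follows. This is only an illustration: the function names, the key format `verb_TAG`, and the observations are hypothetical, and the extraction of (verb, marker) pairs from the corpus is assumed to have been done beforehand.

```python
from collections import Counter

ASPECT_MARKERS = ["ZHE", "LE1", "LE2", "GUO", "ZAI", "ZHENGZAI"]


def build_cooccurrence(pairs, markers=ASPECT_MARKERS):
    """Map each word-POS key to its raw co-occurrence counts, one per marker."""
    counts = Counter(pairs)  # missing (key, marker) pairs count as 0
    keys = sorted({key for key, _ in counts})
    return {key: [counts[(key, m)] for m in markers] for key in keys}


def min_max_scale(matrix):
    """Column-wise min-max scaling to [0, 1]; constant columns map to 0."""
    keys = list(matrix)
    scaled_columns = []
    for column in zip(*matrix.values()):
        low, span = min(column), max(column) - min(column)
        scaled_columns.append([(v - low) / span if span else 0.0 for v in column])
    return {key: [col[i] for col in scaled_columns] for i, key in enumerate(keys)}


# Hypothetical observations: (word-POS key, aspectual marker) per corpus hit.
observations = [
    ("chu2_VC", "LE1"), ("chu2_VC", "LE1"), ("chu2_VC", "GUO"),
    ("zuo4wei2_VG", "ZHE"),
]
matrix = build_cooccurrence(observations)
scaled = min_max_scale(matrix)
print(matrix)
print(scaled)
```

Keeping the raw counts and the scaled variant side by side makes it straightforward to compare both inputs in the regression, as was done in this study.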

Multinomial logistic regression (MLR) is a statistical technique used to model the relationship between a categorical dependent variable with more than two levels and one or more independent variables. Unlike binary logistic regression, which predicts dichotomous outcomes, MLR extends the framework to handle multi-class scenarios by estimating the probability of each category relative to a reference category.

In this study, multinomial logistic regression (MLR) is employed for the statistical analysis of situation types due to its suitability for modeling categorical dependent variables with multiple unordered levels. Situation types, as the outcome variable, typically represent distinct categories (e.g., states, activities, accomplishments, achievements, etc.), which are inherently qualitative and cannot be meaningfully ordered. MLR is specifically designed for such scenarios, as it allows for the estimation of the probability of each situation type relative to a reference category while accounting for the influence of predictor variables (e.g., aspectual markers, lexical features, or contextual factors).

The use of MLR is justified because it provides a robust framework to examine how independent variables affect the likelihood of different situation types occurring. For instance, it can reveal whether specific aspectual markers or linguistic features are more strongly associated with certain situation types. Additionally, MLR does not assume an ordinal relationship among the categories, making it ideal for analyzing situation types, which are often nominal in nature. By leveraging MLR, this study can systematically explore the complex interplay between linguistic predictors and situation types, offering nuanced insights into their statistical associations.
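For concreteness, the standard form of the model can be written out for the three situation types, with activity as the reference category as in this study. The notation is introduced here for illustration only: $\mathbf{x}$ is the vector of predictors for a verb sense, and $\beta_{k0}$ and $\boldsymbol{\beta}_{k}$ are the intercept and coefficient vector of category $k$.

$$P(y = k \mid \mathbf{x}) = \frac{\exp(\beta_{k0} + \boldsymbol{\beta}_{k}^{\top} \mathbf{x})}{1 + \sum_{j \in \{\text{state},\, \text{transition}\}} \exp(\beta_{j0} + \boldsymbol{\beta}_{j}^{\top} \mathbf{x})}, \quad k \in \{\text{state}, \text{transition}\}$$

$$P(y = \text{activity} \mid \mathbf{x}) = \frac{1}{1 + \sum_{j \in \{\text{state},\, \text{transition}\}} \exp(\beta_{j0} + \boldsymbol{\beta}_{j}^{\top} \mathbf{x})}$$

Equivalently, $\ln \frac{P(y = k \mid \mathbf{x})}{P(y = \text{activity} \mid \mathbf{x})} = \beta_{k0} + \boldsymbol{\beta}_{k}^{\top} \mathbf{x}$, so each coefficient describes how a predictor shifts the log-odds of a situation type relative to activity.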

For MLR, we set the situation type of activity as the reference category. Multicollinearity among the predictors is tested with correlation coefficients, and because the appropriate coefficient depends on the distribution of the data, a normality test is also needed.

The reason for applying multicollinearity testing is that multicollinearity can occur when independent variables are highly correlated with each other, which can lead to several issues in the regression model, including the following:

Unstable Coefficient Estimates: High multicollinearity can cause large variances in the estimated coefficients, making them unreliable and difficult to interpret.
Reduced Statistical Power: It can inflate the standard errors of the coefficients, reducing the likelihood of detecting statistically significant relationships.
Interpretation Challenges: Multicollinearity can make it difficult to assess the individual contribution of each predictor to the model.

The normality test is needed because correlation tests differ in their assumptions: some, such as the Pearson correlation test, require normally distributed data, while others, such as the Spearman correlation test, do not. Owing to the small size of our dataset, we chose the Shapiro–Wilk test for this purpose.

For correlation analysis, Spearman’s or Pearson’s tests were applied based on the data distribution, as assessed by the Shapiro–Wilk test ($\alpha = 0.05$). Statistical analyses were performed using Python’s scipy.stats module.
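The decision rule described above can be sketched with `scipy.stats` as follows. The helper name and the two frequency vectors are hypothetical and serve only to show the control flow; they are not the frequencies extracted from the corpus.

```python
from scipy import stats


def correlation_with_normality_check(x, y, alpha=0.05):
    """Use Pearson if both samples pass Shapiro-Wilk at level alpha, else Spearman."""
    _, p_x = stats.shapiro(x)
    _, p_y = stats.shapiro(y)
    if p_x > alpha and p_y > alpha:
        r, p = stats.pearsonr(x, y)
        return "pearson", float(r), float(p)
    rho, p = stats.spearmanr(x, y)
    return "spearman", float(rho), float(p)


# Hypothetical, heavily skewed co-occurrence frequencies of two markers.
freq_a = [0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 50, 0, 0, 1, 0, 0, 0, 120]
freq_b = [0, 1, 0, 0, 3, 0, 0, 0, 0, 0, 40, 0, 0, 0, 2, 0, 0, 90, 0, 0]

method, coefficient, p_value = correlation_with_normality_check(freq_a, freq_b)
print(method, coefficient, p_value)
```

Frequency counts of this kind are dominated by zeros with a few large values, so the normality test rejects and the rank-based Spearman coefficient is selected.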

2.3. Prediction of Lexical Aspects

Quite a few studies have tried to predict situation types and have achieved increasingly better performance. Most of them, however, only pay attention to situation types at the sentence level. Li et al. [15], Cao et al. [16], Zarcone and Lenci [17], Xu [18], and Siegel [24] all reject the compositional and recursive nature of aspectual information, which means that verbs cannot have their own situation types and have to change their so-called aspectual values according to the whole phrase or sentence. For example, the situation type of run has to be shifted to transition in a sentence like he ran to the store. However, transition is the situation type of the whole sentence rather than of the verb per se. As a result, their experiments actually predict the situation type of a sentence, which is much easier than prediction at the lexical level because some indicators, such as LE2, are decisive for sentence aspect but not for lexical aspect.

(5)

ta1	zhi1dao4	le
he	know	LE2
He has known it.

In (5), LE2 per se decides that the situation type of the whole sentence is a transition or achievement. It is very common for the prediction of situation types of sentences to find a single decisive feature, which makes the accuracy and recall very high. However, this is not the case for lexical aspects. LE2 alone is not able to predict the situation type of zhi1dao4 (know), which is a state verb. In fact, the prediction of the situation type of verbs is a much harder task.

Although the experiments in this section also use classification, they aim at better prediction performance rather than at validating the model. Accordingly, they use more features than the aspectual markers that were used to cluster verbs or their senses into situation types. In addition, word embedding vectors, as unsupervised information, are applied to predict lexical aspects, which has not been done in previous studies, in order to test whether large language models can capture complex semantic features such as situation types.

2.3.1. Prediction Based on Linguistic Features

All of the aspectual markers, including ZHE, LE1, LE2, GUO, ZAI and ZHENGZAI, are used for prediction. The SINICA corpus annotates these aspectual markers as Di or D, which makes it possible to extract their co-occurrence frequencies with the annotated verbs. ASP is used to refer to aspectual markers in what follows.

To improve prediction performance, other linguistic features, such as verb reduplication frequency and syllable length, are also introduced.

Smith [6], Dai [9], and Xiao and McEnery [13] have noticed the importance of verb reduplication and treat it as a viewpoint aspect, similar to other aspectual markers. VV is used as the abbreviation of verb reduplication in what follows.

For syllable length, He [25] summarizes its relationship with lexical aspects, although not in a fully reliable way. We also include it among the linguistic features for the prediction experiment. SYLL is used as the abbreviation of syllable length.

In addition, all the part-of-speech (POS) tagging information in the SINICA corpus is included. POS is used as its abbreviation.

Three classifiers, namely multinomial logistic regression (MNLogit), support vector machine (SVM), and artificial neural network (ANN), are chosen to conduct the classification experiment. The classifiers were implemented using the scikit-learn library (v1.3.0) in Python, and their configurations are described below.

Classifier Parameters

Support Vector Machine (SVM): The penalty value C was tuned within the range of 0 to 100 and ultimately set to 5. The kernel method was selected as the radial basis function (RBF).
Artificial Neural Network (ANN): A five-layer sequential neural network model was constructed, with the number of neurons in each layer set to 6, 180, 60, 15, and 3, respectively. All layers were fully connected (dense) layers, and the activation function was a rectified linear unit (ReLU). A default dropout rate of 0.5 was applied to prevent overfitting.
Multinomial Logistic Regression (MNLogit): The default configuration provided by the sklearn package was utilized for the multinomial logistic regression model.

In order to raise the reliability, ten-fold cross-validation is implemented for each experiment.
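The evaluation protocol for the SVM configuration above (RBF kernel, C = 5, ten-fold cross-validation) can be sketched with scikit-learn as follows. The feature matrix is synthetic, and the use of stratified, shuffled folds and of a majority-class dummy classifier as the baseline are assumptions of this sketch rather than details reported here.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic count features for three situation types (not the study's data).
rng = np.random.default_rng(0)
sizes = {"state": 30, "activity": 60, "transition": 30}
rates = {"state": 1.0, "activity": 4.0, "transition": 8.0}
X = np.vstack(
    [rng.poisson(rates[k], size=(n, 6)) for k, n in sizes.items()]
).astype(float)
y = np.repeat(list(sizes), list(sizes.values()))

# Ten folds; stratification and shuffling are assumptions of this sketch.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

svm = SVC(C=5, kernel="rbf")
svm_acc = cross_val_score(svm, X, y, cv=cv, scoring="accuracy").mean()

# Majority-class baseline: every item is assigned to the most frequent type.
baseline = DummyClassifier(strategy="most_frequent")
base_acc = cross_val_score(baseline, X, y, cv=cv, scoring="accuracy").mean()

print(f"SVM accuracy: {svm_acc:.3f}, baseline accuracy: {base_acc:.3f}")
```

Reporting the mean accuracy over the ten folds next to a majority-class baseline gives the same kind of comparison as the one used in the results section.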

2.3.2. Prediction Based on Word Embeddings

In this section, word embedding vectors are used as features to predict lexical aspects, which has rarely been conducted in previous studies.

The word embedding vectors are word2vec vectors trained on a combined corpus of Chinese Gigaword and SINICA, with a size of more than 1.12 billion words.

We conducted a more thorough examination of the Gigaword corpus used for training, including checks for segmentation errors, part-of-speech tagging errors, and other inconsistencies, to ensure its quality aligns with that of the SINICA corpus.

Word2Vec Training

Word2Vec models were trained using the Gensim library in Python to generate word embeddings of three different dimensions: 300, 500, and 800. The default parameters were applied for training, which include the following settings:

Algorithm: Skip-gram (default in Gensim).
Window size: 5 (the maximum distance between the current and predicted word within a sentence).
Minimum word count: 5 (words with a frequency lower than this threshold are ignored).
Negative sampling: 5 (number of negative samples to be drawn for each positive sample).
Subsampling of frequent words: Enabled with a threshold of $10^{- 5}$ .
Learning rate: 0.025 (initial learning rate for the stochastic gradient descent optimizer).
Epochs: 5 (number of iterations over the training corpus).
Random seed: 1 (for reproducibility).

The trained embeddings were then utilized for downstream tasks in the study.
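The settings listed above can be collected in one place as follows. This is a sketch that assumes the keyword names of Gensim 4.x (`vector_size`, `epochs`); the helper functions are hypothetical. All values are passed explicitly so that the configuration does not depend on the defaults of a particular Gensim release.

```python
def word2vec_kwargs(vector_size):
    """Keyword arguments mirroring the settings listed in the text."""
    return dict(
        vector_size=vector_size,  # 300, 500 or 800
        sg=1,            # skip-gram
        window=5,        # maximum distance between current and predicted word
        min_count=5,     # ignore words with a lower frequency
        negative=5,      # negative samples per positive sample
        sample=1e-5,     # subsampling threshold for frequent words
        alpha=0.025,     # initial learning rate
        epochs=5,        # iterations over the training corpus
        seed=1,          # random seed
    )


def train_word2vec(sentences, vector_size):
    """Train one model; `sentences` is an iterable of token lists."""
    from gensim.models import Word2Vec  # imported lazily; requires Gensim 4.x
    return Word2Vec(sentences=sentences, **word2vec_kwargs(vector_size))


# One configuration per embedding dimension used in the study.
configs = {dim: word2vec_kwargs(dim) for dim in (300, 500, 800)}
print(configs[300])
```

Note that a fixed seed alone may not make a run fully deterministic; the Gensim documentation additionally recommends training with a single worker thread for that purpose.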

The same classifiers as in the previous section are used for the classification task, with minor changes to their configurations. They were implemented using the scikit-learn library in Python, and their configurations are described below.

Classifier Parameters

While maintaining consistency with the established parameters, this section specifically modifies the ANN topology: The architecture adopts a six-tiered structure with progressive neuron scaling (6→180→60→15→11→3). This refined configuration supersedes previous definitions in the parameter tables.

Also, ten-fold cross-validation is implemented for each experiment.

3. Results and Discussion

3.1. Semantic Distance Visualization and Hierarchical Clustering

After performing cross-validation across the three datasets and taking the intersection of consistent annotation results, the number of verb senses was determined to be 1610.

Verb senses that have the same distribution in terms of their ability to co-occur with the different aspectual markers have the same situation type, and they are therefore merged before the hierarchical clustering. A typical word is picked as the representative of each group: type shi4 (be), type ren4shi2 (know), type xi3huan1 (like), type xin4ren4 (trust), type ai4 (love), type gong1zuo4 (work), type chi1 (eat), type chan3sheng1 (generate), type li2kai1 (leave) and type si3 (die).

We use PCA (Principal Component Analysis) to reduce the dimensions; the explained variance is shown in Figure 2.
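An explained-variance curve of this kind, together with three-dimensional coordinates for plotting, can be obtained as in the following sketch. It assumes scikit-learn's `PCA` class and uses a random binary matrix as a stand-in for the ten representative co-occurrence profiles.

```python
import numpy as np
from sklearn.decomposition import PCA

# Random binary stand-in: 10 representative types x 6 aspectual markers.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(10, 6)).astype(float)

# Cumulative share of variance explained by the first k components.
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)
for k, share in enumerate(cumulative, start=1):
    print(f"{k} component(s): {share:.2%} of variance explained")

# Coordinates in the first three components, e.g. for a 3D scatter plot.
coords_3d = PCA(n_components=3).fit_transform(X)
```

The number of components to keep for visualization is then read off the cumulative curve.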

As the number of components increases, the proportion of explained variance keeps rising, with two components achieving 87% and three components reaching 94%, which makes it reasonable to visualize the semantic distance between verb senses in a three-dimensional coordinate system or vector space. The coordinate system is shown in Figure 3.

Figure 3 shows that several types are not distant from each other. Type xi3huan1 (like), type ai4 (love), type xin4ren4 (trust) and type ren4shi2 (know) are close; type chi1 (eat) and type gong1zuo4 (work) are next to each other; type chan3sheng1 (generate), type li2kai1 (leave) and type si3 (die) are also not far from each other. Type si3 (die) seems to be isolated, but this is an effect of the projection angle. Type shi4 (be), although it only contains two verbs, is separated from the other groups.

With a CPCC of 0.77, the centroid linkage method is applied to generate the dendrogram in Figure 4, and the clustering result is consistent with the three-dimensional graph.

Type ai4 (love), xin4ren4 (trust), xi3huan1 (like) and ren4shi2 (know) are merged into the first group from the right. Type ai4 (love) and xin4ren4 (trust) are further merged into one branch, and type xi3huan1 (like) and ren4shi2 (know) form another branch in this group. The verb senses of this type express mental states or mental activities, so this group is called state.
Type chi1 (eat) and type gong1zuo4 (work), which are typical activity verbs, are grouped together. We assign the situation type of activity to this group.
Type si3 (die), li2kai1 (leave) and chan3sheng1 (generate), which express a transition or change in state, are clustered into one group. As a result, this group is named transition.
Type shi4 (be), which represents a kind of attribute, is isolated as an individual group, which includes verbs such as shi4 (be) and deng3yu2 (equal to). Because of the small sample size, these verbs are merged into the first situation type, state.

By clustering verb senses into distinct situation types, we can assign corresponding tags to each verb and its senses, as illustrated in Table 2. Among the 1610 verb senses analyzed, activity-related senses dominate, representing over 60% of the total. In contrast, transition senses are the least frequent, while state senses fall somewhere in between these two extremes.

3.2. Statistical Validation of Classification Model

After the verb senses and their situation types have been matched to the verb–tag pairs in the SINICA corpus, the reliability of the dataset can be validated on the corpus using multinomial logistic regression analysis.

It is obvious that the total number of items in Table 3 is smaller than that in Table 2 due to several reasons. On the one hand, not all verbs can be found in the SINICA corpus and it is the same case for verbs’ senses; on the other hand, some of the senses in Meng [22] can be mapped into one item in SINICA without changing their situation types.

A multicollinearity test should be conducted before the regression analysis, since multicollinearity may distort the model. The test is implemented through correlation analysis: if the data follow a normal distribution, Pearson correlation analysis is appropriate; if not, Spearman correlation analysis should be used.

We use the Shapiro–Wilk test to check whether our data follow a normal distribution [26].

Table 4 shows that none of these co-occurrence frequencies follow a normal distribution, and thus, Spearman correlation analysis should be used to test the multicollinearity.

The multicollinearity test results confirm that all co-occurrence frequencies are suitable for logistic regression analysis. The largest off-diagonal correlation coefficient in Table 5 is 0.18. Although several of the corresponding p-values in Table 6 approach zero, the coefficients themselves are uniformly small, so no pair of markers is strongly correlated. The co-occurrence frequencies can therefore be used as separate predictors in the subsequent analysis.

The LLR p-value of 0.0000 in Table 7 supports the reliability of the dataset and indicates that our manual annotation of situation types for verbs (senses) can be used directly in linguistic and computational studies.
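The likelihood-ratio statistic behind this p-value can be recomputed from the two log-likelihoods in Table 7, as in the sketch below. The degrees of freedom are an assumption of this sketch: with three situation types and six aspectual markers, the fitted model is taken to have (3 − 1) × 6 = 12 more parameters than an intercept-only null model.

```python
from scipy.stats import chi2

ll_model = -1118.7  # log likelihood of the fitted model (Table 7)
ll_null = -1220.8   # log likelihood of the null model (Table 7)

# Likelihood-ratio statistic: twice the gain in log likelihood.
llr = 2 * (ll_model - ll_null)

# Assumed degrees of freedom: (3 - 1) non-reference categories x 6 markers.
df = (3 - 1) * 6

# Upper-tail probability of the chi-square distribution.
p_value = chi2.sf(llr, df)
print(f"LLR = {llr:.1f}, df = {df}, p = {p_value:.3g}")
```

Under this assumption the statistic is about 204.2, far in the upper tail of a chi-square distribution with 12 degrees of freedom, which is consistent with the p-value being reported as 0.0000 after rounding.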

3.3. Prediction of Lexical Aspects

3.3.1. Prediction Based on Linguistic Features

Table 8 and Table 9 show the performances of different classifiers and the detailed result of the best classifier.

We achieve an improvement of about 17 percentage points over the baseline, which is obtained by assigning all of the verbs to the situation type of activity. Table 9 shows that activity verbs have the highest recall and F1.
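The baseline and the size of the improvement follow directly from the class counts in Table 3 and the best accuracy in Table 8, as the following short calculation shows. The variable names are chosen here for illustration.

```python
# Class counts from Table 3 and the best accuracy from Table 8.
counts = {"state": 386, "activity": 634, "transition": 200}
best_accuracy = 0.6844  # MNLogit with ASP+POS+VV+SYLL

total = sum(counts.values())
# Majority-class baseline: every verb sense is labeled as activity.
baseline = max(counts.values()) / total
gain = best_accuracy - baseline

print(f"baseline = {baseline:.2%}, gain = {gain * 100:.2f} percentage points")
```

The baseline is thus 634/1220, or 51.97%, and the best linguistic-feature classifier exceeds it by 16.47 percentage points.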

As far as we know, only Meyer et al. [27] and Falk and Martin [28] conducted experiments aiming to predict lexical aspects. However, the results are not comparable because they are trained and tested on different datasets in different languages with different situation types. They are shown here not to prove that our model is better but to illustrate the difficulty of predicting lexical aspects. While Meyer et al. [27] improve by less than 10% from the baseline in Table 10, Falk and Martin [28] achieve a 19% improvement from the baseline in Table 11.

3.3.2. Prediction Based on Word Embedding Vectors

Table 12 and Table 13 show the performances of different classifiers and the detailed result of the best one for word embedding vectors.

With an improvement of more than 20% over the baseline, word embedding vectors outperform linguistic features, and they have further advantages. Their requirement on corpus quality is much lower. Compared with the SINICA corpus, the Chinese Gigaword corpus is poorly annotated, since no manual check is applied after the automatic annotation. We tried to extract co-occurrence frequencies from Chinese Gigaword, but the performance was even worse than the baseline, which means that only linguistic features extracted from a high-quality corpus can be used to predict lexical aspects, which is a tough task. In contrast, word embedding vectors depend on corpus size rather than annotation quality, so time and resources can be saved by using a large corpus of lower quality.

4. Conclusions

In this paper, we report a study conducted on a carefully constructed dataset of verbs and their situation types, taking into consideration explicit and verifiable distributional linguistic cues. It is argued that lexical aspects select and differentiate verbal senses and not just the prototype argument structure of a verb form. As a result, verbs and their senses can be merged into different situation types based on their family resemblance calculated through their collocational distribution with different aspectual markers. Cross-validation, hierarchical clustering and multinomial logistic regression are applied to ensure the reliability of these procedures.

With the verified dataset, the classification of lexical situation types is conducted using linguistic features and the distributional representation of word embedding vectors, respectively. Word embedding vectors are shown to outperform linguistic features. The best result was an accuracy of 72.05%, about 20 percentage points higher than the baseline and about 4 percentage points higher than the classification based on linguistic features. With verbs’ situation types as the foundation, it will be possible to predict situation types at higher levels such as the phrase and the sentence. Previous studies on the prediction of situation type at the phrase or sentence level are mainly based on structural features.

Our experiments show that a continuous distributed representation of sentence meaning, such as word embedding vectors, works well. As this approach requires less time and fewer resources while giving comparable or better performance, our study supports the applicability of approaches based on continuous distributed representations, such as word embedding vectors, to the processing of Chinese and other languages, and as a versatile way to combine meaning representation with other NLP tasks.

Author Contributions

Conceptualization, H.L.; methodology, H.L.; software, H.L.; validation, H.L. and B.L.; formal analysis, H.L. and B.L.; writing—original draft preparation, H.L.; writing—review and editing, B.L.; supervision, B.L.; project administration, B.L.; funding acquisition, H.L. and B.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Foundation of China (23BYY032 and 21FYYB059). The APC was funded by the National Social Science Foundation of China (21FYYB059).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset can be downloaded at https://github.com/LHongchao/Situation-Types (accessed on 10 March 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

ZHE	Progressive aspectual marker
LE1	Actualization aspectual marker
LE2	Change in state aspectual marker
ZAI	Progressive aspectual marker
ZHENGZAI	Progressive aspectual marker
GUO	Experiential aspectual marker
POS	Part-of-speech
CLA	Classifier
D	Adverb
Da	Quantitative adverb
Dfa	Pre-verbal adverb of degree
Dfb	Post-verbal adverb of degree
Di	Aspectual adverb
Dk	Sentential adverb
SHI	Be
VA	Active intransitive verb
VAC	Active causative verb
VB	Active pseudo-transitive verb
VC	Active transitive verb
VCL	Active verb with a locative object
VD	Ditransitive verb
VE	Active verb with a sentential object
VF	Active verb with a verbal object
VG	Classificatory verb
VH	Stative intransitive verb
VHC	Stative causative verb
VI	Stative pseudo-transitive verb
VJ	Stative transitive verb
VK	Stative verb with a sentential object
VL	Stative verb with a verbal object
V_2	Have

References

1. Costa, F.; Branco, A. Aspectual type and temporal relation classification. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France, 23–27 April 2012; pp. 266–275.
2. Kazantseva, A.; Szpakowicz, S. Summarizing short stories. Comput. Linguist. 2010, 36, 71–109.
3. Smith, C.S. Event types in Mandarin. Linguistics 1990, 28, 309–336.
4. Vendler, Z. Linguistics and Philosophy; Cornell University Press: Ithaca, NY, USA, 1967.
5. Comrie, B. Aspect: An Introduction to the Study of Verbal Aspect and Related Problems. Mod. Lang. Rev. 1979, 74, 1964–1968.
6. Smith, C.S. The Parameter of Aspect; Springer Science & Business Media: Berlin, Germany, 1991.
7. Zhang, L. A Contrastive Study of Aspectuality in German, English, and Chinese. Ph.D. Thesis, University of California, Berkeley, CA, USA, 1993.
8. Chen, P. On the ternary structure of modern Chinese time system. Stud. Chin. Lang. 1988, 6, 401–422.
9. Dai, Y. A Study on the Tense and Aspect System in Mandarin; Zhejiang Education Press: Hangzhou, China, 1997.
10. Guo, R. A systematic analysis verbs representing course of event in Chinese. Stud. Chin. Lang. 1993, 6, 410–419.
11. Tai, J. Verbs and times in Chinese: Vendler’s four categories. In Proceedings of the Para Session on Lexical Semantics, Chicago, IL, USA, 17 October 1984; pp. 75–78.
12. Teng, S. The temporal structure of Verbs in Chinese. Lang. Teach. Linguist. Stud. 1985, 4, 7–17.
13. Xiao, R.; McEnery, T. Aspect in Mandarin Chinese: A Corpus-Based Study; John Benjamins Publishing: Amsterdam, The Netherlands, 2004.
14. Verkuyl, H.J. On the Compositional Nature of the Aspects; Reidel: Dordrecht, The Netherlands, 1972.
15. Li, W.; Wong, K.F.; Cao, G.; Yuan, C. Applying machine learning to Chinese temporal relation resolution. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, 21–26 July 2004; p. 582.
16. Cao, D.; Li, W.; Yuan, C.; Wong, K.F. Automatic Chinese aspectual classification using linguistic indicators. Int. J. Inf. Technol. 2006, 12, 99–109.
17. Zarcone, A.; Lenci, A. Computational Models for Event Type Classification in Context. In Proceedings of the LREC, Marrakech, Morocco, 28–30 May 2008.
18. Xu, H. The Chinese Aspectual System. Ph.D. Thesis, The Hong Kong Polytechnic University, Hong Kong, China, 2015.
19. Chen, K.J.; Huang, C.R.; Chang, L.P.; Hsu, H.L. Sinica corpus: Design methodology for balanced corpora. Language 1996, 167, 176.
20. Ma, W.Y.; Huang, C.R. Uniform and effective tagging of a heterogeneous giga-word corpus. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC2006), Genoa, Italy, 22–28 May 2006; pp. 24–28.
21. Huang, C.R.; Ahrens, K.; Chang, L.L.; Chen, K.J.; Liu, M.C.; Tsai, M.C. The Module-Attribute Representation of Verbal Semantics: From Semantic to Argument Structure. Int. J. Comput. Linguist. 2000, 5, 19–46.
22. Meng, C. Chinese Verb Usage Dictionary; The Commercial Press: Shanghai, China, 1987.
23. Huang, L.M.j. Aspect: A general system and its manifestation in Mandarin Chinese. Ph.D. Thesis, Rice University, Houston, TX, USA, 1987.
24. Siegel, E.V. Linguistic Indicators for Language Understanding: Using Machine Learning Methods to Combine Corpus-Based Indicators for Aspectual Classification of Clauses. Ph.D. Thesis, Columbia University, New York, NY, USA, 1998.
25. He, B. Situation Types and Aspectual Classes of Verbs in Mandarin Chinese. Ph.D. Thesis, The Ohio State University, Columbus, OH, USA, 1992.
26. Shapiro, S.; Wilk, M. An analysis of variance test for normality. Biometrika 1965, 52, 591–611.
27. Meyer, T.; Grisot, C.; Popescu-Belis, A. Detecting narrativity to improve English to French translation of simple past verbs. In Proceedings of the 1st DiscoMT Workshop at ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), Sofia, Bulgaria, 4–9 August 2013.
28. Falk, I.; Martin, F. Automatic Identification of Aspectual Classes across Verbal Readings. In Proceedings of *SEM 2016: The Fifth Joint Conference on Lexical and Computational Semantics, Berlin, Germany, 11–12 August 2016.

Figure 1. Senses and situation types of a verb.

Figure 2. The explained variance ratio of different numbers of PCA components.

Figure 3. Positions of different types of verbs in aspectual vector space.

Figure 4. Hierarchical clustering of different types of verbs.

Table 1. Lexical aspects.

Type	Telic	Durative	Dynamic	Examples
State	−	+	−	know, love
Activity	−	±	+	run, hiccup
Transition	+	±	+	finish, die

Table 2. Distribution of verbs’ situation types in Meng [22].

	State	Activity	Transition	Total
Number	387	978	245	1610

Table 3. Distribution of verbs’ situation types in the SINICA corpus.

	State	Activity	Transition	Total
Number	386	634	200	1220

Table 4. Shapiro–Wilk test on aspectual markers’ co-occurrence frequencies.

Aspectual Markers	W	p-Value
ZHE	0.01118	0.0000
LE1	0.02165	0.0000
LE2	0.1097	0.0000
GUO	0.1049	0.0000
ZAI	0.1014	0.0000
ZHENGZAI	0.0876	0.0000

Table 5. Spearman correlation coefficients between aspectual markers.

Aspectual Markers	ZHE	LE1	LE2	GUO	ZAI	ZHENGZAI
ZHE	1	−0.03636	0.120529	0.127262	0.133219	0.045576
LE1	−0.03636	1	0.027713	0.019172	−0.04071	−0.02603
LE2	0.120529	0.027713	1	0.094726	0.084482	0.010214
GUO	0.127262	0.019172	0.094726	1	0.143605	0.162479
ZAI	0.133219	−0.04071	0.084482	0.143605	1	0.177979
ZHENGZAI	0.045576	−0.02603	0.010214	0.162479	0.177979	1

Table 6. p-values of the coefficients.

Aspectual Markers	ZHE	LE1	LE2	GUO	ZAI	ZHENGZAI
ZHE	0.000000	0.204379	0.000024	0.000008	0.000003	0.111585
LE1	0.204379	0.000000	0.333467	0.503474	0.155270	0.363621
LE2	0.000024	0.333467	0.000000	0.000924	0.003146	0.721551
GUO	0.000008	0.503474	0.000924	0.000000	0.000000	0.000000
ZAI	0.000003	0.155270	0.003146	0.000000	0.000000	0.000000
ZHENGZAI	0.111585	0.363621	0.721551	0.000000	0.000000	0.000000

Table 7. LLR test for the classification model.

Log Likelihood	Log Likelihood Null	LLR p-Value
−1118.7	−1220.8	0.0000

Table 8. The performance of different classifiers based on linguistic features.

Features	MNLogit	SVM	ANN
ASP	57.46%	55.74%	58.36%
ASP+POS	67.54%	62.79%	65.66%
ASP+POS+VV	67.79%	63.28%	67.38%
ASP+POS+VV+SYLL	68.44%	64.43%	67.54%
Baseline	51.97%	51.97%	51.97%

Table 9. Detailed result of the best system based on linguistic features.

Situation Type	Precision	Recall	F1	Number
State	70%	58%	63%	386
Activity	68%	90%	78%	634
Transition	64%	20%	30%	200
Average/total	68%	68%	65%	1220

Table 10. Result in Meyer et al. [27].

Model	Recall	Precision	F1
MaxEnt	71%	72%	71%
CRF	30%	44%	36%
Baseline	64%	64%	64%

Table 11. Result in Falk and Martin [28].

Algorithm	Complete Features	Selected Features
trees.j48	61.80%	63.00%
rules.jrip	63.89%	61.56%
lazy.kstar	62.89%	67.47%
functions.libsvm	62.72%	61.13%
bayes.naivebayes	60.22%	65.80%
Baseline	48.37%	48.37%

Table 12. The performance of different classifiers based on word embedding vectors.

	MNLogit	SVM	ANN
Accuracy	67.54%	72.05%	71.48%
Baseline	51.97%	51.97%	51.97%

Table 13. Detailed result of the best system based on word embedding vectors.

Situation Type	Precision	Recall	F1	Number
State	69%	59%	64%	386
Activity	72%	87%	78%	634
Transition	81%	51%	63%	200
Average/total	72%	72%	71%	1220

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
