1.1. Aspect
Aspect is a cornerstone of both linguistic theory and computational modeling. As a primary mechanism for representing temporal information across languages, this grammatical category has been a central focus of linguistic inquiry for over five decades. In computational work, aspectual features are combined with tense to derive temporal information for core natural language processing tasks, including neural machine translation, temporal relation extraction, and abstractive text summarization [1, 2].
Aspect and tense are among the most important sources of temporal information in a sentence across languages [3, 4]. However, some studies use the two terms interchangeably, which reveals confusion about their definitions. In fact, they refer to different temporal facets of a sentence.
Tense concerns the relationship between situation time, which is when the situation described by a sentence occurs, and reference time, which is usually speech time, that is, the time at which the speaker utters the sentence. Past, present and future are the most common tenses.
Aspect, as its etymology suggests, concerns the viewpoint from which the phases of an event are perceived. It is expressed through predicates ranging from single lexical items to whole sentences, and thus shapes how situations are represented at different levels of granularity.
(1) a. He was running.
b. He has finished the assignment.
In (1a), was rather than is is used, indicating past tense. The construction be + v-ing as a whole expresses the viewpoint taken on the action denoted by the verb or verb phrase, and it has nothing to do with tense or reference time. The action of running has different phases, including inception, duration and termination, of which only duration is brought into focus by be + v-ing. The same holds for (1b), which contains a viewpoint focusing on the final phase of a situation.
The same distinction also holds in Chinese, as shown in (2).
(2) a.
ta1 | zheng4zai4 | pao3bu4 |
he | ZHENGZAI | run |
He is running. |
b.
ta1 | wan2cheng2 | le | zuo4ye4 |
he | finish | LE | assignment |
He has finished the assignment. |
The distinction between tense and aspect opened a broad area for further research on aspectual information in sentences, and dozens of studies have appeared since the beginning of modern aspectual studies with Vendler [4] and Comrie [5].
It is commonly accepted that aspect comprises two components, situation type and viewpoint aspect, a division developed and clarified mainly by Smith [3, 6] in her two-component theory. The interaction between viewpoint aspects and situation types is vitally important for aspectual studies, since it plays a key role in describing and explaining the grammatical and semantic features of various aspectual structures.
Situation type classification derives from the temporal architecture inherent in predicative units, spanning verbal constituents to complete sentential expressions. This classification schema fundamentally interrogates the temporal contour of situations, specifically examining the presence or absence of inception, duration, and termination phases. At the lexical stratum, the situation type is instantiated through verb classification, constituting what is traditionally designated as the lexical aspect.
Consequently, the phenomenon of situation type classification and its lexical manifestation as aspectual properties represent universal linguistic features across diverse language systems. This classification framework serves as a critical analytical tool for predicate categorization, particularly in verb classification paradigms, given its predictive capacity for determining syntactic behavior and semantic combinatorial patterns within linguistic units.
Stative verbs inherently denote situations lacking both inception and termination points, exhibiting atelic properties with homogeneous temporal distribution. Taking know as a paradigmatic example, this verb encodes a continuous state maintained by the subject without temporal interruption. Due to their inherent durative and static nature, stative verbs exhibit low compatibility with progressive aspect marking, as evidenced by the ungrammaticality of constructions like “He is knowing the name”. This linguistic constraint extends to Mandarin Chinese, where zhi1dao4 (know) similarly resists combination with progressive markers zhe or zhengzai. The underlying rationale stems from the homogeneous internal structure of states: regardless of temporal observation points, the state remains constant and indivisible into internal phases. This directly conflicts with the progressive aspect’s core function of highlighting internal temporal segmentation within events.
Activity verbs denote situations lacking explicit inception or termination points, yet unlike stative verbs, they exhibit heterogeneous temporal distribution. This indicates that the activity process comprises dynamic phases while maintaining conceptual unity as a single event. Consequently, activity verbs demonstrate flexible compatibility with progressive aspect marking, as evidenced by English constructions like “be running” and Mandarin Chinese sequences with zhengzai preceding pao3bu4 (run).
Transition verbs, by contrast, inherently encode either inception or termination points in their temporal structure. While some transition verbs like build represent durative processes, others, such as die, denote punctual events. The common characteristic across transition verbs is their representation of state changes, either emphasizing the transition process or the resultant state. This dual nature is particularly evident in Chinese verb–verb compounds like ti2gao1 (improve), which explicitly marks the progression from one state to another. More frequently, however, transition verbs focus on the resultant state, as exemplified by both the English die and the Chinese si3 (die). The inherent state transition in these verbs results in strong compatibility with perfective (or realizative) aspect marking. Representative examples are systematically presented in Table 1.
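The three-way classification described above can be sketched as a small decision procedure. This is only an illustrative simplification: the two boolean features, the names TemporalProfile, classify, COMPATIBILITY and LEXICON, and the feature values assigned to each verb are assumptions made for the example, not part of the framework itself. Only the marker pairings explicitly discussed in the text are recorded.

```python
from dataclasses import dataclass
from enum import Enum


class SituationType(Enum):
    STATE = "state"
    ACTIVITY = "activity"
    TRANSITION = "transition"


@dataclass(frozen=True)
class TemporalProfile:
    """Hypothetical feature bundle for a verb's temporal structure."""
    has_boundary: bool  # encodes an inception or termination point
    homogeneous: bool   # no distinguishable internal phases


def classify(profile: TemporalProfile) -> SituationType:
    """Map the two assumed features onto the three situation types."""
    if profile.has_boundary:
        return SituationType.TRANSITION
    if profile.homogeneous:
        return SituationType.STATE
    return SituationType.ACTIVITY


# Compatibility statements taken from the text; pairings that the text
# does not discuss are deliberately left out rather than guessed.
COMPATIBILITY = {
    (SituationType.STATE, "progressive"): "low",
    (SituationType.ACTIVITY, "progressive"): "flexible",
    (SituationType.TRANSITION, "perfective"): "strong",
}

# Illustrative entries; the feature values are assumptions for this sketch.
LEXICON = {
    "know": TemporalProfile(has_boundary=False, homogeneous=True),
    "zhi1dao4": TemporalProfile(has_boundary=False, homogeneous=True),
    "run": TemporalProfile(has_boundary=False, homogeneous=False),
    "pao3bu4": TemporalProfile(has_boundary=False, homogeneous=False),
    "build": TemporalProfile(has_boundary=True, homogeneous=False),
    "die": TemporalProfile(has_boundary=True, homogeneous=False),
    "si3": TemporalProfile(has_boundary=True, homogeneous=False),
}

if __name__ == "__main__":
    for verb, profile in LEXICON.items():
        print(verb, classify(profile).value)
```

Under these assumptions, a lookup such as COMPATIBILITY.get((classify(LEXICON["know"]), "progressive")) returns the "low" rating that corresponds to the oddness of "He is knowing the name".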
Unlike situation types, which are denoted by content verbs or predicates, viewpoint aspects are expressed by grammaticalized markers. As the name suggests, a viewpoint aspect marker expresses the perspective from which a situation is viewed. Imperfective viewpoints focus on only one phase of a situation: be + v-ing, for example, takes a snapshot of its duration. Perfective markers treat a situation as a whole: have + v-ed, for example, presents it as complete and finished.
Grammaticalized aspect markers are generally considered an important criterion for distinguishing aspect languages from non-aspect languages [7]. As a representative inflectional language, English has developed multiple viewpoint aspect markers, such as the progressive and the perfective, expressed through inflectional means, which is quite common among inflectional languages.
However, Chinese, a typical isolating language with almost no inflectional morphemes, is surprisingly an aspect language with abundant aspectual markers, including ZHE, ZAI, ZHENGZAI, LE1, LE2 and GUO. The ability of verbs to co-occur with these markers plays an important part in describing their syntactic and semantic features in Chinese [3, 6, 8, 9, 10, 11, 12, 13].
ZHE is the progressive aspectual marker in Chinese. It can only be attached to a verb, indicating that the action is in progress, thus making it an imperfective aspectual marker. For example, the ZHE in pao3 zhe (be running) views the duration period of a running process.
ZAI and ZHENGZAI, despite the semantic differences between them, are also generally considered to indicate the progressive aspect. Unlike ZHE, however, they have not been grammaticalized into aspectual markers or particles but function as adverbs: they are usually placed before the verb as modifiers rather than attached after it, as in zheng4zai4 gou4wu4 (be shopping). Historically, their use to express the progressive aspect even predates that of ZHE.
LE has two distinct uses. The first, LE1, is considered an actualization aspect marker indicating the realization or completion of an action; it is attached directly after the verb, as in chi1 LE1 fan4 (have eaten dinner). The second, LE2, is regarded as a change-of-state aspect marker, generally indicating the emergence of a new situation, as in shi2 san1 sui4 LE2 (have turned thirteen). It is attached to the end of the whole sentence rather than to the verb.
GUO is generally regarded as an experiential aspect marker, indicating that the subject has experienced the event at least once, as in qu4 guo4 zhong1guo2 (have been to China).
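The marker inventory above can be summarized as a small lookup table that records each marker's function and its position relative to the predicate. The sketch below is illustrative only: the names Position, Marker, MARKERS and attach are hypothetical, the romanized form zai4 for ZAI is supplied here for the example, and the linearization rule is a deliberate simplification that ignores subjects and other adverbs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Position(Enum):
    PRE_VERBAL = "before the verb (adverb)"
    POST_VERBAL = "directly after the verb"
    SENTENCE_FINAL = "end of the sentence"


@dataclass(frozen=True)
class Marker:
    form: str        # romanized form used in the examples
    function: str    # aspectual function as described in the text
    position: Position


MARKERS = {
    "ZHE": Marker("zhe", "progressive", Position.POST_VERBAL),
    "ZAI": Marker("zai4", "progressive", Position.PRE_VERBAL),
    "ZHENGZAI": Marker("zheng4zai4", "progressive", Position.PRE_VERBAL),
    "LE1": Marker("le", "actualization", Position.POST_VERBAL),
    "LE2": Marker("le", "change of state", Position.SENTENCE_FINAL),
    "GUO": Marker("guo4", "experiential", Position.POST_VERBAL),
}


def attach(marker_name: str, predicate: str, rest: Sequence[str] = ()) -> str:
    """Linearize a marker with a predicate and any following material.

    Simplified on purpose: `predicate` is the verb (or, for LE2, the
    predicate head) and `rest` is whatever follows it in the clause.
    """
    marker = MARKERS[marker_name]
    if marker.position is Position.PRE_VERBAL:
        words = [marker.form, predicate, *rest]
    elif marker.position is Position.POST_VERBAL:
        words = [predicate, marker.form, *rest]
    else:  # sentence-final
        words = [predicate, *rest, marker.form]
    return " ".join(words)


if __name__ == "__main__":
    print(attach("ZHE", "pao3"))
    print(attach("ZHENGZAI", "gou4wu4"))
    print(attach("LE1", "chi1", ["fan4"]))
    print(attach("LE2", "shi2 san1 sui4"))
    print(attach("GUO", "qu4", ["zhong1guo2"]))
```

Keeping position as an explicit field makes the contrast between LE1 and LE2 visible even though both share the surface form le.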
1.2. Lexical Aspect
It should be noted that in this paper, the lexical aspect, as a term, is used to refer to the situation type at the lexical level. However, the lexical aspect is far from being studied comprehensively and reliably for several reasons.
While aspect at the phrase and sentence levels has attracted researchers’ attention, lexical aspect is bypassed in many studies because they reject the compositionality and recursiveness of aspectual information in sentences. This rejection arises mainly from the so-called ambiguity of verbs’ situation types.
Taking run and run a mile as examples, Vendler [4] treats run itself as an activity verb and run a mile as a transition or, more specifically, an accomplishment. Studies such as Smith [6] identify this phenomenon as an ambiguity in verbs’ situation types, since the situation type of run per se differs from that of run in run a mile. To resolve this dilemma, Smith moves situation types to the phrase level and claims that verbs have no situation type, a position followed by dozens of researchers. However, this move raises two issues. On the one hand, raising the assignment of situation types to the phrasal level cannot solve the ambiguity problem: whenever new components such as arguments and adjuncts are added, the situation type keeps changing accordingly, so situation types must keep being raised to higher levels until the sentence, the largest grammatical unit, is reached. On the other hand, there is no way to explore the interaction between the aspectual nature of verbs and aspectual markers, because verbs are excluded from situation type assignment.
However, it is impossible to study aspect without discussing the interaction between verbs and aspectual markers, which serves as the foundation for situation types at higher levels. As a result, scholars who reject lexical aspect have to load all of the aspectual information of the whole structure onto its verb, which gives the verb’s situation type a shifting nature. Moreover, they have to coin other terms to refer to the situation type at the lexical level, because they have already claimed that verbs have no situation type. Aspectual value and parameter are used in Smith [6] and Verkuyl [14], respectively, which makes their rejection of situation type at the lexical level self-contradictory, because there is no fundamental difference between aspectual values or parameters and situation types.
The ambiguity actually arises from their rejection of the compositional and recursive nature of aspectual information. The situation type is recursive because different linguistic levels, such as the verb, the verb phrase and the sentence, can each have their own situation type. The situation type is compositional because the constituents of a structure jointly determine the situation type of the whole structure. It should be noted that the situation type of the verb itself is independent of that of the whole structure. It is implausible to insist that the run in run to the store is different from run per se. The so-called shift of run from an activity to an accomplishment in fact imposes the meaning of to the store on run, which does not reflect linguistic reality.
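The recursive and compositional view can be illustrated with a minimal sketch in which the verb and the verb phrase each carry their own situation type. The class names Verb and VerbPhrase, the complement_bounded flag, and the single composition rule (an activity verb plus a bounded complement yields an accomplishment) are assumptions made for illustration; a realistic model would need richer rules and further levels up to the sentence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verb:
    lemma: str
    situation_type: str  # lexical aspect, fixed at the lexical level


@dataclass(frozen=True)
class VerbPhrase:
    head: Verb
    complement: Optional[str] = None
    complement_bounded: bool = False  # hypothetical flag, e.g. "a mile"

    @property
    def situation_type(self) -> str:
        """Compose the phrase-level type from the head and its complement.

        Assumed toy rule: activity + bounded complement -> accomplishment.
        Otherwise the phrase inherits the situation type of its head.
        """
        if self.head.situation_type == "activity" and self.complement_bounded:
            return "accomplishment"
        return self.head.situation_type


if __name__ == "__main__":
    run = Verb("run", "activity")
    bare = VerbPhrase(run)
    run_a_mile = VerbPhrase(run, complement="a mile", complement_bounded=True)

    # The phrase-level type changes with the complement...
    print(bare.situation_type, run_a_mile.situation_type)
    # ...while the verb keeps its own lexical situation type.
    print(run_a_mile.head.situation_type)
```

Because the phrase-level type is computed rather than written back onto the verb, run remains an activity at the lexical level no matter which phrase contains it.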
Accepting the recursive and compositional nature of situation types helps to settle this problem. It also means that lexical aspect has to be studied before situation types at other levels, because verbs are the basic units of phrases and sentences and thus of the situation types at these levels. Many computational studies, such as Li et al. [15], Cao et al. [16], Zarcone and Lenci [17], and Xu [18], focus only on predicting situation types at the sentential level, following linguistic studies that reject the existence of lexical aspect, which leaves the prediction of lexical aspect a nearly unexplored area.
In conclusion, the rejection of the recursive and compositional nature of aspectual information has blocked further exploration of lexical aspect, which motivates this study.