2. Definiteness
In this paper, I distinguish between formal definiteness, i.e., definiteness expressed through the formal reflection of determination, and semantic definiteness, i.e., definiteness established through the property of referentiality. Definiteness and indefiniteness in the sample languages are marked morphologically by the affixation of markers to nouns. The data will be evaluated for the relationship between formal and semantic definiteness (see Section 4) and Target word order. In addition, the cardinal number “one” is frequently used as an indefinite article (Kurdish yak/yek, NENA xa, Armenian mi, and Azeri bir). In Mukri, NEK, and NENA, the cardinal number “one” can also be combined with the word dāna, literally “grain”, and this combination (NENA xa dank, Kurdish yek dank/dāna) expresses indefiniteness as well. Here, dāna is not a count word, and it is used with all types of nouns, including animate ones. An overview of the morphological marking is given in Table 1.
Table 2 shows an overview of formal definite marking in the sample languages. Below, I present several examples to illustrate formal marking in the sample languages.
(1) Singular -DEF -yak and DEF -aka (Mukri, ÖM 2016: 197, NZ.187 cited in Asadpour 2022a)
| s | p | p | t | v |
amn=īš | nānawā-yak=mān | habū | xom | da | bar | nānawā-y-aka-y | kǝrd |
1sg=add | bakery-indf=pc.1pl.poss | have.pst | self.1sg | at | on | bakery-glid-def-obl | do.pst |
“As for me, (we) had a bakery. (I) put myself under the bakery [lit. I laid with the bakery].” |
In example (1), the object is marked with the indefinite marker -yak, whereas the Target is marked with the definite suffix -aka and is also flagged by a combination of grammatical and lexical prepositions and an oblique case.
(2) Singular DEF -a (Mukri, ÖM 2016: 205, ČQ.139 cited in Asadpour 2022a)
s | o | v=p | t |
pādšā | daz | bǝrd=a | sar=ī | bāza-a-y |
king | hand | bring.pst=to | head=ez | falcon-def-obl |
“The king got a hold of the falcon’s head. [lit. The king put his hand on the head of the falcon]”. |
In example (2), the object is unmarked, and the Target is marked by the definite suffix -a. The definite marker -a is also introduced for the masculine gender as an oblique and as a definite article. It can also have the function of a demonstrative in the form of =a, which is cliticized to a nominal element. See example (3) below for an illustration of -a as a definite suffix:
(3) -a as a definite suffix (Mukri, ÖM 2016: 270, ŽB.41b cited in Asadpour 2022a)
| s | p | t | v |
kut=ī, | kuř-a | wa | peš=ǝm | kawǝt |
say.pst=pc.3sg.agr | boy-def | to | front=pc.1sg | fall.pst |
“(He) said, the boy fell down in front of me.” |
In example (3), kuř (“boy”) is the subject of the sentence, and it is marked by the suffix -a. The referent is given and familiar, and it refers to someone who forms part of the background information shared between the speaker and the hearer. Out of context, in a conversational dialogue where both the speaker and the hearer see someone and the speaker points to the person by using -a, this marker most likely has a demonstrative rather than a definite function. In this case, the extra marking by -a serves to highlight the specific referent (see Asadpour 2022a, sct. 5.7).
(4) Singular DEF -ak (NEK, TONI, SM_63 cited in Asadpour 2022a)
o | p | t | p | |
du | toq=e | sarī-y=ā | we | kǝç-ak-e | re | īnāndǝbū |
two | scarf=ez.m | head-glid=pl | that.obl.sg.f | girl-def-obl.sg.f | postp | bring.ppf |
“(She) brought two head scarves to the girl.” |
In example (4), the object is unmarked, and the Target is marked by the definite suffix -ak and is flagged by a circumposition.
(5) Singular -DEF -ek (NEK, TONI, MD_62 cited in Asadpour 2022a)
| v=p | t |
ku | bǝ-š-t=a | cǝ-y-ek-e |
that | sbjv.go.prs-3sg=to | place-glid-indf-obl.sg.f |
“[…] that (he) goes to somewhere.” |
In example (5), the Target is marked with the indefinite suffix -ek. In the TONI corpus, NEK has the definite article -ak(a), while Haig and Öpengin (2018, 16) claim that in Northern Kurdish, if a constituent is not morphologically marked with the indefinite suffix, it is considered to be either definite or generic, depending on the context. Furthermore, Gündoğdu (2018, 53) claims that in Kurmanǰī, and especially in Muš Kurmanji Kurdish, there is no definite article. In the next examples, I illustrate the definite and indefinite marking of attested tokens in NENA, Azeri, and Armenian.
(6) Singular -DEF xa- (NENA, Khan 2008, 414, E87C cited in Asadpour 2022a)
o | v | t |
xá-danka | pardá | yasr-í-wa | m-gudà |
one-grain | curtain | tie.hab-3pl-pst | prep-wall |
“(They) would draw a curtain over the wall.” |
Since Targets marked with definite markers are not attested in NENA, I give the example of a direct object marked with the indefinite marker xa-danka. Similarly, Azeri marks indefiniteness with the element bir in Target constructions (see example (7) below).
(7) Singular -DEF bir (Azeri, TONI cited in Asadpour 2022a)
o | v | t |
bir-dānā | kītāb | āl-dı-m | ver-dī-m | bir | kas-a |
one-grain | book | buy-pst-1sg | give-pst-1sg | one | someone-dat |
“(I) bought a book and gave (it) to someone.” |
In Armenian, similar to Mukri and NEK, nominal elements can be marked with definite suffixes (see examples (8) and (9) below), and indefiniteness can be expressed by mi (see example (10) below).
(8) Singular DEF -n (Armenian, TONI, 3-1.25 cited in Asadpour 2022a)
t | v | |
ter | Bagrat-i-n | el | asa-m | vor | ari… |
father | Bagrat-dat-def | also | say.pst-1sg | that | come.imp.sg… |
“(I) also told [lit. said] Father Bagrat to come.” |
(9) Singular DEF -ǝ (Armenian, TONI, 5-2.14m/n cited in Asadpour 2022a)
v | t |
gnac̣-i | ira | łek-i | hetew-ə |
go.pst-1sg | his | steering.wheel-dat | behind-def |
“(I) went behind his steering wheel.” |
(10) Singular -DEF mi (Armenian, TONI, 5-2.2a cited in Asadpour 2022a)
v | t |
gnac̣-inkՙ | mi | hat | pՙołoc̣-i | ners |
go.pst-1pl | one | item | street-dat | inside |
“(We) went to [lit. inside] a street.” |
Above, I gave examples of singular forms. Below, I show examples of elements marked for definiteness in the plural. All examined languages express (in)definite plurality through the following endings: Mukri -ān, NEK -en, NENA -e, Armenian -(n)er, and Azeri -lār (see Table 2 above).
(11) Plural DEF -ak-ān (Mukri, ÖM 2016: 226, MK.223 cited in Asadpour 2022a)
v | p | t |
da-č-et=awa | kǝn | dǝz-ak-ān | čǝl | nafar-aka-y |
ipfv-go.prs-3sg=postv | close | thief-def-pl | forty | individual-def-obl |
“(He) goes close to the thieves, the forty men.” |
(12) Plural DEF -k-en (NEK, TONI, SM_96 cited in Asadpour 2022b)
| s | v=p | p | t |
čǝra | motorkǝlāzā | kur-ǝk | na-hāt=ā | pešǝ-y=ā | bǝrā-k-en | kǝç-ǝk-e |
why | motorbike | boy-def | neg-come.pst=to | front-glid=ez.f | brother-def-pl.ez | girl-def-obl.sg.f |
“Suddenly the motorbike of the boy came to the front of the brothers of the girl.” |
(13) Plural DEF -ner-in (Armenian, Dum-Tragut 2009, 67 cited in Asadpour 2022a)
s | v | o |
es | tesn-um | em | ays | erek’ | ałjik-ner-i-n |
I | see-ptcp.prs | cop.1sg | this | three | girl-pl-dat-def |
“I see these three girls.” |
In Armenian, no Targets with plural and definite suffixes are attested. Finally, Targets and other elements can be unmarked, i.e., without any definite marking:
(14) Unmarked Target (Mukri, ÖM 2016: 254, ČN.118 cited in Asadpour 2022a)
o | p | t | v |
mǝndāł-a=yān | da | sundūq-e | hāwīšt |
child-def=pc.3pl.agr | into | coffer-obl | throw.pst |
“(They) threw the child into the coffer.” |
(15) Unmarked Target (NEK, TONI, KP_159 cited in Asadpour 2022a)
v | t |
hat-in | mal-e |
come.pst.3pl | home-obl.sg.f |
“(They) came home.” |
(16) Unmarked Target (NENA, Khan 2008, 418, F101B cited in Asadpour 2022a)
v | p-t |
zǝllu | g-komsèr |
go.pst.3pl | obl-police_station |
“(They) went to the police station.” |
(17) Unmarked Target (Armenian, TONI, 5-1.28 cited in Asadpour 2022a)
| v | t |
heto, | gnac̣inkՙ | Iran |
then | go.pst.1pl | Iran |
“Then (we) went to Iran.” |
(18) Unmarked Target (Azeri, Kıral 2001: 142, T2/4 cited in Asadpour 2022a)
o | v |
va | har | na | da | kī | yāz-ɪr-dī-∅ | āpār-ɪr-dɪ-∅ |
and | every | what | also | that | write-ipfv-pst-3sg | take-ipfv-pst-3sg |
t | |
ruznāmī-ya | kī | cāp | elī-ya-lar | cāp | ela-m-īr-dī-lar |
newspaper-dat | that | publishing | do-opt-3pl | publishing | do-neg-ipfv-pst-3pl |
“and whatever (he) was writing would be taken to the newspaper so that they publish it but they would not publish it.”
4. Corpus Analysis of Formal Definiteness
For different parts of speech, the following coding is used. Nominal elements can be marked by a(n) (in)definite marker, or they can be unmarked. Pronominal elements, as well as bound pronouns, are considered unmarked. Since the pronominal and bound pronoun elements are given information, they are coded as unmarked definite unless they indicate an indefinite element.
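The coding rules above can be sketched in code. The following is a minimal illustration, not the annotation procedure actually used for the corpus. The `Constituent` record, its field names, and the four labels (DN and IN for overtly marked definite and indefinite nouns, UD and UI for unmarked definite and indefinite elements) are assumptions made for the example. Whether an unmarked element is semantically indefinite is treated as an annotator's judgment supplied in `indefinite_reference`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Constituent:
    # Hypothetical record for one coded element (not from the original corpus).
    kind: str                           # "noun", "pronoun", or "bound_pronoun"
    marker: Optional[str] = None        # "DEF", "INDF", or None if no overt marker
    indefinite_reference: bool = False  # annotator's judgment of semantic indefiniteness


def code_definiteness(c: Constituent) -> str:
    """Return DN, IN, UD, or UI following the coding rules described in the text."""
    # Pronouns and bound pronouns count as unmarked; they are given
    # information, hence definite unless they indicate an indefinite element.
    if c.kind in ("pronoun", "bound_pronoun"):
        return "UI" if c.indefinite_reference else "UD"
    # Nominal elements may carry an overt (in)definite marker.
    if c.marker == "DEF":
        return "DN"
    if c.marker == "INDF":
        return "IN"
    # Unmarked nominals are classified by their semantic definiteness.
    return "UI" if c.indefinite_reference else "UD"
```

For instance, under these assumptions a noun carrying a definite suffix is coded DN, while a bound pronoun is coded UD unless it is flagged as indefinite.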
Table 3 below shows the placement of subjects, direct objects, and Targets according to their definiteness.
Table 3 shows the frequency of different word orders (TV: Target-Verb; VT: Verb-Target) for marked and unmarked Targets (Mukri, NENA, Azeri, Armenian, NEK) in terms of definiteness (DEF, -DEF). All sample languages demonstrate a tendency for unmarked constituents. Mukri illustrates a fairly equal distribution of formal definite marking in both positions. NENA presents an obvious tendency for the indefinite marking of constituents in the preverbal position. Azeri displays no definite marking, and unmarked constituents are mostly definite regardless of their position. NEK shows a preference for definite marking in the postverbal position and indefinite marking in the preverbal position. There also seems to be a tendency for unmarked definite arguments to appear postverbally in the sample languages, but there is no real tendency for the unmarked constituent in Mukri.
The research question can be answered by looking at the patterns and trends in the table and by performing tests of significance to compare the proportions of word orders across the variables. To answer the question, the word orders found with specific types of definite and indefinite Targets are detailed in the following paragraphs.
For marked Targets, TV is more common than VT for both definite and indefinite Targets, except in Armenian, where VT is more common for definite Targets. This suggests that word order variation for marked Targets is influenced by language-specific factors rather than definiteness.
For unmarked Targets, TV is more common than VT for definite Targets, while VT is more common than TV for indefinite Targets. This suggests that word order variation for unmarked Targets is influenced by definiteness rather than language-specific factors.
For both marked and unmarked Targets, there is a significant difference in the proportions of TV and VT across definiteness (chi-square = 181.4, p < 0.001). This means that definiteness affects word order variation for both marked and unmarked Targets.
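To make the kind of test reported here concrete, the following sketch computes a Pearson chi-square test of independence for a 2 × 2 table of word order (TV, VT) by definiteness (DEF, -DEF). It is an illustration under stated assumptions: the counts are hypothetical and are not the corpus figures, no continuity correction is applied, and the text does not specify which variant of the test was used. The function name `chi_square_2x2` is introduced for the example only.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With one degree of freedom, the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts (NOT from the corpus): rows = DEF, -DEF; columns = TV, VT.
counts = [[30, 10],
          [10, 30]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi-square = {statistic:.2f}, p = {p_value:.6f}")
```

A small p-value indicates only that the proportions of TV and VT differ across the two definiteness values; it does not by itself show the direction or the size of the difference, which have to be read from the table.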
In the next passages, I will show that animacy influences Target PoS, especially regarding noun phrase placement, and that it is necessary to separate the constituents into subjects, objects, and Targets and to analyze the effects of definite marking on each of them separately. This gives a clearer picture of the influence of definiteness on Target word order. For this purpose, it is also important to pair the features, looking at animacy and definiteness together for the different types of constituents.
Table 4 shows the frequency of different word orders (TV: Target-Verb; VT: Verb-Target) for marked and unmarked Targets (Mukri, NENA, Azeri, Armenian, NEK) in terms of definiteness (DEF, -DEF) and the definiteness of the subject (S) and object (O) constituents. For marked Targets, there is no clear relationship between the word order and the definiteness of the subject or object constituents. The proportions of TV and VT do not vary much across different combinations of subject and object definiteness. This suggests that word order variation for marked Targets is not influenced by the definiteness of other constituents.
In considering Targets in pre- and postverbal positions, there is no clear pattern of the definiteness distribution in Mukri because formal marking occurs in both positions for all three constituents. This tendency is also the same for unmarked forms. In NENA, indefinite marking shows a clear preference for the preverbal position with all constituents. Unmarked constituents in NENA display no placement sensitivity. Azeri presents no definite marking, and unmarked Targets exhibit a tendency for preverbal unmarked indefinite and postverbal unmarked definite. In Armenian, most of the formal definite marking occurs postverbally for all constituents and less so in the preverbal position. Unmarked Targets reveal no preference for either position. In NEK, the formal marking of definiteness displays a tendency for the postverbal position, and unmarked forms are neutral in terms of preference. Among various PoS, objects and Targets are mostly marked with (in)definite articles and fewer subjects in Mukri; objects in NENA; more Targets and fewer subjects in Armenian; and more subjects and Targets and fewer objects in NEK. For both marked and unmarked Targets, there is a significant interaction effect between the word order, Target definiteness, and subject or object definiteness (chi-square = 108.9, p < 0.001). This means that word order variation depends on the combination of Target definiteness and subject or object definiteness.
For a clearer idea of what is happening for Targets in various positions, it is necessary to separate the Targets and examine them in relation to animacy.
Table 4 demonstrates the realization of Targets in terms of definiteness marking.
In Table 5, Targets that are marked with a definite article are in the preverbal position in Mukri for both human (3%) and inanimate (4%) and have a lower tendency to be in the postverbal position for human (2%) vs. inanimate (2%). The indefinite Targets are fairly evenly divided between pre- and postverbal positions. In NENA, indefinite Targets with formal marking are in the preverbal position (1%), and definite Targets are in the postverbal position (1%). Azeri illustrates no definite marking. In Armenian, the definite marking of human Targets presents a preference for the postverbal position among inanimate entities (6%). Finally, NEK has a tendency for postverbal Targets to be marked with a definite article (7% vs. 2% in the preverbal position). In all of these languages, the strongest trend is for unmarked Targets. Unmarked human Targets are more often located before the verb: for example, in Mukri, 84% of human (H) tokens occurred preverbally, while 8% occurred postverbally. On the other hand, inanimate (I) unmarked Targets are more often located postverbally: 52% of tokens are postverbal, whereas 33% are preverbal. In NENA, unmarked human Targets are almost equally frequent in both positions (preverbal = 52% vs. postverbal = 43%), whereas inanimate Targets prefer the postverbal position (79% vs. 18% in the preverbal position). The trend becomes clearer in Azeri, Armenian, and NEK, which show a preference for postverbal unmarked definite Targets.
As shown above, definiteness affects word order variation for unmarked Targets, while animacy affects word order variation for marked Targets. However, there is no clear interaction effect between definiteness and animacy on word order variation. For marked Targets, there is a significant difference in the proportions of TV and VT across animacy (chi-square = 10.8, p = 0.01) but not across definiteness (chi-square = 0.01, p = 0.92). This means that marked Targets tend to have different word orders depending on whether they are human, animate non-human, or inanimate, but not depending on whether they are definite or indefinite.
For unmarked Targets, there is a significant difference in the proportions of TV and VT across definiteness (chi-square = 113.64, p < 0.001) but not across animacy (chi-square = 0.36, p = 0.55). This means that unmarked Targets tend to have different word orders depending on whether they are definite or indefinite, but not depending on whether they are human, animate non-human, or inanimate.
For human Targets, there is a significant difference in the proportions of TV and VT across marking (chi-square = 28.69, p < 0.001) but not across definiteness (chi-square = 1.59, p = 0.21). This means that human Targets tend to have different word orders depending on whether they are marked or unmarked, but not depending on whether they are definite or indefinite.
For animate non-human Targets, there is a significant difference in the proportions of TV and VT across marking (chi-square = 38.64, p < 0.001) but not across definiteness (chi-square = 0.01, p = 0.94). This means that animate non-human Targets tend to have different word orders depending on whether they are marked or unmarked, but not depending on whether they are definite or indefinite.
For inanimate Targets, there is a significant difference in the proportions of TV and VT across marking (chi-square = 38.64, p < 0.001) and across definiteness (chi-square = 113.64, p < 0.001). This means that inanimate Targets tend to have different word orders depending on whether they are marked or unmarked and whether they are definite or indefinite.
At the current analytical level, as stated above, the categories are broad, and they yield results of their own. Some finer distinctions that exist in the data, however, may still have been missed. Therefore, to give a finer analysis, I have paired the features of (in)definiteness and animacy for each language, and I will look at the results for each constituent: subject, object, and Target.
For a closer look at the overt realization of subjects, objects, and Targets, I cross-paired animacy and definiteness for each language. Table 6, Table 7, Table 8, Table 9, and Table 10 below present the data for subjects, objects, and Targets and their definite marking and animacy in the sample languages. The categories begin with human (h), animate (a), and inanimate (i) definite nouns (DN), followed by human, animate, and inanimate indefinite nouns (IN), unmarked definite (UD), and unmarked indefinite (UI).
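The cross-pairing of animacy and definiteness can be pictured as a simple tabulation. The sketch below is only an illustration: the label format (`h-DN`, `a-DN`, and so on), the token dictionaries, and the function names are assumptions introduced here, and the example tokens are invented rather than taken from the corpus.

```python
from collections import Counter
from itertools import product

ANIMACY = ("h", "a", "i")                 # human, animate, inanimate
DEFINITENESS = ("DN", "IN", "UD", "UI")   # marked def./indef., unmarked def./indef.


def paired_categories():
    """List the 12 paired labels in the order used in the text:
    h/a/i definite, then h/a/i indefinite, then unmarked definite and indefinite."""
    return [f"{a}-{d}" for d, a in product(DEFINITENESS, ANIMACY)]


def tabulate(tokens):
    """Count preverbal and postverbal tokens for each paired category."""
    counts = Counter(
        (f"{t['animacy']}-{t['definiteness']}", t["position"]) for t in tokens
    )
    return {
        cat: {
            "preverbal": counts[(cat, "preverbal")],
            "postverbal": counts[(cat, "postverbal")],
        }
        for cat in paired_categories()
    }


def preverbal_share(row):
    """Proportion of preverbal tokens in a row, or None if the row is empty."""
    total = row["preverbal"] + row["postverbal"]
    return None if total == 0 else row["preverbal"] / total


# Invented tokens for illustration only (NOT corpus data).
tokens = [
    {"animacy": "h", "definiteness": "UD", "position": "preverbal"},
    {"animacy": "h", "definiteness": "UD", "position": "preverbal"},
    {"animacy": "h", "definiteness": "UD", "position": "postverbal"},
    {"animacy": "i", "definiteness": "DN", "position": "postverbal"},
]
table = tabulate(tokens)
```

Keeping empty cells in the output, rather than dropping them, makes gaps visible, such as a language with no overtly marked definite Targets.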
Several interesting results come to light when pairing the features. When considering animacy and definiteness for each constituent, some differences become apparent for human and animate definite and indefinite, as well as unmarked, constituents. By taking into account the differences in definiteness, [+human] subjects show a high tendency for the preverbal position (37%), and [+human] and [-animate] objects are also preverbal at 26% and 94%, respectively. On the other hand, definite human Targets demonstrate a tendency for the postverbal position (29%), as do [-animate] Targets (3%). Similar to definite subjects and objects, indefinite subjects and objects display a preference for the preverbal position, while Targets present a preference for the postverbal position. Unmarked human subjects and objects show a preverbal preference, with a slight tendency for the postverbal position, and Targets exhibit a tendency for both positions. These tendencies indicate that when animacy and definiteness features are paired, there are preferences that are not seen in the broader categories.
As shown above, definiteness and animacy, or a combination of both, play a role in the word order variation of Targets depending on the marking status of the Target. Definiteness affects word order variation for unmarked Targets, while animacy affects word order variation for marked Targets. However, there is no clear interaction effect between definiteness and animacy on word order variation. The tendencies of word order variation are represented in the interaction of word order with definiteness and animacy and constituents such as subject, object, and Target by comparing the proportions of word orders across the variables using chi-square tests. The results are described below.
For subject constituents, there is no significant difference in the proportions of TV and VT across definiteness (chi-square = 0.87, p = 0.83) or across animacy (chi-square = 1.21, p = 0.75). This means that the subject constituent does not affect word order variation for either marked or unmarked Targets. For the object constituent, there is a significant difference in the proportions of TV and VT across definiteness (chi-square = 67.76, p < 0.001) and across animacy (chi-square = 10.8, p = 0.01). This means that the object constituent affects word order variation for both marked and unmarked Targets. For definite noun object constituents, TV is more common than VT for human and animate non-human objects, while VT is more common than TV for inanimate objects. This suggests that word order variation for definite noun object constituents is influenced by animacy rather than definiteness. For indefinite noun object constituents, TV is more common than VT for all types of animacy. This suggests that word order variation for indefinite noun object constituents is not influenced by animacy or definiteness. For unmarked definite object constituents, TV is less common than VT for all types of animacy. This suggests that word order variation for unmarked definite object constituents is not influenced by animacy or definiteness. For unmarked indefinite object constituents, TV is less common than VT for human and animate non-human objects, while TV is more common than VT for inanimate objects. This suggests that word order variation for unmarked indefinite object constituents is influenced by animacy rather than definiteness.
For Target constituents, there is a significant difference in the proportions of TV and VT across definiteness (chi-square = 113.64, p < 0.001) and across animacy (chi-square = 108.9, p < 0.001). This means that the Target constituent affects word order variation for both marked and unmarked Targets. For definite noun Target constituents, VT is more common than TV for human and inanimate Targets, while TV is more common than VT for animate non-human Targets. This suggests that word order variation for definite noun Target constituents is influenced by animacy rather than definiteness. For indefinite noun Target constituents, TV is more common than VT for all types of animacy. This suggests that word order variation for indefinite noun Target constituents is not influenced by animacy or definiteness. For unmarked definite Target constituents, VT is more common than TV for inanimate Targets, while TV is more common than VT for human and animate non-human Targets. This suggests that word order variation for unmarked definite Target constituents is influenced by animacy rather than definiteness. For unmarked indefinite Target constituents, VT is more common than TV for all types of animacy. This suggests that word order variation for unmarked indefinite Target constituents is not influenced by animacy or definiteness.
In NENA, formal definite marking rarely occurs. The table can be summarized as follows. For subject constituents, there is a significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 67.76, p < 0.001) and animacy (chi-square = 10.8, p = 0.01). For definite noun subjects, TV is more common for human subjects, while VT is more common for inanimate subjects. No animate non-human subjects have definite noun marking, suggesting animacy’s influence. For indefinite noun subjects, TV is more common for all types of animacy, indicating that word order variation is not influenced by animacy or definiteness. For unmarked definite subjects, TV is more common for human and animate non-human subjects, while VT is more common for inanimate subjects, again pointing to animacy’s influence. For unmarked indefinite subjects, TV is more common for human subjects, while VT is more common for animate non-human and inanimate subjects, suggesting animacy’s role.
For object constituents, there is no significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 0.87, p = 0.83) or across animacy (chi-square = 1.21, p = 0.75). This implies that the object constituent does not affect word order variation for either marked or unmarked Targets.
For Target constituents, there is a significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 113.64, p < 0.001) and across animacy (chi-square = 108.9, p < 0.001). For definite noun Targets, VT is more common for inanimate Targets, while TV is more common for human Targets. There are no animate non-human Targets with definite noun marking, indicating animacy’s influence. For indefinite noun Targets, TV is more common for all types of animacy, suggesting that word order variation is not influenced by animacy or definiteness. For unmarked definite Targets, TV is more common for human and animate non-human Targets, while VT is more common for inanimate Targets, once again highlighting animacy’s influence. For unmarked indefinite Targets, VT is less common for all types of animacy, indicating that word order variation is not influenced by animacy or definiteness.
Azeri is the only language that does not demonstrate a definite marking system. The table contains no data for definite noun Targets or indefinite noun Targets, which suggests that word order variation for these types of Targets is either not applicable or not observed in Azeri.
For unmarked definite Targets, TV is more common than VT for human and animate non-human Targets, while VT is more common than TV for inanimate Targets. For both marked and unmarked Targets, there is no significant difference in the proportions of TV and VT across definiteness (chi-square = 0.87, p = 0.83). This means that definiteness does not affect word order variation for either marked or unmarked Targets.
For the subject constituent, there is a significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 67.76, p < 0.001) and across animacy (chi-square = 10.8, p = 0.01). This means that the subject constituent affects word order variation for both marked and unmarked Targets. However, for definite noun and indefinite noun subject constituents, there are no available data in the table, suggesting that word order variation is not applicable or not observed in Azeri. For unmarked definite subject constituents, TV is more common than VT for human and animate non-human subjects, while VT is more common than TV for inanimate subjects, indicating that animacy influences the word order for unmarked definite subject constituents. For unmarked indefinite subject constituents, TV is more common than VT for human subjects, while VT is more common than TV for animate non-human and inanimate subjects, suggesting that animacy plays a role in word order variation, with no clear effect of definiteness.
For the object constituent, there is no significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 0.87, p = 0.83) or across animacy (chi-square = 1.21, p = 0.75). This implies that the object constituent does not affect word order variation for either marked or unmarked Targets.
For the Target constituent, there is a significant difference in the proportions of TV and VT word orders across animacy (chi-square = 108.9, p < 0.001) but not across definiteness (chi-square = 0.01, p = 0.92). This means that the Target constituent affects word order variation for both marked and unmarked Targets, depending on whether they are human, animate non-human, or inanimate.
Overall, the findings suggest that animacy has a significant influence on word order variation in Azeri, especially for unmarked definite and unmarked indefinite constituents. Definiteness appears to have a limited impact on word order variation. There is no clear interaction effect between definiteness and animacy in word order variation. These conclusions are supported by the chi-square test results provided in the text.
In Armenian, both the descriptive patterns in the table and the chi-square tests lead to the following conclusions: For definite noun Targets, VT is more common than TV for inanimate Targets, while TV is more common than VT for human and animate non-human Targets. This indicates that animacy significantly influences word order for definite noun Targets, with animacy being more influential than definiteness. For indefinite noun Targets, VT is more common than TV for all types of animacy, suggesting that word order variation for indefinite noun Targets is not influenced by either animacy or definiteness. For unmarked definite Targets, VT is more common than TV for inanimate Targets, while TV is more common than VT for human and animate non-human Targets. This implies that word order variation for unmarked definite Targets is influenced by animacy more than definiteness. For unmarked indefinite Targets, VT is more common than TV for inanimate Targets, while TV is more common than VT for human and animate non-human Targets. Similar to unmarked definite Targets, this suggests that animacy plays a more significant role in word order variation for unmarked indefinite Targets.
For the subject constituent, there is a significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 67.76, p < 0.001) and across animacy (chi-square = 10.8, p = 0.01), indicating that the subject constituent affects word order variation for both marked and unmarked Targets. In the case of definite noun subject constituents, TV is more common than VT for human subjects, and VT is more common than TV for inanimate subjects. There are no instances of animate non-human subjects with definite noun marking. This suggests that word order variation for definite noun subject constituents is primarily influenced by animacy, not definiteness. For indefinite noun subject constituents, VT is more common than TV for all types of animacy, indicating that word order variation for this group is not significantly affected by animacy or definiteness. Regarding unmarked definite subject constituents, TV is more common than VT for human and animate non-human subjects, while VT is less common than TV for inanimate subjects. This implies that animacy plays a more substantial role in word order variation for unmarked definite subject constituents, overshadowing definiteness. Similarly, for unmarked indefinite subject constituents, TV is more common than VT for human and animate non-human subjects, while VT is less common than TV for inanimate subjects. This further suggests that animacy is the dominant factor influencing word order for unmarked indefinite subject constituents.
For the object constituent, there is no significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 0.87, p = 0.83) or across animacy (chi-square = 1.21, p = 0.75). This implies that the object constituent does not significantly affect word order variation for either marked or unmarked Targets.
For the Target constituent, there is a significant difference in the proportions of TV and VT word orders across animacy (chi-square = 108.9, p < 0.001) and across definiteness (chi-square = 113.64, p < 0.001). This indicates that the Target constituent affects word order variation for both marked and unmarked Targets, with animacy playing a more significant role in influencing the word order depending on whether the Target is human, animate non-human, or inanimate. In summary, animacy appears to be a prominent factor influencing word order variation in Armenian, especially in the context of definite and indefinite Targets, unmarked definite and unmarked indefinite Targets, and unmarked subject constituents. Definiteness has a more noticeable impact on unmarked Targets. However, there is no clear interaction effect between definiteness and animacy in word order variation across the constituents. These findings are supported by chi-square test results provided in the text.
Finally, NEK exhibits a similar pattern to that of Mukri. For definite noun Targets, VT is more common than TV for human and inanimate Targets, while TV is more common than VT for animate non-human Targets. This suggests that word order variation for definite noun Targets is significantly influenced by animacy rather than definiteness. For indefinite noun Targets, TV is more common than VT for human Targets, while VT is more common than TV for animate non-human and inanimate Targets. This implies that word order variation for indefinite noun Targets is influenced by animacy rather than definiteness. For unmarked definite Targets, VT is more common than TV for inanimate Targets, while TV is more common than VT for human and animate non-human Targets. This suggests that word order variation for unmarked definite Targets is influenced by animacy rather than definiteness. For unmarked indefinite Targets, VT is more common than TV for all types of animacy. This suggests that word order variation for unmarked indefinite Targets is not influenced by animacy or definiteness. The chi-square test (181.4, p < 0.001) indicates that definiteness significantly affects word order variation for both marked and unmarked Targets.
For the subject constituent, there is a significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 67.76, p < 0.001) and across animacy (chi-square = 10.8, p = 0.01). This means that the subject constituent affects word order variation for both marked and unmarked Targets. In the case of definite noun subject constituents, TV is more common than VT for human subjects, while VT is more common than TV for inanimate subjects. No animate non-human subjects have definite noun marking, indicating that word order variation for definite noun subject constituents is influenced by animacy rather than definiteness. For indefinite noun subject constituents, TV is more common than VT for all types of animacy, suggesting that word order variation for this group is not significantly influenced by animacy or definiteness. In the case of unmarked definite subject constituents, TV is more common than VT for human and animate non-human subjects, while VT is less common than TV for inanimate subjects. This suggests that word order variation for unmarked definite subject constituents is influenced by animacy rather than definiteness. Similarly, for unmarked indefinite subject constituents, TV is more common than VT for human and animate non-human subjects, while VT is less common than TV for inanimate subjects. This suggests that word order variation for unmarked indefinite subject constituents is influenced by animacy rather than definiteness.
For the object constituent, there is no significant difference in the proportions of TV and VT word orders across definiteness (chi-square = 0.87, p = 0.83) or across animacy (chi-square = 1.21, p = 0.75). This indicates that the object constituent does not significantly affect word order variation for either marked or unmarked Targets.
For the Target constituent, there is a significant difference in the proportions of TV and VT word orders across animacy (chi-square = 1.21, p = 0.75) but not across definiteness (chi-square = 0.87, p = 0.83). This means that the Target constituent affects word order variation for both marked and unmarked Targets, depending on whether they are human, animate non-human, or inanimate.
In summary, this research reveals that animacy has a substantial influence on word order variation in NEK, particularly for definite and indefinite Targets, as well as unmarked definite and unmarked indefinite Targets. Definiteness plays a role, primarily affecting word order variation for unmarked Targets. However, there is no clear interaction effect between definiteness and animacy on word order variation across constituents. These findings are supported by the provided chi-square test results.
The analysis of the formal marking of definiteness demonstrates that the sample languages allow for various forms of word order, with Targets showing a clear preference for appearing in the postverbal position. The aim of this article was to analyze these various patterns based on their actual use in the existing corpora. With respect to definiteness, there are notable tendencies in the determination of word order. These tendencies open up a number of questions regarding related factors in word order variation. For example, is there a relation between definite and indefinite on the one side and referential and non-referential or specific and non-specific on the other? What is the relationship between syntactic constituents, such as subjects, objects, and Targets, and (in)definiteness with respect to the Target word order?
The hypothesis was that, in terms of definiteness, there is a distinction between the tendency for postverbal Targets that follow the verb directly to be marked by an indefinite marker or to be non-specific/referential and the tendency for those Targets that are in the preverbal position to be marked by a definite marker or to be unmarked definite. Among sample languages, Mukri presents a very systematic (in)definiteness system. NEK and Armenian display a(n) (in)definite marking system but not as detailed as Mukri’s definite system. NENA shows a moderate use of (in)definite markers, mainly used for indicating indefinite nouns. Finally, Azeri does not have a(n) (in)definite marking system. In languages without a formal marking of (in)definiteness, the word order indicates (in)definiteness. The results led to some interesting and hitherto unnoticed generalizations relating to the formal and semantic properties of these functions, as well as their positioning within the sentence in the sample languages.
The data for the syntactic constituents demonstrate that subjects are less marked by a(n) (in)definite marker than objects and Targets. Objects illustrate the strongest affinity to be marked with a definite marker in the sample languages. Only NENA and NEK exhibit a direct correlation between definiteness and word order.
As we have seen, the above-mentioned numerical results for definite Targets are very low. Mukri has a detailed marking system, while NEK and Armenian have definite systems that can use definite markers only for nominal Targets in the attested corpora.
I found more definite postverbal NP and adverbial Targets and fewer indefinite ones in the preverbal position. In a general sense, I was not able to identify the effects of the definiteness and animacy of constituents and, in particular, Targets’ pre- vs. postverbality within one single language, but by comparing several languages, this effect becomes objectively clearer. In fact, the sample languages show different patterns for the Target PoS as well as subjects and objects. Due to the special frequency of each PoS (for instance, NPs occur more frequently in the postverbal position, and pronominal Targets are mostly preverbal), one can clearly see the role of the PoS as the main influencing factor in word order and consequently in definiteness in different positions (see
Asadpour 2022b). This indicates that definiteness has a secondary role in the word order of constituents such as Targets. With the strong likelihood of definiteness for pronominal and bound pronoun Targets (see
Asadpour 2022b for the definition of bound pronouns in the sample languages), most of the bare definite Targets and constituents are preverbal. Regarding NPs, which can be marked by (in)definite markers, indefinite NPs are mostly preverbal, and definite NPs are postverbal.
In summary, in all of the sample languages, definite human subjects are placed more frequently in the preverbal position. Objects demonstrate more flexibility, and Targets are the most mobile constituents. Unmarked Targets of both the definite and indefinite types do not illustrate a particular preference, except for those that have been noted. The lack of obvious tendencies in the data and the lack of definite marking in the languages with existing definite marking systems are interesting results in themselves. They reveal that although these languages present definite marking systems, this factor is not the main influencing element in word order determination; rather, it plays a secondary role. It is also worth mentioning that objects were mostly marked for definiteness, and similarly, subjects and Targets illustrate the same behavior. This shows that in a continuous discourse, objects are usually newer or more contrastive than subjects and Targets, and that definite marking can play a role in re-highlighting the information. Targets are at least accessible information, but they are still background information (cf.
Asadpour 2022a). Hence, this may result in the less frequent marking of a Target with a definite marker. Furthermore, the overall picture of definiteness in Mukri, Armenian, and NEK indicates that in the postverbal position, the number of definite Targets is higher than in the preverbal position. This implies that the postverbal position has a preference for given information, and the preverbal position demonstrates a tendency for new information (see
Asadpour 2022a, sct. 5.7). In NENA and Azeri, this picture differs, and the trend does not show any special preference over the information structure based on definiteness or a connection with the semantics of constituents.
Finally, it seems that verb type influences DEF and -DEF. MOTION and CAUSED-MOTION verbs have the highest number of -DEF Targets, and the DEF Targets are for the other verb types (see
Asadpour 2022a,
2022b,
2022c).
5. Corpus Analysis of Semantic Definiteness
Targets have also been examined for their semantic differences (cf.
Lyons 1980;
Dik 1989) in terms of marking for definiteness. Below, I present a general overview of the parameters for the sake of convenience.
In
Table 11, the difference between identifiability and familiarity is that in identifiability, the element refers to an entity that is not identifiable by the listener and that can be textually understood. In contrast, familiarity refers to identifiable entities. For such entities, textually, there is usually a “referent identification” (cf.
Givón 1979, 296;
Lyons 1980, 173–88;
Dik 1989, 143–46). In addition to these two definite terms, there are other sources of availability and referentiality that can help the listener obtain relevant information, such as uniqueness (i.e., general knowledge), indexicality (i.e., the identifiability of the element depends on the reference and the speech event), anaphoricity (i.e., a non-relational referent to the context of the speech event), and rigidity (i.e., proper names).
Table 11 presents the placement of constituents according to their semantic definiteness (covering subjects, direct objects, and Targets).
Table 11 displays the variation in and distribution of semantic definiteness for all constituents, such as subject, object, or Target. Familiarity semantics is preferred over the other features, and identifiability is the second-most-common feature. In Mukri, familiarity and identifiability semantics occur mostly preverbally, while in other languages, both familiarity and identifiability are more frequent in the postverbal position. This can be due to the type of PoS or the animacy of the constituents (cf.
Asadpour 2022a,
2022b). Uniqueness in Mukri presents a higher frequency preverbally, while in the other languages, it demonstrates a postverbal tendency. Indexicality is noted only for Mukri in the preverbal position, but in Armenian, it is attested evenly in both positions. The rest of the languages did not exhibit this feature. In Mukri and NENA, rigidity is preverbally attested, and in Azeri, Armenian, and NEK, rigidity is postverbal. The above data yield the following hierarchy of definiteness encoding.
Table 12 shows that among the various semantic definiteness possibilities of different positions, uniqueness, rigidity, and indexicality clearly dominate the postverbal position, while familiarity and identifiability do not present a clear placement. Among the sample languages, Mukri is the only language that does not exhibit any preference over semantic definiteness. NENA, Azeri, and NEK show similar patterns with dominant postverbality over uniqueness, rigidity, indexicality, and familiarity. On the other hand, Armenian displays a mixed type of various forms, with a less intense preference for the postverbal position. By combining the results of various semantic definiteness parameters, it turns out that NEK and Armenian are typical postverbal languages. Among other sample languages, indexicality is presented in Armenian as having a postverbal tendency, and anaphoricity in Mukri has no clear placement preference; see
Table 13 below.
The discrepancy in definiteness in various positions can be partly explained by the genre of text, which awaits further investigation.
Table 13 and
Table 14 show that constituents in the postverbal position are mostly definite. The data shown above indicate differences in the examined corpora and the narrative style of the informants. In other words, the conciseness of the speaker leads to variability in the use of definiteness expressions, overt constituents, PoS, etc., which are signaled by different ways of marking definiteness.
For the irregularities in various placements of semantic definiteness, similarly to definite marking, I separate the constituents into subjects, objects, and Targets and analyze the effects of definite marking on each of them. This will give a clearer picture of the influence of semantic definiteness on the Target word order. I will further group the features to perform a pair analysis in relation to the animacy of the constituents.
The kinds of word orders in specific types of definite and indefinite Targets in
Table 15 are as follows: For familiarity-marked Targets, TV is more common than VT for all languages except Armenian, where VT is more common than TV. This suggests that word order variation for familiarity-marked Targets is influenced by language-specific factors rather than definiteness or animacy. For identifiability-marked Targets, TV is more common than VT for Mukri and NENA, while VT is more common than TV for Azeri and NEK. Armenian has a balanced distribution of TV and VT for identifiability-marked Targets. This suggests that word order variation for identifiability-marked Targets is influenced by language-specific factors rather than definiteness or animacy. For uniqueness-marked Targets, TV is less common than VT for all languages except Mukri, where TV is more common than VT. This suggests that word order variation for uniqueness-marked Targets is influenced by language-specific factors rather than definiteness or animacy. For indexicality-marked Targets, there are very few data available in the table. This suggests that word order variation for indexicality-marked Targets is not applicable or not observed in these languages. For anaphoricity-marked Targets, there are no data available in the table. This suggests that word order variation for anaphoricity-marked Targets is not applicable or not observed in these languages. For rigidity-marked Targets, TV is less common than VT for all languages except Mukri and Armenian, where TV is more common than VT. This suggests that word order variation for rigidity-marked Targets is influenced by language-specific factors rather than definiteness or animacy. For both marked and unmarked Targets, there is a significant difference in the proportions of TV and VT across semantic definiteness (chi-square = 181.4,
p < 0.001). This means that semantic definiteness affects word order variation for both marked and unmarked Targets.
The table also shows that postverbal Targets are accessible in terms of information, i.e., backgrounded information, while in subjects and objects, this information occurs mostly in the preverbal position (cf.
Asadpour 2022a).
For a clearer idea of what is happening for Targets in various positions, it is necessary to separate the Targets and explore them individually in relation to animacy.
Table 16 demonstrates the realization of Targets in terms of semantic definiteness and animacy effect.
Table 16 above shows that for familiarity-marked Targets, TV is more common than VT for human and animate non-human Targets in all languages except Armenian, where VT is more common than TV. For inanimate Targets, TV is less common than VT in all languages. This suggests that word order variation for familiarity-marked Targets is influenced by animacy rather than definiteness or language-specific factors. For identifiability-marked Targets, TV is more common than VT for human and animate non-human Targets in Mukri and NENA, while VT is more common than TV in Azeri and NEK. Armenian has a balanced distribution of TV and VT for identifiability-marked Targets. For inanimate Targets, TV is less common than VT in all languages. This suggests that word order variation for identifiability-marked Targets is influenced by animacy and language-specific factors rather than definiteness. For uniqueness-marked Targets, TV is less common than VT for human and animate non-human Targets in all languages except Mukri, where TV is more common than VT. For inanimate Targets, TV is less common than VT in all languages except Mukri and Armenian, where TV is more common than VT. This suggests that word order variation for uniqueness-marked Targets is influenced by animacy and language-specific factors rather than definiteness. For indexicality-marked Targets, there are very few data available in the table. This suggests that word order variation for indexicality-marked Targets is not applicable or not observed in these languages. For anaphoricity-marked Targets, there are no data available in the table. This suggests that word order variation for anaphoricity-marked Targets is not applicable or not observed in these languages. For rigidity-marked Targets, TV is less common than VT for human and animate non-human Targets in all languages except Mukri and Armenian, where TV is more common than VT.
For inanimate Targets, TV is less common than VT in all languages except Mukri and Armenian, where TV is more common than VT. This suggests that word order variation for rigidity-marked Targets is influenced by animacy and language-specific factors rather than definiteness. For both marked and unmarked Targets, there is a significant difference in the proportions of TV and VT across semantic definiteness (chi-square = 181.4,
p < 0.001). This means that semantic definiteness affects word order variation for both marked and unmarked Targets.
For a closer look at the overt realization of subjects, objects, and Targets, I cross-paired animacy and semantic definiteness for each feature.
Table 16,
Table 17,
Table 18,
Table 19 and
Table 20 are given below to present the data for subjects, objects, and Targets with respect to their definite marking and animacy in the sample languages. The categories begin with human (h), animate (a), and inanimate (i) definite familiar (DF), followed by human, animate, and inanimate definite identifiable (DI), definite uniqueness (DU), and definite rigidity (DR).
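The cross-pairing of animacy and semantic definiteness described above can be sketched as a simple tally. The category codes follow the abbreviations given in the text (h, a, i combined with DF, DI, DU, DR). The record format, the position labels "pre" and "post", and the sample records are hypothetical and serve only to illustrate the bookkeeping, not to reproduce the corpus counts.

```python
from collections import Counter
from itertools import product

ANIMACY = ["h", "a", "i"]                 # human, animate, inanimate
DEFINITENESS = ["DF", "DI", "DU", "DR"]   # familiar, identifiable, uniqueness, rigidity

# Category order as described in the text: hDF, aDF, iDF, then hDI, aDI, iDI, and so on.
CATEGORIES = [animacy + definiteness
              for definiteness, animacy in product(DEFINITENESS, ANIMACY)]


def tally(records):
    """Count constituents per (category, position) pair.

    Each record is a hypothetical (animacy, definiteness, position) triple,
    where position is "pre" (preverbal) or "post" (postverbal)."""
    counts = Counter()
    for animacy, definiteness, position in records:
        counts[(animacy + definiteness, position)] += 1
    return counts


# HYPOTHETICAL annotated constituents, not corpus data.
hypothetical_records = [
    ("h", "DF", "pre"),
    ("h", "DF", "pre"),
    ("a", "DF", "pre"),
    ("i", "DI", "post"),
    ("i", "DU", "post"),
]

counts = tally(hypothetical_records)
for category in CATEGORIES:
    print(category, counts[(category, "pre")], counts[(category, "post")])
```

Keeping the category and the position as separate parts of the key makes it straightforward to compute preverbal versus postverbal shares per category for each constituent type.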
The data presented in
Table 16 provide the basis for these conclusions: (a) Animate subjects are predominantly preverbal and recognized for having familiarity definiteness. Unique subjects are mostly animate, while rigid subjects are human. (b) Objects are mostly inanimate, with a lower number of animate objects that are all predominantly familiar and identifiable. Postverbal objects are predominantly inanimate familiar. In contrast to subjects, no human object is attested as unique and rigid. (c) Semantically definite Targets are mostly human and familiar, as are subjects. However, Targets are much more pronounced postverbally than subjects and objects, and unique Targets are mostly inanimate. Rigid Targets are also attested preverbally, and postverbal positions are attested equally for humans, with a dominant postverbal tendency for inanimate Targets. (d) Subjects and Targets are similar in terms of animacy, familiarity, and identifiability, while Targets and objects are similar in terms of inanimate unique and rigid entities. There is a discrepancy among postverbal Targets toward human uniqueness and rigidity; this is due to the information structure (see
Asadpour 2022a).
From
Table 17, the following conclusions are drawn: (a) Animate subjects are predominantly preverbal and recognized for having familiarity definiteness. Unique subjects do not present any sensitivity to animacy, while rigid subjects are predominantly human. (b) Objects are evenly human and inanimate for familiar entities, while identifiable objects are predominantly inanimate. (c) Targets display a more mixed type without a clear preference for preverbality or postverbality. However, unique Targets show a preference for the postverbal position, and (d) in contrast to Mukri, NENA shows a similarity in word order, as well as semantic definiteness between objects and Targets; i.e., these two constituents are treated similarly. The constituent order of objects and Targets in NENA demonstrates no clear preference for preverbal or postverbal positions; rather, the constituent order prefers an intermediate position.
The results of
Table 19 for Azeri show the following: (a) Animate subjects are predominantly preverbal and recognized for familiarity and rigidity. (b) Objects are mostly inanimate and identifiable. The trend for animate objects is very low. Postverbal objects are predominantly inanimate familiar. (c) Preverbal Targets are mostly human and familiar, while postverbal Targets demonstrate a tendency to be inanimate familiar and identifiable. Moreover, inanimate Targets are coded as unique with a large difference (86% postverbal vs. 11% preverbal). (d) In Azeri, none of the constituents exhibit similar tendencies to those of Mukri and NENA. In Azeri, postverbal objects are not typical; one reason for this lies in the type of PoS (see
Asadpour 2022b).
Armenian presents few instances of indexical elements (six indexicalized subjects and three objects). The trend of tendencies in Armenian illustrates that (a) animate subjects are predominantly preverbal and recognized for familiarity, and unique subjects are predominantly human; (b) objects are inanimate familiar and identifiable, and inanimate objects occur mostly in the postverbal position; (c) preverbal Targets are mostly human and familiar, while postverbal Targets show a tendency to be inanimate familiar and identifiable, and unique Targets are predominantly postverbal; and (d) in Armenian, subjects, objects, and Targets exhibit different patterns.
Finally, as indicated in
Table 20 and
Table 21, NEK demonstrates a pattern similar to that of Mukri, with a preference for (a) preverbal human familiar subjects; (b) preverbal inanimate objects with familiarity semantics and predominantly postverbal inanimate objects coded for identifiability; and (c) Targets in the preverbal position that are mostly animate and familiar but, in the postverbal position, show a preference for inanimate identifiable entities. Moreover, inanimate Targets marked for uniqueness and rigidity are predominantly postverbal.
Based on the data presented in
Table 17,
Table 18,
Table 19,
Table 20 and
Table 21, it seems that animate subjects are predominantly preverbal and recognized for having familiarity definiteness. Unique subjects are mostly animate, while rigid subjects are human. Objects are mostly inanimate, with a lower number of animate objects that are predominantly familiar and identifiable. Postverbal objects are predominantly inanimate familiar. Semantically definite Targets are mostly human and familiar, as are subjects. However, Targets are much more pronounced postverbally than subjects and objects, and unique Targets are mostly inanimate. Rigid Targets are also attested preverbally, and postverbal positions are attested equally for humans, as well as a predominantly postverbal tendency for inanimate Targets. Subjects and Targets are similar in terms of animacy, familiarity, and identifiability, while Targets and objects are similar in terms of inanimate unique and rigid entities. There is a discrepancy among postverbal Targets toward human uniqueness and rigidity; this is due to the information structure (see
Asadpour 2022a).
In summary, the sample languages allow for various forms of word order, with Targets showing a clear preference for appearing in the postverbal position. The aim of this study was to analyze these various patterns based on their actual use in the existing corpora.
The results for definiteness indicate that regardless of the genre of text, for example, procedural, prose, and memoriae, grammatical definiteness markers have only a secondary, not a primary, influence on word order. In the existing corpora, despite the lack of attested definite markers in Mukri, NEK, Armenian, and, in part, NENA, the listener is able to identify whether referents are definite or indefinite because of the preceding text. It can be concluded that the definite marking patterns in Mukri, NEK, Armenian, and NENA allow for freedom of word order, and that there is no definiteness constraint. In order to compensate for the absence of a definite marker, extra-textual knowledge of an element and its preceding occurrence in the context enable the identifiability of semantic definiteness. In such cases, grammatical markers are unnecessary, and this results in a flexible word order. Moreover, the infrequent use of definite markers in the sample languages that do have a marking system implies a gradual loss of functionality of definite marking. This is very apparent in Mukri. Regarding semantic definiteness, uniqueness and rigidity appear to be two of the most efficacious markers of definiteness for postverbal Targets. These markers also combine with other features, such as familiarity, to have a higher tendency toward the preverbal position. Identifiability combines with a higher tendency for preverbal Targets in Mukri and postverbal Targets in NEK. There are no further clear distinctions in the rest of the languages.