Any syntactic analysis of synthetic-compound formation must start from three basic questions about the nature of compound structure. These concern the building blocks of synthetic-compound formation; the base structure on which the derivational process operates; and the operations that derive the final synthetic compound structure. For a successful ‘morphology as syntax’ model of grammar, it is important that the syntactic principles that operate in synthetic-compound formation be general, independently motivated syntactic rules and not compound-specific processes.
3.1. The Status of Roots
We assume here that the building blocks of compound formation include pre-categorial roots, which remain active throughout syntactic operations (such as internal or external merge). This is a standard assumption within earlier versions of the DM analysis of compounding, such as
Harley (
2009,
2014). Let us take the Greek synthetic compound
pliroforiodhotis, ‘information-provider’, discussed in the previous section.
Harley (
2009) assumes the following structure for such a compound:
15.
A root predicate, dho (‘provide’), selects for a nominal argument, pliroforio (‘information’), and the root head moves to the predicate head (via head-movement). This root string is then nominalized by a category-assigning head, n, expressed as the nominalizing suffix -tis in Greek, and further head movement to the left of the nominalizing head derives the final synthetic compound structure.
The structure assumed here departs from
Harley’s (
2009) structure in a number of ways. However, before discussing the details, it is important to revisit the syntactic status of the root here. In several approaches within DM (see, for example,
Arad 2003;
Harley 2012;
Wood 2021;
Marantz 2021), all roots have to be categorized first before becoming visible for later syntactic operations, including the selection and licensing of arguments, and also in order to be interpreted (see
Harley (
2014) for a treatment of roots as active, argument-selecting syntactic objects, as well as response papers in the same volume, especially
Alexiadou (
2014), for arguments against such an approach). Thus, the structure of the compound above would be represented by a structure like (16), below (adapted from
Wood 2021):
16.
The structure in (16) resembles the structures in (2) if one decomposes morphological labels such as N and V into ROOT+category-assigning-head (√+n and √+v, respectively). It seems then that these approaches amount to a retreat to a more morphology-like base syntactic structure, with argument structure projected after the categorization of heads has taken place.
However, such an analysis cannot be maintained, at least not for the Greek compound data. The form of the nominal argument within a Greek compound is not inflected for gender/number features. Such a form is illicit as a free form in the language:
17.
a. | * i | plirofori | (-o) |
b. | i | plirofori | -a |
| the | information | -F.SG.NOM |
All Greek nouns (with some exceptions, especially loanwords) need to be inflected with gender, number, and case morphology. Noninflected nouns are only allowed in compound-internal (or derivational) contexts (see
Ralli 1992 for discussion).
Marantz (
2021) assumes that nouns are formed by a root attached to a gender feature, where gender would exhaustively classify nouns (partition the class of nouns) syntactically. Thus, a typical noun would have the following structure:
18.
In
Marantz’s (
2021) notation, √CAT{N} indicates a root that will become a noun (i.e., has a nominal predisposition) and g[common]
i indicates a gender feature (where the subscript
i identifies the specific set of phi features that this gender feature will be a member of). Thus, the little nominalizing n in
Wood’s (
2021) structure in (16) translates to the gender feature in Marantz’s structure in (18). If this is correct and based on the examples in (17), Wood’s structure cannot be accurate for Greek synthetic compounds such as
pliroforiodhotis.
The root status of the non-head element in synthetic compounds has been clearly supported in work on Greek synthetic compounds presented in
Iordachioaia et al. (
2017), who assume that the initial merging of the Greek synthetic compound elements involves two roots, a “verbal” root and a “nominal” root, which saturates the internal argument of the former. In this account, the internal argument root incorporates into the verbal element. The timing of the incorporation process results in different back-formed N–V compounds in Greek: If incorporation happens before categorization of the verbal root by a v head, then the resulting N–V compound verb has idiosyncratic stress patterns and stem form (for example,
vivli-o-detó, ‘to book-bind’; compare
déno, ‘to bind’). However, if incorporation happens after verbal root categorization (to an already categorized verb) then the resulting N–V compound has the typical source verb stress pattern and stem form (for example,
thiri-o-damázo, ‘to beast-tame’; compare
damázo, ‘to tame’).
In addition to the morphological evidence for assuming the active presence of roots (without categorization layers) in Greek synthetic compounds, additional evidence of the presence of roots in compounding is provided in numerous other languages, including Dutch primary compounds (
de Belder 2017), where it is clearly shown that primary compounds contain uninflected roots with generic (i.e., nonnominal) interpretations, or in Chinese (
Zhang 2007), where the only possible derivation of exocentric compounds in Chinese is shown to involve the merger of bare roots without syntactic features (see
Bauke 2014, ch. 2, for extended discussion of the presence of roots inside syntactically derived compounds).
We will assume then that the base structure of synthetic compounds in Greek involves a verbal predicate selecting for a root argument.
3.2. Merge and Licensing
In recent years, work within the minimalist program (
Chomsky 1995 and later work) has assumed that syntactic derivations proceed in steps that interface with the phonological and interpretive components (‘phases’ in
Chomsky 2001,
2005). In its initial implementation, Phase Theory assumed that the only possible phases are CPs and vPs (and possibly DPs, although this is left open in
Chomsky 2001).
Marantz (
2001) extends the phase inventory to include every case where category-changing morphology attaches to a syntactic structure, changing the extended projection. Thus, every time a nominalizer attaches to a verbal string, it defines a new phase. In other work, the term phase is applied to each domain of the verb in which an argument is added (
Sportiche 1999,
2005;
Hallman 1997;
Carnie and Barss 2006). Even though the notion of ‘phase’, and the range of domains to which it applies, is not yet completely clear, ‘phase theory’ has helped explain a number of problems within the syntactic component and has been applied to the analysis of morphological derivations with similar success, most notably in the work of
Marantz (
2001,
2006).
Furthermore,
Sportiche (
1999,
2005) puts forward a novel account of how verbal arguments acquire referential properties. The account is based on the assumption that selection is ‘strictly’ local. This means that predicates must select for bare NPs and that subsequent nominal layers (case, number, quantification) project outside the thematic domain and trigger movement of the argument NP to VP-external positions. Thus, a VP-internal argument is selected by the verb as an NP. It subsequently raises to number, case, and D projections outside the VP shell. The evidence that
Sportiche (
2005) provides for such a claim is drawn from reconstruction effects. Consider, for example, the following:
19. In 1986, no integer had been proved to falsify Fermat’s theorem.
Under current assumptions, the underlying structure for (19) would be something like (20):
20. In 1986, had been proved [no integer falsify Fermat’s theorem]
This structure should give rise to two different interpretations (depending on the scope of the determiner no with respect to the main predicate):
21.
a. In 1986, no integer x, it had been proved that x falsifies Fermat’s theorem
b. In 1986, it had been proved that no integer falsifies Fermat’s theorem
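The two readings in (21) can be stated as scope configurations. The following is only a notational sketch: the operator PROVED, the predicate names, and the constant FT (for Fermat’s theorem) are labels assumed here for illustration, and tense and the temporal adverbial are ignored.

```latex
\begin{align*}
\text{(21a)}\quad & \neg\exists x\,[\mathrm{integer}(x) \wedge \mathrm{PROVED}(\mathrm{falsify}(x,\mathrm{FT}))] \\
\text{(21b)}\quad & \mathrm{PROVED}(\neg\exists x\,[\mathrm{integer}(x) \wedge \mathrm{falsify}(x,\mathrm{FT})])
\end{align*}
```

In the first formula the negative quantifier takes scope over the matrix predicate; in the second it takes scope below it.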
However, the second interpretation is not possible, which means that the quantifier does not reconstruct in its base position. Assuming that there is always reconstruction when there is a movement operation,
Sportiche (
2005) concludes that (20) is not an accurate underlying representation for (19), and it should change to (22):
22. No ……prove …[EMBEDDED CLAUSE integer falsify…]
Thus, surface structure is derived by movement of the NP integer to the projection that hosts the quantifier in order to be quantized. Since this is not DP movement, reconstruction is not possible, and the paradox is explained straightforwardly.
What does this predict with respect to the lexical integrity effects discussed in the previous section? If the analysis is on the right track, then D-elements merge outside the verbal domain of ‘first merge’ between the verb and an unquantized nominal head. We propose that the domain of synthetic-compound formation (and, possibly, most ‘derivational’ processes) is exactly this ‘first merge’ domain (see, for example, ‘first phase syntax’,
Ramchand 2003; see also
Sportiche 1999 for a discussion of synthetic compounds). In other words, the domain of synthetic-compound formation includes the lower VP but no additional verbal functional layers.
Nominalizers (e.g., the agentive
–er in English or
–tis in Greek) merge at different levels in the syntactic spine, changing the categorial status of the projection from verbal to nominal (see
Alexiadou 2001;
Ntelitheos 2006,
2013 for different approaches on how this is achieved). In the case of synthetic compounds, the nominalizer merges directly above VoiceP (the projection where the external argument is licensed). Let us see how the proposed derivation is implemented.
3.3. Deriving Synthetic Compounds in Greek
Consider the formation of
thiriodamastis ‘beast-tamer’. The verb enters the derivation selecting its internal (THEME) argument in the lower vP: [vP [√P
damas- [√P
thiri-]]] (see also
Iordachioaia et al. (
2017), based on
Harley 2009,
2014). The argument is a root, and thus, it has no nominal properties or nominal functional layers.
The string of the preverbal predicate selecting for a root argument defines a symmetrical structure where each head c-commands the other since there is no functional material to create antisymmetry.
Delfitto et al. (
2008) show that such structures are not licit, as they create a Point of Symmetry of two substructures that share the same structural complexity (in this case, two roots, but it could also be two nominalized roots or two larger identical structures).
Delfitto et al. (
2008) identify such symmetric structures as instances of Parallel Merge, which constitute a violation of the Linear Correspondence Axiom (LCA) of
Kayne (
1994), resulting in a structure that cannot be linearized (see
Citko 2005;
Delfitto et al. 2008; and
Bauke 2014 for leftward movement as a symmetry-breaking mechanism for licensing purposes).
Therefore, the only way for the structure to be well formed is for the internal argument to move to a higher functional projection in order to be linearized. We propose that the movement of the internal argument in a synthetic compound to a preverbal position is triggered by such symmetry-breaking linearization requirements (see
Barrie 2006 for a somewhat different discussion of these issues). This predicts that such movement can only be licit in the environment of real ‘root’ compounds, i.e., compounds in which two roots merge at the lower level of the derivation. If the compound involves a merger of an inflected predicate or a nominal argument (with gender and number morphology) then no such movement is predicted (unless there are other independent reasons that force inversion; see discussion in
Ntelitheos and Pertsova (
2019) for this asymmetry in primary/root compounds in several different languages).
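The symmetry argument can be made concrete with a small sketch. The code below is only an illustration under simplifying assumptions: both roots are treated as terminals, c-command is defined over sisterhood, the lower copy of the moved root is represented as a separate silent node, and the labels `FP`, `F'`, and `F` are hypothetical names for the higher functional projection, not labels taken from the analysis.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Record the mother of each daughter so sisterhood can be computed.
        for child in self.children:
            child.parent = self

    def dominates(self, other: "Node") -> bool:
        """True if `other` is properly contained in this node."""
        return any(c is other or c.dominates(other) for c in self.children)


def c_commands(a: Node, b: Node) -> bool:
    """a c-commands b iff some sister of a is b or dominates b."""
    if a.parent is None:
        return False
    return any(s is b or s.dominates(b) for s in a.parent.children if s is not a)


def is_point_of_symmetry(a: Node, b: Node) -> bool:
    """Mutual c-command: neither element asymmetrically c-commands the other."""
    return c_commands(a, b) and c_commands(b, a)


# Step 1: bare merger of the two roots (assumed simplification of the base structure).
damas = Node("damas-")
thiri = Node("thiri-")
root_p = Node("√P", [damas, thiri])
symmetric_before = is_point_of_symmetry(damas, thiri)

# Step 2: the argument root is re-merged in the specifier of a higher
# (hypothetical) functional projection FP, leaving a silent lower copy.
damas_2 = Node("damas-")
lower_copy = Node("<thiri->")
moved = Node("thiri-")
f_bar = Node("F'", [Node("F"), Node("√P", [damas_2, lower_copy])])
f_p = Node("FP", [moved, f_bar])
symmetric_after = is_point_of_symmetry(moved, damas_2)

print(symmetric_before, symmetric_after)
```

Under these assumptions the two roots c-command each other in the base configuration, whereas after movement the raised root c-commands the predicate root but not vice versa, which is the asymmetry that linearization requires.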
Thus, the internal argument inverts over the verb to some licensing projection. We assume (following the discussion in
Ralli 1992 and later work) that this projection is headed by the compound marker
-o- in Greek. The status of this marker has received considerable discussion in the relevant literature (
Ralli 1992,
2013;
Booij 1992;
Scalise 1992) where it is termed a ‘linking vowel’. The discussion in
Ralli (
2013) clearly shows that the linking element serves to signal compound-formation processes, but only in the cases where the leftmost element of the compound is a root. The linking marker is dropped when this element is a fully (or partially) inflected word, e.g., in compounds borrowed from Classical Greek:
23. angeli.a-foros
message.ACC-bringer
‘messenger’
As we will see in the following section, this is also the case in phrasal compounds in Greek, where the argument of the verb is an inflected nominal, and thus, licensing is achieved via adjacency and not through inversion over the linker -o-.
Returning to the derivation of Greek synthetic compounds, the current step in the derivation is as in (24):
24.
At this point, the nominalizer merges, changing the projection from verbal to nominal. The nominalizer in the case of the agentive nominal
thiriodamastis/‘beast-tamer’ expresses the external theta-role of the verb. This means that the nominalizer merges above the projection where the external argument is licensed. Based on observations in
Kratzer (
1994) and subsequent work (see, among others,
Cuervo (
2003);
Collins (
2005);
Alexiadou et al. (
2006);
Merchant (
2008);
Harley (
2009)), we assume that the external argument is projected in the specifier of a separate projection VoiceP, which may additionally be the locus for voice morphology in passive voice structures. The lower vP projection maintains the role of verbalizing the root domain but may also carry semantic (and morphological) content with the introduction of causative or inchoative semantics (see, for example,
Cuervo 2003 for detailed discussion of the different flavors of little v). The structure, thus, is as follows:
25. [np tis [VoiceP [vP thiri- [-o- [vP [√P damas- [√P thiri-]]]]]]]
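The steps leading to (25) can be replayed mechanically. The sketch below is an illustration only: the function names `merge`, `bracket`, and `derive_agentive_compound` are hypothetical, trees are plain nested tuples, and both occurrences of the argument root are written out (the lower one being the base copy), exactly as in the bracketing in (25).

```python
from typing import Tuple, Union

Tree = Union[str, Tuple]  # a terminal string, or (label, daughter, daughter, ...)


def merge(label: str, *children: Tree) -> Tree:
    """Build a labelled constituent from one or more daughters."""
    return (label, *children)


def bracket(tree: Tree) -> str:
    """Render a tree as a labelled bracketing."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + " ".join([label] + [bracket(c) for c in children]) + "]"


def derive_agentive_compound(verb_root: str, arg_root: str,
                             linker: str, nominalizer: str) -> Tree:
    root_p = merge("√P", verb_root, merge("√P", arg_root))  # root selects a root argument
    v_p = merge("vP", root_p)                                # verbalizing layer
    link = merge(linker, v_p)                                # compound marker merges
    inverted = merge("vP", arg_root, link)                   # argument inverts over the linker
    voice_p = merge("VoiceP", inverted)                      # external-argument layer
    return merge("np", nominalizer, voice_p)                 # nominalizer changes the category


structure = derive_agentive_compound("damas-", "thiri-", "-o-", "tis")
print(bracket(structure))
```

Each call to `merge` corresponds to one step of the derivation described above, so the order of operations (root merger, verbalization, linker, inversion, Voice, nominalizer) can be read off the function body.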
The approach proposed here differs from the incorporation analysis of
Iordachioaia et al. (
2017), based on
Harley (
2009,
2014), in at least two crucial ways. Firstly, the agentive synthetic compound order of compound-internal elements is not achieved through incorporation but through phrasal movement of the root internal argument over the linking element
-o-. This means that the linker plays an active syntactic role (in the sense of
den Dikken 2006), and it is not simply a phonological reflex (as in
Ralli 2013). Secondly, the agentive interpretation of the resulting synthetic compound is achieved through a type of reduced relative clause formation. Thus, the suffix
-tis is assumed to be a relative pronoun of sorts, with the meaning ‘one (m.sg) who’. In
Iordachioaia et al. (
2017), the agentive nominalizing suffix (e.g.,
-er in English) is assumed to bind an external argument variable, <x>, introduced by the Voice projection (
Schäfer 2008;
Alexiadou and Schäfer 2010). However, a relative clause analysis does not presuppose the existence of an external argument and thus can potentially unify all cases where the affix is used with similar semantics, e.g., in non-verbal contexts (‘three-footer’, ‘Londoner’, or Greek
politis, ‘citizen’, from
poli, ‘city’, and
anatolitis, ‘easterner’, from
anatoli, ‘east’).
Subsequent movement of VoiceP to spec-nP is driven by the fact that the nominalizer is a bound morpheme, entering the derivation with the relevant morphophonological selection information. What is important here is that the nominalizer changes the properties of the extended projection from verbal to nominal and the distribution of the final string is that of an NP.
Ntelitheos (
2006,
2013) provides a discussion of the mechanisms involved in nominalization processes of this type, but the general idea here is that the higher nP is, in fact, a type of relative clause, with the suffix
-tis acting as a relativizer attracting the external argument of the verb (or is, in fact, itself a relativized external argument of the verb in some languages). This straightforwardly provides the semantic interpretation of these strings as ‘one who V(P)s’, e.g., ‘one who tames beasts’. In such an account, an agentive nominalization (of the type involved in these compounds) is a reduced and headless relative clause where the agentive argument is null. However, this does not exclude the case where a similar structure could be formed with an overt nominal as the agent. Thus, a ‘repairer’ is ‘one who repairs’ and a ‘repairman’ is a ‘person who repairs’ (allowing for a gender-neutral interpretation of the gendered compound).
Wood (
2021) proposes that the two strings (the derived nominal and the compound) must have different structures, given the incompatibility of parallel examples:
26. a. a frequent watcher of the movies
b. * a frequent watchman of the movies
However, the problem here may be the semantic incompatibility of the internal argument, given the semantically nontransparent meaning of man-compounds. Thus, watchman is not ‘man who watches’ in general but ‘man who works as guard’, and an internal argument may be forced if it is ‘something that can be guarded’:
27. John is a government-appointed watchman of the district school.
Several similar examples can be found with other compounds of this type: ‘a repairer of TVs’ vs. ‘a repairman of TVs’ and, even better, as a compound: ‘a TV repairman’.
Returning now to the derivation of (25), what predictions, if any, does this derivation make? This being a syntactic derivation, the fact that the leftmost element in the compound is interpreted as an argument of the verbal base follows directly from the derivational process: thiri- is the internal argument of the verb and is selected directly by it within the VP. This predicts the exclusion of additional internal arguments in the structure:
28.
* O Giannis | ine | thiriodamastis | liontarion. |
The Giannis | is | beast-tamer | lions.GEN |
‘Giannis is a beast-tamer of lions’ |
More importantly, most lexical integrity effects follow directly. As has been noted in the previous section, most lexical integrity effects are directly related to the referential properties of the internal (‘sublexical’) argument. However, if
Sportiche’s (
1999,
2005) proposal is on the right track, and the internal argument enters the derivation without any referential properties, then these effects disappear.
Let us revisit the relevant examples. We have seen that synthetic compounds seem to be anaphoric islands (8, repeated here as 29):
29.
* O Giannis | ine | plirofori1-o-dotis | alla den tin1 gnorizo. |
The Giannis | is | information-LNK-giver | but not 3SG.F know.1SG |
‘Giannis is an informer but I don’t know it (the information)’. |
It is well known in the relevant literature that only DPs (i.e., ‘quantized’ nominals) can be referential; NPs never can (cf.
Stowell 1991;
Longobardi 1994). If the nominal inside the synthetic compound has not been quantized (due to lack of the appropriate projections) then its inability to bind/co-refer with a compound-external referential expression follows straightforwardly.
The fact that the internal argument cannot be quantized is further supported by the impossibility of determiners, proper names, and pronouns as compound-internal arguments (13, repeated here as 30):
30.
a. | * [o kapno-] | kalierjia |
| [D tobacco-] | cultivation |
b. | * Giorgo-thavmastis |
| Giorgo-admirer |
31.
a. a Nixon admirer
b. the Euler number
c. my computer is an IBM machine
32. * a Bill admirer
It is therefore possible that the proper name inside the compound is not fully referential: it does not refer to the respective entity in the real world but rather to a specific property of that entity (e.g., in the case of ‘Nixon admirer’, to the policies of Richard Nixon). For Greek synthetic compounds, Alexiadou (2020) shows that a limited number of compounds, formed on verbal root heads that are remnants from earlier historical stages in the language (e.g.,
-ktonos, ‘killer’, or
-latris, ‘admirer’), allow for proper names as non-head compound elements. However, she supports an analysis of these as formed on bound roots, which have achieved (or are on the way to achieving) an affixal status in the language. Thus, the generalization that Greek synthetic compounds do not readily allow for the inclusion of proper names as non-head elements in the compound can be maintained.
Alexiadou (
2020) shows that this contrasts with English, where synthetic compounds do allow the inclusion of proper names as non-heads (as in 31). This is because, in her analysis, English synthetic compounds are formed on already categorized nominal elements (nPs) and not roots (or semi-affixes, in her analysis), as we have been arguing for Greek synthetic compounds (see
Alexiadou 2020 for further discussion).
Moving to other types of integrity effects, we have noted that no movement is allowed out of the compound (7, repeated here as 33):
33.
a. | O | Giannis | ine | plirofori-o-dotis. |
| The | Giannis | is | information-LNK-giver |
| ‘Giannis is an informer’. |
b. | * Ti | ine | o | Giannis dotis? |
| What | is | the | Giannis giver? |
| ‘What is Giannis giver (of)?’ |
In
Bresnan and Mchombo (
1995), this inability is taken as one of the main lexical integrity effects. However, the problem here is that there are numerous clearly syntactic structures that also disallow such movement. The effect observed in (33) is an instance of the well-known ban on left-branch extraction (
Ross 1967), which the following example also exhibits:
34.
a. I like green apples.
b. * Which do you like apples?
Other types of extraction unavailability can also be accounted for because of the phasal status of the synthetic compound. The Phase-Impenetrability Condition (
Chomsky 2001,
2005) states that, at each step of the derivation, the only elements available to subsequent syntactic operations are the head and specifier of the phase. This means that, for elements inside the compound, the rightmost element is not available for subsequent computations, as it resides below the phase head (i.e., the nominalizer). Thus, neither the leftmost nor the rightmost element of the compound can move, each for independent syntactic reasons, and the effect no longer requires an appeal to lexical integrity.
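The accessibility condition just described can be sketched as a simple check. This is an illustration only: the `Phase` record and the function names are hypothetical, and the example assumes, as in (25), that the nominalizer is the phase head with the verbal material in its complement and nothing at its edge.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phase:
    head: str
    specifier: List[str]   # material at the phase edge
    complement: List[str]  # material below the phase head


def accessible(phase: Phase) -> List[str]:
    """Elements visible to later operations: the specifier and the head only."""
    return phase.specifier + [phase.head]


def can_extract(phase: Phase, element: str) -> bool:
    """An element can be targeted from outside only if it is accessible."""
    return element in accessible(phase)


# Assumed configuration for the compound in (25): nominalizer as phase head.
compound = Phase(head="tis", specifier=[], complement=["thiri-", "-o-", "damas-"])

print(can_extract(compound, "damas-"))
print(can_extract(compound, "tis"))
```

On this configuration the verbal root in the complement is not accessible to operations outside the phase, while the phase head itself is.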
The unavailability of N–V compound verbs (i.e., noun incorporation) is due to the fact that, when the nominalizing affix is not projected, and the domain (and extended projection) remains verbal, the nominal argument would be forced to vacate the VP and check quantificational properties in the subsequent available projections. The only reason this is not a possibility in synthetic compounds is that the projection where definiteness is checked is not available, since the nominalizer has altered the extended projection. This has to be a language-specific property; in cases of noun incorporation (e.g., in polysynthetic languages), the nonreferential nominal can be licensed inside the VP. In addition, this predicts something that has not really been discussed in detail in the relevant literature. The only cases that are brought forward as examples for the unavailability of noun-incorporation (in English and other languages of this type) are infinitival or finite forms of the verb:
35.
a. * to truck-drive
b. * The farmer occasionally tobacco-produces.
However, when slightly nominalized forms of the verb are used (e.g., gerundive forms in progressive contexts) then incorporation seems to be okay:
36.
a. I’d download his podcast when I was truck driving part time.
b. We have been window cleaning in Dudley and Stourbridge for 13 years.
c. I was truck driving up north, too.
There are several examples of this type available, many appearing in simple web searches (36.a-36.b) or in the Corpus of Contemporary American English (36.c). This is predicted in the analysis proposed here, as the suffix -ing nominalizes the verbal structure, allowing for incorporation of the internal argument; then, the copular verb BE verbalizes this nominal structure again, allowing for subsequent verbal projections (aspect and tense) to merge. The issue of noun-incorporation of this type in English, therefore, needs to be explored in more detail, beyond quick dismissals based simply on the data of (35.a-35.b).
Finally,
Iordachioaia et al. (
2017) show that, in the case of Greek synthetic compounds, N–V verbs are available in many cases, but their existence is subject to certain restrictions. Greek has three types of synthetic compounds depending on whether they allow for backformed NV verbs or not: synthetic compounds that do not have any N–V verbs (e.g.,
anth-o-polis, ‘flower-seller’ vs. *
anth-o-polo, ‘to flower-sell’); synthetic compounds which can derive backformed N–V verbs with idiosyncratic stress patterns and stem forms (e.g.,
vivli-o-detis, ‘book-binder’ and
vivli-o-detó, ‘to book-bind’; compare
déno, ‘to bind’); and synthetic compounds with independent N–V compounds with the expected verb stems and stress patterns (e.g.,
thiri-o-damastis, ‘beast-tamer’ and
thiri-o-damázo, ‘to beast-tame’; compare
damázo, ‘to tame’).
The status and availability of N–V backformed verbs for these types of synthetic compounds need to be further explored (see
Iordachioaia et al. (
2017) for an analysis of what restricts the availability of these patterns in Greek).