Grammar Competition and Word Order in a Northern Early Middle English Text

Abstract: The Edinburgh Royal College of Physicians manuscript of Cursor Mundi and the Northern Homilies, a northern Middle English text from the early 14th century, contains unprecedentedly high frequencies of matrix verb-third and embedded verb-second word orders with subject-verb inversion. I give a theoretical account of these word orders in terms of a grammar, the 'CM grammar', which differs minimally in its formal description from regular verb-second grammars, but captures these unusual word orders through addition of a second preverbal A′-projection. Despite its flexibility, the CM grammar did not spread through the English-speaking population. I discuss the theoretical consequences of this failure to spread for models of grammar competition where fitness is tied to parsing success, and discuss prospects for refining such models.


Introduction
This paper has two goals. The first is to describe matrix verb-third and embedded verb-second orders with subject-verb inversion in Old and Early Middle English, with special reference to the Edinburgh Cursor Mundi and Northern Homilies, a northern Middle English document written in three hands from the early to mid 14th century, in which these orders are particularly common. It is especially noteworthy that matrix verb-third orders with inversion are common in this manuscript, because this contrasts with the matrix verb-second orders found in almost all Germanic languages.
An example of matrix V3 with inversion is given in (1), and an example of embedded V2 with inversion is given in (2). Because the grammar underlying these orders, the 'CM grammar', can generate these uncommon word orders in addition to many more common ones, it is more flexible than the northern Middle English described by Kroch and Taylor (1997), or indeed any other grammar described for any variety of English. The second aim of this paper is to draw out some theoretical implications of this very flexible grammar, which apparently existed in the margins of the history of English, from the perspective of approaches to syntactic change based on grammar competition (Kroch 1989; Yang 2002). These widely adopted approaches model a speaker's grammatical competence as a distribution over multiple grammars, where the relative weights, or fitness, of the different grammars are determined on Yang's (2002) model by their success in parsing sentences encountered during language acquisition. On such a model, the flexibility of the CM grammar should lead to that grammar having greater fitness than contemporary grammars and therefore spreading through the population. That, strikingly, did not happen: English word order evolved towards the fixed SVO order found since late Middle English, rather than the more flexible orders generated by the CM grammar. This suggests, contrary to the prediction of Yang (2002), that greater flexibility (more specifically, greater ability to parse sentence structures in the input) does not necessarily yield a selectional advantage, a conclusion that forces a reconsideration of the implementation of grammar competition.
Section 2 gives a brief introduction to relevant aspects of Old and Middle English word order, as background to the description in Section 3 of word order in the Edinburgh Cursor Mundi and Northern Homilies. Finally, Section 4 discusses the implications for models of grammar competition.

Background
I adopt the hypothesis, with its roots in Borer (1983), that syntactic variation reduces to variation in the specification of the properties of functional heads. The models of word order that I describe in this section should therefore ultimately be considered as models of the specification of heads such as C and I. I focus particularly on the uses that different grammars make of the heads which are most directly involved in models of V2 in early English, namely C and a lower head motivated in Haeberli (2000) which I will call F. There is a limited set of possible specifications of these heads, and no single specification is able, on its own, to capture the richness and variation attested in many Old and Middle English texts. However, on a grammar competition approach, we need not expect any single grammar to generate all the observed sentences in a text, because an individual in principle may have access to multiple distinct grammars (that is, multiple distinct sets of specifications of the properties of functional heads).
The rest of this section focuses on grammars that generate V2 and related orders in Old and Middle English. Section 2.1 describes the standard generative model of OE clause structure, deriving from van Kemenade (1987) and Pintzuk (1991). Section 2.2 focuses on the description of northern Middle English word order in Kroch and Taylor (1997), as a point of comparison for the northern Early Middle English of the CM grammar. Finally, in Section 2.3 we identify three grammars that we will compare to the CM grammar in later sections.

Verb-Second in Old English
Old English (OE) is a verb-second language. This means that in general, a single phrasal constituent precedes the finite verb in matrix clauses. The identity of this constituent ranges over most argumental and adverbial categories. In (3a), an object is in first position; in (3b), a prepositional phrase; and in (3c), an adverbial. However, pronominal subjects precede the finite verb, even if there is some other constituent in initial position (van Kemenade 1987; Pintzuk 1991). This means that verb-third word orders like (4) are standard. Certain initial elements trigger inversion, though, even when the subject is a personal pronoun. The most important members of this class of elements are the adverbs þa 'then' and nu 'now', as in (5); see Pintzuk (1991, pp. 145-50) for a full characterization of this class. Since Pintzuk (1991), these facts have motivated models of OE clause structure which involve multiple subject positions and multiple positions for the finite verb. Assume that the finite verb occupies the same position in (3) and (4). The word order difference between these two examples then indicates that there are two subject positions, one above the finite verb and one below it, with pronominal subjects restricted to the higher position and full NP subjects mainly occurring in the lower position.
The variation between (4) and (5) then reflects variation in the position of the finite verb. In (5), the verb moves higher than in (4), to a position above both subject positions.
A specific implementation of this analysis, from Haeberli (2000), works as follows (for other implementations, see Pintzuk 1991, 1993; Kroch and Taylor 1997, among many others). The top of the clausal functional sequence contains three projections, which I will call CP, FP, and IP (Haeberli uses different labels, but the specific labels are not very important). The higher and lower positions for the finite verb are C and F, respectively. The higher and lower subject positions are Spec,FP and Spec,IP, respectively. Spec,CP is the initial A′-position which the preverbal phrase occupies. The three patterns above are then derived as follows. Subject-initial V2 orders can also be derived by moving the subject to Spec,CP, regardless of the internal syntax of the subject and the position of the verb. 8
As for word order in subordinate clauses, OE does not typically show embedded V2 effects (Salvesen and Walkden 2017), although we will see some exceptions to this claim in Section 3.5. This means that embedded word order can be taken as a more or less faithful indicator of word order within IP. However, the syntax of the Old English IP is quite variable. It is common to assume, again following Pintzuk (1991, 1993), that there was competition between head-medial and head-final orders for both VP and IP in OE. Variation and change in the structure of IP is not a focus of this paper. However, it will be relevant below that Early Middle English was undergoing a change in progress, the outcome of which was the rigidly head-medial order which has been uniformly attested since late Middle English.
The analysis sketched here is known to undergenerate in that there are several other classes of V3 or V > 3 order in OE (Bech 2001; Haeberli 2002; Speyer 2010; Biberauer and van Kemenade 2011; Bech and Salvesen 2014; Salvesen and Bech 2014; Walkden 2014). However, previous work on V3 and V > 3 order in OE has focused on XSV and SXV orders, where the subject is preverbal.
In this paper, I am interested instead in XYVS orders, with two preverbal constituents, neither of them plausibly left-dislocated or left-adjoined, and a postverbal subject. These orders will be particularly relevant in the discussion of the CM grammar in Section 3. They are particularly interesting for the investigation of V2-like patterns, because the classical understanding of V2 puts a lot of explanatory burden on the existence of a single projection, CP, above the subject position, but under an expanded, cartographic view of clause structure, there are potentially several such projections. The greater the number of specifier positions that precede the position of the verb, the greater the number of routes to different varieties of V3 or V > 3 order, potentially including orders where the verb raises past the subject position but still has two or more A′-positions to its left (see, for instance, discussion of V3 orders in Walkden 2014). Therefore, on the one hand, inversion with multiple preverbal elements speaks against the empirically well-supported notion that second position is somehow special. On the other hand, the proliferation of left-peripheral positions since Rizzi (1997) undermines our theoretical understanding of the special nature of second position. The existence of a verb-third grammar with inversion is informative with respect to this theoretical tension.

Word Order in Northern Middle English
The earliest Middle English (ME) texts still show the three properties illustrated in (3-5). However, Kroch and Taylor (1997) demonstrate that the late 14th century northern prose Rule of St. Benet, one of the earliest surviving northern prose texts, does not have V3 orders in matrix clauses with subject pronouns. Instead, it shows the inverted order illustrated in (8).
(8) 'I will establish my school to God's service.' (cmbenrul-m3,4.84) 9
This means that the grammar of the Rule of St. Benet is like that of many present-day Germanic V2 languages: the verb moves to C, a single constituent moves to Spec,CP, and nothing needs to be said about the special status of pronominal subjects (Haeberli 2000).
Although PPCME2 does not contain any other texts which show this pattern so categorically, Kroch and Taylor argue that the distinctive grammar of the Rule of St. Benet is indicative of a longstanding dialect split in English, largely obscured by the uneven distribution of surviving Old and Middle English texts across dialects. They make a convincing claim that this northern V2 grammar was in fact inherited from OE, on the basis of an ingenious analysis of pronominal subjects in northern OE glosses, suggesting that this dialectal variation persisted for several hundred years. Moreover, they hypothesize that the distinctive syntax of the northern dialect could have its origins in contact with Old Norse, a possibility which implies that it could also be useful to compare the Rule of St. Benet to corpus texts from Lincolnshire and East Anglia, areas further south in the Danelaw where there was also significant contact with Old Norse.
Our information about this northern grammar remains sparse and fragile, precisely because the only source in PPCME2, for most purposes, is the Rule of St. Benet. An initial goal of the research reported in the present paper was to deepen our understanding of the syntax of northern Early Middle English (EME) by investigating texts not included in PPCME2, so as to better contextualize Kroch and Taylor's findings about the Rule of St. Benet. As we will see presently, this attempted replication of Kroch and Taylor (1997) was partially successful, but also threw up several unexpected grammatical complexities.
9 Northern prose Rule of St. Benet, early 15th century, Northern, PPCME2.

The Syntactic Context for the CM Grammar
On the basis of the facts above, we identify three different grammars which serve as points of comparison for the CM grammar. These are not complete grammars, but rather specifications of the properties of C, F, and their specifier positions. This reflects the thinking behind Yang's (2002) 'Variational Learner': that competition is between individual parameters (from a Borerian perspective, specifications of properties of functional heads), not between whole grammars.
The first two grammars we will consider have been discussed directly above; the third (the 'SV grammar') is a generalization over the non-V2 grammars which have predominated since late ME. Although there is clearly variation with respect to both V-to-I movement and the XV/VX parameter within the class of non-V2 grammars, this variation concerns the lower part of the clause, and we expect the properties of CP and FP to be uniform across all of the non-V2 grammars.
Because we assume a competition-based approach, we do not expect any one grammar to be able to capture all of the linguistic behaviour observed in a text. 10 Nevertheless, on Yang's model, the relative fitness of these grammars in different linguistic environments is determined by their relative parsing success.
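To make the link between parsing success and fitness concrete, the following is a minimal sketch of a linear reward-penalty update of the kind used in Yang's (2002) variational learner, as I understand it. The two toy grammars, the sentence labels, and the learning rate `gamma` are hypothetical and chosen only for illustration; they are not the grammars or data analysed in this paper.

```python
import random


def update_weights(weights, chosen, parsed, gamma=0.02):
    """One linear reward-penalty step over a list of grammar weights."""
    n = len(weights)
    new = []
    for j, p in enumerate(weights):
        if parsed:
            # Reward: the chosen grammar gains, the others shrink proportionally.
            new.append(p + gamma * (1 - p) if j == chosen else (1 - gamma) * p)
        else:
            # Penalty: the chosen grammar shrinks, the others share the freed mass.
            new.append((1 - gamma) * p if j == chosen
                       else gamma / (n - 1) + (1 - gamma) * p)
    return new


def learn(weights, can_parse, sentences, gamma=0.02, seed=0):
    """Sample one grammar per sentence and update the weights by its success."""
    rng = random.Random(seed)
    for sentence in sentences:
        chosen = rng.choices(range(len(weights)), weights=weights)[0]
        weights = update_weights(weights, chosen, can_parse[chosen](sentence), gamma)
    return weights


# Hypothetical toy grammars: a flexible one that parses every input order,
# and a rigid one that parses only subject-initial clauses.
flexible = lambda order: True
rigid = lambda order: order == "SVX"

# Hypothetical input: one XYVS clause for every four SVX clauses.
toy_input = (["SVX"] * 4 + ["XYVS"]) * 400
final_weights = learn([0.5, 0.5], [flexible, rigid], toy_input)
```

On this toy input the flexible grammar is never penalized, so its weight drifts upwards. This is the prediction at issue in Section 4: under this kind of update, a grammar that parses a superset of the input should win.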
10 It is natural to assume that all sentences in a text are generated by some grammar, so the full set of competing grammars to which an individual has access should in principle be able to generate a complete text created by that individual. I do not pay attention to this natural assumption here, for two reasons. First, nothing suggests that these three grammars, plus the CM grammar, are the only grammars of interest. Second, not everything which has been written down is grammatical, for instance because of scribal errors.
11 This does not reflect the assumption, common since Bech (2001), that this position allows a wider range of discourse-given noun phrases, because I am not currently in a position to compare all four grammars with respect to the information-structural characteristics of Spec,CP and Spec,FP.
12 Adjunction to CP or FP would lead us to expect widespread V > 2 orders, contrary to the evidence that OE and EME are predominantly V2 languages. Haeberli (2000) demonstrates that adjuncts can surface between the finite verb and a full NP subject. We model this as IP-adjunction, deviating from Haeberli.
13 In the study of competition among these grammars in Section 4, X is restricted to adjuncts in embedded clauses in all three grammars described here. That is, I assume that these grammars can generate embedded SVO and SOV orders, but not embedded OSV.
14 See discussion in Haeberli (2000).
The grammars are the following.

• No adjunction to CP. Adjunction to IP is permitted.
Word orders generated by the northern V2 grammar are the following.
3. The SV grammar
• V remains below C in all clauses.
• Adjunction to IP permitted.
Word orders generated by the SV grammar are the following.
I now proceed to investigate a well-defined set of word orders which cannot be generated by any of these grammars.

Word Order in the Edinburgh Cursor Mundi
In an investigation of grammar competition in historical texts, we have to make careful use of concrete, observable word order to draw inferences about grammars, which are more abstract objects that cannot be observed directly. In what follows, I will distinguish between 'the Edinburgh manuscript', a manuscript copy of the Cursor Mundi and the Northern Homilies; certain grammars represented, by hypothesis, in the Edinburgh manuscript; and the observable word orders in the manuscript.
The distinctive word orders identified by Kroch and Taylor in the Rule of St. Benet are also found in the Edinburgh manuscript. In particular, pronouns do not behave differently from full NPs with respect to subject-verb inversion. This is the hallmark of the northern V2 grammar. However, the Edinburgh manuscript is syntactically heterogeneous, and contains orders that cannot be described by the northern V2 grammar, or indeed either of the other grammars identified in Section 2.3. These include matrix V3 orders with subject-verb inversion, and embedded V2 orders. I refer to these two orders as the 'CM orders', although we will see presently that they are not unique to Cursor Mundi or the Edinburgh manuscript.
The CM orders motivate a fourth grammar, in addition to the three described in Section 2.3, which can generate these orders. I will call it the 'CM grammar'. In the CM grammar, Spec,CP and Spec,FP are both preverbal A′-positions. Both of these positions are available in the matrix clause, and matrix V3 orders with inversion arise when they are filled by different phrases. Spec,FP also features in embedded clauses, and embedded V2 orders with inversion arise when it is filled by a nonsubject.
Section 3.1 describes the corpus investigation that generated the findings reported here. Section 3.2 describes the word order properties that the Edinburgh manuscript shares with the Rule of St. Benet, and with EME more generally. These word orders do not provide evidence for the CM grammar, though in many cases they are compatible with it. Sections 3.3 and 3.4 document the two CM orders, embedded verb-second and matrix XYVS, respectively. Finally, Section 3.5 compares word order in the Edinburgh manuscript with other Old and Middle English texts. We will see that the orders uniquely generated by the CM grammar are present across many, but not all, Old and Middle English texts, and can be doubly dissociated from the north/south split identified by Kroch and Taylor. We end with some speculations about why the CM grammar is particularly visible in the Edinburgh manuscript, given that it is not unique to the manuscript.

Method
All data reported in this paper, unless noted otherwise, come from the York-Toronto-Helsinki parsed corpus of Old English prose (YCOE, Taylor et al. 2003), the Penn-Helsinki Parsed Corpus of Middle English, 2nd edition (PPCME2, Kroch and Taylor 2000), and the Parsed Linguistic Atlas of Early Middle English (PLAEME, Truswell et al. 2019). I ran a series of coding queries in CorpusSearch (http://corpussearch.sourceforge.net, accessed on 12 March 2021) on these corpora. The queries for each corpus were identical except for minor modifications required because of the slightly different annotations in each corpus.
YCOE and PPCME2 are industry-standard parsed corpora of historical English. PLAEME is a new parsed corpus, focusing on the 1250-1325 period, which is underrepresented in PPCME2. It contains texts from the unparsed Linguistic Atlas of Early Middle English (Laing 2013), annotated with information about syntactic structure in the format of PPCME2. The texts contained in PLAEME are arguably of a lower quality, for the purposes of syntactic research, than the texts included in PPCME2: they are mainly in verse, and often short and/or fragmentary. Nevertheless, they have the undeniable virtue of existing: data from this period is particularly sparse, and we must work with what we have, while being sensitive to the limitations of the available evidence. For instance, there are legitimate concerns about the use of verse texts for word order research. However, it is possible to draw inferences about syntax from verse: see Truswell et al. (2019) for discussion, and Trips (2003) for a generative example, discussing Stylistic Fronting in the Ormulum, the only verse text in PPCME2.
Although PLAEME is first and foremost a supplement to PPCME2, it retains some of the functionality of LAEME as a dialect atlas, and the geographical coverage of PLAEME significantly improves on the corresponding M2 period (1250-1350) in PPCME2. PLAEME contains several texts from within the Danelaw. Most are too short for extensive investigation of word order, but there are two longer texts: an early 14th century verse Genesis and Exodus composed in Norfolk (ms. Cambridge, Corpus Christi College 444), and a major mid-14th century northern manuscript in three hands, the Edinburgh Royal College of Physicians Cursor Mundi and Northern Homilies, composed in Yorkshire in the first half of the 14th century, almost 100 years earlier than the Rule of St. Benet. This manuscript, as transcribed in LAEME and annotated in PLAEME, is our main focus in this paper. Despite some variation between the grammars of the three hands, the text is reasonably uniform with respect to the syntactic properties investigated here, so I will treat the three hands together.
The corpus queries coded each declarative matrix clause, and each finite declarative complement or adverbial clause, for the nature of the subject (full NP or personal pronoun), and the category of the first four constituents in the clause. 15 The decision to code only the first four categories reflected a trade-off between practicality and informativity: four categories is sufficient to allow investigation of XYVS orders, and the coding queries were already quite unwieldy, each requiring around a day to run on a standard desktop computer, so there was a practical reason not to code for the category of further constituents.
I excluded several types of parenthetical and/or left-peripheral elements from consideration. These included all constituents tagged with -LFD ('left-dislocated') or -PRN ('parenthetical'), vocatives, and interjections. The exclusion of these elements allowed for a cleaner focus on CP-internal word order. Also excluded were all clauses introduced by a coordinating conjunction (because noninitial conjuncts often show subordinate clause word order, even in matrix clauses), and all clauses with a null subject, whether coded as expletive, pro, or trace.
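The coding and exclusion steps just described can be sketched as follows. This is not the CorpusSearch query set used for the study, only a hedged illustration in Python. The input format (a list of label and pronoun-flag pairs), the function name `code_clause`, the simplified labels `V`, `CONJ`, `NP-VOC` and `INTJ`, and the treatment of null subjects as the absence of any `NP-SBJ` constituent are all assumptions made for the example.

```python
# Assumed tag shapes: dash-suffixes for left-dislocated and parenthetical
# constituents, and simplified labels for vocatives and interjections.
EXCLUDED_SUFFIXES = ("-LFD", "-PRN")
EXCLUDED_LABELS = {"NP-VOC", "INTJ"}


def code_clause(constituents, max_positions=4):
    """Code one clause, or return None if the clause is excluded.

    constituents: list of (label, is_pronoun) pairs in surface order,
    where "NP-SBJ" marks the subject and "V" the finite verb.
    """
    if constituents and constituents[0][0] == "CONJ":
        return None  # clause introduced by a coordinating conjunction

    # Drop parenthetical and left-peripheral material before counting positions.
    kept = [(label, pron) for label, pron in constituents
            if not label.endswith(EXCLUDED_SUFFIXES)
            and label not in EXCLUDED_LABELS]

    subjects = [pron for label, pron in kept if label == "NP-SBJ"]
    if not subjects:
        return None  # no overt subject (stands in for null-subject clauses)

    return {
        "subject": "pronoun" if subjects[0] else "full NP",
        "order": [label for label, _ in kept[:max_positions]],
    }


# Hypothetical clause: left-dislocated adverbial, PP, adverbial, verb, pronoun subject.
example = code_clause([
    ("ADVP-LFD", False), ("PP", False), ("ADVP", False),
    ("V", False), ("NP-SBJ", True),
])
```

In this sketch only the first four retained constituents are recorded, which is enough to distinguish XYVS from XYSV orders.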
These queries permitted a wide-coverage investigation of the presence or absence of inversion in particular contexts. However, there are several special cases that need to be treated separately. These include negation (which in Old and Early Middle English always immediately precedes the finite verb and need not fill the preverbal A′-specifier position), pronominal objects (which in many cases are subject to positional restrictions similar to those of pronominal subjects), monosyllabic deictic adverbs such as þa, which have information-structural effects with many subtle consequences for word order (van Kemenade et al. 2008), particles in verb-particle constructions (where the determination of word boundaries is often problematic), participles and other constituents which participate in Stylistic Fronting, and correlative structures such as if … then … or as … so …, which trigger inversion in main clauses in many texts where inversion is not otherwise common. In order to treat the 'basic' pattern separately from all these special cases, in this paper I will focus on the order of verb and subject in non-correlative contexts with initial PPs, AdvPs other than those like þa, full NP objects, and less commonly APs, fronted VPs, or fronted infinitival clauses.

Unsurprising Word Orders
The most common orders in the Edinburgh manuscript are verb-second orders, with or without inversion. Example (9) shows the uninverted order, while (10) shows inversion in a range of nonsubject-initial orders. 16 The Edinburgh manuscript also shows inversion with pronominal subjects, as in (11). As discussed in Section 2.2, this is claimed by Kroch and Taylor (1997) to be a hallmark of northern Old and Middle English dialects. The existence of these orders therefore suggests that the northern V2 grammar is well represented in the Edinburgh manuscript.
16 PLAEME, following LAEME, transcribes manuscripts more faithfully than YCOE or PPCME2, which are based on edited texts. I have represented examples from PLAEME in the orthography implied by PLAEME's transcription conventions. As a result, examples in this paper drawn from PLAEME contain several initially unusual-looking abbreviations, not found to the same extent in examples from PPCME2 or YCOE. These include a macron for a following nasal (ī = 'in'), a superscript vowel for a preceding <r> (gᵃce = 'grace'), and the <  > symbol for <er> (lau  d = 'lord'). Linebreaks are represented with '\'.
17 Edinburgh Cursor Mundi, Hand C, early 14th century, Northern, PLAEME.
Several kinds of V3 and V > 3 orders discussed by Bech and Salvesen (2014) and Salvesen and Bech (2014) are also attested in the Edinburgh manuscript. (12a) shows an SXV order, (12b) an XSV order, and (12c) an XSYV order. I have not investigated the distribution of these V3 and V > 3 orders without inversion in any detail, so I do not know whether the factors which Bech (2001) and subsequent authors identify as conditioning these word orders also apply in the Edinburgh manuscript. Section 3.4 focuses instead on V3 orders with inversion, which have not previously been discussed to the same extent.
As for subject positions, Haeberli (2000) argued that there is only a single subject position, Spec,IP, in the northern prose Rule of St. Benet, unlike non-northern V2 grammars with a distinct higher position for subject pronouns. The Edinburgh manuscript is roughly as Haeberli describes for the Rule of St. Benet: pronominal subjects do not generally appear higher than full NP subjects. However, there is a low, typically clause-final, subject position restricted to definite full NPs, illustrated in (13). However, full NP subjects can also appear in the same position as subject pronouns, whether definite, as in (15a), indefinite, as in (15b), or quantificational, as in (15c). I interpret these facts as implying that there is only one left-peripheral subject position, in Spec,IP, together with a lower rightward subject position whose syntactic status is unclear. Pending further investigation, I will represent the lower position as a rightward specifier of vP, as in (16), but I will restrict attention to the higher subject position in what follows.
18 There are only two indefinite NPs which are plausibly in this low subject position. One is a there-insertion sentence and the other is an apparent NPI in the scope of negation. It would therefore perhaps be more accurate to claim that indefinite full NPs can only occur in this lower position if they are in the scope of a licensor. However, with only two such examples, it is hard to propose such a generalization with any certainty.

Surprising Word Orders: Embedded Verb-Second
The predominant order in embedded clauses is SVO, as in (17), with a minority of SOV orders as in (18). The embedded V2 orders with inversion in (19) cannot be generated by either of the V2 grammars considered in Section 2.3, because the preverbal A′-position for those grammars is Spec,CP, which is ordinarily incompatible with the presence of a complementizer in C. Instead, we will attribute them to the CM grammar, and assume that in the CM grammar, the verb raises to a head lower than C, and that the specifier of that head is an A′-position. In the terms used in Section 2.1, a natural interpretation would be that the verb raises to F in the CM grammar, and that Spec,FP is an A′-position. Embedded V2 orders then instantiate the structure in (20). The competition between the embedded V2 orders in (19) and the non-V2 subordinate clause orders in (21-22) implies structural heterogeneity of the sort which is expected under a grammar competition model, as there is no single grammar in this format that I am aware of that could generate this diverse range of orders.
For completeness, there are also 547 examples of embedded S-V(-X) orders like (9), which are equally compatible with all four grammars under consideration.

Surprising Word Orders: Matrix Verb-Third with Inversion
If we couple the CM grammar's embedded V2 structure with a further A′-position in matrix Spec,CP, we derive an alternation between matrix V3 and embedded V2 orders, both permitting inversion. The matrix V3 structure is as in (24). The simplicity and systematicity of the formal description of the CM grammar offer some support for the claim that these matrix V3 orders are a product of the same grammar as embedded V2. The CM grammar differs from the northern V2 grammar in just two respects: the verb moves to F, rather than C; and Spec,FP is an A′-position which must be filled. Likewise, it differs from the non-northern V2 grammar in two respects: the A′-nature of Spec,FP, as just described, and the fact that subject pronouns are not restricted to preverbal position. 22 Because of the flexibility of the structure in (24), with the two A′-positions to the left of the verb, very many orders discussed above are compatible with the CM grammar, including all matrix orders with the verb in second or third position, and all embedded verb-second orders. In fact, the matrix word orders that the CM grammar can generate are a proper superset of the orders generated by either of the V2 grammars described in Section 2.3, because a single constituent can move to Spec,FP and then on to Spec,CP, as in (26). As mentioned above, matrix verb-second orders are common in the Edinburgh manuscript, much more common than matrix XYVS orders. However, this in itself is uninformative about the distribution of the CM grammar, as opposed to a regular V2 grammar, because of the compatibility of the V2 structure in (26) with the CM grammar.
Likewise, matrix XSV orders like (12b), which provide a common source of evidence for the loss of V2, are uninformative with respect to the CM grammar, because these are verb-third orders, and so compatible in principle with the CM grammar. Such orders therefore do not lead to a fitness advantage for the SV grammar over the CM grammar. It is, however, possible to find unequivocal evidence in matrix clauses for an SV grammar distinct from both the CM grammar and both V2 grammars. This comes from sentences like those in (27), where the verb occupies fourth (or later) position and is preceded by the subject. There is not space for three preverbal constituents in the structures generated by other grammars, unless some of the constituents are left-dislocated (which seems particularly implausible in the case of (27b), with a particle in first position), so these examples must be generated by a regular SV grammar with left-adjunction to a pre-subject position.
Walkden (2014) developed an analysis of the Old English V3 orders first described by Bech (2001), according to which the two preverbal constituents occupy A′-positions. This is in opposition to the approach in Pintzuk (1991) where the second constituent occupies an A-position. Clearly, Walkden's analysis is very similar to the analysis developed here. However, it differs in that the second A′-position for Walkden was Spec,FamP, restricted to discourse-given constituents. It does not appear that any such restriction is evident in the Edinburgh manuscript. If one were to adopt Walkden's analysis of OE V3, it would then be possible to claim that the difference between the non-northern V2 grammar and the CM grammar lies in the nature of the second A′-position, with this being less restricted in the CM grammar. Such an account must await a better understanding of information structure in the CM grammar.
23 Alternatively, one could suggest that only Spec,FP needs to be filled on the CM grammar, and Spec,CP can optionally be filled.
In sum, there is a small number of examples in matrix clauses in the Edinburgh manuscript which can be generated by an SV grammar but not the CM grammar, a small number of examples which can be generated by the CM grammar but not the SV grammar, and a very large number of examples (including all SVX and XSV examples) which are in principle compatible with both grammars.
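The three-way division in this summary can be illustrated with a deliberately simplified compatibility check for matrix clauses. The sketch below is an assumption-laden toy, not the formal grammars of Section 2.3. It treats the northern V2 grammar as requiring exactly one preverbal constituent, the CM grammar as allowing one or two, and the SV grammar as requiring the subject immediately before the verb with any number of adjoined phrases to its left. The non-northern V2 grammar is left out because it depends on the pronominal status of the subject. The function name and the labels `S`, `V`, `X`, `Y` are hypothetical.

```python
def compatible_grammars(order):
    """Return the (simplified) grammars compatible with a matrix clause.

    order: surface sequence of labels containing exactly one "S" (subject)
    and one "V" (finite verb); any other label is a non-subject constituent.
    """
    v = order.index("V")  # also the number of preverbal constituents
    s = order.index("S")
    result = set()
    if s == v - 1:
        # SV grammar (simplified): subject immediately before the verb,
        # preceded by any number of adjoined phrases.
        result.add("SV")
    if v == 1:
        # Northern V2 grammar (simplified): exactly one preverbal constituent.
        result.add("northern V2")
    if v in (1, 2):
        # CM grammar (simplified): verb in second or third position.
        result.add("CM")
    return result


only_cm = compatible_grammars(["X", "Y", "V", "S"])   # matrix V3 with inversion
only_sv = compatible_grammars(["X", "Y", "S", "V"])   # verb in fourth position
ambiguous = compatible_grammars(["X", "S", "V"])      # XSV
```

Under these assumptions, XYVS is compatible only with the CM grammar, XYSV only with the SV grammar, and SVX and XSV with both, which mirrors the distribution of evidence described above.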

Comparisons
Although the influence of the CM grammar is particularly clear in the Edinburgh manuscript, the surprising orders that the CM grammar generates are not unique to that manuscript, and the CM grammar is not the only grammar visible in it. In this section, I investigate the distribution of the CM grammar, and address the question of why it is particularly visible in the Edinburgh manuscript. As well as situating the Edinburgh manuscript and the CM grammar in context, this quantitative investigation actually gives rise to a new argument that the CM orders are the product of a single grammar: the rate of matrix V3 with inversion is positively correlated with the rate of embedded V2 with inversion across texts, as would be expected if these two orders are distinctive products of a single grammar, differently weighted by different individuals.
As shown in Figure 1, the frequency of matrix XYVS orders is correlated with the frequency of embedded XVS orders across texts, once the special cases described in Section 3.1 are excluded (Spearman's ρ = 0.49, p < 0.00004, on log-transformed counts for all texts in YCOE, PPCME2, and PLAEME with >10,000 words). This corroborates the hypothesis laid out in previous sections, that these two word orders are the product of a single grammar: the frequency of both of these orders then correlates with the weight of the CM grammar. In the following subsections, I ask more focused questions about the spatiotemporal distribution of the CM grammar. In turn, I investigate the CM orders in northern ME, Old English, and late ME.
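The rank correlation reported for Figure 1 can be illustrated with a minimal sketch. The per-text counts below are invented for illustration and are not taken from YCOE, PPCME2, or PLAEME; the use of log(count + 1) to handle zero counts is likewise an assumption, since the treatment of zeros is not specified in the text. Spearman's ρ is computed here as Pearson's correlation over average ranks.

```python
import math

def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical counts per text (assumption: not real corpus data).
matrix_xyvs = [3, 12, 0, 45, 7, 1]
embedded_xvs = [5, 9, 1, 30, 2, 4]

# Assumed transform: log(count + 1), so that zero counts remain defined.
log_matrix = [math.log(c + 1) for c in matrix_xyvs]
log_embedded = [math.log(c + 1) for c in embedded_xvs]

rho = spearman(log_matrix, log_embedded)
print(round(rho, 3))
```

Because Spearman's ρ depends only on ranks, a monotone transform such as the logarithm changes the appearance of the scatter plot but not the coefficient itself.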
Before I begin these comparisons, I will briefly dismiss two alternative hypotheses as to why the CM grammar is particularly prominent in the Edinburgh manuscript. Firstly, it is not explanatory to attribute the prominence of the CM grammar in the Edinburgh manuscript to the fact that this text is in verse. The vast majority of PLAEME is in verse, as is the Ormulum in PPCME2, but not all of these texts show the CM orders at the same rate. So to claim that verse is somehow responsible for these word orders in the Edinburgh manuscript, one would have to accept that the creator of this text allowed himself more freedom than other authors did to bend grammar to meet the metrical constraints of the poem. It seems unlikely that any real explanation could be developed along these lines.
Secondly, it does not seem likely that these word orders could be a product of language contact, or at least not uniquely. The two most plausible contact languages to consider are Old Norse and Old French, because the Edinburgh manuscript was composed in Yorkshire, within the Danelaw, and some of its material had French sources. However, Old Norse does not display the relevant orders. According to Faarlund (2007), Old Norse is a regular verb-second language, with the only non-V2 matrix clauses being some cases of V1 which Faarlund analyses as having a null initial topic, and therefore actually conforming to the V2 pattern. Moreover, embedded V2 with inversion is limited to a construction where a topic follows the complementizer at 'that'. The details of this construction do not match embedded V2 in the CM grammar, where a range of complementizers and prepositions can introduce the subordinate clause, and where the initial element can belong to categories, such as VP or nominal and adjectival predicates, which cannot be sentence topics. Although this does not rule out the possibility of Old Norse influence on the CM grammar, it rules out any simple story where these structures are borrowed from Old Norse.
Old French is in some respects a more promising source: it shows matrix V3 orders with inversion, like (28), and embedded V2 orders, like (29).
'Now the story tells you that the gentleman made Lancelot stay with him for three days.' (Graal, col. 187s, l. 3, Salvesen and Walkden 2017, p. 177)
However, I will demonstrate in Section 3.5.2 that these orders also existed in OE, before any extensive French influence on English morphosyntax. This suggests that any effect of language contact with French would be more subtle than simple borrowing. At best, the effect of contact with French would be to amplify possibilities that were already present to some extent in English grammar.

Northern Middle English
Although the Edinburgh manuscript is a major northern ME text, there is no reason a priori to expect a direct grammatical relationship between the CM grammar and the northern V2 grammar described by Kroch and Taylor (1997). The two grammars make different claims about the position of the finite verb and about the status of Spec,FP, and are therefore best treated as distinct grammars which both happen to be prominent in northern ME texts.
However, we might still expect a quantitative, indirect relationship between the CM grammar and the northern V2 grammar. The distinctive word orders generated by the CM grammar both involve inversion, and the northern V2 grammar generates a high level of inversion, because northern XVS orders with pronominal subjects correspond to non-northern XSV orders. One would therefore expect the CM grammar to have greater parsing success in northern texts, simply because there is more inversion in northern texts.
In this section, I investigate this possibility by looking at the frequencies of matrix XYVS and embedded XVS orders in relation to the frequency of matrix XVS orders with pronominal subjects, and the status of the Edinburgh manuscript with respect to this relationship.
Firstly, I use data from PLAEME to investigate Kroch and Taylor's (1997) proposal that inversion with pronominal subjects is a distinctively northern feature. I compare the global frequency of inversion in nonsubject-initial sentences with full NP subjects, to the frequency of inversion in such sentences with pronominal subjects. Figure 2 shows the rate of inversion in nonsubject-initial sentences with full NP subjects for all PLAEME texts. Each dot on the map represents a text, with the area of the dot proportional to the number of such sentences in the text. The colour of the dot represents the rate of inversion in such sentences. Every large text shows at least 50% inversion in these contexts. This tells us that inversion remains the norm in nonsubject-initial sentences with full NP subjects across the English varieties represented in PLAEME. In contrast, Figure 3 represents the rate of inversion with pronominal subjects. Although inversion with pronominal subjects is not categorically present in any large text in PLAEME, it is almost completely absent in southwestern texts, and observable to different extents in texts produced in the Danelaw. This suggests that Kroch and Taylor are at least approximately correct in identifying inversion with pronominal subjects as a distinctively northern characteristic. However, this pattern, which they identified in more or less 'pure' form in the Rule of St. Benet, is not strictly northern but instead found, to differing extents, in most of the Danelaw.
Figure 3. Rate of inversion in nonsubject-initial sentences with pronominal subjects, all PLAEME texts.
The Edinburgh manuscript is visible on these maps as the three large circles just northwest of the Humber (one circle for each hand). The maps show, as already suggested in Section 3.2, that the Edinburgh manuscript makes greater use of Kroch and Taylor's northern inversion pattern than most PLAEME texts (although not to the same extent as the Rule of St. Benet, which is not plotted on the map as it is not included in PLAEME). In this section, I compare the Edinburgh manuscript to major northern and East Anglian Middle English texts included in PLAEME and PPCME2. From PLAEME, in addition to the three hands of the Edinburgh manuscript, I include the Middle English Genesis and Exodus (visible on the maps as a large dot immediately southeast of the Wash
In fact, across all texts in YCOE, PPCME2, and PLAEME, there is a positive correlation between frequency of matrix XVS orders with pronominal subjects, and frequency of the two CM orders, taken together (Spearman's ρ = 0.26, p = 0.03, on log-transformed counts for all texts in YCOE, PPCME2, and PLAEME with >10,000 words). This correlation is shown in Figure 4, where northern texts are highlighted in blue. This suggests that the CM orders are distinctively northern to approximately the same extent as Kroch and Taylor's northern V2 grammar is: there is a cluster of northern texts which show a greater frequency of both the CM orders and the northern V2 pattern than any other text, although some non-northern texts show these variables at almost the same rate.
However, the Ormulum and Richard Rolle's works have a different profile (these are the other blue dots in Figure 4). Examples of the CM orders from these texts, while not nonexistent, are rare. (32) gives two examples of embedded XVS order, the more common of the two CM orders in these texts.
These texts therefore support the position outlined at the start of this section, that there is no direct causal link between the distinctive word order of the northern V2 grammar and the CM orders, because they show relatively high rates of northern V2 with relatively low rates of the CM orders. In other words, the statistical correlation between the orders associated with one grammar and the orders associated with the other tolerates exceptions, and these texts are such exceptions.

Old English
The correlation observed in the previous subsection between the CM orders and the northern pattern of inversion with pronominal subjects is overwhelmingly due to ME texts. In Figure 5, Old English texts are coloured red and Middle English texts are coloured blue. It can then be observed that the correlation is particularly strong for the ME texts (Spearman's ρ = 0.79, p = 1.4 × 10⁻⁷), but there is no correlation for the OE texts (Spearman's ρ = −0.11, p = 0.58). The reason for this is plausibly that inversion in matrix clauses with pronominal subjects is almost categorically absent in surviving OE texts. Accordingly, the lack of correlation among OE texts in Figure 5 could be a kind of floor effect: matrix inversion with pronominal subjects is so close to completely absent that it only features as noise.
Because of this, OE examples of the CM word orders are restricted to full NP subjects. Such examples are found, but the CM orders occur at a lower rate overall than in ME texts, because of this restriction on subject type. Examples of OE embedded V2 orders are given in (33).
Despite the lower frequency of the CM orders in OE texts, I have not discerned any differences in conditions of use of embedded V2 between the Edinburgh manuscript and earlier texts, once the large number of OE examples with þa and similar adverbials are excluded. This suggests that the CM grammar is in fact already present in OE, but that there is less positive evidence for the use of that grammar because of independent facts about OE grammar.
It is also possible that the greater prominence of the CM grammar in later texts reflects the nature of the other competing grammars, and particularly the shift from OV to VO. As is to be expected in a language that is still largely head-final, there are more embedded SXV orders than SVX in OE. These verb-third orders cannot be generated by the CM grammar, if the CM grammar requires V-to-F movement. Accordingly, in OE there is simultaneously less positive evidence in favour of the CM grammar, and more positive evidence in favour of SV grammars.
32 One must again bear in mind the demonstration in van Kemenade (1997)

Late Middle English
The CM orders are present in many late ME texts, but never became widespread, and certain long texts, such as The Brut or The Chronicles of England, contain no examples at all.
Because the CM orders involve inversion, it is natural to link their decline to the loss of V2. The loss of V2 involves two components: the verb no longer moves to a left-peripheral head position such as C or F, and A′-movement to the associated left-peripheral specifier position is no longer required. I assume that the two components of a V2 grammar, even though they are dissociable in principle, were lost together because any grammar which retained one component without the other would fail to parse large amounts of the input. Loss of the former without the latter would lead to 'verb-late' orders, with the verb in V or I and significant freedom in the position of preverbal elements due to A′-movement to Spec,CP and/or Spec,FP. Loss of the latter without the former would lead to verb-initial orders, possibly only in matrix clauses.
To the extent that a text does not show inversion in standard V2 contexts, it should also be expected not to show the CM orders, because the CM grammar would be able to produce standard V2 orders with inversion as well as the CM orders. That expectation is borne out. Figure 6 plots the log-transformed frequency of the CM orders against the frequency of inversion in PP-initial matrix clauses with full NPs, for texts over 10,000 words in length from the 14th and 15th centuries. There is again a positive correlation between the CM orders and the PP-initial V2 orders (Spearman's ρ = 0.52, p = 0.003).
Figure 6. Correlation of the CM orders with inversion in PP-initial matrix clauses with full NP subjects, 14th-15th-century texts of >10,000 words. The major northern texts are highlighted in blue.

Summary
The CM grammar is present throughout Old and Middle English, but particularly visible in northern ME texts for a variety of reasons. First, OE had a greater rate of OV orders, compared to the preference for VO orders in ME. Because OV orders are less likely to put the verb in second position, this means that there were more OE orders, particularly in embedded clauses, that could not be generated by the CM grammar.
Second, the possibility of inversion with pronominal subjects, coupled with a tendency to use pronominal subjects in the distinctive CM word orders, meant that there was more positive evidence for the CM grammar in northern texts. One must also consider the contingent fact that there are no northern OE texts suitable for investigation of word order.
Finally, because the distinctive CM orders all involve inversion, it is to be expected that the loss of V2, beginning in late ME, effectively reduced the positive evidence for the CM grammar to zero. This claim is supported by a correlation between the rate of inversion and the frequency of the CM orders in Middle English.

Implications for Grammar Competition
In this section, I consider the relationships between the two V2 grammars, the SV grammar, and the CM grammar, in the light of the model of grammar competition developed by Yang (2002), in which a grammar's fitness is determined uniquely by its success in parsing observed input.
The outline of Yang's model is as follows. There is a finite set of possible grammars (as defined, for instance, by a finite set of parameters). Each grammar G_i is associated with a probability, or a 'weight', p_i. When a child encounters an input sentence s, the child picks one grammar to analyse it. The probability that the child will select G_i is p_i. If G_i successfully analyses s, then p_i increases (according to the linear reward-penalty algorithm given in Yang 2002, p. 29), and the weights assigned to all other grammars concomitantly decrease. If G_i does not successfully analyse s, then p_i decreases and the weights assigned to all other grammars increase.
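The update just described can be sketched in a few lines. This is a minimal illustration of a standard linear reward-penalty scheme, not a reproduction of Yang's own implementation: the learning rate gamma, the toy 'languages' (sets of order labels), and the 70/30 input mix are all assumptions made for the example.

```python
import random

def reward_penalty_update(weights, chosen, success, gamma=0.05):
    """One linear reward-penalty step; returns new weights that still sum to 1."""
    n = len(weights)
    updated = []
    for j, p in enumerate(weights):
        if success:
            # Reward the chosen grammar; shrink all the others.
            new_p = p + gamma * (1 - p) if j == chosen else (1 - gamma) * p
        else:
            # Penalize the chosen grammar; share the freed weight among the others.
            new_p = (1 - gamma) * p if j == chosen else gamma / (n - 1) + (1 - gamma) * p
        updated.append(new_p)
    return updated

def learn(sentences, languages, gamma=0.05, seed=0):
    """Run the learner over the input; a grammar 'parses' s if s is in its language."""
    rng = random.Random(seed)
    weights = [1 / len(languages)] * len(languages)
    for s in sentences:
        chosen = rng.choices(range(len(languages)), weights=weights)[0]
        weights = reward_penalty_update(weights, chosen, s in languages[chosen], gamma)
    return weights

# Toy grammars (assumption): one generates both orders, one only SVX.
flexible = {"SVX", "XVS"}
restrictive = {"SVX"}

# Toy input (assumption): 70% SVX, 30% XVS.
input_rng = random.Random(1)
sentences = input_rng.choices(["SVX", "XVS"], weights=[0.7, 0.3], k=2000)

final = learn(sentences, [flexible, restrictive])
print(final)
```

In this toy setting the flexible grammar is never penalized, so its weight drifts upwards on average at every step. This is the property of the model that the rest of this section is concerned with.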
Because parsing success increases the weight associated with a grammar, this model favours more flexible grammars, which are capable of parsing a wider variety of input sentences. The CM grammar is interesting in that respect because it is so flexible. The only orders it cannot generate are matrix V ≥ 4 orders and embedded V ≥ 3 orders. In contrast, the SV grammar cannot generate orders with inversion; the northern V2 grammar can generate only V2 orders in matrix clauses and is essentially identical to the SV grammar in embedded clauses; and the non-northern V2 grammar is similar to the northern grammar except for the requirement of matrix V3 orders with subject pronouns.
The greater flexibility of the CM grammar apparently did not give it a competitive advantage, because the CM grammar evidently lost out, in the fullness of time, to the SV grammar. In this section, my aim is to understand why the CM grammar lost out.
I will consider the fitness of the four different grammars discussed in this paper, using all texts in PLAEME as a model of the linguistic environment in England c.1300. All PLAEME texts are dated to within a 75-year window, 1250-1325. I abstract away from temporal differences between these texts and treat them as a single point in the history of English.
It is likely that English around this time had properties which were particularly favourable to the CM grammar. Earlier texts showed greater frequency of embedded SXV orders, which the CM grammar cannot generate because it only generates verb-second orders in embedded clauses. Later texts, as V2 declined, have more frequent XYSV orders, with multiple constituents left-adjoined to the clause and no inversion.
The window occupied by PLAEME texts therefore falls between two sets of less favourable circumstances for the CM grammar. This implies that PLAEME should be a sample of a particularly favourable period for the CM grammar. However, this remains a matter of degree: all grammars other than the CM grammar can generate V ≥ 3 orders in embedded clauses, for instance, so there are always cases which are beyond the reach of the CM grammar.
In order to investigate the relative fitness, on Yang's definition, of the CM grammar, the northern and non-northern V2 grammars, and the SV grammar in ME, I conducted a corpus study on PLAEME using the same technique described in Section 3.1 of coding the category of the first four constituents in the clause. There were some minor improvements to this query compared to the one described above, in the handling of negation, object pronouns, and deictic adverbials like þa. However, the same principles for excluding leftperipheral elements and other special cases were applied to this study. The study included all localized PLAEME texts, regardless of length.
I extracted all finite matrix, complement, and adverbial clauses where the first four positions are occupied by subject, verb, and two other constituents. I then divided these clauses into twelve categories according to the position of the verb and subject, and calculated which categories could be generated by which grammars. These calculations are summarized in Table 1.
These results indicate that the CM grammar has the greatest fitness, on Yang's model, of the four grammars under consideration across all localized PLAEME texts, considered together. Moreover, the CM grammar has greater fitness in all dialect areas for which data is available in PLAEME. In fact, the CM grammar also has the greatest fitness for the vast majority of individual texts: there were only ten texts (out of 59) in which one of the other grammars had greater parsing success than the CM grammar. Table 3 shows that this advantage arises because the CM grammar can parse more matrix clauses than the other grammars, while the other grammars can parse more embedded clauses, where they all make identical predictions. Despite this lower parsing success in embedded clauses, the CM grammar has the greatest overall parsing success, simply because there are more matrix than embedded clauses. So, around the time of the Edinburgh manuscript, the CM grammar has a fitness advantage in comparison to other grammars visible in contemporary English texts. And yet the CM grammar never spread, despite this fitness advantage. This suggests that extra flexibility does not always confer a selectional advantage, a lesson familiar from the Subset Principle (Berwick 1985; Manzini and Wexler 1986). However, it does not follow from Yang's learning model, where fitness is defined directly in terms of parsing success.
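The notion of fitness as parsing success can be made concrete with a small sketch. The clause counts below are invented and do not reproduce the figures in the tables; the two compatibility functions encode only the simplified generalizations stated earlier in this section (the CM grammar generates neither matrix V ≥ 4 nor embedded V ≥ 3 orders; the SV grammar generates no orders with inversion), not the full twelve-way classification.

```python
from collections import namedtuple

# A clause category: embedded or matrix, position of the finite verb,
# and whether the subject follows the verb (inversion).
Clause = namedtuple("Clause", "embedded verb_position inverted")

def cm_generates(clause):
    """Simplified CM grammar: verb no later than 3rd (matrix) or 2nd (embedded)."""
    limit = 2 if clause.embedded else 3
    return clause.verb_position <= limit

def sv_generates(clause):
    """Simplified SV grammar: any order in which the subject precedes the verb."""
    return not clause.inverted

def parsing_success(counts, generates):
    """Proportion of clause tokens that the grammar can generate."""
    total = sum(counts.values())
    parsed = sum(n for clause, n in counts.items() if generates(clause))
    return parsed / total

# Hypothetical token counts per category (assumption: not PLAEME data).
toy_counts = {
    Clause(False, 2, False): 500,  # matrix SVX
    Clause(False, 2, True): 300,   # matrix XVS
    Clause(False, 3, False): 150,  # matrix XSV
    Clause(False, 3, True): 20,    # matrix XYVS
    Clause(False, 4, False): 10,   # matrix XYSV
    Clause(True, 2, False): 400,   # embedded SVX
    Clause(True, 2, True): 15,     # embedded XVS
    Clause(True, 3, False): 105,   # embedded SXV
}

cm_fitness = parsing_success(toy_counts, cm_generates)
sv_fitness = parsing_success(toy_counts, sv_generates)
print(round(cm_fitness, 3), round(sv_fitness, 3))
```

With these invented counts the CM grammar loses in embedded clauses but wins overall, which mirrors the qualitative pattern described above without making any claim about the actual magnitudes.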
This problem can be attributed to the linear reward-penalty algorithm adopted by Yang. Several alternative algorithms are conceivable, and I do not intend to investigate them seriously here. However, I close by pointing out that Bayesian approaches are a widely-used family of algorithms which avoid this problem. Bayesian approaches incorporate Bayes' rule, in (35).
The utility of Bayes' rule is that it allows us to infer the probability of a grammar (here, a 'hypothesis', h) given a set of some observed phenomena (here, some 'data', d). Bayes' rule tells us that the probability of h given d is proportional to the product of the 'prior probability' of h (P(h)), and the 'likelihood' (P(d|h), the probability of d given h). This latter term is the most important for our purposes. The Subset Principle is closely related to a general principle of Bayesian learning known as the Size Principle (Perfors et al. 2011). The Size Principle is formulated by Tenenbaum (1998) as follows: 'learners … weight more specific hypotheses higher than more general ones by a factor that increases exponentially with the number of examples used' (Tenenbaum 1998, p. 64). This follows directly from Bayes' rule because the posterior probability of a grammar is a function of the prior probability and the likelihood. Because a more flexible grammar can generate more structures, the likelihood of any individual structure being generated is lower. Unlike the Subset Principle, which is stated deterministically, the Size Principle is probabilistic, but approximates the Subset Principle increasingly closely as the amount of input increases.
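For reference, Bayes' rule can be written in its standard textbook form, using h and d as in the text. The second line illustrates the Size Principle under the additional assumption, made here only for illustration, that each of n observed structures is sampled independently and uniformly from the |h| structures that h generates.

```latex
% Bayes' rule: posterior is proportional to likelihood times prior.
P(h \mid d) \;=\; \frac{P(d \mid h)\,P(h)}{P(d)} \;\propto\; P(d \mid h)\,P(h)

% Size Principle under uniform sampling (an assumption of this illustration):
% n observations d_1, ..., d_n, all generated by h.
P(d_1, \dots, d_n \mid h) \;=\; \left(\frac{1}{|h|}\right)^{n}

% For a restrictive h_1 and a more flexible h_2 that both generate all the data,
% with |h_1| < |h_2|, the likelihood ratio grows exponentially in n:
\frac{P(d_1, \dots, d_n \mid h_1)}{P(d_1, \dots, d_n \mid h_2)}
  \;=\; \left(\frac{|h_2|}{|h_1|}\right)^{n}
```

On these assumptions, a grammar that generates twice as many structures is disfavoured by a factor of 2 after one compatible observation and by a factor of 2^n after n, which is the exponential weighting that Tenenbaum's formulation describes.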
It is not straightforward to design a Bayesian learning algorithm which is faithful to the spirit of grammar competition. Grammar competition has two components which jointly pose a challenge to typical Bayesian approaches. The first is that grammars are discrete objects: either they generate a sentence or they do not. For a sentence that they do not generate, P(d|h) is literally 0. The second is that individuals display syntactically heterogeneous behaviour. That is, many, or even all individuals have access to grammars G_a and G_b, where G_a can generate some sentences that G_b cannot, and vice versa. This means that, in any representative sample D of linguistic behaviour from such a speaker, there will be some sentences d ∈ D for which P(d|G_a) = 0 and some sentences for which P(d|G_b) = 0. This has the effect of fixing the posterior probability of G_a and G_b at zero.
It is common in Bayesian learning models to avoid this problem by setting P(d|h) in such cases not at zero, but at some 'error term' ϵ, slightly above zero. Despite the practical value of this approach, it is not compatible with the spirit of grammar competition. The logic of grammar competition implies that a speaker who assigns a high weight to a grammar G a may nevertheless generate sentences which G a cannot generate, simply because a speaker may have access to multiple grammars.
Speculatively, it seems to me that a way round this would be to treat the h term in Bayes' rule not as an individual grammar, but as the set of grammars accessible to an individual. The Bayesian version of the Subset Principle would still be embodied, because higher likelihoods would be assigned to observed data on more restrictive sets of grammars, and the problem of zeroes disappears: it does not matter if the probability of an observed sentence is zero for any individual grammar, so long as it is not zero for the whole set of grammars accessible to an individual. I hope to develop this speculation in future research.
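One possible reading of this speculation can be sketched as follows. Everything in the sketch is an assumption introduced for illustration: the toy languages, the uniform sampling of structures within a grammar, and the treatment of a set of grammars as a weighted mixture with fixed weights. It is intended only to show that the two desired properties can hold together, not to propose a worked-out learning algorithm.

```python
def likelihood_single(data, language):
    """P(D | G) for one discrete grammar: zero as soon as one sentence is ungenerable."""
    prob = 1.0
    for s in data:
        prob *= (1 / len(language)) if s in language else 0.0
    return prob

def likelihood_set(data, languages, weights):
    """P(D | set of grammars), treating the set as a weighted mixture (assumption)."""
    prob = 1.0
    for s in data:
        prob *= sum(w / len(lang) for lang, w in zip(languages, weights) if s in lang)
    return prob

# Toy languages over order labels (assumption).
G_a = {"SVX", "XVS"}
G_b = {"SVX", "XSV"}
G_flex = {"SVX", "XVS", "XSV", "XYVS"}

# A heterogeneous sample: no single restrictive grammar generates all of it.
data = ["SVX", "XVS", "XSV", "SVX"]

single_a = likelihood_single(data, G_a)                   # zero: XSV is not in G_a
single_b = likelihood_single(data, G_b)                   # zero: XVS is not in G_b
pair = likelihood_set(data, [G_a, G_b], [0.5, 0.5])       # non-zero
flexible_only = likelihood_set(data, [G_flex], [1.0])     # non-zero but lower

print(single_a, single_b, pair, flexible_only)
```

In this toy case the set {G_a, G_b} assigns the sample a higher likelihood than the single flexible grammar does, while avoiding the zeroes that arise when G_a and G_b are evaluated individually.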

Summary
In this paper, I hope to have demonstrated that the matrix V3 and embedded V2 orders which are prominent in the Edinburgh manuscript should be given a theoretical account in terms of the CM grammar, a grammar which is particularly visible in 14th-century English, but whose effects can be seen already in OE. The argument for treating these apparently exceptional orders as the product of a single grammar rests on the correlation between the frequency of the two orders across texts, suggesting that these orders have a systematic relationship and are not just noise.
The description of that grammar is very simple: it differs from V2 grammars of the period in having two A′-positions before the verb, rather than one. That grammatical description is dissociable from the northern/non-northern split uncovered by Kroch and Taylor (1997), and the matrix V3 and embedded V2 orders associated with the CM grammar can be dissociated from Kroch and Taylor's dialect split in the textual record. Nevertheless, I also demonstrated a correlation, particularly strong in ME, between rates of use of the northern pattern of inversion with pronominal subjects, and rates of use of matrix V3 and embedded V2 orders. This suggests a probabilistic association between the CM grammar and the northern grammar, even if the two grammars should not be considered as one and the same.
The Edinburgh manuscript dates from a time period in which the SXV orders which predominated in OE had largely been replaced by SVX orders, but in which English was still essentially a V2 language, with inversion the norm in nonsubject-initial sentences with full NP subjects. This period is particularly congenial to the CM grammar, because both the earlier SXV orders and the later XYSV orders are incompatible with the CM grammar. Indeed, the CM grammar has greater fitness in the 'PLAEME window' of 1250-1325 than any grammar traditionally considered in the generative analysis of ME, on the model of fitness proposed by Yang (2002).
Nevertheless, the CM grammar did not spread, and the distinctive CM orders are only marginally visible in most late 14th- and 15th-century texts. This poses a challenge to Yang's conception of fitness. In the final section of this paper, I sketched an outline of a way to develop a Bayesian alternative to Yang's linear reward-penalty algorithm.
This paper is a first attempt at making sense of these orders in texts which have been the subject of little or no prior generative analysis. I will end with three avenues for further research in this area.
The first is simple: I sketched an alternative to the linear reward-penalty algorithm, but did not actually develop it. The paper therefore leaves unfinished business with respect to the theoretical understanding of grammar competition, which should be addressed in future research.
Secondly, there is more empirical work to be done on the Edinburgh manuscript and the CM grammar. In particular, recent work on OE (Bech 2001;van Kemenade et al. 2008;Speyer 2010;Bech and Salvesen 2014) has demonstrated that information structure has a large effect on OE word order, and it would be interesting to investigate similar effects in the time of the Edinburgh manuscript. Of course, a more nuanced understanding of the CM grammar may also affect my assumptions about its relative fitness.
Third, there is the intriguing question of why the CM grammar peaked when it did, and why such grammars are typologically rare. From a theoretical perspective, it is welcome to see a productive V3 grammar like the CM grammar, because there is no theoretical reason why such grammars should not exist, as discussed in Section 2.1. However, V2 grammars are already typologically rare, and it seems likely that the CM grammar is even rarer. As a final speculation, I suggest that transitional periods like English around 1300 are fertile grounds for rare grammars. During a transitional period, the observable linguistic behaviour is more heterogeneous than is usually the case, and this means that there is less pressure from the Subset Principle against more flexible grammars like the CM grammar. The CM grammar can therefore potentially inform us about conditions favouring the emergence of such rare grammars.
Funding: This research received no external funding.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: No new data were created or analyzed in this study. Data sharing is not applicable to this article. All scripts used to analyze pre-existing data are publicly available at https://github.com/rtruswell/CM_supplementary.