1. Introduction
Almost two decades ago, Mukherjee and Hoffmann (2006, p. 148) suggested that verb complementation is an under-researched area in studies of regional variability in world Englishes. Today, this claim remains valid, and particularly so in the case of verb complementation involving subordinate clauses. The present study explores the use of non-finite clauses serving as complements to so-called “catenative” verbs in twenty national varieties of English. Catenative verbs are used in potentially recursive constructions yielding chain-like sequences where each non-finite clause is a complement to a (catenative) verb, as in (1). The subcorpus abbreviation “GH” used in this example is, along with those in all following corpus examples, spelt out and listed in Section 3 below (see Table 1).
(1) | | All is Love I promise to keep trying to listen to that stillness within. (GH) |
Following Huddleston and Pullum (2002, pp. 1176–1177), I shall use the term “catenative complement” to refer to a distinctive type of complement, one that is not covered by other types such as object or predicative, and one that is realised exclusively by non-finite clauses.
There are four kinds of catenative complements (to-infinitival, bare-infinitival, gerund-participial, and past-participial), exemplified in (2) and (3) below. The examples in (2) all involve “simple” complementation, the catenative complement in each case being the sole complement of the catenative verb, while those in (3) involve “complex” complementation, with more than one complement (the catenative complement in each case here following an NP object complement).
(2) | a. | I really can’t bear to see them go when they get old (SG) [To-infinitival]; |
| b. | Most days we helped prepare the food for the children (KE) [Bare-infinitival]; |
| c. | We do not stop playing because we are old (IN) [Gerund-participial]; |
| d. | then i got banned from channel for 1 day (SG) [Past-participial]; |
(3) | a. | I can’t bear him to touch me because I don’t feel ‘complete’ (GB) [To-infinitival]; |
| b. | His dad came rushing down to help him make it to the finish line (PH) [Bare-infinitival]; |
| c. | We do not stop them going to the mosque (IN) [Gerund-participial]; |
| d. | before the BBC got them banned (GB) [Past-participial]. |
There is much diversity and complexity in this area of grammar, with more complementation patterns available to catenative verbs than merely the eight exemplified in (2) and (3). For example, an alternative to (3c) is that with a prepositional from complement (“stop them from going to the mosque”). Comprehensive lists of catenative verbs along with the clausal complements which they license are available in Biber et al. (1999, pp. 693–759), Huddleston and Pullum (2002, pp. 1225–1244), and Algeo (2006).
While it is not possible to categorically specify the type(s) of non-finite complement that a particular catenative verb will select, the selection is not entirely random. Verbs with similar meanings tend to enter into similar complementation constructions. Examples include the selection of complex complementation with object and to-infinitive by verbs of coercion (compel, force, cause, allow, etc.) and verbs of saying (warn, urge, instruct, invite, etc.) as in (4a) and (4b), of prepositional phrase complement plus to-infinitival by verbs expressing reliance (rely, depend, appeal, bank, etc.) as in (4c), and of object plus gerund-participial by verbs of depiction (depict, picture, portray, etc.) as in (4d):
(4) | a. | It will shock them and compel them to have a new look at their social behaviour (PK); |
| b. | The town-crier goes ahead to warn people to put out all lights (GH); |
| c. | They depend on others to be more dependable than they are (JM); |
| d. | Sigiriya frescoes depict women wearing the cloth gracefully draped like a dhoti (LK). |
In cases where a catenative verb licenses more than one type of complement construction, there is often a semantic difference, one that may be quite elusive. The difference may be related to the general modal distinction between potentiality on the one hand (in the case of infinitivals) and actuality on the other (in the case of gerund participials). There is a historical motivation for the two meanings in question (see Huddleston & Pullum, 2002, p. 1241). The infinitival marker to derives from the preposition to, whose goal-oriented meaning is rooted in the notion of direction and extends to unactualised future situations (with verbs such as strive, consent, promise, threaten, and the like). By contrast, gerund participles have a historical connection with nominal constructions that depict actual events (compare “he finished swimming” with “he finished his swim”, and “she hates working” with “she hates work”). This distinction elucidates the difference between (5a) and (5b). In (5a), we understand that there was an earlier commitment to put the fiddle in the car, and that the speaker’s failure to act accordingly is projected forward from this commitment. In (5b), by contrast, what is forgotten is an actual rather than merely potential action, namely the insertion of the addressee’s name by the speaker.
(5) | a. | I forgot to put the fiddle in the car (IE); |
| b. | I forgot putting your name there (US). |
The potential versus actual distinction is, however, more commonly not discernible, or barely so. For example, in (6) and (7), the to infinitival and the gerund participial are readily interchangeable without any evident change in meaning.
(6) | a. | I won’t bother to repeat it here (GB); |
| b. | I won’t bother repeating what others have already written (IN); |
(7) | a. | Mobil Oil started talking to us locally (IN); |
| b. | Firmi and I somehow started to talk again (SG). |
This study is limited to cases such as those exemplified in (5), (6) and (7), where a catenative verb licenses more than one type of non-finite complement construction, with the patterns of choice associated with each construction identified across the twenty national varieties represented in the mega-corpora GloWbE (the Global Web-based Corpus of English) and NOW (the News on the Web corpus): see further Section 3. By providing insights into regional variation in catenative constructions in English world-wide, and into some of the diachronic implications thereof, the study aims to contribute to the expanding body of research on morphosyntactic variation in space and time across varieties of English. Most previous studies of catenative constructions have focused on data from British English (BrE) and/or American English (AmE), the two influential and much-studied “reference” varieties that are dubbed respectively “supercentral” and “hypercentral” by de Swaan (2001) and Mair (2013). These include Algeo (2006); Rudanko (2005, 2006, 2017); Rudanko and Luodes (2005); and Vosberg (2009). Exceptions include studies of Australian English (AusE) and New Zealand English (NZE) by Mair (2009), Collins (2015), and Rickman and Rudanko (2018); of Hong Kong English (HKE) by Deshors (2015); of Indian English (IndE) by García Castro (2019); and of sets of varieties by Deshors and Gries (2016) and Romasanta (2017, 2019, 2021, 2022, 2023).
This paper circumscribes the locus of variation to those patterns of catenative complementation that are found widely throughout the English-speaking world. Accordingly, I have excluded more regionally restricted patterns such as the substrate-shaped use of bare-infinitival complements instead of standard gerund participials in creole-influenced JamE as in (8), the use of to infinitivals for standard bare infinitivals in some African varieties as in (9), bare infinitivals for standard to infinitivals in some South Asian varieties as in (10), and to infinitivals with prepositional for in some South East Asian varieties as in (11).
(8) | | Them time dey I was a half believer in the spiritual ting. Things start get worse when I find myself not even wanting to go ova the yard (JM). |
(9) | | We should make them to assist the police during the day (GH). |
(10) | | I mentioned him about our desire and asked him give a portion to us from their delicious meal being cooked (BD). |
(11) | | If you wish for to make a striking statement (MY). |
The structure of the rest of the paper is as follows. In Section 2, I consider the diachronic development of non-finite catenative complement constructions. Section 3 introduces the two mega web-based corpora used and data extraction procedures. Section 4 presents a number of individual case studies. Section 5 is devoted to concluding observations.
2. Diachronic Background
The nature and distribution of non-finite catenative complement constructions in present-day English are by-products of extensive changes that have occurred throughout the history of English. During the Middle English period to infinitivals (as in [They told him] to leave) began to compete with finite that clauses in subjunctive constructions (as in [They told him] that he should leave), with the former spreading widely and eventually becoming the default option (Los, 2015). In a further development, the to infinitival itself began to be challenged by the rise of gerund participials. This scenario had its origins in the decay of the Old English case system, which paved the way for the increasing use of prepositions to signal grammatical relationships (Fanego, 2004), and in turn for gerunds to be used as prepositional complements (a function not possible for to infinitivals). However, it was not until the early eighteenth century that there was a sharp spike in the frequency of gerund participials, prompted by their use with an increasing number of catenative verbs. This profound reconfiguration of the verb complementation system is commonly referred to as the “Great Complement Shift” (Rohdenburg, 2006; Vosberg, 2009).
A number of diachronic studies concentrating on the past half century have provided robust evidence that in AmE and BrE the competition between non-finite catenative complements has continued to the present day (e.g., Mair, 2009; Leech et al., 2009). In this paper, I identify, analyse and discuss patterns of non-finite catenative complementation that are in some cases parallel to, and in others divergent from, these well-attested developments. The orientation of the present paper is primarily synchronic, the only directly real-time component being that facilitated by the short time span offered in NOW, from 2010 to the present day (see Section 3).
3. The Data
The data for the present study are captured from two large web-derived corpora, GloWbE and NOW. GloWbE comprises 1.9 billion words of texts taken from English websites. Table 1 presents the labels used for the twenty countries represented in GloWbE, as used henceforth in this paper, along with their associated English variety labels. The twenty GloWbE subcorpora represent six high-contact L1 varieties and fourteen indigenised L2 varieties, these two categories corresponding roughly to the distinction between Inner Circle (IC) and Outer Circle (OC) Englishes, respectively, in Kachru (1985). The six L1/IC varieties are here subclassified into three regional groups ((North) American, European, and Oceanian) and the fourteen OC Englishes into four regional groups (South Asian, South-East Asian, African, and Caribbean). The uneven distribution of subcorpus sizes (ranging from 387,615,074 words for GB to 35,169,042 words for TZ) necessitates the use of normalised frequencies in our intervarietal comparisons. While the primary focus of the paper is on the “New Englishes” of the OC, comparisons will systematically be drawn with the findings for the IC varieties.
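The normalisation mentioned above can be illustrated with a minimal sketch. The two subcorpus sizes are those reported for GB and TZ; the raw counts are hypothetical and serve only to show why equal raw counts in subcorpora of different sizes are not comparable until they are converted to frequencies per million words (pmw). The function name `per_million` is an illustrative choice, not part of any corpus platform.

```python
# Subcorpus sizes in words, as reported above for GloWbE.
SUBCORPUS_WORDS = {
    "GB": 387_615_074,
    "TZ": 35_169_042,
}


def per_million(raw_count: int, corpus_words: int) -> float:
    """Return a raw count normalised to tokens per million words (pmw)."""
    if corpus_words <= 0:
        raise ValueError("corpus size must be positive")
    return raw_count / corpus_words * 1_000_000


# Hypothetical raw counts (NOT taken from the study): the same raw
# count in both subcorpora, to show the effect of corpus size.
raw_counts = {"GB": 1_000, "TZ": 1_000}

pmw = {
    code: per_million(raw_counts[code], SUBCORPUS_WORDS[code])
    for code in raw_counts
}

for code, value in pmw.items():
    print(f"{code}: {raw_counts[code]} raw tokens = {value:.2f} pmw")
```

With identical raw counts, the smaller TZ subcorpus yields a pmw value roughly eleven times that of GB, which is why only the normalised figures are compared across varieties.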
All the texts in GloWbE originate from the internet. They are categorised into two sections, “blogs”, and “general” (newspapers, magazines, company websites, and so on), and offer a mixture of informality and formality. The tenor of many of the GloWbE blogs is personal, involved and opinionated, and they feature a high density of linguistic features associated with oral communication (Biber & Egbert, 2018). By contrast, the tenor of the general texts tends more towards the formal end of the informality–formality spectrum.
The NOW corpus contains 20 billion words of data from web-derived newspapers and magazines from the same twenty countries represented in GloWbE, from 2010 to the present time. The online platform allows searches either by country or by date. Web newspapers are professionally written and edited and tend to adhere to an accepted level of quality in maintaining the standards of the genre. Thus, the availability of NOW alongside GloWbE provides us with the opportunity to draw comparisons between the more informal mix of texts in GloWbE and the less informal news reportage texts of NOW. There being insufficient space to present full details of NOW searches to match those for GloWbE, when making GloWbE vs. NOW comparisons I shall simply select relevant frequencies from the NOW findings. Occasional references are also made, where relevant, to frequencies extracted from a third corpus, COHA (Corpus of Historical American English), which comprises 475 million words of AmE texts representing a balance of genres, from 1820 to 2019.
In addition to the large number of national varieties of English represented, the massive size of the GloWbE and NOW corpora is invaluable for the study of low-frequency items and constructions. Admittedly, mega web-derived corpora cannot compete with smaller corpora such as those belonging to the International Corpus of English (ICE) and the “Brown-family” of corpora, in studies where rigorously controlled generic representation and foolproof identification of text origins are essential requisites. However, the generally informal text types in GloWbE make it a particularly suitable resource for a study of historically changeable categories, with such phenomena as colloquialisation and grammaticalisation known to thrive in more informal language (Collins & Yao, 2013). That grammatical changes tend to proceed more rapidly in speech than in the typically more conservative medium of writing is suggested by the findings of Leech et al.’s (2009) investigation of grammatical changes in written and spoken BrE and AmE between the 1960s and the 1990s. A caveat that must be entered here is that such changes cannot be assumed to be directly relevant to developments in the New Englishes. As a prelude to this study, I ran some preliminary searches in GloWbE and NOW for a set of orality/colloquiality-oriented items to determine their distribution in the two corpora: see Table 2. The items selected as bellwethers of change were of three types: grammatical word classes, contractions and punctuation marks. The word classes were those whose frequencies in speech in Biber et al. (1999) were significantly greater than those in fiction, news, and academic writing. As Table 2 shows, they were all more frequent in GloWbE than in NOW (by ratios ranging from 12.6:1 for modal verbs to 1.3:1 for determiners, personal pronouns and negators). Contractions (represented here by ’d, ’ll, and ’m) are referred to by Leech et al. (2009, p. 240) as “the paradigm case of colloquialization”, a claim confirmed by the findings of Collins and Yao (2013, 2018). These three contractions were approximately 63% more frequent in GloWbE than NOW. Finally, three punctuation indicators in Table 2 (question mark, exclamation mark, and ellipsis) manifest aspects of the interactivity that is an essential feature of conversational speech. They were approximately 44% more frequent in GloWbE than NOW. I hypothesise that if a feature that is known to be on the rise historically yields a relatively high frequency of occurrence in NOW vis-à-vis GloWbE, this may indicate an advanced stage of change wherein it has spread from spoken (or speech-like) genres into a more formal written genre.
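The GloWbE vs. NOW comparisons above are stated partly as ratios (e.g., 1.3:1) and partly as percentages (e.g., 63% more frequent). The following sketch, which is an illustration rather than the procedure used in the study, shows how the two formulations relate; the function names are hypothetical, and the inputs are assumed to be normalised (pmw) frequencies.

```python
def percent_more_frequent(freq_a: float, freq_b: float) -> float:
    """Return how much more frequent A is than B, as a percentage of B."""
    if freq_b <= 0:
        raise ValueError("reference frequency must be positive")
    return (freq_a / freq_b - 1) * 100


def ratio_from_percent(percent_more: float) -> float:
    """Convert 'X% more frequent' into the r of an 'r:1' ratio."""
    return 1 + percent_more / 100


# Figures reported above, re-expressed in the other format.
for label, pct in [("contractions", 63), ("punctuation marks", 44)]:
    print(f"{label}: {pct}% more frequent = {ratio_from_percent(pct):.2f}:1")

# A reported ratio of 1.3:1 corresponds to this percentage difference.
print(f"1.3:1 = {percent_more_frequent(1.3, 1.0):.0f}% more frequent")
```

Expressing both kinds of figure on one scale makes it easier to compare the word-class results with those for contractions and punctuation marks.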
Searches of the type “HELP _v?i” and “HELP to _v?i” were conducted in GloWbE and NOW using the Brigham Young University online platform. For information on this platform, see https://21centurytext.wordpress.com/introducing-the-1-9-billion-word-global-web-based-english-corpus-glowbe/, accessed on 16 December 2024. In some cases where complete searches were not possible, it was necessary to restrict them. In the case of complex constructions, it was not possible to capture all the possible expressions that could manifest the direct object NP (since the corpora are POS-tagged but not syntactically parsed), so searches were limited to personal pronouns instead of full NPs. For example, for constructions of the type “force him into leaving”, it was not possible to capture relevant NP tokens using the query “VERB+ * into _v?g” (due to mishits of the type “had gone into hiding”), so “VERB+ _pp into _v?g” was used. The pmw frequencies yielded by the searches for the individual varieties, for the regional groupings thereof, and for the IC vs. OC were then compared across the alternants for each construction type using ratios and percentages, in the spirit of recent “onomasiological” research (e.g., Collins, 2023a, 2023b).
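The effect of restricting the object slot to personal pronouns can be sketched on toy data. The code below does not call the corpus platform; it imitates the two query shapes over hand-tagged token lists. The tags are assumed to be CLAWS-style (verbs beginning with V, -ing forms matching V?G, personal pronouns beginning with PP), and both the tag assignments and the function name `find_into_ving` are illustrative assumptions.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (word form, POS tag)

ANY_VERB = re.compile(r"^V")       # assumed stand-in for the "VERB+" slot
ING_VERB = re.compile(r"^V.G$")    # assumed stand-in for the "_v?g" slot
PERS_PRONOUN = re.compile(r"^PP")  # assumed stand-in for the "_pp" slot


def find_into_ving(tokens: List[Token], pronoun_only: bool) -> List[str]:
    """Find four-token sequences of the shape VERB X into V-ing.

    With pronoun_only=False, X is a wildcard (any single token);
    with pronoun_only=True, X must carry a personal-pronoun tag.
    """
    hits = []
    for i in range(len(tokens) - 3):
        (w1, t1), (w2, t2), (w3, _), (w4, t4) = tokens[i:i + 4]
        if not ANY_VERB.match(t1):
            continue
        if pronoun_only and not PERS_PRONOUN.match(t2):
            continue
        if w3.lower() == "into" and ING_VERB.match(t4):
            hits.append(" ".join([w1, w2, w3, w4]))
    return hits


# Hand-tagged toy sentences (hypothetical tag assignments).
target = [("They", "PPHS2"), ("forced", "VVD"), ("him", "PPHO1"),
          ("into", "II"), ("leaving", "VVG")]
mishit = [("She", "PPHS1"), ("had", "VHD"), ("gone", "VVN"),
          ("into", "II"), ("hiding", "VVG")]

for sentence in (target, mishit):
    print(find_into_ving(sentence, pronoun_only=False),
          find_into_ving(sentence, pronoun_only=True))
```

The wildcard version retrieves both the intended construction and the “had gone into hiding” type of mishit, whereas the pronoun-restricted version keeps only the former, at the cost of missing tokens with full NP objects.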
5. Conclusions
Previous research on non-finite catenative complementation has largely been restricted to BrE and/or AmE. The present study has sought to expand the regional coverage of such research by analysing a set of catenative constructions (help + NP + (to) + V; “preventative” V + NP + (from) + Ving; “coercive” V + NP + into + Ving/to V; “aspectual/emotive” V + Ving/to V) across the twenty national varieties represented in GloWbE, with a second mega-corpus, NOW, regularly drawn upon for comparative purposes. The selection of these two large web-derived corpora is defended on the grounds of their capacity to capture low-frequency items and constructions, and of the potential insights offered by the register contrasts between them (as established by an analysis of the distribution of a set of colloquiality-oriented items, including contractions and punctuation marks, across the two corpora). Wherever possible, findings are related to historical trends attested in previous studies, and to such diachronically relevant phenomena as colloquialisation and grammaticalisation.
American influence is widely in evidence in the findings, with the Englishes of countries that are geographically close to the USA (CanE, JamE) and/or historically closely related (PhilE) tending to follow AmE patterns. For example, the American predilection for “prevent NP from Ving” is shared by all three of these varieties, while the American aversion to the alternative from-less variant “prevent NP Ving” is paralleled in CanE and PhilE. A similar result is attested with the declining to-infinitival variant with help, with the American distaste for this variant matched by that in CanE and PhilE. Curiously, in some cases, other varieties shared an AmE preference or dispreference for a particular construction, but more strongly so than AmE itself. A case in point is help with bare-infinitival complements, where the AmE frequency (90.07 pmw) was clearly exceeded by that for CanE (113.52) and that for PhilE (118.52). While AmE epicentrality (and possibly hypercentrality) is clearly a factor in many of the study’s findings, AmE is not the only variety to enjoy epicentral status. The results for the IC across a range of constructions, both in GloWbE and in NOW, see not only AmE tending to outscore CanE but also BrE outscoring IrE and AusE outscoring NZE.
Another notable finding was that, occasionally, the OC varieties, as a group, conservatively buck real-time trends associated with the reference varieties. For example, the to infinitival with help is attested in a number of studies to be losing ground to its bare counterpart, and in the present study it is the IC whose frequencies suggest that it is leading the way in this development, its dispreference for the to-infinitival variant (23.60 pmw) being markedly stronger than that of the OC (32.61). Similarly, with catenative fear, the variant with a to-infinitival complement, which this study shows to be less frequent in GloWbE than that with a gerund-participial complement, is far more strongly endorsed by the OC (2.05) than by the IC (1.21).
The most remarkable finding in the results for the OC, however, is arguably the dominant scores for Africa. It was suggested that this finding is due to the popularity of “serial verb” constructions in a number of African languages, and, in particular, in Western African languages, including pidgins. It was also hypothesised that over time serial verb-promoted catenative constructions may have diffused from the Western African Englishes to regionally disparate Englishes in Eastern and Southern Africa.
The findings of the study are compatible with previous claims (by Leech et al., 2009; Mair, 2009; Collins, 2015) that historical changes in catenative complementation are driven in some cases by the stylistic factor of colloquialisation, and by the structural factor of grammaticalisation. Consider, for example, the current and increasing domination of bare infinitivals over to infinitivals with catenative help, a trend more advanced in the IC (with a bare-infinitival vs. to-infinitival ratio of 3.84:1) than the OC (2.86:1). The finding of the study that the overall ratio in GloWbE (3.08:1) is overwhelmingly surpassed by that in NOW (11.50:1) suggests that the bare-infinitival construction has undergone a colloquialisation-driven increase, which has become firmly established in the more formal written genre of news reportage. Furthermore, the decline of the infinitival marker to in this construction is undoubtedly ascribable to grammaticalisation. This development has parallels in the historical process that saw the canonical modal auxiliaries of contemporary English transition, between OE and ME, from lexical verbs to auxiliaries that select a bare infinitival (see further Hopper & Traugott, 2003). Furthermore, the semantic bleaching that is another common feature of grammaticalisation is in evidence with catenative help, with the notion of assistance giving way to merely generalised causation (similar to catenative make). Another uncontroversial case of grammaticalisation is that suggested by the dominance of gerund participials over to-infinitival complements with aspectual start. This situation, which is more pronounced in the OC, with a Ving vs. to V ratio of 1.99:1, than in the IC (1.46:1), reflects the historical drift in catenative complementation in English towards gerund-participial constructions. As with help, one manifestation of grammaticalisation with aspectual start is semantic weakening, in this case involving a shift from the earlier meaning of “leap/jump” to the current sense of “commencement” which emerged during the eighteenth century.
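The ratios cited in this paragraph can also be read as percentage shares of the two competing variants, which is the other format used for comparing alternants in this study. The sketch below is an illustrative conversion only; the function name is hypothetical, and the ratios are those reported above.

```python
def share_from_ratio(ratio_to_one: float) -> float:
    """Convert an 'r:1' ratio of variant A to variant B into A's % share."""
    if ratio_to_one < 0:
        raise ValueError("ratio must be non-negative")
    return ratio_to_one / (ratio_to_one + 1) * 100


# Ratios as reported in the paragraph above.
REPORTED_RATIOS = {
    "help, bare vs. to infinitival (IC)": 3.84,
    "help, bare vs. to infinitival (OC)": 2.86,
    "help, bare vs. to infinitival (GloWbE)": 3.08,
    "help, bare vs. to infinitival (NOW)": 11.50,
    "start, Ving vs. to V (OC)": 1.99,
    "start, Ving vs. to V (IC)": 1.46,
}

for label, ratio in REPORTED_RATIOS.items():
    share = share_from_ratio(ratio)
    print(f"{label}: {ratio:.2f}:1 = {share:.1f}% vs. {100 - share:.1f}%")
```

Read in this way, a ratio of 11.50:1 means that the dominant variant accounts for 92% of the tokens of the two alternants combined.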
It was noted that in the results reported for the two corpora, GloWbE and NOW, the similarities tended to outweigh the differences. It was suggested that a possible explanation for this finding lies in the tendency, noted in previous studies, for OC varieties to exhibit less stylistic variability across formal and informal contexts than IC varieties, and further that this tendency might in turn reflect the fact that in OC countries English is learned primarily in the formal context of the classroom.
It remains to comment on potential future directions. Researchers should seek to avail themselves of the most advanced statistical methods relevant to the study of complementation. Importantly, moreover, researchers exploring the grammar of World Englishes in general need to engage in the preparation of suitable large diachronic corpora comprising not merely web-derived texts but those from other sources, and not merely written but also spoken texts. Finally, it may be suggested that the present study could be used as a suitable reference point to explore catenative constructions in the EFL varieties of the Expanding Circle.