1. Introduction
In his introduction to the recent volume Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches, Clive Holes, an expert on Bahraini and Gulf Arabic dialects, relates the following anecdote to illustrate the similarity of so-called “Bedouin” type dialects:
[W]hen, in the mid-1970s, my employer transferred me from Kuwait to Algeria, a distance of several thousand miles, I had no difficulty, if I spoke in Gulf Arabic, in making myself understood to (and in understanding) ordinary Algerians in southern oasis towns such as Ourgla and Touggourt, even though most of them had never left Algeria in their lives: we were all speaking ‘bedouin’ dialects. But the Arabic of the city of Algiers, only a few hundred miles to the north, and where I was based, is of North African ‘sedentary’ type and was so incomprehensible to me (as was my Gulf Arabic to the Algérois) that throughout my two-year residence there I found it easier to speak French.
This example comes in a section on Bedouin dialects, and is intended to illustrate how, despite issues with the use of “Bedouin” as a classification, it still has value as a category of analysis.
However, the exact mechanism by which these two far-flung dialects remain mutually intelligible with each other, yet unintelligible to speakers of the nearby coastal sedentary dialects, remains uninterrogated. How exactly does their status as “Bedouin” dialects render them similar? The explanation becomes simply that they are Bedouin dialects, and by existing in the same classification, they are expected to be similar, without invoking either history or linguistics to delve deeper into that similarity. If a linguist from outside the field were presented with this example, they would almost certainly explain it as a matter of time-depth: the Gulf dialects and the southern Algerian dialects are the result of a relatively recent divergence, rather than of a vaguely defined typological similarity.
This article will argue that it is precisely due to an accretion of traditional approaches to Arabic linguistics and linguistic history that the situation Holes describes is seen as anomalous or explainable only through categorization. Instead, if we reassess some of those existing views, this situation falls out naturally from basic linguistic and historical processes, without the need to invoke specific social categories like “Bedouin” as explanatory. The issue here is not that the category of “Bedouin dialect” itself is invalid, but rather that neither the category nor the linguistic evidence is sufficiently interrogated. This article aims to suggest ways in which our traditional approaches to linking Arabic dialectology and the social history of Arabic-speaking peoples can be profitably reconsidered, investigating a nexus of interrelated issues that center around a general theme of “oldness”: how dialectologists interpret conservative linguistic features, how they conceive of earlier versus later layers of movement and population, and what conservative dialects mean for the genealogy of modern dialects.
This paper is, at its heart, a historiographical exploration, looking at the narratives that surround the history of Arabic as much as the linguistic data itself, and how these narratives shape our conception of that history. Since it is difficult to survey the entire field in a meaningful or systematic manner in a paper of this length, much of the attention is focused on the recent survey that the quote above comes from, Holes (2018a). This seminal work is a well-researched, elegantly conceived volume which makes the historical dialectological work performed until now both easily accessible and easily comparable, and it is doubtful this paper could have been easily written prior to the publication of that volume. Indeed, the high quality of many of the essays therein makes the larger critique in this paper difficult at times, as many of the authors have indeed begun to move beyond the assumptions critiqued here. However, as argued here, those assumptions continue to influence the research on a less conscious level.
This paper is not intended to be polemical, nor is it intended as a broader criticism of the work in the field of Arabic dialectology and linguistics. The incredible work performed by scholars for the past several centuries has been and continues to be of immense value, and none of the critiques laid out here could even be articulated without that work. The goal, rather, is to offer constructive suggestions to improve the depth and accuracy of the work in the field of Arabic historical dialectology. When a scholar is quoted in the process of identifying a common theme in the literature, these quotes are intended to represent a larger narrative within the field, and certainly not to criticize the author quoted or their work more generally. Indeed, much of the criticism is focused on the present author’s own earlier work.
The paper also seeks to suggest concrete, actionable ways to avoid common issues in historical dialectological work. Section 5 presents a novel heuristic approach to considering how linguistic and population movements are intertwined in the South-West Asia and North Africa region based on general linguistic and social principles. Similarly, the conclusion presents several specific recommendations for future historical dialectology.
Prior to beginning, it is important to step back and consider the goals of what we seek to determine from Arabic historical dialectology research. The closest statement that Holes (2018a) makes on this topic is in a footnote: “Our main purpose in writing this book is to show how a more historically and socially grounded linguistic approach, despite the gaps in the record, can help trace the long-term dynamics and some detail of what happened [in the development of Arabic dialects today]” (note 7). In a sense, what historical Arabic dialectology seeks to do is to take the various snapshots we have of the Arabic dialects (their modern distribution and the glimpses of the dialects we find in historical records) and interpolate them to develop a narrative of how the dialects developed. We seek to be able to somehow “rewind” the historical development of the language to see how it came to be. It is, of course, the “gaps in the historical record” that are the primary difficulty in this endeavor, but this paper will also argue that it is how we view the available historical and linguistic record that can, at times, hamper our progress toward that larger goal of understanding the history of Arabic dialects.
2. Conservative Features and Conservative Dialects
One key issue in Arabic dialectology is how we interpret the data that is available to us. As a preliminary, it is important here to be clear in differentiating several levels of linguistic analysis that operate at different scales.
The lowest level of analysis is a linguistic feature, a particular way of using language, such as the use of a certain reflex of a proto-phoneme, a certain word used to mean “to go,” an intonation pattern in declarative sentences, etc.
At a higher scale is the location of that linguistic feature in space, i.e., its dialect geography, which also implies a certain point in time, analogous to what we find on the pages of a dialect atlas. Finally, for the purposes here, we have a dialect (or more awkwardly, “feature bundle”), which is the total collection of linguistic features spoken in a bounded area in time and space.
It is important to differentiate these levels of analysis, as there is a significant difference between a given feature being old and long attested, and its presence in a particular space and time being long-term. This too is different from its presence alongside a cluster of other features being long-term in a particular area or among a particular speech community. The first of these can be quite easy to prove, even in Arabic where diglossia muddies the waters considerably: if we can find an early attestation, we can prove that a feature is quite old. Of course, the converse does not hold: if we cannot prove the antiquity of a feature, that tells us nothing about whether it is new or old. More difficult is to prove that a feature has been in the same location over a long period of time. Certainly Occam’s razor suggests that if we find a feature in, e.g., early Levantine Middle Arabic documents, and also today, it must have been resident in that place the entire time. However, it is easy to imagine a scenario in which two waves of movement and replacement occurred, such that the feature ceased to exist in the area, and then was reintroduced by a dialect that again had the feature. Finally, it is most difficult to establish the long-term durability of a dialect or cluster of features. Among the challenges in establishing dialect durability is deciding at which threshold one considers that bundle of features to be fundamentally altered, such that continuity, or change, can be declared: a kind of “Ship of Theseus” problem.
A significant issue in the dialectology literature is how we interpret dialects which have a preponderance of archaic features, as opposed to dialects with many innovative features. There is a strong tendency to associate conservative features with a kind of “originalism,” a primordial state that is often taken to imply long-term residency in an area, or some kind of genetic priority over more developed dialects.
This is a form of essentialism in which the linguistic conservatism becomes linked to a larger conservatism that is seen as an inseparable characteristic of the dialects which have those linguistic features.
We see many examples of this conflation in the literature. Behnstedt and Woidich (2018, p. 81) list a variety of migrations into the Fayyum, up to and including the eleventh century, but consider the Fayyum dialect to represent “the earliest linguistic stratum” in Egypt based on conservative linguistic features. For a dialect area only a short distance from Cairo (certainly half the distance from Cairo to Alexandria), and undoubtedly in a close relationship of trade with that city, what is remarkable is precisely that the Fayyum somehow resisted those assimilatory pressures to which Alexandria was subjected, as detailed in Section 3.1. Similarly, they argue that a conservative syllable structure reflects earlier migrations to Upper Egypt, while less conservative syllable structures represent later or continuing migration, independent of historical data (p. 84).
Procházka (2018) formally distinguishes between inherited and innovative traits in his discussion of the Northern Fertile Crescent dialects, a welcome division given how often these are conflated as distinguishing features of dialects. However, he considers the inherited traits to be “‘archaic’ or ‘pre-diasporic’, i.e., going back to dialects spoken in Arabia before Islam” (p. 262), when by definition these traits should be found in any dialect that has not innovated a new form, regardless of when it migrated into or out of an area. Similar arguments regarding linguistic conservatism as evidence of longer settlement or a vague sense of “oldness” are found throughout the volume (see pp. 1, 57, 71, 81, 136, 162–63, 264, 298, 304).
Indeed, this idea of ‘old features’ as ‘conservative’ is fundamental to the differentiation between sedentary and Bedouin dialects, and the related tendency to consider Bedouin dialects as themselves ‘conservative’ by extension. Quoting Rosenhouse (2006, p. 259), Holes (2018a, p. 20) notes that Bedouin dialects are seen as “more conservative” since they “retain many ‘Classical’ features lost elsewhere,” though even without considering Classical Arabic, many characteristically Bedouin features such as the retention of interdentals are certainly retentive with respect to most nearby sedentary dialects. Lists of purportedly Bedouin features are rarely more than lists of retentions, rather than innovations, with the only innovation that commonly can be said to unite all Bedouin dialects being the use of a voiced reflex of the (Q) variable (Palva 2006).
Though Holes and many other modern authors have developed more detailed understandings of the distinctions between Bedouin and sedentary dialects, the two are still viewed as fundamentally distinct from one another and form a key category in the linguistic analyses in the field. In the Holes volume, under the larger category of “major areal and typological” distinctions, there are 35 distinct entries in the index under “Bedouin vs. sedentary,” totaling over 60 pages, while related patterns such as the pre-Hilali vs. Hilali and qultu vs. gilit distinctions have a further 18 and 17 entries respectively. The only other categories listed under this heading are “Maghrebi vs. Mashreqi”, “peripheral vs. heartland” (a total of 50 entries) and “urban vs. rural,” which often functions as a “Bedouin” vs. “sedentary” distinction, especially in the chapter on the Maghreb. The Bedouin vs. sedentary distinction is by far the most ubiquitous distinction in this volume, and one predicated primarily on the apparent conservatism of Bedouin dialects (or, more accurately, of the features of those dialects).
Indeed, it is notable that Bedouin dialects do not appear to be more or less innovative in general than sedentary dialects. Rather, they have participated, by and large, in different innovations than sedentary dialects. Magidow (forthcoming), in a sample of 52 dialects across the Arab world, classified each dialect as Bedouin if it had a voiced realization of (Q), and as sedentary if it did not. Out of a pool of 59 total possible innovations, Bedouin dialects showed an average of 13.1 innovations, versus sedentary dialects with an average of 13.9 innovations. As expected, there was no statistically significant difference between these groups: the two groups are effectively equally innovative.
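The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dialect labels and innovation counts below are invented toy values, not data from the study, and the permutation test is one reasonable way to check for a difference in group means, since the text does not say which statistical test was actually used. The names `TOY_DIALECTS`, `split_by_q`, and `permutation_p_value` are hypothetical.

```python
import random
from statistics import mean

# Hypothetical toy records: (dialect label, voiced reflex of (Q)?, innovation count).
# These values are invented for illustration and are NOT data from the study.
TOY_DIALECTS = [
    ("A", True, 12), ("B", True, 15), ("C", True, 13),
    ("D", False, 14), ("E", False, 12), ("F", False, 16),
]

def split_by_q(dialects):
    """Group innovation counts by the voiced-(Q) criterion described in the text."""
    bedouin = [n for _, voiced_q, n in dialects if voiced_q]
    sedentary = [n for _, voiced_q, n in dialects if not voiced_q]
    return bedouin, sedentary

def permutation_p_value(group_a, group_b, n_iter=2000, seed=0):
    """Two-sided permutation test on the difference in group means (an assumed test)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # randomly reassign counts to the two groups
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    return hits / n_iter

bedouin, sedentary = split_by_q(TOY_DIALECTS)
p_value = permutation_p_value(bedouin, sedentary)
print(f"mean Bedouin: {mean(bedouin):.1f}, mean sedentary: {mean(sedentary):.1f}")
print(f"permutation p-value: {p_value:.3f}")
```

A large p-value in such a test would be consistent with the claim that the two groups are about equally innovative. With real data, the 52 sampled dialects and their counts out of the 59 possible innovations would replace the toy records.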
This focus on conservatism tends to miss a key point, which is that conservatism vis-a-vis dominant linguistic features in the area is simply the result of a group failing to participate in an innovation, not necessarily a deeper statement about the history of that dialect. The fundamental observation of historical linguistics is that only successful sharing of a feature is indicative of shared history or participation in a common speech community (Hetzron 1976; Magidow 2017). Sharing of linguistic features implies connection between the dialects sharing the features, while linguistic conservatism implies a lack of sufficient connection. There are only a few ways that connection can occur, and in that sense all happy linguistic families are alike in that they share many innovations, reflecting a shared past of contact.
However, unhappy linguistic families, those without a connection, are often uniquely different. A failure to participate in an innovation can be caused by a wide range of factors. Dialects may simply be too far apart to be exposed to a particular feature, or the speakers of two nearby dialects may, by virtue of their lifestyles, never come into contact. Social rather than geographical barriers may play a role: a group of speakers can resist aligning themselves linguistically with their neighbors, even if they live one neighborhood over. This is precisely what happens with communal and sectarian dialects (Blanc 1953, 1964; Walters 2006). There may indeed be some kind of influence from the social structure of the speakers of a dialect: the work of Lesley and James Milroy has long shown that dense social networks can inhibit the diffusion of innovations in comparison to looser social networks (Milroy 2008). The role of network density needs to be investigated in greater depth for Arabic, but in most cases it is likely the lack of contact, that is, a lack of frequent linguistic interactions between two populations, that explains most of the disparity between dialects.
Indeed, when there is a connection and interaction between groups over sufficiently long periods of time and little reason to resist change, we would expect to find that linguistic features would diffuse across the entire population. From this perspective, what is remarkable about Bedouin dialects is their lack of participation in innovations. If we discard the essentializing notion that they are in some fundamental, unchangeable way conservative, the most logical explanation for the deviance of Bedouin dialects from sedentary dialects is not that Bedouin dialects are somehow “old,” but rather that they are relatively new arrivals to an area. This is where the distinction between features—which may indeed be old, archaic, or non-innovative, relative to other features found in a language—and feature position in space is key. The individual features of a Bedouin dialect may be archaic, but their deviance from the features in the surrounding dialect geography is almost certainly indicative of a relatively recent movement into that area.
Another issue arises here, which is that even as a nomadic dialect might contain a variety of conservative features, the “feature-bundle” of that dialect may or may not be continuous, any more than in a sedentary dialect. Tribes are imagined in the dialectology literature as relatively unchanging familial groups, but in reality tribal groupings are political rather than genetic entities (Hoyland 2009, p. 390). They can divide and recombine, even as traditional tribal names might be retained (Magidow 2013, pp. 119–22). Today’s Maʿqil tribe is not necessarily yesterday’s Maʿqil (or Ibn Khaldoun’s), either in terms of the genetic makeup of its members or its linguistic behavior. Here again, the distinction between conservative features and conservative dialects is key. The presence of any number of conservative features in a dialect should not necessarily be understood to mean that a specific combination of features has been used together for a long period of time. Nor does it imply that the group which uses those conservative features has had long-term cohesion and durability. It is possible that this was the case, that there has been continuity, but it should not be assumed.
Moreover, Bedouin dialects are imagined as being traditionally spoken by nomadic groups subsisting in resource-poor areas with quite low population density. From the perspective of a dialect map, it takes relatively little human movement for a given space in a low-population area to change its linguistic behavior. Even a dialect map drawn at different seasons might show significant changes as groups move to summer and winter camps within these marginal areas. Contrast this with the dialect maps of high-population sedentary areas, where a massive catastrophe would be required to cause a migration sufficient to change the overall linguistic landscape. In these areas, one would instead expect the linguistic features to diffuse across the landscape, while the speakers themselves remained stationary.
Magidow (2013, pp. 133–34) refers to this contrast between linguistic conservatism and recent migration as the “Bedouin paradox”:
Nomadic speakers generally do not always participate in the spread of innovations among settled groups, and therefore they appear to retain archaic linguistic features in comparison with their settled neighbors. However, their extreme mobility and the ease of replacing indigenous nomadic groups means that these ‘archaic’ speakers may be newcomers to an area in comparison with settled groups.
This idea helps explain Holes’ observation about the Bedouin dialects in Algeria. The dialects that Holes reports being able to understand so well are not magically “Bedouin” in nature. Rather, they likely have a much shallower historical branching from the dialects that he was already familiar with, and had moved into southern Algeria relatively recently in comparison with the sedentary dialects of the coasts. The conservatism of these dialects (which nonetheless have acquired “Maghrebi” features from nearby settled areas) reflects their relatively recent arrival on the dialectological scene.
3. Early Layers and Later Layers
Another key idea linked to the notion of “oldness” as deterministic in the history of Arabic dialects is the “big-bang” model of the expansion of Arabic. This model holds that it is the initial expansion of Arabic in the early Islamic period that is in essence responsible for much, if not all, of the modern geographical distribution of Arabic dialect features.
This model has a genetic component: it is these old dialects, first distributed across what is now the Arabic-speaking world, that are the direct ancestors of the modern Arabic dialects, with changes within those dialects due to contact, urbanization or similar processes.
This concept is foundational in Arabic dialectology. The strongest modern proponent of the idea, Jonathan Owens, explicitly designed his monograph, A Linguistic History of Arabic, around the goal of reconstructing the Arabic of the period from 630 to 790 (2006, pp. 2–5), and continues to use a similar methodology in more recent papers (Owens 2018). My own earlier work, Magidow (2013), followed this basic assumption quite closely as well, and it is a common underlying assumption throughout the Holes volume, where the introduction focuses on what “language …the conquerors spoke” (Holes 2018a, p. 7). The Encyclopedia of Arabic Language and Linguistics article on “Dialects: Genesis” quite explicitly states that “by the 10th century, dialect areas were already shaped” in essentially their present distribution (Abboud-Haggar 2006, p. 620).
Jastrow (2002) distinguishes Zone I dialects, those in the Arabian Peninsula, from Zone II dialects, those “colonial” dialects that are a result of the early Islamic expansions. The idea was also key in earlier work. Ferguson’s famous idea of an Arabic koine, the ancestor of modern sedentary dialects, assumes that “its spread coincided roughly with the spread of urban Arabo-Islamic culture” (Ferguson 1959, p. 618), and the same is essentially true of Versteegh’s pidginization hypothesis (K. Versteegh 1984, 2004).
Even earlier approaches which assume linear descent of the Arabic dialects directly from Classical Arabic are, at their heart, assuming a diffusion of relatively similar speakers at the time of the conquests, with later developments occurring in situ. Many of these ideas go back to the very early grammarian traditions that spoke of Bedouin informants and of dialects becoming corrupted by sedentarization (Blau 1977; Fück 1950; Garbell 1958; C. H. M. Versteegh 2014, p. 138).
The big-bang phenomenon also has a related phenomenon in the study of North Africa, what could be called the “little bang.” The first big bang is shared with the rest of the Arab world, as Arab armies led the conquest of North Africa and Andalusia in the 7th and 8th centuries. This is believed to have laid down an initial layer of Arabic, known as the “pre-Hilalian” variety of Arabic (Marçais 1938). Following this era, another major linguistic expansion occurred in the movement of tribes from the lineage of the Banū Hilāl, supposedly from the Arabian Peninsula (by way of a brief stopover in southern Egypt), beginning in the early eleventh century and ending by the fourteenth in the typical accounts. This group is said to be responsible for the “Hilalian” dialects of North Africa, a group of dialects primarily spoken by Bedouin, rural or recently urbanized populations.
If true, the big-bang idea would be extremely convenient for the historical dialectology of Arabic. Researchers would be able to ignore the complex histories that follow the time periods in which these “bangs” occurred, and instead focus on the vast, early historical tradition which reports many of the early population movements in and out of the Arabian Peninsula. This would allow us to reduce the enormity of the task of Arabic historical dialectology, and to focus on linking those historical reports to the modern distribution of dialects (Aguadé 2018; Behnstedt and Woidich 2018; Magidow 2013; Procházka 2018). Unfortunately, the big-bang model appears untenable for three primary reasons. The first is that it is not clear how durable dialect geography is over time. Second, given the lack of durability of features-in-space over time, it is important to pay attention to the significant evidence that major population movements occurred well after the Islamic conquests. Finally, the model (in either the big-bang or little-bang version) simply does not make effective linguistic predictions, with the actual linguistic features of modern dialects contradicting the predictions that these models would make.
3.1. Durability of Linguistic Material over Time
In general, there is an unarticulated assumption that linguistic features, once in place, will generally persist over time. On a basic level, this is often true, but the Arabic-speaking world is a crossroads of civilizations, with both long-term, continuously inhabited cities and vast areas of quite low population density. Indeed, the disappearance of the many languages other than Arabic following the Arab conquests gives the lie to this assumption, for clearly this linguistic inertia can be interrupted and once-dominant languages driven extinct, like Coptic, or into a very marginal status, as with Aramaic.
There is plentiful evidence from sociolinguistics that language change can proceed extremely rapidly. Miller (2005) found that within a generation of arrival, many Upper Egyptian migrants to Cairo had assimilated to a wide variety of different Cairene linguistic features. The koineization of the Amman dialect appears to have happened within three generations, and has significantly changed the linguistic repertoire of the newly created city (Al-Wer 2003, 2007). The migration of ʿArab dialects to Bahrain, though hailing perhaps from the 18th century, accelerated after the 1930s, and so, in spite of sectarian differences, endogamous marriage within groups and other barriers, by 1995 there was already a developing areal koine (Holes 1995), and by the late 2000s, even in rural areas the old village Baharna dialects “have now all but disappeared” (Holes 2015, p. 475). One notes also that for several key variables, including the shift of (Q) /q/ > /ʔ/, Behnstedt (1997, map 9) differentiates between the oldest and youngest generations, showing change within three generations. These kinds of changes, well attested in the sociolinguistics literature more generally, typically occur on timescales of 3–4 generations, equivalent to approximately one century (Trudgill 1986). To expect any significant linguistic durability of features-in-space, or even of dialect bundles, across longer timespans seems wildly optimistic.
Some accounts of the “big-bang” approach have attempted to formalize the idea of linguistic inertia. For example, Owens (2018, p. 209) suggests that in the framework of Dixon (1997), the Islamic conquests represent a “punctuated phase” in a larger linguistic equilibrium. Even leaving aside the many criticisms of Dixon’s model (see Bowern 2006), and the fact that it is clearly meant for longer time periods than those treated here, it is unclear why this should be the only punctuated phase that is meaningful in the history of Arabic, or how long the phase lasted exactly. Arabicization took centuries in most places, and is still incomplete in many others, such as North Africa. Going back to Dixon (1997, esp. Chapter 6), virtually every form of punctuation he discusses, namely natural causes (e.g., plague), material innovations (especially of weapons), and “development of aggressive tendencies,” has happened repeatedly since the early Islamic conquests.
Magidow (2013) adopted a different approach, attempting to formalize this model of persistence using a concept from geography, adopted by Labov for sociolinguistics, the “principle of first effective settlement.” This principle states:
Whenever an empty territory undergoes settlement or an earlier population is dislodged by invaders, the specific characteristics of the first group able to effect a viable, self-perpetuating society are of crucial significance for the later social and cultural geography of the area, no matter how tiny the initial band of settlers may have been.
In any one generation, if the numbers of immigrants rise to an order of magnitude greater than the extant population, the doctrine may be overthrown, with quantitative changes in the general speech pattern.
Though this principle does indeed seem to match with the role of population density in acting as a barrier to linguistic change (Magidow 2013, pp. 99ff.; Ostler 2005), it is frustratingly vague, and again we simply do not have sufficient access to the complete history of the places in question. Magidow (2013), drawing heavily on Conrad (1981), makes much of the plagues occurring immediately around the time of the Islamic conquests. However, there were clearly many subsequent plagues, including the Black Death that devastated much of Eurasia and North Africa in the 14th century (Dols 1974). Between plague, conquest, migrations, deurbanization and urbanization due to changes in trade, climate, and other factors, Labov’s “order of magnitude” criterion must have been regularly fulfilled in the millennium after the Islamic conquests.
3.2. Later Population Movements
Indeed, we find that when we do look at the history of the Arab world, there are often many examples of later movements and changes that clearly post-date the early Islamic conquests, and which have significant implications for the linguistic history of a region. Even if we restrict ourselves to the chapters in Holes (2018a), we find numerous examples where the current distribution of linguistic features in space is clearly a result of post-conquest population movements.
In Egypt, Behnstedt and Woidich (2018) find many examples where the dialectological situation owes its distribution to much later phenomena. Against Owens (2003), they argue that “the constant return of Maghrebi tribes to Egypt” reinforced the use of the niktub-niktubu verb paradigm, and that for certain regions of the Delta these forms are “at least partly due to later Maghrebization from the fourteenth century onwards” (p. 76). For the city of Alexandria, older linguistic layers have apparently been erased. Alexandria has a long and storied history, and its dialect previously had many Maghrebi features. By the time the French arrived it had only 7000 inhabitants, growing again only in the nineteenth century under Muhammad Ali Pasha. At the end of the 19th century, it still had non-Cairene dialect features, many of which were also Maghrebi, such as /dʒ/ for the (J) variable versus Cairene /ɡ/, and common use of the niktub-niktubu verbal paradigm. By the 1970s it was “a ‘one foot in the grave dialect’” (p. 79), effectively replaced by the Cairene dialect in middle- and upper-class speech among younger people, with older people preserving only some of the original features. These are changes which largely took place only in the last three centuries, such that the pre-modern dialect has little relationship to the current one, and that original dialect may be difficult if not impossible to reconstruct.
Indeed, while we generally would expect that large cities would be the most stable across time given high population density and durability of location, the situation of Alexandria is surprisingly common. Many cities witnessed periods of intense depopulation and repopulation, particularly in the 20th century when urban areas underwent spectacular growth (
Miller 2007). Many modern cities in the Arab world are virtually ex nihilo creations, such as Casablanca (25,000 in 1900 to millions today), Amman (effectively founded in 1923) and Nouakchott (founded 1957). However, even older cities had surprisingly low populations until recent times. Table 1.1 in
Miller (
2007) shows the vast growth in many of the major cities of the Arab world, most of which grew at least fourfold in the 20th century. Given that the rate of urbanization also rose over that time, from 14.5% in 1900 to 59.7% in 2005, this growth almost certainly reflects a massive movement of rural inhabitants into urban spaces.
All of this occurred only in the last century. Going further back, the internal histories of many of these cities are replete with cycles of growth and decline, and so it is quite difficult to be sure that an urban space is going to continue earlier linguistic behavior. Baghdad, once one of the largest cities in the world, is said to have had as few as 15,000 inhabitants in the 1650s (
Palva 2009, p. 31). Even rural populations may have had recent depopulating and repopulating events—
Behnstedt and Woidich (
2018) suggest that Upper Egypt was depopulated multiple times, and that in some Upper Egyptian dialects “one has to suppose that the immigration from the Ḥijāz lasted right up until the present era, and that some of the village dialects evolved only in recent times (p. 84)”.
In the Levant, an area for which we have some of the oldest clear examples of Arabic language in the form of late Nabatean and Safaitic, many modern dialects and their features seem to be relatively recent, certainly more recent than the Islamic conquests.
Lentin (
2018, p. 175) quotes Ayalon as saying that near the end of the 15th century, the area between Latakia and modern Biredjik was Turkish-speaking rather than Arabic-speaking (these form a line that passes northeast from Latakia to approximately 50 km north of modern Manbij). While not entirely surprising, given that much of this area is Turkish-speaking today, this pushes the development (or perhaps, “deployment”) of the Cilician dialects later than might be imagined.
In the Fertile Crescent and the Syrian-Iraqi desert,
Procházka (
2018) begins the history of the region in the pre-Islamic era, with the desert hinterlands said to already have contained Arabic-speaking tribes. His primary claim is that the late tenth century is the
terminus ante quem for the features he describes as characteristic of relatively more sedentary dialects in this region, but ascribes the Bedouin features in others only to the era following the Mongol conquests in the 13th century. The Shāwi Bedouin dialects he considers an early stratum, but one he links only to the 11th century, while he also notes frequent movement even into the 20th century. The camel-breeding Shammar and ʿAnaza are said to have come only in the 19th century, the time he gives for similar migrations to the Cilician Plain, building on Lentin’s account above about the Arabicization of north-west Syria. It is also notable that folk accounts put the major migrations into Tillo, near Siirt, Turkey by Arabic speakers to ca. 1300 and 1600 in two waves (
Procházka 2018, n. 5). While Procházka does note features which are attested early, and still present in the region, such as the shift of /*r/ > [ɣ], attested in Al-Jahiz (d. 869 CE), this is only a report of a feature, with some marginal spatial information (p. 270). No data exists to determine whether the current linguistic situation reflects continuous inhabitation by the same dialect group.
In the Gulf,
Holes (
2018b) states that only in the past century, beginning in the 1930s to the 1970s, “the Gulf dialects
as a whole (with the partial exception of Oman) underwent a number of ‘reductional’ changes in their morphology (p. 134, emphasis original).” These changes include a loss of gender distinction in plural verbs, loss of the internal passive, loss of the dialectal tanwin, and the innovation or increased use of analytical genitive markers. All of these are typical of the features used in dialectology for classification and historical reconstruction. If they can spread in a few decades, then we must be cautious about reconstructing even further back. Holes’ “tribal Arab” dialects are said to be primarily due to 18th century migrations, with an earlier stratum of unknown chronology (pp. 134–35). Though his evidence for the antiquity of his B strand of dialects as an early layer is quite compelling, he rests some of his argument on the isolated nature of Oman, noting that significant changes have occurred since the accession of Sultan Qaboos in the 1970s, but that many speakers he worked with in Oman showed extremely limited mobility. However, one wonders whether earlier periods in Oman’s history, such as the Yaʿrubid dynasty (1624–1742) and Omani Empire periods (1710–1783), when Oman was a major regional power, might not have had a similar impact on the language to the present growth of the Omani state.
13 While all of the authors in this volume, and in Arabic dialectology more generally, are clearly aware of these later changes, it is still difficult for them to pull entirely away from the “big-bang” model.
Behnstedt and Woidich (
2018) search hard for the “first layer” in Egypt, drawing on early accounts of the tribal affiliations of the migrants, even as they acknowledge later strands of migration.
Holes (
2018b, p. 133) attributes the -
inn- infix’s distribution in the Arabian Peninsula to the era of the early Islamic conquests, though he notes that its movement into Egypt, Sudan and the West Sudanic area was probably later.
Owens (
2018), as noted previously, simply assumes a big-bang model which leads quite directly to suspect historical reconstruction. He argues that the
b- prefix in modern dialects comes from a single source simply as a result of his historical model, “the spread of b- described did occur in some regions very early, and indeed has existed in the forms which will be reconstructed since at least the earliest Islamic period, if not in the pre-Islamic era (p. 212),” which leads him to treat the Yemeni
b- prefixes, which are quite transparently derived from *
bayna(ma) (
Behnstedt 2016, p. 213) as equivalent to other
b- prefixes which are likely from other sources (Owens himself reconstructs them as being from
yabġā > yaba). Procházka also tends to use the construct of “pre-diasporic Arabic” (p. 267) and focuses on the early tribal conquests and settlement, even as he acknowledges the later population movements.
The other side of the big bang equation also seems lacking at times—there simply is not always a compelling record of Arabicization in the earliest periods. The Arabicization of Egypt probably did not begin in earnest until the Fatimid era, with complaints about the shift from Coptic to Arabic reported into the eleventh century CE (
Magidow 2013, pp. 220–22;
Papaconstantinou 2012). Aramaic took time to be supplanted even in the Levant and Iraq, with neo-Aramaic dialects surviving to this day in Syria and Iraq.
The area where early Arabicization is most unlikely is North Africa, where vast portions of the area remain either un-Arabicized or show significant bilingualism between Berber languages and Arabic, even after 13 centuries of Arabic presence in the region. The chronology outlined in
Aguadé (
2018) does not seem likely to have produced significant Arabic penetration prior to the tenth century (the “pre-Hilalian” period). He repeats reports that Arabs formed only a part of the population in many Tunisian towns into the twelfth century CE, while Qayrawan in his telling only developed into a major regional center by the ninth century. Even if it was, as claimed with very little evidence, “the origin of the spread of all pre-Hilali Maghrebi dialects,” the process that would have resulted in significant Arabicization would have taken centuries. Even in the traditional French chronologies, Tunisia itself, a mostly flat and accessible area immediately surrounding Qayrawan, is completely Arabicized only by the 15th century, suggesting that the spread of Arabic must have been quite slow (
Aguadé 2018, p. 42). Fes is said to have been “surrounded by Arab tribes” in the twelfth century, while the supposedly Arab settlements of Baṣra and Nakūr disappeared by the 11th century, though it seems likely Berber was still the dominant language. There is little in this history to suggest a strong early Arabicizing trend in North Africa that would allow us to clearly attribute the supposedly “Pre-Hilalian” layer to the earliest era of settlement. We do have evidence that a dialect similar to modern Moroccan Arabic existed by the twelfth century, but it still shows some differences, such as the use of
mtaːʕ ‘of’ instead of
dyāl now more common in Morocco (
Vicente 2012).
Indeed, there certainly seems to have existed an even earlier layer of dialects than the Pre-Hilalian ones. For example, in Awjila Berber in Libya, the words for ‘Friday’ and ‘heaven’ contain a reflex of the (J) variable that must originally have been a palatal-velar [ɟ] or [dʒ], which goes against the “pre-Hilalian” [ʒ] and Egyptian (and likely Proto-Arabic) [ɡ] (
van Putten and Benkato 2017). Borrowed words with the feminine ending have
-at, not the current
-a used outside of construct state, and this is the case in Berber dialects across North Africa (
Kossmann 2013, pp. 209–14). All of this is suggestive of a very minor early level of Arabicization, that was almost certainly swept away by later layers, many of which probably came into the area well after the early Islamic conquest period.
The big bang view would perhaps presume that these later migrations were simply additive, building on an already-established dialect geography. However, the poor level of initial penetration of Arabic into the conquered areas, the high likelihood of extreme depopulating events occurring between those initial settlements and the present, and the relatively rapid pace of linguistic change all suggest that the big bang model of linguistic history is far too simplistic, and that the convenience of the model is not something that we as a field are lucky enough to enjoy.
3.3. Effectiveness of the Big Bang Approach as a Linguistic Model
The big bang approach is also problematic simply because it does not appear to be a strongly predictive linguistic model. As
Behnstedt and Woidich (
2013, sec. 15.5.2.2) succinctly note, “Neither of these two approaches [Jastrow 2002’s zones, Owens 2006 ‘pre-diasporic Arabic’] is convincing for linguistic subgrouping, because they cannot be related to linguistic variables which would justify them.” This is in part because of the difficulty, noted in
Section 2, of linking not just individual features, but features clustered with one another within a dialect across nearly two millennia of history. Attempts to do so starting in the mid-twentieth century were largely unsuccessful, failing to identify clear features of the earliest layer of conquest (
Blau 1977;
Cohen 1962;
Ferguson 1959;
see summary in Miller 1986;
Versteegh 1984), while both
Magidow (
2013) and
Owens (
2006) should be re-evaluated in light of the arguments presented here.
There are certainly features shared by all modern dialects that contrast with earlier layers of Arabic—the development of vowels in hollow verbs, for example, suffuses all modern Arabic dialects, even though there was a clear historical memory of the earlier situation, where the glides in hollow verbs were retained, also attested in the pre-Islamic Safaitic inscriptions (
van Putten 2017a). Similarly, virtually all modern dialects now have *-aya > /aː/ for the
alif maqsūra, though the /e:/ reflex is still attested in Classical and Quranic Arabic (
van Putten 2017a). However, the fact that these features date to around the time of the Islamic conquests does not tell us that these features directly hail from that era, since any later migrations almost certainly would also have brought these features to those areas.
The little bang theory makes more specific predictions than the big-bang theory: there should be a clear division between the dialects that settled North Africa during the “pre-Hilalian” era and those that came later. Of course, even if this prediction was proven true, it does not prove the specific historical claims of the model—it simply proves that there have been multiple waves of migration, a situation that exists in basically all Arabic-speaking regions. However, the linguistic evidence still does not support a simple binary bifurcation of the dialects in this region.
Aguadé’s (
2018) article in the Holes volume has done a monumental job of listing all the claimed isoglosses between pre-Hilalian and Hilalian dialects, which previously were largely scattered across dozens of publications. This allows us, however, to note how contradictory many of these isoglosses are, often crossing the supposed boundary between the two layers. For example, the phonemes /b/ and /m/ replace each other in both types of dialects; interdentals are generally merged with dentals in pre-Hilalian dialects, but many exceptions exist, even in dialects like Cherchell that are traditionally considered key examples of pre-Hilalian dialects (p. 44). Short vowels are lost in open syllables in both types of dialects in Morocco (p. 47), though this is an urban feature further east. Monophthongization of historical diphthongs similarly crosses the boundary (p. 48). Unconditioned
imāla cuts across all Maghrebi dialects (p. 49). The classic
niktub-
niktubu isogloss characterizes all North African dialects, while the loss (or retention) of gender distinctions has occurred in both Bedouin and sedentary dialects (p. 55). While there are indeed some remaining isoglosses which do distinguish these groups, for example the realization of (Q) and greater use of analytic genitives in sedentary dialects, it remains quite possible that these are later than the alleged migrations (see also below in this section).
Research outside of the traditionally highlighted isoglosses again supports a subgrouping of North African dialects which cuts across the divide.
Magidow (
forthcoming) analyzes the personal pronouns, demonstratives, and interrogatives in over 80 Arabic dialects to find isoglosses which can be used to classify the entire Arabic-speaking region. In contrast to many classification schemes, where the historical model often drives the selection of isoglosses, the isoglosses in this study were derived directly from the linguistic data without reference to a historical model. Isoglosses in this study are only those in which a clear innovation has occurred, so retentive features (e.g., retention of the interdentals) that are so often mentioned in the dialectology literature are excluded except for comparison with traditional classifications.
One striking result from this analysis is that North African dialects, regardless of where they fall on the Hilalian divide, tend to cluster quite strongly. The study identified 9 isoglosses that were much more strongly represented in North African dialects than in the mainstream of Arabic dialects, identified in
Table 1, which shows the isoglosses, their prevalence among the dialect sample as a whole, their prevalence in North African dialects, and their prevalence in dialects which have voiceless and voiced realizations of the variable (Q), used to distinguish between putatively pre-Hilalian and Hilalian dialects.
14 It is clear from
Table 1 not only that North African dialects are strongly linked by the isoglosses, but also that these innovations cut across the Hilalian divide. Many of the features are common in both sets of dialects, even as they are rare elsewhere. There is a great deal more homogeneity, and a larger number of isoglosses which distinguish this group, in contrast to the few other dialect groupings found in (
Magidow, forthcoming). The “Peninsular Bedouin” group, found in the western Arabian Peninsula, is only distinguished by 3–5 isoglosses, while the “Sedentary Levantine” group has 8 isoglosses.
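The kind of comparison summarized in Table 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the dialect names, the feature labels (`iso_a`, `iso_b`), the group memberships, and the function names are hypothetical placeholders and do not reproduce the data or the procedure of the study. The sketch simply computes, for each isogloss, its share of dialects in the whole sample, in a regional subset, and in subsets defined by the realization of (Q).

```python
from typing import Dict, Iterable, Set

# Hypothetical toy sample: dialect name -> set of innovative isoglosses attested.
# All names and feature labels are placeholders, not data from the study.
SAMPLE: Dict[str, Set[str]] = {
    "na_west_voiceless": {"iso_a", "iso_b"},
    "na_west_voiced": {"iso_a", "iso_b"},
    "na_east_voiced": {"iso_a"},
    "other_1": set(),
    "other_2": {"iso_b"},
    "other_3": set(),
}
NORTH_AFRICA = {"na_west_voiceless", "na_west_voiced", "na_east_voiced"}
VOICED_Q = {"na_west_voiced", "na_east_voiced"}


def prevalence(sample: Dict[str, Set[str]], isogloss: str,
               group: Iterable[str]) -> float:
    """Share of dialects in `group` (restricted to the sample) showing `isogloss`."""
    members = [name for name in group if name in sample]
    if not members:
        return 0.0
    return sum(isogloss in sample[name] for name in members) / len(members)


def prevalence_table(sample: Dict[str, Set[str]],
                     groups: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """For every isogloss in the sample, its prevalence in each named group."""
    isoglosses = sorted(set().union(*sample.values()))
    return {
        iso: {label: prevalence(sample, iso, members)
              for label, members in groups.items()}
        for iso in isoglosses
    }


GROUPS = {
    "all": set(SAMPLE),
    "north_africa": NORTH_AFRICA,
    "voiced_q": NORTH_AFRICA & VOICED_Q,
    "voiceless_q": NORTH_AFRICA - VOICED_Q,
}
TABLE = prevalence_table(SAMPLE, GROUPS)

for iso, row in TABLE.items():
    print(iso, {label: round(value, 2) for label, value in row.items()})
```

In this toy sample both features are more frequent in the regional subset than in the sample as a whole, and `iso_a` is equally frequent on both sides of the (Q) split, which is the pattern the argument describes: regionally concentrated innovations that cut across the putative divide.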
This is not to say that North African dialects do not show some differentiation along the lines of the supposed Hilalian divide, but this tends to be the exception in the sample and data here rather than the rule. The
ʔintiːna innovation appears to characterize only dialects which fit the pre-Hilalian mold—none have voiced Q, all have merged interdentals. These 5 dialects (Anjra, Morocco; Larache, Morocco; Fez, Morocco; Tlemcen, Algeria; Djidjelli, Algeria)
15 share almost all of the major North African isoglosses, though some lack -
ayya suffixes,
ʔaʃkuːn or
waqtaːʃ. They also share another isogloss, a ‘when’ form based on *
fiː ʔay waqt.
The ‘where’ interrogative fayn is never found in dialects with voiced Q, and almost all dialects (75%) with that form have merged interdentals. From the other side, only voiced Q dialects have -a(h) in the 3ms suffix pronouns. Interrogatives for ‘when’ derived from ʔayy mataː are primarily found in the voiced Q group, but this is also found in two putatively pre-Hilalian dialects, that is Marrakesh and the Jewish dialect of Tripoli, Libya.
Figure 1 is a heatmap of the number of these features, showing all dialects with two or more features from
Table 1. The size of the circles corresponds to the number of isoglosses from the table found in each dialect. Overlaid on this are symbols for whether the dialects have merged the interdentals with the dental consonants (taken typically as an indication of ‘sedentary’ dialects) or if they have a voiced realization of (Q). What is evident from this figure is that these features are found in both sedentary and Bedouin dialects. The number of features appears to form an east–west cline, rather than a division between Bedouin and sedentary dialects. As one moves west, one tends to find more of these features, while further east there are fewer—to the point where Benghazi’s dialect shows none of these isoglosses at all. There are of course some dialects outside of North Africa which have some of these isoglosses as well, though not in the areas of Arabia often held to be the source of the North African dialects (
Magidow 2013, p. 236 shows the presumed homeland of the Banū Ḥilāl and Sulaym as western Arabia).
Evidence of an east–west cline, rather than a sedentary-Bedouin distinction is found elsewhere in
Aguadé (
2018). The affrication of /*t/ > /t͡s/ is only found in Morocco and Algeria, not further east (p. 44). The indefinite articles
waːhid and
ʃiː are used only in Morocco and Western Algeria, while
fard is common further east in Tunisia and Libya (p. 50). The supposedly ancient genitive particles derived from
*mataːʕ are most widespread from North-Eastern Morocco into Libya, while the genitive particles similar to
diyaːl are concentrated further west.
The Hilalian “little bang” narrative does not provide a strong model here for understanding the data. Instead, the data shows a remarkable unity among North African dialects on both sides of the Hilalian divide, while the many contradictions in the traditional isoglosses found for North African dialects strongly undermine the narrative. If the Hilalian narrative were correct, we would expect a clear bifurcation on an essentially north–south axis, between sedentary pre-Hilalian dialects on the coasts and rural/Bedouin dialects further south. The fact that many of the features work on an east–west cline strongly suggests that North Africa is little different from any other region of the Arabic-speaking world, with a gradual process of migration into the region. Dialects coming from further east, which presumably lacked many of the key North African features (but not all, given the eastern dialects shown with yellow in
Figure 1) would acquire those features. Distance to the west should also act as a reasonable proxy for time spent in the region, as it takes time to migrate and settle, which would explain the cline going from fewer features to greater the further west the dialects are found.
16 This section provides another piece of the explanation for Holes’ observation about the similarity of Gulf and Algerian Bedouin dialects. The existing big-bang and little-bang models cloud our view of these dialects, forcing us to assume that they are separated by nearly a thousand years of history, with the Algerian Bedouin dialects being the direct descendants of the Hilalian dialects of the 11th to 14th centuries. Instead, it seems quite likely that the movement of those dialects into that area was much later, which would explain the easy mutual intelligibility he experienced. Moving away from that historical model allows us to instead apply Occam’s razor and propose that many migrations proceeded into (and out of) North Africa even after the supposed Hilalian period.
4. Conservative Dialects and Early Origins
There is another common idea based on apparent linguistic conservatism that combines both the “conservativeness as archaic” and the “big-bang” model. The idea is that dialects which are more conservative are seen as likely candidates for genealogical ancestors of less conservative dialects. This was, of course, the basic premise behind the long-held view that Classical Arabic is the direct ancestor of the modern dialects, which has largely fallen out of fashion as evidence has begun to more clearly show that Old Arabic deviated in significant ways from Classical Arabic (
Al-Jallad 2017).
However, there still tends to be a belief that the modern Yemeni dialects reflect, due to their archaism, a kind of ancestor to most modern dialects. This cannot be entirely separated from the big-bang hypothesis, nor can it be separated from the idea that conservative dialects have genetic priority. The Yemeni dialects are highly archaic in a number of ways, from phonology to vocabulary, and many of the features in Yemeni dialects (as a whole perhaps more than individually) appear quite similar to canonical Classical Arabic. For example, the interrogatives are primarily of the mā variety, rather than the forms derived from *ayy šay which are common in both later Classical/Middle Arabic and most dialects.
This apparent linguistic conservatism is coupled with a strong historical tradition that holds that Yemen, and Yemeni tribes, were the origins of many of the Islamic armies. In combination with the big-bang approach, this means that these Yemeni dialects are often held to be the immediate proximate ancestors of modern Arabic dialects. For example, Map 3.2 in
Behnstedt and Woidich (
2018) depicts “spring pastures of Yemeni tribes in immediate post-conquest Egypt,” while references to that Yemeni origin and influence abound in their article. Elsewhere we see for example Yemeni origins posited as the ultimate ancestors of the Maʿqil tribes that are seen as ancestors of the Hassaniya dialect (
Taine-Cheikh 2006, p. 301), a narrative repeated by Watson (
Watson 2018, n. 9). Outside of this volume, one can witness the attempt to link Yemeni to Andalusi Arabic (
Corriente 2014), though that argument has been criticized (
van Putten 2017b).
Beyond the historical reasons for seeing this link, there is also a deeper misinterpretation, again about the significance of “old” features. It rests on the assumption that archaic features, or dialects with many retentions versus innovations, are older, and that older dialects must necessarily be ancestors to newer dialects. However, the reality is that the Yemeni dialects (and conservative dialects more generally) are striking precisely in how much they differ from other dialects, however conservative they might be. Though this means they could be an ultimate ancestor (or related to an ultimate ancestor) of other modern dialects, there almost certainly exist more proximate ancestors.
A wide-scale comparison of existing Arabic dialects largely confirms this. Magidow (in preparation) identified 55 isoglosses that represent innovations with regard to Proto-Arabic, and then identified isoglosses present in at least 60% of all sampled dialects. These are shown in
Table 2. The geographic distribution of these groups is shown in
Figure 2.
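The thresholding step described here can be sketched as follows. This is only an illustration under stated assumptions: the dialect names, feature labels, and function names are hypothetical, and the percentage cut-off is a parameter rather than a claim about how the study was implemented. The sketch selects the isoglosses attested in at least a given share of the sample, then lists the dialects that lack any of those widely shared innovations.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical toy sample: dialect name -> set of innovations it shows.
# Names and labels are placeholders, not data from the study.
DIALECTS: Dict[str, Set[str]] = {
    "d1": {"iso_x", "iso_y"},
    "d2": {"iso_x", "iso_y"},
    "d3": {"iso_x", "iso_y", "iso_z"},
    "d4": {"iso_x"},
    "d5": set(),
}


def core_isoglosses(sample: Dict[str, Set[str]], min_percent: int = 60) -> Set[str]:
    """Isoglosses attested in at least `min_percent` percent of sampled dialects."""
    total = len(sample)
    if total == 0:
        return set()
    counts = Counter(iso for features in sample.values() for iso in features)
    # Integer comparison avoids floating-point rounding at the threshold.
    return {iso for iso, n in counts.items() if n * 100 >= min_percent * total}


def outside_core(sample: Dict[str, Set[str]], core: Set[str]) -> List[str]:
    """Dialects missing at least one of the core isoglosses."""
    return sorted(name for name, features in sample.items()
                  if not core <= features)


CORE_60 = core_isoglosses(DIALECTS, min_percent=60)
print(sorted(CORE_60))
print(outside_core(DIALECTS, CORE_60))
```

Raising `min_percent` narrows the set to the most widely shared innovations, and the dialects returned by `outside_core` are then the conservative outliers relative to that core.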
What is notable about the distribution of these features is that, as would be expected from the literature, the Yemeni dialects are the only dialects to fall outside of the 90% core. That is to say, the vast majority of Arabic dialects have participated in those two major innovations, while it is primarily the Northern Tihama dialects (
Behnstedt 2016, p. 191) that have failed to partake in these innovations. Though not necessarily in the same group of dialects per Behnstedt’s classification, there are further archaisms found only in Yemen that are absent from most modern dialects. Virtually every modern Arabic dialect has, through analogy with the 1cp suffix
-na: (both pronominal and verbal) innovated a form of the 1cp independent pronoun from *
niħnu to
niħna or similar. The primary exceptions are in Yemen (points 153 and 154, just outside of Taʿizz, not far from the point 145 included in
Figure 2), while
Reinhardt (
1894, pp. 21–22) transcribes
honu: and
naḥnu for some Omani dialects.
The implication of this data, when freed from the “archaic as ancestor” narrative, is that it is quite unlikely that these Yemeni dialects are the most recent node on a genealogical tree from which all other dialects developed. It is much more likely that most modern dialects derive from a dialect which innovated these core features, while the Yemeni dialects simply did not participate in those changes. The conservatism of these Yemeni dialects and their linguistic features provides a convenient window into the linguistic past of Arabic; however, it does not imply a linguistic ancestry in a historical sense.
Given how widely distributed across space (and time, given their presence in Andalusian Arabic) the dialects with 90% core features are, it would seem reasonable to assume that a single ancestor developed those features before becoming more widely distributed. This dialect, of course, would be unlikely to have been in Yemen, given that its innovations have not fully suffused the area, in contrast to the rest of the Arabic-speaking world. In the “big-bang” viewpoint, this would be the “pre-diasporic” Arabic, the variety that was spoken by the conquering Arab armies in the early Islamic conquests.
However, as detailed previously, we simply cannot be entirely sure of how ancient or new any dialectal features are with respect to their distribution in space. There has clearly been a huge amount of dialect movement and change, and many waves of diffusion which have brought specific linguistic features across the Arabic-speaking world. The example of the verb
ʃaːf ‘he saw’ is illustrative. The verb
ʃaːf has incredibly wide distribution across the Arab world, found in virtually every Arabic-speaking region today. While this diffusion is often believed to be quite early (
Ferguson 1959 includes it in his list of features of the koine), many dialects still have other verbs or biforms which appear to be diminishing only in recent times (
Behnstedt and Woidich 2011, pp. 330–37). It is hard to determine the earliest this verb came into use—we find an example of the causative
ywšwfwk ‘they will show you’ in a papyrus datable as early as ca. 1000–1100 CE.
17 Though attested relatively early, the modern widespread distribution of this form often appears to be quite recent.
Cowan (
1966) argues that the diffusion of
ʃaːf must have taken place between the twelfth and sixteenth centuries, based on the places where it is not found (Malta, 15th century Andalusia, and Cyprus). However, in much of North Africa this lexeme’s uptake is clearly quite recent. It is generally not found in Judeo-Arabic varieties in North Africa, while
Aguadé (
2018, p. 57) notes that in Djidjelli as of the 1950s,
ʃaːf was a recent loanword, while in Anjra in the 2000s there was a generational divide between users of
ra and
ʃaːf.
Blau (
1977, p. 200) notes a similar situation in the Tunisian dialect of Marazig, where
ʃaːf was only widespread among men.
Behnstedt and Woidich (
2011, p. 333) note many places, in both eastern and western dialects where there are bi-forms, often with
ʃaːf clearly encroaching on older local forms (mostly
*raː reflexes, but see their maps for many other previously common verbs with this meaning).
The implication here is that the diffusion of the “core” isoglosses could be either genetic or areal: that is, the diffusion of these features could be an early phenomenon, in a common ancestor of modern dialects, or like ʃaːf, they could owe their current distribution to later waves of diffusion. Particularly in the latter case, it would not be surprising that this diffusion would not have made it to Yemen, a peripheral region with difficult terrain, often not included in a meaningful way in the major Arabic-speaking empires. However, in either perspective, while Yemen may have preserved earlier states of the language, Yemeni dialects are at best a great-grandparent or great-aunt to the bulk of modern dialects, but hardly a direct parent.
5. From History to Heuristic: An Alternative Approach
There are several dangers when it comes to using a historical approach for analyzing the macro-history of Arabic dialects. The first is simply that any attempts to reconstruct the history of the Arabic-speaking world are wildly ambitious. The Arabic-speaking world is vast, both in its present reach and historical extent. With Arabic spoken from Mauritania to Afghanistan to Zanzibar, and Arabic inscriptions already attested early in the first millennium, centuries before Islam, any attempt to even scratch the surface of this history will necessarily be extremely superficial. Most historians would be hesitant to even attempt a complete history of the population movements in a region, let alone the entire Arabic-speaking world, especially in the current environment where longue durée and macrohistory have become somewhat rare, or are treated at a more popular than academic level. By way of example, Hugh Kennedy’s The Great Arab Conquests runs to nearly 500 pages, but covers less than two centuries. Expecting a dialectologist not trained in history to be able to make an original contribution and synthesis of the historical literature is demanding a great deal.
The second is that established histories can become quite difficult to rethink and question once they have become integrated into the dialectological literature. We have seen this already in the big-bang approaches to Arabic, but the little-bang approach of North African history bears additional scrutiny. The historical model established by
Marçais (
1938) has dominated the dialectology of North Africa, and researchers remain strongly committed to it.
Benkato (
2019) has documented this extensively (pp. 16–18 especially), but one is still amazed to find statements in
Aguadé (
2018) such as “it is this role as a junction that, according to William Marcais, made Qayrawan the origin of the spread of all pre-Hilalian Maghrebi dialects, and
there is no reason not to accept this assumption [emphasis added] (p. 39)” or “the question of whether this description [of Banu Hilal causing destruction] (which was widely embraced by the French colonial ideology in the nineteenth and twentieth centuries) actually reflects historical facts
need not be discussed here [emphasis added] (p. 42).” The dominant narrative is unquestioned, even as it makes it difficult to account for the many contradictions in the data covered in that chapter, as discussed previously.
Finally, it is not clear that linguists are always able to do high-quality historical work, again because we are largely not trained to do so. A common concept in Arabic dialectology is that of
sprachinseln, dialect areas which, by virtue of being cut off from the mainstream of the Arabic language, can allow us to date subsequent changes. However, our historical models for these sprachinseln are highly simplistic. One of the most commonly discussed situations is that of Malta, for which
Holes (
2018a, p. 18) gives a typical interpretation: “a good example [of a
sprachinsel] is Malta, where a variant of Siculo-Arabic was spoken until the end of the eleventh century, after which all contact with Arabic-speaking communities ceased.” This is echoed by
Aguadé (
2018, p. 34) where he states that “Maltese is an important source as
terminus comparationis since the island of Malta was conquered by the Normans […] in the year 1090,” the value of which is that “Maltese represents an archaic pre-Hilali dialect which evolved uncontaminated by later Hilali interferences.”
However, even a slightly deeper look into the history of Malta suggests that reports of this “cutting off” are greatly exaggerated. The conquest of Malta did indeed bring it under the rule of the Kingdom of Sicily in 1090, but that Kingdom also included Northern Tunisia, and further conquest included Tripoli from 1146 to 1158, and coastal Tunisia from 1134 to 1148. The Normans do not appear to have meaningfully occupied Malta at that time. Rather, Malta was a tributary until 1127 when it was conquered “to use as a transit point for trade,” clearly including trade with Tunisia (
Joffé 1990, p. 68).
Luttrell (
1975, pp. 31–32) reports a Pisan captain, apparently to combat piracy, seizing a Tunisian ship at Malta with the goods aboard it, and throwing the crew into the sea. The island of Pantelleria, 200 km NW of Malta, was even included in an arrangement by which half their tribute went to Tunis (
Luttrell 1975). This trade with North Africa clearly continued: under the Aragonese crown starting from 1283, “Malta’s real usefulness was not as a market or a source of raw materials but as an entrepot and safe harbor on the routes to Beirut and Alexandria, and above all to Tunis and other African ports (
Luttrell 1965, p. 6)” where slaves, wood, and cotton all appear to have been traded through Malta. Trade contacts almost certainly would have also resulted in continuing linguistic contact.
Cyprus is held to similarly have been cut off, with
Borg (
2006, pp. 536–37) stating “the sociocultural parallels between Cypriot Arabic and Maltese is particularly close … since in both cases, we are dealing with an Arabic vernacular surviving in complete isolation” referring presumably to the Crusader conquest of the island beginning in the twelfth century.
18 This vision of a “cut-off” in Cypriot Arabic leads
Tsiapera (
1969, p. 11) to speak of a dialect “isolated for some six centuries from other Arabic speakers.” However, here again, there is no clear cut-off, with both migration and trade clearly continuing through both the crusader and Ottoman eras. Movements of Christian refugees from the Levant continued at least through the 13th century (
Borg 2004, p. 8), with some authors suggesting migrations even into the Ottoman era in the 16th century (
Hourani 1998). As in Malta, trade with the Levantine coast must have continued in both the crusader and Ottoman eras (
Borg 2004, p. 10).
The problem here is that linguists, not trained as historians, are making two basic mistakes in their historical research. The first is simply not diving deep enough in their research on these dialects: there is notably almost never a reference cited for the “cutting off” of Malta (none is cited in Holes’s or Aguadé’s assertions above), and little further research appears to have been carried out in most such accounts. This is somewhat reasonable, as these authors are dialectologists, not historians, and cannot be expected to be deeply immersed in the historical literature. The second is the overly facile equation of a change in political rule or religious affiliation with a wholesale change in a group’s personal and trade relationships. This again seems to be an issue of not having been trained to appreciate the surprising mobility, and the importance of trade, that were common in the pre-modern world. Thus, while dialectologists are not ‘to blame’ per se for perpetuating historically inaccurate models, these models can significantly impact how we interpret the data from our dialectological research.
One solution to this problem is to pursue more focused, microhistorical work on particular regions.
Palva (
2009) is a paradigmatic example of this, weaving highly detailed dialectology work with a more sophisticated view of history than we sometimes encounter. He rightly questions the tradition that 1258 marks the decline of Baghdad (see for an example of that idea
Fischer (
2006) which uses 1258 as the dividing line for “post-Classical Arabic”), and instead notes that the decline of the city has been documented as beginning much earlier. He also correctly identifies that the city may have had cycles of depopulation and repopulation. This kind of highly focused work (in this case, on a single city) allows for a much deeper level of research into the history of the area. Another suggestion would be for linguists to reach across departmental and disciplinary lines and work with historians specializing in the area to bring a new perspective to their work, and certainly linguists and dialectologists have their own contributions to make to historical research.
The microhistorical approach, however, does not work as well when dealing with large areas or finding general trends. Here, the danger of missing large chunks of history will always rear its head, and it can be challenging to dismantle the existing historical narratives dominant in the field. For this reason, I suggest instead a general heuristic that draws on general principles, rather than specific historical narratives, to suggest the kinds of linguistic movements that we should expect to see over time and which can be applied as a basic test of whether a historical narrative is plausible.
This heuristic is based on the observation that the movement of linguistic features across the landscape divides primarily into two types. The first is the diffusion of features without a major change in the distribution of populations within a space: the speakers remain in situ, while the linguistic feature itself moves across the landscape. I refer to this simply as “diffusion.” The second is the movement of populations, such that a group moves into an area, bringing the linguistic features of that group with it. I refer to this as “migration.” The speakers are not changing their linguistic behavior, but the linguistic geography of the area changes by virtue of their movement.
Diffusion is by far the most common way a linguistic form moves across the landscape in areas with high population density. This diffusion is generally not simply geographical, diffusing outward to the nearest geographical point, but instead is often hierarchical, with features moving between areas with similar population densities even if they are physically distant, and only later diffusing to areas that are geographically adjacent but with lower population density (
Britain 2008). Major movements by sedentary groups are rare, the result of major economic or political changes (e.g., industrialization, urbanization, warfare) and as discussed earlier require quite large changes in the total population for a group to supplant the linguistic behavior of a high population group via migration (
Miller 2004;
Palva 2009). Less populated areas certainly experience diffusion as well, though being at the “bottom” of hierarchical diffusion they tend to diffuse features from the “bottom up,” that is, features tend to diffuse between areas of similar population density (
Wikle and Bailey 1996).
In contrast, for areas less conducive to intensive settlement, migration is much more common as a driver of change in dialect geography. The lower carrying capacity of the land means that population densities are lower to begin with, often requiring nomadism to maximize resources, ensuring frequent movement. Therefore, on a given point of land in such areas, the inhabitants are both more likely to move of their own accord, and if another group moves into that area, are more likely to be overwhelmed by the power of numbers. The basic strategy of nomadism means that it is easy for a group speaking a particular linguistic variety to move out of a given territory or into new territory, perhaps due to a single or several seasons of bad weather. Nomadic groups are well prepared for mobility, physically (in terms of their possessions) and culturally (in terms of having the necessary knowledge to survive). Sedentary populations are less likely to move. Even when they do, they have larger populations and thus are more likely to leave behind enough members of the group to maintain the presence of their linguistic variety in their original location. Depopulating a city requires a catastrophe, while for a nomadic group, leaving a given area is a routine seasonal event. For a nomadic group to seize the territory of another nomadic group requires a relatively small migration of people and could radically change the linguistic behavior of an area from a dialectologist’s perspective. On the other hand, nomadic groups cannot generally move into densely populated sedentary areas and impose their linguistic behavior on that area. The nomadic group would need to first win a military campaign against a numerically superior group, and then to maintain and impose their language on a still numerically superior group.
19 Therefore, in this model, densely populated areas are treated as a barrier to migration as a means of language change. For a migrating group to have a linguistic impact, absent a depopulating catastrophe, it must migrate within marginal lands that support similarly small groups. Of course, there is a limit to the kind of terrain that can support meaningful numbers of people. Life in the Sahara and Empty Quarter requires such specialized skills for survival that it might be hard to supplant the small number of speakers in those regions easily. This means that there are narrow corridors between the areas where population density is too high for a migrating group to have an impact linguistically (and where there would be potential resource competition with settled people who were actively using more fertile areas), and the areas that are simply too difficult for a newly arrived group to inhabit. This model is especially important for the MENA region since those corridors are a more common component of the landscape as vast areas of the Arab world are poorly suited for intensive settlement. The proportion of agricultural land in the Arab world in 1961 was approximately 25%, compared to 56% in the Euro area.
20 Nomadic groups have played a much more significant role in the development of Arabic than in the modern development of European languages, for example, and so our heuristics for the development of Arabic must take this into account.
This model therefore suggests that there are two main corridors of movement of linguistic features over space and time in the Arabic-speaking world. For sedentary populations, movement of linguistic material will likely happen between densely populated areas by way of diffusion of linguistic features without permanent population movement. For nomadic populations, movement would occur through marginally inhabited spaces, and it would be the physical movement of populations that results in a particular distribution of linguistic features in space. Densely populated areas would act as a barrier to the diffusion of linguistic features associated with nomadic speakers of the language. These corridors allow us to anticipate particular kinds of linguistic change over space and time without a need for reference to historical events.
Population density is fundamental to this model, but since we have no access to historical population data, we need a proxy variable. Rainfall provides a good proxy, since agricultural production is roughly proportional to total rainfall, while nomadic pastoralism in the Middle East exploits low-rainfall areas through seasonal migration and use of hardy livestock such as camels. To establish that this proxy variable works well,
Magidow (
forthcoming) correlated data about the realization of the Q variable in a sample of 88 geo-located dialects against average (modern) rainfall totals. Comparing whether dialects have the voiced realization of (Q) to the average amount of rainfall produces a correlation of 0.50 (
p < 0.001), explaining nearly 25% of the variation between dialects, a reasonable result for a very rough and simplistic comparison.
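The link between a correlation coefficient and the share of variation it explains can be illustrated with a short sketch. It computes a Pearson correlation between a binary variable (voiced Q coded 1, other realizations coded 0) and rainfall, and then squares a coefficient to obtain the proportion of variance explained. The eight data points, the 1/0 coding, and the function names `pearson_r` and `variance_explained` are illustrative assumptions only; they are not the data or the procedure of the study cited above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def variance_explained(r):
    """Proportion of variance accounted for by a correlation r (r squared)."""
    return r * r


# HYPOTHETICAL toy sample, invented for illustration (not the 88 dialects).
voiced_q = [1, 1, 1, 0, 1, 0, 0, 0]                     # 1 = voiced reflex of Q
rainfall_mm = [80, 120, 150, 200, 300, 400, 600, 900]   # mean annual rainfall

r = pearson_r(voiced_q, rainfall_mm)
print(f"toy correlation: {r:.2f}, variance explained: {variance_explained(r):.2f}")

# A coefficient of magnitude 0.50 corresponds to 25% of the variance.
print(variance_explained(0.50))  # 0.25
```

With this coding the toy correlation comes out negative, since voiced reflexes cluster at lower rainfall; the sign depends only on how the binary variable is coded, while the squared value does not.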
21 Another variable to consider is elevation. Different types of nomadism require different subsistence strategies. One major division is between lowland subsistence (based on movement across large distances) and elevation subsistence (based on movement up and down in elevation) (
Barfield 1993;
Barth 1961;
Donner 1989). It is notable that in many areas that the Arabs conquered, the 1000 m elevation line represents the limits of Arabicization, from Andalusia to North Africa to the Iraq-Iran border, though Yemen constitutes something of an exception (
Donner 1989;
Kennedy 2007, pp. 435, 438;
Magidow 2013).
Figure 3 therefore combines both rainfall and elevation in an attempt to illustrate the migration corridors for linguistic features. The black areas represent areas that either have too much rain (or a river system likely to support agriculture) or are at too high an elevation (1000+ m).
22 The light grey areas on the map have extremely low rainfall of 50 mm/year or less and are very difficult to inhabit, even for nomadic groups. The white areas are therefore the “Bedouin corridors.” These are the areas where we would expect movement of Bedouin groups, while we expect less movement, and slower movement, elsewhere.
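The three map categories just described amount to a simple threshold rule, which can be sketched as follows. The 50 mm/year and 1000 m cut-offs are those given in the text. The upper rainfall cut-off (`AGRICULTURAL_MIN_MM`), the order in which the tests are applied, and the names `Cell` and `classify` are assumptions made purely for illustration, since the text does not state them; this is not a reconstruction of how Figure 3 was actually produced.

```python
from dataclasses import dataclass

HYPERARID_MAX_MM = 50.0      # from the text: 50 mm/year or less (light grey)
HIGHLAND_MIN_M = 1000.0      # from the text: 1000+ m elevation (black)
AGRICULTURAL_MIN_MM = 250.0  # ASSUMPTION: "too much rain" cut-off is not given


@dataclass(frozen=True)
class Cell:
    """One map cell (hypothetical structure for this sketch)."""
    rainfall_mm: float       # mean annual rainfall
    elevation_m: float       # elevation above sea level
    has_river: bool = False  # river system likely to support agriculture


def classify(cell, agricultural_min_mm=AGRICULTURAL_MIN_MM):
    """Return 'barrier' (black), 'hyperarid' (light grey) or 'corridor' (white)."""
    # ASSUMPTION: barrier conditions are checked first, so an irrigated river
    # valley in a desert, or a dry high plateau, counts as a barrier.
    if (cell.elevation_m >= HIGHLAND_MIN_M
            or cell.has_river
            or cell.rainfall_mm >= agricultural_min_mm):
        return "barrier"
    if cell.rainfall_mm <= HYPERARID_MAX_MM:
        return "hyperarid"
    return "corridor"


# Hypothetical example cells, not measurements of real locations.
examples = {
    "dry steppe": Cell(rainfall_mm=150, elevation_m=300),
    "deep desert": Cell(rainfall_mm=20, elevation_m=200),
    "river valley in desert": Cell(rainfall_mm=20, elevation_m=200, has_river=True),
    "highland": Cell(rainfall_mm=100, elevation_m=1500),
}
for name, cell in examples.items():
    print(name, "->", classify(cell))
```

Passing a different `agricultural_min_mm` shows how sensitive the width of the corridors is to the assumed upper rainfall threshold.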
23 Overlaid on the map in
Figure 3 are realizations of the proxy variable most strongly associated with Bedouin dialects, use of a voiced reflex of the Q variable. As expected, voiced Q reflexes are primarily found within the Bedouin corridors. The most common exception is in Yemen, which is a geographical and linguistic area noted for being dialectologically unusual. This suggests that in general, the notion of the Bedouin corridor is a good proxy for how nomads help diffuse linguistic features, without a need to reference a particular model of the history of Arabic.
This model also makes meaningful predictions. Among other things, it removes the need to treat the Bedouin-sedentary difference as an essentialized difference with little explanation. Instead, this distinction falls out naturally from the basic idea that there are two different manners in which linguistic features move. Should the speakers of a nomadic dialect settle in a populated area, that dialect would stop being subject to migration as a driver of linguistic change, and become more influenced by diffusion from more densely populated areas. The similarities between sedentary dialects are therefore due primarily to linguistic diffusion between them, rather than to a single process of koineization (or even perhaps to sharing an early ancestor). The similarities between Bedouin dialects are similarly a result of their contact with one another, and likely due to rapid movements which erase diversity rather quickly and present an illusion of homogeneity across broad spaces.
This model also informs our historical understanding. In North Africa, we expect to find significant nomadic movement along the coast as far as southern Tunisia. After that, we would expect most of the areas of the coast to be relatively more difficult for nomadic groups to penetrate, and that this would be an area of sedentary in situ linguistic diffusion.
24 This replicates relatively well the “pre-Hilalian” vs. “Hilalian” distinction, but without a need to rely on the flawed historical model. We expect the Sinai to be a major land-bridge, with frequent population movements, and that the dialects in that area probably do not represent extremely long-term settlement.
25 We expect the movement along the western coast of the Arabian Peninsula to be relatively easy, while the Hijaz mountains would likely divide the dialects to the west and east of that mountain range. We expect to find the desert between Syria and Iraq to act as an extension of the Arabian Peninsula in terms of dialect, but the crescent of sedentary areas from the Syrian coast through Anatolia and into Mesopotamia to form a barrier for further nomadic movement, and for linguistic features to diffuse through that sedentary corridor (
Behnstedt 1990;
Ingham 1982;
Jastrow 1978, p. 78).
Another key prediction of the model is that the linguistic features of sparsely populated areas are unlikely to have significant time-depth in that space. The chance that the linguistic behavior in the Bedouin corridors has remained the same over millennia is quite low. Indeed, as discussed in
Section 3, even the sedentary areas may show less time-depth for their linguistic features than is typically believed. This has serious implications for how dialectologists analyze the dialects of the Arabian Peninsula. In keeping with the ‘conservative features imply antiquity in place’ fallacy, we often find linguists assuming that Najdi Arabic represents an unbroken linguistic tradition in the region.
Owens (
2018, p. 211) claims that “Gulf and Najdi Arabic are spoken in the Arabian Peninsula and have been spoken in these areas since pre-Islamic times.”
Al-Jallad (
2009) similarly states that “the Bedouin dialects of the southern Najd of course never left the Arabian Peninsula.”
26 This of course seems highly unlikely. This is a region of very low population density, with well-documented migration out of the Arabian Peninsula into other areas. The vacuum caused by these migrations would almost certainly have changed the dialect landscape of the Peninsula itself.
Ingham (
1982, map 5) illustrates a very significant reshaping of the linguistic isoglosses in the Peninsula since the 17th century. It is simple to imagine that the thousand years between the Islamic conquests and the start of his map witnessed many similar disruptions to that dialectal map. Indeed, the main argument of
Holes (
2018b) is that the dialects of the Gulf reflect at least three major layers, one of which may date back only as far as the late eighteenth century. One wonders too about the Baharna layer that he uses as a key piece of evidence. It is in relatively rapid decline, and if traces of that dialect can virtually disappear within two or three centuries, how many layers of Arabic dialects may have been lost over the past millennia? This is not to argue that continuity here is strictly impossible, but it is highly unlikely, and we would need the kind of microhistory mentioned previously to prove continuity or a lack thereof.
This idea that the dialects of the Arabian Peninsula are somehow original to the area based on their archaism is not entirely dissimilar from the argument about Yemeni dialects discussed previously. However, the heuristic here would predict very different histories for the two regions. The highlands of Yemen are relatively inaccessible, and have historically had higher populations than central Arabia, receiving greater rainfall including some from the monsoon. Our model would expect that linguistic change in Yemen would be relatively more difficult than in the Najd. This is not to say that we expect Yemeni dialects to be highly durable: it is notable, for example, that the least “non-core” Yemeni dialects discussed above are located on the Tihama, an area that should be a Bedouin corridor and should not have long-term durability. The geography of the region means that we should consider the possibility that these archaic dialects originated somewhere else, and only arrived relatively recently in Yemen. The model proposed here allows us to still derive observed differences between dialects while being able to explain them in a parsimonious manner based on general principles and using a more accurate analysis of conservative and innovative features.