Article

Evolution and Trade-Off Dynamics of Functional Load

1 Surrey Morphology Group, University of Surrey, Guildford GU2 7XH, UK
2 Ancient Language Lab, University of Queensland, St Lucia 4072, Australia
3 Max Planck Institute for the Science of Human History, D-07745 Jena, Germany
4 Department of Linguistics, Swarthmore College, Swarthmore, PA 19081, USA
5 CEREMADE, CNRS, UMR 7534, Université Paris-Dauphine, PSL University, 75016 Paris, France
* Author to whom correspondence should be addressed.
Entropy 2022, 24(4), 507; https://doi.org/10.3390/e24040507
Submission received: 15 December 2021 / Revised: 31 March 2022 / Accepted: 1 April 2022 / Published: 5 April 2022
(This article belongs to the Special Issue Information-Theoretic Approaches to Explaining Linguistic Structure)

Abstract

Functional load (FL) quantifies the contribution of phonological contrasts to distinctions made across the lexicon. Previous research has linked particularly low values of FL to sound change. Here, we broaden the scope of enquiry into FL to include its evolution at higher values. We apply phylogenetic methods to examine the diachronic evolution of FL across 90 languages of the Pama–Nyungan (PN) family of Australia. We find a high degree of phylogenetic signal in FL, indicating that FL values covary closely with genealogical structure across the family. Though phylogenetic signals have been reported for phonological structures, such as phonotactics, their detection in measures of phonological function is novel. We also find a significant, negative correlation between the FL of vowel length and that of the following consonant, that is, a historical trade-off dynamic with considerable time depth, which we relate to known allophony in modern PN languages and compensatory sound changes in their past. The findings reveal a historical dynamic, similar to transphonologization, which we characterize as a flow of contrastiveness between subsystems of the phonology. Recurring across a language family that spans a whole continent and many millennia of time depth, our findings provide one of the most compelling examples yet of Sapir’s ‘drift’ hypothesis of non-accidental parallel development in historically related languages.

1. Introduction

Functional load (FL) quantifies the contribution of specific phonological contrasts to distinctions made in the lexicon of a language [1,2,3]. In English for example, the phonemes /t/ and /d/ contrast, and thus there exist phonological strings in English, including consonant clusters, syllables, and whole words, which differ only by virtue of one containing /t/ in a position where the other contains /d/. Examples within word-length strings include time/dime, welter/welder, and hit/hid. At a conceptual level, the FL of the {/t/,/d/} contrast in English is the degree to which that contrast supports the distinctiveness of phonological strings in the English lexicon, or conversely, the degree to which those strings would be conflated if /t/ and /d/ were merged into a single category.
A classic operationalization of FL by Hockett [2] is in terms of entropy [4]. Hockett’s definition makes reference to domains, $D$, which could be words, morphemes, syllables, or any kind of substring composed of phonemes. In a language $L$, the lexicon $\Lambda$ will contain a set $S_{D,\Lambda}$ of unique phonological string types, $s$, which comprise a domain of type $D$. The entropy of domain $D$ in lexicon $\Lambda$ is:
$$H_{D,\Lambda} = -\sum_{s \in S_{D,\Lambda}} \Pr(s) \cdot \log_2 \Pr(s) \qquad (1)$$
Following Hockett [2], the FL of a phonological contrast $\phi$ in lexicon $\Lambda$ and domain $D$ is the difference between two entropy measures: the entropy of domain $D$ in lexicon $\Lambda$, and of domain $D$ in an altered lexicon $\Lambda_\phi$, created by collapsing the contrast $\phi$ in $\Lambda$:
$$f(D, \Lambda, \phi) = H_{D,\Lambda} - H_{D,\Lambda_\phi} \qquad (2)$$
A related metric is a normalized functional load, which states the FL of a contrast relative to the entropy of domain $D$ in lexicon $\Lambda$ [3]:
$$f_{norm}(D, \Lambda, \phi) = \frac{H_{D,\Lambda} - H_{D,\Lambda_\phi}}{H_{D,\Lambda}} \qquad (3)$$
A phonological contrast, $\phi$, may refer to distinctions between the members of a single set of phonemes, such as between two phonemes {/t/, /d/} or between four phonemes {/t/, /d/, /s/, /z/}. Alternatively, it may refer to a collection of multiple, parallel distinctions, defined by a collection of sets and the distinctions within each of them. For instance, a contrast between voiced and voiceless stop consonants could refer to the distinctions within each of the three sets in the collection {{/p/, /b/}, {/t/, /d/}, {/k/, /g/}}, and a contrast between the places of articulation of stop consonants could refer to the distinctions within each of the two sets in the collection {{/p/, /t/, /k/}, {/b/, /d/, /g/}}. In all cases, the altered lexicon $\Lambda_\phi$ is obtained by taking each individual set and replacing the phonemes within it with a single phonemic symbol that is distinct from all other phonemic symbols in $L$.
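The entropy-based definitions above can be made concrete with a small sketch. The Python below is an illustration only and not the authors’ code: the function names (`entropy`, `collapse`, `functional_load`) and the four-item toy lexicon are invented here, and Pr(s) is assumed to be estimated as the relative frequency of each string type among the domain instances.

```python
from collections import Counter
from math import log2


def entropy(strings):
    """Shannon entropy in bits over the string types in a list of domain instances."""
    counts = Counter(strings)
    total = sum(counts.values())
    # Pr(s) is estimated as relative frequency (an assumption of this sketch).
    return -sum((c / total) * log2(c / total) for c in counts.values())


def collapse(strings, contrast):
    """Build the altered lexicon: merge each phoneme set into one new symbol."""
    mapping = {}
    for i, phoneme_set in enumerate(contrast):
        for phoneme in phoneme_set:
            mapping[phoneme] = f"<merged{i}>"  # distinct from every real phoneme
    return [tuple(mapping.get(p, p) for p in s) for s in strings]


def functional_load(strings, contrast, normalized=False):
    """Unnormalized FL (entropy difference) or normalized FL (relative difference)."""
    h = entropy(strings)
    fl = h - entropy(collapse(strings, contrast))
    return fl / h if normalized else fl


# Hypothetical toy lexicon; each string is a tuple of phonemes.
lexicon = [("t", "a"), ("d", "a"), ("t", "i"), ("k", "a")]
voicing = [{"t", "d"}]  # a contrast is a collection of phoneme sets

fl = functional_load(lexicon, voicing)
fl_norm = functional_load(lexicon, voicing, normalized=True)
print(fl, fl_norm)  # prints 0.5 0.25
```

In this toy case the four string types are equiprobable (2 bits); merging /t/ and /d/ conflates two of them (1.5 bits), so the contrast carries 0.5 bits, or a quarter of the entropy of the domain. The normalized variant is undefined when the original entropy is zero, which the sketch does not guard against.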
As FL is a frequency measure calculated from empirical datasets, the results that one obtains will vary according to the corpus used [3,5]. Accordingly, when investigating FL, it is important to bear in mind how the properties of a dataset might relate to the research question and to any assumptions relevant to the interpretation of results.

1.1. Functional Load and Explanation in Linguistics

Human linguistic communication is a highly complex cultural and cognitive phenomenon. One of its key traits is that every language contains an inventory of thousands of distinctive symbolic units termed morphemes, which in turn, comprise smaller, distinctive phonological elements [6]. In spoken languages, these include phonemes (found in all languages) and lexically distinctive prosodic elements, such as tones, accent and stress patterns (found in around half of all languages [7,8,9]). Of equal significance is that, although the existence of morphemes and phonemes is universal, the actual inventories are language-specific. In linguistics, this specificity of individual languages gives rise to fields of enquiry, such as linguistic typology, studying the range and distribution of variation across languages, and historical linguistics, studying the evolution of distributions over time. Both subfields contribute to the scientific understanding of human language in general by generalizing across the variation of individual language systems.
To better understand how linguistic systems provide and organize the distinctiveness that makes linguistic communication possible, investigations of FL play a valuable role in shedding light on how distinctions in phonological modules function to support distinctiveness in larger strings. Currently, in both typology and historical linguistics, the study of FL is in its infancy with much of the potential scope of FL research still to be explored. Most studies of FL, for instance, have focused on words as the strings of interest (see Section 1.2 below). However, words are only one kind of string known to be significant in the structure and use of linguistic systems. Systems of phonological structure exhibit well-known organizational principles at levels above the phoneme and below the word [10,11], while psycholinguistic research shows that listeners also use multi-phonemic strings smaller than the word or morpheme, to gain efficiencies in speech processing [12]. The study we present here illustrates some of the potential of studying strings other than words.
FL research to date has focused either on a small number of related languages (e.g., [5]) or on surveys across very distantly related languages (e.g., [3]). These are useful for producing highly detailed case-studies of well-understood systems or for evaluating the outer bounds of variation across languages in general. However, insights into historical dynamics at a statistically significant level are most readily obtained through phylogenetic analysis of large sets of related languages. Here, we fill a gap in the FL research by presenting the first phylogenetic study of FL and thereby produce the first results for historical dynamics that are supported at a statistical level across a large language family.
Research on FL has focused particularly on phonological contrasts whose FL is very low, for reasons related to a prominent conjecture in historical linguistics [1] (see Section 1.2). However, contrasts in human languages have a range of different functional loads, from very low to very high, and important questions include why this range exists and how FL values evolve over time. In terms of the evolutionary question, we can ask: (1) What kinds of changes to FL values appear to occur? (2) How do those changes pattern more broadly, for instance, does their distribution follow some well-described stochastic process? (3) What might the possible causes of change be that give rise to the distributions we see?
Answering these questions will ultimately require fitting together the pieces of a large puzzle. In this study, we contribute some central pieces. We produce some striking findings regarding the phylogenetic signal of FL. A phylogenetic signal is a measure of how close the match is—at the level of a whole language family—between (1) actual FL values and (2) what would be expected if FL was inherited from ancestor languages to their descendants while undergoing fluctuations according to a stochastic process termed Brownian motion (more on this in Section 2.2).
Naturally, we would not suppose that the real evolution of FL is so simplistic, rather the questions are: how close to such a scenario does the evolution of FL appear to come, and how does this compare to the evolution of other properties of language? Answering these questions provides insights into the kinds of explanations for FL evolution that could be consistent with the data. We also examine the correlation, or trade-off relationship, that can exist between the FL of two different contrasts in a linguistic system. Understanding how multiple, individual changes to FL are linked (or not) provides new insights into the historical dynamics of contrastiveness in linguistic systems.

1.2. Functional Load and Sound Change

Languages undergo mutational changes, known as sound changes, in which sounds may change from one phonemic category to another [13]. The term unconditioned merger refers to sound changes that cause two or more previously distinct phonemic categories to become conflated into one. In conditioned mergers, conflation affects the phonemes only in certain contexts. FL has attracted attention as a possible explanatory factor in the incidence of mergers; it has long been conjectured [1] that contrasts with low FL are more prone to merge than contrasts with high FL, and recent results support that conjecture [5,14,15,16]. Debate is ongoing over which operationalizations of FL provide the greatest predictive power [5] and what kinds of mergers FL predicts [17]. Whether or not it uses entropy-based definitions of FL, most research to date has focused on domains $D$ that are words.
When a contrast undergoes an unconditioned merger, its FL falls to zero. The fact that contrasts with low FL values are more likely to fall to zero than contrasts with higher FL values is an observation that can be described in several ways, and each description can have different implications for what we believe is in need of explanation. For instance, taken by itself, the observation is consistent with a description in which the tendency for small changes in FL to be more likely than larger changes holds (1) only for falls in FL to zero; (2) for all falls in FL, whether to zero or otherwise; or (3) for all changes to FL, whether falls or rises.
Since prior research has focused its attention on falls to zero, it has not yet been established whether or not current observations will generalize to support strictly case (1), the slightly broader case (2), or the general case (3). The study we present here will consider evidence for case (3). If it turns out that case (3) is supported, then this may recast how we think about recent results in sound change, since the phenomenon requiring explanation will not be only mergers but all changes in FL.

1.3. Contrastiveness in Pama–Nyungan VC Strings

In this paper, we examine FL from a phylogenetic perspective, investigating how FL evolves over time. Our empirical focus is on the large Pama–Nyungan (PN) language family of Australia. PN languages extend across 90% of the Australian mainland, and the time depth of the family is estimated at around 5000–6000 years before present [18,19,20].
In many PN languages, an inverse correlation has been observed between phonemic vowel length and the phonetic duration of a following consonant [21,22,23,24,25]. This is particularly so for vowels in the first syllable of words, referred to as tonic vowels. Here, we focus on tonic vowels and single, intervocalic consonants that follow them. Post-tonic single consonants that follow phonemically short vowels and that have phonetically longer durations in some languages also exhibit additional phonetic properties associated with long duration, such as more complete closure and passive devoicing of stops as well as pre-stopping of nasals and laterals.
Conversely, post-tonic single consonants that follow phonemically long vowels and that have phonetically shorter durations may exhibit more voicing and lenition of stops, and the absence of pre-stopping in laterals and nasals [21,22,26,27,28]. Examples of allophonic conditioning of this kind are cited in Table 1.
A general fact about sound change is that when two phonemes merge, it is possible for phonetic correlates of the original contrast, which are manifested in other segments, to remain in place and become contrastive. This has occurred in multiple branches of PN, as phonemic vowel length is lost while its erstwhile phonetic correlates on the following consonant remain and become distinctive. Examples are cited in Table 2.
In the cases cited in Table 2, the complete merger of the length contrast in the tonic vowel is associated with an increase in contrastiveness in following consonants. In such cases, the FL of the length contrast in tonic vowels falls (to zero) while the FL of manner of articulation contrasts (including voicing and fortition) in the following consonant rises. Consequently, there is a trade-off relationship between $FL_V$, the FL of tonic vowel length, and $FL_C$, the FL of post-tonic consonantal manner of articulation.
This trade-off is of a very specific kind, in which the complete merger of all short/long vowel pairs reduces $FL_V$ to zero. We will refer to this as a trade-off with contrast collapse. Other trade-offs are possible, however. If a length contrast is lost only in certain vowels, and/or only in certain contexts, then $FL_V$ would fall (though not to zero), and in such cases, $FL_C$ could be expected to rise if the consonants become more contrastive when they follow the vowels that do merge. This scenario would give rise to a second kind of trade-off. We will refer to this second kind as a trade-off with contrast maintenance.
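The two kinds of trade-off can be illustrated numerically. The sketch below is a toy under stated assumptions, not data from any PN language: the three invented stages each contain four VC domain instances, the vowel and consonant symbols are hypothetical, and FL is the unnormalized entropy difference with probabilities taken as relative frequencies.

```python
from collections import Counter
from math import log2


def entropy(strings):
    counts = Counter(strings)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def functional_load(strings, contrast):
    """Unnormalized FL: entropy lost when each phoneme set is merged."""
    # Each set is replaced by its integer index, which cannot clash with a phoneme string.
    mapping = {p: i for i, group in enumerate(contrast) for p in group}
    merged = [tuple(mapping.get(p, p) for p in s) for s in strings]
    return entropy(strings) - entropy(merged)


LENGTH = [{"a", "a:"}, {"i", "i:"}]  # tonic vowel length
MANNER = [{"t", "d"}, {"p", "b"}]    # post-tonic consonant manner (here: voicing)

# Invented VC domains. A long vowel, when lost, leaves a voiced consonant behind.
stages = {
    "before":      [("a:", "t"), ("a", "t"), ("i:", "p"), ("i", "p")],
    "maintenance": [("a", "d"), ("a", "t"), ("i:", "p"), ("i", "p")],  # only /a:/ merges
    "collapse":    [("a", "d"), ("a", "t"), ("i", "b"), ("i", "p")],   # all length lost
}

results = {name: (functional_load(vc, LENGTH), functional_load(vc, MANNER))
           for name, vc in stages.items()}
for name, (fl_v, fl_c) in results.items():
    print(f"{name:12s} FL_V={fl_v:.2f}  FL_C={fl_c:.2f}")
```

In this toy, FL_V moves from 1.0 to 0.5 to 0.0 bits across the stages while FL_C moves from 0.0 to 0.5 to 1.0 bits: the middle stage corresponds to a trade-off with contrast maintenance and the last to a trade-off with contrast collapse. Real lexicons would not be expected to show such an exact exchange.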
The two studies that we present below examine the evolution of the FL of tonic vowel length contrasts and of contrasts in the following consonant in PN.
Study 1 establishes that these FL variables contain significant phylogenetic signals. One implication of this is that our data are consistent with an evolutionary process in which smaller changes in FL—both falls and rises—have been more common than larger changes. More detail is given in Section 2.2 below.
Study 2 examines FL trade-offs with contrast maintenance. We test the hypothesis that $FL_V$ and $FL_C$ are negatively correlated in PN. $FL_C$ is defined in terms of the manner of articulation of post-tonic consonants, and we expect $FL_C$ to correlate negatively with $FL_V$ for the phonetic and historical reasons introduced above. We also examine $FL_P$, the FL of place of articulation of post-tonic consonants. As research on PN languages has identified no particular association between vowel length and consonant place, our hypothesis is that $FL_P$ will show no significant correlation with $FL_V$.

2. Materials and Methods

2.1. Functional Load Data in Pama–Nyungan

We estimated the FL for vowel length ($FL_V$), consonant manner ($FL_C$), and consonant place ($FL_P$) in domains comprising a tonic vowel followed by a single, intervocalic consonant in a set of 90 Pama–Nyungan languages listed in Appendix A. To calculate the FL, we used both the unnormalized formula in (2) and the normalized formula in (3). Where we intend to be specific, we refer to the unnormalized values as $FL_V^u$, $FL_C^u$, $FL_P^u$ and the normalized values as $FL_V^n$, $FL_C^n$, $FL_P^n$. In most cases though, when the point under discussion applies equally to both, we write $FL_V$, $FL_C$, $FL_P$.
At the current stage of documentation of global linguistic diversity, every major language family contains many low-resource languages for which data are scarce [41]. As a consequence, studies of FL, such as ours, which aim for a coverage that spans whole families, will face limits on the data available. This is true in the case of Pama–Nyungan. For most languages in our dataset, the available corpora are lexical lists. As we will see in Section 3, this does not prevent clear results from emerging, and we return to discuss the reasons why in Section 4.1.
Correspondingly, our FL estimates were based on lexical datasets from which we extracted instances of the domain of interest. The lexicons contained between 208 and 3215 domain instances (mean 774 and median 605). The 90 languages studied were selected by taking the 112 PN lexicons studied in [42] and keeping only those that (1) have some degree of tonic vowel length contrast (since we want to study changes in FL that do not involve complete mergers) and (2) have more than 200 domain instances. A representative tree of these 90 languages is shown in Figure 1.
When classifying vowels as short or long, we regarded sequences of two adjacent short vowels as one long vowel, and sequences of /uwu/ and /iji/ as long vowels, since a tradition followed in some Australianist analysis is to represent long high vowels [u:] and [i:] as phonemic vowel-glide-vowel sequences (e.g., ([43], p. 24), ([44], p. 91)).
Phonemically long or geminate consonants, and phonemically pre-stopped sonorants, have been analysed as both mono- and bi-segmental units in the Australianist literature [45]. Here, we were guided by the kinds of historical developments that we wish to study, and we classed them as single segments.
The FL data obtained for the 90 PN languages are reported in Appendix A.

2.2. Phylogenetic Analysis

As languages are related to one another, it is not statistically valid to treat cross-linguistic observations as independent [46]. Quantitative phylogenetic methods [47] take genealogical relatedness into account in a principled and statistically sound manner. Our two studies use phylogenetic techniques in order to make valid inferences from the cross-linguistic FL data in PN.
Study 1 assesses the degree of phylogenetic signal in $FL_V$, $FL_C$, and $FL_P$. The phylogenetic signal is a measure that compares the variation in an observed variable against its expected variation if it evolved along a phylogenetic tree, $t$, according to a Brownian motion process. In Brownian motion, the value of a variable is in constant flux. Positive and negative changes are equally likely at all times, and small changes are more likely than large ones. A phylogenetic signal will be stronger when the variable in question actually did evolve along the tree, and less strong if it was influenced by lateral transfer as in borrowing, especially borrowing among languages that are only distantly genealogically related.
The phylogenetic signal will be stronger when the variable evolved along the specific tree $t$ to which the data are being compared and not some other tree, $t'$. Furthermore, it will be stronger when evolution was similar to Brownian motion, so that the value of the variable had equal probabilities of shifting up or down at any point, as opposed to (for example) being constrained within some tight range, so that at extreme outer values, there was a greater chance for the variable to evolve back towards the central value than further outwards. For a more technical description of phylogenetic signals written for a linguistic readership, see [42,46].
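As a rough illustration of what evolution by Brownian motion along a tree implies, the sketch below simulates a trait on a hypothetical three-language tree and checks that languages sharing more history covary more. The tree shape, branch lengths, rate `sigma2`, and function names are all invented for this example; change along a branch is drawn from a normal distribution with mean zero and variance proportional to branch length.

```python
import random


def simulate_tips(rng, sigma2=1.0):
    """One Brownian-motion realization on the hypothetical tree ((A:0.3,B:0.3):0.7,C:1.0)."""
    def step(length):
        # Symmetric around zero; small changes are more probable than large ones.
        return rng.gauss(0.0, (sigma2 * length) ** 0.5)

    ancestor_ab = step(0.7)      # branch shared by A and B (root value taken as 0)
    a = ancestor_ab + step(0.3)
    b = ancestor_ab + step(0.3)
    c = step(1.0)                # C diverges from A and B at the root
    return a, b, c


def covariance(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


rng = random.Random(42)  # fixed seed so the run is repeatable
a_vals, b_vals, c_vals = zip(*(simulate_tips(rng) for _ in range(20000)))

cov_ab = covariance(a_vals, b_vals)  # expected near 0.7, the shared branch length
cov_ac = covariance(a_vals, c_vals)  # expected near 0, no shared branch
var_a = covariance(a_vals, a_vals)   # expected near 1.0, the root-to-tip length
print(round(cov_ab, 2), round(cov_ac, 2), round(var_a, 2))
```

Under these assumptions the expected covariance between two languages equals the rate multiplied by the length of the branches they share, which is the pattern against which a phylogenetic signal is assessed.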
Here, we measure the phylogenetic signal, using the picante package in R [48], according to a standard two-step procedure described in Blomberg et al. [49]. First, the variation in the data is compared to a randomized baseline in which the shape of a previously determined tree $t$ plays no role in structuring the data. The null hypothesis is that, relative to the structure implied by the tree, the data are simply random; the alternative hypothesis is that the data pattern in accordance with the tree.
For example, a phylogenetic signal is considered to be evident at the $p = 0.05$ level if the variation in the real data matches the tree better than 95% of the randomized datasets. Next, if the data have been confirmed as differing significantly from randomness, we use the statistic Blomberg’s $K$ to measure the strength of the phylogenetic signal. (For mathematical details of the calculation of $K$, see [42,49].) Blomberg’s $K$ takes a value of 1 if the variation in the data accords perfectly with the tree $t$, and a minimum value of 0 if the data are perfectly randomly distributed relative to $t$. Values in excess of 1 are possible if FL data values are highly clumped within subgroups of the family.
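The two-step procedure can be sketched in plain Python. This is not the picante implementation used in the study: it assumes the tree has already been converted to a covariance matrix `C` of shared branch lengths, it computes $K$ from the commonly cited matrix form of Blomberg et al.’s statistic, and it uses a simple shuffle of tip values as the randomized baseline. The four-language matrix, the trait values, and all function names are invented.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def blomberg_k(x, C):
    """Observed MSE0/MSE ratio divided by its expectation under Brownian motion."""
    n = len(x)
    w = solve(C, [1.0] * n)                           # C^-1 1
    sum_w = sum(w)                                    # 1' C^-1 1
    a = sum(wi * xi for wi, xi in zip(w, x)) / sum_w  # phylogenetic mean
    d = [xi - a for xi in x]
    observed = sum(di * di for di in d) / sum(di * vi for di, vi in zip(d, solve(C, d)))
    expected = (sum(C[i][i] for i in range(n)) - n / sum_w) / (n - 1)
    return observed / expected


def permutation_p(x, C, n_perm=999, seed=0):
    """Share of shuffled datasets whose K is at least as large as the observed K."""
    rng = random.Random(seed)
    k_obs = blomberg_k(x, C)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += blomberg_k(shuffled, C) >= k_obs
    return k_obs, (hits + 1) / (n_perm + 1)


# Hypothetical tree ((A,B),(C,D)): sisters share 0.8 of a root-to-tip length of 1.0.
C = [[1.0, 0.8, 0.0, 0.0],
     [0.8, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.8],
     [0.0, 0.0, 0.8, 1.0]]
clumped = [1.0, 1.1, 3.0, 3.2]    # sisters resemble each other
scattered = [1.0, 3.0, 1.1, 3.2]  # same values, assigned against the tree

k_clumped = blomberg_k(clumped, C)
k_scattered = blomberg_k(scattered, C)
k_obs, p = permutation_p(clumped, C, n_perm=199)
print(round(k_clumped, 2), round(k_scattered, 2), p)
```

With only four tips the permutation test cannot reach conventional significance, so the example is illustrative of the mechanics only; the clumped values give $K$ above 1 and the scattered values give $K$ well below 1.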
When calculating a phylogenetic signal, reference must be made to a tree, $t$. In our case, the aim is to compare FL data to the PN family tree. However, in linguistics, there is uncertainty regarding the details of this tree. Uncertainty about the details of trees is common in phylogenetic research and is termed phylogenetic uncertainty. Here, we employ a standard approach to account for phylogenetic uncertainty, by measuring the phylogenetic signal with respect to not one tree $t$ but a sample of 1000 highly likely family trees $t_1, t_2, \ldots, t_{1000}$. This generates 1000 tests against randomness, followed by 1000 estimates of $K$, which provide a distribution describing its likely value. Our tree sample $t_1, t_2, \ldots, t_{1000}$ comprises 1000 dated phylogenetic trees from the posterior distribution inferred by Bowern [50] and described further in Macklin-Cordes et al. [42]. The trees were inferred from cognate data from which known borrowings were excluded [19,20].
Study 2 examined the phylogenetic Pearson’s correlation [51] between $FL_V$ and $FL_C$ and between $FL_V$ and $FL_P$. This test is conceptually parallel to a regular Pearson’s correlation; however, it also takes into account the specific kinds of non-independence caused by genealogical relationships between languages. As a consequence, the results reflect correlations not only between the values of traits in individual modern languages but also between values characteristic of subgroups at all levels in the tree. As such, the results for a pair of variables can inform us about the strength and direction of linked relationships that characterize the language family as a whole, through its history. When we calculate the statistics, as with our estimate of the phylogenetic signal, we take phylogenetic uncertainty into account by performing the correlation test in reference to the sample of 1000 highly likely trees.
Most statistical tests require assumptions to be made about the data. The test we use in Study 2 assumes that $FL_V$, $FL_C$, and $FL_P$ evolve along a phylogeny following Brownian motion. Given the results that we obtained in Study 1 for Blomberg’s $K$ (see Section 3), the assumption is well motivated. We used the phytools R package [52] to estimate the covariance matrix of the Brownian motion on each tree in the sample, giving a sample from the posterior distribution of Pearson’s $r$ correlation. The p-values reported were computed using the posterior mean estimates and correspond to testing the null hypothesis that the correlation is zero against the alternative hypothesis that the correlation is non-zero.
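A correlation of this kind can be sketched as follows. The code is not the phytools routine used in the study: it assumes the standard generalized-least-squares estimate of the Brownian-motion covariance between two traits, given a matrix `C` of shared branch lengths for each tree, and it averages the resulting $r$ over a tree sample. The two small matrices standing in for the 1000-tree sample, the trait values, and the function names are invented.

```python
from math import sqrt


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def phylo_correlation(x, y, C):
    """Pearson's r computed from the Brownian-motion covariance of two traits."""
    w = solve(C, [1.0] * len(x))  # C^-1 1, used to weight the phylogenetic means

    def centre(values):
        root = sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
        return [vi - root for vi in values]

    dx, dy = centre(x), centre(y)
    cinv_dx, cinv_dy = solve(C, dx), solve(C, dy)
    sxx = sum(p * q for p, q in zip(dx, cinv_dx))
    syy = sum(p * q for p, q in zip(dy, cinv_dy))
    sxy = sum(p * q for p, q in zip(dx, cinv_dy))
    return sxy / sqrt(sxx * syy)  # the common 1/(n-1) factor cancels


def block_tree(shared):
    """Covariance matrix for a hypothetical tree ((A,B),(C,D)) with the given sister overlap."""
    return [[1.0, shared, 0.0, 0.0],
            [shared, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, shared],
            [0.0, 0.0, shared, 1.0]]


tree_sample = [block_tree(0.8), block_tree(0.6)]  # stand-in for a posterior sample
fl_v = [0.9, 0.8, 0.3, 0.2]  # invented values, not taken from Appendix A
fl_c = [0.2, 0.3, 0.7, 0.9]

r_values = [phylo_correlation(fl_v, fl_c, C) for C in tree_sample]
r_mean = sum(r_values) / len(r_values)
print([round(r, 3) for r in r_values], round(r_mean, 3))
```

When `C` is the identity matrix (a star tree with no shared history), the function reduces to the ordinary Pearson correlation, which is a convenient sanity check on the sketch.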

3. Results

Study 1. The statistical significance of the presence of phylogenetic signal was measured to three digits of accuracy, and for all FL variables and all trees, the highest p-value was $p = 0.001$, indicating that a phylogenetic signal was significantly present. The strength of the phylogenetic signal as measured by Blomberg’s $K$ was very close to 1 for $FL_V$, $FL_C$, and $FL_P$, as shown in Table 3, both for the normalized and unnormalized versions of the FL measure.
To place these $K$ values in context, Macklin-Cordes et al. [42] examined the lexical Markov chain transition probabilities of biphones (two-segment sequences) in PN and found mean $K$ values of 0.54, or 0.59 when segments were binned into groups by place or manner of articulation. Macklin-Cordes and Round [46] examined the relative frequencies of dental versus palatal consonants in word-initial and intervocalic positions in PN and found mean $K$ values from 0.78 to 1.32 word-initially and from 0.34 to 0.70 intervocalically.
Dockum [53] examined phoneme frequencies and biphone Markov chain transition probabilities in languages of the Tai family and found mean $K$ values of 0.71 and 0.68, respectively. Further afield, Blomberg et al. [49] examined 121 biological traits of a wide variety of plant and animal organisms, finding mean $K$ values of 0.35 for behavioural traits, 0.54 for physiology, and 0.83 for traits related to body size. Taken in this context, our results suggest that the evolution of FL is very well described by a Brownian motion process along the PN tree.
Study 2. One consequence of the high levels of phylogenetic signal found in FL is that statistical analysis, such as the measurement of correlations, should be carried out using phylogenetic comparative methods [46]. Phylogenetic Pearson’s correlation (Table 4) was significant and negative between $FL_V$ and $FL_C$ but did not reach significance between $FL_V$ and $FL_P$, in accordance with our hypotheses. This was true for both the normalized and unnormalized versions of FL.

4. Discussion

The studies in this paper have examined language diachrony at a statistical level. In doing so, we contribute to a more precise, quantitative characterization of diachronic typology. Specifically, we studied the historical dynamics of FL, which is a quantitative characterization of the contribution of specific contrasts to distinctiveness in the lexicon.
We established that the FL values of certain contrasts evolve according to non-independent stochastic processes: they were found to change in a linked, statistically correlated fashion across almost a hundred languages and thousands of years of history. Moreover, we demonstrated that FL exhibits interesting historical dynamics that deserve further investigation, not only at values close to zero, which have been the focus of prior work, but at higher values as well.
Recent research has confirmed a long-standing conjecture that contrasts with low FL are particularly prone to merger [5,14,15,16], which is to say that FL is more likely to fall to zero from a lower value than from a higher one. Efforts at explaining this phenomenon have focused on homophony avoidance, an account that is specific to FL falling to zero and doing so in the domain of whole words [5,17]. However, our findings suggest that there may be nothing special about FL falls as opposed to rises, and changes to zero as opposed to other values. In addition, the word is not the only domain in which causally interesting effects of FL may be active.
Consequently, although it was not our primary focus here, our findings suggest that recent efforts may be focusing on too narrow a research question and thus entertaining a set of explanatory accounts that will generalize only poorly to other, related phenomena. Future research will benefit from broadening its scope beyond the recent, narrower focus on FL falling to zero in whole words.
We now take up three additional topics for expansion and emphasis.

4.1. High Degree of Phylogenetic Signal in FL

Phylogenetic signals have recently been shown to be present in phonotactic biphone frequencies, phoneme frequencies, and contextual ratios of places of articulation [42,46,53,54,55]. These studies reveal that the frequencies of phonological structures pattern with genealogy. Here, we find a high level of phylogenetic signal also in FL, that is, in the contrastive function that phonological structures serve. Interestingly, we find that the phylogenetic signal in the FL measures examined here was very close to 1 and closer than the values found in studies of phonological structures. Two questions can be posed in response: why did we find a strong phylogenetic signal in FL, and why is it even stronger than in structural traits? Any answer at this stage of research is necessarily speculative; however, the observations we offer here may point to useful lines of future inquiry.
Why did we find a strong phylogenetic signal in FL in the data that we used, bearing in mind that our data (1) are sourced from lexical lists of word types, not tokens; (2) are sourced from lists that are short, mostly numbering in the hundreds of items, not thousands or tens of thousands; and (3) examine FL in domains comprising a tonic vowel and following consonant, not whole words? One prior expectation might have been that since the datasets are so small and since they do not examine the full words that are the focus of much recent research, they would be awash in statistical noise and exhibit little patterning of significance. Evidently, this is not the case, and we believe there may be reasons why not.
Much research on FL has focused on the hypothesis that FL exerts an influence on sound change through a mechanism of homophony avoidance [1,5,15,17]. Since homophony is a relationship that holds between words, the effect of such a mechanism may be to promote FL within the domain of whole words. Furthermore, as a consequence of that mechanism, sound changes would be less likely to occur, the more they altered the FL within words. However, whether this hypothesis is correct or not, other mechanisms should not be ruled out.
For instance, during speech processing, words become activated cognitively well before the listener hears the entire word [56]. Consequently, any sound change that caused a loss of contrastiveness early in a word could potentially impair the ease of processing, even if it did not result in homophony. Accordingly, if we are prepared to entertain the existence of homophony avoidance as a causal factor in sound change, it is not unreasonable to entertain the existence of an avoidance of loss of contrastiveness early in the word as an additional factor in sound change (for supporting evidence, see [57,58]).
By this line of reasoning, since our data focus on the first vowel and following consonant of PN words, i.e., contrastiveness early in the word, it is not altogether surprising that our results were significant despite the fact that we did not examine whole words. (PN roots are typically disyllabic and affixation is suffixal [59]; the initial consonant position, before the tonic vowel, permits only a subset of the contrastive consonants found elsewhere [45,60], which potentially increases the importance placed on maintaining the subsequent VC contrasts.)
Our data come from short wordlists, which might be expected to supply FL values that are, at best, a noisy approximation of the more precise FL values obtainable from larger lists or from token-based corpus data [61,62]. However, the words that appear in short wordlists are heavily skewed towards the most frequent words of a language, and these are the words that listeners would process most often and would have learned the earliest during acquisition. If we grant that words of higher frequency and earlier acquisition are likely to play an especially significant role in the mechanisms behind sound change, then it follows that even short wordlists will plausibly contain rich evidence of the contrastiveness that matters most.
Our results also align with the findings of [61]: above a minimum threshold for wordlist length, even lists of only a few hundred words, when randomly sampled from a larger lexicon containing thousands of items, contained phonemic distributions that conformed closely to those of the full lexicon. Thus, it is not as surprising as it might first seem that we obtained clear results and a strong phylogenetic signal from the limited data we had available.
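The sampling argument can be illustrated with a small simulation. The sketch below is not the procedure of [61]. It builds a synthetic lexicon (the segment inventory, the skewed segment weights and the word lengths are all invented for illustration), draws random sublists of several sizes, and measures how far each sublist's segment frequencies lie from those of the full lexicon, using total variation distance as an assumed measure of fit.

```python
import random
from collections import Counter

def segment_distribution(words):
    """Relative frequency of each segment across a list of words."""
    counts = Counter(seg for word in words for seg in word)
    total = sum(counts.values())
    return {seg: n / total for seg, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    segments = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in segments)

def make_lexicon(n_words, rng):
    """Synthetic lexicon: hypothetical inventory with skewed segment weights."""
    inventory = list("aiuptkmnŋlrwj")
    weights = [1.0 / (rank + 1) for rank in range(len(inventory))]
    return [
        tuple(rng.choices(inventory, weights=weights, k=rng.randint(3, 7)))
        for _ in range(n_words)
    ]

def mean_distance(lexicon, sample_size, n_trials, rng):
    """Average distance between random sublists and the full lexicon."""
    full = segment_distribution(lexicon)
    dists = [
        total_variation(full, segment_distribution(rng.sample(lexicon, sample_size)))
        for _ in range(n_trials)
    ]
    return sum(dists) / n_trials

rng = random.Random(0)
lexicon = make_lexicon(5000, rng)
results = {n: mean_distance(lexicon, n, 200, rng) for n in (50, 200, 800)}
for n, d in results.items():
    print(f"sample of {n:4d} words: mean TV distance = {d:.4f}")
```

In this toy setting the distance shrinks as the sublist grows, which is the qualitative pattern the argument relies on. Real wordlists are not random samples of a lexicon, so the sketch says nothing about the frequency skew discussed above.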
Our second question was, why does the phylogenetic signal appear higher in FL than in structural traits of phonology, such as phonotactics? To answer this, we return to the two causes of stronger/weaker phylogenetic signals described in Section 2.2.
First, a phylogenetic signal is stronger when computed relative to the truest tree and lower otherwise. However, the trees we used here for PN while studying FL are the same as those used by Macklin-Cordes et al. [42] for phonotactics, and thus a difference in the trees used is unlikely to be the cause of the differences in phylogenetic signal.
Second, a phylogenetic signal is stronger if the change process has the properties of Brownian motion. In Brownian motion, small changes are more likely than large changes, and positive and negative changes are equally likely. We take these aspects in turn.
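Stated formally, in notation introduced here for illustration only (the symbols $X$, $\sigma^2$ and $t$ are assumptions of this sketch), the change in a trait value $X$ over an interval of length $t$ under Brownian motion is normally distributed around zero:

```latex
X(t) - X(0) \sim \mathcal{N}\left(0,\ \sigma^{2} t\right),
\qquad
\Pr\left[X(t) - X(0) > 0\right] = \Pr\left[X(t) - X(0) < 0\right] = \tfrac{1}{2}.
```

Because this density is symmetric and peaked at zero, small changes are more probable than large ones, and increases and decreases are equally probable.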
Both FL and structural traits, such as phonotactic frequencies, change as the lexicon changes. Any lexicon is constantly affected in small ways by neologisms and the obsolescence of words. Additionally, it may be affected by borrowing, which can occur at various rates, and by sound changes, which can occur in highly specific contexts or more sweeping ones. This mixture of factors supports an expectation that small changes will be frequent and larger changes less so, and it is not obvious that there would be significant differences in this regard between FL and traits such as phonotactic frequencies.
It now remains to consider whether positive and negative changes in values are equally likely. For FL, positive/negative changes in values entail that a contrast becomes more/less central in supporting the distinctiveness of strings in the lexicon. There is a lower bound at zero; however, in our study, we did not include that lower bound. Aside from that lower bound, we are not aware of constraints that would make the likelihood of positive or negative changes uneven at any point, and consequently the stochastic process that describes changes in FL could genuinely be quite close to Brownian motion. For structural aspects of phonology, however, the situation is different.
Structural features are subject to constraints: there are less likely and more likely structures, both in universal and in lineage-specific terms [45,63]. Consequently, for instance, the frequency of a highly marked structure should be more likely to decrease than to increase. This kind of inequality in the likelihood of positive and negative changes to values—irrespective of the actual sources, such as production, perception, and cognition—entails a departure from Brownian motion, which ought to weaken the phylogenetic signal. This, we suggest, may be why structural traits appear to have a lower phylogenetic signal compared with FL. If this line of reasoning is correct, we would expect similar results to emerge from studies of other language families beyond PN.
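The contrast between the two kinds of change process can be made concrete with a toy simulation. The following is a minimal sketch, not part of the analysis reported here. It compares an unbiased random walk (a discrete-time stand-in for Brownian motion) with a walk that is pulled toward a preferred value on every step, starting from a value above that preferred level, as might be the case for the frequency of a marked structure. The function name and all parameter values are hypothetical.

```python
import random

def fraction_of_increases(pull, n_lineages=1000, n_steps=20,
                          start=1.0, preferred=0.0, step_sd=0.1, seed=1):
    """Simulate trait change along independent lineages and return the
    share of steps on which the trait value went up.

    pull = 0 gives an unbiased random walk (discrete-time Brownian motion);
    pull > 0 adds a tendency to move toward `preferred` on every step.
    """
    rng = random.Random(seed)
    increases = 0
    for _ in range(n_lineages):
        value = start
        for _ in range(n_steps):
            change = -pull * (value - preferred) + rng.gauss(0.0, step_sd)
            increases += change > 0
            value += change
    return increases / (n_lineages * n_steps)

unbiased = fraction_of_increases(pull=0.0)
constrained = fraction_of_increases(pull=0.2)
print(f"unbiased walk:    {unbiased:.3f} of steps are increases")
print(f"constrained walk: {constrained:.3f} of steps are increases")
```

In the unbiased walk, increases and decreases occur at about the same rate, whereas the constrained walk shows a clear excess of decreases while the value remains above the preferred level. The sketch illustrates the asymmetry only; it does not estimate its effect on any measure of phylogenetic signal.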

4.2. Transphonologization and the Flow of Contrastiveness

Transphonologization [64] (cf. rephonologization [65] and cheshirization [66]) is a term given to sound changes in which a contrastive function is preserved but the locus of the contrast (the segments or features that instantiate it) changes. Here, we studied a closely related phenomenon in which contrasts do not necessarily disappear or emerge in their entirety, but their relative contrastive workload (their FL) shifts from one to another. One way to view this phenomenon is in terms of a diachronic flow of FL from one contrast to another (cf. [67]). The fact that we are able to quantitatively detect the presence of this flow of contrastiveness through a language family as large and old as PN suggests the potential of new avenues for investigating the dynamic flow of contrastiveness through phonological systems as they evolve over time.
One question that arises is whether our findings in PN might reflect some strong preference in language for the conservation of contrastiveness, in which case the flow of FL from one contrast to another might be regarded as an automatic consequence of one contrast undergoing a significant decrease in FL. Although our results alone cannot answer this question, we doubt that such a principle exists in any strong form. Certainly, in many mergers, the overall contrastive capacity of a language is simply reduced, as the FL of one contrast falls but no other FL rises to balance it.
In the case of PN tonic vowels and post-tonic consonants, we suggest that the cause of recurrent historical flow of FL lies in particular phonetic factors that are common across PN languages. There is a synchronic correlation between phonetic tonic vowel duration and phonetic post-tonic consonant manner, even in systems in which only the vowel-durational aspect is tied to a synchronic phonemic contrast. When the phonetic vowel-durational differences are neutralized diachronically, causing the phonemic vowel length contrasts to collapse, the phonetic manner differences, which still correlate with the same lexical distinctions that vowel length had signalled, become phonemic. On this view, it is the phonetics of the vowel–consonant strings that furnish the conditions for a natural flow of FL from vowel to consonant.
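A toy calculation may help to make the notion of a flow of FL concrete. The sketch below assumes the entropy-based formulation associated with [2,3], in which the FL of a contrast is the relative loss of lexicon entropy when the contrast is merged. It may differ in detail from the FL estimates reported in this paper. The two lexicon stages, the segment symbols and the pairing of a prestopped nasal with formerly short vowels are invented for illustration only.

```python
import math
from collections import Counter

def entropy(lexicon):
    """Shannon entropy (bits) over word forms, each entry weighted equally."""
    counts = Counter(lexicon)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def functional_load(lexicon, merge):
    """Relative entropy loss when the segments in `merge` are conflated.

    `merge` maps each segment to be neutralised onto its merger partner.
    """
    merged = [tuple(merge.get(seg, seg) for seg in word) for word in lexicon]
    h = entropy(lexicon)
    return (h - entropy(merged)) / h

LENGTH = {"aː": "a"}   # neutralise the vowel length contrast
MANNER = {"dn": "n"}   # neutralise the (hypothetical) nasal manner contrast

# Invented stage 1: length is phonemic, manner is not.
stage1 = [("k", "aː", "n", "a"), ("k", "a", "n", "a"),
          ("p", "aː", "n", "i"), ("p", "a", "n", "i"),
          ("m", "a", "n", "u"), ("w", "aː", "n", "u")]
# Invented stage 2: length has collapsed; manner now carries the distinctions.
stage2 = [("k", "a", "n", "a"), ("k", "a", "dn", "a"),
          ("p", "a", "n", "i"), ("p", "a", "dn", "i"),
          ("m", "a", "dn", "u"), ("w", "a", "n", "u")]

fl = {
    name: {"length": functional_load(lex, LENGTH),
           "manner": functional_load(lex, MANNER)}
    for name, lex in (("stage1", stage1), ("stage2", stage2))
}
for name, values in fl.items():
    print(name, {k: round(v, 3) for k, v in values.items()})
```

In this constructed example the FL of vowel length at stage 1 (about 0.258) reappears in full as the FL of consonant manner at stage 2, while the FL of the other contrast is zero at each stage. Such complete conservation is a property of the toy data, not a claim about PN languages.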

4.3. On Sapir’s ‘Drift’: The Non-Accidental, Parallel Evolution of Related Languages

It is a century now since the appearance in print of Edward Sapir’s hypothesis that languages undergo parallel grammatical evolution for several centuries after they split [68]. Providing anything more than anecdotal evidence in support of Sapir’s hypothesis has long been difficult [69,70,71], and some apparent cases may be due to language contact [72,73]. Dunn et al. [74] used phylogenetic methods to examine patterns of word order evolution in different language families; however, the study did not produce an identifiable cause for those patterns.
Ideally, evidence in support of Sapir’s drift should not be anecdotal but rather be statistically significant across a language family; it should not be reducible to the effects of language contact, and it should be relatable to an identifiable cause. The current study meets these three criteria. It detects parallel changes in FL that are instantiated statistically across 90 languages within the PN family, whose time depth is estimated at around 5000–6000 years [18,19,20]; thus, the evidence is not anecdotal.
The data pattern fits tightly with phylogeny, and thus is unlikely to be due to contact (cf. our note in the next paragraph). Furthermore, we identified a causal basis for this, in the common phonetics of PN tonic vowel–consonant sequences. Thus, we believe our results to be one of the fullest confirmations yet that Sapir’s conjecture was essentially correct: that under the right circumstances, linguistic systems can undergo parallel evolution after they split, not merely for centuries but for millennia.
A reviewer asks about the situation in which language contact closely mimics the pattern of phylogeny. If contact did pattern perfectly with phylogeny (such that languages only borrowed from their very closest relatives), then its effects would be indistinguishable. However, languages also borrow from geographic neighbours that are less closely related. It is hard to conceive of borrowing of lexical items whose impact on contrastiveness has the phylogenetic signal we find here in the absence of vertical inheritance. At the very least, the burden of proof is on the advocate of a language-contact account, given that the data pattern is in very close accord with expectations from vertical inheritance, and there is an accompanying explanation in terms of phonetics and sound change for why this should be so.
It has been suggested by Joseph [75] that drift in phonology may be due to a narrowing of the range of variation inherited from a proto-language. In the PN changes described here, however, the flow of FL from vowel length to consonant manner is not due to any narrowing of variation in FL in proto-PN (indeed it is not entirely clear what it should mean for FL to have a range of variation). Nor, when the PN developments are viewed in terms of phonological substance, are they a matter merely of narrowing variation. Although contrasts in vowel length are lost, new variation is introduced in the inventory of contrastive consonant manners and into the set of relationships that can hold between the length of a tonic vowel and the manner of post-tonic consonants.
In reality, Joseph’s proposal would appear to reduce to a fact, well-recognised in evolutionary biology, that incomplete lineage sorting (i.e., the inheritance of variation from a proto-taxon into its descendants) can result in the appearance of convergent evolution [76]. However, this does not entail that all convergent evolution is due to incomplete lineage sorting (see also [77]).
Another important source can be the existence of dependencies within a system that are inherited along with its substance [78], which will favour certain outcomes over others in descendent systems, as has been observed in protein evolution (see, for example, [79]). In PN, certain phonetic dependencies between tonic vowel length and post-tonic consonant manners were inherited alongside the phonological substance itself. In the descendent systems, in the event that vowel length was lost, the inherited dependencies favoured the rise of new, contrastive consonant manners.

5. Conclusions

This paper joins a growing body of work regarding the application of computational phylogenetic methods to phonological data. It also represents the first phylogenetic study of FL.
We showed that there was a significant phylogenetic signal in FL, which has implications for a better understanding of the dynamics of sound change. Further, we showed that the FL of tonic vowels and the FL of post-tonic consonants were negatively correlated in PN and that this correlation maps closely onto a sample of highly probable PN family trees. We also introduced the idea of the flow of contrastiveness between subsystems of the phonology in different languages, which is connected to the concept of transphonologization, and we claim that this represents a concrete example of Sapir’s drift.
Setting our gaze beyond PN, while not all sound changes are associated with the phonetic conditions that promote the flow of FL, there are many that appear to be, and it will be valuable to apply the methods we introduced here to study them. In time, this may lead to a more general understanding of how FL can flow within phonological systems over long time horizons. Promising future applications of our approach include the investigation of other suspected diachronic trade-offs, such as the rise of phonemic tone and register (such as contrastive phonation) in Southeast Asia tied to losses of consonantal laryngeal distinctions [3,80,81,82].

Author Contributions

Conceptualization, methodology, project administration, supervision, and investigation, E.R., R.D. and R.J.R.; software and formal analysis, E.R. and R.J.R.; resources and data curation, E.R.; writing—original draft preparation, E.R.; review and editing, E.R., R.D. and R.J.R.; visualization, E.R.; and funding acquisition, E.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Max Planck Institute for the Science of Human History and British Academy grant number GP300169 to E.R.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

FL data are supplied in Appendix A. PN trees were published with [42] and are available at https://zenodo.org/record/3988775 (accessed on 15 December 2021).

Acknowledgments

We are grateful to the editors and three anonymous reviewers for their insightful and helpful comments. Versions of this paper were presented at the Edinburgh Symposium on Historical Phonology 2021 and the Annual Meeting of the Australian Linguistics Society 2021, and we are grateful to the audiences at these meetings for their valuable comments. This research was initiated at the workshop, An evolutionary science of word and sound systems, hosted by the Max Planck Institute for the Science of Human History, Jena, in November 2019.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
FL: Functional load
PN: Pama–Nyungan

Appendix A. Functional Load Data

Table A1. Functional load estimates and the number of observations on which they are based in Pama–Nyungan languages A–M.

| Language | $FL_V^u$ | $FL_C^u$ | $FL_P^u$ | $FL_V^n$ | $FL_C^n$ | $FL_P^n$ | N |
|---|---|---|---|---|---|---|---|
| Adnyamathanha | 0.00728 | 2.29877 | 1.69328 | 0.00119 | 0.37569 | 0.27673 | 1424 |
| Anguthimri | 0.15786 | 0.89040 | 0.94600 | 0.02788 | 0.15725 | 0.16707 | 255 |
| Bakanh | 0.48216 | 0.75675 | 1.70156 | 0.08106 | 0.12723 | 0.28607 | 379 |
| Bidyara | 0.02048 | 1.04399 | 1.43908 | 0.00418 | 0.21327 | 0.29397 | 466 |
| Bilinarra | 0.02401 | 1.60119 | 1.38537 | 0.00454 | 0.30258 | 0.26180 | 1071 |
| Biri | 0.01129 | 1.04712 | 1.30791 | 0.00225 | 0.20869 | 0.26067 | 387 |
| Bularnu | 0.01029 | 1.81420 | 1.53169 | 0.00182 | 0.32136 | 0.27131 | 514 |
| Butchulla | 0.15259 | 1.20493 | 0.83980 | 0.03047 | 0.24058 | 0.16768 | 303 |
| Dhangu | 0.53689 | 1.30565 | 1.20428 | 0.09303 | 0.22623 | 0.20866 | 222 |
| Dhay’yi | 0.61997 | 1.33455 | 1.35632 | 0.10625 | 0.22872 | 0.23245 | 636 |
| Djabugay | 0.05108 | 1.26909 | 1.16098 | 0.01032 | 0.25636 | 0.23452 | 641 |
| Djapu | 0.55242 | 1.42580 | 1.45573 | 0.09347 | 0.24125 | 0.24632 | 563 |
| Djinang | 0.00835 | 1.75735 | 1.27597 | 0.00153 | 0.32242 | 0.23410 | 1459 |
| Duungidjawu | 0.35925 | 0.93156 | 0.91288 | 0.06575 | 0.17050 | 0.16709 | 340 |
| Dyirbal | 0.01899 | 1.20677 | 1.22623 | 0.00394 | 0.25013 | 0.25416 | 339 |
| Gamilaraay | 0.47639 | 1.09200 | 0.66671 | 0.09232 | 0.21161 | 0.12920 | 619 |
| Gidabal | 0.38315 | 1.20855 | 0.84965 | 0.07695 | 0.24271 | 0.17063 | 863 |
| Gumbaynggir | 0.68190 | 1.14161 | 0.56069 | 0.12949 | 0.21678 | 0.10647 | 260 |
| Gunya | 0.08592 | 1.67525 | 1.87928 | 0.01492 | 0.29087 | 0.32629 | 421 |
| Gupapuyngu | 0.70325 | 1.65965 | 1.38970 | 0.11275 | 0.26610 | 0.22281 | 1426 |
| Gurindji | 0.14136 | 1.68390 | 1.40002 | 0.02591 | 0.30864 | 0.25661 | 2783 |
| Guwamu | 0.15154 | 0.97184 | 1.28326 | 0.03000 | 0.19238 | 0.25402 | 284 |
| Jaru | 0.14610 | 1.65095 | 1.23854 | 0.02741 | 0.30974 | 0.23237 | 1459 |
| Jiwarli | 0.10655 | 1.49967 | 1.54472 | 0.01963 | 0.27631 | 0.28460 | 747 |
| Kalkatungu | 0.17358 | 1.45094 | 1.66749 | 0.03086 | 0.25799 | 0.29650 | 937 |
| Karajarri | 0.01130 | 1.58514 | 1.37181 | 0.00220 | 0.30850 | 0.26698 | 1217 |
| Kariyarra | 0.00794 | 1.51189 | 1.34208 | 0.00153 | 0.29066 | 0.25802 | 252 |
| Kartujarra | 0.10259 | 1.59830 | 1.62878 | 0.01872 | 0.29170 | 0.29726 | 526 |
| Kok Nar | 0.02440 | 1.02459 | 1.03786 | 0.00447 | 0.18781 | 0.19024 | 213 |
| Koko Bera | 0.00760 | 1.12514 | 1.14449 | 0.00139 | 0.20631 | 0.20985 | 428 |
| Kugu Nganhcara | 0.23071 | 1.20115 | 1.52446 | 0.03814 | 0.19859 | 0.25204 | 359 |
| Kukatj | 0.49247 | 1.37132 | 1.26212 | 0.08151 | 0.22696 | 0.20888 | 422 |
| Kukatja | 0.17702 | 1.69195 | 1.55646 | 0.03164 | 0.30243 | 0.27821 | 2339 |
| Kuku Yalanji | 0.00555 | 1.20557 | 1.06402 | 0.00113 | 0.24462 | 0.21590 | 1070 |
| Kurrama | 0.15740 | 1.47509 | 1.33256 | 0.02918 | 0.27348 | 0.24705 | 495 |
| Kurtjar | 0.35694 | 1.09790 | 0.90245 | 0.05715 | 0.17579 | 0.14450 | 405 |
| Kuugu Ya’u | 0.82180 | 0.96787 | 1.82687 | 0.14159 | 0.16676 | 0.31476 | 672 |
| Malkana | 0.09060 | 1.35974 | 1.51980 | 0.01719 | 0.25805 | 0.28843 | 208 |
| Mangala | 0.06827 | 1.58519 | 1.31169 | 0.01312 | 0.30471 | 0.25214 | 749 |
| Martuthunira | 0.12168 | 1.59684 | 1.41252 | 0.02230 | 0.29269 | 0.25891 | 633 |
| Mirniny | 0.05723 | 1.49353 | 1.64556 | 0.01071 | 0.27940 | 0.30784 | 259 |
| Mudburra | 0.12114 | 1.57266 | 1.36196 | 0.02255 | 0.29269 | 0.25347 | 509 |
| Muruwari | 0.29888 | 1.30293 | 1.23929 | 0.05555 | 0.24215 | 0.23032 | 873 |

Table A2. Functional load estimates and the number of observations on which they are based in Pama–Nyungan languages N–Z.

| Language | $FL_V^u$ | $FL_C^u$ | $FL_P^u$ | $FL_V^n$ | $FL_C^n$ | $FL_P^n$ | N |
|---|---|---|---|---|---|---|---|
| Ngaanyatjarra | 0.30847 | 1.65731 | 1.54372 | 0.05400 | 0.29015 | 0.27026 | 1125 |
| Ngadjunmaya | 0.27871 | 1.58111 | 1.52464 | 0.05123 | 0.29060 | 0.28022 | 512 |
| Ngamini | 0.00859 | 1.65769 | 1.62787 | 0.00167 | 0.32135 | 0.31556 | 453 |
| Ngardily | 0.04398 | 1.54071 | 1.41918 | 0.00838 | 0.29355 | 0.27040 | 274 |
| Ngarinyman | 0.04977 | 1.60920 | 1.37125 | 0.00949 | 0.30687 | 0.26150 | 870 |
| Ngarla | 0.05433 | 1.65036 | 1.52167 | 0.01026 | 0.31151 | 0.28721 | 940 |
| Ngarluma | 0.00949 | 1.54984 | 1.51068 | 0.00179 | 0.29179 | 0.28442 | 633 |
| Nhanda | 0.22128 | 1.61692 | 1.98274 | 0.03762 | 0.27491 | 0.33710 | 427 |
| Nhangu | 0.52706 | 1.62134 | 1.35121 | 0.08822 | 0.27137 | 0.22616 | 952 |
| Nukunu | 0.22996 | 1.88475 | 1.57220 | 0.03797 | 0.31122 | 0.25961 | 282 |
| Nyamal | 0.01096 | 1.66757 | 1.58458 | 0.00210 | 0.31958 | 0.30368 | 574 |
| Nyangumarta | 0.05591 | 1.63689 | 1.53035 | 0.01040 | 0.30451 | 0.28469 | 1002 |
| Nyawaygi | 0.44779 | 0.91693 | 0.84704 | 0.08627 | 0.17665 | 0.16318 | 339 |
| Olkol | 0.00913 | 1.89732 | 1.52545 | 0.00140 | 0.29102 | 0.23398 | 992 |
| Panyjima | 0.03319 | 1.52554 | 1.53691 | 0.00621 | 0.28545 | 0.28758 | 327 |
| Payungu | 0.12035 | 1.44293 | 1.48710 | 0.02226 | 0.26693 | 0.27510 | 514 |
| Pintupi | 0.25729 | 1.62037 | 1.49798 | 0.04605 | 0.29003 | 0.26812 | 3065 |
| Purduna | 0.18607 | 1.48411 | 1.64113 | 0.03364 | 0.26835 | 0.29675 | 540 |
| Ritharrngu | 0.52798 | 1.59391 | 1.28479 | 0.08893 | 0.26846 | 0.21640 | 841 |
| Sth. Paakintyi | 0.48182 | 1.31033 | 1.58642 | 0.08352 | 0.22713 | 0.27499 | 748 |
| Thaayorre | 0.39795 | 0.94581 | 1.29668 | 0.06750 | 0.16043 | 0.21995 | 873 |
| Thalanyji | 0.11295 | 1.41200 | 1.63985 | 0.02083 | 0.26037 | 0.30239 | 467 |
| Tharrkari | 0.10105 | 1.80780 | 1.46815 | 0.01760 | 0.31484 | 0.25569 | 371 |
| Umpila | 0.72633 | 0.89534 | 1.79211 | 0.12688 | 0.15640 | 0.31305 | 473 |
| Waalubal | 0.38811 | 1.21112 | 0.84641 | 0.07790 | 0.24309 | 0.16989 | 864 |
| Walmajarri | 0.07529 | 1.69469 | 1.53069 | 0.01387 | 0.31216 | 0.28195 | 2361 |
| Wangkatja | 0.24288 | 1.64652 | 1.54871 | 0.04329 | 0.29345 | 0.27602 | 1290 |
| Wangkumara | 0.03737 | 1.78415 | 1.49204 | 0.00662 | 0.31614 | 0.26438 | 480 |
| Warlmanpa | 0.03193 | 1.76883 | 1.54636 | 0.00582 | 0.32218 | 0.28166 | 603 |
| Warlpiri | 0.07788 | 1.71719 | 1.54913 | 0.01416 | 0.31215 | 0.28160 | 3215 |
| Warluwarra | 0.06211 | 1.62498 | 1.60851 | 0.01112 | 0.29084 | 0.28789 | 731 |
| Warnman | 0.03387 | 1.61920 | 1.55705 | 0.00628 | 0.30022 | 0.28870 | 607 |
| Warrgamay | 0.46606 | 1.16642 | 1.22445 | 0.08702 | 0.21779 | 0.22862 | 470 |
| Warriyangga | 0.04881 | 1.46336 | 1.55668 | 0.00925 | 0.27739 | 0.29508 | 273 |
| Watjarri | 0.09550 | 1.55502 | 1.58478 | 0.01742 | 0.28363 | 0.28906 | 787 |
| Wayilwan | 0.44092 | 1.11222 | 0.69925 | 0.08697 | 0.21937 | 0.13792 | 489 |
| Western Wakaya | 0.02551 | 1.47790 | 1.34683 | 0.00473 | 0.27429 | 0.24996 | 696 |
| Wik Mungkan | 0.57906 | 0.91075 | 1.73335 | 0.09237 | 0.14529 | 0.27651 | 1411 |
| Yadhaykenu | 0.11810 | 1.39036 | 1.47400 | 0.02185 | 0.25730 | 0.27278 | 385 |
| Yalarnnga | 0.01281 | 1.47869 | 1.59614 | 0.00239 | 0.27589 | 0.29780 | 397 |
| Yanyuwa | 0.02768 | 1.53369 | 1.47708 | 0.00521 | 0.28837 | 0.27773 | 1351 |
| Yaygir | 0.62734 | 1.49029 | 1.00410 | 0.11443 | 0.27184 | 0.18315 | 658 |
| Yidiny | 0.01376 | 1.24001 | 1.08403 | 0.00278 | 0.25063 | 0.21910 | 964 |
| Yindjibarndi | 0.12691 | 1.47608 | 1.21215 | 0.02391 | 0.27804 | 0.22833 | 492 |
| Yinhawangka | 0.08024 | 1.65358 | 1.67820 | 0.01442 | 0.29713 | 0.30155 | 773 |
| Yulparija | 0.07799 | 1.65715 | 1.58184 | 0.01426 | 0.30302 | 0.28925 | 1234 |
| Yuwaalaraay | 0.61024 | 1.18542 | 0.80690 | 0.11517 | 0.22373 | 0.15229 | 1109 |

References

  1. Martinet, A. Function, structure, and sound change. Word 1952, 8, 1–32.
  2. Hockett, C.F. The quantification of functional load. Word 1967, 23, 300–320.
  3. Surendran, D.; Niyogi, P. Quantifying the functional load of phonemic oppositions, distinctive features, and suprasegmentals. Amst. Stud. Theory Hist. Linguist. Sci. 2006, 279, 43.
  4. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656.
  5. Wedel, A.; Kaplan, A.; Jackson, S. High functional load inhibits phonological contrast loss: A corpus study. Cognition 2013, 128, 179–186.
  6. Hockett, C.F. The origin of speech. Sci. Am. 1960, 203, 88–97.
  7. Ratliff, M. Tonoexodus, tonogenesis, and tone change. In The Oxford Handbook of Historical Phonology; Honeybone, P., Salmons, J., Eds.; Oxford University Press: Oxford, UK, 2015; pp. 245–261.
  8. Hyman, L.M. What tone teaches us about language. Language 2018, 94, 698–709.
  9. Goedemans, R.; van der Hulst, H. Weight Factors in Weight-Sensitive Stress Systems. In The World Atlas of Language Structures Online; Dryer, M.S., Haspelmath, M., Eds.; Max Planck Institute for Evolutionary Anthropology: Leipzig, Germany, 2013.
  10. Nespor, M.; Vogel, I. Prosodic Phonology; Foris: Dordrecht, The Netherlands, 1986.
  11. Blevins, J. The independent nature of phonotactic constraints: An alternative to syllable-based approaches. In The Syllable in Optimality Theory; Féry, C., van de Vijver, R., Eds.; Cambridge University Press: Cambridge, UK, 2003; pp. 375–403.
  12. Cutler, A. Native Listening: Language Experience and the Recognition of Spoken Words; MIT Press: Cambridge, MA, USA, 2012.
  13. Garrett, A. Sound change. In The Routledge Handbook of Historical Linguistics; Bowern, C., Evans, B., Eds.; Routledge: London, UK, 2015; pp. 227–248.
  14. Silverman, D. Neutralization and anti-homophony in Korean. J. Linguist. 2010, 46, 453–482.
  15. Bouchard-Côté, A.; Hall, D.; Griffiths, T.L.; Klein, D. Automated reconstruction of ancient languages using probabilistic models of sound change. Proc. Natl. Acad. Sci. USA 2013, 110, 4224–4229.
  16. Babinski, S.; Bowern, C. Mergers in Bardi: Contextual probability and predictors of sound change. Linguist. Vanguard 2018, 4.
  17. Ceolin, A. On Functional Load and its Relation to the Actuation Problem. Univ. Pa. Work. Pap. Linguist. 2020, 26, 6.
  18. McConvell, P. Backtracking to Babel: The chronology of Pama-Nyungan expansion in Australia. Archaeol. Ocean. 1996, 31, 125–144.
  19. Bowern, C.; Atkinson, Q. Computational phylogenetics and the internal structure of Pama-Nyungan. Language 2012, 88, 817–845.
  20. Bouckaert, R.R.; Bowern, C.; Atkinson, Q.D. The origin and expansion of Pama–Nyungan languages across Australia. Nat. Ecol. Evol. 2018, 2, 741–749.
  21. Butcher, A.R. What speakers of Australian Aboriginal languages do with their velums and why: The phonetics of the nasal/oral contrast. In Proceedings of the XIVth International Congress of the Phonetic Sciences, San Francisco, CA, USA, 1–7 August 1999; pp. 479–482.
  22. Tabain, M.; Breen, G.; Butcher, A. VC vs. CV syllables: A comparison of Aboriginal languages with English. J. Int. Phon. Assoc. 2004, 34, 175–200.
  23. Butcher, A. Australian Aboriginal languages: Consonant-salient phonologies and the ‘place-of-articulation imperative’. In Speech Production: Models, Phonetic Processes and Techniques; Harrington, J.M., Tabain, M., Eds.; Psychology Press: New York, NY, USA, 2006; pp. 187–210.
  24. Fletcher, J.; Butcher, A. Sound patterns of Australian languages. In The Languages and Linguistics of Australia: A Comprehensive Guide; Koch, H., Nordlinger, R., Eds.; Mouton de Gruyter: Berlin, Germany, 2014; pp. 91–138.
  25. Jepson, K.; Stoakes, H. Vowel duration and consonant lengthening in Djambarrpuyngu. In Proceedings of the 18th International Congress of Phonetic Sciences, Glasgow, UK, 10–14 August 2015.
  26. Hercus, L.A. The pre-stopped nasal and lateral consonants of Arabana-Wangganguru. Anthropol. Linguist. 1972, 14, 293–305.
  27. Loakes, D.; Butcher, A.; Fletcher, J.; Stoakes, H. Phonetically prestopped laterals in Australian languages: A preliminary investigation of Warlpiri. In Proceedings of the 9th Annual Conference of the International Speech Communication Association Interspeech 2008, Brisbane, Australia, 22–26 September 2008.
  28. Round, E.R. Prestopping of nasals and laterals is only partly parallel. In Language Description Informed by Theory; Pensalfini, R., Guillemin, D., Turpin, M., Eds.; John Benjamins: Amsterdam, The Netherlands, 2014; pp. 81–95.
  29. Hale, K.L. Wik reflections of Middle Paman phonology. In Languages of Cape York; Sutton, P., Ed.; AIAS: Canberra, Australia, 1976; pp. 50–60.
  30. Smith, I.; Johnson, S. Kugu Nganhcara. In Handbook of Australian Languages; Dixon, R.M.W., Blake, B.J., Eds.; Oxford University Press: Oxford, UK, 2000; Volume 5, pp. 357–507.
  31. Hercus, L.A. A Nukunu Dictionary; Department of Linguistics, Australian National University: Canberra, Australia, 1992.
  32. Crowley, T. Uradhi. In Handbook of Australian Languages; Dixon, R.M.W., Blake, B.J., Eds.; John Benjamins: Amsterdam, The Netherlands, 1983; Volume 3, pp. 307–428.
  33. Alpher, B.J. Sound change. In Oxford Guide to Australian Languages; Bowern, C., Ed.; Oxford University Press: Oxford, UK, 2022.
  34. Hale, K.L. Phonological developments in particular Northern Paman languages. In Languages of Cape York; Sutton, P., Ed.; AIAS: Canberra, Australia, 1976; pp. 7–40.
  35. Hale, K.L. Phonological developments in a Northern Paman language: Uradhi. In Languages of Cape York; Sutton, P., Ed.; AIAS: Canberra, Australia, 1976; pp. 41–49.
  36. Verstraete, J.C. Lamalamic Root Structure: Erosion and Expansion. Aust. J. Linguist. 2018, 38, 360–394.
  37. Koch, H.J. Pama-Nyungan reflexes in the Arandic languages. In Boundary Rider: Essays in Honour of Geoffrey O’Grady; Tryon, D., Walsh, M., Eds.; Pacific Linguistics: Canberra, Australia, 1997; pp. 271–302.
  38. Koch, H. Basic vocabulary of the Arandic languages: From classification to reconstruction. In Forty Years on: Ken Hale and Australian Languages; Simpson, J., Nash, D., Laughren, M., Austin, P., Alpher, B., Eds.; Pacific Linguistics: Canberra, Australia, 2001; pp. 71–87.
  39. Black, P.D. Norman Pama historical phonology. In Papers in Australian Linguistics 13; Pacific Linguistics: Canberra, Australia, 1980; pp. 181–239.
  40. Dixon, R.M.W. Olgolo syllable structure and what they are doing about it. Linguist. Inq. 1970, 1, 273–276.
  41. Seifart, F.; Evans, N.; Hammarström, H.; Levinson, S.C. Language documentation twenty-five years on. Language 2018, 94, e324–e345.
  42. Macklin-Cordes, J.L.; Bowern, C.; Round, E.R. Phylogenetic signal in phonotactics. Diachronica 2021, 38, 210–258.
  43. Austin, P.K. A Grammar of Diyari, South Australia; Number 32 in Cambridge Studies in Linguistics; Cambridge University Press: Cambridge, UK; New York, NY, USA, 1981.
  44. McGregor, W. A Functional Grammar of Gooniyandi; Number 22 in Studies in Language; John Benjamins: Amsterdam, The Netherlands, 1990.
  45. Round, E.R. Phonotactics. In Oxford Guide to Australian Languages; Bowern, C., Ed.; Oxford University Press: Oxford, UK, 2022.
  46. Macklin-Cordes, J.; Round, E.R. Challenges of sampling and how phylogenetic comparative methods help: With a case study of the Pama-Nyungan laminal contrast. arXiv 2022, arXiv:2201.00195.
  47. Garamszegi, L.Z. Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology: Concepts and Practice; Springer: Berlin/Heidelberg, Germany, 2014.
  48. Kembel, S.; Cowan, P.; Helmus, M.; Cornwell, W.; Morlon, H.; Ackerly, D.; Blomberg, S.; Webb, C. Picante: R tools for integrating phylogenies and ecology (version 1.8.2). Bioinformatics 2010, 26, 1463–1464.
  49. Blomberg, S.P.; Garland, T.; Ives, A.R. Testing for phylogenetic signal in comparative data: Behavioral traits are more labile. Evolution 2003, 57, 717–745.
  50. Bowern, C. Pama–Nyungan phylogenetics and beyond [plenary address]. In Lorentz Center Workshop on Phylogenetic Methods in Linguistics; Leiden University: Leiden, The Netherlands, 2015.
  51. Martins, E.P.; Garland, T., Jr. Phylogenetic analyses of the correlated evolution of continuous characters: A simulation study. Evolution 1991, 45, 534–557.
  52. Revell, L.J. phytools: An R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 2012, 3, 217–223.
  53. Dockum, R. Phylogeny in phonology: How Tai sound systems encode their past. In Proceedings of the 2017 Annual Meetings on Phonology, New York, NY, USA, 15–17 September 2017.
  54. Macklin-Cordes, J.; Round, E. High-definition phonotactics reflect linguistic pasts. In Proceedings of the 6th Conference on Quantitative Investigations in Theoretical Linguistics, Tübingen, Germany, 4–6 November 2015.
  55. Dockum, R. The Tonal Comparative Method: Tai Tone in Historical Perspective. Ph.D. Thesis, Yale University, New Haven, CT, USA, 2019.
  56. Grosjean, F. Spoken word recognition processes and the gating paradigm. Percept. Psychophys. 1980, 28, 267–283.
  57. Wedel, A.; Ussishkin, A.; King, A. Crosslinguistic evidence for a strong statistical universal: Phonological neutralization targets word-ends over beginnings. Language 2019, 95, e428–e446.
  58. Wedel, A.; Ussishkin, A.; King, A. Incremental word processing influences the evolution of phonotactic patterns. Folia Linguist. 2019, 53, 231–248.
  59. Baker, B. Word structure in Australian languages. In The Languages and Linguistics of Australia: A Comprehensive Guide; Nordlinger, R., Koch, H., Eds.; Walter de Gruyter: Berlin, Germany, 2014; pp. 139–213.
  60. Hamilton, P. Phonetic Constraints and Markedness in the Phonotactics of Australian Aboriginal Languages. Ph.D. Thesis, University of Toronto, Toronto, ON, Canada, 1996.
  61. Dockum, R.; Bowern, C. Swadesh lists are not long enough: Drawing phonological generalizations from limited data. Lang. Document. Descript. 2019, 16, 35–54.
  62. Round, E.R.; Macklin-Cordes, J. Clouded vision: Why insights improve when uncertainty about data is made explicit. In Proceedings of the 24th International Conference on Historical Linguistics, Canberra, Australia, 1–5 July 2019.
  63. Gordon, M.K. Phonological Typology; Oxford University Press: Oxford, UK, 2016.
  64. Haudricourt, A.G. Les doubles transphonologisations simultanées. In Proceedings of the Actes du XIIe Congrès International de Linguistique et de Philologie Romanes, Bucarest, Romania, 15–20 April 1968; pp. 315–317.
  65. Jakobson, R. Principles of historical phonology. In A Reader in Historical and Comparative Linguistics; Keiler, A.R., Ed.; Holt, Rinehart and Winston: New York, NY, USA, 1971; pp. 121–138.
  66. Matisoff, J.A. Areal and universal dimensions of grammatization in Lahu. In Approaches to Grammaticalization; Traugott, E., Heine, B., Eds.; John Benjamins: Amsterdam, The Netherlands, 1991; pp. 383–453.
  67. Macklin-Cordes, J.L.; Round, E.R. Re-evaluating phoneme frequencies. Front. Psychol. 2020, 11, 3181.
  68. Sapir, E. Language: An Introduction to the Study of Speech; Dover Publications: Mineola, NY, USA, 1921.
  69. Grierson, G.A. On the Modern Indo-Aryan Vernaculars; Indian Antiquary: Bombay, India, 1931.
  70. Fortescue, M. Drift and the grammaticalization divide between Northern and Southern Wakashan. Int. J. Am. Linguist. 2006, 72, 295–324.
  71. Croft, W. Explaining Language Change: An Evolutionary Approach; Longman: New York, NY, USA, 2000.
  72. Dahl, Ö. The Growth and Maintenance of Linguistic Complexity; John Benjamins Publishing: Amsterdam, The Netherlands, 2004.
  73. Heine, B.; Kuteva, T. Language Contact and Grammatical Change; Cambridge University Press: Cambridge, UK, 2005.
  74. Dunn, M.; Greenhill, S.J.; Levinson, S.C.; Gray, R.D. Evolved structure of language shows lineage-specific trends in word-order universals. Nature 2011, 473, 79–82.
  75. Joseph, B.D. Demystifying drift. In Shared Grammaticalization; Robbeets, M., Cuyckens, H., Eds.; John Benjamins: Amsterdam, The Netherlands, 2013; pp. 43–66.
  76. Mendes, F.K.; Hahn, M.W. Gene Tree Discordance Causes Apparent Substitution Rate Variation. Syst. Biol. 2016, 65, 711–721.
  77. Iosad, P. Phonological convergence in North-western Europe. In Proceedings of the 12th International Conference of Nordic and General Linguistics, Oslo, Norway, 14–18 June 2021.
  78. Round, E.R. Getting Sapir’s drift: Parallel evolution in linguistic as in biological systems. In Proceedings of the Protolang 5, Lisbon, Portugal, 9–12 September 2019.
  79. Storz, J.F. Causes of molecular convergence and parallelism in protein evolution. Nat. Rev. Genet. 2016, 17, 239–250.
  80. Matisoff, J.A. Tonogenesis in Southeast Asia. In Consonant Types and Tone; Hyman, L.M., Ed.; Number 1 in Southern California Occasional Papers in Linguistics; University of Southen California: Los Angeles, CA, USA, 1973; pp. 71–96. [Google Scholar]
  81. Huffman, F. The register problem in fifteen Mon-Khmer languages. In Austroasiatic Studies; Jenner, P.N., Thompson, L.C., Starosta, S., Eds.; Univ of Hawaii Press: Honolulu, HI, USA, 1976; pp. 575–590. [Google Scholar]
  82. Dockum, R.; Gehrmann, R. The East Asian Voicing Shift and its Role in the Origins of Tone and Register. In Proceedings of the 95th Annual Meeting of the Linguistic Society of America, San Francisco, CA, USA, 7–10 January 2021. [Google Scholar]
Figure 1. Pama–Nyungan tree containing the 90 languages used in this study, inferred from lexical cognacy judgements. Displayed here is a single, maximum clade credibility tree, i.e., the one tree within the 1000-tree sample that most adequately represents the most frequently recurring subgroups in all of the trees of the sample.
Table 1. Examples of allophony in post-tonic consonants conditioned by the phonemic length of the tonic vowel. Key: V_ after a phonemically short vowel, VV_ after a phonemically long vowel. See references for additional details in the conditioning of allophony.
| Language (Subgroup) | Consonants | V_ | VV_ |
|---|---|---|---|
| Djambarrpuyngu (Yolngu) [25] | Consonants | longer | shorter |
| Wik (Middle Paman) [29] | Stops | tenser | laxer |
| Kugu Nganhcara (Middle Paman) [30] | Voiced stops | stop | fricative |
| Nukunu (Thura–Yura) [31] | Nasals, Laterals | prestopped | plain |
| Yadhaykenu (Northern Paman) [32] | Laterals | plain | flapped |
Table 2. Examples of post-tonic consonant contrasts created upon the merger of length distinctions in tonic vowels of Pama–Nyungan languages. Key: T short stop, TT long stop, D voiced stop, Z spirant, N nasal, NN long nasal, DN prestopped nasal, ND nasal+stop, L lateral, DL prestopped lateral, V_ after erstwhile short vowel, and VV_ after erstwhile long vowel. See the references for additional details and conditioning of the tabulated sound changes.
| Language (Subgroup) | Original C | V_ | VV_ |
|---|---|---|---|
| Warumungu (Warunmungic) [33] | T | TT | T |
| Wik-Muminh (Middle Paman) [29] | T (non-apical) | T | D |
| Northern Paman subgroup [32,34,35] | T (non-apical) | T | Z |
| Lamalama, Umbuygamu (Lamalamic) [36] | /k/ | /k/ | /h/ |
| Kugu Mumminh (Middle Paman) [33] | N | NN | N |
| Arandic subgroup [37,38] | N | DN | N |
| Walangama (Norman Paman) [39] | N | DN | N |
| Olgolo (Southwest Paman) [40] | N | DN | N |
| Lamalama (Lamalamic) [36] | N | ND | N |
| Rimanggudinhma (Lamalamic) [36] | N | D | N |
| Thura–Yura subgroup [31] | L | DL | L |
Table 3. Phylogenetic signal in $FL_V$, $FL_C$, and $FL_P$, measured using Blomberg’s K and a sample of 1000 reference PN trees.
| Functional Load Measure | Mean K | Std. dev. of K |
|---|---|---|
| Unnormalized FL | | |
| FL of tonic vowel length ($FL_V^{u}$) | 0.972 | 0.036 |
| FL of post-tonic consonant manner ($FL_C^{u}$) | 0.956 | 0.038 |
| FL of post-tonic consonant place ($FL_P^{u}$) | 0.960 | 0.030 |
| Normalized FL | | |
| FL of tonic vowel length ($FL_V^{n}$) | 0.997 | 0.039 |
| FL of post-tonic consonant manner ($FL_C^{n}$) | 1.181 | 0.035 |
| FL of post-tonic consonant place ($FL_P^{n}$) | 1.010 | 0.033 |
Table 4. Phylogenetic Pearson’s correlation between $FL_V$, $FL_C$, and $FL_P$.
| Functional Load Measures | r | 95% Interval | p |
|---|---|---|---|
| Unnormalized FL | | | |
| $FL_V^{u}$ versus $FL_C^{u}$ | −0.28 | [−0.46, −0.08] | 0.006 |
| $FL_V^{u}$ versus $FL_P^{u}$ | 0.03 | [−0.18, 0.23] | 0.78 |
| Normalized FL | | | |
| $FL_V^{n}$ versus $FL_C^{n}$ | −0.50 | [−0.64, −0.33] | $5 \times 10^{-7}$ |
| $FL_V^{n}$ versus $FL_P^{n}$ | −0.19 | [−0.38, 0.02] | 0.08 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Round, E.; Dockum, R.; Ryder, R.J. Evolution and Trade-Off Dynamics of Functional Load. Entropy 2022, 24, 507. https://doi.org/10.3390/e24040507