1. Introduction
Within the first few years of life, infants acquire their native language with remarkable ease. Not only do neonates differentiate their own language from languages belonging to other language families [1], but they also prefer listening to their mother’s voice and poems/stories she read aloud before they were born (see review in [2]). Debate continues as to whether some components of speech and language are genetically determined [3,4], but there is little controversy about the fact that infants show remarkable readiness and sensitivity to acquiring language and that experience contributes to shaping these abilities over development. Recent advances in developmental science have paved the way for elucidating how initial endogenous biases interact with language exposure and other exogenous factors to allow infants to tune into the sounds and structures of their mother tongue.
Over the first year of life, infants’ perceptual capacities become progressively more specialized in a fashion consistent with the rhythmic and prosodic patterns of their mother tongue [5,6,7]. At 6 months, infants are able to discriminate speech sounds from a variety of different language families. However, by 10–12 months of age, their perceptual abilities are narrowed to those sounds specifically relevant to their own language [8,9,10]. The nature of the decline in perceptual sensitivity is tightly bound to the characteristics of the language being acquired, implying that the perceptual system does not simply turn on or off a particular speech contrast, but rather that the system undergoes substantial dynamic reorganization during this early period [8,11].
Exactly how infants make their gradual journey from an open system comprising a wide variety of sounds to the specialized system observed in adults is a complex, interactive developmental story, the elements of which are gradually beginning to be understood. At the most basic level, neural and perceptual systems enable the infant to discriminate among human speech sounds [12]. Language exposure, combined with the statistical learning capacities that infants possess [13], plays an important role in sensitizing the infant to the frequency and distributional properties of the exposure language [14,15], resulting in an impact on syllable structure and phonotactics [16]. The timing and consistency of exposure appear to be highly relevant for the acquisition of speech contrasts. For instance, Korean adoptees who were no longer exposed to Korean after 3–8 years of age were no better at discriminating Korean contrasts than monolingual French speakers [17]. On the other hand, infants exposed to a language early on who receive continuous exposure, even for just a few hours per week alongside their native language, do show native-like discrimination skills [18].
A further influence on the development of speech in infancy is the social context in which language acquisition takes place. Interestingly, ‘live’ social interactions were found to be crucial to the acquisition of speech because exposure to the same contrasts through television does not result in learning [19]. Specifically, research has demonstrated that infants are particularly sensitive to patterns of contingency in the context of social interactions. For example, infants given contingent phonological feedback from their mothers will rapidly restructure their babbling, incorporating phonological patterns from caregivers’ speech, whereas infants who are provided the same feedback in a non-contingent fashion will not [20]. This suggests that contingency in social interaction plays an important role in acquisition of speech milestones.
The goal of our study was to map individual differences in reaching speech discrimination milestones onto differences in the quality of mother-child interaction, and in particular onto contingency in those interactions. To test the relationship between contingency in the context of naturalistic mother-child interactions and speech discrimination, we focus on a well-replicated milestone where infants narrow their perceptual sensitivity to native language contrasts. Little research has examined the mechanisms involved and how qualitative differences in mother-child interaction might relate to this ability. Infants’ sensitivity to syllables from their native tongue was compared to their sensitivity to non-native phonemes (from a different language family) in a discrimination task. The study had two aims: (1) to replicate the finding of loss of discrimination of non-native phonemes between 6 and 10 months of age in English, and to extend the findings to two new languages, French and German, and (2) to explore the relationship between the narrowing of this phonological sensitivity and contingency of mother-child interaction. We chose the 6–10 month time window of longitudinal change to assess the impact of exogenous factors, such as mother-child interaction, on the timing of the emergence of these abilities.
3. Results
Table 1 shows the amount of data retained from the initial group of 122 infants at 6 months and the 106 returning at the 10-month follow-up. Infants were excluded from the study due to fussiness and fatigue. Specifically, to be included in the analysis, infants had to complete all test trials at 6 months, 10 months, or both ages in either the native or non-native contrast. This approach allowed us to retain sufficient statistical power to examine the effects of subgroups based on parent-child interactions (see below).
Figure 1 and Figure 2 show the amount of time infants attended to the syllables during same and switch trials at 6 and 10 months in the two experiments assessing perception of native and non-native speech contrasts. Differences in looking time between same and switch trials were analyzed using four ANOVAs corresponding to each experiment (Native vs. Non-Native) tested at baseline and again at follow-up. Each model included the factors Trial (same vs. switch) and lab membership (London, Munich, Paris) to verify that the target variable Trial was not affected by testing location.
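As an illustration of the lab check described above, the following minimal Python sketch relies on a standard equivalence: when the within-subject factor has only two levels, the Trial × Lab interaction of a mixed ANOVA can be obtained as a one-way ANOVA across labs on each infant's difference score (switch minus same). This is a sketch under stated assumptions, not the authors' analysis code; the function name `interaction_f` and the toy numbers are hypothetical and do not come from the study.

```python
from statistics import mean


def interaction_f(diffs_by_lab):
    """One-way ANOVA F across labs on per-infant difference scores.

    diffs_by_lab maps a lab name to a list of (switch - same) looking-time
    differences, one value per infant. Returns (F, df_between, df_within).
    """
    groups = [list(g) for g in diffs_by_lab.values()]
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of lab means around the grand mean (between labs).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of infants around their own lab mean (within labs).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical toy difference scores, for illustration only.
toy_diffs = {
    "London": [1.0, 2.0, 3.0],
    "Munich": [2.0, 3.0, 4.0],
    "Paris": [3.0, 4.0, 5.0],
}
f_value, df_between, df_within = interaction_f(toy_diffs)
print(f"F({df_between},{df_within}) = {f_value:.2f}")
```

With three labs, the degrees of freedom take the form (2, N − 3), where N is the number of infants contributing data to that model.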
Figure 1.
Looking time during same and switch trials for native contrasts at 6- and 10-months.
Figure 2.
Looking time during same and switch trials for non-native contrasts at 6- and 10-months (* p < 0.05).
In the experiment assessing perception of native contrasts at 6 months, results showed a significant main effect of Trial (same vs. switch), indicating that infants listened significantly longer during switch trials relative to same trials (F(1,77) = 19.5, p < 0.001). This pattern did not differ depending on the lab from which the data were collected; there was no interaction between Trial and Lab membership (F(2,77) = 0.65, p = 0.53). A similar pattern was observed at 10 months, where there was a significant main effect of Trial (F(1,73) = 10.1, p = 0.002) but no interaction between Trial and Lab membership (F(2,73) < 1, p = 0.47). The same analysis was used for the non-native speech perception experiment. The results indicated a significant difference between same and switch trials on non-native contrasts at 6 months (F(1,69) = 8.9, p = 0.004), whereas at 10 months no such difference in looking time emerged (F(1,72) < 1, p = 0.92). Here too, the interactions between Trial and Lab membership were not significant (both p > 0.2). In view of these findings, lab membership was dropped as a factor from subsequent analyses.
In summary, these results indicate that as a group the 6-month-olds listened significantly longer during switch trials relative to same trials for both native and non-native contrasts. At 10 months, however, the discrimination effect was only found for native contrasts. To assess the relationship between the quality of mother-child interaction and speech discrimination performance, we split infants into two subgroups on the basis of the global sensitivity rating derived from the Care-Index. The overall mean for the sensitivity scale was 10 out of 14 (SD = 3.2) at 6 months and 11 out of 14 (SD = 2.9) at 10 months. Scores on maternal sensitivity were very strongly correlated (r = 0.8, p < 0.001) with another Care-Index measure describing infant cooperativeness, indicating that the scores reflected dyadic characteristics. Nevertheless, we preferred to use maternal scores to guarantee that any results obtained were not solely attributable to general infant characteristics that might affect speech discrimination performance. Dyads were divided into two groups whose scores fell above and below the median score on the sensitivity scale (11 at both 6 and 10 months). Furthermore, although maternal sensitivity scores at 6 months and 10 months were very strongly correlated (p < 0.001), the group division was done separately for each age. Out of a maximum of 14, mean scores at both ages in the “high contingency” group were around 12 for maternal sensitivity and 12 for infant cooperativeness, whereas mean scores in the “moderate contingency” group were around 7 for maternal sensitivity and 8 for infant cooperativeness.
Figure 3 and Figure 4 present discrimination results for native and non-native contrasts as a function of contingency of interaction. Post-hoc comparisons (setting a p value < 0.01 to correct for multiple comparisons) confirmed longer looking in switch vs. same trials for native phonemes across the two groups at both ages: high contingency, 6 months, t(39) = −3.9, p < 0.001, n = 40; high contingency, 10 months, t(49) = −2.2, p = 0.02, n = 50; moderate contingency, 6 months, t(36) = −2.6, p = 0.01, n = 37; moderate contingency, 10 months, t(28) = −2.3, p = 0.02, n = 29. By contrast, in the non-native speech perception tasks, the results of the subgroups differed from the overall group results. At 6 months, infants in the high contingency group did not show a significant difference between same and switch trials (t(32) = −1.02, p = 0.31, n = 33), whereas the infants in the moderate contingency group displayed this difference (t(34) = −3.1, p = 0.004, n = 35). At 10 months, performance in both groups paralleled the overall group results: infants in both groups showed no differences in listening time for same vs. switch trials.
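The subgroup comparisons reported above combine a median split on sensitivity scores with a paired comparison of same and switch looking times within each subgroup. The sketch below shows that logic in plain Python. Several details are assumptions rather than facts from the study: the function names `median_split` and `paired_t` are hypothetical, dyads scoring exactly at the median are assigned to the high group (the text does not say how ties were handled), differences are taken as same minus switch (which would yield the negative t values reported when infants look longer on switch trials), and all numbers are invented toy values.

```python
from math import sqrt
from statistics import mean, median, stdev


def median_split(scores):
    """Split dyad ids into high and moderate groups around the median score.

    Assumption: scores equal to the median go to the high group.
    """
    cutoff = median(scores.values())
    high = sorted(d for d, s in scores.items() if s >= cutoff)
    moderate = sorted(d for d, s in scores.items() if s < cutoff)
    return high, moderate


def paired_t(same, switch):
    """Paired t statistic on (same - switch) differences; returns (t, df)."""
    diffs = [a - b for a, b in zip(same, switch)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


# Hypothetical sensitivity scores (maximum 14), for illustration only.
toy_scores = {"d1": 12, "d2": 7, "d3": 11, "d4": 9, "d5": 13}
high_group, moderate_group = median_split(toy_scores)

# Hypothetical looking times for one subgroup, one value per infant.
toy_same = [4.0, 5.0, 6.0, 5.0]
toy_switch = [6.0, 6.0, 9.0, 7.0]
t_value, df = paired_t(toy_same, toy_switch)
print(high_group, moderate_group, f"t({df}) = {t_value:.2f}")
```

In a paired test the degrees of freedom equal the number of infants minus one, which matches the pairing of t(39) with n = 40, t(32) with n = 33, and so on in the reported comparisons.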
To ascertain whether these differing results in the high and moderate contingency groups were associated with differences in attention to the stimuli during the task, we assessed whether the groups differed in the amount of sustained attention to the stimuli in the non-native discrimination experiment at 6 months. The results indicated that total looking time in the familiarization phase was remarkably similar (high contingency = 61.4, SD = 17.4; moderate contingency = 61.9, SD = 24.1; p = 0.9). Hence, the results cannot be attributed to differences in exposure time during the familiarization phase.
Figure 3.
Looking time during same and switch trials for native contrasts at 6- and 10-months for subgroups of infants (* p < 0.05).
Figure 4.
Looking time during same and switch trials for non-native contrasts at 6- and 10-months for subgroups of infants (* p < 0.05).
4. Discussion
The relationship between phonetic development and the specific context of language acquisition has rarely been investigated in previous research [19]. Our findings suggest that specific characteristics of this social context, measured in our study through differences in the quality of dyadic interactions, exert an influence on the timing of this developmental process. At 6 months, infants from dyads with high contingency scores appear to have already narrowed their perceptual abilities to their mother tongue, yielding evidence for early specialization in that they discriminated native but not non-native speech contrasts, whereas infants from dyads with moderate contingency scores continued to discriminate both native and non-native contrasts. By 10 months of age, the two groups of infants were indistinguishable, both displaying the expected pattern of discrimination for their native but not for the non-native phonetic categories.
Our findings suggest that characteristics of the social environment map onto the timing of native language discrimination. This developmental milestone has already been viewed as critical in setting the stage for other linguistic skills acquired well into toddlerhood. The timing of this ability at around 10 to 12 months of age coincides with the point at which children begin to understand meaningful words and to produce the particular sounds of their native language [3]. Some have suggested that these evolving perceptual skills are essential for segmenting the speech stream and for mapping words onto meaningful concepts, in a process where phonological and lexical acquisition go hand in hand [25,26]. Longitudinal studies have revealed that perceptual sensitivity at 6 months predicts vocabulary size at 24 months [27] as well as reading skills at 3–8 years of age [19,28].
What are the implications of these results for theories of language acquisition? Traditionally, the debate over language acquisition in general, and phonetic development in particular, has focused on whether the infant’s evolving capacities are subserved by “modular”, domain-specific mechanisms or, alternatively, by general perceptual mechanisms [29,30]. In view of our findings, the relationship between mother-child interactions and phonetic discrimination performance does not appear to be mediated by a general mechanism, e.g., categorization skills, since the effects seen were specific to non-native, but not to native language contrasts. Instead, our findings support the emerging view of language acquisition as drawing on a diverse set of perceptual, cognitive, and social mechanisms [19,31]. The developmental relations among these mechanisms and phonetic development can be captured within specific time windows, in our study at 6 months but not at 10 months, where individual variability in the timing of phonetic specialization is influenced by the social context of language acquisition. Hence, while phonetic development initially draws on broad perceptual mechanisms, further specialization and in particular its timing is influenced by a variety of mechanisms including social ones.
What might be driving these differences in speech discrimination performance as a function of the quality of mother-child interaction? One possibility is that mothers in the high contingency group provide more linguistic input to their infants compared to the moderate contingency group. This, however, is inconsistent with previous studies demonstrating that mere exposure to linguistic contrasts outside the social context is not sufficient for successful discrimination [19]. Furthermore, while previous research has highlighted the critical role of statistical learning in lexical segmentation, it has also affirmed that human infants are efficient learners [13]. Consistent with this view, in the ratings of mother-child interactions in the instruments we employed, frequency of input was not a primary factor in discrimination. On the other hand, it has been previously suggested that the social context aids learning through eliciting attention and motivation, and that this extends even to other species, such as birds learning songs [19]. Our findings offer further insights into the features that characterize these interactions and how they might facilitate learning through stimulating the infant’s attention and motivation. Dyads in the high contingency group showed more mutual gaze, verbal and non-verbal turn-taking, and mutual affect. Furthermore, these interactions were characterized by high levels of contingency, where mothers altered their behavior as a function of the infants’ behavior and varied their verbal and non-verbal input within the context of these contingent interactions. Taken together, our findings are consistent with previous work suggesting that quantity of input is only one among many factors that need to be accounted for within the social context of language acquisition.
Further support for the importance of dyadic contingency in relation to phonetic learning comes from atypical development. In the neurodevelopmental disorder autism, differentiation of native language phonetic categories is less clear than that found in typical development, even by the age of 3–4 years [32]. While a number of studies have shown that mothers of children with autism do not differ in the frequency of verbal input they provide, the interactions are less synchronous than those seen in typically developing infants and their caregivers [32]. These atypical patterns of social interactions appear to result in serious consequences for phonetic development and, as our current study has shown, even in the typical case, features of mother-child interaction influence the timing of infants’ specialization for the sounds of their native language.
Our findings require replication because of two key limitations. First, despite the large sample size, we did not have sufficient statistical power to track each infant’s development longitudinally across native and non-native phoneme discrimination tasks. Given the typical limitations of experimental research designs in infancy, only a handful of infants produced valid data across both ages and in both experiments. Therefore, future studies need to have larger samples or employ novel methods to reduce data attrition. The second limitation of our study is that we did not specifically explore potential interactions between the specific language environment (English vs. French vs. German), parental characteristics such as education and socioeconomic status, and patterns of mother-child interaction. Therefore, such potential cross-linguistic and individual differences need to be explored in more detail in future studies.