1. Introduction
Cross-linguistic phonetic interaction in bilingualism and language learning is believed to be bidirectional: the earlier acquired, more established language (L1) can be affected by the later acquired, often non-dominant, language (L2). This type of cross-linguistic interaction is known by many names, including back-transfer, reverse interference, phonetic drift, and language attrition. We define this type of interaction as phonetic changes in speakers’ L1 brought about by use of L2 and refer to these changes primarily as L2-to-L1 (phonetic) effects or L1 drift.
A few prominent lines of inquiry have dominated previous research on L2-to-L1 effects, leaving the full scope of this phenomenon under-explored. Specifically, the majority of previous work has focused on proficient bilinguals or advanced second language learners, and most speakers were studied in situations of L2 immersion (
Baker and Trofimovich 2005;
Barlow et al. 2013;
Bergmann et al. 2016;
Caramazza et al. 1973;
Chang 2012;
De Leeuw 2019;
De Leeuw et al. 2018;
De Leeuw et al. 2010;
Flege 1987;
Fowler et al. 2008;
Guion 2003;
Harada 2003;
Hopp and Schmid 2013;
Kartushina and Martin 2019;
Lang and Davidson 2019;
Lev-Ari and Peperkamp 2013;
MacLeod and Stoel-Gammon 2005;
Major 1992;
Mayr et al. 2012;
Mora and Nadeu 2012;
Mora et al. 2015;
Sancier and Fowler 1997;
Simonet 2010;
Tobin et al. 2017;
Ulbrich and Ordin 2014). Moreover, language pairings often involved Western European languages, such as English, Spanish, French, German, and Dutch, which tend to be relatively similar phonologically and share the Latin alphabet. Finally, these studies on cross-language interaction have typically focused on sound classes that have distinct phonetic realizations in the respective languages, such as oral stops, distinguished across languages via voice onset time (VOT), or oral vowels, distinguished via formant frequencies.
The current study expands the scope of previous work on L2-to-L1 effects by examining a population of relatively inexperienced learners of a rarely studied Slavic second language (Russian). The L2 learners in the present study reside in the home country and are immersed in their native language (American English). Moreover, in addition to inquiring how comparable phonological categories can affect one another’s acoustic realization across languages, the present study also considers the transferability of phonological processes from L2 to L1. We target the acoustic realization of word-initial voiced and voiceless stops in native speech of American learners of Russian, to determine whether it has been affected by exposure to Russian. In addition, we investigate word-final stops, fricatives and affricates in learners’ English to establish whether their productions show an effect of the Russian final devoicing rule.
In the following sections, we discuss the theoretical underpinnings of the L2-to-L1 phonetic effects (
Section 1.1), provide a brief overview of the previous literature on the topic (
Section 1.2), and introduce the details of the present study (
Section 1.3).
1.1. Mechanism of L2-to-L1 Effects
Among the theoretical models put forth to account for the production of second language speech, the speech learning model (or SLM/SLM-r;
Flege 1995,
2003;
Flege and Bohn 2020) explicitly predicts bidirectional phonetic interactions and outlines their general mechanism.
SLM postulates that sound categories of learners’ first and second languages coexist in the same phonological space, which a priori creates a possibility for mutual influence. Moreover, SLM proposes that a mechanism of ‘equivalence classification’ affects the perception of L2 sounds that are acoustically non-identical but similar to existing L1 categories. As a result, corresponding L1 and L2 sounds are joined under the same category and their acoustic properties are predicted to affect each other, such that L2 sounds are realized in an L1-like manner and L1 sounds are produced similarly to L2 ones, a situation known as category assimilation.
Flege’s own work (e.g.,
Flege 1987) and much of the subsequent research, however, demonstrated that phonetically separate sound categories are nevertheless maintained across languages in the speech of bilinguals, with one or both deviating from the monolingual norm in the direction of assimilation to the other language (
Baker and Trofimovich 2005;
Caramazza et al. 1973;
Chang 2012;
Flege and Eefting 1987a,
1987b;
Fowler et al. 2008;
Harada 2003;
Major 1992;
Sancier and Fowler 1997;
Sundara and Baum 2006). This cross-language separation suggests that bilinguals are able to discern the acoustic–phonetic differences between the cross-language equivalents even when they are merged under the same category. Moreover, this ability is an important condition of the L2-to-L1 effects. If bilinguals perceived L2 sounds as indistinguishable from the ones in their L1, there could not be any influence of L2 on the production of L1.
To summarize, current theoretical models predict the emergence of L2-to-L1 phonetic effects in experienced L2 speakers, with a possible added condition of reduced L1 use. In the following section, we review relevant studies which serve to refine these general predictions.
1.2. Previous Research on L2-to-L1 Phonetic Effects
Research has consistently demonstrated that greater L2 experience and proficiency lead to a greater likelihood of L2-to-L1 phonetic effects for sequential bilinguals and adult language learners. For example,
Flege (
1987) demonstrated that the VOT of English [t] was significantly more French-like in the speech of Americans residing in Paris, compared to American students and teachers of French who were residing domestically. The French [t] of speakers of French residing in Chicago was also significantly different from that of French monolinguals, in the direction of assimilation to English, indicating the combined effect of proficiency, experience, and immersion. Later,
Lang and Davidson (
2019) showed that only Americans residing in Paris, but not American students on a short-term study abroad in France, experienced a drift in native vowel acoustics in the direction of L2 norms, confirming the important role of long-term immersion.
L2 pronunciation proficiency has also been linked more directly to changes in L1 phonetics.
Major (
1992) reported a positive correlation between L2 proficiency and L1 drift for American immigrants to Brazil: the closer they approximated Portuguese VOT norms in their production of L2 voiceless stops, the more they deviated from native norms in their L1 productions, in the direction of assimilation to L2 (although see
Kartushina and Martin (
2019) who report a negative L2 proficiency–L1 drift correlation).
The age of L2 acquisition also plays a role in promoting L2-to-L1 phonetic effects. In
Guion (
2003), only early and mid but not late Quechua–Spanish bilinguals revealed an effect of L2 (Spanish) on native vowel acoustics. Similarly, in
Baker and Trofimovich (
2005) an L2-to-L1 cross-language influence in vowels was uncovered only for early but not late Korean–English bilinguals, suggesting the importance of accumulated L2 experience.
Estimating L2 experience via length of residence in an L2-dominant environment,
Dmitrieva et al. (
2010) showed that Russian speakers of English with greater L2 experience were more likely to realize final obstruents in Russian in a more English-like manner, with less devoicing. Similarly,
Bergmann et al. (
2016) established that native speech of long-term German immigrants to Canada was perceived as more accented by their monolingual compatriots as a function of a longer residence abroad.
While the effects of L2 exposure and L2 proficiency are almost inevitably conflated in research on long-term immigrants,
Chang (
2012,
2013) was able to disentangle the two.
Chang (
2012) demonstrated L1 drift in several phonetic parameters, including VOT and vowel spectrum, in beginner American learners of Korean after only a short immersion in Korean during a study abroad program. Crucially, these participants achieved only elementary proficiency in Korean by the end of the six-week program, while L1 drift was observed already at week two. This work suggests that L2 proficiency by itself is not a necessary condition of L2-to-L1 phonetic effects, but L2 exposure which comes about due to L2 immersion may be.
Chang’s work does, however, share with much of the previous literature the element of L2 immersion, which provides a level of L2 input that is both abundant and authentic, even if it is primarily overheard. There is evidence that even overhearing-type exposure to another language may have important and long-lasting consequences.
Au et al. (
2002) and
Knightly et al. (
2003) showed that individuals who, early in life, were exposed to Spanish without learning it (‘overhearers’), upon enrolling in an L2 Spanish class, demonstrated near-native like VOTs in Spanish, compared to learners in the same class who did not overhear Spanish earlier in life. Moreover,
Chang (
2019b) demonstrated that L1 drift persisted for L2 learners immersed in L2 even when they no longer actively used the second language, suggesting that ambient language exposure in adulthood as well as in early childhood is an important factor affecting language production.
Caramazza et al. (
1973) also reported an interaction between English and French, exclusively in the direction of English affecting French, for residents of Canada who spoke only one of the two languages but were presumably exposed to both.
This work raises an important question about the minimum amount of L2 exposure required to trigger L2-to-L1 phonetic effects. Clearly, the intensive and authentic exposure provided by L2 immersion can be sufficient. However, what about non-immersion-type exposure to L2? Research on L2 learners in non-immersion situations is relatively scarce but it provides some indication that an even more fleeting introduction to another language may trigger phonetic changes in L1.
Traditional classroom learners of additional languages have been largely overlooked when it comes to L2-to-L1 effects. The few studies examining non-immersion populations were often conducted with small numbers of participants and thus arrived at somewhat inconclusive results.
Huffman and Schuhmann (
2016) examined four beginner American learners of Spanish and reported little evidence of L2-to-L1 phonetic effects. Between weeks 2 and 6 of language instruction, learners demonstrated no changes in the VOT of native voiced or voiceless stops. Only the frequency of prevoicing in English suggested a tendency to dissimilate away from Spanish: three participants decreased or eliminated prevoicing from their English voiced stops.
Schuhmann and Huffman (
2015) did show that after a period of explicit phonetic training, three out of five learners of Spanish shortened their English voiceless stops’ VOT, indicating assimilation to Spanish.
Herd et al. (
2015), the only large-scale (N = 40) cross-sectional study of classroom learners known to us, demonstrated that near-native and advanced learners of Spanish produced English voiced stops with more negative VOTs than beginner learners. The near-native, advanced, and intermediate learners also produced more peripheral English vowels than beginner learners did—a difference also compatible with the effect of Spanish on English. This study indicates that L1 drift is possible in more experienced classroom L2 learners but, in the absence of the monolingual control group, it was not established whether beginner learners also modified the acoustics of their native speech in the direction of second language norms. Overall, the available studies on L2-to-L1 effects in classroom learners provide limited evidence that L2 exposure in the classroom may be sufficient to trigger L1 drift.
Another reason to study classroom learners is the fact that they continue to reside in the home country while acquiring their L2. Most foreign language courses in US colleges provide active instruction 3–5 h a week. For the remainder of the time, learners use their L1. The amount of reduction in L1 use and exposure is most likely negligible in these circumstances.
This aspect of classroom language learning is important because the reduction in L1 use associated with L2 immersion could play an important role in creating conditions for the L2-to-L1 phonetic effects. Conversely, continued L1 use has been suggested to promote and protect the ‘authenticity’ of L1 speech (
Kartushina et al. 2016b). For example,
Bergmann et al. (
2016) demonstrated a negative correlation between the amount of L1 use and the degree of perceived non-native accent in the L1 speech of long-term German immigrants to North America. De Leeuw and colleagues also showed that the German of immigrants to Canada and The Netherlands was less likely to be perceived as non-native sounding if they had a high amount of contact with other Germans in a monolingual mode (
De Leeuw et al. 2010). Moreover, Mora and colleagues (
Mora and Nadeu 2012;
Mora et al. 2015) reported that greater use of L1 Catalan promoted more monolingual-like Catalan vowels in Catalan–Spanish bilinguals. Although
Tobin et al. (
2017) did not detect any L1 drift in the native speech of Spanish learners of English after a 3–4-month period of L2 immersion in the United States, they attributed this result to the lack of a sufficient reduction in L1 use.
The dominant, and thus more frequently used, language is also believed to be protected from the cross-linguistic influence. For example,
Kartushina and Martin (
2019) showed that, in balanced Catalan–Spanish bilinguals, both languages were affected by immersion in English but in Spanish-dominant bilinguals only Catalan vowels drifted towards English (see also
Caramazza et al. (
1973) and
Mack (
1989)).
To summarize, much previous research indicates that while advanced L2 proficiency is not a necessary condition for L2-to-L1 phonetic effects, greater L2 exposure and experience promote L1 drift. Immersion-type exposure to L2 is particularly conducive to L1 drift. Moreover, the reduction in L1 use, which typically co-occurs with L2 immersion and L2 dominance, is another possible condition for L2-to-L1 phonetic effects.
The population of classroom learners, which has not been widely studied with respect to L1 drift, provides an essential complement to previous work on immersed learners: comparing the two populations leads to a better understanding of the role of L2 immersion and reduced L1 use in bidirectional cross-language interaction. The following section describes the present study, designed to address the question of L1 drift in classroom language learners.
1.3. Present Study
The present study aims to determine whether exposure to a second language via classroom learning can lead to phonetic changes in the native speech of the learners. The second language studied by our participants is Russian.
Russian is a relatively unusual choice for American learners and a comparatively difficult language to acquire for English speakers. In a ranking of languages encompassing four different difficulty categories, the US Foreign Service Institute placed Russian in category III, among ‘hard’ languages with significant linguistic and/or cultural differences from English (
https://www.state.gov/foreign-language-training/), and specified that approximately 1100 class hours are required to reach general professional proficiency in speaking and reading (S3 and R3). This amounts to 14 semesters of study, assuming a fairly typical five hours per week over a 16-week semester study pattern. Thus, although participants for the present study were recruited from the second through to the sixth semesters of Russian study, it is reasonable to assume that most had not managed to reach advanced proficiency in this amount of time.
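The semester estimate above follows from simple arithmetic, which can be checked with a minimal sketch. The variable names are ours; the figures are those quoted in the text, and rounding up to whole semesters is an assumption.

```python
import math

# Figures quoted in the text; names are illustrative only.
fsi_class_hours = 1100    # approximate class hours for category III languages
hours_per_week = 5        # assumed weekly classroom instruction
weeks_per_semester = 16   # assumed semester length

# Classroom hours accumulated in one semester.
hours_per_semester = hours_per_week * weeks_per_semester

# Whole semesters needed to reach the quoted number of class hours
# (rounding up, since a partial semester still has to be taken).
semesters_needed = math.ceil(fsi_class_hours / hours_per_semester)

print(hours_per_semester, semesters_needed)
```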
Unlike more frequently studied languages such as French, German, Italian, and Spanish, Russian does not share the same writing system with English. This makes L1 English–L2 Russian a qualitatively different and novel language pairing to consider. In particular, we ask whether L1 drift is as likely in pairings of languages with fewer linguistic, orthographic, and cultural similarities as among more similar languages.
We consider the voice onset time of word-initial voiced and voiceless stops as the phonetic aspect potentially subject to L1 drift. In addition to this commonly studied parameter, we examine onset f0—pitch at the beginning of the post-consonantal vowel—as a secondary correlate of voicing. Secondary correlates have rarely been studied in L2 learners and we know little about their propensity to drift towards L2 in L1 speech.
Russian realizes its initial prevocalic [+voice] stops as robustly prevoiced (with negative VOT) and its initial prevocalic [−voice] stops as voiceless unaspirated (short lag VOT) (
Ringen and Kulikov 2012). English realizes its initial prevocalic [+voice] stops as a combination of weakly prevoiced (about 30% for the population,
Dmitrieva et al. 2015) and voiceless unaspirated stops (70%), and its initial prevocalic [−voice] stops as voiceless aspirated (long lag VOT) (
Lisker and Abramson 1964). This phonetic difference between Russian and English stop voicing is usually not taught explicitly in Russian language courses, as was confirmed by Purdue University Russian language instructors.
The expected pattern of L1 drift, based on previous research, includes a well-documented tendency towards VOT shortening in voiceless English stops. It is also possible that the prevoicing period in English [+voice] stops could be lengthened under the influence of Russian. Finally, the proportion of prevoiced to voiceless unaspirated stops among English [+voice] segments could change towards a greater frequency of prevoicing, in assimilation with Russian.
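The three VOT categories and the prevoicing proportion discussed above can be illustrated with a short sketch. This is not the authors' analysis code: the 30 ms boundary between short lag and long lag is an illustrative assumption, the function names are hypothetical, and the example values are invented.

```python
def classify_vot(vot_ms, short_lag_max_ms=30.0):
    """Label a stop token by VOT category.

    The short-lag/long-lag boundary (30 ms by default) is an
    illustrative assumption, not a value taken from the study.
    """
    if vot_ms < 0:
        return "lead"        # prevoiced: voicing starts before release
    if vot_ms <= short_lag_max_ms:
        return "short lag"   # voiceless unaspirated
    return "long lag"        # voiceless aspirated


def prevoicing_rate(vots_ms):
    """Proportion of tokens produced with lead (negative) VOT."""
    if not vots_ms:
        raise ValueError("at least one token is required")
    return sum(1 for v in vots_ms if v < 0) / len(vots_ms)


# Invented VOT values (ms) for ten English [+voice] stop tokens.
example_voiced = [-85.0, 12.0, 15.0, -60.0, 9.0, 18.0, 11.0, 14.0, -70.0, 16.0]

print([classify_vot(v) for v in example_voiced])
print(prevoicing_rate(example_voiced))
```

Under this framing, drift towards Russian would show up as a higher prevoicing rate and longer lead durations for [+voice] stops, and as shorter lag values for [−voice] stops.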
With respect to onset f0, the two languages demonstrate a congruent covariation of f0 with phonological categories (lower f0 after [+voice] stops) but an incongruent covariation with phonetic VOT categories: first, f0 is lower after prevoiced stops than after voiceless unaspirated stops in Russian but there is no such difference in English because short lag and lead VOT stops are variants of the same phonological category (
Kulikov 2012;
Dmitrieva et al. 2015). Thus, exposure to Russian could lead to f0 lowering after prevoiced stops in participants’ English speech. Second, English voiceless unaspirated stops are characterized by low onset f0, as they are phonologically voiced, while Russian voiceless unaspirated stops are characterized by high onset f0, as they are phonologically voiceless. Thus, an L2-to-L1 effect in this case would involve the relative raising of onset f0 after voiceless unaspirated stops in the English of Russian learners.
Finally, we investigate the temporal indices of voicing in word-final obstruents: preceding vowel duration, consonant constriction duration, and duration of voicing during constriction. This additional area of interest was selected because of important differences between English and Russian in the way phonological and phonetic voicing is treated in final obstruents. English, for the most part, maintains phonetic differences between phonologically voiced and voiceless final obstruents, although there is a gradient tendency to devoice in this position, especially for fricatives (
Davidson 2016). Russian, on the other hand, features categorical devoicing in word-final position. We aim to investigate the possibility of L2-to-L1 influence on the basis of phonological rules which apply in the L2. We hypothesize that learners’ L1 may adopt this phonological process from the L2 (
Barlow et al. 2013;
Simonet and Amengual 2020).
We further hypothesize that such influence may be especially likely for areas of L1 phonology that trend towards change, in particular if change is in the direction of the L2 process, in this case, devoicing (see
Barlow et al. (
2013) and
Bullock and Gerfen (
2004) for similar reasoning). Thus, English speakers exposed to Russian may be expected to demonstrate a stronger tendency to devoice in word-final position than is observed for monolingual English speakers.
To summarize, the present study examines L1-immersed classroom language learners in order to extend previous investigations of L2-to-L1 effects to populations not characterized by extensive L2 exposure and reduced L1 use due to L2 immersion. To establish the phonetic effects of their L2, Russian, on their L1, English, we examine the acoustic properties of word-initial stops (VOT and onset f0), and word-final obstruents (temporal indices of final voicing).
Following previous research, we conduct two types of comparisons: that between learners’ L1 and L2, in order to determine whether the two systems are distinct or merged with respect to the selected acoustic properties (a within-subject comparison), and that between learners’ and monolinguals’ L1s, in order to determine whether L2-to-L1 effects have taken place in learners’ speech (a between-subject comparison). We believe that it is important to conduct both comparisons in order to demonstrate that a degree of phonetic learning has taken place in these speakers’ L2, and that L1 drift, if present in their speech, is consistent with the nature of phonetic learning they achieved in their L2 speech. By establishing the degree of L2 phonetic learning for our participants, we further our understanding of the conditions under which L1 drift can be expected to occur. Moreover, the co-occurrence of L1 drift and L2 phonetic learning for the same features supports the notion that L2 phonetic learning is what triggers L1 drift.
To determine whether L1 drift is a relatively stable feature of learners’ native speech, as opposed to a short-term effect of producing speech in the two languages in immediate succession, we analyze the effect of the order of language elicitation.
We also examine the relationship between the extent of individual L1 drift and L2 proficiency in order to test the hypothesis that magnitude of drift in L1 is linked to the degree of pronunciation gains in L2.
Thus, the three main objectives of the present research are: (1) to determine whether phonetic learning has taken place in the Russian speech of learners; (2) to determine whether L1 drift has taken place in the English speech of learners; and (3) to determine whether the degree of phonetic learning/pronunciation gains is correlated with the degree of L1 drift.
2. Materials and Methods
2.1. Participants
Twenty native speakers of American English learning Russian as a second language participated in the study: eleven men and nine women, between the ages of 19 and 24 years (M = 20.6, SD = 1.3). They were recruited and recorded in two locations: Purdue University (14 participants) and the University of Kansas (6 participants). Participants filled out a language background questionnaire after the recording. All reported English as their first and native language. All participants reported learning Russian mainly through college classroom instruction, and only four had participated in a 2–4-month-long Russian study abroad program at some time during the year preceding their enrollment in the study. On average, they had studied Russian for 5 semesters by the time of participation (SD = 3, R = 1.5–12). The amount of class time varied by level, e.g., from five hours a week for semesters 1 through 4 of Russian to three hours a week starting from the 5th semester (Purdue campus).
Participants reported using Russian mostly in class or with classmates, on average for four hours per week (ranging from one to six hours). Four participants reported using Russian with a family member but only up to one hour a week. The most commonly reported type of engagement with Russian was reading (M = 2 h/week, R = 1–6 h/week). Writing in Russian was the second most common activity (M = 2 h/week, R = 0.5–4 h/week). Only about half of the participants reported listening to Russian radio or watching Russian TV (M = 3 h/week, R = 1–6 h/week).
Participants’ average self-reported Russian fluency was ‘fair’ (‘3’ on a 7-point scale), and the degree of accentedness in Russian was ‘moderate’ (‘3’ on a 7-point scale). All participants studied additional modern languages in classroom settings (the majority of participants studied only one additional language per person), most commonly Spanish, French and German (for 5 semesters on average, across these three languages). Achieved proficiency was ‘fair’ on average (‘3’ on a 7-point scale). Only three participants reported ‘good’ or ‘very good’ knowledge of an additional language (German and Spanish).
Eighteen native speakers of American English from the same dialectal area (Midwest) participated in the study as the control group: four men and fourteen women, between the ages of 18 and 57 (M = 25.8, SD = 9.8). These participants were recruited at Purdue University from the same undergraduate student population. They self-identified as native and monolingual speakers of Midwestern English without significant knowledge of other languages. Although all had some experience of learning a second language in instructional settings (most often Spanish or French), this experience was current or recent for only three participants.
None of the participants in either experimental or control group reported a hearing or speech impairment, and all were compensated for participation with course credit or cash. The study was approved by the Purdue University and University of Kansas Institutional Review Boards, protocols 1409015219 and 00003743, respectively.
2.2. Elicitation Materials
Elicitation materials consisted of English and Russian minimal and near-minimal monosyllabic pairs contrasting word-initial and word-final voicing.
The 44 English pairs consisted of 18 stop-initial (e.g., cap–gap), 18 stop-final (e.g., mop–mob), 6 fricative-final (e.g., safe–save), and 2 affricate-final (e.g., rich–ridge) pairs. There was a total of 75 experimental items (some words were used in both the word-initial and the word-final condition). Bilabial, alveolar, and velar stops were represented in equal numbers and final fricatives were labiodental (2 pairs) and alveolar (4 pairs). Preceding and following context was largely limited to the vowels [æ], [ɑ], and [ʌ]. There was no significant difference in lexical frequency between voiced and voiceless members of the pairs (COCA Corpus,
Davis (
2008)). Forty-eight mono- and disyllabic distractor items were also included. A complete list of English target stimuli is provided in
Appendix A,
Table A1 and
Table A2.
The 42 Russian pairs consisted of 18 stop-initial (e.g., [kostʲ]–[gostʲ] ‘bone’–‘guest’), 18 stop-final (e.g., [xrʲip]–[grʲib̥] ‘wheeze’–‘mushroom’), and 6 fricative-final pairs (e.g., [rʲis]–[prʲiz̥] ‘rice’–‘prize’), for a total of 84 experimental items. Bilabial, dental, and velar stops were represented in equal numbers and final fricatives were labiodental (1 pair) or alveolar (4 pairs). Preceding and following vowels were mid-low [e], [a], and [o] in about two-thirds of cases; the rest contained high vowels [i], [u], or [ɨ]. There were no significant differences in lexical frequency between voiced and voiceless stimuli (
Russian National Corpus 2003). Forty-five mono- and disyllabic distractor items were also included. A complete list of target stimuli is provided in
Appendix A,
Table A3 and
Table A4.
2.3. Procedure
Participants recorded at Purdue University were seated in front of the computer screen in a double-walled sound-attenuated booth. E-prime 2.0 (Psychology Software Tools, Pittsburgh, PA) was used to display the words for elicitation. The words appeared on the screen one by one, in a random order. Each word stayed on the screen for 2 s and was followed by 0.5 s of blank screen. Participants were instructed to pronounce each word the way they speak normally. The whole list was presented three times to each participant with short breaks offered between the blocks. The recording was performed using an Audio-Technica AE4100 cardioid microphone and a TubeMP preamp connected directly to a PC.
For participants recorded at the University of Kansas, a similar procedure was used. PowerPoint software was used to present the prompts on the screen, in a random order for each participant, with each word displayed on the screen for 1.5 s, followed by 1.5 s of blank screen. Recordings were performed in an anechoic chamber, using an Electro-Voice N/D 767a microphone and Marantz PMD671 digital recorder.
This computer-controlled stimulus presentation elicits an appropriately consistent rate of speech across and within participants. The order of Russian and English conditions was counterbalanced across participants, with a brief break between conditions. Due to technical issues, only one repetition of each item was recorded for one experimental participant, and only English data were collected from another experimental participant.
2.4. Measurements
For initial stops, voice onset time (VOT) and onset f0 were measured. For final obstruents, preceding vowel duration, duration of consonantal constriction, and duration of voicing during constriction were measured. Segmentation was performed manually based on Praat (
Boersma and Weenink 2018) waveform and spectrogram representations, using standard segmentation criteria. Measurements were collected using custom-written Praat scripts.
VOT was measured from the onset of consonantal release until the onset of voicing. Onset f0 was measured at the vowel onset as soon as the Praat autocorrelation algorithm detected periodicity. Obtained f0 values were examined for algorithm errors and corrected manually if necessary. Normalization was performed by converting f0 values to semitones relative to each participant’s individual mean onset f0, using the formula 12 × ln(x / individual mean onset f0) / ln 2, where x is the measured onset f0 value, based on the semitone normalization procedure in
Boersma and Weenink (
2018). After normalization, outliers more than two standard deviations away from the normalized grand mean onset f0 were removed from further analysis (97% of onset f0 measurements were retained). The resulting values represented the deviation of each onset f0 value, on the logarithmic scale, from each participant’s mean, now represented as 0.
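The normalization and outlier-trimming steps can be sketched as follows. This is a minimal illustration rather than the authors' Praat script: the function names are hypothetical, and the use of the sample standard deviation for the two-standard-deviation criterion is an assumption.

```python
import math
from statistics import mean, stdev


def to_semitones(f0_hz, reference_hz):
    """Convert f0 to semitones relative to a reference: 12 * ln(x / ref) / ln 2."""
    return 12 * math.log(f0_hz / reference_hz) / math.log(2)


def normalize_speaker(f0_values_hz):
    """Express one speaker's onset f0 values relative to that speaker's mean f0."""
    reference = mean(f0_values_hz)
    return [to_semitones(v, reference) for v in f0_values_hz]


def drop_outliers(values, n_sd=2.0):
    """Keep values within n_sd standard deviations of the grand mean.

    Sample standard deviation is used here as an assumption.
    """
    m = mean(values)
    sd = stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * sd]


# Invented onset f0 values (Hz) for one hypothetical speaker.
speaker_f0 = [190.0, 205.0, 210.0, 198.0, 202.0]
print(normalize_speaker(speaker_f0))
```

A value equal to the speaker's mean maps to 0 semitones, and a doubling of f0 maps to +12, so positive and negative values read directly as raising or lowering relative to the speaker's own average.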
For final obstruents, we measured the duration of the preceding vowel, the duration of the closure for stops/affricates, the duration of the frication portion for fricatives/affricates, and the duration of voicing during constriction.
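Given manually placed segment boundaries, the temporal measures for a final obstruent reduce to differences between time points. The sketch below assumes a simplified token with a single constriction interval and voicing that continues from the vowel into the constriction and then stops once; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FinalObstruentToken:
    """Boundary times in seconds; all field names are illustrative."""
    vowel_onset: float
    constriction_onset: float    # end of vowel, start of closure or frication
    constriction_offset: float   # end of closure or frication
    voicing_offset: float        # point at which voicing ceases


def temporal_measures(token):
    """Return the three durations in milliseconds."""
    vowel_ms = (token.constriction_onset - token.vowel_onset) * 1000
    constriction_ms = (token.constriction_offset - token.constriction_onset) * 1000
    # Clamp the end of voicing to the constriction interval so that voicing
    # duration is never negative and never exceeds the constriction.
    voicing_end = min(max(token.voicing_offset, token.constriction_onset),
                      token.constriction_offset)
    voicing_ms = (voicing_end - token.constriction_onset) * 1000
    return {
        "vowel_ms": vowel_ms,
        "constriction_ms": constriction_ms,
        "voicing_ms": voicing_ms,
    }


# Invented boundary times for one token.
example = FinalObstruentToken(0.100, 0.250, 0.330, 0.290)
print(temporal_measures(example))
```

For affricates, which have both a closure and a frication portion, the same subtraction would be applied to each portion separately.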
4. Discussion
The present study examined native and second language speech of American learners of Russian in order to determine whether classroom exposure to L2 can lead to phonetic changes in learners’ L1.
First, to confirm that classroom exposure to L2 resulted in phonetic learning, as evidenced by effective separation of L1 and L2 systems in the speech of the participants, and to evaluate the degree of this learning, we conducted a comparison between the acoustics of learners’ Russian and English sounds. The results indicated that, for almost every measure taken, learners produced statistically distinct values in Russian and English. In Russian, their word-initial voiceless stops had shorter VOTs, their word-initial voiced stops had longer prevoicing, the frequency of prevoiced stops was higher, and prevoiced stops were characterized by extra-low onset f0, compared to English. Learners’ word-final voiced obstruents were also partially devoiced in Russian, compared to English. All of these modifications were in the direction of approximating native Russian norms: short lag voiceless stops, robustly and near-exclusively prevoiced voiced stops, lower onset f0 after prevoiced stops, and word-final devoicing.
The fact that distinct productions were obtained across learners’ L1 and L2 indicates that even at these relatively early stages of learning, taking place while immersed in L1, participants grasped the phonetic differences between similar phones across the languages and were attempting to implement them in their L2 speech. These results fit in with a wide array of similar findings for bilinguals and L2 learners, demonstrating not merged but distinct productions across the two languages (
Baker and Trofimovich 2005;
Flege and Eefting 1987a,
1987b;
Fowler et al. 2008).
At the same time, it is clear that these learners were not near-native in their L2 pronunciation by any measure. Their initial [−voice] stops were too aspirated and their initial [+voice] category was still dominated by short lag productions, instead of prevoiced ones. Their final [+voice] obstruents were also only slightly devoiced, rather than fully devoiced as expected in Russian speech. Therefore, using these acoustic measures, we can conclude that, at least with respect to pronunciation, these learners were not highly proficient/advanced in their L2.
The question remains whether we can expect L2-driven phonetic changes in the learners’ native speech for these non-advanced speakers immersed in the L1. The answer given by the present results is ‘yes’. A comparison of learners’ English to the native English monolingual group revealed acoustically subtle but statistically significant differences, all compatible with the influence of Russian. Learners’ initial [−voice] stops were characterized by shortened VOTs, indicating assimilation with Russian. Comparison of learners of Russian with the monolingual group also revealed a tendency to reduce the magnitude of the phonetic contrast between final voiced and voiceless obstruents. This reduction was implemented via partial devoicing of [+voice] final obstruents and is compatible with the effect of the Russian final devoicing process.
This finding suggests that L2 phonological rules can trigger phonetic changes in speakers’ L1. The present outcome may also have been helped by the fact that American English is already gravitating towards variable final devoicing, most strongly in fricatives (
Davidson 2016), thus facilitating this particular back-transfer from Russian. It is interesting that there was no significant frication duration difference between voiced and voiceless fricatives and affricates in learners’ English. Thus, if learners have ‘drifted’ all the way into Russian-like complete neutralization (with respect to this parameter) then it was only in segments that are especially prone to devoicing in their L1. This finding warrants further research in order to learn more about the conditions under which phonological processes may seep from bilinguals’ L2 into their L1.
Interestingly, English [+voice] stops were not affected, neither short lag nor prevoiced ones. Only the frequency of prevoicing showed a change, notably, in the direction of dissimilation from Russian. A similar effect was reported by
Huffman and Schuhmann (
2016), who indicated that some American learners of Spanish produced fewer or no prevoiced stops in their English after 6 weeks of classroom Spanish instruction. This result merits attention because it demonstrates that L1 phonetic changes in the direction of dissimilation from L2 are possible even at the beginning stages of L2 acquisition, a possibility not provided for in SLM, which predicts that only advanced learners will dissimilate after having created separate categories for L2 sounds. Moreover, this finding indicates that the dissimilatory or assimilatory direction of crosslinguistic interaction can be determined not only by the stage of L2 acquisition but also by the sound category itself. Specifically, the present data suggest that L1 may tend towards dissimilation from L2 when L1 offers a choice of sub-phonemic variants of the same category, only one of which is used in L2 to represent the same phonological category.
Overall, phonetic parameters affected by L1 drift were a subset of those used by participants to differentiate their L1 and L2, suggesting that L2 phonetic learning is a natural precursor of L1 drift.
The present evidence of L2-to-L1 phonetic effects in American learners of Russian indicates that even relatively dissimilar language pairings are subject to such phonetic interactions. Assimilatory changes in the acoustics of English obstruents suggest that despite great linguistic differences between the two languages overall and the use of different orthographic symbols for these sounds, Russian and English segments influenced each other. Orthography has been shown to play a powerful role in adult second language learning, which relies greatly on literacy and orthographic input, unlike first language acquisition (
Bassetti and Atkinson 2015;
Bassetti et al. 2015;
Hayes-Harb et al. 2018). Nevertheless, the present study shows, in agreement with previous research on dissimilar language pairs such as English and Korean, that orthographic differences are not insurmountable obstacles for equivalence classification even in highly literate adult language learners. Equivalence classification between English and Russian obstruents, leading to cross-language assimilatory changes, could be facilitated by similarities in the phonological functioning of these segments in the respective languages. The two languages have similar sets of voiced and voiceless obstruents, which contrast initially and intervocalically but assimilate in voicing when in clusters and devoice, to different degrees, when in final position. Additionally, equivalent phonological categories across these two languages can be realized in phonetically identical ways, albeit in different contextual environments, e.g., English [−voice] stops can be implemented with short lag VOT word-medially before unstressed vowels, similarly to Russian [−voice] stops.
The fact that L2-to-L1 phonetic effects were detected for traditional classroom learners indicates that L2 immersion is not a necessary condition and that the amount of L2 experience and exposure received via classroom instruction can be sufficient to trigger such changes. Our participants did produce an acoustic distinction between L1 and L2 obstruents as a group, which suggests that this degree of phonetic learning may be required for L1 drift to initiate. Conversely, advanced pronunciation proficiency is most likely not a pre-requisite for L1 drift. Nevertheless, prior L2 immersion, which was not ongoing at the time of participation and was limited to 2–4 months, appeared to intensify the degree of L1 drift for these participants in some measures, in parallel with improving the authenticity of their L2 pronunciation.
Moreover, these findings strongly suggest that a significant reduction in the use of L1 is also not a necessary condition for L1 phonetic drift. It is very unlikely that participants in the present study experienced a substantial reduction in the amount or quality of L1 use as a result of studying Russian at the university. It is also unlikely that they were exposed to much spoken Russian through overhearing (concomitantly, also reducing the ‘overhearing’ of English), the way immersed learners are. Thus, even in the absence of considerable reduction in L1 use, the first language can and does drift towards the phonetics of L2 in comparable sound categories. To complement this finding, other work has demonstrated that even in bilingual or immigrant settings, where L1 use reduction is likely, L1 use does not always correlate with the quality of L1 pronunciation (
Guion et al. 2000;
Hopp and Schmid 2013). The diversity of results with regard to the effect of L1 use on L1 speech indicates that its role is not fully understood and merits further attention.
Additionally, L1 drift in the present work was not the result of an immediate ‘spill-over’ effect from one language to another. The order of Russian and English elicitation was counterbalanced, and the analysis of elicitation order effects showed that order did not condition the presence of the L1 drift, although it could sometimes increase its magnitude.
Although it is unlikely that learners’ L1 use was substantially decreased by their enrollment in Russian courses, another possibility is that L1 inhibition, not L1 use reduction, is what paved the road for L2-to-L1 phonetic effects. Some authors have argued that successful L1 inhibition is important for effective L2 learning. For example,
Linck et al. (
2009) showed that immersed L2 learners fared better in acquired L2 proficiency but worse in L1 lexical retrieval than classroom learners, and argued that greater L1 inhibition in immersion settings was responsible. Moreover,
Levy et al. (
2007) demonstrated that even a short laboratory training session was enough to trigger L1 inhibition effects in lexical retrieval. This suggests that a relatively limited time of classroom L2 learning may also be sufficient to trigger L1 inhibition and, therefore, L1 drift. Furthermore, if laboratory training can induce L1 inhibition and L1 inhibition can trigger L1 drift, we could expect L2-to-L1 phonetic effects under laboratory conditions. This is precisely what was demonstrated by
Kartushina et al. (
2016a) who showed drift in L1 vowels towards similar non-native ones after short-term visual articulatory feedback training. Nevertheless, further research is necessary to fully understand the role of L1 inhibition in the susceptibility of the L1 sound system to the influence of the L2.
A related issue is the longevity of L1 drift observed in laboratory conditions, after a short L2 immersion, or in the course of classroom L2 acquisition. It is rather plausible that such effects may be short-lived. In fact,
Kartushina and Martin (
2019) showed that, four months after studying abroad, participants experienced a ‘return drift’ towards native-like phonetics in their L1 (see also
Chang (
2019b) for similar findings). It is possible that our learners would lose the effects of Russian on their L1 if or when they discontinue their Russian studies. Such short-term phonetic changes in L1 may be qualitatively different from language attrition, which is believed to develop over longer periods of time and have greater strength of ‘inertia’ in resisting the ‘rebound’ back towards native-like values when speakers return to L1 immersion and no longer actively use their L2 (
Chang 2019a). Additionally, the ‘novelty’ effect may boost the cross-language drift at the early stages of language acquisition (
Chang 2013). Ultimately, the observation that the L1 can respond flexibly and intricately to the changing circumstances of learners’ language use and environment demonstrates its great plasticity and adaptability and argues against maturation-related limits on phonetic learning.
Finally, there are a number of factors we could not address in the present study, but which merit serious attention in future research. Among those is the role of exposure to a non-native-like L1 in triggering L1 phonetic drift, as well as factors such as motivation for learning and attitudes towards the L2 and its associated culture.
An implicit assumption in previous research examining L2-to-L1 phonetic effects has been that it is the exposure to and use of L2, per se, that triggers L1 phonetic drift. This assumption is supported in the current study by the observation that L1 drift co-occurred with a degree of phonetic learning in L2 sufficient to produce two distinct phonetic systems. However, under many scenarios of L2 learning and use, learners are also exposed, to varying degrees, to a non-native-like and accented L1. In the present case, all instructors in the Russian courses attended by our learners were native speakers of Russian. An informal survey of Russian instructors at Purdue University indicated that during class students may be addressed in English anywhere between 15% and 70% of the time, depending on the course level and individual proficiencies of class participants. This suggests that, especially at the beginner level of Russian instruction, learners are exposed to a considerable amount of Russian-accented English speech. Some acoustic characteristics of Russian-accented English would be very similar to the ones observed in native English that has drifted towards Russian (e.g., shorter voiceless VOTs of initial stops, partial or complete devoicing of final obstruents).
At present, we know very little about the role that such exposure may play in the development of apparent L1 drift. Nevertheless, some research suggests that L2-accented L1 input may contribute to non-native-like L1 productions in bilinguals. For example,
Mora and Nadeu (
2012) and
Mora et al. (
2015) suggest that the partial merging of Catalan /e/ and /ε/ vowels in the speech of their participants could be due, in part, to their exposure to Spanish-accented Catalan (see also
Sebastián-Gallés et al. 2005). If L1 drift in classroom language learners is guided, in part or primarily, by the Russian-accented English input provided by their instructors, drift may develop as L1 phonetic accommodation to the instructor. In this case, it could be impacted by factors shown to mediate accommodation, such as considerations of dominance and prestige, speakers’ disposition and attitudes towards each other, and language distance (
Babel 2012;
Babel et al. 2014;
Kim et al. 2011;
Pardo 2006;
Pardo et al. 2012;
Pardo et al. 2013;
Yu et al. 2013).
Additionally, learners’ motivations for studying the chosen language and their global attitudes towards the associated culture and native speakers of their L2 could further mediate the propensity to drift. Previous research has shown that language attitudes and considerations of prestige can influence cross-language interaction in bilinguals (
Gatbonton et al. 2005;
Gatbonton et al. 2011;
Giles et al. 1977;
Law et al. 2019), but considerably more work is needed to determine the role of such factors in L2-to-L1 phonetic effects.