The aim of this study is to explore the development of speech perception in young multilinguals’ non-native languages (L2 and L3) and to trace the patterns of cross-linguistic mappings over the first year of L3 learning. This study forms part of the international MULTI-PHON research project, in which speech perception and production were investigated with a battery of tests in two parallel groups of young adolescents in Polish and German schools.
3.1. Participants
The participants were 13 L1 Polish speakers (aged 12–13) who had been learning English as their L2 at school for five years (pre-intermediate level) and who had just started to learn German as their L3 in an instructed setting. They were observed over the first year of L3 learning. Our strict inclusion criteria were no prior command of German, Polish as the only L1, no additional languages, and data availability at all testing times. For the present analysis, the number of participants was therefore reduced from a larger pool (initially 24) to 13 speakers with a homogeneous profile (see Table 1).
Informed consent was obtained from all the subjects who participated in the study, their parents, and the authorities of the schools where the data was collected. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ministry of Education in Brandenburg on 17/07/2017 (ref. number 51/2017).
Language background interviews were conducted in the participants’ L1 Polish at the very onset of the project in order to collect information about the individual learners’ language backgrounds, including their language learning history (i.e., age of learning, length and intensity of instruction), language use (declared percentage in varied situations/contexts), self-evaluation of proficiency (at the onset of instructed L3 learning), and attitudes towards foreign language learning.
3.3. Research Questions and Hypotheses
In order to investigate cross-linguistic interactions in multilinguals’ speech perception, the following research questions were posed in the study:
1. Is there evidence of CLI in the perception of L2 English and L3 German?
It is hypothesized that cross-linguistic interactions in the two foreign languages may differ and result in variable performance on the measures of perception accuracy and reaction time (RT), depending on the language status (L2 vs. L3) as well as the investigated feature (rhotics vs. final obstruent devoicing). Better performance on both measures and, thus, less CLI is expected for the more established L2 as compared to the newly acquired L3.
Hypothesis 1 (H1). Both phonological feature and language determine perception accuracy and reaction times. There will be less CLI in the learners’ L2 English than their L3 German.
2. Is there perceptual development over time caused by a change in CLI? Does the perceptual development in L3 parallel that in L2?
In this study, cross-linguistic interactions were operationalized as the respondents’ preferences for L1-, L2-, or L3-accented stimuli in the forced-choice goodness task. We expect different patterns to hold for the two foreign languages and a change in CLI patterns as a function of testing time (T1 vs. T2).
Hypothesis 2 (H2). There will be changes in CLI across time. The developmental patterns of CLI differ between the learners’ L2 English and L3 German.
3.4. Materials and Methods
The participants performed perceptual tasks in both their L2 English and L3 German to test their perception of rhotics and final obstruent (de)voicing. Response accuracy and reaction times were recorded for analysis at two testing times (T1, after 5 months of L3 learning, and T2, after 10 months). To create appropriate language modes, the data collection for the two languages was carried out on two separate days, with L1 speakers of the respective languages as instructors.
A forced-choice (FC) goodness task was selected for the present study as an alternative to more traditional perceptual paradigms such as discrimination or identification. Discrimination tasks, in which the listener decides whether two stimuli are the same or different, seemed to be of little use, as the aim was to test the association of a given variant of a sound with a particular language in the multilingual’s repertoire. Identification tasks, in turn, are notoriously problematic when it comes to specifying response alternatives (including difficulties arising from non-transparent orthography), a problem that is magnified when three phonological systems interact. Moreover, identification tasks are not useful for testing allophonic differences across languages. Overly complex perception tasks needed to be avoided, too: when task complexity increases, perceivers have been found to switch to a primarily phonological level of reasoning (Strange 2009). Therefore, a forced-choice goodness task was selected for the present research, as it allowed us to elicit the association of a given allophone with one of multiple languages while avoiding the complexity of stimulus identification.
More specifically, the participants heard two renditions of the same phrase, which differed only in the final stimulus item embedded in a carrier phrase. By pressing one of two buttons (marked 1 and 2) on a button box, they had to decide which phrase sounded more natural (i.e., more target-like) to them. One rendition was a target realization and the other was an accented realization in which only the investigated feature was manipulated. For example, for rhotics in the English version of the task, the stimuli included the target-like phrase “You will hear the word ring /ɹiŋ/” followed by the Polish-like realization of the rhotic sound, “You will hear the word ring /riŋ/”.
For rhotic sounds, each target word yielded two pair items, as the target realization was paired with each of the two other possible realizations; for obstruent (de)voicing, each target word yielded a single pair item, as the target was contrasted with its voiced or devoiced/voiceless counterpart. The order of presentation of target and non-target stimuli was counterbalanced across trials.
Thus, in the English version, there were stimuli with English target rhotics as well as with Polish and German rhotics. Likewise, in the German version, the stimuli included German target rhotics embedded in a carrier phrase as well as Polish- and English-accented manipulated rhotics in the target words. In the case of obstruent (de)voicing, the stimuli in the English version included the target-like phrase “You will hear the word have” /hæv/, followed by a manipulated realization of the final obstruent, /hæf/. Similarly, in the German version, the target words embedded in a carrier phrase (“Du hörst das Wort Hand” /hant/) included final obstruents that were either voiceless (thus target-like) or voiced (i.e., L2-accented).
The stimuli in each language version involved 10 pair items containing rhotics, 13 (English) or 14 (German) pair items featuring final (de)voicing, and three training pair items that preceded the testing blocks. In total, the FC task thus included 26 English and 27 German pair items (10 + 13 + 3 and 10 + 14 + 3, respectively) for the participants to respond to.
The target rhotics occurred either in word-initial or medial position and included:
For English: ring, rabbit, red, round, giraffe (with the manipulated items realized as having an L1-Polish-accented alveolar trill or an L3-German-accented uvular fricative).
For German: rot, Regen, Reise, Fahrrad, verloren (with the manipulated items realized as having an L1-Polish-accented alveolar trill or an L2-English-accented post-alveolar approximant).
The final obstruent (de)voicing stimuli were in coda positions and featured as follows:
For English: days, grab, leg, could, stab, big, skies, give, love, food, judge, have, rob (with the manipulated items realized with voiceless final obstruents, which could be interpreted as either L1-Polish- or L3-German-accented).
For German: Hand, Berg, Quiz, lieb, Kleid, Mund, Honig, Hund, Fahrrad, Kind, vierzig, brav, Korb, gelb (with the manipulated items realized with voiced final obstruents, which could be interpreted as L2-English-accented).
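The item counts reported for the two language versions can be reproduced from the word lists above. The following Python sketch is purely illustrative and is not the E-Prime script used in the study: the function names, the dictionary layout, and the alternating counterbalancing scheme are all assumptions introduced here.

```python
import random

# Target words as listed in the text.
RHOTIC_WORDS = {
    "English": ["ring", "rabbit", "red", "round", "giraffe"],
    "German": ["rot", "Regen", "Reise", "Fahrrad", "verloren"],
}
# Each rhotic target is paired with two accented realizations.
RHOTIC_FOILS = {
    "English": ["L1-Polish trill", "L3-German uvular fricative"],
    "German": ["L1-Polish trill", "L2-English approximant"],
}
OBSTRUENT_WORDS = {
    "English": ["days", "grab", "leg", "could", "stab", "big", "skies",
                "give", "love", "food", "judge", "have", "rob"],
    "German": ["Hand", "Berg", "Quiz", "lieb", "Kleid", "Mund", "Honig",
               "Hund", "Fahrrad", "Kind", "vierzig", "brav", "Korb", "gelb"],
}
N_TRAINING = 3  # training pair items preceding the test blocks


def build_pairs(language):
    """Return the test pair items (training items excluded) for one language."""
    pairs = []
    for word in RHOTIC_WORDS[language]:
        for foil in RHOTIC_FOILS[language]:  # two pair items per rhotic word
            pairs.append({"word": word, "feature": "rhotic", "foil": foil})
    for word in OBSTRUENT_WORDS[language]:  # one pair item per obstruent word
        pairs.append({"word": word, "feature": "obstruent",
                      "foil": "voicing-manipulated"})
    return pairs


def counterbalance(pairs, seed=0):
    """Hypothetical scheme: alternate the target position (1 or 2), then shuffle."""
    rng = random.Random(seed)
    trials = [dict(p, target_position=1 + (i % 2)) for i, p in enumerate(pairs)]
    rng.shuffle(trials)
    return trials


def total_items(language):
    """Test pair items plus training pair items."""
    return len(build_pairs(language)) + N_TRAINING


if __name__ == "__main__":
    for language in ("English", "German"):
        print(language, total_items(language))
```

Under these assumptions, five rhotic words with two foils each give 10 pair items, and adding the 13 or 14 obstruent items and the three training items gives the reported totals.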
The stimuli were randomized across trials in E-Prime. The inter-stimulus interval was set at 500 ms, and the participants had a 3000 ms response limit; the task was thus timed. The participants’ performance on the timed forced-choice goodness task was examined in terms of accuracy and reaction time (RT). The latter was included as a proxy for the perceptual difficulty of the tested stimuli.
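To make the two dependent measures concrete, the sketch below shows one way accuracy and mean RT could be derived from trial-level responses. It is an illustration only: the field names (`target_position`, `response`, `rt_ms`), the treatment of timed-out trials as incorrect, and the restriction of mean RT to correct responses are assumptions made here, not procedures reported in the study.

```python
from statistics import mean

RESPONSE_LIMIT_MS = 3000  # response window stated in the text


def score(trials):
    """Compute accuracy and mean RT for a list of trial dictionaries.

    Assumed fields per trial (hypothetical names):
      target_position: 1 or 2, the button matching the target rendition
      response:        1, 2, or None if no button was pressed in time
      rt_ms:           reaction time in milliseconds, or None on timeout
    """
    answered = [
        t for t in trials
        if t["response"] is not None and t["rt_ms"] <= RESPONSE_LIMIT_MS
    ]
    correct = [t for t in answered if t["response"] == t["target_position"]]
    # Assumption: timeouts count as incorrect, so the denominator is all trials.
    accuracy = len(correct) / len(trials) if trials else float("nan")
    # Assumption: mean RT is computed over correct responses only.
    mean_rt = mean(t["rt_ms"] for t in correct) if correct else None
    return {
        "accuracy": accuracy,
        "mean_rt_ms": mean_rt,
        "n_timeouts": len(trials) - len(answered),
    }


if __name__ == "__main__":
    demo = [
        {"target_position": 1, "response": 1, "rt_ms": 800},
        {"target_position": 2, "response": 1, "rt_ms": 1200},
        {"target_position": 2, "response": 2, "rt_ms": 1000},
        {"target_position": 1, "response": None, "rt_ms": None},
    ]
    print(score(demo))
```

Other choices, such as excluding timed-out trials from the accuracy denominator or analysing RTs for all answered trials, would be equally easy to implement by changing the two marked lines.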
The stimuli were recorded by three female native speakers of the respective languages, who were fluent, advanced speakers of the other two languages in the triad. The stimuli were produced naturalistically to avoid artificial concatenation. To ensure naturalness, several recordings of the same items were made, and those in which the accented manipulation sounded the most acceptable to the researchers were selected. The process of stimulus validation was based on the perceptual assessment of each stimulus by native speakers of the respective languages. We adopted a perceptual ‘category goodness’ criterion, which was deemed to have the best ecological validity given the nature of the FC goodness task administered to the participants.
As far as the three speakers who produced the stimuli are concerned, their stays in a foreign country ranged from a few months to a few years. While we acknowledge that their L1 production could be affected by their high proficiency in the L2/Ln, it is debatable whether a prototypical monolingual rendition should be sought as the target production of the stimuli, in light of recent discussions of the native monolingual norm in research on multilingual acquisition (see e.g., Sorace 2020; Kroll 2020). Moreover, monolingual speakers of German, Polish, or English are increasingly difficult to find. Therefore, it was not our goal to search for a native monolingual rendition of the target items, but rather to allow for the potential variation represented by native speakers of the particular languages who are multilingual themselves.