Perception and Production of Sentence Types by Inuktitut-English Bilinguals

Abstract: We explore the perception and production of English statements, absolute yes-no questions, and declarative questions by Inuktitut-English sequential bilinguals. Inuktitut does not mark stress, and intonation is used as a cue for phrasing, while statements and questions are morphologically marked by a suffix added to the verbal root. Conversely, English absolute questions are both prosodically and syntactically marked, whereas the difference between statements and declarative questions is prosodic. To determine the degree of crosslinguistic influence (CLI) and whether CLI is more prevalent in tasks that require access to contextual information, bilinguals and controls performed three perception and two production tasks, with varying degrees of context. Results showed that bilinguals did not differ from controls in their perception of low-pass filtered utterances but diverged in contextualized tasks. In production, bilinguals, as opposed to controls, displayed a reduced use of pitch in the first pitch accent. In a discourse-completion task, they also diverged from controls in the number of non-target-like realizations, particularly in declarative question contexts. These findings demonstrate patterns of prosodic and morphosyntactic CLI and highlight the importance of incorporating contextual information in prosodic studies. Moreover, we show that the absence of tonal variations can be transferred in a stable language contact situation. Finally, the results indicate that comprehension may be hindered for this group of bilinguals when sentence type is not redundantly marked.


Introduction
Post-lexical or intonational uses of pitch can be transferred from one language to another in a contact situation (Queen 2001, 2012; Colantoni and Gurlekian 2004), which raises the question of whether the absence of pitch movement is also susceptible to crosslinguistic influence (CLI). To tackle this question, it is crucial to find a contact situation in which one of the languages has a restricted use of pitch. Thus, in this paper we direct our attention to Inuktitut, an Eskimo-Aleut language spoken in Eastern Canada, 1 which has been in contact with English since the 16th century (Dorais 2010), and which has been described as having no stress (Fortescue 1983; Shokeir 2009; Arnhold et al. Forthcoming) and a very limited use of intonation (Massenet 1980; Fortescue 1983; Shokeir 2009). We focus on the perception, interpretation, and production of the three sentence types listed in examples (1)-(3).
Instrumental and experimental descriptions of Inuktitut prosody are not abundant, but the existing ones clearly suggest that Inuktitut is a language that does not mark lexical stress (Fortescue 1983; Shokeir 2009; Arnhold et al. Forthcoming), and that tonal variations are restricted to the end of the utterance (Massenet 1980; Fortescue 1983; Shokeir 2009; see also Thalbitzer 1904, p. 141). Massenet (1980) analyzes the variety spoken in Resolute Bay and concludes that declaratives have a rise, associated with the penultimate syllable, followed by a fall. Absolute yes-no questions are signaled by a rise associated with the antepenultimate syllable, which is followed by a fall in the penultimate and a rise in the final syllable (HLH contour). Questions are also marked by vowel lengthening. Fortescue's (1983) overview of twelve Eskimo varieties shows that dialects differ in their rhythmic patterns (syllable vs. mora time), in the syllable to which the tonal movement is associated, and in whether interrogatives end with a fall or with a rise. Of the varieties surveyed in his study, the two closest to the variety analyzed here are characterized by a fall in declaratives, and either a sustained pitch or a sharp rise in interrogatives, which are also signaled by vowel lengthening (i.e., a lengthening of the final vowel, as illustrated in (6)). Shokeir's (2009) autosegmental-metrical analysis of multiple narratives produced by Inuktitut speakers confirms, to a large extent, the conclusions of previous work. First, she showed that tonal movements are restricted to the last two syllables in the utterance. Second, rising contours (LH) have the basic meaning of continuation and can be used to hold a turn (Figure 1, top). Third, falling contours (HL) have the basic cross-linguistic meaning of finality and can thus signal the end of a turn (Figure 1, bottom). Fourth, rising contours may be found in interrogatives, but she concludes that the most consistent acoustic correlate of interrogative utterances is vowel lengthening. Finally, she highlighted the fact that Inuktitut does not show the declination patterns characteristic of most of the world's languages, and this is also illustrated in Figure 1. Given these prosodic characteristics, Inuktitut could be classified as an edge-prominence language (Jun 2014; see Arnhold 2014 for West Greenlandic), in which tonal events are located at the end of a domain.
Although the systematic differences between the languages at the prosodic and morphosyntactic level may hinder any type of prosodic convergence, Inuktitut and English have been in contact since the 1500s (Dorais 2010, chp. 5), and thus, sociolinguistic conditions lead us to hypothesize that Inuktitut prosodic features may be transferred to English. 2,3 Indeed, most of the population of Nunavut, where our participants are from, is bilingual (Allen 2007; Dorais 2010; Statistics Canada 2019). There is still a large percentage of speakers who claim Inuktitut as their first language, and this percentage is higher than for any other aboriginal language in Canada (Allen 2007). Moreover, a series of political decisions, such as the creation of Nunavut in 1999, the Official Languages Act (1988) and the Inuit Language Protection Act (2008), have resulted in the promotion of positive attitudes towards the language (Dorais 2010). 4 Education has also played a role in language maintenance. The introduction of education in Inuktitut up to Grade 2 or 4 has allowed children to develop their writing skills in their first language, whereas the absence of a comprehensive curriculum in Inuktitut (Aylward 2010; Dorais 2010) yields an increasing use of English in later grades. As is the case in most bilingual communities, though, there is a large degree of individual variation in terms of proficiency and language use. Patterns of language use not only vary across the Arctic region (Dorais 2010, pp. 226-7), but depend on the specific demographic conditions of each individual, as we will see when we discuss the participants' profiles (Table 1).

Figure 1. Realization of intonational contours in Inuktitut. Top: Rising contour to hold a turn; female speaker from Baker Lake (Shokeir 2009, p. 21). Bottom: Falling contour to indicate the end of a turn; female speaker from Iqaluit (Shokeir 2009, p. 24).
Thus, if there is an influence from Inuktitut into English, we expect to observe an overall difference between bilinguals and English monolinguals. These differences are predicted to be larger in the perception, interpretation, and production of tonal movements at the beginning rather than at the end of the utterance (see also Section 3). Since previous research on bilingual intonation has suggested that differences between monolinguals and bilinguals are modulated by the type of task (Grabe et al. 2003; Ortega-Llebaria and Colantoni 2014), our secondary goal is to analyze whether group differences are smaller in tasks that tap auditory rather than contextualized perception, and in imitation rather than in contextualized production tasks. In the next section, we review the literature on the perception and production of sentence types in bilinguals. This is followed by our research questions and hypotheses in Section 3, and our methodology in Section 4. In Section 5, we summarize our results, first for perception and then for production, and then we compare the perception-production results. We discuss our findings in Section 6, and briefly conclude in Section 7.

Cross-Linguistic Influence and Sentence Types
Regarding intonation, more is known about production than perception in language contact situations. This represents a striking contrast with the Second Language Acquisition literature, where theoretical models derive their primitives from perception (e.g., Flege 1995; Flege and Bohn 2021; Best 1995; Best and Tyler 2007). Different scenarios have been studied, which include situations of stable social bilingualism, migratory languages, and heritage speakers. In addition, studies have factored in language typology (contact between typologically similar and different languages), as well as the possibility of bidirectional influence (Delais-Roussarie et al. 2015) and language attrition. The overall picture suggests that intonation is permeable to the influence of language contact, with the possibility of one (Muntendam and Torreira 2016) or both languages (Mennen 2004; Delais-Roussarie et al. 2015; Queen 2012; Dehé 2018) being affected. Some prosodic structures are more susceptible than others (Delais-Roussarie et al. 2015), and some positions in the contour are also more prone than others to being affected by language contact.
Languages 2022, 7, 193
Studies involving the perception of sentence types have concentrated on the role of cross-linguistic influence (CLI) and have mostly focused on English, either as the L1 or L2. Beginning with studies that investigated the perception of a foreign language, there is evidence that one's L1 influences foreign language perception of sentence types. For example, Cruz-Ferreira (1983) tested Portuguese and English speakers on their perception of statements (Ss) and declarative questions (DQs) in English and Portuguese, respectively. She found differences in the identification of sentence types, particularly in those that were characterized by a low-rise, showing that sentence type identification in a foreign language is influenced by the L1. Similarly, Liu and Rodríguez (2012) looked at the identification and discrimination of final contours in English statements and yes/no questions by monolingual English and Chinese speakers. They found that the groups differed in the processing of contours, since in Chinese there is an interaction of intonation and lexical tone.
Closer to our study are investigations that have analyzed the role of CLI in the perception of sentence types in early and late bilinguals. These studies have used a variety of methodologies, such as gating paradigms (Marasco 2020), identification tasks (Patience et al. 2020) or imitation of resynthesized stimuli (Zárate-Sández 2015) and have yielded mixed results. L1 English-L2 Spanish speakers were the focus of two studies (Zárate-Sández 2015; Marasco 2020). Marasco (2020) investigated the perception of initial boundary tones and prenuclear peaks in advanced learners, whereas Zárate-Sández (2015) analyzed the perception of prenuclear accents and final boundary tones in learners of different proficiencies (beginners and advanced), as well as heritage speakers. While Marasco (2020) found no evidence of CLI in the perception of prenuclear accents (i.e., controls and learners were equally accurate at distinguishing statements from yes-no questions), Zárate-Sández (2015) reported group differences in the perception of prenuclear accents, but not in boundary tones. Beginner learners were not able to detect alignment differences that corresponded to broad and narrow focus patterns, as expected from CLI, while all the other groups did. However, heritage speakers and advanced learners shifted between the two categories at an earlier point than native speakers. Radu et al. (2018) explored the identification and comprehension of statements, yes-no questions and declarative questions in L1 Spanish-L2 English advanced learners and found no evidence of CLI in the perception of either low-pass filtered or isolated stimuli. CLI was restricted to the interpretation of the different question types (see Section 2.2). Finally, Patience et al. (2020) found that L1 Mandarin speakers could use intonation to distinguish questions from statements in a low-pass filtered task; however, when the utterances were presented in isolation (not low-pass filtered), the L1 Mandarin speakers had difficulty distinguishing between Ss and DQs, suggesting that they paid more attention to the syntax than to the intonation. The authors interpreted this as evidence of CLI, given that Mandarin yes-no questions are marked more reliably with syntax than with prosody. Some potential evidence for positive CLI was also observed in the L1 Mandarin-L2 English learners. When utterances were presented following a context that prompted an absolute question (AQ), a DQ, or an S, the L1 Mandarin speakers performed similarly to the L1 English controls, and outperformed L1 Spanish speakers. The authors attributed this to positive transfer, given that an AQ-DQ pragmatic contrast is marked prosodically in Mandarin, but not in Spanish.
In summary, perception studies offer mixed results, but suggest that group differences are larger with prenuclear accents than with boundary tones. Moreover, the findings reveal that CLI related to the syntax and pragmatics of sentence types may play a more influential role than prosody, although this is dependent on the L1 of the L2 learners.
Production studies are not only more abundant, but they have also investigated a wider range of language pairings, including many typologically different languages. Once again, the evidence supports CLI, but different outcomes have been reported, such as convergence (Colantoni and Gurlekian 2004; Barnes and Michnowicz 2015), hybrid (Lai 2018) or mixed patterns (Queen 2001, 2012), overgeneralizations and hypercorrections (Santiago and Delais-Roussarie 2012). Moreover, changes due to CLI have been documented for different parts of the utterance. For example, changes in the alignment of high tones in prenuclear accents have been reported for Spanish declaratives in contact with Italian (Colantoni and Gurlekian 2004; Barnes and Michnowicz 2015) and for Spanish and English declaratives in English-Spanish bilinguals (Zárate-Sández 2015). Nuclear contours have been reported to display patterns of convergence in declaratives in Spanish in contact with Italian (Colantoni and Gurlekian 2004), Spanish in contact with Catalan (Simonet 2011) and Yami in contact with Mandarin (Lai 2018). Moreover, nuclear contours in interrogatives may be even more susceptible to change than nuclear contours in declaratives, as shown by Alvord (2007), who looked at the realization of final contours in declaratives and polar interrogatives in the Spanish of three generations of Cuban Spanish-English bilinguals. In addition to convergence studies, others have observed the emergence of mixed patterns, specifically the use of the same contour in both languages, albeit with a different pragmatic distribution (Queen 2001, 2012). Crucially for our study, there is evidence of CLI from substratum indigenous languages, such as Quechua, into Indo-European languages, such as Spanish. Similar to Inuktitut, Quechua is an agglutinative language that uses morphemes to mark sentence type (e.g., Cerrón Palomino 1988; O'Rourke 2009) or information structure (Sánchez 2008).
Evidence of influence from Quechua into Spanish has been reported in peak alignment patterns in prenuclear accents in broad and narrow focus declaratives (O'Rourke 2012), as well as in the lack of use of f0 cues to mark narrow (O'Rourke 2012) or contrastive focus (Muntendam and Torreira 2016). Overall, these studies show that Quechua-Spanish bilinguals use different peak alignment patterns and have a restricted use of pitch in Spanish when compared to monolingual controls. Colantoni and Sánchez (2021) suggest that this is a result of different patterns of module interactions across languages, according to which languages that have a rich morphological layer tend to have a restricted use of pitch to mark sentence types or information structure. If so, we should expect to see a more restricted use of intonation in the English spoken by L1 Inuktitut speakers.

Role of Task Type in Modulating CLI in Contact Situations
Bilinguals have been reported to perform differently across tasks, particularly when tasks are not culturally appropriate in one of the languages (e.g., Sánchez 2008; Kiser 2014) or demand the use of skills that bilinguals may not have in one language (e.g., reading in the language in which they have not been educated; Tsimpli 2014). These task effects have also been observed in intonation studies (Barnes and Michnowicz 2015; Colantoni et al. 2016). These studies, however, did not explore whether access to contextual information was responsible for the differences observed. Studies that have examined the role of access to contextual information have yielded consistent results both in perception (Grabe et al. 2003; Radu et al. 2018; Patience et al. 2020) and perception-production (Ortega-Llebaria and Colantoni 2014). Grabe et al.'s (2003) pioneering study showed that three groups of English, Mandarin and Peninsular Spanish speakers did not differ in their discrimination of falling and rising contours in non-speech stimuli (i.e., frequency modulated sine-waves), but differed in their perception of falling contours when they listened to the utterance Melanie Maloney (whose final syllable was manipulated to generate 11 different stimuli with rising and falling contours) produced by a Scottish English speaker. Ortega-Llebaria and Colantoni (2014) studied the effect of access to meaning in the perception and production of English corrective stress by English controls and two groups of L2 speakers (L1 Mandarin and L1 Spanish). Once again, learners diverged more from controls, in both perception and production, in tasks in which they either answered or produced utterances appropriate to a context. Interestingly, the contextual effect was modulated by CLI. Spanish participants were outperformed by Mandarin learners, which is expected given that Mandarin resembles English more closely than Spanish in the prosodic marking of corrective stress.
Radu et al. (2018) analyzed the perception and interpretation of Ss, AQs and DQs by L1 Spanish-L2 English speakers, using a variety of tasks. They found that learners did not differ from controls in tasks that tapped auditory processing, but they diverged from controls in contextualized tasks in which they had to choose the sentence type that appropriately completed a given context. Patience et al. (2020), which was based on the same methodology as Radu et al. (2018), found that Mandarin speakers behaved similarly to controls, outperforming L1 Spanish speakers. As mentioned, the relative success of the L1 Mandarin speakers was attributed to positive CLI, given that Mandarin also contrasts prosodically between AQs and DQs. Note that these results mirror the findings in Ortega-Llebaria and Colantoni (2014), given that they also found that the Mandarin speakers outperformed the Spanish speakers, due to similarities in the prosody of the structure under examination.
In summary, previous work investigating the role of task has found that CLI (including positive CLI) is more prevalent as access to contextual meaning increases. As a result, we expect to find the same results in the speakers of the present study. We outline our specific hypotheses in the next section.

Research Questions and Predictions
Based on the research on the role of CLI and access to contextual information in the perception and production of intonation reviewed in the previous section, we formulate two research questions followed by the corresponding predictions.
Is there evidence of CLI in the bilinguals' perception and production of sentence types?
Based on previous descriptions of Inuktitut, which revealed that tonal movement is mostly restricted to nuclear position and rising and falling contours are not strictly associated with sentence types, we predict that bilinguals will be less accurate than English monolinguals at identifying statements and questions, when exposed to low-pass filtered stimuli or when the syntactic structure is identical (i.e., Ss vs. DQs). In production, the experimental group should differ from the control group to a larger extent in prenuclear than in nuclear position since there is little tonal movement in the L1 of the former group. Finally, bilinguals are expected to produce a larger number of rising contours in Ss than controls, given that final rises are not strictly associated with questions in their L1.

Is CLI modulated by task type?
Based on previous research (see Section 2.2), we expect to see larger between-group differences in more contextualized as opposed to more controlled tasks. In perception, bilinguals are expected to have difficulty identifying the appropriate context for AQs and DQs. In production, we expect bilinguals to resemble controls' pitch patterns more closely in controlled tasks than in contextualized tasks. We also expect bilinguals to have difficulty choosing the appropriate question type in the contextualized task (i.e., variable production of AQs in DQ-prompting contexts), given that sentence type is encoded in the morphology rather than in the syntax in their L1.

Participants
The study includes 16 English controls (12 females, 4 males) with a mean age of 24 (range: 18-30). All controls were born and raised in Canada and were studying or had completed a university degree. The bilingual group includes 13 participants (10 females, 3 males). Table 1 summarizes the bilingual participants' profiles. At the start of the testing session, participants completed a background questionnaire detailing several aspects of their language experience and abilities. All bilingual speakers were exposed to Eastern Canadian Inuktitut at home where either one (3/13) or both parents spoke Inuktitut. They constitute a fairly homogeneous dialectal group (Dorais 2010, p. 19), but they represent different "speech areas" (Dorais 2010) within this dialect (North and South Baffin: N = 9; Nunatsiavut: N = 3; Aivilik: N = 1). This means that all participants were born and raised in areas in which Inuktitut is currently in contact with English (rather than with French). As mentioned in the Introduction, most participants (N = 9) came from an area where bilingualism has been expanding since the 1970s, but where Inuktitut is still both an official language and the language of the home.
It is important to highlight that the groups exhibit several differences regarding their education (only one bilingual participant completed a university degree) and mean age (bilinguals are older than controls). The bilingual group also exhibits some variability in terms of the age of onset of English acquisition. In the sample (Table 1), we have three simultaneous bilinguals and one participant who was exposed to English before entering school, whereas the rest were exposed to English upon entering the school system or slightly later. The amount of English used daily also varies. Table 1 presents the mean proportion of English use by speaker, which is the result of averaging the proportion of English used at home, at work, in school and in social situations. Although the mean proportion of English use is 61%, there is a wide range, with some participants reporting that they use English only 25% of the time and others using English almost exclusively. Given the variability in our sample, in addition to the group results, we will present individual results both for perception and production.
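As an illustration of the averaging described above, the per-speaker proportion of English use can be sketched as the unweighted mean of the four reported contexts. This is a minimal sketch, not the procedure used to build Table 1: the function name and the example values are hypothetical, and equal weighting of the four contexts is an assumption.

```python
from statistics import mean


def mean_english_use(home, work, school, social):
    """Average the proportion of English use (0 to 1) over four contexts.

    Assumption: the four contexts are weighted equally.
    """
    proportions = [home, work, school, social]
    for p in proportions:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each proportion must lie between 0 and 1")
    return mean(proportions)


# Hypothetical participant; these values are invented for illustration only.
example = mean_english_use(home=0.25, work=0.75, school=1.0, social=0.5)
```

A speaker who uses little English at home can thus still obtain a mean above 50% if English dominates at work and in school, which is one reason the individual profiles are worth reporting alongside the group mean.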

Materials
The data reported here include perception and production experiments, and within each category, we developed tasks that manipulated the degree of access to contextual information, ranging from no access to contextual information to perceiving and producing utterances appropriate to a context (see Table 2 for a summary).
The perception component of the experiment included three tasks. In the first task, or intonation only task (IO), participants heard a low-pass filtered stimulus out of context. In the second task, participants heard isolated unaltered utterances that contained segmental and intonation information (SI task). In the third task, participants heard a scenario followed by three utterances, only one of which was appropriate to the context (C task). Our production experiment included two tasks that also varied according to the amount of contextual information. In the first task, participants heard an utterance in isolation and were asked to repeat it (Sentence Imitation task, SI). In the second task, they heard the same scenarios used in the perception task, but, this time, participants were asked to produce an utterance appropriate to the context (C task).
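To make concrete why a low-pass filtered stimulus retains the intonation contour while obscuring segmental content, the following sketch applies a simple first-order (one-pole) low-pass filter to a list of samples. It is illustrative only: the filter type and the 400 Hz cutoff are assumptions that are not stated in the text, and a filter used to prepare real stimuli would be considerably steeper than this one.

```python
import math


def one_pole_lowpass(samples, sample_rate_hz, cutoff_hz):
    """First-order low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    filtered = []
    previous = 0.0
    for x in samples:
        previous = previous + alpha * (x - previous)
        filtered.append(previous)
    return filtered


def rms(samples):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sine(freq_hz, sample_rate_hz, duration_s=0.5):
    """Generate a unit-amplitude sine wave as a list of floats."""
    n = int(sample_rate_hz * duration_s)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


SAMPLE_RATE_HZ = 22000  # sampling rate reported for the stimuli
CUTOFF_HZ = 400.0       # hypothetical cutoff, chosen for illustration only

# A 100 Hz tone stands in for f0; a 3000 Hz tone stands in for segmental detail.
low = sine(100.0, SAMPLE_RATE_HZ)
high = sine(3000.0, SAMPLE_RATE_HZ)
low_ratio = rms(one_pole_lowpass(low, SAMPLE_RATE_HZ, CUTOFF_HZ)) / rms(low)
high_ratio = rms(one_pole_lowpass(high, SAMPLE_RATE_HZ, CUTOFF_HZ)) / rms(high)
```

In this toy case the low-frequency component passes almost unchanged while the high-frequency component is strongly attenuated, which is the property the IO task relies on.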
The stimuli used for the IO and SI perception tasks, and the SI production task consisted of 10 utterances for each sentence type (AQ, DQ, S) and 25 distractors, which included Wh-questions and exclamations. The stimuli were recorded by a Canadian female speaker using a Marantz solid-state recorder PMD-661 and a unidirectional lavaliere microphone. The stimuli were digitized at a sampling rate of 22,000 Hz with 16-bit resolution. All of the stimuli were checked for naturalness and potential reading errors by all authors.
The stimuli in the C task consisted of six scenarios, as in (9), per sentence type and no distractors. 5 These scenarios were selected from a larger set of scenarios piloted, given that they prompted appropriate responses in monolingual and L2 speakers of English alike. In the perception task, after hearing the scenario, participants heard three utterances only one of which was appropriate to the context. In the production component, participants had to produce a phrase appropriate to the context. Materials for the contextualized tasks were recorded by the same Canadian female speaker who recorded the other stimuli using the same equipment described above.
(9) C task
Context (S):
Mary is on vacation in Toronto and really wants to see a raccoon. One of her friends knows of a place with a bunch of trees where raccoons live and takes Mary there to see if she can finally see one. Soon after they arrive, a raccoon shows up and Mary's friend says, "Look . . . "
(a) This is a raccoon.
(b) This is a raccoon?
(c) Is this a raccoon?

Context (DQ):
Before coming to Toronto from Australia, Mary heard about raccoons, looked at some pictures and thought they were cute little things. One evening, she was eating outside with friends and saw a mid-sized animal crossing the street and thought it was a dog. Her friends commented that it was a raccoon, and she asked . . .
(a) This is a raccoon.
(b) This is a raccoon?
(c) Is this a raccoon?

Context (AQ):
Mary is from Australia, and she has never seen a raccoon in her life. When she got to Toronto, she spent hours in the evening trying to spot one. One evening, she is sitting outside with a bunch of friends and she sees something that she believes may be a raccoon. She points at the animal and asks . . .
(a) This is a raccoon.
(b) This is a raccoon?
(c) Is this a raccoon?
The stimuli used were acoustically analyzed to determine whether the target sentence types were produced with the intended characteristics. Table 3 summarizes the acoustic characteristics of the target stimuli used in the IO and SI perception tasks, as well as in the SI production task.
Table 3. Acoustic analysis of the perception (SI and IO tasks) and production stimuli (SI task). Mean max F0 values in the first pitch accent and the nuclear contour, and F0 excursion in the first pitch accent and nuclear contour (values in semitones).
The stimuli for the three sentence types used in these decontextualized tasks clearly differed in the realization of the nuclear contour (pitch excursion: DQ > AQ > S) and partially differed in the realization of the first pitch accent, which had a larger pitch excursion in DQs than in the other sentence types. Most importantly, the prosodic characteristics of the stimuli used are consistent with those reported in previous descriptions of American English (e.g., Bartels 1999). 6 Finally, Table 4 displays the characteristics of the stimuli used in the C task (Perception only). Once again, the three sentence types differed in the degree of pitch change in the nuclear contour (DQ > AQ > S), although the pitch excursion was smaller in this task than in the others. Similar F0 maximum values were obtained for the first pitch accent in the three sentence types, but here, as opposed to the other tasks, the largest pitch excursion was produced in Ss.
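For readers less familiar with the semitone scale, the pitch excursion reported here can be sketched as the difference between the F0 maximum and the F0 minimum once both are expressed in semitones. The snippet below is a minimal illustration using the standard conversion 12 · log2(f / ref); the function names and the 100 Hz reference are assumptions, since the reference value is not stated in the text.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz.

    Assumption: ref_hz = 100 Hz is an arbitrary illustrative reference.
    """
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)


def pitch_excursion_st(f0_max_hz, f0_min_hz, ref_hz=100.0):
    """Pitch excursion in semitones: F0 maximum minus F0 minimum."""
    return hz_to_semitones(f0_max_hz, ref_hz) - hz_to_semitones(f0_min_hz, ref_hz)
```

Because the excursion is a difference of two logarithms, the reference cancels out: a rise from 110 Hz to 220 Hz is 12 semitones whatever reference is chosen, which makes excursions comparable across speakers with different pitch ranges.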

Procedure and Data Analysis
The perception and production tasks reported in this paper are part of a larger project in which we analyzed other structures (e.g., attachment ambiguity) and included additional L1 groups (Spanish and Mandarin). Thus, we had two testing sessions which were one week apart. Perception and production components of each task were divided into the two testing sessions and participants were randomly assigned to start either with the perception or the production component.
The perception tasks were administered using SuperLab Pro. In the IO and SI tasks, participants listened to the stimulus, and then pressed one of the three colored keys on the keypad corresponding to Statement, Question or Exclamation. This last response was included since DQs could be interpreted as exclamations and there were exclamations among the distractors. In the C task, participants listened to the scenario and then heard three possible options that would complete the scenario, either a statement, a DQ or an AQ.
Participants listened to each stimulus only once. After having heard the last option, they had to press one of the three keys on the keypad. Before testing began, we included a short practice session.
The production portion of the experiment was administered via PowerPoint, and responses were recorded with the same equipment used to prepare the stimuli and analyzed with Praat (Boersma and Weenink 2017). In the SI task, participants listened to a stimulus and were asked to repeat it. In the C task, participants listened to the scenario and then had to produce an utterance that would complete each scenario. In both cases, participants were allowed to listen to the stimulus more than once. In all cases, practice sessions were introduced at the beginning of each task.
Perception data were analyzed for accuracy. In the production data (Figure 2), we identified the first pitch accent and the nuclear contour (i.e., last pitch accent and boundary tone). We labeled each tonal event using the ToBI system (Beckman and Ayers Elam 1997) and measured the maximum and minimum f0 (in semitones) associated with each tonal event. We then calculated the pitch change (i.e., the f0 maximum minus the f0 minimum) over the first pitch accent and the nuclear contour. Labeling was conducted by one of the authors and then checked by a second author. Statistics were calculated with R Studio Team (2015). We used a combination of linear mixed effects models and binomial mixed effect models, with treatment coding contrasts for our categorical variables. In all of the statistical analyses, for the sentence type variable, "AQ" was the reference level; for language, "English" was the reference level; and for task, the reference level was the C task. The values displayed in our statistical tables therefore reflect a comparison of the listed value with the reference level. We will provide details about the specific models in each of the results sections.
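The pitch-change measure described above (f0 maximum minus f0 minimum, in semitones) can be illustrated with a minimal sketch. It assumes that f0 values are available in Hz and converts them to semitones relative to an arbitrary 100 Hz reference; the function names, the reference value and the example numbers are hypothetical and are not taken from the study's own scripts.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert an f0 value in Hz to semitones relative to ref_hz.

    ref_hz is an assumed reference; it cancels out when two values
    are subtracted, so it does not affect the pitch change itself.
    """
    return 12.0 * math.log2(f0_hz / ref_hz)


def pitch_change_st(f0_max_hz, f0_min_hz):
    """Pitch change over a tonal event: f0 maximum minus f0 minimum (semitones)."""
    return hz_to_semitones(f0_max_hz) - hz_to_semitones(f0_min_hz)


# Hypothetical values, not data from the study.
rise = pitch_change_st(200.0, 100.0)   # one octave
flat = pitch_change_st(150.0, 150.0)   # no pitch movement
print(round(rise, 2), round(flat, 2))
```

Under this definition, a value close to 0 corresponds to the absence of pitch movement over the tonal event, whereas a rising or falling movement yields a positive difference whose size reflects the excursion.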

Perception
Table 5 displays the mean accuracy by task and shows that bilinguals had a lower proportion of accurate answers across tasks and sentence types than controls. However, except for DQs in the C task, responses were always above chance. To determine whether group differences were statistically significant and to understand whether such differences were larger in contextualized than in de-contextualized tasks, we fitted a generalized binomial mixed effects model with accuracy (Accurate; Nonaccurate) as the dependent variable, Task (C, SI, IO), Sentence Type (AQ, DQ, S) and Language (English, Inuktitut) as fixed factors, and Participant and Item as random factors (random intercepts). We also tested models with two and three-way interactions. Model comparisons using the AIC criterion revealed that the model which best fitted the data (i.e., the one with the lowest AIC value = 1805.4) was the one that included all the fixed factors and a three-way interaction. Results of this model are reported in Table 6, confirming that the number of non-accurate responses was significantly higher in the experimental than in the control group. Bilinguals also were less accurate in DQs when compared to AQs, but as expected, the non-accurate responses with DQs were lower in the SI task than in the C task. Results of post-hoc pairwise Tukey-adjusted comparisons revealed that, in the IO task, there were no significant between-group differences for any of the sentence types tested. In the SI task, instead, bilinguals were less accurate than controls in DQs (ß = −2.64; SE = 0.38; z ratio = −4.00; p = 0.007) and in Ss (ß = −2.37; SE = 0.66; z ratio = −3.56; p = 0.03). Controls also were less accurate with DQs than with AQs (ß = −2.11; SE = 0.58; z ratio = −3.62; p = 0.3). 
Finally, in the C task, bilinguals displayed a higher number of non-accurate responses than controls in DQ-prompting contexts (ß = −2.83; SE = 0.51; z ratio = −5.50; p < 0.0001); their accuracy was also lower in this context than with AQ- (ß = −2.81; SE = 0.49; z ratio = −5.67; p < 0.0001) and S-prompting contexts (ß = 1.64; SE = 0.44; z ratio = 3.68; p = 0.02). No within-group differences were found in the control group.
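The model selection step reported above (choosing the candidate model with the lowest AIC) can be sketched as follows. This is only an illustration of the criterion, AIC = 2k − 2 ln L, under assumed inputs: the model names, log-likelihoods and parameter counts are hypothetical placeholders and are not the values obtained in the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(candidates):
    """Return (name, AIC) of the candidate with the lowest AIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical log-likelihoods and parameter counts, for illustration only.
candidates = {
    "main_effects": (-920.0, 7),
    "two_way_interactions": (-900.0, 15),
    "three_way_interaction": (-880.0, 20),
}
print(best_by_aic(candidates))
```

Because AIC penalizes each additional parameter, the model with the three-way interaction is preferred only when its gain in likelihood outweighs the extra terms, which is the logic behind the comparison described in the text.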
An analysis of the response patterns (Figure 3), particularly in the C task, revealed that bilinguals differed from controls in their responses to DQ- and S-prompting contexts. As concerns the former, bilinguals were twice as likely as controls (33% vs. 15%, respectively) to choose AQ as a possible answer, although DQ was still the most frequently chosen response (59%). The proportion of non-target-like responses was smaller in the S- than in the DQ-prompting contexts (33% vs. 41%, respectively), and DQs and AQs were chosen as a response at a similar rate (15% and 18%, respectively). Thus, we can partially answer our first research question; namely, bilinguals differed from controls in their identification of sentence types, displaying a higher number of non-accurate responses than controls across tasks, particularly in the C task (see RQ2).
The results reported above reflect the behavior of both groups, but bilinguals have diverse language histories and their behavior is highly variable, so it is crucial to explore to what extent individuals mirror the group behavior. As seen in Figure 4 (see also Table 5), 10/13 speakers displayed accuracy values that were within one SD from the mean. Two speakers (I04 and I12) were above that threshold and one participant (I11) was clearly below one SD from the mean. Demographic variables may account in part for these results; I04 was a simultaneous bilingual, with college education who used both languages in equal proportions. I12 was the same age as I04, and, although she was exposed to English when she entered the school system, she reported using English most of the time. I11's behavior, however, is difficult to explain with the information available, since his language learning profile was similar to I12's and he was the participant who reported using English the most (Table 1). In the next section, we will compare these findings to the production results to better understand if his lower accuracy is a consequence of the perception tasks used, or if it reflects his overall performance.
Languages 2022, 7, x FOR PEER REVIEW 15 of 28
Figure 4. Percentage of accurate responses in all tasks combined by Inuktitut-English bilinguals.
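The individual analysis above groups participants according to whether their accuracy falls within one SD of the group mean. A minimal sketch of such a classification is given below. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and it uses invented participant labels and accuracy values rather than the study's data.

```python
from statistics import mean, stdev


def classify_by_sd(scores):
    """Label each participant as 'above', 'below' or 'within' one SD of the mean.

    scores maps a participant label to an accuracy percentage.
    The sample SD (n - 1 denominator) is an assumption of this sketch.
    """
    values = list(scores.values())
    m, s = mean(values), stdev(values)
    labels = {}
    for participant, value in scores.items():
        if value > m + s:
            labels[participant] = "above"
        elif value < m - s:
            labels[participant] = "below"
        else:
            labels[participant] = "within"
    return labels


# Invented accuracy percentages, for illustration only.
example = {"P1": 60, "P2": 62, "P3": 58, "P4": 90, "P5": 30}
print(classify_by_sd(example))
```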

Accuracy
Before discussing pitch changes in pitch accents and nuclear contours, it is important to analyze the response accuracy, particularly in the C task, which allowed for open answers. We focus here on these results, which are displayed in Table 7, given that there were no repetition errors in the SI task. We treated any utterance that was not consistent with the contextual prompt as a non-target realization. For example, the use of a Wh-question or an inverted question in a context that prompted a DQ was treated as non-accurate, as was the use of a question in a context that was intended to prompt a statement.  Table 7 reveals that bilinguals were overall less accurate than controls, particularly in DQ-prompting contexts, where most of the non-target responses (86%) involved the production of an AQ. Results of a binomial mixed-effects model with Response (Accurate, Non-accurate) as the dependent variable, Language and Sentence Type as independent variables, and Participant and Item as random factors revealed that bilinguals did not differ from controls as a group (Table 8). 7 Non-target responses were significantly higher in DQ-prompting contexts and post-hoc Tukey pairwise comparisons showed that this was the case for controls (AQ vs. DQ: ß = −1.71; SE = 0.59; z ratio = −2.85; p = 0.04) and for bilinguals (AQ vs. DQ: ß = −1.70; SE = 0.59; z ratio = −2.85; p = 0.04), but no differences were found in DQ accuracy between groups (ß = −0.88; SE = 0.72; z ratio = −1.21; p = 0.82). Accuracy results, however, do not present an overall picture of participants' behavior in this task. Whereas controls failed to produce an utterance appropriate to the context in a very small percentage of cases (AQ: 1%; DQ: 3%; S: 7%), bilinguals produced no responses or one-word responses in a larger proportion of contexts, particularly in DQ-prompting contexts (AQ: 8%; DQ: 26%; S: 15%). 
Individual results ( Figure 5) reveal an interesting pattern; namely, there was a quasi-complementary distribution between non-target-like responses and the absence of response. Indeed, participants with the highest number of accurate responses did not produce utterances that were inappropriate to the context, but failed to produce an answer to some scenarios, whereas participants with the lowest accuracy tended to produce a response in all contexts.
Figure 5. Accurate, non-accurate and no responses in the contextualized production task (bilinguals only). Note: total of contexts = 18.
Accuracy in production was equal (2/13) or higher than in perception for most bilingual participants (8/13), as illustrated in Figure 6. Moreover, all participants performed above chance in production, which was not the case in perception. Interestingly, participants who were exposed to English at home (i.e., I03, I04, I05) were the ones with the most consistent performance in perception and in production. As for the remaining participants, the overall higher accuracy in production may be attributed to the difficulty of the perception task, which tapped into more metalinguistic knowledge than the production task.
Figure 6. Accuracy in perception and production (C task only) by participant.

Phonetic Realization of Pitch Accents and Nuclear Contours
In this section, we analyze the patterns of pitch change in the first pitch accent and in the nuclear contours in both tasks. If there is an influence from Inuktitut into English, we expect to see very little pitch movement at the beginning of the utterance. Recall that we measured the f0 maximum minus the f0 minimum. Thus, if the first accent is a rising accent, we expect a positive difference, and if there is no pitch movement, we expect a value close to 0. Results displayed in Figure 7 suggest that the latter is the case. If we compare the patterns obtained for each group, we see that bilinguals have values that are close to 0 (C task (mean in ST): AQ = 0.8; DQ = 1.2; S = 0.4; SI task (mean in ST): AQ = 1.6; DQ = 1.4; S = 1.1) and that are relatively similar across sentence types and tasks. Controls, instead, showed larger pitch changes in questions than in statements (C task (mean in ST): AQ = 1.4; DQ = 3.4; S = 0.9; SI task (mean in ST): AQ = 4.8; DQ = 4.8; S = 1.9) and the amount of pitch change varied between tasks.
To determine the significance of pitch change in the first pitch accent, we ran a series of linear mixed effect models with pitch change (in semitones) as the dependent variable, Language, Task and Sentence Type as the independent variables, and Participant and Stimulus as random factors. We also tested models with interactions. Here and elsewhere in this subsection, we will report the results of the best model according to the AIC criterion. In all cases, we compared the base model (only random effects) with models including only the independent variables or the independent variables plus the interactions. As for the first pitch accent, model comparisons revealed that the best model was the latter, and its output is reported in Table 9. 8 Results showed a main effect of Sentence Type (larger pitch change in DQs than in other sentence types) and Task (larger pitch change in the SI than in the C task). Interactions between Language, Task and Sentence Type revealed that bilinguals had a smaller pitch change than controls in the SI task, in general, but pitch change was larger in this task in DQs and Ss when compared to those same sentences in the C task. Finally, post-hoc Tukey pairwise comparisons showed that controls had a larger pitch change in questions than in Ss (E,AQ vs. E,S: ß = 2.90; SE = 0.470; df = 49.2; t ratio = 6.18; p < 0.0001; E,DQ vs. E,S: ß = 2.84; SE = 0.47; df = 49.1; t ratio = 6.04; p < 0.0001) in the SI task, and differed between AQs and DQs (ß = −1.84; SE = 0.52; df = 158.9; t ratio = −3.53; p = 0.020) and between DQs and Ss (ß = 2.21; SE = 0.52; df = 143.4; t ratio = 4.18; p = 0.002) in the C task. Bilinguals, instead, showed no significant differences across sentence types in either task. Figures 8 and 9 further show that the group tendencies hold for most of the individuals in the group, since bilinguals' values are closer to 0 and are similar across sentence types. 
It is important to remember, however, that fewer tokens of DQs were analyzed in the bilingual group in the C task because participants either failed to produce an analyzable utterance or produced an utterance that was not expected in that context (see Figure 5).
Figure 7. Boxplots displaying the pitch change (in semitones) over the first pitch accent in both tasks. Results organized by group.
Table 9. (Pitch accents). Linear mixed effect model with Language, Task and Sentence Type as fixed effects and Language*Sentence Type*Task interaction (* p < 0.05; ** p < 0.01; *** p < 0.001).

Figure 8. Pitch change in the first pitch accent (SI task) in each sentence type by participant.
Figure 9. Pitch change in the first pitch accent (C task) in each sentence type by participant.
Individual results revealed that, in both tasks, some participants (e.g., I09) had consistently lower pitch change, whereas other participants (e.g., I06, I07) had consistently larger pitch changes. Other participants had a relatively large pitch change in the SI task, but a small pitch change in the C task (e.g., I08).
We now turn to the analysis of nuclear contours. Figure 10 displays the results obtained in both tasks for bilinguals and controls. Groups appear to resemble each other more closely in the realization of nuclear contours than in the realization of pitch accents (Figure 7).
Figure 10. Boxplots displaying the pitch change (in semitones) over the nuclear contour in both tasks. Results organized by group.
To investigate whether there were any significant differences, we ran a series of linear mixed-effects models following the same procedure described for pitch accents. Once again, the best model (Table 10) was that with the three-way interaction. 9 Results showed the expected difference in the realization of Ss when compared to questions, and as was the case with pitch accents, the task effect was also significant, revealing a larger pitch change in the SI task than in the C task, probably due to imitation. Groups only significantly differed in their realization of the nuclear falls in Ss, with bilinguals showing a smaller pitch change than controls (see Figure 10). Results of post-hoc Tukey pairwise comparisons confirmed that both groups had the same patterns in the realization of nuclear contours; namely, rises in AQs and DQs did not differ significantly between groups and between tasks, whereas questions differed from statements in both tasks.

Summary of Results
Table 11 offers a qualitative summary of our perception and production results.
We begin by returning to our first research question: Is there evidence of CLI in the bilinguals' perception and production of sentence types? We found that, in perception, and as opposed to our prediction, groups did not differ in the IO task, where participants had to identify low-pass filtered stimuli, but did differ in the other two tasks in the direction predicted (i.e., with Ss and DQs). In the SI task, both groups were less accurate with DQs, but bilinguals, as opposed to controls, were also less accurate with Ss. In the C task, however, only bilinguals were less accurate in DQ contexts. This suggests that bilinguals associate meaningless intonation contours with sentence types, as monolinguals do, in patterns that resemble those observed in other studies with different language pairings (e.g., Grabe et al. 2003; Radu et al. 2018). However, when syntactic and contextual information are present, these take precedence over prosody, as expected due to CLI. As we mentioned, in Inuktitut, these sentence types are marked by different morphemes rather than by different intonation contours. In our study, when syntactic information was present (i.e., in AQs), bilinguals were as accurate as controls. However, when syntactic information was not informative (i.e., Ss and DQs), they were less accurate.
Accuracy patterns in production differed from those found in perception. Group differences were not found to be statistically significant, and all bilingual participants performed above chance, which was not the case in perception. However, we found different response patterns in bilinguals when compared to monolinguals in the C task. First of all, several participants did not provide an answer, or, as predicted, produced an AQ in DQ-prompting contexts. The analysis of pitch change largely supported the prediction regarding differences in prenuclear accents. As expected from CLI, bilinguals displayed a smaller pitch change than controls across tasks and sentence types. Moreover, the pitch change hovered slightly above 0 (Figure 8), revealing almost no pitch movement, especially when compared to controls, whose average pitch change ranged from 5 STs in both question types in the SI task, to 3 and 1.5 STs in DQs and AQs, respectively, in the C task. In nuclear contours, the groups did not differ in the use of rising patterns, but bilinguals displayed a less sharp fall than controls in both tasks in Ss. Thus, evidence of CLI was observed in multiple dimensions, including difficulty distinguishing question types in perception and the production of AQs or Wh-questions in DQ-prompting contexts. This is expected if we keep in mind that questions and statements are marked by morphology in Inuktitut, as opposed to English. Results obtained for pitch changes in prenuclear accents are consistent with previous studies that indicate that pitch is not a reliable cue to stress in Inuktitut (Fortescue 1983; Shokeir 2009; Arnhold et al. Forthcoming), and that tonal changes are restricted to the end of the utterance (Massenet 1980; Fortescue 1983; Shokeir 2009). Finally, the smaller pitch change observed in nuclear contours in Ss in our study may be attributed to the absence of declination observed in Inuktitut (Shokeir 2009).
Our second question was: Is CLI modulated by task type? We predicted larger differences in contextualized (perception and production) tasks than in tasks that had no access or limited access to contextual meaning. This prediction was partially supported in perception and in production. In perception, although bilinguals were overall less accurate than controls, differences were restricted to the SI and C tasks, that is, to tasks that include either only lexical and syntactic information (SI) or contextual information (C). An interesting interaction between task and sentence type was observed in the C task, where only bilingual speakers exhibited significantly more non-target responses in DQ-prompting contexts than in the other two contexts. We attribute this effect to CLI, and we interpret this as a sign of either a reduced sensitivity to the contextual factors that yield a preference for non-inverted questions (which results in AQs being accepted in this context) or a reduced sensitivity to tonal cues, which would account for the choice of Ss as a preferred answer. 10 A complementary explanation for the behavior of bilinguals in the contextualized perception task (Figure 6 shows a high degree of variability among participants) could be task difficulty. Support for such an explanation comes from production results in the C task, where we found no significant between-group differences in accuracy rate. Along these lines, we could speculate that our perception task tapped into metalinguistic knowledge, since participants had to understand the context and imagine what kind of sentence would complete it. Differences in performance between tasks that require skills that bilinguals may not be accustomed to performing in both languages have been previously observed in different types of bilingual populations (Sánchez 2008; Kiser 2014; Tsimpli 2014). 
In the production task, instead, participants were asked to engage in something that is common in their everyday interactions, which is to listen to what somebody says and react appropriately. In addition to the lack of significant differences, we also saw more consistent individual patterns in production. Indeed, none of the participants performed below chance (Figure 6).
Concerning the analysis of pitch change (Figures 7-10), differences between groups were not larger in the C than in the SI task for several reasons. First, in prenuclear accents, bilinguals displayed the same degree of pitch change across tasks and sentence types; namely, the average pitch change across tasks was consistently close to 0, which suggests that they were not sensitive to the large pitch excursions (Table 3) in the SI stimuli. In contrast, controls displayed what we believe to be an imitation effect. Indeed, we observed a larger change in the SI task than in the C task. In the former, the average values for AQs-DQs and Ss were 4.8 STs and 1.9 STs, respectively. In the latter task, values were consistently lower; namely, the average pitch range in Ss was 0.9 STs, and a difference between AQs (1.4 STs) and DQs (3.4 STs) emerged. This is consistent with the large pitch excursions (Table 3) in the SI task stimuli.
Regarding nuclear contours, group differences were restricted to the magnitude of the fall in Ss. Otherwise, bilinguals and controls showed a larger pitch change in the SI task than in the C task. Indeed, the average pitch change across sentence types in the SI task was 10 STs for bilinguals and 11 STs for controls. In the C task, this change was reduced to 5 STs for bilinguals and 7 STs for controls. Once again, we interpret this task difference as an imitation effect, since the pitch change in nuclear contours in the SI stimuli (Table 3) was rather large. It is interesting to see that bilinguals adjusted their pitch change in nuclear contours (as controls did) in the SI task, but adjustments were not observed in prenuclear accents, which is consistent with our prediction that bilinguals would be more sensitive to pitch changes in nuclear than in prenuclear position, since tonal changes are restricted to this position in Inuktitut. Arguably, pitch changes in final contours should also be more salient than at the beginning of the utterance, since pitch changes are much larger in nuclear positions than in prenuclear positions (see description of the stimuli in Tables 3 and 4), independently of the task. Moreover, and as summarized in the Introduction, there is agreement that nuclear contours are a cue to sentence type in North American English. However, evidence indicating that initial pitch differences are a cue to sentence type is much more recent and such differences were not consistently present in our own stimuli (see Tables 3 and 4). If we assume that imitation can be a proxy of perception, as has been argued by several scholars (e.g., Gussenhoven 2004; D'Imperio et al. 2014; Zárate-Sández 2015), we tentatively conclude that bilingual participants imitated the tonal movements that are meaningful in Inuktitut (i.e., final cues). Pitch changes at the beginning of the utterance appear to have a purely paralinguistic meaning for participants. 
Further anecdotal evidence of the non-linguistic meaning of pitch variations for our participants were comments gathered during the testing process. Indeed, when performing the imitation task, participants would frequently laugh after finishing imitating an utterance.

Perception and Production
Results showed some interesting links between perception and production, as well as pathways for future research. As summarized in Figure 6, for most participants (i.e., 8/13) accuracy in production was higher than accuracy in perception, particularly in the C task, which is not the tendency in L2 and bilingual research. One explanation, which would account for the behavior of this sub-group, has to do with task demands. In perception, participants had to keep the context in mind, listen to the three possible matching options, and choose one. In addition to being more demanding for participants' memory and attention, this task required them to perform something that is absent from their daily lives, as opposed to the production task that prompted them to listen to a context and produce an appropriate response. Moreover, we can hypothesize that age factors (i.e., decline in auditory capacity due to aging) may account for the performance of two participants (I07; I10) who were the oldest in the sample.
The opposite trend (i.e., perception better than production) was observed in three participants (I01, I09, I12), and of those, only I01 was highly accurate in perception (indeed, this participant was the most accurate in our sample). We would expect a better performance in perception than in production for this participant (although her production was also highly accurate), since she has been exposed to English in school but mostly uses Inuktitut in her daily life. The other two participants, however, have little in common with I01, with the exception of their AoA and their gender.
Finally, the remaining two participants revealed a similar behavior in perception and production (I02, I04). These participants, however, differed in their accuracy rates. Whereas I04 was on average 83% accurate, I02's average accuracy was 61%.
Interesting parallels emerge if we turn to previous literature. As was the case in previous studies (Grabe et al. 2003; Ortega-Llebaria and Colantoni 2014; Radu et al. 2018), participants did not differ from controls when responding to stimuli with no linguistic content (IO task), unlike in tasks that provided access to contextual meaning. Bilingual participants in our study were also highly accurate at imitating tonal changes in nuclear contours in the SI production task. As such, they resembled L1 Spanish-L2 English participants in Ortega-Llebaria and Colantoni (2014), who matched controls better in f0 changes when the focalized element was in object position, where pitch is used in the L1. However, as opposed to participants in Ortega-Llebaria and Colantoni's (2014) study, who were able to imitate tonal changes in focalized subjects and verbs, bilinguals in this study were not able to imitate the pitch change in prenuclear position, which suggests that the absence of tonal changes in Inuktitut is an entrenched feature in their L1.

Individual Variability
Given the characteristics of our population, it is important to turn briefly to patterns of individual variability, some of which have been highlighted throughout this study. While English and Inuktitut have been in contact for centuries and there is a high degree of social bilingualism, our participants (Table 1) differed along all the dimensions captured in our background questionnaire. Individual differences became especially apparent in the C task, both in perception and in production. Of the two participants who showed consistent patterns in perception and production (I04 and I02), only one of them (I04) was exposed to English from birth through one of her parents (Table 1). The other participant (I02) had similar self-reported patterns of language use (i.e., she uses English approximately 50% of the time), but differed from I04 in her self-rating (Advanced as opposed to Near Native). The other two participants who were exposed to English at home (I03, I05) were also among those with the highest combined accuracy rate (75%). These participants resembled I04 in their education and patterns of language use, but, once again, differed from her in their self-rating. Finally, I01, the participant with the highest average accuracy (86%), reported using English the least (25%) and began learning English at age 6. It is interesting to observe, though, that accuracy patterns do not seem to go hand in hand with patterns of pitch change in prenuclear accents. Of all the participants mentioned, only I05 produced differences that may be considered perceptible between sentence types, since her DQs are on average 1.5 STs higher than her Ss and one semitone higher than her AQs.

Conclusions
Our results confirm that, in cases of language contact, and given the appropriate demographic and social conditions, any pitch pattern can be transferred, including changes in alignment (Mennen 2004; Colantoni and Gurlekian 2004), in the size of the pitch excursion (e.g., Santiago and Delais-Roussarie 2012), in the frequency and use of pitch accents (Gut 2005; Queen 2001, 2012), and in the lack of tonal movements. Evidence of CLI was observed in perception and production. First and foremost, in perception, differences were not attested in the task without linguistic information, but emerged in the other two tasks, providing evidence of reduced sensitivity to tonal variations that signal sentence types. In production, and as in previous studies (e.g., Alvord 2007; Zárate-Sández 2015), we found positional asymmetries, with CLI being most evident in prenuclear position. The nonsignificant differences in pitch change across tasks and sentence types could be attributed to the fact that, in Inuktitut, tonal movements are restricted to the end of the sentence; tonal changes throughout the utterance do not encode grammatical information, this information being encoded by a rich morphology. Admittedly, tonal variations at the beginning of the utterance, albeit a cue for sentence type (see Saindon et al. 2017a), are redundant in English, since grammatical (changes in word order, do-support) and tonal information (final boundary tones) provide sufficient cues. We believe, however, that it was important to begin by analyzing sentence types to establish a descriptive basis for the uses of pitch in this bilingual population. We predict that this absence of tonal variations throughout the utterance will have consequences for the perception, interpretation, and production of other grammatical structures, such as corrective focus, where tonal movements in prenuclear position play a crucial role.
Finally, this study contributes to a growing literature that has shown that early (Queen 2001, 2012; Lleó et al. 2004; Rakow and Lleó 2011) and sequential bilinguals (Colantoni et al. 2016) exhibit CLI in their prosody. We are particularly interested in expanding our knowledge of the prosody of early and sequential bilinguals whose L1 is one of the many indigenous languages spoken in the Americas, given that most studies until now have focused on Spanish bilingualism (O'Rourke 2009, 2012; Muntendam and Torreira 2016). We have shown here that the English spoken by L1 Inuktitut speakers displays signs of CLI, and that not only tonal movements but also the absence of tonal variations can be transferred in a stable language contact situation.