Quality of L2 Input and Cognitive Skills Predict L2 Grammar Comprehension in Instructed SLA Independently

Input is considered one of the most important factors in the acquisition of lexical and grammatical skills. Input has been found to interact with other factors, such as learner cognitive skills and the circumstances in which language is heard. Language learning itself has sometimes been found to enhance cognitive skills. Indeed, intensive contact with another language has been found to sometimes boost cognitive skills, even in intensive instructed settings, such as immersion programs (bilingual advantage hypothesis). In this paper, we report a cross-sectional study to assess grammar learning of 79 fourth grade German students learning L2 English in two immersion schools. Verbal teacher input was assessed using the Teacher Input Observation Scheme (TIOS, Items 14–25), and the learners’ L2 grammar comprehension was tested with the ELIAS Grammar Test II. Cognitive skills, including phonological awareness, working memory, and non-verbal intelligence, were determined using standardized assessment procedures. The results show that verbal input quantity and quality correlated significantly with the learners’ L2 grammar comprehension. None of the cognitive skills moderated the effect of input on grammar comprehension but all predicted it independently. The combination of L2 input and phonological awareness was found to be the most robust predictor of L2 grammar comprehension.


Introduction
Recent research in instructed second language acquisition has focused on different factors that impact language learning in general, and the acquisition of grammar in particular. Some of these are external to the learner, such as input, while others are dependent on individual cognitive abilities (Kersten 2020; Paradis 2011). This paper considers the influence of input as an external factor and of selected cognitive skills as internal factors on children's comprehension of grammar in their second language in an instructional context.
The term "input" has been used with various different meanings in the literature. In this paper, we refer to input as defined by Truscott and Sharwood Smith, who derived their interpretation from Carroll (1999), as
sights, including pointing and gesturing, sounds, smells, tastes, etc., in other words everything that contributes to the interpretation of an utterance and which can lead to further development of an individual's linguistic ability, i.e., all the relevant external contexts. This should be included in a comprehensive understanding of what input is. (Truscott and Smith 2019, p. 10)
Input, in this sense, is regarded as a variable that includes all relevant external factors contributing to the interpretation of an utterance that stimulate learners in the form of sensory data. According to the authors, this refers to both the "situational context", that is, the setting in which learners encounter the L2 (e.g., the types of activities and materials encountered in a classroom setting), and the "discourse context", referring to the specific linguistic features a learner encounters (e.g., the linguistic input provided by the teacher in the classroom). We will refer to the latter throughout this paper as "verbal input".
Research into the effects of input on learner language differentiates between "input quantity" and "input quality". While definitions of these two constructs also differ widely between authors, we refer to "input quantity" as the amount of L2 a learner is exposed to in an instructional setting, which includes variables such as contact duration and contact intensity, that is, for how much time and how often learners experience L2 in their particular school program. These variables are closely related to the construct of frequency, which has been found to affect the acquisition of both vocabulary and grammar in different language learning contexts (Ellis 2002; Lieven 2010; Paradis 2010; Schelletter 2016; Thordardottir 2014; Unsworth 2016). We use "input quality", on the other hand, as referring to the characteristics (i.e., the nature or type) of input stimuli the learner encounters (Kersten in press). Again, this refers to all types of sensory stimuli. These include aspects such as the intonation and speech rate of the person providing the input, their conversational style, joint attention strategies, and the amount of contextual information that is provided (De Houwer 2014; Pierce and Genesee 2014; Vigil et al. 2005). In the context of the current study, where children acquire their second language in an immersion school setting, input quality includes all types of language modifications and scaffolding techniques used by the teachers to render the L2 comprehensible and to facilitate language learning (Loewen and Sato 2018; Kersten in press). These definitions are in line with the cognitive-interactionist framework (Long 2015) pertaining to the research within the field of instructed second language acquisition (ISLA, Ellis and Shintani 2013; Loewen 2020).
It has to be noted that in most studies of ISLA, researchers focus on the nature of input as (linguistic) stimuli provided in the classroom, and not on the actual exposure of each individual learner. As Carroll (2017) convincingly argues, individual exposure to (or experience of) the L2 is a much better predictor of SLA since stimulation and subsequent knowledge construction is an individual internal process. As a result of exposure, a network of different representations is activated in the learner, which leads to (deep or shallow) processing and storage, all of which are highly individual processes (Kersten in press). As studies into individual exposure are extremely time-consuming, ISLA research often uses input quantity and quality measures as provided by teachers and school programs as a proxy for the actual stimuli to which a learner could potentially have been exposed. The present study follows this approach in operationalizing types of input quantity and quality as opportunities for potential exposure.
Internal variables have been discussed as important factors in cognitive-interactionist approaches (Ellis and Shintani 2013; Long 2015). They relate, among other things, to the cognitive abilities of learners that are important for language learning. Cognitive factors included in this study are the ability to identify and manipulate sounds (phonological awareness), the ability to store information (working memory), and non-verbal IQ. A number of studies have explored the effect of some of these skills on children's first and second language learning (Paradis 2011; Pham and Tipton 2018; Sun et al. 2015; Vaahtoranta et al. 2021). Kersten (2020) considers that there is a bi-directional influence between language learning and cognitive skills. This study seeks to determine which features of teachers' verbal input contribute most to L2 learners' receptive grammatical skills, and whether this effect is independently predicted or moderated by learners' cognitive skills.

Effects of Input Quantity
Numerous studies consider the effects of input quantity in terms of duration, intensity, and frequency on processing. Input frequency is considered an important driver both in first and second language acquisition (Ellis and Wulff 2008; Tomasello 2003, 2009). One basic assumption is that the frequent occurrence of particular strings in the verbal input leads to the learner's establishment of the corresponding structure in a piecemeal fashion (entrenchment). In L2 learning, the frequency hypothesis (Hatch and Wagner-Gough 1976; Ellis 2012) assumes that frequently recurring elements lead to better storage in memory.
Item frequency has been found to affect both vocabulary and grammar learning in first and second language acquisition. Token frequency affects the learning of particular items, whereas type frequency enables the learner to draw comparisons and learn by analogy, and the acquisition of inflectional morphology is dependent on both the type and token frequency of the word and the inflection. According to Ambridge et al. (2015), the acquisition of simple sentence constructions depends on both the frequency of the verbs used within the construction and the occurrence of the structure.
Studies investigating the effect of frequency on L2 learners' knowledge of language structures (Blom et al. 2012; Unsworth 2016; Unsworth et al. 2014) show that while there is a relationship between the two, other factors, such as the target structure, also play a role in the acquisition process. This was also suggested by Gathercole (2007), who showed that the language structure itself, not just its frequency, affects the timing of development.
In their comprehensive study on predictors of L2 acquisition of 71 very young children in instructional contexts, Sun et al. (2015) found that input quantity, operationalized as the total amount of school input, significantly predicted learners' receptive and productive lexical skills, as well as their grammar reception.
In the context of early L2 acquisition in immersion settings, the ELIAS project (Kersten et al. 2010) found that children's L2 contact duration and contact intensity predict their L2 grammar comprehension in bilingual preschools. The input intensity measures included the number of hours per week that the children were immersed in the L2, the number of teachers that provide L2 input, and the teacher-to-child ratio (Weitz et al. 2010). Similarly, Rohde (2010) and Steinlen et al. (2010) found that input intensity correlated with L2 lexical and grammatical development in the same bilingual preschools. However, when measuring the amount of progress in receptive L2 knowledge over a period of 6 months, Weitz et al. (2010) found similar progress for different intensity groups with 147 bilingual preschoolers, but differences in terms of input quality.

Effects of Input Quality
Input quality has been claimed to be an important variable that contributes to differences in instructed SLA (Graham et al. 2017; Loewen and Sato 2018; Long 2015), which makes it particularly relevant to the present study's focus on children's L2 learning in immersion settings. The construct, as used in this paper, includes variables relating to the language of the person providing the input (such as their speech rate, language style, and lexical diversity, that is, "verbal input") but also the nature of the situational context and the circumstances in which the input is provided (Truscott and Smith 2019; Zurer Pearson and Amaral 2014).
The speech rate of caregivers has previously been found to affect children's first language development (Hart and Risley 1995). De Houwer (2014) compared the speech rate of Dutch mothers talking to their monolingual children with that of mothers of bilingual children who always addressed their children in Dutch. She did not find any differences, thereby challenging the claim that bilingual children have less exposure to each of their languages. Mothers' language styles have been found to differ from those of fathers, as mothers tend to talk more and are less directive (Pierce and Genesee 2014).
In the context of early second language acquisition, where children's only input is in daycare or the classroom, there is evidence that the characteristics of the language of the teacher also have an effect on the language of young learners. Bowers and Vasilyeva (2011) compared the vocabulary growth of American monolingual children, age four, with English language learners of the same age who primarily spoke a different language at home with the parents. The children's English vocabulary was assessed at the beginning and end of the observation period. The teachers' language was analyzed in terms of the total number of words, lexical diversity, and complexity. They found that the lexical diversity of the teacher affected vocabulary growth in monolingual children, while English language learners' vocabulary correlated with the total number of words used as well as the average number of words per utterance.
Based on a number of L2 processing models (Gass et al. 2020; Leow 2015; Kormos 2011; Truscott and Smith 2019), Kersten (in press) describes ways in which situational and discourse contexts shape the interpretation and cognitive processing of language users. Such external factors are particularly relevant in the context of L2 learning in the classroom. They include characteristics of classroom activities that serve as a matrix for L2 input, language modifications, and other scaffolding techniques (Kersten 2020; Kersten et al. 2019), for example, embedding language in meaningful tasks that stimulate the learner's construction of knowledge (Wolff 2002) and relate to their own prior world knowledge, non-verbal support, interactional strategies, and the stimulation of the learner's L2 output. All of these modifications of the external context stimulate the learner in the form of different types of sensory data (Kersten in press).
To capture these external modifications of input quality provided by teachers, the Input Quality Observation Scheme (IQOS, Weitz et al. 2010) was used to compare input quality in nine different bilingual preschools that were part of the ELIAS project (Kersten et al. 2010). Teachers' input quality was rated on a Likert scale of 1-4 for 15 different items and a total score was calculated. The items include input quality factors relating to both the language provided by the teacher (speech rate, intonation, lexical and structural diversity) as well as interactional techniques (contextualization, corrective feedback). The results show that the children whose teachers achieved higher scores on the IQOS showed more progress in terms of their L2 grammar comprehension.
A re-analysis of the ELIAS data using a multivariate growth model (Kersten et al. 2018b, Kersten et al. in prep) included a number of additional variables. While the results for the first test time showed that the children's ages, L2 contact duration, and home literacy activities (operationalized as reading books in the family) predict their L2 grammar comprehension, age and L2 input quality had a significant effect on children's L2 grammar achievement a year later at the second test time.
The current study assessed input quality using an input observation scheme (TIOS) that was designed by Kersten et al. (2018a, 2018b) and which is more detailed than the IQOS. Apart from capturing general information about the program, class arrangements, and particular language skills, the TIOS consists of four high-inference scales and includes 41 items: Characteristics of Tasks (Items 1-13), Verbal Input (Items 14-25), Non-Verbal Input (Items 26-30), and Support of Learners' Output (Items 31-41). In the current context, the verbal input scale was used to determine the influence of input quality on grammar comprehension.

Cognitive Skills and Grammar Learning
Internal factors in the acquisition of language relate, among other things, to the way information is processed by the language learner. Together with external factors, cognitive skills can account for individual differences in children's language performance (Long 2015). Cognitive skills that relate to language include, for example, metalinguistic awareness (abstract knowledge of language as well as attentional control), memory (verbal and non-verbal), and non-verbal intelligence (analytical reasoning abilities).
According to Bialystok (2001), metalinguistic awareness is of particular importance for second language learning because the learner already has knowledge of a language system through their L1 that can be transferred to the new language. As children develop linguistically, they are increasingly able to gain access to knowledge that has been implicit initially (Schelletter 2020).
Working memory capacity is another important cognitive factor investigated in interactionist approaches (Mitchell et al. 2019). Learners have to be able to process a string of different symbols in sequence (Miyake and Friedman 1998, p. 341). This means that the acquisition of a language requires the simultaneous storage and processing of information. Working memory (Baddeley 2007) is assumed to have a central role in the acquisition of a second language (Linck et al. 2014). In particular, phonological short-term memory was found to play an important role in the acquisition of L2 vocabulary; learners need to learn the new phonological form but require no conceptual restructuring if they already have an L1 equivalent (Gathercole et al. 1992, p. 897).
In addition, a connection is assumed between non-verbal intelligence and L2 acquisition (Genesee and Hamayan 1980, p. 96). Learners need to distinguish different components of the input, find out about their respective structure and functions, as well as the principles that apply in order to achieve successful communication (Kristiansen 1990, p. 118). Due to learners' abilities to recognize complex patterns and reach conclusions based on reasoning, non-verbal intelligence is particularly important for the acquisition of grammatical rules (Kempe and Brooks 2011, p. 18).
Phonological awareness, an aspect of metalinguistic awareness, is a crucial factor in the acquisition of L2, possibly even more than for first language acquisition. Not only could L2 words contain sounds that are not native to the L1, but they can also differ in sound sequences, intonation, and syllable structure (Hu 2003, p. 434). Hopp et al. (2018) investigated cross-linguistic influences on the lexical and grammatical development of 200 learners of English (L3) in German primary schools. As part of the study, cognitive and social background factors were also included. Their findings showed that non-verbal cognitive abilities and phonological awareness significantly predicted productive vocabulary. Working memory (as measured by digit span tasks) affected grammar production (measured by a sentence repetition task). Similarly, Hopp et al. (2019) examined input and cognitive factors in the comprehension of "wh" questions and relative clauses in early foreign learners of English in fourth grade and monolingual English children aged 5-8. While differences in accuracy could be related to the amount of input children had at the different schools, phonological awareness was found to be the only cognitive factor that affected the children's ability to interpret the order of objects in the test sentences.
In a study with 20 German fourth grade learners of English, Werkmeister (2015) found that receptive grammar skills elicited with the ELIAS Grammar Test II correlated positively with working memory and phonological awareness, but not with non-verbal intelligence. In a regression analysis, phonological awareness and phonological short-term memory predicted receptive vocabulary knowledge as measured with the BPVS III (Dunn et al. 2009), but only phonological awareness showed a significant effect on receptive grammar skills. Paradis (2011) considered the effect of three different internal variables on L2 acquisition: maturity, as measured by age of acquisition (AoA), language aptitude (including verbal memory skills and pattern recognition), as well as L1 to L2 transfer. These were investigated together with external factors in a study including 4-7-year-old learners of English with the onset of English after the age of 3. Among the child internal factors, the children's phonological short-term memory, a part of working memory according to Baddeley's (2007) model, was found to be the strongest predictor of the children's language abilities for both vocabulary and verbal morphology. Sun et al. (2015) investigated 71 Chinese L2 learners of English between the ages of 2 and 5 with regard to their receptive and productive vocabulary, receptive grammar skills, and a number of internal and external variables. In their study, the age of onset predicted all three skill types, while short-term memory predicted productive vocabulary, and non-verbal intelligence predicted L2 grammar reception. They suggest that "the memory-based approach of L2 learning might heavily rely on the L2 environment. (...) In these (less favorable) contexts, analytical reasoning ability might emerge as a more significant factor than memory in dealing with sentences, because it helps children to better organize the intensive and complicated information" (p. 12).
This line of thought, that is, that the impact of cognitive skills on L2 acquisition might be context-dependent, or, in other words, that the role of individual learner differences interacts with the characteristics of the learning context, corresponds to findings by Tagarelli et al. (2011, 2015). They investigated the effect of working memory on learners' grammatical judgment outcomes in a semi-artificial language task in implicit/incidental versus explicit/intentional learning conditions. They found that working memory predicted grammar performance only in the explicit but not in the implicit condition, which lends support to their hypothesis that cognitive skills predict L2 acquisition only in explicit learning contexts. This is in line with Long's (2015, p. 60) "hypothesized combination of learner-internal and input differences minimally required to account for the facts about variation in within-learner and between-learner achievement".
While cognitive factors can clearly affect L2 learning, exposure to another language as such can affect or shape the development of cognitive skills of the language learner, such that the relationship between the two is bi-directional (Kersten 2020). Research into children's cognitive skills comparing monolingual and bilingual learners has shown that bilingual learners can develop more enhanced cognitive skills as a result of regular exposure to more than one language. According to the bilingual advantage hypothesis (Bialystok 2009; Bialystok and Barac 2012; Morales et al. 2013), switching between languages can enhance executive functions and conflict resolution. Executive functions include inhibitory control, working memory, and shifting, that is, the ability to switch between different mental states.
The majority of studies investigating executive functions have been carried out with children who grew up bilingually from birth, yet Bialystok and Barac (2012) investigated both metalinguistic awareness and executive functions in school-age children with different L1 backgrounds who attended either Hebrew or French immersion schools. It was found that children's performance on executive function tasks increased with the length of time that they had participated in the immersion program. Similar results were found in studies by Bialystok et al. (2014), Poncelet (2013, 2015), Poarch and van Hell (2012), and Woumans et al. (2016, 2019). Some studies, however, have failed to find a bilingual advantage when comparing executive functions in monolingual and bilingual subjects (Von Bastian et al. 2016; Paap and Greenberg 2013; Simonis et al. 2019), thereby leading to a debate about the bilingual advantage hypothesis, which is still continuing.
In the context of the present study, the focus is on particular cognitive skills that can have an impact on grammar comprehension in L2 learners. As the outlined studies show, phonological awareness, working memory, and non-verbal intelligence have all been found to be predictors of L2 grammatical skills.

The Present Study
Based on the theoretical assumptions outlined above, which summarized potential effects of external factors, such as input quantity and quality, and internal factors, such as individual cognitive skills, this study investigated the differential contribution of verbal L2 input and selected cognitive skills on L2 grammar comprehension. To that end, we carried out a cross-sectional study in two immersion primary schools in Germany on learners' receptive grammar skills in L2 English, their phonological awareness, working memory, and non-verbal intelligence, and the teachers' verbal L2 input, assessed with a standardized observation scheme.
The study investigated the following research questions:
1. Which features of teachers' verbal L2 input contribute most to primary L2 learners' grammar comprehension?
Learners at the end of high-intensity immersion primary schools have a comparatively high L2 proficiency; they are able to understand most of the L2 input in class without extensive scaffolding and can express their thoughts rather fluently, albeit not without errors.
Hypothesis 1a (H1a). Verbal behavior that is usually provided for beginning learners and geared to guarantee high comprehensibility (Items 18, 20-24 in the observation scheme referring to intonation and speech rate, cf. Section 3) will not predict L2 grammar variation in this dataset.
All teachers have very high L2 skills, some of them being native speakers, and did not resort to German in class.
Hypothesis 1b (H1b). Teachers' language proficiency and exclusive use of the L2 in class (Items 14 and 15) will not predict L2 grammar variation.
Item 16 refers to a high amount of L2 input to accompany all actions, Item 17 refers to structurally rich input using various different grammatical structures, synonyms, antonyms, paraphrases, and so forth, and Item 25 entails comprehension checks.

2. Do selected cognitive skills exert an independent effect on grammar comprehension?
As summarized above, all cognitive skills included in this study have been found to predict SLA to some extent in previous studies. Language knowledge is built up over 4 years in a complex immersion context. Information on the L2 can be derived by learners from various external situational and linguistic (discourse) sources that go beyond teachers' modified verbal input.
Hypothesis 2 (H2). Cognitive skills will show an independent prediction of L2 grammar reception beyond a moderating effect on the influence of L2 input (second main effect).

3. Is the effect of verbal L2 input on grammar comprehension moderated by different cognitive skills?
As pointed out in earlier research on ISLA (Long 2015), the effect of input is expected to interact with the learners' cognitive skills, that is, learners with high phonological awareness, working memory skills, and non-verbal intelligence were shown to be better able to derive linguistic information from verbal input. This effect was found to be strongest in explicit instructional learning conditions (Tagarelli et al. 2011, 2015).
Hypothesis 3 (H3). The data will show a moderating effect of all three cognitive skills for the relationship between verbal input and L2 grammar comprehension (interaction effect).
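The difference between an independent effect (H2) and a moderating effect (H3) can be illustrated with a regression that contains an interaction term. The following is only a minimal sketch on synthetic numbers, not the analysis reported in this study: the function names, the mean-centering step, and all data values are assumptions made for illustration. A coefficient near zero on the product term corresponds to "no moderation", while non-zero coefficients on the single predictors correspond to independent effects.

```python
# Sketch of a moderation (interaction) model. All data are synthetic and the
# helper names are hypothetical; this is not the study's actual analysis.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_moderation(input_scores, skill_scores, grammar_scores):
    """Least-squares fit of grammar = b0 + b1*input + b2*skill + b3*input*skill.

    Both predictors are mean-centered before the product term is formed.
    Returns [b0, b1, b2, b3]; b3 is the moderation (interaction) coefficient.
    """
    n = len(grammar_scores)
    mean_in = sum(input_scores) / n
    mean_sk = sum(skill_scores) / n
    rows = []
    for x, z in zip(input_scores, skill_scores):
        xc, zc = x - mean_in, z - mean_sk
        rows.append([1.0, xc, zc, xc * zc])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, grammar_scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic design: four input-quality levels crossed with five skill levels.
inputs = [x for x in (3.0, 3.5, 4.0, 4.5) for _ in range(5)]
skills = [z for _ in range(4) for z in (20.0, 30.0, 40.0, 50.0, 55.0)]
mean_x = sum(inputs) / len(inputs)
mean_z = sum(skills) / len(skills)

# Case A: two independent effects, no moderation.
additive = [40 + 4.0 * x + 0.3 * z for x, z in zip(inputs, skills)]
# Case B: same main effects plus an interaction of 0.2.
moderated = [
    40 + 4.0 * x + 0.3 * z + 0.2 * (x - mean_x) * (z - mean_z)
    for x, z in zip(inputs, skills)
]

coef_additive = fit_moderation(inputs, skills, additive)
coef_moderated = fit_moderation(inputs, skills, moderated)
```

In case A the fitted product-term coefficient is zero up to rounding error, whereas in case B it recovers the value built into the synthetic data. Real data would additionally require significance tests and, given that learners are nested in classes, possibly a multilevel model.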

Participants
In order to assess which factors contribute most to grammar learning, a cross-sectional study was carried out with 79 fourth grade students in 2 partial immersion schools in Lower Saxony, where all subjects except for German were taught through the L2 English. Participants were recruited from 5 classes, which were taught by 4 teachers (Table 1).
School 1 is a private immersion primary school with about 240 students and an L2 intensity of 82% (the classes receive 27 out of 33 lessons in L2 English). A total of 3 classes of School 1 took part in the tests; Classes 1.1a and 1.1b were taught by the same teacher. School 2 is a public primary school with approximately 240 children. In each year, 1 immersion class is taught in the L2 English as a medium of instruction with an intensity of 76% (Grades 1-2: 18 out of 25 lessons in L2 English; Grades 3-4: 20 out of 25 lessons in L2 English). All classes started in first grade with the immersion program. In total, 43 boys (54.4%) and 36 girls (45.6%) participated in the study. As opposed to traditional EFL programs, L2 teaching in these immersion programs is centered around the subject matter and does not contain much form-focused instruction. Although some focus on form is present in immersion classrooms, the L2 is built up predominantly on the basis of large amounts of input embedded in meaningful subject content, which is, especially in beginner levels (Grades 1 and 2), accompanied by various measures of comprehension scaffolds.
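The intensity figures above can be reproduced from the lesson counts. The short sketch below assumes that the 76% reported for School 2 is the unweighted mean of the two grade bands; the text does not state how the figure was derived, and the function name is hypothetical.

```python
def l2_intensity(l2_lessons, total_lessons):
    """Share of weekly lessons taught in the L2, as a percentage."""
    return 100.0 * l2_lessons / total_lessons


# Lesson counts taken from the school descriptions above.
school1 = l2_intensity(27, 33)       # rounds to 82
school2_g12 = l2_intensity(18, 25)   # Grades 1-2
school2_g34 = l2_intensity(20, 25)   # Grades 3-4

# Assumption: both grade bands are weighted equally over the four years.
school2_mean = (school2_g12 + school2_g34) / 2
```

Under this assumption the two grade bands of School 2 (72% and 80%) average to the reported 76%.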
All participating teachers were trained primary L2 teachers. Except for the teacher of Class 2.2, all teachers were English native speakers with similar teaching experiences in bilingual schools. On the other hand, Teacher 2.2 had the longest teaching experience.
Note (Table 1). Two classes (1.1a and 1.1b) were taught by Teacher 1.1 for two subsequent years.

Data Elicitation and Analysis
As part of the SMILE (Studies on Multilingualism in Language Education) project at the University of Hildesheim (2014-2018), we carried out classroom observation of teacher input in the classes and tested the learners' cognitive skills and L2 grammar comprehension using a battery of standardized tests.
Grammar comprehension was elicited with the ELIAS Grammar Test II (Kersten et al. 2012). The grammar test contains a picture pointing task and focuses on the comprehension of 12 grammatical phenomena distributed across 72 items (total possible score: 72). Relying on the predictions of Pienemann's (1998) processability theory (PT), the test has been shown to contain items with increasing processing complexity, and thus, to reflect an increase in L2 grammar reception (Koch 2020; Koch et al. 2021; Lenzing et al. forthcoming). Grammatical phenomena include SVO, plural, personal pronouns sg. (subj.), possessive pronouns, negation (PT, Stage 2), personal pronouns sg. (obj.), possessive (PT, Stage 3), subject-verb agreement: copula verbs (sg., pl), passive (PT, Stage 4), relative sentences (Type 1), relative sentences (Type 2), and subject-verb agreement: full verbs (sg., pl) (PT, Stage 5) (Lenzing et al. forthcoming). In a study with 360 primary school learners of English, Koch et al. (2021) applied a Rasch model to the test data and showed that the model fit the data, as indicated by the non-significant bootstrap χ² goodness-of-fit test (p = 0.595).
For the present study, the learners were tested in class-based groups during regular school time and L2 English was used for instruction. The prompts were presented in front of the classroom, while each learner had a test booklet (answer sheet) in front of them. In the picture-pointing task, the children heard prompts and then had to select the corresponding picture out of 3 possibilities.
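Because each of the 72 prompts offers 3 pictures, a learner who guessed on every item would still obtain some correct answers. The following sketch estimates that chance baseline under a simple binomial model. The model itself is an assumption made here for illustration (independent guesses, all three pictures equally attractive) and is not part of the test documentation; the function name is hypothetical.

```python
from math import comb

N_ITEMS = 72    # items in the grammar test (from the text)
N_CHOICES = 3   # pictures per prompt (from the text)


def guessing_tail(score, n=N_ITEMS, p=1 / N_CHOICES):
    """P(X >= score) if every answer were a blind guess (binomial model)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(score, n + 1)
    )


# Expected number of correct answers under pure guessing.
expected_by_chance = N_ITEMS / N_CHOICES

# Smallest raw score that pure guessing would reach in fewer than 5% of cases.
threshold = next(s for s in range(N_ITEMS + 1) if guessing_tail(s) < 0.05)
```

Raw scores close to `expected_by_chance` are therefore hard to distinguish from guessing, whereas scores at or above `threshold` are unlikely to arise by chance alone under this model.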
The following instruments were selected to measure cognitive skills: phonological awareness was assessed with 5 subtests (pseudoword segmentation, vowel substitution, phoneme inversion, vowel length determination, word inversion) of the German standardized reading and spelling test BAKO 1-4 ("Basiskompetenzen für Lese-Rechtschreibleistungen", Basic Competencies for Reading and Spelling Skills) by Stock et al. (2003). To increase implementation objectivity, pre-recorded items of all subtests were presented using a computer with speakers. For data analysis, the number of correct trials of all tasks was used (total possible score: 59).
Working memory was measured with the subtests Letter-Number Sequencing and Digit Span Forwards and Backwards of the standardized WISC-IV (Wechsler Intelligence Scale for Children, German edition by Petermann and Petermann 2011). While the Digit Span subtest required the children to recall a series of orally presented number sequences in direct and reverse order (16 trials per task), the Letter-Number Sequencing subtest required the children to arrange and reproduce a series of orally presented letters and numbers in the correct order. The sum of raw scores of all subtests was used for further analysis (maximum possible score: 62).
Non-verbal intelligence was assessed using the subtest matrices of the German school readiness test BUEGA ("Basisdiagnostik Umschriebener Entwicklungsstörungen im Grundschulalter", Basic diagnostics of circumscribed developmental disorders of primary school-age children) by Esser et al. (2008), in which the children had to complete matrices by choosing the correct element from 5-8 alternatives. One point was given for each correct answer, and the total score was the sum of the correct answers, with a maximum score of 38. The participants' raw scores of the non-verbal intelligence test were then converted into t-values (max. value: 79).
All cognitive tests were conducted in the prescribed standardized way of data collection in the majority language German. For this purpose, the children were brought to a quiet room and tested individually by trained research assistants; the non-verbal intelligence test was carried out in class-based groups in the same way as the grammar test.
Data elicitation took about one week in each class. Prior to the testing, all parents gave their informed consent.
To capture L2 input, the corresponding teachers were videotaped at the end of the respective school year. A best practice video was selected for data analysis for each teacher according to a number of criteria that were supposed to ensure that lessons were as comparable and representative as possible in terms of lesson content and timing within the teaching unit. All videos were recorded in General Science with comparable topics, comparable numbers, types, and lengths of interactional phases. It was ensured that the videos were technically sound and that the learners had already been familiarized with videography beforehand to minimize the observer's paradox. In each video observation, one camera was directed towards the teacher, whereas another camera was facing the students for the whole session.
Afterwards, the video data were analyzed using the Teacher Input Observation Scheme (TIOS, Kersten et al. 2018a, 2018b), an observation scheme that aims to capture the use of L2 input and instructional techniques provided by the L2 teacher during a lesson in the foreign language. TIOS contains 4 scales and 41 items on different teacher behaviors rated on a 6-point Likert scale ranging from 0 ("not present at all") to 5 ("present to a very high degree") (Kersten et al. 2018a, 2018b). As the focus of this study was on L2 teacher input, the 12 categories on verbal input of the TIOS (Items 14-25, Kersten et al. 2018a, 2018b; see Note 3) were used for operationalization. These items are geared at different aspects of verbal input quantity and quality (Kersten in press):
The teacher...
14. has high language proficiency in the L2
15. exclusively uses the L2 in class
16. provides a high amount of L2 input (i.e., uses L2 a lot to accompany all actions)
17. uses varied L2 input
18. uses recurring verbal routines/rituals
19. uses repetitions of key words and phrases
20. adapts the L2 input to different (groups of) learners
21. articulates and enunciates clearly
22. slows down speech rate for selected contents
23. uses intonation to stress key words/phrases in the L2
24. uses pauses to indicate key words/phrases in the L2
25. uses comprehension checks
The teachers' scores were then added to each individual learner in their class as variables for the input they received. Learners in each class were thus assigned the same values for teacher input, while these values differed among classes.
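The assignment of class-level teacher ratings to individual learners can be illustrated with a minimal sketch. All names and values below (class labels, item keys, grammar scores) are hypothetical and serve only to show the data structure; they are not taken from the study's dataset.

```python
# Hypothetical TIOS ratings (0-5 scale), one set of item scores per class.
teacher_input_by_class = {
    "class_a": {"item_16": 5, "item_17": 5, "item_18": 4},
    "class_b": {"item_16": 3, "item_17": 3, "item_18": 2},
}

# Hypothetical learner records; grammar scores are invented for illustration.
learners = [
    {"learner_id": 1, "class_id": "class_a", "grammar_score": 41},
    {"learner_id": 2, "class_id": "class_a", "grammar_score": 37},
    {"learner_id": 3, "class_id": "class_b", "grammar_score": 33},
]


def attach_teacher_input(learner_records, ratings_by_class):
    """Return new learner records extended with their class teacher's ratings."""
    merged = []
    for learner in learner_records:
        ratings = ratings_by_class[learner["class_id"]]
        # Every learner in a class receives the same teacher-level values.
        merged.append({**learner, **ratings})
    return merged


dataset = attach_teacher_input(learners, teacher_input_by_class)
```

Because the input values are constant within a class, the effective variation in the input variable is limited to the number of classes, a point that becomes relevant when statistical power is discussed.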
The data were analyzed using IBM SPSS Statistics (version 26), and SPSS PROCESS (version 3.4) using correlational, multiple regression, and moderator analyses.

Results
Hypotheses 1a-c focus on the types of verbal L2 input that contribute most to the learners' L2 grammar comprehension. A correlational analysis (Table 2) revealed that the total score of the Verbal Input (VI) scale, which includes 12 items, did not correlate with grammar comprehension. However, two variables, 16 ("The teacher provides a high amount of L2 input") and 17 ("The teacher uses varied L2 input"), correlated significantly with the test results. Additionally, Variable 18 ("The teacher uses recurring verbal routines/rituals") missed the 5% significance level by 0.002 points (p = 0.052, r = 0.222). As the 5% threshold is an arbitrary cut-off point, and p-values are sensitive to sample size and likely to decrease with larger samples, Variable 18 was included in the following calculations as well (comp. Larson-Hall 2012, p. 249).
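To make the borderline result for Variable 18 concrete, the following sketch shows how a Pearson correlation is converted into a t statistic with n − 2 degrees of freedom. The sample size of 77 is an assumption borrowed from the regression reported in the next paragraph; the exact n underlying the correlation is not stated here. The helper names are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Reported correlation for Variable 18; n = 77 is an assumption (see lead-in).
t_item_18 = t_from_r(0.222, 77)
```

Under this assumption the statistic comes out at about 1.97, just below the two-sided critical value of roughly 1.99 for 75 degrees of freedom, which matches the reported p = 0.052.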
In the second analysis with Items 16-18, Variable 16 ("The teacher provides a high amount of L2 input") emerged as a significant predictor of L2 grammar comprehension (n = 77, R² = 0.069, F(1, 75) = 5.589, t = 2.364, β = 0.263, p = 0.021*), with Variable 17 ("The teacher uses varied L2 input") being excluded in all models, as Variables 16 and 17 correlated with r = 1. Thus, the effects of Variables 16 and 17 were statistically identical (see Note 5). Even though the ratings for Variables 16, 17, and 18 varied only across three values, the normal distribution and homoskedasticity of residuals were acceptable, and no multicollinearity was found (VIF = 3.105). These results show that all three variables potentially affect grammar reception (first main effect). For this reason, we used a short verbal input scale (VI_s), consisting of the sum scores of Items 16-18, for the following research questions. The short scale was found to have a very high internal consistency (α = 0.918 for VI_s).
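The statistics reported for Variable 16 can be cross-checked with the standard identities for a regression with a single predictor: the standardized coefficient equals the Pearson correlation, so R² = β², and F = t². The sketch below applies these identities to the reported values; it is a plausibility check, not a reproduction of the SPSS analysis, and the variable names are illustrative.

```python
import math

# Values as reported for Variable 16 (single-predictor model, n = 77).
beta = 0.263
t_value = 2.364
r_squared = 0.069
f_value = 5.589
n = 77

# With one predictor, R^2 is the squared standardized coefficient.
r_squared_from_beta = beta ** 2

# With one predictor, the model F equals the squared t of that predictor.
f_from_t = t_value ** 2

# t can also be recovered from beta (treated as r) and the sample size.
t_from_beta = beta * math.sqrt(n - 2) / math.sqrt(1 - beta ** 2)

# A VIF of 3.105 implies that about 68% of a predictor's variance
# is shared with the other predictor(s): R_j^2 = 1 - 1 / VIF.
shared_variance_from_vif = 1 - 1 / 3.105
```

All three quantities agree with the reported values up to rounding, which suggests that the reported figures are internally consistent.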
Subsequently, to address Hypothesis 2, we tested whether the three cognitive variables had an independent effect on grammar comprehension using several regression analyses (second main effect). The short scale on verbal input, including Variables 16-18 (VI_s), was used as an additional independent variable. In the first step, the effect of input was measured in comparison with each cognitive score independently (Table 3). When entered as predictors in combination with the short scale on verbal input, all cognitive skills, as well as verbal input, were shown to be independent statistical predictors of L2 grammar comprehension. Together, verbal input and phonological awareness predict 21.3% (F(2, 51) = 6.89, p = 0.002*), verbal input and working memory predict 11.2% (F(2, 67) = 4.23, p = 0.019*), and verbal input and non-verbal intelligence predict 14.5% (F(2, 63) = 5.34, p = 0.007*) of the variance of L2 grammar scores, respectively. (Compare the R² scores in Table 6, where only the effect size for non-verbal intelligence differs, by 0.3 percentage points (14.8% versus 14.5% in Table 3), probably because the interaction term contributes to the explained variance to a very small extent.)
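The relation between explained variance and the F statistics reported above follows the standard formula F = (R² / k) / ((1 − R²) / df_residual), where k is the number of predictors. The following sketch recomputes F from the reported R² values and degrees of freedom; the function and dictionary names are illustrative, and small deviations are expected because the R² values are rounded.

```python
def f_from_r_squared(r_squared, n_predictors, df_residual):
    """Overall F statistic of a regression model from its R^2."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / df_residual
    return explained / unexplained


# (reported R^2, residual df, reported F) for input plus one cognitive skill.
reported_models = {
    "phonological awareness": (0.213, 51, 6.89),
    "working memory": (0.112, 67, 4.23),
    "non-verbal intelligence": (0.145, 63, 5.34),
}

recomputed_f = {
    skill: f_from_r_squared(r2, 2, df)
    for skill, (r2, df, _reported_f) in reported_models.items()
}
```

The recomputed values lie within a few hundredths of the reported F statistics in all three cases.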
In a second step, in a backward elimination regression of verbal input (VI_s) and all cognitive scores, input and phonological awareness emerged as the strongest significant predictors of L2 grammar results, explaining 18.9% (F(2, 37) = 4.31, p = 0.021*) of the variance of the results when combined in the model (Table 4). Phonological awareness was shown to have a slightly stronger effect size than verbal input.
Finally, the third hypothesis refers to a moderation of the effect of L2 verbal input on grammar comprehension by different cognitive skills. To that end, we carried out a correlational analysis and a moderator analysis with the three cognitive variables of phonological awareness, working memory, and non-verbal intelligence as moderators of the effect of verbal input (VI_s) on the grammar test scores.
Of the three cognitive scores, only phonological awareness correlated significantly with grammar comprehension (Table 5). None of the cognitive variables showed a moderating effect (interaction) on the relationship between the input factors and the grammar scores (Table 6).
The three cognitive skills were then examined as moderators of the influence of teachers' L2 verbal input (short scale VI_s, TIOS Items 16-18) on L2 grammatical skills using SPSS PROCESS.
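A moderation analysis of this kind amounts to a regression that contains the predictor, the moderator, and their product; the coefficient of the product term is the interaction effect. The sketch below fits such a model by ordinary least squares in plain Python. It is not the SPSS PROCESS implementation: the mean-centering step is a common convention assumed here, all function names are illustrative, and the toy data are invented and noise-free so that the known interaction coefficient can be recovered.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def ols(design, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def moderation_coefficients(predictor, moderator, outcome):
    """Return [intercept, b_predictor, b_moderator, b_interaction]."""
    mean_p = sum(predictor) / len(predictor)
    mean_m = sum(moderator) / len(moderator)
    pc = [v - mean_p for v in predictor]  # mean-centered predictor
    mc = [v - mean_m for v in moderator]  # mean-centered moderator
    design = [[1.0, a, b, a * b] for a, b in zip(pc, mc)]
    return ols(design, outcome)


# Hypothetical toy data generated from y = 1 + 2x + 3m + 0.5xm (no noise).
input_scores = [0, 1, 2, 3, 0, 1, 2, 3]
cognitive_scores = [1, 1, 2, 2, 3, 3, 4, 4]
grammar_scores = [
    1 + 2 * x + 3 * m + 0.5 * x * m
    for x, m in zip(input_scores, cognitive_scores)
]

coefficients = moderation_coefficients(
    input_scores, cognitive_scores, grammar_scores
)
interaction = coefficients[3]
```

In this toy example the interaction coefficient is recovered as 0.5. In real data the coefficient would additionally be tested against its standard error, and a non-significant product term corresponds to the absence of a detectable moderating effect.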

Discussion
With regard to Hypotheses 1a-c, we expected the two items concerning high-level quantity and quality of verbal input as well as comprehension checks to show the strongest relation with grammar comprehension, while input techniques geared at the comprehension of language beginners were thought to be less relevant. The hypotheses regarding a high amount of L2 provided by the teachers and lexically and structurally rich verbal input (Items 16 and 17) were met, but comprehension checks showed no relation. Unexpectedly, recurring linguistic rituals/routines in the L2 (Item 18) also showed a relationship, which only marginally missed the 5% significance level. Ratings of this item were mainly due to recurring organizational matters and general introductions of topics and activities rather than linguistic rituals. These findings are in line with the frequency hypothesis and lend further support to the assumed quantity effects of L2 input on SLA (comp. Gathercole 2007; Paradis 2010; Sun et al. 2015; Unsworth et al. 2014; Weitz et al. 2010). Further analyses were then carried out with a short scale on L2 verbal input consisting of the three input techniques on high-level input quality and quantity (comp. Weitz et al. 2010; Kersten et al. 2018a; Kersten et al. in prep), for which a relationship with L2 grammar comprehension had been demonstrated (Items 16, 17, and 18). We could thus confirm a significant main effect of the teachers' behavior geared towards high input quantity and quality that predicted L2 grammar reception in our dataset. However, not all types of verbal input contributed to L2 learning equally well, which calls for a differentiated view on the effects of specific teacher behavior.
Regression analyses pertaining to the second hypothesis showed that all three cognitive skills under investigation predicted a portion of variance independently of the teachers' verbal input (second main effect). These results corroborate findings by Hopp et al. (2018) and Paradis (2011) for working memory, and Sun et al. (2015) for non-verbal intelligence. When looking at the effect sizes, the independent variables, that is, input in combination with each cognitive skill, predict roughly 11-21% of variance. It has often been pointed out that in the social sciences, when predicting behavior, large R² effect sizes are not to be expected because it is impossible for research designs to include all factors relevant to a complex and dynamic social system. The full TIOS scheme containing 41 items and an additional more fine-grained scale shows that verbal input represents just a fraction of the teaching strategies (and, thus, of the multisensory contextual stimuli) present at all times in the classroom. The classroom situation itself represents a highly complex environment that cannot be operationalized in its entirety by an observation scheme. For this reason, according to Ellis (2010), effects in the range that we found can actually be considered medium-sized effects in SLA.
It is interesting, however, that the independent contributions of working memory and non-verbal intelligence disappear in a regression analysis that includes verbal input and all three cognitive variables. Phonological awareness seems to be, statistically, the most robust cognitive predictor, which is in line with the fact that it has the largest effect size among all independent variables (comp. Bialystok 2001; Hopp et al. 2019). It is possible that, especially in such a small sample, there are some statistical suppression effects because of variance shared by the three cognitive factors. This is especially probable since the three cognitive skills are statistically not independent but correlate with each other (Table 5). As small effects tend to become statistically significant in large datasets, it is possible that the effects of working memory and non-verbal intelligence would reach significance in a larger dataset even when all variables are included, but based on our rather small numbers, we cannot make any claims as to that.
With regard to Paradis' (2010) finding that internal factors show a stronger prediction than external ones, our results corroborate this assumption with regard to phonological awareness. When comparing the effect sizes of the other two cognitive skills with verbal input, our results are, however, more in line with the findings by Sun et al. (2015), who found contextual factors to be more important than internal ones. This suggests that differential effects do not only rely on different learning contexts (naturalistic versus instructional), but also on the different cognitive skills in question.
The relationship between verbal input and L2 grammar skills was, thirdly, expected to be moderated by the three cognitive skills under investigation, that is, phonological awareness, working memory, and non-verbal intelligence (Hypothesis 3, interaction effect). Contrary to the hypothesis, no moderating effects could be found. (We have to caution, however, that this does not rule out that a moderation effect might be found in a larger dataset with more statistical power; based on our results, we cannot make any claims about that.) As it is, the data do not lend support to the hypothesis that learners at this stage of development use higher cognitive skills to their advantage for L2 grammar comprehension, at least not with respect to the grammatical phenomena under scrutiny in the ELIAS Grammar Test II. We do not assume that this result is due to insufficient variation of the test structures because a substantial portion of the grammatical phenomena tested was supposed to appear at higher stages, that is, Stages 4 and 5, of the developmental hierarchy (Pienemann 1998; Koch et al. 2021; Lenzing et al. forthcoming). It might, however, result from the fact that the independent variable (teachers' verbal input) did not show strong variation, which might lead to low statistical power. Statistical power is the probability of finding an effect under the condition that it exists (1 − β). The values were extremely low for the moderation effects: the power for the interaction of verbal input (VI) and phonological awareness (PA) is 1 − β = 5.52%, the power for the interaction of VI and working memory (WM) is 1 − β = 5.25%, and the power for the interaction of VI and non-verbal intelligence (NVI) is 1 − β = 7.84%. This means that the absence of a significant interaction does not necessarily rule out moderation, which might just be too small to detect statistically in the present dataset.
To further test the assumed interaction, a systematic (conceptual) replication of the study with a more powerful dataset would be necessary, in which even small effects might become detectable.
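The power values reported for the interaction terms lie only slightly above the significance level of 5%, which is the floor that power approaches when the effect is close to zero. The following sketch illustrates this with a normal approximation for a two-sided test. This is a simplification chosen for illustration; it is not the procedure used to obtain the values reported above, which for regression interactions would typically rely on a noncentral F distribution. The noncentrality values are hypothetical.

```python
from statistics import NormalDist


def approx_power_two_sided(noncentrality, alpha=0.05):
    """Normal-approximation power of a two-sided test at level alpha.

    noncentrality is the true effect divided by its standard error.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    upper_tail = 1 - std_normal.cdf(z_crit - noncentrality)
    lower_tail = std_normal.cdf(-z_crit - noncentrality)
    return upper_tail + lower_tail


# With no effect at all, "power" equals the significance level.
power_if_no_effect = approx_power_two_sided(0.0)

# Hypothetical noncentrality values for a very small and a sizeable effect.
power_small_effect = approx_power_two_sided(0.2)
power_large_effect = approx_power_two_sided(2.8)
```

Since the noncentrality grows with the square root of the sample size, a replication with many more learners, and ideally more classes with differing teacher input, would be needed before an interaction of this size could be detected reliably.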
Another explanation might refer to the particular learning environment in immersion schools. While teaching in conventional language-centered foreign language classrooms, which represent the most common model in Germany, is often predominantly form-focused, immersion classrooms center around the content subjects and often contain only very limited explicit teaching of linguistic form. With their high intensity and comparatively low amount of form-focused instruction, immersion contexts are prone to foster implicit rather than explicit learning (comp. Long 2015). Tagarelli et al. (2011, 2015) report, however, that working memory, for one, seems to take effect in explicit rather than implicit learning contexts. A similar phenomenon was suggested by Sun et al. (2015) with regard to non-verbal intelligence. Cognitive skills may be particularly helpful for learners to understand grammatical patterns when focusing consciously on these patterns, while in contexts where implicit learning is necessary, these cognitive skills have lesser enhancing effects. This line of thought is also consistent with Kersten and Greve (2015), who report a differential effect of cognitive skills in regular versus bilingual school contexts. Qualitatively and quantitatively high input thus shows an independent main effect on grammar comprehension, which learners seem to profit from independently of their cognitive setup. In other words, this result might indicate that immersion contexts foster L2 learning, at least in part, regardless of whether a child has high or low cognitive skills, and might, thus, contribute to some extent to leveling the playing field for disadvantaged learners. A similar differential effect between traditional and immersion contexts can also be found for the impact of socioeconomic status on L2 acquisition, a result that leads Trebits et al. (2021) and Trebits and Kersten (2019) to a similar conclusion.
(This does not rule out that other moderating factors help learners make the most of teachers' input, but such factors were not part of this dataset.) Another aspect that has to be kept in mind is that the direction of effects cannot be interpreted as causal in cross-sectional data. Even though we tested for a particular predictive relationship, that is, whether input and cognitive skills predict the level of grammar skills, the cross-sectional nature of our data would be compatible with three different causal interpretations of the robust relationship we found between verbal input, phonological awareness, and grammar skills. Input and cognition could positively influence grammar skills, high grammar skills could influence teachers' use of input and learners' cognitive skills, or language and cognition could depend on other unrelated variables that were not part of this particular study. The first interpretation is the hypothesis we suggested and tested for in the regression analyses, and our results are compatible with such an interpretation. The second interpretation would also make sense considering that teachers adapt their input according to the comprehension levels of the learners, that is, they use highly varied input because the learners have already reached a very high level in their L2 (Van de Pol et al. 2010), and, furthermore, considering the bilingual advantage hypothesis. As pointed out above, high bilingual language skills have been found to lead to higher cognitive skills with respect to phonological awareness, working memory, and non-verbal intelligence. In that sense, if this direction of effects were substantiated, our results would also be in line with the findings by Bialystok et al. (2014), Poncelet (2013, 2015), Woumans et al. (2016, 2019), and others, who found cognitive advantages for L2 learners in primary immersion programs.
A bilingual advantage has often been found for phonological awareness, which is a cognitive skill that is claimed to show a greater impact on some language-related skills than working memory (Van Kleeck et al. 2006). This could be another reason why we found phonological awareness to be a more robust predictor than the other two skills, a result that is also in line with Werkmeister (2015).
To sum up, while we did not find a moderating effect of cognitive skills, we found that qualitatively and quantitatively rich verbal input and the three cognitive skills under scrutiny have independent predictive effects on L2 grammar comprehension, and phonological awareness and verbal input show the most robust relationships with L2 grammar. Whether these effects have a causal relationship, and whether the direction of the effects lends support to the impact of cognition on SLA, or to the bilingual advantage hypothesis, or both, needs to be corroborated with the help of a different design that would allow for causal interpretations, such as an experimental study, a longitudinal study with a cross-lagged panel design (Adler, forthcoming), or a multilevel study. In addition, a larger sample would provide more statistical power, and more videotaped lessons for each participating teacher would increase the validity of the input measures.

Limitations of the Study
Data collection in schools with a large number of tests for the different variables involved proved to be difficult due to time constraints and organizational matters. For this reason, we were not able to assess all learners with all test instruments. As a result, the number of learners included in the individual analyses varies. For a more comprehensive view, it would be helpful to include a higher number of learners with data for each variable.
The same limitations hold for videography. Because of the teachers' time limits and willingness to be videotaped, different numbers of videos per teacher were recorded. One teacher who taught two classes was included in the dataset. For her, only one video was available. More reliable results on teacher input can be achieved using an average score from a number of different videos for each teacher recorded over a longer period of time to get a more accurate picture of teaching behavior.
In larger datasets, the nested structure of the data, that is, the class structure, calls for multilevel modeling (Kersten and Greve accepted). This was not possible for our dataset due to limited statistical power but would be desirable for future studies with a higher number of participants. Moreover, cross-sectional data, such as was used in this study, does not allow for causal interpretation. For future studies, longitudinal designs that also include input measurements of previous teachers and control for extracurricular L2 input would be preferable.
In the observation of teachers' scaffolding behavior, a reverse effect is possible and, indeed, not unlikely, that is, teachers may use more scaffolds for students with lower language competence (Van de Pol et al. 2010). According to this line of thinking, cross-sectional analyses might look as if a higher amount of scaffolds "predicted" lower language competence. This point was further elaborated in the discussion. Even though this is not the case in our study, we cannot derive causal explanations from our results. For this, longitudinal data is needed, which we intend to collect in the next step of the project.

Conclusions
In this paper, we reported the results from a cross-sectional study with 79 German fourth grade learners of L2 English in two partial immersion schools. We investigated the relationship between the learners' receptive L2 grammar skills, their working memory, phonological awareness, and non-verbal intelligence scores, and the teachers' verbal L2 input in the classroom. Three items operationalizing input frequency and input quality out of a scale of twelve were found to correlate significantly with learners' L2 grammar comprehension.
All of the cognitive skills predicted grammar reception independently of verbal input; however, none of them moderated the effect of input on grammar comprehension. L2 verbal input and phonological awareness were found to be the most robust predictors of L2 grammar comprehension. The lack of a moderating effect of cognitive skills might be an artifact of missing variation of the independent variable (input), but it would also be in line with findings by Tagarelli et al. (2011, 2015) and Sun et al. (2015), who found that cognitive skills may help learners understand grammatical patterns in explicit form-focused learning contexts, while in contexts where implicit learning is necessary, these cognitive skills may have no enhancing effect. It was discussed that the partial immersion programs with high L2 intensity and meaningful content-based teaching in the L2 provide a high amount of implicit learning contexts. In that, they might have the potential to level the playing field for learners with lower cognitive skills (Trebits et al. 2021). Larger datasets with more statistical power are needed to test whether the lack of moderation might have been due to low variation in the independent variable.
Extending the predictions by Paradis (2011) and Sun et al. (2015), a comparison of internal (cognitive) and external (input) factors suggests a differential effect, in that quantity and quality of verbal input were shown to be a more robust predictor than working memory and intelligence but had a smaller effect size than phonological awareness. Thus, differential effects of internal and external factors do not only seem to rely on different learning contexts (naturalistic versus instructional), but also on the different cognitive skills in question.
While cognitive skills were found to correlate with L2 grammar reception, the direction of causal effects is unclear in our cross-sectional design. Our results would also corroborate the predictions of the bilingual advantage hypothesis, which claims that higher bilingual skills lead to specific cognitive advantages. An experimental or longitudinal cross-lagged panel design would be necessary to shed further light on the actual direction of causal effects (Kersten and Greve accepted).

1 For further information see https://www.uni-hildesheim.de/smile/ (accessed on 20 July 2021).
2 For detailed information on the development of the test see Kersten et al. (2012), and Lenzing et al. (forthcoming) for information on the adapted second edition.
3 For further information on the whole scheme see Kersten et al. (2018a, 2018b).
4 Items in the VI scale correlate with each other, which might lead to suppression effects in regression analyses because of shared variance. For that reason, the second regression analysis was restricted to the only items that correlated with the grammar scores.
5 This could be confirmed when combining both with 18 in backward regressions, yielding identical values for 16 and 17 in combination with 18, respectively.