Does Timing in Acquisition Modulate Heritage Children’s Language Abilities? Evidence from the Greek LITMUS Sentence Repetition Task

Abstract: Recent proposals suggest that timing in acquisition, i.e., the age at which a phenomenon is mastered by monolingual children, influences acquisition of the L2, interacting with age of onset of bilingualism and amount of L2 input. Here, we examine whether timing affects acquisition of the bilingual child’s heritage language, possibly modulating the effects of environmental and child-internal factors. The performance of 6- to 12-year-old Greek heritage children residing in Germany (age of onset of German: 0–4 years) was assessed across a range of nine syntactic structures via the Greek LITMUS (Language Impairment Testing in Multilingual Settings) Sentence Repetition Task. Based on previous studies on monolingual Greek, the structures were classified as “early” (main clauses (SVO), coordination, clitics, complement clauses, sentential negation, non-referential wh-questions) or as “late” (referential wh-questions, relatives, adverbial clauses). Current family use of Greek and formal instruction in Greek (environmental), chronological age, and age of onset of German (child-internal) were assessed via the Questionnaire for Parents of Bilingual Children (PABIQ); short-term memory (child-internal) was measured via forward digit recall. Children’s scores were generally higher for early than for late acquired structures. Performance on the three early structures with the highest scores was predicted by the amount of current family use of Greek. Performance on the three late structures was additionally predicted by forward digit recall, indicating that higher short-term memory capacity is beneficial for correctly reconstructing structurally complex sentences. We suggest that the understanding of heritage language development and the role of child-internal and environmental factors will benefit from a consideration of timing in the acquisition of the different structures.


Introduction
There is a general consensus that sentence repetition tasks (SRTs) provide a reliable measure of bilingual children's overall language ability. Most studies investigate the children's L2, which is usually the majority societal language (e.g., Hamann and Ibrahim 2017; Kaltsa et al. 2019); some studies examine the children's heritage language (HL) (Armon-Lotem et al. 2011; Papastefanou et al. 2019; Andreou et al. 2020; Armon-Lotem et al. 2020). Irrespective of whether the children's L2 or HL was tested, the aim has been to determine which environmental and child-internal factors best explain overall performance in the SRT. Accordingly, studies using sentence repetition have not considered the extent to which these factors affect children's performance differently, i.e., dependent on the specific structures that are included in the SRT (but see Armon-Lotem et al. 2011).
In our contribution, we address this lacuna with the following rationale: SRTs typically comprise syntactically complex structures along with syntactically simple structures (see Marinis and Armon-Lotem 2015). Assuming that complex structures are mastered later than simple structures, children's performance in the SRT structures should reflect these differences regarding acquisitional timing. Furthermore, recent proposals on the impact of timing on L2 acquisition (Tsimpli 2014; Schulz and Grimm 2019, see Section 2) suggest that the predictive role of environmental and child-internal factors for acquisition may interact with timing in the HL. In the present study, we examine these expectations by assessing Greek heritage children's performance on the different structures contained in the Greek LITMUS SRT (Chondrogianni et al. 2013). In particular, we ask whether 6- to 12-year-old heritage children perform better on structures mastered early than on structures mastered late in monolingual acquisition of Greek. Moreover, we ask whether factors known to play a role for children's HL abilities affect children's performance in early and late acquired structures to the same extent. For child-internal factors, we consider chronological age (CA), age of onset of the second language (L2-AoO), and short-term memory (STM); for environmental factors, we consider current use of the HL in the family and current exposure to the HL through schooling.
The paper is structured as follows: Section 2 reviews previous findings regarding the effects of the various factors and their interaction with timing for HL acquisition and presents our research questions. Section 3 gives a brief demographic overview of Greek communities in Germany and presents the study and the results obtained. Section 4 discusses our findings and points to some limitations and suggestions for further research.

Child-Internal and Environmental Effects on HL Acquisition and Their Interaction with Timing in Acquisition
A large body of research has advanced our knowledge regarding the range of language abilities that HL speakers show as adults, having grown up as simultaneous bilinguals or as early L2 learners of a socially predominant language (see Polinsky 2018 for an overview). In recent years, this knowledge has been extended by examining the development of HL abilities in childhood and by considering the role of child-internal and environmental factors for the outcome of HL speakers. Among the phenomena that have received particular attention are grammatical gender, case marking, verbal inflection, and vocabulary, as well as overall morphosyntactic ability, which was typically measured via SRTs. In the following, we summarize previous studies that address the factors relevant to the current study (internal: CA, L2-AoO, STM; environmental: HL exposure and use) 1 and discuss whether they speak to timing, i.e., the earliness or lateness of the phenomena under investigation.
Timing in L1 acquisition refers to the age at which a specific phenomenon is mastered in monolingual acquisition. This notion has been proposed in the context of child L2 acquisition (Tsimpli 2014; Grimm and Schulz 2016; Schulz and Grimm 2019) but to date has not been used for HL acquisition. It originates with Tsimpli's (2014) proposal to consider timing in L1 development as a further factor influencing bilingual children's performance in the L2. Tsimpli argues that the acquisition of early phenomena, i.e., those mastered prior to age 4 by monolingual children, is subject to effects of critical periods. Children whose L2-AoO is within these periods will differ from children with a later L2-AoO regarding pace and path of acquisition of early phenomena in the L2. The acquisition of late phenomena, mastered after age 4, is not dependent upon critical period(s) but upon the amount of L2-input. That is, late phenomena are predicted to reveal similar (high or low) performance across bilingual groups with different L2-AoO, differentiating them from monolinguals.
Applying this proposal to early second language learners (eL2) and simultaneous bilingual children (2L1), Schulz and Grimm (2019) found that age of onset effects are indeed modulated by effects of timing in monolingual acquisition: 2L1 children had an advantage over their eL2 peers in early acquired phenomena, which disappeared with time, whereas in late acquired phenomena, 2L1 and eL2 children did not differ. More importantly, and pertinent for the current study on HL, 2L1 children performed like monolingual children in early acquired phenomena but had a disadvantage in the late acquired phenomena with the amount of delay decreasing with time. The latter result could be explained by assuming that reduced input from birth, as is the case in 2L1 acquisition, results in (temporary) disadvantages for late phenomena. As for HL acquisition, this finding offers the following prediction: a higher AoO of the L2 may be more beneficial for the acquisition of late than of early phenomena of the HL.
Regarding chronological age (CA), results are mixed, with some studies showing that HL abilities did not improve as children grew older (Rodina and Westergaard 2017; Chondrogianni and Schwartz 2020) and some studies indicating a positive effect of CA (Gagarina and Klassert 2018; Papastefanou et al. 2019; Rodina et al. 2020). Rodina and Westergaard (2017) report that, when controlling for cumulative length of HL exposure (see Unsworth 2013 for details of this measure of exposure), acquisition of grammatical gender was not predicted by CA in a group of heritage Russian children living in Norway. Similarly, in a study on the production and comprehension of case marking in canonical (SVO) and non-canonical (OVS) sentences in heritage Greek in the USA, Chondrogianni and Schwartz (2020) found no effect of CA once HL use and other background variables were controlled for. This result is at first sight surprising, since the age range of their participants was quite large (between 6 and 12 years). In contrast, Gagarina and Klassert's (2018) study of heritage Russian children in Germany suggests an overall strong positive influence of CA on lexical and grammatical skills in the HL, even when various other background variables are considered. Chronological age was the strongest predictor for all phenomena, except for production of case, which was more dependent on current input than on age. Similarly, in a study of Greek heritage children living in the United Kingdom, Papastefanou et al. (2019) found that 8-year-olds outperformed 6-year-olds in expressive vocabulary, phonological awareness, inflectional morphological awareness, reading decoding skills, and morphosyntactic skills; the only exception was derivational morphological awareness, where both age groups performed alike. The findings for morphosyntactic skills are especially relevant because the participants were tested with the same SRT as in the present study. Finally, Rodina et al. (2020) found that CA was one of the factors that predicted children's correct production of grammatical gender in their heritage Russian across several countries.
Concerning the effects of age of onset of the L2 (L2-AoO), later exposure to the dominant societal language should in general be beneficial for the HL, because in this case, more time is allocated to its development independently of another language. Whereas in simultaneous bilingual children, the two languages are in contact from the beginning, sequential bilingual children have already mastered some phenomena in their HL before they are exposed to the L2. 2 Accordingly, bilingual children with a later AoO of the L2 may show more advanced HL abilities than children with an earlier L2-AoO (see Kupisch and Rothman 2018; Kupisch 2019 for discussion). The majority of studies assessing the role of L2-AoO have provided support for this assumption. One notable exception is the study by Armon-Lotem et al. (2011) who found that the development of lexical and grammatical skills in heritage Russian children living in Germany and Israel was not influenced by AoO of the majority language. Subsequent studies (Janssen et al. 2015; Gagarina and Klassert 2018; Armon-Lotem et al. 2020; Rodina et al. 2020) revealed consistent advantages of later L2-AoO for HL acquisition. Janssen et al. (2015) found that heritage Russian children could better exploit case cues for understanding non-canonical OVS sentences in Russian if they had a later AoO of the L2 Hebrew or Dutch, which exhibit sparse inflectional case morphology and rigid SVO word order. Likewise, Gagarina and Klassert (2018) report that later AoO of the L2 positively influenced children's performance regarding expressive vocabulary and case production, but not regarding receptive vocabulary and production of verb inflections. According to the authors, these discrepancies are attributable to differences in transparency: phenomena with transparent form-function mapping such as verbal inflections in Russian can be mastered even with reduced L1 exposure.
Accordingly, the presence of an L2 and the concomitant switch in input dominance do not affect performance negatively. On the other hand, acquisition of untransparent systems such as the Russian case system requires extensive L1 exposure and is negatively affected by input reduction due to L2 exposure. Put differently, acquisition of case, which is a late acquired phenomenon, seems to profit from a higher AoO of the L2, i.e., from a longer period of exclusive exposure to the HL. Note that the difference between receptive and productive vocabulary remains unaccounted for under this explanation. Armon-Lotem et al. (2020) studied English heritage children living in Israel and found that children who were exposed to the majority language Hebrew after age 2;0 had better HL English lexical and grammatical skills than children with an AoO of Hebrew before age 2;0. Finally, Rodina et al. (2020) studied heritage Russian children growing up in different countries. Their findings revealed a positive relation between AoO of the majority language and development of a target gender system in the HL. If HL children were exposed early to the majority language, they were likely to develop a bi-partite system or a system with no gender distinctions instead of the target Russian tri-partite system.
With regard to effects of short-term memory (STM) on HL development, results are mixed as well. Haman et al. (2017) showed that scores in a forward digit recall task were the strongest predictor of Polish-English bilingual children's overall language performance in their HL Polish, measured by an SRT. In light of the nature of both tasks, it is not surprising that the verbatim (forward) recall of sentences required in an SRT profits from good forward digit recall skills. Accordingly, relations between memory and HL acquisition are likely to depend on the specific tasks used. The studies by Andreou et al. (2020) and by Dosi and Papadopoulou (2019) are cases in point: Andreou et al. (2020) used complex working memory measures (via backward digit recall and a rotating figure task) and found no effect on overall SRT performance in heritage Albanian. Similarly, in the study by Dosi and Papadopoulou (2019), there was no significant relation between scores in an updating task and production and comprehension of verbal aspect in heritage Greek.
Turning to environmental factors, that is, input in the HL, it needs to be acknowledged that studies have measured HL exposure either as the current use of the HL at the time of testing or as cumulative exposure up to the time of testing. For reasons of space, we focus on the synchronic dimension and distinguish between current use in the family and formal instruction.
Regarding current use in the family, many studies have reported positive effects (Gagarina and Klassert 2018; Daskalaki et al. 2019; Papastefanou et al. 2019; Chondrogianni and Schwartz 2020; but cf. Rodina and Westergaard 2017). Gagarina and Klassert (2018) found the current amount of HL input by parents and siblings to predict performance in receptive and productive vocabulary and production of case, but not performance in production of verb inflections, which they argue are easy to acquire, as mentioned above. Daskalaki et al. (2019) studied how Greek-English bilingual children acquiring Greek as a heritage language master subjects in syntax-discourse contexts (subject realization with non-topic shift, subject placement with wide focus) and in syntactic contexts (placement in embedded interrogatives). Results revealed a non-linear relationship between the amount of heritage language use in the home and the three structures. Subject realization (in non-topic shift contexts) was unproblematic, with children performing target-like when Greek made up approximately 40% of their home input. Subject placement was more difficult: for embedded interrogatives (syntax proper), ceiling performance was reached with approximately 75% Greek input at home, and for wide focus (syntax-discourse), even 75% HL use at home did not ensure target-like performance. According to the authors, these results indicate that the nature of the phenomena (syntax-discourse interface or syntax proper) does not align with their sensitivity to amount of HL exposure. Papastefanou et al. (2019) reported that the proportion of current HL use at home correlated positively with children's expressive vocabulary and overall morphosyntactic ability in heritage Greek, but not with phonological and morphological awareness and with reading decoding. Again, most relevant for the present study is the finding regarding morphosyntactic ability, since children were tested with the same Greek SRT as our participants.
In a further study on heritage Greek, Chondrogianni and Schwartz (2020) found that performance on case marking in production and comprehension was positively related to increased current HL use in the family over and above effects of age. Finally, Rodina and Westergaard (2017) measured both current exposure and cumulative length of exposure to the HL: current exposure (both in and outside the family) had no effect on gender acquisition in heritage Russian once cumulative length of exposure was controlled for, suggesting that effects of current exposure may result from effects of exposure over time.
Few studies have examined the role of formal instruction in HL acquisition, with mixed results (Dosi and Papadopoulou 2019; Andreou et al. 2020; Bongartz and Torregrossa 2020; Rodina et al. 2020; see also Rinke and Flores 2021 for the role of formal instruction compared to exposure to the colloquial register). As for verbal aspect in Greek-German children in Germany, Dosi and Papadopoulou (2019) found that the educational setting influenced children's performance in both comprehension and production: children attending a Greek-dominant school outperformed their age peers from a German-dominant school. Bongartz and Torregrossa (2020) analyzed discourse abilities such as story length, syntactic complexity, and lexical diversity in Greek and in German narratives of children who attended either one of the two schools featuring in Dosi and Papadopoulou's (2019) study or German mainstream schools with afternoon weekly Greek classes. The children from the Greek-dominant school outperformed their peers from the German mainstream school in all aspects examined, and children from the bilingual German-dominant school exhibited an intermediate performance between the other two groups. Similarly, Rodina et al. (2020) report that Russian heritage children with more weekly hours of instruction in the HL were more accurate in production of gender in Russian. On the other hand, Andreou et al. (2020) did not find any effect of HL instruction on the overall morphosyntactic ability of mono- and biliterate Albanian children living in Greece: children attending weekly Albanian classes did not perform better in the Albanian LITMUS SRT than children who did not attend those classes.
In summary, previous research indicates that the effects of environmental and child-internal factors for the development of HL children differ according to the domains and phenomena under investigation. Recall, for example, the impact of L2-AoO and current input in heritage Russian on verb inflections compared to case marking. Here, we suggest that the explanations given in terms of irregularity, non-transparency, easiness, or domain of the phenomenon may fall under the umbrella notion of timing. That is, timing in monolingual acquisition may be one of the aspects responsible for the finding that environmental and child-internal factors play different roles for different phenomena. The current study takes up this issue by examining the following questions: (Q1) Does bilingual children's performance in their HL differ depending on the earliness/lateness of the structures targeted in sentence repetition tasks? (Q2) To what extent do child-internal (chronological age, AoO of the L2, short-term memory) and environmental (current use of HL in the family, current weekly hours of HL instruction) factors account for children's performance in the SR structures, and does timing in acquisition interact with these factors?

The Present Study
The present study focuses on the acquisition of Greek as a heritage language in Germany. According to the most recent data of the German Federal Statistical Office (Statistisches Bundesamt/DESTATIS 2020), 453,000 individuals with a Greek migration background reside in Germany, making up ca. 2.1% of the inhabitants with a migration background. In total, 40,000 individuals with a Greek migration background are residents in the federal state of Hesse, where the present study was conducted. In the school year 2019/2020, there was a total of 32,424 pupils with Greek citizenship attending a school in Germany; 23,889 attended general schools and 8,535 vocational schools. Note that these numbers refer to pupils with non-German citizenship, excluding those pupils who have a Greek migration background and are German citizens. Accordingly, the total number of children and teenagers with a Greek migration background residing in Germany is very likely to be much higher.

Participant Information
Twenty-seven typically developing Greek-German bilingual children (Age: 6;0-12;8, Mean = 9;1, SD = 1;11) living in the Frankfurt metropolitan area participated in the current study as part of a larger project (see the description in Makrodimitris and Schulz 2021). Typical development was ensured through parental and teacher information and, where norms were available, via the standardized language test LiSe-DaZ (Schulz and Tracy 2011). Children were included in the study if they had at least one parent who is a native (heritage) speaker of Greek and used Greek with the child from birth. Age of onset of German (L2-AoO) varied from 0;0 to 3;10 (Mean = 1;4, SD = 1;4).
Participant recruitment took place in three different school types (similar to those in Bongartz and Torregrossa 2020, see Section 2). The majority of children (17 out of 27) attended a German state primary school that offers three to five hours of weekly instruction in Greek, depending on pupils' grade. For an additional five to seven hours per week, several subjects (mathematics, science, music) are taught in tandem by a German and a Greek teacher; each teacher presents and discusses the lesson contents in their language. Two children attended a Greek state primary school, where Greek is the main medium of instruction and German is taught as a second language four hours a week. Finally, eight children attended monolingual German state schools; three of them additionally participated in afternoon Greek language classes offered by the Greek government or the state of Hesse for two to four hours a week. Notably, this variability regarding amount and type of input in heritage Greek and in the majority language German is representative of the options that pupils of Greek origin have in Germany, with availability of each option being subject to regional differences.

Materials
The Questionnaire for Parents of Bilingual Children (PABIQ), developed in the COST Action IS0804 (see Tuller 2015), was used to obtain information regarding children's chronological age, AoO of the L2, and their current patterns of language use in the family. The latter variable measures use of Greek and German with mother, father, other adult family member(s), and sibling(s). A total of 16 points is awarded, divided between the two languages. Information about the amount of current formal instruction in Greek was obtained from the teachers: they were asked how many hours a week children received instruction in Greek or had Greek as the medium of instruction. 3 Children's short-term memory was measured via a German forward digit recall task (Hasselhorn et al. 2012). The first of ten test items contains two, three, or four digits, depending on the child's age. Every time the child correctly repeats two successive items of the same length, the subsequent test item contains one digit more. The final score is the mean length of the last eight items.
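The adaptive procedure of the forward digit recall task described above can be illustrated with a minimal Python sketch. It is not the implementation of the published test: the function names are hypothetical, and the text does not say what happens after an incorrect repetition, so the sketch assumes that the item length stays the same and that the count of successive correct items restarts. The response pattern is invented for illustration.

```python
from typing import List, Sequence


def run_digit_recall(start_length: int, responses: Sequence[bool]) -> List[int]:
    """Return the digit length of each presented test item.

    responses[i] is True if item i was repeated correctly.
    Assumption (not stated in the text): after an error the length stays
    the same and the count of successive correct items restarts.
    """
    lengths = []
    length = start_length
    streak = 0  # successive correct items at the current length
    for correct in responses:
        lengths.append(length)
        streak = streak + 1 if correct else 0
        if streak == 2:
            length += 1  # the subsequent item contains one digit more
            streak = 0
    return lengths


def span_score(lengths: Sequence[int]) -> float:
    """Final score: mean length of the last eight items."""
    last_eight = lengths[-8:]
    return sum(last_eight) / len(last_eight)


# Hypothetical child starting with three digits, ten test items.
responses = [True, True, True, True, False, True, True, False, False, True]
lengths = run_digit_recall(3, responses)
score = span_score(lengths)
print(lengths, score)
```

Because the score averages item lengths rather than counting correct responses, it reflects how far the child progressed through the staircase.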
To assess children's performance across a range of structures, we used the Greek LITMUS Sentence Repetition Task (Greek SRT) developed in the COST Action IS0804 (Chondrogianni et al. 2013). It consists of 32 sentences targeting the following structures (4 items per structure): main clauses (SVO), sentential negation (NEG), coordination of main clauses (COOR), clitics (CL) in clitic doubling and clitic left dislocation contexts, complement clauses (COMPL), referential and non-referential object wh-questions (WH_ref, WH_non-ref), adverbial clauses (ADV), and subject and object relatives (RC). All sentences are matched for length (in syllables) and word frequency. Examples for each structure are given in examples 1 to 8 below; the structure type is noted after the English translation.

1. O manavis pulise tis orimes fraules stin aγora poli fθina.
   the grocer sold the ripe strawberries in-the market very cheaply
   "The grocer sold the ripe strawberries in the market very cheaply." (SVO)
2. O proponitis ðen elpizi na kerðisi i omaða tu simera.
   the trainer not hopes that wins the team his today
   "The trainer does not hope that his team wins today." (NEG)
3. I mama majirepse makaronia ke i jiajia eftiakse mia pita.
   the mother cooked pasta and the grandmother made a pie
   "The mother cooked pasta and the grandmother made a pie." (COOR)
4. To koritsi tin entise tin kukla tu me omorfa foremata.
   the girl it-CL dressed the doll its with beautiful dresses
   "The girl dressed its doll with beautiful dresses." (CL, clitic doubling)
5. I nosokomes ipan oti i ptisi tu jiatru exi kathisterisi.
   the nurses said that the flight of-the doctor has delay
   "The nurses said that the doctor's flight is delayed." (COMPL)
6. a. I ðaskala ðen ine siγuri pjo vivlio ðiavase i maθitria.
   the teacher not is sure which book read the pupil
   "The teacher is not sure which book the pupil read." (WH_ref)
7. O jitonas aγorase to aftokinito prin pulisi to mikro spiti.
   the neighbor bought the car before sold the small house
   "The neighbor bought the car before he sold the small house." (ADV)
8. O tzitzikas ðiavaze ena vivlio pu eγrapse o vasilias tis zunglas.
   the cicada was-reading a book that wrote the king of-the jungle
   "The cicada was reading a book that the king of the jungle wrote." (RC)
The sentences are presented in three blocks; two blocks contain 11 sentences each and one contains 10; each structure appears at least once within each block, in random order (see Kaltsa et al. 2019 for a detailed description of the task).
For the purposes of the present study, the SRT structures were classified according to the factor "timing in L1 acquisition", defined as the age at which a specific phenomenon is mastered in monolingual acquisition. Following the classification by Tsimpli (2014) and Schulz and Grimm (2019), we define early acquired phenomena as those that are mastered prior to age 4 by monolingual children acquiring Greek and late acquired phenomena as those that are mastered after age 4. In line with the authors, early phenomena are compared to late phenomena. 4 The age of mastery for each of the structures in the Greek SRT was inferred from previous observational and experimental studies (production and comprehension) with young monolingual Greek-speaking children (see Table 1). This procedure was chosen because, unlike in the study by Schulz and Grimm (2019), no normed data are available for the Greek SRT. 5 We acknowledge two important limitations of this procedure, related to the specifics of the method and of the structures tested. First, the method of sentence repetition is related to but different from both production and comprehension tasks. SRTs involve language processing at all levels of representation in both comprehension and production, including morphosyntactic, semantic, and phonological aspects, and also several working memory components. That is because successful repetition requires the ability to process, decode, reconstruct, and produce the heard sentence based on one's own grammatical knowledge (see Marinis and Armon-Lotem 2015). Repetition has indeed been found to correlate highly with both comprehension and production (e.g., Ruigendijk and Friedmann 2017); at the same time, repetition scores are typically higher than comprehension scores, for children may be able to repeat sentences they do not interpret target-like (Willis and Gathercole 2001; Frizelle et al. 2017).
The second limitation concerns the fact that age of mastery for a given structure may differ greatly depending on which specific aspect of this structure is examined. This holds especially for phenomena such as negation or clitics, which involve many different aspects that are acquired at different points in development.
In the following, we provide a summary of the existing studies in monolingual Greek, with a focus on the specific aspects of the structures as they are presented in the Greek SRT. For the purpose of the present study, we considered results from production and comprehension, where available, and classified a structure as late if it was mastered late in production, in comprehension, or in both. The ages of mastery we inferred from existing production and comprehension studies should be understood as an approximation. According to our literature search, five structures (SVO, NEG, COOR, CL, COMPL) were examined in production only; for the remaining structures (WH_ref, WH_non-ref, ADV, RC), results from both production and comprehension exist. Table 1 provides the classification of the structures and the respective studies, which are described below.
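The classification rule just stated can be made explicit in a short sketch. The Python function below is an illustration only, not the authors' procedure: the function name, the representation of ages in years, and the treatment of the boundary case (mastery at exactly age 4 counts as early here) are assumptions, and the ages in the usage lines are placeholders rather than values taken from Table 1.

```python
from typing import Optional

# Assumption: mastery at or before this age counts as "early".
EARLY_CUTOFF_YEARS = 4.0


def classify_timing(production_age: Optional[float],
                    comprehension_age: Optional[float]) -> str:
    """Classify a structure as 'early' or 'late'.

    Ages are approximate ages of mastery in years; None means that no
    data exist for that modality. A structure counts as late if it is
    mastered late in production, in comprehension, or in both.
    """
    ages = [a for a in (production_age, comprehension_age) if a is not None]
    if not ages:
        raise ValueError("need at least one age of mastery")
    return "late" if max(ages) > EARLY_CUTOFF_YEARS else "early"


# Placeholder ages for illustration only.
production_only = classify_timing(3.0, None)   # production data only
asymmetric = classify_timing(3.0, 6.0)         # early production, late comprehension
print(production_only, asymmetric)
```

Taking the later of the two ages means that a production-comprehension asymmetry always yields a late classification.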
Turning first to the structures for which only production data are available, SVO structures are classified as early: monolingual Greek children produce SVO from very early on and are familiar with the semantic and discourse-related features as well as with the stress patterns of the different word orders of Greek prior to age 3 (Tsimpli 2005; Kapetangianni 2010). According to production studies, by age 3, children are also aware of the different negation particles (ðen, factual, and min, non-factual) and do not omit them in obligatory contexts (Baslis 1994; Stephany 1997). Clausal coordination with ke ("and") has been reported to become productive between ages 2 and 3 (Baslis 1994; Stephany 1997). Clitic doubling has been found to occur around age 2, followed by the emergence of clitic left dislocation (CLLD). Notably, children do not omit or misplace the clitics (Marinis 2000; Tsimpli 2005) and adhere to their respective discourse constraints; for example, CLLD is used for topicalization but not for focusing (Tsimpli 2005). Finally, sentential complements with oti ("that", declarative) and pu ("that", factive), the two complementizers used in the Greek SRT, emerge already before age 3 (Stephany 1997; Katis and Stampouliadou 2009), and three-year-olds have been found to use them correctly and without omissions in obligatory contexts (Mastropavlou and Tsimpli 2011). Now we turn to the structures for which both production and comprehension data are available. Non-referential object wh-questions with ti ("what") are productively used by children before age 2 (Stephany 1997; Tsimpli 2005). A comprehension study with children aged 4;0 to 5;4 found adult-like comprehension of these wh-questions (97% accuracy, Varlokosta et al. 2015); we take the ceiling performance to suggest that mastery occurs already by age 4.
Referential object wh-questions, i.e., which-questions, are unattested in early production (Stephany 1997; Tsimpli 2005) and are still difficult in comprehension for four- to five-year-old children (76% accuracy, Varlokosta et al. 2015). 6 Adverbial clauses are acquired late as well: of the four adverbial conjunctions used in the Greek SRT (temporal otan-"when", afu-"after", prin-"before", concessive eno-"although"), two conjunctions (afu, prin) are not attested before age 4 in spontaneous speech; the other two (otan, eno) are used very infrequently (Baslis 1994; Stephany 1997). Comprehension of adverbial sentences with afu, prin, and eno is mastered late and still developing during primary school years (Papakonstantinou 2015). Finally, in the case of relative clauses, production and comprehension results from Greek differ regarding age, in line with studies from many other languages. Three-year-olds have been reported to use relative clauses correctly in their spontaneous speech (Mastropavlou and Tsimpli 2011), and at age 3 and 4, children are able to produce subject and object relatives in elicited production (Varlokosta and Armon-Lotem 1998). Comprehension studies, in contrast, indicate that children do not master relative clauses before age 6 (Guasti et al. 2012; Stavrakaki et al. 2015; Varlokosta et al. 2015). Guasti et al. (2012), for example, report 68% accuracy for subject relatives and 49% for object relatives in 4- to 6-year-olds. Accordingly, relative clauses were classified as late acquired.
The resulting classification of the structures (see Table 1) is generally in agreement with cross-linguistic evidence: SVO, clitics⁷, and clause-level coordination are reported to be mastered early, whereas adverbial and relative clauses, especially object relatives, are found to be mastered late (see also Tsimpli 2014 for an overview). Likewise, the earlier acquisition of non-referential compared to referential wh-questions has been attested across many languages (Sauerland et al. 2016). Two structures call for further elaboration: negation and complements. Negation was classified as early based only on production, because comprehension studies on monolingual Greek are lacking. Findings from comprehension of sentential negation in German, however, show that target-like comprehension of negation develops late, around age 6 (see Schulz and Grimm 2019), which would suggest a production-comprehension asymmetry similar to the case of relative clauses. The meaning of sentential negation is likely to be the same across languages; hence, sentential negation may in fact be a late acquired phenomenon in Greek, just like in German. Sentential complements were also classified as early based on production, due to the lack of comprehension data from monolingual Greek. Comprehension studies focusing on the semantic properties of sentential complements in German and English suggest that target-like comprehension may develop later than age 3 or 4 (i.a., because of factivity, see Schulz 2003). Again assuming that their meaning is invariant across languages, sentential complements may in fact be a late acquired phenomenon in Greek. We return to these points in the discussion of the results.

Data Analysis
Scoring of the SRT responses was based on children's realizations of the intended structure. Children received 1 point if they repeated the target structure correctly and 0 points if the target structure was not present (see Marinis and Armon-Lotem 2015). For example, in an SVO sentence as in example (1) above, omission of an adjunct does not alter the target structure, whereas a repetition without the object NP counts as a structure-related error and receives 0 points. Changes in word order are disregarded if the basic SVO structure remains intact; alterations such as VSO count as errors. Similarly, in referential wh-questions as in (6a), substitution of the main predicate with a similar one (e.g., doesn't know instead of is not sure) is disregarded, but omission of the lexical NP within the wh-phrase would change the question type and counts as an error.
Note that there are further coding schemes available for SRTs, which are based on grammaticality and accuracy (i.e., verbatim repetition) rather than on target structure. To illustrate the difference, consider a response in which the lexical NP within the wh-phrase in (6a) is omitted. In this case, the child's response is still grammatical; hence, the grammaticality score (1 for grammatical, 0 for ungrammatical) would not capture the violation of the target structure. Accuracy coding awards a maximum of 3 points per sentence and subtracts 1 point for every word that is omitted or substituted. This score does not reflect whether substitutions/omissions affect the target structure or not. For example, if the child in her repetition of (6a) replaces book with text, she would be penalized, although the target structure remains intact. Here, we chose the coding for target structure because we consider this most informative regarding the focus of the present study: children's ability to use the specific structures.
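The contrast between the three coding schemes can be made concrete with a minimal sketch. The `Response` record and its fields are hypothetical annotations introduced only for illustration, and the floor of 0 on the accuracy score is an assumption (the text specifies only a maximum of 3 points and a deduction of 1 point per omitted or substituted word).

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One coded repetition of an SRT item (hypothetical annotation fields)."""
    target_structure_intact: bool  # was the intended structure realized?
    grammatical: bool              # is the child's sentence grammatical?
    words_changed: int             # number of words omitted or substituted


def score_target_structure(r: Response) -> int:
    # Coding used in the present study: 1 if the target structure is present.
    return 1 if r.target_structure_intact else 0


def score_grammaticality(r: Response) -> int:
    # 1 for a grammatical response, 0 for an ungrammatical one.
    return 1 if r.grammatical else 0


def score_accuracy(r: Response) -> int:
    # Maximum of 3 points, minus 1 per changed word.
    # Assumption: the score cannot drop below 0.
    return max(0, 3 - r.words_changed)


# Two responses to a referential wh-question such as (6a):
# (a) the lexical NP inside the wh-phrase is omitted (question type changes),
# (b) "book" is replaced with "text" (target structure intact).
omit_np = Response(target_structure_intact=False, grammatical=True, words_changed=1)
swap_noun = Response(target_structure_intact=True, grammatical=True, words_changed=1)

for label, r in [("omit NP", omit_np), ("swap noun", swap_noun)]:
    print(label, score_target_structure(r), score_grammaticality(r), score_accuracy(r))
```

Under these assumptions, the two responses receive identical grammaticality and accuracy scores and differ only in the target-structure score, which is the distinction the coding decision rests on.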

Performance across Early and Late Acquired Structures in the SRT
Children reached a mean score of 19.41 out of a maximum of 32 (Range: 2-32, SD = 9.06) in the Greek SRT. For further analyses, raw scores were converted into proportions because there were two tokens each for the two types of wh-questions and four tokens each for all other structures. Figure 1 illustrates the mean proportions for each structure; as expected, performance was not uniform across structures. As can be inferred from Figure 1, at the group level, there were neither floor nor ceiling effects. Descriptively, seven of the nine structures are in line with the classification inferred from previous studies as "early" and as "late", respectively (see Table 1). Children's scores for the early structures COOR, WH_non-ref, SVO, and NEG are higher than their scores for the late structures WH_ref, ADV, and RC. A repeated-measures ANOVA with structure as the within-subjects factor revealed a main effect of structure (F(8, 208) = 6.882, p = 0.001, partial η² = 0.209). Pairwise comparisons with Bonferroni correction showed that performance in COOR was significantly higher compared to COMPL (p = 0.031), ADV (p = 0.031), RC (p = 0.031), and CL (p = 0.001). In addition, CL was significantly lower than WH_non-ref (p = 0.014), SVO (p < 0.001) and NEG (p = 0.001). No other comparison was significant.
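The conversion to proportions and the reported effect size can be illustrated with a short sketch. The token counts follow the description above; the raw scores for the example child are invented for illustration. The effect-size helper relies on the standard identity that partial η² equals F·df_effect / (F·df_effect + df_error), so it only checks that the reported statistics are mutually consistent and does not reproduce the ANOVA itself.

```python
# Tokens per structure as described in the text: two for each type of
# wh-question, four for every other structure (7 * 4 + 2 * 2 = 32 items).
TOKENS = {
    "SVO": 4, "NEG": 4, "COOR": 4, "CL": 4, "COMPL": 4,
    "WH_non-ref": 2, "WH_ref": 2, "ADV": 4, "RC": 4,
}


def to_proportions(raw_scores):
    """Convert raw per-structure scores to proportions of available tokens."""
    return {s: raw_scores[s] / TOKENS[s] for s in TOKENS}


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


# Hypothetical raw scores for a single child (not data from the study).
child = {
    "SVO": 4, "NEG": 3, "COOR": 4, "CL": 2, "COMPL": 3,
    "WH_non-ref": 2, "WH_ref": 1, "ADV": 2, "RC": 1,
}
props = to_proportions(child)

# A raw score of 1 on WH_ref and a raw score of 2 on ADV are the same proportion.
print(props["WH_ref"], props["ADV"])

# Consistency check on the reported ANOVA values.
eta = partial_eta_squared(6.882, 8, 208)
print(round(eta, 3))
```

Working with proportions keeps the two-token wh-question structures comparable with the four-token structures, which raw scores would not allow.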

Predictors of SRT Performance and Their Interaction with Timing
First, we calculated the outcomes for the following background variables: current use of Greek in the family (CUF-Greek), weekly instruction in Greek (INS-Greek), and STM. Children had a mean score of 9.89 out of 16 (Range: 3-16, SD = 4.06) for CUF-Greek, which indicates that for many of the children, Greek is the dominant language in their home environment. At the time of testing, children received a mean of 7.0 h of instruction per week in the Greek language or with Greek as the medium of instruction (Range: 0-25, SD = 6); three children had no literacy skills in Greek, and another two were currently not attending any Greek classes but had done so in the past (in both cases, INS-Greek was coded as 0). In the forward digit recall task, which measures STM, children received a mean score of 4.26 (Range: 2.75-5.75, SD = 0.74).
In order to determine which background variables (CA, L2-AoO, CUF-Greek, INS-Greek, STM) predict children's performance and to uncover the potential contribution of timing, we ran two separate regression analyses, one for the three late structures (WH_ref, ADV, RC) and one for three early structures (COOR, WH_non-ref, SVO). Three out of the six early structures were selected in order to achieve equal statistical power for early and late structures; coordination, non-referential wh-questions and SVO were chosen because, unlike negation and complements, they have been unequivocally documented to be acquired early (see Section 3.2). Table 2 describes the correlations between the two outcome variables and the background variables.

Table 2. Pearson correlations (one-tailed) between outcome and background variables.

CUF-Greek showed a strong positive correlation with both outcome variables; L2-AoO was moderately correlated with the scores for both early and late structures. Moreover, STM correlated weakly with the score for late structures. In the two regression models, the background variables that correlated significantly with the respective outcome variable were simultaneously entered as predictors. Table 3 summarizes the two models.

In the model for early structures, the variables CUF-Greek and L2-AoO were entered into the model; only CUF-Greek (β = 0.617, p = 0.001) was a significant predictor. In the model for late structures, CUF-Greek, L2-AoO, and STM were entered into the model, and CUF-Greek (β = 0.619, p < 0.001) as well as STM (β = 0.373, p = 0.008) were significant predictors.
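Why a predictor that correlates with the outcome can lose its effect once a stronger, correlated predictor is entered simultaneously can be sketched for the two-predictor case. With standardized variables, each regression weight follows from the three pairwise correlations. The correlation values below are invented for illustration only; they are not the coefficients from Table 2, and the function name is hypothetical.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized weights for a regression of y on two predictors.

    r_y1, r_y2: correlations of each predictor with the outcome.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    if denom <= 0:
        raise ValueError("predictors are perfectly collinear")
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Hypothetical correlations (NOT taken from the study):
# predictor 1 plays the role of CUF-Greek, predictor 2 the role of L2-AoO.
r_outcome_cuf = 0.70
r_outcome_aoo = 0.40
r_cuf_aoo = 0.45

beta_cuf, beta_aoo = standardized_betas(r_outcome_cuf, r_outcome_aoo, r_cuf_aoo)
print(round(beta_cuf, 3), round(beta_aoo, 3))
```

In this invented scenario the weight of the weaker predictor falls well below its zero-order correlation, because part of its association with the outcome is shared with the stronger predictor. If the two predictors were uncorrelated, each weight would simply equal the corresponding correlation.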

Discussion and Conclusions
We tested 27 Greek-German bilingual children with a mean age of 9;1 in their HL Greek to examine whether bilingual children's performance in their HL differs regarding the earliness/lateness of the nine structures targeted in the Greek LITMUS SRT (Q1) and to which extent child-internal and environmental factors impact performance in these structures differently depending on their timing (Q2). In a first step, we estimated the ages of mastery from existing studies in monolingual Greek (based mostly on production). The structures SVO, NEG, COOR, CL, COMPL, and WH_non-ref were classified as "early", i.e., acquired before age 4, and WH_ref, ADV, and RC were classified as "late". Coding of the SRT responses was based on "repetition of the target structure", which we considered most informative regarding children's ability to reconstruct the specific structures. The children's mean scores in the SRT were compared using a repeated-measures ANOVA; two separate linear regression analyses were conducted to determine the factors that influence performance in early and in late acquired structures.
With regard to (Q1), i.e., whether timing in monolingual acquisition affects how children master the respective early and late structures in their heritage language Greek, our results suggest the presence of partial effects. The scores of most early structures (SVO, NEG, COOR, WH_non-ref) were higher than the scores of the three late structures (WH_ref, ADV, RC). Significant differences, however, were only found for sentence coordination (COOR), which was easier than the late structures adverbial clauses (ADV) and relative clauses (RC). Children's high performance on COOR confirms the findings by Kaltsa et al. (2019) on monolingual Greek 8-to 10-year-olds who were tested with the same SRT. Notably, scores for clitics (CL), which are supposed to be acquired early, were significantly lower than all other early structures in our study, except for complement clauses (COMPL). This result is at first sight unexpected but confirms previous empirical findings involving the same task: in the study by Kaltsa et al. (2019), both monolingual Greek and bilingual Albanian-Greek children had most difficulty with the sentences containing clitics. Clitic omission was the most frequent error type, and this pattern is found in the present study as well. We conclude that child heritage learners of Greek do not have exceptional difficulty with the use of clitics; their low performance seems to result from task effects related to the discourse salience of clitics in the target sentences (see also Kaltsa et al. 2019 for discussion). We acknowledge that the classification as early/late is necessarily preliminary, derived mostly from production studies. Future comprehension studies could contribute to the map of timing of acquisition in monolingual Greek, possibly resulting in a reclassification of some of the early structures. 
Two cases in point are sentential negation and sentential complements, where comprehension findings from other languages suggest that these structures may not be as early as the production data tell us.
Notably, the 9-year-old heritage Greek children in our study did not reach ceiling performance in any of the structures. Low scores on early structures in the SRT may mean that children had difficulty in general in their heritage language because of their L2 input (see Section 2). However, relatively low scores may also be due to the specific design of the SRT. Recall that in the Greek LITMUS SRT, sentences were long enough to prevent passive copying, with all items being matched for length to control for memory effects. This made it possible to disentangle the difficulty of a structure from the factor memory, but it may have made the task relatively difficult. Future research on younger monolingual Greek children could help to answer the question of at what age ceiling performance appears.
Turning to (Q2), concerning the extent to which environmental and child-internal factors account for children's performance and whether timing in acquisition interacts with these factors, we discuss each of the factors in turn. Chronological age showed no correlation with either early or late structures, despite the fact that children's age ranged between 6 and 12. This general lack of age effects adds to previous research showing that HL language abilities may not improve linearly with age; they may be influenced by other factors such as HL use in the family, which may vary independently of age (e.g., Chondrogianni and Schwartz 2020; Rodina and Westergaard 2017). The fact that some studies reported effects of chronological age (Gagarina and Klassert 2018; Papastefanou et al. 2019; Rodina et al. 2020) highlights the variable role of age in HL acquisition. This may even hold when the same outcome measure is used. Papastefanou and colleagues, for example, found overall performance in the Greek SRT to improve with age in Greek heritage children.⁸
The amount of current formal instruction in Greek showed no correlation with either early or late structures, in line with the findings of Andreou et al. (2020), but in contrast to several other studies (Dosi and Papadopoulou 2019; Bongartz and Torregrossa 2020; Rodina et al. 2020). Further research is needed to better understand the role of formal input at school. It may be that the quality of instruction plays a larger role than the amount of instruction. For example, children may be taught with different types of textbooks, progressing slower or faster (for a textbook specifically designed for heritage language instruction, see, e.g., Hessisches Kultusministerium 2008), and by teachers with different qualifications. Moreover, effects of schooling may be seen more clearly over the years (see Kupisch and Rothman 2018 for an overview), a dimension not captured in the present study.
Current use of Greek in the family was highly correlated with early and with late structures and was the strongest predictor of both. This is in line with the majority of previous studies showing increased current use of the HL at home to be related to more advanced HL outcomes (e.g., Gagarina and Klassert 2018; Daskalaki et al. 2019; Chondrogianni and Schwartz 2020; also Papastefanou et al. 2019, for the outcome measure Greek SRT). Our results indicate that, irrespective of the age at which children come in contact with the majority language, sustained HL exposure in the family is crucial for HL acquisition and maintenance. It is still unclear, however, under which circumstances effects of cumulative exposure to the HL may override effects of current use (see Rodina and Westergaard 2017).
Age of onset of the L2 (L2-AoO) was moderately correlated with early and with late structures but showed no effects in the two regression analyses. The absence of effects of L2-AoO on HL development confirms the findings of Armon-Lotem et al. (2011) but contrasts with many studies reporting positive effects of later exposure to the L2 (Janssen et al. 2015; Gagarina and Klassert 2018; Armon-Lotem et al. 2020; Rodina et al. 2020). More research is needed to explain these differences, e.g., by targeting the same structures and/or by using the same statistical model. As a case in point, we found that L2-AoO was a significant predictor in both models if it was entered as a single factor into the regression models for early and for late structures. When adding the factor current exposure in the family (CUF-Greek) to the respective models, however, the effect of L2-AoO was outweighed by the factor CUF-Greek, because the latter factor was much stronger than the former.⁹
Finally, we found that short-term memory (STM), measured via forward digit recall, was the one factor that impacted early and late structures differently. STM was significantly correlated with performance on late structures and significantly predicted children's performance in the respective regression model, whereas no correlation was found with children's performance on early structures. This result substantiates the finding by Haman et al. (2017) that bilingual children's SRT performance in their HL was predicted by forward digit recall.
By involving the notion of timing, we can now ask about the nature of the link between STM and sentence repetition. It has long been acknowledged that the relations between phonological memory and sentence processing established in both adults and children seem to be "... highly specific to sentence structures, and in particular to complex and lengthy constructions" (Willis and Gathercole 2001, p. 351). Their own study is a case in point: Willis and Gathercole (2001) tested 16 structures from the TROG (Bishop 1982) in two groups of monolingual 5-year-olds that differed regarding STM, measured, i.a., via digit recall. The high STM group scored significantly better than the low STM group in nine structures, the majority of which were not mastered by age 5 (e.g., extraposed and center embedded relative clauses). Out of the seven structures in which the high STM group and the low STM group did not differ, the majority were mastered by age 5. In our view, this finding from monolingual children agrees with our results on HL, indicating that higher STM capacity is beneficial for accurately reconstructing structures that are considered complex.
In comparison to the three early structures (SVO main clauses, coordinated main clauses, and non-referential wh-questions), the three late acquired structures (adverbial clauses, ADV, relative clauses, RC, referential wh-questions, WH_ref) involve structural complexity. Complexity is a rather vague concept that needs to be specified relative to a domain of application (see the discussion in Di Sciullo and Jenkins 2016). Here, we suggest that the notion of structural complexity is not one-dimensional and that the late structures are subject to structural complexity in one or more of the following dimensions: they require reconstructing syntactic subordination (ADV, RC, WH_ref), they involve wh-movement (RC, WH_ref), and they involve discourse linking (WH_ref).¹⁰ We leave for further research the question of whether the effect of short-term memory on late structures holds across measures other than forward digit recall and across learner groups other than heritage bilingual children.
In conclusion, early and late acquired structures, as assessed in the SRT, show parallels in that heritage children's performance was predicted by current use of the HL in the family, but not by chronological age, age of onset of the L2, or by formal instruction in the HL. Early and late structures differ in that short-term memory predicted performance in late but not in early structures. The latter result suggests that our understanding of HL development and of the role of child-internal and environmental factors will benefit from a consideration of the timing of acquisition for the different structures.
Author Contributions: Both authors C.M. and P.S. are fully responsible for all parts of the text and the analyses. C.M. collected the data and carried out the transcription and the scoring. Both authors have read and agreed to the published version of the manuscript.

Funding:
The research presented here was conducted in the framework of the project "DaZ ab Sechs: The role of language knowledge for the acquisition of German by child second language learners" (PI: PS), which is part of the Research Center IDEA. The project is supported by the Hessian Ministry of Higher Education, Research, Science and the Arts.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of the DIPF|Leibniz Institute for Research and Information in Education (protocol code: DIPF_EK_DazabSechs, date of approval: 24 February 2020).
Informed Consent Statement: Informed written consent was obtained from the parents of all children involved in the study. The children were also informed that recordings were made, and they gave oral consent prior to performing the task.

Data Availability Statement:
The data presented in this study will be made available in an appropriate form to the scientific public after completion of the project "DaZ ab Sechs" in August 2023.