Education and Input as Predictors of Second Language Attainment in Naturalistic Contexts

This study examines the effects of education and input as predictors of adult second language acquisition in naturalistic contexts. L1 Albanian learners of Greek who differed in amount of schooling (from 4 to 16 years) and length of residence (from 8 to 27 years) completed elicitation tasks that tested mastery of gender and number agreement, and past tense morphology. In addition, samples of spontaneous speech were assessed for fluency, grammatical complexity, and lexical richness in order to establish the learners’ overall proficiency in the L2. We hypothesized that education would facilitate attention to form and hence lead to better attainment of grammatical distinctions with relatively low functional load, particularly when these are complex. Quantity of input would be most strongly associated with aspects of language that are most relevant to communication, and in particular, fluency. These predictions were largely confirmed: education accounted for 15% of the variance on adjective number agreement and between 31% and 38% of the variance in performance on past tense morphology, which is considerably more complex. Fluency and clausal density, in contrast, were associated with length of residence but not with education.


Introduction
The majority of Second Language Acquisition (SLA) research is informed by data from participants who are highly educated and highly literate in their native language (L1) (Young-Scholten 2013). The tacit assumption is that the findings will generalize to learners who have less schooling, including those who are illiterate. However, there is a growing body of research suggesting that literacy affects both first language (Dąbrowska forthcoming; Ravid and Tolchinsky 2002) and second language acquisition in non-trivial ways. There is now considerable evidence showing that illiterate and low-literate adult L2 learners lag behind their more educated peers in numerous aspects of both metalinguistic and linguistic development (e.g., Becker et al. 1977; Clahsen et al. 1983; Kurvers 2002; Kurvers et al. 2006; Perdue 1993; Tarone et al. 2009; Young-Scholten and Naeb 2010; Young-Scholten and Strom 2006). As a result, these adult learners are more likely to fossilize at a less target-like stage (Van de Craats et al. 2006).
With regard to first language development, there are numerous studies showing a strong relationship between literacy and metalinguistic awareness. In particular, there is considerable evidence that phonemic awareness, that is, the ability to segment words into phonemes, is a consequence of acquiring an alphabetic writing system. Illiterates and pre-literates have been shown to do very poorly on phoneme segmentation tasks, such as adding or deleting a single consonant at the beginning of a word, or naming words beginning with the same consonant. However, they have no problems with phonological tasks involving larger units, such as syllables or rhymes (Adrián et al. 1995; Dellatolas et al. 2003; Kolinsky et al. 1987; Kurvers et al. 2006; Morais et al. 1979, 1986; Read et al. 1986). There is also evidence that being literate is strongly related to how speech is processed. We know, for example, that illiterates have difficulty repeating pseudowords, and often substitute them with real words (Castro-Caldas et al. 1998; Reis and Castro-Caldas 1997). There is also evidence that literates and illiterates have different patterns of brain activation when repeating pseudowords, although, interestingly, not when repeating real words (Castro-Caldas et al. 1998). Other research shows that illiterate speakers process speech more slowly than literate speakers (Huettig et al. 2011).
There is also evidence to suggest that literacy is strongly related to metalinguistic awareness of other aspects of L1 development. Karanth et al. (1995), for example, compared school-going and non-school-going children and literate and illiterate speakers of Kannada on grammaticality judgment and syntactic comprehension tasks. Their results show that development of literacy improves performance on these tasks. Havron et al. (2018) explored the impact of literacy acquisition on children's learning of an artificial language. In particular, they compared children's success in learning novel noun labels (e.g., keba 'clock', nadi 'chair') relative to their success in learning article-noun gender agreement (e.g., do(article)-kebi(cup), bu(article)-guni(spoon)), before and after the children had learned to read. The researchers found that prior to becoming literate the children were better at learning agreement than at learning nouns, and that the difference between these significantly decreased after the children acquired literacy. These findings suggest that literacy affects not only language processing, but also leads to important differences in language learning. That is, being literate allows children to attend to smaller sized units.
Duncan et al. (2009) conducted a cross-linguistic comparison of metalinguistic development in French and English. The researchers examined early ability to manipulate derivational suffixes in oral language games as a function of chronological age, receptive vocabulary, and year of schooling. The researchers provide data from judgment and production tasks for children aged between 5 and 8 years in their first, second, or third school year in the United Kingdom and France. The results suggest that metamorphological development is accelerated in French relative to English. Part of the explanation for the French advantage encompasses knowledge of a broader range of suffixes and a markedly greater facility for generalizing morphological knowledge to novel contexts. The researchers interpreted the findings in relation to the word formation systems of English and French, and the educational context in each country. Nunes et al. (2006) provide evidence that literacy affects learners' knowledge of morphemes. The researchers undertook two large-scale longitudinal studies. In the first study, children's success in spelling the inflection at the end of regular past verbs (e.g., jumped rather than jumpt) predicted their performance in two morphological awareness tasks a year later (e.g., ability to transform nouns to adjectives, nouns to verbs). In the second study, the children's consistency in spelling morphemes predicted their ability to define new words on the basis of their morphemic structure (e.g., lugged as a verb, lugginess as a noun). Nunes et al.'s explanation for these findings is that the spelling of many words depends on their morphemic structure. Therefore, children have to have some knowledge about morphemes in order to learn to read and write, and as such children gain much of their explicit knowledge about morphemes as a direct result of learning to read and to spell.
With regard to the relationship between literacy and metalinguistic development in L2, Kurvers (2002), for example, compared the performance of three groups (unschooled adults, low-educated literate adults, and pre-school children) from various L1 backgrounds learning Dutch as a second language on different aspects of metalinguistic awareness, including syllable awareness, rhyme awareness, word awareness, and word and sentence segmentation. The non-literate adults were illiterate both in their L1 and L2, while the low-educated adults had no more than six years of schooling in their L1. The children were in the last term of kindergarten. The results show significant differences between the literate and non-literate adult groups, and between the literate adults and the children. However, that was not the case between the non-literate adults and the children on the majority of tasks. In fact, on some measures, including rhyme, word segmentation, and word referent differentiation tasks, the non-literate adults exhibited more difficulties than the pre-schoolers.
Several studies (Becker et al. 1977; Clahsen 1980, 1984; Clahsen et al. 1983; Meisel et al. 1981; Pienemann 1980, 2005; Tarone et al. 2009) have found that certain participants, typically those with the lowest literacy levels, are much more likely to omit obligatory main verbs, grammatical markers of tense, as well as other grammatical morphemes, compared to higher-literacy participants. This is an indication that literacy is also a key factor in the development of linguistic competence in the L2. Becker et al. (1977), for example, employed directed conversation techniques to elicit oral data from 48 L1 Spanish and Italian learners of L2 German who varied in period of residence from up to 2 years to over 6 years. On the basis of 100 successive utterances produced by each learner, the participants were categorized into four proficiency groups. The researchers found that the lowest group produced utterances without a finite element, a main verb or a subject. The data also showed that the lower literate learners differed in their development of morphosyntax compared to the more educated learners: for example, they overgeneralized the modal verb muss to mark tense. Van de Craats and colleagues (Julien et al. 2013; Van de Craats and Van Hout 2010) have found a similar pattern of overgeneralization, in this case of Dutch auxiliaries zijn 'be' and gaan 'go', to mark tense in their Dutch corpus of low-educated adult immigrants' oral production.
Clahsen (1980, 1984), Clahsen et al. (1983), Meisel et al. (1981), and Pienemann (1980, 2005) report data from both longitudinal and cross-sectional studies (known as the ZISA projects). These studies were designed to investigate the development of word order in L2 German by uninstructed adult foreign workers with various L1s, namely Italian, Spanish, and Portuguese. The researchers found evidence to suggest that L2 language learners follow the same developmental stages, regardless of the L1. They also found large individual differences between participants who produced obligatory, though semantically redundant, grammatical morphemes, such as subject pronouns, modal and auxiliary verbs, prepositions and determiners, and participants who omitted these features. Unfortunately, the researchers did not say why this might be. Data regarding participants' education was collected, so we know that different participants had very different levels of education. However, this data was not correlated with the presence/absence of obligatory grammatical features. Moreover, there was no measure of literacy level for any participant. However, given the different levels of education, it is very likely that the participants also had very different levels of L1 literacy when they entered Germany. It is, therefore, plausible, particularly given findings from other studies, that the different amounts of obligatory morphosyntactic features produced by the participants in the ZISA projects were related to participants' level of education/literacy. This relationship has been directly addressed by more recent studies.
In Experiment 3 of Tarone et al. (2009), for example, the researchers employed a series of picture description tasks designed to elicit various aspects of morphosyntax, including both verbal (e.g., auxiliary be, progressive -ing, third person singular present tense -s, and past tense -ed) and nominal (e.g., plural -s) morphology. Participants were 35 Somali L1 learners of L2 English, who were divided into a low-literacy and a moderate-literacy group after taking the SPEAK (Speaking Proficiency English Assessment Kit 1982) test. The authors found that the low-literacy group omitted obligatory verbal morphology 64% of the time (range: 55-77%), while in the moderate-literacy group such errors occurred in 50% of obligatory contexts (range: 38-58%).
There are several possible explanations for the observed differences between literate and less-literate language learners. First, it is possible that low-literacy language learners are less familiar with the classroom learning context and consequently find it difficult to learn in such a setting. Second, it is possible that the written form supports learning by providing a permanent, objective representation of the target language and allowing literate language learners to process target language utterances at their own pace. Third, learning to read and write results in improved metalinguistic abilities and thus facilitates attention to form. The above are not mutually exclusive and it is most likely the case that all three are contributing factors. However, in the present study, we focus specifically on the third possibility: that being literate supports acquisition by enhancing the ability to attend to form. To do so, we study a group of L1 Albanian speakers who differ considerably in the amount of schooling they have had in their L1, and who learned L2 Greek as adults in naturalistic contexts.
In addition to testing the role of literacy in L2 attainment, we also test a second predictor variable, namely input. Input clearly plays a crucial role in both L1 and L2 language acquisition. However, the extent to which input affects ultimate L2 language attainment is a matter of some controversy. Some researchers (e.g., Flege 2009) have proposed that the differences in outcome between L1 and L2 acquisition depend largely on the quality and quantity of the input. However, numerous other researchers (e.g., Birdsong 2006; Birdsong and Molis 2001; DeKeyser et al. 2010; Johnson and Newport 1989) argue that the effects of input are overshadowed by age of acquisition effects. On this view, the failure of (most) adult L2 learners to acquire a native-like competence is best explained by postulating a critical period for language. Whatever stand one takes in this controversy, it is clear that acquiring a high level of proficiency requires a large amount of input. For example, Hartshorne et al. (2018) found that native speakers' performance on a test tapping knowledge of a variety of grammatical structures continues to increase up to about age 30. Furthermore, their data also suggest that L2 immersion learners continue to improve for up to 30 years post-arrival, in sharp contrast to most ultimate attainment studies, which assume that learners reach a steady state after about five years.
Since this is an exploratory study, we examine a variety of linguistic measures. First, we analyzed spontaneous speech samples to obtain measures of fluency, grammatical complexity, and lexical richness. In addition, we conducted elicitation tasks, which probed the L2 speakers' mastery of gender and number agreement in the noun phrase and the ability to produce perfective past tense forms. Such obligatory yet largely redundant grammatical markers have been repeatedly shown to be particularly difficult for L2 learners, even in English with its relatively impoverished morphology.
Both of the Greek subsystems that we investigate are relatively complex. With regard to agreement marking, both determiners and adjectives have to agree with the head noun in gender (masculine, feminine or neuter), number (singular or plural) and case (nominative, genitive, accusative or vocative), and there are several subclasses of adjectives which require different endings. Verbal inflections are even more complex. Verbs are marked for person, number, tense, voice, aspect, and mood, and the formation of a particular form typically involves both affixation and stem changes. Consider, for example, the present tense form gráfo 'I write' and the corresponding perfective past tense form égrapsa 'I wrote'. The formation of the past tense involves the following processes: addition of the prefix e- (added to monosyllabic stems beginning with a consonant); a stem change (f > p); insertion of the suffix -s-; and addition of the first person singular perfective past tense ending -a.
The verb gráfo is a so-called sigmatic, or regular verb, so in this case the stem changes are phonologically predictable. In addition to sigmatic verbs, Greek has a number of classes of non-sigmatic, or irregular, verbs, which involve more idiosyncratic stem changes (for details, see Holton et al. 2004).
Our central aim was to establish the extent to which fluency, grammatical complexity and accuracy, and lexical knowledge are predicted by literacy (operationalized as the number of years in full-time schooling) and input (operationalized as length of residence), as well as to examine any possible interactions between these two factors. We expected that both education and input would predict L2 achievement. However, we predicted that education would be particularly relevant for the acquisition of "decorative" grammar, i.e., those aspects of grammar that contribute relatively little to meaning, and particularly when these are complex and/or irregular. That is to say, we predicted a stronger relationship between education on the one hand and agreement marking and especially past tense marking on the other. By contrast, input should be a better predictor for fluency measures.

Participants
The participants were 49 native speakers of Albanian (23 females and 26 males) learning Greek as a second language in a naturalistic setting. None of them had attended courses in Greek as a foreign language; 33 of the participants knew the Greek alphabet, and some could read single words and a few could read simple sentences; however, none could read or write Greek fluently. It is worth noting that some of the more educated participants in our sample might have had difficulty reading Greek because the two languages use different writing systems. Length of residence (LoR) varied from 8 to 27 years (mean 20.6, median 21); age at the time of testing from 30 to 69 years (mean 52, median 54); age of arrival from 16 to 49 years (mean 30, median 31); and full-time education in the native language from 4 to 16 years (mean 9.2, median 9.0). Although it is not a direct measure, the number of years of schooling is a very strong indicator of the level of literacy.
The participants' L1, Albanian, is similar to Greek in that it inflects both adjectives and determiners for gender and number (although in Albanian the determiner comes after the noun rather than before, as in Greek). The verb morphology is also broadly similar in the two languages, in that both have past tenses and the inflectional system is quite complex, with numerous stem changes and inflectional subclasses. However, one important difference is that Albanian does not make a distinction between Perfective and Imperfective aspect (Varlokosta 2002). The Present, the Past and the Pluperfect are used more or less interchangeably, which is not the case in Greek.
All of the participants were informed of their rights before participating in the study, and provided their written consent. Data collection took place in Athens, Greece, between May and July 2017. Each participant was tested either at their own house or at a quiet nearby café, where the researcher was always accompanied by a family member. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Northumbria University at Newcastle (659, 18 May 2017).

Agreement Tasks
The agreement tasks were a modified version of the tasks used by Konta (2012a, 2012b, 2013a, 2013b, 2013c). The participant and the experimenter each had a notebook. For the singular agreement task, every page in the notebook contained an array of four pictures depicting objects of the same kind that differed in color (e.g., four belts). In the participant's notebook, one of the objects was circled (see Figure 1). The participant's task was to ask the researcher to show them the target item. To achieve this, the participant had to utter the sentence Ðíkse mou 'Show me' followed by the target noun phrase (e.g., tin prásini zóni 'the green belt', where the determiner tin and the adjective prásini agree with the noun zóni in both gender and number). The plural agreement task was exactly the same, except that the pictures contained four pairs of identical objects (see Figure 2).
The target noun phrases are provided in Appendix A. Both agreement tasks tested all three genders, with 8 items per gender, giving a total of 24 items. For each gender, we chose nouns with the most frequent endings, based on data provided by Mastropavlou and Tsimpli (2011): -os and -as for masculine nouns (e.g., tíhos 'wall' and anaptíras 'lighter'), -i and -a for feminines (e.g., zóni 'belt' and pórta 'door'), and -i and -o for neuters (e.g., balóni 'balloon' and vivlío 'book'). The elicited adjectives were the color terms kókinos 'red', prásinos 'green', and kítrinos 'yellow'. These three colors are part of the main color spectrum, which helps to avoid any semantic ambiguity, and they are the most appropriate colors for object description.

Spontaneous Speech
Participants were asked to discuss a familiar topic, such as their hometown, family, work, or how they learned Greek. Some participants chose a topic and started talking, while others had to be told which topic to discuss. Some participants were less willing than others to discuss a topic in depth, in which case the researcher had to ask questions to elicit speech. There were also participants who spoke only for a few minutes, either because they did not have enough free time or because they were not willing to share any more information. Thus, individual samples varied in length from 114 to 360 words.

Past Tense Production Task
The past tense production task was based on Clahsen et al. (2010). In this task, participants were presented with pairs of pictures depicting an ongoing (Figure 3) and a completed action (Figure 4). In each trial, the experimenter described the first picture (e.g., Edó to pedí halái to pehnídi 'Here the child is breaking the toy') and then pointed to the second picture and asked what the agent had done (e.g., Edó to pedí ti ékane? 'Here the child did what?') in order to elicit a sentence containing a perfective past tense form (e.g., To pedí/aftós hálase to pehnídi 'The child/he broke the toy.'). To make the task more manageable, we chose three out of the five conditions used in the original study: existing sigmatic verbs, existing non-sigmatic verbs, and novel verbs which do not rhyme with any existing verb. There were 48 items in total: 30 testing items (10 for each condition), 8 practice items, and 10 filler items (see Appendix B).

Procedure
The experiment lasted from thirty to forty-five minutes, and the researcher met with each participant once. All meetings were recorded using a digital voice recorder. The instructions were provided both orally and in writing in Greek unless the participant asked for an explanation in Albanian. Each elicitation task was preceded by practice trials. The session started with an interview during which participants were asked to provide information about their age, gender, full-time education in the L1, years of residence in Greece, whether they had received any education in Greece, knowledge of languages other than the L1 and the L2, reading time per week, knowledge of writing in the L2, and interaction with native speakers of Greek. The interview was followed by the agreement tasks, the spontaneous speech task, and the past tense production task, in that order.

Data Coding
In the three grammar tasks (singular agreement, plural agreement, and past tense production), responses were coded as correct or incorrect. The dependent variable was the percentage of correct responses. In the agreement tasks, we coded determiner-noun agreement and adjective-noun agreement separately. Thus, there were four outcome variables: singular determiner agreement, singular adjective agreement, plural determiner agreement, and plural adjective agreement. For the novel verbs, the regular (i.e., sigmatic) form was considered the target.
The spontaneous speech samples were used to calculate the following measures:
• Pauses to fluent speech ratio: this is a measure of fluency computed by dividing the number of pauses by the number of fluent segments and multiplying by 100. A fluent segment was defined as an intonational unit.
• Speech rate: this was a second measure of fluency and was computed by dividing the total number of words by the total speech time in seconds and multiplying the result by 60 (Grosjean 1980, in Götz 2013), which yields the mean number of words per minute.
• Mean length of T-unit (MLTU): this is a global measure of syntactic complexity. A 'Terminable unit' (T-unit) is a unit consisting of an independent clause and any subordinate clauses or non-finite fragments that are attached to it (Hunt 1970; Götz 2013). Thus, the utterance I started learning English when I was 11 consists of one T-unit, while I am supposed to meet my friends this evening but the weather is very bad consists of two T-units. MLTU, the mean length of a T-unit in words, is widely used as a measure of syntactic complexity beyond the preschool years (see, for example, Götz 2013; Nippold et al. 2005; Scott 1988).
• Clausal density (also known as subordination index): this measures the amount of subordination in a sample. It is computed by dividing the number of clauses by the number of T-units (Götz 2013; Nippold et al. 2005; Scott 1988).
• Type to token ratio (TTR): this is a widely used measure of lexical diversity computed by dividing the number of word types in the sample by the number of word tokens (Johansson 2008). A higher ratio means that fewer word types are repeated, and hence that the sample is more lexically diverse.
• Lexical density: this measures the density of information. It is calculated by dividing the number of content words by the total number of words and multiplying the result by 100.
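The six measures defined above are all simple ratios over hand-annotated counts. The following Python sketch illustrates the arithmetic only; it is not the scripts used in the study. The SpeechSample record and its field names are hypothetical, and lowercasing tokens before counting word types is an assumption that the text does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpeechSample:
    """Hand-annotated counts for one speech sample (hypothetical structure)."""
    tokens: List[str]       # word tokens in order of production
    content_words: int      # number of content-word tokens
    pauses: int             # number of pauses
    fluent_segments: int    # number of intonational units
    speech_seconds: float   # total speech time in seconds
    t_units: int            # number of T-units
    clauses: int            # number of clauses


def speech_measures(s: SpeechSample) -> Dict[str, float]:
    """Compute the six spontaneous-speech measures as defined in the text."""
    n_tokens = len(s.tokens)
    # Assumption: word types are counted case-insensitively.
    n_types = len({t.lower() for t in s.tokens})
    return {
        "pause_ratio": 100 * s.pauses / s.fluent_segments,
        "speech_rate_wpm": 60 * n_tokens / s.speech_seconds,
        "mltu": n_tokens / s.t_units,
        "clausal_density": s.clauses / s.t_units,
        "ttr": n_types / n_tokens,
        "lexical_density": 100 * s.content_words / n_tokens,
    }


# Toy example based on the one-T-unit utterance quoted in the text;
# the pause, segment, and timing values are invented for illustration.
sample = SpeechSample(
    tokens="I started learning English when I was eleven".split(),
    content_words=4, pauses=2, fluent_segments=4,
    speech_seconds=4.0, t_units=1, clauses=2,
)
result = speech_measures(sample)
```

Because the type to token ratio is sensitive to sample length, and the samples here ranged from 114 to 360 words, a sketch like this would need a length-controlled variant before values could be compared across speakers.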

Descriptives
The descriptive statistics are presented in Table 1. It is clear from these figures that the grammar tasks differ in difficulty, with mean scores ranging from 23% correct for the past tense of nonce verbs to 91% for singular determiner agreement. Furthermore, although there is a good range of variation on all measures, a considerable proportion of participants performed at ceiling (100% correct) on the agreement tasks and at floor (0% correct) on the past tense production task. For singular determiner agreement, the lowest score was 66%, with 32 out of the 49 participants performing at ceiling. For singular adjective agreement and plural determiner agreement, 27 participants scored 100% correct; and for plural adjective agreement, 20 participants. For past tense inflection, the scores were considerably lower, with no participant performing at ceiling. Eight participants (all with no more than 8 years of schooling) failed to produce a single correct form of an existing non-sigmatic verb, and 18 participants (15 of whom had no more than 8 years of schooling) failed to produce a single correct form of a nonce verb.

Regression Analyses
To examine the effects of Education, LoR, and their possible interaction on the linguistic variables of interest, we conducted regression analyses using the lm function in R. We began with the full model, then removed any non-significant predictors, beginning with the interaction term. Finally, we used the calc.relimp function from the relaimpo library in R to compute the lmg metric (Grömping 2006). This metric is an estimate of each predictor's unique contribution to variance in the regression model, expressed as a proportion of total variance, a convenient measure of effect size which allows comparisons between models for different outcome variables (Larson-Hall 2010). The final regression models are provided in Appendix C; Table 2 provides information about the proportion of variance in each linguistic measure accounted for by the final model.
The upper part of the table (above the horizontal line in the middle) presents the regression results for grammatical accuracy, i.e., the elicitation tasks. As in Table 1, the linguistic measures are arranged from easiest (singular determiner agreement) to most difficult (past tense of nonce verbs). As anticipated, performance on these tasks is predicted by education rather than length of residence, with more educated participants achieving higher scores. Moreover, the results show a clear pattern. The effects of education are most noticeable on the most difficult grammatical tasks, i.e., past tense, especially the past tense of existing non-sigmatic (i.e., irregular) verbs and nonce verbs: for both measures, education accounts for approximately 38% of the variance. For plural adjective agreement, which was somewhat easier, education accounts for 15% of the variance. For plural determiner agreement, education on its own makes only a small contribution (2.4% of the variance). Finally, for singular determiner agreement and singular adjective agreement, there is no effect. This is most likely due to ceiling effects, as our participants achieved scores of 91% and 87% correct respectively for these measures.
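To make the lmg metric concrete, the following is a minimal sketch of how it can be computed for a model with two predictors and no interaction term. It is written in Python rather than R and is neither the authors' analysis script nor the relaimpo implementation. The function names (pearson, lmg_two_predictors) and the data values are hypothetical and serve only as illustration. Each predictor's share is the increase in R² when it enters the model, averaged over the two possible orders of entry, so the shares sum to the R² of the full model.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lmg_two_predictors(x1, x2, y):
    """LMG shares for the model y ~ x1 + x2 (no interaction term).

    Returns (share of x1, share of x2, R^2 of the full model).
    """
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    r2_x1 = r1 ** 2  # R^2 of y ~ x1 alone
    r2_x2 = r2 ** 2  # R^2 of y ~ x2 alone
    # R^2 of y ~ x1 + x2, from the standard two-predictor formula
    r2_full = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    # Average each predictor's R^2 gain over both orders of entry
    lmg_x1 = 0.5 * (r2_x1 + (r2_full - r2_x2))
    lmg_x2 = 0.5 * (r2_x2 + (r2_full - r2_x1))
    return lmg_x1, lmg_x2, r2_full

# Hypothetical values for illustration only, NOT data from the study
education = [4, 6, 8, 9, 11, 12, 14, 16]      # years of schooling
residence = [27, 12, 20, 8, 25, 15, 10, 18]   # years of residence
score = [20, 25, 40, 35, 60, 55, 70, 85]      # percent target responses

lmg_edu, lmg_lor, r2_full = lmg_two_predictors(education, residence, score)
print(f"education: {lmg_edu:.3f}, residence: {lmg_lor:.3f}, total R^2: {r2_full:.3f}")
```

Because the shares add up to the model R², they can be read as proportions of total variance, which is what makes them comparable across outcome variables. Models that retain an interaction term require averaging over more orderings than this two-predictor sketch handles.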
For four of the accuracy measures (adjective number agreement, existing sigmatic verbs, existing non-sigmatic verbs, and nonce verbs), education was the only significant predictor of performance. For plural determiner agreement there was also a very small effect of length of residence, which accounted for 0.1% of the variance, and a significant interaction between education and length of residence, accounting for 9.4% of the variance. To explore this interaction further, we divided the participants into two groups: a low-educated group, which included 24 participants with up to 8 years of formal schooling (mean 6.5), and a high-educated group, which included 25 participants with 9 to 16 years of formal schooling (mean 11.8). We then computed simple correlations between length of residence and performance on plural determiner agreement for each group separately. This analysis revealed an interesting pattern. In the low-educated group, there was a weak positive correlation (r = 0.25, p = 0.237) between the two variables, while in the high-educated group, the correlation was close to zero (r = −0.17, p = 0.406). The lack of progress in the high-educated group is most likely due to ceiling effects, as they were 90% correct on this task.
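The subgroup analysis just described can be sketched as follows. This is an illustrative Python outline, not the authors' script. The helper names (pearson, t_statistic, split_correlations) and the data values are hypothetical. The last two lines convert the reported correlations into t statistics with n − 2 degrees of freedom, taking the high-educated coefficient to be −0.17, which is an assumption about the intended value.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing a correlation r against zero (n - 2 df)."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def split_correlations(education, residence, score, cutoff=8):
    """Correlate residence with score separately for participants with
    schooling at or below the cutoff ("low") and above it ("high")."""
    rows = list(zip(education, residence, score))
    groups = {
        "low": [(l, s) for e, l, s in rows if e <= cutoff],
        "high": [(l, s) for e, l, s in rows if e > cutoff],
    }
    result = {}
    for name, members in groups.items():
        lor, sc = zip(*members)
        r = pearson(lor, sc)
        result[name] = (len(members), r, t_statistic(r, len(members)))
    return result

# Hypothetical values for illustration only, NOT data from the study
education = [4, 6, 8, 8, 9, 12, 14, 16]
residence = [8, 12, 20, 27, 10, 15, 18, 25]
score = [60, 70, 85, 95, 90, 100, 90, 100]

groups = split_correlations(education, residence, score)

# t statistics implied by the correlations and group sizes given in the text
t_low = t_statistic(0.25, 24)     # low-educated group
t_high = t_statistic(-0.17, 25)   # high-educated group (assumed r = -0.17)
```

With groups of this size, both implied t statistics are small in absolute value, which fits the non-significant p values reported for the two groups.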
The next two rows in Table 2 present information about the effects of the predictor variables on the two fluency measures, namely, speech rate and the ratio of pauses to fluent segments. As predicted, both measures are related to length of residence, but not to education. The relationship is considerably stronger for speech rate (18% of the variance) than for the ratio of pauses to fluent segments (only 8%). Interestingly, these two measures are not correlated (r = 0.06), indicating that they tap different aspects of fluency. They also show a different pattern of relations with the other variables. The ratio of pauses to fluent segments is negatively correlated with most grammatical accuracy measures, which suggests the existence of a trade-off between accuracy and fluency: more fluent speakers tend to be less accurate and vice versa. Speech rate was not associated with the other linguistic measures except possibly lexical diversity.
Both lexical measures, TTR and lexical density, were related to education, accounting for 12% and 19% of the variance respectively. Length of residence was also a significant predictor for lexical density, but its effect was much smaller (just under 7%). For TTR, the effect of length of residence is negligible (0.3% of the variance), but the interaction between education and LoR explains an additional 7% of the variance. Comparison of the two subgroups revealed a similar pattern to that observed for plural determiner agreement: a moderately strong positive correlation between the two variables in the low education group (r = 0.47, p = 0.024), and no significant relationship in more educated participants (r = −0.23, p = 0.170). These results indicate that more educated participants have larger vocabularies in the L2, possibly as a result of developing better strategies for learning new words.
We now turn to the two global measures of grammatical complexity, MLTU and clausal density. For MLTU, although neither education nor length of residence is significant on its own, there is a significant interaction between the two predictors. Further analysis showed that there is no correlation between MLTU and length of residence in the low-educated group (r = −0.03) and a significant positive correlation in the high-educated group (r = 0.44, p = 0.03): in other words, the mean length of T-unit continues to increase for up to three decades after arrival, but only in the more educated participants.
Clausal density, in contrast, shows a different pattern. For this variable, length of residence accounts for 14% of the variance and there is no effect of education, and no interaction. This is surprising: it is well established that in literate speakers, clausal density increases steadily throughout childhood and adolescence (Frizelle et al. 2018; Nippold et al. 2005; Scott 1988). This increase is most likely attributable to exposure to written texts (Dąbrowska forthcoming). We should note that the clausal density in our sample (mean 1.22 and median 1.19) is quite low. We have no data on the development of clausal density in children acquiring Greek as a first language; however, literate English-speaking children usually attain this level at about the age of nine, that is to say, after three or four years of schooling. Assuming that the figures are similar for Greek, our results suggest that the ability to produce subordinate clauses does not necessarily transfer into the L2, although clearly further research is necessary to establish this conclusively.

General Discussion and Conclusions
In this paper we examined L2 acquisition by adult naturalistic learners of Greek as a second language, focusing in particular on the role of education (operationalized as number of years spent in full-time education) and exposure (operationalized as length of residence in Greece). We anticipated that both factors would contribute to L2 attainment, but in different ways. We hypothesized that higher educational attainment would facilitate attention to form, and thus be most relevant to the acquisition of "decorative" morphology (grammatical markers whose contribution to meaning is largely redundant), particularly those aspects which are relatively complex and/or irregular. This prediction was largely confirmed: education accounted for just over 2% of the variance on plural determiner agreement, 15% of the variance on plural adjective agreement, and for between 31% and 38% of the variance in performance on past tense morphology, which is considerably more complex. We found no significant effect of education on singular determiner agreement or singular adjective agreement. This, however, was most likely due to ceiling effects, as performance on these measures was 91% and 87% correct respectively. Fluency, in contrast, was predicted by length of residence but not by education. This is most likely the case because it depends (almost) entirely on implicit learning, which is not associated with education, whereas the acquisition of "decorative" grammar has a strong explicit component, at least in adult learners.
The absence of a relationship between length of residence and performance on "decorative" morphology suggests that our participants fossilized at a non-target-like level. Inflectional morphemes such as agreement and past tense markers, which are largely redundant from a semantic point of view, are known to be difficult for L2 learners, and are among the structures that are most likely to fossilize (Han 2013). Interestingly, our results showed a clear difference between agreement and tense marking in this respect. As explained earlier, agreement marking in Greek is comparatively simple. Children acquiring Greek as a first language typically master agreement morphology in the preschool years (Diamanti et al. 2018; Koromvokis and Kalaitzidis 2013). Our learners also attained relatively high levels of performance, with means ranging from 78% correct on plural adjective agreement to 91% correct on singular determiner agreement, and a relatively high proportion of participants performing at ceiling. In fact, for each of the four agreement measures, more than half of the participants with 9 or more years of schooling achieved a perfect score. In the less educated group, the number of participants performing at ceiling was lower, ranging from 21% on the most difficult task, adjective plural agreement, to 58% on singular determiners. Thus, our results indicate that it is possible even for low-educated naturalistic adult learners to attain native-like levels of performance in this area.
Past tense marking in Greek is considerably more complex, and our participants' performance on tense marking tasks was much poorer: 55% correct on existing sigmatic, 38% on existing nonsigmatic and 23% on nonce verbs, with no participant performing at ceiling in any condition. Furthermore, more than a third of our participants, and almost two-thirds of those with no more than eight years of schooling, failed to produce a single target form on the nonce verb inflection task. Since we used the same test as Stavrakaki and Clahsen (2009) and Clahsen et al. (2010), we can directly compare our results with theirs. It is striking that even the youngest L1 learners tested by Stavrakaki and Clahsen (2009), who were aged between 3 and 4, performed better than our participants: the scores in this age group were 70%, 36% and 39% respectively in these three tasks. Clahsen et al. (2010) used the same test with highly educated instructed learners with a much shorter length of residence (from 2.3 to 6.8 years), and this group did much better than our participants, achieving scores of 90% on existing sigmatic verbs, 66% on existing nonsigmatic verbs, and 76% on non-rhyming nonce verbs. These results suggest that, for complex inflectional systems such as the Greek past tense, explicit instruction may be necessary for adult learners to acquire the system.
It should be stressed, however, that while there was evidence of fossilization in some areas, other aspects of language continued to develop for a long time after arrival. This is most noticeable on measures of fluency and clausal density, but as we have seen, length of residence was also positively correlated with performance on determiner plural agreement, particularly in the less educated participants. This supports Han's (2013) claim that fossilization is highly selective, both at the level of individual structures and the individual learner. In fact, perhaps the most striking finding from our study is the extent of individual differences in attainment in our group of long-resident L2 learners. As we have seen, there was considerable variation in performance on all tasks: for example, for existing past non-sigmatic verbs, individual scores ranged from 0% to 90% correct, and for plural determiner and adjective agreement, from 44% to 100%. The two factors we focused on here, education and length of residence, account for only a relatively small proportion of the variance in scores. Future research will need to examine the role of other factors such as age of arrival, frequency of interaction with native speakers, language aptitude, and motivation.
Table A2. Testing noun phrases for the singular agreement task.

Figure 1. Example of an item from the singular agreement task.

Figure 2. Example of an item from the plural agreement task.

Figure 3. Example of picture with ongoing event in the present tense (Clahsen et al. 2010).

Figure 4. Example of picture with completed event in the perfective past tense (Clahsen et al. 2010).

Table 1. Mean, median, range and interquartile range (IQR) for all measures. Note: The figures for the grammatical accuracy measures (agreement and past tense production) are percentages of target responses. For details about the remaining measures, please see the Method section above. Sg Det Agr: singular determiner agreement; Sg Adj Agr: singular adjective agreement; Pl Det Agr: plural determiner agreement; Pl Adj Agr: plural adjective agreement; Past Sigm: existing past sigmatic verbs; Past Nonsigm: existing past nonsigmatic verbs; Nonce Past: nonce past verbs; MLTU: mean length of T-unit; TTR: type to token ratio.

Table 2. Proportion of variance in linguistic variables explained by education, length of residence and their interaction. Note: Numbers in parentheses indicate that the main effect is not significant. a: Predictor approaches significance (p < 0.10).