Associations between Language at 2 Years and Literacy Skills at 7 Years in Preterm Children Born at Very Early Gestational Age and/or with Very Low Birth Weight

Preterm children (born <37 gestational weeks) who are born at very early gestational age (<32 weeks, very preterm, VP) and/or with very low birth weight (≤1500 g, VLBW) are at increased risk for language and literacy deficits. The continuum between very early language development and literacy skills among these children is not clear. Our objective was to investigate the associations between language development at 2 years (corrected age) and literacy skills at 7 years in VP/VLBW children. Participants were 136 VP/VLBW children and 137 term controls (a 6-year regional population cohort, children living in Finnish-speaking families). At 2 years of corrected age, language (lexical development, utterance length) was assessed using the Finnish version of the MacArthur–Bates Communicative Development Inventory and the Expressive Language Scale from Bayley scales of Infant Development, second edition. At 7 years, children’s literacy skills (pre-reading skills, reading, and writing) were evaluated. Statistically significant correlations were found in both groups between language development at 2 years and literacy skills at 7 years (r-values varied between 0.29 and 0.43, p < 0.01). In the VP/VLBW group, 33% to 74% of the children with early weak language development had weak literacy skills at 7 years relative to those with more advanced early language skills (11% to 44%, p < 0.001 to 0.047). Language development at 2 years explained 14% to 28% of the variance in literacy skills 5 years later. Language development at 2 years had fair predictive value for literacy skills at 7 years in the VP/VLBW group (area under the receiver operating characteristic (ROC) curve (AUC) values varied between 0.70 and 0.77, p < 0.001). Findings provide support for the continuum between very early language development and later language ability, in the domain of literacy skills in preterm children.


Introduction
Prematurely born (<37 gestational weeks) children born at very early gestational age (<32 gestational weeks, very preterm, VP) and/or with very low birth weight (≤1500 g, VLBW) are at increased risk for developmental impairments and learning deficits such as difficulties in early language development [1][2][3] and literacy skills [4][5][6][7], including pre-reading skills, reading, and writing. The gap to full-term controls in language and literacy skills is evident even in the absence of neurodevelopmental impairment (NDI), including cerebral palsy, hearing impairment, blindness, or severe cognitive impairment (intelligence quotient, IQ < 70) [8,9]. The goal of clinical follow-up is to identify weak development as early as possible to provide targeted intervention to improve developmental outcomes. Findings of recent longitudinal studies, although sparse, suggest that difficulties in language functions persist from early years through late childhood, up to the age of 13 years [3,10,11]. Previous investigations, including a recent study of a large French cohort [12], highlight the usefulness of a validated parent-reported measure, such as the MacArthur-Bates Communicative Development Inventories (CDI) [13], in assessing early language skills of children born VP/VLBW to predict developmental difficulties in language [2,3].
Although earlier studies have provided essential information regarding the continuum between early language skills and later language performance in children born VP/VLBW, far fewer reports have used literacy skills as an outcome measure [14][15][16][17]. Furthermore, most of the existing studies examining associations between language and literacy skills have been based on samples of school-aged children, not assessing very early childhood. To date, the earliest age point in a longitudinal setting has been reported by Pritchard et al. [17] who investigated the relations between school readiness domains, including language, at the corrected age (i.e., adjusted age, representing the age of the child from the expected date of delivery) of 4 years and later educational achievement at school age. To the best of our knowledge, the possible longitudinal associations between very early language development at 2 years and literacy skills at 7 years is an open question in this highrisk population.
In early clinical follow-up of prematurely born children, their development is often followed up to the age of 2 years. However, the language development of VP/VLBW children is not always assessed specifically. Clinicians evaluating early language development of children born VP/VLBW would benefit from the knowledge of whether lexical development and utterance length at 2 years of corrected age have predictive value for literacy outcome at 7 years in this vulnerable population, and whether there is a cost-effective way of identifying toddlers at potential risk for literacy deficits. To maximize the effects of early intervention, it is crucial to identify children with weak skills as early as possible.
In the current study, we analyze longitudinal associations between language skills at 2 years of corrected age and literacy skills at 7 years in a Finnish sample of children born VP/VLBW. In Finland, children begin formal schooling in the year in which they reach the age of 7 years. Finnish is a transparent language with a highly regular grapheme-phoneme correspondence, and thus, basic decoding skills are often acquired during the first year of school see e.g., [18]. In addition, more than one-third of Finnish first-graders can already read before entering school. Previous findings from longitudinal studies investigating Finnish children with a hereditary risk for dyslexia and their controls [19,20] suggest that features of early language development, including lexical development and utterance length, have predictive value for later reading acquisition [20,21]. In preterm children, this association has not been analyzed previously.
This study had three aims: (1) to evaluate the associations between language skills at 2 years of corrected age and literacy skills at the beginning of schooling, at 7 years, in a regional cohort of Finnish-speaking children born VP/VLBW and in their full-term controls; (2) to analyze how much early language skills explain the variance of literacy skills at 7 years; and (3) to assess the predictive value of language skills at 2 years for literacy skills at 7 years measured using area under the receiver operating characteristic (ROC) curve (AUC) values in VP/VLBW children and their controls.

Participants
This study is part of the multidisciplinary 6-year regional cohort study of prematurely born children called PIPARI (Development and Functioning of Very Low Birth Weight Infants from Infancy to School Age) [22,23]. The participants were children born <32 weeks of gestational age and/or with birth weight ≤1500 g in Turku University Hospital, Finland, in 2001-2006. From 2001to 2003, the inclusion criteria were birth weight ≤1500 g and prematurity (<37 gestational weeks). From the beginning of 2004, the inclusion criteria were expanded to include all infants born <32 weeks of gestation, regardless of birth weight. At least one of the parents had to speak Finnish or Swedish, the two official languages in Finland. Children with severe congenital anomalies or diagnosed syndromes affecting their development were excluded.
The present study sample consisted of 136 children born VP/VLBW living in Finnishspeaking families. The flow chart of the children born VP/VLBW participating in this study is presented in Figure 1. Neurodevelopmental impairment was determined if one or more of the following factors were present by the corrected age of 2 years: cerebral palsy, hearing impairment (threshold >40 dB), blindness, or severe cognitive impairment (Mental Developmental Index, MDI of the Bayley Scales of Infant Development II [24], BSID-II, <70 standard scores). In the PIPARI study, the age of VP/VLBW children was corrected for prematurity until the age of 2 years. Figure 1. Flow chart of the prematurely born children born at very early gestational age (<32 weeks, very preterm, VP) and/or with very low birth weight (≤1500 g, VLBW) included in the study.
The control group consisted of healthy full-term (≤37 weeks of gestation) infants born in the same hospital between 2001 and 2004. They were recruited by asking the parents of the first boy and the first girl born in each week to take part in the study. If the family was not interested in partaking in the study, the parents of the next boy/girl were invited. The full-term controls were born ≥37 weeks of gestation, were not admitted to a neonatal care unit during the first week of life, and had at least one parent speaking either Finnish or Swedish. The exclusion criteria were any major congenital anomalies or chromosomal or genetic syndromes, the mother's known use of illicit drugs or alcohol during pregnancy, and the infant's birth weight being small for gestational age according to age-and genderspecific Finnish growth charts. In the present study, only those 136 children born VP/VLBW and those 137 controls who had data available from both the language assessment at 2 years of corrected age and literacy skills assessment at 7 years were included.
The PIPARI study protocol was approved by the Ethics Review Committee of the Hospital District of Southwest Finland in December 2000 and January 2012. After receiving oral and written information, all parents who agreed to participate provided written informed consent.

Assessment at 2 Years of Corrected Age
Language skills were assessed with the Finnish long-form version of the MacArthur-Bates Communicative Development Inventory (FinCDI, toddler version) [25], and the Expressive Language Score (ELS) from BSID-II. The FinCDI is a validated, parent-report measure evaluating the development of lexicon and grammar, including inflectional morphological skills. Variables of lexicon size and mean length of the three longest utterances (M3L) were used. Lexicon size is the number of words the parents estimated that their child uses, based on word lists (595 words). M3L is calculated in morphemes (i.e., the smallest units of language creating a difference in meaning) based on the three longest recent utterances the child has made. The ELS consists of 10 pictures and 5 objects that the child was asked to name in the testing situation.

Assessment at 7 Years
Reading precursors, reading, and writing ability were evaluated during the first weeks of grade 1 of primary school (a 6-week period from August to September during the school entrance year). Reading precursors assessed were phonological awareness and letter knowledge. To evaluate phonological awareness, three-to seven-letter words were presented phoneme by phoneme [26]. Children were instructed to mark one picture out of four alternatives that they thought would best match the word (max 9). To evaluate letter knowledge, the child was asked to name 29 uppercase letters presented in random order (max 29) [27]. In this study, the sum score of the tasks of phonological awareness and letter knowledge was used as the measure of precursors of reading (max 38).
Reading skills were evaluated using a short version of the Finnish reading test ARMIa tool for assessing reading and writing skills in Grade 1 [27], consisting of a wordlist of two-syllable (seven words), three-syllable (two words), and five-syllable (one word) words. The child was asked to read the words aloud. The score for reading skills was the number of correctly read words (max 10). To evaluate writing skills, the child was asked to write 5 words and 8 pseudowords said aloud one word at a time [19]. The writing skills score was the total number of correctly written items (max 13).

Statistical Analyses
All analyses were run separately for all children born VP/VLBW, for preterm children without neurodevelopmental impairment, and for controls. Pearson's correlation coefficient values were used to investigate the correlations between the continuous language and literacy variables measured at 2 and 7 years. All language and literacy variables were also categorized. The 10th percentile cut-off value was used to evaluate the association between early weak language skills at 2 years of corrected age and weak literacy skills at 7 years of age. For the FinCDI, the cut-off value was based on the normative sample, and for the other measures, the 10th percentile cut-off values were derived from the control group. Comparisons between categorical variables were done using cross-tabulation with the Chi-square test or Fisher's exact test. Multiple variable linear regression analysis was conducted to assess how much 2-year language variables explain the variance in literacy skills at 7 years when the effect of background factors were taken into consideration. The dependent variables were reading precursors (sumscore of letter knowledge and phoneme synthesis), reading, and writing skills at 7 years. The independent variables were lexicon size, M3L, and ELS measured at 2 years. Since the independent variables were strongly correlated with each other, they were analyzed separately. Nine regression models were run: in the first three models, lexicon size was used as an independent variable; in the next three models, M3L; and in the last three models, ELS. In the preliminary analyses, the following background factors were associated with the outcome variables and were therefore included in the regression models: gestational age, mother's self-reported reading difficulties, father's self-reported reading difficulties, and paternal education. Maternal education was not included because in the preliminary analysis paternal education level correlated more strongly with the outcome variables. Due to multicollinearity between maternal and paternal education, only paternal education was included in the regression analyses. Lastly, the predictive value of early language development at 2 years for literacy skills at 7 years was analyzed using the AUC values. The AUC is the measure of the ability of a test to distinguish between classes [28]. The greater the AUC values, the better the prediction model. An area of 1. represents a perfect classifier, whereas a ROC curve no better than chance would have an area under the curve of 0.5. AUC values are interpreted as follows: excellent predictive value 0.90-1, good 0.80-0.90, fair 0.70-0.80, poor 0.60-0.70, and fail <0.60 [28]. All statistical analyses were performed using IBM ® SPSS ® Statistics for Windows, version 26.0. (IBM Corp., Armonk, NY, USA). Two-tailed p-values < 0.05 were considered statistically significant.

Data Description
The background characteristics of the children are presented in Table 1. No statistically significant difference in background factors was found between the children born VP/VLBW who participated in the study and the VP/VLBW children living in Finnishspeaking families whose language and literacy data were unavailable (n = 46, 25%), except that there were more multiple births among participating children (36% of the study children vs. 16% of the dropouts, p = 0.02).  (7) 29 (21) a Small for gestational age was defined as a birth weight < −2.0 SD, according to the age-and gender-specific Finnish growth charts. b See specific magnetic resonance imaging (MRI) protocol and details about the classification elsewhere [29]. c High is defined as a Bachelor's degree, Master's degree, or Doctoral degree; Intermediate is defined as high school or vocational school; Low is defined as primary or lower secondary or less. d The percentages were calculated from the data available. VP/VLBW = very preterm (<32 gestational weeks)/very low birth weight (≤1500 g).
Descriptive statistics for language and literacy variables were measured and group comparisons are presented in Table 2. A statistically significant difference between the groups was found in every language (p-values from 0.04 to < 0.001) and literacy variable (p-values from 0.002 to 0.003). When children with neurodevelopmental impairment were excluded, the group differences remained statistically significant in ELS and in every literacy variable.  VP/VLBW = very preterm (<32 gestational weeks)/very low birth weight (≤1500 g); NDI = Neurodevelopmental Impairment; (SD) = Standard Deviation; (min, max) = minimum and maximum; n = number; (%) = percentages; 95% CI = Confidence Interval for the mean; p = significance level; M3L = mean length of the three longest utterances value; ELS = Expressive Language Score.

Associations between Language Development at 2 Years of Corrected Age and Literacy Skills at 7 Years
Statistically significant positive correlations were found in both groups between all variables measured at 2 and 7 years (r-values between 0.29 and 0.43, p < 0.01) ( Table 3). When children with neurodevelopmental impairment were excluded, the correlations remained statistically significant. The r-values were slightly smaller for precursors of reading but remained the same or even slightly increased in reading and writing. Based on the cross-tabulation, 33% to 74% of VP/VLBW children who had weak early language development (10th percentile) had also weak literacy skills at 7 years (see Table 4). The corresponding proportions for VP/VLBW children with typical language development at 2 years were 11% to 44%. In the controls, the corresponding proportions for children with weak language at 2 years were 15% to 83%. However, the results of the cross-tabulation were statistically significant only between weak lexicon size and weak reading skills and between weak M3L and weak reading and writing skills. Table 4. Results of the cross-tabulation with Chi-square or Fisher's exact tests of the associations between weak lexicon size (10th percentile, <30 words), weak M3L (10th percentile, <2.06), weak ELS (10th percentile, <1.60) measured at 2 years, and weak reading precursors (10th percentile, <25), weak reading (<10th percentile, 0 words), and weak writing (10th percentile, 0 words) measured at 7 years. Results for all VP/VLBW children, for VP/VLBW children without NDI, and for full-term controls are presented. The regression models of the VP/VLBW group, including early lexicon size as a predictor, explained 23% of the variance in reading precursors, 27% of the variance in reading skills, and 17% of the variance in writing skills at 7 years (Table 5). In these models, early lexicon size and paternal education were statistically significant independent predictors. The regression models of the controls are presented in Appendix A. Table 5. Results of multiple variable linear regression analysis with reading precursors, reading and writing skills at 7 years of age as dependent variables, and with lexicon size at 2 years of corrected age and background factors as independent variables. Results of VP/VLBW children are presented.

Reading Precursors
Reading Writing The models including M3L as a predictor explained 27% of the variance in reading precursors, 28% of the variance in reading skills, and 16% of the variance in writing skills at 7 years (Table 6). M3L and paternal education were statistically significant independent predictors. Table 6. Results of multiple variable linear regression analysis with reading precursors, reading, and writing skills at 7 years of age as dependent variables, and with M3L at 2 years of corrected age and background factors as independent variables. Results of VP/VLBW children are presented.

Reading Precursors
Reading Writing The models including ELS as a predictor ( Table 7) explained 20% of the variance in reading precursors, 25% of the variance in reading skills, and 14% of the variance in writing skills at 7 years. ELS, paternal education, and mother's self-reported reading difficulties were statistically significant independent predictors. The exclusion of children with neurodevelopmental impairment did not alter the results. In the control group (Appendix A), the same models explained a smaller proportion of the variance in outcome relative to children born VP/VLBW.

Discussion
To the best of our knowledge, this is the first controlled follow-up study providing longitudinal information on the associations between very early language development at 2 years of corrected age and later literacy skills in VP/VLBW children and their controls. Significant correlations between every language and literacy variable were found both in the VP/VLBW group and in the control group. Most of the children born VP/VLBW with weak language skills at 2 years had also weak literacy skills at 7 years. Lexicon size, M3L, and ELS measured at 2 years were statistically significant predictors in the regression models explaining the variance in literacy skills, especially in the VP/VLBW group. Every language variable at 2 years had a fair predictive value for literacy skills 5 years later in children born VP/VLBW when measured using AUC values.
Previously, the associations between language and literacy ability in children born preterm have been analyzed at 4 years of age at the earliest [17]. In a longitudinal study consisting of 110 children born VP and 113 term controls, Pritchard and colleagues [17] found an association between school readiness domains including language at age 4 years, and literacy and numeracy skills at ages 6 and 9 years. In addition, in the study of Perez-Pereira et al. [30], morphosyntactic production and comprehension of syntactic structures at 5 years were associated with reading outcome at 9 years in preterm children. The knowledge of letters and words at 5 years [31] and phonological awareness and expressive and receptive language at 6 years [16] have been found to be associated with reading and writing outcome at 7 years [31] and at 8 years [16] in VP populations. In two previous cross-sectional studies [14,15], reading performance at 8 years in children born VP was correlated with lexical production and grammar comprehension [14] and with phonological awareness and rapid naming [15]. To date, it has been unclear whether an association between very early language development and later literacy outcome can be detected in VP/VLBW children. Our findings fill in this gap, suggesting that the association between language and literacy at 7 years of age is evident already at age 2 years in children born VP/VLBW.
In the present cohort, most children born VP/VLBW with small lexicon size, short M3L, and weak ELS at 2 years had weak literacy skills at 7 years compared with those with more advanced early language. In previous studies considering the continuum of language in children born VP/VLBW small lexicon size and short utterance length at 2 years have been shown to predict later language development [3,4,10,12]. Our study extends this knowledge to literacy skills. These results together emphasize the need for early screening of weak language development in the vulnerable group of VP/VLBW children Another novel finding was that language skills at 2 years explained a significant amount of the variance in literacy skills 5 years later, especially in the VP/VLBW group. Thus, the present findings provide evidence for the existence of a continuum between language development at 2 years and literacy ability at 7 years in these children. This study offers an interesting perspective on the association of early language with later literacy skills in preterm children using the Finnish language, which has a highly regular grapheme-phoneme correspondence. Different language versions of the CDI have been shown to be a cost-effective way of identifying small lexicon and short utterance length at 2 years [2][3][4]12]. In this study, the variables of the FinCDI were even stronger predictors for later literacy skills than ELS, which is a performance-based subtest of BSID-II [24]. Thus, our results support the view that parents can provide valuable information on early language development of their children, when structured, validated measure, such as CDI, is used.
Paternal education was a significant background variable in the regression models, especially in children born VP/VLBW. For the controls, the effect of paternal education was not as clear. The effects of paternal education are less studied than those of maternal education. However, in previous studies regarding the same PIPARI cohort [22,23], paternal education was found to relate also to precursors of reading at 5 years [5], and to verbal comprehension at 11 years [32] in children born VP/VLBW. Our findings suggest that fathers may have a significant role in supporting the language development of preterm children in the home environment during childhood years, at least in societies which emphasize the role of both parents in early childhood care, as in Finland.
In this study, lexicon size, M3L, and ELS measured at 2 years had fair predictive value (AUC values varied between 0.70 and 0.77, p < 0.001) for literacy skills 5 years later in the VP/VLBW group. The explaining value of early language at 2 years of age for literacy performance at school age has been investigated previously in full-term populations, e.g., in children with a familial risk of dyslexia [19,20]. Parallel results have been noted for late talkers, i.e., children with small expressive lexicon at 2 years but with an absence of cognitive delay or any other neurological condition explaining the slow language acquisition [33,34]. Late talkers perform consistently lower on language and literacy tasks at school age and even in adolescence than their peers [33][34][35]. In the present study, the predictive value of early language at 2 years of age for literacy skills at 7 years was established for the first time in the vulnerable population of preterm children born at very early gestational age (<32) and/or with very low birth weight (≤1500 g). Comparable findings detected in different populations support the view that very small lexicon size and/or very short utterance length at 2 years of age are risk factors for later language and literacy deficits after controlling for background factors. Furthermore, the predictive value of early language skills for later literacy outcome was better in children born VP/VLBW than in controls. This finding may be explained by the fact that the VP/VLBW sample included more children with early weak language skills relative to controls [3]. These results may also reflect the persistence of language-related difficulties among children born VP/VLBW with early weak language.
This study has several implications. First, it shows very clear longitudinal associations between very early language skills and later literacy outcome in preterm children. Thus, our findings propose the clinical importance of screening language skills of preterm children born at very early gestational age and/or with very low birth weight at 2 years of corrected age. Our findings highlight the usefulness of the validated parent-reported form, such as CDI [13,25] in the follow-up of the vulnerable group of high-risk prematurely born VP/VLBW children for identifying children at risk for later literacy deficits. Identifying developmental problems as early as possible is important, since it enables targeted early interventions and support. In addition, standardized parental report forms, such as CDI, may promote parents' active involvement in observing and supporting the language development of their preterm-born child. Our results provide information also for the educational professionals working with school-aged children born VP/VLBW showing the higher percentage of weak pre-reading, reading and writing skills in this population when compared with full-term control children.
Strengths of the study include its longitudinal design with a well-defined cohort of children born VP/VLBW together with a control group born in the same hospital. The longitudinal data from altogether 274 children provided a great possibility to assess the associations between early language development and later literacy performance. Both a validated parent-report form [13,25] and a test-based measure [24] were used to gather information on early lexical and grammatical development. The use of different types of method to assess early language development strengthened our findings. In our study, the phonological awareness task, which included three to seven-letter words said aloud phoneme by phoneme, also relates to working memory and actually measures both domains. However, the participants also had visual aid: at the same time as they heard the phonemes, they saw pictures of the correct word and three other alternatives. The participants had to mark one picture out of four alternatives they thought would best match the word. This might have reduced the burden of the working memory during the task. As a limitation, measures used in the study provided information on expressive language only. Information on receptive language would have provided an even more comprehensive view of early language development. This should be taken into consideration when applying these results to a clinical context.

Conclusions
Language development is essential for academic learning of children starting school. It is important to recognize potential risks for learning disorders as early as possible. Our study shows, for the first time, that problems in literacy skills at the beginning of formal schooling at 7 years of age may be identified already at age 2 years in preterm children born at very early gestational age and/or with very low birth weight. Early identification enables early interventions for those preterm children at risk for later literacy deficits. If concern arises regarding the early language ability of preterm children based on the results of a parental report form, such as CDI, a broader assessment of language skills by a speech-language pathologist is recommended. We emphasize the need for further studies (randomized controlled trials) regarding effective early interventions for VP/VLBW children at risk for literacy deficits.
Appendix A   Table A1. Results of multiple variable linear regression analysis with reading precursors, reading, and writing skills at 7 years of age as dependent variables, and with lexicon size at 2 years of age and background factors as independent variables. Results of full-term controls are presented.

Reading Precursors
Reading Writing