How Word/Non-Word Length Influence Reading Acquisition in a Transparent Language: Implications for Children’s Literacy and Development

Decoding skills are crucial for literacy development and tend to be acquired early in transparent languages, such as Brazilian Portuguese. It is essential to better understand which variables may affect the decoding process. In this study, we investigated the processes of decoding as a function of age in children who are exposed to a transparent language. To this end, we examined the effects of grade, stimulus type and stimulus length on the decoding accuracy of children between the ages of six and 10 years who are monolingual speakers of Brazilian Portuguese. The study included 250 children, enrolled from the first to the fifth grade. A list of words and pseudowords of variable length was created, based on the structure of Brazilian Portuguese. The children were assessed using the computer program E-prime®, which presented the stimuli on the screen in random order, and were instructed to read them aloud. The results indicate two important moments for decoding: the acquisition and the mastery of decoding skills. Additionally, the results highlight an important effect of the length and type of stimuli and of how these interact with school progress. Moreover, the data indicate the multifactorial nature of decoding acquisition and the different interactions between variables that can influence this process. We discuss the medium- and long-term implications of these findings, and possible individual and collective actions that can improve this process.


Introduction
In recent years, the study of reading development has been the focus of different areas given its relevance for academic success and its role as a predictor of cognitive, intellectual, and linguistic achievement [1,2]. According to the theory of information processing [2,3], reading is a complex skill dependent on multiple linguistic-cognitive abilities, which act interdependently for the proper processing of decoded information [3,4]. Two routes, the phonological and the lexical, are responsible for the acquisition and development of reading [1][2][3][4]. Proficient reading is only reached when decoding is automatized and when cognitive and metacognitive mechanisms are available to enable the understanding of the decoded material [5][6][7]. The phonological route uses the grapheme-phoneme conversion process, translating letters or groups of letters into phonemes through the application of grapheme-phoneme rules. In contrast, in the lexical route, pronunciation is not constructed segment by segment, but retrieved as a whole from the orthographic lexicon. However, the lexical route is used only when the item to be read has its orthographic representation pre-stored in the orthographic lexicon, that is, when it was previously acquired through the phonological route [1][2][3][5][6][7]. Thus, the development and automatization of the decoding process are fundamental for literacy [5][6][7].
In the process of acquisition and development of decoding and reading, there is a transition from slow reading, based on the grapheme-phoneme relationship, to rapid and assertive word recognition. This evidences the reduction in the use of the phonological route and the increase in the use of the lexical route [4][5][6][7]. Thus, phonological decoding is essential for the development of automatic visual recognition of words, a key skill for reading fluency. Initially, phonological decoding is responsible for familiarizing the novice reader with orthographic representations necessary for fluent and effortless decoding. It is important to emphasize, however, that this process occurs gradually, and the transition from the use of the phonological to the lexical route takes place throughout the development of reading and does not end with literacy [2][3][4][5]. Researchers from different areas have developed studies in order to verify the applicability of the dual route model to different languages, with conflicting results. The cognitivist model of reading, supported by the dual route theory, is also applicable to Brazilian Portuguese. Pinheiro [8] showed that beginning Brazilian readers tend to rely primarily on grapheme-phoneme conversion rules to decode unfamiliar words. Simultaneously, these readers acquire the orthographic representation of the decoded words so that they become familiar and, thus, the decoding can be automated.
In addition to the decoding process, another variable has been identified as having great influence on the learning of the written code: the orthographic characteristics of each language [9][10][11][12]. Several studies have shown that this process varies according to the orthography of the language in which the child is learning to read and write [9][10][11][12]. Considering the orthographic variation of languages, a group of authors [13] developed the theory of orthographic depth. The authors argue that writing systems represent the phonology of a given language through orthography, via rules that do not necessarily operate on the phoneme-grapheme relation. Because writing systems represent the phonology of the language with different degrees of consistency, the transparency (a phoneme is represented by a single grapheme and vice versa) or opacity (a phoneme is represented by more than one grapheme and vice versa) of the relationship between phonemes and graphemes in a language can facilitate or hinder the acquisition and development of reading. Following this line of research, later studies stated that, in alphabetic-based languages, such as Brazilian Portuguese, the development of reading begins with basic skills of grapheme-phoneme relation, followed by the acquisition of orthographic representations for a more automatic and fluent decoding [9,11,12,14]. Most studies on cross-linguistic differences in reading have focused on European orthographies [9][10][11][12]. Much less is known about other regions or even about languages that are variants of European languages, such as Brazilian Portuguese, which has its own characteristics, as detailed in the following section. Thus, findings on such languages would expand the knowledge on the reading acquisition process to include orthographies and language variants that share similarities with a European language but also have their own exclusive characteristics.

Brazilian Portuguese Orthography
Although European Portuguese has an intermediate orthographic depth (Seymour et al., 2003), Brazilian Portuguese has a very transparent decoding system, since it has only three inconsistent (irregular) graphemes [15]. The orthography of Brazilian Portuguese presents a set of consistent, biunivocal graph-phonemic relations. In addition, the Brazilian pronunciation of vowels differs markedly from that of European Portuguese: vowels are longer and more stressed, which facilitates their perception, the perception of consonants, phoneme-grapheme association and, thus, decoding [15]. Moreover, most of the inconsistent graph-phonemic relations of Brazilian Portuguese are governed by rules, that is, they depend on the graphemic context, but are easily understandable by Brazilian Portuguese speakers [15]. Only a small part of this set of inconsistent relations is not rule-governed, that is, irregular inconsistent. This last set contains the three most opaque graphemes of Brazilian Portuguese. Thus, the graphemes governed by rules independent of the graphemic context are: "p", "b", "t", "d", "f", "v", "ss", "ç", "sc", "ch", "j", "nh", "rr", "ü", "ó", "õ", "á", "à", "â", and "ã". Graphemes governed by context-dependent rules are described through 23 rules. These include, for example, the rules for decoding the grapheme "g" before the letters that represent the vowels "i", "í", "e", "ê" and "é", as in "gelo" and "girafa" (ice and giraffe), and in other contexts, as in "água", "gola" and "gato" (water, collar, cat).
There are decoding rules that depend on the application of metalinguistic knowledge or knowledge of the morphosyntactic and semantic context present in the text. However, in some of these cases, this knowledge must be combined with the pairing of the word with the orthographic representation present in the mental lexicon. These items, therefore, can only be read correctly via the lexical route. This last set includes the rules for decoding the graphemes "e" and "o" when not marked by a diacritic. The same applies to the grapheme "x" in intervocalic position, as it can represent three different sounds: /ʃ/ as in "abacaxi" (pineapple), /s/ as in "máximo" (maximum) and /ks/ as in "taxi". The correct decoding of this grapheme, in these contexts, depends on the storage of orthographic representations of the words in the mental lexicon.
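The contrast between a context-dependent rule and a grapheme that can only be resolved lexically can be illustrated with a minimal Python sketch. This is not the rule set described in [15]: the function names, the phoneme symbols chosen for "g" (/ʒ/ before the listed vowel letters, /g/ elsewhere) and the three-entry lexicon are assumptions made for illustration only, and digraphs such as "gu" are ignored.

```python
from typing import Optional

# Illustrative sketch only: a toy version of one context-dependent rule ("g")
# and one lexically resolved grapheme (intervocalic "x").
# Phoneme symbols and the tiny lexicon are assumptions, not the rules of [15].

FRONT_VOWEL_LETTERS = {"i", "í", "e", "ê", "é"}

# Hypothetical orthographic lexicon: stored sound of intervocalic "x" per word.
X_LEXICON = {"abacaxi": "ʃ", "máximo": "s", "taxi": "ks"}


def decode_g(word: str, index: int) -> str:
    """Decode the "g" at position `index` from the letter that follows it."""
    if word[index] != "g":
        raise ValueError("no 'g' at this position")
    next_letter = word[index + 1] if index + 1 < len(word) else ""
    # Context-dependent rule: the following letter alone decides the sound.
    return "ʒ" if next_letter in FRONT_VOWEL_LETTERS else "g"


def decode_intervocalic_x(word: str) -> Optional[str]:
    """Return the stored sound of "x", or None if the word is not in the lexicon."""
    # No rule applies here: the answer exists only if the word was stored before.
    return X_LEXICON.get(word)
```

In this sketch, `decode_g` never needs to know the word, whereas `decode_intervocalic_x` returns `None` for any item absent from the lexicon, which mirrors why such items can only be read correctly via the lexical route.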

Decoding Assessment
Decoding is a crucial skill for literacy and for the consolidation of fluent reading and, consequently, reading comprehension [16][17][18]. The assessment of decoding through reading aloud is currently the most frequently used measure to monitor the acquisition and progress of the skill, both in school assessments and in verifying the effectiveness of intervention programs [19][20][21]. In addition, the results of the oral decoding assessment are an important predictor of the reading performance of the individual. In the United States, oral reading assessment measures are analyzed by the Federal Education Department to monitor academic development and to develop stimulation and/or intervention programs [19]. The type of material used for the evaluation must be adequate to the objective that has been set, as the results differ according to the measure used, such as isolated words or texts [19][20][21]. The oral reading of isolated words is the most frequently used task to assess the individual's proficiency in decoding [22]. This task removes contextual and visual (pictorial) cues and thus strictly evaluates decoding. Reading assessment models are strongly based on the dual route model, with the use of words and nonwords.
In Brazil, it is common for schoolchildren to present some difficulty in reading or writing [23]. Therefore, it is essential to characterize the reading condition of children to allow proper identification of typical variations of development or possible deficits. According to the latest evaluation of the Program for International Student Assessment (PISA), Brazil remained with high rates of school failure [24] and the country has been among the worst performing countries for 10 years. According to this latest report, the reading difficulties faced by Brazilian schoolchildren begin in elementary education, interfering with the consolidation of literacy. These difficulties, when not identified or treated, become chronic, leading the student to low performance throughout the school years.
In view of this reality, the occurrence of "false positives" for reading and learning disorders is very common, since the characteristics of a learning difficulty can resemble the manifestations observed in different learning disorders and specialized professionals need to carry out the appropriate differentiation of these conditions, in a specific evaluation and with the support of a multidisciplinary team. Thus, decoding assessment becomes essential, as it allows early identification of possible deviations in development, elaboration of stimulation and rehabilitation programs, in addition to favoring the adequate process of literacy and schooling. Furthermore, understanding how language characteristics (i.e., opacity/transparency) facilitate or hinder such a process is of great value to different areas of knowledge and countries so that public policies to promote literacy can be specifically strengthened and advanced [10,11].

Empirical Implications
In addition to understanding the development of decoding in children of literacy-acquisition age in terms of opacity and transparency, understanding how the characteristics of a language itself (e.g., word length and syllable structure) affect this development is of fundamental importance to advance research in the area [5,6].
In recent years, studies have investigated reading processing in relation to other aspects of the language, such as word length and syllabic structure [10][11][12]. The syllabic structure of French affects both the development of decoding and the spelling knowledge involved in children's graph-phonemic decoding and writing [12]. For Brazilian Portuguese [21], children with less schooling have trouble decoding words that are longer or that fall outside the most common syllable pattern of the Portuguese language (Consonant-Vowel). It is important to highlight that the study [21] was limited to two grades and that the authors claimed that more studies in the field were needed. Such data may provide consistent support for the planning and execution of actions that can, in the medium and long term, improve the low reading rates commonly presented by Brazilian students.
Understanding the acquisition patterns of graph-phonemic decoding and reading in the different spelling patterns is important, not only to favor the development of children according to their language, but also to identify how the predictors of reading skills vary from one language to another [9][10][11]. Continued research that investigates linguistic features more deeply can also promote a better understanding of how these features correlate with the skills underlying decoding (e.g., phonological awareness, rapid automatized naming (RAN)) in different languages [9][10][11][12].

The Present Study
The present study is anchored in the theory of information processing and considers the theoretical assumptions that support the dual route theory as well as its interaction with the different characteristics of languages. The aim of this study is to investigate the effect of grade, stimulus type (word/nonword) and stimulus length on the decoding accuracy of children between six and 10 years of age who are speakers of Brazilian Portuguese, using a list of words and pseudowords that considers the characteristics of the language. The present study contributes to deepening knowledge on the process of acquiring basic reading skills and its relationship with the characteristics of a transparent language. In addition, it will provide data for cross-sectional and longitudinal cross-linguistic studies in different fields. This study differs from others by examining the effects of different linguistic characteristics on the literacy process of children across an entire literacy cycle, in addition to providing data on two of the most used reading assessment measures [16][17][18][19].

Hypotheses
Hypothesis a. As the decoding acquisition process develops, better performance is expected for older children, shorter stimuli, and words, as compared to nonwords. Grade (a1), stimulus type (word/pseudoword) (a2), and stimulus length (a3) will influence performance on the reading task.

Hypothesis b. The following interactions are expected:
Hypothesis b1. Between grade and type of stimulus: that is, the differences in decoding performance between words and pseudowords will vary according to grade. Based on the dual route hypothesis, we expect the difference to be greater in lower grades than in higher grades.
Hypothesis b2. Between grade and stimulus length: that is, the difference in reading performance between monosyllables and polysyllables will vary according to grade. As decoding acquisition becomes more advanced, the effect of stimulus length should be less pronounced and, therefore, expected to be smaller for higher grades.
Hypothesis b3. Between stimulus type and length: that is, the difference in reading performance between monosyllables and polysyllables will be different for words and pseudowords. Based on the dual route hypothesis, longer nonwords should be more challenging than longer words as no benefit from lexical route is expected for nonwords.
Hypothesis b4. Between grade, stimulus type and stimulus length: that is, the interaction between grade and word length should be different for each type of stimulus as the decoding acquisition process advances.

Material and Methods
This is a prospective study that followed the principles of the Standards for Educational and Psychological Testing (SEPT) [25], a guideline proposed by American organizations that compiles fundamental recommendations and definitions regarding the psychometric aspects involved in the preparation and interpretation of tests, in addition to the steps necessary for the validation of a procedure. This study was approved by the Institution's Research Ethics Committee (CEP No. 2262300). The data collection procedures started only after schools, parents/guardians, and children signed the Free and Informed Consent Form.

Step 1: Evidence of Validity Based on Test Content
At this stage, the target population was defined, an extensive literature review was carried out, and, for the elaboration of the items, the syntactic and semantic aspects that contribute to the clarity, pertinence, coherence and scope of the items were considered. The representativeness and relevance of the items in relation to the outcome was evaluated by judges with expertise in the subject of the test.
Definition of the target population: students from a public and a private school, both in the city of São Paulo, were included in the study with the aim of evaluating a representative sample of school-age children. The indicators of the National Institute of Studies and Research (INEP) for the test Provinha Brasil, which is the main indicator for calculating the Basic Education Development Index (IDEB), were considered to select the schools to be included in this study. The selected schools presented scores close to the national average for public and private schools, based on the most recent published data [23].
Literature Review: an extensive literature review was carried out regarding the different word reading tests or word banks developed for Brazilian Portuguese in recent years [26][27][28][29][30][31][32][33]. It was observed that most of the compiled literature used criteria such as frequency of words, or even their concreteness, with the exception of one study [33] that was based on the language decoding rules [15]. We emphasize, however, that we did not find tests or procedures designed according to the characteristics of Brazilian Portuguese that considered aspects beyond the decoding rules, such as: variation in word length and its frequency of occurrence in the language.
Elaboration of the items: a list of words and nonwords of variable lengths (from monosyllabic to polysyllabic) was created, based on three fundamental principles: (a) the decoding rules of Brazilian Portuguese [15,34], namely context-independent graphophonemic correspondence, context-dependent graphophonemic correspondence, and irregular graphemes; (b) the length variability of Brazilian Portuguese words, which range from monosyllables to polysyllables; (c) the frequency of occurrence of the different word lengths in the language [34], as shown in Table 1. In Brazilian Portuguese, 86.1% of words range from monosyllables to polysyllables with a maximum of five syllables. For this reason, the words of the present study followed this same pattern. Elementary school children, the target population of this procedure, are not exposed to all variations in Brazilian Portuguese word length [15,20]. Therefore, only polysyllables with up to five syllables were included. Taking into account the decoding rules, the length of words, and the frequency of occurrence of these words in the children's experiences, a final list of 68 words was created (Appendix A), distributed as follows: six monosyllables (8.8%) (e.g., boi (ox), pé (foot)); 16 two-syllable words (23.5%) (e.g., noite (night), chuva (rain)); 22 three-syllable words (32.3%) (e.g., escola (school), zeloso (zealous)); 16 polysyllables with four syllables (23.5%) (e.g., aquarela (watercolor), nascimento (birth)); and eight polysyllables with five syllables (11.7%) (e.g., maravilhosa (wonderful), insegurança (insecurity)).
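The distribution of the 68 words across length categories can be checked with a short Python sketch. The item counts come from the text; the dictionary keys and function name are illustrative choices. Percentages are rounded to one decimal place, so they may differ in the last digit from figures that were truncated rather than rounded.

```python
# Item counts per length category, as reported in the text.
ITEM_COUNTS = {
    "monosyllables": 6,
    "two-syllable words": 16,
    "three-syllable words": 22,
    "four-syllable words": 16,
    "five-syllable words": 8,
}


def length_distribution(counts):
    """Return the total number of items and each category's share in percent."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


TOTAL, SHARES = length_distribution(ITEM_COUNTS)

for name, share in SHARES.items():
    print(f"{name}: {ITEM_COUNTS[name]} items ({share}%)")
print(f"total: {TOTAL} items")
```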
The use of nonwords in the decoding assessment of Brazilian Portuguese speakers is essential due to the transparency of the language, especially when considering the dual route [15,34]. The list of nonwords was designed by a linguist based on the list of words, respecting the phonological structure of each one of them. The following rules were adopted:
• Vowels: (a) always keep the corresponding low for exchange (/a/ in an unstressed position); (b) replace the middle vowel with a middle vowel; (c) replace the high vowel with a high vowel;
• Plosives/Fricatives: replace respecting the following order of priority: point of articulation, voicing and, in case of impossibility, mode of articulation;
• Nasal: replace only the point of articulation;
• Liquid: replace lateral phonemes with non-lateral ones and vice versa.
In addition, the transformation of words into nonwords followed criteria for maintaining the length of the word. Thus, for the monosyllables, only the vowels were changed; for two-syllables, a vowel and a consonant were changed; for trisyllables, two consonants and one vowel were changed; for polysyllables, three consonants and two vowels were changed. After changes were made to the structure of the words, they were spelled in order to respect the decoding rules of Brazilian Portuguese. However, the various possibilities of graphophonemic representation were considered, since the nonwords do not follow the orthographic rules of the language (Appendix B).
Analysis by expert judges: each of the nonwords was evaluated by three different expert judges, who determined whether the nonwords met the instrument's construction criteria, both in terms of structure and length. The collected data were submitted to statistical analysis using SPSS software, version 25. The analysis of agreement between judges was based on Fleiss' Kappa coefficient, which is a generalization of Cohen's Kappa coefficient. Kappa values greater than 0.75 are considered to indicate excellent inter-judge agreement; values between 0.40 and 0.75, moderate agreement; and values below 0.40, weak and/or non-existent agreement. In this research, the analysis of agreement among the three judges showed the following results: κ = 0.800 for adequacy to the nonword structure criteria (excellent agreement) and κ = 0.575 for adequacy to the nonword length criteria (moderate agreement).
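For readers who want to see how such a coefficient is obtained, the following Python sketch computes Fleiss' Kappa from a table of judge counts and applies the interpretation bands quoted above. It is a plain-Python illustration of the standard formula, not the SPSS procedure used in the study; the function names and the input layout (one row per item, one column per rating category) are assumptions.

```python
def fleiss_kappa(table):
    """Fleiss' Kappa; table[i][j] = number of judges who put item i in category j."""
    n_items = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every item must be rated by the same number of judges")

    # Mean observed agreement across items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_items

    # Agreement expected by chance, from the overall category proportions.
    n_categories = len(table[0])
    p_e = sum(
        (sum(row[j] for row in table) / (n_items * n_raters)) ** 2
        for j in range(n_categories)
    )

    # Undefined (division by zero) if all ratings fall in a single category.
    return (p_bar - p_e) / (1 - p_e)


def interpret_kappa(kappa):
    """Interpretation bands as stated in the text."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        return "moderate"
    return "weak"
```

With three judges and two categories (adequate / not adequate), each row of `table` would be a pair such as `[3, 0]` or `[2, 1]`; `interpret_kappa(0.800)` and `interpret_kappa(0.575)` return "excellent" and "moderate", matching the labels reported above.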
The developed list of words and nonwords will henceforward be addressed as The Protocol for Decoding Acquisition Development-Protocolo de Acompanhamento do Desenvolvimento da Decodificação (PRADE) [35] (Appendices A and B).

Step 2: Evidence of Validity Based on Response Processes
In this step, the adequacy, structure and application of the items in a real context were verified. There is no explicit recommendation for the sample size at this stage. The formation of representative strata of the target population, composed of at least 10 individuals in each stratum, is suggested. Interviews/procedures were carried out to verify that participants understood the test items [25].
Sample size: the number of classes (strata or groups) into which the sample would be divided was considered. The following formula was adopted to calculate the minimum sample size: k = 1 + 3.322 × log10(n), where k = number of classes (strata or groups), n = sample size, and log10 = base 10 logarithm. Considering that each school would have five distinct groups of children (grades 1 to 5), solving the formula for k = 5 yields n ≈ 16, which was taken as the minimum number of sample elements for each group. Thus, it was established that the sample should have at least 80 sample elements, distributed among the school groups.
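The arithmetic behind these figures can be reproduced with a small Python sketch. The formula has the form of Sturges' rule; reading it as "solve for n given k = 5, then multiply by the five grades" is an interpretation of the text, and the function names are illustrative.

```python
import math


def classes_for_sample(n):
    """k = 1 + 3.322 * log10(n): number of classes for a sample of size n."""
    return 1 + 3.322 * math.log10(n)


def min_sample_for_classes(k):
    """Invert the formula: smallest whole n for which the rule yields k classes."""
    return math.ceil(10 ** ((k - 1) / 3.322))


N_GROUPS = 5  # grades 1 to 5
PER_GROUP = min_sample_for_classes(N_GROUPS)  # assumed reading: minimum per group
TOTAL_MIN = PER_GROUP * N_GROUPS

print(PER_GROUP, TOTAL_MIN)
```

Under this reading the sketch gives 16 elements per group and 80 in total, the same values stated in the text.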
To guarantee the statistical power of the sample we chose to collect a number greater than the minimum indicated by the analysis. In addition, we selected a balanced number of children from public and private schools previously selected in order to constitute a representative sample of the Brazilian educational reality. Thus, the study included 250 children, enrolled from first to fifth grade of elementary school. Each grade had 50 children, as follows: 23 girls and 27 boys in 1st grade, with a mean age of 6.6; 23 girls and 27 boys in 2nd grade, with a mean age of 7.7; 22 girls and 28 boys in 3rd grade, with a mean age of 8.5; 26 girls and 24 boys in 4th grade, with a mean age of 9.6; and 21 girls and 29 boys in 5th grade, with a mean age of 10.5.
To ensure that the study sample was composed of children with different academic profiles and to prevent a single profile of children from being indicated for participation, we chose to use stratified random sampling for the selection of participants. Thus, the children were numbered from 1 to 250, in ascending order, according to the school year, and then these numbers were used to randomly select the final sample of the study.
Participants are able to complete the procedures: to be included in the study, children should have no auditory or visual complaints, no signs of neurological or cognitive disorders, no grade retention in their school records, and no phonological or oral language alterations. Oral language was assessed through the ABFW phonology test [36], which consists of naming and imitating phonologically balanced linguistic items. In addition, we applied the word reading subtest of the School Performance Test [37] due to its procedural similarity with what was intended to be evaluated in this research.

Step 3: Evidence of Validity Based on Internal Consistency
In this step, the degree of relationship between the test items and the outcome was verified by applying the test to a sample of the target population. Corrected item-total correlations and inter-item correlations were examined [25].
Application of the test in a sample of the target population: for this stage, the computer program E-prime® was used to present the stimuli. The stimuli were programmed to appear on the screen in random order. Before starting the experiment, each of the 250 children was presented with a screen containing instructions about the test, which were read by the researcher: "Hello! Next, words that exist and that do not exist, of different sizes, will appear. Read them aloud the way you think the word should be read. If a word that you do not know appears, no problem, move on to the next one! Good reading" (original in Portuguese: "Olá, agora eu vou te apresentar palavras que existem e que não existem, de diferentes tamanhos. Leia em voz alta do jeito que você acha que a palavra deve ser lida. Se aparecer alguma palavra que você não conhece, você pode pular para a próxima, sem problemas. Boa leitura!"). The stimuli to be decoded were typed in Arial font, size 20, in uppercase. The children were instructed to read the words the way they were used to or the way they thought they should be read. If the child refused to read the word or could not decode it, they could skip it. The experiment was designed and run on E-Prime and video recorded for later analysis. Reading time was computed by E-Prime. The responses were transcribed by two Speech-Language Pathologists; a score of 0 was assigned to incorrect decoding and a score of 1 to correct decoding. No discrepancies were observed. As expected, the data from older children were faster and easier to analyze than those from first and second graders, given the older children's more advanced decoding skills.
Data analysis: to verify the decoding accuracy, only the percentage of words correctly read was considered, respecting the graphophonemic and orthographic relations, in the case of words, and the graphophonemic relations, in the case of nonwords. These data were analyzed both in terms of the length of the words and the total percentage of correct answers in each of the lists. Generalized Estimating Equations (GEE), a method for modeling clustered data, were applied to estimate the parameters of a generalized linear model with a possible unmeasured correlation between observations and to test the study hypotheses.
Figure 1 shows the mean percentage of correct responses (and 95% confidence intervals) according to grade and the number of syllables for words and nonwords. In general, the percentage of correct responses was higher for more advanced grades, with first and second grades differing from each other and from the other grades. The percentage of correct responses decreased with increasing stimulus length, but more markedly when comparing monosyllables, disyllables and trisyllables, especially in the early grades, as expected. A better performance for words compared to nonwords was observed.
Complete measures of central tendency and dispersion of the percentage of correct responses according to grade, stimulus type and length can be found in Table 2. In general, the data show better reading performance as the school grade advances. It should be noted that, for first and second grade, the effects of stimulus length and stimulus type were greater than those observed in the more advanced grades, as expected. In addition, the standard deviation tends to decrease as the school grade advances, indicating more homogeneity from the third grade onwards. The results for the pseudowords show the same pattern of decoding skills, although with lower accuracy than for words in all grades except the first.
A Generalized Estimating Equations (GEE) model was applied to test the study hypotheses and verify the effect of grade, type and length of stimulus (Hypothesis a) and the two-way and three-way interactions between these variables (Hypothesis b) on the percentage of correct answers in the reading task. Based on the nature of the dependent variables, the best fit was obtained with a gamma distribution, an identity link function and an unstructured covariance matrix for the two variables. Different adjustments were tested based on the quasi-likelihood under the independence model criterion (QIC), and the model residuals were evaluated using Q-Q plots. Table 3 presents the effects of each factor separately for each of the models.
Note: 95% CI = 95% confidence interval; LL = lower limit; UL = upper limit; * = statistically significant value at 5% (p ≤ 0.05); degrees of freedom = 1 for all analyses.
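The accuracy measure described here, the percentage of correctly decoded items broken down by stimulus type and length, can be sketched in a few lines of Python. The record layout (stimulus type, number of syllables, 0/1 score), the function name and the example responses are hypothetical; they only illustrate how 0/1 scores turn into percentages per cell.

```python
from collections import defaultdict


def percent_correct(responses):
    """Percentage of correct responses per (stimulus_type, n_syllables) cell.

    `responses` is an iterable of (stimulus_type, n_syllables, score) tuples,
    where score is 1 for correct decoding and 0 for incorrect decoding.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for stimulus_type, n_syllables, score in responses:
        totals[(stimulus_type, n_syllables)] += 1
        correct[(stimulus_type, n_syllables)] += score
    return {cell: 100 * correct[cell] / totals[cell] for cell in totals}


# Hypothetical responses from one child (not data from the study).
EXAMPLE = [
    ("word", 1, 1), ("word", 1, 1),
    ("word", 3, 1), ("word", 3, 0),
    ("nonword", 1, 1), ("nonword", 1, 0),
    ("nonword", 3, 0), ("nonword", 3, 0),
]
RESULT = percent_correct(EXAMPLE)
```

Per-child percentages of this kind, one per combination of stimulus type and length, are the sort of repeated measurements within a child that a clustered-data method such as GEE is designed to handle.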

Results
Taken together, the results of Tables 3 and 4 demonstrate that there are multiple effects and interactions between the variables and that these factors influence accuracy on a reading task.
To better investigate the observed effects, post-hoc analyses of the estimated marginal means of the percentage of correct responses for each grade and stimulus length and type were conducted using Student's t-tests with Bonferroni correction for multiple comparisons. The effect size was measured by calculating the d coefficient (Cohen, 1992). The results of these tests can be found in Table 5. The data in Table 5 indicate a high variability in the responses of children in the lowest grade (1st grade), and the effect of word length is limited to the 2nd grade, which is quite different from all other grades. The data also indicate a ceiling effect of stimulus length on the decoding performance of words around third grade, contrary to what is observed for nonwords, which maintain such an effect until fifth grade.
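As a rough illustration of the two quantities named here, the Python sketch below computes Cohen's d with a pooled standard deviation and a Bonferroni-adjusted significance threshold. It assumes two independent groups (for example, two grades); the study's post-hoc tests were run on estimated marginal means from the fitted model, so this is a simplified stand-in rather than the authors' procedure, and the function names are illustrative.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons
```

For instance, comparing all five grades pairwise would involve 10 comparisons, so `bonferroni_alpha(0.05, 10)` would set the per-test threshold at 0.005; the number of comparisons actually used in the study is not stated in this section.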

Hypothesis A
We observed that (a1) school year (χ² = 157.101, df = 4, p < 0.001), (a2) type of stimulus (word/pseudoword) (χ² = 727.674, df = 1, p < 0.001), and (a3) word length (χ² = 485.817, df = 4, p < 0.001) influenced performance in the decoding task, confirming hypothesis A. Performance was better in higher grades, for shorter stimuli, and for words as compared to nonwords, as the decoding acquisition process advanced.

Hypothesis B
All hypotheses related to interactions were confirmed, indicating significant interactions between: (b1) grade and stimulus type (χ² = 126.102, df = 4, p < 0.001), that is, the difference in decoding performance between words and nonwords is greater as grades advance, supporting the dual route theory; (b2) grade and stimulus length (χ² = 101.155, df = 16, p < 0.001), that is, the difference in decoding performance between monosyllables and polysyllables should decrease as grades advance, as evidence of the advance in the decoding acquisition process; (b3) stimulus type and length (χ² = 379.190, df = 4, p < 0.001), that is, the difference in decoding performance between monosyllables and polysyllables is different for words and nonwords, also supporting the dual route theory, since longer nonwords should be more challenging than longer words because nonwords are not expected to benefit from the lexical route; and (b4) grade, type of stimulus and length of stimulus (χ² = 115.962, df = 16, p < 0.001), that is, the interaction between grade and word length is different for each type of stimulus as the decoding acquisition process advances.
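As a consistency check on the statistics reported for Hypotheses A and B, the sketch below recomputes the degrees of freedom and the upper-tail chi-square probabilities. The factor levels (5 grades, 2 stimulus types, 5 lengths) are an assumption inferred from the reported main-effect degrees of freedom, so the main-effect rows match by construction and only the interaction rows are a real check. The helper names are hypothetical, and the closed-form tail probability is implemented only for df = 1 and even df, which covers every value reported here.

```python
from math import erfc, exp, sqrt

# Assumed number of levels per factor (df = levels - 1 for each main effect).
LEVELS = {"grade": 5, "type": 2, "length": 5}

# (factors involved, reported chi-square, reported df), copied from the text.
REPORTED = [
    (("grade",), 157.101, 4),
    (("type",), 727.674, 1),
    (("length",), 485.817, 4),
    (("grade", "type"), 126.102, 4),
    (("grade", "length"), 101.155, 16),
    (("type", "length"), 379.190, 4),
    (("grade", "type", "length"), 115.962, 16),
]


def interaction_df(*factors):
    """Degrees of freedom of an effect: product of (levels - 1) over its factors."""
    df = 1
    for factor in factors:
        df *= LEVELS[factor] - 1
    return df


def chi2_sf(x, df):
    """Upper-tail chi-square probability, exact for df = 1 or even df."""
    if df == 1:
        return erfc(sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k  # builds (x/2)^k / k! incrementally
            total += term
        return exp(-x / 2) * total
    raise ValueError("only df = 1 or even df are supported in this sketch")


if __name__ == "__main__":
    for factors, chi2, df in REPORTED:
        label = " x ".join(factors)
        print(label, "| df matches:", interaction_df(*factors) == df,
              "| p < 0.001:", chi2_sf(chi2, df) < 0.001)
```

Under these assumptions, the interaction degrees of freedom follow directly from the factorial design, so a mismatch would point to a transcription error in a reported value.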

Discussion
The current study investigated the processes of decoding as a function of the age of children who are exposed to a transparent language. The effects of grade, stimulus type and stimulus length on the decoding accuracy of children between the ages of six and 10 years who are monolingual speakers of Brazilian Portuguese were studied. The findings are in line with studies reporting that in alphabetic-based languages, such as Brazilian Portuguese, the development of decoding tends to be early [9][10][11][12][13][14][38,39]. Furthermore, the multiple interactions between the variables investigated in this study are in line with international research that indicates the multifactorial nature of decoding development [6][40][41][42]. This is extremely important for understanding the acquisition process, so that investment in projects and public policies can better direct the literacy process.

Acquisition of Decoding Skills
The results indicate two important moments for decoding development: the acquisition phase, in first and second grade; and the mastery phase, in third grade, with similar performance in fourth and fifth grades. In addition, there is an important effect of length and type of stimuli and how they interact as grade progresses, with such effect reducing as grade advances. Regarding nonwords, there is a greater influence of length, with a lower percentage of accuracy throughout the entire elementary school cycle when compared to words.
The pattern of decoding development observed in this study is in accordance with that described by the dual route theory [1,2,6], which explains the development of automaticity in reading. According to that theory, as individuals learn to decode and master the written code, they tend to show major changes in their decoding characteristics at the beginning of the process. Thus, as their orthographic lexicon increases, the values found in the decoding speed tend to stabilize and the differences with their peers of similar grades are reduced [10,11].
Bar-Kochva and Breznitz [43] also argue that for children learning to decode in more transparent orthographies, understanding and mastering the grapheme-phoneme conversion rules, in addition to providing faster learning of the written code, will initially imply a greater dependence on phonological skills than on those of visual recognition. In the present study, the phonological route was strongly influenced by word length, as there was a decline in the percentage of correct responses with increasing stimulus length, especially for nonwords. In the case of words, this influence was strongly concentrated in first and second grades, attenuating from the third grade onwards. These data are extremely important because they inform literacy methods, educational speech therapy programs, and even the therapeutic intervention of children with learning disabilities, indicating that stimulus length should be an important variable in the planning of activities for children in the initial phase of the decoding acquisition process.
Caravolas [10] found similar results in a study carried out with English-, Czech- and Slovak-speaking children, identifying greater gains in speed and accuracy for words than for nonwords. The author argues that this finding may be due to the fact that nonwords are of low frequency, while words tend to be stored in the orthographic lexicon of readers, providing direct access and, consequently, faster decoding. The author, however, reaffirms the importance of the phonological route for the acquisition of orthographic patterns in alphabetic-based languages, reinforcing the hypothesis of "self-teaching" provided by learning the grapheme-phoneme conversion rules.
In line with what was observed in this study, more recent research has shown that, in general, decoding measures (i.e., accuracy and speed) tend to be more heterogeneous between the first and third grades of elementary schooling, whereas between the third and fifth grades they tend to be more stable and even similar [44]. This pattern of development finds theoretical support in the dual route theory, which explains the development of automaticity in reading [2][5][6][7]. Thus, as individuals learn to decode and master the written code, they tend to present greater changes in reading characteristics at the beginning of the process. As their orthographic lexicon increases, decoding speed tends to stabilize and the differences among peers in adjacent grades are reduced [44][45][46]. This reinforces the importance of encouraging the development of grapheme-phoneme conversion skills in the initial grades, so that the acquisition and development of decoding and reading are enhanced in both clinical and institutional settings. At the same time, for children from third grade onwards, the data suggest the need to reinforce activities beyond decoding, involving reading fluency, prosody and comprehension, in order to consolidate and improve the development of this skill, which is fundamental for the development of individuals in all spheres of their lives.

Decoding Skills, Policies and Socioeconomic Status
The present study indicates that for children literate in Brazilian Portuguese, the acquisition of decoding occurs primarily between the first and third grades, with increasing automatization and mastery of the lexical route from then onwards. These data point to the third grade as an important milestone in the reading development of Brazilian schoolchildren and are in agreement with international studies indicating that, in transparent languages, the development of decoding tends to happen early and stabilize in more advanced grades [12][13][14]. The fundamental importance of mastering decoding for the development of reading as a whole is well known [12][13][14][39,43]; thus, these data can be an important indicator for the development of public education policies aiming to improve teaching methods.
The document from the Brazilian Ministry of Education that determines the National Curriculum Bases indicates that students must master decoding by around third grade [47]. However, the latest reports from the Program for International Student Assessment (PISA) [24] indicate that the reading comprehension performance of Brazilian children is deficient throughout the education chain, mainly due to the residual difficulties in decoding that these children carry through their trajectory. Ultimately, these residual difficulties hinder reading comprehension and compromise the children's general academic performance.
This effect may be exacerbated by other factors such as socioeconomic status (SES). Kainz [48] discussed the academic achievement of African-American and Latin American children from schools in low-socioeconomic neighborhoods in the United States. The study sampled twenty thousand children enrolled in 900 schools across the United States and evaluated them at the end of kindergarten and at the end of the first grade. The data showed evidence that programs from the United States Department of Education aimed at reducing the academic differences between these children and their more privileged peers proved to be highly effective in reducing the deficits and gaps found in the first assessment. The author also stated that smaller class sizes and better-trained teachers were decisive variables for these results. The strengthening of public education policies, teacher training, and a broader presence of educational speech therapy can be crucial elements for increasing the potential of public education in Brazil. Considering the current findings regarding the role of word length in the acquisition of decoding in Brazilian Portuguese, the development of didactic materials, teacher training programs, and strategies to reduce learning difficulties must consider this variable to guarantee greater success in the initial grades and thus promote the development of better readers, with a positive impact on the Brazilian educational reality.

Cross-Linguistic Comparisons
Our data indicate that in transparent languages, such as Brazilian Portuguese, reading accuracy is influenced by word length, especially at the beginning of the schooling process. Similar results have been described for other transparent languages such as Finnish, Greek, German, Czech and Italian [10][11][12][39,42,43,49], in which the variations of this early mastery of decoding according to the degree of transparency are discussed. In some languages, such as Finnish, it is possible for children to master decoding in their first year of school [50]. In the case of Brazilian Portuguese, a language considered transparent, such mastery only occurs in the third year, although performance in the second year is already better than in the first. Such data are of fundamental importance for better understanding the process of acquiring decoding across the different degrees of transparency of the languages in which this process is studied. Bar-Kochva and Breznitz [43] also suggest that in learning to decode languages with transparent orthography, understanding and mastering the grapheme-phoneme conversion rules favor the learning of the written code, with a greater dependence on phonological skills than on visual recognition. In transparent languages, phonological skills not only favor the learning of the written code, but also help improve reading through more accurate decoding [10][11][12].
We can hypothesize that in transparent orthographies, monitoring the acquisition of decoding in typically developing children should be a priority in the initial grades, with the aim of automating this process at the beginning of literacy. Decoding is essential for the development of reading fluency and, consequently, reading comprehension, which is the ultimate goal of mastering the written code [10,12,18,46]. Thus, monitoring the acquisition of decoding becomes essential, especially in developing countries, where educational indexes are generally low on international assessments. In the latest PISA report [24], for example, the data indicated that Brazil has remained stagnant in the past decade and that Brazilian readers have consistent reading deficits. The report details that most students fail to learn basic elements of reading, such as decoding [24].
The present study recruited subjects only in the city of São Paulo, which is the most populous city in Brazil, with a population of about 22 million in the metro area. Further studies in other regions of the country are needed, mainly considering the size of Brazil and the variation in social and educational opportunities across regions. Considering the reading route theory in which this study is anchored, we believe that the pattern of decoding acquisition may be very similar across regions. However, children from underprivileged regions may acquire and master decoding skills later than observed here. That is a fundamental question that must be addressed in further studies.

Conclusions
The current study contributes to the understanding of the decoding acquisition process in children who speak Brazilian Portuguese and become literate in a transparent orthography, by showing that the acquisition of decoding is influenced by the type and length of the stimulus and that this influence varies across the elementary school grades, supporting the dual route theory [1][2][3][4][5], which has both clinical and theoretical implications.
The current findings contribute significantly to the area by indicating not only the process of acquiring decoding in a transparent language, but also its multifactorial nature and the different interactions between variables that can positively or negatively influence this process. These data are fundamental for expanding the understanding of this process in a transparent language, and we also discuss their medium- and long-term implications, including possible individual and collective actions for the improvement of this process, mainly when considering the importance of decoding for literacy and for the further development of the individual.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All data are available as Excel files, together with the videos recorded during the research, in the Research Laboratory where the study was conducted.

Conflicts of Interest:
The authors declare no conflict of interest.