The Correlation between Chinese Written Vocabulary Size and Cognitive, Emotional and Behavioral Factors in Primary School Students

Written vocabulary size plays a key role in children’s reading development. We aimed to study the relationship between Chinese written vocabulary size and cognitive, emotional, and behavioral factors in primary school students. Using stratified cluster sampling, 1162 pupils in Grades 2 to 5 in Guangzhou were investigated. Chinese written vocabulary size was assessed with the Chinese written vocabulary size assessment scale, cognitive factors with the dyslexia checklist for Chinese children (DCCC), and emotional and behavioral factors with the Strengths and Difficulties Questionnaire (SDQ). The scores of visual word recognition deficit (β = −3.32, 95% CI: −5.98, −0.66) and meaning comprehension deficit (β = −6.52, 95% CI: −9.39, −3.64) were negatively associated with Chinese written vocabulary size; the score of visual word recognition deficit (odds ratio (OR) = 1.04, 95% CI: 1.02, 1.07) was associated with a delay in written vocabulary size. In boys, the score of meaning comprehension deficit was negatively associated with Chinese written vocabulary size, whereas in girls the score of auditory word recognition deficit was negatively associated with it. The factor related to a delay in written vocabulary size was spelling deficit in boys and visual word recognition deficit in girls. Chinese written vocabulary size is significantly correlated with cognitive factors, but not with emotional and behavioral factors, in primary school students, and these correlations differ by gender.


Introduction
Written vocabulary knowledge includes the number of words known (breadth, or vocabulary size) and the richness of knowledge about those words (depth) [1]. In our study, Chinese written vocabulary size refers to the number of Chinese characters known by school-age children. It is the foundation of reading activities and a representative indicator of children's reading development, which is highly correlated with academic performance [2]. A delay in written vocabulary size prevents children from reading correctly and fluently, which in turn leads to academic failure and loss of confidence in learning [3]. The primary school stage is a critical period for acquiring written vocabulary [4]. A delay in written vocabulary size in primary school is a powerful predictor of reading and writing difficulties in adolescents and adults [5,6].
Many factors may be associated with written vocabulary size, including the family reading environment, school teaching quality, and children's characteristics. In this study, we examined the relationship between written vocabulary size and children's characteristics, including cognitive, emotional, and behavioral factors. Word learning is a complex cognitive process that draws on many cognitive skills, for example, visual processing, phonological processing, meaning comprehension, spelling, memory, and attention [7,8]. Which cognitive factors are core to word learning remains controversial. In alphabetic writing systems, phonological processing is considered the core ability for word learning [9]. Chinese characters, by contrast, form an ideographic writing system, so learning Chinese may depend more on visual, semantic, and orthographic processing [10]. Because different studies have examined different cognitive factors, it remains controversial which cognitive factor mainly affects Chinese written vocabulary size.
In terms of emotional and behavioral factors, there is a consensus that vocabulary delay and emotional and behavioral problems are substantially comorbid in school-age children [11,12]. Children with emotional disorders, hyperactivity, or conduct problems also show a delay in written vocabulary size in both alphabetic and logographic writing systems [13-15]. A possible mechanism is that negative emotional expressions and hyperactivity hinder attentional processes and overburden the cognitive resources needed for word learning [16]. The literature shows that cognitive and emotional processing interact with each other and jointly influence behavior patterns [17-19], so it is reasonable to consider cognitive, emotional, and behavioral factors together when exploring the factors that influence written vocabulary size.
In addition, we focused on gender differences in the correlates of Chinese written vocabulary size. There are significant differences in cognitive abilities between boys and girls [20]. Boys have an advantage in visuospatial information processing [21], while girls perform better in language comprehension and expression [22]. Boys and girls also tend to have different emotional and behavioral problems in primary school: girls tend to show emotional and peer relationship problems, while boys are more likely to present with hyperactivity and conduct problems [23]. However, evidence on gender differences in the correlates of written vocabulary size is scarce, especially for the Chinese writing system.
To summarize, little empirical evidence exists on the multidimensional factors related to Chinese written vocabulary size. Here, we aimed to explore (1) which cognitive, emotional, and behavioral factors mainly influence Chinese written vocabulary size after controlling for demographic information, and (2) whether the influencing factors show different patterns in boys and girls. In our study, cognitive factors included visual word recognition, auditory word recognition, meaning comprehension, spelling, oral language, written expression, and attention; emotional factors mainly included anxious and depressive emotional symptoms; behavioral factors included conduct problems, hyperactivity/inattention, peer relationship problems, and prosocial behavior. Answering these two questions can provide a comprehensive reference for improving Chinese written vocabulary instruction for children in primary schools and for children with dyslexia in clinics.

Study Design
This study was conducted from October 2016 to March 2017 in Guangzhou, Guangdong Province, China. Using stratified cluster sampling, we selected five public primary schools in five districts of Guangzhou and investigated all students in Grades 2 to 5 of those schools. A total of 2057 children from Grades 2 to 5 were enrolled. All were native Chinese speakers who had learned English as a second language since entering primary school. Of the questionnaires distributed, 2026 were returned, a recovery rate of 98.5%. In total, 864 cases were excluded because of missing data on the main assessments. The final analysis included 1162 children (606 boys and 556 girls). The investigators were experienced graduate students who had received professional training on the survey. None of the subjects reported intellectual disability, autism spectrum disorder, or attention deficit hyperactivity disorder.
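The sample flow reported above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts stated in this section; the variable names are illustrative and not taken from the study.

```python
# Counts as reported in the Study Design paragraph.
enrolled = 2057          # children enrolled
returned = 2026          # questionnaires returned
excluded_missing = 864   # excluded for missing data on the main assessments
boys, girls = 606, 556   # composition of the analyzed sample

recovery_rate = returned / enrolled
analyzed = returned - excluded_missing
sex_ratio = boys / girls

print(f"recovery rate: {recovery_rate:.1%}")
print(f"analyzed sample: {analyzed}")
print(f"sex ratio (boys:girls): {sex_ratio:.2f}:1")
```

The derived values agree with the reported recovery rate of 98.5%, the analyzed sample of 1162 children, and the sex ratio of 1.09:1 given with Table 1.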
This study was supported by the Key Realm R&D Program of Guangdong Province (grant number 2019B030335001), the Guangdong Basic and Applied Basic Research Foundation (grant number 2021A1515011757), and the National Natural Science Foundation of China (grant number 81673197). This study was approved by the Ethics Committee of the School of Public Health, Sun Yat-sen University (L2016-036). All parents of the children signed informed consent before their inclusion in our study.

Participant Demographics
A self-designed questionnaire was used to collect demographic information, including children's gender, date of birth, school, grade, only-child status, mode of delivery, parents' education level, and family income.

Chinese Written Vocabulary Size Assessment Scale for Primary School Children
The Chinese written vocabulary size assessment scale was compiled by Wang and Tao of East China Normal University [24]. This scale has been widely used to assess Chinese written vocabulary size in Grades 2 to 5 [25]. The test paper contained 10 sets of questions, and each set had 6 to 33 Chinese characters. A delay in written vocabulary size was defined as a written vocabulary size at least 1.5 standard deviations (SD) below the average level of the child's actual grade. This standardized test had a reliability and validity of 0.98.
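The delay criterion above can be written as a simple threshold rule. This is a minimal sketch, assuming the grade mean and SD are supplied as inputs (the source does not state whether they come from scale norms or from the sample); the function name and the numbers in the example are hypothetical.

```python
def is_delayed(score: float, grade_mean: float, grade_sd: float, k: float = 1.5) -> bool:
    """Return True if the score is at least k SDs below the grade mean."""
    cutoff = grade_mean - k * grade_sd
    return score <= cutoff

# Hypothetical grade-level values, for illustration only.
grade_mean, grade_sd = 2000.0, 400.0   # cutoff = 2000 - 1.5 * 400 = 1400

for score in (1350.0, 1400.0, 1500.0):
    print(score, is_delayed(score, grade_mean, grade_sd))
```

With these illustrative values, scores of 1350 and 1400 are flagged as delayed and a score of 1500 is not, because the cutoff itself counts as "at least 1.5 SD below".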

The Dyslexia Checklist for Chinese Children (DCCC)
The DCCC, established by Wu in 2006, was used to assess the cognitive abilities of Chinese students in Grades 2 to 5 [26]. It contains 57 items that yield 8 factors: the deficit of visual word recognition, the deficit of auditory word recognition, the deficit of meaning comprehension, the deficit of spelling, the deficit of oral language, the deficit of written expression, bad reading habits, and the deficit of attention. In this study, the seven factors other than bad reading habits were used to evaluate children's cognitive abilities. A higher score on a factor indicates more serious difficulties in that cognitive ability. The test-retest reliability of the DCCC was 0.734, and the internal consistency of all subscales was above 0.752 [27].
The meaning of each factor is as follows:
(1) The deficit of visual word recognition mainly refers to children's difficulties in the visual processing of Chinese characters.
(2) The deficit of auditory word recognition mainly refers to children's difficulties in the phonological processing of Chinese characters.
(3) The deficit of meaning comprehension mainly refers to children's difficulties in semantic access and processing at different levels, including characters, words, sentences, paragraphs, and texts.
(4) The deficit of spelling mainly refers to children's poor fluency and legibility in writing.
(5) The deficit of oral language mainly refers to children's difficulties in oral comprehension and oral expression.
(6) The deficit of written expression mainly refers to children's difficulties in using and producing written language, reflecting children's comprehensive obstacles in meaning processing.
(7) Bad reading habits mainly include reading the same line repeatedly, skipping characters, losing characters, adding characters, and making a sound while reading.
(8) The deficit of attention mainly refers to children's difficulties in attention and concentration.

The Chinese Version of the Strengths and Difficulties Questionnaire Rated by the Parent (SDQ)
The SDQ was designed to identify children's emotional and behavioral problems [28]. It contains 25 items divided into 5 subscales: emotional symptoms, conduct problems, hyperactivity/inattention, peer relationship problems, and prosocial behavior. Higher scores on the four difficulty subscales indicate more serious emotional and behavioral problems, whereas for prosocial behavior a lower score indicates greater difficulties. In Chinese children, the retest stability was 0.564 to 0.772 and the content validity was 0.482 to 0.774 [29].

Statistical Analysis
Data were entered using EpiData 3.0 and analyzed using SPSS 23.0. Descriptive statistics were used to present participants' demographic characteristics, and the chi-square test was used to assess differences between boys and girls. Two-sample t-tests, chi-square tests, one-way ANOVA, and Pearson correlation analysis were used for univariate analysis. Multiple stepwise linear and logistic regression analyses were used to explore factors related to Chinese written vocabulary size and to a delay in written vocabulary size, respectively. Participant demographics were entered in the first regression model (Model 1). Model 2 comprised Model 1 plus the four difficulties and one strength assessed using the SDQ. Finally, Model 3 comprised Model 2 plus the seven factors of the DCCC. We then performed a subgroup analysis stratified by gender. A p < 0.05 was considered statistically significant.
Table 1 shows the demographic information of the students in Grades 2 to 5 of the five public primary schools. The average age was 9.19 ± 1.15 years and the sex ratio was 1.09:1 (boys:girls). Most parents had an education level of senior middle school or above. The per capita annual income of most pupils' families was 3000-15,000 RMB. No statistical difference in demographic information was detected between boys and girls.
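The block structure of Model 1 to Model 3 defined in the Statistical Analysis can be laid out as cumulative predictor sets. This is a minimal sketch of the nesting only: the analysis itself was run in SPSS 23.0 with stepwise selection, which is not reproduced here, and every column name below is a hypothetical label for a construct named in the text.

```python
# Hypothetical column names; the source names the constructs, not the variables.
DEMOGRAPHICS = [
    "gender", "age", "school", "grade", "only_child",
    "delivery_mode", "father_education", "mother_education", "family_income",
]
SDQ_SUBSCALES = [  # four difficulties and one strength
    "emotional_symptoms", "conduct_problems", "hyperactivity_inattention",
    "peer_relationship_problems", "prosocial_behavior",
]
DCCC_FACTORS = [  # seven factors, excluding bad reading habits
    "visual_word_recognition", "auditory_word_recognition",
    "meaning_comprehension", "spelling", "oral_language",
    "written_expression", "attention",
]

def build_nested_models(blocks):
    """Return cumulative candidate-predictor lists, one per block, in entry order."""
    models = {}
    predictors = []
    for i, block in enumerate(blocks, start=1):
        predictors = predictors + block   # new list each step, so earlier models are unchanged
        models[f"Model {i}"] = predictors
    return models

models = build_nested_models([DEMOGRAPHICS, SDQ_SUBSCALES, DCCC_FACTORS])
for name, predictors in models.items():
    print(name, len(predictors), "candidate predictors")
```

Laying the models out this way makes the comparison explicit: any change in the SDQ coefficients between Model 2 and Model 3 reflects the addition of the seven DCCC factors to an otherwise identical predictor set.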

Univariate Analysis for Related Factors on Chinese Written Vocabulary Size
About 7.3% of the participating children exhibited a delay in written vocabulary size, and the proportion was significantly higher among boys than among girls. The scores of conduct problems, hyperactivity/inattention, and the seven factors of the DCCC were significantly negatively correlated with Chinese written vocabulary size. Compared with children with normal written vocabulary size, children with a delay in written vocabulary size had higher scores in hyperactivity/inattention, peer relationship problems, and the seven factors of the DCCC. The results are shown in Table 2.

Multiple Linear and Logistic Regression Analysis for Related Factors of Chinese Written Vocabulary Size
After controlling for demographic information (Model 2), the scores of hyperactivity/inattention and peer relationship problems were negatively related to Chinese written vocabulary size. Model 3 of Chinese written vocabulary size showed that the scores of the deficit of visual word recognition (β = −3.32, 95% CI: −5.98, −0.66) and the deficit of meaning comprehension (β = −6.52, 95% CI: −9.39, −3.64) had a significant negative association with Chinese written vocabulary size. However, the emotional and behavioral factors were no longer statistically significant in Model 3.
Model 1 of a delay in written vocabulary size showed that boys and children born by cesarean delivery were more likely to have a delay in written vocabulary size. In Model 2, children with more serious hyperactivity/inattention (OR = 1.18, 95% CI: 1.07, 1.30) and peer relationship problems (OR = 1.15, 95% CI: 1.00, 1.32) were more likely to have a delay in written vocabulary size after controlling for demographic information. When cognitive abilities were further considered, the deficit of visual word recognition (OR = 1.04, 95% CI: 1.02, 1.07) was related to a delay in written vocabulary size, and the emotional and behavioral factors had no statistical significance in Model 3.
The results of the analysis stratified by gender are shown in Table 4. In Model 3, which further considers the seven factors of the DCCC, the score of the deficit of meaning comprehension (β = −8.96, 95% CI: −12.27, −5.65) was closely related to boys' Chinese written vocabulary size, and the deficit of spelling (OR = 1.04, 95% CI: 1.01, 1.07) was correlated with boys' delay in written vocabulary size. The score of the deficit of auditory word recognition (β = −8.86, 95% CI: −12.44, −5.29) was associated with girls' Chinese written vocabulary size, and the deficit of visual word recognition (OR = 1.07, 95% CI: 1.02, 1.12) was correlated with girls' delay in written vocabulary size in Model 3.
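To help read the reported estimates, the sketch below back-calculates a standard error from a 95% CI and rescales an odds ratio to a larger score difference. It assumes the intervals are symmetric Wald intervals (estimate ± 1.96 × SE) and that each OR refers to a one-point increase in the scale score; neither assumption is stated in the source, and the helper names are illustrative.

```python
Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Standard error implied by a symmetric Wald confidence interval."""
    return (upper - lower) / (2 * z)

def or_for_increase(or_per_point: float, points: float) -> float:
    """Odds ratio for a `points`-unit increase, assuming a log-linear effect."""
    return or_per_point ** points

# Visual word recognition deficit in Model 3: beta = -3.32, 95% CI (-5.98, -0.66).
se_visual = se_from_ci(-5.98, -0.66)
z_visual = -3.32 / se_visual
print(f"implied SE: {se_visual:.3f}, implied z: {z_visual:.2f}")

# OR = 1.04 per point for a delay in written vocabulary size.
or_10_points = or_for_increase(1.04, 10)
print(f"OR for a 10-point higher deficit score: {or_10_points:.2f}")
```

Under these assumptions, a per-point OR of 1.04 corresponds to roughly 1.48 times the odds of a delay for a child scoring 10 points higher on the visual word recognition deficit factor, which gives a more concrete sense of the effect than the per-point value alone.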

Discussion
To our knowledge, this is the first multivariate study to assess the relationships between Chinese written vocabulary size and cognitive, emotional, and behavioral factors in primary school students. The main findings were as follows. First, when cognitive, emotional, and behavioral factors were considered at the same time, only cognitive factors, namely visual word recognition and meaning comprehension, had a significant correlation with Chinese written vocabulary size in primary school students. Second, the factors related to Chinese written vocabulary size differed significantly between boys and girls.

The Correlation between Chinese Written Vocabulary Size and Cognitive, Emotional, and Behavioral Factors
This study indicated that, after controlling for demographic characteristics, the scores of hyperactivity/inattention and peer relationship problems had significant correlations with Chinese written vocabulary size. In western countries, studies have shown that hyperactivity/inattention symptoms during early and middle childhood can predict written vocabulary size at age 12 years [16,30]. In China, hyperactivity/inattention symptoms have also been found to be negatively associated with written vocabulary size in Grades 2 to 5 [15]. Learning Chinese characters requires strong behavioral self-regulation and attention control [31]. Children with more serious hyperactivity/inattention are prone to distraction by irrelevant stimuli, which may affect the quality of memory and lead to a delay in written vocabulary size [32]. In line with previous literature, Chinese written vocabulary size also had a strong negative relationship with peer relationship problems [15]. In our study, compared with children who had negative relationships with their peers, children with positive peer relationships appeared to have a larger written vocabulary size and were less likely to have a delay in written vocabulary size. Good peer relationships can provide a source of companionship and emotional support for school-age children [33], which is conducive to the development of Chinese written vocabulary size. However, when cognitive abilities were considered at the same time, the significant correlations of the hyperactivity/inattention and peer relationship problems scores with Chinese written vocabulary size disappeared. The results indicated that cognitive factors were the main factors affecting Chinese written vocabulary size, while emotional and behavioral problems were not.
The literature shows that cognitive and emotional processing interact with each other and jointly influence behavior patterns [17-19], so we speculated that cognitive factors are the most important factors directly affecting Chinese written vocabulary size, while emotional and behavioral factors influence it indirectly.
The most important finding in the current study was that visual word recognition and meaning comprehension had a significant association with Chinese written vocabulary size when controlling for demographic characteristics and emotional and behavioral problems. In the DCCC, visual word recognition mainly refers to children's difficulties in the visual processing of Chinese characters. Previous studies have suggested that auditory word recognition is the core correlate of written vocabulary size in alphabetic languages [34,35]. However, unlike words in alphabetic languages, a Chinese character is a two-dimensional visual processing unit composed of strokes and has a more complex visual-spatial structure [36]. A few studies have indicated that, independent of phonological and morpheme skills, visual processing has an important influence on Chinese written vocabulary size, which supports our results [37]. Furthermore, we found that children with a deficit in visual word recognition were more likely to have a delay in Chinese written vocabulary size. Previous studies indicated that a deficit in children's visual skills may lead to insufficient processing of Chinese characters [38], and that visual skills can be used to distinguish between children with and without developmental dyslexia in Chinese [39].
In the DCCC, meaning comprehension mainly refers to children's difficulties in semantic access and processing at different levels, including characters, words, sentences, paragraphs, and texts. We found that children with a more serious meaning comprehension deficit were more likely to have a lower Chinese written vocabulary size. Consistent with our results, some studies have indicated that insufficient knowledge of word meaning is a crucial barrier to the growth of Chinese written vocabulary size [40,41]. Meanwhile, previous studies demonstrated that Chinese written vocabulary size also makes an important contribution to meaning comprehension performance, because children need to recognize a large number of words to read fluently [40,42]. It seems that children's meaning comprehension has a bidirectional association with Chinese written vocabulary size, which suggests that rich and direct instruction in meaning comprehension may be an effective approach to improving Chinese written vocabulary size in primary school students.

The Gender Difference Correlates between Chinese Written Vocabulary Size and Cognitive, Emotional, and Behavioral Factors
Another important finding of our study is that the cognitive, emotional, and behavioral correlates of Chinese written vocabulary size differ significantly by gender.
For boys, the cognitive factors related to Chinese written vocabulary size and to a delay in written vocabulary size were meaning comprehension and spelling, respectively. First, there is a notable gender difference in meaning comprehension [43]. Compared with girls, boys typically rely more on analysis and logical reasoning for cognitive processing, and prefer to guess the meaning of single characters from the context [44,45]. The ability of meaning comprehension is the basis for recognizing the form of Chinese characters [46]. Boys with a deficit in meaning comprehension were more likely to have a lower Chinese written vocabulary size. Second, in our study the deficit of spelling mainly refers to children's poor fluency and legibility in writing, which reflects a deficit in the encoding process of literacy. A previous study also showed that boys performed worse than girls at all grade levels on spelling tests, which suggests that boys may be more sensitive to the degree of a deficit in spelling ability [47]. In addition, Chinese characters can be learned through repeated copying [48]. Poor spelling ability in boys may therefore seriously affect their Chinese character recognition and lead to a delay in written vocabulary size. Thus, we speculate that deficits in meaning comprehension and spelling in boys may contribute negatively to the cognitive processing of Chinese characters and thereby relate to Chinese written vocabulary size.
For girls, in contrast, the cognitive factor related to Chinese written vocabulary size was auditory word recognition. The literature showed that phonological memory was a significant predictor of the ability to read and write Chinese characters [49]. Moreover, girls generally scored higher than boys on phonological memory [50]. We speculate that girls may rely more on phonological processing abilities, such as phonological memory, to recognize Chinese characters. Thus, girls are more likely to have a lower Chinese written vocabulary size when their auditory word recognition is poor. In addition, we found that a delay in written vocabulary size was significantly correlated with the degree of a deficit in visual word recognition among girls. Chinese character recognition requires not only auditory processing ability but also visual processing ability [50]. From our results, we infer that the cognition of Chinese characters is more dependent on visual processing ability, and that girls who lag in visual processing ability are more likely to have a delay in written vocabulary size even if their auditory processing ability is normal. Future work should develop gender-specific methods for written vocabulary instruction in both schools and clinics.

The Correlation between Chinese Written Vocabulary Size and Parental Education Level
Interestingly, we found that the father's education level was positively associated with children's written vocabulary size. In contrast to our results, most studies have reported a significant association between the mother's education level and children's vocabulary development [51,52]. However, recent studies indicated that children whose fathers have a higher education level also have a larger written vocabulary size [41,53].
Further stratified analysis showed that the father's education level was associated with the written vocabulary size of boys only, and not with that of girls. The reasons remain unclear. We speculated that, first, this gender difference may reflect a Y-linked inheritance pattern of Chinese written vocabulary size, although the specific genetic pathway remains uncertain. The literature also indicated that the Y chromosome has a significant effect on learning performance by affecting multiple cognitive abilities, such as visuospatial abilities, which play an important role in the development of Chinese written vocabulary [54,55]. Second, previous studies suggested that fathers represent male behavior patterns in their children's lives and are the main role models for boys' role identification [56-58]. Boys were more likely to regard their fathers as role models and to imitate their fathers' behavior and vocabulary [59-61]. Moreover, McBride-Chang et al. followed 22 Chinese children from the beginning of kindergarten to Grade 1 and found that the mediation of maternal guidance explained only children's Chinese character reading, not their Chinese character writing [62].

The Pedagogical and Therapy Implication
The pedagogical implication of this study is that it reveals the importance of specific cognitive abilities for the growth of Chinese written vocabulary size. Teachers should pay more attention to visual processing and meaning comprehension when formulating children's education strategies. Meanwhile, because boys and girls tend to use different cognitive abilities when learning Chinese characters, personalized and targeted teaching plans should be developed for each gender in the future. The present study also has a therapy implication for clinicians. Children with a delay in written vocabulary size generally have deficits in multiple cognitive skills, alongside emotional and behavioral problems. To reduce children's difficulties in recognizing Chinese characters, clinicians should first focus on children's cognitive deficits rather than on emotional and behavioral problems.

Limitations
First, our study was cross-sectional and therefore could not establish causal relationships between the related factors and Chinese written vocabulary size. Prospective studies are needed to confirm our results. Second, cognitive abilities were assessed with a parent-completed questionnaire rather than a standardized neuropsychological assessment, which may affect the accuracy of the results. However, the DCCC has good reliability and validity [27], and it allows cognitive abilities to be evaluated quickly and effectively while saving manpower and material resources.

Conclusions
In summary, when cognitive, emotional, and behavioral factors are considered at the same time, Chinese written vocabulary size in primary school students is significantly correlated with cognitive factors, mainly visual word recognition and meaning comprehension, but not with emotional and behavioral problems. Moreover, the factors related to Chinese written vocabulary size differed between boys and girls. Boys are more dependent on meaning comprehension and spelling, while girls are more dependent on auditory and visual word recognition.
Author Contributions: N.P. is the primary writer of the manuscript. X.L. designed the study and helped to revise the manuscript. N.P. and X.L. performed the statistical analysis. Y.G. was responsible for communicating with the schools involved in the survey. J.M., X.F., Z.Y., X.X., L.C., and Y.Z. participated in administering the questionnaire. All authors read and approved the final manuscript.
Institutional Review Board Statement: This study was approved by the Ethics Committee of the School of Public Health, Sun Yat-sen University (L2016-036). All research was performed in accordance with relevant guidelines and regulations. All the procedures were approved by the Ethics Committee.
Informed Consent Statement: All parents of the children supported the tests and signed informed consent.
Data Availability Statement: We will not share our raw data in the manuscript, because the raw data of this article are being used for another unpublished article and the project group has not published the data in whole elsewhere.