Learning Chinese Characters of Visual Similarity: The Effects of Presentation Style and Color Coding

Li, Junmin; Shi, Mengya; Wang, Xin

doi:10.3390/languages10100260

by Junmin Li ¹, Mengya Shi ² and Xin Wang ³,⁴,*

¹ School of Foreign Languages, Hangzhou City University, Hangzhou 310015, China

² School of Humanities and Social Sciences, Beihang University, Beijing 100191, China

³ Department of Linguistics, Macquarie University, Sydney, NSW 2109, Australia

⁴ Lifespan Health and Wellbeing Center, Macquarie University, Sydney, NSW 2109, Australia

* Author to whom correspondence should be addressed.

Languages 2025, 10(10), 260; https://doi.org/10.3390/languages10100260

Submission received: 18 July 2025 / Revised: 9 September 2025 / Accepted: 1 October 2025 / Published: 11 October 2025

Abstract

This study examined how beginners benefit from ‘salience’ in learning two types of visually similar Chinese characters: those with identical strokes (e.g., 人 and 入) and those differing by an additional stroke (e.g., 日 and 白), while identifying the role of color coding and presentation style. A total of 183 native speakers of non-tonal languages with no prior experience of Chinese characters participated in the study. In a 2 × 2 × 2 experimental design, the study assessed the influence of color coding (with vs. without), presentation style (single vs. paired characters), and stroke similarity (identical vs. different) on learning. Results showed that (1) characters with stroke differences were learned more easily than identical-stroke characters; (2) simultaneous character presentation enhanced discrimination of subtle stroke differences; but (3) color coding slowed down reaction times, suggesting visual overload. These findings demonstrate that perceptual similarity, not just complexity, impacts character learning difficulty. Pedagogically, the results support using paired character presentation while cautioning against excessive visual enhancements. The study provides empirical evidence for optimizing Chinese character instruction by balancing discriminability and cognitive load in beginning learners.

Keywords: stroke count; Chinese character learning; pattern complexity; presentation style; color cue

1. Introduction

Learning Chinese characters poses significant challenges for non-native beginners, especially those from alphabetic language backgrounds (Hu, 2010; Shen, 2004), because Chinese characters use a logographic system where each character conveys meaning through visual elements rather than phonetics (Shen, 2013). Thus, strong visual-spatial skills are vital for learning Chinese characters, as the ability to distinguish subtle visual differences is essential due to the large number of similar-looking radicals and characters in Chinese orthography (e.g., Chen et al., 1996; McBride-Chang et al., 2005). In this study, we aim to explore to what extent initial L2 character learning can be enhanced by manipulating perceptual salience and constructed salience (e.g., Gass et al., 2018) to direct learners’ attention to critical visual features of characters, thereby reducing cognitive load or enhancing memory retention. Here, we focus on three visual factors that can be manipulated in learning: color coding, stroke structure, and presentation style.

In the field of second language acquisition (SLA), salience is a critical factor in learning, referring to a property that makes a linguistic feature more perceptually prominent. As Gass et al. (2018) delineate, salience can be either an inherent characteristic of the input (perceptual salience) or intentionally induced through instruction (constructed salience). Perceptual salience arises from the innate features of the language itself, such as stress or syllabicity, which naturally attract learners’ attention. In contrast, constructed salience occurs when external agents, such as instructors or materials, enhance the prominence of a feature through deliberate means. This is frequently achieved in instructed learning settings via interactional feedback or techniques. The influence of both perceptual and constructed salience extends across various domains of SLA.

A substantial body of research has examined their role in vocabulary acquisition (e.g., Ko, 2012; Elgort & Warren, 2014). Here, in the framework of salience in SLA, we investigate the role of both language-specific properties (perceptual salience) and learning materials (constructed salience) in learning Chinese characters. We leverage the vast number of Chinese characters that share visual similarity (i.e., Chinese orthographic neighbors), as they could be hard to differentiate among themselves and hard to learn (e.g., Ellis, 2016). Regarding the external agents to assist learning, we leverage color coding and presentation style, which are typically adopted in modifying learning materials and instructions (e.g., Tortorelli et al., 2021; Vlach et al., 2022). In the following sections, we first introduce the basic properties of Chinese characters, followed by a thorough review of the three visual factors that can contribute to the ‘salience’ in learning.

1.1. Basic Properties of Chinese Characters

The orthographic structure of Chinese characters can be analyzed hierarchically into its components: radicals and strokes. Radicals often serve as semantic or phonetic indicators. For example, the semantic radical 氵 (three dots/strokes to indicate water) is found in characters related to water, like 河 (river) and 海 (sea). Likewise, the phonetic radical 青 (pronounced as qing1) in 清 (qing1), 情 (qing2), 请 (qing3), etc., produces a similar pronunciation. Radicals are essential for organizing and understanding Chinese writing, as they help group characters with similar meanings or pronunciations. Strokes are the basic components of Chinese radicals and characters. A stroke in Chinese writing is created with a single, continuous motion, often involving changes in direction (Anderson et al., 2013). Shu et al. (2003) analyzed 2570 characters taught in Chinese elementary schools and identified eight types of strokes that vary in length and spatial orientation. Strokes can take various forms: vertical, horizontal, or diagonal; straight or slightly curved; and some may include a “hook” (Taylor & Taylor, 1995). For instance, the simple character 木 (wood) contains four types of strokes: horizontal (一), vertical (丨), left-slanting (丿), and right-slanting (㇏). Additionally, shorter stroke segments, such as dots, can vary in direction, as seen in the character 兴 (prosperous), as compared to the character 河 (river).

Characters are categorized into two types based on their structures: integral and compound characters. Integral characters can be formed as simply as one stroke (一) or as complex as multiple strokes (束). One dimension of the visual complexity of Chinese characters is the relative positions of different strokes. For example, the same two strokes can create different characters by varying the stroke positions. For instance, both 人 (person) and 入 (enter) are composed of two strokes: 丿 and ㇏, but their differing arrangements result in distinct characters.

Similarly, in compound characters, the position of sub-lexical orthographic units (radicals) can change to form entirely different characters (J. Zhang, 1992, pp. 31–32). For example, 杏 (apricot) and 呆 (dumbness) both include 木 (wood) and 口 (mouth) arranged in a top-down pattern. When the positions of these radicals are reversed, a completely different character is formed. In another example, 晾 (to air/sun-dry) and 景 (scenery) both contain the radicals 日 (sun) and 京 (capital). However, the structural arrangement differs: 晾 follows a left-right pattern, while 景 adopts a top-down structure.

The phenomenon where Chinese characters share the same strokes but differ in stroke structure is intriguing, especially when considering how L2 beginners of alphabetic language backgrounds learn and identify these visual complexities to eventually map to the meanings. Previous studies on visual complexity typically focus on stroke counts, assuming that more strokes equate to greater visual complexity (e.g., Shen, 2013; Yu et al., 2018) when considering L2 learning of Chinese characters. Yet, learning characters not only involves learning the individual strokes but also identifying characters that look similar but differ in stroke structures. The latter is probably harder to learn, in particular, at the early stage of learning, given the subtlety. To our knowledge, no study has yet explored whether characters of the same stroke counts but presenting different stroke structures pose additional difficulties for Chinese L2 learners. Thus, the aim of the study is to examine how visual similarity (i.e., stroke structure) influences the early stage of L2 Chinese character learning. We focused on integral characters because, as compared to compound characters, they contain fewer strokes and simpler stroke patterns, normally defined as “spatially contiguous patterns of strokes” (Anderson et al., 2013, p. 45).

1.2. Literature Review of the Three Visual Factors

The concept of salience in character learning manifests in two forms. Perceptual salience denotes the inherent visual cues that signal the complex similarities among Chinese characters (Section 1.2.1). Constructed salience, on the other hand, is developed through instructional aids such as color coding (Section 1.2.2) or presentation styles (Section 1.2.3).

1.2.1. Visual Similarity in Chinese Characters

A basic examination of the Chinese writing system quickly highlights its stronger emphasis on visual-spatial processing compared to alphabetic orthographies that bear certain phonological relationships with words. That is, the alphabet usually indicates how words sound to a large or small extent, depending on the orthographic depth of the writing system. This is usually referred to as the Grapheme-Phoneme-Correspondence (GPC) rule. Thus, unlike alphabetic systems, which rely on a segmental and “atomistic” structure (Ho & Bryant, 1999), Chinese lacks this rule. Although many Chinese characters (e.g., phono-semantic compounds) contain phonetic radicals that offer clues to pronunciation, the relationship between orthography and phonology in Chinese is generally less consistent and transparent than in alphabetic writing systems. According to Jin (2006), learning Chinese characters is a more visual than auditory process. The best way to remember them is by focusing on their overall shape and radicals, not by memorizing stroke order or pronunciation. This is due to the unique nature of Chinese characters. Unlike the linear sequence of letters in alphabetic languages, each character is a dense, square-shaped image filled with intricate details—similar to a complex icon or a piece of fine print. This “high spatial frequency” means that learning Chinese is like building a visual library, requiring strong spatial memory to distinguish between characters with many small, detailed strokes (e.g., M. Wang et al., 2003). The development of sensitivity to the internal structure of Chinese characters is a gradual process. Gao and Meng (2000) found that beginning learners were significantly less accurate than advanced learners at distinguishing between visually similar characters (e.g., 会 vs. 公, 休 vs. 体) in a reading context. 
Beginners often struggle to notice subtle differences between similarly shaped characters and are not yet familiar with perceiving the details of stroke-based writing.

The literature contains various measures of visual complexity for Chinese characters, which aim to accurately reflect their complexity. These measures include stroke frequency (Majaj et al., 2002; J. Y. Zhang et al., 2007) and the “ink” measure (Pelli et al., 2006), defined as the ratio of black pixels to total pixels from a binary image of the character. Among these, visual complexity is typically measured by the number of strokes (Liversedge et al., 2014; Yu et al., 2018). However, some Chinese characters, such as 犬 (dog) and 太 (too), share the same strokes, yet their stroke patterns and spatial arrangements differ. This complexity extends beyond stroke count to include differences in radicals, spatial arrangements, and even subtle variations in individual strokes. In a very similar way, characters like 大 (big) and 犬 (dog), where the latter is formed by adding a single stroke to the former, are normally considered visually similar characters. These conditions highlight the intricate nature of Chinese orthography and, thus, raise an intriguing question: when comparing characters like 大 (big) vs. 犬 (dog), and 犬 (dog) vs. 太 (too), where the former pair differs by the addition of a stroke to the first character and the latter pair consists of two characters with an identical number of strokes but different patterns, which pair can be considered more visually complex or more difficult to learn?

Research on Chinese visual word recognition, particularly through the stroke number effect, remains limited. Jiang et al. (2020) conducted a study comparing native Chinese speakers (NS) and Chinese as a second language (CSL) learners using a lexical decision task involving disyllabic compound words (each containing two characters) and an equal number of nonwords. The results revealed that while both groups were influenced by word frequency and familiarity, only the CSL group demonstrated a stroke number effect. In a subsequent study, Jiang and Feng (2022) used single Chinese characters as stimuli and found that CSL learners again exhibited a stroke number effect in a lexical decision task. They argued that adult L2 Chinese learners process Chinese characters analytically, much like beginning readers, focusing on individual strokes or radicals rather than recognizing the overall character structure (Yeh et al., 2003).

If the number of strokes plays a key role in L2 character learning, we hypothesize that visually similar character pairs with an additional stroke will be more challenging to learn. On the other hand, character pairs that differ by an additional stroke are easier to differentiate due to the visual detectability of the added mark. However, perceiving the subtler structural patterns that distinguish characters with the same number of strokes is a unique challenge in Chinese character identification. Difficulties arise when a feature exists in the L2 but is absent in learners’ L1 (e.g., Franceschina, 2005; Hawkins & Liszka, 2003). Therefore, learners may struggle to differentiate character pairs with identical strokes. However, there is currently no clear evidence indicating whether integral characters with the same number of strokes are equally difficult or harder to process compared to those with an additional stroke.

1.2.2. Color Coding in Character Learning

Color coding has been shown to facilitate word learning by enhancing morphological awareness and supporting the decoding of linguistic structures (e.g., Tortorelli et al., 2021; Songsangkaew et al., 2023). For example, Tortorelli et al. (2021) reviewed instructional approaches to teaching code-related skills in English reading and highlighted the effectiveness of strategies such as separating words into phonemes, syllables, and morphemes, as well as teaching affixes explicitly. These methods of integrating colors into linguistic structures are instrumental in helping learners navigate the complexities of an alphabetic language. Supporting this, Songsangkaew et al. (2023) demonstrated that color-coding derivational and inflectional morphemes significantly improved morphological awareness, lexical inferencing ability, and reading comprehension among non-English major undergraduates. By visually distinguishing morphemes, color coding may provide learners with a clearer understanding of character structures, aiding the acquisition of both vocabulary and reading skills.

If visual support, like color coding, can benefit learning alphabetical languages, as evidenced in various studies, this strategy might also be helpful in learning Chinese characters, which rely more on visual processing given their complex stroke structures, though relevant research is limited and far from being conclusive. To start, some researchers consider lower-level structures of characters as a unit to learn, i.e., radicals, which facilitate L2 character learning by providing integrated structural features, unlike the complexity and variability of individual strokes (e.g., Huang & Chen, 1988). Radicals help to organize stroke-level information (Taft & Chung, 1999) and draw learners’ attention to the internal structure of characters to some extent (Cao et al., 2013). For example, Taft and Chung (1999) conducted a study in which participants learned 24 Chinese characters composed of two radicals paired with meaning (e.g., 咀-CHEW) under four different radical presentation conditions to see whether understanding the internal structure of Chinese characters aids novice learners in memorization. Participants were exposed to the character-meaning pairs three times and then presented with each character alone to recall its associated meaning. In a between-subject design, four groups received different types of radical instructions: Radicals Before (radicals pre-learned before exposure), Radicals Early (radicals informed during the initial exposure), Radicals Late (radicals identified during the third exposure), and No Radical (no instruction on radicals). The Radicals Early group performed best, suggesting that introducing radicals at the first exposure enhances learning.

If radical learning serves as the foundation of character learning, as indicated by previous studies (Taft & Chung, 1999), strategies like color-coding to mark the boundaries of radicals within a character might be facilitatory in L2 character learning as well. However, existing empirical evidence does not seem to support this rationale. Among the limited studies investigating whether marking radicals with different colors—a common teaching technique—supports L2 orthographic and semantic learning, Hou and Jiang (2022) explored its impact on Chinese L2 learners of alphabetic language backgrounds. In a Latin square design, participants were required to learn characters presented across four distinct conditions: (a) radical markings with stroke animations, (b) no radical markings with stroke animations, (c) radical markings without stroke animations, and (d) neither radical markings nor stroke animations. They were shown 40 high-frequency fillers and 120 low-frequency target characters, with each character displayed for 1000 milliseconds. In the study, compound characters were divided into two radicals, either left and right or top and bottom, with the radicals marked in red and blue, respectively. For instance, in the character “苛,” the radical “艹” on top was marked in red, while the radical “可” on the bottom was marked in blue. After learning, participants completed character recognition and meaning-matching tests. The results showed that marking radicals increased participants’ reaction times and reduced their accuracy in character recognition tests. Similarly, stroke-order animations also negatively affected recognition accuracy. These findings suggest that providing radical and stroke information may interfere, rather than facilitate character learning, as excessive visual information can increase cognitive load for L2 learners. 
Although strokes and radicals are key sub-character units in native Chinese speakers’ character processing (e.g., Peng & Wang, 1997; Li et al., 2005), their utility for L2 learners in developing Chinese orthographic representations and orthography-semantics connections remains controversial. Whether color-coding a single critical stroke in a character would change the learning outcome is an empirical question.

In the present study, instead of applying color to the entire character (radical), we focused on highlighting one or two specific strokes that distinguish paired characters from one another. Su and Samuels (2010) suggest that “beginning Chinese readers process characters in an analytic way”. To emphasize the critical differences between paired characters, we applied color coding to the distinguishing strokes. For example, in character pairs with the same number of strokes, such as 犬 and 太, the differentiating dot was highlighted in color to provide critical input. Similarly, for characters with a different number of strokes, such as 日 and 旦, the distinguishing stroke 一 was color-coded.

1.2.3. Presentation Style

Given the pivotal role of generalization in cognition, there has been substantial research exploring the factors that facilitate it. Among them, the timing of instance presentation is particularly important. Research findings reveal a paradox: both simultaneous presentation (presenting instances at the same time), which allows for direct comparison (e.g., Gentner et al., 2009; Oakes & Ribar, 2005), and spaced presentation (presenting instances apart in time), which separates instances over time (e.g., Cepeda et al., 2006), have been shown to enhance generalization.

According to structure mapping theory (Gentner, 1983), presenting exemplars at the same time draws attention to shared features, encouraging mental comparison and the identification of commonalities within a category. Researchers suggest that simultaneous presentations help learners focus on the similarities between exemplars while reducing the short-term memory load required to compare them. As all exemplars are visible during this process, learners do not need to rely on memory to recall previously seen examples, which reduces cognitive demands. In addition, identifying these shared features helps learners generalize to new exemplars within the category. In short, simultaneous presentations support generalization by directing visual attention to key similarities and easing memory requirements.

On the other hand, other researchers suggest that both simultaneous and spaced learning facilitate generalization. For example, Vlach et al. (2012) investigated the effects of simultaneous, massed, and spaced presentations on 2-year-old children’s performance in a novel noun generalization task. In this task, four new objects were introduced and labeled (e.g., “fep”) using simultaneous, massed, or spaced presentation methods in the learning phase. In the simultaneous condition, all exemplars were presented at once, allowing children to visually compare them. In the massed condition, exemplars were presented one at a time in immediate succession, with less than 1 s between presentations. In the spaced condition, 30 s elapsed between each presentation. Both massed and spaced presentations are forms of sequential presentation. In the test phase, four objects were shown, and children were asked to identify the target object (e.g., “Can you hand me the fep?”). For children in the immediate testing condition, the test was conducted right after the learning phase. For those in the delayed testing condition, the test took place 15 min later. Results showed that children in the simultaneous condition outperformed those in the other two conditions on immediate tests, while children in the spaced condition performed best after a delay (15 min). Overall, research suggests that visually comparing multiple exemplars presented simultaneously promotes better skills of abstraction, retention, and generalization compared to sequential (massed) presentations.

In their ongoing research program, Vlach et al. (2022) investigated how preschool-aged children learn science category exemplars under three schedules: simultaneous, massed, and spaced presentations. While no differences were observed between conditions during the immediate test, their first experiment showed that children in the spaced condition demonstrated the strongest generalization performance at a delayed post-test. However, further experiments revealed that spaced learning led to reduced visual attention and increased forgetting during the learning process.

Previous studies have investigated the effects of simultaneous, massed, and spaced learning on students’ performance in science education. The role of presentation style in literacy development, however, has not been extensively studied. Because spaced presentation has been found to reduce visual attention and increase forgetting (Vlach et al., 2022), the present study adopts simultaneous and massed presentation methods to explore their impact on character learning. It aims to address this gap by investigating how presenting characters in sequence or in pairs affects learners’ ability to acquire and retain Chinese characters.

In summary, the acquisition of Chinese characters presents a unique challenge to perceptual salience due to their intricate visual complexity. To overcome this, constructed salience can be introduced through instructional interventions such as color coding and manipulated presentation styles. This study aims to address the following questions:

(1) Do beginning Chinese learners find characters with identical strokes easier to learn than those differing by one stroke, or the other way around?
(2) Does the use of color coding to highlight the critical strokes that differentiate visually similar characters facilitate L2 character learning?
(3) Does simultaneous presentation (2 characters presented together) enhance L2 character learning more effectively than massed presentation (1 character presented individually)?

To address the above questions, we designed a study with four different training conditions, namely, “One Character, Color-Coded”, “One Character, No Color”, “Two Characters, Color-Coded”, and “Two Characters, No Color”, so that we could test how L2 learners learn Chinese characters in different contexts. L2 learners were expected to learn both the stroke structures and the meanings of characters. To investigate this, we employed a 2 × 2 × 2 experimental design with three independent variables: (1) Critical strokes cued with vs. without color, (2) Presentation with 1 vs. 2 characters, and (3) Characters with identical vs. different stroke numbers. Color Coding and Presentation Style were between-subjects factors, while Stroke Number was a within-subjects factor, as all learners were exposed to characters with both the same and different stroke numbers. Participants received the training first, followed by an immediate test to see how well they could recognize the meaning of the characters. The next day, a delayed test was conducted to measure how well they retained the characters.

2. Materials and Methods

2.1. Participants

The study recruited 202 undergraduate students aged 19 to 50 (Mean = 22.50, SD = 3.85) at Macquarie University. Nineteen participants who reported having some or extensive knowledge of Chinese characters were excluded, leaving 183 participants in the final analysis. The majority of participants were native English speakers with no prior knowledge of, or little exposure to, Chinese. Others had diverse language backgrounds, including Arabic and Vietnamese. Detailed participant information is provided in Table 1. They were randomly and equally assigned to one of the four experimental conditions.

2.2. Materials

The stimuli consisted of 48 Chinese characters (see Appendix A) selected for their visual simplicity and structural variability. These characters were grouped into 24 pairs: 12 pairs with identical strokes but different stroke structures (e.g., 人 and 入) and 12 pairs differing by only one stroke (e.g., 日 and 旦). Specifically, in the identical-stroke condition, characters like 人 and 入 share identical strokes but differ in their orientations, forming distinct characters. In the different-stroke condition, characters like 日 and 旦 differ by one single stroke. Each pair was presented alongside an English translation to facilitate learning and recognition. The average stroke number for the identical-stroke condition was 3.5 (SD = 0.9), while for the different-stroke condition, the average stroke numbers were 3.92 (SD = 0.79) and 4.92 (SD = 0.79) for the character with fewer strokes and the character with the additional stroke, respectively. There was no significant difference in stroke number between these two conditions (t = 0.001, p = 0.999).

To measure the visual complexity of Chinese characters, we compared the four metrics that H. Wang et al. (2014) proposed, namely, stroke count, ink density, stroke frequency, and perimetric complexity, among which perimetric complexity was the one that could best be applied to both alphabets and Chinese characters. This metric is defined as the squared perimeter of a symbol divided by its ‘ink’ area (Pelli et al., 2006). The width and height of the characters were the same, and each character was treated as an image with a fixed size. Each character was stored as a binary image, with the strokes in black on a white background. Ink density is defined as the ratio of the number of black pixels to the total number of pixels in a character image. An independent samples t-test was conducted to compare perimetric complexity between the identical-stroke and different-stroke conditions. There was no significant difference (t(46) = 0.39, p = 0.969) between the identical-stroke condition (M = 0.51, SD = 0.16, n = 24) and the different-stroke condition (M = 0.59, SD = 0.11, n = 24).
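The two image-based metrics described above can be illustrated with a minimal Python sketch. It follows the definitions given in the text (squared perimeter divided by ink area; ink pixels divided by total pixels), but the perimeter estimate used here, a count of exposed pixel edges, is an assumption of this sketch rather than the procedure used in the study, and all function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # 1 = ink (stroke pixel), 0 = background


def ink_area(img: Image) -> int:
    """Number of ink pixels in the binary image."""
    return sum(sum(row) for row in img)


def ink_density(img: Image) -> float:
    """Ratio of ink pixels to all pixels in the image."""
    total = len(img) * len(img[0])
    return ink_area(img) / total


def perimeter(img: Image) -> int:
    """Count ink-pixel edges that face background or the image border.

    Assumption: this edge count stands in for the symbol's perimeter.
    """
    h, w = len(img), len(img[0])
    edges = 0
    for r in range(h):
        for c in range(w):
            if not img[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = rr < 0 or rr >= h or cc < 0 or cc >= w
                if outside or not img[rr][cc]:
                    edges += 1
    return edges


def perimetric_complexity(img: Image) -> float:
    """Squared perimeter divided by ink area, as defined in the text."""
    return perimeter(img) ** 2 / ink_area(img)


# Toy 4 x 4 image with a 2 x 2 block of ink in the centre.
square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
```

For the toy image, the block has 4 ink pixels and 8 exposed edges, so the sketch yields an ink density of 0.25 and a perimetric complexity of 16. Real character images would need a finer perimeter estimate, so absolute values from this sketch are not comparable to those reported in the paper.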

2.3. Design and Procedure

At the initial stage of learning Chinese characters, a common approach is to introduce new characters alongside their pinyin (a pronunciation cue) and English equivalent. Research by Lee and Kalyuga (2011b) found that presenting pinyin with new words improved word retention for learners who had achieved automatic pinyin reading but provided little benefit for those who had not yet mastered the pinyin system. Additionally, the widely used horizontal layout for presenting pinyin can impose a high cognitive load and hinder character learning, particularly for beginners (Lee & Kalyuga, 2011a). To avoid potential confounding factors, the present study excluded pinyin and phonological information during the training phase.

This study employed a 2 × 2 × 2 design with two between-subject variables: presentation style (massed presentation vs. simultaneous presentation, i.e., one character vs. two characters) and color coding (with vs. without color cues). Stroke number (identical-stroke vs. different-stroke) was treated as a within-subject variable. Thus, in total, there were 4 training conditions.
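As an illustration of how the two between-subject factors cross into four training conditions, and of one way to assign participants randomly and in near-equal numbers, consider the following Python sketch. The assignment routine (shuffle, then deal round-robin), the seed, and the use of the final sample size of 183 are assumptions for illustration; the study does not describe its randomization procedure at this level of detail.

```python
import random
from collections import Counter
from itertools import product

PRESENTATION = ("one character (massed)", "two characters (simultaneous)")
COLOR = ("color-coded", "no color")

# Crossing the two between-subject factors gives the four training conditions.
CONDITIONS = list(product(PRESENTATION, COLOR))


def assign(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin into the conditions.

    Hypothetical procedure: it keeps group sizes within one of each other.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(ids)}


assignment = assign(range(1, 184))  # 183 hypothetical participant IDs
group_sizes = Counter(assignment.values())
```

Stroke number does not appear in the assignment because it is a within-subject factor: every participant sees both identical-stroke and different-stroke pairs.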

Participants were randomly assigned to one of the four training conditions presented in Figure 1:

This study was conducted online using the Gorilla Experiment Builder (www.gorilla.sc; Anwyl-Irvine et al., 2021). First, a questionnaire was administered to gather information on participants’ Chinese learning experience. The experiment comprised two phases: a training phase and a testing phase. During the training phase, participants were exposed to Chinese characters based on their assigned conditions. They were instructed to observe each character, count the number of strokes, and enter the stroke number before advancing to the next trial. No feedback was given during this phase.

The testing phase involved two recognition tests: an immediate test on Day 1 and a delayed test 24 h later on Day 2. Both tests used a two-alternative forced-choice task, where participants were shown an English translation and asked to identify the corresponding Chinese character from two options. Each test included 48 trials, including all the new characters learned during the training phase. The study followed a between-subjects design, ensuring that each participant experienced only one training condition. Figure 2 illustrates the training phase of the “Two Characters, Color-Coded” condition, while the testing phase was the same for all conditions.

2.4. Statistical Analysis

The statistical analyses were performed using the lme4 package (Bates et al., 2015) within the R statistical software environment (R version 4.3.3), maintained by the R Core Team (2020). In these analyses, accuracy and reaction times (RT) for each trial were treated as dependent variables. A linear mixed-effects model was applied, with stroke number, color cue, and presentation type as fixed-effect predictors. The coding for these predictors was as follows: different-stroke (0.5) vs. identical stroke (−0.5), presence of a color cue (C) (0.5) vs. absence of a color cue (NC) (−0.5), and massed presentation (one character) (0.5) vs. simultaneous presentation (two characters) (−0.5). The random effect structure included varying intercepts and slopes for both participants and items, with no initial correlation assumed between these intercepts and slopes. In the event of convergence issues, the models were adjusted by removing random terms sequentially, starting with the term having the smallest value (Barr et al., 2013).
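The ±0.5 coding scheme can be illustrated with a small sketch. The dictionary keys and function names are hypothetical; only the numeric codes come from the text. Because the two levels of each factor are one unit apart and sum to zero, each main-effect coefficient can be read as the difference between the two levels, evaluated at the midpoint of the other factors.

```python
import itertools

# Contrast codes as reported in the text (sum-to-zero, one unit apart).
CODES = {
    "stroke_num": {"different": 0.5, "identical": -0.5},
    "color": {"color": 0.5, "no_color": -0.5},
    "presentation": {"massed": 0.5, "simultaneous": -0.5},
}


def code_trial(stroke_num, color, presentation):
    """Map factor labels to numeric predictors, including interaction products."""
    s = CODES["stroke_num"][stroke_num]
    c = CODES["color"][color]
    p = CODES["presentation"][presentation]
    return {
        "stroke_num": s,
        "color": c,
        "presentation": p,
        "color:presentation": c * p,
        "color:stroke_num": c * s,
        "presentation:stroke_num": p * s,
        "color:presentation:stroke_num": c * p * s,
    }


def all_cells():
    """Coded predictors for each of the eight design cells."""
    return [code_trial(s, c, p) for s, c, p in itertools.product(
        CODES["stroke_num"], CODES["color"], CODES["presentation"])]


def column_sums():
    """Every predictor column sums to zero across the eight cells."""
    cells = all_cells()
    return {name: sum(cell[name] for cell in cells) for name in cells[0]}
```

In a balanced design the predictor columns are mutually orthogonal under this coding, so the intercept corresponds to the grand mean rather than to one reference cell.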

Before conducting the analysis, we used a density plot to visualize RTs and identify potential outliers. RTs below 300 ms or above 10,000 ms were considered outliers and were excluded from the dataset. Based on the results of Box–Cox tests (Box & Cox, 1964), a logarithmic transformation was applied to the RTs, and the transformed values were used as the dependent variable. The initial model was fitted to the complete dataset, and subsequently, following the residual trimming method proposed by Baayen et al. (2008) and Baayen and Milin (2010), data points with residuals exceeding 2.5 standard deviations were removed. The final results are based on the refined model. For response accuracy, generalized linear mixed-effects models were used, with accuracy coded as 1 (correct) or 0 (error) as the dependent variable. These models were structured similarly to those used in the RT analysis. Data and R scripts are available at https://osf.io/jt6f4/.
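The trimming steps described above can be sketched as follows. This is a simplified illustration, not the authors' R script: the natural logarithm is assumed (the base is not stated), and the `fitted` values stand in for predictions from the initial mixed model, which is not fitted here.

```python
import math
import statistics


def trim_absolute(rts_ms, low=300, high=10_000):
    """Drop RTs below 300 ms or above 10,000 ms."""
    return [rt for rt in rts_ms if low <= rt <= high]


def log_transform(rts_ms):
    """Natural-log transform (assumed base)."""
    return [math.log(rt) for rt in rts_ms]


def residual_keep_mask(observed, fitted, cutoff=2.5):
    """Flag observations whose standardized residual is within +/- cutoff.

    `fitted` is a placeholder for the initial model's predictions.
    """
    residuals = [o - f for o, f in zip(observed, fitted)]
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)
    if sd == 0:
        return [True] * len(residuals)
    return [abs((r - mean) / sd) <= cutoff for r in residuals]
```

After applying the mask, the model would be refitted on the retained observations, which corresponds to the refined model mentioned in the text.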

3. Results

In this analysis, we examine the main effects of three variables—visual complexity (stroke number), presentation style (simultaneous or massed), and color coding—as well as their interactions in both accuracy and RT. To address the research questions, participants’ performance on both the immediate and delayed tests was evaluated using linear mixed models. For all analyses, we began with a complex full model and simplified it to identify the best-fitting parsimonious model. We then conducted two post hoc analyses: one stratified by both presentation and color, and another stratified by presentation alone.

3.1. Immediate Test Analysis

3.1.1. Accuracy Analysis

The accuracy rates of the four training conditions, together with their SDs, are presented in Table 2.

Generalized linear mixed models (GLMMs) were fitted to analyze the accuracy data. The final model is shown in (1).

mixed(correct ~ color * presentation * stroke_num + (stroke_num|subj) + (presentation|itemN))

(1)

The results of the model are presented in Table 3. The random effect structure indicated substantial between-subject variability (σ² = 0.390) and moderate between-item variability (σ² = 0.104).

The model output revealed a significant main effect of stroke number (β = 0.14, SE = 0.05, z = 2.64, p = 0.008) and a significant interaction between presentation style and stroke number (β = 0.07, SE = 0.03, z = 2.11, p = 0.035). No other main effects or interactions reached statistical significance.
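As a rough plausibility check on these statistics, the sketch below converts a z value into a two-sided p-value and a logit coefficient into an odds ratio. It assumes that the reported z values are Wald statistics referred to a standard normal distribution and that the GLMM uses a logit link; the authors' p-values may have been obtained by a different method, so small discrepancies would not be surprising.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def odds_ratio(beta):
    """Odds ratio implied by a coefficient on the log-odds scale."""
    return math.exp(beta)


# Stroke number: z = 2.64 gives p of about 0.008, matching the text.
p_stroke = two_sided_p(2.64)
# Presentation x stroke number: z = 2.11 gives p of about 0.035.
p_interaction = two_sided_p(2.11)
# With the +/-0.5 coding, beta = 0.14 is the log-odds difference between
# different-stroke and identical-stroke characters: an odds ratio near 1.15.
or_stroke = odds_ratio(0.14)
```

Read this way, the odds of a correct response were roughly 15% higher for different-stroke than for identical-stroke characters, averaged over the other factors.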

Further pairwise analyses using emmeans (stratified by both presentation and color) revealed that the accuracy rate for characters with different strokes was significantly higher than that for characters with identical strokes in both the One Character, Color-Coded condition (Mean Difference = 1.56, SE = 0.23, z = 3.01, p = 0.003) and the One Character, No Color condition (Mean Difference = 1.49, SE = 0.21, z = 2.77, p = 0.006). However, no significant differences were observed between the accuracy rates for characters with different strokes and those with identical strokes in either the Two Characters, Color-Coded condition (p = 0.13) or the Two Characters, No Color condition (p = 0.65).

The post hoc pair-wise comparison analysis (stratified by presentation) showed that the massed presentation condition, where the characters were presented one by one, produced more accurate results in learning different-stroke pairs than the identical-stroke pairs (Mean Difference = 1.52, SE = 0.91, z = 3.33, p < 0.001). In contrast, the simultaneous presentation condition showed no difference between the identical-stroke and different-stroke pairs (Mean Difference = 1.15, SE = 0.14, z = 1.15, p = 0.250).

3.1.2. RT Analysis

RTs for correct trials were analyzed, comprising a total of 5327 trials. Trials with RTs greater than 10,000 ms or less than 300 ms were excluded as outliers (3.17% of the data), leaving 5158 trials for analysis. An additional 1.9% of trials were removed because their standardized residuals exceeded ±2.5 standard deviations. Mean RTs for the four conditions are presented in Table 4.
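The trial counts reported here can be checked with simple arithmetic. The sketch assumes that the second percentage (1.9%) is expressed relative to the 5158 trials remaining after the first exclusion, which the text does not state explicitly.

```python
def excluded_share(before, after):
    """Percentage of trials removed between two counts."""
    return 100 * (before - after) / before


def approx_removed(n_trials, percent):
    """Approximate number of trials corresponding to a percentage."""
    return round(n_trials * percent / 100)


# 5327 -> 5158 trials: 169 trials removed, i.e. about 3.17%.
first_step = excluded_share(5327, 5158)
# Assumed base of 5158 trials: 1.9% corresponds to roughly 98 trials.
second_step = approx_removed(5158, 1.9)
```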

Linear mixed-effects models were fitted to analyze log-transformed reaction times. The final model included fixed effects for color, presentation type, stroke number, and all their interactions. The random effect structure accounted for by-subject variability in intercepts and slopes for stroke number, as well as by-item variability in intercepts as in (2).

mixed(logrt ~ color * presentation * stroke_num + (1 + stroke_num1|subj) + (1|itemN))

(2)

The results are presented in Table 5. The random effect structure indicated substantial between-subject variability (σ² = 0.247) relative to residual variance (σ² = 0.179).

The final model showed a significant color cue effect (β = 0.09, SE = 0.04, z = 2.41, p = 0.017), which indicated that color-coded characters were processed more slowly than those with no color cues. We also observed a significant main effect of stroke number (β = −0.03, SE = 0.01, z = −2.48, p = 0.017), which showed that different-stroke pairs were responded to more quickly than identical-stroke pairs, consistent with the accuracy data. No interactions reached statistical significance.
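Because the dependent variable is log-transformed, these coefficients are easier to interpret as proportional changes in RT. The sketch below back-transforms them, assuming a natural-log transformation (the base is not stated in the text) and the ±0.5 coding, under which each coefficient spans the full difference between the two levels.

```python
import math


def percent_change(beta):
    """Percentage change in RT implied by a coefficient on the natural-log scale."""
    return (math.exp(beta) - 1) * 100


# Color cue: beta = 0.09, i.e. roughly 9.4% slower with color coding.
color_effect = percent_change(0.09)
# Stroke number: beta = -0.03, i.e. roughly 3% faster for different-stroke pairs.
stroke_effect = percent_change(-0.03)
```

These figures are multiplicative effects on the typical RT rather than fixed differences in milliseconds.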

A post hoc analysis (stratified by both presentation and color) showed that log-transformed RTs for characters with different strokes were significantly faster than those with identical strokes in the One Character, Color-Coded condition (Mean Difference = −0.120, SE = 0.034, z = −3.48, p < 0.001). In contrast, no significant differences in RTs were observed between characters with different strokes and those with identical strokes in the other three conditions (all p’s > 0.13).

A post hoc analysis (stratified by presentation) revealed that in the massed presentation format, the color-cued condition significantly slowed responses compared to the no-color condition (Mean Difference = 0.27, SE = 0.11, z = 2.50, p = 0.01). Conversely, in the simultaneous presentation format, there was no significant difference in responses between the colored and no-color conditions (Mean Difference = 0.10, SE = 0.11, z = 0.92, p = 0.36).

3.2. Delayed Test Analysis

Among those who participated in the Day 1 training sessions, a total of 191 participants took part in the Day 2 delayed tests. Again, participants who reported knowledge of Chinese characters were excluded, leaving 172 participants in the final analysis. The analysis followed the same procedure as for the immediate tests (Day 1).

3.2.1. Accuracy Analysis

The accuracy rates of the four conditions are shown in Table 6.

Logistic mixed-effects models were fitted to analyze the accuracy data. The final model is shown in (3).

mixed(correct ~ color * presentation * stroke_num + (stroke_num|subj) + (color * presentation|itemN))

(3)

The results of the final model are presented in Table 7. The random effect structure indicated substantial between-subject variability (σ² = 0.398) and moderate between-item variability (σ² = 0.098).

The model output showed a marginal main effect of stroke number (β = 0.10, SE = 0.05, z = 1.88, p = 0.06), and a marginally significant interaction between stroke number and presentation style was observed (β = 0.06, SE = 0.03, z = 1.86, p = 0.062). No other effects reached statistical significance.

A post hoc analysis (stratified by both presentation and color) was conducted. The results indicated that the accuracy rate for characters with different strokes was significantly higher than that for characters with identical strokes in the One Character, Color-Coded condition (Mean Difference = 1.57, SE = 0.24, z = 2.91, p = 0.004). However, no significant differences in accuracy rates were found between characters with different strokes and those with identical strokes in the other three conditions (all p’s > 0.20).

A post hoc analysis (stratified by presentation) showed that the massed presentation condition produced better results in learning different-stroke pairs than identical-stroke pairs (Mean Difference = 1.37, SE = 0.18, z = 2.47, p = 0.014). However, the simultaneous presentation condition showed no difference between identical-stroke and different-stroke pairs (Mean Difference = 1.09, SE = 0.13, z = 0.71, p = 0.476). These patterns were consistent with the results from the immediate tests.

3.2.2. RT Analysis

Using the same data trimming procedure as in the immediate test, we removed 4.61% of trials as outliers and included 4736 trials in the analysis. An additional 2.47% of trials were excluded because their standardized residuals exceeded ±2.5 standard deviations. The mean RTs of the four conditions are presented in Table 8.

Linear mixed-effects models were fitted to analyze the RT data. The final model is shown in (4).

mixed(logrt ~ color * presentation * stroke_num + (stroke_num|subj) + (1|itemN))

(4)

The results of the model are shown in Table 9. The random effect structure indicated substantial between-subject variability (σ² = 0.277) relative to residual variance (σ² = 0.164).

As in the immediate test analysis, the model revealed a main effect of color coding (β = 0.11, SE = 0.04, t = 2.78, p = 0.006), showing that color coding led to slower reaction times. Additionally, there was a main effect of stroke number (β = −0.04, SE = 0.01, t = −3.39, p = 0.001), showing that different-stroke pairs resulted in faster reaction times. Presentation style did not show a main effect (β = 0.01, SE = 0.04, t = 0.28, p = 0.78). No interaction was observed between any two variables (all p’s > 0.14).

Further pairwise analyses (stratified by both presentation and color) revealed that characters with different strokes were significantly quicker to recognize than those with identical strokes in the following conditions: One Character, Color-Coded (Mean Difference = −0.07, SE = 0.03, z = −2.04, p = 0.04); One Character, No Color (Mean Difference = −0.08, SE = 0.03, z = −2.29, p = 0.02); and Two Characters, Color-Coded (Mean Difference = −0.13, SE = 0.04, z = −3.46, p < 0.01). However, this difference was not significant in the Two Characters, No Color condition (Mean Difference = −0.05, SE = 0.04, z = −1.35, p = 0.18).

Post hoc analysis (stratified by presentation) revealed that in the massed presentation condition, color coding significantly increased RTs compared to no color coding (Mean Difference = −0.35, SE = 0.11, z = −3.03, p < 0.01). However, in the simultaneous presentation condition, no significant difference in RTs was observed between color coding and no color coding (Mean Difference = −0.11, SE = 0.12, z = −0.92, p = 0.36).

4. Discussion

This study aimed to determine whether visually similar characters with identical stroke numbers and those with different stroke numbers pose the same level of challenge to L2 Chinese learners, and to identify an optimal way to present input (i.e., Chinese characters) by empirically testing two strategies: color coding and presentation style. Specifically, we aimed to understand whether and how these strategies can be applied to visually similar Chinese characters during the early stage of character learning. We employed an online training and testing approach to examine how beginning L2 Chinese learners retain visually similar characters under different training conditions: with or without color coding, and presented either simultaneously or in a massed format. The study also compared the retention of visually similar character pairs with identical strokes to that of pairs with different strokes.

Results from both the immediate and delayed tests revealed a consistent pattern: character pairs with different strokes were better retained than character pairs with identical strokes, as evidenced by both accuracy and RT analyses. Presentation style showed no main effect in either the immediate or the delayed tests. However, the accuracy analysis showed that in the massed presentation conditions, learners performed better on different-stroke characters than on identical-stroke characters, whereas this pattern disappeared in the simultaneous presentation conditions, where learners performed similarly on both types of characters. This suggests that simultaneous presentation was more effective for learning identical-stroke characters than different-stroke ones and consequently reduced the difference in learning outcomes between the two character types. A likely explanation is that simultaneous presentation encourages learners to compare the stroke structures of identical-stroke characters, which is consistent with structure mapping theory (Gentner, 1983; Namy & Gentner, 2002). Finally, color coding showed no main effect in the accuracy analysis in either the immediate or the delayed tests. However, the RT analysis showed that the use of color cues slowed recognition in both tests.

To address the first research question, we examined which type of visually similar character pair (identical-stroke or different-stroke) is easier to learn. Logically, the addition of a stroke increases the overall stroke count, which, according to the conventional measure of visual complexity, should make these characters more difficult to learn and retain. However, the results contradicted this expectation, revealing that characters with identical stroke numbers were harder to remember. According to Ellis (2016), salience is the property of a stimulus that causes it to be easily noticed and learned. Characters with an additional stroke are often more visually salient than those with identical strokes, which might explain this finding.

The current study further explores how well beginners can identify the stroke counts and internal structures of characters through exposure and practice, so that they can visualize and become familiar with basic character radicals. As Reder et al. (2016) have noted, novice learners can accurately decode characters with minimal training, especially in controlled laboratory settings. This ability may enhance their capacity to differentiate characters with different stroke counts, as these characters share most of their stroke structure except for one stroke. In contrast, for characters with identical strokes, the stroke structures are different even though they share the same number of strokes. Our results indicated that character pairs with identical strokes but different stroke orientations were harder to learn than those differing by an additional stroke.

This difficulty might stem from the more subtle differences in identical-stroke characters, which require higher visual processing skills. For learners whose L1 lacks such visual patterns (displayed in identical stroke characters), the requisite features for processing them are unavailable. Furthermore, ingrained L1 processing strategies can inhibit the ability to notice correct orthographic cues in L2. This aligns with Tomlin and Villa’s (1994) cognitive theory of detection, which posits that detection—a function of alertness and orientation—is a critical first step for linguistic features to become available for processing and acquisition.

As M. Wang et al. (2003) have highlighted, the complex, square-shaped structure and high spatial frequency of Chinese characters place significant demands on spatial memory. This may impose an additional processing burden when learning and retaining identical-stroke characters compared to different-stroke characters. McBride-Chang et al. (2005) found a positive correlation between visual-spatial abilities and the acquisition of Chinese characters. Thus, visual working memory is crucial for learning Chinese characters, as it allows for the temporary storage of approximately three to four visual objects (e.g., Luck, 2008; Luck & Vogel, 2013). This implies that the average capacity for remembering items or features is quite limited. In our study, the selected characters averaged 3–4 strokes, which falls within the capacity of visual working memory. According to Zimmer and Fischer (2020), even native Chinese speakers rely on visual working memory rather than semantic information when memorizing trained characters and pseudo-characters in which one radical of a real character was altered. In their Experiment 4, highly conceptually similar distractors did not increase memory errors, whereas visually similar distractors weakened memory performance. It can be inferred that beginners would primarily use visual-spatial memory to memorize Chinese characters. The current study is the first to show that characters with the same number of strokes but different spatial orientations are more challenging to learn than visually similar characters with different stroke numbers.

Second, presentation style showed no main effects. However, the interaction effect indicated that simultaneous presentation facilitated learning identical stroke characters more than different stroke characters. The highest accuracy rate was observed in the massed presentation of characters with different strokes, whereas the lowest accuracy rate was in the simultaneous presentation with identical strokes. To elaborate, when characters were presented one by one, learning different-stroke pairs resulted in higher accuracy compared to identical-stroke pairs. Conversely, in the simultaneous presentation condition, no significant difference was found between identical-stroke and different-stroke pairs. This suggests that the simultaneous presentation format helped reduce the difficulty associated with learning characters of identical strokes and, thus, reduced the difference between identical-stroke characters and different-stroke characters. As previously noted, simultaneous presentation allows learners to directly compare the stroke patterns of two characters, enhancing learning efficiency. In contrast, massed presentation may increase cognitive load by requiring learners to process two elements presented in random order and integrate them later, which can strain working memory (Chandler & Sweller, 1992, 1996; Kalyuga et al., 2000). Shen (2005, p. 56) identified learners’ common strategies such as “paying attention to graphic structures” and “visualizing the graphic structure of the character”. This study provides strong evidence that simultaneous presentation significantly improves beginners’ ability to learn the identical-stroke characters by comparing the graphic structures. Next, in the massed presentation condition, the RT analysis revealed that colored characters led to slower responses compared to the no-color condition. 
In contrast, during simultaneous presentation, there were no significant differences in RT between the colored and no-color conditions, irrespective of whether the tests were immediate or delayed. The analysis also suggested that although color exerted a slight influence on response times, simultaneous presentation seemed to mitigate it. Additionally, the results of our study were consistent with a large number of studies showing that viewing multiple instances simultaneously enhances generalization (e.g., Gentner et al., 2009; Oakes & Ribar, 2005; Vlach et al., 2012). For example, Vlach et al. (2012) found that 2-year-old children who were exposed to stimuli simultaneously outperformed their peers in spaced or massed conditions on a novel noun generalization task when tested immediately afterward. The researchers suggested that the simultaneous presentation condition likely reduced forgetting and memory load, allowing learners to engage more readily in comparative mental processes during learning. Since all instances were visible throughout the learning phase, children in the simultaneous condition did not need to mentally retrieve previous instances they had observed. In summary, simultaneous presentation might help beginners retain visually similar characters, but this conclusion remains tentative because the main effect of presentation style was not significant.

Finally, we applied a color cue to highlight the stroke that distinguishes the paired characters and found that the color cue increased recognition reaction times in both the immediate and delayed tests. The color cues were designed to guide learners’ visual focus towards the differential stroke structure within characters. Nonetheless, the use of color might inadvertently introduce a negative effect by acting as an additional distraction. The intricate visual information could compel participants to engage in an extra process of matching radicals with colors, resulting in a divided attention effect. As a result, this could adversely affect learners’ character acquisition process. These findings align with Hou and Jiang (2022), who reported that radical marking using color or animation increased RT and reduced recognition accuracy in character recognition tasks. They suggested that providing radical and stroke information might interfere with, rather than facilitate, character learning. Our study supports this view, indicating that excessive visual information introduced during the learning process may increase cognitive load for L2 learners, thereby reducing learning effectiveness (Baddeley, 1992). When learners process multiple elements of information simultaneously, they must divide their attention across all elements. This simultaneous presentation of information can result in the split attention effect (Chandler & Sweller, 1991, 1992; Owens & Sweller, 2008), further straining working memory capacity. Thus, the color cue did not facilitate character learning in our study, which may have important pedagogical implications for L2 instruction.

Constructed salience, a concept derived from input enhancement (IE; Sharwood Smith, 1991), is therefore a key pedagogical tool for directing learners’ attention to specific linguistic features, such as the structural components of Chinese characters. In this study, it took the form of specific presentation styles and color coding. While the simultaneous presentation of characters proved effective for retention, color coding did not function as intended. The results align with Sharwood Smith’s (1991) central caution that externally induced salience “may not necessarily be registered by the learner and even when it is registered, it may not affect the learning mechanisms per se” (p. 118). This outcome demonstrates that while IE can draw a learner’s attention, its subsequent processing into the language system depends on the type of linguistic evidence provided (Sharwood Smith, 1991, pp. 122–125). In this study, simultaneous presentation was an effective approach for teaching visually similar characters to beginning learners, as it facilitated the intake of orthographic information; in contrast, color coding was ineffective, possibly because highlighting a single stroke in red failed to provide adequate linguistic evidence.

Based on our results, we recommend the simultaneous presentation of visually similar characters to raise learner awareness. This approach should be integrated into classroom instruction (e.g., on the same slide) and textbook design (e.g., within the same exercise or page). Conversely, color marking, whether applied to a single stroke (this study) or a radical (Hou & Jiang, 2022), appears ineffective for character learning and is not recommended.

5. Conclusions

This study yielded two key findings related to salience. From a perceptual salience perspective, an intriguing finding was that characters with identical strokes were more challenging to learn than those with different strokes, despite the difference being just one stroke. From a constructed salience perspective, simultaneous presentation aided beginners by facilitating structural comparison, which reduced RT and accuracy differences between identical-stroke characters and different-stroke characters. However, color coding may interfere with the learning process. When guiding learners to analyze the internal components of Chinese characters, such as strokes and radicals, it is crucial to minimize the cognitive load imposed by instructional materials and educational software.

Author Contributions

Conceptualization, J.L.; Methodology, J.L. and X.W.; Software, J.L. and X.W.; Validation, M.S.; Data Analysis, J.L.; Writing—Original Draft Preparation, J.L.; Writing—Review & Editing, J.L. and X.W.; Visualization, J.L.; Supervision, X.W.; Project Administration, X.W.; Funding Acquisition, J.L. and X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by Zhejiang Provincial Philosophy and Social Sciences Planning Project (22NDJC037Z to JML) and Australia Research Council (grant number ARC DP 210102789 to XW).

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Macquarie University. Project Ethics Number: ID 11189.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Research data and the detailed analysis scripts for this article are available at https://osf.io/jt6f4/.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Identical-Stroke Characters		Different-Stroke Characters
未	末	目	自
犬	太	木	禾
天	夫	日	旦
手	毛	今	令
刀	力	火	灭
土	士	月	用
干	千	白	百
开	井	中	申
午	牛	尸	户
元	无	从	丛
己	已	王	主
人	入	了	子

References

Anderson, R. C., Ku, Y.-M., Li, W., Chen, X., Wu, X., & Shu, H. (2013). Learning to see the patterns in Chinese characters. Scientific Studies of Reading, 17(1), 41–56.
Anwyl-Irvine, A., Dalmaijer, E. S., Hodges, N., & Evershed, J. K. (2021). Realistic precision and accuracy of online experiment platforms, web browsers, and devices. Behavior Research Methods, 53(4), 1407–1425.
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.
Baayen, R. H., & Milin, P. (2010). Analyzing reaction times. International Journal of Psychological Research, 3, 12–28.
Baddeley, A. (1992). Working memory. Science, 255, 556–559.
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society, Series B, 26, 211–252.
Cao, F., Rickles, B., Vu, M., Zhu, Z., Chan, D. H. L., Harris, L. N., Stafura, J., Xu, Y., & Perfetti, C. A. (2013). Early stage visual-orthographic processes predict long-term retention of word form and meaning: A visual encoding training study. Journal of Neurolinguistics, 26, 440–461.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132(3), 354–380.
Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition & Instruction, 8, 293–332.
Chandler, P., & Sweller, J. (1992). The split attention effect as a factor in the design of instruction. British Journal of Educational Psychology, 62, 233–246.
Chandler, P., & Sweller, J. (1996). Cognitive load while learning to use a computer program. Applied Cognitive Psychology, 10, 151–170.
Chen, Y. P., Allport, D. A., & Marshal, J. C. (1996). What are the functional orthographic units in Chinese word recognition: The stroke or the stroke pattern? The Quarterly Journal of Experimental Psychology, 49A(4), 1024–1043.
Elgort, I., & Warren, P. (2014). L2 vocabulary learning from reading: Explicit and tacit lexical knowledge and the role of learner and item variables. Language Learning, 64, 365–414.
Ellis, N. C. (2016). Salience, cognition, language complexity, and complex adaptive systems. Studies in Second Language Acquisition, 38, 341–351.
Franceschina, F. (2005). Fossilized second language grammars: The acquisition of grammatical gender. John Benjamins.
Gao, L., & Meng, L. (2000). The role of the phonetic code and orthographic code of Chinese character recognition by foreign learners 外国留学生汉语阅读中音，形信息对汉字辨认的影响. Chinese Teaching in the World, 4, 67–76.
Gass, S. M., Spinner, P., & Behney, J. (Eds.). (2018). Salience in second language acquisition. Routledge.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7(2), 155–170.
Gentner, D., Loewenstein, J., Thompson, L., & Forbus, K. D. (2009). Reviving inert knowledge: Analogical abstraction supports relational retrieval of past events. Cognitive Science: A Multidisciplinary Journal, 33, 1343–1382.
Hawkins, R., & Liszka, S. (2003). Locating the source of defective past tense marking in advanced L2 English speakers. In R. van Hout, A. Hulk, F. Kuiken, & R. Towell (Eds.), The interface between syntax and lexicon in second language acquisition (pp. 21–44). John Benjamins.
Ho, C. S.-H., & Bryant, P. (1999). Different visual skills are important in learning to read English and Chinese. Educational and Child Psychology, 16, 4–14.
Hou, F., & Jiang, X. (2022). Interference effects of radical markings and stroke order animations on Chinese character learning among L2 learners. Frontiers in Psychology, 13, 783613.
Hu, B. (2010). The challenges of Chinese: A preliminary study of UK learners’ perceptions of difficulty. Language Learning Journal, 38(1), 99–118.
Huang, Z., & Chen, Y. (1988). Introductory Chinese: Reading comprehension. Sinolingua.
Jiang, N., & Feng, L. (2022). Analytic visual word recognition among Chinese L2 learners. Foreign Language Annals, 55, 540–558.
Jiang, N., Hou, F., & Jiang, X. (2020). Analytic versus holistic recognition of Chinese words among L2 learners. The Modern Language Journal, 104, 567–580.
Jin, H. G. (2006). Multimedia effects and Chinese character processing: An empirical study of CFL learners from three different orthographic backgrounds. Journal of Chinese Language Teachers Association, 41, 35–56.
Kalyuga, S., Chandler, P., & Sweller, J. (2000). Incorporating learner experience into the design of multimedia instruction. Journal of Educational Psychology, 92, 126–136.
Ko, M. H. (2012). Glossing and second language vocabulary learning. TESOL Quarterly, 46, 56–79.
Lee, C. H., & Kalyuga, S. (2011a). Effectiveness of different pinyin presentation formats in learning Chinese characters: A cognitive load perspective. Language Learning, 61(4), 1099–1118.
Lee, C. H., & Kalyuga, S. (2011b). Effectiveness of on-screen pinyin in learning Chinese: An expertise reversal for multimedia redundancy effect. Computer in Human Behavior, 27, 11–15. [Google Scholar] [CrossRef]
Li, L., Liu, H., & Liu, X. (2005). Effects of characters construction on basic processing unit of Chinese character recognition (汉字结构对汉字识别加工的影响). Psychology Exploration (心理学探新), 25, 23–27. [Google Scholar]
Liversedge, S. P., Zang, C., Zhang, M., Bai, X., Yan, G., & Drieghe, D. (2014). The effect of visual complexity and word frequency on eye movements during Chinese reading. Visual Cognition, 22, 441–457. [Google Scholar] [CrossRef]
Luck, S. J. (2008). Visual short-term memory. In S. J. Luck, & A. Hollingworth (Eds.), Visual memory (pp. 43–85). Oxford University Press. [Google Scholar]
Luck, S. J., & Vogel, E. K. (2013). Visual working memory capacity: From psychophysics and neurobiology to individual differences. Trends in Cognitive Sciences, 17, 391–400. [Google Scholar] [CrossRef]
Majaj, N. J., Pelli, D. G., Kurshan, P., & Palomares, M. (2002). The role of spatial frequency channels in letter identification. Vision Research, 42, 1165–1184. [Google Scholar] [CrossRef]
McBride-Chang, C., Chow, B. W. Y., Zhong, Y., Burgess, S., & Hayward, W. G. (2005). Chinese character acquisition and visual skills in two Chinese scripts. Reading and Writing, 18, 99–128. [Google Scholar] [CrossRef]
Namy, L., & Gentner, D. (2002). Making a silk purse out of two sow’s ears: Young children’s use of comparison in category learning. Journal of Experimental Psychology: General, 131(1), 5–15. [Google Scholar] [CrossRef]
Oakes, L. M., & Ribar, R. J. (2005). A comparison of infant’s categorization in paired and successive presentation familiarization tasks. Infancy, 7, 85–98. [Google Scholar] [CrossRef]
Owens, P., & Sweller, J. (2008). Cognitive load theory and music instruction. Educational Psychology, 28, 29–45. [Google Scholar] [CrossRef][Green Version]
Pelli, D. G., Burns, C. W., Farell, B., & Moore-Page, D. C. (2006). Feature detection and letter identification. Vision Research, 46, 4646–4674. [Google Scholar] [CrossRef] [PubMed]
Peng, D. L., & Wang, C. M. (1997). Basic processing unit of Chinese character recognition: Evidence from stroke number effect and radical number effect (汉字加工的基本单元: 来自笔画数效应和部件数效应). Acta Psychologica Sinica (心理学报), 29, 8–16. [Google Scholar]
R Core Team. (2020). R: A language and environment for sta tistical computing. R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 27 May 2025).
Reder, L., Liu, X., Keinath, A., & Popov, V. (2016). Building knowledge requires bricks, not sand: The critical role of familiar constituents in learning. Psychonomic Bulletin Review, 23, 271–277. [Google Scholar] [CrossRef]
Sharwood Smith, M. S. (1991). Speaking to many minds: On the relevance of different types of language information for the L2 learner. Second Language Research, 7, 118–132. [Google Scholar]
Shen, H. H. (2004). Level of cognitive processing: Effects on character learning among non-native learners of Chinese as a foreign language. Language and Education, 18(2), 167–182. [Google Scholar] [CrossRef]
Shen, H. H. (2005). An investigation of Chinese-character learning strategies among non-native speakers of Chinese. System, 33(1), 49–68. [Google Scholar] [CrossRef]
Shen, H. H. (2013). Chinese L2 literacy development: Cognitive characteristics, learning strategies, and pedagogical interventions. Language and Linguistics Compass, 7(7), 371–387. [Google Scholar] [CrossRef]
Shu, H., Chen, X., Anderson, R. C., Wu, N., & Xuan, Y. (2003). Properties of school Chinese: Implications for learning to read. Child Development, 74(1), 27–47. [Google Scholar] [CrossRef]
Songsangkaew, P., Yodchim, S., Sukamolson, S., & Person, K. R. (2023). English morphological awareness in complex words through color coding for reading comprehension. Journal of Namibian Studies, 34, 1836–1852. [Google Scholar] [CrossRef]
Su, Y. F., & Samuels, S. J. (2010). Developmental changes in character-complexity and word-length effects when reading Chinese script. Reading & Writing, 23, 1085–1108. [Google Scholar] [CrossRef]
Taft, M., & Chung, K. (1999). Using radicals in teaching Chinese characters to second language learners. Psychologia, 42, 234–251. [Google Scholar]
Taylor, I., & Taylor, M. M. (1995). Writing and literacy in Chinese, Korean, and Japanese. John Benjamins Publishing. [Google Scholar]
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203. [Google Scholar] [CrossRef]
Tortorelli, L. S., Lupo, S. M., & Wheatley, B. C. (2021). Examining teacher preparation for code-related reading instruction: An integrated literature review. Reading Research Quarterly, 56(S1), S317–S337. [Google Scholar] [CrossRef]
Vlach, H. A., Ankowski, A. A., & Sandhofer, C. M. (2012). At the same time or apart in time? The role of presentation timing and retrieval dynamics in generalization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38(1), 246–254. [Google Scholar] [CrossRef]
Vlach, H. A., Kaul, M., Hosch, A., Lazaroff, E., & Wang, Q. (2022). Attending less and forgetting more: Dynamics of simultaneous, massed, and spaced presentations in science concept learning. Journal of Applied Research in Memory and Cognition, 11(3), 361–373. [Google Scholar] [CrossRef]
Wang, H., He, X., & Legge, G. E. (2014). Effect of pattern complexity on the visual span for Chinese and alphabet characters. Journal of Vision, 14(8), 6. [Google Scholar] [CrossRef] [PubMed]
Wang, M., Koda, K., & Perfetti, C. A. (2003). Alphabetic and nonalphabetic L1 effects in English word identification: A comparison of Korean and Chinese English L2 learners. Cognition, 87, 129–149. [Google Scholar] [CrossRef]
Yeh, S.-L., Li, J.-L., Takeuchi, T., Sun, V., & Liu, W.-R. (2003). The role of learning experience on the perceptual organization of Chinese characters. Visual Cognition, 10(6), 729–764. [Google Scholar] [CrossRef]
Yu, L., Zhang, Q., Priest, C., Reichle, E. D., & Sheridan, H. (2018). Character-complexity effects in Chinese reading and visual search: A comparison and theoretical implications. Quarterly Journal of Experimental Psychology, 71(1), 140–151. [Google Scholar] [CrossRef] [PubMed]
Zhang, J. (1992). Modern Chinese character course (现代汉字教程). Modern Publishing House. [Google Scholar]
Zhang, J. Y., Zhang, T., Xue, F., Liu, L., & Yu, C. (2007). Legibility of Chinese characters and its implications for visual acuity measurement in Chinese reading population. Investigative Ophthalmology & Visual Science, 48(5), 2383–2390. [Google Scholar] [CrossRef][Green Version]
Zimmer, H. D., & Fischer, B. (2020). Visual working memory of Chinese characters and expertise: The expert’s memory advantage is based on long-term knowledge of visual word forms. Frontiers in Psychology, 11, 516. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Four training conditions.

Figure 2. The design schemata of the experiment.

Table 1. The language background of participants in four conditions.

Task	N	English	Other Language	No Chinese Exposure	Average Age	Female Learners
One Character, Color-Coded	44	34	10	38	22.5	26
Two Characters, Color-Coded	44	34	10	33	21	28
One Character, No Color	48	40	8	41	24.5	37
Two Characters, No Color	47	37	10	43	21	25
Total/Average	183	145	38	155	22.25	116

Table 2. Mean Accuracy as proportion correct (SD) of the 4 training conditions in the Immediate Test.

	Identical Stroke	Different Stroke
One Character, Color-Coded	0.594 (0.491)	0.683 (0.465)
One Character, No Color	0.586 (0.493)	0.666 (0.472)
Two Characters, Color-Coded	0.586 (0.493)	0.627 (0.483)
Two Characters, No Color	0.608 (0.488)	0.617 (0.486)
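
The SDs in Table 2 are close to what trial-level binary scoring would produce: a 0/1 variable with mean p has a standard deviation of sqrt(p(1 − p)). The short Python sketch below checks this against the reported cells. It assumes each trial was scored 0 (incorrect) or 1 (correct), which the table does not state explicitly, and the name `bernoulli_sd` is illustrative only.

```python
import math


def bernoulli_sd(p: float) -> float:
    """Population SD of a 0/1 variable whose mean is p."""
    return math.sqrt(p * (1.0 - p))


# (mean, reported SD) pairs copied from Table 2:
# first pair = Identical Stroke, second pair = Different Stroke.
table2 = {
    "One Character, Color-Coded": [(0.594, 0.491), (0.683, 0.465)],
    "One Character, No Color": [(0.586, 0.493), (0.666, 0.472)],
    "Two Characters, Color-Coded": [(0.586, 0.493), (0.627, 0.483)],
    "Two Characters, No Color": [(0.608, 0.488), (0.617, 0.486)],
}

for condition, cells in table2.items():
    for mean, reported_sd in cells:
        implied = bernoulli_sd(mean)
        print(f"{condition}: mean={mean:.3f} "
              f"reported SD={reported_sd:.3f} implied SD={implied:.3f}")
```

If the assumption holds, the SD column carries little information beyond the mean itself, so differences between conditions are better read from the means and the model in Table 3.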

Table 3. Summary of the generalized linear mixed model for Accuracy in Immediate Test.

Fixed Effects	Estimate	Std. Error	z-Value	p-Value
(Intercept)	0.565	0.070	8.07	<0.001
color1	0.015	0.052	0.29	0.774
presentation1	0.046	0.056	0.82	0.411
stroke_num1	0.141	0.053	2.64	0.008
color1:presentation1	0.018	0.052	0.35	0.730
color1:stroke_num1	0.025	0.025	1.00	0.317
presentation1:stroke_num1	0.069	0.033	2.11	0.035
color1:presentation1:stroke_num1	−0.014	0.025	−0.54	0.590
Random Effects	Variance	Std. Dev.	Corr
subj: (Intercept)	0.390	0.625	-
subj: stroke_num1	0.017	0.131	0.38
itemN: (Intercept)	0.104	0.323	-
itemN: presentation1	0.021	0.143	0.03
Model Fit Statistics
Conditional R²	0.121
Marginal R²	0.008
AIC	10,934.2
BIC	11,033.0
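
The estimates in Table 3 can be read on a more familiar scale with a small conversion. The Python sketch below assumes the accuracy model is a binomial model with a logit link, so estimates are log-odds; this section does not confirm the link function. The contrast coding is also not given here, so `contrast_span` (the numeric distance between the two codes of a factor, e.g. 2 for ±1 coding, 1 for ±0.5 or 0/1 coding) is a hypothetical parameter, as are the function names.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(estimate: float, contrast_span: float) -> float:
    """Odds ratio between the two levels of a two-level factor.

    contrast_span is an assumption about the coding scheme:
    2.0 for +1/-1 codes, 1.0 for +0.5/-0.5 or 0/1 codes.
    """
    return math.exp(estimate * contrast_span)


intercept = 0.565    # (Intercept) from Table 3
stroke_num = 0.141   # stroke_num1 from Table 3

print(f"Accuracy implied by the intercept: {inv_logit(intercept):.3f}")
for span in (1.0, 2.0):
    print(f"stroke_num1 odds ratio, contrast span {span}: "
          f"{odds_ratio(stroke_num, span):.3f}")
```

The intercept describes a participant and item with random effects of zero, so it need not equal the raw cell means in Table 2 exactly.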

Table 4. Mean RT in ms (SD) of the 4 training conditions in Immediate Test.

	Identical Stroke	Different Stroke
One Character, Color-Coded	3283 (1782)	2935 (1590)
One Character, No Color	2647 (1645)	2483 (1507)
Two Characters, Color-Coded	2671 (1605)	2611 (1496)
Two Characters, No Color	2791 (1800)	2586 (1587)

Table 5. Summary of the linear mixed model for log-transformed Reaction Times in the Immediate Test.

Fixed Effects	Estimate	Std. Error	df	t-Value	p-Value
(Intercept)	7.674	0.039	198.08	196.18	<0.001
color1	0.090	0.037	173.98	2.41	0.017
presentation1	0.045	0.037	174.00	1.19	0.234
stroke_num1	−0.033	0.013	47.01	−2.48	0.017
color1:presentation1	0.041	0.037	173.99	1.11	0.269
color1:stroke_num1	−0.007	0.006	161.95	−1.19	0.236
presentation1:stroke_num1	−0.010	0.006	162.77	−1.53	0.129
color1:presentation1:stroke_num1	−0.010	0.006	161.83	−1.56	0.120
Random Effects	Variance	Std. Dev.	Corr
subj: (Intercept)	0.247	0.497	-
subj: re1.stroke_num1	0.001	0.023	−0.77
itemN: (Intercept)	0.007	0.081	-
Residual	0.179	0.423	-
Model Fit Statistics
Marginal R²	0.006
Conditional R²	0.581
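
Because the RT model in Table 5 is fitted to log-transformed values, its estimates are easier to interpret after exponentiation. The Python sketch below assumes a natural-log transform of RT in milliseconds, which is not stated in this section. As the contrast coding is likewise not given, `contrast_span` (2 for ±1 codes, 1 for ±0.5 or 0/1 codes) is a hypothetical parameter, and the direction of each effect depends on which level carries the positive code.

```python
import math


def rt_ratio(estimate: float, contrast_span: float) -> float:
    """Multiplicative RT difference between two factor levels.

    Assumes the response is ln(RT); contrast_span is an assumed
    distance between the two contrast codes of the factor.
    """
    return math.exp(estimate * contrast_span)


intercept = 7.674   # (Intercept) from Table 5, log scale
color = 0.090       # color1 from Table 5
stroke_num = -0.033  # stroke_num1 from Table 5

# Back-transformed intercept: a geometric-mean RT in ms under the
# natural-log assumption.
print(f"Back-transformed intercept: {math.exp(intercept):.0f} ms")

for span in (1.0, 2.0):
    print(f"contrast span {span}: "
          f"color ratio={rt_ratio(color, span):.3f}, "
          f"stroke ratio={rt_ratio(stroke_num, span):.3f}")
```

A ratio above 1 means slower responses at the positively coded level, and a ratio below 1 means faster responses.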

Table 6. Mean Accuracy as proportion correct (SD) of the 4 training conditions in the Delayed Test.

	Identical Stroke	Different Stroke
One Character, Color-Coded	0.600 (0.490)	0.688 (0.463)
One Character, No Color	0.573 (0.495)	0.611 (0.488)
Two Characters, Color-Coded	0.588 (0.492)	0.614 (0.487)
Two Characters, No Color	0.619 (0.486)	0.624 (0.485)

Table 7. Summary of the generalized linear mixed model for Accuracy in Delayed Test.

Fixed Effects	Estimate	Std. Error	z-Value	p-Value
(Intercept)	0.536	0.071	7.58	<0.001
color1	0.044	0.054	0.82	0.415
presentation1	0.017	0.056	0.30	0.763
stroke_num1	0.100	0.053	1.88	0.060
color1:presentation1	0.078	0.054	1.44	0.150
color1:stroke_num1	0.043	0.028	1.55	0.120
presentation1:stroke_num1	0.058	0.031	1.86	0.062
color1:presentation1:stroke_num1	0.022	0.028	0.80	0.427
Random Effects	Variance	Std. Dev.	Corr
subj: (Intercept)	0.398	0.631	-
subj: stroke_num1	0.031	0.177	0.12
itemN: (Intercept)	0.098	0.314	-
itemN: presentation1	0.010	0.098	0.24
Model Fit Statistics
Conditional R²	0.127
Marginal R²	0.007

Table 8. Mean RT in ms (SD) of the 4 training conditions in the Delayed Test.

	Identical Stroke	Different Stroke
One Character, Color-Coded	2981 (1946)	2706 (1743)
One Character, No Color	2300 (1456)	2087 (1337)
Two Characters, Color-Coded	2771 (1638)	2338 (1307)
Two Characters, No Color	2533 (1597)	2347 (1476)

Table 9. Summary of Linear Mixed Model for Log-Transformed Reaction Time in Delayed Test.

Fixed Effects	Estimate	Std. Error	df	t-Value	p-Value
(Intercept)	7.568	0.042	176.89	180.70	<0.001
color1	0.113	0.041	162.55	2.78	0.006
presentation1	0.011	0.041	162.54	0.28	0.782
stroke_num1	−0.041	0.012	59.64	−3.39	0.001
color1:presentation1	0.059	0.041	162.54	1.46	0.148
color1:stroke_num1	−0.009	0.008	151.31	−1.18	0.241
presentation1:stroke_num1	0.003	0.008	151.44	0.46	0.648
color1:presentation1:stroke_num1	0.011	0.008	151.17	1.47	0.143
Random Effects	Variance	Std. Dev.	Corr
subj: (Intercept)	0.277	0.526	-
subj: re1.stroke_num1	0.003	0.057	−0.49
itemN: (Intercept)	0.004	0.065	-
Residual	0.164	0.405	-
Model Fit Statistics
Marginal R²	0.009
Conditional R²	0.632

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
