Article

Blended Phonetic Training with HVPT Features for EFL Children: Effects on L2 Perception and Listening Comprehension

by KyungA Lee * and Hyunkee Ahn
Department of English Language Education, Seoul National University, Seoul 08826, Republic of Korea
* Author to whom correspondence should be addressed.
Languages 2025, 10(6), 122; https://doi.org/10.3390/languages10060122
Submission received: 20 December 2024 / Revised: 17 May 2025 / Accepted: 20 May 2025 / Published: 26 May 2025
(This article belongs to the Special Issue L2 Speech Perception and Production in the Globalized World)

Abstract

Despite being fundamental to speech processing, L2 perceptual training often receives little attention in L2 classrooms, especially among English as a Foreign Language (EFL) learners navigating complex English phonology. The current study investigates the impact of a blended phonetic training program incorporating HVPT features on L2 perception and listening comprehension skills in Korean elementary EFL learners. Fifty-seven learners, aged 11 to 12 years, participated in a four-week intervention program. They were trained on 13 consonant phonemes that are challenging for Korean learners, using multimedia tools for practice. Pre- and posttests assessed L2 perception and listening comprehension. Learners were grouped into three proficiency levels based on the listening comprehension tests. The results showed significant improvements in L2 perception (p = 0.01), with a small effect, and in listening comprehension (p < 0.001), with a small-to-medium effect. The lower proficiency students demonstrated the largest gains. A correlation between L2 perception and listening comprehension was observed in both the pretest (r = 0.427 **) and the posttest (r = 0.479 ***). Findings underscore the importance of integrating explicit phonetic instruction with HVPT to enhance L2 listening skills among EFL learners.

1. Introduction

Accurate perception of second language (L2) sounds is essential for English as a Foreign Language (EFL) learners with limited exposure to the target language. According to L2 speech models, such as the Speech Learning Model (SLM; Flege, 1995) and the Perceptual Assimilation Model (PAM; Best & Tyler, 2007), L2 learners often assimilate L2 phonemes into existing first language (L1) categories, especially when phonemes are similar across languages. These theories highlight the need for perceptual training to help learners form distinct L2 phonemic categories.
One effective approach for improving L2 perception is High-Variability Phonetic Training (HVPT), which exposes learners to a wide range of phonetic variations across multiple speakers and contexts (Logan et al., 1991). It helps learners refine their perceptual categories in ways that transfer positively to perception and/or production (Lambacher et al., 2005; Sakai & Moorman, 2018; Thomson, 2018; Uchihara et al., 2024). When paired with explicit phonetic instruction, HVPT might be even more effective. Recent studies suggest that explicit instruction that guides learners’ attention to phonetic contrasts and acoustic cues can enhance the effect of perceptual training (Chen & Pederson, 2017; Felker et al., 2023). Felker et al. (2023) showed that video-based explicit instruction on the phonetic cue of vowel duration increased phonological awareness and L2 perception of English consonant and vowel contrasts among younger and older Dutch adult L2 listeners. Such instruction helps learners become consciously aware of the features that differentiate L2 phonemes, which may be especially beneficial for EFL learners who lack extensive immersion in the target language. While most HVPT research has focused on adult learners, studies involving L2 children have only recently begun to emerge (Brekelmans et al., 2024; Giannakopoulou et al., 2017; Lacabex et al., 2014). The present study contributes to this growing line of research by examining fifty-seven Korean EFL learners, aged 11 to 12 years, who participated in a four-week intervention program.
Despite the known benefits of HVPT in improving form-based perceptual skills, few studies have explored its impact on meaning-focused skills like listening comprehension. Yet accurate perception is key to constructing mental representations and retrieving word meanings in successful listening comprehension (Hulstijn, 2001; Perfetti et al., 2005). Supporting this claim, Ke and Wang (2021) showed that learners’ phonological processing skills explain nearly half of the variance in their L2 listening performance, and Choe et al. (2020) reported that early phonemic awareness instruction significantly enhances EFL learners’ listening comprehension. These findings suggest HVPT may contribute to broader listening skills beyond sound discrimination. However, existing HVPT studies have predominantly focused on adult learners, leaving young EFL learners largely underrepresented. To address this gap, the present study explores whether HVPT with explicit phonetic instruction improves not only learners’ L2 perception but also their listening comprehension skills in meaningful contexts. It addresses the following research questions:
  • To what extent does the blended phonetic training program with HVPT features enhance EFL children’s L2 speech perception and listening comprehension?
  • Does this intervention have differential effects on learners with varying levels of listening proficiency in terms of listening comprehension? If so, to what extent?

1.1. L2 Speech Model and L2 Perception

L1 significantly influences L2 learners’ ability to perceive L2 sounds. According to the SLM (Flege, 1995), L2 sounds that are acoustically close to learners’ L1 categories pose greater perceptual challenges than entirely new L2 sounds (Flege & Bohn, 2020). The PAM (Best, 1995; Best & Tyler, 2007) similarly argues that L2 sounds are mapped onto the closest L1 phonetic categories. The L2 Linguistic Perception model (L2LP; Escudero, 2005) adds a developmental perspective. Whereas the SLM and PAM primarily describe categorical perception patterns, the L2LP model explains how learners gradually move from L1-influenced perception to more target-like L2 perception through exposure and input-driven learning.
The SLM suggests that L1 and L2 phonetic systems coexist in a shared phonological space, with new L2 categories forming only when sufficiently distinct from existing L1 categories (Flege, 1987). Without such distinctiveness, L2 phonemes merge into L1 categories, potentially impairing both perception and production accuracy (Flege & Bohn, 2020). The revised model (SLM-r) enhances this perspective by emphasizing meaningful sensory input, both auditory and visual, acquired through authentic L2 interactions (Flege & Bohn, 2020). Such input may be difficult to obtain in EFL contexts, where limited exposure to authentic linguistic input can lead learners to rely heavily on L1-based perceptual strategies (Flege & Bohn, 2020). Therefore, L2 sounds similar to learners’ L1 categories require specialized and careful perceptual training. From this perspective, this study provides explicit phonetic information supported by anatomical visual materials showing vocal fold vibration, tongue position, and lip movements. With these visual inputs, abstract perceptual learning may become more concrete, facilitating more accurate L2 speech perception.

1.2. High Variability Phonetic Training in L2 Acquisition

High Variability Phonetic Training (HVPT) is a perceptual training technique in which learners identify L2 sounds in stimuli spoken by multiple talkers and in various phonetic contexts, with immediate feedback (Thomson & Derwing, 2015). Studies have validated the effect of HVPT in enhancing perception of foreign sounds (Carlet & Cebrian, 2022; Iverson & Evans, 2009; Lambacher et al., 2005; Lee & Hwang, 2016). Some studies have demonstrated that HVPT leads to sustained improvements that generalize to untrained stimuli and voices and to extended discourse contexts (Carlet, 2017; Hirata, 2004). For instance, Hirata (2004) found that learners trained on Japanese phoneme contrasts were able to identify target words spoken by new voices within carrier sentences. Similarly, Carlet (2017) emphasized that perceptual improvements gained through HVPT could extend beyond the specific tokens and talkers used during training.
A recent meta-analysis has underscored the moderating role of learner proficiency in the effectiveness of HVPT. Mahdi and Mohsen (2023), synthesizing L2 pronunciation training studies, found that advanced learners benefited more from HVPT (g = 0.91) than beginners (g = 0.57), especially when the training targeted segmental features. These findings imply that higher-level learners’ phonological awareness and refined perceptual systems may enable them to extract more from HVPT stimuli. Informed by this perspective, the present study investigates learner proficiency as a moderating variable, examining whether HVPT outcomes vary across different proficiency subgroups.
A persistent challenge in phonetic training is its limited ecological validity. As Mora and Mora-Plaza (2023) noted, many phonetic training programs are pedagogically decontextualized and may not align with meaning-oriented classroom practices. Mora and Mora-Plaza (2023) showed that pronunciation training with task-based classroom activities targeting difficult vowel contrasts led to improvements in segmental accuracy. However, these gains did not generalize to sentence-level listening comprehension. This suggests that form-focused perceptual gains may not automatically transfer to communicative contexts. In addition, many foreign language teachers remain reluctant to adopt phonetic training approaches because evidence of their effectiveness in real classroom settings is limited. The current study addresses a gap in a literature that, with few exceptions, has dealt with adult learners, and demonstrates how developmentally appropriate phonetic instruction can be meaningfully integrated into an elementary school curriculum.

1.3. From Perception to Comprehension: Effects of HVPT on L2 Listening

Phonological processing is the ability to convert speech sounds into recognized linguistic units and meanings. Having accurate L2 perception is a prerequisite for effective phonological processing (Field, 2003; Vandergrift & Goh, 2012). While accurate phoneme discrimination is often believed to support lexical access and overall comprehension, this relationship remains debated. Some studies (e.g., Simonchyk & Darcy, 2017; Darcy et al., 2013) report that L2 lexical representations are often shaped by L1 influence rather than native-like targets. However, other recent research, such as Melnik and Peperkamp (2020), shows that perceptual gains from HVPT can extend to lexical processing, suggesting potential for transfer to higher-level language skills. The scope and stability of such effects require further investigation.
Lexical access plays a key role in L2 phonetic learning. Real-word training may facilitate perceptual learning through the activation of top-down lexical knowledge, especially when words are familiar. However, it can also impose a cognitive burden that interferes with L2 perceptual progress. Recent HVPT studies have suggested that training with nonword stimuli can foster more robust phonemic category formation (Thomson & Derwing, 2016; Mora et al., 2022). Nonword stimuli are designed to eliminate lexical priming and direct learners’ attention to acoustic cues, thereby strengthening bottom-up processing (Barrios et al., 2016). Carlet (2019) further demonstrated that perceptual gains from nonword training can generalize to real words. The present study examines whether phonetic instruction based on HVPT principles, using both nonword and real-word stimuli alongside explicit articulatory explanation, can promote generalization to untrained items and ultimately contribute to broader L2 listening comprehension.
Building on this theoretical link between perception and lexical processing, several studies have further examined how phonological processing skills relate to L2 listening comprehension performance. Ke and Wang (2021) explored the relationship between phonological processing skills and L2 listening comprehension in EFL contexts. The results demonstrated that EFL students who excelled in discriminating spoken words significantly outperformed their peers in listening comprehension tasks, and that robust phonological processing skills explain a substantial portion of the variance observed in listening comprehension performance. Conversely, poor phonetic perception might increase learners’ cognitive load, limiting their capacity to understand overall meaning. Effective perceptual training such as HVPT could mitigate these challenges. Zhang et al. (2019) illustrated that Japanese EFL college students significantly improved their listening comprehension scores under acoustic distortion after completing an HVPT intervention designed to handle speech variations and background noise. While these studies were conducted with adult learners, they point to a potential link between perceptual accuracy and listening comprehension.
Furthermore, there is growing evidence that perceptual training can facilitate not only phoneme discrimination but also overall listening comprehension. A recent meta-analysis by Choe et al. (2020) reported a substantial positive effect (Hedges’ g = 3.7) of phonemic awareness instruction on L2 listening comprehension, particularly in young learners. In line with this, Dao et al. (2021) found that Vietnamese EFL university students who received seven sessions of pronunciation instruction covering both segmental and suprasegmental features showed significantly higher listening scores compared to a control group. Although this performance difference did not persist in delayed posttests, self-report data indicated that learners perceived the pronunciation instruction as highly beneficial for improving their listening comprehension skills. The present study extends this line of inquiry to younger EFL learners, addressing the gap in age-specific research and exploring whether similar relationships hold in elementary-level classroom settings. Research also indicates, however, that HVPT alone may not fully maximize comprehension gains. Incorporating multimodal resources, such as audio–visual aids, can further solidify perceptual learning (Aliaga-García & Mora, 2009; Hazan et al., 2005). With these considerations, the current study implements a blended phonetic training program incorporating HVPT features, aiming to enhance young EFL learners’ perceptual accuracy and listening comprehension.

1.4. Phonological Challenges for Korean English Learners

The perception of non-native speech sounds varies across L2 learners depending on the phonological categories established in their L1. According to theoretical models such as the SLM and the PAM, categorical perception plays a critical role in L2 sound acquisition. According to PAM (Best, 1995), the ease or difficulty of L2 sound discrimination depends not only on category overlap but also on specific assimilation patterns such as SC (Single Category), CG (Category Goodness), and UC (Uncategorized), each predicting different perceptual outcomes. Learners also attend to phonetic cues that are most salient in their L1 (Strange, 2011), and when two L2 phonemes exist as allophones of a single L1 category, perceptual differentiation becomes particularly difficult. For example, English /l/ and /ɹ/ are phonemically contrastive, as in ‘light’ vs. ‘right’, but are typically realized as allophones of a single liquid phoneme in Korean. Consequently, Korean learners often struggle to perceive and produce this contrast. Similar challenges are observed with fricatives like /s/ and /θ/, which are frequently assimilated to a single Korean category (Jeon, 2005).
Korean learners typically do not map English consonants in a one-to-one manner (Lee, 2023). Few English phonemes correspond directly to Korean sounds; most require multiple Korean categories or adjustments in both place and manner of articulation. This reflects a high degree of similarity among competing L1 categories, potentially leading to perceptual overlap and categorization difficulties.
In addition to inventory mismatches, several English phonemes pose challenges for Korean learners not merely due to articulatory differences, but more importantly because of acoustic similarity and perceptual overlap with existing L1 categories. For instance, sounds such as [θ], [f], [v], [ʒ], and [dʒ] are often substituted with the closest Korean equivalents ([f] with [pʰ], [v] with [p], [θ] with [s] or [t], and [ʒ] and [dʒ] with [tɕ]), not because they are articulatorily unproducible, but because their acoustic cues are either less salient or confusable within the Korean phonological system. These perceptual challenges are influenced by factors such as cue robustness, phonetic markedness, and position within the psychoacoustic space.
Moreover, languages differ in how they employ phonetic cues. For instance, English and Spanish vary in their use of voice onset time (VOT) for stop voicing, with English using positive VOT and Spanish using negative VOT for voiced stops. While both language groups rely on VOT to distinguish contrasts, learners must recalibrate their perceptual boundaries to align with L2 patterns. Similarly, Korean lacks specific manners of articulation found in English, such as voiced fricatives and affricates. Phonemes like /z/, /v/, /ʒ/, and /dʒ/ do not exist in Korean and are typically mapped onto voiceless or lenis counterparts. This limits learners’ ability to perceive and produce fine-grained voicing contrasts. Additionally, English distinguishes voiced and voiceless obstruents primarily through VOT, whereas Korean employs a three-way laryngeal contrast system—lenis, aspirated, and tense—which does not align neatly with the English binary voicing distinction. As a result, Korean learners may face difficulty in accurately perceiving and producing English stop-voicing contrasts.
Orthographic inconsistency in English further complicates learning. The spelling system does not always distinguish consonant sounds, such as /θ/ and /ð/, both represented by the digraph ‘th’ in words like ‘thigh’ and ‘thy’ (Ladefoged & Johnson, 2014). This challenge is compounded by English’s deep orthography, in which sound–letter correspondences are opaque. In contrast, Korean has a shallow orthographic system, which may heighten Korean learners’ awareness of such inconsistencies in English and increase processing difficulty. In the present study, only the voiceless interdental fricative /θ/ was included as a target phoneme, while the voiced counterpart /ð/ was excluded due to its higher articulatory and perceptual difficulty for young learners.
Building on pilot findings and in line with predictions from the SLM (Flege, 1995), this study focuses on English consonants that share partial similarity with Korean consonants. Vowels, which are generally considered more difficult than consonants due to their subtle articulatory distinctions, dialectal variation, and greater auditory tolerance (Jacewicz & Fox, 2012; Nam et al., 2009; Van Ooijen, 1996), were excluded to reduce task complexity for beginner learners. To support learning and instruction, the selected consonants were organized by voicing contrast. The target phonemes include alveolar fricatives (/s, z/), alveolar liquids (/l, r/), bilabial glide (/w/), labiodental fricatives (/f, v/), palatal glide (/j/), palato-alveolar affricates (/tʃ, dʒ/), palato-alveolar fricatives (/ʃ, ʒ/), and the voiceless interdental fricative (/θ/). Among these, /dʒ/, /θ/, and /z/ are considered new phonemes for Korean learners, as no direct equivalents exist in the Korean phonological system.

1.5. Phonetic Training Design Considerations for Young Learners

This study targeted 13 consonants, organized as six contrasts and two individual consonants, including perceptually challenging pairs for Korean learners such as /l–r/, /f–v/, and /tʃ–dʒ/. Training multiple contrasts simultaneously increases cognitive load and necessitates extended exposure to promote stable phonological category formation. As Nishi and Kewley-Port (2007) demonstrated, broader phoneme inventories can foster generalization but typically require a higher number of sessions to yield consistent improvement.
As for the intervention duration, the meta-analysis on perceptual training effects by Uchihara et al. (2024) noted that some HVPT studies lasted for one month or longer (k = 12), with a range from 3 to 45 sessions (M = 11, SD = 8.7). The average session length was 35.77 min (SD = 17.7), and the total training time ranged from 60 to 1125 min (M = 310.64, SD = 204.8). Classic HVPT studies, such as the one carried out by Logan et al. (1991), investigated the perception of English /r–l/ of Japanese EFL adult learners with 15 sessions, and many foundational studies involving single contrasts adopted a similar range.
Although extensive research examined the effects of HVPT for the adult population, studies investigating its effect on young L2 learners have recently emerged (Brekelmans et al., 2024; Giannakopoulou et al., 2017; Lacabex et al., 2014). For younger learners, shorter sessions of around 20 min, distributed over several weeks, have been shown to be more effective than longer, intensive training blocks (Giannakopoulou et al., 2017; Lacabex et al., 2014). For instance, Giannakopoulou et al. (2017) trained eight-year-old children on an English vowel contrast using ten 12 min sessions and reported significant perceptual gains. In line with this evidence, the current study limited each session to 20 min to reduce fatigue and maintain attention.
In addition, research on distributed learning emphasizes that spaced repetition and periodic review lead to more durable learning outcomes than intensive practice (Serrano, 2022). Toyama and Hori (2025) emphasize that blending technology with teacher-led instruction is particularly beneficial in pronunciation training, as it combines the consistency and scalability of digital tools with the adaptability and affective support provided by human instructors. With this consideration, the study incorporated review sessions as part of a blended reinforcement approach to enhance retention and engagement. Specifically, a 19-session hybrid training program was delivered over 3–4 weeks, with each session lasting approximately 20 min. Although longer than typical lab-based HVPT designs, this schedule aligns with previously documented intensive training programs. The schedule was intentionally structured to accommodate the cognitive characteristics and learning needs of elementary-aged EFL learners, combining online homework with classroom-based review sessions for optimal reinforcement.
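The training dose described above can be set against the meta-analytic figures cited from Uchihara et al. (2024). The following is a minimal sketch of that arithmetic, assuming every one of the 19 sessions lasted the full 20 min. The function names are illustrative, and the standardized distance is only a rough descriptive comparison, not an analysis reported in the study.

```python
# Illustrative arithmetic only; all figures are taken from the text above.
SESSIONS = 19              # sessions in the present program
MINUTES_PER_SESSION = 20   # approximate session length in minutes (assumed exact here)

# Reference values reported for Uchihara et al. (2024)
META_SESSIONS_MEAN, META_SESSIONS_SD = 11, 8.7
META_MINUTES_MEAN, META_MINUTES_SD = 310.64, 204.8
META_MINUTES_RANGE = (60, 1125)


def standardized_distance(value: float, mean: float, sd: float) -> float:
    """Distance of `value` from `mean`, expressed in SD units."""
    return (value - mean) / sd


total_minutes = SESSIONS * MINUTES_PER_SESSION
z_minutes = standardized_distance(total_minutes, META_MINUTES_MEAN, META_MINUTES_SD)
z_sessions = standardized_distance(SESSIONS, META_SESSIONS_MEAN, META_SESSIONS_SD)
within_range = META_MINUTES_RANGE[0] <= total_minutes <= META_MINUTES_RANGE[1]

print(f"total training time: {total_minutes} min (z = {z_minutes:.2f})")
print(f"number of sessions: {SESSIONS} (z = {z_sessions:.2f})")
print(f"within reported range of total minutes: {within_range}")
```

Under these assumptions, the program amounts to about 380 min in total, which lies inside the reported 60 to 1125 min range and above the meta-analytic mean for both total time and number of sessions.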

2. Methods

2.1. Participants

The study involved 57 sixth-grade EFL learners, aged 11 to 12 years, from an elementary school in Suwon, South Korea. The participants consisted of 29 boys and 28 girls, none of whom had attended school in an English-speaking country for more than a semester. The participants were reported to have no prior experience with HVPT. The experiment was approved in accordance with institutional ethical standards. Written informed consent was obtained from all participants’ parents or legal guardians.

2.2. Instruments

A listening comprehension test and an odd-one-out phoneme discrimination test were administered as research instruments. First, the listening comprehension test measured participants’ listening comprehension skills. The listening section of the Test of Practical English Language was selected. It was developed by the Korea Occupational Development Evaluation Service and endorsed by the Korean Ministry of Education. The test items reflect the vocabulary of the English curriculum recommended by the Korean Ministry of Education. The listening comprehension test comprised 11 types of questions, including (1) choosing an odd word out; (2) matching spoken sentences to pictures; (3–6) responding to conversations and questions with picture or text-based options; (7) completing conversations with appropriate responses; (8) interpreting visual descriptions; (9) describing pictures; (10) understanding conversation flow; and (11) reading tasks. A full description of the task types and scoring rubric can be found in Appendix A. To minimize memory effects and ensure that test performance reflected learning gains, students took two different but equivalent forms of the listening comprehension test: Version 1 at pretest and Version 2 at posttest. To verify the equivalence of the two forms, 10 students who did not participate in the main experiment were recruited. They took the pretest and posttest forms on two consecutive days. A paired-samples t-test was conducted to compare the difficulty of the two forms. The result was not statistically significant (p = 0.91), indicating no significant difference between the two tests.
Second, the odd-one-out phoneme discrimination test was used to measure L2 perception. In each test item, students heard a sequence of three sounds, two identical and one different, and were instructed to select the odd one out. The test was developed by the researchers, with reference to Vaughan-Rees (2002). It consisted of 32 items, with four questions for each of the six consonant contrasts and an additional four questions for each of the two individual consonants. All test items were recorded in a quiet room by two native English-speaking teachers, resulting in two equivalent sets. The order of answer choices was varied across the two sets to minimize potential bias. Within each item, all three stimuli were produced by a single speaker. The inter-stimulus interval was approximately one second, and the full test took about 20 min to complete. The test was delivered in a paper-based format in the classroom. An instructor played the recorded stimuli aloud, and students marked their responses individually on printed answer sheets. The completed tests were then collected and scored by the instructor. The script of the L2 perception test used in this study is provided in Appendix B.
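The layout of the discrimination test (six contrasts and two individual consonants, four items each, with one odd stimulus among three) can be sketched as follows. This is an illustration under stated assumptions, not the researchers' actual test construction: the category labels are inferred from Sections 1.4 and 3.1, the position of the odd stimulus is randomized here for convenience, and `Item`, `build_test`, and `score` are hypothetical names. The actual stimuli are those given in Appendix B.

```python
import random
from dataclasses import dataclass

# Category labels inferred from the text; the exact pairings are an assumption.
CONTRASTS = ["s-z", "l-r", "f-v", "tʃ-dʒ", "ʃ-ʒ", "θ-s"]
SINGLES = ["w", "j"]
ITEMS_PER_CATEGORY = 4


@dataclass
class Item:
    category: str      # contrast or single consonant being tested
    odd_position: int  # 0, 1, or 2: which of the three stimuli differs


def build_test(seed: int = 0) -> list[Item]:
    """Build a 32-item test: four items for each of the eight categories."""
    rng = random.Random(seed)
    items = [
        Item(category, rng.randrange(3))
        for category in CONTRASTS + SINGLES
        for _ in range(ITEMS_PER_CATEGORY)
    ]
    rng.shuffle(items)
    return items


def score(items: list[Item], responses: list[int]) -> int:
    """Count responses that match the position of the odd stimulus."""
    return sum(1 for item, r in zip(items, responses) if r == item.odd_position)


test_items = build_test()
perfect = [item.odd_position for item in test_items]
print(len(test_items), score(test_items, perfect))
```

Keeping the category label on each item makes it straightforward to break scores down by contrast, as in the per-phoneme results reported later.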
Prior to the main experiment, a pilot study was conducted to identify English phonemes that elementary school learners find particularly difficult. The consonant pairs included in the pilot study are presented in Table 1, and the corresponding script is provided in Appendix C. The pilot participants were 24 third-grade students from the same elementary school who were not part of the experimental group. At the time, they had completed one semester of English instruction under the national curriculum and had been introduced to the full set of English phonics.
The target phonemes were determined based on the results of the pilot test (Figure 1). The top six pairs were selected. Although /m/, /n/, and /h/ also showed relatively high error rates, they were excluded from the analysis because they correspond clearly to distinct Korean consonants (/m/, /n/), allowing for a one-to-one mapping with minimal risk of categorical confusion; /w/ was selected instead. Some phonemes were paired according to shared phonetic features. Table 2 shows the target consonants for the main experiment.

2.3. Experiment Materials

The intervention consisted of 19 hybrid sessions in total. Thirteen target consonants were taught, sequenced according to perceptual difficulty, starting with the more challenging contrasts. The instructional design followed a blended learning approach, where new phonemes were introduced through online sessions and subsequently reinforced through in-class activities. The instruction was conducted in four separate classes, with each class consisting of 14 to 15 students.
Each online session consisted of two core components. The first component involved explicit instruction on phonetic features, lasting approximately 10 min. The first author delivered the explanations and listening exercises that supported the perceptual training drills. None of the drills required speaking practice from students. Using the Juna Accent Coach (Bartholomew, 2024) mobile application, students were taught each phoneme through auditory input and visual cues. The instructor explained articulatory features such as tongue position, lip shape, and voicing. To further support perceptual development, the web resource Sounds of Speech (The University of Iowa, 2014) was used to visualize articulatory anatomy. These instructional materials were uploaded as online learning videos prior to each lesson. To ensure training fidelity, the researcher monitored each learner’s engagement and participation in the online sessions by tracking individual video completion rates.
Next, HVPT features were introduced. HVPT was assigned as homework using the English Accent Coach (Thomson, 2024), a web-based tool that provides HVPT with perception drills in the form of forced-choice identification games using IPA symbols. Students completed 40 HVPT items per session—20 items with a fixed vowel (/a/) and 20 items with varied vowel contexts—and submitted their result pages to the researchers. The platform provided stimuli from diverse talkers and immediate feedback by indicating whether their answers were correct or incorrect after each trial, allowing learners to monitor their own performance during practice.
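The structure of one homework session (40 forced-choice identification trials, half with the fixed vowel and half with varied vowels, spoken by several talkers, with feedback after every trial) can be sketched as below. This is not the English Accent Coach implementation or its API. The talker labels, the set of varied vowels, and the function names are hypothetical placeholders chosen for illustration.

```python
import random

FIXED_VOWEL = "a"                                            # fixed vowel context from the text
VARIED_VOWELS = ["i", "u", "e", "o"]                         # hypothetical vowel contexts
TALKERS = ["talker_1", "talker_2", "talker_3", "talker_4"]   # hypothetical talker labels


def build_session(targets, n_fixed=20, n_varied=20, seed=0):
    """Return a list of trials: n_fixed with the fixed vowel, then n_varied with varied vowels."""
    rng = random.Random(seed)
    trials = []
    for i in range(n_fixed + n_varied):
        vowel = FIXED_VOWEL if i < n_fixed else rng.choice(VARIED_VOWELS)
        trials.append({
            "target": rng.choice(targets),   # consonant the learner must identify
            "vowel": vowel,                  # vowel context of the stimulus
            "talker": rng.choice(TALKERS),   # high variability: talker changes across trials
        })
    return trials


def give_feedback(trial, response):
    """Immediate feedback: report only whether the response was correct."""
    return "correct" if response == trial["target"] else "incorrect"


session = build_session(targets=["l", "r"])
first = session[0]
print(first, give_feedback(first, first["target"]))
```

Drawing the talker anew on every trial is what gives the drill its high variability; a low-variability design would hold the talker constant.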
During the class, review sessions were conducted for reinforcement. The sessions began with a Q&A segment to address questions from the video lessons, followed by instructor-led explanations and modeling. The participants subsequently engaged in a 10 min collective HVPT session in a group setting. The instructor played sample items aloud, and students responded individually by circling target phonemes on their worksheets. They completed 40 identification items for the pairs. Feedback was provided immediately after each item, with the correct answer revealed right after completion. Worksheets were designed to allow students to self-mark: a circle (O) indicated a correct answer and a cross (X) an incorrect one. This repeated exposure and self-monitoring supported learners’ perceptual development in a structured yet learner-driven environment.

2.4. Procedures

Before the lessons, participants completed pretests on L2 perception and listening comprehension. Nineteen instructional sessions were then conducted, followed by posttests on the final two days. The collected data were quantitatively analyzed.
To explore participants’ perceptions of the lessons, a post-survey comprising 15 items was administered, assessing learners’ attitudes, preferences, confidence, motivation, and intentions for future use. The survey items were designed with reference to the core constructs of the Integrative Model of Technology Acceptance (Fagan et al., 2008). Of the 57 participants, 34 voluntarily completed the online survey, which was conducted after the conclusion of the experiment and was not part of the regular coursework. All responses were collected in Korean, the participants’ first language, and translated into English by the researchers.
To gain deeper insights into learners’ experiences, individual in-depth interviews were conducted with six students—two from each proficiency group—who demonstrated the most notable improvements. The interviews were conducted in Korean via the video conferencing platform Zoom and focused on learners’ perceptions of the lessons, particularly the factors contributing to their progress. Transcripts were translated into English and reviewed by two bilingual English teachers to ensure accuracy and consistency. The findings were triangulated with quantitative data to provide a comprehensive understanding of participants’ experiences and attitudes toward the blended phonetic training program with HVPT features. Table 3 illustrates the lesson procedures of the main experiment.
In the training schedule, each target consonant was allocated one instructional day. The order began with perceptually difficult contrasts (e.g., /l/–/r/), but the number of training days per consonant was consistent across all targets. The variations in review session groupings reflected pedagogical considerations rather than differences in training intensity. Specifically, /tʃ/, /dʒ/, /ʃ/, and /ʒ/ were reviewed together because they share a palato-alveolar place of articulation (as affricates and fricatives) and form voiced–voiceless pairs. This grouping was intended to facilitate more efficient teaching and to reinforce contrastive features in a single instructional session.

3. Results

3.1. Effects on L2 Perception

To address the first research question, L2 perception pretest and posttest scores were analyzed. Table 4 presents the descriptive statistics and paired sample t-test results. A statistically significant improvement was found from pretest to posttest with a small effect size. These results suggest that the blended phonetic training program with HVPT features had a small but positive effect on learners’ L2 consonant perception in EFL contexts.
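As a rough illustration of the analysis described above, the following sketch computes a paired-sample t statistic and one common paired effect size (d_z, the mean difference divided by the standard deviation of the differences). The article does not state which variant of Cohen's d was used, so the choice of d_z is an assumption, as are the function name `paired_t_and_d` and the example scores, which are hypothetical and not data from the study.

```python
import math
import statistics


def paired_t_and_d(pre, post):
    """Return (t, df, d_z) for paired pretest/posttest scores.

    d_z is only one of several effect-size variants for paired designs;
    it is used here as an assumption, not as the study's method.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must have the same length (n >= 2)")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))  # paired-sample t statistic
    d_z = mean_diff / sd_diff                 # standardized mean difference
    return t, n - 1, d_z


# Hypothetical scores for five learners (not data from the study).
pre_scores = [10, 12, 9, 14, 11]
post_scores = [12, 13, 11, 15, 14]
t_stat, df, d_z = paired_t_and_d(pre_scores, post_scores)
print(f"t({df}) = {t_stat:.2f}, d_z = {d_z:.2f}")
```

The p-value would then be read from a t distribution with n − 1 degrees of freedom, which a statistics package normally handles.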
Figure 2 illustrates L2 perception test results by phoneme over time. Participants made the most errors on the dental–alveolar fricative pair, /θ/ and /s/. The alveolar fricatives, /s/ and /z/, were the second most difficult pair. Improvement varied across consonants. Students distinguished alveolar /l/ and /r/ better in the posttest, and there were also improvements in the palato-alveolar fricatives, /ʃ/ and /ʒ/, and the palato-alveolar affricates, /tʃ/ and /dʒ/. Yet the students still found the dental–alveolar fricative pair, /θ/ and /s/, the most difficult to discriminate.

3.2. Effects on Listening Comprehension

The results of the listening comprehension pretest and posttest were analyzed to examine effects of the blended phonetic training program with HVPT features. As shown in Table 5, participants’ scores significantly increased with a small-to-medium effect. These findings indicate that the intervention contributed to notable improvements in students’ listening comprehension skills in EFL contexts.

3.3. Comparison of the Effects by Proficiency Groups

To assess the impact of the blended phonetic training program with HVPT features on listening comprehension skills with respect to proficiency levels, students were divided into three groups according to their pretest scores on the listening comprehension test. The test scores were converted into standardized (z) scores. Following Unsworth (2005), the participants were grouped as follows: a low-proficiency group with z-scores below −0.5, an intermediate group with z-scores between −0.5 and 0.5, and a high-proficiency group with z-scores above 0.5. Table 6 displays the descriptive results for each proficiency group. The raw data for the proficiency distribution of all participants are provided in Appendix D.
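The grouping rule described above (z-score thresholds of −0.5 and 0.5) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `assign_levels` is hypothetical, the use of the sample standard deviation is an assumption (the article does not say which formula was used), and the example scores are made up rather than taken from the study.

```python
import statistics


def assign_levels(scores, low_cut=-0.5, high_cut=0.5):
    """Standardize raw scores and label each as Low, Mid, or High.

    z < low_cut -> "Low"; z > high_cut -> "High"; otherwise "Mid".
    Assumption: z-scores use the sample SD (n - 1 denominator).
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    labelled = []
    for score in scores:
        z = (score - mean) / sd
        if z < low_cut:
            level = "Low"
        elif z > high_cut:
            level = "High"
        else:
            level = "Mid"
        labelled.append((score, round(z, 2), level))
    return labelled


# Hypothetical pretest scores (not data from the study).
for score, z, level in assign_levels([10, 20, 30, 40, 50]):
    print(score, z, level)
```

Because the cut-offs are strict inequalities, a learner whose z-score is exactly −0.5 or 0.5 falls into the intermediate group under this sketch.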
All proficiency groups improved in the posttest relative to the pretest. Every student scored higher in the posttest, except for a single student in the intermediate group who scored 8 points lower. Notably, the improvement of the low-proficiency group was substantially greater than that of the intermediate and high-proficiency groups. Every student in the low-proficiency group gained from pretest to posttest, with gains ranging from 4 to 59 points. Figure 3 illustrates the mean differences by proficiency group with line graphs. The line for the low-proficiency group had the steepest slope among the three groups.
Next, paired sample t-tests were carried out within each group to investigate the varying impact of the blended phonetic training program with HVPT features on listening comprehension skills. Table 7 shows the results of these paired sample t-tests on the listening comprehension tests for the three proficiency groups.
All groups significantly improved listening comprehension skills in the posttest compared to the pretest. The results indicate that the low-proficiency group showed a substantial improvement in listening comprehension scores, with a large effect size (d = 2.26), exceeding those of the other groups. The intermediate group also demonstrated a large effect (d = 1.74), while the high-proficiency group showed a small-to-medium effect (d = 0.87).

3.4. Relationship Between L2 Perception and Listening Comprehension

A Pearson correlation analysis was performed to explore the relationship between L2 perception and listening comprehension skills. Table 8 presents the correlations between all participants’ L2 perception and listening comprehension scores in the pre- and posttests.
Table 8 shows significant positive correlations between L2 perception and listening comprehension skills, and the correlation is stronger in the posttest. This pattern suggests that the intervention strengthened the relationship between L2 perception and listening comprehension skills. Scatter plots were generated to examine the correlations between L2 perception and listening comprehension skills across all proficiency groups (Figure 4).
A positive relationship was evident in both scatter plots, with a steeper regression line in the posttest than in the pretest. To probe the correlation between the variables within each proficiency group, three further correlation analyses were performed. Table 9 presents the correlations within the high-, intermediate-, and low-proficiency groups.
In the high-proficiency group, the phoneme awareness pretest scores were not significantly correlated with the listening comprehension pretest scores (r = 0.566, p = 0.069), narrowly missing the 0.05 significance level. No statistically significant correlation was found between the variables in the posttest either. The intermediate group likewise showed non-significant correlations in both the pretest and the posttest. The low-proficiency group differed from the other two groups: there was no significant correlation at the 0.05 level in the pretest, but after the intervention there was a significant correlation at the 0.01 level in the posttest. In sum, the low-proficiency group exhibited the strongest correlation between L2 perception and listening comprehension skills.
Figure 5 shows scatter plots illustrating the correlation between the variables. In the pretest, the high-proficiency group has the steepest trend line among the groups. In the posttest, a positive correlation between the two variables is fairly evident in all groups. The low-proficiency group has a steep trend line and a statistically significant correlation (p = 0.002). This highlights the emergence of a correlation between the two variables in the low-proficiency group.
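For readers who wish to reproduce correlations of the kind reported in this section, the sketch below computes a Pearson coefficient from paired scores and converts it into the t statistic conventionally used to test its significance. The function names `pearson_r` and `r_to_t` are hypothetical, the example scores are invented, and nothing here is taken from the authors' analysis scripts.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (n >= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("requires n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical perception and listening scores (not data from the study).
perception = [20, 24, 25, 28, 30, 33]
listening = [31, 40, 38, 52, 49, 60]
r = pearson_r(perception, listening)
print(f"r = {r:.3f}, t = {r_to_t(r, len(perception)):.2f}")
```

With small subgroups, a sizeable r can still fail to reach significance, which is consistent with the pattern described for the high-proficiency group.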

3.5. Post-Survey and Interview Results

To analyze the students’ performance on the listening comprehension tests in more depth, a post-survey questionnaire was administered. Thirty-four participants (58.6%) responded to the survey. Afterwards, six students, two from each proficiency level, were invited to individual interviews. Table 10, Table 11, and Table 12 present the survey results, and Table 13 presents the interview responses.
Most respondents appeared to be satisfied with the lessons. Notably, S5, the student with the largest improvement on the listening comprehension test, mentioned that his view of the importance of English pronunciation changed after the intervention.

4. Discussion

4.1. Effects on L2 Perception and Listening Comprehension

The first research question investigates the effectiveness of the blended phonetic training program incorporating HVPT features on EFL elementary school learners’ L2 perception and listening comprehension skills. The results showed meaningful gains in both areas, suggesting the effectiveness of the interventions.
First, learners’ L2 perception scores improved significantly following the intervention, with a small effect size (d = 0.26). The improvement, despite differences in the voices used for training and testing, suggests that learners generalized their perceptual gains to unfamiliar talkers. This generalization indicates the development of more flexible phonemic categories and reflects robust phonetic learning, consistent with prior findings (Carlet, 2017; Hirata, 2004). The systematic instruction in articulatory features and exposure to varied acoustic input likely contributed to the development of L2 perception.
Second, the listening comprehension scores also increased significantly after the intervention, with a small-to-medium effect size (d = 0.67). The improvement may be attributed to learners’ increased attention to phonetic cues and sound patterns, which enabled more accurate bottom-up processing of auditory input (Kuhl et al., 2005). These results align with previous studies showing that accurate phoneme perception supports automated processing of low-level input, thereby freeing cognitive resources for higher-level comprehension (Dao et al., 2021; Li et al., 2012; Lund, 1991). The qualitative feedback from participants in the current study further supports this interpretation: many learners reported being able to understand “confusing pronunciations” and described learning about “mouth shape and tongue shape”, indicating improved articulatory awareness and phonemic sensitivity.
While the results suggest overall effectiveness of the intervention, it is important to consider the methodological implications of assigning HVPT sessions as homework. The decision to assign HVPT as homework was driven by practical constraints and the goal of maximizing learner exposure. However, this approach may have introduced variability in engagement and training fidelity, particularly among young learners. To mitigate this, all students completed the same standardized tasks, and the researcher monitored video completion rates to ensure training fidelity. Furthermore, review sessions were held in class to reinforce learning and address any uncertainties from the at-home training. This hybrid design reflects a balance between ecological validity and instructional control in a real-school context.
When comparing gains by question type during the listening comprehension test, it was revealed that learners made the greatest gains in visually supported comprehension tasks, such as ‘describing a picture’ or ‘choosing an image that matches a spoken description’. Approximately 44% of participants answered one additional item correctly in each visual task type on the posttest. These tasks required the learners to interpret auditory cues in relation to visual information, such as character actions or object states. Given that most participants were beginner to intermediate learners, these findings suggest that visual scaffolding can significantly support listening comprehension, as also demonstrated in Latifi et al. (2014).
The patterns across phoneme categories revealed notable differences in learners’ L2 perceptual development. The greatest gains were observed for palato-alveolar affricates (/tʃ, dʒ/), fricatives (/ʃ, ʒ/), and the /l/–/r/ contrast. Except for /dʒ/, which is a completely new sound absent from Korean, these sounds function either as non-contrastive allophones or as similar sounds in the Korean phonological system. Korean lacks a phonemic distinction between /l/ and /r/, as both are mapped onto the single liquid phoneme /l/, which alternates between flap and lateral realizations. While /tʃ/ and /ʃ/ have partial matches in Korean, differences in articulatory detail and voicing result in incomplete transfer. According to the SLM (Flege, 1995), L2 sounds that are perceived as similar to L1 sounds are more difficult to acquire than entirely new sounds, due to category assimilation. The current findings partially support this theory: learners found these phonemes difficult at the beginning but showed the largest progress after the intervention. Allophones or similar sounds that partially overlap with L1 categories make L2 learners more susceptible to assimilation and perceptual confusion. The fact that learners made the greatest gains in these sounds suggests that the blended phonetic training program with HVPT features may be particularly effective in helping learners overcome perceptual equivalence and restructure overly broad L1 categories. Yet the learners struggled with /θ/, a phoneme typically classified as new for Korean learners. This divergence may stem from the articulatory difficulty and orthographic ambiguity of /θ/, which complicate both its recognition and production despite its novelty.
These results can be interpreted in light of the Functional Load Theory (FLT; Catford, 1987), which ranks phonemic contrasts by the communicative weight they carry. For instance, /l/ and /r/, a notoriously challenging pair for Korean learners, carry a high functional load in English (83%) and are crucial for intelligibility (Kang & Moran, 2015). FLT does not claim that high-load contrasts are inherently difficult but suggests their pedagogical prioritization due to communicative relevance. The participants’ improved ability to distinguish /l/ and /r/ suggests not only that their L2 perception improved but also that they could better perceive functionally significant phoneme contrasts.
In sum, participants demonstrated perceptual gains in phonemes that are absent or non-contrastive in Korean, such as /tʃ, dʒ/, /ʃ, ʒ/, and /l, r/, underscoring the value of targeted phoneme instruction. However, persistent difficulties with phonemes like /θ/, /s/, and /z/ suggest that longer-term and repeated exposure to such contrasts is needed in EFL instruction.
While the present findings support the effectiveness of the blended phonetic training, one important methodological limitation should be acknowledged: the absence of a control group. This decision was made based on ethical and practical considerations, given the classroom-based context and the age of the participants. Notably, a critical synthesis of 32 HVPT studies reported that several also proceeded without control groups under comparable conditions (Thomson, 2018). In this light, the findings remain meaningful, though they should be interpreted with contextual awareness.

4.2. Effects Across Proficiency Groups

The second research question examined the differential effects of L2 perception instruction on listening comprehension across proficiency groups. The findings suggest that learners in the low-proficiency group benefited the most from the intervention. These learners might have had more urgent needs and more room for growth in bottom-up perceptual processing. The results align with previous research suggesting that explicit pronunciation and perception instruction supports bottom-up processing in L2 listening (Kissling, 2018; Li et al., 2012; Saito & Plonsky, 2019). Li et al. (2012) noted that beginner learners benefit greatly from improved access to phoneme-level cues. Kissling (2018) also showed that pronunciation-focused instruction can improve learners’ ability to decode speech at the segmental and suprasegmental levels. Similarly, Saito and Plonsky (2019), through a meta-analysis, confirmed that pronunciation instruction has a moderate and reliable impact on L2 learners’ comprehensibility and fluency. Collectively, these findings reinforce the importance of integrating L2 perception training into listening instruction, particularly for learners with lower proficiency levels.

4.3. The Role of L2 Perception in Enhancing Listening Comprehension

Regarding the relationship between L2 perception and listening comprehension skills, the two variables were found to be interrelated. This result implies that EFL learners need English lessons that enhance their L2 perception. Moreover, in the posttest, the correlation coefficient between the variables is 0.479, which corresponds to a coefficient of determination of 0.229. That is, approximately 22.9% of the variance in one variable can be explained by the other, a substantial level of explanatory power. It appears that the blended phonetic training program incorporating HVPT features further strengthened the correlation between L2 perception and listening comprehension skills. In EFL contexts where learners have less exposure to the sounds of the target language, learners may have difficulty discriminating and listening to English sounds. The ability to notice individual sounds might have aided in understanding general messages in L2 speech, which aligns with previous research (Lund, 1991) suggesting that listeners often face challenges in understanding individual words. Improved L2 perception could enhance word-level comprehension, helping learners to interpret the limited auditory input more effectively.
The effects differed across proficiency groups. The correlation in the low-proficiency group increased in the posttest (r = 0.691, p = 0.002) compared to the pretest (r = −0.122, p = 0.64). This suggests that the blended phonetic training program with HVPT features consolidated the relationship particularly for the low-proficiency group. These figures suggest that the lower-level learners benefited the most from the blended phonetic training.
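The shared-variance figures discussed in this section follow from squaring the reported correlation coefficients. The worked arithmetic below uses only the r values given in the text; the subscript labels are introduced here for illustration, and the block assumes the amsmath package.

```latex
\begin{align*}
r^2_{\text{all, post}} &= 0.479^2 \approx 0.229 \\
r^2_{\text{low, pre}}  &= (-0.122)^2 \approx 0.015 \\
r^2_{\text{low, post}} &= 0.691^2 \approx 0.477
\end{align*}
```

On this reading, the variance shared by the two measures in the low-proficiency group rose from roughly 1.5% in the pretest to roughly 47.7% in the posttest, although with a small subgroup these estimates are imprecise.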

4.4. Reflections from Student Survey and Interviews

The student surveys and interviews provided valuable insights into learners’ subjective experiences and awareness of L2 sounds. The data from respondents, who accounted for approximately 60% of the total participants, revealed several meaningful patterns regarding students’ awareness and perception of English phonemes. Across all groups, respondents identified the voiceless dental fricative /θ/, which has no exact matching phoneme in Korean, as the most difficult phoneme. Korean lacks an interdental place of articulation and fricatives at this place, so learners often substitute /θ/ with phonetically similar Korean sounds such as [s], [t], or [tʰ]. These substitutions, however, differ both in place and manner of articulation. Due to its unfamiliar articulatory configuration, /θ/ might be one of the most difficult sounds for Korean learners to perceive and produce accurately. This was confirmed by participants’ self-assessments (Table 12) and their actual performance on the L2 perception tests. Due to L1 influence, discriminating the foreign sound might have been a major challenge for the learners, consistent with the assumptions of the SLM and the PAM (Best, 1995; Flege, 1995). In addition, the phonemic symbol /θ/ does not map one-to-one onto English spelling (i.e., ‘th’). As both the voiced dental fricative /ð/ and the voiceless dental fricative /θ/ can be spelled ‘th’, young EFL students may find the sound hard to process. Some discrepancies were also observed between perceived and actual difficulties. While respondents perceived /θ/ as the most difficult, they performed more poorly on fricatives like /f/, /v/, /s/, and /z/, indicating a gap between their awareness and actual perceptual ability. Similarly, they found /l/ and /r/ difficult to discriminate, likely due to their allophonic status in Korean. These findings suggest the importance of diagnosing hidden difficulties through formal testing, rather than relying on self-perception alone.
Despite these challenges, many respondents across all proficiency groups evaluated the lessons positively. The interviewees expressed that they enjoyed learning about articulation and reported increased motivation and awareness of English pronunciation. Notably, students in the lower proficiency group expressed strong interest in continuing with similar computer-assisted language learning. These findings underscore the value of integrating L2 perception training with explicit instruction and playful, engaging techniques like HVPT.

5. Conclusions and Implications

The current study aimed to examine the effectiveness of the blended phonetic training program with HVPT features in improving L2 speech perception and listening comprehension among Korean elementary EFL learners. The results demonstrated statistically significant improvements in both L2 perception and listening comprehension after four weeks of the intervention. These findings are consistent with previous research emphasizing the pivotal role of L2 perceptual development in enhancing overall English proficiency (Choi, 1988; Chung & Ahn, 2000; Li et al., 2012). Furthermore, the intervention was particularly beneficial for low-proficiency children, defined in this study as learners whose standardized pretest listening score fell below −0.5. This indicates that the intervention has differential effects and may be especially effective when implemented at early stages of L2 learning.
The findings also suggest that computer-assisted HVPT is not only effective but also feasible in real classroom settings. Learners showed increased motivation and reported enhanced confidence in their English abilities following the intervention. In EFL elementary English education contexts such as Korea, where communicative competence and oral interaction are emphasized, the intervention can serve as a necessary input enhancement tool, compensating for the lack of exposure to authentic spoken English input. The observed correlation between perception and listening comprehension further highlights the foundational role of L2 perception in successful comprehension at the discourse level.
From a pedagogical perspective, the implications of this study are fourfold. Firstly, since the blended phonetic training program with HVPT features significantly enhanced both L2 perception and listening comprehension skills, it should be systematically integrated into EFL curricula, particularly in the early stages of language education. This aligns with the objectives of public elementary education, which aim to provide inclusive and equitable access to essential language skills for all students. Secondly, the findings indicate that such instruction can positively influence learner affective factors. Many participants reported increased confidence in their English abilities, suggesting that perception-based training can help build learners’ self-efficacy and lower affective filters, including among learners at lower proficiency levels. Thirdly, the current study highlights the importance of careful phoneme selection in instructional design. A detailed contrastive analysis between the learners’ L1 and the L2 can help identify perceptually difficult sounds. In this study, participants struggled to perceive L2 phonemes that are similar to L1 sounds but have no exact match in the L1. Addressing such contrasts can lead to more effective perception training. Lastly, when designing L2 perception training interventions, referring to FLT (Catford, 1987) may further enhance instructional effectiveness. By prioritizing phonemes that are more crucial for intelligibility, educators can provide focused training on the most communicatively significant sound contrasts.
In sum, the current study underscores the pedagogical potential of the blended phonetic training program with HVPT features in elementary English education in EFL settings. By addressing when and how HVPT can be introduced in L2 instruction, it extends previous research largely centered on adults and fills a gap by focusing on young learners. The findings suggest that such an intervention can enhance learners’ L2 perception and listening skills while also boosting their motivation and confidence in L2 learning.
While the current design includes core HVPT components such as high-variability input in various phonetic contexts, its application in an ecologically valid classroom setting requires certain adaptations. One possible direction for future implementation is to incorporate immediate feedback mechanisms more systematically. For instance, integrating mobile-assisted pronunciation training (MAPT) tools could provide real-time corrective feedback, allowing for more individualized and responsive learning experiences. In addition, the absence of a control group represents a methodological limitation that should be considered in interpreting the findings. Given the practical constraints of school-based research with young learners, this design choice reflects common challenges in real-world educational contexts. The present findings, therefore, offer pedagogical value within those constraints and contribute to expanding the scope of HVPT research to early L2 education. Future studies that adopt more rigorous experimental controls, including the use of comparable instructional groups and more tightly structured in-class treatment, would help strengthen the validity and generalizability of the findings.

Author Contributions

Conceptualization, K.L. and H.A.; Methodology, K.L. and H.A.; Software, K.L.; Validation, K.L. and H.A.; Formal analysis, K.L.; Investigation, K.L.; Resources, K.L.; Data curation, K.L.; Writing—original draft, K.L.; Writing—review & editing, K.L. and H.A.; Visualization, K.L.; Supervision, H.A.; Project administration, K.L. and H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of Seoul National University (IRB No. 2009/003-037, date of approval: 19 November 2020).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data are available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Question Types for Listening Comprehension Test

Type | Question # | Score
1. Listen to the words and choose an odd one. | 2 | 4
2. Listen to the sentence and choose a picture that matches. | 3 | 6
3. Listen to conversations and answer questions (picture option). | 4 | 8
4. Listen to conversations, questions and answer the questions (picture option). | 5 | 20
5. Listen to conversations and choose a picture. | 2 | 6
6. Listen to conversations and answer questions (text option). | 5 | 12
7. Choose the appropriate response to complete the conversation. | 6 | 20
8. Understand the description of the visual material. | 2 | 6
9. Describe the picture. | 1 | 4
10. Understand the natural flow of conversation. | 3 | 9
11. Read relatively long conversations and answer questions (two questions for one conversation). | 2 | 8
Total | 33 | 103
Note. Question types are the same for the listening pretest and posttest.

Appendix B. L2 Perception Test Scripts Used in the Main Study

Teacher A:
1. /f, v/
(1) fan—van—fan
(2) vile—file—file
(3) proof—prove—proof
(4) fest—vest—vest
2. /s, θ/
(5) sing—thing—thing
(6) theme—theme—seem
(7) mouth—mouth—mouse
(8) worth—worse—worth
3. /s, z/
(9) seal—zeal—zeal
(10) scion—zion—scion
(11) sip—zip—zip
(12) bees—beez—bees
4. /ʃ, ʒ/
(13) assure—azure—assure
(14) pleasure—pressure—pressure
(15) rish—rish—ridge
(16) shop—jop—shop
5. /tʃ, dʒ/
(17) batch—batch—badge
(18) jeep—cheap—jeep
(19) jump—chump—chump
(20) jam—cham—jam
(21) jeer—cheer—jeer
(22) ridge—rich—ridge
6. /y/
(23) year—ear—year
(24) east—yeast—east
(25) yes—yes—es
(26) be-ond—beyond—beyond
7. /w/
(27) ooze—woos—ooze
(28) wink—wink—ink
(29) wood—ood—ood
(30) wet—et—wet
(31) wool—ool—ool
8. /l, r/
(32) Raw—law—raw
(33) lead—read—lead
(34) rink—link—link
(35) wait—ate—wait
Teacher B:
1. /f, v/
(1) van—fan—fan
(2) vile—vile—file
(3) prove—prove—proof
(4) vest—fest—vest
2. /s, θ/
(5) thing—sing—thing
(6) theme—seem—seem
(7) mouse—mouth—mouse
(8) worse—worse—worth
3. /s, z/
(9) zeal—seal—zeal
(10) scion—zion—zion
(11) sip—zip—zip
(12) bees—beez—bees
4. /ʃ, ʒ/
(13) azure—assure—assure
(14) pleasure—pressure—pleasure
(15) ridge—rish—ridge
(16) shop—shop—jop
5. /tʃ, dʒ/
(17) badge—batch—badge
(18) cheap—cheap—jeep
(19) jump—jump—chump
(20) jam—jam—cham
(21) cheer—jeer—jeer
(22) ridge—rich—rich
6. /y/
(23) year—ear—ear
(24) east—yeast—yeast
(25) es—yes—yes
(26) beyond—beyond—be-ond
7. /w/
(27) ooze—ooze—woos
(28) wink—ink—ink
(29) ood—ood—wood
(30) et—wet—wet
(31) wool—wool—ool
8. /l, r/
(32) Raw—Raw—law
(33) Lead—lead—read
(34) Rink—link—rink
(35) lay—ray—ray
Book Source: Vaughan-Rees, M. (2002).

Appendix C. L2 Perception Test Scripts Used in the Pilot Study

Teacher A:
1. /p, b/: pay—bay—bay/boast—post—boast
2. /t, d/: trip—trip—drip/down—town—down
3. /c, g/: came—game—came/clue—clue—glue
4. /f, v/: few—view—view/fast—fast—vast
5. /s, z/: sing—sing—zing/zoo—sue—sue
6. /ʃ, ʒ/: pressure—pressure—pleasure/assure—azure—azure
7. /θ, s/: thigh—sigh—sigh/thin—sin—thin
8. /tʃ, dʒ/: chunk—junk—chunk/chin—gin—gin
9. /m, n/: mail—nail—nail/mine—nine—nine
10. /h/: how—ow—how/hi—I—I
11. /l, r/: rip—lip—rip/race—lace—lace
12. /w/: wood—ood—wood/wolf—wolf—olf
13. /j/: jet—yet—jet/jam—yam—yam
14. /kw, k/: queen—keen—queen/quick—kick—kick
Teacher B:
1. /p, b/: pay—bay—pay/post—boast—boast
2. /t, d/: trip—drip—trip/down—down—town
3. /c, g/: came—came—game/glue—clue—clue
4. /f, v/: few—view—few/fast—vast—fast
5. /s, z/: sing—zing—sing/zoo—zoo—sue
6. /ʃ, ʒ/: pleasure—pressure—pressure/azure—assure—assure
7. /θ, s/: thigh—sigh—thigh/sin—thin—thin
8. /tʃ, dʒ/: chunk—chunk—junk/gin—chin—chin
9. /m, n/: nail—mail—mail/mine—mine—nine
10. /h/: ow—how—ow/I—hi—hi
11. /l, r/: lip—rip—rip/lace—race—race
12. /w/: ood—wood—ood/olf—wolf—olf
13. /j/: jet—jet—yet/yam—yam—jam
14. /kw, k/: queen—queen—keen/kick—quick—kick

Appendix D. Listening Comprehension Pretest Scores for Assigning Proficiency Levels

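Listening Comprehension Pretest Score

| Subject | Score | z-Score | Level |
|---|---|---|---|
| Student 1 | 8 | −1.64 | Low |
| Student 2 | 8 | −1.64 | Low |
| Student 3 | 13 | −1.39 | Low |
| Student 4 | 14 | −1.34 | Low |
| Student 5 | 14 | −1.34 | Low |
| Student 6 | 19 | −1.09 | Low |
| Student 7 | 21 | −0.99 | Low |
| Student 8 | 22 | −0.95 | Low |
| Student 9 | 22 | −0.95 | Low |
| Student 10 | 22 | −0.95 | Low |
| Student 11 | 22 | −0.95 | Low |
| Student 12 | 25 | −0.80 | Low |
| Student 13 | 26 | −0.75 | Low |
| Student 14 | 26 | −0.75 | Low |
| Student 15 | 28 | −0.65 | Low |
| Student 16 | 30 | −0.55 | Low |
| Student 17 | 30 | −0.55 | Low |
| Student 18 | 31 | −0.50 | Mid |
| Student 19 | 32 | −0.45 | Mid |
| Student 20 | 33 | −0.40 | Mid |
| Student 21 | 33 | −0.40 | Mid |
| Student 22 | 34 | −0.35 | Mid |
| Student 23 | 34 | −0.35 | Mid |
| Student 24 | 35 | −0.30 | Mid |
| Student 25 | 35 | −0.30 | Mid |
| Student 26 | 35 | −0.30 | Mid |
| Student 27 | 35 | −0.30 | Mid |
| Student 28 | 36 | −0.25 | Mid |
| Student 29 | 37 | −0.20 | Mid |
| Student 30 | 40 | −0.05 | Mid |
| Student 31 | 40 | −0.05 | Mid |
| Student 32 | 41 | 0.00 | Mid |
| Student 33 | 41 | 0.00 | Mid |
| Student 34 | 43 | 0.10 | Mid |
| Student 35 | 43 | 0.10 | Mid |
| Student 36 | 43 | 0.10 | Mid |
| Student 37 | 44 | 0.15 | Mid |
| Student 38 | 46 | 0.25 | Mid |
| Student 39 | 46 | 0.25 | Mid |
| Student 40 | 46 | 0.25 | Mid |
| Student 41 | 46 | 0.25 | Mid |
| Student 42 | 46 | 0.25 | Mid |
| Student 43 | 46 | 0.25 | Mid |
| Student 44 | 48 | 0.35 | Mid |
| Student 45 | 49 | 0.40 | Mid |
| Student 46 | 49 | 0.40 | Mid |
| Student 47 | 49 | 0.40 | Mid |
| Student 48 | 53 | 0.60 | High |
| Student 49 | 53 | 0.60 | High |
| Student 50 | 58 | 0.85 | High |
| Student 51 | 66 | 1.25 | High |
| Student 52 | 67 | 1.30 | High |
| Student 53 | 72 | 1.55 | High |
| Student 54 | 78 | 1.85 | High |
| Student 55 | 81 | 2.00 | High |
| Student 56 | 88 | 2.35 | High |
| Student 57 | 94 | 2.65 | High |
| Student 58 | 97 | 2.80 | High |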
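
The level assignment in Appendix D can be illustrated with a short sketch. It assumes the z-scores were computed from the overall pretest mean and standard deviation reported in the Total row of Table 6 (M = 40.91, SD = 20.01), and that the level boundaries sit at z = ±0.5. The cutoff is inferred from the boundary rows of the table (z = −0.55 is Low, z = −0.50 is Mid, z = 0.40 is Mid, z = 0.60 is High); it is not stated in this appendix, and the function names are hypothetical.

```python
# Sketch: z-score and proficiency level from a listening comprehension pretest score.
# MEAN and SD are taken from the Total row of Table 6; the 0.5 cutoff is an
# assumption inferred from the boundary rows of Appendix D.
MEAN = 40.91
SD = 20.01


def z_score(score, mean=MEAN, sd=SD):
    """Standardize a raw score against the group mean and SD."""
    return (score - mean) / sd


def assign_level(z, cutoff=0.5):
    """Map a z-score to Low / Mid / High using symmetric cutoffs (assumed)."""
    if z < -cutoff:
        return "Low"
    if z > cutoff:
        return "High"
    return "Mid"


if __name__ == "__main__":
    for score in (8, 30, 31, 49, 53, 97):
        z = z_score(score)
        print(score, round(z, 2), assign_level(z))
```

Under these assumptions the sketch reproduces the rows it was checked against, including the boundary cases (scores of 30 and 31, and 49 and 53).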

Figure 1. Result of the pilot test. Note: N = 24; percentage = correct answer rate of the phonemes.
Figure 2. L2 perception pretest and posttest results (N = 57). Note: percentage = correct answer rate of the phoneme.
Figure 3. Mean scores by proficiency groups.
Figure 4. Scatter plot for correlation of L2 perception and LC in pretest and posttest (N = 57). Notes: pretest (left): y = 0.81x + 5.4; R² = 0.182; p < 0.001; posttest (right): y = 1.16x + 0.86; R² = 0.23; p < 0.001.
Figure 5. Scatter plot for the correlation of pretest and posttest by proficiency groups. Notes: pretest (left): high: y = 1.62x − 8.05; R² = 0.32; p = 0.069; intermediate: y = 0.20x + 31.34; R² = 0.099; p = 0.097; low: y = −0.07x + 23.12; R² = 0.015; p = 0.64; posttest (right): high: y = 1.23x + 24.97; R² = 0.142; p = 0.253; intermediate: y = 0.41x + 29.53; R² = 0.10; p = 0.104; low: y = 0.9x + 5.24; R² = 0.478; p = 0.002.
Table 1. Consonants tested in the pilot study.

| Pair # | IPA Symbol | Pair # | IPA Symbol |
|---|---|---|---|
| 1 | /p, b/ | 8 | /t, d/ |
| 2 | /c, g/ | 9 | /f, v/ |
| 3 | /s, z/ | 10 | /ʃ, ʒ/ |
| 4 | /θ, s/ | 11 | /l, r/ |
| 5 | /m, n/ | 12 | /h/ |
| 6 | /tʃ, dʒ/ | 13 | /w/ |
| 7 | /j/ | 14 | /k, kw/ |

Note: for phonemes without pairs, items contrasted the presence vs. absence of the target phoneme.
Table 2. Target consonants.

| Pair # | IPA Symbol |
|---|---|
| 1 | /l, r/ |
| 2 | /θ, s/ |
| 3 | /s, z/ |
| 4 | /f, v/ |
| 5 | /tʃ, dʒ/ |
| 6 | /ʃ, ʒ/ |
| 7 | /w/ |
| 8 | /j/ |
Table 3. Lesson procedures.

| Day | Activity |
|---|---|
| Day 1 | Pretest 1 (PA test), questionnaire |
| Day 2 | Pretest 2 (LC test) |
| Day 3 | Introduction of the course |
| Day 4–5 | Lesson of /l/, /r/ |
| Day 6 | * Review (classroom) |
| Day 7–9 | Lesson of /θ/, /s/, /z/ |
| Day 10 | * Review (classroom) |
| Day 11–12 | Lesson of /f/, /v/ |
| Day 13 | * Review (classroom) |
| Day 14–17 | Lesson of /tʃ/, /dʒ/, /ʃ/, /ʒ/ |
| Day 18 | * Review (classroom) |
| Day 19–20 | Lesson of /w/, /j/ |
| Day 21 | * Review (classroom) |
| Day 22 | Posttest 1 (PA test) |
| Day 23 | Posttest 2 (LC test), survey and interview |

Note: * review sessions are offline, classroom-based reviews of previously taught phonemes.
Table 4. L2 perception test: descriptive statistics and paired samples t-test results (n = 57).

| Test | Min | Max | M | SD | SE | Mean Diff | t | df | p | 95% CI | d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pretest | 13 | 62 | 43.49 | 10.58 | 1.40 | | | | | | |
| Posttest | 23 | 59 | 46.10 | 8.30 | 1.10 | 2.61 | 2.85 | 56 | 0.01 ** | [0.74, 4.48] | 0.26 |

Notes: total score: 64 points; ** p < 0.01; d = Cohen’s d.
Table 5. Listening comprehension test: descriptive statistics and paired samples t-test results (n = 57).

| Test | Min | Max | M | SD | SE | Mean Diff | t | df | p | 95% CI | d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pretest | 8 | 97 | 40.82 | 20.18 | 2.67 | | | | | | |
| Posttest | 24 | 103 | 54.24 | 20.05 | 2.66 | 13.42 | 8.18 | 56 | <0.001 *** | [10.13, 16.71] | 0.67 |

Notes: total score: 103 points; *** p < 0.001; d = Cohen’s d.
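
As a rough consistency check on the Table 5 values, the 95% confidence interval and effect size can be recomputed from the reported summary statistics alone. The sketch below recovers the standard error of the mean difference from the mean difference and t, and uses an averaged-SD form of Cohen’s d. Which d formula the authors used is not stated here, so that choice is an assumption, as are the function names and the hard-coded critical value (approximately 2.0032 for a two-sided 95% interval with df = 56).

```python
import math

# Assumed two-sided 95% critical value of Student's t for df = 56.
T_CRIT_DF56 = 2.0032


def paired_ci(mean_diff, t_stat, t_crit):
    """Confidence interval for a paired mean difference.

    The standard error of the difference is recovered as mean_diff / t_stat.
    """
    se_diff = mean_diff / t_stat
    margin = t_crit * se_diff
    return mean_diff - margin, mean_diff + margin


def cohens_d_avg_sd(mean_diff, sd_pre, sd_post):
    """Cohen's d using the root mean square of the two SDs (one common variant)."""
    avg_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return mean_diff / avg_sd


if __name__ == "__main__":
    # Summary statistics from Table 5 (listening comprehension test).
    low, high = paired_ci(mean_diff=13.42, t_stat=8.18, t_crit=T_CRIT_DF56)
    d = cohens_d_avg_sd(mean_diff=13.42, sd_pre=20.18, sd_post=20.05)
    print(round(low, 2), round(high, 2), round(d, 2))
```

With these inputs the sketch agrees with the reported interval and effect size to two decimals. Applying the same d variant to Table 4 gives a value close to, but not identical with, the reported 0.26, so the formula should be treated as illustrative only.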
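Table 6. Students’ performance by proficiency groups.

| Group | Source | N | Min | Max | M | SD | SE |
|---|---|---|---|---|---|---|---|
| High | LC Pretest | 11 | 53 | 97 | 73.36 | 15.62 | 4.71 |
| | LC Posttest | 11 | 53 | 103 | 86.82 | 15.44 | 4.66 |
| | Gain score | | 4 | 36 | 13.46 | | |
| Intermediate | LC Pretest | 29 | 31 | 49 | 40.34 | 5.89 | 1.09 |
| | LC Posttest | 29 | 27 | 71 | 48.97 | 10.33 | 1.20 |
| | Gain score | | −8 | 30 | 8.63 | | |
| Low | LC Pretest | 17 | 8 | 30 | 20.59 | 6.99 | 1.70 |
| | LC Posttest | 17 | 24 | 67 | 42.18 | 11.68 | 2.83 |
| | Gain score | 17 | 4 | 59 | 21.59 | | |
| Total | | 57 | | | 40.91 | 20.01 | |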
Table 7. Results of the LC test scores within groups (paired differences).

| Group | M | SD | SE | 95% CI Lower | 95% CI Upper | t | df | Sig. | d |
|---|---|---|---|---|---|---|---|---|---|
| High | 13.45 | 9.83 | 2.96 | 6.85 | 20.06 | 4.84 | 10 | <0.001 *** | 0.87 |
| Intermediate | 8.62 | 9.39 | 1.77 | 5.05 | 12.19 | 4.94 | 28 | <0.001 *** | 1.74 |
| Low | 21.59 | 14.48 | 3.62 | 14.14 | 29.03 | 6.15 | 16 | <0.001 *** | 2.26 |

Note: *** p < 0.001; d = Cohen’s d.
Table 8. Correlation of L2 perception and LC scores of all groups (N = 57).

| Group | Source | Statistic | LC Pretest | LC Posttest |
|---|---|---|---|---|
| All | L2 perception pretest | Pearson correlation | 0.427 *** | |
| | | Sig. | 0.000 | |
| | L2 perception posttest | Pearson correlation | | 0.479 *** |
| | | Sig. | | 0.000 |

Note: *** p < 0.001.
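
The correlation coefficients in Table 8 and the R² values in the notes to Figure 4 are linked by a simple identity: in a simple linear regression with one predictor, R² equals the square of Pearson’s r. The sketch below shows that conversion, along with the standard t statistic for testing a correlation against zero with n − 2 degrees of freedom. The function names are hypothetical, and the sketch works only from the reported coefficients, not from the raw data.

```python
import math


def r_squared(r):
    """Coefficient of determination for a one-predictor linear regression."""
    return r * r


def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


if __name__ == "__main__":
    n = 57
    for label, r in (("pretest", 0.427), ("posttest", 0.479)):
        print(label, round(r_squared(r), 3), round(t_from_r(r, n), 2))
```

Squaring the two coefficients gives about 0.182 and 0.229, which is in line with the R² values of 0.182 and 0.23 reported for Figure 4.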
Table 9. Results of correlation within groups.

| Group | Source | Statistic | LC Pretest | LC Posttest |
|---|---|---|---|---|
| High (n = 11) | PA Pretest | Pearson correlation | 0.566 | |
| | | Sig. | 0.069 | |
| | PA Posttest | Pearson correlation | | 0.377 |
| | | Sig. | | 0.253 |
| Intermediate (n = 29) | PA Pretest | Pearson correlation | 0.314 | |
| | | Sig. | 0.097 | |
| | PA Posttest | Pearson correlation | | 0.308 |
| | | Sig. | | 0.104 |
| Low (n = 17) | PA Pretest | Pearson correlation | −0.122 | |
| | | Sig. | 0.64 | |
| | PA Posttest | Pearson correlation | | 0.691 ** |
| | | Sig. | | 0.002 |

Note: ** p < 0.01.
Table 10. Result of post survey (yes/no questions; n = 34).

| # | Question | High, n (%) | Intermediate, n (%) | Low, n (%) | Total, n (%) |
|---|---|---|---|---|---|
| 1 | Have you ever taken English sound perception lessons before? (Yes) | 5 (83.3) | 4 (20) | 3 (37.5) | 12 (35.3) |
| 2 | Was the lesson method effective? (Yes) | 5 (83.3) | 11 (55) | 4 (57.1) | 20 (58.8) |
| 3 | Can you distinguish phoneme pairs now? (Yes) | 6 (100) | 13 (65) | 3 (42.9) | 22 (64.7) |
| 4 | Do you think the lesson helped you improve your English listening skills? (Yes) | 6 (100) | 15 (75) | 6 (85.7) | 27 (79.4) |
| 5 | Do you think the lesson improved your confidence in English? (Yes) | 6 (100) | 11 (55) | 3 (37.5) | 20 (58.8) |
| 6 | Do you think you can utilize these English perception abilities? (Yes) | 4 (66.7) | 13 (65) | 5 (71.4) | 22 (64.7) |

Note: includes only the respondents who answered the survey.
Table 11. Result of post survey (self-assessment on a 5-point Likert scale; n = 34).

| How Sincerely Have You Been Working on the Lessons? | 1, n (%) | 2, n (%) | 3, n (%) | 4, n (%) | 5, n (%) | M |
|---|---|---|---|---|---|---|
| High (n = 6) | 0 | 0 | 2 (33.3) | 1 (16.7) | 3 (50) | 4.17 |
| Intermediate (n = 20) | 0 | 3 (15) | 10 (50) | 5 (25) | 2 (10) | 3.3 |
| Low (n = 8) | 0 | 2 (25) | 5 (62.5) | 0 | 1 (12.5) | 3 |
| Total | 0 | 5 | 17 | 6 | 6 | 3.38 |
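
The group means in Table 11 follow directly from the response counts: each scale point is weighted by the number of respondents who chose it, and the sum is divided by the group size. A minimal sketch of that arithmetic is given below; the function and variable names are hypothetical, and the counts are copied from the table.

```python
def likert_mean(counts):
    """Mean rating, where counts[i] is the number of respondents choosing point i + 1."""
    respondents = sum(counts)
    weighted_sum = sum(point * n for point, n in enumerate(counts, start=1))
    return weighted_sum / respondents


# Response counts for scale points 1 to 5, copied from Table 11.
groups = {
    "High": [0, 0, 2, 1, 3],
    "Intermediate": [0, 3, 10, 5, 2],
    "Low": [0, 2, 5, 0, 1],
}

# Column-wise totals across the three groups.
total_counts = [sum(column) for column in zip(*groups.values())]

if __name__ == "__main__":
    for name, counts in groups.items():
        print(name, round(likert_mean(counts), 2))
    print("Total", total_counts, round(likert_mean(total_counts), 2))
```

For example, the High group mean is (3 × 2 + 4 × 1 + 5 × 3) / 6 = 25 / 6 ≈ 4.17, and the pooled mean is 115 / 34 ≈ 3.38, matching the reported values.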
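Table 12. Result of post survey (self-assessment; n = 34).

Question: Which English sound was the most unfamiliar and difficult?

| Group | /θ/ | /ʒ/ | /l/ | /r/ | /f/ | /tʃ/ | /j/ | /z/ | /s/ | /ʃ/ | /w/ | /v/ | /y/ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| High (n = 6) | 3 | 3 | 2 | 3 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 |
| Intermediate (n = 20) | 9 | 5 | 3 | 3 | 4 | 3 | 2 | 2 | 2 | 3 | 1 | 0 | 0 |
| Low (n = 8) | 4 | 1 | 3 | 1 | 1 | 2 | 2 | 1 | 1 | 0 | 0 | 0 | 0 |
| Total | 16 | 9 | 8 | 7 | 6 | 6 | 5 | 5 | 4 | 4 | 2 | 1 | 1 |

Note: multiple selections possible.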
Table 13. Summary for individual interviews.

| Group | Student | What I Have Learned | What I Want to Learn More | How I Felt about PA Lessons | How I Practiced Listening Before |
|---|---|---|---|---|---|
| High | S1 | I could understand confusing pronunciations. | I still can’t make a perfect distinction on /r/, so I want to practice more. | It was fun and good. | Dictations, Fill-in-the-Blanks |
| High | S2 | I learned anew how to do things like mouth shape and tongue shape by pronunciation. | Sometimes I’m confused about whether the pronunciation I’m practicing is correct. | It was a little difficult at first, but it was fun as I gradually understood. | Fill-in-the-Blanks |
| Intermediate | S3 | I learned that there are many ways to pronounce English pronunciation. | I want to study more phonemes. | It was new to me so I became more focused than other classes. By studying English sounds, I could speak English more accurately. | Listen to the audio and memorize the pronunciation. |
| Intermediate | S4 | I knew that the tongue was important when pronouncing English. | I want to expand this lesson in a paragraph, as I can’t read an English paragraph well. | When I listened to English, I was proud to know what was wrong because I kept listening. When phonemes were difficult, like s, z, and th, it was good to see a video of native speakers repeating them. | Listen to sentences and words on a laptop and take a test. |
| Low | S5 | I found out that there are various sounds in English. | I want to practice listening drills more. | I didn’t know that English pronunciation was important, but while learning the lessons, I realized that pronunciation was crucial. | Listening several times. |
| Low | S6 | I learned the vibration of the vocal cord. | I felt I had to study English more in the future. | I have fun learning. It was amazing because it was my first time studying HVPT. | Dictations. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lee, K.; Ahn, H. Blended Phonetic Training with HVPT Features for EFL Children: Effects on L2 Perception and Listening Comprehension. Languages 2025, 10, 122. https://doi.org/10.3390/languages10060122
