1. Introduction
Over the past several decades, the evaluation of second-language (L2) writing has increasingly incorporated technological support, evolving from early computer-assisted language learning (CALL) applications to more sophisticated automated writing evaluation (AWE) systems (Ranalli & Yamashita, 2022). Early forms of technology-assisted writing assessment relied primarily on rule-based error detection, focusing on surface-level features such as grammar, spelling, and mechanics (Barrot et al., 2021). With advances in natural language processing (NLP), AWE systems gradually expanded their evaluative scope to include multiple dimensions of writing, such as lexical choice, syntactic complexity, and organizational features, offering more systematic and consistent feedback to learners (Nunes et al., 2022).
Consequently, AWE systems have been widely adopted to address persistent challenges in writing instruction, particularly those associated with large class sizes, heavy teacher workloads, and delayed formative feedback (McNamara & Kendeou, 2022). These challenges are especially prominent in Chinese tertiary EFL classrooms, where writing instruction has traditionally been teacher-centered and product-oriented, with a strong emphasis on linguistic accuracy and examination performance (Z. V. Zhang & Hyland, 2022; Ng & Cheung, 2017). Such structural constraints limit opportunities for individualized feedback and iterative revision, often resulting in students’ mechanical engagement with writing tasks and heightened levels of writing anxiety (Barrot et al., 2021; Patty, 2024).
Empirical research conducted prior to the emergence of generative artificial intelligence (GenAI) has generally highlighted the pedagogical benefits of AWE systems. Studies suggest that automated feedback can support writing accuracy, encourage learner autonomy, and promote more frequent revision cycles (Ranalli & Yamashita, 2022; Geng & Razali, 2020; Zhai & Ma, 2022). In addition, AWE-mediated feedback has been found to reduce the affective pressure associated with direct teacher evaluation, potentially contributing to a more supportive writing environment (Barrot et al., 2021). However, earlier generations of AWE systems were largely constrained by fixed feedback templates and statistically driven algorithms, limiting their ability to engage meaningfully with content development, discourse organization, and learner-specific writing trajectories (Türkoğlu, 2025).
In recent years, the rapid development of GenAI technologies has fundamentally altered the landscape of writing instruction and evaluation (Bewersdorff et al., 2023). Unlike traditional AWE systems, GenAI-powered tools are capable of generating context-sensitive feedback, simulating dialogic interaction, and supporting writing as a dynamic, process-oriented activity rather than a purely product-based outcome (Yan, 2024). These developments have stimulated ongoing scholarly discussions regarding the pedagogical positioning of automated systems, particularly in relation to assessment practices and instructional roles (Loncar et al., 2023). Thus, the integration of automated writing systems now requires not only technical refinement but also careful pedagogical repositioning within instructional contexts.
Despite their growing prominence, most widely used AWE and AI-assisted writing tools—such as WriteToLearn®, MY Access!®, Criterion®, and Project Essay Grader—have been developed primarily for English-as-a-first-language contexts or generalized international markets (Wilson & Roscoe, 2020). As a result, these systems often fail to fully accommodate the linguistic characteristics, curricular demands, and learning cultures of Chinese EFL learners (Z. V. Zhang & Hyland, 2022; Su, 2020). Challenges such as first-language interference, exam-oriented writing requirements, and context-specific rhetorical conventions remain insufficiently addressed, underscoring the need for localized and pedagogy-aligned AWE solutions (Y. Zhang, 2020).
In response to this contextual demand, iWrite has been developed as a locally grounded AWE system designed specifically for Chinese tertiary EFL instruction (J. Wang & Wang, 2021). Rather than functioning solely as a grammar-focused correction tool, iWrite adopts an analytical scoring approach that evaluates student writing across multiple dimensions, including language use, content development, and organizational structure (Qin & Liu, 2025). By aligning its assessment framework with national curriculum standards and mainstream instructional practices in China, iWrite represents an evolution of AWE that emphasizes pedagogical compatibility alongside technological functionality (X. Chen, 2025).
Beyond linguistic outcomes, affective factors have emerged as critical considerations in technology-mediated writing instruction. Writing anxiety, in particular, remains prevalent among Chinese university students and has been shown to negatively affect writing fluency, organizational coherence, and sustained engagement (Abdel Latif, 2019; Yan, 2024). From the perspective of Control-Value Theory, learners’ emotional responses to writing tasks are closely linked to their perceived control over the writing process and evaluative pressure (Pekrun, 2006; Deane, 2018). Automated feedback systems that provide structured, manageable, and supportive feedback may therefore contribute to anxiety reduction by enhancing learners’ sense of control and facilitating repeated engagement with writing tasks (Barrot et al., 2021).
In evaluating L2 writing development, complexity, accuracy, and fluency (CAF) have been widely recognized as core dimensions in second-language writing research (McCallum & Curry, 2023). Although CAF has been extensively employed in second-language acquisition studies, comprehensive investigations examining all three dimensions simultaneously in the Chinese EFL writing context remain limited (Lu & Ai, 2015; Y. Zhang, 2020; Cheng & Zhang, 2021; Qin & Liu, 2025). Existing research has often focused on isolated aspects of writing performance, leaving a gap in understanding how localized AWE systems influence overall writing development across CAF dimensions (Toufaha, 2024).
Taken together, while iWrite is theoretically positioned to address both instructional and affective challenges in Chinese EFL writing, robust empirical evidence supporting its effectiveness remains scarce. Specifically, few studies have systematically examined its impact on writing performance across CAF dimensions or its influence on writing anxiety in authentic university classroom settings (Ranalli & Yamashita, 2022; Toufaha, 2024). To address these gaps, this mixed-methods study aims to investigate the effects of iWrite on Chinese university students’ English writing performance and writing anxiety. It seeks to answer the following research questions:
RQ1. What is the effect of the iWrite system, compared to traditional instruction, on Chinese university students’ English writing performance as measured by CAF?
RQ2. What is the effect of the iWrite system, compared to traditional instruction, on Chinese university students’ English writing anxiety?
RQ3. What are students’ perceptions of the influence of the iWrite system on their English writing performance and writing anxiety?
2. Materials and Methods
2.1. Research Design
This study adopted an explanatory sequential mixed-methods design (Creswell & Clark, 2017), a two-phase approach that strategically combines quantitative and qualitative methods. The rationale for selecting this design was threefold. First, it aligns directly with the nature of our research questions: initial quantitative questions (RQ1 & RQ2) assess the effects and extent of changes in writing performance and anxiety, while the subsequent qualitative question (RQ3) seeks to explain how and why students experienced these changes, providing depth to the numerical results. Second, this design capitalizes on the complementary strengths of both paradigms. The quantitative, quasi-experimental pretest–posttest phase offers generalizable and statistically testable evidence of causal relationships, whereas the subsequent qualitative phase, utilizing semi-structured interviews, provides rich, contextualized insights into participants’ perceptions and experiences (Mrabti & Alaoui, 2024). This integration allows for a more comprehensive and nuanced understanding than either approach alone. Third, the sequential structure is pragmatically suitable for investigating educational interventions, as it enables the qualitative data to build upon and explain the initial quantitative findings, a logic well-established in applied linguistics and educational technology research.
Therefore, the research was operationalized in two distinct phases. In the first, quantitative phase, a quasi-experimental pretest–posttest design was employed to collect numerical data on writing performance (CAF metrics) and writing anxiety (SLWAI scores) from both experimental and control groups. In the second, qualitative phase, semi-structured interviews were conducted with a purposive sample of participants from the experimental group to explore their subjective perceptions and experiences in greater depth, thereby explaining and elaborating on the quantitative outcomes. The findings from both phases were integrated during the interpretation stage to provide a consolidated conclusion regarding the impact of the iWrite system.
2.2. Participants and Context
The participants in this study consisted of 60 first-year non-English majors enrolled in teacher education programs, including Primary Education and Preschool Education. Detailed demographic characteristics, including gender, age, and years of English learning, were comparable between the experimental group (n = 30) and the control group (n = 30), as shown in Table 1.
Due to constraints in the actual teaching arrangements, a convenience sampling method was adopted by selecting two intact classes. These classes were subsequently assigned to the experimental and control conditions following a quasi-experimental design. The sample size was determined with reference to similar empirical studies and was further examined through an a priori power analysis conducted using G*Power 3.1.9.7, which indicated that the sample size was sufficient to detect medium-to-large effect sizes (Cheng & Zhang, 2021).
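For readers without access to G*Power, the same kind of a priori calculation can be approximated in a few lines of Python. The sketch below is illustrative only: it uses the normal approximation to the two-tailed independent-samples t-test, so it slightly understates the exact t-distribution-based sample size that G*Power reports.

```python
import math
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-tailed independent-samples
    t-test, using the normal approximation (G*Power's exact result
    is typically one participant per group larger)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for power = .80
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# A large effect (d = 0.8) needs roughly 25 per group under this
# approximation; a medium effect (d = 0.5) needs roughly 63.
print(n_per_group(0.8), n_per_group(0.5))
```

With 30 participants per intact class, such a calculation supports detecting medium-to-large effects, consistent with the power analysis described above.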
Although individual random assignment was not feasible, pre-test measures and independent-samples t-tests were conducted to examine baseline equivalence between the two groups. The results indicated no statistically significant differences in writing performance or writing anxiety prior to the intervention, thereby supporting the validity of subsequent between-group comparisons (Creswell & Clark, 2017).
2.3. Instructional Procedures
The instructional intervention in this study was grounded in the theoretical frameworks of the process writing approach and formative assessment (Black & Wiliam, 1998; Flower & Hayes, 1981). Process writing conceptualizes writing as a recursive, multi-draft process. Accordingly, a 12-week cycle comprising five writing tasks was designed to facilitate a complete “planning-drafting-feedback-revising” cycle. The integration of the iWrite system aimed to transform the typically delayed, teacher-centered feedback into an immediate and multi-source feedback mechanism, aligning with formative assessment principles. Both groups received regular College English curriculum-aligned instruction, with the key difference residing not in task design, but in the feedback mechanism and the revision process it engendered.
Both groups were required to complete the same five writing assignments. All tasks were short argumentative essays (120–180 words) taken directly from the CET-4 test, ensuring genre consistency, and were administered under comparable classroom conditions. The use of these standardized prompts guaranteed consistent task difficulty and demands across groups.
The experimental group completed the assignments using the iWrite system, while the control group followed the traditional method by submitting their essays directly to the teacher.
A structured, process-oriented cycle was implemented for each task. Both groups were afforded an identical three-week revision window following the initial submission and were permitted to revise their work during this period. The fundamental distinction lay in the nature of the feedback that scaffolded revision: students in the experimental group, utilizing the iWrite system, received immediate, iterative, and diagnostic automated feedback after each draft submission, which enabled a dynamic, feedback-driven revision process where multiple submissions were naturally facilitated by the tool. In contrast, students in the control group received delayed, holistic written feedback from the instructor at the beginning of the revision window; consequently, their revision was guided by a single, static set of comments, without a structured mechanism for obtaining incremental feedback.
2.4. Research Instruments
2.4.1. iWrite System
The iWrite English Writing Teaching and Evaluation System employed in this study is a localized Automated Writing Evaluation (AWE) system jointly developed in 2015 by the research team led by Professor Liang Maocheng at Beijing Foreign Studies University and the Foreign Language Teaching and Research Press. Its core functionality lies in providing automated multi-dimensional scoring of student essays (covering language, content, text structure, and technical conventions) and delivering immediate, diagnostic feedback, with particular strength in identifying and correcting language errors specific to Chinese EFL learners.
The system is specifically designed to address the characteristics of Chinese English learners. Its evaluation model is based on the aforementioned four core dimensions established by experts in the fields of second-language writing, language assessment, and corpus linguistics (Li, 2021). To enable accurate assessment, iWrite has built a dedicated corpus containing hundreds of millions of tokens, integrating native speaker corpora, international learner corpora, and—most importantly—a continuously updated Chinese Learner English Corpus (e.g., the iWrite Corpus), which focuses on diagnosing common errors among Chinese learners (Wu et al., 2024). Currently, its intelligent scoring engine utilizes advanced deep learning technologies and demonstrates strong performance in automated scoring and grammatical error correction.
In this study, the core procedure for students using iWrite was as follows: after submitting an essay, the system generated an overall score and provided diagnostic feedback across the four dimensions (including specific grammatical corrections and lexical suggestions). Students in the experimental group then revised their essays based on this feedback.
2.4.2. Writing Tests
All participants completed a total of five writing tasks over the course of the study, consisting of one pretest, three instructional writing tasks during the intervention phase, and one posttest. To assess writing performance, participants completed an argumentative essay as a pretest at the beginning of the semester and a parallel argumentative essay as a posttest at the end of the intervention. Tasks were assigned at regular three-week intervals to ensure sufficient time for drafting, feedback, and revision. The pretest and posttest prompts were parallel in terms of genre, topic familiarity, length requirement, and difficulty level, ensuring task equivalence for measuring writing development over time.
This study utilized writing prompts from the authentic National College English Test Band 4 (CET-4) as the testing materials. The selection of the CET-4 was based on the following three key considerations:
First, as a national standardized test, the CET-4 has had its validity and reliability extensively confirmed over time, ensuring its measurement quality and solid academic credibility (S. Chen, 2022).
Second, the test content aligns closely with the objectives of this study. The CET-4 writing section directly assesses students’ ability to express themselves in writing on familiar topics. Its argumentative essay genre represents a core requirement in Chinese university English teaching and assessment, thereby ensuring good ecological validity for the research.
Finally, CET-4 scores carry significant social weight and certification value within the context of Chinese higher education. Consequently, research findings based on this test hold greater referential significance for teaching practices and related decision-making (Jiang, 2020).
Writing performance was assessed using the three-dimensional CAF (Complexity, Accuracy, and Fluency) framework. This framework has been widely recognized as a core, operationalizable index system for evaluating second-language writing development and output, capable of comprehensively reflecting different facets of learners’ language ability (Lee et al., 2023). Complexity was measured through syntactic complexity (clauses per T-unit) and lexical complexity (corrected type–token ratio). Accuracy was measured by the number of errors per 100 words. Fluency was measured by words per clause. Automated linguistic analysis tools, including L2SCA and LCA, were used to ensure objectivity and reliability (Saricaoglu & Atak, 2022).
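As a concrete illustration of the four indices above, the following Python sketch computes them from raw counts. The function and the sample counts are hypothetical; in the study, the underlying counts would come from L2SCA/LCA output and rater-identified errors.

```python
import math

def caf_indices(words: int, clauses: int, t_units: int,
                types: int, tokens: int, errors: int) -> dict:
    """CAF measures as operationalized in this study.
    All input counts here are illustrative placeholders."""
    return {
        "syntactic_complexity": clauses / t_units,            # clauses per T-unit (C/T)
        "lexical_complexity": types / math.sqrt(2 * tokens),  # corrected type-token ratio
        "accuracy": errors / words * 100,                     # errors per 100 words
        "fluency": words / clauses,                           # words per clause (W/C)
    }

# Hypothetical essay: 150 words, 18 clauses, 12 T-units,
# 95 word types among 150 tokens, 6 errors.
print(caf_indices(150, 18, 12, 95, 150, 6))
```

Note that the accuracy index decreases as writing improves (fewer errors per 100 words), whereas the other three indices increase.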
To ensure the objectivity and consistency of scoring for writing performance (CAF), two trained raters independently assessed all essays. The raters were blinded to the group assignment (experimental/control) and testing occasion (pre-/post-test) of the samples. Inter-rater reliability was quantitatively assessed using a two-way random-effects model for the intraclass correlation coefficient (ICC). The analysis yielded an ICC of 0.92, which is conventionally interpreted as indicating excellent agreement (Cicchetti, 1994).
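An ICC of this kind can be reproduced directly from the raters’ score matrix. The sketch below implements single-measure ICC(2,1) under the two-way random-effects, absolute-agreement model via the standard Shrout–Fleiss mean-square formulation; the five essays’ scores shown are invented for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement,
    single measure. `ratings` is one row per essay, one column
    per rater (Shrout & Fleiss mean-square formulation)."""
    n = len(ratings)      # number of essays
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_r = ss_rows / (n - 1)                                # between-essay MS
    ms_c = ss_cols / (k - 1)                                # between-rater MS
    ms_e = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual MS
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

# Two raters in close agreement on five illustrative essays:
scores = [[12, 13], [15, 15], [9, 10], [14, 14], [11, 11]]
print(round(icc_2_1(scores), 2))
```

Perfectly identical ratings yield an ICC of exactly 1; systematic disagreement between raters lowers the absolute-agreement coefficient even when rankings match.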
2.4.3. Writing Anxiety Questionnaire
This study adopted the Second Language Writing Anxiety Inventory (SLWAI), revised by Guo and Qin in 2010, as the research instrument. The scale is based on the Second Language Writing Anxiety Inventory developed by Cheng in 2004. The questionnaire assesses four dimensions: classroom anxiety, conceiving anxiety, avoidance behavior, and lack of confidence. Compared with Cheng’s SLWAI (2004), which was designed for English majors in Taiwan, the version adapted by Guo and Qin (2010) is better suited to Chinese EFL university students because of its specificity, comprehensiveness, and localization, including adaptation to learners’ language habits and cultural background. Such a localized scale can more accurately reflect the writing anxiety of Chinese EFL college students, and it has been shown to have high reliability and validity (Guo & Qin, 2010). It was therefore used in this study to measure participants’ writing anxiety.
2.4.4. Semi-Structured Interviews
To address RQ3 and gain a deeper understanding of the mechanisms behind the quantitative trends, semi-structured interviews were conducted. The interviews served a dual purpose: (1) to triangulate the quantitative findings on writing performance (CAF) and anxiety by seeking students’ explanatory perspectives; and (2) to explore in-depth the perceived advantages, challenges, and overall lived experience of using the iWrite system from the learners’ standpoint (Creswell & Creswell, 2017).
An interview protocol was developed based on the research objectives, ensuring alignment with the study’s theoretical framework (Coker & Akande, 2025). The protocol contained open-ended questions organized around key domains: students’ general experience with the writing tasks; their detailed perceptions of the feedback received; their described revision behaviors; and their feelings of confidence or anxiety throughout the process.
Semi-structured interviews were conducted with six participants from the experimental group following the completion of the intervention. A purposive sampling strategy was adopted to ensure variation in writing performance improvement and anxiety reduction levels. To ensure comfort and expressiveness, interviews were conducted face-to-face, one-on-one, in Mandarin Chinese (the participants’ first language). Each interview lasted approximately 30 min. Prior to the interview, informed consent was obtained, explicitly covering audio recording for research purposes. All interviews were recorded digitally and subsequently transcribed verbatim to prepare the textual data for analysis. Participant anonymity was maintained through the use of pseudonyms in all transcripts and reports.
2.5. Data Analysis
To address RQ1 and RQ2, quantitative data were analyzed in a two-stage procedure. Following descriptive statistics, inferential analyses were conducted. First, paired-sample t-tests assessed within-group changes from pre-test to post-test for each group separately.
Then, to directly compare the intervention effect between groups while controlling for pre-existing differences, a series of one-way analyses of covariance (ANCOVA) was performed with post-test scores as the dependent variable, group as the fixed factor, and the corresponding pre-test scores as the covariate. This method was selected over independent t-tests because it provides a more precise estimate of the treatment effect by accounting for baseline variability. Prior to these tests, the statistical assumptions were verified. The normality of distribution for all continuous variables was assessed using the Shapiro–Wilk test, and the homogeneity of variance was checked using Levene’s test. The data met the assumptions for parametric tests.
Effect sizes were calculated for statistically significant results, with Cohen’s d reported for t-tests and partial η² for ANCOVA. The magnitude of effects was interpreted in accordance with the L2-specific benchmarks proposed by Plonsky and Oswald (2014). All quantitative analyses were performed using SPSS 27.0.
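The first-stage within-group analysis can be sketched in plain Python. The scores below are hypothetical; the paired-samples t statistic computed here is the same quantity SPSS reports, and Cohen’s d is calculated as the mean difference divided by the standard deviation of the differences (one common convention for paired designs, among several).

```python
import math
from statistics import mean, stdev

def paired_t_and_d(pre, post):
    """Paired-samples t statistic and Cohen's d for pre-post change.
    d = mean(difference) / SD(difference); t = d * sqrt(n)."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    d = mean(diffs) / stdev(diffs)   # stdev() uses the n-1 denominator
    t = d * math.sqrt(n)
    return t, d

# Hypothetical writing scores for six students, pre vs. post:
pre = [10, 12, 9, 11, 13, 10]
post = [12, 14, 10, 13, 14, 12]
t, d = paired_t_and_d(pre, post)
print(round(t, 3), round(d, 3))
```

The between-group ANCOVA stage has no compact stdlib equivalent and is better left to a statistics package, as was done here with SPSS.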
The qualitative data were analyzed using thematic analysis following the procedures outlined by Braun and Clarke (2006). The analysis involved six steps: (1) familiarization with the data through repeated reading of transcripts; (2) initial open coding to identify meaningful units; (3) grouping related codes into broader categories; (4) generating candidate themes; (5) reviewing and refining themes to ensure internal coherence and external distinction; and (6) defining and naming the final themes.
To enhance credibility and reliability, a second researcher independently reviewed a subset (30%) of the transcripts and coding results. Discrepancies were discussed until consensus was reached. Member checking was also conducted by returning summarized interpretations to participants for confirmation.
The qualitative findings were used to triangulate and enrich the quantitative results, providing deeper insight into the mechanisms through which the iWrite system influenced writing performance and anxiety.
4. Discussion
4.1. Summary of Key Findings
This study investigated the effects of integrating the iWrite Automated Writing Evaluation (AWE) system into university EFL writing instruction. The quantitative results revealed a differentiated impact of the iWrite system on the writing performance of Chinese EFL learners: significant improvements with medium-to-large effect sizes were observed in accuracy, fluency, and lexical complexity, whereas no significant change was found in syntactic complexity. This pattern delineates the specific affordances and current boundaries of technology-mediated feedback.
Specifically, the marked gain in accuracy can be directly attributed to iWrite’s immediate, form-focused formative feedback, which created an efficient cycle of “error identification-autonomous correction.” This corroborates the effectiveness of Formative Assessment Theory and Autonomous Learning Theory in technology-enhanced environments (Black & Wiliam, 1998; Holec, 1981). Interview data confirmed that students actively utilized the feedback for multiple revisions, achieving autonomous optimization of linguistic accuracy.
The enhancement in fluency likely stems from a reallocation of cognitive load. By offloading part of the language-monitoring function onto the system, students could direct more attentional resources toward idea generation and content development, thereby producing more words per clause (W/C). This supports the view that technological tools can promote production fluency by reducing the cognitive burden of the task.
The increase in lexical complexity indicates that iWrite can effectively serve as a scaffold for vocabulary development. The system’s feedback on word repetition and inappropriate collocations enhanced students’ metacognitive lexical awareness, prompting them to actively experiment with and incorporate more diverse vocabulary. This demonstrates the potential of technological feedback to facilitate constructivist learning.
However, the lack of significant improvement in syntactic complexity (C/T) can be interpreted from two perspectives. On one hand, it may reflect a “trade-off” in attentional resources, where students prioritized meeting the system’s explicit demands for accuracy and lexical diversity within a limited time (Skehan, 1998). On the other hand, it suggests that iWrite’s current feedback mechanism focuses more on sentence-level corrections and offers limited guidance for generating more complex syntactic structures like clause embedding.
Concurrently, on the affective dimension, the iWrite intervention led to a statistically significant reduction in writing anxiety across all four measured subscales, with particularly large effect sizes for “Avoidance Behavior” and “Lack of Confidence.” This provides robust empirical evidence for the system’s efficacy in alleviating the emotional barriers to writing among Chinese EFL learners. Qualitative data further revealed that this anxiety reduction was closely linked to the low-threat practice environment and the perceivable pathway for progress created by the system, laying the groundwork for a deeper analysis based on Control-Value Theory (Pekrun, 2006).
In summary, the findings paint a nuanced picture: the iWrite system is effective in enhancing language control, promoting production fluency, and enriching vocabulary, while also significantly reducing writing anxiety. However, its role in fostering the development of deeper linguistic competence, such as syntactic complexity, remains limited. This provides crucial evidence for understanding the scope of pedagogical empowerment offered by localized AWE tools.
4.2. Dialogue with Existing Theory and Literature
The aforementioned findings engage in a constructive dialogue with core theoretical frameworks in second-language writing and educational technology. First, through its immediate and multi-round technological feedback, iWrite successfully institutionalized the “drafting-feedback-revision” cycle advocated by the process writing approach into a sustainable classroom practice (Y. Zhang, 2020), demonstrating the potential of technology to ground pedagogical principles. Second, the significant reduction in writing anxiety can be powerfully explained by Control-Value Theory (Pekrun, 2006). The system’s feedback, by providing a clear and actionable path for improvement, enhanced learners’ perceived control over the writing task, while its automated, non-judgmental nature reduced the perceived threat often associated with authoritative evaluation (Waer, 2023). Third, the results address the ongoing debate about whether AWE can promote deep language learning. The facilitation of lexical complexity supports the view that technology can act as a scaffold for constructivist learning. However, the stagnation in syntactic complexity echoes previous observations that current AWE systems are more adept at optimizing local language forms, while offering limited support for generating complex syntactic structures (Warschauer & Ware, 2006). This highlights the necessity of combining such technology with social interactions that emphasize meaning negotiation, such as teacher guidance.
4.3. Implications for Pedagogical Practice and Teacher Professional Development
The findings of this study hold direct significance for EFL writing instruction, particularly regarding the evolution of teachers’ roles and their professional growth. AWE systems like iWrite are best positioned as collaborative feedback providers. By efficiently handling surface-level linguistic features, they can liberate teachers from the heavy burden of mechanical grading, allowing them to focus on providing higher-order guidance on aspects such as ideation, logical organization, and rhetorical strategies (Bai, 2021). This enables the realization of human–computer collaborative differentiated instruction.
Critically, technology integration profoundly drives teacher professional development. The introduction of iWrite necessitates and fosters a threefold evolution in the teacher’s role: (1) From assessor to learning designer: Teachers need to redesign curricular processes to embed AWE organically within learning cycles, constructing a multi-source feedback ecosystem. (2) From judgment-by-experience to data-informed instructional decision-maker: Teachers need to develop data literacy to utilize the learning analytics provided by the system for targeted intervention (Lee et al., 2023). (3) From knowledge transmitter to metacognitive coach: Teachers need to guide students on how to critically utilize technological feedback, cultivating self-regulated learning strategies. This shift aligns with the Technological Pedagogical Content Knowledge (TPACK) framework, emphasizing the need for teachers to integrate technology, pedagogy, and content knowledge (DeLeon et al., 2023).
Therefore, successful AWE integration must be accompanied by systematic teacher professional development support. The focus of training should shift from software operation to technology-enhanced pedagogy, helping teachers master the skills required for these new roles and reflect on the place of technology within their teaching philosophy. This echoes the global discussion on the need for teachers to develop TPACK (Tseng et al., 2019) and provides an empirical case for its specific application in the Chinese EFL context.
4.4. Limitations and Directions for Future Research
This study has several limitations. The sample was drawn from a specific major at a single university, which limits the generalizability of the findings. While the 12-week intervention period allowed for the observation of short-term changes, the long-term effects remain unclear. Furthermore, the study treated iWrite as a holistic intervention, failing to disentangle the specific contributions of its different feedback types. Future research could: (1) conduct comparative studies across different regions and learner profiles; (2) implement longer-term longitudinal tracking; (3) employ micro-analytic methods to investigate learners’ attention to and internalization of different types of AWE feedback, thereby informing more refined system design and pedagogical integration.
5. Conclusions
This study investigated the pedagogical impact of integrating the iWrite Automated Writing Evaluation system into university EFL writing instruction, with a particular focus on writing performance and writing anxiety. Drawing on a quasi-experimental mixed-methods design, the findings demonstrate that sustained use of iWrite leads to significant improvements in writing accuracy, fluency, and lexical complexity, as well as marked reductions in learners’ writing anxiety, when compared with traditional teacher feedback. These results provide empirical support for the effectiveness of localized AWE systems in technology-enhanced EFL writing contexts.
From a theoretical perspective, the findings extend existing research on AWE-assisted writing by offering fine-grained evidence based on the CAF framework and by foregrounding the affective dimension of writing development. The differential effects observed across CAF dimensions suggest that automated feedback is particularly well suited to facilitating form-focused learning, while more complex aspects of syntactic development remain less responsive to short-term automated intervention. In addition, the significant reduction in writing anxiety lends support to Control–Value Theory by illustrating how enhanced perceived control and supportive feedback environments can positively shape learners’ emotional experiences in second-language writing.
Pedagogically, the study highlights the value of adopting a human–AI collaborative approach to EFL writing instruction. When employed as a formative tool, iWrite can effectively complement teacher feedback by handling routine linguistic evaluation and enabling repeated, low-stakes revision. Such an instructional configuration allows teachers to allocate more time and attention to higher-order concerns, including idea development, discourse organization, and rhetorical effectiveness. At the same time, the use of AWE systems may contribute to the creation of a more supportive and low-anxiety learning environment, thereby fostering sustained learner engagement in writing.
Despite these contributions, several limitations should be acknowledged. The relatively short duration of the intervention and the use of intact classes may limit the generalizability of the findings. Future research could employ longitudinal designs, involve participants from diverse institutional contexts, and explore the combined effects of automated and teacher feedback on higher-order writing development. Further investigation into learners’ long-term writing trajectories and self-regulatory behaviors would also enrich understanding of the pedagogical potential of AWE systems.
In conclusion, the present study underscores the promise of localized AWE systems such as iWrite in enhancing both the cognitive and affective dimensions of EFL writing. By integrating automated feedback within a carefully designed instructional framework, higher education institutions may better support learners’ writing development and well-being in increasingly technology-mediated learning environments.