This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Boundary Conditions for LLM-Generated Feedback in Primary Writing: An Educator-Aligned Evaluation and Design Considerations

1 Faculty of Science, Engineering and Built Environment, School of Information Technology, Deakin University, Burwood, VIC 3125, Australia
2 Commonwealth Scientific and Industrial Research Organisation (CSIRO), Research Way, Clayton, VIC 3168, Australia
3 Kinetic Education, 506 Nepean Hwy, Frankston, VIC 3199, Australia
* Authors to whom correspondence should be addressed.
Computers 2026, 15(6), 393; https://doi.org/10.3390/computers15060393 (registering DOI)
Submission received: 18 May 2026 / Revised: 10 June 2026 / Accepted: 13 June 2026 / Published: 18 June 2026

Abstract

Generative large language models (LLMs) are increasingly used to support writing feedback. However, the pedagogical safety and usefulness of LLM feedback for primary students remain under-evaluated. This study reports an educator-centered evaluation of GPT-4 Turbo for Year 5 narrative and persuasive writing in the context of an established online tutoring program. Using authentic student drafts paired with tutor feedback, we generated parallel LLM feedback via rubric-aligned prompting and compared the two feedback sources in a blinded, within-script design. Four experienced English specialists co-designed a six-dimensional rubric (clarity, specificity, helpfulness, feasibility, relevance, and overall effectiveness) and rated tutor versus LLM feedback for each script; their written reflections were analyzed thematically to surface boundary conditions and risk perceptions. Across dimensions, tutor feedback received slightly higher mean ratings, with the clearest descriptive advantage in perceived helpfulness; however, none of the differences remained statistically significant after Holm-Bonferroni correction. LLM feedback was often rated similarly to tutor feedback for clarity and feasibility but was frequently characterized as generic, surface-focused, and occasionally misaligned with the student draft, which increased verification effort and posed a risk of misleading learners if used without mediation. Synthesizing ratings and educator reflections, we identify the conditions under which LLM feedback is most appropriate (rapid first-pass support for routine structure and surface revision) and least appropriate (developmental judgment and context-sensitive guidance). We translate these findings into design requirements for teacher-in-the-loop primary writing feedback systems, including alignment to explicit pedagogical constructs, editable workflows, and safeguards that reduce unsupported feedback before release to students.
Keywords: large language models (LLMs); automated writing feedback; primary education; Human-AI collaboration; educator-centered evaluation
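The abstract reports that no per-dimension difference remained significant after Holm-Bonferroni correction. As a minimal sketch of how that step-down procedure works in general, the Python function below applies it to a list of p-values. The function name `holm_bonferroni`, the significance level of 0.05, and the six p-values are illustrative assumptions; they are not the study's data or analysis code.

```python
from typing import List


def holm_bonferroni(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm-Bonferroni step-down procedure.

    Returns one boolean per input p-value (in the original order):
    True if that hypothesis is rejected at family-wise level alpha.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            # Stop at the first failure; all larger p-values are retained as well.
            break
    return reject


# Hypothetical p-values, one per rubric dimension (NOT taken from the paper).
dimensions = ["clarity", "specificity", "helpfulness",
              "feasibility", "relevance", "overall effectiveness"]
illustrative_p = [0.20, 0.04, 0.012, 0.35, 0.09, 0.03]

significant = holm_bonferroni(illustrative_p)
for name, p, sig in zip(dimensions, illustrative_p, significant):
    print(f"{name}: p = {p:.3f}, significant after correction: {sig}")
```

With these made-up values, the smallest p-value (0.012) is compared with 0.05 / 6 ≈ 0.0083 and fails, so the procedure stops and no dimension is flagged. This shows how a difference that looks significant on its own at the 0.05 level can lose significance once six comparisons are corrected for jointly.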

Share and Cite

MDPI and ACS Style

Zhang, D.; Hoang, T.; Zhu, Y.; Wang, R.; Crouch, P.; Wang, Y. Boundary Conditions for LLM-Generated Feedback in Primary Writing: An Educator-Aligned Evaluation and Design Considerations. Computers 2026, 15, 393. https://doi.org/10.3390/computers15060393

Chicago/Turabian Style

Zhang, Dan, Thuong Hoang, Ye Zhu, Rui Wang, Paula Crouch, and Yi Wang. 2026. "Boundary Conditions for LLM-Generated Feedback in Primary Writing: An Educator-Aligned Evaluation and Design Considerations" Computers 15, no. 6: 393. https://doi.org/10.3390/computers15060393
