Systematic Review

Effects of AI-Assisted Feedback via Generative Chat on Academic Writing in Higher Education Students: A Systematic Review of the Literature

by Claudio Andrés Cerón Urzúa 1, Ranjeeva Ranjan 1,*, Eleazar Eduardo Méndez Saavedra 2, María Graciela Badilla-Quintana 3, Nancy Lepe-Martínez 4 and Andrew Philominraj 5

1 Department of Educational Foundations, Faculty of Educational Sciences, Universidad Católica del Maule, Talca 3460000, Chile
2 Faculty of Educational Sciences, Universidad Católica del Maule, Talca 3460000, Chile
3 Department of Curriculum, Assessment, and Technologies in Education, Faculty of Education, Universidad Católica de la Santísima Concepción, Concepción 4090541, Chile
4 Department of Diversity and Educational Inclusion, Faculty of Educational Sciences, Universidad Católica del Maule, Talca 3460000, Chile
5 Department of Languages, Faculty of Educational Sciences, Universidad Católica del Maule, Talca 3460000, Chile
* Author to whom correspondence should be addressed.
Educ. Sci. 2025, 15(10), 1396; https://doi.org/10.3390/educsci15101396
Submission received: 11 September 2025 / Revised: 13 October 2025 / Accepted: 15 October 2025 / Published: 18 October 2025

Abstract

The use of generative chat in education has become widespread over the last four years, raising many questions about its use and the effects of AI on learning. The aim of the current systematic review is to analyze the main effects of feedback delivered through generative chat on the production of academic texts by university students. This research is defined as a systematic review of the literature conducted according to the guidelines of the PRISMA statement. The search was carried out in three major international databases (Scopus, ERIC, and WoS), from which 12 articles were selected. The results highlight positive effects on university students' writing when generative chat is used as a means of providing feedback. Among the main results, it was observed that feedback via chat helps to improve aspects mainly associated with the structure and organization of texts, supports the proper use of grammatical conventions, and improves the fluency and cohesion of sentences, as well as the precision of ideas and vocabulary. In addition, the review identified other benefits, such as improved self-efficacy, self-regulation, proactivity, motivation, and reflection on writing, which promotes critical thinking not only about the text but also about the AI itself, while reducing anxiety and stress.

1. Introduction

Over the past decade, artificial intelligence has advanced significantly in its development, research, and mass adoption. This progress is especially evident in information processing, as it is now possible to understand and interpret human language more efficiently (Y. A. Ahmed & Sharo, 2023; Gutiérrez, 2023; Khlaif et al., 2023; Roumeliotis & Tselikas, 2023). These developments have driven rapid growth in the design and use of chatbots, ushering in a new phase of development far beyond anything previously seen, consolidated by the arrival of ChatGPT and the incorporation of the Transformer neural network architecture (Al-Amin et al., 2024).
In this context, various authors (Roumeliotis & Tselikas, 2023; Y. A. Ahmed & Sharo, 2023; Diego Olite et al., 2023) argue that the Transformer neural network and the successive versions of ChatGPT have gained market dominance due to their novel approach to neural networks, which enhanced natural language processing. ChatGPT stands out for its efficiency in handling long and complex inputs, delivering effective performance in both language generation and comprehension. Specifically, it can recognize linguistic patterns and correlations, providing coherent and contextually appropriate responses (Roumeliotis & Tselikas, 2023; Gutiérrez, 2023). ChatGPT has also improved natural language communication between different nodes, facilitating the interpretation and real-time processing of user requests (Rane, 2023).
The use of chatbots in education is growing exponentially, so it is essential to know and understand the effects of feedback delivered through the generative chats emerging in this field. There are several concerns surrounding the use of chatbots, particularly regarding the validity and effects of feedback on learning processes related to written text production among higher education students. In terms of limitations, validity, or errors in the use of conversational chat tools, various shortcomings have been identified, especially concerning the accuracy and quality of the information provided by conversational chatbots, which can often be linked to false information or so-called "hallucination" (Alkaissi & McFarlane, 2023; Meyer et al., 2023). It is important to note that, according to Albadarin et al. (2024), this "hallucination" phenomenon occurs when chat tools are used without pedagogical guidance, particularly in open user contexts, such as when students use conversational chatbots to support their learning.
According to some authors (Baidoo-Anu & Owusu Ansah, 2023; Gilson et al., 2023; Khan et al., 2023), the use of ChatGPT and similar models raises several issues when they are not embedded within a reliable, well-structured instructional plan. The inaccuracy and low reliability of results are attributed to factors such as insufficient training, poor-quality or inadequate data sets, and the way users formulate their prompts (Fuchs, 2023; Meyer et al., 2023; Wardat et al., 2023).
To mitigate these issues and enhance the effectiveness of feedback through chat tools, it is essential to train both teachers and students, strengthening their technological competencies and AI literacy. This training should include basic use as well as the ability to evaluate and critically interpret the information provided. In this context, it is important for schools to work with students to promote educational—not merely casual—uses of these tools (Tedre et al., 2023). In this regard, it is also valuable to distinguish between beginner and professional users (Cassany, 2024), since validity depends exclusively on the user, whether they are a student or a mediator, with greater responsibility falling on the teacher when chat tools are used for formative and pedagogical purposes.
Currently, a growing line of research focuses on the intersection of assessment and Artificial Intelligence (AI), due to the opportunities AI offers in supporting learning through formative assessment and feedback (Tempelaar et al., 2018; Ifenthaler & Greiff, 2021; Gašević et al., 2022). Regarding pedagogical use—specifically, using chat to provide effective feedback on students' written production—several areas for improvement exist. According to Liu (2024), using chat for feedback supports student writing and encourages greater engagement. More specifically, research findings show that automated feedback provides timely guidance for students' writing, particularly improving their linguistic abilities. Sarosa et al. (2020) argue that these tools should complement human instruction, not replace the teacher. Chatbots should be used to enhance and strengthen students' cognitive processes (Lucana & Roldan, 2023), not merely for information retrieval.
Within this framework, research has shown that AI-generated feedback is an effective tool for encouraging task completion and providing timely feedback on written texts (Herft, 2023). For some authors (Van Dis et al., 2023; Volante et al., 2023), these tools should be directed towards the development of higher-order thinking skills, rather than focusing solely on information gathering. Teachers must promote scientific literacy skills through pedagogical models that emphasize reflective and creative thinking. According to Cassany (1990), four methodological approaches to teaching written expression were prominent in the 20th century, each contributing to the development of higher-order skills. The first model involves mastery of grammatical knowledge and includes two sub-models: one focused on sentence construction and another on linguistic–discursive structures. The second model, based on language functions, is grounded in the idea that writing is a communicative tool that should be taught actively. The third model, content-based, treats form as a means to understand the topic, develop ideas, and produce written texts. Lastly, the fourth model—the process approach—assigns less importance to grammar and emphasizes the composition process, which includes idea generation, outlining, and drafting, as well as the evaluative acts of reviewing, correcting, and rewriting (Medina & Gajardo, 2018).
Given this scenario, and recognizing that each model focuses on important aspects of writing production (grammar, functionality, text, content, and process), an eclectic approach is necessary to understand the existing dynamism in this process. Therefore, incorporating a model that allows for the definition and categorization of writing characteristics is essential for evaluating text production. For this reason, we believe that Culham’s (2005) model complements this eclectic perspective of teaching and assessing writing well, as it proposes seven meaningful dimensions that help categorize content in selected studies and assess the impact of chatbot-based feedback on written production.
Keeping these discussions in mind, the objective of this systematic review is to analyze the effects of such feedback on academic text production at the university level. The question we aim to answer is: What are the effects of feedback delivered through the use of generative chats that are emerging in the educational sphere?

2. Methodology

The current study was developed in the framework of a systematic review, following a methodology that allows for the selection and discussion of findings on a particular subject of study through exhaustive search techniques (Letelier et al., 2005; Manterola et al., 2013). This study followed the guidelines of the PRISMA statement (Page et al., 2021) and the rigorous steps proposed by Gisbert and Bonfill (2004), including the formulation of questions and the use of a method to synthesize and select studies of interest to the researcher. The search was conducted from March to July 2025 in three international databases: Scopus, ERIC, and Web of Science (WoS). Several keywords were combined into three Boolean clauses within a single search string: "(TITLE-ABS-KEY: (feedback AND chat AND writing) OR (feedback AND "artificial intelligence" AND writing) OR (feedback AND chatgpt AND writing))". The time interval was set for articles published between 2021 and 2025 in order to analyze the state of the art in this field. The search was refined by selecting articles in the research areas of social sciences and the sub-areas of education and psychology. The inclusion criteria adopted were academic writing, feedback, generative chat, university context, and all languages of publication. Conventional feedback (by teachers), non-conversational chat, and non-university contexts were the exclusion criteria. The selection of articles was carried out in three consecutive stages, as shown in the PRISMA flow diagram (Figure 1): elimination of duplicates, reading of titles and abstracts, and reading of the full text, applying the inclusion and exclusion criteria.

2.1. Search Strategy

On 4 March 2025, we conducted exhaustive bibliographic searches in three digital databases: ERIC, Web of Science, and Scopus. These databases were selected to cover high-quality educational research indexed to rigorous, high-impact standards. ERIC was selected for its extensive coverage of the educational research literature; Web of Science for its high-quality multidisciplinary indexing and scientific citation; and Scopus for its broad academic coverage and scientific rigor. We designed Boolean search strings that combined multiple keywords: feedback, chat, "artificial intelligence", ChatGPT, and writing. This search strategy yielded a total of 924 records (before removing duplicates) across the three databases.
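For illustration only (this script is not part of the review protocol), the sketch below shows how the search string reported above can be assembled programmatically. The TITLE-ABS-KEY field tag is the Scopus syntax quoted in Section 2; the function name and structure are our own, and ERIC and WoS would require their own field tags.

    # Illustrative sketch: assembling the Boolean search string used in this review.
    # Scopus-style TITLE-ABS-KEY syntax; adapt the field tag for ERIC and WoS.
    PAIRED_TERMS = [
        ("feedback", "chat"),
        ("feedback", '"artificial intelligence"'),
        ("feedback", "chatgpt"),
    ]

    def build_query(pairs, shared_term="writing"):
        """Join (term1 AND term2 AND shared_term) clauses with OR inside one field tag."""
        clauses = [f"({a} AND {b} AND {shared_term})" for a, b in pairs]
        return "TITLE-ABS-KEY(" + " OR ".join(clauses) + ")"

    print(build_query(PAIRED_TERMS))
    # -> TITLE-ABS-KEY((feedback AND chat AND writing) OR
    #    (feedback AND "artificial intelligence" AND writing) OR
    #    (feedback AND chatgpt AND writing))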

2.2. Eligibility Criteria

For the systematic review, the research team first discussed the scope of the study, and the following inclusion and exclusion criteria were established to select the literature that would answer the research question.

2.2.1. Inclusion Criteria

In the inclusion criteria, the following principles were considered.
  • Studies had to be published between 2021 and 2025 (to select the latest research findings on the use of generative chat feedback and to know the state of the art).
  • Articles written in any language were accepted, with English being the most common.
  • Empirical research (quantitative, qualitative, or mixed methods) that explicitly incorporated the implementation of generative chat feedback as a main component in the context of academic writing assessment.
  • Only articles from scientific journals were included.

2.2.2. Exclusion Criteria

In the exclusion criteria, we considered the following:
  • Publications that were systematic reviews of the literature, conference proceedings and presentations, editorials, conceptual and theoretical articles, essays, or book chapters were excluded.
  • Some of the studies, though appearing in the search, were excluded as they did not focus on feedback through the use of generative chat (for example, we excluded studies on “menu-based chat,” “rule-based chat,” “voice chat,” “non-generative AI chat,” platform-based chat, and “hybrid chatbots”).

2.3. Selection and Coding

After conducting the initial searches and applying the subject-area refinements described above, we imported 500 scientific articles. An Excel spreadsheet was used to identify duplicates, and six duplicate records were removed, resulting in a set of 494 unique articles. The titles and abstracts of the remaining articles were then screened: two reviewers (the first two authors) independently screened each article against the inclusion criteria. After this reading, the exclusion criteria were applied and 476 of the 494 articles were excluded, leaving 18 articles to be read in full. After a complete reading of those 18 articles, 12 were selected that met all eligibility criteria and focused on feedback through the use of generative chat to improve writing (see Figure 1).
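For transparency, the following minimal sketch (our illustration; the review itself used an Excel spreadsheet) reproduces the de-duplication step in script form, assuming the merged database exports were saved to a CSV file with hypothetical "title" and "doi" columns.

    import pandas as pd

    # Hypothetical merged export of the database search results.
    records = pd.read_csv("merged_search_results.csv")

    # Normalize titles so trivially different duplicates (case, spacing) match.
    records["title_norm"] = records["title"].str.strip().str.lower()

    # Drop records sharing a DOI or a normalized title, keeping the first occurrence.
    unique = records.drop_duplicates(subset="doi").drop_duplicates(subset="title_norm")

    print(f"Imported: {len(records)}, unique after de-duplication: {len(unique)}")
    # With the counts reported above: Imported: 500, unique after de-duplication: 494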
After extracting these details, we analyzed the data using a qualitative coding approach based on Corbin and Strauss's (2014) grounded theory coding technique. A categorization matrix was used to identify open codes and axial categories. The open codes emerged from the content analysis of the selected articles, and the categorization was constructed based on the 6 + 1 model (Culham, 2005), which provides seven traits for evaluating writing: (1) adaptation to the communicative situation, (2) ideas, (3) personal voice, (4) word choice, (5) fluency and cohesion, (6) structure and organization, and (7) grammatical conventions. To determine the effects on writing, the open codes were grouped into these formal categories, since the model offers a set of theoretical categories for evaluating writing and measuring results. Positive effects outside the scope of writing, associated with socio-emotional aspects such as self-regulation and motivation, were coded conceptually.
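A compact way to picture the categorization matrix is as a mapping from open codes to Culham's (2005) traits. In the sketch below, the seven trait names come from the text above, while the example open codes are hypothetical placeholders, not codes taken from our matrix.

    # The seven traits of the 6 + 1 model (Culham, 2005), as listed above.
    CULHAM_TRAITS = {
        1: "Adaptation to the communicative situation",
        2: "Ideas",
        3: "Personal voice",
        4: "Word choice",
        5: "Fluency and cohesion",
        6: "Structure and organization",
        7: "Grammatical conventions",
    }

    # Hypothetical open codes from the content analysis, axially grouped by trait.
    OPEN_CODE_TO_TRAIT = {
        "improved coherence": 5,
        "expanded vocabulary": 4,
        "corrected sentence order": 7,
        "clearer essay outline": 6,
    }

    for code, trait in OPEN_CODE_TO_TRAIT.items():
        print(f"{code!r} -> trait {trait}: {CULHAM_TRAITS[trait]}")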

3. Results

In this section, the results are presented in two parts. The first part offers a general methodological characterization of the included studies; the second presents the studies in tabular form, organized according to Culham's model.

3.1. Years of Publication

Figure 2 shows that the selected publications appeared between 2024 and 2025; no articles on the subject were found before this date. Starting in 2022, ChatGPT brought generative chat into widespread use, so scientifically rigorous research on the topic only began after that year: studies took shape in 2023 and started to be published the following year.

3.2. Geographical Locations

Figure 3 and Figure 4 show the geographical distribution of the authors who contributed to the selected publications on academic writing feedback through generative chat in higher education. Most of the selected publications are of Asian origin (Figure 3), and the country with the highest number of publications on the subject is China, followed by the United States (Figure 4). The findings thus highlight a concentration of research in East Asia and North America, with limited participation from other regions and a total absence of Latin America, indicating a disparity in research activity and capacity and a need to deepen research in this field in the Latin American context.
Table 1 presents a chronological summary of the 12 articles that aligned with the objective of this systematic review, thereby meeting the stated eligibility criteria. The table also includes the name of the author, year of publication, research location, methodology, general findings, and the effects on writing, as well as other areas of improvement observed in students.
Table 1 shows that mixed-methods approaches predominate among the selected articles, with 7 out of the 12 studies employing this methodology. Meanwhile, 4 studies adopted a qualitative approach, and only one was purely quantitative. This trend can be explained by the research objectives of the selected studies and the need to incorporate designs that allow for the analysis of feedback effects through generative chat tools via interventions and controlled trials, while also considering the meanings participants attribute to their experiences. Furthermore, Table 1 reveals that in all 12 studies—regardless of their methodological approach or design—the findings consistently demonstrate that feedback provided through generative chat improves university students’ writing skills. In this regard, the effects are mostly positive, varying in scope depending on the study. Overall, feedback via chat contributes to improving the structure and organization of students’ texts, the correct use of grammatical conventions, the generation of ideas, and reflective practices related to writing and text improvement.

4. Discussion

From a methodological perspective, the selected studies are predominantly mixed studies in which the effects of feedback through generative chat can be demonstrated through experimental processes or experiences. A significant number of researchers (Wang et al., 2024; Sysoyev et al., 2024; Li et al., 2024; Lu et al., 2024; Banihashem et al., 2024; Jiang, 2025; Dinh, 2025) have chosen to address the topic using a mixed approach, as it facilitates the collection of observable and measurable evidence through controlled interventions and written artifacts—such as essays or reports—that can be evaluated by experts using rubrics or other instruments and subsequently quantified. This allows researchers to define "what" and "to what extent" students' writing skills improve. Additionally, when aligned with research objectives, it was possible to compare conventional feedback with generative chat feedback and/or hybrid approaches (Solak, 2024; Lu et al., 2024; Banihashem et al., 2024; Jiang, 2025; Alghannam, 2025). The mixed nature of these studies, by incorporating qualitative methods, especially interviews, enables the emergence of psychological categories such as motivation, self-regulation, self-efficacy, and reduced anxiety, along with increased participation (Sodiq & Rokib, 2024; Wang et al., 2024; Li et al., 2024; Solak, 2024; Lu et al., 2024; Banihashem et al., 2024). According to the 6 + 1 Traits Writing Model (Culham, 2005), there are seven key traits for evaluating writing: adaptation to the communicative situation, ideas, personal voice, word choice, fluency and cohesion, structure and organization, and grammatical conventions. Of the seven traits, only six can be identified in the research as positive effects resulting from the use of feedback via chat. The only trait not evidenced in the findings is adaptation to the communicative situation, which relates to textual appropriateness, including intent, audience awareness, and the physical presentation of the written work.
Out of the 12 selected articles, 10 studies show that feedback via generative chat notably improves aspects related to structure and organization. This type of feedback also supports the appropriate use of grammatical conventions and enhances the fluency and cohesion of written sentences. To a lesser extent, only a few of the 12 selected articles provide evidence that generative chat feedback improves the precision of ideas and vocabulary. This variation may be due to differing writing instruction models, in which priorities regarding what should be developed and assessed vary. Nevertheless, the prevailing trend reflects a process-oriented model of writing instruction with an integrated approach, combining cognitive, social, and linguistic elements. This unified framework supports the development of a global, programmatic, and semantic writing plan, encouraging learners to focus on the writing process and the writer's introspection through the use of generative chat tools, enabling a deeper understanding of what occurs during written learning (Marinkovich, 2006). Feedback via chat emphasizes the writer's long-term memory, the task environment, and the text under construction (Björk & Blomstand, 2000). It centers on a renewed cognitive dimension, prioritizing context, conditions, and the writer's purpose (Parodi, 2003), as well as a social dimension, emphasizing that content and context should be tied to the social purpose of the text (Marinkovich, 2006).
In this context, and being more specific about the effects that can be identified, we can say that feedback through the use of chat improves academic essay writing (Dinh, 2025; Banihashem et al., 2024; Sodiq & Rokib, 2024), assists in sentence construction by enhancing grammar and the structure of prepositions (Sodiq & Rokib, 2024; Sysoyev et al., 2024), and fosters creativity, enabling students to propose and support ideas with arguments (Sodiq & Rokib, 2024; Dinh, 2025; Sysoyev et al., 2024). Similarly, the use of chat for feedback serves as a source of inspiration and improves the content of the text (Li et al., 2024; Alghannam, 2025; Sysoyev et al., 2024), as well as expanding and diversifying vocabulary (Elkatmis, 2024; Dinh, 2025). It is particularly interesting to note that in some studies, feedback via generative chat enables reflective and metacognitive processes, which are essential in writing. These processes allow students to evaluate and become aware of errors or difficulties in their text and make adjustments accordingly (Wang et al., 2024; Solak, 2024; Jiang, 2025). Using chat as a feedback tool helps students organize and structure their writing, which in the medium term facilitates editing and revising (Wang et al., 2024). Reflection supports metacognitive skills and allows for self-regulation of the writing process, while also encouraging students to critically question the guidance offered by AI (Solak, 2024; Jiang, 2025).
Based on the above, the scientific evidence coincides with what has been stated so far regarding the impact of feedback through the use of chat. This tool can thus be described as an effective means of collecting information and providing timely feedback at scale, since it can cover a larger number of students at the same time, with the same intensity and precision as a teacher. When comparing the feedback provided by ChatGPT and by teachers on writing, Solak (2024) argues that teachers were much more understanding, guiding, and emotionally intelligent than ChatGPT. However, the feedback generated by ChatGPT was considered much more detailed and comprehensive than conventional feedback. In addition, AI increased active participation and reflection on the task.
Although there are differences between the two types of feedback, according to Lu et al. (2024), when ChatGPT is complemented by the teacher, students better understand the teacher's assessment criteria. This benefits students regardless of their level of writing proficiency, as it fosters an understanding of what will be evaluated and what is relevant, encourages participation, promotes the formation of judgments about chat comments, and supports independent thinking about revisions when reflective spaces and opportunities are provided. The study by Lu et al. (2024) revalues the role of the teacher and the effectiveness of the generative tool in feedback, also promoting a combined assessment approach. A similar impression is shared by Banihashem et al. (2024), who find that ChatGPT provides much more descriptive feedback, whereas peer feedback includes identification of the core problem in the essay; the authors therefore suggest the complementary use of ChatGPT and peers in the feedback process, as the two enhance each other and generate greater effects than when used separately. In a controlled essay experiment conducted by Jiang (2025), students who received hybrid feedback—that is, combined between AI and the teacher—obtained significantly higher writing scores than groups that received only monomodal feedback, whether by AI or conventional. Jiang's (2025) findings establish that pedagogical interaction is much richer, more dynamic, and more sustained when driven by teachers and AI together, as it leads to greater reflection and deeper learning in writing. Such a proposal could also relieve classroom congestion and improve time management, as a hybrid or combined model—whether with peers and AI or teachers and AI—would give teachers more time to focus on reflective and guiding aspects, as well as greater attention to students who are falling behind.
In this regard, chat-based feedback allows for reflection on the work and the correction of procedures to adjust the task (Schumacher & Ifenthaler, 2021) and continuous monitoring, establishing constant communication between the student and a tutor, which can positively stimulate student performance (Webb et al., 2018). The above is very important and demonstrates a level of self-management in students, which, for Wang et al. (2024), is a sign of responsibility towards their own learning with ChatGPT. In their study, Wang et al. (2024) show that most interviewees stated they critically reflected on their learning process and suggested that among the benefits of ChatGPT, receiving instant feedback allowed them to dedicate more time to practicing writing and making revisions.
On the other hand, among the positive effects, feedback through chat use enables students to improve other aspects such as self-efficacy, self-regulation, proactivity, motivation, sense of responsibility, and participation. Moreover, feedback through generative chat promotes active learning and reflection on writing, which supports critical judgment not only about the text but also about the AI, reducing anxiety and stress. In relation to the above, several studies selected in this systematic review (Sodiq & Rokib, 2024; Wang et al., 2024; Li et al., 2024; Solak, 2024; Jiang, 2025; Dinh, 2025) agreed that feedback through the use of chat supports self-regulation, which from an evaluative and metacognitive point of view enhances task development and the student's critical reflection on writing processes (A. Ahmed & Pollitt, 2010). A similar pattern appears for motivation: according to the selected articles, motivation for writing improves when students perceive the benefits of ChatGPT (Wang et al., 2024), and students consequently improve the quality of content and linguistic expression in their texts significantly (Li et al., 2024).
Finally, of the 12 selected articles, 3 report some form of disagreement in their results regarding the use of generative chat for feedback. This may be attributed to the fact that the results of these studies are based on participants' perceptions in the absence of a controlled intervention. For Kim et al. (2024), in a study conducted at Washington State University Vancouver in the United States, the teachers of the research participants stated that, upon reviewing the chat-generated feedback reports, they observed some false statements, incorrect lab procedures, and irrelevant claims in the submitted reports. A similar observation was made by Elkatmis (2024) in Turkey, where there was no consensus regarding the benefits of feedback through chat: those with a negative view of the experience argued that chat could make the human mind lazy, that the information obtained was unreliable, and that its versatility could pose a threat to humanity. Finally, Alghannam (2025), in a study in Saudi Arabia, identified weaknesses in chat-based feedback, as some statements were not entirely accurate in relation to the corresponding text and communicative comments were lacking; the participants considered that ChatGPT provided corrective feedback focused on lower-level linguistic points. It is important to analyze what these three investigations have in common. From a methodological standpoint, all three are qualitative in nature and based on the perceptions of students and teachers. This in itself is not the problem, as perceptions are a legitimate way to investigate the meanings that subjects assign to experiences. However, in these three works there is no controlled intervention; rather, they work with a bank of essays and reports in which students implement chat-based feedback in an inexperienced way, much as beginner users would. We are thus dealing with three studies in which the validity of the feedback is called into question. Even when a pedagogical use is assigned, if students are not prepared and teachers have not been trained, have not fed the data corpus, and have not supported students in formulating their prompts, the phenomenon of "hallucination" may occur (Fuchs, 2023; Wardat et al., 2023). One of the authors (Elkatmis, 2024) presents this as a limitation and weakness of the study, noting that the students' lack of knowledge regarding the pedagogical use of ChatGPT may have led to errors.
In this sense, it is valid to conduct a documentary study based on the evaluation of writing through texts developed and experiences carried out by the students themselves. The problem lies in the lack of rigor in these experiences: no time or resources are allocated to preparing a controlled essay task for the implementation of the experience, nor is sufficient effort made to prepare the participants, which creates limitations for this type of study and line of research (Dunn & Mulvenon, 2009). The study by Dinh (2025), another of the selected investigations but of a mixed nature, raises similar challenges regarding the use of generative chat for feedback: it proposes controlling the excessive dependency that the tool may generate and addressing the difficulty students face in interpreting the comments, which is related to their lack of preparation as users and affects the results of the research.
The validity of the information provided by generative chat remains a controversial topic that raises many questions. Artificial intelligence has been questioned, and its language and information delivery capabilities have been underestimated. However, this literature review has found that many of the criticisms of generative chat by students and teachers stem from two sources. From a research perspective, some studies lack scientific rigor in their interventions: without controlled pedagogical interventions, teachers and students are led to misuse AI, as some researchers request free use of the tool without a protocol. In addition, the users in these same studies are not trained and act as novice users, or at least the research does not detail whether the educational actors, teachers or students, were previously trained in the use of the tool and of prompts. The latter may act as a limitation in some studies, especially those that work with actors' perceptions, as participants are asked about their experience of an often inappropriate use of AI, without protocols or teaching scripts that would allow for scientific and pedagogical rigor.

5. Conclusions

It can be concluded that generative chat feedback represents a significant innovation in writing instruction, with the potential to improve students’ accuracy, autonomy, and motivation. However, its effectiveness depends on multiple factors: the quality of the system, the type of errors addressed, the level of student participation, the teacher’s skills, the training of the AI using a corpus, and the teacher’s mediating role. It is important to consider that much of the inaccuracy and low reliability is due to lack of user training, inadequate data input or corpus, and the way users present their prompts in the chat (Fuchs, 2023; Wardat & Al Ali, 2024).
However, certain elements must be taken into account when implementing feedback through generative chat, as there are limitations. Generative AI in the form of chat, in its current configuration, seems to replicate a focus on form, prioritizing technical correction over meaningful interaction; this favors the role of the teacher, who can make a difference in the pedagogical use of AI. Similarly, some studies have observed a limited presence of affective or emotional feedback, which can affect student motivation and raises pedagogical concerns about aspects in which AI may not be as effective as teachers and conventional feedback.
On the other hand, generative AI used as a feedback tool is positioned as a potential assistant in the review and correction process, although it requires human supervision to ensure the quality and relevance of the comments. The findings of this systematic review raise questions about the effects of AI on student writing, as research shows that students exhibit high levels of responsibility and time management but lower levels of critical self-monitoring. This observation coincides with studies that warn of the risk of superficial learning and loss of critical thinking when generative tools are used indiscriminately (Wardat & Al Ali, 2024; Mogavi et al., 2024). In this context, it is important to conclude that students need support in developing critical thinking skills that allow them to reevaluate their written texts, as well as the information they receive from AI, in order to refine their judgment and build criteria for accepting or rejecting information.
Artificial intelligence, through chatbots, is seen as a significant opportunity in the field of education, although for certain sectors it still represents a threat that could displace the role of teachers. However, the use of these tools—focused on guiding planning, enhancing the development of key skills in students, promoting their autonomy, and optimizing classroom time through assertive and formative mediation—paves the way for the construction of concrete methodological proposals. To this end, their incorporation must be supported by rigorous technical, pedagogical, and didactic planning, preventing their application from devolving into mere improvisation. Authors such as Liu (2024) and Albadarin et al. (2024) agree on the importance of active student participation in the review process. In the case of ChatGPT, it has been observed that students who use metacognitive strategies to critically evaluate the AI system’s responses tend to improve their writing skills more significantly (Zhang & Hyland, 2018).

Author Contributions

Conceptualization, C.A.C.U., E.E.M.S. and R.R.; methodology, C.A.C.U., E.E.M.S., A.P. and R.R.; formal analysis, C.A.C.U., E.E.M.S. and R.R.; investigation, C.A.C.U., M.G.B.-Q., A.P. and N.L.-M.; data curation, C.A.C.U., E.E.M.S., A.P. and N.L.-M.; writing—original draft preparation, C.A.C.U., E.E.M.S. and R.R.; writing—review and editing, C.A.C.U., A.P., M.G.B.-Q., N.L.-M. and R.R.; visualization, R.R., E.E.M.S. and M.G.B.-Q.; supervision, C.A.C.U., A.P., M.G.B.-Q., N.L.-M. and R.R.; project administration, C.A.C.U., A.P., M.G.B.-Q., N.L.-M. and R.R.; funding acquisition, C.A.C.U. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by FOVI Project grant N° 240041 of the National Research and Development Agency of Chile (ANID).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ahmed, A., & Pollitt, A. (2010). The support model for interactive assessment. Assessment in Education: Principles, Policy & Practice, 17(2), 133–167.
  2. Ahmed, Y. A., & Sharo, A. (2023). On the education effect of ChatGPT: Is AI ChatGPT to dominate education career profession? In 2023 international conference on intelligent computing, communication, networking and services (ICCNS) (pp. 79–84). IEEE.
  3. Al-Amin, M., Ali, M. S., Salam, A., Khan, A., Ali, A., Ullah, A., Alam, N., & Chowdhury, S. K. (2024). History of generative Artificial Intelligence (AI) chatbots: Past, present, and future development. arXiv.
  4. Albadarin, Y., Saqr, M., Pope, N., & Tukiainen, M. (2024). A systematic literature review of empirical research on ChatGPT in education. Discover Education, 3, 60.
  5. Alghannam, M. (2025). Artificial intelligence as a provider of feedback on EFL student compositions. World Journal of English Language, 15(2), 161.
  6. Alkaissi, H., & McFarlane, S. I. (2023). Artificial hallucinations in ChatGPT: Implications in scientific writing. Cureus, 15(2), e35179.
  7. Baidoo-Anu, D., & Owusu Ansah, L. (2023). Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. Journal of AI, 7(1), 52–62.
  8. Banihashem, S. K., Taghizadeh Kerman, N., Noroozi, O., Moon, J., & Drachsler, H. (2024). Feedback sources in essay writing: Peer-generated or AI-generated feedback? International Journal of Educational Technology in Higher Education, 21, 23.
  9. Björk, L., & Blomstand, I. (2000). La escritura en la enseñanza secundaria. Los procesos del pensar y del escribir. Graó.
  10. Cassany, D. (1990). Enfoques didácticos para la enseñanza de la expresión escrita. Culture and Education, 2(6), 63–80.
  11. Cassany, D. (2024). (Enseñar a) leer y escribir con inteligencias artificiales generativas: Reflexiones, oportunidades y retos. Enunciación, 29(2), 320–336.
  12. Corbin, J., & Strauss, A. (2014). Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage Publications.
  13. Culham, R. (2005). 6 + 1 traits of writing: The complete guide for the primary grades. Scholastic Inc.
  14. Diego Olite, F. M., Morales Suárez, I. d. R., & Vidal Ledo, M. J. (2023). Chat GPT: Origen, evolución, retos e impactos en la educación. Educación Médica Superior, 37(2), e3876. Available online: http://scielo.sld.cu/scielo.php?script=sci_arttext&pid=S0864-21412023000200016&lng=es&tlng=es (accessed on 12 June 2025).
  15. Dinh, C. T. (2025). Undergraduate English majors' views on ChatGPT in academic writing: Perceived vocabulary and grammar improvement. FWU Journal of Social Sciences, 19(1), 1–11.
  16. Dunn, K. E., & Mulvenon, S. W. (2009). A critical review of research on formative assessment: The limited scientific evidence of the impact of formative assessment in education. Practical Assessment, Research & Evaluation, 14(7), 7.
  17. Elkatmis, M. (2024). Chat GPT and creative writing: Experiences of master's students in enhancing. International Journal of Contemporary Educational Research, 11(3), 321–336.
  18. Fuchs, K. (2023). Exploring the opportunities and challenges of NLP models in higher education: Is ChatGPT a blessing or a curse? Frontiers in Education, 8, 1166682.
  19. Gašević, D., Greiff, S., & Shaffer, D. (2022). Towards strengthening links between learning analytics and assessment: Challenges and potentials of a promising new bond. Computers in Human Behavior, 134, 107304.
  20. Gilson, A., Safranek, C. W., Huang, T., Socrates, V., Chi, L., Taylor, R. A., & Chartash, D. (2023). How does ChatGPT perform on the United States Medical Licensing Examination (USMLE)? The implications of large language models for medical education and knowledge assessment. JMIR Medical Education, 9, e45312.
  21. Gisbert, J. P., & Bonfill, X. (2004). ¿Cómo realizar, evaluar y utilizar revisiones sistemáticas y metaanálisis? Gastroenterología y Hepatología, 27(3), 129–149.
  22. Gutiérrez, J. (2023, January 27). En solo cinco días, Chat GPT-3 consiguió un millón de usuarios. La Jornada. Available online: https://www.jornada.com.mx/2023/01/27/economia/014n1eco (accessed on 20 August 2025).
  23. Herft, A. (2023). A teacher's prompt guide to ChatGPT aligned with "what works best". Department of Education, New South Wales, Sydney.
  24. Ifenthaler, D., & Greiff, S. (2021). Leveraging learning analytics for assessment and feedback. In J. Liebowitz (Ed.), Online learning analytics (pp. 1–18). Auerbach Publications.
  25. Jiang, Y. (2025). Interaction and dialogue: Integration and application of artificial intelligence in blended mode writing feedback. The Internet and Higher Education, 64, 100975.
  26. Khan, R. A., Jawaid, M., Khan, A. R., & Sajjad, M. (2023). ChatGPT—Reshaping medical education and clinical management. Pakistan Journal of Medical Sciences, 39(2), 605–607.
  27. Khlaif, Z. N., Mousa, A., Hattab, M. K., Itmazi, J., Hassan, A. A., Sanmugam, M., & Ayyoub, A. (2023). The potential and concerns of using AI in scientific research: ChatGPT performance evaluation. JMIR Medical Education, 9, e47049.
  28. Kim, D., Majdara, A., & Olson, W. (2024). A pilot study inquiring into the impact of ChatGPT on lab report writing in introductory engineering labs. International Journal of Technology in Education (IJTE), 7(2), 259–289.
  29. Letelier, L. M., Manríquez, J. J., & Rada, G. (2005). Revisiones sistemáticas y metaanálisis: ¿Son la mejor evidencia? Revista Médica de Chile, 133(2), 246–249.
  30. Li, H., Wang, Y., Luo, S., & Huang, C. (2024). The influence of GenAI on the effectiveness of argumentative writing in higher education: Evidence from a quasi-experimental study in China. Journal of Asian Public Policy, 18(2), 405–430.
  31. Liu, H. (2024). A systematic review of automated writing evaluation feedback: Validity, effects and students' engagement. Language Teaching Research Quarterly, 45, 86–105.
  32. Lu, Q., Yao, Y., Xiao, L., Yuan, M., Wang, J., & Zhu, X. (2024). Can ChatGPT effectively complement teacher assessment of undergraduate students' academic writing? Assessment & Evaluation in Higher Education, 49(5), 616–633.
  33. Lucana, Y. E., & Roldan, W. L. (2023). Chatbot basado en inteligencia artificial para la educación escolar. Horizontes Revista De Investigación En Ciencias De La Educación, 7(29), 1580–1592.
  34. Manterola, C., Astudillo, P., Arias, E., Claros, N., & Grupo MINCIR (Metodología e Investigación en Cirugía). (2013). Revisiones sistemáticas de la literatura. Qué se debe saber acerca de ellas. Cirugía Española, 91(3), 149–155.
  35. Marinkovich, J. (2006). La escritura como proceso. Frasis.
  36. Medina, A., & Gajardo, A. (2018). Pruebas de comprensión lectora y producción de textos (CL-PT): 5° a 8° básico. Ediciones UC.
  37. Meyer, J. G., Urbanowicz, R. J., Martin, P. C. N., O'Connor, K., Li, R., Peng, P.-C., Bright, T. J., Tatonetti, N., Won, K. J., Gonzalez-Hernandez, G., & Moore, J. H. (2023). ChatGPT and large language models in academia: Opportunities and challenges. BioData Mining, 16, 20.
  38. Mogavi, H., Chen, Y., & Lee, S. (2024). Generative AI and critical thinking in higher education. Journal of Learning Analytics, 11(1), 112–130.
  39. Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., … Alonso-Fernández, S. (2021). Declaración PRISMA 2020: Una guía actualizada para la publicación de revisiones sistemáticas. Revista Española de Cardiología, 74(9), 790–799.
  40. Parodi, G. (2003). Relaciones entre la lectura y escritura: Una perspectiva cognitiva discursiva. Ediciones Universitarias de Valparaíso.
  41. Rane, N. (2023). ChatGPT and similar generative artificial intelligence (AI) for smart industry: Role, challenges and opportunities for industry 4.0, industry 5.0 and society 5.0. Innovations in Business and Strategic Management, 2(1), 10–17.
  42. Roumeliotis, K. I., & Tselikas, N. D. (2023). ChatGPT and Open-AI models: A preliminary review. Future Internet, 15(6), 192.
  43. Sarosa, M., Kusumawardani, M., Suyono, A., & Wijaya, M. H. (2020). Developing a social media-based chatbot for English learning. IOP Conference Series: Materials Science and Engineering, 732(1), 012074.
  44. Schumacher, C., & Ifenthaler, D. (2021). Investigating prompts for supporting students' self-regulation—A remaining challenge for learning analytics approaches? The Internet and Higher Education, 49, 100791.
  45. Sodiq, S., & Rokib, M. (2024). Indonesian students' use of Chat Generative Pre-trained Transformer in essay writing practices. International Journal of Evaluation and Research in Education (IJERE), 13(4), 2698–2706.
  46. Solak, E. (2024). Examining writing feedback dynamics from ChatGPT AI and human educators: A comparative study. Pedagogika-Pedagogy, 96(7), 955–969.
  47. Sysoyev, P. V., Filatov, E. M., Khmarenko, N. I., & Murunov, S. S. (2024). Teacher vs. AI: A comparison of the quality of teacher-provided and generative AI feedback in assessing students' written creative work (in Russian). Perspectives of Science and Education, 5(71), 694–712.
  48. Tedre, M., Kahila, J., & Vartiainen, H. (2023). Exploration on how co-designing with AI facilitates critical evaluation of ethics of AI in craft education. In E. Langran, P. Christensen, & J. Sanson (Eds.), Proceedings of the society for information technology & teacher education international conference (pp. 2289–2296). Association for the Advancement of Computing in Education (AACE). Available online: https://www.learntechlib.org/primary/p/222124/ (accessed on 23 June 2025).
  49. Tempelaar, D. T., Rienties, B., Mittelmeier, J., & Nguyen, Q. (2018). Student profiling in a dispositional learning analytics application using formative assessment. Computers in Human Behavior, 78, 408–420.
  50. Van Dis, E. A., Bollen, J., Zuidema, W., Van Rooij, R., & Bockting, C. L. (2023). ChatGPT: Five priorities for research. Nature, 614(7947), 224–226.
  51. Volante, L., DeLuca, C., & Klinger, D. A. (2023). Leveraging AI to enhance learning. Phi Delta Kappan, 105(1), 40–45.
  52. Wang, C., Li, Z., & Bonk, C. (2024). Understanding self-directed learning in AI-assisted writing: A mixed methods study of postsecondary learners. Computers and Education: Artificial Intelligence, 6, 100247.
  53. Wardat, Y., & Al Ali, R. (2024). How ChatGPT will shape the teaching learning landscape in future? Journal of Educational and Social Research, 14(1), 47–65.
  54. Wardat, Y., Tashtoush, M. A., AlAli, R., & Jarrah, A. M. (2023). ChatGPT: A revolutionary tool for teaching and learning mathematics. Eurasia Journal of Mathematics, Science and Technology Education, 19(7), em2286.
  55. Webb, M. E., Prasse, D., Phillips, M., Kadijevich, D. M., Angeli, C., Strijker, A., Carvalho, A. A., Andresen, B. B., Dobozy, E., & Laugesen, H. (2018). Challenges for IT-enabled formative assessment of complex 21st century skills. Technology, Knowledge and Learning, 23(3), 441–456.
  56. Zhang, Z., & Hyland, K. (2018). Student engagement with teacher and automated feedback on L2 writing. Assessing Writing, 36, 90–102.
Figure 1. PRISMA flow diagram.
Figure 2. Years of publication.
Figure 3. Geographic distribution of authors of selected studies by continent.
Figure 4. Geographical distribution of authors of selected studies by country.
Table 1. Summary of selected articles.

(1) Sodiq and Rokib (2024), Indonesia.
Methodology: Quantitative approach using Likert-scale surveys; N = 303 students.
Results: Most students who used ChatGPT improved their essay writing and boosted their confidence. Students perceived ChatGPT as a valuable resource for enhancing grammar and sentence structure, fostering creativity, and developing solid ideas and arguments.
Aspects improved (Culham, 2005): 2. Ideas (2.1 accuracy and variety: development of ideas); 3. Voice or personal style (3.1 expressive capacity: essay writing style, creativity, and originality); 4. Word choice (4.2 varied vocabulary: expanded vocabulary); 5. Fluency and cohesion (5.1 ideas flow naturally: improved coherence; 5.2 use of connectors: improved cohesion); 7. Grammatical conventions (7.1 syntax: sentence structure).
Other areas of improvement: Self-efficacy.

(2) Wang et al. (2024), USA.
Methodology: Mixed methods using surveys and semi-structured interviews; N = 384 students.
Results: ChatGPT was used for idea generation. Student motivation improved as students perceived the benefits of using ChatGPT. Most participants demonstrated a strong sense of responsibility for their own learning and stated that they engaged in critical reflection on their learning process.
Aspects improved (Culham, 2005): 2. Ideas (2.1 accuracy and variety: depth of written expression); 3. Voice or personal style (3.1 expressive capacity: essay language); 5. Fluency and cohesion (5.1 ideas flow naturally: improved writing; 5.2 use of connectors: cohesion and writing structure); 6. Structure and organization (6.1 relevant text structure: context, editing, and revision); 7. Grammatical conventions (7.1 syntax: correcting sentence order).
Other areas of improvement: Self-regulation and proactivity; motivation; reduced anxiety and stress.

(3) Sysoyev et al. (2024), Russia.
Methodology: Mixed methods using written essays, rubrics, and statistical tests; N = 350 students.
Results: ChatGPT is comparable to the teacher on criteria such as content of the work, organization and structure, validation of ideas and arguments, and originality of the essay. ChatGPT outperformed the teacher in aspects such as language use and essay originality.
Aspects improved (Culham, 2005): 3. Voice or personal style (3.1 expressive capacity: essay language and depth of written expression); 4. Word choice (4.1 precise vocabulary: word selection); 6. Structure and organization (6.1 relevant text structure: introduction, main body, and conclusion); 7. Grammatical conventions (7.1 syntax: sentence structure; 7.3 spelling: accuracy).
Other areas of improvement: None reported.

(4) Li et al. (2024), China.
Methodology: Mixed quasi-experimental approach with interventions; use of a rubric and statistical tests (pre-/post-test); N = 61 students.
Results: Chat improves the quality of content and linguistic expression and has a positive impact by providing personalized feedback. It enhances motivation to write.
Aspects improved (Culham, 2005): 3. Voice or personal style (3.1 expressive capacity: language); 4. Word choice (4.1 precise vocabulary: word accuracy); 6. Structure and organization (6.1 relevant text structure: content and academic style); 7. Grammatical conventions (7.1 syntax: sentence fluency; 7.2 morphosyntax: correcting deficiencies in linguistic expression).
Other areas of improvement: Self-regulation; motivation.

(5) Solak (2024), China.
Methodology: Phenomenological approach; use of essays and closed-questionnaire content analysis; N = 15 students.
Results: When providing feedback, teachers were more empathetic and guiding and used emotional intelligence. In contrast, the chatbot provided more detailed and comprehensive feedback and promoted engagement, reflective learning, and the construction of diverse knowledge.
Aspects improved (Culham, 2005): 5. Fluency and cohesion (5.1 ideas flow naturally: coherence; 5.2 use of connectors: cohesion); 6. Structure and organization (6.1 relevant structure: content improvement).
Other areas of improvement: Self-efficacy; self-regulation; motivation; vocabulary reflection.

(6) Lu et al. (2024), China.
Methodology: Mixed approach; academic summaries assessed with scoring scales, statistical tests, and interviews; N = 46 students.
Results: ChatGPT can be used to complement teacher evaluation. It fosters a deeper understanding of teacher assessments, encourages students to make judgements about the feedback they receive, and promotes independent thinking regarding revisions.
Aspects improved (Culham, 2005): 2. Ideas (2.1 accuracy and variety: new ideas that improved writing quality, p. 623).
Other areas of improvement: Reflection on feedback and writing.

(7) Banihashem et al. (2024), Netherlands.
Methodology: Mixed exploratory approach using essays, content analysis, and statistical tests; N = 74 students.
Results: ChatGPT provided more descriptive feedback. In contrast, the students contributed information that helped identify the core problem in the essay.
Aspects improved (Culham, 2005): 5. Fluency and cohesion (5.1 ideas flow naturally: quality of writing; 5.2 use of connectors: essay coherence); 6. Structure and organization (6.1 relevant text structure: overall essay structure).
Other areas of improvement: None reported.

(8) Kim et al. (2024), USA.
Methodology: Qualitative approach; laboratory reports were reviewed (N = 28) using a rubric, plus a focus group; N = 7 students.
Results: Implementing ChatGPT in the revision process improves the quality of engineering students' lab reports due to a better understanding of the genre. However, using ChatGPT also led students to make false claims, describe incorrect lab procedures, or include overly general statements.
Aspects improved (Culham, 2005): 2. Ideas (2.1 accuracy and variety: idea generation); 4. Word choice (4.1 precise vocabulary: concise language); 6. Structure and organization (6.1 relevant text structure: editing the report).
Other areas of improvement: None reported.

(9) Elkatmis (2024), Turkey.
Methodology: Qualitative approach; semi-structured interviews; N = 16 students.
Results: Students lacked knowledge about the pedagogical use of ChatGPT, and both positive and negative perceptions emerged. The positive view holds that ChatGPT improves writing skills and vocabulary, offers different perspectives, and makes the process more enjoyable. The negative view argues that it may make the mind lazy, that its information is unreliable, and that its versatility could pose a threat to humanity.
Aspects improved (Culham, 2005): 4. Word choice (4.2 varied vocabulary: improves vocabulary); 5. Fluency and cohesion (5.1 ideas flow naturally: speeds up writing); 6. Structure and organization (6.1 relevant text structure: improves text quality and effectiveness and helps organize information).
Other areas of improvement: None reported.

(10) Jiang (2025), China.
Methodology: Mixed quasi-experimental approach with intervention (conventional feedback, AI feedback, and combined feedback); N = 86 students.
Results: The combination of teacher and AI feedback significantly improved writing skills: the group that received hybrid feedback scored significantly higher in writing than the groups that received monomodal feedback. Combined feedback enhances interaction, encourages deeper reflection, and improves students' writing.
Aspects improved (Culham, 2005): 4. Word choice (4.1 precise vocabulary: language appropriateness); 6. Structure and organization (6.1 relevant text structure: clarity of headings, depth of argumentative development); 7. Grammatical conventions (7.1 syntax: sentence and phrase correction).
Other areas of improvement: Responsibility; active learning; effective participation; self-efficacy.

(11) Alghannam (2025), Saudi Arabia.
Methodology: Document analysis; essays were evaluated and their content coded; N = 29 students.
Results: There are weaknesses in the use of ChatGPT for feedback. The comments were imprecise in relation to the text and mainly focused on content related to the message and emotion.
Aspects improved (Culham, 2005): 2. Ideas (2.1 accuracy and variety: clarity); 3. Voice or personal style (3.1 expressive capacity: integrity); 6. Structure and organization (6.1 relevant text structure: organization and content); 7. Grammatical conventions (7.3 spelling: literal and precise).
Other areas of improvement: None reported.

(12) Dinh (2025), Vietnam.
Methodology: Mixed-methods approach using surveys, reflective journals, and semi-structured interviews; N = 31 students.
Results: Quantitative results revealed significant improvements in students' perceptions regarding vocabulary accuracy, relevance, and depth. Qualitative analysis identified benefits such as vocabulary enrichment, improved grammatical accuracy, and increased confidence in academic writing.
Aspects improved (Culham, 2005): 4. Word choice (4.1 precise vocabulary: concise; 4.2 varied vocabulary: vocabulary improvement); 5. Fluency and cohesion (5.1 ideas flow naturally: clearer writing; 5.2 use of connectors: cohesion, helps identify and eliminate redundancy); 6. Structure and organization (6.1 relevant text structure: overall organization); 7. Grammatical conventions (7.1 syntax: sentence structure; 7.2 morphosyntax: word selection).
Other areas of improvement: Self-efficacy.