Peer-Review Record

AI-Powered Prompt Engineering for Education 4.0: Transforming Digital Resources into Engaging Learning Experiences

Educ. Sci. 2025, 15(12), 1640; https://doi.org/10.3390/educsci15121640
by Paulo Serra 1,2,* and Ângela Oliveira 1,3,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 10 October 2025 / Revised: 28 November 2025 / Accepted: 30 November 2025 / Published: 5 December 2025
(This article belongs to the Special Issue Supporting Student Engagement in Education 4.0 Environments)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. The type of research, year, techniques employed, sample size, and data analysis software should be clearly presented in the abstract section.
2. On line 36, the term “transformation” could be replaced with “innovation”, which may be more appropriate in this context.
3. In the Introduction section, smoother transitions should be established between sentences to enhance coherence.
4. Under the “Related Works” heading, studies from the literature should be presented directly. This section should appear as a separate subheading following the Introduction.
5. The Research Question, Research Strategy, and Data Extraction should be included in the Methods section.
6. In the final paragraphs of the Introduction, the significance and purpose of the study should be explicitly stated.
7. The research questions should follow the statement of the research aim.
8. In the “Inclusion Criteria” section, provide a justification for selecting these criteria to ensure a more objective and transparent perspective.
9. The “Research Strategy” section should not only list keywords or database names. This part should also include details such as the total number of articles found, the number deemed relevant to the study, and the selection process. Therefore, subsections 2.3 and 2.4 should be merged.
10. In qualitative research, clearly explain how validity and reliability protocols were implemented and how the necessary conditions were ensured.
11. Descriptions related to figures and tables should be placed directly below the corresponding figure or table. For example, on page 6, Figures 3 and 4 are discussed, but the graphs appear later. Presenting the explanation before the figures is not an appropriate approach.
12. The Discussion section is rather weak for an academic article and should be strengthened by incorporating more recent and relevant sources. Additionally, a separate Conclusion section should be created to summarise the main findings and implications of the study.

Author Response

Thank you very much for your positive feedback on our manuscript. We are pleased that you consider our work a relevant contribution to the field of artificial intelligence in education. We have carefully reviewed the article and incorporated the necessary changes as suggested.

Q1. The type of research, year, techniques employed, sample size, and data analysis software should be clearly presented in the abstract section.

AQ1. The abstract was revised to make explicit the type of study (systematic review), the method (PRISMA), the data sources, and the number of studies analysed — essential elements of methodological rigor that had been missing.

Q2. On line 36, the term “transformation” could be replaced with “innovation”, which may be more appropriate in this context.

AQ2. The change has been made.

Q3. In the Introduction section, smoother transitions should be established between sentences to enhance coherence.

AQ3. The change has been made.

Q4. Under the “Related Works” heading, studies from the literature should be presented directly. This section should appear as a separate subheading following the Introduction.

Q5. The Research Question, Research Strategy, and Data Extraction should be included in the Methods section.

Q6. The research questions should follow the statement of the research aim.

Q7. In the “Inclusion Criteria” section, provide a justification for selecting these criteria to ensure a more objective and transparent perspective.

Q8. The “Research Strategy” section should not only list keywords or database names. This part should also include details such as the total number of articles found, the number deemed relevant to the study, and the selection process. Therefore, subsections 2.3 and 2.4 should be merged.

AQ4–AQ8. The change was made in accordance with the PRISMA protocol and the reviewer's suggestions.

Q9. In the final paragraphs of the Introduction, the significance and purpose of the study should be explicitly stated.

AQ9. The change was made in accordance with the reviewer's suggestions.

Q10. In qualitative research, clearly explain how validity and reliability protocols were implemented and how the necessary conditions were ensured.

AQ10. To ensure validity and reliability throughout the review process, the PRISMA 2020 protocol was strictly followed. The search strategy, inclusion and exclusion criteria, and data extraction procedures were defined a priori and documented in a detailed protocol. Two reviewers independently screened the titles, abstracts and full texts, resolving any discrepancies by consensus, to strengthen inter-rater reliability. Data extraction was carried out in a manner that ensured the consistency of the process. The methodological quality of the included studies was assessed to ensure that the synthesis is based on credible and methodologically robust evidence.

Q11. Descriptions related to figures and tables should be placed directly below the corresponding figure or table. For example, on page 6, Figures 3 and 4 are discussed, but the graphs appear later. Presenting the explanation before the figures is not an appropriate approach.

AQ11. The change was made in accordance with the reviewer's suggestions.

Q12. The Discussion section is rather weak for an academic article and should be strengthened by incorporating more recent and relevant sources. Additionally, a separate Conclusion section should be created to summarise the main findings and implications of the study.

AQ12. Although we acknowledge the relevance of the suggestion, the discussion is based on the analysis of the articles included in the PRISMA systematic review referring to the period 2023–2025. In accordance with the protocol, no additional sources should be incorporated at this stage. However, a specific subsection containing the study’s conclusions has been added.

Thank you for making this article much more complete and accurate. All the changes made are reflected in the attached article.

Reviewer 2 Report

Comments and Suggestions for Authors

I am providing my feedback in the attached document.

Comments for author File: Comments.pdf

Author Response

Thank you very much for your positive feedback on our manuscript. We are pleased that you consider our work a relevant contribution to the field of artificial intelligence in education. We have carefully reviewed the article and incorporated the necessary changes as suggested.

Q1. The topic discussed in this paper is interesting and relevant. However, I would suggest that the authors consider going beyond a literature review and move toward a more empirical or developmental study. Conducting an experiment or designing a practical model would make the paper more substantial and aligned with the expectations of a Q1 journal. This approach could also provide stronger evidence and a clearer contribution to both theory and practice.

AQ1. We appreciate the reviewer’s comment and fully agree that incorporating an empirical or developmental dimension would make the study more robust and relevant to both theory and practice. At this stage, the article focused on synthesising and structuring existing evidence through the PRISMA protocol, with the aim of establishing a solid conceptual foundation. However, in a future phase of the research, we plan to empirically validate the proposed prompt-engineering model, including experimental implementations and user testing to evaluate its impact on learning, usability, and pedagogical effectiveness.
As the researchers are also teacher trainers, several continuing professional development initiatives on AI in education are planned, through which the model will be applied and tested in real classroom contexts.

Q2. In the introduction section, the authors have provided a clear background and identified the research gap between this study and previous works. However, upon reviewing several related references, it appears that similar studies have been conducted before. Therefore, the current manuscript does not yet sufficiently highlight its novelty or specific contribution compared to prior research. I would recommend that the authors clarify what makes this study distinct either in terms of its context, methodology, or theoretical approach to better emphasize its originality and added value to the field.

AQ2. We thank the reviewer for this valuable comment and agree that the study’s originality and specific contribution needed to be more explicitly articulated. In the revised version, the end of the Introduction section has been rewritten to clearly highlight the distinctive features of this work: (i) the PRISMA-based systematic approach applied to a recent and focused period (2023–2025) characterised by the adoption of LLMs in education; (ii) the proposal of a three-level typology of prompt use (explicit, implicit, and absent); (iii) the development of a methodological model for prompt engineering aligned with the principles of Education 4.0; and (iv) the implementation of an innovative strategy that embeds a prompt within a static digital educational resource, transforming it into an interactive and adaptive learning resource. These revisions make the theoretical, methodological, and practical added value of the paper explicit and clarify its original contribution compared with previous studies.

Q3. The methodology section demonstrates a proper use of the PRISMA technique, clearly outlining the article selection process. However, the classification of students as the study’s subject appears too broad, as AI use may differ across educational levels. The authors should clarify or justify the specific target group to ensure contextual accuracy. Additionally, the rationale for selecting ChatGPT among other AI tools is not explained. Providing this justification would enhance the study’s methodological transparency.

AQ3. Regarding PRISMA, the target audience was identified from the reviewed studies and is presented in the table (Appendix A), illustrating its scope. The proposed strategy can be applied to any target group by simply adapting or modifying the prompt in the <target_age_group> tag, as the AI Agent automatically adjusts its style according to the age range. The choice of AI Agent is supported by the frequency of its occurrence across the studies listed in the “Platform/Software” column of the table. To further substantiate this, an explanatory note grounded in scientific literature was added to the article, following the reviewer’s valuable suggestion.
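
To illustrate the mechanism described in AQ3, the sketch below shows how a prompt template might expose the <target_age_group> tag as its single point of adaptation. This is a minimal, hypothetical example: the tag name comes from the response above, while the surrounding template text and the function name are illustrative assumptions, not the authors' actual prompt.

```python
# Minimal sketch of a parameterised prompt template (illustrative only).
# The <target_age_group> tag mirrors the strategy described above; the
# remaining template text and the function name are hypothetical.

PROMPT_TEMPLATE = """You are a tutoring assistant embedded in a digital learning resource.
<target_age_group>{age_group}</target_age_group>
Adapt your vocabulary, tone, and depth of explanation to this age range
when answering questions about the resource."""

def build_prompt(age_group: str) -> str:
    """Return the prompt with the target age group filled in."""
    return PROMPT_TEMPLATE.format(age_group=age_group)

# Swapping the tag's value retargets the same resource to a new audience.
print(build_prompt("10-12 years"))
print(build_prompt("undergraduate students"))
```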

Q4. In the discussion section, the authors provide a well-developed explanation of strategies for effective prompt construction when using AI. This part offers valuable insights into how carefully designed prompts can optimize AI utilization and improve the quality of generated outputs. The authors also present several examples of prompt usage across different generative AI tools, which enriches the practical dimension of the discussion. To further strengthen this section, the authors might consider connecting these examples to relevant pedagogical or technological frameworks to highlight their theoretical significance.

AQ4. We appreciate the reviewer’s comment and the positive assessment of this section. We agree that linking the examples of prompts to relevant pedagogical or technological frameworks could strengthen the theoretical grounding and enhance the relevance of the discussion. However, we consider that such integration would, at this stage, be premature, as the strategies and examples presented have not yet been tested in real classroom contexts. Training activities with teachers, framed within an action research methodology, are planned. These will allow for the observation and analysis of the practical application of the proposed prompt design strategies. From these experiences, it will be possible to identify and support more suitable pedagogical and technological frameworks, thereby strengthening the connection between theory and practice.

Thank you for making this article much more complete and accurate. All the changes made are reflected in the attached article.

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript addresses an important and timely topic at the intersection of artificial intelligence and education by arguing that purposeful prompt design can transform static digital materials into adaptive learning experiences. The central idea is intriguing and relevant; however, the paper falls short in providing a theoretically grounded and methodologically rigorous treatment of the question. Below are several major areas for improvement.

First, the most significant concern lies in the review’s lack of critical evaluation of the included studies. The authors enumerate the frequency of AI techniques and educational levels but rarely assess the quality of evidence or the internal validity of those studies. There is little or no discussion of sample/study designs (experimental vs. correlational) or replication status. Without weighting studies by methodological robustness, the conclusions about “consistent improvements” in academic performance or engagement remain statistically and conceptually weak. The authors should include a critical scoring rubric or a structured appraisal system to distinguish exploratory prototypes from rigorously tested interventions and to ensure that the synthesis reflects evidence quality, not just quantity.

Second, the temporal scope raises methodological concerns. Limiting the review to English-language studies published between 2023 and 2025 is problematic, as it captures only the short-term post-ChatGPT research surge. The paper needs to justify why this narrow timeframe is theoretically or empirically meaningful. Without comparison to prior (or later) research periods, it risks conflating publication volume with genuine scholarly progress.

Third, the review overemphasizes technical taxonomies (e.g., ML, NLP) while neglecting cross-study synthesis on pedagogical variables, such as instructional design, learner autonomy, or feedback mechanisms. This focus on fixed algorithmic categories creates an illusion of scientific precision but does not illuminate how or why AI actually enhances learning. The authors might consider incorporating instructional design theory or frameworks linking large language models to learning processes to offer a more pedagogically meaningful synthesis.

Fourth, the central theme, distinguishing between “explicit” and “implicit” prompt use, is conceptually interesting but superficially developed. The classification lacks theoretical clarity and operational transparency. The paper does not explain how a study qualifies as “prompt-based,” nor does it describe any coding scheme or inter-rater reliability measures. Without these, the typology appears subjective and weakens the analytical credibility of the review.

Fifth, the review may exhibit positive bias. Almost all reported findings are interpreted as evidence of success, while null results, mixed effects, or negative outcomes receive no serious attention. A balanced review should critically engage with failures or unintended consequences of AI systems since these insights are equally vital for theory and practice.

Sixth, the discussion of ethics and pedagogy is underdeveloped. It appears as a perfunctory disclaimer rather than an integrated analytical dimension. While the absence of an extensive ethics section is acceptable, the authors should at least explain why issues such as privacy and bias were not included in the analytical framework and how their omission affects the interpretive boundaries of the study.

Finally, the use of the PRISMA framework appears more procedural than substantive. PRISMA is invoked as a rhetorical marker of rigor, but its implementation lacks precision and accountability. The paper does not provide sufficient detail on inclusion and exclusion criteria, database selection rationale, or screening decisions. The reliance on only ACM and Scopus introduces disciplinary bias toward technology-oriented publications and excludes education-specific databases. The authors should acknowledge this limitation explicitly and provide more transparent data extraction procedures (through a supplementary appendix) to enhance reproducibility and methodological integrity.

Good luck.

Author Response

Thank you very much for your positive feedback on our manuscript. We are pleased that you consider our work a relevant contribution to the field of artificial intelligence in education. We have carefully reviewed the article and incorporated the necessary changes as suggested.

 The manuscript addresses an important and timely topic at the intersection of artificial intelligence and education by arguing that purposeful prompt design can transform static digital materials into adaptive learning experiences. The central idea is intriguing and relevant; however, the paper falls short in providing a theoretically grounded and methodologically rigorous treatment of the question. Below are several major areas for improvement.

Q1. First, the most significant concern lies in the review’s lack of critical evaluation of the included studies. The authors enumerate the frequency of AI techniques and educational levels but rarely assess the quality of evidence or the internal validity of those studies. There is little or no discussion of sample/study designs (experimental vs. correlational) or replication status. Without weighting studies by methodological robustness, the conclusions about “consistent improvements” in academic performance or engagement remain statistically and conceptually weak. The authors should include a critical scoring rubric or a structured appraisal system to distinguish exploratory prototypes from rigorously tested interventions and to ensure that the synthesis reflects evidence quality, not just quantity.

AQ1. We appreciate the reviewer’s comment and acknowledge the relevance of this observation. The selection of studies was conducted using high-quality academic databases, namely Scopus and the ACM Digital Library, which ensures a minimum standard of peer-reviewed research and scientific reliability. Nevertheless, we recognise that including a more detailed methodological assessment, for instance, considering study design, sample characteristics, and the robustness of the reported evidence, could further strengthen the credibility of the synthesis. Such an analysis will be considered in an extended or future version of this study.

Q2. Second, the temporal scope raises methodological concerns. Limiting the review to English-language studies published between 2023 and 2025 is problematic, as it captures only the short-term post-ChatGPT research surge. The paper needs to justify why this narrow timeframe is theoretically or empirically meaningful. Without comparison to prior (or later) research periods, it risks conflating publication volume with genuine scholarly progress.

AQ2. The decision to focus on studies published between 2023 and 2025 was deliberate, reflecting the emergence of generative AI tools, particularly ChatGPT, as a major turning point in the educational use of artificial intelligence. This short timeframe was therefore selected to capture and analyse the immediate post-ChatGPT wave of research, aiming to identify early patterns, emerging practices, and initial pedagogical implications of this technological disruption. We fully agree that future comparative analyses, extending the temporal scope to include pre-2023 and later studies, will be essential to assess whether the trends identified represent transient reactions or consolidated scholarly progress. We have clarified this rationale in the revised version of the manuscript.

Q3. Third, the review overemphasizes technical taxonomies (e.g., ML, NLP) while neglecting cross-study synthesis on pedagogical variables, such as instructional design, learner autonomy, or feedback mechanisms. This focus on fixed algorithmic categories creates an illusion of scientific precision but does not illuminate how or why AI actually enhances learning. The authors might consider incorporating instructional design theory or frameworks linking large language models to learning processes to offer a more pedagogically meaningful synthesis.

AQ3. We acknowledge that the review places greater emphasis on technical taxonomies of artificial intelligence (e.g., ML, NLP). This focus was intentional, as the study aimed to map the technological foundations that underpin innovation in the context of Education 4.0. Within this paradigm, understanding the technological mechanisms is a necessary step towards interpreting how such systems support learner autonomy, personalised learning, and adaptive feedback, all of which are core pedagogical dimensions of Education 4.0. Rather than seeking precision through algorithmic classification alone, the review sought to highlight how these technologies enable new forms of instructional design and learner interaction. In this sense, the technical mapping serves as a framework for understanding the pedagogical potential of AI tools, consistent with the human–machine integration and personalised learning principles advocated by Education 4.0.

Q4. Fourth, the central theme, distinguishing between “explicit” and “implicit” prompt use, is conceptually interesting but superficially developed. The classification lacks theoretical clarity and operational transparency. The paper does not explain how a study qualifies as “prompt-based,” nor does it describe any coding scheme or inter-rater reliability measures. Without these, the typology appears subjective and weakens the analytical credibility of the review.

AQ4. We acknowledge the importance of ensuring conceptual and methodological clarity in distinguishing between the explicit and implicit use of prompts. The distinction proposed in this study was conceived as a conceptual framework rather than a rigid coding system, reflecting the exploratory nature of current research on the use of generative AI in educational contexts.

Within this framework, explicit prompt use refers to studies in which prompts are deliberately designed, manipulated, or optimised by users or teachers as part of the research intervention. In contrast, implicit prompt use refers to studies where AI-generated content or interactions depend on embedded prompting processes that are not directly controlled or defined by the user. This clarification has been added to the article to reinforce the conceptual transparency of the adopted classification.

Given the diversity and novelty of the studies analysed, no formal inter-rater reliability measures were applied at this stage. Instead, the classification was derived through an analytical consensus process among the authors, supported by the identification of recurring patterns in the literature.

Q5. Fifth, the review may exhibit positive bias. Almost all reported findings are interpreted as evidence of success, while null results, mixed effects, or negative outcomes receive no serious attention. A balanced review should critically engage with failures or unintended consequences of AI systems since these insights are equally vital for theory and practice.

AQ5. We fully acknowledge the importance of addressing the limitations, mixed results, and unintended consequences associated with the implementation of AI in educational contexts. These aspects are essential for a balanced and theoretically robust understanding of the field. Although many of the studies analysed did not explicitly report negative or null results, we have noted this absence as a potential indicator of publication bias, and the recognition of this limitation has been added to the text of the article.

Q6. Sixth, the discussion of ethics and pedagogy is underdeveloped. It appears as a perfunctory disclaimer rather than an integrated analytical dimension. While the absence of an extensive ethics section is acceptable, the authors should at least explain why issues such as privacy and bias were not included in the analytical framework and how their omission affects the interpretive boundaries of the study.

AQ6. We acknowledge that ethical and pedagogical issues were not treated as a central analytical dimension in this review. This decision was intentional, as the primary aim of the study was to map technological and methodological trends in the use of AI for learning personalisation, rather than to conduct a normative or ethical evaluation. In response to this comment, a reference to this limitation has been added to the conclusions of the systematic review to clarify the methodological scope of the study.

Q7. Finally, the use of the PRISMA framework appears more procedural than substantive. PRISMA is invoked as a rhetorical marker of rigor, but its implementation lacks precision and accountability. The paper does not provide sufficient detail on inclusion and exclusion criteria, database selection rationale, or screening decisions. The reliance on only ACM and Scopus introduces disciplinary bias toward technology-oriented publications and excludes education-specific databases. The authors should acknowledge this limitation explicitly and provide more transparent data extraction procedures (through a supplementary appendix) to enhance reproducibility and methodological integrity.

AQ7. We thank the reviewer for this thoughtful and constructive comment. We acknowledge that the implementation of the PRISMA framework in this review emphasised procedural transparency rather than exhaustive protocolisation. The decision to adopt PRISMA was motivated by the intention to ensure methodological consistency and replicability in the selection and synthesis of studies, while recognising that the field of generative AI in education is still in an early and highly dynamic stage of development.

The criteria for inclusion and exclusion were defined to capture peer-reviewed studies that explicitly addressed the application of AI techniques to learning personalisation, with particular attention to the use of prompts and intelligent recommendation systems. Scopus and the ACM Digital Library were selected as primary databases due to their comprehensive indexing of AI-related research, ensuring coverage of the most recent technological and methodological advances. We acknowledge, however, that this choice introduces a disciplinary bias favouring technological sources and potentially underrepresents the educational literature. This limitation has now been explicitly acknowledged in the manuscript.

Thank you for making this article much more complete and accurate. All the changes made are reflected in the attached article.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

1. The abstract omits key methodological details essential for transparency. The study period, sample size, search keywords, and the databases used for the literature review should also be explicitly stated in the abstract.

2. Research questions should be presented under the “Introduction” or “Objectives” section, that is, at the end of the introduction of the paper. The objective section explains why the systematic review was conducted, while the research questions clarify the focus and scope of the review. In conclusion, the research questions must be placed in the introduction, directly below the objective paragraph.

3. A paragraph addressing the significance of the study should be added above the paragraph stating the research aim.

4. “In summary, this study distinguishes itself by combining a systematic and evidence-based approach with a practical innovation grounded in the principles of Education 4.0. Methodologically, it applies the PRISMA 2020 protocol to a focused and recent timeframe (2023–2025), corresponding to the post-ChatGPT moment, a period marked by the rapid expansion of large language models and their disruptive impact on educational practices and research. This temporal focus is intended to capture the immediate scholarly response to the emergence of generative AI, allowing for the identification of early patterns, opportunities, and challenges associated with its integration into educational contexts.” The section between lines 102–109 appears to represent the research aim. Please revise the page by combining this part with the aim statement in the final paragraph.

5. “PRISMA Systematic Review” Such a heading represents a methodological approach and should be presented under the “Method” section. Only Method, Results, and Discussion qualify as primary headings in the manuscript structure.

6. Use a valid risk-of-bias assessment tool (e.g., Mixed Methods Appraisal Tool – MMAT, RoB 2.0, or QUADAS) to evaluate the methodological quality of the 54 included studies, and report the results. This will substantially strengthen the internal validity of the research.

7. Presentation of Reliability Coefficient: Calculate the Kappa coefficient to measure the level of agreement between the two authors during the data extraction and inclusion/exclusion phases, and report the results accordingly.

8. Discussion of Database Limitation: Address more comprehensively in the discussion section the potential impact of the selected databases (Scopus/ACM) on the findings. Doing so will enhance transparency by openly acknowledging the external validity limitation.

9. Ensure that all abbreviations are written out in full the first time they appear, followed by the abbreviation in parentheses. Review the abbreviations throughout the text accordingly to maintain consistency with this guideline.

Author Response

Thank you very much for your positive feedback on our manuscript. We are pleased that you consider our work a relevant contribution to the field of artificial intelligence in education. We have carefully reviewed the article and incorporated the necessary changes as suggested.

1. The abstract omits key methodological details essential for transparency. The study period, sample size, search keywords, and the databases used for the literature review should also be explicitly stated in the abstract.

AQ1 - The suggestions were included in the abstract.

2. Research questions should be presented under the “Introduction” or “Objectives” section, that is, at the end of the introduction of the paper. The objective section explains why the systematic review was conducted, while the research questions clarify the focus and scope of the review. In conclusion, the research questions must be placed in the introduction, directly below the objective paragraph.

AQ2 - The suggestions were incorporated into the manuscript (introduction section).

3. A paragraph addressing the significance of the study should be added above the paragraph stating the research aim.

AQ3 - A specific paragraph has been added that clarifies the relevance and scientific contribution of the study, synthesising the theoretical, methodological, and practical significance of the research.

4. “In summary, this study distinguishes itself by combining a systematic and evidence-based approach with a practical innovation grounded in the principles of Education 4.0. Methodologically, it applies the PRISMA 2020 protocol to a focused and recent timeframe (2023–2025), corresponding to the post-ChatGPT moment, a period marked by the rapid expansion of large language models and their disruptive impact on educational practices and research. This temporal focus is intended to capture the immediate scholarly response to the emergence of generative AI, allowing for the identification of early patterns, opportunities, and challenges associated with its integration into educational contexts.” The section between lines 102–109 appears to represent the research aim. Please revise the page by combining this part with the aim statement in the final paragraph.

AQ4 - The indicated excerpt, which indeed states the central purpose of the study, has been integrated into the final paragraph of the introduction that outlines the aim of the work. The ideas were merged and the text was revised to ensure conceptual clarity, eliminate redundancies, and present a more coherent and consolidated statement of objectives.

5. “PRISMA Systematic Review” Such a heading represents a methodological approach and should be presented under the “Method” section. Only Method, Results, and Discussion qualify as primary headings in the manuscript structure.

AQ5 - The titles have been updated according to the suggestion.

6. Use a valid risk-of-bias assessment tool (e.g., Mixed Methods Appraisal Tool – MMAT, RoB 2.0, or QUADAS) to evaluate the methodological quality of the 54 included studies, and report the results. This will substantially strengthen the internal validity of the research.

AQ6 - In response, we conducted a structured assessment of the 54 included studies using a transversal adaptation of the Mixed Methods Appraisal Tool, which enabled the application of uniform criteria regardless of the methodological diversity of the studies. Five criteria were considered, namely the clarity of the objective, the description of the AI technique, the coherence between the technique and the pedagogical purpose, the characterisation of the educational context, and the reporting of limitations. We consider that this addition substantially strengthens the internal validity and transparency of the review.
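
As an illustration of how such a uniform appraisal can be applied, the sketch below scores a study against the five criteria listed in AQ6. The criterion names paraphrase that list; the binary met/not-met scoring shown here is a hypothetical simplification, not the exact instrument used in the review.

```python
# Hypothetical five-criterion appraisal checklist adapted from the MMAT;
# each criterion is scored as met (1) or not met (0).

CRITERIA = [
    "clarity_of_objective",
    "description_of_ai_technique",
    "technique_pedagogy_coherence",
    "educational_context_characterised",
    "limitations_reported",
]

def appraise(scores: dict[str, int]) -> float:
    """Return the proportion of criteria a study satisfies."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)

example_study = {
    "clarity_of_objective": 1,
    "description_of_ai_technique": 1,
    "technique_pedagogy_coherence": 0,
    "educational_context_characterised": 1,
    "limitations_reported": 0,
}
print(f"Quality score: {appraise(example_study):.0%}")  # -> Quality score: 60%
```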

7. Presentation of Reliability Coefficient: Calculate the Kappa coefficient to measure the level of agreement between the two authors during the data extraction and inclusion/exclusion phases and report the results accordingly.

AQ7. We have now calculated and reported the Cohen’s Kappa coefficient for both the study selection and data extraction phases. Substantial agreement was observed across reviewers in both stages.
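
For readers unfamiliar with the statistic, Cohen's kappa corrects the observed agreement po between two raters for the agreement pe expected by chance, κ = (po − pe) / (1 − pe), with values between 0.61 and 0.80 conventionally read as substantial agreement. Below is a minimal sketch of the computation on hypothetical screening decisions; the data shown are illustrative, not the actual ratings from the review.

```python
# Illustrative Cohen's kappa for two reviewers' include/exclude screening
# decisions; the data below are hypothetical, not the study's actual ratings.
from sklearn.metrics import cohen_kappa_score

reviewer_a = ["include", "include", "exclude", "include", "exclude", "exclude"]
reviewer_b = ["include", "exclude", "exclude", "include", "exclude", "exclude"]

kappa = cohen_kappa_score(reviewer_a, reviewer_b)
print(f"Cohen's kappa: {kappa:.2f}")  # ~0.67: substantial agreement (0.61-0.80)
```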

8. Discussion of Database Limitation: Address more comprehensively in the discussion section the potential impact of the selected databases (Scopus/ACM) on the findings. Doing so will enhance transparency by openly acknowledging the external validity limitation.

AQ8 - The discussion section now includes an expanded analysis of the potential impact of selecting Scopus and ACM on the study’s results, as suggested.

9. Ensure that all abbreviations are written out in full the first time they appear, followed by the abbreviation in parentheses. Review the abbreviations throughout the text accordingly to maintain consistency with this guideline.

AQ9 - We have conducted a thorough revision of the document and corrected all identified errors.

Thank you for making this article much more complete and accurate. All the changes made are reflected in the attached article.

Reviewer 3 Report

Comments and Suggestions for Authors

Thanks for addressing my concerns. The paper has improved, and I am signing off on it. One last comment: there are still some typos remaining—please address them.

Author Response

We appreciate your careful review of the manuscript and the comments that helped to improve it. We have conducted a thorough revision of the document and corrected all identified typographical errors.

Round 3

Reviewer 1 Report

Comments and Suggestions for Authors

1. Place the research questions directly below the paragraph where the purpose is described.

2. Remove the “s” suffix from the title “Method”.

Author Response

Q1. Place the research questions directly below the paragraph where the purpose is described.

RQ1 - Thank you for the suggestion. The research questions have been repositioned so that they now appear directly after the paragraph in which the purpose of the study is outlined, in accordance with your recommendation.

Q2. Remove the “s” suffix from the title “Method”.

RQ2 - Thank you for the observation. The “s” suffix has been removed from the title, and it now appears as “Method” as requested.

Thank you for making this article much more complete and accurate. All the changes made are reflected in the attached article.

Reviewer 3 Report

Comments and Suggestions for Authors

I accept this manuscript.

Author Response

Thank you once again for the suggestions incorporated into the article, which contributed to improving its quality.
