Towards a Retrieval-Augmented Generation Framework for Originality Evaluation in Project-Based Learning Classrooms
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript addresses a timely and relevant issue in project-based learning by proposing a well-constructed framework for assessing students' originality using the RAG technique. The methodology is clearly presented, and the system is validated through alignment with human judgment.
The paper also acknowledges that the system tends to score more conservatively than human reviewers (Section 4.3); however, a brief discussion of the pedagogical implications of this tendency would enrich the paper.
In addition, the paper would benefit from a clearer treatment of ethical considerations, particularly in relation to student data protection and how the tool fits into the formative dimension of project-based learning.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
- Although the application of RAG and LLMs to originality evaluation is intriguing, the paper does not differentiate its approach significantly from existing plagiarism detection and originality assessment techniques. The novelty claim is not supported by substantial evidence.
- The technical implementation of the RAG system is described only at a high level; the architecture, data preparation, and integration of the various components (for example, how exactly DeepSeek and NotebookLM communicate with one another) are not detailed.
- The evaluation uses a single course containing 91 prior works and 10 new test cases. This dataset is quite small, which raises questions about the generalizability and robustness of the findings.
- Although the correlation coefficient between the RAG system's scores and instructor scores (0.87) is promising, no statistical analysis (such as p-values or confidence intervals) is provided to support the claim that the system is effective (a minimal illustration of such an analysis is sketched after these comments).
- The instructors who rated the originality of the test projects may have been influenced by their prior knowledge of the course and the assigned projects, which could have biased the results. The paper does not describe any measures taken to mitigate this bias.
- There is no indication that a user study involving students or teachers was conducted to evaluate the system's usability, acceptance, or educational impact in actual classroom settings.
- The ethical implications of using artificial intelligence for originality evaluation, including but not limited to transparency, explainability, and the potential for false positives and negatives, are not discussed in depth.
- Because the framework depends on proprietary platforms (such as Google's NotebookLM), it may be difficult to reproduce and may not be sustainable over the long term. Open-source alternatives are only briefly discussed, and none are implemented.
- Because the paper does not compare its methodology with other originality assessment tools or established plagiarism detection systems (such as Turnitin or MOSS), its relative performance cannot be evaluated.
- The criteria behind the originality score, which ranges from 0 to 10, are not clearly defined, and it is not specified how subjective evaluations are averaged across different evaluators or projects.
- There is no examination of failure modes or limitations of the technique, nor any analysis of cases in which the RAG system and instructors disagreed.
- There are several typographical errors, odd phrasings, and formatting inconsistencies throughout the article (for example, "roon seat reservation systen" and "removd for peer review"), which detract from the paper's professionalism and readability.
- The work focuses exclusively on a single master's degree program in electronic engineering. There is no discussion of how the framework could be adapted to other fields of study, educational levels, or kinds of project-based learning.
In summary, the work demonstrates a promising approach but needs significant improvements in technical depth, evaluation rigor, clarity, and discussion of broader implications. Addressing the issues raised above would greatly strengthen the manuscript.
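For illustration only, the following minimal sketch shows how a p-value and a 95% confidence interval for a Pearson correlation could be reported; the score vectors are hypothetical placeholders, not the paper's data.

```python
# Minimal sketch (hypothetical data, not the paper's scores): reporting a
# p-value and a 95% confidence interval for the Pearson correlation
# between instructor scores and RAG-system scores.
import numpy as np
from scipy import stats

instructor = np.array([7, 8, 5, 9, 6, 4, 8, 7, 6, 9])  # hypothetical 0-10 scores
rag_system = np.array([6, 8, 4, 8, 5, 4, 7, 6, 6, 8])  # hypothetical 0-10 scores

r, p_value = stats.pearsonr(instructor, rag_system)    # Pearson r and two-sided p

# 95% confidence interval via the Fisher z-transformation (requires n > 3)
n = len(instructor)
z = np.arctanh(r)                  # Fisher z of r
se = 1.0 / np.sqrt(n - 3)          # standard error of z
lo, hi = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)

print(f"r = {r:.2f}, p = {p_value:.4f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

With only 10 test cases, the Fisher-z interval around r = 0.87 would be wide, which is precisely why reporting it matters.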
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have thoroughly revised the manuscript, addressing all major concerns and incorporating the feedback effectively. The revisions strengthen the paper, and the minor extensions enhance its clarity and depth. I recommend accepting the submission for publication.
Author Response
Thank you very much for your insights. They have really helped to improve the paper.