Article
Peer-Review Record

Deepfake-Style AI Tutors in Higher Education: A Mixed-Methods Review and Governance Framework for Sustainable Digital Education

Sustainability 2025, 17(21), 9793; https://doi.org/10.3390/su17219793
by Hanan Sharif 1,*, Amara Atif 2 and Arfan Ali Nagra 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 5 September 2025 / Revised: 28 October 2025 / Accepted: 30 October 2025 / Published: 3 November 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for the opportunity to review your manuscript. The topic is timely and important, especially given the explosion of AI-enabled tools, both tutors and beyond, in higher education contexts. The literature review and methodology for that review were strong and well-designed. The manuscript is well-organized and well-supported, strongly situated in the relevant scholarship. The qualitative component did leave me with some questions that I think would be useful to address in revision. First, "questionnaire" and "interview" are used interchangeably in the manuscript, which creates a barrier to methodological clarity. For example, a semi-structured questionnaire simply indicates that the items are open-ended, whereas a semi-structured interview enables in-the-moment responsiveness to participant answers. The use of "interview" creates the impression of a depth of data collection and probing of responses that was not present in the data collection process. I would recommend sticking to "questionnaire" as the terminology for clarity and trustworthiness of findings. Second, the inductive component of the coding process is unclear. It appears as though the literature was reviewed and codes were developed from that review. Then those codes were used to analyze the questionnaire data. That process is typical of a priori (deductive) coding in qualitative inquiry. If the process was to inductively code the literature in order to develop deductive codes for questionnaire data, then that process needs clarification. Additionally, it reads as though those a priori codes developed from the literature were applied as is. In other words, there was one round of deductive coding of the questionnaire responses, but then those a priori codes are listed as themes. That process is not consistent with thematic analysis according to Braun and Clarke, so I would encourage you to clarify the process. Frequency of a priori codes is not typically the process of thematic analysis. Transparency of qualitative procedures is key to the credibility and trustworthiness of the findings, and more detail would be welcome here.

I would also encourage you to indicate what constitutes "in-depth" responses on the questionnaires. The sample quotations are fairly brief and descriptive, which can often limit thematic analysis, so I wonder if content analysis might better describe the process. If not, additional explanation of the analysis procedures would be useful. I would also welcome clarity around why the specific disciplines of participants were chosen (and not other disciplines or not solely education, for example) as well as how you operationalized "familiarity" with AI. Because you do not include the data collection instrument, some discussion on the development of the questions would be useful. The summary of the content did raise a question. For example, the question: In what ways could deepfake tutors enhance learning (e.g., personalization, accessibility, multilingual delivery)? That question includes an assumption that the tutors DID enhance learning and does not allow for the opposite impression from participants, especially given that the instrument is a questionnaire that did not allow for responsiveness in the moment of data collection to clarify. 

In general, I appreciated the thoroughness of the analysis and the conclusions drawn. The authors specifically note limitations, and I particularly appreciated the inclusion of the limitation of not including the student perspective. I wonder if you also might address the novelty effect as it relates to pedagogical gains. Given that the experiences with tutors reviewed in the literature typically lasted less than an hour, it may be that any engagement was due to novelty. 

Author Response

We sincerely thank the reviewer for the careful reading and constructive feedback. In addition to the major clarifications detailed below, we have corrected all minor issues highlighted in the annotated document (e.g., punctuation errors, undefined abbreviations, stray fragments, and inconsistent appendix references).

  • Comment: “Questionnaire and interview are used interchangeably… recommend sticking to ‘questionnaire’ for clarity.”

Authors’ Response: Implemented. We refer to the expert data collection as semi-structured interviews administered in writing via a standardized questionnaire and use “questionnaire(s)” consistently throughout Methods. See Section 3.3 and subsections 3.3.1–3.3.3.

  • Comment: “Inductive component unclear; looks like a priori (deductive) coding; clarify process.”

Authors’ Response: Clarified. Section 3.3.3 (Data Analysis) now states we used Braun & Clarke’s six phases with a hybrid deductive–inductive approach (literature-informed codebook + emergent codes) and reports Cohen’s κ = 0.81 for inter-coder reliability.
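For readers less familiar with the coefficient, the block below gives the standard definition of Cohen's κ (a general reference rendering, not text taken from the manuscript); the agreement proportions shown are purely illustrative values consistent with the reported κ of 0.81, since the underlying counts are not given in this record.

```latex
% Standard definition of Cohen's kappa; p_o is observed inter-coder agreement,
% p_e is the agreement expected by chance from each coder's marginal code frequencies.
\[
  \kappa \;=\; \frac{p_o - p_e}{1 - p_e}
\]
% Illustrative (hypothetical) values only: p_o = 0.90 and p_e \approx 0.47
% would yield \kappa \approx 0.81, the coefficient reported in Section 3.3.3.
```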

  • Comment: “A priori codes listed as themes; not consistent with Braun & Clarke; frequency ≠ thematic analysis.”

Authors’ Response: Addressed. We explain the progression codes → candidate themes → reviewed, named themes with illustrative excerpts; counts are only descriptive, not criteria for theme status (3.3.3).

  • Comment: “Indicate what constitutes ‘in-depth’ responses; consider content analysis vs thematic.”

Authors’ Response: We retain thematic analysis and acknowledge response-length constraints. Criteria and mitigation (analytic weighting, triangulation with SLR) are stated in 3.3.3, with limits acknowledged in 8.3 Methodological Constraints.

  • Comment: “Why these disciplines; why not students/admins; how was ‘AI familiarity’ operationalized; include instrument development.”

Authors’ Response: Sampling rationale and participant profile are in 3.3.1; instrument development and AI familiarity operationalization are in 3.3.2.

  • Comment: “Leading stem (‘ways tutors enhance…’)—risk of assuming enhancement.”

Authors’ Response: The Questionnaire Guide is provided (Appendix B) and stems were neutralized (e.g., “in what ways, if any, could…affect…”). See Appendix B.

  • Comment: “Novelty effect may drive perceived gains.”

Authors’ Response: Acknowledged in 8.3 Methodological Constraints and 8.7 Implications for Future Work.

Reviewer 2 Report

Comments and Suggestions for Authors

Review Comments

  1. Originality

The paper conducts a systematic mixed-methods review on the emerging topic of "deepfake-style AI tutors" in higher education, and innovatively proposes an actionable governance framework comprising four pillars (Transparency and Disclosure, Data Governance and Privacy, Integrity and Detection, and Ethical Oversight and Accountability). It addresses the gaps in existing research that prioritize technology over governance and theory over practice, demonstrating high originality and publication merit.

 

  2. Literature Review

The literature review is highly comprehensive, adhering to PRISMA guidelines. It covers key domains including technology, education, ethics, and law, with a rigorous search strategy. However, the review shows insufficient attention to non-English literature and cross-cultural comparative studies, particularly lacking exploration of research in the context of Global South countries. It is recommended to more explicitly identify this potential bias in the discussion or limitations section and examine its impact on the generalizability of the research conclusions.

 

  3. Methodology

The paper adopts a reasonable mixed-methods design (systematic literature review + expert interviews), which is suitable for addressing complex research questions. The systematic review is executed in a standardized manner, with high-quality assessment and intercoder reliability. The sampling and analysis processes of the expert interviews are also described in a relatively standardized way. Nevertheless, the interview sample only includes 12 assistant professors, resulting in a narrow perspective that lacks insights from key stakeholders such as students, administrators, and technical developers. This limits the comprehensiveness of the research findings and the feasibility demonstration of the framework. It is recommended to discuss this issue more thoroughly in the limitations section.

 

  4. Results and Analysis

The results are presented clearly, effectively using tables and figures to integrate and display data, with accurate summarization of the four core themes. However, the analysis could be deeper, particularly in exploring the connection between "technical solutions" and "governance practices." For instance, there is a lack of in-depth trade-off analysis of the cost and feasibility of different detection technologies (e.g., rPPG) and how they match teaching scenarios with different risk levels. The link between conclusions and practical guidance could also be tighter and more specific.

 

  5. Implications for Research, Practice, and Society

The paper articulates implications for research, practice, and society in great detail, which constitutes one of its prominent strengths. The proposed governance framework, policy checklist, phased roadmap, and RACI matrix provide highly valuable practical tools for universities and policymakers. In terms of social significance, it also delves into key issues such as academic integrity, privacy, fairness, and trust. It is recommended to further discuss the adaptability challenges of this governance framework in resource-constrained educational environments to enhance its global applicability.

 

  6. Quality of Communication

The paper has a reasonable overall structure, coherent logic, accurate use of terminology, and professional figures/tables.

 

Recommendation

Major Revision

 

Comments to the Authors

The paper addresses a cutting-edge topic with rigorous methodology and significant contributions, but it requires key revisions to address the following issues:

 

  1. Insufficient Global Perspective in Literature Review: The paper’s literature base primarily relies on academic achievements from the English-speaking world, with limited attention to relevant research in non-Western cultural contexts (e.g., Southeast Asia, Africa, Latin America). It is recommended to more explicitly identify this limitation in the "Discussion" or "Limitations" section and deeply explore its potential impact on the generalizability of the research conclusions, thereby enhancing the rigor and global relevance of the study.

 

  2. Need to Expand the Perspective of Research Samples: The interview participants are limited to assistant professors. While this ensures depth in academic perspectives, it lacks insights from key stakeholders such as students (core users), IT administrators (technical implementers), and university leaders (resource decision-makers). This creates blind spots in the demonstration of the governance framework’s "feasibility" and "acceptability." It is recommended to more fully discuss this issue in the "Limitations" section, clarify its constraints on the research findings, and propose that future research incorporate a broader range of perspectives.

 

  3. Need to Enhance Analytical Depth and Connection: The analysis of research findings can be further deepened. It is recommended to strengthen the discussion on the trade-offs between cost, effectiveness, and feasibility of technical solutions (e.g., various detection methods) and governance practices in different risk scenarios. It should more specifically elaborate on why a certain combination of technologies and management strategies is recommended in specific contexts (e.g., high-stakes examinations), rather than merely listing solutions, to enhance the insightfulness of the conclusions and their practical guiding significance.

 

  4. Need to Strengthen the Framework’s Dynamism and Empirical Support: Generative AI technology is evolving rapidly, but the current governance framework appears somewhat static. It is recommended to add reflections on the framework’s dynamic update mechanism in the discussion (e.g., regular review cycles, conditions triggering updates). Meanwhile, make every effort to supplement existing case or pilot evidence (even from industry reports) for the proposed governance tools (e.g., KPIs, RACI matrix), or explicitly position them as "proposals to be empirically tested," to distinguish between the known and unknown and enhance persuasiveness.

Comments for author File: Comments.pdf

Author Response

Response to Reviewer

We sincerely thank the reviewer for the careful reading and constructive feedback. In addition to the major clarifications detailed below, we have corrected all minor issues highlighted in the annotated document (e.g., punctuation errors, undefined abbreviations, stray fragments, and inconsistent appendix references).

  • Comment: “Insufficient global perspective; English-language bias; Global South underrepresented.”

Authors’ Response: Acknowledged in Limitations (Scope & Generalizability). We note English-language bias, and recommend multilingual databases and regional studies to improve generalizability. Discussed in Discussion/Implications.

  • Comment: “Interview sample only assistant professors; missing stakeholders (students, admins, IT, leaders).”

Authors’ Response: Acknowledged in Limitations (Stakeholder Coverage). We commit to future multi-stakeholder studies (students, IT staff, administrators, leaders).

  • Comment: “Need deeper analysis linking technical solutions to governance practices; trade-offs by risk.”

Authors’ Response: We added text linking detector families to misuse vectors and risk tiers (Section 5.2 discussion) and calibrated controls by context (low- vs high-stakes) adjacent to Table 5 and in Table 8 (risk-tier matrix).
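As a rough sketch of how such a risk-tier calibration could be operationalized, the snippet below is a hypothetical illustration (the tier names and control labels are assumptions, not the contents of Table 8): low-stakes contexts rely on disclosure and provenance, while high-stakes assessments add detection and identity verification.

```python
# Illustrative sketch (an assumption, not the authors' Table 8) of a risk-tier matrix
# mapping assessment stakes to a cumulative set of governance controls.
RISK_TIER_CONTROLS = {
    "low_stakes":    ["disclosure_label", "content_provenance_metadata"],
    "medium_stakes": ["disclosure_label", "content_provenance_metadata",
                      "open_source_artifact_detector"],
    "high_stakes":   ["disclosure_label", "content_provenance_metadata",
                      "open_source_artifact_detector", "identity_verification",
                      "human_proctor_review"],
}

def required_controls(tier: str) -> list[str]:
    """Return the cumulative control set prescribed for a given risk tier."""
    return RISK_TIER_CONTROLS[tier]

if __name__ == "__main__":
    print(required_controls("high_stakes"))
```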

  • Comment: “Framework seems static; add dynamic updates; clarify empirical status of KPIs/RACI.”

Authors’ Response: Addressed. The framework is framed as living and iterative with review cycles and triggers; KPIs are conceptual starting points pending empirical validation (Sections 5.3, 6.5 and Conclusion).

  • Comment: “Adaptability in resource-constrained environments.”

Authors’ Response: Included resource-aware pathways in 6.2 Institutional Implementation, emphasizing staged adoption (disclosure/provenance → open-source detectors → biometrics for high-stakes only).

Reviewer 3 Report

Comments and Suggestions for Authors

Kia ora,

Thank you for the opportunity to read about your research. I have made some comments and raised some questions on the attached. I hope you find these helpful.

Cheers,

Comments for author File: Comments.pdf

Author Response

We sincerely thank the reviewer for the careful reading and constructive feedback. In addition to the major clarifications detailed below, we have corrected all minor issues highlighted in the annotated document (e.g., punctuation errors, undefined abbreviations, stray fragments, and inconsistent appendix references).

  • Comment: “Assessment fraud — is this relevant to AI tutors?”

Authors’ Response: Clarified throughout as tutor-enabled assessment fraud (misuse for impersonation or unauthorized assistance). See Table 2 and Discussion section.

  • Comment: “Layered detection sentence needs clarity; link to Tables 2 & 3.”

Authors’ Response: We explain why no single detector suffices and describe layered, cross-modal pipelines aligned to misuse vectors; the surrounding text ties Table 5 to risks summarized in Tables 2 and 3.
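To make the "layered, cross-modal pipeline" idea concrete, here is a minimal Python sketch (hypothetical; the modality names, weights, and thresholds are assumptions, not the manuscript's implementation) in which several weak per-modality suspicion scores are fused and escalated to human review above a stricter threshold in high-stakes contexts.

```python
# Minimal sketch of layered, cross-modal detection: fuse several weak signals and
# compare against a risk-dependent threshold, rather than trusting any single detector.
from dataclasses import dataclass

@dataclass
class ModalityScore:
    name: str      # e.g., "visual_artifacts", "audio_spoof", "rppg_liveness", "provenance"
    score: float   # 0.0 = likely authentic, 1.0 = likely synthetic
    weight: float  # relative trust placed in this detector family

def fuse(scores: list[ModalityScore]) -> float:
    """Weighted average of per-modality suspicion scores (illustrative fusion rule)."""
    total_weight = sum(s.weight for s in scores)
    return sum(s.score * s.weight for s in scores) / total_weight

def decide(fused: float, high_stakes: bool) -> str:
    """Escalate at a stricter threshold in high-stakes assessment contexts."""
    threshold = 0.5 if high_stakes else 0.7
    return "escalate_to_human_review" if fused >= threshold else "log_and_continue"

if __name__ == "__main__":
    observed = [
        ModalityScore("visual_artifacts", 0.62, 1.0),
        ModalityScore("audio_spoof",      0.48, 1.0),
        ModalityScore("rppg_liveness",    0.71, 0.5),
        ModalityScore("provenance_check", 0.80, 1.5),
    ]
    print(decide(fuse(observed), high_stakes=True))
```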

  • Comment: “AI tutors taking exams? (table cell text).”

Authors’ Response: We clarified that these controls target examinee identity and tutor misuse in assessment contexts, not “tutors as candidates.” Table 4 wording retained concise form; surrounding text clarifies intent.

  • Comment: “Does incident-response text imply the institution knowingly uses tutors in exams?”

Authors’ Response: Wording now specifies misuse in assessment contexts (e.g., if a synthetic tutor is detected during an assessment, pause and review). See Section 5.4 Failure-Mode Playbooks.

  • Comment: “Policy & Regulatory Guidance — ‘recommendation?’”

Authors’ Response: Framed as a recommendation: “Based on our review, we recommend that policies include disclosure labels, informed consent, and privacy-by-design principles.”

  • Comment: “Does capacity-building imply staff wouldn’t know tutors are used?”

Authors’ Response: Clarified that faculty/staff are explicitly informed about institutional deployments and trained accordingly. Training is presented as preparation for known, institutionally managed deployments.

  • Comment: “How would AI tutors be part of this? (assessment fraud in Results).”

Authors’ Response: Revised to “tutor-enabled assessment fraud (e.g., impersonation or unauthorized assistance during exams).” This makes clear the risk arises from misuse of tutors in assessments.

  • Comment: “AI tutors? (detection paragraph)”

Authors’ Response: Paragraph revised to explicitly tie multimodal detection synthesis to AI tutors, noting its relevance for preventing impersonation, tutor-enabled exam misuse, and ensuring provenance of tutor outputs.

  • Comment: “Appendix/abbreviation issues, punctuation, stray fragments.”

Authors’ Response: Corrected. Acronyms are defined at first use and appendices standardized (Appendix A–C).

  • Comment: “Appendix B references were incorrect.”

Authors’ Response: Fixed. Participant demographics are now in Appendix C, and the Questionnaire Guide is in Appendix B; cross-references in 3.3.1–3.3.2 were aligned accordingly.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

1. The author did not provide a point-by-point response to the comments from the first round of review in the revised manuscript. It is recommended that the author further organize and revise the manuscript and submit the second round of revisions along with a revision note as required.
2. Abstract: The abstract should be more concise, highlighting the author's contributions and innovations, and condensing the research process.
3. Introduction: The discussion in the introduction is somewhat loose. It is recommended that the author refine the introduction and enhance its logical coherence and flow. In particular, the review of existing literature is still rather superficial. Additionally, some phrasing is unprofessional, such as "studies published in MDPI-affiliated journals report that…".
4. Delete the statements in Section 1.1 Motivation and Historical Trajectory.
5. In the research questions section, the author is advised to streamline them to 3 questions and remove keywords preceding each question, such as "Engagement Potential" and "Integrity Challenges:".
6. Delete the last paragraph of the introduction. "For clarity, all acronyms and abbreviations in this manuscript are defined upon first use. The text avoids introducing any abbreviations without an accompanying definition, in accordance with the journal’s style guidelines and to ensure readability for a broad audience."
7. The chosen topic is a review article, but a separate section for "2. Literature Review" is listed in the text, which does not conform to writing standards. It is recommended that the author streamline section 2. Literature Review and merge it into the introduction.
8. The author mentioned semi-structured interviews in 3.1 Research Design. Please supplement with evidence and demographic information for expert interviews.
9. Figure 2 is still quite rough. It is recommended that the author use an overall framework to integrate these classifications and explain the basis for classification.
10. Delete Section 4.2.4 Study Design Limitations. In addition, there is a lack of sufficient explanation and literature dialogue for the results in Tables 1 and 2. It is recommended to supplement this.
11. The author proposed four governance mechanisms, but did not explain the reasons for proposing these mechanisms or their target objectives.
12. It is recommended that the author delete parts of Tables 8 and 9, and discuss the proposed policy implications through language. Furthermore, the current proposals are not rigorous.
13. The "Limitations and Future Research" section is still quite lengthy. It is recommended that the author condense it into a single paragraph, providing a brief explanation. Currently, it seems to avoid the main issues and is unreasonable.
14. The conclusion does not highlight the findings of this study, but rather mixes them with theoretical contributions. It is recommended to revise this.
15. Remove Appendix A and Appendix B.
16. Currently, as a review, the number of references cited is still relatively small, which may overlook important research findings. In addition, there are formatting issues with the author's references.

Comments for author File: Comments.pdf

Author Response

We sincerely apologize for the oversight in the previous submission and thank the reviewer for kindly pointing this out. We are deeply grateful for the reviewer’s time, guidance, and constructive feedback. In this revised version, we have carefully provided a detailed point-by-point response addressing all comments from the first and current review rounds. Each suggestion has been thoughtfully incorporated into the manuscript, and a complete revision note has been prepared to clearly indicate where changes have been made.

Reviewer Comment:
The English could be improved to more clearly express the research.

Author Response:
We sincerely thank the reviewer for this helpful suggestion. The entire manuscript has been carefully reviewed to enhance clarity, readability, and academic tone. Sentences with complex structures have been simplified, and word choices refined to ensure that the research objectives, methods, and findings are expressed more clearly. Minor grammatical and stylistic adjustments were also made throughout the text to improve fluency and coherence while maintaining the intended meaning of the study.


Reviewer Comment:

The abstract should be more concise, highlighting the author’s contributions and innovations, and condensing the research process.

Author Response:
We sincerely thank the reviewer for this valuable suggestion. The abstract has been carefully revised to improve clarity and conciseness while maintaining completeness. The updated version now highlights the study’s core objectives, methodological approach, key findings, and main contributions more succinctly. It also emphasizes the proposed four-pillar governance framework and its practical value for responsible integration of deepfake AI tutors in education. The revised abstract aligns with MDPI’s structured format and adheres to the recommended word limit.

 

Reviewer Comment:
Delete Section 1.1 Motivation and Historical Trajectory entirely and merge Section 2 Literature Review into the introduction to avoid a stand-alone review section.

Author Response (revised and polite):
We sincerely appreciate the reviewer’s thoughtful guidance aimed at improving the structure and coherence of the manuscript. After careful consideration, we have made several revisions to enhance the flow of the introduction; however, we respectfully retained Section 1.1 Motivation and Historical Trajectory and a distinct Literature Review section. These components were preserved because they provide essential contextual grounding and systematic synthesis, which help readers from diverse academic backgrounds understand both the technological evolution of deepfake tutors and the evidence base supporting this study. Removing them entirely could weaken the manuscript’s conceptual continuity and clarity.

That said, we have carefully refined these sections to reduce redundancy, improve transitions, and ensure that both remain concise and closely aligned with the article’s main objectives. We truly value the reviewer’s insightful feedback, which has guided us in achieving a better balance between brevity and contextual depth.

 

Reviewer Comment:
The author is advised to streamline the research questions to three and remove the leading keywords preceding each question.

Author Response:
We sincerely thank the reviewer for this helpful suggestion to enhance clarity and focus. The research questions have been revised in accordance with the recommendation. The updated version now includes three concise and well-aligned questions without leading keywords. These questions directly correspond to the study’s objectives and thematic structure, ensuring a clearer logical connection between the literature review, methodology, and findings. The revised questions are as follows:

  1. How can deepfake-style AI tutors enhance personalization and learner engagement in higher education?
  2. What challenges do they pose to academic integrity and detection mechanisms?
  3. What governance and ethical frameworks are needed to ensure responsible deployment?

 

Reviewer Comment:
The author mentioned semi-structured interviews in Section 3.1 Research Design. Please supplement with evidence and demographic information for expert interviews. Additionally, re-label the subsection title to “semi-structured written questionnaires.”

Author Response:
We are deeply grateful to the reviewer for this thoughtful and constructive suggestion, which has helped us clarify and strengthen the methodology section. In response, the terminology throughout the manuscript has been carefully revised from “semi-structured interviews” to “semi-structured written questionnaires” to more accurately reflect the data collection process. Additionally, Section 3.3.1 (Participants and Sampling) has been expanded to include clear demographic details of the twelve participating assistant professors, covering their academic disciplines (Computer Science, Software Engineering, Education, and Ethics), gender distribution (75% male, 25% female), mean teaching and research experience (approximately seven years), and institutional affiliation (Lahore Garrison University). These details are now incorporated into the main text for transparency, while Appendix A has been retained for supplementary demographic information. We sincerely appreciate the reviewer’s guidance, which has significantly improved the clarity, precision, and completeness of this section.

 

Reviewer Comment:
Figure 2 (Taxonomy): Redesign to appear as an integrated conceptual framework (e.g., a flow diagram showing how tutor construction → delivery mode → contextual controls → governance pillars). Add an explanatory sentence: “Figure 2 illustrates the overall framework connecting tutor typologies with governance controls.”

Author Response:
We sincerely thank the reviewer for this valuable and insightful suggestion, which has significantly improved the clarity and conceptual coherence of the manuscript. In response, Figure 2 has been completely redesigned as an integrated conceptual framework that visually illustrates the progression from tutor construction to delivery mode, contextual controls, and finally to governance pillars. This new design better reflects the logical relationships among these components and aligns with the overall governance framework proposed in the study. An explanatory sentence has also been added at the end of Section 3.4 to guide readers and clarify how the figure connects tutor typologies with governance controls. The updated caption now reads:

“Integrated conceptual framework illustrating the relationship between tutor construction, delivery mode, contextual controls, and governance pillars for deepfake AI tutors.”
We are grateful for this constructive feedback, which has enhanced both the visual and conceptual integration of the paper.

 

Reviewer Comment:
Remove Subsection 4.2.4 “Study Design Limitations” (now repeated in Limitations section). Add a short discussion paragraph after Tables 1 & 2 interpreting their meaning and citing relevant literature for comparison.

Author Response:
We sincerely thank the reviewer for this valuable and constructive feedback, which has helped improve both the structure and analytical clarity of the manuscript. The subsection previously titled "4.2.3 Study Design Limitations" has been removed from the Results section to avoid redundancy, and its relevant methodological points have been integrated into Section 8.

Table 3, now renumbered as Table 1, has been retained because of its methodological importance and repositioned under Section 3.4 (Operational Definitions and Taxonomy) in a new subsection titled "Detection Classes and Technical Constraints." This relocation ensures that the table continues to support the technical foundation of the study rather than appearing as a limitation.

In addition, concise interpretive discussion paragraphs have been added immediately after Tables 1 and 2 to provide contextual interpretation and comparison with relevant literature. These revisions enhance analytical depth, strengthen connections with prior research, and improve the overall narrative flow of the Results section.

 

Reviewer Comment:
The author proposed four governance mechanisms but did not explain the reasons for proposing these mechanisms or their target objectives. It is recommended that the author delete parts of Tables 8 and 9 and discuss the proposed policy implications through language. Furthermore, the current proposals are not rigorous.

Author Response:
We sincerely thank the reviewer for this thoughtful and constructive feedback, which has been very helpful in improving the conceptual clarity and analytical depth of the manuscript.

In response, we have added a concise explanatory paragraph at the end of Section 5.1 to clarify the rationale and objectives underlying the four proposed governance pillars. The new paragraph highlights that these pillars were derived from recurring risk domains identified through both the systematic literature review and expert feedback, and that they aim respectively to enhance transparency, safeguard privacy, ensure academic integrity, and strengthen ethical accountability.

Regarding Tables 8 and 9, we carefully reviewed and refined their content to enhance precision and readability. Concise narrative explanations have also been added before and after each table to describe their context and policy relevance. These revisions ensure that the governance framework and policy implications are discussed more coherently in the main text, thereby improving both the rigor and interpretive clarity of the proposed mechanisms.

 

Reviewer Comment:
The “Limitations and Future Research” section is still quite lengthy. It is recommended that the author condense it into a single paragraph, providing a brief explanation. Currently, it seems to avoid the main issues and is unreasonable.

Author Response:
We sincerely thank the reviewer for this thoughtful and constructive suggestion, which has helped improve the clarity and precision of the manuscript. The “Limitations and Future Research” section has been carefully revised and condensed into one integrated paragraph that directly addresses the main methodological, design, and contextual limitations of the study. Redundant explanations and subheadings have been removed to enhance focus and readability. The revised section now clearly highlights the key limitations, including sample size, study design, language scope, and the rapidly changing nature of generative AI, while outlining concise directions for future research. These revisions ensure that the section is concise, balanced, and fully aligned with the reviewer’s recommendation.

 

Reviewer Comment:
The conclusion does not highlight the findings of this study, but rather mixes them with theoretical contributions. It is recommended to revise this.

Author Response:
We sincerely thank the reviewer for this helpful and insightful comment. In response, the conclusion section has been carefully revised to clearly distinguish the study’s main findings from its theoretical and practical contributions. The revised version now begins with a concise summary of the empirical results drawn from the systematic review and expert feedback, followed by a separate discussion of the governance framework and its policy relevance. The section concludes with a brief reflection on collaboration and future directions. This restructuring ensures that the findings are clearly highlighted, and the theoretical contributions are presented in a more coherent and sequential manner, fully addressing the reviewer’s recommendation.

 

Reviewer Comment:
Remove Appendix A and Appendix B.

Author Response:
We sincerely thank the reviewer for this valuable observation, which has helped improve the focus and organization of the manuscript. To ensure clarity and conciseness, the PRISMA 2020 Checklist (previously Appendix A) has been removed, as it is not referenced in the main text.

However, the Questionnaire Guide and Participant Demographics appendices have been retained in a concise and revised form because they are explicitly cited in the methodology section and contribute directly to the study’s transparency and reproducibility. The Questionnaire Guide outlines the structure of the expert input instrument, while the demographic summary provides essential context for interpreting participant diversity and thematic analysis. Both appendices have been streamlined to remove redundant details and ensure alignment with the reviewer’s recommendations, while maintaining the completeness and methodological integrity of the study.

 

Reviewer Comment:
Currently, as a review, the number of references cited is still relatively small, which may overlook important research findings. In addition, there are formatting issues with the author's references.

Author Response:
We sincerely thank the reviewer for this thoughtful and constructive observation. The references in this study were compiled following the PRISMA 2020 guidelines to ensure comprehensive coverage and methodological rigor. A total of sixty-one peer-reviewed and high-impact publications have been cited, representing the latest research trends and developments in deepfake detection, AI tutoring systems, and governance frameworks. These sources cover the most recent and relevant studies published between 2015 and 2025 across leading journals and publishers, including IEEE, Elsevier, Springer, and MDPI.

All references were managed using Mendeley Reference Manager to maintain consistency, accuracy, and alignment with the journal’s formatting requirements. We believe the current bibliography provides a balanced and up-to-date representation of the most significant research contributions within this emerging field.

 

Reviewer 3 Report

Comments and Suggestions for Authors

Kia ora,

You have taken onboard the comments and suggestions previously noted, I have made a few more there that I hope are beneficial.

Cheers,

 

Comments for author File: Comments.pdf

Author Response

We sincerely thank the reviewer for their careful reading and thoughtful feedback on our revised manuscript. We greatly appreciate the time and effort devoted to improving the precision, clarity, and overall presentation of our work. All suggested changes have been carefully reviewed and incorporated into the updated version. Our detailed responses to each comment are provided below.

 

Comment 1 — Sentence structure error (“when they oversight…”)

Reviewer: Sentence structure error — “when they?? oversight.”
Authors’ Response: We thank the reviewer for noting this grammatical issue. The sentence has been revised for clarity and now reads:

“The regulation adopts a risk-based model that classifies AI systems, including educational applications, as high-risk when they are subject to human oversight requirements and compliance checks.”

 

Comment 2 — “Which scholars? … Is this 24?”

Reviewer: Which scholars? This needs at least one reference to support. Is this 24?
Authors’ Response: We appreciate this helpful observation. To clarify attribution, the revised text now reads:

“Scholars, such as Chesney and Citron [24], are actively debating whether deepfake tutors should be classified as high-risk under the EU AI Act, given their potential impact on learners’ rights.”
We retained [24] and added a concluding sentence for completeness:
“These issues underscore the need for ethically informed design alongside technical innovation.”

 

Comment 3 — Supporting references / singular vs plural

Reviewer: This needs supporting references. Is this ‘recent surveys’—plural?
Authors’ Response: We thank the reviewer for this clarification. The sentence has been aligned with the single supporting reference [53] and now reads:

“A recent survey shows that around 40% of international educators use generative AI weekly …”
If preferred by the editor, we are prepared to include additional survey sources to justify the plural form.

 

Comment 4 — Figure cross-reference (“referred to here, but on page 9?”)

Reviewer: Referred to here, but on page 9?
Authors’ Response: We thank the reviewer for identifying this inconsistency. Figure 1 (PRISMA flow diagram) has been repositioned to appear immediately after Section 3.2.4, at the point of first citation, to improve readability and maintain alignment between the text and figure references.

 

Comment 5 — Appendix order (“Should the first Appendix mentioned be A?”)

Reviewer: Should the first Appendix mentioned be A?
Authors’ Response: We appreciate this helpful observation. The appendices have been reordered and relabeled to follow their first appearance in the text:

  • Appendix A: Participant Demographics
  • Appendix B: Questionnaire Guide (Summary)

All corresponding in-text citations have been updated accordingly.

 

Comment 6 — Reference for Braun & Clarke

Reviewer: Reference needed.
Authors’ Response: We thank the reviewer for this valuable reminder. The foundational citation has been added:

Braun, V.; Clarke, V. Using Thematic Analysis in Psychology. Qualitative Research in Psychology, 2006, 3(2), 77–101. https://doi.org/10.1191/1478088706qp063oa

The sentence now reads:

“We employed a thematic analysis following Braun and Clarke’s well-established six-phase approach [58].”

 

Comment 7 — Themes vs Research Questions

Reviewer: Four dominant themes emerged that happen to reflect the four RQs?
Authors’ Response: We appreciate the reviewer’s insight. This section has been refined to clarify that the themes were derived inductively and subsequently aligned with the study’s three research questions (RQ1–RQ3). The revised text now reads:

“While these themes arose inductively from the coding process, they align closely with the study’s research questions (RQ1–RQ3). The fourth theme extends the ethical dimension of RQ3 by capturing participants’ broader reflections on trust, transparency, and societal acceptance of synthetic tutors.”

 

Comment 8 — Domain counts wording (“delete the word can”)

Reviewer: These do exceed, so suggest delete the word can.
Authors’ Response: We thank the reviewer for this helpful precision. The revision has been made, and the sentence now reads:

“… the domain counts below are therefore non-mutually exclusive and exceed 42.”

 

Comment 9 — Placement of Table 4

Reviewer: Would it be better to move Table 4 up closer to where it is referred to?
Authors’ Response: We appreciate the reviewer’s thoughtful suggestion. Table 4 has been repositioned immediately after its first mention in Section 5.1 to improve continuity and maintain a logical flow with Tables 5 and 6.

 

Comment 10 — Misuse in exams

Reviewer: Who would identify the misuse of a tutor in this manner? If the course uses an AI tutor, why would one not be used in the exam?
Authors’ Response: We thank the reviewer for raising this important point. The paragraph in Section 5.4 has been revised for clarity. The updated text specifies that the concern refers to unauthorized use of synthetic tutors in restricted assessment contexts (e.g., summative exams). It further clarifies that such misuse would be detected through automated monitoring systems and verified by proctors or institutional IT staff. The revised sentence reads:

“… if the unauthorized use of a synthetic tutor is detected in an assessment context (e.g., impersonation or fraudulent draft generation), the exam should be paused, flagged for manual review, and supplemented with alternate items. Detection would typically be identified through automated monitoring and verified by proctors or institutional IT staff.”

 

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

Although the author made minor revisions, some issues remain to be resolved.

(1) Ensure that the term “deepfake-style AI tutor” is clearly defined at first mention in the Introduction, for readers who may be unfamiliar with the concept. Although the term is described later (pp. 10–11), an explicit definition up front would improve clarity. Consider briefly distinguishing “deepfake tutors” from other AI pedagogical agents.

(2) In Section 3 (Methodology), some sentences are lengthy. For example, the description of the mixed-methods approach (lines 873–880) could be split or tightened for readability. A concise presentation of the PRISMA process and thematic analysis steps would help readers follow the methods more easily.

(3) In the Discussion or Conclusion, explicitly note limitations such as the single-institution, single-country sample of expert respondents and the exclusion of non-English studies. Discuss how these factors might affect the generalizability of the findings, and suggest that future work include more diverse participants and multilingual literature. This will strengthen the credibility of the study.

(4) Verify that all figures and tables are clear and well-formatted. For example, check that text in Figure 2 is legible and that the PRISMA flow diagram (Figure 1) is easy to read. Ensure that table captions are fully descriptive; for instance, Table 1’s caption could begin with a full sentence explaining the table content. If space permits, widen columns in tables (such as Table 1 and Table 2) to prevent text crowding.

(5) Some broad claims (e.g. about AI trust and adoption trends) would benefit from additional recent references. For example, statements on the importance of trust in AI adoption could cite recent surveys in educational contexts. Adding such citations will reinforce the literature review and discussion.

(6) Carefully proofread the manuscript to correct minor issues. For instance, address any hyphenation or line-break artifacts (e.g. in Section 3.2.1), ensure consistent use of terminology (e.g. “AI tutor” vs “pedagogical agent”), and check punctuation. These editorial refinements will improve the manuscript’s polish.

Author Response

We are deeply grateful for the reviewer’s time, guidance, and constructive feedback. In this revised version, we have carefully provided a detailed point-by-point response addressing all comments from the first and current review rounds. Each suggestion has been thoughtfully incorporated into the manuscript, and a complete revision note has been prepared to clearly indicate where changes have been made.

Reviewer Comment:
The English could be improved to more clearly express the research.

Response: We sincerely thank the reviewer for this helpful suggestion. The entire manuscript has been carefully reviewed to enhance clarity, readability, and academic tone. Sentences with complex structures have been simplified, and word choices refined to ensure that the research objectives, methods, and findings are expressed more clearly. Minor grammatical and stylistic adjustments were also made throughout the text to improve fluency and coherence while maintaining the intended meaning of the study.



Reviewer Comment 1:
Ensure that the term “deepfake-style AI tutor” is clearly defined at first mention in the Introduction, for readers who may be unfamiliar with the concept. Although the term is described later (pp. 10–11), an explicit definition up front would improve clarity. Consider briefly distinguishing “deepfake tutors” from other AI pedagogical agents.

Response: We sincerely thank the reviewer for this insightful comment. To address the concern, we have reallocated the paragraph that originally defined and distinguished deepfake AI tutors (previously located in paragraph four of the Introduction) to appear immediately after the first mention of the term “deepfake AI tutor.” This structural adjustment ensures that readers encounter a full and explicit definition at the outset, thereby improving conceptual clarity and accessibility for those unfamiliar with the term.

In the revised version, the relocated paragraph now clearly explains that deepfake AI tutors are synthetic, avatar-based instructional agents created using deep learning technologies (e.g., face-swapping or voice-cloning) that either replicate a real person’s likeness or embody entirely synthetic personas with real-time verbal and nonverbal interactivity. It also contrasts them with traditional text-based or talking-head virtual assistants that lack identity mimicry and with agentic AI systems whose risks arise from autonomy rather than identity replication.

This revision not only enhances clarity but also reduces redundancy by consolidating the term’s definition and distinctions in one place, eliminating the need for repetitive explanations later in the Introduction. We appreciate the reviewer’s constructive suggestion, which has significantly improved both the flow and precision of the introductory section.

 

Reviewer Comment 2: In Section 3 (Methodology), some sentences are lengthy. For example, the description of the mixed-methods approach (lines 873–880) could be split or tightened for readability. A concise presentation of the PRISMA process and thematic analysis steps would help readers follow the methods more easily.

Response: We sincerely appreciate the reviewer’s valuable feedback regarding the clarity and conciseness of the Methodology section. In response, we have carefully revised Section 3 to improve readability and structure without altering the methodological rigor or content. Specifically, we:

  • Simplified and split lengthy sentences in Section 3.1 (“Research Design”) to present the mixed-methods approach more clearly, separating the descriptions of the SLR and the expert questionnaire into distinct sentences.
  • Condensed the PRISMA process description (Section 3.2.4) by summarizing record counts in a concise format and referencing Figure 1 for detailed visualization, thereby reducing narrative length and redundancy.
  • Enumerated the thematic analysis steps (Section 3.3.3) following Braun and Clarke’s six-phase framework, using short, structured sentences for each step to enhance readability and transparency.

These revisions collectively make the methodology easier to follow and align with the reviewer’s recommendation for clarity and succinct presentation. We are grateful for this constructive suggestion, which has significantly improved the flow and accessibility of the Methodology section.

Reviewer Comment 3: In the Discussion or Conclusion, explicitly note limitations such as the single-institution, single-country sample of expert respondents and the exclusion of non-English studies. Discuss how these factors might affect the generalizability of the findings, and suggest that future work include more diverse participants and multilingual literature. This will strengthen the credibility of the study.

Response: We thank the reviewer for this valuable observation. The Discussion and Limitations sections of the manuscript already address these considerations. Specifically, the Discussion section (Section 7, paragraph 5) notes that future research should include broader stakeholder perspectives beyond assistant professors and acknowledges that small-scale and context-specific samples may limit generalizability. In addition, the Limitations and Future Research section (Section 8, paragraph 1) explicitly states that the expert sample was drawn from a single institution and that the systematic review was limited to English-language publications, which may restrict global generalizability.

These acknowledgments clarify the scope and contextual boundaries of the study while emphasizing that future work should incorporate participants from multiple institutions, regions, and linguistic backgrounds to strengthen cross-cultural applicability. We appreciate the reviewer’s attention to this important point, which aligns closely with the areas already discussed in the manuscript.

Reviewer Comment 4: Verify that all figures and tables are clear and well-formatted. For example, check that text in Figure 2 is legible and that the PRISMA flow diagram (Figure 1) is easy to read. Ensure that table captions are fully descriptive; for instance, Table 1’s caption could begin with a full sentence explaining the table content. If space permits, widen columns in tables (such as Table 1 and Table 2) to prevent text crowding.

Response: We thank the reviewer for the thoughtful and detailed suggestions regarding the clarity and formatting of figures and tables. In response, we have carefully reviewed all visual elements to ensure readability and consistency. Specifically, the text within Figure 1 (PRISMA flow diagram) and Figure 2 (Conceptual framework) has been verified for legibility, and both figures were resized appropriately to enhance visibility without distortion.

For the tables, Table 1 and Table 2 were adjusted to utilize full page width to minimize text crowding and improve readability. In addition, all table captions were revised to begin with clear, descriptive sentences summarizing each table’s content and purpose.

These adjustments ensure that figures and tables are now fully legible, visually balanced, and aligned with the journal’s formatting expectations. We appreciate the reviewer’s attention to these presentation details, which helped improve the manuscript’s clarity and professional appearance.

Reviewer Comment 5: Some broad claims (e.g. about AI trust and adoption trends) would benefit from additional recent references. For example, statements on the importance of trust in AI adoption could cite recent surveys in educational contexts. Adding such citations will reinforce the literature review and discussion.

Response: We thank the reviewer for this valuable suggestion. In line with the recommendation, additional recent studies have been incorporated into the Discussion section to strengthen claims related to trust and adoption of generative AI tools in higher education. Specifically, two recent empirical works, Zhang and Reusch (2025) and Luo (2024), have been cited to provide up-to-date evidence on how trust influences adoption behaviors and the dynamics of teacher–student relationships in AI-supported learning environments. These sources complement the previously cited surveys and reinforce the argument that user trust and familiarity remain central determinants of AI acceptance in educational contexts.

We appreciate the reviewer’s insightful comment, which has helped ensure that the discussion reflects the most recent empirical findings and enhances the manuscript’s scholarly grounding.

Reviewer Comment 6: Carefully proofread the manuscript to correct minor issues. For instance, address any hyphenation or line-break artifacts (e.g. in Section 3.2.1), ensure consistent use of terminology (e.g. “AI tutor” vs “pedagogical agent”), and check punctuation. These editorial refinements will improve the manuscript’s polish.

Response: We thank the reviewer for this helpful comment. The entire manuscript has been carefully reviewed to ensure consistency and clarity. Minor grammatical and punctuation issues were corrected, and terminology has been standardized so that “AI tutor” is used consistently throughout, except where “pedagogical agent” is referenced in comparative or historical contexts.

Regarding hyphenation, we note that the MDPI journal template automatically applies line-break and hyphenation formatting according to layout rules. Therefore, these are not manually controlled in the manuscript file. Nonetheless, the content has been rechecked to confirm that no manual hyphenation or formatting artifacts remain.

These refinements have improved the manuscript’s overall polish and ensured conformity with the journal’s editorial standards.

 
