Review

Transparency Mechanisms for Generative AI Use in Higher Education Assessment: A Systematic Scoping Review (2022–2026)

by Itahisa Pérez-Pérez 1, Miriam Catalina González-Afonso 2,*, Zeus Plasencia-Carballo 3 and David Pérez-Jorge 2,*
1 Department of History and Philosophy of Science, Education and Language, Faculty of Education, University of La Laguna, 38204 San Cristóbal de La Laguna, Spain
2 Department of Didactics and Educational Research, Faculty of Education, University of La Laguna, 38204 San Cristóbal de La Laguna, Spain
3 Department of Specific Didactics, Faculty of Education, University of La Laguna, 38204 San Cristóbal de La Laguna, Spain
* Authors to whom correspondence should be addressed.
Computers 2026, 15(2), 111; https://doi.org/10.3390/computers15020111
Submission received: 15 January 2026 / Revised: 31 January 2026 / Accepted: 5 February 2026 / Published: 6 February 2026
(This article belongs to the Special Issue Recent Advances in Computer-Assisted Learning (2nd Edition))

Abstract

The integration of generative AI in higher education has reignited debates around authorship and academic integrity, prompting approaches that emphasize transparency. This study identifies and synthesizes the transparency mechanisms described for assessment involving generative AI, recognizes implementation patterns, and analyzes the available evidence regarding compliance monitoring, rigor, workload, and acceptability. A scoping review (PRISMA 2020) was conducted using searches in Scopus, Web of Science, ERIC, and IEEE Xplore (2022–2026). Out of 92 records, 11 studies were included, and four dimensions were coded: compliance assessment approach, specified requirements, implementation patterns, and reported evidence. The results indicate limited operationalization: the absence of explicit assessment (27.3%) and unverified self-disclosure (18.2%) are predominant, along with implicit instructor judgment (18.2%). Requirements are often poorly specified (45.5%), and evidence concerning workload and acceptability is rarely reported (63.6%). Overall, the literature suggests that transparency is more feasible when it is proportionate, grounded in clear expectations, and aligned with the assessment design, while avoiding punitive or overly surveillant dynamics. The review protocol was prospectively registered in PROSPERO (CRD420261287226).

1. Introduction

1.1. Assessment, Authorship, and Academic Integrity in the Age of Generative AI

The rapid integration of generative AI (GenAI) tools into higher education has sparked extensive debates on authorship, originality, the validity of learning evidence, and academic integrity, particularly in written assignments and remote assessments [1,2,3]. Various studies show that students exhibit a high willingness to engage with tools such as ChatGPT (V.5.0) in academic contexts, perceiving them as supportive for learning and task completion, which underscores the need to address their use through explicit and transparent frameworks [4].
In this context, a significant portion of recent literature examines how GenAI reshapes the conditions under which assessable work is produced, thereby challenging traditional foundations of grading and performance attribution [5,6,7].
Existing studies consistently highlight that uncertainty regarding the use of GenAI can lead to implicit instructor judgments and tensions around assessment fairness, especially when criteria are not clearly articulated or are applied arbitrarily [5,8]. Likewise, evidence suggests that the perception of punishment or stigma may encourage concealment strategies, fostering a low-trust environment and reinforcing control strategies rather than promoting learning-oriented ones [9,10].
While some studies focus on detection or debates about its reliability, evidence points to significant limitations (ethical risks, relational impacts, and surveillance effects), which have shifted the conversation towards more pedagogical and governance-oriented alternatives [2,7,11]. So far, findings point not only to a technical challenge but also to epistemological and pedagogical ones. In this regard, recent systematic reviews confirm that generative AI is being increasingly integrated into teaching and learning practices, though inconsistently and with still limited attention to its evaluative and ethical implications [12]. Other reviews also suggest that generative AI can support the development of complex cognitive skills—such as creativity, flexibility, and critical thinking—provided its use is pedagogically guided and embedded within ethical and reflective frameworks, avoiding opaque or purely instrumental approaches [13].
Therefore, we are entering a renewed debate on what it means to “assess” when the product (i.e., the assessable output) may be mediated by GenAI, and what conditions render such assessment legitimate [6,14]. Recent empirical studies comparing the performance of AI systems and students in problem-solving tasks show that the efficiency of the output does not necessarily equate to understanding, reasoning, or deep learning, reinforcing the need for caution when considering the final product as direct evidence of learning [15].

1.2. Disclosure, Attribution, and Academic Responsibility

The limitations of approaches focused on control and detection have prompted the emergence of studies centered on models grounded in transparency and declared responsibility, shifting the emphasis from proving use to making it traceable and justifiable [11,16,17]. This approach is reflected in works that promote disclosure statements (coversheets or forms), attribution/citation of tools, and, in some cases, process-based evidence (e.g., marked outputs, appendices, or selective logs) [18,19]. Along similar lines, self-assessment instruments have been proposed to explicitly teach writing ethics in contexts mediated by generative AI, with the aim of reinforcing students’ self-regulation, reflection, and academic responsibility [20].
Current studies highlight a critical strand that questions the efficacy and legitimacy of detection due to issues such as false positives, biases, disciplinary asymmetries, and the escalation of surveillance. This strand instead proposes governance alternatives grounded in greater proportionality [6,7,11]. Within this framework, transparency is presented as a strategy aligned with principles of integrity by clarifying the boundaries of student responsibility for the final output [1,16,17].
However, the literature notes that the term transparency is used heterogeneously: in many studies, it appears as a normative principle (ethics, integrity, good practice) without specifying operational requirements, while others move towards more concrete implementations [2,11]. This gap between discourse and operationalization is key to understanding why, even when disclosure is requested, undeclared or strategically framed use may still persist [9,10].

1.3. From Principles to Practice: Implementation, Assessment, and Feasibility of Transparency Mechanisms

Beyond the emerging consensus on the need to “promote transparency,” studies reveal considerable variation concerning how it should be implemented—ranging from symbolic declarations with no evaluative consequences to designs that incorporate transparency as an assessment criterion [8,16,17]. A central axis of variation is compliance assessment: some studies describe unverified self-disclosure, others indicate implicit instructor judgment in grading, and a minority propose explicit instruments (rubrics/checklists) or verifiable evidence of process [6,8,9,17].
In terms of rigor, practices range from binary declarations (use/no use) to descriptive requirements (tool used, purpose), process evidence (marked outputs, appendices), or explicit reflection and verification [17,18,19]. This variability is closely tied to feasibility: perceived workload, acceptability, incentives for concealment, and surveillance risks [7,9,10,11].
Other studies highlight risks and tensions such as privacy concerns (particularly when exhaustive logs are required), equity (unequal access to tools), and unintended consequences (strategic disclosure, stigmatization, informal penalization) [5,7,9,21]. From a sustainability and educational equity perspective, it has been argued that AI governance in education should balance the benefits of efficiency and personalization while safeguarding aspects such as bias, accessibility, and data protection, especially when transparency mechanisms are embedded within assessment practices [22].
Overall, current research suggests that the feasibility of transparency depends on its proportionality (selective vs. total evidence), regulatory clarity, and pedagogical alignment with learning goals and authentic assessment [6,16,17].
Against this backdrop, the aim of this study is to identify and synthesize the transparency mechanisms described in the literature on assessment involving generative AI in higher education, recognize implementation patterns, and analyze the available evidence on compliance monitoring, levels of rigor, workload, and acceptability. The review protocol was prospectively registered in the PROSPERO database (CRD420261287226) [23].

2. Materials and Methods

2.1. Study Design

We conducted a systematic scoping review following the PRISMA-ScR checklist (the PRISMA extension specific to scoping reviews; see Supplementary Materials), the methodological framework proposed by Arksey and O’Malley [24], and the PRISMA 2020 guidelines. This approach was considered the most appropriate given the emerging nature of the topic, the diversity of document types, and the heterogeneity of practices related to transparency in the use of generative AI in university assessment.
The protocol for this scoping review was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD420261287226) [23] prior to data extraction.

2.2. Search Strategy and Information Sources

Searches were conducted across four internationally recognized databases widely used in educational and technological research: Scopus, Web of Science (WoS), ERIC, and IEEE Xplore. These databases were selected for their complementary coverage of literature in higher education, educational technologies, and artificial intelligence studies.
The search strategy combined terms related to the following:
(a)
Generative AI (e.g., ChatGPT, generative AI);
(b)
Higher education;
(c)
Academic assessment (assessment, assignment, coursework);
(d)
Transparency mechanisms, including disclosure, citation, attribution, and prompt logs.
Search queries were adapted to the syntax and specific fields of each database while maintaining a consistent conceptual logic. Searches were carried out on 20 December 2025 (last updated: 10 January 2026) and limited to the 2022–2026 period, aiming to include research that emerged following the introduction and widespread adoption of generative AI tools in the university context.
Although the review window was defined as 2022–2026, the database searches were conducted in December 2025 and updated in January 2026. Records dated 2026 were retrieved through early access or online-first indexing available at the time of the update. In addition, a verification process was carried out to ensure that all eligible manuscripts indexed up to 1 February 2026 were included, thereby operationally defining the upper boundary of the review period.
To strengthen the comprehensiveness and sensitivity of the search strategy, an additional verification subprocess was conducted to assess whether the inclusion of widely used and emerging terms for generative artificial intelligence—such as artificial intelligence, AI, large language model, LLM, GPT, and ChatGPT—would identify relevant research not captured by the original search equations.
This subprocess was applied to the same databases used in the initial search (Scopus, Web of Science, ERIC, and IEEE Xplore), maintaining the same time frame (2022–2026), search fields, and inclusion/exclusion criteria. The alternative terms were incorporated into the original search equations using OR operators, preserving the conceptual focus on higher education, academic assessment, and explicit transparency mechanisms.
The terminological expansion increased the total number of retrieved records across all databases by 16. However, following title, abstract, and full-text screening, no additional studies met the inclusion criteria or contributed new transparency mechanisms relevant to the review objectives. Most of the additional records addressed general uses of generative AI, user perceptions, broad regulatory frameworks, or detection-focused approaches—areas already excluded by the primary search design.
Overall, this verification subprocess supports the adequacy and validity of the search strategy employed to identify research on transparency mechanisms in generative AI-mediated assessment in higher education. For transparency and reproducibility, the complete database-specific search strings—including searched fields, Boolean operators, applied limits, and exact execution dates—are provided in Appendix A.
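The conceptual logic of the expanded search can be sketched programmatically. The block below is an illustration only, not the exact syntax submitted to any database: field tags, quoting rules, and truncation differ across Scopus, Web of Science, ERIC, and IEEE Xplore, and the authoritative database-specific strings are those reported in Appendix A. The term lists shown combine the original concepts with the alternative terms described above.

```python
# Illustrative sketch of the Boolean search logic (assumed structure; the exact
# database-specific strings, fields, and limits are reported in Appendix A).

GENAI_TERMS = ["generative AI", "ChatGPT", "artificial intelligence", "AI",
               "large language model", "LLM", "GPT"]
CONTEXT_TERMS = ["higher education", "university"]
ASSESSMENT_TERMS = ["assessment", "assignment", "coursework"]
TRANSPARENCY_TERMS = ["disclosure", "citation", "attribution", "prompt log"]

def or_block(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_query(*term_blocks):
    """AND the synonym blocks together, preserving the conceptual logic."""
    return " AND ".join(or_block(block) for block in term_blocks)

query = build_query(GENAI_TERMS, CONTEXT_TERMS, ASSESSMENT_TERMS, TRANSPARENCY_TERMS)
print(query)
```

Incorporating the alternative generative AI terms then amounts to extending `GENAI_TERMS`, which is how the verification subprocess preserved the original conceptual focus while broadening sensitivity.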

2.3. Search Queries and Results by Database

To ensure the traceability and transparency of the study identification process, Table 1 presents the search queries used for each database. For each source, the total number of records retrieved and the number of studies ultimately included, following the application of eligibility criteria and full-text analysis, are reported.
In total, the initial searches retrieved 92 records, of which 11 studies met the inclusion criteria and were included in the final analysis.

2.4. Study Selection Process

After the initial identification of records in the selected databases, duplicates across sources were removed (N = 1). The remaining records were screened by title and abstract to exclude those that did not align with the study’s objective.
The full texts of potentially eligible documents were then reviewed, with the predefined inclusion and exclusion criteria applied systematically. At this stage, studies focused on AI detection tools, uses of generative AI unrelated to assessment processes, or non-university educational contexts (K–12) were excluded.
In addition to database searching, a supplementary citation-based search was conducted to enhance the comprehensiveness of the identification process. This procedure included both backward snowballing, consisting of a systematic review of the reference lists of all included studies (N = 11), and forward snowballing, using citation tracking in Scopus and Web of Science to identify publications citing the included studies.
This process yielded 16 additional records, all of which were screened at the title, abstract, and full-text levels against the same predefined inclusion and exclusion criteria. None met the eligibility criteria: the records identified through citation tracking primarily addressed general uses of generative AI, institutional or policy-level discussions, or detection-oriented approaches, all of which fall outside the scope of this review.
The snowballing procedure and its outcomes are reported as a separate identification pathway in the PRISMA 2020 flow diagram.
The entire selection process is illustrated in Figure 1 using the PRISMA 2020 flow diagram, which details the number of records identified, excluded at each stage, and ultimately included, along with the main reasons for exclusion.
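The counts reported in this section can be reconciled in a short tally of the two identification pathways. This is a sketch under the assumption that every non-duplicate database record not ultimately included was excluded at the title/abstract or full-text stage; the PRISMA 2020 flow diagram in Figure 1 remains the authoritative record of the stage-by-stage breakdown.

```python
# Tally of the two identification pathways described in the text.
database_records = 92          # retrieved from Scopus, WoS, ERIC, IEEE Xplore
duplicates_removed = 1
included_from_databases = 11

snowball_records = 16          # backward + forward citation tracking
included_from_snowballing = 0  # none met the eligibility criteria

screened = database_records - duplicates_removed
excluded_at_screening_or_full_text = screened - included_from_databases

total_included = included_from_databases + included_from_snowballing
print(screened)                              # 91 records screened
print(excluded_at_screening_or_full_text)    # 80 records excluded
print(total_included)                        # 11 studies included
```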

2.5. Inclusion and Exclusion Criteria

To guarantee the coherence and relevance of the studies included in the review, explicit inclusion and exclusion criteria were defined in alignment with the study’s objective and the scoping review approach. These criteria were applied systematically during the title and abstract screening phase, as well as in the full-text review phase. This process narrowed the final selection to documents that explicitly address transparency mechanisms in the use of generative AI within assessment contexts in higher education. Table 2 summarizes the criteria used to select the studies.

2.6. Data Extraction and Analytical Framework

Data extraction was carried out using a structured, purpose-designed template that enabled the systematic recording of both descriptive information about the included studies and analytical variables related to transparency mechanisms in the use of generative AI in assessable work.
To avoid excessively fragmented coding, the analysis was organized around four analytical dimensions, using synthetic variables coded exclusively based on the explicit evidence reported in the texts.
For transparency and reproducibility, each analytical category was operationally defined a priori, and studies were coded only when explicit descriptions of the corresponding transparency mechanism or evaluation practice were reported in the text. Operational definitions for all analytical categories, along with illustrative examples linking each category to the included studies, are provided in Appendix B.
The evaluation approach dimension examines how compliance with transparency requirements is assessed, considering the presence of rubrics, assessable criteria, penalties, manual verification, self-disclosure, or the absence of evaluation. The requirements specified dimension analyzes the level of detail required in transparency practices, ranging from minimal statements to process evidence and critical reflection. The implementation pattern dimension identifies combinations of transparency mechanisms according to the type of assessment and the evaluative approach. Finally, the reported evidence dimension captures the evidence reported on workload, acceptability, and the practical feasibility of the mechanisms described.
This analytical framework enabled both descriptive counts and a systematic analysis of implementation patterns, maintaining a balance between methodological coherence and analytical flexibility—essential in studies of this nature.
The selection, data extraction, and analytical coding of the included studies were conducted iteratively by the research team to ensure coherence, traceability, and consistency throughout the review process. Title and abstract screening, as well as full-text eligibility assessment, were conducted independently by three reviewers using the predefined inclusion and exclusion criteria. Reasons for exclusion at the full-text stage were systematically documented and are reported in the PRISMA 2020 flow diagram.
Following the initial identification and screening of records, the full texts of potentially relevant studies were thoroughly analyzed, and information was extracted using a common template aligned with the research questions and the analytical framework outlined above.
To guarantee consistency in the review process and uniform application of the analytical framework, the following methodological control procedures were adopted:
During the full-text selection phase and the coding of the four analytical dimensions (compliance evaluation approach, level of specified requirements, implementation patterns, and reported evidence), cross-checks were conducted at predefined control points, with special attention to the assignment of synthetic categories and the interpretation of the transparency mechanisms described.
Discrepancies were resolved through informed discussion and consensus, based on a joint review of the full texts and strict application of the analytical framework, always prioritizing the explicit evidence reported by the studies and avoiding unsupported inferences.
This procedure helped reduce interpretive variability and ensure coding consistency without resorting to formal inter-rater agreement metrics, which were deemed less suitable given the conceptual and exploratory nature of this study and the limited sample size typical of a scoping review in an emerging field.

3. Results

3.1. General Characteristics of the Included Studies

Table 3 provides a descriptive summary of the eleven studies included in the review. The sample is composed primarily of empirical studies (N = 7; 63.6%), along with case studies or action research (N = 3; 27.3%) and one institutional policy analysis (N = 1; 9.1%).
The studies are situated in various higher education contexts and cover a range of disciplines. Assessable written work (essays, reports, or coursework) predominates, although some studies consider multiple assessment formats. The transparency mechanisms described include AI use disclosure, citation or attribution, marked outputs, and, to a lesser extent, rubrics or checklists.
Table 4 provides a cross-sectional summary of the results obtained from the analytical coding of the eleven included studies, showing the frequency distribution and percentages for each category within the four analytical dimensions.
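The frequencies and percentages summarized in Table 4 follow directly from the coding of the eleven studies. As a check on the rounding convention used throughout the Results (percentages over N = 11, reported to one decimal place), the computation can be sketched as follows; the category labels are paraphrases of those used in Section 3.2, not verbatim codes from the extraction template.

```python
from collections import Counter

N = 11  # included studies

# Coded categories for one dimension (evaluation approach), per Section 3.2.
evaluation_approach = (
    ["no explicit evaluation"] * 3
    + ["unverified self-disclosure"] * 2
    + ["implicit instructor judgment"] * 2
    + ["rubric/checklist"]
    + ["criteria + spot-check verification"]
    + ["reactive management"]
    + ["documented undeclared use"]
)

def frequency_table(codes, n=N):
    """Counts and one-decimal percentages, as reported in Table 4."""
    return {category: (count, round(100 * count / n, 1))
            for category, count in Counter(codes).items()}

table = frequency_table(evaluation_approach)
print(table["no explicit evaluation"])      # (3, 27.3)
print(table["unverified self-disclosure"])  # (2, 18.2)
print(table["rubric/checklist"])            # (1, 9.1)
```

The same computation applies to the other three dimensions, which explains the recurring 9.1%, 18.2%, and 27.3% values across Sections 3.2 through 3.5.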

3.2. Approaches to Evaluating Compliance with Transparency Requirements (Evaluation Approach)

Regarding the evaluation of compliance with transparency requirements, seven distinct approaches were identified across the eleven included studies (N = 11). The most common approach was the absence of explicit evaluation of compliance, found in 3 studies (27.3%), in which transparency is mentioned as a principle or recommendation without any associated evaluation mechanisms.
In two studies (18.2%), transparency is implemented through self-disclosure without subsequent verification, while two other studies (18.2%) describe instructor judgment or implicit penalties tied to grading, without standardized instruments. Lighter forms of assurance appear less frequently: one study (9.1%) uses an explicit rubric or checklist, and one study (9.1%) combines evaluable criteria with spot-check verification based on process evidence.
Lastly, one study (9.1%) describes reactive management of compliance, and one additional study (9.1%) focuses on documenting undeclared or hidden use without proposing formal evaluation mechanisms (Figure 2).

3.3. Level of Detail Required in Transparency Practices (Requirements Specified)

The level of detail required in transparency practices varies considerably across the included studies. In 5 of the 11 studies (45.5%), the requirements are not specified operationally and are limited to general references to ethics or academic integrity.
In two studies (18.2%), only a minimal declaration regarding the use or non-use of generative AI is required, while two other studies (18.2%) request a descriptive level that includes some explanation of the use or its purpose. Only one study (9.1%) requires evidence of the process, and one additional study (9.1%) incorporates critical reflection and an explicit declaration of verification and responsibility (Figure 3).

3.4. Implementation Patterns of Transparency Mechanisms

The analysis of implementation patterns reveals a predominance of low-traceability configurations. Specifically, two studies (18.2%) rely on mandatory disclosures without systematic verification, and two other studies (18.2%) describe narrative or recommended disclosures with no explicit evaluative consequences.
Configurations that combine disclosure with light evaluative instruments appear only in a limited number of cases (N = 1; 9.1%), as do those that integrate process evidence directly into the submitted assessment (N = 1; 9.1%). A subset of studies documents implementation gaps, describing contexts of hidden or undeclared use (N = 2; 18.2%), while one study (9.1%) is situated at the level of institutional policy, without specific evaluative procedures.

3.5. Reported Evidence on Compliance, Workload, and Acceptability

Empirical or descriptive evidence on the practical feasibility of transparency mechanisms is limited. Only three studies (27.3%) report explicit data on compliance or adherence, generally pointing to challenges in disclosure or undeclared use of AI.
Regarding acceptability, three studies (27.3%) involve gathering perceptions from students or faculty, revealing varying levels of acceptance and resistance. Workload is explicitly mentioned or inferable in four studies (36.4%), with minimal approaches associated with low workload and approaches focused on control and detection linked to moderate workload. The majority of studies (N = 7; 63.6%) do not report specific information on workload or acceptability.
Overall, the results show that the literature more frequently describes transparency mechanisms and their formats than verifiable procedures for evaluating compliance, and that evidence on workload and acceptability is reported inconsistently. This pattern suggests that the field is still in a phase of formulating and testing approaches, rather than systematically evaluating their feasibility.

4. Discussion

The findings of this review provide deeper insights into the debates addressed in this study regarding authorship, assessment equity, and academic integrity in contexts mediated by generative AI. In particular, the analysis reveals a clear tension between recognizing transparency as a desirable principle and its effective integration into real assessment practices in higher education.
Although the reviewed literature shows an emerging consensus on the need to promote transparency in AI use, the results indicate that, in most studies, such transparency is not accompanied by explicit mechanisms to evaluate compliance. This finding is especially relevant considering previous work warning that when criteria are not formalized, the evaluation of AI use does not disappear but instead shifts towards implicit forms of instructor judgment embedded in grading [5,8]. In such contexts, the absence of rubrics, checklists, or clear procedures can increase arbitrariness and evaluative bias, reinforcing concerns about fairness already highlighted by many authors [5,9].
The results confirm that requiring declarations about AI use does not, in itself, guarantee effective transparency. Several studies highlight the persistence of undeclared use or concealment strategies, even in contexts where disclosure is mandatory. This pattern supports prior research suggesting that fear of sanctions, stigmatization, or informal penalties may encourage defensive behaviors among students [9,10]. In this regard, the review suggests that transparency, when introduced without regulatory clarity or evaluative safeguards, can reproduce the same dynamics of distrust associated with detection-based approaches. Importantly, the absence of explicit transparency mechanisms identified in this review should be interpreted as an absence of reported and operationalized practices in the published literature, rather than as definitive evidence that such mechanisms are not being implemented in higher education practice more broadly.
From this perspective, our findings should be understood as a mapping of how transparency is currently articulated and operationalized in the published literature, rather than as a comprehensive account of institutional practice. These findings also align with critical literature on the limitations of AI detection tools and the shift towards governance models based on declared responsibility [6,7,11]. While this shift is clearly visible at the discursive level, our results highlight that its operationalization remains complex. In many cases, transparency is framed as an ethical principle or institutional recommendation, without clear translation into concrete, assessable tasks, reinforcing the policy-practice gap previously identified in institutional analysis studies [11].
In contrast, the few studies that integrate transparency as an explicit part of assessment design, through rubrics, evaluative criteria, or selective process evidence, suggest that these approaches may promote greater pedagogical coherence [16,17,18]. However, such models remain in the minority and raise important questions about their scalability, workload, and acceptance, factors that, as shown in this review, are rarely documented systematically.
Finally, our findings reinforce the idea that the feasibility of transparency mechanisms depends on their proportionality and alignment with learning and assessment objectives. The analyzed literature points to clear risks related to privacy, equity, and the generation of unintended incentives—particularly when exhaustive process evidence is required or when regulatory frameworks remain ambiguous [5,7,11].
In this sense, the identified patterns suggest four key implications for integrating transparency into university assessment involving generative AI: (1) If disclosure is requested, it is important to clarify what constitutes compliance and how it will be evaluated, to avoid reliance on implicit judgments; (2) selective process evidence should be prioritized over exhaustive logs, minimizing privacy risks and surveillance escalation; (3) the level of rigor should be proportional to the pedagogical purpose and the weight of the task [21]; and (4) when the normative framework is ambiguous or perceived as punitive, it is advisable to anticipate unintended effects such as concealment, stigmatization, or informal penalties.
On the whole, the findings of this study suggest that transparency can only fulfill its pedagogical and ethical function if it is conceived not as a mere declaration but as an explicit, coherent, and carefully designed component of assessment.

5. Limitations

This study presents several limitations that should be considered when interpreting the results. First, as a scoping review, its objective was to identify and synthesize the practices described in the literature, not to evaluate their effectiveness or the methodological quality of the included studies.
Second, the final sample of eleven studies, while consistent with the emerging and specific nature of the topic, limits the generalizability of the findings. Additionally, the coding was based exclusively on the explicit evidence reported in the texts, which may have underestimated transparency practices that were not described in detail.
Finally, the coding process relied on cross-checking and informed consensus, without the use of formal inter-rater agreement metrics. This was a methodologically coherent decision given the exploratory and conceptual nature of the study, but it may affect the strict replicability of some analytical decisions.
Notably, the small number of included studies should not be interpreted as a limitation of the search strategy but rather as an empirical indicator of the nascent and still weakly consolidated state of research on transparency mechanisms in generative AI-mediated assessment.

6. Conclusions

This scoping review confirms that, while transparency in the use of generative AI is increasingly promoted as a preferred alternative to detection-based approaches, its implementation in university assessment remains limited and inconsistent. Most of the analyzed studies describe transparency mechanisms without systematically integrating them into compliance evaluation, with self-disclosure and implicit instructor judgment being the most common approaches.
The results reveal a gap between normative discourse and the practical operationalization of transparency, with requirements often poorly specified and little demand for evidence or critical reflection. The findings suggest that transparency mechanisms are more feasible and acceptable when they are designed to be proportional, clearly defined, and aligned with pedagogical goals while avoiding punitive dynamics or excessive surveillance.
This study emphasizes the need to move toward explicit and assessable transparency approaches that uphold academic integrity without undermining trust or the legitimacy of assessment in contexts mediated by generative AI.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/computers15020111/s1, Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) Checklist.

Author Contributions

Conceptualization, methodology, I.P.-P., D.P.-J. and M.C.G.-A.; software, D.P.-J., M.C.G.-A. and Z.P.-C.; validation, I.P.-P., D.P.-J. and M.C.G.-A.; formal analysis, I.P.-P., M.C.G.-A. and Z.P.-C.; investigation, D.P.-J., M.C.G.-A. and Z.P.-C.; resources, I.P.-P., M.C.G.-A. and Z.P.-C.; data curation, I.P.-P., M.C.G.-A. and D.P.-J.; writing—original draft preparation, I.P.-P., M.C.G.-A. and D.P.-J.; writing—review and editing, M.C.G.-A., D.P.-J. and Z.P.-C.; visualization, M.C.G.-A. and Z.P.-C.; supervision, D.P.-J., M.C.G.-A. and Z.P.-C.; project administration, I.P.-P., M.C.G.-A. and D.P.-J.; funding acquisition, D.P.-J. and M.C.G.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

This manuscript was prepared by members of the research group at the University of La Laguna (DISAE). Generative artificial intelligence (ChatGPT, OpenAI) was used to assist with language editing, stylistic refinement, and clarity of expression during the preparation of this manuscript. The tool was accessed between December 2025 and February 2026. The selection of studies, screening, data extraction, coding, analysis, and interpretation of results were conducted by the authors. AI tools did not contribute to data generation, analytical decisions, or substantive interpretation. The authors assume full responsibility for the content of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Database-Specific Search Strategies

Appendix A provides the complete, database-specific search strings used in this scoping review, including searched fields, Boolean operators, applied limits, and exact execution dates, to ensure full transparency and reproducibility of the search process.
Table A1. Search strategies and parameters used across databases.
Database | Searched Fields | Complete Search String | Filters/Limits Applied | Date(s) of Execution
Scopus | TITLE-ABS-KEY | (“AI disclosure” OR “disclosure statement” OR “AI citation” OR “prompt log” OR “prompt logs”) AND (ChatGPT OR “generative AI”) AND “higher education” AND (assessment OR assignment OR coursework) | Publication years: 2022–2026; Document type: Article, Conference Paper; Language: No restrictions | 20 December 2025; updated 10 January 2026
Web of Science (WoS Core Collection) | TOPIC (TS) | (ChatGPT OR “generative AI”) AND “higher education” AND (assessment OR assignment OR coursework) AND (disclosure OR transparency OR attribution OR citation) | Timespan: 2022–2026; Indexes: SCI-EXPANDED, SSCI, ESCI; Document type: Article, Proceedings Paper; Language: No restrictions | 20 December 2025; updated 10 January 2026
ERIC | All Fields | (ChatGPT OR “generative AI”) AND (“higher education”) AND (assessment OR assignment OR coursework) AND (disclosure OR transparency OR attribution OR citation) | Publication years: 2022–2026; Source type: Journals; Language: No restrictions | 20 December 2025; updated 10 January 2026
IEEE Xplore | Metadata only (title, abstract, keywords) | (ChatGPT OR “generative AI”) AND “higher education” AND (assessment OR assignment OR coursework) AND (disclosure OR transparency OR attribution OR citation) | Publication years: 2022–2026; Content type: Journals, Conference Proceedings; Language: No restrictions | 20 December 2025; updated 10 January 2026
Note: Although the review window was defined as 2022–2026, searches were conducted in December 2025 and updated in January 2026. Records dated 2026 were retrieved through early access or online-first indexing available at the time of the update. An additional verification process ensured the inclusion of eligible records indexed up to 1 February 2026.
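The database strings in Table A1 share a common Boolean core of four AND-joined term groups (tool, setting, task, transparency); Scopus additionally prefixes a disclosure-specific phrase group inside its TITLE-ABS-KEY field. For readers who wish to script the search for replication purposes, that shared core can be assembled programmatically. The sketch below is illustrative only and was not part of the registered protocol; the helper names `or_group` and `build_query` are ours, not the authors'.

```python
# Illustrative sketch (not part of the review protocol): assemble the Boolean
# core shared by the WoS, ERIC, and IEEE Xplore queries in Table A1.

def or_group(terms):
    """Join terms with OR, quoting multi-word phrases, and wrap in parentheses."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_query():
    tools = ["ChatGPT", "generative AI"]
    setting = ["higher education"]
    tasks = ["assessment", "assignment", "coursework"]
    transparency = ["disclosure", "transparency", "attribution", "citation"]
    return " AND ".join(or_group(g) for g in (tools, setting, tasks, transparency))

print(build_query())
# -> (ChatGPT OR "generative AI") AND ("higher education") AND
#    (assessment OR assignment OR coursework) AND
#    (disclosure OR transparency OR attribution OR citation)
```

For Scopus, the resulting core would be wrapped as TITLE-ABS-KEY(...) and preceded by the disclosure phrase group listed in Table A1.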

Appendix B. Operational Definitions and Coding Traceability

Table A2. Coding framework and operational definitions for AI disclosure practices.
Dimension | Category | Operational Definition (Applied Coding) | Study Coded
Evaluation approach | No explicit evaluation | Transparency is mentioned as a principle or recommendation without defined criteria or verification procedures. | Maguire et al. [19]
Evaluation approach | Unverified self-disclosure | Mandatory or requested declaration of AI use with no verification or evaluative consequences. | Gonsalves [9]
Evaluation approach | Instructor judgment | AI use is implicitly considered in grading decisions without standardized instruments. | Spirgi [8]
Evaluation approach | Light rubric/checklist | Disclosure accompanied by a simple rubric or checklist used for guidance or partial evaluation. | Overono & Ditta [16]
Evaluation approach | Criteria + spot verification | Explicit evaluative criteria combined with selective verification of process evidence. | Cotelli Kureth et al. [17]
Evaluation approach | Reactive management | AI use is addressed only when suspected, without predefined procedures. | Adnin et al. [10]
Evaluation approach | Hidden use | Absence of formal mechanisms; undeclared AI use is reported or inferred. | Kirsanov et al. [25]
Requirements specified | Not specified | Normative or ethical references to transparency without operational requirements. | Al-Hajaya [1]
Requirements specified | Minimal (use/no use) | Binary declaration indicating whether AI was used or not. | Spirgi [8]
Requirements specified | Descriptive | Narrative description of how AI was used and for what purpose. | Maguire et al. [19]
Requirements specified | With evidence | Submission of process evidence (e.g., marked outputs, appendices). | Cotelli Kureth et al. [17]
Requirements specified | Reflection + verification | Declaration combined with ethical reflection and explicit responsibility. | García Ramos [18]
Implementation pattern | Mandatory disclosure | Disclosure required but not integrated into assessment or evaluation. | Gonsalves [9]
Implementation pattern | Narrative, non-evaluable | Disclosure encouraged or requested without grading impact. | Maguire et al. [19]
Implementation pattern | Disclosure + rubric | Disclosure explicitly integrated with a checklist or rubric. | Overono & Ditta [16]
Implementation pattern | Integrated evidence | Process evidence embedded within the assessment design. | Cotelli Kureth et al. [17]
Implementation pattern | Implementation gap | Reported mismatch between formal expectations and actual student practices. | Adnin et al. [10]
Implementation pattern | Policy only | Institutional guidelines without concrete assessment procedures. | Dabis & Csáki [11]
Reported evidence | Compliance | Explicit reporting of adherence or non-adherence to transparency requirements. | Cotelli Kureth et al. [17]
Reported evidence | Acceptance | Empirical or descriptive evidence of student or faculty perceptions. | Overono & Ditta [16]
Reported evidence | Workload | Explicit discussion of workload implications for students or instructors. | Maguire et al. [19]
Reported evidence | No evidence | No empirical or descriptive feasibility evidence reported. | Dabis & Csáki [11]

References

  1. Al-Hajaya, K. Academic integrity is under fire in the Generative AI age: Insights from accounting educators to overcome challenges, threats and ethical concerns. High. Educ. Ski. Work-Based Learn. 2025, early access, 1–19. [Google Scholar] [CrossRef]
  2. Gallent-Torres, C.; Zapata-González, A.; Ortego-Hernando, J. The impact of generative artificial intelligence in higher education: A focus on ethics and academic integrity. RELIEVE Rev. Electrón. Investig. Eval. Educ. 2023, 29, 2. [Google Scholar] [CrossRef]
  3. Revell, T.; Yeadon, W.; Cahilly-Bretzin, G.; Clarke, I.; Manning, G.; Jones, J.; Mulley, C.; Pascual, R.J.; Bradley, N.; Thomas, D.; et al. ChatGPT versus human essayists: An exploration of the impact of artificial intelligence for authorship and academic integrity in the humanities. Int. J. Educ. Integr. 2024, 20, 18. [Google Scholar] [CrossRef]
  4. Benuyenah, V.; Dewnarain, S. Students’ Intention to Engage with ChatGPT and Artificial Intelligence in Higher Education Business Studies Programmes. Int. J. Distance Educ. Technol. 2024, 22, 1–21. [Google Scholar] [CrossRef]
  5. Luo, J.; Dawson, P. Exploring value judgements in grading: Will teachers mark down student work assisted by GenAI, and should they? Stud. High. Educ. 2025, 1–15. [Google Scholar] [CrossRef]
  6. Kickbusch, S.; Ashford-Rowe, K.; Kemp, A.; Boreland, J.; Huijser, H. Beyond detection: Redesigning authentic assessment in an AI-mediated world. Educ. Sci. 2025, 15, 1537. [Google Scholar] [CrossRef]
  7. Deep, P.; Edgington, W.D.; Ghosh, N.; Rahaman, M.S. Evaluating the effectiveness and ethical implications of AI detection tools in higher education. Information 2025, 16, 905. [Google Scholar] [CrossRef]
  8. Spirgi, L. The Role of AI Disclosure in Academic Grading: Lecturer Perceptions, Challenges, and Implications. In Artificial Intelligence in Education, AIED 2025; Cristea, A.I., Walker, E., Lu, Y., Santos, O.C., Isotani, S., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2025; Volume 15882. [Google Scholar] [CrossRef]
  9. Gonsalves, C. Addressing student non-compliance in AI use declarations: Implications for academic integrity and assessment in higher education. Assess. Eval. High. Educ. 2025, 50, 592–606. [Google Scholar] [CrossRef]
  10. Adnin, R.; Pandkar, A.; Yao, B.; Wang, D.; Das, M. Examining Student and Teacher Perspectives on Undisclosed Use of Generative AI in Academic Work. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25); Association for Computing Machinery: New York, NY, USA, 2025; pp. 1–17. [Google Scholar] [CrossRef]
  11. Dabis, A.; Csáki, C. AI and ethics: Investigating the first policy responses of higher education institutions to the challenge of generative AI. Humanit. Soc. Sci. Commun. 2024, 11, 1006. [Google Scholar] [CrossRef]
  12. Ogunleye, B.; Zakariyyah, K.I.; Ajao, O.; Olayinka, O.; Sharma, H. A Systematic Review of Generative AI for Teaching and Learning Practice. Educ. Sci. 2024, 14, 636. [Google Scholar] [CrossRef]
  13. Pérez-Jorge, D.; Olmos-Raya, E.; Alonso-Rodríguez, I.; Hernández-Dionis, P.; Pérez-Pérez, I. Harnessing AI for sustainable education: Pathways and implications. In Generative Artificial Intelligence in Education: Innovations, Challenges, and Future Prospects; Durak, G., Çankaya, S., Eds.; Springer Nature: Singapore, 2026; pp. 17–29. [Google Scholar] [CrossRef]
  14. Alshamy, A.S.A.; Al-Harthi, A.S.A.; Abdullah, S. Challenges of using generative AI tools in Omani higher education institutions: Perceptions of students and academics. In Proceedings of the 2025 International Conference on Smart Applications, Communications and Networking (SmartNets); IEEE: New York, NY, USA, 2025. [Google Scholar] [CrossRef]
  15. Hernández Aguirre, A. Microbe Detectives VS ChatGPT: Who Solves Better. In World Engineering Education Forum—Global Engineering Deans Council (WEEF-GEDC); IEEE: New York, NY, USA, 2024. [Google Scholar] [CrossRef]
  16. Overono, A.L.; Ditta, A.S. The use of AI disclosure statements in teaching: Developing skills for psychologists of the future. Teach. Psychol. 2025, 52, 273–278. [Google Scholar] [CrossRef]
  17. Cotelli Kureth, S.; Paliot, E.; Zink, S. Fostering transparency: A critical introduction of generative AI in students’ assignments. Lang. Lang. Learn. High. Educ. 2025, 15, 63–85. [Google Scholar] [CrossRef]
  18. García Ramos, J. Development and introduction of a document disclosing AI-use: Exploring self-reported student rationales for artificial intelligence use in coursework: A brief research report. Front. Educ. 2025, 10, 1654805. [Google Scholar] [CrossRef]
  19. Maguire, J.; English, R.; Cao, Q.; Seow, C.K. Themes in the declared use of generative artificial intelligence in assessment. In Proceedings of the 9th Conference on Computing Education Practice (CEP 2025); Association for Computing Machinery: New York, NY, USA, 2025; pp. 17–20. [Google Scholar] [CrossRef]
  20. Ahn, S.H.; Choi, M.-J. Research on self-assessment items for teaching writing ethics in the era of generative AI. Glob. Educ. Citiz. 2024, 10, 7–35. [Google Scholar] [CrossRef]
  21. Pérez-Jorge, D.; González-Afonso, M.C.; Santos-Álvarez, A.G.; Plasencia-Carballo, Z.; Perdomo-López, C.d.l.Á. The Impact of AI-Driven Application Programming Interfaces (APIs) on Educational Information Management. Information 2025, 16, 540. [Google Scholar] [CrossRef]
  22. Pérez-Jorge, D.; González-Herrera, A.I.; Olmos-Raya, E.; Martínez-Murciano, M.C. Nurturing creative learning through generative AI: A systematic review. In Generative Artificial Intelligence in Education: Innovations, Challenges, and Future Prospects; Durak, G., Çankaya, S., Eds.; Springer Nature: Singapore, 2026; pp. 371–396. [Google Scholar] [CrossRef]
  23. Pérez-Jorge, D.; González-Afonso, M.C. Transparency Mechanisms for Generative AI Use in Higher Education Assessment. PROSPERO 2026; CRD420261287226. Available online: https://www.crd.york.ac.uk/PROSPERO/view/CRD420261287226 (accessed on 16 January 2026).
  24. Arksey, H.; O’Malley, L. Scoping studies: Towards a methodological framework. Int. J. Soc. Res. Methodol. 2005, 8, 19–32. [Google Scholar] [CrossRef]
  25. Kirsanov, O.; Kushwah, L.; Selvaretnam, G. Beyond detection: How students use—And hide—AI in online assessments and what authentic tasks can do about it. J. Acad. Ethics 2026, 24, 14. [Google Scholar] [CrossRef]
Figure 1. Search flow diagram of selected studies.
Figure 2. Distribution of transparency evaluation approaches for AI in the analyzed studies.
Figure 3. Required levels of transparency in the analyzed studies.
Table 1. Search queries, results, and included studies by database.
Database | Search Query | Results | Included
Scopus | (“AI disclosure” OR “disclosure statement” OR “AI citation” OR “prompt log” OR “prompt logs”) AND (ChatGPT OR “generative AI”) AND “higher education” AND (assessment OR assignment OR coursework) | 34 | 5
Web of Science | (ChatGPT OR “generative AI”) AND “higher education” AND (assessment OR assignment OR coursework) AND (disclosure OR transparency OR attribution OR citation) | 38 | 5
ERIC | (ChatGPT OR “generative AI”) AND (“higher education”) AND (assessment OR assignment OR coursework) AND (disclosure OR transparency OR attribution OR citation) | 14 | 1
IEEE Xplore | (ChatGPT OR “generative AI”) AND “higher education” AND (assessment OR assignment OR coursework) AND (disclosure OR transparency OR attribution OR citation) | 6 | 0
Total | | 92 | 11
Table 2. Inclusion and exclusion criteria applied for document selection.
Inclusion criteria:
  • Situated within the context of higher education.
  • Published between 2022 and 2026.
  • Address assessable work (e.g., essays, reports, projects, programming tasks, or other forms of coursework).
  • Explicitly describe at least one transparency mechanism related to the use of generative AI in assessment, such as disclosure statements, AI citations or attributions, prompt logs, interaction appendices, or process documentation.
Exclusion criteria (studies focused exclusively on):
  • AI detection tools.
  • Uses of generative AI unrelated to assessment.
  • Non-university educational contexts (K–12).
Table 3. Descriptive overview of the eleven studies included.
Reference | Document Type | Country/Institution | Discipline/Context | Assessment Type | Main Transparency Mechanism | Specified Requirements | Compliance Evaluation | Reported Evidence | Risks/Safeguards
Gonsalves [9] | Empirical Study | UK (King’s Business School) | Business/Higher Education | Coursework | Mandatory Disclosure | Explicit AI use on the coversheet | Unverified self-disclosure | Low compliance; fear of sanctions; mixed acceptance | Normative ambiguity; concealment incentives
Overono & Ditta [16] | Case Study | USA | Psychology | Written Essays | Guided Disclosure with Attribution | AI disclosure with attribution | Light rubric/checklist | High acceptability; metacognitive improvement | Minimized privacy; pedagogical approach
Maguire et al. [19] | Empirical Study | Ireland/Australia | Computing Education | Varied Assessment | Narrative Disclosure | Thematic description of AI use | No explicit evaluation | Usage patterns; moderate workload | Low surveillance
Spirgi [8] | Empirical Study | Europe | Higher Education | Written Assessment | Evaluable Declaration | Explicit use/no use declaration | Implicit instructor judgment | Instructor concern; grading impact | Risk of evaluative bias
García Ramos [18] | Case Study | Spain | Higher Education | Written Work | Structured Declaration with Reflection | Declaration + ethical reflection | Unverified self-disclosure | High acceptance; student ethical reflection | Protected privacy; low burden
Adnin et al. [10] | Empirical Study | International | General Higher Education | Academic Work | No Formal Mechanism | Not specified; undeclared use | Reactive management | Frequent hidden use | Distrust; reactive surveillance
Cotelli Kureth et al. [17] | Action Research | Switzerland | Languages/L2 | Written Essay | Marked Outputs + Reflection | Explicit output ID + process reflection | Evaluable criteria + spot checks | Improved transparency; moderate workload | Data minimization; selective traceability
Dabis & Csáki [11] | Policy Analysis | Europe | Institutional | | Recommended Declaration | Institutional disclosure guidelines | Not evaluated | Uneven adoption | Governance; equity
Kirsanov et al. [25] | Empirical Study | International | Online Higher Education | Online Assessment | No Transparency Mechanism | | Implicit penalty/judgment | Strategic concealment | Risk of excessive surveillance
Luo & Dawson [5] | Empirical Study | Australia | General Higher Education | Essays | Implicit AI Use Reference | Indirect AI use mentioned | Instructor judgment | Grading and fairness impact | Bias; equity
Al-Hajaya [1] | Empirical Study | International | Accounting | Varied Assessment | Suggested Declaration | General policies and expectations | Implicit instructor judgment | Moderate resistance; normative ambiguity | Privacy; regulatory clarity
Table 4. Cross-sectional summary of results by analytical dimension.
Dimension | Synthetic Category | N | %
Evaluation approach | No explicit evaluation | 3 | 27.3
Evaluation approach | Unverified self-disclosure | 2 | 18.2
Evaluation approach | Instructor judgment/implicit penalty | 2 | 18.2
Evaluation approach | Light rubric/checklist | 1 | 9.1
Evaluation approach | Assessment criteria + spot verification | 1 | 9.1
Evaluation approach | Reactive management | 1 | 9.1
Evaluation approach | Absence of mechanisms (hidden use) | 1 | 9.1
Requirements specified | Not specified/variable | 5 | 45.5
Requirements specified | Minimal (use/no use) | 2 | 18.2
Requirements specified | Descriptive | 2 | 18.2
Requirements specified | With evidence | 1 | 9.1
Requirements specified | With reflection and verification | 1 | 9.1
Implementation pattern | Mandatory disclosure without verification | 2 | 18.2
Implementation pattern | Narrative disclosure without evaluative impact | 2 | 18.2
Implementation pattern | Disclosure + rubric/checklist | 1 | 9.1
Implementation pattern | Integrated process evidence | 1 | 9.1
Implementation pattern | Implementation gap (hidden use) | 2 | 18.2
Implementation pattern | Guideline/policy without operationalization | 1 | 9.1
Reported evidence | Explicit evidence of compliance | 3 | 27.3
Reported evidence | Perceptions of acceptance/resistance | 3 | 27.3
Reported evidence | Reported workload | 4 | 36.4
Reported evidence | No evidence reported | 7 | 63.6
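As a quick arithmetic check, the percentages in Table 4 are shares of the eleven included studies (e.g., 3/11 ≈ 27.3%), and the per-database counts in Table 1 sum to the reported totals (92 records retrieved, 11 included). A minimal sketch of this check follows; the variable names are ours and are used purely for illustration.

```python
# Verify that Table 4 percentages are shares of the 11 included studies
# and that Table 1 counts sum to the reported totals.

INCLUDED = 11

def share(n, total=INCLUDED):
    """Percentage rounded to one decimal, as reported in Table 4."""
    return round(100 * n / total, 1)

# A few rows from Table 4: (category, N, reported %).
rows = [
    ("No explicit evaluation", 3, 27.3),
    ("Unverified self-disclosure", 2, 18.2),
    ("Not specified/variable", 5, 45.5),
    ("No evidence reported", 7, 63.6),
]
for name, n, reported in rows:
    assert share(n) == reported, name

# Table 1: records retrieved and included per database.
retrieved = {"Scopus": 34, "Web of Science": 38, "ERIC": 14, "IEEE Xplore": 6}
included = {"Scopus": 5, "Web of Science": 5, "ERIC": 1, "IEEE Xplore": 0}
assert sum(retrieved.values()) == 92
assert sum(included.values()) == INCLUDED
```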
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
