Theoretical Foundations for Governing AI-Based Learning Outcome Assessment in High-Risk Educational Contexts
Abstract
1. Introduction
1.1. Rationale
1.2. Objectives
2. Background and Related Work
2.1. Regulatory Frameworks and the Governance of Learning Outcomes
2.2. ALTAI and Its Educational Reinterpretation
2.3. Explainability and Its Limits for Outcome Accountability
2.4. Governance Frameworks and Educational Accountability
2.5. Toward Outcome-Focused Self-Assessment
3. Theoretical Framework Development: XAI-ED Consequential Assessment Framework (XAI-ED CAF)
3.1. Pedagogical Foundations for Outcome-Focused Governance of AIB-LOA
3.2. ALTAI Dimensions Reinterpreted for Outcome-Focused Governance (RQ1)
ALTAI Dimension | Pedagogical Foundation Link | Assessment Focus |
---|---|---|
Human agency & oversight | Messick (consequential validity); Kirkpatrick (Levels 1–4) | Preserving student autonomy, educator authority, and meaningful human intervention. |
Technical robustness & safety | Messick (construct & predictive validity) | Ensuring alignment between algorithmic outputs and authentic learning outcomes. |
Privacy & data governance | Stufflebeam (CIPP—context, input, process, product) | Evaluating institutional data stewardship, consent, and compliance effectiveness. |
Transparency | Messick (consequential validity); Kirkpatrick (Level 2 learning) | Assessing interpretability, stakeholder understanding, and usefulness of explanations. |
Diversity, fairness & non-discrimination | Messick (consequential validity) | Monitoring equity of access and differential impacts across demographic groups. |
Societal & environmental well-being | Kirkpatrick (institutional results); Stufflebeam (product) | Measuring contribution to learning communities, institutional culture, and sustainability. |
Accountability | Stufflebeam (CIPP—governance processes) | Evaluating governance structures, policy responsiveness, and institutional learning capacity. |
3.2.1. Human Agency and Oversight: Preserving Educational Autonomy
3.2.2. Technical Robustness and Safety: Educational Construct Validity
3.2.3. Privacy and Data Governance: Educational Data Stewardship
3.2.4. Transparency: Educational Interpretability Impact Assessment
3.2.5. Diversity, Fairness, and Educational Equity: Opportunity Access Assessment
3.2.6. Environmental and Societal Well-Being: Educational Community Impact Assessment
3.2.7. Accountability: Educational Governance Effectiveness Assessment
3.3. Operational Indicators and Evidence Framework (RQ2)
4. Discussion
4.1. Theoretical Contributions to Educational AI Governance
4.2. Methodological Contributions to Assessment Framework Design
4.3. Implications for Educational Institutional Practice
4.4. Policy and Regulatory Compliance Implications
4.5. Limitations and Directions for Future Research
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Holmes, W.; Porayska-Pomsta, K.; Holstein, K.; Sutherland, E.; Baker, T.; Shum, S.B.; Santos, O.C.; Rodrigo, M.T.; Cukurova, M.; Bittencourt, I.I.; et al. Ethics of AI in education: Towards a community-wide framework. Int. J. Artif. Intell. Educ. 2022, 32, 504–526. [Google Scholar] [CrossRef]
- Zawacki-Richter, O.; Marín, V.I.; Bond, M.; Gouverneur, F. Systematic review of research on artificial intelligence applications in higher education–where are the educators? Int. J. Educ. Technol. High. Educ. 2019, 16, 39. [Google Scholar] [CrossRef]
- Luckin, R.; Holmes, W.; Griffiths, M.; Forcier, L.B. Intelligence Unleashed: An Argument for AI in Education; Pearson: London, UK, 2016; 18p. [Google Scholar]
- Holmes, W.; Bialik, M.; Fadel, C. Artificial Intelligence in Education: Promises and Implications for Teaching and Learning; Center for Curriculum Redesign: Boston, MA, USA, 2019; Available online: https://discovery.ucl.ac.uk/id/eprint/10139722 (accessed on 26 August 2025).
- Boccuzzi, G.; Nico, A.; Manganello, F. Delegated authority and algorithmic power: A rapid review of ethical issues in AI-based educational assessment. In Proceedings of the ACM 5th International Conference on Information Technology for Social Good (ACM GoodIT 2025), Antwerp, Belgium, 3–5 September 2025. [Google Scholar]
- European Parliament; Council of the European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act). Off. J. Eur. Union 2024, L 1689. Available online: http://data.europa.eu/eli/reg/2024/1689/oj (accessed on 26 August 2025).
- Edwards, L.; Veale, M. Slave to the algorithm? Why a ‘right to an explanation’ is probably not the remedy you are looking for. Duke Law Technol. Rev. 2017, 16, 18. [Google Scholar]
- Manganello, F.; Nico, A.; Boccuzzi, G. Mapping the research landscape of transparent AI in university assessment: A bibliometric investigation. In Proceedings of the 5th International Conference on AI Research (ICAIR), Genoa, Italy, 11–12 December 2025. [Google Scholar]
- Yan, Y.; Liu, H.; Chau, T. A systematic review of AI ethics in education: Challenges, policy gaps, and future directions. J. Glob. Inf. Manag. 2025, 33, 1–50. [Google Scholar] [CrossRef]
- European Commission; Directorate-General for Communications Networks, Content and Technology. The Assessment List for Trustworthy Artificial Intelligence (ALTAI) for Self-Assessment; Publications Office of the European Union: Luxembourg, 2020; Available online: https://digital-strategy.ec.europa.eu/en/library/assessment-list-trustworthy-artificial-intelligence-altai-self-assessment (accessed on 26 August 2025).
- Radclyffe, C.; Ribeiro, M.; Wortham, R. The assessment list for trustworthy artificial intelligence: A review and recommendations. Front. Artif. Intell. 2023, 6, 1020592. [Google Scholar] [CrossRef] [PubMed]
- Peterson, C.; Broersen, J. Understanding the limits of explainable ethical AI. Int. J. Artif. Intell. Tools 2024, 33, 2460001. [Google Scholar] [CrossRef]
- Reidenberg, J.R. Lex informatica: The formulation of information policy rules through technology. Tex. Law Rev. 1997, 76, 553. [Google Scholar]
- Fedele, A.; Punzi, C.; Tramacere, S. The ALTAI checklist as a tool to assess ethical and legal implications for a trustworthy AI development in education. Comput. Law Secur. Rev. 2024, 53, 105986. [Google Scholar] [CrossRef]
- Boccuzzi, G.; Nico, A.; Manganello, F. Harmonizing human and algorithmic assessment: Legal reflections on the right to explainability in education. In Proceedings of the 17th International Conference on Education and New Learning Technologies (EDULEARN25), Palma, Spain, 30 June–2 July 2025. [Google Scholar] [CrossRef]
- Boccuzzi, G.; Nico, A.; Manganello, F. Hybridizing human and AI judgment: Legal theories as a framework for educational assessment. In Proceedings of the 2nd Workshop on Law, Society and Artificial Intelligence (LSAI 2025), Held at HHAI 2025: The 4th International Conference on Hybrid Human-Artificial Intelligence, Pisa, Italy, 10 June 2025. [Google Scholar]
- UNESCO. Recommendation on the Ethics of Artificial Intelligence; Adopted on 23 November 2021; UNESCO: Paris, France, 2022; 43p; Available online: http://digitallibrary.un.org/record/4062376 (accessed on 26 August 2025).
- Holmes, W.; Miao, F. Guidance for Generative AI in Education and Research; UNESCO Publishing: Paris, France, 2023. [Google Scholar]
- Umoke, C.C.; Nwangbo, S.O.; Onwe, O.A. The governance of AI in education: Developing ethical policy frameworks for adaptive learning technologies. Int. J. Appl. Sci. Math. Theory 2025, 11, 71–88. [Google Scholar] [CrossRef]
- Messick, S. Standards of validity and the validity of standards in performance assessment. Educ. Meas. Issues Pract. 1995, 14, 5–8. [Google Scholar] [CrossRef]
- Kirkpatrick, D.; Kirkpatrick, J. Evaluating Training Programs: The Four Levels, 3rd ed.; Berrett-Koehler Publishers: San Francisco, CA, USA, 2006. [Google Scholar]
- Stufflebeam, D.L. The CIPP model for program evaluation. In Evaluation Models: Viewpoints on Educational and Human Services Evaluation; Madaus, G.F., Scriven, M., Stufflebeam, D.L., Eds.; Springer: Dordrecht, The Netherlands, 1983; pp. 117–141. [Google Scholar] [CrossRef]
- Boccuzzi, G.; Nico, A.; Manganello, F. Educational assessment in the age of AI: A narrative review on definitions and ethical-legal principles for trustworthy automated systems. In Proceedings of the 19th International Conference on e-Learning and Digital Learning (ELDL 2025) + STE 2025, Lisbon, Portugal, 23–25 July 2025. [Google Scholar]
- Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.Z. XAI—Explainable artificial intelligence. Sci. Robot. 2019, 4, eaay7120. [Google Scholar] [CrossRef] [PubMed]
- Khosravi, H.; Shum, S.B.; Chen, G.; Conati, C.; Tsai, Y.S.; Kay, J.; Gašević, D. Explainable artificial intelligence in education. Comput. Educ. Artif. Intell. 2022, 3, 100074. [Google Scholar] [CrossRef]
- Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016. [Google Scholar] [CrossRef]
- Albaladejo-González, M.; Ruipérez-Valiente, J.A.; Gómez Mármol, F. Artificial intelligence to support the training and assessment of professionals: A systematic literature review. ACM Comput. Surv. 2024, 57, 1–29. [Google Scholar] [CrossRef]
- Chan, C.K.Y.; Hu, W. A comprehensive AI policy education framework for university teaching and learning. Int. J. Educ. Technol. High. Educ. 2023, 20, 38. [Google Scholar] [CrossRef]
- Chen, L.; Chen, P.; Lin, Z. Artificial intelligence in education: A review. IEEE Access 2020, 8, 75264–75278. [Google Scholar] [CrossRef]
- Hooshyar, D.; Yang, Y. Problems with SHAP and LIME in interpretable AI for education: A comparative study of post-hoc explanations and neural-symbolic rule extraction. IEEE Access 2024, 12, 137472–137490. [Google Scholar] [CrossRef]
- Boncillo, J. AI in education: A systematic review of its applications, benefits, and ethical challenges. Int. J. Multidiscip. Educ. Res. Innov. 2025, 3, 436–447. [Google Scholar]
Source (A–Z) | Identified Gap | Contributed to the XAI-ED CAF | Relevance to RQs |
---|---|---|---|
Albaladejo-González et al. [27] | XAI techniques immature; explanations lack pedagogical and assessment relevance. | Reinforces critique of SHAP/LIME for failing to provide educationally meaningful explanations. | [RQ2] Highlights need for outcome-focused indicators beyond technical transparency. |
Fedele et al. [14] | ALTAI applied to educational AI but limited to vulnerabilities and compliance; no focus on outcome validity. | Demonstrates ALTAI’s relevance while exposing need for outcome-focused governance. | [RQ1] Motivates reinterpretation of ALTAI through evaluation theory. |
Holmes et al. [1] | Ethical and operational challenges of AI-based educational assessment remain under-theorized. | Justifies urgency of systematic governance approaches in AIB-LOA. | [RQ1] Frames the governance challenge requiring theoretical integration. |
Peterson & Broersen [12] | Autonomous ethical AI impossible; no single normative explanation. | Strengthens case for hybrid human–AI evaluation and pluralistic accountability. | [RQ1] Underlines role of evaluation theory in normative interpretation. |
Radclyffe et al. [11] | ALTAI remains self-assessment oriented; weak capacity for independent auditing. | Motivates translation of ALTAI into institutional self-assessment tailored to AIB-LOA. | [RQ1] Highlights regulatory limits and the need for educational reinterpretation. |
Ribeiro et al. [26] | Early XAI methods (LIME) created for technical interpretability, not pedagogical accountability. | Demonstrates need to adapt technical explainability to educational meaning-making. | [RQ2] Shows why indicators must link transparency to validity. |
Umoke et al. [19] | Fragmented governance; lack of standardized policies for educational AI ethics. | Validates urgency of sector-specific governance frameworks. | [RQ1] Confirms absence of frameworks tailored to Annex III, point 3b, EU AI Act. |
Yan et al. [9] | Fragmented AIED ethics literature; weak integration between principles and practice. | Confirms need for coherent frameworks linking ethics with systematic educational evaluation. | [RQ1 & RQ2] Highlights absence of empirical validation and outcome indicators. |
ALTAI Dimension | Illustrative Indicators (What to Consider) | Possible Evidence Types (How to Show) |
---|---|---|
Human agency & oversight | Presence of mechanisms for human override; stakeholder capacity to contest decisions; preservation of educator judgment and student autonomy | Records of appeals and overrides; surveys on student autonomy; educator feedback |
Technical robustness & safety | Validity and reliability of assessment outputs; mechanisms to detect and mitigate bias; resilience to failure | Validity studies; fairness audits; expert reviews of alignment with pedagogical goals |
Privacy & data governance | Compliance with data minimization; clarity of consent; effectiveness of data correction/deletion procedures | Institutional data governance policies; audit reports; stakeholder awareness surveys |
Transparency | Comprehensibility of explanations; traceability of outcomes; usefulness for decision-making | Comprehension assessments; explanation usage data; transparency policy documents |
Diversity, fairness & non-discrimination | Equity of access; absence of systematic disadvantage across demographic groups; bias mitigation effectiveness | Disaggregated outcome data; equity reports; fairness audit documentation |
Societal & environmental well-being | Contribution to institutional culture; effects on student–educator relationships; sustainability of infrastructure | Climate surveys; engagement reports; documentation of environmental impact |
Accountability | Clarity of roles and responsibilities; mechanisms for oversight and redress; evidence of institutional learning from feedback | Governance documents; records of complaints and resolutions; reports on policy revisions |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Manganello, F.; Nico, A.; Boccuzzi, G. Theoretical Foundations for Governing AI-Based Learning Outcome Assessment in High-Risk Educational Contexts. Information 2025, 16, 814. https://doi.org/10.3390/info16090814