An Adaptive Multi-Agent Architecture with Reinforcement Learning and Generative AI for Intelligent Tutoring Systems: A Moodle-Based Case Study
Abstract
1. Introduction
1.1. Digital Transformation, Smart Tutoring, and the Evolution of ITS
1.2. Adaptive Learning in ITS: Reinforcement Learning and Deep Learning
1.3. Multiagent Systems, LLM, and Agentic AI for Intelligent Tutoring
1.4. Research Gap and Contribution
2. Materials and Methods
2.1. System Design and Architecture
2.1.1. Interaction and Communication Layer—User Interface
2.1.2. Orchestration and Automation Layer
2.1.3. Intelligent Switching Router
2.1.4. Multi-Agent System
2.1.5. Integration of Reinforcement Learning (RL)
2.1.6. Data Management and Knowledge Base Layer
2.1.7. Considerations for the Design of the Architecture
2.2. MAS and Intelligent Control Mechanism
2.3. Meta-Agent and Reinforcement Learning
2.4. Programming and Functional Logic of Agents
2.4.1. AI Translator Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | User message | Input: user_message Output: normalized_message, language, confidence_level START read user_message // Detect language language = DetectLanguage(user_message) // Normalize to Spanish for internal processing IF language != "es" THEN normalized_message = Translate(user_message, target_language = "es") ELSE normalized_message = user_message ENDIF // Estimate confidence confidence_level = EstimateConfidence(normalized_message) // Return structured output return {normalized_message, language, confidence_level} END |
| Process | Text analysis to identify the language | |
| Output | JSON with detected language and confidence level; message translated into Spanish or English | |
| Agent | Translator Agent |
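
The translator pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the stopword-based `detect_language`, the `STOPWORDS` lists, and the injected `translate` callable are all hypothetical stand-ins for whatever language-identification and translation services the system actually uses. The sketch also assumes that the confidence level is the detection confidence, which is one reading of the Output row.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stopword lists; a real deployment would call a language-ID model.
STOPWORDS = {
    "es": {"el", "la", "que", "de", "es", "una", "un", "qué", "cómo", "los", "por"},
    "en": {"the", "is", "what", "of", "a", "an", "how", "to", "and", "in"},
}


@dataclass
class TranslatorOutput:
    normalized_message: str
    language: str
    confidence_level: float


def detect_language(text: str) -> Tuple[str, float]:
    """Pick the language with the most stopword hits; confidence is its share of hits."""
    tokens = [t.strip("¿?¡!.,") for t in text.lower().split()]
    hits = {lang: sum(t in words for t in tokens) for lang, words in STOPWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        # Assumption: with no evidence, fall back to the internal language.
        return "es", 0.0
    best = max(hits, key=hits.get)
    return best, hits[best] / total


def translator_agent(
    user_message: str, translate: Callable[[str, str], str]
) -> TranslatorOutput:
    """Detect the language and normalize the message to Spanish when needed."""
    language, confidence = detect_language(user_message)
    if language != "es":
        normalized = translate(user_message, "es")
    else:
        normalized = user_message
    return TranslatorOutput(normalized, language, confidence)
```

Passing the translator as a callable keeps the sketch independent of any particular translation service; a test double such as `lambda text, target: f"[{target}] {text}"` is enough to exercise the control flow.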
2.4.2. AI Prompt and Receiving Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | Message + conversation history + student data | START read current_message read conversation_history analyze context classify request_type determine urgency_level identify academic_topic estimate complexity_level return {type, urgency, topic, complexity} END |
| Process | Semantic and contextual classification | |
| Output | JSON with type, urgency, topic, and complexity | |
| Agent | Request Receiver/Classifier Agent |
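
A hedged sketch of the classification step is given below. It assumes the agent is prompt-driven: `llm` is a hypothetical callable that takes a prompt string and returns the model's raw text, and the prompt wording, the five-turn history window, and the `"unknown"` fallback value are illustrative choices that do not come from the source. Only the four output keys are taken from the table.

```python
import json
from typing import Callable, Dict, List

# Output keys named in the table above.
REQUIRED_KEYS = ("type", "urgency", "topic", "complexity")


def build_classifier_prompt(
    message: str, history: List[str], student_data: Dict[str, str]
) -> str:
    """Assemble a prompt from the current message, recent history, and student data."""
    recent = "\n".join(history[-5:])  # assumption: only the last five turns are sent
    return (
        "Classify the student's request. Reply with JSON only, using the keys "
        "type, urgency, topic, complexity.\n"
        f"Student data: {json.dumps(student_data)}\n"
        f"Conversation history:\n{recent}\n"
        f"Current message: {message}"
    )


def classify_request(
    message: str,
    history: List[str],
    student_data: Dict[str, str],
    llm: Callable[[str], str],
) -> Dict[str, str]:
    """Return the four classification fields, defaulting to 'unknown' on bad output."""
    raw = llm(build_classifier_prompt(message, history, student_data))
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {key: str(data.get(key, "unknown")) for key in REQUIRED_KEYS}
```

Validating the model output before it reaches the router means a malformed reply degrades to an explicit `"unknown"` classification instead of raising inside the orchestration flow.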
2.4.3. AI Prompt and Receiving Agent with Moodle
| Element | Description | Pseudocode |
|---|---|---|
| Input | Academic data in JSON format | START read moodle_data interpret academic_information IF tasks_exist THEN list tasks clearly ELSE indicate no information found ENDIF return friendly_message END |
| Process | Conversion to natural language | |
| Output | Friendly and clear response | |
| Agent | Moodle Assistant Agent |
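
The conversion from academic JSON to a friendly message can be sketched as follows. The field names `tasks`, `name`, and `due` are hypothetical; the source does not specify the schema returned by the Moodle integration, and the message wording is illustrative only.

```python
import json


def moodle_assistant(moodle_json: str) -> str:
    """Turn a JSON payload of academic data into a short natural-language reply."""
    data = json.loads(moodle_json)
    tasks = data.get("tasks", [])  # hypothetical field name
    if not tasks:
        # Mirrors the ELSE branch: state clearly that nothing was found.
        return "I could not find any pending tasks in your courses."
    lines = [
        f"- {task.get('name', 'Untitled task')} (due {task.get('due', 'no date')})"
        for task in tasks
    ]
    return f"You have {len(tasks)} pending task(s):\n" + "\n".join(lines)
```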
2.4.4. Pedagogical Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | Classified request data | START read classified_data set response_language generate theoretical_explanation return response END |
| Process | Academic explanation generation | |
| Output | Theoretical response | |
| Agent | Pedagogical Agent |
2.4.5. Technical Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | Topic and user message | START read topic read user_message generate practical_solution include examples or code if required return response END |
| Process | Practical knowledge application | |
| Output | Technical response | |
| Agent | Technical Agent |
2.4.6. Prepare Prompt—Analysis Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | Student response | START read student_response evaluate clarity and correctness identify weaknesses suggest improvements return feedback END |
| Process | Qualitative evaluation | |
| Output | Feedback and improvement suggestions | |
| Agent | Performance Analysis Agent |
2.4.7. Adaptive Empathic Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | User level and emotional context | START read user_level read requested_tone adapt response_style generate empathetic_message return response END |
| Process | Communication adaptation | |
| Output | Adaptive and empathetic response | |
| Agent | Adaptive Communication Agent |
2.4.8. Ethical Agent
| Element | Description | Pseudocode |
|---|---|---|
| Input | Response generated by the agents and user context | START read generated_response read user_context read interaction_context evaluate bias_risk evaluate fairness_compliance evaluate pedagogical_alignment evaluate trust_and_reliability IF bias_risk == true OR fairness_compliance == false THEN refine generated_response ENDIF IF pedagogical_alignment == false THEN adjust educational_content ENDIF approve validated_response return validated_response END |
| Process | Ethical and pedagogical verification of content | |
| Output | Ethically and pedagogically validated answer | |
| Agent | Ethical Agent |
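
The control flow of the ethical verification can be sketched as below. The three checks and the two rewriting steps are injected as callables because the source does not say how they are computed; `toy_evaluate` is a deliberately naive, hypothetical stand-in. As in the pseudocode, the checks run once on the generated response, and the trust-and-reliability evaluation is omitted because no branch consumes its result.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class EthicsReport:
    bias_risk: bool
    fairness_compliance: bool
    pedagogical_alignment: bool


def ethical_agent(
    generated_response: str,
    user_context: Dict[str, str],
    evaluate: Callable[[str, Dict[str, str]], EthicsReport],
    refine: Callable[[str], str],
    adjust: Callable[[str], str],
) -> str:
    """Apply the two conditional corrections, then return the validated response."""
    report = evaluate(generated_response, user_context)
    validated = generated_response
    if report.bias_risk or not report.fairness_compliance:
        validated = refine(validated)
    if not report.pedagogical_alignment:
        validated = adjust(validated)
    return validated


def toy_evaluate(response: str, context: Dict[str, str]) -> EthicsReport:
    """Hypothetical checks: flag a condescending word and a missing topic mention."""
    text = response.lower()
    return EthicsReport(
        bias_risk="obviously" in text,
        fairness_compliance=True,
        pedagogical_alignment=context.get("topic", "").lower() in text,
    )
```

In a deployment, `evaluate`, `refine`, and `adjust` would presumably wrap model calls or rule sets; keeping them outside the agent makes the decision logic auditable on its own.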
2.5. User Interface Design and Scalability
3. Study Cases
3.1. User-Centered System Evaluation
3.2. Simulation of Adaptive Decision-Making Using Reinforcement Learning in ELA Tutor
| Reinforcement Type | User Input | Assigned Value (R) | Impact on the Agent |
|---|---|---|---|
| Positive (Reward) | Keywords: “Thank you”, “Excellent”, “It works”, “Good job” | +1.0 | Success validation: The selected strategy successfully addressed the user’s need. |
| Negative (Penalty) | Keywords: “I don’t understand”, “Bad”, “Error”, “Confusing” | −1.0 | Error correction: The strategy was ineffective or the selected agent was not appropriate. |
| Neutral | Absence of explicit feedback or simple phatic interactions | 0.0 | State maintenance: There is insufficient evidence to modify the system’s behavior. |
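
The reward assignment in the table above can be sketched as a keyword matcher. The keywords and reward values come from the table; case-insensitive substring matching and giving negative keywords precedence when both kinds appear are assumptions, since the source does not say how conflicts are resolved.

```python
# Keywords and values taken from the reinforcement table.
POSITIVE_KEYWORDS = ("thank you", "excellent", "it works", "good job")
NEGATIVE_KEYWORDS = ("i don't understand", "bad", "error", "confusing")


def reward(user_input: str) -> float:
    """Map explicit user feedback to +1.0, -1.0, or 0.0."""
    # Normalize case and typographic apostrophes before matching.
    text = user_input.lower().replace("\u2019", "'")
    # Assumption: a negative keyword outranks a positive one in the same message.
    if any(keyword in text for keyword in NEGATIVE_KEYWORDS):
        return -1.0
    if any(keyword in text for keyword in POSITIVE_KEYWORDS):
        return 1.0
    return 0.0
```

Plain substring matching is crude: "bad" would also match "badge", and "error" would match a neutral question about an error message. A production matcher would need word boundaries or an intent classifier.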
4. Results
4.1. Students’ Perceptions of ELA Tutor
| Knowledge of AI | n | Usability (M ± SD) | Satisfaction & Usefulness (M ± SD) | Accessibility & Interaction (M ± SD) |
|---|---|---|---|---|
| None | 2 | 2.75 ± 1.06 | 2.00 ± 1.41 | 2.00 ± 1.41 |
| Low | 12 | 3.69 ± 0.83 | 3.83 ± 0.78 | 4.21 ± 0.81 |
| Medium | 100 | 3.75 ± 0.82 | 3.85 ± 0.93 | 3.89 ± 1.06 |
| High | 33 | 3.95 ± 0.90 | 3.84 ± 0.93 | 3.97 ± 1.07 |
4.2. Adaptive Behavior Analysis of the Reinforcement Learning Mechanism
| Sn | Interaction Type | User Input | Detected Intent | Selected Agent | Decision Basis | R | Q-Score After Update | Observed Behavior |
|---|---|---|---|---|---|---|---|---|
| 1 | Social/Feedback | Thank you, it worked perfectly | Social feedback | AdaptiveAgent | Rule-based (Social Protocol) | +1.0 | 1.00 | Positive reinforcement of adaptive communication. |
| 1 | Administrative | Query about enrolled courses | Administrative | Moodle Queries | Rule-based (Moodle API) | +1.0 | 1.00 | Correct routing to LMS integration. |
| 1 | Conceptual | What is cost accounting? | Conceptual | Pedagogical Agent | Heuristic-based | −1.0 | −1.00 | Theoretical response penalized in practical context. |
| 1 | Procedural | How to perform a cash count step by step | Procedural | TechnicalContent | Heuristic-based | +1.0 | 1.00 | Correct activation of technical guidance. |
| 2 | Technical | Code example in Python | Pure technical | TechnicalContent | RL policy preferred | +1.0 | 1.00 | Policy reused successfully in similar state. |
| 2 | Administrative | Request for grades | Administrative | Moodle Queries | RL policy preferred | +1.0 | 1.00 | Stable convergence in administrative routing. |
| 3 | Mixed/Ambiguous | I don’t understand this part | Negative feedback | AdaptiveAgent | Fallback + RL update | −1.0 | 0.33 | Policy adjusted after negative signal. |

| Interaction Type | Iterations | Avg Reward | Final Q-Score | Behavior |
|---|---|---|---|---|
| Procedural queries | 20 | +0.95 | 0.95 | Stable convergence to technical agent |
| Administrative queries | 15 | +1.00 | 1.00 | Consistent routing to Moodle agent |
| Conceptual-only queries | 10 | −0.40 | −0.40 | Progressive avoidance of pedagogical agent |
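
The source does not state the Q-score update rule. In the aggregate table, however, the final Q-score equals the average reward for every interaction type, which is what a sample-average update produces. The sketch below implements that rule as an assumption; the class name, the `(state, agent)` keying, and the greedy `best_agent` selection are illustrative and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


class SampleAverageQ:
    """Q-score kept as the running mean of rewards per (state, agent) pair."""

    def __init__(self) -> None:
        self.q: Dict[Tuple[Hashable, str], float] = defaultdict(float)
        self.n: Dict[Tuple[Hashable, str], int] = defaultdict(int)

    def update(self, state: Hashable, agent: str, r: float) -> float:
        """Incremental mean: Q <- Q + (R - Q) / N."""
        key = (state, agent)
        self.n[key] += 1
        self.q[key] += (r - self.q[key]) / self.n[key]
        return self.q[key]

    def best_agent(self, state: Hashable, agents: Iterable[str]) -> str:
        """Greedy choice: the agent with the highest Q-score in this state."""
        return max(agents, key=lambda agent: self.q[(state, agent)])
```

Under this rule a first reward of +1.0 or −1.0 yields a Q-score of 1.00 or −1.00, and the reward sequence +1, +1, −1 yields 0.33. Both values are numerically consistent with the per-interaction table, although the paper may use a different rule, such as a fixed learning rate.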
5. Discussion and Future Works
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| IAGen | Generative AI |
| RL | Reinforcement Learning |
| ITS | Intelligent Tutoring Systems |
| MAS | Multi-Agent System |

| IEEE EAD Principle | Criterion Applied | Implementation in ELA Tutor | Architectural Component |
|---|---|---|---|
| Human Well-being | Student-centered tutoring | Academic support and emotional accompaniment are prioritized. | MAS |
| | Indirect teacher supervision | Pedagogical decisions are based on academic data and can be supervised by instructors. | Moodle, Django API |
| Bias and Fairness | Equity in access | Enrolled students have access to the same functionalities. | Moodle, Web UI |
| | Performance-based adaptation | Personalization is grounded in academic and interaction indicators. | Intelligent Switching Router, RL Meta-Agent |
| Transparency | Decision traceability | Every decision is logged and traceable. | PostgreSQL, RL Policy Store |
| | Separation between decision-making and generation | The LLM generates language but does not decide pedagogical strategies, reducing algorithmic opacity. | n8n Orchestrator, Router |
| Accountability | Decision flow control | Tutoring logic is implemented through explicit flows and auditable rules. | n8n Orchestrator |
| Trust and Reliability | Consistency in tutoring | Students with similar academic backgrounds receive coherent strategies. | MAS |
| | Experience-validated learning | The system adjusts its behavior only when explicit evidence of positive or negative feedback exists. | RL Meta-Agent, Reward Calculator |
| Privacy and Data Governance | Data minimization | The system uses only the academic data necessary for tutoring. | Django API, PostgreSQL |
| | Isolation of sensitive information | Data are stored in separate layers with controlled access. | Docker, Data Layer |
| Awareness of Misuse | Ethical content filtering | Generated responses are validated. | Ethical Agent |
| | Data minimization | The system uses only the academic data necessary for tutoring. | Django API, PostgreSQL |
| Societal and Cultural Awareness | Educational contextualization | The answers align with the institutional curriculum. | Pedagogical Agent, Moodle |
| Robustness and Security | Architectural resilience | The use of microservices and containers allows faults to be isolated and system operation to be maintained. | Docker, n8n Orchestrator |

| Code | Dimension | Item |
|---|---|---|
| Q0 | Demographic profile | Sex, age, degree program, prior use of AI |
| Q1 | Usability | The intelligent tutoring system was easy to use. |
| Q2 | | I did not need much help to learn how to use the system. |
| Q3 | | The system’s functions are well integrated and consistent with each other. |
| Q4 | | At times, the system became confusing or difficult to understand. |
| Q5 | Satisfaction & usefulness | I am satisfied with the responses provided by the intelligent tutor. |
| Q6 | | The intelligent tutor helped me better understand the course content. |
| Q7 | | The tutor’s responses were clear and useful. |
| Q8 | | I would like to continue using this intelligent tutor in other courses. |
| Q9 | Accessibility & interaction | I was able to access the system without major technical difficulties. |
| Q10 | | The response time of the intelligent tutor was adequate. |

| Dimension | M | SD |
|---|---|---|
| Usability | 3.77 | 1.21 |
| Satisfaction & usefulness | 3.82 | 1.02 |
| Accessibility & interaction | 3.90 | 1.15 |

| Dimension | Question | M | SD |
|---|---|---|---|
| Usability | Q1 | 4.03 | 1.09 |
| | Q2 | 3.90 | 1.22 |
| | Q3 | 3.84 | 1.06 |
| | Q4 * | 3.30 | 1.33 |
| Satisfaction & usefulness | Q5 | 3.76 | 1.02 |
| | Q6 | 3.85 | 0.97 |
| | Q7 | 3.81 | 1.03 |
| | Q8 | 3.85 | 1.06 |
| Accessibility & interaction | Q9 | 3.89 | 1.25 |
| | Q10 | 3.91 | 1.06 |

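
As a consistency check, the dimension-level means reported in the summary table (3.77, 3.82, 3.90) can be compared with the unweighted mean of the item means in the per-question table. The short sketch below does this arithmetic. It assumes Q4 enters with its tabulated value of 3.30; the meaning of the asterisk on Q4 is not given in this text. The dimension-level SDs cannot be recovered this way because they depend on the raw responses.

```python
from statistics import mean

# Item means copied from the per-question table.
ITEM_MEANS = {
    "Usability": [4.03, 3.90, 3.84, 3.30],
    "Satisfaction & usefulness": [3.76, 3.85, 3.81, 3.85],
    "Accessibility & interaction": [3.89, 3.91],
}

# Dimension means copied from the summary table.
REPORTED = {
    "Usability": 3.77,
    "Satisfaction & usefulness": 3.82,
    "Accessibility & interaction": 3.90,
}

dimension_means = {dim: mean(values) for dim, values in ITEM_MEANS.items()}

for dim, value in dimension_means.items():
    gap = abs(value - REPORTED[dim])
    print(f"{dim}: computed {value:.4f}, reported {REPORTED[dim]:.2f}, gap {gap:.4f}")
```

All three computed values fall within rounding distance of the reported means, which suggests that each dimension score is the simple average of its item means.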
| Dimensions | Male (n = 90) M ± SD | Female (n = 60) M ± SD | Δ M (F − M) |
|---|---|---|---|
| Usability | 3.74 ± 1.18 | 3.78 ± 1.16 | +0.04 |
| Satisfaction & usefulness | 3.80 ± 1.05 | 3.84 ± 1.02 | +0.04 |
| Accessibility & interaction | 3.87 ± 1.22 | 3.92 ± 1.18 | +0.05 |

| Dimensions | ≤20 Years M ± SD | 21–25 Years M ± SD | ≥26 Years M ± SD |
|---|---|---|---|
| Usability | 3.72 ± 1.17 | 3.77 ± 1.15 | 3.75 ± 1.18 |
| Satisfaction & usefulness | 3.79 ± 1.04 | 3.83 ± 1.02 | 3.81 ± 1.06 |
| Accessibility & interaction | 3.85 ± 1.21 | 3.91 ± 1.18 | 3.88 ± 1.20 |

| Question & Comment Type | Qualitative Category | Main Associated Dimension | Approximate Frequency of Mentions |
|---|---|---|---|
| Q15 (Positive) | Response speed (“quick answers”, “immediate”, “adequate response time”) | Usability/Accessibility | High |
| Q15 (Positive) | Clarity and usefulness of answers (“clear”, “coherent”, “helps me understand topics better”) | Satisfaction/Usefulness | High |
| Q15 (Positive) | Academic support and task management (“helps with assignments”, “shows pending tasks and virtual classroom status”) | Satisfaction/Accessibility | Medium |
| Q15 (Positive) | Ease of use and simple interface (“easy to use”, “simple handling”, “simple/minimalist interface”) | Usability | Medium |
| Q15 (Positive) | Integration with degree/program or student context (“knows my major”, “linked to the university”) | Satisfaction | Low–medium |
| Q16 (Improvement) | Coherence and variety of answers (“more precise”, “does not always get it right”, “avoid repeating the same answer”) | Satisfaction/Usefulness | High |
| Q16 (Improvement) | Depth and structure of explanations (“more detailed”, “step by step”, “not all in one paragraph”) | Satisfaction | Medium |
| Q16 (Improvement) | Interface and visual design (“improve design”, “more attractive/dynamic”, “change colors or logo”) | Usability | High |
| Q16 (Improvement) | Handling of files, images and mobile app (“upload PDF/Word/JPG”, “send images”, “mobile app”) | Accessibility | Medium |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
López-Goyez, J.P.; González-Briones, A.; Demazeau, Y. An Adaptive Multi-Agent Architecture with Reinforcement Learning and Generative AI for Intelligent Tutoring Systems: A Moodle-Based Case Study. Appl. Sci. 2026, 16, 1323. https://doi.org/10.3390/app16031323

