Dialogical Learning Support in RAG-Based E-Learning

Toskova, Asya; Georgiev, Kosta; Glushkova, Todorka

doi:10.3390/info17050418

Open AccessArticle

Dialogical Learning Support in RAG-Based E-Learning

by

Asya Toskova

^*

,

Kosta Georgiev

and

Todorka Glushkova

Faculty of Mathematics and Informatics, Plovdiv University “Paisii Hilendarski”, 236 Bulgaria Blvd., 4000 Plovdiv, Bulgaria

^*

Author to whom correspondence should be addressed.

Information 2026, 17(5), 418; https://doi.org/10.3390/info17050418

Submission received: 22 March 2026 / Revised: 18 April 2026 / Accepted: 22 April 2026 / Published: 27 April 2026

(This article belongs to the Special Issue Trends in Artificial Intelligence-Supported E-Learning)

Download

Browse Figures

Versions Notes

Abstract

This paper presents a web-based platform designed to support dialogical learning through a Retrieval-Augmented Generation (RAG) architecture. The system integrates retrieval grounding, context-aware dialogue management, and a modular, model-agnostic design to enable controlled and pedagogically aligned learning supported by Artificial Intelligence (AI) and based on instructor-verified educational materials. The proposed approach supports multilingual interaction, including operation in lower-resource languages such as Bulgarian, and models learning as a continuous dialogue rather than a sequence of isolated queries. To ensure reliable knowledge access, the system employs a hybrid retrieval strategy combining semantic embeddings with lexical matching within a two-stage indexing and retrieval framework. The approach is supported by an empirical evaluation based on a manually constructed question set with human-validated relevance assessment. The results demonstrate that the selected configuration achieves 90% retrieval accuracy at TOP-5 and up to 91.4% at TOP-6, providing a reliable contextual basis for response generation. A complementary manual evaluation of generated responses further indicated strong practical usefulness and generally grounded answer quality. The platform is further designed in alignment with European regulatory principles, emphasizing transparency, traceability, and controlled use of AI in educational environments. Overall, the study demonstrates that integrating retrieval precision with pedagogical structure enables the development of AI systems that support structured and contextually grounded learning processes.

Keywords:

Retrieval-Augmented Generation (RAG); dialogical learning; e-learning; hybrid retrieval; AI in education; LLM; EU AI Act; semantic search; CCA modeling

1. Introduction

In recent years, artificial intelligence (AI) has become an increasingly influential factor in the development of e-learning, with applications ranging from intelligent tutoring systems and adaptive learning platforms to conversational assistants [1,2]. The emergence of large language models (LLMs) has significantly expanded the possibilities for natural language interaction, enabling learners to ask questions freely, receive explanations, and engage in self-directed learning processes [3,4,5]. Within this context, dialogical learning has gained importance as a pedagogical approach in which knowledge is constructed through sustained interaction [6]. Rather than relying on isolated question–answer exchanges, dialogical learning emphasizes clarification, follow-up questioning, and incremental conceptual understanding. Learners are viewed as active participants who construct meaning through interaction rather than passive recipients of information.

However, the use of generative language models in educational settings raises important challenges related to reliability and epistemic validity [7,8]. One of the most widely discussed issues is the generation of so-called “hallucinations”—responses that appear plausible but are factually incorrect or insufficiently grounded [9,10]. In educational contexts, such behavior may lead to the formation of misconceptions and hinder the development of correct conceptual understanding.

Beyond hallucinations, a deeper limitation arises from the nature of the data on which large language models are trained. These models rely on heterogeneous corpora that include sources of varying credibility, making it difficult to distinguish between validated knowledge and informal or outdated information [3,4]. As a result, generated responses may lack epistemic grounding, even when they are linguistically coherent. This is particularly problematic in education, where learners often perceive system responses as authoritative [11].

Retrieval-Augmented Generation (RAG) has emerged as a promising approach to addressing these challenges by combining generative models with retrieval mechanisms over predefined document collections [9,12]. By grounding responses in curated educational materials, RAG enables more controlled and transparent linkage between generated content and instructional sources. The goal of such systems is not to establish absolute truth, but to ensure traceability and alignment with selected learning resources.

Despite the growing use of RAG architectures, their application in educational contexts has largely focused on improving answer accuracy, with limited attention to their pedagogical orchestration within dialogical learning processes [13,14]. In particular, the integration of retrieval grounding with sustained, context-aware learning dialogue remains insufficiently formalized.

The contribution of this study lies in the integration of retrieval grounding, dialogical interaction, and pedagogical alignment within a unified architectural framework. The proposed approach emphasizes controlled knowledge access and context-aware dialogue, supported by an empirical evaluation demonstrating high retrieval accuracy in an educational setting. The contributions of this study can be summarized as follows:

A dialogically oriented learning support architecture that models interaction as a continuous learning process rather than as isolated question–answer exchanges.
A bounded context-aware dialogue mechanism that preserves recent interaction history and supports incremental knowledge construction across multiple turns.
A retrieval-constrained generation approach grounded in curated and verified educational materials.
A modular and model-agnostic system design allowing flexible integration of different large language models.
Demonstration of effective retrieval-grounded operation in a lower-resource language setting (Bulgarian).
A hybrid retrieval strategy combining semantic similarity and lexical matching within a two-stage retrieval pipeline.
An empirical evaluation framework including retrieval accuracy, generative response quality, groundedness, hallucination analysis, and usefulness assessment.
A system design aligned with Compliance-by-Design principles in relation to the EU AI Act, supporting transparency and controlled educational use of AI.

The remainder of the paper is structured as follows: Section 2 reviews related work on RAG, educational chatbots, and AI-supported learning systems; Section 3 presents the proposed architecture; Section 4 describes the implementation and evaluation; Section 5 discusses the results and outlines future research directions; and Section 6 concludes the paper.

2. Related Work

Research in artificial intelligence and e-learning has increasingly focused on systems that support learning through natural language interaction. These systems span a wide range of approaches, including intelligent tutoring systems and educational chatbots, as well as more recent architectures based on large language models and hybrid retrieval–generation mechanisms [1,2].

Recent developments have explored the integration of large language models into educational environments, enabling conversational interaction and flexible access to explanations. Systematic reviews indicate that such systems can support self-directed learning, but also highlight challenges related to reliability, lack of pedagogical structure, and limited control over generated content [2,3]. LLM-based systems support personalized learning, automated feedback generation, and interactive knowledge exploration [4,5,15].

As a means of improving factual grounding, architectures that combine retrieval and generation mechanisms have gained prominence. These Retrieval-Augmented Generation systems extend language models with the ability to access external document collections, enabling responses to be linked to specific sources [9,12]. Recent research emphasizes that purely semantic retrieval may be insufficient in technical domains, where precise terminology plays a critical role, leading to the adoption of hybrid strategies that combine semantic similarity with keyword-based ranking [9,10]. At the same time, these systems may produce responses that are not consistently aligned with instructional objectives and may require additional pedagogical structuring [13]. To address these limitations, some authors highlight the importance of integrating LLMs with structured knowledge sources and external grounding mechanisms, improving both reliability and educational validity [14,16].

Within educational contexts, emerging approaches have begun to explore the use of RAG-based assistants to support question answering over course materials. Some systems incorporate elements of guided interaction, such as Socratic questioning [17] or scaffolded explanations [18]. Other approaches extend this idea by linking natural language queries to structured knowledge systems [19]. Recent research on RAG systems further extends these approaches by introducing advanced retrieval–generation workflows and multi-stage retrieval pipelines that improve grounding and response accuracy. In particular, the A-RAG (Agentic RAG) framework proposes a hierarchical retrieval interface that enables adaptive retrieval across multiple levels of granularity, allowing the model to participate more actively in the retrieval process [20]. In addition, optimized retrieval and ranking strategies have been proposed to enhance the precision of retrieved context, particularly in knowledge-intensive tasks. The WildGraphBench benchmark highlights the importance of realistic evaluation settings based on large-scale and heterogeneous corpora, revealing limitations of current GraphRAG approaches in handling complex, multi-source information [21]. Other studies emphasize the importance of systematic evaluation of RAG systems, including metrics related to faithfulness, relevance, and hallucination control [22]. Domain-oriented RAG applications further demonstrate the effectiveness of such systems for question answering over structured and semi-structured document collections [23]. These approaches highlight the potential of combining retrieval grounding with pedagogical strategies, but they are often limited to specific interaction patterns or lack integration with broader dialogical learning processes.

Another important research direction concerns the role of dialogue in learning. Studies in conversational AI and educational interaction emphasize the importance of maintaining context across multiple dialogue turns, enabling clarification, elaboration, and progressive refinement of understanding [24,25]. Recent work further highlights the pedagogical value of dialogical interaction as a mechanism for supporting deeper cognitive engagement and knowledge construction [6]. In parallel, research on conversational AI systems in education demonstrates that while interactive systems can enhance learner engagement, they often struggle to maintain coherent multi-turn context and instructional alignment [11,26].

Recent work also highlights the epistemic risks associated with generative models in education, including the potential for misinformation and over-reliance on system-generated responses [7,8]. The studies further emphasize these risks, pointing to reduced critical engagement and increased dependence on AI-generated explanations in learning contexts [27]. These concerns have led to increasing interest in approaches that constrain model outputs through curated knowledge sources and transparent grounding mechanisms.

In parallel, advances in multimodal learning systems have demonstrated the potential of integrating textual and visual information within unified representations. Foundational models such as CLIP (Contrastive Language-Image Pre-training) enable joint encoding of text and images [28], while subsequent approaches, including BLIP (Bootstrapped Language-Image Pretraining) and BLIP-2 (Bootstrapped Language-Image Pretraining-2), extend these capabilities toward more advanced vision–language understanding and generation [29,30]. More recent work on multimodal foundation models further generalizes these architectures across diverse tasks and domains [31]. These developments support the emergence of multimodal retrieval-augmented systems, which can enhance understanding in visually intensive domains [32].

Finally, regulatory and ethical considerations have become increasingly important with the introduction of frameworks such as the European Union AI Act, which emphasizes transparency, accountability, and human oversight in educational applications of AI [33,34].

Despite these advances, existing approaches typically focus on either improving retrieval accuracy or enhancing conversational capabilities, without fully integrating retrieval grounding, dialogical context management, and pedagogical alignment within a unified architecture.

3. Architecture and Design

In response to the limitations of existing RAG solutions and educational chatbots, this section presents a platform designed to support dialogical learning through controlled interaction grounded in verified educational sources. The platform combines a Retrieval-Augmented Generation architecture with a dialogically oriented design, modeling interaction as a context-aware sequential process rather than a series of independent queries.

3.1. Overall Architecture and Design Principles

The proposed platform is a web-based system with a modular architecture that enables flexible extension and adaptation to different educational contexts, including the replacement of embedding models and language models without modifying the underlying database logic. A key design principle is the separation between educational content management and dialogical interaction processes.

The system architecture is conceptually organized into three main layers (Figure 1):

knowledge management layer
response management layer
dialogical interaction layer

The knowledge management layer is responsible for storing, processing, and indexing educational materials organized into thematic collections. Content undergoes preprocessing, including segmentation and vector representation, enabling efficient retrieval.

The response management layer implements the RAG mechanism by retrieving relevant fragments from the knowledge base and using them as context for response generation. This ensures that generated responses are grounded in instructional materials.

The dialogical interaction layer maintains context across dialogue turns and supports the user interface. Interaction is modeled as a continuous process, where previous exchanges influence the interpretation of subsequent queries, enabling clarification and incremental knowledge construction.

The platform supports multiple profiles through authentication and session management, allowing interaction histories to be maintained per course. This design enables future extensions such as personalized learning services without modifying the core retrieval-generation process.

The response generation component is model-agnostic, allowing selection among different language models. This enables adaptation to institutional constraints, resource availability, or specific educational requirements.

The web-based architecture facilitates deployment in real educational settings and allows integration with external e-learning systems and future extensions such as learning analytics and recommendation services.

3.2. Knowledge Base Construction and Content Management

The knowledge base is a central component of the platform, supporting dialogical interaction through curated, instructor-validated educational materials. This approach enables structured control over content and alignment with specific learning objectives.

Learning materials are organized into thematic collections corresponding to courses or modules. Each document undergoes preprocessing, including segmentation into semantically coherent fragments, which supports precise retrieval and coherent response generation.

Fragments are transformed into vector representations, enabling hybrid retrieval that combines semantic similarity and lexical matching. This allows the system to capture both conceptual meaning and domain-specific terminology.

To balance retrieval precision and contextual coherence, the system employs overlapping text fragments of varying lengths for indexing and retrieval. These representations serve as the basis for selecting relevant context during response generation.

Content management is implemented as a separate functional module, enabling independent updates, reprocessing, and reindexing without disrupting system operation. This supports dynamic knowledge base maintenance and long-term adaptability.

The architecture also accounts for potential inconsistencies or outdated content. Since materials are instructor-provided, responsibility for content validity remains external to the model. The system preserves transparency by grounding responses in identifiable source fragments rather than resolving conceptual conflicts autonomously.

The modular design allows future extension toward multimodal knowledge bases, including integration of visual educational resources.

3.3. Retrieval and Answer Generation Workflow

The retrieval and generation workflow establishes a structured connection between user queries, educational sources, and generated responses.

When a learner submits a query, it is interpreted within the context of the ongoing dialogue, taking into account prior interaction. This supports continuity and prevents treating queries as isolated inputs.

Relevant fragments are then retrieved using a hybrid strategy that combines vector-based similarity with lexical ranking. This approach balances semantic relevance and exact term matching.

The selected fragments, together with the user query and dialogue history, are incorporated into a structured prompt. The language model is instructed to generate responses grounded in the retrieved materials, ensuring alignment with instructional content.

The choice of language model is configurable, allowing adaptation to different usage scenarios. Responses are generated in an explanatory style consistent with dialogical interaction and can be refined through subsequent dialogue turns.

The workflow is designed to ensure transparency and traceability, enabling users to relate generated responses to specific educational sources.

3.4. Pedagogical Design and Context Management

The primary objective of the proposed platform is to support the learning process through dialogue, viewed as a sequential and active construction of understanding. The proposed architecture prioritizes instructional grounding and sequential conceptual development. This approach treats the learning process as a continuous interaction where learners build meaning by asking questions, clarifying concepts, and gradually expanding their knowledge.

To support this dialogical model, the platform incorporates a stateful conversation management mechanism. Dialogue context is maintained over time by storing and analyzing the interaction history, including previous user queries, system responses, and the specific educational sources referenced in prior turns. This allows the system to recognize references, ellipses, and continuations of previously discussed topics without requiring the learner to restate the full context each time. Technically, the dialog history is dynamically injected into the context window of the generative model, ensuring that each new response is conditioned on both the retrieved fragments and the preceding interaction sequence.

Furthermore, the platform ensures a transparent connection between the generated responses and the verified educational sources. By allowing learners to trace the relationship between the AI-generated explanations and the original instructional materials, the system fosters epistemic trust and ensures that the structured knowledge construction remains strictly aligned with the curricular objectives. In this sense, the distinction of the proposed platform is both pedagogical and architectural, emphasizing a controlled learning trajectory over unrestricted information retrieval.

While the current implementation focuses on the accuracy of contextual retrieval and response generation, the system’s architecture is specifically designed to accommodate advanced pedagogical scaffolding. Future iterations will incorporate confidence-based thresholding and instruction-tuned Socratic prompting, enabling the platform to autonomously generate clarifying questions when user queries exhibit high semantic ambiguity.

3.5. Regulatory and Ethical Constraints in Educational Use of Language Models

According to the Artificial Intelligence Act [33], AI systems in education are considered high-risk when they perform automated assessment, influence access to education, or profile learners. In such cases, human oversight and transparency are required.

Conversely, language models may be used in supportive roles, such as providing explanations, facilitating dialogical interaction, and supporting self-directed learning. Under these conditions, systems are considered limited-risk, provided that transparency and clear disclosure are ensured [34].

The proposed platform is designed in alignment with these principles. By grounding responses in verified instructional materials and avoiding autonomous decision-making, the system supports learning without replacing the role of the instructor.

4. Modeling and Implementation

4.1. Calculus of Context-Aware Ambients (CCA) Modeling of Learning Scenario

When developing a real educational support platform that integrates heterogeneous, dynamically changing modules, the preliminary modeling process is particularly important. The formalization of the context-aware system can be represented and modeled using the formal mathematical notation CCA, which is conceptually related to the π-calculus formalism [35]. A CCA ambient is an identity through which a specific object or component is associated. Each ambient has a name, boundaries, and can contain other ambients within itself. Ambients can change their location and communicate with other ambients. There are three possible relationships between two ambients: parent, child, and sibling. Each ambient communicates with other ambients. The message exchange process is carried out using handshaking. In CCA notation, when two ambients exchange a message, three notations are used: “::” for interaction between sibling ambients; “↑” and “↓” for child–parent and parent–child communication; “<>” indicates sending a message, and “()” indicates receiving a message. Processes P are a basic syntactic category in CCA modeling. The process of each ambient defines its behavior and communication with other ambients in the system. Process 0 does nothing and terminates immediately. Process P|Q indicates that process P is running in parallel with process Q [36].

The main ambients used in the CCA model of the learning scenario are presented in the following Table 1.

The PA, RT, SC, LLMG, and KM ambients are siblings in the ambient hierarchy, and VDB, UDB, and MEDB are children of the KM ambient. This basic learning scenario can be represented by the following CCA model as processes of the participating ambients:

P_{P A} \equiv (\begin{matrix} R T ∷ < P A i, U s e r Q u e r y > . 0 | \\ S C ∷ < P A i, U s e r Q u e r y > . 0 | \\ L L M G ∷ (R e s p o n s e) . 0 \end{matrix})

(1)

P_{R T} \equiv (\begin{matrix} P A ∷ (P A i, U s e r Q u e r y) . K M ∷ < P A i, U s e r Q u e r y > . 0 | \\ K M ∷ (P A i, L e a r n i n g K n o w l e g e) . \\ L L M G ∷ < P A i, U s e r Q u e r y, L e a r n i n g K n o w l e g e > . 0 \end{matrix})

(2)

P_{L L M G} \equiv (\begin{matrix} R T ∷ (P A i, U s e r Q u e r y, L e a r n i n g K n o w l e g e) . \\ S C ∷ (P A i, L e a r n i n g C o n t e x t) . \\ P A ∷ < R e s p o n s e > . R T ∷ < P A i, R e s p o n s e > . 0 \end{matrix})

(3)

P_{S C} \equiv (\begin{matrix} P A ∷ (P A i, U s e r Q u e r y) . \\ L L M G ∷ < P A i, L e a r n i n g C o n t e x t > . 0 \end{matrix})

(4)

P_{K M} \equiv (\begin{matrix} R T ∷ (P A i, U s e r Q u e r y) . \\ V D B ↓ < P A i, U s e r Q u e r y > . \\ U D B ↓ < P A i, U s e r Q u e r y > . 0 | \\ V D B ↓ (P A i, L e a r n i n g K n o w l e g e) . \\ R T ∷ < P A i, L e a r n i n g K n o w l e g e > . 0 | \\ L M M G ∷ (P A i, R e s p o n s e) . U D B ↓ < P A i, R e s p o n s e > . 0 \end{matrix})

(5)

P_{M E D B} \equiv (V D B ∷ < E d u c a t i o n a l R e s o u r c e s > . 0 |)

(6)

P_{V D B} \equiv (\begin{matrix} K M ↑ (P A i, U s e r Q u e r y) . \\ M E D B ∷ (E d u c a t i o n a l R e s o u r s e s) . \\ K M ↑ < P A i, L e a r n i n g K n o w l e g e > . 0 \end{matrix})

(7)

P_{U D B} \equiv (\begin{matrix} K M ↑ (P A i, U s e r Q u e r y) . 0 | \\ K M ↑ (P A i, R e s p o n s e) . 0 \end{matrix}) .

(8)

The ccaPL programming language is a machine-readable version of CCA syntax. Let h be a conversion function from CCA to ccaPL; that is, for a process P in CCA, h(P) is the corresponding ccaPL notation. The ccaPL interpreter is developed as a Java application. Therefore, a simulator can be created to examine or test the scenario described above, which can be used for preliminary testing and verification of the designed processes. To assist specialists in the preliminary modeling of processes and scenarios in various context-dependent systems, we have created a visual CCA editor through which the designed services can be modeled, analyzed, and optimized before real development begins [37].

4.2. Implementation and Prototype

A functional prototype was developed to demonstrate the practical applicability of the proposed architecture. The system enables learners to upload course-specific educational materials, such as lecture presentations and PDF documents, and interact with an AI assistant constrained to operate within the scope of the uploaded content.

Educational materials are associated with course-specific profiles that define contextual parameters, including course metadata and assistant configuration (Figure 2). These parameters support alignment between generated responses and the specific learning scenario.

After upload, educational materials are converted to text and processed for indexing (Figure 3). The content is segmented into overlapping fragments of approximately 500 characters with an overlap of 150 characters. The selected chunking configuration was determined based on commonly adopted practices in retrieval-augmented systems and preliminary empirical observations, aiming to balance semantic coherence and retrieval granularity.

Each fragment is transformed into a vector representation using the intfloat/multilingual-e5-small embedding model. L2 normalization is applied to all vectors, enabling inner product similarity to serve as an exact approximation of cosine similarity within a FAISS IndexFlatIP index. This configuration is commonly used in semantic retrieval scenarios to ensure stable similarity estimation.

For each user query, the top-5 most relevant fragments are retrieved to provide contextual grounding for response generation.

During interaction, user queries are processed using a hybrid retrieval strategy that combines vector-based semantic similarity with keyword-based ranking (BM25). This approach ensures sensitivity to both conceptual meaning and domain-specific terminology, which is particularly important in technical educational content. Retrieval is implemented using both semantic and hybrid approaches. Semantic extraction searches the database for fragments relevant to the question.

The retrieved fragments, together with the user query and dialogue history, are incorporated into a structured prompt and passed to the generative model (GPT-3.5-turbo) via the OpenAI API. The model generates responses grounded in the retrieved instructional content, supporting context-aware and pedagogically aligned explanations (Figure 4).

In the illustrated example, the dialogical interaction is conducted in Bulgarian, with the answers generated based on the uploaded learning materials. However, the platform itself is language-independent and can work in different interaction languages depending on the selected model and configuration.

The prototype is implemented entirely in Python 3.9.6 and is accessible via a web-based user interface built with the Streamlit 1.50.0 framework, enabling access via a standard browser without local installation.

From a performance perspective, retrieval operates efficiently at course scale, while overall response latency is primarily influenced by embedding computation and external LLM API calls. Indexing is performed offline during content upload and does not affect interactive performance. The separation between indexing and runtime retrieval enables scalability without structural changes to the architecture.

4.3. Empirical Retrieval Evaluation and Sensitivity Analysis

A systematic empirical evaluation of retrieval configurations was conducted within the proposed RAG-based learning support system using specialized instructional material, namely a Bulgarian manual on UML (Unified Modeling Language). A dataset of 70 manually formulated questions was constructed and organized into seven thematic groups with ten questions per topic, ensuring balanced coverage. Relevance assessment was performed manually to ensure alignment with instructional content and conceptual correctness. The dataset was designed to reflect typical learner queries within the selected domain, while acknowledging that broader generalization requires further validation on larger and more diverse corpora.

Beyond reporting final accuracy values, the conducted experiments also serve as a partial ablation analysis of key retrieval-side design choices. Although the present study does not isolate every component of the full RAG pipeline (e.g., prompt construction, dialogue memory, or generation model variants), it systematically examines how retrieval quality changes under alternative chunking strategies, retrieval methods, and retrieval depths.

Two complementary evaluation procedures were applied. An automated algorithmic relevance criterion was employed for controlled sensitivity analysis across multiple chunking configurations. In addition, manual relevance assessment was performed, where retrieved fragments were evaluated in terms of alignment with instructional content and conceptual correctness.

4.3.1. Chunk Granularity Sensitivity Analysis

To examine the effect of chunk granularity on retrieval quality, an additional sensitivity analysis was conducted using four indexing configurations: 300/100, 300/150, 500/150, and 800/250 (chunk size/overlap). All other retrieval settings were kept fixed for comparability. The sensitivity analysis was conducted using the hybrid retrieval configuration with α = 0.7 under the same evaluation procedure across all chunking variants. This comparison provides partial ablation evidence regarding the influence of document segmentation on retrieval effectiveness. The automated results are presented in Table 2.

The results show that small chunks (300-character segments) led to lower retrieval accuracy, with TOP-5 values of 65.71% (300/100) and 60.00% (300/150), suggesting excessive fragmentation of instructional content. Increasing the overlap from 100 to 150 characters at the same chunk size substantially increased the number of indexed fragments, but did not improve retrieval quality consistently. In contrast, larger chunks achieved better results, with 800/250 providing the strongest automated performance, reaching 54.29% at TOP-1, 81.43% at TOP-5, and 84.29% at TOP-6. However, the gain from TOP-5 to TOP-6 remained modest (+2.86 percentage points), indicating diminishing returns from deeper retrieval. Overall, the findings confirm that chunk granularity significantly affects retrieval effectiveness and should be selected empirically rather than heuristically.

4.3.2. Retrieval Strategy Comparison

Based on the above findings, a two-stage chunking strategy was further examined, using 500/150 for indexing and 800/250 for retrieval. This design, often referred to as context expansion, balances fine-grained indexing with sufficient contextual coherence for response generation.

Two retrieval approaches were compared:

semantic retrieval based on embedding similarity (E5 model with “query:”/“passage:” formulation);
hybrid retrieval combining semantic similarity and keyword-based ranking.

Hybrid ranking was computed using a weighted aggregation (70% semantic, 30% lexical) over an expanded candidate set. Retrieval quality was evaluated using hit@k accuracy, where retrieval was considered correct if at least one relevant fragment appeared among the top-k results. For this comparison, relevance was assessed manually. The evaluation was conducted for k = 1, 3, 5, and 6. This comparison provides partial ablation evidence regarding the contribution of lexical ranking signals beyond dense semantic retrieval alone. The results are shown in Table 3.

Hybrid retrieval outperformed semantic retrieval at TOP-1 (78.57% vs. 70.00%), improving the likelihood that the most relevant fragment was ranked first. At TOP-5, hybrid retrieval also achieved a higher score (90.00% vs. 88.57%), while both approaches reached 91.43% at TOP-6. While semantic retrieval remained competitive at higher depths, hybrid retrieval provided stronger precision for technical educational content. This effect is particularly relevant in domains such as UML, where specific terms and symbols (e.g., “fork”, “composite”, “+”, “actor”) carry essential semantic meaning.

At TOP-5, the system achieved 90.00% accuracy, indicating that relevant instructional content was available to the generative model in the majority of cases. Since the gain at TOP-6 was marginal, TOP-K = 5 was selected as the operational configuration.

4.3.3. Comparative Validation of Retrieval Input Size

A comparative manual relevance assessment was conducted for the retrieval-side chunking configurations 500/150 and 800/250. While the automated analysis favored larger chunks in some settings, this follow-up evaluation examined which configuration better supports practically relevant retrieval quality under human judgment. The results are presented in Table 4.

The 500/150 setting consistently outperformed 800/250 across all retrieval depths, achieving 78.57% vs. 64.00% at TOP-1, 85.71% vs. 81.43% at TOP-3, 90.00% vs. 87.14% at TOP-5, and 91.43% vs. 87.14% at TOP-6. These findings indicate that medium-sized fragments provide a more effective balance between contextual completeness and retrieval precision, whereas larger fragments may introduce additional noise despite richer local context.

4.3.4. Failure Analysis and Dialogue Context

Notably, 5 out of 70 questions were not retrieved correctly by either method at any depth. These cases were associated with ambiguous queries, absence of relevant content in the knowledge base, or terminology overlap across conceptual contexts. This suggests that retrieval failure may reflect limitations in the knowledge base or query formulation rather than deficiencies of the retrieval mechanism itself.

In the current implementation, dialogue history is incorporated in a bounded manner by appending the five most recent messages from the interaction history to the prompt, within the limits of the model context window. This prevents uncontrolled accumulation of context and reduces the risk of noise and context drift. Retrieval is primarily conditioned on the current user query, while dialogue history provides additional contextual grounding during generation.

4.3.5. Final Configuration

Based on the experimental findings, the final system configuration was defined as hybrid retrieval with TOP-K = 5, α = 0.7, and a two-stage chunking strategy (500/150 for indexing and 800/250 for retrieval). In this configuration, the generative model operates as a context-constrained synthesizer grounded in verified educational materials, reducing the risk of unsupported or hallucinated responses.

Overall, the presented results indicate that retrieval quality in pedagogical RAG systems depends not only on the embedding model itself, but on the interaction between multiple configurable design choices. While a full end-to-end ablation remains a direction for future work, the current experiments already provide meaningful empirical justification for the selected retrieval architecture.

4.4. Generative Response Evaluation

To complement the retrieval experiments, a manual evaluation of response generation quality was conducted on 21 representative learner questions (three questions per topic across seven instructional topics). Responses were assessed using four criteria: correctness, groundedness, hallucination presence, and practical usefulness for learning support.

Hallucination was operationalized as the presence of factually incorrect or unsupported misleading claims. Groundedness was scored as 1 when the response was fully supported by the retrieved material, including paraphrasing or logical summarization; 0.5 when the response combined grounded content with additional external information; and 0 when substantial support from the retrieved material was absent.

The evaluation results indicate strong overall answer quality (Table 5). The system achieved 88.10% correctness, suggesting that most responses were factually accurate or partially accurate. Groundedness reached 76.19%, indicating that the majority of answers were substantially supported by the retrieved instructional fragments. The observed hallucination rate was 23.81% (76.19% hallucination-free responses), showing that unsupported additions still occurred in a minority of cases. However, usefulness reached 95.24%, demonstrating that even partially imperfect responses often remained educationally valuable for learners.

The following qualitative observations emerged during the experiment:

In several cases, correct conceptual answers were generated even when the retrieved passages did not contain an exact lexical match, suggesting that the system can preserve semantic relevance beyond superficial term overlap.
Most answers were correct; however, occasional conceptual simplifications were observed in cases requiring fine-grained distinctions between related UML control elements.
When retrieval failed, the model sometimes produced plausible domain-relevant answers based on prior knowledge. Although the prompt explicitly instructed the model to rely only on the retrieved context, prompt-level constraints did not fully suppress the influence of pretrained parametric knowledge. Such answers occasionally contained oversimplified or partially contradictory examples, highlighting the importance of grounding and evidence traceability in educational settings.
In several cases, the responses provided concise summaries or explanatory reformulations that remained useful for learning.

These findings support the role of RAG systems as learning support tools, while also highlighting the need for continued improvements in grounding control, retrieval coverage, and response verification.

As this was an exploratory study, the manual evaluation was conducted using a single-rater protocol, which should be extended to multi-rater validation in future work.

5. Discussion and Future Directions

This study presents a web-based platform designed to support dialogical interaction with educational materials, shifting the focus from passive content consumption toward active engagement. The system is explicitly conceived as a learning support tool in which AI operates as a mediator grounded in curated instructional resources, rather than as an autonomous source of knowledge.

The contributions of this study can be interpreted along three complementary dimensions. At the architectural level, the platform integrates retrieval grounding, dialogical interaction, and modular model selection within a unified framework, enabling controlled and adaptable learning support. At the interaction level, the system introduces context-aware dialogue management, allowing learning to unfold as a continuous process rather than a sequence of isolated queries. At the methodological level, the study provides an empirical evaluation based on human-validated relevance assessment, demonstrating that hybrid retrieval strategies can achieve high accuracy in specialized educational domains while maintaining controlled knowledge grounding.

5.1. Architectural and Pedagogical Implications

The use of a retrieval-augmented architecture enables explicit grounding of generated responses in instructor-verified materials, supporting alignment with curricular objectives and enhancing transparency. This design shifts the role of generative models from knowledge sources to mediators of structured educational content.

The integration of formal modeling through the Calculus of Context-aware Ambients contributes to the coherence of the system by reinforcing the separation between knowledge management, dialogue handling, and response generation. At the same time, alignment with the EU AI Act situates the platform within a compliance-oriented design paradigm, ensuring that AI remains a supportive tool that preserves instructor authority while enabling individualized assistance.

5.2. Technical Performance and Scalability

The empirical results indicate that hybrid retrieval, combining semantic embeddings with lexical matching, provides improved performance for technical educational materials. The selected configuration (TOP-5 retrieval with a two-stage chunking strategy) achieves 90% accuracy, ensuring that relevant instructional context is available to the generative model in most interaction scenarios.

From a systems perspective, the current prototype employs FAISS IndexFlatIP, which was appropriate for the present study and the bounded educational corpus. Indexing remains an offline process, while runtime performance is primarily influenced by embedding computation and external LLM response latency. The separation between indexing and retrieval supports future scalability to larger repositories without requiring fundamental architectural changes.

To complement the retrieval quality evaluation, retrieval latency was measured using the final operational configuration (hybrid retrieval, TOP-K = 5) across all 70 evaluation queries. This measurement focused specifically on the retrieval stage, excluding large language model generation time. The measured latency statistics are summarized in Table 6.

The first retrieval request required 3.74 s, reflecting one-time initialization overhead associated with model loading, file access, and caching processes. Such warm-up behavior is expected in prototype environments and does not represent steady-state runtime performance.

After initialization, retrieval latency remained consistently low across subsequent queries, typically ranging between 0.014 s and 0.021 s, with a median latency of 0.0167 s. These results indicate that retrieval operated at effectively interactive speed for bounded course-scale repositories.

From a scalability perspective, the current prototype employs FAISS IndexFlatIP, which performs exact similarity search with linear complexity relative to index size. This design was appropriate for the present study, where the corpus consisted of educator-curated instructional materials with a limited number of indexed fragments.

Overall, the findings suggest that the proposed architecture is computationally practical for course-level educational use, while also providing a clear pathway for future large-scale optimization.

5.3. Limitations and Challenges

Despite the strong retrieval performance, several limitations remain. Analysis of unsuccessful retrieval cases (approximately 7%) indicates that errors are primarily associated with ambiguous query formulation, incomplete coverage of instructional materials, or semantic ambiguity arising from overlapping terminology across conceptual contexts.

These findings highlight that the effectiveness of retrieval-augmented systems depends not only on the retrieval mechanism itself, but also on the quality, structure, and completeness of the underlying knowledge base, as well as on the clarity of user queries. In this sense, the system functions as a controlled knowledge mediation mechanism whose performance is inherently linked to the educational resources it operates upon.

A further limitation concerns indexing scalability. The current prototype relies on FAISS IndexFlatIP, an exact nearest-neighbor method that was suitable for the bounded course-scale repository used in this study. However, linear-search indexing may become less efficient for substantially larger multi-course or institution-wide deployments. In such contexts, approximate nearest-neighbor methods (e.g., IVF, HNSW, product quantization) and distributed retrieval architectures would be more appropriate.

For the evaluated bounded corpus (119–353 chunks, depending on chunking configuration), index construction completed within approximately 2–3 s on a Windows 11 Pro laptop equipped with an AMD Ryzen 7 7735HS processor and 32 GB RAM, indicating practical efficiency for course-scale repositories. However, larger multi-course or institution-wide deployments would require more scalable retrieval strategies, such as IVF, HNSW, product quantization, caching layers, or distributed retrieval services.

5.4. Future Research Directions

The proposed architecture provides a foundation for several meaningful directions of future development.

A first direction concerns the enrichment of pedagogical support mechanisms. Future versions of the platform may extend the current text-based knowledge base toward multimodal learning resources, including diagrams, figures, and other visual materials, which are particularly important in conceptually and visually intensive domains. In parallel, dialogical scaffolding may be further developed through guiding questions, clarification prompts, and adaptive conversational strategies that actively support learners during knowledge construction.

A second direction involves more personalized and trustworthy learning support. Future research may investigate learner modeling based on interaction patterns, enabling adaptive feedback, identification of individual learning needs, and more responsive long-term support. Additional attention should also be given to confidence estimation, fallback strategies, and automated response verification in order to strengthen robustness in educational use.

A third direction concerns technical expansion and broader validation. Future studies may extend the current evaluation through larger benchmark datasets, broader component-level analyses, and systematic investigation of retrieval and generation parameters across varied educational settings. For large-scale deployments, more scalable retrieval infrastructures—such as approximate nearest-neighbor indexing (e.g., HNSW, IVF), caching mechanisms, and distributed retrieval services—represent a natural next step while preserving transparency and pedagogical grounding.

6. Conclusions

This paper presented a RAG-based platform designed to support dialogical learning through interaction grounded in instructor-verified educational materials. The proposed architecture integrates retrieval mechanisms, context-aware dialogue management, and modular model selection to enable controlled and pedagogically aligned AI-assisted learning.

The results demonstrate that the combination of hybrid retrieval and a two-stage chunking strategy enables reliable identification of relevant instructional content. The achieved accuracy of 90% at TOP-5 indicates that, in most cases, the generative model is provided with an appropriate contextual basis for response generation. This supports the use of retrieval-augmented approaches as a means of constraining generative models within educational settings.

The system further illustrates how architectural design choices—such as grounding in curated sources, separation of indexing and interaction processes, and alignment with regulatory principles—can contribute to transparency and controlled use of AI in learning environments.

Overall, the study suggests that retrieval-grounded and pedagogically structured AI systems represent a viable direction for trustworthy educational support in future digital learning environments.

Author Contributions

Conceptualization, A.T.; methodology, A.T.; software, K.G.; validation, K.G.; formal analysis, T.G.; investigation, A.T.; writing—original draft preparation, A.T., K.G. and T.G.; writing—review and editing, A.T.; funding acquisition, T.G. All authors have read and agreed to the published version of the manuscript.

Funding

This study is financed by the European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria, project № BG-RRP-2.004-0001-C01 and supported by the project FP25-FMI-010 “Innovative interdisciplinary research in informatics, mathematics and educational pedagogy” at the Plovdiv University “Paisii Hilendarski”.

Data Availability Statement

All relevant data supporting the findings of this study are included within the article. Additional details are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Hajjioui, Y.; Zine, O.; Benslimane, M.; Ibriz, A. Intelligent Tutoring Systems: A Review. In Big Data and Internet of Things (BDIoT 2024); Mahboub, O., Haddouch, K., Omara, H., Hefnawi, M., Eds.; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2025; Volume 887, pp. 663–676. [Google Scholar] [CrossRef]
Guan, R.; Raković, M.; Chen, G. How educational chatbots support self-regulated learning? A systematic review of the literature. Educ. Inf. Technol. 2025, 30, 4493–4518. [Google Scholar] [CrossRef]
Shi, Y.; Yu, K.; Dong, Y.; Chen, F. Large language models in education: A systematic review of empirical applications, benefits, and challenges. Comput. Educ. Artif. Intell. 2026, 10, 100529. [Google Scholar] [CrossRef]
Wang, S.; Xu, T.; Li, H.; Zhang, C.; Liang, J.; Tang, J.; Yu, P.; Wen, Q. Large Language Models for Education: A survey and outlook. IEEE Signal Process. Mag. 2025, 42, 51–63. [Google Scholar] [CrossRef]
Xing, W.; Nixon, N.; Crossley, S.; Denny, P.; Lan, A.; Stamper, J.; Yu, Z. The Use of Large Language Models in Education. Int. J. Artif. Intell. Educ. 2025, 35, 439–443. [Google Scholar] [CrossRef]
Khong, T.; Saito, E.; Hardy, I.; Gillies, R. Teacher learning through dialogue with colleagues, self and students. Educ. Res. 2023, 65, 170–188. [Google Scholar] [CrossRef]
Hannigan, T.; McCarthy, I.; Spicer, A. Beware of botshit: How to manage the epistemic risks of generative chatbots. Bus. Horiz. 2024, 67, 471–486. [Google Scholar] [CrossRef]
Michel-Villarreal, R.; Vilalta-Perdomo, E.; Salinas-Navarro, D.; Thierry-Aguilera, R.; Gerardou, F. Challenges and Opportunities of Generative AI for Higher Education as Explained by ChatGPT. Educ. Sci. 2023, 13, 856. [Google Scholar] [CrossRef]
Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; Wang, M.; Wang, H. Retrieval-Augmented Generation for Large Language Models: A Survey. arXiv 2024, arXiv:2312.10997. [Google Scholar] [CrossRef]
Barnett, S.; Kurniawan, S.; Thudumu, S.; Brannelly, Z.; Abdelrazek, M. Seven Failure Points When Engineering a Retrieval Augmented Generation System. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering; Association for Computing Machinery: New York, NY, USA, 2024; pp. 194–199. [Google Scholar] [CrossRef]
Zhai, C.; Wibowo, S.; Li, L. The effects of over-reliance on AI dialogue systems on students’ cognitive abilities: A systematic review. Smart Learn. Environ. 2024, 11, 28. [Google Scholar] [CrossRef]
Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.K.H.; Lewis, M.; Yih, W.; Rocktäschel, T.; Riedel, S.; et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv 2020, arXiv:2005.11401. [Google Scholar] [CrossRef]
Chu, Z.; Wang, S.; Xie, J.; Zhu, T.; Yan, Y.; Ye, J. LLM Agents for Education: Advances and Applications. In Findings of the Association for Computational Linguistics: EMNLP 2025; Association for Computational Linguistics: Stroudsburg, PA, USA, 2025; pp. 13782–13810. [Google Scholar] [CrossRef]
Oprea, S.-V.; Bâra, A. Transforming Education with Large Language Models. IEEE Access 2025, 13, 87292–87312. [Google Scholar] [CrossRef]
Hu, B.; Zhu, J.; Pei, Y.; Gu, X. Exploring the potential of LLM to enhance teaching plans through teaching simulation. npj Sci. Learn. 2025, 10, 7. [Google Scholar] [CrossRef]
Tan, K.; Yao, J.; Pang, T.; Fan, C.; Song, Y. ELF: Educational LLM Framework of Improving and Evaluating AI-generated Content for Classroom Teaching. ACM J. Data Inf. Qual. 2025, 17, 14. [Google Scholar] [CrossRef]
Sunil, K.; Thakkar, A. SocraticAI: Transforming LLMs into Guided CS Tutors Through Scaffolded Interaction. arXiv 2025, arXiv:2512.03501. [Google Scholar] [CrossRef]
Sonkar, S.; Liu, N.; Mallick, D.; Baraniuk, R. CLASS: A Design Framework for Building Intelligent Tutoring Systems Based on Learning Science Principles. arXiv 2023, arXiv:2305.13272. [Google Scholar] [CrossRef]
Lefton, L.; Rong, K.; Dankhara, C.; Ghemri, L.; Kausar, F.; Hamdallahi, A. A Socratic RAG Approach to Connect Natural Language Queries on Research Topics with Knowledge Organization Systems. arXiv 2025, arXiv:2502.15005. [Google Scholar] [CrossRef]
Du, M.; Xu, B.; Zhu, C.; Wang, S.; Wang, P.; Wang, X.; Mao, Z. A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. arXiv 2026, arXiv:2602.03442. [Google Scholar] [CrossRef]
Wang, P.; Xu, B.; Zhang, L.; Wang, S.; Du, M.; Zhu, C.; Mao, Z. WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora. arXiv 2026, arXiv:2602.02053. [Google Scholar] [CrossRef]
Li, S.; Stenzel, S.; Eickhoff, C.; Bahrainian, S. Enhancing Retrieval-Augmented Generation: A Study of Best Practices. In Proceedings of the 31st International Conference on Computational Linguistics, Abu Dhabi, United Arab Emirates; Association for Computational Linguistics: Stroudsburg, PA, USA, 2025; pp. 6705–6717. [Google Scholar]
Arslan, M.; Ghanem, H.; Munawar, S.; Cruz, C. A Survey on RAG with LLMs. Procedia Comput. Sci. 2024, 246, 3781–3790. [Google Scholar] [CrossRef]
Chen, Z.; Zhou, K.; Zhang, B.; Gong, Z.; Zhao, W.; Wen, J.-R. ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models. arXiv 2023, arXiv:2305.14323. [Google Scholar] [CrossRef]
Khattab, O.; Singhvi, A.; Maheshwari, P.; Zhang, Z.; Santhanam, K.; Vardhamanan, S.; Haq, S.; Sharma, A.; Joshi, T.; Moazam, H.; et al. DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines. arXiv 2023, arXiv:2310.03714. [Google Scholar] [CrossRef]
Zhai, C.; Wibowo, S. A systematic review on artificial intelligence dialogue systems for enhancing English as foreign language students’ interactional competence in the university. Comput. Educ. Artif. Intell. 2023, 4, 100134. [Google Scholar] [CrossRef]
Klimova, B.; Pikhart, M. Exploring the effects of artificial intelligence on student and academic well-being in higher education: A mini-review. Front. Psychol. 2025, 16, 1498132. [Google Scholar] [CrossRef]
Radford, A.; Kim, J.; Hallacy, C.; Goh, R.A.G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; Krueger, G.; et al. Learning Transferable Visual Models from Natural Language Supervision (CLIP). arXiv 2021, arXiv:2103.00020. [Google Scholar] [CrossRef]
Li, J.; Li, D.; Xiong, C.H.S. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. arXiv 2022, arXiv:2201.12086. [Google Scholar] [CrossRef]
Li, J.; Li, D.; Savarese, S.; Hoi, S. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. arXiv 2023, arXiv:2301.12597. [Google Scholar] [CrossRef]
Li, C.; Gan, Z.; Yang, Z.; Yang, J.; Li, L.; Wang, L.; Gao, J. Multimodal Foundation Models: From Specialists to General-Purpose Assistants. arXiv 2023, arXiv:2309.10020. [Google Scholar] [CrossRef]
Mei, L.; Mo, S.; Yang, Z.; Chen, C. A Survey of Multimodal Retrieval-Augmented Generation. arXiv 2025, arXiv:2504.08748. [Google Scholar] [CrossRef]
European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act); (EU) 2024/1689; Publications Office of the European Union: Luxembourg, 2024. [Google Scholar]
European Commission. Ethical Guidelines on the Use of Artificial Intelligence and Data in Teaching and Learning for Educators; Publications Office of the European Union: Luxembourg, 2022. [Google Scholar] [CrossRef]
Riely, J.; Hennessy, M. A typed language for distributed mobile processes. In Proceedings of the 25th Annual ACM Symposium on Principles of Programming Languages; Association for Computing Machinery: New York, NY, USA, 1998; pp. 378–390. [Google Scholar] [CrossRef]
Siewe, F.; Zedan, H.; Cau, A. The calculus of context-aware ambients. J. Comput. Syst. Sci. 2011, 77, 597–620. [Google Scholar] [CrossRef]
Glushkova, T.; Rusev, K.; Stoyanov, D. Multi-Context Modeling of Processes and Services in Cyber-Physical Educational Space. In Proceedings of the International Conference on Big Data, Knowledge and Control Systems Engineering (BdKCSE); IEEE: New York, NY, USA, 2023; pp. 1–8. [Google Scholar] [CrossRef]

Figure 1. Architecture of the proposed RAG-based e-learning platform for dialogical learning support.

Figure 2. Creation of a course-specific profile used to define contextual parameters.

Figure 3. Upload and preprocessing of course materials.

Figure 4. Example of dialogical interaction between a learner and the AI assistant in Bulgarian, illustrating the original language of the instructional materials used in the experiment.

Table 1. CCA Ambients.

Notation	Description
PA	Personal Assistant of Student
RT	Retriever Component
SC	Session Context Module
LLMG	LLM Generator
KM	Knowledge Manager
VDB	Vector Database, child of KM ambient
UDB	User Database, child of KM ambient
MEDB	Multimodal Educational DB, child of KM ambient

Table 2. Automated sensitivity analysis of chunking configurations.

Chunking	Chunks	TOP-1	TOP-3	TOP-5	TOP-6
300/100	233	47.14%	58.57%	65.71%	67.14%
300/150	353	48.57%	57.14%	60.00%	67.14%
500/150	149	52.86%	74.29%	77.14%	78.57%
800/250	119	54.29%	74.29%	81.43%	84.29%

Table 3. Manual comparison of retrieval strategies.

Method	TOP-1	TOP-3	TOP-5	TOP-6
Semantic	70.00%	87.14%	88.57%	91.43%
Hybrid	78.57%	85.71%	90.00%	91.43%

Table 4. Manual comparison of retrieval input chunking.

Retrieval Input	TOP-1	TOP-3	TOP-5	TOP-6
500/150	78.57%	85.71%	90.00%	91.43%
800/250	64.00%	81.43%	87.14%	87.14%

Table 5. Manual evaluation results for generative response quality.

Metric	Score
Correctness	88.10%
Groundedness	76.19%
Hallucination-Free	76.19%
Usefulness	95.24%

Table 6. Retrieval latency statistics for the evaluated course-scale repository.

Metric	Value
Number of queries	70
First-query latency (initialization)	3.74 s
Median latency (all queries)	0.0167 s
Mean latency (all queries)	0.0702 s
Minimum latency	0.0141 s
Maximum latency	3.7417 s
Typical steady-state range	0.014–0.021 s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Toskova, A.; Georgiev, K.; Glushkova, T. Dialogical Learning Support in RAG-Based E-Learning. Information 2026, 17, 418. https://doi.org/10.3390/info17050418

AMA Style

Toskova A, Georgiev K, Glushkova T. Dialogical Learning Support in RAG-Based E-Learning. Information. 2026; 17(5):418. https://doi.org/10.3390/info17050418

Chicago/Turabian Style

Toskova, Asya, Kosta Georgiev, and Todorka Glushkova. 2026. "Dialogical Learning Support in RAG-Based E-Learning" Information 17, no. 5: 418. https://doi.org/10.3390/info17050418

APA Style

Toskova, A., Georgiev, K., & Glushkova, T. (2026). Dialogical Learning Support in RAG-Based E-Learning. Information, 17(5), 418. https://doi.org/10.3390/info17050418

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Dialogical Learning Support in RAG-Based E-Learning

Abstract

1. Introduction

2. Related Work

3. Architecture and Design

3.1. Overall Architecture and Design Principles

3.2. Knowledge Base Construction and Content Management

3.3. Retrieval and Answer Generation Workflow

3.4. Pedagogical Design and Context Management

3.5. Regulatory and Ethical Constraints in Educational Use of Language Models

4. Modeling and Implementation

4.1. Calculus of Context-Aware Ambients (CCA) Modeling of Learning Scenario

4.2. Implementation and Prototype

4.3. Empirical Retrieval Evaluation and Sensitivity Analysis

4.3.1. Chunk Granularity Sensitivity Analysis

4.3.2. Retrieval Strategy Comparison

4.3.3. Comparative Validation of Retrieval Input Size

4.3.4. Failure Analysis and Dialogue Context

4.3.5. Final Configuration

4.4. Generative Response Evaluation

5. Discussion and Future Directions

5.1. Architectural and Pedagogical Implications

5.2. Technical Performance and Scalability

5.3. Limitations and Challenges

5.4. Future Research Directions

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI