1. Introduction
In recent years, the rapid development of generative artificial intelligence has promoted the evolution of intelligent systems in various fields. Educational technology, as a key application direction, is facing the challenge of transforming from static content distribution to dynamic personalized teaching [1,2]. Adaptive learning has attracted widespread attention because it can dynamically adjust teaching strategies according to learner differences. Its core lies in accurately understanding learner needs and generating personalized learning resources in real time [3,4]. Generative AI based on the Transformer architecture provides technical support for efficient and intelligent teaching interaction. Exploring its deep integration in adaptive learning has important theoretical value and practical significance for improving teaching efficiency and optimizing learning experience [5,6].
Although existing research has initially applied generative AI to educational scenarios, there are still key problems in its specific implementation [7,8]. First, the pre-trained large model lacks sufficient adaptation to the language and knowledge structure in the field of education, resulting in low subject accuracy and cognitive fit of the generated content [9,10]. Second, the lack of personalized generation capabilities makes it difficult to provide targeted learning resources based on individual differences among students [11,12]. Third, the high cost of model inference limits its deployment and response efficiency in real-time teaching systems [13,14]. These problems seriously restrict the in-depth application of generative AI in adaptive learning systems [15,16].
By combining generative AI with education systems, we can better respond to diverse student needs, tailor learning content and paths for them, and thus provide a more effective learning experience [17,18]. To improve the performance of generative models in specific tasks, some studies have introduced efficient parameter fine-tuning methods, such as Adapter, Prefix Tuning, and LoRA (Low-Rank Adaptation). Among them, LoRA has attracted widespread attention due to its low computational overhead and high transfer efficiency [19,20]. At the same time, instruction fine-tuning has been proven to significantly improve the task perception ability of the model, and dynamic prompt engineering has performed well in achieving contextual personalized generation [21,22]. However, these methods are mostly focused on general language processing tasks, and their integrated application in educational scenarios is still immature, lacking systematic optimization for learner behavior modeling and targeted generation of teaching content [23,24]. To this end, this paper combines LoRA fine-tuning technology with a dynamic prompt mechanism to construct a generative AI optimization path suitable for adaptive learning scenarios [25,26].
The in-depth application of generative artificial intelligence in adaptive learning faces several fundamental scientific challenges: how to accurately adapt general-purpose models to the specialized language and knowledge structures of the educational field, how to achieve real-time personalized content generation based on individual learner differences, and how to maintain efficient inference in resource-constrained environments. This research addresses three key research questions: exploring an efficient fine-tuning path for Transformer models based on low-rank adaptation; designing optimization strategies that integrate teaching instructions to enhance the model’s understanding of educational tasks; and constructing a prompting mechanism that can dynamically generate personalized guidance based on learner states. The research is inherently interdisciplinary, sitting between educational science and artificial intelligence. It goes beyond optimizing model algorithms and involves computational modeling of learning behavior, cognitive patterns, and knowledge transfer processes, embodying an interdisciplinary methodology for empowering educational reform with intelligent technology.
This paper aims to improve the personalized generation quality and response efficiency of generative AI in adaptive learning scenarios and proposes a Transformer model optimization method that integrates low-rank adaptation, instruction fine-tuning, and dynamic prompt construction. First, the low-rank adaptation technology is used to fine-tune the parameters of the pre-trained model by injection, and the effective transfer of teaching language style and knowledge expression is achieved on a large-scale educational scenario dataset. Then, instruction fine-tuning is used to further enhance the model’s understanding and generation capabilities for diverse educational tasks (mathematical problem solving, language learning). This paper also optimizes the Transformer architecture itself, adjusting the attention mechanism and position encoding to adapt to the unique contextual requirements of the education field, thereby improving the accuracy and subject fit of the model when generating content. Finally, a dynamic prompt construction mechanism is introduced to optimize the personalized matching ability of generated content by modeling the learner context. The novelty of this method lies in its systematic integration and structural optimization for adaptive learning scenarios. The standard LoRA fine-tuning and instruction fine-tuning processes have been redesigned to embed semantic role annotation, task embedding vectors, and a dynamic prompt generation mechanism based on knowledge graphs and error types, all specific to the education domain. The collaborative work of these components aims to address the issues of subject-specific accuracy, personalized matching, and cognitive fit in educational content generation, rather than simply combining general techniques.
To systematically address the aforementioned issues, this paper unfolds according to the following structure. Chapter 2 elucidates the Transformer-based generative artificial intelligence optimization method, detailing the construction and preprocessing of educational datasets, laying the foundation for domain adaptability in model training; it introduces a Transformer fine-tuning technique based on low-rank adaptation, aiming to achieve efficient updating of model parameters and knowledge transfer; it describes instruction fine-tuning strategies to enhance the model’s understanding and execution capabilities for diverse teaching tasks; and it proposes a dynamic prompt construction mechanism to achieve perception of learners’ personalized contexts and control of content generation. Chapter 3 evaluates the adaptive generation performance, verifying the effectiveness of the proposed method through multi-dimensional experiments on generated content quality, personalization matching degree, multi-task adaptability, response time, and resource utilization. Chapter 4 summarizes the entire work, discussing the contributions and limitations of this method and providing an outlook on future research directions. Each chapter is interconnected, aiming to comprehensively present and demonstrate the optimization path of this research, from method construction and implementation to verification.
2. Transformer-Based Generative AI Optimization Method
This section covers the complete technical process from data preparation to model optimization. The construction and preprocessing of educational datasets are fundamental to subsequent model adaptation, aiming to provide structured, high-quality domain corpora. Low-rank adaptation-based Transformer fine-tuning is the core technology for achieving efficient transfer of pre-trained models. Instruction fine-tuning further endows the model with the ability to understand and execute specific teaching tasks. The dynamic prompting mechanism is responsible for transforming the learner’s individual state into contextual guidance for the generation process, ultimately achieving personalized content generation.
2.1. Data Construction and Preprocessing in the Education Field
To achieve effective migration and fine optimization of generative AI models in adaptive learning, this paper first constructs a structured, high-quality educational corpus. The data sources mainly include digital course resource platforms, subject question bank resources, real student question and answer logs, and teacher annotation texts.
The educational corpus constructed in this study is sourced from several mainstream digital education platforms in China, covering three subjects (mathematics, physics, and English) and curriculum content from junior high to senior high school. The corpus comprises approximately 150,000 text fragments, including textbook explanations, exercise stems, student answers, and teacher comments. The subject distribution is even, the grade-level coverage is complete, and the texts are anonymized while preserving the original teaching context and knowledge structure.
First, during the data collection phase, an automated web crawler framework is used in conjunction with API (Application Programming Interface) calls to capture raw text data from a highly trusted educational platform. Content filtering rules are then set to retain only segments related to course instruction, student answers, and teacher explanations, while excluding non-teaching corpora. Meanwhile, natural language keyword extraction tools are used to roughly classify the original texts into three categories: teaching explanation, question–answer interaction, and task guidance, in order to improve the efficiency of subsequent processing.
Secondly, in the sample screening stage, TF-IDF (Term Frequency-Inverse Document Frequency) features and a similarity clustering algorithm (using cosine similarity) are used to identify and remove redundant, templated, or highly repetitive texts to ensure data diversity and breadth of expression. At the same time, BERT embedding vectors are introduced to perform semantic consistency verification. For text pairs above the similarity threshold (set to 0.92), only the text with the more complete expression is retained, as shown in Figure 1.
Figure 1 shows the core steps of data collection and sample screening. The data collection stage aims to obtain high-quality teaching-related texts and roughly classify them through the TextRank algorithm. The sample screening stage uses TF-IDF and similarity clustering technology, combined with the BERT (Bidirectional Encoder Representations from Transformers) model, to further screen the data to ensure that useful, non-redundant, and highly semantically consistent texts are retained.
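The screening rule described above can be illustrated with a minimal sketch. It covers only the TF-IDF and cosine-similarity part with the 0.92 threshold; the BERT-based semantic check is omitted, whitespace tokenization is a simplification, and using text length as a proxy for "more complete expression" is an assumption made here for illustration. The function names are hypothetical and are not taken from the paper's implementation.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (term -> weight) from whitespace-tokenized texts."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    # Smoothed IDF keeps weights positive even for terms found in every document.
    idf = {term: math.log((1 + n) / (1 + c)) + 1.0 for term, c in df.items()}
    return [{term: tf * idf[term] for term, tf in d.items()} for d in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def deduplicate(texts, threshold=0.92):
    """Greedy near-duplicate removal; of each similar pair the longer text is kept."""
    vecs = tfidf_vectors(texts)
    # Assumption: longer text == more complete expression, so visit long texts first.
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]), reverse=True)
    kept = []
    for i in order:
        if all(cosine(vecs[i], vecs[j]) <= threshold for j in kept):
            kept.append(i)
    return [texts[i] for i in sorted(kept)]
```

The greedy pass compares each text with every retained text, which is quadratic in corpus size; for a corpus of this scale, a clustering or approximate nearest-neighbor step would likely be needed in practice.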
In the semantic hierarchical annotation stage, a weakly supervised annotation mechanism is introduced to divide the teaching text into four semantic roles: knowledge point statement, teaching intention description, task guidance language, and student answer fragments [27,28]. A CRF (Conditional Random Field) model is trained for the sequence labeling task, and manual review is used to ensure label consistency. The cleaning and normalization stage focuses on non-standard expressions, incoherent texts, and colloquial language. First, language model perplexity is used to filter low-quality texts: texts with a perplexity greater than 120 are removed [29,30]. Secondly, a text normalization rule library is introduced to uniformly rewrite common abbreviations, typos, and redundant expressions so that the corpus format is consistent. For formula content, a parser is used to unify the expression form, preventing the model from misinterpreting mathematical structure information during training.
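As a hedged illustration of the perplexity filter, the sketch below computes perplexity from per-token log-probabilities and applies the threshold of 120 stated above. It assumes the log-probabilities (natural log) have already been produced by some language model; which model the authors used is not specified, and the `logprobs` field name is hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def filter_by_perplexity(samples, max_ppl=120.0):
    """Keep only samples whose perplexity does not exceed max_ppl."""
    return [s for s in samples if perplexity(s["logprobs"]) <= max_ppl]
```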
Finally, the structured coding process is entered to build a unified input format template. Each sample is organized into a multi-field JSON (JavaScript Object Notation) structure, with fields including “instruction” (task instruction), “input” (context information), “target” (reference output), and “metadata” (knowledge point number, grade, and subject category label information). In addition, hash coding is used to construct an anonymous student identity index, linking historical answer records with learning behavior labels, and providing a basic semantic vector for subsequent dynamic prompt construction.
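One possible shape of such a record is sketched below. The four top-level field names come from the description above; the keys inside `metadata`, the salt value, and the truncation of the digest to 16 characters are illustrative assumptions, as the paper does not specify the hash function or key layout.

```python
import hashlib
import json


def anonymize_student_id(raw_id, salt="example-salt"):
    """Map a raw student ID to a stable pseudonymous key via salted SHA-256."""
    digest = hashlib.sha256((salt + ":" + raw_id).encode("utf-8")).hexdigest()
    return digest[:16]  # shortened for readability; length is an arbitrary choice


def build_sample(instruction, context, target, knowledge_point, grade, subject):
    """Assemble one training sample in the multi-field format."""
    return {
        "instruction": instruction,
        "input": context,
        "target": target,
        "metadata": {
            "knowledge_point": knowledge_point,  # hypothetical key names
            "grade": grade,
            "subject": subject,
        },
    }


sample = build_sample(
    "Please explain the following physical concept and provide an application example.",
    "Newton's second law",
    "Newton's second law states that the net force equals mass times acceleration.",
    knowledge_point="PHY-0001",
    grade="junior-high",
    subject="physics",
)
print(json.dumps(sample, indent=2))
```

Because the hash is deterministic for a given salt, the same student always maps to the same key, which is what allows historical answer records to be linked without storing the raw identity.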
2.2. Transformer Fine-Tuning Based on LoRA (Low-Rank Adaptation)
This study selects LLaMA-2-13B as the backbone network for its methodology. This model is a pre-trained Transformer decoder with an open structure and full parameter access. It has 13 billion parameters, and its architectural details and weights are publicly available, allowing for low-level parameter manipulation and adaptation layer injection. The model was chosen based on its recognized performance in complex instruction compliance and contextual understanding, providing a necessary semantic understanding foundation for subsequent prompting engineering and contextual adaptation in educational scenarios. Fine-tuning is achieved through adaptation layer injection, without directly modifying the weights of the original model. Instead, trainable low-rank adaptation modules are inserted in parallel alongside specific linear layers [
31,
32].
The first stage is model freezing and target layer selection. The original pre-trained model weights (using FP16 precision) are loaded, and only the query (Q) and value (V) weight matrices of all attention modules in the Transformer are selected as the injection targets. All other parameters are frozen and do not participate in training. This selection is based on the results of previous experiments, which show that the main impact of educational tasks on the quality of model generation is concentrated in the information selection and output construction links, corresponding to the action paths of the Q and V matrices, respectively. The number of model layers here is set to 24, and 2 sets of LoRA modules are injected into each layer.
The second stage is the injection and structure construction of the LoRA module. The original attention weight matrix $W_0 \in \mathbb{R}^{d \times k}$ is replaced with $W = W_0 + \Delta W = W_0 + BA$. The low-rank matrices have sizes $A \in \mathbb{R}^{r \times k}$ and $B \in \mathbb{R}^{d \times r}$, with a rank of $r$. Parameter initialization: matrix $A$ is initialized with He initialization, and matrix $B$ is initialized with Xavier initialization to prevent gradient explosion and vanishing. During training, only matrices $A$ and $B$ are updated, and the rest of the weights remain frozen.
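The low-rank update can be sketched in a few lines of plain Python. The sketch follows the standard LoRA formulation, in which a frozen weight $W_0$ of size $d \times k$ is augmented by the product of a $d \times r$ matrix $B$ and an $r \times k$ matrix $A$, scaled by $\alpha / r$. The scaling term and the small Gaussian initialization used here are assumptions for illustration and do not reproduce the paper's initialization scheme; a real implementation would use a tensor library rather than nested lists.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def random_matrix(rows, cols, scale, rng):
    """Matrix with independent zero-mean Gaussian entries."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def lora_weight(W0, A, B, alpha, r):
    """Effective weight W0 + (alpha / r) * B @ A; W0 itself is never modified."""
    delta = matmul(B, A)
    s = alpha / r
    return [[w + s * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W0, delta)]


def trainable_params(d, k, r):
    """Parameters in A (r x k) plus B (d x r), versus d * k for the full matrix."""
    return r * (d + k)


rng = random.Random(0)
d, k, r, alpha = 6, 4, 2, 8          # toy sizes, chosen only for the example
W0 = random_matrix(d, k, 0.1, rng)   # frozen pre-trained weight
A = random_matrix(r, k, 0.1, rng)    # trainable
B = random_matrix(d, r, 0.1, rng)    # trainable
W = lora_weight(W0, A, B, alpha, r)
```

For a square projection with $d = k$, the adapter holds $2rd$ trainable values instead of $d^2$, which is the source of the low training cost.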
The third stage is task adaptation training. Based on the aforementioned educational corpus, a supervised fine-tuning strategy is used to train the inserted LoRA modules. The loss function is standard cross-entropy; the optimizer is AdamW; the learning rate schedule combines linear warmup with cosine decay. During training, a dynamic segmented input window (window size ranging from 512 to 2048 tokens) is used to adapt to teaching task samples of different lengths, thereby ensuring that the model can stably model contextual dependencies. Training runs for 30 epochs with a batch size of 64 and takes about 16 h, as shown in Figure 2.
In the first 5 epochs, the loss value gradually decreases. At this stage the learning rate is still in the warmup phase, increasing linearly from a small initial value toward the predetermined maximum. The model is still adapting to the task, so the training error starts high and falls as the learning rate rises. This linear growth helps the model find suitable parameters early on and improves the stability of training. After warmup, the learning rate enters the cosine decay phase and gradually decreases. The loss continues to decrease, but more and more slowly, because the model is close to its optimum and further error reduction requires finer adjustments. In the later stages of training (approaching the 30th epoch), the loss changes very little, indicating that the model has reached a stable state.
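The schedule can be written as a small function. The maximum learning rate of 3 × 10⁻⁴ and the 500 warmup steps are the values reported in the experimental settings of this paper; the total number of steps and a floor of zero at the end of decay are assumptions made for the sketch.

```python
import math


def lr_at_step(step, total_steps, max_lr=3e-4, warmup_steps=500, min_lr=0.0):
    """Linear warmup to max_lr, followed by cosine decay to min_lr."""
    if step < warmup_steps:
        # Linear ramp: reaches max_lr on the last warmup step.
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The cosine term falls slowly at first and near the end, and fastest in the middle of training, which matches the flattening loss curve described above.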
Figure 2 records the evolution of the cross-entropy loss value over 30 training epochs. The loss measures the difference between the model’s predictions and the true labels and is obtained through the standard forward and backward propagation process. At each training step, the loss is computed, the optimizer updates the parameters from the gradients, and the value is written to the logging system in real time. The curves in Figure 2 are constructed from these continuously recorded loss values. The correctness of the curve relies on the computational correctness of the training framework, including the correct implementation of the loss function, the numerical stability of gradient calculation, and the mathematical consistency of optimizer updates. No gradient explosion or abnormal fluctuations in the loss value occurred during training, indicating that the computation process was effective.
To prevent target drift and expression degradation during fine-tuning, several regularization mechanisms are introduced in training. (1) L2 regularization is applied to the inserted LoRA parameters, with a weight of $5 \times 10^{-5}$. (2) A Frozen Layer Activation Check is performed every 500 steps: the KL divergence between the LoRA output and the frozen backbone output is calculated to monitor whether the fine-tuning direction deviates from the pre-trained semantic space. (3) Gradient clipping is used, with the clipping threshold set to 1.0 to prevent local gradient explosion. These measures ensure the stability of training and the consistency of generated content, further improving the performance of the model in educational tasks.
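Two of these checks can be illustrated with a short sketch: a KL divergence between two output distributions and gradient clipping by global L2 norm at a threshold of 1.0. The direction of the divergence (adapted output relative to the frozen backbone) and the operation on a flat list of gradient values are assumptions for illustration; the paper does not state these details.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    return [g * max_norm / norm for g in grads]
```

A monitoring loop would call `kl_divergence` on the two output distributions at the chosen interval and flag the run if the value grows beyond a tolerance selected by the practitioner.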
To further improve the model’s ability to follow instructions in an educational context, this study embeds a lightweight instruction preprocessing module in the LoRA fine-tuning process. This module uses a GPT-tokenizer to structure the input teaching task description and introduces explicit task labels (such as [TASK:EXPLAIN], [TASK:Q&A]) as additional prompt prefixes, thereby significantly improving the LoRA module’s ability to identify task boundaries and generate targets. In the specific implementation process, the preprocessing module first parses the input teaching task description and automatically adds corresponding task tags according to the task type (such as explanation, question answering, and exercise generation). These tags provide clear contextual guidance for the model, enabling it to understand the goals and constraints of the current task more accurately. For example, the tag [TASK:EXPLAIN] is used to prompt the model to generate a detailed explanation of a certain knowledge point, while [TASK:Q&A] is used to guide the model to generate a dialogue form of questions and answers. Through this structured task guidance, the model can more clearly identify task requirements and adjust the style and structure of generated content according to the label. By introducing this instruction preprocessing mechanism, the model can better understand the task objectives when processing educational tasks, thereby generating more accurate and targeted content.
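The tagging step can be sketched as a simple lookup. The tag strings are those mentioned in this paper; the dictionary keys and the function name are hypothetical, and the tokenizer stage is left out.

```python
TASK_TAGS = {
    "explain": "[TASK:EXPLAIN]",
    "qa": "[TASK:Q&A]",
    "create_question": "[TASK:CREATE_Q]",
}


def add_task_tag(task_type, description):
    """Prefix a teaching task description with its explicit task label."""
    try:
        tag = TASK_TAGS[task_type]
    except KeyError:
        raise ValueError(f"unknown task type: {task_type!r}") from None
    return f"{tag} {description.strip()}"


print(add_task_tag("explain", "Explain Newton's second law."))
```

Rejecting unknown task types, rather than passing the text through untagged, keeps untagged samples from silently entering the training data.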
2.3. Instruction Fine-Tuning Improves Task Adaptability
This paper designs and implements a set of structured task instruction sets, combined with instruction fine-tuning strategies, to clearly guide the model to generate content that meets the teaching objectives. The whole process covers the key links of task instruction template construction, instruction–input pair construction, and fine-tuning training implementation.
First, a unified structured task instruction template is constructed based on the type of educational task. The template strictly follows the three elements of “task type, input requirements, and expected output format”. Examples include “Please generate five medium-difficulty math multiple-choice questions based on the following knowledge points”, “Please explain the following physical concepts and provide application examples”, etc. The template text uses concise and clear language to avoid ambiguity and ensure that the model can accurately identify the task objectives. A total of 6 types of task instructions are designed, covering key teaching scenarios such as exercise generation, knowledge point explanation, Q&A, and wrong question analysis.
Secondly, the instruction and input data pairs are constructed using automated script processing that structures the teaching content in the educational corpus into a triple format of “instruction, input, output”. Among them, the “instruction” field corresponds to the above-mentioned task template; the “input” field includes the knowledge point description, context text, and student answer fragments; the “output” field is the corresponding teaching-generated content, including question text, concept explanation, or feedback comments, as shown in Table 1. The original text is accurately segmented and matched through regular expressions and semantic matching algorithms to ensure semantic consistency and prevent information confusion. Finally, 124,000 structured training samples are generated.
Table 1 shows how the teaching content in the educational corpus is converted into structured “instruction, input, output” triples. Each triple represents a complete teaching task, and the task template indicates the type of task, such as concept explanation, exercise generation, or question-answering feedback. Instructions are specific prompts that tell the model what to do, such as [TASK:EXPLAIN] for explaining a concept and [TASK:CREATE_Q] for generating a question. The input contains the background information required for the task, including the description of knowledge points and the student’s answer fragments. The output is the expected result of the task, that is, the content to be generated by the model, such as a detailed explanation of a concept, generated exercise questions, or specific answers for Q&A feedback. Table 1 was developed to meet the need for structured data representation in the instruction fine-tuning strategy: raw teaching texts are transformed into an “instruction–input–output” triple format that the model can process. The examples in the table are taken directly from the constructed educational domain corpus. Data collection relied on automated scripts that parsed and aligned the teaching texts in the corpus based on predefined regular expressions and semantic matching algorithms. The accuracy of the data was ensured and recorded through a two-stage process: in the first stage, rule matching executed by the script ensured the accuracy of field segmentation; in the second stage, manual verification through sampling confirmed semantic consistency, and the verification results were recorded to demonstrate the reliability of the data transformation. The samples presented in Table 1 are structured examples of different teaching task types. Their purpose is to illustrate concretely how data are organized in the instruction fine-tuning stage and to show the logical relationship between task instructions, contextual inputs, and target outputs, thereby laying a data foundation for the model to understand and execute diverse teaching tasks.
The third step is to load the LoRA modules into the pre-trained Transformer model and fine-tune them with the constructed instruction dataset. The training input sequence is composed of “[task instruction] + [input content]”, and the output target is the corresponding teaching text. Cross-entropy is used as the objective function; the optimizer is AdamW; the learning rate is set to $1 \times 10^{-4}$; the batch size is 64; training runs for 3 epochs. To improve generalization across tasks, samples of different task types are randomly sampled during training to avoid overfitting to a single task. The input token length is dynamically adjusted, with a maximum of 1024 tokens.
In addition, in view of the characteristics of multi-task learning, this study introduces a task embedding vector in the Transformer structure. This vector is superimposed with the input token embedding and input into the model together to enhance the model’s perception of the task context. The role of the task embedding vector is to provide an independent identifier for each input task, so that the model can adaptively adjust its generation strategy according to different task types (such as exercise generation, concept explanation, and question answering). The vector is randomly initialized at the beginning of training and is optimized along with the model parameters during training. In this way, the task embedding vector gradually learns how to switch between different tasks, thereby improving the model’s recognition of task instructions and the targeted generation.
2.4. Dynamic Prompt Construction to Achieve Personalized Generation
Aiming at the needs of personalized teaching, this study designs and implements a dynamic prompt construction mechanism based on student behavior data and knowledge mastery, aiming to guide the generative AI model to produce teaching content that is highly consistent with the individual characteristics of learners through refined contextual prompts. The mechanism covers three parts: data collection and feature extraction, dynamic prompt template design, context splicing and real-time update. It combines deep learning and rule engines to achieve precise control of the adaptive generation environment.
First, a multi-dimensional behavioral feature vector is constructed for individual students. The data sources include students’ previous answer records, wrong question types, learning time, learning frequency, and feedback evaluation. Time series preprocessing methods are used for normalization to eliminate abnormal data points. Based on this data, a Transformer-based behavior encoder is applied to extract high-dimensional implicit state vectors. The encoder uses a multi-layer self-attention network to achieve information fusion within the time step, effectively capturing changes in learning behavior and dynamics of knowledge mastery. The encoder parameters adopt a pre-training and fine-tuning strategy, sharing some weights with the overall teaching task model to ensure a close connection between feature expression and generation tasks.
Secondly, a dynamic Prompt template is designed based on the current learning goals and knowledge point mastery. The template consists of multiple structured text segments, including three modules: “Learner status description”, “Current knowledge point summary”, and “Task instruction prompt”. The content of each module is dynamically generated based on the real-time feature vector: the learner status description module maps the output vector of the behavior encoder to a text label (such as “Basic concepts have been mastered”, “Need to strengthen application practice”). The knowledge point summary module calls the knowledge graph interface to obtain relevant knowledge units and difficulty levels. The task instruction prompt module selects predefined instruction templates for dynamic adjustment based on the current teaching objectives. The template generation process is completed by combining the rule engine with the neural network generator. The rule engine ensures that the generated content is logically coherent and meets educational standards, and the neural network is responsible for refining natural language expressions.
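A minimal sketch of the three-module template follows. In the paper, the learner status label is derived from the output vector of the behavior encoder and the knowledge summary comes from a knowledge graph interface; here a single mastery score and plain strings stand in for both. The 0.8 threshold, the angle-bracket separators, and the function names are illustrative assumptions.

```python
def describe_learner(mastery):
    """Map a mastery score in [0, 1] to a status label (threshold is illustrative)."""
    if mastery >= 0.8:
        return "Basic concepts have been mastered"
    return "Need to strengthen application practice"


def build_prompt(mastery, knowledge_summary, task_instruction, question):
    """Join the template modules and the student question with explicit separators."""
    sections = [
        ("LEARNER", describe_learner(mastery)),
        ("KNOWLEDGE", knowledge_summary),
        ("TASK", task_instruction),
        ("QUESTION", question),
    ]
    return "\n".join(f"<{name}> {text}" for name, text in sections)


print(build_prompt(
    0.55,
    "Newton's second law (difficulty: medium)",
    "[TASK:EXPLAIN] Explain the concept and give an application example.",
    "Why does a heavier cart accelerate more slowly under the same push?",
))
```

Keeping each module on its own delimited line makes it possible to update one module, such as the learner status, without rebuilding the others.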
In the generation phase, the dynamic prompt template is spliced with the student’s input questions and context information to form a complete input sequence. A hierarchical tagging strategy is used in sequence encoding, and special separators are used to distinguish different modules to assist the model in accurately interpreting each part of the information. To ensure real-time performance, the system designs a lightweight cache mechanism to incrementally encode frequently updated feature vectors, reduce repeated calculation delays, and meet the needs of complex teaching scenarios. The effect of the hierarchical tagging strategy is shown in Table 2.
Table 2 shows the changes in latency before and after the optimization strategy, reflecting the effects of the cache mechanism and incremental encoding. The cache mechanism avoids recomputing frequently occurring identical tasks, which significantly reduces repeated calculation time and therefore total latency. Batch processing of multiple tasks significantly improves throughput, especially when the system is heavily loaded, and reduces waiting time and overall response time. The hierarchical tagging strategy uses specific delimiters to mark the boundaries of different information modules when the complete input sequence of a dynamic prompt is constructed. This strategy operates in the sequence encoding stage and clearly distinguishes the learner state description, knowledge point summary, and task instruction prompt modules of the dynamic prompt template from the student’s input questions and contextual information. Table 2 presents the quantitative effects of this strategy, combined with the cache mechanism and incremental encoding, on input processing time and overall latency. The importance of this strategy lies in its ability to ensure the model’s structured understanding of complex, multi-part inputs, which is a technical prerequisite for the dynamic prompt mechanism to accurately control the generation process. As evidence, the measured latency reduction supports the feasibility of the proposed dynamic prompt construction mechanism in real-time teaching scenarios, linking the method design to its expected efficiency advantages through observable indicators.
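The idea behind caching at module granularity can be sketched as follows. String wrapping stands in for the real, more expensive encoding step, and `functools.lru_cache` is used as a generic memoization tool; neither is claimed to be the cache design used in the paper. The call counter exists only to make the saving visible.

```python
from functools import lru_cache

CALLS = {"encode": 0}  # counts cache misses, for demonstration only


@lru_cache(maxsize=1024)
def encode_module(name, text):
    """Stand-in for an expensive per-module encoding step."""
    CALLS["encode"] += 1
    return f"<{name}>{text}</{name}>"


def assemble(modules):
    """Encode an ordered list of (name, text) modules, reusing cached encodings."""
    return "".join(encode_module(name, text) for name, text in modules)


base = [
    ("LEARNER", "Need to strengthen application practice"),
    ("KNOWLEDGE", "Newton's second law, medium difficulty"),
    ("TASK", "[TASK:EXPLAIN]"),
]
first = assemble(base)
# Only the learner status changes, so only that module is encoded again.
updated = [("LEARNER", "Basic concepts have been mastered")] + base[1:]
second = assemble(updated)
print(CALLS["encode"])  # 3 encodings for the first prompt plus 1 for the changed module
```

Because the delimiters fix the module boundaries, a change in one module never invalidates the cached encoding of another.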
In the model reasoning stage, based on the Transformer architecture fine-tuned by LoRA mentioned above, combined with instruction fine-tuning parameters, dynamic prompts are used as guides to generate multiple candidate texts through the Beam Search strategy, and the results with the highest semantic consistency and personalized matching are selected. The matching degree calculation uses weighted cosine similarity, and the weights are dynamically adjusted according to the teaching focus and student needs to ensure that the generated content not only meets the task requirements, but also fits the learner’s cognitive level and interest preferences.
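The selection step can be illustrated with a weighted cosine similarity over small feature vectors. How candidates and learner profiles are embedded, and how the weights are set, are not specified at this level of detail in the paper, so the vectors and weights below are placeholders and the function names are hypothetical.

```python
import math


def weighted_cosine(u, v, w):
    """Cosine similarity in which dimension i contributes with non-negative weight w[i]."""
    dot = sum(wi * ui * vi for ui, vi, wi in zip(u, v, w))
    nu = math.sqrt(sum(wi * ui * ui for ui, wi in zip(u, w)))
    nv = math.sqrt(sum(wi * vi * vi for vi, wi in zip(v, w)))
    return dot / (nu * nv) if nu and nv else 0.0


def select_candidate(candidates, learner_vec, weights):
    """Return the text of the (text, vector) candidate that best matches the learner."""
    return max(candidates, key=lambda c: weighted_cosine(c[1], learner_vec, weights))[0]


candidates = [("worked example", [1.0, 0.0]), ("formal definition", [0.0, 1.0])]
learner = [1.0, 1.0]
print(select_candidate(candidates, learner, [0.9, 0.1]))
```

Changing the weights changes which candidate wins even though the candidates and the learner vector stay the same, which is how a shift in teaching focus can steer the final choice.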
In addition, the system regularly collects feedback on the generated output and adjusts the dynamic prompt template design and behavior encoder parameters in combination with the online learning mechanism to achieve continuous iterative optimization. The feedback data participates in model fine-tuning through the semi-supervised learning framework to improve the model’s sensitivity to personalized differences. This process combines gradient accumulation technology to balance training efficiency and data diversity.
3. Adaptive Generation Performance Evaluation
The educational corpus constructed for the experiments contains 152,347 text samples, covering three subjects (mathematics, physics, and English) and two educational levels (junior high and senior high). The sample types are distributed as follows: textbook explanations account for 41%, exercise stems for 29%, student answers for 18%, and teacher comments for 12%. All texts are anonymized to remove personally identifiable information.
Data preprocessing follows explicit rules. In the deduplication stage, cosine similarity is computed from TF-IDF features combined with BERT embeddings, with a threshold of 0.92. When the similarity between texts exceeds this threshold, they are treated as semantic duplicates and only the most complete version is retained. In the cleaning stage, perplexity-based filtering removes low-quality text fragments with perplexity values greater than 120. The text normalization rule base defines 187 transformation rules that standardize common subject-specific abbreviations, correct typical spelling errors, and unify the LaTeX format of mathematical formulas.
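The two thresholds above can be illustrated with a small sketch. It assumes that each text already has an embedding vector and a perplexity value computed elsewhere, and it uses text length as a simple proxy for "most complete version"; both choices are assumptions made for illustration, not the paper's stated procedure. The pairwise comparison is quadratic in the number of samples, so a corpus of this size would need an approximate nearest-neighbor index in practice.

```python
import math

SIM_THRESHOLD = 0.92   # similarity above this value marks a duplicate (from the text)
PPL_THRESHOLD = 120.0  # perplexity above this value marks low quality (from the text)


def cosine(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def deduplicate(samples):
    """Greedy deduplication over (text, embedding) pairs.

    Longer texts are visited first, so when two texts are near-duplicates
    the longer one is kept (length as a proxy for completeness: assumption).
    """
    kept = []
    for text, emb in sorted(samples, key=lambda s: len(s[0]), reverse=True):
        if all(cosine(emb, kept_emb) <= SIM_THRESHOLD for _, kept_emb in kept):
            kept.append((text, emb))
    return kept


def filter_by_perplexity(texts, perplexities):
    """Keep only texts whose perplexity does not exceed the threshold."""
    return [t for t, ppl in zip(texts, perplexities) if ppl <= PPL_THRESHOLD]
```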
The key parameters for model training and inference are fixed as follows. The random seed for all training and evaluation procedures is fixed to 42 to ensure reproducibility. The base model is LLaMA-2-13B with FP16 precision. In LoRA fine-tuning, the rank r is set to 8 and the scaling factor α to 32; LoRA modules are injected only into the query and value projection matrices of all Transformer layers. The optimizer is AdamW with a weight decay of 0.01. Training uses a learning rate scheduler with linear warm-up and cosine decay, a maximum learning rate of 3 × 10^−4, and 500 warm-up steps. The batch size is 64, the maximum sequence length is 1024 tokens, and the gradient clipping threshold is 1.0. The learning rate in the instruction fine-tuning phase is 1 × 10^−4, and training runs for 3 epochs. The behavior encoder in the dynamic prompt mechanism is a 3-layer Transformer with a hidden dimension of 768.
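The learning rate schedule can be written down explicitly. The sketch below uses the stated maximum learning rate and warm-up length; the total number of training steps, the final learning rate (taken here as 0), and the choice to start the warm-up at 1/500 of the maximum rather than at exactly zero are assumptions, since the paper does not report them.

```python
import math

MAX_LR = 3e-4        # maximum learning rate (from the text)
WARMUP_STEPS = 500   # warm-up length (from the text)


def learning_rate(step, total_steps, min_lr=0.0):
    """Linear warm-up followed by cosine decay.

    step is 0-based; total_steps and min_lr are assumptions for illustration.
    """
    if step < WARMUP_STEPS:
        # Ramp linearly so that the last warm-up step reaches MAX_LR.
        return MAX_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (MAX_LR - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical total of 10,000 steps, the rate rises to 3 × 10^−4 at the end of warm-up, falls to half of that at the midpoint of the decay phase, and reaches the minimum at the final step.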
The core training procedure is as follows. After model initialization, the pre-trained weights are loaded and all parameters are frozen. LoRA modules are then inserted in parallel with the specified linear layers. During forward propagation, the output of the low-rank increment matrices is added to the output of the frozen original weights. The loss function is the cross-entropy between the model output and the target text. Backpropagation updates only the LoRA parameters and the task embedding vector. The KL divergence of the frozen-layer activations is computed every 500 steps to monitor training stability. After training, the adapter parameters are merged into the original model weights and the separate adapter modules are removed, yielding a lightweight model for inference.
The code framework, data processing scripts, and model configuration files used in this study are available in an open-source repository. The implementation is based on PyTorch 2.6 and the Hugging Face Transformers library. The codebase includes scripts for web crawling, TF-IDF and BERT-based deduplication, text normalization, and building structured instruction–output pairs. It also includes configuration files for the LLaMA-2-13B base model and for all hyperparameters (for example, batch size 64, maximum sequence length 1024, and LoRA alpha 32).
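The LoRA forward pass in the training procedure above can be sketched without any deep learning library. The code below is a simplified illustration on plain Python lists, not the repository's PyTorch implementation: it computes h = Wx + (α/r)·B(Ax) for a frozen weight matrix W and trainable low-rank factors A and B, and it also shows the standard LoRA identity that the low-rank update can be folded into W so that no separate adapter branch is needed at inference. The function names are hypothetical.

```python
def matvec(M, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def matmul(P, Q):
    """Matrix-matrix product for list-of-rows matrices."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]


def lora_forward(W, A, B, x, alpha=32):
    """h = W x + (alpha / r) * B (A x).

    W: d_out x d_in frozen weights; A: r x d_in and B: d_out x r are trainable.
    The rank r is read from the shape of A (the paper uses r = 8, alpha = 32).
    """
    scale = alpha / len(A)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]


def merge(W, A, B, alpha=32):
    """Fold the low-rank update into the weights: W' = W + (alpha / r) * B A."""
    scale = alpha / len(A)
    BA = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, BA)
    ]
```

If B is initialized to zeros, as is common for LoRA, `lora_forward` returns exactly the frozen model's output at the start of training; after training, `matvec(merge(W, A, B), x)` gives the same result as `lora_forward(W, A, B, x)`.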
To verify the effectiveness of the proposed method, performance was evaluated along several dimensions. The content quality assessment examined how well the model output aligns with educational objectives in terms of language and semantics. The personalization matching assessment measured how well the generated content suits the individual characteristics of learners. The multi-task adaptability assessment examined the model’s generalization ability across different teaching scenarios. The response time and resource utilization assessments addressed the feasibility of the method in practical deployment.