1. Introduction
The landscape of higher education has undergone significant transformations in recent years, driven by a combination of technological advancements, changing student demographics, and evolving pedagogical theories [1]. Traditional, lecture-based instruction is increasingly being supplemented with, or replaced by, more flexible, student-centered approaches that leverage digital resources and online platforms. This shift has gained momentum as higher education institutions worldwide strive to meet the demands of a new generation of learners who seek personalized, adaptive, and on-demand learning experiences [1,2,3]. In this context, the integration of artificial intelligence (AI) into teaching and learning processes has emerged as a promising avenue for enhancing educational quality, efficiency, and accessibility.
A key development in this domain is the concept of “AI textbooks,” which harness AI-driven language models and intelligent tutoring systems to provide dynamic, contextually relevant content to learners [2,3]. Unlike traditional textbooks that present static information, AI textbooks can offer personalized pathways, generate real-time feedback, and supply supplementary materials tailored to the learner’s current understanding and knowledge gaps [4,5]. This educational innovation aligns well with contemporary pedagogical frameworks that emphasize active learning, self-regulation, and continuous engagement [6]. Self-regulated learning (SRL) theory, widely recognized in educational psychology and instructional design, highlights the importance of learners taking an active role in setting goals, selecting appropriate learning strategies, monitoring their own progress, and reflecting on outcomes [7]. By incorporating AI-based tools, learners can independently seek targeted information, practice problem-solving skills, and receive immediate insights into their performance, thereby strengthening their SRL competencies.
In higher education, the domain of health sciences has also begun to integrate these advanced teaching and learning methods. Health sciences education often requires the acquisition of complex knowledge and practical skills related to patient care, clinical decision-making, and the use of assistive technologies [8]. Traditional textbooks in this field may not adequately support the nuanced, individualized, and rapidly evolving nature of healthcare knowledge. AI textbooks, equipped with generative models such as Llama 3.1 and domain-specific large language models (e.g., kyujinpy/Ko-PlatYi-6B), can dynamically curate educational content, enabling learners to access up-to-date information and apply it to case-based scenarios. This adaptability is crucial in disciplines where new treatments, tools, and standards of care frequently emerge.
The rapid evolution of the 21st century information society and the influence of the COVID-19 pandemic have driven significant changes in educational formats and content, with students becoming accustomed to digital learning environments and resources [9]. Advances in AR, VR, and AI have reshaped how educational content is delivered, facilitating personalized and visually enriched learning experiences. Against this global backdrop, countries such as Japan, Canada, and the United States have developed platforms and cloud-based resources that guide Korea’s digital textbook policies and practices. In Korea, digital textbooks have been promoted since 2007, gaining momentum with the 2015 Revised Curriculum and demonstrating their contribution to key future competencies, prompting various education offices to incorporate digital infrastructure and system transitions into their plans [10].
Digital textbooks in Korea are conceptualized not merely as digitized print versions but also as integrated tools that combine the essential elements of user-friendliness, utility, system quality, content reliability, and interactivity to maximize learning effectiveness [11]. These resources support multimedia integration, personalized learning pathways, learning analytics, and the establishment of learning communities. Despite these advancements, concerns such as screen exposure and technical issues persist, highlighting the need for infrastructure development and policy support. As Korea prepares to introduce AI-powered digital textbooks by 2025, the urgency for a continuous digital learning ecosystem from primary through to higher education grows, yet explicit policies and research in tertiary contexts remain insufficient [9,10].
The higher education setting faces unique challenges, including the substantial cost of implementing advanced features such as diagnostic dashboards, personalized multimedia, and cloud-based data storage, which all lack strong policy backing [11]. In the absence of government-level plans, employing open AI models presents an efficient alternative that can be used to enhance the self-directed learning competencies facilitated by digital textbooks. By leveraging these models, universities can maintain pedagogical continuity and further develop learner-centered, technologically enriched educational environments that reflect the evolving expectations of digital literacy and innovation in the modern information society [9,10,11].
Against this backdrop, the present study compares the academic outcomes and learning experiences of students who engage with an AI-driven, self-regulated learning environment, the “KBU AI Textbook Project”, versus those who rely on traditional paper-based textbooks. Focusing on a cohort of 86 undergraduate students enrolled in assistive technology courses within a health sciences field, the aim of this study was to evaluate whether the integration of generative AI-driven resources leads to improvements in learning achievement and supports the cultivation of self-regulatory skills. Through a rigorous experimental design and mixed-methods evaluation process, this research addresses a critical gap in the literature by providing empirical evidence on the impact of AI textbooks in a specialized and highly practical field of study.
Employing both quantitative and qualitative methodologies, this study aimed to provide valuable insights for educators, curriculum developers, and policymakers regarding the practical implications and pedagogical benefits of incorporating AI-based learning tools into higher education. The central hypothesis posits that instruction delivered using the “AI textbook”, a tool built on the generative AI model Llama, will foster superior academic achievement among students in their early twenties, a cohort increasingly accustomed to digitally mediated learning environments. These findings are anticipated to inform future strategies for employing generative AI models to enhance learner-centered instruction, thereby contributing to the ongoing advancement of educational practices in the health sciences and related domains.
2. Methods and Procedures
2.1. Participants
The experiment was conducted with 86 students enrolled in an assistive technology course who voluntarily consented to participate and maintained consistent attendance throughout the study. The participants were randomly assigned to groups using an Excel-based randomization macro (Microsoft Excel, Microsoft 365 Apps for Business, Version 2021, Microsoft Corporation, Redmond, WA, USA), executed three times to ensure unbiased allocation. Of the initial 132 students, the following were excluded due to ineligibility under the research criteria: 26 who did not provide voluntary consent, 12 who withdrew from the course midway, and 8 who missed more than two class sessions (Figure 1). The participants were divided into two groups: the Traditional Learning Group (TLG), which studied using printed paper-based materials, and the AI Learning Group (ALG), which utilized AI textbook-based learning. All processes were conducted under the oversight and strict supervision of the Office of Academic Affairs.
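For transparency, the allocation logic can be illustrated in code. The study itself used an Excel macro, so the following Python snippet is only an assumed, minimal equivalent of the random split, not the actual allocation procedure:

```python
import random

# Illustrative re-implementation of the random group allocation
# (the study itself used an Excel-based randomization macro, run three times).
def assign_groups(student_ids, seed=42):
    """Shuffle the eligible students and split them into two equal groups."""
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)  # fixed seed shown only for reproducibility
    half = len(ids) // 2
    return {"ALG": ids[:half], "TLG": ids[half:]}

groups = assign_groups(range(1, 87))  # 86 eligible students
print(len(groups["ALG"]), len(groups["TLG"]))  # -> 43 43
```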
A homogeneity test was conducted to evaluate the comparability of the AI Learning Group (ALG) and the Traditional Learning Group (TLG) on general characteristics such as age, GPA, gender distribution, year level, and course type; the results demonstrated no significant differences between the two groups. For age, the mean in the ALG was 26.26 ± 7.20 years, while the TLG mean was 24.79 ± 3.47 years; a t-test yielded t = −1.20 and p = 0.2341, indicating no statistically significant difference. Regarding GPA, the ALG had a mean of 4.06 ± 0.41 compared with the TLG’s mean of 3.97 ± 0.44; the t-test for GPA yielded t = −1.03 and p = 0.3081, confirming no significant difference in academic performance. In terms of gender, the ALG consisted of 26 females and 17 males, while the TLG comprised 27 females and 16 males; a chi-squared test yielded χ² = 0.123 and p = 0.726, showing no statistically significant difference in gender distribution between the groups. Both groups were composed entirely of third-year undergraduate students enrolled in the same course type, ensuring consistency in academic level and program structure. These results suggest that the random assignment was successful and that the two groups were comparable, meaning that between-group comparisons in subsequent analyses can be considered fair (Table 1).
2.2. Programs and Interventions
This study employed a pretest–posttest control group experimental design, utilizing a mixed-methods approach that integrated quantitative and qualitative data. The research was conducted over 15 weeks, corresponding to one academic semester. The participants were divided into two groups: the Traditional Learning Group (TLG) and the AI Learning Group (ALG).
The study focused on a single course titled “Assistive Technology in Occupational Therapy,” offered to third-year undergraduate students in the Department of Occupational Therapy within the College of Health Sciences. The course spanned 15 weeks, aligning with a standard semester structure, and included both lecture-based and practical laboratory sections. The course was designed to introduce students to the assistive technologies relevant to occupational therapy practice, emphasizing the evaluation, selection, and application of such technologies in clinical and community settings. It was taught by a full-time faculty member who holds both a licensed occupational therapist certification and a nationally recognized assistive technology practitioner qualification, ensuring professional expertise and credibility in delivering the course content. All students enrolled in the course had a similar academic background in health sciences and were at an intermediate stage in their undergraduate program, providing a relatively uniform baseline knowledge of health science concepts. The single-course format allowed for consistency in content delivery and assessment, minimizing the variability that might arise from multiple courses or sections. This structure also facilitated the development of a cohesive learning experience, which was critical for evaluating the study outcomes. Although individual differences in the students’ familiarity and comfort with technology were noted, the uniformity in academic level, course content, and instructor expertise provided a reliable context for analyzing the impact of the course on students’ learning outcomes.
The TLG followed a conventional instructional method, using printed paper-based textbooks, lecture notes, and assignments in a sequential, instructor-led manner. In contrast, the ALG engaged in self-directed learning facilitated by an AI textbook powered by Llama 3.1 together with the kyujinpy/Ko-PlatYi-6B Korean language model. For the ALG, the professor uploaded lecture notes and instructional materials to the AI platform prior to each class. Students accessed the content in real time using laptops, tablets, or mobile devices and engaged in independent, interactive learning.
Each class session comprised three hours. During the first hour, students in the ALG conducted pre-class preparation by reviewing the uploaded instructional materials. The following two hours were devoted to the main lecture, delivered uniformly to both groups to ensure ethical treatment and equal learning opportunities. After class, students were encouraged to allocate their time flexibly for post-class review. The professor’s role in the ALG, beyond the two-hour lecture, was limited to that of a facilitator, resolving technical issues and answering questions. Instruction was led by a certified assistive technology specialist and full-time faculty member with over 10 years of experience, ensuring the quality and consistency of the content delivered.
The instructional approach differed between the two groups. Students in the TLG studied sequentially using printed materials, progressing linearly through the content. Conversely, the ALG engaged in a dynamic and interactive learning process, actively searching for relevant information, reasoning through the material, engaging in discussions, and completing quizzes via the AI platform.
To evaluate the intervention, a multi-dimensional assessment was conducted over the 15-week period. Quantitative methods included pre- and post-tests to measure knowledge acquisition, academic achievement assessments of task completion and project performance, and knowledge domain-specific evaluations to assess learning outcomes across different content areas. Additionally, qualitative data were collected through platform usage data (log data) in the ALG, which tracked learning frequency, keyword searches, and interaction patterns. Group discussions were also conducted separately for both groups to evaluate student satisfaction and identify areas for improvement in each instructional method.
By integrating both quantitative and qualitative evaluations, this study comprehensively examined the effectiveness of AI-based self-directed learning (ALG) compared with traditional instructor-led methods (TLG). The findings provide insight into academic achievement, learning engagement, and overall student experiences across the two instructional approaches (Figure 2).
This study was intentionally conducted within the context of a mandatory course to reflect a real-world educational setting where teaching methods would be naturally applied. The preselection of students was based solely on their academic year and enrollment in the program, without additional criteria such as prior academic performance or technological familiarity, thereby ensuring a diverse range of student profiles and mitigating selection bias. The random assignment of students into the ALG and TLG further ensured comparability between the groups and minimized potential bias. Importantly, the students were unaware of the broader research objectives, and the teaching methods were integrated as part of the standard curriculum, ensuring that the students’ awareness did not influence their performance.
To maintain consistency and eliminate potential biases among instructors, the entire course was delivered by a single instructor following a standardized practical manual developed as part of a national university innovation project. This manual provided detailed, step-by-step guidelines for delivering the course content, ensuring that both groups were exposed to identical content and procedures, differing only in the teaching methods (AI-based or traditional). The standardized manual minimized the influence of individual teaching styles, and it is expected that similar results would be obtained even if the course were delivered by a different instructor.
Peer interactions and shared learning scenarios, which are natural in educational settings, were not discouraged since they reflect authentic classroom dynamics. However, the distinct teaching methods were robustly designed to withstand these interactions, and the evaluation focused on individual learning outcomes rather than group dynamics to ensure validity. Additionally, standardized assessments and consistent grading criteria guaranteed fairness in evaluation, leading to a reliable comparative analysis.
This study employed rigorous scientific safeguards, including random assignment, homogeneity testing to confirm baseline comparability, and the use of mixed methods (quantitative and qualitative analysis) to comprehensively evaluate outcomes. These measures ensured the reliability and generalizability of the findings. By using a single instructor, adhering to a standardized manual, and employing a robust methodological framework, this study minimized any potential biases and provided valuable insights into the effectiveness of AI-based and traditional teaching methods in occupational therapy education.
2.3. Instruments and Data Collection
In this study, several versions of the Llama model, including the latest Llama 3.1 (Meta AI, Menlo Park, CA, USA; 8B and 70B parameter variants), were implemented in a local environment and refined as Version 2. The Llama model, developed by Meta, is a large-scale language model characterized by its lightweight architecture, which enables efficient operation even with relatively limited computational resources. The model was downloaded from the Hugging Face repository in the GGUF file format and installed locally. The Ollama platform was employed to facilitate the efficient execution of the model. The parameter configuration and template settings were optimized to enhance output quality, which played a critical role in building an efficient AI service within a local environment.
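As an illustration of this setup, the snippet below queries a locally hosted Llama 3.1 model through the Ollama Python client; the model tag and decoding options are illustrative assumptions rather than the study's exact configuration:

```python
import ollama  # pip install ollama; assumes `ollama pull llama3.1:8b` has been run locally

# Send a single question to the locally served model. The options dict mirrors
# the kind of parameter tuning described above; the values are illustrative.
response = ollama.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "Summarize the lecture section on wheelchair seating."}],
    options={"temperature": 0.2, "num_ctx": 4096},
)
print(response["message"]["content"])
```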
To extend the Llama model into a remote service format, the LangChain framework was utilized, allowing the seamless deployment of the AI model as a web-based service. A LangChain server was established to deploy the Llama model as a web service through simple commands, configuring the model as a “Remote Runnable” to enable external accessibility. Additionally, ngrok was implemented for port forwarding, providing a fixed domain that allowed external users to access the service. This approach contributed to cost-effective infrastructure development and improved the remote usability of the model.
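A minimal sketch of this deployment, assuming a LangServe-style server in front of the local model (the route path and port are hypothetical), might look as follows:

```python
# server.py -- minimal sketch of exposing the local Llama model as a web service
from fastapi import FastAPI
from langchain_community.llms import Ollama
from langserve import add_routes

app = FastAPI(title="AI Textbook LLM Service")
llm = Ollama(model="llama3.1:8b")      # the locally hosted model described above
add_routes(app, llm, path="/llama")    # exposes /llama/invoke, /llama/stream, ...

# Run with:  uvicorn server:app --port 8000
# Then:      ngrok http 8000   (provides the fixed public domain mentioned above)
```

On the client side, the deployed model can then be consumed as a Remote Runnable, e.g., `RemoteRunnable("https://<ngrok-domain>/llama").invoke(question)` from the langserve package.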
Experimental evaluations were conducted to demonstrate the Llama model’s diverse applications. First, in text generation and conversational AI functionalities, the locally hosted Llama model demonstrated its capability to respond to natural language queries swiftly and accurately, exhibiting excellent performance in conversational tasks. Second, utilizing LangChain for PDF-based information retrieval, the model accurately extracted and inferred specific sentences within PDF files. Lastly, GPU usage monitoring revealed that the Llama model efficiently utilized GPU resources while performing inference tasks in the local environment. These findings highlight the Llama model’s practical viability for efficient deployment in resource-constrained settings and its effectiveness across multiple AI applications, including text generation, information retrieval, and computational resource management (Figure 3).
This study was conducted to develop a practical model for AI-based digital textbook implementation in universities, specifically aiming to provide open-source technology solutions that are accessible and free for students who lack external funding. To achieve cost efficiency, Llama and LangChain were employed to extend a local large language model (LLM) into a remote service format. This approach was adopted to deliver a low-cost solution for integrating AI technologies into educational settings, aligning with the goal of enhancing AI accessibility in resource-constrained environments.
The decision to utilize Llama instead of OpenAI’s ChatGPT was based on ChatGPT’s observed limitations. Although powerful, ChatGPT often retrieves excessive data from pre-trained web sources, which can introduce inaccuracies in specialized fields such as assistive technology courses. For instance, the term “leaf”, as a component of assistive devices, could be incorrectly interpreted as “a tree leaf”, demonstrating a generative AI error. To mitigate this, Llama was configured to incorporate the lecture materials as its primary knowledge base, ensuring that the responses were confined to the content within the provided curriculum.
Consequently, this study focused on designing a system in which large language models such as Llama or ChatGPT would generate answers strictly based on uploaded lecture content. To implement this, Retrieval-Augmented Generation (RAG) and vector databases were employed. RAG operates by first retrieving relevant information from an uploaded document (PDF) and then generating a response based on the retrieved data. The implementation process began with PDF preprocessing, where PDF files were converted into text and segmented into smaller chunks. In the next step, these text chunks were transformed into vector embeddings and stored in a vector database such as Pinecone or ChromaDB. Sentence-BERT was used to generate the embeddings, with FAISS supporting similarity search over them.
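The indexing stage of this pipeline can be sketched with LangChain and ChromaDB as follows; the file name, chunk sizes, and embedding model are assumptions for illustration:

```python
# Indexing stage: PDF -> text chunks -> vector embeddings -> vector database
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# 1. PDF preprocessing: convert the uploaded lecture file into text pages
pages = PyPDFLoader("week03_lecture_notes.pdf").load()  # hypothetical file name

# 2. Segment the text into smaller, overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(pages)

# 3. Embed each chunk with a Sentence-BERT model and store it in ChromaDB
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(chunks, embeddings, persist_directory="./textbook_db")
```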
During the query stage, user questions were converted into vectors, and the system searched for the most relevant chunks in the vector database. Finally, the retrieved text was fed into the Llama model, which generated answers strictly based on the retrieved content. The tools and frameworks used in the implementation included LangChain (an LLM orchestration library), Hugging Face Transformers, and vector databases such as Pinecone and ChromaDB.
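Continuing the indexing sketch above, the query stage can be assembled in the same hedged spirit; the retrieval depth and query text are illustrative:

```python
# Query stage: embed the question, retrieve the most relevant chunks,
# and have the model answer strictly from the retrieved lecture content.
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

retriever = vectordb.as_retriever(search_kwargs={"k": 4})  # top-4 most similar chunks
qa = RetrievalQA.from_chain_type(
    llm=Ollama(model="llama3.1:8b"),
    retriever=retriever,
    return_source_documents=True,  # expose which lecture passages grounded the answer
)
# Illustrative Korean query ("the role of pressure-relief cushions in assistive technology")
result = qa.invoke({"query": "보조공학에서 욕창 방지 쿠션의 역할은?"})
print(result["result"])
```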
This approach ensured that the participants could access accurate and curriculum-specific information while reducing the risk of erroneous interpretations. The designed system successfully prioritized precision and relevance, allowing the students to extract reliable knowledge from the uploaded educational materials. For the study, a Korean text-generation model was essential due to the nature of the research, requiring an autoregressive language model capable of producing natural Korean sentences. For this purpose, the Ko-PlatYi-6B model was employed. Ko-PlatYi-6B is an autoregressive language model designed specifically for Korean text generation, based on the Yi Transformer architecture. The model builds on beomi’s Yi-Ko-6B and was further trained on the KOR-OpenOrca-Platypus-v3 dataset by kyujinpy (Figure 4).
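A minimal sketch of loading and querying this model with Hugging Face Transformers follows; the prompt and decoding parameters are illustrative assumptions:

```python
# Loading the Korean text-generation model from the Hugging Face Hub
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kyujinpy/Ko-PlatYi-6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"  # fp16 to fit consumer GPUs
)

# Illustrative Korean prompt ("The role of pressure-relief cushions in assistive technology is")
prompt = "보조공학에서 욕창 방지 쿠션의 역할은"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```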
The model’s performance was evaluated across multiple benchmarks, achieving an average score of 49.97. Detailed results include scores of 43.00 on ARC, 53.55 on HellaSwag, 46.50 on MMLU, 40.31 on TruthfulQA, and 66.47 on CommonGen-V2. The model was implemented using PyTorch (Meta AI, USA) and the Transformers library and is available on the Hugging Face Hub. Its CC BY-NC-SA 4.0 license allows free use for non-commercial purposes, making it an ideal choice for integration into this research.
The rapid advancements in large language models (LLMs) have led to an increased demand for cloud-based AI services. However, the high cost of such services remains a significant barrier for individual developers and small enterprises. To address this issue and provide students with opportunities to learn and experiment without incurring expenses, Llama models were deployed locally. Furthermore, using LangChain, these models were extended into a remote service format, offering a cost-effective and scalable solution for educational settings.
The experimental environment for this study was configured with Ubuntu (kernel 6.8.0-47-generic), a 13th-generation Intel Core i9-13900K CPU with 94 GiB of memory, and two NVIDIA GeForce RTX 4090 GPUs. This robust setup facilitated efficient model execution, enabling the students to access powerful AI capabilities in a low-cost, resource-optimized environment (Table 2).
The AI-Powered Intelligent Textbook System enables students to engage in personalized, self-directed learning that is tailored to their individual preferences, proficiency levels, and academic goals. Through the integration of advanced large language models (LLMs), such as Llama, the platform allows students to upload course materials (e.g., PDFs or text files) and interact with them dynamically. This process facilitates precise content extraction, targeted explanations, and context-specific feedback, empowering learners to navigate course content at their own pace and level of understanding.
The process begins with the selection of appropriate model parameters, such as the desired text length, report language, and complexity level, ensuring that outputs meet specific academic requirements. The uploaded documents are processed through a Retrieval-Augmented Generation (RAG) mechanism, in which the system retrieves pertinent content and generates responses confined to the uploaded material. By minimizing irrelevant outputs and focusing solely on lecture-related information, the platform enhances clarity and relevance for learners. This approach reduces reliance on generalized AI while providing a targeted and cost-effective learning solution suitable for higher education environments.
The system supports skills such as problem-solving, reasoning, and knowledge validation through interactive activities, including quizzes and personalized exercises. Students can pose questions, test their comprehension through generated quizzes, and engage in guided reasoning tasks designed to challenge their critical thinking abilities. By offering adaptive learning pathways, the platform encourages flexible, self-directed study and allows learners to address their unique academic needs. This AI-assisted approach fosters autonomous learning, enabling students to achieve deeper mastery of the subject matter while developing essential problem-solving and analytical skills (Figure 5).
2.4. Data Analysis
This study employs both quantitative and qualitative data analysis to comprehensively evaluate the effectiveness of AI-driven educational tools in enhancing academic performance and understanding student experiences.
For the quantitative data, academic achievement is assessed through pre-tests, post-tests, and task or project scores. The pre-test is conducted prior to the start of the course to measure the students’ baseline knowledge and ensure equivalence between the groups. The post-test incorporates scores from mid-term and final examinations, serving as the central indicators of academic outcomes. Additionally, task and project results are considered supplementary performance measures. The assessment tools are designed to align with curriculum objectives and consist of validated test items to ensure reliability and content validity. For the ALG, platform log data, such as AI-based textbook usage time, search frequency, and query response levels, are analyzed. These data are quantified to explore the correlation between learning behaviors and academic achievement, providing insights into the impact of the AI platform on student performance.
For the qualitative data, surveys and focus group interviews are conducted to gain a deeper understanding of the student experience. Surveys measure general learning satisfaction, self-efficacy, and learning strategy usage using a 5-point Likert scale. Open-ended questions allow the students to elaborate on their experiences with the AI-based textbook, including challenges faced, usability concerns, and suggestions for improvement. Focus group interviews involve six to eight participants from both the TLG and the ALG, selected to provide diverse perspectives. The interviews focus on themes such as self-directed learning experiences, ease of information retrieval, and changes in learning motivation. The satisfaction survey is administered separately via a Structured Satisfaction Questionnaire, a more structured and systematic instrument for qualitative evaluation. By integrating quantitative performance metrics with qualitative insights, this study offers a holistic understanding of the effectiveness of AI-based tools, their influence on academic outcomes, and their role in supporting personalized, self-directed learning.
Statistical analysis was conducted using IBM SPSS Statistics 23 (IBM Corp., Armonk, NY, USA) for the quantitative data. An independent samples t-test was performed to compare the means between the two groups. A chi-squared test was conducted to assess the homogeneity of the categorical variables, such as gender distribution, between the AI Learning Group (ALG) and the Traditional Learning Group (TLG). The results confirmed that there were no significant differences between the groups, ensuring comparability and validating the randomization process. The qualitative data analysis was carried out using NVivo 10 (QSR International Pty Ltd., Melbourne, Australia), with a Matrix Coding Query employed to systematically analyze and categorize the data, guaranteeing a rigorous and structured approach to identifying patterns and relationships within the dataset.
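The two group-comparison tests can be reproduced outside SPSS; the sketch below uses SciPy with synthetic placeholder scores (not the study's raw data) and the gender counts reported in Table 1:

```python
# Equivalent group-comparison tests in Python (the study itself used SPSS 23)
import numpy as np
from scipy import stats

# Synthetic placeholder scores for illustration only, not the study's data
rng = np.random.default_rng(0)
alg = rng.normal(98.5, 5.0, 43)   # AI Learning Group, n = 43
tlg = rng.normal(98.8, 5.0, 43)   # Traditional Learning Group, n = 43

# Independent samples t-test for a continuous outcome (e.g., post-test score)
t, p = stats.ttest_ind(alg, tlg)
print(f"t = {t:.2f}, p = {p:.4f}")

# Chi-squared test of homogeneity for gender (rows: ALG, TLG; cols: female, male).
# Note: the statistic depends on whether a continuity correction is applied.
table = np.array([[26, 17], [27, 16]])
chi2, p, dof, _ = stats.chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.3f}, p = {p:.3f} (dof = {dof})")
```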
To establish the validity and reliability of the study, several strategies were implemented. Construct validity was secured through a review of the developed test items and survey questions by an expert panel consisting of faculty members and educational researchers, ensuring alignment with the study’s objectives. Reliability was assessed by evaluating the internal consistency of the assessment tools, including tests and surveys, using Cronbach’s alpha, which ensured the stability and consistency of the measurements. Internal validity was addressed by employing the random assignment of participants to experimental and control groups, while all lessons and assessments were conducted under identical conditions, such as the same environment and timeframe, to minimize confounding variables. External validity was considered by discussing the potential for extending the study to multiple academic years and similar courses within related disciplines, thereby allowing for the findings to be generalized across broader contexts.
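For reference, the internal-consistency statistic used here, Cronbach's alpha for k items, is defined as

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)
```

where \(\sigma^{2}_{Y_i}\) is the variance of item i and \(\sigma^{2}_{X}\) is the variance of the total score; values near or above 0.7 are conventionally taken to indicate acceptable consistency.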
3. Results
3.1. Quantitative Evaluation
The results of the quantitative evaluation of the TLG and ALG are presented in Table 3. For the pre-test, the mean scores were 98.73 ± 4.66 for the TLG and 98.14 ± 6.30 for the ALG, with no significant difference observed (p = 0.676). Similarly, the post-test scores showed a mean of 98.78 ± 5.15 for the TLG and 98.55 ± 4.86 for the ALG, indicating no statistically significant difference (p = 0.879).
In terms of learning duration, the TLG recorded an average of 119.53 ± 16.15 min, while the ALG averaged 117.20 ± 27.43 min. The difference between the two groups was not significant (p = 0.716). For course satisfaction, the TLG reported a mean score of 4.90 ± 0.43, while the ALG achieved 4.91 ± 0.39. The difference was minimal and not statistically significant (p = 0.909).
The statistical analysis yielded a Cohen’s d of 0.86, which by conventional benchmarks indicates a large effect size, suggesting practical significance for the intervention and underscoring the potential impact of the teaching method on learning outcomes.
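For clarity, Cohen's d for two independent groups is computed from the group means and the pooled standard deviation:

```latex
d = \frac{\bar{X}_1 - \bar{X}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
```

By Cohen's conventional benchmarks, d ≈ 0.2 is a small effect, 0.5 medium, and 0.8 large.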
Overall, these findings demonstrate that both groups performed similarly across all measured parameters, including pre-test and post-test scores, learning duration, and course satisfaction. The results suggest that the intervention methods applied to both groups yielded comparable outcomes without significant discrepancies (Table 3).
3.2. Qualitative Evaluation
Students’ experiences, difficulties, and suggested improvements in relation to the AI textbook were collected through systematic open-ended questions for each detailed item; the results are as follows. The qualitative analysis of the structured satisfaction survey, supported by the collected data, offers detailed insights into the students’ experiences with the AI-based textbook. The survey, administered to 86 participants, revealed high levels of satisfaction, as demonstrated by the average positive responses across various items. For instance, 62.5% of the respondents agreed that the course materials were clear and easy to understand, while 25% strongly agreed, indicating the overall effectiveness of the instructional design.
When evaluating the usefulness of AI-based tools, 62.5% of the participants found that the tools were supportive of their learning, and 25% strongly agreed with this sentiment. Similarly, 62.5% of the students acknowledged the applicability of the AI knowledge and skills acquired during the course to academic and professional contexts, with 25% strongly endorsing this point. These results highlight the alignment between the course objectives and the students’ perception of the value of the curriculum.
However, the survey also revealed areas for improvement. Although most students agreed that the AI-based tools contributed to personalized learning and schedule management, 12.5% of the respondents rated their experience as average, and a small proportion expressed dissatisfaction in open-ended feedback. Specific concerns included the need for additional support in utilizing the tools effectively and skepticism regarding the comparative efficiency of AI resources versus traditional materials.
The qualitative analysis conducted using NVivo 10 employed a Matrix Coding Query to systematically categorize and evaluate recurring themes in the participant feedback. The analysis identified 10 key themes, each reflecting distinct aspects of the AI-based textbook and its implementation in the educational context. These themes, along with their occurrence frequencies, provided detailed insights into both the strengths and challenges of the approach (Figure 6).
The most frequently mentioned theme was time efficiency, cited 15 times, highlighting the participants’ appreciation for the AI system’s ability to save time by enabling direct and quick access to specific content without the need for manual browsing through textbooks. Similarly, generational relevance emerged as a prominent theme (14 mentions), with students acknowledging that the digital-first approach aligns well with their familiarity and comfort with digital devices over traditional textbooks.
Practical activities were another notable theme, with 12 mentions emphasizing the usefulness of the hands-on exercises incorporated into the course. However, several participants (10 mentions) expressed concerns about errors in responses, including instances where incorrect or inconsistent answers caused confusion. Relatedly, the misinterpretation of answers was mentioned 11 times, underscoring the challenges faced when the AI system interpreted similar terms incorrectly, leading to potential misunderstandings. The theme of basic concept clarity appeared 8 times, with feedback indicating that although the content was engaging, additional foundational explanations for beginners could enhance the learning experience. Device dependency (9 mentions) also surfaced as a barrier, since students without access to tablets or laptops found it difficult to use the platform effectively on mobile phones, compounded by issues such as rapid battery depletion.
Participants also raised concerns about traditional methods versus AI-based approaches (7 mentions), questioning the efficacy of AI tools for exam preparation, particularly in fields such as national licensure exams that rely heavily on conventional study methods. Economic accessibility was mentioned 5 times, reflecting the disparity faced by the students who lacked access to the required digital devices. Additionally, peer pressure and comparison (6 mentions) highlighted the unintended consequences of public completion visibility, where students felt rushed or judged based on their progress compared with their peers.
AI textbook-based education introduces several critical considerations, including accelerated learning, significant reductions in the time required for study, and methods tailored to a generation of students more familiar with digital devices than traditional books. Additionally, personalized education becomes feasible, allowing faster learners to quickly access the desired information while providing slower learners, or those lacking foundational knowledge, with ample time and resources to learn at their own pace without falling behind. However, challenges persist: some students lack access to tablets or laptops (mobile phones are often impractical due to their small screen sizes); faster learners may experience feelings of relative deprivation, discomfort, and impatience owing to heightened attention from peers; and low-income students may be unable to afford the necessary devices. Although some systems may offer equipment rentals to address this issue, many students decline such options due to the stigma associated with receiving institutional support.
These factors underscore both the potential and the limitations of integrating AI-based educational tools. Although themes like time efficiency, generational alignment, and practical activities demonstrate the value of the system, challenges such as device dependency, misinterpretation, and economic barriers point to areas requiring further refinement.
4. Discussion
The integration of AI digital textbooks offers substantial opportunities to enhance educational practices; however, their successful implementation depends on overcoming several critical challenges. Personalized learning is a significant advantage of AI digital textbooks. These tools dynamically adapt to individual student needs by leveraging machine learning algorithms to analyze performance data, identify learning gaps, and provide tailored instructional materials [12,13]. Such an approach enhances the relevance of the content, fosters individualized learning experiences, and improves engagement and academic outcomes. Moreover, the ability of AI to provide real-time feedback allows learners to address weaknesses promptly, accelerating the learning process and fostering a more efficient educational experience [14].
Enhanced student engagement is another critical aspect of AI digital textbooks. By incorporating interactive elements such as automated question-answering, quizzes, virtual simulations, and augmented reality, these tools can transform passive reading into an engaging and participatory learning experience [14]. Such features cater to diverse learning styles, ensuring inclusivity and accessibility while promoting deeper comprehension and the long-term retention of knowledge. The gamification elements embedded within AI textbooks also provide a motivating and enjoyable experience for students, particularly in complex or challenging subject areas [13,14].
In addition to improving student engagement, AI-powered digital textbooks provide educators with valuable data insights. Learning analytics and performance metrics allow for a detailed understanding of how students interact with the content, enabling educators to refine teaching strategies and make data-driven decisions [15]. Predictive analytics further support the early identification of at-risk students, facilitating timely interventions to improve academic outcomes. These data-driven approaches empower educators to provide targeted support and enhance curriculum design, ultimately leading to more effective pedagogical practices [14,15].
However, several challenges must be addressed for the potential of AI digital textbooks to be fully realized. Comprehensive training programs are essential for equipping educators with the technical and pedagogical skills necessary to integrate AI tools into their teaching practices. Such training programs should emphasize digital literacy, the capabilities and limitations of AI, and the ethical implications of its use in education. Institutions must invest in professional development programs that encourage educators to adopt innovative technologies and rethink instructional approaches in an AI-enhanced environment [16].
The risk of students becoming overly dependent on AI tools must be mitigated. Although AI digital textbooks provide significant support, educators need to ensure a balance between AI assistance and opportunities for independent thinking. Instructional strategies should incorporate activities that foster creativity, critical analysis, and collaborative problem-solving, enabling students to apply their knowledge beyond the AI-generated content [16].
Data privacy and security remain critical concerns. The collection and storage of personal information through the use of AI tools necessitate robust cybersecurity measures and compliance with data protection regulations such as GDPR. Ethical considerations, including transparency in data usage and addressing potential biases in AI algorithms, are equally important. Developing a framework for ethical AI use in education is essential to safeguard sensitive information and build trust among students, parents, and educators [17].
The implementation of AI-based digital learning systems necessitates a comprehensive management framework to ensure the qualitative enhancement of AI textbooks. A critical initial step involves conducting a thorough requirements analysis to define user needs, functional capabilities, and non-functional expectations. Stakeholders, including educators, students, and educational institutions, require functionalities such as natural language understanding, question answering, and diagnostic assessments. Non-functional priorities such as responsiveness, accuracy, security, and scalability are equally essential. Additionally, environmental analyses ensure compatibility with operating systems, platforms, and network conditions, creating a robust foundation for system development and integration [18,19].
The selection and evaluation of AI models play a pivotal role in optimizing a system’s performance. Candidate models such as GPT-2, DistilBERT, and TinyBERT must be assessed through benchmarking against criteria such as accuracy, processing speed, and memory efficiency. This process enables the identification of models that are best suited to the educational objectives while ensuring seamless integration into existing infrastructures. Balancing model performance with resource efficiency is critical for achieving effective deployment in educational contexts [20].
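The benchmarking step described here can be sketched as a simple loop over candidate checkpoints; the model list, prompt, and metrics below are illustrative (distilgpt2 stands in for a distilled generative candidate, since DistilBERT and TinyBERT are encoder models that would instead be benchmarked on classification or QA tasks):

```python
# Rough size/latency comparison of candidate generative models (illustrative sketch)
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

for model_id in ["gpt2", "distilgpt2"]:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    n_params = sum(p.numel() for p in model.parameters())  # memory-footprint proxy

    inputs = tokenizer("Assistive technology helps", return_tensors="pt")
    start = time.perf_counter()
    with torch.no_grad():
        model.generate(**inputs, max_new_tokens=32)  # processing-speed proxy
    elapsed = time.perf_counter() - start

    print(f"{model_id}: {n_params / 1e6:.0f}M params, {elapsed:.2f}s for 32 tokens")
```

Accuracy would be assessed separately, for example against a held-out set of curriculum question-answer pairs.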
Data preparation is another cornerstone of a robust management framework. High-quality datasets, comprising textbooks, instructional content, and question-answer pairs, are crucial for training AI models. Data cleaning processes remove inconsistencies and noise, while augmentation techniques enhance dataset diversity to ensure robustness across various educational contexts. The structured division of datasets into training, validation, and testing sets ensures systematic performance evaluation and minimizes bias. This meticulous approach to data preparation underpins the system’s accuracy and relevance in delivering educational outcomes [21,22].
Diagnostic assessments are integral to tailoring learning experiences for individual students. By identifying foundational knowledge gaps and learning styles, adaptive assessments dynamically adjust question difficulty based on responses. Employing a mix of quizzes, projects, and subjective evaluations enriches the diagnostic process, offering a comprehensive understanding of each learner’s academic needs. These assessments enable the creation of personalized learning paths, fostering engagement and supporting targeted interventions to address weaknesses [23].
This comprehensive management framework, encompassing requirements analysis, model selection, data preparation, and diagnostic assessment, ensures the qualitative enhancement of AI textbooks. By focusing on robust system integration and adaptive learning capabilities, this framework lays the foundation for delivering effective and personalized educational experiences, ultimately contributing to the sustained improvement of AI-powered digital learning environments [24].
The use of generative AI in education has garnered significant attention due to its ability to enable personalized, real-time, and interactive learning. However, several challenges have been identified. As highlighted in this study, issues such as the provision of inaccurate information, a lack of access to the necessary electronic devices, the misinterpretation of pre-learned external data, and the inconvenience of frequent device battery recharging have emerged as notable concerns. Excessive reliance on AI tools may also hinder students’ critical thinking and problem-solving abilities.
Furthermore, AI-based learning systems often require the use of personal data, raising serious issues regarding data privacy and security. Ensuring transparency in data usage and compliance with regulatory standards are essential to address these concerns. The digital divide presents another significant challenge, as not all students have access to the required devices or a stable internet connection, thereby exacerbating inequities in learning opportunities.
Another limitation lies in the insufficient technical expertise among educators and the lack of training required to effectively integrate AI tools into teaching practices. Additionally, the potential for AI algorithms to reinforce biases in training data poses the risk of producing unfair outcomes for certain student groups. Despite the rapid adoption of AI technologies, the absence of clear regulations and ethical guidelines remains a pressing issue, undermining the responsible and equitable use of these tools in education.
Moreover, the high costs associated with the adoption and maintenance of AI systems present significant barriers for underfunded schools and institutions. Finally, although AI can offer personalized learning experiences, it lacks the emotional support and motivation typically provided by human educators. This limitation may lead to feelings of disengagement or isolation among students, reducing their overall interest in the learning process. Addressing these challenges requires collaborative efforts among educators, policymakers, technologists, and researchers to ensure the effective and ethical integration of AI into education.
To address any potential biases among instructors, the entire course was conducted by a single instructor. This deliberate approach ensured consistency in teaching styles and eliminated the variability that might arise from having multiple instructors. As a result, concerns about instructor-induced biases are not applicable to this study. The course was delivered using a standardized practical manual developed as part of a national university innovation project. This manual provided detailed, step-by-step guidelines for implementing the course content, ensuring uniformity across all sessions. The manual covered procedures for both the AI Learning Group (ALG) and the Traditional Learning Group (TLG), minimizing the influence of individual teaching styles. Even if the course were conducted by a different instructor, adherence to the rigorous guidelines in the standardized manual would likely produce consistent outcomes. By employing a single instructor and strictly adhering to the standardized manual, the study ensured fairness and consistency in the learning scenarios presented to the students. Both groups were exposed to identical content and procedures, with the only variation being the teaching method (AI-based or traditional). This design eliminated the possibility of unintended biases caused by differences in instructional approaches or content delivery.
The measures implemented in this study, including the use of a single instructor and adherence to a nationally developed standardized manual, reinforce the reliability and validity of the findings. These steps highlight the robustness of the experimental design and address any potential concerns regarding bias.
5. Conclusions
The integration of generative AI into education signifies a transformative paradigm shift, offering unprecedented opportunities for personalized, real-time, and interactive learning experiences. Notably, the AI textbooks promoted by the Ministry of Education in Korea have predominantly targeted elementary, middle, and high school levels.
This study, focusing on regular college courses, has uniquely explored the innovative potential and inherent challenges of implementing AI-based educational tools in higher education. It is particularly pioneering in its assessment of the applicability of generative AI to traditional university subjects, including assistive technology, a national examination subject that demands a high degree of precision. The findings of this study are as follows:
First, this study has provided a comprehensive analysis of the integration of AI-driven digital textbooks into higher education, specifically focusing on their implementation in health sciences education. The comparison between traditional pedagogy and AI-enhanced self-regulated learning methods has yielded valuable insights into the opportunities and challenges posed by generative AI technologies in academic settings. This section synthesizes the key findings, implications, and recommendations derived from the research, offering a robust perspective on the role of AI in reshaping educational paradigms.
Second, the quantitative results from the study indicated no statistically significant differences in academic achievement between the Traditional Learning Group (TLG) and the AI Learning Group (ALG). Although the AI-driven methods provided innovative approaches to learning, their immediate impact on measurable academic outcomes, such as test scores and project performance, was equivalent to that of traditional methods. This neutrality in results underscores the importance of contextual and pedagogical adaptations to optimize AI tools for improved academic results.
Third, the qualitative analysis revealed that AI textbooks offered substantial benefits in terms of fostering student engagement. Participants in the ALG expressed high levels of satisfaction with features such as real-time feedback, personalized learning pathways, and interactive content delivery. The adaptive nature of the AI platform allowed learners to explore the content at their own pace, contributing to a more engaging and self-directed educational experience.
AI digital textbooks facilitate customized learning pathways, enhance engagement through adaptive content, and provide educators with data-driven insights. However, their effective integration necessitates a multifaceted approach that addresses technical, ethical, and practical dimensions.
The key challenges include the development of robust frameworks to address digital disparities, ensuring the accuracy and reliability of AI-generated content, and mitigating the risks related to data privacy and security. Additionally, equitable access to technological resources remains a significant concern. The risks associated with algorithmic bias and over-reliance on AI tools highlight the need for human-centric educational strategies that preserve critical thinking and problem-solving skills. Furthermore, the financial burden of adopting and maintaining AI systems underscores the necessity for scalable, cost-effective solutions tailored to diverse educational contexts.
Maximizing the potential of generative AI in education requires the establishment of a comprehensive management framework. This framework must encompass rigorous training programs for educators, ethical guidelines for data usage, and mechanisms for the continuous evaluation and improvement of AI systems. Collaborative partnerships among key stakeholders—educators, technologists, policymakers, and researchers—are vital to aligning technological capabilities with educational objectives and fostering an environment that prioritizes innovation and inclusivity.
Future research should prioritize longitudinal studies to assess the long-term impact of AI-based educational tools on learning outcomes across disciplines and demographic groups. Additionally, fostering a socio-institutional environment that facilitates the rapid adoption and implementation of emerging technologies in higher education will be critical. Developing adaptive frameworks that balance technological advancements with the emotional and motivational aspects of human-centered education is equally essential. By addressing these considerations, generative AI can serve as a transformative force, creating a more equitable, engaging, and effective educational ecosystem that aligns with the needs of 21st century learners.