Analyzing the Alignment between AI Curriculum and AI Textbooks through Text Mining

Yang, Hyeji; Kim, Jamee; Lee, Wongyu

doi:10.3390/app131810011


by Hyeji Yang ¹, Jamee Kim ² and Wongyu Lee ¹,*

¹ Department of Computer Science and Engineering, Graduate School, Korea University, Seoul 02841, Republic of Korea

² Major of Computer Science Education, Graduate School of Education, Korea University, Seoul 02841, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(18), 10011; https://doi.org/10.3390/app131810011

Submission received: 20 July 2023 / Revised: 23 August 2023 / Accepted: 23 August 2023 / Published: 5 September 2023

(This article belongs to the Special Issue ICTs in Education)


Abstract:

The field of artificial intelligence (AI) is permeating education worldwide, reflecting societal changes driven by advancements in computing technology and the data revolution. Herein, we analyze the alignment between core AI educational curricula and textbooks to provide guidance on structuring AI knowledge. Text mining techniques using Python 3.10.3 and frame-based content analysis tailored to the computing field are employed to examine a substantial amount of text data within educational curriculum textbooks. We comprehensively examine the frequency of knowledge incorporated in AI curricula, topic structure, and practical tool utilization. The degree to which curriculum keywords are reflected in the textbooks and the distinguishing characteristics of each textbook are determined using Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) analysis, respectively. The topic structure distribution is derived by Latent Dirichlet Allocation (LDA) topic modeling, and the trained model is visualized using PyLDAvis. Furthermore, the variation in vertical content range or level is investigated by content analysis, considering the tools used to teach similar AI knowledge. Lastly, the implications for AI curriculum structure are discussed in terms of curriculum composition, knowledge construction, practical application, and curriculum utilization. This study provides practical guidance for structuring curricula that effectively foster AI competency based on a systematic research methodology.

Keywords:

artificial intelligence (AI) education; artificial intelligence (AI) curriculum; text mining; Term Frequency-Inverse Document Frequency (TF-IDF); Latent Dirichlet Allocation (LDA); content analysis

1. Introduction

In the era of digital transformation, computational thinking, artificial intelligence (AI) knowledge, and digital literacy are recognized as core competencies for students and professionals. Informatics education plays a crucial role in strengthening basic competencies in the computing field and cultivating essential individual capabilities in the modern world [1,2,3].

As a result of the social changes induced by the data revolution and the advances in computing technology, the AI field, in particular, has been increasingly incorporated into the educational systems of several countries worldwide. AI traces its roots back to the 1950s but has truly flourished over the past 13 years, alongside related computing fields like robotics [4]. Notably, there has been a growing focus on developing and reinforcing AI curricula in various organizations and countries, including the Association for Computing Machinery (ACM)/IEEE, UNESCO, the United States, India, China, and Korea [4,5,6,7,8,9,10]. The ACM introduced the Computing Curricula 2020 (CC 2020), which includes data science as a computing field and incorporates AI-related content within three of the eleven data science knowledge areas (artificial intelligence, data mining, and machine learning) and the AI body of knowledge within one of the eighteen computer science knowledge areas (intelligent systems) [4,5,6].

Regarding the K-12 standard curriculum, UNESCO published a K-12 AI Curriculum founded on competency-based education [3]. Additionally, the Association for the Advancement of Artificial Intelligence (AAAI) and the Computer Science Teachers Association (CSTA) jointly launched the AI for K-12 Students (AI4K12) initiative and introduced K-12 AI Guidelines based on the five big ideas for AI education [7]. At the national level, the CBSE of India has designated AI as a skill subject for senior secondary levels XI and XII [8], whereas China has implemented a mandatory elective module for high school students, titled “Preliminary Artificial Intelligence” (“人工智能初步”) [9]. In 2020, Korea established “Introduction to AI” as a career elective subject in high schools, leading to the development of eight types of “Introduction to AI” textbooks [10].

Curricula can be classified into two categories: national-level curriculum systems, in which curricula are developed and evaluated by the national government; and localized curriculum systems, in which curricula are structured by local governments or individual schools [11,12]. In K-12 education, school curricula are realized through textbooks, i.e., textbooks serve as the primary learning materials through which students engage with educational content in both national-level and localized curriculum systems. Consequently, textbooks constitute important indicators that guide teachers in the design of teaching and learning strategies.

Research on textbook analysis has adopted diverse approaches that involve multiple analysis and data consistency perspectives depending on the research objectives and academic characteristics of the subject. Studies have explored textbooks from a “knowledge perspective”, analyzing the discipline or subject matter addressed in the content, as well as from a “social perspective”, examining how educational content reflects social issues or cultural and historical changes. Most studies have focused on analyzing the textual content of textbooks. In cases where textbooks are developed and supplied based on a national-level curriculum system, the studies investigated the changes in content between textbooks published after different curriculum revisions. Previous research has employed various analysis methods, including “semantic analysis through content analysis”, “statistical analysis”, and “mixed methods research”. Applying information technology to textbook data opens new perspectives for fundamental questions in educational research. Consequently, text mining has emerged as an effective tool for analyzing textbooks. AI-driven natural language processing technologies enable frequency analysis of large amounts of text data in textbooks through the Term Frequency-Inverse Document Frequency (TF-IDF) method. Additionally, significance, concordance, and relevance can be objectively calculated through topic modeling using Latent Dirichlet Allocation (LDA).

Since curricula are influenced by social and cultural contexts, the composition and content of AI curricula will differ depending on the institutions or entities responsible for their development. Given the increasing integration and emphasis on AI from higher to K-12 computing education worldwide, it is crucial to investigate the knowledge composition of AI curricula. Therefore, this study aims to evaluate the alignment between AI curricula and textbooks by combining text mining technology and content analysis methods and to suggest directions for enhancing the knowledge composition of AI curricula.

2. Related Research

2.1. Trends in AI Education Content

The significance of education in developing AI competency, spanning from “higher education curriculum standards” to “K-12 curricula”, has received consistent attention in numerous countries globally. As AI is a relatively new field in computing education, the K-12 AI curricula announced by governmental and non-governmental entities in each country play a vital role in guiding different knowledge composition tasks within the educational domain, such as textbook development, teacher training, and lesson planning. To examine the direction of AI educational content, we can summarize the “AI-related knowledge areas in the computing standard curriculum” and the “K-12 AI educational content” as follows.

2.1.1. AI-Related Knowledge Areas in Structuring Standard Curriculum (Higher Education)

The artificial intelligence (AI) curriculum should be structured through a linkage between higher education curriculum standards and the K-12 curriculum. This approach is essential to reflect continuously evolving AI technology, the AI industry that embodies it, and the potential impact on AI program development.

Since the 1960s, the ACM/IEEE has supported global computing education by proposing higher education curriculum standards that align with the constantly evolving computing technology landscape. In 2020, the revision from CC 2005 to CC 2020 incorporated AI content within the field of computer science, specifically under the “intelligent systems (IS)” knowledge area (KA), one of the 18 KAs in the field [4,5,6]. In the data science field, AI content was included in three of the eleven KAs—“AI”, “data mining (DM)”, and “machine learning (ML)”. Detailed information is provided in Table 1.

In computer science, AI has been extensively studied in the IS domain as a solution to problems that are challenging or impractical to solve using traditional methods. AI finds widespread application in supporting everyday tasks and in designing and analyzing autonomous agents that interact rationally with their environment. These solutions rely on diverse knowledge-representation schemes, problem-solving mechanisms, and learning techniques. They encompass areas such as sensing, problem solving, acting, and the architectures required to support these intelligent agents.

AI encompasses methodologies that aim to model and simulate various human abilities recognized as aspects of intelligence within data science. The key themes of AI include perceiving, representing, learning, planning, and reasoning with knowledge and evidence. DM focuses on data processing, analysis, and presentation to extract valuable information. Clustering, classification, regression, pattern mining, prediction, association, and outlier detection are the essential types of analysis, with attention given to various types of data, including time series and web data. ML, also known as statistical learning, refers to a wide range of algorithms that identify patterns in data and use them to build prediction models. Data privacy, security, integrity, and analysis for security (DPSIA) are cross-cutting topics that apply to competencies across all KAs, particularly those that address privacy, security, and integrity concerns.

AI content within the field of computer science is structured using the “knowledge area, knowledge unit, and learning outcome” (KA-KU-LO) model, with each knowledge unit associated with a list of topics. The learning outcomes are based on Bloom’s taxonomy and modified to include three levels of mastery: familiarity, usage, and assessment. In the field of data science, AI content is structured around a competency framework that includes knowledge, skills, and dispositions (K-S-D). Skills are expressed through learning outcomes, utilizing Bloom’s level of cognitive processes to specify the level of skill required for successful task completion. Descriptive learning outcomes aid in the clarification of intended goals and facilitate the interpretation of outcomes within a specific curriculum context.

2.1.2. K-12 AI Curriculum Content

K-12 AI content can be categorized based on international standards and individual country curricula. International standards, such as the “K-12 AI curricula” developed by UNESCO (2022) and “K-12 AI guidelines based on five big ideas” promoted by AI4K12 (a collaboration between AAAI and CSTA), provide frameworks for K-12 AI education [3,7]. At the national level, various AI-related subjects, such as “AI” in the Indian CBSE school system (India’s CBSE curriculum is divided into academic and skill subjects. “Artificial Intelligence” was first organized as a skill subject in the 2019–2020 curriculum and has been revised four times in the “2020–2021”, “2021–2022”, “2022–2023”, and “2023–2024” curricula. This study targets senior secondary levels XI & XII in 2023–2024), “Preliminary AI” (“人工智能初步”) in China (The high school IT curriculum in China consists of 10 modules, including two required modules, six elective required modules, and two elective modules), and “Introduction to AI” in Korea (In China, India, and Korea, AI curricula and textbook development projects are led by various IT companies, public institutions, and regional initiatives) [8,9,10], have been incorporated into high school curricula.

Different approaches to competency are reflected in documents addressing K-12 AI educational content. UNESCO’s K-12 and India’s AI curricula consider “competency” from a competency-based framework perspective, as outlined in the CCDS 2021. Conversely, Korea’s “Introduction to AI” includes competencies such as “knowledge of information culture”, “computational thinking skills”, “cooperative problem-solving skills”, and “basic knowledge of AI”. In this study, “competency” is defined as the ability to accomplish a specific task, and the components fostering competency within the competency-based framework include “knowledge”, “skills”, “attitude”, and “values”.

K-12 AI international standards aim to provide a framework to guide standard writers and curriculum developers to understand AI concepts, essential knowledge, and skills at the grade level. The composition of K-12 AI international standards is presented in Table 2.

The analysis of the K-12 AI curricula and K-12 AI guidelines yields the following results:

The first aspect pertains to the composition of curricula.

The K-12 AI curricula are structured based on competency-based education, with specific competencies (“knowledge”, “skills”, “values”, and “attitudes”) that students must achieve. In other words, the curricula categorize students’ expected capabilities at each grade level based on their knowledge, skills, values, and attitudes in AI.

In contrast, the K-12 AI guidelines are organized around five big ideas (1: perception, 2: representation and reasoning, 3: learning, 4: natural interaction, and 5: societal impact). However, these guidelines do not include specific competency items as framework components. Instead, students are encouraged to develop comprehensive competencies related to the five big ideas by achieving learning objectives (LOs) and enduring understandings (EUs).

The second aspect relates to the expression of knowledge.

The K-12 AI curricula address the value dimension by incorporating the four categories (“personal”, “social”, “societal”, and “human”) proposed by the OECD (2019), expanding the perspectives that students should consider. In the context of AI, the curricula include aspects such as personal goals, interpersonal relationships, shared priorities within cultures or societies (which may be legally established), and shared priorities that transcend national and cultural borders.

The names of the knowledge and skill areas and domains reflect various perspectives. However, because this nomenclature mixes types of knowledge, concepts, and competencies, it may pose limitations when attempting to reconstitute knowledge based on the K-12 AI curricula.

The K-12 AI guidelines, organized around the five big ideas, facilitate curriculum restructuring by systematically organizing knowledge within this conceptual framework. This method supports the hierarchical and cohesive organization of KAs that comprise AI content within higher education curriculum standards, such as ISs in CS 2013 and AI, DM, and ML in CCDS 2021. Concepts are categorized using unique codes to enable the systematic management and utilization of knowledge related to the five big ideas.

The third aspect pertains to the achievement standard.

The K-12 AI curricula propose STEM education in learning outcomes, aiming to promote convergence between the computing field and other domains (computing + X), as proposed in CC 2020. This method enables students to explore transformative relationships between computing and non-computing fields.

The learning outcomes related to “knowledge”, “skills”, “values”, and “attitudes” in the K-12 AI curricula lack a clear framework. The learning outcome hierarchy is ambiguous because the outcomes were not developed based on cognitive process levels, as is the case with the skill-based learning outcomes in the CCDS 2021 or the three levels of mastery (familiarity, usage, and assessment) in the CS 2013. Although knowledge and skills share the same area and domain, the linkage between them is unclear because the grade-level engagement mapping separates knowledge-learning outcomes from skill descriptions without establishing their relationship.

In contrast, the K-12 AI guidelines propose LOs and EUs for various grade levels (K-2, 3–5, 6–8, 9–12) within the framework of the five big ideas. These guidelines specify what students should understand about AI and what they should be able to accomplish. However, they lack a systematic method or clear principle for aligning LOs, EUs, grade levels, and concepts.

The fourth aspect relates to teaching and learning.

The K-12 AI guidelines provide specific teaching and learning examples through activities and unpacked content while establishing grade-level connections. The curriculum’s lesson coherence is supported by teaching and learning materials that consider these grade-level connections.

In countries such as India, China, and Korea, where national-level curricula reflect educational policies or document characteristics, textbook development relies heavily on currently implemented curricula [8,9,10]. Consequently, the significance of the curriculum is particularly prominent in textbook development. Table 3 summarizes the composition of the national AI curricula of India, China, and Korea.

The results of the analysis of the AI curricula in India, China, and Korea are as follows:

The first aspect pertains to curriculum composition. National curricula systematically organize content with respect to level or area.

In India, the “AI” curriculum categorizes content based on levels, considering “what students must be capable of accomplishing”. Similarly, in China, the “Preliminary AI” curriculum organizes content into “basic knowledge acquisition”, “application module building”, and “application”, considering the progression of levels. The curriculum composition simplifies the structure into “area” and “content demand”.

The content of the “Introduction to AI” curriculum in Korea is similar to the conceptual composition of the five big ideas in the K-12 AI guidelines, which facilitates curriculum restructuring. The content system resembles the “knowledge-based learning” method of CS 2013: the highest content category represents the subject’s character, and the basic concepts or principles, the required learning content for each grade, and the content students need to meet the standards are systematically organized.

The second aspect pertains to the expression of knowledge.

In India’s “AI” curriculum, the competency items (knowledge, skills, and values) were concretized by unit within the competency-based framework, similar to CCDS 2021 or the K-12 AI curriculum. In data-driven storytelling, critical and creative thinking skills are separately addressed. This method is significant in establishing connections between AI readiness concepts, technical AI skills, and life skills related to AI, which are proposed as “skills to be developed”.

In China, the “Preliminary AI” curriculum covers the history of AI and its contribution to human progress, while emphasizing responsibility. It focuses on the impact of AI on individuals, society, and humanity from past, present, and future perspectives. Specific cases or experiences are used to highlight content acquisition.

The “Introduction to AI” curriculum in Korea aims to develop AI-based problem-solving skills in various fields and real-life situations. It focuses on developing the ability to understand fundamental AI concepts and principles, apply AI technology, and discuss AI ethics.

The knowledge embodied in computer science’s intelligent systems (ISs), data science’s artificial intelligence (AI), DM, and ML is integrated into the curriculum in a top-down manner. In Korea, for example, the fundamentals of AI include knowledge from ISs, such as natural language, perception, and computer vision.

The third aspect relates to the explicit perspectives of theory and practice.

In India’s “AI” curriculum, theory and practice are organized by unit and specifically addressed from examination and time perspectives. Units that can be assessed through a student’s practice process and results can be distinguished. At the unit level, lessons are designed to consider both theory and practice.

2.2. Research on Textbook Analysis Methods

Curricula and textbooks play essential roles in school education. Curricula provide guidelines for systematically organizing standards and selecting, organizing, executing, evaluating, and improving educational goals, perspectives, and content. Textbooks, on the other hand, are teaching and learning materials that concretely implement the curricula officially announced by the national government. They have long served as a standard medium of educational content and a source of insight into school education.

While the design level of curricula may vary at the national, local, and school levels, textbooks serve as the medium for delivering education in schools. As curricula are documented and subject content elements are expressed in textbooks used at schools, it is important to ensure that curricula are not accepted or interpreted differently by readers. Specifically, the meaning of content elements deemed important in the subject must be expressed in a manner that allows readers to fully comprehend the intended meaning.

Therefore, analyzing the content of textbooks is crucial for achieving optimal teaching and learning outcomes and implementing the educational goals of each subject. Textbook analyses have adopted various approaches depending on the academic characteristics and research objectives of the subject. Table 4 summarizes the research on textbook analysis methods.

First, textbook analysis reflects the perspective of “knowledge” or “society” related to subjects. Studies have been conducted to analyze the “presence”, “scope”, and “level” of academic knowledge relevant to each subject. By analyzing how social, cultural, and political elements are reflected in textbooks, researchers have examined content related to gender, safety education, neglected groups (gender, race, and ethnicity), and democracy. In terms of competency approaches, Wang et al. (2021) and Choi (2015) explored how “convergence education” and “problem-solving skills” are reflected in textbooks [16,28]. They observed that the curriculum composition of each subject mainly emphasized “knowledge composition by subject”, “reflection of social changes, cultural context, and history”, and “the relationship between competency and subject content”.

Second, the consistency of data analysis was examined. Ho (2021) and Chen et al. (2020) analyzed how policy directions are reflected in educational content [15,19]. Ho (2022) and Pinson (2021) investigated the flow of knowledge composition through the historical development of subjects [15,18]. In these studies, characteristics were deduced by comparatively analyzing two or more textbooks based on their development backgrounds, such as the publisher and curriculum revision period.

Third, various research methods have been employed depending on the research objectives. Content analysis, which involves coding based on analysis criteria and frameworks, is a commonly used method for analyzing textbooks. Qiao-Ping et al. (2021) proposed features for both horizontal and vertical analysis [17]. Horizontal analyses involve examining simple frequencies or general characteristics of textbook components, while vertical analyses assess how textbooks handle a concept and the extent to which they consider the concept’s knowledge system. However, the content analysis method has limitations in terms of objectivity and consistency since it relies on the researcher subjectively analyzing textbook information based on intuition. Therefore, the use of appropriate framework compositions and analysis methods that align with the target and purpose of analysis has attracted increasing interest.

Several studies have utilized text mining techniques for analyzing textbooks. Chen (2021) conducted keyword frequency analysis [19], Lucy (2020) employed topic modeling to deduce topics based on keyword association analysis in documents [21], and Yun et al. (2018) used network analysis to determine degree centrality between keywords [27]. When applying text mining to textbook analysis, it is crucial to select suitable preprocessing and analysis methods that align with the characteristics of the text data and the analysis objectives. Human inspection and insight are essential when interpreting the categories and patterns generated by automated text-mining algorithms. Therefore, Chen (2021) designed a mixed research method that integrates text mining with content analysis, thereby enabling a comprehensive analysis of the educational implications present in textbooks [19].

2.3. Analysis of Educational Data Using Text Mining

Text mining is a method that extracts semantic information by effectively processing unstructured data. It falls within the realm of natural language processing, a subfield of AI, and aims to uncover associations between topics and concepts by analyzing linguistic connections within textual data. Through quantitative analysis of word types, frequencies, and semantic relationships in large volumes of text, text mining unveils the characteristics and hidden patterns within the data [30,31,32]. This approach enables the objective analysis of educational data from multiple perspectives, shedding light on research topics and trends. Two key techniques in text mining are frequency analysis using TF-IDF and topic modeling using LDA [33,34,35].

TF-IDF is a statistical method that assigns weights to words in a document-term matrix (DTM) based on their importance. The TF component represents how frequently a word occurs in a document, whereas the IDF component is derived from the inverse of the number of documents in which the word appears, so that words common to most documents are down-weighted and words characteristic of particular documents are highlighted [33]. TF-IDF combines these values to assign weights to words in the document, as follows:

TF-IDF (t, d, D) = TF (t, d) × IDF (t, D),

$w_{x,y} = tf_{x,y} \times \log\left(\frac{N}{df_{x}}\right)$

where
TF (t, d) = (number of occurrences of term t in document d)/(total number of terms in document d),
IDF (t, D) = ln [(total number of documents in the corpus)/(number of documents with term t)].
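The TF and IDF definitions above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the toy corpus, the function names, and the assumption that documents are already tokenized are all hypothetical and serve only to show how the weights are computed.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """TF(t, d): occurrences of the term divided by the document length."""
    return Counter(doc_tokens)[term] / len(doc_tokens)


def idf(term, corpus):
    """IDF(t, D): natural log of (corpus size / documents containing the term).

    Assumes the term occurs in at least one document of the corpus.
    """
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / containing)


def tf_idf(term, doc_tokens, corpus):
    """TF-IDF(t, d, D) = TF(t, d) * IDF(t, D)."""
    return tf(term, doc_tokens) * idf(term, corpus)


# Hypothetical toy corpus: each "document" is a pre-tokenized list of words.
corpus = [
    ["data", "learning", "model", "data"],
    ["ethics", "society", "data"],
    ["search", "inference", "learning"],
]

# Weights of every distinct term in the first document.
scores = {term: tf_idf(term, corpus[0], corpus) for term in set(corpus[0])}
```

In this toy corpus, "data" has the highest TF in the first document but also appears in a second document, so its weight falls below that of "model", which occurs only in the first document. A word that appears in every document receives an IDF of zero.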

Topic modeling is commonly employed for topic analysis and trend research in specific domains. It assumes that each text document comprises multiple topics and clusters the associations between words within the document to uncover the underlying semantic structure. This method facilitates the extraction of topics from the text and identifies the most relevant terms associated with each topic [34,35].

One of the various topic modeling methods available is LDA, which is a probabilistic graphical model proposed by Blei et al. (2003) [36] that estimates the probability of word inclusion in a particular topic using a Dirichlet distribution. In LDA, a corpus, which is the collection of documents under analysis, is assumed to consist of multiple topics. The distribution of words corresponding to each topic within the corpus determines the topics assuming that the words inferring a latent topic are fixed. This method groups correlated words into a single topic, enabling the inference of latent topics based on words assigned to the same group [36,37].

The concept of LDA is presented in Figure 1: the analyzed corpus comprises M documents, with N representing the number of words in each document, ω denoting a word vector, and z being the topic assigned to a specific word within the document. The latent parameter θ defines the topic distribution of each document following a Dirichlet distribution, and words (ω) related to topics are allocated using Bayesian posterior probability extraction. α is an external parameter that adjusts the topic distribution θ of each document, and β is an external parameter that adjusts the word ω of each topic.
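The generative view behind this description can be sketched in a few lines of Python. The sketch assumes that β is given as a matrix of per-topic word probabilities (one row per topic), as in the original LDA formulation. The vocabulary, the parameter values, and the function names are hypothetical and chosen only for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, alpha, beta, vocab, rng):
    """Generate one document of n_words words under the LDA assumptions."""
    theta = sample_dirichlet(alpha, rng)       # topic distribution of the document
    topic_ids = list(range(len(alpha)))
    assignments, words = [], []
    for _ in range(n_words):
        z = rng.choices(topic_ids, weights=theta)[0]   # topic of this word position
        w = rng.choices(vocab, weights=beta[z])[0]     # word drawn from topic z
        assignments.append(z)
        words.append(w)
    return theta, assignments, words


# Hypothetical example: two topics over a four-word vocabulary.
vocab = ["data", "model", "ethics", "society"]
alpha = [0.5, 0.5]                       # Dirichlet prior on theta (assumed values)
beta = [[0.5, 0.5, 0.0, 0.0],            # topic 0 only emits "data" and "model"
        [0.0, 0.0, 0.5, 0.5]]            # topic 1 only emits "ethics" and "society"
rng = random.Random(0)
theta, assignments, words = generate_document(20, alpha, beta, vocab, rng)
```

Topic modeling runs this process in reverse: given only the observed words, it infers the hidden θ and z that most plausibly produced them.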

Equation (1) illustrates the estimation of the latent parameters (θ, z) for topic extraction. It can also be expressed using variational parameters γ and φ through a Bayesian posterior probability distribution, as shown in Equation (2). Here, γ is a Dirichlet parameter, and φ is a multinomial parameter (φ1, …, φN).

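$p(\theta, z \mid \omega, \alpha, \beta) = \frac{p(\theta, z, \omega \mid \alpha, \beta)}{p(\omega \mid \alpha, \beta)}$ (1)

$q(\theta, z \mid \gamma, \varphi) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_{n} \mid \varphi_{n})$ (2)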
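To make the inference step more concrete, the following Python sketch estimates topic assignments on a toy corpus. Note that it swaps in collapsed Gibbs sampling, a different and simpler approximation than the variational form of Equation (2), because it fits in a few lines of standard-library code. Here `beta` is treated as a symmetric Dirichlet prior over words, which is an assumption of this sketch. The function name, the hyperparameter values, and the integer-encoded documents are likewise hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of integer word ids."""
    rng = random.Random(seed)
    topic_ids = list(range(n_topics))
    doc_topic = [[0] * n_topics for _ in docs]                 # words per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]   # counts per (topic, word)
    topic_total = [0] * n_topics                               # words per topic

    # Random initial topic assignment for every word position.
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topic_ids
                ]
                k = rng.choices(topic_ids, weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimate of each document's topic distribution.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


# Hypothetical corpus over a four-word vocabulary (ids 0..3).
docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 2, 3, 3]]
theta, topic_word = lda_gibbs(docs, n_topics=2, vocab_size=4, n_iter=50)
```

Each row of `theta` is an estimated topic distribution for one document, and `topic_word` holds the word counts assigned to each topic. In practice, a maintained library implementation would normally be used instead of a hand-written sampler.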

Prior to the introduction of text mining technology in the field of education, research was limited to qualitative paradigms, such as grounded theory or content analysis methods, which were employed to analyze text without explicit assistance from computer algorithms. However, text mining has brought new possibilities by enabling the identification of complex concepts and semantic relationships using extensive text data.

Frequency analysis through TF-IDF and text mining techniques such as topic modeling have been applied to automatically identify and extract valuable information from unstructured text in the education field. TF-IDF has been utilized to analyze document similarities or keywords in educational texts based on frequency analysis. For instance, studies such as “Applying Unsupervised Learning TF-IDF Algorithm in Word Segmentation of Ideological and Political Education [37]”, “Reporting Search Function Using TF-IDF for PBL Education [38]”, and “Analyzing Keywords of Mass Media’s News Articles on Maker Education in South Korea [39]” utilized TF-IDF. LDA has been employed to extract topics from educational documents. Examples of such studies include “Mapping Analysis of CS2013 by Supervised LDA and Isomap [40]”, “Explaining the Paradox of World University Rankings in China (Higher Education Sustainability Analysis with Sentiment Analysis and LDA Topic Modeling) [41]”, “Using Topic Modeling to Extract Preservice Teachers’ Understandings of Computational Thinking from Their Coding Reflections [42]”, “Unsupervised Characterization of Lessons According to Temporal Patterns of Teacher Talk via Topic Modeling [43]”, and “Analysis of Knowledge Domains and Skill Sets Using LDA-based Topic Modeling (Big Data Software Engineering) [44]”.

However, no study has yet suggested combining the TF-IDF and LDA methodologies, which could help explore the relationship between textbook characteristics and curriculum documents and provide guidance for curriculum knowledge construction.

3. Research Methods

This study aimed to systematically analyze the content of AI-related textbooks and propose directions for AI knowledge composition. To achieve this purpose, the research combined text mining and content analysis to assess the concordance between the AI curriculum and textbooks. The research procedure is shown in Figure 2.

First, text mining was conducted using Python, employing TF-IDF analysis and LDA topic modeling. The text was extracted from the main body of the textbooks, and a suitable data preprocessing strategy for text data was implemented. The TF-IDF analysis and LDA topic modeling were then applied to examine the association and peculiarity of AI knowledge composition based on the four areas proposed in the Korean AI curriculum. Second, content analysis was performed by composing a frame and conducting coding to analyze a platform based on the fundamental AI curriculum. In AI education, both theoretical understanding and practical training hold equal importance. The scope and level of vertical content may vary depending on the tool used, even when covering the same content. Finally, the implications for AI curriculum knowledge composition were discussed by conducting a comprehensive analysis of the results obtained through text mining and content analysis.

3.1. Data Collection

In Korea, a basic AI curriculum was set to be established for high schools by 2020 [10]. The Introduction to AI curriculum comprises four areas: “1. Understanding of AI”, “2. Principles and application of AI”, “3. Data and machine learning”, and “4. Social impact of AI”. The curriculum document delineates key concepts, content elements, and learning elements for each area. The contents of the Basic AI Curriculum are presented in Table 5.

“1. Understanding of AI” encompasses topics such as “social changes and changes in career and occupation due to advances in AI technology” and “AI forming relationships with humans and being utilized as an intelligent agent”. “2. Principles and application of AI” covers aspects like “types and characteristics of various AI approaches for implementing recognition, search, inference, and learning” and “acquiring principles of and differences between approaches based on actual cases”. “3. Data and machine learning” includes topics on “data attribute perspective, machine learning, and classification models” and “problem-solving and performance evaluation methods”. Lastly, “4. Social impact of AI” focuses on “social values of AI and impact recognition” and “social responsibilities and fairness practice as a member of AI society”.

Based on the basic AI curriculum, eight AI textbooks were developed in 2021. This study analyzed the text data of the main text in four units from the eight textbooks.

Table 6 presents the frequency (ratio) of each content area in the eight textbooks [45,46,47,48,49,50,51,52]. The ratio was calculated by dividing the “number of pages allocated to each area” by the “total number of pages” of each textbook. It indicates the relative weight that each textbook gives to each area.
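The ratio calculation can be illustrated with a minimal sketch. The page counts below are hypothetical and are not taken from Table 6; the function name `area_ratios` is likewise an illustrative assumption.

```python
# Hypothetical page counts per curriculum area for one textbook
# (illustrative only; not taken from Table 6).
pages_per_area = {
    "1. Understanding of AI": 30,
    "2. Principles and application of AI": 76,
    "3. Data and machine learning": 60,
    "4. Social impact of AI": 34,
}


def area_ratios(pages):
    """Return each area's share of the textbook's total pages, in percent."""
    total = sum(pages.values())
    return {area: round(100 * n / total, 1) for area, n in pages.items()}


ratios = area_ratios(pages_per_area)
for area, pct in ratios.items():
    print(f"{area}: {pct}%")
```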

Unit 2 of all textbooks exhibited the highest ratios, ranging from 35.4% to 40.5%. Unit 3 had the second-highest ratio, ranging from 25.1% to 35.0%. The ratio of Unit 1 ranged from 13.5% to 17.5%, while that of Unit 4 ranged from 13.0% to 21.3%. In other words, Units 2 and 3, which displayed a high frequency of content and learning elements in the curriculum documents, held significant weight across all textbooks.

3.2. Data Preprocessing

Data preprocessing is a crucial step in text mining for the analysis of unstructured data. In this study, the focus of data preprocessing was to accurately analyze technical terms in Korean-language educational documents in the computing field [53,54]. The data preprocessing procedure was as follows:

Firstly, since the curricula and textbooks analyzed in this study were written in Korean, the KoNLPy package, a Korean morphological analysis library that reflects the characteristics of the language, was installed. The text data was then preprocessed using KOMORAN, a morpheme analyzer (parser).

Secondly, the technical terms provided by KOMORAN were employed, considering the analysis focus on documents in the educational field of AI. Technical terms composed of two or more words in KOMORAN, such as “computer vision”, “supervised learning”, and “voice recognition”, were processed as a single corpus.

Thirdly, abbreviations that expressed the same terms, such as using “AI” for “artificial intelligence”, were consolidated into a single corpus.

Fourthly, the analyzed data were transformed into a corpus and tokenized using morphemes. By selecting specific parts of speech and words with one or more letters, the normalization process was enhanced through iterations of stemming, stop-word elimination (numbers, special characters, punctuation marks), and review.
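The four preprocessing steps above can be sketched as follows. This is a minimal illustration rather than the study's actual pipeline: it assumes that the morpheme analyzer has already produced (morpheme, tag) pairs, as a call such as `Komoran().pos(text)` in KoNLPy would, and that multi-word technical terms already arrive as single tokens. The tag set to keep, the abbreviation table, and the sample tokens (shown in English for readability) are all assumptions.

```python
# Hypothetical tagger output: (morpheme, POS tag) pairs in the style
# returned by a Korean morpheme analyzer. English tokens are used here
# purely for readability.
tagged = [
    ("artificial intelligence", "NNP"), ("is", "VCP"), ("used", "VV"),
    ("in", "JKB"), ("computer vision", "NNP"), (",", "SP"),
    ("AI", "SL"), ("2021", "SN"), ("learn", "VV"), ("data", "NNG"),
]

# Assumed normalization table: abbreviation -> canonical term.
SYNONYMS = {"AI": "artificial intelligence"}
# Assumed parts of speech to keep (proper/common nouns, verbs, foreign words).
KEEP_TAGS = {"NNP", "NNG", "VV", "SL"}


def normalize(pairs, synonyms=SYNONYMS, keep_tags=KEEP_TAGS):
    """Filter by part of speech, drop empty tokens, and unify abbreviations."""
    tokens = []
    for morpheme, tag in pairs:
        if tag not in keep_tags:
            continue  # drops numbers, punctuation, particles, and similar
        token = synonyms.get(morpheme, morpheme).strip()
        if token:  # keep only tokens with one or more letters
            tokens.append(token)
    return tokens


tokens = normalize(tagged)
print(tokens)
```

Filtering on the tag rather than on the surface form keeps the stop-word rule independent of the language of the tokens, which is why the same sketch would apply unchanged to real Korean morphemes.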

Table 7 presents the frequency of each corpus according to the textbook and morpheme type.

The corpus frequency varied across textbooks, ranging from 4928 to 14,042. Proper nouns (NNP), common nouns (NNG), and verbs (VV) exhibited higher corpus frequencies.

3.3. Analysis Method

In this study, a comprehensive analysis of knowledge frequency, topic composition, and practical training tools in AI curricula was conducted through text mining using Python 3.10.3 and a frame-based content analysis that reflects the specificity of computing field textbooks on a large scale.

Firstly, the study examined whether the keywords in each textbook were aligned with the learning elements of the curricula. This was accomplished by performing a TF analysis for each area in the textbooks. Specifically, the top 20 keywords with the highest TF values for the eight textbooks were comparatively analyzed based on an analysis frame composed of the learning elements of each area in the curriculum. This exploration aimed to assess the extent to which the learning elements of each area are covered in the textbooks. To understand the linkage between the areas proposed in the curriculum and the chapters of the textbooks, each keyword was checked against the learning elements: a keyword presented among the learning elements of the corresponding curriculum area was marked (√), and a keyword related to the learning elements of another area was marked (*).
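The marking scheme described above can be sketched as follows. The token list and the learning elements per area are hypothetical placeholders, not the actual curriculum lists, and `mark_top_keywords` is an illustrative name.

```python
from collections import Counter

# Hypothetical tokens from one textbook unit (Unit 1) after preprocessing.
unit_tokens = ["AI", "data", "agent", "data", "society", "AI", "robot",
               "agent", "AI", "learning", "data", "career"]

# Hypothetical learning elements per curriculum area (not the real lists).
learning_elements = {
    1: {"agent", "society", "career"},
    2: {"robot", "search", "inference"},
    3: {"data", "learning"},
    4: {"ethics", "fairness"},
}


def mark_top_keywords(tokens, area, elements, k=20):
    """Return the top-k (keyword, TF, mark) triples for one unit.

    '√' = learning element of this area, '*' = learning element of
    another area, '' = not a learning element of any area.
    """
    other = set().union(*(v for a, v in elements.items() if a != area))
    rows = []
    for word, tf in Counter(tokens).most_common(k):
        mark = "√" if word in elements[area] else "*" if word in other else ""
        rows.append((word, tf, mark))
    return rows


rows = mark_top_keywords(unit_tokens, area=1, elements=learning_elements, k=5)
for row in rows:
    print(row)
```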

Secondly, TF-IDF was employed to deduce the top 15 words with the highest TF-IDF values in each textbook. This analysis aimed to identify the characteristics of the textbooks by distinguishing between terms related to the computing field and keywords related to cases. In other words, among the keywords with high TF-IDF values in each area of the individual textbooks, the “Computing field-related keywords (*)” were marked in order to understand what they imply about each textbook.
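A minimal sketch of a TF-IDF ranking is given below. The weighting variant used in the study is not stated, so the sketch assumes the plain form, raw term frequency multiplied by log(N / DF); the token lists are illustrative data only.

```python
import math
from collections import Counter

# Hypothetical token lists, one per textbook (illustrative data only).
docs = {
    "A": ["data", "iris", "supervised learning", "data", "iris"],
    "B": ["data", "meal", "learning", "data"],
    "C": ["data", "node", "proposition", "node"],
}


def tfidf(docs):
    """Score each term as raw TF x log(N / DF), one dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        scores[name] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return scores


def top_terms(doc_scores, k=15):
    """Return the k highest-scoring (term, score) pairs of one document."""
    return sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


scores = tfidf(docs)
print(top_terms(scores["A"], k=2))
```

Under this weighting, a term that appears in every textbook (here "data") scores zero however frequent it is, which is why TF-IDF surfaces textbook-specific terms rather than the shared vocabulary found by the TF analysis.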

Thirdly, LDA topic modeling was conducted to analyze the topic composition of each textbook and compare them with the areas of the AI curriculum. The trained model was visualized using the LDA visualization library pyLDAvis for Python, as shown in Figure 3.
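For readers unfamiliar with LDA, the sketch below fits a small model by collapsed Gibbs sampling using only the Python standard library. It is not the implementation used in this study, which is not specified here; in practice a library such as gensim or scikit-learn would normally be used. The hyperparameters `alpha` and `beta`, the number of iterations, and the toy documents are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc_topic, topic_word, vocab), where the first two are
    row-normalized probability tables.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), n_topics
    ndk = [[0] * K for _ in docs]       # topic counts per document
    nkw = [[0] * V for _ in range(K)]   # word counts per topic
    nk = [0] * K                        # total tokens per topic
    z = []                              # topic assignment of every token
    for d, doc in enumerate(docs):      # random initialization
        z.append([])
        for w in doc:
            k = rng.randrange(K)
            z[d].append(k)
            ndk[d][k] += 1
            nkw[k][w_id[w]] += 1
            nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = z[d][i], w_id[w]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                # Unnormalized full conditional p(z = t | all other tokens).
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    doc_topic = [
        [(ndk[d][t] + alpha) / (len(doc) + K * alpha) for t in range(K)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(nkw[t][v] + beta) / (nk[t] + V * beta) for v in range(V)]
        for t in range(K)
    ]
    return doc_topic, topic_word, vocab


# Toy documents (illustrative only).
docs = [
    ["data", "learning", "model", "data", "classification"],
    ["ethics", "fairness", "society", "ethics", "responsibility"],
    ["model", "data", "learning", "classification", "model"],
    ["society", "fairness", "ethics", "responsibility", "society"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2)
for t, dist in enumerate(topic_word):
    top = sorted(zip(vocab, dist), key=lambda p: p[1], reverse=True)[:3]
    print(t, [w for w, _ in top])
```

The `topic_word` table holds the keyword weights of each topic, and `doc_topic` holds the topic proportions of each document; these are the two quantities that a visualization such as pyLDAvis takes as input.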

The distance between the circles in the visualization represents the discriminant validity. A greater distance between topics indicates higher discriminant validity, indicating distinct topics. On the other hand, a closer distance or overlapping topics suggests lower discriminant validity and similarity among topics. The size of the circles indicates the proportion of data accounted for by each topic. The distribution of keywords within each topic was computed and comparatively analyzed against the curriculum’s areas.
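To make the notion of inter-topic distance concrete: to our understanding, pyLDAvis by default derives the positions of the circles from the Jensen-Shannon divergence between the topic-word distributions, followed by a two-dimensional scaling step. The sketch below computes only the divergence, for three hypothetical topic-word distributions; the scaling step is omitted.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats.

    Terms with p_i == 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical topic-word distributions over a four-word vocabulary.
topic_a = [0.70, 0.20, 0.05, 0.05]
topic_b = [0.65, 0.25, 0.05, 0.05]   # similar to topic_a
topic_c = [0.05, 0.05, 0.20, 0.70]   # very different from topic_a

near = js_divergence(topic_a, topic_b)
far = js_divergence(topic_a, topic_c)
print(near < far)
```

Two topics with similar keyword weights yield a small divergence and are therefore drawn close together or overlapping, which corresponds to the low discriminant validity described above.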

Fourthly, a content analysis framework was designed to reflect the specificity of the computing field. In the “Understanding of AI” section of the curriculum, the teaching and learning direction specifies the need to “select and use educational tools and platforms that are appropriate for the level of learners and laboratory environment”. Therefore, a content analysis frame was developed to identify the composition of educational tools per unit in each textbook. After deducing the tools for each unit from the textbook’s main body, the frame was constructed by clustering the tools according to their roles, and coding was performed accordingly.

Ultimately, this study evaluated the alignment between AI curricula and textbooks by integrating text mining technology and content analysis results. Based on the findings of this analysis, implications for AI curriculum composition were suggested.

4. Textbook Analysis Results

4.1. Evaluation of Consistency between Curriculum and Textbooks through Frequency Analysis

4.1.1. Consistency of Curriculum and Textbooks through TF Analysis

The top 20 keywords by TF for the eight textbooks were compared using the analysis framework composed of the learning elements of each curriculum area. On this basis, the study explored the extent to which the learning elements of each area are represented in the textbooks.

The concordance between the curriculum and textbooks, as inferred from the TF analysis of each textbook, is presented in Table 8.

First, we examine the perspective of “keywords that are learning elements of the curriculum but are not included in textbooks”. In the “2. Principles and Application of AI” section, “sensor” is identified as a learning element that emphasizes the importance of data collection through various types of sensors. However, the term “sensor” had a relatively low frequency in five out of the eight textbooks. Similarly, the frequency of “computer vision” was high in only one of the eight textbooks, while “robot vision” did not appear in any of the textbooks. Terms such as “search”, “inference”, “classification”, “clustering”, and “forecasting” were not very frequent in most textbooks, and none of the textbooks had a high frequency of the term “deep learning”. For instance, textbook A did not list “search”, “inference”, “clustering”, “classification”, “forecasting”, or “deep learning” as top frequency keywords in the “2. Principles and Application of AI” section.

In the “3. Data and Machine Learning” section, no textbook had a high frequency of the term “unstructured data”, whereas keywords related to “core attribute extraction”, “training data”, “test data”, and “performance evaluation” were only found in certain textbooks. In the “4. Social Impact of AI” section, no textbook included top frequency keywords related to “values of AI”. However, the keywords related to “fairness of AI” exhibited a high frequency in four of the textbooks (50%).

Next, we consider the perspective of “keywords from other areas included with a high frequency”. In textbook E, keywords from other sections, such as “data (24)”, “inference (13)”, and “recognition (8)”, were very frequent in the first section of “1. Understanding of AI”. Textbook A included “learning (17)” and “robot (12)”, while textbook B featured “data (15)” and “robot (9)” with high frequency in the first section of “1. Understanding of AI”. In the “2. Principles and Application of AI” section, textbook B had the most frequent occurrences of “learning (58)” and “data (53)” among the learning elements of “3. Data and Machine Learning”. In textbooks A, C, E, F, G, and G, the term “data” appeared with frequencies of 63, 37, 84, 85, 88, and 44, respectively.

Finally, we explored the perspective of “keywords demonstrating a high frequency regardless of curriculum area or textbook type”. Certain terms such as “AI”, “data”, “information”, “human (or “person”)”, and “technology” were top frequency keywords in most textbooks.

4.1.2. Evaluation of Textbook Specificity through TF-IDF Analysis

To understand the characteristics of each textbook, the top 15 words with high TF-IDF values were derived using TF-IDF. The textbooks’ characteristics were examined by distinguishing between “terms related to the computing field” and “keywords associated with the case studies”. The results of examining the specificity of each textbook through the TF-IDF analysis are listed in Table 9.

First, in textbook A, the TF-IDF values for “supervised learning (6.33)” and “pattern recognition (3.92)” were particularly high in the “2. Principles and Application of AI” section, which focused on machine learning. The TF-IDF values in textbook C were high for “node (6.58)” and “proposition (4.7)”. Textbook G exhibited the highest TF-IDF value of 17.65 among the textbooks for “propositional logic”, which was discussed in the context of inference. Furthermore, in textbook G, “node (12.69)”, “proposition (10.34)”, “supervised learning (8.63)”, and “linear regression (8.32)” had significant TF-IDF values.

Second, there were cases where the TF-IDF values for terms from other sections were high. In textbook A, “computer vision”, typically addressed in the “2. Principles and Application of AI” section, was included in Unit 1, with a TF-IDF value of 2.94. In textbook C, “1. Understanding AI” and “3. Data and Machine Learning” included cases related to the “4. Social Impact of AI” section, specifically the ethics of AI.

Third, we focus on keywords in the computing field that are distinctively used in textbooks. In textbook A, “computing” and “data type” had high TF-IDF values as general terms in the computing field, although they were not among the top frequency keywords in the textbooks. In textbook D, “software”, “function”, and “library” had high TF-IDF values. Textbook G emphasized “security (8.32)” with a high TF-IDF value in the “4. Social Impact of AI” section, indicating the importance of this particular aspect in the context of AI ethics.

Finally, the textbooks differed in terms of the cases and data used in the theoretical explanations and practical exercises. For instance, in the “3. Data and Machine Learning” section, textbook A utilized animal or plant data such as “iris”, “salmon”, “bass”, and “fish”, while textbook B used keywords like “meal”, “clothes”, “protein”, “snack bar”, and “go to school” as practical training examples. In textbook H, real-life keywords, such as “movie”, “question”, “answer”, “review”, and “actor”, were utilized.

4.2. Evaluation of Textbook Knowledge Composition through LDA Topic Modeling Analysis

LDA topic modeling was performed to compare the topic composition of each textbook with the area structure of the AI curriculum, with the aim of examining the implications for AI knowledge composition. Analyzing the topic composition of each textbook shows how the topics are structured and yields the weights of the keywords that constitute each topic. The trained model was visualized using pyLDAvis, a Python library for LDA visualization, and the results were reviewed. The results are shown in Table 10.

The results are outlined below:

First, the topics in each textbook were categorized into two or three distinct groups. None of the textbooks shared identical topics. Two textbooks consisted of two topics, while the remaining six contained three topics. Among the six textbooks with three topics, textbook D had two similar, overlapping topics, indicating low discriminant validity (Figure 4). Second, the topic sizes, which reflect the proportion of data accounted for by each topic, differed considerably. There was a noticeable disparity in topic size in textbooks B and H, which featured two topics, with one topic accounting for a higher proportion of keywords than the other. Third, the correspondence between topics and curriculum areas varied by textbook. In some cases, a single curriculum area was spread over multiple topics. For example, in textbook A, the topics corresponded to areas ((2,3), (1,4), (2)) of the curriculum, whereas textbook B’s topics corresponded to ((2,3), (1,2,4)), so that “2. Principles and Application of AI” fell under multiple topics. The topics in textbook H corresponded to ((4), (1,2,3)), wherein “4. Social Impact of AI” formed one topic, whereas the other three areas together formed the other topic.

4.3. Tool Utilization through Content Analysis

Content analysis was used to structure a framework for analyzing platforms based on the foundational AI curriculum, and coding was carried out. This is because both theory and practice are crucial in AI education, and even for identical content, the depth and level of content can vary depending on the tool used. The content analysis method is designed to reflect the uniqueness of the computing field. Consequently, a content analysis framework was developed to identify the composition of the educational tools in each chapter of the textbooks. The tools from textbook chapters were extracted and clustered based on their roles to structure the framework. Coding was performed based on this foundation.

The results of the tool analysis conducted through content analysis are presented in Table 11.

First, regarding the use of tools for data processing, including recognition, collection, analysis, and model generation, the following observations were made: In the “2. Principles and Application of AI” section, three textbooks (A, B, and C) utilized Quick, Draw! to facilitate the learning of image data recognition, whereas textbook F used code.org to provide hands-on experience with the supervised learning principles of AI. In the “3. Data and Machine Learning” section, textbooks B, G, and H used Orange3, whereas textbook C used Brightics AI to generate AI models based on various real-life data samples. Notably, some textbooks solely focused on theoretical explanations without incorporating specific tools. Each textbook used a distinct approach to explain the same concept, and only a few textbooks included practical training examples.

Second, regarding AI models and program development, the following perspectives were identified: In the “2. Principles and Application of AI” section, five textbooks used ENTRY and one used Scratch, resulting in six textbooks using block-based programming languages. However, these textbooks did not cover program development using programming languages. In the “3. Data and Machine Learning” section, three textbooks used ENTRY and one used Scratch, resulting in four textbooks using block-based programming languages. In addition, three textbooks developed AI programs that processed the data using Python. Notably, in Units 2 and 3, two textbooks provided options for both text- and block-based programming languages. However, in Unit 2 or Unit 3, certain textbooks did not include development practices that use programming languages.

Third, the analysis considered tools that support AI ethics. Seven out of the eight textbooks used Moral Machine to facilitate ethical decision making in various dilemma situations.
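The coding procedure used in this section can be sketched as a simple tally. The tool names and the textbook observations below are those reported above for data processing tools; the role labels and the helper `code_observations` are illustrative assumptions, not the actual coding frame of Table 11.

```python
from collections import defaultdict

# Assumed role clusters for the coding frame; tool names come from the text.
TOOL_ROLES = {
    "Quick, Draw!": "data processing",
    "code.org": "data processing",
    "Orange3": "data processing",
    "Brightics AI": "data processing",
    "ENTRY": "program development",
    "Scratch": "program development",
    "Python": "program development",
    "Moral Machine": "AI ethics",
}

# Observations (textbook, unit, tool) as reported for data processing tools.
observations = [
    ("A", 2, "Quick, Draw!"), ("B", 2, "Quick, Draw!"), ("C", 2, "Quick, Draw!"),
    ("F", 2, "code.org"),
    ("B", 3, "Orange3"), ("G", 3, "Orange3"), ("H", 3, "Orange3"),
    ("C", 3, "Brightics AI"),
]


def code_observations(obs, roles=TOOL_ROLES):
    """Group textbooks by (unit, role, tool) according to the frame."""
    frame = defaultdict(set)
    for textbook, unit, tool in obs:
        frame[(unit, roles[tool], tool)].add(textbook)
    return frame


frame = code_observations(observations)
for (unit, role, tool), books in sorted(frame.items()):
    print(f"Unit {unit} | {role} | {tool}: {sorted(books)}")
```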

4.4. Results of the Alignment Evaluation between Curriculum and Textbooks

The AI curriculum should be structured to meet societal requirements as industrial society evolves. AI education is increasingly regarded as crucial both in K-12 education, which prepares students to enter society, and in higher education, which prepares students who are about to enter it. In other words, systematic AI curriculum development is meaningful because it positively impacts the cultivation of AI talent, AI research and development, and the direction of technological advancement.

The results of the textbook analysis are discussed as follows:

First, it is essential to review learning elements that are inadequately reflected in each curriculum section. Content knowledge considered important within each curriculum area should be examined to determine whether it is adequately addressed in textbooks. Second, the top-frequency keywords in textbooks that were not covered in the curriculum sections must be reviewed. These keywords serve as fundamental materials for assessing the appropriateness of a curriculum’s knowledge composition. Third, keywords that appear frequently in certain textbooks but are not included as learning elements in the curriculum should be considered for inclusion based on the TF-IDF analysis of the textbooks. Examples of such keywords include “neural networks” and “regression”. Fourth, if subsequent curriculum areas require the incorporation of specific learning elements, the prerequisite knowledge required for effective learning must be considered. Fifth, the division of units should be evaluated to determine if it is clearly structured according to the curriculum knowledge units. A clear division indicates a well-defined knowledge composition of textbook content. This analysis shows that the four curriculum areas are not adequately represented in textbooks. Finally, considering the significant disparity in tool usage across different curricula, it is necessary to propose clear criteria for using tools in teaching and learning. These criteria should be based on each area’s theoretical foundations and aligned with the objectives of practical training.

5. Discussion

Based on the aforementioned results, the directions for knowledge composition in the AI curriculum can be addressed from three perspectives.

First, curriculum composition should facilitate the reconstruction of curriculum knowledge by adopting a concept-based framework that aligns with competency-oriented education. As in other frameworks, such as the Data Science structure introduced by ACM/IEEE, the knowledge areas (KAs) of Computer Science, and the five big ideas of the K-12 AI Guidelines of AI4K12, the curriculum’s detailed areas and overall structure should be designed based on the knowledge or concepts within the field of AI. This approach supports the systematic organization and reconstruction of the curriculum composition.

The content elements within these areas should effectively systematize the competencies in knowledge, skills, values, and attitudes. It is crucial to specify the desired competencies students should possess after completing the curriculum, encompassing comprehensive development in terms of knowledge, skills, values, and attitudes within each area.

To manage and utilize the relationships between different items, a three-layered structure consisting of areas, detailed areas, and content elements should be systematically established and organized. This allows for efficient curriculum revision and enhancement from a curriculum management perspective, as well as effective reconstruction of the curriculum based on the field’s requirements from a curriculum application perspective. The order of the areas should not be constrained by prerequisites or subsequent knowledge. In other words, the AI curriculum’s knowledge composition should be such that the educational content can be reconstituted considering the curriculum composition and the purpose of textbook development.

Similar to frameworks like the CS 2013, CCDS 2021, and K-12 AI Curriculum, it is essential to propose document systems that map K-12 education to the content elements of each curriculum area. As educational and learner environments vary across countries and regions, it is necessary to stratify the content level, as observed in curricula from countries like India or China. This stratification ensures that lessons are designed according to the educational objectives or lesson goals, enabling AI educational content to maintain connectivity and relevance. Content can be extracted based on the specific environment and the learners’ proficiency level in each school, thereby facilitating the operation of curricula.

The second perspective pertains to the composition of knowledge. The content composition within the computing field is closely intertwined with and reflects technological advancements and societal changes. The Computing Curricula of ACM/IEEE, serving as the standard curriculum in the computing field, exhibits mutual influence and complementarity with K-12 curricula worldwide. Therefore, to develop a comprehensive AI curriculum, it is essential to adopt a top-down approach, focusing on the CCDS 2021 and CS 2013 standards provided by ACM/IEEE, while also taking into consideration the K-12 AI Curriculum established by internationally recognized institutions. A thorough examination of standard- and country-specific curricula reveals common elements, such as computing basics (algorithms, computing systems, data, programming, and ethics), AI concepts and principles, AI and its societal implications, the significance of data, modeling, and programming, and the ethical considerations of AI. Additionally, the curriculum should encompass AI convergence education (STEM) by emphasizing the integration of computing + X.

It is crucial to define the scope of knowledge encompassed within the AI curriculum. The AI field, as proposed in the standard computing curriculum of ACM/IEEE and the K-12 curricula of individual countries, covers education in AI application, principles, and convergence. Furthermore, AI education comprises AI knowledge, concept and principles, and developer education. Hence, the curriculum content should be structured based on a clear definition of the curriculum’s scope.

Concepts that require emphasis across all areas should be designated as “cross-cutting” and should permeate throughout the curriculum. This approach is akin to the cross-cutting themes of CSTA 2016 (“abstraction”, “system relation”, “human-computer interaction”, “personal information protection and security”, and “communication and cooperation”) and the big ideas of AP Computer Science (“modularity”, “variable”, “control”, and “impact of computing”) [55,56]. Within CCDS, the “Data Privacy, Security, Integrity, and Analysis for Security” area is defined as “cross-cutting” and relates to all KAs [5]. For example, “data”, “AI security”, “human-computer interaction”, and “modularity” can be incorporated as “cross-cutting” elements within the AI curriculum.

The third perspective focuses on the support for practical training in AI education. Practical training should encompass various aspects, such as “experience”, “application”, “understanding principles”, “implementation (data processing, model design, program development)”, “ethics”, and “convergence”. Instead of prescribing specific tools for practical training in limited areas or content, education should advocate for the availability of a diverse range of tools to teach and learn the same content.

In practical training, two important considerations are the “expansion of data processing possibilities” and “stability”. The data utilized in AI education reflects the society and culture of each country, wherein data bias may become a relative concept. Therefore, instead of focusing on specific types of data, a wide variety of data should be provided, ensuring its safety in terms of privacy, security, and copyright for all students. Starting with the practices using data introduced in textbooks, students should have access to data through secure cloud links, enabling them to design AI models or develop programs that reflect ethical values and are applicable to real-life situations and diverse contexts.

Another aspect to consider is the flexibility of curriculum application. From an educational environment perspective, flexibility in terms of time allocation and balance between theory and practical training should be provided. From a content perspective, flexibility should encompass the differentiation of content, the sequencing of topics, the balance between theory and practice, and the adaptability to different levels of education. Autonomous and active curriculum content should be formulated and implemented by proposing the order of curriculum areas, reconstitution of content, integration of theory and practice based on the school level, and utilization of data application.

6. Conclusions

This study makes a valuable contribution by systematically evaluating the alignment between AI curriculum and textbooks through text mining and content analysis. The text analysis method developed in this study enables a comprehensive examination of textbooks within the computing field’s curriculum. Notably, this study is significant as it demonstrates the potential to quantify and expand textbook analysis by integrating a mining technique that enables the examination of a massive amount of text data from textbooks with a frame-based content analysis that accounts for the specificities of computing textbooks. The findings of this study serve as a solid foundation for conducting further investigations on the relationship between curriculum and textbooks using larger sample data from the computing field. Ultimately, the proposed directions for AI curriculum knowledge composition presented in this study hold particular importance, as they establish the basis for developing K-12 curricula within the computing field.

Artificial Intelligence (AI) education based on a systematic AI curriculum will serve as a foundation for nurturing talents equipped with basic AI literacy, convergence capabilities (AI + X, X + AI), and professional skills in industrial and academic fields. In other words, this study is significant in that it has provided the technology and basic data that can contribute to industry, academia, and education in the AI era.

Author Contributions

Conceptualization, H.Y. and W.L.; methodology, H.Y., J.K. and W.L.; software, H.Y.; validation, W.L., J.K. and H.Y.; writing—original draft preparation, H.Y.; writing—review and editing, H.Y. and J.K.; supervision, J.K. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1A2C2013735).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Dondi, M.; Klier, J.; Panier, F.; Schubert, J. Defining the Skills Citizens Will Need in the Future World of Work; McKinsey & Company: Tokyo, Japan, 2021; p. 25. [Google Scholar]
OECD. An OECD Learning Framework 2030; Springer International Publishing: Cham, Switzerland, 2019; pp. 23–35. [Google Scholar]
Miao, F.; Shiohira, K. K-12 AI Curricula. A Mapping of Government-Endorsed AI Curricula; UNESCO: Paris, France, 2022. [Google Scholar]
Clear, A.; Parrish, A.; Impagliazzo, J.; Wang, P.; Ciancarini, P.; Cuadros-Vargas, E. Computing Curricula 2020 (CC2020): Paradigms for Future Computing Curricula; ACM/IEEE Computer Society: New York, NY, USA, 2020. [Google Scholar]
Danyluk, A.; Leidig, P.; Cassel, L.; Servin, C. Computing competencies for undergraduate data science curricula: ACM Data Science Task Force. In Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, Virtual, 13–20 March 2021. [Google Scholar]
Draft, S. Computing Science Curricula 2013(CS2013); ACM/IEEE: New York, NY, USA, 2013. [Google Scholar]
AI4K12. Available online: https://ai4k12.org/ (accessed on 1 May 2023).
CBSE. Artificial Intelligence (Sub. Code 843) Class—XI&XII Cbse Department of Skill Education Curriculum for Session 2021–2022; BSE: New Delhi, India, 2021. [Google Scholar]
Ministry of Education of the People’s Republic of China. Information Technology Curriculum Standards for Ordinary High Schools; Ministry of Education of the People’s Republic of China: Beijing, China, 2017. [Google Scholar]
Ministry of Education. Ministry of Education Announcement No. 2015-74 [Supplementary Book 10]: Curriculum Guidelines for Practical Subjects (Technology/Home Economics) and Informatics Studies; Ministry of Education: Sejong-si, Republic of Korea, 2020. [Google Scholar]
Astiz, M.F.; Wiseman, A.W.; Baker, D.P. Slouching towards decentralization: Consequences of globalization for curricular control in national education systems. Comp. Educ. Rev. 2002, 46, 66–88.
Mok, K.H. (Ed.) Centralization and Decentralization: Educational Reforms and Changing Governance in Chinese Societies; Springer Science & Business Media: Cham, Switzerland, 2013.
Gumilar, S.; Hadianto, D.; Amalia, I.F.; Ismail, A. The portrayal of women in Indonesian national physics textbooks: A textual analysis. Int. J. Sci. Educ. 2022, 44, 416–433.
Aivelo, T.; Neffling, E.; Karala, M. Representation for whom? Transformation of sex/gender discussion from stereotypes to silence in Finnish biology textbooks from 20th to 21th century. J. Biol. Educ. 2022, 1–15.
Ho, Y.-R. Indigenous language curriculum revival: An emancipatory education analysis of Taiwanese Indigenous language policy and textbooks. J. Curric. Stud. 2022, 54, 501–519.
Wang, T.; Ma, Y.; Ling, Y.; Wang, J. Integrated STEM in high school science courses: An analysis of 23 science textbooks in China. Res. Sci. Technol. Educ. 2021, 41, 1197–1214.
Zhang, Q.-P.; Wong, N.-Y. The Learning Trajectories of Similarity in Mathematics Curriculum: An Epistemological Analysis of Hong Kong Secondary Mathematics Textbooks in the Past Half Century. Mathematics 2021, 9, 2310.
Pinson, H.; Agbaria, A.K. Ethno-nationalism in citizenship education in Israel: An analysis of the official civics textbook. Br. J. Sociol. Educ. 2021, 42, 733–751.
Chen, K.; Zhou, J.; Lin, J.; Yang, J.; Xiang, J.; Ling, Y. Conducting Content Analysis for Chemistry Safety Education Terms and Topics in Chinese Secondary School Curriculum Standards, Textbooks, and Lesson Plans Shows Increased Safety Awareness. J. Chem. Educ. 2020, 98, 92–104.
Heemann, T.; Hammann, M. Towards teaching for an integrated understanding of trait formation: An analysis of genetics tasks in high school biology textbooks this paper was presented at the ERIDOB conference 2020. J. Biol. Educ. 2020, 54, 191–201.
Lucy, L.; Demszky, D.; Bromley, P.; Jurafsky, D. Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks. AERA Open 2020, 6, 2332858420940312.
Sakhovskiy, A.; Solovyev, V.; Solnyshkina, M. Topic Modeling for Assessment of Text Complexity in Russian Textbooks. In Proceedings of the 2020 Ivannikov Ispras Open Conference (ISPRAS), Moscow, Russia, 10–11 December 2020; IEEE: New York, NY, USA, 2020; pp. 102–108.
BouJaoude, S.; Noureddine, R. Analysis of science textbooks as cultural supportive tools: The case of Arab countries. Int. J. Sci. Educ. 2020, 42, 1108–1123.
González-Delgado, M.; Lorenzo, M.F.; Machado-Trujillo, C. The concept of the State in textbooks: Analysis and reinterpretation during the Spanish Transition to Democracy (1976–1986). Br. J. Educ. Stud. 2020, 68, 331–347.
Hyun-joo, P.; Kwon, J. Analysis of inquiry tendencies in high-level middle school 1 chemistry textbooks during the Kim Jong-un era in North Korea. J. Korean Chem. Soc. 2019, 63, 266–279.
Rusek, M.; Vojíř, K. Analysis of text difficulty in lower-secondary chemistry textbooks. Chem. Educ. Res. Pract. 2019, 20, 85–94.
Yun, E.; Park, Y. Extraction of scientific semantic networks from science textbooks and comparison with science teachers’ spoken language by text network analysis. Int. J. Sci. Educ. 2018, 40, 2118–2136.
Choi, G.S.; Lee, J.Y.; Yoon, H.S. Development of a quantitative analysis model of creative problem solving ability in computer textbooks. Clust. Comput. 2015, 18, 733–745.
Cohen, R.; Yarden, A. How the Curriculum Guideline “The Cell Is to Be Studied Longitudinally” Is Expressed in Six Israeli Junior-High-School Textbooks. J. Sci. Educ. Technol. 2010, 19, 276–292.
Lei, L. Text Analysis with R for Students of Literature. J. Quant. Linguist. 2016, 23, 228–233.
Dieng, A.B.; Ruiz, F.J.; Blei, D.M. The dynamic embedded topic model. arXiv 2019, arXiv:1907.05545.
Ferreira-Mello, R.; André, M.; Pinheiro, A.; Costa, E.; Romero, C. Text mining in education. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1332.
Rezgui, Y. Text-based domain ontology building using Tf-Idf and metric clusters techniques. Knowl. Eng. Rev. 2007, 22, 379–403.
Mcauliffe, J.; Blei, D. Supervised topic models. Adv. Neural Inf. Process. Syst. 2007, 20, 1–8.
Hoffman, M.; Bach, F.; Blei, D. Online learning for latent dirichlet allocation. Adv. Neural Inf. Process. Syst. 2010, 23, 1–9.
Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
Chen, S.; Wu, L.; Zhuo, J. The Application of Unsupervised Learning TF-IDF Algorithm in Word Segmentation of Ideological and Political Education. Wirel. Commun. Mob. Comput. 2022, 2022, 5219117.
Fukushima, Y.; Shin, M.; Miyazaki, K.; Ito, T.; Yonekura, R.; Tanaka, M.S. Report Search Function Using TF-IDF for PBL Education. In Proceedings of the 2020 IEEE 9th Global Conference on Consumer Electronics (GCCE), Kobe, Japan, 13–16 October 2020; IEEE: New York, NY, USA, 2020; pp. 802–803.
Lee, D.; Kwon, H. Keyword analysis of the mass media’s news articles on maker education in South Korea. Int. J. Technol. Des. Educ. 2020, 32, 333–353.
Sekiya, T.; Matsuda, Y.; Yamaguchi, K. Mapping analysis of CS2013 by supervised LDA and isomap. In Proceedings of the 2014 IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE), Wellington, New Zealand, 8–10 December 2014; IEEE: New York, NY, USA, 2014; pp. 33–40.
Wen, Y.; Zhao, X.; Li, X.; Zang, Y. Explaining the Paradox of World University Rankings in China: Higher Education Sustainability Analysis with Sentiment Analysis and LDA Topic Modeling. Sustainability 2023, 15, 5003.
Cutumisu, M.; Guo, Q. Using Topic Modeling to Extract Pre-Service Teachers’ Understandings of Computational Thinking From Their Coding Reflections. IEEE Trans. Educ. 2019, 62, 325–332.
Altamirano, M.; Uribe, P.; Schlotterbeck, D.; Jiménez, A.; Araya, R.; Moris, J.v.d.M.; Caballero, D. Unsupervised characterization of lessons according to temporal patterns of teacher talk via topic modeling. Neurocomputing 2022, 484, 211–222.
Gurcan, F.; Cagiltay, N.E. Big Data Software Engineering: Analysis of Knowledge Domains and Skill Sets Using LDA-Based Topic Modeling. IEEE Access 2019, 7, 82541–82552.
Kumsung. Introduction to Artificial Intelligence; Kumsung: Seoul, Republic of Korea, 2021.
Gilbut. Introduction to Artificial Intelligence; Gilbut: Gyeonggi, Republic of Korea, 2021.
MiraeN. Introduction to Artificial Intelligence; MiraeN: Jeonnam, Republic of Korea, 2021.
Visang. Introduction to Artificial Intelligence; Visang: Seoul, Republic of Korea, 2021.
Samyang. Introduction to Artificial Intelligence; Samyang: Seoul, Republic of Korea, 2021.
Seongandang. Introduction to Artificial Intelligence; Seongandang: Seoul, Republic of Korea, 2021.
Cmass. Introduction to Artificial Intelligence; Cmass: Seoul, Republic of Korea, 2021.
Chunjaetext. Introduction to Artificial Intelligence; Chunjaetext: Seoul, Republic of Korea, 2021.
Park, E.L.; Cho, S. KoNLPy: Korean natural language processing in Python. In Proceedings of the 26th Annual Conference on Human & Cognitive Language Technology, Chuncheon, Republic of Korea, 22 April–13 May 2014; Volume 6, pp. 133–136.
Hidayatullah, A.F.; Ma’arif, M.R. Road traffic topic modeling on Twitter using latent dirichlet allocation. In Proceedings of the 2017 International Conference on Sustainable Information Engineering and Technology (SIET), Batu City, Indonesia, 24–25 November 2017; IEEE: New York, NY, USA, 2017; pp. 47–52.
K-12 Computer Science Framework Steering Committee. K-12 Computer Science Framework; ACM: New York, NY, USA, 2016.
College Board. College Board AP® Computer Science A Course and Exam Description; College Board: New York, NY, USA, 2020.

Figure 1. Conceptual diagram of LDA.

Figure 2. Research process.

Figure 3. The trained model visualized using the LDA visualization library pyLDAvis.

Figure 4. The trained model visualized using pyLDAvis.

Table 1. AI-related KAs in the computing standard curriculum (Higher Education).

Field	Body of Knowledge			Remarks
Computer Science: CS 2013	KA	Knowledge Units (KUs)		-
Computer Science: CS 2013	Intelligent Systems (IS)	o Fundamental Issues o Basic Search Strategies o Basic Knowledge Representation and Reasoning o Basic Machine Learning o Advanced Search o Advanced Representation and Reasoning	o Reasoning Under Uncertainty o Agents o Natural Language o Advanced Machine Learning o Robotics o Perception and Computer Vision	o Knowledge-based learning: - Knowledge area (KA), knowledge unit (KU), learning outcome (LO) o Total number of KAs: 18 o Cross-reference: - Algorithms and Complexity (AL) - Discrete Structures (DS) - Human–Computer Interaction (HCI) - Information Management (IM) o Learning outcomes: 3 levels of mastery (familiarity, usage, assessment)
Data Science: CCDS 2021	KA	Sub-domains		-
	Artificial Intelligence (AI)	o General o Knowledge Representation and Reasoning (Logic-based Models)	o Planning and Search Strategies	o Competency-based learning o Competency = knowledge + skills + dispositions in context o Competency framework - Area > scope, competencies, sub-domains > Knowledge-Skills-Disposition (K-S-D) o Skills: expressed in areas of learning outcomes. o Total number of KAs: 11 o Data Privacy, Security, Integrity, and Analysis for Security (DPSIA): cross-cutting issues around privacy, security, and integrity that relate to competencies in all of the Knowledge Areas
	Data Mining (DM)	o Proximity Measurement o Data Preparation o Information Extraction o Cluster Analysis o Classification and Regression	o Pattern Mining o Outlier Detection o Time Series Data o Mining Web Data o Information Retrieval
	Machine Learning (ML)	o General o Supervised Learning o Unsupervised Learning o Mixed Methods	o Deep Learning o Reinforcement Learning (Appears in AI-Knowledge Representation and Reasoning—Probability-based Models)

Table 2. AI-related KAs in the computing standard curriculum.

Curriculum	Contents		Remarks
K-12 AI Curriculum	Competency	Area: Domain	∙Competency-based education (CBE) - knowledge: Domain > sub-domain > learning outcomes (common verbs include “know”, “understand”, “reflect”, and “compare”) - Skills: Topic area > description of skills (common verbs include “use”, “create”, “build”, “revise”, and “write”) - Values: Value/Attitude to be developed > examples of related knowledge and skills outcomes (common verbs include “explore”, “solve”, “create”, “reflect”, “design”, “use”, “explain”, “compare”) ∙Grade levels: Primary, middle, high
	Knowledge	∙AI foundations: Algorithms, Programming, Contextual problem-solving, Data literacy ∙Understanding, using, and developing AI: AI techniques, AI technologies, AI development ∙Ethics and social impact: Applications of AI to other domains (AI Applications), AI Ethics, Social implications of AI
	Skills
	Values/ Attitudes	∙Personal: Interest in ICT, Persistence/Resilience, Personal empowerment, Reflection, Critical thinking and reflection, Entrepreneurship ∙Social: Collaboration/Teamwork, Communication ∙Societal: Respect for others, Personal responsibility, Integrity, Tolerance ∙Human: Respect for the environment/sustainable mindset, Commitment to equity
K-12 AI Guidelines	Big Idea	Concept	∙Big idea > concept, LO, EU, Unpacked description, activity - LO (Learning Objective): what students should be able to learn - EU (Enduring Understanding): What students should know - Unpacked descriptions are included when necessary to illustrate the LO or EU - Activity: Teaching and learning materials that support practice are provided as a google link ∙The AI4K12 draft guidelines are organized in grade band progression charts that span K-2, 3–5, 6–8, and 9–12 grade bands - “Perception” is the process of extracting meaning from sensory signals - Agents maintain “representations” of the world and use them for “reasoning.” - “Machine learning” is a kind of statistical inference that finds patterns in data - Intelligent agents require many kinds of knowledge to “interact naturally” with humans - AI can “impact society” in both positive and negative ways
	Perception	∙Sensing: [1-A-i] Living things, [1-A-ii] Computer sensors, [1-A-iii] Digital encoding ∙Processing: [1-B-i] Sensing vs perception, [1-B-ii], Feature extraction, [1-B-iii] Abstraction pipeline: language, [1-B-iv] Abstraction pipeline: vision ∙Domain knowledge: [1-C-i] Types of domain knowledge, [1-C-ii] Inclusivity
	Representation and Reasoning	∙Representation: [2-A-i] Abstraction, [2-A-ii] Symbolic representations, [2-A-iii] Data structures, [2-A-iv] Feature vectors ∙Search: [2-B-i] State spaces and operators, [2-B-ii] Combinatorial search ∙Reasoning: [2-C-i] Types of reasoning problems, [2-C-ii] Reasoning algorithms
	Learning	∙Nature of Learning: [3-A-i] Humans vs. machines, [3-A-ii] Finding patterns in data, [3-A-iii] Training a model, [3-A-iv] Constructing vs. using a reasoner, [3-A-v] Adjusting internal representations, [3-A-vi] Learning from experience ∙Neural Networks: [3-B-i] Structure of a neural network, [3-B-ii] Weight adjustment ∙Datasets: [3-C-i] Feature sets, [3-C-ii] Large datasets, Bias
	Natural Interaction	∙Natural Language: [4-A-i] Structure of language, [4-A-ii] Ambiguity of language, [4-A-iii] Reasoning about text, [4-A-iv] Applications ∙Common-sense Reasoning [4-B-i] ∙Understanding Emotion [4-C-i] ∙Philosophy of Mind [4-D-i]
	Societal Impact	-

Table 3. Composition of K-12 national-level AI curricula.

Nation	Title of Document	Contents		Remarks
		Level	Unit	∙Level > units > topics, Learning outcomes (knowledge, comprehensions, evaluation) ∙*: Should be assessed in practical examination only ∙Unit: knowledge, skills, values, critical and creative thinking skills indicated. ∙Recommended duration for each unit: - Total marks: 100 (Theory 50 + Practical 50), hours per unit (Theory + Practice)
India CBSE (2023–2024)	Artificial Intelligence (senior secondary level CLASS—XI & XII)	AI Informed (AI Foundations)	∙Introduction to AI (Knowledge) ∙AI Applications & Methodologies * (Knowledge) ∙Math for AI (Knowledge) ∙AI Values—Ethical Decision Making (Values) ∙Introduction to Storytelling * (Skills)
		AI Inquired (AI Apply)	∙Critical & Creative Thinking * (Skills) ∙Data Analysis—Computational Thinking * (Skills) ∙Regression (Knowledge) ∙Classification & Clustering(Knowledge) ∙AI Values—Bias Awareness * (Values)
		AI Innovate	∙Capstone Project ∙Model lifecycle (Knowledge) ∙Storytelling Through Data (Critical and Creative thinking Skills)
China (2017)	Preliminary AI (High school)	Area	Content demand	∙1 course out of 6 ∙Required elective modules ∙Area > content demand
		Introduction to AI	∙The concept and basic features of AI are explained. The development process of AI and typical applications and trends are taught. ∙Core AI algorithms (heuristic search, decision-making tree) are understood through specific case analysis. The basic process and implementation principle of smart technology application are described.
		Development of simple AI application module	∙Development tools and platforms for AI application of specific area (machine learning) are described, and their features, application models, and limitations are understood through specific cases. ∙A simple AI application module is built using open-source AI application tools; appropriate environment, parameters, and natural interaction technology are acquired depending on actual needs.
		Development and application of AI technology	∙Ethical and security issues faced by intelligent society are explored through experiences related to the application of intelligent systems. Security awareness and responsibilities are strengthened by learning basic methods and measures of information system security. ∙The fact that AI can bring both massive value and potential threat to the future development of human societies is dialectically recognized. Compliance with norms and laws concerning socialization and application of AI is realized.
Republic of Korea (2020)	Introduction to AI (High school)	Area	Key concept	∙1 of 2 elective subjects ∙Content system: area > key concept > content elements ∙Learning elements, achievement standards
		Understanding AI	∙AI and society ∙AI and agent
		Principles and application of AI	∙Recognition ∙Search and inference ∙Learning
		Data and ML	∙Data ∙ML models
		Social impact of AI	∙Impact of AI ∙Ethics of AI

Table 4. Research on textbook analysis methods.

Year		2022 [13]	2022 [14]	2022 [15]	2021 [16]	2021 [17]	2021 [18]	2020 [19]	2020 [20]	2020 [21]	2020 [22]	2020 [23]	2020 [24]	2019 [25]	2018 [26]	2018 [27]	2015 [28]	2010 [29]
Nation		Indonesia	Finland	Taiwan	China	Hong Kong	Israel	China	Germany	USA	Russia	Arab	Spain	North Korea	Czech Republic	Korea	Korea	Israel
Subject		Physics	Biology	Language	Science	Math	Civic (political, civic)	Chemistry	Biology	American history	Social science	Science	Social science	Chemistry	Chemistry	Science	Computer	Science
Analytical perspective	Knowledge or society	Female	Liberation of the natives			Similarity		Safety education	Genetic section—transgenic work	Underprivileged groups (gender, race, ethnicity)		Science, religion	Concept of state democracy					Cells should be studied longitudinally
Analytical perspective	Etc. (STEM, competency)				STEM in physics, chemistry, and biology												Problem solving ability
Consistency of analysis data	Comparison to policy documents		Department of Education policy documents				Curriculum	Curriculum, lesson plans, textbooks										Curriculum guidelines
	Comparison over time				Previous-current	Different period (historical context, learning trajectory)	Previous-current	Previous-current				Changes compared to the 21st century	During the democratic transition	North Korea’s Kim Jong-un era
	Compare multiple documents	3	48		23	6	√		3	15	7 + 7	8	√	√				6
Type of data		Text, visual image	Text, illustration	Text	Text	Text	Text	Text (term/topic)	Text	Text	Text	Text	Text	Text	Text	Text	Text	Visual expression, text
Content analysis	Frame composition based on prior research		√		√	√		√	√			√	√		√		√	√
	Cohen’s kappa reliability test		√														√	√
	Gather expert opinions		√									√			√		√	√
	Etc.	Analysis of discourse	Analysis of discourse			Horizontal and vertical analysis	Change in discourse						Subsequent evolution of descriptions, expressions			Compared to the spoken language of teachers
Text mining								NLPIR (Chinese text mining tool), frequency analysis, word cloud		Topic modeling (LDA), Wordnet, word embedding, cross reference	LDA, ARTM Word2vec—correlation of cosine distances					Text network analysis
Statistical analysis or Romey method														Romey method			Using SPSS 18.0 and Amos 20.0 SW tools + confirmatory factor analysis quantitative analysis coefficient
Etc.											Text complexity			Inclination to seek	Difficulty

Table 5. Contents of basic AI curriculum.

Area	Key Concept	Content Elements	Learning Elements
1. Understanding of AI	AI and society	∙Concept and characteristics of AI ∙Advances in AI technology and social changes	· Concept of AI · Characteristics of AI · Role of AI	· AI and social changes · AI and occupational changes
1. Understanding of AI	AI and agent	∙Concept and role of intelligent agent	· Concept of intelligent agent · Relationship between AI and intelligent agent	· Role of intelligent agent
2. Principles and application of AI	Recognition	∙Sensor and recognition ∙Computer vision ∙Voice recognition and language comprehension	· Sensor · Voice recognition · AI speaker · Chatbot	· Image recognition · Computer vision · Robot vision
	Search and inference	∙Problem solving and search ∙Representation and inference	· Search tree · Best-first search	· Knowledge representation · Inference
	Learning	∙Concept and application of machine learning ∙Concept and application of deep learning	· Machine learning · Classification · Clustering	· Forecasting · Deep learning
3. Data and machine learning	Data	∙Data attributes ∙Structured data ∙Unstructured data	· Data attributes · Structured data · Unstructured data
3. Data and machine learning	Machine learning models	∙Classification model ∙Machine learning model implementation	· Machine learning · Classification model · Machine learning model implementation · Key property extraction	· Training data · Test data · Model training · Performance evaluation
4. Social impact of AI	Impact of AI	∙Solving social problems ∙Data bias	· Values of AI · Impact of AI	· Data bias
4. Social impact of AI	Ethics of AI	∙Ethical dilemma ∙Social responsibilities and fairness	· Ethics of AI · Ethical dilemma	· Social responsibilities · Fairness of AI

Table 6. Frequency (%) of each content area in the composition of the different textbooks analyzed.

Textbooks	1. Understanding of AI	2. Principles and Application of AI	3. Data and Machine Learning	4. Social Impact of AI	Total
A [45]	32 (17.5)	66 (36.1)	46 (25.1)	39 (21.3)	183
B [46]	32 (14.4)	80 (36.0)	72 (32.4)	38 (17.1)	222
C [47]	32 (16.2)	70 (35.4)	64 (32.3)	32 (16.2)	198
D [48]	30 (15.0)	74 (37.0)	70 (35.0)	26 (13.0)	200
E [49]	28 (13.5)	74 (35.6)	70 (33.7)	36 (17.3)	208
F [50]	34 (15.7)	80 (37.0)	62 (28.7)	40 (18.5)	216
G [51]	36 (16.2)	90 (40.5)	60 (27.0)	36 (16.2)	222
H [52]	34 (16.5)	78 (37.9)	64 (31.1)	30 (14.6)	206
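
The percentages in parentheses in Table 6 are each area's share of that textbook's row total. A minimal sketch of this arithmetic, using the counts reported for textbook A; the function name `area_shares` is a hypothetical helper introduced for illustration and is not taken from the study:

```python
def area_shares(counts):
    """Return each count as a percentage of the row total, rounded to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


# Counts for textbook A in Table 6, in curriculum-area order 1 to 4.
textbook_a = [32, 66, 46, 39]

print(sum(textbook_a))          # 183
print(area_shares(textbook_a))  # [17.5, 36.1, 25.1, 21.3]
```

The same function applied to any other row should reproduce the percentages listed for that textbook.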

Table 7. Corpus frequency in the composition of the eight textbooks.

	NNP	NNG	VCN	VA	VV	NP	NNB	NR	VX	Sum
Textbook	Proper Noun	Common Noun	Negative Determiner	Adjective	Verb	Pronoun	General Dependent Noun	Numeral	Auxiliary Verb	Sum
A [45]	1761	4945	17	73	504	22	34	14	4	7374
B [46]	1813	4876	22	114	505	50	41	6	17	7444
C [47]	1351	3947	10	108	420	50	21	7	14	5928
D [48]	1790	4797	26	116	457	65	26	13	5	7295
E [49]	1642	4856	27	86	530	44	33	12	6	7236
F [50]	2535	6866	25	173	761	57	40	19	11	10,487
G [51]	3444	8956	64	254	1058	105	112	24	25	14,042
H [52]	1268	3912	20	148	449	87	47	12	3	5946

Table 8. TF analysis results by curriculum area and textbook.

Cur.	Textbook: Corpus (Frequency) √: Keywords Included in Learning Elements *: Other Areas
Area	A [45]	B [46]	C [47]	D [48]	E [49]	F [50]	G [51]	H [52]
1. Understanding of AI	√ AI (66)	√ AI (98)	√ AI (76)	√ AI (70)	√ AI (95)	√ AI (81)	√ AI (137)	√ Intelligence (45)
	√ Agent (42)	Technology (31)	√ Intelligence (31)	√ Intelligence (51)	√ Agent (34)	√ Agent (45)	√ Intelligence (97)	√ AI (43)
	Technology (28)	Learning (29)	√ Agent (28)	√ Agent (46)	√ Intelligence (30)	√ Intelligence (44)	√ Agent (78)	√ Agent (32)
	√ Intelligence (25)	√ Agent (28)	* Learning (20)	Human (23)	Human (27)	Human (30)	Human (44)	Human (17)
	Human (19)	Human (28)	Technology (17)	Field (21)	* Data (24)	* Learning (18)	Behavior (41)	√ Role (13)
	* Learning (17)	√ Intelligence (26)	√ Occupation (17)	* Learning (15)	Learning (20)	System (18)	Environment (31)	Computer (12)
	Information (17)	√ Occupation (16)	√ Change (17)	√ Role (14)	Technology (13)	√ Change (17)	Information (30)	√ Occupation (9)
	Field (16)	* Data (15)	√ Society (17)	Execute (12)	* Inference (13)	Technology (16)	Software (30)	√ Society (9)
	Goal (15)	√ Change (14)	Mail (15)	Situation (12)	Field (12)	Through (16)	Person (27)	Use (9)
	Environment (13)	Development (13)	Ability (12)	√ Occupation (11)	√ Change (11)	Execute (14)	Field (26)	Behavior (8)
	Through (13)	√ Society (12)	Application (12)	√ Society (11)	Role (10)	User (14)	√ Change (25)	√ Change (8)
	Problem (13)	Software (11)	Inference (11)	Individual (11)	√ Society (10)	Nature (14)	Occupation (23)	Judgment (8)
	√ Society (12)	Machine (10)	Field (11)	Application (10)	Software (10)	√ Field (13)	Learning (22)	New (8)
	Behavior (12)	Use (10)	Task (11)	Task (10)	Apply (10)	√ Society (13)	Product (21)	We (8)
	* Robot (12)	Ability (10)	Spam (11)	Recognition (10)	Problem (10)	Apply (13)	Necessity (19)	Software (7)
	Characteristics (11)	Environment (9)	Apply (9)	Technology (9)	Not (9)	Recognition (13)	Fast (19)	Application (7)
	Kinds (11)	Through (9)	Judgment (9)	Process (9)	Behavior (9)	* Inference (12)	User (18)	Create (7)
	Knowledge (11)	* Robot (9)	Execute (9)	Not (9)	Application (8)	Behavior (12)	According (18)	Through (7)
	Based (11)	Characteristics (9)	System (9)	√ Change (8)	* Recognition (8)	Understand (12)	Execute (17)	Instead (7)
	Rule (11)	Understand (9)	Development (8)	Characteristics (8)	Concept (8)	Process (11)	Role (17)	Autonomous (6)
2. Principles and application of AI	* Data (63)	√ AI (68)	√ Search (60)	State (74)	√ Learning (88)	State (110)	√ Search (99)	√ Learning (86)
	√ Learning (47)	√ Recognition (59)	√ Recognition (50)	√ Search (65)	* Data (84)	√ Learning (99)	√ Learning (91)	* Data (44)
	Node (38)	Information (59)	* Data (37)	√ Recognition (35)	State (71)	* Data (85)	* Data (88)	Method (40)
	State (36)	* Learning (58)	Information (36)	√ AI (29)	√ Search (50)	√ AI (72)	Method (63)	√ Search (36)
	√ Search (36)	* Data (53)	√ Knowledge (32)	Goal (28)	√ Recognition (46)	Application (62)	√ Recognition (59)	Application (34)
	√ Sensor (36)	Technology (52)	√ Sensor (31)	√ Knowledge (26)	Through (41)	√ Recognition (58)	√ Representation (56)	We (34)
	√ Recognition (33)	√ Search (45)	Application (29)	Understand (26)	Application (39)	Field (55)	Human (55)	√ AI (33)
	Rule (33)	√ Sensor (39)	Method (27)	√ Sensor (24)	Neural network (36)	Information (50)	Word (54)	√ Machine (33)
	Application (32)	Application (38)	√ AI (26)	√ Image (24)	√ AI (34)	√ Classification (50)	Knowledge (53)	√ Recognition (29)
	Problem (31)	√ Voice recognition (34)	√ Image (26)	Technology (24)	Problem (34)	√ Search (47)	√ AI (52)	Knowledge (29)
	√ Machine (31)	State (32)	Learning (24)	Information (20)	For (30)	√ Forecasting (44)	Classification (52)	Person (28)
	√ Knowledge (26)	Use (32)	√ Representation (24)	√ Representation (17)	√ Image (29)	Create (7)	State (49)	State (24)
	Technology (26)	√ Computer vision (32)	Rule (22)	Method (16)	√ Sensor (25)	√ Clustering (40)	Information (49)	Utilize (24)
	Analysis (25)	Person (31)	Human (22)	Person (16)	Technology (25)	√ Image (39)	√ Image (48)	Computer (24)
	System (25)	√ Image (30)	Use (21)	Intelligence (15)	√ Inference (25)	Rule (39)	√ Machine (48)	√ Representation (22)
	Through (24)	Field (27)	Judgment (21)	Utilize (14)	Process (25)	For (38)	√ Sensor (45)	Problem (22)
	Process (24)	Problem (26)	Through (20)	For (14)	Person (23)	√ Machine (37)	Use (43)	Create (21)
	Supervised learning (22)	Through (26)	Technology (19)	Object (14)	Utilize (23)	Use (36)	Problem (41)	How (21)
	Information (21)	Rule (23)	Person (19)	Language (14)	Field (23)	Case (34)	Process (39)	Field (20)
	Human (21)	√ Representation (23)	Utilize (19)	Experience (13)	Next (23)	Method (31)	Utilize (37)	Technology (20)
3. Data and machine learning	√ Data (170)	√ Data (151)	√ Data (151)	√ Data (337)	√ Data (204)	√ Data (245)	√ Data (313)	√ Data (160)
	√ Model (57)	√ Attribute (80)	√ Classification (37)	√ Learning (192)	√ Model (94)	√ Classification (109)	√ Classification (126)	√ Model (72)
	√ Classification (55)	√ Learning (64)	√ AI (33)	√ Attribute (176)	√ Classification (80)	√ Learning (105)	√ Model (111)	√ Learning (66)
	√ Learning (53)	√ Classification (56)	√ Learning (27)	√ Model (96)	√ Attribute (51)	√ Machine (82)	√ Learning (103)	√ Attribute (65)
	Iris (35)	√ Model (53)	√ Structured (27)	√ Machine (78)	√ Learning (45)	√ Attribute (68)	Machine (71)	√ Classification (58)
	√ Attribute (34)	√ Structured (41)	√ Attribute (24)	Utilize (61)	* AI (45)	√ Model (54)	√ Attribute (64)	Problem (49)
	√ Machine (28)	Meal (34)	Application (24)	Classification (53)	Generate (35)	Iris (45)	Problem (54)	√ Machine (30)
	√ Structured (24)	√ Core (28)	√ Model (23)	Viewpoint (48)	Problem (29)	Problem (35)	√ Structured (54)	Solving (30)
	Problem (19)	* AI (26)	Viewpoint (21)	Use (48)	Use (27)	Set (32)	Widget (49)	√ Structured (28)
	Kinds (17)	Use (24)	Necessity (18)	Relation (43)	Solving (25)	Label (28)	Iris (45)	Use (26)
	Kind (17)	√ Machine (23)	Use (15)	Analysis (40)	Collection (25)	Through (27)	* Image (44)	Class (26)
	Application (16)	Problem (20)	Problem (15)	Input (37)	Necessity (24)	√ Performance (26)	Use (38)	* Image (23)
	Characteristics (16)	√ Test (19)	Method (14)	Function (37)	Information (24)	Structured (26)	Message (38)	* Forecasting (21)
	For (15)	√ Performance (19)	Solving (13)	* Forecasting (33)	√ Performance (23)	Next (26)	√ Test (36)	Input (20)
	Information (15)	* Forecasting (15)	Representation (12)	Ratings (33)	√ Structured (21)	Feature (26)	Confirmation (36)	Case (20)
	Necessity (14)	Extract (15)	Rule (12)	Necessity (28)	√ Training (21)	Banana (26)	√ Performance (35)	Movie (20)
	Analysis (14)	Analysis (14)	√ Machine (11)	Result (28)	For (20)	Kinds (25)	For (32)	Kinds (16)
	√ Assessment (14)	Necessity (13)	Analysis (11)	Examine (28)	Input (19)	Solving (22)	Kinds (31)	Method (15)
	Use (13)	√ Training (13)	Exist (11)	√ Structured (27)	According (19)	Case (22)	Solving (30)	Relevant (15)
	Utilize (13)	Label (13)	For (10)	For (27)	Form (18)	Necessity (21)	Kind (30)	Set (14)
4. Social impact of AI	√ AI (145)	√ AI (99)	√ AI (103)	√ AI (105)	√ AI (109)	√ AI (181)	√ AI (218)	√ AI (59)
	√ Data (65)	√ Data (43)	√ Society (51)	√ Ethics (28)	√ Data (56)	√ Ethics (49)	√ Data (72)	√ Data (35)
	√ Bias (48)	√ Society (32)	√ Ethics (41)	√ Data (28)	√ Bias (28)	√ Society (46)	* Learning (57)	√ Society (29)
	√ Ethics (39)	√ Bias (29)	√ Data (31)	Human (23)	√ Society (27)	Problem (46)	√ Society (53)	√ Ethics (28)
	Human (36)	√ Ethics (22)	√ Bias (22)	Technology (18)	√ Ethics (26)	Technology (43)	√ Ethics (50)	Problem (26)
	√ Society (35)	Human (22)	Learning (21)	√ Bias (17)	Application (22)	Learning (42)	Application (47)	√ Bias (20)
	Learning (30)	Person (20)	* Problem (20)	For (17)	* Learning (19)	√ Data (41)	Human (44)	For (17)
	√ Dilemma (23)	Occur (18)	Human (19)	Use (16)	Person (17)	Human (38)	Result (43)	Use (13)
	Judgment (23)	Judgment (17)	Application (19)	√ Fair (15)	Kinds (14)	Application (34)	√ Bias (38)	√ Fair (13)
	Situation (22)	Select (16)	Judgment (16)	Develop (14)	Problem (13)	Solving (28)	Occurrence (37)	√ Dilemma (12)
	Problem (21)	Situation (15)	Occurrence (15)	Consideration (13)	Image (13)	Occurrence (23)	Technology (26)	Learning (11)
	For (21)	Problem (15)	√ Impact (15)	√ Impact (11)	Human (12)	Develop (21)	Person (24)	Technology (11)
	Application (20)	Drive (15)	Situation (14)	For (11)	Result (12)	√ Dilemma (19)	Problem (23)	√ Responsibility (11)
	Technology (20)	√ Fair (14)	√ Dilemma (13)	Occurrence (10)	Use (11)	For (19)	√ Responsibility (23)	Solving (10)
	Person (17)	Autonomous (14)	Develop (13)	Necessity (10)	Situation (11)	For (18)	Use (23)	Member (10)
	Result (16)	Automobile (14)	Technology (13)	Secure (10)	Development (10)	√ Bias (17)	For (21)	Person (9)
	For (14)	Application (13)	For (12)	Case (10)	Developer (10)	Situation (17)	Necessity (21)	For (9)
	Kinds (13)	Use (13)	Result (12)	Perspective (9)	Shown (10)	Future (17)	Not (21)	Because (9)
	√ Fair (12)	√ Impact (13)	Solving (10)	Proper (9)	Case (9)	User (16)	Automobile (20)	√ Impact (8)
	Recognition (11)	Dilemma (12)	√ Fair (9)	Individual (9)	√ Dilemma (8)	System (16)	Machine (20)	Prejudice (8)
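
A ranked term-frequency list of the kind shown in Table 8 could be produced along the following lines. This is a minimal sketch, assuming the text of one curriculum area has already been tokenized into keywords (for example, by a morphological analyzer); the function name `top_terms`, the toy token list, and the learning-element set are hypothetical and are not the study's data or code:

```python
from collections import Counter


def top_terms(tokens, learning_elements, k=3):
    """Return the k most frequent tokens as (term, frequency, in_learning_elements)."""
    counts = Counter(tokens)
    return [
        (term, freq, term in learning_elements)
        for term, freq in counts.most_common(k)
    ]


# Toy, already-tokenized input (an assumption made for illustration only).
tokens = ["AI", "AI", "AI", "agent", "agent", "technology"]
# Hypothetical set of curriculum learning-element keywords for this area.
learning_elements = {"AI", "agent"}

for term, freq, included in top_terms(tokens, learning_elements):
    mark = "√ " if included else ""
    print(f"{mark}{term} ({freq})")
```

The boolean flag plays the role of the √ marker in Table 8: it records whether a frequent term also appears among the curriculum's learning elements.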

Table 9. TF-IDF analysis results by curriculum area and textbook.

Cell format: Term (TF-IDF score). *: computing field-related keyword.

Curriculum Area	A [45]	B [46]	C [47]	D [48]	E [49]	F [50]	G [51]	H [52]
1. Understanding of AI	* Computing (4.16)	Dust (6.93)	Mail (14.71)	Book (6.93)	Law (3.92)	Nature (6.58)	Excellent (9.81)	Individual (4.16)
	Intensity (4.16)	* Neural network (5.55)	Spam (10.79)	Center (4.16)	Intuition (2.94)	Protagonist (4.16)	Logistics (8.32)	Multitude (3.92)
	Perception (3.76)	* Clustering (4.16)	Airline ticket (8.32)	Counseling (3.47)	Supplement (2.94)	Coffee (4.16)	Surgery (6.93)	Action (2.94)
	Manufacture (2.94)	Material (4.16)	Revolution (4.16)	Customer (3.29)	Memory (2.94)	Mail (3.92)	Construction (6.93)	Vitality (2.94)
	Save (2.94)	Cleaning (3.29)	Fourth industrial (4.16)	Prescription (2.94)	* Computing (2.77)	Command (3.16)	Camera (5.55)	Here (2.77)
	* Knowledge base (2.94)	So (2.94)	Reservation (3.47)	Confirmation (2.94)	English (2.77)	Raise (2.94)	Shooting (5.55)	Collide (2.77)
	Engineering (2.94)	Advantage (2.94)	Prospect (3.47)	Material (2.94)	Voice (2.77)	Active (2.94)	With machine (5.55)	Economy (2.08)
	* Computer vision (2.94)	Current (2.94)	Properly (2.94)	File (2.94)	Judge (2.77)	Daily life (2.77)	Premise (5.55)	Value (2.08)
	Portability (2.77)	Chess (2.94)	Happen (2.94)	Surroundings (2.94)	Clinical (2.77)	Lend (2.77)	Forest fire (5.55)	Stay (1.96)
	Preference (2.77)	Obstacle (2.77)	Influence (2.77)	Communication (2.94)	Cleaning (2.77)	Handle (2.77)	Be (5.55)	Imitate (1.88)
2. Principles and application of AI	* Node (17.86)	Assist (11.09)	Withdraw (6.93)	One side (4.16)	Production (6.93)	Blog (16.64)	* Propositional logic (17.65)	Sugar level (8.32)
	* Computing (7.05)	Creation (11.09)	Fault (6.93)	Cold (4.16)	* Supervised (5.75)	Detection (12.48)	Attack (15.25)	Conjecture (8.32)
	Turn (6.93)	Group (9.7)	* Node (6.58)	* Entry (4.16)	Apple (5.55)	Attach (8.32)	Giraffe (13.86)	Achievement (5.55)
	Piece (6.87)	Hotdog (8.83)	* Propositional logic (5.88)	Deform (4.16)	Farmer (5.55)	Carrot (6.93)	* Node (12.69)	Sunshine (5.55)
	* Supervised learning (6.33)	Substitute (6.87)	Lie (5.88)	Guess (3.29)	* Identifier (5.55)	Future (6.93)	Frequency (11.09)	Supervised (5.47)
	Books (5.55)	Piece (5.88)	Affect (5.55)	* Software (2.94)	Prerequisite (5.55)	Exercise (6.93)	* Proposition (10.34)	Attribute (5.18)
	Graft (4.16)	Voice (4.85)	Point (4.89)	* Operator (2.77)	* Generator (5.55)	Current (5.88)	Hill (9.81)	Entity (4.16)
	Reflect (4.16)	* Supervised (4.32)	* Proposition (4.7)	Return (2.77)	Cross (4.85)	* Regression (5.88)	Officer (9.7)	Apple (4.16)
	Launch (4.16)	Remind (4.16)	Junction (4.16)	Descend (2.77)	Concentration (4.16)	Number of cases (5.88)	* Supervised learning (8.63)	Knowledge base (4.16)
	* Pattern recognition (3.92)	Expenditure (4.16)	Volume (4.16)	Edge (2.77)	Verify (4.16)	Bicycle (5.88)	* Linear regression (8.32)	Class (4.16)
3. Data and machine learning	Iris (10.07)	Meal (47.13)	Customer (5.88)	Ratings (45.75)	Discover (8.32)	Banana (18.02)	Widget (67.93)	Class (18.02)
	Salmon (9.7)	Clothes (11.09)	Chart (5.88)	Function (25.65)	Period (6.93)	Body type (16.64)	Message (52.68)	Audience (18.02)
	Sea bass (9.7)	Number of people (9.7)	Hateful comments (4.16)	Library (23.57)	Transaction (6.93)	Iris (12.95)	Folder (22.18)	Movie (13.86)
	Fish (5.55)	Morning (8.83)	Deficiency (4.16)	Survival (23.57)	Price (6.87)	Correlation (11.09)	Connection (13.17)	Question (9.01)
	* Data type (5.55)	Peak (8.32)	Hashtag (4.16)	Das (13.86)	Score (6.24)	Surface (11.09)	Iris (12.95)	Answer (8.32)
	Zebra (5.55)	Protein (6.93)	Late night (4.16)	Grape (13.86)	Wear (5.88)	Hit (9.81)	List (11.77)	Review (6.87)
	Okapi (5.55)	Alight (5.55)	Temperature (4.16)	Furniture (13.86)	Mask (5.88)	Code (7.85)	Confusion (11.09)	Box (5.55)
	Giraffe (5.55)	Snack bar (5.55)	Agent (4.16)	Dependence (13.86)	* Class (5.55)	Song (6.93)	* Logistic regression (10.4)	Wear (4.9)
	Kind (4.89)	Crunch (5.55)	Chicken (4.16)	Kyoho grape (13.86)	Generate (4.67)	Width (6.04)	Orange (9.7)	Actor (4.9)
	Table (4.23)	Go to school (5.55)	Block (4.16)	3B (12.48)	Used (4.16)	License plate (5.88)	Kind (8.63)	Uplift (4.16)
4. Social impact of AI	Weapon (7.85)	Reality (4.85)	Properly (4.9)	Accept (4.9)	Track (9.7)	Prepare (5.55)	* Security (8.32)	Committee (4.16)
	Intentional (6.93)	Disease (4.85)	Red (4.16)	Permission (4.16)	Image (9.01)	Picture (4.9)	Crime (7.52)	Beneficial (2.77)
	Translation (6.87)	Disability (4.16)	Contribute (2.94)	Subject (3.47)	Cat (8.32)	Weapon (4.9)	Algal bloom (6.93)	Reward (2.77)
	Duty (5.88)	Amazon (4.16)	Logic (2.94)	Item (2.94)	Hair (8.32)	System (4.6)	Stock (6.93)	Principles of (2.77)
	Omission (5.55)	Truck (4.16)	Solution (2.94)	Gap (2.94)	Appearance (5.88)	Life (4.16)	Incident (6.87)	Unnecessity (2.77)
	Competence (5.55)	Pollution (4.16)	Contradiction (2.77)	Decline (2.77)	You (5.55)	Elephant (4.16)	Foreign language (5.55)	Advantage (2.77)
	Reduction (5.55)	Efficiency (3.92)	Display (2.77)	Instruction (2.77)	Structured (5.55)	Grant (4.16)	Old (5.55)	Rail (2.77)
	Expense (4.9)	Hurt (3.92)	Barista (2.77)	Emerge (2.77)	Background (4.9)	Interests (4.16)	Disappear (5.55)	Provider (2.77)
	Example (4.16)	Driver (3.76)	Coffee (2.77)	Aim (2.77)	Quality (4.16)	Prejudice (3.92)	Transaction (5.55)	Induce (2.77)
	Experimenter (4.16)	Diagnosis (3.47)	Deficiency (2.08)	Lose (2.77)	Discover (4.16)	Daily life (3.92)	Damage (5.55)	All (2.35)
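
Many scores in Table 9 are close to integer multiples of ln 4 ≈ 1.386 (for example 2.77, 4.16, 5.55, 6.93, 8.32, and 13.86). That pattern is what an unsmoothed weighting of the form tf × ln(N/df) with N = 8 textbooks would produce. This is an inference from the reported values, not a formula stated in this section. The sketch below illustrates that weighting under this assumption; the function name `tf_idf_by_document` and the toy corpus are hypothetical and do not come from the study's data.

```python
import math
from collections import Counter


def tf_idf_by_document(docs: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Score each term as raw count * ln(N / df), with no smoothing (assumed form)."""
    n_docs = len(docs)
    # Document frequency: number of documents in which a term occurs at least once.
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        scores[name] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy corpus: eight "textbooks" labelled A to H.
toy_docs = {name: ["data", "learning"] for name in "ABCDEFGH"}
toy_docs["A"] = toy_docs["A"] + ["node"] * 3  # "node" occurs 3 times in A
toy_docs["C"] = toy_docs["C"] + ["node"]      # and once in C, so df("node") = 2

scores = tf_idf_by_document(toy_docs)
print(round(scores["A"]["node"], 2))  # 3 * ln(8 / 2)
print(scores["A"]["data"])            # a term present in every document scores 0
```

Under this weighting, a term shared by all eight textbooks scores zero. That is consistent with Table 9 being dominated by textbook-specific example words rather than by the common curriculum keywords.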

Table 10. LDA analysis results for each textbook and evaluation of consistency with the curriculum.

Textbook	Topic Composition																				Total	Curr. Area 1	Curr. Area 2	Curr. Area 3	Curr. Area 4
A [45]	0.093*”data”	+	0.031*”model”	+	0.030*”classification”	+	0.029*”learning”	+	0.019*”iris”	+	0.019*”attribute”	+	0.015*”machine”	+	0.013*”structured”	+	0.011*”problem”	+	0.009*”kinds”	+	0.415		√	√
	0.009*”kind”	+	0.009*”application”	+	0.009*”characteristics”	+	0.008*”information”	+	0.008*”regarding”	+	0.008*”analysis”	+	0.008*”necessity”	+	0.008*”assessment”	+	0.007*”use”	+	0.007*”utilize”	+
	0.007*”length”	+	0.007*”AI”	+	0.007*”for”	+	0.007*”form”	+	0.007*”save”	+	0.006*”image”	+	0.006*”discern”	+	0.006*”understand”	+	0.006*”collection”	+	0.006*”training”
	0.066*”AI”	+	0.023*”data”	+	0.017*”human”	+	0.015*”bias”	+	0.015*”society”	+	0.015*”technology”	+	0.014*”learning”	+	0.013*”agent”	+	0.012*”ethics”	+	0.010*”problem”	+	0.330	√			√
	0.010*”Judgment”	+	0.009*”regarding”	+	0.009*”intelligence”	+	0.008*”application”	+	0.007*”kinds”	+	0.007*”dilemma”	+	0.007*”situation”	+	0.007*”information”	+	0.007*”person”	+	0.006*”through”	+
	0.006*”robot”	+	0.006*”for”	+	0.006*”field”	+	0.006*”result”	+	0.005*”behavior”	+	0.005*”according”	+	0.005*”explain”	+	0.005*”necessity”	+	0.005*”recognition”	+	0.004*”develop”
	0.021*”data”	+	0.016*”learning”	+	0.013*”state”	+	0.013*”sensor”	+	0.012*”node”	+	0.012*”search”	+	0.012*”rule”	+	0.011*”recognition”	+	0.011*”machine”	+	0.011*”application”	+	0.281		√
	0.011*”problem”	+	0.009*”knowledge”	+	0.009*”technology”	+	0.009*”analysis”	+	0.009*”system”	+	0.009*”process”	+	0.008*”through”	+	0.008*”information”	+	0.007*”supervised learning”	+	0.007*”human”	+
	0.007*”field”	+	0.007*”use”	+	0.007*”process”	+	0.006*”for”	+	0.006*”method”	+	0.006*”computer vision”	+	0.006*”understand”	+	0.006*”goal”	+	0.006*”input”	+	0.006*”case”
B [46]	0.076*”data”	+	0.040*”attribute”	+	0.032*”learning”	+	0.028*”classification”	+	0.027*”model”	+	0.021*”structured”	+	0.017*”meal”	+	0.014*”core”	+	0.013*”AI”	+	0.012*”use”	+	0.425		√	√
	0.012*”machine”	+	0.010*”problem”	+	0.010*”performance”	+	0.010*”test”	+	0.008*”forecast”	+	0.008*”extract”	+	0.007*”analysis”	+	0.007*”according”	+	0.007*”necessity”	+	0.007*”label”	+
	0.007*”implement”	+	0.007*”training”	+	0.006*”viewpoint”	+	0.006*”application”	+	0.006*”image”	+	0.006*”select”	+	0.006*”result”	+	0.005*”for”	+	0.005*”solving”	+	0.005*”neighbor”
	0.045*”AI”	+	0.019*”data”	+	0.016*”learning”	+	0.015*”technology”	+	0.012*”information”	+	0.012*”recognition”	+	0.012*”human”	+	0.009*”application”	+	0.009*”person”	+	0.009*”use”	+	0.279	√	√		√
	0.008*”sensor”	+	0.008*”search”	+	0.008*”problem”	+	0.007*”society”	+	0.007*”agent”	+	0.007*”through”	+	0.006*”intelligence”	+	0.006*”field”	+	0.006*”voice recognition”	+	0.006*”state”	+
	0.006*”machine”	+	0.006*”computer vision”	+	0.005*”image”	+	0.005*”understand”	+	0.005*”Judgment”	+	0.005*”bias”	+	0.005*”autonomous”	+	0.005*”drive”	+	0.005*”development”	+	0.005*”we”
C [47]	0.100*”data”	+	0.025*”classification”	+	0.022*”AI”	+	0.018*”learning”	+	0.018*”structured”	+	0.016*”application”	+	0.016*”attribute”	+	0.015*”model”	+	0.014*”viewpoint”	+	0.012*”necessity”	+	0.399		√	√
	0.010*”use”	+	0.010*”problem”	+	0.009*”method”	+	0.009*”solving”	+	0.008*”rule”	+	0.008*”representation”	+	0.007*”analysis”	+	0.007*”machine”	+	0.007*”exist”	+	0.007*”for”	+
	0.007*”criteria”	+	0.007*”collection”	+	0.006*”according”	+	0.006*”through”	+	0.006*”utilize”	+	0.006*”create”	+	0.006*”many”	+	0.006*”input”	+	0.006*”because”	+	0.005*”human”
	0.078*”AI”	+	0.030*”society”	+	0.018*”learning”	+	0.018*”ethics”	+	0.016*”data”	+	0.014*”application”	+	0.014*”intelligence”	+	0.013*”technology”	+	0.012*”agent”	+	0.011*”problem”	+	0.372	√			√
	0.011*”Judgment”	+	0.011*”human”	+	0.010*”bias”	+	0.008*”for”	+	0.008*”impact”	+	0.008*”change”	+	0.008*”occupation”	+	0.007*”ability”	+	0.007*”occur”	+	0.007*”develop”	+
	0.007*”solving”	+	0.007*”field”	+	0.007*”mail”	+	0.007*”result”	+	0.007*”situation”	+	0.006*”how”	+	0.006*”task”	+	0.006*”dilemma”	+	0.005*”inference”	+	0.005*”role”
	0.022*”search”	+	0.018*”recognition”	+	0.014*”data”	+	0.013*”information”	+	0.012*”knowledge”	+	0.011*”sensor”	+	0.011*”application”	+	0.010*”method”	+	0.010*”AI”	+	0.010*”image”	+	0.268	√	√
	0.009*”learning”	+	0.009*”representation”	+	0.008*”human”	+	0.008*”rule”	+	0.008*”Judgment”	+	0.008*”use”	+	0.007*”through”	+	0.007*”person”	+	0.007*”technology”	+	0.007*”utilize”	+
	0.007*”surrounding”	+	0.007*”field”	+	0.006*”according”	+	0.006*”computer”	+	0.006*”computer vision”	+	0.006*”point”	+	0.006*”assessment”	+	0.005*”agent”	+	0.005*”situation”	+	0.005*”process”
D [48]	0.085*”data”	+	0.048*”learning”	+	0.044*”attribute”	+	0.024*”model”	+	0.020*”machine”	+	0.015*”utilize”	+	0.013*”classification”	+	0.012*”use”	+	0.012*”viewpoint”	+	0.011*”relation”	+	0.416			√
	0.010*”analysis”	+	0.009*”function”	+	0.009*”input”	+	0.008*”ratings”	+	0.008*”forecast”	+	0.007*”necessity”	+	0.007*”result”	+	0.007*”examine”	+	0.007*”for”	+	0.007*”structured”	+
	0.006*”program”	+	0.006*”create”	+	0.006*”grasp”	+	0.005*”form”	+	0.005*”application”	+	0.005*”new”	+	0.005*”image”	+	0.005*”what”	+	0.005*”problem”	+	0.005*”answer”
	0.057*”AI”	+	0.042*”intelligence”	+	0.038*”agent”	+	0.019*”human”	+	0.017*”field”	+	0.012*”learning”	+	0.012*”role”	+	0.010*”situation”	+	0.010*”execute”	+	0.009*”society”	+	0.368	√
	0.009*”individual”	+	0.009*”occupation”	+	0.008*”recognition”	+	0.008*”application”	+	0.008*”task”	+	0.008*”technology”	+	0.007*”not”	+	0.007*”process”	+	0.007*”understand”	+	0.007*”person”	+
	0.007*”surrounding”	+	0.007*”software”	+	0.007*”change”	+	0.007*”characteristics”	+	0.006*”knowledge”	+	0.006*”explain”	+	0.006*”input”	+	0.006*”repeat”	+	0.006*”customer”	+	0.006*”instead”
	0.050*”AI”	+	0.028*”state”	+	0.025*”search”	+	0.016*”technology”	+	0.014*”data”	+	0.013*”recognition”	+	0.013*”understand”	+	0.011*”goal”	+	0.011*”human”	+	0.011*”ethics”	+	0.334		√		√
	0.010*”knowledge”	+	0.009*”for”	+	0.009*”image”	+	0.009*”sensor”	+	0.009*”use”	+	0.009*”information”	+	0.008*”person”	+	0.008*”regarding”	+	0.007*”method”	+	0.006*”bias”	+
	0.006*”representation”	+	0.006*”how”	+	0.006*”application”	+	0.006*”necessity”	+	0.006*”situation”	+	0.006*”next”	+	0.006*”intelligence”	+	0.006*”fair”	+	0.005*”utilize”	+	0.005*”field”
E [49]	0.087*”data”	+	0.040*”model”	+	0.034*”classification”	+	0.022*”attribute”	+	0.019*”AI”	+	0.019*”learning”	+	0.015*”generate”	+	0.012*”problem”	+	0.012*”use”	+	0.011*”solving”	+	0.424		√	√
	0.011*”collection”	+	0.010*”information”	+	0.010*”necessity”	+	0.010*”performance”	+	0.009*”training”	+	0.009*”structured”	+	0.009*”for”	+	0.008*”input”	+	0.008*”according”	+	0.008*”form”	+
	0.007*”application”	+	0.007*”through”	+	0.007*”next”	+	0.006*”case”	+	0.006*”test”	+	0.006*”result”	+	0.006*”viewpoint”	+	0.006*”kind”	+	0.005*”machine”	+	0.005*”algorithm”
	0.078*”AI”	+	0.040*”data”	+	0.020*”bias”	+	0.019*”society”	+	0.019*”ethics”	+	0.016*”application”	+	0.014*”learning”	+	0.012*”person”	+	0.010*”kinds”	+	0.009*”image”	+	0.367	√	√		√
	0.009*”problem”	+	0.009*”result”	+	0.009*”human”	+	0.008*”situation”	+	0.008*”use”	+	0.007*”developer”	+	0.007*”emerge”	+	0.007*”development”	+	0.007*”case”	+	0.006*”automobile”	+
	0.006*”dilemma”	+	0.006*”process”	+	0.006*”training”	+	0.005*”fair”	+	0.005*”develop”	+	0.005*”occur”	+	0.005*”track”	+	0.005*”autonomous”	+	0.005*”impact”	+	0.005*”drive”
	0.031*”AI”	+	0.026*”data”	+	0.026*”learning”	+	0.018*”state”	+	0.014*”search”	+	0.013*”recognition”	+	0.013*”agent”	+	0.012*”intelligence”	+	0.012*”through”	+	0.011*”application”	+	0.322	√	√
	0.011*”problem”	+	0.011*”human”	+	0.009*”inference”	+	0.009*”technology”	+	0.009*”for”	+	0.009*”neural network”	+	0.009*”field”	+	0.008*”image”	+	0.007*”utilize”	+	0.006*”sensor”	+
	0.006*”process”	+	0.006*”input”	+	0.006*”person”	+	0.006*”solving”	+	0.006*”process”	+	0.006*”forecast”	+	0.006*”method”	+	0.006*”next”	+	0.005*”work”	+	0.005*”understand”
F [50]	0.081*”data”	+	0.036*”classification”	+	0.035*”learning”	+	0.027*”machine”	+	0.022*”attribute”	+	0.018*”model”	+	0.015*”iris”	+	0.012*”problem”	+	0.011*”set”	+	0.009*”label”	+	0.415		√	√
	0.009*”through”	+	0.009*”next”	+	0.009*”performance”	+	0.009*”feature”	+	0.009*”structured”	+	0.009*”banana”	+	0.008*”kinds”	+	0.007*”case”	+	0.007*”solving”	+	0.007*”application”	+
	0.007*”forecast”	+	0.007*”image”	+	0.007*”necessity”	+	0.007*”width”	+	0.007*”rule”	+	0.007*”input”	+	0.006*”analysis”	+	0.006*”difficult”	+	0.006*”test”	+	0.006*”training”
	0.083*”AI”	+	0.022*”society”	+	0.021*”ethics”	+	0.020*”problem”	+	0.019*”technology”	+	0.018*”learning”	+	0.018*”data”	+	0.018*”human”	+	0.015*”application”	+	0.012*”solving”	+	0.382				√
	0.010*”occur”	+	0.009*”develop”	+	0.008*”for”	+	0.008*”dilemma”	+	0.008*”regarding”	+	0.008*”system”	+	0.008*”future”	+	0.007*”user”	+	0.007*”situation”	+	0.007*”bias”	+
	0.006*”Judgment”	+	0.006*”member”	+	0.006*”service”	+	0.006*”information”	+	0.006*”fair”	+	0.006*”intention”	+	0.005*”through”	+	0.005*”responsibility”	+	0.005*”individual”	+	0.005*”result”
	0.023*”AI”	+	0.020*”learning”	+	0.019*”state”	+	0.016*”data”	+	0.012*”application”	+	0.012*”recognition”	+	0.012*”field”	+	0.010*”information”	+	0.010*”human”	+	0.010*”agent”	+	0.284	√	√
	0.009*”classification”	+	0.009*”intelligence”	+	0.008*”search”	+	0.008*”through”	+	0.008*”forecast”	+	0.008*”image”	+	0.007*”create”	+	0.007*”for”	+	0.007*”technology”	+	0.007*”rule”	+
	0.007*”machine”	+	0.007*”utilize”	+	0.007*”clustering”	+	0.006*”understand”	+	0.006*”sensor”	+	0.006*”case”	+	0.006*”apply”	+	0.006*”problem”	+	0.006*”process”	+	0.005*”knowledge”
G [51]	0.084*”data”	+	0.034*”classification”	+	0.030*”model”	+	0.028*”learning”	+	0.019*”machine”	+	0.017*”attribute”	+	0.015*”problem”	+	0.015*”structured”	+	0.013*”widget”	+	0.012*”iris”	+	0.430		√	√
	0.012*”image”	+	0.010*”use”	+	0.010*”message”	+	0.010*”test”	+	0.010*”confirmation”	+	0.009*”performance”	+	0.009*”for”	+	0.008*”kinds”	+	0.008*”solving”	+	0.008*”kind”	+
	0.008*”training”	+	0.008*”spam”	+	0.007*”result”	+	0.007*”assessment”	+	0.007*”process”	+	0.007*”AI”	+	0.007*”petal”	+	0.006*”form”	+	0.006*”select”	+	0.006*”new”
	0.046*”AI”	+	0.032*”intelligence”	+	0.026*”agent”	+	0.015*”human”	+	0.014*”behavior”	+	0.010*”environment”	+	0.010*”software”	+	0.010*”information”	+	0.009*”person”	+	0.009*”field”	+	0.298	√		√
	0.008*”change”	+	0.008*”occupation”	+	0.007*”learning”	+	0.007*”product”	+	0.006*”necessity”	+	0.006*”fast”	+	0.006*”according”	+	0.006*”user”	+	0.006*”for”	+	0.006*”role”	+
	0.006*”execute”	+	0.005*”data”	+	0.005*”machine”	+	0.005*”recognition”	+	0.005*”autonomous”	+	0.005*”automation”	+	0.005*”research”	+	0.005*”search”	+	0.005*”ability”	+	0.005*”kinds”
	0.032*”AI”	+	0.019*”data”	+	0.018*”learning”	+	0.012*”search”	+	0.012*”human”	+	0.010*”application”	+	0.009*”recognition”	+	0.008*”method”	+	0.008*”machine”	+	0.008*”use”	+	0.262		√		√
	0.008*”problem”	+	0.007*”information”	+	0.007*”classification”	+	0.007*”representation”	+	0.007*”result”	+	0.007*”society”	+	0.007*”knowledge”	+	0.007*”word”	+	0.007*”for”	+	0.006*”image”	+
	0.006*”state”	+	0.006*”sensor”	+	0.006*”process”	+	0.006*”ethics”	+	0.006*”occur”	+	0.006*”occur”	+	0.005*”understand”	+	0.005*”through”	+	0.005*”technology”	+	0.005*”not”
H [52]	0.053*”AI”	+	0.031*”data”	+	0.026*”society”	+	0.025*”ethics”	+	0.023*”problem”	+	0.018*”bias”	+	0.015*”for”	+	0.012*”use”	+	0.012*”fair”	+	0.011*”dilemma”	+	0.372				√
	0.010*”learning”	+	0.010*”technology”	+	0.010*”responsibility”	+	0.009*”solving”	+	0.009*”member”	+	0.008*”person”	+	0.008*”regarding”	+	0.008*”because”	+	0.007*”impact”	+	0.007*”prejudice”	+
	0.006*”kinds”	+	0.006*”result”	+	0.006*”situation”	+	0.006*”not”	+	0.006*”occur”	+	0.006*”necessity”	+	0.006*”service”	+	0.006*”effort”	+	0.006*”difficult”	+	0.006*”trust”
	0.039*”data”	+	0.030*”learning”	+	0.017*”model”	+	0.016*”attribute”	+	0.015*”AI”	+	0.014*”problem”	+	0.013*”classification”	+	0.013*”machine”	+	0.011*”method”	+	0.010*”intelligence”	+	0.322	√	√	√
	0.009*”we”	+	0.009*”agent”	+	0.009*”search”	+	0.009*”application”	+	0.008*”use”	+	0.008*”solving”	+	0.008*”person”	+	0.008*”create”	+	0.008*”image”	+	0.007*”computer”	+
	0.007*”recognition”	+	0.006*”human”	+	0.006*”representation”	+	0.006*”new”	+	0.006*”forecast”	+	0.006*”knowledge”	+	0.006*”structured”	+	0.006*”how”	+	0.006*”class”	+	0.006*”input”

√: included.
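
The topic strings in Table 10 follow the `weight*"term" + ...` layout that LDA libraries such as gensim print for a topic. The Total column appears to be the sum of the 30 listed term weights: for the first topic of textbook A, the weights add up to 0.415, which matches the reported value. The sketch below parses such a string and recomputes the total. The helper names `parse_topic` and `topic_total` are hypothetical, and the string is transcribed from the first topic of textbook A using straight quotes.

```python
def parse_topic(topic: str) -> list[tuple[str, float]]:
    """Split a 'weight*"term" + weight*"term"' string into (term, weight) pairs."""
    pairs = []
    for piece in topic.split("+"):
        piece = piece.strip()
        if not piece:
            continue
        weight, term = piece.split("*", 1)
        # Strip spaces plus straight or curly quotes around the term.
        pairs.append((term.strip(' "“”'), float(weight)))
    return pairs


def topic_total(topic: str) -> float:
    """Sum the weights of all terms listed for one topic."""
    return sum(weight for _, weight in parse_topic(topic))


# First topic of textbook A, transcribed from Table 10.
TOPIC_A1 = (
    '0.093*"data" + 0.031*"model" + 0.030*"classification" + 0.029*"learning" + '
    '0.019*"iris" + 0.019*"attribute" + 0.015*"machine" + 0.013*"structured" + '
    '0.011*"problem" + 0.009*"kinds" + 0.009*"kind" + 0.009*"application" + '
    '0.009*"characteristics" + 0.008*"information" + 0.008*"regarding" + '
    '0.008*"analysis" + 0.008*"necessity" + 0.008*"assessment" + 0.007*"use" + '
    '0.007*"utilize" + 0.007*"length" + 0.007*"AI" + 0.007*"for" + 0.007*"form" + '
    '0.007*"save" + 0.006*"image" + 0.006*"discern" + 0.006*"understand" + '
    '0.006*"collection" + 0.006*"training"'
)

pairs = parse_topic(TOPIC_A1)
print(len(pairs))                          # number of terms listed for the topic
print(f"{topic_total(TOPIC_A1):.3f}")      # compare with the Total column
```

Because the total covers only the 30 highest-weighted terms, it can be read as the share of a topic's probability mass that those terms capture. A higher total therefore suggests a topic concentrated on fewer words.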

Table 11. Tool analysis result by textbook and area.

Frame			Textbook
Area	Tool		A	B	C	D	E	F	G	H
1. Understanding of AI	Principles understanding support	Machine Learning for Kids			√
1. Understanding of AI	Principles understanding support	Scratch			√
2. Principles and application of AI	Data processing	Quick, Draw!	√	√	√
		Mystery Animal			√
		Scroobly	√
		CLOVA OCR	√
		Teachable Machine		√				√	√
	AI model development	Machine Learning for Kids			√
		Scratch			√
		ENTRY				√	√	√	√	√
		Colab						√	√
		code.org		√				√
		Prolog						√	√
3. Data and machine learning	AI model and program development	Teachable Machine	√
		ENTRY		√			√			√
		Orange3		√					√	√
		Brightics AI			√
		Quick, Draw!					√
		Machine Learning for Kids			√
		Scratch			√
		Python (Colab)	√			√	√
4. Social impact of AI	AI ethics learning support	Moral Machine		√	√	√	√	√	√	√
4. Social impact of AI	AI ethics learning support	ENTRY					√

√: included.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
