Learning Analytics for Diagnosing Cognitive Load in E-Learning Using Bayesian Network Analysis

Abstract: A learner’s cognitive load is highly associated with their academic achievement within learning systems. Diagnostic information about a learner’s cognitive load is useful for achieving optimal learning, by enabling the learner to manage and control their cognitive load in the e-learning environment. However, little empirical research has been conducted to obtain diagnostic information about the cognitive load in e-learning systems. The purpose of this study was to analyze a personalized diagnostic evaluation for a learner’s cognitive load in an e-learning system, using the Bayesian Network (BN) as a learning analytic method. Data from 700 learners were collected from Cyber University. A learner’s cognitive load level was measured in terms of three components: extraneous cognitive load, intrinsic cognitive load, and germane cognitive load. The BN was built by representing the relationship among the extraneous cognitive load, intrinsic cognitive load, germane cognitive load, and academic achievement. The conditional and marginal probabilities in the BN were estimated. This study found that the BN provided diagnostic information about a learner’s level of cognitive load in the e-learning system. In addition, the BN predicted the learner’s academic achievement in terms of their different cognitive load patterns. This study’s results imply that diagnostic information related to cognitive load helps learners to improve academic achievement by managing and controlling their cognitive loads in the e-learning environment. In addition, instructional designers are able to offer more appropriately customized instructional methods by considering learners’ cognitive loads in online learning.


Introduction
In the post-Corona era, the current educational system faces major changes in moving from face-to-face classroom learning to online learning systems [1]. E-learning systems play a pivotal role in improving the quality of learning in remote education [2,3]. Moreover, previous research reports that the academic performances in distance learning with learning technologies are similar to those in face-to-face classroom learning [4]. Since the traditional educational environment has largely shifted to an e-learning environment, the educational system has started to pay attention to identifying a learner's characteristics related to e-learning instruction, which could enhance learner-centered online learning [5]. For example, identification of a learner's characteristics related to instructional formats and techniques in an online learning system (e.g., adaptation to a new learning technology, engagement in e-learning, the degree of difficulty of the topics in e-learning) can be useful information for the development and operation of effective e-learning systems [6]. Most of all, instructional design considering human cognitive processing has been emphasized in multimedia learning systems in order to effectively activate learners' cognitive processes [7]. Kalyuga (2011) emphasized the need for an instructional design that takes into account the cognitive load of human cognitive processing in an online learning environment [8].
Cognitive load theory states that optimal learning can be achieved by reducing the loads that prevent learning and enhancing the mental efforts that systematically organize the new information [9]. That is, optimal learning occurs when learners effectively manage and control their cognitive load [10]. Educational instruction focuses on offering a learning environment in which learners can manage their cognitive load without their working memories becoming overloaded. Properly measuring a learner's cognitive load can provide useful diagnostic information for developing effective learning systems and offering customized instructional techniques that allow learners to effectively control their cognitive load.
Previous studies have reported that cognitive load helps explain differences in learners' learning patterns, and how e-learning instructional design should reflect human cognitive systems [11,12]. Specifically, cognitive load theory states that there are three components to cognitive load: extraneous cognitive load, intrinsic cognitive load, and germane cognitive load [13]. The intrinsic, extraneous, and germane cognitive loads affect the level of learning transfer and academic achievement differently. In addition, they influence learners' motivation, engagement, and adaptation to the learning system. According to the theory, optimal learning occurs by minimizing unnecessary extraneous cognitive loads that interfere with the transfer of learning, and maximizing the mental and cognitive efforts that organize knowledge and skills [8,14]. Therefore, diagnostic information about the strengths and weaknesses of a learner's cognitive load related to academic achievement provides an informative evaluation of the effectiveness of instruction. From this point of view, educational research must consider how the cognitive load can be efficiently measured and controlled [12,15,16].
This study investigated a learner's cognitive load in an e-learning system using learning analytics obtained by a Bayesian Network (BN) [17,18]. We estimated personal diagnostic information regarding the cognitive load from the BN. In addition, we identified the patterns of the relationships among the extraneous cognitive load, intrinsic cognitive load, germane cognitive load, and academic achievement in an e-learning system.

Cognitive Load Theory
Optimal learning can be achieved by managing and controlling the cognitive load in a learning system [19][20][21]. In multimedia learning environments, information about a learner's cognitive load can be critical for providing effective learning instruction [22]. Sweller (1988) proposed that the cognitive load has three components: intrinsic cognitive load, extraneous cognitive load, and germane cognitive load [23]. Intrinsic cognitive load reflects the degree of complexity and difficulty of the content, topics, and objectives that learners must manage during a course. That is, if a learner thinks that the topics and content in a course are difficult and complex, the intrinsic cognitive load is high. For example, a previous study used the question "How easy or difficult do you consider this theory at this moment?" to measure the intrinsic cognitive load [24]. Since the perceived difficulty and complexity of the domain, content, and topics that a learner must manage during a course are highly associated with the intrinsic cognitive load, the intrinsic cognitive load can depend on learners' previous expertise and experiences [25].
The extraneous cognitive load is the unnecessary cognitive load that prevents learners from forming a mental model of the knowledge, skills, and abilities of a particular domain.
In an e-learning system, the extraneous cognitive load mostly results from poorly designed teaching processes, such as how information is presented in the learning system and the functions that facilitate learning activities (e.g., discussion and Q&A section). Therefore, the instructional design in multimedia learning focuses on how to reduce extraneous cognitive loads in the learning system [26,27]. For example, a previous study used the question "How difficult was it to learn with the materials?" to measure the extraneous cognitive load [14,24,28]. However, it is not easy in an e-learning system to provide an optimal instructional design that fits all learners, due to the variable characteristics of learners [29]. Therefore, diagnostic information regarding a learner's extraneous cognitive load is useful in order to design a customized optimal e-learning environment. A systematic e-learning environment can be designed by minimizing extraneous cognitive loads and providing learning activities that consider a learner's intrinsic cognitive load, which activates learning transfer as much as possible [30,31].
Lastly, the germane cognitive load reflects when a learner makes a cognitive and mental effort to organize new knowledge [32]. Therefore, the germane cognitive load is positively associated with learning outcomes. The mental effort of learning is associated with learners' behaviors relating to motivation and engagement on a course, and adaptation to new learning environments [33]. Consequently, the germane cognitive load can be measured by answering a question about how much a learner is emotionally engaged in and contributes to a course [34].
Considering the three components of cognitive load, effective e-learning should provide a customized instructional design minimizing the extraneous cognitive load, offering learner-fit content given the learner's intrinsic cognitive load, and maximizing the germane cognitive load [35,36]. Therefore, learning analytics regarding a learner's cognitive load in e-learning can provide useful diagnostic information to help instructors and learners increase learning. The purpose of this study was to propose a personalized diagnostic evaluation of a learner's cognitive load in an e-learning system, using the BN.

Learning Analysis for Diagnostic Information Using Bayesian Network Analysis
BN analysis is a probability-based statistical modeling framework for reasoning and making decisions under uncertain and inconsistent patterns [37]. The BN combines probability theory and graph theory to represent the probabilistic relationships among variables under uncertainty [38]. The BN uses a graphical model to represent these relationships efficiently. The graphical representation is a finite directed acyclic graph (DAG), which consists of nodes and edges. The nodes are unobservable or observable variables, and the edges are the relationships among the variables. A graph is a pair G = (A, E), where A is a set of nodes (variables) and E is a set of edges, each of which is a line between two nodes.
In the directed graph G = (A, E), the variables stand in independent/dependent relationships, which are expressed through two concepts: parent variables and child variables. Variables with an arrow pointing from themselves to a variable A are called the parents of A and are denoted pa(A | G) or simply pa(A). The variable A, which has an edge pointing toward it, is a child of these variables; hence, the child variable is dependent on its parent variables. The relationship between the parent variables and child variables is expressed through conditional and marginal probabilities.
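The formal notation of the conditional probability distribution associated with each variable A_i, given all of its parent variables, is as follows:

$$P(A_i \mid \mathrm{pa}(A_i))$$

The formal notation of the joint distribution associated with a BN over the variables A_1, ..., A_n is the product of these conditional distributions:

$$P(A_1, \ldots, A_n) = \prod_{i=1}^{n} P(A_i \mid \mathrm{pa}(A_i))$$

Lastly, if a variable has no parents (i.e., pa(A_i) is empty), then its conditional probability is regarded as a marginal probability.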
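As a minimal illustration of how a BN's joint distribution factorizes over parent-child relationships, the following Python sketch multiplies each node's probability given its parents. The two-node network, the node names, and all probability values are hypothetical and are not estimates from this study.

```python
# Hypothetical two-node network: Load -> Item (values are illustrative only).
parents = {"Load": [], "Item": ["Load"]}

# Conditional probability tables keyed by the tuple of parent states.
# A node without parents has a single row keyed by the empty tuple,
# which is simply its marginal distribution.
cpt = {
    "Load": {(): {"low": 0.6, "high": 0.4}},
    "Item": {
        ("low",): {"yes": 0.8, "no": 0.2},
        ("high",): {"yes": 0.3, "no": 0.7},
    },
}


def joint_probability(assignment):
    """Multiply P(node | parents(node)) over all nodes in the network."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_states = tuple(assignment[p] for p in node_parents)
        prob *= cpt[node][parent_states][assignment[node]]
    return prob


# Joint probability of one full assignment: P(Load=low) * P(Item=yes | Load=low).
p = joint_probability({"Load": "low", "Item": "yes"})
```

Summing `joint_probability` over all four assignments returns 1, which is a quick check that the tables define a proper joint distribution.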
This study used a BN to infer a learner's cognitive load in an e-learning course. The graphical representation shows the relationship among the intrinsic cognitive load, extraneous cognitive load, germane cognitive load, and academic achievement. Then, the marginal and conditional probabilities were estimated for the learner's cognitive load pattern, regarding their academic achievement.

Research Objectives
The main purpose of this study is to estimate a learner's cognitive load in an e-learning system using the BN. We first built the BN as the representation of extraneous cognitive load, intrinsic cognitive load, germane cognitive load, and academic achievement. We then estimated the parameters of the conditional and marginal probabilities of the variables in the network. Lastly, we predicted diagnostic information about the learner's cognitive load in terms of academic achievement. Therefore, the research questions of this study are as follows: (1) Does the BN estimate a learner's levels of extraneous, intrinsic, and germane cognitive load? (2) Does the BN predict a learner's academic achievement based on the patterns of the three cognitive load components?
The findings of this study provide useful information on how to design e-learning instructions, considering an individual learner's cognitive load pattern.

Participants
The study was conducted using data collected from 700 students at Cyber University. A total of 754 students were attending the e-learning class, but 54 students did not take the mid-term or final exam. Hence, we did not include these 54 students in this study. The e-learning class sampled was an Introduction to Statistics in Social Science class. The course consisted of 14 classes on basic statistics. We also collected the final academic achievements based on the scores of the mid-term and final exams during the course. The final academic achievement was computed by converting the sum of the mid-term and final exam scores to a standardized score. The standardized scores were divided into A-D grades (i.e., A is above 90, B is between 80 and 90, C is between 60 and 80, and D is below 60). Table 1 shows the descriptive statistics of the subjects.
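The grade cut-offs above can be sketched as a small Python function. The function name is hypothetical, and the handling of the boundary values is an assumption: the text does not say whether a score of exactly 90, 80, or 60 belongs to the higher or the lower grade, so this sketch assigns it to the higher one.

```python
def letter_grade(standardized_score):
    """Map a standardized score to a letter grade using the cut-offs in the text.

    Assumption: each boundary value (90, 80, 60) is assigned to the higher
    grade, because the text does not state which side is inclusive.
    """
    if standardized_score >= 90:
        return "A"
    if standardized_score >= 80:
        return "B"
    if standardized_score >= 60:
        return "C"
    return "D"


grades = [letter_grade(s) for s in (95, 85, 70, 40)]
```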

Data Analysis
Descriptive statistics of all the measures were computed, including the mean, standard deviation, skewness, and kurtosis. We analyzed the data to estimate the parameters of the conditional and marginal probabilities in the representation of the BN. This study used the Netica software provided by Norsys Software Corporation [39], which can be downloaded from http://www.norsys.com (accessed on 10 July 2021). The probabilities of the network can be estimated using the "Learning EM" function in Netica. The expectation-maximization (EM) algorithm, gradient ascent, and Markov chain Monte Carlo (MCMC) estimation are commonly used in BN software programs. This study used the EM algorithm to estimate the parameters of the BN [40].
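To give a sense of what EM estimation does when a node is unobserved, the sketch below fits a deliberately simplified model in plain Python: one hidden node with a few levels and binary item nodes as its children. It is not Netica's implementation and does not reproduce the full network of this study; the function name, the random starting values, and the fixed number of iterations are all assumptions made for illustration.

```python
import math
import random


def em_latent_class(data, n_levels=3, n_iter=50, seed=0):
    """Fit one hidden node with binary item children by EM.

    data: list of tuples of 0/1 responses.
    Returns (prior, item_prob, history), where prior[k] = P(level k),
    item_prob[k][j] = P(item j = 1 | level k), and history holds the
    log-likelihood evaluated at the start of each iteration.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    prior = [1.0 / n_levels] * n_levels
    # Assumption of this sketch: random starting values for the item tables,
    # so that the levels are distinguishable from the first iteration.
    item_prob = [[rng.uniform(0.2, 0.8) for _ in range(n_items)]
                 for _ in range(n_levels)]
    history = []
    for _ in range(n_iter):
        # E-step: posterior probability of each level for every respondent.
        counts = [0.0] * n_levels
        yes_counts = [[0.0] * n_items for _ in range(n_levels)]
        loglik = 0.0
        for row in data:
            joint = []
            for k in range(n_levels):
                p = prior[k]
                for j, x in enumerate(row):
                    p *= item_prob[k][j] if x == 1 else 1.0 - item_prob[k][j]
                joint.append(p)
            total = sum(joint)
            loglik += math.log(total)
            for k in range(n_levels):
                resp = joint[k] / total
                counts[k] += resp
                for j, x in enumerate(row):
                    yes_counts[k][j] += resp * x
        history.append(loglik)
        # M-step: re-estimate the probabilities from the expected counts.
        prior = [c / len(data) for c in counts]
        item_prob = [[yes_counts[k][j] / counts[k] for j in range(n_items)]
                     for k in range(n_levels)]
    return prior, item_prob, history


# Made-up responses to three binary items, used only to exercise the function.
data = [(1, 1, 1)] * 30 + [(0, 0, 0)] * 30 + [(1, 0, 1)] * 10 + [(0, 1, 0)] * 10
prior, item_prob, history = em_latent_class(data, n_levels=2, n_iter=30)
```

A useful property to check in any EM run is that the log-likelihood stored in `history` never decreases from one iteration to the next.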

Measures
Previous studies have proposed several cognitive load scales to measure extraneous cognitive load, intrinsic cognitive load, and germane cognitive load [10,14]. Proper measures of the cognitive load enable learners and educators to efficiently manage and control cognitive loads [12][13][14]. In this study, we used the following items for measuring the three components of cognitive load, based on previous studies [10,14].
For the intrinsic cognitive load, we used three items: (1) the topics covered in this course were difficult based on my previous knowledge, skills, and educational experiences; (2) the concepts and definitions covered in this course were complex based on my previous knowledge, skills, and educational experiences; (3) the class quizzes and class activities with other learners were difficult based on my previous knowledge, skills, and educational experiences. Since the previous research stated that the intrinsic cognitive load is related to the difficulty and complexity of learning activities during a course, we asked about the degree of difficulty and complexity of the topics, concepts, and exams in the course.
For the extraneous cognitive load, we asked three questions about whether the instructional design and methods were appropriate: (1) the format of the lecture screen is designed to be easy to learn; (2) the functions for learning activities in this e-learning course (e.g., buttons and menus for Q&A session, discussion session with other learners, learning activities with other learners, quizzes, exams, etc.) are easy to access; (3) the instruction is designed to support adaptation to the learning environment and improve the sense of belonging to the course. If a learner thinks that the instructional materials are appropriately designed for learning, the learner's extraneous cognitive load should be low.
Finally, we used three items to measure the germane cognitive load, as follows: (1) Did you concentrate and become engaged during the lectures? (2) Did you put in mental and emotional effort for this class? (3) Did this course enhance your motivation to gain new knowledge, understanding, and application of skills in the domain? A learner's engagement, concentration, and motivation on a course are important mental efforts for achieving optimal learning. The germane cognitive load can reflect various cognitive and mental efforts required to master knowledge, skills, and abilities. All items were scored dichotomously: students answered each item with either Yes or No.

Descriptive Statistics
The descriptive statistics of the intrinsic cognitive load, extraneous cognitive load, germane cognitive load, and academic achievement were computed, including the mean, standard deviation, skewness, and kurtosis (Table 2). We also conducted a correlational analysis among the four variables (i.e., intrinsic cognitive load, extraneous cognitive load, germane cognitive load, and academic achievement). We found that extraneous cognitive load and intrinsic cognitive load were negatively associated with academic achievement, while germane cognitive load was not statistically significantly related to academic achievement in this study (Table 3).
Table 3. Correlation analysis of the variables in this study.
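The descriptive measures and the correlation coefficient named above can be computed as in the following Python sketch. Two points are assumptions: the skewness and kurtosis use simple moment-based definitions, whereas statistical packages often apply small-sample corrections, and the correlation is taken to be Pearson's r, since the text does not name the coefficient. The function names and the example values are hypothetical.

```python
import math


def describe(values):
    """Return mean, sample SD, skewness, and excess kurtosis of a list.

    Assumption: skewness = m3 / m2**1.5 and kurtosis = m4 / m2**2 - 3,
    where m2, m3, m4 are central moments divided by n.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up values, used only to exercise the functions.
stats = describe([1, 2, 3, 4, 5])
r = pearson_r([1, 2, 3], [3, 2, 1])
```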

Bayesian Network Analysis for Diagnostic Information about the Cognitive Load
The BN has been proposed for modeling student performance in educational settings, to estimate and diagnose proficiency [41][42][43].
In this study, we used the BN to estimate diagnostic information about a learner's cognitive load pattern, regarding their academic achievement. First, we built a BN representing the relationships among the extraneous cognitive load, intrinsic cognitive load, germane cognitive load, and academic achievement. The BN was constructed by using the plausible hypothesized conditional probability for each item related to the three cognitive load components, and the marginal probability of academic achievement. Figure 1 displays the initial BN representation in the Netica software. The BN consisted of three cognitive load nodes, each of which was measured by three items. The three levels of each cognitive load were estimated based on the responses to the nine items, and a higher level indicated a higher cognitive load. Academic achievement was also represented in the BN. The nodes in the initial BN took the following values. First, the nine item nodes had two values (Yes and No). Second, the proficiency node regarding academic achievement had four grades (i.e., A, B, C, and D). Third, the three cognitive load nodes had three values (i.e., Level 1, Level 2, and Level 3). All variables were observed except the cognitive load variables; therefore, the cognitive load variables were estimated based on the responses to the survey items (Figure 2).
At the first step, for the proficiency variable, equal probabilities were considered to take the values for four grades when prior information regarding learner proficiencies on the course had not yet been obtained [18]. In addition, the cognitive load nodes also took equal probabilities of each level when no prior information had been obtained regarding learners' cognitive load. For the item nodes, hypothesized conditional probabilities reflecting task characteristics associated with the states of the cognitive load nodes could be considered. In this study, equal probabilities were used for the item nodes, which means that there was no prior information regarding the relationship between the cognitive load components and the items for the learners. Therefore, all probabilities in this BN were estimated by the data collected from this study, and no prior information was used.
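The initial network with equal probabilities can be sketched as a plain Python data structure. The node states follow the text, but the edge directions are an assumption: the text does not state them explicitly, so this sketch treats academic achievement as a root node, each cognitive load node as its child, and each item node as a child of its load node. The item node names are hypothetical.

```python
from itertools import product

# States as described in the text.
states = {
    "Achievement": ["A", "B", "C", "D"],
    "Intrinsic": ["Level 1", "Level 2", "Level 3"],
    "Extraneous": ["Level 1", "Level 2", "Level 3"],
    "Germane": ["Level 1", "Level 2", "Level 3"],
}

# Assumed edge directions (not stated explicitly in the text):
# Achievement -> each cognitive load node -> its three item nodes.
parents = {
    "Achievement": [],
    "Intrinsic": ["Achievement"],
    "Extraneous": ["Achievement"],
    "Germane": ["Achievement"],
}
for load in ("Intrinsic", "Extraneous", "Germane"):
    for i in (1, 2, 3):
        item = f"{load}_item{i}"  # hypothetical node name
        states[item] = ["Yes", "No"]
        parents[item] = [load]


def uniform_cpt(node):
    """Build one uniform row per combination of parent states."""
    k = len(states[node])
    parent_state_lists = [states[p] for p in parents[node]]
    return {combo: {s: 1.0 / k for s in states[node]}
            for combo in product(*parent_state_lists)}


# Every table starts with equal probabilities, i.e., no prior information.
cpts = {node: uniform_cpt(node) for node in states}
```

Representing a root node's marginal distribution as a table with a single empty-tuple row keeps the marginal and conditional cases in one format.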
The estimated marginal probabilities of four statuses in academic achievement are listed in Table 4. In addition, the estimated marginal probabilities of the three levels of the cognitive load components (i.e., intrinsic cognitive load, extraneous cognitive load, and germane cognitive load) are shown in Table 5 (see also Table 6).
To understand what a learner's cognitive load pattern is, Figure 3 displays the estimated three cognitive load nodes with nine items. Once a learner's responses to the nine items have been obtained, this information is propagated through the network via Bayes' theorem to yield the posterior probability distribution of the learner's levels for the three cognitive load components (i.e., extraneous cognitive load, intrinsic cognitive load, and germane cognitive load). The BN representation in Figure 3 shows the three cognitive load components when a learner has a particular response pattern to the items. Suppose that a learner's responses to all items are (Yes, Yes, Yes, No, Yes, No, Yes, Yes, Yes). The learner's levels of extraneous cognitive load, intrinsic cognitive load, and germane cognitive load would be Level 1, Level 1, and Level 3, with probabilities of 75.6%, 53.2%, and 53.9%, respectively.
Next, Figure 4 shows the predicted levels of academic achievement when a learner has a particular cognitive load pattern. Considering a situation in which a learner has the cognitive load pattern (Level 3 in extraneous cognitive load, Level 3 in intrinsic cognitive load, and Level 1 in germane cognitive load), the academic achievement of the learner would be a grade D, with a probability of 64.3%, on this course.
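The propagation step for a single cognitive load node can be sketched with Bayes' theorem in a few lines of Python. This is a simplification under stated assumptions: it handles one load node and its three items in isolation, treats the items as conditionally independent given the level, and uses hypothetical parameter values rather than the estimates reported in the tables and figures.

```python
def posterior_level(prior, p_yes, responses):
    """Posterior over load levels given Yes/No item responses.

    prior: dict level -> P(level)
    p_yes: dict level -> list of P(item j = "Yes" | level)
    responses: list of "Yes"/"No" strings, one per item
    Items are treated as conditionally independent given the level.
    """
    unnormalized = {}
    for level, p_level in prior.items():
        likelihood = 1.0
        for p, answer in zip(p_yes[level], responses):
            likelihood *= p if answer == "Yes" else 1.0 - p
        unnormalized[level] = p_level * likelihood
    total = sum(unnormalized.values())
    return {level: value / total for level, value in unnormalized.items()}


# Hypothetical parameter values, chosen only for illustration.
prior = {"Level 1": 1 / 3, "Level 2": 1 / 3, "Level 3": 1 / 3}
p_yes = {
    "Level 1": [0.9, 0.8, 0.9],
    "Level 2": [0.5, 0.5, 0.5],
    "Level 3": [0.2, 0.1, 0.2],
}
post = posterior_level(prior, p_yes, ["Yes", "Yes", "Yes"])
most_likely = max(post, key=post.get)
```

The reported level for a learner would then be the state with the highest posterior probability, here stored in `most_likely`.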

Discussion
The aim of this study was to introduce learning analytics using a BN for estimating a learner's cognitive loads in an e-learning system. In addition, we predicted a learner's academic achievement based on the learner's cognitive load pattern. The findings from this study suggest that the BN can capture evidence to identify an individual learner's cognitive load pattern, as well as predict the learner's academic achievement based on the cognitive load pattern. More specifically, the BN estimates the levels of a learner's extraneous, intrinsic, and germane cognitive load based on the learner's responses to the questions about cognitive load. For example, when a learner has a particular response pattern to the questions, such as (Yes, Yes, Yes, No, Yes, No, Yes, Yes, Yes), the BN reports that the learner's levels of extraneous cognitive load, intrinsic cognitive load, and germane cognitive load would be Level 1, Level 1, and Level 3, with probabilities of 75.6%, 53.2%, and 53.9%, respectively (see Figure 3). Therefore, an instructor is able to obtain individual diagnostic information about a learner's cognitive load pattern. In addition, the BN predicts the academic achievement associated with a particular cognitive load pattern, which helps an instructor provide the next remedial step regarding cognitive load for improving academic achievement. The BN is a useful statistical modeling method that can provide diagnostic information about a particular learner, such as which cognitive load components affect the learner during a course. This information is beneficial to learners, instructors, and instructional developers, to enhance student learning in an e-learning system. In addition, the correlation analysis found that extraneous and intrinsic cognitive loads were negatively associated with academic achievement.
These results suggest that it is important to design an e-learning environment to minimize the learner's extraneous cognitive load and provide learning activities considering the learner's intrinsic cognitive load. In particular, previous research suggested that the intrinsic cognitive load affects academic achievement differently depending on students' prior knowledge and intellectual level. For example, learners with a high intellectual level prefer more difficult tasks and are more motivated in class when they feel challenged. This is called the expertise reversal effect [44]. The expertise reversal effect explains the interaction effect between the intrinsic cognitive load and levels of expertise in a particular content area. Therefore, instructional designers and instructors should consider that the relationship patterns between the intrinsic cognitive load and academic achievement may vary depending on the learner's characteristics. In addition, the germane cognitive load was not significantly associated with academic achievement in this study, which may be because the germane cognitive load is related to learning transfer. The diagnostic information of the cognitive load in an e-learning system could help learners, instructional designers, and instructors to identify the reasons why a learner is struggling during an e-learning course. From this information, instructional designers and instructors could provide effective instructional methods and a customized learning environment. In other words, diagnostic information about the strengths and weaknesses of a learner's cognitive load during a course could be used to offer an effective customized learning system, reducing the burdens of extraneous cognitive loads and promoting the germane cognitive load.
The current study has several limitations. First, the application study was limited to a particular discipline in the e-learning system. Future studies could be conducted considering different disciplines within various e-learning formats. Second, this study did not evaluate other learner variables that may influence the cognitive load and academic achievement, such as learners' age, educational level, and gender. Because diverse learners attend courses in e-learning systems within higher education, learners' ages and previous educational experiences can be important variables that might affect the level of cognitive load and academic achievement. Future studies could consider these confounding variables, in order to build more accurate BNs for providing diagnostic information regarding a learner's cognitive load. Moreover, the network in this study was built based on a training data set. A future study should collect a new data set to compute the accuracy rate of the BNs and to cross-validate them.
Nevertheless, this study, using BNs, was able to (1) infer a learner's cognitive load pattern based on the learner's academic achievement in the e-learning system and (2) predict the learner's academic achievement by managing the learner's strengths and weaknesses related to their cognitive load during the course.
A large percentage of instructors using e-learning systems have indicated a desire for more individualized diagnostic information regarding learners' cognitive loads. Diagnostic information on cognitive load can help instructors obtain a better understanding of the intrinsic and germane cognitive loads of learners with different levels of expertise. Instructors are also able to identify which elements of cognitive load (e.g., prerequisite knowledge, contributions or engagement on course, instructional techniques) promote academic achievement. Consequently, instructional designers are able to offer customized instructional methods to learners, considering their cognitive load in an online learning environment. Furthermore, this study implies that diagnostic information about the cognitive load helps learners to improve their academic achievement by managing and controlling their cognitive load in the e-learning environment.
Author Contributions: Conceptualization, supervision, methodology, data analysis, writing: Y.C.; conceptualization and editing: J.K. All authors have read and agreed to the published version of the manuscript.