AI Detection of Human Understanding in a Gen-AI Tutor

Earl Woodruff

doi:10.3390/ai5020045

Centre for Smart Learning and Development, Department of Applied Psychology and Human Development, Ontario Institute for Studies in Education, University of Toronto, Toronto, ON M5S 1V6, Canada

† A suite of AI tools was employed in the development of this paper: (1) ChatGPT4 (for writing clarity and reference checking), (2) Grammarly (for grammar and editorial corrections), (3) ResearchRabbit (for reference management), and (4) DALL-E (for the creation of Figures 1–5).

AI 2024, 5(2), 898–921; https://doi.org/10.3390/ai5020045

This article belongs to the Special Issue Development of Artificial Intelligence and Computational Thinking: Future Directions, Opportunities, and Challenges

Abstract

Subjective understanding is a complex process that involves the interplay of feelings and cognition. This paper explores how computers can monitor a user’s sympathetic and parasympathetic nervous system activity in real-time to detect the nature of the understanding the user is experiencing as they engage with study materials. By leveraging advancements in facial expression analysis, transdermal optical imaging, and voice analysis, I demonstrate how one can identify the physiological feelings that indicate a user’s mental state and level of understanding. The mental state model, which views understandings as composed of assembled beliefs, values, emotions, and feelings, provides a framework for understanding the multifaceted nature of the emotion–cognition relationship. As learners progress through the phases of nascent understanding, misunderstanding, confusion, emergent understanding, and deep understanding, they experience a range of cognitive processes, emotions, and physiological responses that can be detected and analyzed by AI-driven assessments. Based on the above approach, I further propose the development of Abel Tutor. This AI-driven system uses real-time monitoring of physiological feelings to provide individualized, adaptive tutoring support designed to guide learners toward deep understanding. By identifying the feelings associated with each phase of understanding, Abel Tutor can offer targeted interventions, such as clarifying explanations, guiding questions, or additional resources, to help students navigate the challenges they encounter and promote engagement. The ability to detect and respond to a student’s emotional state in real-time can revolutionize the learning experience, creating emotionally resonant learning environments that adapt to individual needs and optimize educational outcomes. 
As we continue to explore the potential of AI-driven assessments of subjective understanding, it is crucial to ensure that these technologies are grounded in sound pedagogical principles and ethical considerations, ultimately empowering learners and facilitating the attainment of deep understanding and lifelong learning for advantaged and disadvantaged students.

Keywords:

adaptive AI tutoring; adaptive ITS; affect aware AI; subjective understanding; emotion–cognition relationship; feelings; sympathetic and parasympathetic nervous system; facial expression analysis; transdermal optical imaging; voice analysis; mental states; homeostasis; cognitive dissonance; knowledge building; deep understanding; phases of understanding

1. Introduction

In this paper, I explore how computers can monitor a user’s sympathetic and parasympathetic nervous system activity, both directly and indirectly, to detect the nature of the understanding the user is experiencing in real-time while engaging with study materials. Following Damasio’s [1] popularized description, I will refer to the neurological reactions and physiological changes as feelings, which serve as the primary data for AI detection of understanding. Frijda elaborates on the foundation for this approach by pointing out that feelings draw our attention to significant shifts, whether positive or negative, within our psychological states [2]. According to his Law of Change, states often respond to perceived changes rather than static conditions [2]. Beyond their role as indicators, conscious feelings also function as personal appraisals or evaluations of our experiences, affecting our interpretations of events in relation to our goals or needs [3].

Moreover, feelings can motivate us to adapt our behaviour in response to our perceptions and evaluations [2]. As such, feelings serve not only as reflections of our inner states but also as catalysts for action. Conscious feelings can be expressed through facial movements, blood flow patterns, and vocal changes, providing observable indicators of the user’s feelings and mental states. This paper will focus on detecting physiological feelings to inform AI-driven assessments of subjective understanding. While constructed emotions, which are based on the sensing of internal physiological states such as heart rate variability and blood pressure, help us interpret and communicate our internal experiences, they will not be the main focus of this paper.

Understanding how we make subjective sense of the world and our experiences is a complex process that involves the interplay of emotions, feelings, and cognition [4]. To comprehend this process, Damasio [1] suggests that we need to examine the role of feelings as conscious mental events arising from the interoception of bodily responses and emotions. In Frijda’s [2] and Damasio’s [1] view, emotions are primarily unconscious, automatic bodily reactions that have evolved to promote survival. Feelings, on the other hand, are the conscious representations of these emotional mental states. While emotions and feelings are distinct, they are closely related, and we often discuss our feelings in terms of emotions. Indeed, as Scherer and Moors [5] point out, nonverbal feelings can become conscious during the appraisal process when they are categorized and labelled with emotion words or verbal emotional expressions. During this process, feelings are also assigned valences that impact our reasoning and decision-making [1].

This approach is not entirely new, for Leon Festinger [6] theorized over half a century ago that the consciousness of feelings, when they are negatively valenced, generates dissonance (i.e., feelings such as anxiety, fear, or sadness) and creates a state of discomfort and arousal that motivates the individual to take action to reduce the dissonance. More recently, Harmon-Jones et al. [7] have used this drive as the basis of their action-based model. However, dissonance can be viewed within a larger homeostasis framework for mental states. We can speculate that maintaining homeostasis may be considered as part of a broader system of physiological and mental processes that provide a need to improve our subjective understandings. Figure 1, below, suggests that negatively and positively valenced feelings will return to a balanced state as mental homeostasis is asserted. In the figure below, even positively valenced feelings like an “ah-ha” or “eureka” will fade back to a balanced state over time [3]; according to Frijda’s Law of Change, “Continued pleasures wear off; continued hardships lose their poignancy” (cf., [3], p. 353). It also suggests that a student may let a negatively valenced context regain balance by avoiding re-engaging the concept or entire subject area. Damasio acknowledges this when he states, “… the negative or positive valence of the experience advises me to correct the situation or else accept it and do little or nothing” (cf., [1], p. 96).

Figure 1. Metaphorical representation of mental homeostasis.
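The return to balance depicted in Figure 1 can be illustrated with a toy calculation. The sketch below is only a metaphor made concrete: it assumes, purely for illustration, that a feeling's valence relaxes exponentially toward zero (the balanced state). The function name `valence_over_time`, the exponential form, and the time constant are hypothetical choices, not a model proposed or fitted in this paper.

```python
import math


def valence_over_time(initial_valence, elapsed, time_constant):
    """Return a feeling's valence after `elapsed` time units.

    Assumption (illustrative only): valence relaxes exponentially toward 0,
    the balanced state, at a rate set by `time_constant`.
    """
    if time_constant <= 0:
        raise ValueError("time_constant must be positive")
    return initial_valence * math.exp(-elapsed / time_constant)


# A positive "ah-ha" (+1.0) and a negative bout of confusion (-1.0),
# both with a hypothetical time constant of 5 time units.
for t in (0, 5, 10, 20):
    positive = valence_over_time(1.0, t, 5.0)
    negative = valence_over_time(-1.0, t, 5.0)
    print(f"t={t:>2}  ah-ha={positive:+.3f}  confusion={negative:+.3f}")
```

Under this assumption, both the positively and the negatively valenced feeling fade toward zero, which is the qualitative pattern the figure describes; the sketch says nothing about whether the student resolved the confusion or simply avoided the concept.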

This view of physical and mental stasis goes beyond explaining why we need to eat when we are hungry; it also lets us see that when we are confused, we need to improve our understanding. Thus, the process involves integrating mental states with bodily processes. As Figure 2 below suggests, a successful knowledge builder will attempt to correct the balance and, therefore, engage in cycles of ever-deeper understanding. As Damasio [1] argues, homeostasis and feelings will facilitate a student’s learning, adaptation, and effective decision-making, a perspective central to the investigation of embodied cognition. In the past, Damasio’s speculations would simply have been untestable. Fortunately, new technologies permit us to begin investigating his ideas.

Figure 2. Graph depicts the cyclical feature of ever-deepening understanding.

Advances in detecting and measuring bodily events in real-time as proxies of emotion have made it possible to explore Damasio’s theory in authentic settings. Specifically, technological advances have enabled the real-time examination of sympathetic and parasympathetic nervous system events [8] through various methods, including facial expression analysis, transdermal optical imaging, and voice analysis. Recent research employing sophisticated acoustic analysis techniques has revealed numerous consistent patterns in vocal expression, especially those associated with evaluations of control or power [5]. Regarding facial expression, our research will focus solely on the preprocessed muscle movements, called action units, generated before identifying specific emotions. Scherer and Moors [5] report that several researchers have found it challenging to demonstrate reliable distinctions in patterns across discrete emotions (cf. [9]).

Facial expression analysis was initially designed to help researchers identify universal emotions by analyzing specific facial muscle movements [10]. This method is grounded in the facial action coding system [11], usually called FACS, which categorizes all facial movements. The approach, developed by Ekman, concentrates less on the emotions and more on the expressions [12]. By examining the FACS-generated action units (AUs), researchers can infer activation of the sympathetic nervous system, as various emotions correlated with stress, excitement, or fear involve characteristic facial expressions. Transdermal optical imaging (TOI), developed in Kang Lee’s lab, offers a novel approach by using video to capture changes in facial blood flow patterns, which can indicate shifts in autonomic nervous system activity, providing a window into an individual’s emotional and physiological state [13]. Moreover, Fu and colleagues [13] showed that each emotion, including the neutral state, elicited a distinct blood flow pattern in the facial epidermis, indicating that physiological correlates of discrete emotions can be detected even when the face is expressionless.

Transdermal optical imaging is a non-invasive technique that uses visible and near-infrared light to measure various biological parameters beneath the skin’s surface. When emitted light enters the skin, it interacts with different biological components, such as blood vessels, melanin, and collagen, through absorption, scattering, and reflection. TOI primarily targets hemoglobin, measuring light absorption at specific wavelengths to determine the relative concentrations of oxygenated and deoxygenated hemoglobin, providing information about blood oxygenation and tissue metabolism [14]. Light scattering in the skin is caused by collagen fibres and other cellular structures, with longer wavelengths penetrating deeper into the tissue. Photodetectors capture the returning light and mathematical algorithms process the raw data to reconstruct images or maps of the tissue [15].
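The step from light absorption at specific wavelengths to relative hemoglobin concentrations is commonly formalized in optical tissue measurement with a Beer-Lambert-style linear model. The sketch below shows that general idea with two wavelengths and two unknowns. It is not the TOI pipeline of [13,14,15]. The coefficient values, the path length, and the function names are hypothetical placeholders in arbitrary units, not measured extinction coefficients.

```python
# Hypothetical absorption coefficients (arbitrary units), one pair per wavelength:
# (coefficient for oxygenated Hb, coefficient for deoxygenated Hb).
# Placeholders for illustration only, NOT real extinction coefficients.
COEFFS = [
    (0.3, 1.1),  # wavelength 1: deoxygenated Hb assumed to absorb more
    (1.2, 0.8),  # wavelength 2: oxygenated Hb assumed to absorb more
]


def absorbance_changes(d_hbo2, d_hb, coeffs, path_length):
    """Forward model: absorbance change at each wavelength for given
    concentration changes, assuming a linear Beer-Lambert relationship."""
    return [
        (e_hbo2 * d_hbo2 + e_hb * d_hb) * path_length
        for e_hbo2, e_hb in coeffs
    ]


def solve_hemoglobin(delta_a, coeffs, path_length):
    """Invert the two-wavelength model with Cramer's rule.

    Returns (change in oxygenated Hb, change in deoxygenated Hb).
    """
    (a1, b1), (a2, b2) = coeffs
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("wavelengths do not separate the two hemoglobin species")
    y1, y2 = (da / path_length for da in delta_a)
    d_hbo2 = (y1 * b2 - y2 * b1) / det
    d_hb = (a1 * y2 - a2 * y1) / det
    return d_hbo2, d_hb


# Round trip: simulate a measurement, then recover the concentration changes.
measured = absorbance_changes(0.02, -0.01, COEFFS, path_length=1.5)
print(solve_hemoglobin(measured, COEFFS, path_length=1.5))
```

The design point is that two wavelengths with different relative sensitivity to the two hemoglobin species give two independent equations, so the two concentration changes can be separated; if both wavelengths responded identically, the determinant would vanish and the species could not be distinguished.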

Voice analysis complements optical imaging by assessing variations in vocal properties, such as pitch, tone, and energy, which can change in response to different emotional states and autonomic nervous system activity [16,17,18]. For example, the sympathetic nervous system’s activation can lead to tightened vocal cords and changes in speech patterns, which can be detected through voice analysis software. Emotion detection in voice, also known as speech emotion recognition (SER), is a technology that analyzes various acoustic features of speech to identify the speaker’s emotional state. Eunice Jang’s lab has used software to detect emotions using features such as variations in pitch, pitch range, volume fluctuations, melody, and breathiness [19]. This work used the open software platform Praat, together with speech recognition software and machine learning algorithms, applied to children’s voice recordings. Though Jang’s work currently focuses on assessing fluency and oral language ability, those acoustic features play a crucial role in understanding and interpreting emotions in speech as they capture non-verbal cues and subtleties that reveal a speaker’s emotional state [20].
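To make two of the acoustic features named above concrete, the following minimal sketch computes energy (as root-mean-square amplitude) and pitch (via a plain autocorrelation peak search) for a synthetic tone. It is an illustration of the feature types only, not the software or method used in [19]. The function names, the 75–500 Hz search range, and the test signal are assumptions chosen for the example, and real speech would need framing, voicing detection, and noise handling that are omitted here.

```python
import math


def rms_energy(samples):
    """Root-mean-square amplitude of a frame, a simple proxy for vocal energy."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def estimate_pitch_hz(samples, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate fundamental frequency from the lag with the largest
    (unnormalized) autocorrelation within an assumed pitch range."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic stand-in for a voiced frame: 0.1 s of a 200 Hz sine at 8 kHz.
SAMPLE_RATE = 8000
tone = [
    0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)
]

print("energy (RMS):", round(rms_energy(tone), 4))
print("pitch estimate (Hz):", estimate_pitch_hz(tone, SAMPLE_RATE))
```

Tracking how such per-frame values vary over an utterance (for example, pitch range or volume fluctuation) is what turns them into the kind of features described in the paragraph above.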

Collectively, these three technologies offer a novel toolkit for real-time analysis of the intricate interactions between the sympathetic and parasympathetic nervous systems and the conscious brain. They form the foundation of a new methodology employed in this paper to explore the emotion–cognition relationship from an embodied cognition perspective. This approach allows for the unobtrusive real-time measurement of physiological and affective states, providing deeper insights into the interplay between emotions and cognition. As I will demonstrate, integrating data from facial expressions, optical imaging, and voice analysis presents researchers with a powerful methodology to understand how emotions, feelings, and physiological states manifest in the body. This, in turn, offers valuable insights into human psychology, stress responses, and emotional well-being.

2. Mental States and Phases

The mental state model allows us to view understandings as composed of assembled beliefs, values, emotions, and feelings. Beliefs and values provide the cognitive framework through which we interpret and make sense of our experiences, while emotions and feelings add depth and meaning to these interpretations [12]. Together, these components shape our perceptions, decisions, and actions, ultimately leading to a more comprehensive understanding of ourselves and the world around us. Further, the mental state model is a metaphor that conveys the complex and multifaceted nature of the relationship between feelings and cognition. By recognizing the intertwined nature of these processes [1], the activating or deactivating potential of mental states and their emotions [21], and the role of beliefs, values, and emotions in shaping understanding [4], we can gain a more complete and nuanced picture of the emotion–cognition relationship. This interplay forms the foundation for how individuals perceive, interpret, and interact with the world around them (cf., [22]).

Moreover, mental states act as filters through which sensory information is processed, categorized, and assigned meaning based on current physiological (feeling) states, cognitive processes, and past experiences [23]. Such processing guides attention, interpretation, and the emotional valuation of information, prioritizing specific experiences or facts for attention and memory. They ensure that understanding is an active construction of meaning, where individuals continually generate predictions about the world and test them against incoming sensory information. This iterative process facilitates the refinement of existing knowledge and the formation of new understandings, highlighting the intertwined nature of cognition, feelings, and understanding.

Mental states are not necessarily discrete or linear, and learners may move between them dynamically as they process new information and refine their understanding [24]. The progression through these states can be influenced by various factors, such as the complexity of the subject matter, the learner’s prior knowledge, cognitive abilities, and motivational factors [25]. Moreover, each state’s specific cognitive processes and experiences may vary depending on the domain or task. For example, in problem-solving tasks, confusion may involve a more explicit awareness of the problem space and the need to develop new strategies [26]. In contrast, conceptual learning may involve a deeper questioning of one’s beliefs and assumptions [27].

If we imagine a student working to understand a concept, these mental states can be viewed as phases of the process that track the changes and growth of the student’s understanding. From this perspective, the phases might appear as follows: (1) nascent understanding, (2) misunderstanding, (3) confusion, (4) emergent understanding, and (5) deep understanding. See Figure 3, below.

Figure 3. A symbolic depiction of the network of interconnected phases of understanding.

The nascent understanding phase is a starting point where learners have little knowledge or comprehension of the subject matter. It describes an understanding just coming into existence and displaying signs of promise for future emergence. In this stage, learners may sense misunderstandings or confusions about the subject matter as they encounter new information. Misunderstanding precedes confusion because learners may initially form inaccurate understandings based on limited or incorrect information, giving them a false sense of comprehension. Later, learners enter a state of confusion when they realize that their initial understanding is inconsistent, incomplete, or contradictory to new information (this phase follows misunderstanding because the recognition of conflicting or unclear information challenges learners’ existing beliefs and prompts them to question their understanding). If the student persists, we expect to see emergent understanding appear.

As learners struggle with various instances of confusion and seek to resolve the inconsistencies in their understanding, they construct new knowledge and insights. In this phase, learners actively process information, make connections, and start to form a more coherent understanding of the subject matter. However, their understanding is still developing and may be incomplete or unstable. Finally, we speculate that a deep understanding will appear. With continued learning, reflection, and application of relational knowledge, learners solidify their understanding of the subject matter [28,29]. Learners have a deep, comprehensive, and stable understanding of the topic in this final phase. They can effectively apply their knowledge, connect concepts, and communicate their understanding.

The phases of understanding in Figure 3, when viewed through a cognitive psychology lens, highlight the complex interplay of cognitive processes, knowledge structures, metacognitive awareness, and feelings that learners experience as they progress in their understanding of the subject matter. The phases are not necessarily discrete or linear, and learners may move back and forth between them as they encounter new challenges or deepen their understanding. Furthermore, this framework provides a helpful way to conceptualize the cognitive and affective experiences that learners go through as they progress in their learning journey. By considering these phases and the factors that influence them, educators and researchers can gain insight into how to support and optimize learning outcomes. Each phase is elaborated below:

Nascent Understanding: In this phase, learners have insufficient prior knowledge of the subject matter. They may experience a lack of schema or mental models to organize and interpret new information [30]. Learners may struggle with attention and encoding, as they have difficulty identifying relevant information or distinguishing between essential and non-essential elements [31]. This state is characterized by a lack of comprehension and a limited ability to form meaningful connections or representations.

Misunderstanding: Learners in this phase have formed incomplete or incorrect mental representations of the subject matter. They may have developed misconceptions or faulty models based on limited or misinterpreted information [32]. These learners may exhibit overconfidence bias, where they overestimate their understanding and are unaware of their knowledge gaps [33]. In this state, learners may use flawed reasoning or make incorrect inferences based on inaccurate knowledge structures.

Confusion: In this phase, learners experience cognitive disequilibrium as they encounter information that conflicts with their knowledge structures [34]. They become aware of inconsistencies, contradictions, or gaps in their understanding, leading to confusion or perplexity. Learners may use effortful cognitive processing to reconcile new information with prior knowledge [35]. This state is characterized by heightened metacognitive awareness and a motivation to resolve the cognitive conflict.

Emergent Understanding: Learners in this phase actively engage in knowledge construction and integration. They employ cognitive strategies, such as elaboration, organization, and critical thinking, to form new connections and update their mental models [36]. Learners may experience insight or conceptual change as they restructure their knowledge and develop a more coherent understanding [37]. In this state, learners can meaningfully relate new information to their existing knowledge structures and apply their understanding to solve problems or explain concepts.

Deep Understanding: In this phase, learners have constructed a rich, interconnected network of knowledge related to the subject matter. They have developed well-organized and elaborated schemas for efficient information processing and retrieval [38]. They may exhibit solid metacognitive skills, such as self-monitoring, self-regulation, and reflection, which enable them to assess and refine their understanding [39]. They can use flexible thinking, transfer knowledge to novel situations, and generate new insights or creative solutions [40]. This state is characterized by deep comprehension, automaticity in applying knowledge, and the ability to communicate understanding effectively.

3. Instructional Use

Moving beyond the psychological perspective and looking toward educational applications, the mental phases are compatible with the SOLO (Structure of the Observed Learning Outcome) model developed by John Biggs and Kevin Collis [41], where understanding is viewed as a progression through five hierarchical levels of increasing complexity. These levels describe the quality of a learner’s understanding and knowledge structure concerning a particular subject or task.

In the SOLO model, understanding is not viewed as a binary state (i.e., understanding or not understanding) but rather as a continuum of depth and sophistication. The learner’s journey of understanding starts from the pre-structural level, where they possess only scattered bits of information that lack coherence or meaning. As they progress to the uni-structural level, they begin to grasp a single relevant aspect or concept, albeit in isolation from the broader context. Moving further, learners at the multi-structural level can comprehend several relevant aspects independently but struggle to integrate them into a unified whole. At the relational level, learners genuinely begin to see the interconnectedness of different elements, weaving them together into a coherent tapestry of understanding. Finally, at the pinnacle of the SOLO taxonomy, the extended abstract level, learners transcend the boundaries of the subject matter, generalizing their knowledge to novel situations, generating original insights, and exhibiting a profound metacognitive awareness. The progression of understanding, from fragmented and superficial to integrated and transformative, lies at the heart of the SOLO model.

While the SOLO model is deeply rooted in the cognitive perspective and concentrates on outcomes, it may be mapped onto the emotional components essential to the mental state approach. Table 1 below illustrates how the five phases of understanding, which describe learners’ cognitive and metacognitive processes, align with the SOLO phases, which focus on the structural complexity and integration of knowledge. The brief descriptors provide a concise summary of the key characteristics of each phase or level, highlighting the progression from a lack of understanding to a deep, transferable, and transformative understanding of the subject matter.

Table 1. The five phases of understanding mapped onto the SOLO phases.

The emotions associated with each phase of understanding accompany the cognitive processes, challenges, and realizations that learners experience as they progress through their learning journey (See Table 2). It is important to note that the emotional experiences are not exclusive to each phase and may overlap or vary in intensity depending on the individual learner and the specific learning context. However, we hypothesize that, among the various measures collected for each phase, there will be emotional patterns that are unique to that phase.

Table 2. Hypothesized Emotions Associated with Each Phase of Understanding.

Nascent Understanding: In this phase, learners may experience confusion, frustration, and anxiety. They may feel lost in a sea of unfamiliar concepts and terminology, struggling to make sense of the new information presented to them. This lack of comprehension can lead to boredom and apathy as the learner may perceive the subject matter as irrelevant or disconnected from their prior knowledge and interests. Furthermore, when faced with a large amount of complex information without the necessary cognitive frameworks to process it effectively, learners may experience a sense of overwhelm, feeling ill-equipped to tackle the learning task at hand [42,43,44,45].

Misunderstanding: Learners may experience false confidence and overconfidence during the misunderstanding phase. Having constructed an incomplete or incorrect understanding of the subject matter, they may believe that they have a grasp on the concepts, unaware of the misconceptions they hold. This false sense of understanding can produce mistakes and lead to confusion and frustration when they encounter information that contradicts their misunderstandings, creating cognitive dissonance. When confronted with evidence challenging their flawed mental models, learners may become defensive or feel embarrassed as their self-perceptions of competence are questioned (cf., [46]).

Confusion: In the confusion phase, learners are confronted with the realization that their current understanding is insufficient or inconsistent, sometimes leading to surprise. Perplexity and uncertainty take hold as they struggle with the limitations of their existing knowledge structures. Frustration may mount as learners attempt to reconcile new information with their prior understanding, experiencing the discomfort of cognitive dissonance. However, amidst this confusion, a spark of curiosity can emerge. Learners begin to recognize the gaps in their understanding and develop a desire to resolve the confusion and gain new insights. This motivation to overcome the perplexity drives them to seek additional information, clarification, or support [47,48,49,50,51].

Emergent Understanding: As learners transition into the emergent understanding phase, they experience positive emotions. Curiosity and excitement grow as they begin to form new connections and gain fresh insights, realizing the rewarding nature of the learning process. Enthusiasm builds as they engage more deeply with the subject matter, recognizing its relevance and applicability to their lives. Satisfaction and pride emerge as learners successfully integrate new information into their existing knowledge structures, experiencing a sense of accomplishment. The confusion that once clouded their understanding dissipates, replaced by a feeling of relief and clarity. This newfound understanding ignites a motivation to learn more as learners become eager to expand their knowledge and explore the subject further [52,53,54].

Deep Understanding: In the phase of deep understanding, learners may experience a profound sense of confidence in their knowledge and abilities. They have developed a robust and integrated understanding of the subject matter, enabling them to navigate the domain easily. They may feel satisfaction and fulfillment as they realize the extent of their intellectual growth and the ability to apply their knowledge effectively. Enthusiasm and enjoyment may be high as learners explore the subject’s intricacies, discovering new avenues for application and inquiry. They may be inspired by the transformative power of deep understanding, motivated to share their knowledge with others and make meaningful contributions to their field. Curiosity may lead learners to venture into related domains and further expand their understanding. With heightened metacognitive awareness, they reflect on their own thought processes, strategies, and intellectual journey, fostering a sense of self-efficacy and self-regulated learning (cf., [55,56]).
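The hypothesized emotion profiles described for the five phases can be written down as a small lookup structure, which also makes their overlap explicit. The sketch below encodes only the emotions named in the phase descriptions above. The scoring rule in `rank_phases` (the share of observed emotions that a phase's profile contains) and the idea that emotion labels arrive from some upstream detector are hypothetical simplifications for illustration, not the detection method proposed in this paper.

```python
# Emotions named in the phase descriptions above, keyed by phase.
PHASE_EMOTIONS = {
    "nascent understanding": {
        "confusion", "frustration", "anxiety", "boredom", "apathy", "overwhelm",
    },
    "misunderstanding": {
        "false confidence", "overconfidence", "confusion", "frustration",
        "defensiveness", "embarrassment",
    },
    "confusion": {
        "surprise", "perplexity", "uncertainty", "frustration", "curiosity",
    },
    "emergent understanding": {
        "curiosity", "excitement", "enthusiasm", "satisfaction", "pride", "relief",
    },
    "deep understanding": {
        "confidence", "satisfaction", "fulfillment", "enthusiasm", "enjoyment",
        "inspiration", "curiosity",
    },
}


def phases_for(emotion):
    """List every phase whose hypothesized profile includes `emotion`."""
    return [
        phase for phase, emotions in PHASE_EMOTIONS.items() if emotion in emotions
    ]


def rank_phases(observed):
    """Rank phases by the share of observed emotions found in each profile.

    Hypothetical scoring rule; ties keep the nascent-to-deep phase order.
    """
    observed = set(observed)
    if not observed:
        raise ValueError("at least one observed emotion is required")
    scores = [
        (phase, len(observed & emotions) / len(observed))
        for phase, emotions in PHASE_EMOTIONS.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


print(phases_for("frustration"))
print(rank_phases({"pride", "relief", "excitement"})[0])
```

A single label such as frustration maps to several phases, which is why the text hypothesizes phase-specific patterns across multiple measures rather than one-to-one emotion markers.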

As students work to improve their understanding, identifying phases of understanding in real time may provide us with opportunities to heighten task engagement. We know emotions are pivotal in heightening task engagement and transforming how students approach and interact with their studies [57]. Positively valenced emotions, such as enthusiasm, curiosity, and excitement, increase motivation and engage the students through what Pekrun has called activating emotions [50]. Engagement reaches new heights when individuals connect emotionally with a task, perceiving it as personally relevant and aligned with their values and goals [50]. This emotional investment can turn the task into a meaningful endeavour, sometimes creating intense focus and immersion, sending individuals into a state of flow where the boundaries between self and task dissolve, leading to optimal performance and a profound sense of enjoyment [58].

Furthermore, emotional regulation skills have proven crucial in navigating obstacles and setbacks, which involve activating and deactivating emotions [50]. By cultivating an environment that nurtures positive emotional experiences, we also provide opportunities for personal growth through heightened engagement [59]. Figure 4 below depicts a hypothesized relationship between phases of understanding and task engagement from a cognitive and educational psychology perspective.

Figure 4. Hypothesized task engagement across valenced phases of understanding.

Overall, the emotions and phases align with the idea that cognitive processes and emotional experiences are closely intertwined during learning. As individuals progress through the different stages of understanding, their emotional responses and level of engagement vary accordingly. This relationship is supported by research in educational psychology, which emphasizes the role of emotions in learning, motivation, and academic achievement [21]. Thus, the graph above highlights the importance of guiding learners from nascent understanding or misunderstanding to emergent and deep understanding. By providing appropriate support, feedback, and instructional resources, educators can facilitate the development of accurate mental models and foster positive emotional experiences that enhance task engagement and learning outcomes.

Figure 4 represents how learners’ emotional valence and task engagement levels vary across mental understanding states. We are looking at valence, for as Tamir and colleagues [60] have written, “[Valence] has long been implicated in social and affective processing. As such, it may come as no surprise that valence plays an important role in the organization of mental state representations” (cf., [60], p. 197). Further, the figure illustrates that as the valence becomes increasingly positive, learners progress from nascent to deep understanding and their task engagement and motivation levels rise accordingly.

Schwartz and Wrzesniewski’s [61] elaboration of internal motivation provides a foundation for describing how deepening students’ understanding of a concept can lead to increased task engagement. Internal motivation arises from personal factors such as goals, values, and self-concept, driving individuals to participate in activities that align with their feelings, beliefs, and desires. Internal motivation can move you to undertake positively or negatively valenced activities and personalize what is to be learned—which Reeve and Tseng [62] identify as a crucial component of student agentic engagement. Agentively engaged students actively invest in their learning, employing self-regulated strategies to process information deeply and construct meaning. Thus, they are driven by curiosity, challenge, and the desire to explore the subject matter.

Figure 4 illustrates that as students progress through the phases of understanding, from nascent understanding to emergent understanding and, finally, to deep understanding, we can expect their cognitive engagement to intensify. The more they understand, the more invested they become in the learning process, leading to higher levels of task engagement. This increased task engagement, fueled by internal and intrinsic motivation, can create a positive feedback loop. As students engage more deeply with the task, they experience a greater sense of stasis and competence, further reinforcing their cognitive and task motivation.

Learners who do not understand will likely experience slightly negative emotions and exhibit below-average task engagement. As noted above, a lack of comprehension can lead to discouragement and disengagement from the learning process. Moving higher on the graph, we see that misunderstanding is associated with only a marginal increase in task engagement due to some partial understanding of the material. In this phase, we expect that the emotional valence remains slightly negative, indicating that learners may still feel confused or frustrated despite their initial grasp of the concept.

Confusion is in the central part of the graph. Typically, it is characterized by neutral emotional valence and moderate task engagement. This transitional state represents a point where learners actively grapple with the material, trying to make sense of the information presented. Their emotional response is mixed and their engagement level is moderate as they work through the challenges of understanding.

As learners develop an emergent understanding of a concept, their position on the graph shifts slightly toward a positive emotional valence and above-average task engagement. This movement indicates that accurate comprehension is accompanied by more positive emotions and increased investment in the learning task. Learners who achieve deep understanding will experience the most positive emotions and exhibit the highest levels of task engagement. They will likely feel confident and satisfied and employ better learning strategies. As Muis has pointed out, “Specifically, following Pekrun’s [63] framework, positive activating emotions (curiosity, enjoyment) will positively predict deep processing learning strategies, including elaboration, critical thinking, and metacognitive self-regulation (see [50]). Negative activating emotions (anxiety, frustration) will positively predict shallow processing strategies for rehearsal of learning material, and confusion and surprise will positively predict metacognitive self-regulation to reduce cognitive incongruity [50]. Negative deactivating emotions such as boredom will impair the systematic use of learning strategies [64]” (cf., [44], p. 171).

Overall, the graph highlights the interconnectedness of cognitive and affective factors in the learning process. It emphasizes the importance of guiding learners towards accurate understanding, as it enhances their comprehension and fosters positive emotions and engagement. These factors are crucial for sustained learning and achievement, highlighting the need for educators to support learners through the various stages of understanding and create learning environments that promote cognitive growth and emotional well-being.

To this point, I have addressed understanding as something ascribed by the learner, integrating insights from cognitive psychology, neuroscience, and education. Central to this account is the mental state framework, which can help us identify the phases of understanding as students transition from nascent understanding to misunderstanding, confusion, emergent understanding, and, eventually, to deep understanding. This framework is a lens through which educators might view a student’s growth in understanding, enabling them to recognize the cognitive processes and emotional experiences that characterize each stage. Further, I have highlighted the interplay between cognition and emotion in learning. Their interwoven nature underscores the importance of considering both the intellectual and affective dimensions of understanding. Indeed, I have suggested that the SOLO model could incorporate a mental state framework. By acknowledging the emotional states associated with each phase of understanding, from the frustration and anxiety of nascent understanding to the joy and confidence of deep understanding, teachers can cultivate a more supportive and empathetic classroom culture that honours the full spectrum of the learning experience. Implementing such a program on an individual basis, however, would be very difficult in a typical classroom situation.

To take full advantage of the mental state framework, we must use affect-aware technologies to support children’s learning with an augmented Generative Artificial Intelligence (Gen-AI) tutor. Advances in detecting feelings and emotions raise the possibility of creating adaptive learning systems that can respond to a student’s emotional state in real-time. Imagine a future where educational technologies can sense a learner’s feelings and dynamically adjust the pace of instruction, provide personalized support, or offer additional resources to optimize the learning experience. While such technologies are still developing, seminal research in affect-aware tutoring suggests they have the potential to transform education and ensure that every student, privileged or disadvantaged, receives the support they need to thrive.

4. Toward the Affect-Aware Tutor

Artificial intelligence (AI) tutors require more sophisticated models of the learner to support student learning effectively [65]. These models should be grounded in theories of knowledge acquisition to capture the learning process accurately. While advanced AI techniques such as graph neural networks, adversarial approaches, federated learning [66], Levenberg–Marquardt optimization in neural networks [67], or AI-based power routing optimization in DC–DC converters [68] have their applications in various domains, they are unlikely to contribute significantly to the improvement of student models in AI tutors. If we wish to enhance the effectiveness of AI tutors, our focus should be on developing theoretically grounded student models that accurately represent the complexities of human learning.

The science of learning highlights the importance of social interaction, like dialogues with expert tutors or peers, for deeper learning [69]. Human tutors remain the most effective, but well-designed Intelligent Tutoring Systems (ITSs) can approach their efficacy by leveraging expert tutoring strategies [70,71]. Critically, Lepper and Woolverton [72] identified motivational and affective strategies used by expert tutors, such as building trust, giving praise, and being responsive to student emotions. These findings highlight the need for affect-sensitive intelligent tutoring systems to adopt the above techniques to enhance student motivation and engagement. Further, I suggest that affect-sensitive intelligent tutoring systems need to incorporate student models of real-time subjective understanding.

D’Mello and Graesser’s [65] comprehensive review of the last 50 years of intelligent tutoring systems highlights the importance of student models in enabling ITSs to emulate the adaptability and interactivity of expert human tutors. Additionally, they note five illusions that current intelligent tutoring systems are affected by that involve students’ understanding: (1) the illusion of grounding, which is the mistaken belief of shared knowledge between the speaker and listener; (2) the illusion of feedback accuracy, where the tutor mistakenly believes that students accurately indicate their understanding when probed; (3) the illusion of discourse alignment, which is the unwarranted assumption that the listener understands the speaker’s intentions and meaning; (4) the illusion of student mastery, where the tutor believes the student has mastered more than they have; and (5) the illusion of knowledge transfer, where the tutor believes that the student accurately encodes the information they express. These illusions emerge from inaccurate assumptions about students’ understanding and learning progress. However, if the AI tutor were aware of the student’s current state of understanding, advances in addressing the illusions would be possible. The first step, and the focus of this paper, will be to describe how an AI tutor could detect the student’s subjective understanding.

Current work in my lab by Milan Lazic examines the potential for AI to detect students’ phases of understanding using the webcam available on most computers and laptops [73]. Students sit at a computer while they are trying to solve problems. The camera records their facial expressions and OpenFace Version 2.2.0 identifies the FACS action units (AUs) evidenced from frame to frame. The AUs are examined not to recognize individual emotions but rather to identify the patterns of AUs associated with each of the phases of understanding. Earlier, I suggested that each phase may involve many different feelings and emotions, some of which appear in more than one phase. We speculate, however, that each phase will have uniquely identifiable patterns that machine learning analyses will detect and model.
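As a rough illustration of the frame-by-frame AU extraction described above, the following Python sketch averages AU intensities over a window of frames. It assumes the CSV layout that the OpenFace FeatureExtraction tool typically writes (columns such as `confidence`, `success`, and `AU01_r` for intensity). The sample rows, the `min_confidence` threshold, and the function names are invented for illustration and are not taken from Lazic’s pipeline.

```python
import csv
import io
from statistics import mean

# Invented stand-in for a few rows of OpenFace output (not real data).
SAMPLE_CSV = """frame, timestamp, confidence, success, AU01_r, AU04_r, AU12_r
1, 0.000, 0.98, 1, 0.20, 1.50, 0.00
2, 0.033, 0.97, 1, 0.40, 1.70, 0.10
3, 0.067, 0.30, 0, 0.00, 0.00, 0.00
4, 0.100, 0.96, 1, 0.60, 1.30, 0.20
"""


def load_au_frames(text, min_confidence=0.8):
    """Return one {AU name: intensity} dict per successfully tracked frame."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    frames = []
    for row in reader:
        row = {key.strip(): value.strip() for key, value in row.items()}
        # Skip frames where face tracking failed or was unreliable.
        if row["success"] != "1" or float(row["confidence"]) < min_confidence:
            continue
        frames.append({key: float(value) for key, value in row.items()
                       if key.startswith("AU") and key.endswith("_r")})
    return frames


def window_features(frames):
    """Average each AU intensity over the frames in one time window."""
    if not frames:
        return {}
    return {au: mean(frame[au] for frame in frames) for au in frames[0]}


features = window_features(load_au_frames(SAMPLE_CSV))
print(features)
```

Window-level summaries of this kind are one plausible input to a pattern-based classifier; richer descriptors (for example, AU timing or co-occurrence) could be substituted without changing the overall idea.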

In emotion recognition research, FACS is used to identify the specific facial muscle movements associated with different emotions. FACS consists of 46 main action units numerically coded to represent specific facial muscle movements. For example, AU 1 represents the raising of the inner portion of the eyebrows, while AU 12 represents the pulling of the lip corners upward in a smile. Trained FACS coders can identify these AUs’ presence, intensity, and timing, allowing for a detailed analysis of facial expressions. This approach has been widely applied in studies of emotional communication, social interaction, and detecting affective disorders because of its objectivity and reliability. Since the system is based on specific muscle movements rather than subjective interpretations of expressions, it allows for more consistent and accurate measurement of facial behaviour across different individuals and cultures.

Figure 5 shows a human face with a sample of AU locations. These AUs offer an objective lens to measure the subtle display of emotions across the face. For example, AU01, the inner brow raiser, and AU02, the outer brow raiser, draw the eye upward, hinting at the interplay of surprise, concern, or concentration. AU04, the brow lowerer, and AU05, the upper lid raiser, work in tandem to create expressions of intensity or focus.

Figure 5. Sample facial action coding system (FACS) action units.

Part of Lazic’s dissertation research collects the AUs recorded as undergraduate students sit at a computer and try to solve riddles posed to them. Riddles are used because we anticipated that moments of nascent understanding, misunderstanding, confusion, emergent understanding, and deep understanding could be objectively validated by human observers (i.e., in many cases, raters could attribute the phases of understanding reliably). The phases will not be identified directly by the students’ emotions; instead, they will be determined by patterns of AUs that are unique to each phase. These data allow us to develop supervised machine learning models that identify phases of understanding.

To better understand this procedure, imagine you want to teach a computer to recognize different types of fruits. The computer is like a student, and you are the teacher. To help the computer learn, you must show it many examples of each fruit and tell it what each one is. This process is called supervised machine learning. First, you gather an extensive collection of fruit pictures, making sure to include a variety of examples for each type of fruit. You carefully sort through the images, removing any that are blurry or do not clearly show the fruit. You must also have a balanced set of examples, so the computer does not learn to favour one fruit over another. Next, you decide on the best way to teach the computer.

When you start teaching the machine, you show the computer a picture of a fruit and tell it what it is. The computer looks at the picture and tries to determine what makes that fruit unique. Is it the colour, shape, or texture? The computer adjusts its understanding based on your feedback. You do this many times with different pictures of the same fruit, helping the computer to refine its understanding. After showing the computer many examples of each fruit, you test its knowledge. You show it new pictures of fruits it has not seen before and ask it to identify them. If the computer gets most of them right, you know it has learned well. If it makes a lot of mistakes, you might need to go back and teach it some more, perhaps focusing on the fruits it had trouble with. Once you are satisfied that it has learned, you can use the program to identify fruit—or, in our case, phases of understanding.
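The teach-then-test procedure in this analogy can be sketched in a few lines of Python. The example below is only a toy: the feature vectors are synthetic, the phase prototypes are made up, and a nearest-centroid classifier stands in for whatever model is actually used in Lazic’s study, which the text does not specify.

```python
import random
from collections import defaultdict

PHASES = ["nascent", "misunderstanding", "confusion", "emergent", "deep"]


def make_synthetic_data(n_per_phase=40, n_features=4, seed=0):
    """Build a balanced, labelled set of invented AU-style feature vectors."""
    rng = random.Random(seed)
    data = []
    for index, phase in enumerate(PHASES):
        # Each phase gets its own made-up prototype pattern.
        centre = [float((index + j) % len(PHASES)) for j in range(n_features)]
        for _ in range(n_per_phase):
            features = [c + rng.gauss(0.0, 0.3) for c in centre]
            data.append((features, phase))
    rng.shuffle(data)
    return data


def fit_centroids(examples):
    """'Teach' the model: average the feature vectors seen for each label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, features):
    """Label a new example with the phase whose centroid is closest."""
    def squared_distance(centre):
        return sum((a - b) ** 2 for a, b in zip(features, centre))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


data = make_synthetic_data()
split = int(0.75 * len(data))
train, held_out = data[:split], data[split:]   # unseen examples for testing
centroids = fit_centroids(train)
correct = sum(predict(centroids, x) == y for x, y in held_out)
accuracy = correct / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

The held-out split plays the role of the "new pictures" in the analogy: the model is judged only on examples it was not shown during teaching.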

The initial results of Lazic’s study indicate that machine learning models can accurately predict the student’s understanding phase. Each of the five phases can be identified reliably. Once the models are refined, the next step will be to evaluate them as the students work on authentic problems of understanding related to their academic studies, using a Gen-AI program to improve their understanding. Gen-AI, of which the generative pre-trained transformer (GPT) is a well-known example, is a type of artificial intelligence that can create new content, such as text, images, or music, based on patterns learned from existing data. It uses deep learning algorithms to analyze and understand the underlying patterns in the input data. By training on vast amounts of data, Gen-AI systems can learn to generate new content that resembles the original data but is not an exact copy.

One of the essential aspects of Gen-AI is that it does not store the content it generates. Instead, it creates new content based on the patterns and rules it learned during training. When a user provides an input or prompt, the Gen-AI system uses its trained neural networks to analyze the input and generate a relevant output in response. Therefore, the Gen-AI system does not have a pre-defined database of content to pull from; instead, it dynamically creates new content tailored to the specific input. This approach allows Gen-AI systems to be highly flexible and adaptable, as they can generate a wide variety of content without being limited by a fixed set of stored data [74]. Creating content on the fly enables Gen-AI to produce novel and unique outputs, making it a powerful tool for various applications, such as content creation, virtual assistance, and creative problem-solving.

The content creation feature of Gen-AI makes it a potentially powerful tutor because it can create personalized learning experiences tailored to an individual student’s needs, learning style, and pace. By analyzing the student’s responses and performance, the AI tutor can adapt its teaching approach and generate explanations, examples, and exercises that are most relevant and effective for that student. Gen-AI tutors can provide instant feedback and support, helping students identify and correct mistakes in real-time, accelerating the learning process and keeping students engaged and motivated. Students can ask questions and receive accurate, detailed answers generated on the spot or request to be tested and receive instant feedback on their answers.

Moreover, with its ability to create content on the fly, a Gen-AI tutor can generate a virtually unlimited number of practice problems, examples, and explanations, ensuring that students have access to a wide variety of learning materials and can practice as much as they need to master a topic without running out of content [75]. Gen-AI models can tutor in various subjects, from math and science to language and arts, allowing students to have a consistent, personalized learning experience across different subjects and educational levels. In doing so, they can evaluate their output if requested.

Furthermore, Gen-AI tutors can be accessed anytime, anywhere, through digital devices, making quality education more accessible to students who may not have access to traditional tutoring services. Gen-AI tutors’ scalability means they can serve many students simultaneously without compromising the quality of instruction. By leveraging the power of content creation on the fly, Gen-AI can reform how we approach education and provide students with highly effective, personalized learning experiences [76].

We leverage machine learning and generative pre-trained transformer AI technologies in our work. Output from Lazic’s AU monitoring is processed to identify phases of understanding and this information is then fed to our initial version of a Gen-AI tutor we call Abel Tutor.

5. Abel Tutor

Considerable thought has been put into the ethical considerations of building a Gen-AI tutor with an affective perception of the user. As the development of Abel Tutor moves forward, we are mindful of the complex ethical concerns of creating a tool designed to aid students. It needs to be a tool that promotes equal opportunities and addresses the diverse needs of learners. The algorithms must be meticulously designed to avoid perpetuating or amplifying existing biases and disparities in educational outcomes. Regular audits will need to be part of the workflow, allowing us to (1) identify and mitigate any unintended biases that threaten to undermine the educational value of the technology; (2) detect and remove inaccurate information that the system may have “hallucinated”; and (3) refine the guardrails designed to suppress harmful and hateful content.

Additionally, we understand the importance of data privacy and security in dealing with sensitive information about students’ learning patterns, strengths, and weaknesses—a responsibility no one should take lightly. Robust data protection measures, including encryption and strict access controls, will be woven into every element of Abel Tutor. We will prioritize ensuring that students’ personal information remains confidential and secure, fostering trust between the learners and the system.

Transparency and accountability are also essential. The team recognizes that for students, teachers, and parents to embrace Abel Tutor, they must understand how the system makes its decisions and recommendations. To that end, we are developing visual dashboards to provide clear and accessible explanations of the underlying measurements, empowering users to make informed decisions about their engagement with the platform. We will also have channels for feedback and open communication, ensuring that any concerns or questions will be promptly addressed.

Further, we are monitoring the balance between the benefits of personalized learning and the potential risks of over-reliance on AI. While the system is designed to adapt to individual learning needs and provide tailored support, we are aware of the importance of human interaction and the role of teachers in the learning process. We will develop Abel Tutor as a complementary tool that can enhance the guidance and mentorship provided by educators. Moreover, we are committed to ongoing monitoring and refinement, ensuring that the system continues to align with fairness, privacy, transparency, and balanced learning support. We understand that the true measure of success will not be found solely in the technological sophistication of the platform but in its ability to positively transform the lives of learners while upholding the highest ethical standards. Finally, the ability to turn on or off the affective perception features of Abel Tutor will be entirely under the control of the students. No data about individuals or their performance will be stored outside of the data of fully informed research participants engaged in a research study.

Figure 6 illustrates the flow of information within the Abel Tutor system. The information pipeline consists of several components that work together to process and analyze user data, ultimately providing personalized tutoring support. As illustrated in the figure, Abel Tutor’s backend processing uses the student’s video and voice inputs. This information is processed in real-time to detect action unit (AU) dynamics, transdermal optical imaging (TOI) changes, and vocal dynamics. Three measures are employed to enhance the reliability and validity of the physiological effects. Both voice and video are collected because talking can degrade AU information and TOI may be unreliable under unstable lighting conditions. The data on physiological feelings are then passed to the phase identification model, where phases of understanding are detected based on machine learning (ML) models. If a phase is identified, the information is sent to the individualized student model, which builds and saves a history for the learner throughout the session. This information helps determine if the student is repeatedly confused, overconfident in assessing their understanding, or potentially developing a misconception. The student model information is then passed to the instructional model, where prompts are generated to inform the generative AI (Gen-AI) tutor about the student’s physiological state and current phase of understanding, session history of understanding, and any special conditions that need to be addressed, such as possible misconceptions. Based on the student’s input and the instructional model’s prompts, the Gen-AI tutor constructs its output for the student. The tutor then waits for the next round of input from the student.

Figure 6. Generalized information pipeline for the Abel Tutor system.
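The later stages of this pipeline (student model, instructional model, prompt to the Gen-AI tutor) can be sketched as follows. This is a minimal illustration under stated assumptions rather than Abel Tutor’s actual implementation: the class and function names, the window and threshold values, and the prompt wording are all hypothetical, and overconfidence detection is omitted because the text does not say how it is measured. No real Gen-AI API is called; the sketch only builds the text that would be passed along.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StudentModel:
    """Keeps the session history of detected phases (hypothetical structure)."""
    history: List[str] = field(default_factory=list)

    def update(self, phase: Optional[str]) -> None:
        # Only identified phases are recorded; None means no phase was detected.
        if phase is not None:
            self.history.append(phase)

    def special_conditions(self, window: int = 4, threshold: int = 3) -> List[str]:
        """Flag patterns in the most recent detections (thresholds are guesses)."""
        recent = self.history[-window:]
        conditions = []
        if recent.count("confusion") >= threshold:
            conditions.append("repeated confusion")
        if recent.count("misunderstanding") >= threshold:
            conditions.append("possible misconception")
        return conditions


def build_instructional_prompt(model: StudentModel, student_input: str) -> str:
    """Compose the context sent to the Gen-AI tutor with the student's input."""
    current = model.history[-1] if model.history else "unknown"
    conditions = model.special_conditions() or ["none"]
    return (
        f"Current phase of understanding: {current}\n"
        f"Session history: {', '.join(model.history) or 'empty'}\n"
        f"Special conditions: {', '.join(conditions)}\n"
        f"Student says: {student_input}"
    )


model = StudentModel()
for detected in ["confusion", None, "confusion", "emergent", "confusion"]:
    model.update(detected)
prompt = build_instructional_prompt(model, "I still don't see why this works.")
print(prompt)
```

Keeping the student model separate from prompt construction mirrors the division between the individualized student model and the instructional model described in the text.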

The information flow in the Abel Tutor system is designed to be iterative and dynamic, allowing for real-time monitoring and adaptation to the user’s evolving understanding. By leveraging advanced technologies for feeling detection and mental state modelling, the system aims to provide emotionally resonant and individualized tutoring support that facilitates deep understanding and lifelong learning. This is a novel approach to AI tutoring that advances the field in several critical ways compared to existing tutors. Unlike many existing tutors that primarily rely on text-based input or simple user interactions, Abel Tutor uses video and voice input to instantaneously detect a student’s state of understanding. The system processes AU dynamics, TOI changes, and vocal dynamics in real time, allowing for a more accurate and timely assessment of the student’s emotional and cognitive state, which is not found in existing tutor systems. By employing three measures for physiological effects and collecting both voice and video data, Abel Tutor aims to increase the reliability and validity of its assessments, addressing potential limitations of existing systems that may rely on a single modality-based measure.
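One way to picture how three measures might compensate for one another is a weighted combination in which a modality counts for less when conditions make it unreliable. The sketch below assumes each modality yields a probability for each phase. The weights, the `is_talking` and `lighting_stable` flags, and the function name are illustrative assumptions, not a description of how Abel Tutor combines its signals.

```python
PHASES = ("nascent", "misunderstanding", "confusion", "emergent", "deep")


def fuse_modalities(estimates, is_talking, lighting_stable):
    """Combine per-modality phase probabilities with context-dependent weights.

    estimates maps a modality name ("au", "toi", "voice") to a dict of
    phase -> probability. The weights below are illustrative guesses.
    """
    weights = {"au": 1.0, "toi": 1.0, "voice": 1.0}
    if is_talking:
        weights["au"] = 0.25   # talking can degrade AU information
    if not lighting_stable:
        weights["toi"] = 0.0   # TOI may be unreliable under unstable lighting
    total = sum(weights[name] for name in estimates)
    if total == 0:
        return None            # no usable modality in this window
    return {
        phase: sum(weights[name] * probs.get(phase, 0.0)
                   for name, probs in estimates.items()) / total
        for phase in PHASES
    }


estimates = {
    "au": {"confusion": 0.7, "emergent": 0.3},
    "toi": {"confusion": 0.5, "emergent": 0.5},
    "voice": {"confusion": 0.4, "emergent": 0.6},
}
quiet = fuse_modalities(estimates, is_talking=False, lighting_stable=True)
talking = fuse_modalities(estimates, is_talking=True, lighting_stable=True)
print(max(quiet, key=quiet.get), max(talking, key=talking.get))
```

In this toy case the facial signal dominates while the student is silent, but the voice signal tips the decision once the student starts talking and the AU estimate is down-weighted.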

Further, using machine learning (ML) models to detect phases of understanding is a significant advancement in Abel Tutor. It allows the tutor to adapt its approach based on the student’s current level of comprehension, which is crucial for effective tutoring. Moreover, Abel Tutor builds and maintains an individualized student model that tracks the learner’s history throughout the session, enabling the tutor to identify patterns, such as repeated confusion or overconfidence, and potentially detect misconceptions. This level of personalization and extended tracking is lacking in existing tutor systems.

The instructional model in Abel Tutor generates prompts to inform the generative AI (Gen-AI) tutor about the student’s state, history, and special conditions, allowing for a more tailored context and tutoring experience. The Gen-AI tutor can then construct its output based on the student’s input and the instructional model’s prompts, enabling a more dynamic and responsive interaction than pre-programmed tutoring systems.

In summary, Abel Tutor’s integration of multimodal input, real-time processing, robust measures, advanced machine learning models, individualized student modeling, and a responsive generative AI tutor represents a new generation of cognitive and affect-aware AI tutoring systems. By building upon existing research and techniques, this system pushes the boundaries of AI tutoring, creating the conditions for more sophisticated and impactful educational technologies.

6. Discussion and Implications

The Abel Tutor system aims to enhance the learning experience by viewing affect and cognition as integrated, as suggested by Damasio’s theory. Statements like “It just didn’t feel right” or “I knew in my gut that what he said was wrong” demonstrate what Damasio [1] is talking about in Feeling & Knowing: Making Minds Conscious. Of course, the gut does not know facts, but the feelings it generates accompany cognition and serve as a proxy for what the brain is contemplating. We can infer the cognitive processes that underlie a learner’s understanding by measuring physiological feelings.

I have speculated that while this approach aligns with Festinger’s Cognitive Dissonance Theory, which posits that negative feelings drive individuals to reduce dissonance, our model goes further. We extend Festinger by considering negative and positive valences and the learner’s mental state. Specifically, while dissonance motivates you to resolve your confusion or misunderstanding, positively valenced feelings can provide the internal motivation and cognitive engagement necessary to deepen your understanding.

I have suggested throughout this paper that understanding progresses through distinct phases, each characterized by unique affective identifiers. By detecting these identifiers, we can determine a learner’s current understanding phase and predict their task engagement level. The phases are non-linear. A student might flip between emergent understanding and confusion numerous times before they move to deeper understanding. The instructional model must anticipate these iterations. The student needs to keep working until “it feels right”. Information about the student’s affect and phase is crucial for providing targeted support and interventions to optimize the learning experience. Therefore, to demonstrate that such a Gen-AI tutor is possible, we leverage two forms of AI: machine learning for affect awareness and Gen-AI for adaptive behaviour. Lazic’s research has demonstrated the feasibility of using machine learning to detect affective states. By incorporating this capability into Abel Tutor, we can gain valuable insights into the learner’s emotional experiences. Additionally, Gen-AI can utilize this affective information to adjust its behaviour and provide personalized support tailored to the learner’s needs (see sample text in Table 3 below).
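Because the phases are non-linear, an instructional model may want to know how often a learner has moved back and forth between two phases. The short Python sketch below counts such flips in a sequence of detected phases; the phase labels and function names are hypothetical, and the session data are invented.

```python
from itertools import groupby


def collapse_runs(phases):
    """Merge consecutive repeats so each entry marks a change of phase."""
    return [phase for phase, _ in groupby(phases)]


def count_flips(phases, first="confusion", second="emergent"):
    """Count direct transitions, in either direction, between two phases."""
    sequence = collapse_runs(phases)
    return sum(1 for a, b in zip(sequence, sequence[1:])
               if {a, b} == {first, second})


session = ["nascent", "confusion", "confusion", "emergent",
           "confusion", "emergent", "emergent", "deep"]
flips = count_flips(session)
print(flips)
```

A count like this could inform how patiently the tutor scaffolds before changing strategy, although the appropriate threshold is an open design question.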

Table 3. Sample tutor–student interaction.

The tutoring session highlighted in Table 3 showcases two integral features. Firstly, in addition to the phase of understanding, the tutor automatically tracks the emotional tone in both its replies and the student’s input, which is an important element of the interaction. Secondly, the nature of the student’s engagement is given considerable importance.

Abel can detect and respond to the learner’s feelings, emotional cues, and understanding levels. This capability enables Abel to create a learning experience far more individualized and engaging than mere knowledge transfer. This strategy is grounded in the idea that learning should cultivate a deep interest and a self-driven desire to explore a subject. This goal aligns with Frijda’s theory of passion as a sustained eagerness to pursue an area of interest [77] and improve comprehension—an approach backed by notable research in teaching and learning [78].

Incorporating learner engagement means having the tutor encourage students to express their thoughts, questions, and uncertainties. This dialogue facilitates targeted support [79,80]. By urging learners to communicate, the tutor can assess their current level of comprehension and emotional state, leading to support that is both personalized and effective. Additionally, the tutor’s empathetic response to the learner’s emotional state is crucial, providing the necessary scaffolding to help them navigate the learning process [56,72,81]. Furthermore, monitoring the learners’ feelings, along with offering tailored support, can foster a learning environment that is both safe and conducive to deeper engagement with the subject matter [82,83]. Personalizing the learning experience to align with the learner’s prior knowledge, interests, and learning style is also vital, enhancing their connection to the material and motivating them to explore further [84,85].

Ideally, the AI tutor should create opportunities for learners to apply their knowledge meaningfully, promoting a deep understanding and a well-developed interest in the subject. Practical applications, like problem-solving, case studies, or creative projects, help learners perceive the value of their learning, moving them to engage fully with the subject and fostering a true passion for learning [78]. At the same time, a primary design feature of Abel will be to prevent the establishment of deeply rooted misconceptions that are difficult to remediate later. By continuously monitoring the learner’s affective states and phases of understanding, the tutor can identify and address misconceptions early in the learning process. We anticipate this will be particularly helpful in mathematics, where many student misconceptions are well-documented. Additionally, Abel Tutor may serve as an assessment tool that can determine when a student is ready to move on to the next stage of learning. By presenting test-confirming questions when the tutor detects deep understanding, the tutor can ensure that learners progress efficiently and avoid wasting time studying material they have already mastered.
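The assessment role described here amounts to a simple decision rule: confirm detected deep understanding with a question before moving on. A minimal sketch follows, in which the phase label `"deep"`, the action strings, and the function name are all hypothetical.

```python
def next_action(phase, confirmed_correct=None):
    """Decide the tutor's next step from the detected phase.

    confirmed_correct is None until a confirming question has been answered.
    """
    if phase != "deep":
        return "continue tutoring"
    if confirmed_correct is None:
        return "ask confirming question"
    return "advance to next topic" if confirmed_correct else "continue tutoring"


print(next_action("confusion"))
print(next_action("deep"))
print(next_action("deep", confirmed_correct=True))
```

Requiring a correct answer before advancing guards against the case where a detected feeling of deep understanding rests on a misconception.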

In summary, the Abel Tutor system represents a new generation of educational technology. By integrating affect and cognition, leveraging AI capabilities, and adapting to individual learner profiles, the tutor has the potential to create emotionally resonant learning experiences that foster deep understanding and lifelong learning. The power of this technology should help both advantaged and disadvantaged learners. We anticipate that disadvantaged students will have access to a personal, non-judgemental tutor willing to work with them at any time for as long as they wish, imparting learning strategies and emotional support in a way they may never have experienced. As we continue to refine and develop this system, we anticipate a transformative impact on how we approach education and support learners in their pursuit of knowledge in a manner that promotes equity of outcomes.

The instructional implications of the phased understanding framework are far-reaching. By determining the learner’s phases of understanding, educators, even those not using an AI tutor, can craft instructional strategies that are finely attuned to the needs of learners in each phase. For example, when a child struggles with nascent understanding or misunderstanding, teachers can break down complex concepts into manageable chunks, provide clear explanations, and help learners forge connections between new information and prior knowledge. As learners navigate the confusion phase, educators can encourage questioning, provide opportunities for exploration and discovery, and facilitate discussions that help learners resolve cognitive conflicts and construct new understandings. As learners progress to emergent and deep understanding, teachers can challenge them with more complex tasks, encourage metacognition and reflection, and provide opportunities for knowledge application and transfer. Such a framework has the potential to profoundly shape how we approach education. By illuminating the cognitive and emotional dimensions of learning, aligning with existing educational theories, and pointing towards the potential of emerging technologies, we can imagine creating learning experiences that are intellectually nurturing.

The mental state model, which views understanding as being composed of assembled beliefs, values, emotions, and feelings, provides a framework for understanding the complex interplay between emotion and cognition in the learning process. As learners progress through nascent understanding, misunderstanding, confusion, emergent understanding, and deep understanding, they experience a range of cognitive processes, knowledge structures, metacognitive awareness, and emotions that shape their learning journey. By identifying the feelings associated with each understanding phase in real time, Abel Tutor can provide targeted interventions and support tailored to each individual learner’s needs. For example, when a student is detected to be in the confusion phase, characterized by perplexity, frustration, and curiosity, Abel Tutor can offer clarifying explanations, guiding questions, or additional resources to help resolve the confusion and foster emergent understanding. As the student progresses towards deep understanding, accompanied by feelings of enjoyment, satisfaction, and inspiration, Abel Tutor can provide more advanced, thought-provoking materials and encourage the student to explore the subject matter further, nurturing their intrinsic motivation and engagement. Throughout this process, Abel gives feedback when subjective understanding develops.
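The phase-to-intervention logic described above can be sketched in a few lines of Python. This is a minimal illustration, not Abel Tutor's implementation: the phase keys, the `INTERVENTIONS` table, and the `select_interventions` function are hypothetical names. The supports listed for confusion and deep understanding follow the paragraph above; those for the other phases borrow the teacher strategies described earlier in this section and are assumed to carry over to the tutor.

```python
from typing import Dict, List, Tuple

# Candidate supports per phase. The phase keys and this table layout are
# illustrative assumptions; the wording of each support comes from the text.
INTERVENTIONS: Dict[str, List[str]] = {
    "nascent": ["manageable chunks", "clear explanation", "link to prior knowledge"],
    "misunderstanding": ["manageable chunks", "clear explanation", "link to prior knowledge"],
    "confusion": ["clarifying explanation", "guiding question", "additional resource"],
    "emergent": ["more complex task", "prompt for reflection"],
    "deep": ["advanced, thought-provoking material", "invitation to explore further"],
}


def select_interventions(phase_probs: Dict[str, float]) -> Tuple[str, List[str]]:
    """Return the most probable phase and the candidate supports for it."""
    if not phase_probs:
        raise ValueError("phase_probs must not be empty")
    unknown = set(phase_probs) - set(INTERVENTIONS)
    if unknown:
        raise ValueError(f"unknown phase(s): {sorted(unknown)}")
    # max() returns the first key on ties, so ordering breaks ties (assumption).
    phase = max(phase_probs, key=phase_probs.get)
    return phase, INTERVENTIONS[phase]


# Hypothetical detector output for one student turn.
phase, supports = select_interventions({"confusion": 0.7, "emergent": 0.3})
print(phase, supports)
```

Choosing only the single most probable phase is a simplification; a fuller design might blend supports when two phases are nearly equally likely.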

Detecting and responding to a student’s emotional state in real-time is a powerful tool for enhancing the learning experience. By creating an emotionally resonant learning environment that adapts to the individual’s needs, Abel Tutor can help students navigate the challenges and setbacks they encounter, promoting the development of accurate mental models and fostering a sense of self-efficacy and self-regulated learning. As we continue to develop and refine AI-driven assessments of subjective understanding, we must ensure that these technologies are grounded in sound pedagogical principles and ethical considerations. By harnessing the power of real-time monitoring and analysis of physiological feelings, Abel Tutor can provide personalized, adaptive support that empowers learners and optimizes their educational outcomes, ultimately facilitating the attainment of deep subjective understanding.

As Abel Tutor progresses, two significant foci will represent our challenges. The primary focus will be to continue to help students advance their subjective understanding, which will involve (a) ongoing fine-tuning of the ML models to identify the phases of understanding, (b) elaborating and adjusting the student model based on the tutor’s successes and failures, and (c) refining the instructional model through improved prompt engineering. The second focus will be to help students nurture their curiosity, interest, and motivation. Interest, as a psychological construct, has been described as a phase-driven process (see Renninger and Hidi [86]). Research is underway in my lab to create ML models that will detect these phases.

This paper has concentrated on describing the first focus, which should not be taken educationally as any diminution of the second focus. As Litman [53] points out, curiosity can promote a need to know “… that becomes increasingly bothersome until satisfied by obtaining the desired pieces of missing information” (cf., [53], p. 418). However, Renninger and Hidi [86] view curiosity as a temporary motivational state triggered by novelty or information gaps. On the other hand, interest is a more enduring and stable motivational state characterized by focused attention, increased cognitive functioning, and a willingness to engage with a subject over time. Further, these researchers also say “At any age, in any context, interest can be encouraged to develop” (cf., [87], p. 282). Specifically, Abel Tutor will be designed to develop interest, expressed in curiosity-driven research, and hopefully nurture what Csikszentmihalyi [58] would call flow or what Frijda [2] would describe as passion.

Abel Tutor’s role is to help students develop a subjective understanding of the material being taught. Subjective understanding, or what Searle [88,89,90] would refer to as ontological subjective understanding, is a student’s personal, individualized way of relating to and making sense of information. This understanding allows students to connect new knowledge to their prior experiences, beliefs, and emotions, making the learning process more meaningful and engaging. Abel Tutor should strive to awaken a passion for the subject in its students to enhance learning. When students are passionate, they are intrinsically motivated to explore, ask questions, and seek additional information. This passion can be sparked by presenting the material to highlight its relevance to students’ lives, interests, and goals.

To advance students further, it becomes necessary for us to shift the focus toward objective understanding. Carl Bereiter [28] argues that students must develop objective understanding in the knowledge age, an age characterized by rapid technological advancements and exponential information growth. As Bereiter explains, objective understanding, or what Searle [88,89,90] would refer to as epistemically objective understanding, involves grasping the underlying principles, theories, and facts that constitute a body of knowledge independent of personal experiences or opinions. Shifting from subjective to objective understanding encourages students to ask questions, engage in peer discussions, and explore the subject matter more creatively, systematically, and rigorously. Collaboratively engaging with ideas in such a manner is a pedagogy of knowledge building that will complement Abel Tutor’s solitary tutoring approach—emphasizing the human component that some fear may be lost as general artificial intelligence advances.

7. Conclusions

This paper emphasizes that subjective understanding is a complex process involving the interplay of feelings and cognition. Feelings act as indicators of mental states and highlight significant shifts in psychological states. Advances in technology, such as facial expression analysis, transdermal optical imaging, and voice analysis, enable computers to monitor a user’s sympathetic and parasympathetic nervous system activity in real-time. This allows for the detection of feelings and mental states associated with different phases of understanding as users engage with study materials.

I introduce the mental state model as a framework for understanding the multifaceted nature of the emotion–cognition relationship expressed by embodied cognition theorists [91,92,93]. Accordingly, understandings are composed of assembled beliefs, values, emotions, and feelings that shape perceptions, decisions, and actions. I further describe five phases of understanding that learners may progress through: nascent understanding, misunderstanding, confusion, emergent understanding, and deep understanding. Each phase is characterized by specific cognitive processes, emotions, and physiological responses that can be detected and analyzed by AI-driven assessments.

Grounded in the above approach, I propose the development of Abel Tutor, an AI-driven system that uses real-time monitoring of physiological feelings to provide individualized, adaptive tutoring support. By identifying the feelings associated with each phase of understanding, Abel Tutor can offer targeted interventions to help students navigate challenges and promote engagement, as advocated by developers like Sal Khan [94]. This paper suggests that AI-driven assessments of subjective understanding could reform the learning experience by creating emotionally resonant learning environments that adapt to individual needs and optimize educational outcomes for both advantaged and disadvantaged students.

Finally, I emphasize the importance of ensuring that AI-driven assessments of subjective understanding are grounded in sound pedagogical principles and ethical considerations as these technologies continue to be explored. The ultimate goal is to empower learners by facilitating deep understanding and lifelong learning.

Funding

This research received no external funding.

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

As this is a perspective article, there are no original data to share.

Conflicts of Interest

The author declares no conflict of interest.

References

Damasio, A. Feeling & Knowing: Making Minds Conscious; Pantheon Books: New York, NY, USA, 2021. [Google Scholar]
Frijda, N.H. The Laws of Emotion, 1st ed.; Taylor and Francis: Abingdon, UK, 2007. [Google Scholar] [CrossRef]
Frijda, N.H. The Emotions; Cambridge University Press: Cambridge, UK, 1986. [Google Scholar]
Olson, D.R. Making Sense: What It Means to Understand; Cambridge University Press: Cambridge, UK, 2022. [Google Scholar]
Scherer, K.R.; Moors, A. The Emotion Process: Event Appraisal and Component Differentiation. Annu. Rev. Psychol. 2019, 70, 719–745. [Google Scholar] [CrossRef] [PubMed]
Festinger, L. Cognitive Dissonance. Sci. Am. 1962, 207, 93–106. [Google Scholar] [CrossRef] [PubMed]
Harmon-Jones, E.; Harmon-Jones, C.; Levy, N. An Action-Based Model of Cognitive Dissonance Processes. Curr. Dir. Psychol. Sci. 2015, 24, 184–189. [Google Scholar] [CrossRef]
Wei, J.; Luo, H.; Wu, S.J.; Zheng, P.P.; Fu, G.; Lee, K. Transdermal Optical Imaging Reveal Basal Stress via Heart Rate Variability Analysis: A Novel Methodology Comparable to Electrocardiography. Front. Psychol. 2018, 9, 309547. [Google Scholar] [CrossRef] [PubMed]
Barrett, L.F. How Emotions Are Made: The Secret Life of the Brain; Houghton Mifflin Harcourt: Boston, MA, USA, 2017. [Google Scholar]
Buono, S.; Zdravkovic, A.; Lazic, M.; Woodruff, E. The Effect of Emotions on Self-Regulated-Learning (SRL) and Story Comprehension in Emerging Readers. In Frontiers in Education; Frontiers Media SA: Lausanne, Switzerland, 2020; Volume 5, pp. 1–12. [Google Scholar] [CrossRef]
Ekman, P.; Rosenberg, E.L. (Eds.) What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS); Oxford University Press: Oxford, UK, 1997. [Google Scholar]
Oatley, K. Our Minds, Our Selves: A Brief History of Psychology; Princeton University Press: Princeton, NJ, USA, 2018. [Google Scholar]
Fu, G.; Zhou, X.; Wu, S.J.; Nikoo, H.; Panesar, D.; Zheng, P.P.; Oatley, K.; Lee, K. Discrete Emotions Discovered by Contactless Measurement of Facial Blood Flows. Cogn. Emot. 2022, 36, 1429–1439. [Google Scholar] [CrossRef] [PubMed]
Jacques, S.L. Optical Properties of Biological Tissues: A Review. Phys. Med. Biol. 2013, 58, 37–61. [Google Scholar] [CrossRef] [PubMed]
Bashkatov, A.N.; Genina, E.A.; Tuchin, V.V. Optical Properties of Skin, Subcutaneous, and Muscle Tissues: A Review. J. Innov. Opt. Health Sci. 2011, 4, 9–38. [Google Scholar] [CrossRef]
Lu, C.; Zong, Y.; Zheng, W.; Li, Y.; Tang, C.; Schuller, B.W. Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 2022, 30, 2217–2230. [Google Scholar] [CrossRef]
Hunte, M.R.; McCormick, S.; Shah, M.; Lau, C.; Jang, E.E. Investigating the Potential of NLP-Driven Linguistic and Acoustic Features for Predicting Human Scores of Children’s Oral Language Proficiency. Assess. Educ. Princ. Policy Pract. 2021, 28, 477–505. [Google Scholar] [CrossRef]
Wu, Z.; Li, D.; Yang, W. Rules Based Feature Modification for Affective Speaker Recognition. In Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, Toulouse, France, 14–19 May 2006; Volume 1, p. 1. [Google Scholar]
Vincett, M.; Jang, E.E. Examining Young Learners’ Emotions and How They Relate to Cognition and Learning. Ph.D. Thesis, University of Toronto, Toronto, ON, Canada, 2022. [Google Scholar]
Jang, E.E.; Hannah, L.; Lee, M.-H.; Russell, B. Investigating Prosody-Inclusive Automated Oral Reading Fluency Assessment: Construct Representativeness and Linguistic Equity. In Proceedings of the Symposium at 2023 Language Testing Research Colloquium, NYC, New York, NY, USA, 7–9 June 2023. [Google Scholar]
Pekrun, R.; Linnenbrink-Garcia, L. Introduction to Emotions in Education. In Educational Psychology Handbook Series. International Handbook of Emotions in Education; Pekrun, R., Linnenbrink-Garcia, L., Eds.; Routledge: London, UK; Taylor & Francis Group: Abingdon, UK, 2014; pp. 1–10. [Google Scholar] [CrossRef]
Palmer, C.J.; Seth, A.K.; Hohwy, J. The Felt Presence of Other Minds: Predictive Processing, Counterfactual Predictions, and Mentalising in Autism. Conscious. Cogn. Int. J. 2015, 36, 376–389. [Google Scholar] [CrossRef]
Ünal, E.; Papafragou, A. The Relation between Language and Mental State Reasoning. In Metacognitive Diversity: An Interdisciplinary Approach; Proust, J., Fortier, M., Eds.; Oxford University Press: Oxford, UK, 2018. [Google Scholar]
Siegler, R.S. Emerging Minds: The Process of Change in Children’s Thinking; Oxford University Press: Oxford, UK, 1996. [Google Scholar]
Pintrich, P.R. A Motivational Science Perspective on the Role of Student Motivation in Learning and Teaching Contexts. J. Educ. Psychol. 2003, 95, 667–686. [Google Scholar] [CrossRef]
Newell, A.; Simon, H.A. Human Problem Solving; Prentice-Hall: Hoboken, NJ, USA, 1972. [Google Scholar]
Chi, M.T.H. Three Types of Conceptual Change: Belief Revision, Mental Model Transformation, and Categorical Shift. In Handbook of Research on Conceptual Change; Vosniadou, S., Ed.; Erlbaum: Mahwah, NJ, USA, 2008; pp. 61–82. [Google Scholar]
Bereiter, C. Education and Mind in the Knowledge Age; Routledge: London, UK, 2005. [Google Scholar]
Woodruff, E. Manifold Relational Understanding; University of Toronto: Toronto, ON, Canada, 2005. [Google Scholar]
Bartlett, F.C. Remembering: A Study in Experimental and Social Psychology; Cambridge University Press: Cambridge, UK, 1932. [Google Scholar]
Broadbent, D.E. Perception and Communication; Pergamon Press: Oxford, UK, 1958. [Google Scholar]
Johnson-Laird, P.N. Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness; Harvard University Press: Cambridge, MA, USA, 1983. [Google Scholar]
Dunning, D.; Kruger, J. Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments. J. Pers. Soc. Psychol. 1999, 77, 1121–1134. [Google Scholar]
Piaget, J. The Origins of Intelligence in Children; International Universities Press: Madison, CT, USA, 1952. [Google Scholar]
Sweller, J. Cognitive Load during Problem Solving: Effects on Learning. Cogn. Sci. 1988, 12, 257–285. [Google Scholar] [CrossRef]
Weinstein, C.E.; Mayer, R.E. The Teaching of Learning Strategies. In Handbook of Research on Teaching; Wittrock, M.C., Ed.; Macmillan: New York, NY, USA, 1986; pp. 315–327. [Google Scholar]
Vosniadou, S.; Brewer, W.F. Mental Models of the Earth: A Study of Conceptual Change in Childhood. Cognit. Psychol. 1992, 24, 535–585. [Google Scholar] [CrossRef]
Anderson, J.R. The Architecture of Cognition; Harvard University Press: Cambridge, MA, USA, 1984. [Google Scholar]
Flavell, J.H. Metacognition and Cognitive Monitoring: A New Area of Cognitive Developmental Inquiry. Am. Psychol. 1979, 34, 906–911. [Google Scholar] [CrossRef]
Barnett, S.M.; Ceci, S.J. When and Where Do We Apply What We Learn? A Taxonomy for Far Transfer. Psychol. Bull. 2002, 128, 612–637. [Google Scholar] [CrossRef] [PubMed]
Biggs, J.B.; Collis, K.F. Evaluating the Quality of Learning: The SOLO Taxonomy (Structure of the Observed Learning Outcome); Academic Press: Cambridge, MA, USA, 1982. [Google Scholar]
Graesser, A.C.; D’Mello, S. Emotions during the Learning of Difficult Material. In The Psychology of Learning and Motivation; Ross, B.H., Ed.; Academic Press: Cambridge, MA, USA, 2012; Volume 57, pp. 183–225. [Google Scholar] [CrossRef]
Pekrun, R.; Goetz, T.; Titz, W.; Perry, R.P. Academic Emotions in Students’ Self-Regulated Learning and Achievement: A Program of Qualitative and Quantitative Research. Educ. Psychol. 2002, 37, 91–105. [Google Scholar] [CrossRef]
Muis, K.R.; Pekrun, R.; Sinatra, G.M.; Azevedo, R.; Trevors, G.; Meier, E.; Heddy, B.C. The Curious Case of Climate Change: Testing a Theoretical Model of Epistemic Beliefs, Epistemic Emotions, and Complex Learning. Learn. Instr. 2015, 39, 168–183. [Google Scholar] [CrossRef]
Baker, R.S.; D’Mello, S.K.; Rodrigo, M.M.T.; Graesser, A.C. Better to Be Frustrated than Bored: The Incidence, Persistence, and Impact of Learners’ Cognitive-Affective States during Interactions with Three Different Computer-Based Learning Environments. Int. J. Hum.-Comput. Stud. 2010, 68, 223–241. [Google Scholar] [CrossRef]
D’Mello, S.; Graesser, A. Dynamics of Affective States during Complex Learning. Learn. Instr. 2012, 22, 145–157. [Google Scholar] [CrossRef]
D’Mello, S.; Lehman, B.; Pekrun, R.; Graesser, A. Confusion Can Be Beneficial for Learning. Learn. Instr. 2014, 29, 153–170. [Google Scholar] [CrossRef]
Muis, K.R.; Sinatra, G.M.; Pekrun, R.; Winne, P.H.; Trevors, G.; Losenno, K.M.; Munzar, B. Main and Moderator Effects of Refutation on Task Value, Epistemic Emotions, and Learning Strategies during Conceptual Change. Contemp. Educ. Psychol. 2018, 55, 155–165. [Google Scholar] [CrossRef]
Muis, K.R.; Chevrier, M.; Singh, C.A. The Role of Epistemic Emotions in Personal Epistemology and Self-Regulated Learning. Educ. Psychol. 2018, 53, 165–184. [Google Scholar] [CrossRef]
Pekrun, R.; Stephens, E.J. Academic Emotions. In APA Educational Psychology Handbook; Harris, K.R., Graham, S., Urdan, T., Graham, S., Royer, J.M., Zeidner, M., Eds.; American Psychological Association: Washington, DC, USA, 2012; Volume 2, pp. 3–31. [Google Scholar] [CrossRef]
Woodruff, E.; Buono, S.; Schnabel, R.; Moreno, M.; Xu, Z. Optimizing Affective Confusion. In Proceedings of the Hawaii International Conference on Education (16th Annual Conference), Honolulu, Hawaii, 4–7 January 2018; pp. 1953–1960. [Google Scholar]
Kang, M.J.; Hsu, M.; Krajbich, I.M.; Loewenstein, G.; McClure, S.M.; Wang, J.T.Y.; Camerer, C.F. The Wick in the Candle of Learning: Epistemic Curiosity Activates Reward Circuitry and Enhances Memory. Psychol. Sci. 2009, 20, 963–973. [Google Scholar] [CrossRef] [PubMed]
Litman, J. Curiosity: Nature, Dimensionality, and Determinants. In The Cambridge Handbook of Motivation and Learning; Cambridge University Press: Cambridge, UK, 2019; pp. 418–442. [Google Scholar] [CrossRef]
Schukajlow, S.; Rakoczy, K.; Pekrun, R. Emotions and Motivation in Mathematics Education: Where We Are Today and Where We Need to Go. ZDM Math. Educ. 2023, 55, 249–267. [Google Scholar] [CrossRef] [PubMed]
Pekrun, R.; Vogl, E.; Muis, K.R.; Sinatra, G.M. Measuring Emotions during Epistemic Activities: The Epistemically-Related Emotion Scales. Cogn. Emot. 2017, 31, 1268–1276. [Google Scholar] [CrossRef] [PubMed]
Meyer, D.K.; Turner, J.C. Re-Conceptualizing Emotion and Motivation to Learn in Classroom Contexts. Educ. Psychol. Rev. 2006, 18, 377–390. [Google Scholar] [CrossRef]
Fredrickson, B.L. The Role of Positive Emotions in Positive Psychology: The Broaden-and-Build Theory of Positive Emotions. Am. Psychol. 2001, 56, 218–226. [Google Scholar] [CrossRef] [PubMed]
Csikszentmihalyi, M. Flow: The Psychology of Optimal Experience; Harper & Row: New York, NY, USA, 1990. [Google Scholar]
Ryan, R.M.; Deci, E.L. Self-Determination Theory and the Facilitation of Intrinsic Motivation, Social Development, and Well-Being. Am. Psychol. 2000, 55, 68–78. [Google Scholar] [CrossRef]
Tamir, D.I.; Thornton, M.A.; Contreras, J.M.; Mitchell, J.P. Neural Evidence That Three Dimensions Organize Mental State Representation: Rationality, Social Impact, and Valence. Proc. Natl. Acad. Sci. USA 2016, 113, 194–199. [Google Scholar] [CrossRef]
Schwartz, B.; Wrzesniewski, A. Reconceptualizing Intrinsic Motivation: Excellence as Goal. In The Cambridge Handbook of Motivation and Learning; Renninger, K.A., Hidi, S.E., Eds.; Cambridge University Press: Cambridge, UK, 2019; pp. 373–393. [Google Scholar]
Reeve, J.; Tseng, C.M. Agency as a Fourth Aspect of Students’ Engagement during Learning Activities. Contemp. Educ. Psychol. 2011, 36, 257–267. [Google Scholar] [CrossRef]
Pekrun, R. The Control-Value Theory of Achievement Emotions: Assumptions, Corollaries, and Implications for Educational Research and Practice. Educ. Psychol. Rev. 2006, 18, 315–341. [Google Scholar] [CrossRef]
Pekrun, R.; Goetz, T.; Daniels, L.M.; Stupnisky, R.H.; Perry, R.P. Boredom in Achievement Settings: Exploring Control-Value Antecedents and Performance Outcomes of a Neglected Emotion. J. Educ. Psychol. 2010, 102, 531–549. [Google Scholar] [CrossRef]
D’Mello, S.K.; Graesser, A. Intelligent Tutoring Systems: How Computers Achieve Learning Gains That Rival Human Tutors. In Handbook of Educational Psychology; Routledge: London, UK, 2023; pp. 603–629. [Google Scholar]
Ren, L.; Zhao, L.; Hong, S.; Zhao, S.; Wang, H.; Zhang, L. Remaining Useful Life Prediction for Lithium-Ion Battery: A Deep Learning Approach. IEEE Access 2018, 6, 50587–50598. [Google Scholar] [CrossRef]
Zhang, J.; Ding, G.; Zou, Y.; Qin, S.; Fu, J. Review of Job Shop Scheduling Research and Its New Perspectives under Industry 4.0. J. Intell. Manuf. 2019, 30, 1809–1830. [Google Scholar] [CrossRef]
Liu, Y.; Wang, Y.; Zhang, J. New Machine Learning Algorithm: Random Forest. In Proceedings of the Information Computing and Applications: Third International Conference, ICICA 2012, Chengde, China, 14–16 September 2012; Springer: Chengde, China, 2012; Volume Proceedings 3, pp. 246–252. [Google Scholar]
National Academies of Sciences, Engineering, and Medicine; Division of Behavioral and Social Sciences and Education; Board on Science Education; Board on Behavioral, Cognitive, and Sensory Sciences; Committee on How People Learn II: The Science and Practice of Learning. How People Learn II: Learners, Contexts, and Cultures; National Academies Press: Washington, DC, USA, 2018. [Google Scholar]
VanLehn, K. The Relative Effectiveness of Human Tutoring, Intelligent Tutoring Systems, and Other Tutoring Systems. Educ. Psychol. 2011, 46, 197–221. [Google Scholar] [CrossRef]
VanLehn, K. Regulative Loops, Step Loops and Task Loops. Int. J. Artif. Intell. Educ. 2016, 26, 107–112. [Google Scholar] [CrossRef]
Lepper, M.R.; Woolverton, M. The Wisdom of Practice: Lessons Learned from the Study of Highly Effective Tutors. In Improving Academic Achievement; Aronson, J., Ed.; Academic Press: Washington, DC, USA, 2002. [Google Scholar]
Lazic, M.; Woodruff, E. Towards a Real-Time Tutor for Deep Learning. In Proceedings of the Canadian Psychological Association’s 85th Annual National Convention, Ottawa, ON, Canada, 21–23 June 2024. [Google Scholar]
Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Amodei, D. Language Models Are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 2020, 33, 346. [Google Scholar] [CrossRef]
Zawacki-Richter, O.; Marín, V.I.; Bond, M.; Gouverneur, F. Systematic Review of Research on Artificial Intelligence Applications in Higher Education–Where Are the Educators? Int. J. Educ. Technol. High. Educ. 2019, 16, 1–27. [Google Scholar] [CrossRef]
Frijda, N.H. The Laws of Emotion. Am. Psychol. 1988, 43, 349–358. [Google Scholar] [CrossRef] [PubMed]
Donovan, M.S.; Bransford, J.D.; Pellegrino, J.W. (Eds.) How People Learn: Bridging Research and Practice; National Academy Press: Washington, DC, USA, 1999. [Google Scholar]
Chi, M.T.H.; Wylie, R. The ICAP Framework: Linking Cognitive Engagement to Active Learning Outcomes. Educ. Psychol. 2014, 49, 219–243. [Google Scholar] [CrossRef]
Graesser, A.C.; Person, N.K. Question Asking during Tutoring. Am. Educ. Res. J. 1994, 31, 104–137. [Google Scholar] [CrossRef]
Meyer, D.K.; Turner, J.C. Scaffolding Emotions in Classrooms. In Emotion in Education; Academic Press: Cambridge, MA, USA, 2007. [Google Scholar]
Walkington, C.; Bernacki, M.L. Personalization of Instruction: Design Dimensions and Implications for Cognition. J. Exp. Educ. 2018, 86, 50–68. [Google Scholar] [CrossRef]
Bloom, B.S. The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring. Educ. Res. 1984, 13, 4–16. [Google Scholar] [CrossRef]
Savery, J.R. Overview of Problem-Based Learning: Definitions and Distinctions. Interdiscip. J. Probl.-Based Learn. 2006, 1, 3. [Google Scholar] [CrossRef]
Kolodner, J.L.; Camp, P.J.; Crismond, D.; Fasse, B.; Gray, J.; Holbrook, J.; Puntambekar, S.; Ryan, M. Problem-Based Learning Meets Case-Based Reasoning in the Middle-School Science Classroom: Putting Learning by Design™ into Practice. J. Learn. Sci. 2003, 12, 495–547. [Google Scholar] [CrossRef]
Renninger, K.A.; Hidi, S. Revisiting the Conceptualization, Measurement, and Generation of Interest. Educ. Psychol. 2011, 46, 168–184. [Google Scholar] [CrossRef]
Renninger, K.A.; Hidi, S.E. Interest Development and Learning. In The Cambridge Handbook of Motivation and Learning; Cambridge University Press: Cambridge, UK, 2019; pp. 265–290. [Google Scholar] [CrossRef]
Searle, J.R. Intentionality: An Essay in the Philosophy of Mind; Cambridge University Press: Cambridge, UK, 1983. [Google Scholar]
Searle, J.R. The Rediscovery of the Mind; MIT Press: Cambridge, MA, USA, 1992. [Google Scholar]
Searle, J.R. Mind: A Brief Introduction; Oxford University Press: Oxford, UK, 2004. [Google Scholar]
Gallagher, S. How the Body Shapes the Mind; Clarendon Press: London, UK, 2006. [Google Scholar]
Shapiro, L.A. The Routledge Handbook of Embodied Cognition, 2nd ed.; Routledge: London, UK, 2024. [Google Scholar]
Fischer, M.H. The Embodied Cognition Approach: Principles and Research Questions. In Experimental Methods in Embodied Cognition; Routledge: London, UK, 2024; pp. 3–18. [Google Scholar]
Khan, S. Brave New Words: How AI Will Revolutionize Education (and Why That’s a Good Thing); ebook; Viking: New York, NY, USA, 2024. [Google Scholar]

Figure 1. Metaphorical representation of mental homeostasis.

Figure 2. Graph depicts the cyclical feature of ever-deepening understanding.

Figure 3. A symbolic depiction of the network of interconnected phases of understanding.

Figure 4. Hypothesized task engagement across valenced phases of understanding.

Figure 5. Sample facial action coding system (FACS) action units.

Figure 6. Generalized information pipeline for the Abel Tutor system.

Table 1. The five phases of understanding mapped onto the SOLO phases.

Phases of Understanding	SOLO Phases	Brief Descriptor of SOLO Phases
Nascent Understanding	Pre-structural	Little to no understanding; fragmented or irrelevant information.
Misunderstanding	Uni-structural	Basic understanding of one relevant aspect; focus on a single element without seeing the broader context.
Confusion	Multi-structural	Grasp of several relevant aspects independently; aspects treated as separate entities without integration.
Emergent Understanding	Relational	Integration of multiple aspects into a coherent structure; understanding of relationships and connections.
Deep Understanding	Extended abstract	Generalization of knowledge to new domains; application in novel contexts; generation of insights; metacognitive awareness.

Table 2. Hypothesized Emotions Associated with Each Phase of Understanding.

Phase of Understanding	Possible Emotions Manifested in Feelings
Nascent Understanding	Confusion, frustration, anxiety, boredom, apathy, overwhelm.
Misunderstanding	Surprise, false confidence, overconfidence, confusion, frustration, defensiveness, embarrassment.
Confusion	Surprise, perplexity, frustration, curiosity, uncertainty, discomfort, cognitive dissonance, motivation to resolve confusion.
Emergent Understanding	Curiosity, excitement, enthusiasm, satisfaction, pride, relief, motivation to learn more.
Deep Understanding	Enjoyment, satisfaction, fulfillment, enthusiasm, inspiration, curiosity to explore further, metacognitive awareness, self-efficacy.

Table 3. Sample tutor–student interaction.

SAMPLE TUTOR-STUDENT INTERACTION ON ABEL
Abel Tutor:	So, Kelly, you’re curious about how things move. It’s pretty fascinating stuff. Some people think that to keep something moving, you have to keep pushing it. But actually, things want to keep doing what they’re already doing—whether that’s moving or staying still—unless something else gets in the way. What do you think could get in the way of an object that’s moving?
Kelly’s response:	Maybe it stops when the push stops?
Probability measures indicate:	60% misunderstanding (holds the misconception), 40% not understanding (confusion about the concept). Valence in text: Uncertain.
Abel Tutor: Response Valence:	Curious and inviting clarification.
Abel Tutor:	It’s a common idea, but there’s a twist. Imagine you’re skating on ice and you push off. You glide smoothly without needing to keep pushing, right? This is because there’s hardly any friction—the force that usually stops things from moving. On ice, you keep moving even after you stop pushing. This is what Sir Isaac Newton discovered too. It’s not the push that keeps things moving; it’s the absence of enough friction or other forces to stop them. How does that sound?
Kelly’s response:	Oh, I see. So, if there’s no friction, things don’t stop?
Probability measures indicate:	20% misunderstanding, 80% emergent understanding. Valence in text: Intrigued.
Abel Tutor: Response Valence:	Enlightening and explanatory.
Abel Tutor:	Exactly! You’ve got it. If there’s no friction or another force, an object in motion stays in motion. This is what Newton’s First Law of Motion is about: an object will stay still or keep moving at the same speed in a straight line unless acted on by a force. Now, imagine if you’re skating and you hit a rough patch of ice or the ice turns to carpet. What do you think happens then?
Kelly’s response:	I’d probably slow down and stop.
Probability measures indicate:	90% emergent understanding, 10% deep understanding. Valence in text: Understanding.
Abel Tutor: Response Valence:	Confirmatory and reinforcing understanding.
Abel Tutor:	You got it! That slowing down is because of the force of friction. When you hit the rough patch, it’s a force that changes your motion. Without that force, like on the smooth ice, you’d just keep going. This is Newton’s way of saying that it’s not the continuous force that keeps us moving, it’s the lack of a force that stops us from moving. So, knowing this, how do you feel about objects in motion now?
Kelly’s response:	It makes sense. Things stay moving unless something stops them. It’s not about the push all the time.
Probability measures indicate:	5% emergent understanding, 95% deep understanding. Valence in text: Confident.
Abel Tutor: Response Valence:	Affirmative and encouraging exploration.
Abel Tutor:	That’s an excellent grasp of the concept, Kelly! You’ve navigated from the initial misunderstanding to a deep understanding of Newton’s First Law of Motion. Are you ready to see how this applies to more than just skating? Like how planets orbit the sun or cars move on the road?
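
To make the bookkeeping in Table 3 concrete, the short Python sketch below replays the four probability readings reported for Kelly's responses and marks the turn at which a test-confirming question might be offered. The probabilities are taken from the table; the names `TURNS`, `dominant_phase`, and `first_ready_turn`, the 0.90 cut-off, and the treatment of "not understanding (confusion about the concept)" as the confusion phase are assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# Phase probabilities reported after each of Kelly's four responses in Table 3.
# "confusion" stands in for the table's "not understanding" label (assumption).
TURNS: List[Dict[str, float]] = [
    {"misunderstanding": 0.60, "confusion": 0.40},
    {"misunderstanding": 0.20, "emergent": 0.80},
    {"emergent": 0.90, "deep": 0.10},
    {"emergent": 0.05, "deep": 0.95},
]

# Hypothetical cut-off for offering a test-confirming question.
DEEP_THRESHOLD = 0.90


def dominant_phase(probs: Dict[str, float]) -> str:
    """Return the phase with the highest reported probability."""
    return max(probs, key=probs.get)


def first_ready_turn(turns: List[Dict[str, float]],
                     threshold: float = DEEP_THRESHOLD) -> Optional[int]:
    """Return the 1-based turn where P(deep) first meets the threshold, else None."""
    for number, probs in enumerate(turns, start=1):
        if probs.get("deep", 0.0) >= threshold:
            return number
    return None


for number, probs in enumerate(TURNS, start=1):
    print(number, dominant_phase(probs))
print("ready at turn:", first_ready_turn(TURNS))
```

Under these assumptions the dominant phase moves from misunderstanding to emergent to deep understanding across the four turns, and the readiness check fires only on the final turn. A deployed system would likely require the threshold to hold over several consecutive turns before advancing the learner.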

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
