Article

The Cognitive Affective Model of Motion Capture Training: A Theoretical Framework for Enhancing Embodied Learning and Creative Skill Development in Computer Animation Design

by Xinyi Jiang 1,2,*, Zainuddin Ibrahim 1,*, Jing Jiang 2, Jiafeng Wang 2 and Gang Liu 2
1 Faculty of Art & Design, Universiti Teknologi MARA, Shah Alam 40450, Malaysia
2 Faculty of Animation, School of Arts, Anhui Xinhua University, Hefei 230088, China
* Authors to whom correspondence should be addressed.
Computers 2026, 15(2), 100; https://doi.org/10.3390/computers15020100
Submission received: 25 December 2025 / Revised: 21 January 2026 / Accepted: 23 January 2026 / Published: 2 February 2026

Abstract

Interest in and implementation of motion capture (MoCap)-based lessons in animation, creative education, and performance training have surged, leading to an increasing number of studies on this topic. While recent studies have summarized these developments, few synthesize existing findings into a theoretical framework. Building upon the Cognitive Affective Model of Immersive Learning (CAMIL), this study proposes the Cognitive Affective Model of Motion Capture Training (CAMMT) as a theoretical, research-based framework for explaining how MoCap fosters creative cognition in computer animation practice. The model identifies six affective and cognitive constructs (Control and Active Learning, Reflective Thinking, Perceptual Motor Skills, Emotional Expressive, Artistic Innovation, and Collaborative Construction) that describe how MoCap’s technological affordances of immersion and interactivity support creativity in animation practice. The findings indicate that instructional and design methods from less immersive media can be effectively adapted to MoCap environments. Although originally developed for animation education, CAMMT contributes to broader theories of creative design processes by linking cognitive, affective, and performative dimensions of embodied interaction. This study offers guidance for researchers and designers exploring creative and embodied interaction across digital performance and design contexts.

Graphical Abstract

1. Introduction

Liberating actors from the constraints of their physical appearance enables them to embody a diverse range of characters beyond their natural selves, fulfilling the ancient imperative of performance: to become somebody else [1]. In the contemporary context, the widespread accessibility of technologies such as Motion Capture (MoCap), Virtual Reality (VR), and open-source development engines has made it increasingly possible for anyone to engage in immersive virtual performances. This technological evolution presents new opportunities for computer animation education, a field traditionally rooted in 3D software manipulation and keyframe animation techniques [2]. In recent years, the number of studies exploring MoCap technology in education, creative practice, and skill training has grown steadily [3]. A literature search indicates that the number of Scopus-indexed publications on MoCap in learning, education, and animation has increased rapidly in recent years (see Figure 1). In particular, after 2015, the publication trend shows a sharp, almost exponential rise, reaching its highest level between 2023 and 2025. This surge suggests that MoCap has emerged as a central focus in animation education research, likely driven by advances in immersive technologies, digital performance practices, virtual production workflows, and a growing emphasis on practice-based and technology-enhanced learning models. Despite this progress, most studies have concentrated on technical applications, production workflows, or discrete learning outcomes, without establishing a cohesive theoretical framework to explain how MoCap technology influences creative skill development in animation design (e.g., [4,5,6]). Therefore, it is necessary to develop a research-based theoretical model to provide an understanding of MoCap-based training so that students, teachers, designers, and other stakeholders can use MoCap effectively to support embodied creative learning and performance-driven animation skills.
This study proposes the Cognitive Affective Model of Motion Capture Training (CAMMT) to provide a research-based theoretical framework for MoCap-based animation training environments. Drawing inspiration from the Cognitive Affective Model of Immersive Learning (CAMIL) [7], CAMMT synthesizes existing research on immersive virtual reality (IVR) and MoCap learning to define key technological affordances (real-time immersion and interactivity), core psychological factors (presence and agency), and essential learning constructs (Control and Active Learning, Reflective Thinking, Perceptual–Motor Skills, Emotional Expressive, Artistic Innovation, and Collaborative Construction). Before describing the model in detail, we first define MoCap in the context of animation design and education.

2. Defining MoCap in Computer Animation Design and Training

In films, animations, and video games, MoCap refers to the process of recording the movements of human actors and using that data to animate digital character models in 2D or 3D computer animation (e.g., [8,9,10]). When MoCap extends to include facial expressions, finger movements, and other subtle gestures, it is often referred to as performance capture [11]. MoCap systems include various hardware setups, such as optical tracking systems, inertial-sensor suits, depth cameras, and markerless AI-based tracking devices [12]. These systems are widely used in professional fields like film, game design, and virtual production and are now increasingly accessible for educational and creative learning purposes. A defining characteristic that differentiates MoCap-based training from traditional computer animation training is real-time performance immersion. Unlike conventional animation techniques that rely on manual keyframing or timeline editing, MoCap enables learners to use their physical bodies directly as the input device for generating animation [13]. In this sense, similar to VR, which serves as a system that enhances learning through multimedia interaction, MoCap fosters an embodied learning environment in which the learner’s body functions simultaneously as a creative tool and the subject of performance exploration. This distinctive feature shifts the animation learning process from purely technical manipulation to an experiential, body-driven creative practice.
Beyond its technical and educational utility, MoCap also operates as a creative design medium that transforms physical movement into an act of idea generation and expressive synthesis. Within this framework, the animator’s body becomes a site of embodied ideation where motion, emotion, and intention are interwoven to form creative meaning. As shown in Figure 2, the process of performing, observing, and refining one’s captured motion parallels the iterative cycles of ideation, prototyping, and reflection that characterize design creativity [14,15]. Thus, MoCap does not merely record motion data but also serves as a creative interface that bridges human performance and digital design. Through this lens, MoCap-based animation training is best understood as a platform for creative innovation, where technological affordances stimulate experimentation, esthetic exploration, and affective expression.
Although MoCap is applied across multiple industries relevant to immersive technologies, the CAMMT focuses specifically on the creative and educational context of animation training. This focus provides a concrete example of how MoCap facilitates embodied creative learning, performance-driven skill development, and movement literacy in animation education. While other media technologies (e.g., VR) also provide immersive experiences, MoCap is unique in that it integrates the learner’s physical actions as the primary creative mechanism for generating animation content, rather than treating them as a passive input for visualization. In contrast to VR, which isolates the learner within a fully simulated environment [16], MoCap connects real-world performance with virtual representation, maintaining a dynamic feedback loop between body, space, and imagination.
Another defining feature of MoCap-based creative training is performance-driven interaction, in which movement serves as both expressive language and a design method. Using real-time MoCap, the user’s motion is mapped to a virtual character, providing immediate visual feedback. Participants can observe their virtual body move in real time, either directly or through virtual mirror reflections. This moment of synchronization creates a sense of agency and creative authorship, as learners perceive themselves not only as performers but also as designers shaping the expressive dynamics of their virtual characters. This interaction is closely tied to feedback fidelity, the degree to which the captured motion accurately reflects the learner’s real-time actions in the digital environment [17].
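To make the notion of feedback fidelity concrete, the sketch below illustrates one way it could be quantified during a live session: comparing the captured joint positions of the performer against the positions rendered on the retargeted virtual character. The array layout, units, and function name are illustrative assumptions, not part of any particular MoCap SDK.

```python
import numpy as np

def feedback_fidelity(captured: np.ndarray, rendered: np.ndarray) -> dict:
    """Rough per-frame fidelity between captured and rendered motion.

    Both arrays are assumed to have shape (frames, joints, 3), holding
    joint positions in metres for the performer (captured) and the
    retargeted virtual character (rendered). Illustrative metric only,
    not a standard from the MoCap literature.
    """
    # Mean Euclidean distance per joint, averaged over all joints for each frame.
    per_frame_error = np.linalg.norm(captured - rendered, axis=-1).mean(axis=-1)
    return {
        "mean_joint_error_m": float(per_frame_error.mean()),
        "worst_frame_error_m": float(per_frame_error.max()),
    }

# Toy usage: 120 frames, 20 joints, small simulated retargeting error.
rng = np.random.default_rng(0)
captured = rng.normal(size=(120, 20, 3))
rendered = captured + rng.normal(scale=0.01, size=captured.shape)
print(feedback_fidelity(captured, rendered))
```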
In conclusion, the significance of MoCap-based animation training lies in its dual nature as both a technological and creative system. Its unique features (real-time immersion, performance-based interaction, and embodied feedback) differentiate it from traditional animation methods and open new possibilities for design creativity and innovation. MoCap enables learners to explore physical performance as a medium for creative ideation, translating human expression into digital form while deepening their understanding of motion, emotion, and storytelling. Understanding how immersion, embodiment, and feedback fidelity influence not only learning but also creative development is essential for designing training strategies and creative workflows that fully leverage MoCap’s potential to support performance-driven innovation in animation design.

3. The Theoretical Perspective of CAMMT

The CAMMT is a research-based theoretical model developed to describe the process of embodied creative learning in MoCap-based animation design. The development of this model responds to a recognized gap in the field, where scholars have highlighted the lack of theoretical frameworks to guide research, instructional design, and technology integration in performance-driven learning environments (e.g., [4,6,13]). While existing studies have demonstrated the pedagogical and technical potential of MoCap, few have examined how such embodied technologies shape creative cognition and design ideation in animation practice. CAMMT therefore extends beyond learning theory to propose a framework that explains how MoCap mediates creative thinking, expression, and innovation within training settings.
CAMMT extends the theoretical perspective of CAMIL by arguing that learning effectiveness in MoCap environments results from media–method interaction rather than from technology alone, as summarized in Table 1. The empirical lineage of CAMIL is grounded in VR learning research: building on the model proposed by Salzman, Dede, Loftin, and Chen [18], structural equation modeling (SEM) studies in desktop VR show that the effects of VR features on learning outcomes are mediated by non-cognitive variables such as motivation, presence, and perceived usability [19]. Subsequent work further specified these mechanisms by highlighting immersion-driven affective and cognitive pathways, informed by control-value theory [20,21], and by modeling how VR features relate to knowledge acquisition, motivation, and self-efficacy through affective and cognitive mediators, with measurable pre- and post-knowledge gains [22]. In addition, CAMMT draws on the Cognitive Affective Theory of Learning with Media (CATLM) and related accounts of interactive multimodal learning [23], which emphasize cognitive load, motivation, affect, and metacognitive self-regulation as key determinants of learning in interactive environments, alongside design principles for optimizing instruction. In this view, CATLM provides the general cognitive–affective foundation for learning with interactive media, while CAMIL advances the IVR perspective by specifying presence and agency as central psychological affordances arising from immersive technologies. Hence, these foundations support CAMMT’s MoCap-specific shift toward performance-driven embodied creation, in which MoCap affordances and instructional methods jointly shape learning and creative development.
CAMMT inherits the core assumptions of these models and reconciles these perspectives by proposing that MoCap’s effectiveness in animation education arises from the interaction between its technological affordances and the instructional methods employed. In particular, CAMMT expands the principle of media–method interaction into the creative domain, suggesting that embodied interaction in MoCap systems enables design-oriented cognition in which learners engage in processes similar to design ideation and reflection-in-action [14]. Within this framework, the learner is not merely absorbing knowledge but actively designing through motion, iteratively generating, testing, and refining expressive ideas through physical performance. This process reflects the iterative creative cycle described in design creativity research, encompassing stages of exploration, reflection, and synthesis [15,24,25].
Unlike IVR learning, MoCap-based creative learning positions the learner’s physical body as the primary medium of design thinking, where motion operates as both representation and exploration—analogous to sketching or prototyping in design practice [26,27]. Through embodied enactment and rapid feedback, learners externalize and refine ideas, emotions, and expressive intent. Accordingly, CAMMT foregrounds MoCap-relevant creative mechanisms (e.g., Perceptual Motor skills, Emotional Expressive, Artistic Innovation, and Collaborative Construction) as core support for performance-driven animation skill development. These mechanisms map onto widely used creativity dimensions—fluency, flexibility, originality, and elaboration—in established creativity assessment models [28,29].
CAMMT adopts a media–method interaction perspective in which MoCap affordances (immersion, interactivity, and representational fidelity) shape learning through psychological affordances (presence and agency), which, in turn, activate affective/cognitive processes and design-oriented creativity during motion-based creation. Thus, CAMMT functions as a context-specific extension of CATLM or CAMIL that explicitly accounts for embodied, performative, and creative dimensions of animation design.
In the following sections, the relationships between the main variables of CAMMT (illustrated in Figure 3) are described. First, the discussion focuses on how technological factors in MoCap systems, particularly interactivity, immersion, and representational fidelity, influence the psychological affordances of presence and agency. Next, it examines how these psychological affordances affect six mediating affective and cognitive factors: Control and Active Learning, Reflective Thinking, Perceptual Motor Skills, Emotional Expressive, Artistic Innovation, and Collaborative Construction. Finally, it explains how these factors contribute to important learning outcomes, including the development of procedural animation skills, the acquisition of movement literacy, the enhancement of creative expression, and the transfer of knowledge across animation design contexts. This study concludes with a discussion of the implications of CAMMT for future research and instructional design in animation education. Furthermore, the model recognizes that external factors, such as learners’ prior experience, technological familiarity, and collaborative learning settings, may influence CAMMT’s operation and should be considered in future empirical studies.

3.1. What Factors Lead to Presence in CAMMT?

In the context of CAMMT, presence refers to the psychological state in which learners experience their physical bodies as fully engaged and expressive agents within the animation creation process. This form of presence aligns with Lee’s [30] concept of physical presence, the perception that virtual or mediated actions are experienced as real, whether through sensory or cognitive engagement. However, CAMMT expands this definition to include a creative–performative dimension, in which bodily engagement becomes a vehicle for imaginative exploration and design ideation. Unlike traditional immersive VR, which relies primarily on visual and spatial immersion, MoCap-based training environments cultivate embodied presence through full-body motion tracking, real-time visual feedback, and expressive control over animated characters. Here, presence is not only a perceptual experience but also a creative condition that enables learners to prototype ideas through movement and refine them through iterative bodily feedback. Presence in MoCap is not purely technological; it is co-determined by learner characteristics such as attentional focus, bodily awareness, and prior experience with movement-based or performance-driven tasks [31].
CAMMT identifies three key technological factors that contribute to presence in MoCap-based creative learning. First, performance immersion describes the learner’s capacity to engage in live, body-driven input that is immediately visualized as animated performance, transforming the body into a design interface for ideation and expressive testing. Second, interactive motion control refers to the degree, immediacy, and precision with which learners influence animated outcomes through physical gestures, consistent with Witmer and Singer’s [32] emphasis on control as central to immersive presence. In creative terms, this control supports a sense of authorship and improvisation within the design process. Third, representational fidelity, adapted from Dalgarno and Lee [33], reflects the extent to which expressive movements are captured and rendered with accuracy and nuance. High fidelity enhances the learner’s sense of expressive authenticity, allowing motion design to function as a medium for creative intention rather than mechanical replication. Recent technological advances, such as the integration of Augmented Reality (AR) and LED wall systems into MoCap pipelines [34,35], have further increased spatial and social presence, extending the learner’s sense of creative embodiment in hybrid physical–digital spaces.
In summary, CAMMT identifies immersion (positive relation; Path 1 in Figure 3), interactivity (positive relation; Path 2), and representational fidelity (positive relation; Path 4) as the primary determinants of presence in MoCap environments. These factors correspond to CAMIL’s constructs of immersion, control, and fidelity but are reconceptualized here as embodied mechanisms that foster design creativity and expressive innovation.

3.2. What Factors Lead to Agency in CAMMT?

In the context of CAMMT, agency refers to the learner’s subjective experience of intentionally generating and controlling animated motion through physical performance. Drawing on Moore and Fletcher [36], agency is defined as the feeling of initiating and influencing actions, a sense of authorship that is particularly salient in MoCap environments where the performer’s movements directly shape the animated outcome. In this framework, agency extends beyond technical manipulation to encompass creative authorship: the learner experiences themselves not only as an operator of a system but as the originator of expressive motion. As Johnson-Glenberg [37] notes, the most critical determinant of agency in interactive environments is action control—the learner’s ability to manipulate elements in the system responsively and purposefully. By contrast, in low-agency or passive settings, where learners can observe but not influence motion, this sense of authorship diminishes, resulting in reduced creative engagement.
CAMMT emphasizes that real-time interactivity and representational fidelity are the primary technological factors that foster agency in MoCap-based creative learning. Agency arises when a close match is achieved between a user’s motor intention and the corresponding visual feedback, a mechanism supported by forward modeling processes in the central nervous system [38,39]. When movement and feedback are precisely synchronized, learners experience their virtual performance as an authentic extension of their bodily action. This correspondence transforms the digital interface into a responsive creative partner, enabling learners to experiment, improvise, and iterate in real time. In the CAMMT framework, interactive control (positive relation; Path 3) and representational fidelity (positive relation; Path 5) together enable this embodied authorship, turning MoCap into an instrument for design exploration rather than a tool for mechanical recording. The immediacy and accuracy with which physical movement is translated into animated performance strengthen the learner’s sense of being a creator who intentionally shapes expressive form within an embodied design process.
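Complementing the feedback-fidelity sketch in Section 2, the hypothetical snippet below estimates the temporal lag between a learner’s movement and the on-screen feedback by cross-correlating two motion-energy signals; a lag close to zero would be consistent with the tight movement–feedback synchronization that CAMMT associates with agency. The signal definitions and frame rate are assumptions for illustration only.

```python
import numpy as np

def feedback_lag_ms(body_speed: np.ndarray, avatar_speed: np.ndarray, fps: float = 60.0) -> float:
    """Estimate feedback latency (ms) via cross-correlation of motion-energy signals.

    body_speed and avatar_speed are assumed to be 1-D arrays of overall movement
    speed per frame (e.g., summed joint speeds) sampled at the same frame rate.
    """
    body = body_speed - body_speed.mean()
    avatar = avatar_speed - avatar_speed.mean()
    corr = np.correlate(avatar, body, mode="full")
    lag_frames = corr.argmax() - (len(body) - 1)  # positive = avatar lags behind the body
    return 1000.0 * lag_frames / fps

# Toy usage: avatar feedback delayed by 3 frames (about 50 ms at 60 fps).
rng = np.random.default_rng(1)
body = np.abs(rng.normal(size=600))
avatar = np.roll(body, 3)
print(f"estimated lag: {feedback_lag_ms(body, avatar):.1f} ms")
```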
Moreover, this creative sense of agency provides motivational and cognitive grounding for the six mediating constructs identified in CAMMT. As learners experience their bodies as instruments of design, these mediators reinforce one another, linking intentional motion control with reflective analysis and creative improvisation. Agency, therefore, functions as the connective force that translates technological affordances into meaningful creative action. It not only enhances self-efficacy and engagement but also supports the emergence of innovation through iterative cycles of embodied experimentation and reflection. The following section explains how presence and agency activate CAMMT’s six mediating constructs, which together form the embodied pathway of creative training.

3.3. How Do Presence and Agency Mediate CAMMT’s Six Constructs in Fostering Embodied Creativity and Design Innovation?

3.3.1. Control and Active Learning

Control and Active Learning describe how learners exercise intentional control over tasks and engage in purposeful exploration, in which embodied action becomes a vehicle for creative understanding and design ideation in animation practice. This stance aligns with studio and active pedagogies such as Studio Courses, SCALE-UP, and TEAL, which combine performing, critiquing, and iterating to support inquiry and concept development [40]. MoCap heightens engagement by allowing learners to manipulate content through their own movement, turning learning by doing into learning by moving; recent work across Human–Computer Interaction (HCI) and education shows that Mixed Reality (MR) and MoCap authoring increase interactivity, collaboration, and creative exploration [41,42,43]. In CAMMT, the body functions as the interface: learners prototype motion alternatives in real time and see outcomes instantly, which supports rapid cycles of exploration, critique, and revision [44]. Accordingly, CAMMT posits that presence and agency jointly cultivate a state of control and sustain active learning during MoCap-based creation (Paths 6 and 7). Theoretically, active learning aligns with constructivist and immersive-learning accounts that emphasize learner-centered engagement and contextualized problem solving; in related VR research, immersion and interactivity shape meaningful experiences that reorganize prior schemas [45]. CAMMT adopts these insights and specifies an embodied mechanism: tight action–feedback loops that couple movement, perception, and intention during creative performance. Importantly, this construct captures process-level regulation and exploratory engagement (i.e., how learners work), rather than the affective clarity or novelty of the resulting motion.
Meanwhile, agency plays a central role in enhancing learner control, as it allows users to generate motion and experiment with character performance in real time through bodily input. Johnson-Glenberg and Megowan-Romanowicz [46], in their study on gesture and MoCap in physics education, demonstrated that groups engaged in embodied interaction learned significantly more than those using traditional symbolic and text-based input. When assessed using a gesture-supported Wacom tablet, the active gesture-based groups outperformed the keyboard group by a significant margin. Whereas traditional animation interfaces mediate control through tools such as a mouse, a timeline, or keyframes, MoCap provides direct, embodied control over animation. This immediacy reinforces learners’ active role in creative decision-making, encouraging performance revision and iterative exploration. In summary, Control and Active Learning enable learners to engage dynamically with content and sustain purposeful iteration, ultimately translating these gains into creative practice (Path 18).

3.3.2. Reflective Thinking

Reflective Thinking, first conceptualized by Dewey [47] and later expanded by Habermas [48], refers to the careful examination of experiences and issues of personal or professional relevance [49]. In MoCap-based animation training, Reflective Thinking is enacted through cycles of performing, viewing, and revising, so that observation and action inform one another in short, iterative bursts of practice. In design cognition terms, such iteration aligns with the reflective cycles designers use to externalize ideas, evaluate representations, and refine intent [14]. In this context, Reflective Thinking is supported through enactive engagement. Starcic et al. [50] suggest that MoCap facilitates enactive learning by linking physical body movements to psychological responses and cognitive states. Within the CAMMT framework, reflection operates as a design process in which motion serves as a prototype that can be tested and incrementally refined.
The sense of presence afforded by immersive MoCap experiences enhances Reflective Thinking by situating learners in context-rich environments, where the impact of their actions is immediately visible and open to analysis. When learners feel physically located within the performance space, immediate and intuitive feedback on movement quality, timing, and expression becomes available, enabling quick comparison between intention and observed effect. This embodied feedback acts as a catalyst for reflection, prompting learners to analyze their own bodily execution and artistic intent in relation to the animation output [51]. As a result, learners can identify mismatches, specify targeted adjustments, and iterate toward expressive coherence.
Agency supports metacognitive regulation by granting learners intentional control over their animated performance. Mou [13] compared MoCap with keyframing and found that students using MoCap reflected more on natural motion and storytelling fidelity, indicating heightened cognitive awareness. This interactivity encourages learners to question, test, and revise movement decisions, which strengthens metacognitive awareness and learning flexibility (e.g., [6,52]). In CAMMT, Reflective Thinking is conceptualized as a cognitive learning construct that is positively influenced by both presence and agency (Paths 8 and 9). When learners perceive themselves as embodied and empowered performers who can both sense their actions and shape them deliberately, they are more likely to engage in reflective practice, monitor performance with clear criteria, and iteratively refine creative strategies over time (Path 19).

3.3.3. Perceptual Motor Skills

Perceptual Motor Skills can be defined as the process of acquiring and refining coordinated movement abilities through the integration of sensory perception and motor execution [53]. Fundamental motor skills serve as the foundation for sports and physical activities, and their development involves the cultivation of timing, rhythm, spatial awareness, and proprioception [54]. These components are also essential in animation performance, where expressive movement must be both physically grounded and creatively intentional to achieve authenticity and believability [55]. This perspective is further supported by empirical research. Several studies (e.g., [56,57,58]) have explored the integration of three-dimensional motion tracking with tactile feedback, consistently highlighting the inherent advantages of MoCap in promoting motor skill development and providing real-time performance feedback.
Presence plays a crucial role in Perceptual Motor training by anchoring the learner’s awareness within their body during animation tasks. When learners experience a high degree of embodied presence, they become more attuned to the kinesthetic qualities of their movements, including how gestures feel, flow, and align with narrative or character intent. This situated awareness allows learners to fine-tune control over speed, balance, and spatial orientation, thereby improving the realism and expressivity of the animated output [59]. Agency reinforces the connection between perception and motion through immediate visual feedback; this sensorimotor feedback loop, as emphasized by Kiiski et al. [51] and Maraffi [44], strengthens calibration between intended actions and perceived effects over time. CAMMT thus proposes a positive relationship between presence, agency, and perceptual-motor learning (Paths 10 and 11). When learners are fully engaged both physically and cognitively in the animation task and can simultaneously experience their own movement while observing its effects, they are more likely to develop precise control, a deeper understanding of movement, and the embodied animation skills that are essential for expressive performance (Path 20).

3.3.4. Emotional Expressive

Emotional Expressive refers to the learner’s ability to convey affective states, such as joy, tension, or fear, through physical movement and character performance. Drawing on design research, Ståhl [60] emphasized that inspiration from bodily movements proved highly effective for developing emotionally expressive systems. Similarly, Kiiski et al. [51] argue that, in the context of animation, Emotional Expressive plays a crucial role in transforming motion into meaning, enabling learners to imbue characters with personality and emotional nuance. Within the CAMMT framework, Emotional Expressive is thus understood not merely as an artistic capability but as a Cognitive Affective learning construct, grounded in embodied experience and shaped through intentional control. Conceptually, Emotional Expressive emphasizes affective clarity, nuance, and congruence in motion, rather than the originality of movement solutions.
Presence intensifies emotional engagement by making performance feel immediate and personally meaningful and by deepening learners’ connection to the animated character or scene; Bennett and Kruse [4] and Maraffi [44] suggest this fosters greater emotional investment and enhances the authenticity of emotional expression. In parallel, agency strengthens expressive animation by empowering learners to experiment with timing, gesture, and exaggeration through real-time bodily input. Crane and Gross [61] and Ennis et al. [62] found that when learners can control subtle expressive elements, such as a character’s posture shift, head tilt, or breath pattern, they gain the ability to fine-tune emotional delivery. This aligns with research on embodied cognition and emotion, which suggests that emotional expression is rooted in sensorimotor processes and can be refined through bodily feedback [63]. Thus, CAMMT proposes a positive relationship among presence, agency, and the development of Emotional Expressive (Paths 12 and 13). The capacity to physically feel, generate, and refine emotional performance through MoCap allows learners to engage more meaningfully in the animation process. In this way, Emotional Expressive becomes both a form of embodied training and an artistic output (Path 21).

3.3.5. Artistic Innovation

Artistic Innovation is a vital yet complex factor to assess in the design of animation and training [64]. It encompasses the learner’s capacity to explore novel forms of movement, stylistic variations, and expressive techniques throughout the animation process. As MoCap systems become more affordable and accessible, opportunities for creative experimentation have expanded across the animation industry [2]. Consistent with the standard definition of creativity, Artistic Innovation in animation must be both original and narratively fitting for character and story [25]. Within CAMMT, Artistic Innovation refers to novelty and appropriateness in movement solutions (what is created), while Emotional Expressive refers to affective clarity and nuance in performance (how emotion is communicated). Hence, Artistic Innovation is both an outcome and a developmental process that grows through embodied action and immediate feedback [65,66].
Presence frames MoCap work as lived performance, encouraging risk-taking, improvisation, and alternative interpretations; evidence indicates that feeling inside the animation space increases the likelihood of creative risk and expressive variation [5]. In Boden’s terms, presence widens the perceptual search within a conceptual space of possible movements, enabling combinational and exploratory creativity before consolidating solutions that satisfy narrative constraints [67]. This immersion reframes the task from technical execution to creative expression enacted through motion. Agency complements this process by enabling real-time experimentation with stylization and movement variation. With intentional control and immediate visual feedback, learners shape movement rapidly. This freedom fosters authorship and encourages departure from standard movement templates toward original performance styles [68]. Agency thus operationalizes the generative side of the geneplore cycle by supporting the rapid production and evaluation of preinventive motion variants, facilitating shifts from exploratory to transformational moves within the movement design space. Accordingly, CAMMT proposes positive relations between presence, agency, and the development of Artistic Innovation and expressive performance (Paths 14 and 15). The embodied control afforded by MoCap, together with immersive and responsive engagement, supports creation that is technically reliable, esthetically meaningful, and personally distinctive (Path 22).
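The departure from standard movement templates described above could, in principle, be screened automatically. The sketch below scores a captured clip’s novelty as its minimum dynamic-time-warping distance from a small library of reference clips; the joint-angle representation and the use of this distance as a rough originality proxy are illustrative assumptions, not an established assessment instrument.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Plain dynamic time warping between two motion clips.

    a and b are (frames, features) arrays, e.g. joint angles per frame.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def novelty_score(clip: np.ndarray, reference_library: list) -> float:
    """Distance to the nearest reference clip: larger = further from known templates."""
    return min(dtw_distance(clip, ref) for ref in reference_library)

# Toy usage: two reference clips and one exaggerated variation of the first.
rng = np.random.default_rng(2)
references = [rng.normal(size=(90, 12)) for _ in range(2)]
variation = references[0] * 1.5 + rng.normal(scale=0.2, size=(90, 12))
print(f"novelty vs. library: {novelty_score(variation, references):.2f}")
```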

3.3.6. Collaborative Construction

Collaborative Construction is defined as the social process through which learners engage in shared inquiry, co-create meaning, and build collective understanding via interaction and communication [69]. Growing evidence suggests that technology-based training environments are most effective when complemented by face-to-face interaction, which can enhance collaboration and knowledge sharing [70]. Cila [71] emphasizes that immersive environments support collaboration by enabling shared experiences among users, thanks to the realism, interactivity, and flexibility that are difficult to achieve with traditional media. Empirical studies also support the use of VR and AR to build domain-specific collaborative spaces (e.g., medicine, architecture, and engineering) [72,73,74,75]. Johnson-Glenberg et al. [76] examined Embodied Mixed Reality Learning Environments (EMRELEs) and attributed learning gains to two collaborative learning perspectives: (i) social cohesion, where group success and shared motivation drive outcomes, and (ii) cognitive developmentalism, where more advanced learners explain, model behaviors, and respond to prompts to support peer learning and knowledge construction.
Likewise, in animation design, particularly within MoCap workflows, collaboration is also essential [77]. MoCap-based production typically involves a range of interdependent roles, including performers, directors, designers, and technical operators, all of whom must coordinate closely to achieve outcomes. Within the CAMMT framework, collaborative learning is understood not only as a form of interpersonal engagement but also as a creative co-construction process supported by embodied participation and real-time responsiveness. When learners feel situated and visible within a shared creative environment, such as a live MoCap session, they become more attuned to others’ actions, intentions, and feedback. In multi-user MoCap settings, presence helps learners attend to others’ actions and feedback and synchronize performance (e.g., Manaf et al. [59]). Agency supports collaboration by giving each learner a distinct controllable role, encouraging active participation in team interactions [41].
Ultimately, real-time interactivity allows teams to iterate together, give and receive feedback, and adjust their performances based on collective decisions, fostering a sense of joint authorship over the creative outcome. CAMMT thus proposes a positive relationship between presence, agency, and collaborative knowledge construction (Paths 16 and 17). In embodied, performance-driven settings, collaboration functions as collective ideation that feeds co-creation, converting individual movement literacy into design innovation at the team level and strengthening both creative quality and pedagogical value (Path 23).

3.4. Validation and Verification of CAMMT

As a theory-building framework, CAMMT is intended to be empirically testable and falsifiable rather than treated as a purely descriptive metaphor. In response to calls for evidence-based applicability, this section clarifies how the model can be validated (i.e., whether the proposed constructs and relationships represent real MoCap learning processes) and verified (i.e., whether a given MoCap training implementation instantiates the model’s assumptions and measurable variables), as summarized in Table 2.

3.4.1. Operationalization Roadmap for Classroom V&V

A practical V&V strategy for CAMMT can combine brief self-reports, behavioral or system logs, and artifact/performance evaluations, enabling triangulation without imposing an excessive burden on instructors or students, as shown in Table 3.
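As one concrete way to assemble such triangulated evidence, the sketch below bundles brief self-reports, system logs, and artifact ratings into a single per-learner record; the field names and scales are placeholders that instructors or researchers would replace with their own instruments.

```python
from dataclasses import dataclass, field

@dataclass
class LearnerRecord:
    """One learner's triangulated V&V data for a MoCap training unit (illustrative fields)."""
    learner_id: str
    # Brief self-reports (e.g., 1-7 Likert items for presence and agency).
    presence: float
    agency: float
    # Behavioral/system logs captured automatically during the session.
    retake_count: int
    time_in_capture_min: float
    # Artifact/performance evaluation by an instructor or rater panel.
    rubric_scores: dict = field(default_factory=dict)  # e.g., {"timing": 4, "expressiveness": 3}

record = LearnerRecord(
    learner_id="S01",
    presence=5.5,
    agency=6.0,
    retake_count=4,
    time_in_capture_min=18.5,
    rubric_scores={"timing": 4, "expressiveness": 3, "originality": 5},
)
print(record)
```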

3.4.2. Suggested Empirical Designs for Testing CAMMT

CAMMT can be evaluated using (i) classroom-based pilot studies (pre-post within a MoCap module), (ii) comparative designs (e.g., real-time vs. delayed feedback; higher vs. lower representational fidelity), and/or (iii) SEM/path analysis to estimate the full structure (technology affordances → presence/agency → mediators → outcomes). Mixed-method designs are recommended to capture both measurable learning gains and creative/performative quality.
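For the SEM/path-analysis option, the hypothesized structure can be written directly in lavaan-style syntax, as in the hedged sketch below using the Python package semopy. The variable names, the hypothetical data file, and the reduction of the six mediators to two are illustrative simplifications rather than a prescribed measurement model.

```python
import pandas as pd
from semopy import Model

# Simplified CAMMT path structure: affordances -> presence/agency -> two of the
# six mediators -> a creative-outcome score. Variable names are placeholders
# for whatever instruments and logs a given study actually uses.
desc = """
presence ~ immersion + interactivity + fidelity
agency ~ interactivity + fidelity
reflective_thinking ~ presence + agency
artistic_innovation ~ presence + agency
creative_outcome ~ reflective_thinking + artistic_innovation
"""

# df holds one row per learner with the observed/aggregated variables above.
df = pd.read_csv("cammt_pilot_data.csv")  # hypothetical data file
model = Model(desc)
model.fit(df)
print(model.inspect())  # path estimates, standard errors, p-values
```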

3.5. Case Illustration: Applying CAMMT in a Character Performance Module

This case illustration outlines a classroom/studio procedure for applying CAMMT to a MoCap-based character performance module designed around Disney’s principles of animation (e.g., staging, anticipation, timing, arcs, exaggeration, and appeal). The aim is not to report new empirical results, but to provide an actionable example of how instructors can structure MoCap lessons to elicit the learning mechanisms described by CAMMT.
As shown in Figure 4, the illustrative module follows a typical pre-, during-, and post-training structure. First, a pre-test can be used to establish baseline knowledge of the 12 principles and baseline performance quality (e.g., a short acting beat task). Students then complete an orientation session covering the MoCap pipeline, role assignment (performer/operator/director), safety, calibration, and basic retargeting. The core of the module consists of several training units, each targeting a subset of the 12 principles. Each unit repeats the MoCap training cycle (Figure 2): brief → rehearse → capture → real-time mapping → playback/critique → retake. This design intentionally strengthens learners’ perceived presence and agency by maintaining a tight action–feedback loop. In CAMMT terms, the technology affordances (immersion, interactivity, representational fidelity; Paths 1–5) are expected to support presence/agency, which, in turn, activate the six mediating constructs (Paths 6–17), such as Perceptual–Motor Skills, Emotional Expressive, and Reflective Thinking. Post-test assessment can then capture learning and creative outcomes (Paths 18–23) using a combination of short knowledge checks, performance rubrics, portfolio artifacts, and (optionally) a transfer task in a new character brief or genre constraint.
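The repeating unit cycle in Figure 2 (brief → rehearse → capture → real-time mapping → playback/critique → retake) can also be expressed as a minimal session script that records what each take produced, which is one way the pre/post comparisons described above could obtain their per-unit data; the stage labels, rubric scale, and logging fields below are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One pass of the Figure 2 cycle; each string stands for a studio activity.
CYCLE = ("brief", "rehearse", "capture", "real-time mapping", "playback/critique")

@dataclass
class TakeLog:
    principle: str       # animation principle the unit targets, e.g. "anticipation"
    take_number: int
    rubric_score: float  # instructor/peer rating of the captured take
    critique: str        # notes carried into the next retake

def run_unit(principle: str,
             rate_take: Callable[[int], Tuple[float, str]],
             max_takes: int = 3,
             target: float = 4.0) -> List[TakeLog]:
    """Repeat the capture cycle (with retakes) until the rubric target is met."""
    logs: List[TakeLog] = []
    for take in range(1, max_takes + 1):
        for activity in CYCLE:
            pass  # in the studio, each activity happens here in order
        score, notes = rate_take(take)
        logs.append(TakeLog(principle, take, score, notes))
        if score >= target:
            break  # target reached, no further retake needed
    return logs

# Toy usage: a canned rater stands in for live instructor critique.
canned = {1: (3.0, "telegraph the jump earlier"), 2: (4.5, "clear anticipation, readable arc")}
print(run_unit("anticipation", lambda take: canned[take]))
```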

4. What Are the Creative and Cognitive Outcomes Included in the CAMMT?

While traditional education models categorize learning outcomes as factual, conceptual, procedural, and transferable knowledge [78], CAMMT reinterprets these domains through the lens of embodied design creativity in MoCap-based animation practice. For terminological clarity, CAMMT distinguishes creativity-related language at two analytical levels: (i) process-level mechanisms enacted during training, including ideation/exploration and co-creation (captured primarily by Artistic Innovation and Collaborative Construction), expressive performance (captured by Emotional Expressive), and motor nuance/performance quality (captured by Perceptual–Motor Skills); and (ii) outcome-level creativity, operationalized here as adaptive innovation/transfer, namely applying learned motion principles and creative strategies to new character briefs, genres, or pipelines. Although these elements often co-occur in MoCap sessions, CAMMT treats them as conceptually distinct constructs and outcomes within the model to avoid conflating ideation, expressiveness, motor nuance, collaboration, and transfer. Thus, CAMMT identifies four foundational cognitive outcomes that collectively scaffold creative development: technical literacy (factual), conceptual understanding (design logic), procedural mastery (creative technique), and adaptive innovation (transfer). These stages are not linear but cyclical, mirroring the creative process of understanding–enacting–reflecting–reapplying that underpins animation design and digital performance.
Technical literacy refers to a learner’s grasp of the tools, systems, and terminology that enable creative manipulation of MoCap technologies, such as rig calibration, character setup, or animation pipeline logic [4,79]. Conceptual understanding, by contrast, involves integrating motion logic, narrative intent, and animation principles [13]. In MoCap-based environments, this knowledge is formed through embodied experimentation: learners test theoretical concepts such as gestures or anticipation through their own movement, linking cognitive abstraction to physical realization. Studies have shown that immersive learning does not simply replace conceptual reasoning but deepens it through embodied feedback and visual immediacy [80]. Within CAMMT, this synthesis of factual and conceptual awareness provides the cognitive foundation for creative performance, enabling learners to reason about movement and emotion through embodied cognition.
Procedural mastery and adaptive innovation extend MoCap learning from technical execution to creative design practice. Within the CAMMT framework, procedural knowledge involves iterative, hands-on processes such as rehearsing choreography, synchronizing data, and refining expressive timing through continuous feedback [44,81]. These embodied techniques cultivate sensitivity to rhythm, flow, and spatial design, elements that underpin esthetic composition and performance quality in animation. Adaptive innovation, in turn, captures the learner’s capacity to transfer embodied insights across contexts, applying expressive or narrative strategies developed in MoCap sessions to new story genres, production pipelines, or media forms [82]. This transfer represents creative generalization: the transformation of embodied experience into flexible design capability.
Beyond these procedural and adaptive outcomes, MoCap-based animation training also fosters distinct creative and performative capacities. Because the learner’s physical presence serves as both a tool and an expressive medium, MoCap transforms training into a process of creative authorship. Learners design gestures, moods, and character personas through bodily exploration, developing expressive timing, stylized motion, and performance nuance essential to storytelling and emotional resonance [4,59]. Studies further demonstrate that collaborative creativity, including ensemble coordination, shared direction, and embodied co-design, emerges as a recurring feature of MoCap education, reinforcing the social dimension of creative practice [41,83]. Taken together, these outcomes illustrate how cognition, motion, and creativity converge within CAMMT. MoCap functions not merely as a training technology but as a catalyst for ideation and innovation, integrating affective, cognitive, and sensorimotor processes into a unified model of embodied creative learning.

5. What Are the Implications for Future Research Based on CAMMT?

The CAMMT offers a performance-centered lens for understanding how learning and practice unfold in MoCap-based animation training. Building on CAMIL, CAMMT shifts the research emphasis from comparing media types to investigating the interaction between MoCap’s technological affordances (real-time immersion and interactivity) and the instructional methods employed. Future research should focus on how, specifically, presence and agency can be intentionally supported through instructional strategies to improve learning outcomes in animation training. CAMMT also prompts researchers to move beyond the novelty or “wow” effect associated with embodied media like MoCap. In general, emerging technologies are said to progress through different levels of hype (i.e., expectations surrounding a technology over time from its initial launch [84]). While novelty may briefly heighten attention and motivation, sustainable practice depends on intentional pedagogical design, not merely on technological sophistication. This implies that future research should control for prior experience with MoCap systems, ensuring that performance gains reflect instructional effectiveness rather than transient excitement. Just as CAMIL cautions against uncritical enthusiasm for immersive VR, we recognize that motion performance alone is insufficient without structured guidance, reflection, and feedback.
Although CAMMT is rooted in empirical findings and theory, few studies have explicitly tested the full set of pathways it proposes. Further research is needed to investigate how presence and agency lead to learning through affective and cognitive mechanisms such as Emotional Expressive, Reflective Thinking, embodiment, and Artistic Innovation. These constructs are central to animation learning yet underexplored in empirical studies. Moreover, more attention is required to understand when immersive embodiment enhances training and when it might overload learners or distract from conceptual clarity, because CAMMT did not initially assume that the affordances provided by MoCap training might negatively affect the predicted mediating constructs and, in turn, the training outcomes. CAMMT also distinguishes between factual, conceptual, and procedural knowledge and transfer, yet most MoCap learning studies fail to categorize learning outcomes with this level of precision. Future research should apply knowledge taxonomies to distinguish outcome types and explore which learning constructs are most strongly mediated by embodied interaction. The transfer of learning, particularly across narrative genres, performance styles, or production pipelines, is especially relevant, given the creative design skill demands in animation education. However, because MoCap training is highly situated and context-specific, future studies must also address how learners generalize skills acquired through MoCap to unfamiliar tasks or media platforms.
It is also worth noting that one of the primary motivations for developing the CAMMT was to address a widely recognized gap in animation design and education, namely the lack of a theoretical framework to guide immersive creative learning and performance-based training modules (e.g., [6,13,85,86]). However, the model’s implications extend well beyond instructional design. In addition to its educational focus, CAMMT can be situated within the broader discourse of design creativity and innovation. In this view, presence and agency are not only psychological affordances that shape learning but also catalysts for embodied creative cognition. By positioning learners as performers and creators who externalize ideas through physical enactment, MoCap becomes a medium for creative ideation, reflection, and synthesis. Future research could explore how these embodied interactions support the generative and evaluative phases of the creative process, comparable to sketching or prototyping in traditional design practice.
While the model is grounded in animation education, its theoretical principles are broadly applicable to other performance-centered design disciplines, such as music, drama, dance, and interactive media, where embodied practice, expressive performance, and Artistic Innovation are essential to creative development. Future studies should therefore extend the theoretical reach of CAMMT to examine how embodied interaction and affective engagement foster creativity across design contexts. In doing so, CAMMT may contribute to a more comprehensive understanding of how MoCap and similar technologies function not only as learning tools but also as creative systems that enable the design and performance of new forms of human–technology collaboration.

6. Important External Factors That Influence the CAMMT

Although not explicitly represented in the CAMMT framework, several external factors critically shape learners’ experience of MoCap-based creative training. These factors include usability, social influences, and individual learner characteristics such as age, personality traits, spatial ability, and prior experience with immersive systems. Recognizing these contextual moderators is essential for understanding variability in MoCap-based creative performance and innovation outcomes.
Usability remains foundational in ensuring meaningful engagement with MoCap systems. In the context of animation design education, usability refers to how efficiently and intuitively learners can employ capture technologies to achieve expressive and technical goals. System complexity, calibration errors, or unintuitive interfaces can disrupt presence and agency, which are two core constructs in CAMMT, by impeding smooth interaction or limiting expressive control [87]. When learners struggle with system friction rather than creative exploration, the training focus shifts from ideation to error management. Therefore, MoCap usability should be approached not merely as a technical concern but as a dimension of human-centered creative system design, directly influencing esthetic expression and cognitive flow. Social and institutional factors further shape learning dynamics. The Technology Acceptance Model [88] highlights how perceived usefulness, peer influence, and organizational support determine technology adoption. In collaborative design education, shared authorship and peer feedback reinforce social presence, co-creation, and iterative reflection, all of which are essential to design innovation.
Individual differences also moderate MoCap-based training. Younger learners often demonstrate greater openness to immersive technologies [89], while spatial ability influences how effectively individuals manipulate and interpret motion data in 3D environments [90]. Conversely, factors such as physical fatigue, motion discomfort, or body-image anxiety may reduce engagement by limiting embodied participation. Such differences underscore the need for inclusive, ergonomic design that accommodates diverse physical and psychological profiles, thereby supporting equitable access to creative education.
In sum, these contextual moderators reveal that creative training with MoCap extends beyond system design or pedagogy, as it depends on the holistic integration of human, social, and environmental factors. CAMMT thus encourages future research to consider usability, collaboration, and learner diversity as essential dimensions of human-centered creative learning ecosystems, where embodied interaction and design cognition mutually reinforce innovation across artistic and educational domains.

7. Conclusions

While creativity includes both process and product, CAMMT highlights their interdependence by addressing how embodied MoCap training fosters both creative development and innovative outcomes in animation research. The model explains how MoCap environments support the acquisition of factual, conceptual, and procedural knowledge while also fostering transferable and creative skills. By extending the earlier CAMIL framework, CAMMT addresses the distinctive characteristics of embodied creative practice in animation design, where the learner’s physical body serves simultaneously as an instrument of action and a medium of artistic expression. Although research on immersive and MoCap-based training is expanding rapidly, recent reviews continue to highlight theoretical and methodological gaps in this field [6,82,91]. To advance this research, future studies should incorporate CAMMT’s key constructs, including presence, agency, and the six mediating Cognitive Affective factors, when examining how MoCap supports creative learning and design innovation. It is equally important to consider contextual moderators such as system usability, learner diversity, and social collaboration, as these shape both engagement and expressive performance. Further research should also explore how variables such as body-image perception, prior technological experience, and emotional disposition influence the experience of embodiment in creative training. As MoCap becomes increasingly embedded in creative education and design research, CAMMT provides a timely, structured theoretical foundation for understanding how embodied interaction transforms cognition, emotion, and Artistic Innovation. Ultimately, the model offers guidance for researchers and designers seeking to develop human-centered, embodied learning environments that cultivate creativity and innovation in the next generation of animators, performers, and digital creators.

Author Contributions

Conceptualization, X.J. and Z.I.; methodology, X.J.; software, X.J.; validation, X.J., Z.I. and J.J.; formal analysis, X.J.; investigation, X.J., J.W. and G.L.; resources, Z.I., J.W. and G.L.; data curation, X.J.; writing—original draft preparation, X.J.; writing—review and editing, X.J. and Z.I.; visualization, X.J.; supervision, Z.I.; project administration, Z.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Anhui Provincial Office for Philosophy and Social Sciences Planning (2025), through the project “Digital Human Motion Capture and the Integration of Huizhou Intangible Cultural Heritage Opera Performance Training (Grant No. AHSKYY2025D64).” Furthermore, we thank the Virtual Production Studio, Anhui Xinhua University, and the Faculty of Art and Design, Universiti Teknologi MARA (UiTM).

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Acknowledgments

The authors would like to express their sincere gratitude to Zainuddin Ibrahim for his invaluable academic guidance, theoretical insights, and continuous support throughout the conception, development, and completion of this study. His expertise and supervision were instrumental in shaping the research design and the study’s overall direction. The authors also acknowledge Jing Jiang, Jiafeng Wang, and Gang Liu for their assistance with data collection, technical support, and constructive feedback at various stages of the research process. Finally, the authors would like to thank Yue Yuan for her help with the preparation of the manuscript illustrations and the graphic abstract.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AR: Augmented Reality
CAMIL: Cognitive Affective Model of Immersive Learning
CAMMT: Cognitive Affective Model of Motion Capture Training
CATLM: Cognitive Affective Theory of Learning with Media
HCI: Human–Computer Interaction
IVR: Immersive Virtual Reality
MoCap: Motion Capture
MR: Mixed Reality
SEM: Structural Equation Modeling
VR: Virtual Reality

References

  1. Hart, H. When Will a Motion-Capture Actor Win an Oscar? WIRED. 24 January 2012. Available online: https://www.wired.com/2012/01/andy-serkis-oscars/ (accessed on 12 September 2025).
  2. Wibowo, M.C.; Nugroho, S.; Wibowo, A. The use of motion capture technology in 3D animation. Int. J. Comput. Digit. Syst. 2024, 15, 975–987. [Google Scholar] [CrossRef]
  3. Reuter, A.S.; Schindler, M. Motion capture systems and their use in educational research: Insights from a systematic literature review. Educ. Sci. 2023, 13, 167. [Google Scholar] [CrossRef]
  4. Bennett, G.; Kruse, J. Teaching visual storytelling for virtual production pipelines incorporating motion capture and visual effects. In Proceedings of the SIGGRAPH Asia 2015 Symposium on Education, Kobe, Japan, 2–6 November 2015; Association for Computing Machinery: New York, NY, USA, 2015. [Google Scholar] [CrossRef]
  5. Mou, T.Y. Motion capture in supporting creative animation design. In DS86, Proceedings of the Fourth International Conference on Design Creativity, Georgia Institute of Technology, Atlanta, GA, USA, 2–4 November 2016; The Design Society: Shenzhen, China, 2016. [Google Scholar]
  6. Najafi, H.; Kennedy, J.; Ramsay, E.; Todoroki, M.; Bennett, G. A pedagogical workflow for interconnected learning: Integrating motion capture in animation, visual effects, and game design: Major/minor curriculum structure that supports the integration of motion capture with animation, visual effects and game design teaching pathways. In Proceedings of the SIGGRAPH Asia 2024 Educator’s Forum (SA ’24). Association for Computing Machinery, Tokyo, Japan, 3–6 December 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 1–7. [Google Scholar] [CrossRef]
  7. Makransky, G.; Petersen, G.B. The cognitive affective model of immersive learning (CAMIL): A theoretical research-based model of learning in immersive virtual reality. Educ. Psychol. Rev. 2021, 33, 937–958. [Google Scholar] [CrossRef]
  8. Child, B. Andy Serkis: Why Won’t Oscars Go Ape over Motion-Capture Acting. The Guardian, 12 August 2011. [Google Scholar]
  9. Rapp, I. Motion Capture Actors: Body Movement Tells the Story. DirectSubmit from NYCastings. 2017. Available online: https://www.nycastings.com/motion-capture-actors-body-movement-tells-the-story (accessed on 11 September 2025).
  10. Salomon, A. Growth in Performance Capture Helping Gaming Actors Weather Slump; Backstage: New York, NY, USA, 2013; Available online: https://www.backstage.com/magazine/article/growth-performance-capturehelping-gaming-actors-weatherslump-47881/ (accessed on 11 November 2025).
  11. Auslander, P. Film acting and performance capture. PAJ J. Perform. Art 2017, 39, 7–23. [Google Scholar] [CrossRef]
  12. Menolotto, M.; Komaris, D.-S.; Tedesco, S.; O’Flynn, B.; Walsh, M. Motion capture technology in industrial applications: A systematic review. Sensors 2020, 20, 5687. [Google Scholar] [CrossRef]
  13. Mou, T.-Y. Keyframe or motion capture? Reflections on education of character animation. Eurasia J. Math. Sci. Technol. Educ. 2018, 14, em1649. [Google Scholar] [CrossRef]
  14. Cross, N. Design Thinking: Understanding How Designers Think and Work; Oxford University Press: Oxford, UK, 2011. [Google Scholar]
  15. Dorst, K. Design problems and design paradoxes. Des. Issues 2006, 22, 4–17. [Google Scholar] [CrossRef]
  16. Loomis, J.M.; Blascovich, J.J.; Beall, A.C. Immersive virtual environment technology as a basic research tool in psychology. Behav. Res. Methods Instrum. Comput. 1999, 31, 557–564. [Google Scholar] [CrossRef]
  17. Slater, M.; Sanchez-Vives, M.V. Enhancing our lives with immersive virtual reality. Front. Robot. AI 2016, 3, 74. [Google Scholar] [CrossRef]
  18. Salzman, M.C.; Dede, C.; Loftin, R.B.; Chen, J. A model for understanding how virtual reality aids complex conceptual learning. Presence Teleoper. Virtual Environ. 1999, 8, 293–316. [Google Scholar] [CrossRef]
  19. Lee, E.A.-L.; Wong, K.W.; Fung, C.C. How does desktop virtual reality enhance learning outcomes? A structural equation modeling approach. Comput. Educ. 2010, 55, 1424–1442. [Google Scholar] [CrossRef]
  20. Makransky, G.; Lilleholt, L. A structural equation modeling investigation of the emotional value of immersive virtual reality in education. Educ. Technol. Res. Dev. 2018, 66, 1141–1164. [Google Scholar] [CrossRef]
  21. Pekrun, R. The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educ. Psychol. Rev. 2006, 18, 315–341. [Google Scholar] [CrossRef]
  22. Makransky, G.; Petersen, G.B. Investigating the process of learning with desktop virtual reality: A structural equation modeling approach. Comput. Educ. 2019, 134, 15–30. [Google Scholar] [CrossRef]
  23. Moreno, R.; Mayer, R. Interactive multimodal learning environments: Special issue on interactive learning environments: Contemporary issues and trends. Educ. Psychol. Rev. 2007, 19, 309–326. [Google Scholar] [CrossRef]
  24. Gero, J.S. Design prototypes: A knowledge representation schema for design. AI Mag. 1990, 11, 26–36. [Google Scholar] [CrossRef]
  25. Runco, M.A.; Jaeger, G.J. The standard definition of creativity. Creat. Res. J. 2012, 24, 92–96. [Google Scholar] [CrossRef]
  26. Goel, V. Sketches of Thought; MIT Press: Cambridge, MA, USA, 1995. [Google Scholar]
  27. Goldschmidt, G. Linkography: Unfolding the Design Process; MIT Press: Cambridge, MA, USA, 2014. [Google Scholar]
  28. Guilford, J.P. The Nature of Human Intelligence; McGraw-Hill: Columbus, OH, USA, 1967. [Google Scholar]
  29. Torrance, E.P. Torrance Tests of Creative Thinking: Norms-Technical Manual; Scholastic Testing Service: Bensenville, IL, USA, 1974. [Google Scholar]
  30. Lee, K.M. Presence, explicated. Commun. Theory 2004, 14, 27–50. [Google Scholar] [CrossRef]
  31. Riva, G.; Davide, F.; IJsselsteijn, W.A. Being There: Concepts, Effects and Measurements of User Presence in Synthetic Environments; IOS Press: Amsterdam, The Netherlands, 2003. [Google Scholar]
  32. Witmer, B.G.; Singer, M.J. Measuring presence in virtual environments: A presence questionnaire. Presence 1998, 7, 225–240. [Google Scholar] [CrossRef]
  33. Dalgarno, B.; Lee, M.J. What are the learning affordances of 3-D virtual environments? Br. J. Educ. Technol. 2010, 41, 10–32. [Google Scholar] [CrossRef]
  34. Cannavò, A.; Pratticò, F.G.; Bruno, A.; Lamberti, F. AR-MoCap: Using augmented reality to support motion capture acting. In Proceedings of the IEEE Conference Virtual Reality and 3D User Interfaces (VR), Shanghai, China, 25–29 March 2023. [Google Scholar]
  35. Seymour, M. Art of LED Wall Virtual Production, Part One: Lessons from the Mandalorian. 2022. Available online: https://www.luxmc.com/press-a/art-of-led-wall-virtual-production-part-one-lessons-from-the-mandalorian/ (accessed on 11 November 2025).
  36. Moore, J.W.; Fletcher, P.C. Sense of agency in health and disease: A review of cue integration approaches. Conscious. Cogn. 2012, 21, 59–68. [Google Scholar] [CrossRef]
  37. Johnson-Glenberg, M.C. The necessary nine: Design principles for embodied VR and active STEM education. In Learning in a Digital World: Perspectives on Interactive Technologies for Formal and Informal Education; Springer: Singapore, 2019; pp. 83–112. [Google Scholar]
  38. Farrer, C.; Bouchereau, M.; Jeannerod, M.; Franck, N. Effect of distorted visual feedback on the sense of agency. Behav. Neurol. 2008, 19, 53–57. [Google Scholar] [CrossRef]
  39. Kilteni, K.; Groten, R.; Slater, M. The sense of embodiment in virtual reality. Presence Teleoper. Virtual Environ. 2012, 21, 373–387. [Google Scholar] [CrossRef]
  40. Belcher, J.W. Improving Student Understanding with TEAL; The MIT Faculty Newsletters: Cambridge, MA, USA, 2003; p. 16. [Google Scholar]
  41. Pan, Y.; Mitchell, K. Group-based expert walkthroughs: How immersive technologies can facilitate the collaborative authoring of character animation. In Proceedings of the 2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Atlanta, GA, USA, 22–26 March 2020; pp. 188–195. [Google Scholar] [CrossRef]
  42. Salomão, A.; Andaló, F.; Prim, G.; Horn Vieira, M.L.; Romeiro, N.C. Case studies of motion capture as a tool for human-computer interaction research in the areas of design and animation. In Human-Computer Interaction. Theoretical Approaches and Design Methods; Kurosu, M., Ed.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2022; Volume 13302. [Google Scholar] [CrossRef]
  43. Park, M.; Cho, Y.; Na, G.; Kim, J. Application of virtual avatar using motion capture in immersive virtual environment. Int. J. Hum.-Comput. Interact. 2024, 40, 6344–6358. [Google Scholar] [CrossRef]
  44. Maraffi, T. Metahuman theatre: Teaching photogrammetry and MoCap as a performing arts process. In Proceedings of the ACM SIGGRAPH 2024 Educator’s Forum, Denver, CO, USA, 27 July–1 August 2024. [Google Scholar] [CrossRef]
  45. Petersen, G.B.; Petkakis, G.; Makransky, G. A study of how immersion and interactivity drive VR learning. Comput. Educ. 2022, 179, 104429. [Google Scholar] [CrossRef]
  46. Johnson-Glenberg, M.C.; Megowan-Romanowicz, C. Embodied science and mixed reality: How gesture and motion capture affect physics education. Cogn. Res. Princ. Implic. 2017, 2, 24. [Google Scholar] [CrossRef]
  47. Dewey, J. How We Think: A Restatement of the Relation of Reflective Thinking to the Educative Process; D.C. Heath: Lexington, MA, USA, 1933. [Google Scholar]
  48. Habermas, J. The Theory of Communicative Action, Vol. 2: Lifeworld and System: A Critique of Functionalist Reason; McCarthy, T., Translator; Beacon Press: Boston, MA, USA, 1987. [Google Scholar]
  49. Kuiper, R.A.; Pesut, D.J. Promoting cognitive and metacognitive reflective reasoning skills in nursing practice: Self-regulated learning theory. J. Adv. Nurs. 2004, 45, 381–391. [Google Scholar] [CrossRef]
  50. Starcic, A.I.; Lipsmeyer, W.M.; Lin, L. Using motion capture technologies to provide advanced feedback and scaffolds for learning. In Mind, Brain and Technology: Learning in the Age of Emerging Technologies; Springer: Berlin/Heidelberg, Germany, 2018; pp. 107–121. [Google Scholar] [CrossRef]
  51. Kiiski, H.; Hoyet, L.; Woods, A.T.; O’Sullivan, C.; Newell, F.N. Strutting hero, sneaking villain: Utilizing body motion cues to predict the intentions of others. ACM Trans. Appl. Percept. 2015, 13, 1–21. [Google Scholar] [CrossRef]
  52. Bowman, C.; Fujita, H.; Perin, G. Towards a knowledge-based environment for the cognitive understanding and creation of immersive visualization of expressive human movement data. In Trends in Applied Knowledge-Based Systems and Data Science; Fujita, H., Ali, M., Selamat, A., Sasaki, J., Kurematsu, M., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2016; Volume 9799. [Google Scholar] [CrossRef]
  53. Frost, J.L.; Wortham, S.C.; Reifel, R.S. Play and Child Development; Pearson: London, UK, 2012. [Google Scholar]
  54. Zeng, N.; Ayyub, M.; Sun, H.; Wen, X.; Xiang, P.; Gao, Z. Effects of physical activity on motor skills and cognitive development in early childhood: A systematic review. BioMed Res. Int. 2017, 2017, 2760716. [Google Scholar] [CrossRef]
  55. Hooks, E. Acting for Animators; Routledge: London, UK, 2017. [Google Scholar]
  56. Miaw, D.R.; Raskar, R. Second skin: Motion capture with actuated feedback for motor learning. In CHI ’09 Extended Abstracts on Human Factors in Computing Systems; Association for Computing Machinery: Boston, MA, USA, 2009; pp. 4537–4542. [Google Scholar]
  57. Pilati, F.; Faccio, M.; Gamberi, M.; Regattieri, A. Learning manual assembly through real-time motion capture for operator training with augmented reality. Procedia Manuf. 2020, 45, 189–195. [Google Scholar] [CrossRef]
  58. Sprenkels, B. Promoting Motor Learning in Squats Through Visual Error Augmented Feedback: A Markerless Motion Capture Approach. Master’s Thesis, University of Twente, Enschede, The Netherlands, 2024. [Google Scholar]
  59. Manaf, A.A.; Arshad, M.R.; Bahrin, K. Team learning in motion capture operations and independent rigging processes. Int. J. Sci. Technol. Res. 2020, 9, 2545–2549. [Google Scholar]
  60. Ståhl, A. Designing for Emotional Expressivity. Doctoral Dissertation, Linnaeus University, Växjö, Sweden, 2005. [Google Scholar]
  61. Crane, E.; Gross, M. Motion capture and emotion: Affect detection in whole body movement. In Affective Computing and Intelligent Interaction; Paiva, A.C.R., Prada, R., Picard, R.W., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4738. [Google Scholar] [CrossRef]
  62. Ennis, C.; Hoyet, L.; Egges, A.; McDonnell, R. Emotion capture: Emotionally expressive characters for games. In MIG ’13 Proceedings of Motion on Games; Association for Computing Machinery: New York, NY, USA, 2013; pp. 53–60. [Google Scholar] [CrossRef]
  63. Winkielman, P.; Niedenthal, P.; Wielgosz, J.; Eelen, J.; Kavanagh, L.C. Embodiment of cognition and emotion. In APA Handbook of Personality and Social Psychology; American Psychological Association: Worcester, MA, USA, 2015. [Google Scholar] [CrossRef]
  64. Wang, X.; Zhong, W. Evolution and innovations in animation: A comprehensive review and future directions. Concurr. Comput. Pract. Exp. 2024, 36, e7904. [Google Scholar] [CrossRef]
  65. Finke, R.A.; Ward, T.B.; Smith, S.M. Creative Cognition: Theory, Research, and Applications; MIT Press: Cambridge, MA, USA, 1992. [Google Scholar]
  66. Ward, T.B. Structured imagination: The role of category structure in exemplar generation. Cogn. Psychol. 1994, 27, 1–40. [Google Scholar] [CrossRef]
  67. Boden, M.A. The Creative Mind: Myths and Mechanisms; Routledge: Oxfordshire, UK, 2004. [Google Scholar]
  68. Chan, J.C.; Leung, H.; Tang, J.K.; Komura, T. A virtual reality dance training system using motion capture technology. IEEE Trans. Learn. Technol. 2010, 4, 187–195. [Google Scholar] [CrossRef]
  69. Fischer, F.; Bruhn, J.; Gräsel, C.; Mandl, H. Fostering collaborative knowledge construction with visualization tools. Learn. Instr. 2002, 12, 213–232. [Google Scholar] [CrossRef]
  70. Asllani, A.; Ettkin, L.P.; Somasundar, A. Sharing knowledge with conversational technologies: Web logs versus discussion boards. Int. J. Inf. Technol. Manag. 2008, 7, 217–230. [Google Scholar] [CrossRef]
  71. Cila, N. Designing human-agent collaborations: Commitment, responsiveness, and support. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 30 April–5 May 2022. [Google Scholar] [CrossRef]
  72. Chheang, V.; Saalfeld, P.; Joeres, F.; Boedecker, C.; Huber, T.; Huettl, F.; Lang, H.; Preim, B.; Hansen, C. A collaborative virtual reality environment for liver surgery planning. Comput. Graph. 2021, 99, 234–246. [Google Scholar] [CrossRef]
  73. Park, M.J.; Kim, D.J.; Lee, U.; Na, E.J.; Jeon, H.J. A literature overview of virtual reality (VR) in treatment of psychiatric disorders: Recent advances and limitations. Front. Psychiatry 2019, 10, 505. [Google Scholar] [CrossRef]
  74. Chen, C.; Helal, S.; De Deugd, S.; Smith, A.; Chang, C.K. Toward a collaboration model for smart spaces. In Proceedings of the 2012 Third International Workshop on Software Engineering for Sensor Network Applications (SESENA), Zurich, Switzerland, 2 June 2012; IEEE: New York, NY, USA, 2012; pp. 37–42. [Google Scholar] [CrossRef]
  75. Devigne, L.; Babel, M.; Nouviale, F.; Narayanan, V.K.; Pasteau, F.; Gallien, P. Design of an immersive simulator for assisted power wheelchair driving. In Proceedings of the 2017 International Conference on Rehabilitation Robotics (ICORR), London, UK, 17–20 July 2017; pp. 995–1000. [Google Scholar] [CrossRef]
  76. Johnson-Glenberg, M.C.; Birchfield, D.A.; Tolentino, L.; Koziupa, T. Collaborative embodied learning in mixed reality motion-capture environments: Two science studies. J. Educ. Psychol. 2014, 106, 86–99. [Google Scholar] [CrossRef]
  77. Liu, X.; Li, L.; Lu, J.; Du, L.; Shen, G. A preliminary study on collaborative methods in animation design. In Proceedings of the 2010 14th International Conference on Computer Supported Cooperative Work in Design, Shanghai, China, 14–16 April 2010; IEEE: New York, NY, USA, 2010; pp. 764–771. [Google Scholar] [CrossRef]
  78. Mayer, R.E. The Cambridge Handbook of Multimedia Learning; Cambridge University Press: Cambridge, UK, 2005. [Google Scholar] [CrossRef]
  79. Lamberti, F.; Paravati, G.; Gatteschi, V.; Cannavo, A.; Montuschi, P. Virtual character animation based on affordable motion capture and reconfigurable tangible interfaces. IEEE Trans. Vis. Comput. Graph. 2017, 24, 1742–1755. [Google Scholar] [CrossRef]
  80. Parong, J.; Mayer, R.E. Learning science in immersive virtual reality. J. Educ. Psychol. 2018, 110, 785–797. [Google Scholar] [CrossRef]
  81. Gupta, A.; Agrawala, M.; Curless, B.; Cohen, M. MotionMontage: A system to annotate and combine motion takes for 3D animations. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Toronto, ON, Canada, 26 April–1 May 2014. [Google Scholar] [CrossRef]
  82. Young, G.W.; Dinan, G.; Smolic, A. Realtime-3D interactive content creation for multi-platform distribution: A 3D interactive content creation user study. In Virtual, Augmented and Mixed Reality, Proceedings of the 15th International Conference, VAMR 2023, Held as Part of the 25th HCI International Conference, HCII 2023, Copenhagen, Denmark, 23–28 July 2023, Proceedings; Springer: Cham, Switzerland, 2023. [Google Scholar] [CrossRef]
  83. Megre, R.; Kunz, S. Motion capture visualisation for mixed animation techniques. In Proceedings of the Electronic Visualisation and the Arts London 2020 Conference, London, UK, 16–20 November 2020. [Google Scholar] [CrossRef]
  84. Fenn, J.; Blosch, M. Understanding Gartner’s Hype Cycles; Gartner: Singapore, 2020. [Google Scholar]
  85. Bennett, G.; Denton, A. Developing practical models for teaching motion capture. In ACM SIGGRAPH ASIA 2009 Educators Program; Association for Computing Machinery: New York, NY, USA, 2009; pp. 1–5. [Google Scholar] [CrossRef]
  86. Nikolai, J.R.; Bennett, G.; Marks, S.; Gilson, G. Active learning and teaching through digital technology and live performance: ‘Choreographic thinking’ as art practice in the tertiary sector. Int. J. Art Des. Educ. 2019, 38, 137–152. [Google Scholar] [CrossRef]
  87. Dunleavy, M.; Dede, C.; Mitchell, R. Affordances and limitations of immersive participatory augmented reality simulations for teaching and learning. J. Sci. Educ. Technol. 2009, 18, 7–22. [Google Scholar] [CrossRef]
  88. Venkatesh, V.; Davis, F.D. A theoretical extension of the technology acceptance model: Four longitudinal field studies. Manag. Sci. 2000, 46, 186–204. [Google Scholar] [CrossRef]
  89. Suh, A.; Prophet, J. The state of immersive technology research: A literature analysis. Comput. Hum. Behav. 2018, 86, 77–90. [Google Scholar] [CrossRef]
  90. Li, P.; Legault, J.; Klippel, A.; Zhao, J. Virtual reality for student learning: Understanding individual differences. Hum. Behav. Brain 2020, 1, 28–36. [Google Scholar] [CrossRef]
  91. Fu, Y.; Li, Q.; Ma, D. User experience of a serious game for physical rehabilitation using wearable motion capture technology. IEEE Access 2023, 11, 108407–108417. [Google Scholar] [CrossRef]
Figure 1. Number of articles in the Scopus database that refer to MoCap in computer animation education. Note: The following search string was applied: ((TITLE-ABS-KEY ("motion capture" OR MoCap) AND TITLE-ABS-KEY (animation OR computer animation* OR character animation*)) AND TITLE-ABS-KEY (education OR learn* OR train OR teach*)) from January 2000 to October 2025.
Figure 2. MoCap training cycle.
Figure 3. Overview of the CAMMT.
Figure 4. Case illustration for MoCap-based module. Note: (A) Schematic of a typical classroom/studio MoCap training scene; (B) overview of the iterative lesson procedure.
Table 1. Distinguishing CAMMT from CAMIL.
Dimension | CAMIL | CAMMT
Primary focus | Learning in immersive virtual reality (IVR): how VR features shape presence/agency and learning outcomes. | MoCap-based character animation training: how performance affordances shape presence/agency and creative skill development.
Core interaction locus | Learner is situated in a simulated virtual environment and interacts with virtual content (often HMD-based). | Learner performs in the physical world; motion is captured, mapped to a character, and refined via capture–playback–retake cycles.
Psychological affordances | Presence: "being there" in the virtual environment; agency: perceived control/ownership over actions in IVR. | Presence: creative–performative embodied engagement; agency: creative authorship over expressive motion (intent → action → mapped feedback).
Six mediators | Affective/cognitive mediators such as interest, intrinsic motivation, self-efficacy, embodiment, cognitive load, and self-regulation. | Control and Active Learning; Reflective Thinking; Perceptual Motor Skills; Emotional Expressive; Artistic Innovation; Collaborative Construction.
Outcome emphasis | Knowledge outcomes (factual, conceptual, procedural) and transfer of learning. | Technical literacy; conceptual understanding; procedural mastery; adaptive innovation/transfer, with explicit emphasis on expressive performance quality and creative output.
Practical implication | Guides IVR lesson design to maximize learning while managing cognitive/affective constraints. | Guides MoCap lesson design (task briefs, critique loops, iteration, collaboration roles) to cultivate expressive motion, creativity, and transfer.
Table 2. Testable propositions.
Model Segment | Proposition | Path(s) | Predictor | Outcome
Technology affordances → psychological affordances | P1 | 1 | Immersion | Presence
 | P2 | 2 | Interactivity | Presence
 | P3 | 3 | Interactivity | Agency
 | P4 | 4 | Representational fidelity | Presence
 | P5 | 5 | Representational fidelity | Agency
Psychological affordances → mediating constructs | P6 | 6–7 (7+) | Presence and Agency | Control and Active Learning
 | P7 | 8–9 (9+) | Presence and Agency | Reflective Thinking
 | P8 | 10–11 (11+) | Presence and Agency | Perceptual–Motor Skills
 | P9 | 12–13 (13+) | Presence and Agency | Emotional Expressive
 | P10 | 14–15 (15+) | Presence and Agency | Artistic Innovation
 | P11 | 16–17 (17+) | Presence and Agency | Collaborative Construction
Mediating constructs → embodied learning and creative outcomes | P18 | 18 | Control and Active Learning | Embodied Learning and Creative Outcomes
 | P19 | 19 | Reflective Thinking | Embodied Learning and Creative Outcomes
 | P20 | 20 | Perceptual–Motor Skills | Embodied Learning and Creative Outcomes
 | P21 | 21 | Emotional Expressive | Embodied Learning and Creative Outcomes
 | P22 | 22 | Artistic Innovation | Embodied Learning and Creative Outcomes
 | P23 | 23 | Collaborative Construction | Embodied Learning and Creative Outcomes
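Because Table 2 specifies directional paths among composite constructs, one feasible way to evaluate the propositions is structural equation modeling (SEM). The snippet below is a minimal, illustrative sketch rather than an analysis reported in this article: it assumes the open-source semopy package and a hypothetical dataset of per-learner composite scores (cammt_module_scores.csv), and it encodes only propositions P1–P5 plus one example mediation chain.

```python
# Minimal, illustrative sketch (not an analysis reported in this article):
# encode a subset of the Table 2 propositions as a path model with semopy.
# Requires: pip install semopy pandas
import pandas as pd
from semopy import Model

# Hypothetical dataset: one row per learner, columns are composite scores
# (e.g., scale means or rubric totals) for each construct.
data = pd.read_csv("cammt_module_scores.csv")  # placeholder file name

# P1-P5: technology affordances -> Presence/Agency;
# P6 and P18: Presence/Agency -> Control and Active Learning -> outcomes.
model_desc = """
Presence ~ Immersion + Interactivity + Fidelity
Agency ~ Interactivity + Fidelity
ControlActiveLearning ~ Presence + Agency
CreativeOutcomes ~ ControlActiveLearning
"""

model = Model(model_desc)
model.fit(data)          # estimates the path coefficients
print(model.inspect())   # parameter estimates, standard errors, p-values
```

The remaining propositions can be added as further regression lines, and contextual moderators such as usability or prior experience can be examined through multi-group comparison or interaction terms.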
Table 3. Operationalization roadmap for CAMMT.
Component | Paths | Operational Indicators and Feasible Data Sources (Triangulation)
Technology affordances | Paths 1–5 | System/pipeline logs (latency, dropped frames, tracking error, mapping stability); calibration success rate; brief instructor implementation checklist (immersion/interactivity/fidelity).
Presence and Agency | Paths 1–5 | Short presence/agency items after each capture–playback cycle (or post-session); behavior traces: intentional retakes, motion variants, active-performance time vs. passive observation.
Six mediators | Paths 6–17 (7+, 9+, 11+, 13+, 15+, 17+) | Rubrics and artifacts aligned to each construct: reflection notes (Reflective Thinking); motion-quality rubric (Perceptual Motor Skills); emotion clarity/congruence ratings (Emotional Expressive); novelty-appropriateness rubric (Artistic Innovation); collaboration rubric (role coordination, feedback density).
Learning Outcomes | Paths 18–23 | Knowledge checks; end-of-module performance task; portfolio artifacts; transfer task (apply learned motion principles to a new character brief/genre constraint).
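As one way to operationalize the triangulation in Table 3, the sketch below defines a hypothetical per-cycle log record that keeps system-log, self-report, behavioral, and rubric indicators side by side; all field names and scale ranges are illustrative assumptions, not instruments prescribed by CAMMT.

```python
# Illustrative only: one possible record per capture-playback cycle, keeping the
# Table 3 data sources (system logs, self-report, behavior traces, rubrics) together.
from dataclasses import dataclass, field
from statistics import mean
from typing import List

@dataclass
class CaptureCycleRecord:
    learner_id: str
    cycle_index: int
    # Technology affordances (Paths 1-5): system/pipeline logs
    mean_latency_ms: float
    dropped_frame_rate: float
    calibration_ok: bool
    # Presence and Agency (Paths 1-5): short post-cycle self-report items (1-7)
    presence_items: List[int] = field(default_factory=list)
    agency_items: List[int] = field(default_factory=list)
    # Behavior traces
    intentional_retakes: int = 0
    active_performance_seconds: float = 0.0
    # Six mediators (Paths 6-17): rubric scores (1-5) entered after instructor review
    motion_quality: float = 0.0           # Perceptual Motor Skills
    emotion_congruence: float = 0.0       # Emotional Expressive
    novelty_appropriateness: float = 0.0  # Artistic Innovation

    def presence_score(self) -> float:
        """Simple scale mean; a study could substitute latent or weighted scoring."""
        return mean(self.presence_items) if self.presence_items else float("nan")
```

Records of this kind could then be aggregated per learner into the composite scores used when testing the Table 2 propositions.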
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
