Multimodal Technol. Interact., Volume 9, Issue 10 (October 2025) – 7 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • Papers are published in both HTML and PDF forms; PDF is the official format. To view a paper in PDF form, click on the "PDF Full-text" link and open it with the free Adobe Reader.
24 pages, 5068 KB  
Article
Multimodal Learning Interactions Using MATLAB Technology in a Multinational Statistical Classroom
by Qiaoyan Cai, Mohd Razip Bajuri, Kwan Eu Leong and Liangliang Chen
Multimodal Technol. Interact. 2025, 9(10), 106; https://doi.org/10.3390/mti9100106 - 13 Oct 2025
Abstract
This study explores and models the use of MATLAB technology in multimodal learning interactions to address the challenges of teaching and learning statistics in a multinational postgraduate classroom. The term multimodal refers to the deliberate integration of multiple representational and interaction modes, i.e., visual, textual, symbolic, and interactive computational modelling, within a coherent instructional design. MATLAB is utilised as it is a comprehensive tool for enhancing students’ understanding of statistical skills, practical applications, and data analysis—areas where traditional methods often fall short. International postgraduate students were chosen for this study because their diverse educational backgrounds present unique learning challenges. A qualitative case study design was employed, and data collection methods included classroom observations, interviews, and student work analysis. The collected data were analysed and modelled by conceptualising key elements and themes using thematic analysis, with findings verified through data triangulation and expert review. Emerging themes were structured into models that illustrate multimodal teaching and learning interactions. The novelty of this research lies in its contribution to multimodal teaching and learning strategies for multinational students in statistics education. The findings highlight significant challenges international students face, including language and technical barriers, limited prior content knowledge, time constraints, technical difficulties, and a lack of independent thinking. To address these challenges, MATLAB promotes collaborative learning, increases student engagement and discussion, boosts motivation, and develops essential skills. This study suggests that educators integrate multimodal interactions in their teaching strategies to better support multinational students in statistical learning environments.

Figure 1

16 pages, 4268 KB  
Article
Research on the Detection Method of Flight Trainees’ Attention State Based on Multi-Modal Dynamic Depth Network
by Gongpu Wu, Changyuan Wang, Zehui Chen and Guangyi Jiang
Multimodal Technol. Interact. 2025, 9(10), 105; https://doi.org/10.3390/mti9100105 - 10 Oct 2025
Abstract
In aviation safety, pilots must efficiently process dynamic visual information and maintain a high level of attention. Any missed judgment of critical information or delay in decision-making may lead to mission failure or catastrophic consequences. Accurately detecting pilots’ attention states is therefore a primary prerequisite for improving flight safety and performance. To this end, this paper takes flight trainees as its subjects, with a simulated flight environment as the experimental setting, and proposes a method for detecting trainees’ attention states based on a multi-modal dynamic depth network (M3D-Net). M3D-Net is a lightweight neural network architecture that integrates temporal image features, visual information features, and flight operation data features. It aligns image and text features through an attention mechanism to strengthen the semantic association between modalities, and it uses a Depth-wise Separable Convolution and LSTM (DSC-LSTM) module to model temporal information, dynamically capturing contextual dependencies within the sequence and achieving six-level attention state classification. We conducted ablation experiments to compare the classification performance of the model variants and evaluated the proposed method with standard model evaluation metrics. Experiments show that the proposed architecture reaches a classification accuracy of 97.56% with a model size of 18.6 M. Compared with traditional algorithms, the M3D-Net architecture offers better prospects for practical application.

Figure 1
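The abstract above credits the Depth-wise Separable Convolution (DSC) part of the DSC-LSTM module with keeping the network lightweight. As a minimal sketch of why depthwise separable convolutions shrink a model (the kernel size and channel counts below are invented for illustration, not taken from M3D-Net), the parameter counts of a standard convolution and its separable counterpart can be compared directly:

```python
def conv_params(k, c_in, c_out):
    # Standard 2-D convolution: one k x k x c_in kernel per output channel.
    return k * k * c_in * c_out

def dsc_params(k, c_in, c_out):
    # Depthwise separable convolution: one k x k filter per input channel
    # (depthwise), followed by a 1 x 1 pointwise convolution mixing channels.
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = conv_params(3, 64, 128)   # 73,728 parameters
separable = dsc_params(3, 64, 128)   # 576 + 8,192 = 8,768 parameters
print(standard, separable, round(standard / separable, 1))
```

For this layer the separable form needs roughly an eighth of the parameters, which is the kind of saving that makes an 18.6 M model plausible for a multimodal network.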

23 pages, 3467 KB  
Article
Adaptive Neuro-Fuzzy Inference System Framework for Paediatric Wrist Injury Classification
by Olamilekan Shobayo, Reza Saatchi and Shammi Ramlakhan
Multimodal Technol. Interact. 2025, 9(10), 104; https://doi.org/10.3390/mti9100104 - 8 Oct 2025
Abstract
An Adaptive Neuro-Fuzzy Inference System (ANFIS) framework for paediatric wrist injury classification (fracture versus sprain) was developed utilising infrared thermography (IRT). ANFIS combines artificial neural network (ANN) learning with interpretable fuzzy rules, mitigating the “black-box” limitation of conventional ANNs through explicit membership functions and Takagi–Sugeno rule consequents. Forty children (19 fractures, 21 sprains, confirmed by X-ray radiography) provided thermal image sequences from which three statistically discriminative temperature distribution features, namely standard deviation, inter-quartile range (IQR), and kurtosis, were selected. A five-layer Sugeno ANFIS with Gaussian membership functions was trained using hybrid least-squares/gradient-descent optimisation and evaluated under three premise-parameter initialisation strategies: random seeding, K-means clustering, and fuzzy C-means (FCM) data partitioning. Five-fold cross-validation guided the selection of the membership function standard deviation (σ) and rule count, yielding an optimal nine-rule model. Comparative experiments show that K-means initialisation achieved the best balance between convergence speed and generalisation, versus slower but highly precise random initialisation and rapidly convergent yet unstable FCM. The proposed K-means-driven ANFIS offered data-efficient decision support, highlighting the potential of fusing thermal features with neuro-fuzzy modelling to reduce unnecessary radiographs in emergency bone fracture triage.

Figure 1
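As a minimal sketch of the Takagi–Sugeno inference the abstract describes — Gaussian membership functions weighting first-order rule consequents — the following toy model uses two invented rules over a single temperature feature. The centres, spreads, and consequent coefficients are illustrative only, not the paper's fitted nine-rule model:

```python
import math

def gaussian_mf(x, c, sigma):
    # Gaussian membership: degree to which x belongs to a fuzzy set
    # centred at c with spread sigma.
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))

def sugeno_infer(x, rules):
    # rules: list of ((c, sigma), (p, q)) pairs; each consequent is a
    # first-order Takagi-Sugeno output p * x + q.
    weights = [gaussian_mf(x, c, s) for (c, s), _ in rules]
    outputs = [p * x + q for _, (p, q) in rules]
    # Output is the firing-strength-weighted average of rule consequents.
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)

# Two hypothetical rules over one normalised temperature-spread feature:
rules = [((0.3, 0.1), (1.0, 0.0)),    # "low spread" premise
         ((0.8, 0.1), (-1.0, 2.0))]   # "high spread" premise
score = sugeno_infer(0.55, rules)
```

In the hybrid training scheme the abstract mentions, the consequent coefficients (p, q) are fitted by least squares and the premise parameters (c, σ) by gradient descent.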

26 pages, 4710 KB  
Article
Research on Safe Multimodal Detection Method of Pilot Visual Observation Behavior Based on Cognitive State Decoding
by Heming Zhang, Changyuan Wang and Pengbo Wang
Multimodal Technol. Interact. 2025, 9(10), 103; https://doi.org/10.3390/mti9100103 - 1 Oct 2025
Abstract
Pilot visual behavior safety assessment is a cross-disciplinary technology that analyzes pilots’ gaze behavior and neurocognitive responses. This paper proposes a multimodal analysis method for pilot visual behavior safety based on cognitive state decoding, aiming at a quantitative and efficient assessment of pilots’ observational behavior. Addressing the subjective limitations of traditional methods, we propose an observational behavior detection model that integrates facial images to achieve dynamic, quantitative analysis of observational behavior, and we address the “Midas touch” problem of observational behavior by constructing a cognitive analysis method using multimodal signals. We propose a bidirectional long short-term memory (LSTM) network that matches the rhythmic features of physiological signals to address the problem of isolated features in multidimensional signals. This method captures the dynamic correlations between multiple physiological behaviors, such as prefrontal theta activity and chest–abdominal coordination, to decode the cognitive state underlying pilots’ observational behavior. Finally, the paper uses a decision-level fusion method based on an improved Dempster–Shafer (DS) evidence theory to provide a quantifiable detection strategy for aviation safety standards. This dual-dimensional quantitative assessment system of “visual behavior–neurophysiological cognition” reveals the dynamic correlations between visual behavior and cognitive state among pilots of varying experience, and can provide a new paradigm for pilot neuroergonomics training and early warning of vestibular–visual integration disorders.

Figure 1
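The decision-level fusion step rests on Dempster's rule of combination. A minimal sketch of the classic rule (not the paper's improved variant), with invented mass values for two hypothetical modalities judging the states {safe, unsafe}:

```python
from itertools import product

def ds_combine(m1, m2):
    # Dempster's rule: combine two mass functions whose focal elements are
    # frozensets of hypotheses; mass assigned to conflicting (empty
    # intersection) pairs is discarded and the rest renormalised.
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}

# Hypothetical evidence from two modalities (values invented):
SAFE, UNSAFE = frozenset({"safe"}), frozenset({"unsafe"})
m_gaze = {SAFE: 0.7, UNSAFE: 0.3}     # evidence from gaze behavior
m_physio = {SAFE: 0.6, UNSAFE: 0.4}   # evidence from physiological signals
fused = ds_combine(m_gaze, m_physio)
```

Agreeing evidence reinforces itself: the fused belief in "safe" (≈0.78) exceeds either modality's individual mass, which is the usual motivation for DS fusion at the decision level.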

18 pages, 7318 KB  
Article
Design of Enhanced Virtual Reality Training Environments for Industrial Rotary Dryers Using Mathematical Modeling
by Ricardo A. Gutiérrez-Aguiñaga, Jonathan H. Rosales-Hernández, Rogelio Salinas-Santiago, Froylán M. E. Escalante and Efrén Aguilar-Garnica
Multimodal Technol. Interact. 2025, 9(10), 102; https://doi.org/10.3390/mti9100102 - 30 Sep 2025
Abstract
Rotary dryers are widely used in industry for their ease of operation in processing large volumes of material continuously, despite persistent challenges in energy efficiency, cost-effectiveness, and safety. Addressing the need for effective operator training, the purpose of this study is to develop virtual reality (VR) environments for industrial rotary dryers. Visual and behavioral aspects were considered in developing the environments for two application cases—ammonium nitrate and low-rank coal drying. Visual aspects include the industrial-scale geometry and detailed components of the rotary dryer, while behavioral aspects are governed by mathematical modeling of heat and mass transfer phenomena. These two case studies were selected for their industrial relevance and contrasting drying characteristics, ensuring the versatility and applicability of the developed VR environments. The main contribution of this work is the embedding of validated mathematical models—expressed as ordinary differential equations—into these environments. Numerical integration of these models provides key process variables, such as solid temperature and moisture content along the rotary dryer, thereby enhancing the behavioral realism of the developed VR environments.

Figure 1
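The behavioral realism described above comes from numerically integrating ODE models along the dryer axis. As a toy illustration — the first-order drying kinetics, rate constant, and dryer length below are invented, not the validated models used in the study — a forward-Euler march along the axis yields a moisture-content profile of the kind the VR environment can display:

```python
def simulate_drying(x0, k, length, steps):
    # Toy drying kinetics along the dryer axis z: dX/dz = -k * X,
    # where X is the solid moisture content. Forward-Euler integration.
    dz = length / steps
    profile = [x0]
    x = x0
    for _ in range(steps):
        x += dz * (-k * x)   # Euler step: X(z + dz) ~ X(z) + dz * dX/dz
        profile.append(x)
    return profile

# Invented parameters: inlet moisture 0.30 kg/kg, rate constant 0.2 per m,
# a 10 m dryer discretised into 1000 axial steps.
profile = simulate_drying(0.30, 0.2, 10.0, 1000)
```

With this step size the Euler result tracks the analytic solution 0.30·exp(−0.2·z) closely; a production simulation would more likely use a higher-order integrator.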

22 pages, 2952 KB  
Article
SmartRead: A Multimodal eReading Platform Integrating Computing and Gamification to Enhance Student Engagement and Knowledge Retention
by Ifeoluwa Pelumi and Neil Gordon
Multimodal Technol. Interact. 2025, 9(10), 101; https://doi.org/10.3390/mti9100101 - 23 Sep 2025
Abstract
This paper explores the integration of computing and multimodal technologies into personal reading practices to enhance student engagement and knowledge assimilation in higher education. In response to a documented decline in voluntary academic reading, we investigated how technology-enhanced reading environments can re-engage students through interactive and personalized experiences. Central to this research is SmartRead, a proposed multimodal eReading platform that incorporates gamification, adaptive content delivery, and real-time feedback mechanisms. Drawing on empirical data collected from students at a higher education institution, we examined how features such as progress tracking, motivational rewards, and interactive comprehension aids influence reading behavior, engagement levels, and information retention. Results indicate that such multimodal interventions can significantly improve learner outcomes and user satisfaction. This paper contributes actionable insights into the design of innovative, accessible, and pedagogically sound digital reading tools and proposes a framework for future eReading technologies that align with multimodal interaction principles.

Graphical abstract

10 pages, 7955 KB  
Article
Investigating the Effect of Pseudo-Haptics on Perceptions Toward Onomatopoeia Text During Finger-Point Tracing
by Satoshi Saga and Kanta Shirakawa
Multimodal Technol. Interact. 2025, 9(10), 100; https://doi.org/10.3390/mti9100100 - 23 Sep 2025
Abstract
With the advancement of haptic technology, the use of pseudo-haptics to provide tactile feedback without physical contact has garnered significant attention. This paper investigates whether sliding a finger over onomatopoetic text strings augmented with pseudo-haptic effects changes the perception of their symbolic semantics. To address this, we conducted an experiment using finger-point reading as the task. The results confirmed that the “neba-neba,” “puru-puru,” and “fusa-fusa” effects create a pseudo-haptic impression of the associated texts on the “hard–soft,” “slippery–sticky,” and “elastic–inelastic” adjective pairs; for the “hard–soft” pair in particular, the proposed effects produced a consistent impact.

Figure 1
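Pseudo-haptic effects of this kind are commonly produced by manipulating the control-display (C/D) gain between the finger's actual motion and the displayed cursor motion. A hypothetical sketch — not the authors' implementation, and with invented region bounds and gain — of a "sticky" text region that slows the displayed cursor:

```python
def displayed_position(finger_positions, sticky_start, sticky_end, gain=0.5):
    # Pseudo-haptic "sticky" effect: while the finger is inside
    # [sticky_start, sticky_end], the displayed cursor moves with a reduced
    # C/D gain, which users tend to perceive as resistance or stickiness.
    shown = [finger_positions[0]]
    for prev, cur in zip(finger_positions, finger_positions[1:]):
        delta = cur - prev
        if sticky_start <= prev <= sticky_end:
            delta *= gain        # attenuate displayed motion in the region
        shown.append(shown[-1] + delta)
    return shown

# Finger moves in uniform 10 px steps; the region [10, 30] feels "sticky":
trace = displayed_position([0, 10, 20, 30, 40, 50], sticky_start=10, sticky_end=30)
```

The finger moves uniformly, but the displayed trace advances only half as fast inside the region, creating the visual-motor mismatch that pseudo-haptics exploits.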
