
Multimodal Technol. Interact., Volume 10, Issue 4 (April 2026) – 10 articles

Cover Story: Can Brain–Computer Interface systems become a viable solution for integrating disabled workers into factory workstations? Within the INGENIUM Alliance of European universities, an interdisciplinary team explores this question by proposing a six-step integration framework linking manufacturing task requirements, worker capabilities, and interaction modalities in human-in-the-loop manufacturing environments. The framework is illustrated through application to a semi-automated bicycle wheel assembly line, where BCI-supported interaction, augmented interfaces, and robotic assistance are mapped to specific production tasks and assessed in terms of feasibility and technological maturity. An explanatory 10-year roadmap outlines the feasibility and phased deployment of BCI solutions.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
41 pages, 3508 KB  
Systematic Review
Who, Where, What, and How to Nudge: A Systematic Review of Co-Designed Digital Nudges for Behavioral Interventions
by Alaa Ziyud, Khaled Al-Thelaya and Jens Schneider
Multimodal Technol. Interact. 2026, 10(4), 43; https://doi.org/10.3390/mti10040043 - 21 Apr 2026
Viewed by 669
Abstract
Digital nudges refer to subtle modifications in digital choice architectures that are increasingly applied across domains such as healthcare, human–computer interactions, and behavioral science. However, existing approaches often overlook users’ needs, contextual factors, and ethical considerations related to transparency and autonomy. This systematic literature review, guided by PRISMA 2020, examines the integration of co-design methodologies in digital nudging across four dimensions: participants, application domains, nudge forms, and development methods. The findings show that co-design is primarily driven by end-users, supported by domain experts and technology specialists. Applications are concentrated in health-related contexts, particularly chronic disease management and mental health. The effectiveness of priming varied across studies, with some reporting short-term benefits and others indicating user fatigue, suggesting context-dependent impact and limited long-term effectiveness. Full article

34 pages, 8939 KB  
Article
From Prompts to High-Fidelity Prototypes: A Usability Evaluation of Generative AI-Driven Prototyping Tools for Smart Mobile App Design
by John Bustamante-Orejuela, Xavier Quiñonez-Ku and Pablo Pico-Valencia
Multimodal Technol. Interact. 2026, 10(4), 42; https://doi.org/10.3390/mti10040042 - 17 Apr 2026
Viewed by 418
Abstract
The integration of Generative Artificial Intelligence (GAI) into software design tools has transformed the early stages of mobile application development, particularly prototype creation from natural-language prompts. This study evaluates the usability and effectiveness of GAI-assisted prototyping tools for generating high-fidelity mobile application prototypes. A controlled laboratory usability study was conducted in which undergraduate Information Technology Engineering students used and evaluated four widely adopted prototyping platforms: Figma, Uizard, Visily, and Stitch. Participants employed these tools to recreate mobile interfaces corresponding to the interaction model of the Duolingo application. The System Usability Scale (SUS) was used to assess perceived usability and effectiveness from the users’ perspective. The results indicate that all evaluated tools enabled rapid prototype generation; however, significant differences emerged in usability, structural fidelity, and perceived control. Figma and Stitch achieved the highest usability scores and demonstrated greater alignment with the reference prototype (82.86 and 80.36, respectively). Visily achieved a favorable usability score (78.57), while Uizard obtained a moderate score (67.14). Although Uizard and Visily exhibited strong automation capabilities and faster initial generation, their outputs required additional manual refinement to achieve higher fidelity and customization. Participant feedback emphasized the importance of output quality, responsiveness, and foundational design knowledge in achieving satisfactory results. Overall, the findings suggest that current GAI-based prototyping tools are effective and valuable in real-world software development contexts. However, their effectiveness appears closely related to the degree of user control, responsiveness, and the ability to iteratively refine AI-generated interface components. Full article
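The SUS scores reported above (82.86, 80.36, 78.57, 67.14) are on the standard 0–100 scale. As a reminder of how that scale is computed under standard SUS scoring (odd-numbered items are positively worded, even-numbered items negatively worded; the response values below are illustrative, not this study's data):

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses.

    Odd-numbered items (positive statements) contribute (response - 1);
    even-numbered items (negative statements) contribute (5 - response).
    The summed contributions (0-40) are scaled by 2.5 to a 0-100 score.
    """
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5
```

For example, a participant who strongly agrees with every positive item and strongly disagrees with every negative item scores 100; all-neutral responses score 50.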

28 pages, 10998 KB  
Article
Introducing Brain–Computer Interfaces in Factories and Fabrication Lines for the Inclusion of Disabled Workers–Industry 5.0—A Modern Challenge and Opportunity
by Marian-Silviu Poboroniuc, Zoltán Nochta, Martin Klepal, Nina Hunter, Danut-Constantin Irimia, Alina Georgiana Baciu, Kelaja Schert, Tim Piotrowski and Alexandru Mitocaru
Multimodal Technol. Interact. 2026, 10(4), 41; https://doi.org/10.3390/mti10040041 - 17 Apr 2026
Viewed by 356
Abstract
Flexible factories and adaptive fabrication lines offer a testbed for advanced multimodal interaction concepts that can support the inclusion of disabled workers in Industry 5.0 manufacturing systems. The study synthesizes interdisciplinary data from ergonomics, industrial automation, and EU regulatory frameworks to establish a conceptual model for human-machine interaction. Building on conceptual modeling and a structured literature analysis, the study proposes a six-step integration framework that links task demands, worker capabilities, and interaction modalities within human-in-the-loop manufacturing environments. Although no empirical case study was conducted in this phase, an exemplary application is presented for a semi-automated bike wheel manufacturing process. Detailed machine-based assembly line flows and simulated process data were utilized for illustrative purposes to depict the process and validate the proposed Capability–Task Matching Matrix. The results operationalize the human-centric vision of Industry 5.0 by providing a structured methodology for the inclusion of disabled workers within fabrication environments. The findings are organized into two primary components: the conceptual development of the Integration Approach and its practical application to a semi-automated industrial use-case. Finally, a particular focus is placed on Brain–Computer Interfaces (BCIs) as an emerging interaction channel that enables non-muscular control, attention monitoring, and neuroadaptive feedback, complementing conventional interfaces rather than replacing them. The framework is illustrated through application to the same semi-automated bicycle wheel assembly line, where BCI-supported interaction, augmented interfaces, and robotic assistance are mapped to specific production tasks and assessed in terms of feasibility and technological maturity. 
Drawing on the paper’s results, an explanatory 10-year roadmap outlines the feasibility and phased deployment of BCI solutions. It aligns technological advances with European regulations and a vision for a fully inclusive manufacturing enterprise. Full article
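The abstract does not specify the internal structure of the Capability–Task Matching Matrix, but the general matching idea it describes (linking task demands, worker capabilities, and interaction modalities) can be sketched. All names, set contents, and the dict layout below are hypothetical illustrations, not the paper's actual data model:

```python
def feasible_modalities(task_demands, worker_capabilities, modality_support):
    """Hypothetical sketch of one capability-task matching step: keep the
    interaction modalities whose supported demands cover the task's demands
    and whose required capabilities the worker actually has.

    task_demands, worker_capabilities: sets of illustrative tags;
    modality_support: {modality: {"covers": set, "requires": set}}.
    """
    feasible = []
    for modality, profile in modality_support.items():
        covers_task = task_demands <= profile["covers"]
        worker_ok = profile["requires"] <= worker_capabilities
        if covers_task and worker_ok:
            feasible.append(modality)
    return feasible
```

Under this sketch, a worker without fine motor control could still be matched to a confirmation task via a BCI channel requiring only sustained attention, which mirrors the paper's framing of BCIs as complementing rather than replacing conventional interfaces.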

13 pages, 1727 KB  
Article
The Discrimination Threshold on the Palm for Two Successive Rectangular Stimuli
by Mayuka Kojima and Akio Yamamoto
Multimodal Technol. Interact. 2026, 10(4), 40; https://doi.org/10.3390/mti10040040 - 15 Apr 2026
Viewed by 332
Abstract
This study investigates tactile spatial resolution on the palm using two successive rectangular stimuli. Whereas classical tactile resolution studies have focused mainly on point or circular stimulation, less is known about how spatial resolution depends on the placement and geometry of rectangular, device-relevant stimuli. We measured the successive two-stimulus discrimination threshold using three rectangular stimulators across five palm areas aligned along the proximal–distal axis. Participants compared a fixed reference stimulus with a variable comparison stimulus, and the minimum separation at which the two stimuli were perceived as occurring at different locations was recorded as the threshold. The overall average threshold across all experimental conditions was approximately 5.2 mm. The threshold varied systematically across palm regions, being smallest around the palmar digital crease and the base of the fingers. In the central palm, threshold differences were more evident for changes in stimulator width than for changes in stimulator length. These results extend tactile spatial resolution research beyond point stimulation and provide design-relevant guidance for palm-based haptic devices. Full article
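The abstract does not detail how the threshold was estimated from participants' responses. One common way to summarize such discrimination data is to interpolate the separation at which "perceived as two locations" responses cross 50%; the sketch and data below are illustrative, not the study's procedure or measurements:

```python
def threshold_50(separations_mm, p_different):
    """Linearly interpolate the stimulus separation (mm) at which the
    proportion of 'perceived at different locations' responses crosses 50%.

    separations_mm: increasing separations tested;
    p_different: proportion of 'different location' responses at each.
    Returns None if no crossing occurs in the sampled range.
    """
    pairs = list(zip(separations_mm, p_different))
    for (s0, p0), (s1, p1) in zip(pairs, pairs[1:]):
        if p0 < 0.5 <= p1:
            return s0 + (0.5 - p0) * (s1 - s0) / (p1 - p0)
    return None
```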

31 pages, 3398 KB  
Article
Multimodal Smart-Skin for Real-Time Sitting Posture Recognition with Cross-Session Validation
by Giva Andriana Mutiara, Muhammad Rizqy Alfarisi, Paramita Mayadewi, Lisda Meisaroh and Periyadi
Multimodal Technol. Interact. 2026, 10(4), 39; https://doi.org/10.3390/mti10040039 - 9 Apr 2026
Viewed by 367
Abstract
Prolonged sitting with poor posture is associated with musculoskeletal disorders, reduced productivity, and long-term health risks. Many existing posture monitoring systems predominantly rely on single-modality sensing, such as pressure or vision-based approaches, limiting their ability to capture both static alignment and dynamic micro-movements. This study proposes a multimodal smart-skin system integrating pressure, temperature, and vibration sensors for sitting posture recognition. A total of 42 sensors distributed across 14 anatomical locations were deployed, generating 15,037 samples collected over three independent sessions to evaluate cross-session temporal generalization across nine posture classes under controlled experimental conditions. Two deep learning architectures—Temporal Convolutional Networks with Attention (TCN + Attn) and Convolutional Neural Network–Long Short-Term Memory (CNN − LSTM)—were compared under Leave-One-Session-Out (LOSO) cross-validation. TCN + Attn achieved 85.23% LOSO accuracy, outperforming CNN − LSTM by 2.56 percentage points while reducing training time by 36.7% and inference latency by 33.9%. Ablation analysis revealed that temperature sensing was the most discriminative unimodal modality (71.5% accuracy), and full multimodal fusion improved LOSO accuracy by 22.93% compared to pressure-only configurations. These results demonstrate the feasibility of multimodal smart-skin sensing combined with temporal convolutional modeling for cross-session posture recognition and indicate potential for efficient real-time, privacy-preserving ergonomic monitoring. This study should be interpreted as a controlled, single-subject proof-of-concept, and further validation in multi-subject and real-world environments is required to establish broader generalizability. Full article
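The Leave-One-Session-Out protocol named above is a standard way to test temporal generalization: each recording session is held out in turn, so accuracy is always measured on a session the model never saw during training. A minimal sketch, assuming samples are records tagged with a session identifier (the record layout is illustrative):

```python
def leave_one_session_out(samples):
    """Yield one (held_out, train, test) split per recording session.

    The model is trained on all other sessions and evaluated only on the
    held-out one, so test accuracy reflects cross-session generalization
    rather than within-session memorization.
    """
    sessions = sorted({s["session"] for s in samples})
    for held_out in sessions:
        train = [s for s in samples if s["session"] != held_out]
        test = [s for s in samples if s["session"] == held_out]
        yield held_out, train, test
```

With the study's three sessions, this yields three train/test splits whose test accuracies are averaged into the reported LOSO figure.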

34 pages, 3911 KB  
Article
PAD-Guided Multimodal Hybrid Contrastive Emotion Recognition upon STEM-E2VA Dataset
by Shufei Duan, Wenjie Zhang, Liangqi Li, Ting Zhu, Fangyu Zhao, Fujiang Li and Huizhi Liang
Multimodal Technol. Interact. 2026, 10(4), 38; https://doi.org/10.3390/mti10040038 - 2 Apr 2026
Viewed by 520
Abstract
There are still challenges in speech emotion recognition, as the representation capability of single-modal information is limited, there are difficulties in capturing continuous emotional transitions in discrete emotion annotations, and the issues of modal structural differences and cross-sample alignment in multimodal fusion methods persist. To address these, this study undertakes work from both data and model perspectives. For data, a Chinese multimodal database STEM-E2VA was constructed, synchronously collecting four modalities of data: articulatory kinematics, acoustics, glottal signals, and videos. This covers seven discrete emotion categories and employs PAD continuous annotation. By integrating discrete and continuous dimensional annotations, it better represents the distinction between strong and weak emotions under the same discrete emotion label. Concurrently, to process the biases in PAD annotations, we employed the SCL-90 psychological questionnaire to analyze annotators’ cognitive and emotional perceptions, thereby ensuring data reliability. For model, this paper proposes a multimodal supervised contrastive fusion network incorporating PAD perception. It employs a PAD-enhanced hybrid contrastive loss function to optimize intra-model and inter-modal feature alignment. Utilizing a cross-attention mechanism combined with a GRU–Transformer network for temporal feature extraction, it achieves deep fusion of multimodal information, reducing inter-modal discrepancies and cross-class confusion. Experiments demonstrate that the proposed method achieves 85.47% accuracy in discrete sentiment recognition on STEM-E2VA, with a substantial reduction in RMSE for PAD dimension prediction. It also exhibits excellent generalization capability on IEMOCAP, providing a novel framework for integrating discrete and continuous sentiment representations. Full article
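The paper's PAD-enhanced hybrid contrastive loss is its own contribution and is not specified in the abstract. To make the underlying idea concrete, here is a plain supervised contrastive loss in the style of Khosla et al., which pulls same-label embeddings together and pushes other embeddings apart; the embeddings, labels, and temperature below are illustrative:

```python
import math

def supcon_loss(embeddings, labels, tau=0.1):
    """Plain supervised contrastive loss (not the paper's PAD-enhanced variant).

    For each anchor i, positives are the other samples with the same label;
    the denominator runs over all other samples. Lower loss means same-label
    embeddings are more similar than different-label ones.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    n, total = len(embeddings), 0.0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # anchors without positives contribute nothing
        denom = sum(math.exp(dot(embeddings[i], embeddings[a]) / tau)
                    for a in range(n) if a != i)
        total += -sum(
            math.log(math.exp(dot(embeddings[i], embeddings[p]) / tau) / denom)
            for p in positives
        ) / len(positives)
    return total / n
```

When embeddings are well clustered by label, the loss is near zero; mislabeled clusters drive it up, which is the gradient signal the paper's fusion network exploits for feature alignment.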

19 pages, 1426 KB  
Article
Ergonomic Evaluation of Augmented Reality-Based Visualization of Scattered Radiation Distribution During Partial-Angle CT
by Hiroaki Hasegawa
Multimodal Technol. Interact. 2026, 10(4), 37; https://doi.org/10.3390/mti10040037 - 2 Apr 2026
Viewed by 546
Abstract
Computed tomography (CT)-guided procedures require close proximity to the CT gantry or patient, increasing occupational exposure to scattered radiation. Even though radiation-protective equipment is commonly used, the optimization of CT fluoroscopic techniques remains important. Partial-angle CT (PACT) employs a limited exposure angle, producing cumulative scattered radiation distributions that vary with the selected angle and are difficult to estimate in advance. I aimed to develop an augmented reality (AR)-based visualization method for cumulative scattered radiation distributions during PACT and to evaluate its ergonomic feasibility as a proof of concept for occupational exposure reduction. An AR display system was developed to overlay cumulative scattered radiation distributions onto physical space using AR glasses. Workload was assessed using the NASA Task Load Index (NASA-TLX), and usability was assessed using the System Usability Scale (SUS). Compared with non-virtual conditions using radiation-protective glasses alone, AR-assisted visualization was associated with increased perceived workload, and usability scores were lower than those reported in previous AR studies. These findings indicate that, for AR display systems to support occupational exposure reduction, perceived task demands must be comparable to conventional protection strategies. Further improvements in visualization methods, user familiarity with AR environments, and ergonomic optimization are required to facilitate clinical implementation. Full article
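For reference, the classic weighted NASA-TLX score used for workload assessments like the one above is computed from six 0–100 subscale ratings, each weighted by how often its dimension was chosen in the 15 pairwise comparisons. The ratings and weights below are illustrative, not this study's data:

```python
def nasa_tlx(ratings, weights):
    """Classic weighted NASA-TLX overall workload score (0-100).

    ratings: six subscale ratings (mental, physical, temporal demand,
             performance, effort, frustration), each on 0-100;
    weights: times each dimension won its pairwise comparisons (sum to 15).
    """
    assert len(ratings) == len(weights) == 6 and sum(weights) == 15
    return sum(r * w for r, w in zip(ratings, weights)) / 15
```

The unweighted "raw TLX" variant, also common in HCI studies, simply averages the six ratings; the abstract does not state which variant was used here.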

20 pages, 8535 KB  
Article
The Emergent Rhythms of a Robot Vacuum Cleaner—An Empirically Grounded Account of Agential Realism
by Linus de Petris, Siamak Khatibi and Yuan Zhou
Multimodal Technol. Interact. 2026, 10(4), 36; https://doi.org/10.3390/mti10040036 - 1 Apr 2026
Viewed by 390
Abstract
This article builds on the argument that design for complex interactive systems should shift from creating linear transactional interactions toward organizing relational complexity. Grounded in Karen Barad’s agential realism, we argue that a designer’s role can benefit from not predefining interactions but from curating the material-discursive conditions under which meaningful relations can emerge. To explore the empirical and temporal dimensions of this practice, we conducted an exploratory workshop setting the conditions for emergent gameplay dynamics and discussions on agential realist anticipation. Participants utilized a custom-designed game and built their own physical controllers to anticipate and adapt to shifting gameplay conditions. Our results demonstrate how alterations in relational constraints, rather than explicit pre-programmed goals, drove the emergence of non-predefined gameplay rhythms. The findings provide empirical grounding for an agential realist understanding of anticipation, showing that an interactive system’s identity lies in its unfolding processual patterns rather than a static final state. Based on these findings, we propose three design principles for further exploration: Design for Relational Emergence, Design for Re-membering, and Design for Emergent Patterns. Consequently, we conclude by outlining a conceptual approach for non-linear computational architectures, drawing on principles from Enactive AI and reservoir computing. Full article

26 pages, 4885 KB  
Article
Reading Noise: Integrating Physiological Sensing and Sound-Driven Visualization to Externalize Noise-Related Cognitive Disruption During Reading
by Xueyi Li, Yonghong Liu, Zihui Jiang and Yangcheng Wang
Multimodal Technol. Interact. 2026, 10(4), 35; https://doi.org/10.3390/mti10040035 - 30 Mar 2026
Viewed by 573
Abstract
Environmental noise may interfere with the reading experience by increasing cognitive load and psychophysiological arousal, yet these effects are difficult to perceive and communicate in real time. This study presents Reading Noise, an interactive installation that combines physiological sensing and sound-driven visualization to externalize perceived noise-related disturbance and psychophysiological strain during reading. In a controlled experiment, 46 participants completed reading tasks under four levels of background conversational noise (0–30, 31–60, 61–90, and >90 dB) while ambient sound level, electrodermal activity (EDA), and electrocardiogram (ECG) were recorded in real time. Following data quality screening, inferential statistical analyses were performed on the analyzable physiological subset (n = 16). Based on these data, a hybrid mapping strategy combining rule-based assignment and LMM-informed exploratory calibration was developed to map acoustic and physiological changes onto dynamic text-based visual parameters, including deformation intensity, jitter, and motion instability, for real-time feedback. Within the analyzable subset, noise level was associated with significant changes in the recorded physiological indicators (all p < 0.05): skin conductance level (SCL) and skin conductance responses per minute (SCRs/min) increased (4.69 ± 2.13 to 5.93 ± 2.19 μS; 1.49 ± 1.59 to 2.51 ± 2.13), whereas the percentage of successive RR intervals differing by more than 50 ms (pNN50) and the root mean square of successive differences (RMSSD) decreased (15.84 ± 16.52% to 10.57 ± 11.35%; 36.63 ± 17.62 to 29.67 ± 16.66 ms). Subjective cognitive load also increased significantly (2.06 ± 0.29 to 6.38 ± 0.31). 
A follow-up installation study with 24 cross-disciplinary participants, with reported group interaction observations drawn from a 12-participant subset, suggested that the installation may facilitate shared interpretation of attention-related disruption and cognitive strain, indicating the potential of physiology-informed visual translation as a boundary object approach for empathetic, sound-mediated communication. Full article
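The heart-rate variability indicators reported above have standard definitions: RMSSD is the root mean square of successive RR-interval differences, and pNN50 is the percentage of successive differences exceeding 50 ms. A minimal sketch (the RR series below is illustrative, not the study's data):

```python
import math

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def pnn50(rr_ms):
    """Percentage of successive RR-interval differences exceeding 50 ms."""
    diffs = [abs(b - a) for a, b in zip(rr_ms, rr_ms[1:])]
    return 100.0 * sum(d > 50 for d in diffs) / len(diffs)
```

Both metrics track parasympathetic activity, which is why the reported decreases under louder noise are consistent with increased physiological strain.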

23 pages, 320 KB  
Article
Distributed Teaching Agency–AI in the University: A Typology Based on Student Voice
by Tomás Fontaines-Ruiz, Antonio Ponce-Rojo, Paolo Fabre Merchán, Walther Casimiro Urcos and Liliana Cánquiz Rincón
Multimodal Technol. Interact. 2026, 10(4), 34; https://doi.org/10.3390/mti10040034 - 27 Mar 2026
Viewed by 590
Abstract
Generative AI is reshaping university teaching and creating tension around authority, evidence, and accountability when decisions are made using algorithms. From a student perspective, this study constructed a typology of distributed teacher–AI agency (TAI) and examined the discursive mechanisms that produce the illusion of teacher autonomy. A non-experimental, cross-sectional, explanatory study was conducted: a lexicometric analysis of the ALCESTE (IRAMUTEQ) questionnaire, using open-ended responses from 3120 students (Mexico, n = 2051; Ecuador, n = 1069), segmented into 1077 units, and analyzed using positioning theory. Co-agency was operationalized using Teacher Agency (A), Delegation to AI (D), Governance (G: disclosure, criteria, verification), and the Illusion Index (II = A/(D + G + 1)). Three configurations emerged: Immediate Customizer (28.8%) with very high A and minimal D/G (II = 25.4); Technological Literacy Facilitator (27.3%) with visible delegation and safeguards (II ≈ 2.0); and Operational Optimizer (43.9%) oriented toward accelerating tasks with moderate governance (II ≈ 2.7). The illusion was associated with the agentive erasure of AI and a rhetoric of immediacy/efficiency that replaced verifiable criteria. These findings transform the student voice into a criteria-based diagnostic tool for strengthening traceability, minimal verification, and responsible orchestration of AI in higher education. Full article
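The Illusion Index defined in the abstract, II = A / (D + G + 1), is simple enough to state directly in code; the component values used below are illustrative, not the study's measured scores:

```python
def illusion_index(agency, delegation, governance):
    """Illusion Index as defined in the abstract: II = A / (D + G + 1).

    agency (A): teacher agency; delegation (D): delegation to AI;
    governance (G): disclosure, criteria, and verification practices.
    High A with near-zero D and G yields a large II, i.e. teaching that
    appears autonomous while AI involvement goes unacknowledged.
    """
    return agency / (delegation + governance + 1)
```

The +1 in the denominator keeps the index finite when both delegation and governance are zero, which is exactly the "agentive erasure" configuration the study associates with the highest illusion values.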