Search Results (48)

Search Parameters:
Keywords = voice-controlled robot

17 pages, 14849 KB  
Article
A Collaborative Robotic System for Autonomous Object Handling with Natural User Interaction
by Federico Neri, Gaetano Lettera, Giacomo Palmieri and Massimo Callegari
Robotics 2026, 15(3), 49; https://doi.org/10.3390/robotics15030049 - 27 Feb 2026
Viewed by 693
Abstract
In Industry 5.0, the transition from fixed traditional automation to flexible human–robot collaboration (HRC) requires interfaces that are both intuitive and efficient. This paper introduces a novel, multimodal control system for autonomous object handling, specifically designed to enhance natural user interaction in dynamic work environments. The system integrates a 6-Degrees-of-Freedom (DoF) collaborative robot (UR5e) with a hand-eye RGB-D vision system to achieve robust autonomy. The core technical contribution lies in a vision pipeline utilizing deep learning for object detection and point cloud processing for accurate 6D pose estimation, enabling advanced tasks such as human-aware object handover directly onto the operator’s hand. Crucially, an Automatic Speech Recognition (ASR) module is incorporated, providing a Natural Language Understanding (NLU) layer that allows operators to issue real-time commands for task modification, error correction, and object selection. Experimental results demonstrate that this multimodal approach offers a streamlined workflow that aims to improve operational flexibility compared to traditional HMIs, while enhancing the perceived naturalness of the collaborative task. The system establishes a framework for highly responsive and intuitive human–robot workspaces, advancing the state of the art in natural interaction for collaborative object manipulation.
(This article belongs to the Special Issue Human–Robot Collaboration in Industry 5.0)
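
For readers unfamiliar with the ASR-plus-NLU command layer described above, here is a minimal Python sketch of keyword-based intent dispatch; the vocabulary and handler functions are hypothetical and are not taken from the paper.

```python
# Minimal sketch of a voice-command intent dispatcher of the kind the
# abstract describes. All command phrases and handlers are hypothetical.

def pick_object(args):
    print(f"selecting object: {args}")

def handover(args):
    print("handing object to operator")

def stop(args):
    print("stopping motion")

# Map spoken keywords to robot actions (hypothetical vocabulary).
INTENTS = {"pick": pick_object, "handover": handover, "stop": stop}

def dispatch(transcript: str) -> None:
    """Route an ASR transcript to the first matching intent handler."""
    words = transcript.lower().split()
    for word in words:
        handler = INTENTS.get(word)
        if handler is not None:
            handler(" ".join(words))
            return
    print(f"no intent matched: {transcript!r}")

dispatch("please pick the red bracket")
```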

20 pages, 14885 KB  
Article
MultiPhysio-HRC: A Multimodal Physiological Signals Dataset for Industrial Human–Robot Collaboration
by Andrea Bussolan, Stefano Baraldo, Oliver Avram, Pablo Urcola, Luis Montesano, Luca Maria Gambardella and Anna Valente
Robotics 2025, 14(12), 184; https://doi.org/10.3390/robotics14120184 - 5 Dec 2025
Cited by 3 | Viewed by 1751
Abstract
Human–robot collaboration (HRC) is a key focus of Industry 5.0, aiming to enhance worker productivity while ensuring well-being. The ability to perceive human psycho-physical states, such as stress and cognitive load, is crucial for adaptive and human-aware robotics. This paper introduces MultiPhysio-HRC, a multimodal dataset containing physiological, audio, and facial data collected during real-world HRC scenarios. The dataset includes electroencephalography (EEG), electrocardiography (ECG), electrodermal activity (EDA), respiration (RESP), electromyography (EMG), voice recordings, and facial action units. The dataset integrates controlled cognitive tasks, immersive virtual reality experiences, and industrial disassembly activities performed manually and with robotic assistance to capture a holistic view of the participants’ mental states. Rich ground-truth annotations were obtained using validated psychological self-assessment questionnaires. Baseline models were evaluated for stress and cognitive load classification, demonstrating the dataset’s potential for affective computing and human-aware robotics research. MultiPhysio-HRC is publicly available to support research in human-centered automation, workplace well-being, and intelligent robotic systems.
(This article belongs to the Special Issue Human–Robot Collaboration in Industry 5.0)
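
As a rough illustration of the kind of baseline stress classification the abstract reports, here is a hedged scikit-learn sketch on stand-in features; the feature layout and labels are synthetic placeholders, not the MultiPhysio-HRC format.

```python
# Hedged sketch of a baseline stress classifier on tabular physiological
# features (e.g., HRV or EDA statistics per window). The random arrays
# below are stand-ins; the dataset's actual layout may differ.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))       # stand-in per-window feature matrix
y = rng.integers(0, 2, size=120)    # stress / no-stress labels

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"baseline accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```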

17 pages, 1520 KB  
Article
Exploring the Impacts of Service Robot Interaction Cues on Customer Experience in Small-Scale Self-Service Shops
by Wa Gao, Yuan Tian, Wanli Zhai, Yang Ji and Shiyi Shen
Sustainability 2025, 17(22), 10368; https://doi.org/10.3390/su172210368 - 19 Nov 2025
Cited by 2 | Viewed by 937
Abstract
Since service robots serving as salespersons are expected to be deployed efficiently and sustainably in retail environments, this paper explores the impacts of their interaction cues on customer experience within small-scale self-service shops. The corresponding customer experiences are discussed in terms of fluency, comfort, and likability. We analyzed customers’ shopping behaviors and designed fourteen body gestures for the robots, giving them the ability to select appropriate movements for different stages of shopping. Two experimental scenarios, with and without robots, were designed. For the scenario involving robots, eight cases with distinct interaction cues were implemented. Participants were recruited to measure their experiences, and statistical methods including repeated-measures ANOVA and regression analysis were used to analyze the data. The results indicate that robots relying solely on voice interaction are unable to significantly enhance the fluency, comfort, and likability experienced by customers. Combining a robot’s voice with the ability to imitate a human salesperson’s body movements is a feasible way to genuinely improve these customer experiences, and a robot’s body movements can positively influence customer experience in human–robot interactions (HRIs), while the use of colored light cannot. We also compiled design strategies for robot interaction cues from the perspectives of cost and controllable design. Furthermore, the relationships between fluency, comfort, and likability were discussed, providing meaningful insights for HRIs aimed at enhancing customer experiences.
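
The repeated-measures ANOVA named in the abstract can be run in Python with statsmodels; the sketch below uses toy data with hypothetical column names, not the study's measurements.

```python
# Sketch of a repeated-measures ANOVA with statsmodels. Column names
# and the toy ratings are hypothetical illustrations.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format data: one fluency rating per participant per condition.
df = pd.DataFrame({
    "participant": [1, 1, 2, 2, 3, 3, 4, 4],
    "condition":   ["voice", "voice+gesture"] * 4,
    "fluency":     [3.1, 4.2, 2.8, 4.0, 3.5, 4.4, 3.0, 3.9],
})

res = AnovaRM(df, depvar="fluency", subject="participant",
              within=["condition"]).fit()
print(res)
```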

12 pages, 890 KB  
Article
Control Modality and Accuracy on the Trust and Acceptance of Construction Robots
by Daeguk Lee, Donghun Lee, Jae Hyun Jung and Taezoon Park
Appl. Sci. 2025, 15(21), 11827; https://doi.org/10.3390/app152111827 - 6 Nov 2025
Viewed by 952
Abstract
This study investigates how control modalities and recognition accuracy influence construction workers’ trust and acceptance of collaborative robots. Sixty participants evaluated voice and gesture control under varying levels of recognition accuracy while performing tiling tasks together with collaborative robots. Experimental results indicated that recognition accuracy significantly affected perceived enjoyment (PE, p = 0.010), perceived ease of use (PEOU, p = 0.030), and intention to use (ITU, p = 0.022), but not trust, perceived usefulness (PU), or attitude (ATT). Furthermore, the interaction between control modality and accuracy shaped most acceptance factors (PE, p = 0.049; PEOU, p = 0.006; PU, p = 0.006; ATT, p = 0.003; and ITU, p < 0.001), except trust. In general, high recognition accuracy enhanced user experience and adoption intentions. Voice interfaces were favored when recognition accuracy was high, whereas gesture interfaces were more acceptable under low-accuracy conditions. These findings highlight the importance of designing high-accuracy, task-appropriate interfaces to support technology acceptance in construction. The preference for voice interfaces under accurate conditions aligns with the noisy, fast-paced nature of construction sites, where efficiency is paramount. By contrast, gesture interfaces offer resilience when recognition errors occur. The study provides practical guidance for robot developers, interface designers, and construction managers, emphasizing that carefully matching interaction modalities and accuracy levels to on-site demands can improve acceptance and long-term adoption in this traditionally conservative sector.
(This article belongs to the Special Issue Robot Control in Human–Computer Interaction)
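
The modality-by-accuracy interaction reported above is the kind of effect a two-way ANOVA tests; a minimal statsmodels sketch on toy data follows (the factor levels and scores are invented for illustration).

```python
# Sketch of testing a modality x accuracy interaction on intention-to-use
# (ITU) scores with a two-way ANOVA; toy data, not the study's.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "modality": ["voice", "voice", "gesture", "gesture"] * 6,
    "accuracy": (["high"] * 4 + ["low"] * 4) * 3,
    "itu":      [5, 6, 4, 4, 3, 4, 4, 5] * 3,
})

model = ols("itu ~ C(modality) * C(accuracy)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # main effects + interaction
```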

17 pages, 2127 KB  
Article
Leveraging Large Language Models for Real-Time UAV Control
by Kheireddine Choutri, Samiha Fadloun, Ayoub Khettabi, Mohand Lagha, Souham Meshoul and Raouf Fareh
Electronics 2025, 14(21), 4312; https://doi.org/10.3390/electronics14214312 - 2 Nov 2025
Cited by 3 | Viewed by 3121
Abstract
As drones become increasingly integrated into civilian and industrial domains, the demand for natural and accessible control interfaces continues to grow. Conventional manual controllers require technical expertise and impose cognitive overhead, limiting their usability in dynamic and time-critical scenarios. To address these limitations, this paper presents a multilingual voice-driven control framework for quadrotor drones, enabling real-time operation in both English and Arabic. The proposed architecture combines offline Speech-to-Text (STT) processing with large language models (LLMs) to interpret spoken commands and translate them into executable control code. Specifically, Vosk is employed for bilingual STT, while Google Gemini provides semantic disambiguation, contextual inference, and code generation. The system is designed for continuous, low-latency operation within an edge–cloud hybrid configuration, offering an intuitive and robust human–drone interface. While speech recognition and safety validation are processed entirely offline, high-level reasoning and code generation currently rely on cloud-based LLM inference. Experimental evaluation demonstrates an average speech recognition accuracy of 95% and end-to-end command execution latency between 300 and 500 ms, validating the feasibility of reliable, multilingual, voice-based UAV control. This research advances multimodal human–robot interaction by showcasing the integration of offline speech recognition and LLMs for adaptive, safe, and scalable aerial autonomy.
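
Vosk, the offline STT engine the abstract names, can be exercised with a few lines of Python; the model path and audio file below are placeholders, and the LLM step is only indicated, since the paper's prompting and validation pipeline are not reproduced here.

```python
# Minimal offline speech-to-text sketch with Vosk. The model directory
# and WAV file are placeholders (download a model from the Vosk site).
import json
import wave

from vosk import KaldiRecognizer, Model

model = Model("vosk-model-small-en-us-0.15")
wf = wave.open("command.wav", "rb")        # 16 kHz mono PCM expected
rec = KaldiRecognizer(model, wf.getframerate())

while True:
    chunk = wf.readframes(4000)
    if not chunk:
        break
    rec.AcceptWaveform(chunk)

text = json.loads(rec.FinalResult())["text"]
print("transcript:", text)
# Next step in the paper's pipeline: pass `text` to an LLM that emits
# validated flight-control code (omitted here).
```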

20 pages, 3942 KB  
Article
Self-Supervised Voice Denoising Network for Multi-Scenario Human–Robot Interaction
by Mu Li, Wenjin Xu, Chao Zeng and Ning Wang
Biomimetics 2025, 10(9), 603; https://doi.org/10.3390/biomimetics10090603 - 9 Sep 2025
Cited by 1 | Viewed by 1488
Abstract
Human–robot interaction (HRI) via voice commands has significantly advanced in recent years, with large Vision–Language–Action (VLA) models demonstrating particular promise in human–robot voice interaction. However, these systems still struggle with environmental noise contamination during voice interaction and lack a specialized denoising network for multi-speaker command isolation in overlapping speech scenarios. To overcome these challenges, we introduce a method to enhance voice-command-based HRI in noisy environments, leveraging synthetic data and a self-supervised denoising network to enhance real-world applicability. Our approach focuses on improving self-supervised network performance in denoising mixed-noise audio through training data scaling. Extensive experiments show our method outperforms existing approaches in simulation and achieves 7.5% higher accuracy than the state-of-the-art method in noisy real-world environments, enhancing voice-guided robot control.
(This article belongs to the Special Issue Intelligent Human–Robot Interaction: 4th Edition)
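
A simplified sketch of the synthetic-mixture training idea, training a denoiser to map clean-plus-noise audio back to clean audio, is given below in PyTorch; the tiny network and random tensors are stand-ins, not the paper's architecture or data.

```python
# Simplified denoiser training on synthetic mixtures: clean speech plus
# scaled noise in, clean speech as target. Toy model and random tensors.
import torch
import torch.nn as nn

denoiser = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=9, padding=4),
)
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    clean = torch.randn(8, 1, 16000)    # stand-in clean speech batch
    noise = torch.randn(8, 1, 16000)
    snr_scale = torch.rand(8, 1, 1)     # vary the noise level per sample
    noisy = clean + snr_scale * noise   # synthetic mixture
    loss = loss_fn(denoiser(noisy), clean)
    opt.zero_grad()
    loss.backward()
    opt.step()
```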

25 pages, 19135 KB  
Article
Development of a Multi-Platform AI-Based Software Interface for the Accompaniment of Children
by Isaac León, Camila Reyes, Iesus Davila, Bryan Puruncajas, Dennys Paillacho, Nayeth Solorzano, Marcelo Fajardo-Pruna, Hyungpil Moon and Francisco Yumbla
Multimodal Technol. Interact. 2025, 9(9), 88; https://doi.org/10.3390/mti9090088 - 26 Aug 2025
Viewed by 2305
Abstract
The absence of parental presence has a direct impact on the emotional stability and social routines of children, especially during extended periods of separation from their family environment, as in the case of daycare centers, hospitals, or when they remain alone at home. At the same time, the technology currently available to provide emotional support in these contexts remains limited. In response to the growing need for emotional support and companionship in child care, this project proposes the development of a multi-platform software architecture based on artificial intelligence (AI), designed to be integrated into humanoid robots that assist children between the ages of 6 and 14. The system enables daily verbal and non-verbal interactions intended to foster a sense of presence and personalized connection through conversations, games, and empathetic gestures. Built on the Robot Operating System (ROS), the software incorporates modular components for voice command processing, real-time facial expression generation, and joint movement control. These modules allow the robot to hold natural conversations, display dynamic facial expressions on its LCD (Liquid Crystal Display) screen, and synchronize gestures with spoken responses. Additionally, a graphical interface enhances the coherence between dialogue and movement, thereby improving the quality of human–robot interaction. Initial evaluations conducted in controlled environments assessed the system’s fluency, responsiveness, and expressive behavior. Subsequently, the system was implemented in a pediatric hospital in Guayaquil, Ecuador, where it accompanied children during their recovery. This type of AI-based software was observed to significantly enhance the children’s experience, opening promising opportunities for its application in clinical, educational, recreational, and other child-centered settings.
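
The ROS-based modular wiring the abstract describes might look like the following minimal rospy sketch; topic names, message types, and the gesture-forwarding rule are hypothetical.

```python
#!/usr/bin/env python
# Minimal ROS 1 node sketch: a voice-command topic in, a gesture command
# out, echoing the modular design the abstract describes. Topic names
# and the String message choice are illustrative assumptions.
import rospy
from std_msgs.msg import String

def on_voice_command(msg):
    # e.g., forward a recognized phrase to the expression/gesture module
    gesture_pub.publish(f"gesture_for:{msg.data}")

rospy.init_node("companion_voice_bridge")
gesture_pub = rospy.Publisher("/robot/gesture_cmd", String, queue_size=10)
rospy.Subscriber("/robot/voice_cmd", String, on_voice_command)
rospy.spin()
```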

26 pages, 6831 KB  
Article
Human–Robot Interaction and Tracking System Based on Mixed Reality Disassembly Tasks
by Raúl Calderón-Sesmero, Adrián Lozano-Hernández, Fernando Frontela-Encinas, Guillermo Cabezas-López and Mireya De-Diego-Moro
Robotics 2025, 14(8), 106; https://doi.org/10.3390/robotics14080106 - 30 Jul 2025
Cited by 4 | Viewed by 3637
Abstract
Disassembly is a crucial process in industrial operations, especially in tasks requiring high precision and strict safety standards when handling components with collaborative robots. However, traditional methods often rely on rigid, sequential task planning, which makes it difficult to adapt to unforeseen changes or dynamic environments. This rigidity not only limits flexibility but also leads to prolonged execution times, as operators must follow predefined steps that do not allow for real-time adjustments. Although techniques like teleoperation have attempted to address these limitations, they often hinder direct human–robot collaboration within the same workspace, reducing effectiveness in dynamic environments. In response to these challenges, this research introduces an advanced human–robot interaction (HRI) system leveraging a mixed-reality (MR) interface embedded in a head-mounted device (HMD). The system enables operators to issue real-time control commands using multimodal inputs, including voice, gestures, and gaze tracking. These inputs are synchronized and processed via the Robot Operating System (ROS2), enabling dynamic and flexible task execution. Additionally, the integration of deep learning algorithms ensures precise detection and validation of disassembly components, enhancing accuracy. Experimental evaluations demonstrate significant improvements, including reduced task completion times, enhanced operator experience, and strict adherence to safety standards. This scalable solution offers broad applicability for general-purpose disassembly tasks, making it well suited for complex industrial scenarios.
(This article belongs to the Special Issue Robot Teleoperation Integrating with Augmented Reality)
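
Synchronizing voice, gesture, and gaze inputs over ROS2, as described above, could be prototyped along these lines in rclpy; the topic names and the pass-through fusion rule are illustrative assumptions, not the paper's design.

```python
# ROS 2 (rclpy) sketch of funneling multimodal HMD inputs into one task
# command topic. A real system would synchronize and validate inputs.
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class MultimodalBridge(Node):
    def __init__(self):
        super().__init__("multimodal_bridge")
        self.cmd_pub = self.create_publisher(String, "/robot/task_cmd", 10)
        for topic in ("/hmd/voice", "/hmd/gesture", "/hmd/gaze"):
            self.create_subscription(String, topic, self.on_input, 10)

    def on_input(self, msg):
        out = String()
        out.data = msg.data          # trivial pass-through "fusion"
        self.cmd_pub.publish(out)

def main():
    rclpy.init()
    rclpy.spin(MultimodalBridge())

if __name__ == "__main__":
    main()
```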

21 pages, 1118 KB  
Review
Integrating Large Language Models into Robotic Autonomy: A Review of Motion, Voice, and Training Pipelines
by Yutong Liu, Qingquan Sun and Dhruvi Rajeshkumar Kapadia
AI 2025, 6(7), 158; https://doi.org/10.3390/ai6070158 - 15 Jul 2025
Cited by 9 | Viewed by 11360
Abstract
This survey provides a comprehensive review of the integration of large language models (LLMs) into autonomous robotic systems, organized around four key pillars: locomotion, navigation, manipulation, and voice-based interaction. We examine how LLMs enhance robotic autonomy by translating high-level natural language commands into low-level control signals, supporting semantic planning and enabling adaptive execution. Systems like SayTap improve gait stability through LLM-generated contact patterns, while TrustNavGPT achieves a 5.7% word error rate (WER) under noisy voice-guided conditions by modeling user uncertainty. Frameworks such as MapGPT, LLM-Planner, and 3D-LOTUS++ integrate multi-modal data—including vision, speech, and proprioception—for robust planning and real-time recovery. We also highlight the use of physics-informed neural networks (PINNs) to model object deformation and support precision in contact-rich manipulation tasks. To bridge the gap between simulation and real-world deployment, we synthesize best practices from benchmark datasets (e.g., RH20T, Open X-Embodiment) and training pipelines designed for one-shot imitation learning and cross-embodiment generalization. Additionally, we analyze deployment trade-offs across cloud, edge, and hybrid architectures, emphasizing latency, scalability, and privacy. The survey concludes with a multi-dimensional taxonomy and cross-domain synthesis, offering design insights and future directions for building intelligent, human-aligned robotic systems powered by LLMs.
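
Since the survey quotes word error rates (WER), a reference implementation of the metric, word-level edit distance divided by reference length, may be useful:

```python
# WER = word-level Levenshtein distance(reference, hypothesis) / |reference|
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Edit distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(ref)

print(wer("go to the red door", "go to red door"))  # -> 0.2
```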

29 pages, 7197 KB  
Review
Recent Advances in Electrospun Nanofiber-Based Self-Powered Triboelectric Sensors for Contact and Non-Contact Sensing
by Jinyue Tian, Jiaxun Zhang, Yujie Zhang, Jing Liu, Yun Hu, Chang Liu, Pengcheng Zhu, Lijun Lu and Yanchao Mao
Nanomaterials 2025, 15(14), 1080; https://doi.org/10.3390/nano15141080 - 11 Jul 2025
Cited by 7 | Viewed by 2976
Abstract
Electrospun nanofiber-based triboelectric nanogenerators (TENGs) have emerged as a highly promising class of self-powered sensors for a broad range of applications, particularly in intelligent sensing technologies. By combining the advantages of electrospinning and triboelectric nanogenerators, these sensors offer superior characteristics such as high sensitivity, mechanical flexibility, lightweight structure, and biocompatibility, enabling their integration into wearable electronics and biomedical interfaces. This review presents a comprehensive overview of recent progress in electrospun nanofiber-based TENGs, covering their working principles, operating modes, and material composition. Both pure polymer and composite nanofibers are discussed, along with various electrospinning techniques that enable control over morphology and performance at the nanoscale. We explore their practical implementations in both contact-type and non-contact-type sensing, such as human–machine interaction, physiological signal monitoring, gesture recognition, and voice detection. These applications demonstrate the potential of TENGs to enable intelligent, low-power, and real-time sensing systems. Furthermore, this paper points out critical challenges and future directions, including durability under long-term operation, scalable and cost-effective fabrication, and seamless integration with wireless communication and artificial intelligence technologies. With ongoing advancements in nanomaterials, fabrication techniques, and system-level integration, electrospun nanofiber-based TENGs are expected to play a pivotal role in shaping the next generation of self-powered, intelligent sensing platforms across diverse fields such as healthcare, environmental monitoring, robotics, and smart wearable systems.
(This article belongs to the Special Issue Self-Powered Flexible Sensors Based on Triboelectric Nanogenerators)

28 pages, 1791 KB  
Article
Speech Recognition-Based Wireless Control System for Mobile Robotics: Design, Implementation, and Analysis
by Sandeep Gupta, Udit Mamodiya and Ahmed J. A. Al-Gburi
Automation 2025, 6(3), 25; https://doi.org/10.3390/automation6030025 - 24 Jun 2025
Cited by 9 | Viewed by 6102
Abstract
This paper describes an innovative wireless mobile robotics control system based on speech recognition, in which an ESP32 microcontroller controls the motors and handles Bluetooth communication, while an Android application performs the real-time speech recognition. With speech processed on the Android device and motor commands handled on the ESP32, the study achieves significant performance gains through a distributed architecture while maintaining low latency for feedback control. In experimental tests over a range of 1–10 m, stable 110–140 ms command latencies with low variation (±15 ms) were observed. The system’s voice and manual button modes both yield over 92% accuracy with the aid of natural language processing, resulting in low training requirements and strong performance in high-noise environments. The novelty of this work lies in an adaptive keyword-spotting algorithm that improves recognition performance in high-noise environments and a gradual latency management system that optimizes processing parameters in the presence of noise. By providing a user-friendly, real-time speech interface, this work enhances human–robot interaction for future assistive devices, educational platforms, and advanced automated navigation research.
(This article belongs to the Section Robotics and Autonomous Systems)
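
On the host side, a Bluetooth SPP link of the kind the paper uses can be driven with pyserial; the port name and the single-byte command protocol below are assumptions for illustration, not the paper's protocol.

```python
# Host-side sketch of sending recognized commands to a robot over a
# Bluetooth SPP virtual serial port. Port name and byte codes are
# hypothetical placeholders.
import serial

COMMANDS = {"forward": b"F", "back": b"B", "left": b"L",
            "right": b"R", "stop": b"S"}

with serial.Serial("/dev/rfcomm0", 115200, timeout=1) as link:
    for word in "forward stop".split():
        link.write(COMMANDS[word])   # one byte per recognized command
```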

25 pages, 7813 KB  
Article
Deep Learning-Based Speech Recognition and LabVIEW Integration for Intelligent Mobile Robot Control
by Kai-Chao Yao, Wei-Tzer Huang, Hsi-Huang Hsieh, Teng-Yu Chen, Wei-Sho Ho, Jiunn-Shiou Fang and Wei-Lun Huang
Actuators 2025, 14(5), 249; https://doi.org/10.3390/act14050249 - 15 May 2025
Cited by 1 | Viewed by 2472
Abstract
This study implemented an innovative system that trains a speech recognition model based on the DeepSpeech2 architecture using Python for voice control of a robot on the LabVIEW platform. First, a speech recognition model based on the DeepSpeech2 architecture was trained on a large speech dataset, enabling it to accurately transcribe voice commands. Then, this model was integrated with the LabVIEW graphical user interface and the myRIO controller. By leveraging LabVIEW’s graphical programming environment, the system processes voice commands, translates them into control signals, and directs the robot’s movements accordingly. Experimental results demonstrate that the system not only accurately recognizes various voice commands but also controls the robot’s behavior in real time, showing high practicality and reliability. This study addresses the limitations inherent in conventional voice control methods, demonstrates the potential of integrating deep learning technology with industrial control platforms, and presents a novel approach for robotic voice control.
(This article belongs to the Section Actuators for Robotics)
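
Before a transcript reaches a controller such as LabVIEW/myRIO, it must be mapped to a discrete control signal; the fuzzy-matching sketch below is one illustrative way to do this and is not the paper's method.

```python
# Mapping a (possibly imperfect) transcript to a discrete control code
# with stdlib fuzzy matching. Vocabulary and codes are hypothetical.
import difflib

VOCAB = {"move forward": 1, "move backward": 2,
         "turn left": 3, "turn right": 4, "stop": 0}

def to_control_signal(transcript: str) -> int | None:
    match = difflib.get_close_matches(transcript.lower(),
                                      VOCAB.keys(), n=1, cutoff=0.6)
    return VOCAB[match[0]] if match else None

print(to_control_signal("moof forward"))  # tolerant of ASR slips -> 1
```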

15 pages, 587 KB  
Systematic Review
AI Applications to Reduce Loneliness Among Older Adults: A Systematic Review of Effectiveness and Technologies
by Yuyi Yang, Chenyu Wang, Xiaoling Xiang and Ruopeng An
Healthcare 2025, 13(5), 446; https://doi.org/10.3390/healthcare13050446 - 20 Feb 2025
Cited by 33 | Viewed by 15648
Abstract
Background/Objectives: Loneliness among older adults is a prevalent issue, significantly impacting their quality of life and increasing the risk of physical and mental health complications. The application of artificial intelligence (AI) technologies in behavioral interventions offers a promising avenue for overcoming challenges in designing and implementing interventions to reduce loneliness by enabling personalized and scalable solutions. This study systematically reviews AI-enabled interventions for addressing loneliness among older adults, focusing on their effectiveness and the underlying technologies used. Methods: A systematic search was conducted across eight electronic databases, including PubMed and Web of Science, for studies published up to 31 January 2024. Inclusion criteria were experimental studies involving AI applications to mitigate loneliness among adults aged 55 and older. Data on participant demographics, intervention characteristics, AI methodologies, and effectiveness outcomes were extracted and synthesized. Results: Nine studies were included, comprising six randomized controlled trials and three pre–post designs. The most frequently implemented AI technologies were speech recognition (n = 6) and emotion recognition and simulation (n = 5). Intervention types varied, with six studies employing social robots, two utilizing personal voice assistants, and one using a digital human facilitator. Six studies reported significant reductions in loneliness, particularly those utilizing social robots, which demonstrated emotional engagement and personalized interactions. Three studies reported non-significant effects, often due to shorter intervention durations or limited interaction frequencies. Conclusions: AI-driven interventions show promise in reducing loneliness among older adults. Future research should focus on long-term, culturally competent solutions that integrate quantitative and qualitative findings to optimize intervention design and scalability.

19 pages, 2527 KB  
Article
The Use of Voice Control in 3D Medical Data Visualization: Implementation, Legal, and Ethical Issues
by Miklos Vincze, Bela Molnar and Miklos Kozlovszky
Information 2025, 16(1), 12; https://doi.org/10.3390/info16010012 - 30 Dec 2024
Viewed by 2159
Abstract
Voice-controlled devices are becoming increasingly common in our everyday lives as well as in medicine, whether in smartphones with voice assistants that make it easier to access functions, IoT (Internet of Things) devices that let us control parts of our home with voice commands using sensors and different communication networks, or medical robots that a doctor can control with voice instructions. Over the last decade, systems using voice control have made great progress, both in the accuracy of voice processing and in usability. The topic of voice control is intertwined with the application of artificial intelligence (AI), as the mapping of spoken commands into written text and their interpretation is mostly conducted by some kind of trained AI model. Our research had two objectives. The first was to design and develop a system that enables doctors to evaluate medical data in 3D using voice control. The second was to describe the legal and ethical issues involved in using AI-based solutions for voice control. During our research, we created a voice control module for an existing software package called PathoVR, using a model trained by Google to interpret the voice commands given by the user. The research presented in this paper can be divided into two parts. In the first, we designed and developed a system that allows the user to evaluate 3D pathological medical serial sections using voice commands. In the second, we investigated the legal and ethical issues that may arise when using voice control in the medical field. We have identified legal and ethical barriers to the use of artificial intelligence in voice control that need to be addressed in order to make this technology part of everyday medicine.
(This article belongs to the Special Issue Feature Papers in Artificial Intelligence 2024)
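
A Google-backed voice command layer like the PathoVR module described above can be prototyped with the SpeechRecognition package (PyAudio is required for microphone input); the viewer actions below are hypothetical placeholders, not the paper's command set.

```python
# Sketch: recognize a spoken phrase via Google's web speech API and
# dispatch a 3D-viewer action. Requires network access; actions are
# invented placeholders.
import speech_recognition as sr

ACTIONS = {"rotate": "viewer.rotate", "zoom in": "viewer.zoom_in",
           "next slice": "viewer.next_slice"}

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    audio = recognizer.listen(source)
text = recognizer.recognize_google(audio).lower()

for phrase, action in ACTIONS.items():
    if phrase in text:
        print("dispatching", action)
        break
```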

13 pages, 1449 KB  
Article
Evaluating the User Experience and Usability of the MINI Robot for Elderly Adults with Mild Dementia and Mild Cognitive Impairment: Insights and Recommendations
by Aysan Mahmoudi Asl, Jose Miguel Toribio-Guzmán, Álvaro Castro-González, María Malfaz, Miguel A. Salichs and Manuel Franco Martín
Sensors 2024, 24(22), 7180; https://doi.org/10.3390/s24227180 - 8 Nov 2024
Cited by 8 | Viewed by 2585
Abstract
Introduction: In recent years, the integration of robotic systems into various aspects of daily life has become increasingly common. As these technologies continue to advance, ensuring user-friendly interfaces and seamless interactions becomes more essential. For social robots to genuinely provide lasting value to humans, a favourable user experience (UX) emerges as an essential prerequisite. This article aimed to evaluate the usability of the MINI robot, highlighting its strengths and areas for improvement based on user feedback and performance. Materials and Methods: In a controlled lab setting, a mixed-method qualitative study was conducted with ten individuals aged 65 and above diagnosed with mild dementia (MD) and mild cognitive impairment (MCI). Participants engaged in individual MINI robot interaction sessions, completing cognitive tasks as per written instructions. Video and audio recordings documented interactions, while post-session System Usability Scale (SUS) questionnaires quantified usability perception. Ethical guidelines were followed, ensuring informed consent, and the data underwent qualitative and quantitative analyses, contributing insights into the MINI robot’s usability for this demographic. Results: The study highlights the ongoing challenges that the tasks present, especially for MD individuals, emphasizing the importance of user support. Most tasks require both verbal and physical interactions, indicating that MD individuals face challenges when switching response methods within subtasks. These complexities originate from the selection and use of response methods, including difficulties with voice recognition, tablet touch, and tactile sensors. These challenges persist across tasks, with individuals with MD struggling to comprehend task instructions and provide correct answers, and individuals with MCI struggling to use response devices, often due to the limitations of the robot’s speech recognition. Technical shortcomings have been identified. The results of the SUS indicate positive perceptions, although there are lower ratings for instructor assistance and pre-use learning. The average SUS score of 68.3 places device usability in the “good” category. Conclusions: Our study examines the usability of the MINI robot, revealing strengths in quick learning, simple system design and operation, and integration of features, while also highlighting areas for improvement. Careful design and modifications are essential for meaningful engagement with people with dementia. The robot could better benefit people with MD and MCI if clear, detailed instructions and instructor assistance were available.
(This article belongs to the Section Sensors and Robotics)
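
The SUS score of 68.3 quoted above follows the standard scoring rule: odd items contribute (score - 1), even items contribute (5 - score), and the sum is scaled by 2.5 to a 0-100 range. A minimal implementation:

```python
# Standard System Usability Scale (SUS) scoring for ten 1-5 items.
def sus_score(responses):
    """responses: ten item scores, each 1-5, in questionnaire order."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r   # i=0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```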
