Search Results (557)

Search Parameters:
Keywords = human voice

16 pages, 2440 KiB  
Article
Dog–Stranger Interactions Can Facilitate Canine Incursion into Wilderness: The Role of Food Provisioning and Sociability
by Natalia Rojas-Troncoso, Valeria Gómez-Silva, Annegret Grimm-Seyfarth and Elke Schüttler
Biology 2025, 14(8), 1006; https://doi.org/10.3390/biology14081006 (registering DOI) - 6 Aug 2025
Abstract
Most research on domestic dog (Canis familiaris) behavior has focused on pets with restricted movement. However, free-ranging dogs exist in diverse cultural contexts globally, and their interactions with humans are less understood. Tourists can facilitate unrestricted dog movement into wilderness areas, where they may negatively impact wildlife. This study investigated which stimuli—namely, voice, touch, or food—along with inherent factors (age, sex, sociability) motivate free-ranging dogs to follow a human stranger. We measured the distance (up to 600 m) of 129 free-ranging owned and stray dogs from three villages in southern Chile as they followed an experimenter who presented them with one of the above stimuli or none (control). To evaluate the effect of dog sociability (i.e., positive versus stress-related or passive behaviors), we performed a 30 s socialization test (standing near the dog without interacting) before presenting a 10 s stimulus twice. We also tracked whether the dog was in the company of other dogs. Each focus dog was video-recorded and tested up to three times over five days. Generalized linear mixed-effects models revealed that dogs’ motivation to follow a stranger was significantly influenced by the food stimulus, by a high proportion of sociable behaviors directed towards humans, and by the company of other dogs during the experiment. Juveniles tended to follow a stranger more than adults or seniors, but no effects were found for the dog’s sex, whether an owner was present, the repetition of trials, the location where the study was performed, or for individuals as a random variable. This research highlights that sociability, as an inherent factor, shapes dog–stranger interactions in free-ranging dogs when food is given. In the context of wildlife conservation, we recommend that managers promote awareness among local communities and tourists to avoid feeding dogs, especially during outdoor activities close to wilderness. Full article
(This article belongs to the Special Issue Biology, Ecology, Management and Conservation of Canidae)
25 pages, 2082 KiB  
Article
XTTS-Based Data Augmentation for Profanity Keyword Recognition in Low-Resource Speech Scenarios
by Shin-Chi Lai, Yi-Chang Zhu, Szu-Ting Wang, Yen-Ching Chang, Ying-Hsiu Hung, Jhen-Kai Tang and Wen-Kai Tsai
Appl. Syst. Innov. 2025, 8(4), 108; https://doi.org/10.3390/asi8040108 - 31 Jul 2025
Viewed by 174
Abstract
As voice cloning technology rapidly advances, the risk of personal voices being misused by malicious actors for fraud or other illegal activities has significantly increased, making the collection of speech data increasingly challenging. To address this issue, this study proposes a data augmentation method based on XText-to-Speech (XTTS) synthesis to tackle the challenges of small-sample, multi-class speech recognition, using profanity as a case study to achieve high-accuracy keyword recognition. Two models were therefore evaluated: a CNN model (Proposed-I) and a CNN-Transformer hybrid model (Proposed-II). Proposed-I leverages local feature extraction, improving accuracy on a real human speech (RHS) test set from 55.35% without augmentation to 80.36% with XTTS-enhanced data. Proposed-II integrates CNN’s local feature extraction with Transformer’s long-range dependency modeling, further boosting test set accuracy to 88.90% while reducing the parameter count by approximately 41%, significantly enhancing computational efficiency. Compared to a previously proposed incremental architecture, the Proposed-II model achieves an 8.49% higher accuracy while reducing parameters by about 98.81% and MACs by about 98.97%, demonstrating exceptional resource efficiency. By utilizing XTTS and public corpora to generate a novel keyword speech dataset, this study enhances sample diversity and reduces reliance on large-scale original speech data. Experimental analysis reveals that an optimal synthetic-to-real speech ratio of 1:5 significantly improves the overall system accuracy, effectively addressing data scarcity. Additionally, the Proposed-I and Proposed-II models achieve accuracies of 97.54% and 98.66%, respectively, in distinguishing real from synthetic speech, demonstrating their strong potential for speech security and anti-spoofing applications. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
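A minimal sketch of the data-mixing step implied by the reported 1:5 synthetic-to-real ratio: XTTS-synthesized clips are blended into a real-speech training list. The file paths, label scheme, and helper function are illustrative assumptions, not the authors' pipeline.

```python
import random

def mix_training_set(real_clips, synthetic_clips, synth_to_real=1/5, seed=0):
    """Combine real and XTTS-synthesized clips at a target synthetic-to-real ratio.

    real_clips / synthetic_clips: lists of (audio_path, keyword_label) pairs.
    With synth_to_real=1/5, one synthetic clip is added for every five real ones,
    mirroring the 1:5 ratio reported as optimal in the abstract above.
    """
    rng = random.Random(seed)
    n_synth = int(len(real_clips) * synth_to_real)
    picked = rng.sample(synthetic_clips, min(n_synth, len(synthetic_clips)))
    mixed = real_clips + picked
    rng.shuffle(mixed)
    return mixed

# Hypothetical usage: paths and labels are placeholders, not the authors' dataset.
real = [(f"real/clip_{i}.wav", i % 10) for i in range(500)]
synth = [(f"xtts/clip_{i}.wav", i % 10) for i in range(500)]
train_set = mix_training_set(real, synth)
print(len(train_set))  # 500 real + 100 synthetic = 600
```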
14 pages, 283 KiB  
Article
Teens, Tech, and Talk: Adolescents’ Use of and Emotional Reactions to Snapchat’s My AI Chatbot
by Gaëlle Vanhoffelen, Laura Vandenbosch and Lara Schreurs
Behav. Sci. 2025, 15(8), 1037; https://doi.org/10.3390/bs15081037 - 30 Jul 2025
Viewed by 238
Abstract
Due to technological advancements such as generative artificial intelligence (AI) and large language models, chatbots enable increasingly human-like, real-time conversations through text (e.g., OpenAI’s ChatGPT) and voice (e.g., Amazon’s Alexa). One AI chatbot that is specifically designed to meet the social-supportive needs of youth is Snapchat’s My AI. Given its increasing popularity among adolescents, the present study investigated whether adolescents’ likelihood of using My AI, as well as their positive or negative emotional experiences from interacting with the chatbot, is related to socio-demographic factors (i.e., gender, age, and socioeconomic status (SES)). A cross-sectional study was conducted among 303 adolescents (64.1% girls, 35.9% boys, 1.0% other, 0.7% preferred not to say their gender; Mage = 15.89, SDage = 1.69). The findings revealed that younger adolescents were more likely to use My AI and experienced more positive emotions from these interactions than older adolescents. No significant relationships were found for gender or SES. These results highlight the potential for age to play a critical role in shaping adolescents’ engagement with AI chatbots on social media and their emotional outcomes from such interactions, underscoring the need to consider developmental factors in AI design and policy. Full article
26 pages, 6831 KiB  
Article
Human–Robot Interaction and Tracking System Based on Mixed Reality Disassembly Tasks
by Raúl Calderón-Sesmero, Adrián Lozano-Hernández, Fernando Frontela-Encinas, Guillermo Cabezas-López and Mireya De-Diego-Moro
Robotics 2025, 14(8), 106; https://doi.org/10.3390/robotics14080106 - 30 Jul 2025
Viewed by 199
Abstract
Disassembly is a crucial process in industrial operations, especially in tasks requiring high precision and strict safety standards when handling components with collaborative robots. However, traditional methods often rely on rigid and sequential task planning, which makes it difficult to adapt to unforeseen changes or dynamic environments. This rigidity not only limits flexibility but also leads to prolonged execution times, as operators must follow predefined steps that do not allow for real-time adjustments. Although techniques like teleoperation have attempted to address these limitations, they often hinder direct human–robot collaboration within the same workspace, reducing effectiveness in dynamic environments. In response to these challenges, this research introduces an advanced human–robot interaction (HRI) system leveraging a mixed-reality (MR) interface embedded in a head-mounted device (HMD). The system enables operators to issue real-time control commands using multimodal inputs, including voice, gestures, and gaze tracking. These inputs are synchronized and processed via the Robot Operating System (ROS2), enabling dynamic and flexible task execution. Additionally, the integration of deep learning algorithms ensures precise detection and validation of disassembly components, enhancing accuracy. Experimental evaluations demonstrate significant improvements, including reduced task completion times, enhanced operator experience, and strict adherence to safety standards. This scalable solution offers broad applicability for general-purpose disassembly tasks, making it well-suited for complex industrial scenarios. Full article
(This article belongs to the Special Issue Robot Teleoperation Integrating with Augmented Reality)
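A minimal sketch, assuming a standard ROS 2 Python setup, of how a recognized voice command could be forwarded to the robot stack described above. The topic name, message type, and command string are assumptions, not the paper's actual interface definitions.

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class VoiceCommandPublisher(Node):
    """Publishes recognized operator voice commands on an assumed ROS 2 topic."""
    def __init__(self):
        super().__init__("voice_command_publisher")
        self.pub = self.create_publisher(String, "/operator/voice_cmd", 10)

    def send(self, text: str):
        msg = String()
        msg.data = text
        self.pub.publish(msg)
        self.get_logger().info(f"published voice command: {text}")

def main():
    rclpy.init()
    node = VoiceCommandPublisher()
    node.send("unscrew front panel")  # would normally come from speech recognition
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```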
22 pages, 6359 KiB  
Article
Development and Testing of an AI-Based Specific Sound Detection System Integrated on a Fixed-Wing VTOL UAV
by Gabriel-Petre Badea, Mădălin Dombrovschi, Tiberius-Florian Frigioescu, Maria Căldărar and Daniel-Eugeniu Crunteanu
Acoustics 2025, 7(3), 48; https://doi.org/10.3390/acoustics7030048 - 30 Jul 2025
Viewed by 232
Abstract
This study presents the development and validation of an AI-based system for detecting chainsaw sounds, integrated into a fixed-wing VTOL UAV. The system employs a convolutional neural network trained on log-mel spectrograms derived from four sound classes: chainsaw, music, electric drill, and human voices. Initial validation was performed through ground testing. Acoustic data acquisition is optimized during cruise flight, when wing-mounted motors are shut down and the rear motor operates at 40–60% capacity, significantly reducing noise interference. To address residual motor noise, a preprocessing module was developed using reference recordings obtained in an anechoic chamber. Two configurations were tested to capture the motor’s acoustic profile by changing the UAV’s orientation relative to the fixed microphone. The embedded system processes incoming audio in real time, enabling low-latency classification without data transmission. Field experiments confirmed the model’s high precision and robustness under varying flight and environmental conditions. Results validate the feasibility of real-time, onboard acoustic event detection using spectrogram-based deep learning on UAV platforms, and support its applicability for scalable aerial monitoring tasks. Full article
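A minimal sketch of the log-mel spectrogram front end that such a classifier typically consumes, using librosa; the sample rate, mel-band count, and window settings are assumptions, since the paper's exact configuration is not given here.

```python
import librosa
import numpy as np

def log_mel_spectrogram(path, sr=22050, n_mels=64, n_fft=1024, hop_length=512):
    """Compute a log-mel spectrogram of the kind used as CNN input above.

    All parameter values here (sample rate, mel bands, FFT size) are
    illustrative assumptions, not the paper's stated settings.
    """
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft,
                                         hop_length=hop_length, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, frames)

# A CNN would then classify each spectrogram into one of the four classes:
# chainsaw, music, electric drill, or human voices.
```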
20 pages, 7061 KiB  
Article
Soundscapes and Emotional Experiences in World Heritage Temples: Implications for Religious Architectural Design
by Yanling Li, Xiaocong Li and Ming Gao
Buildings 2025, 15(15), 2681; https://doi.org/10.3390/buildings15152681 - 29 Jul 2025
Viewed by 184
Abstract
The impact of soundscapes in religious architecture on public psychology has garnered increasing attention in both research and policy domains. However, the mechanisms by which temple soundscapes influence public emotions remain scientifically unclear. This paper aims to explore how soundscapes in temple architectures designated as World Natural and Cultural Heritage sites affect visitors’ experiences. Considering visitors with diverse social and demographic backgrounds, the research design includes subjective soundscape evaluations and EEG measurements from 193 visitors at two World Heritage temples. The results indicate that visitors’ religious beliefs primarily affect their soundscape perception, while their soundscape preferences show specific correlations with chanting and human voices. Furthermore, compared to males, females exhibit greater sensitivity to emotional variations induced by soundscape experiences. Urban architects can enhance visitors’ positive emotional experiences by integrating soundscape design into the planning of future religious architectures, thereby creating pleasant acoustic environments. Full article
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
17 pages, 8512 KiB  
Article
Interactive Holographic Display System Based on Emotional Adaptability and CCNN-PCG
by Yu Zhao, Zhong Xu, Ting-Yu Zhang, Meng Xie, Bing Han and Ye Liu
Electronics 2025, 14(15), 2981; https://doi.org/10.3390/electronics14152981 - 26 Jul 2025
Viewed by 315
Abstract
Against the backdrop of the rapid advancement of intelligent speech interaction and holographic display technologies, this paper introduces an interactive holographic display system. It applies 2D-to-3D conversion during data acquisition and uses a Complex-valued Convolutional Neural Network Point Cloud Gridding (CCNN-PCG) algorithm to generate a computer-generated hologram (CGH) with depth information from the resulting point cloud data. During digital human hologram construction, this 2D-to-3D conversion yields high-precision point cloud data. The system uses ChatGLM for natural language processing and emotion-adaptive responses, enabling multi-turn voice dialogs and text-driven model generation. The CCNN-PCG algorithm reduces computational complexity and improves display quality. Simulations and experiments show that CCNN-PCG enhances reconstruction quality and speeds up computation by over 2.2 times. This research provides a theoretical framework and practical technology for holographic interactive systems, applicable in virtual assistants, educational displays, and other fields. Full article
(This article belongs to the Special Issue Artificial Intelligence, Computer Vision and 3D Display)
23 pages, 3210 KiB  
Article
Design and Optimization of Intelligent High-Altitude Operation Safety System Based on Sensor Fusion
by Bohan Liu, Tao Gong, Tianhua Lei, Yuxin Zhu, Yijun Huang, Kai Tang and Qingsong Zhou
Sensors 2025, 25(15), 4626; https://doi.org/10.3390/s25154626 - 25 Jul 2025
Viewed by 242
Abstract
In the field of high-altitude operations, the frequent occurrence of fall accidents is usually closely related to improper safety measures, such as the incorrect use of safety locks and the wrong installation of safety belts. At present, manual inspection cannot monitor the safety status of operators in real time and is prone to serious consequences due to human negligence. This paper designs a new type of high-altitude operation safety device based on the STM32F103 microcontroller. The device integrates ultra-wideband (UWB) ranging technology, thin-film piezoresistive stress sensors, Beidou positioning, an intelligent voice alarm, and an intelligent safety lock. By fusing these five modules, it performs safety status detection and precise positioning. It provides workers with precise geographical coordinates and the vertical distance to the ground, ensuring the safety and standardization of the operation process. The device adopts multi-modal fusion technology for high-altitude operation safety monitoring. The UWB module uses a bidirectional ranging algorithm to achieve centimeter-level ranging accuracy. It can accurately determine dangerous heights of 2 m or more even in non-line-of-sight environments, and its vertical ranging upper limit of 50 m meets the maintenance height requirements of most transmission and distribution line towers. It innovatively uses a silicon carbide MEMS piezoresistive sensor, which offers sensitive stress detection and resistance to high temperatures and radiation. A Beidou and Bluetooth cooperative positioning system achieves centimeter-level positioning accuracy and an identification accuracy rate of over 99%, and it maintains meter-level positioning accuracy of geographical coordinates in complex environments. The development of this safety device can build a comprehensive and intelligent safety protection barrier for workers engaged in high-altitude operations. Full article
(This article belongs to the Section Electronic Sensors)
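As a rough illustration of the time-of-flight relation behind UWB ranging (the paper uses a bidirectional ranging algorithm, whose details are not reproduced here), a simplified single-sided two-way ranging sketch:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def two_way_ranging_distance(t_round_s, t_reply_s):
    """Simplified single-sided two-way ranging.

    t_round_s: time from sending the poll to receiving the response (initiator clock).
    t_reply_s: processing delay at the responder between receiving and replying.
    This is only a textbook-style sketch of the underlying time-of-flight relation,
    not the paper's bidirectional algorithm.
    """
    time_of_flight = (t_round_s - t_reply_s) / 2.0
    return SPEED_OF_LIGHT * time_of_flight

# Example: a 2 m range corresponds to roughly 6.67 ns of one-way flight time.
d = two_way_ranging_distance(t_round_s=263.34e-9, t_reply_s=250e-9)
print(round(d, 2))  # ~2.0 m
```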
20 pages, 3148 KiB  
Article
Dynamic Ultrasonic Jamming via Time–Frequency Mosaic for Anti-Eavesdropping Systems
by Zichuan Yu, Lu Tang, Kai Wang, Xusheng Tang and Hongyu Ge
Electronics 2025, 14(15), 2960; https://doi.org/10.3390/electronics14152960 - 24 Jul 2025
Viewed by 199
Abstract
To combat microphone eavesdropping on devices like smartphones, ultrasonic-based methods offer promise due to human inaudibility and microphone nonlinearity. However, existing systems suffer from low jamming efficiency, poor energy utilization, and weak robustness. Based on these problems, this paper proposes a novel ultrasonic-based jamming algorithm called the Time–Frequency Mosaic (TFM) technique, which can be used for anti-eavesdropping. The proposed TFM technique can generate short-time, frequency-coded jamming signals according to the voice frequency characteristics of different speakers, thereby achieving targeted and efficient jamming. A jamming prototype using the Time–Frequency Mosaic technique was developed and tested in various scenarios. The test results show that when the signal-to-noise ratio (SNR) is lower than 0 dB, the text Word Error Rate (WER) of the proposed method is basically over 60%; when the SNR is 0 dB, the WER of the algorithm in this paper is on average more than 20% higher than that of current jamming algorithms. In addition, when the jamming system maintains the same distance from the recording device, the algorithm in this paper has higher energy utilization efficiency compared with existing algorithms. Experiments prove that in most cases, the proposed algorithm has a better jamming effect, higher energy utilization efficiency, and stronger robustness. Full article
(This article belongs to the Topic Addressing Security Issues Related to Modern Software)
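For reference, the Word Error Rate quoted above can be computed as a word-level edit distance divided by the reference length; this is the generic metric definition, not code from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance over reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("please transfer the funds today",
                      "please the funds to day"))  # 0.6
```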
26 pages, 2261 KiB  
Article
Real-Time Fall Monitoring for Seniors via YOLO and Voice Interaction
by Eugenia Tîrziu, Ana-Mihaela Vasilevschi, Adriana Alexandru and Eleonora Tudora
Future Internet 2025, 17(8), 324; https://doi.org/10.3390/fi17080324 - 23 Jul 2025
Viewed by 250
Abstract
In the context of global demographic aging, falls among the elderly remain a major public health concern, often leading to injury, hospitalization, and loss of autonomy. This study proposes a real-time fall detection system that combines a modern computer vision model, YOLOv11 with integrated pose estimation, and an Artificial Intelligence (AI)-based voice assistant designed to reduce false alarms and improve intervention efficiency and reliability. The system continuously monitors human posture via video input, detects fall events based on body dynamics and keypoint analysis, and initiates a voice-based interaction to assess the user’s condition. Depending on the user’s verbal response or the absence thereof, the system determines whether to trigger an emergency alert to caregivers or family members. All processing, including speech recognition and response generation, is performed locally to preserve user privacy and ensure low-latency performance. The approach is designed to support independent living for older adults. Evaluation of 200 simulated video sequences acquired by the development team demonstrated high precision and recall, along with a decrease in false positives when incorporating voice-based confirmation. In addition, the system was also evaluated on an external dataset to assess its robustness. Our results highlight the system’s reliability and scalability for real-world in-home elderly monitoring applications. Full article
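A minimal sketch of a pose-keypoint fall heuristic in the spirit of the pipeline above, using the Ultralytics YOLO pose API; the checkpoint name, keypoint indices, and angle threshold are illustrative assumptions, not the authors' implementation.

```python
import math
from ultralytics import YOLO

model = YOLO("yolo11n-pose.pt")  # assumed pose checkpoint name

def looks_fallen(frame) -> bool:
    """Flag a frame as a fall candidate when the torso is close to horizontal."""
    result = model(frame, verbose=False)[0]
    if result.keypoints is None or result.keypoints.xy.shape[0] == 0:
        return False
    kps = result.keypoints.xy[0]  # (17, 2) COCO keypoints for the first person
    shoulder_mid = (kps[5] + kps[6]) / 2   # 5/6 = left/right shoulder
    hip_mid = (kps[11] + kps[12]) / 2      # 11/12 = left/right hip
    dx = float(hip_mid[0] - shoulder_mid[0])
    dy = float(hip_mid[1] - shoulder_mid[1])
    torso_angle = math.degrees(math.atan2(abs(dy), abs(dx) + 1e-6))
    # A torso closer to horizontal than 45 degrees is treated as a fall candidate,
    # which would then trigger the voice-based confirmation step.
    return torso_angle < 45.0
```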
10 pages, 857 KiB  
Proceeding Paper
Implementation of a Prototype-Based Parkinson’s Disease Detection System Using a RISC-V Processor
by Krishna Dharavathu, Pavan Kumar Sankula, Uma Maheswari Vullanki, Subhan Khan Mohammad, Sai Priya Kesapatnapu and Sameer Shaik
Eng. Proc. 2025, 87(1), 97; https://doi.org/10.3390/engproc2025087097 - 21 Jul 2025
Viewed by 195
Abstract
Parkinson’s disease (PD) has a high incidence among human diseases, according to a recent survey by the World Health Organization (WHO). According to WHO records, this chronic disease has affected approximately 10 million people worldwide. Patients who do not receive an early diagnosis may develop an incurable neurological disorder. PD is a degenerative disorder of the brain characterized by impairment of the nigrostriatal system and is accompanied by a wide range of motor and non-motor symptoms. In this work, PD is detected from patients’ speech signals using a reduced instruction set computing, 5th version (RISC-V) processor. The RISC-V microcontroller unit (MCU) was designed for a voice-controlled human-machine interface (HMI). Signal processing and feature extraction methods are applied to the digital speech signal, which is degraded by the impairment of the nigrostriatal system, and a range of classifier modules are used to classify the speech signals as normal or abnormal to identify PD. We use Matrix Laboratory (MATLAB R2021a_v9.10.0.1602886) to analyze the data, develop algorithms, create modules, and develop the RISC-V processor for embedded implementation. Machine learning (ML) techniques are also used to extract features such as pitch, tremor, and Mel-frequency cepstral coefficients (MFCCs). Full article
(This article belongs to the Proceedings of The 5th International Electronic Conference on Applied Sciences)
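A rough Python analogue of the feature-extraction-plus-classification pipeline described above (the paper itself works in MATLAB and targets a RISC-V MCU); the feature set, file handling, and SVM choice are illustrative assumptions rather than the authors' modules.

```python
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def voice_features(path, sr=16000, n_mfcc=13):
    """Mean and variability of MFCCs plus mean pitch as a compact feature vector."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    f0, voiced, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
    pitch = np.nanmean(f0) if np.any(voiced) else 0.0
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1), [pitch]])

def train_pd_classifier(paths, labels):
    """labels: 1 = PD speech, 0 = control speech (hypothetical encoding)."""
    X = np.vstack([voice_features(p) for p in paths])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X, np.asarray(labels))
    return clf
```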
27 pages, 2136 KiB  
Article
The Effect of Shared and Inclusive Governance on Environmental Sustainability at U.S. Universities
by Dragana Djukic-Min, James Norcross and Elizabeth Searing
Sustainability 2025, 17(14), 6630; https://doi.org/10.3390/su17146630 - 21 Jul 2025
Viewed by 427
Abstract
As climate change consequences intensify, higher education institutions (HEIs) have an opportunity and responsibility to model sustainable operations. This study examines how embracing shared knowledge and inclusion in sustainability decision making facilitates green human resource management (GHRM) efforts to invigorate organizational environmental performance. The study examines the effects of shared and inclusive governance on campus sustainability via a regression model and the mediating role of employee participation via a structural equation modeling approach. The results show that shared governance and inclusive governance positively predict the commitment of HEIs to reducing greenhouse gas emissions, and campus engagement mediates these relationships, underscoring the importance of participation. These findings align with stakeholder theory in demonstrating that diverse voices in decision making can enhance commitment to organizational goals like sustainability. The findings also highlight the importance of shared and inclusive governance arrangements at college campuses not only for ethical reasons but also for achieving desired outcomes like carbon neutrality. For campus leaders striving to “green” their institutions, evaluating cross-departmental representation in governance structures and promoting inclusive cultures that make all students and staff feel welcome appear as important complements to GHRM practices. Full article
(This article belongs to the Special Issue Sustainable Management for the Future of Education Systems)
18 pages, 276 KiB  
Article
The Soul at Prayer
by Richard G. T. Gipps
Religions 2025, 16(7), 928; https://doi.org/10.3390/rel16070928 - 18 Jul 2025
Viewed by 305
Abstract
Wittgenstein lists prayer as a distinct language-game, but leaves to others the investigation of its character. Formulating it as “conversation with God” is correct but potentially unhelpful, in part because it presupposes that we can understand what God is independently of knowing what it is to pray. But by situating the language-game in the context of our human form of life we make better progress. The discussion of this paper, the focus of which is Christian prayer, first reminds us of what it is to have a soul life—i.e., a life in which hope, conscience, and vitality are interpenetrating elements. It next sketches a more distinctly Christian anthropology in which our lives are understood as marred by pride, lack of trust and openness, and ingratitude. Against this backdrop, prayer can be understood for what it is as the soul coming out of its proud retreat, speaking in its own voice, owning its distortions, acknowledging its gratitude, and pleading its true desires. And God can be understood as (inter alia) that to which prayer is principally offered. Full article
(This article belongs to the Special Issue New Work on Wittgenstein's Philosophy of Religion)
16 pages, 317 KiB  
Perspective
Listening to the Mind: Integrating Vocal Biomarkers into Digital Health
by Irene Rodrigo and Jon Andoni Duñabeitia
Brain Sci. 2025, 15(7), 762; https://doi.org/10.3390/brainsci15070762 - 18 Jul 2025
Viewed by 528
Abstract
The human voice is an invaluable tool for communication, carrying information about a speaker’s emotional state and cognitive health. Recent research highlights the potential of acoustic biomarkers to detect early signs of mental health and neurodegenerative conditions. Despite their promise, vocal biomarkers remain underutilized in clinical settings, with limited standardized protocols for assessment. This Perspective article argues for the integration of acoustic biomarkers into digital health solutions to improve the detection and monitoring of cognitive impairment and emotional disturbances. Advances in speech analysis and machine learning have demonstrated the feasibility of using voice features such as pitch, jitter, shimmer, and speech rate to assess these conditions. Moreover, we propose that singing, particularly simple melodic structures, could be an effective and accessible means of gathering vocal biomarkers, offering additional insights into cognitive and emotional states. Given its potential to engage multiple neural networks, singing could function as an assessment tool and an intervention strategy for individuals with cognitive decline. We highlight the necessity of further research to establish robust, reproducible methodologies for analyzing vocal biomarkers and standardizing voice-based diagnostic approaches. By integrating vocal analysis into routine health assessments, clinicians and researchers could significantly advance early detection and personalized interventions for cognitive and emotional disorders. Full article
(This article belongs to the Topic Language: From Hearing to Speech and Writing)
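A minimal sketch, assuming librosa is available, of extracting two of the voice features mentioned above; the jitter proxy derived from the frame-level pitch track is a rough approximation, not a clinical-grade Praat-style measurement.

```python
import numpy as np
import librosa

def pitch_and_jitter_proxy(path, sr=16000):
    """Mean pitch plus a crude cycle-to-cycle variability (jitter-like) measure."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    f0, voiced, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
    f0_voiced = f0[~np.isnan(f0)]
    if len(f0_voiced) < 3:
        return {"mean_pitch_hz": 0.0, "jitter_proxy": 0.0}
    periods = 1.0 / f0_voiced  # approximate glottal periods, in seconds
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
    return {"mean_pitch_hz": float(np.mean(f0_voiced)),
            "jitter_proxy": float(jitter)}
```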
22 pages, 11043 KiB  
Article
Digital Twin-Enabled Adaptive Robotics: Leveraging Large Language Models in Isaac Sim for Unstructured Environments
by Sanjay Nambiar, Rahul Chiramel Paul, Oscar Chigozie Ikechukwu, Marie Jonsson and Mehdi Tarkian
Machines 2025, 13(7), 620; https://doi.org/10.3390/machines13070620 - 17 Jul 2025
Viewed by 425
Abstract
As industrial automation evolves towards human-centric, adaptable solutions, collaborative robots must overcome challenges in unstructured, dynamic environments. This paper extends our previous work on developing a digital shadow for industrial robots by introducing a comprehensive framework that bridges the gap between physical systems and their virtual counterparts. The proposed framework advances toward a fully functional digital twin by integrating real-time perception and intuitive human–robot interaction capabilities. The framework is applied to a hospital test lab scenario, where a YuMi robot automates the sorting of microscope slides. The system incorporates a RealSense D435i depth camera for environment perception, Isaac Sim for virtual environment synchronization, and a locally hosted large language model (Mistral 7B) for interpreting user voice commands. These components work together to achieve bi-directional synchronization between the physical and digital environments. The framework was evaluated through 20 test runs under varying conditions. A validation study measured the performance of the perception module, simulation, and language interface, with a 60% overall success rate. Additionally, synchronization accuracy between the simulated and physical robot joint movements reached 98.11%, demonstrating strong alignment between the digital and physical systems. By combining local LLM processing, real-time vision, and robot simulation, the approach enables untrained users to interact with collaborative robots in dynamic settings. The results highlight its potential for improving flexibility and usability in industrial automation. Full article
(This article belongs to the Topic Smart Production in Terms of Industry 4.0 and 5.0)
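A minimal sketch of how a transcribed voice command might be mapped to a structured robot action by a locally hosted LLM, as in the framework above; the prompt wording, the JSON schema, and the query_local_llm() stand-in (a placeholder for whatever client the local Mistral 7B deployment exposes) are all assumptions, not the authors' implementation.

```python
import json

def query_local_llm(prompt: str) -> str:
    # Placeholder: replace with a call to the locally hosted Mistral 7B server.
    raise NotImplementedError("wire this to the local LLM client")

def command_to_action(utterance: str) -> dict:
    """Ask the LLM to convert an operator utterance into a structured action."""
    prompt = (
        "Convert the operator command into JSON with keys "
        '"action" (pick|place|sort|stop) and "target" (free text).\n'
        f"Command: {utterance}\nJSON:"
    )
    reply = query_local_llm(prompt)
    action = json.loads(reply)
    assert action.get("action") in {"pick", "place", "sort", "stop"}
    return action

# e.g. command_to_action("sort the slides in tray two") might yield
# {"action": "sort", "target": "slides in tray two"}
```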