Search Results (620)

Search Parameters:
Keywords = speech in noise

26 pages, 712 KB  
Article
Comparing Multi-Scale and Pipeline Models for Speaker Change Detection
by Alymzhan Toleu, Gulmira Tolegen and Bagashar Zhumazhanov
Acoustics 2026, 8(1), 5; https://doi.org/10.3390/acoustics8010005 (registering DOI) - 25 Jan 2026
Abstract
Speaker change detection (SCD) in long, multi-party meetings is essential for diarization, automatic speech recognition (ASR), and summarization, and is now often performed in the space of pre-trained speech embeddings. However, unsupervised approaches remain dominant when timely labeled audio is scarce, and their behavior under a unified modeling setup is still not well understood. In this paper, we systematically compare two representative unsupervised approaches on the multi-talker audio meeting corpus: (i) a clustering-based pipeline that segments and clusters embeddings/features and scores boundaries via cluster changes and jump magnitude, and (ii) a multi-scale jump-based detector that measures embedding discontinuities at several window lengths and fuses them via temporal clustering and voting. Using a shared front-end and protocol, we vary the underlying features (ECAPA, WavLM, wav2vec 2.0, MFCC, and log-Mel) and test each model’s robustness under additive noise. The results show that embedding choice is crucial and that the two methods offer complementary trade-offs: the pipeline yields low false alarm rates but higher misses, while the multi-scale detector achieves relatively high recall at the cost of many false alarms. Full article
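As a rough illustration of the multi-scale jump-based detection idea summarized above, here is a minimal sketch; the cosine-distance scoring, the window lengths, the mean-plus-two-standard-deviations threshold, and the voting rule are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def jump_scores(emb: np.ndarray, win: int) -> np.ndarray:
    """Cosine distance between mean embeddings of the `win` frames
    before and after each candidate boundary (higher = bigger jump)."""
    scores = np.zeros(len(emb))
    for t in range(win, len(emb) - win):
        left = emb[t - win:t].mean(axis=0)
        right = emb[t:t + win].mean(axis=0)
        sim = left @ right / (np.linalg.norm(left) * np.linalg.norm(right) + 1e-9)
        scores[t] = 1.0 - sim
    return scores

def multi_scale_detect(emb: np.ndarray, wins=(10, 25, 50), votes_needed=2):
    """Fuse several window scales by voting: flag frames where at least
    `votes_needed` scales see an unusually large embedding discontinuity."""
    votes = np.zeros(len(emb), dtype=int)
    for w in wins:
        s = jump_scores(emb, w)
        votes += (s > s.mean() + 2 * s.std()).astype(int)
    return np.flatnonzero(votes >= votes_needed)

# Synthetic check: 192-dim embeddings with a speaker change at frame 500.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 1, (500, 192)), rng.normal(3, 1, (500, 192))])
print(multi_scale_detect(emb))  # indices clustered around frame 500
```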

19 pages, 1002 KB  
Article
Fidgeting Increases Pupil Diameter During Auditory Processing in Young Healthy Adults
by Satoko Kataoka, Hideki Miyaguchi, Chinami Ishizuki, Hiroshi Fukuda, Masanori Yasunaga and Hikari Kirimoto
Brain Sci. 2026, 16(2), 127; https://doi.org/10.3390/brainsci16020127 (registering DOI) - 24 Jan 2026
Abstract
Background/Objectives: People often engage in small, repetitive movements—or “fidgeting”—while listening. This behavior has traditionally been regarded as a sign of inattention. However, recent perspectives suggest that these movements may support engagement and arousal regulation. Yet, little is known about how different types of fidgeting affect the allocation of cognitive resources during auditory processing. This study examined whether hand and leg fidgeting influence pupil-linked arousal and auditory task performance. Methods: Young, healthy adults aged 18–26 years completed four auditory processing tasks while performing either hand fidgeting (manipulating a small fidget toy) or leg fidgeting (very light ergometer pedaling). A control group did not fidget. Pupil-linked arousal was assessed using changes in pupil diameter, and listening performance was evaluated across tasks of varying difficulty. Results: Both forms of fidgeting produced greater pupil dilation than the control condition, particularly hand fidgeting during the speech-in-noise listening task and the fast speech task. Despite these physiological changes, there were no measurable differences in auditory task performance across conditions. Conclusions: Fidgeting modulates pupil-linked arousal without impairing auditory processing in young, healthy adults. Hand fidgeting may help sustain engagement during demanding listening tasks. However, because the fidgeting was intentional and task performance approached ceiling or floor levels, these findings should be interpreted as preliminary. Future studies should examine whether fidgeting supports arousal maintenance or listening performance in individuals with attentional vulnerabilities or auditory processing difficulties. Full article

11 pages, 1506 KB  
Technical Note
Development of a Speech Intelligibility Test for Children in Swiss German Dialects
by Christoph Schmid, Stefanie Blatter, Eberhard Seifert, Philipp Aebischer and Martin Kompis
Audiol. Res. 2026, 16(1), 16; https://doi.org/10.3390/audiolres16010016 - 22 Jan 2026
Viewed by 8
Abstract
Objective: This paper describes the development of a speech intelligibility test in Swiss German dialects, designed for children aged four to nine who are not yet familiar with standard German. Method: Suitable monosyllabic words and trochees in different Swiss German dialects were compiled, illustrated, and evaluated. Picture-pointing test procedures appropriate for children were developed. The selected test words and the pictures representing them were evaluated in a preliminary trial with forty-six normal-hearing children between two and nine years of age. Results: A set of 60 monosyllabic words and 40 trochees was recorded in four different Swiss German dialects as well as in standard German, resulting in a total of 500 recordings. Drawings were created to illustrate each word and found to be appropriate for children aged four years or older. A non-adaptive and an adaptive test procedure using a weighted up–down method to measure speech reception thresholds in quiet and in noise were developed. Conclusions: A novel test to determine speech intelligibility in children in four different Swiss dialects was developed and evaluated in a pilot study. A validation study with more participants was designed to evaluate the test material and procedures. Full article
(This article belongs to the Section Speech and Language)
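For readers unfamiliar with the adaptive procedure the abstract mentions, below is a minimal sketch of a weighted up–down staircase for estimating a speech reception threshold (after Kaernbach's weighted up–down method); the step sizes, trial count, and 50% convergence target are assumptions, not the test's actual parameters.

```python
import random

def weighted_up_down(respond, start_snr=0.0, target=0.5,
                     step_down=2.0, n_trials=30):
    """Track the SNR (dB) of successive test words. `respond(snr)` returns
    True on a correct repetition. Choosing step_up/step_down equal to
    target/(1 - target) makes the track converge where P(correct) = target."""
    step_up = step_down * target / (1.0 - target)
    snr, track = start_snr, []
    for _ in range(n_trials):
        track.append(snr)
        snr = snr - step_down if respond(snr) else snr + step_up
    return sum(track[-10:]) / 10.0  # SRT estimate: mean of last 10 levels

# Simulated listener whose true 50%-correct point is -6 dB SNR.
random.seed(1)
simulate = lambda snr: random.random() < 1 / (1 + 10 ** ((-6.0 - snr) / 4))
print(round(weighted_up_down(simulate), 1))  # converges near -6
```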

18 pages, 3705 KB  
Article
Cross-Platform Multi-Modal Transfer Learning Framework for Cyberbullying Detection
by Weiqi Zhang, Chengzu Dong, Aiting Yao, Asef Nazari and Anuroop Gaddam
Electronics 2026, 15(2), 442; https://doi.org/10.3390/electronics15020442 - 20 Jan 2026
Viewed by 94
Abstract
Cyberbullying and hate speech increasingly appear in multi-modal social media posts, where images and text are combined in diverse and fast-changing ways across platforms. These posts differ in style, vocabulary, and layout, and labeled data are sparse and noisy, which makes it difficult to train detectors that are both reliable and deployable under tight computational budgets. Many high-performing systems rely on large vision-language backbones, full-parameter fine-tuning, online retrieval, or model ensembles, which raises training and inference costs. We present a parameter-efficient cross-platform multi-modal transfer learning framework for cyberbullying and hateful content detection. Our framework has three components. First, we perform domain-adaptive pretraining of a compact ViLT backbone on in-domain image-text corpora. Second, we apply parameter-efficient fine-tuning that updates only bias terms, a small subset of LayerNorm parameters, and the classification head, leaving the inference computation graph unchanged. Third, we use noise-aware knowledge distillation from a stronger teacher built from pretrained text and CLIP-based image-text encoders, where only high-confidence, temperature-scaled predictions are used as soft labels during training, and teacher models and any retrieval components are used only offline. We evaluate primarily on Hateful Memes and use IMDB as an auxiliary text-only benchmark to show that the deployment-aware PEFT + offline-KD recipe can still be applied when other modalities are unavailable. On Hateful Memes, our student updates only 0.11% of parameters and retains about 96% of the AUROC of full fine-tuning. Full article
(This article belongs to the Special Issue Data Privacy and Protection in IoT Systems)
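A minimal PyTorch sketch of the bias/LayerNorm/head-only update scheme the abstract describes; the head name `classifier` and the toy model are assumptions (the paper applies this recipe to a ViLT backbone).

```python
import torch.nn as nn

def mark_trainable(model: nn.Module, head_name: str = "classifier") -> float:
    """Freeze everything, then re-enable bias terms, LayerNorm parameters,
    and the classification head; the inference graph is unchanged."""
    for param in model.parameters():
        param.requires_grad = False
    for name, module in model.named_modules():
        if isinstance(module, nn.LayerNorm) or name.startswith(head_name):
            for p in module.parameters():
                p.requires_grad = True
    for name, param in model.named_parameters():
        if name.endswith(".bias"):
            param.requires_grad = True
    n_train = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return n_train / sum(p.numel() for p in model.parameters())

# Toy example; on a transformer-scale backbone this fraction is ~0.1%.
toy = nn.Sequential(nn.Linear(768, 768), nn.LayerNorm(768), nn.Linear(768, 2))
print(f"trainable fraction: {mark_trainable(toy):.4f}")
```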

26 pages, 29009 KB  
Article
Quantifying the Relationship Between Speech Quality Metrics and Biometric Speaker Recognition Performance Under Acoustic Degradation
by Ajan Ahmed and Masudul H. Imtiaz
Signals 2026, 7(1), 7; https://doi.org/10.3390/signals7010007 - 12 Jan 2026
Viewed by 327
Abstract
Self-supervised learning (SSL) models have achieved remarkable success in speaker verification tasks, yet their robustness to real-world audio degradation remains insufficiently characterized. This study presents a comprehensive analysis of how audio quality degradation affects three prominent SSL-based speaker verification systems (WavLM, Wav2Vec2, and HuBERT) across three diverse datasets: TIMIT, CHiME-6, and Common Voice. We systematically applied 21 degradation conditions spanning noise contamination (SNR levels from 0 to 20 dB), reverberation (RT60 from 0.3 to 1.0 s), and codec compression (various bit rates), then measured both objective audio quality metrics (PESQ, STOI, SNR, SegSNR, fwSNRseg, jitter, shimmer, HNR) and speaker verification performance metrics (EER, AUC-ROC, d-prime, minDCF). At the condition level, multiple regression with all eight quality metrics explained up to 80% of the variance in minDCF for HuBERT and 78% for WavLM, but only 35% for Wav2Vec2; EER predictability was lower (69%, 67%, and 28%, respectively). PESQ was the strongest single predictor for WavLM and HuBERT, while shimmer showed the highest single-metric correlation for Wav2Vec2; fwSNRseg yielded the top single-metric R² for WavLM, and PESQ for HuBERT and Wav2Vec2 (with much smaller gains for Wav2Vec2). WavLM and HuBERT exhibited more predictable quality–performance relationships than Wav2Vec2. These findings establish quantitative relationships between measurable audio quality and speaker verification accuracy at the condition level, though substantial within-condition variability limits utterance-level prediction accuracy. Full article
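The condition-level analysis can be illustrated with a short sketch: regress a verification metric on the eight quality metrics and report the variance explained. The data below are synthetic stand-ins, not the paper's measurements.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

metrics = ["PESQ", "STOI", "SNR", "SegSNR", "fwSNRseg",
           "jitter", "shimmer", "HNR"]
rng = np.random.default_rng(0)
X = rng.normal(size=(21, len(metrics)))          # 21 degradation conditions
min_dcf = 0.5 - 0.3 * X[:, 0] + rng.normal(scale=0.05, size=21)  # fake costs

# Multiple regression with all eight metrics: R^2 = variance explained.
model = LinearRegression().fit(X, min_dcf)
print(f"R^2 (all metrics): {model.score(X, min_dcf):.2f}")

# Single-metric predictors, as in the per-metric comparison.
for j, name in enumerate(metrics):
    r2 = LinearRegression().fit(X[:, [j]], min_dcf).score(X[:, [j]], min_dcf)
    print(f"{name}: R^2 = {r2:.2f}")
```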

23 pages, 1063 KB  
Article
A Comparative Experimental Study on Simple Features and Lightweight Models for Voice Activity Detection in Noisy Environments
by Bo-Yu Su, Berlin Chen, Shih-Chieh Huang and Jeih-Weih Hung
Electronics 2026, 15(2), 263; https://doi.org/10.3390/electronics15020263 - 7 Jan 2026
Viewed by 150
Abstract
This work presents a comparative study of voice activity detection in noise using simple acoustic features and relatively compact recurrent models within a controlled MATLAB-based framework. For each utterance, 9 baseline spectral-plus-periodicity features, MFCCs, and FBANKs are extracted and passed to several lightweight BiLSTM-based networks, either alone or preceded by a 1D CNN layer. The main experiments are carried out at a fixed SNR to separate the influence of the network structure and the feature type, and an additional series with four SNR levels is used to assess whether the same performance trends hold when the SNR varies. The results show that adding a compact CNN front-end before the BiLSTM consistently improves detection scores, that MFCCs generally outperform the baseline spectral–periodicity features and often give better recall/F1 than FBANKs for the considered lightweight models, and that CNN(3,32)+BiLSTM with 13-dimensional MFCCs offers a favorable trade-off between accuracy, robustness across SNRs, and model size. Because all conditions share a single MATLAB implementation with fixed noise types, SNR values, and evaluation metrics, this work is positioned as a benchmark and practical guideline publication for noise-robust, resource-constrained VAD, rather than as a proposal of a completely new deep-learning architecture. Full article
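A minimal sketch of the CNN(3,32)+BiLSTM topology with 13-dimensional MFCC input described above, written in PyTorch for illustration (the paper's implementation is in MATLAB); the hidden size is an assumption.

```python
import torch
import torch.nn as nn

class CnnBiLstmVad(nn.Module):
    """1D conv front-end (kernel 3, 32 filters) over MFCC frames,
    followed by a BiLSTM and a per-frame speech/non-speech logit."""
    def __init__(self, n_mfcc=13, conv_ch=32, hidden=64):
        super().__init__()
        self.conv = nn.Conv1d(n_mfcc, conv_ch, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(conv_ch, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, mfcc):                   # (batch, time, n_mfcc)
        x = torch.relu(self.conv(mfcc.transpose(1, 2))).transpose(1, 2)
        x, _ = self.lstm(x)
        return self.head(x).squeeze(-1)        # (batch, time) logits

vad = CnnBiLstmVad()
print(vad(torch.randn(2, 100, 13)).shape)      # torch.Size([2, 100])
```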

14 pages, 1392 KB  
Article
AirSpeech: Lightweight Speech Synthesis Framework for Home Intelligent Space Service Robots
by Xiugong Qin, Fenghu Pan, Jing Gao, Shilong Huang, Yichen Sun and Xiao Zhong
Electronics 2026, 15(1), 239; https://doi.org/10.3390/electronics15010239 - 5 Jan 2026
Viewed by 267
Abstract
Text-to-Speech (TTS) methods typically employ a sequential approach with an Acoustic Model (AM) and a vocoder, using a Mel spectrogram as an intermediate representation. However, in home environments, TTS systems often struggle with issues such as inadequate robustness against environmental noise and limited adaptability to diverse speaker characteristics. The quality of the Mel spectrogram directly affects the performance of TTS systems, yet existing methods overlook the potential of enhancing Mel spectrogram quality through more comprehensive speech features. To address the complex acoustic characteristics of home environments, this paper introduces AirSpeech, a post-processing model for Mel-spectrogram synthesis. We adopt a Generative Adversarial Network (GAN) to improve the accuracy of Mel spectrogram prediction and enhance the expressiveness of synthesized speech. By incorporating additional conditioning extracted from synthesized audio using specified speech feature parameters, our method significantly enhances the expressiveness and emotional adaptability of synthesized speech in home environments. Furthermore, we propose a global normalization strategy to stabilize the GAN training process. Through extensive evaluations, we demonstrate that the proposed method significantly improves the signal quality and naturalness of synthesized speech, providing a more user-friendly speech interaction solution for smart home applications. Full article
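As a loose illustration of the GAN-based Mel-spectrogram post-processing idea, here is a minimal PyTorch sketch of a residual refiner and a discriminator; all layer sizes are assumptions, not AirSpeech's architecture.

```python
import torch
import torch.nn as nn

refiner = nn.Sequential(                 # generator: refines a coarse Mel
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
critic = nn.Sequential(                  # discriminator: real vs. refined
    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Flatten(), nn.LazyLinear(1),
)

coarse = torch.randn(4, 1, 80, 120)      # batch of 80-bin Mel spectrograms
refined = coarse + refiner(coarse)       # residual refinement of the AM output
print(critic(refined).shape)             # torch.Size([4, 1]) adversarial scores
```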

32 pages, 28708 KB  
Article
Adaptive Thermal Imaging Signal Analysis for Real-Time Non-Invasive Respiratory Rate Monitoring
by Riska Analia, Anne Forster, Sheng-Quan Xie and Zhiqiang Zhang
Sensors 2026, 26(1), 278; https://doi.org/10.3390/s26010278 - 1 Jan 2026
Viewed by 456
Abstract
(1) Background: This study presents an adaptive, contactless, and privacy-preserving respiratory-rate monitoring system based on thermal imaging, designed for real-time operation on embedded edge hardware. The system continuously processes temperature data from a compact thermal camera without external computation, enabling practical deployment for home or clinical vital-sign monitoring. (2) Methods: Thermal frames are captured using a 256×192 TOPDON TC001 camera and processed entirely on an NVIDIA Jetson Orin Nano. A YOLO-based detector localizes the nostril region in every even frame (stride = 2) to reduce the computation load, while a Kalman filter predicts the ROI position on skipped frames to maintain spatial continuity and suppress motion jitter. From the stabilized ROI, a temperature-based breathing signal is extracted and analyzed through an adaptive median–MAD hysteresis algorithm that dynamically adjusts to signal amplitude and noise variations for breathing phase detection. Respiratory rate (RR) is computed from inter-breath intervals (IBI) validated within physiological constraints. (3) Results: Ten healthy subjects participated in six experimental conditions including resting, paced breathing, speech, off-axis yaw, posture (supine), and distance variations up to 2.0 m. Across these conditions, the system attained an MAE of 0.57±0.36 BPM and an RMSE of 0.64±0.42 BPM, demonstrating stable accuracy under motion and thermal drift. Compared with peak-based and FFT spectral baselines, the proposed method reduced errors by a large margin across all conditions. (4) Conclusions: The findings confirm that accurate and robust respiratory-rate estimation can be achieved using a low-resolution thermal sensor running entirely on an embedded edge device. The combination of the YOLO-based nostril detector, Kalman ROI prediction, and adaptive MAD–hysteresis phase detection that self-adjusts to signal variability provides a compact, efficient, and privacy-preserving solution for non-invasive vital-sign monitoring in real-world environments. Full article
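The adaptive median–MAD hysteresis step lends itself to a short sketch; the hysteresis factor, the 6–60 BPM validity gate, and the synthetic signal below are illustrative assumptions, not the authors' parameters.

```python
import numpy as np

def respiratory_rate(temp: np.ndarray, fs: float, k: float = 1.0):
    """Breathing-phase detection by median-MAD hysteresis: an inhalation is
    declared when the nostril temperature drops below median - k*MAD, and
    the detector re-arms once it rises above median + k*MAD."""
    med = np.median(temp)
    mad = np.median(np.abs(temp - med)) + 1e-9
    hi, lo = med + k * mad, med - k * mad
    onsets, armed = [], True
    for i, v in enumerate(temp):
        if armed and v < lo:            # inhaled air cools the nostril ROI
            onsets.append(i)
            armed = False
        elif not armed and v > hi:      # exhaled air warms it again: re-arm
            armed = True
    ibi = np.diff(onsets) / fs          # inter-breath intervals in seconds
    ibi = ibi[(ibi > 1.0) & (ibi < 10.0)]   # keep 6-60 breaths per minute
    return 60.0 / ibi.mean() if ibi.size else None

fs = 25.0                                        # thermal frame rate (Hz)
t = np.arange(0, 60, 1 / fs)
temp = 34.0 + 0.3 * np.sin(2 * np.pi * 0.25 * t)  # synthetic 15 BPM signal
print(respiratory_rate(temp, fs))                # ~15.0
```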

17 pages, 899 KB  
Article
Exploring Bidirectional Associations Between Voice Acoustics and Objective Motor Metrics in Parkinson’s Disease
by Anna Carolyna Gianlorenço, Paulo Eduardo Portes Teixeira, Valton Costa, Walter Fabris-Moraes, Paola Gonzalez-Mego, Ciro Ramos-Estebanez, Arianna Di Stadio, Deniz Doruk Camsari, Mirret M. El-Hagrassy, Felipe Fregni, Tim Wagner and Laura Dipietro
Brain Sci. 2026, 16(1), 48; https://doi.org/10.3390/brainsci16010048 - 29 Dec 2025
Viewed by 277
Abstract
Background/Objectives: Speech and motor control share overlapping neural mechanisms, yet their quantitative relationships in Parkinson’s disease (PD) remain underexplored. This study investigated bidirectional associations between acoustic voice features and objective motor metrics to better understand how vocal and motor systems relate in PD. Methods: Cross-sectional baseline data from participants in a randomized neuromodulation trial were analyzed (n = 13). Motor performance was captured using an Integrated Motion Analysis Suite (IMAS), which enabled quantitative, objective characterization of motor performance during balance, gait, and upper- and lower-limb tasks. Acoustic analyses included harmonic-to-noise ratio (HNR), smoothed cepstral peak prominence (CPPS), jitter, shimmer, median fundamental frequency (F0), F0 standard deviation (SD F0), and voice intensity. Univariate linear regressions were conducted in both directions (voice ↔ motor), as well as partial correlations controlling for PD motor symptom severity. Results: When modeling voice outcomes, faster motor performance and shorter movement durations were associated with acoustically clearer voice features (e.g., higher elbow flexion-extension peak speed with higher voice HNR, β = 8.5, R2 = 0.56, p = 0.01). Similarly, when modeling motor outcomes, clearer voice measures were linked with faster movement speed and shorter movement durations (e.g., higher voice HNR with higher peak movement speed in elbow flexion/extension, β = 0.07, R2 = 0.56, p = 0.01). Conclusions: Voice and motor measures in PD showed significant bidirectional associations, suggesting shared sensorimotor control. These exploratory findings, while limited by sample size, support the feasibility of integrated multimodal assessment for future longitudinal studies. Full article
(This article belongs to the Special Issue Computational Intelligence and Brain Plasticity)
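The partial-correlation step (correlating a voice measure with a motor measure while controlling for motor symptom severity) can be sketched by residualizing both variables on the covariate; the data below are synthetic, not the study's.

```python
import numpy as np

def partial_corr(x, y, z):
    """Pearson correlation of x and y after removing the linear effect of z."""
    def resid(v):
        Z = np.column_stack([np.ones_like(z), z])
        beta, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ beta
    rx, ry = resid(x), resid(y)
    return (rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry))

# Illustrative variables: HNR vs. peak elbow speed, controlling for severity.
rng = np.random.default_rng(0)
severity = rng.normal(size=13)                 # n = 13, as in the study
hnr = 20 + severity + rng.normal(size=13)
peak_speed = 1 + 0.5 * hnr + rng.normal(size=13)
print(round(partial_corr(hnr, peak_speed, severity), 2))
```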

19 pages, 1187 KB  
Article
Dual-Pipeline Machine Learning Framework for Automated Interpretation of Pilot Communications at Non-Towered Airports
by Abdullah All Tanvir, Chenyu Huang, Moe Alahmad, Chuyang Yang and Xin Zhong
Aerospace 2026, 13(1), 32; https://doi.org/10.3390/aerospace13010032 - 28 Dec 2025
Viewed by 297
Abstract
Accurate estimation of aircraft operations, such as takeoffs and landings, is critical for airport planning and resource allocation, yet it remains particularly challenging at non-towered airports, where no dedicated surveillance infrastructure exists. Existing solutions, including video analytics, acoustic sensors, and transponder-based systems, are often costly, incomplete, or unreliable in environments with mixed traffic and inconsistent radio usage, highlighting the need for a scalable, infrastructure-free alternative. To address this gap, this study proposes a novel dual-pipeline machine learning framework that classifies pilot radio communications using both textual and spectral features to infer operational intent. A total of 2489 annotated pilot transmissions collected from a U.S. non-towered airport were processed through automatic speech recognition (ASR) and Mel-spectrogram extraction. We benchmarked multiple traditional classifiers and deep learning models, including ensemble methods, long short-term memory (LSTM) networks, and convolutional neural networks (CNNs), across both feature pipelines. Results show that spectral features paired with deep architectures consistently achieved the highest performance, with F1-scores exceeding 91% despite substantial background noise, overlapping transmissions, and speaker variability. These findings indicate that operational intent can be inferred reliably from existing communication audio alone, offering a practical, low-cost path toward scalable aircraft operations monitoring and supporting emerging virtual tower and automated air traffic surveillance applications. Full article
(This article belongs to the Special Issue AI, Machine Learning and Automation for Air Traffic Control (ATC))
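A minimal sketch of the spectral branch: a log-Mel spectrogram feeds a small CNN that outputs operational-intent logits. The frame parameters, the two-class head, and the `librosa` helper are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

def log_mel(path: str, sr=16000, n_mels=64) -> np.ndarray:
    """Load a radio transmission and return its log-Mel spectrogram."""
    y, _ = librosa.load(path, sr=sr)
    m = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(m, ref=np.max)   # (n_mels, frames)

cnn = nn.Sequential(                             # tiny intent classifier
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
    nn.Flatten(), nn.Linear(16 * 8 * 8, 2),      # e.g. arriving vs. departing
)
spec = torch.randn(1, 1, 64, 200)                # stand-in for a log-Mel input
print(cnn(spec).shape)                           # torch.Size([1, 2])
```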

15 pages, 1669 KB  
Article
Combined Effects of Speech Features and Sound Fields on the Elderly’s Perception of Voice Alarms
by Hui Ma, Qujing Chen, Weiyu Wang and Chao Wang
Acoustics 2026, 8(1), 2; https://doi.org/10.3390/acoustics8010002 - 24 Dec 2025
Viewed by 388
Abstract
Using efficient voice alarms to ensure safe evacuation is important during emergencies, especially for the elderly. Factors that strongly influence speech perception have been investigated for many years. However, relatively few studies have specifically explored the key factors influencing perceptions of voice alarms in emergency situations. This study investigated the combined effects of speech rate (SR), signal-to-noise ratio (SNR), and reverberation time (RT) on older people’s perception of voice alarms. Thirty older adults were invited to evaluate speech intelligibility, listening difficulty, and perceived urgency after hearing 48 different voice alarm conditions. For comparison, 25 young adults were also recruited in the same experiment. The results for older adults showed that: (1) When SR increased, speech intelligibility significantly decreased, and listening difficulty significantly increased. Perceived urgency reached its maximum at the normal speech rate for older adults, in contrast to young adults, for whom urgency was greatest at the fast speech rate. (2) With increasing SNR, speech intelligibility and perceived urgency significantly increased, and listening difficulty significantly decreased. In contrast, with increasing RT, speech intelligibility and perceived urgency significantly decreased, while listening difficulty significantly increased. (3) RT exerted a relatively stronger independent influence on speech intelligibility and listening difficulty among older adults compared to young adults, and this influence tended not to be substantially moderated by SR or SNR. The interactive effect of SR and RT on perceived urgency was significant for older people, but not for young people. These findings provide referential strategies for designing efficient voice alarms for the elderly. Full article
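The SNR manipulation in such stimulus designs is typically implemented by scaling the noise track against the speech track; a generic sketch (not necessarily the authors' exact procedure) follows.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float):
    """Scale the noise so that speech power / noise power = 10^(snr_db/10),
    then return the mixture."""
    noise = noise[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Stand-ins for a 1 s alarm utterance and a noise track at 16 kHz.
rng = np.random.default_rng(0)
speech = rng.normal(size=16000)
noise = rng.normal(size=16000)
mixed = mix_at_snr(speech, noise, snr_db=5.0)   # one +5 dB SNR condition
```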

34 pages, 761 KB  
Review
Retrocochlear Auditory Dysfunctions (RADs) and Their Treatment: A Narrative Review
by Domenico Cuda, Patrizia Mancini, Giuseppe Chiarella and Rosamaria Santarelli
Audiol. Res. 2026, 16(1), 5; https://doi.org/10.3390/audiolres16010005 - 23 Dec 2025
Viewed by 552
Abstract
Background/Objectives: Retrocochlear auditory dysfunctions (RADs), including auditory neuropathy (AN) and auditory processing disorders (APD), encompass disorders characterized by impaired auditory processing beyond the cochlea. This narrative review critically examines their distinguishing features, synthesizing recent advances in classification, pathophysiology, clinical presentation, and treatment. Methods: This narrative review involved a comprehensive literature search across major electronic databases (e.g., PubMed, Scopus) to identify and synthesize relevant studies on the classification, diagnosis, and management of AN and APD. The goal was to update the view on etiologies (genetic/non-genetic) and individualized rehabilitative strategies. Diagnosis relies on a comprehensive assessment, including behavioral, electrophysiological, and imaging tests. Rehabilitation is categorized into bottom-up and top-down approaches. Results: ANSD is defined by neural desynchronization with preserved outer hair cell function, resulting in abnormal auditory brainstem responses and poor speech discrimination. The etiologies (distal/proximal) influence the prognosis for interventions, particularly cochlear implants (CI). APD involves central processing deficits, often with normal peripheral hearing and heterogeneous symptoms affecting speech perception and localization. Rehabilitation is multidisciplinary, utilizing bottom-up strategies (e.g., auditory training, CI) and compensatory top-down approaches. Remote microphone systems are highly effective in improving the signal-to-noise ratio. Conclusions: Accurate diagnosis and personalized, multidisciplinary management are crucial for optimizing communication and quality of life. Evidence suggests that combined bottom-up and top-down interventions may yield superior outcomes. However, methodological heterogeneity limits the generalizability of protocols, highlighting the need for further targeted research. Full article

21 pages, 349 KB  
Review
Hearing Loss in Young Adults: Risk Factors, Mechanisms and Prevention Models
by Razvan Claudiu Fleser, Violeta Necula, Laszlo Peter Ujvary, Andrei Osman, Alexandru Orasan and Alma Aurelia Maniu
Biomedicines 2025, 13(12), 3116; https://doi.org/10.3390/biomedicines13123116 - 18 Dec 2025
Viewed by 978
Abstract
Hearing loss is increasingly recognized as a major public health concern among young adults, who are traditionally considered a low-risk group. This narrative review synthesizes recent evidence on risk and aggravating factors of early-onset hearing impairment, including recreational and occupational noise exposure, genetic susceptibility, infections, ototoxic medications, and lifestyle contributors. Pathophysiological mechanisms include cochlear synaptopathy, oxidative stress, excitotoxicity, vascular compromise, and immune-mediated injury. Global Burden of Disease data and World Health Organization reports indicate that more than one billion young people are at risk due to unsafe listening practices. Studies highlight emerging risk factors such as hidden hearing loss, extended high-frequency impairment and associations with COVID-19. Aggravating factors include delayed diagnosis, cumulative exposures and lack of preventive strategies. Early detection via advanced audiological assessments, such as extended high-frequency audiometry, otoacoustic emissions, speech-in-noise testing and auditory brainstem responses, is critical to prevent permanent damage. Public health interventions—particularly safe listening campaigns, early screening and monitoring in high-risk populations—are essential to reduce long-term disability. Full article
(This article belongs to the Special Issue Hearing Loss: Mechanisms and Targeted Interventions)
14 pages, 2574 KB  
Article
The Role of Patient Motivation in Single-Sided Deafness: Patterns in Treatment Selection and Cochlear Implant Outcomes
by Leena Asfour, Allison Oliva, Erin Williams and Meredith A. Holcomb
J. Clin. Med. 2025, 14(24), 8944; https://doi.org/10.3390/jcm14248944 - 18 Dec 2025
Viewed by 392
Abstract
Background/Objectives: Single-sided deafness (SSD) treatment options include Contralateral Routing of Signal (CROS) or Bilateral Routing of Signal (BiCROS) systems, bone conduction devices, cochlear implants (CIs), and no intervention. Aligning treatment recommendations with patient motivations is fundamental for satisfaction and successful outcomes. At our institution, a structured telehealth consultation precedes formal testing and includes treatment motivation exploration and comprehensive review of all interventions. This study examined SSD treatment motivations and their association with pursuing cochlear implantation. Methods: Adults who completed a pre-treatment SSD telehealth consultation over a four-year period were identified. Charts were retrospectively reviewed for demographics, SSD characteristics, treatment motivations, treatment choice, and CI outcomes. Results: A total of 122 adults were evaluated. Mean age was 56.3 (±13.0) years, and 59.8% were male. Mean SSD duration was 10.8 (±15.8) years. The most common etiology was sudden sensorineural hearing loss. The top primary motivations were improving overall hearing (23.0%), restoring hearing to the deaf ear (22.1%), and improving hearing in noise (21.3%). The largest group of patients (45.1%) opted for a hearing aid, CROS, or BiCROS system; 38.5% chose CI; and 14.8% declined treatment. Only 57.4% of those who selected CI ultimately received the implant, primarily due to surgery avoidance (31.5%) and insurance limitations (10.5%). Motivation did not predict treatment choice or CI receipt. Among CI recipients (n = 27), those motivated by hearing restoration demonstrated poorer speech outcomes and datalogging. Conclusions: Improving overall hearing and restoring hearing to the deaf ear were the most common motivations for seeking SSD treatment. Adult CI recipients had similar motivations to those who chose non-surgical options. Full article

21 pages, 1070 KB  
Article
Influence of Noise Level and Reverberation on Children’s Performance and Effort in Primary Schools
by Ilaria Pittana, Cora Pavarin, Irene Pavanello, Antonino Di Bella, Piercarlo Romagnoni, Pietro Scimemi and Francesca Cappelletti
Appl. Sci. 2025, 15(24), 13213; https://doi.org/10.3390/app152413213 - 17 Dec 2025
Viewed by 437
Abstract
Classroom acoustics and noise exposure significantly impact students’ emotional, cognitive, and academic well-being. This study investigates how classroom noise and acoustics affect auditory and cognitive performance among 131 children in three primary schools in northeast Italy. Student performance was assessed using standardised tests evaluating working memory, verbal short- and long-term memory, and visuospatial memory. Children were tested under two distinct acoustic conditions: ambient classroom noise and artificially induced noise (comprising a sequence of typical internal and external classroom sounds, intelligible speech, and unintelligible conversations). Prior to testing, hearing thresholds were assessed in order to reveal any existing impairments. Following each experimental session, children rated their perceived effort and fatigue in completing the tests. Acoustic characterisation of empty classrooms was performed using Reverberation Time (T20), Clarity (C50), and Speech Transmission Index (STI), while noise level was measured during all testing phases. Regression analysis was employed to correlate noise levels and reverberation times with class-average performance and perception scores. Results indicate that noise significantly impaired both verbal working memory and visual attention, increasing perceived effort and fatigue. Notably, both ambient and induced noise conditions exhibited comparable adverse effects on attentional and memory task performance. These findings underscore the critical importance of acoustic design in educational environments and provide empirical support for developing classroom acoustic standards. Full article
(This article belongs to the Special Issue Musical Acoustics and Sound Perception)
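Of the acoustic descriptors used here, Clarity C50 is straightforward to compute from a measured impulse response: the dB ratio of energy in the first 50 ms to the remaining energy. A minimal sketch with a synthetic exponential-decay impulse response follows (the sample rate and decay constant are illustrative).

```python
import numpy as np

def clarity_c50(ir: np.ndarray, fs: int) -> float:
    """C50 = 10*log10(early energy / late energy), split at 50 ms."""
    k = int(0.05 * fs)                 # 50 ms boundary in samples
    early = np.sum(ir[:k] ** 2)
    late = np.sum(ir[k:] ** 2)
    return 10 * np.log10(early / late)

fs = 48000
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)
ir = rng.normal(size=t.size) * np.exp(-t / 0.1)   # ~0.1 s decay constant
print(round(clarity_c50(ir, fs), 1))              # dB; higher = clearer speech
```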
