Search Results (99)

Search Parameters:
Keywords = acoustic voice analysis

17 pages, 559 KiB  
Systematic Review
Acoustic Voice Analysis as a Tool for Assessing Nasal Obstruction: A Systematic Review
by Gamze Yesilli-Puzella, Emilia Degni, Claudia Crescio, Lorenzo Bracciale, Pierpaolo Loreti, Davide Rizzo and Francesco Bussu
Appl. Sci. 2025, 15(15), 8423; https://doi.org/10.3390/app15158423 - 29 Jul 2025
Viewed by 178
Abstract
Objective: This study aims to critically review and synthesize the existing literature on the use of voice analysis in assessing nasal obstruction, with a particular focus on acoustic parameters. Data sources: PubMed, Scopus, Web of Science, Ovid Medline, and Science Direct. Review methods: A comprehensive literature search was conducted without any restrictions on publication year, employing Boolean search techniques. The selection and review process of the studies followed PRISMA guidelines. The inclusion criteria comprised studies with participants aged 18 years and older who had nasal obstruction evaluated using acoustic voice analysis parameters, along with objective and/or subjective methods for assessing nasal obstruction. Results: Of the 174 abstracts identified, 118 were screened after the removal of duplicates. The full texts of 37 articles were reviewed. Only 10 studies met inclusion criteria. The majority of these studies found no significant correlations between voice parameters and nasal obstruction. Among the various acoustic parameters examined, shimmer was the most consistently affected, with statistically significant changes identified in three independent studies. A smaller number of studies reported notable findings for fundamental frequency (F0) and noise-related measures such as NHR/HNR. Conclusion: This systematic review critically evaluates existing studies on the use of voice analysis for assessing and monitoring nasal obstruction and hyponasality. The current evidence remains limited, as most investigations predominantly focus on glottic sound and dysphonia, with insufficient attention to the influence of the vocal tract, particularly the nasal cavities, on voice production. A notable gap exists in the integration of advanced analytical approaches, such as machine learning, in this field. 
Future research should focus on advanced analytical approaches that can isolate the contribution of nasal resonance to the voice, thereby defining the specific parameters in the voice spectrogram that provide precise information on nasal obstruction.
(This article belongs to the Special Issue Innovative Digital Health Technologies and Their Applications)
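Shimmer, jitter, and F0 recur throughout the reviewed studies, and their period-level definitions are simple. As a rough numpy sketch, not the tooling used by any study above, local jitter and shimmer can be computed from a track of glottal periods and peak amplitudes (the synthetic tracks below are hypothetical):

```python
import numpy as np

def local_jitter(periods):
    """Mean absolute difference of consecutive periods, relative to the mean period."""
    periods = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def local_shimmer(amplitudes):
    """Mean absolute difference of consecutive peak amplitudes, relative to the mean amplitude."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

# Hypothetical period (s) and peak-amplitude tracks for a sustained vowel (~125 Hz voice)
rng = np.random.default_rng(0)
periods = 0.008 + rng.normal(0, 0.00005, 100)
amps = 0.70 + rng.normal(0, 0.02, 100)

print(f"jitter (local):  {100 * local_jitter(periods):.2f}%")
print(f"shimmer (local): {100 * local_shimmer(amps):.2f}%")
```

Dedicated tools (e.g., Praat) add period-extraction and outlier rules on top of these ratios, so their values will differ from this bare definition.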

12 pages, 445 KiB  
Article
The Effect of Phoniatric and Logopedic Rehabilitation on the Voice of Patients with Puberphonia
by Lidia Nawrocka, Agnieszka Garstecka and Anna Sinkiewicz
J. Clin. Med. 2025, 14(15), 5350; https://doi.org/10.3390/jcm14155350 - 29 Jul 2025
Viewed by 268
Abstract
Background/Objective: Puberphonia is a voice disorder characterized by the persistence of a high-pitched voice in sexually mature males. In phoniatrics and speech-language pathology, it is also known as post-mutational voice instability, mutational falsetto, persistent fistulous voice, or functional falsetto. The absence of an age-appropriate vocal pitch may adversely affect psychological well-being and hinder personal, social, and occupational functioning. The aim of this study was to evaluate the impact of phoniatric and logopedic rehabilitation on voice quality in patients with puberphonia. Methods: The study included 18 male patients, aged 16 to 34 years, rehabilitated for voice mutation disorders. Phoniatric and logopedic rehabilitation included voice therapy tailored to each subject. A logopedist led exercises aimed at lowering and stabilizing the pitch of the voice and improving its quality. A phoniatrician supervised the therapy, monitoring the condition of the vocal apparatus and providing additional diagnostic and therapeutic recommendations as needed. The duration and intensity of the therapy were adjusted for each patient. Before and after voice rehabilitation, the subjects completed the following questionnaires: the Voice Handicap Index (VHI), the Vocal Tract Discomfort (VTD) scale, and the Voice-Related Quality of Life (V-RQOL). They also underwent an acoustic voice analysis. Results: Statistical analysis of the VHI, VTD, and V-RQOL scores, as well as the voice’s acoustic parameters, showed statistically significant differences before and after rehabilitation (p < 0.005). Conclusions: Phoniatric and logopedic rehabilitation is an effective method of achieving and maintaining a stable, euphonic male voice in patients with functional puberphonia. Effective voice therapy positively impacts selected aspects of psychosocial functioning reported by patients, improves voice-related quality of life, and reduces physical discomfort in the vocal tract.
(This article belongs to the Section Otolaryngology)

10 pages, 678 KiB  
Article
Do Rare Genetic Conditions Exhibit a Specific Phonotype? A Comprehensive Description of the Vocal Traits Associated with Crisponi/Cold-Induced Sweating Syndrome Type 1
by Federico Calà, Elisabetta Sforza, Lucia D’Alatri, Lorenzo Frassineti, Claudia Manfredi, Roberta Onesimo, Donato Rigante, Marika Pane, Serenella Servidei, Guido Primiano, Giangiorgio Crisponi, Laura Crisponi, Chiara Leoni, Antonio Lanatà and Giuseppe Zampino
Genes 2025, 16(8), 881; https://doi.org/10.3390/genes16080881 - 26 Jul 2025
Viewed by 224
Abstract
Background: Perceptual analysis has highlighted that the voice characteristics of patients with rare congenital genetic syndromes differ from those of normophonic subjects. In this paper, we describe the voice phenotype, also called the phonotype, of patients with Crisponi/cold-induced sweating syndrome type 1 (CS/CISS1). Methods: We conducted an observational study at the Department of Life Sciences and Public Health, Rome. Thirteen patients were included in this study (five males; mean age: 16 years; SD: 10.63 years; median age: 12 years; age range: 6–44 years), and five were adults (38%). We prospectively recorded and analyzed acoustical features of three corner vowels [a], [i], and [u]. For perceptual analysis, the GIRBAS (grade, instability, roughness, breathiness, asthenia, and strain) scale was utilized. Acoustic analysis was performed through BioVoice software. Results: We found that CS/CISS1 patients share a common phonotype characterized by articulation disorders and hyper-rhinophonia. Conclusions: This study contributes to delineating the voice of CS/CISS1 syndrome. The phonotype can represent one of the earliest indicators for detecting rare congenital conditions, enabling specialists to reduce diagnosis time and better define a spectrum of rare and ultra-rare diseases.
(This article belongs to the Section Human Genomics and Genetic Diseases)

23 pages, 3741 KiB  
Article
Multi-Corpus Benchmarking of CNN and LSTM Models for Speaker Gender and Age Profiling
by Jorge Jorrin-Coz, Mariko Nakano, Hector Perez-Meana and Leobardo Hernandez-Gonzalez
Computation 2025, 13(8), 177; https://doi.org/10.3390/computation13080177 - 23 Jul 2025
Viewed by 287
Abstract
Speaker profiling systems are often evaluated on a single corpus, which complicates reliable comparison. We present a fully reproducible evaluation pipeline that trains Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) models independently on three speech corpora representing distinct recording conditions—studio-quality TIMIT, crowdsourced Mozilla Common Voice, and in-the-wild VoxCeleb1. All models share the same architecture, optimizer, and data preprocessing; no corpus-specific hyperparameter tuning is applied. We perform a detailed preprocessing and feature extraction procedure, evaluating multiple configurations and validating their effectiveness in improving the obtained results. A feature analysis shows that Mel spectrograms benefit CNNs, whereas Mel Frequency Cepstral Coefficients (MFCCs) suit LSTMs, and that the optimal Mel-bin count grows with the corpus signal-to-noise ratio (SNR). With this fixed recipe, EfficientNet achieves 99.82% gender accuracy on Common Voice (+1.25 pp over the previous best) and 98.86% on VoxCeleb1 (+0.57 pp). MobileNet attains 99.86% age-group accuracy on Common Voice (+2.86 pp) and a 5.35-year MAE for age estimation on TIMIT using a lightweight configuration. The consistent, near-state-of-the-art results across three acoustically diverse datasets substantiate the robustness and versatility of the proposed pipeline. Code and pre-trained weights are released to facilitate downstream research.
(This article belongs to the Section Computational Engineering)
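The Mel-versus-MFCC contrast above is easy to make concrete: both start from a mel filterbank applied to the FFT power spectrum, and MFCCs are the discrete cosine transform of the resulting log energies. A minimal numpy sketch of the filterbank, not the paper's actual pipeline:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters, evenly spaced on the mel scale, over an FFT power spectrum."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

fb = mel_filterbank(n_mels=40, n_fft=512, sr=16000)
frame = np.random.default_rng(1).normal(size=512)        # stand-in audio frame
power_spectrum = np.abs(np.fft.rfft(frame)) ** 2
log_mel = np.log(fb @ power_spectrum + 1e-10)            # CNN input (one frame)
# MFCCs would be the DCT of log_mel, giving the decorrelated LSTM features
print(log_mel.shape)
```

The "Mel-bin count" tuned per corpus corresponds to `n_mels` here.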

16 pages, 317 KiB  
Perspective
Listening to the Mind: Integrating Vocal Biomarkers into Digital Health
by Irene Rodrigo and Jon Andoni Duñabeitia
Brain Sci. 2025, 15(7), 762; https://doi.org/10.3390/brainsci15070762 - 18 Jul 2025
Viewed by 528
Abstract
The human voice is an invaluable tool for communication, carrying information about a speaker’s emotional state and cognitive health. Recent research highlights the potential of acoustic biomarkers to detect early signs of mental health and neurodegenerative conditions. Despite their promise, vocal biomarkers remain underutilized in clinical settings, with limited standardized protocols for assessment. This Perspective article argues for the integration of acoustic biomarkers into digital health solutions to improve the detection and monitoring of cognitive impairment and emotional disturbances. Advances in speech analysis and machine learning have demonstrated the feasibility of using voice features such as pitch, jitter, shimmer, and speech rate to assess these conditions. Moreover, we propose that singing, particularly simple melodic structures, could be an effective and accessible means of gathering vocal biomarkers, offering additional insights into cognitive and emotional states. Given its potential to engage multiple neural networks, singing could function as an assessment tool and an intervention strategy for individuals with cognitive decline. We highlight the necessity of further research to establish robust, reproducible methodologies for analyzing vocal biomarkers and standardizing voice-based diagnostic approaches. By integrating vocal analysis into routine health assessments, clinicians and researchers could significantly advance early detection and personalized interventions for cognitive and emotional disorders.
(This article belongs to the Topic Language: From Hearing to Speech and Writing)
19 pages, 1039 KiB  
Article
Prediction of Parkinson Disease Using Long-Term, Short-Term Acoustic Features Based on Machine Learning
by Mehdi Rashidi, Serena Arima, Andrea Claudio Stetco, Chiara Coppola, Debora Musarò, Marco Greco, Marina Damato, Filomena My, Angela Lupo, Marta Lorenzo, Antonio Danieli, Giuseppe Maruccio, Alberto Argentiero, Andrea Buccoliero, Marcello Dorian Donzella and Michele Maffia
Brain Sci. 2025, 15(7), 739; https://doi.org/10.3390/brainsci15070739 - 10 Jul 2025
Viewed by 504
Abstract
Background: Parkinson’s disease (PD) is the second most common neurodegenerative disorder after Alzheimer’s disease, affecting millions of individuals worldwide. PD is characterized by the onset of marked motor symptomatology in association with several non-motor manifestations. The clinical phase of the disease is usually preceded by a long prodromal phase, devoid of overt motor symptomatology but often showing conditions such as sleep disturbance, constipation, anosmia, and phonatory changes. Speech analysis appears to be a promising digital biomarker, potentially anticipating the onset of clinical PD by as much as 10 years, as well as serving as a useful prognostic tool for patient follow-up. The voice is therefore a candidate for non-invasive discrimination of PD patients from healthy subjects (HS). Methods: We conducted a cross-sectional study to analyze voice impairment. A dataset comprising 81 voice samples (41 from healthy individuals and 40 from PD patients) was utilized to train and evaluate common machine learning (ML) models using various types of features, including long-term features (jitter, shimmer, and cepstral peak prominence (CPP)), short-term features (Mel-frequency cepstral coefficients (MFCCs)), and non-standard measurements (pitch period entropy (PPE) and recurrence period density entropy (RPDE)). The study adopted multiple ML algorithms, including random forest (RF), K-nearest neighbors (KNN), decision tree (DT), naïve Bayes (NB), support vector machines (SVM), and logistic regression (LR). Cross-validation was applied to ensure the reliability of performance metrics on the train and test subsets. These metrics (accuracy, recall, and precision) help determine the most effective models for distinguishing PD from healthy subjects. Results: Among all the algorithms used in this research, RF was the best-performing model, achieving an accuracy of 82.72% with a ROC-AUC score of 89.65%. Although other models could be considered, such as SVM with an accuracy of 75.29% and a ROC-AUC score of 82.63%, RF remained the best when evaluated across all metrics; the KNN and DT models performed the worst. Notably, by combining a comprehensive set of long-term, short-term, and non-standard acoustic features, unlike previous studies that typically focused on only a subset, our study achieved higher predictive performance, offering a more robust model for early PD detection. Conclusions: This study highlights the potential of combining advanced acoustic analysis with ML algorithms to develop non-invasive and reliable tools for early PD detection, offering substantial benefits for the healthcare sector.
(This article belongs to the Section Neurodegenerative Diseases)
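K-fold cross-validation, used above to stabilize accuracy estimates on a small 81-sample dataset, can be sketched with a deliberately simple nearest-centroid stand-in for the paper's classifiers (RF, SVM, etc.); the two-class feature matrix below is synthetic, standing in for long- and short-term voice features:

```python
import numpy as np

def kfold_accuracy(X, y, k=5, seed=0):
    """Mean accuracy of a nearest-centroid classifier over k cross-validation folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Class centroids estimated on the training folds only
        centroids = {c: X[train][y[train] == c].mean(axis=0) for c in np.unique(y[train])}
        preds = [min(centroids, key=lambda c: np.linalg.norm(x - centroids[c])) for x in X[test]]
        accs.append(np.mean(np.array(preds) == y[test]))
    return float(np.mean(accs))

# Synthetic 2-class dataset (40 "healthy" + 40 "PD" rows, 5 features each)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(2, 1, (40, 5))])
y = np.concatenate([np.zeros(40, dtype=int), np.ones(40, dtype=int)])
print(f"cross-validated accuracy: {kfold_accuracy(X, y):.2f}")
```

The point of the fold loop is that every accuracy estimate comes from samples the centroids never saw, which is what makes the averaged metric reliable on small datasets.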

19 pages, 661 KiB  
Article
Prediction of Voice Therapy Outcomes Using Machine Learning Approaches and SHAP Analysis: A K-VRQOL-Based Analysis
by Ji Hye Park, Ah Ra Jung, Ji-Na Lee and Ji-Yeoun Lee
Appl. Sci. 2025, 15(13), 7045; https://doi.org/10.3390/app15137045 - 23 Jun 2025
Viewed by 250
Abstract
This study aims to identify personal, clinical, and acoustic predictors of therapy outcomes based on changes in Korean voice-related quality of life (K-VRQOL) scores, as well as to compare the predictive performance of traditional regression and machine learning models. A total of 102 participants undergoing voice therapy are retrospectively analyzed. Multiple regression analysis and four machine learning algorithms—random forest (RF), gradient boosting (GB), light gradient boosting machine (LightGBM), and extreme gradient boosting (XGBoost)—are applied to predict changes in K-VRQOL scores across the total, physical, and emotional domains. The Shapley additive explanations (SHAP) approach is used to evaluate the relative contribution of each variable to the prediction outcomes. Female gender and comorbidity status emerge as significant predictors in both the total and physical domains. Among the acoustic features, jitter, speaking fundamental frequency (SFF), and maximum phonation time (MPT) are closely associated with improvements in physical voice function. LightGBM demonstrates the best overall performance, particularly in the total domain (R2 = 32.54%), while GB excels in the physical domain. The emotional domain shows relatively low predictive power across the models. SHAP analysis reveals interpretable patterns, highlighting jitter and SFF as key contributors in high-performing models. Integrating statistical and machine learning approaches provides a robust framework for predicting and interpreting voice therapy outcomes. These findings support the use of explainable artificial intelligence (AI) to enhance clinical decision-making and pave the way for personalized voice rehabilitation strategies.

18 pages, 2817 KiB  
Article
Relationship Between Voice Analysis and Functional Status in Patients with Amyotrophic Lateral Sclerosis
by Margarita Pérez-Bonilla, Paola Díaz Borrego, Marina Mora-Ortiz, Roberto Fernández-Baillo, María Nieves Muñoz-Alcaraz, Fernando J. Mayordomo-Riera and Eloy Girela López
Audiol. Res. 2025, 15(3), 53; https://doi.org/10.3390/audiolres15030053 - 7 May 2025
Cited by 1 | Viewed by 1018
Abstract
Background: Amyotrophic Lateral Sclerosis (ALS) is a progressive neurodegenerative disease affecting both upper and lower motor neurons, with bulbar dysfunction manifesting in up to 80% of patients. Dysarthria, characterized by impaired speech production, is common in ALS and often correlates with disease severity. Voice analysis has emerged as a promising tool for detecting disease progression and monitoring functional status. Methods: This study investigates acoustic and biomechanical voice alterations in ALS patients and their association with clinical measures of functional independence. A descriptive observational case series study was conducted, involving 43 ALS patients and 43 age- and sex-matched controls with non-neurological voice disorders. Sustained vowel /a/ recordings were obtained and analyzed using Voice Clinical Systems® and Praat software (version 6.2.22). Biomechanical and acoustic parameters were correlated with ALS Functional Rating Scale-Revised (ALSFRS-R) and Barthel Index scores. Results: Significant differences were observed between ALS and control groups (muscle force, muscle tension, and interedge distance were elevated in non-ALS individuals). Comparing bulbar and spinal ALS subtypes, bulbar ALS patients showed elevated values in certain parameters, indicating irregular vocal fold contact and weakened phonatory control, while spinal ALS patients showed increases suggesting higher phonatory muscle tension. Elevated biomechanical parameters were significantly correlated with low ALSFRS-R scores, suggesting a possible relationship between voice measures and functional decline. However, acoustic measurements showed no relationship with performance status. Conclusions: These results highlight the potential of voice analysis as a non-invasive, objective tool for monitoring ALS stage and differentiating between subtypes. Further research is needed to validate these findings and explore their clinical applications.

12 pages, 3240 KiB  
Article
AI-Driven Data Analysis for Asthma Risk Prediction
by Meng-Han Chen, Guanling Lee and Lun-Ping Hung
Healthcare 2025, 13(7), 774; https://doi.org/10.3390/healthcare13070774 - 31 Mar 2025
Cited by 1 | Viewed by 917
Abstract
Background: Asthma is a well-known otolaryngological and immunological disorder that affects patients worldwide. Currently, the primary diagnosis relies on a combination of clinical history, physical examination findings consistent with asthma, and objective evidence of reversible airflow obstruction. However, the diagnostic process can be invasive and time-consuming, which limits clinical efficiency and accessibility. Objectives: In this study, an AI-based prediction system was developed, leveraging voice changes caused by respiratory contraction due to asthma to create a machine learning (ML)-based clinical decision support system. Methods: A total of 1500 speech samples—comprising high-pitch, normal-pitch, and low-pitch recitations of the phonemes [i, a, u]—were used. Long-Term Average Spectrum (LTAS) and Single-Frequency Filtering Cepstral Coefficients (SFCCs) were extracted as features for classification. Seven machine learning algorithms were employed to assess the feasibility of asthma prediction. Results: The Decision Tree, CNN, and LSTM models achieved average accuracies above 0.8, with results of 0.88, 0.80, and 0.84, respectively. Observational results indicate that the Decision Tree model performed best for high-pitch phonemes, whereas the LSTM model outperformed others in normal-pitch and low-pitch phonemes. Additionally, to validate model efficiency and enhance interpretability, feature importance analysis and overall average spectral analysis were applied. Conclusions: This study aims to provide medical clinicians with accurate and reliable decision-making support, improving the efficiency of asthma diagnosis through AI-driven acoustic analysis.

16 pages, 1110 KiB  
Article
How Anxiety State Influences Speech Parameters: A Network Analysis Study from a Real Stressed Scenario
by Qingyi Wang, Feifei Xu, Xianyang Wang, Shengjun Wu, Lei Ren and Xufeng Liu
Brain Sci. 2025, 15(3), 262; https://doi.org/10.3390/brainsci15030262 - 28 Feb 2025
Viewed by 1159
Abstract
Background/Objectives: Voice analysis has shown promise in anxiety assessment, yet traditional approaches examining isolated acoustic features yield inconsistent results. This study aimed to explore the relationship between anxiety states and vocal parameters from a network perspective in ecologically valid settings. Methods: A cross-sectional study was conducted with 316 undergraduate students (191 males, 125 females; mean age 20.3 ± 0.85 years) who completed a standardized picture description task while their speech was recorded. Participants were categorized into low-anxiety (n = 119) and high-anxiety (n = 197) groups based on self-reported anxiety ratings. Five acoustic parameters—jitter, fundamental frequency (F0), formant frequencies (F1/F2), intensity, and speech rate—were analyzed using network analysis. Results: Network analysis revealed a robust negative relationship between jitter and state anxiety, with jitter as the sole speech parameter consistently linked to state anxiety in the total group. Additionally, higher anxiety levels were associated with a coupling between intensity and F1/F2, whereas the low-anxiety network displayed a sparser organization without an intensity-F1/F2 connection. Conclusions: Anxiety could be recognized by speech parameter networks in ecological settings. The distinct pattern with the negative jitter-anxiety relationship in the total network and the connection between intensity and F1/F2 in high-anxiety states suggests potential speech markers for anxiety assessment. These findings suggest that state anxiety may directly influence jitter and fundamentally restructure the relationships among speech features, highlighting the importance of examining jitter and speech parameter interactions rather than isolated values in speech detection of anxiety.
(This article belongs to the Section Neuropsychiatry)
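A common way to build such speech-parameter networks is via partial correlations derived from the precision (inverse covariance) matrix. The study's exact estimator is not specified here, so this numpy sketch uses a plain unregularized version with a hypothetical pruning threshold in place of the regularized estimators typical of network psychometrics:

```python
import numpy as np

def partial_corr_network(X, threshold=0.1):
    """Adjacency matrix of partial correlations; weak edges (|pc| < threshold) are pruned."""
    P = np.linalg.inv(np.cov(X, rowvar=False))       # precision matrix
    d = np.sqrt(np.diag(P))
    pc = -P / np.outer(d, d)                         # partial correlations
    adj = np.where(np.abs(pc) >= threshold, pc, 0.0)
    np.fill_diagonal(adj, 0.0)                       # no self-loops
    return adj

# Hypothetical data: columns stand in for jitter, F0, F1/F2, intensity, speech rate
rng = np.random.default_rng(3)
latent = rng.normal(size=(300, 1))                   # shared driver inducing correlations
X = np.hstack([latent + rng.normal(0, 0.5, (300, 1)) for _ in range(5)])
print(np.round(partial_corr_network(X), 2))
```

Each nonzero entry is then an edge, and its weight is the association between two parameters after conditioning on all the others, which is what distinguishes these networks from raw correlation matrices.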

15 pages, 3599 KiB  
Article
Exploring Agreement in Voice Acoustic Parameters: A Repeated Measures Case Study Across Varied Recording Instruments, Speech Samples, and Daily Timeframes
by Lady Catherine Cantor-Cutiva, Adrián Castillo-Allendes and Eric James Hunter
Acoustics 2025, 7(1), 6; https://doi.org/10.3390/acoustics7010006 - 22 Jan 2025
Cited by 1 | Viewed by 1609
Abstract
Aims: The aim was to assess the agreement between microphone-derived and neck accelerometer-derived voice acoustic parameters and their associations with recording moments and speech types. Methods: Using simultaneous recordings, a 7-week study on a single individual was conducted to reduce intersubject variability. Agreement was assessed using Bland–Altman plots, and associations were examined with generalized estimating equations. Results: Bland–Altman plots showed no significant bias between microphone (MIC) and accelerometer (ACC) measurements for alpha ratio, CPP, PPE, SPL SD, fundamental frequency (fo) mean, and SD. Speech type and measurement timing were significantly associated with alpha ratio, while the instrument was not. Microphone measurements resulted in slightly lower CPP compared to the accelerometer, while reading samples yielded higher CPP compared to vowel productions. PPE, SPL SD, and fo mean showed significant associations with speech type, based on univariate analysis. Microphone measurements yielded a statistically smaller fo SD compared to the accelerometer, while reading productions had a larger fo SD than vowel productions. Conclusions: Fundamental frequency, alpha ratio, PPE, and SPL SD values were robust, regardless of the instrument used, suggesting the potential use of accelerometers in less-controlled environments. These findings are crucial for enhancing confidence in voice metrics and exploring efficient clinical assessment protocols.
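Bland–Altman agreement, used above to compare microphone- and accelerometer-derived measures, reduces to the bias and 95% limits of agreement of the paired differences. A small numpy sketch with hypothetical paired fo values (not the study's data):

```python
import numpy as np

def bland_altman(a, b):
    """Bias and 95% limits of agreement between two measurement methods."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)   # ±1.96 SD of the differences
    return bias, bias - half_width, bias + half_width

# Hypothetical paired fo means (Hz) from microphone vs. accelerometer recordings
rng = np.random.default_rng(7)
mic = rng.normal(200, 15, 50)
acc = mic + rng.normal(0.2, 1.0, 50)       # small, noisy device offset
bias, lo, hi = bland_altman(mic, acc)
print(f"bias = {bias:.2f} Hz, 95% LoA = [{lo:.2f}, {hi:.2f}] Hz")
```

"No significant bias" in the abstract corresponds to the line of equality (zero difference) falling inside the confidence band around this bias estimate.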

27 pages, 2443 KiB  
Article
The Domestic Acoustic Environment in Online Education—Part 2: Different Interference Perception of Sound Sources and While Conducting Academic Tasks
by Virginia Puyana-Romero, Angela María Díaz-Márquez, Christiam Garzón and Giuseppe Ciaburro
Buildings 2025, 15(1), 93; https://doi.org/10.3390/buildings15010093 - 30 Dec 2024
Viewed by 992
Abstract
Noise is increasingly recognized as a factor impacting health, including its effects on online education. However, differences in the perception of acoustic environmental factors have been scarcely analyzed. This study aimed to evaluate perceived differences in the interference of five types of sound (traffic, voices, TV/radio/household appliances, music, and animals) while conducting autonomous and synchronous activities during online learning. It also aimed to identify which of a group of 4 synchronous and 6 autonomous activities are most affected by the domestic acoustic environment. The data were obtained from a survey distributed online among the students of the Universidad de las Américas in Quito, Ecuador. The differences between acoustical variables were evaluated using frequentist and inferential analysis. Findings indicated that traffic noise was the least disruptive sound for autonomous activities, likely due to reduced vehicle circulation during the COVID-19 lockdown. In contrast, voices were identified as the most disturbing noise source, underscoring that background speech can significantly disrupt concentration. Additionally, domestic noise is more disturbing while taking exams than during problem-solving tasks, comprehensive reading, or group work, probably because during exams students cannot control unwanted sound sources. These outcomes underscore the need for acoustic strategies in domestic educational settings to reduce noise-related distractions.

13 pages, 956 KiB  
Article
Associations of Voice Metrics with Postural Function in Parkinson’s Disease
by Anna Carolyna Gianlorenço, Valton Costa, Walter Fabris-Moraes, Paulo Eduardo Portes Teixeira, Paola Gonzalez, Kevin Pacheco-Barrios, Ciro Ramos-Estebanez, Arianna Di Stadio, Mirret M. El-Hagrassy, Deniz Durok Camsari, Tim Wagner, Laura Dipietro and Felipe Fregni
Life 2025, 15(1), 27; https://doi.org/10.3390/life15010027 - 30 Dec 2024
Viewed by 1120
Abstract
Background: This study aimed to explore the potential associations between voice metrics of patients with Parkinson’s disease (PD) and their motor symptoms. Methods: Motor and vocal data including the Unified Parkinson’s Disease Rating Scale part III (UPDRS-III), Harmonic–Noise Ratio (HNR), jitter, shimmer, and smoothed cepstral peak prominence (CPPS) were analyzed through exploratory correlations followed by univariate linear regression analyses. We employed these four voice metrics as independent variables and the total and sub-scores of the UPDRS-III as dependent variables. Results: Thirteen subjects were included, 76% males and 24% females, with a mean age of 62.9 ± 10.1 years, and a median Hoehn and Yahr stage of 2.3 ± 0.7. The regression analysis showed that CPPS is associated with posture (UPDRS-III posture scores: β = −0.196; F = 10.0; p = 0.01; R2 = 0.50) and UPDRS-III postural stability scores (β = −0.130; F = 5.57; p = 0.04; R2 = 0.35). Additionally, the associations between CPPS and rapid alternating movement (β = −0.297; p = 0.07), rigidity (β = −0.36; p = 0.11), and body bradykinesia (β = −0.16; p = 0.13) showed a trend towards significance. Conclusion: These findings highlight the potential role of CPPS as a predictor of postural impairments secondary to PD, emphasizing the need for further investigation.
(This article belongs to the Special Issue New Trends in Otorhinolaryngology)
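The perturbation measures used in this study have simple cycle-by-cycle definitions: local jitter is the mean absolute difference between consecutive glottal-cycle periods relative to the mean period, and local shimmer is the same measure applied to cycle peak amplitudes. A minimal sketch of these standard definitions (the toy period and amplitude values below are invented for illustration; in practice these are extracted from the audio, e.g. with Praat):

```python
from statistics import mean

def local_jitter(periods):
    """Local jitter (%): mean absolute difference between consecutive
    glottal-cycle periods, divided by the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * mean(diffs) / mean(periods)

def local_shimmer(amplitudes):
    """Local shimmer (%): the analogous measure on cycle peak amplitudes."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return 100.0 * mean(diffs) / mean(amplitudes)

# Toy cycle-by-cycle measurements (periods in seconds, amplitudes in
# arbitrary units) standing in for values measured from a recording.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amps = [0.80, 0.78, 0.82, 0.79, 0.81]

print(f"jitter:  {local_jitter(periods):.2f}%")
print(f"shimmer: {local_shimmer(amps):.2f}%")
```

A perfectly periodic signal yields 0% for both; pathological voices typically show elevated values. CPPS and HNR require spectral analysis and are not shown here.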
21 pages, 273 KiB  
Article
Prospective Voice Assessment After Thyroidectomy Without Recurrent Laryngeal Nerve Injury
by Ivana Šimić Prgomet, Stjepan Frkanec, Ika Gugić Radojković and Drago Prgomet
Diagnostics 2025, 15(1), 37; https://doi.org/10.3390/diagnostics15010037 - 27 Dec 2024
Viewed by 968
Abstract
Background: Thyroidectomy, a surgical procedure for thyroid disorders, is associated with postoperative voice changes, even in cases without recurrent laryngeal nerve (RLN) injury. Our study evaluates the prevalence and predictors of voice disorders in thyroidectomy patients without RLN injury. Methods: Our single-center prospective study at the University Hospital Center Zagreb included 243 patients, with pre- and postoperative voice evaluations using acoustic analysis and videostroboscopy. Logistic regression, chi-square, MANOVA, and non-parametric tests assessed the impact of surgical, sociodemographic, and lifestyle factors. Results: The study analyzed 243 participants (141 lobectomy, 102 total thyroidectomy). Postoperative voice disorders occurred in 200 patients (100 lobectomy, 100 total thyroidectomy); 43 (17.7%) experienced no voice disorders. Significant associations were observed for surgery type (χ2 = 29.88, p < 0.001), with total thyroidectomy having higher risk, surgery duration (χ2 = 16.40, p < 0.001), thyroid volume (χ2 = 4.24, p = 0.045), and BMI (χ2 = 8.97, p = 0.011). Gender and age showed no significant correlation. Acoustic parameters differed significantly, with lobectomy patients showing better intensity, jitter, and shimmer values across postoperative measurements. Logistic regression identified surgery type (Exp(B) = 16.533, p = 0.001) and thyroid volume (Exp(B) = 2.335, p = 0.023) as predictors of voice disorders, achieving 82.7% classification accuracy. Multivariate analysis confirmed gender and surgery duration as significant contributors. Surgery duration exceeding 90 min and enlarged thyroid volume negatively influenced outcomes. Significant acoustic differences were also linked to BMI categories, with obese participants exhibiting poorer parameters, particularly shimmer and jitter. Conclusions: Surgery type, thyroid volume, BMI, and surgery duration are most likely significant predictors of postoperative voice disorders. Full article
(This article belongs to the Special Issue Diagnosis and Management of Thyroid Disorders)
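The Exp(B) values reported above are odds ratios: the exponential of a fitted logistic-regression coefficient, giving the multiplicative change in the odds of a voice disorder per unit change in the predictor. A minimal sketch with a single binary predictor, fit by plain gradient ascent on the log-likelihood (the data below are invented for illustration and do not reproduce the study's numbers):

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=20000):
    """Fit y ~ sigmoid(b0 + b1*x) by gradient ascent on the
    log-likelihood; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
        b0 += lr * g0 / len(xs)
        b1 += lr * g1 / len(xs)
    return b0, b1

# Toy data: x = 1 for total thyroidectomy, 0 for lobectomy;
# y = 1 if a postoperative voice disorder was observed.
xs = [0] * 10 + [1] * 10
ys = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1] + [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print(f"Exp(B) for surgery type: {math.exp(b1):.2f}")
```

For this 2x2 toy table the odds are 4/6 in the lobectomy group and 8/2 in the total-thyroidectomy group, so the fitted Exp(B) converges to their ratio (6.0). An Exp(B) of 16.5, as reported in the study, would mean roughly sixteen-fold higher odds for total thyroidectomy.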
36 pages, 4793 KiB  
Article
Cross-Regional Patterns of Obstruent Voicing and Gemination: The Case of Roman and Veneto Italian
by Angelo Dian, John Hajek and Janet Fletcher
Languages 2024, 9(12), 383; https://doi.org/10.3390/languages9120383 - 20 Dec 2024
Viewed by 1768
Abstract
Italian has a length contrast in its series of voiced and voiceless obstruents while also presenting phonetic differences across regional varieties. Northern varieties of the language, including Veneto Italian (VI), are described as maintaining the voicing contrast but, in some cases, not the length contrast. In central and southern varieties, the opposite trend may occur. For instance, Roman Italian (RI) is reported to optionally pre-voice intervocalic voiceless singleton obstruents whilst also maintaining the length contrast for this consonant class. This study looks at the acoustic realization of selected obstruents in VI and RI and investigates (a) prevoicing patterns and (b) the effects and interactions of regional variety, gemination, and (phonological and phonetic) voicing on consonant (C) and preceding-vowel (V) durations, as well as the ratio between the two (C/V), with particular focus on the latter measure. An acoustic phonetic analysis is conducted on 3703 tokens from six speakers from each variety, producing eight repetitions of 40 real CV́C(C)V and CVC(C)V́CV words embedded in carrier sentences, with /p, pp, t, tt, k, kk, b, bb, d, dd, ɡ, ɡɡ, f, ff, v, vv, tʃ, ttʃ, dʒ, ddʒ/ as the target intervocalic consonants. The results show that both VI and RI speakers produce geminates, yielding high C/V ratios in both varieties, although there are cross-regional differences in the realization of singletons. On the one hand, RI speakers tend to pre-voice voiceless singletons and produce overall shorter C durations and lower C/V ratios for these consonants. On the other hand, VI speakers produce longer C durations and higher C/V ratios for all voiceless singletons, triggering some overlap between the C length categories, which results in partial degemination through singleton lengthening, although only for voiceless obstruents. The implications of a trading relationship between phonetic voicing and duration of obstruents in Italian gemination are discussed.
Full article
(This article belongs to the Special Issue Speech Variation in Contemporary Italian)
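The C/V ratio used above is simply the consonant duration divided by the duration of the preceding vowel; geminates combine a longer consonant with a shorter preceding stressed vowel, so the ratio separates the two length categories better than consonant duration alone. A minimal sketch (the durations below are invented for illustration, not taken from the study's data):

```python
def cv_ratio(c_ms, v_ms):
    """Consonant-to-preceding-vowel duration ratio (C/V)."""
    return c_ms / v_ms

# Illustrative (invented) durations in ms for a singleton /t/ vs a
# geminate /tt/ after a stressed vowel, as in Italian "fato" vs "fatto".
tokens = {
    "fato  (singleton /t/)": (90.0, 110.0),
    "fatto (geminate /tt/)": (180.0, 75.0),
}
for word, (c, v) in tokens.items():
    print(f"{word}: C/V = {cv_ratio(c, v):.2f}")
```

With these toy values the geminate's ratio is roughly three times the singleton's, illustrating why singleton lengthening in VI (longer C, hence higher C/V) pushes the two categories toward overlap.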