Search Results (23)

Search Parameters:
Keywords = vocal fold oscillations

21 pages, 2798 KiB  
Article
High-Speed Videoendoscopy and Stiffness Mapping for AI-Assisted Glottic Lesion Differentiation
by Magdalena M. Pietrzak, Justyna Kałuża-Olszewska, Ewa Niebudek-Bogusz, Artur Klepaczko and Wioletta Pietruszewska
Cancers 2025, 17(8), 1376; https://doi.org/10.3390/cancers17081376 - 21 Apr 2025
Viewed by 502
Abstract
Objectives: This study evaluates the potential of high-speed videoendoscopy (HSV) in differentiating between benign and malignant glottic lesions, offering a non-invasive diagnostic tool for clinicians. Moreover, a new HSV-derived parameter was proposed and implemented in the analysis for an objective assessment of vocal fold stiffness. Methods: HSV was conducted on 102 participants, including 21 normophonic individuals, 39 patients with benign vocal fold lesions, and 42 with glottic cancer. A laryngotopographic parameter describing vocal fold stiffness (SAI) and kymographic parameters describing amplitude, symmetry, and glottal dynamics were quantified. Statistical differences between groups were assessed using receiver operating characteristic (ROC) analysis, and lesion classification was performed using a machine learning model. Results: Univariate ROC analysis revealed that SAI (AUC = 0.91, 95% CI: 0.839–0.962) and weighted amplitude asymmetry (AUC = 0.92, 95% CI: 0.85–0.974) were highly effective in distinguishing between normophonic individuals and patients with organic lesions (p < 0.01). Further multivariate analysis using machine learning models demonstrated improved accuracy, with the SVM classifier achieving an AUC of 0.93 for detecting organic lesions and 0.83 for distinguishing benign from malignant lesions. Conclusions: The study demonstrates the potential value of a parameter describing the pliability of the infiltrated vocal fold (SAI) as a non-invasive tool to support histopathological evaluation in laryngeal lesions, with machine learning models enhancing diagnostic performance.
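As a rough illustration of the univariate ROC analysis mentioned in this abstract, the sketch below computes an AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half. The function name `roc_auc` and the sample values are hypothetical; they are not taken from the study, and the authors' actual analysis pipeline is not described in the abstract.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC by pairwise comparison (equivalent to the normalized Mann-Whitney U)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical parameter values, for illustration only.
lesion = [0.8, 0.6, 0.9, 0.7]
normal = [0.3, 0.6, 0.2]
print(roc_auc(lesion, normal))
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of this size; a rank-based implementation would be preferable for large datasets.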
(This article belongs to the Special Issue Application of Biostatistics in Cancer Research)

15 pages, 3132 KiB  
Article
Liquid Lens Optical Design for Adjustable Laser Spot Array for the Laser-Based Three-Dimensional Reconstruction of Vocal Fold Oscillations
by Benjamin Haas, Rose Mary, Kristian Cvecek, Clemens Roider, Michael Schmidt, Michael Döllinger and Marion Semmler
Optics 2025, 6(1), 10; https://doi.org/10.3390/opt6010010 - 12 Mar 2025
Viewed by 729
Abstract
Standard endoscopy of vocal folds is in general limited to two-dimensional imaging. Laser-based 3D imaging offers not only absolute measurements but also the possibility of assessing all three spatial directions. However, due to inter-individual variability, a fixed grid configuration (with fixed edge length and spot size) does not necessarily provide the best coverage and resolution. We present a liquid lens optical design for a diffractive spot array generator with dynamic adjustment capabilities for both array size and spot size. The tunable nature of the liquid lenses enables precise control over the spot array generated by a diffractive optical element (DOE). The first liquid lens controls the spot divergence in the observation plane, while the second liquid lens adjusts the zoom factor. The optical configuration provides a dynamic range of 1.8 with respect to array size, significantly enhancing adaptability in imaging across various applications.
(This article belongs to the Special Issue Advanced Optical Imaging for Biomedicine)

22 pages, 6742 KiB  
Article
Comparative Evaluation of High-Speed Videoendoscopy and Laryngovideostroboscopy for Functional Laryngeal Assessment in Clinical Practice
by Joanna Hoffman, Magda Barańska, Ewa Niebudek-Bogusz and Wioletta Pietruszewska
J. Clin. Med. 2025, 14(5), 1723; https://doi.org/10.3390/jcm14051723 - 4 Mar 2025
Viewed by 1122
Abstract
Advancements in dynamic laryngeal imaging, particularly high-speed videoendoscopy (HSV), have addressed several limitations of laryngovideostroboscopy (LVS). This study aimed to compare the success rates of LVS and HSV in generating recordings suitable for objective functional assessment of vocal fold movements. Methods: This study included 200 patients with voice disorders (123 with benign glottal lesions, 56 with malignant lesions, and 21 with functional voice disorders) and 47 normophonic individuals. All participants underwent LVS followed by HSV. Kymographic analysis was performed to evaluate phonatory parameters, including amplitude, symmetry, and glottal dynamics. The success of both methods in generating analyzable kymograms was assessed, and statistical comparisons were made using the chi-square test (significance level set at p < 0.05). Results: The failure rate for LVS was significantly higher (43.32%) compared to HSV. HSV successfully generated kymograms in 68.22% of cases where LVS failed. The primary factors contributing to LVS failure included synchronization issues, inadequate recording brightness, unstable phonation, and hidden glottal opening. Failure rates related to structural obstacles were similar between the two methods. HSV demonstrated superior kymogram feasibility across all subgroups, with the highest success observed in cases of organic glottal pathologies (30.73%). A significant advantage of HSV was observed for both benign and malignant glottal lesions, especially in cases of asynchronous vocal fold oscillations. Conclusions: By overcoming the inherent limitations of LVS, HSV provides a more reliable and objective assessment of phonatory function. Its ability to generate suitable kymograms with greater precision makes HSV a valuable tool for routine clinical diagnostics, enabling the accurate identification of subtle laryngeal pathologies and enhancing diagnostic accuracy.
(This article belongs to the Special Issue New Advances in the Management of Voice Disorders)

17 pages, 5763 KiB  
Article
Assessment of the Interdependencies Between High-Speed Videoendoscopy and Simultaneously Recorded Audio Data in Various Glottal Pathologies
by Magdalena M. Pietrzak, Wioletta Pietruszewska, Magda Barańska, Aleksander Rycerz, Konrad Stawiski and Ewa Niebudek-Bogusz
Biomedicines 2025, 13(2), 511; https://doi.org/10.3390/biomedicines13020511 - 18 Feb 2025
Viewed by 525
Abstract
Background: This study aimed to investigate the relationships between kymographic parameters derived from high-speed videoendoscopy (HSV) and simultaneously recorded acoustic signals. The research provides insights into the vibratory dynamics of various glottal pathologies, assessed across different glottal widths, and their mutual relations with audio data. Methods: The study included 192 participants categorized as normophonic or having functional or organic lesions (benign, premalignant, and malignant). Parameters describing vocal fold oscillations were calculated using HSV kymography for three glottal widths, along with corresponding acoustic data. Initially, linear correlations between these parameters were assessed. Next, the consistency in cycle detection and its influence on the correlation levels were evaluated. Results: The fundamental frequency (F0) and mean Jitter (Jita) showed the highest correlations between the HSV- and audio-determined parameters (F0: 0.97, Jita: 0.40–0.70), with even stronger correlations when the number of detected cycles was consistent (F0: 0.99, Jita: 0.68–0.98). The correlations for other parameters ranged from low to moderate, with no significant differences observed between the diagnostic subgroups (functional changes and benign and malignant glottal lesions). However, in the premalignant lesions group, high correlations (0.77–0.9) were observed between the HSV and audio parameters, but only for measures describing period perturbations. Beyond F0 and mean Jitter, consistency in cycle detection did not significantly affect correlation levels. Conclusions: The simultaneous audio signal proved useful in verifying the accuracy of HSV quantification measures, particularly for F0, which showed strong agreement between the methods. Discrepancies in other parameters and low correlations between HSV-derived kymography and audio data may suggest the influence of the throat, mouth, and nose resonators, which are added to the glottal signal. While the kymographic analysis based on HSV provides detailed descriptions of vocal fold oscillations, it does not fully capture the three-dimensional structure and complex functionality of the vocal folds.
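The two best-agreeing measures in this abstract, F0 and Jita, can both be computed from a sequence of detected cycle periods. The sketch below assumes the commonly used definition of absolute jitter (the mean absolute difference between consecutive periods, reported in microseconds) and takes mean F0 as the reciprocal of the mean period; the abstract does not state the exact formulas used, and the function names and sample periods are hypothetical.

```python
def mean_f0_hz(periods_s):
    """Mean fundamental frequency as the reciprocal of the mean cycle period."""
    if not periods_s:
        raise ValueError("need at least one period")
    return len(periods_s) / sum(periods_s)


def jita_us(periods_s):
    """Absolute jitter: mean |T[i] - T[i+1]| over consecutive cycles, in microseconds."""
    if len(periods_s) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    return 1e6 * sum(diffs) / len(diffs)


# Hypothetical cycle periods in seconds, for illustration only.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
print(round(mean_f0_hz(periods), 2), "Hz")
print(round(jita_us(periods), 2), "microseconds")
```

Because Jita depends on differences between neighbouring cycles, a single missed or spurious cycle changes it far more than it changes mean F0, which is in line with the reported sensitivity of the Jita correlations to consistent cycle detection.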
(This article belongs to the Section Biomedical Engineering and Materials)

24 pages, 4555 KiB  
Review
Biophysics of Voice Onset: A Comprehensive Overview
by Philippe H. DeJonckere and Jean Lebacq
Bioengineering 2025, 12(2), 155; https://doi.org/10.3390/bioengineering12020155 - 6 Feb 2025
Viewed by 1557
Abstract
Voice onset is the sequence of events between the first detectable movement of the vocal folds (VFs) and the stable vibration of the vocal folds. It is considered a critical phase of phonation, and the different modalities of voice onset and their distinctive characteristics are analysed. Oscillation of the VFs can start from either a closed glottis with no airflow or an open glottis with airflow. The objective of this article is to provide a comprehensive survey of this transient phenomenon, from a biomechanical point of view, in normal modal (i.e., nonpathological) conditions of vocal emission. This synthetic overview mainly relies upon a number of recent experimental studies, all based on in vivo physiological measurements, and using a common, original and consistent methodology which combines high-speed imaging, sound analysis, electro-, photo-, flow- and ultrasound glottography. In this way, the two basic parameters (the instantaneous glottal area and the airflow) can be measured, and the instantaneous intraglottal pressure can be automatically calculated from the combined records, which gives a detailed insight, both qualitative and quantitative, into the onset phenomenon. The similarity of the methodology enables a link to be made with the biomechanics of sustained phonation. The temporal relationship between the glottal area and the intraglottal pressure is essential. The three key findings are: (1) From the initial onset cycles onwards, the intraglottal pressure signal leads that of the opening signal, as in sustained voicing, which is the basic condition for an energy transfer from the lung pressure to the VF tissue. (2) This phase lead is primarily due to the skewing of the airflow curve to the right with respect to the glottal area curve, a consequence of the compressibility of air and the inertance of the vocal tract. (3) In case of a soft, physiological onset, the glottis shows a spindle-shaped configuration just before the oscillation begins. Using the same parameters (airflow, glottal area, intraglottal pressure), the mechanism of triggering the oscillation can be explained by the intraglottal aerodynamic condition. From the first cycles on, the VFs oscillate on either side of a paramedian axis. The amplitude of these free oscillations increases progressively before the first contact on the midline. Whether the first movement is lateral or medial cannot be defined. Moreover, this comprehensive synthesis of onset biomechanics and the links it creates sheds new light on comparable phenomena at the level of sound attack in wind instruments, as well as phenomena such as the production of intervals in the sung voice.
(This article belongs to the Special Issue The Biophysics of Vocal Onset)

17 pages, 7473 KiB  
Article
Three-Dimensional Analysis of Vocal Fold Oscillations: Correlating Superior and Medial Surface Dynamics Using Ex Vivo Human Hemilarynges
by Reinhard Veltrup, Susanne Angerer, Elena Gessner, Friederike Matheis, Emily Sümmerer, Jann-Ole Henningson, Michael Döllinger and Marion Semmler
Bioengineering 2024, 11(10), 977; https://doi.org/10.3390/bioengineering11100977 - 28 Sep 2024
Viewed by 1360
Abstract
The primary acoustic signal of the voice is generated by the complex oscillation of the vocal folds (VFs), yet physicians can barely examine the medial VF surface because it is anatomically inaccessible. In this study, we investigated possibilities to infer medial surface dynamics by analyzing correlations in the oscillatory behavior of the superior and medial VF surfaces of four human hemilarynges, each in 24 different combinations of flow rate, VF adduction, and elongation. The two surfaces were recorded synchronously during sustained phonation using two high-speed camera setups and were subsequently 3D-reconstructed. The 3D surface parameters of mean and maximum velocities and displacements and general phonation parameters were calculated. The VF oscillations were also analyzed using empirical eigenfunctions (EEFs) and mucosal wave propagation, calculated from medial surface trajectories. Strong linear correlations were found between the 3D parameters of the superior and medial VF surfaces, ranging from 0.8 to 0.95. The linear regressions showed similar values for the maximum velocities across all hemilarynges (0.69–0.9), making maximum velocity the most promising parameter for predicting the medial surface. Since excessive VF velocities are suspected to cause phono-trauma and VF polyps, this parameter could provide added value to laryngeal diagnostics in the future.
(This article belongs to the Section Biomedical Engineering and Biomaterials)

15 pages, 5547 KiB  
Technical Note
Pragmatic De-Noising of Electroglottographic Signals
by Sten Ternström
Bioengineering 2024, 11(5), 479; https://doi.org/10.3390/bioengineering11050479 - 11 May 2024
Cited by 1 | Viewed by 2109
Abstract
In voice analysis, the electroglottographic (EGG) signal has long been recognized as a useful complement to the acoustic signal, but only when the vocal folds are actually contacting, such that this signal has an appreciable amplitude. However, phonation can also occur without the vocal folds contacting, as in breathy voice, in which case the EGG amplitude is low, but not zero. It is of great interest to identify the transition from non-contacting to contacting, because this will substantially change the nature of the vocal fold oscillations; however, that transition is not in itself audible. The magnitude of the cycle-normalized peak derivative of the EGG signal is a convenient indicator of vocal fold contacting, but no current EGG hardware has a sufficient signal-to-noise ratio of the derivative. We show how the textbook techniques of spectral thresholding and static notch filtering are straightforward to implement, can run in real time, and can mitigate several noise problems in EGG hardware. This can be useful to researchers in vocology.
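To make the idea of static notch filtering concrete, here is a minimal sketch of a second-order (biquad) notch filter applied sample by sample, which is the kind of structure that can run in real time. The coefficient formulas follow the widely used audio "cookbook" biquad design; this is an assumed design choice, not the implementation described in the technical note, and the 50 Hz hum frequency, sampling rate, and Q value are hypothetical.

```python
import math


def notch_coefficients(f0_hz, fs_hz, q):
    """Biquad notch coefficients, normalized so that a[0] == 1."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def biquad_filter(x, b, a):
    """Direct-form I difference equation, one output sample per input sample."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


# Hypothetical setup: remove 50 Hz hum from a signal sampled at 1 kHz.
fs = 1000.0
b, a = notch_coefficients(50.0, fs, q=5.0)
hum = [math.sin(2.0 * math.pi * 50.0 * n / fs) for n in range(2000)]
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(2000)]
hum_out = biquad_filter(hum, b, a)
tone_out = biquad_filter(tone, b, a)
print(max(abs(v) for v in hum_out[-500:]))   # residual hum after the transient
print(max(abs(v) for v in tone_out[-500:]))  # 200 Hz component is largely preserved
```

A higher Q narrows the notch and spares more of the neighbouring spectrum, at the cost of a longer start-up transient.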
(This article belongs to the Special Issue Models and Analysis of Vocal Emissions for Biomedical Applications)

16 pages, 6324 KiB  
Article
Simultaneous High-Speed Video Laryngoscopy and Acoustic Aerodynamic Recordings during Vocal Onset of Variable Sound Pressure Level: A Preliminary Study
by Peak Woo
Bioengineering 2024, 11(4), 334; https://doi.org/10.3390/bioengineering11040334 - 29 Mar 2024
Cited by 3 | Viewed by 1537
Abstract
Voicing requires frequent starts and stops at various sound pressure levels (SPL) and frequencies. Prior investigations using rigid laryngoscopy with oral endoscopy have shown variations in the duration of the vibration delay between normal and abnormal subjects. However, these studies were not physiological because the larynx was viewed using rigid endoscopes. We adapted a method to perform high-speed naso-endoscopic video while simultaneously acquiring the sound pressure, fundamental frequency, airflow rate, and subglottic pressure. This study aimed to investigate voice onset patterns in normophonic males and females during the onset of variable SPL and correlate them with acoustic and aerodynamic data. Materials and Methods: Three healthy males and three healthy females were studied by simultaneous high-speed video laryngoscopy and recording with the production of the gesture [pa:pa:] at soft, medium, and loud voices. The fiber optic endoscope was threaded through a pneumotachograph mask for the simultaneous recording and analysis of acoustic and aerodynamic data. Results: The average increase in the sound pressure level (SPL) for the group was 15 dB, from 70 to 85 dB. The fundamental frequency increased by an average of 10 Hz. The flow was increased in two subjects, reduced in two subjects, and remained the same in two subjects as the SPL increased. There was a steady increase in the subglottic pressure from soft to loud phonation. Compared to soft-to-medium phonation, a significant increase in glottal resistance was observed with medium-to-loud phonation. Videokymogram analysis showed the onset of vibration for all voiced tokens without the need for full glottis closure. In loud phonation, there is a more rapid onset of a larger amplitude and prolonged closure of the glottal cycle; however, more cycles are required to achieve the intended SPL. There was a prolonged closed phase during loud phonation. Fast Fourier transform (FFT) analysis of the kymography waveform signal showed a more significant second- and third-harmonic energy above the fundamental frequency with loud phonation. There was an increase in the adjustments in the pharynx with the base of the tongue tilting, shortening of the vocal folds, and pharyngeal constriction. Conclusion: Voice onset occurs in all modalities, without the need for full glottal closure. There was a more significant increase in glottal resistance with loud phonation than that with soft or middle phonation. Vibration analysis of the voice onset showed that more time was required during loud phonation before the oscillation stabilized to a steady state. With increasing SPL, there were significant variations in vocal tract adjustments. The most apparent change was the increase in tongue tension with posterior displacement of the epiglottis. There was an increase in pre-phonation time during loud phonation. Patterns of muscle tension dysphonia with laryngeal squeezing, shortening of the vocal folds, and epiglottis tilting with increasing loudness are features of loud phonation. These observations show that flexible high-speed video laryngoscopy can reveal observations that cannot be observed with rigid video laryngoscopy. An objective analysis of the digital kymography signal can be conducted in selected cases.
(This article belongs to the Special Issue The Biophysics of Vocal Onset)

18 pages, 4569 KiB  
Article
Deep Learning for Neuromuscular Control of Vocal Source for Voice Production
by Anil Palaparthi, Rishi K. Alluri and Ingo R. Titze
Appl. Sci. 2024, 14(2), 769; https://doi.org/10.3390/app14020769 - 16 Jan 2024
Cited by 1 | Viewed by 2327
Abstract
A computational neuromuscular control system that generates lung pressure and three intrinsic laryngeal muscle activations (cricothyroid, thyroarytenoid, and lateral cricoarytenoid) to control the vocal source was developed. In the current study, LeTalker, a biophysical computational model of the vocal system, was used as the physical plant. In the LeTalker, a three-mass vocal fold model was used to simulate self-sustained vocal fold oscillation. A constant /ə/ vowel was used for the vocal tract shape. The trachea was modeled after MRI measurements. The neuromuscular control system generates control parameters to achieve four acoustic targets (fundamental frequency, sound pressure level, normalized spectral centroid, and signal-to-noise ratio) and four somatosensory targets (vocal fold length, and longitudinal fiber stress in the three vocal fold layers). The deep-learning-based control system comprises one acoustic feedforward controller and two feedback (acoustic and somatosensory) controllers. Fifty thousand steady speech signals were generated using the LeTalker for training the control system. The results demonstrated that the control system was able to generate the lung pressure and the three muscle activations such that the four acoustic and four somatosensory targets were reached with high accuracy. After training, the motor command corrections from the feedback controllers were minimal compared to the feedforward controller except for thyroarytenoid muscle activation.
(This article belongs to the Special Issue Computational Methods and Engineering Solutions to Voice III)

17 pages, 1545 KiB  
Article
Confounding Factor Analysis for Vocal Fold Oscillations
by Deniz Gençağa
Entropy 2023, 25(12), 1577; https://doi.org/10.3390/e25121577 - 23 Nov 2023
Viewed by 1208
Abstract
This paper provides a methodology to better understand the relationships between different aspects of vocal fold motion, which are used as features in machine learning-based approaches for detecting respiratory infections from voice recordings. The relationships are derived through a joint multivariate analysis of the vocal fold oscillations of speakers. Specifically, the multivariate setting explores the displacements and velocities of the left and right vocal folds derived from recordings of five extended vowel sounds for each speaker (/aa/, /iy/, /ey/, /uw/, and /ow/). In this multivariate setting, the differences between the bivariate and conditional interactions are analyzed by information-theoretic quantities based on transfer entropy. Incorporation of the conditional quantities reveals information regarding the confounding factors that can influence the statistical interactions among other pairs of variables. This is demonstrated on a vector autoregressive process where the analytical derivations can be carried out. As a proof of concept, the methodology is applied to a clinically curated COVID-19 dataset. The findings suggest that the interaction between the vocal fold oscillations can change according to the individual and the presence of any respiratory infection, such as COVID-19. The results are important in the sense that the proposed approach can be utilized to determine the selection of appropriate features as a supplementary or early detection tool in voice-based diagnostics in future studies.
(This article belongs to the Special Issue Information-Theoretic Approaches in Speech Processing and Recognition)

18 pages, 24683 KiB  
Article
An Investigation of Acoustic Back-Coupling in Human Phonation on a Synthetic Larynx Model
by Christoph Näger, Stefan Kniesburges, Bogac Tur, Stefan Schoder and Stefan Becker
Bioengineering 2023, 10(12), 1343; https://doi.org/10.3390/bioengineering10121343 - 22 Nov 2023
Cited by 6 | Viewed by 1666
Abstract
In the human phonation process, acoustic standing waves in the vocal tract can influence the fluid flow through the glottis as well as vocal fold oscillation. To investigate the amount of acoustic back-coupling, the supraglottal flow field has been recorded via high-speed particle image velocimetry (PIV) in a synthetic larynx model for several configurations with different vocal tract lengths. Based on the obtained velocity fields, acoustic source terms were computed. Additionally, the sound radiation into the far field was recorded via microphone measurements and the vocal fold oscillation via high-speed camera recordings. The PIV measurements revealed that near a vocal tract resonance frequency fR, the vocal fold oscillation frequency fo (and therefore also the flow field’s fundamental frequency) jumps onto fR. This is accompanied by a substantial relative increase in aeroacoustic sound generation efficiency. Furthermore, the measurements show that fo-fR-coupling increases vocal efficiency, signal-to-noise ratio, harmonics-to-noise ratio and cepstral peak prominence. At the same time, the glottal volume flow needed for stable vocal fold oscillation decreases strongly. All of this results in an improved voice quality and phonation efficiency so that a person phonating with fo-fR-coupling can phonate longer and with better voice quality.

19 pages, 4440 KiB  
Article
Effect of Ligament Fibers on Dynamics of Synthetic, Self-Oscillating Vocal Folds in a Biomimetic Larynx Model
by Bogac Tur, Lucia Gühring, Olaf Wendler, Samuel Schlicht, Dietmar Drummer and Stefan Kniesburges
Bioengineering 2023, 10(10), 1130; https://doi.org/10.3390/bioengineering10101130 - 26 Sep 2023
Cited by 7 | Viewed by 2209
Abstract
Synthetic silicone larynx models are essential for understanding the biomechanics of physiological and pathological vocal fold vibrations. The aim of this study is to investigate the effects of artificial ligament fibers on vocal fold vibrations in a synthetic larynx model, which is capable of replicating physiological laryngeal functions such as elongation, abduction, and adduction. A multi-layer silicone model with different mechanical properties for the musculus vocalis and the lamina propria consisting of ligament and mucosa was used. Ligament fibers of various diameters and break resistances were cast into the vocal folds and tested at different tension levels. An electromechanical setup was developed to mimic laryngeal physiology. The measurements included high-speed video recordings of vocal fold vibrations, subglottal pressure, and acoustics. For the evaluation of the vibration characteristics, all measured values were evaluated and compared with parameters from ex vivo and in vivo studies. The fundamental frequency of the synthetic larynx model was found to be approximately 200–520 Hz depending on integrated fiber types and tension levels. This range of the fundamental frequency corresponds to the reproduction of a female normal and singing voice range. The investigated voice parameters from vocal fold vibration, acoustics, and subglottal pressure were within normal value ranges from ex vivo and in vivo studies. The integration of ligament fibers leads to an increase in the fundamental frequency with increasing airflow, while the tensioning of the ligament fibers remains constant. In addition, a tension increase in the fibers also generates a rise in the fundamental frequency, matching the physiologically expected dynamic behavior of vocal folds.

15 pages, 4933 KiB  
Article
High-Speed Videoendoscopy Enhances the Objective Assessment of Glottic Organic Lesions: A Case-Control Study with Multivariable Data-Mining Model Development
by Jakub Malinowski, Wioletta Pietruszewska, Konrad Stawiski, Magdalena M. Pietrzak, Magda Barańska, Aleksander Rycerz and Ewa Niebudek-Bogusz
Cancers 2023, 15(14), 3716; https://doi.org/10.3390/cancers15143716 - 22 Jul 2023
Cited by 3 | Viewed by 2172
Abstract
The aim of the study was to utilize a quantitative assessment of the vibratory characteristics of vocal folds in diagnosing benign and malignant lesions of the glottis using high-speed videolaryngoscopy (HSV). Methods: Case-control study including 100 patients with unilateral vocal fold lesions in comparison to 38 normophonic subjects. Quantitative assessment with the determination of vocal fold oscillation parameters was performed based on HSV kymography. Machine-learning predictive models were developed and validated. Results: All calculated parameters differed significantly between healthy subjects and patients with organic lesions. The first predictive model distinguishing any organic lesion patients from healthy subjects reached an area under the curve (AUC) equal to 0.983 and presented with 89.3% accuracy, 97.0% sensitivity, and 71.4% specificity on the testing set. The second model identifying malignancy among organic lesions reached an AUC equal to 0.85 and presented with 80.6% accuracy, 100% sensitivity, and 71.1% specificity on the training set. Important predictive factors for the models were frequency perturbation measures. Conclusions: The standard protocol for distinguishing between benign and malignant lesions continues to be clinical evaluation by an experienced ENT specialist and confirmed by histopathological examination. Our findings did suggest that advanced machine learning models, which consider the complex interactions present in HSV data, could potentially indicate a heightened risk of malignancy. Therefore, this technology could prove pivotal in aiding in early cancer detection, thereby emphasizing the need for further investigation and validation.

39 pages, 2509 KiB  
Article
Deriving Vocal Fold Oscillation Information from Recorded Voice Signals Using Models of Phonation
by Wayne Zhao and Rita Singh
Entropy 2023, 25(7), 1039; https://doi.org/10.3390/e25071039 - 10 Jul 2023
Cited by 2 | Viewed by 2846
Abstract
During phonation, the vocal folds exhibit a self-sustained oscillatory motion, which is influenced by the physical properties of the speaker’s vocal folds and driven by the balance of bio-mechanical and aerodynamic forces across the glottis. Subtle changes in the speaker’s physical state can affect voice production and alter these oscillatory patterns. Measuring these can be valuable in developing computational tools that analyze voice to infer the speaker’s state. Traditionally, vocal fold oscillations (VFOs) are measured directly using physical devices in clinical settings. In this paper, we propose a novel analysis-by-synthesis approach that allows us to infer the VFOs directly from recorded speech signals on an individualized, speaker-by-speaker basis. The approach, called the ADLES-VFT algorithm, is proposed in the context of a joint model that combines a phonation model (with a glottal flow waveform as the output) and a vocal tract acoustic wave propagation model such that the output of the joint model is an estimated waveform. The ADLES-VFT algorithm is a forward-backward algorithm which minimizes the error between the recorded waveform and the output of this joint model to estimate its parameters. Once estimated, these parameter values are used in conjunction with a phonation model to obtain its solutions. Since the parameters correlate with the physical properties of the vocal folds of the speaker, model solutions obtained using them represent the individualized VFOs for each speaker. The approach is flexible and can be applied to various phonation models. In addition to presenting the methodology, we show how the VFOs can be quantified from a dynamical systems perspective for classification purposes. Mathematical derivations are provided in an appendix for better readability. Full article
(This article belongs to the Special Issue Information-Theoretic Approaches in Speech Processing and Recognition)

14 pages, 4932 KiB  
Article
Development of Parameters towards Voice Bifurcations
by Takeshi Ikuma, Andrew J. McWhorter, Lacey Adkins and Melda Kunduk
Appl. Sci. 2021, 11(12), 5469; https://doi.org/10.3390/app11125469 - 12 Jun 2021
Cited by 4 | Viewed by 2262
Abstract
Pathological vocal folds are known to exhibit multiple oscillation patterns, depending on tissue imbalance, subglottal pressure level, and other factors. This includes mid-phonation changes due to bifurcations in the underlying voice source system. Knowledge of when changes in oscillation patterns occur is helpful in the assessments of voice disorders, and the knowledge could be transformed into useful objective measures. Mid-phonation bifurcations can occur in rapid succession; hence, a fast classification of oscillation pattern is critical to minimize the averaging of data across bifurcations. This paper proposes frequency-ratio based short-term measures, named harmonic disturbance factor (HDF) and biphonic index (BI), towards the detection of the bifurcations. For the evaluation of HDF and BI, a frequency selection algorithm for glottal source signals is devised, and its efficacy is demonstrated with the glottal area waveforms of four cases, representing the wide range of oscillatory behaviors. The HDF and BI exhibit clear transitions when the voice bifurcations are apparent in the spectrograms. The presented proof-of-concept experiment’s outcomes warrant a larger scale study to formalize the parameters of the frequency selection algorithm. Full article
(This article belongs to the Special Issue Computational Methods and Engineering Solutions to Voice II)
