Special Issue on Computational Methods and Engineering Solutions to Voice II


Introduction
Today, research into voice and speech is no longer limited to acoustic, medical, and clinical studies. Building on fruitful interdisciplinary research groups, many new approaches have been suggested during the last decade. As this Special Issue will show, these include highly advanced numerical models simulating part or all of the complex fluid-structure-acoustic interaction (FSAI), the investigation of glottal dynamics by high-speed video endoscopy (HSV), and sophisticated machine-learning-based data analysis methods.
Contributions from different fields, such as mathematics, computer science, artificial intelligence, fluid dynamics, mechatronics, and biology, have yielded, and will continue to yield, new insights into and a better understanding of the physiological and pathological laryngeal processes within voice and speech production. The purpose of this Special Issue is to provide an insight into the newest and most innovative techniques applied in our field at the beginning of a new decade. Young colleagues in particular contributed their work, indicating that voice and speech research is a highly interesting scientific area and that we do not have to worry about scientific progress and new findings in the years to come.

High-Speed Video Endoscopy
High-speed video endoscopy (HSV), with recording rates of 4000 fps and more, has become increasingly popular in voice analysis during the last two decades. The advantage of HSV over video stroboscopy is that it allows for quantitative evaluation of vocal fold dynamics. Accordingly, many parameters have been suggested that are computed on kymograms, phonovibrograms, and the glottal area waveform (GAW). However, the clinical interpretation and meaningfulness of the suggested parameters are often not clear or not entirely understood. The paper by Yamauchi et al. [1] applies multivariate analysis to find key parameters reflecting specific voice disorders. The authors determined and described different key parameters for vocal fold paralysis, sulcus vocalis, vocal fold scarring, atrophy, and laryngeal cancer, showing that different disorders are reflected in different parameters. The paper by Ikuma et al. [2] suggests two new parameters, named harmonic disturbance factor (HDF) and biphonic index (BI), for the detection of the bifurcations that frequently occur during disordered vocal fold oscillations. A shortcoming of common HSV systems is that vocal fold vibrations cannot be quantified in metric units. Hence, Ghasemzadeh et al. [3] propose a calibration method in combination with a laser projection unit to overcome this shortcoming.
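To make the notion of a GAW-based parameter concrete, the following minimal Python sketch derives a glottal area waveform from already segmented frames and computes one simple parameter from it. It is an illustration under stated assumptions, not the method of any contribution in this issue: the binary masks, the function names `glottal_area_waveform` and `open_quotient`, and the toy data are all hypothetical, and the open quotient is taken over the whole sequence, which only matches the usual per-cycle definition if the frames span whole oscillation cycles. The area is in pixels, which reflects the missing metric calibration mentioned above.

```python
from typing import List, Sequence

# One segmented HSV frame: 1 marks a glottis pixel, 0 marks tissue (assumed input format).
Mask = Sequence[Sequence[int]]


def glottal_area_waveform(masks: Sequence[Mask]) -> List[int]:
    """Return the glottal area of every frame in pixels (not in metric units)."""
    return [sum(sum(row) for row in mask) for mask in masks]


def open_quotient(gaw: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of frames whose glottal area exceeds `threshold`.

    Assumes the sequence covers whole oscillation cycles.
    """
    if not gaw:
        raise ValueError("empty waveform")
    open_frames = sum(1 for area in gaw if area > threshold)
    return open_frames / len(gaw)


# Toy data: four tiny 2x3 frames standing in for one oscillation cycle.
frames = [
    [[0, 0, 0], [0, 0, 0]],  # closed glottis
    [[0, 1, 0], [0, 1, 0]],  # opening
    [[1, 1, 0], [0, 1, 1]],  # maximally open
    [[0, 1, 0], [0, 0, 0]],  # closing
]

gaw = glottal_area_waveform(frames)
oq = open_quotient(gaw)

# At an assumed recording rate of 4000 fps, consecutive frames are 0.25 ms apart.
frame_rate_fps = 4000
time_ms = [1000.0 * i / frame_rate_fps for i in range(len(gaw))]
```

In practice the masks would come from a segmentation step, and parameters would be evaluated cycle by cycle rather than over a fixed window.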

Numerical Modelling
Computational fluid dynamics (CFD) and computational aeroacoustics (CAA), applying highly sophisticated two-dimensional and three-dimensional models in combination with high-performance computing (HPC), have become common tools in our field. These numerical models, using finite element methods (FEM) or finite volume methods (FVM), have been highly useful in analysing the complex FSAI during the phonatory process. Balázsová et al. [4] use a nonlinear elasticity model of the vocal folds to simulate vocal fold vibrations excited by a compressible airflow. They show that the nonlinear elasticity model computes the vocal fold deformation with substantially higher accuracy than linear elasticity models. Li et al. [5] conclude from their study that the vertical glottal duct length in the convergent glottis has important effects on the phonation process and should be specified when using computational and physical models. Li et al. [6] suggest an FEM model for the investigation of unilateral vocal fold paralysis to improve surgical outcomes. Bodaghi et al. [7] study the effect of subglottic stenosis on vocal fold vibration and voice production, considering the entire fluid-structure-acoustic interaction process in a three-dimensional model. Schoder et al. [8] investigate the aeroacoustic sound source, applying the so-called Perturbed Convective Wave Equation (PCWE). They conclude from their study that turbulent structures increase the broadband component of the voice signal, supporting previous assumptions on the effect of glottal closure and glottal insufficiency. Lasota et al. [9] investigate the impact of the sub-grid scale (SGS) turbulence model in aeroacoustic simulations. Finally, Rosenthal et al. [10] investigate the mechanics of straw phonation, introducing a new electrical circuit-based model of the vocal tract as a transmission line, as a counterpart to CFD modelling.

Machine Learning
Methods from the field of artificial intelligence (AI) are currently applied in most clinical and medical research areas and are, of course, also applied in voice and speech production research. Gao et al. [11] use surface electromyography (sEMG) data to classify vocal fatigue with support vector machines (SVM). Devaray and Aichinger [12] also use an SVM classifier to investigate vocal fry, a phonatory phenomenon that is becoming increasingly popular in the younger population. Yousef et al. [13] apply the k-means clustering method in combination with active contour modelling to automatically segment vocal fold edges in HSV data during running speech.
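As a rough illustration of how k-means clustering can separate the dark glottal gap from brighter tissue, the sketch below runs a plain Lloyd's algorithm on the grey values of a single image row. It is not the pipeline of Yousef et al. [13]: the active contour step is omitted entirely, and the function name `kmeans_1d`, the deterministic initialisation, and the toy intensities are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int = 2,
              iters: int = 50) -> Tuple[List[float], List[int]]:
    """Lloyd's algorithm on scalar values; returns (centroids, label per value)."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: every value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid becomes the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels


# Toy grey values along one image row: bright tissue, dark glottal gap, bright tissue.
row = [200, 190, 30, 20, 25, 185, 210]

centroids, labels = kmeans_1d(row, k=2)

# The cluster with the darkest centroid is taken to be the glottis (an assumption).
dark = centroids.index(min(centroids))
glottis_mask = [1 if lab == dark else 0 for lab in labels]
```

The transitions between 0 and 1 in `glottis_mask` give a coarse edge estimate per row; a contour model would then refine such estimates into smooth vocal fold edges.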
The Special Issue is completed by three contributions that do not belong to the previous topics but deal with mucus analysis and investigations of the acoustic signal. The influence of laryngeal mucus on vocal fold oscillations has been discussed during the last few years. Peters et al. [14] perform rheological characterization of human mucus by particle tracking micro-rheology and oscillatory shear rheology to gain further insight into its viscoelasticity. In their preliminary study, they found great diversity and grouped the mucus samples into three categories. Schlegel et al. [15] investigate acoustic parameters using in vivo porcine models to enable quantitative tracking of voice outcomes after laryngeal surgical interventions in such models. Vojtech et al. [16] analyse phonatory offset and onset processes and characteristics, using the relative fundamental frequency.

Acknowledgments:
This issue would not be possible without the contributions of various talented authors, hardworking and professional reviewers, and the dedicated editorial team of Applied Sciences. Congratulations to all authors. The feedback, comments, and suggestions from the reviewers and editors helped the authors to improve their papers. I would like to take this opportunity to record my sincere gratitude to all reviewers. Finally, I place on record my gratitude to the editorial team of Applied Sciences, and special thanks to Jennifer Li, Associated Publisher, who always supported and helped when necessary.

Conflicts of Interest:
The author declares no conflict of interest.