Current Trends and Future Directions in Voice Acoustics Measurement

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (31 October 2022) | Viewed by 25942

Special Issue Editor


Prof. Dr. Sten Ternström
Guest Editor
Department of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
Interests: voice acoustics; choir acoustics; voice analysis; voice synthesis; music acoustics; audio signal processing; audio technology; voice range profile; electroglottography

Special Issue Information

Dear Colleagues,

The human voice production mechanism implements a superbly rich communication channel that at once tells us what, who, how, and much more. This is thanks to its many degrees of freedom and its large variability. This same variability, however, presents a multitude of challenges in using the sound of the voice for making clinical voice assessments. Decades of research notwithstanding, many acoustic and other physical measures of voice are still not solidly established as clinical evidence, and this is true even though experienced clinicians can often hear what the problem is. There are several underlying reasons for this situation.

First, there appears to be a divide between, on the one hand, enthusiastic engineers continually searching for new voice metrics and pathology classification schemes, and on the other, pragmatic clinicians ever struggling to understand the metrics on offer, in the hope that essential quantitative evidence might be produced for the efficacy of treatments. It has long been clear that this divide must be bridged if real progress is to be made. The inherent complexity of voice production and its disorders makes this difficult but not impossible.

Second, there appears to be a lack of a common understanding of how variable the voice can be. The amount of normal and pathological variation both between and within individuals has profound consequences for how to elicit and sample vocal productions. Once this has been agreed upon, we can proceed to agree on how to negotiate this variability.

Third, it is not easy to generate a market pull for new methods that is sufficient for the healthcare tech industry to engage. There is no lack of voice metrics: there are already hundreds of them in the literature, and physicists, mathematicians, and engineers love to come up with new ones. However, very few of these people love to run the ensuing gauntlet of commercial deployment. Many clever ideas, perhaps even most of them, tend to fizzle out when the engineering students who prototyped them move on. Investors and industry, from which the serious resources must come, are not tempted to engage unless they see enough of a market pull to establish a critical mass of customers in clinical speech and voice care. Such a market pull presupposes not only proven practice and clinical consensus, but also interoperable technical and medical standards, regulations, and legislation, and not least persuading decision-makers at different levels of the health sector that investing in progress will increase productivity within their own domain. Resolving this chicken-and-egg situation amounts to a very tall order, beyond the remit of any single agency, and so voice analysis tends to move along as best it can, in academic labs and conferences. Can voice analysis find a way out into the world, there to earn its keep? The administrative wisdom and clout needed for implementing a productive set of top–down policies and incentives appear to be in short supply almost everywhere. Perhaps, however, many bottom–up efforts and proofs of principle could muster the critical mass, if researchers and clinicians can get their acts together even more. With this Special Issue, we attempt to sketch some inspirational input to such an effort.

Surely, we should not need more metrics. Rather, in order to achieve more expeditious clinical uptake and research advances, we need to identify the most useful metrics and methods, and learn how best to adopt them in stringent and deployable ways. This Special Issue also discusses how voice data—mostly acoustic—would need to be collected, analyzed, and interpreted in order to improve the evidential value of objective measurements. Although the focus here is on acoustic measures, for their convenient non-invasive nature, the ideas are equally applicable to measures from physiology, biomechanics, image processing, and aerodynamics. Indeed, it is probably through the intelligent collation of multimodal measurements that the voice will reveal essential aspects of its function.

In this Special Issue on Voice Acoustics Measurement, the contributors report not only on making various innovative measurements, but also on the more general and fundamentally important issues of acquisition, sampling, statistics, and clinical relevance. We believe these to be essential stepping stones to the achievement of broad clinical uptake and industrial engagement, which in turn are necessary for energizing continued advancement in voice research. New measurement paradigms, critical appraisals, fresh perspectives, and broad collaborations are encouraged. We thank all contributing authors in the present Issue for giving fine examples of such initiatives.

Prof. Dr. Sten Ternström
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • voice acoustics
  • voice measurement
  • variability
  • phonation
  • voice production
  • measurement sampling
  • voice maps
  • clinical relevance
  • machine learning

Published Papers (13 papers)

Editorial

3 pages, 182 KiB  
Editorial
Special Issue on Current Trends and Future Directions in Voice Acoustics Measurement
by Sten Ternström
Appl. Sci. 2023, 13(6), 3514; https://doi.org/10.3390/app13063514 - 09 Mar 2023
Cited by 3 | Viewed by 793
(This article belongs to the Special Issue Current Trends and Future Directions in Voice Acoustics Measurement)

Research

12 pages, 1228 KiB  
Article
Laryngeal Imaging Study of Glottal Attack/Offset Time in Adductor Spasmodic Dysphonia during Connected Speech
by Maryam Naghibolhosseini, Stephanie R. C. Zacharias, Sarah Zenas, Farrah Levesque and Dimitar D. Deliyski
Appl. Sci. 2023, 13(5), 2979; https://doi.org/10.3390/app13052979 - 25 Feb 2023
Cited by 4 | Viewed by 1986
Abstract
Adductor spasmodic dysphonia (AdSD) disrupts laryngeal muscle control during speech and, therefore, affects the onset and offset of phonation. In this study, the goal is to use laryngeal high-speed videoendoscopy (HSV) to measure the glottal attack time (GAT) and glottal offset time (GOT) during connected speech for normophonic (vocally normal) and AdSD voices. A monochrome HSV system was used to record readings of six CAPE-V sentences and part of the “Rainbow Passage” from the participants. Three raters visually analyzed the HSV data using playback software to measure the GAT and GOT. The results show that the GAT was greater in the AdSD group than in the normophonic group; however, the clinical significance of the size of this difference needs to be studied further. More variability was observed in both GATs and GOTs of the disorder group. Additionally, the GAT and GOT time series were found to be nonstationary for the AdSD group while they were stationary for the normophonic voices. This study shows that the GAT and GOT measures can potentially be used as objective markers to characterize AdSD. The findings may help in the development of standardized measures for voice evaluation and the accurate diagnosis of AdSD.

21 pages, 2876 KiB  
Article
Determination of Harmonic Parameters in Pathological Voices—Efficient Algorithm
by Joana Filipa Teixeira Fernandes, Diamantino Freitas, Arnaldo Candido Junior and João Paulo Teixeira
Appl. Sci. 2023, 13(4), 2333; https://doi.org/10.3390/app13042333 - 11 Feb 2023
Cited by 4 | Viewed by 2114
Abstract
The harmonic parameters Autocorrelation, Harmonic to Noise Ratio (HNR), and Noise to Harmonic Ratio are related to vocal quality, providing alternative measures of the harmonic energy of a speech signal. They will be used as inputs for an intelligent medical decision support system for the diagnosis of speech pathology. An efficient algorithm is important when implementing it on low-power devices. This article presents an algorithm that determines these parameters by optimizing the window type and length. The method compares the values produced by the algorithm, under different combinations of window type and length, against a reference value. Hamming, Hanning, and Blackman windows with lengths of 3, 6, 12, and 24 glottal cycles and various sampling frequencies were investigated. As a result, we present an efficient algorithm that determines the parameters using the Hanning window with a length of six glottal cycles. The mean difference from the reference is less than 0.004 for Autocorrelation and less than 0.42 dB for HNR. In conclusion, this algorithm extracts parameters close to the reference values. For Autocorrelation, sampling frequency has no significant effect. However, the algorithm should be used cautiously for HNR at lower sampling rates.
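
As a rough illustration of the kind of computation this abstract describes, the sketch below estimates the three harmonic parameters from one Hanning-windowed frame of six glottal cycles. It is not the authors' algorithm: the function names, the assumption that the glottal period is already known, the correction by the window's own autocorrelation, and the synthetic test signal are all choices made here for illustration.

```python
import math
import random

def hann_window(n):
    """Symmetric Hann ("Hanning") window of n samples."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def normalized_autocorr(x, lag):
    """Autocorrelation of x at `lag`, divided by its value at lag 0."""
    energy = sum(v * v for v in x)
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy

def harmonic_parameters(frame, period):
    """Return (autocorrelation, HNR in dB, NHR) for one analysis frame.

    `period` is the glottal period in samples; it is assumed to be known.
    """
    w = hann_window(len(frame))
    xw = [s * wi for s, wi in zip(frame, w)]
    # Divide out the window's own autocorrelation so that a perfectly
    # periodic frame gives a value close to 1.
    r = normalized_autocorr(xw, period) / normalized_autocorr(w, period)
    r = min(max(r, 1e-9), 1.0 - 1e-9)  # keep the logarithm finite
    hnr_db = 10.0 * math.log10(r / (1.0 - r))
    nhr = (1.0 - r) / r
    return r, hnr_db, nhr

# Hypothetical test signal: a 100 Hz tone sampled at 8 kHz,
# analysed over a frame of six glottal cycles.
fs, f0 = 8000, 100
period = fs // f0
n = 6 * period
clean = [math.sin(2.0 * math.pi * f0 * i / fs) for i in range(n)]
rng = random.Random(0)
noisy = [s + rng.gauss(0.0, 0.7) for s in clean]

print(harmonic_parameters(clean, period))
print(harmonic_parameters(noisy, period))
```

The clean tone should yield an autocorrelation near 1 and a high HNR, while the noisy version should yield a markedly lower HNR. A real implementation would also need to estimate the glottal period and handle frames at different sampling rates, which this sketch leaves out.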

14 pages, 894 KiB  
Communication
Do We Get What We Need from Clinical Acoustic Voice Measurements?
by Meike Brockmann-Bauser and Maria Francisca de Paula Soares
Appl. Sci. 2023, 13(2), 941; https://doi.org/10.3390/app13020941 - 10 Jan 2023
Cited by 6 | Viewed by 2282
Abstract
Instrumental acoustic measurements of the human voice have enormous potential to objectively describe pathology and, thereby, to assist clinical treatment decisions. Despite the increasing application and accessibility of technical knowledge and equipment, recent research has highlighted a lack of understanding of physiologic, speech/language-, and culture-related influencing factors. This article presents a critical review of the current state of the art in the clinical application of instrumental acoustic voice quality measurements and points out future directions for improving its applications and dissemination in less privileged populations. The main barriers to this research relate to (a) standardization and reporting of acoustic analysis techniques; (b) understanding of the relation between perceptual and instrumental acoustic results; (c) the necessity to account for natural speech-related covariables, such as differences in speaking voice sound pressure level (SPL) and fundamental frequency f0; (d) the need for a much larger database to understand normal variability within and between voice-disordered and vocally healthy individuals related to age, training, and physiologic factors; and (e) affordable equipment, including mobile communication devices, accessible in various settings. This calls for further research into technical developments and optimal assessment procedures for pathology-specific patient groups.

27 pages, 95915 KiB  
Article
Mapping Phonation Types by Clustering of Multiple Metrics
by Huanchen Cai and Sten Ternström
Appl. Sci. 2022, 12(23), 12092; https://doi.org/10.3390/app122312092 - 25 Nov 2022
Cited by 4 | Viewed by 1137
Abstract
For voice analysis, much work has been undertaken with a multitude of acoustic and electroglottographic metrics. However, few of these have proven to be robustly correlated with physical and physiological phenomena. In particular, all metrics are affected by the fundamental frequency and sound level, making voice assessment sensitive to the recording protocol. It was investigated whether combinations of metrics, acquired over voice maps rather than with individual sustained vowels, can offer a more functional and comprehensive interpretation. For this descriptive, retrospective study, 13 men, 13 women, and 22 children were instructed to phonate on /a/ over their full voice range. Six acoustic and EGG signal features were obtained for every phonatory cycle. An unsupervised voice classification model created feature clusters, which were then displayed on voice maps. It was found that the feature clusters may be readily interpreted in terms of phonation types. For example, the typical intense voice has a high peak EGG derivative, a relatively high contact quotient, low EGG cycle-rate entropy, and a high cepstral peak prominence in the voice signal, all represented by one cluster centroid that is mapped to a given color. In a transition region between the non-contacting and contacting of the vocal folds, the combination of metrics shows a low contact quotient and relatively high entropy, which can be mapped to a different color. Based on this data set, male phonation types could be clustered into up to six categories and female and child types into four. Combining acoustic and EGG metrics resolved more categories than either kind on its own. The inter- and intra-participant distributional features are discussed.
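
To make the idea of clustering per-cycle feature vectors concrete, here is a minimal k-means sketch. The abstract says only that an unsupervised classification model was used, so k-means is an assumption made here, as are the two-feature toy data (a contact quotient and an entropy value per cycle) and all function names. Real features would normally be scaled to comparable ranges first.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: returns (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical per-cycle features: (contact quotient, cycle-rate entropy).
cycles = [
    (0.55, 0.10), (0.60, 0.12), (0.50, 0.08),   # firm contact, low entropy
    (0.20, 0.80), (0.25, 0.85), (0.15, 0.75),   # weak contact, high entropy
]
centroids, labels = kmeans(cycles, k=2)
print(centroids)
print(labels)
```

Each cluster label could then be assigned a color and plotted at the cycle's position on a voice map, which is the display step the abstract describes.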

11 pages, 2840 KiB  
Article
Voice Simulation: The Next Generation
by Ingo R. Titze and Jorge C. Lucero
Appl. Sci. 2022, 12(22), 11720; https://doi.org/10.3390/app122211720 - 18 Nov 2022
Cited by 4 | Viewed by 1383
Abstract
Simulation of the acoustics and biomechanics of sound production in humans and animals began half a century ago. The three major components are the mechanics of tissue under self-sustained oscillation, the transport of air from the lungs to the lips, and the propagation of sound in the airways. Both low-dimensional and high-dimensional computer models have successfully predicted control of pitch, loudness, spectral content, vowel production, and many other features of speaking and singing. However, the problems of computational efficiency, validity, and accuracy have not been adequately addressed. Low-dimensional models are often more revealing of nonlinear phenomena in coupled oscillators, but the simplifying assumptions are not always validated. High-dimensional models can provide more accuracy, but interpretations of results are sometimes clouded by computational redundancy and uncertainty of parameters. The next generation will likely combine pre-calculations and machine learning with abbreviated critical calculations.

24 pages, 3410 KiB  
Article
The Role of Data Analytics in the Assessment of Pathological Speech—A Critical Appraisal
by Pedro Gómez-Vilda, Andrés Gómez-Rodellar, Daniel Palacios-Alonso, Victoria Rodellar-Biarge and Agustín Álvarez-Marquina
Appl. Sci. 2022, 12(21), 11095; https://doi.org/10.3390/app122111095 - 02 Nov 2022
Cited by 8 | Viewed by 1984
Abstract
Pathological voice characterization has received increasing attention over the last 20 years. Hundreds of studies have been published showing inventive approaches with very promising findings. Nevertheless, methodological issues might hamper performance assessment trustworthiness. This study reviews some critical aspects regarding data collection and processing, machine learning-oriented methods, and grounding analytical approaches, with a view to embedding developed clinical decision support tools into the diagnostic decision-making process. A set of 26 relevant studies published since 2010 was selected through critical selection criteria and evaluated. The model-driven (MD) or data-driven (DD) character of the selected approaches is examined in depth, considering novelty, originality, statistical robustness, trustworthiness, and clinical relevance. It has been found that before 2020 most of the works examined were more aligned with MD approaches, whereas over the last two years a balanced proportion of DD and MD-based studies was found. A total of 15 studies were MD in character, whereas seven were mainly DD-oriented, and four shared both profiles. Fifteen studies showed exploratory or prospective advanced statistical analysis. Eighteen included some statistical validation to support their claims. Twenty-two reported original work, whereas the remaining four were systematic reviews of others’ work. Clinical relevance and acceptability by voice specialists were found in 14 out of the 26 works commented on. Methodological issues such as detection and classification performance, training and generalization capability, explainability, preservation of semantic load, clinical acceptance, robustness, and development expenses have been identified as major issues in applying machine learning to clinical support systems. Other important aspects to be taken into consideration are trustworthiness, gender-balance issues, and statistical relevance.

28 pages, 3545 KiB  
Article
Ambulatory Monitoring of Subglottal Pressure Estimated from Neck-Surface Vibration in Individuals with and without Voice Disorders
by Juan P. Cortés, Jon Z. Lin, Katherine L. Marks, Víctor M. Espinoza, Emiro J. Ibarra, Matías Zañartu, Robert E. Hillman and Daryush D. Mehta
Appl. Sci. 2022, 12(21), 10692; https://doi.org/10.3390/app122110692 - 22 Oct 2022
Cited by 1 | Viewed by 2216
Abstract
The aerodynamic voice assessment of subglottal air pressure can discriminate speakers with typical voices from patients with voice disorders, with further evidence validating subglottal pressure as a clinical outcome measure. Although estimating subglottal pressure during phonation is an important component of a standard voice assessment, current methods for estimating subglottal pressure rely on non-natural speech tasks in a clinical or laboratory setting. This study reports on the validation of a method for subglottal pressure estimation in individuals with and without voice disorders that can be translated to connected speech to enable the monitoring of vocal function and behavior in real-world settings. During a laboratory calibration session, a participant-specific multiple regression model was derived to estimate subglottal pressure from a neck-surface vibration signal that can be recorded during natural speech production. The model was derived for vocally typical individuals and patients diagnosed with phonotraumatic vocal fold lesions, primary muscle tension dysphonia, and unilateral vocal fold paralysis. Estimates of subglottal pressure using the developed method exhibited significantly lower error than alternative methods in the literature, with average errors ranging from 1.13 to 2.08 cm H2O for the participant groups. The model was then applied during activities of daily living, thus yielding ambulatory estimates of subglottal pressure for the first time in these populations. Results point to the feasibility and potential of real-time monitoring of subglottal pressure during an individual’s daily life for the prevention, assessment, and treatment of voice disorders.

16 pages, 3566 KiB  
Article
Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos
by Michael Döllinger, Tobias Schraut, Lea A. Henrich, Dinesh Chhetri, Matthias Echternach, Aaron M. Johnson, Melda Kunduk, Youri Maryn, Rita R. Patel, Robin Samlan, Marion Semmler and Anne Schützenberger
Appl. Sci. 2022, 12(19), 9791; https://doi.org/10.3390/app12199791 - 28 Sep 2022
Cited by 7 | Viewed by 1703
Abstract
Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting “concept shifts” in neural network (NN)-based image processing, re-training of already trained and used NNs is necessary to allow for sufficiently accurate image processing for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNN) being used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing already improves the segmentation accuracy (mIoU + 6.35%). Subsequent re-training further increases segmentation performance (mIoU + 2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
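
The mIoU figures quoted here refer to mean intersection over union, a standard segmentation score. The sketch below shows one common way to compute it for binary glottis masks. The abstract does not say how the authors averaged the score, so the per-image averaging, the convention of scoring two empty masks as 1.0, and the function names are assumptions.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks (lists of rows of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks (closed glottis) agree perfectly.
    return 1.0 if union == 0 else intersection / union

def mean_iou(preds, truths):
    """Average the per-image IoU over a set of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, truths)]
    return sum(scores) / len(scores)

# Hypothetical 2x3 masks: the first prediction misses one glottis pixel,
# the second matches its ground truth exactly.
truth_a = [[0, 1, 1],
           [0, 1, 0]]
pred_a = [[0, 1, 1],
          [0, 0, 0]]
truth_b = [[1, 1, 0],
           [0, 0, 0]]
pred_b = [[1, 1, 0],
          [0, 0, 0]]

print(iou(pred_a, truth_a))
print(mean_iou([pred_a, pred_b], [truth_a, truth_b]))
```

For the first pair, two pixels overlap out of three in the union, so the IoU is 2/3. A gain such as "mIoU + 6.35%" would then be the change in this average between two model variants evaluated on the same images.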

Review

19 pages, 49644 KiB  
Review
Real-Time Visual Feedback in Singing Pedagogy: Current Trends and Future Directions
by Filipa M. B. Lã and Mauro B. Fiuza
Appl. Sci. 2022, 12(21), 10781; https://doi.org/10.3390/app122110781 - 25 Oct 2022
Cited by 4 | Viewed by 2528
Abstract
Singing pedagogy has increasingly adopted guided awareness through the use of meaningful real-time visual feedback. Technology typically used to study the voice can also be applied in a singing lesson, aiming at facilitating students’ awareness of the three subsystems involved in voice production (breathing, oscillatory and resonatory) and their underlying physiological, aerodynamical and acoustical mechanisms. Given the variety of real-time visual feedback tools, this article provides a comprehensive overview of such tools and their current and future pedagogical applications in the voice studio. The rationale for using real-time visual feedback is discussed, including both the theoretical and practical applications of visualizing physiological, aerodynamical and acoustical aspects of voice production. The monitoring of breathing patterns is presented, displaying lung volume as the sum of abdominal and ribcage movement signals. In addition, estimates of subglottal pressure are visually displayed using a subglottal pressure meter to assist with the shaping of musical phrases in singing. As for the vibratory patterns of the vocal folds and phonatory airflow, electroglottography and inverse filtering are applied to monitor the phonation types, voice breaks, pitch and intensity range of singers of different music genres. These vocal features, together with intentional voice distortions and intonation adjustments, are also displayed using spectrographs. As the voice is invisible to the eye, the use of real-time visual feedback is proposed as a key pedagogical approach in current and future singing lessons. The use of such an approach corroborates the current trend of developing evidence-based practices in voice education.

18 pages, 2042 KiB  
Review
A Scoping Literature Review of Relative Fundamental Frequency (RFF) in Individuals with and without Voice Disorders
by Victoria S. McKenna, Jennifer M. Vojtech, Melissa Previtera, Courtney L. Kendall and Kelly E. Carraro
Appl. Sci. 2022, 12(16), 8121; https://doi.org/10.3390/app12168121 - 13 Aug 2022
Cited by 4 | Viewed by 1914
Abstract
Relative fundamental frequency (RFF) is an acoustic measure that characterizes changes in voice fundamental frequency during voicing transitions. Despite showing promise as an indicator of vocal disorder and laryngeal muscle tension, the clinical adoption of RFF remains challenging, partly due to a lack of research integration. As such, this review sought to provide summative information and highlight next steps for the clinical implementation of RFF. A systematic literature search was completed across 5 databases, yielding 37 articles that met inclusion criteria. Studies most often included adults with and without tension-based voice disorders (e.g., muscle tension dysphonia), though patient and control groups were directly compared in only 32% of studies. Only 11% of studies tracked therapeutic progress, making it difficult to understand how RFF can be used as a clinical outcome. Specifically, there is evidence to support within-person RFF tracking as a clinical outcome, but more research is needed to understand how RFF correlates with auditory-perceptual ratings (strain, effort, and overall severity of dysphonia) both before and after therapeutic interventions. Finally, a marked increase in the use of automated estimation methods was noted since 2016, yet there remains a critical need for a universally available algorithm to support widespread clinical adoption.

Other

26 pages, 5555 KiB  
Perspective
Voice Maps as a Tool for Understanding and Dealing with Variability in the Voice
by Sten Ternström and Peter Pabon
Appl. Sci. 2022, 12(22), 11353; https://doi.org/10.3390/app122211353 - 09 Nov 2022
Cited by 4 | Viewed by 2062
Abstract
Individual acoustic and other physical metrics of vocal status have long struggled to prove their worth as clinical evidence. While combinations of metrics or “features” are now being intensely explored using data analytics methods, there is a risk that explainability and insight will suffer. The voice mapping paradigm discards the temporal dimension of vocal productions and uses fundamental frequency (fo) and sound pressure level (SPL) as independent control variables to implement a dense grid of measurement points over a relevant voice range. Such mapping visualizes how most physical voice metrics are greatly affected by fo and SPL, and more so individually than has been generally recognized. It is demonstrated that if fo and SPL are not controlled for during task elicitation, repeated measurements will generate “elicitation noise”, which can easily be large enough to obscure the effect of an intervention. It is observed that, although a given metric’s dependencies on fo and SPL often are complex and/or non-linear, they tend to be systematic and reproducible in any given individual. Once such personal trends are accounted for, ordinary voice metrics can be used to assess vocal status. The momentary value of any given metric needs to be interpreted in the context of the individual’s voice range, and voice mapping makes this possible. Examples are given of how voice mapping can be used to quantify voice variability, to eliminate elicitation noise, to improve the reproducibility and representativeness of already established metrics of the voice, and to assess reliably even subtle effects of interventions. Understanding variability at this level of detail will shed more light on the interdependent mechanisms of voice production, and facilitate progress toward more reliable objective assessments of voices across therapy or training.
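
The voice mapping idea described here, discarding time and gridding measurements by fo and SPL, can be sketched in a few lines. The cell size of one semitone by one decibel, the MIDI-style semitone scale, the function names, and the sample values are assumptions made for this illustration; the abstract specifies only a dense grid over the voice range.

```python
import math
from collections import defaultdict

def hz_to_semitone(f_hz):
    """MIDI-style semitone number, where 69 corresponds to 440 Hz."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)

def voice_map(cycles):
    """Average a metric within each (semitone, dB) cell.

    `cycles` is an iterable of (fo in Hz, SPL in dB, metric value),
    one entry per phonatory cycle; the time order is deliberately ignored.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for fo_hz, spl_db, value in cycles:
        cell = (round(hz_to_semitone(fo_hz)), round(spl_db))
        sums[cell][0] += value
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}

# Hypothetical cycles: two fall in the same cell, one lies elsewhere.
cycles = [
    (220.0, 70.2, 0.40),
    (221.0, 69.8, 0.50),
    (440.0, 85.0, 0.60),
]
vmap = voice_map(cycles)
for cell in sorted(vmap):
    print(cell, vmap[cell])
```

Comparing a metric cell by cell, for example before and after an intervention, holds fo and SPL constant within each comparison, which is one way to read the abstract's point about removing elicitation noise.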

11 pages, 824 KiB  
Tutorial
Inferential Statistics Is an Unfit Tool for Interpreting Data
by Anders Sand
Appl. Sci. 2022, 12(15), 7691; https://doi.org/10.3390/app12157691 - 30 Jul 2022
Cited by 6 | Viewed by 2060
Abstract
Null hypothesis significance testing is a commonly used tool for making statistical inferences in empirical studies, but its use has always been controversial. In this manuscript, I argue that even more problematic is that significance testing, and other abstract statistical benchmarks, are often used as tools for interpreting study data. This is problematic because interpreting data requires domain knowledge of the scientific topic and sensitivity to the study context, which significance testing and other purely statistical approaches cannot supply. By using simple examples, I demonstrate that researchers must first use their domain knowledge (professional expertise, clinical experience, practical insight) to interpret the data in their study and then use inferential statistics to provide some reasonable estimates about what can be generalized from the study data. Moving beyond the current focus on abstract statistical benchmarks will encourage researchers to measure their phenomena in more meaningful ways, transparently convey their data, and communicate their intellectual reasons for interpreting the data as they do, a shift that will better foster a scientific forum for cumulative science.