Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model

Hadwin, Paul J.; Motie-Shirazi, Mohsen; Erath, Byron D.; Peterson, Sean D.

doi:10.3390/app9132735

Open AccessArticle

Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model

¹

Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada

²

Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, NY 13699, USA

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2019, 9(13), 2735; https://doi.org/10.3390/app9132735

Submission received: 7 May 2019 / Revised: 1 July 2019 / Accepted: 3 July 2019 / Published: 6 July 2019

(This article belongs to the Special Issue Computational Methods and Engineering Solutions to Voice)

Download

Browse Figures

Versions Notes

Abstract

Bayesian estimation has been previously demonstrated as a viable method for developing subject-specific vocal fold models from observations of the glottal area waveform. These prior efforts, however, have been restricted to lumped-element fitting models and synthetic observation data. The indirect relationship between the lumped-element parameters and physical tissue properties renders extracting the latter from the former difficult. Herein we propose a finite element fitting model, which treats the vocal folds as a viscoelastic deformable body comprised of three layers. Using the glottal area waveforms generated by self-oscillating silicone vocal folds we directly estimate the elastic moduli, density, and other material properties of the silicone folds using a Bayesian importance sampling approach. Estimated material properties agree with the “ground truth” experimental values to within

3 %

for most parameters. By considering cases with varying subglottal pressure and medial compression we demonstrate that the finite element model coupled with Bayesian estimation is sufficiently sensitive to distinguish between experimental configurations. Additional information not available experimentally, namely, contact pressures, are extracted from the developed finite element models. The contact pressures are found to increase with medial compression and subglottal pressure, in agreement with expectation.

Keywords:

patient-specific modeling; silicone vocal fold models; Bayesian inverse analysis; finite element analysis

1. Introduction

Numerical models have long been employed to better understand the complex physics involved in human phonation. For example, reduced-order and finite element numerical models of the vocal folds (VFs) can self-oscillate in a manner representative of actual VF kinematics during sustained vowels [1], pitch glides [2], and, in a few cases, running speech [3]. Such models have explored a wide range of phenomena relevant to normal and pathological phonation, including the impact of a posterior glottal gap [4], the ventricular folds [5], phonation onset pressure [6], and the efficacy of various compensation mechanisms for vocal hyperfunction [7,8].

Whereas the majority of modeling efforts in speech, to date, have employed models with general population-based parameters to uncover the universal physical underpinnings of human phonation, a few research teams have begun to explore the development of subject-specific numerical models for phonation [9,10,11,12,13]. The dynamics of the VFs are sensitive to a variety of factors, including subglottal pressure [14], laryngeal muscle activation [15], and a posterior glottal gap [4], to name a few. These factors, which can play a role in vocal hyperfunction [16] and other pathologies [17,18,19,20], can be a challenge to observe clinically. Subject-specific models, on the other hand, are constructed based on measurements of the subject through less challenging media, such as high speed videoendoscopy (HSV) [9,11,21] and offer the potential to elucidate clinically opaque features and parameters, such as VF contact pressures.

In general, development of a subject-specific model entails estimation of numerical parameters such that the model behavior “matches” some observed data from the individual of interest, such as VF kinematics from HSV [9,11,21]. Specifically, numerical model parameters are sought such that some model output(s) (e.g., the VF kinematics) best match the equivalent observation data from the subject. This problem is ill-posed, thus necessitating inverse analysis techniques to determine the model parameters [22]. To date, optimization [9,22], machine learning [23], and Bayesian [10,11,21] frameworks have been developed to estimate VF model parameters.

Optimization-based approaches define a cost functional and employ optimization techniques to determine the parameter values that minimize the functional. Döllinger et al. [22] was the first to successfully use this approach, employing the Nelder-Mead algorithm to determine the vibrating masses, spring stiffnesses, and subglottal pressure of a two mass VF model by minimizing the least-squares error between specific Fourier coefficients of the measured VF trajectories and the simulated trajectories. Genetic algorithms were subsequently employed to determine parameters of a two mass model that best reproduced the trajectories from patients suffering from unilateral VF paralysis [16]. Inverse procedures have also been used to classify disordered versus healthy VF oscillations. Wurzbacher et al. [24] employed simulated annealing to minimize the Euclidean distance between the model and experimental vocal fold trajectories to distinguish between normal and dysphonic subjects during sustained vowel phonation and a pitch glide. Very recently, deep learning tools have been employed for subject-specific modeling [23]. A long short-term memory network, trained on simulated data, was used to estimate the subglottal pressure from ex vivo recordings of porcine vocal folds. They were able to produce accurate estimates of the subglottal pressure with very low online computational costs.

The vast majority of these subject-specific VF modeling efforts, and all of those mentioned above, have used lumped-element representations of the VFs and employed inverse methods to determine the reduced-order parameters. Of noted exception are the efforts by Xue et al. [12] and Chang et al. [13] that employ finite element model representations of the VFs. Xue et al. used computed tomography to reconstruct the larynx of a subject in a computational domain. The external geometry of the VFs was obtained from the scan, whereas the VF layers and material properties were assumed from population-based histological data. Chang et al. constructed a laryngeal model of a rabbit from magnetic resonance images. Two VF layers were observed in the scans, leading to a body-cover-type construction wherein the densities and Poisson ratios were assumed a priori. The elastic moduli of the layers were determined by an informal optimization that attempted to match the maximum glottal width and fundamental frequency of the model with HSV measurements.

In contrast to optimization-based and machine learning methods that treat estimated model parameters as deterministic, the Bayesian framework treats all parameters and measurements as random variables. Such a framework has the benefit of elucidating the propagation of measurement uncertainties to the model parameters and outputs. Uncertainty estimates are powerful as they provide additional information about the fidelity of the model outputs, which are expected to be of value for clinical decisions. Cataldo et al. [10] introduced the Bayesian framework to estimate stationary (non-time-varying) parameters of a two mass VF model using an importance sampling approach. This work was extended to estimate non-stationary parameters of a three mass body-cover model using both a particle filter [11] and an extended Kalman filter [21]. The Bayesian framework has, thus far, been demonstrated using simulated observation data from a reduced-order VF model, wherein the ground truth was known a priori, to estimate the parameters of a different reduced-order VF fitting model. The similarity between the generation and fitting models enabled direct assessment of the performance of the Bayesian framework. Herein we build on the Bayesian inference framework for developing subject-specific VF models by: (i) considering real HSV recordings of self-oscillating silicone VFs as the observation data; and, for the first time, (ii) employing a two-dimensional (2D) finite element (FE) model of the VFs as the fitting model. These two considerations address several limitations in prior Bayesian estimation-based inference studies applied to voiced speech, while simultaneously extending the methodology to more complex VF representations. Specifically, actual experimental data, with its inherent uncertainties, are used as the observations. Furthermore, by employing a FE representation of the VFs, the estimated FE parameters have direct relation to the material properties of the silicone VF models, reducing the layer of abstraction associated with reduced-order models. By using silicone VFs to generate the observation data, however, the “ground truth” is still available to assess the performance (estimated properties and their uncertainties) of the estimation procedure. Lastly, the use of experimental data provides a more realistic test bed due to the dynamical differences between the data source and the fitting model.

We consider two different VF base configurations, with and without medial compression, as well as multiple subglottal pressures. Exploring medial compression offers insights into how FE models and Bayesian estimation can incorporate VF posturing for subject-specific models, while varying subglottal pressure affords a mechanism for evaluating how well Bayesian inference, when accounting for measurement uncertainty, can differentiate between phonation conditions.

The paper is structured as follows: Section 2 discusses the silicone VFs and experimental conditions used during collection of the data sets and handling of the HSV. Respectively, Section 3 and Section 4 briefly discuss the mathematical model and estimation procedure used in the analysis. Section 5 presents the results of the estimates and provides discussion of the findings. Additionally, this section examines how the numerical parameters, including the triangulation density and step size in time, effect the subject-specific estimates. The manuscripts concludes with Section 6.

2. Experimental Setup and Data Collection

We employ the measured kinematics of silicone VFs undergoing self-sustained oscillations as a proxy for clinical HSV recordings. Silicone VFs have been shown to exhibit similar kinematics to real VFs [25], and thus serve as a reasonable test platform for exploring the viability of Bayesian estimation for producing subject-specific models from clinical data. Furthermore, the known histology and material properties of the silicone VFs provide “ground truth” data with which to compare property estimates. This section discusses the details of the silicone VF experiments, including the manufacturing process used to develop the silicone VFs, the experimental procedures used to capture the HSV, and extraction of the glottal area waveform (GAW) from the HSV, which will serve as the observation data in the inverse analysis.

2.1. Silicone Vocal Fold Models

Silicone VFs are manufactured to reflect the physiological VF structure. The geometry, shown in Figure 1a, is based on the M5 geometry [25] and comprises four layers to reflect the layered VF structure; namely, the body, ligament, superficial lamina propria (SLP), and epithelium. Additionally, a fiber is included in the middle of the ligament layer to control anterior-posterior stiffness. Each VF has an anterior-posterior length of

17.00 mm

, inferior-superior depth of

10.51 mm

, and a medial-lateral height of

8.40 mm

.

The manufacturing process of each silicone VF follows previous approaches [25], where the specific ratios of part A, B, and thinner used to manufacture each layer of the VFs, and the approximate resulting value of the elastic modulus, can be found in Table 1. The same silicone VF models were used for all tests.

2.2. Laryngeal Flow Facility

The synthetic VFs were mounted in a custom laryngeal flow facility driven by compressed air (∼

550 kPa

) that was regulated down to

17.0 kPa

via a Siemens 40-2 pressure regulator. The flow then passed through a Dwyer RMC 103-SSV flow meter that measured the volumetric flow rate and further regulated the flow before entering the vocal tract test facility shown in Figure 1b. The facility was comprised of a model lung plenum, which consisted of a

0.03 m^{3}

cylindrical chamber that was acoustically treated on the inside to reduce acoustic reflections. The plenum exhausted to a square tube, representing the trachea, with a cross-sectional area of

4.94 {cm}^{2}

and

48.0 cm

in length. The subglottal pressure was measured with a flush-mounted Kulite ET-3DC pressure transducer in the wall of the trachea tube,

3.8 cm

upstream of the glottal exit.

Two square mounting brackets that housed the synthetic VF models were attached at the exit plate of the tracheal tube. An inset within the tracheal tube transitioned the interior dimensions of the square tracheal tube down to

1.7 cm \times 1.7 cm

over a length of

2.5 cm

. While the medial-lateral dimension of each VF is

8.4 mm

, an inset depth of

7.6 mm

was machined into each opposing mounting bracket to hold the VFs. In this manner, the medial surface extended above the contacting surface of each bracket by

0.8 mm

, such that when the opposing VFs were brought into contact the amount of medial compression could be adjusted.

Two data sets were collected. The first positioned the VFs in a natural, uncompressed position such that there was no medial compression. The second incorporated a

0.8 mm

shim between the opposing brackets when the medial surfaces were positioned in contact. This resulted in each VF being compressed by

0.4 mm

in the medial-lateral direction. Anterior–posterior tensioning was applied in all cases by pulling on the string embedded in the models with a constant force of

0.3 N

. Note that in this study there was no supraglottal tract, see Figure 1b.

High-speed video of the VF motion was acquired using an 8-bit IDT MotionPro X3 PLUS camera equipped with an Elicar V-HQ Macro 90 mm lens. In this study, an array size of

484 \times 504

pixels was used with a frame rate of

2000 fps

. The spatial resolution of the videos was

24.4 pixels / mm

, corresponding to a physical area for each pixel of

1.680 \times 10^{- 3} {mm}^{2}

.

For the case without medial compression, the VFs were driven with a subglottal pressure of

1.00 kPa

. With medial compression, data were collected for subglottal pressures of

0.91

,

1.00

,

1.09

, and

1.18 kPa

, yielding average flow rates of 236, 260, 291, and

307 mL / s

, respectively. HSV was recorded for

1 s

in each configuration, from which a

300 ms

segment was used to extract the GAW using segmentation [26].

3. Finite Element Model

Estimates of the silicone VF material properties were computed by matching the simulated motion with the kinematics captured by the HSV. To simulate the dynamics we employed a FE representation of the silicone VFs based upon the model developed by Alipour et al. [27], where the displacement field of a transversely isotropic linear medium is approximated using a FE basis defined by piece-wise linear functions over triangular elements. The FE formulation, which is functionally equivalent to Alipour et al. [27], is briefly described in this section.

3.1. Mathematical Model

Small deformations of the silicone VFs can be modeled using a displacement field

u

, which describes the displacement of two linear transversely isotropic elastic bodies [27]. The VFs were oriented such that the medial-lateral direction defines the x-axis, the y-axis is in the anterior-posterior direction, and the z-axis is positive in the superior direction. As a simplification, the dynamics of the two VFs are treated as symmetric and displacement is treated as uniform along the length of the VFs; as such, the displacement field does not vary along the y-axis, i.e.,

\begin{matrix} u = u (x, t) = u_{x} (x, t) \hat{i} + u_{z} (x, t) \hat{j}, \end{matrix}

(1)

where

x = {[x, z]}^{T}

is a vector of planar spatial coordinates. This reduction of physical dimension greatly reduces the computational cost of the FE model.

Given a deformation with displacement field

u

and an arbitrary variation

u^{δ}

, the virtual work principle gives [28]

\begin{matrix} \int_{Ω} {(u^{δ})}^{T} ρ (x) \ddot{u} d V + \int_{Ω} σ (u) : ε (u^{δ}) d V = \int_{Γ} {(u^{δ})}^{T} f_{s} d S, \end{matrix}

(2)

where

Ω

and

Γ

are the 2D VF domain and its boundary, respectively,

ρ

is the material density,

σ

and

ε

are the stress and strain tensors, and

f_{s}

is the surface force, in this case arising from aerodynamics.

Since the vocal folds are modeled as isotropic in the

x z

-plane, Hooke’s law gives [28]

\begin{matrix} σ (u) : ε (u^{δ}) = \tilde{σ} {(u)}^{T} \tilde{ε} (u^{δ}) = \tilde{ε} {(u^{δ})}^{T} \tilde{σ} (u) = \tilde{ε} {(u^{δ})}^{T} C \tilde{ε} (u), \end{matrix}

(3)

where

\tilde{ε} (u) = {[\frac{\partial u_{x}}{\partial x}, \frac{\partial u_{z}}{\partial z}, \frac{\partial u_{z}}{\partial x} + \frac{\partial u_{x}}{\partial z}]}^{T}

and

C

is a positive definite matrix given by

\begin{matrix} C (E, ν) = (\begin{matrix} λ + 2 μ & λ & 0 \\ λ & λ + 2 μ & 0 \\ 0 & 0 & μ \end{matrix}) = μ (E, ν) (\begin{matrix} \frac{2 (1 - ν)}{1 - 2 ν} & \frac{2 ν}{(1 - 2 ν)} & 0 \\ \frac{2 ν}{(1 - 2 ν)} & \frac{2 (1 - ν)}{1 - 2 ν} & 0 \\ 0 & 0 & 1 \end{matrix}) . \end{matrix}

(4)

Here

λ

and

μ

are the Lamé parameters that are related to the Young’s modulus E and Poisson’s ratio

ν

via

\begin{matrix} μ (E, ν) = \frac{E}{2 (1 + ν)}, λ (E, ν) = \frac{ν E}{(1 + ν) (1 - 2 ν)} . \end{matrix}

(5)

Note that for

C (E, ν)

to be positive definite requires

E > 0

and

- 1 < ν < 1 / 2 .

Finally, following Alipour et al. [27], the viscous damping experienced by the VFs is modeled by replacing the shear modulus,

μ (E, ν)

, with

μ + η d / d t

, where

η

is viscosity.

3.2. Finite Element Approximation

The displacement field is approximated with a linear combination of piece-wise linear basis functions

{ϕ_{i}}_{i = 1}^{N}

, given by

\begin{matrix} u (x, t) \approx & (\begin{matrix} \sum_{i = 1}^{N} α_{i} (t) ϕ_{i} (x) \\ \sum_{i = 1}^{N} β_{i} (t) ϕ_{i} (x) \end{matrix}) \end{matrix}

(6)

where each

ϕ_{i} (x)

is defined over a set of connected elements that approximate the geometry of the VFs. The FE approximation is then found by substituting this approximation into Equation (2), giving

\begin{matrix} M \ddot{θ} + D \dot{θ} + K θ = F, \end{matrix}

(7)

where

\begin{matrix} M = \int_{Ω} ρ Φ^{T} Φ d V, D = \int_{Ω} η Φ_{d}^{T} S Φ_{d} d V, K = \int_{Ω} Φ_{d}^{T} C Φ_{d} d V, and F = \int_{Γ} Φ^{T} f_{s} d S . \end{matrix}

(8)

Here,

\begin{matrix} θ (t) = & {[α_{1} (t), \dots α_{N} (t), β_{1} (t), \dots β_{N} (t)]}^{T}, \end{matrix}

(9)

\begin{matrix} Φ (x) = & (\begin{matrix} ϕ_{1} (x) & \dots & ϕ_{N} (x) & 0 & \dots & 0 \\ 0 & \dots & 0 & ϕ_{1} (x) & \dots & ϕ_{N} (x) \end{matrix}), \end{matrix}

(10)

\begin{matrix} Φ_{d} (x) = & (\begin{matrix} \partial_{x} ϕ_{1} (x) & \dots & \partial_{x} ϕ_{N} (x) & 0 & \dots & 0 \\ 0 & \dots & 0 & \partial_{z} ϕ_{1} (x) & \dots & \partial_{z} ϕ_{N} (x) \\ \partial_{z} ϕ_{1} (x) & \dots & \partial_{z} ϕ_{N} (x) & \partial_{x} ϕ_{1} (x) & \dots & \partial_{x} ϕ_{N} (x) \end{matrix}), \end{matrix}

(11)

and integration is interpreted element wise. The computation of

M

,

D

, and

K

now follow the standard finite element construction [28].

The boundary of the VFs was split into two sections: (i) a section where external forces are applied, denoted

Γ_{free}

, and (ii) a section where the nodes do not move, which represents the walls of the trachea, denoted

Γ_{fixed}

. The vector

F

is the nodal force vector and is the result of the transverse surface force. In this work aerodynamic pressure is treated as the only stress leading to an external force on the surface nodes. The pressure at each node along the free surface is modeled using a one-dimensional Bernoulli flow model, which gives

\begin{matrix} p (s, t) = \{\begin{matrix} p_{sub} - (p_{sub} - p_{\sup}) {(\frac{A_{sep} (t)}{A (s, t)})}^{2}, & A (s, t) < A_{sep} \\ p_{\sup}, & A (s, t) \geq A_{sep} \end{matrix} \end{matrix}

(12)

where

s \in Γ_{free}

,

p_{sub}

and

p_{\sup}

are the subglottal and supraglottal pressures, respectively,

A_{\min} (t)

is the minimum glottal area at time t,

A (s, t)

is the area at location s at time t, and

A_{sep} = 1.3 A_{\min} (t)

is the glottal area at the location of flow separation [29,30].

Vocal fold collision is modeled by restricting the x-coordinate of each node from crossing the midline. Should a time step result in a node crossing the midline, its x-coordinate is forced to the midline. The force required to restrict the node to the midline is directly the contact force. Contact pressure can be obtained by dividing by the edge area (distance between the the node and its neighbor).

Finally, time integration was achieved through the finite differencing scheme set out by Alipour et al. [27]. The time integration gives the displacement field over time, yielding a position time series for each node in the finite element mesh. From this the glottal width, equivalent to what can be observed with HSV;

W_{gl}

is given by

W_{gl} (t) = 2 min {x_{1} (t), x_{2} (t), \dots, x_{N} (t)},

where N is the number of nodes, and

x_{i} (t)

represents the x-coordinate position time series the of the

i^{t h}

node. Thus the simulated glottal area,

A_{s}

, can be computed as

A_{s} (t) = ℓ_{gl} \times W_{gl} (t),

where

ℓ_{gl}

is the length of the glottis, a constant in the two-dimensional approximation employed herein.

4. Estimation of VF Material Properties

The FE model embeds the material properties (e.g., E,

ρ

, etc.) that we aim to directly estimate via Bayesian inference. This section details the parameterization of the FE model and the Bayesian inference procedure employed herein.

4.1. Parameterization of the FE Model

The silicone VFs were numerically modeled with three layers: the body, cover (which combines the SLP and epithelium), and ligament. Employing the same dimensions as the silicone VFs (see Section 2.1), a FE representation was generated comprising 205 triangular elements and 120 nodes, see Figure 2a. The sensitivity of the results to the triangularization is discussed in Section 5.3.

All material properties are modeled as uniform across all layers except for the Young’s modulus, which differs from layer to layer, but is constant within a layer. As a result, the FE model is parameterized by:

ρ

,

η

, E (for each layer),

ν

, and

p_{sub}

and

p_{\sup}

. The density, viscosity, Young’s modulus for each layer, and subglottal pressure were estimated while all other parameters were treated as fixed and known. Specifically, we assumed

ν = 0.4995

[31] and

p_{\sup} = 0 Pa

. This parameterization results in low dimensionality for the estimation problem while still employing a high degree-of-freedom VF model.

As discussed in Section 2.2, medial compression of the silicone VFs is considered. Medial compression pre-stresses the VFs introducing more initial potential energy into the system. The compression is modeled via an initial position parameter,

x_{0}

, which represents the maximum value that the x coordinate of any node can attain. As a result, having

x_{0} = 8.4 mm

indicates no medial compression and any value

0 \leq x_{0} < 8.4 mm

indicates the presences of medial compression, as shown in Figure 2. The initial displacement of each node

θ_{0}

in cases involving medial compression was calculated from Equation (7) with time derivatives set to zero. This equation was solved iteratively by adjusting the x coordinate of the nodes in steps of

10^{- 4} mm

while enforcing that no node cross the midline.

The FE simulations used a time step of

h = 0.05 ms

[27], which is equivalent to a frame rate of 20,000 fps. Since the HSV was captured at

2000 fps

, the simulated signal was downsampled to match that of the HSV. The first

250 ms

of the simulated GAW were trimmed to ensure that the numerical model had reached stable oscillations and to avoid any initial numerical transients. This generally resulted in a phase shift between the measured and simulated signals which was corrected by cross correlating the first two cycles of the signals and phase shifting to align the remainder of the signals.

4.2. Bayesian Inference

The Bayesian framework for parameter inference seeks a joint probability distribution that represents the probability of all potential values of the parameters of interest,

χ

. Such a density is found through Bayes equation [32]

\begin{matrix} π (χ | y) = \frac{π (y | χ) π_{pri} (χ)}{π (y)} \propto π (y | χ) π_{pri} (χ), \end{matrix}

(13)

where

π (χ | y)

is the posterior probability density function, which contains all probabilistic information about

χ

given observed measurements

y

. The density

π_{pri} (χ)

is the “prior” probability density,

π (y | χ)

is the “likelihood”, and

π (y)

is the “evidence”. The prior contains known or expected statistical properties of the parameters based on all knowledge available prior to obtaining the measurements. For instance, if subglottal pressure is a model parameter to be inferred, it is known ahead of time that the value cannot be negative, and is likely within a specified bound. The likelihood quantifies the probability of an observed measurement occurring given fixed parameter values; that is, given a particular model with set parameters, what is the likelihood that the measured data would be observed. Lastly, the evidence is a normalization constant that ensures the Law of Total Probability is satisfied.

In the present work, the importance sampling approach is used due to the computational complexity of the model [33]. Such approaches have been successfully used previously for the study of phonation [10,11,34]. The fundamental premise of importance sampling is that certain values of the inputs are more important to the parameter being estimated than others. So a greater weight, which we hereafter refer to as an importance weight, is allocated to those regions in the parameter space that exhibit a better fit to the measurements. In particular, random samples of

χ

are drawn from some proposal distribution, and the resulting observations are simulated using the randomly drawn parameters. The likelihood distribution is then used to probabilistically quantify the goodness of fit to the measured data, from which an importance weight is allocated. When a sufficiently large number of random draws have been computed, a new ensemble is constructed by sampling from the random draws in proportion to their computed importance weight. This can be summarized as follows:

Initialization: generate an ensemble ${{\tilde{χ}}^{(ℓ)}}_{ℓ = 1}^{N}$ of N random samples from the proposal distribution $π_{0} (χ)$ .
Update: for each of the drawn samples, calculate the relative likelihood and normalize to get the importance weight for that sample

$\begin{matrix} w^{(ℓ)} = \frac{1}{W} π (y | {\tilde{χ}}^{(ℓ)}), W = \sum_{ℓ = 1}^{N} π (y | {\tilde{χ}}^{(ℓ)}) . \end{matrix}$

(14)
Resample: generate another ensemble ${χ^{(ℓ)}}_{ℓ = 1}^{N}$ by sampling each ${\tilde{χ}}^{ℓ}$ with probability $w^{ℓ}$ .

If the proposal distribution is chosen to be the prior density, the final resampled ensemble converges in distribution to the posterior [35]. As a result, the resampled ensemble can then be used to compute sample-based point and spread estimates of the posterior. Herein, the sampling ensembles consist of 50,000 random draws from the prior distributions, which treats each parameter as independent. The sample mean of each ensemble is used to estimate the material properties and the sample standard deviation serves as an uncertainty estimate. The sensitivity of the results to the number of samples is considered in Section 5.3.

The above algorithm defines importance weights in terms of the likelihood density

π (y | \tilde{χ})

, which is determined by the specific error model [33]. In particular, if errors are modeled as unbiased additive normal errors the likelihood density is given by

\begin{matrix} π (y | θ) \propto exp (\frac{- 1}{2 σ_{e}^{2}} {∥A_{m} - A_{s}∥}^{2}), \end{matrix}

(15)

where

σ_{e}

is the expected standard deviation of the measurement error, and

{∥A_{m} - A_{s}∥}^{2}

is the squared two-norm of the difference between the vectors of the measured and simulated GAWs, respectively. In this work we choose

σ_{e} = 1

mm

^{2}

; this assumes a noise level of around

6 %

, which is quite large. Such a large noise level was chosen to “whiten” the likelihood to compensate for any model errors that are present [33].

The prior distribution for all estimated parameters were assumed uniform so as to impart the least information into the posterior. Table 2 lists the parameters to be estimated, their experimental “ground truth” values (when known), and the bounds of the uniform prior distribution. The specific bounds for these priors were selected to ensure that the priors sampled a sufficiently wide range of combinations of the parameters and were selected in an ad hoc manner based on an amplification of the expected uncertainty in the experimental value or expectations about that value. We note that a more informative prior distribution (e.g., a Normal distribution) will generally improve estimates and reduce uncertainties, see [36] for more details of how different priors impact estimates.

5. Results and Discussion

Considering the case without medial compression as an exemplar, the estimated material properties from the Bayesian inferences are presented in Table 3. Overall, the estimated parameters show good agreement with the “ground truth” experimental values; the maximum discrepancy between the estimates and the experimental values occurs with the Young’s modulus of the ligament with a

9.6 %

difference. We note that the experimental values, while considered the “ground truth”, are themselves approximate values estimated from the silicone mixture fractions.

We observe that

E_{bdy}

and

E_{cvr}

are over-estimated while

p_{sub}

is under-estimated when compared with the experimental values. The under-estimation of subglottal pressure is likely due to the simplified fluids model being used resulting in high nodal pressures. Over-estimating Young’s modulus of the cover is likely due to the lack of an epithelium in the FE model. The epithelium in the silicone VFs is extremely thin, but has a high Young’s modulus (

45 kPa

). As a result it is likely that the estimate for

E_{cvr}

is slightly elevated to compensate. It is unclear why the estimate for the body is consistently higher than expected, but may be related with the under-estimation of

E_{lig}

. An important observation, however, is that all of the experimental values fall within two standard deviations of the estimated values. This indicates that the use of a FE model of the VFs is statistically capable of inferring accurate estimates of the material properties from a GAW. The relative uncertainty (standard deviation divided by the estimated value) shows that all estimates except for the Young’s modulus of the ligament have uncertainties of approximately

3 %

; the Young’s modulus of the ligament has an

8.6 %

level of uncertainty.

The relatively large bias and uncertainty for the ligament stiffness in comparison with the other parameters potentially results from the comparative insensitivity of the FE kinematics to this parameter. The ligament is a small internal region of the geometry and as a result has less impact on the large scale kinematics compared with the body and cover layers. Since

E_{lig}

is the only parameter defined in this region, a range of values are likely to generate similar GAWs, and any error in the estimate of

E_{lig}

can be compensated for by slight adjustments to

E_{bdy}

and/or

E_{cvr}

. Despite the modest difficulty in estimating this parameter, overall Bayesian inference is able to accurately estimate the VF material properties from HSV data alone, presuming that the histology of the folds are known a priori.

Figure 3 compares the kinematics of the FE model employing the estimated material properties from Table 3 with the HSV over a single vibratory cycle. The FE model captures the silicone VF motion well, including the mucosal wave and the pronounced inferior-superior motion of the folds (see the third column in Figure 3). The fourth column highlights that closure of the FE model does not necessarily always correspond with closure of the silicone VFs.

The GAW extracted from the FE model and the corresponding HSV for the no medial compression case is presented in Figure 4. As suggested by the fourth column of Figure 3, the FE model closes (GAW reaches zero) before the silicone VFs close. In fact, small openings along the span of the silicone VFs exist during the “closed” phase, as suggested by Figure 2a. Overall, however, the simulated GAW fits the observed data well, with the measured GAW falling within the estimated uncertainty bounds the majority of the time. There are a few persistent mismatches, such as the small peak immediately after opening. Such errors are likely to be model errors induced by approximations, including the use of a simplified 2D approximation of the VFs.

5.1. Effect of Medial Compression

As discussed in Section 2.1, HSV was captured with and without medial compression for

p_{sub} = 1 kPa

. Since it was difficult to measure the degree of medial compression experimentally, the actual pre-stress is highly uncertain. As such, the initial position parameter

x_{0}

in the FE model is included as an estimated parameter for both cases.

Table 4 presents the estimated material properties for both medial compression cases. The estimated properties (excepting

x_{0}

) agree well with each other and with the “ground truth” values. This is encouraging in two respects: the same silicone VFs were used in both cases, and the pre-stress in the medial compression case is captured in

x_{0}

and thus does not appreciably bias the stiffness estimates. However, the higher parameter uncertainties for the medial compression case is very likely due to the fact that the pre-stress associated with medial compression can be approximated by varying other stiffness parameters. This yields more overall uncertainty in the results, as other parameter combinations can explain the observed data.

The estimates for the initial position parameter,

x_{0}

, are statistically different between the two cases. When there is no medial compression the initial position is estimated to be

8.39 mm

(

8.4 mm

corresponds to zero compression, see Figure 2a); with compression,

x_{0}

is estimated to be

8.27 mm

. This

0.12 mm

difference in the estimates is more than double the sum of the two estimated standard deviations, with a t-value of

3.67

, indicating that the FE model was capable of distinguishing between the experimental configurations. This does differ from the

0.4 mm

shim placed experimentally to produce the medial compression, though again, the actual degree of experimental medial compression was very difficult to ascertain.

Comparing the case without medial compression with the results in Table 3 shows that the uncertainties in the estimated parameters are larger in the present case despite using the same observation data. By including the extra fitting parameter,

x_{0}

, the estimated uncertainties increase due to the higher dimensionality (more parameters being fit given the same input data). That is, with the addition of

x_{0}

as a parameter, there is now an alternative pathway to influence the energy in the system. That said, the uncertainties in both estimates are large enough and the estimated values are similar enough that the two data sets cannot be distinguished statistically.

Figure 5 presents the measured and estimated GAWs for the two medial compression cases, wherein the FE models employ the material properties given in Table 4. Both estimates fit the data reasonably well given the simplified FE model being used. The open quotient and speed quotients are both lower for the FE models, with the effect more pronounced for the case with medial compression. This results in a more peaked GAW in comparison with the HSV. The case without medial compression exhibits larger maximum glottal area, however the maximum contact pressure experienced during collision in the case with compression is

18 %

higher, on average, due to the pre-stress of the system (

825.7 \pm 58.6 Pa

versus

665.0 \pm 53.1 Pa

).

5.2. Distinguishing between Model Configurations

As the eventual goal of this research is patient-specific modeling, we wish to investigate whether the FE model is sufficiently sensitive and if HSV data provide enough information to distinguish between similar experimental configurations. As a first order exploration we consider varying subglottal pressures for silicone VFs with medial compression. As discussed in Section 2.1, the pressures considered range from

0.91

to

1.18 kPa

.

The fits to each GAW are shown in Figure 6 and the resulting estimates of the degree of medial compression, along with the other material properties, are shown in Table 5. Similar to Figure 5, the fits are again reasonable given the simplified model being used. As the subglottal pressure increases, so too does the maximum glottal area, as expected; this increase is captured by the FE model. As with the previous comparison, the open quotient and speed quotient are both lower for the FE model, though the difference decreases with increasing subglottal pressure.

As shown in Table 5, all experimental values fall within two standard deviations of the estimated values; furthermore, the viscosity is estimated to be approximately

3 Poise

, which is consistent with the previous results. The estimated medial compression,

x_{0}

, varies somewhat from case to case, but all values are within two standard deviations of one another, indicating statistical consistency. Overall, comparing the estimates in Table 3, Table 4 and Table 5, we find that the estimated material properties are quite consistent across all cases studied, engendering confidence in the method.

Considering the uncertainty in the estimates, we see that as the subglottal pressure increases its uncertainty decreases. As the variance of the measurement noise is treated as fixed at

1 {mm}^{2}

in all estimates, this decrease in uncertainty is not due to a decrease in measurement uncertainty. Furthermore, the same prior distributions are used in all cases. Hence, this decrease in uncertainty is due to an increase in sensitivity of the model, which could be due to the larger glottal width having fewer parameter combinations that are capable of matching the data. Alternatively, the change in uncertainty could be due to the pre-stress model. The estimates computed for

p_{sub} = 1.09 kPa

and

p_{sub} = 1.18 kPa

have

x_{0} > 8.3 mm

, whereas the other two estimates have

x_{0} < 8.3 mm

; in addition, there is a marked decrease in the uncertainties in

x_{0}

as

p_{sub}

increases. This could indicate that a higher level of VF compression results in a model with lower sensitivity to the parameters. That is, the dynamics may be more influenced by pre-stress at higher subglottal pressures.

Overall, the consistency in the material property estimates and the reasonably low relative uncertainty in

p_{sub}

indicates that the FE model is capable of distinguishing between operational conditions. In pairwise t-tests of the four estimates only two pairs fail to reject the hypothesis of unique data sets (

95 %

confidence); those pairs are

p_{sub} = 0.91 kPa

versus

1.00 kPa

and

p_{sub} = 1.00 kPa

versus

1.09 kPa

. In these two pairs there are small changes in the experimental subglottal pressures (

0.09 kPa

) and similar fundamental frequencies. Interestingly, there is significant difference between the two cases with the highest subglottal pressures, despite also only differing by

0.09 kPa

in line with the decreased uncertainty in subglottal pressure at these conditions. We note there is a more marked difference between fundamental frequencies in these cases.

For further validation we compare the volumetric flow rate, Q, estimated from the model with the experimentally measured values; volumetric flow rate was not included in the estimation process and thus provides an independent measure for method/model validation. Table 6 compares the estimated and experimentally measured values. The estimated flow rate is derived directly from the Bernoulli flow model embedded in the FE model (see Section 3) as

Q (t) = 1.3 A_{\min} (t) \sqrt{2 ϱ^{- 1} (p_{sub} - p_{\sup})}

where

ϱ = 1.14 kg m^{- 3}

is the density of air. As can be seen in the table, the estimated values generally agree well with, but tend to slightly over-predict, the experimentally measured values. Excepting for the case of

p_{sub} = 1.18 kPa

, the estimated values are all within one standard deviation of the measurements.

With the FE models developed for the four cases and validated with an independent measure (Q), an additional parameter is explored that is not available experimentally. Table 7 presents the average mean and maximum contact pressures experienced during collisions for the four subglottal pressures. The contact pressures increase with increasing subglottal pressure, which qualitatively agrees with previous studies [37]. Excepting for the

p_{sub} = 1.18 kPa

case, the contact pressures are all less than the subglottal pressure. The silicone VFs employed in this study qualitatively do not appear to have vigorous contact when self-oscillating, and as such, the contact pressures may indeed be less than the subglottal pressure. The outlier in the estimated pressure data is the highest subglottal pressure, which exhibits mean and maximum contact pressures well above the subglottal pressure. This is also the case that predicted a volumetric flow rate well above the measured value, suggesting that this model is less reliable. Interestingly, this is the case that had the lowest material property uncertainties. It is likely that the poor agreement in Q, and the exceedingly high contact pressures for this case, are a result of more complex vibratory patterns for the silicone VFs at this higher subglottal pressure that are not captured well with the simplified 2D FE model. This suggests that additional measures may be required in the estimation process to generate an accurate model for this case.

5.3. Sensitivity Analysis

The estimates presented in Section 5 were produced for fixed ensemble size for the importance sampling, and fixed time step size and mesh density for the FE model. These parameters influence the quality of the numerical model and the estimation procedure while also impacting the computational load. The estimates presented in the previous sections, for example, took 140 h for 16 parallel threads on a AMD Ryzen Threadripper 1950X with 16 cores at 3.4 GHz and 128 GB of RAM to run the importance sampling for all 50,000 samples. As such, there is motivation to use a coarser model (larger/fewer elements and larger time step) and fewer samples to decrease the computational load. To ensure that the estimates presented in this work are not conditional on the numerical parameters being used in the model we explore the sensitivity of the results to them in this section. We use the case without medial compression, again, as the exemplar.

5.3.1. Ensemble Size

Sensitivity of the results to the importance sampling ensemble size was checked by computing estimates with a progressively smaller number of samples from the priors. Figure 7 presents the relative error, defined as the percentage difference in the estimate and experimental value, and uncertainty level for ensemble sizes ranging from 100 to 50,000. It was found that, on average, the estimated values stabilize as the ensemble size increases. The peaks which occur at approximately 15,000 and 25,000 samples are due to the sample-based nature of the estimate. Since the estimates are the sample mean of the resampled ensemble the exact estimated value for each parameter changes as more samples are included in the ensemble. As such, it is encouraging to note that the estimates do appear to stabilize. Specifically, the average error of the estimates stabilizes when using an ensemble size of approximately 15,000, with the maximum error stabilizing with approximately 40,000 samples. The uncertainty smoothly decreases as the ensemble increases, stabilizing when approximately 25,000 samples are used. Error in the estimates converges faster than the uncertainty since an accurate estimate of the mean is easier to attain than a stable estimate of the variance.

5.3.2. FE Time Step and Triangulation

Estimates were produced using time step sizes of

h = 0.025

,

0.05

, and

0.1 ms

and triangulations with 172, 205, and 263 elements over the same geometry. For direct comparison, the estimates were computed using an importance sampling ensemble size of 50,000.

Figure 8a shows that, on average, the relative error (as defined above) in the estimates is very similar whether the triangulation involves 263 or 205 elements. In fact, all estimated parameter values differ by

0.1 %

or less for the two triangulations. This difference is likely due to numerical error resulting from the sample-based nature of importance sampling. In contrast, the lower density mesh introduces a larger error for all step sizes, but rapidly improves with decreasing step size.

Similar trends are observed in Figure 8b where decreasing step size or increasing mesh density results in a decrease in uncertainty. The observed decrease in uncertainty with decreasing time step and increasing mesh density results from having a higher fidelity numerical model that is more sensitive to the parameter values; that is, small changes in the parameter values will have a larger impact on the simulated data [33]. As a result, the uncertainty of the estimates will decrease as the fidelity of the model increases. However, as the time step decreases and mesh density increases the computational cost grows exponentially. Thus, examining how the estimated uncertainty behaves as the fidelity of the model increases becomes quickly infeasible.

6. Conclusions

To date, the approaches employed for developing subject-specific numerical VF models have focused on lumped-elements for the fitting model in the inverse analysis; as such, parameter estimates are often greatly abstracted from the physical tissue properties. To overcome these limitations, the present work proposes a FE model of the VFs for a fitting model. The FE model captures the geometry and layered structure of the VFs more accurately, treating them as a multi-layered viscoelastic body, thus better approximating their kinematics. Since the FE model directly employs the tissue properties, such as Young’s moduli, these properties are estimated directly. Estimation of material properties was demonstrated using HSV data of silicone VFs as the observation, showing good agreement between the estimated and “ground truth” material properties.

The robustness of the method was demonstrated by considering experimental data with different degrees of medial compression and differing subglottal pressures. The FE model faithfully recovered the material properties in all cases, including the degree of medial compression, which was embedded into the FE model in the form of base displacement. This suggests that the employed Bayesian framework using a FE fitting model is sufficiently sensitive to distinguish between different experimental conditions, even though the model was restricted to two dimensions.

The FE models were validated by comparing the volumetric flow rate predicted by the model with experimentally measured values. This observation was not included in the estimation process and, as such, was an independent measure. The volumetric flow rate was slightly over-predicted, but generally agreed well. The exception was the highest subglottal pressure case, which was considerably over-predicted. Additionally, the contact pressures extracted from the developed FE models were found to increase with increasing subglottal pressure and medial compression, which is a trend that qualitatively agrees with previous studies. The highest subglottal pressure case was again an outlier, suggesting that the FE model for that case does not accurately capture the kinematics of the silicone VFs, likely due to the 2D geometry. This could potentially be improved by incorporating additional observations in the estimation procedure or expanding to a three-dimensional model.

The stability of the results was examined with respect to numerical parameters, such as the importance sampling ensemble size, time step size, and mesh density. Estimated values converged at relatively modest ensemble sizes, though resolving the uncertainties required considerably more samples. Decreasing the time step size and increasing the mesh density lead to smaller uncertainties at the cost of significant computational time. One of the main drawbacks to our proposed model is the computational complexity; this cost will increase if more complex fluids models, a three-dimensional geometry, or acoustics are included.

One source of uncertainty that has not been considered in this work is the structure of the layers. All estimates in this work have been computed with a FE model that was formed treating the layers and dimensions of the silicone VFs as perfectly known. The use of imperfect layers will affect the estimates and uncertainties of the material properties, however, this is the subject of ongoing research and requires a further examination.

As an introductory effort, incorporating a FE fitting model into the Bayesian estimation framework has shown good promise. Future work includes validating the contact pressure estimates with experimental data, implementing a three-dimensional FE model, and employing clinical HSV.

Author Contributions

P.J.H. and S.D.P. conceived and designed the research; P.J.H. developed the numerical model and performed all simulations, M.M.-S. performed all physical experiments; P.J.H. and M.M.-S. analyzed the data; P.J.H., M.M.-S., B.D.E., and S.D.P. interpreted results; P.J.H. and S.D.P. drafted the manuscript; P.J.H., M.M.-S., B.D.E., and S.D.P. edited, revised, and approved the final version of the manuscript.

Funding

Research reported in this work was supported by the NIDCD of the NIH under award P50DC015446 and the Ontario Ministry of Research and Innovation through the Early Researcher Award. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

Story, B.H. An overview of the physiology, physics and modeling of the sound source for vowels. Acoust. Sci. Technol. 2002, 23, 195–206. [Google Scholar] [CrossRef]
Titze, I.R.; Worley, A.S. Modeling source-filter interaction in belting and high-pitched operatic male singing. J. Acoust. Soc. Am. 2009, 126, 1530–1540. [Google Scholar] [CrossRef] [PubMed]
Birkholz, P.; Kroger, B.; Neuschaefer-Rube, C. Model-based reproduction of articulatory trajectories for consonant–vowel sequences. IEEE Trans. Audio Speech Lang. Process. 2011, 19, 1422–1433. [Google Scholar] [CrossRef]
Zañartu, M.; Galindo, G.E.; Erath, B.D.; Peterson, S.D.; Wodicka, G.R.; Hillman, R.E. Modeling the effects of a posterior glottal opening on vocal fold dynamics with implications for vocal hyperfunction. J. Acoust. Soc. Am. 2014, 136, 3262–3271. [Google Scholar] [CrossRef] [PubMed]
Xue, Q.; Zheng, X. The effect of false vocal folds on laryngeal flow resistance in a tubular three-dimensional computational laryngeal Model. J. Voice 2017, 31, 275–281. [Google Scholar] [CrossRef] [PubMed]
Titze, I.; Schmidt, S.; Titze, M. Phonation threshold pressure in a physical model of the vocal fold mucosa. J. Acoust. Soc. Am. 1995, 97, 3080–3084. [Google Scholar] [CrossRef] [PubMed]
Galindo, G.E.; Peterson, S.D.; Erath, B.D.; Castro, C.; Hillman, R.E.; Zañartu, M. Modeling the pathophysiology of phonotraumatic vocal hyperfunction with a triangular glottal model of the vocal folds. J. Speech Lang. Hear. Res. 2017, 60, 2452–2471. [Google Scholar] [CrossRef] [PubMed]
Hillman, R.E.; Mehta, D.; Stepp, C.; van Stan, J.; Zañartu, M. Objective assessment of vocal hyperfunction. J. Acoust. Soc. Am. 2016, 139, 2193–2194. [Google Scholar] [CrossRef]
Do¨llinger, M.; Go´mez, P.; Patel, R.R.; Alexiou, C.; Bohr, C.; Schu¨tzenberger, A. Biomechanical simulation of vocal fold dynamics in adults based on laryngeal high-speed videoendoscopy. PLoS ONE 2017, 12, e0187486. [Google Scholar]
Cataldo, E.; Soize, C.; Sampaio, R. Uncertainty quantification of voice signal production mechanical model and experimental updating. Mech. Syst. Signal Process. 2013, 40, 718–726. [Google Scholar] [CrossRef]
Hadwin, P.J.; Galindo, G.E.; Daun, K.J.; Zañartu, M.; Erath, B.D.; Cataldo, E.; Peterson, S.D. Non-stationary bayesian estimation of parameters from a body cover model of the vocal folds. J. Acoust. Soc. Am. 2016, 139, 2683–2696. [Google Scholar] [CrossRef] [PubMed]
Xue, Q.; Zheng, X.; Mittal, R.; Bielamowicz, S. Subject-specific computational modeling of human phonation. J. Acoust. Soc. Am. 2014, 135, 1445–1456. [Google Scholar] [CrossRef] [PubMed]
Chang, S.; Novaleski, C.K.; Kojima, T.; Mizuta, M.; Luo, H.; Rousseau, B. Subject-specific computational modeling of evoked rabbit phonation. J. Biomech. Eng. 2015, 138, 011005. [Google Scholar] [CrossRef] [PubMed]
Deguchi, S. Mechanism of and threshold biomechanical conditions for falsetto voice onset. PLoS ONE 2011, 6, e17503. [Google Scholar] [CrossRef] [PubMed]
Patel, R.R.; Liu, L.; Galatsanos, N.; Bless, D.M. Differential vibratory characteristics of adductor spasmodic dysphonia and muscle tension dysphonia on high-speed digital imaging. Ann. Otol. Rhinol. Laryngol. 2011, 120, 21–32. [Google Scholar] [CrossRef] [PubMed]
Schwarz, R.; Hoppe, U.; Schuster, M.; Wurzbacher, T.; Eysholdt, U.; Lohscheller, J. Classification of unilateral vocal fold paralysis by endoscopic digital high-speed recordings and inversion of a biomechanical model. IEEE Trans. Biomed. Eng. 2006, 75, 1099–1108. [Google Scholar] [CrossRef] [PubMed]
Inwald, E.C.; Do¨llinger, M.; Schuster, M.; Eysholdt, U.; Bohr, C. Multiparametric analysis of vocal fold vibrations in healthy and disordered voices in high-speed imaging. J. Voice 2011, 25, 576–590. [Google Scholar] [CrossRef] [PubMed]
Hora´cˇek, J.; Laukkanen, A.M.; Sˇidlof, P.; Murphy, P.; Sˇvec, J.G. Comparison of acceleration and impact stress as possible loading factors in phonation: A computer modeling study. Folia Phoniatr. Logop. 2009, 61, 137–145. [Google Scholar] [CrossRef]
Unger, J.; Lohscheller, J.; Reiter, M.; Eder, K.; Betz, C.S.; Schuster, M. A noninvasive procedure for early-stage discrimination of malignant and precancerous vocal fold lesions based on laryngeal dynamics analysis. Cancer Res. 2015, 75, 31–39. [Google Scholar] [CrossRef]
Hillman, R.E.; Holmberg, E.B.; Perkell, J.S.; Walsh, M.; Vaughan, C. Objective assessment of vocal hyperfunction. J. Speech Lang. Hear. Res. 1989, 32, 373–392. [Google Scholar] [CrossRef]
Hadwin, P.J.; Peterson, S.D. An extended Kalman filter approach to non-stationary Bayesian estimation of reduced-order vocal fold model parameters. J. Acoust. Soc. Am. 2017, 141, 2909–2920. [Google Scholar] [CrossRef] [PubMed]
Do¨llinger, M.; Hoppe, U.; Hettlich, F.; Lohscheller, J.; Schuberth, S.; Eysholdt, U. Vibration parameter extraction from endoscopic image series of the vocal folds. IEEE Trans. Biomed. Eng. 2002, 49, 773–781. [Google Scholar] [CrossRef] [PubMed]
Go´mez, P.; Schu¨tzenberger, A.; Semmler, M.; Do¨llinger, M. Laryngeal pressure estimation with a recurrent neural network. IEEE J. Transl. Eng. Health Med. 2019, 7, 2000111. [Google Scholar] [CrossRef] [PubMed]
Wurzbacher, T.; Schwarz, R.; Do¨llinger, M.; Hoppe, U.; Eysholdt, U.; Lohscheller, J. Model-based classification of nonstationary vocal fold vibrations. J. Acoust. Soc. Am. 2006, 120, 1012–1027. [Google Scholar] [CrossRef] [PubMed]
Murray, P.R.; Thomson, S.L. Synthetic, multi-layer, self-oscillating vocal fold model fabrication. J. Vis. Exp. 2011, 58, 3498. [Google Scholar] [CrossRef] [PubMed]
Murtola, T.; Alku, P.; Malinen, J.; Geneid, A. Parameterization of a computational physical model for glottal flow using inverse filtering and high-speed videoendoscopy. Speech Commun. 2018, 96, 67–80. [Google Scholar] [CrossRef]
Alipour, F.; Berry, D.A.; Titze, I.R. A finite-element model of vocal-fold vibration. J. Acoust. Soc. Am. 2000, 108, 3003–3012. [Google Scholar] [CrossRef] [PubMed]
Bathe, K.J. Finite Element Procedures; Prentice Hall: Upper Saddle River, NJ, USA, 2006. [Google Scholar]
Titze, I.R. Regulating glottal airflow in phonation: Application of the maximum power transfer theorem to a low dimensional phonation model. J. Acoust. Soc. Am. 2002, 111, 367–376. [Google Scholar] [CrossRef]
Alipour, F.; Scherer, R.C. Flow separation in a computational oscillating vocal fold model. J. Acoust. Soc. Am. 2004, 116, 1710–1719. [Google Scholar] [CrossRef]
Zhang, Z. The influence of material anisotropy on vibration at onset in a three-dimensional vocal fold model. J. Acoust. Soc. Am. 2014, 135, 1480–1490. [Google Scholar] [CrossRef]
Liu, J. Monte Carlo Strategies in Scientific Computing; Springer-Verlag: New York, NY, USA, 2004. [Google Scholar]
Kaipio, J.P.; Somersalo, E. Statistical and Computational Inverse Problems; Springer-Verlag: New York, NY, USA, 2005. [Google Scholar]
Cataldo, E.; Soize, C.; Sampaio, R. A computational method for updating a probabilistic model of an uncertain parameter in a voice production model. In Proceedings of the 1st International Symposium on Uncertainty Quantification and Stochastic Modeling, Maresias, Brazil, 26 February–2 March 2012. [Google Scholar]
Smith, A.F.M.; Gelfand, A.E. Bayesian statistics without tears: A sampling-resampling perspective. Am. Stat. 1992, 46, 84–88. [Google Scholar]
Hadwin, P.J.; Sipkens, T.A.; Thomson, K.A.; Liu, F.; Daun, K.J. Quantifying uncertainty in soot volume fraction estimates using Bayesian inference of auto correlated laser induced incandescence measurements. Appl. Phys. B 2016, 122, 1–16. [Google Scholar] [CrossRef]
Jiang, J.J.; Titze, I.R. Measurement of vocal fold intraglottal pressure and impact stress. J. Voice 1994, 8, 132–144. [Google Scholar] [CrossRef]

Figure 1. (a) Model of the geometry of the silicone vocal folds; and (b) image of the experimental flow facility.

Figure 2. (a) Finite element triangulation used to simulate the silicone vocal folds; and (b) mesh deformation occurring as a result of medial compression. Red region: body; green region: ligament; and blue region: cover.

Figure 3. Kinematics of the FE model in comparison with the observed high speed videoendoscopy (HSV) for the case with no medial compression (

p_{sub} = 1 kPa

) at several time points throughout a single oscillation cycle.

Figure 3. Kinematics of the FE model in comparison with the observed high speed videoendoscopy (HSV) for the case with no medial compression (

p_{sub} = 1 kPa

) at several time points throughout a single oscillation cycle.

Figure 4. Comparison of the glottal area waveforms extracted from the FE model and the HSV for the case with no medial compression (

p_{sub} = 1 kPa

). Blue dashed line: HSV; red solid line: FE model; orange solid lines: uncertainty bounds from the FE estimate.

Figure 4. Comparison of the glottal area waveforms extracted from the FE model and the HSV for the case with no medial compression (

p_{sub} = 1 kPa

). Blue dashed line: HSV; red solid line: FE model; orange solid lines: uncertainty bounds from the FE estimate.

Figure 5. The glottal area waveform extracted from HSV of self-oscillating silicone vocal folds at

p_{sub} = 1.00 kPa

(a) without medial compression and (b) with medial compression. Blue dashed line: HSV; red solid line: FE model; orange solid lines: uncertainty bounds from the FE estimate.

Figure 5. The glottal area waveform extracted from HSV of self-oscillating silicone vocal folds at

p_{sub} = 1.00 kPa

(a) without medial compression and (b) with medial compression. Blue dashed line: HSV; red solid line: FE model; orange solid lines: uncertainty bounds from the FE estimate.

Figure 6. Comparison on the glottal area waveforms extracted from HSV and the FE predictions from the fitted material properties for (a)

p_{sub} = 0.91, kPa

(b)

p_{sub} = 1.00 kPa

, (c)

p_{sub} = 1.09 kPa

, and (d)

p_{sub} = 1.18 kPa

. Blue dashed line: HSV; red solid line: FE model; orange solid lines: uncertainty bounds from the FE estimate.

Figure 6. Comparison on the glottal area waveforms extracted from HSV and the FE predictions from the fitted material properties for (a)

p_{sub} = 0.91, kPa

(b)

p_{sub} = 1.00 kPa

, (c)

p_{sub} = 1.09 kPa

, and (d)

p_{sub} = 1.18 kPa

. Blue dashed line: HSV; red solid line: FE model; orange solid lines: uncertainty bounds from the FE estimate.

Figure 7. (a) Relative error of the estimates; and (b) relative uncertainty for increasing ensemble size. Solid blue line: average; dashed red line: maximum.

Figure 8. The average (a) relative error and (b) relative uncertainty of the estimates as a function of time step size for various triangulations. Blue with the plus markers: 172 elements; red with circle markers: 205 elements; black with diamond markers: 263 elements.

Table 1. Different mixture ratios of silicone and their corresponding Young’s moduli. A ratio of 1:1:x means that the layer was formed by mixing 1 part A, 1 part B, and x parts thinner by weight.

VF Layer	Material	Ratio (A-B-Thinner)	Young’s Modulus (kPa)
Body	Ecoflex	1-1-1	11.8
Ligament	Ecoflex	1-1-4	2
SLP	Ecoflex	1-1-6	0.6
Epithelium	Dragon Skin	1-1-1	45

Table 2. List of the the experimental values and prior distributions used for the estimated parameters. The distribution bounds are the bounds of the employed uniform prior distribution (lower bound, upper bound).

	$E_{bdy}$	$E_{cvr}$	$E_{lig}$	$p_{sub}$	$ρ$	$η$	$x_{0}$
	(kPa)	(Pa)	(kPa)	(Pa)	(kg/m $^{3}$ )	(Poise)	(mm)
Experimental Value	11.8	600	2	1000	1049.75	-	-
Distribution Bounds	(9, 15)	(250, 950)	(0.5, 3)	(400, 1800)	(950, 1200)	(1, 7)	(7.4, 8.4)

Table 3. Material property estimates for the case without medial compression.

	$E_{bdy}$	$E_{cvr}$	$E_{lig}$	$p_{sub}$	$ρ$	$η$
	(kPa)	(Pa)	(kPa)	(Pa)	(kg/m $^{3}$ )	(Poise)
Experimental Value	11.8	600	2	1000	1049	-
Estimated Value	12.15	636.2	1.808	989.9	1051	3.079
Standard Deviation as a
percentage of the Estimate	$3.19 %$	$3.35 %$	$8.63 %$	$3.21 %$	$0.31 %$	$3.48 %$

Table 4. Estimates and the associated uncertainties as a percentage of the estimate values (in brackets) for material properties of the silicone vocal folds and initial position for cases with and without medial compression (

p_{sub} = 1 kPa

).

Table 4. Estimates and the associated uncertainties as a percentage of the estimate values (in brackets) for material properties of the silicone vocal folds and initial position for cases with and without medial compression (

p_{sub} = 1 kPa

).

	$E_{bdy}$	$E_{cvr}$	$E_{lig}$	$p_{sub}$	$ρ$	$η$	$x_{0}$
Data Set	(kPa)	(Pa)	(kPa)	(Pa)	(kg/m $^{3}$ )	(Poise)	(mm)
Experimental Value	11.8	600	2	1,000	1049	-	-
Without Medial Compression	12.32	632.1	1.807	987.7	1051	2.981	8.39
	$(3.96 %)$	$(3.53 %)$	$(9.74 %)$	$(3.98 %)$	$(0.25 %)$	$(4.66 %)$	$(0.38 %)$
With Medial Compression	12.28	629.8	1.842	991.9	1051	3.11	8.27
	$(4.30 %)$	$(4.1 %)$	$(12.16 %)$	$(4.536 %)$	$(0.26 %)$	$(4.66 %)$	$(0.30 %)$

Table 5. Material properties estimates for varying subglottal pressures with medial compression. The associated estimate uncertainties as a percentage of the estimate values are in brackets.

$p_{sub}$	$E_{bdy}$	$E_{cvr}$	$E_{lig}$	$p_{sub}$	$ρ$	$η$	$x_{0}$
(kPa)	(kPa)	(Pa)	(kPa)	(Pa)	(kg/m $^{3}$ )	(Poise)	(mm)
$0.91$	12.38	643.0	2.375	901.2	1048	2.98	8.29
	$(4.31 %)$	$(6.22 %)$	$(12.67 %)$	$(6.16 %)$	$(0.28 %)$	$(6.28 %)$	$(0.37 %)$
$1.00$	12.28	629.8	1.842	991.9	1051	3.11	8.27
	$(4.3 %)$	$(4.1 %)$	$(12.16 %)$	$(4.53 %)$	$(0.26 %)$	$(4.66 %)$	$(0.30 %)$
$1.09$	12.15	635.1	1.858	1062	1047	3.01	8.33
	$(2.27 %)$	$(3.01 %)$	$(10.39 %)$	$(2.89 %)$	$(0.16 %)$	$(4.45 %)$	$(0.28 %)$
$1.18$	12.25	639.6	2.098	1164	1049	3.03	8.32
	$(2.07 %)$	$(2.74 %)$	$(8.1 %)$	$(2.43 %)$	$(0.15 %)$	$(3.86 %)$	$(0.22 %)$

Table 6. Experimental and estimated mean volumetric flow rates along with a measure of uncertainty in the form of a standard deviation as a percentage of the estimated value. Data are all in mL/s.

$p_{sub}$ (kPa)	$0.91$	$1.00$	$1.09$	$1.18$
Experimental	236	260	291	307
Estimated	242.5	272.2	301.4	361.6
Uncertainty	$8.29 %$	$7.02 %$	$t 6.14 %$	$4.87 %$

Table 7. Estimated average maximum and mean contact pressures along with a measure of uncertainty in the form of a standard deviation as a percentage of the mean estimated value for the cases with medial compression. All data are in (Pa).

$p_{sub}$ (kPa)	$0.91$	$1.00$	$1.09$	$1.18$
Maximum	405.8	825.7	1032.8	2000.9
Mean	246.5	475.8	662.6	1469.8
Uncertainty	$13.87 %$	$12.32 %$	$10.14 %$	$6.67 %$

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Hadwin, P.J.; Motie-Shirazi, M.; Erath, B.D.; Peterson, S.D. Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model. Appl. Sci. 2019, 9, 2735. https://doi.org/10.3390/app9132735

AMA Style

Hadwin PJ, Motie-Shirazi M, Erath BD, Peterson SD. Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model. Applied Sciences. 2019; 9(13):2735. https://doi.org/10.3390/app9132735

Chicago/Turabian Style

Hadwin, Paul J., Mohsen Motie-Shirazi, Byron D. Erath, and Sean D. Peterson. 2019. "Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model" Applied Sciences 9, no. 13: 2735. https://doi.org/10.3390/app9132735

APA Style

Hadwin, P. J., Motie-Shirazi, M., Erath, B. D., & Peterson, S. D. (2019). Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model. Applied Sciences, 9(13), 2735. https://doi.org/10.3390/app9132735

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model

Abstract

1. Introduction

2. Experimental Setup and Data Collection

2.1. Silicone Vocal Fold Models

2.2. Laryngeal Flow Facility

3. Finite Element Model

3.1. Mathematical Model

3.2. Finite Element Approximation

4. Estimation of VF Material Properties

4.1. Parameterization of the FE Model

4.2. Bayesian Inference

5. Results and Discussion

5.1. Effect of Medial Compression

5.2. Distinguishing between Model Configurations

5.3. Sensitivity Analysis

5.3.1. Ensemble Size

5.3.2. FE Time Step and Triangulation

6. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI