Effect of Subglottic Stenosis on Vocal Fold Vibration and Voice Production Using Fluid–Structure–Acoustics Interaction Simulation

: An in-house 3D ﬂuid–structure–acoustic interaction numerical solver was employed to investigate the effect of subglottic stenosis (SGS) on dynamics of glottal ﬂow, vocal fold vibration and acoustics during voice production. The investigation focused on two SGS properties, including severity deﬁned as the percentage of area reduction and location. The results show that SGS affects voice production only when its severity is beyond a threshold, which is at 75% for the glottal ﬂow rate and acoustics, and at 90% for the vocal fold vibrations. Beyond the threshold, the ﬂow rate, vocal fold vibration amplitude and vocal efﬁciency decrease rapidly with SGS severity, while the skewness quotient, vibration frequency, signal-to-noise ratio and vocal intensity decrease slightly, and the open quotient increases slightly. Changing the location of SGS shows no effect on the dynamics. Further analysis reveals that the effect of SGS on the dynamics is primarily due to its effect on the ﬂow resistance in the entire airway, which is found to be related to the area ratio of glottis to SGS. Below the SGS severity of 75%, which corresponds to an area ratio of glottis to SGS of 0.1, changing the SGS severity only causes very small changes in the area ratio; therefore, its effect on the ﬂow resistance and dynamics is very small. Beyond the SGS severity of 75%, increasing the SGS severity, leads to rapid increases of the area ratio, resulting in rapid changes in the ﬂow resistance and dynamics. total ﬂow resistance also rapidly increases with the SGS severity. When the SGS area reduction reaches 99%, the trans-stenosis resistance has become comparable to the trans-glottis resistance; as a result, the total ﬂow resistance is about twice its values in the baseline case. These observations indicate that, when the SGS area reduction is beyond 75%, the SGS starts to generate additional ﬂow resistance, which increased rapidly with the SGS area reduction.


Introduction
Subglottic stenosis (SGS) is a narrowing of the airway between the larynx and trachea. The most frequent cause of SGS is post-intubation injury in which edema and inflammation cause the formation of ulcer and granulation tissues [1,2]. SGS has also been reported in patients having congenital diseases or traumas, treated by tracheotomy method, etc. The narrowing due to SGS can affect both breathing and voice with the symptoms, including noisy breathing, shortness of breath, hoarse voice, and dysphonia [3][4][5].
The severity of SGS is generally categorized into four grades by the level of obstruction of the airway: Grade I, below 50% obstruction; Grade II, 51-70% obstruction; Grade III, 71-99% obstruction; and Grade IV, 100% obstruction [6]. Treatment is different depending on the severity. For lower grades (Grades I and II), little to no treatment may be needed. For higher grades (Grades III and IV), open airway reconstructive procedures may be needed to restore airway patency; however, voice changes before and after the procedures were reported in clinical studies [7][8][9][10]. For example, Ettema et al. [11] did a perceptual study on the voice change due to SGS and found that about 50% of the patients' voices were moderately to extremely affected. More interestingly, their results revealed that the effect where ρ 0 , V, P, and υ 0 are incompressible flow density, velocity, pressure, and kinematic viscosity. The equations are solved on a Cartesian mesh. A sharp-interface immersed boundary method is used to treat complex and moving boundaries. The details of the solver, numerical schemes, and discretization can be found in Reference [20]. 3 of 17 Linearized perturbed compressible equations (LPCE) based on the hydrodynamic/acoustic splitting method [21] were used to simulate vocal tract acoustics: where v i and p are the perturbed compressible velocity components and pressure, and γ is the ratio of specific heats. The equations are also solved on a Cartesian mesh with the immersed boundary method to treat the boundaries. In the computation, the incompressible flow is computed first, and the obtained flow solution is used to compute the perturbed compressible variables. The final total velocity and pressure variables are computed as follows: where v and p are the total flow velocity and pressure. This method was successfully verified in simulating acoustics in human phonation [22]. Further details of the LPCE equations can be found in Reference [21]. The governing equations of the vocal fold dynamics are Navier equations, as shown below: where ρ, d, σ, and f are tissue density, displacement, stress, and body force per unit mass, respectively. The finite element method was used to solve Equation (4) [23]. The above three solvers are explicitly coupled through a Lagrangian interface, which is represented by triangular surface meshes. The incompressible flow solver is marched by one step with the existing deformed shape and velocities of the solid tissue as the boundary conditions. The perturbed compressible solver is then marched with the updated incompressible flow field, as well as the existing deformed shape and velocities of the solid tissue as the boundary conditions. The forces at the vocal fold surface are then calculated with the new incompressible flow pressure and the perturbed pressure. At last, the solid solver is marched by one step with the updated surface traction. The deformation and velocities on the solid grid are then transferred to the vocal fold surface, so that the fluid/solid interface can be updated. The details of the coupling can be found in Zheng et al. [23] and Jiang et al. [24]. Figure 1a shows the geometric models of the vocal folds, SGS, supraglottal tract, and subglottal tract, as well as the dimension of the computational domain. The configuration of the supraglottal airway was generated based on a neutral area function reported in Story [25], which is the mean area function of 12 vowels of a male speaker. According to another study by Story [26], the first and second natural frequencies (F1 and F2) of the mean area function are similar to those from a uniform tube, but higher frequencies are more subject-specific. The overall length of the supraglottal airway centerline is 17.4 cm. The geometries of the ventricle, false vocal folds, and subglottic tract were superimposed to the tract based on CT scans of a male subject larynx [27].

Geometric Models
The geometry of the vocal fold ( Figure 1b) was generated by using a mathematical model proposed by Titze and Talkin [28] that matches physiologically realistic key parameters and features (e.g., shape of medial surface, anterior-posterior shape variation, and the subglottal angle). The model assumes a three-layer structure of the vocal fold including the cover, ligament, and body, as shown in Figure 1b. The thickness of the cover and ligament layer was set as 0.5 and 1.1 mm, respectively, based on the average values of human measurement [29]. The vocal fold tissues were modeled as viscoelastic transversely isotropic materials. The material properties of each layer are listed in Table 1. These values were adopted from previous studies which showed how to reproduce typical vocal fold dynamics [24,30,31].
where is the SGS length and was set as 1.0 cm, which is the most common SGS length [32]; is the distance of the lower bound of the SGS to the component of the point ( ); and is the percentage reduction of the hydraulic diameter of the airway cross-section due to the SGS.   The subglottic tract was extended downward by 6.6 cm to incorporate the SGS. An idealized geometry of the SGS was created by using a cosine function ( Figure 2a). Specifically, for a given surface point (x 1 , y 1 , z 1 ) at the SGS region where x 1 , y 1 , and z 1 are the original coordinates of the point, the new coordinates (x s , y s , z s ) of the point after adding the SGS were determined by using the equation below: where L is the SGS length and was set as 1.0 cm, which is the most common SGS length [32]; y c is the distance of the lower bound of the SGS to the y component of the point (y 1 ); and a is the percentage reduction of the hydraulic diameter of the airway cross-section due to the SGS. In this study, the severity and location of SGS were both varied. The SGS severity was quantified by the percentage of area reduction due to SGS. Seven values were used, including 0% (baseline case), 25%, 50%, 75%, 90%, 96%, and 99%, by varying "a" in Equation (5) from 0 to 0.9 (Figure 2b). The SGS location was defined as the distance between the narrowest section of the SGS and the superior surface of the vocal folds ( Figure 2c). Two values were used, 2.1 and 3.13 cm, which respectively correspond to the minimum and average values of 92 SGS patients reported in the past studies [32,33].
In this study, the severity and location of SGS were both varied. The SGS severity was quantified by the percentage of area reduction due to SGS. Seven values were used, including 0% (baseline case), 25%, 50%, 75%, 90%, 96%, and 99%, by varying " " in Equation (5) from 0 to 0.9 (Figure 2b). The SGS location was defined as the distance between the narrowest section of the SGS and the superior surface of the vocal folds ( Figure 2c). Two values were used, 2.1 and 3.13 cm, which respectively correspond to the minimum and average values of 92 SGS patients reported in the past studies [32,33]. The different values of SGS severity, which is defined as the percentage of area reduction due to SGS. (c) The SGS location, defined as the distance between the narrowest section of the SGS and superior surface of the vocal folds.

Computational Domain and Boundary Conditions
The entire geometric model was immersed into a 2.3 cm × 17.8 cm × 9.9 cm rectangular computational domain. The domain was discretized by a 64 × 256 × 92 Cartesian grid in the x, y, and z directions, respectively. A 0.8 kPa pressure drop was applied between the inlet and outlet to drive the flow. The total reflection boundary condition for acoustics was applied at the outlet. To eliminate the acoustic reflection at the inlet boundary, an anechoic zone was enforced in the LPCE solver to buffer the acoustics [34]. Vocal tract walls were treated as no-slip boundaries in the incompressible solver and hard wall boundary condition in the LPCE solver [35]. A traction boundary condition was applied at the fluid-structure interaction interface, while the lateral, anterior and posterior surfaces of the vocal folds were fixed in the solid solver. A penalty-method-based contact model was applied to prevent the penetration of the vocal folds. An artificial minimum gap of 0.2 mm was enforced between the vocal folds, which is necessary for the success of the flow solver. The vocal fold was discretized, using 28,997 tetrahedral elements. The time-steps of 1.149 × 10 s and 0.05 × 1.149 × 10 s were used in the incompressible

Computational Domain and Boundary Conditions
The entire geometric model was immersed into a 2.3 cm × 17.8 cm × 9.9 cm rectangular computational domain. The domain was discretized by a 64 × 256 × 92 Cartesian grid in the x, y, and z directions, respectively. A 0.8 kPa pressure drop was applied between the inlet and outlet to drive the flow. The total reflection boundary condition for acoustics was applied at the outlet. To eliminate the acoustic reflection at the inlet boundary, an anechoic zone was enforced in the LPCE solver to buffer the acoustics [34]. Vocal tract walls were treated as no-slip boundaries in the incompressible solver and hard wall boundary condition in the LPCE solver [35]. A traction boundary condition was applied at the fluid-structure interaction interface, while the lateral, anterior and posterior surfaces of the vocal folds were fixed in the solid solver. A penalty-method-based contact model was applied to prevent the penetration of the vocal folds. An artificial minimum gap of 0.2 mm was enforced between the vocal folds, which is necessary for the success of the flow solver. The vocal fold was discretized, using 28,997 tetrahedral elements. The time-steps of 1.149 × 10 −6 s and 0.05 × 1.149 × 10 −6 s were used in the incompressible and LPCE solvers, respectively, to satisfy their Courant-Friedrichs-Lewy condition of below 1. 256 processors were used to simulate 4.6 × 10 −2 s on the XSEDE COMET cluster (Intel Xeon E5-2680v3 processors) which resulted in the computational cost of 161,000 CPU hours.

Results and Discussion
In this section, the results of the baseline case, including the dynamics of the glottal flow, vocal fold vibration, and acoustics, are first presented to demonstrate that the simulation represents typical human phonatory dynamics. Next, the changes of the dynamics with the varied SGS severity (area reduction) and location in the cases are quantified. The relationship between the quantities and SGS severity and locations is discussed. Finally, the underlying mechanism responsible for the relationships is analyzed.

Glottal Flow Rate Waveform
The simulation was successfully carried out for eight vocal fold vibration cycles. Self-sustained vibration was established after the second cycle. Figure 3 shows the phase-averaged volumetric flow rate at the glottic exit based on the steady cycles. Phase-averaging is done by dividing each steady cycle into 1000 phases and calculating the average value at each phase over all the steady cycles. The flow rate shows a typical glottal flow rate waveform of human phonation which is characterized by a slower rising, faster dropping, and a lasting period of glottic closure. Notice that the flow rate did not drop to zero during glottic closure due to the artificial minimum glottic gap enforced in the simulation. Several typical voice quality related quantities were computed based on the waveform, including the f 0 ; the skewness quotient (τ s ), defined as the ratio of the duration of flow acceleration to flow deceleration; the peak glottal flow rate; the mean glottal flow rate; and the MFDR. The values are found within the typical physiological ranges of human phonation [31] (Table 2). relationship between the quantities and SGS severity and locations is discussed. Finally, the underlying mechanism responsible for the relationships is analyzed.

Glottal Flow Rate Waveform
The simulation was successfully carried out for eight vocal fold vibration cycles. Selfsustained vibration was established after the second cycle. Figure 3 shows the phase-averaged volumetric flow rate at the glottic exit based on the steady cycles. Phase-averaging is done by dividing each steady cycle into 1000 phases and calculating the average value at each phase over all the steady cycles. The flow rate shows a typical glottal flow rate waveform of human phonation which is characterized by a slower rising, faster dropping, and a lasting period of glottic closure. Notice that the flow rate did not drop to zero during glottic closure due to the artificial minimum glottic gap enforced in the simulation. Several typical voice quality related quantities were computed based on the waveform, including the f0; the skewness quotient ( ), defined as the ratio of the duration of flow acceleration to flow deceleration; the peak glottal flow rate; the mean glottal flow rate; and the MFDR. The values are found within the typical physiological ranges of human phonation [31] ( Table 2).    Table 2. Voice-quality-related quantities computed from the baseline model and their typical physiological ranges; f 0 is the fundamental frequency; τ s is the skewness quotient; τ o is the open quotient; q peak and q mean are the mean and peak glottal flow rates, respectively; MFDR is the maximum flow declination rate.

Vocal Fold Dynamics
Glottal area waveform, which measures the minimum glottal cross-section area versus time during sustained vibrations, typically obtained from videostroboscopy images, is an important clinical measure of vocal fold dynamics. Different from the glottal flow rate waveform, which includes the effect of both vocal folds' kinematics and air inertial effect, glottal area waveform is determined only by the vocal fold kinematics; therefore, it is widely used to quantify the vocal fold opening amplitude, speed, and time. Figure 4 shows the phase-averaged glottal area over one vibration cycle in the baseline case. Due to the artificially enforced minimum glottal gap, the glottal area did not drop to zero during glottic closure. The open quotient (τ o ) of the vibration, which is defined as the ratio of the duration of open glottis to the duration of the vibration cycle, was computed and found to be 0.67. Note that the closed glottis was defined when the time rate of change of the glottal area was less than 2% of the maximum time rate of change of the glottal area. Moreover, τ o is another laryngeal dynamics quantity of important implication to voice quality. This parameter typically ranges between 0.4 and 0.7 for normal phonation. A value lower than 0.4 indicates a "pressed" voice, and a value greater than 0.7 indicates a "lax" voice [36]. The value of 0.67 in the baseline case is within the normal range but implies a lightly lax voice.
shows the phase-averaged glottal area over one vibration cycle in the baseline case. Due to the artificially enforced minimum glottal gap, the glottal area did not drop to zero during glottic closure. The open quotient ( ) of the vibration, which is defined as the ratio of the duration of open glottis to the duration of the vibration cycle, was computed and found to be 0.67. Note that the closed glottis was defined when the time rate of change of the glottal area was less than 2% of the maximum time rate of change of the glottal area. Moreover, is another laryngeal dynamics quantity of important implication to voice quality. This parameter typically ranges between 0.4 and 0.7 for normal phonation. A value lower than 0.4 indicates a "pressed" voice, and a value greater than 0.7 indicates a "lax" voice [36]. The value of 0.67 in the baseline case is within the normal range but implies a lightly lax voice. To show the detailed vibration pattern of the vocal folds, Figure 5 plots the mid-coronal profile of the vocal folds at four different time instants during one vibration cycle. At 0.17 T, when the vocal folds were opening, the glottis formed a convergent shape. At 0.36 T, the glottis formed a parallel channel. At 0.55 T, when the vocal folds were closing, the glottis formed a divergent shape. Finally, at 0.74 T, the vocal folds fully closed and remained closed for the rest of the cycle. This alternative convergent-divergent shape change of the glottis clearly indicates the formation and propagation of a mucosal wave on the vocal fold medial surface. This mucosal wave propagation is a hallmark of vocal fold dynamics, found to be essential for the sustained energy transfer from flow to vocal folds [36]. Both the results of the glottal area waveform and vocal fold vibration pattern indicate that our simulation reproduced typical vocal fold vibratory dynamics. To show the detailed vibration pattern of the vocal folds, Figure 5 plots the midcoronal profile of the vocal folds at four different time instants during one vibration cycle. At 0.17 T, when the vocal folds were opening, the glottis formed a convergent shape. At 0.36 T, the glottis formed a parallel channel. At 0.55 T, when the vocal folds were closing, the glottis formed a divergent shape. Finally, at 0.74 T, the vocal folds fully closed and remained closed for the rest of the cycle. This alternative convergent-divergent shape change of the glottis clearly indicates the formation and propagation of a mucosal wave on the vocal fold medial surface. This mucosal wave propagation is a hallmark of vocal fold dynamics, found to be essential for the sustained energy transfer from flow to vocal folds [36]. Both the results of the glottal area waveform and vocal fold vibration pattern indicate that our simulation reproduced typical vocal fold vibratory dynamics. Vocal fold vibration is known to be the result of a dynamic entrainment process of a number of vibration modes [37,38]. To reveal this dynamic process, we applied the proper orthogonal decomposition (POD) method to obtain the primary vibration modes of the vocal fold vibration. The POD method is a common method to use low-dimensional quantities to express complex dynamics. It computes the most energetic vibratory modes and their modal coefficients through the proper orthogonal decomposition of vocal fold kinematics. Figure 6a shows the 3D shape, as well as the mid-coronal profile of the first and second POD modes of the vocal fold vibration, which captured 65% and 27% of the total vibration energy, respectively. In the 2D plot, the solid line illustrates the undeformed shape of the vocal fold, and the dashed and dashed-dot lines illustrate the two-extreme states of these two modes. It can be seen that the first and second modes primarily represent a lateral motion and a convergent-divergent type motion, respectively. Figure 6b shows the time history of the modal coefficient of the two modes. It reveals that the two Vocal fold vibration is known to be the result of a dynamic entrainment process of a number of vibration modes [37,38]. To reveal this dynamic process, we applied the proper orthogonal decomposition (POD) method to obtain the primary vibration modes of the vocal fold vibration. The POD method is a common method to use low-dimensional quantities to express complex dynamics. It computes the most energetic vibratory modes and their modal coefficients through the proper orthogonal decomposition of vocal fold kinematics. Figure 6a shows the 3D shape, as well as the mid-coronal profile of the first and second POD modes of the vocal fold vibration, which captured 65% and 27% of the total vibration energy, respectively. In the 2D plot, the solid line illustrates the undeformed shape of the vocal fold, and the dashed and dashed-dot lines illustrate the two-extreme states of these two modes. It can be seen that the first and second modes primarily represent a lateral motion and a convergent-divergent type motion, respectively. Figure 6b shows the time history of the modal coefficient of the two modes. It reveals that the two modes vibrated at the same frequency, with a 0.38 T phase lag.
number of vibration modes [37,38]. To reveal this dynamic process, we applied the proper orthogonal decomposition (POD) method to obtain the primary vibration modes of the vocal fold vibration. The POD method is a common method to use low-dimensional quantities to express complex dynamics. It computes the most energetic vibratory modes and their modal coefficients through the proper orthogonal decomposition of vocal fold kinematics. Figure 6a shows the 3D shape, as well as the mid-coronal profile of the first and second POD modes of the vocal fold vibration, which captured 65% and 27% of the total vibration energy, respectively. In the 2D plot, the solid line illustrates the undeformed shape of the vocal fold, and the dashed and dashed-dot lines illustrate the two-extreme states of these two modes. It can be seen that the first and second modes primarily represent a lateral motion and a convergent-divergent type motion, respectively. Figure 6b shows the time history of the modal coefficient of the two modes. It reveals that the two modes vibrated at the same frequency, with a 0.38 T phase lag.   Figure 7a plots the phase-averaged sound pressure monitored at a hypothetical point at a distance of 5 cm outside of the tract exit (mouth). The sound pressure was calculated by assuming a 3D spherical sound wave propagation, starting from the mouth and using Equation (6):

Acoustics
where p s is the sound pressure, ρ is the air density, r is the distance of the location to the source (mouth), and dQ/dt is the time derivative of the flow rate at the mouth. It can be seen that the sound pressure oscillates with a frequency different from vibration f 0 (201 Hz), suggesting that the highest acoustic energy was not on f 0 . Figure 7b shows the frequency spectrum of the sound pressure. The total time length of the pressure wave is ∆t = 0.0249s, based on which the frequency resolution is 1/∆t = 40.2 Hz. The figure shows the first three formants of the sound pressure wave at 602, 1365, and 2449 Hz. Note that, by assuming the tract as a uniform open-closed tube, its natural frequencies can be estimated by using F n = (2n − 1) c 4L [36], where n = 1, 2, 3, . . ., c is the speed of sound and L is the length of the tube. By employing c = 351.88 m/s and L = 17.4 cm, the first three natural frequencies of the tract are 506, 1517, and 2528 Hz, consistent with the formants. Several voice-quality-related quantities were computed. The signal-to-noise ratio (SNR) is 11.45 dB. Sound intensity, calculated as 10log 10 I I 0 in which I 0 = 10 −12 and I = ∆p s 2 2ρc (∆p s is the amplitude of the pressure waveform, ρ is air density, and c is the employed sound speed), is 89.31 dB. Vocal efficiency, calculated as 4πr 2 10 (SPL−120)/10 /P pul U g P pul is the Appl. Sci. 2021, 11, 1221 9 of 17 pulmonary pressure, and U g is the mean glottal flow rate) [36], is 4.74 × 10 −5 . All of these quantities are within or close to the typical range of human phonation [36,[39][40][41][42]. by using = (2 − 1) [36], where = 1, 2, 3, …, is the speed of sound and is the length of the tube. By employing = 351.88 m/s and = 17.4 cm, the first three natural frequencies of the tract are 506, 1517, and 2528 Hz, consistent with the formants. Several voice-quality-related quantities were computed. The signal-to-noise ratio (SNR) is 11.45 dB. Sound intensity, calculated as 10 in which = 10 and = ∆ (∆ is the amplitude of the pressure waveform, is air density, and is the employed sound speed), is 89.31 dB. Vocal efficiency, calculated as 4 10 ( ) ⁄ ( is the pulmonary pressure, and is the mean glottal flow rate) [36], is 4.74 × 10 . All of these quantities are within or close to the typical range of human phonation [36,[39][40][41][42].

Effects of SGS Severity on Glottal Flow Dynamics, Vocal Fold Vibration, and Acoustics
In this section, we investigate the effects of SGS severity on voice production for the area reductions of 25%, 50%, 75%, 90%, 96%, and 99% and SGS distance of 3.13 cm. Important quantities of the dynamics of glottal flow, vocal fold vibrations, and acoustics were computed for each case. The relationships between these quantities and SGS severity are discussed.

Effects of SGS Severity on Glottal Flow Dynamics, Vocal Fold Vibration, and Acoustics
In this section, we investigate the effects of SGS severity on voice production for the area reductions of 25%, 50%, 75%, 90%, 96%, and 99% and SGS distance of 3.13 cm. Important quantities of the dynamics of glottal flow, vocal fold vibrations, and acoustics were computed for each case. The relationships between these quantities and SGS severity are discussed.

Glottal Flow Rate Waveform
For all of the cases, self-sustained vibration was reached after the second cycle, and eight steady vibration cycles were generated. Following the method in the baseline case, the phase-averaged glottal flow rate waveform over one vibration cycle was plotted for each case and shown in Figure 8a. It is found that there are no noticeable changes (less than 1%) in the peak glottal flow rate up to 75% area reduction, which is similar to the results in Smith and Thomson [18]. Beyond 75% area reduction, the flow rate drops rapidly with the area reduction. It is also interesting to notice that, when the area reduction is beyond 75%, the glottal flow rate waveform shows a delayed glottic closure. Quantitatively, the glottic closure occurs at the phase of 0.71 T, 0.71 T, 0.72 T, and 0.76 T, corresponding to 75%, 90%, 96%, and 99% area reduction, respectively. With the phase of the peak flow rate remaining unchanged in all of the cases, the delayed glottic closure means a prolonged duration of flow deceleration. As a result, the skewness quotient decreases much quicker when the area reduction is beyond 90% (Figure 8b). The combined effect of the reduced peak flow rate and prolonged flow deceleration time also results in corresponding decreases of MFDR (Figure 8c). A lower MFDR leads to reduced sound intensity. Therefore, our results indicate that both the flow rate and sound intensity start to show noticeable reduction when the area reduction is above 75% and quicker reduction when the area reduction is above 90%; below 75% area reduction, they are nearly not affected. Interestingly, the 75% threshold corresponds well to the clinical observation that area reduction below 70% is categorized into Grade I and Grade II SGS, for which no treatment is normally needed. sponding decreases of MFDR (Figure 8c). A lower MFDR leads to reduced sound intensity. Therefore, our results indicate that both the flow rate and sound intensity start to show noticeable reduction when the area reduction is above 75% and quicker reduction when the area reduction is above 90%; below 75% area reduction, they are nearly not affected. Interestingly, the 75% threshold corresponds well to the clinical observation that area reduction below 70% is categorized into Grade I and Grade II SGS, for which no treatment is normally needed.  Figure 9a plots the phase-averaged glottal area waveform for different SGS severities. Similar to the glottal flow rate, the glottal area is reduced beyond a threshold SGS severity. However, different from the glottal flow rate, the threshold value is at the SGS area reduction of 90%. Specifically, with the area reduction of 90%, 96%, and 99%, the maximum glottal area, respectively, drops 1.2%, 7%, and 30%; the average speed of glottic opening, respectively, drops 1.7%, 15%, and 49%; and the average speed of glottic closing, respectively, drops 1.5%, 10%, and 57% relative to the baseline case. Figure 9b plots the open quotient versus the SGS severity for different cases. The variation of the open quotient is negligible up to 90% severity and increases rapidly beyond that. Our results suggest that,  Figure 9a plots the phase-averaged glottal area waveform for different SGS severities. Similar to the glottal flow rate, the glottal area is reduced beyond a threshold SGS severity. However, different from the glottal flow rate, the threshold value is at the SGS area reduction of 90%. Specifically, with the area reduction of 90%, 96%, and 99%, the maximum glottal area, respectively, drops 1.2%, 7%, and 30%; the average speed of glottic opening, respectively, drops 1.7%, 15%, and 49%; and the average speed of glottic closing, respectively, drops 1.5%, 10%, and 57% relative to the baseline case. Figure 9b plots the open quotient versus the SGS severity for different cases. The variation of the open quotient is negligible up to 90% severity and increases rapidly beyond that. Our results suggest that, while SGS tends to reduce vocal fold vibration, the effect is noticeable only when the area reduction is beyond 90% (Grades III and IV SGS). while SGS tends to reduce vocal fold vibration, the effect is noticeable only when the area reduction is beyond 90% (Grades III and IV SGS). The POD analysis was performed to investigate the effect of SGS severity on vocal fold vibration modes and their dynamic entrainment. Figure 10a shows the energy percentage of the two most energetic vibration modes in each case. The total energy of the two modes is also plotted. For all of the cases, the two modes captured more than 90% of the total vibration energy. It can be observed that there are no noticeable changes in the modal energy percentage up to 90% of area reduction. Beyond that, at a 96% and 99% area reduction, the energy of the first mode respectively increases by 8.5% and 7.6%; the energy The POD analysis was performed to investigate the effect of SGS severity on vocal fold vibration modes and their dynamic entrainment. Figure 10a shows the energy percentage of the two most energetic vibration modes in each case. The total energy of the two modes is also plotted. For all of the cases, the two modes captured more than 90% of the total vibration energy. It can be observed that there are no noticeable changes in the modal energy percentage up to 90% of area reduction. Beyond that, at a 96% and 99% area reduction, the energy of the first mode respectively increases by 8.5% and 7.6%; the energy of the second mode respectively reduces by 7% and 1.6%; and the total energy respectively increases by 1.5% and 6%. To evaluate if any changes occurred in the modal shape, the similarity of the shape of the corresponding modes in the different cases is quantified by computing the dot product of the modal shape vectors. The result is plotted in Figure 10b. A value of 1 of the dot product represents two identical modes, and a value of 0 represents two completely orthogonal modes. It is observed that the similarity of the corresponding modes between different SGS severity cases and baseline case remains as high as 98% up to 96% area reduction. A significant drop in the modal similarity on the second mode is observed when the area reduction is 99%. To show the details, Figure 10c plots the extreme states of the first and second modes of the baseline case and those of the case with 99% area reduction. As expected, the shape of the first mode remains highly similar between the two cases. However, in the second mode, a large difference is observed on the superior surface of the vocal fold that the waves on the superior surface appear to have opposite phases between the two cases. It indicates that increasing the SGS area reduction to 99% results in different dynamic entrainment between the waves on the medial surface and superior surface. Figure 10d shows the relationship between vibration frequency and SGS severity. Consistent with vocal fold dynamics, f 0 is unaffected up to a 90% area reduction. At 90%, the severity f 0 was slightly reduced to 199 Hz, and then it was reduced further to 195 and 183 Hz at 96% and 99% severities, respectively.

Acoustics
The signal-to-noise ratio, sound intensity, and vocal efficiency for each SGS severity case were computed based on the sound pressure and spectrum. These quantities are plotted against the severity level in Figure 11. These figures show, again, negligible differences of these quantities up to a 75% SGS severity. Beyond this threshold, SNR decreases by 0.6, 3.5, and 7.1 dB, and sound intensity decreases by 0.93, 3.34, and 15.1 dB at 90%, 96%, and 99% SGS severity, respectively. Most significantly, the vocal efficiency decreases by 38%, 82%, and 99.9% at the SGS severity of 90%, 96%, and 99%, respectively, suggesting a significant increase of vocal effort when the area reduction is extremely large. In sum, our results show that the SGS starts to show noticeable effects on vocal fold vibrations when the area reduction reaches 90%, including reducing the vibration amplitude, increasing the glottal open quotient, and reducing the vibration frequency. The results also suggest that the changes in the vocal fold dynamics are associated with the variation in the energy percentage of the vibration modes or the generation of different vibration modes. It is interesting to note that the threshold of SGS area reduction for noticeable effect on vocal fold dynamics is 90%, while on flow rate, it is 75%, suggesting that the change of the flow rate associated with the SGS area reduction from 75% to 90% is likely due to the flow viscous loss across the SGS itself.

Acoustics
The signal-to-noise ratio, sound intensity, and vocal efficiency for each SGS severity case were computed based on the sound pressure and spectrum. These quantities are plotted against the severity level in Figure 11. These figures show, again, negligible differences of these quantities up to a 75% SGS severity. Beyond this threshold, SNR decreases by 0.6, 3.5, and 7.1 dB, and sound intensity decreases by 0.93, 3.34, and 15.1 dB at 90%, 96%, and 99% SGS severity, respectively. Most significantly, the vocal efficiency decreases by 38%, 82%, and 99.9% at the SGS severity of 90%, 96%, and 99%, respectively, suggesting a significant increase of vocal effort when the area reduction is extremely large.
Appl. Sci. 2021, 11, x FOR PEER REVIEW 13 of 18 Figure 11. The signal-to-noise ratio, sound intensity, and vocal efficiency for different SGS severities.

Effects of SGS Location on Glottal Flow Dynamics, Vocal Fold Vibration and Acoustics
In this paragraph, we investigate the effects of SGS location on voice production. Since the above results show that SGS only shows noticeable effects on voice production when the area reduction is above 90%, the effects of SGS location are only studied on the cases with SGS area reduction of 90% and 96%. Two SGS distances of 3.13 and 2.1 cm were employed in the simulations with the varied SGS severity. Figure 12 compares the glottal flow rate waveform, glottal area waveform, f0, sound intensity, and the similarity of the vibration modes with those of the baseline case. The overall observation is that the SGS location does not change its effects. The changes in the peak flow rate, maximum glottal area, and other quantities in Figure 12, due to the change of the SGS location, are all within 2%. We have also computed other quantities including the skewness and open quotients, MFDR, mode energies, SNR, and vocal efficiency. Overall, we do not observe noticeable effects of SGS location on them. This clearly indicates that the voice outcome is insensitive to the location of SGS. Figure 11. The signal-to-noise ratio, sound intensity, and vocal efficiency for different SGS severities.

Effects of SGS Location on Glottal Flow Dynamics, Vocal Fold Vibration and Acoustics
In this paragraph, we investigate the effects of SGS location on voice production. Since the above results show that SGS only shows noticeable effects on voice production when the area reduction is above 90%, the effects of SGS location are only studied on the cases with SGS area reduction of 90% and 96%. Two SGS distances of 3.13 and 2.1 cm were employed in the simulations with the varied SGS severity. Figure 12 compares the glottal flow rate waveform, glottal area waveform, f 0 , sound intensity, and the similarity of the vibration modes with those of the baseline case. The overall observation is that the SGS location does not change its effects. The changes in the peak flow rate, maximum glottal area, and other quantities in Figure 12, due to the change of the SGS location, are all within 2%. We have also computed other quantities including the skewness and open quotients, MFDR, mode energies, SNR, and vocal efficiency. Overall, we do not observe noticeable effects of SGS location on them. This clearly indicates that the voice outcome is insensitive to the location of SGS.

Underlying Mechanism-Correlation to the Flow Resistance
Same as Smith and Thomson [18], we hypothesized that the above-observed eff of SGS on voice production are primarily due to its effect on the flow resistance. To ex ine this hypothesis, we computed the flow resistances across the SGS (trans-stenosis sistance) and glottis (trans-glottis resistance) for all of the SGS severity cases. The f resistance is defined as |∆ | , where ∆ is the mean pressure drop across the SGS or glo and is the corresponding incompressible glottal flow rate. The total flow resistanc defined as the sum of the two. Figure 13a shows the cycle-averaged flow resistances v sus the SGS severity in which the SGS location is 3.13 cm.
It is noticed that, up to the SGS area reduction of 75%, the trans-stenosis resista stays nearly zero, and as a result, the total flow resistance is the same as the trans-glo flow resistance. Moreover, the change of the SGS area reduction up to 75% does not af the trans-glottis flow resistance. It indicates that, when the SGS area reduction is be 75%, the SGS does not generate additional flow resistance nor affect the trans-glottis f resistance. When the SGS area reduction is 75%, a very slight increase in the trans-sten flow resistance and total flow resistance can be seen. Beyond 75% SGS severity, the tra stenosis resistance starts to increase rapidly with the increasing SGS severity. In the me while, the trans-glottis resistance slightly drops. Because the increase of the trans-sten

Underlying Mechanism-Correlation to the Flow Resistance
Same as Smith and Thomson [18], we hypothesized that the above-observed effects of SGS on voice production are primarily due to its effect on the flow resistance. To examine this hypothesis, we computed the flow resistances across the SGS (trans-stenosis resistance) and glottis (trans-glottis resistance) for all of the SGS severity cases. The flow resistance is defined as |∆P| Q , where ∆P is the mean pressure drop across the SGS or glottis, and Q is the corresponding incompressible glottal flow rate. The total flow resistance is defined as the sum of the two. Figure 13a shows the cycle-averaged flow resistances versus the SGS severity in which the SGS location is 3.13 cm.
It is noticed that, up to the SGS area reduction of 75%, the trans-stenosis resistance stays nearly zero, and as a result, the total flow resistance is the same as the trans-glottis flow resistance. Moreover, the change of the SGS area reduction up to 75% does not affect the trans-glottis flow resistance. It indicates that, when the SGS area reduction is below 75%, the SGS does not generate additional flow resistance nor affect the trans-glottis flow resistance. When the SGS area reduction is 75%, a very slight increase in the transstenosis flow resistance and total flow resistance can be seen. Beyond 75% SGS severity, the trans-stenosis resistance starts to increase rapidly with the increasing SGS severity. In the meanwhile, the trans-glottis resistance slightly drops. Because the increase of the trans-stenosis resistance is much more than the drop of the trans-glottis resistance, the total flow resistance also rapidly increases with the SGS severity. When the SGS area reduction reaches 99%, the trans-stenosis resistance has become comparable to the transglottis resistance; as a result, the total flow resistance is about twice its values in the baseline case. These observations indicate that, when the SGS area reduction is beyond 75%, the SGS starts to generate additional flow resistance, which increased rapidly with the SGS area reduction. the SGS severity. Therefore, the effect of SGS on the area ratio correlates well with its effect on the trans-stenosis flow resistance, suggesting that the area ratio is a key parameter, to determine the effect of SGS on the flow resistance. Specifically, when the glottal area is significantly smaller than the SGS area (SGS severity below 75%), the SGS generates nearly no additional flow resistance, and the vocal fold dynamics are little affected. By increasing the SGS severity beyond 75%, the area ratio starts to increase rapidly, and the trans-stenosis flow resistance increases accordingly. It is noticed that, when the SGS severity is 96%, the area ratio reaches 0.62, meaning that the SGS area is only less than two times the cycleaveraged glottal area, but the trans-stenosis resistance is only 0.14 of the trans-glottis resistance. When the SGS severity reaches 99%, the area ratio becomes 2.14, meaning that the SGS area is only half of the cycle-averaged glottal area, but the trans-stenosis resistance is about the same as the trans-glottis resistance. The reason is that very high trans-glottis flow resistance is generated during the early glottal opening and late glottal closing when the glottal area is very small.  The effect of the SGS on the flow resistance is found to be consistent with its effect on the flow rate and vocal fold vibration. The SGS is found to start to lower the flow rate at 75% area reduction and lower the vocal fold vibration amplitude at 90% area reduction. These are consistent with its threshold to increase the flow resistance. That the threshold of SGS severity to affect vocal fold vibration appears higher than that to affect the flow rate is likely because the flow rate is more sensitive to the changes in flow resistance. Furthermore, Titze and Talkin [28] reported that a decrease in trans-glottis pressure drop could result in a slight decrease in the f 0 . Our results show that increasing the SGS severity beyond 75% decreases the flow rate and trans-glottis flow resistance, suggesting that the trans-glottis pressure would also be decreased. It thus explains the drop of the f 0 with the SGS severity beyond 75%.
To investigate why the threshold SGS severity to increase the flow resistance occurs at 75% area reduction, we computed the area ratio between the glottis and SGS and plot its relationship with SGS severity in Figure 13b. The area ratio is defined as the cycle-averaged glottal area to the SGS area. The figure shows a similar relationship between the area ratio and SGS severity to that of the trans-stenosis flow resistance: The area ratio remains nearly zero until the SGS severity reaches 75%; after that, it rapidly increases with the SGS severity. Therefore, the effect of SGS on the area ratio correlates well with its effect on the trans-stenosis flow resistance, suggesting that the area ratio is a key parameter, to determine the effect of SGS on the flow resistance. Specifically, when the glottal area is significantly smaller than the SGS area (SGS severity below 75%), the SGS generates nearly no additional flow resistance, and the vocal fold dynamics are little affected. By increasing the SGS severity beyond 75%, the area ratio starts to increase rapidly, and the trans-stenosis flow resistance increases accordingly. It is noticed that, when the SGS severity is 96%, the area ratio reaches 0.62, meaning that the SGS area is only less than two times the cycle-averaged glottal area, but the trans-stenosis resistance is only 0.14 of the trans-glottis resistance. When the SGS severity reaches 99%, the area ratio becomes 2.14, meaning that the SGS area is only half of the cycle-averaged glottal area, but the trans-stenosis resistance is about the same as the trans-glottis resistance. The reason is that very high trans-glottis flow resistance is generated during the early glottal opening and late glottal closing when the glottal area is very small. Finally, we compare the flow resistances in the cases with different SGS locations. As shown in Figure 14, the SGS location has negligible effects on flow resistance. The 2.1 and 3.13 cm cases have nearly the same flow resistance, with reductions of 0.8% of the total flow resistance and 1.2% of the trans-glottis resistance for the 90% area reduction case and 0.4% of the total flow resistance and 0.9% of the trans-glottis resistance for the 96% area reduction case. This also nicely corresponds to the observation that the SGS location has nearly no effect on voice production within that stenosis-glottis distance interval. Finally, we compare the flow resistances in the cases with different SGS locations. As shown in Figure 14, the SGS location has negligible effects on flow resistance. The 2.1 and 3.13 cm cases have nearly the same flow resistance, with reductions of 0.8% of the total flow resistance and 1.2% of the trans-glottis resistance for the 90% area reduction case and 0.4% of the total flow resistance and 0.9% of the trans-glottis resistance for the 96% area reduction case. This also nicely corresponds to the observation that the SGS location has nearly no effect on voice production within that stenosis-glottis distance interval.

Conclusions
In this study, a 3D laryngeal model was employed to study the effects of subglottic stenosis on dynamics of glottal flow, vocal fold vibration, and acoustics during voice production. The results show that the SGS starts to have a noticeable effect on glottal flow rate only when the area reduction is beyond 75% and has an effect on vocal fold vibration when the area reduction is beyond 90%. Specifically, when the SGS area reduction is below this threshold, varying the SGS area reduction had little effect on glottal flow, vocal fold vibration, and acoustics; when it is beyond this threshold, increasing the SGS area reduction quickly decreases the flow rate, vocal fold vibration amplitude, and vocal efficiency; it slightly decreases the skewness quotient, vibration frequency, signal-to-noise ratio, and vocal intensity; and it slightly increases the open quotient. Furthermore, our results show that large SGS area reduction can also affect vocal fold dynamics by affecting the energy percentage of the vibration modes or generating different vibration modes. Changing SGS location shows no effect on the dynamics.
The detailed analysis on the trans-stenosis flow resistance, trans-glottis flow resistance, and total flow resistance shows that the effects of SGS are related to its effect on the total flow resistance, which is further related to the ratio of the cycle-averaged glottal area to SGS area. When the SGS area reduction is below 75%, the area ratio was nearly zero, meaning that the glottis opening is much smaller than the SGS opening. In these cases, the trans-stenosis resistance is nearly zero, and the total flow resistance is almost the same as the trans-glottis flow resistance, suggesting that the SGS does not generate additional flow resistance. As a result, the SGS shows nearly no effect on the dynamics. By increasing the SGS severity beyond 75%, the area ratio starts to increase rapidly with the SGS severity, and the trans-stenosis flow resistance is observed to increase rapidly too, resulting in the quick increase of the total flow resistance. As a result, the flow rate and vocal fold vibration reduce quickly. In the meantime, increasing SGS severity also causes a decrease in the trans-glottis resistance and trans-glottis pressure drop, which further causes a drop in the fundamental frequency. Therefore, our results indicate that the effect of SGS on phonatory dynamics is largely determined by the area ratio between the glottis Figure 14. The cycle-averaged trans-glottis, trans-stenosis, and tract flow resistance of the 2.1 and 3.13 cm SGS distances for the 90% and 96% SGS severities.

Conclusions
In this study, a 3D laryngeal model was employed to study the effects of subglottic stenosis on dynamics of glottal flow, vocal fold vibration, and acoustics during voice production. The results show that the SGS starts to have a noticeable effect on glottal flow rate only when the area reduction is beyond 75% and has an effect on vocal fold vibration when the area reduction is beyond 90%. Specifically, when the SGS area reduction is below this threshold, varying the SGS area reduction had little effect on glottal flow, vocal fold vibration, and acoustics; when it is beyond this threshold, increasing the SGS area reduction quickly decreases the flow rate, vocal fold vibration amplitude, and vocal efficiency; it slightly decreases the skewness quotient, vibration frequency, signal-to-noise ratio, and vocal intensity; and it slightly increases the open quotient. Furthermore, our results show that large SGS area reduction can also affect vocal fold dynamics by affecting the energy percentage of the vibration modes or generating different vibration modes. Changing SGS location shows no effect on the dynamics.
The detailed analysis on the trans-stenosis flow resistance, trans-glottis flow resistance, and total flow resistance shows that the effects of SGS are related to its effect on the total flow resistance, which is further related to the ratio of the cycle-averaged glottal area to SGS area. When the SGS area reduction is below 75%, the area ratio was nearly zero, meaning that the glottis opening is much smaller than the SGS opening. In these cases, the trans-stenosis resistance is nearly zero, and the total flow resistance is almost the same as the trans-glottis flow resistance, suggesting that the SGS does not generate additional flow resistance. As a result, the SGS shows nearly no effect on the dynamics. By increasing the SGS severity beyond 75%, the area ratio starts to increase rapidly with the SGS severity, and the trans-stenosis flow resistance is observed to increase rapidly too, resulting in the quick increase of the total flow resistance. As a result, the flow rate and vocal fold vibration reduce quickly. In the meantime, increasing SGS severity also causes a decrease in the trans-glottis resistance and trans-glottis pressure drop, which further causes a drop in the fundamental frequency. Therefore, our results indicate that the effect of SGS on phonatory dynamics is largely determined by the area ratio between the glottis and SGS. A threshold area ratio exists, below which the SGS has nearly no effect on the dynamics, and, beyond which, the effect of SGS grows rapidly with its severity. In our results, the threshold area ratio is around 0.1, corresponding to an SGS area reduction of 75%. The results are found to be consistent with clinical observations that Grades I and II SGS (area reduction below 70%) had very minor effects on both voice production, and respiration and normally no treatment is needed; only at Grades III and IV SGS, effects on breathing and voice production become significant and treatments are needed.