Aeroacoustic Sound Source Characterization of the Human Voice Production-Perturbed Convective Wave Equation

: The ﬂow-induced sound sources of human voice production are investigated based on a validated voice model. This analysis is performed using a hybrid aeroacoustic workﬂow based on the perturbed convective wave equation. In the ﬁrst step, the validated 3D incompressible turbulent ﬂow simulation is computed by the ﬁnite volume method using STARCCM+ . In a second step, the aeroacoustic sources are evaluated and studied in detail. The formulation of the sound sources is compared to the simpliﬁcation (neglecting the convective sources) systematically using time-domain and Fourier-space analysis. Additionally, the wave equation is solved with the ﬁnite element solver openCFS to obtain the 3D sound ﬁeld in the acoustic far-ﬁeld. During the detailed effect analysis, the far-ﬁeld sound spectra are compared quantitatively, and the ﬂow-induced sound sources are visualized within the larynx. In this contribution, it is shown that the convective part of the sources dominates locally near the vocal folds (VFs) while the local time derivative of the incompressible pressure is distributed in the whole supra-glottal area. Although the maximum amplitude of the time derivative is lower, the integral contribution dominates the sound spectrum. As a by-product of the detailed perturbed convective wave equation source study, we show that the convective source term can be neglected since it only reduces the validation error by 0.6%. Neglecting the convective part reduces the algorithmic complexity of the aeroacoustic source computation of the perturbed convective wave equation and the stored ﬂow data. From the source visualization, we learned how the VF motion transforms into speciﬁc characteristics of the aeroacoustic sources. We found that if the VFs are fully closing, the aeroacoustic source terms yield the highest dynamical range. If the VFs are not fully closing, VFs motion does not provide as much source energy to the ﬂow-induced sound sources as in the case of a healthy voice. As a consequence of not fully closing VFs, the cyclic pulsating velocity jet is not cut off entirely and therefore turbulent structures are permanently present inside the supraglottal region. These turbulent structures increase the broadband component of the voice signal, which supports research results of previous studies regarding glottis closure and insufﬁcient voice production.


Introduction
The voice, the carrier signal of speech, is modulated by pharynx constrictions, tongue motion, and oral-nasal area. The process of voice production can be described by the Thus, the flow field variables are separated into mean (p,v,ρ) and fluctuating parts. Additionally, the fluctuating parts are assumed to be a sum of acoustic (p a , v a , ρ a ) and incompressible flow components (p ic , v ic ). Using an arbitrary Lagrangian-Eulerian description for the convective differential operators where v rg is the relative velocity of the grid, we arrive at the PCWE for moving meshes [16]: The formulation reduces the number of equation unknowns (acoustic pressure p a =ρ Dψ a Dt and particle velocity v a = −∇ψ a ) to just one scalar unknown, being the acoustic velocity potential ψ a . Compared to the acoustic perturbation equation (APE) [13], the PCWE has two advantages. Firstly, it is a scalar equation and thus computationally efficient. Secondly, it describes aeroacoustics generated by incompressible flow structures and wave propagation through moving media [17].
Regarding the computational efficiency and the applicability to complicated geometries, the model was validated for the voice production process [18]. At low Mach numbers without back-coupling of the acoustic waves on the flow field, the numerical separation of flow and acoustics (hybrid aeroacoustic approach) is valid. This approach involves three steps (see Figure 1): (1) perform unsteady incompressible flow simulation; (2) compute the acoustic sources from the flow field; (3) simulate the sound propagation. In addition to the efficient model, these three steps guarantee a practical method for fast and accurate aeroacoustic computations regarding the physical assumptions and numerical requirements [19][20][21].  This hybrid workflow was frequently applied to compute the human voice production [6,9,10,[22][23][24][25][26][27][28][29][30]. A full fluid-solid-acoustic coupling was presented by [24], who used Lighthill's acoustic analogy to calculate the acoustics. Recently, Ref. [31,32] developed a model to compute flow-induced sound and include vibration of the human vocal folds. This model coupled the immersed boundary method for incompressible flow to a finite element solver. The finite element solver computes the viscoelastic tissue and the linearized perturbed compressible equations [14]. However, the fully coupled fluid-structure simulations suffer from a lack of precise geometrical and material properties of the living tissues [24,31]. Since the parameters are highly subject-specific and most of the vocal fold tissue measurements are still hardly applicable in-vivo to precisely determine the vocal fold material parameters [33,34]. As shown in [35], the full fluid-structure interaction solution can be approximated properly by prescribing the vocal fold motion for the flow computation. In this way, the fluid-structure interaction is circumvented and spares considerable computational resources.
According to these findings, Ref. [18] describes a validated aeroacoustic model based on a simplified PCWE formulation in conjunction with the hybrid workflow. The larynx geometry of the numerical model is based on the experimental model of [36]. Firstly, the 3D flow inside the larynx is computed by an incompressible Large Eddy Simulation in combination with the wall-adapting local eddy-viscosity subgrid-scale model and a finitevolume cell-centered non-staggered grid. The vocal fold motion is enforced by the model of [37,38]. Based on the experiments [39,40], a constant inlet flow pressure of 2450 Pa and an outlet boundary flow pressure of 0 Pa was set. The wall boundaries are defined as noslip, no-injection boundary conditions. Secondly, the acoustic source terms are computed on the flow grid and a conservative interpolation to the acoustic grid is performed [41]. Thirdly, the sound propagation is simulated using the finite element method [18]. Therein, the computational effort of sound computation was reduced by 95% while preserving a satisfying accuracy. This optimized sound propagation model consists of an aeroacoustic source domain, a propagation domain, and an accurate perfectly matched layer (PML) accounting for far-field treatment of the acoustic waves. However, the acoustic simulation was computed using a simplified PCWE variant. Using dimensional analysis the first term of material derivative (4) scales with the Helmholtz number He and the second with the Mach number Ma. At low Mach number, the material derivative (4) in Equation (5) is assumed to be approximated solely by the partial time derivative. In doing so, the PCWE equation reduces to an equation already proposed by Ribner [42]: In contrast to [43], we investigate the low Mach number assumption of the sources systematically. Therefore, we analyze each source term contribution separately as well as jointly.
As a consequence, we will provide an interpretation framework to judge aeroacoustic sources of the PCWE equation.
Thus, this work focuses on the generation of aeroacoustic sound sources regarding PCWE. Throughout the work, we visualize different flow parameters, such as the pressure, turbulent kinetic energy, and the terms of the aeroacoustic sound sources to investigate the sound production process. It is of particular interest to separate the different parts of the PCWE source terms (see e.g., (5)) and provide a solid ground for the assumptions regarding the application towards human phonation. Therefore, the source distribution based on PCWE is discussed in detail for specific time steps during an oscillation (closed, opening, open, closing). Furthermore, the sources contours are displayed at the fundamental frequency f 0 of the VF and characteristic higher frequencies associated with broadband noise. The outcomes provide further evidence of the shape and mechanism of the human voice production and a better understanding of disordered and physiological voice production processes. Section 2 introduces the acoustic simulation model and discusses the planned investigations based on the model simplifications. In general, the developed methodology can be applied to real phonation with physiological geometries. Section 3 summarizes the visualization of PCWE sources and their interpretation. During the interpretation, we address the application towards disordered effects of the VFs like glottic insufficiency. Finally, we conclude the recent advances and provide an outlook.

Phonation and Simulation Model
To adequately validate the human voice production process while also looking towards computational efficiency, a simplified phonation model is required. Therefore, instead of modeling a human larynx and vocal tract, a simplified synthetic larynx model, developed by [39,40] is used to model the fluid dynamics and also the aeroacoustics in a hybrid workflow.

Phonation Model
The phonation model consists of two rectangular channels (subglottal and supraglottal) with a cross-section of 18 × 15 mm² , a length of 190 mm and the synthetic larynx in between (see Figure 2). The larynx includes the vocal folds (VF) and the rigid ventricular or false vocal folds (fVF), described in the experiments of [39,40]. The geometry of the vocal folds replicates the M5 model introduced by [44,45]. Instead of incorporating the fluid-structure interaction of the VFs, the glottal area waveform (GAW) was extracted from the flow-induced motion in the experimental model and transfered to the CFD model as moving boundary.
For the acoustic simulation, following the hybrid aeroacoustic approach, a different grid is used than for the CFD simulation. The three dimensional computational acoustics (CA) grid as shown in a coronal plane of the larynx in Figure 2, is inherited from [18], which is referred to for a detailed explanation. For computational efficiency, the subglottal channel is truncated by an absorbing boundary condition (ABC) to prevent reflections. At the end of the vocal tract (supraglottal channel), a cuboid-shaped propagation domain is attached with a two element thick, perfectly matched layer (PML) [46] region on the boundary, to sufficiently ensure the open domain property [18]. All other boundaries, including the bottom plane of the propagation domain, are modeled as acoustic hard. The larynx region and vocal tract are meshed by approximately 127 500 tetrahedral elements with a maximum element size of 4.3 mm. In contrast, the propagation and PML regions are meshed with hexahedral elements.
The flow field is interpolated to the CA grid using a conservative interpolation technique presented in [18]. Then, the finite element approach solves the PCWE that describes the acoustic wave propagation. This framework has been implemented in openCFS [47] to perform the aeroacoustic simulations presented in this work. The incompressible fluid density is set toρ = 1.138 kg/m 3 and the speed of sound is given by c = 343.4 m /s.

Aeroacoustic Model
Based on [18], we derive the weak formulation of the partial differential equation for the scalar acoustic potential ψ a ∈ H 1 . In doing so, we introduce an appropriate test function ϕ ∈ H 1 , multiply Equation (5) by the test function, integrate over the whole computational domain Ω, integrate by parts the second term on the left hand side and arrive at In [18], the convective contribution to the PCWE source was neglected. Based on the PCWE source in Equation (5), the two additive source parts (time derivative of the incompressible pressure and the convective source term part) are investigated separately. As a by-product of this study, the effect of the convected source term onto the far-field sound spectrum is obtained. As published by [43], we believe that the source visualization was strongly dominated by the highly localized convective source term inside the glottal gap. This convective source term is large due to the enhanced flow velocity inside this region. Therefore, the mean velocity and flow pressure gradient are well aligned and feature a large locally active source term. Additionally, it was found that with increasing glottal insufficiency this source term is reduced, which can be explained by the reduced flow resistance change within a cycle. To provide a profound basis for this hypothesis regarding the significance of the flow sources, we investigate the contributions of each individual term. Additionally, we will justify that the partial time derivative of the incompressible flow pressure is more significant with respect to the resulting acoustic spectrum in the far-field [18].

Convective Background Flow Source
We define the Helmholtz number He = f L/c (L being a characteristic length scale) and the Mach number Ma = U /c (U being a characteristic velocity scale, f the frequency, and c the speed of sound). Then, the partial time derivative in Equation (4) scales with the Helmholtz number and the convective operator in Equation (4) with the Mach number.
Regarding the dimensional analysis of the terms, we investigate the contribution to the aeroacoustic source terms of the material derivative Equation (4). Dimensionless analysis of the right-hand side volume term in the time-harmonic form yields With increasing frequency, the first term will play a dominant role and is localized between the VF and the fVF (will be shown in Section 3.2). The second term plays a role at higher Mach number and in regions where the flow pressure gradient is aligned or anti-aligned with the flow velocity. In the case of phonation the second term is large within the glottal gap (between the VFs) and is strongly modulated by the vocal fold motion. Neglecting the second term means that we introduce errors by neglecting sources inside the VF gap, especially at low frequencies where the first term has a minor impact.

Simulation Overview
Regarding the previously described simulation model, a 3D field visualization of flow quantities and aeroacoustic sound sources is of interest.Each visualization was performed at characteristic instantaneous time steps and in the frequency domain at specific frequencies.
Firstly, we investigated the sources at four time instances during one VF oscillation: 1.
the VFs are entirely closed (minimum opening), 2.
the VFs are opening, 3.
the VFs are entirely opened (maximum opening), 4. and the VFs are closing.
The VF opening states are visualized in Figure 3 by the GAW Secondly, we study the aeroacoustic source in the frequency domain: • the VFs fundamental frequency f 0 = 148 Hz, • and the VFs non-harmonic (e.g., f = 3589 Hz). Based on the evaluation, we analyze the different variables of the PCWE right-handside. Firstly, we will visualize the incompressible flow pressure p ic as the primary variable of the source term. Secondly, we investigate the fluctuating velocity |u| and the mean velocity |u| of the flow field, being a characteristic quantity of the source. Thirdly, we visualize the gradient of the incompressible flow pressure |∇p ic | as a main building block of the right-hand-side. Fourthly, the right-hand side of Equation (5) represents the two components of the acoustic sources ∂p/∂t and the convective source term u · ∇p multiplied by a scalar factor −1/ρc 2 . Finally, we investigate the right hand-side of Equation (5) systematically.

Aeroacoustic Source Visualization
In this section, we will explain the source vizualizations in time and frequency domain. Evaluating both enables us to interpret specific source characteristics. Firstly, the temporal evolution is connected to the source field concerning characteristic time steps that are connected to the VF movement. Secondly, the frequency resolution let us conclude where we find dominant sources with respect to the frequency.

Flow Quantities
Before we start with the visualization of the aeroacoustic source terms, we visualize the flow field. In particular, the flow pressure distribution, the turbulent kinetic energy (TKE), the velocity field, the mean velocity, and the gradient of the flow pressure are investigated systematically. Figure 4 shows the flow pressure distribution and contour surfaces for four time instances. The contour values have been selected to provide an illustrative visualization of the physical quantity inside the supra-glottal area. In the case of opening vocal folds, the high value of the pressure and the flow pressure contours indicate that flow is accelerated and turbulent structures are blown out of the region between the false VFs. The highly developed turbulent structures are caused by a local flow pressure decrease after the VFs (indicating a developing glottal jet) when the VFs are opening. This process continues until the VFs are fully open. In the case of fully open VFs, the flow pressure iso-contours indicate a compact glottal jet. Due to the high jet velocity the dominating pressure contour is negative. From this time instance on, the VFs are closing and constrict the glottal gap. This constriction reduces and diffuses the velocity jet and the flow pressure starts to rise in the supra-glottal area. In the case of fully closed VFs, the flow pressure is positive again with spatial fluctuations due to the decaying vortices that are localized in a resting fluid. The flow is cut off, the driving force of the fluid jet is close to zero, and the flow intensifies curling.
Having obtained the finely resolved source term, a field fast Fourier transformation (field FFT) is performed in MATLAB © . The field FFT is an FFT of the time signal, performed for every nodal value on the computational grid. For the following plots, the oscillation mode of the Fourier transformed nodal values is chosen to be visualized time-harmonically employing iso-surfaces [48]. Figure 5 shows the pressure distribution for two characteristic frequencies. The visualization at the VF fundamental frequency shows that pressure contours dominate the whole domain. The contour visualizes a gradual pressure drop in x direction. The spatial distribution of the pressure at this frequency is equivalent to the shape of the first time derivative of the pressure. Therefore, the first term of the PCWE source term will reach far into the sub-glottal region. This fact will be further investigated by Figure 6 and the next paragraph. In contrast to the fundamental frequencies and the higher multiples of it, we investigate the pressure distribution at a different frequency. This visualization has the shape of propagating vortical structures, depicted as differently colored conglomerates indicating a phase shift.   Fluid pressure distribution on the center line. Velocity profile along the center line. In this paragraph, we provide evidence for the dominant spatial distribution of first term in the PCWE source. Figure 6 shows the flow pressure and velocity within the domain along the center line in x-direction at the above introduced phases of VF oscillation. We have to note that the pressure values at initial and final position are constant as a consequence of the flow boundary condition. In general, the pressure distribution is segmented into four areas along a streamline. Firstly, the pressure increases partly due to the approaching high flow resistance. Secondly, within the glottis gap the pressure drops significantly and the flow is accelerated. Thirdly, downstream the glottis gap, vortical and turbulent structures cause pressure and velocity fluctuations. Finally, the pressure and the velocity fluctuations decay towards the flow boundary. Again, we analyzed four time instances and found that dominant pressure changes occur within the whole domain. If the glottal gap is narrow, then the pressure drop is high and vice versa. Both the pressure gradient change and the velocity change inside the glottal gap are large over one full VF cycle. As a consequence, the time derivative of the incompressible pressure is large inside the upstream and downstream region (based on the origin of the x coordinate).
Based on these four time instances, we determine the phases, the flow undergoes during one VF oscillation cycle. Firstly, the resting vortical structures are blown out. Secondly, a new stable glottal jet is formed. Thirdly, this jet is decelerated. Fourthly, deceleration curls up the flow structures into vortices. Figure 7 displays the mean velocity magnitude and the TKE, respectively. The mean velocity is an important quantity inside the PCWE source term. It features in the convective source terms. As indicated, the shear layer zones are also subject to the dominant pressure structures. Furthermore, inside the glottal jet, the convective source terms will be large. The TKE is a measure of the mean kinetic energy that can be associated with vortices in the turbulent flow. As it is an averaged quantity, it does not change drastically during a full cycle and therefore only one time instance is visualized in Figure 7. As observed previously, it shows that most of the turbulent structures are located inside the glottal jet region overlapping with the regions of high mean velocity. Taking a closer look at the flow velocity, Figure 8 gives illustration of the velocity magnitude in the frequency domain. Just like for the flow pressure, at the VF fundamental frequency f 0 , the velocity changes globally and propagates downstream. The velocity profile for higher harmonics does not differ significantly from the ones at a non-harmonic frequency. The contour planes become small separated regions scattered around the whole computational domain, visualizing the turbulent structures inside the flow.  Figure 9 shows the incompressible flow pressure gradient magnitude in the time domain. As the flow pressure was discussed previously, many of the following observations are connected to the pressure distribution. While the VFs are opening and turbulent structures are evolving, the gradient magnitude is rather low in all areas, except the highly turbulent regions close to the VFs. Thereafter, the glottal jet flow is visible with a high gradient magnitude inside the jet region caused by turbulent structures that are moving downstream. This process continues, while the VFs are closing. The area in which a large gradient magnitude is present enlarges because of the fluctuations caused by the decelerating flow. Once the VFs are fully closed, the remaining vortical structures generate the visible pressure gradient, in the otherwise resting fluid. VFs closing VFs closed Figure 9. Pressure gradient field magnitude |∇p ic | visualized by contours at |∇p ic | = 2 · 10 5 , 4 · 10 5 , 6 · 10 5 Pa/m.
In the frequency domain, Figure 10 depicts the oscillation mode of the incompressible flow pressure gradient magnitude |∇p ic |. For the VF oscillation frequency, the timeharmonically visualized flow pressure gradient seems to not change sign throughout the plotted region, indicating an in-phase oscillation in the whole domain. This behavior is most likely due to the global pressure variations that were discussed previously. In contrast, the oscillation mode at higher harmonic frequencies is of similar magnitude as the oscillation mode at non-harmonic frequencies. The main observation is, however, the dominance of the mode at the VF oscillation frequency. VFs non-harmonic 3586Hz (contour at F |∇p ic | = ±5 · 10 3 Pa/m) Figure 10. Pressure gradient visualization for two frequencies.

Aeroacoustic Sources
This section deals with the shape and location of the aeroacoustic sources. We examine the sources in the time and frequency domain and expect to receive significant insight into important flow characteristics. Figure 11 shows the convective part of the PCWE source term and contour lines for four time instances. In general, we can interpret this source term as a projection of the pressure gradient onto the mean velocity (some sort of flow blending). Therefore, we will see large contributions in areas where the flow velocity is large (consequently between the VFs and inside the area of the glottal jet). In the case of opening vocal folds, the source term contours indicate that flow accelerates a plume into the vocal tract looking like a mushroom. This process continues until the VFs are fully open. In the case of fully open VFs, the source iso-contours indicate well developed fluctuations inside the glottal jet. From this time instance on, the VFs are closing. During constriction of the glottal gap, the velocity and the pressure contours still propagate into the vocal tract and widen the contours of the source term. Finally, in the case of fully closed VFs, the driving energy is at minimum. The flow is resting, the previously developed vortices are decaying, and the sources are reduced drastically. Figure 12 shows convective part of the source term at two frequencies in a timeharmonic visualization of the oscillation mode. It again shows the main area of influence of the convective part inside the glottal jet region, as well as the decrease in size of the source structures with increasing frequency for both high harmonic and non-harmonic frequencies. Figure 13 visualizes the partial time derivative of the pressure for four time instances. Firstly, the illustrations show similar trends as the mean pressure convection plots. During opening, structures are formed and convect downstream. When the VFs are fully open, the sources show vortical structures that are connected to a stable jet. During closing, the jet energy is reduced which causes a negative pressure envelope. Furthermore, the flow resistance between the VFs increases rapidly and lowers the pressure level towards zero inside the sub-glottal region during closing. This increasing flow resistance causes negative source structures for the closed VFs. From this analysis, we can learn two important general facts: Firstly, if the VFs are not fully closing the source energy (at specific time instances) VFs motion does not provide as much source energy to the flow-induced sound sources as in the case of a healthy voice. Secondly, if the VFs are not fully closing the source energy during one cycle is not cut off entirely and therefore turbulent and vortical structures are permanently present. These turbulent structures increase the broadband component of the voice signal. The findings highlight the importance of glottal closure for a high voice quality and supports the research output of several studies reporting smaller harmonic amplitudes of a voice affected by glottal insufficiency [49,50]. VF closing (contour at u · ∇p ic = ±5 · 10 6 Pa/s) VF closed (contour at u · ∇p ic = ±5 · 10 6 Pa/s) Figure 11. u · ∇p ic as it is present in the PCWE formulation during a full cycle. f 0 (harmonic; contour at F u · ∇p ic = ±10 5 Pa/s) 3589Hz (non-harmonic; contour at F u · ∇p ic = ±10 5 Pa/s) Figure 12. Visualization of F u · ∇p ic for two frequencies. Figure 14 shows the spatial distribution of the time derivative of incompressible pressure in the frequency domain at two characteristic frequencies. Based on the Fourier transformation, the shapes are tightly connected to the pressure distribution of Figure 5. The visualization at the VF driving frequency shows that source contours dominate the whole visualization domain. The spatial distribution of the first time derivative of the incompressible pressure is similar to the shape of the incompressible flow pressure at this frequency. Thus, the first term reaches far into the sub-glottal region (see Figure 6). At frequency ( f = 3589 Hz), we see that the incompressible pressure distribution and the time derivative are nearly equivalent since the Fourier transformation features a prefactor of −iω. This source visualization shows propagating vortical and turbulent structures after the false VFs. From these graphs, we learn that with increasing frequency the characteristic length of the source structures is reduced. f 0 (contour at F ∂p ic /∂t = 6 · 10 5 , 4.5 · 10 5 Pa/s) 3589Hz (contour at F ∂p ic /∂t = ±1 · 10 5 Pa/s) Figure 14. Visualization of F ∂p ic /∂t for two frequencies. Figure 15 shows the convective source term, the local time derivative of the pressure and the sum of those. Therefore, the corresponding values at the centerline in the xdirection are evaluated at the four characteristic time steps that were used in the previous discussion. To calculate the convective source term on the centerline, the pressure gradient was approximated by the local derivative in x-direction ∂p ∂x , which is valid, because the yand z-components of the mean velocity u are negligible inside the center yz-plane of the whole computational domain.
Based on Figure 15, we can explain the results of the previous source term analysis in [43]. First of all, the convective source part u · ∇p is localized inside the glottis gap and is of about one magnitude larger than the time derivative source part ∂p ∂t . However, the time derivative has a larger spatial extend and the integral contribution is larger (discussed later in Figure 16). This global sum over the source terms gives an expectation of the relevancy in the resulting voice sound spectra in the far field. This let us conclude that the convective part is probably not so relevant. From the time instances in Figure 15, we learn that if the gap is narrow, the gradient term is larger, and dominates the visualization (as featured in [43]). If the gap widens, the peak of the convective term is reduced significantly. For a fully open glottis gap, the convective part and the time derivative part have a similar magnitude. This let us conclude that in the case of glottis insufficiency, we see that the convective term is decreases with increasing insufficiency [43]. Mostly, this was found to be a reduction of the pressure gradient with increasing insufficiency [49].  Having extensively investigated the spatial distribution of the acoustic source term in the time and frequency domain, the source terms are now also investigated globally, to get an idea of the energy content related to certain frequencies. For doing so, the absolute value of the Fourier transformed, conservatively interpolated acoustic source term (RHS of Equation (6)) is summed over all nodes inside the source region. The resulting spectra of the two source term contributions of the PCWE are depicted in Figure 16. Therein, the non-harmonic contribution to the source term is depicted by simple thin lines, whereas the harmonics of the VF vibration are indicated by dots. The spectrum essentially represent the energy content of the primary voice signal, which is the arising sound before it is modified by the filter characteristics of the vocal tract. The result shows that the convective term has a noteworthy contribution to the total source term in the low frequency range up to about 800 Hz (5 f 0 ). In general, the tonal components dominate the spectrum in the lower and middle frequency range, whereas in the higher frequency range the broadband contribution increases and has considerable amplitudes. Regarding a prospective propagation simulation, we can conclude that the convective term might have an impact on the resulting acoustic spectrum in the lower frequency range. Figure 17 shows the total PCWE source term during a full cycle. Again, we see the four phases. Firstly, the developing glottal jet during opening VFs. Secondly, a fully developed glottal jet at maximum strength. Thirdly, the starting vortex decay during closing VFs. Finally, the vortex curling in the stagnating fluid for the fully closed VFs. Figure 18 shows the oscillation modes of two frequencies. As the PCWE source sums both previously analyzed source terms ∂p ic /∂t andv · ∇p ic and the Fourier transformation is a linear operation, the oscillation modes should give a superposition of the oscillation modes of each component. Therefore, at the VF oscillation frequency, the time derivative part of the source term dominates the profile. At high frequencies -both harmonic and non-harmonic -the convective source term becomes more important to the shape of the oscillation mode. This is in contrast to observations made in Figure 16, where the convective source term decreased in amplitude for high frequencies and therefore dwindled in influence, compared to the local time derivative. However, it has to be noted that all previous visualizations covered only a small part of the vocal tract and, as the convective source term is much more localized around the vocal fold, this lets us conclude that the decreasing spatial extent with increasing frequency is more important to the global source energy, which is not as apparent for ∂p ic /∂t.
To summarize the observations, the convective part of the PCWE source term is expected to have a smaller impact on the acoustic spectrum compared to the time derivative part. We prove this expectation by three acoustic propagation simulations that are performed using the presented simulation model with each part of the source term exclusively as well as the whole source term. The acoustic pressure p a is extracted at a microphone position inside the propagation region (see Figure 2), which is located at a distance of 1 m in a straight line to the center of the end of the vocal tract. Figure 19 shows the sound pressure level (SPL) spectrum for these three simulations. Additionally, it includes the SPL spectrum measured at the experimental model. The simulations are based on [49]. As expected, using the convective part of the source term only results in a low SPL that further decreases with increasing frequency. In contrast, the SPL gained from the other two simulations are visually indistinguishable, especially for high frequencies. Overall the obtained spectrum matches well with the measurement data for frequencies f > 2 f 0 = 296 Hz [18]. Since the cutoff frequency of the anechoic chamber where the measurement was conducted is 296 Hz, acoustic free field conditions cannot be assured below this frequency which is thought to be the reason for the deviation between measurement and simulation at this frequency range. 3589Hz (contour at F ∂p ic /∂t + u · ∇p ic = ±10 5 Pa/s) Figure 18. Visualization of the totak PCWE source term at two frequencies. Figure 19. Sound pressure level of simulations using ∂p ic /∂t, the convective part, and the full PCWE source term as acoustic source in comparison with experimental measurement data at a microphone positioned 1m from the end of the vocal tract (see [18]).

Conclusions
The aeroacoustic sources of human voice production are investigated based on a validated voice model. Including the convective part of the source term yields the most accurate results. The error is reduced by 0.6% compared to the validated model [18]. Neglecting convective source effects yields in lower accuracy, but overall the convective term can be neglected for human phonation providing a potential of a reduction of computational effort and resources.
Firstly, the Fourier transformed sources let us conclude where we find dominant sources for a specific frequency. Regarding the local pressure time derivative at the base frequency and the lower frequency harmonics, the acoustic sources are found in the supraglottal area. At higher harmonics and non-harmonic frequencies, the dominant sources are located mainly in the area of high fluid velocity, being inside the glottal jet and in areas of turbulent structures.
Secondly, the temporal evolution is connected to the source field concerning characteristic time steps that are connected to the VF movement. In the case of opening vocal folds, the flow is accelerated and turbulent structures are blown out of the region between the false VFs. A new glottal jet is formed when opening the VFs. This process continues until the VFs are fully open. In the case of fully opened VFs, a well-developed glottal jet is present. From this time instance on, the VFs are closing and constrict the glottal gap. This constriction diffuses the velocity jet and the structures start to fade out. Strong fluctuations are caused by the decelerating fluid. In the case of fully closed VFs, the vortices are decaying and are localized in a resting fluid. Because the flow is cut off, the driving energy of the flow is zero and the flow starts curling.
Since most voice disorders are a combination of mechanisms that prevent the VFs from closing the glottal gap, the occurring effects on the voice can be partly explained by the discussed time instances. From these findings, we can learn how vocal fold disorders transform into specific characteristics of the aeroacoustic sources.
From these findings, we can learn how pathological effects on the vocal folds transform into specific characteristics of the aeroacoustic sources. Since most voice disorders are a combination of mechanisms that prevent the VFs from closing the glottis, the occurring effects on the voice can be partly explained by the discussed time instances.
We learned that if the gap is narrow, the pressure gradient term is large and dominates the visualization but is not contributing to the sound spectra to a large extend (0.6% accuracy increase). If the glottal gap is widened, the peak of the convective term is reduced significantly. For a fully open glottis gap, the convective part and the time derivative part have a similar magnitude. Finally, we have to note again that the convective effect dominates the source visualization of the PCWE source term (simply by the magnitude between the VFs), which misleads in a possible conclusion for voice disorders. As it was reported previously, we see that the convective term is decreases with increasing insufficiency. Although this convective source term is larger in areas of the glottis gap and dominates the source visualization, its overall contribution in the acoustic spectrum is negligible. Institutional Review Board Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author (S.S.).

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: