Towards a Clinically Applicable Computational Larynx Model

Featured Application: Potential application in medical diagnostic and therapeutic processes in the ﬁeld of voice disorders. Abstract: The enormous computational power and time required for simulating the complex phonation process preclude the effective clinical use of computational larynx models. The aim of this study was to evaluate the potential of a numerical larynx model, considering the computational time and resources required. Using Large Eddy Simulations (LES) in a 3D numerical larynx model with prescribed motion of vocal folds, the complicated ﬂuid-structure interaction problem in phonation was reduced to a pure ﬂow simulation with moving boundaries. The simulated laryngeal ﬂow ﬁeld is in good agreement with the experimental results obtained from authors’ synthetic larynx model. By systematically decreasing the spatial and temporal resolutions of the numerical model and optimizing the computational resources of the simulations, the elapsed simulation time was reduced by 90% to less than 70 h for 10 oscillation cycles of the vocal folds. The proposed computational larynx model with reduced mesh resolution is still able to capture the essential laryngeal ﬂow characteristics and produce results with sufﬁciently good accuracy in a signiﬁcant shorter time-to-solution. The reduction in computational time achieved is a promising step towards the clinical application of these computational larynx models in the near future.


Introduction
Voice production is the basic and most important communication tool for humans. The ability of a person to produce normal voice or, more specifically, voiced speech depends on different factors and may change during the lifetime. These factors include predisposition, gender, training, age and state of health [1,2]. Voice disorders restrict or even inhibit the oral communication of patients. As a result, both personal and economic restrictions arise that can lead to isolation and depression in patients [3][4][5]. Furthermore, the productivity reduction of the work-related disabilities of these people can be a significant social constraint [6,7]. In this context, Ruben [8] estimated the loss to the US Gross National Product to be up to $186 billion per year assuming that only 5 to 10% of the US population suffer from communication disorders. This trend was also predicted to increase due to the change to a communication-based economy in the present century. The associated annual healthcare cost of these patients totals hundreds of million dollars, which is comparable to that for other chronic diseases [9]. In general, the negative impact of voice disorders on quality of life targets not only the occupational group but also the general population of all societies [10][11][12][13][14].
Phonation is the process of voice production in the human larynx. Periodic oscillations of the vocal folds are induced by the airflow from the lungs and produce the basic sound of the voice. This flow-induced sound is filtered in the vocal tract, forming the typical spectral characteristics of voiced speech. The production of sound sources in the larynx is a complex process which results from the fluid-structure interaction between the airflow and the structural mechanics of the tissue of the vocal folds. A major problem in voice research and clinical diagnostics is the restricted access to the larynx during phonation, which allows only visual inspection of the vibration of the vocal folds with stroboscopy [15] or high-speed videoendoscopy [16,17]. Other measuring techniques include acoustic voice measurement and electroglottography (EGG) [18], both of which provide indirect measurements of the process in the larynx. As a consequence, synthetic [19] and ex-vivo [20] larynx models have been established in experimental voice research that provide almost unconfined access to the laryngeal flow region. As an alternative and complementary method, computational models have been developed in recent decades. With the development of advanced numerical methods and improvement of computational power, more aspects of the phonation process, from glottal flow analysis [21][22][23][24][25] to mechanical analysis of oscillations of the vocal folds [26][27][28] and acoustic analysis of voice sources [29][30][31][32], have been investigated numerically in detail. These computational models have the advantage of simulating the entire process with enormously high spatial and temporal resolution in the flow region and the structural domain of the vocal fold tissue. However, the ultimate goal of employing these computational tools in clinical work, for instance as a clinically standardized pre-surgery tool, has not yet been achieved. The main reason is the high computational costs for these highly accurate simulations.
Computational fluid dynamics (CFD) and structural mechanics have been employed in recent decades to investigate biomedical processes, especially hemodynamics [33][34][35]. These computational tools have also been applied for potential pre-and post-surgical planning of cardiovascular [36][37][38] and upper airway systems [39][40][41][42][43]. Mylavarapu et al. [42] used CFD tools to develop an effective surgical strategy for patients with obstructive airway disorders by comparing different virtual surgery techniques for treating coexisting glottic and subglottic stenosis by measuring the airway resistance and pressure distribution in airway lumen. Furthermore, these tools have been employed for post-operative assessment after glottis-widening surgery to improve breathing in patients with bilateral vocal fold paralysis (BVFP) [43] and also in post-endoscopic sinus surgery of nasal cavity airflow [40].
In the field of voice disorders, only a few studies have been focused on the use of computational tools to directly support clinical work. A major problem is the incorporation of different scales in the flow and tissue dynamics and acoustics and their interactions in the phonation process. Mittal et al. [44] employed a high-resolution model based on the immersed boundary method for a fully coupled phonation process as a pre-surgery tool to identify the optimal placement position of an implant for the medialization of the paralyzed vocal fold. Additionally, they evaluated the sound sources with a hybrid approach. Unfortunately, they did not mention the computational costs of their simulations to evaluate the potential application in clinics. Xue et al. [28,45] later employed this model to investigate phonatory disorders based on vocal fold left-right asymmetry and tension imbalance and reported that a huge computational time and power were required for each of their simulations. Smith and Thomson [27] also investigated numerically another pathology, subglottic stenosis (SGS), without providing the computational costs. In a recent study, Zhang [46] used a reduced-order model of the vocal fold dynamics based on the eigenmodes of oscillations to reduce the computational effort in order to find a real-time simulation model of voice production. However, for the flow excitation, they applied a quasi-one-dimensional model. Although they claimed that the effects of flow three-dimensionality have less importance in the phonation process, many previous studies proved the significance of these effects such as glottal jet instabilities, its axis switching and the transition to turbulence on laryngeal aerodynamics and acoustics [29,[47][48][49].
In a previous study [50], we showed that pure flow simulation in the numerical larynx model is able to capture all the characteristic features of laryngeal aerodynamics if realistic oscillation patterns of vocal folds are applied as the moving boundaries on the model. Furthermore, such models with forced motion of vocal fold oscillation have been reported in the literature as powerful tools for the analysis of sound generation within the larynx [29][30][31]51] with the advantage of saving the computational time consumed by a fully coupled flow-structure interaction. In our study, it was demonstrated that the total wall time (elapsed real-time) to simulate laryngeal aerodynamics with high accuracy is in the range of weeks even when employing high-performance computing (HPC) resources. This was found to be in agreement with other studies. Therefore, CFD analysis may not yet be applicable in clinical diagnostics and treatment.
The question arises of whether the accuracy of the computational models can be reduced without losing the ability to provide the relevant information for supporting physicians' planning and decision-making processes during the treatment of patients with voice disorders. To answer this question, we reduced systematically the spatial and temporal resolution of the model and evaluated the results considering the computational costs. By making optimal use of a high-performance parallel computer cluster, we performed Large Eddy Simulations (LES) of laryngeal aerodynamics with externally imposed vocal fold dynamics for different mesh resolutions. The validity of the simulations was further evaluated by comparing the results with experimental data obtained from a synthetic larynx model with silicone vocal folds.

Experimental Model and Measuring Techniques
The computational larynx model is based on the experimental model introduced by Kniesburges [52] that has been further used in several other studies [32,[53][54][55]. The experimental setup consisted of a mass flow generator, the subglottal channel containing the synthetic vocal folds and finally the supraglottal channel. Both of the channels had a rectangular cross-section of 18 × 15 mm 2 and a length of 190 mm. The vocal fold geometry was based on the M5 model introduced by Scherer et al. [56] and Thomson et al. [57]. The life-size synthetic vocal fold models were produced as single-layer models consisting of silicone rubber with elastic properties of Poisson's ratio ν = 0.499 and Young's modulus E = 4.4 kPa. These silicone rubber vocal folds showed flow-induced periodic oscillations at a frequency of 145 Hz. In this context, it should be noted that vocal fold oscillations in humans can easily exceed several hundreds hertz [58].
The glottal area waveform (GAW) was extracted from high-speed video footage (with a sample rate of 4000 frames per second) using the in-house software package Glottis Analyse Tool-GAT (Division for Phoniatrics and Pediatric Audiology, University Hospital Erlangen). The vocal fold models reproduced the physiological characteristics of oscillations such as periodic glottis closure and a mucosal wave-like motion with its typical convergent-to-divergent shape change of the glottal duct during an oscillation cycle [55]. Synchronously to the high-speed video visualization, the pressure distribution along the supraglottal channel was measured by pressure sensors mounted on three walls of the supraglottal channel at discrete positions, as depicted in Figure 1.
To visualize and measure the flow velocity field, a laser light sheet was guided into the supraglottal channel to illuminate the central coronal plane. The flow was seeded with water droplets and the droplet patterns were recorded with a high-speed camera. The region of interest (ROI) for the flow visualization was located immediately downstream of the vocal folds and covered 55 mm of the supraglottal channel. A phase-locked particle image velocimetry (PIV) technique was used to determine the flow velocity field at discrete instances or phase angles during an oscillation cycle. A subsequent averaging computation of the flow velocity at each phase angle yielded the phase-averaged flow field. More details about the synthetic larynx model and the experimental measuring techniques can be found elsewhere [52][53][54][55].  Figure 2a shows the flow domain of the computational larynx model, the shape of which is based on the experimental model shown above. The geometry of the vocal fold models and the rectangular cross-section of the sub-and supraglottal channels were kept equal to those in the experimental model. The subglottal channel of the experimental model was shortened to 25 mm in the computational model to avoid the unnecessary computational cost. The length of the supraglottal channel was also reduced to 60 mm based on the prior measurement of the glottal jet length. Another case with a longer supraglottal channel (190 mm) comparable to the experimental model was also simulated and will be introduced in Section 2.2.3.

Geometry of Computational Model and Vocal Fold Motion
The flow-induced oscillations of the experimental vocal fold models were reproduced in the computational model by externally forcing the motion of the vocal folds. This motion exactly reproduced the GAW ( Figure 2) obtained from the high-speed visualization in the experiments. Furthermore, the same time point of the convergent-to-divergent shape change of the glottal duct and the same relationship between the vocal fold motion and the sub-and supraglottal flow devolution were maintained from the experiments.
To model the mucosal wave-like motion of the vocal folds, a translational-rotational motion was applied on the medial surface of the vocal folds (blue surface in Figure 2c), generating a maximum convergent angle θ max−conv = 10 o and a maximum divergent angle θ max−div = 20 o of the glottal duct. These values were approximated from the high-speed footage of the oscillations of the silicone vocal folds and values reported in the literature [51,59,60]. The translational and rotational motions controlled the medial-lateral and the convergent-divergent motions, respectively. Finally, the translational motion was varied along the longitudinal axis of the vocal folds with a sinusoidal function to create the elliptical shape of the glottis constriction as shown in Figure 2c. More details of the modeled motion of the vocal folds can be found in Sadeghi et al. [50].

Numerical Methods
For the numerical model, the unsteady incompressible flow equations were solved using the software package STAR-CCM+ (Siemens PLM Software, Plano, TX, USA) with a finite-volume cell-centered non-staggered grid. LES with a WALE (wall-adapting local eddy-viscosity) [61] subgrid scale model was performed to model the turbulent flow in the larynx. A central difference scheme with second-order accuracy was used to discretize the convective and diffusive terms of the Navier-Stokes equations. These pressure-velocity linked equations were solved non-iteratively within a PISO (Pressure-Implicit with Splitting of Operators) algorithm [62]. An algebraic multigrid (AMG) method with a Gauss-Seidel relaxation scheme and biconjugate gradient accelerator was used to solve the final linear system of equations. The density and kinematic viscosity of air were considered to be ρ = 1.18 kg/m 3 and ν = 1.57 × 10 −5 m 2 /s, respectively.

Boundary Conditions
At all walls of the larynx models, no-slip no-injection boundary conditions were applied. The walls of vocal folds were defined as moving boundaries. At the inlet and outlet of the flow domain, different pressure boundary conditions were applied to identify the best combination of constant or temporal pressure evolution based on the experiments. Furthermore, two lengths of the supraglottal channel were considered as described in Section 2.2.1. In total, five simulation cases were defined as specified in Table 1. Cases C1-C4 had the shorter supraglottal channel (60 mm) and case C5 had the longer channel (190 mm) equal to the experimental model to evaluate the effect of shortening the supraglottal channel on the results. The inlet and outlet pressure boundary conditions were either pressure waveforms taken from the experiments (P s (t) and P 60 (t) in Figure 3) or their mean values. P s (t) and P 60 (t) are the pressure waveforms measured at the subglottal channel and 60 mm downstream of the glottal exit of the synthetic larynx model, respectively. The mean values of the subglottal and supraglottal pressures were P s = 3251 Pa and P 60 ≈ P 0 , respectively. P 0 is 0 Pa, being the relative pressure with respect to the environment. P 0 was also applied at the outlet of case C5 similar to the experimental model. Although the applied subglottal pressure is higher than the physiological values during normal phonation, such high lung pressures have been detected in special cases of phonation such as shouting, loud speech or singing [1].

Mesh Generation and Mesh Motion
The starting mesh was the fine mesh proposed by Sadeghi et al. [50], which was subjected to a mesh independence study. Since the subglottal pressure in Sadeghi et al. [50] was lower than the current study, the mesh independence study was reprocessed and extended. The maximum axial velocity at the glottis and 10 mm downstream of the glottal duct exit, the mean static pressure along the center axis and the flow rate were deviated by only 0.4%, 0.01%, 1.2% and 0.07%, respectively, from the finest mesh in the mesh independence study containing approximately 40 million cells. Therefore, the proposed mesh is valid for the higher subglottal pressure applied in this study. The mesh consisted of hexahedral elements that were refined in the glottal area (glottis refinement) and near supraglottal region (Figure 4). The overset mesh approach provided by STAR-CCM+ was employed to manage the mesh motion for moving boundaries. In this method, a fixed Eulerian background mesh is assigned to the whole numerical domain and an overset mesh wrapped around each vocal fold. The cells in the overset regions are always active and deform according to the movement of the boundaries in an Arbitrary Lagrangian-Eulerian (ALE) manner. The cells in the background region are fixed and become inactive wherever they overlap with the overset cells. As a result, the total number of cells changes during an oscillation cycle depending on the distance between the vocal folds. Furthermore, some cells are required between the overset and background regions as interpolation layers. Therefore, complete closure of the glottis was not possible in this model and a minimum glottal gap of 0.2 mm was forced in the glottis during the glottal closure, which caused negligible leakage.
For this starting mesh, the total number of cells ranged between 2.1 and 2.9 million cells. The corresponding time step size was 1 × 10 −6 s. This led to a mean CFL number of 5 which is appropriate for implicit solvers [63,64]. More details of the mesh motion strategies and the mesh independence study can be found in Sadeghi et al. [50].

Reduction of Mesh Resolution
To reduce the computational costs of the numerical larynx model, the mesh resolution was reduced in this study. The aim was to answer the question of how far it is possible to decrease the mesh resolution and still obtain relevant and significant results from the numerical model. Starting with the finest and base mesh (MB), three meshes (M1-M3) were generated with systematically coarsened spatial resolution. The resulting mesh parameters and total numbers of cells are summarized in Table 2. For validation, the results obtained with the coarse meshes were compared with those obtained with the base mesh.
As mentioned by Mihaescu et al. [22], for LES modeling the mesh size should be of the order of the Taylor micro-scale (λ T ), which is a measure of the length scale of turbulence. By considering the integral length scale (a measure of the large-scale eddies) to be one order of magnitude smaller than the geometric characteristic length, which in the larynx model is the glottal width (D g ), 10% turbulence intensity and the maximum axial velocity u xmax from the simulation with the base mesh, the Taylor micro-scale can be calculated as follows [22,65]: The mean value of the Taylor micro-scale for the numerical simulation with the base mesh was approximately λ Tmean ≈ 0.085 mm. Therefore, the mesh size in the glottal duct, which contains the finest mesh, was not increased above this value during the process of reduction of the mesh resolution.
Furthermore, in order to exclude the impact of the CFL number variation, the time step size was increased synchronously to the increasing mesh size with the same factor.
The requirement for some cells between the vocal folds in the closed phase of the cycle was mentioned in Section 2.2.4. The number of remaining cell layers in this area depends on the mesh size in the glottal duct. Considering this, the number of cell layers in the minimum glottal area of M3 was lower than the others, which caused deactivation of some of these cells by the overset mesh solver. As a result, the deactivated cells were replaced by wall boundary conditions and flow through them was prevented. Therefore, it was partially working like a blockage of the flow at the time of glottal closure. Table 2. Summary of the meshes used for the grid resolution reduction study and their properties. The basic and glottis refinement areas are represented in Figure 4.

Validation of the Computational Model
To demonstrate the validity of the computational model and find the most appropriate combination of boundary conditions, the five simulation cases shown in Table 1 were compared. Figure 5 represents the pressure evolution at the pressure sensors in the supraglottal channel of the synthetic larynx model (P1-P5 in Figure 1) for one oscillation cycle and pressure evolutions obtained from the simulations. The three experimental pressure values at an axial position were averaged and compared with the pressure values acquired at the corresponding axial position at the center axis of the numerical model. The experimental data were best reproduced by the simulation cases C3 and C4, which have the shorter supraglottal channel, and the time-varying pressure P 60 (t) taken from the experimental model at x = 60 mm (Figure 3b) as the outlet pressure boundary condition. In contrast to the cases C1 and C2, the choice of the time-varying outlet pressure in C3 and C4 is the key feature to produce a supraglottal flow field that is consistent with the experiments. The configuration C5 with longer supraglottal channel did indeed improve the consistency of the pressure distribution over 2/3 of the cycle, but showed deviations in the last third of the cycle, as shown in Figure 5. Furthermore, the corresponding L 2 relative error norm between the experimental and simulated pressure evolutions (Err L 2 rel = ∑ T t=0 (P (t) − P(t)) 2 / ∑ T t=0 P (t) 2 [66]) at each position provided in Table 3 revealed that C4 with the applied mean value of the experimental subglottal pressure (P s = 3251 Pa) at the inlet has the smallest deviation of the pressure evolution at all five axial positions. Especially in the region immediately downstream of the vocal folds, the L2 error of C4 is significantly smaller than for C3. Therefore, the most appropriate simulation case regarding the boundary conditions and length of the supraglottal channel is C4. Figure 6 shows the velocity fields in the supraglottal channel in the mid-coronal plane of C4 compared with the averaged flow field obtained from the PIV measurements. Configuration C4 reproduced the typical glottal jet deflection caused by the interaction between the jet and a large recirculation area in the supraglottal channel [50,53,59,67]. Moreover, the simulated flow field shows small-scale fluctuations as shear layer instabilities that do not exist in the PIV-based flow field owing to the phase averaging. The length of the jet is also comparable between the numerical and experimental models, with slightly higher velocity values in the numerical model. This is also visible in the velocity evolution graphs in Figure 7, representing the magnitude of the velocity along the center axis of the supraglottal channel for an oscillation cycle of the computational and experimental models. The trends of the velocity evolutions are similar, especially at 5 and 10 mm downstream of the glottis exit. At larger distances, higher strength turbulent flow structures occurred in the instantaneous flow field of the numerical model. As mentioned above, those turbulent flow fluctuations are not evident in the phase-averaged flow field in the PIV measurements. The higher velocity of the jet may be due on the one hand to the phase averaging of the experimental velocities and on the other to the different subglottal pressure values between the experiment for the PIV measurement and that for supraglottal pressure measurement. The latter is the basis of the numerical model in this study. Furthermore, it seems that the jet started to develop sooner and shut off later in the cycle in the numerical case, which may be due to the different glottal area waveforms in the two experiments and also the small leakage at the beginning and end of the cycle in the numerical model. Table 3. L 2 error norms of the pressures shown in Figure 5 relative to those obtained in the experiment. C1-C5 are the simulation cases introduced in Table 1 5. Pressure evolution for a cycle of oscillation at the pressure probes P1-P5 (parts (a-e)) in the supraglottal region introduced in Figure 1 for the experimental larynx model and the numerical simulation cases C1-C5 introduced in Table 1.

Reduction of Computational Costs
The results obtained with different mesh resolutions M1-M3 (Table 2) were compared qualitatively and quantitatively with the results of the base mesh to evaluate their validity.
The supraglottal flow fields at a representative instance in the last quarter of the oscillation cycle are displayed in Figure 8. It can be seen that the deflection of the glottal jet, which is a characteristic feature of the laryngeal flow, was reproduced with all the mesh resolutions. In general, the jet skews either upwards or downwards depending on the flow conditions in the supraglottal channel [50,53]. In the instantaneous oscillation cycle shown in Figure 8, upward jet deflection was found for M1 and M3 and downward deflection for MB and M2. Regarding the flow field within the glottal duct (Figure 9), symmetry of the glottal jet was preserved by all the meshes. However, flow separation occurred more upstream for meshes MB and M3 whereas the jet remained attached to the glottal walls for M1 and M2. Thus, only M3 reproduced the typical intraglottal vortices in the region of separated flow. Thereby, the separation points were similarly estimated in M3 comparable to MB. These intraglottal vortices have been identified in the literature [68,69] to increase the closing forces on the vocal folds and the intensity of tonal sound.
According to Hussain and Husain [70], elliptical jets show switching of the main axes in the cross-sectional plane along their streamwise evolution. This three-dimensional flow phenomenon has also been reported in experimental and numerical larynx models with elliptical glottis shapes [47,48,50,51,71]. In Figure 10, the iso-surfaces of the velocity magnitude (50 m/s) are provided at a representative time instance (0.58 T). In the coronal perspective ( Figure 10, left column), the diameter of the jet increased during its streamwise evolution, whereas its diameter in the sagittal perspective ( Figure 10, right column) decreased at the same time. This effect, called lengthwise vena contracta [47], was reproduced by all of the meshes. Additionally, the length of the jet was similar in all of them.   (a-d)).
The flow rates during an oscillation cycle are shown in Figure 11 for all four mesh resolutions. Meshes M1-M3 with decreased resolutions produced a very similar trend in comparison with the base mesh MB. During the closing phase of the cycle, M1-M3 exhibited higher flow rates, with the largest deviations for M1 and M2. The reason is the full attachment of the flow to the glottal walls in these meshes for the presented cycle whereas the flow separated from the walls in M3, similar to the base mesh. Subsequently, the increase in flow rate is the result of the divergent expansion of glottal flow [72] in M1 and M2 (Figure 9b,c). The deviation of the flow rate in M3 was lowest owing to the very similar flow pattern to that of MB during the closing period.
At the very beginning and end of the cycle with the closed glottis, MB, M1 and M2 produced flow leakage, which degenerated almost to zero in M3 owing to the blockage of the glottal duct as described in Section 2.2.5.     Summarizing all these comparisons of flow variables, M3 with the lowest grid density among the three unexpectedly produced the best results in comparison with the base mesh MB. The L2 error norms of the presented parameters relative to the base mesh for M3 are smaller than those for M1 and M2 (Table 4). Furthermore, these errors are one order of magnitude smaller than the validation errors in Table 3. The average number of cells was reduced by more than 50% with the mesh resolution M3 compared with the base mesh. Additionally, the time step size was also increased by a factor of 1.36. As a result, the elapsed simulation time of one oscillation cycle decreased significantly to 19 h in comparison with 79 h for MB. The mentioned elapsed times are related to the parallel simulations using one compute node of the RRZE's Emmy cluster of Erlangen-Nürnberg University, which is introduced in the next section.

Using HPC Resources to Reduce the Time-to-Solution
As already mentioned, simulation of the aerodynamics of human phonation within the larynx is computationally very expensive. To apply the computational model of the larynx in clinical routine, the use of high-performance computers is indispensable. These systems may involve single high-performance workstations with dozens of computing units up to very powerful supercomputers with hundreds or thousands of compute nodes each with tens of cores.
In this study, the RRZE's Emmy cluster of Erlangen-Nürnberg University was employed to run the simulations. The Emmy cluster has a total of 560 compute nodes, each equipped with two Xeon 2660v2 Ivy Bridge chips (10 cores per chip + SMT) running at 2.2 GHz with 25 MB shared cache per chip and 64 GB of RAM. Hyper-Threading or SMT (Simultaneous Multithreading) permits a physical core (a single computing unit) to execute two threads (instruction sequences) simultaneously and hence appears to work like two logical cores sharing execution resources [74].
According to Amdahl's law [75], parallel performance is restricted by the serial part of the work, the dependencies, communication between the working cores and imbalanced allocation of the threads to parallel cores, ineffective use of the shared resources, etc. Therefore, the parallel performance never increases linearly with the increase in the number of cores. Furthermore, as mentioned above, the cluster consists of multiple compute nodes, each equipped with many cores, integrated into two chips. Therefore, the scalability of the simulations in parallel execution depends on (1) the number of parallel cores in one compute node and (2) the number of nodes. In order to check the overall scalability, the simulation first ran on a single compute node with increasing number of cores. In the second step, the simulation was executed with different numbers of compute nodes, each with optimal number of cores from the first step. The parallel scalability of the simulation is shown on the basis of the elapsed simulation time (wall time) and the speed-up factor for increasing cores or nodes. The speed-up factor is the ratio of the wall time of the simulation with one core/node to the wall time with N cores/nodes. Figure 14a shows the wall time (in minutes) for a 0.1 ms simulation time and corresponding speed-up by increasing the number of physical cores within one compute node. The wall time sharply decreased for up to 10 cores, followed by a sudden increase on adding one more core. With further progress, the wall time decreased again slightly by requesting more cores. However, it never reached the minimum wall time obtained with 10 cores. The reason is that 20 threads fitted into the 10 cores of the first chip and adding more cores for computations led to the employment of the second chip. As a result, the communication time between the two chips were more than the gain on adding the extra cores which cause the drop in performance after 10 cores. Henceforth, only 10 cores of each requested compute node were employed for the following simulations.  Figure 14b demonstrates the wall time (in hours) for one oscillation cycle of the vocal folds and the corresponding speed-up as function of the requested parallel compute nodes from the cluster. It can be seen that the wall time experienced a sharp decrease on adding the second node. It decreased further on applying more compute nodes, but its gradient decreased by requesting more nodes. As result, the speed-up factor did not grow very much for a large number of compute nodes and consequently the parallel performance deteriorated. Therefore, seven compute nodes with a parallel efficiency equal to 40% of the linear scaling efficiency (based on the wall time for one node) is proposed as a good choice for running the simulations in a competitive time using reasonable resources.
To sum up, by using a total of 70 physical cores of the Emmy cluster, the simulation of 10 oscillation cycles in this numerical larynx model lasts less than 70 h.

Summary and Conclusions
The largest challenge in applying CFD simulations in clinical diagnostics and therapy is generating reliable results at a very short time-to-solution and with low hardware requirements. In this study, a pure CFD simulation with appropriate boundary conditions was introduced. Employing LES, the aerodynamics of the phonation process was simulated and validated with results of the experimental synthetic larynx model. The vocal fold motion was prescribed based on the oscillations of the synthetic vocal folds. The numerical model reproduced the relevant aerodynamic parameters with sufficiently good accuracy and small modeling errors. As boundary conditions, the pressure at the inlet was set to be constant, equal to the mean value of the experimental subglottal pressure, whereas the outlet pressure reproduced the temporal evolution determined in the experiments.
In the next step, the spatial and temporal resolutions of the numerical model were systematically decreased (25%, 45% and finally 55% decreases in the average number of control volumes) to reduce the computational costs. The results with these coarser meshes were compared with those obtained with the fine mesh to evaluate their validity. In general, the coarser meshes showed good agreement with the fine mesh regarding the flow pattern and aerodynamic parameters. Using the coarsest mesh, the average number of control volumes was reduced to 1.1 million cells and accordingly the time step size was increased to 1.36 × 10 −6 s. As a result, the elapsed simulation time decreased significantly by a factor of four. Finally, it was shown that the optimized use of HPC resources can reduce the wall time of simulating 10 oscillation cycles to 70 h (an approximately 75% improvement from the previous study [50] by using 30% fewer computing units), showing the potential of numerical simulations in clinical applications. However, the potential of reducing the time-to-solution is not exhausted and further improvement is required for an acceptable range of daily clinical routine. As the next steps towards a concrete clinical application, clinical MRI or CT images have to be used to generate a patient-specific geometric flow domain during phonation and the proposed model should be combined with an aeroacoustic model that may then provide new insights into the sound production and propagation process.