Towards a Physiological Scale of Vocal Fold Agent-Based Models of Surgical Injury and Repair: Sensitivity Analysis, Calibration and Verification.

Agent based models (ABM) were developed to numerically simulate the biological response to surgical vocal fold injury and repair at the physiological level. This study aimed to improve the representation of existing ABM through a combination of empirical and computational experiments. Empirical data of vocal fold cell populations including neutrophils, macrophages and fibroblasts were obtained using flow cytometry up to four weeks following surgical injury. Random Forests were used as a sensitivity analysis method to identify model parameters that were most influential to ABM outputs. Statistical Parameter Optimization Tool for Python was used to calibrate those parameter values to match the ABM-simulation data with the corresponding empirical data from Day 1 to Day 5 following surgery. Model performance was evaluated by verifying if the empirical data fell within the 95% confidence intervals of ABM outputs of cell quantities at Day 7, Week 2 and Week 4. For Day 7, all empirical data were within the ABM output ranges. The trends of ABM-simulated cell populations were also qualitatively comparable to those of the empirical data beyond Day 7. Exact values, however, fell outside of the 95% statistical confidence intervals. Parameters related to fibroblast proliferation were indicative to the ABM-simulation of fibroblast dynamics in final stages of wound healing.


Introduction
Voice disorders involve impairments with varying pitch, quality and loudness in daily conversation as well as in occupational voice use such as during voice performance and teaching. Among all voice disorders, vocal fold scarring is considered as one of the most perplexing clinical problems [1][2][3]. Current treatment options to fully repair the fibrotic tissue are deemed to be limited [2][3][4]. Vocal fold scars can develop within the lamina propria after surgical removal of benign or malignant vocal fold lesions. The scarred tissue alters the microarchitecture and vibratory functions of the vocal folds, resulting in a debilitating condition of the human voice, namely, dysphonia [2][3][4][5][6][7][8]. Iatrogenic vocal fold scarring triggers a cascade of cellular and molecular events associated with inflammation and wound repair, whereas the degree of response varies across patients depending on myriad personal and lifestyle factors. Computer simulation has become an appealing approach to integrate and analyze massive patient data towards more personalized and precise medical treatments.
Agent-based computational models (ABM) are often used to simulate complex system dynamics. The basic framework of ABM consists of agents and agent-rules. Agents are decision-making units that interact with other agents and the environment. Agent-rules are formulated to control the action and decision of agents in the virtual world [9]. ABM have been widely applied to ecology, economics, geography, business, political sciences and social sciences [10,11]. For biomedical applications, ABM are used to understand complex acute and chronic diseases such as diabetes, traumatic brain injury, spinal cord injuries, acute liver failure and sepsis [12][13][14][15][16]. ABM can be further developed to numerically simulate individual treatment response of a disease as a function of patient profile [17]. In silico experiments allow the exploration of a much wider parameter space and a much longer time scale than what would be very costly or sometimes impossible with animal and human models. In the era of personalized/ precision medicine, computational models have become an indispensable tool to integrate different forms of patient data (e.g., demographics, clinical, lifestyle etc.) and generate specific disease phenotypes or cohorts for in silico clinical trials [18,19].
Model calibration is an important step to ensure sufficient authenticity of model representation to the real world. Model calibration involves systematically altering the values of parameters in the model in an iterative fashion until the simulated outputs and the observable behavior of the system are as close as possible. Some common parameter estimation methods for ABM calibration have been reported to reduce the uncertainty of the model ( Table 1).
The pattern-oriented approach is used to estimate parameters by comparing observed trends and patterns in the model with empirical data [44,51,52]. For complex models having a large number of parameters, this approach has been limited to the identification of relevant patterns for discovering hidden information [52]. The particle swarm optimization (PSO) algorithm is a heuristic optimization method that is more suitable for computational models with a large number of parameters [47,53]. The PSO investigates the mutual collaboration between particles for sharing the internal information and reducing parameter space by discarding implausible inputs [47,53]. Since PSO has a low convergence rate in the iterative process, it is not considered as ideal for ABM, which would require thousands of iterations for each simulation [53]. Genetic algorithms (GA) are another heuristic optimization method that use a natural selection process to estimate as many solutions as possible simultaneously [48,54]. In GA, the model parameter values are progressively modified using genetic operators namely mutation and crossover, to optimize the model's fitness in predicting the empirical data [48,54]. One limitation of GA is parameter overfitting and large variation in the results for each run [54]. Other existing tools for parameter estimation in ABM include parameter sweeping, Bayesian approaches, greedy algorithms and regressions, hybrid approaches and nonlinear multi-grid/finite difference methods. These methods are often computationally expensive and are not ideal for biological models at large scales [43,45,46,49,50].
When the model has a large number of parameters with unknown values, model calibration can also require expensive computing resources. The parameter search space scales up with the number of parameters in the model [55,56]. Lengthy simulation times, ambiguities in model design and a large number of unknown parameters can also limit the efficiency of model calibration [57]. Sensitivity analysis is thus required to reduce the number of parameters before model calibration. Sensitivity analysis is a statistical technique to explore the effects of systematically varying input parameters on the variability of the model outputs. Multiple sensitivity analysis methods have been reported for ABM [58,59]. The most common techniques are one parameter at a time (OPAT), Fourier amplitude sensitivity test (FAST) regression-based methods and variance-based methods [58,60,61]. These methods, however, are often computationally expensive or unreliable for large numbers of agents or they require excessive numerical simulation times [58,[60][61][62][63]. Alternative schemes for sensitivity analysis are therefore worth exploring for running large-scale ABM with longtime scale simulations. For instance, Random forests, a popular machine learning technique, has been applied for the quantification of parameter importance and parameter selection [64][65][66]. Random forest was proposed to handle high dimensional data, thousands of input variables, complex parameter interaction, and missing data in complex models [67].
The primary goal of this study was to improve the accuracy of existing VF-ABM in simulating cellular and molecular activities associated with surgical vocal fold injury and repair. Experimental and computational studies were conducted. This paper was structured to present: (1) The experimental study of identifying and phenotyping immune and repair cells in surgically injured rat vocal folds up to four weeks after surgical injury; as well as (2) the computational study of optimizing the biological representation of VF-ABM through sensitivity analysis, model calibration and verification using flow cytometry data.

Vocal Fold Cell Phenotyping
The animal study was approved by the Institutional Animal Care and Use Committee of the University of Maryland-College Park (protocol number: R-12-85). The study design was a one-way between-subjects design with time [0 (uninjured control), one, two, three, or five days, and one, two, or four weeks post-surgery] as the independent variable. Dependent variables were percentages of neutrophils, macrophages, fibroblasts, and endothelial cells in the overall cell population based on multiparametric flow cytometry analyses. ketamine hydrochloride (90 mg/kg) and xylazine hydrochloride (9 mg/kg). Rats with injured vocal folds were euthanized at each of seven post-surgery time points: Days 1,2,3,5 and Weeks 1,2,4. Due to the unexpected death of the animals, between 16 to 19 animals survived at each assigned time point for laryngeal harvest and the subsequent flow cytometry analysis.

Vocal Fold Cell Isolation-
The mucosae of both sides of vocal folds were dissected and separated from the underlying thyroarytenoid muscles under a stereoscope. Vocal fold mucosal samples within an experimental group (e.g., 1 day after injury) and cytometric panel (see below) were pooled and subjected to single cell isolation. The strategy of sample pooling has been used for rat vocal fold cell and protein analysis considering the small size of rat vocal folds [69,70]. Samples were dispersed into single cell suspensions using digestion, centrifugation and filtration steps [69]. First, dissected mucosa were placed in conical panels covered in aluminum foil and then incubated in Ca/Mg free DPBS solution (Mediatech, USA, Cat#: 21-031-CV) with 0.05% collagenase (Gibco, USA, Cat#: 17018-029) /0.001% DNAse I (Sigma Aldrich, USA, Cat#: D4513) for 20 min at 37 °C to dissolve the extracellular matrix for cell release. The samples were thoroughly mixed and incubated again in 37 °C water bath. Once samples were visually confirmed to contain no large chunks of tissue, the solution was filtered with 40 nm cell strainers. Filtered samples were centrifuged at 290g for five minutes at 4 °C. After aspirating the supernatant, remaining cells were suspended in HBSS (Corning, USA, Cat#: 21-022-CV) from 1M HEPES solution with 2% fetal calf serum (HyClone, Canada, Cat#: SH30071.03) and 10nM HEPES (diluted from 1M HEPES solution, Corning, USA, Cat#: 25-060-CI). Resulted single cell suspension samples were transferred to flow cytometry polystyrene tubes and centrifuged at 201g for 10 min at 4 °C. After aspiration and suspension, the total cell number was determined with trypan blue staining and a hemacytometer. The number of isolated vocal fold cells ranged between 2.15 × 10 5 cells and 2.02 × 10 6 cells across time points.

Sample Preparation for Flow Cytometry
Experimental Samples: Two independent flow cytometry panels (Panels A and B) were designed to cross-validate the results for cell identification. Cell surface markers specific to the cells of interests were selected based on the literature of rat immunology. Multiple markers were used for cell identification since no single surface marker is specific to the cells of interest (Tables 2 and 3). Panel A consisted of 11 parameters including FSC (cell size), SSC (cell granularity), one cell viability marker (AmCyan) and eight fluorescent cell surface markers (CD11b/c, CD29, CD44H, CD45, CD68, CD105, CD106 and His48). Panel B was composed of eight parameters including FSC, SSC, one cell viability marker (AmCyan) and five fluorescent cell surface markers (CD31, CD45, CD90, CD163 and His48) (see Table S1 and Table S2 for the corresponding biological functions of each marker in Supplementary Materials).
Pre-conjugated primary antibodies were used to facilitate specific affinity to surface antigens (see Table S3 and Table S4 for catalogue number and the fluorescence conjugate in Supplementary Materials). Isolated cells were first incubated in 1:100 diluted purified mouse anti-rat CD32 -FcγII blocker (Monoclonal D34-485, 0.5 mg/ml, BD) in staining buffer for 20 min at 4 °C. This blocking step was required to prevent non-specific antibody reactions to the Fc-receptor on cells such as monocytes and macrophages. Samples were then stained in 1:100 dilution of either eight (Panel A) or five (Panel B) preconjugated antibodies with staining buffer for 30 min at 4 °C in a dark room. At the end, we collected cell type information across a total of 16 datasets (8 time points × 2 flow panels).
Control Samples: Unstained samples were used as negative controls that contained 5% FBS and DPBS without Ca/Mg ions. Samples were washed with DPBS without Ca/Mg ions twice and then stained with 1:1000 diluted fixable viability dye in DPBS without Ca/Mg ions for 30 min at 4 °C in a dark room. A fixable viability dye was used to exclude dead cells from the analysable population. UltraComp eBeads, which are beads conjugated with individual fluorochromes, were used as single-color controls for compensation setup to correct the spectral overlap. For the fixable viability dye and UltraComp beads, procedures were performed according to manufacturers. Samples were washed with staining buffer twice and then transferred to clear polystyrene flow cytometry tubes until analysis. Samples were read using BD FACSAria II (BD Bio sciences, San Jose, CA, USA) in the Maryland Pathogen Research Institute. Compensation steps were performed to correct spectral overlap before every run of the flow analysis [83,84].

Flow Cytometry Data
Processing and Analysis-Bivariate gating, an approach commonly used in flow cytometric analyses, was used primarily to analyze the multi-dimensional flow cytometry data [85][86][87][88][89][90][91]. FlowJo software (version 10.0.7, LLC, Ashland, OR, USA) was used in this study. Cell viability and compensation were applied during the setup of the experiment. For all datasets, gating analysis was performed by one individual (AG) who was blind to experimental conditions. The gating strategy for Panel A first used FSC-A vs SSC-A to represent the distribution of cells based on size, granularity and intracellular composition. These two scattering parameters led to the exclusion of debris, other non-cellular particles and lymphocytes [92,93]. To exclude doublets that were considered as single cells, three tests were implemented in the following order: (1) FSC-W vs FSC-H (W=width, H=height) to select low FSC-W cells, (2) FSC-W histogram to check the threshold and (3) FSC-H vs FSC-A (A=area) to select cells that were clustered diagonally [94,95].
Multiple plots and parameters were used to decide the threshold limit in the generation of gates and to verify gate positions (see Figure S1 for illustration in Supplementary Materials). Contour and density plots (Figure S1A (iii) and (iv)), which show expression levels and relative density of data, are commonly used in placing the gates [89,96]. Backgating was used to verify the gate positions ( Figure S1B). Backgating not only helped to analyse cells identified in a gate on dot plots with different parameters but also showed the final gated population within the population of its ancestors. The same gating strategy was applied across all eight time points within the same panel to compute the percentage of each cell type in the total cell population.

Flow Cytometry Statistical
Analysis-Statistical analyses were performed using JMP software (SAS Institute, Cary, NC). Mixed effects models and correlation analyses were used to evaluate the degree of similarity in results from Panels A and B. Distinct mixed effects models were used to assess variation between panels for cell types (neutrophils, macrophages, endothelial cells and fibroblasts). Density data for a particular cell type (i.e., % of cells in total cell population) at each time point was obtained from one flow cytometry run. To analyze the effect of flow panel on estimates of cell density, we used the data for each of the eight time points as independent data points (i.e., time not included in the model, n = 8 for each panel). We additionally ran a full-factorial model with flow panel (A vs. B) and cell type as the independent variables, which allowed us to analyze the effect of the panel simultaneously across cell types. Because each flow cytometry run generated data for each cell type, the identity of each flow cytometry run was included as a random variable for the mixed effects model. Pearson's correlations were used further to analyze the relationship between the results from Panel A and B, and to complement the results from mixed effects models.

VF-ABM Implementation-
The overview of the VF-ABM is illustrated in Algorithm 1. In our implementation, the ABM world represents a rat VF tissue and is spatially discretized into rectangular volumes called 3D patches. Each mobile agent represents a cell that moves from one patch to an adjacent patch and makes decisions to perform specific biological functions at discrete time steps. Agents make decisions based on the state of the patches. For example, chemokines and ECM proteins are associated with the states of the patches. Cells would perform their actions as a function of the concentrations of chemokines and ECM on their current and nearby patches. A detailed description of VF-ABM can be found in [25]. The overview of the VF-ABM is illustrated in Algorithm 1. The initial model size and configuration of VF-ABM are summarized in Table 4. Agentrules of VF-ABM were formulated based on reported mechanisms in the vocal fold wound healing literature [38][39][40][41][42]. Each time-step (or tick) corresponds to approximately 30 min of "real time". VF-ABM were implemented on a computer node with two NVIDIA Tesla P100 12 GB Graphics Processing Units (GPUs) and 32 Intel E5-2683 v4 computer processing units (CPUs). GPUs were mainly used for running the chemical diffusion component of the model and implementing the visualizations of the simulations Open Graphics Library (OpenGL). The model was implemented in the object-oriented programming language C++ and Open Multi-Processing (OpenMP) for parallel computing [25][26][27]. Each simulation took about 22 min to complete for 28 days of "real time". The source code of the VF-ABM can be found at https://github.com/VF-ABM/hpc-abm-vf-version_0_6.

Sensitivity Analysis-
The goal of sensitivity analysis was to determine the influence of model parameters on the ABM cell count outputs in order to reduce the number of parameters for calibration by selecting the most important parameters. Random Forests were chosen as the sensitivity analysis method herein because it does not require many samples for the implementation and thus makes the computational cost feasible for the notable scale of VF-ABM.
Random Forests use an ensemble method with decision trees that are constructed using bootstrapping [97] (see Figure S2 for the workflow of Random Forests in Supplementary Materials). To split the node on any variable (or parameter) within a tree, the GINI impurity criterion (GINI Index) is used [98,99]. The GINI Index for two descendent nodes should be less than that of the parent node. A node is split if the change in GINI Index is significant [98], i.e., where, t L is the node on the left, and t R the node on the right, N L is the number of samples on left node and N R is the number of samples on right node.
The importance of any variable (or parameter) is determined by Mean decrease GINI, given by [98] The mean decrease GINI (I) for a parameter (P) is evaluated by adding up the weighted GINI indices (i) for all nodes (N) where parameter P is used (averaged over all trees T in the forest) [98]. This quantity indicates how often a parameter P was selected for a split and provides a relative ranking of the parameters.
To find the suitable number of iterations, n for our sensitivity analysis, pilot tests were implemented with n = 3000, 4000, 4500, 5000, and 5500. It was observed that after 4500, the top three parameters for each cell type and time-point were identical and thus n = 5000 was chosen for this study. Sensitivity analysis using Random Forest was then implemented independently for the first four time-points (Day 1, Day 2, Day 3 and Day 5) as these time points would be used for the subsequent calibration.
The user setting was set as (1)  X=Samples Y=Model Output for a time point T and cell type C Df = data . frame Y,X allX = paste "X", 1: ncol X , sep = "" names df = c "Y", allX fit = randomForest factor Y . , data = df VI_F = importance fit varImp fit varImpPlot fit, type = 2

Model Calibration-The
Statistical Parameter Optimization Tool for Python (SPOTPY) package was used for parameter calibration [101]. The SPOTPY package has a library of algorithms and objective functions for model calibration and verification. The algorithm has five major steps: (1) Sampling, (2) Simulation, (3) Evaluation, (4) Objective Function and (5) Parameter Estimation. Sampling was implemented for the most important parameters based on the sensitivity analysis using a uniform distribution and a predefined range. After sampling was done, the simulation was executed with the generated parameter set using the sampling function and the output was stored as simulated data. In our case, the empirical flow cytometry data were used as evaluation data. An objective function was then used to estimate the performance of the model for that given parameter set. The root mean square error (RMSE) was then used as the objective function, which determines the fitness of simulated data to evaluated data. RMSE is given by Here, e(i) is evaluated data, s(i) is simulated data and m is the total number of parameters.
Lastly, the Robust Parameter Estimation (ROPE) algorithm was used for parameter optimization based on the concept of data depth [101]. Data depth refers to a statistical method that assigns a numerical value to a data point with respect to a set of points based on its centrality. It is a non-parametric multivariate statistical method that no distributional assumptions are needed [101,102]. ROPE was selected for VF-ABM calibration because it accelerates the calibration process by iteratively using information from previous simulations to estimate the outputs of subsequent ones [101,103]. The principle of ROPE is to identify a set of parameter vectors with the high model performance and subsequently generate a set of parameter vectors with high data depth with respect to the first set [101,103].
Instead of requiring thousands of random parameter sets, ROPE uses the results from the previous simulation as knowledge to define the parameter sets for the subsequent simulations. ROPE is recommended when the model requires parallel computing and the simulations are expensive, as the case of VF-ABM herein [101,103]. According to ROPE, the top 10% results are used to generate the next parameter sets. After estimating the best set of parameters, the algorithm repeats the sampling again. This time, the algorithm uses parameter sets learned from previous runs to determine the parameter set for the next run. The procedure repeats either until the defined number of iterations is reached or the RMSE reaches zero. In the present study, flow cytometry data from the first four time-points (Day 1, Day 2, Day 3 and Day 5) was used and the VF-ABM was run 800 times iteratively for each calibration step ( Figure 1).
The top three parameters for each cell type (neutrophils, macrophages and fibroblasts) for each time point were used to proceed for calibration. Hence, a total of 36 top parameters were ranked for each cell type at each time point (3 parameters x 3 cell types x 4 time points); however, 12 of these parameters were found to be the same across certain conditions. As a result, a set of 24 unique parameters was calibrated iteratively with all other parameters being fixed as constants until the model eventually yielded a satisfactory match between simulation data and empirical data.
All four time points (Day 1, Day 2, Day 3 and Day 5) were first calibrated for each cell type individually and calibrated values of parameter sets for each cell type were determined. Subsequently, a new parameter range was reduced to 20% of the size of the original range, targeting the calibrated value, and was then re-entered manually in SPOTPY. The modified range for a particular parameter was estimated using where, Mi' and Ma' are new minima and maxima, Mi and Ma are old minima and maxima and C is the calibrated value of the parameter. The parameters were then recalibrated again using the modified range with ROPE.

2.2.4.
Model Verification-For model verification, the calibrated model was evaluated statistically for its accuracy in cell outputs at the Day 7, Week 2 and Week 4. Given ABM's stochastic properties, the model was run 100 times up to Day 28 to generate a representative data set for statistical evaluation. To avoid making assumptions on the distributions of the output data, a non-parametric method was implemented to obtain predictive confidence intervals. A two-sided 95% confidence interval was computed for each cell type at each time-point by directly calculating the quantiles of the 100 simulation outputs using ordered statistics. The VF-ABM-simulated outputs would be considered accurate if the empirical results for a given cell type fall within the 95% confidence interval of the ABM-simulation outputs.

Cell Compositions in Injured Rat Vocal Folds
Across both Panels A and B, flow cytometric analyses revealed variations in vocal fold cellular composition following injury (Figure 2A,B). Across all cell types, data from Panels A and B (i.e., two types of analyses) were not significantly different from each other (mixed effects model: F 1,14 = 0.4, p = 0.5630) and significantly positively correlated with each other (Pearson's correlation: p < 0.01 for pairwise correlations for each cell type). This result is consistent with results from the full-factorial mixed effects model (Panel and Cell Type as independent variables, cytometry run as a random variable), with Panel and the interaction between Panel and Cell Type not being significant (p > 0.05 for each). Importantly, the effect of cell type was significant (F 3,42 = 59.3, p < 0.0001) and was driven by the fact that fibroblasts were more abundant than any other cell type analyzed (Tukey's HSD, p < 0.05).
No significant differences were found in cell density among the remaining cell types. These analyses suggest convergent estimates of cell types emerge from the distinct types of cytometry panels; consequently, we averaged the results from the two panels for the final characterization of changes to cell types across time ( Figure 2C).
In uninjured controls, neutrophils, macrophages, endothelial cells and fibroblasts, respectively, represented 7.70%, 4.70%, 9.54% and 53.49% of the total cell population. In surgically injured vocal folds, fibroblasts also appeared as the prevalent cell type (>50%) among the four cell types at all time-points except Day 2 and Day 3. Neutrophils ranged from 2.31% to 14.37% on average throughout the examined time course. Macrophages appeared as the most dominant cell type (~33%) on Day 2. Shortly after, both neutrophils and macrophages were found in less than 5% of the total cell populations. The population of endothelial cells was below 5% in injured vocal folds on average except a relatively sharp increase during Week 2 (14.10%).

ABM Parameter Ranking
Random Forests were used to quantify the variance contribution of influential parameters to model outputs. Mean decrease GINI was used to rank the parameters from most influential to least influential. The ranking of parameter importance was then obtained by sorting the parameters according to their Mean Decrease GINI ( Figure 3 for the top 25 parameters for Day 1; see Figures S3-S5 for the complete results in Supplementary Materials).

VF-ABM Calibration with Most Influential Parameters
VF-ABM were calibrated for Day 1, Day 2, Day 3 and Day 5 from the flow cytometry data. The top three parameters for each cell type for each time point were selected for model calibration ( Table 5). As described in Section 2.2.3, 12 out of 36 parameters were overlapped across conditions, leading to a total of 24 unique parameters for the calibration. These top parameters were mostly related to cytokine synthesis, cell activation, ECM synthesis and cell recruitment (sprouting amount).

Verification of ABM-Predicted Cellular Dynamics
VF-ABM were verified for Day 7, Week 2, and Week 4 with the flow cytometry data. The VF-ABM reproduced the dynamics of all cell populations from the empirical data between Day 1 to Week 4 ( Figure 4). For neutrophils, the peak was delayed by one day. Macrophage counts reached a maximum concentration on Day 2, in contrast to Day 1 in the data. For fibroblasts, the model reproduced the peak at the correct time point on Day 2 but did not resemble the oscillation pattern as observed in the empirical data.
In addition to the overall patterns, statistical tests of 95% confidence intervals were used to quantitatively evaluate the VF-ABM cell outputs. As the VF-ABM was run 100 times, a 95% confidence interval was obtained by computing the 2.5% and 97.5% quantiles from the ABM simulation output samples for each time point and cell type. The goal here was to evaluate for each time point (i.e., Day 7, Week 2 and Week 4), how many empirical data points (i.e., counts of neutrophils, macrophages and fibroblasts) fell within the two-sided 95% confidence interval of simulated outputs (Table 6). For Day 7, all empirical data were within the 95% confidence intervals (i.e., 100%, 3/3). For Week 2 and Week 4, all empirical data points fell outside the 95% intervals except for neutrophils at Week 4. However, when examining the data closely, the VF-ABM outputs were reasonably close to the empirical data. For instance, the empirical cell counts for neutrophils and macrophages were 180 and 186 at Week 2 respectively; whereas the VF-ABM outputs were from 188.5 to 222.5 and from 103.5 to 155.5 respectively.

Time-Evolution of Cell Populations in Surgical Vocal Fold Injury and Repair
Multi-parametric flow cytometry was used to analyze dynamic changes to cell composition of vocal folds up to four weeks following injury. Results are generally consistent with the temporal dynamics of immune cells (neutrophils and macrophages) known in general wound healing literature. In dermal wound healing, neutrophils arrive at the wound site in the first few hours, peak in abundance one day after the injury, and reduce rapidly by the third day after injury [101][102][103][104][105][106][107]. Macrophages peak slightly later, by the second day after the injury, but also demonstrate rapid decreases on the third day after surgery [101][102][103][104]. Given the similarities in the temporal dynamics of neutrophil and macrophage changes following injury in vocal folds and other types of tissue, comparable cellular functions of immune cells underlying the inflammation and repair processes are speculated.
In addition to immune cells, the temporal pattern of endothelial cells following vocal fold damage also resembles the general wound healing literature [105,106]. In this study, endothelial cells contributed about 9% of the total cell population in uninjured controls that were likely originated from the endothelium layer of undisturbed blood capillaries in native vocal fold mucosae. A rise in the endothelial cell population was observed in Week 2 after surgery, representing about 14% of total cell populations. An active angiogenesis might have started between Week 1 and Week 2 post-surgery. To our best knowledge, this study is the first to evaluate the population of endothelial cells in vocal fold injury and healing. Further investigation on the precise role of endothelial cells, their interaction with other cell types as well as their implications in vocal fold scarring is warranted before we could reliably formulate agent-rules of endothelial cells in VF-ABM.
Fibroblasts normally arrive at the wound site around Day 3 and start the proliferative phase [37,103,105,106]. These cells usually reach their maximum concentrations between Day 5 and Day 7, and start to decrease gradually by the end of Week 1 [101][102][103][104]. Interestingly, fibroblasts were predominantly present in injured rat vocal folds spanning from the early acute inflammatory to later remodeling phases of wound healing as shown in this study. Previous studies reported the abundant presence of fibroblasts up to one week after surgical vocal fold injury [36,66,108]. Our data further indicated a notable accumulation of fibroblasts at the wound site up to four weeks after surgery, likely associated with their important roles in ECM remodeling.
Further, fibroblasts remain the dominant cell type at Day 1 (53.81%) and Day 2 (29.37%) despite the infiltration of neutrophils and macrophages into the acute wound environment. Previous work showed that both macrophages and fibroblasts were the major cell source of damage-associated molecular patterns (DAMP), namely high mobility group box-1, in modulating the inflammatory cytokine production in acute vocal fold injury [67,109]. Our data and others collectively suggest immunological and repair functions of fibroblasts that are unique in vocal fold microenvironment [110,111].

Random Forests and ROPE for VF-ABM Parameter Optimization
Common sensitivity analysis methods (e.g., One Parameter At a Time (OPAT), Fourier Amplitude Sensitivity Test (FAST) etc.) and calibration algorithms (e.g., Genetic Algorithm (GA), Particle Swamp Optimization (PSO), Bayesian approach etc.) are not optimal for large scale ABM as in our case herein. We thus used a new approach utilizing Random Forests and ROPE for sensitivity analysis and model calibration respectively. The execution of Random Forests does not require a large number of samples for running sensitivity analysis. This feature makes Random Forests less computationally expensive and more reliable for simulating long time scale models as compared to other sensitivity analysis methods [62][63][64][65]. For example, the estimated number of samples required for Fourier Amplitude Sensitivity Test (FAST) is 128 × parameter number ˆ 2. That is, a total of 5,766,549 samples for the case of VF-ABM. An estimated time of running FAST is seven years with the computer node of two NVIDIA Tesla P100 12 GB GPUs and two Intel E5-2683 v4 CPUs [60,61]. In contrast, Random Forests uses a bootstrapping method for sampling and required only 5,000 samples for VF-ABM, making the sensitivity analysis much feasible [62][63][64]112].
Among the 24 parameter subjected to ABM calibration (Table 5), they were mostly related to cytokine synthesis, cell activation, ECM synthesis and cell recruitment (sprouting amount). In our VF-ABM, many biological activities of neutrophils, macrophages and fibroblasts are controlled by parameters of sprouting frequency and amount as well as cytokine synthesis. Specifically, parameters of sprouting frequency and amount were used to abstract the migration of neutrophils and macrophages to vocal fold vasculatures and transmigration from capillary to mucosal tissue as well as the recruitment and proliferation of fibroblasts in VF-ABM. Two sprouting-related parameters, namely parameter 155 (WhSproutAmount4) and parameter 156 (WhSproutAmount5) were used to control the amount of neutrophils. Three sprouting-related parameters (parameter 159 (WhSproutAmount8), 160 (WhSproutAmount9) and 161 (WhSproutAmount10) were used to control the amount of macrophages. Four sprouting-related parameters (parameter 150 (WhSproutFreq5), 162 (WhSproutAmount11), 163 (WhSproutAmount12) and 164 (WhSproutAmount13) were used to control the amount of fibroblasts. Further, the sproutingrelated parameters for fibroblasts were linked to cytokines, growth factors and ECM contents in VF-ABM. Sprouting-related parameters are thus critical to initiate and sustain cell activities of neutrophils, macrophages and fibroblasts during tissue repair. The ABM outputs of cell numbers are mostly determined by the sprouting amount and sprouting frequency of these three cell types. As cell numbers were used as inputs for calibration, it is reasonable that the parameters of sprouting amount and frequency have the strongest influence on outputs of cell numbers in VF-ABM.

VF-ABM Performance
VF-ABM were calibrated for Day 1, Day 2, Day 3 and Day 5 and verified for Day 7, Week 2 and Week 4 with the flow cytometry data. The overall temporal trajectories of neutrophils, macrophages and fibroblasts were found to be in reasonable agreement between empirical and simulated data. Statistically, the empirical data of all three cell types fell within the VF-ABM output ranges for Day 7, but slightly outside the 95% confidence interval for remote time points (Week 2 and Week 4 except for neutrophils). Of note, the VF-ABM stimulated data showed a discrepancy of fibroblast count from the empirical trend at Week 4. The simulated fibroblast counts were lower than those of empirically observed by about 14 folds at Week 4. Based on the literature, proliferation rates of fibroblasts are 15.4 ± 1.1% after 4 days, 4.1 ± 0.6% after 1 week, and less than 0.5% after 2 weeks from rodent cardiac literature [113]. Since the proliferation rate of fibroblasts greatly depends on the microenvironment such as cytokine levels and ECM contents, the exact proliferation rate for vocal fold fibroblasts was not fully accurate based on the cardiac tissue data. Additional empirical data are thus needed to better estimate the proliferation rates of vocal fold fibroblasts in both homeostatic and injurious conditions. Once the data become available, parameters related to the sprouting amount and sprouting frequency of fibroblasts can be revised to improve the ABM-simulation of fibroblast dynamics in the final stages of wound healing such as wound remodeling.
We also noticed that VF-ABM neutrophil outputs did not have a perfect match with those from the empirical data in terms of timing and magnitude. ( Figure 4A) The estimation of initial parameter range for corresponding sprouting frequencies (i.e., the rate of the cell infiltration when there is tissue damage) was difficult to define for calibration. The infiltration rate of neutrophils was only available as a function of blood flow in the human circulatory system [114]. However, specific information on rates of blood flow and injuryinduced vasodilation within the vocal fold lamina propria are not available to date. Such data are necessary to better simulate the transmigration of neutrophils from capillary to mucosal tissue in VF-ABM.

Conclusions
The overarching goal of this research was to further develop VF-ABM into computer software that could pre-operatively inform clinicians about an individual's risk of iatrogenic scarring from surgery. Iatrogenic vocal fold scarring is one of the most perplexing complications in phonosurgery. Surgical injury triggers a cascade of cellular and molecular events in inflammation and healing. Due to the individual-specific response to surgical injury, a computational tool could be useful to aid clinicians in tailoring the vocal treatment. For example, if the clinician can obtain patients' biological profile before surgery, the computer model can help estimate individual post-operative response. In this study, empirical data on major vocal fold cell population were used to optimize existing VF-ABM through a scheme of sensitivity analysis, calibration and verification. General agreement was observed between the empirical and simulated data but further refinement is still warranted. The success of this work will contribute to the advancement of computational medicine in voice disorders.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material. Workflow of VF-ABM Calibration. A total of 36 top parameters were ranked for each cell type at each time point. Twelve of them were overlapped across conditions. As a result, a total of 24 unique parameters were used for calibration.    Rat cell surface marker profile for flow cytometry in Panel A. "+" denotes positive expression and "-" denotes negative expression.  Rat cell surface marker profile for flow cytometry in Panel B. "+" shows positive expression and "-" shows negative expression.

Empirical Data within 95% CI 33%
* denotes the empirical data within the 95% confidence interval (CI) of VF-ABM outputs.