Considerations of the Impacts of Cell-Specific Growth and Production Rate on Clone Selection—A Simulation Study

For the manufacturing of complex biopharmaceuticals using bioreactors with cultivated mammalian cells, high product concentration is an important objective. The phenotype of the cells in a reactor plays an important role. Are clonal cell populations showing high cell-specific growth rates more favorable than cell lines with higher cell-specific productivities, or vice versa? Five clonal Chinese hamster ovary cell populations were analyzed based on the data of a 3-month stability study. We adapted a mechanistic cell culture model to the experimental data of one such clonally derived cell population. Uncertainties and prior knowledge concerning model parameters were considered using Bayesian parameter estimation. This model was then used to define an inoculum train protocol. Based on this, we subsequently simulated the impacts of differences in growth rates (±10%) and production rates (±10% and ±50%) on the overall cultivation time (inoculum train cultures plus the final production phase), on the volumetric titer in the production bioreactor, and on the ratio of both, defined as overall process productivity. The simulations showed that growth rates have a higher impact (up to three times) on overall process productivity and on product output per year, whereas cells with higher productivity can potentially generate higher product concentrations in the production vessel.


Introduction
For the production of certain biopharmaceuticals, animal cells have to be expanded from a frozen vial. Today, Chinese hamster ovary (CHO) cells are by far the most popular system in use [1] because they are known to be easy to grow; safe, in that they do not carry infectious agents; and, last but not least, highly productive, with yields in the multiple grams per liter range [2,3]. Nevertheless, questions and issues remain to be solved to maximize their utility, particularly since CHO cells have a very wide range of genotypic diversity and thus corresponding phenotypic differences [4][5][6]. This becomes obvious when clonally derived cell populations from a single transfection are compared against each other. These phenotypic differences have impacts on growth-related characteristics, cell-specific productivity and the quality of the final product (e.g., glycosylation patterns) [6,7]. Screening and profiling methods have been introduced (amongst others by [5,7]) to assess cell growth rate, cell-specific productivity and glycosylation patterns, along with further quality attributes of the produced recombinant proteins. In addition, the phenotypic stability of cell populations is another important parameter. One factor in maintaining stability over a reasonable time frame is keeping the environmental conditions of the cells within narrow and favorable ranges [2,3]. To test genetic and production stability, clonally derived cell lines, referred to in the following also as "clones" or "clonally derived cell populations", typically undergo stability studies, which can last up to six months.
Desired phenotypic parameters to be maintained in such generated cell lines (besides properties characterizing the quality of the produced recombinant protein) are high cell-specific productivity [3,4,8,9] and high growth rates in order to reduce overall cultivation times (including the duration of the cell expansion process). However, growth rates and specific productivity of recombinant cells are often inversely related to each other [3]. Thus, a trade-off frequently has to be weighed between clonal populations, with one showing faster growth but a lower cell-specific production rate and vice versa (see Figure 1a), assuming little or no quality differences in the product obtained.
Mathematical process models are suitable tools for analysis, for the generation of process understanding and for simulation and prediction. Several examples of such process models addressing biopharmaceutical manufacturing can be found in the literature [10][11][12][13][14][15][16]. Within this field, uncertainty-based methods have gained attention because they allow model uncertainty, uncertainty in measurements and batch-to-batch variability to be taken into consideration.
This study aims to present a model-based investigation of the impacts of clonal differences concerning cell-specific growth rates and cell-specific production rates on the duration of an inoculum train; the volumetric titer in production; and the overall process productivity, defined by the ratio of volumetric titer in production to the overall cultivation time, including the duration of the cell expansion process (inoculum train). In this study, a batch process (for simulation of the inoculum train and also for the production bioreactor) has been used for simulation and evaluation because it is a good first step to obtain results regarding the impacts of phenotypic differences on the above-described response values. This can be further expanded, once a smaller number of clonally derived cell lines has been chosen, to also involve fed-batch processes and/or perfusion mode.
The investigation is divided into four main blocks (see also Figure 1b(I-IV) for orientation).
I: Growth rate and production rate were analyzed for five clonal populations based on the data of a stability study (Section 3.1).
II: A reference cell line was taken from one of these and a mechanistic cell culture model was adapted to the data obtained in the laboratory (Section 3.2). Uncertainties were considered and prior knowledge from previous studies concerning model parameters was integrated into the model using Bayesian parameter estimation.
III: Upstream simulations were performed for three different clonal cell lines under consideration of the variabilities observed in Section 3.1. For each clonal cell line, a suitable inoculum train protocol was defined. Furthermore, these inoculum train protocols were compared to each other with respect to inoculum train duration and volumetric titer in production (see Section 3.3).
IV: Several combinations of maximum growth rate (±10%) and maximum production rate (±10% and ±50%) within realistic ranges were considered (Section 3.4). First, the production rate was varied by ±10% for a multiple regression. Second, a variation of ±50% was applied to cover all three investigated clones with their growth and production rates and to illustrate them in a response surface plot. Based on the results, a decision criterion is provided which is expected to help in evaluating different clonal cell lines.

Data from a Stability Study
Experimental data from a stability study were used for the statistical analysis of cell-specific growth rates (in the following, growth rates) and cell-specific production rates (in the following, production rates) of five different clonal CHO populations (named clone 1, ..., clone 5). CHOExpress cells were used as the host system; they are known to be moderate producers of ammonia. This is also based on the media formulations used in the work. Cells were cultivated with and without puromycin in duplicate runs in 50 mL OrbShake tubes (TubeSpin bioreactor 50™, TPP, Trasadingen, Switzerland, 30 mm diameter) with culture volumes of 5 mL. Each subcultivation was started with a viable cell density of 5 · 10^5 cells mL^−1. The cultures were shaken at 180 rpm in a Kühner SFX-1 incubator (Kühner AG, Birsfelden, Switzerland), set at a temperature of 37 °C and a CO2 set-point of 5%. For each clonal population, cells were cultivated and passaged (subcultivated) every 3 or 4 days over a period of 13 weeks (25 subcultivations in total). Measurements of volumetric product titer were taken 4 days after starting a new subcultivation for every second subcultivation. Viable cell densities were determined at the end of every subcultivation. The seeding density of 5 · 10^5 cells mL^−1 was based on calculated dilutions into fresh medium. These data have been used for approximations of empirical growth rates and production rates according to Section 2.4.
For clarity, in the following and throughout this paper, data presentation and discussions on cell cultures refer to experimental work with clonal cell lines by identifying these in numbers, i.e., clonal cell line # 1, 2, 3, or equivalent. In contrast, modelled cell lines are referred to as Clone A, B, etc.

Cultivation Systems for Inoculum Train Simulations
For inoculum train simulation, only vessel types applicable for orbital shaking have been considered, with the expectation that the cultivation conditions were highly similar during cell expansion. These have been taken from the list reported in [17]. Based on the given working volumes, an inoculum train has been designed to include 5 scales from 10 mL to 100 L target volume and a production scale of 1000 L target volume (see Table 1).
Table 1, scales 4 to 6: scale 4, OrbShake bioreactor prototype (50 L), target volume 15 L; scale 5, OrbShake bioreactor prototype (200 L), target volume 100 L; scale 6, OrbShake bioreactor prototype (2500 L), target volume 1000 L.

Approximations of Empirical Growth Rates and Production Rates
Based on data of viable cell densities of a clonal population, empirical (averaged) growth rates and empirical (averaged) production rates have been determined. The empirical growth rate µ_emp between two points in time t_i and t_i+1 was calculated from the corresponding viable cell density values X_v,i and X_v,i+1 according to Equation (1). The empirical production rate q_titer,emp between two points in time t_i and t_i+1 was calculated from the corresponding volumetric titer values c_titer,i and c_titer,i+1 according to Equation (2).
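A minimal sketch of how such averaged rates can be computed from two samples is given below. It assumes exponential growth between the two sampling points; the form used for the production rate (titer increase divided by the integral of viable cell density) is one common choice and an assumption here, not necessarily the exact form of Equation (2). Function names and the numerical inputs are hypothetical.

```python
import math


def empirical_growth_rate(x_v_start, x_v_end, t_start, t_end):
    """Average specific growth rate [1/h] between two samples,
    assuming exponential growth in between."""
    return math.log(x_v_end / x_v_start) / (t_end - t_start)


def empirical_production_rate(c_start, c_end, x_v_start, x_v_end, t_start, t_end):
    """Average cell-specific production rate (ASSUMED form):
    titer increase divided by the integral of viable cell density."""
    mu = empirical_growth_rate(x_v_start, x_v_end, t_start, t_end)
    if abs(mu) < 1e-12:
        # No net growth: the integral reduces to density times elapsed time.
        ivcd = x_v_start * (t_end - t_start)
    else:
        # Closed-form integral of an exponentially growing cell density.
        ivcd = (x_v_end - x_v_start) / mu
    return (c_end - c_start) / ivcd


# Illustrative numbers only (not taken from the study): one 4-day subcultivation,
# cell densities in cells/mL, titers in mg/mL, time in h.
mu_emp = empirical_growth_rate(5e5, 8e6, 0.0, 96.0)
q_emp = empirical_production_rate(0.0, 0.3, 5e5, 8e6, 0.0, 96.0)
print(f"mu_emp = {mu_emp:.4f} 1/h, q_titer_emp = {q_emp:.3e} mg/cell/h")
```

With cell densities in cells mL^−1 and titers in mg mL^−1, the production rate is obtained in mg cell^−1 h^−1, the unit used throughout this work.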

Statistical Testing of the Differences in Means between Clonal Cell Populations
Clonal populations have been analyzed regarding their growth rates and production rates by applying statistical tests to determine the differences between population means using the statistical software R [18]. Variance homogeneity was tested using the Bartlett test [19]. A global test on differences between population means was performed using the Brown and Forsythe F-test [20] (similar to the classical ANOVA but adapted for heterogeneous variances). To identify where the differences come from and to determine the differences between individual groups, post hoc tests have to be performed. When comparing more than two populations, a method for multiple testing containing an adjustment of the significance level is additionally required. Multiple testing methods exist for groups showing heterogeneous variances. In this work, a pairwise comparison was performed using the adjustment method by Benjamini and Yekutieli [21]. The applied statistical tests and R-commands are given in Figure 2.
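To illustrate the adjustment step, the following sketch reimplements the Benjamini and Yekutieli correction for a list of raw p-values in plain Python. It follows the same arithmetic as the "BY" method of R's p.adjust function; the function name and the example p-values are hypothetical and do not reproduce the values of this study.

```python
def adjust_benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(p_values)
    # Correction factor for arbitrary dependence: c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    # Walk from the largest to the smallest p-value.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    for position, index in enumerate(order):
        rank = m - position  # ascending rank, 1 = smallest p-value
        candidate = p_values[index] * m * c_m / rank
        running_min = min(running_min, candidate)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values of three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = adjust_benjamini_yekutieli(raw)
for p, p_adj in zip(raw, adjusted):
    print(f"raw p = {p:.3f}, adjusted p = {p_adj:.4f}")
```

An adjusted p-value below the chosen significance level (here 5%) is then interpreted as a statistically significant difference between the two compared populations.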

Mechanistic Model
The applied kinetic model is a modification of previous model variations published in [12,13,22,23]. Differential equations (see Table 2), which incorporate nine mostly Monod-type algebraic equations (description of growth rate, death rate, substrate uptake, metabolite production kinetics and production rate) and 18 model parameters, describe the cell culture dynamics of total and viable cell density, X_t and X_v, and the concentrations of glucose c_Glc, glutamine c_Gln, limiting substrate c_LS, lactate c_Lac, ammonia c_Amm and volumetric titer c_titer. All these variables and model parameters are listed in Table A1, including units and descriptions.
Table 2. Mechanistic model (for batch and fed-batch mode) [12,13,22,23] for descriptions of cell growth, cell death, substrate uptake, metabolite production and antibody production.

The model can be applied in batch mode and in fed-batch mode. The presented model example contains extended fed-batch terms for a glucose feed, a glutamine feed and a medium feed containing specific glucose and glutamine concentrations. Therefore, the differential equations are extended by terms (at the end of each differential equation) that include the feeding rates for glucose (F_Glc), glutamine (F_Gln) and the medium feed (F_Medium). The specific glucose and glutamine concentrations of the feeds are denoted by c_Glc,F, c_Gln,F, c_Glc,Medium and c_Gln,Medium. When applying fed-batch mode, all considered concentrations (viable and total cells, substrates and metabolites) are diluted during addition of the feed. This is represented by the dilution term −(F_Glc + F_Gln + F_Medium)/V. At the same time, glucose or glutamine concentrations increase during the glucose or glutamine feeding and during the medium feeding. This is represented by corresponding positive feed terms in the glucose and glutamine balances, respectively. When applying this model to batch mode, all these feeding terms are omitted.
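As an illustration of how the feed terms enter a balance, a generic fed-batch glucose balance can be written as follows. This is a sketch of the standard mass-balance form under the notation introduced above, not a reproduction of Table 2; the symbol r_Glc is a placeholder introduced here for all batch-mode kinetic terms of the glucose balance, and the volume balance in the second line is an added assumption.

```latex
% Assumed generic form; r_{Glc} abbreviates the batch-mode kinetic terms.
\frac{dc_{Glc}}{dt} = r_{Glc}
  + \frac{F_{Glc}}{V}\, c_{Glc,F}
  + \frac{F_{Medium}}{V}\, c_{Glc,Medium}
  - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V}\, c_{Glc}

% Assumed volume balance for fed-batch operation.
\frac{dV}{dt} = F_{Glc} + F_{Gln} + F_{Medium}
```

Setting all feeding rates to zero removes the last three terms of the first equation and keeps the volume constant, which corresponds to the batch case.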

Bayesian Parameter Estimation
One of the main differences between Bayesian statistics and frequentist statistical methods ("classical statistics" based on frequencies) is that Bayesian statistics provides a framework to integrate prior process knowledge (knowledge available before applying new data for analysis), including input uncertainty, and to calculate probabilities based on both prior knowledge and newly collected data. Applying this principle within the context of parameter estimation is called Bayesian parameter estimation. A very brief description of this procedure is given through the following steps: Step 1 quantifies the prior knowledge, including input uncertainties (e.g., measurement uncertainties of initial concentrations and uncertainties concerning model parameters). In this contribution, a gamma distribution has been chosen to describe the probability distribution of model parameters (further details can be found in Appendix A.1).
Step 2 determines the posterior parameter distributions using an appropriate algorithm. A Markov chain Monte Carlo method based on a single-component Metropolis algorithm was applied, resulting in posterior distributions, including the maximum a posteriori (MAP) estimate and variance.
Step 3 evaluates the parameter estimation results, for example, based on the Monte Carlo error and the posterior parameter distributions. This method was implemented in the seed train software tool developed at Ostwestfalen-Lippe University of Applied Sciences and Arts. For a more detailed description of this approach, refer to [13].
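The three steps can be sketched for a deliberately simplified case. The example below is not the implementation of the seed train software tool: it replaces the full mechanistic model by pure exponential growth with two parameters (mu_max and the initial cell density x0), assumes a Gaussian error on the logarithm of the cell density, and uses synthetic data. All names, prior settings and step widths are hypothetical.

```python
import math
import random
import statistics


def log_gamma_pdf(x, shape, scale):
    """Log density of a gamma distribution (used here as prior)."""
    if x <= 0.0:
        return -math.inf
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def log_posterior(theta, data, priors, sigma):
    """Log prior plus Gaussian log-likelihood on log viable cell density."""
    log_p = sum(log_gamma_pdf(theta[name], *priors[name]) for name in theta)
    if log_p == -math.inf:
        return log_p
    for t, x_obs in data:
        predicted = math.log(theta["x0"]) + theta["mu_max"] * t
        log_p -= 0.5 * ((math.log(x_obs) - predicted) / sigma) ** 2
    return log_p


def single_component_metropolis(data, priors, start, step, n_iter, sigma=0.1, seed=1):
    """Update one parameter at a time with a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    names = list(start)
    theta = dict(start)
    current = log_posterior(theta, data, priors, sigma)
    chain = []
    for _ in range(n_iter):
        for name in names:
            proposal = dict(theta)
            proposal[name] = theta[name] + rng.gauss(0.0, step[name])
            candidate = log_posterior(proposal, data, priors, sigma)
            # Metropolis acceptance rule for a symmetric proposal.
            if rng.random() < math.exp(min(0.0, candidate - current)):
                theta, current = proposal, candidate
        chain.append(dict(theta))
    return chain


# Step 1: priors as (shape, scale) of gamma distributions (hypothetical values).
priors = {"mu_max": (11.1, 0.0036), "x0": (100.0, 0.005)}
# Synthetic data: (time [h], viable cell density [1e6 cells/mL]).
data = [(t, 0.5 * math.exp(0.03 * t)) for t in (0.0, 24.0, 48.0, 72.0, 96.0)]
# Step 2: sample the posterior.
chain = single_component_metropolis(
    data, priors, start={"mu_max": 0.04, "x0": 0.5},
    step={"mu_max": 0.002, "x0": 0.03}, n_iter=5000)
# Step 3: summarize after discarding a burn-in phase.
kept = [sample["mu_max"] for sample in chain[1000:]]
post_mean = statistics.fmean(kept)
post_var = statistics.pvariance(kept)
print(f"posterior mean of mu_max: {post_mean:.4f} 1/h, variance: {post_var:.2e}")
```

Updating one parameter per proposal keeps the tuning of step widths simple, at the price of slower mixing when parameters are strongly correlated.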

Upstream Simulation-Software Tool
The upstream process has been simulated and digitally displayed using the seed train-software tool [12][13][14] implemented in MATLAB [24]. To digitally display an upstream process, several inputs are required: The estimated model parameters, initial concentrations of cells of the first scale of operation, a passaging or subcultivation strategy for the cells (e.g., concerning the point in time for cell passaging), the inoculum train vessels and operating conditions and medium concentrations. For further details, see [12][13][14].

Uncertainty-Based Prediction
For simulation of the production scale, uncertainty in measurements of initial concentrations and in parameters was considered and propagated onto the output. To this end, Monte Carlo samples were generated by sampling initial values of state variables for scale 1 from the gamma distribution described above. The corresponding histograms can be found in the appendix; see Figure A1.
The obtained 90% prediction bands (credible bands) were used for comparisons of different clonal cell populations, and were calculated using the 5% and 95% quantiles of the obtained Monte Carlo sample at a specific point in time.
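The construction of such a band can be sketched as follows. The example propagates uncertain inputs through a strongly simplified model (pure exponential growth instead of the mechanistic model of this work) and takes the 5% and 95% quantiles at each point in time. The gamma parameters, given as (shape, scale) so that the mean equals shape times scale, and all names are hypothetical.

```python
import math
import random
import statistics


def simulate_viable_cells(x0, mu, times):
    """Simplified stand-in for the cell culture model: exponential growth."""
    return [x0 * math.exp(mu * t) for t in times]


def prediction_band(times, x0_prior, mu_prior, n_samples=2000, seed=0):
    """Return (time, 5% quantile, 95% quantile) for each point in time."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_samples):
        x0 = rng.gammavariate(*x0_prior)  # uncertain initial cell density
        mu = rng.gammavariate(*mu_prior)  # uncertain growth rate
        runs.append(simulate_viable_cells(x0, mu, times))
    band = []
    for j, t in enumerate(times):
        # n=20 yields 19 cut points: 5%, 10%, ..., 95%.
        cuts = statistics.quantiles([run[j] for run in runs], n=20)
        band.append((t, cuts[0], cuts[-1]))
    return band


times = [0.0, 24.0, 48.0, 72.0]
band = prediction_band(times, x0_prior=(100.0, 0.005), mu_prior=(400.0, 0.000075))
for t, lower, upper in band:
    print(f"t = {t:5.1f} h: 90% band [{lower:.3f}, {upper:.3f}] 1e6 cells/mL")
```

Because the quantiles are taken pointwise, the band describes the spread of the simulated trajectories at each time and not the spread of a single trajectory over time.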

Response Surface Modeling
Response surface models (RSM) describe the relationships between individual explanatory variables (here µ_max and q_titer,max) and one or more process variables of interest. In this contribution, the first response variable was the volumetric titer in production. The second response variable was the overall process productivity (space-time yield: volumetric titer in production divided by overall cultivation time, including the inoculum train). The latter is important because it answers how much product a given manufacturing plant can deliver to the market when it is used for production during one year. The methodology mostly consists of fitting a multiple regression model, i.e., estimating the corresponding regression coefficients. First-order and second-order polynomials have been fitted (through estimation of regression coefficients) using MATLAB [24]. To evaluate the obtained model, the coefficient of determination was calculated. It measures how well differences in the response variable can be explained by its relationship to the considered independent factors.
The β-coefficients (here β = (β_1, β_2)) indicate how much the response value changes per unit variation of the independent variable (or factor, here µ_max or q_titer,max). Thus, when the factors are varied over comparable ranges, a β-coefficient of larger magnitude indicates a stronger influence of the factor on the response value (here volumetric titer or overall process productivity).
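A first-order response surface fit can be sketched in a few lines. The example below uses coded factor levels (−1, 0, +1, standing for a decrease, the reference value and an increase of a factor) and an invented response; it solves the normal equations of the least-squares problem directly and reports the coefficient of determination. The data and all names are hypothetical and unrelated to the results of this study, and the original analysis was carried out in MATLAB.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def fit_first_order(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 by least squares; return (beta, r_squared)."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve_linear_system(normal, rhs)
    fitted = [sum(b * value for b, value in zip(beta, r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical 3 x 3 design in coded units with an invented response.
levels = (-1.0, 0.0, 1.0)
x1 = [a for a in levels for _ in levels]
x2 = [b for _ in levels for b in levels]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1]
y = [10.0 + 2.0 * a + 0.5 * b + e for a, b, e in zip(x1, x2, noise)]
beta, r_squared = fit_first_order(x1, x2, y)
print("beta =", [round(b, 3) for b in beta], "R^2 =", round(r_squared, 4))
```

Coding both factors to the same range makes the magnitudes of the two slope coefficients directly comparable.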

Results and Discussion
The following results provide insights into the roles of cell-specific growth rate (in the following growth rate) and cell-specific production rate (in the following production rate) in the cell expansion process (inoculum train) and the final production scale of operation using a model-based simulation approach. The following variables are considered: duration of the inoculum train; the volumetric titer in production; and the overall process productivity, defined by the ratio of volumetric titer in production to the overall cultivation time, including inoculum train.
In the first step (compare to Figure 2), data of a stability study of five clonal CHO cell lines were analyzed concerning growth rates and production rates. Can statistically significant differences be observed between these five cell populations?
An implemented and tested mechanistic cell culture model was adapted to further exploit the experimental data of one of these populations. Modeling and parameter estimations based on new experiments at 10, 50 and 500 mL were performed. This model was then used for further theoretical considerations.
Uncertainty-based simulations of inoculum train and production scale were performed for three clonal cell lines with established differences of growth and production rates.
Finally, a study was performed for several combinations of growth rate and production rate, showing the impacts of these differences on cultivation time and overall process productivity.

Analysis of Variabilities in Growth Rate and Production Rate for Five Clonal Cell Lines
Experimental data from a 3-month stability study were used to calculate growth and production rates for each clonal population. The clonal populations were subcultivated every 3 or 4 days during a time period of 13 weeks (see Section 2.1). The averaged empirical growth rates for two measurements of viable cell density X_v,i and X_v,i+1 (at the beginning and at the end of a subcultivation) have been calculated according to Equation (1). The averaged empirical production rate between the beginning of a subcultivation and 4 days later was calculated according to Equation (2) for every second subcultivation (volumetric titers were only determined for the 4-day subcultivations).
The obtained average growth and production rates are illustrated in Figure 3 over every second subcultivation. The corresponding distributions can be found in the appendix (see Figures A2 and A3).
Figure 3. Growth rates µ_emp (a) and production rates q_titer,emp (b) for clonal populations "clone 1" to "clone 5" for every second subcultivation in 50 mL OrbShake tubes.
Mean, standard deviation (sd), coefficient of variation (cv) and maximum (max) are listed for both quantities, growth rate and production rate, for all five populations in Table 3.
Table 3. Mean, standard deviation (sd), coefficient of variation (cv) and maximum of empirical growth rate µ_emp and mean, standard deviation, coefficient of variation and maximum of empirical production rate q_titer,emp for "clone 1" to "clone 5" based on data from a 13-week stability study.
Clone 5 showed the highest growth rate (a mean of 0.030 h^−1, a cv of 7% and a maximum value of 0.035 h^−1) and clone 2 the highest production rate (a mean of 17.4 · 10^−10 mg cell^−1 h^−1, a cv of 5.8% and a maximum value of 19.6 · 10^−10 mg cell^−1 h^−1) (see also Figure 3).
However, to identify which clones differ from each other in terms of averaged growth rates and production rates, the variations of the calculated rates have to be considered as well. To decide whether several clonal populations differ significantly in their means, an analysis of variance adapted for heterogeneous variances and post hoc tests (multiple comparison) were performed according to the statistical procedure described in Section 2.5. To test for variance homogeneity, the Bartlett test was applied, and the result (p-value = 0.019 for µ_emp,max, p-value = 0.021 for q_titer,emp,max) indicates heterogeneous variances. Hence, the Brown and Forsythe F-test [20] was applied. The results (p-value = 2.9 · 10^−11 for µ_emp,max and p-value = 5.0 · 10^−27 for q_titer,emp,max) show that statistically significant differences (at the 5% level) exist in both cases. The results of the post hoc tests, which identify differences between individual groups, are listed in Table 4. Most populations show statistically significant differences between each other (p-values < 0.05), except clone 2 compared to clone 3 concerning growth rate (p-value = 1), and clone 4 compared to clone 5 concerning both growth rate (p-value = 0.19) and production rate (p-value = 1). The biggest difference in terms of growth rate has been found between clone 5 and clone 3, with a difference of 0.0041 h^−1 (see Table 4, row 1, columns 1-3). A positive value in column 2 means that the left clone in column 1 has a higher µ_emp,max than the right clone in column 1.
Clone 2 has a significantly higher specific productivity than any of the other clones (see Table 4, rows 1 to 4, columns 4-6). All differences between clone 2 and the compared clone are positive and statistically significant (p-values < 0.05). Next in rank is clone 1, with significantly higher production rates than clones 3, 4 and 5 (see Table 4, rows 5, 8 and 9 in columns 4-6).
To investigate whether a theoretical clonal cell population showing high growth rates is more favorable than cell lines with higher production rates, clonal populations (here referred to generically as clone A and clone B) are considered which are inversely related to each other. This means that the following criteria are fulfilled:
• The averaged empirical growth rate of clone A, µ_emp,A, is statistically significantly higher than the averaged empirical growth rate of clone B, µ_emp,B, i.e., µ_emp,A > µ_emp,B.
• The averaged empirical production rate of clone A, q_titer,emp,A, is statistically significantly lower than the averaged empirical production rate of clone B, q_titer,emp,B, i.e., q_titer,emp,A < q_titer,emp,B.
This holds for the comparisons "clone 1 vs. clone 5" and "clone 2 vs. clone 5" of Section 3.1. Therefore, the differences between these clones in terms of growth rate and production rate (highlighted in bold font in Table 4) are considered in the following.
It should be noted that the averaged growth and production rates differ from the model parameters maximum growth rate µ_max and maximum production rate q_titer,max used within a cell culture model. For this reason, the presented findings regarding differences between clonal populations have also been calculated on a percentage basis, to keep the same ratios within the simulation-based investigations. The empirical growth rates of clone 1 and clone 2 were approximately 7.6% and 10% lower than that of clone 5, respectively, whereas their empirical production rates were 10.5% and 74% higher than that of clone 5, respectively.
In order to know how these clones would behave in a typical cell expansion process (from vial to production vessel) and at the final production phase, a representation was created which is explained in the following section. Growth rates and production rates are assumed to remain the same at the larger scales of operation.
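Before turning to the model-based simulation, a rough back-of-the-envelope sketch may help to see why the growth rate matters for the expansion time. It assumes idealized, unlimited exponential growth, transfer of the complete culture into the next vessel and the same cell density at each transfer, so that the time per passage is ln(volume ratio)/µ. These are simplifying assumptions that the mechanistic model does not make; the volume sequence is the one used for the simulated upstream process in this study, and the reference growth rate is a hypothetical value.

```python
import math

# Volume sequence of the simulated upstream process in liters.
VOLUMES_L = [0.010, 0.120, 1.5, 15.0, 100.0, 1000.0]


def expansion_time_h(mu, volumes=VOLUMES_L):
    """Total regrowth time [h] over all transfers under idealized
    exponential growth with full-volume transfer (ASSUMPTION)."""
    return sum(math.log(v_next / v) / mu
               for v, v_next in zip(volumes, volumes[1:]))


mu_reference = 0.04  # 1/h, hypothetical reference value
t_reference = expansion_time_h(mu_reference)
for reduction in (0.076, 0.10):
    t_slow = expansion_time_h(mu_reference * (1.0 - reduction))
    print(f"growth rate {reduction:.1%} lower: "
          f"expansion time x {t_slow / t_reference:.3f}")
```

Under these assumptions the expansion time scales with 1/µ, so a growth rate that is 10% lower lengthens the expansion by a factor of 1/0.9, i.e., by about 11%, independent of the chosen reference value.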

Model Adaption of a Mechanistic Cell Culture Model for Prediction Using Bayesian Parameter Estimation
To display the cell growth behavior of a cell line, a growth model has to be applied and adapted based on experimental data. Since only clone 5 was available for further experiments, cell expansion processes from 5 mL and 10 mL in parallel to 500 mL have been performed at ExcellGene SA for this clone. At 5 and 10 mL scales, viable cell density, viability and volumetric titer have been measured. Calculated growth rates and production rates have been used to define the prior distributions of µ max and q titer,max in the following.
At 500 mL scale, cells were cultivated over a period of 8 days, and substrates (glucose and glutamine) and metabolites (lactate and ammonia) were measured in addition to viable cell density, viability and volumetric titer (to also adapt parameters characterizing substrate uptake and death rate). Based on these experiments, a growth model (see Section 2.2), which had already been applied to other CHO cell lines, was used here while applying Bayesian parameter estimation. This approach consists of the following steps: In the first step, the prior knowledge about model parameters was quantified. In the second step, experimental data were added, and a Markov chain Monte Carlo algorithm was used to find the posterior probability distributions of the model parameters to be estimated. The obtained posterior distributions contained information from prior knowledge and new experimental data.

Prior Knowledge
To quantify the prior probability distributions of model parameters, data from the stability study and data from additional experiments at 5 mL and 10 mL with the same clone, clone 5, have been used in the following way: The maximum growth rate of clone 5 over all subcultivations of the stability study was µ_emp,max = 0.035 h^−1. Additional experiments at 5 and 10 mL scales revealed growth rates of µ_emp,max = 0.046 h^−1 and µ_emp,max = 0.048 h^−1, respectively. The additional experiments provide one measurement per day, allowing the computation of the growth rate per day. The stability study provides data at the beginning and at the end of each subcultivation (with a duration of 3 or 4 days each). Consequently, its maximum growth rate cannot be approximated as precisely as with daily measurements. Nevertheless, it is considered for determination of the prior distribution, but with less weight (1/3) than the approximations from the additional experiments (2/3).
The maximum production rate of clone 5 over all subcultivations of the stability study was q_titer,emp,max = 10.7 · 10^−10 mg cell^−1 h^−1. The maximum production rates of clone 5, based on additional experiments at 5 and 10 mL scales, were q_titer,emp,max = 5.9 · 10^−10 mg cell^−1 h^−1 and q_titer,emp,max = 6.8 · 10^−10 mg cell^−1 h^−1, respectively. The reason for the variation of these values is unknown, but the variation (uncertainty) itself is information that is also included in the prior probability distribution (a higher uncertainty signifies less weight for the prior mean within the parameter estimation process).
Based on this information, mean and variance have been calculated to characterize the prior probability distribution of maximum growth rate µ_max and maximum production rate q_titer,max according to Equation (A1). These are listed in Table 5, including the corresponding coefficient of variation (cv).
Table 5. Prior parameter values for maximum growth rate µ_max (mean, variance and coefficient of variation (cv)) and maximum production rate q_titer,max (mean, variance and coefficient of variation (cv)).
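How a mean and a variance translate into a gamma prior can be sketched by moment matching (shape = mean²/variance, scale = variance/mean). The example uses the three growth rate values quoted above; treating them as three equally weighted observations and using the sample variance are assumptions made here for illustration and need not coincide with Equation (A1) or with the values of Table 5.

```python
import math
import statistics


def gamma_from_moments(mean, variance):
    """Moment matching: gamma shape and scale from mean and variance."""
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale


# Growth rate values quoted in the text [1/h]: stability study, 5 mL, 10 mL.
# Equal weighting of the three values is an ASSUMPTION for this sketch.
mu_values = [0.035, 0.046, 0.048]
prior_mean = statistics.mean(mu_values)
prior_variance = statistics.variance(mu_values)  # sample variance (assumption)
prior_cv = math.sqrt(prior_variance) / prior_mean

shape, scale = gamma_from_moments(prior_mean, prior_variance)
print(f"mean = {prior_mean:.4f} 1/h, cv = {prior_cv:.1%}, "
      f"gamma shape = {shape:.2f}, scale = {scale:.6f}")
```

A larger coefficient of variation yields a smaller shape parameter and thus a flatter prior, which gives the prior mean less weight in the subsequent parameter estimation.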

Posterior Distributions
Bayesian parameter estimation has been performed using a Markov chain Monte Carlo (MCMC) algorithm considering cultivation data (at 500 mL over 8 days) and prior distributions as described in Section 2.7. Measured and simulated time course data are presented in Figure 4. Reasonable agreement between measured and simulated data is achieved with the set of model parameters used, although more experimental data between day 4 and day 8 could have helped to define more precisely when the cells entered the stationary phase.
Prior (before parameter estimation) and posterior (after parameter estimation) distributions are shown in Figure 5. Posterior means of estimated model parameters and values of the fixed model parameters are presented in Table A1. It can be concluded from Figure 4 together with Figure 5 that the parameters µ_max and q_titer,max represent the measured data rather well: their posterior distributions (red solid lines) are much narrower than the prior distributions (blue dashed lines), i.e., uncertainty has been reduced for these model parameters. Furthermore, the means moved slightly to the right in the case of maximum growth rate µ_max and strongly to the left in the case of q_titer,max. Posterior distributions of the remaining model parameters do not differ much from their prior distributions.

Uncertainty-Based Upstream Process Simulation -Comparison of Three Clonal Populations with Different Growth and Production Rates
In this section, the adapted model is applied to perform upstream simulations for three different theoretical cell lines, named A, B and C, under consideration of variabilities observed in Section 3.1. The reference clone A is defined, characterized by the model parameter distributions obtained in the previous section (parameter estimation for clone 5). The two other clones B and C are defined showing a lower growth rate than clone A, but a higher production rate than clone A as listed in Table 6. In order to choose realistic values concerning the differences between clones A, B and C, the differences obtained in Section 3.1 concerning growth rate and production rate have been applied. Empirical growth rates for experimentally analyzed cell lines 5 and 1 showed an averaged difference of 7.6% and for 5 and 2 an averaged

Uncertainty-Based Upstream Process Simulation-Comparison of Three Clonal Populations with Different Growth and Production Rates
In this section, we describe the application of the adapted model perform upstream simulations for three different theoretical cell lines, named A, B and C, under consideration of variabilities observed in Section 3.1. The reference clone A is characterized by the model parameter distributions obtained in the previous section (parameter estimation for clone 5). The two other clones B and C are defined as showing lower growth rates than clone A, but higher production rates than clone A, as listed in Table 6. In order to choose realistic values concerning the differences between clones A, B and C, the differences obtained in Section 3.1 concerning growth rate and production rate have been applied. Empirical growth rates for experimentally analyzed cell lines 5 and 1 showed an averaged difference of 7.6% and for 5 and 2 an averaged difference of 10%.Therefore, model parameter µ max of clone B was chosen to be 7.6% lower than µ max of clone A and µ max of clone C was chosen to be 10% lower than µ max of clone A.
Empirical production rates q titer,max showed an averaged difference of 10.5% between cell lines 5 and 1 and of 74% between cell lines 5 and 2. Therefore, model parameter q titer,max of clone B was chosen to be 10.5% higher than q titer,max of clone A, and q titer,max of clone C was chosen to be 74% higher than q titer,max of clone A. A suitable inoculum train protocol was defined for each clonal cell line. Furthermore, these simulations were used to investigate and illustrate the impact of differences in growth and production rates between the three clones on the duration of the inoculum train, the volumetric titer in production and the overall process productivity for a batch process.
Table 2 based on experimental data from cultivation in 500 mL (TubeSpin 600™ bioreactor) with a cultivation time of 8 days.

Table 6. Model parameters maximum growth rate µ max and maximum production rate q titer,max of the three clones used for the simulation.

Clone | Remark | µ max [h −1 ] | q titer,max [1 · 10 −10 mg cell −1 h −1 ]

To represent the upstream process digitally, the following inputs were defined:

Volumes: The simulated upstream process consisted of six scales of operation with the following volumes: 10 mL → 120 mL → 1.5 L → 15 L → 100 L → 1000 L (production). This setup enabled the use of a highly similar type of cultivation approach (orbital shaking) at all scales. Note: ExcellGene has considerable experience with scale-up cultures in both orbital shaken and standard stirred systems, which gives sufficient confidence in the matching impacts of critical parameters in both approaches (not published).

Passaging strategy: Cells were passaged, i.e., subcultivated, as soon as the cell density required for transfer was reached, using the predicted viable cell density. The required cell biomass was based on the optimal cell density for inoculation of 5 · 10 5 cells mL −1 .

Initial concentrations: viable cell density, X v,0 = 5.3 · 10 5 cells mL −1 ; viability = 100%; glucose, c Glc,0 = 32.6 mmol L −1 ; glutamine, c Gln,0 = 3.3 mmol L −1 ; lactate, c Lac,0 = 0.001 mmol L −1 ; ammonia, c Amm,0 = 2.6 mmol L −1 ; titer, c titer,0 = 0 mg L −1 ; and volume, V = 0.01 L. Furthermore, a limiting substrate was assumed to have the initial value c LS,0 = 2 mmol L −1 .
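As a rough illustration of the passaging inputs listed above, the following sketch computes the viable cell density each scale would have to reach so that it can seed the next scale at the stated inoculation density. It assumes that the whole culture volume is transferred and diluted with fresh medium to the next working volume; this transfer rule and the function name are assumptions made for illustration and are not taken from the study.

```python
# Seed train volumes in litres, as listed in the text (10 mL up to 1000 L).
volumes_l = [0.01, 0.12, 1.5, 15.0, 100.0, 1000.0]
# Target inoculation density in cells per mL, as stated in the text.
seed_density = 5e5


def required_transfer_densities(volumes, seed):
    """Density each stage must reach so that its full volume seeds the next stage.

    Assumption (illustrative only): the complete culture is transferred and
    topped up with fresh medium to the next working volume.
    """
    return [seed * nxt / cur for cur, nxt in zip(volumes, volumes[1:])]


densities = required_transfer_densities(volumes_l, seed_density)
for cur, nxt, dens in zip(volumes_l, volumes_l[1:], densities):
    print(f"{cur:g} L -> {nxt:g} L: transfer at about {dens:.2e} cells/mL")
```

Under this simplified transfer rule, every required density stays below 1 · 10 7 cells mL −1 .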
The corresponding simulated time profiles for trends in viable cell density and titer are presented in Figure 6.
The designed inoculum trains proved suitable for cell expansion of all three clonal populations. The required inoculation cell densities were met, cells did not enter the stationary phase during the inoculum train, and transfer cell densities were within an acceptable range (maximum cell density below 1 · 10 7 cells mL −1 ).
The durations of the inoculum train cultures ranged from 298 to 333 h. As expected, lower growth rates cause longer cultivation times: clone B needed 24 h and clone C 35 h more than clone A. Concerning the predicted volumetric titers in the production vessel, the 10.5% higher production rate of clone B resulted in a 13% higher volumetric titer than that of clone A during the first hours of the production phase. However, this difference shrank over time: after 25 h, clone A reached 13 mg L −1 and clone B 15 mg L −1 . Between 50 and 100 h in the production vessel, clone A compensated for this disadvantage through its 7.6% higher growth rate. After 100 h, clone A presented a titer of 222 mg L −1 , 7% more than clone B with 207 mg L −1 . Nevertheless, after 168 h (7 days) clone B reached a higher volumetric titer (558 mg L −1 ) than clone A (539 mg L −1 ). This was because the higher growth rate of clone A led to an earlier onset of the death phase (here in batch mode) compared to clone B. Putting the volumetric titer in relation to the overall cultivation time and accepting an overall error of about 10%, both clones led to a similar overall process productivity (1.16 mg L −1 h −1 for clone A and 1.14 mg L −1 h −1 for clone B).
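The longer inoculum train durations of clones B and C can be cross-checked with a back-of-the-envelope calculation. The sketch below assumes pure exponential growth, so that the time needed to reach a given expansion factor scales inversely with the growth rate. This is a deliberate simplification of the mechanistic model used in the study, which also accounts for substrate and metabolite effects, so the numbers only approximate the simulated 322 h and 333 h.

```python
# Inoculum train duration of reference clone A in hours, from the text.
reference_duration_h = 298.0


def scaled_duration(reference_h, relative_growth_rate):
    """Duration for the same total expansion at a scaled growth rate.

    Assumes pure exponential growth: t = ln(expansion factor) / mu,
    so t is inversely proportional to mu.
    """
    return reference_h / relative_growth_rate


# Growth rates of clones B and C relative to clone A (7.6% and 10% lower).
clone_b_h = scaled_duration(reference_duration_h, 1.0 - 0.076)
clone_c_h = scaled_duration(reference_duration_h, 1.0 - 0.10)

print(f"Clone B: about {clone_b_h:.0f} h, clone C: about {clone_c_h:.0f} h")
```

The estimate lands within a few hours of the simulated durations.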
A clearer impact was observed for clone C, which had a 10% lower maximum growth rate combined with a 74% higher production rate than clone A. Already after the first 25 h in the production vessel, clone C reached 25 mg L −1 on average (clones A and B only 13 and 15 mg L −1 , respectively), and after 7 days (168 h) clone C reached 876 mg L −1 (clones A and B only 539 and 558 mg L −1 , respectively). This is an increase of 337 mg L −1 (62.5% of the volumetric titer generated with clone A).

Figure 6. Simulated viable cell density (VCD) and volumetric titer over inoculum train cultures (5 scales) and production scale (1000 L) for clones A (top), B (middle) and C (bottom). The maximum growth rate of clone B was 7.6% lower than that of clone A, and the maximum cell-specific production rate of clone B was 10.5% higher than that of clone A. The maximum growth rate of clone C was 10% lower than that of clone A, and the maximum cell-specific production rate of clone C was 74% higher than that of clone A.
It should be noted that the presented model-based method can be further extended to fed-batch processes, which are most frequently applied in industrial large-scale manufacturing, or to perfusion mode. However, the batch process was considered in this study because the focus was not to find an optimal operating mode for the production bioreactor, but rather to consider how phenotypic differences affect cell growth in the inoculum train, which contributes significantly to the manufacturing time and overall process productivity, yet is rarely considered in the literature [26].
A comparison of different CHO host cell lines for batch, fed-batch and perfusion modes was recently reported in [6]. The authors found that differences in phenotypic properties affect cell growth and productivity regardless of process mode (batch, fed-batch or perfusion) or cell culture media.
For a better illustration, Figure 7 shows how variabilities in model parameters µ max and q titer,max propagate onto the output uncertainty in the form of probability distributions (histograms) at each point in time of interest. It thus becomes visible how the output distribution changes over time. In accordance with the results presented in Figure 6, there is no large difference in volumetric titer between the distributions of clone A and clone B, neither after 25 h (Figure 7c), 100 h (Figure 7d) nor 168 h (Figure 7e) of production. The distributions almost overlap, although after 100 h clone A shows a higher mean than clone B, as described above. The distribution of clone C, in contrast, differs clearly from those of clones A and B (small overlap) after 25 h in production. After 100 h of production there are larger overlapping areas between all three clones, indicating a decline in the differences between them. However, after 168 h (7 days) of production, a clear difference (smaller overlap) is visible between clone C and clones A and B, whereas the distributions of clones A and B overlap almost completely. In this case, and under the assumptions of equal stability and quality, clone C would be the recommended clone for moving forward.
Nevertheless, two or more clonal populations may differ from each other in their phenotypic characteristics in proportions other than those of the three clones discussed here. The following section addresses such cases.

Impacts of Differences in Growth and Production Rates on Inoculum Train and Titer at Production Scale-General Considerations and a Decision Criterion
To judge the effects of growth rate and specific productivity in numerous clonal cell populations, one needs to know the resulting overall process productivities (=Space-Time-Yield: volumetric titer in production/overall cultivation time, including inoculum train).
We determined these effects for a realistic cell expansion setup, based on model parameter ranges derived from the previous sections (µ max = 0.397 h −1 ± 10%, q titer,max = 5 · 10 −10 mg cell −1 h −1 ± 10%). For each parameter combination, the two response quantities (volumetric titer in production and overall process productivity) were obtained by upstream simulations as before. Response surfaces were then fitted to these results (see Figures A4-A6); they visualize the effects of maximum growth rate and maximum production rate on each response quantity.
Multiple linear regression was performed for the responses after 50, 100 and 168 h in the production vessel. Because of their different orders of magnitude, all variables were scaled (transformed) to the range [0, 1]. The results are presented in Table 7.

Table 7. Results of a multiple linear regression (R 2 , β-coefficients, standard error (SE) and p-value) in two variables, maximum growth rate µ max (coefficient β 1 ) and maximum production rate q titer,max (coefficient β 2 ) at 50, 100 and 168 h of production. Response values are volumetric titer in production and overall process productivity (=Space-Time-Yield: volumetric titer in production/overall cultivation time, including inoculum train).

All regressions have an R 2 -value very close to one, meaning that the applied model is suitable to represent the correlation between factors and response variables. All determined β-coefficients, which describe the correlation of µ max (β 1 ) and q titer,max (β 2 ) with the investigated response variable, show p-values below 0.05 (meaning that they are statistically significant at the 5% level). Due to the scaling of both factors, the β-coefficients stayed within the range of 0 to 1. It is interesting to see that the impact of, and the relation between, both factors, µ max and q titer,max , varies depending on which point in time in production is considered.

Regarding volumetric titer as a response variable, it can be observed that after 50 h in the production vessel, the impact of q titer,max (β 2 = 0.57) was 1.4 times higher than the impact of µ max (β 1 = 0.42). After 100 h in production, this changed: the impact of µ max (β 1 = 0.69) was then 2.2 times higher than the impact of q titer,max (β 2 = 0.31). However, after 168 h (7 days) in production, the impact of q titer,max (β 2 = 0.67) was again higher, about twice that of µ max (β 1 = 0.33). The decreasing impact of µ max after 168 h can be explained by cells having probably entered the stationary/death phase (batch mode is assumed here) while still producing product.
Considering overall process productivity, µ max has a higher impact than q titer,max , regardless of the considered point in time (see β-coefficients for the overall process productivity in Table 7). It was 1.6 times higher after 50 h, three times higher after 100 h and 1.17 times higher after 168 h cultivation time in the production vessel compared to q titer,max . Obviously, therefore, growth rates have a higher impact on the output per year.
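The regression procedure described above (min-max scaling of all variables to [0, 1], followed by a linear fit in two factors) can be sketched as follows. The response used here is a hypothetical toy function standing in for the simulated titer, and the grid of relative parameter values is likewise an assumption for illustration; neither reproduces the data behind Table 7. The helper names are illustrative, and the fit is done with ordinary least squares via the normal equations using only the Python standard library.

```python
from itertools import product
from math import exp


def min_max(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_two_factors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (coefficients, R^2)."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
    beta = solve(xtx, xty)
    pred = [sum(b * c for b, c in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((v - p) ** 2 for v, p in zip(y, pred))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical 5 x 5 grid of relative parameter values (+/- 10% around 1.0).
levels = [0.90, 0.95, 1.00, 1.05, 1.10]
grid = list(product(levels, levels))
mu_rel = [g[0] for g in grid]
q_rel = [g[1] for g in grid]
# Toy response (NOT the study's model): production rate times exponential growth.
response = [q * exp(4.0 * mu) for mu, q in grid]

beta, r_squared = fit_two_factors(min_max(mu_rel), min_max(q_rel), min_max(response))
print(f"beta_1 = {beta[1]:.2f}, beta_2 = {beta[2]:.2f}, R^2 = {r_squared:.3f}")
```

Because all variables are scaled to the same range, the ratio of the two fitted coefficients can be read as the relative impact of the two factors, which is how the β 1 and β 2 values are compared in the text.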
This regression analysis was performed within a range of 0.397 h −1 ± 10% for µ max and 5 · 10 −10 mg cell −1 h −1 ± 10% for q titer,max . It should be noted, however, that the results of the stability study showed a higher variation of the production rate than of the growth rate. Therefore, q titer,max was additionally varied by ± 50%, and response surfaces for both response variables, volumetric titer and overall process productivity, were estimated as shown in Figure 8.

Figure 8. Response surface for volumetric titer after 168 h (7 days) of production over maximum production rate and maximum growth rate (a). Overall process productivity (=Space-Time-Yield: volumetric titer after 168 h (7 days) in production/overall cultivation time, including inoculum train) over maximum production rate and maximum growth rate (b). The reference clone A (red solid line) and the two compared clones, clone B (green dashed line, 7.6% lower µ max and 10.5% higher q titer,max ) and clone C (orange dotted line, 10% lower µ max and 74% higher q titer,max ), are placed in the graphs.

Table 8. Results, volumetric titer and overall process productivity (=Space-Time-Yield: volumetric titer after 168 h (7 days) in production/overall cultivation time, including inoculum train) for reference clone A and the two clones to be compared, clone B and clone C. These clones differ in terms of maximum growth rates and maximum production rates, as listed in the corresponding rows.

Choosing a clonal population with a lower growth rate but a higher production rate will only be favorable if the specific productivity is high enough. As stated before, and summarized in Table 8, clones A and B delivered very similar process productivities (1.16 mg L −1 h −1 for clone A and 1.14 mg L −1 h −1 for clone B). The final titers for clones A and B were 539 and 558 mg L −1 , respectively, a negligible difference.
With the 74% higher production rate of clone C, a more pronounced difference is obtained: the titer reaches 876 mg L −1 after 168 h, and the overall process productivity of 1.75 mg L −1 h −1 is clearly higher than that of clones A and B.
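The overall process productivities quoted above follow directly from the definition of the Space-Time-Yield (volumetric titer divided by the overall cultivation time, including the inoculum train). The short sketch below recomputes them from the inoculum train durations and final titers reported in this study; the dictionary layout and function name are chosen for illustration only.

```python
# Inoculum train durations (h) and titers after 168 h (mg/L), as reported in the text.
clones = {
    "A": {"train_h": 298.0, "titer_mg_per_l": 539.0},
    "B": {"train_h": 322.0, "titer_mg_per_l": 558.0},
    "C": {"train_h": 333.0, "titer_mg_per_l": 876.0},
}
production_h = 168.0  # duration of the batch production phase (7 days)


def space_time_yield(titer_mg_per_l, train_h, production_h):
    """Volumetric titer divided by overall cultivation time (train + production)."""
    return titer_mg_per_l / (train_h + production_h)


sty = {
    name: space_time_yield(c["titer_mg_per_l"], c["train_h"], production_h)
    for name, c in clones.items()
}
for name, value in sty.items():
    print(f"Clone {name}: {value:.2f} mg/L/h")
```

The computed values round to 1.16, 1.14 and 1.75 mg L −1 h −1 for clones A, B and C, matching the figures given in the text.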

The response surface models can be used, therefore, to approximate volumetric titer and overall process productivity for a realistic combination of growth rate and production rate, and help with the decision processes. These simulations can be used to determine to what extent growth rate or production rate must differ to cause a difference of at least 5% in the response variables.

Conclusions
A model-based approach in combination with statistical methods was applied to study the impacts of the cell-specific growth rate ("growth rate") and the cell-specific production rate ("production rate" or "specific productivity") on overall product yield, using quantities such as the time needed for the inoculum train cultures, the volumetric titer during the final production phase and the overall process productivity (=Space-Time-Yield: volumetric titer in production/overall cultivation time, including inoculum train). For three theoretical clonal populations, an inoculum train protocol suitable for all of them was defined. Cell line A served as the reference; cell line B had a 7.6% lower maximum growth rate and a 10.5% higher production rate than cell line A, and cell line C had a 10% lower maximum growth rate and a 74% higher production rate than cell line A. For all three cell lines, a prediction model of an inoculum train, including predictive uncertainty arising from model parametric uncertainty (due to biological variabilities), was utilized. For cell line A (higher µ max ) the inoculum train would take 298 h until inoculation of cells into the production bioreactor (1000 L), for B (higher q titer,max ) it would take 322 h and for C (higher q titer,max ) 333 h. Cell line A would generate a volumetric titer of approximately 539 mg L −1 after 168 h in the final production vessel, B would result in 558 mg L −1 and C would result in 876 mg L −1 , assuming a batch process.
Moreover, response surface modeling was applied to quantify the effects of both parameters on volumetric titer and overall process productivity at specific points in time in production. Based on the results of a simulation using mathematical process models in combination with statistical methods, decision criteria can be provided that help to evaluate different clonal cell lines for future manufacturing purposes. This can be seen as a support tool in addition to the characterization of biochemical, biophysical and functional properties to assess the quality of the final product. Assuming little or no quality differences in the products obtained in cell culture, the growth rate of a clonal cell population has the higher impact (up to three times) on the overall process productivity, and thus on the output per year, whereas clones with higher production rates have the potential to generate significantly higher volumetric titers in production.
It has not escaped our attention that modern processes in large-scale manufacturing are most frequently fed-batch processes with production run times exceeding the 7-day batch processes discussed herein. These fed-batch processes can increase volumetric titers quite dramatically. Nevertheless, the batch process evaluation is a good first step to obtain quick results. The evaluation can be expanded to include fed-batch processes once a smaller number of clonally derived cell lines has been chosen. Moreover, shorter batch processes have certain advantages for some products, for example, reducing the negative impacts of certain losses in production campaigns, such as contaminations or disruptions from instrument failures. Thus, the preferred mode for the most efficient use of a given manufacturing facility would be to shorten overall production time phases (in the largest bioreactor) while maximizing growth in inoculum train cultures and to achieve the highest maximal density in the so-called N-1 cultures (i.e., the culture preceding the production vessel). The authors of this article hope to have provided a useful discussion of the complex relationships between different phenotypes of CHO cells, particularly those that have major impacts on overall productivity in manufacturing.

To perform Bayesian parameter estimation, prior knowledge has to be quantified in the form of probability distributions. An appropriate type of distribution has to be chosen in accordance with the available knowledge. In this contribution, a gamma distribution was chosen to describe the probability distribution of model parameters. This assumption is based on the fact that the considered random variables can only adopt positive values; furthermore, the gamma distribution is well suited for representing the realistic range based on the available prior knowledge. It is defined by the parameters α (shape) and λ (rate).
To characterize the individual distribution of a variable Y (here µ max or q titer,max ), the corresponding mean (E(Y)) and variance (V(Y)) are used to compute the distribution parameters shape (α) and rate (λ), according to:

$$\alpha = \frac{E(Y)^2}{V(Y)}, \qquad \lambda = \frac{E(Y)}{V(Y)}$$

Figure A6. Response surfaces showing the impacts of maximum growth rate and maximum production rate on volumetric titer and overall process productivity after 168 h (7 days) of production.

Parameter | Unit | Value | Description
… | … | … | Maximum cell-specific glutamine uptake rate
k Gln | mmol L −1 | 0.5 | Monod kinetic constant for glutamine uptake
… | … | … | Cell-specific maximum ammonia uptake rate
K Amm | - | 1.9 (fixed) | Correction factor for ammonia uptake
q titer,max | mg cell −1 h −1 | 3.9 · 10 −10 | Cell-specific maximum production rate
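The conversion from a prior mean and variance to the gamma parameters α (shape) and λ (rate) can be illustrated with a short sketch. It relies on the standard gamma relations E(Y) = α/λ and V(Y) = α/λ². The numerical mean and standard deviation below are hypothetical placeholders and are not the priors used in this study; the function name is likewise illustrative.

```python
import random


def gamma_shape_rate(mean, variance):
    """Moment matching for a gamma distribution: returns (shape alpha, rate lambda).

    Uses E(Y) = alpha / lambda and V(Y) = alpha / lambda**2.
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    return mean ** 2 / variance, mean / variance


# Hypothetical prior: mean 0.04 with a standard deviation of 10% of the mean.
prior_mean = 0.04
prior_variance = (0.1 * prior_mean) ** 2
alpha, lam = gamma_shape_rate(prior_mean, prior_variance)

# random.gammavariate expects shape and SCALE, where scale = 1 / rate.
rng = random.Random(0)
samples = [rng.gammavariate(alpha, 1.0 / lam) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)

print(f"alpha = {alpha:.1f}, lambda = {lam:.1f}, sample mean = {sample_mean:.4f}")
```

All draws are positive by construction, which is the property that motivated the choice of the gamma distribution for these model parameters.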