Automated Conditional Screening of Multiple Escherichia coli Strains in Parallel Adaptive Fed-Batch Cultivations

In bioprocess development, the host and the genetic construct for a new biomanufacturing process are selected in the early developmental stages. This decision, made at the screening scale with very limited information about the performance in larger reactors, has a major influence on the efficiency of the final process. To overcome this, scale-down approaches during screenings that show the real cell factory performance at industrial-like conditions are essential. We present a fully automated robotic facility with 24 parallel mini-bioreactors that is operated by a model-based adaptive input design framework for the characterization of clone libraries under scale-down conditions. The cultivation operation strategies are computed and continuously refined based on a macro-kinetic growth model that is continuously re-fitted to the available experimental data. The added value of the approach is demonstrated with 24 parallel fed-batch cultivations in a mini-bioreactor system with eight different Escherichia coli strains in triplicate. The 24 fed-batch cultivations were run under the desired conditions, generating sufficient information to define the fastest-growing strain in an environment with oscillating glucose concentrations similar to industrial-scale bioreactors.


Introduction
Emerging technologies in robotic biolaboratories open new opportunities for both high-throughput (HT) screening and HT bioprocess development. On the screening side, significant progress has been made in terms of cultivation scale (down to femtoliter), parallelization and non-invasive observation, which have been widely reviewed [1][2][3]. The focus of this work is conditional screening, where a reduced number of candidate clones are tested under different conditions with the aim to significantly improve the performance at an industrial scale (e.g., media, pH and temperature profiles, bioreactor heterogeneities, induction and feeding strategies). These factors are known to affect the underlying nonlinear dynamics of the bioprocess and are part of the very complex time-dependent interaction between the bioreactor environment and the cell. This highly nonlinear behavior makes it difficult to predict the effect of changes in the cultivating conditions and is responsible for the high failure rate in scale-up [4][5][6]. In order to overcome these challenges, experiments in conditional screening require highly advanced experimental setups able to: (i) operate as similar as possible to the industrial strategy (e.g., fed-batch or continuous cultivations), (ii) mimic the harsh conditions of Figure 1. Illustration of the model calibration cycle in the adaptive framework for conditional screening experiments. The cultivation of the clones is performed on the cultivation and analysis platform (consisting of two liquid handling stations and a mini-bioreactor system); samples are collected and autonomously analyzed. The generated online and at-line measurements are sent to the central data storage (database). The model calibration cycle starts with the collection of all available data. Based on the measurements, the sensitivity analysis is performed; based on the results, the identifiable parameters are selected, and non-identifiable parameters are not considered/fixed in the subsequent parameter estimation. In the parameter estimation, the identifiable parameter subset is adjusted to fit the model to the measurements. Based on the calibrated model, the feed is calculated during the feed calculation step, according to previously defined criteria. The feed is further converted into corresponding pulses with individual times. These time/pulse setpoints are stored in the database and executed directly by the cultivation and analysis platform.

HTBD Facility
The high-throughput bioprocess development facility is composed of two liquid handling stations (LHS, Freedom Evo 200, Tecan, Switzerland; Microlab Star, Hamilton, Switzerland) and a mini-bioreactor system (48 BioReactor, 2mag AG, Munich, Germany), which is mounted on the Tecan LHS. Both LHSs are connected at the hardware and software level to exchange samples, process and measurement information (Figure 2). The process control (e.g., feed, pH control and volume balance) is carried out by the LHSs in a pulsed based manner. A detailed description of the used hardware and software framework is given in Haby et al. 2019 [17]. The model calibration cycle starts with the collection of all available data. Based on the measurements, the sensitivity analysis is performed; based on the results, the identifiable parameters are selected, and non-identifiable parameters are not considered/fixed in the subsequent parameter estimation. In the parameter estimation, the identifiable parameter subset is adjusted to fit the model to the measurements. Based on the calibrated model, the feed is calculated during the f eed calculation step, according to previously defined criteria. The feed is further converted into corresponding pulses with individual times. These time/pulse setpoints are stored in the database and executed directly by the cultivation and analysis platform.

HTBD Facility
The high-throughput bioprocess development facility is composed of two liquid handling stations (LHS, Freedom Evo 200, Tecan, Switzerland; Microlab Star, Hamilton, Switzerland) and a mini-bioreactor system (48 BioReactor, 2mag AG, Munich, Germany), which is mounted on the Tecan LHS. Both LHSs are connected at the hardware and software level to exchange samples, process and measurement information (Figure 2). The process control (e.g., feed, pH control and volume balance) is carried out by the LHSs in a pulsed based manner. A detailed description of the used hardware and software framework is given in Haby et al. 2019 [17].

Figure 2.
Schematic description of the high-throughput bioprocess development facility used in this study. Composition of the two liquid handling stations (LHS, 1 and 2) and a mini-bioreactor system (3). The liquid handling stations are connected by a linear transfer unit (4) for automatic sample exchange. The facility is surrounded by supporting laboratory equipment (5)(6)(7)(8), all accessible by one of the liquid handling stations.

Cultivation
Precultures were performed with EnPresso B (Enpresso GmbH, Berlin, Germany) medium with 9 U L −1 Reagent A at 37 °C in a 24-multi-well Oxodish plate to keep the cells in the growth phase (PreSens GmbH, Regensburg Germany). The main culture was started as a batch culture at 37 °C with 5 g L −1 glucose. The initial batch phase was prolonged after 1 h by an additional feed pulse to a final concentration of 5 g L −1 glucose. The stirrer speed was kept constant at 3000 rpm. After the end of the batch phase, a fed-batch was started with a pulse-based glucose feeding every 5 min of feed solution 400 g L −1 glucose dissolved in deionized water. The feeding rate was increased exponentially and switched to a constant feed when the maximum pulse volume of 22 µL was reached. In total, the cultivations were carried out over 8 h with fed-batch phases of 5.4 to 6.1 h, depending on the length of the clone-specific batch phases. The µset for the exponential feed was chosen to be 50% of the modelpredicted µmax value and was adapted in every modelling cycle for each clone. The volume of the feed pulses was determined on the basis of the calculated feed rate. All experiments were carried out as biological triplicates.

Sampling and Analytic
During the cultivations, pH and dissolved oxygen tension (DOT) were measured online in the mini-bioreactor system. Each column of the bioreactor system was sampled every 45 min in a sequential mode with a sampling interval of 15 min. Samples were inactivated directly with NaOH and stored in 96-well plates at 4 °C on the deck of the LHS until further processing [20]. After 5 samplings, the sampling plates were automatically transferred to the Hamilton LHS for OD600, glucose and acetate measurements in 96-well plates [17]. For the OD600 measurements, the samples were diluted to remain in the linear range. The dilution factor was adjusted between 20 and 100 over the course of the cultivation process. All OD600 values were multiplied by a correction factor of 2.62 to convert the values to a liquid height of 1 cm. Based on the OD600 measurements, the dry cell weight The liquid handling stations are connected by a linear transfer unit (4) for automatic sample exchange. The facility is surrounded by supporting laboratory equipment (5)(6)(7)(8), all accessible by one of the liquid handling stations.

Cultivation
Precultures were performed with EnPresso B (Enpresso GmbH, Berlin, Germany) medium with 9 U L −1 Reagent A at 37 • C in a 24-multi-well Oxodish plate to keep the cells in the growth phase (PreSens GmbH, Regensburg Germany). The main culture was started as a batch culture at 37 • C with 5 g L −1 glucose. The initial batch phase was prolonged after 1 h by an additional feed pulse to a final concentration of 5 g L −1 glucose. The stirrer speed was kept constant at 3000 rpm. After the end of the batch phase, a fed-batch was started with a pulse-based glucose feeding every 5 min of feed solution 400 g L −1 glucose dissolved in deionized water. The feeding rate was increased exponentially and switched to a constant feed when the maximum pulse volume of 22 µL was reached. In total, the cultivations were carried out over 8 h with fed-batch phases of 5.4 to 6.1 h, depending on the length of the clone-specific batch phases. The µ set for the exponential feed was chosen to be 50% of the model-predicted µ max value and was adapted in every modelling cycle for each clone. The volume of the feed pulses was determined on the basis of the calculated feed rate. All experiments were carried out as biological triplicates.

Sampling and Analytic
During the cultivations, pH and dissolved oxygen tension (DOT) were measured online in the mini-bioreactor system. Each column of the bioreactor system was sampled every 45 min in a sequential mode with a sampling interval of 15 min. Samples were inactivated directly with NaOH and stored in 96-well plates at 4 • C on the deck of the LHS until further processing [20]. After 5 samplings, the sampling plates were automatically transferred to the Hamilton LHS for OD 600 , glucose and acetate measurements in 96-well plates [17]. For the OD 600 measurements, the samples were diluted to remain in the linear range. The dilution factor was adjusted between 20 and 100 over the course of the cultivation process. All OD 600 values were multiplied by a correction factor of 2.62 to convert the values to a liquid height of 1 cm. Based on the OD 600 measurements, the dry cell weight of the biomass was calculated by multiplying the OD 600 by 0.33 [26]. Due to the time-consuming sampling and analysis procedure, the values for biomass, glucose and acetate were written to the database with a delay of 0.25-1.35 h for the biomass and 0.66-2 h for glucose and acetate, depending on the column of the bioreactor system where the sample was taken.
During the eight hours of cultivation, for each reactor, 1440 values for DOT and pH were collected, as well as 23 samples for biomass (OD 600 ), 20 for glucose and 20 for acetate measurements. This yields, in total, for each reactor, 1503 data points. Considering three replicates, the size of the parameter sensitivity matrix is (1503 x 18) * 3 (measurements x parameter) * replicates).

Strains
The strains used in this study are E.

Computational Methods
The E. coli macro-kinetic growth model [25] consists of 5 ordinary differential equations, describing biomass, glucose, acetate, oxygen and enzymatic glucose release, and represents the major extracellular dynamics of E. coli, including the acetic acid overflow. The model contains 18 parameters, from which 13 have been shown to vary with clones and cultivation conditions. All computational methods related to the model calibration and feed calculation were carried out in MATLAB (The MathWorks, Inc., Natick, Massachusetts, USA), available at https://gitlab.tu-berlin.de/hts_modelling/ModellingFramework. The commit used for this study is efaee5eba813237860264fc33ba79315eef5bbca. Cultivation time and data for the different sequential tasks are summarized in Table 1. All measurements used for the parameter estimation are available in Table S1. The parameter estimation problem is solved for a reduced (identifiable) parameter subset. This subset is updated in each model calibration cycle in Table 1 and is selected based on the local sensitivity matrix [28]. In doing so, the model calibration updates both the identifiable parameter subset and corresponding parameter values. This approach is useful as the information content in the data increases with each cycle. The algorithm implements a stepwise forward selection of Bioengineering 2020, 7, 145 6 of 18 parameters to be included in the estimation problem based on the dynamical parameter sensitivities. Identifiable parameters are selected by a ranking of all parameters according to linear independence and an analysis of the matrix rank condition of the sensitivity matrix.
The parameter estimation is formulated as the following optimization problem: where the objective function reads: where y i,j (U, θ) are the simulated states and y m i,j are the corresponding measured states. The index i = 1, . . . , 5 indicates the measured variables and the index j = 1, . . . , N i indicates individual data points.
The CVODES solver in SUNDIALS [29] is used to solve the ODE system and the interior-point algorithm is used for optimisation (MATLAB fmincon). Initial values and lower and upper bounds of the parameter estimation (PE) are based on experts' knowledge and summarized in Table S2.

Monte Carlo Simulation
Parameter distribution and pairwise correlations are determined by Monte Carlo simulations based on the last dataset (n = 500) and with the identifiable parameter set based on the subset selection. Monte Carlo simulations were carried out with σ = 0.15 for biomass, glucose and acetate and with σ = 0.05 for DOT.

Feed Calculation
The exponential feed was calculated using the standard fed-batch equation [30], which was adapted to consider a pulse-based profile. Since the feed in a fed-batch process is the only major volume-changing factor, volume changes due to sampling are neglected at this point and the volume change could be described as With µ set h −1 as the predefined specific growth rate, F 0 g L −1 is the initial feed rate and time t [h]. The pulse volume is calculated as with is the glucose concentration in the feed solution, X 0 [g] is the biomass concentration and V 0 is the volume at the feed start. Volume manipulations by the pipetting robot (e.g., volume balancing, sampling and base addition for pH control) are considered in the feed calculation apart from the equations above.
Biomass and volume for the calculation of F 0 (Equation (5)) were estimated by simulations based on the last calibrated model. The end of the batch phase was defined as the time point where the predicted glucose and acetate concentrations were below 0.02 g L −1 , but not later than 45 min after the depletion of glucose.

Results
Eight different E. coli K-12 clones were cultivated in parallel with an industrial process-relevant feeding design consisting of batch, exponential fed-batch and constant feed phases. The feed was applied as pulses to expose the cells to inhomogeneities similar to those in large-scale bioreactors.

Parallel Cultivation
The length of the batch phase varied among the clones and lasted 1.65 h for E. coli W3110 (the fastest growing clone) and 1.86 h for E. coli BW25113 ∆glcB (the slowest growing clone). After the end of the batch phase, the feed was automatically started. Due to the pulse nature of the feed procedure, the feed start is visible through the oscillating DOT values (Figure 3a). These oscillations, as well as the glucose at-line data, prove that glucose limitation was maintained during the fed-batch phase in all cultivations. Furthermore, no significant acetate accumulation was observed ( Figure 3b). The cultivations show a low variance between triplicates, which is obvious from the online DOT and pH profiles as well as from the automatically analyzed biomass, glucose and acetate values. As expected, the pH decreased during the batch phase and started to increase after glucose depletion (typically caused by acetate consumption). During each glucose pulse cycle, a perturbation of pH is observed, which is caused by the transient production of acetic acid ( Figure 3c). Finally, a small increase in pH was observed after the switch to constant pulse-based feed.

Results
Eight different E. coli K-12 clones were cultivated in parallel with an industrial process-relevant feeding design consisting of batch, exponential fed-batch and constant feed phases. The feed was applied as pulses to expose the cells to inhomogeneities similar to those in large-scale bioreactors.

Parallel Cultivation
The length of the batch phase varied among the clones and lasted 1.65 h for E. coli W3110 (the fastest growing clone) and 1.86 h for E. coli BW25113 ΔglcB (the slowest growing clone). After the end of the batch phase, the feed was automatically started. Due to the pulse nature of the feed procedure, the feed start is visible through the oscillating DOT values ( Figure 3a). These oscillations, as well as the glucose at-line data, prove that glucose limitation was maintained during the fed-batch phase in all cultivations. Furthermore, no significant acetate accumulation was observed ( Figure 3b). The cultivations show a low variance between triplicates, which is obvious from the online DOT and pH profiles as well as from the automatically analyzed biomass, glucose and acetate values. As expected, the pH decreased during the batch phase and started to increase after glucose depletion (typically caused by acetate consumption). During each glucose pulse cycle, a perturbation of pH is observed, which is caused by the transient production of acetic acid ( Figure 3c). Finally, a small increase in pH was observed after the switch to constant pulse-based feed.

Prediction of Batch and Feed Start
After inoculation, samples for biomass were taken and, together with the initial parameter set, served as the basis for the first prediction of the batch phase, feed start and feed rate (Figure 4a, black dashed line). The difference in the initial biomass between the clones led to different simulation results and different predicted batch phases prior to the first model calibration. After the second biomass measurement, the first model calibration cycle was initiated after 1.4 h of batch cultivation and the feed start and rate were re-computed using the updated models. From that time onwards, the model calibration cycle was started after each entry of new at-line data (biomass or glucose/acetate). The end of batch was defined as the time point at which glucose as well as acetate (produced during overflow growth) were depleted (Table 2). Therefore, the fed-batch phase in our cultivations started purposely later compared to typical fed-batch processes, which are mostly started when glucose is depleted and the DO signal increases. Note that feeding was started only when acetate had been metabolized. This prevent possible overfeeding with glucose by co-metabolism of the remaining acetate and thus allowed a higher process stability.
(produced during overflow growth) were depleted (Table 2). Therefore, the fed-batch phase in our cultivations started purposely later compared to typical fed-batch processes, which are mostly started when glucose is depleted and the DO signal increases. Note that feeding was started only when acetate had been metabolized. This prevent possible overfeeding with glucose by co-metabolism of the remaining acetate and thus allowed a higher process stability. Figure 4a illustrates the outcome of the model calibration cycle during the batch phase, with the example of the E. coli BW25113 ΔglcB cultivation data (grey cross) and simulations after model calibration (blue line). It is obvious that the first parameter estimate indicates, for this strain, a slower growth compared to the initial parameter set. However, with each model calibration cycle, the computed growth rate (µmax) increased from 0.36 h −1 at t1 to 0.58 h −1 at t2 and up to 0.82 h −1 at the third shown model calibration cycle. The fit to the cultivation data is improved with each modelling cycle and the trend of the cultivation is well represented, at least after the third modelling cycle.
In addition, due to the underestimated µmax, the first model calibration cycle failed to propose the end of the batch phase properly. An accurate estimation of the specific glucose uptake rate is only reached after the first glucose measurements are available, but then the estimation is very precise. Although the glucose depletion is equally estimated in the third model calibration cycle (1.94 h) and in the initial, unadjusted model (1.92 h; Figure 4a, black dashed line), the feed (Figure 4b) started 21 min later (calibrated model: 2.40 h; initial model: 2.01 h). This is due to differences in the production and consumption rates of acetate, resulting in different starting times of the fed-batch phase. Based on the DOT profiles, acetate is consumed after 2.5 h; this also corresponds well with the at-line measurements of E. coli BW25113 ΔglcB in Figure 3.    Figure 4a illustrates the outcome of the model calibration cycle during the batch phase, with the example of the E. coli BW25113 ∆glcB cultivation data (grey cross) and simulations after model calibration (blue line). It is obvious that the first parameter estimate indicates, for this strain, a slower growth compared to the initial parameter set. However, with each model calibration cycle, the computed growth rate (µ max ) increased from 0.36 h −1 at t 1 to 0.58 h −1 at t 2 and up to 0.82 h −1 at the third shown model calibration cycle. The fit to the cultivation data is improved with each modelling cycle and the trend of the cultivation is well represented, at least after the third modelling cycle.
In addition, due to the underestimated µ max , the first model calibration cycle failed to propose the end of the batch phase properly. An accurate estimation of the specific glucose uptake rate is only reached after the first glucose measurements are available, but then the estimation is very precise. Although the glucose depletion is equally estimated in the third model calibration cycle (1.94 h) and in the initial, unadjusted model (1.92 h; Figure 4a, black dashed line), the feed (Figure 4b) started 21 min later (calibrated model: 2.40 h; initial model: 2.01 h). This is due to differences in the production and consumption rates of acetate, resulting in different starting times of the fed-batch phase. Based on the DOT profiles, acetate is consumed after 2.5 h; this also corresponds well with the at-line measurements of E. coli BW25113 ∆glcB in Figure 3.
The predicted end of the batch phase is very close to the observed one in all cultivations, even after the second model calibration and 1.5 h of cultivation (Table 2). For some cultivations, the time of glucose depletion was predicted with an accuracy of less than one minute (E. coli BW25113 ∆gatZ).
In the worst case, the glucose depletion was predicted 22.8 min too late (E. coli BW25113 ∆aceA). A missed batch end and even a short starvation phase could lead to unwanted metabolic reactions by the clone and influence the process and product quality. However, with this triplicate, the variance of glucose uptake is very high, because with one triplicate, the glucose consumption is clearly leading (Figure 3a). Without the leading one, the gap between predicted and observed glucose depletion is significant lower. Nevertheless, a difference of 22 min is still in an operational range for conditional screening. Due to operational reasons, the model calibration with all clones was maintained. The overall mean difference between the observed and predicted glucose depletion is 6.9 min for the calibrated model after 1.5 h and, thus, better, compared to the initial model with a mean prediction error of 7.3 min.
Complete consumption of acetate was only observed for five of the eight clones. For all these clones, the adjusted model predicted the acetic acid consumption to better compare to the initial model, with the exception of E. coli BW25113 ∆omp. Complete consumption of acetic acid was not observed for three clones because of the time-dependent restrictions in the feed start (maximum tolerance between end of glucose depletion and feed start; see Section 2.5.3). Nevertheless, for these three clones, the initial model predicted a faster, and the adjusted model, a slower, acetic acid consumption rate. The times of the first feed pulse are summarized in Table 2 (feed start); the predicted end of batch and the first pulse may differ due to technical reasons (delay in computation or first pulses are calculated with 0 µL due to the minimal pipetting volume restrictions).

Feed and Fed-Batch
During the fed-batch phase, the size of the feed pulses is re-computed during each model calibration cycle. The maximal glucose uptake rate was determined as basis for the new feed rates. However, the biomass concentration and the substrate yield coefficient (Y X/S ) have a major impact on the initial feed rate (F 0 ) and influence the feed as well. With the exception of E. coli BW25113 ∆glcB (Figure 5h), the first feed rate (grey bars) was higher than the following calculated feed pulses. However, the second applied feed rates for E. coli BW25113 ∆ompT, E. coli BW25113 ∆fliA and E. coli BW25113 ∆gatZ (Figure 5c,d,f) were close to the initial feed rates, indicating only a minor parameter drift, but were reduced in the later model calibration cycles. In the case of E. coli BW25113 ∆glcB, the second feed was somewhat higher than the later one, which is reflected in both the initial feed rate and in the slope of the feed (all feed pulses are summarized in Figure 5). Feed pulses were calculated by the optimization algorithm for each clone and applied to all biological triplicates. In this way, eight different feeding rates were calculated and 24 cultivations were carried out in parallel.

Parameter Estimation
During all model calibration cycles, the model parameters are estimated on the basis of all available data (Table S1), i.e., all data which were collected from the start of the cultivations to the actual time point. For all clones, the measurements and dynamics of cultivation are well represented in the simulation of the calibrated model, as illustrated in Figure 6 for the strain E. coli BW25113 ΔglcB (last modelling cycle; for the other strains, see supplementary Figures S1-S7). In contrast to the calibrated model, the initial model overestimated the biomass formation. This trend could be observed for all strains, with the exception for E. coli BW25113 ( Figure S2). The DOT measurements indicate a slower glucose uptake rate than predicted in the last modelling cycle. A lower maximal specific glucose uptake rate (qSmax) was calculated in the first two modelling calibration cycles compared to the later ones (Figure 7). In the first two model calibration cycles, no glucose measurements were available due to the time delay in the at-line analytics. The prediction error of acetic acid was decreased in the batch and fed-batch phase after model calibration. The cultivation dynamics of all cultivation measurements are well represented by the calibrated model. The parameters to be adjusted in each model calibration cycle are selected by the included subset selection (sensitivity analysis, Figure 1). The parameters Kap, kLa and qm are not adjusted in one model calibration cycle. This means that the underlying measurement data are not sufficient to determine these model parameters with sufficient certainty, not even in the last model calibration cycle. The model parameters Ko, Ksq, Yofm and Yoresp are only partly selected for parameter estimation (Figure 7, filed dots). Within the subset selection model parameters can influence other's parameter sensitivity if they have dependencies to each other. Accordingly, the selection of one parameter may lead to a lower sensitivity of another one and the latter may be excluded from the subset selection although it was active in previous cycles. All parameter subsets and parameter values are summarized in the Supplementary Tables S3-S10. Regularization of parameter estimation using a subsets selection method [28] was used to ensure a meaningful parameter set and to avoid non-physiological model calibrations.

Parameter Estimation
During all model calibration cycles, the model parameters are estimated on the basis of all available data (Table S1), i.e., all data which were collected from the start of the cultivations to the actual time point. For all clones, the measurements and dynamics of cultivation are well represented in the simulation of the calibrated model, as illustrated in Figure 6 for the strain E. coli BW25113 ∆glcB (last modelling cycle; for the other strains, see Supplementary Figures S1-S7). In contrast to the calibrated model, the initial model overestimated the biomass formation. This trend could be observed for all strains, with the exception for E. coli BW25113 ( Figure S2). The DOT measurements indicate a slower glucose uptake rate than predicted in the last modelling cycle. A lower maximal specific glucose uptake rate (qS max ) was calculated in the first two modelling calibration cycles compared to the later ones (Figure 7). In the first two model calibration cycles, no glucose measurements were available due to the time delay in the at-line analytics. The prediction error of acetic acid was decreased in the batch and fed-batch phase after model calibration. The cultivation dynamics of all cultivation measurements are well represented by the calibrated model. The parameters to be adjusted in each model calibration cycle are selected by the included subset selection (sensitivity analysis, Figure 1). The parameters K ap, k L a and q m are not adjusted in one model calibration cycle. This means that the underlying measurement data are not sufficient to determine these model parameters with sufficient certainty, not even in the last model calibration cycle. The model parameters K o , K sq , Y ofm and Y oresp are only partly selected for parameter estimation (Figure 7, filed dots). Within the subset selection model parameters can influence other's parameter sensitivity if they have dependencies to each other. Accordingly, the selection of one parameter may lead to a lower sensitivity of another one and the latter may be excluded from the subset selection although it was active in previous cycles. All parameter subsets and parameter values are summarized in the Supplementary Tables S3-S10. Regularization of parameter estimation using a subsets selection method [28] was used to ensure a meaningful parameter set and to avoid non-physiological model calibrations.  Monte Carlo simulations have been shown to give a good insight into actual, non-linear parameter distribution [31] and were therefore performed to gain a better understanding of the parameter correlation and its variances. In Figure 8, the parameter distributions for E. coli BW25113 ΔglcB (last model calibration cycle) are shown based on Monte Carlo simulations. The correlation between all parameters is very weak. Only Kaq and Ksq showed a correlation with qAmax. The model  Monte Carlo simulations have been shown to give a good insight into actual, non-linear parameter distribution [31] and were therefore performed to gain a better understanding of the parameter correlation and its variances. In Figure 8, the parameter distributions for E. coli BW25113 ΔglcB (last model calibration cycle) are shown based on Monte Carlo simulations. The correlation between all parameters is very weak. Only Kaq and Ksq showed a correlation with qAmax. The model Monte Carlo simulations have been shown to give a good insight into actual, non-linear parameter distribution [31] and were therefore performed to gain a better understanding of the parameter correlation and its variances. In Figure 8, the parameter distributions for E. coli BW25113 ∆glcB (last model calibration cycle) are shown based on Monte Carlo simulations. The correlation between all parameters is very weak. Only K aq and K sq showed a correlation with qA max . The model parameters K aq and K sq are the affinity constants for acetate and glucose uptake, respectively, and a dependence to the maximal acetate uptake rate (qA max ) cannot be avoided in the model. The high significance of each parameter is indicated by a narrow distribution in Figure 8 as well as low variation for the most important model parameters (Table 3), especially for the model parameters Y em , qS max and Y osresp. Normal distribution is given for all parameters except for Y am . This parameter is quite close to the lower bound of the previously defined solution space (lower and upper parameter bound). It is noted that this situation should be avoided as it might reduce the accuracy of the parameter estimates.
Bioengineering 2020, 7, x FOR PEER REVIEW 12 of 18 parameters Kaq and Ksq are the affinity constants for acetate and glucose uptake, respectively, and a dependence to the maximal acetate uptake rate (qAmax) cannot be avoided in the model. The high significance of each parameter is indicated by a narrow distribution in Figure 8 as well as low variation for the most important model parameters (Table 3), especially for the model parameters Yem, qSmax and Yosresp. Normal distribution is given for all parameters except for Yam. This parameter is quite close to the lower bound of the previously defined solution space (lower and upper parameter bound). It is noted that this situation should be avoided as it might reduce the accuracy of the parameter estimates. In the present work, eight strains were examined in 24 successful cultivations. The end of glucose uptake was, in part, predicted with small errors of less than one minute, thanks to the iterative model calibration cycle. The feed start was automatic and in an operable acceptable time window using the dynamic process redesign as defined in the model calibration cycle. The model parameter sets estimated are always unique and with a physiological meaning, even with very little data in the initial phase of this study, e.g., the first 3 h of cultivation. This is ensured by the built-in subset selection and is proven by the Monte Carlo simulations made afterwards.   In the present work, eight strains were examined in 24 successful cultivations. The end of glucose uptake was, in part, predicted with small errors of less than one minute, thanks to the iterative model calibration cycle. The feed start was automatic and in an operable acceptable time window using the dynamic process redesign as defined in the model calibration cycle. The model parameter sets estimated are always unique and with a physiological meaning, even with very little data in the initial phase of this study, e.g., the first 3 h of cultivation. This is ensured by the built-in subset selection and is proven by the Monte Carlo simulations made afterwards.

Discussion
In this study, we presented a computational framework able to design and operate parallel E. coli cultivations without human supervision. The results demonstrate that a robust operation tailored to each specific clone is possible through an adaptive input design. Undesired experimental conditions (e.g., overfeeding and starvation) are avoided while sufficient information to allow a confident discrimination of the clones is generated. Both start time and feed rate were accurately predicted for each one of the eight clones, using feedback information from online and at-line cultivation measurements. This is essential in an experimental facility aimed to perform screening cultivations for clones whose phenotype is not known beforehand. The relevance of an adaptive and specific experimental design can be seen in this case study. As illustrated in Figure 9, despite the fact that the clone characteristics differ only minimally from each other, an experiment with a fixed start time and feeding rate would have violated important experimental constrains (here, overfeeding).

Discussion
In this study, we presented a computational framework able to design and operate parallel E. coli cultivations without human supervision. The results demonstrate that a robust operation tailored to each specific clone is possible through an adaptive input design. Undesired experimental conditions (e.g., overfeeding and starvation) are avoided while sufficient information to allow a confident discrimination of the clones is generated. Both start time and feed rate were accurately predicted for each one of the eight clones, using feedback information from online and at-line cultivation measurements. This is essential in an experimental facility aimed to perform screening cultivations for clones whose phenotype is not known beforehand. The relevance of an adaptive and specific experimental design can be seen in this case study. As illustrated in Figure 9, despite the fact that the clone characteristics differ only minimally from each other, an experiment with a fixed start time and feeding rate would have violated important experimental constrains (here, overfeeding). Additionally, the use of a macro-kinetic growth model that describes the main extracellular dynamics of E. coli was shown to be sufficient, even though it is insufficient to describe the complex nonlinear dynamics of the bioprocess and the different genotypes of the clones. The adaptive nature of the framework ensures a proper prediction within the current horizon and is sufficient to assure a robust operation of the cultivations. On average, the predicted feed start differed by less than 10 min for the optimal one, which is in an acceptable range and is mainly caused by unobserved disturbances in the system. If necessary, the mismatch can be further reduced by increasing the frequency of model calibration. Furthermore, the framework provides all necessary parameters and actions to define a wide range of alternative feet start triggers (e.g., glucose reduction or acetate consumption).
As expected, the parameter variance in general decreases with every model calibration cycle. After the cultivation, the parameter distribution is generally very narrow. This is also reflected in the small deviations of the simulation results of the Monte Carlo studies ( Figure 10) and demonstrates the value of the parameters for further in-silico studies. Important parameters, such as maximum glucose uptake (qSmax), can be distinguished with statistical significance and used for clone discrimination (Figure 7, Table 3). In this study, E. coli BW25113 ∆ompT had the largest qSmax value and E. coli BW25113 ∆gatZ the lowest one. However, some parameters could not, or only with insufficient confidence, be identified. This hampered a distinction of some essential parameters, such as the maximal acetate uptake rate (qAmax). Furthermore, parameter identifiability can be increased in future applications using methods for optimal experimental design (OED) [20,32,33] or enhanced parameter identifiability analysis [34][35][36]. Additionally, the use of a macro-kinetic growth model that describes the main extracellular dynamics of E. coli was shown to be sufficient, even though it is insufficient to describe the complex nonlinear dynamics of the bioprocess and the different genotypes of the clones. The adaptive nature of the framework ensures a proper prediction within the current horizon and is sufficient to assure a robust operation of the cultivations. On average, the predicted feed start differed by less than 10 min for the optimal one, which is in an acceptable range and is mainly caused by unobserved disturbances in the system. If necessary, the mismatch can be further reduced by increasing the frequency of model calibration. Furthermore, the framework provides all necessary parameters and actions to define a wide range of alternative feet start triggers (e.g., glucose reduction or acetate consumption).
As expected, the parameter variance in general decreases with every model calibration cycle. After the cultivation, the parameter distribution is generally very narrow. This is also reflected in the small deviations of the simulation results of the Monte Carlo studies ( Figure 10) and demonstrates the value of the parameters for further in-silico studies. Important parameters, such as maximum glucose uptake (qS max ), can be distinguished with statistical significance and used for clone discrimination (Figure 7, Table 3). In this study, E. coli BW25113 ∆ompT had the largest qS max value and E. coli BW25113 ∆gatZ the lowest one. However, some parameters could not, or only with insufficient confidence, be identified. This hampered a distinction of some essential parameters, such as the maximal acetate uptake rate (qA max ). Furthermore, parameter identifiability can be increased in future applications using methods for optimal experimental design (OED) [20,32,33] or enhanced parameter identifiability analysis [34][35][36]. The frequency of the parameter estimation was defined based on the availability of at-line data (biomass, glucose and acetate), and as expected, the at-line data are decisive to achieve model identifiability. Still, the results show that especially parameters related to glucose consumption can be identified using only the online DOT signal. This shows that, even though in a significantly limited manner, the framework can also be used to increase the robustness of robotic facilities that do not have embedded at-line analytics. This significantly reduces the operative effort of the experimental setup. The glucose consumption rate seems to be observable from the DOT signal, by which a reduced version of the macro-kinetic model could be used to build an observer-based feeding control. Finally, we also demonstrated that the length of the batch phase is essential to assure sufficient data before the start of the feeding so as to allow a reliable operation of the following phases.
Some parameters drift over time or change rapidly between batch and fed-batch (Figure 7). This could be related to the increasing information content of the growing data set and the resulting addition of previously neglected model parameters to the subset selection. However, the variations in the parameters caused by intracellular changes in the metabolic machinery together with heterogeneous mutations in the population [37] are not represented in the model and could also cause such parameter changes. Such uncovered intracellular changes may also explain the apparently poor representation of the batch phase in the last model calibration cycle compared to earlier ones. Therefore, an iterative recalculation of the feed is necessary to cope with disturbances in the experiments and inaccuracies in the model prediction. Moving horizon approaches can also increase the model prediction accuracy by allowing different parameter sets in different cultivation phases for a single clone [38].

Conclusions
The operation of robotic experiments with multiple fed-batch cultivations in parallel is very challenging even for skilled operators, since many decisions and tasks are needed at the same time.
In this work, we present an adaptive framework for conditional screening for parallel fed-batch experiments, aiming to identify the best candidate strain for industrial scale biomanufacturing. We demonstrate that the use of a macro-kinetic growth model in an adaptive framework using online The frequency of the parameter estimation was defined based on the availability of at-line data (biomass, glucose and acetate), and as expected, the at-line data are decisive to achieve model identifiability. Still, the results show that especially parameters related to glucose consumption can be identified using only the online DOT signal. This shows that, even though in a significantly limited manner, the framework can also be used to increase the robustness of robotic facilities that do not have embedded at-line analytics. This significantly reduces the operative effort of the experimental setup. The glucose consumption rate seems to be observable from the DOT signal, by which a reduced version of the macro-kinetic model could be used to build an observer-based feeding control. Finally, we also demonstrated that the length of the batch phase is essential to assure sufficient data before the start of the feeding so as to allow a reliable operation of the following phases.
Some parameters drift over time or change rapidly between batch and fed-batch (Figure 7). This could be related to the increasing information content of the growing data set and the resulting addition of previously neglected model parameters to the subset selection. However, the variations in the parameters caused by intracellular changes in the metabolic machinery together with heterogeneous mutations in the population [37] are not represented in the model and could also cause such parameter changes. Such uncovered intracellular changes may also explain the apparently poor representation of the batch phase in the last model calibration cycle compared to earlier ones. Therefore, an iterative recalculation of the feed is necessary to cope with disturbances in the experiments and inaccuracies in the model prediction. Moving horizon approaches can also increase the model prediction accuracy by allowing different parameter sets in different cultivation phases for a single clone [38].

Conclusions
The operation of robotic experiments with multiple fed-batch cultivations in parallel is very challenging even for skilled operators, since many decisions and tasks are needed at the same time.
In this work, we present an adaptive framework for conditional screening for parallel fed-batch experiments, aiming to identify the best candidate strain for industrial scale biomanufacturing. We demonstrate that the use of a macro-kinetic growth model in an adaptive framework using online and at-line data information in a feedback loop is necessary to:

1.
Design a specific strategy for each different clone of the conditional screening experiment.

2.
Increase the robustness of the robotic operation against experimental disturbance.

3.
Give an approximation of the reliability of the simulation results with respect to production scale performance.
To our knowledge, this is the first successful model-based operation of 24 fed-batch cultivations with as many as eight different clones in parallel including its characterization, sufficient for clone discrimination. The results clearly demonstrate the capabilities of the framework to increase the efficiency of combined mini-bioreactor systems with liquid handling stations to drastically reduce the experimental time, efforts and failure rate in high-throughput bioprocess development.  Table S10: parameter subsets and values for BW25113 ∆glcB; Figure S1: Comparison of measurement and simulation for E. coli W3110; Figure S2: Comparison of measurement and simulation for E. coli BW25113; Figure S3: Comparison of measurement and simulation for E. coli BW25113 ∆ompT; Figure S4: Comparison of measurement and simulation for E. coli BW25113 ∆aceA; Figure S5: Comparison of measurement and simulation for E. coli BW25133 ∆fliA; Figure S6: Comparison of measurement and simulation for E. coli BW25133 ∆gatC; Figure S7: Comparison of measurement and simulation for E. coli BW25113 ∆gatZ.