A General Hybrid Modeling Framework for Systems Biology Applications: Combining Mechanistic Knowledge with Deep Neural Networks under the SBML Standard

In this paper, a computational framework is proposed that merges mechanistic modeling with deep neural networks under the Systems Biology Markup Language (SBML) standard. Over the last 20 years, the systems biology community has developed a large number of mechanistic models that are currently stored in public databases in SBML. With the proposed framework, existing SBML models may be redesigned into hybrid systems through the incorporation of deep neural networks into the model core, using a freely available Python tool. The so-formed hybrid mechanistic/neural network models are trained with a deep learning algorithm based on the adaptive moment estimation method (ADAM), stochastic regularization and semidirect sensitivity equations. The trained hybrid models are encoded in SBML and uploaded to model databases, where they may be further analyzed as regular SBML models. This approach is illustrated with three well-known case studies: the Escherichia coli threonine synthesis model, the P58IPK signal transduction model, and the yeast glycolytic oscillations model. The proposed framework is expected to greatly facilitate the widespread use of hybrid modeling techniques for systems biology applications.


Introduction
Hybrid modeling methods combining mechanistic knowledge with machine learning (ML) in a common workflow have found wide application in process systems engineering since the early 1990s (e.g., the review by von Stosch et al. [1]). Psichogios and Ungar [2] described one of the first applications of hybrid models to bioprocess engineering. The proposed hybrid model consisted of dynamic material balance equations of biochemical species (a system of ordinary differential equations (ODEs)) connected with a shallow feedforward neural network in a common mathematical structure. Sensitivity equations were derived, enabling the training of the neural network by error backpropagation on indirect training examples (i.e., measured target variables not coincident with the neural network output variables). Thompson and Kramer [3] framed this problem as hybrid semiparametric modeling, as such models merge parametric functions (stemming from knowledge) with nonparametric functions (stemming from data) in the same mathematical structure. Schubert et al. [4] presented the first industrial application of hybrid modeling (material balance equations combined with neural networks) to a Baker's yeast process. Since the early 1990s, hybrid model structure definition, parameter identification and model-based process control have been extensively covered (e.g., [5][6][7][8][9][10]). Hybrid models were applied to a wide array of microbial, animal cell, mixed microbial and enzyme processes in different industries, such as wastewater treatment, clean energy, biopolymers and biopharmaceutical manufacturing (Agharafeie et al. [11]). The potential advantages of hybrid modeling may be summarized as a more rational usage of prior knowledge (mechanistic, heuristic

General SBML Hybrid Model
SBML models are organized as j = 1, . . . , n compartments with size V_j. Each compartment contains m_j species with a concentration vector c_j. The species are interlinked through q_j reactions with stoichiometry S_j and reaction kinetics r_j. SBML models also contain parameters, θ, with given initial values (parameters may be local to reactions or global; for simplicity, we assume global). In SBML, the parameter values are not necessarily fixed, as they may change over time according to predefined algebraic rules. The compartment size may also change over time according to predefined compartment rate rules (other rate rules were not considered here for simplicity). External time-dependent stimuli may be defined through events, giving rise to a vector of exogenous input variables, u, that may change over time. With these elements, the dynamics of biochemical species in a generic compartment j may be described by the following ODE model:

dc_j/dt = S_j r_j(V_j, c_j, θ, u, t) − (c_j/V_j) dV_j/dt (1a)

dV_j/dt = ρ_j(V_j, c_j, θ, u, t) (1b)

θ = φ(V_j, c_j, u, t) (1c)

Equation (1a) is a conservation law of mass assuming a perfectly mixed compartment (the second term accounts for dilution when the compartment size changes). Equation (1b) represents a generic compartment rate rule in case the compartment size changes over time. Equation (1c) represents generic algebraic rules to compute model parameters over time. Equations (1a)-(1c) are of a parametric nature with a fixed structure stemming from prior knowledge (e.g., mass conservation laws, reaction stoichiometry or enzyme kinetics). Some variables may, however, lack a mechanistic basis (e.g., unknown reaction kinetics mechanisms or unknown physicochemical properties of molecular species such as charge or glycosylation pattern). In the general SBML hybrid model, variables lacking a mechanistic basis are defined as loose nonparametric functions, ϑ(·), without a fixed structure.
They are computed by a deep feedforward neural network (FFNN) with nh hidden layers as a function of species concentrations, exogenous inputs and other relevant variables:

H_0 = g(V_j, c_j, θ, u, t) (2a)

H_i = σ(w_i H_{i−1} + b_i), i = 1, . . . , nh (2b)

ϑ = w_{nh+1} H_{nh} + b_{nh+1} (2c)

A nonlinear pre-processing function, g(V_j, c_j, θ, u, t), may be used to compute the FFNN input signals to improve the training (Equation (2a)). The input signals are forward propagated through the hidden layers according to Equation (2b), where σ(·) represents the transfer function of the hidden-layer nodes (always the hyperbolic tangent function in this study). Finally, the FFNN outputs, ϑ, are computed by a linear output layer (Equation (2c)).
The node connection weights, w = {w_1, w_2, . . . , w_{nh+1}} and b = {b_1, b_2, . . . , b_{nh+1}}, are estimated during the training of the model, for which an informative dataset is needed.
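As an illustration, the forward pass of Equations (2a)-(2c) can be sketched in a few lines of plain Python (the paper's implementation is in MATLAB; this list-based version is only meant to make the layer recursion explicit):

```python
import math

def ffnn_forward(x, weights, biases):
    """Forward pass of Equations (2a)-(2c): hidden layers apply a tanh
    transfer function; the output layer is linear.
    weights[i]/biases[i] hold layer i+1 as a list of rows / a list of offsets."""
    h = x  # H_0: (pre-processed) input signals
    for W, b in zip(weights[:-1], biases[:-1]):  # hidden layers (Equation (2b))
        h = [math.tanh(sum(wij * hj for wij, hj in zip(row, h)) + bi)
             for row, bi in zip(W, b)]
    W, b = weights[-1], biases[-1]  # linear output layer (Equation (2c))
    return [sum(wij * hj for wij, hj in zip(row, h)) + bi
            for row, bi in zip(W, b)]
```

For example, a [2 × 1 × 1] network with zero hidden weights returns only the output bias, since tanh(0) = 0.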
For a particular biological model, Equations (1)-(2) describing n compartments with species and reactions are transformed, via automatic symbolic manipulation, into an equivalent set of ODEs and derived sensitivity equations using the Symbolic Math Toolbox (MATLAB R2020a, MathWorks Inc.). The end result of this procedure is an automatically generated MATLAB/Octave function that computes the time derivatives of all state variables, y = (c_1, c_2, . . . , c_n, V_1, V_2, . . . , V_n):

dy/dt = f(y, ϑ, θ, u, t), y(t_0) = y_0 (3a)

ϑ = FFNN(y, θ, u, t; w, b) (3b)

d/dt (∂y/∂w) = (∂f/∂y)(∂y/∂w) + (∂f/∂ϑ)(∂ϑ/∂w) (3c)

Deep learning of hybrid models obeying the system of Equations (3a)-(3c) has been thoroughly investigated by Pinto et al. [37]. A Runge-Kutta 4th-order ODE solver was implemented in MATLAB R2020a (MathWorks Inc.) to integrate the system of Equations (3a)-(3c). The training was performed in a weighted least squares sense by minimizing the following loss function:

WMSE = (1/T) Σ_{t=1}^{T} ((y*_t − y_t)/σ_t)^2 (4)

with T the number of training examples, y*_t the measured training example at time t, y_t the corresponding model prediction and σ_t the measurement standard deviation. The gradients of the loss function with respect to the neural network outputs were computed by the equation

∂WMSE/∂ϑ = −(2/T) Σ_{t=1}^{T} ((y*_t − y_t)/σ_t^2) (∂y_t/∂ϑ)

The output layer gradients, ∂WMSE/∂ϑ, were back-propagated to the input layer via the well-known error backpropagation algorithm (Werbos, 1974), yielding the loss function gradients, g, with respect to the neural network parameters. Finally, the adaptive moment estimation algorithm (ADAM) [41], with stochastic minibatch and weight dropout regularization, was adopted to minimize the loss function given by Equation (4) using the gradients g. For further details, the reader is referred to [25]. The code was implemented in MATLAB R2020a (MathWorks Inc.) on a computer with an Intel Core i5-8265U CPU @ 1.60 GHz and 24 GB of RAM.
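The numerical backbone of this training loop is the fixed-step Runge-Kutta integration of the hybrid ODE system. The authors' solver is written in MATLAB and is not reproduced in the paper; the sketch below shows the classical 4th-order scheme in Python for a generic right-hand side f(t, y):

```python
def rk4(f, y0, t0, t1, n):
    """Integrate dy/dt = f(t, y) with the classical 4th-order Runge-Kutta
    method over n fixed steps; y is a plain list of state variables."""
    h = (t1 - t0) / n
    t, y = t0, list(y0)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)  # weighted stage average
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y
```

In the hybrid setting, f also evaluates the FFNN outputs ϑ at every stage, and the sensitivity states ∂y/∂w are simply appended to the state vector y.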

Interfacing with SBML Databases and SBML Modeling Tools
The SBML2HYB Python package [36] was adopted to read SBML models, redesign them as hybrid models and store them in model databases. This freely available Python package converts existing systems biology models encoded in SBML into hybrid models that combine mechanistic equations and deep neural networks (currently limited to FFNNs). Since SBML is not a common format for encoding ML models, an intermediate HMOD format supports the conversion process. HMOD is a text-based (ASCII) file format listing the properties that define the model (species, reactions, parameters, rates and rules) in a similar manner to SBML, considering any number of species with given initial concentrations distributed among any number of compartments. These species are then interlinked through a list of reactions and rate rules. The user inputs the information of the deep neural network into the HMOD file either manually or through a pre-configured neural network in Python Keras, using the SBML2HYB tool. The resulting hybrid model in HMOD format is reconverted to SBML and uploaded to model databases. In this step, the FFNN Equations (2a)-(2c) are mapped to assignment rules in SBML format, whereas the network weights are mapped to global parameters in the SBML format. The resulting SBML hybrid models may be simulated, analyzed and/or trained with existing tools such as MATLAB (MathWorks Inc.), COPASI [42] or special-purpose tools with training algorithms for hybrid models that are able to read SBML files. For further details, the reader is referred to [36].
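To give a flavor of this mapping, the snippet below generates the infix formula for one hidden-layer node as it could appear inside an SBML assignment rule. The parameter naming scheme (w_layer_node_input, b_layer_node) is purely illustrative; SBML2HYB's actual identifiers may differ:

```python
def node_formula(layer, node, in_names, activation="tanh"):
    """Build an SBML-style infix formula for one FFNN node.
    Weight/bias identifiers (w_..., b_...) are hypothetical placeholders
    for the global parameters that would hold the trained values."""
    terms = " + ".join(f"w_{layer}_{node}_{i + 1} * {name}"
                       for i, name in enumerate(in_names))
    expr = f"{terms} + b_{layer}_{node}"
    return f"{activation}({expr})" if activation else expr
```

For instance, node_formula(1, 1, ["asp", "thr"]) yields "tanh(w_1_1_1 * asp + w_1_1_2 * thr + b_1_1)", which would become the math of an assignment rule whose variable is that node's output signal.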

Case Studies
The SBML hybrid modeling framework was applied to three systems biology case studies freely available in the JWS Online database (https://jjj.bio.vu.nl/models/, accessed on 31 January 2023) [33], with the accession IDs given in Table 1. The first case study is a metabolic network describing the synthesis of threonine in E. coli, proposed by Chassagnole et al. [38]. The second case study is the P58IPK signal transduction network to study Influenza infection dynamics, proposed by Goodman et al. [39]. The third case study is a reduced yeast glycolytic model with preserved limit cycle stability, proposed by Dano et al. [40]. In order to upgrade the original mechanistic models into hybrid mechanistic/neural network versions, the following pipeline of activities (Figure 1) was applied to each of the case studies:

Table 1. Summary of the three SBML models that were redesigned to hybrid mechanistic/neural network models in the present study.

Figure 1. Schematic workflow for redesigning existing SBML models stored in databases into hybrid mechanistic/neural network models.

Step 1: An SBML biological model is extracted from a model database.
Step 2: A synthetic time series dataset is generated to train the hybrid model.
Step 3: A feedforward neural network (FFNN) is inserted in the mechanistic kinetic model and converted to the HMOD format using the SBML2HYB tool.
Step 4: The hybrid mechanistic/FFNN model encoded in the HMOD format is trained by applying the deep learning approach (Section 2.1) and the synthetic dataset.
Step 5: The trained hybrid model in the HMOD format is reconverted to SBML using the SBML2HYB tool.
Step 6: The final trained hybrid model in the SBML format is uploaded to the model database and simulated comparatively to the original nonhybrid model.
Step 1: The original systems biology models were retrieved from the JWS database in SBML format. The respective files are provided as Supplementary Material.
Step 2: Synthetic time series datasets were generated by simulating the original models in the JWS platform. The resulting data sets are provided as Supplementary Material. These data are needed to train the hybrid models as a proof-of-concept. No experimental data were used in this study. More details are provided in the results section.
Step 3: For each case study, a feedforward neural network (FFNN) was inserted into the mechanistic model and converted to the HMOD format using the SBML2HYB python tool, freely available in [36]. The size of the FFNN and interface with the mechanistic model depended on the case study. More details are given in the results section.
Step 4: The hybrid mechanistic/FFNN models encoded in the HMOD format were trained using the deep learning approach described in Section 2.1 and the datasets generated in step 2. Implementation details varied in the case studies (more on this in the Results section). The main concern was the proof-of-concept that SBML hybrid models may be efficiently trained to a comparable performance to the original mechanistic models. The effect of the size of the FFNN was investigated. The final trained hybrid models, with the updated FFNN weights, were saved in the HMOD format.
Step 5: The trained hybrid models in the HMOD format were reconverted to SBML using the SBML2HYB tool. In this step, the FFNN information is mapped to assignment rules in the SBML format. The obtained SBML files were uploaded to the JWS Online platform and are now freely available for the community to analyze. The hybrid model structures encoded in SBML were visualized using the freely available Cytoscape cy3sbml tool [43]. The SBML files of the hybrid models are provided as Supplementary Material.
Step 6: For proof-of-concept, the original mechanistic SBML models (step 1) and the final hybrid SBML models (step 5) were simulated and compared using the JWS online simulator (https://jjj.bio.vu.nl/models/experiments/, accessed on 31 January 2023) showing that their outputs are practically coincident.
As mentioned in step 4, hybrid models with different network depths and sizes were evaluated for each case study. The "best" hybrid model was discriminated on the basis of the Akaike Information Criterion with a second-order bias correction (AICc), computed for the training data partition as follows:

AICc = T ln(WMSE) + 2 nw + 2 nw (nw + 1)/(T − nw − 1)

with nw the total number of FFNN weights that are calculated during the training process.
AICc includes an overparameterization penalty and is commonly used to discriminate between empirical model candidates and to select a parsimonious model for small sample sizes [44].
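In code, the second-order-corrected AICc reduces to a one-liner. The sketch below assumes the least-squares form in which T·ln(WMSE) plays the role of the log-likelihood term, consistent with the weighted least squares training loss:

```python
import math

def aicc(wmse_train, n_examples, n_weights):
    """AICc for a least-squares model: T*ln(WMSE) + 2*nw plus the
    second-order (small-sample) bias correction term."""
    T, nw = n_examples, n_weights
    if T - nw - 1 <= 0:
        raise ValueError("AICc undefined: too few examples for nw weights")
    return T * math.log(wmse_train) + 2 * nw + 2 * nw * (nw + 1) / (T - nw - 1)
```

With equal fit quality, the correction term penalizes the larger networks, which is why the smallest structure wins in case study 1.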

Case Study 1: Threonine Synthesis Pathway in E. coli
The first case study is the metabolic model proposed by Chassagnole et al. [38], describing the threonine synthesis pathway in E. coli (Table 1). This model dynamically simulates the time course of 11 species (adp, asa, asp, aspp, atp, hs, hsp, nadp, nadph, phos and thr) in a single compartment, corresponding to 11 ODEs. It has seven reactions (with rates vak, vasd, vatpase, vhdh, vhk, vnadph_endo and vtsy) and 47 kinetic parameters (the names of variables were kept the same as in the original SBML model to facilitate cross-reference; for details, the reader is referred to the JWS Online model with accession ID 'chassagnole').
Hybrid models were created by combining deep FFNNs of different sizes with the original mechanistic model, following the previously described procedure (Figure 1). The FFNNs had 11 inputs corresponding to the concentrations of the 11 species (adp, asa, asp, aspp, atp, hs, hsp, nadp, nadph, phos, thr). The number of hidden layers and nodes in the hidden layers varied (Table 2). The activation function in the hidden layers was always the hyperbolic tangent function. The FFNNs had seven outputs corresponding to the maximum reaction rate values of the seven metabolic reactions. The kinetic equations of the original SBML model were fully kept in the hybrid models. The job of the FFNNs was thus to describe the maximum reaction rate parameters as a function of species concentrations.

Figure 2 shows the hybrid model structure [11 × 5 × 5 × 7] (Table 2) visualized using the Cytoscape cy3sbml tool. This figure shows a heterogeneous (hybrid) network composed of nodes and edges of a different nature. On the biochemical network side (left), the large circles represent the molecular species, which have an associated physical concentration. The small black squares and respective edges represent biochemical reactions with a well-defined stoichiometry. The black triangles are the reaction kinetic rates. On the feedforward neural network side (right), the blue circles represent the neural network nodes, which have an associated abstract numerical value defining the node strength. The green squares and respective edges represent signal propagation between nodes. The interlink between the two sides of the network is mediated by the black triangles, which in this case correspond to the maximum reaction rate parameters to be applied in the kinetic law equations. An interesting analogy may be established between the neural network part and an artificial nucleus of a cell, with associated signal transduction and gene regulatory networks, whose job is to control the underlying metabolic processes.
The hybrid models were trained with a synthetic dataset following the previously described procedure. The training was performed with ADAM with default hyperparameters (Table 2), 5000 iterations, semidirect sensitivity equations and stochastic regularization with a minibatch size of 0.78 and a weight dropout of 0.22. The choice of the minibatch size and weight dropout was based on the results by Pinto et al. [37]. Table 2 shows the overall training metrics for different FFNN sizes. The performances of the hybrid models in terms of training error (WMSE train) and testing error (WMSE test) are comparable, denoting an effective training without overfitting in all cases. This is further strengthened by the very low noise-free test error, showing that model predictions are very close to the true process behavior in all cases. The total number of network weights varied almost 10-fold, but this was not reflected in the training performance. The best hybrid structure was chosen to be the smallest one, [11 × 5 × 5 × 7], based on the lowest AICc value (1st row in Table 2).
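The two stochastic regularization devices used here (sampling a random 78% minibatch each iteration and dropping 22% of the weights) are simple to express; the helper functions below are an illustrative sketch, not the authors' MATLAB code:

```python
import random

def minibatch(indices, frac, rng):
    """Randomly sample a fraction `frac` of the training examples
    for one ADAM iteration (stochastic minibatch)."""
    k = max(1, round(frac * len(indices)))
    return rng.sample(indices, k)

def dropout_mask(n_weights, p_drop, rng):
    """Bernoulli mask silencing each weight with probability p_drop
    for one iteration (weight dropout regularization)."""
    return [0.0 if rng.random() < p_drop else 1.0 for _ in range(n_weights)]
```

Each iteration, the gradient g is evaluated on the minibatch only and multiplied elementwise by the dropout mask before the ADAM update.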
Figure 2. Hybrid model structure [11 × 5 × 5 × 7] (Table 2) visualized in the cy3sbml tool [43]. Left side: Metabolic network with physical meaning. Large circles represent biochemical species (metabolites). Black squares and black edges represent biochemical reactions. Black triangles represent kinetic laws. Right side: Artificial feedforward neural network with size [11 × 5 × 5 × 7]. Small blue circles represent neural network nodes. Green squares and gray edges represent signal propagation between neural network nodes. The first layer receives input signals of biochemical species concentrations (large circles). The last layer delivers kinetic parameter values to the black triangles, which mediate the communication between both sides of the network.
The trained hybrid models may be simulated and analyzed in any systems biology platform complying with the SBML standard. As proof-of-concept, the best hybrid model [11 × 5 × 5 × 7] in the SBML format was uploaded to the JWS online platform and simulated. Figure 3 shows the JWS online simulation of the original model and of the best hybrid model [11 × 5 × 5 × 7] for a test experiment not used for training (the center point experiment of the CC-DOE). The results show that the hybrid model perfectly mimicked the dynamics of the original mechanistic model.
The procedure presented in Figure 1 may result in mathematical structures that are more detailed mechanistically and much more complex to train than previously published hybrid models. This may raise concerns about the training feasibility of FFNNs interlinked with complex mathematical structures. Pinto et al. [37] compared traditional shallow hybrid modeling (using the Levenberg-Marquardt algorithm coupled with the indirect sensitivity equations, cross-validation and a hyperbolic tangent activation function) with deep hybrid modeling (using ADAM, semidirect sensitivity equations, stochastic regularization and multiple hidden layers). A clear advantage of hybrid deep learning both in terms of predictive power and computational cost was demonstrated. However, all experiments had a simplistic mechanistic part. Here, case study 1 model kept the original kinetic law equations. Seven highly complex kinetic equations with 47 parameters were "merged" with the FFNN. Table 2 results suggest nonetheless that the previously published deep learning approach for hybrid models (ADAM + semidirect sensitivity equations + stochastic regularization) is equally effective at training hybrid models with complex parametric functions.


Case Study 2: P58IPK Signal Transduction Pathway
The second case study was based on the viral infection model proposed by Goodman et al. [39], freely available in SBML in the JWS Online database (http://www.jjj.bio.vu.nl, accessed on 31 January 2023) under accession ID 'goodman' (Table 1). The authors studied the dynamics of the P58IPK signal transduction pathway during Influenza virus infection. A mathematical model was developed to evaluate the effect of protein P58a activation on the P58IPK pathway dynamics, particularly on the activation of the PKR kinase and on the phosphorylation of eIF2, both controlling viral protein expression. The model comprehends nine species (Flu, NS1, P58a, P58total, PKRp, PKRtotal, eIF2ap, eIF2atotal and ext) in a single compartment, of which four are fixed (P58total, PKRtotal, eIF2atotal and ext), corresponding to five ODEs. The model further has nine reactions and 10 parameters. The names of the variables were kept the same as in the original model and are explained in the database.
As in the previous case study, SBML hybrid models were created by combining FFNNs of different sizes (Table 3) with the original mechanistic model following the procedure of Figure 1. Figure 4 shows the hybrid model structure [5 × 10 × 10 × 10 × 9] (Table 3) visualized using the cy3sbml tool [43]. The left side of Figure 4 represents the original mechanistic signal transduction network, whereas the right side represents the FFNN added to the mechanistic core. The FFNN has five inputs corresponding to the concentrations of the five dynamical species (Flu, NS1, P58a, PKRp and eIF2ap), three hidden layers (10 × 10 × 10) with hyperbolic tangent activation functions, and nine outputs corresponding to the kinetic rates (v_1r, v_2r, v_3r, v_4r, v_5r, v_6r, v_7r, v_8r and v_9r, as they are named in the original SBML implementation). In this case study, the FFNNs in Table 3 completely replaced the kinetic laws of the original model, which were therefore deleted in the hybrid model structures. This network may be interpreted as a hybrid signal transduction pathway with a physical part composed of proteins and an artificial part composed of abstract neural network nodes.

Table 3 notes: ADAM hyperparameters as suggested by Kingma (2014) (α = 0.001, β1 = 0.9, β2 = 0.999 and ε = 1 × 10−8). The number of iterations was 5000. The minibatch size was 78% and the weight dropout probability was 0.22, as suggested by Pinto et al. (2022). The AICc was computed on the training set only. The noise-free WSSE measures the error between noise-free data (e.g., true process behavior) and model predictions.

Figure 4. Hybrid model structure [5 × 10 × 10 × 10 × 9] (Table 3) visualized using the cy3sbml tool [43]. Left side: Signal transduction network with physical meaning. Large circles represent biochemical species (proteins). Black squares and black edges represent biochemical reactions. Black triangles represent kinetic laws. Right side: Artificial feedforward neural network with size [5 × 10 × 10 × 10 × 9]. Small blue circles represent neural network nodes. Green squares and gray edges represent signal propagation between neural network nodes. The first layer receives input signals of biochemical species concentrations (large circles). The last layer delivers kinetic parameter values to the black triangles, which mediate the communication between both sides of the network.
Hybrid SBML models with a varying number of hidden layers and nodes in the hidden layers were trained using a synthetic dataset. A time-series dataset was created by simulating the original SBML model in the JWS platform following a similar procedure to case study 1 (available in the Supplementary Material as Simulation_data.xlsx; goodman_data sheet). A two-factor central composite design of experiments (CC-DOE) was carried out, varying the initial amount of Flu (overall level of infection within the host cell) between 2 and 6 (arbitrary units) and the initial amount of PKRp (phosphorylated PKR protein) between 0 and 2 (arbitrary units). The data for each experiment were recorded as a time series with 100 points and a sampling time of 0.05 (arbitrary units). This resulted in nine experiments with 100 time points each. Additionally, 10% Gaussian noise was added to the species concentrations to simulate experimental error. As in the previous case study, four experiments were selected for training (the star experiments of the CC-DOE, corresponding to 400 training examples for each state variable) and five experiments were used for testing (the square plus the center experiments of the CC-DOE, corresponding to 500 testing examples for each state variable). The training was performed using ADAM with default hyperparameters (Table 3), 5000 iterations, semidirect sensitivity equations and stochastic regularization (minibatch size of 0.78 and weight dropout of 0.22, as before). The overall training results for different FFNN sizes are shown in Table 3. As opposed to the previous case study, the size of the FFNN has an effect on the training performance. This may be explained by the smaller amount of mechanistic knowledge embodied in the hybrid models: since the original kinetic laws were completely deleted, the training results depend more heavily on the FFNN structure.
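The nine experiments of such a two-factor CC-DOE can be enumerated directly. In the sketch below, the axial ('star') points are placed on the factor bounds (alpha = 1, i.e., a face-centered design), which is an assumption, since the paper does not state the axial scaling:

```python
def ccdoe_two_factor(lo1, hi1, lo2, hi2):
    """Nine design points of a two-factor central composite DOE:
    4 factorial ('square') points, 4 face-centered axial ('star') points
    and 1 center point."""
    c1, c2 = (lo1 + hi1) / 2, (lo2 + hi2) / 2
    square = [(x1, x2) for x1 in (lo1, hi1) for x2 in (lo2, hi2)]
    star = [(lo1, c2), (hi1, c2), (c1, lo2), (c1, hi2)]
    return square + star + [(c1, c2)]
```

For case study 2, ccdoe_two_factor(2, 6, 0, 2) enumerates the nine initial-condition pairs (Flu, PKRp): the four star points train the model, and the four square points plus the center point test it.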
Interestingly, the larger networks with a higher depth (three hidden layers) outperformed the smaller networks, particularly in the extrapolation experiments (test WMSE). Overall, the structure [5 × 10 × 10 × 10 × 9] stands out as the best-performing model, with the lowest training error (WMSE train) and the lowest testing error (WMSE test). This is further reinforced by the lowest noise-free test error and the lowest AICc. This structure was uploaded to the JWS online platform and simulated comparatively to the original mechanistic model (Figure 5). As in the previous case study, the best-performing hybrid SBML model [5 × 10 × 10 × 10 × 9] was able to perfectly mimic the dynamics of the original mechanistic model.


Case Study 3: Yeast Glycolytic Oscillations
The third case study consisted of the reduced dynamical model of yeast glycolysis proposed by Dano et al. [40]. This model is a reduced version of a more detailed yeast glycolysis model. Both the original and reduced models exhibit limit cycle stability, with a certain number of species showing stable oscillations over time. It comprehends eight species (ADP, AMP, ATP, BPG, DHAP, FBP, GAP and sink) in a single compartment. The dynamic variable 'sink' was the only one that was fixed, thus translating into a system of seven ODEs. The model further comprehends 11 metabolic reactions and 31 parameters. This model is freely available in SBML format in the JWS Online database (http://www.jjj.bio.vu.nl, accessed on 31 January 2023) with accession ID 'dano1'.
SBML hybrid models were created by combining FFNNs of different sizes with the original mechanistic model. Figure 6 illustrates this process for the structure [7 × 10 × 10 × 10 × 11] with 421 weights (6th row of Table 4). The left side of Figure 6 represents the original metabolic network, whereas the right side represents the incorporated FFNN. In this example, the FFNN has seven inputs corresponding to the concentrations of the seven species (ADP, AMP, ATP, BPG, DHAP, FBP, GAP), three hidden layers (10 × 10 × 10) with hyperbolic tangent activation functions, and 11 outputs corresponding to the kinetic rates (v_1r, v_2r, v_3r, v_4r, v_5r, v_6r, v_7r, v_8r, v_9r, v_10r and v_11r, as they are named in the original SBML model). As in case study 2, the original kinetic laws were completely deleted in the hybrid models.

Figure 6. Hybrid SBML model visualized in the cy3sbml tool [43]. Left side: reduced glycolysis network with physical meaning. Large circles represent biochemical species (metabolites). Black squares and black edges represent biochemical reactions. Black triangles represent kinetic laws. Right side: artificial feedforward neural network with size [7 × 10 × 10 × 10 × 11]. Small blue circles represent neural network nodes. Green squares and gray edges represent signal propagation between neural network nodes. The first layer receives input signals of biochemical species concentrations (large circles). The last layer delivers kinetic parameter values to the black triangles, which mediate the communication between both sides of the network.

Hybrid SBML models of different sizes were trained with a synthetic dataset following a procedure similar to the previous case studies. A two-factor CC-DOE was carried out by varying the initial ADP concentration between 1 and 2 (arbitrary units) and the initial ATP concentration between 1 and 2 (arbitrary units), resulting in nine experiments. Each experiment was simulated on the JWS Online platform, with the resulting time-series data (100 time points) recorded with a sampling time of 0.05 (arbitrary units). Gaussian noise (10%) was added to the concentrations of the species. This synthetic dataset is available as Supplementary Material (Simulation_data.xlsx; dano1_data sheet). Four experiments were selected for training (the star experiments of the CC-DOE, corresponding to 400 training examples for each state variable) and five were used for testing (the square plus the center experiments of the CC-DOE, corresponding to 500 training examples for each state variable). The hybrid models were trained with these data using ADAM with default hyperparameters (Kingma, 2014; α = 0.001, β1 = 0.9, β2 = 0.999 and ε = 1 × 10−8), 10,000 iterations, semidirect sensitivity equations and stochastic regularization with a minibatch size of 0.78 and a weight dropout probability of 0.22, as suggested by Pinto et al. (2022). The AICc was computed on the training set only. The noise-free WSSE measures the error between noise-free data (e.g., the true process behavior) and the model predictions. The overall training results for different FFNN sizes are shown in Table 4. Unsurprisingly, limit-cycle stability is a more challenging problem for hybrid model development.
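The nine-experiment layout of the two-factor central composite design can be sketched as follows; a face-centred variant on the square domain [1, 2] × [1, 2] is assumed here, since the text distinguishes only "square" (corner), "star" and centre experiments:

```python
import itertools

def cc_doe_2factor(low, high):
    """Two-factor face-centred central composite design (a sketch):
    4 'square' corner points, 4 'star' face points and 1 centre point,
    giving the nine experiments described in the text."""
    centre = (low + high) / 2.0
    corners = list(itertools.product([low, high], repeat=2))                # square points
    stars = [(low, centre), (high, centre), (centre, low), (centre, high)]  # star points
    return corners + stars + [(centre, centre)]

# Initial ADP and ATP concentrations varied between 1 and 2 (arbitrary units).
design = cc_doe_2factor(1.0, 2.0)
```

In the paper's split, the four star points form the training set, and the four corner points plus the centre point form the test set.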
The effect of the FFNN depth and size was much more pronounced than in the previous example. The smaller networks were not able to exhibit stable oscillations even for the training examples. Only models with three hidden layers were able to accurately capture the oscillatory dynamics. The three largest structures show comparable training and testing errors. However, the structure [7 × 10 × 10 × 10 × 11] clearly stands out as the best-performing model, with the lowest training error (WMSE train) and the lowest testing error (WMSE test). This is further accentuated by the significantly lower noise-free test error and lower AICc. This hybrid SBML model was uploaded to the JWS Online platform and simulated side by side with the original metabolic model for the center-point test experiment (not used for training) of the CC-DOE (Figure 7). Remarkably, the best hybrid model structure [7 × 10 × 10 × 10 × 11] reproduced the oscillatory behavior of the original metabolic model very faithfully when exposed to initial conditions different from those applied in the training experiments.
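The WMSE figures used to rank the structures can be computed with a straightforward weighted mean squared error; the sketch below assumes a generic per-observation weighting, since the paper's exact weighting scheme (e.g., per-species normalization) is not restated in this section:

```python
import numpy as np

def wmse(y_true, y_pred, weights=None):
    """Weighted mean squared error between measured and predicted
    state-variable trajectories (uniform weights by default)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if weights is None:
        weights = np.ones_like(y_true)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * (y_true - y_pred) ** 2) / np.sum(weights))
```

Applied to the noise-free simulations instead of the noisy training data, the same function yields the noise-free test error reported alongside the WMSE in the tables.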


Conclusions
SBML is an open standard based on XML, currently adopted by the systems biology community to encode computational models of biological processes. An extensive body of research has produced a large number of such SBML models, which are currently stored in public databases. The SBML standard is, however, not commonly adopted to encode ML models. The main novelty of the present study is the combination of both modeling formalisms in a common hybrid workflow obeying the SBML standard. With few exceptions, previously published hybrid models embodied relatively simple mechanistic models (mechanistic scale-gap) and relatively simple ML models (ML scale-gap). With the proposed SBML hybrid modeling framework, the mechanistic scale-gap may be significantly narrowed. It is shown with three simple examples how publicly available SBML models may be easily upgraded to hybrid mechanistic/neural network models obeying the SBML standard. Such hybrid models may be trained with state-of-the-art deep learning algorithms to mimic, improve or extend existing SBML models. They may be further uploaded, trained and analyzed in SBML-compatible software tools. Even if the presented examples are relatively simple, the proposed framework is, in principle, directly scalable to larger whole-organism models, possibly even at the genome scale. All in all, we expect this framework to greatly facilitate the adoption of hybrid mechanistic/ML techniques for developing computational models of biological systems.

Conflicts of Interest:
The authors declare no conflict of interest.