# Shaping and Dilating the Fitness Landscape for Parameter Estimation in Stochastic Biochemical Models


## Abstract


## Featured Application

**This work supports parameter estimation, and the analyses that depend on it, for complex biochemical systems with stochastic behavior, which helps elucidate their unknown underlying mechanisms.**

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Reaction-Based Modeling and Stochastic Simulation Algorithm

- The set $\mathcal{S}=\{{S}_{1},\dots ,{S}_{N}\}$ of molecular species;
- The set $\mathcal{R}=\{{R}_{1},\dots ,{R}_{M}\}$ of the biochemical reactions describing all interactions among the species in $\mathcal{S}$.

- ${R}_{1}:S+E\stackrel{{c}_{1}={10}^{-2}}{\to}ES$;
- ${R}_{2}:ES\stackrel{{c}_{2}={10}^{-1}}{\to}S+E$;
- ${R}_{3}:ES\stackrel{{c}_{3}=1}{\to}E+P$.
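
The three reactions listed above can be simulated with Gillespie's direct method. The following is a minimal sketch, not the authors' implementation: it assumes mass-action propensities $a_m = c_m \cdot d_m$ and the initial amounts $S=200$, $E=100$, $ES=0$, $P=0$ reported for the MM model. The names `propensities`, `ssa`, and `UPDATES`, as well as the end time, are illustrative choices.

```python
import math
import random

# Stochastic constants c_1, c_2, c_3 of reactions R1-R3 in the text.
C = (1e-2, 1e-1, 1.0)

# State-change vectors over (S, E, ES, P), one per reaction.
UPDATES = (
    (-1, -1, +1, 0),   # R1: S + E -> ES
    (+1, +1, -1, 0),   # R2: ES -> S + E
    (0, +1, -1, +1),   # R3: ES -> E + P
)


def propensities(state):
    """Return a_m = c_m * d_m, with d_m the number of reactant combinations."""
    S, E, ES, P = state
    return (C[0] * S * E, C[1] * ES, C[2] * ES)


def ssa(initial_state, t_end, rng):
    """Gillespie direct method; returns a list of (time, state) pairs."""
    t = 0.0
    state = list(initial_state)
    trajectory = [(t, tuple(state))]
    while True:
        a = propensities(state)
        a0 = sum(a)
        if a0 == 0.0:          # no reaction can fire any more
            break
        zeta1 = 1.0 - rng.random()   # uniform in (0, 1], avoids log(0)
        zeta2 = rng.random()         # uniform in [0, 1)
        tau = -math.log(zeta1) / a0  # waiting time to the next reaction
        if t + tau > t_end:
            break
        t += tau
        # Pick reaction m with probability a_m / a0.
        threshold = zeta2 * a0
        cumulative = 0.0
        for m, a_m in enumerate(a):
            cumulative += a_m
            if threshold < cumulative:
                break
        for n, delta in enumerate(UPDATES[m]):
            state[n] += delta
        trajectory.append((t, tuple(state)))
    return trajectory


if __name__ == "__main__":
    run = ssa((200, 100, 0, 0), t_end=10.0, rng=random.Random(0))
    print(len(run), "events; final state (S, E, ES, P):", run[-1][1])
```

Repeating the call with different seeds gives different trajectories from the same initial state and parameterization, which is the source of the noise in the fitness landscape of the PE problem.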

#### 2.2. Fuzzy Self-Tuning Particle Swarm Optimization

#### 2.3. Dilation Functions

#### 2.4. Evolving Dilation Functions

#### 2.5. Surrogate Fourier Modeling with surF

1. **Discrete Cosine Transform.**

2. **Reducing the number of samples.**

- If $\overrightarrow{z}$ is in the convex hull defined by the points ${\overrightarrow{z}}_{0},\dots ,{\overrightarrow{z}}_{\sigma -1}$, then $\widehat{f}\left(\overrightarrow{z}\right)$ is obtained by a linear interpolation;
- otherwise, a linear interpolation is not possible and $\widehat{f}\left(\overrightarrow{z}\right)$ is defined as $\widehat{f}\left({\overrightarrow{z}}^{\prime}\right)$, where ${\overrightarrow{z}}^{\prime}$ is the point among ${\overrightarrow{z}}_{0},\dots ,{\overrightarrow{z}}_{\sigma -1}$ nearest to $\overrightarrow{z}$.
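
The two cases above can be illustrated in one dimension, where the convex hull of the samples is simply the interval between the smallest and largest sampled point. This is a simplified sketch under that assumption; the name `build_surrogate` is hypothetical, and in $D > 1$ dimensions a triangulation-based interpolator would be needed, which is not shown here.

```python
from bisect import bisect_right


def build_surrogate(samples):
    """Build a 1-D surrogate from (z, f(z)) pairs.

    Inside the sampled interval the surrogate interpolates linearly;
    outside it returns the value of the nearest sampled point.
    """
    points = sorted(samples)
    zs = [z for z, _ in points]
    ys = [y for _, y in points]

    def f_hat(z):
        if zs[0] <= z <= zs[-1]:
            i = bisect_right(zs, z) - 1      # zs[i] <= z
            if i == len(zs) - 1:             # z is the rightmost sample
                return ys[-1]
            w = (z - zs[i]) / (zs[i + 1] - zs[i])
            return (1.0 - w) * ys[i] + w * ys[i + 1]
        # Outside the hull: fall back to the nearest sample.
        return ys[0] if z < zs[0] else ys[-1]

    return f_hat


if __name__ == "__main__":
    f_hat = build_surrogate([(0.0, 0.0), (2.0, 4.0), (1.0, 1.0)])
    print(f_hat(0.5), f_hat(1.5), f_hat(-3.0), f_hat(5.0))
```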

3. **Parameters of surF.**

- $\sigma $, which is the number of samples from $f$ used to build $\widehat{f}$;
- $\rho $, which is the “density” of samples from $\widehat{f}$ to obtain the ${\rho}^{D}$ points used to calculate the DFT; and
- $\gamma $, which controls the number of low frequencies preserved.
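
To make the role of $\rho$ and $\gamma$ concrete, the sketch below filters a one-dimensional signal of $\rho$ equally spaced values: it computes a plain DFT, zeroes every coefficient whose frequency index is at least $\gamma$, and transforms back. It is an illustration only. The exact way surF counts preserved frequencies is an assumption here, and the function names `dft` and `low_pass` are hypothetical.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def low_pass(values, gamma):
    """Keep frequency indices 0 .. gamma-1 (assumed meaning of gamma), zero the rest."""
    n = len(values)
    coeffs = dft(values)
    for u in range(n):
        frequency = min(u, n - u)   # bins u and n-u carry the same frequency
        if frequency >= gamma:
            coeffs[u] = 0.0
    # Inverse DFT; the input is real, so only the real part is kept.
    return [
        (sum(coeffs[u] * cmath.exp(2j * math.pi * u * k / n) for u in range(n)) / n).real
        for k in range(n)
    ]


if __name__ == "__main__":
    rho = 32  # number of equally spaced points
    signal = [
        1.0 + math.cos(2 * math.pi * k / rho) + 0.5 * math.cos(2 * math.pi * 8 * k / rho)
        for k in range(rho)
    ]
    smooth = low_pass(signal, gamma=2)  # removes the frequency-8 ripple
    print([round(v, 3) for v in smooth[:4]])
```

A small $\gamma$ yields a smoother surrogate that discards fine-grained structure, while a large $\gamma$ keeps more of the original ruggedness and noise.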

## 3. Results

#### 3.1. Effect of DFs on the PE Problem

#### 3.2. Combining DFs and Fourier Surrogate Modeling

## 4. Discussion

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
ABF | Average Best Fitness |
BF | Basis Function |
DCT | Discrete Cosine Transform |
DF | Dilation Function |
E | Enzyme |
ES | Enzyme–Substrate complex |
FRBS | Fuzzy Rule-Based System |
FST-PSO | Fuzzy Self-Tuning Particle Swarm Optimization |
MM | Michaelis–Menten |
P | Product |
PE | Parameter Estimation |
PSO | Particle Swarm Optimization |
RBM | Reaction-Based Model |
S | Substrate |
SSA | Stochastic Simulation Algorithm |
surF | Fitness Landscape Surrogate Modeling with Fourier Filtering |

Mathematical Notation | |
---|---|
${\alpha}_{m,n}$ | stoichiometric coefficient associated with the $n$-th reactant of the $m$-th reaction |
${\beta}_{m,n}$ | stoichiometric coefficient associated with the $n$-th product of the $m$-th reaction |
$\mathbf{c}$ | vector of stochastic constants |
${c}_{m}$ | stochastic (kinetic) constant of the $m$-th reaction |
${C}_{\mathrm{cog}}$ | cognitive attractor of FST-PSO |
${C}_{\mathrm{soc}}$ | social attractor of FST-PSO |
$D$ | number of dimensions of the search space |
${d}_{m}$ | number of distinct combinations of the reactant molecules |
$f$ | original fitness function |
$\tilde{f}$ | dilated fitness function |
$\overline{f}$, $\widehat{f}$ | surrogate fitness functions |
${\mathcal{F}}_{r}\left({q}_{p}\right)$ | folding operator |
$\gamma $ | number of lowest frequencies that are preserved (not zeroed) |
$I$ | number of sampled points to compute the dilated landscape |
${[\ell ,u]}^{D}$ | lower and upper bounds of the search space |
${l}_{p}$ | linear basis function |
${O}_{n}\left({t}_{k}\right)$ | the experimental (target) amount of ${S}_{n}$ measured at time ${t}_{k}$ |
$p$ | parameter of the linear basis function ${l}_{p}$ |
${\psi}_{\rho}$ | coefficient representing the amplitude of the $\rho $-th frequency |
$r$ | parameter of the folding operator |
$\mathcal{R}$ | set of biochemical reactions |
${R}_{m}$ | $m$-th biochemical reaction |
$\rho $ | number of equally spaced points to build the surrogate function |
$\mathcal{S}$ | set of molecular species |
${S}_{i}$ | $i$-th molecular species |
$\sigma $ | number of samples used to construct the surrogate |
$t$ | time of the system |
$\mathbf{t}$ | vector of time points |
${t}_{k}$ | $k$-th time point |
$\tau $ | waiting time |
${\overrightarrow{v}}_{\mathrm{max}}$ | maximum velocity of the FST-PSO particles |
${\overrightarrow{v}}_{\mathrm{min}}$ | minimum velocity of the FST-PSO particles |
$\omega $ | inertia factor of FST-PSO |
${X}_{n}^{\mathbf{c}}\left({t}_{k}\right)$ | simulated amount of the species ${S}_{n}$ at time ${t}_{k}$ |
$\mathbf{x}\left(t\right)$ | vector representing the state of the system at time $t$ |
${x}_{n}$ | amount of the $n$-th molecular species |
$\chi $ | control point |
${\chi}_{Q}$ | $Q$-th control point |
$\xi $ | length of the individuals representing the DFs |
$\boldsymbol{\chi}$ | vector of control points |
${y}_{\rho}$ | fitness value of the $\rho $-th point of the surrogate |
${z}_{\rho}$ | $\rho $-th point of the search space to build the surrogate function |
${\zeta}_{1}$ | random number sampled from a uniform distribution |
${\zeta}_{2}$ | random number sampled from a uniform distribution |


**Figure 1.** Example of stochastic simulation: 100 independent SSA runs of the MM model, all starting from the same initial state ($S=200$, $E=100$, $ES=0$, and $P=0$) and using the same parameterization, lead to quantitatively different trajectories and final distributions of the chemical species.

**Figure 3.** Effect of the anti-log DF. (**Left**): control points encoding the DF. (**Right**): effect of the mapping of a particle’s component (x axis) in the original search space of the PE problem (y axis).

**Figure 4.** (**Left**): different examples of the linear transformation ${l}_{p}$ with different values of $p$. (**Right**): examples of folding operators with varying values of $p$ and $r$.

**Figure 5.** Graphical representation of the DFs used in the PE problem. (**Left**): anti-log function analytically designed to expand the fitness landscape in the lowest orders of magnitude. (**Right**): evolved DF that performs essentially the same dilation as the anti-log DF.

**Figure 6.** Effect of DFs on the fitness landscape of the MM model. (**a**) Original fitness landscape. (**b**) Fitness landscape dilated by means of the analytical anti-log DF. (**c**) Fitness landscape dilated by means of the optimal evolved DF.

**Figure 7.** (**Left**): convergence plot of the PE of the MM model obtained with FST-PSO (no ${v}_{min}$) (coral solid line), FST-PSO using the analytic DF (green dashed line), and FST-PSO using the evolved DF (blue dotted line). The lines correspond to the ABF calculated over 30 runs. (**Right**): boxplots representing the distributions of the best solutions found at the end of each run, for each methodology; red dashes denote the mean of the distributions, while diamonds denote the outliers. The asterisks denote the p-values obtained by comparing the distributions by means of the Mann–Whitney U tests (** p-value $\le 0.0001$).

**Figure 8.** Effect of the surrogate modeling on the fitness landscape of the MM model. (**a**) Surrogate model of the original fitness landscape. (**b**) Surrogate model of the fitness landscape dilated by means of the anti-log DF. (**c**) Surrogate model of the fitness landscape dilated by means of the evolved DF.

**Figure 9.** Surrogate models of the fitness landscape for the PE of the MM model dilated using the evolved DF, with $\gamma =2$ (**a**), $\gamma =3$ (**b**), $\gamma =5$ (**c**), $\gamma =10$ (**d**), $\gamma =50$ (**e**), $\gamma =100$ (**f**).

**Figure 10.** (**Left**): convergence plot of the PE of the MM model obtained with FST-PSO (no ${v}_{min}$) (coral solid line), FST-PSO using the analytic DF + surF (green dashed line), and FST-PSO using the evolved DF + surF (blue dotted line). The lines correspond to the ABF calculated over 30 runs. (**Right**): boxplots representing the distribution of the best solutions found at the end of each run; red dashes denote the mean of the distributions, while diamonds denote outliers. The asterisks denote the p-values obtained by comparing the distributions by means of the Mann–Whitney U tests (* p-value $\le 0.5$, ** p-value $\le 0.0001$).

Molecular Species | Amount |
---|---|
$S$ (substrate) | 200 |
$E$ (enzyme) | 100 |
$ES$ (enzyme–substrate complex) | 0 |
$P$ (product) | 0 |

ID | Name | Semantics |
---|---|---|
0 | ${I}_{p}$ | Identity |
1 | ${l}_{p}$ | Linear transformation |
2 | ${\mathcal{F}}_{\frac{1}{4}}\left({l}_{p}\right)$ | Folding operator |
3 | ${\mathcal{F}}_{\frac{1}{2}}\left({l}_{p}\right)$ | Folding operator |
4 | ${\mathcal{F}}_{\frac{3}{4}}\left({l}_{p}\right)$ | Folding operator |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Nobile, M.S.; Papetti, D.M.; Spolaor, S.; Cazzaniga, P.; Manzoni, L. Shaping and Dilating the Fitness Landscape for Parameter Estimation in Stochastic Biochemical Models. *Appl. Sci.* **2022**, *12*, 6671.
https://doi.org/10.3390/app12136671
