# Towards Generative Design of Computationally Efficient Mathematical Models with Evolutionary Learning


## Abstract


## 1. Introduction

## 2. Related Work

## 3. Problem Statement

## 4. Important Obstacles on the Way of Generative Co-Design Implementation

## 5. Experimental Studies

### 5.1. Choice of the Model Evaluation Algorithm

### 5.2. Computationally Intensive Function Parallelization

#### 5.2.1. Parallelization of Generative Algorithm for PDE Discovery

#### 5.2.2. Reducing of the Computational Complexity of Composite Models

### 5.3. Co-Design Strategies for the Evolutionary Learning Algorithm

### 5.4. Strategies for Optimization of Hyperparameters in Evolutionary Learning Algorithm

Algorithm 1: The simplified pseudocode of the composite model tuning algorithm illustrated in Figure 6b.

### 5.5. Estimation of the Empirical Performance Models

## 6. Discussion and Future Works

## 7. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
AI | Artificial intelligence |
ANN | Artificial neural network |
AutoML | Automated machine learning |
DAG | Directed acyclic graph |
EPM | Empirical performance model |
GPU | Graphics processing unit |
ML | Machine learning |
MSE | Mean squared error |
NAS | Neural architecture search |
ODE | Ordinary differential equation |
PDE | Partial differential equation |
PM | Performance model |
$R^2$ | Coefficient of determination |
RMSE | Root mean square error |
ROC AUC | Area under receiver operating characteristic curve |

## Appendix A. Additional Details on the Empirical Performance Models Validation

**Table A1.** Approximation errors for the different structures of the empirical performance models obtained for the atomic ML models. The most suitable structure is highlighted in bold. The three structures are: (1) $\Theta_1 N_{obs} N_{feat}$; (2) $\Theta_1 N_{obs} N_{feat} + \Theta_2 N_{obs}$; (3) $\frac{N_{obs}}{\Theta_1^2} + \frac{N_{obs}^2 N_{feat}}{\Theta_2^2}$.

Model | (1) RMSE, s | (1) $R^2$ | (2) RMSE, s | (2) $R^2$ | (3) RMSE, s | (3) $R^2$ |
---|---|---|---|---|---|---|
LDA | 0.35 | 0.92 | 0.11 | 0.99 | 0.66 | 0.74 |
QDA | 0.75 | 0.57 | 0.03 | 0.99 | 0.93 | 0.36 |
Naive Bayes | 0.82 | 0.42 | 0.04 | 0.99 | 0.961 | 0.21 |
Decision tree | 1.48 | 0.98 | 1.34 | 0.98 | 3.49 | 0.89 |
PCA | 0.28 | 0.78 | 0.04 | 0.99 | 0.28 | 0.95 |
Logit | 0.54 | 0.91 | 0.37 | 0.96 | 0.95 | 0.75 |
Random forest | 96.81 | 0.60 | 26.50 | 0.71 | 21.36 | 0.92 |

**Figure A1.** The empirical performance models for the different atomic models: LDA, QDA, decision tree (DT), PCA dimensionality reduction model, Bernoulli Naïve Bayes model, and logistic regression. The heatmaps represent the EPM predictions and the black points are real measurements.
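As an illustration of how the coefficients of a structure such as $\Theta_1 N_{obs} N_{feat} + \Theta_2 N_{obs}$ from Table A1 could be estimated, the following minimal sketch fits it by ordinary least squares and reports RMSE and $R^2$. The timing data are synthetic, and the function name `fit_epm` and the "true" coefficients are hypothetical; this is not the authors' fitting procedure.

```python
import numpy as np

def fit_epm(n_obs, n_feat, t):
    """Least-squares fit of t ~ theta1 * n_obs * n_feat + theta2 * n_obs."""
    # Design matrix: one column per term of the assumed EPM structure.
    X = np.column_stack([n_obs * n_feat, n_obs])
    theta, *_ = np.linalg.lstsq(X, t, rcond=None)
    pred = X @ theta
    rmse = float(np.sqrt(np.mean((t - pred) ** 2)))
    r2 = float(1.0 - np.sum((t - pred) ** 2) / np.sum((t - np.mean(t)) ** 2))
    return theta, rmse, r2

# Synthetic "measurements" of fit time in seconds (assumption, not real data).
rng = np.random.default_rng(0)
n_obs = rng.integers(1_000, 50_000, size=200).astype(float)
n_feat = rng.integers(2, 100, size=200).astype(float)
true_theta = np.array([3e-6, 2e-4])  # hypothetical coefficients
t = true_theta[0] * n_obs * n_feat + true_theta[1] * n_obs
t = t + rng.normal(0.0, 0.05, size=200)  # measurement noise

theta, rmse, r2 = fit_epm(n_obs, n_feat, t)
print(theta, rmse, r2)
```

The other structures in Table A1 can be compared in the same way by changing the columns of the design matrix (or by a nonlinear fit for the third structure) and comparing the resulting RMSE and $R^2$.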

## References

- Packard, N.; Bedau, M.A.; Channon, A.; Ikegami, T.; Rasmussen, S.; Stanley, K.; Taylor, T. Open-Ended Evolution and Open-Endedness: Editorial Introduction to the Open-Ended Evolution I Special Issue; MIT Press: Cambridge, MA, USA, 2019.
- Krish, S. A practical generative design method. Comput.-Aided Des. **2011**, 43, 88–100.
- Ferreira, C. Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2006; Volume 21.
- Pavlyshenko, B. Using stacking approaches for machine learning models. In Proceedings of the 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine, 21–25 August 2018; pp. 255–258.
- Kovalchuk, S.V.; Metsker, O.G.; Funkner, A.A.; Kisliakovskii, I.O.; Nikitin, N.O.; Kalyuzhnaya, A.V.; Vaganov, D.A.; Bochenina, K.O. A conceptual approach to complex model management with generalized modelling patterns and evolutionary identification. Complexity **2018**, 2018, 5870987.
- Kalyuzhnaya, A.V.; Nikitin, N.O.; Vychuzhanin, P.; Hvatov, A.; Boukhanovsky, A. Automatic evolutionary learning of composite models with knowledge enrichment. In Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, Cancun, Mexico, 8–12 July 2020; pp. 43–44.
- Lecomte, S.; Guillouard, S.; Moy, C.; Leray, P.; Soulard, P. A co-design methodology based on model driven architecture for real time embedded systems. Math. Comput. Model. **2011**, 53, 471–484.
- He, X.; Zhao, K.; Chu, X. AutoML: A Survey of the State-of-the-Art. arXiv **2019**, arXiv:1908.00709.
- Caldwell, J.; Ram, Y.M. Mathematical Modelling: Concepts and Case Studies; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; Volume 6.
- Banwarth-Kuhn, M.; Sindi, S. How and why to build a mathematical model: A case study using prion aggregation. J. Biol. Chem. **2020**, 295, 5022–5035.
- Castillo, O.; Melin, P. Automated mathematical modelling for financial time series prediction using fuzzy logic, dynamical systems and fractal theory. In Proceedings of the IEEE/IAFE 1996 Conference on Computational Intelligence for Financial Engineering (CIFEr), New York City, NY, USA, 24–26 March 1996; pp. 120–126.
- Kevrekidis, I.G.; Gear, C.W.; Hyman, J.M.; Kevrekidis, P.G.; Runborg, O.; Theodoropoulos, C. Equation-free, coarse-grained multiscale computation: Enabling microscopic simulators to perform system-level analysis. Commun. Math. Sci. **2003**, 1, 715–762.
- Schmidt, M.; Lipson, H. Distilling free-form natural laws from experimental data. Science **2009**, 324, 81–85.
- Kondrashov, D.; Chekroun, M.D.; Ghil, M. Data-driven non-Markovian closure models. Phys. D Nonlinear Phenom. **2015**, 297, 33–55.
- Maslyaev, M.; Hvatov, A.; Kalyuzhnaya, A. Data-Driven Partial Derivative Equations Discovery with Evolutionary Approach. In International Conference on Computational Science; Springer: Berlin/Heidelberg, Germany, 2019; pp. 635–641.
- Qi, F.; Xia, Z.; Tang, G.; Yang, H.; Song, Y.; Qian, G.; An, X.; Lin, C.; Shi, G. A Graph-based Evolutionary Algorithm for Automated Machine Learning. Softw. Eng. Rev. **2020**, 1, 10–37686.
- Olson, R.S.; Bartley, N.; Urbanowicz, R.J.; Moore, J.H. Evaluation of a tree-based pipeline optimization tool for automating data science. In Proceedings of the Genetic and Evolutionary Computation Conference, New York, NY, USA, 20–24 July 2016; pp. 485–492.
- Zhao, H. High Performance Machine Learning through Codesign and Rooflining. Ph.D. Thesis, UC Berkeley, Berkeley, CA, USA, 2014.
- Amid, A.; Kwon, K.; Gholami, A.; Wu, B.; Asanović, K.; Keutzer, K. Co-design of deep neural nets and neural net accelerators for embedded vision applications. IBM J. Res. Dev. **2019**, 63, 6:1–6:14.
- Li, Y.; Park, J.; Alian, M.; Yuan, Y.; Qu, Z.; Pan, P.; Wang, R.; Schwing, A.; Esmaeilzadeh, H.; Kim, N.S. A network-centric hardware/algorithm co-design to accelerate distributed training of deep neural networks. In Proceedings of the 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, 20–24 October 2018; pp. 175–188.
- Bertels, K. Hardware/Software Co-Design for Heterogeneous Multi-Core Platforms; Springer: Berlin/Heidelberg, Germany, 2012.
- Wang, K.; Liu, Z.; Lin, Y.; Lin, J.; Han, S. HAQ: Hardware-Aware Automated Quantization With Mixed Precision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- Cai, H.; Zhu, L.; Han, S. Proxylessnas: Direct neural architecture search on target task and hardware. arXiv **2018**, arXiv:1812.00332.
- Dosanjh, S.S.; Barrett, R.F.; Doerfler, D.; Hammond, S.D.; Hemmert, K.S.; Heroux, M.A.; Lin, P.T.; Pedretti, K.T.; Rodrigues, A.F.; Trucano, T. Exascale design space exploration and co-design. Future Gener. Comput. Syst. **2014**, 30, 46–58.
- Gramacy, R.B.; Lee, H.K. Adaptive Design of Supercomputer Experiments. 2018. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.312.3750&rep=rep1&type=pdf (accessed on 26 December 2020).
- Glinskiy, B.; Kulikov, I.; Snytnikov, A.V.; Chernykh, I.; Weins, D.V. A multilevel approach to algorithm and software design for exaflops supercomputers. Numer. Methods Program. **2015**, 16, 543–556.
- Kaltenecker, C. Comparison of Analytical and Empirical Performance Models: A Case Study on Multigrid Systems. Master’s Thesis, University of Passau, Passau, Germany, 2016.
- Calotoiu, A. Automatic Empirical Performance Modeling of Parallel Programs. Ph.D. Thesis, Technische Universität, Berlin, Germany, 2018.
- Eggensperger, K.; Lindauer, M.; Hoos, H.H.; Hutter, F.; Leyton-Brown, K. Efficient benchmarking of algorithm configurators via model-based surrogates. Mach. Learn. **2018**, 107, 15–41.
- Chirkin, A.M.; Belloum, A.S.; Kovalchuk, S.V.; Makkes, M.X.; Melnik, M.A.; Visheratin, A.A.; Nasonov, D.A. Execution time estimation for workflow scheduling. Future Gener. Comput. Syst. **2017**, 75, 376–387.
- Gamatié, A.; An, X.; Zhang, Y.; Kang, A.; Sassatelli, G. Empirical model-based performance prediction for application mapping on multicore architectures. J. Syst. Archit. **2019**, 98, 1–16.
- Shi, Z.; Dongarra, J.J. Scheduling workflow applications on processors with different capabilities. Future Gener. Comput. Syst. **2006**, 22, 665–675.
- Visheratin, A.A.; Melnik, M.; Nasonov, D.; Butakov, N.; Boukhanovsky, A.V. Hybrid scheduling algorithm in early warning systems. Future Gener. Comput. Syst. **2018**, 79, 630–642.
- Melnik, M.; Nasonov, D. Workflow scheduling using Neural Networks and Reinforcement Learning. Procedia Comput. Sci. **2019**, 156, 29–36.
- Olson, R.S.; Moore, J.H. TPOT: A tree-based pipeline optimization tool for automating machine learning. Proc. Mach. Learn. Res. **2016**, 64, 66–74.
- Evans, L.C. Partial Differential Equations; Graduate Studies in Mathematics; American Mathematical Society: Providence, RI, USA, 1998.
- Czarnecki, W.M.; Osindero, S.; Jaderberg, M.; Swirszcz, G.; Pascanu, R. Sobolev training for neural networks. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 4278–4287.
- Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. **2019**, 378, 686–707.
- Epicoco, I.; Mocavero, S.; Porter, A.R.; Pickles, S.M.; Ashworth, M.; Aloisio, G. Hybridisation strategies and data structures for the NEMO ocean model. Int. J. High Perform. Comput. Appl. **2018**, 32, 864–881.
- Nikitin, N.O.; Polonskaia, I.S.; Vychuzhanin, P.; Barabanova, I.V.; Kalyuzhnaya, A.V. Structural Evolutionary Learning for Composite Classification Models. Procedia Comput. Sci. **2020**, 178, 414–423.
- Full Script That Allows Reproducing the Results Is Available in the GitHub Repository. Available online: https://github.com/ITMO-NSS-team/FEDOT.Algs/blob/master/estar/examples/ann_approximation_experiments.ipynb (accessed on 26 December 2020).
- Full Script That Allows Reproducing the Results Is Available in the GitHub Repository. Available online: https://github.com/ITMO-NSS-team/FEDOT.Algs/blob/master/estar/examples/Pareto_division.py (accessed on 26 December 2020).
- Géron, A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems; O’Reilly Media: Sebastopol, CA, USA, 2019.
- Nikitin, N.O.; Vychuzhanin, P.; Hvatov, A.; Deeva, I.; Kalyuzhnaya, A.V.; Kovalchuk, S.V. Deadline-driven approach for multi-fidelity surrogate-assisted environmental model calibration: SWAN wind wave model case study. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, Prague, Czech Republic, 13–17 July 2019; pp. 1583–1591.
- Olson, R.S.; La Cava, W.; Orzechowski, P.; Urbanowicz, R.J.; Moore, J.H. PMLB: A large benchmark suite for machine learning evaluation and comparison. BioData Min. **2017**, 10, 1–13.
- Li, K.; Xiang, Z.; Tan, K.C. Which surrogate works for empirical performance modelling? A case study with differential evolution. In Proceedings of the 2019 IEEE Congress on Evolutionary Computation (CEC), Wellington, New Zealand, 10–13 June 2019; pp. 1988–1995.
- Bauernhansl, T.; Hartleif, S.; Felix, T. The Digital Shadow of production–A concept for the effective and efficient information supply in dynamic industrial environments. Procedia CIRP **2018**, 72, 69–74.
- Cha, D.H.; Wang, Y. A dynamical initialization scheme for real-time forecasts of tropical cyclones using the WRF model. Mon. Weather Rev. **2013**, 141, 964–986.
- Melnik, M.; Nasonov, D.A.; Liniov, A. Intellectual Execution Scheme of Iterative Computational Models based on Symbiotic Interaction with Application for Urban Mobility Modelling. IJCCI **2019**, 1, 245–251.

**Figure 1.** The description of the generative co-design concept: the different aspects of the model design (genotype, phenotype, and the identification methods); the pipeline of data-driven modeling; and the difference between the classical design approach and the co-design approach.

**Figure 2.** The structure of the genotype during evolutionary optimization: functional properties, set of parameters, and relations between atomic blocks.

**Figure 3.** Pareto frontier obtained after the evolutionary learning of the composite model in the “quality-execution time” subspace. The points referred to as ${M}_{1}$–${M}_{4}$ represent the different solutions obtained during optimization. ${p}^{max}$ and ${\tau}_{c}$ represent the quality and time constraints.

**Figure 4.** Setup that illustrates the inefficiency of the parallel evolution implementation due to the computational complexity of the fitness function.

**Figure 5.** Approaches to the parallel calculation of the fitness function in the evolutionary learning algorithm: (**a**) synchronous, where each element of the population is processed at one node until all are processed; (**b**) asynchronous, where one of the nodes controls the calculations in the other nodes.
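The two evaluation modes named in the Figure 5 caption can be contrasted with a minimal sketch. It assumes threads of a single process stand in for computational nodes and uses a toy fitness function in place of an expensive model fit; the function names are hypothetical and this is not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fitness(individual):
    # Toy stand-in for an expensive fitness evaluation (assumption).
    return sum(x * x for x in individual)

def evaluate_sync(population, workers=4):
    # Synchronous mode: the whole batch is dispatched and the caller
    # proceeds only after every individual has been evaluated.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))

def evaluate_async(population, workers=4):
    # Asynchronous mode: a controlling loop receives each result as soon
    # as a worker finishes, so it could react (e.g., breed a new
    # individual) without waiting for the slowest evaluation.
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fitness, ind): i for i, ind in enumerate(population)}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return [results[i] for i in range(len(population))]

population = [[1, 2], [3, 4], [0, 0], [2, 2]]
sync_scores = evaluate_sync(population)
async_scores = evaluate_async(population)
print(sync_scores, async_scores)
```

Both functions return the same scores; they differ only in when the controlling code regains control, which matters when evaluation times vary strongly across the population.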

**Figure 6.** The different strategies of hyperparameter tuning for the composite models: (**a**) individual tuning of each atomic model; (**b**) tuning of the composite model that uses secondary models to evaluate the tuning quality for the primary models.

**Figure 7.** (**a**) Comparison of the equation solution and its approximation by artificial neural networks (ANNs) for a time slice; (**b**) heatmap of the approximation error (${u}_{approx}-{u}_{true}$).

**Figure 8.** Comparison of derivatives obtained by polynomial differentiation and by symbolic regression for (**a**) the first time derivative and (**b**) the first spatial derivative, for a time slice ($t=50$).
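One common reading of "polynomial differentiation" is to fit a low-degree polynomial in a sliding window around each grid point and differentiate the fitted polynomial analytically. The sketch below illustrates that idea on a sine wave; the window size, the polynomial degree, and the function name `poly_derivative` are assumptions and are not taken from the paper.

```python
import numpy as np

def poly_derivative(x, u, window=7, degree=3, order=1):
    """Differentiate u(x) by fitting a local polynomial around each point."""
    half = window // 2
    du = np.empty_like(u, dtype=float)
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        # Center the window at x[i] so the derivative is evaluated at 0.
        coeffs = np.polyfit(x[lo:hi] - x[i], u[lo:hi], degree)
        du[i] = np.polyval(np.polyder(coeffs, order), 0.0)
    return du

x = np.linspace(0.0, 2.0 * np.pi, 200)
u = np.sin(x)
du = poly_derivative(x, u)
max_err = float(np.max(np.abs(du - np.cos(x))))
print(max_err)
```

On noisy data, a wider window or a lower degree trades accuracy for smoothing, which is typically the reason for preferring this scheme over plain finite differences.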

**Figure 9.** Comparison of derivatives obtained by polynomial differentiation and by symbolic regression for (**a**) the second time derivative and (**b**) the second spatial derivative, for a time slice ($t=50$).

**Figure 10.** The solution of the ODE from Equation (20), its approximation by a neural network, and the derivatives calculated by analytic, polynomial, and automatic differentiation.

**Figure 11.** The results of the experiments on the divided domains: (**a**) evaluations of the discovered equation quality for different division fractions along each axis (2× division represents division of the domain into 4 square parts); (**b**) domain processing time (relative to the processing of the entire domain) versus the number of subdomains.

**Figure 13.** The total number of model fit requests and the actually executed fits (cache misses) for the shared and local cache.
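The quantities compared in Figure 13 can be illustrated with a minimal sketch of a fit cache that counts requests and cache misses. It assumes that a composite model is a linear chain of atomic models, that a fitted prefix of the chain can be reused, and that a "local" cache is private to one worker while a "shared" cache serves all workers. The class `FitCache` and the function `run` are hypothetical names, not part of the authors' code.

```python
class FitCache:
    """Stores fitted models by key and counts requests and misses."""

    def __init__(self):
        self._store = {}
        self.requests = 0
        self.misses = 0

    def fit(self, key, fit_fn):
        self.requests += 1
        if key not in self._store:
            # Cache miss: the fit is actually executed.
            self.misses += 1
            self._store[key] = fit_fn()
        return self._store[key]

def run(population, n_workers, shared):
    # One cache for everyone, or one private cache per worker.
    caches = [FitCache()] if shared else [FitCache() for _ in range(n_workers)]
    for i, chain in enumerate(population):
        cache = caches[0] if shared else caches[i % n_workers]
        # Each prefix of the chain is a reusable fitted sub-model.
        for depth in range(1, len(chain) + 1):
            prefix = tuple(chain[:depth])
            cache.fit(prefix, lambda p=prefix: "fitted:" + ">".join(p))
    return sum(c.requests for c in caches), sum(c.misses for c in caches)

population = [("pca", "logit"), ("pca", "rf"), ("pca", "logit", "rf")]
shared_stats = run(population, n_workers=2, shared=True)
local_stats = run(population, n_workers=2, shared=False)
print(shared_stats, local_stats)
```

In this toy population the number of requests is identical in both modes, while the shared cache executes fewer fits because sub-models fitted by one worker are visible to the others.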

**Figure 14.** (**a**) The best achieved fitness value for the different computational configurations (represented as different numbers of parallel threads) used to evaluate the evolutionary algorithm on the classification benchmark. The boxplots are built for 10 independent runs. (**b**) Pareto frontier (blue) obtained for the classification benchmark in the “execution time-model quality” subspace. The red points represent dominated individuals.
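The split into non-dominated and dominated individuals shown in Figure 14b can be reproduced with a short sketch. It assumes each individual is described by a pair (execution time, error), with both objectives minimized, so that model quality enters as an error value; the sample points are invented for illustration.

```python
def pareto_front(points):
    """Return the non-dominated (time, error) pairs, both objectives minimized."""
    front = []
    for i, (t_i, e_i) in enumerate(points):
        # A point is dominated if another point is no worse in both
        # objectives and strictly better in at least one.
        dominated = any(
            t_j <= t_i and e_j <= e_i and (t_j < t_i or e_j < e_i)
            for j, (t_j, e_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((t_i, e_i))
    return sorted(front)

# Hypothetical individuals: (execution time in seconds, classification error).
points = [(1.0, 0.30), (2.0, 0.20), (2.5, 0.25), (4.0, 0.10), (5.0, 0.12)]
front = pareto_front(points)
print(front)
```

The pairwise check is quadratic in the population size, which is acceptable for the population sizes of a few dozen individuals mentioned in the captions.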

**Figure 15.** (**a**) The comparison of different scenarios of evolutionary optimization: best (ideal), realistic, and worst cases; (**b**) the conceptual dependence of the parallelization efficiency on the variance of the execution time in the population for the different types of selection.

**Figure 16.** The comparison of different approaches to the evolutionary optimization of the composite models. The min-max intervals are built for 10 independent runs. The green line represents the static optimization algorithm with 20 individuals in the population; the blue line represents the dynamic optimization algorithm with 10 individuals in the population. ${T}_{0}$, ${T}_{1}$ and ${T}_{2}$ are different real-time constraints; ${F}_{0}$, ${F}_{1}$ and ${F}_{2}$ are the values of the fitness function obtained with the corresponding constraints.

**Figure 17.** Predictions of the performance model that uses an additive approach for local empirical performance models (EPMs) of atomic models. The red points represent the real evaluations of the composite model as a part of validation.
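The additive approach mentioned in the Figure 17 caption can be sketched as a sum of the local EPM predictions of the atomic models that make up a composite model. The sketch assumes every atomic model follows the structure $\Theta_1 N_{obs} N_{feat} + \Theta_2 N_{obs}$ and receives the same $N_{obs}$ and $N_{feat}$; the coefficients below are hypothetical and are not the fitted values from the paper.

```python
def atomic_epm(theta1, theta2):
    """Build a local EPM t = theta1 * n_obs * n_feat + theta2 * n_obs."""
    def predict(n_obs, n_feat):
        return theta1 * n_obs * n_feat + theta2 * n_obs
    return predict

def composite_pm(atomic_models):
    """Additive performance model: total time is the sum of atomic estimates."""
    def predict(n_obs, n_feat):
        return sum(model(n_obs, n_feat) for model in atomic_models)
    return predict

# Hypothetical coefficients for two atomic models in a chain.
pca = atomic_epm(1e-6, 1e-4)
logit = atomic_epm(2e-6, 3e-4)

pm = composite_pm([pca, logit])
estimate = pm(10_000, 20)  # estimated fit time in seconds
print(estimate)
```

A more careful variant would propagate the changed number of features after dimensionality reduction to the downstream models instead of reusing one $N_{feat}$ for the whole chain.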

**Table 1.** The quality measures for the composite models before and after random search-based tuning of hyperparameters. The regression problems from the PMLB suite [45] are used as benchmarks.

Benchmark Name | MSE without Tuning | MSE with Tuning | $R^2$ without Tuning | $R^2$ with Tuning |
---|---|---|---|---|
1203_BNG_pwLinear | 8.213 | 0.102 | 0.592 | 0.935 |
197_cpu_act | 5.928 | 7.457 | 0.98 | 0.975 |
215_2dplanes | 1.007 | 0.001 | 0.947 | 1 |
228_elusage | 126.755 | 0.862 | 0.524 | 0.996 |
294_satellite_image | 0.464 | 0.591 | 0.905 | 0.953 |
4544_GeographicalOriginalofMusic | 0.194 | 2.113 | 0.768 | 0.792 |
523_analcatdata_neavote | 0.593 | 0.025 | 0.953 | 0.999 |
560_bodyfat | 0.07 | 0.088 | 0.998 | 0.894 |
561_cpu | 3412.46 | 0.083 | 0.937 | 0.91 |
564_fried | 1.368 | 0.073 | 0.944 | 0.934 |
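For readers unfamiliar with random search-based tuning, the sketch below shows the general idea on a deliberately simple stand-in: a ridge regression whose single hyperparameter is sampled log-uniformly and scored by MSE on a validation split. The model, the data, the search range, and all function names are assumptions made for illustration; the composite models and tuning procedure behind Table 1 are more involved.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    # Closed-form ridge regression weights.
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def mse(y, pred):
    return float(np.mean((y - pred) ** 2))

def r2(y, pred):
    return float(1.0 - np.sum((y - pred) ** 2) / np.sum((y - np.mean(y)) ** 2))

def random_search(X_tr, y_tr, X_val, y_val, default_alpha, n_iter, rng):
    # Start from the untuned setting so the result is never worse on validation.
    best_alpha = default_alpha
    best_mse = mse(y_val, X_val @ ridge_fit(X_tr, y_tr, default_alpha))
    for _ in range(n_iter):
        alpha = 10.0 ** rng.uniform(-4, 3)  # log-uniform sample
        score = mse(y_val, X_val @ ridge_fit(X_tr, y_tr, alpha))
        if score < best_mse:
            best_alpha, best_mse = alpha, score
    return best_alpha, best_mse

# Synthetic regression data (assumption, not a PMLB benchmark).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=120)
X_tr, y_tr, X_val, y_val = X[:80], y[:80], X[80:], y[80:]

default_alpha = 100.0  # deliberately poor untuned value
base_pred = X_val @ ridge_fit(X_tr, y_tr, default_alpha)
best_alpha, tuned_mse = random_search(X_tr, y_tr, X_val, y_val, default_alpha, 50, rng)
tuned_pred = X_val @ ridge_fit(X_tr, y_tr, best_alpha)
print(mse(y_val, base_pred), tuned_mse, r2(y_val, base_pred), r2(y_val, tuned_pred))
```

The guarantee of non-worsening holds only on the data used for selection; on a separate test set the tuned configuration can score worse than the untuned one.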

ML Model | $\Theta_1 \cdot 10^4$ | $\Theta_2 \cdot 10^3$ | $R^2$ |
---|---|---|---|
LDA | 2.9790 | 3.1590 | 0.9983 |
QDA | 1.9208 | 3.1012 | 0.9989 |
Naive Bayes for Bernoulli models | 1.3440 | 3.3120 | 0.9986 |
Decision tree | 31.110 | 4.1250 | 0.9846 |
PCA | 3.1291 | 2.4174 | 0.9992 |
Logistic regression | 9.3590 | 2.3900 | 0.9789 |
Random forest | $-94.42 \cdot 10^4$ | $2.507 \cdot 10^8$ | 0.9279 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kalyuzhnaya, A.V.; Nikitin, N.O.; Hvatov, A.; Maslyaev, M.; Yachmenkov, M.; Boukhanovsky, A.
Towards Generative Design of Computationally Efficient Mathematical Models with Evolutionary Learning. *Entropy* **2021**, *23*, 28.
https://doi.org/10.3390/e23010028
