Correction published on 11 March 2024, see Computation 2024, 12(3), 56.
Article

A Regularized Real-Time Integrator for Data-Driven Control of Heating Channels

Chady Ghnatios, Victor Champaney, Angelo Pasquale and Francisco Chinesta
1 Department of Mechanical Engineering, Notre Dame University-Louaize, Zouk Mosbeh P.O. Box 72, Lebanon
2 PIMM Laboratory, Arts et Métiers Institute of Technology, 151 Boulevard de l’Hôpital, 75013 Paris, France
3 LAMPA Laboratory, Arts et Métiers Institute of Technology, 2 Boulevard du Ronceray, BP 93525, CEDEX 01, 49035 Angers, France
4 PIMM Laboratory, Arts et Métiers Institute of Technology, CNRS, Cnam, HESAM Université, 151 Boulevard de l’Hôpital, 75013 Paris, France
5 ESI Group, 3 bis Rue Saarien, 94528 Rungis, France
* Author to whom correspondence should be addressed.
Computation 2022, 10(10), 176; https://doi.org/10.3390/computation10100176
Submission received: 11 September 2022 / Revised: 26 September 2022 / Accepted: 27 September 2022 / Published: 5 October 2022 / Corrected: 11 March 2024
(This article belongs to the Section Computational Engineering)

Abstract

In many contexts of scientific computing and engineering science, phenomena are monitored over time and data are collected as time series. Plenty of algorithms have been proposed in the field of time-series data mining, many of them based on deep learning techniques. High-fidelity simulations of complex scenarios are computationally expensive, whereas real-time monitoring and control can be achieved efficiently by means of artificial intelligence. In this work we build accurate data-driven models of a two-phase transient flow in a heated channel, as usually encountered in heat exchangers. The proposed methods combine several artificial neural network architectures, involving standard and transposed deep convolutions. In particular, a very accurate real-time integrator of the system has been developed.

1. Introduction

Most phenomena in engineering are described by sequential data. In the context of data-driven simulation and control [1], time-series classification and forecasting therefore play a fundamental role. A huge variety of applications, such as financial mathematics, language and speech recognition, weather forecasting, and physical and chemical unsteady systems, is extensively discussed in the literature [2,3,4,5].
In the last decade, data-driven approximations of the Koopman operator have been successfully applied to build reduced models of complex dynamical systems, allowing their identification and control [6,7,8,9]. Several approaches are based on combining algorithms derived from the dynamic mode decomposition (DMD) with deep learning (DL) techniques, as discussed in [10,11] and references therein.
Many algorithms have been applied in artificial intelligence (AI) for the estimation of output time series from input time series. Among them are convolutional and recurrent neural networks (CNNs, RNNs) [12,13,14], time-delay neural networks (TDNNs) [15,16,17], nonlinear autoregressive neural networks (NARNETs) [18,19] and nonlinear autoregressive neural networks with exogenous inputs (NARXNETs) [20,21,22,23,24].
In this work, several surrogates based on standard and transposed convolutional neural networks (CNNs) [25,26,27] are applied to a thermohydraulic problem in the context of heat exchangers involving a change of phase [28,29,30]. The problem will be described in detail later. It basically consists of prescribing the input heat power u(t) injected into a two-phase flow and monitoring some quantities y(t) at the outlet of a pipeline in which the gas-liquid mixture is confined and flows.
Mathematically speaking, let us consider input and output time series, respectively u and y, tracking N_t time instants. The problem is the estimation of the nonlinear transfer function h such that y = h(u). Learning the function h allows the real-time integration of the system for a newly defined input, to assess its impact on the outputs of interest.
A first approach is based on exploiting a parametrization of the input function u(t) through a few values {U_1, U_2, ..., U_K}. In this case, the modeling framework is non-incremental, since the whole output time series is computed from the inputs {U_1, U_2, ..., U_K}, that is:
$$\boldsymbol{y} = h(U_1, U_2, \ldots, U_K). \tag{1}$$
The estimation of h will be done through the use of artificial neural networks. However, since the output consists of N_t evaluations while the input consists of K < N_t parameters, a transposed convolution (also known as fractionally-strided convolution) is employed to reach the right dimension at the output layer. As a matter of fact, transposed convolutions increase (upsample) the spatial dimensions of intermediate feature maps, reversing the downsampling operations performed by classical convolutions [25].
A second approach consists of considering an incremental modelling framework, in which the model (also known as the integrator) computes the next value of the output sequence from a given number of previous inputs and outputs. Such an approach has some similarities with the NARXNETs [18]. Denoting y_i = y(t_i), for i = 1, ..., N_t, we can write:
$$y_i = h(y_{i-1}, y_{i-2}, \ldots, y_{i-k}, u_i, u_{i-1}, \ldots, u_{i-k+1}), \tag{2}$$
where the parameter k ≥ 0 is a delay term, known as the process dead time. We can also denote U = {y_past, u}, y_past being the known outputs of y used to predict its future values. A neural-network architecture based on a convolution approach can then be defined. Two incremental models (for different values of the delay coefficient k) will be compared in terms of speed and accuracy.
The paper aims to build a data-driven model that is capable of reproducing the data with high fidelity, while possibly performing the integrator task over a large time frame. First of all, the model of the addressed problem and the simulation data structure are described in Section 2. Section 3 presents three different surrogates:
  • A non-incremental modelling framework is presented in Section 3.1, which predicts the whole time frame in a single shot.
  • Two incremental alternatives are presented in Section 3.2. These alternatives use different memory sizes to predict the future variation of the quantities of interest.
  • The incremental surrogate model with large memory is then used as an integrator to predict the variation over a long time series.
Finally, Section 4 addresses the conclusion summarizing the results and highlighting the main differences among the learned models.

2. Problem Statement

High-fidelity numerical simulations of a thermohydraulic problem are performed using the software CATHARE for nuclear safety analysis [31,32,33]. The simulation consists of a two-phase (steam-water) flow moving upward, against gravity, in a channel heated by hot walls, as sketched in Figure 1. The model input is the time-varying thermal power q(t) injected into the fluid through the walls, while the outputs are the time histories of several fluid properties at the channel outlet. The simulation covers the interval [0, 100] s, with a time step Δt = 0.1 s.
Using bold notation for vectors (sequences of 1000 elements corresponding to the time discretization), the quantities of interest are the steam fraction in the flowing system A, the liquid velocity V, the steam temperature T_G and the liquid temperature T_L. This work aims to produce a surrogate model h for inferring the outputs as a function of the input variable q:
$$\begin{bmatrix} \boldsymbol{A} \\ \boldsymbol{V} \\ \boldsymbol{T}_G \\ \boldsymbol{T}_L \end{bmatrix} = h(\boldsymbol{q}). \tag{3}$$
The surrogate model is built in a non-intrusive manner, by defining a Design of Experiments (DoE) on the power density and combining the outputs of the corresponding offline simulations.
A user-defined power control law allows assigning the value of the provided power density at specific time instants. Therefore, the input function q(t) is defined through a few values Q_i = q(t_i), normalized by the maximum power density Q_max = 8.4 × 10⁸ W/m³, and then linearly interpolating in between. In particular, the time instants (in seconds) {0, 1, 25, 50, 75, 100} are considered for the power changes. The corresponding values Q_i, for i = 0, ..., 5, define a parametrization of the input function q(t).
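For illustration, the piecewise-linear law q(t) can be reconstructed from the six control values by simple interpolation. The sketch below assumes NumPy and uses the control values of run 9 of the DoE (Table 1); variable names are illustrative.

```python
import numpy as np

t_ctrl = np.array([0.0, 1.0, 25.0, 50.0, 75.0, 100.0])  # control instants (s)
Q_ctrl = np.array([0.0, 1.0, 0.19, 0.07, 0.90, 1.0])     # run 9 of the DoE, fractions of Q_max
Q_max = 8.4e8                                            # maximum power density (W/m^3)

t = np.linspace(0.0, 100.0, 1000)          # time grid matching the 1000-sample sequences
q = Q_max * np.interp(t, t_ctrl, Q_ctrl)   # linear interpolation between the control points
```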
A DoE of 23 data points is established using different combinations of the values Q_i, as listed in Table 1. For the rest of this work, the last 4 datasets are kept as the testing set and are therefore not fed into the training loop of the built surrogate model. Note that some of the values are fixed (Q_0 = 0, Q_1 = Q_5 = 1); thus, the model input parameters are only Q_2, Q_3 and Q_4. Figure 2 shows an example of the heat power density q(t) defined from values Q_i.
A sample set of results, for the datasets 1 to 4 of Table 1, is shown in Figure 3 for illustration purposes.

3. Surrogate Modeling of the Data

Several models are tested; in what follows we discuss the results of three of them. Before any training, the variables are normalized by subtracting their corresponding means and dividing by the variation amplitude (max − min).
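A minimal sketch of this normalization, assuming each variable is stored as a NumPy array, could read as follows.

```python
import numpy as np

def normalize(x):
    """Center by the mean and scale by the max-min amplitude."""
    return (x - x.mean()) / (x.max() - x.min())

def denormalize(x_norm, x_ref):
    """Map a normalized signal back to the physical range of x_ref."""
    return x_norm * (x_ref.max() - x_ref.min()) + x_ref.mean()
```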

3.1. A Non-Incremental Approach

In this section, we present a non-incremental modelling framework, aiming to reproduce the whole measured 1000-step time sequence while considering only the three variables Q_2, Q_3 and Q_4. Therefore, the first model H_1 aims to create an output of the form:
$$\begin{bmatrix} \hat{\boldsymbol{A}} \\ \hat{\boldsymbol{V}} \\ \hat{\boldsymbol{T}}_G \\ \hat{\boldsymbol{T}}_L \end{bmatrix} = H_1(Q_2, Q_3, Q_4), \tag{4}$$
where the hat denotes the approximation of the data produced by the surrogate model. Since a non-incremental approach is used here, each dataset is nothing but a single data point, for which a single output is evaluated. We use a neural network based on a transposed convolution approach [25,26,27]. The network starts with an input of 3 values, namely (Q_2, Q_3, Q_4), and propagates the variation of the input variables into the required output. The architecture of the network is given in Table 2 and schematically represented through the diagram in Figure 4. The first layer transforms the input into a dimension suitable for beginning a transposed convolution. The second layer reshapes the result into a simulated 2D picture of dimension 8 × 4 with 240 channels. The subsequent layers reduce the number of filters while increasing the first dimension of the tensor by a factor of 5 at each layer, reaching a dimension of 1000 × 4 at layer 6, the exact dimension of the required 4-variable output.
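The architecture of Table 2 could be assembled as in the following sketch; the paper does not state which deep learning framework was used, so TensorFlow/Keras is an assumption here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

H1 = models.Sequential([
    layers.Input(shape=(3,)),                        # (Q_2, Q_3, Q_4)
    layers.Dense(7680, activation="tanh"),           # layer 1
    layers.Reshape((8, 4, 240)),                     # layer 2: 8 x 4 "image" with 240 channels
    layers.Conv2DTranspose(120, (5, 1), strides=(5, 1), padding="same", activation="selu"),  # 40 x 4
    layers.Conv2DTranspose(60, (5, 1), strides=(5, 1), padding="same", activation="selu"),   # 200 x 4
    layers.Conv2DTranspose(30, (5, 1), strides=(5, 1), padding="same", activation="selu"),   # 1000 x 4
    layers.Conv2DTranspose(1, (1, 1), strides=(1, 1), padding="same", activation="linear"),  # 1000 x 4 x 1
])
H1.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4), loss="mse")
```

The three strided transposed convolutions multiply the first dimension by 5 each time (8 → 40 → 200 → 1000), which is how the 1000 × 4 output of Table 2 is reached.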
The neural network is trained over 5000 epochs, with a batch size of 10 datasets, knowing that only 19 datasets are available for training. The selected loss function is the mean squared error, without regularization, and the optimization algorithm is Adam [35] with an initial learning rate α = 10⁻⁴. The results for a selected training dataset are shown in Figure 5, whereas the results for the testing datasets 20 and 22 are illustrated in Figure 6 and Figure 7, respectively.
The relative errors generated by the non-incremental approach are reported in Table 3. Since the output A contains several zero values, the relative error E_A is normalized with respect to the maximum value of A:
$$E_{\boldsymbol{A}} = \frac{\left\| \boldsymbol{A} - \hat{\boldsymbol{A}} \right\|_2}{\max(\boldsymbol{A})}, \tag{5}$$
with A being the measured data and Â the surrogate model output. The normalization is standard for any other variable v:
$$E_{\boldsymbol{v}} = \left\| \left( \boldsymbol{v} - \hat{\boldsymbol{v}} \right) \oslash \boldsymbol{v} \right\|_2, \tag{6}$$
where, similarly, v denotes the measured data, v̂ the surrogate model output and the symbol ⊘ the Hadamard (element-wise) division operator. The results are excellent on both the training and testing sets.
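One possible reading of Equations (5) and (6), written for NumPy arrays, is sketched below; the exact normalization used by the authors may differ in minor details.

```python
import numpy as np

def err_A(a, a_hat):
    """Equation (5): error normalized by the maximum of the reference signal."""
    return np.linalg.norm(a - a_hat) / np.max(a)

def err_v(v, v_hat):
    """Equation (6): element-wise (Hadamard) division by the reference signal."""
    return np.linalg.norm((v - v_hat) / v)
```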

3.2. Incremental Approaches

Using an incremental approach consists of learning the integrator that updates the state of the system at each time step. The model is built by first decomposing the datasets into sequences of length t_s. Increasing the sequence length t_s, i.e., considering a longer history of the variation, generally improves the results, at the expense of the computation time required for each prediction. Two incremental models are tested: the first one uses t_s = 4 and is an excellent predictor of the next output of a given sequence, but fails as a dynamical integrator, as will be discussed later; the second one uses t_s = 20, with a regularization that improves stability, and performs correctly as an integrator.
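The decomposition into length-t_s sequences amounts to sliding a window over each dataset; a hedged NumPy sketch, with illustrative names and shapes, is given below.

```python
import numpy as np

def make_windows(a, v, tg, tl, q, ts):
    """Build (X, Y) pairs: X holds the last ts values of (A, V, TG, TL, q),
    Y holds the four outputs at the next time step."""
    X, Y = [], []
    for i in range(ts, len(a)):
        window = np.stack([a[i-ts:i], v[i-ts:i], tg[i-ts:i], tl[i-ts:i],
                           q[i-ts+1:i+1]], axis=1)   # shape (ts, 5)
        X.append(window[..., None])                  # add channel axis -> (ts, 5, 1)
        Y.append([a[i], v[i], tg[i], tl[i]])
    return np.array(X), np.array(Y)

# With ts = 4, each 1000-step dataset yields 996 samples; with ts = 20 it yields 980,
# consistent with the sizes quoted in the text.
```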

3.2.1. A Fast Predictor

In this section we explore the possibility of using a fast algorithm based on convolution layers to predict the next outcome of the sequence at hand. The model takes as input the variables (A, V, T_G, T_L) at time steps i−1, i−2, i−3 and i−4, as well as the input q at time steps i, i−1, i−2 and i−3, and predicts the quantities of interest at time step i. For instance, we can write the second trained surrogate model H_2 as follows:
$$\begin{bmatrix} \hat{A}_i \\ \hat{V}_i \\ \hat{T}_{G,i} \\ \hat{T}_{L,i} \end{bmatrix} = H_2\left( \begin{bmatrix} A_{i-1} \\ A_{i-2} \\ A_{i-3} \\ A_{i-4} \end{bmatrix}, \begin{bmatrix} V_{i-1} \\ V_{i-2} \\ V_{i-3} \\ V_{i-4} \end{bmatrix}, \begin{bmatrix} T_{G,i-1} \\ T_{G,i-2} \\ T_{G,i-3} \\ T_{G,i-4} \end{bmatrix}, \begin{bmatrix} T_{L,i-1} \\ T_{L,i-2} \\ T_{L,i-3} \\ T_{L,i-4} \end{bmatrix}, \begin{bmatrix} q_i \\ q_{i-1} \\ q_{i-2} \\ q_{i-3} \end{bmatrix} \right). \tag{7}$$
The network considered to approximate the surrogate model H_2 is summarized in Table 4 and schematized in Figure 8. The input of the network is a 3D tensor of dimension (4 × 5 × 1), for 4 time steps and 5 variables. The selected optimization algorithm is Adam, with a mean squared error loss. After reformulation into sequences of 4 steps, the available data amount to 19 × 996 samples for the training set and 4 × 996 for the testing set. A batch size of 996 is selected, with 1500 training epochs. The initial learning rate was set to α = 10⁻⁴.
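Under the same framework assumption as for H_1 (TensorFlow/Keras, not stated in the paper), the layer stack of Table 4 could be written as follows.

```python
from tensorflow.keras import layers, models

H2 = models.Sequential([
    layers.Input(shape=(4, 5, 1)),   # 4 time steps x 5 variables
    layers.Conv2D(60, (2, 1), strides=(2, 1), padding="same", activation="selu"),
    layers.Conv2D(120, (2, 1), strides=(1, 1), padding="same", activation="selu"),
    layers.Flatten(),
    layers.Dense(250, activation="selu"),
    layers.Dense(125, activation="selu"),
    layers.Dense(70, activation="selu"),
    layers.Dense(25, activation="tanh"),
    layers.Dense(4, activation="linear"),   # (A, V, T_G, T_L) at step i
])
H2.compile(optimizer="adam", loss="mse")
```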
The results for a selected training dataset are shown in Figure 9, whereas the results for the selected testing datasets are illustrated for datasets 20 and 22 in Figure 10 and Figure 11, respectively. The relative errors are shown in Table 5, computed using Equations (5) and (6), as described in Section 3.1.
Model H_2 shows an excellent ability to reproduce the datasets when its input is always taken from the physical dataset values and not from the output of the predictor H_2 itself. The errors increase dramatically when the input of H_2 is taken from its own previous outputs, making it impossible to use it as an integrator of the dynamical system.
On the other hand, the performance of H 2 is comparable to the performance of H 1 , when comparing the relative errors of both models. From a comparison of the resulting graphical illustrations, H 2 seems to perform better on the testing sets.

3.2.2. A Larger Sequence Time Integrator

In this section we consider again the rationale followed in Section 3.2.1, with an algorithm based on convolution layers, to predict the next outcome of the sequence at hand. The model will take as an input the variables available at the last 20 time steps. According to the previously introduced notation, t s = 20 . The built model H 3 will predict the quantities of interest at the subsequent time step. For instance, we can write the trained surrogate model H 3 as shown below:
$$\begin{bmatrix} \hat{A}_i \\ \hat{V}_i \\ \hat{T}_{G,i} \\ \hat{T}_{L,i} \end{bmatrix} = H_3\left( \begin{bmatrix} A_{i-1} \\ \vdots \\ A_{i-20} \end{bmatrix}, \begin{bmatrix} V_{i-1} \\ \vdots \\ V_{i-20} \end{bmatrix}, \begin{bmatrix} T_{G,i-1} \\ \vdots \\ T_{G,i-20} \end{bmatrix}, \begin{bmatrix} T_{L,i-1} \\ \vdots \\ T_{L,i-20} \end{bmatrix}, \begin{bmatrix} q_i \\ \vdots \\ q_{i-19} \end{bmatrix} \right). \tag{8}$$
The surrogate model H_3 is built using the network illustrated in Table 6 and represented through the diagram in Figure 12. The input of the network is a 3D tensor of dimension (20 × 5 × 1), for 20 time steps and 5 variables. The selected optimization algorithm is Adam, with a mean squared error loss modified by adding a regularization similar to the DMD regularization [36,37], so that integration in time can be performed without the error increasing significantly. The loss function J used in this section reads:
$$J = \sum_{j=1}^{19} \left( \left\| \boldsymbol{A}_j - \hat{\boldsymbol{A}}_j \right\| + \left\| \boldsymbol{V}_j - \hat{\boldsymbol{V}}_j \right\| + \left\| \boldsymbol{T}_{G,j} - \hat{\boldsymbol{T}}_{G,j} \right\| + \left\| \boldsymbol{T}_{L,j} - \hat{\boldsymbol{T}}_{L,j} \right\| \right) + \frac{\lambda_w}{2m} \sum_{j=1}^{m} \theta_j^2 + \frac{\lambda_b}{2p} \sum_{j=1}^{p} \kappa_j^2, \tag{9}$$
where θ_j are all the weights of the last layer of the network described in Table 6 (the linear layer), and κ_j its biases; λ_w and λ_b are regularization parameters. After reformulation into sequences of 20 steps, the available data amount to 19 × 980 samples for the training set and 4 × 980 for the testing set. A batch size of 980 is selected, with 1500 training epochs. The initial learning rate was set to α = 10⁻⁴, and the input shape is therefore 20 × 5 × 1. The results are shown for λ_w = λ_b = 10⁻⁶.
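In Keras terms, the penalties of Equation (9) on the last layer's weights and biases could be imposed through kernel and bias regularizers, as in the sketch below; this is an assumption about the implementation, with the 1/(2m) and 1/(2p) factors folded into the coefficients.

```python
from tensorflow.keras import layers, regularizers

lam_w = lam_b = 1e-6
output_layer = layers.Dense(
    4, activation="linear",
    kernel_regularizer=regularizers.l2(lam_w),  # penalizes the weights theta_j
    bias_regularizer=regularizers.l2(lam_b),    # penalizes the biases kappa_j
)
```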
The results using physical inputs for the training dataset 10 are shown in Figure 13, whereas the results for the testing datasets 20 and 22 are shown in Figure 14 and Figure 15, respectively. The relative errors are reported in Table 7, computed using Equations (5) and (6), as described in Section 3.1.

3.2.3. Integrator Results

The neural network described in Table 6, complemented with the regularized loss given in Equation (9), can be used to integrate in time the system behavior. For instance, we can write
$$\begin{bmatrix} \hat{A}_i \\ \hat{V}_i \\ \hat{T}_{G,i} \\ \hat{T}_{L,i} \end{bmatrix} = H_3\left( \begin{bmatrix} \hat{A}_{i-1} \\ \vdots \\ \hat{A}_{i-20} \end{bmatrix}, \begin{bmatrix} \hat{V}_{i-1} \\ \vdots \\ \hat{V}_{i-20} \end{bmatrix}, \begin{bmatrix} \hat{T}_{G,i-1} \\ \vdots \\ \hat{T}_{G,i-20} \end{bmatrix}, \begin{bmatrix} \hat{T}_{L,i-1} \\ \vdots \\ \hat{T}_{L,i-20} \end{bmatrix}, \begin{bmatrix} q_i \\ \vdots \\ q_{i-19} \end{bmatrix} \right). \tag{10}$$
The integrator is stable enough to march through all the time steps. Figure 16 shows the integrator results on a training case, whereas Figure 17 and Figure 18 show the integrator results on two testing cases. The integrator errors are given in Table 8.
It can be clearly noticed that the integrator performs extremely well on the selected datasets.
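The integration loop of Equation (10), in which H_3 is fed back its own predictions after being initialized with the first t_s known states, can be sketched as follows (illustrative code; function and variable names are assumptions).

```python
import numpy as np

def integrate(model, a0, v0, tg0, tl0, q, ts=20):
    """a0, v0, tg0, tl0: the first ts known values of each output; q: full input law."""
    a, v, tg, tl = list(a0), list(v0), list(tg0), list(tl0)
    for i in range(ts, len(q)):
        window = np.stack([a[i-ts:i], v[i-ts:i], tg[i-ts:i], tl[i-ts:i],
                           q[i-ts+1:i+1]], axis=1)[None, ..., None]  # (1, ts, 5, 1)
        ai, vi, tgi, tli = model.predict(window, verbose=0)[0]       # feed predictions back
        a.append(ai); v.append(vi); tg.append(tgi); tl.append(tli)
    return np.array(a), np.array(v), np.array(tg), np.array(tl)
```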

4. Conclusions

In this work several prediction models for output time series from input time series have been compared. Without loss of generality, a coupled problem in thermohydraulics has been considered, but the procedure generalizes to many other phenomena and fields. Moreover, instead of simulation-based data, experimental data can be used.
The models are based on different deep neural network architectures. If a parametrization of the input function is available, a non-incremental approach involving transposed convolution neural networks allows the prediction of the whole sequence, as described and discussed in Section 3.1. The non-incremental approach is based on transposed convolution layers, which take a small number of inputs and increase the size of the output until the desired dimension is reached. These layers appear as a suitable approach in this situation, where only a few inputs are required to build the whole sequence of time measurements.
The incremental models (dynamical system integrators) presented in Section 3.2 are based on regressing the variable on its own lagged (i.e., past) values together with those of the input, through standard deep convolutional neural networks. Therefore, the prediction occurs step by step. The difference between the two incremental models lies in the delay (or lag) factor t_s. A small value of t_s allows really fast predictions. However, the model can be unstable when its input is taken from the previous model outputs (this occurs, for instance, when t_s = 4). In such a case, the surrogate cannot be used as an integrator of the system. Results are substantially improved for t_s = 20, where a regularization term was also introduced in the loss function. The incremental approach is based on convolution layers. In fact, these layers take a large input and reduce its size to extract relevant information. The filters are designed with a large dimension along the time axis, to allow easier extraction of the inputs' time variation.
The non-incremental approach is clearly faster at evaluating the output of the solution; however, the integrator clearly outperforms the non-incremental approach. Moreover, the integrator can be used on novel datasets, and potentially outside of the simulated time frame. The initialization requires only t_s simulated time steps, and the remaining dynamics is predicted in real time, allowing huge computational gains. The non-incremental approach, on the contrary, cannot be used on a different time frame, outside the simulated region, as its output is hard-coded to predict exactly N_t = 1000 time steps.

Author Contributions

Conceptualization, C.G. and F.C.; methodology, C.G. and A.P.; software, C.G.; formal analysis, C.G.; investigation, V.C. and A.P.; data curation, V.C. and A.P.; writing—original draft preparation, C.G.; writing—review and editing, V.C. and A.P.; supervision, F.C.; project administration, F.C.; funding acquisition, F.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are the property of the project’s consortium and cannot be made available.

Acknowledgments

The authors greatly acknowledge the support of Anoop Ebey Thomas from ESI Group and all the partners of the PSPC Réacteur Numérique project.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Markovsky, I.; Rapisarda, P. Data-driven simulation and control. Int. J. Control. 2008, 81, 1946–1959. [Google Scholar] [CrossRef]
  2. Box, G.; Jenkins, G.M. Time Series Analysis: Forecasting and Control; Holden-Day: San Francisco, CA, USA, 1976. [Google Scholar]
  3. Pollock, D.; Green, R.; Nguyen, T. A Handbook of Time-Series Analysis, Signal Processing and Dynamics; Signal Processing and Its Applications; Academic Press: Cambridge, MA, USA, 1999. [Google Scholar]
  4. Kantz, H.; Schreiber, T. Nonlinear Time Series Analysis; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
  5. Palit, A.K.; Popovic, D. Computational Intelligence in Time Series Forecasting: Theory and Engineering Applications (Advances in Industrial Control); Springer: Berlin/Heidelberg, Germany, 2005. [Google Scholar]
  6. Lian, Y.; Wang, R.; Jones, C.N. Koopman based data-driven predictive control. arXiv 2021, arXiv:2102.05122. [Google Scholar]
  7. Klus, S.; Nüske, F.; Peitz, S.; Niemann, J.H.; Clementi, C.; Schütte, C. Data-driven approximation of the Koopman generator: Model reduction, system identification, and control. Phys. D Nonlinear Phenom. 2020, 406, 132416. [Google Scholar] [CrossRef]
  8. Otto, S.; Rowley, C. Koopman Operators for Estimation and Control of Dynamical Systems. Annu. Rev. Control. Robot. Auton. Syst. 2021, 4, 59–87. [Google Scholar] [CrossRef]
  9. Maksakov, A.; Palis, S. Koopman-based data-driven control for continuous fluidized bed spray granulation with screen-mill-cycle. J. Process. Control. 2021, 103, 48–54. [Google Scholar] [CrossRef]
  10. Williams, M.O.; Kevrekidis, I.G.; Rowley, C.W. A Data–Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition. J. Nonlinear Sci. 2015, 25, 1307–1346. [Google Scholar] [CrossRef]
  11. Zhang, C.; Zuazua, E. A quantitative analysis of Koopman operator methods for system identification and predictions. 2021; in press. [Google Scholar]
  12. Tijskens, A.; Roels, S.; Janssen, H. Neural networks for metamodelling the hygrothermal behaviour of building components. Build. Environ. 2019, 162, 106282. [Google Scholar] [CrossRef]
  13. Connor, J.; Martin, R.; Atlas, L. Recurrent neural networks and robust time series prediction. IEEE Trans. Neural Netw. 1994, 5, 240–254. [Google Scholar] [CrossRef] [PubMed]
  14. Mandic, D.; Mandic, D.; Chambers, J. Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability; Adaptive and Cognitive Dynamic Systems: Signal Processing; Wiley: Hoboken, NJ, USA, 2001. [Google Scholar]
  15. Waibel, A.; Hanazawa, T.; Hinton, G.; Shikano, K.; Lang, K. Phoneme recognition using time-delay neural networks. IEEE Trans. Acoust. Speech, Signal Process. 1989, 37, 328–339. [Google Scholar] [CrossRef]
  16. Peddinti, V.; Povey, D.; Khudanpur, S. A time delay neural network architecture for efficient modeling of long temporal contexts. Proc. Interspeech 2015, 3214–3218. [Google Scholar] [CrossRef]
  17. Wöhler, C.; Anlauf, J. Real-time object recognition on image sequences with the adaptable time delay neural network algorithm—Applications for autonomous vehicles. Image Vis. Comput. 2001, 19, 593–618. [Google Scholar] [CrossRef]
  18. Pinheiro, E.; Bosco, J.; Lima, L.; Geraldo, F. Nonlinear autoregressive neural networks for forecasting wind speed time series. Int. J. Dev. Res. 2021, 10, 40336–40343. [Google Scholar] [CrossRef]
  19. Ruiz, L.; Cuéllar, M.; Calvo-Flores, M.; Pegalajar Jiménez, M.d.C. An Application of Non-Linear Autoregressive Neural Networks to Predict Energy Consumption in Public Buildings. Energies 2016, 9, 684. [Google Scholar] [CrossRef]
  20. Xie, H.; Tang, H.; Liao, Y.H. Time series prediction based on NARX neural networks: An advanced approach. In Proceedings of the 2009 International Conference on Machine Learning and Cybernetics, Hebei, China, 12–15 July 2009; Volume 3, pp. 1275–1279. [Google Scholar] [CrossRef]
  21. Diaconescu, E. The use of NARX neural networks to predict chaotic time series. WSEAS Trans. Comput. Res. 2008, 3, 182–191. [Google Scholar]
  22. Guzman, S.M.; Paz, J.O.; Tagert, M.L.M. The Use of NARX Neural Networks to Forecast Daily Groundwater Levels. Water Resour. Manag. 2017, 31, 1591–1603. [Google Scholar] [CrossRef]
  23. Menezes, J.M.P.; Barreto, G.A. Long-term time series prediction with the NARX network: An empirical evaluation. Neurocomputing 2008, 71, 3335–3343. [Google Scholar] [CrossRef]
  24. Boussaada, Z.; Curea, O.; Remaci, A.; Camblong, H.; Mrabet Bellaaj, N. A Nonlinear Autoregressive Exogenous (NARX) Neural Network Model for the Prediction of the Daily Direct Solar Radiation. Energies 2018, 11, 620. [Google Scholar] [CrossRef]
  25. Dumoulin, V.; Visin, F. A guide to convolution arithmetic for deep learning. arXiv 2016, arXiv:1603.07285. [Google Scholar]
  26. Zeiler, M.D.; Fergus, R. Visualizing and Understanding Convolutional Networks. arXiv 2013, arXiv:1311.2901. [Google Scholar]
  27. Zeiler, M.D.; Taylor, G.W.; Fergus, R. Adaptive deconvolutional networks for mid and high level feature learning. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2018–2025. [Google Scholar] [CrossRef]
  28. Bernard, M.; Dellacherie, S.; Faccanoni, G.; Grec, B.; Penel, Y. Study of a low Mach nuclear core model for two-phase flows with phase transition I: Stiffened gas law. ESAIM Math. Model. Numer. Anal. 2014, 48, 1639–1679. [Google Scholar] [CrossRef]
  29. Brown, J.; Maldonado, G. History of PWR and BWR development. Encycl. Nucl. Energy 2021, 157–171. [Google Scholar] [CrossRef]
  30. Brown, J.A. Pressurized Water Reactors. In Encyclopedia of Nuclear Energy; Greenspan, E., Ed.; Elsevier: Oxford, UK, 2021; pp. 196–213. [Google Scholar] [CrossRef]
  31. Liu, Z.T.; Qin, B.K.; Xie, H.; Wang, B.H. Status and trends of thermal-hydraulic system codes for nuclear power plants with pressurized water reactors. Yuanzineng Kexue Jishu/At. Energy Sci. Technol. 2009, 43, 966–972. [Google Scholar]
  32. Zhang, K. The multiscale thermal-hydraulic simulation for nuclear reactors: A classification of the coupling approaches and a review of the coupled codes. Int. J. Energy Res. 2020, 44, 3295–3315. [Google Scholar] [CrossRef]
  33. Emonot, P.; Souyri, A.; Gandrille, J.; Barré, F. CATHARE-3: A new system code for thermal-hydraulics in the context of the NEPTUNE project. Nucl. Eng. Des. 2011, 241, 4476–4481. [Google Scholar]
  34. Klambauer, G.; Unterthiner, T.; Mayr, A. Self-Normalizing Neural Networks. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 1–102. [Google Scholar]
  35. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
  36. Sancarlos, A.; Cameron, M.; Le Peuvedic, J.M.; Groulier, J.; Duval, J.L.; Cueto, E.; Chinesta, F. Learning stable reduced-order models for hybrid twins. Data-Centric Eng. 2021, 2, e10. [Google Scholar] [CrossRef]
  37. Erichson, N.B.; Mathelin, L.; Kutz, J.N.; Brunton, S.L. Randomized dynamic mode decomposition. SIAM J. Appl. Dyn. Syst. 2019, 18, 1867–1891. [Google Scholar]
Figure 1. Fluid channel with wall surface heating, controlled by the heat power density law q(t). The green point represents the location of the measurement node at the outlet.
Figure 2. Example of provided heat power density, as a fraction of Q_max = 8.4 × 10⁸ W/m³ (this corresponds to run 9 of the DoE).
Figure 3. Sample results for the datasets 1 to 4, as listed in Table 1.
Figure 4. Graphical diagram of the deep convolution neural network used for the fitting of H_1. All convolutions employ "same" padding (also called "half" padding). If the input and output of a transposed convolution have shapes I_1 × I_2 × I_3 and O_1 × O_2 × O_3, respectively, then the dimensions are obtained as O_i = s_i I_i, where s_i denotes the stride along dimension i, for i = 1, 2. The dimension O_3 is determined by the number of filters.
Figure 5. Sample results of the H_1 surrogate model for training dataset 10, as listed in Table 1.
Figure 6. Sample results of the H_1 surrogate model for testing dataset 20, as listed in Table 1.
Figure 7. Sample results of the H_1 surrogate model for testing dataset 22, as listed in Table 1. Note the representation scale in panel (c).
Figure 8. Graphical diagram of the deep convolution neural network used for the fitting of H_2. All convolutions employ "same" padding (also called "half" padding). If the input and output of a convolution have shapes I_1 × I_2 × I_3 and O_1 × O_2 × O_3, respectively, then the dimensions are obtained as O_i = I_i / s_i, where s_i denotes the stride along dimension i, for i = 1, 2. The dimension O_3 is determined by the number of filters.
Figure 9. Sample results of the H_2 surrogate model for training dataset 10, as listed in Table 1. (a) A; (b) V; (c) T_G; (d) T_L.
Figure 10. Sample results of the H_2 surrogate model for testing dataset 20, as listed in Table 1. (a) A; (b) V; (c) T_G; (d) T_L.
Figure 11. Sample results of the H_2 surrogate model for testing dataset 22, as listed in Table 1.
Figure 12. Graphical diagram of the deep convolution neural network used for the fitting of H_3. All convolutions employ "same" padding (also called "half" padding). If the input and output of a convolution have shapes I_1 × I_2 × I_3 and O_1 × O_2 × O_3, respectively, then the dimensions are obtained as O_i = I_i / s_i, where s_i denotes the stride along dimension i, for i = 1, 2. The dimension O_3 is determined by the number of filters.
Figure 13. Sample results of the H_3 surrogate model for training dataset 10, as listed in Table 1.
Figure 14. Sample results of the H_3 surrogate model for testing dataset 20, as listed in Table 1.
Figure 15. Sample results of the H_3 surrogate model for testing dataset 22, as listed in Table 1.
Figure 16. Sample integrator results of the H_3 surrogate model when integrating training dataset 10, as listed in Table 1.
Figure 17. Sample integrator results of the H_3 surrogate model when integrating testing dataset 20, as listed in Table 1.
Figure 18. Sample integrator results of the H_3 surrogate model when integrating testing dataset 22, as listed in Table 1.
Table 1. Available plan of experiments (DoE) and sequences of control point values, as a fraction of Q_max = 8.4 × 10⁸ W/m³.

Sequence Number | Q_0 | Q_1 | Q_2 | Q_3 | Q_4 | Q_5
1  | 0 | 1 | 0    | 0    | 0    | 1
2  | 0 | 1 | 1    | 0    | 0    | 1
3  | 0 | 1 | 0    | 1    | 0    | 1
4  | 0 | 1 | 0    | 0    | 1    | 1
5  | 0 | 1 | 0    | 1    | 1    | 1
6  | 0 | 1 | 1    | 0    | 1    | 1
7  | 0 | 1 | 1    | 1    | 0    | 1
8  | 0 | 1 | 1    | 1    | 1    | 1
9  | 0 | 1 | 0.19 | 0.07 | 0.9  | 1
10 | 0 | 1 | 0.06 | 0.8  | 0.28 | 1
11 | 0 | 1 | 0.91 | 0.76 | 0.58 | 1
12 | 0 | 1 | 0.69 | 0.19 | 1    | 1
13 | 0 | 1 | 0.12 | 0.57 | 0.38 | 1
14 | 0 | 1 | 0.2  | 0.67 | 0.02 | 1
15 | 0 | 1 | 0.99 | 0.92 | 0.47 | 1
16 | 0 | 1 | 0.55 | 0.94 | 0.74 | 1
17 | 0 | 1 | 0.47 | 0.24 | 0.83 | 1
18 | 0 | 1 | 0.76 | 0.36 | 0.18 | 1
19 | 0 | 1 | 0.82 | 0.49 | 0.4  | 1
20 | 0 | 1 | 0.66 | 0.66 | 0.72 | 1
21 | 0 | 1 | 0.31 | 0.43 | 0.6  | 1
22 | 0 | 1 | 0.37 | 0.31 | 0.11 | 1
23 | 0 | 1 | 0.44 | 0.05 | 0.2  | 1
Table 2. Structure of the deep convolution transpose neural network used for the fitting of H_1. tanh stands for the hyperbolic tangent activation function, selu for the scaled exponential linear unit [34], and linear for linear activation (i.e., no activation function). CT stands for "convolution transpose".

Layer | Shape | Activation
1 | Fully connected dense layer with 7680 neurons | tanh
2 | Reshape layer into a 3D tensor of shape (8 × 4 × 240) | no activation
3 | 2D CT, 120 filters, kernel (5 × 1), strides (5 × 1), "same" padding | selu
4 | 2D CT, 60 filters, kernel (5 × 1), strides (5 × 1), "same" padding | selu
5 | 2D CT, 30 filters, kernel (5 × 1), strides (5 × 1), "same" padding | selu
6 | 2D CT, 1 filter, kernel (1 × 1), strides (1 × 1), "same" padding | linear
Table 3. Mean relative errors of the trained non-incremental model H_1.

Model H_1 | E_A | E_V | E_TG | E_TL
Training sets | 0.058% | 0.735% | 1.7 × 10⁻⁵% | 0.037%
Testing sets | 0.14% | 1.61% | 9.0 × 10⁻⁵% | 0.265%
Table 4. Structure of the deep convolution neural network used for the fitting of H_2. tanh stands for the hyperbolic tangent activation function, selu for the scaled exponential linear unit [34], and linear for linear activation (i.e., no activation function).

Layer | Shape | Activation
1 | 2D convolution, 60 filters, kernel (2 × 1), strides (2 × 1), "same" padding | selu
2 | 2D convolution, 120 filters, kernel (2 × 1), strides (1 × 1), "same" padding | selu
3 | Flatten layer, reshapes all inputs into a single vector | no activation
4 | Fully connected dense layer with 250 neurons | selu
5 | Fully connected dense layer with 125 neurons | selu
6 | Fully connected dense layer with 70 neurons | selu
7 | Fully connected dense layer with 25 neurons | tanh
8 | Fully connected dense layer with 4 neurons | linear
Table 5. Mean relative errors of the trained incremental model H_2 with t_s = 4. All convolutions employ "same" padding, meaning that the input is half padded and that the filter is applied to all the elements of the input.

Model H_2 | E_A | E_V | E_TG | E_TL
Training sets | 0.11% | 0.81% | 1.91 × 10⁻⁵% | 0.25%
Testing sets | 0.039% | 0.898% | 1.94 × 10⁻⁵% | 0.284%
Table 6. Structure of the deep convolution neural network used for the fitting of H_3. tanh stands for the hyperbolic tangent activation function, selu for the scaled exponential linear unit [34], and linear for linear activation (i.e., no activation function).

Layer | Shape | Activation
1 | 2D convolution, 30 filters, kernel (4 × 5), strides (2 × 1), "same" padding | selu
2 | 2D convolution, 60 filters, kernel (2 × 1), strides (2 × 1), "same" padding | selu
3 | 2D convolution, 120 filters, kernel (5 × 1), strides (1 × 1), "same" padding | selu
4 | Flatten layer, reshapes all inputs into a single vector | no activation
5 | Fully connected dense layer with 125 neurons | selu
6 | Fully connected dense layer with 70 neurons | selu
7 | Fully connected dense layer with 25 neurons | selu
8 | Fully connected dense layer with 4 neurons | linear
Table 7. Mean relative errors of the trained incremental model H_3 with t_s = 20.

Model H_3 | E_A | E_V | E_TG | E_TL
Training sets | 0.058% | 1.21% | 9.18 × 10⁻⁶% | 0.039%
Testing sets | 0.012% | 1.4% | 9.11 × 10⁻⁶% | 0.04%
Table 8. Mean relative errors of the trained incremental model H_3 with t_s = 20, when performing an integration operation as per Equation (10).

Model H_3 | E_A | E_V | E_TG | E_TL
Training sets | 1.7% | 1.54% | 2.07 × 10⁻⁵% | 0.091%
Testing sets | 0.44% | 1.05% | 2.9 × 10⁻⁵% | 0.045%
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
