First-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations: Mathematical Framework and Illustrative Application to the Nordheim–Fuchs Reactor Safety Model
Abstract
1. Introduction
2. Neural Ordinary Differential Equations (NODE): Basic Properties and Uses
- (i) The time-like independent variable parameterizes the dynamics of the hidden/latent neuron units; its initial value (which can be considered to be an initial measurement time) and its stopping value (which can be considered to be the next measurement time) delimit the interval over which the NODE is integrated.
- (ii) A vector-valued function represents the hidden/latent neurons. In this work, all vectors are considered to be column vectors, and the dagger symbol “†” is used to denote “transposition”; the symbol “≜” signifies “is defined as” or, equivalently, “is by definition equal to”.
- (iii) A vector-valued nonlinear function models the dynamics of the latent neurons; its learnable scalar adjustable weights are the components of a weight vector whose dimension equals the total number of adjustable weights in all of the latent neural nets.
- (iv) A vector-valued function represents the “encoder”, which maps the model’s “inputs” onto the initial state of the latent neurons; the encoder is characterized by the total number of its inputs and by its own vector of “learnable” scalar adjustable encoder weights.
- (v) A vector-valued function represents the vector of “system responses”. The “decoder” is a vector-valued function with its own learnable scalar adjustable weights, represented by the components of a corresponding weight vector. Each response component can be represented in integral form over the NODE’s time interval (for instance, by using a Dirac delta functional to represent a response evaluated at the stopping time). A minimal computational sketch of this encoder–dynamics–decoder structure is given below.
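To make the encoder–dynamics–decoder structure described in items (i)–(v) concrete, the following minimal Python/SciPy sketch implements a toy NODE. It is not taken from this paper: the layer sizes, the weight matrices W_enc, W_dyn, W_dec, and the tanh dynamics are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

# Illustrative (assumed) sizes: 2 inputs, 3 latent neurons, 1 response.
W_enc = rng.normal(size=(3, 2))   # learnable encoder weights
W_dyn = rng.normal(size=(3, 3))   # learnable latent-dynamics weights
W_dec = rng.normal(size=(1, 3))   # learnable decoder weights

def encoder(x):
    """Encoder: maps the inputs x onto the initial latent state h(t0)."""
    return W_enc @ x

def dynamics(t, h):
    """Latent dynamics dh/dt = f(h; weights); tanh is an illustrative choice."""
    return np.tanh(W_dyn @ h)

def decoder(h):
    """Decoder: maps the latent state at the stopping time onto the response."""
    return W_dec @ h

x = np.array([0.5, -1.0])         # example inputs
t0, tf = 0.0, 1.0                 # initial and next "measurement" times
sol = solve_ivp(dynamics, (t0, tf), encoder(x), rtol=1e-8, atol=1e-10)
print("response:", decoder(sol.y[:, -1]))
```

In a training setting, the three weight matrices would be adjusted to reduce a loss on the responses; the adjoint framework developed in this work provides precisely the first-order response sensitivities that such an adjustment, as well as the subsequent uncertainty analysis, requires.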
3. Illustrative Paradigm Application: NODE Conceptual Modeling of the Nordheim–Fuchs Phenomenological Reactor Dynamics/Safety Model
- 1. The time-dependent neutron balance (point kinetics) equation for the neutron flux φ(t);
- 2. The energy production equation;
- 3. The energy conservation equation;
- 4. The reactivity–temperature feedback equation, in which the multiplication factor, as changed by the reactivity insertion at t = 0, is diminished in proportion to the rise of the reactor’s temperature T(t) above its initial value T(0), the proportionality constant being the magnitude of the negative temperature coefficient. For illustrating the application of the 1st-FASAM methodology, it suffices to consider the special case of a “prompt critical transient”, when the reactor becomes prompt critical after the reactivity insertion, in which case the reactivity–temperature feedback equation takes on a correspondingly simplified particular form. The standard textbook forms of these four equations are collected in the hedged sketch following this list.
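Since the displayed equations are not reproduced above, the following block collects the standard textbook form of the Nordheim–Fuchs model, as given, e.g., by Lamarsh and by Hetrick. This is a hedged reconstruction: the symbols used here (prompt-neutron lifetime ℓ, delayed-neutron fraction β, reactivity ρ(t), energy-production coefficient w_f Σ_f, volumetric heat capacity c) may differ in detail from the paper’s own notation.

```latex
% Standard textbook form of the Nordheim--Fuchs model (assumed notation):
\begin{align}
\frac{d\varphi(t)}{dt} &= \frac{\rho(t)-\beta}{\ell}\,\varphi(t),
  \qquad \varphi(0)=\varphi_{0}, \\   % point kinetics (prompt neutrons only)
\frac{dE(t)}{dt} &= w_{f}\,\Sigma_{f}\,\varphi(t),
  \qquad E(0)=0, \\                   % energy production per unit volume
E(t) &= c\,\bigl[T(t)-T_{0}\bigr], \\ % adiabatic energy conservation
\rho(t) &= \rho_{0}-\alpha\,\bigl[T(t)-T_{0}\bigr]. % temperature feedback
\end{align}
```

The transient is prompt critical when the inserted reactivity exceeds the delayed-neutron fraction, ρ₀ > β, in which case the delayed neutrons may be neglected and ρ(t) − β acts as the effective (prompt) reactivity.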
- (i) Eliminating the function φ(t) from Equations (13) and (14) yields a nonlinear differential equation that can be integrated directly to obtain the relation given in Equation (16), which expresses the neutron flux in terms of the released energy;
- (ii) Using Equation (16) in Equation (14) yields a nonlinear differential equation for the released energy, whose solution is the closed-form expression obtained in Equation (18);
- (iii) Replacing Equation (18) into Equation (16) yields a closed-form expression for the neutron flux φ(t);
- (iv) Replacing Equation (18) into Equation (11) yields a closed-form expression for the reactor’s temperature T(t). As a sanity check on this closed-form solution route, a hedged numerical sketch is provided below.
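The following Python snippet, which is not part of the paper, integrates the prompt-critical Nordheim–Fuchs system numerically as a plausibility check on steps (i)–(iv); all parameter values are illustrative assumptions chosen only to produce a visible self-limiting power pulse.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative (assumed) parameter values, not taken from the paper:
ell    = 1.0e-4   # prompt-neutron lifetime [s]
rho_0  = 0.005    # prompt reactivity inserted at t = 0 [dk/k]
alpha  = 1.0e-5   # magnitude of the negative temperature coefficient [1/K]
wf_Sf  = 1.0e-17  # energy-production coefficient w_f * Sigma_f (assumed)
c_heat = 1.0      # volumetric heat capacity [J/(cm^3 K)] (assumed)
phi_0  = 1.0e10   # initial neutron flux
T_0    = 300.0    # initial temperature [K]

def nordheim_fuchs(t, y):
    phi, E, T = y
    rho = rho_0 - alpha * (T - T_0)    # reactivity-temperature feedback
    return [(rho / ell) * phi,         # prompt point kinetics
            wf_Sf * phi,               # energy production per unit volume
            wf_Sf * phi / c_heat]      # adiabatic temperature rise

sol = solve_ivp(nordheim_fuchs, (0.0, 2.0), [phi_0, 0.0, T_0],
                method="LSODA", rtol=1e-10, atol=1e-12)
print("peak flux:", sol.y[0].max())
print("energy released at final time:", sol.y[1, -1])
```

The printed final energy should approach the classical self-limiting value 2cρ₀/α (here, 1000 J per cm³): the transient shuts itself down once the temperature feedback has compensated twice the inserted reactivity.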
- (i) The neutron flux φ(τ) in the reactor at a “final time” instance denoted as t = τ, after the initiation at t = 0 of the prompt-critical power transient, which can be defined mathematically in integral form;
- (ii) The total energy per cm³ released at a user-chosen “final time” instance t = τ, after the initiation at t = 0 of the prompt-critical power transient, which can likewise be defined mathematically in integral form;
- (iii) The reactor’s temperature at a “final time” instance t = τ, after the initiation at t = 0 of the prompt-critical power transient, which can also be defined mathematically in integral form. A hedged reconstruction of these three integral representations follows this list.
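The displayed definitions announced in items (i)–(iii) are not reproduced above; the following is a plausible reconstruction, assuming the Dirac-delta convention customarily used in the CASAM literature for representing point-valued responses as integrals over the transient:

```latex
% Hedged reconstruction of the three final-time response functionals:
\begin{align}
\varphi(\tau) &\triangleq \int_{0}^{\tau} \varphi(t)\,\delta(t-\tau)\,dt, \\
E(\tau)       &\triangleq \int_{0}^{\tau} E(t)\,\delta(t-\tau)\,dt,       \\
T(\tau)       &\triangleq \int_{0}^{\tau} T(t)\,\delta(t-\tau)\,dt.
\end{align}
```

Here, δ(t − τ) denotes the Dirac delta functional concentrated at the final time t = τ.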
4. First-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (1st-CASAM-NODE): Mathematical Framework
5. Illustrative Application of the 1st-CASAM-NODE Methodology to Compute First-Order Sensitivities of Nordheim–Fuchs Model Responses with Respect to the Underlying Parameters
5.1. First-Order Sensitivities of the Flux Response
- 1. Consider that the 1st-level variational function is an element in a Hilbert space comprising square-integrable vector-valued functions defined on the transient’s time interval, endowed with the inner product introduced in Equation (50); for the Nordheim–Fuchs model, this inner product takes on the particular form given in Equation (79), namely the integral over the interval (0, τ) of the sum of the products of the corresponding components of the two vectors.
- 2. Use Equation (79) to form the inner product of Equations (69)–(71) with a yet-undefined 1st-level adjoint function, to obtain the relation [Equation (80)] that is the particular form taken on by Equation (51) for the Nordheim–Fuchs model.
- 3. Integrating by parts the terms on the left side of Equation (80) yields the relation given in Equation (81).
- 4. The definition of the 1st-level adjoint sensitivity function is now completed by requiring that: (i) the integral term on the right side of Equation (81) represent the G-differential defined in Equation (62); and (ii) the appearance of the unknown values of the components of the 1st-level variational function be eliminated from Equation (81). These requirements are satisfied if the 1st-level adjoint sensitivity function is the solution of the “1st-Level Adjoint Sensitivity System” (1st-LASS) comprising Equations (84) and (85).
- 5. Using Equations (84), (85), (80), (62), (72), (73) and (74) in Equation (81) yields the expression for the first G-differential of the response under consideration, from which each first-order sensitivity is identified as the quantity that multiplies the corresponding parameter variation. A hedged computational sketch of this adjoint workflow follows this list.
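The Python sketch below illustrates the computational pattern behind steps 1–5: solve the forward model once, solve a single adjoint system backward in time with a final-time condition that selects the response, and then evaluate each first-order sensitivity by an inexpensive quadrature. It is a minimal analogue under the same illustrative assumptions as the earlier snippet, not an implementation of the paper’s Equations (84) and (85).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Same illustrative (assumed) parameter values as in the previous sketch:
ell, rho_0, alpha = 1.0e-4, 0.005, 1.0e-5
wf_Sf, c_heat, phi_0, T_0 = 1.0e-17, 1.0, 1.0e10, 300.0
tau = 2.0                                  # user-chosen final time

def f(t, y):
    """Forward Nordheim-Fuchs model (textbook form, assumed notation)."""
    phi, E, T = y
    rho = rho_0 - alpha * (T - T_0)
    return [(rho / ell) * phi, wf_Sf * phi, wf_Sf * phi / c_heat]

fwd = solve_ivp(f, (0.0, tau), [phi_0, 0.0, T_0], method="LSODA",
                rtol=1e-10, atol=1e-12, dense_output=True)

def solve_adjoint(psi_tau):
    """Backward solve of d(psi)/dt = -J^T psi with a final-time condition,
    playing the role of a 1st-LASS solve for this toy problem."""
    def rhs(t, psi):
        phi, _E, T = fwd.sol(t)
        rho = rho_0 - alpha * (T - T_0)
        J = np.array([[rho / ell, 0.0, -alpha * phi / ell],  # d f / d state
                      [wf_Sf,          0.0, 0.0],
                      [wf_Sf / c_heat, 0.0, 0.0]])
        return -J.T @ psi
    return solve_ivp(rhs, (tau, 0.0), psi_tau, method="LSODA",
                     rtol=1e-10, atol=1e-12, dense_output=True)

# Adjoint for the flux response phi(tau): final condition selects component 1.
adj = solve_adjoint([1.0, 0.0, 0.0])

# Indirect-effect quadrature: d(phi(tau))/d(rho_0) = int psi . (df/d rho_0) dt
ts = np.linspace(0.0, tau, 4001)
sens = np.trapz([adj.sol(t) @ np.array([fwd.sol(t)[0] / ell, 0.0, 0.0])
                 for t in ts], ts)
print("d phi(tau)/d rho_0 ~", sens)
print("d phi(tau)/d phi_0 ~", adj.sol(0.0)[0])  # initial-condition sensitivity
```

The final-time condition [1, 0, 0] plays the role of the source term that, in the 1st-LASS, is concentrated at t = τ by the Dirac delta functional defining the flux response.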
5.2. First-Order Sensitivities of the Energy Released Response
5.3. First-Order Sensitivities of the Temperature Response
5.4. First-Order Sensitivities of the Thermal Conductivity Response
- (i) the 1st-LASS defined by Equations (84) and (85), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities associated with the flux component φ(t) of the state function;
- (ii) the 1st-LASS defined by Equations (95) and (96), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities associated with the released-energy component E(t) of the state function;
- (iii) the 1st-LASS defined by Equations (98) and (99), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities associated with the temperature component T(t) of the state function; and
- (iv) the 1st-LASS defined by Equations (104) and (105), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities stemming from the indirect-effect term of the response considered in Section 5.4. A hedged sketch of how such a set of adjoint solves assembles the first-order sensitivities follows this list.
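Continuing the sketch given after Section 5.1 (it assumes that fwd, solve_adjoint, tau, and the illustrative parameters defined there are still in scope), the loop below makes the bookkeeping of items (i)–(iv) explicit: one backward adjoint solve per response, after which the sensitivity with respect to each additional parameter costs only a quadrature. The fourth system would follow the same pattern with its own source specification.

```python
import numpy as np

# Continuation: reuses fwd, solve_adjoint, tau, ell from the earlier sketch.
ts = np.linspace(0.0, tau, 4001)
df_drho0 = lambda t: np.array([fwd.sol(t)[0] / ell, 0.0, 0.0])  # example df/d(rho_0)

for name, psi_tau in (("phi", [1.0, 0.0, 0.0]),   # selects phi(tau)
                      ("E",   [0.0, 1.0, 0.0]),   # selects E(tau)
                      ("T",   [0.0, 0.0, 1.0])):  # selects T(tau)
    adj = solve_adjoint(psi_tau)                  # one adjoint solve per response
    s = np.trapz([adj.sol(t) @ df_drho0(t) for t in ts], ts)
    print(f"d {name}(tau)/d rho_0 ~ {s:.3e}")     # cheap quadrature per parameter
```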
5.5. Most Efficient Computation of First-Order Sensitivities: Application of the 1st-FASAM-N
6. Use of First-Order Sensitivities for Uncertainty Analysis of NODE Responses
- 1. The expected (or mean) value of a model parameter is defined as the average of that parameter weighted by the parameters’ joint probability density;
- 2. The covariance of two parameters is defined as the expected value of the product of the two parameters’ deviations from their respective mean values;
- 3. The third-order moment of the distribution of parameters and the associated third-order correlation among three parameters are defined in terms of the expected value of the product of three such parameter deviations;
- 4. The fourth-order moment of the distribution of parameters and the associated fourth-order correlation among four parameters are defined in terms of the expected value of the product of four such parameter deviations. The standard forms of these definitions, together with the first-order (“sandwich-rule”) variance propagation that they enable, are collected in the hedged sketch following this list.
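Since the displayed definitions announced in items 1–4 are not reproduced above, the following block reconstructs their standard forms; the joint density p(α), the deviations δαᵢ, the correlation symbols ρᵢⱼ, tᵢⱼₖ, qᵢⱼₖₗ, and the parameter count TP are assumed notation.

```latex
% Hedged reconstruction of the standard moment definitions (assumed notation):
\begin{align}
\langle \alpha_{i} \rangle &\triangleq
  \int \alpha_{i}\, p(\boldsymbol{\alpha})\, d\boldsymbol{\alpha}, \\
\mathrm{cov}(\alpha_{i},\alpha_{j}) &\triangleq
  \langle \delta\alpha_{i}\,\delta\alpha_{j} \rangle
  = \rho_{ij}\,\sigma_{i}\,\sigma_{j}, \\
\mu_{3}(\alpha_{i},\alpha_{j},\alpha_{k}) &\triangleq
  \langle \delta\alpha_{i}\,\delta\alpha_{j}\,\delta\alpha_{k} \rangle
  = t_{ijk}\,\sigma_{i}\,\sigma_{j}\,\sigma_{k}, \\
\mu_{4}(\alpha_{i},\alpha_{j},\alpha_{k},\alpha_{l}) &\triangleq
  \langle \delta\alpha_{i}\,\delta\alpha_{j}\,\delta\alpha_{k}\,\delta\alpha_{l} \rangle
  = q_{ijkl}\,\sigma_{i}\,\sigma_{j}\,\sigma_{k}\,\sigma_{l},
\end{align}
```

where δαᵢ ≜ αᵢ − ⟨αᵢ⟩ and σᵢ denotes the standard deviation of αᵢ, with all indices running from 1 to TP. To first order, the response variance is then propagated through the familiar “sandwich rule”,

```latex
\begin{equation}
\mathrm{var}(r) \cong \sum_{i=1}^{TP}\sum_{j=1}^{TP}
  \frac{\partial r}{\partial \alpha_{i}}\,
  \frac{\partial r}{\partial \alpha_{j}}\,
  \mathrm{cov}(\alpha_{i},\alpha_{j}),
\end{equation}
```

in which the first-order sensitivities ∂r/∂αᵢ are exactly the quantities produced by the adjoint computations of Section 5.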
7. Discussion and Conclusions
Funding
Data Availability Statement
Conflicts of Interest
References
- Haber, E.; Ruthotto, L. Stable architectures for deep neural networks. Inverse Probl. 2017, 34, 014004.
- Lu, Y.; Zhong, A.; Li, Q.; Dong, B. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; PMLR; pp. 3276–3285.
- Ruthotto, L.; Haber, E. Deep neural networks motivated by partial differential equations. J. Math. Imaging Vis. 2018, 62, 352–364.
- Chen, R.T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: New York, NY, USA, 2018; Volume 31, pp. 6571–6583.
- Dupont, E.; Doucet, A.; Teh, Y.W. Augmented Neural ODEs. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32, pp. 14–15.
- Kidger, P. On Neural Differential Equations. arXiv 2022, arXiv:2202.02435.
- Kidger, P.; Morrill, J.; Foster, J.; Lyons, T. Neural controlled differential equations for irregular time series. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; Volume 33, pp. 6696–6707.
- Morrill, J.; Salvi, C.; Kidger, P.; Foster, J. Neural rough differential equations for long time series. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; PMLR; pp. 7829–7838.
- Grathwohl, W.; Chen, R.T.Q.; Bettencourt, J.; Sutskever, I.; Duvenaud, D. FFJORD: Free-form continuous dynamics for scalable reversible generative models. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019.
- Zhong, Y.D.; Dey, B.; Chakraborty, A. Symplectic ODE-Net: Learning Hamiltonian dynamics with control. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020.
- Tieleman, T.; Hinton, G. Lecture 6.5—RMSProp: Divide the gradient by a running average of its recent magnitude. COURSERA Neural Netw. Mach. Learn. 2012, 4, 26.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015.
- Pontryagin, L.S. Mathematical Theory of Optimal Processes; CRC Press: Boca Raton, FL, USA, 1987.
- LeCun, Y. A theoretical framework for back-propagation. In Proceedings of the Connectionist Models Summer School; Touretzky, D., Hinton, G., Sejnowski, T., Eds.; Morgan Kaufmann Publishers, Inc.: San Mateo, CA, USA, 1988.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Norcliffe, A.; Deisenroth, M.P. Faster training of neural ODEs using Gauss–Legendre quadrature. arXiv 2023, arXiv:2308.10644.
- Lamarsh, J.R. Introduction to Nuclear Reactor Theory; Addison-Wesley Publishing Co.: Reading, MA, USA, 1966; pp. 491–492.
- Hetrick, D.L. Dynamics of Nuclear Reactors; American Nuclear Society, Inc.: La Grange Park, IL, USA, 1993; pp. 164–174.
- Cacuci, D.G. Computation of high-order sensitivities of model responses to model parameters. II: Introducing the Second-Order Adjoint Sensitivity Analysis Methodology for Computing Response Sensitivities to Functions/Features of Parameters. Energies 2023, 16, 6356.
- Tukey, J.W. The Propagation of Errors, Fluctuations and Tolerances; Technical Reports No. 10–12; Princeton University: Princeton, NJ, USA, 1957.
- Cacuci, D.G. The nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology (nth-CASAM): Overcoming the Curse of Dimensionality in Sensitivity and Uncertainty Analysis, Volume I: Linear Systems; Springer Nature: Cham, Switzerland, 2022; 362p.
- Cacuci, D.G. The Fourth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (4th-CASAM-N): I. Mathematical Framework. J. Nucl. Eng. 2022, 3, 37–71.
- Cacuci, D.G. Sensitivity theory for nonlinear systems: I. Nonlinear functional analysis approach. J. Math. Phys. 1981, 22, 2794–2812.
- Cacuci, D.G. Introducing the nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N): I. Mathematical Framework. Am. J. Comput. Math. 2024, 14, 11–42.