Article

The First- and Second-Order Features Adjoint Sensitivity Analysis Methodologies for Fredholm-Type Neural Integro-Differential Equations: I. Mathematical Framework

by
Dan Gabriel Cacuci
Department of Mechanical Engineering, University of South Carolina, Columbia, SC 29208, USA
Processes 2025, 13(7), 2258; https://doi.org/10.3390/pr13072258
Submission received: 29 May 2025 / Revised: 7 July 2025 / Accepted: 14 July 2025 / Published: 15 July 2025
(This article belongs to the Section Energy Systems)

Abstract

This work presents the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type” (1st-FASAM-NIDE-F) and the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type” (2nd-FASAM-NIDE-F). It is shown that the 1st-FASAM-NIDE-F methodology enables the most efficient computation of exactly-determined first-order sensitivities of decoder response with respect to the optimized NIDE-F parameters, requiring a single “large-scale” computation for solving the 1st-Level Adjoint Sensitivity System (1st-LASS), regardless of the number of weights/parameters underlying the NIDE-F decoder, hidden layers, and encoder. The 2nd-FASAM-NIDE-F methodology enables the computation, with unparalleled efficiency, of the second-order sensitivities of decoder responses with respect to the optimized/trained weights, requiring only as many large-scale computations for solving the 2nd-Level Adjoint Sensitivity System (2nd-LASS) as there are non-zero feature functions of parameters. The application of both the 1st-FASAM-NIDE-F and the 2nd-FASAM-NIDE-F methodologies is illustrated in an accompanying work (Part II) by considering a paradigm heat transfer model.

1. Introduction

The introduction of “Neural Ordinary Differential Equations” (NODE) models by Chen et al. [1] has significantly advanced the versatility and applicability of neural-nets by providing an explicit connection between deep feed-forward neural networks and dynamical systems. By using differential equation solvers to learn dynamics through continuous deep learning models of neural networks, NODE models provide a bridge between modern deep learning and traditional numerical modeling while offering trade-offs between efficiency, memory costs, and accuracy, as demonstrated by various applications [1,2,3,4,5,6,7,8,9]. However, NODE models are limited to describing systems that are instantaneous, since each time-step is determined locally in time, without contributions from the state of the system at other times.
In contradistinction to differential equations, integral equations (IEs) model global spatio-temporal relations, which are learned through an IE-solver (see, e.g., [10]) which samples the domain of integration continuously. Due to their non-local behavior, IE-solvers are suitable for modeling complex dynamics. Zappala et al. [11] have introduced the Neural Integral Equation (NIE) and the Attentional Neural Integral Equation (ANIE), which can be used to infer the spatio-temporal relations that generated the data, thus enabling the continuous learning of non-local dynamics with arbitrary time resolution [11,12]. Often, ordinary and/or partial differential equations can be recast in integral-equation forms that can be solved more efficiently using IE-solvers, as exemplified in [13,14,15].
Zappala et al. [16] have also developed a deep learning method called Neural Integro-Differential Equation (NIDE), which “learns” an integro-differential equation (IDE) whose solution approximates data sampled from given non-local dynamics. The motivation for using NIDE stems from the need to model systems that present spatio-temporal relations which transcend local modeling, as illustrated by the pioneering works of Volterra on population dynamics [17]. Combining the properties of differential and integral equations, IDEs also present properties that are unique to their non-local behavior [18,19,20], with applications in computational biology, physics, engineering, and applied sciences [18,19,20,21,22,23,24,25,26].
All neural-nets are trained by minimizing a user-chosen “loss functional” which aims at representing the discrepancy between the output produced by the respective net’s decoder and a user-chosen “reference solution”. The neural-net is optimized to reproduce the underlying physical system as closely as possible. However, the physical system modeled by a neural-net comprises parameters that stem from measurements and/or computations which are subject to uncertainties. Therefore, even if the neural-net reproduces perfectly the underlying system, the uncertainties inherent in the system’s parameters would propagate to the subsequent results produced by the decoder. Quantifying the uncertainties in the decoder’s response can only be performed if the sensitivities of the decoder’s response with respect to the neural-net’s optimized parameters are known.
In addition to scalar-valued parameters, neural-nets often comprise scalar-valued functions (e.g., correlations, material properties, etc.) of the model’s scalar parameters. Calling such scalar-valued functions “features of primary model parameters,” Cacuci [27] has recently introduced the “nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N)”. The nth-FASAM-N enables the most efficient computation of the exact expressions of arbitrarily high-order sensitivities of model responses with respect to the model’s “features”. Subsequently, the sensitivities of the responses with respect to the primary model parameters are obtained, analytically and trivially, by applying the well-known “chain rule of differentiation” to the response sensitivities with respect to the model’s features/functions of parameters.
Based on the general framework of the nth-FASAM-N methodology [27], Cacuci has developed specific sensitivity analysis methodologies for NODE-nets, as follows: the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (1st-FASAM-NODE)” and the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (2nd-FASAM-NODE)” [28]. The 1st-FASAM-NODE and the 2nd-FASAM-NODE are pioneering sensitivity analysis methodologies which enable the computation, with unparalleled efficiency, of exactly-determined first-order and, respectively, second-order sensitivities of decoder responses with respect to the optimized/trained weights involved in the NODE’s hidden layers, decoder, and encoder.
Two important families of IDEs are the Volterra and the Fredholm equations. In a Volterra IDE, the interval of integration grows linearly during the system’s dynamics, while in a Fredholm IDE the interval of integration is fixed during the dynamic history of the system; at any given time instance within this interval, however, the system depends on the past, present, and future states of the system. By applying the general concepts underlying the nth-FASAM-N methodology [27], Cacuci [29,30] has also developed the general methodologies underlying the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Fredholm-Type (2nd-FASAM-NIE-F)” and the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (2nd-FASAM-NIE-V)”. The 2nd-FASAM-NIE-F encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Fredholm-Type (1st-FASAM-NIE-F),” while the 2nd-FASAM-NIE-V encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (1st-FASAM-NIE-V)”. The 1st-FASAM-NIE-F and 1st-FASAM-NIE-V methodologies, respectively, enable the computation, with unparalleled efficiency, of exactly-determined first-order sensitivities of decoder response with respect to the NIE-parameters, requiring a single “large-scale” computation for solving the 1st-Level Adjoint Sensitivity System (1st-LASS), regardless of the number of weights/parameters underlying the NIE-net. The 2nd-FASAM-NIE-F and 2nd-FASAM-NIE-V methodologies, respectively, enable the computation (with unparalleled efficiency) of exactly-determined second-order sensitivities of decoder response with respect to the NIE-parameters, requiring only as many “large-scale” computations as there are first-order sensitivities with respect to the feature functions.
Generalizing the methodologies presented in [29,30], this work presents the “First- and Second-Order Features Adjoint Sensitivity Analysis Methodologies for Neural Integro-Differential Equations of Fredholm-Type,” abbreviated as “1st-FASAM-NIDE-F” and “2nd-FASAM-NIDE-F,” respectively. These methodologies are also based on the general principles underlying the nth-FASAM-N methodology [27]. The 1st-FASAM-NIDE-F is presented in Section 2, while the 2nd-FASAM-NIDE-F is presented in Section 3. The discussion presented in Section 4 concludes this work by highlighting the unparalleled efficiency of the 1st-FASAM-NIDE-F and 2nd-FASAM-NIDE-F methodologies for computing exact first- and second-order sensitivities, respectively, of decoder responses to model parameters in optimized NIDE-F networks. The accompanying work [31] presents an illustrative application of the 1st-FASAM-NIDE-F and 2nd-FASAM-NIDE-F methodologies to a paradigm heat conduction model which admits exact closed-form solutions/expressions for all quantities of interest and is of fundamental importance in many scientific fields [32,33,34,35,36,37].

2. First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type (1st-FASAM-NIDE-F)

The mathematical expression of the network of nonlinear Fredholm-type Neural Integro-Differential Equations (NIDE-F) considered in this work generalizes the NIDE-net model introduced in [16] and is represented in component form by the following system of Nth-order integro-differential equations:
$$\sum_{n=1}^{N}c_{i,n}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]\frac{d^{n}h_{i}(t)}{dt^{n}}=g_{i}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}d\tau\,\psi_{j}\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right];\quad t\in\left[t_0,t_f\right];\quad i=1,\dots,T_H.\tag{1}$$
The boundary conditions, imposed at the “initial time” $t=t_0$ and/or the “final time” $t=t_f$ on the functions $h_i(t)$ and their time-derivatives, associated with the encoder of the NIDE-F net represented by Equation (1), are represented in operator form as follows:
$$B_{j}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]=0;\quad\text{at}\ t=t_0\ \text{and/or}\ t=t_f;\quad j=1,\dots,BC.\tag{2}$$
The quantities appearing in Equations (1) and (2) are defined as follows:
(i)
The real-valued scalar quantities $t$ and $\tau$, $t_0\le t,\tau\le t_f$, are time-like independent variables which parameterize the dynamics of the hidden/latent neuron units. Customarily, the variable $t$ is called the “global time,” while the variable $\tau$ is called the “local time.” The initial time-value is denoted as $t_0$, while the stopping time-value is denoted as $t_f$. Thus, the dynamics modeled by Equation (1) depends both on non-local effects and on instantaneous information.
(ii)
The components of the $T_H$-dimensional vector-valued function $\mathbf{h}(t)\equiv\left[h_1(t),\dots,h_{T_H}(t)\right]^{\dagger}$ represent the hidden/latent neural networks; $T_H$ denotes the total number of components of $\mathbf{h}(t)$. In this work, the symbol “$\equiv$” will be used to denote “is defined as” or, equivalently, “is by definition equal to.” All vectors will be considered column vectors and will typically be denoted using bold lower-case letters. The dagger symbol “$\dagger$” will be used to denote “transposition.”
(iii)
The components of the column-vector $\boldsymbol{\theta}\equiv\left[\theta_1,\dots,\theta_{T_W}\right]^{\dagger}$ represent the “primary” network parameters, namely the scalar learnable/adjustable parameters (weights) in all of the latent neural-nets, including the encoder(s) and decoder(s), where $T_W$ denotes the total number of adjustable parameters/weights.
(iv)
The scalar-valued components $f_i(\boldsymbol{\theta})$, $i=1,\dots,T_F$, of the vector-valued function $\mathbf{f}(\boldsymbol{\theta})\equiv\left[f_1(\boldsymbol{\theta}),\dots,f_{T_F}(\boldsymbol{\theta})\right]^{\dagger}$ represent the “features/functions of the primary model parameters.” The quantity $T_F$ denotes the total number of such feature functions comprised in the NIDE-F. In particular, all of the model parameters that might appear solely in the boundary and/or initial conditions are considered to be included among the components of the vector $\boldsymbol{\theta}$. In general, $\mathbf{f}(\boldsymbol{\theta})$ is a nonlinear vector-valued function of $\boldsymbol{\theta}$. The total number of feature functions must necessarily be smaller than the total number of primary parameters (weights), i.e., $T_F<T_W$. When the NIDE-F comprises only primary parameters, it is considered that $f_i(\boldsymbol{\theta})\equiv\theta_i$ for all $i=1,\dots,T_F\equiv T_W$.
(v)
The functions $\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]$ model the dynamics of the neurons in a latent space where the local time integration occurs, while the functions $\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]$ map the local space back to the original data space. The functions $g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ model additional dynamics in the original data space. In general, these functions are nonlinear in their arguments.
(vi)
The functions $c_{i,n}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]$ are coefficient-functions, which may depend nonlinearly on the functions $\mathbf{h}(t)$ and $\mathbf{f}(\boldsymbol{\theta})$, associated with the order, $n=1,\dots,N$, of the time-derivatives $d^{n}h_i(t)/dt^{n}$ of the functions $h_i(t)$.
(vii)
The operators $B_j\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]$, $j=1,\dots,BC$, represent boundary conditions associated with the encoder and/or decoder, imposed at $t=t_0$ and/or at $t=t_f$ on the functions $h_i(t)$ and on their time-derivatives; the quantity “$BC$” denotes the total number of boundary conditions.
Customarily, the NIDE-F net is “trained” by minimizing a user-chosen loss functional representing the discrepancy between a reference solution (“target data”) and the output produced by the NIDE-F decoder. The “training” process produces “optimal” values for the primary parameters $\boldsymbol{\theta}\equiv\left[\theta_1,\dots,\theta_{T_W}\right]^{\dagger}$, which will be denoted in this work by using the superscript “zero,” as follows: $\boldsymbol{\theta}^{0}\equiv\left[\theta_1^{0},\dots,\theta_{T_W}^{0}\right]^{\dagger}$. Using these optimal/nominal parameter values to evaluate the NIDE-F net yields the optimal/nominal solution $\mathbf{h}^{0}(t)\equiv\left[h_1^{0}(t),\dots,h_{T_H}^{0}(t)\right]^{\dagger}$, which satisfies the following form of Equation (1):
$$\sum_{n=1}^{N}c_{i,n}\left[\mathbf{h}^{0}(t);\mathbf{f}(\boldsymbol{\theta}^{0});t\right]\frac{d^{n}h_{i}^{0}(t)}{dt^{n}}=g_{i}\left[\mathbf{h}^{0}(t);\mathbf{f}(\boldsymbol{\theta}^{0})\right]+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta}^{0});t\right]\int_{t_0}^{t_f}d\tau\,\psi_{j}\left[\mathbf{h}^{0}(\tau);\mathbf{f}(\boldsymbol{\theta}^{0});\tau\right];\quad i=1,\dots,T_H;\tag{3}$$
subject to the following optimized/trained boundary conditions:
$$B_{j}\left[\mathbf{h}^{0}(t);\mathbf{f}(\boldsymbol{\theta}^{0});t\right]=0;\quad\text{at}\ t=t_0\ \text{and/or}\ t=t_f;\quad j=1,\dots,BC.\tag{4}$$
After the NIDE-F net is optimized to reproduce the underlying physical system as closely as possible, the subsequent responses of interest are no longer “loss functionals” but become specific functionals of the NIDE-F’s “decoder” output, which can be generally represented by the functional $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ defined below:
$$R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]=\int_{t_0}^{t_f}D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]dt.\tag{5}$$
The function $D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]$ models the decoder. The scalar-valued quantity $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ is a functional of $\mathbf{h}(t)$ and $\mathbf{f}(\boldsymbol{\theta})$, and represents the NIDE-F’s decoder-response. At the optimal/nominal parameter values, i.e., at $\boldsymbol{\theta}=\boldsymbol{\theta}^{0}$, the decoder response takes on the following form:
$$R\left[\mathbf{h}^{0};\mathbf{f}(\boldsymbol{\theta}^{0})\right]=\int_{t_0}^{t_f}D\left[\mathbf{h}^{0}(t);\mathbf{f}(\boldsymbol{\theta}^{0});t\right]dt.\tag{6}$$
The physical system modeled by the NIDE-F net comprises parameters that stem from measurements and/or computations. Consequently, even if the NIDE-F net models the underlying physical system perfectly, the NIDE-F’s optimal weights/parameters are unavoidably afflicted by uncertainties stemming from the parameters underlying the physical system. Hence, it is important to quantify the uncertainties induced in the decoder output, $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$, by the uncertainties that afflict the parameters/weights underlying the physical system modeled by the NIDE-F net. The relative contributions of the uncertainties afflicting the optimal parameters to the total uncertainty in the decoder response are quantified by the sensitivities of the NIDE-F decoder-response with respect to the optimized NIDE-F parameters. The general methodology for computing the first-order sensitivities of the decoder output, $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$, with respect to the components of the feature function $\mathbf{f}(\boldsymbol{\theta})$ and with respect to the primary model parameters $\theta_1,\dots,\theta_{T_W}$, will be presented in this section.
The known nominal values $\boldsymbol{\theta}^{0}$ of the primary model parameters (“weights”) characterizing the NIDE-F net will differ from the true but unknown values $\boldsymbol{\theta}$ of the respective weights by variations denoted as $\delta\boldsymbol{\theta}\equiv\boldsymbol{\theta}-\boldsymbol{\theta}^{0}$. The variations $\delta\boldsymbol{\theta}$ will induce corresponding variations $\delta\mathbf{f}\equiv\mathbf{f}(\boldsymbol{\theta})-\mathbf{f}^{0}$, $\mathbf{f}^{0}\equiv\mathbf{f}(\boldsymbol{\theta}^{0})$, in the feature functions. The variations $\delta\boldsymbol{\theta}$ and $\delta\mathbf{f}$ will induce, through Equation (1), variations $\mathbf{v}^{(1)}(t)\equiv\left[v_1^{(1)}(t),\dots,v_{T_H}^{(1)}(t)\right]^{\dagger}\equiv\left[\delta h_1(t),\dots,\delta h_{T_H}(t)\right]^{\dagger}$ around the nominal/optimal functions $\mathbf{h}^{0}(t)$. In turn, the variations $\delta\mathbf{f}$ and $\mathbf{v}^{(1)}(t)$ will induce variations $\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{f}\right]$ in the NIDE-F decoder’s response.
The “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type (1st-FASAM-NIDE-F)” aims at obtaining the exact expressions of the first-order sensitivities (i.e., functional derivatives) of the decoder’s response with respect to the feature functions and the primary model parameters, followed by the most efficient computation of these sensitivities. The 1st-FASAM-NIDE-F will be established by applying the same principles as those underlying the 1st-FASAM-N [27] methodology. The fundamental concept for defining the sensitivity of an operator-valued quantity $R(\mathbf{x})$ with respect to variations $\delta\mathbf{x}$ in a neighborhood around the nominal values $\mathbf{x}^{0}$ was shown by Cacuci [38] in 1981 to be provided by the first-order Gateaux- (G-) variation $\delta R\left(\mathbf{x}^{0};\delta\mathbf{x}\right)$ of $R(\mathbf{x})$, which is defined as follows:
$$\delta R\left(\mathbf{x}^{0};\delta\mathbf{x}\right)\equiv\left\{\frac{d}{d\varepsilon}R\left(\mathbf{x}^{0}+\varepsilon\,\delta\mathbf{x}\right)\right\}_{\varepsilon=0}\equiv\lim_{\varepsilon\to0}\frac{R\left(\mathbf{x}^{0}+\varepsilon\,\delta\mathbf{x}\right)-R\left(\mathbf{x}^{0}\right)}{\varepsilon},\tag{7}$$
for a scalar $\varepsilon$ and for arbitrary vectors $\delta\mathbf{x}$ in a neighborhood $\left(\mathbf{x}^{0}+\varepsilon\,\delta\mathbf{x}\right)$ around $\mathbf{x}^{0}$. When the G-variation $\delta R\left(\mathbf{x}^{0};\delta\mathbf{x}\right)$ is linear in the variation $\delta\mathbf{x}$, it can be written in the form $\delta R\left(\mathbf{x}^{0};\delta\mathbf{x}\right)=\left\{\partial R/\partial\mathbf{x}\right\}_{\mathbf{x}^{0}}\delta\mathbf{x}$, where $\left\{\partial R/\partial\mathbf{x}\right\}_{\mathbf{x}^{0}}$ denotes the first-order G-derivative of $R(\mathbf{x})$ with respect to $\mathbf{x}$, evaluated at $\mathbf{x}^{0}$.
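The limit in Equation (7) can be verified numerically for any concrete response functional. The following minimal Python sketch uses a hypothetical functional $R(x)=\int_0^1 x^2(t)\,dt$ (chosen purely for illustration; it is not part of the NIDE-F framework) and compares the finite-difference quotient of Equation (7) against the exact, linear-in-$\delta\mathbf{x}$ G-derivative:

```python
import numpy as np

# Minimal sketch verifying Equation (7) numerically. The response functional
# R(x) = int_0^1 x(t)^2 dt is a hypothetical choice, used purely to illustrate
# the Gateaux-variation; it is not part of the NIDE-F framework itself.
t = np.linspace(0.0, 1.0, 2001)

def integrate(y):
    # Composite trapezoidal rule on the grid t.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def R(x):
    return integrate(x**2)

x0 = np.sin(np.pi * t)               # nominal "state" x^0
dx = 0.3 * np.cos(2.0 * np.pi * t)   # arbitrary variation delta-x

eps = 1.0e-6
g_variation_fd = (R(x0 + eps * dx) - R(x0)) / eps  # difference quotient of Eq. (7)
g_variation_exact = 2.0 * integrate(x0 * dx)       # exact G-variation, linear in dx

print(g_variation_fd, g_variation_exact)           # agree to O(eps)
```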
Applying the definition provided in Equation (7) to Equation (5) yields the following expression for the first-order G-variation $\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{f}\right]$ of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$:
$$\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{f}\right]=\left\{\frac{d}{d\varepsilon}\int_{t_0}^{t_f}D\left[\mathbf{h}^{0}(t)+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}^{0}+\varepsilon\,\delta\mathbf{f};t\right]dt\right\}_{\varepsilon=0}=\left\{\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\delta\mathbf{f}\right]\right\}_{dir}+\left\{\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)}\right]\right\}_{ind},\tag{8}$$
where the “direct effect term” arises directly from variations δ f and is defined as follows:
$$\left\{\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\delta\mathbf{f}\right]\right\}_{dir}\equiv\sum_{i=1}^{T_F}\left\{\int_{t_0}^{t_f}dt\,\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_i}\,\delta f_i\right\}_{\boldsymbol{\theta}^{0}},\tag{9}$$
and where the “indirect effect term,” which arises indirectly through the variations $\mathbf{v}^{(1)}(t)$ in the hidden state functions $\mathbf{h}(t)$, is defined as follows:
$$\left\{\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)}\right]\right\}_{ind}\equiv\sum_{i=1}^{T_H}\left\{\int_{t_0}^{t_f}dt\,\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}\,v_i^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}.\tag{10}$$
The direct-effect term can be quantified using the nominal values $\left(\mathbf{h}^{0};\mathbf{f}^{0}\right)$, but the indirect-effect term can be quantified only after determining the variations $\mathbf{v}^{(1)}(t)$, which are caused by the variations $\delta\mathbf{f}$ through the NIDE-F net defined in Equation (1).
The first-order relationship between the variations v 1 t and δ f is obtained from the first-order G-variations of Equations (1) and (2). The first-order G-variations of Equations (1) and (2), respectively, are obtained, by definition, as follows:
$$\begin{aligned}&\left\{\frac{d}{d\varepsilon}\sum_{n=1}^{N}c_{i,n}\left[\mathbf{h}^{0}(t)+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta}^{0})+\varepsilon\,\delta\mathbf{f};t\right]\frac{d^{n}\left[h_i^{0}(t)+\varepsilon\,\delta h_i(t)\right]}{dt^{n}}\right\}_{\varepsilon=0}=\left\{\frac{d}{d\varepsilon}\,g_i\left[\mathbf{h}^{0}(t)+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta}^{0})+\varepsilon\,\delta\mathbf{f}\right]\right\}_{\varepsilon=0}\\&\quad+\left\{\frac{d}{d\varepsilon}\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta}^{0})+\varepsilon\,\delta\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\,\psi_j\left[\mathbf{h}^{0}(\tau)+\varepsilon\,\mathbf{v}^{(1)}(\tau);\mathbf{f}(\boldsymbol{\theta}^{0})+\varepsilon\,\delta\mathbf{f};\tau\right]\right\}_{\varepsilon=0};\tag{11}\end{aligned}$$
$$\left\{\frac{d}{d\varepsilon}B_j\left[\mathbf{h}^{0}(t)+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta}^{0})+\varepsilon\,\delta\mathbf{f};t\right]\right\}_{\varepsilon=0}=0;\quad t=t_0\ \text{and/or}\ t=t_f;\quad j=1,\dots,BC.\tag{12}$$
Carrying out the operations indicated in Equations (11) and (12) yields the following NIDE-F net of Fredholm-type for the function v 1 t :
$$\begin{aligned}&\left\{\sum_{n=1}^{N}\frac{d^{n}h_i(t)}{dt^{n}}\sum_{k=1}^{T_H}\frac{\partial c_{i,n}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_k(t)}v_k^{(1)}(t)+\sum_{n=1}^{N}c_{i,n}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\frac{d^{n}v_i^{(1)}(t)}{dt^{n}}\right\}_{\boldsymbol{\theta}^{0}}-\left\{\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)\right\}_{\boldsymbol{\theta}^{0}}\\&\quad-\left\{\sum_{k=1}^{T_H}\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}=\sum_{k=1}^{T_F}\left\{q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\right\}_{\boldsymbol{\theta}^{0}}\delta f_k,\quad i=1,\dots,T_H;\tag{13}\end{aligned}$$
$$\sum_{k=1}^{T_H}\left\{\frac{\partial B_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}=-\sum_{k=1}^{T_F}\left\{\frac{\partial B_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_k}\right\}_{\boldsymbol{\theta}^{0}}\delta f_k,\quad\text{at}\ t=t_0;\ t=t_f;\quad j=1,\dots,BC;\tag{14}$$
where:
$$q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\equiv-\sum_{n=1}^{N}\frac{\partial c_{i,n}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_k}\frac{d^{n}h_i(t)}{dt^{n}}+\sum_{j=1}^{T_L}\frac{\partial\varphi_{i,j}\left[\mathbf{f};t\right]}{\partial f_k}\int_{t_0}^{t_f}d\tau\,\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]+\frac{\partial g_i\left[\mathbf{h};\mathbf{f}\right]}{\partial f_k}+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\,\frac{\partial\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial f_k};\quad i=1,\dots,T_H;\ k=1,\dots,T_F.\tag{15}$$
The NIDE-F net represented by Equations (13) and (14) is called [27] the “1st-Level Variational Sensitivity System (1st-LVSS),” and its solution, $\mathbf{v}^{(1)}(t)$, is called [27] the “1st-level variational function.” All of the quantities in Equations (13) and (14) are to be computed at the nominal parameter values, but the respective indication has not been shown explicitly, in order to simplify the notation.
It is important to note that the 1st-LVSS is linear in the variational function $\mathbf{v}^{(1)}(t)$. Therefore, the 1st-LVSS represented by Equation (13) can be written in matrix-vector form as follows:
$$\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\mathbf{v}^{(1)}(t)=\mathbf{Q}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\delta\mathbf{f}(\boldsymbol{\theta}),\tag{16}$$
where the $T_H\times T_F$-dimensional rectangular matrix $\mathbf{Q}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]$ comprises as components the quantities $q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f}\right]$ defined in Equation (15), while the components of the $T_H\times T_H$ square matrix $\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\equiv\left[L_{ik}\right]_{T_H\times T_H}$ are operators (algebraic, differential, integral) defined below, for $i,k=1,\dots,T_H$:
$$L_{ii}\,v_i^{(1)}(t)\equiv\sum_{n=1}^{N}\frac{d^{n}h_i(t)}{dt^{n}}\frac{\partial c_{i,n}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}v_i^{(1)}(t)+\sum_{n=1}^{N}c_{i,n}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\frac{d^{n}v_i^{(1)}(t)}{dt^{n}}-\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_i(\tau)}v_i^{(1)}(\tau)\,d\tau-\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}v_i^{(1)}(t);\tag{17}$$
$$L_{ik}\,v_k^{(1)}(t)\equiv\sum_{n=1}^{N}\frac{d^{n}h_i(t)}{dt^{n}}\frac{\partial c_{i,n}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_k(t)}v_k^{(1)}(t)-\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)-\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}d\tau\,\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau);\quad i\neq k;\ i,k=1,\dots,T_H.\tag{18}$$
Note that the 1st-LVSS would need to be solved anew for each variation $\delta f_j$, $j=1,\dots,T_F$, in order to determine the corresponding function $\mathbf{v}^{(1)}(t)$, which is prohibitively expensive computationally if $T_F$ is a large number. The need for repeatedly solving the 1st-LVSS can be avoided if the variational function $\mathbf{v}^{(1)}(t)$ could be eliminated from appearing in the expression of the indirect-effect term defined in Equation (10). This goal can be achieved [27] by expressing the right-side of Equation (10) in terms of the solutions of the “1st-Level Adjoint Sensitivity System (1st-LASS),” to be constructed next. The construction of this 1st-LASS will be performed in a Hilbert space comprising elements of the same form as $\mathbf{v}^{(1)}(t)\in H_1(\Omega_t)$, defined on the domain $\Omega_t\equiv\left\{t\,\middle|\,t\in\left[t_0,t_f\right]\right\}$. This Hilbert space is endowed with an inner product of two elements $\boldsymbol{\chi}^{(1)}(t)\equiv\left[\chi_1^{(1)}(t),\dots,\chi_{T_H}^{(1)}(t)\right]^{\dagger}\in H_1(\Omega_t)$ and $\boldsymbol{\eta}^{(1)}(t)\equiv\left[\eta_1^{(1)}(t),\dots,\eta_{T_H}^{(1)}(t)\right]^{\dagger}\in H_1(\Omega_t)$, denoted as $\left\langle\boldsymbol{\chi}^{(1)}(t),\boldsymbol{\eta}^{(1)}(t)\right\rangle_1$ and defined as follows:
$$\left\langle\boldsymbol{\chi}^{(1)}(t),\boldsymbol{\eta}^{(1)}(t)\right\rangle_1\equiv\int_{t_0}^{t_f}\left[\boldsymbol{\chi}^{(1)}(t)\right]^{\dagger}\boldsymbol{\eta}^{(1)}(t)\,dt=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}\chi_i^{(1)}(t)\,\eta_i^{(1)}(t)\,dt.\tag{19}$$
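Upon discretization on a time grid, the inner product of Equation (19) reduces to a quadrature-weighted dot product. A minimal Python sketch (with a hypothetical grid and hypothetical two-component functions) is:

```python
import numpy as np

# Minimal sketch of the inner product of Equation (19) after discretization:
# trapezoid quadrature weights turn <chi, eta>_1 into a weighted dot product.
# The grid and the two-component functions below are hypothetical examples.
t = np.linspace(0.0, 2.0, 1001)       # grid on [t0, tf]
w = np.empty_like(t)                  # composite trapezoid weights
w[0] = 0.5 * (t[1] - t[0])
w[-1] = 0.5 * (t[-1] - t[-2])
w[1:-1] = 0.5 * (t[2:] - t[:-2])

chi = np.vstack([np.sin(t), np.cos(t)])   # chi^(1)(t), with T_H = 2 components
eta = np.vstack([t, np.exp(-t)])          # eta^(1)(t)

inner_product_1 = float(np.sum((chi * eta) @ w))  # <chi, eta>_1 of Equation (19)
print(inner_product_1)
```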
The next step is to construct the inner product of Equation (13) with a vector $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}\in H_1(\Omega_t)$, where the superscript “(1)” indicates “1st-Level,” to obtain the following relationship:
$$\left\langle\mathbf{a}^{(1)}(t),\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\mathbf{v}^{(1)}(t)\right\rangle_1=\left\langle\mathbf{a}^{(1)}(t),\mathbf{Q}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\delta\mathbf{f}(\boldsymbol{\theta})\right\rangle_1.\tag{20}$$
The terms appearing in Equation (20) are to be computed at the nominal values $\left(\mathbf{h}^{0};\mathbf{f}^{0}\right)$, but the respective notation has been omitted for simplicity.
Using the definition of the adjoint operator in $H_1(\Omega_t)$, the term on the left-side of Equation (20) is integrated by parts and the order of summations is reversed, to obtain the following relation:
$$\left\langle\mathbf{a}^{(1)}(t),\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\mathbf{v}^{(1)}(t)\right\rangle_1=\left\langle\mathbf{v}^{(1)}(t),\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\mathbf{a}^{(1)}(t)\right\rangle_1+P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right],\tag{21}$$
where the operator $\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\equiv\left\{\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\right\}^{*}$ denotes the formal adjoint of the operator $\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]$, and where $P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ represents the scalar-valued bilinear concomitant evaluated on the boundary $t=t_0$ and/or $t=t_f$. Note that the $T_H\times T_H$ matrix-valued operator $\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\equiv\left[A_{ij}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\right]_{T_H\times T_H}$ acts linearly on the vector $\mathbf{a}^{(1)}(t)$. The “star” superscript (*) will be used in this work to denote “formal adjoint operator.”
It follows from Equations (20) and (21) that the following relation holds:
$$\left\langle\mathbf{v}^{(1)}(t),\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\mathbf{a}^{(1)}(t)\right\rangle_1=\left\langle\mathbf{a}^{(1)}(t),\mathbf{Q}^{(1)}\delta\mathbf{f}\right\rangle_1-P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right].\tag{22}$$
The term on the left-side of Equation (22) is now required to represent the indirect effect term defined in Equation (10) by imposing the following relation:
$$\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]\mathbf{a}^{(1)}(t)=\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial\mathbf{h}(t)};\quad t_0<t<t_f.\tag{23}$$
Using Equations (22) and (23) in Equation (10) yields the following expression for the indirect effect term:
$$\left\{\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)}\right]\right\}_{ind}=\left\{\left\langle\mathbf{a}^{(1)}(t),\mathbf{Q}^{(1)}\delta\mathbf{f}\right\rangle_1-P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\right\}_{\boldsymbol{\theta}^{0}}.\tag{24}$$
The boundary conditions accompanying Equation (23) for the function $\mathbf{a}^{(1)}(t)$ are now chosen at the time values $t=t_f$ and/or $t=t_0$ so as to eliminate all unknown values of the 1st-level variational function $\mathbf{v}^{(1)}(t)$ from the bilinear concomitant $P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ which remain after implementing the boundary conditions provided in Equation (2). These boundary conditions for the function $\mathbf{a}^{(1)}(t)$ can be represented in operator form as follows:
$$B_j\left[\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta});t\right]=0;\quad\text{at}\ t=t_f\ \text{and/or}\ t=t_0;\quad j=1,\dots,BC.\tag{25}$$
The Fredholm-like NIDE net represented by Equations (23) and (25) will be called the “1st-Level Adjoint Sensitivity System (1st-LASS),” and its solution, $\mathbf{a}^{(1)}(t)$, will be called the “1st-level adjoint sensitivity function.” The 1st-LASS is solved using the nominal/optimal values for the parameters and for the function $\mathbf{h}(t)$, but this fact has not been indicated explicitly, in order to simplify the notation. Notably, the 1st-LASS is independent of any parameter variations, so it needs to be solved just once to obtain the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)$. The 1st-LASS is linear in $\mathbf{a}^{(1)}(t)$ but is, in general, nonlinear in $\mathbf{h}(t)$.
Adding the result obtained in Equation (24) for the indirect-effect term $\left\{\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)}\right]\right\}_{ind}$ to the result obtained in Equation (9) for the direct-effect term yields the following expression for the first-order G-differential of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$:
$$\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{f}\right]=\left\{\left\langle\mathbf{a}^{(1)}(t),\mathbf{Q}^{(1)}\delta\mathbf{f}\right\rangle_1-P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\right\}_{\boldsymbol{\theta}^{0}}+\sum_{i=1}^{T_F}\left\{\int_{t_0}^{t_f}dt\,\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_i}\,\delta f_i\right\}_{\boldsymbol{\theta}^{0}}\equiv\sum_{i=1}^{T_F}\left\{R^{(1)}\left[i;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]\right\}_{\boldsymbol{\theta}^{0}}\delta f_i,\tag{26}$$
where $R^{(1)}\left[i;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]\equiv\partial R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]/\partial f_i$ denotes the first-order sensitivity of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ with respect to the component $f_i$ of the “feature” function. Each sensitivity $R^{(1)}\left[i;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ is obtained by identifying the expression that multiplies the corresponding variation $\delta f_i$ and can be represented formally in the following integral form:
$$R^{(1)}\left[i;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]\equiv\int_{t_0}^{t_f}S^{(1)}\left[i;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]dt;\quad i=1,\dots,T_F.\tag{27}$$
The functions $S^{(1)}\left[i;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ will subsequently be used for determining the exact expressions of the second-order sensitivities of the response with respect to the components of the feature function $\mathbf{f}(\boldsymbol{\theta})$ of model parameters.
In the following subsections, the detailed forms of the 1st-LASS are provided for first-order ($n=1$) and second-order ($n=2$) Fredholm-like NIDE nets, respectively.

2.1. First-Order Neural Integro-Differential Equations of Fredholm-Type (1st-NIDE-F)

The representation of the first-order ($n=1$) neural integro-differential equations of Fredholm-type (1st-NIDE-F) is provided below, for $i=1,\dots,T_H$:
$$c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\frac{dh_i(t)}{dt}=g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}d\tau\,\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right].\tag{28}$$
The typical boundary conditions, provided at $t=t_0$ (“encoder”), are as follows:
$$h_i(t_0)=e_i;\quad i=1,\dots,T_H,\tag{29}$$
where the scalar values $e_i$ are known, albeit imprecisely, since they are considered to stem from experiments and/or computations. Equations (28) and (29) are customarily considered to constitute an “initial-value (NIDE-F) problem,” although the independent variable $t$ could represent some other physical entity (e.g., space, energy) rather than time.
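Because the Fredholm integral in Equation (28) extends over the entire fixed interval $[t_0,t_f]$, the 1st-NIDE-F can be solved by a fixed-point iteration in which the integral term is frozen during each initial-value sweep. The following Python sketch illustrates this for a hypothetical scalar instance ($T_H=T_L=1$, with illustrative choices of $c$, $g$, $\psi$, and $\varphi$ that are not taken from this work):

```python
import numpy as np

# Minimal sketch (hypothetical scalar model, T_H = T_L = 1) of solving
# Equations (28)-(29):  c * dh/dt = g(h) + phi(t) * I[h],  h(t0) = e,
# where I[h] = int_{t0}^{tf} psi(h(tau)) d tau is the Fredholm integral.
# For a fixed h, I[h] is a constant, so the solution alternates between
# (i) quadrature of I[h] and (ii) an explicit-Euler initial-value sweep.
t0, tf, n = 0.0, 1.0, 2001
t = np.linspace(t0, tf, n)
dt = t[1] - t[0]
c, e = 1.0, 0.5
g = lambda h: -h            # hypothetical local dynamics g_i
psi = lambda h: np.tanh(h)  # hypothetical latent kernel psi_j
phi = lambda s: 0.2         # hypothetical map phi_{i,j}(t) back to data space

h = np.full(n, e)           # initial guess for h(t)
for sweep in range(50):
    I = float(np.sum(0.5 * (psi(h)[1:] + psi(h)[:-1])) * dt)  # trapezoid rule
    h_new = np.empty(n)
    h_new[0] = e
    for k in range(n - 1):  # explicit Euler sweep with the integral frozen
        h_new[k + 1] = h_new[k] + (dt / c) * (g(h_new[k]) + phi(t[k]) * I)
    if np.max(np.abs(h_new - h)) < 1.0e-10:
        h = h_new
        break
    h = h_new
```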
The 1st-LVSS for the function $\mathbf{v}^{(1)}(t)$ is obtained by G-differentiating Equations (28) and (29); it comprises the following particular forms of Equations (13) and (14) for $n=1$:
$$\left\{c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\right\}_{\boldsymbol{\theta}^{0}}\frac{dv_i^{(1)}(t)}{dt}+\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\left\{\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}-\left\{\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)\right\}_{\boldsymbol{\theta}^{0}}-\sum_{k=1}^{T_H}\left\{\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}=\sum_{k=1}^{T_F}\left\{q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f}\right]\right\}_{\boldsymbol{\theta}^{0}}\delta f_k,\quad i=1,\dots,T_H;\tag{30}$$
$$v_i^{(1)}(t_0)=\delta e_i;\quad i=1,\dots,T_H;\tag{31}$$
where:
$$q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f}\right]\equiv-\frac{dh_i(t)}{dt}\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_k}+\sum_{j=1}^{T_L}\frac{\partial\varphi_{i,j}\left[\mathbf{f};t\right]}{\partial f_k}\int_{t_0}^{t_f}\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau+\frac{\partial g_i\left[\mathbf{h};\mathbf{f}\right]}{\partial f_k}+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\,\frac{\partial\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial f_k};\quad i=1,\dots,T_H;\ k=1,\dots,T_F.\tag{32}$$
The 1st-LASS is constructed by using Equation (19) to form the inner product of Equation (30) with a vector $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}\in H_1(\Omega_t)$, to obtain the following relationship:
$$\begin{aligned}&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\left\{c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{dv_i^{(1)}(t)}{dt}+\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)-\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h}(\tau);\mathbf{f};\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)-\sum_{k=1}^{T_H}\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}\\&=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k.\end{aligned}\tag{33}$$
Examining the structure of the left-side of Equation (33) reveals that the bilinear concomitant will arise from the integration by parts of the first term on the left-side of Equation (33), to obtain the following relation:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{dv_i^{(1)}(t)}{dt}\,dt=P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,\frac{d\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}}{dt}\,dt,\tag{34}$$
where the bilinear concomitant $P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ has the following expression, by definition:
$$P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\equiv\sum_{i=1}^{T_H}\left\{a_i^{(1)}(t_f)\,c_{i,1}\left[\mathbf{h}(t_f);\mathbf{f}\right]v_i^{(1)}(t_f)-a_i^{(1)}(t_0)\,c_{i,1}\left[\mathbf{h}(t_0);\mathbf{f}\right]v_i^{(1)}(t_0)\right\}.\tag{35}$$
The second term on the left-side of Equation (33) will be recast in its “adjoint form” by reversing the order of summations, so as to transform the inner product involving the function $\mathbf{a}^{(1)}(t)$ into an inner product involving the function $\mathbf{v}^{(1)}(t)$, as follows:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\,\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}\,dt.\tag{36}$$
The third term on the left-side of Equation (33) is now recast in its “adjoint form” by reversing the order of summations and integrations, so as to transform the inner product involving the function $\mathbf{a}^{(1)}(t)$ into an inner product involving the function $\mathbf{v}^{(1)}(t)$, as follows:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}dt\,a_i^{(1)}(t)\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)\,d\tau=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}d\tau\,v_i^{(1)}(\tau)\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_i(\tau)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]dt.\tag{37}$$
The fourth term on the left-side of Equation (33) will be recast in its “adjoint form” by reversing the order of summations, so as to transform the inner product involving the function $\mathbf{a}^{(1)}(t)$ into an inner product involving the function $\mathbf{v}^{(1)}(t)$, as follows:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_H}\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}\,dt.\tag{38}$$
Using the results obtained in Equations (34)–(38) in the left-side of Equation (33) yields the following relation:
$$\begin{aligned}&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\left\{c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{dv_i^{(1)}(t)}{dt}+\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)-\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h}(\tau);\mathbf{f};\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)-\sum_{k=1}^{T_H}\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}\\&=P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,dt\left\{-\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h};\mathbf{f}\right]\right\}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}\right.\\&\left.\quad-\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(\tau)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau-\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}\right\}=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k.\end{aligned}\tag{39}$$
The relation in Equation (39) is rearranged as follows:
$$\begin{aligned}&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k-P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\\&=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,dt\left\{-\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h};\mathbf{f}\right]\right\}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}-\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(\tau)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau-\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}\right\}.\end{aligned}\tag{40}$$
The term on the right-side of Equation (40) is now required to represent the “indirect-effect” term defined in Equation (10), which is achieved by requiring the components of the function $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}$ to satisfy the following system of first-order NIDE-F equations:
$$-\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}-\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(\tau)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau-\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}=\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)};\quad i=1,\dots,T_H.\tag{41}$$
The relation obtained in Equation (41) is the explicit form of the relation provided in Equation (23) for the particular case when $n=1$, i.e., when considering first-order neural integro-differential equations of Fredholm-type (1st-NIDE-F).
The unknown values $v_i^{(1)}(t_f)$ in the bilinear concomitant $P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ in Equation (40) are eliminated by imposing the following final-time conditions:
$$a_i^{(1)}(t_f)=0;\quad i=1,\dots,T_H.\tag{42}$$
It follows from Equations (33)–(42) and (31) that the indirect-effect term defined in Equation (10) has the following expression in terms of the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)$:
$$\left\{\delta R\left[\mathbf{h};\mathbf{f};\mathbf{a}^{(1)}\right]\right\}_{ind}=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k+\sum_{i=1}^{T_H}a_i^{(1)}(t_0)\,c_{i,1}\left[\mathbf{h}(t_0);\mathbf{f}\right]\delta e_i.\tag{43}$$
The first-order NIDE-F net obtained in Equations (41) and (42) represents the explicit form, for the particular case $n=1$, of the 1st-LASS represented in general by Equations (23) and (25). To obtain the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}$, the 1st-LASS is solved backwards in time (globally), using the nominal/optimal values for the parameters and for the function $\mathbf{h}(t)$; this fact has not been indicated explicitly, in order to simplify the notation. Notably, the 1st-LASS is independent of any parameter variations, so it needs to be solved just once to obtain the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)$. The 1st-LASS is linear in $\mathbf{a}^{(1)}(t)$ but is, in general, nonlinear in $\mathbf{h}(t)$.
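To make the backward-in-time character of the 1st-LASS concrete, the following Python sketch solves Equations (41) and (42) for the same hypothetical scalar model used in the sketch following Equation (29), with a constant coefficient $c$ and an assumed decoder $D=h^2$ (both illustrative assumptions). The Fredholm coupling constant $K\equiv\int_{t_0}^{t_f}a^{(1)}(\tau)\,\varphi(\tau)\,d\tau$ is treated as a fixed-point unknown:

```python
import numpy as np

# Hedged sketch of solving the 1st-LASS, Equations (41)-(42), for a
# hypothetical scalar model (c constant, T_H = T_L = 1, decoder D = h^2):
#   -c da/dt - psi'(h(t)) * K - a * g'(h) = dD/dh,   a(tf) = 0,
# where K = int_{t0}^{tf} a(tau) * phi d tau couples the whole history, so
# K is treated as a fixed-point unknown: sweep backward, update K, repeat.
t0, tf, n = 0.0, 1.0, 2001
t = np.linspace(t0, tf, n)
dt = t[1] - t[0]
c, e = 1.0, 0.5
g, gp = (lambda h: -h), (lambda h: -1.0)                      # g and g'
psi, psip = (lambda h: np.tanh(h)), (lambda h: 1.0 / np.cosh(h) ** 2)
phi = 0.2                                                     # constant phi(t)
trap = lambda y: float(np.sum(0.5 * (y[1:] + y[:-1])) * dt)   # trapezoid rule

# Forward solve for h(t), as in the earlier sketch (fixed-point + Euler):
h = np.full(n, e)
for _ in range(50):
    I = trap(psi(h))
    hn = np.empty(n); hn[0] = e
    for k in range(n - 1):
        hn[k + 1] = hn[k] + (dt / c) * (g(hn[k]) + phi * I)
    h = hn

dD_dh = 2.0 * h                           # source: dD/dh for D = h^2
a = np.zeros(n)                           # adjoint function a^(1)(t)
for _ in range(50):
    K = trap(a * phi)                     # Fredholm coupling constant
    an = np.empty(n); an[-1] = 0.0        # final-time condition, Equation (42)
    for k in range(n - 2, -1, -1):        # explicit sweep backward in time
        rate = -(dD_dh[k + 1] + an[k + 1] * gp(h[k + 1]) + psip(h[k + 1]) * K) / c
        an[k] = an[k + 1] - dt * rate
    if np.max(np.abs(an - a)) < 1.0e-10:
        a = an
        break
    a = an
```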
Using the results obtained in Equations (43) and (9) in Equation (8) yields the following expression for the G-variation $\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{f}\right]$, which is seen to be linear in the variations $\delta f_j$, $j=1,\dots,T_F$, in the model’s feature functions (induced by variations in the model’s primary parameters) and in the variations $\delta e_i$, $i=1,\dots,T_H$, in the encoder’s initial conditions:
$$\delta R\left[\mathbf{h}(t);\mathbf{f};\mathbf{a}^{(1)}(t);\delta\mathbf{f}\right]=\sum_{j=1}^{T_F}\int_{t_0}^{t_f}dt\,\frac{\partial D\left[\mathbf{h}(t);\mathbf{f};t\right]}{\partial f_j}\,\delta f_j+\sum_{i=1}^{T_H}a_i^{(1)}(t_0)\,c_{i,1}\left[\mathbf{h}(t_0);\mathbf{f}\right]\delta e_i+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{j=1}^{T_F}q_{i,j}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_j\equiv\sum_{j=1}^{T_F}\frac{\partial R}{\partial f_j}\,\delta f_j+\sum_{i=1}^{T_H}\frac{\partial R}{\partial e_i}\,\delta e_i.\tag{44}$$
The expression in Equation (44) is to be satisfied at the nominal/optimal values for the respective model parameters, but this fact has not been indicated explicitly in order to simplify the notation.
Identifying in Equation (44) the expressions that multiply the variations $\delta e_i$ yields the following expressions for the decoder response sensitivities with respect to the encoder’s initial conditions:
$$\frac{\partial R}{\partial e_i}=a_i^{(1)}(t_0)\,c_{i,1}\left[\mathbf{h}(t_0);\mathbf{f}\right]=\int_{t_0}^{t_f}a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\delta(t-t_0)\,dt;\quad i=1,\dots,T_H.\tag{45}$$
It is apparent from Equation (45) that the sensitivities $\partial R/\partial e_i$ are functionals of the form predicted in Equation (27). It is also apparent from Equation (45) that the sensitivities $\partial R/\partial e_i$ are proportional to the values of the respective components $a_i^{(1)}(t_0)$ of the 1st-level adjoint function evaluated at the initial time $t=t_0$. This relation provides an independent mechanism for verifying the correctness of solving the 1st-LASS from $t=t_f$ to $t=t_0$ (backwards in time), since the sensitivities $\partial R/\partial e_i$ can be computed independently of the 1st-LASS by using finite differences of appropriately high order in conjunction with known variations $\delta e_i$ and the correspondingly induced variations in the decoder response. Special attention needs to be devoted, however, to ensuring that the respective finite-difference formula is accurate, which may require several trials with different values chosen for the variation $\delta e_i$.
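The verification procedure described above can be sketched as follows; the response is treated as a black box (the function `solve_response` is a hypothetical placeholder standing in for “solve the NIDE-F with the perturbed initial condition, then evaluate the decoder response”), and a fourth-order central-difference formula is applied over several trial step sizes:

```python
import numpy as np

# Hedged sketch of the independent finite-difference check described above.
# solve_response is a hypothetical stand-in for "solve the 1st-NIDE-F with
# initial condition e, then evaluate the decoder response R."
def solve_response(e: float) -> float:
    return np.exp(-1.0) * e**2      # illustrative closed form only

def central_difference(fun, e0: float, de: float) -> float:
    # Fourth-order central-difference formula; as cautioned in the text, the
    # step de should be varied over several trials to locate a stable plateau.
    return (-fun(e0 + 2.0*de) + 8.0*fun(e0 + de)
            - 8.0*fun(e0 - de) + fun(e0 - 2.0*de)) / (12.0 * de)

e0 = 0.5
for de in (1e-1, 1e-2, 1e-3, 1e-4):
    print(de, central_difference(solve_response, e0, de))
# The plateau value is then compared against a_i(t0) * c_{i,1}[h(t0); f]
# obtained from the 1st-LASS solution, per Equation (45).
```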
It also follows from Equations (44) and (32) that the sensitivities $\partial R/\partial f_j$ of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ with respect to the components $f_j(\boldsymbol{\theta})$ of the feature function $\mathbf{f}(\boldsymbol{\theta})$ have the following expressions, written in the form of Equation (27):
$$\frac{\partial R}{\partial f_j}\equiv R^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]\equiv\int_{t_0}^{t_f}S_1^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]dt;\quad j=1,\dots,T_F;\tag{46}$$
where
$$S_1^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]=\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_j}-\sum_{i=1}^{T_H}a_i^{(1)}(t)\,\frac{dh_i(t)}{dt}\,\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_j}+\sum_{i=1}^{T_H}a_i^{(1)}(t)\,\frac{\partial g_i\left[\mathbf{h};\mathbf{f}\right]}{\partial f_j}+\sum_{i=1}^{T_H}a_i^{(1)}(t)\sum_{k=1}^{T_L}\frac{\partial\varphi_{i,k}\left[\mathbf{f};t\right]}{\partial f_j}\int_{t_0}^{t_f}\psi_k\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau+\sum_{i=1}^{T_H}a_i^{(1)}(t)\sum_{k=1}^{T_L}\varphi_{i,k}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\,\frac{\partial\psi_k\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial f_j};\quad j=1,\dots,T_F.\tag{47}$$
The subscript “1” attached to the quantity $S_1^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ indicates that this quantity refers to a “first-order” NIDE-F net, while the superscript “(1)” indicates that this quantity refers to “first-order” sensitivities.
The sensitivities with respect to the primary model parameters can be obtained by using the result shown in Equation (46) together with the “chain rule” for differentiating compound functions, as follows:
$$\frac{\partial R}{\partial\theta_j}=\sum_{i=1}^{T_F}\frac{\partial R}{\partial f_i}\,\frac{\partial f_i(\boldsymbol{\theta})}{\partial\theta_j},\quad j=1,\dots,T_W.\tag{48}$$
When there are only model parameters (i.e., when there are no feature functions of model parameters), then $f_i(\boldsymbol{\theta})\equiv\theta_i$ for all $i=1,\dots,T_F\equiv T_W$, and the expression obtained in Equation (46) yields directly the first-order sensitivities $\partial R/\partial\theta_j$, for all $j=1,\dots,T_W$. In this case, all of the sensitivities $\partial R/\partial\theta_j$, $j=1,\dots,T_W$, would be obtained by computing integrals (using quadrature formulas). In contradistinction, when features of parameters can be established, only $T_F$ ($T_F<T_W$) integrals would need to be computed (using quadrature formulas) to obtain the sensitivities $\partial R/\partial f_j$, $j=1,\dots,T_F$; the sensitivities with respect to the model parameters would subsequently be obtained analytically using the chain rule provided in Equation (48).
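The following Python sketch illustrates the two-step procedure implied by Equations (46) and (48), using hypothetical feature functions and placeholder numerical values for the feature sensitivities:

```python
import numpy as np

# Hedged sketch of the chain rule in Equation (48): once the T_F feature
# sensitivities dR/df have been computed from Equation (46) by quadrature,
# the T_W parameter sensitivities follow analytically, with no further
# large-scale computations. All numbers below are hypothetical placeholders.
dR_df = np.array([0.7, -1.3])          # dR/df_i from Equation (46); T_F = 2

def features(theta: np.ndarray) -> np.ndarray:
    # Hypothetical feature functions f(theta), with T_F = 2 and T_W = 4.
    return np.array([theta[0] * theta[1], theta[2] + theta[3] ** 2])

def features_jacobian(theta: np.ndarray) -> np.ndarray:
    # Analytic Jacobian df_i/dtheta_j of the hypothetical features above.
    return np.array([[theta[1], theta[0], 0.0, 0.0],
                     [0.0, 0.0, 1.0, 2.0 * theta[3]]])

theta0 = np.array([1.0, 2.0, 0.5, -1.0])        # optimized/trained weights
print(features(theta0))                          # nominal feature values
dR_dtheta = dR_df @ features_jacobian(theta0)    # Equation (48): T_W values
print(dR_dtheta)
```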
Occasionally, the boundary conditions may be provided through a measurement at the boundary $t=t_f$ (“decoder”), as follows:
$$h_i(t_f)=d_i;\quad i=1,\dots,T_H,\tag{49}$$
where the scalar values $d_i$ are known, albeit imprecisely, since they are considered to stem from experiments and/or computations. In such a case, the determination of the first-order sensitivities $\partial R/\partial f_j$ of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ with respect to the components $f_j(\boldsymbol{\theta})$ of the feature function $\mathbf{f}(\boldsymbol{\theta})$ follows the same steps as above, yielding the following results:
(i)
The 1st-LASS becomes an “initial-value problem” comprising Equation (41), subject not to the conditions shown in Equation (42) but to the following “initial conditions”:
$$a_i^{(1)}(t_0)=0;\quad i=1,\dots,T_H.\tag{50}$$
(ii)
The sensitivities $\partial R/\partial f_j$ of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ with respect to the components $f_j(\boldsymbol{\theta})$ of the feature function $\mathbf{f}(\boldsymbol{\theta})$ will have the same formal expressions as in Equation (46), but the components of the 1st-level adjoint function $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}$ will be the solution of Equations (41) and (50).
(iii)
The sensitivities of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ with respect to the boundary conditions at $t=t_f$ will have the following expressions:
$$\frac{\partial R}{\partial d_i}=-a_i^{(1)}(t_f)\,c_{i,1}\left[\mathbf{h}(t_f);\mathbf{f}\right];\quad i=1,\dots,T_H.\tag{51}$$

2.2. Second-Order Neural Integro-Differential Equations of Fredholm-Type (2nd-NIDE-F)

The representation of the second-order ($n=2$) neural integro-differential equations of Fredholm-type (2nd-NIDE-F) is provided below, for $i=1,\dots,T_H$:
$$c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\frac{dh_i(t)}{dt}+c_{i,2}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\frac{d^{2}h_i(t)}{dt^{2}}=g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau;\quad i=1,\dots,T_H.\tag{52}$$
There are several combinations of boundary conditions that can be provided, either for the functions $h_i(t)$ and/or for their first derivatives $dh_i(t)/dt$, $i=1,\dots,T_H$, at either $t=t_0$ (encoder) or $t=t_f$ (decoder), or a combination thereof. For illustrative purposes, consider that the boundary conditions are as follows:
$$h_i(t_0)=e_i;\quad h_i(t_f)=d_i;\quad i=1,\dots,T_H.\tag{53}$$
The 1st-LVSS is obtained by taking the G-variations of Equations (52) and (53), to obtain the following system, comprising the forms taken on for $n=2$ by Equations (13) and (14), respectively:
$$\begin{aligned}&\left\{c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\right\}_{\boldsymbol{\theta}^{0}}\frac{dv_i^{(1)}(t)}{dt}+\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\left\{\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}+\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\right\}_{\boldsymbol{\theta}^{0}}\frac{d^{2}v_i^{(1)}(t)}{dt^{2}}+\frac{d^{2}h_i(t)}{dt^{2}}\sum_{k=1}^{T_H}\left\{\frac{\partial c_{i,2}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}\\&\quad-\left\{\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)\right\}_{\boldsymbol{\theta}^{0}}-\sum_{k=1}^{T_H}\left\{\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}=\sum_{k=1}^{T_F}\left\{q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f}\right]\right\}_{\boldsymbol{\theta}^{0}}\delta f_k,\quad i=1,\dots,T_H;\tag{54}\end{aligned}$$
$$v_i^{(1)}(t_0)=\delta e_i;\quad v_i^{(1)}(t_f)=\delta d_i;\quad i=1,\dots,T_H;\tag{55}$$
where, for $i=1,\dots,T_H$ and $k=1,\dots,T_F$:
$$q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f}\right]\equiv-\frac{dh_i(t)}{dt}\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_k}-\frac{d^{2}h_i(t)}{dt^{2}}\frac{\partial c_{i,2}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_k}+\frac{\partial g_i\left[\mathbf{h};\mathbf{f}\right]}{\partial f_k}+\sum_{j=1}^{T_L}\frac{\partial\varphi_{i,j}\left[\mathbf{f};t\right]}{\partial f_k}\int_{t_0}^{t_f}\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau+\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\,\frac{\partial\psi_j\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial f_k}.\tag{56}$$
The 1st-LASS is constructed by using Equation (19) to form the inner product of Equation (54) with a vector $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}\in H_1(\Omega_t)$, to obtain the following relationship:
$$\begin{aligned}&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\left\{c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{dv_i^{(1)}(t)}{dt}+\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)+c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{d^{2}v_i^{(1)}(t)}{dt^{2}}+\frac{d^{2}h_i(t)}{dt^{2}}\sum_{k=1}^{T_H}\frac{\partial c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)\right.\\&\left.\quad-\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j\left[\mathbf{h}(\tau);\mathbf{f};\tau\right]}{\partial h_k(\tau)}v_k^{(1)}(\tau)-\sum_{k=1}^{T_H}\frac{\partial g_i\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_k(t)}v_k^{(1)}(t)\right\}=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k.\end{aligned}\tag{57}$$
Examining the structure of the left-side of Equation (57) reveals that the bilinear concomitant will arise from the integration by parts of the first and third terms on the left-side of Equation (57), as follows:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h};\mathbf{f}\right]\frac{dv_i^{(1)}(t)}{dt}\,dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,c_{i,2}\left[\mathbf{h};\mathbf{f}\right]\frac{d^{2}v_i^{(1)}(t)}{dt^{2}}\,dt=P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,\frac{d^{2}}{dt^{2}}\left\{a_i^{(1)}(t)\,c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}dt,\tag{58}$$
where the bilinear concomitant $P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ has the following expression:
$$\begin{aligned}P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\equiv{}&\sum_{i=1}^{T_H}\left\{a_i^{(1)}(t_f)\,c_{i,1}\left[\mathbf{h}(t_f);\mathbf{f}\right]v_i^{(1)}(t_f)-a_i^{(1)}(t_0)\,c_{i,1}\left[\mathbf{h}(t_0);\mathbf{f}\right]v_i^{(1)}(t_0)\right\}\\&+\sum_{i=1}^{T_H}\left\{a_i^{(1)}(t_f)\,c_{i,2}\left[\mathbf{h}(t_f);\mathbf{f}\right]\frac{dv_i^{(1)}(t_f)}{dt}-a_i^{(1)}(t_0)\,c_{i,2}\left[\mathbf{h}(t_0);\mathbf{f}\right]\frac{dv_i^{(1)}(t_0)}{dt}\right\}\\&-\sum_{i=1}^{T_H}v_i^{(1)}(t_f)\left\{a_i^{(1)}(t)\,\frac{dc_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]}{dt}+c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_f}\\&+\sum_{i=1}^{T_H}v_i^{(1)}(t_0)\left\{a_i^{(1)}(t)\,\frac{dc_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]}{dt}+c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_0}.\end{aligned}\tag{59}$$
The remaining terms on the left-side of Equation (57) will be recast into their corresponding “adjoint forms” by using the results obtained in Equations (34)–(38). Using these results, together with the results obtained in Equations (58) and (59), yields the following expression for the left-side of Equation (57):
$$\begin{aligned}&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\left\{c_{i,1}\frac{dv_i^{(1)}(t)}{dt}+\frac{dh_i(t)}{dt}\sum_{k=1}^{T_H}\frac{\partial c_{i,1}}{\partial h_k(t)}v_k^{(1)}(t)+c_{i,2}\frac{d^{2}v_i^{(1)}(t)}{dt^{2}}+\frac{d^{2}h_i(t)}{dt^{2}}\sum_{k=1}^{T_H}\frac{\partial c_{i,2}}{\partial h_k(t)}v_k^{(1)}(t)-\sum_{j=1}^{T_L}\varphi_{i,j}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\sum_{k=1}^{T_H}\frac{\partial\psi_j}{\partial h_k(\tau)}v_k^{(1)}(\tau)-\sum_{k=1}^{T_H}\frac{\partial g_i}{\partial h_k(t)}v_k^{(1)}(t)\right\}\\&=P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,\frac{d^{2}}{dt^{2}}\left\{a_i^{(1)}(t)\,c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}dt\\&\quad+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}\,dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{d^{2}h_k(t)}{dt^{2}}\,\frac{\partial c_{k,2}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}\,dt\\&\quad-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}\,dt-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}d\tau\,v_i^{(1)}(\tau)\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial h_i(\tau)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});t\right]dt.\end{aligned}\tag{60}$$
Using Equation (58) and rearranging the terms on the right-side of Equation (60) yields the following relation:
$$\begin{aligned}&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k-P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\\&=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\,dt\left\{-\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}+\frac{d^{2}}{dt^{2}}\left\{a_i^{(1)}(t)\,c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}}{\partial h_i(t)}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{d^{2}h_k(t)}{dt^{2}}\,\frac{\partial c_{k,2}}{\partial h_i(t)}\right.\\&\left.\quad-\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}-\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(\tau)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau\right\}.\end{aligned}\tag{61}$$
The term on the right-side of Equation (61) is now required to represent the “indirect-effect” term defined in Equation (10), which is achieved by requiring the components of the function $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}$ to satisfy the following 1st-LASS:
$$-\frac{d}{dt}\left\{a_i^{(1)}(t)\,c_{i,1}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}+\frac{d^{2}}{dt^{2}}\left\{a_i^{(1)}(t)\,c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\right\}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{dh_k(t)}{dt}\,\frac{\partial c_{k,1}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}+\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{d^{2}h_k(t)}{dt^{2}}\,\frac{\partial c_{k,2}\left[\mathbf{h}(t);\mathbf{f}\right]}{\partial h_i(t)}-\sum_{k=1}^{T_H}a_k^{(1)}(t)\,\frac{\partial g_k\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial h_i(t)}-\sum_{j=1}^{T_L}\frac{\partial\psi_j\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)}\int_{t_0}^{t_f}\sum_{k=1}^{T_H}a_k^{(1)}(\tau)\,\varphi_{k,j}\left[\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau=\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial h_i(t)};\quad i=1,\dots,T_H.\tag{62}$$
The relation obtained in Equation (62) is the explicit form of the relation provided in Equation (23) for the particular case when $n=2$, i.e., when considering second-order neural integro-differential equations of Fredholm-type (2nd-NIDE-F).
The unknown values involving the function $\mathbf{v}^{(1)}(t)$ in the bilinear concomitant $P\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ defined in Equation (59) are eliminated by imposing the following conditions:
$$a_i^{(1)}(t_0)=0;\quad a_i^{(1)}(t_f)=0;\quad i=1,\dots,T_H.\tag{63}$$
It follows from Equations (57)–(63) and (55) that the indirect-effect term defined in Equation (10) has the following expression in terms of the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)$:
$$\left\{\delta R\left[\mathbf{h};\mathbf{f};\mathbf{a}^{(1)}\right]\right\}_{ind}=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{k=1}^{T_F}q_{i,k}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_k-\hat{P}\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right],\tag{64}$$
where the boundary quantity $\hat{P}\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ contains the known terms remaining after having implemented the known boundary conditions given in Equations (55) and (63), and has the following explicit expression:
$$\hat{P}\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]\equiv-\sum_{i=1}^{T_H}\delta d_i\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_f}+\sum_{i=1}^{T_H}\delta e_i\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_0}.\tag{65}$$
Using the results obtained in Equations (64), (65), (56), and (9) in Equation (8) yields the following expression for the G-variation $\delta R\left[\mathbf{h}^{0};\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{f}\right]$, which is seen to be linear in the variations $\delta d_i$, $\delta e_i$ ($i=1,\dots,T_H$) and $\delta f_j$ ($j=1,\dots,T_F$):
$$\begin{aligned}\delta R\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});\mathbf{a}^{(1)}(t);\delta\mathbf{f}\right]={}&\sum_{j=1}^{T_F}\int_{t_0}^{t_f}dt\,\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_j}\,\delta f_j+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,dt\sum_{j=1}^{T_F}q_{i,j}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\delta f_j\\&+\sum_{i=1}^{T_H}\delta d_i\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_f}-\sum_{i=1}^{T_H}\delta e_i\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_0}\\ \equiv{}&\sum_{j=1}^{T_F}\frac{\partial R}{\partial f_j}\,\delta f_j+\sum_{i=1}^{T_H}\frac{\partial R}{\partial d_i}\,\delta d_i+\sum_{i=1}^{T_H}\frac{\partial R}{\partial e_i}\,\delta e_i.\end{aligned}\tag{66}$$
The expression in Equation (66) is to be satisfied at the nominal/optimal values for the respective model parameters, but this fact has not been indicated explicitly in order to simplify the notation.
It also follows from Equations (66) and (56) that the sensitivities $\partial R/\partial f_j$ of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ with respect to the components $f_j(\boldsymbol{\theta})$ of the feature function $\mathbf{f}(\boldsymbol{\theta})$ have the following expressions, written in the form of Equation (27):
$$\frac{\partial R}{\partial f_j}\equiv R^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]\equiv\int_{t_0}^{t_f}S_2^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]dt;\quad j=1,\dots,T_F;\tag{67}$$
where
$$S_2^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]\equiv\frac{\partial D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_j}-\sum_{i=1}^{T_H}a_i^{(1)}(t)\,\frac{dh_i(t)}{dt}\,\frac{\partial c_{i,1}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_j}-\sum_{i=1}^{T_H}a_i^{(1)}(t)\,\frac{d^{2}h_i(t)}{dt^{2}}\,\frac{\partial c_{i,2}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_j}+\sum_{i=1}^{T_H}a_i^{(1)}(t)\,\frac{\partial g_i\left[\mathbf{h};\mathbf{f}\right]}{\partial f_j}+\sum_{i=1}^{T_H}a_i^{(1)}(t)\sum_{k=1}^{T_L}\frac{\partial\varphi_{i,k}\left[\mathbf{f};t\right]}{\partial f_j}\int_{t_0}^{t_f}\psi_k\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]d\tau+\sum_{i=1}^{T_H}a_i^{(1)}(t)\sum_{k=1}^{T_L}\varphi_{i,k}\left[\mathbf{f};t\right]\int_{t_0}^{t_f}d\tau\,\frac{\partial\psi_k\left[\mathbf{h}(\tau);\mathbf{f}(\boldsymbol{\theta});\tau\right]}{\partial f_j};\quad j=1,\dots,T_F.\tag{68}$$
The subscript “2” attached to the quantity $S_2^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ indicates that this quantity refers to a “second-order” NIDE-F net, while the superscript “(1)” indicates that this quantity refers to “first-order” sensitivities. As expected, the expression of $S_2^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ reduces to the expression of $S_1^{(1)}\left[j;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$ when the “second-order NIDE-F net” reduces to the “first-order NIDE-F net,” i.e., in the case when $c_{i,2}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta})\right]\equiv0$.
Identifying in Equation (66) the expressions that multiply the variations $\delta e_i$ yields the following expressions for the decoder response sensitivities with respect to the encoder’s initial-time conditions:
$$\frac{\partial R}{\partial e_i}=-\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_0}=-\int_{t_0}^{t_f}c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\,\delta(t-t_0)\,dt;\quad i=1,\dots,T_H.\tag{69}$$
Identifying in Equation (66) the expressions that multiply the variations $\delta d_i$ yields the following expressions for the decoder response sensitivities with respect to the final-time conditions:
$$\frac{\partial R}{\partial d_i}=\left\{c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\right\}_{t=t_f}=\int_{t_0}^{t_f}c_{i,2}\left[\mathbf{h}(t);\mathbf{f}\right]\frac{da_i^{(1)}(t)}{dt}\,\delta(t-t_f)\,dt;\quad i=1,\dots,T_H.\tag{70}$$
If the boundary conditions imposed on the forward functions $h_i(t)$ and/or their first derivatives $dh_i(t)/dt$, $i=1,\dots,T_H$, differ from the illustrative ones selected in Equation (53), then the corresponding boundary conditions for the 1st-level adjoint function $\mathbf{a}^{(1)}(t)\equiv\left[a_1^{(1)}(t),\dots,a_{T_H}^{(1)}(t)\right]^{\dagger}$ would also differ from the ones shown in Equation (63), as would be expected. The components of $\mathbf{a}^{(1)}(t)$ would consequently have different values; therefore, all of the first-order sensitivities $\partial R/\partial f_j$ would have values different from those computed using Equation (68), even though the formal mathematical expressions of the respective sensitivities would remain unchanged. Of course, the sensitivities $\partial R/\partial e_i$ and $\partial R/\partial d_i$ would have expressions that would differ from those in Equations (69) and (70), respectively, if the boundary conditions in Equation (53), and consequently those in Equation (63), were different, since the residual bilinear concomitant $\hat{P}\left[\mathbf{h};\mathbf{f};\mathbf{v}^{(1)};\mathbf{a}^{(1)}\right]$ would have a different expression from that shown in Equation (65).
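Since the conditions in Equation (63) constrain the 1st-level adjoint function at both $t_0$ and $t_f$, the 1st-LASS for a 2nd-NIDE-F is a two-point boundary-value problem rather than a terminal-value problem. The following Python sketch illustrates this structure on a simplified linear equation that, as a deliberate hypothetical simplification, omits the Fredholm coupling and the $c_{i,1}$ term; the coefficient $p(t)$ and source $s(t)$ are illustrative stand-ins:

```python
import numpy as np
from scipy.integrate import solve_bvp

# Hedged sketch of the two-point structure of Equations (62)-(63): the adjoint
# function must vanish at BOTH t0 and tf. Simplified toy problem (hypothetical):
#   d^2 a / dt^2 - p(t) * a = s(t),   a(t0) = a(tf) = 0.
t0, tf = 0.0, 1.0
p = lambda t: 1.0 + t            # stand-in for the g/psi coupling coefficients
s = lambda t: np.sin(np.pi * t)  # stand-in for the decoder source dD/dh

def rhs(t, y):
    # y[0] = a, y[1] = da/dt
    return np.vstack([y[1], p(t) * y[0] + s(t)])

def bc(ya, yb):
    # Equation (63): a(t0) = 0 and a(tf) = 0.
    return np.array([ya[0], yb[0]])

t_mesh = np.linspace(t0, tf, 101)
sol = solve_bvp(rhs, bc, t_mesh, np.zeros((2, t_mesh.size)))
a, da_dt = sol.sol(t_mesh)       # a^(1)(t) and its time-derivative on the mesh
# The values of da_dt at t0 and tf feed the boundary-condition sensitivities
# of Equations (69)-(70).
```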

3. Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm Type (2nd-FASAM-NIDE-F)

The second-order sensitivities of the response $R\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta})\right]$ defined in Equation (5) will be computed by conceptually using their basic definitions as being the “first-order sensitivities of the first-order sensitivities.” Recall that the generic expression of the first-order sensitivities, $R^{(1)}\left[j_1;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$, $j_1=1,\dots,T_F$, of the response with respect to the components of the feature function $\mathbf{f}(\boldsymbol{\theta})$ is provided in Equation (46). It follows that the second-order sensitivities of the response with respect to the components of the feature function will be provided by the first-order G-differential $\delta R^{(1)}$ of $R^{(1)}\left[j_1;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]$, which is by definition obtained as follows:
$$\begin{aligned}&\delta R^{(1)}\left[j_1;\mathbf{h}^{0}(t);\mathbf{a}^{(1),0}(t);\mathbf{f}^{0};\mathbf{v}^{(1)};\delta\mathbf{a}^{(1)};\delta\mathbf{f}\right]\equiv\left\{\frac{d}{d\varepsilon}R^{(1)}\left[j_1;\mathbf{h}^{0}+\varepsilon\,\mathbf{v}^{(1)};\mathbf{a}^{(1),0}+\varepsilon\,\delta\mathbf{a}^{(1)};\mathbf{f}^{0}+\varepsilon\,\delta\mathbf{f}\right]\right\}_{\varepsilon=0}\\&=\sum_{j_2=1}^{T_F}\left\{\frac{\partial R^{(1)}\left[j_1;\mathbf{h};\mathbf{a}^{(1)};\mathbf{f}\right]}{\partial f_{j_2}}\right\}_{\boldsymbol{\theta}^{0}}\delta f_{j_2}+\left\{\delta R^{(1)}\left[j_1;\mathbf{h};\mathbf{a}^{(1)};\mathbf{f};\mathbf{v}^{(1)};\delta\mathbf{a}^{(1)}\right]\right\}_{ind},\end{aligned}\tag{71}$$
where the indirect-effect term $\left\{\delta R^{(1)}\left[j_1;\mathbf{h};\mathbf{a}^{(1)};\mathbf{f};\mathbf{v}^{(1)};\delta\mathbf{a}^{(1)}\right]\right\}_{ind}$ comprises all dependencies on the vectors $\mathbf{v}^{(1)}(t)$ and $\delta\mathbf{a}^{(1)}(t)$ of variations in the state functions $\mathbf{h}(t)$ and $\mathbf{a}^{(1)}(t)$ around the respective nominal values, denoted as $\mathbf{h}^{0}(t)$ and $\mathbf{a}^{(1),0}(t)$, which are computed at the nominal parameter values $\boldsymbol{\theta}^{0}$. This indirect-effect term is defined as follows:
$$\left\{\delta R^{(1)}\left[j_1;\mathbf{h};\mathbf{a}^{(1)};\mathbf{f};\mathbf{v}^{(1)};\delta\mathbf{a}^{(1)}\right]\right\}_{ind}\equiv\int_{t_0}^{t_f}dt\left\{\frac{\partial S^{(1)}\left[j_1;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial\mathbf{h}}\,\mathbf{v}^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}}+\int_{t_0}^{t_f}dt\left\{\frac{\partial S^{(1)}\left[j_1;\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta})\right]}{\partial\mathbf{a}^{(1)}}\,\delta\mathbf{a}^{(1)}(t)\right\}_{\boldsymbol{\theta}^{0}};\quad j_1=1,\dots,T_F.\tag{72}$$
The variational function $\delta\mathbf{a}^{(1)}(t)$ is the solution of the system of equations obtained by G-differentiating the 1st-LASS defined in Equations (23) and (25), which is by definition obtained as follows:
$$\left\{\frac{d}{d\varepsilon}\sum_{i=1}^{T_H}A_{ij}^{(1)}\left[\mathbf{h}^{0}+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}^{0}+\varepsilon\,\delta\mathbf{f};t\right]\left[a_i^{(1),0}+\varepsilon\,\delta a_i^{(1)}\right]\right\}_{\varepsilon=0}=\left\{\frac{d}{d\varepsilon}\frac{\partial D\left[\mathbf{h}^{0}+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}^{0}+\varepsilon\,\delta\mathbf{f};t\right]}{\partial h_j(t)}\right\}_{\varepsilon=0};\quad j=1,\dots,T_H;\tag{73}$$
$$\left\{\frac{d}{d\varepsilon}B_j\left[\mathbf{h}^{0}+\varepsilon\,\mathbf{v}^{(1)}(t);\mathbf{f}^{0}+\varepsilon\,\delta\mathbf{f};\mathbf{a}^{(1),0}+\varepsilon\,\delta\mathbf{a}^{(1)};t\right]\right\}_{\varepsilon=0}=0;\quad j=1,\dots,BC.\tag{74}$$
Carrying out the operations indicated in Equations (73) and (74) yields the following relations:
$$\sum_{k=1}^{T_H}\left\{\frac{\partial}{\partial h_k(t)}\left\{\sum_{i=1}^{T_H}A_{ij}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]a_i^{(1)}(j_1;t)\right\}-\frac{\partial^{2}D\left[\mathbf{h}(t);\mathbf{f};t\right]}{\partial h_k(t)\,\partial h_j(t)}\right\}_{\boldsymbol{\theta}^{0}}v_k^{(1)}(t)+\sum_{i=1}^{T_H}\left\{A_{ij}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\right\}_{\boldsymbol{\theta}^{0}}\delta a_i^{(1)}(j_1;t)=\sum_{j_2=1}^{T_F}\left\{q_{j,j_2}^{(2)}\left[j_1;j_2;\mathbf{h};\mathbf{f};t\right]\right\}_{\boldsymbol{\theta}^{0}}\delta f_{j_2};\quad j=1,\dots,T_H;\ j_1=1,\dots,T_F;\tag{75}$$
where
$$q_{j,j_2}^{(2)}\left[j_1;j_2;\mathbf{h};\mathbf{f};t\right]\equiv\left\{\frac{\partial^{2}D\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]}{\partial f_{j_2}\,\partial h_j(t)}-\frac{\partial}{\partial f_{j_2}}\left\{\sum_{i=1}^{T_H}A_{ij}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]a_i^{(1)}(j_1;t)\right\}\right\}_{\boldsymbol{\theta}^{0}};\quad j=1,\dots,T_H;\ j_1,j_2=1,\dots,T_F;\tag{76}$$
$$\left\{\frac{\partial B_j\left[\mathbf{h};\mathbf{f};\mathbf{a}^{(1)}\right]}{\partial\mathbf{h}(t)}\right\}_{\boldsymbol{\theta}^{0}}\mathbf{v}^{(1)}(t)+\left\{\frac{\partial B_j\left[\mathbf{h};\mathbf{f};\mathbf{a}^{(1)}\right]}{\partial\mathbf{a}^{(1)}(t)}\right\}_{\boldsymbol{\theta}^{0}}\delta\mathbf{a}^{(1)}(t)+\left\{\frac{\partial B_j\left[\mathbf{h};\mathbf{f};\mathbf{a}^{(1)}\right]}{\partial\mathbf{f}(\boldsymbol{\theta})}\right\}_{\boldsymbol{\theta}^{0}}\delta\mathbf{f}(\boldsymbol{\theta})=0;\quad\text{at}\ t=t_f\ \text{and/or}\ t=t_0;\quad j=1,\dots,BC.\tag{77}$$
For subsequent derivations, it is convenient to represent the relations in Equation (75) in matrix-vector form, as follows:
$$\mathbf{V}_{21}^{(2)}\left[\mathbf{h};\mathbf{a}^{(1)};\mathbf{f}\right]\mathbf{v}^{(1)}(t)+\mathbf{V}_{22}^{(2)}\left[\mathbf{h};\mathbf{f}\right]\delta\mathbf{a}^{(1)}(t)=\mathbf{Q}^{(2)}\left[\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta});t\right]\delta\mathbf{f}(\boldsymbol{\theta});\tag{78}$$
where
$$\mathbf{V}_{21}^{(2)}\left[\mathbf{h};\mathbf{a}^{(1)};\mathbf{f}\right]\equiv\left\{\frac{\partial\left\{\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\mathbf{a}^{(1)}(t)\right\}}{\partial\mathbf{h}(t)}-\frac{\partial^{2}D\left[\mathbf{h};\mathbf{f};t\right]}{\partial\mathbf{h}(t)\,\partial\mathbf{h}(t)}\right\}_{\boldsymbol{\theta}^{0}};\quad\mathbf{V}_{22}^{(2)}\left[\mathbf{h};\mathbf{f}\right]\equiv\left\{\mathbf{A}^{(1)}\left[\mathbf{h};\mathbf{f};t\right]\right\}_{\boldsymbol{\theta}^{0}};\quad\mathbf{Q}^{(2)}\left[j_1;\mathbf{h};\mathbf{a}^{(1)};\mathbf{f}(\boldsymbol{\theta})\right]\equiv\left[q_{j,j_2}^{(2)}\left(j_1;j_2;\mathbf{h};\mathbf{f};t\right)\right]_{T_H\times T_F}.\tag{79}$$
As indicated by Equation (78), the variational functions $\mathbf{v}^{(1)}(t)$ and $\delta\mathbf{a}^{(1)}(t)$ are the solutions of the system of matrix equations obtained by concatenating the 1st-LVSS, defined by Equations (14) and (16), with Equations (77) and (78). The concatenated system thus obtained will be called the 2nd-Level Variational Sensitivity System (2nd-LVSS) and has the block-matrix form provided below:
$$\left\{\mathbf{VM}^{(2)}\left[2\times2;\mathbf{U}^{(2)}(2;t);\mathbf{f}\right]\mathbf{V}^{(2)}(2;t)\right\}_{\boldsymbol{\theta}^{0}}=\left\{\mathbf{QV}^{(2)}\left[2;\mathbf{U}^{(2)}(2;t);\mathbf{f};\delta\mathbf{f}\right]\right\}_{\boldsymbol{\theta}^{0}};\quad t_0<t<t_f;\tag{80}$$
$$\left\{\mathbf{BV}^{(2)}\left[2;\mathbf{U}^{(2)}(2;t);\mathbf{V}^{(2)}(2;t);\mathbf{f};\delta\mathbf{f}\right]\right\}_{\boldsymbol{\theta}^{0}}=\mathbf{0}^{(2)}[2]\equiv\left[\mathbf{0},\mathbf{0}\right]^{\dagger};\quad\text{at}\ t=t_f\ \text{and/or}\ t=t_0.\tag{81}$$
To distinguish block-matrices from block-vectors, two bold capital letters have been used (and will henceforth be used) to denote block-matrices, as in the case of the “second-level variational matrix” $\mathbf{VM}^{(2)}\left[2\times2;\mathbf{U}^{(2)}(2;t);\mathbf{f}\right]$. The “2nd-level” is indicated by the superscript “(2)”. The argument “$2\times2$,” which appears in the list of arguments of $\mathbf{VM}^{(2)}\left[2\times2;\mathbf{U}^{(2)}(2;t);\mathbf{f}\right]$, indicates that this matrix is a $2\times2$-dimensional block-matrix comprising four submatrices, each of dimensions $T_D\times T_D$. The structure of the block-matrix $\mathbf{VM}^{(2)}\left[2\times2;\mathbf{U}^{(2)}(2;t);\mathbf{f}\right]$ is provided below:
$$\mathbf{VM}^{(2)}\left[2\times2;\mathbf{U}^{(2)}(2;t);\mathbf{f}\right]\equiv\begin{pmatrix}\mathbf{L}\left[\mathbf{h};\mathbf{f}(\boldsymbol{\theta});t\right]&\mathbf{0}\\\mathbf{V}_{21}^{(2)}\left[\mathbf{h};\mathbf{a}^{(1)};\mathbf{f};t\right]&\mathbf{V}_{22}^{(2)}\left[\mathbf{h};\mathbf{a}^{(1)};\mathbf{f};t\right]\end{pmatrix}.\tag{82}$$
The argument “2” which appears in the list of arguments of the vector $\mathbf{U}^{(2)}(2;t)$ and of the “variational vector” $\mathbf{V}^{(2)}(2;t)$ in Equation (80) indicates that each of these vectors is a 2-block column vector, each block comprising a column-vector of dimension $T_D$; the vectors $\mathbf{U}^{(2)}(2;t)$ and $\mathbf{V}^{(2)}(2;t)$ are defined as follows:
$$\mathbf{U}^{(2)}(2;t)\equiv\begin{pmatrix}\mathbf{h}(t)\\\mathbf{a}^{(1)}(t)\end{pmatrix};\quad\mathbf{V}^{(2)}(2;t)\equiv\begin{pmatrix}\mathbf{v}^{(1)}(t)\\\delta\mathbf{a}^{(1)}(t)\end{pmatrix}.\tag{83}$$
The 2-block vector $\mathbf{QV}^{(2)}\left[2;\mathbf{U}^{(2)}(2;t);\mathbf{f};\delta\mathbf{f}\right]$ is defined as follows:
$$\mathbf{QV}^{(2)}\left[2;\mathbf{U}^{(2)}(2;t);\mathbf{f};\delta\mathbf{f}\right]\equiv\begin{pmatrix}\mathbf{Q}^{(1)}\left[\mathbf{h}(t);\mathbf{f}(\boldsymbol{\theta});t\right]\delta\mathbf{f}(\boldsymbol{\theta})\\\mathbf{Q}^{(2)}\left[\mathbf{h}(t);\mathbf{a}^{(1)}(t);\mathbf{f}(\boldsymbol{\theta});t\right]\delta\mathbf{f}(\boldsymbol{\theta})\end{pmatrix}.\tag{84}$$
The 2-block column vector $\mathbf{BV}^{(2)}\left[2;\mathbf{U}^{(2)}(2;t);\mathbf{V}^{(2)}(2;t);\mathbf{f};\delta\mathbf{f}\right]$ in Equation (81) represents the concatenated boundary/initial conditions provided in Equations (14) and (77), evaluated at the nominal parameter values. The argument “2” in the expression $\mathbf{0}^{(2)}[2]\equiv\left[\mathbf{0},\mathbf{0}\right]^{\dagger}$ in Equation (81) indicates that this expression is a two-block column vector comprising two vectors, each of which has $T_D$ components, all of which are zero-valued.
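The block structure of Equations (80)–(84) can be illustrated numerically; the following Python sketch (with hypothetical dense stand-ins for the operator blocks) also anticipates the block-triangular structure of the adjoint operator that will appear in Equation (88):

```python
import numpy as np

# Small numerical illustration (hypothetical dimensions) of the block
# structure in Equations (80)-(84): the 2nd-level variational matrix of
# Equation (82) is block lower-triangular, and its discrete adjoint
# (transpose) is block upper-triangular, as in Equation (88).
rng = np.random.default_rng(0)
n = 3                                # hypothetical stand-in for the block size
L   = rng.standard_normal((n, n))    # discretized stand-in for L[h; f; t]
V21 = rng.standard_normal((n, n))    # discretized stand-in for V21^(2)
V22 = rng.standard_normal((n, n))    # discretized stand-in for V22^(2)
Z   = np.zeros((n, n))

VM = np.block([[L, Z], [V21, V22]])  # Equation (82)
AM = VM.T                            # discrete formal adjoint
assert np.allclose(AM, np.block([[L.T, V21.T], [Z, V22.T]]))  # Equation (88)
```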
The need for solving the 2nd-LVSS is circumvented by deriving an alternative expression for the indirect-effect term $\left\{\delta R^{(1)}\left[j_1;\mathbf{h};\mathbf{a}^{(1)};\mathbf{f};\mathbf{v}^{(1)};\delta\mathbf{a}^{(1)}\right]\right\}_{ind}$ defined in Equation (72), in which the function $\mathbf{V}^{(2)}(2;t)$ is replaced by a 2nd-level adjoint function that is independent of variations in the model parameters and state functions. This 2nd-level adjoint function will be the solution of a 2nd-Level Adjoint Sensitivity System (2nd-LASS), which will be constructed by using the same principles as employed for deriving the 1st-LASS. The 2nd-LASS is constructed in a Hilbert space $H_2(\Omega_t)$, $\Omega_t\equiv\left\{t\,\middle|\,t\in\left[t_0,t_f\right]\right\}$, comprising block-vectors having the same structure as $\mathbf{V}^{(2)}(2;t)$, which can generically be represented as follows: $\boldsymbol{\Phi}^{(2)}(2;t)\equiv\left[\boldsymbol{\varphi}^{(2)}(1;t),\boldsymbol{\varphi}^{(2)}(2;t)\right]^{\dagger}\in H_2(\Omega_t)$, with $\boldsymbol{\varphi}^{(2)}(i;t)\equiv\left[\varphi_{i,1}^{(2)}(t),\dots,\varphi_{i,T_H}^{(2)}(t)\right]$, for $i=1,2$. The Hilbert space $H_2(\Omega_t)$ is endowed with the following inner product of two vectors $\boldsymbol{\Phi}^{(2)}(2;t)\in H_2(\Omega_t)$ and $\boldsymbol{\Psi}^{(2)}(2;t)\in H_2(\Omega_t)$:
$$\left\langle\boldsymbol{\Psi}^{(2)}(2;t),\boldsymbol{\Phi}^{(2)}(2;t)\right\rangle_2\equiv\sum_{i=1}^{2}\left\langle\boldsymbol{\psi}^{(2)}(i;t),\boldsymbol{\varphi}^{(2)}(i;t)\right\rangle_1=\sum_{i=1}^{2}\sum_{j=1}^{T_H}\int_{t_0}^{t_f}\psi_{i,j}^{(2)}(t)\,\varphi_{i,j}^{(2)}(t)\,dt.\tag{85}$$
The inner product defined in Equation (85) will be used to construct the 2nd-Level Adjoint Sensitivity System (2nd-LASS) for a 2nd-level adjoint function A ( 2 ) 2 ; t a ( 2 ) 1 ; t , a ( 2 ) 2 ; t H 2 Ω t , a i 2 t a i , 1 2 t , , a i , T H 2 t , i = 1 , 2 , by implementing the following sequence of steps, which are conceptually similar to those implemented in Section 2 for constructing the 1st-FASAM-NIDE-F methodology:
  • Using Equation (85), construct the inner product of the yet-undetermined function $\mathbf{A}^{(2)}(2;t)\triangleq\left[\mathbf{a}^{(2)}(1;t),\mathbf{a}^{(2)}(2;t)\right]\in\mathsf{H}_{2}(\Omega_{t})$ with Equation (80) to obtain the following relation:
$$\left\{\left\langle\mathbf{A}^{(2)}(2;t),\,\mathbf{VM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\mathbf{V}^{(2)}(2;t)\right\rangle_{2}\right\}_{\theta^{0}}=\left\{\left\langle\mathbf{A}^{(2)}(2;t),\,\mathbf{Q}_{V}^{(2)}\left(2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f};\,\delta\mathbf{f}\right)\right\rangle_{2}\right\}_{\theta^{0}}.\tag{86}$$
  • Use the definition of the operator adjoint to $\mathbf{VM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ in the Hilbert space $\mathsf{H}_{2}(\Omega_{t})$ to transform the inner product on the left side of Equation (86) as follows:
$$\left\{\left\langle\mathbf{A}^{(2)}(2;t),\,\mathbf{VM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\mathbf{V}^{(2)}(2;t)\right\rangle_{2}\right\}_{\theta^{0}}=\left\{P^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f}\right]\right\}_{\theta^{0}}+\left\{\left\langle\mathbf{V}^{(2)}(2;t),\,\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\mathbf{A}^{(2)}(2;t)\right\rangle_{2}\right\}_{\theta^{0}},\tag{87}$$
    where the quantity $\left\{P^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f}\right]\right\}_{\theta^{0}}$ denotes the corresponding bilinear concomitant on the domain’s boundary, evaluated at the nominal values of the parameters and the respective state functions, and where the operator $\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\triangleq\left[\mathbf{VM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\right]^{*}$ denotes the formal adjoint of the matrix-valued operator $\mathbf{VM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$, comprising $2\times2$ block-matrices, each of dimensions $T_{D}\times T_{D}$, having the following block-matrix structure:
$$\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\triangleq\begin{bmatrix}\mathbf{L} & \mathbf{0}\\[2pt]\mathbf{V}_{21}^{(2)} & \mathbf{V}_{22}^{(2)}\end{bmatrix}^{*}=\begin{bmatrix}\mathbf{L}^{*} & \left[\mathbf{V}_{21}^{(2)}\right]^{*}\\[2pt]\mathbf{0} & \left[\mathbf{V}_{22}^{(2)}\right]^{*}\end{bmatrix}.\tag{88}$$
  • Require the inner product on the right side of Equation (87) to represent the indirect-effect term $\left\{\delta R^{(1)}\left[j_{1};\,\mathbf{u};\,\mathbf{a}^{(1)};\,\mathbf{f};\,\mathbf{v}^{(1)};\,\delta\mathbf{a}^{(1)}\right]\right\}_{ind}$ defined in Equation (72) by imposing the following relation:
$$\left\{\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\mathbf{A}^{(2)}(2;j_{1};t)\right\}_{\theta^{0}}=\left\{\mathbf{Q}_{A}^{(2)}\left(2;\,j_{1};\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\right\}_{\theta^{0}},\quad j_{1}=1,\ldots,T_{F};\tag{89}$$
    where
$$\mathbf{Q}_{A}^{(2)}\left(2;\,j_{1};\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)\triangleq\begin{bmatrix}\partial S^{(1)}\left[j_{1};\,\mathbf{h}(t);\,\mathbf{a}^{(1)}(t);\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial\mathbf{u}\\[2pt]\partial S^{(1)}\left[j_{1};\,\mathbf{h}(t);\,\mathbf{a}^{(1)}(t);\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial\mathbf{a}^{(1)}\end{bmatrix},\quad j_{1}=1,\ldots,T_{F}.\tag{90}$$
Since the source term on the right side of Equation (89) is a distinct quantity for each value of the index $j_{1}$, this index has been added to the list of arguments of the function $\mathbf{A}^{(2)}(2;j_{1};t)\triangleq\left[\mathbf{a}^{(2)}(1;j_{1};t),\mathbf{a}^{(2)}(2;j_{1};t)\right]$ in order to emphasize that a distinct function $\mathbf{A}^{(2)}(2;j_{1};t)$ corresponds to each index $j_{1}$. Of course, the adjoint operator $\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ that acts on the function $\mathbf{A}^{(2)}(2;j_{1};t)$ is independent of the index $j_{1}$ and could, in principle, be inverted just once and stored for subsequent repeated applications to the $j_{1}$-dependent source terms $\mathbf{Q}_{A}^{(2)}\left(2;\,j_{1};\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ for computing the corresponding functions $\mathbf{A}^{(2)}(2;j_{1};t)$.
The definition of the function $\mathbf{A}^{(2)}(2;j_{1};t)$ is completed by requiring it to satisfy adjoint boundary/initial conditions, represented in operator form as follows:
$$\left\{\mathbf{B}_{A}^{(2)}\left(2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{A}^{(2)}(2;j_{1};t);\,\mathbf{f}\right)\right\}_{\theta^{0}}=\mathbf{0}^{(2)},\quad j_{1}=1,\ldots,T_{F};\quad\text{at } t=t_{f}\text{ and/or } t=t_{0}.\tag{91}$$
The boundary/initial conditions represented by Equation (91) are determined by imposing the following requirements:
(a) they must be independent of unknown values of $\mathbf{V}^{(2)}(2;t)$;
(b) the substitution of the boundary and/or initial conditions represented by Equations (81) and (91) into the expression of the bilinear concomitant $\left\{P^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f}\right]\right\}_{\theta^{0}}$ must cause all terms containing unknown boundary/initial values of $\mathbf{V}^{(2)}(2;t)$ to vanish.
The NIDE-net comprising Equations (89) and (91) is called the “2nd-Level Adjoint Sensitivity System (2nd-LASS)” and its solution, $\mathbf{A}^{(2)}(2;j_{1};t)\triangleq\left[\mathbf{a}^{(2)}(1;j_{1};t),\mathbf{a}^{(2)}(2;j_{1};t)\right]$, $j_{1}=1,\ldots,T_{F}$, is called the “2nd-level adjoint sensitivity function”. The unique properties of the 2nd-LASS are highlighted below.
Using in Equation (72) the relations that define the 2nd-LASS, together with the 2nd-LVSS and the relation provided in Equation (87), yields the following alternative expression for the indirect-effect term, which involves the 2nd-level adjoint sensitivity function $\mathbf{A}^{(2)}(2;j_{1};t)\triangleq\left[\mathbf{a}^{(2)}(1;j_{1};t),\mathbf{a}^{(2)}(2;j_{1};t)\right]$ instead of the 2nd-level variational function $\mathbf{V}^{(2)}(2;t)$:
$$\left\{\delta R^{(1)}\left[j_{1};\,\mathbf{h};\,\mathbf{a}^{(1)};\,\mathbf{f};\,\mathbf{A}^{(2)}(2;j_{1};t)\right]\right\}_{ind}=-\left\{\hat{P}^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f};\,\delta\mathbf{f}\right]\right\}_{\theta^{0}}+\left\{\left\langle\mathbf{A}^{(2)}(2;j_{1};t),\,\mathbf{Q}_{V}^{(2)}\left(2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f};\,\delta\mathbf{f}\right)\right\rangle_{2}\right\}_{\theta^{0}},\tag{92}$$
where $\left\{\hat{P}^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f};\,\delta\mathbf{f}\right]\right\}_{\theta^{0}}$ denotes the known residual (non-zero) boundary terms that may not have vanished after the boundary and/or initial conditions represented by Equations (81) and (91) have been used.
Replacing the expression obtained in Equation (92) in Equation (71) yields the following relation:
$$\begin{aligned}\left\{\delta R^{(1)}\left[j_{1};\,\mathbf{U}^{(2)}(2;t);\,\mathbf{A}^{(2)}(2;j_{1};t);\,\mathbf{f};\,\delta\mathbf{f}\right]\right\}_{\theta^{0}}&=-\left\{\hat{P}^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f};\,\delta\mathbf{f}\right]\right\}_{\theta^{0}}+\left\{\left\langle\mathbf{A}^{(2)}(2;j_{1};t),\,\mathbf{Q}_{V}^{(2)}\left(2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f};\,\delta\mathbf{f}\right)\right\rangle_{2}\right\}_{\theta^{0}}\\&\quad+\sum_{j_{2}=1}^{T_{F}}\left\{\frac{\partial R^{(1)}\left[j_{1};\,\mathbf{u};\,\mathbf{a}^{(1)};\,\mathbf{f}\right]}{\partial f_{j_{2}}}\right\}_{\theta^{0}}\delta f_{j_{2}}\equiv\sum_{j_{2}=1}^{T_{F}}\left\{\frac{\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_{j_{2}}\,\partial f_{j_{1}}}\right\}_{\theta^{0}}\delta f_{j_{2}};\quad j_{1}=1,\ldots,T_{F}.\end{aligned}\tag{93}$$
The expressions of the second-order sensitivities $\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial f_{j_{2}}\partial f_{j_{1}}$ of the response with respect to the components of the feature function are obtained by performing the following sequence of operations:
(i) Use Equation (84) to recast the second term on the right side of Equation (93) as follows:
$$\left\{\left\langle\mathbf{A}^{(2)}(2;j_{1};t),\,\mathbf{Q}_{V}^{(2)}\left(2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f};\,\delta\mathbf{f}\right)\right\rangle_{2}\right\}_{\theta^{0}}=\left\{\left\langle\mathbf{a}^{(2)}(1;j_{1};t),\,\mathbf{Q}^{(1)}\left[\mathbf{h}(t);\,\mathbf{f}(\boldsymbol{\theta});\,t\right]\delta\mathbf{f}\right\rangle_{1}\right\}_{\theta^{0}}+\left\{\left\langle\mathbf{a}^{(2)}(2;j_{1};t),\,\mathbf{Q}^{(2)}\left[\mathbf{h}(t);\,\mathbf{a}^{(1)}(t);\,\mathbf{f}(\boldsymbol{\theta});\,t\right]\delta\mathbf{f}(\boldsymbol{\theta})\right\rangle_{1}\right\}_{\theta^{0}}.\tag{94}$$
(ii) Recall that the components of $\mathbf{Q}^{(1)}\left[\mathbf{h}(t);\,\mathbf{f}(\boldsymbol{\theta});\,t\right]\delta\mathbf{f}$ have the form $\sum_{k=1}^{T_{F}}q_{i,k}^{(1)}\left(\mathbf{h};\mathbf{f};t\right)\delta f_{k}$, where the quantities $q_{i,k}^{(1)}\left(\mathbf{h};\mathbf{f};t\right)$ were defined in Equation (15); recall also that the components of $\mathbf{Q}^{(2)}\left[\mathbf{h}(t);\,\mathbf{a}^{(1)}(t);\,\mathbf{f}(\boldsymbol{\theta});\,t\right]\delta\mathbf{f}(\boldsymbol{\theta})$ have the form $\sum_{j_{2}=1}^{T_{F}}q_{i,j_{2}}^{(2)}\left(j_{1};j_{2};\mathbf{h};\mathbf{f};t\right)\delta f_{j_{2}}$, where the quantities $q_{i,j_{2}}^{(2)}\left(j_{1};j_{2};\mathbf{h};\mathbf{f};t\right)$ were defined in Equation (76). Insert these expressions in Equation (94) to obtain the following relation:
$$\left\{\left\langle\mathbf{A}^{(2)}(2;j_{1};t),\,\mathbf{Q}_{V}^{(2)}\left(2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f};\,\delta\mathbf{f}\right)\right\rangle_{2}\right\}_{\theta^{0}}=\sum_{j_{2}=1}^{T_{F}}\left\{\sum_{i=1}^{T_{H}}\int_{t_{0}}^{t_{f}}q_{i,j_{2}}^{(1)}\left(\mathbf{h};\mathbf{f};t\right)a_{1,i}^{(2)}\left(j_{1};t\right)dt\right\}_{\theta^{0}}\delta f_{j_{2}}+\sum_{j_{2}=1}^{T_{F}}\left\{\sum_{i=1}^{T_{H}}\int_{t_{0}}^{t_{f}}q_{i,j_{2}}^{(2)}\left(j_{1};j_{2};\mathbf{h};\mathbf{f};t\right)a_{2,i}^{(2)}\left(j_{1};t\right)dt\right\}_{\theta^{0}}\delta f_{j_{2}}.\tag{95}$$
(iii) Insert into Equation (93) the equivalent expression obtained in Equation (95) and subsequently identify the quantities that multiply the variations $\delta f_{j_{2}}$, to obtain the following expression for the second-order sensitivities $\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial f_{j_{2}}\partial f_{j_{1}}$ (a quadrature sketch of the resulting integral terms is provided after this list):
$$\frac{\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_{j_{2}}\,\partial f_{j_{1}}}=\frac{\partial R^{(1)}\left[j_{1};\,\mathbf{u};\,\mathbf{a}^{(1)};\,\mathbf{f}\right]}{\partial f_{j_{2}}}-\frac{\partial\hat{P}^{(2)}\left[\mathbf{U}^{(2)};\,\mathbf{A}^{(2)};\,\mathbf{V}^{(2)};\,\mathbf{f};\,\delta\mathbf{f}\right]}{\partial f_{j_{2}}}+\sum_{i=1}^{T_{H}}\int_{t_{0}}^{t_{f}}q_{i,j_{2}}^{(1)}\left(\mathbf{h};\mathbf{f};t\right)a_{1,i}^{(2)}\left(j_{1};t\right)dt+\sum_{i=1}^{T_{H}}\int_{t_{0}}^{t_{f}}q_{i,j_{2}}^{(2)}\left(j_{1};j_{2};\mathbf{h};\mathbf{f};t\right)a_{2,i}^{(2)}\left(j_{1};t\right)dt;\quad j_{1},j_{2}=1,\ldots,T_{F}.\tag{96}$$
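To illustrate how the integral terms in Equation (96) would be assembled numerically, the following Python sketch (the array names and shapes are hypothetical conventions, not the paper’s code) evaluates, for a fixed index $j_{1}$, the two quadrature sums over $i=1,\ldots,T_{H}$ for all values of $j_{2}$ at once:

```python
# Minimal sketch: trapezoidal quadrature of the two integral terms of Eq. (96)
# for a fixed j1. Assumed (hypothetical) layout on a time grid t of N_t nodes:
#   q1[i, j2, n] ~ q^{(1)}_{i,j2}(h; f; t_n)
#   q2[i, j2, n] ~ q^{(2)}_{i,j2}(j1; j2; h; f; t_n)
#   a1[i, n]     ~ a^{(2)}_{1,i}(j1; t_n),  a2[i, n] ~ a^{(2)}_{2,i}(j1; t_n)
import numpy as np

def integral_terms_eq96(q1, q2, a1, a2, t):
    dt = np.diff(t)
    w = np.zeros_like(t)
    w[:-1] += dt / 2.0
    w[1:] += dt / 2.0                            # trapezoid quadrature weights
    term1 = np.einsum('ijn,in,n->j', q1, a1, w)  # sum_i \int q1_{i,j2} a_{1,i} dt
    term2 = np.einsum('ijn,in,n->j', q2, a2, w)  # sum_i \int q2_{i,j2} a_{2,i} dt
    return term1 + term2                         # vector indexed by j2 = 1,...,T_F
```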
It is important to note that the 2nd-LASS is independent of the parameter variations $\delta\mathbf{f}$ and of the variations $\mathbf{V}^{(2)}(2;t)$ in the respective state functions. It is also important to note that the $\left(2T_{D}\right)\times\left(2T_{D}\right)$-dimensional matrix $\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ is independent of the index $j_{1}$; only the source term $\mathbf{Q}_{A}^{(2)}\left(2;\,j_{1};\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ depends on $j_{1}$. Therefore, the same solver can be used to invert the matrix $\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ when solving the 2nd-LASS numerically for each $j_{1}$-dependent source, thereby obtaining the corresponding $j_{1}$-dependent $\left(2T_{D}\right)$-dimensional 2nd-level adjoint function $\mathbf{A}^{(2)}(2;j_{1};t)\triangleq\left[\mathbf{a}^{(2)}(1;j_{1};t),\mathbf{a}^{(2)}(2;j_{1};t)\right]$. Computationally, it would be most efficient to store, if possible, the inverse matrix $\left[\mathbf{AM}^{(2)}\right]^{-1}$ and to multiply it directly with the corresponding source term $\mathbf{Q}_{A}^{(2)}\left(2;\,j_{1};\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ for each index $j_{1}$.
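The “factor once, reuse for every $j_{1}$” strategy described above can be sketched as follows in Python (the discretization and all names are assumptions for illustration; after discretizing in time, the adjoint operator becomes a fixed square matrix that is factorized once):

```python
# Minimal sketch of reusing one factorization of the discretized adjoint
# operator AM(2) for all T_F sources of the 2nd-LASS. The matrix A and the
# sources Q are hypothetical stand-ins for the discretized operator/sources.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def solve_2nd_lass_all_sources(A, sources):
    """A: (n, n) discretized adjoint operator, independent of j1.
    sources: (T_F, n) array, one discretized source per feature index j1.
    Returns (T_F, n): one discretized 2nd-level adjoint function per j1."""
    lu_piv = lu_factor(A)            # the single "large-scale" factorization
    return np.stack([lu_solve(lu_piv, q) for q in sources])

# Toy usage: n = 40 unknowns, T_F = 3 feature functions.
rng = np.random.default_rng(2)
A = rng.standard_normal((40, 40)) + 40.0 * np.eye(40)  # well-conditioned toy matrix
Q = rng.standard_normal((3, 40))
adjoints = solve_2nd_lass_all_sources(A, Q)            # shape (3, 40)
```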
Since the adjoint matrix $\mathbf{AM}^{(2)}\left(2\times2;\,\mathbf{U}^{(2)}(2;t);\,\mathbf{f}\right)$ is block-triangular, as shown in Equation (88), solving the 2nd-LASS is equivalent to solving two 1st-LASS-type systems, sequentially, with two different source terms. Thus, the “solvers” and the computer program used for solving the 1st-LASS can also be used for solving the 2nd-LASS. The 2nd-LASS was designated the “second-level”, rather than the “second-order”, adjoint sensitivity system because the 2nd-LASS does not involve any explicit 2nd-order G-derivatives of the operators underlying the original system, but instead involves the inversion of the same operators that need to be inverted for solving the 1st-LASS.
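Concretely, the decoupling described above can be sketched as follows (a discretized illustration under assumed block names, solving the second block-row of Equation (88) first and then back-substituting into the first):

```python
# Minimal sketch of solving the discretized 2nd-LASS blockwise, exploiting the
# block-triangular structure of Eq. (88): two sequential 1st-LASS-type solves.
# Lstar, V21star, V22star are hypothetical (n, n) discretizations of the
# adjoint blocks; q1, q2 are the two blocks of the source for a given j1.
import numpy as np

def solve_2nd_lass_blockwise(Lstar, V21star, V22star, q1, q2):
    a2 = np.linalg.solve(V22star, q2)               # second block-row: V22* a2 = q2
    a1 = np.linalg.solve(Lstar, q1 - V21star @ a2)  # first block-row, back-substituted
    return a1, a2
```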
If the 2nd-LASS is solved $T_{F}$ times, the 2nd-order mixed sensitivities $\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial f_{j_{2}}\partial f_{j_{1}}$ will be computed twice, in two different ways, in terms of two distinct 2nd-level adjoint functions. Consequently, the symmetry property $\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial f_{j_{2}}\partial f_{j_{1}}=\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]/\partial f_{j_{1}}\partial f_{j_{2}}$ provides an intrinsic (numerical) verification that the 1st-level adjoint function $\mathbf{a}^{(1)}(t)$ and the components of the 2nd-level adjoint function $\mathbf{A}^{(2)}(2;j_{1};t)$ are computed accurately.
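In a numerical implementation, this intrinsic verification amounts to checking the symmetry of the computed sensitivity matrix; a minimal sketch (with an illustrative tolerance, not one prescribed by the present framework) follows:

```python
# Minimal sketch of the intrinsic verification: S2[j1-1, j2-1] holds the computed
# mixed sensitivity d^2R/(df_{j2} df_{j1}); each row stems from a distinct
# 2nd-level adjoint function, so asymmetry flags inaccuracies in the adjoints.
import numpy as np

def symmetry_defect(S2, rtol=1e-6):
    """Return the relative asymmetry ||S2 - S2^T|| / ||S2|| and a pass flag."""
    denom = max(np.linalg.norm(S2), np.finfo(float).tiny)
    defect = np.linalg.norm(S2 - S2.T) / denom
    return defect, bool(defect < rtol)
```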
The second-order sensitivities of the decoder response with respect to the optimized weights/parameters $\theta_{k}$, $k=1,\ldots,T_{W}$, are obtained analytically by using the chain rule in conjunction with the expressions obtained in Equations (46) and (96), as follows:
$$\frac{\partial^{2}R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]}{\partial\theta_{k}\,\partial\theta_{j}}=\frac{\partial}{\partial\theta_{k}}\left[\sum_{i=1}^{T_{F}}\frac{\partial R\left[\mathbf{h};\,\mathbf{f}(\boldsymbol{\theta})\right]}{\partial f_{i}(\boldsymbol{\theta})}\,\frac{\partial f_{i}(\boldsymbol{\theta})}{\partial\theta_{j}}\right];\quad j,k=1,\ldots,T_{W}.\tag{97}$$
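Expanding the right side of Equation (97) by the product rule gives $\partial^{2}R/\partial\theta_{k}\partial\theta_{j}=\sum_{i}\sum_{l}\left(\partial^{2}R/\partial f_{l}\partial f_{i}\right)\left(\partial f_{l}/\partial\theta_{k}\right)\left(\partial f_{i}/\partial\theta_{j}\right)+\sum_{i}\left(\partial R/\partial f_{i}\right)\left(\partial^{2}f_{i}/\partial\theta_{k}\partial\theta_{j}\right)$; the following Python sketch (all array names are illustrative assumptions) evaluates this expansion:

```python
# Minimal sketch of the chain rule of Eq. (97), expanded by the product rule.
# Assumed (hypothetical) inputs:
#   dR_df:       (T_F,)           first-order sensitivities dR/df_i (from the 1st-LASS)
#   d2R_df2:     (T_F, T_F)       second-order sensitivities of Eq. (96)
#   df_dtheta:   (T_F, T_W)       Jacobian df_i/dtheta_j of the feature functions
#   d2f_dtheta2: (T_F, T_W, T_W)  Hessians d^2 f_i/(dtheta_k dtheta_j)
import numpy as np

def second_order_theta_sensitivities(dR_df, d2R_df2, df_dtheta, d2f_dtheta2):
    term1 = df_dtheta.T @ d2R_df2 @ df_dtheta           # J^T H J: curvature of R in f
    term2 = np.einsum('i,ikj->kj', dR_df, d2f_dtheta2)  # curvature of the features
    return term1 + term2                                # (T_W, T_W) parameter Hessian
```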

4. Discussion

The capabilities of the traditional neural-nets [39,40,41] have been significantly generalized by Chen et al. [1] with the introduction of Neural Ordinary Differential Equations (NODE), which provide an explicit connection between deep feed-forward neural networks and dynamical systems. Since NODE models are limited to describing systems that are instantaneous, Zappala et al. [11] have introduced the Neural Integral Equation (NIE) and the Attentional Neural Integral Equation (ANIE), which can be used to infer the spatio-temporal relations that generated the data, thus enabling the continuous learning of non-local dynamics with arbitrary time resolution [11,12]. Subsequently, Zappala et al. [16] have generalized both NODE and NIE/ANIE by developing a deep learning method called Neural Integro-Differential Equation (NIDE), which “learns” an integro-differential equation (IDE) whose solution approximates data sampled from given non-local dynamics. The motivation for using NIDE stems from the need to model systems that present spatio-temporal relations which transcend local modeling, as illustrated by applications in population dynamics, biology, physics, engineering and applied sciences [17,18,19,20,21,22,23,24,25,26].
The physical system modeled by a neural-net comprises parameters and functions of parameters (correlations, material properties, etc.) that stem from measurements and/or computations which are subject to uncertainties. Therefore, even if the neural-net reproduced the underlying system perfectly, the uncertainties inherent in the system’s parameters would propagate to the subsequent results produced by the decoder. Quantifying the uncertainties in the decoder’s response is possible only if the sensitivities of the decoder’s response with respect to the neural-net’s optimized parameters are known. Based on the general framework of the nth-FASAM-N methodology [27], Cacuci has developed the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (2nd-FASAM-NODE)” [28], the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Fredholm-Type (2nd-FASAM-NIE-F)” [29] and the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (2nd-FASAM-NIE-V)” [30]. The 2nd-FASAM-NODE encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (1st-FASAM-NODE)”. The 2nd-FASAM-NIE-F encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Fredholm-Type (1st-FASAM-NIE-F)”, while the 2nd-FASAM-NIE-V encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (1st-FASAM-NIE-V)”. The 1st-FASAM-NODE, 1st-FASAM-NIE-F and 1st-FASAM-NIE-V methodologies enable the computation, with unparalleled efficiency, of exactly-determined first-order sensitivities of the decoder response with respect to the optimized/trained weights (and functions thereof) that characterize the network’s hidden layers, decoder, and encoder, requiring a single “large-scale” computation for solving the 1st-Level Adjoint Sensitivity System (1st-LASS), regardless of the number of weights/parameters underlying the respective network. The 2nd-FASAM-NODE, 2nd-FASAM-NIE-F and 2nd-FASAM-NIE-V methodologies enable the computation, with comparable efficiency, of exactly-determined second-order sensitivities of the decoder response with respect to the network’s parameters and functions thereof, requiring only as many “large-scale” computations as there are non-zero first-order sensitivities with respect to the feature functions.
Generalizing the methodologies presented in [28,29,30], this work has introduced the general mathematical frameworks of the First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type (1st-FASAM-NIDE-F) and of the Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type (2nd-FASAM-NIDE-F). The 1st-FASAM-NIDE-F methodology requires a single “large-scale” computation, for solving the 1st-Level Adjoint Sensitivity System (1st-LASS), in order to compute all of the first-order sensitivities of the decoder response, regardless of the number of weights/parameters underlying the NIDE-F net. The 2nd-FASAM-NIDE-F methodology requires at most as many “large-scale” computations as there are non-zero feature functions of model parameters. In the accompanying “Part II” [31], the application of the 1st-FASAM-NIDE-F and 2nd-FASAM-NIDE-F methodologies will be illustrated by means of a paradigm heat conduction model, which admits exact closed-form solutions/expressions for all quantities of interest, including the state functions and the first- and second-order sensitivities. This paradigm heat conduction model is of fundamental importance in many scientific fields involving heat transfer [32,33,34,35,36,37].

5. Conclusions

As has been shown in this work, the 1st-FASAM-NIDE-F methodology yields exact expressions of all first-order sensitivities of NIDE-F decoder-responses with respect to the optimized NIDE-F weights/parameters and functions thereof. These exact expressions are integrals involving the first-level adjoint sensitivity function that is obtained by solving the 1st-Level Adjoint Sensitivity System (1st-LASS). Thus, the computation of all first-order sensitivities of the decoder response requires a single “large-scale” computation for solving the 1st-LASS, regardless of the number of feature functions and parameters underlying the NIDE-F net. It has also been shown that the 2nd-FASAM-NIDE-F methodology enables the most efficient computation of the exactly obtained expressions of the second-order sensitivities of NIDE-F decoder-responses with respect to the optimized NIDE-F weights/parameters, involving integrals over the second-level adjoint sensitivity functions. Computing the requisite second-level adjoint sensitivity functions requires at most as many “large-scale” computations (for solving the 2nd-LASS) as there are non-zero feature functions of model parameters. The second-order mixed sensitivities are computed twice, using distinct 2nd-level adjoint functions, thus providing a stringent mechanism for the verification of the accuracy of the computations of all these adjoint functions.
To complement the present work, ongoing work aims at developing the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Volterra-Type (2nd-FASAM-NIDE-V)”, which will also comprise the 1st-FASAM-NIDE-V (“First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Volterra-Type”). This work will parallel the present 2nd-FASAM-NIDE-F and will enable the most efficient computation of exactly-obtained expressions of the first- and second-order sensitivities of NIDE-V decoder-responses with respect to the optimized NIDE-V weights/parameters and functions thereof. This ongoing work will complete the development of second-order sensitivity analysis methodologies for NODEs, NIEs, and NIDEs. Subsequent work is planned to apply these methodologies to quantify the accuracy of results predicted by models of large-scale systems of interest to the engineering, physical, and biological sciences.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Chen, R.T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: New York, NY, USA, 2018; Volume 31, pp. 6571–6583. [Google Scholar] [CrossRef]
  2. Rokhlin, V. Rapid solution of integral equations of classical potential theory. J. Comput. Phys. 1985, 60, 187–207. [Google Scholar] [CrossRef]
  3. Ruthotto, L.; Haber, E. Deep neural networks motivated by partial differential equations. J. Math. Imaging Vis. 2018, 62, 352–364. [Google Scholar] [CrossRef]
  4. Lu, Y.; Zhong, A.; Li, Q.; Dong, B. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations. In Proceedings of Machine Learning Research, Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; ML Research Press: Amsterdam, The Netherlands, 2018; pp. 3276–3285. [Google Scholar]
  5. Grathwohl, W.; Chen, R.T.Q.; Bettencourt, J.; Sutskever, I.; Duvenaud, D. Ffjord: Free-form continuous dynamics for scalable reversible generative models. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
  6. Dupont, E.; Doucet, A.; Teh, Y.W. Augmented neural ODEs. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32, pp. 14–15. [Google Scholar]
  7. Kidger, P.; Morrill, J.; Foster, J.; Lyons, T. Neural controlled differential equations for irregular time series. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; Volume 33, pp. 6696–6707. [Google Scholar]
  8. Kidger, P. On Neural Differential Equations. arXiv 2022, arXiv:2202.02435. [Google Scholar] [PubMed]
  9. Morrill, J.; Salvi, C.; Kidger, P.; Foster, J. Neural rough differential equations for long time series. In Proceedings of Machine Learning Research, Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; ML Research Press: Amsterdam, The Netherlands, 2021; pp. 7829–7838. [Google Scholar]
  10. Effati, S.; Buzhabadi, R. A neural network approach for solving Fredholm integral equations of the second kind. Neural Comput. Appl. 2012, 21, 843–852. [Google Scholar] [CrossRef]
  11. Zappala, E.; de Oliveira Fonseca, A.H.; Caro, J.O.; van Dijk, D. Neural Integral Equations. arXiv 2023, arXiv:2209.15190v4. [Google Scholar]
  12. Xiong, Y.; Zeng, Z.; Chakraborty, R.; Tan, M.; Fung, G.; Li, Y.; Singh, V. Nyströmformer: A Nyström-based algorithm for approximating self-attention. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 19–21 May 2021; NIH Public Access: Bethesda, MD, USA, 2021; Volume 35, p. 14138. [Google Scholar]
  13. Rokhlin, V. Rapid solution of integral equations of scattering theory in two dimensions. J. Comput. Phys. 1990, 86, 414–439. [Google Scholar] [CrossRef]
  14. Greengard, L.; Kropinski, M.C. An integral equation approach to the incompressible Navier-Stokes equations in two dimensions. SIAM J. Sci. Comput. 1998, 20, 318–336. [Google Scholar] [CrossRef]
  15. Prinja, A.K.; Larsen, E.W. General Principles of Neutron Transport. In Handbook of Nuclear Engineering; Cacuci, D.G., Ed.; Springer Science + Business Media: New York, NY, USA, 2010; Volume 1, pp. 543–642. [Google Scholar]
  16. Zappala, E.; de Oliveira Fonseca, A.H.; Moberly, A.H.; Higley, M.J.; Abdallah, C.; Cardin, J.; van Dijk, D. Neural integro-differential equations. arXiv 2022, arXiv:2206.14282v4. [Google Scholar] [CrossRef]
  17. Volterra, V. Theory of functionals and of integral and integro-differential equations. Bull. Amer. Math. Soc. 1932, 38, 623. [Google Scholar]
  18. Caffarelli, L.; Silvestre, L. Regularity theory for fully nonlinear integro-differential equations. Commun. Pure Appl. Math. 2009, 62, 597–638. [Google Scholar] [CrossRef]
  19. Grigoriev, Y.N.; Ibragimov, N.H.; Kovalev, V.F.; Meleschko, S.V. Symmetries of integro-differential equations: With applications in mechanics and plasma physics. In Lecture Notes in Physics; Springer Science + Business Media: Dordrecht, The Netherlands, 2010; Volume 806. [Google Scholar]
  20. Lakshmikantham, V. Theory of Integro-Differential Equations; CRC Press: Boca Raton, FL, USA, 1995; Volume 1. [Google Scholar]
  21. Amari, S. Dynamics of pattern formation in lateral-inhibition type neural fields. Biol. Cybern. 1977, 27, 77–87. [Google Scholar] [CrossRef] [PubMed]
  22. Medlock, J.; Kot, M. Spreading disease: Integro-differential equations old and new. Math. Biosci. 2003, 184, 201–222. [Google Scholar] [CrossRef] [PubMed]
  23. Wilson, H.R.; Cowan, J.D. Excitatory and inhibitory interactions in localized populations of model neurons. Biophys. J. 1972, 12, 1–24. [Google Scholar] [CrossRef] [PubMed]
  24. Minakov, A.A.; Schick, C. Integro-Differential Equation for the Non-Equilibrium Thermal Response of Glass-Forming Materials: Analytical Solutions. Symmetry 2021, 13, 256. [Google Scholar] [CrossRef]
  25. Ciesielski, M.; Mochnacki, B.; Majchrzak, E. Integro-differential form of the first-order dual phase lag heat transfer equation and its numerical solution using the Control Volume Method. Arch. Mech. 2020, 72, 415–444. [Google Scholar] [CrossRef]
  26. Laitinen, M.T.; Tiihonen, T. Integro-differential equation modelling heat transfer in conducting, radiating and semitransparent materials. Math. Meth. Appl. Sci. 1998, 21, 375–392. [Google Scholar] [CrossRef]
  27. Cacuci, D.G. Introducing the nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N): I. Mathematical Framework. Am. J. Comput. Math. 2024, 14, 11–42. [Google Scholar] [CrossRef]
  28. Cacuci, D.G. Introducing the Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations. I: Mathematical Framework. Processes 2024, 12, 2660. [Google Scholar] [CrossRef]
  29. Cacuci, D.G. Introducing the Second-Order Features Adjoint Sensitivity Analysis Methodology for Fredholm-Type Neural Integral Equations. Mathematics 2025, 13, 14. [Google Scholar] [CrossRef]
  30. Cacuci, D.G. Introducing the Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type: Mathematical Methodology and Illustrative Application to Nuclear Engineering. J. Nucl. Eng. 2025, 6, 8. [Google Scholar] [CrossRef]
  31. Cacuci, D.G. The First- and Second-Order Features Adjoint Sensitivity Analysis Methodologies for Fredholm-Type Neural Integro-Differential Equations. II. Illustrative Application to a Heat Transfer Model. Processes 2025, 2025042086. [Google Scholar]
  32. Schneider, P.J. Conduction Heat Transfer; Addison-Wesley Publishing Co.: Reading, MA, USA, 1955. [Google Scholar]
  33. Carslaw, H.S.; Jaeger, J.C. Conduction of Heat in Solids; Clarendon Press: London, UK, 1959. [Google Scholar]
  34. Arpaci, V.S. Conduction Heat Transfer; Addison-Wesley Publishing Co.: Reading, MA, USA, 1966. [Google Scholar]
  35. Ozisik, M.N. Boundary Value Problems of Heat Conduction; International Textbook Company: Scranton, PA, USA, 1968. [Google Scholar]
  36. Ozisik, M.N. Heat Conduction; John Wiley & Sons: New York, NY, USA, 1980. [Google Scholar]
  37. Todreas, N.E.; Kazimi, M.S. Nuclear Systems I: Thermal Hydraulic Fundamentals; Taylor & Francis: Bristol, PA, USA, 1993. [Google Scholar]
  38. Cacuci, D.G. Sensitivity Theory for Nonlinear Systems: I. Nonlinear Functional Analysis Approach. J. Math. Phys. 1981, 22, 2794–2802. [Google Scholar] [CrossRef]
  39. Bishop, C.M. Neural Networks for Pattern Recognition; Clarendon Press: Oxford, UK, 1995. [Google Scholar]
  40. Bishop, C.M. Pattern Recognition and Machine Learning; Springer Science + Business Media: New York, NY, USA, 2006. [Google Scholar]
  41. Bishop, C.M.; Bishop, H. Deep Learning: Foundations and Concepts; Springer Nature: Cham, Switzerland, 2024. [Google Scholar]