1. Introduction
The functional derivatives, with respect to the model’s parameters, of the results (customarily called “responses”) produced by computational models of physical systems are called the “response sensitivities.” The first-order sensitivities have been used for a variety of purposes, including: (i) understanding the model by ranking the importance of the various parameters; (ii) performing “reduced-order modeling” by eliminating unimportant parameters and/or processes; (iii) quantifying the uncertainties induced in a model response due to model parameter uncertainties; (iv) performing “model validation,” by comparing computations to experiments to address the question “does the model represent reality?”; (v) prioritizing improvements in the model; (vi) performing data assimilation and model calibration as part of forward “predictive modeling” to obtain best-estimate predicted results with reduced predicted uncertainties; (vii) performing inverse “predictive modeling”; and (viii) designing and optimizing the system.
As is well known, nonlinear operators do not admit adjoint operators; only linear operators admit corresponding adjoint operators. For this reason, many of the most important responses for linear systems involve the solutions of both the forward and the adjoint linear models that correspond to the respective physical system. Among the most widely used system responses that involve both the forward and adjoint functions are the various forms of Lagrangian functionals, the Rayleigh quotient for computing eigenvalues and/or separation constants when solving partial differential equations, the Schwinger functional for first-order “normalization-free” solutions, and many others (see, e.g., [1,2]). These functionals play a fundamental role in optimization and control procedures, in the derivation of numerical methods for solving equations (differential, integral, integro-differential), etc. The analysis of responses that simultaneously involve both forward and adjoint functions makes it necessary to treat linear systems in their own right, rather than treating them as particular cases of nonlinear systems. This is in contradistinction to responses for a nonlinear system, which can depend only on the forward functions, since nonlinear operators do not admit bona fide adjoint operators; only a linearized form of a nonlinear operator admits an adjoint operator.
As is well known, even the approximate determination of the first-order sensitivities $\partial R/\partial\alpha_i$, $i=1,\ldots,N_\alpha$, of a model response $R$ to $N_\alpha$ parameters $\alpha_i$ using conventional finite-difference methods would require at least $N_\alpha$ large-scale computations with altered parameter values. The computation of the $N_\alpha(N_\alpha+1)/2$ distinct second-order response sensitivities would require on the order of $N_\alpha^2$ large-scale computations, which rapidly becomes unfeasible for large-scale models comprising many parameters, even using supercomputers. The computation of higher-order sensitivities by conventional methods is limited in practice by the so-called “curse of dimensionality” [3], since the number of large-scale computations needed for computing higher-order response sensitivities increases exponentially with the order of the response sensitivities. The First-Order Adjoint Sensitivity Analysis Methodology for Nonlinear Systems, conceived and developed by Cacuci [4,5,6], already provides a considerable step forward in the direction of overcoming the “curse of dimensionality”. For the exact computation of the first- and second-order response sensitivities to parameters, the “curse of dimensionality” has been overcome by the Second-Order Adjoint Sensitivity Analysis Methodology conceived and developed by Cacuci [7,8,9]. This was demonstrated by the application of this methodology [10,11,12,13,14,15] to compute, comprehensively and efficiently, the exact expressions of the first- and second-order sensitivities of the leakage response (which has also been measured experimentally) to the model parameters of an OECD/NEA reactor physics benchmark [16]. The neutron transport code PARTISN [17] has been used to computationally model this reactor physics benchmark, which comprises 21,976 model parameters. Hence, there are 21,976 first-order sensitivities (of which 7477 have nonzero values) and 241,483,276 distinct second-order sensitivities (of which 27,956,503 have nonzero values) of the benchmark’s leakage response to the model parameters. The work presented in [10,11,12,13,14,15] has been extended to third order in [18], which has subsequently been applied in [19,20,21] to the computation of the third-order sensitivities of the leakage response of the OECD/NEA benchmark [16] to the benchmark’s total microscopic nuclear cross sections, which turned out to be the most important model parameters.
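The growth in the number of distinct sensitivities with the order can be checked with a short calculation. The sketch below assumes only that mixed partial derivatives are symmetric, so that the number of distinct sensitivities of order $k$ with respect to $N$ parameters is the binomial coefficient $C(N+k-1,k)$; the function name `distinct_sensitivities` is illustrative and does not come from the cited works.

```python
from math import comb

def distinct_sensitivities(num_params: int, order: int) -> int:
    """Count the distinct partial derivatives of a given order.

    Mixed partials are symmetric, so the count equals the number of
    multisets of size `order` drawn from `num_params` parameters:
    C(num_params + order - 1, order).
    """
    return comb(num_params + order - 1, order)

N_PARAMS = 21_976  # parameter count of the benchmark quoted in the text

counts = {k: distinct_sensitivities(N_PARAMS, k) for k in range(1, 5)}
for order, count in counts.items():
    print(f"order {order}: {count:,} distinct sensitivities")
```

For the second order, the formula reduces to $N(N+1)/2$ and reproduces the 241,483,276 sensitivities quoted above. A finite-difference scheme needs at least one large-scale computation per distinct sensitivity, which is why the third- and fourth-order counts are out of reach for conventional methods.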
This work presents the Fourth-Order Comprehensive Adjoint Sensitivity Analysis Methodology (4th-CASAM) for general response-coupled forward/adjoint systems. The 4th-CASAM enables the efficient computation of the exact expressions of the 1st-, 2nd-, 3rd- and 4th-order sensitivities of a generic system response, which depends on both the forward and adjoint state functions, with respect to all of the parameters underlying the respective systems. The qualifier “comprehensive” is used because this methodology provides exact expressions for the sensitivities of a system response not only to the system’s internal parameters, but also to its (possibly uncertain) boundaries and internal interfaces in phase-space. The development of the 4th-CASAM for coupled forward/adjoint linear systems will provide the basis for significant advances towards overcoming the “curse of dimensionality” in sensitivity analysis, uncertainty quantification, as well as forward and inverse predictive modeling [22,23,24].
This work is structured as follows: Section 2 presents the generic mathematical formulation of the forward and adjoint equations that underlies the computational model of a linear physical system, having a response that depends nonlinearly on the forward and adjoint state functions and parameters. Section 3 presents the development of the novel 4th-CASAM, which is developed successively from the 1st-, 2nd- and 3rd-CASAM. Section 3 also presents, for comparison, the expressions and number of large-scale computations that would be needed if the 1st-, 2nd-, 3rd-, and 4th-order response sensitivities were computed by using finite-difference formulas (which would not only be impractical for large-scale systems, but would provide only approximate values for the respective sensitivities) and the Forward Sensitivity Analysis Methodology (FSAM), which would provide exact expressions for the respective sensitivities, but would also be prohibitively expensive in terms of computational costs for large-scale systems. Finally, Section 4 offers conclusions regarding the significance of this work’s novel results in the quest to overcome the curse of dimensionality in sensitivity analysis, uncertainty quantification and predictive modeling.
2. Background: Mathematical Description of the Physical System
A physical system is modeled by using independent variables, dependent variables (“state functions”), as well as parameters which are seldom, if ever, known precisely. Without loss of generality, the model parameters can be considered to be real scalar quantities, having known nominal (or mean) values and, possibly, known higher-order moments or cumulants (i.e., variance/covariances, skewness, kurtosis), which are determined outside the model, e.g., from experimental data. These imprecisely known model parameters will be denoted as $\alpha_1,\ldots,\alpha_{N_\alpha}$, where $N_\alpha$ denotes the total number of imprecisely known parameters underlying the model under consideration. These model parameters are considered to include imprecisely known geometrical parameters that characterize the physical system’s boundaries in the phase-space of the model’s independent variables. For subsequent developments, it is convenient to consider that these parameters are components of a “vector of parameters” denoted as $\boldsymbol{\alpha} \triangleq (\alpha_1,\ldots,\alpha_{N_\alpha})^\dagger$; $\boldsymbol{\alpha} \in \mathbb{R}^{N_\alpha}$, where $\mathbb{R}^{N_\alpha}$ denotes the $N_\alpha$-dimensional subset of the set of real scalars. The vector $\boldsymbol{\alpha}$ is considered to include any imprecisely known model parameters that may enter into defining the system’s boundary in the phase-space of independent variables. The symbol “$\triangleq$” will be used to denote “is defined as” or “is by definition equal to.” Matrices and vectors will be denoted using bold letters. All vectors in this work are considered to be column vectors, and transposition will be indicated by a dagger ($\dagger$) superscript.
The model is considered to comprise $N_x$ independent variables which are denoted as $x_i$, $i=1,\ldots,N_x$, and are considered to be components of the vector $\mathbf{x} \triangleq (x_1,\ldots,x_{N_x})^\dagger \in \mathbb{R}^{N_x}$. The vector $\mathbf{x}$ of independent variables is considered to be defined on a phase-space domain, denoted as $\Omega(\boldsymbol{\alpha})$, and defined as follows: $\Omega(\boldsymbol{\alpha}) \triangleq \{-\infty \le \lambda_i(\boldsymbol{\alpha}) \le x_i \le \omega_i(\boldsymbol{\alpha}) \le \infty;\ i=1,\ldots,N_x\}$. The lower boundary-point of an independent variable is denoted as $\lambda_i(\boldsymbol{\alpha})$ (e.g., the inner radius of a sphere or cylinder, the lower range of an energy-variable, etc.), while the corresponding upper boundary-point is denoted as $\omega_i(\boldsymbol{\alpha})$ (e.g., the outer radius of a sphere or cylinder, the upper range of an energy-variable, etc.). A typical example of boundaries that depend on imprecisely known parameters is provided by the boundary conditions needed for models based on diffusion theory, in which the respective “flux and/or current conditions” for the “boundaries facing vacuum” are imposed on the “extrapolated boundary” of the respective spatial domain. As is well known, the “extrapolated boundary” depends not only on the imprecisely known physical dimensions of the problem’s domain, but also on the medium’s microscopic transport cross sections and atomic number densities. The boundary of $\Omega(\boldsymbol{\alpha})$, which will be denoted as $\partial\Omega(\boldsymbol{\alpha})$, comprises the set of all of the endpoints of the respective intervals on which the components of $\mathbf{x}$ are defined, i.e., $\partial\Omega(\boldsymbol{\alpha}) \triangleq \{\lambda_i(\boldsymbol{\alpha}) \cup \omega_i(\boldsymbol{\alpha});\ i=1,\ldots,N_x\}$.
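The model is considered to comprise $N_\varphi$ dependent variables (also called “state functions”), denoted as $\varphi_i(\mathbf{x})$, $i=1,\ldots,N_\varphi$, which are considered to be the components of the “vector of dependent variables” defined as $\boldsymbol{\varphi}(\mathbf{x}) \triangleq [\varphi_1(\mathbf{x}),\ldots,\varphi_{N_\varphi}(\mathbf{x})]^\dagger$.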
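A linear physical system is generally modeled by a system of $N_\varphi$ linear operator-equations which can be generally represented as follows:

$$\mathbf{L}(\mathbf{x};\boldsymbol{\alpha})\,\boldsymbol{\varphi}(\mathbf{x}) = \mathbf{q}_\varphi(\mathbf{x};\boldsymbol{\alpha}), \quad \mathbf{x}\in\Omega(\boldsymbol{\alpha}), \tag{1}$$

where $\mathbf{L}(\mathbf{x};\boldsymbol{\alpha}) \triangleq [L_{ij}(\mathbf{x};\boldsymbol{\alpha})]$, $i,j=1,\ldots,N_\varphi$, is a matrix of dimensions $N_\varphi \times N_\varphi$, while $\mathbf{q}_\varphi(\mathbf{x};\boldsymbol{\alpha}) \triangleq [q_{\varphi,1}(\mathbf{x};\boldsymbol{\alpha}),\ldots,q_{\varphi,N_\varphi}(\mathbf{x};\boldsymbol{\alpha})]^\dagger$ is a column vector of dimension $N_\varphi$. The components $L_{ij}(\mathbf{x};\boldsymbol{\alpha})$ are operators that act linearly on the dependent variables $\varphi_j(\mathbf{x})$ and are, in general, nonlinear functions of the imprecisely known parameters $\boldsymbol{\alpha}$. The components $q_{\varphi,i}(\mathbf{x};\boldsymbol{\alpha})$ of $\mathbf{q}_\varphi(\mathbf{x};\boldsymbol{\alpha})$, where the subscript “$\varphi$” indicates sources associated with the “forward” system of equations, are also nonlinear functions of $\boldsymbol{\alpha}$. Since the right side of Equation (1) may contain distributions, the equality in this equation is considered to hold in the weak (“distributional”) sense. Similarly, all of the equalities that involve differential equations in this work will be considered to hold in the weak/distributional sense.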
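When $\mathbf{L}(\mathbf{x};\boldsymbol{\alpha})$ contains differential operators, a set of boundary and/or initial conditions which define the domain of $\mathbf{L}(\mathbf{x};\boldsymbol{\alpha})$ must also be given. Since the complete mathematical model is considered to be linear in $\boldsymbol{\varphi}(\mathbf{x})$, the boundary and/or initial conditions needed to define the domain of $\mathbf{L}(\mathbf{x};\boldsymbol{\alpha})$ must also be linear in $\boldsymbol{\varphi}(\mathbf{x})$. Such linear boundary and/or initial conditions are represented in the following operator form:

$$\mathbf{B}_\varphi(\mathbf{x};\boldsymbol{\alpha})\,\boldsymbol{\varphi}(\mathbf{x}) - \mathbf{c}_\varphi(\mathbf{x};\boldsymbol{\alpha}) = \mathbf{0}, \quad \mathbf{x}\in\partial\Omega(\boldsymbol{\alpha}). \tag{2}$$

In Equation (2), the operator $\mathbf{B}_\varphi(\mathbf{x};\boldsymbol{\alpha})$ is a matrix of dimensions $N_B \times N_\varphi$ comprising, as components, operators that act linearly on $\boldsymbol{\varphi}(\mathbf{x})$ and nonlinearly on $\boldsymbol{\alpha}$; the quantity $N_B$ denotes the total number of boundary and initial conditions. The operator $\mathbf{c}_\varphi(\mathbf{x};\boldsymbol{\alpha})$ is an $N_B$-dimensional vector comprising components that are operators acting, in general, nonlinearly on $\boldsymbol{\alpha}$. The subscript “$\varphi$” in Equation (2) indicates boundary conditions associated with the forward state function $\boldsymbol{\varphi}(\mathbf{x})$. In this work, capital bold letters will be used to denote matrices (whose components may be operators rather than just functions) while lower-case bold letters will be used to denote vectors.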
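The nominal solution of Equations (1) and (2) is denoted as $\boldsymbol{\varphi}^0(\mathbf{x})$, and is obtained by solving these equations at the nominal (or mean) values of the model parameters, $\boldsymbol{\alpha}^0$. The superscript “zero” will henceforth be used to denote “nominal” (or, equivalently, “expected” or “mean”) values. Thus, the vectors $\boldsymbol{\varphi}^0(\mathbf{x})$ and $\boldsymbol{\alpha}^0$ satisfy the following equations:

$$\mathbf{L}(\mathbf{x};\boldsymbol{\alpha}^0)\,\boldsymbol{\varphi}^0(\mathbf{x}) = \mathbf{q}_\varphi(\mathbf{x};\boldsymbol{\alpha}^0), \quad \mathbf{x}\in\Omega(\boldsymbol{\alpha}^0), \tag{3}$$

$$\mathbf{B}_\varphi(\mathbf{x};\boldsymbol{\alpha}^0)\,\boldsymbol{\varphi}^0(\mathbf{x}) - \mathbf{c}_\varphi(\mathbf{x};\boldsymbol{\alpha}^0) = \mathbf{0}, \quad \mathbf{x}\in\partial\Omega(\boldsymbol{\alpha}^0). \tag{4}$$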
Linear systems differ fundamentally from nonlinear systems in that linear operators admit adjoint operators, whereas nonlinear operators do not. Furthermore, important model responses of linear systems can be functions of both the forward and the adjoint state functions, a situation that cannot occur for nonlinear problems. Therefore, linear physical systems cannot be simply considered to be particular cases of nonlinear systems (although they may be just that in particular cases), but need to be treated comprehensively in their own right.
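Physical problems modeled by linear systems and/or operators are naturally defined in Hilbert spaces. Thus, for physical systems represented by Equations (1) and (2), the components $\varphi_i(\mathbf{x})$, $i=1,\ldots,N_\varphi$, are considered to be square-integrable functions and $\boldsymbol{\varphi}(\mathbf{x}) \in \mathsf{H}_0$, where $\mathsf{H}_0$ is the “original”, or “zeroth-level”, Hilbert space, as denoted by the subscript “zero”. Subsequently in this work, higher-level Hilbert spaces, which will be denoted as $\mathsf{H}_1$, $\mathsf{H}_2$, etc., will also be introduced. Evidently, all of the elements of $\mathsf{H}_0$ are $N_\varphi$-dimensional vectors that are functions of the independent variables $\mathbf{x}$. For two elements $\boldsymbol{\varphi}(\mathbf{x}) \in \mathsf{H}_0$ and $\boldsymbol{\psi}(\mathbf{x}) \in \mathsf{H}_0$, the Hilbert space $\mathsf{H}_0$ is endowed with an inner product that will be denoted as $\langle \boldsymbol{\varphi}, \boldsymbol{\psi} \rangle_0$ and which is defined as follows:

$$\langle \boldsymbol{\varphi}, \boldsymbol{\psi} \rangle_0 \triangleq \prod_{i=1}^{N_x} \int_{\lambda_i(\boldsymbol{\alpha})}^{\omega_i(\boldsymbol{\alpha})} \boldsymbol{\varphi}(\mathbf{x}) \cdot \boldsymbol{\psi}(\mathbf{x})\, dx_i. \tag{5}$$

In Equation (5), the product-notation $\prod_{i=1}^{N_x} \int_{\lambda_i(\boldsymbol{\alpha})}^{\omega_i(\boldsymbol{\alpha})} [\ ]\, dx_i$ compactly denotes the respective multiple integrals, while the dot indicates the “scalar product of two vectors” defined as follows:

$$\boldsymbol{\varphi}(\mathbf{x}) \cdot \boldsymbol{\psi}(\mathbf{x}) \triangleq \sum_{i=1}^{N_\varphi} \varphi_i(\mathbf{x})\, \psi_i(\mathbf{x}). \tag{6}$$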
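In most practical situations the Hilbert space $\mathsf{H}_0$ is self-dual. The operator $\mathbf{L}(\mathbf{x};\boldsymbol{\alpha})$ admits an adjoint (operator), which will be denoted as $\mathbf{L}^*(\mathbf{x};\boldsymbol{\alpha})$, and which is defined through the following relation for an arbitrary vector $\boldsymbol{\psi}(\mathbf{x}) \in \mathsf{H}_0$:

$$\langle \boldsymbol{\psi}, \mathbf{L}\boldsymbol{\varphi} \rangle_0 = \langle \mathbf{L}^*\boldsymbol{\psi}, \boldsymbol{\varphi} \rangle_0. \tag{7}$$

In Equation (7), the formal adjoint operator $\mathbf{L}^*(\mathbf{x};\boldsymbol{\alpha})$ is the $N_\varphi \times N_\varphi$ matrix

$$\mathbf{L}^*(\mathbf{x};\boldsymbol{\alpha}) \triangleq [L^*_{ji}(\mathbf{x};\boldsymbol{\alpha})], \quad i,j=1,\ldots,N_\varphi, \tag{8}$$

comprising elements $L^*_{ji}(\mathbf{x};\boldsymbol{\alpha})$ which are obtained by transposing the formal adjoints of the operators $L_{ij}(\mathbf{x};\boldsymbol{\alpha})$. Thus, the system adjoint to the linear system represented by Equations (1) and (2) has the following general representation in operator form:

$$\mathbf{L}^*(\mathbf{x};\boldsymbol{\alpha})\,\boldsymbol{\psi}(\mathbf{x}) = \mathbf{q}_\psi(\mathbf{x};\boldsymbol{\alpha}), \quad \mathbf{x}\in\Omega(\boldsymbol{\alpha}), \tag{9}$$

$$\mathbf{B}_\psi(\mathbf{x};\boldsymbol{\alpha})\,\boldsymbol{\psi}(\mathbf{x}) - \mathbf{c}_\psi(\mathbf{x};\boldsymbol{\alpha}) = \mathbf{0}, \quad \mathbf{x}\in\partial\Omega(\boldsymbol{\alpha}). \tag{10}$$

The domain of $\mathbf{L}^*(\mathbf{x};\boldsymbol{\alpha})$ is determined by selecting the adjoint boundary and/or initial conditions represented in operator form in Equation (10), where the subscript “$\psi$” indicates adjoint boundary and/or initial conditions associated with the adjoint state function $\boldsymbol{\psi}(\mathbf{x})$. These adjoint boundary and/or initial conditions are selected so as to ensure that the boundary terms that arise in the so-called “bilinear concomitant” when going from the left side to the right side of Equation (7) vanish, in conjunction with the forward boundary conditions given in Equation (2).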
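The nominal solution of Equations (9) and (10) is denoted as $\boldsymbol{\psi}^0(\mathbf{x})$, and is obtained by solving these equations at the nominal parameter values $\boldsymbol{\alpha}^0$, i.e.,

$$\mathbf{L}^*(\mathbf{x};\boldsymbol{\alpha}^0)\,\boldsymbol{\psi}^0(\mathbf{x}) = \mathbf{q}_\psi(\mathbf{x};\boldsymbol{\alpha}^0), \quad \mathbf{x}\in\Omega(\boldsymbol{\alpha}^0), \tag{11}$$

$$\mathbf{B}_\psi(\mathbf{x};\boldsymbol{\alpha}^0)\,\boldsymbol{\psi}^0(\mathbf{x}) - \mathbf{c}_\psi(\mathbf{x};\boldsymbol{\alpha}^0) = \mathbf{0}, \quad \mathbf{x}\in\partial\Omega(\boldsymbol{\alpha}^0). \tag{12}$$

In view of Equations (1) and (9), the relationship shown in Equation (7), which is the basis for defining the adjoint operator, also provides the following fundamental “reciprocity-like” relation between the sources of the forward and the adjoint equations, respectively:

$$\langle \boldsymbol{\psi}, \mathbf{q}_\varphi \rangle_0 = \langle \mathbf{q}_\psi, \boldsymbol{\varphi} \rangle_0. \tag{13}$$

The functional on the right side of Equation (13) represents a “detector response”, i.e., a particle reaction-rate representative of the “count” of particles incident on a detector of particles, measuring the respective particle flux $\boldsymbol{\varphi}(\mathbf{x})$. Thus, the source term $\mathbf{q}_\psi$ in Equation (11) is usually associated with the “result of interest” to be measured and/or computed, which is customarily called the system’s “response”.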
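The adjoint definition and the reciprocity-like relation can be checked numerically on a discretized analogue. The sketch below is an illustration under stated assumptions, not the paper's model: the forward operator is replaced by a small real matrix `L` (a hypothetical stand-in), the inner product is the Euclidean dot product, and the adjoint is therefore the matrix transpose.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # number of discrete unknowns (illustrative)

# Hypothetical discretized forward operator: non-symmetric, well conditioned.
L = 0.5 * rng.normal(size=(n, n)) + n * np.eye(n)
L_adj = L.T  # adjoint of a real matrix under the Euclidean inner product

# Adjoint identity <psi, L phi> = <L* psi, phi> for arbitrary vectors.
phi_any, psi_any = rng.normal(size=n), rng.normal(size=n)
lhs = psi_any @ (L @ phi_any)
rhs = (L_adj @ psi_any) @ phi_any

# Forward and adjoint solves with independent sources.
q_phi, q_psi = rng.normal(size=n), rng.normal(size=n)
phi = np.linalg.solve(L, q_phi)      # L phi = q_phi
psi = np.linalg.solve(L_adj, q_psi)  # L* psi = q_psi

# Reciprocity: <psi, q_phi> = <q_psi, phi>.
detector_from_adjoint = psi @ q_phi
detector_from_forward = q_psi @ phi
print(lhs - rhs, detector_from_adjoint - detector_from_forward)
```

Both printed differences should vanish to rounding error. In practice this means that a detector response for many different forward sources can be obtained from a single adjoint solve, which is the property that adjoint sensitivity methods exploit.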
The system response generally depends on the model’s state functions and on the system parameters, which are considered to also include parameters that may specifically appear only in the definition of the response under consideration (but which may not appear in the definition of the model). Thus, the (physical) “system” is defined in this work to comprise both the system’s computational model and the system’s response. The system’s response will be denoted as $R[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}]$ and, in the most general case, is a nonlinear operator acting on the model’s forward and adjoint state functions, as well as on imprecisely known parameters, both directly and indirectly through the state functions. The nominal value of the response, $R[\boldsymbol{\varphi}^0(\mathbf{x}), \boldsymbol{\psi}^0(\mathbf{x}); \boldsymbol{\alpha}^0]$, is determined by using the nominal parameter values $\boldsymbol{\alpha}^0$, the nominal value $\boldsymbol{\varphi}^0(\mathbf{x})$ of the forward state function [obtained by solving Equations (3) and (4)] and the nominal value $\boldsymbol{\psi}^0(\mathbf{x})$ of the adjoint function [obtained by solving Equations (11) and (12)].
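A particularly important class of system responses comprises (scalar-valued) functionals of the forward and adjoint state functions. Such responses occur in many fields, including optimization, control, model verification, data assimilation, model validation and calibration, predictive modeling, etc. For example, the well-known Lagrangian functional is usually represented in the following form:

$$R_L[\boldsymbol{\varphi}, \boldsymbol{\psi}; \boldsymbol{\alpha}] \triangleq \langle \mathbf{q}_\psi, \boldsymbol{\varphi} \rangle_0 + \langle \boldsymbol{\psi}, \mathbf{q}_\varphi - \mathbf{L}\boldsymbol{\varphi} \rangle_0. \tag{14}$$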
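In particular, the Schwinger “normalization-free Lagrangian” is usually represented in the following form [1,2]:

$$R_S[\boldsymbol{\varphi}, \boldsymbol{\psi}; \boldsymbol{\alpha}] \triangleq \frac{\langle \boldsymbol{\psi}, \mathbf{q}_\varphi \rangle_0\, \langle \mathbf{q}_\psi, \boldsymbol{\varphi} \rangle_0}{\langle \boldsymbol{\psi}, \mathbf{L}\boldsymbol{\varphi} \rangle_0}. \tag{15}$$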
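When Equation (1) takes on the eigenvalue (or “separated”) form $\mathbf{L}\boldsymbol{\varphi} = \mu\boldsymbol{\varphi}$, which implies that the adjoint function $\boldsymbol{\psi}$ is the solution of the corresponding adjoint equation $\mathbf{L}^*\boldsymbol{\psi} = \mu\boldsymbol{\psi}$, the well-known Rayleigh quotient is usually represented in the following form:

$$\mu = \frac{\langle \boldsymbol{\psi}, \mathbf{L}\boldsymbol{\varphi} \rangle_0}{\langle \boldsymbol{\psi}, \boldsymbol{\varphi} \rangle_0}. \tag{16}$$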
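The reason the Rayleigh quotient is attractive as a response can be illustrated on a small matrix problem. The sketch below assumes a hypothetical 2×2 non-symmetric matrix `A` in place of the operator, with the Euclidean dot product as the inner product, so that the adjoint eigenproblem is the eigenproblem of the transpose. It checks that the quotient is independent of the normalization of both functions and that first-order errors in the forward and adjoint functions produce only a second-order error in the eigenvalue.

```python
import numpy as np

# Hypothetical non-symmetric operator with real, distinct eigenvalues.
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])

def dominant_eigpair(M):
    """Return the eigenvalue of largest real part and its eigenvector."""
    vals, vecs = np.linalg.eig(M)
    k = np.argmax(vals.real)
    return vals[k].real, vecs[:, k].real

lam, phi = dominant_eigpair(A)        # forward problem: A phi = lam phi
lam_adj, psi = dominant_eigpair(A.T)  # adjoint problem: A^T psi = lam psi

def rayleigh(psi_vec, phi_vec):
    """Rayleigh quotient <psi, A phi> / <psi, phi>."""
    return (psi_vec @ (A @ phi_vec)) / (psi_vec @ phi_vec)

# The quotient does not depend on how phi and psi are normalized.
exact = rayleigh(psi, phi)
rescaled = rayleigh(3.0 * psi, -7.0 * phi)

# Stationarity: an O(eps) error in both functions gives an O(eps^2) error.
d_phi, d_psi = np.array([1.0, -2.0]), np.array([0.5, 1.0])
errors = [abs(rayleigh(psi + eps * d_psi, phi + eps * d_phi) - lam)
          for eps in (1e-2, 1e-3)]
print(lam, exact, rescaled, errors)
```

Reducing the perturbation size by a factor of 10 should reduce the eigenvalue error by a factor of about 100, which is the numerical signature of a stationary functional.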
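Actually, any response can be represented in terms of functionals using the inner product underlying the Hilbert space in which the physical problem is formulated (in this case, $\mathsf{H}_0$). For example, a measurement of a physical quantity can be represented as a response $R_p[\boldsymbol{\varphi}(\mathbf{x}_p), \boldsymbol{\psi}(\mathbf{x}_p); \boldsymbol{\alpha}]$ which is located at a specific point, $\mathbf{x}_p$, in phase-space. Such a response can be represented mathematically as a functional of the following form:

$$R_p[\boldsymbol{\varphi}(\mathbf{x}_p), \boldsymbol{\psi}(\mathbf{x}_p); \boldsymbol{\alpha}] \triangleq \prod_{i=1}^{N_x} \int_{\lambda_i(\boldsymbol{\alpha})}^{\omega_i(\boldsymbol{\alpha})} R[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}]\, \delta(\mathbf{x} - \mathbf{x}_p)\, dx_i, \tag{17}$$

where $\delta(\mathbf{x} - \mathbf{x}_p)$ denotes the multidimensional Dirac-delta functional. Furthermore, a function-valued (operator) response $R_{op}[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}]$ can be represented by a spectral (in multidimensional orthogonal polynomials or Fourier series) expansion of the form:

$$R_{op}[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}] = \sum_{m_1} \cdots \sum_{m_{N_x}} c_{m_1 \ldots m_{N_x}}\, P_{m_1}(x_1) \cdots P_{m_{N_x}}(x_{N_x}), \tag{18}$$

where the quantities $P_{m_i}(x_i)$, $i=1,\ldots,N_x$, denote the corresponding spectral functions (e.g., orthogonal polynomials or Fourier exponential/trigonometric functions) and where the spectral (Fourier) coefficients $c_{m_1 \ldots m_{N_x}}$ are defined as follows:

$$c_{m_1 \ldots m_{N_x}} \triangleq \prod_{i=1}^{N_x} \int_{\lambda_i(\boldsymbol{\alpha})}^{\omega_i(\boldsymbol{\alpha})} R_{op}[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}]\, P_{m_1}(x_1) \cdots P_{m_{N_x}}(x_{N_x})\, dx_i. \tag{19}$$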
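The reduction of a function-valued response to a set of scalar coefficients can be sketched in one dimension. The example below assumes a hypothetical response `response(x)` on the interval [-1, 1] and uses Legendre polynomials as the spectral functions. Because the Legendre polynomials are orthogonal but not orthonormal, each coefficient carries the normalization factor (2n + 1)/2; the integrals are evaluated with Gauss-Legendre quadrature.

```python
import numpy as np
from numpy.polynomial import legendre

def response(x):
    """Hypothetical function-valued response on the interval [-1, 1]."""
    return np.exp(x) * np.cos(2.0 * x)

n_terms = 12
nodes, weights = legendre.leggauss(40)  # Gauss-Legendre quadrature rule

# c_n = (2n + 1)/2 * integral of R(x) P_n(x) dx; the prefactor normalizes P_n.
# np.eye(n_terms)[n] is the coefficient vector that selects P_n in legval.
coeffs = np.array([
    0.5 * (2 * n + 1) * np.sum(weights * response(nodes)
                                * legendre.legval(nodes, np.eye(n_terms)[n]))
    for n in range(n_terms)
])

# Rebuild the response from its scalar coefficients and measure the error.
x_test = np.linspace(-1.0, 1.0, 201)
reconstruction = legendre.legval(x_test, coeffs)
max_error = np.max(np.abs(reconstruction - response(x_test)))
print(max_error)
```

Each entry of `coeffs` is a scalar functional of the response, so a sensitivity analysis of the function-valued response amounts to a sensitivity analysis of these scalars, with the known polynomials carrying no parameter dependence.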
The coefficients $c_{m_1 \ldots m_{N_x}}$ can themselves be considered as system responses since the spectral polynomials $P_{m_i}(x_i)$ are perfectly well known while the expansion coefficients will contain all of the dependencies of the respective response on the imprecisely known model and response parameters. Consequently, the sensitivity analysis of operator-valued responses can be reduced to the sensitivity analysis of scalar-valued responses. The expressions in both Equations (17) and (19) are functionals of the forward and adjoint state functions and can be represented in the following general form:

$$R[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}] \triangleq \prod_{i=1}^{N_x} \int_{\lambda_i(\boldsymbol{\alpha})}^{\omega_i(\boldsymbol{\alpha})} S[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}; \mathbf{x}]\, dx_i, \tag{20}$$

where $S[\boldsymbol{\varphi}(\mathbf{x}), \boldsymbol{\psi}(\mathbf{x}); \boldsymbol{\alpha}; \mathbf{x}]$ denotes a suitably Gateaux- (G-) differentiable function of the indicated arguments. Specific examples that illustrate the effects of the first- and second-order sensitivities of responses of the form shown in Equations (14)–(20) to imprecisely known model and domain-boundary parameters are provided in [25,26,27,28,29,30,31,32,33,34].
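What G-differentiability of the integrand provides can be illustrated numerically. The sketch below is a minimal one-dimensional example under stated assumptions: the integrand is the hypothetical choice S = α φ² ψ, the domain is the fixed interval [0, 1] (so boundary contributions do not arise), and the nominal functions and variations are arbitrary. It compares the G-differential assembled from the partial derivatives of S with its definition as the derivative with respect to ε at ε = 0.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)

def integrate(f):
    """Trapezoidal rule on the fixed grid x."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def response(phi, psi, alpha):
    """R = integral of S dx with the hypothetical integrand S = alpha * phi^2 * psi."""
    return integrate(alpha * phi**2 * psi)

# Hypothetical nominal state functions and parameter value.
phi0, psi0, alpha0 = np.sin(np.pi * x), 1.0 + x, 2.0
# Arbitrary variations around the nominal point.
d_phi, d_psi, d_alpha = np.cos(3.0 * x), x**2, 0.3

# G-differential assembled from the partial derivatives of S.
analytic = integrate(2.0 * alpha0 * phi0 * psi0 * d_phi
                     + alpha0 * phi0**2 * d_psi
                     + d_alpha * phi0**2 * psi0)

# Definition: d/d(eps) of R at eps = 0, approximated by a central difference.
eps = 1e-5
numeric = (response(phi0 + eps * d_phi, psi0 + eps * d_psi, alpha0 + eps * d_alpha)
           - response(phi0 - eps * d_phi, psi0 - eps * d_psi, alpha0 - eps * d_alpha)
           ) / (2.0 * eps)
print(analytic, numeric)
```

The two values should agree closely. The G-differential is linear in the variations even though the response itself is nonlinear in the state functions and in the parameter.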
The generic response defined in Equation (20) provides the basis for constructing any other responses of specific interest, and will therefore be used for the generic “Fourth-Order Comprehensive Adjoint Sensitivity Analysis Methodology (4th-CASAM) for Linear Systems” to be developed in the remainder of this work. Note that the generic response defined in Equation (20) is, in general, a nonlinear function of all of its arguments, i.e., $S$ is nonlinear in $\boldsymbol{\varphi}$, $\boldsymbol{\psi}$, and $\boldsymbol{\alpha}$.