Perspective

Perspective on Predictive Modeling: Current Status, New High-Order Methodology and Outlook for Energy Systems

by
Dan Gabriel Cacuci
Center for Nuclear Science and Energy, University of South Carolina, Columbia, SC 29208, USA
Energies 2023, 16(2), 933; https://doi.org/10.3390/en16020933
Submission received: 30 October 2022 / Revised: 5 December 2022 / Accepted: 10 January 2023 / Published: 13 January 2023

Abstract
This work presents a perspective on deterministic predictive modeling methodologies, which aim at extracting best-estimate values for model responses and parameters along with reduced predicted uncertainties for these best-estimate values. The two oldest such methodologies are the data-adjustment method, which stems from the nuclear energy field, and the data-assimilation method, which is implemented in the geophysical sciences. Both of these methodologies attempt to minimize, in the least-square sense, a user-defined functional that represents the discrepancies between computed and measured model responses. These two methodologies were briefly reviewed and shown to be inconsistent even to first-order in the sensitivities of the response to the model parameters. In contrast to these methodologies, it was shown that the “maximum entropy”-based predictive modeling methodology (called BERRU-PM) that was developed by the author not only dispenses with the subjective “user-chosen functional to be minimized” but is also inherently amenable to high-order formulations. This inherent potential was illustrated by presenting a novel, higher-order, MaxEnt-based predictive modeling methodology, labelled BERRU-PM-2+, which is complete and exact to second-order sensitivities and moments of both the a priori and posterior distributions of responses and parameters, while explicitly including third- and fourth-order sensitivities and correlations, thus indicating the mechanism for incorporating information of orders higher than second in predictive modeling. The presentation of this new predictive modeling methodology also aims at motivating a widespread application of predictive modeling principles and methodologies in the energy sciences for obtaining best-estimate results with reduced uncertainties.

1. Introduction

The modeling of a physical system includes the following elements:
(a)
a mathematical model comprising equations that relate the system’s independent variables and parameters to the system’s state (i.e., dependent) variables;
(b)
inequality and/or equality constraints that delimit the ranges of the system’s parameters;
(c)
one or several computational results, customarily referred to as “system responses” (or objective functions, or indices of performance), which are computed using the mathematical model; and
(d)
experimentally measured responses, with their respective nominal (mean) values and uncertainties (variances, covariances, skewness, kurtosis, etc.).
The results of measurements and computations are never perfectly accurate but are subject to unavoidable errors. Thus, the results of measurements are inevitably influenced by experimental errors stemming from imperfect instruments or imperfectly known calibration standards. Around any reported experimental value, therefore, there always exists a range of values that may also be plausibly representative of the true but unknown value of the measured quantity. Similarly, computations are subject to errors stemming from uncertain model parameters, initial and boundary conditions, imperfectly known physical processes, imperfectly known physical geometry caused by imperfect material boundaries and, finally, inexact numerical computations. Therefore, knowledge of just the nominal values of experimentally measured or computed quantities is insufficient for determining the reliability of results in practical applications. The quantitative uncertainties accompanying measurements and computations are also needed, along with the respective nominal values.
The uncertainties inherent to experimental and computational results provide the basic motivation for performing quantitative model verification, validation, qualification, and predictive estimation. The activity of “code/model verification” addresses the question “are the equations underlying the mathematical model solved correctly?” The activity of “code/model validation” addresses the question “does the respective model represent reality?” Model validation requires benchmarking the respective model against independently obtained experimental results, including all accompanying uncertainties (computational, experimental, etc.) to quantify the respective model’s accuracy by comparing the computational results produced by the model, including their accompanying statistical characteristics (standard deviations, skewness, kurtosis, and corresponding correlations) with the corresponding experimental results. The activity “code/model qualification” addresses the certification that a proposed simulation and/or design methodology satisfies all performance and safety specifications regarding the system under consideration.
In the author’s view, “predictive modeling” comprises three key elements, namely: model calibration, model extrapolation, and estimation of the validation domain. Model calibration addresses the integration of experimental data for the purpose of updating the parameters underlying the computational model. Important components include the estimation of inconsistencies in the experimental data, and quantification of the biases between model predictions and experimental data. The state-of-the-art of model calibration is fairly well developed, but current methods are still hampered in practice by the significant computational effort required for performing exhaustive (i.e., including all of the sensitivities of model responses to the model’s parameters) model calibration for large-scale models. Reducing the computational effort is paramount, and methods based on adjoint models show great promise in this regard. “Model extrapolation” addresses the quantification of the uncertainties in predictions under new conditions, including both untested regions of the parameter space and higher levels of system complexity in the validation hierarchy. Extrapolation of models and the resulting increase of uncertainty are poorly understood, particularly the estimation of uncertainty that results from nonlinear coupling of two or more physical phenomena that were not coupled in the existing validation database. The quantification of the validation domain underlying the models of interest requires estimation of contours of constant uncertainty in the high-dimensional space that characterizes the application of interest. In practice, this involves the identification of areas where the predictive estimation of uncertainty meets specified requirements for the performance, reliability, or safety of the system of interest. The conceptual and mathematical development of methods for quantifying the validation domain is in an incipient stage.
The earliest activities aimed at extracting best-estimate values for model parameters were initiated in the 1960s [1,2,3] in the nuclear energy field. These activities aimed at evaluating neutron cross sections by using time-independent reactor physics experiments for measuring “integral quantities” (i.e., model responses) such as reaction rates and multiplication factors. These activities reached conceptual maturity under the name of “cross section adjustment” [4,5,6,7,8], which used a least-square procedure, weighted with first-order response sensitivities, for combining uncertainties in the model responses with uncertainties in the corresponding experimental data. The resulting “adjusted” neutron cross sections (model parameters) and their “adjusted” uncertainties were subsequently employed in the respective reactor core model to predict improved model responses (e.g., improved reaction rates and reaction rate ratios, reactor multiplication factors, Doppler coefficients) in an extended application domain (e.g., a new or improved reactor core design). Since the neutron transport (or diffusion) equation is linear in the neutron flux (dependent variable), it admits a corresponding adjoint transport (or diffusion) equation for the adjoint neutron flux, which was in turn used to compute efficiently the first-order response sensitivities, which appeared as weighting functions in the least squares adjustment procedure. The “data-adjustment” methodology can thus be considered to have been the earliest systematic methodology that embodies principles of “predictive modeling”. The principles underlying this data-adjustment methodology for the linear, time-independent nuclear energy systems modeled by the neutron transport equation are briefly summarized in Section 2.1.
A pioneering formalism for performing data adjustment in the context of time-dependent nonlinear systems was conceived by Cacuci’s group at Oak Ridge National Laboratory [9], using the adjoint method of sensitivity analysis for nonlinear systems conceived and developed by Cacuci [10], who had formally introduced the concepts of sensitivity analysis by using concepts of nonlinear functional analysis. The data-adjustment formalism for time-dependent nonlinear systems [9] also relies on the minimization, in the least-square sense, of a user-defined functional; its underlying principles are presented in Section 2.2.
In the late 1980s and during the 1990s, the fundamental concepts underlying “data adjustment” seem to have been rediscovered in the atmospheric and geophysical sciences while developing the so-called “data assimilation” procedure, in that the concepts underlying data assimilation are the same as those underlying the previously developed “data adjustment” procedure, relying on the minimization, in the least-square sense, of a user-defined “cost functional”. The principles underlying data assimilation procedures are amply described in the books by Lewis et al. [11], Lahoz et al. [12], Farago et al. [13] and Cacuci et al. [14]. These principles are briefly summarized in Section 2.3, in order to facilitate comparisons with the other procedures that aim at the same goals.
In contradistinction to the methods used for data adjustment and/or assimilation, the BERRU predictive modeling methodology developed by Cacuci [15,16] dispenses with the need to minimize an a priori chosen “cost functional” (usually a quadratic functional that represents the weighted errors between measured and computed responses), by employing the “maximum entropy” (MaxEnt) principle [17] to combine computational and experimental information for obtaining best-estimate predicted mean values for model responses and parameters, together with reduced predicted uncertainties for these best-estimate values. The BERRU predictive modeling methodology [15,16] incorporates first-order sensitivities of model responses with respect to the model parameters, computed using the adjoint method [10].
The conception of the second- and higher-order comprehensive adjoint sensitivity analysis methodology by Cacuci [18,19,20,21] has enabled the efficient computation of exact expressions for arbitrarily high-order sensitivities, thus opening the path to further advances of all of the activities which could use such higher-order sensitivities. In particular, the availability of the second- and higher-order model response sensitivities to model parameters has also enabled the extension of the BERRU predictive modeling methodology from first-order to higher orders. This extended, novel MaxEnt-based predictive modeling formalism is labeled the “BERRU-PM-2+” methodology because it incorporates not only all of the possible second-order sensitivities and correlations, but also provides the mechanism for incorporating arbitrarily high-order sensitivities of model responses with respect to model parameters. The novel “BERRU-PM-2+” methodology is presented in Section 3 of this work.
Section 4 concludes this work by presenting a comparative discussion of the deterministic methodologies currently available in the context of their abilities to predict best-estimate mean values and uncertainties (variances/covariances) for calibrating and validating large-scale models of energy-related physical systems.

2. Traditional Least-Square Based Deterministic Predictive Modeling Methodologies

This Section briefly reviews the traditional deterministic methodologies which comprise elements of predictive modeling, as follows:
(i)
The so-called “data-adjustment methodology”, which is perhaps the oldest such methodology in use and was developed in the nuclear energy field. The initial “data-adjustment methodology”, which was developed in the 1960s for the large-scale time-independent linear systems modeled by the linear neutral particle transport equation, is briefly reviewed in Section 2.1. The data-adjustment methodology for time-dependent nonlinear systems, which could be considered to be the predecessor for the so-called “4D-Var” methodology mentioned below, is reviewed in Section 2.2.
(ii)
The so-called “data assimilation” methodology, which is implemented for the large-scale time-dependent systems encountered in the atmospheric and geophysical field, is briefly reviewed in Section 2.3.

2.1. Time-Independent Least-Squares Based Data Adjustment

The latest versions of the “data-adjustment” methodology for time-independent systems are described in the leading nuclear engineering code-packages SCALE [22] and CRISTAL [23]. This data-adjustment methodology is intended for assimilating experimental information in order to “adjust” (calibrate) model parameters (namely nuclear cross sections) describing linear reactor physics problems modeled by the linear Boltzmann equation. The data-adjustment methodology relies on a “Generalized Linear Least Squares Methodology (GLLSM)”, which involves minimizing the following user-defined Lagrangian functional, denoted below as $L(\mathbf{y}, \mathbf{z})$:
$$\min_{\mathbf{y}, \mathbf{z}} L(\mathbf{y}, \mathbf{z}); \quad L(\mathbf{y}, \mathbf{z}) \equiv (\mathbf{y}^\dagger, \mathbf{z}^\dagger)\begin{pmatrix} \mathbf{C}_{mm} & \mathbf{C}_{m\alpha} \\ \mathbf{C}_{\alpha m} & \mathbf{C}_{\alpha\alpha} \end{pmatrix}^{-1}\begin{pmatrix} \mathbf{y} \\ \mathbf{z} \end{pmatrix} + 2\boldsymbol{\lambda}^\dagger\left(\mathbf{S}_k\,\mathbf{z} + \mathbf{d} - \mathbf{y}\right). \tag{1}$$
The quantities appearing in the functional $L(\mathbf{y}, \mathbf{z})$ defined in Equation (1) have the following meanings/definitions:
(i)
The (symmetric) relative covariance matrix for the model parameters, $\mathbf{C}_{\alpha\alpha}$, is defined as follows:
$$\mathbf{C}_{\alpha\alpha} \equiv \left[\frac{\mathrm{cov}(\alpha_n, \alpha_p)}{\alpha_n\,\alpha_p}\right]; \quad n, p = 1, \ldots, M, \tag{2}$$
where $\alpha_n$ denotes the $n$th model parameter (cross section); $\mathrm{cov}(\alpha_n, \alpha_p)$ denotes the covariance between the two respective model parameters; $M$ denotes the total number of model parameters.
(ii)
The (symmetric) relative covariance matrix for the measured model responses (effective multiplication factors), $\mathbf{C}_{mm}$, is defined as follows:
$$\mathbf{C}_{mm} \equiv \left[\frac{m_i}{k_i}\,\frac{\mathrm{cov}(m_i, m_j)}{m_i\,m_j}\,\frac{m_j}{k_j}\right]; \quad i, j = 1, \ldots, I, \tag{3}$$
where $k_i$ denotes the nominal value of the computed effective multiplication factor, obtained using the nominal values of the model parameters (cross sections); $m_i$ denotes the nominal value of the measured effective multiplication factor; $\mathrm{cov}(m_i, m_j)$ denotes the covariance between two measured effective multiplication factors; $I$ denotes the total number of model responses (measured or computed effective multiplication factors).
(iii)
The rectangular matrix $\mathbf{C}_{\alpha m}$ contains as elements the relative covariances between the measured responses and the model parameters, and is defined as follows:
$$\mathbf{C}_{\alpha m} \equiv \left[\frac{\mathrm{cov}(\alpha_n, m_i)}{\alpha_n\,m_i}\,\frac{m_i}{k_i}\right]; \quad n = 1, \ldots, M; \; i = 1, \ldots, I; \quad \mathbf{C}_{m\alpha} \equiv \mathbf{C}_{\alpha m}^\dagger. \tag{4}$$
(iv)
Each model response, denoted as $k_i(\boldsymbol{\alpha})$, is considered to be a linear function of the model parameters, denoted as $\boldsymbol{\alpha} \equiv (\alpha_1, \ldots, \alpha_M)^\dagger$, having the following form:
$$k_i(\boldsymbol{\alpha} + \delta\boldsymbol{\alpha}) = k_i(\boldsymbol{\alpha})\left[1 + \sum_{n=1}^{M} S_n^i\,\frac{\delta\alpha_n}{\alpha_n}\right]; \quad i = 1, \ldots, I; \quad \delta\boldsymbol{\alpha} \equiv (\delta\alpha_1, \ldots, \delta\alpha_M)^\dagger, \tag{5}$$
where $\delta\alpha_n$ denotes an arbitrary perturbation in the $n$th model parameter, and where $S_n^i$ represents the relative sensitivity of the $i$th response to the $n$th parameter, computed at the nominal parameter values, and is defined as follows:
$$S_n^i \equiv \frac{\partial k_i}{\partial\alpha_n}\,\frac{\alpha_n}{k_i}; \quad i = 1, \ldots, I; \; n = 1, \ldots, M. \tag{6}$$
The linear relationship representing the model response as a function of the model parameters is re-written in the following form:
$$\mathbf{y} = \mathbf{d} + \mathbf{S}_k\,\mathbf{z}, \tag{7}$$
where
$$\mathbf{S}_k \equiv \left[S_n^i\right] \equiv \left[\frac{\partial k_i}{\partial\alpha_n}\,\frac{\alpha_n}{k_i}\right]; \quad i = 1, \ldots, I; \; n = 1, \ldots, M; \tag{8}$$
$$\mathbf{z} \equiv (z_1, \ldots, z_M)^\dagger; \quad z_n \equiv \frac{\delta\alpha_n}{\alpha_n}; \tag{9}$$
$$\mathbf{d} \equiv (d_1, \ldots, d_I)^\dagger; \quad d_i \equiv \frac{k_i(\boldsymbol{\alpha}) - m_i}{k_i(\boldsymbol{\alpha})}; \tag{10}$$
$$\mathbf{y} \equiv (y_1, \ldots, y_I)^\dagger; \quad y_i \equiv \frac{k_i(\boldsymbol{\alpha} + \delta\boldsymbol{\alpha}) - m_i}{k_i(\boldsymbol{\alpha})}. \tag{11}$$
(v)
The (column) vector of Lagrange multipliers $\boldsymbol{\lambda} \equiv (\lambda_1, \ldots, \lambda_I)^\dagger$, which appears in the functional $L(\mathbf{y}, \mathbf{z})$ defined in Equation (1), enforces the linear model defined by Equation (7) as a “hard constraint.”
The “adjusted” parameter and response values are obtained by solving the following equations, which are to be satisfied at the minimum of the functional $L(\mathbf{y}, \mathbf{z})$:
$$\frac{\partial L(\mathbf{y}, \mathbf{z})}{\partial\mathbf{z}} = \frac{\partial L(\mathbf{y}, \mathbf{z})}{\partial\mathbf{y}} = \mathbf{0}. \tag{12}$$
The solution of Equation (12) yields the following relations for the “adjusted” parameter and response values:
$$\mathbf{z} = \left(\mathbf{C}_{\alpha m} - \mathbf{C}_{\alpha\alpha}\,\mathbf{S}_k^\dagger\right)\mathbf{C}_d^{-1}\,\mathbf{d}, \tag{13}$$
$$\mathbf{y} = \left(\mathbf{C}_{mm} - \mathbf{C}_{m\alpha}\,\mathbf{S}_k^\dagger\right)\mathbf{C}_d^{-1}\,\mathbf{d}, \tag{14}$$
$$\mathbf{C}_d \equiv \mathbf{S}_k\,\mathbf{C}_{\alpha\alpha}\,\mathbf{S}_k^\dagger + \mathbf{C}_{mm} - \mathbf{S}_k\,\mathbf{C}_{\alpha m} - \mathbf{C}_{m\alpha}\,\mathbf{S}_k^\dagger. \tag{15}$$
The “adjusted” covariance matrix, $\mathbf{C}_{mm}^{adj}$, for the “adjusted” responses (measured or computed) and the “adjusted” covariance matrix, $\mathbf{C}_{\alpha\alpha}^{adj}$, for the “adjusted” parameters, respectively, are obtained by using Equations (13) and (14), which yield the following expressions:
$$\mathbf{C}_{mm}^{adj} = \mathbf{C}_{mm} - \left(\mathbf{C}_{mm} - \mathbf{C}_{m\alpha}\,\mathbf{S}_k^\dagger\right)\mathbf{C}_d^{-1}\left(\mathbf{C}_{mm} - \mathbf{S}_k\,\mathbf{C}_{\alpha m}\right), \tag{16}$$
$$\mathbf{C}_{\alpha\alpha}^{adj} = \mathbf{C}_{\alpha\alpha} - \left(\mathbf{C}_{\alpha m} - \mathbf{C}_{\alpha\alpha}\,\mathbf{S}_k^\dagger\right)\mathbf{C}_d^{-1}\left(\mathbf{C}_{m\alpha} - \mathbf{S}_k\,\mathbf{C}_{\alpha\alpha}\right). \tag{17}$$
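The adjustment algebra above reduces to a handful of matrix products and one matrix inversion. The following minimal sketch illustrates Equations (13)–(17); all dimensions and data below are hypothetical stand-ins (randomly generated, positive-definite covariances) rather than actual nuclear data. Note that the adjusted values recover the hard constraint of Equation (7) identically.
```python
# Minimal sketch of the GLLSM adjustment formulas, Equations (13)-(17).
# All inputs (dimensions, covariances, sensitivities, deviations) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
M, I = 5, 3  # assumed numbers of parameters and of measured responses

S_k = rng.normal(size=(I, M))              # relative sensitivities S_n^i, Eq. (6)
A = rng.normal(size=(M, M))
C_aa = A @ A.T + M * np.eye(M)             # relative parameter covariances, Eq. (2)
B = rng.normal(size=(I, I))
C_mm = B @ B.T + I * np.eye(I)             # relative response covariances, Eq. (3)
C_am = np.zeros((M, I))                    # parameter-response covariances, Eq. (4)
C_ma = C_am.T
d = rng.normal(size=I)                     # relative deviations d_i, Eq. (10)

# Eq. (15): covariance matrix of the deviation vector d
C_d = S_k @ C_aa @ S_k.T + C_mm - S_k @ C_am - C_ma @ S_k.T
C_d_inv = np.linalg.inv(C_d)

# Eqs. (13)-(14): adjusted relative parameter and response deviations
z = (C_am - C_aa @ S_k.T) @ C_d_inv @ d
y = (C_mm - C_ma @ S_k.T) @ C_d_inv @ d

# Eqs. (16)-(17): adjusted ("reduced") covariance matrices
C_mm_adj = C_mm - (C_mm - C_ma @ S_k.T) @ C_d_inv @ (C_mm - S_k @ C_am)
C_aa_adj = C_aa - (C_am - C_aa @ S_k.T) @ C_d_inv @ (C_ma - S_k @ C_aa)

# The hard constraint y = d + S_k z, Eq. (7), is satisfied by construction:
assert np.allclose(y, d + S_k @ z)
```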
Care must be exercised, however, since the indiscriminate incorporation of all seemingly relevant experimental-response data could produce a set of calibrated (adjusted) parameter values that differ unreasonably from the corresponding original nominal values and might even fail to improve the agreement between the calculated and measured values of some of the very responses that were used to calibrate the model parameters. When calibrating (adjusting) a library of model parameters, it is tacitly assumed that the given parameters are basically “correct”, in that their true values lie within some reasonable uncertainty of their reported values. However, calibration can be used to increase the accuracy of a model beyond that attainable from the reported parameter values alone. The calibration procedure uses additional data (e.g., experimentally measured responses) for improving the parameter values while reducing their uncertainties. Although such additional information induces modifications of the original parameter values, the adjusted parameters are still generally expected to remain consistent with their original nominal values, within the range of their original uncertainties. However, calibration of a library of model parameters by experimental responses which deviate significantly from their respective computed values would significantly modify the resulting adjusted parameters, perhaps even violating the restriction of linearity expressed by Equation (7). Such unlikely adjustments would most probably fail even to reproduce the original experimental responses.
On the other hand, calibrating model parameters by using measured responses that are very close to their respective computed values would cause minimal parameter modifications and a nearly perfect reproduction of the given responses by the calibrated parameters, as would be expected. In such a case, the given responses would be considered to be consistent with the parameter library, in contradistinction to adjustment by inconsistent experimental information, in which case the adjustment (calibration) could fail because of inconsistencies. These considerations clearly underscore the need for a quantitative indicator to measure the mutual and joint consistency of the information available for model calibration.

2.2. Time-Dependent Least-Squares Based Data Adjustment

The time-dependent least-squares based data-adjustment methodology [9] was developed in the early 1980s to analyze time-dependent problems of fundamental interest in nuclear engineering (e.g., thermal-hydraulics transient computations and experiments). The underlying concepts are as above, and the transient phenomena can be analyzed either globally (“off-line”), when the entire information during the time interval of interest is analyzed at once, or locally, when the data assimilation/adjustment is performed one time-step at a time, as is nowadays done in the atmospheric sciences.
The global (“off-line”) methodology can be conveniently summarized by using the notation provided in [16], which considers that the time span of interest for the nonlinear transient system being analyzed is partitioned into $N_t \ge 1$ intervals. At every time instance $\nu$, the quantities $J_\alpha^\nu$ and $J_r^\nu$ are used to denote the ordered sets of integers $J_\alpha^\nu \equiv \{1, \ldots, N_\alpha^\nu\}$ and, respectively, $J_r^\nu \equiv \{1, \ldots, N_r^\nu\}$, where $N_\alpha^\nu$ and $N_r^\nu$ denote the numbers of distinct system parameters and distinct responses, respectively, at time instance $\nu$. The quantity $J_t$ denotes the set $J_t \equiv \{1, \ldots, N_t\}$.
At every time instance $\nu$, the imprecisely known system parameters are considered to be variates with mean values denoted as $\alpha_{i,0}^\nu$. The covariance between two parameters $\alpha_i^\nu$ and $\alpha_j^\mu$, at two time instances $\nu$ and $\mu$, is denoted as $\mathrm{cov}(\alpha_i^\nu, \alpha_j^\mu)$. These local covariances are used to construct the following symmetric global parameter covariance matrix:
$$\mathbf{C}_\alpha \equiv \begin{pmatrix} \mathbf{C}_\alpha^{11} & \cdots & \mathbf{C}_\alpha^{1N_t} \\ \vdots & \ddots & \vdots \\ \mathbf{C}_\alpha^{N_t1} & \cdots & \mathbf{C}_\alpha^{N_tN_t} \end{pmatrix}; \quad \mathbf{C}_\alpha^{\mu\nu} \equiv \left[\mathrm{cov}(\alpha_i^\nu, \alpha_j^\mu)\right]; \; i, j \in J_\alpha^\nu; \; \nu, \mu \in J_t. \tag{18}$$
Similarly, the imprecisely known measured responses are characterized by local mean values denoted as $r_i^\nu$ at a time instance $\nu$. The covariances between two responses measured at time instances $\nu$ and $\mu$ are denoted as $\mathrm{cov}(r_i^\nu, r_j^\mu)$. These covariances are used to construct the symmetric global matrix, $\mathbf{C}_m$, for the measured responses, defined below:
$$\mathbf{C}_m \equiv \begin{pmatrix} \mathbf{C}_m^{11} & \cdots & \mathbf{C}_m^{1N_t} \\ \vdots & \ddots & \vdots \\ \mathbf{C}_m^{N_t1} & \cdots & \mathbf{C}_m^{N_tN_t} \end{pmatrix}; \quad \mathbf{C}_m^{\mu\nu} \equiv \left[\mathrm{cov}(r_i^\nu, r_j^\mu)\right]; \; i, j \in J_r^\nu; \; \nu, \mu \in J_t. \tag{19}$$
In the most general case, the measured responses may be correlated to the parameters through local response-parameter covariances, denoted as $\mathrm{cov}(\alpha_i^\nu, r_j^\mu)$, between a parameter $\alpha_i^\nu$ and a response $r_j^\mu$; these are used to construct the global rectangular parameter-response covariance matrix, $\mathbf{C}_{\alpha r}$, having the following form:
$$\mathbf{C}_{\alpha r} \equiv \begin{pmatrix} \mathbf{C}_{\alpha r}^{11} & \cdots & \mathbf{C}_{\alpha r}^{1N_t} \\ \vdots & \ddots & \vdots \\ \mathbf{C}_{\alpha r}^{N_t1} & \cdots & \mathbf{C}_{\alpha r}^{N_tN_t} \end{pmatrix}; \quad \mathbf{C}_{\alpha r}^{\mu\nu} \equiv \left[\mathrm{cov}(\alpha_i^\nu, r_j^\mu)\right]; \; i \in J_\alpha^\nu; \; j \in J_r^\nu; \; \nu, \mu \in J_t. \tag{20}$$
At any given time instance $\nu$, a response $r_i^\nu$ can be a function not only of the system parameters at time instance $\nu$, but also of the system parameters at all previous time instances $\mu$, $1 \le \mu \le \nu$; this means that $\mathbf{r}^\nu = \mathbf{R}^\nu(\mathbf{p}^\nu)$, where $\mathbf{p}^\nu \equiv (\boldsymbol{\alpha}^1, \ldots, \boldsymbol{\alpha}^\mu, \ldots, \boldsymbol{\alpha}^\nu)$. As in the previous Subsection, the computed response is considered to depend linearly on the model parameters, i.e., the computed response is linearized via a functional Taylor-series expansion around the nominal values, $\mathbf{p}_0^\nu \equiv (\boldsymbol{\alpha}_0^1, \ldots, \boldsymbol{\alpha}_0^\mu, \ldots, \boldsymbol{\alpha}_0^\nu)$, of the parameters $\mathbf{p}^\nu$, as follows:
$$\mathbf{r}^\nu \cong \mathbf{R}^\nu(\mathbf{p}^\nu) = \mathbf{R}^\nu(\mathbf{p}_0^\nu) + \sum_{\mu=1}^{\nu}\mathbf{S}^{\nu\mu}(\mathbf{p}_0^\mu)\left(\boldsymbol{\alpha}^\mu - \boldsymbol{\alpha}_0^\mu\right) + \textit{higher-order terms}, \quad \nu \in J_t, \tag{21}$$
where $\mathbf{R}^\nu(\mathbf{p}_0^\nu)$ denotes the vector of computed responses at a time instance $\nu$, at the nominal parameter values $\mathbf{p}_0^\nu$, while $\mathbf{S}^{\nu\mu}(\mathbf{p}_0^\mu)$, $1 \le \mu \le \nu$, represents the $(N_r^\nu \times N_\alpha^\mu)$-dimensional matrix containing the first-order Gateaux-derivatives of the computed responses with respect to the parameters, defined as follows:
$$\mathbf{S}^{\nu\mu}(\mathbf{p}_0^\mu) \equiv \begin{pmatrix} s_{11}^{\nu\mu} & \cdots & s_{1N}^{\nu\mu} \\ \vdots & s_{in}^{\nu\mu} & \vdots \\ s_{I1}^{\nu\mu} & \cdots & s_{IN}^{\nu\mu} \end{pmatrix} \equiv \begin{pmatrix} \dfrac{\partial R_1^\nu(\mathbf{p}_0^\mu)}{\partial\alpha_1^\mu} & \cdots & \dfrac{\partial R_1^\nu(\mathbf{p}_0^\mu)}{\partial\alpha_N^\mu} \\ \vdots & \dfrac{\partial R_i^\nu}{\partial\alpha_n^\mu} & \vdots \\ \dfrac{\partial R_I^\nu(\mathbf{p}_0^\mu)}{\partial\alpha_1^\mu} & \cdots & \dfrac{\partial R_I^\nu(\mathbf{p}_0^\mu)}{\partial\alpha_N^\mu} \end{pmatrix}, \quad 1 \le \mu \le \nu, \tag{22}$$
where $I \equiv N_r^\nu$ and $N \equiv N_\alpha^\mu$.
Since the response $\mathbf{R}^\nu(\mathbf{p}^\nu)$ at time instance $\nu$ can depend only on parameters $\boldsymbol{\alpha}^\mu$ which appear up to the current time instance $\nu$, it follows that $\mathbf{S}^{\nu\mu} = \mathbf{0}$ when $\mu > \nu$, implying that non-zero terms in the expansion shown in Equation (21) can only occur in the range $1 \le \mu \le \nu$. The linear model expressed by Equation (21) can be written in the form
$$\mathbf{r} = \mathbf{R}(\boldsymbol{\alpha}_0) + \mathbf{S}\left(\boldsymbol{\alpha} - \boldsymbol{\alpha}_0\right) + \textit{higher-order terms}, \tag{23}$$
where
$$\boldsymbol{\alpha} \equiv (\boldsymbol{\alpha}^1, \ldots, \boldsymbol{\alpha}^\mu, \ldots, \boldsymbol{\alpha}^{N_t})^\dagger; \quad \boldsymbol{\alpha}^\nu \equiv (\alpha_n^\nu \,|\, n \in J_\alpha^\nu), \; \nu \in J_t; \quad \mathbf{r} \equiv (\mathbf{r}^1, \ldots, \mathbf{r}^\mu, \ldots, \mathbf{r}^{N_t})^\dagger; \quad \mathbf{r}^\nu \equiv (r_i^\nu \,|\, i \in J_r^\nu), \; \nu \in J_t; \quad \mathbf{R}(\boldsymbol{\alpha}_0) \equiv (\mathbf{R}^1, \ldots, \mathbf{R}^\mu, \ldots, \mathbf{R}^{N_t})^\dagger; \tag{24}$$
$$\mathbf{S} \equiv \begin{pmatrix} \mathbf{S}^{11} & & \mathbf{0} \\ \vdots & \ddots & \\ \mathbf{S}^{N_t1} & \cdots & \mathbf{S}^{N_tN_t} \end{pmatrix}. \tag{25}$$
The information provided in Equations (18)–(25) is now used to construct the following Lagrangian functional $P(\mathbf{z})$ to be minimized:
$$P(\mathbf{z}) \equiv Q(\mathbf{z}) + 2\boldsymbol{\lambda}^\dagger\left[\mathbf{Z}(\boldsymbol{\alpha}_0)\,\mathbf{z} + \mathbf{d}\right] = \min, \quad \text{at } \mathbf{z} = \mathbf{z}^{be}, \tag{26}$$
where $\boldsymbol{\lambda} \equiv (\boldsymbol{\lambda}^1, \ldots, \boldsymbol{\lambda}^\nu, \ldots, \boldsymbol{\lambda}^{N_t})^\dagger$ denotes the corresponding vector of Lagrange multipliers and where
$$Q(\mathbf{z}) \equiv \mathbf{z}^\dagger\,\mathbf{C}^{-1}\,\mathbf{z}; \quad \mathbf{z} \equiv \begin{pmatrix} \boldsymbol{\alpha} - \boldsymbol{\alpha}_0 \\ \mathbf{r} - \mathbf{r}_m \end{pmatrix}; \quad \mathbf{C} \equiv \begin{pmatrix} \mathbf{C}_\alpha & \mathbf{C}_{\alpha r} \\ \mathbf{C}_{r\alpha} & \mathbf{C}_m \end{pmatrix}; \quad \boldsymbol{\alpha}_0 \equiv (\boldsymbol{\alpha}_0^1, \ldots, \boldsymbol{\alpha}_0^\mu, \ldots, \boldsymbol{\alpha}_0^{N_t})^\dagger; \tag{27}$$
$$\mathbf{d} \equiv \mathbf{R}(\boldsymbol{\alpha}_0) - \mathbf{r}_m; \quad \mathbf{r}_m \equiv (\mathbf{r}_m^1, \ldots, \mathbf{r}_m^\mu, \ldots, \mathbf{r}_m^{N_t})^\dagger; \quad \mathbf{Z} \equiv \left(\mathbf{S}, -\mathbf{U}\right); \quad \mathbf{U} \equiv \begin{pmatrix} \mathbf{I}^{11} & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{I}^{N_tN_t} \end{pmatrix}. \tag{28}$$
In Equation (28), the quantities $\mathbf{I}^{jj}$, $j = 1, \ldots, N_t$, denote identity matrices of corresponding dimensions.
Setting $\partial P(\mathbf{z})/\partial\mathbf{z} = \mathbf{0}$ and solving the resulting equations yields the following results, which have the same formal structure as shown in Equations (13)–(17):
(i)
calibrated best-estimate parameter values:
$$\begin{pmatrix} \boldsymbol{\alpha}_{be}^1 \\ \vdots \\ \boldsymbol{\alpha}_{be}^{N_t} \end{pmatrix} = \begin{pmatrix} \boldsymbol{\alpha}_0^1 \\ \vdots \\ \boldsymbol{\alpha}_0^{N_t} \end{pmatrix} + \begin{pmatrix} \mathbf{C}_{\alpha r}^{11} - \mathbf{C}_\alpha^{11}\mathbf{S}^{11\dagger} & \cdots & \mathbf{C}_{\alpha r}^{1N_t} - \sum_{\rho=1}^{N_t}\mathbf{C}_\alpha^{1\rho}\mathbf{S}^{N_t\rho\dagger} \\ \vdots & \ddots & \vdots \\ \mathbf{C}_{\alpha r}^{N_t1} - \mathbf{C}_\alpha^{N_t1}\mathbf{S}^{11\dagger} & \cdots & \mathbf{C}_{\alpha r}^{N_tN_t} - \sum_{\rho=1}^{N_t}\mathbf{C}_\alpha^{N_t\rho}\mathbf{S}^{N_t\rho\dagger} \end{pmatrix} \times \begin{pmatrix} \sum_{\eta=1}^{N_t}\mathbf{K}_d^{1\eta}\mathbf{d}^\eta \\ \vdots \\ \sum_{\eta=1}^{N_t}\mathbf{K}_d^{N_t\eta}\mathbf{d}^\eta \end{pmatrix}, \tag{29}$$
where $\mathbf{K}_d^{\nu\eta}$ denotes the corresponding $(\nu, \eta)$-element of the block-matrix $\mathbf{C}_d^{-1}$, where:
$$\mathbf{C}_d \equiv \begin{pmatrix} \mathbf{C}_d^{11} & \cdots & \mathbf{C}_d^{1N_t} \\ \vdots & \ddots & \vdots \\ \mathbf{C}_d^{N_t1} & \cdots & \mathbf{C}_d^{N_tN_t} \end{pmatrix}; \quad \mathbf{C}_d^{\nu\mu} = \mathbf{C}_{rc}^{\nu\mu} + \mathbf{C}_m^{\nu\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_{r\alpha}^{\nu\rho}\,\mathbf{S}^{\mu\rho\dagger} - \sum_{\rho=1}^{\nu}\mathbf{S}^{\nu\rho}\,\mathbf{C}_{\alpha r}^{\rho\mu}; \quad \nu, \mu \in J_t, \tag{30}$$
and where the covariance matrix of the computed responses, $\mathbf{C}_{rc}$, is defined as follows:
$$\mathbf{C}_{rc} \equiv \left[\mathbf{C}_{rc}^{\nu\mu}\right] = \mathbf{S}\,\mathbf{C}_\alpha\,\mathbf{S}^\dagger; \quad \mathbf{C}_{rc}^{\nu\mu} = \sum_{\eta=1}^{\nu}\sum_{\rho=1}^{\mu}\mathbf{S}^{\nu\eta}\,\mathbf{C}_\alpha^{\eta\rho}\,\mathbf{S}^{\mu\rho\dagger} = \mathbf{C}_{rc}^{\mu\nu\dagger}; \quad \nu, \mu \in J_t. \tag{31}$$
Written in component form, Equation (29) indicates that the vector $\boldsymbol{\alpha}_{be}^\nu$, representing the calibrated best-estimates for the system parameters at a specific time instance $\nu$, takes on the following expression:
$$\boldsymbol{\alpha}_{be}^\nu = \boldsymbol{\alpha}_0^\nu + \sum_{\mu=1}^{N_t}\left(\mathbf{C}_{\alpha r}^{\nu\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_\alpha^{\nu\rho}\,\mathbf{S}^{\mu\rho\dagger}\right)\sum_{\eta=1}^{N_t}\mathbf{K}_d^{\mu\eta}\,\mathbf{d}^\eta, \quad \nu \in J_t. \tag{32}$$
(ii)
The calibrated best-estimate covariance matrix, $\mathbf{C}_\alpha^{be}$, corresponding to the calibrated best-estimate system parameters:
$$\mathbf{C}_\alpha^{be} \equiv \left[\mathbf{C}_\alpha^{be,\nu\mu}\right] = \mathbf{C}_\alpha - \mathbf{C}_{\alpha d}\,\mathbf{C}_d^{-1}\,\mathbf{C}_{\alpha d}^\dagger, \quad \mathbf{C}_d^{-1} \equiv \left[\mathbf{K}_d^{\nu\mu}\right], \tag{33}$$
where
$$\mathbf{C}_{\alpha d} \equiv \left[\mathbf{C}_{\alpha d}^{\nu\mu}\right]; \quad \mathbf{C}_{\alpha d}^{\nu\mu} \equiv \mathbf{C}_{\alpha r}^{\nu\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_\alpha^{\nu\rho}\,\mathbf{S}^{\mu\rho\dagger}; \quad \nu, \mu \in J_t. \tag{34}$$
The block-matrix expression in Equation (33) can be written in component form, for the calibrated best-estimate parameter covariance matrix $\mathbf{C}_\alpha^{be,\nu\mu}$ between two (distinct or not) time instances $\nu, \mu \in J_t$, as follows:
$$\mathbf{C}_\alpha^{be,\nu\mu} = \mathbf{C}_\alpha^{\nu\mu} - \sum_{\eta=1}^{N_t}\sum_{\rho=1}^{N_t}\left(\mathbf{C}_{\alpha r}^{\nu\rho} - \sum_{\pi=1}^{\rho}\mathbf{C}_\alpha^{\nu\pi}\,\mathbf{S}^{\rho\pi\dagger}\right)\mathbf{K}_d^{\rho\eta}\left(\mathbf{C}_{r\alpha}^{\eta\mu} - \sum_{\pi=1}^{\eta}\mathbf{S}^{\eta\pi}\,\mathbf{C}_\alpha^{\pi\mu}\right). \tag{35}$$
(iii)
The vector of calibrated best-estimate system responses at all time instances $\nu \in J_t$:
$$\begin{pmatrix} \mathbf{r}_{be}^1 \\ \vdots \\ \mathbf{r}_{be}^{N_t} \end{pmatrix} = \begin{pmatrix} \mathbf{r}_m^1 \\ \vdots \\ \mathbf{r}_m^{N_t} \end{pmatrix} + \begin{pmatrix} \mathbf{C}_m^{11} - \mathbf{C}_{r\alpha}^{11}\mathbf{S}^{11\dagger} & \cdots & \mathbf{C}_m^{1N_t} - \sum_{\rho=1}^{N_t}\mathbf{C}_{r\alpha}^{1\rho}\mathbf{S}^{N_t\rho\dagger} \\ \vdots & \ddots & \vdots \\ \mathbf{C}_m^{N_t1} - \mathbf{C}_{r\alpha}^{N_t1}\mathbf{S}^{11\dagger} & \cdots & \mathbf{C}_m^{N_tN_t} - \sum_{\rho=1}^{N_t}\mathbf{C}_{r\alpha}^{N_t\rho}\mathbf{S}^{N_t\rho\dagger} \end{pmatrix}\begin{pmatrix} \sum_{\eta=1}^{N_t}\mathbf{K}_d^{1\eta}\mathbf{d}^\eta \\ \vdots \\ \sum_{\eta=1}^{N_t}\mathbf{K}_d^{N_t\eta}\mathbf{d}^\eta \end{pmatrix}. \tag{36}$$
Written in component form, Equation (36) provides the following expression for the vector $\mathbf{r}_{be}^\nu$ of calibrated best-estimates for the responses at a specific time instance $\nu$:
$$\mathbf{r}_{be}^\nu = \mathbf{r}_m^\nu + \sum_{\mu=1}^{N_t}\left(\mathbf{C}_m^{\nu\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_{r\alpha}^{\nu\rho}\,\mathbf{S}^{\mu\rho\dagger}\right)\sum_{\eta=1}^{N_t}\mathbf{K}_d^{\mu\eta}\,\mathbf{d}^\eta, \quad \nu \in J_t. \tag{37}$$
(iv)
The expression of the calibrated best-estimate covariance block-matrix, $\mathbf{C}_r^{be}$, for the best-estimate responses:
$$\mathbf{C}_r^{be} \equiv \left[\mathbf{C}_r^{be,\nu\mu}\right] = \mathbf{C}_m - \mathbf{C}_{rd}\,\mathbf{C}_d^{-1}\,\mathbf{C}_{rd}^\dagger, \tag{38}$$
where
$$\mathbf{C}_{rd} \equiv \left[\mathbf{C}_{rd}^{\nu\mu}\right]; \quad \mathbf{C}_{rd}^{\nu\mu} \equiv \mathbf{C}_m^{\nu\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_{r\alpha}^{\nu\rho}\,\mathbf{S}^{\mu\rho\dagger}; \quad \nu, \mu \in J_t. \tag{39}$$
The block-matrix expression given in Equation (38) can be written in component form, where each calibrated best-estimate response covariance matrix $\mathbf{C}_r^{be,\nu\mu}$ between two (distinct or not) time instances $\nu, \mu \in J_t$ has the following form:
$$\mathbf{C}_r^{be,\nu\mu} = \mathbf{C}_m^{\nu\mu} - \sum_{\eta=1}^{N_t}\sum_{\rho=1}^{N_t}\left(\mathbf{C}_m^{\nu\rho} - \sum_{\pi=1}^{\rho}\mathbf{C}_{r\alpha}^{\nu\pi}\,\mathbf{S}^{\rho\pi\dagger}\right)\mathbf{K}_d^{\rho\eta}\left(\mathbf{C}_m^{\eta\mu} - \sum_{\pi=1}^{\eta}\mathbf{S}^{\eta\pi}\,\mathbf{C}_{\alpha r}^{\pi\mu}\right). \tag{40}$$
(v)
The best-estimate response-parameter covariance block-matrix $\mathbf{C}_{r\alpha}^{be}$:
$$\mathbf{C}_{r\alpha}^{be} \equiv \left[\mathbf{C}_{r\alpha}^{be,\nu\mu}\right] = \mathbf{C}_{r\alpha} - \mathbf{C}_{rd}\,\mathbf{C}_d^{-1}\,\mathbf{C}_{\alpha d}^\dagger. \tag{41}$$
Each of the calibrated best-estimate response-parameter covariance matrices $\mathbf{C}_{r\alpha}^{be,\nu\mu}$, for $\nu, \mu \in J_t$, which appear as block-matrix components of the matrix $\mathbf{C}_{r\alpha}^{be}$ in Equation (41), has the following expression:
$$\mathbf{C}_{r\alpha}^{be,\nu\mu} = \mathbf{C}_{r\alpha}^{\nu\mu} - \sum_{\eta=1}^{N_t}\sum_{\rho=1}^{N_t}\left(\mathbf{C}_m^{\nu\rho} - \sum_{\pi=1}^{\rho}\mathbf{C}_{r\alpha}^{\nu\pi}\,\mathbf{S}^{\rho\pi\dagger}\right)\mathbf{K}_d^{\rho\eta}\left(\mathbf{C}_{r\alpha}^{\eta\mu} - \sum_{\pi=1}^{\eta}\mathbf{S}^{\eta\pi}\,\mathbf{C}_\alpha^{\pi\mu}\right). \tag{42}$$
Computing the calibrated best-estimate quantities by using Equations (32), (35), (37), (40) and (42) is definitely more advantageous in terms of storage requirements than the direct computation of the corresponding full block-matrices. The largest requirement of computational resources arises when inverting the matrix $\mathbf{C}_d$. In view of Equation (30), it is important to note that the inverse matrix, $\mathbf{C}_d^{-1}$, incorporates simultaneously all of the available information about the system parameters and responses at all time instances (i.e., $\nu = 1, 2, \ldots, N_t$). In other words, at any time instance $\nu$, $\mathbf{C}_d^{-1}$ incorporates information not only from time instances prior to, and at, $\nu$ (i.e., information regarding the “past” and “present” states of the system) but also from time instances posterior to $\nu$ (i.e., information about the “future” states of the system). Therefore, at any specified time instance $\nu$, the calibrated best-estimate parameters $\boldsymbol{\alpha}_{be}^\nu$ and responses $\mathbf{r}_{be}^\nu$, together with the corresponding calibrated best-estimate covariance matrices $\mathbf{C}_\alpha^{be,\nu\mu}$, $\mathbf{C}_r^{be,\nu\mu}$, and $\mathbf{C}_{r\alpha}^{be,\nu\mu}$, will also incorporate automatically, through the matrix $\mathbf{C}_d^{-1}$, all of the available information about the system parameters and responses at all time instances $\nu = 1, 2, \ldots, N_t$.
Note that the mathematical formalism underlying the time-dependent data-adjustment methodology is written in terms of absolute sensitivities and covariances, in contrast to the data adjustment formalism for time-independent systems, which is written in terms of relative sensitivities and covariances. Nevertheless, the two formalisms produce end-results which look formally quite similar.
The actual application of the predictive modeling results presented in Equations (32), (35), (37), (40) and (42) to a physical system characterized by nominal values and uncertainties for model parameters together with the computed and measured responses is straightforward, in principle, although it can become computationally very demanding regarding both data handling and computational speed.
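Although computationally demanding at scale, the structure of these computations is simple. The following minimal sketch assembles the global quantities of Equations (18)–(25) for a small hypothetical transient (all numerical values below are randomly generated placeholders, not data from an actual problem) and evaluates the best-estimate parameters and responses via the global forms of Equations (29)–(31) and (36).
```python
# Minimal sketch of the global ("off-line") time-dependent adjustment,
# Equations (29)-(31) and (36), using flattened global block vectors/matrices.
import numpy as np

rng = np.random.default_rng(1)
N_t, n_a, n_r = 4, 3, 2            # time nodes; parameters and responses per node
NA, NR = N_t * n_a, N_t * n_r      # global dimensions

A = rng.normal(size=(NA, NA))
C_alpha = A @ A.T + NA * np.eye(NA)    # global parameter covariances, Eq. (18)
B = rng.normal(size=(NR, NR))
C_m = B @ B.T + NR * np.eye(NR)        # global measured-response covariances, Eq. (19)
C_ar = np.zeros((NA, NR))              # parameter-response covariances, Eq. (20)

# Block lower-triangular global sensitivity matrix, Eq. (25): S^{nu,mu} = 0 for mu > nu
S = rng.normal(size=(NR, NA))
for nu in range(N_t):
    for mu in range(nu + 1, N_t):
        S[nu*n_r:(nu+1)*n_r, mu*n_a:(mu+1)*n_a] = 0.0

alpha_0 = rng.normal(size=NA)          # nominal parameter values
r_m = rng.normal(size=NR)              # measured responses
R0 = rng.normal(size=NR)               # computed responses at nominal parameters
d = R0 - r_m                           # deviation vector d, Eq. (28)

C_rc = S @ C_alpha @ S.T                                # Eq. (31)
C_d = C_rc + C_m - S @ C_ar - C_ar.T @ S.T              # Eq. (30), global form
K_d = np.linalg.inv(C_d)                                # contains the blocks K_d^{nu,eta}

alpha_be = alpha_0 + (C_ar - C_alpha @ S.T) @ K_d @ d   # Eq. (29)/(32)
r_be = r_m + (C_m - C_ar.T @ S.T) @ K_d @ d             # Eq. (36)/(37)
```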
The minimum value, $Q_{\min}$, of the functional $Q(\mathbf{z})$ has the following expression:
$$Q_{\min} \equiv Q(\mathbf{z}^{be}) = \mathbf{d}^\dagger\left[\mathbf{C}_d(\boldsymbol{\alpha}_0)\right]^{-1}\mathbf{d} = \chi^2. \tag{43}$$
As the above expression indicates, the minimal value $Q_{\min} \equiv Q(\mathbf{z}^{be})$ represents the square of the length of the vector $\mathbf{d}$, measuring (in the corresponding metric) the deviations between the experimental and nominally computed responses. The quantity $Q_{\min}$ can be evaluated directly from the given data (i.e., given parameters and responses, together with their original uncertainties) after having inverted the deviation-vector covariance matrix $\mathbf{C}_d(\boldsymbol{\alpha}_0)$. It is also very important to note that $Q_{\min}$ is independent of calibrating (or adjusting) the original data. As the dimension of $\mathbf{d}$ indicates, the number of degrees of freedom characteristic of the calibration under consideration is equal to the number of experimental responses. In the extreme case of absence of experimental responses, no actual calibration takes place, since the deviation vector $\mathbf{d} \equiv \mathbf{R}(\boldsymbol{\alpha}_0) - \mathbf{r}_m$ is then void, so that the best-estimate parameter values are just the original nominal values, i.e., $\boldsymbol{\alpha}_{be}^k = \boldsymbol{\alpha}_0^k$. An actual calibration (adjustment) occurs only when including at least one experimental response.
In turn, the variate $Q_{\min}$ follows a $\chi^2$-distribution with $n$ degrees of freedom, where $n$ denotes the total number of experimental responses considered in the calibration (adjustment) procedure. The quantity $Q_{\min}$ is the “$\chi^2$ of the calibration (adjustment) at hand” and can be used as an indicator of the agreement between the computed and measured responses, measuring essentially the consistency of the measured responses with the model parameters. A similar quantity exists for the time-independent data-adjustment procedure presented previously, in Section 2.1.
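Because Equation (43) involves only the original (uncalibrated) data, the consistency indicator can be computed before any adjustment is performed. A short sketch, with hypothetical numbers, is shown below; consistency would be indicated by $\chi^2/n$ of order unity.
```python
# Sketch of the consistency indicator of Eq. (43): Q_min = d^T C_d^{-1} d follows a
# chi-square distribution with n degrees of freedom (n = number of measured responses).
import numpy as np

def chi_square_indicator(d: np.ndarray, C_d: np.ndarray) -> float:
    """Return Q_min = d^T C_d^{-1} d for the deviation vector d, Eq. (43)."""
    return float(d @ np.linalg.solve(C_d, d))

# Hypothetical example: three experimental responses (n = 3 degrees of freedom)
d = np.array([0.8, -1.1, 0.3])
C_d = np.diag([1.0, 1.5, 0.7])
chi2 = chi_square_indicator(d, C_d)
print(f"chi^2 = {chi2:.2f} for n = {d.size} responses")  # consistent if chi^2/n ~ 1
```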
In order to facilitate a comparison with the time-dependent data-assimilation methods used in the atmospheric sciences (which are summarized in Section 2.3, below), which customarily involve just two consecutive time-steps, Cacuci [16] has obtained explicit formulas by simplifying the expressions presented in Equations (32), (35), (37), (40) and (42) to include only two consecutive time-steps, namely $\nu = k-1, k$; $k = 1, 2, \ldots, N_t$, and by presenting the explicit form of the blocks $\mathbf{K}_d^{\nu\eta}$ of the inverse matrix. The simplified expressions thus obtained are as follows (a numerical check of the block-inverse identities in Equations (45)–(48) is sketched after this list):
(i)
The expressions for the calibrated best-estimate parameter values take on the following particular form of Equation (32) at time node $k$:
$$\boldsymbol{\alpha}_{be}^k = \boldsymbol{\alpha}_0^k + \sum_{\mu=k-1}^{k}\left(\mathbf{C}_{\alpha r}^{k\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_\alpha^{k\rho}\,\mathbf{S}^{\mu\rho\dagger}\right)\sum_{\eta=k-1}^{k}\mathbf{K}_d^{\mu\eta}\,\mathbf{d}^\eta, \tag{44}$$
where
$$\mathbf{K}_d^{k-1,k-1} = \left[\mathbf{C}_d^{k-1,k-1} - \mathbf{C}_d^{k-1,k}\left(\mathbf{C}_d^{k,k}\right)^{-1}\mathbf{C}_d^{k,k-1}\right]^{-1} = \left(\mathbf{C}_d^{k-1,k-1}\right)^{-1} + \left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}\mathbf{C}_d^{k-1,k}\,\mathbf{K}_d^{k,k}\,\mathbf{C}_d^{k,k-1}\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}, \tag{45}$$
$$\mathbf{K}_d^{k-1,k} = -\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}\mathbf{C}_d^{k-1,k}\left[\mathbf{C}_d^{k,k} - \mathbf{C}_d^{k,k-1}\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}\mathbf{C}_d^{k-1,k}\right]^{-1} = -\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}\mathbf{C}_d^{k-1,k}\,\mathbf{K}_d^{k,k}, \tag{46}$$
$$\mathbf{K}_d^{k,k} = \left[\mathbf{C}_d^{k,k} - \mathbf{C}_d^{k,k-1}\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}\mathbf{C}_d^{k-1,k}\right]^{-1} = \left(\mathbf{C}_d^{k,k}\right)^{-1} + \left(\mathbf{C}_d^{k,k}\right)^{-1}\mathbf{C}_d^{k,k-1}\,\mathbf{K}_d^{k-1,k-1}\,\mathbf{C}_d^{k-1,k}\left(\mathbf{C}_d^{k,k}\right)^{-1}, \tag{47}$$
$$\mathbf{K}_d^{k,k-1} = -\left[\mathbf{C}_d^{k,k} - \mathbf{C}_d^{k,k-1}\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1}\mathbf{C}_d^{k-1,k}\right]^{-1}\mathbf{C}_d^{k,k-1}\left(\mathbf{C}_d^{k-1,k-1}\right)^{-1} = -\left(\mathbf{C}_d^{k,k}\right)^{-1}\mathbf{C}_d^{k,k-1}\,\mathbf{K}_d^{k-1,k-1}. \tag{48}$$
(ii)
The components $\mathbf{C}_\alpha^{be,\nu\mu}$, $\nu, \mu = k-1, k$, of the calibrated best-estimate covariance matrix, $\mathbf{C}_\alpha^{be}$, have the following particular form of Equation (35):
$$\mathbf{C}_\alpha^{be,\nu\mu} = \mathbf{C}_\alpha^{\nu\mu} - \sum_{\eta=k-1}^{k}\sum_{\rho=k-1}^{k}\left(\mathbf{C}_{\alpha r}^{\nu\rho} - \sum_{\pi=1}^{\rho}\mathbf{C}_\alpha^{\nu\pi}\,\mathbf{S}^{\rho\pi\dagger}\right)\mathbf{K}_d^{\rho\eta}\left(\mathbf{C}_{r\alpha}^{\eta\mu} - \sum_{\pi=1}^{\eta}\mathbf{S}^{\eta\pi}\,\mathbf{C}_\alpha^{\pi\mu}\right), \quad \text{for } \nu = k-1, k \text{ and } \mu = k-1, k. \tag{49}$$
(iii)
The vector $\mathbf{r}_{be}^k$, representing the calibrated best-estimates for the system responses at a time instance $k$, takes on the following particular form of Equation (37):
$$\mathbf{r}_{be}^k = \mathbf{r}_m^k + \sum_{\mu=k-1}^{k}\left(\mathbf{C}_m^{k\mu} - \sum_{\rho=1}^{\mu}\mathbf{C}_{r\alpha}^{k\rho}\,\mathbf{S}^{\mu\rho\dagger}\right)\sum_{\eta=k-1}^{k}\mathbf{K}_d^{\mu\eta}\,\mathbf{d}^\eta. \tag{50}$$
(iv)
The components $\mathbf{C}_r^{be,\nu\mu}$ of the calibrated best-estimate covariance block-matrix $\mathbf{C}_r^{be}$ for the best-estimate responses take on the following particular form of Equation (38) for $\nu, \mu = k-1, k$:
$$\mathbf{C}_r^{be,\nu\mu} = \mathbf{C}_m^{\nu\mu} - \sum_{\eta=k-1}^{k}\sum_{\rho=k-1}^{k}\left(\mathbf{C}_m^{\nu\rho} - \sum_{\pi=k-1}^{\rho}\mathbf{C}_{r\alpha}^{\nu\pi}\,\mathbf{S}^{\rho\pi\dagger}\right)\mathbf{K}_d^{\rho\eta}\left(\mathbf{C}_m^{\eta\mu} - \sum_{\pi=k-1}^{\eta}\mathbf{S}^{\eta\pi}\,\mathbf{C}_{\alpha r}^{\pi\mu}\right). \tag{51}$$
(v)
The matrix-valued components $\mathbf{C}_{r\alpha}^{be,\nu\mu}$, $\nu, \mu = k-1, k$, of the best-estimate response-parameter covariance matrix $\mathbf{C}_{r\alpha}^{be}$ reduce to the following particular forms of Equation (42):
$$\mathbf{C}_{r\alpha}^{be,\nu\mu} = \mathbf{C}_{r\alpha}^{\nu\mu} - \sum_{\eta=k-1}^{k}\sum_{\rho=k-1}^{k}\left(\mathbf{C}_m^{\nu\rho} - \sum_{\pi=k-1}^{\rho}\mathbf{C}_{r\alpha}^{\nu\pi}\,\mathbf{S}^{\rho\pi\dagger}\right)\mathbf{K}_d^{\rho\eta}\left(\mathbf{C}_{r\alpha}^{\eta\mu} - \sum_{\pi=k-1}^{\eta}\mathbf{S}^{\eta\pi}\,\mathbf{C}_\alpha^{\pi\mu}\right). \tag{52}$$
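As announced before this list, the block-inverse relations of Equations (45)–(48) are standard Schur-complement identities, which the following sketch verifies numerically on a hypothetical symmetric positive-definite matrix $\mathbf{C}_d$ partitioned into $2 \times 2$ blocks.
```python
# Numerical check, under hypothetical numbers, of the two-time-step block-inverse
# identities of Eqs. (45)-(48): the blocks K_d^{nu,eta} of C_d^{-1} expressed via
# Schur complements of the partitioned matrix [[C11, C12], [C21, C22]].
import numpy as np

rng = np.random.default_rng(2)
n = 3                                   # responses per time node (assumed)
X = rng.normal(size=(2*n, 2*n))
C_d = X @ X.T + 2*n*np.eye(2*n)         # symmetric positive-definite C_d
C11, C12 = C_d[:n, :n], C_d[:n, n:]
C21, C22 = C_d[n:, :n], C_d[n:, n:]

inv = np.linalg.inv
K = inv(C_d)                            # reference: full inverse

# Eq. (47): K^{k,k} is the inverse of the Schur complement of C11
K22 = inv(C22 - C21 @ inv(C11) @ C12)
# Eq. (45): K^{k-1,k-1} via the complementary Schur complement
K11 = inv(C11 - C12 @ inv(C22) @ C21)
# Eqs. (46) and (48): off-diagonal blocks
K12 = -inv(C11) @ C12 @ K22
K21 = -K22 @ C21 @ inv(C11)

assert np.allclose(K[:n, :n], K11)
assert np.allclose(K[:n, n:], K12)
assert np.allclose(K[n:, :n], K21)
assert np.allclose(K[n:, n:], K22)
```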

2.3. Least-Squares Based Variational Data Assimilation

The predecessors to data-assimilation methods were called “objective analyses” [24], to distinguish them from the “subjective analyses” in which data was manipulated according to the opinion of experts. Subsequently, the so-called “nudging” or “Newtonian relaxation” methods were introduced [25] to drive the model variables towards the observations by adding empirical forcing terms. The variational principle introduced by Sasaki [26,27] provided the basis for the development of the “optimal interpolation,” “three-dimensional variational data assimilation,” and “physical space statistical analysis” methods (see, e.g., the books [11,12,13,14]), which were developed independently but were shown by Lorenc [28] to be equivalent.
The introduction of Cacuci’s adjoint method [10] to the atmospheric sciences [29,30,31] has enabled the subsequent development of the modern techniques nowadays called the “variational data assimilation” methodology (also called “4D-VAR”; “4D” refers to a framework involving the four independent variables “time plus three space dimensions”). Variational data assimilation aims at determining the model’s evolution “trajectory” which fits “best”, in a “least-square” sense, the observational data over an assimilation time interval by adjusting the initial conditions supplied for advancing (integrating) the forecast model forward in time. The earliest work on developing the 4D-VAR methodology using adjoint functions was by Derber [32,33] and Hoffmann [34], while Navon [35,36,37] has introduced augmented Lagrangian methods for enforcing the conservation of integral invariants. Most of the meteorological centers around the world have since implemented 4D-VAR methods in conjunction with comprehensive numerical weather prediction models [38,39,40].
The so-called strong constraint “variational data assimilation (VDA)” or classical VDA framework assumes that the forecast model perfectly represents the evolution of the actual physical system (atmosphere), and the “best fit” model trajectory is obtained by adjusting only the initial conditions via minimization of a user-defined cost functional, subject to the model equations as strong constraints. However, like any other models of large-scale processes, numerical weather prediction (NWP) models are affected by errors arising from the omission of sub-grid processes; also, the discretization of continuous processes introduces dissipative and dispersion errors. Furthermore, the mathematical modeling of the boundary conditions and forcing terms is imperfect, and most of the physical processes and their interactions in the atmosphere are parameterized. In general, these modeling approximations/errors comprise both systematic (so-called “bias”) and stochastic components which vary in space and time. Customarily, the model error (ME) term is formally introduced as a correction to the time derivatives of model variables and the resulting methodology is called the weak constraint VDA.
Using the notation of [14], the state of the physical system (e.g., atmosphere) can be described by the following evolution equation:
$$\frac{d\mathbf{x}(t)}{dt} = M[\mathbf{x}(t)] + T[\boldsymbol{\eta}(t)], \quad \mathbf{x}(t_0) = \mathbf{x}_0, \tag{53}$$
where: (i) the vector $\mathbf{x}(t)$ represents the state of the atmosphere, with an initial condition $\mathbf{x}_0$ at the initial time $t_0$; (ii) the operator $M[\cdot]$ denotes all of the mathematical operations involved in the model; (iii) $\boldsymbol{\eta}(t)$ represents the model errors (ME); (iv) $T[\cdot]$ represents an operator that maps the space of ME onto the space of the model state $\mathbf{x}(t)$. If the model state has an associated model error at every grid point, then the operator $T[\cdot]$ is identically equal to the unit matrix and the dimension of $\boldsymbol{\eta}(t)$ is the same as the dimension of the model state $\mathbf{x}(t)$. On the other hand, if only some components of the state vector (e.g., the atmosphere in the polar regions) are affected by modeling errors, then the operator $T[\cdot]$ can be specified in such a way that only those model grid points (in the polar regions) have modeling errors, while (by comparison) the rest of the model states are free of MEs. Some authors formally/artificially write $\boldsymbol{\eta}(t)$ as the sum of a stochastic error (“noise”) component and a “systematic error” component, but such a separation is superfluous in practice since both components are anyway estimated based on “expert opinion” rather than on exact modeling.
The model error is considered to evolve according to the following evolution model:
$$\frac{d\boldsymbol{\eta}}{dt} = \Phi[\boldsymbol{\eta}(t), \mathbf{x}(t)], \quad \boldsymbol{\eta}(t_0) = \boldsymbol{\eta}_0, \tag{54}$$
where the rate of growth or decay of $\boldsymbol{\eta}(t)$ in time is governed by the particular form of the mapping $\Phi$.
Data-assimilation schemes determine the analyzed atmospheric state as an optimal combination of a-priori background information and observational information, by minimizing the following user-defined functional:
$$J[\mathbf{x}(t_0), \boldsymbol{\eta}(t_0)] \equiv J_o + J_b + J_\eta, \tag{55}$$
where
$$J_b \equiv \frac{1}{2}\,[\mathbf{x}_0 - \mathbf{x}_b]^T\,\mathbf{B}^{-1}\,[\mathbf{x}_0 - \mathbf{x}_b], \tag{56}$$
$$J_o \equiv \frac{1}{2}\sum_{i=0}^{n}\,[H(\mathbf{x}_i) - \mathbf{y}^o(t_i)]^T\,\mathbf{R}^{-1}\,[H(\mathbf{x}_i) - \mathbf{y}^o(t_i)], \tag{57}$$
$$J_\eta \equiv \frac{1}{2}\,[\boldsymbol{\eta}_0 - \boldsymbol{\eta}_b]^T\,\mathbf{Q}^{-1}\,[\boldsymbol{\eta}_0 - \boldsymbol{\eta}_b]. \tag{58}$$
The components of the functionals $J_b$, $J_o$ and $J_\eta$ are defined as follows:
  • The functional $J_b$ quantifies, in a least-square sense, the squared differences between the initial state $\mathbf{x}_0$ and the background state $\mathbf{x}_b$; $\mathbf{B}$ denotes the estimated background covariance matrix. The background state $\mathbf{x}_b$ provides an initial guess for the minimization of $J$.
  • The functional $J_o$ quantifies, in a least-square sense, the squared differences between the observed state $\mathbf{y}^o(t_i)$ and the model state at the time instance $t_i$; $\mathbf{R}$ denotes the estimated observation covariance matrix and $H(\mathbf{x}_i)$ denotes the observation operator at the time instance $t_i$.
  • The functional $J_\eta$ quantifies, in a least-square sense, the model errors; $\mathbf{Q}$ denotes the covariance matrix of model errors.
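A toy implementation of the cost functional of Equations (55)–(58) may help clarify the bookkeeping; the model trajectory, observation operator, and covariance matrices below are hypothetical placeholders, not those of an actual weather prediction system.
```python
# Toy sketch of the weak-constraint 4D-Var cost functional, Eqs. (55)-(58).
import numpy as np

def cost(x0, eta0, x_traj, y_obs, x_b, eta_b, B, R, Q, H):
    """J = J_b + J_o + J_eta for a given initial state, ME state, and trajectory."""
    # J_b, Eq. (56): mismatch with the background (a priori) state
    dxb = x0 - x_b
    J_b = 0.5 * dxb @ np.linalg.solve(B, dxb)
    # J_o, Eq. (57): observation mismatch summed over observation times t_i
    J_o = 0.0
    for x_i, y_i in zip(x_traj, y_obs):
        res = H @ x_i - y_i
        J_o += 0.5 * res @ np.linalg.solve(R, res)
    # J_eta, Eq. (58): model-error mismatch
    de = eta0 - eta_b
    J_eta = 0.5 * de @ np.linalg.solve(Q, de)
    return J_b + J_o + J_eta

# Hypothetical dimensions: 4-component state, 2 observed components, 3 obs times
n, m = 4, 2
rng = np.random.default_rng(3)
B, R, Q = np.eye(n), 0.5 * np.eye(m), 2.0 * np.eye(n)
H = rng.normal(size=(m, n))            # assumed linear observation operator
x_b, eta_b = np.zeros(n), np.zeros(n)
x0, eta0 = rng.normal(size=n), rng.normal(size=n)
x_traj = [x0 + 0.1 * k for k in range(3)]             # placeholder model trajectory
y_obs = [H @ x + 0.05 * rng.normal(size=m) for x in x_traj]
print(f"J = {cost(x0, eta0, x_traj, y_obs, x_b, eta_b, B, R, Q, H):.3f}")
```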
The constrained minimization of the functional $J(\mathbf{x}_0, \boldsymbol{\eta}_0)$ subject to the “hard constraints” represented by Equations (53) and (54) is performed by minimizing the unconstrained augmented Lagrangian functional defined below:
$$L(\mathbf{x}, \boldsymbol{\eta}, \mathbf{x}^*, \boldsymbol{\eta}^*) = J(\mathbf{x}_0, \boldsymbol{\eta}_0) + \int_{t_0}^{t_n}\left\langle\mathbf{x}^*(t),\; \frac{d\mathbf{x}(t)}{dt} - M[\mathbf{x}(t)] - T[\boldsymbol{\eta}(t)]\right\rangle dt + \int_{t_0}^{t_n}\left\langle\boldsymbol{\eta}^*(t),\; \frac{d\boldsymbol{\eta}}{dt} - \Phi[\boldsymbol{\eta}(t), \mathbf{x}(t)]\right\rangle dt, \tag{59}$$
where $\mathbf{x}^*(t)$ and $\boldsymbol{\eta}^*(t)$ are the Lagrange multiplier (adjoint) vectors corresponding to $\mathbf{x}(t)$ and $\boldsymbol{\eta}(t)$, respectively, while $\langle\cdot, \cdot\rangle$ denotes the Euclidean inner product in the three-dimensional physical space. The extrema of $L(\mathbf{x}, \boldsymbol{\eta}, \mathbf{x}^*, \boldsymbol{\eta}^*)$ are the solutions of the following Euler–Lagrange equations:
$$\frac{\partial L}{\partial\mathbf{x}} = \mathbf{0}, \quad \frac{\partial L}{\partial\boldsymbol{\eta}} = \mathbf{0}; \tag{60}$$
$$\frac{\partial L}{\partial\mathbf{x}^*} = \mathbf{0}, \quad \frac{\partial L}{\partial\boldsymbol{\eta}^*} = \mathbf{0}. \tag{61}$$
Imposing Equation (61) ensures that the equations describing the evolution of the model state and of the ME shown in Equations (53) and (54) are satisfied. Solving Equation (60) yields the following adjoint equations, which describe the evolution of the adjoint variables $\mathbf{x}^*(t)$ and $\boldsymbol{\eta}^*(t)$:
$$-\frac{d\mathbf{x}^*(t)}{dt} = \mathbf{M}_\mathbf{x}^T\,\mathbf{x}^*(t) + \boldsymbol{\Phi}_\mathbf{x}^T\,\boldsymbol{\eta}^*(t) + \sum_{i=0}^{n}\delta(t - t_i)\,\mathbf{H}_\mathbf{x}^T\,\mathbf{R}^{-1}\left[H(\mathbf{x}_i) - \mathbf{y}^o(t_i)\right]; \quad \mathbf{x}^*(t_n) = \mathbf{0}, \tag{62}$$
$$-\frac{d\boldsymbol{\eta}^*(t)}{dt} = \boldsymbol{\Phi}_{\boldsymbol{\eta}}^T\,\boldsymbol{\eta}^*(t) + \mathbf{x}^*(t); \quad \boldsymbol{\eta}^*(t_n) = \mathbf{0}. \tag{63}$$
As indicated by Equations (62) and (63), the evolution of $\mathbf{x}^*(t)$ is coupled to the evolution of $\boldsymbol{\eta}^*(t)$ via the operator $\Phi$. Backward integration of the adjoint models represented by Equations (62) and (63) from time $t_n$ to $t_0$ provides the values of the initial adjoint states $\mathbf{x}^*(t_0)$ and $\boldsymbol{\eta}^*(t_0)$. Furthermore, the optimality conditions expressed by Equations (60) and (61) also provide the following expressions for the partial gradients of the functional $L(\mathbf{x}, \boldsymbol{\eta}, \mathbf{x}^*, \boldsymbol{\eta}^*)$ or, equivalently, of the functional $J(\mathbf{x}_0, \boldsymbol{\eta}_0)$, with respect to the initial model state, $\mathbf{x}_0$, and the initial ME state, $\boldsymbol{\eta}_0$:
$$\nabla_{\mathbf{x}_0} J = \nabla_{\mathbf{x}_0} J_b + \nabla_{\mathbf{x}_0} J_o = \mathbf{B}^{-1}\left[\mathbf{x}_0 - \mathbf{x}_b\right] + \mathbf{x}^*(t_0); \tag{64}$$
$$\nabla_{\boldsymbol{\eta}_0} J = \nabla_{\boldsymbol{\eta}_0} J_\eta + \nabla_{\boldsymbol{\eta}_0} J_o = \mathbf{Q}^{-1}\left[\boldsymbol{\eta}_0 - \boldsymbol{\eta}_b\right] + \boldsymbol{\eta}^*(t_0). \tag{65}$$
If the model errors are neglected, Equation (65) does not arise, so the computational effort is roughly half of that needed when modeling errors are explicitly taken into account.
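Schematically, each outer iteration of weak-constraint 4D-VAR integrates the model forward, integrates the adjoint equations backward to obtain $\mathbf{x}^*(t_0)$ and $\boldsymbol{\eta}^*(t_0)$, and then updates the initial state and model error using the gradients of Equations (64)–(65). A minimal sketch, in which `forward_model` and `adjoint_model` are hypothetical placeholders standing in for the integrations of Equations (53)–(54) and (62)–(63), is given below; operational systems would use a quasi-Newton minimizer rather than the fixed-step descent shown here.
```python
# Schematic outer loop for weak-constraint 4D-Var, assuming the gradients of
# Eqs. (64)-(65) are assembled from one backward (adjoint) integration per
# iteration. forward_model, adjoint_model, and the step size are placeholders.
import numpy as np

def minimize_4dvar(x0, eta0, forward_model, adjoint_model, B, Q, x_b, eta_b,
                   n_iter=50, step=0.1):
    for _ in range(n_iter):
        x_traj = forward_model(x0, eta0)            # integrate Eqs. (53)-(54) forward
        x_star0, eta_star0 = adjoint_model(x_traj)  # integrate Eqs. (62)-(63) backward
        grad_x0 = np.linalg.solve(B, x0 - x_b) + x_star0           # Eq. (64)
        grad_eta0 = np.linalg.solve(Q, eta0 - eta_b) + eta_star0   # Eq. (65)
        x0 = x0 - step * grad_x0                    # steepest-descent update
        eta0 = eta0 - step * grad_eta0
    return x0, eta0
```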

3. BERRU-PM-2+: Second-Order-Plus MaxEnt Forward and Inverse Predictive Modeling Methodology

Cacuci [15,16] conceived a predictive modeling methodology which uses the maximum entropy (MaxEnt) principle [17] to combine experimental and computational information in the joint phase-space of responses and model parameters in order to provide optimal mean values for model parameters, as well as reduced uncertainties for the optimally predicted responses and parameters. This methodology has been called the “BERRU-PM” methodology, which is an acronym for “Best-Estimate Results with Reduced Uncertainties: Predictive Modeling”. Using the MaxEnt principle [17], the BERRU-PM methodology eliminates the need for introducing and “minimizing” a user-chosen “cost functional quantifying the discrepancies between measurements and computations”, thus yielding results that are free of subjective user-interferences while generalizing and significantly extending the current dynamic data assimilation procedures. The use of the maximum entropy principle enables the BERRU-PM to consider the model as an “imprecisely known” entity, as opposed to being appended by Lagrange multipliers as a “hard constraint” devoid of model errors. Incorporating correlations, including those between the imprecisely known model parameters and computed model responses, the BERRU-PM also provides a quantitative metric, constructed from sensitivity and covariance matrices, for determining the degree of agreement among the various computational and experimental data while eliminating discrepant information. This Section will introduce the BERRU-PM-2+ predictive modeling methodology, which further generalizes the BERRU-PM methodology provided in [15,16] by including all of the second-order terms in response sensitivities and providing explicitly the mechanism for incorporating not only second-order but arbitrarily high-order (hence the “2+” in the acronym BERRU-PM-2+) sensitivities of responses with respect to parameters, overcoming the “curse of dimensionality” [41].

3.1. Construction of the Second-Order-Accurate MaxEnt Probability Distribution p c z z c , C c of Computational Model Responses and Parameters

The model parameters are denoted as $\alpha_i$, $i = 1, \ldots, TP$, where $TP$ denotes the “total number of parameters”, and are considered, without loss of generality, to be real-valued scalars. For subsequent mathematical operations, it is convenient to consider that these model parameters are the components of the vector of parameters denoted as $\boldsymbol{\alpha} \equiv (\alpha_1, \ldots, \alpha_{TP})^\dagger \in \mathbb{R}^{TP}$. In practice, the model parameters are obtained from experimental measurements, which are external to the model and are performed independently of the manner in which the parameters are used in a computational model. Thus, the model parameters are not known exactly; they are subject to uncertainties. Formally, the model parameters can be considered to obey an unknown multivariate probability distribution, denoted as $p_\alpha(\boldsymbol{\alpha})$, which is defined on a domain (of definition of parameters) denoted as $D_\alpha \subset \mathbb{R}^{TP}$ and is normalized such that $\int_{D_\alpha} p_\alpha(\boldsymbol{\alpha})\,d\boldsymbol{\alpha} = 1$. Formally, the expected (or mean) value of a model parameter $\alpha_j$, denoted as $\alpha_j^0 \equiv E_c(\alpha_j)$, is defined as follows:
$$\alpha_j^0 \equiv \int_{D_\alpha}\alpha_j\,p_\alpha(\boldsymbol{\alpha})\,d\boldsymbol{\alpha}, \quad j = 1, \ldots, TP; \quad \boldsymbol{\alpha}^0 \equiv (\alpha_1^0, \ldots, \alpha_{TP}^0)^\dagger. \tag{66}$$
The covariance, $\mathrm{cov}(\alpha_i, \alpha_j)$, of two parameters, $\alpha_i$ and $\alpha_j$, is formally defined as follows:
$$\mathrm{cov}(\alpha_i, \alpha_j) \equiv \int_{D_\alpha}\delta\alpha_i\,\delta\alpha_j\,p_\alpha(\boldsymbol{\alpha})\,d\boldsymbol{\alpha}; \quad i, j = 1, \ldots, TP; \quad \delta\alpha_k \equiv \alpha_k - \alpha_k^0; \quad k = 1, \ldots, TP. \tag{67}$$
The variance, $\mathrm{var}(\alpha_i)$, of a parameter $\alpha_i$ is formally defined as the following particular case of Equation (67):
$$\mathrm{var}(\alpha_i) \equiv \int_{D_\alpha}(\delta\alpha_i)^2\,p_\alpha(\boldsymbol{\alpha})\,d\boldsymbol{\alpha}; \quad i = 1, \ldots, TP. \tag{68}$$
The standard deviation of $\alpha_i$ is denoted as $\sigma_i$ and is defined as follows: $\sigma_i \equiv \sqrt{\mathrm{var}(\alpha_i)}$. The definitions of the higher-order correlations among parameters, which are used together with those shown in Equations (66)–(68) to construct the moments of the distribution of computed responses (i.e., results of interest computed using the model), are presented in Appendix A.
A model response is a function (implicit and, occasionally, explicit) of the model parameters and is denoted as $r_k(\boldsymbol{\alpha})$. Evidently, the model response is also an imprecisely known quantity, subject to uncertainties stemming from the model parameters, in addition to uncertainties stemming from the numerical computational procedures. A model response thus comprises both systematic and random errors. For subsequent mathematical operations, it is convenient to consider that the responses $r_k(\boldsymbol{\alpha})$, $k = 1, \ldots, TR$, where $TR$ denotes the “total number of responses” computed using the model, are the components of the vector of responses denoted as $\mathbf{r}(\boldsymbol{\alpha}) \equiv [r_1(\boldsymbol{\alpha}), \ldots, r_{TR}(\boldsymbol{\alpha})]^\dagger$.
Conceptually, the computational model’s responses can be considered to obey an unknown multivariate probability distribution function, denoted as $p_c(\mathbf{r})$, which is defined on a domain denoted as $D_r \subset \mathbb{R}^{TR}$ and is normalized such that $\int_{D_r} p_c(\mathbf{r})\,d\mathbf{r} = 1$. Furthermore, the parameters and computed responses can be considered to follow a joint probability distribution function, denoted as $p_c(\boldsymbol{\alpha}, \mathbf{r})$, which is not known but is formally defined on a computational domain denoted as $D_c = D_\alpha \otimes D_r$, which comprises the union of the domains of definition of all of the model’s parameters and responses, and is properly normalized, i.e., $\int_{D_c} p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} = 1$. The subscript “c” is used to indicate quantities pertaining to the “computational model”.
In addition to results produced by computational models, in practice there also exist independent measurements of responses and parameters, which are subject to uncertainties which stem from the respective measurement devices and procedures, independently of the computational model. Since such measurements are scalar quantities which are compared to the computational results, it is convenient to consider that the computational responses are also real-valued scalar quantities, to enable direct comparisons with the corresponding measured responses.
Since the system’s response is a function of the system’s parameters, it is assumed that each computed response can be formally expanded in a multivariate Taylor-series around the parameters’ mean values. In particular, the fourth-order Taylor-series of a system response, denoted as $r_k(\boldsymbol{\alpha})$, around the expected (or nominal) parameter values $\boldsymbol{\alpha}^0$ has the following formal expression:
$$\begin{split} r_k(\boldsymbol{\alpha}) = r_k(\boldsymbol{\alpha}^0) &+ \sum_{j_1=1}^{TP}\left\{\frac{\partial r_k(\boldsymbol{\alpha})}{\partial\alpha_{j_1}}\right\}_{\boldsymbol{\alpha}^0}\delta\alpha_{j_1} + \frac{1}{2}\sum_{j_1=1}^{TP}\sum_{j_2=1}^{TP}\left\{\frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial\alpha_{j_1}\partial\alpha_{j_2}}\right\}_{\boldsymbol{\alpha}^0}\delta\alpha_{j_1}\delta\alpha_{j_2} \\ &+ \frac{1}{3!}\sum_{j_1=1}^{TP}\sum_{j_2=1}^{TP}\sum_{j_3=1}^{TP}\left\{\frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial\alpha_{j_1}\partial\alpha_{j_2}\partial\alpha_{j_3}}\right\}_{\boldsymbol{\alpha}^0}\delta\alpha_{j_1}\delta\alpha_{j_2}\delta\alpha_{j_3} \\ &+ \frac{1}{4!}\sum_{j_1=1}^{TP}\sum_{j_2=1}^{TP}\sum_{j_3=1}^{TP}\sum_{j_4=1}^{TP}\left\{\frac{\partial^4 r_k(\boldsymbol{\alpha})}{\partial\alpha_{j_1}\partial\alpha_{j_2}\partial\alpha_{j_3}\partial\alpha_{j_4}}\right\}_{\boldsymbol{\alpha}^0}\delta\alpha_{j_1}\delta\alpha_{j_2}\delta\alpha_{j_3}\delta\alpha_{j_4} + \varepsilon_k. \end{split} \tag{69}$$
In Equation (69), the quantity $r_k(\boldsymbol{\alpha}^0)$ indicates the computed value of the response using the expected/nominal parameter values $\boldsymbol{\alpha}^0 \equiv (\alpha_1^0, \ldots, \alpha_{TP}^0)^\dagger$, while the notation $\{\cdot\}_{\boldsymbol{\alpha}^0}$ indicates that the quantities within the braces (i.e., the first- through fourth-order sensitivities of the response with respect to the parameters) are also computed using the expected/nominal parameter values. The quantity $\varepsilon_k$ in Equation (69) comprises all quantifiable errors in the representation of the computed response as a function of the model parameters, including the truncation errors $O[(\delta\alpha_j)^5]$ of the Taylor-series expansion, possible bias-errors due to incompletely modeled physical phenomena, and possible random errors due to numerical approximations. The radius/domain of convergence of the series in Equation (69) determines the largest values of the parameter variations $\delta\alpha_j$ which are admissible before the respective series becomes divergent. In turn, these maximum admissible parameter variations limit, through Equation (67), the largest parameter covariances/standard deviations which can be considered when using the Taylor-expansion for the subsequent purposes of computing moments of the distribution of computed responses.
As is well known, and as indicated by Equation (69), the Taylor-series of a function of $TP$ variables [e.g., $r_k(\boldsymbol{\alpha})$] comprises $TP$ first-order derivatives, $TP(TP+1)/2$ distinct second-order derivatives, and so on. The computation by conventional methods of the $n$th-order functional derivatives (called “sensitivities” in the field of sensitivity analysis) of a response with respect to the $TP$ parameters (on which it depends) would require at least $O(TP^n)$ large-scale computations. The exponential increase, with the order of the response sensitivities, of the number of large-scale computations needed to determine higher-order sensitivities is the manifestation of the “curse of dimensionality in sensitivity analysis”, by analogy to the expression coined by Bellman [41] to express the difficulty of using “brute-force” grid search when optimizing a function with many input variables. The “nth-order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (nth-CASAM-N) conceived by Cacuci [20] and the “nth-order Comprehensive Adjoint Sensitivity Analysis Methodology for Response-Coupled Forward/Adjoint Linear Systems” (nth-CASAM-L) conceived by Cacuci [21] are currently the only methodologies that enable the exact and efficient computation of arbitrarily high-order sensitivities while overcoming this curse of dimensionality.
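The combinatorial growth is easy to quantify: the number of distinct $n$th-order sensitivities of a single response with respect to $TP$ parameters equals the multiset coefficient $\binom{TP+n-1}{n}$, which for $n = 2$ reproduces the $TP(TP+1)/2$ count cited above. The snippet below tabulates this count for a hypothetical $TP = 100$.
```python
# Count of distinct n-th order partial derivatives (sensitivities) of one response
# with respect to TP parameters, versus the O(TP^n) brute-force computation count.
from math import comb

TP = 100                                 # hypothetical number of model parameters
for n in range(1, 5):
    distinct = comb(TP + n - 1, n)       # distinct n-th order partial derivatives
    print(f"order {n}: {distinct:>12,d} distinct sensitivities, "
          f"~TP^n = {TP**n:,d} brute-force computations")
```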
Uncertainties in the model’s parameters will evidently give rise to uncertainties in the model responses r k α . The approximate moments of the unknown distribution of r k α are obtained by using the so-called “propagation of errors” methodology, integrating formally, over the unknown distribution, various expressions involving the truncated Taylor-series expansion of the response provided in Equation (69), as first performed by Tukey [42]. Tukey’s results were generalized to sixth-order by Cacuci [21].
The expectation value, $E_c(r_k)$, of a response $r_k(\boldsymbol{\alpha})$ is obtained by integrating Equation (69) formally over $p_c(\boldsymbol{\alpha}, \mathbf{r})$, i.e.:
$$E_c(r_k) \equiv \int_{D_c} r_k(\boldsymbol{\alpha})\,p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r}. \tag{70}$$
The approximations of $E_c(r_k)$ up to fourth-order in the parameters’ standard deviations are presented in Appendix A.
The covariance between a parameter $\alpha_i$ and a computed response $r_k$, which is denoted as $\mathrm{cov}(\alpha_i, r_k)$, is defined below:
$$\mathrm{cov}(\alpha_i, r_k) \equiv \int_{D_c}\delta\alpha_i\left[r_k(\boldsymbol{\alpha}) - E_c(r_k)\right]p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r}. \tag{71}$$
The approximations of $\mathrm{cov}(\alpha_i, r_k)$ up to fourth-order in the parameters’ standard deviations are presented in Appendix A.
The covariance between two responses, $r_k$ and $r_\ell$, is denoted as $\mathrm{cov}(r_k, r_\ell)$ and is defined below:
$$\mathrm{cov}(r_k, r_\ell) \equiv \int_{D_c}\left[r_k(\boldsymbol{\alpha}) - E_c(r_k)\right]\left[r_\ell(\boldsymbol{\alpha}) - E_c(r_\ell)\right]p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r}. \tag{72}$$
The approximations of $\mathrm{cov}(r_k, r_\ell)$ up to fourth-order in the parameters’ standard deviations are presented in Appendix A.
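For orientation, the lowest-order terms of these “propagation of errors” expressions reduce to the familiar first-order “sandwich rule” plus a second-order correction to the mean. The sketch below evaluates these leading-order approximations using hypothetical sensitivity and covariance data; the complete fourth-order expressions are those presented in Appendix A.
```python
# Minimal "propagation of errors" sketch: second-order approximation of the expected
# response and first-order approximations of the response and parameter-response
# covariances. All arrays below are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(4)
TP, TR = 6, 2
S1 = rng.normal(size=(TR, TP))             # first-order sensitivities dr_k/da_j
S2 = rng.normal(size=(TR, TP, TP))
S2 = 0.5 * (S2 + S2.transpose(0, 2, 1))    # symmetrized second-order sensitivities
A = rng.normal(size=(TP, TP))
C_aa = 0.01 * (A @ A.T + TP * np.eye(TP))  # parameter covariance matrix, Eq. (67)
r0 = rng.normal(size=TR)                   # responses at nominal parameter values

# E_c(r_k) ~ r_k(a0) + (1/2) sum_ij [d2 r_k / da_i da_j] cov(a_i, a_j)
E_r = r0 + 0.5 * np.einsum("kij,ij->k", S2, C_aa)
# cov(r_k, r_l) ~ S1 C_aa S1^T   (first-order "sandwich" term)
C_rr = S1 @ C_aa @ S1.T
# cov(a_i, r_k) ~ C_aa S1^T      (first-order parameter-response correlations)
C_ar = C_aa @ S1.T
```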
It is assumed in this work that the only information about the distribution $p_c(\boldsymbol{\alpha}, \mathbf{r})$ is provided by the known first- and second-order moments of the joint distribution of model parameters and responses defined by Equations (66), (67), (70), (71) and (72). It is well known that when the underlying distribution $p_c(\boldsymbol{\alpha}, \mathbf{r})$ needs to be determined from such incomplete information, the principle of maximum entropy (MaxEnt) originally formulated by Jaynes [17] provides the optimal compatibility with the available information, while simultaneously ensuring minimal spurious information content. According to the MaxEnt principle, the probability density $p_c(\boldsymbol{\alpha}, \mathbf{r})$ would satisfy the “available information” without implying any spurious information or hidden assumptions if:
(i)
$p_c(\boldsymbol{\alpha}, \mathbf{r})$ maximizes the Shannon [43] information entropy for the computational model, $S_c$, defined below:
$$S_c = -\int_{D_c} p_c(\boldsymbol{\alpha}, \mathbf{r})\ln p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r}; \tag{73}$$
(ii)
$p_c(\boldsymbol{\alpha}, \mathbf{r})$ satisfies the “moments constraints” defined by Equations (66), (67), (70), (71) and (72);
(iii)
$p_c(\boldsymbol{\alpha}, \mathbf{r})$ satisfies the normalization condition:
$$\int_{D_c} p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} = 1. \tag{74}$$
The MaxEnt distribution $p_c(\boldsymbol{\alpha}, \mathbf{r})$ is obtained as the solution of the constrained variational problem $\delta H_c[p_c]/\delta p_c = 0$, where the entropy (Lagrangian functional) $H_c[p_c]$ is defined as follows:
$$\begin{split} H_c[p_c] = &-\int_{D_c} p_c(\boldsymbol{\alpha}, \mathbf{r})\ln p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - \lambda_0\left[\int_{D_c} p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - 1\right] \\ &- \sum_{k=1}^{TR}\lambda_k^{(1)}\left[\int_{D_c} r_k\,p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - E_c(r_k)\right] - \sum_{i=1}^{TP}\lambda_i^{(2)}\left[\int_{D_c}\alpha_i\,p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - \alpha_i^0\right] \\ &- \frac{1}{2}\sum_{k=1}^{TR}\sum_{\ell=1}^{TR}\lambda_{k\ell}^{(11)}\left[\int_{D_c} r_k\,r_\ell\,p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - \mathrm{cov}(r_k, r_\ell) - E_c(r_k)\,E_c(r_\ell)\right] \\ &- \sum_{k=1}^{TR}\sum_{i=1}^{TP}\lambda_{ki}^{(12)}\left[\int_{D_c} r_k\,\alpha_i\,p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - \mathrm{cov}(r_k, \alpha_i) - E_c(r_k)\,\alpha_i^0\right] \\ &- \frac{1}{2}\sum_{i=1}^{TP}\sum_{j=1}^{TP}\lambda_{ij}^{(22)}\left[\int_{D_c}\alpha_i\,\alpha_j\,p_c(\boldsymbol{\alpha}, \mathbf{r})\,d\boldsymbol{\alpha}\,d\mathbf{r} - \mathrm{cov}(\alpha_i, \alpha_j) - \alpha_i^0\,\alpha_j^0\right]. \end{split} \tag{75}$$
In Equation (75), the quantities $\lambda_k^{(1)}$, $\lambda_i^{(2)}$, $\lambda_{k\ell}^{(11)}$, $\lambda_{ki}^{(12)}$, and $\lambda_{ij}^{(22)}$ denote the respective Lagrange multipliers, and the factors $1/2$ are introduced for subsequent computational convenience.
Solving the equation $\delta H_c[p_c]/\delta p_c = 0$ yields the following expression for the resulting MaxEnt distribution $p_c(\mathbf{z})$:
$$p_c(\mathbf{z}) = \frac{1}{Z_c(\mathbf{b}, \boldsymbol{\Lambda}_c)}\exp\left(-\mathbf{b}^\dagger\mathbf{z} - \frac{1}{2}\mathbf{z}^\dagger\boldsymbol{\Lambda}_c\,\mathbf{z}\right), \tag{76}$$
where the subscript "c" indicates quantities related to the computational model and where the various vectors and matrices are defined as follows:
$$\mathbf{z} \equiv \begin{pmatrix} \mathbf{r} \\ \boldsymbol{\alpha} \end{pmatrix}; \quad \mathbf{b} \equiv \begin{pmatrix} \boldsymbol{\lambda}^{(1)} \\ \boldsymbol{\lambda}^{(2)} \end{pmatrix}; \quad \boldsymbol{\lambda}^{(1)} \equiv \left( \lambda_1^{(1)}, \ldots, \lambda_{TR}^{(1)} \right)^\top; \quad \boldsymbol{\lambda}^{(2)} \equiv \left( \lambda_1^{(2)}, \ldots, \lambda_{TP}^{(2)} \right)^\top; \tag{77}$$
$$\boldsymbol{\Lambda}_c \equiv \begin{pmatrix} \boldsymbol{\Lambda}_c^{(11)} & \boldsymbol{\Lambda}_c^{(12)} \\ \left(\boldsymbol{\Lambda}_c^{(12)}\right)^\top & \boldsymbol{\Lambda}_c^{(22)} \end{pmatrix}; \quad \boldsymbol{\Lambda}_c^{(11)} \equiv \left[ \lambda_{k\ell}^{(11)} \right]_{TR \times TR}; \quad \boldsymbol{\Lambda}_c^{(12)} \equiv \left[ \lambda_{ki}^{(12)} \right]_{TR \times TP}; \quad \boldsymbol{\Lambda}_c^{(22)} \equiv \left[ \lambda_{ij}^{(22)} \right]_{TP \times TP}. \tag{78}$$
The normalization constant $Z_c(\mathbf{b}, \boldsymbol{\Lambda}_c)$ in Equation (76) is defined as follows:
$$Z_c(\mathbf{b}, \boldsymbol{\Lambda}_c) \equiv \int_{D_c} \exp\left(-\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}_c\mathbf{z}\right) d\mathbf{z}; \quad d\mathbf{z} \equiv d\boldsymbol{\alpha}\, d\mathbf{r}. \tag{79}$$
In statistical mechanics, the normalization constant $Z_c$ is called the partition function (or sum over states) and carries all of the information available about the possible states of the system, while the MaxEnt distribution $p_c(\mathbf{z})$ is called the canonical Boltzmann–Gibbs distribution. The integral in Equation (79) can be evaluated explicitly by conservatively extending the computational domain $D_c$ to the entire multidimensional real vector space $\mathbb{R}^N$, where $N \equiv TR + TP$, to obtain the following expression:
$$Z_c(\mathbf{b}, \boldsymbol{\Lambda}_c) = \int_{\mathbb{R}^N} \exp\left(-\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}_c\mathbf{z}\right) d\mathbf{z} = \frac{(2\pi)^{N/2}}{\sqrt{\mathrm{Det}\left(\boldsymbol{\Lambda}_c\right)}}\, \exp\left(\frac{1}{2}\mathbf{b}^\top\boldsymbol{\Lambda}_c^{-1}\mathbf{b}\right). \tag{80}$$
The Lagrange multipliers are determined in terms of the known information (means and covariances of parameters and responses) by differentiating the "free energy" $F(\mathbf{b}, \boldsymbol{\Lambda}_c) \equiv -\ln Z_c(\mathbf{b}, \boldsymbol{\Lambda}_c)$ with respect to the components of the vector $\mathbf{b} \equiv \left(\boldsymbol{\lambda}^{(1)}, \boldsymbol{\lambda}^{(2)}\right)^\top$, to obtain the following expressions:
$$\frac{\partial F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \lambda_k^{(1)}} = \frac{1}{Z_c}\int_{D_c} r_k\, e^{-\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}_c\mathbf{z}}\, d\mathbf{z} = \int_{D_c} r_k\, p_c(\boldsymbol{\alpha},\mathbf{r})\, d\mathbf{z} = E_c(r_k); \quad k = 1, \ldots, TR; \tag{81}$$
$$\frac{\partial F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \lambda_i^{(2)}} = \frac{1}{Z_c}\int_{D_c} \alpha_i\, e^{-\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}_c\mathbf{z}}\, d\mathbf{z} = \int_{D_c} \alpha_i\, p_c(\boldsymbol{\alpha},\mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r} = \alpha_i^{0}; \quad i = 1, \ldots, TP. \tag{82}$$
The results obtained in Equations (81) and (82) can be collectively written in vector-matrix form as follows:
$$\frac{\partial F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \mathbf{b}} = \mathbf{z}_c; \quad \mathbf{z}_c \equiv \begin{pmatrix} E_c(\mathbf{r}) \\ \boldsymbol{\alpha}^0 \end{pmatrix}; \quad E_c(\mathbf{r}) \equiv \left[ E_c(r_1), \ldots, E_c(r_{TR}) \right]^\top. \tag{83}$$
On the other hand, it follows from Equation (80) that
$$\frac{\partial F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \mathbf{b}} = -\boldsymbol{\Lambda}_c^{-1}\mathbf{b}. \tag{84}$$
The relations obtained in Equations (83) and (84) imply the following relation:
$$\mathbf{b} = -\boldsymbol{\Lambda}_c\, \mathbf{z}_c. \tag{85}$$
Differentiating the relations provided in Equations (81) and (82) a second time yields the following relations:
$$\frac{\partial^2 F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \lambda_j^{(1)} \partial \lambda_k^{(1)}} = -\frac{1}{Z_c^2}\frac{\partial Z_c}{\partial \lambda_j^{(1)}}\int_{D_c} r_k\, e^{-\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}_c\mathbf{z}}\, d\mathbf{z} - \frac{1}{Z_c}\int_{D_c} r_j\, r_k\, e^{-\mathbf{b}^\top\mathbf{z} - \frac{1}{2}\mathbf{z}^\top\boldsymbol{\Lambda}_c\mathbf{z}}\, d\mathbf{z} = E_c(r_j)\, E_c(r_k) - E_c(r_j r_k) \equiv -\mathrm{cov}(r_j, r_k); \quad j, k = 1, \ldots, TR; \tag{86}$$
$$\frac{\partial^2 F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \lambda_i^{(2)} \partial \lambda_k^{(1)}} = E_c(r_k)\, \alpha_i^{0} - E_c(\alpha_i r_k) \equiv -\mathrm{cov}(\alpha_i, r_k); \quad i = 1, \ldots, TP; \; k = 1, \ldots, TR; \tag{87}$$
$$\frac{\partial^2 F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \lambda_i^{(2)} \partial \lambda_j^{(2)}} = \alpha_i^{0}\, \alpha_j^{0} - E_c(\alpha_i \alpha_j) \equiv -\mathrm{cov}(\alpha_i, \alpha_j); \quad i, j = 1, \ldots, TP. \tag{88}$$
The results obtained in Equations (86)–(88) can be collectively written in vector-matrix form as follows:
$$\frac{\partial^2 F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \mathbf{b}\, \partial \mathbf{b}^\top} = -\mathbf{C}_c; \quad \mathbf{C}_c \equiv \begin{pmatrix} \mathbf{C}_c^{rr} & \mathbf{C}_c^{r\alpha} \\ \mathbf{C}_c^{\alpha r} & \mathbf{C}^{\alpha\alpha} \end{pmatrix}; \quad \mathbf{C}_c^{rr} \equiv \left[ \mathrm{cov}(r_j, r_k) \right]_{TR \times TR}; \quad \mathbf{C}_c^{r\alpha} \equiv \left[ \mathrm{cov}(r_k, \alpha_i) \right]_{TR \times TP} = \left( \mathbf{C}_c^{\alpha r} \right)^\top; \quad \mathbf{C}^{\alpha\alpha} \equiv \left[ \mathrm{cov}(\alpha_i, \alpha_j) \right]_{TP \times TP}. \tag{89}$$
On the other hand, it follows from Equation (84) that
$$\frac{\partial^2 F(\mathbf{b}, \boldsymbol{\Lambda}_c)}{\partial \mathbf{b}\, \partial \mathbf{b}^\top} = -\boldsymbol{\Lambda}_c^{-1}. \tag{90}$$
The relations obtained in Equations (89) and (90) imply the following relation:
$$\boldsymbol{\Lambda}_c^{-1} = \mathbf{C}_c. \tag{91}$$
Introducing the results obtained in Equations (84) and (91) into Equations (76) and (80) yields the following expression for the MaxEnt distribution $p_c(\mathbf{z} \mid \mathbf{z}_c, \mathbf{C}_c)$:
$$p_c(\mathbf{z} \mid \mathbf{z}_c, \mathbf{C}_c) = (2\pi)^{-N/2}\left[\mathrm{Det}\left(\mathbf{C}_c\right)\right]^{-1/2} \exp\left[-\frac{1}{2}\left(\mathbf{z} - \mathbf{z}_c\right)^\top \mathbf{C}_c^{-1} \left(\mathbf{z} - \mathbf{z}_c\right)\right]. \tag{92}$$
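Since Equation (92) is a multivariate Gaussian in the combined response-parameter space, its defining information (the mean vector $\mathbf{z}_c$ and covariance matrix $\mathbf{C}_c$) is easy to verify by direct sampling. A toy sketch follows (hypothetical block data; the small additive term on the response block merely keeps the joint covariance nonsingular):

```python
import numpy as np

rng = np.random.default_rng(1)
TR, TP = 2, 4
N = TR + TP

# Toy blocks of the joint covariance C_c of Eq. (89); in an actual application
# these would come from the sensitivity-based expressions in Appendix A.
S = rng.normal(size=(TR, TP))
A = rng.normal(size=(TP, TP))
C_aa = A @ A.T + np.eye(TP)
C_rr = S @ C_aa @ S.T + 0.1 * np.eye(TR)     # keeps the joint matrix nonsingular
C_ra = S @ C_aa
C_c = np.block([[C_rr, C_ra], [C_ra.T, C_aa]])
z_c = rng.normal(size=N)                     # stand-in for [E_c(r), alpha^0]

samples = rng.multivariate_normal(z_c, C_c, size=200_000)
print(np.abs(samples.mean(axis=0) - z_c).max())   # small: recovers z_c
print(np.abs(np.cov(samples.T) - C_c).max())      # small relative to C_c entries
```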

3.2. Construction of the Second-Order-Accurate MaxEnt Probability Distribution $p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e)$ of Experimentally Measured Responses

In addition to the information regarding the computational model's responses and parameters, in practice there also exists experimental information about the responses and model parameters, obtained independently of computations. In the most comprehensive setting, there would be independent measurements of the computed responses, of the model parameters, and of correlations between the model parameters and the measured responses. In this idealized setting, each one of the computed responses would also be measured experimentally, so that information about the mean values and covariances for a total number $TR$ of experimentally measured responses would also be available. The mean values of the experimentally measured responses are denoted as $r_i^e$, $i = 1, \ldots, TR$, and the covariances of two measured responses are denoted as $\mathrm{cov}(r_i, r_j)^e$, $i, j = 1, \ldots, TR$. The letter "e" is used either as a subscript or a superscript to indicate experimentally measured quantities. Similarly, experimentally determined values for the expected values $\alpha_i^e$, $i = 1, \ldots, TP$, and newly measured covariances $\mathrm{cov}(\alpha_i, \alpha_j)^e$ for the model parameters could also be available. In principle, it is also possible to obtain correlations between some measured responses and some model parameters. In the nuclear energy field, for example, an important measured detector response is the fission rate in a material (e.g., reactor fuel) which contains (fissionable) uranium-235. The fission cross section of uranium-235, however, would also be a model parameter in the computation of the fission reaction rate, which would thus be correlated to the response.
Formally, the measured responses and parameters are considered to follow a joint, unknown, probability distribution function, denoted as $p_e(\boldsymbol{\alpha}, \mathbf{r})$ and formally defined on a domain denoted as $D_e \subseteq \mathbb{R}^N$, $N \equiv TP + TR$. The means and covariances of the experimentally measured responses are defined formally as follows:
$$r_i^e \equiv \int_{D_e} r_i\, p_e(\boldsymbol{\alpha}, \mathbf{r})\, d\mathbf{r}\, d\boldsymbol{\alpha}, \quad i = 1, \ldots, TR; \tag{93}$$
$$\mathrm{cov}\left(r_i, r_j\right)^e \equiv \int_{D_e} \left(r_i - r_i^e\right)\left(r_j - r_j^e\right) p_e(\boldsymbol{\alpha}, \mathbf{r})\, d\mathbf{r}\, d\boldsymbol{\alpha}; \quad i, j = 1, \ldots, TR. \tag{94}$$
The expected values of the measured responses are considered to constitute the components of a vector denoted as $\mathbf{r}^e \equiv \left(r_1^e, \ldots, r_{TR}^e\right)^\top$. The covariances (i.e., standard deviations and correlations) of the measured responses are considered to be the components of the $TR \times TR$-dimensional covariance matrix of measured responses, denoted as $\mathbf{C}_e^{rr} \equiv \left[\mathrm{cov}(r_i, r_j)^e\right]_{TR \times TR}$. Similarly, the mean values and covariances of the additional measurements of model parameters can be formally represented as follows:
$$\alpha_j^e \equiv \int_{D_e} \alpha_j\, p_e(\boldsymbol{\alpha}, \mathbf{r})\, d\mathbf{r}\, d\boldsymbol{\alpha}, \quad j = 1, \ldots, TP. \tag{95}$$
The expected values of the measured parameters are considered to constitute the components of a vector denoted as $\boldsymbol{\alpha}^e \equiv \left(\alpha_1^e, \ldots, \alpha_{TP}^e\right)^\top$. The covariance, $\mathrm{cov}(\alpha_i, \alpha_j)^e$, of two independently/newly measured parameters, $\alpha_i$ and $\alpha_j$, is defined as follows:
$$\mathrm{cov}\left(\alpha_i, \alpha_j\right)^e \equiv \int_{D_e} \delta\alpha_i^e\, \delta\alpha_j^e\, p_e(\boldsymbol{\alpha}, \mathbf{r})\, d\mathbf{r}\, d\boldsymbol{\alpha}; \quad i, j = 1, \ldots, TP; \quad \delta\alpha_i^e \equiv \alpha_i - \alpha_i^e. \tag{96}$$
The covariances $\mathrm{cov}(\alpha_i, \alpha_j)^e$ would constitute the components of the covariance matrix, denoted as $\mathbf{C}_e^{\alpha\alpha}$, of the newly measured parameters. When correlations, denoted as $\mathrm{cov}(\alpha_i, r_j)^e$, between the measured responses and the model parameters are available, they can formally be considered to be the elements of a (very sparse, in practice) rectangular correlation matrix denoted as $\mathbf{C}_e^{\alpha r} \equiv \left[\mathrm{cov}(\alpha_i, r_j)^e\right]_{TP \times TR}$, where
$$\mathrm{cov}\left(\alpha_i, r_j\right)^e \equiv \int_{D_e} \left(\alpha_i - \alpha_i^e\right)\left(r_j - r_j^e\right) p_e(\boldsymbol{\alpha}, \mathbf{r})\, d\mathbf{r}\, d\boldsymbol{\alpha}; \quad i = 1, \ldots, TP; \; j = 1, \ldots, TR. \tag{97}$$
The MaxEnt principle [17] can now be applied to construct the least informative (and hence most conservative) distribution carrying the information provided by the vector of mean experimental values $\mathbf{r}^e$ and by the covariance matrices $\mathbf{C}_e^{rr} \equiv \left[\mathrm{cov}(r_i, r_j)^e\right]_{TR \times TR}$ and $\mathbf{C}_e^{\alpha r} \equiv \left[\mathrm{cov}(\alpha_i, r_j)^e\right]_{TP \times TR}$. This MaxEnt distribution of the experimentally measured responses is denoted as $p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e)$ and is constructed by applying the same steps as those leading to Equation (92), above, to obtain the following expression:
$$p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e) = (2\pi)^{-N/2}\left[\mathrm{Det}\left(\mathbf{C}_e\right)\right]^{-1/2} \exp\left[-\frac{1}{2}\left(\mathbf{z} - \mathbf{z}^e\right)^\top \mathbf{C}_e^{-1} \left(\mathbf{z} - \mathbf{z}^e\right)\right], \tag{98}$$
where
$$\mathbf{C}_e \equiv \begin{pmatrix} \mathbf{C}_e^{rr} & \mathbf{C}_e^{r\alpha} \\ \mathbf{C}_e^{\alpha r} & \mathbf{C}_e^{\alpha\alpha} \end{pmatrix}; \quad \mathbf{z} = \begin{pmatrix} \mathbf{r} \\ \boldsymbol{\alpha} \end{pmatrix}; \quad \mathbf{z}^e \equiv \begin{pmatrix} \mathbf{r}^e \\ \boldsymbol{\alpha}^e \end{pmatrix}. \tag{99}$$

3.3. Construction of the Complete Second-Order-Accurate Joint Posterior MaxEnt Probability Distribution of Computed and Measured Responses and Model Parameters

The joint posterior probability distribution of all computed and experimentally measured quantities, which is denoted as $p_p(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p)$, where the subscript "p" indicates "posterior," is obtained as the properly normalized product of the distributions $p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e)$ and $p_c(\mathbf{z} \mid \mathbf{z}_c, \mathbf{C}_c)$. The posterior probability $p_p(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p)$ is defined on the domain $D_p$ obtained as the union of the computational and experimental domains, i.e., $D_p \equiv D_c \cup D_e$. Straightforward, albeit tedious, algebraic computations lead to the following expression for $p_p(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p)$:
$$p_p(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p) \equiv p_c(\mathbf{z} \mid \mathbf{z}_c, \mathbf{C}_c)\; p_e(\mathbf{z} \mid \mathbf{z}^e, \mathbf{C}_e) = K \exp\left[-\frac{1}{2} Q(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p)\right], \tag{100}$$
where the normalization constant $K$ and the quadratic form $Q(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p)$ have the following expressions, respectively:
$$K \equiv (2\pi)^{-N/2} \left[\mathrm{Det}\left(\mathbf{C}_c + \mathbf{C}_e\right)\right]^{-1/2} \exp\left[-\frac{1}{2}\left(\mathbf{z}_c - \mathbf{z}^e\right)^\top \left(\mathbf{C}_c + \mathbf{C}_e\right)^{-1} \left(\mathbf{z}_c - \mathbf{z}^e\right)\right]; \tag{101}$$
$$Q(\mathbf{z} \mid \mathbf{z}_p, \mathbf{C}_p) \equiv \left(\mathbf{z} - \mathbf{z}_p\right)^\top \mathbf{C}_p^{-1} \left(\mathbf{z} - \mathbf{z}_p\right); \tag{102}$$
$$\mathbf{C}_p \equiv \left(\mathbf{C}_c^{-1} + \mathbf{C}_e^{-1}\right)^{-1} = \mathbf{C}_c - \mathbf{C}_c \left(\mathbf{C}_c + \mathbf{C}_e\right)^{-1} \mathbf{C}_c = \mathbf{C}_e - \mathbf{C}_e \left(\mathbf{C}_c + \mathbf{C}_e\right)^{-1} \mathbf{C}_e; \tag{103}$$
$$\mathbf{z}_p \equiv \mathbf{C}_p \left(\mathbf{C}_c^{-1} \mathbf{z}_c + \mathbf{C}_e^{-1} \mathbf{z}^e\right). \tag{104}$$
The expression obtained in Equation (100) provides the exact first-order moments (mean values) and second-order moments (variances and covariances) of the most comprehensive combined distribution of computations and measurements of responses and parameters. In practice, such a comprehensive amount of experimental information is highly unlikely to be available. Furthermore, even if such a massive amount of experimental information were available, the inversion of the matrix $\left(\mathbf{C}_c + \mathbf{C}_e\right)$ would require massive computational resources for large systems involving many parameters.
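For problems of moderate size, Equations (103) and (104) can nevertheless be evaluated with a single factorization of $(\mathbf{C}_c + \mathbf{C}_e)$, using the algebraically equivalent update $\mathbf{z}_p = \mathbf{z}_c - \mathbf{C}_c\left(\mathbf{C}_c + \mathbf{C}_e\right)^{-1}\left(\mathbf{z}_c - \mathbf{z}^e\right)$. A minimal sketch (function name hypothetical):

```python
import numpy as np

def fuse_gaussians(z_c, C_c, z_e, C_e):
    """Posterior mean and covariance of Eqs. (103)-(104), written with one
    linear solve against (C_c + C_e) instead of inverting C_c and C_e."""
    K = np.linalg.solve(C_c + C_e, C_c).T      # K = C_c (C_c + C_e)^{-1} (symmetric matrices)
    C_p = C_c - K @ C_c                        # Eq. (103), second form
    z_p = z_c - K @ (z_c - z_e)                # algebraically equal to Eq. (104)
    return z_p, C_p
```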

3.4. Practical Situation: Only Response Measurements Are Available to Be Assimilated

In practice, the information (mean values and covariances) about the model parameters indicated in Equations (66)–(68) is obtained prior to using the model and, hence, prior to computing responses with it. Additional experimental information usually becomes available only for the responses of interest. Any experimental information that might become available about the model parameters and/or about correlations between some model parameters and some responses would need to be assessed against the similar information already contained in the covariance/correlation matrices $\mathbf{C}^{\alpha\alpha}$ and/or $\mathbf{C}_c^{r\alpha}$, so that the respective experimental and computational information could be combined into the appropriate components of these matrices. In practice, therefore, only the information that becomes additionally available about measured responses needs to be assimilated explicitly, i.e., only the components of the vector $\mathbf{r}^e \equiv \left(r_1^e, \ldots, r_{TR}^e\right)^\top$ and of the matrix $\mathbf{C}_e^{rr} \equiv \left[\mathrm{cov}(r_i, r_j)^e\right]_{TR \times TR}$ would become available for assimilation and predictive modeling. These measured responses are considered to follow an unknown probability distribution function, denoted as $p_e(\mathbf{r})$ and defined on a domain denoted as $D_e \subseteq \mathbb{R}^{TR}$, so that the expressions in Equations (93) and (94) take on the following forms, respectively:
$$r_i^e \equiv \int_{D_e} r_i\, p_e(\mathbf{r})\, d\mathbf{r}, \quad i = 1, \ldots, TR; \tag{105}$$
$$\mathrm{cov}\left(r_i, r_j\right)^e \equiv \int_{D_e} \left(r_i - r_i^e\right)\left(r_j - r_j^e\right) p_e(\mathbf{r})\, d\mathbf{r}; \quad i, j = 1, \ldots, TR. \tag{106}$$
The MaxEnt distribution corresponding to the information provided in Equations (105) and (106) has the following expression:
$$p_e(\mathbf{r} \mid \mathbf{r}^e, \mathbf{C}_e^{rr}) = (2\pi)^{-TR/2}\left[\mathrm{Det}\left(\mathbf{C}_e^{rr}\right)\right]^{-1/2} \exp\left[-\frac{1}{2}\left(\mathbf{r} - \mathbf{r}^e\right)^\top \left(\mathbf{C}_e^{rr}\right)^{-1} \left(\mathbf{r} - \mathbf{r}^e\right)\right]. \tag{107}$$
Furthermore, when only the experimental information represented by the distribution $p_e(\mathbf{r} \mid \mathbf{r}^e, \mathbf{C}_e^{rr})$ is available, the posterior joint probability distribution of the computed and measured quantities, which is denoted as $p_p(\mathbf{r}, \boldsymbol{\alpha})$, is proportional to the product $p_c(\mathbf{z} \mid \mathbf{z}_c, \mathbf{C}_c)\, p_e(\mathbf{r} \mid \mathbf{r}^e, \mathbf{C}_e^{rr})$ and is considered to be defined on the corresponding domain $D_p$; it takes on the following form:
$$p_p(\mathbf{r}, \boldsymbol{\alpha}) = \mathcal{N} \exp\left[-\frac{1}{2} Q(\mathbf{r}, \boldsymbol{\alpha})\right]; \quad \mathcal{N} \equiv \left\{\int_{D_p} \exp\left[-\frac{1}{2} Q(\mathbf{r}, \boldsymbol{\alpha})\right] d\boldsymbol{\alpha}\, d\mathbf{r}\right\}^{-1}; \quad Q(\mathbf{r}, \boldsymbol{\alpha}) \equiv \left(\mathbf{z} - \mathbf{z}_c\right)^\top \mathbf{C}_c^{-1} \left(\mathbf{z} - \mathbf{z}_c\right) + \left(\mathbf{r} - \mathbf{r}^e\right)^\top \left(\mathbf{C}_e^{rr}\right)^{-1} \left(\mathbf{r} - \mathbf{r}^e\right). \tag{108}$$
The posterior distribution represented by Equation (108) is evidently not Gaussian. The posterior mean values of the best-estimate predicted responses and parameters are defined as before, namely:
$$\mathbf{r}^{be} \equiv \left[\int_{D_p} p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r}\right]^{-1} \int_{D_p} \mathbf{r}\, p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r}; \tag{109}$$
$$\boldsymbol{\alpha}^{be} \equiv \left[\int_{D_p} p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r}\right]^{-1} \int_{D_p} \boldsymbol{\alpha}\, p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r}. \tag{110}$$
The evaluation of the integrals appearing in Equations (109) and (110) can be performed to a high degree of accuracy, with a controlled error, by employing the saddle-point (Laplace) method [44,45,46]. For a ratio of integrals of the form
$$I = \left[\int \exp\left(g(\mathbf{z})\right) d\mathbf{z}\right]^{-1} \int f(\mathbf{z}) \exp\left(g(\mathbf{z})\right) d\mathbf{z}, \tag{111}$$
the saddle-point (Laplace) method yields the following result [44,45,46]:
$$I = \hat{f} - \frac{1}{2}\,\hat{f}_{i_1}\,\hat{g}_{i_2 i_3 i_4}\,\hat{g}^{i_1 i_2}\,\hat{g}^{i_3 i_4} + \frac{1}{2}\,\hat{f}_{j_1 j_2}\,\hat{g}^{j_1 j_2} + O\left(\hat{f}_{j_1 j_2 j_3}\right), \tag{112}$$
where:
(i)
the derivative of a function with respect to a component of $\mathbf{z}$ is denoted using a subscript, e.g., $f_i \equiv \partial f/\partial z_i$, $f_{ij} \equiv \partial^2 f/\partial z_i \partial z_j$, $i, j = 1, \ldots, TI$, where $TI$ denotes the total number of independent variables;
(ii)
the superscripts denote the respective components of the inverse Hessian of the respective function, e.g., $f^{ij}$ denotes the $(i,j)$-element of the inverse Hessian matrix $\left[f_{ij}\right]^{-1}$;
(iii)
an index that appears as a subscript and a superscript implies a summation over all possible values of that index;
(iv)
the "hat" denotes that the respective quantity is evaluated at the saddle point of $\exp\left[g(\mathbf{z})\right]$, which is defined as the point at which the gradient $\partial g/\partial \mathbf{z} \equiv \left(\partial g/\partial z_1, \ldots, \partial g/\partial z_{TI}\right)^\top$ of $g(\mathbf{z})$ vanishes, i.e., $\partial g(\mathbf{z})/\partial \mathbf{z} = \mathbf{0}$.
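The structure of Equations (111) and (112) is easiest to see in one dimension, where (writing $\sigma^2 = -1/\hat{g}''$) the correction to the leading term $\hat{f}$ reduces to $\frac{1}{2}\hat{f}''\sigma^2 + \frac{1}{2}\hat{f}'\hat{g}'''\sigma^4$. The sketch below (toy $f$ and $g$ with hand-computed derivatives; not the author's implementation) compares this truncated expansion against direct quadrature; the agreement improves as the peakedness parameter kappa grows:

```python
import numpy as np
from scipy.integrate import quad

# One-dimensional check of the saddle-point (Laplace) evaluation of the
# ratio I = [integral of exp(g)]^{-1} * integral of f*exp(g), Eq. (111).
kappa = 25.0                                       # peakedness of exp(g)
g = lambda z: kappa * (-0.5 * z**2 + 0.1 * z**3 - 0.01 * z**4)
f = lambda z: z + 0.5 * z**2

num, _ = quad(lambda z: f(z) * np.exp(g(z)), -np.inf, np.inf)
den, _ = quad(lambda z: np.exp(g(z)), -np.inf, np.inf)
I_exact = num / den

# Saddle point of exp(g) is z_hat = 0; derivatives there, computed by hand:
g2, g3 = -kappa, 0.6 * kappa                       # g''(0), g'''(0)
f0, f1, f2 = 0.0, 1.0, 1.0                         # f(0), f'(0), f''(0)
sigma2 = -1.0 / g2                                 # local Gaussian variance
I_laplace = f0 + 0.5 * f2 * sigma2 + 0.5 * f1 * g3 * sigma2**2

print(I_exact, I_laplace)   # the two values approach each other for large kappa
```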
The saddle point of $p_p(\mathbf{r}, \boldsymbol{\alpha})$ is denoted as $(\mathbf{r}_s, \boldsymbol{\alpha}_s)$ and is defined by the following relations:
$$\frac{\partial Q(\mathbf{r}, \boldsymbol{\alpha})}{\partial \mathbf{r}} = \mathbf{0}; \quad \frac{\partial Q(\mathbf{r}, \boldsymbol{\alpha})}{\partial \boldsymbol{\alpha}} = \mathbf{0}; \quad \text{at } (\mathbf{r}, \boldsymbol{\alpha}) = (\mathbf{r}_s, \boldsymbol{\alpha}_s). \tag{113}$$
To obtain the partial gradients (differentials) indicated in Equation (113), it is convenient to write the matrix $\mathbf{C}_c^{-1}$ in the partitioned form $\mathbf{C}_c^{-1} = \begin{pmatrix} \mathbf{C}^{11} & \mathbf{C}^{12} \\ \mathbf{C}^{21} & \mathbf{C}^{22} \end{pmatrix}$, and to use this form together with Equations (99) and (83) in Equation (108) to expand the functional $Q(\mathbf{r}, \boldsymbol{\alpha})$ in the following form:
$$
\begin{aligned}
Q(\mathbf{r}, \boldsymbol{\alpha}) = {} & \left[\mathbf{r} - E_c(\mathbf{r})\right]^\top \mathbf{C}^{11} \left[\mathbf{r} - E_c(\mathbf{r})\right] + \left[\mathbf{r} - E_c(\mathbf{r})\right]^\top \mathbf{C}^{12} \left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^0\right) + \left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^0\right)^\top \mathbf{C}^{21} \left[\mathbf{r} - E_c(\mathbf{r})\right] \\
& + \left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^0\right)^\top \mathbf{C}^{22} \left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^0\right) + \left(\mathbf{r} - \mathbf{r}^e\right)^\top \left(\mathbf{C}_e^{rr}\right)^{-1} \left(\mathbf{r} - \mathbf{r}^e\right).
\end{aligned} \tag{114}
$$
Setting the partial differentials of the expression in Equation (114) to zero yields the following equation at $(\mathbf{r}, \boldsymbol{\alpha}) = (\mathbf{r}_s, \boldsymbol{\alpha}_s)$:
$$\begin{pmatrix} \mathbf{r}_s - E_c(\mathbf{r}) \\ \boldsymbol{\alpha}_s - \boldsymbol{\alpha}^0 \end{pmatrix} = -\begin{pmatrix} \mathbf{C}_c^{rr} & \mathbf{C}_c^{r\alpha} \\ \mathbf{C}_c^{\alpha r} & \mathbf{C}^{\alpha\alpha} \end{pmatrix} \begin{pmatrix} \left(\mathbf{C}_e^{rr}\right)^{-1}\left(\mathbf{r}_s - \mathbf{r}^e\right) \\ \mathbf{0} \end{pmatrix}. \tag{115}$$
Solving Equation (115) leads to the following expressions for the saddle-point coordinates $\mathbf{r}_s$ and $\boldsymbol{\alpha}_s$:
$$\mathbf{r}_s = \mathbf{r}^e + \mathbf{C}_e^{rr}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\left[E_c(\mathbf{r}) - \mathbf{r}^e\right]; \tag{116}$$
$$\boldsymbol{\alpha}_s = \boldsymbol{\alpha}^0 - \mathbf{C}_c^{\alpha r}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\left[E_c(\mathbf{r}) - \mathbf{r}^e\right]. \tag{117}$$
Using the results obtained in Equations (112), (116) and (117) in Equations (109) and (110), respectively, yields the following best-estimate predicted values for the responses and, respectively, the calibrated parameters:
$$\mathbf{r}^{be} = \mathbf{r}_s = \mathbf{r}^e + \mathbf{C}_e^{rr}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\left[E_c(\mathbf{r}) - \mathbf{r}^e\right]; \tag{118}$$
$$\boldsymbol{\alpha}^{be} = \boldsymbol{\alpha}_s = \boldsymbol{\alpha}^0 - \mathbf{C}_c^{\alpha r}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\left[E_c(\mathbf{r}) - \mathbf{r}^e\right]. \tag{119}$$
Since the components of the vector $E_c(\mathbf{r})$ and the components of the matrices $\mathbf{C}_c^{rr}$ and $\mathbf{C}_c^{\alpha r}$ can contain arbitrarily high-order response sensitivities to the model parameters, as shown in Appendix A, the formulas presented in Equations (118) and (119) generalize all of the previous formulas of this type found in the data-adjustment/assimilation procedures published to date (which contain at most second-order sensitivities). The best-estimate parameter values are the "calibrated model parameters," which can be used for subsequent computations with the "calibrated model."
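As a minimal implementation sketch of Equations (118) and (119) (function and argument names are hypothetical), note that the computed-minus-measured deviation vector is solved against the $TR \times TR$ matrix $\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)$ once and then reused:

```python
import numpy as np

def berru_best_estimates(E_r, r_e, C_c_rr, C_e_rr, alpha0, C_c_ar):
    """Best-estimate responses and calibrated parameters, Eqs. (118)-(119)."""
    w = np.linalg.solve(C_e_rr + C_c_rr, E_r - r_e)   # (C_e^rr + C_c^rr)^{-1} [E_c(r) - r^e]
    r_be = r_e + C_e_rr @ w                            # Eq. (118)
    alpha_be = alpha0 - C_c_ar @ w                     # Eq. (119)
    return r_be, alpha_be
```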
The second-order moments of the posterior distribution $p_p(\mathbf{r}, \boldsymbol{\alpha})$ comprise the covariances of the best-estimate responses, denoted as $\mathbf{C}^{be}_{rr}$, the covariances of the best-estimate parameters, denoted as $\mathbf{C}^{be}_{\alpha\alpha}$, and the correlations between the best-estimate parameters and responses, denoted as $\mathbf{C}^{be}_{\alpha r}$. The expression of the best-estimate posterior response covariance matrix $\mathbf{C}^{be}_{rr}$ for the best-estimate responses $\mathbf{r}^{be}$ is derived by using the results given in Equations (118) and (119), to obtain:
$$\mathbf{C}^{be}_{rr} \equiv \int_{D_p} \left(\mathbf{r} - \mathbf{r}^{be}\right)\left(\mathbf{r} - \mathbf{r}^{be}\right)^\top p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r} = \mathbf{C}_e^{rr} - \mathbf{C}_e^{rr}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\mathbf{C}_e^{rr}. \tag{120}$$
The following important result has been used to obtain the final expression provided in Equation (120):
$$\left\langle \left[E_c(\mathbf{r}) - \mathbf{r}^e\right]\left[E_c(\mathbf{r}) - \mathbf{r}^e\right]^\top \right\rangle = \left\langle \left(\mathbf{r} - \mathbf{r}^e\right)\left(\mathbf{r} - \mathbf{r}^e\right)^\top \right\rangle + \left\langle \left[\mathbf{r} - E_c(\mathbf{r})\right]\left[\mathbf{r} - E_c(\mathbf{r})\right]^\top \right\rangle = \mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}, \tag{121}$$
which holds because the measurement errors $\mathbf{r} - \mathbf{r}^e$ and the computational errors $\mathbf{r} - E_c(\mathbf{r})$ are uncorrelated.
As indicated in Equation (120), the initial covariance matrix $\mathbf{C}_e^{rr}$ of the experimentally measured responses is multiplied by the matrix $\left[\mathbf{I} - \left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\mathbf{C}_e^{rr}\right]$, which means that the variances on the diagonal of the best-estimate matrix $\mathbf{C}^{be}_{rr}$ are smaller than the experimentally measured variances on the diagonal of $\mathbf{C}_e^{rr}$. Hence, the incorporation of experimental information reduces the predicted best-estimate response variances in $\mathbf{C}^{be}_{rr}$ by comparison to the measured variances contained a priori in $\mathbf{C}_e^{rr}$. Since the components of the matrix $\mathbf{C}^{be}_{rr}$ contain high-order sensitivities, the formula presented in Equation (120) generalizes all of the previous formulas of this type found in the data-adjustment/assimilation procedures published to date (which contain at most first-order sensitivities).
The expression of the best-estimate posterior parameter covariance matrix $\mathbf{C}^{be}_{\alpha\alpha}$ for the best-estimate parameters $\boldsymbol{\alpha}^{be}$ is derived by using the result given in Equation (119) to obtain:
$$\mathbf{C}^{be}_{\alpha\alpha} \equiv \int_{D_p} \left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^{be}\right)\left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^{be}\right)^\top p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r} = \mathbf{C}^{\alpha\alpha} - \mathbf{C}_c^{\alpha r}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\mathbf{C}_c^{r\alpha}. \tag{122}$$
The matrix $\mathbf{C}^{\alpha\alpha}$ is symmetric and positive definite, while the matrix $\mathbf{C}_c^{\alpha r}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\mathbf{C}_c^{r\alpha}$ is symmetric and positive semi-definite. Therefore, the subtraction indicated in Equation (122) implies that the elements on the main diagonal of $\mathbf{C}^{be}_{\alpha\alpha}$ can be at most as large as, and are in general smaller than, the corresponding elements on the main diagonal of $\mathbf{C}^{\alpha\alpha}$. In this sense, the combination of computational and experimental information reduces the best-estimate parameter variances on the diagonal of $\mathbf{C}^{be}_{\alpha\alpha}$. Since the components of the matrices $\mathbf{C}^{\alpha\alpha}$, $\mathbf{C}_c^{\alpha r}$, and $\mathbf{C}_c^{rr}$ contain high-order response sensitivities, the formula presented in Equation (122) generalizes all of the previous formulas of this type found in the data-adjustment/assimilation procedures published to date (which contain at most first-order sensitivities).
The expressions of the best-estimate posterior parameter-response correlation matrix $\mathbf{C}^{be}_{\alpha r}$ and of its transpose $\mathbf{C}^{be}_{r\alpha}$, for the best-estimate parameters $\boldsymbol{\alpha}^{be}$ and best-estimate responses $\mathbf{r}^{be}$, are derived by using the results given in Equations (118) and (119) to obtain the following expressions:
$$\mathbf{C}^{be}_{\alpha r} \equiv \int_{D_p} \left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^{be}\right)\left(\mathbf{r} - \mathbf{r}^{be}\right)^\top p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r} = \mathbf{C}_c^{\alpha r}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\mathbf{C}_e^{rr}; \tag{123}$$
$$\mathbf{C}^{be}_{r\alpha} \equiv \int_{D_p} \left(\mathbf{r} - \mathbf{r}^{be}\right)\left(\boldsymbol{\alpha} - \boldsymbol{\alpha}^{be}\right)^\top p_p(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r} = \mathbf{C}_e^{rr}\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}\mathbf{C}_c^{r\alpha} = \left(\mathbf{C}^{be}_{\alpha r}\right)^\top. \tag{124}$$
Since the components of the matrices $\mathbf{C}_c^{\alpha r}$ and $\mathbf{C}_c^{rr}$ contain high-order sensitivities, the formulas presented in Equations (123) and (124) generalize all of the previous formulas of this type found in the data-adjustment/assimilation procedures published to date (which contain at most first-order sensitivities).
It is important to note from the results shown in Equations (118)–(124) that the computation of the best-estimate parameter and response values, together with their corresponding best-estimate covariance matrices, requires only the computation of $\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}$, which entails the inversion of a single matrix of size $TR \times TR$. This is computationally very advantageous since, in the overwhelming majority of practical situations, $TR \ll TP$, i.e., the number of responses is much smaller than the number of model parameters.
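The posterior covariances of Equations (120) and (122)–(124) reuse the same $TR \times TR$ matrix; the following sketch continues the one above (names hypothetical), and the reduction of the diagonal variances can be verified directly on its outputs:

```python
def berru_posterior_covariances(C_c_rr, C_e_rr, C_aa, C_c_ar):
    """Best-estimate covariances, Eqs. (120) and (122)-(124)."""
    S = C_e_rr + C_c_rr
    W = np.linalg.solve(S, C_e_rr)                           # (C_e^rr + C_c^rr)^{-1} C_e^rr
    C_be_rr = C_e_rr - C_e_rr @ W                            # Eq. (120)
    C_be_aa = C_aa - C_c_ar @ np.linalg.solve(S, C_c_ar.T)   # Eq. (122)
    C_be_ar = C_c_ar @ W                                     # Eq. (123); Eq. (124) is its transpose
    return C_be_rr, C_be_aa, C_be_ar
```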
The minimum value, $Q_{\min} \equiv Q\left(\boldsymbol{\alpha}^{be}, \mathbf{r}^{be}\right)$, of the quadratic form $Q(\boldsymbol{\alpha}, \mathbf{r})$ takes on the following expression, which is obtained by using Equation (115):
$$Q_{\min} = \chi^2 = \begin{pmatrix} \mathbf{r}^{be} - E_c(\mathbf{r}) \\ \boldsymbol{\alpha}^{be} - \boldsymbol{\alpha}^0 \end{pmatrix}^\top \begin{pmatrix} \mathbf{C}_c^{rr} & \mathbf{C}_c^{r\alpha} \\ \mathbf{C}_c^{\alpha r} & \mathbf{C}^{\alpha\alpha} \end{pmatrix}^{-1} \begin{pmatrix} \mathbf{r}^{be} - E_c(\mathbf{r}) \\ \boldsymbol{\alpha}^{be} - \boldsymbol{\alpha}^0 \end{pmatrix} + \left(\mathbf{r}^{be} - \mathbf{r}^e\right)^\top \left(\mathbf{C}_e^{rr}\right)^{-1} \left(\mathbf{r}^{be} - \mathbf{r}^e\right) = \left[E_c(\mathbf{r}) - \mathbf{r}^e\right]^\top \left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1} \left[E_c(\mathbf{r}) - \mathbf{r}^e\right]. \tag{125}$$
As the expression obtained in Equation (125) indicates, the quantity $Q_{\min}$ represents the square of the length of the vector $E_c(\mathbf{r}) - \mathbf{r}^e$, measuring (in the corresponding metric) the deviations between the experimental and nominally computed responses. The quantity $Q_{\min}$ can be evaluated directly from the given data (i.e., model parameters and computed and measured responses, together with their original uncertainties) after the matrix $\left(\mathbf{C}_e^{rr} + \mathbf{C}_c^{rr}\right)^{-1}$ has been computed. As the dimension of the vector $E_c(\mathbf{r}) - \mathbf{r}^e$ indicates, the number of degrees of freedom characteristic of the calibration under consideration is equal to the number $TR$ of experimental responses. It is important to note that $Q_{\min}$ is independent of calibrating (or adjusting) the original data. In the extreme case of the absence of experimental responses, no actual calibration takes place; an actual calibration (adjustment) occurs only when at least one experimental response is included.
The variate $Q_{\min}$ follows a $\chi^2$-distribution with $TR$ degrees of freedom, where $TR$ denotes the total number of experimental responses considered in the assimilation/calibration (adjustment) procedure. Thus, the quantity $Q_{\min}$ is the "$\chi^2$ of the calibration at hand" and can be used as an indicator of the agreement between the computed and measured responses (which depend on the model parameters). Recall that the $\chi^2$-distribution measures the deviation of a "true distribution" (in this case, the distribution of experimental responses) from the hypothetical one (in this case, a MaxEnt distribution constructed from the available information regarding mean values and correlations/covariances). For predictive modeling, it is important to assess whether: (i) the response and data measurements are free of gross errors (blunders such as wrong settings, mistaken readings, etc.); and (ii) the measurements are consistent with the assumptions regarding the respective means, variances, and covariances.
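A sketch of the consistency indicator of Equation (125) follows (assuming scipy is available; names hypothetical); a very small p-value would flag gross errors or misstated means/covariances before the calibration results are accepted:

```python
import numpy as np
from scipy.stats import chi2

def chi_square_indicator(E_r, r_e, C_c_rr, C_e_rr):
    """Q_min of Eq. (125) and its p-value under chi-square with TR dof."""
    d = E_r - r_e
    q_min = float(d @ np.linalg.solve(C_c_rr + C_e_rr, d))
    p_value = chi2.sf(q_min, df=len(d))       # P(chi2_TR > Q_min)
    return q_min, p_value
```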

3.5. Characteristics of the BERRU-PM-2+ Methodology: Summary

The end-results of the BERRU-PM-2+ methodology introduced in Section 3.4, above, are the expressions presented in Equations (118)–(125). An examination of these expressions highlights the fact that they can be utilized both for "forward/direct predictive modeling" and for "inverse predictive modeling". The "forward" or "direct" problem solves the "parameter-to-output" mapping, which describes the "cause-to-effect" relationship in the physical process being modeled. The "inverse problem" attempts to solve the "output-to-parameters" mapping. In particular, "measurement problems" are "inverse" to the direct problem, in that they seek to determine (from measurements) the properties of the host medium (e.g., composition, geometry, including internal interfaces), the properties of the source (e.g., strength, location, direction), and/or the size of the medium at its boundaries. Such inverse problems are encountered in fields as diverse as astrophysics (in which one measures the intensity and spectral distribution of light in order to infer the properties of stars), nuclear medicine (where radioisotopes are injected into patients and the emitted radiation is used in diagnostics to reconstruct body properties, e.g., tumors), non-destructive fault detection in materials, underground (oil, water) logging, and the detection of sensitive materials. Some authors further group such inverse problems into "invasive" ones, in which the interior particle distribution is accessible for measurements, as opposed to "non-invasive" ones, in which only particle distributions on the boundaries of (or exterior to) the medium can be measured.
Inverse problems are fundamentally ill-posed and/or ill-conditioned, being unstable to uncertainties in the model parameters and/or the experimental measurements. In particular, inverse problems involving differential operators are ill-posed because the differentiation operator is not continuous with respect to any physically meaningful observation topology. The existence of a solution for an inverse problem is in most cases secured by defining the data space to be the set of solutions to the direct problem. This approach may fail if the data are incomplete, perturbed, or noisy. If the uniqueness of a solution cannot be secured from the given data, additional data and/or a priori knowledge about the solution must be used to restrict the set of admissible solutions. Stability of the solution is the most difficult property to ensure and verify: if an inverse problem fails to be stable, then small round-off errors or noise in the data will be amplified to a degree that renders a computed solution useless. The procedures used to solve an ill-posed problem approximately are called "regularization" procedures and are customarily categorized as "explicit" or "implicit"; see, e.g., [47,48,49]. The historically older explicit methods attempt to manipulate the forward model equations in conjunction with measurements in order to estimate explicitly the unknown source and/or other unknown characteristics of the medium. On the other hand, implicit methods combine measurements with repeated solutions of the direct problem obtained with different values of the unknowns, iterating until an a priori selected functional, usually representing the user-defined "goodness of fit" between measurements and direct computations, is reduced to a value deemed "acceptable" by the user.
Since the framework of the BERRU-PM methodology (including its present extension, BERRU-PM-2+) comprises the combined phase-space of parameters and responses, it can be used for solving both forward/direct and inverse problems. The solution of forward/direct problems is provided by the expression for the predicted best-estimate response given in Equation (116), together with the corresponding reduced predicted uncertainties provided by Equation (120). The remaining expressions for the best-estimate predicted parameters and their reduced predicted uncertainties provided in Equations (117) and (122), along with the parameter-response correlations provided in Equation (123) and the "goodness-of-fit indicator" provided by Equation (125), are additional features which the BERRU-PM methodology provides. Conversely, the solution of "inverse problems" is provided by the expressions for the best-estimate predicted parameters and their reduced predicted uncertainties provided in Equations (117) and (122). The superior accuracy provided by the BERRU-PM methodology (by comparison to the traditional methods based on the minimization of chi-square/least-squares-type functionals for solving inverse problems) has been demonstrated in [16] for the inverse prediction, from detector responses in the presence of counting uncertainties, of the thickness of a homogeneous slab of material containing uniformly distributed gamma-emitting sources. It has been shown in [16], in particular, that for optically very thin slabs, both the traditional chi-square-minimization methods and the BERRU-PM methodology predict the slab's thickness accurately. For optically thick slabs, however, the traditional inverse-problem methods based on the minimization of chi-square-type functionals fail to predict the slab's thickness, whereas the BERRU-PM methodology correctly predicts the slab's actual physical thickness when precise experimental results are assimilated, while also predicting the physically correct response within the selected precision criterion.
As shown in Appendix A, the expressions of the moments of the joint distribution of computed responses and model parameters defined in Equations (70)–(72), in terms of the moments of the distribution of model parameters, explicitly include the third- and higher-order sensitivities of the responses with respect to the model parameters, as well as the systematic and random errors that would stem from incomplete modeling and numerical truncations. The expressions provided in Appendix A also indicate the path for explicitly incorporating higher-order sensitivities, if available. Since the BERRU-PM-2+ mathematical framework introduced in Section 3 of this work enables the use of higher-order sensitivities, it is expected that the BERRU-PM-2+ methodology will considerably improve the already unparalleled efficiency and accuracy of the extant BERRU-PM methodology.

4. Discussion, Conclusions and Outlook

Although they are not identical, the data-adjustment and data-assimilation methodologies share several fundamental features stemming from their common root in least-squares minimization procedures, so the expressions of the end-results produced by these methodologies look similar. Neither of these methodologies uses the complete first-order information, however. The data-adjustment methodology is incomplete at first order because it has no provisions for incorporating the first-order correlations between the model parameters and the computed responses, which are defined by Equation (71) and which can be readily computed, as shown in Appendix A. Even though the provision for incorporating the correlations defined in Equation (4) between measured responses and model parameters is extant, this provision has not been used in applications thus far because of the lack of actual data.
The applicability of the data-adjustment methodology is also restricted by the domain of applicability of the "generalized linear least-squares methodology" (GLLSM), which requires that: (i) the uncertainty of the calculated response (e.g., the neutron multiplication factor) be dominated by uncertainties in the data (e.g., cross sections, neutron multiplicities, fission spectra), with negligible systematic errors; and (ii) the data uncertainties be sufficiently small for approximating the response (e.g., the neutron multiplication factor) by a first-order Taylor-series expansion about the values of the given data. The analysis performed in [50,51] indicated that neither condition is fulfilled in general, since (i) the linear perturbation assumption underlying the GLLSM predictions breaks down for relative standard deviations (in the data) larger than ca. 1%; and (ii) systematic errors larger than ca. 1% also cause the breakdown of the GLLSM predictions, since such systematic errors are not explicitly considered within the GLLSM formalism. Consequently, for real application cases, which are always characterized by systematic computational errors, the work reported in [50,51] indicates that the true coverage probabilities of GLLSM confidence intervals for the computational bias may strongly differ from their expected confidence levels, particularly if a chi-square filter is used. The user-defined "cost functional" represented by Equation (1), which is minimized in the least-squares sense, cannot be extended in an obvious manner to incorporate higher-order correlations and sensitivities. A fundamentally different type of "cost functional" would need to be introduced for incorporating higher-order correlations and sensitivities, the form of which has not yet been produced in the literature.
The data-assimilation methodology is also incomplete at first order because it has no provisions for incorporating the first-order correlations between the model parameters and either the computed or the measured responses; specifically, neither the information represented by Equation (71) nor the information represented by Equation (97) can be incorporated in the user-defined cost functional underlying the extant data-assimilation methodologies. On the other hand, the currently most advanced 4D-VAR data-assimilation methodologies (including the extended/ensemble Kalman filter, the second-order filter, and variants thereof; see, e.g., references [11,12,13,14]) allow the incorporation of some partial second-order information through the so-called "Hessian-vector product," which appears as an additional term in the model dynamics/forecast step and in the data-assimilation step. Thus, although the most advanced data-assimilation methods can be considered "more inclusive," in the sense of incorporating some partial second-order information, than the most advanced data-adjustment methods, neither of these methodologies is complete regarding the incorporation of all extant first- and/or second-order information, and neither methodology provides a mechanism (even a theoretical one) for incorporating response sensitivities (to model parameters) higher than first order and/or correlations among responses and/or parameters higher than second order. Of course, the predictions of the data-adjustment or data-assimilation methodologies are also limited to first-order mean values and second-order covariances/correlations for parameters or responses.
The first-order (in response sensitivities) BERRU-PM methodology [16] is based on the MaxEnt principle rather than on least-squares minimization, thereby eliminating the user-introduced preference (bias) in the definition of the functional "describing the discrepancies between computations and measurements" which is to be minimized. Although the end-products of the BERRU-PM procedure look superficially similar to those produced by the data-adjustment and/or data-assimilation procedures, the BERRU-PM procedure differs fundamentally from them, both regarding the actual contents of its end-products and regarding the potential for further generalization to higher order, which is inherent to the BERRU-PM methodology [16] but is lacking within the frameworks of the other aforementioned methodologies.
Extending the BERRU-PM methodology [16], the BERRU-PM-2+ methodology presented in Section 3 of this work also employs the MaxEnt principle instead of a user-defined "cost functional to be minimized". Employing the MaxEnt principle not only eliminates the impact of the user-defined functional, which predetermines the outcome of the respective methodology's predictions, but also makes it possible to incorporate the complete first- and second-order information (sensitivities and correlations among parameters and responses, computed and measured) by considering the complete joint phase-space of parameters and of computed and measured responses. By setting its mathematical framework in this most inclusive phase-space, the BERRU-PM-2+ methodology can be used for solving both forward/direct problems and inverse problems, not only for applications related to energy systems but for applications in all fields that combine computational and experimental information to produce best-estimate predictions for results of interest (responses) while also calibrating/improving the models of the underlying physical systems. Furthermore, the novel MaxEnt-based BERRU-PM-2+ methodology introduced in Section 3 explicitly includes the third- and fourth-order response sensitivities to the model parameters, and clearly indicates the path for explicitly incorporating higher-order sensitivities, if available. The BERRU-PM-2+ expressions also comprise quantities representing the systematic and random errors that would stem from incomplete modeling and numerical truncations. It is therefore expected that the BERRU-PM-2+ methodology will further improve the already unparalleled efficiency and accuracy of the extant first-order BERRU-PM methodology.
The data-assimilation methodology is widely used in the atmospheric and geophysical sciences. On the other hand, predictive modeling applications to energy-related fields are very few, with the highly notable exception of applications in the nuclear energy field, where the original data-adjustment methodology was conceived. The perspective on the available predictive modeling methodologies presented in this work and, in particular, the introduction in this work of the BERRU-PM-2+ methodology, also aim to motivate a much wider use of predictive modeling in the energy sciences for obtaining best-estimate results with reduced uncertainties. In practice, predictive modeling methodologies should be an integral part of model validation methodologies to be applied prior to model qualification and licensing considerations.
Ongoing work aims at extending the BERRU-PM-2+ methodology to provide information about the joint parameters/responses posterior distribution’s third-order (skewness) and fourth-order (kurtosis) moments. The incorporation of third- and fourth-order information (i.e., high-order response sensitivities along with skewness and kurtosis of parameters) will greatly increase the range of validity of predicted results, as well as reduce the uncertainties in these predictions. In particular, the envisaged new fourth-order (BERRU-PM-4) predictive modeling methodology will significantly impact the activities of model calibration and model validation against experiments. Further work will aim at quantifying the impact of ignorance (lack of knowledge) caused by non-modeled/missed phenomena in computational models of physical/biological systems. It is envisaged to demonstrate the potential impact of this new, high-order, comprehensive predictive modeling methodology by applications to reduce uncertainties in modeling predictions for energy and related systems.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

This appendix presents the expressions, up to and including fourth-order in the standard deviations of parameters, of the moments of the computed responses in the combined phase-space of parameters and computed responses.
Recall from Equations (67) and (68) that the covariance, $\mathrm{cov}(\alpha_i, \alpha_j)$, of two parameters, $\alpha_i$ and $\alpha_j$, is formally defined as follows: $\mathrm{cov}(\alpha_i, \alpha_j) \equiv \int_{D_\alpha} \delta\alpha_i\, \delta\alpha_j\, p_\alpha(\boldsymbol{\alpha})\, d\boldsymbol{\alpha} \equiv \rho_{ij}\, \sigma_i \sigma_j$; $i, j = 1, \ldots, TP$.
The quantity $\rho_{ij}$ denotes the correlation between the parameters $\alpha_i$ and $\alpha_j$, while $\sigma_i$ and $\sigma_j$ denote the standard deviations of the parameters $\alpha_i$ and $\alpha_j$, respectively. The variance, $\mathrm{var}(\alpha_i)$, of a parameter $\alpha_i$ is formally defined as the following particular case of Equation (67): $\mathrm{var}(\alpha_i) \equiv \int_{D_\alpha} \left(\delta\alpha_i\right)^2 p_\alpha(\boldsymbol{\alpha})\, d\boldsymbol{\alpha} = \sigma_i^2$; $i = 1, \ldots, TP$.
The third-order correlation, $t_{ijk}$, of three parameters $(\alpha_i, \alpha_j, \alpha_k)$ is formally defined as follows:
$$t_{ijk}\, \sigma_i \sigma_j \sigma_k \equiv \int_{D_\alpha} \delta\alpha_i\, \delta\alpha_j\, \delta\alpha_k\, p_\alpha(\boldsymbol{\alpha})\, d\boldsymbol{\alpha}; \quad i, j, k = 1, \ldots, TP. \tag{A1}$$
The fourth-order parameter moment, $\mu_4(j_1 j_2 j_3 j_4) \equiv q_{j_1 j_2 j_3 j_4}\, \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4}$, and the associated fourth-order correlation, $q_{j_1 j_2 j_3 j_4}$, among four parameters are defined as follows:
$$q_{ijk\ell}\, \sigma_i \sigma_j \sigma_k \sigma_\ell \equiv \int_{D_\alpha} \delta\alpha_i\, \delta\alpha_j\, \delta\alpha_k\, \delta\alpha_\ell\, p_\alpha(\boldsymbol{\alpha})\, d\boldsymbol{\alpha}; \quad i, j, k, \ell = 1, \ldots, TP. \tag{A2}$$
The expression of the fourth-order Taylor-series expansion of the computed response, which was provided in Equation (69), is re-written below in order to highlight the contributions of the various orders in the parameter deviations $\delta\alpha_j$:
$$r_k^{(1)}(\boldsymbol{\alpha}) = r_k(\boldsymbol{\alpha}^0) + \sum_{j_1=1}^{TP} \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1}}\right]_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} + \varepsilon_k^{(1)}; \tag{A3}$$
$$r_k^{(2)}(\boldsymbol{\alpha}) = r_k^{(1)}(\boldsymbol{\alpha}) + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left[\frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}}\right]_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} \delta\alpha_{j_2} + \varepsilon_k^{(2)}; \tag{A4}$$
$$r_k^{(3)}(\boldsymbol{\alpha}) = r_k^{(2)}(\boldsymbol{\alpha}) + \frac{1}{3!} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left[\frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}}\right]_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} \delta\alpha_{j_2} \delta\alpha_{j_3} + \varepsilon_k^{(3)}; \tag{A5}$$
$$r_k^{(4)}(\boldsymbol{\alpha}) = r_k^{(3)}(\boldsymbol{\alpha}) + \frac{1}{4!} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \sum_{j_4=1}^{TP} \left[\frac{\partial^4 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3} \partial \alpha_{j_4}}\right]_{\boldsymbol{\alpha}^0} \delta\alpha_{j_1} \delta\alpha_{j_2} \delta\alpha_{j_3} \delta\alpha_{j_4} + \varepsilon_k^{(4)}. \tag{A6}$$

Appendix A.1. Fourth-Order Approximate Expressions for the Expected Value of the Computed Response

The approximations to the expected value, $E_c(r_k)$, of the computed response $r_k(\boldsymbol{\alpha})$ are obtained by formally integrating the expression provided in Equation (69) over the unknown distribution $p_\alpha(\boldsymbol{\alpha})$, which yields the following first- through fourth-order approximate expectations, $E_c^{(n)}(r_k)$, $n = 1, \ldots, 4$:
$$E_c^{(1)}(r_k) = E_c^{(0)}(r_k) \equiv r_k(\boldsymbol{\alpha}^0) + \hat{\varepsilon}_k^{(2)}; \tag{A7}$$
$$E_c^{(2)}(r_k) = E_c^{(1)}(r_k) + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left[\frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}}\right]_{\boldsymbol{\alpha}^0} \rho_{j_1 j_2}\, \sigma_{j_1} \sigma_{j_2} + \hat{\varepsilon}_k^{(3)}; \tag{A8}$$
$$E_c^{(3)}(r_k) = E_c^{(2)}(r_k) + \frac{1}{6} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left[\frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}}\right]_{\boldsymbol{\alpha}^0} t_{j_1 j_2 j_3}\, \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} + \hat{\varepsilon}_k^{(4)}; \tag{A9}$$
$$E_c^{(4)}(r_k) = E_c^{(3)}(r_k) + \frac{1}{4!} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \sum_{j_4=1}^{TP} \left[\frac{\partial^4 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3} \partial \alpha_{j_4}}\right]_{\boldsymbol{\alpha}^0} q_{j_1 j_2 j_3 j_4}\, \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} + \hat{\varepsilon}_k^{(5)}. \tag{A10}$$
As indicated in Equation (A8), the second-order sensitivities contribute the leading correction terms to the response's expected value, causing it to differ from the response's computed value, $r_k(\boldsymbol{\alpha}^0)$. The quantities $\hat{\varepsilon}_k^{(n)} = O(\sigma^n)$, $n = 2, 3, 4, 5$, which appear in Equations (A7)–(A10), represent the approximate mean values of the respective-order error term $\varepsilon_k$ that appeared in Equation (69). The systematic error in each term $\hat{\varepsilon}_k^{(n)}$, $n = 2, 3, 4, 5$, is of order $O(\sigma^n)$; the actual numerical value for each error term $\hat{\varepsilon}_k^{(n)}$ is assigned by the user.
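When the parameter distribution is specified only through its covariance matrix, the second-order correction in Equation (A8) is the trace form $\frac{1}{2}\mathrm{tr}\left(\mathbf{H}\,\mathbf{C}^{\alpha\alpha}\right)$, where $\mathbf{H}$ denotes the response Hessian with respect to the parameters; a one-line sketch (names hypothetical):

```python
import numpy as np

def second_order_mean(r0, H, C_aa):
    """Eq. (A8): E[r] ~ r(alpha0) + (1/2) * sum_ij H_ij cov(alpha_i, alpha_j),
    which equals r(alpha0) + 0.5 * trace(H @ C_aa) for symmetric H and C_aa."""
    return r0 + 0.5 * np.trace(H @ C_aa)
```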

Appendix A.2. Fourth-Order Approximate Expressions for the Correlations between Computed Responses and Model Parameters

The correlation between a model parameter and a computed response was defined in Equation (71). The approximations to the correlation between a model parameter $\alpha_i$ and a computed response $r_k(\boldsymbol{\alpha})$ are obtained by multiplying $\delta\alpha_i$ by the corresponding expression of $r_k^{(n)}(\boldsymbol{\alpha}) - E_c^{(n)}(r_k)$ from Equation (69), and then formally integrating the resulting expression over the unknown distribution $p_c(\boldsymbol{\alpha}, \mathbf{r})$. This sequence of mathematical operations yields the following expressions for the respective approximate correlations, $\mathrm{cov}\left(\alpha_i, r_k\right)^{(n)} \equiv \int_{D_c} \delta\alpha_i \left[r_k^{(n)}(\boldsymbol{\alpha}) - E_c^{(n)}(r_k)\right] p_c(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r}$, $n = 1, 2, 3$:
$$\mathrm{cov}\left(\alpha_i, r_k\right)^{(1)} = \sum_{j=1}^{TP} \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_j}\right]_{\boldsymbol{\alpha}^0} \mathrm{cov}\left(\alpha_i, \alpha_j\right) + \varepsilon_{ik}^{(3)}; \tag{A11}$$
$$\mathrm{cov}\left(\alpha_i, r_k\right)^{(2)} = \mathrm{cov}\left(\alpha_i, r_k\right)^{(1)} + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \left[\frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}}\right]_{\boldsymbol{\alpha}^0} t_{i j_1 j_2}\, \sigma_i \sigma_{j_1} \sigma_{j_2} + \varepsilon_{ik}^{(4)}; \tag{A12}$$
$$\mathrm{cov}\left(\alpha_i, r_k\right)^{(3)} = \mathrm{cov}\left(\alpha_i, r_k\right)^{(2)} + \frac{1}{6} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left[\frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}}\right]_{\boldsymbol{\alpha}^0} q_{i j_1 j_2 j_3}\, \sigma_i \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} + \varepsilon_{ik}^{(5)}. \tag{A13}$$
The error terms $\varepsilon_{ik}^{(n)} = O(\sigma^n)$, $n = 3, 4, 5$, which appear in Equations (A11)–(A13), respectively, contain random errors, as well as systematic errors of order $O(\sigma^n)$, $n = 3, 4, 5$; their respective values are assigned by the user.

Appendix A.3. Fourth-Order Approximate Expressions for the Correlations between Computed Responses

The covariance, $\mathrm{cov}(r_k, r_\ell) \equiv \int_{D_c}\left[r_k(\boldsymbol{\alpha}) - E_c(r_k)\right]\left[r_\ell(\boldsymbol{\alpha}) - E_c(r_\ell)\right] p_c(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r}$, between two computed responses $r_k$ and $r_\ell$ has been defined in Equation (72). It is shown below that the $n$th-order approximation of the covariance $\mathrm{cov}(r_k, r_\ell)$ can be defined either in a "consistent" way, in which case the respective approximation is denoted as $\mathrm{cov}(r_k, r_\ell)^{(n),\, con}$, or in an "inconsistent" way, in which case the respective approximation is denoted as $\mathrm{cov}(r_k, r_\ell)^{(n),\, inc}$, for $n = 1, 2, \ldots$, as follows:
$$\mathrm{cov}\left(r_k, r_\ell\right)^{(n),\, inc} \equiv \int_{D_c} \left[r_k^{(n)}(\boldsymbol{\alpha}) - E_c^{(n)}(r_k)\right]\left[r_\ell^{(n)}(\boldsymbol{\alpha}) - E_c^{(n)}(r_\ell)\right] p_c(\boldsymbol{\alpha}, \mathbf{r})\, d\boldsymbol{\alpha}\, d\mathbf{r} + O\left(\sigma^{2n}\right); \tag{A14}$$
$$\mathrm{cov}\left(r_k, r_\ell\right)^{(n),\, con} \equiv \mathrm{cov}\left(r_k, r_\ell\right)^{(n+1)} - O\left(\sigma^{2n+1}\right), \quad n = 1, 2, \ldots. \tag{A15}$$
The standard deviation, $SD\left(r_k^{(n)}\right)$, of $r_k^{(n)}(\boldsymbol{\alpha})$ is obtained by setting $\ell = k$ in either Equation (A14) or Equation (A15), respectively, and taking the square root of the resulting expression, to obtain:
$$SD\left(r_k\right)^{(n),\, inc} \equiv \sqrt{\mathrm{var}\left(r_k\right)^{(n),\, inc}} + O\left(\sigma^{n}\right), \quad n = 1, 2, \ldots; \tag{A16}$$
$$SD\left(r_k\right)^{(n),\, con} \equiv \sqrt{\mathrm{var}\left(r_k\right)^{(n),\, con}} + O\left(\sigma^{n + 1/2}\right), \quad n = 1, 2, \ldots, \tag{A17}$$
where $\mathrm{var}\left(r_k\right)^{(n)} \equiv \mathrm{cov}\left(r_k, r_k\right)^{(n)}$. Thus, as indicated on the right side of Equation (A16), the standard deviation $SD\left(r_k\right)^{(n),\, inc}$ corresponding to the inconsistent variance $\mathrm{var}\left(r_k\right)^{(n),\, inc}$ will not comprise all of the terms that contain the $n$th-order parameter standard deviations. In contradistinction to $SD\left(r_k\right)^{(n),\, inc}$, the consistent standard deviation, $SD\left(r_k\right)^{(n),\, con}$, which corresponds to the consistent variance $\mathrm{var}\left(r_k\right)^{(n),\, con}$, comprises all of the terms that contain the $n$th-order parameter standard deviations.
The first-order approximation of the covariance $\mathrm{cov}(r_k, r_\ell)$ has the following expression:
$$\mathrm{cov}\left(r_k, r_\ell\right)^{(1)} = \sum_{i=1}^{TP} \sum_{j=1}^{TP} \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_i}\right]_{\boldsymbol{\alpha}^0} \left[\frac{\partial r_\ell(\boldsymbol{\alpha})}{\partial \alpha_j}\right]_{\boldsymbol{\alpha}^0} \mathrm{cov}\left(\alpha_i, \alpha_j\right) + O\left(\sigma^3\right). \tag{A18}$$
Notably, the expression of $\mathrm{cov}(r_k, r_\ell)^{(1)}$ provided in Equation (A18) is consistent both in the highest order (in this case, first order) of the sensitivities and in the highest order of the parameter standard deviations (in this case, second order), since all of the terms involving the products $\sigma_i \sigma_j$ are consistently included (i.e., none are missing) in the expression provided. Therefore, the standard deviation of the computed response $r_k(\boldsymbol{\alpha})$ will also be correct to first order in the standard deviations of the parameters, i.e.,
$$SD\left(r_k\right)^{(1)} = \left\{\sum_{i=1}^{TP} \sum_{j=1}^{TP} \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_i}\right]_{\boldsymbol{\alpha}^0} \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_j}\right]_{\boldsymbol{\alpha}^0} \mathrm{cov}\left(\alpha_i, \alpha_j\right)\right\}^{1/2} + O\left(\sigma^{3/2}\right). \tag{A19}$$
The second-order approximation of the covariance $\mathrm{cov}(r_k, r_\ell)$ has the following expression:
$$
\begin{aligned}
\mathrm{cov}\left(r_k, r_\ell\right)^{(2),\, inc} = {} & \mathrm{cov}\left(r_k, r_\ell\right)^{(1)} + \frac{1}{2} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \left\{\left[\frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}}\right]\left[\frac{\partial r_\ell(\boldsymbol{\alpha})}{\partial \alpha_{j_3}}\right] + \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1}}\right]\left[\frac{\partial^2 r_\ell(\boldsymbol{\alpha})}{\partial \alpha_{j_2} \partial \alpha_{j_3}}\right]\right\}_{\boldsymbol{\alpha}^0} t_{j_1 j_2 j_3}\, \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \\
& + \frac{1}{4} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \sum_{j_4=1}^{TP} \left[\frac{\partial^2 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2}}\right]_{\boldsymbol{\alpha}^0} \left[\frac{\partial^2 r_\ell(\boldsymbol{\alpha})}{\partial \alpha_{j_3} \partial \alpha_{j_4}}\right]_{\boldsymbol{\alpha}^0} \left(q_{j_1 j_2 j_3 j_4} - \rho_{j_1 j_2}\, \rho_{j_3 j_4}\right) \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} + O\left(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4}\right).
\end{aligned} \tag{A20}
$$
Notably, the expression of $\mathrm{cov}(r_k, r_\ell)^{(2),\, inc}$ provided in Equation (A20) is consistent in the second order of the sensitivities but is inconsistent in the fourth order of the parameter standard deviations, since it has errors of order $O\left(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4}\right)$. The missing terms of order $O\left(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4}\right)$, which involve third-order response sensitivities to the parameters, are provided in the expression below for the consistent second-order covariance $\mathrm{cov}(r_k, r_\ell)^{(2),\, con}$:
$$\mathrm{cov}\left(r_k, r_\ell\right)^{(2),\, con} = \mathrm{cov}\left(r_k, r_\ell\right)^{(2),\, inc} + \frac{1}{6} \sum_{j_1=1}^{TP} \sum_{j_2=1}^{TP} \sum_{j_3=1}^{TP} \sum_{j_4=1}^{TP} \left\{\left[\frac{\partial^3 r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1} \partial \alpha_{j_2} \partial \alpha_{j_3}}\right]\left[\frac{\partial r_\ell(\boldsymbol{\alpha})}{\partial \alpha_{j_4}}\right] + \left[\frac{\partial r_k(\boldsymbol{\alpha})}{\partial \alpha_{j_1}}\right]\left[\frac{\partial^3 r_\ell(\boldsymbol{\alpha})}{\partial \alpha_{j_2} \partial \alpha_{j_3} \partial \alpha_{j_4}}\right]\right\}_{\boldsymbol{\alpha}^0} q_{j_1 j_2 j_3 j_4}\, \sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} + O\left(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} \sigma_{j_5}\right). \tag{A21}$$
Therefore, if the third-order sensitivities are available, the expression provided in Equation (A21) should be used, since it includes all of the fourth-order terms containing the products $\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4}$. Consequently, the consistent second-order standard deviation $SD\left(r_k\right)^{(2),\, con} \equiv \left[\mathrm{var}\left(r_k\right)^{(2),\, con}\right]^{1/2}$ will not have any second-order errors in the parameter standard deviations, i.e.,
$$SD\left(r_k\right)^{(2),\, con} \equiv \sqrt{\mathrm{var}\left(r_k\right)^{(2),\, con}} + O\left[\left(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4} \sigma_{j_5}\right)^{1/2}\right]. \tag{A22}$$
On the other hand, the second-order inconsistent standard deviation $SD\left(r_k\right)^{(2),\, inc} \equiv \left[\mathrm{var}\left(r_k\right)^{(2),\, inc}\right]^{1/2}$ will have second-order errors in the parameter standard deviations, i.e.,
$$SD\left(r_k\right)^{(2),\, inc} \equiv \sqrt{\mathrm{var}\left(r_k\right)^{(2),\, inc}} + O\left[\left(\sigma_{j_1} \sigma_{j_2} \sigma_{j_3} \sigma_{j_4}\right)^{1/2}\right]. \tag{A23}$$
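The practical difference between the inconsistent and consistent second-order variances of Equations (A20) and (A21) can be checked numerically for a single Gaussian-distributed parameter, for which $t = 0$ and $q = 3$. A toy Monte Carlo sketch with hypothetical sensitivity values $d_1, d_2, d_3$:

```python
import numpy as np

rng = np.random.default_rng(2)
d1, d2, d3, sigma = 1.0, 0.8, 0.6, 0.3        # hypothetical 1st/2nd/3rd sensitivities
delta = rng.normal(0.0, sigma, size=2_000_000)
r = d1 * delta + 0.5 * d2 * delta**2 + d3 * delta**3 / 6.0  # 3rd-order Taylor response

var_inc = d1**2 * sigma**2 + 0.5 * d2**2 * sigma**4   # Eq. (A20) for one Gaussian parameter
var_con = var_inc + d1 * d3 * sigma**4                # adds the 3rd-order term of Eq. (A21)
print(np.var(r), var_inc, var_con)  # the consistent value lies closer to the sample variance
```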

References

  1. Humi, I.; Wagschal, J.J.; Yeivin, Y. Multi-group constants from integral data. In Proceedings of the 3rd International Conference on the Peaceful Uses of Atomic Energy, Geneva, Switzerland, 31 August–9 September 1964; Volume 2, p. 398. [Google Scholar]
  2. Cecchini, G.; Farinelli, U.; Gandini, A.; Salvatores, A. Analysis of integral data for few-group parameter evaluation of fast reactors. In Proceedings of the 3rd International Conference on the Peaceful Uses of Atomic Energy, Geneva, Switzerland, 31 August–9 September 1964; Volume 2. [Google Scholar]
  3. Usachev, L.N. Perturbation theory for the breeding ratio and for other number ratios pertaining to various reactor processes. J. Nucl. Energy. Parts A/B React. Sci. Technol. 1964, 18, 571. [Google Scholar] [CrossRef]
  4. Rowlands, J. The production and performance of the adjusted cross-section set FGL5. In Proceedings of the International Symposium on Physics of Fast Reactors, Tokyo, Japan, 16–19 October 1973. [Google Scholar]
  5. Gandini, A.; Petilli, M. AMARA: A Code Using the Lagrange Multipliers Method for Nuclear Data Adjustment; CNEN-RI/FI(73)39; Comitato Nazionale Energia Nucleare: Casaccia/Rome, Italy, 1973. [Google Scholar]
  6. Kuroi, H.; Mitani, H. Adjustment to cross-section data to fit integral experiments by least squares method. J. Nucl. Sci. Technol. 1975, 12, 663. [Google Scholar] [CrossRef]
  7. Dragt, J.B.; Dekker, J.W.M.; Gruppelaar, H.; Janssen, A.J. Methods of adjustment and error evaluation of neutron capture cross sections. Nucl. Sci. Eng. 1977, 62, 11. [Google Scholar] [CrossRef]
  8. Weisbin, C.R.; Oblow, E.M.; Marable, J.H.; Peelle, R.W.; Lucius, J.L. Application of sensitivity and uncertainty methodology to fast reactor integral experiment analysis. Nucl. Sci. Eng. 1978, 66, 307. [Google Scholar] [CrossRef]
  9. Barhen, J.; Cacuci, D.G.; Wagschal, J.J.; Bjerke, M.A.; Mullins, C.B. Uncertainty analysis of time-dependent nonlinear systems: Theory and application to transient thermal hydraulics. Nucl. Sci. Eng. 1982, 81, 23–44. [Google Scholar] [CrossRef]
  10. Cacuci, D.G. Sensitivity theory for nonlinear systems: I. Nonlinear functional analysis approach. J. Math. Phys. 1981, 22, 2794–2812. [Google Scholar] [CrossRef]
  11. Lewis, J.M.; Lakshmivarahan, S.; Dhall, S.K. Dynamic Data Assimilation: A Least Square Approach; Cambridge University Press: Cambridge, UK, 2006. [Google Scholar]
  12. Lahoz, W.; Khattatov, B.; Ménard, R. (Eds.) Data Assimilation: Making Sense of Observations; Springer: Heidelberg, Germany, 2010. [Google Scholar]
  13. Faragó, I.; Havasi, Á.; Zlatev, Z. (Eds.) Advanced Numerical Methods for Complex Environmental Models: Needs and Availability; Bentham Science Publishers: Bussum, The Netherlands, 2013. [Google Scholar]
  14. Cacuci, D.G.; Navon, M.I.; Ionescu-Bujor, M. Computational Methods for Data Evaluation and Assimilation; Chapman & Hall/CRC: Boca Raton, FL, USA, 2014. [Google Scholar]
  15. Cacuci, D.G. Predictive modeling of coupled multi-physics systems: I. Theory. Ann. Nucl. Energy 2014, 70, 266–278. [Google Scholar] [CrossRef]
  16. Cacuci, D.G. BERRU Predictive Modeling: Best-Estimate Results with Reduced Uncertainties; Springer: Berlin, Germany, 2019. [Google Scholar]
  17. Jaynes, E.T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620–630. [Google Scholar] [CrossRef]
  18. Cacuci, D.G. Second-order adjoint sensitivity analysis methodology (2nd-ASAM) for large-scale nonlinear systems: I. Theory. Nucl. Sci. Eng. 2016, 184, 16–30. [Google Scholar] [CrossRef]
  19. Cacuci, D.G. The Second-Order Adjoint Sensitivity Analysis Methodology; CRC Press: Boca Raton, FL, USA, 2018. [Google Scholar]
  20. Cacuci, D.G. The nth-order comprehensive adjoint sensitivity analysis methodology for nonlinear systems (nth-CASAM-N): Mathematical framework. J. Nucl. Eng. 2022, 3, 10. [Google Scholar] [CrossRef]
  21. Cacuci, D.G. The nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology: Overcoming the Curse of Dimensionality. Volume I: Linear Systems; Springer Nature Switzerland: Cham, Switzerland, 2022. [Google Scholar]
  22. SCALE: A Modular Code System for Performing Standardized Computer Analyses for Licensing Evaluation; ORNL/TM 2005/39, Version 6; Oak Ridge National Laboratory: Oak Ridge, TN, USA, 2009.
  23. Venard, C.; Santamarina, A.; Leclainche, A.; Mournier, C. The R.I.B. Tool for the determination of computational bias and associated uncertainty in the CRISTAL criticality safety package. In Proceedings of the ANS Nuclear Criticality Safety Division Topical Meeting (NCSD 2009), Richland, WA, USA, 13–17 September 2009.
  24. Gandin, L.S. Objective Analysis of Meteorological Fields; Gridromet: Leningrad, Russia, 1963. (In Russian) [Google Scholar]
  25. Anthes, R.A. Data assimilation and initialization of hurricane prediction models. J. Atmos. Sci. 1974, 31, 702. [Google Scholar] [CrossRef]
  26. Sasaki, Y.K. A fundamental study of the numerical prediction based on the variational principle. J. Meteor. Soc. Japan 1955, 33, 262. [Google Scholar] [CrossRef] [Green Version]
  27. Sasaki, Y.K. An objective analysis based on the variational method. J. Meteor. Soc. Japan 1958, 36, 77. [Google Scholar] [CrossRef] [Green Version]
  28. Lorenc, A.C. Analysis-methods for numerical weather prediction. Q. J. R. Meteorol. Soc. 1986, 112, 1177. [Google Scholar] [CrossRef]
  29. Hall, M.C.G.; Cacuci, D.G.; Schlesinger, M.E. Sensitivity analysis of a radiative-convective model by the adjoint method. J. Atm. Sci. 1982, 39, 2038–2050. [Google Scholar] [CrossRef]
  30. Hall, M.C.G.; Cacuci, D.G. Physical interpretation of the adjoint functions for sensitivity analysis of atmospheric models. J. Atm. Sci. 1983, 40, 2537–2546. [Google Scholar] [CrossRef]
  31. Práger, T.; Kelemen, F.D. Adjoint methods and their application in earth sciences. In Advanced Numerical Methods for Complex Environmental Models: Needs and Availability; Faragó, I., Havasi, Á., Zlatev, Z., Eds.; Bentham Science Publishers: Bussum, The Netherlands, 2013; Chapter 4, Part A; pp. 203–275. [Google Scholar]
  32. Derber, J.C. The variational four-dimensional assimilation of analysis using filtered models as constraints. Ph.D. Thesis, University of Wisconsin–Madison, Madison, WI, USA, 1985. [Google Scholar]
  33. Derber, J.C. Variational four-dimensional analysis using the quasigeostrophic constraint. Mon. Weather. Rev. 1987, 115, 998. [Google Scholar] [CrossRef]
  34. Hoffmann, R.N. A four-dimensional analysis exactly satisfying equations of motion. Mon. Weather. Rev. 1986, 114, 388. [Google Scholar] [CrossRef]
  35. Navon, I.M.; de Villiers, R. Combined penalty multiplier optimization methods to enforce integral invariants conservation. Mon. Weather. Rev. 1983, 111, 1228. [Google Scholar] [CrossRef]
  36. Navon, I.M. A review of variational and optimization methods in meteorology. Dev. Geomath. 1986, 5, 29–34. [Google Scholar]
  37. Navon, I.M. Practical and theoretical aspects of adjoint parameter estimation and identifiability in meteorology and oceanography. Dyn. Atmos. Ocean. 1998, 27, 55–79. [Google Scholar] [CrossRef]
  38. Derber, J.C.; Parrish, D.F.; Lord, S.J. The new global operational analysis system at the National Meteorological Center. Weather. and Forecast 1991, 6, 538. [Google Scholar] [CrossRef]
  39. Mahfouf, J.F.; Buizza, R.; Errico, R.M. Strategy for including physical processes in the ECMWF variational data assimilation system. In Proceedings of the ECMWF workshop on non-linear aspects of data assimilation, Reading, UK, 9–11 September 1996. [Google Scholar]
  40. Rabier, F. Overview of global data assimilation developments in numerical weather-prediction centers. Q. J. R. Meteorol. Soc. 2005, 131, 3215. [Google Scholar] [CrossRef]
  41. Bellman, R.E. Dynamic Programming. Rand Corporation; Princeton University Press: Princeton, NJ, USA, 1957; ISBN 978-0-691-07951-6. [Google Scholar]
  42. Tukey, J.W. The Propagation of Errors, Fluctuations and Tolerances; Technical Reports No. 10–12; Princeton University: Princeton, NJ, USA, 1957. [Google Scholar]
  43. Shannon, C.E. A Mathematical theory of communication. Bell System Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef] [Green Version]
  44. Evangelou, E.; Zhu, Z.; Smith, R.L. Estimation and prediction for spatial generalized linear mixed models using high order Laplace approximation. J. Stat. Plan. Inference 2011, 141, 3564–3577. [Google Scholar] [CrossRef] [Green Version]
  45. Shun, Z.; McCullagh, P. Laplace approximation of high dimensional integrals. J. R. Stat. Soc. Ser. B 1995, 57, 749–760. [Google Scholar]
  46. Tang, Y.; Reid, N. Laplace and saddle point approximations in high dimensions. arXiv 2021, arXiv:2107.10885. [Google Scholar] [CrossRef]
  47. Tichonov, A.N. Regularization of non-linear ill-posed problems. Dokl. Akad. Nauk. 1963, 49. [Google Scholar]
  48. Tichonov, A.N. Solution of incorrectly formulated problems and the regularization method. Sov. Math. Dokl. 1963, 4, 1035. [Google Scholar] [CrossRef]
  49. Tarantola, A. Inverse Problem Theory and Methods for Model Parameter Estimation; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2005. [Google Scholar]
  50. Hoefer, A.; Buss, O.; Neuber, J.C. How confident can we be in confidence intervals for the computational bias obtained with the generalized linear least squares methodology?—A toy model analysis. In Proceedings of the International Conference on Nuclear Criticality, ICNC 2011, Edinburgh, Scotland, UK, 19–22 September 2011. [Google Scholar]
  51. Hoefer, A.; Buss, O.; Neuber, J.C. Limitations of the generalized linear least squares methodology for bias estimation in criticality safety analysis. In Proceedings of the IAEA International Workshop on Burnup Credit Criticality Calculation Methods and Applications, Beijing, China, 25–28 October 2011. Available online: https://www.researchgate.net/publication/276271639 (accessed on 5 December 2022).