1. Introduction
Quantifying uncertainty is a problem common to scientists dealing with mathematical models of physical processes. There is a wide range of numerical methods for solving systems of ordinary differential equations, such as the Euler method, higher-order Taylor methods, and Runge–Kutta methods. For systems of partial differential equations, numerical methods such as finite difference, finite element, finite volume, and spectral methods are used to produce solutions to physical models based on differential equations. If numerical methods are used, the resulting solution is an approximation to the true underlying solution and usually consists of a single solution set when the parameters and starting values are known and specified. However, many processes are subject to various sources of noise. Aleatoric uncertainty is an "external error" which may take the form of measurement error or some other error associated with generating the solution; the key feature of external (aleatoric) error is that it exists on the solution set. Epistemic uncertainty is model uncertainty or misspecification uncertainty which exists in the formulation of the model, for example as a result of not accounting for various components of a model. Epistemic uncertainty can be considered "internal error" as it exists before the solution to the differential equation is generated. For a very good overview of Bayesian model calibration, see [1], as it addresses many of the issues associated with quantifying uncertainty for computer models; for quantifying uncertainty for differential equation models, see [2]. For a good reference on techniques dealing with model inadequacy, see [3], and for work on model misspecification and model order, see [4].
These two different types of uncertainty are manifested in different ways. For example, external error manifests as unstructured random noise about the solution set. In contrast, internal error is propagated through the differential equation solver and may manifest itself as structured changes in the solution set, a phenomenon called uncertainty propagation [5]. To highlight this issue, consider the simple exponential growth equation with no uncertainty introduced:
$$ \frac{du}{dt} = \theta\, u(t). \qquad (1) $$
This has the well-known solution, with initial condition $u(0) = u_0$,
$$ u(t) = u_0 e^{\theta t}. $$
To account for external error, the solution to Equation (1) would become
$$ y(t) = u(t) + \epsilon(t), $$
where $\epsilon(t)$ follows some appropriate probability distribution. However, if we have internal error, then Equation (1) would become
$$ \frac{du}{dt} = \theta\, u(t) + \delta(t), $$
where $\delta(t)$ follows some appropriate probability distribution. Let $u(t)$ be the solution to Equation (1); if an external error is then added to the model, we have $y(t) = u(t) + \epsilon(t)$, where $\epsilon(t)$ follows some appropriate probability distribution. One can see the effects of both types of errors in Figure 1.
To construct the plots in Figure 1, 10,000 simulations were used to ensure that the error structure was correctly represented. Panel (a) of Figure 1 shows the trajectory for the exponential growth model containing no error; this subfigure shows exactly what one would expect. Panel (b) shows the exponential growth model with external error only. Notice that the error bands are virtually equidistant from the mean line. Panel (c) shows the exponential growth model with internal error only. In contrast to the external-error-only model, the error bands increase in width as time increases. Panel (d) shows the model with both internal and external error. Notice that the internal plus external error is not simply the addition of the two errors, as one might think. This phenomenon is our motivation for this research.
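To make the distinction concrete, the following sketch reproduces the style of comparison shown in Figure 1 for the exponential growth model. It is illustrative only: the parameter values, noise levels, and the Euler discretization used for the internal-error case are assumptions, not the settings used to generate the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, u0 = 0.5, 1.0          # illustrative growth rate and initial condition
sigma_ext, sigma_int = 0.1, 0.1
t = np.linspace(0.0, 5.0, 201)
dt = t[1] - t[0]
n_sim = 10_000

# (i) external error: noise sits on top of the deterministic solution
u_det = u0 * np.exp(theta * t)
y_ext = u_det + rng.normal(0.0, sigma_ext, size=(n_sim, t.size))

# (ii) internal error: noise enters the right-hand side of the ODE and is
# propagated through a simple Euler discretization of the dynamics
u_int = np.empty((n_sim, t.size))
u_int[:, 0] = u0
for k in range(1, t.size):
    delta = rng.normal(0.0, sigma_int, size=n_sim)
    u_int[:, k] = u_int[:, k - 1] + dt * (theta * u_int[:, k - 1] + delta)

# pointwise 95% bands: the external bands stay roughly equidistant from the
# mean curve, while the internal bands widen as time increases
ext_band = np.percentile(y_ext, [2.5, 97.5], axis=0)
int_band = np.percentile(u_int, [2.5, 97.5], axis=0)
```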
The question we focus on here is whether a model can be purposely misspecified in order to improve computation time while still accounting for the uncertainty induced by this misspecification. Model misspecification is an important question, especially when using deterministic models, as they are typically assumed to be correct [6]. To approach this problem, a fully Bayesian framework will be used [7,8], as it allows for a natural way to quantify uncertainty from different sources.
Notation
We have two major steps producing attributes, e.g., forcing, displacement, velocity, and acceleration. All attributes in the training step will be denoted by a superscript, while attributes in the validation step will not carry any superscript. In each of the major steps we have four different sets of attributes:
The true unobserved attributes u, v, a produced by the system, i.e., the solution of the non-oracle differential equation using the true parameters, the forcing f, and no internal or external errors. This is our target.
The observed attributes produced by the system, i.e., the solution of the non-oracle differential equation using the true parameters and the true forcing f, with the true external and internal errors added. These, together with the forcing, are our data.
The fitted/predicted attributes, defined as the solution of either the oracle or non-oracle differential equation using estimated parameters, with predicted external and internal errors added.
Attributes that are the solution of the oracle or non-oracle differential equation using estimated parameters, with predicted internal and no external errors added. These will be used for estimation of the maximum.
2. Models
Suppose that the dynamics of a given system can be modeled by an abstract initial-value problem of order n, with appropriate boundary conditions, in which the nth derivative of u with respect to the time variable t is determined by a function of the state and its lower-order derivatives. Furthermore, we suppose that this function can be separated into lower- and higher-order dynamics, G and H, respectively, for some ℓ smaller than or equal to n, and we assume that the higher-order term H has smaller effects than the lower-order term for some values of the input parameters of the model. Due to the added complexity and computational cost associated with the solution of the higher-order dynamics model, one may want to introduce a misspecified model that retains only the lower-order dynamics G and replaces the higher-order dynamics H by a random process governed by some probability distribution with mean 0 and finite variance. This model ignores the additional complexity of the full model but accounts for the associated uncertainty through this random process. In the following, we consider a non-linear single degree-of-freedom oscillator as the full model, which we will hereafter refer to as the oracle model, and a linear single degree-of-freedom oscillator as the misspecified model.
2.1. Non-Linear Single Degree-of-Freedom Oscillator
The displacement $u(t)$ around the equilibrium position of a non-linear single degree-of-freedom oscillator is governed by:
$$ m\,\ddot{u}(t) + c\,\dot{u}(t) + k_1\, u(t) + k_3\, u(t)^3 = f(t), $$
where $\dot{u}$ denotes the time-derivative of u, $k_1$ is the linear component of the spring stiffness coefficient, $k_3$ is the higher-order component of the spring stiffness, m is the mass of the object attached to the spring, and $c$ is the damping coefficient. Moreover, the system is subjected to the external forcing $f(t)$, an initial displacement $u(0)$, and an initial velocity $\dot{u}(0)$.
A single degree-of-freedom oscillator is a popular model for describing spring behavior. In an experimental setting, one can imagine that the spring is subjected to a sinusoidal forcing term by using a cam to apply a periodic and continuous force to the spring. From a mathematical point of view, any sufficiently smooth function can be represented as a Fourier series, in which case the forcing term is given as a linear combination of trigonometric functions. In this work, the external forcing used in the generation of data for parameter identification will be chosen in the form:
$$ f(t) = A \sin(\omega t), $$
where $A$ is the amplitude and $\omega$ is an input parameter controlling the angular frequency of the forcing term.
We also suppose that the external forcing $f$ is subjected to unforeseen errors. These errors, collectively denoted by $\delta(t)$, enter the system such that it is modeled by:
$$ m\,\ddot{u}(t) + c\,\dot{u}(t) + k_1\, u(t) + k_3\, u(t)^3 = f(t) + \delta(t). $$
In the following, we assume that the above system is too complex to solve and has to be simplified in order to be tractable. Another point of view is that one does not necessarily know how to model all the physical phenomena occurring in the system.
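As a concrete illustration of the oracle dynamics, the sketch below integrates a non-linear single degree-of-freedom oscillator of this form under a sinusoidal forcing contaminated by a fixed realization of Gaussian internal errors. The parameter names (m, c, k1, k3) follow the description above, but all numerical values, the error construction, and the use of SciPy's solve_ivp are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values (not from the paper)
m, c, k1, k3 = 1.0, 0.2, 1.0, 0.5
A, omega = 5.0, 1.0            # forcing amplitude and angular frequency
sigma_delta = 0.5              # internal (forcing) error standard deviation
u0, v0 = 0.0, 0.0              # initial displacement and velocity

t_grid = np.linspace(0.0, 20.0, 501)
rng = np.random.default_rng(1)

# One fixed realization of the internal errors on the time grid, interpolated
# so the ODE right-hand side is defined for all t
delta = rng.normal(0.0, sigma_delta, size=t_grid.size)
def forcing(t):
    return A * np.sin(omega * t) + np.interp(t, t_grid, delta)

def oracle_rhs(t, y):
    u, v = y
    a = (forcing(t) - c * v - k1 * u - k3 * u**3) / m
    return [v, a]

sol = solve_ivp(oracle_rhs, (t_grid[0], t_grid[-1]), [u0, v0],
                t_eval=t_grid, max_step=0.05)
u, v = sol.y                   # displacement and velocity trajectories
a = np.array([oracle_rhs(ti, yi)[1] for ti, yi in zip(t_grid, sol.y.T)])
```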
2.2. Linear Single Degree-of-Freedom Oscillator
The model given in (10) can be decomposed into linear and non-linear components. Using the notation of (6), we have:
$$ G(u) = m\,\ddot{u}(t) + c\,\dot{u}(t) + k_1\, u(t) \qquad \text{and} \qquad H(u) = k_3\, u(t)^3. $$
Considering a model that retains the linear component, G, as defined in (11), and omitting the non-linear component, H, we arrive at the misspecified model:
$$ m\,\ddot{u}(t) + c\,\dot{u}(t) + k_1\, u(t) = f(t) + \tilde{\delta}(t). $$
Note that the variable $\tilde{\delta}(t)$ in (13) accounts for both the model misspecification and the errors in the forcing. Since the forcing error is governed by a probability distribution, $\tilde{\delta}(t)$ will also be governed by some probability distribution with some mean and variance. In fact, the proposed formulation does not allow one to separate the uncertainty due to the non-linear component from possible errors when observing the forcing term. In other words, $\tilde{\delta}(t)$ can be viewed as a modeling error, "internal" to the system, in contrast to the traditional approach in which an "external" error is used to explain the error associated with the system solution. Finally, notice that if we fit Equations (10) and (13) to data, the estimated values of the model parameters will differ depending on whether the cubic term is included or not.
In the following, the model defined in (10) will be referred to as the Oracle model and the misspecified model defined in (13) as the non-Oracle model.
Figure 2 shows an example of sampled data for f versus u using the forcing (9). As expected, the data show a cubic relationship between displacement and forcing. The least-squares estimates of the Oracle and non-Oracle models are overlaid on the data and show that the Oracle model fits the data extremely well, while the non-Oracle model provides a reasonable approximation. From a computational standpoint, the non-Oracle model is very appealing since its solution is far less computationally demanding than that of the Oracle model.
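A minimal sketch of the kind of least-squares comparison shown in Figure 2 is given below. The exact regression used in the paper is not specified here, so the sketch assumes an ordinary least-squares fit of the forcing on the displacement: linear plus cubic terms for the Oracle-style fit and a linear term only for the non-Oracle-style fit, with synthetic data in place of the paper's samples.

```python
import numpy as np

# Synthetic (u, f) pairs with a cubic relationship, for illustration only
rng = np.random.default_rng(2)
k1, k3 = 1.0, 0.5
u = rng.uniform(-2.0, 2.0, size=200)
f = k1 * u + k3 * u**3 + rng.normal(0.0, 0.1, size=u.size)

# Oracle-style fit: f regressed on u and u^3
X_oracle = np.column_stack([u, u**3])
beta_oracle, *_ = np.linalg.lstsq(X_oracle, f, rcond=None)

# non-Oracle-style fit: f regressed on u only
X_linear = u[:, None]
beta_linear, *_ = np.linalg.lstsq(X_linear, f, rcond=None)

print(beta_oracle)   # approximately recovers [k1, k3]
print(beta_linear)   # a single compromise slope
```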
Upon applying a force to the system, we suppose that we can track the state of the system at a discrete set of times, which results in the forcing vector, the internal errors, and the solution vectors for displacement, velocity, and acceleration. These unobserved quantities will be considered as the target of our estimation procedure. The solution vectors are actually observed contaminated with external errors, and the observed solution vectors for displacement, velocity, and acceleration will be referred to as the data in our estimation procedure.
Figure 3 shows an example forcing (9) (Panel a) and the resulting solutions for displacement (Panel b), velocity (Panel c), and acceleration (Panel d). Notice that the solutions for displacement are quite close in behavior, with the non-Oracle model being somewhat smoother than the Oracle model. For velocity (Panel c), we observe a marked difference between the Oracle and non-Oracle solutions. Specifically, the non-Oracle solutions do not fit well at the extremes and lack some of the dynamics exhibited by the Oracle model. The acceleration trajectories (Panel d) show an even stronger difference between the Oracle and non-Oracle models. Again, the non-Oracle model performs poorly at the extremes and lacks much of the dynamic behavior of the Oracle model. One question of interest is whether the uncertainty associated with the non-Oracle model can be correctly quantified so that the model can be reliably used to create prediction intervals with well-calibrated coverage probabilities.
4. Simulation Study
Of particular interest is model validation. Specifically, under what training dataset conditions are the models valid? Furthermore, under what conditions is the non-Oracle model competitive with the Oracle model? Here, model validity is defined as the predictive performance of the models for predicting both future values and maximum values. As this is a predictive study, a training dataset obtained using a training forcing is used to obtain samples from the posterior distribution of the model parameters. Additionally, a separate and independent validation forcing is applied to obtain validation solutions. To validate the models, the validation trajectories are compared with their posterior predictive distribution. To make the situation more realistic, we accommodate possible internal errors in the system by using, instead of the true forcing, a forcing contaminated by a fixed realization of internal errors, where the errors are independent centered Gaussian random variables with a common standard deviation. The training datasets are generated using Equation (9) with the amplitude at levels of 1 (low), 5 (medium), and 10 (high), and the forcing error standard deviation at low, medium, and high levels. The same model parameter values are used for all simulations. Since the interest is in estimating both future values and the maximum value, the predictive distribution of the trajectories for all N time steps ahead is considered. Note that this is different from the traditional one-, two-, or ten-step-ahead approach.
The validation forcings are generated using two regimes: Sinusoidal and Erratic. The Sinusoidal regime uses Equation (9) with centered Gaussian internal errors. Panel (a) of Figure 4 shows a representative realization of the Sinusoidal regime when the training dataset is generated using the same parameters. Furthermore, Figure 4 shows the validation trajectories for displacement (Panel b), velocity (Panel c), and acceleration (Panel d), all with their 95% posterior predictive intervals. Notice that the posterior predictive intervals appear to have high coverage probabilities for displacement, velocity, and acceleration. This gives evidence that using a sinusoidal training dataset yields a valid model for predicting the same system.
To understand how the sinusoidal training regime performs on a process that is dramatically different, the Erratic regime is employed. The Erratic regime validation data-generating process is a realization of a sum of two stochastic processes. The first is a Gaussian moving average process with a small variance, serving as background noise. The second process is a smoothed version of a marked Poisson process and models large shocks: the times of the large shocks are given by a Poisson process, with the amplitudes modeled by Gaussian random variables with a relatively large variance. Panel (a) of Figure 5 shows a representative realization. Notice that the process is very different from the sinusoidal process. However, by considering Panels (b), (c), and (d), one can see that the posterior predictive intervals appear to capture the process quite well.
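The construction of the Erratic validation forcing can be sketched as follows. The paper specifies only a Gaussian moving-average background with small variance plus a smoothed marked Poisson shock process with large amplitude variance, so the window length, Poisson rate, variances, and smoothing kernel below are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 20.0, 1001)

# Background: Gaussian moving-average process with a small variance
window = 20                                     # smoothing window length (assumed)
white = rng.normal(0.0, 0.1, size=t.size + window - 1)
background = np.convolve(white, np.ones(window) / window, mode="valid")

# Shocks: marked Poisson process smoothed with a Gaussian bump at each shock time
rate = 0.3                                      # shocks per unit time (assumed)
n_shocks = rng.poisson(rate * (t[-1] - t[0]))
shock_times = rng.uniform(t[0], t[-1], size=n_shocks)
shock_sizes = rng.normal(0.0, 5.0, size=n_shocks)   # large amplitude variance
width = 0.2                                     # smoothing width of each shock (assumed)
shocks = sum(s * np.exp(-0.5 * ((t - t0) / width) ** 2)
             for s, t0 in zip(shock_sizes, shock_times))

erratic_forcing = background + shocks
```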
A simulation study is conducted by varying the training amplitude and assessing how well the posterior predictive intervals capture the true values. The training datasets were created using the sinusoidal regime with the amplitude at levels 1, 5, and 10. These levels are chosen so that scenarios with a small, an accurate, and a large training amplitude are considered. The internal error standard deviation is also varied over three levels corresponding to a small, an accurate, and a large training noise. For each amplitude and noise combination, 100 datasets were generated, and MCMC samples from the posterior predictive distribution were obtained for each dataset for both the Oracle and non-Oracle models. For each amplitude, noise, and Oracle/non-Oracle combination, two validation forcing datasets were generated, one using sinusoidal forcing and the other using erratic forcing. The simulation study was replicated 100 times. A Metropolis–Hastings step embedded in a Gibbs sampler was implemented in MATLAB to obtain samples from the posterior predictive distribution. A total of 30,000 samples were taken from the posterior distribution, with the first 20,000 samples discarded as burn-in and the remaining 10,000 samples thinned by 10, resulting in a set of 1000 samples from which all inferences are made. Sampler diagnostics, such as traceplots and within-chain autocorrelation, were examined to assess convergence and mixing and to determine the thinning rate. Example traceplots and autocorrelation plots for the Oracle model can be found in Figure A1 and Figure A2, respectively. Notice that the chains appear to have converged and that thinning to every 10th sample keeps the autocorrelation between samples below 0.25. As one may be concerned about the sensitivity of the posterior distribution to this modeling choice, a small sensitivity study is presented in Figure A3, which shows kernel density estimates of the posterior distribution under several values of that choice. Notice that the posterior distributions in all cases essentially agree with each other; hence, this choice should not have any large influence on the inferences.
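The post-processing of the chains described above amounts to discarding a burn-in period and thinning the remaining draws. A minimal sketch is given below, assuming the raw draws are stored row-wise in an array; the sampler itself (a Metropolis–Hastings-within-Gibbs scheme implemented in MATLAB in the paper) is not reproduced here.

```python
import numpy as np

def post_process(chain, n_burn=20_000, thin=10):
    """Discard burn-in draws and thin the remainder.

    chain : array of shape (n_draws, n_params); 30,000 draws in the study.
    Returns an array of shape (1000, n_params) with the defaults above.
    """
    return chain[n_burn::thin]

# Example with a placeholder chain of 30,000 draws of 4 parameters
raw_chain = np.zeros((30_000, 4))
kept = post_process(raw_chain)
assert kept.shape[0] == 1000
```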
Using the 2.5% and 97.5% quantiles of the MCMC samples from the posterior predictive distribution, a 95% predictive interval is created for the displacement, velocity, and acceleration at each time point using each contaminated validation forcing. Let Z be the solution attribute of interest (displacement u, velocity v, or acceleration a), let $Z_i(t)$ be the true value of the attribute at time t for simulation replicate i, and let $PI_i(t)$ be the 95% posterior predictive interval for the attribute at time t for simulation replicate i. The posterior predictive coverage rates for attribute Z were calculated using
$$ C(Z) = \frac{1}{100\,N}\sum_{i=1}^{100}\sum_{t=1}^{N} \mathbf{1}\{Z_i(t)\in PI_i(t)\}, $$
where $\mathbf{1}\{\cdot\}$ is an indicator function taking the value 1 if $Z_i(t)$ falls in $PI_i(t)$ and 0 otherwise.
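A sketch of this coverage computation is given below. The array names and layout (replicates by posterior draws by time points) are assumptions; the interval bounds are taken as the 2.5% and 97.5% quantiles of the posterior predictive draws, as described above.

```python
import numpy as np

def coverage_rate(true_vals, pred_draws):
    """Average 95% posterior predictive coverage for one attribute.

    true_vals  : (n_rep, n_time) true trajectories, one row per replicate
    pred_draws : (n_rep, n_draws, n_time) posterior predictive samples
    """
    lower = np.percentile(pred_draws, 2.5, axis=1)    # (n_rep, n_time)
    upper = np.percentile(pred_draws, 97.5, axis=1)   # (n_rep, n_time)
    inside = (true_vals >= lower) & (true_vals <= upper)
    return inside.mean()   # average over replicates and time points
```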
Table 2 shows the results from the simulation study for the average coverage probability of the 95% posterior predictive intervals for displacement, velocity, and acceleration for both the Oracle and non-Oracle models across both validation regimes. Notice that for scenarios in which the training forcing error standard deviation is 0.5 or higher, the coverage probabilities are quite good for all attributes, across all training amplitudes, for both the Oracle and non-Oracle models, using both Sinusoidal and Erratic validation forcing. Additionally, notice that the Oracle model performs reasonably well when the forcing error standard deviation is 0.05 and the training amplitude is 5 or 10. Furthermore, when the training amplitude is higher, the coverage probabilities improve in all cases. This suggests that, provided the typical range of the training forcing amplitude plus the internal error is larger than the typical forcing that would be applied in practice, the non-Oracle model is valid for predicting the attributes. In cases where the typical forcing that would be applied in practice has low noise, the Oracle model should be preferred.
Since the coverage probabilities are quite good when the training set exhibits more extreme amplitudes than the validation data, one would also like to determine the difference in the widths of the prediction intervals. Again, let Z be the solution attribute of interest (u, v, or a), let $Z_i(t)$ be the true value of the attribute at time t for simulation replicate i, and let $PI_i(t) = [L_i(t), U_i(t)]$ be the 95% posterior predictive interval for the attribute at time t for simulation replicate i. The average posterior predictive interval widths, using an L1 norm, for attribute Z were calculated using
$$ W(Z) = \frac{1}{100\,N}\sum_{i=1}^{100}\sum_{t=1}^{N} \bigl( U_i(t) - L_i(t) \bigr). $$
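The corresponding average interval width can be computed from the same posterior predictive quantiles; again, the array layout is an assumption for illustration.

```python
import numpy as np

def average_interval_width(pred_draws):
    """Average width of the 95% posterior predictive intervals.

    pred_draws : (n_rep, n_draws, n_time) posterior predictive samples
    """
    lower = np.percentile(pred_draws, 2.5, axis=1)
    upper = np.percentile(pred_draws, 97.5, axis=1)
    return np.mean(upper - lower)   # mean width over replicates and time points
```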
Table 3 shows the average posterior predictive interval widths for u, v, and a for both the Oracle and non-Oracle models, using a training system with a fixed amplitude and internal error standard deviation, validated using both the Erratic and Sinusoidal systems. Notice that the average interval widths are systematically higher for the non-Oracle model than for the Oracle model. This is to be expected since the non-Oracle model will always have a larger estimated internal error, which is then propagated through the system. Furthermore, notice that the interval widths for the non-Oracle model are about twice those of the Oracle model. Hence, the trade-off of using the non-Oracle model is much wider predictive intervals.
The results from the simulation study above show that both the Oracle and non-Oracle models perform well when estimating the true value of the system. Since mechanical systems often fail at the extremes, engineers are quite interested in predicting the extreme values of the system across time. To study this, a simulation study was conducted using the same simulation experimental design, MCMC sampling scheme, and number of replications as above, with the maximum value of each attribute as the quantity of interest. As above, let Z denote the attribute of interest, typically displacement, velocity, or acceleration. To obtain a posterior predictive interval for the maximum of Z, the following approach is used. For the mth draw from the posterior distribution, the corresponding solution trajectory for Z is created without external error. The maximum of this solution across the time points t = 1, ..., N is obtained, and the external error, with variance given by the corresponding mth posterior sample, is added to obtain a draw of the maximum. The 2.5% and 97.5% quantiles of these draws are used to create a 95% posterior predictive interval for the maximum M. Let $M_i$ be the true maximum value of the system under the validation forcing for simulation replicate i, and let $PI_{M,i}$ be the corresponding 95% posterior predictive interval of the maximum value. The posterior predictive coverage rates for the maximum were calculated using
$$ C(M) = \frac{1}{100}\sum_{i=1}^{100} \mathbf{1}\{ M_i \in PI_{M,i} \}, $$
where $\mathbf{1}\{\cdot\}$ is an indicator function taking the value 1 if $M_i$ falls in $PI_{M,i}$ and 0 otherwise.
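The construction of the posterior predictive interval for the maximum can be sketched as follows. The helper solve_trajectory and the dictionary layout of each posterior draw are placeholders for the paper's MATLAB implementation: the sketch only mirrors the procedure described above (solve without external error, take the maximum over time, add external error from the same draw, then take quantiles).

```python
import numpy as np

def max_predictive_interval(posterior_draws, solve_trajectory, rng):
    """95% posterior predictive interval for the maximum of an attribute.

    posterior_draws : list of dicts, each holding model parameters 'params'
                      and an external-error standard deviation 'sigma_ext'
    solve_trajectory: callable returning the noise-free trajectory for a draw
    """
    maxima = []
    for draw in posterior_draws:
        z = solve_trajectory(draw["params"])               # trajectory without external error
        m = np.max(z)                                      # maximum over the time points
        m_noisy = m + rng.normal(0.0, draw["sigma_ext"])   # add that draw's external error
        maxima.append(m_noisy)
    return np.percentile(maxima, [2.5, 97.5])

def max_coverage(true_maxima, intervals):
    """Fraction of replicates whose true maximum falls inside its interval."""
    hits = [(lo <= m <= hi) for m, (lo, hi) in zip(true_maxima, intervals)]
    return np.mean(hits)
```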
Table 4 shows the coverage probabilities when using the posterior predictive distribution to predict the maximum value of each attribute for both the Oracle and non-Oracle models across the various Sinusoidal training amplitudes and noise levels for both the Sinusoidal and Erratic validation regimes. Notice that the results are much different when attempting to predict the maximum value. Both the Oracle and non-Oracle methods work well when the validation forcing has a similar amplitude and error to the training amplitude and error: in that case, the coverage probabilities for the maximum value are high across both the Sinusoidal and Erratic validation regimes and across all attributes. When the training and validation conditions differ moderately, the Oracle model appears to work well but the non-Oracle model does not; this is consistent across all attributes and both validation regimes. When the validation error is large relative to the training error, and when the training and validation amplitudes differ, both the Oracle and non-Oracle models perform poorly across both validation regimes.