1. Introduction
Computer models are employed to simulate or emulate complex physical systems whose direct experimentation is costly. With the rapid advancement of computer technology, computer models have become increasingly appealing and are applied in various domains because of their relatively low cost and time efficiency. Despite these advantages, computer models have certain drawbacks, one being the uncertainty stemming from unknown parameters, referred to as “calibration parameters”, which are intricately linked to the inherent properties of the physical systems. Calibration aims to identify these parameters to achieve precise predictions of the physical system.
The concept of calibration was first proposed in [1], where the authors developed a Bayesian calibration framework. Since then, further Bayesian calibration methods have emerged, such as those in [2,3,4,5]. These Bayesian calibration procedures have been applied successfully in practice, but they lack rigorous theoretical guarantees. Frequentist calibration fills this theoretical gap.
Recently, Tuo and Wu proved the inconsistency of K-O (Kennedy and O’Hagan) calibration based on rigorous derivations [6], and they developed an $L_2$ calibration framework by minimizing the $L_2$-norm of the discrepancy function between the computer model and the true process (physical system) [7]. Moreover, Tuo and Wu derived the consistency and asymptotic normality of the proposed estimator and proved its semiparametric efficiency. Subsequently, Wong et al. investigated the theoretical properties of the ordinary least squares estimator of the calibration parameter and estimated the discrepancy function between the computer model and the true process via smoothing spline ANOVA [8]. In recent years, several new calibration methods have appeared. Tuo proposed a projected kernel calibration, which has a natural Bayesian version and can construct credible intervals for the proposed estimator without large-sample approximation [9]. To address the issue of local convergence, Wang proposed a penalized projected kernel calibration that achieves both semiparametric efficiency and global convergence of the proposed estimator [10].
While existing frequentist calibration methods assume continuous output for convenience, practical scenarios often involve discrete or other types of output, such as binary output in biology and count output in medicine. Sung employed a kernel logistic regression-based calibration procedure for binary output and applied it to cell adhesion studies [11]. For count output, Sun and Fang adopted a penalized maximum likelihood method to construct a calibration procedure that also enjoys asymptotic normality and semiparametric efficiency [12].
The methods above perform constant calibration: they assume that the calibration parameters are constant and independent of the input variables. In reality, the calibration parameter often varies with the input variables, that is, it is a function of some of them. Tuo et al. addressed this issue and developed a functional calibration framework based on the reproducing kernel Hilbert space, where the calibration parameter is a function of the input variables X [13]. They derived theoretical properties from two perspectives: the consistency of estimation and the consistency of prediction. Sometimes, the calibration parameter is not related to all input variables and varies only with one specified variable, such as time. Tian et al. proposed a novel framework for the inference of real-time parameters based on reinforcement learning and applied their method to physics-based models of turbofan engines [14]. The calibration procedure proposed in [14] has been validated computationally but lacks theoretical justification. To the best of our knowledge, there is no real-time calibration procedure with rigorous theoretical guarantees.
Real-time calibration parameters resemble varying coefficients in statistics. Since computer models are non-linear, real-time calibration can be regarded as the problem of estimating a non-linear model with varying coefficients. Few articles address such models; one exception is [15], whose authors constructed a pointwise estimator of the functional coefficient based on a local linear smoother and successfully applied their procedure to a photosynthesis study.
The motivation for this article is both theoretical and applied. From the theoretical perspective, most existing calibration approaches assume that model parameters are constant over time. However, in many scientific applications, the underlying parameters evolve dynamically with time. Ignoring this time variation may cause systematic bias, as the calibrated computer model cannot adequately capture the changing physical process. Our work addresses this limitation by proposing a calibration framework that explicitly accounts for time-varying parameters, thereby improving both theoretical understanding and estimation accuracy. From the application perspective, the motivation is exemplified by the NASA Orbiting Carbon Observatory-2 (OCO-2) mission [16]. In this mission, the forward model plays a central role in the Observing System Uncertainty Experiments (OSUE), where its accuracy directly affects the evaluation of retrieval algorithms. Importantly, the forward model involves several geometric parameters, such as the instrument and solar azimuth and zenith angles, that are inherently time-dependent. Accurately calibrating these time-varying parameters is crucial for the reliable prediction of radiances and, ultimately, for improving the quality of atmospheric carbon retrievals. This concrete application highlights the practical relevance and necessity of the proposed framework.
Building on this motivation, we adopt the idea of [15]: we obtain a local linear estimator of the discrepancy function between the computer model and the true process and then construct a pointwise estimator of the functional parameter via quasi-profile least squares. Furthermore, we establish the rate of convergence for the estimator of the discrepancy function, as well as the asymptotic normality of the pointwise estimator for the real-time parameter.
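As a rough illustration of the local linear smoothing idea mentioned above, the following sketch fits a kernel-weighted straight line around a target time point. It is a generic textbook smoother, not the estimator developed in this paper; the Epanechnikov kernel, the function names, and the bandwidth argument `h` are assumptions made for the example.

```python
def epanechnikov(u):
    """Epanechnikov kernel: a symmetric density supported on [-1, 1] (assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear(t_obs, y_obs, t0, h):
    """Kernel-weighted least squares fit of y ~ a + b * (t - t0) around t0.

    Returns (a, b): a estimates the regression function at t0 and
    b estimates its first derivative at t0.
    """
    s0 = s1 = s2 = r0 = r1 = 0.0
    for t, y in zip(t_obs, y_obs):
        d = t - t0
        w = epanechnikov(d / h)
        s0 += w            # sum of weights
        s1 += w * d        # first weighted moment
        s2 += w * d * d    # second weighted moment
        r0 += w * y
        r1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("too few observations inside the bandwidth window")
    # Closed-form solution of the 2 x 2 weighted normal equations.
    a = (s2 * r0 - s1 * r1) / det
    b = (s0 * r1 - s1 * r0) / det
    return a, b
```

A local linear fit reproduces exactly linear data without bias, which gives a convenient sanity check: for observations generated from y = 2 + 3t, `local_linear(t_obs, y_obs, 0.5, 0.2)` should return values close to (3.5, 3.0).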
This paper is organized as follows. In Section 2, we develop the proposed calibration procedure based on the local linear approximation and the quasi-profile least squares. In Section 3, we investigate the asymptotic properties of the proposed estimators. In Section 4 and Section 5, two examples, including simulated and practical models, and an application to NASA’s OCO-2 mission are used to illustrate the accuracy of the proposed method. Finally, we draw some conclusions and discuss future extensions in Section 6. The proofs of all the theorems in this paper are provided in Appendix A.
3. Theoretical Properties
3.1. Assumptions
Before we give the asymptotic properties of the proposed estimators, we list the conditions they require. First, we make the following general assumptions on the structure of the dataset and the kernel functions:
A1. The kernel function is symmetric, and satisfies
A2. The bivariate kernel function is of order , that is, where is a multi-index and .
A3. The kernel function is continuously differentiable on its support (0, 1) with for all and .
A4. The input variables , are independent and identically distributed (i.i.d.) random vectors. are independent and identically distributed with mean zero and variance , and . For every , and are independent.
A5. is a unique solution of Equation (2), and is a compact subset of .
A6. Let and , then
A7. Both and are non-singular.
A8.
Conditions A1–A3 impose restrictions on the kernel functions and , and were adopted in [17,18,19]. Furthermore, following [18], we need to specify the bandwidths and and impose some conditions on the joint distribution. Denote the joint probability density functions of , and by , and , respectively.
B1. , , and .
B2. , and .
B3. exists and is continuous on for .
B4. is continuous on uniformly in y; exists and is continuous on uniformly in y, for .
B5. is continuous on uniformly in .
B6. exists and is continuous on for .
B7. exists and is continuous on , for .
B8. is continuous on uniformly in ; exists and is continuous on uniformly in , for .
B9. As , both and tend to 0.
In addition, we need to impose some restrictions on the computer model . In this article, we assume that is known for convenience. In practice, we can obtain an estimator (surrogate model) of using Gaussian process regression based on the simulated dataset; see Remark 1 for more details.
C1..
C3. For , , where is a reproducing kernel Hilbert space and is Donsker for all .
Assumptions C1–C2 require that both the computer model and its first-order partial derivatives with respect to are bounded. These are standard regularity conditions that are typically easy to satisfy in practice and have been widely adopted in the literature; see, for example, [7,13]. Furthermore, we assume that the second-order partial derivatives of with respect to are continuous. This is a relatively mild assumption because it does not require the continuity of higher-order derivatives. Finally, Assumption C3 states that the computer model for any fixed satisfies the Donsker property, which is crucial for establishing asymptotic normality via empirical process theory; see [7] for further discussion.
3.2. Asymptotic Normality
Theorem 1. Under Conditions A1, A2, A6, and B1–B8, we have where .
Theorem 1 claims that the rate of convergence of is , which is slower than . In this article, we select a -consistent estimator of ; thus, the above rate becomes .
Theorem 2. Under Conditions A1–A8, B9, and C1–C3, we denote and ; then, we have where is defined in Condition A7 and where represents taking the expectation over random and .
Theorem 2 establishes the asymptotic normality of the raw estimator for the time-varying parameter at a fixed , which facilitates deriving the asymptotic properties of the proposed estimator. Based on the above results, we can derive the asymptotic distribution of as follows:
Theorem 3. Under Conditions A1–A8, B1–B9, and C1–C3, we denote ; then, we have where has also been defined in Condition A7 and
Theorem 3 establishes the asymptotic normality of the proposed estimator for the time-varying parameter at a fixed point , which is also called pointwise asymptotic normality.
5. An Application to Calibrate the Forward Model in NASA’s OCO-2 Mission
In the NASA Orbiting Carbon Observatory-2 (OCO-2) mission, Observing System Uncertainty Experiments (OSUEs) play a crucial role in performing probabilistic assessments on retrieval algorithms. The forward model is an essential component of the OSUEs, and its prediction accuracy for real-world scenarios directly impacts the evaluation of retrieval algorithms, as detailed in [16].
The forward model describes the complex physical relationship between the atmospheric variable and radiances . This model typically involves four geometric parameters, Instrument Azimuth Angle (Inst-AziA), Instrument Zenith Angle (Inst-ZenA), Solar Azimuth Angle (Sol-AziA), and Solar Zenith Angle (Sol-ZenA), denoted by , which are time-dependent. These angles define the relative positions of the instrument’s line of sight and the incoming solar radiation. Specifically, azimuth angles describe the horizontal orientation of either the instrument or the sun with respect to a reference direction (typically north), while zenith angles measure the deviation from the vertical. Together, they determine the optical path of sunlight through the atmosphere and thus strongly influence the measured radiances.
Given the high computational cost of this forward model, we constructed a surrogate model based on experimental data. We utilized the simulated dataset from [16] and first employed Principal Component Analysis (PCA) to reduce the dimensionality of from 66 to 4. Because the output is functional data over wavelength w, we considered the spectrometer’s measurement of the radiation intensity in the strong CO2 wavelength bands and computed its functional PCA weight as a new output. Finally, we used Gaussian process regression to construct a surrogate model of the forward model based on normalized experimental parameters.
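The Gaussian process step of such a surrogate can be sketched as follows. This is a minimal illustration under stated assumptions, not the configuration used in the study: it takes low-dimensional inputs (for instance, PCA scores) as given, uses a zero prior mean with a squared-exponential covariance, and fixes the hypothetical `length_scale` and `nugget` values rather than estimating them. The PCA step itself is omitted.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential covariance between two input vectors."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_gp(X, y, length_scale=1.0, nugget=1e-8):
    """Return a function that evaluates the GP posterior mean at a new input."""
    K = [[rbf(a, b, length_scale) + (nugget if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, y)  # alpha = (K + nugget * I)^{-1} y

    def predict(x_new):
        return sum(rbf(x_new, xi, length_scale) * ai for xi, ai in zip(X, alpha))

    return predict
```

With a small nugget the posterior mean nearly interpolates the simulator runs, so a surrogate built this way can be checked by predicting at the design points. In practice the covariance hyperparameters would be estimated, for example by maximum likelihood, and a library implementation would normally be preferred over this hand-rolled solver.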
To reduce the uncertainty of the forward model, we need to identify the time-varying parameter based on true observations. We downloaded 20 days of data from NASA’s official website and used MAPE and the mean square prediction error (MSPE) to measure the performance of the calibration procedures involved. The calibration results, including MAPE, MSPE, and the and quantiles of the pointwise estimated values of the calibration parameter, are presented in Table 3 and Figure 3.
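The two error measures can be computed as in the sketch below. It assumes that MAPE denotes the usual mean absolute percentage error, reported as a fraction, and that MSPE is the average squared difference between observations and predictions; the exact definitions used in this paper may differ in scaling.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (assumes no zero observations)."""
    return sum(abs((yt - yp) / yt) for yt, yp in zip(y_true, y_pred)) / len(y_true)


def mspe(y_true, y_pred):
    """Mean square prediction error."""
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)
```

For example, with observations [2, 4] and predictions [1, 5], the relative errors are 0.5 and 0.25, so `mape` returns 0.375, and both squared errors equal 1, so `mspe` returns 1.0.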
From Table 3, the proposed calibration procedure PLS achieves the smallest prediction error, while OLS is slightly worse than PLS. As expected, the constant calibration procedure COLS performs poorly in terms of MAPE and MSPE, validating our assumption of a time-varying parameter.
Figure 3 compares the predicted values obtained by the different methods for calibrating the forward model. The predicted values from the proposed PLS method are closest to the true observations, while those from the constant calibration procedure deviate significantly from the true observations. This application to the forward model further verifies the efficiency of the proposed calibration procedure.
6. Conclusions and Discussion
In this article, we proposed a real-time calibration procedure for computer models with a time-varying parameter. To construct a pointwise estimator for the time-varying parameter at a fixed point, we employed a quasi-profile least squares estimation approach. This involved deriving a local linear estimator for the discrepancy function given the calibration parameter, followed by computing a quasi-profile least squares estimator for the calibration function at a specified time point. Additionally, we established the convergence rate of the estimator for the discrepancy function and the asymptotic normality of the proposed estimator of the time-varying parameter. Furthermore, we conducted numerical simulations and considered a real-world example, both of which demonstrated the favorable performance of the proposed method.
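To make the notion of a pointwise estimate of a time-varying parameter concrete, the sketch below minimizes a kernel-weighted squared error at a fixed time point over a grid of candidate values. It is a deliberately simplified stand-in and not the quasi-profile least squares procedure of this paper: it omits the discrepancy function, uses a local constant fit, and relies on a hypothetical toy model `toy_model` and an assumed Epanechnikov kernel.

```python
import math


def toy_model(x, theta):
    """Hypothetical computer model, used only for illustration."""
    return math.exp(-theta * x)


def kernel(u):
    """Epanechnikov kernel on [-1, 1] (assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def calibrate_at(t0, t_obs, x_obs, y_obs, h, grid):
    """Grid-search minimizer of the kernel-weighted squared error at time t0."""
    def loss(theta):
        return sum(kernel((t - t0) / h) * (y - toy_model(x, theta)) ** 2
                   for t, x, y in zip(t_obs, x_obs, y_obs))
    return min(grid, key=loss)
```

Only observations whose time stamps fall within a bandwidth `h` of `t0` receive positive weight, so repeating the call over a sequence of time points traces out an estimated parameter curve. A grid search is used here because it is easy to verify; a gradient-based optimizer would be the more usual choice for a smooth model.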
Although the proposed method performs well in terms of both asymptotic theory and computational efficiency, our calibration procedure has some limitations. First, we assume that the computer model is fully known, which may not hold in practical applications. To address this, future research could focus on constructing surrogate models to approximate unknown computer models with time-varying parameters. Second, we assume that random errors are independent and identically distributed with finite variance. In situations where errors are correlated or exhibit heteroscedasticity, the efficiency of the proposed estimator may be reduced. Extending the method to account for correlated or non-standard error structures could significantly broaden its applicability. Finally, enhancing estimation efficiency through weighted quasi-profile least squares or other advanced techniques presents a promising avenue for further investigation. Overall, these potential extensions suggest that the proposed framework could be adapted to a wider range of complex and realistic modeling scenarios.