1. Introduction
High-dimensional uncertainty quantification problems arise frequently in scientific computing and engineering, where quantities of interest depend on a large number of uncertain parameters. In such settings, the number of model evaluations required to accurately explore the input space grows rapidly with dimension, leading to the well-known curse of dimensionality. This challenge is particularly severe when accurate model evaluations are computationally expensive, as is often the case for simulations governed by partial differential equations. Dimension reduction techniques play a central role in alleviating this difficulty by identifying low-dimensional structures that capture the dominant input–output dependence.
A common formulation of supervised dimension reduction assumes that a scalar response $Y$ depends on the input $X \in \mathbb{R}^p$ only through a low-dimensional linear subspace, i.e., $Y = g(B^\top X, \epsilon)$, where $B \in \mathbb{R}^{p \times d}$ with $d \ll p$ and $\epsilon$ is a noise term independent of $X$. The span of $B$ is referred to as a sufficient dimension reduction (SDR) subspace, and the smallest such subspace is known as the central subspace [1,2]. A variety of methods have been developed to estimate this structure, including sliced inverse regression (SIR) [3], sliced average variance estimation (SAVE) [4], contour regression [5], and related approaches [6,7,8,9,10,11,12]. These techniques are effective when sufficient high-quality labeled data are available, but their performance can deteriorate rapidly when accurate observations are scarce.
In many practical applications, data are available from models of varying fidelity. Low-fidelity models are typically inexpensive to evaluate but may be biased or incomplete, while high-fidelity models provide accurate information at a significantly higher computational cost. In such multi-fidelity settings, classical supervised dimension reduction methods often fail because the number of high-fidelity samples is insufficient to reliably estimate conditional expectations or variances. Bayesian approaches have been proposed to mitigate data scarcity, such as the Bayesian inverse regression framework in [13], which employs Gaussian process regression to approximate the likelihood and infer dimension-reduced structure. However, these methods typically assume a single fidelity level and can degrade when accurate observations are extremely limited or when data originate from heterogeneous sources.
Multi-fidelity modeling provides a principled framework to combine information from models of differing accuracy and cost. Gaussian process-based multi-fidelity methods, beginning with the autoregressive formulation of Kennedy and O’Hagan [14] and its recursive implementation by Le Gratiet and Garnier [15], have been widely used for surrogate modeling and uncertainty quantification. More recently, nonlinear information fusion strategies based on deep or autoregressive Gaussian processes have been developed to capture complex relationships between fidelity levels [16]. While these approaches have demonstrated strong predictive performance, they primarily focus on forward surrogate accuracy and uncertainty propagation, and do not explicitly address supervised dimension reduction in high-dimensional input spaces.
Several recent works have explored dimension reduction in related contexts, including active subspace methods [6,17,18] and Gaussian process-based kernel dimension reduction techniques [8]. Although effective in moderate-dimensional problems, these methods typically require either a sufficiently large number of high-fidelity samples or direct optimization of high-dimensional projection operators. When both the input dimension is large and high-fidelity data are scarce, such optimization problems become ill-posed due to the large number of parameters involved.
In this work, we propose a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework that addresses these challenges by tightly coupling nonlinear multi-fidelity Gaussian process modeling with supervised dimension reduction and Bayesian active learning. The central idea is to exploit abundant low-fidelity data to extract coarse structural information about the input–output relationship using SAVE, and to use this information to rotate the input space prior to multi-fidelity model training. A nonlinear multi-fidelity Gaussian process surrogate is then constructed in the rotated space, enabling effective information fusion across fidelity levels. By sampling from the trained surrogate, the dimension reduction procedure is iteratively refined, allowing reliable estimation of the central subspace even when the number of high-fidelity samples is small (Figure 1).
To further improve efficiency, a Bayesian active learning strategy driven by predictive uncertainty is incorporated to adaptively enrich the high-fidelity dataset. Depending on user requirements, the proposed framework supports either a rotated surrogate model for improved predictive accuracy or a reduced-order surrogate obtained via a two-stage dimension reduction strategy that combines Bayesian information criterion-based dimension selection with Gaussian process kernel optimization.
The main contributions of this work are summarized as follows:
Multi-fidelity–informed supervised dimension reduction. We develop a dimension reduction framework that leverages nonlinear multi-fidelity Gaussian process surrogates to enable reliable estimation of the central sufficient dimension reduction subspace under severe high-fidelity data scarcity.
Rotated multi-fidelity Gaussian process model. A rotated multi-fidelity Gaussian process (RMFGP) is proposed, in which the input space is iteratively rotated using directions identified by supervised dimension reduction and refined through surrogate-generated samples, rather than treated as a fixed preprocessing step.
Two-stage dimension reduction strategy under data scarcity. To address the ill-posedness of directly learning high-dimensional projection operators with limited high-fidelity data, we introduce a two-stage dimension reduction procedure that combines RMFGP-based rotation with Gaussian process kernel-based optimization, with intrinsic dimension selected via a Bayesian information criterion.
Integration of Bayesian active learning in multi-fidelity settings. A Bayesian active learning strategy based on predictive uncertainty is incorporated to adaptively select new high-fidelity samples, improving both surrogate accuracy and dimension reduction quality while minimizing computational cost.
Applications to stochastic partial differential equations. The proposed framework is validated on a range of numerical examples, including stochastic partial differential equations, demonstrating improved prediction accuracy, convergence behavior, and uncertainty propagation compared to existing Gaussian process-based dimension reduction methods.
The remainder of the paper is organized as follows: Section 2 introduces the necessary background on Gaussian process regression, multi-fidelity modeling, supervised dimension reduction, and active learning. Section 3 presents the proposed RMFGP framework and associated algorithms. Section 4 demonstrates the performance of the method through several numerical examples. Conclusions and directions for future work are given in Section 5.
2. Background and Preliminaries
2.1. Gaussian Process Regression
Gaussian process regression (GPR) provides a flexible Bayesian framework for approximating unknown functions from data. The presentation in this section follows [19]. Let $f : \mathbb{R}^p \to \mathbb{R}$ denote an unknown scalar-valued function, and suppose that noisy observations are available in the form
$$y_i = f(x_i) + \epsilon_i,$$
where $\epsilon_i \sim \mathcal{N}(0, \sigma_n^2)$ represents independent Gaussian noise.
In the Gaussian process framework, the latent function $f$ is modeled as a stochastic process,
$$f(x) \sim \mathcal{GP}\big(0,\, k_\theta(x, x')\big),$$
where $k_\theta$ is a positive-definite covariance kernel parameterized by hyperparameters $\theta$. This prior specification encodes assumptions about the smoothness and correlation structure of the function.
Let $X = \{x_i\}_{i=1}^n$ and $\mathbf{y} = (y_1, \dots, y_n)^\top$ denote the training inputs and outputs. Under the Gaussian likelihood assumption, the hyperparameters $\theta$ can be estimated by maximizing the marginal log-likelihood,
$$\log p(\mathbf{y} \mid X, \theta) = -\tfrac{1}{2}\,\mathbf{y}^\top (K + \sigma_n^2 I)^{-1}\mathbf{y} - \tfrac{1}{2}\log\det(K + \sigma_n^2 I) - \tfrac{n}{2}\log 2\pi,$$
where $K$ is the covariance matrix with entries $K_{ij} = k_\theta(x_i, x_j)$.
Given a new input $x_\ast$, the posterior predictive distribution of $f(x_\ast)$ is Gaussian,
$$f(x_\ast) \mid X, \mathbf{y} \sim \mathcal{N}\big(\mu(x_\ast), \sigma^2(x_\ast)\big),$$
with mean and variance given by
$$\mu(x_\ast) = \mathbf{k}_\ast^\top (K + \sigma_n^2 I)^{-1}\mathbf{y}, \qquad \sigma^2(x_\ast) = k_\theta(x_\ast, x_\ast) - \mathbf{k}_\ast^\top (K + \sigma_n^2 I)^{-1}\mathbf{k}_\ast,$$
where $\mathbf{k}_\ast = \big(k_\theta(x_\ast, x_1), \dots, k_\theta(x_\ast, x_n)\big)^\top$.
The posterior mean provides a surrogate prediction of the unknown function, while the posterior variance naturally quantifies predictive uncertainty, a feature that will be exploited later for active learning and multi-fidelity modeling.
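As a concrete illustration, the posterior formulas above can be sketched in a few lines of NumPy. The kernel choice (squared-exponential), the length-scale, and the noise level are illustrative assumptions rather than tuned values:

```python
import numpy as np

def rbf_kernel(A, B, length=0.3, signal=1.0):
    # Squared-exponential covariance k(x, x') = s^2 exp(-|x - x'|^2 / (2 l^2)).
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return signal**2 * np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Posterior mean and pointwise variance of a zero-mean GP at test inputs Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xs, X)                      # cross-covariances k_*
    Kss = rbf_kernel(Xs, Xs)
    mean = Ks @ np.linalg.solve(K, y)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Fit to noisy samples of a smooth 1-D function.
rng = np.random.default_rng(0)
X = np.linspace(0, 1, 8)[:, None]
y = np.sin(2 * np.pi * X[:, 0]) + 1e-3 * rng.standard_normal(8)
mean, var = gp_posterior(X, y, X)
```

At the training inputs the posterior mean nearly interpolates the observations and the posterior variance collapses toward the noise level, the behavior exploited later for active learning.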
2.2. Multi-Fidelity Gaussian Process Models
Multi-fidelity modeling aims to combine information from models or experiments of varying accuracy and computational cost. In many scientific computing applications, low-fidelity models are inexpensive to evaluate but may exhibit bias or missing physics, whereas high-fidelity models provide accurate predictions at a substantially higher cost. Gaussian process-based multi-fidelity models provide a flexible probabilistic framework for fusing such heterogeneous information.
2.2.1. Linear Autoregressive Multi-Fidelity Models
A widely used formulation is the linear autoregressive model proposed by Kennedy and O’Hagan [14]. Let $f_1(x), \dots, f_s(x)$ denote a sequence of Gaussian processes corresponding to increasing levels of fidelity, where $f_s$ represents the highest fidelity. The autoregressive relationship between successive fidelity levels is given by
$$f_t(x) = \rho_{t-1}\, f_{t-1}(x) + \delta_t(x), \qquad t = 2, \dots, s,$$
where $\rho_{t-1}$ is a scalar correlation parameter and $\delta_t(x)$ is a Gaussian process independent of $f_{t-1}$, modeling the discrepancy between fidelity levels.
This construction implies a Markov property across fidelity levels and allows the high-fidelity response to be expressed as a linear transformation of the lower-fidelity model plus a correction term. To reduce computational complexity, Le Gratiet and Garnier [15] proposed a recursive implementation under a nested experimental design, transforming the original multi-fidelity problem into a sequence of standard Gaussian process regressions. Under this formulation, the predictive mean and variance at each fidelity level $t$ can be computed recursively from the posterior at level $t-1$ and the covariance matrix $K_t$ at fidelity level $t$.
While effective in problems with strong linear correlations between fidelity levels, this formulation can perform poorly when the relationship between models is highly nonlinear or varies across the input space.
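The role of the scale parameter $\rho$ can be illustrated with a toy two-level example. The fidelity pair below is hypothetical, and the least-squares fit stands in for the full Gaussian process treatment of [14]:

```python
import numpy as np

# Toy fidelity pair obeying an exactly linear cross-fidelity relation
# (hypothetical functions chosen for illustration):
#   f_H(x) = 2 f_L(x) + (x - 0.5)
f_lo = lambda x: np.sin(8 * x)
f_hi = lambda x: 2.0 * f_lo(x) + (x - 0.5)

x = np.linspace(0.0, 1.0, 20)        # nested design: high-fi points are low-fi points
y_lo, y_hi = f_lo(x), f_hi(x)

# Least-squares estimate of rho in f_H = rho * f_L + delta(x); the basis
# [y_lo, 1, x] absorbs a linear discrepancy term delta for this example.
A = np.column_stack([y_lo, np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, y_hi, rcond=None)
rho = coef[0]
residual = y_hi - A @ coef
```

Because the toy discrepancy is exactly linear in $x$, the fit recovers $\rho = 2$ to machine precision; when the true cross-fidelity map is nonlinear or input-dependent, no scalar $\rho$ fits well, which motivates the nonlinear fusion of Section 2.2.2.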
2.2.2. Nonlinear Information Fusion
To address these limitations, Perdikaris et al. [
16] introduced a nonlinear autoregressive multi-fidelity Gaussian process model, also referred to as a nonlinear autoregressive Gaussian process (NARGP). In this framework, the relationship between fidelity levels is expressed as
where
is an unknown nonlinear function modeled as a Gaussian process and
is an independent Gaussian process capturing residual discrepancies.
By augmenting the input space with predictions from the lower-fidelity model, this formulation allows flexible nonlinear correlations between fidelity levels to be learned from data. The resulting surrogate can be interpreted as a deep Gaussian process with a restricted architecture [20,21]. Since the posterior distribution is no longer Gaussian for $t \ge 2$, predictive moments are typically approximated using Monte Carlo sampling.
Nonlinear multi-fidelity Gaussian process models have demonstrated improved robustness and predictive accuracy in a wide range of applications, particularly when linear correlations between fidelity levels are weak or input-dependent. In the present work, this nonlinear information fusion framework serves as a fundamental building block for the proposed RMFGP methodology.
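The core mechanism of NARGP, augmenting the high-fidelity inputs with the low-fidelity response, can be sketched as follows. The helper GP, the toy fidelity pair $f_H = f_L^2$, and all hyperparameter values are illustrative assumptions:

```python
import numpy as np

def rbf(A, B, length=0.4):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / length**2)

def gp_fit_predict(X, y, Xs, noise=1e-6):
    # GP posterior mean with a fixed-length-scale squared-exponential kernel.
    K = rbf(X, X) + noise * np.eye(len(X))
    return rbf(Xs, X) @ np.linalg.solve(K, y)

# Hypothetical fidelity pair with a nonlinear cross-fidelity map f_H = f_L^2,
# which no scalar rho in f_H = rho * f_L + delta can capture.
f_lo = lambda x: np.sin(6 * x)
f_hi = lambda x: f_lo(x) ** 2

x_hi = np.linspace(0, 1, 6)[:, None]      # scarce high-fidelity design
y_hi = f_hi(x_hi[:, 0])

# NARGP-style augmentation: feed the low-fidelity value alongside x, so the
# high-fidelity GP only has to learn the simple map (x, f_L) -> f_L^2.
X_aug = np.hstack([x_hi, f_lo(x_hi)])
pred_train = gp_fit_predict(X_aug, y_hi, X_aug)

# Predictions between the scarce samples reuse cheap low-fidelity values.
x_test = np.linspace(0, 1, 101)[:, None]
pred = gp_fit_predict(X_aug, y_hi, np.hstack([x_test, f_lo(x_test)]))
```

In the augmented coordinates $(x, f_L(x))$ the map to $f_H$ is smooth even though no scalar $\rho$ relates the two fidelity levels, which is how the surrogate exploits abundant low-fidelity information.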
2.3. Supervised Dimension Reduction Methods
Supervised dimension reduction aims to identify low-dimensional structures in the input space that preserve the dependence between inputs and outputs. Let $X \in \mathbb{R}^p$ be a random input vector and let $Y$ denote a scalar response. A central concept in supervised dimension reduction is the notion of a sufficient dimension reduction (SDR) subspace.
Definition 1 (Sufficient dimension reduction). A subspace $\mathcal{S} \subseteq \mathbb{R}^p$ is called a sufficient dimension reduction subspace for $Y$ given $X$ if
$$Y \perp\!\!\!\perp X \mid B^\top X,$$
where the columns of $B \in \mathbb{R}^{p \times d}$ form a basis of $\mathcal{S}$ and $d \le p$.
Among all SDR subspaces, the smallest one in the sense of set inclusion is referred to as the central subspace, denoted by $\mathcal{S}_{Y|X}$ [22]. Once an estimate of the central subspace is obtained, the original high-dimensional regression problem can be approximated by
$$Y \approx g(B^\top X),$$
where $g$ is an unknown low-dimensional function.
A variety of classical methods have been developed to estimate the central subspace, including sliced inverse regression (SIR) [3] and sliced average variance estimation (SAVE) [4]. These methods rely on discretizing the range of the response variable and estimating conditional moments of standardized inputs.
2.3.1. Sliced Inverse Regression
Sliced inverse regression (SIR) estimates the central subspace by exploiting variations in the conditional mean $\mathbb{E}[Z \mid Y]$, where $Z$ denotes the standardized input. After standardizing the input variables, the response $Y$ is partitioned into a finite number of slices, and the covariance of the slice-wise conditional means is used to construct an eigenvalue problem whose leading eigenvectors provide an estimate of the SDR directions.
The computational procedure of SIR adopted in this work is summarized in Algorithm 1. While computationally efficient, SIR may fail to detect symmetric or higher-order dependencies between the inputs and the response.
| Algorithm 1 Sliced Inverse Regression (SIR) |
- 1: Compute the sample mean and sample variance, $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ and $\hat{\Sigma} = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^\top$, and compute the standardized random vectors $z_i = \hat{\Sigma}^{-1/2}(x_i - \bar{x})$.
- 2: Discretize $Y$ as $\tilde{Y} = s$ if $Y \in J_s$, where the collection of intervals $\{J_s\}_{s=1}^S$ is a partition of the range of $Y$.
- 3: Approximate $\mathbb{E}[Z \mid \tilde{Y} = s]$ by the slice means $\hat{m}_s = \frac{1}{n_s}\sum_{y_i \in J_s} z_i$.
- 4: Approximate $M = \mathrm{Cov}\big(\mathbb{E}[Z \mid \tilde{Y}]\big)$ by $\hat{M} = \sum_{s=1}^S \frac{n_s}{n}\,\hat{m}_s \hat{m}_s^\top$.
- 5: Let $\hat{\eta}_1, \dots, \hat{\eta}_d$ be the first $d$ eigenvectors of $\hat{M}$, and let $\hat{\beta}_k = \hat{\Sigma}^{-1/2}\hat{\eta}_k$, $k = 1, \dots, d$. The SDR predictors are $\hat{\beta}_k^\top (x - \bar{x})$.
|
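A minimal NumPy implementation of Algorithm 1 might look as follows; the quantile-based slicing and the slice count are implementation choices not fixed by the algorithm:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=1):
    """Sliced inverse regression (Algorithm 1): leading SDR directions."""
    n, p = X.shape
    xbar = X.mean(0)
    evals, evecs = np.linalg.eigh(np.cov(X.T))
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T   # Sigma^{-1/2}
    Z = (X - xbar) @ root_inv                             # standardized inputs
    edges = np.quantile(y, np.linspace(0, 1, n_slices + 1))
    M = np.zeros((p, p))
    for s in range(n_slices):
        upper = (y <= edges[s + 1]) if s == n_slices - 1 else (y < edges[s + 1])
        mask = (y >= edges[s]) & upper
        if mask.sum() == 0:
            continue
        m = Z[mask].mean(0)                               # slice mean of Z
        M += mask.mean() * np.outer(m, m)
    w, V = np.linalg.eigh(M)
    eta = V[:, ::-1][:, :d]                               # leading eigenvectors
    B = root_inv @ eta                                    # back to original scale
    return B / np.linalg.norm(B, axis=0)

# Synthetic check: Y depends on X only through beta^T X (monotone link).
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 5))
beta = np.array([1.0, 2.0, 0.0, 0.0, 0.0]) / np.sqrt(5)
y = (X @ beta) + 0.1 * (X @ beta) ** 3 + 0.05 * rng.standard_normal(2000)
B = sir_directions(X, y)
align = abs(B[:, 0] @ beta)
```

On this synthetic example the leading SIR direction aligns closely with the true direction $\beta$.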
2.3.2. Sliced Average Variance Estimation
Sliced average variance estimation (SAVE) extends SIR by incorporating second-order conditional information through the conditional variance $\mathrm{Var}(Z \mid Y)$. Following [23], the SAVE matrix is constructed as
$$M = \sum_{s=1}^{S} \frac{n_s}{n}\,\big(I_p - \mathrm{Var}(Z \mid Y \in J_s)\big)^2,$$
where $Z$ denotes the standardized input, $\{J_s\}_{s=1}^S$ is a partition of the response space, and $I_p$ is the identity matrix.
The full computational steps of SAVE used throughout this paper are summarized in Algorithm 2. Compared to SIR, SAVE is capable of detecting a broader class of dependencies, including symmetric relationships between inputs and outputs. For this reason, SAVE is adopted as the primary supervised dimension reduction tool in the proposed framework, although other SDR methods could also be incorporated.
| Algorithm 2 Sliced Average Variance Estimation (SAVE) |
- 1: Standardize to obtain $z_i = \hat{\Sigma}^{-1/2}(x_i - \bar{x})$ as in Algorithm 1.
- 2: Discretize $Y$ as $\tilde{Y} = s$ if $Y \in J_s$, where the collection of intervals $\{J_s\}_{s=1}^S$ is a partition of the range of $Y$.
- 3: For each slice $J_s$, compute the sample conditional variance of $Z$ given $\tilde{Y} = s$: $\hat{V}_s = \frac{1}{n_s}\sum_{y_i \in J_s} (z_i - \hat{m}_s)(z_i - \hat{m}_s)^\top$, with $\hat{m}_s$ the slice mean from Algorithm 1.
- 4: Compute the sample version of $M$: $\hat{M} = \sum_{s=1}^S \frac{n_s}{n}\,(I_p - \hat{V}_s)^2$.
- 5: Let $\hat{\eta}_1, \dots, \hat{\eta}_d$ be the first $d$ eigenvectors of $\hat{M}$, and let $\hat{\beta}_k = \hat{\Sigma}^{-1/2}\hat{\eta}_k$, $k = 1, \dots, d$. The SDR predictors are $\hat{\beta}_k^\top (x - \bar{x})$.
|
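Algorithm 2 can be sketched analogously; the quantile slicing and slice count are again implementation choices:

```python
import numpy as np

def save_directions(X, y, n_slices=10, d=1):
    """Sliced average variance estimation (Algorithm 2)."""
    n, p = X.shape
    xbar = X.mean(0)
    evals, evecs = np.linalg.eigh(np.cov(X.T))
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T   # Sigma^{-1/2}
    Z = (X - xbar) @ root_inv
    edges = np.quantile(y, np.linspace(0, 1, n_slices + 1))
    M = np.zeros((p, p))
    I = np.eye(p)
    for s in range(n_slices):
        upper = (y <= edges[s + 1]) if s == n_slices - 1 else (y < edges[s + 1])
        mask = (y >= edges[s]) & upper
        if mask.sum() < 2:
            continue
        Vs = np.cov(Z[mask].T)                # slice-wise conditional variance
        M += mask.mean() * (I - Vs) @ (I - Vs)
    w, V = np.linalg.eigh(M)
    eta = V[:, ::-1][:, :d]
    B = root_inv @ eta
    return B / np.linalg.norm(B, axis=0)

# Symmetric link: slice means of Z vanish (SIR is blind), but the slice
# variances deviate from the identity along beta, so SAVE recovers it.
rng = np.random.default_rng(2)
X = rng.standard_normal((4000, 5))
beta = np.array([3.0, 0.0, 4.0, 0.0, 0.0]) / 5.0
y = (X @ beta) ** 2 + 0.05 * rng.standard_normal(4000)
B = save_directions(X, y)
align = abs(B[:, 0] @ beta)
```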
2.3.3. Selection of Reduced Dimension
A key challenge in supervised dimension reduction is determining the intrinsic dimension $d$ of the central subspace. In this work, the reduced dimension is selected using the Bayesian information criterion (BIC) proposed in [24]. Let $\lambda_1 \ge \dots \ge \lambda_p$ denote the eigenvalues of the SAVE-based matrix $M$. The BIC score $G(d)$ rewards the cumulative magnitude of the leading $d$ eigenvalues and penalizes model complexity through a sequence $C_n$ satisfying the conditions specified in [24]. The estimated reduced dimension is then given by the maximizer $\hat{d} = \arg\max_{1 \le d \le p} G(d)$.
In the present work, Algorithms 1 and 2 are applied either directly to observed data or to surrogate-generated samples, depending on the stage of the proposed RMFGP framework. While effective in classical settings, the direct application of SAVE and BIC can be unreliable when the number of accurate observations is small. This limitation motivates the surrogate-assisted dimension reduction strategy developed in the subsequent sections.
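The penalized-eigenvalue selection can be illustrated mechanically as follows. The specific penalty $C_n$ used here is an illustrative choice, not the exact sequence prescribed in [24]:

```python
import numpy as np

def bic_dimension(eigvals, n, c=0.5):
    """Pick d maximizing (sum of leading d eigenvalues) - penalty.
    The penalty C_n = c * log(n) / sqrt(n) per dimension is an
    illustrative stand-in for the sequence of the cited reference."""
    eigvals = np.sort(eigvals)[::-1]
    Cn = c * np.log(n) / np.sqrt(n)
    scores = [eigvals[:d].sum() - Cn * d for d in range(1, len(eigvals) + 1)]
    return int(np.argmax(scores)) + 1

# Two dominant directions followed by a sharp eigenvalue drop-off.
lam = np.array([5.0, 4.0, 0.01, 0.005, 0.001])
d_hat = bic_dimension(lam, n=500)
```

Adding a third coordinate gains only a tiny eigenvalue but pays a full penalty increment, so the score is maximized at $\hat d = 2$.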
2.4. Bayesian Active Learning
Bayesian active learning, also referred to as sequential experimental design, aims to improve model accuracy by adaptively selecting new data points that are expected to provide the greatest information gain [25]. In problems where accurate observations are expensive or time-consuming to obtain, active learning plays a crucial role in achieving reliable inference with limited computational resources.
In the context of this work, observations are available from multiple fidelity levels. Let $\mathcal{D}_L$ and $\mathcal{D}_H$ denote the low-fidelity and high-fidelity datasets, respectively, where typically $|\mathcal{D}_H| \ll |\mathcal{D}_L|$. A multi-fidelity Gaussian process surrogate is constructed using the combined dataset $\mathcal{D} = \mathcal{D}_L \cup \mathcal{D}_H$, as described in Section 2.2.
The objective of active learning in this setting is to enrich the high-fidelity dataset $\mathcal{D}_H$ by selectively querying new high-fidelity observations from a predefined candidate pool. In this work, the candidate pool is chosen to be the set of low-fidelity input locations, which are inexpensive to evaluate and already available. At each iteration, a new high-fidelity sample location is selected by maximizing an acquisition function,
$$x_{\mathrm{new}} = \arg\max_{x \in \mathcal{X}_{\mathrm{pool}}} a(x).$$
Among the various acquisition functions proposed in the literature, such as expected improvement or probability of improvement [26,27,28], we adopt the predictive variance of the multi-fidelity Gaussian process surrogate,
$$a(x) = \sigma^2(x),$$
where $\sigma^2(x)$ denotes the posterior predictive variance at input $x$. This choice is motivated by the fact that predictive variance naturally quantifies epistemic uncertainty in Gaussian process models and directly reflects regions of the input space where additional high-fidelity information is most beneficial.
Once the new high-fidelity observation is obtained, it is added to the training dataset, and the multi-fidelity surrogate is retrained. This process is repeated until a stopping criterion is satisfied, such as reaching a prescribed prediction error threshold on a validation set or exhausting a predefined computational budget.
In the proposed framework, Bayesian active learning serves two complementary purposes. First, it improves the predictive accuracy of the multi-fidelity surrogate model with minimal additional high-fidelity evaluations. Second, by enhancing surrogate quality in regions of high uncertainty, it indirectly improves the reliability of surrogate-assisted supervised dimension reduction procedures employed in subsequent sections.
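A minimal sketch of the variance-based acquisition over a candidate pool, assuming a unit-variance squared-exponential prior with an illustrative length-scale:

```python
import numpy as np

def rbf(A, B, length=0.2):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / length**2)

def posterior_var(X_train, X_cand, noise=1e-6):
    # GP posterior variance at candidate points (prior variance 1).
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_cand, X_train)
    return 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)

# High-fidelity data clustered near the left end; candidates cover [0, 1].
X_hi = np.array([[0.0], [0.05], [0.1]])
cand = np.linspace(0, 1, 101)[:, None]
v = posterior_var(X_hi, cand)
x_new = cand[np.argmax(v)]          # acquisition: maximize predictive variance
```

With all high-fidelity data clustered on the left, the maximizer of the posterior variance lands at the far, unexplored end of the domain, exactly the behavior described above.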
3. Rotated Multi-Fidelity Gaussian Process Framework
3.1. Problem Setting
We consider a supervised regression problem in which a scalar quantity of interest depends on a high-dimensional input. Let $f : \mathbb{R}^p \to \mathbb{R}$ denote an unknown high-fidelity response function, where the input dimension $p$ may be large. We assume that evaluations of $f$ are expensive and that only a limited number of accurate observations are available.
In addition to the high-fidelity model, we assume access to one or more low-fidelity models that provide approximate evaluations of the response at substantially lower computational cost. For clarity of exposition, we focus on the case of two fidelity levels, although the proposed framework can be extended naturally to more general multi-fidelity settings. Let $\mathcal{D}_H = \{(x_i^H, y_i^H)\}_{i=1}^{n_H}$ and $\mathcal{D}_L = \{(x_i^L, y_i^L)\}_{i=1}^{n_L}$ denote the high-fidelity and low-fidelity datasets, respectively, with $n_H \ll n_L$ in typical applications.
A central assumption of this work is that the high-fidelity response admits an approximate low-dimensional structure. Specifically, we assume that there exists a matrix $M \in \mathbb{R}^{d \times p}$ with $d \ll p$ such that
$$f(x) \approx g(Mx),$$
where $g : \mathbb{R}^d \to \mathbb{R}$ is an unknown function. The row space of $M$ defines the central sufficient dimension reduction subspace introduced in Section 2.3. Identifying this subspace is critical both for reducing the effective dimensionality of the problem and for constructing accurate surrogate models. Note that throughout this work, we use $A$ to denote rotation matrices and $M$ to denote final reduction matrices, while $B$ in Section 2 refers generically to SDR directions.
The primary challenge addressed in this work arises from the severe scarcity of high-fidelity data. Classical supervised dimension reduction techniques, such as SIR or SAVE, rely on an accurate estimation of conditional moments and typically require a sufficient number of high-fidelity observations. When $n_H$ is small, a direct application of these methods becomes unreliable. At the same time, although low-fidelity data are abundant, they may contain bias or missing information and cannot be used directly to infer the central subspace of the high-fidelity response.
The goal of this work is therefore twofold. First, we aim to develop a reliable strategy for identifying the central dimension reduction subspace of the high-fidelity model by exploiting both low- and high-fidelity data. Second, we seek to construct an accurate surrogate model for the high-fidelity response that can be used for prediction and uncertainty quantification with minimal reliance on expensive model evaluations.
To achieve these objectives, we introduce a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework that tightly integrates multi-fidelity Gaussian process modeling, supervised dimension reduction, and Bayesian active learning. The subsequent subsections describe the construction of the RMFGP model and the associated dimension reduction and sampling strategies in detail.
3.2. Input Rotation via Supervised Dimension Reduction
The first step of the proposed RMFGP framework is to identify informative directions in the input space and use them to define a rotated coordinate system. At this stage, the objective is not to perform aggressive dimension reduction, but rather to improve the representation of the input space so that subsequent multi-fidelity surrogate modeling can be carried out more effectively.
Since low-fidelity data are typically abundant and inexpensive to obtain, we begin by applying supervised dimension reduction to the low-fidelity dataset. Specifically, the sliced average variance estimation (SAVE) method introduced in Section 2.3 is applied to the low-fidelity observations $\mathcal{D}_L$. The computational procedure follows Algorithm 2, resulting in an estimate of dominant directions that characterize the dependence between low-fidelity inputs and outputs.
Let $A \in \mathbb{R}^{p \times p}$ denote the orthogonal matrix whose columns are the eigenvectors obtained from the SAVE matrix constructed using the low-fidelity data, ordered according to decreasing eigenvalues. This matrix defines a rotation of the original input space,
$$\tilde{x} = A^\top x,$$
where $\tilde{x}$ represents the rotated input coordinates. All available data, including low-fidelity, high-fidelity, and validation inputs, are transformed using the same rotation.
It is important to emphasize that this step performs a rotation rather than a truncation of the input space. All p dimensions are retained, and no information is discarded at this stage. The purpose of the rotation is to align the coordinate system with directions that are informative for the low-fidelity response, thereby improving the conditioning of subsequent multi-fidelity Gaussian process models.
Although the low-fidelity model may contain bias or missing physics, its response often captures coarse structural information about the underlying input–output relationship. By exploiting this information through SAVE-based rotation, the input space is reorganized so that informative directions are concentrated in the leading coordinates, while less influential directions are pushed toward higher-indexed components. This reorganization facilitates more effective information fusion across fidelity levels in later stages of the RMFGP framework.
The rotated input representation serves as the foundation for constructing the nonlinear multi-fidelity Gaussian process surrogate described in the next subsection. As will be shown, this initial rotation can be further refined using surrogate-generated predictions, leading to improved estimation of the central subspace associated with the high-fidelity response.
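The distinction between rotation and truncation can be made concrete. In the sketch below, a generic symmetric positive semi-definite matrix stands in for the SAVE matrix; its eigenvectors define the orthogonal rotation:

```python
import numpy as np

rng = np.random.default_rng(3)
p = 6

# Stand-in for the SAVE matrix: any symmetric PSD matrix has an
# orthonormal eigenbasis, and its eigenvectors define the rotation A.
S = rng.standard_normal((p, p))
M_save = S @ S.T
evals, A = np.linalg.eigh(M_save)
A = A[:, ::-1]                      # order columns by decreasing eigenvalue

X = rng.standard_normal((100, p))   # inputs in original coordinates
X_rot = X @ A                       # rotated coordinates, x_tilde = A^T x

# Rotation, not truncation: all p dimensions survive and the map inverts.
X_back = X_rot @ A.T
```

Orthogonality of $A$ guarantees that the transformation is invertible and norm-preserving, so no information is discarded at this stage.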
3.3. Rotated Multi-Fidelity Gaussian Process Construction
After rotating the input space as described in Section 3.2, we construct a nonlinear multi-fidelity Gaussian process surrogate in the rotated coordinates. This surrogate serves as the core modeling component of the proposed RMFGP framework and provides the probabilistic foundation for subsequent dimension refinement and active learning.
Let $\tilde{x} = A^\top x$ denote the rotated input associated with $x$. The low-fidelity and high-fidelity datasets in the rotated space are given by $\tilde{\mathcal{D}}_L = \{(\tilde{x}_i^L, y_i^L)\}_{i=1}^{n_L}$ and $\tilde{\mathcal{D}}_H = \{(\tilde{x}_i^H, y_i^H)\}_{i=1}^{n_H}$. We adopt the nonlinear autoregressive multi-fidelity Gaussian process (NARGP) formulation introduced in [16] to fuse information across fidelity levels. In the rotated space, the high-fidelity response is modeled as
$$f_H(\tilde{x}) = g\big(\tilde{x},\, f_L(\tilde{x})\big) + \delta(\tilde{x}),$$
where $f_L$ denotes the low-fidelity Gaussian process surrogate, $g$ is an unknown nonlinear function modeled as a Gaussian process, and $\delta$ is an independent Gaussian process capturing residual discrepancies between fidelity levels.
The low-fidelity surrogate is first trained on $\tilde{\mathcal{D}}_L$ by standard Gaussian process regression. Predictions from this surrogate are then used as additional inputs when training the high-fidelity model on $\tilde{\mathcal{D}}_H$. This construction allows the multi-fidelity surrogate to learn complex, input-dependent correlations between fidelity levels while preserving a coherent probabilistic interpretation.
We emphasize that the defining feature of the RMFGP model is the coupling between input rotation and multi-fidelity Gaussian process regression. Unlike standard NARGP models constructed in the original input space, the RMFGP surrogate operates in a rotated coordinate system informed by supervised dimension reduction. This coupling improves the alignment between informative input directions and the covariance structure of the Gaussian process, leading to enhanced predictive performance and numerical stability in high-dimensional settings.
Since the posterior distribution of the high-fidelity surrogate is no longer Gaussian under the nonlinear autoregressive formulation, predictive moments are approximated using Monte Carlo sampling. In particular, for a given input $\tilde{x}$, samples of $f_L(\tilde{x})$ are first drawn from the low-fidelity posterior and then propagated through the Gaussian process $g$ to generate samples of $f_H(\tilde{x})$. The predictive mean and variance are subsequently estimated from these samples.
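The Monte Carlo approximation of the predictive moments can be sketched with a closed-form check. The Gaussian low-fidelity posterior and the square map standing in for the learned $g$ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Low-fidelity posterior at a fixed input (illustrative numbers).
mu_lo, sd_lo = 2.0, 0.3

# Stand-in for the learned map g(x, f_L): here it depends on f_L only.
g = lambda f: f ** 2

# Monte Carlo propagation: sample f_L, push through g, take moments.
samples = g(mu_lo + sd_lo * rng.standard_normal(200_000))
mc_mean, mc_var = samples.mean(), samples.var()

# Closed form for comparison, for f ~ N(mu, s^2):
#   E[f^2] = mu^2 + s^2,  Var[f^2] = 4 mu^2 s^2 + 2 s^4.
exact_mean = mu_lo**2 + sd_lo**2
exact_var = 4 * mu_lo**2 * sd_lo**2 + 2 * sd_lo**4
```

The sample moments converge to the exact values, while the sample histogram itself is visibly non-Gaussian, which is why the fused posterior cannot be summarized by Gaussian formulas alone.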
The resulting RMFGP surrogate provides both point predictions and uncertainty quantification for the high-fidelity response. These uncertainty estimates play a crucial role in guiding Bayesian active learning and in enabling surrogate-assisted supervised dimension reduction, as described in the following subsections.
3.4. Iterative Refinement and Active Learning
The rotated multi-fidelity Gaussian process constructed in Section 3.3 provides an initial probabilistic surrogate for the high-fidelity response in the rotated input space. However, because the initial rotation is obtained solely from low-fidelity data, it may not accurately reflect the central subspace associated with the high-fidelity model. To address this limitation, the RMFGP framework employs an iterative refinement strategy that combines surrogate-assisted supervised dimension reduction with Bayesian active learning.
3.4.1. Surrogate-Assisted Refinement of Rotation
Once the RMFGP surrogate is trained, it can be used to generate additional response samples at arbitrary input locations. In particular, surrogate predictions are evaluated at the low-fidelity input locations $\{x_i^L\}_{i=1}^{n_L}$, yielding a set of surrogate-generated high-fidelity responses $\{\hat{y}_i^H\}_{i=1}^{n_L}$. These surrogate-generated samples are inexpensive to obtain and provide an approximation of the input–output relationship of the high-fidelity model across the input domain.
The SAVE algorithm (Algorithm 2) is then reapplied to the surrogate-generated dataset $\{(x_i^L, \hat{y}_i^H)\}_{i=1}^{n_L}$ to estimate an updated set of dimension reduction directions. Let $A_k$ denote the rotation matrix obtained at iteration $k$. The input space is subsequently rotated according to
$$\tilde{x}^{(k)} = A_k^\top x,$$
and the RMFGP surrogate is retrained in the updated rotated coordinates.
This surrogate-assisted refinement procedure allows information from the limited high-fidelity data to be propagated throughout the input space via the multi-fidelity surrogate, enabling more reliable estimation of the central subspace than would be possible using the high-fidelity observations alone. In practice, only a small number of refinement iterations is required before the rotation stabilizes.
3.4.2. Integration of Bayesian Active Learning
To further enhance surrogate accuracy and refinement reliability, Bayesian active learning is incorporated within the iterative framework. At each iteration, the predictive uncertainty of the RMFGP surrogate is evaluated over the candidate pool of low-fidelity input locations. New high-fidelity samples are selected by maximizing the predictive variance, as described in Section 2.4, and added to the high-fidelity dataset.
The inclusion of actively selected high-fidelity samples serves two complementary purposes. First, it improves the local accuracy of the RMFGP surrogate in regions of high uncertainty. Second, it enhances the quality of surrogate-generated samples used for supervised dimension reduction, leading to more accurate estimation of rotation directions in subsequent iterations.
The combined process of surrogate-assisted rotation refinement and Bayesian active learning results in a feedback loop in which improved surrogates lead to improved dimension reduction, which in turn leads to improved surrogate construction. This coupling distinguishes the RMFGP framework from conventional multi-fidelity Gaussian process models and classical supervised dimension reduction methods, which typically treat surrogate modeling and dimension reduction as independent tasks.
The iterative refinement procedure is terminated when a prescribed stopping criterion is satisfied, such as stabilization of the estimated dimension reduction directions, convergence of surrogate prediction error, or exhaustion of the available computational budget.
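One concrete way to monitor stabilization of the estimated dimension reduction directions is the largest principal angle between the subspaces spanned by successive direction estimates. The following sketch is an illustrative implementation of such a check, not a prescribed part of the algorithm:

```python
import numpy as np

def max_principal_angle(B1, B2):
    """Largest principal angle (radians) between span(B1) and span(B2)."""
    Q1, _ = np.linalg.qr(B1)
    Q2, _ = np.linalg.qr(B2)
    # Singular values of Q1^T Q2 are cosines of the principal angles;
    # the smallest singular value gives the largest angle.
    sv = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(sv.min(), -1.0, 1.0))

# Two nearly identical 2-D subspaces of R^5 versus an unrelated one.
rng = np.random.default_rng(5)
B = rng.standard_normal((5, 2))
B_close = B + 1e-3 * rng.standard_normal((5, 2))
B_far = rng.standard_normal((5, 2))

angle_close = max_principal_angle(B, B_close)
angle_far = max_principal_angle(B, B_far)
```

Iterations can then be stopped once the angle between successive estimates falls below a user-chosen threshold.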
3.5. Two-Stage Dimension Reduction Strategy
The iterative refinement procedure described in Section 3.4 yields a sequence of rotation matrices that progressively align the input coordinates with directions that are informative for the high-fidelity response. While this rotation significantly improves surrogate modeling and information fusion, it does not by itself reduce the dimensionality of the input space. In many applications, however, constructing a reduced-order surrogate model is desirable for efficiency, interpretability, and scalability.
A direct approach to supervised dimension reduction would be to estimate a projection matrix by optimizing a Gaussian process kernel or likelihood function in the original high-dimensional space. When the number of high-fidelity samples is limited, such an approach is often ill-posed due to the large number of parameters involved. To address this challenge, the RMFGP framework adopts a two-stage dimension reduction strategy that decouples the identification of informative directions from the final reduction of dimensionality.
3.5.1. Stage I: RMFGP-Based Rotation
In the first stage, the RMFGP framework is used to identify and refine a rotation of the input space through surrogate-assisted SAVE and active learning, as described in Section 3.2, Section 3.3 and Section 3.4. Let $A^\ast$ denote the final rotation matrix obtained upon convergence of the iterative refinement procedure. This rotation concentrates the dominant input–output dependence into the leading coordinates of the transformed input $\tilde{x} = (A^\ast)^\top x$ while retaining the full dimensionality of the problem.
The primary role of this stage is to reorganize the input space so that the effective dimensionality of the response is reduced, even though all p coordinates are preserved. By doing so, the subsequent dimension reduction step can be carried out in a significantly better-conditioned coordinate system.
3.5.2. Stage II: Reduced-Order Surrogate Construction
In the second stage, a reduced-order surrogate model is constructed by selecting a subset of the rotated coordinates. The intrinsic dimension
d of the reduced representation is determined using the Bayesian information criterion described in
Section 2.3, applied to the SAVE-based matrix computed from surrogate-assisted data. Once
d is selected, the reduced input is defined as the leading
d components of the rotated input vector.
A Gaussian process surrogate is then trained on this reduced input. In the reduced space, kernel hyperparameter optimization becomes well-posed due to the substantially lower dimensionality, allowing the construction of an accurate and efficient reduced-order surrogate for the high-fidelity response.
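A minimal sketch of this stage, assuming a generic eigenvalue-based BIC-type score (the exact criterion of Section 2.3 may differ) and synthetic rotated inputs:

```python
import numpy as np

def select_dim_bic(eigvals, n):
    """Choose d maximizing a BIC-type score on the SAVE eigenvalues.

    Illustrative criterion (the paper's Section 2.3 form may differ):
        G(d) = n * sum_{i<=d} log(1 + lam_i) - d * log(n)
    """
    lam = np.sort(np.asarray(eigvals))[::-1]
    scores = [n * np.sum(np.log1p(lam[:d])) - d * np.log(n)
              for d in range(1, len(lam) + 1)]
    return int(np.argmax(scores)) + 1

# Eigenvalues with a clear gap after the second component
eigvals = [10.0, 8.0, 0.01, 0.01, 0.01]
d = select_dim_bic(eigvals, n=500)        # -> 2

# Stage II reduction: keep the leading d rotated coordinates
Z = np.random.default_rng(1).standard_normal((500, 5))  # rotated inputs
Z_red = Z[:, :d]    # reduced input for the final GP surrogate
```

The final GP surrogate is then trained on `Z_red`, whose dimension is small enough for stable hyperparameter optimization.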
3.5.3. Algorithmic Summary
The proposed two-stage strategy offers several advantages. First, it avoids the direct optimization of high-dimensional projection operators using limited high-fidelity data. Second, it exploits the strengths of multi-fidelity modeling to guide dimension reduction in a data-efficient manner. Finally, it provides flexibility to the user: depending on application requirements, one may either use the rotated full-dimensional surrogate for maximum accuracy or employ the reduced-order surrogate for improved computational efficiency.
The complete RMFGP procedure, including input rotation via supervised dimension reduction, nonlinear multi-fidelity Gaussian process construction, surrogate-assisted refinement, Bayesian active learning, and the two-stage dimension reduction strategy described above, is summarized in Algorithm 3. This algorithm serves as a practical implementation of the RMFGP framework and highlights the interaction between surrogate modeling, dimension reduction, and adaptive sampling in high-dimensional, data-scarce multi-fidelity settings.
| Algorithm 3 Rotated multi-fidelity Gaussian process model (RMFGP) |
1: Input: low-fidelity data sets, high-fidelity training set, validation data set, generalization error threshold, maximum iteration number I, and reduction parameter (0 or 1).
2: Apply the SAVE method (Algorithm 2) to the low-fidelity data only to compute the first rotation matrix, extracting the principal direction information contained in the low-fidelity data; then apply this rotation to all input sets.
3: Perform NARGP on the rotated training data to obtain predictions at the validation points; then apply the SAVE method again to the surrogate-assisted data to compute an updated rotation matrix and apply it to all inputs.
4: Check whether the generalization error threshold is met. If not, perform Bayesian active learning to locate the inputs where the prediction variance is maximal, sample there, and add the new samples to the high-fidelity training set.
5: Repeat steps 2 and 3 until the generalization error threshold is satisfied or the maximum iteration number is reached.
6: Compute the final rotation matrix and build the rotated model.
7: If the dimension reduction parameter equals 1, compute the intrinsic dimension d via BIC and form a reduction matrix from the first d principal columns of the final rotation matrix.
8: Apply the Gaussian process dimension reduction technique to compute a further reduction matrix; then build the reduced model with the final combined reduction matrix.
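The Bayesian active learning step (step 4 above) selects the candidate input where the surrogate's predictive variance is largest. A minimal sketch with a zero-mean GP and fixed RBF hyperparameters (the paper's models optimize hyperparameters and use the multi-fidelity NARGP surrogate; this stand-in only illustrates the acquisition rule):

```python
import numpy as np

def gp_posterior(X_train, y_train, X_cand, length_scale=1.0, noise=1e-8):
    """Posterior mean/variance of a zero-mean GP with a fixed RBF kernel."""
    def rbf(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-0.5 * d2 / length_scale**2)

    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    Ks = rbf(X_cand, X_train)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - np.sum(v**2, axis=0)      # prior variance is 1
    return mean, var

# Pick the candidate with the largest predictive variance
X_train = np.array([[0.0], [1.0]])
y_train = np.array([0.0, 1.0])
X_cand = np.array([[0.1], [0.5], [3.0]])
_, var = gp_posterior(X_train, y_train, X_cand)
i_next = int(np.argmax(var))              # -> 2 (farthest from the data)
```

The selected candidate is then evaluated with the high-fidelity model and appended to the training set before the next iteration.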
4. Numerical Results
In this section, we present four numerical examples to demonstrate the effectiveness of the proposed RMFGP framework in high-dimensional, data-scarce settings. We first assess the accuracy of the estimated rotation or reduction matrix using the subspace distance metric introduced in [
24],
$\mathrm{dist}(B, \hat{B}) = \| P_B - P_{\hat{B}} \|_F$,
where $B$ and $\hat{B}$ denote the true and estimated central subspace matrices, respectively; $P_B$ and $P_{\hat{B}}$ are the corresponding projection matrices; and $\|\cdot\|_F$ denotes the Frobenius norm. To highlight the benefit of the proposed approach, we compare the reduction matrix computed by RMFGP against the GP-SAVE method introduced in [
13], using the same number of high-fidelity samples.
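Assuming the metric takes its usual form (the Frobenius norm of the difference of the two orthogonal projections), it can be computed as:

```python
import numpy as np

def subspace_distance(B, B_hat):
    """||P_B - P_{B_hat}||_F, comparing the spans of B and B_hat."""
    def proj(A):
        Q, _ = np.linalg.qr(A)       # orthonormal basis for span(A)
        return Q @ Q.T
    return np.linalg.norm(proj(B) - proj(B_hat), ord='fro')

B_true = np.array([[1.0], [0.0], [0.0]])
B_scaled = np.array([[2.0], [0.0], [0.0]])   # same span, different scale
B_orth = np.array([[0.0], [1.0], [0.0]])     # orthogonal span

d_same = subspace_distance(B_true, B_scaled)  # -> 0.0
d_far = subspace_distance(B_true, B_orth)     # -> sqrt(2)
```

Because the metric depends only on the projections, it is invariant to the choice of basis and scaling of the subspace matrices.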
Once the rotation or reduction matrix is obtained, new training and test datasets are generated to construct and evaluate Gaussian process surrogate models. To ensure a fair comparison, two cases are considered depending on the reduction parameter. When the reduction parameter is 0, the proposed RMFGP model and standard Gaussian process regression (GPR) are trained on the rotated and original input spaces, respectively. When it is 1, the reduced-order surrogate constructed using RMFGP is compared against GP-SAVE with the same reduced dimension.
Model accuracy is quantified using the relative error
$\epsilon = \| u - \tilde{u} \|_2 / \| u \|_2$,
where $u$ and $\tilde{u}$ denote the exact and predicted high-fidelity responses evaluated on the test set.
4.1. Linear Example: Poisson’s Equation
We first consider a linear multi-fidelity example in which the high-fidelity and low-fidelity models are related through additive high-dimensional perturbations. This example represents a simple setting in which the correlation between fidelity levels is predominantly linear.
Specifically, we define a pair of high- and low-fidelity functions whose inputs are independent random variables, uniformly distributed on a common interval. Both functions can be interpreted as solutions of Poisson's equation with suitable forcing terms.
From the analytical form of the high-fidelity function, it is evident that the true intrinsic dimension is d = 2, with the central subspace spanned by two fixed directions. In this experiment, fixed numbers of low-fidelity samples and test points are used. For each case, two iterations of Bayesian active learning are performed starting from the initial high-fidelity set, with five new high-fidelity samples added per iteration.
Table 1 reports the subspace distance obtained with different numbers of high-fidelity samples. Both RMFGP and GP-SAVE exhibit consistent improvement as the number of high-fidelity samples increases. However, RMFGP achieves substantially higher accuracy across all sample sizes. This improvement can be attributed to the effective exploitation of low-fidelity information and the refinement enabled by Bayesian active learning, which together enhance the estimation of the central subspace under limited high-fidelity budgets.
Table 2 presents the values of the Bayesian information criterion. The criterion attains its maximum at d = 2, correctly identifying the intrinsic dimension, consistent with the analytical structure of the high-fidelity function.
Table 3 summarizes the relative prediction errors on the test set. When only the rotation is used, RMFGP trained on the rotated inputs consistently outperforms standard GPR trained on the original inputs. When the reduction is applied, the reduced-order RMFGP surrogate also achieves lower errors than GP-SAVE. In RMFGP, the input dimension is first reduced to an intermediate dimension and then further reduced to d = 2 using Gaussian process-based dimension reduction. This hierarchical reduction alleviates the burden of hyperparameter optimization and allows accurate surrogate construction with fewer high-fidelity samples.
Figure 2 illustrates the mean squared error as a function of the number of high-fidelity samples. In both the rotated and reduced settings, RMFGP exhibits faster convergence and lower error levels than the comparison methods.
Figure 3 further confirms this observation through correlation plots, where RMFGP predictions align closely with the ideal correlation line, while the comparison methods exhibit larger deviations.
4.2. Nonlinear Example
The second example demonstrates the capability of the proposed RMFGP framework to handle nonlinear problems with nonlinearly correlated low- and high-fidelity data, relationships that cannot be captured by linear autoregressive multi-fidelity models.
We consider the following pair of functions:
where the inputs are independent random variables, uniformly distributed on a common interval. From the analytical expressions of the two functions, it follows that the intrinsic dimension of the low-fidelity model is two, whereas the intrinsic dimension of the high-fidelity model is one. Such a discrepancy may arise in practical applications when low-fidelity models contain additional noise or spurious dependencies that are absent in the high-fidelity response.
The original input dimension is larger still, and the true central subspace of the high-fidelity model is spanned by a single vector. In this experiment, fixed numbers of low-fidelity training samples and test points are used, together with a small initial set of high-fidelity samples. The Bayesian active learning scheme then adds two high-fidelity samples in the first iteration and three in the second, after which the stopping criterion is satisfied.
Table 4 reports the subspace distance defined in (30) for different numbers of high-fidelity samples. Both RMFGP and standard Gaussian process regression exhibit consistent improvement as the number of high-fidelity samples increases. However, RMFGP achieves significantly higher accuracy in estimating the central subspace with substantially fewer high-fidelity samples. In this example, the low-fidelity data contain complete but noisy information about the high-fidelity response, and the proposed RMFGP framework effectively exploits this structure to mitigate the impact of noise.
The Bayesian information criterion values are reported in
Table 5. The criterion attains its maximum at d = 1, correctly identifying the intrinsic dimension of the high-fidelity model. This result demonstrates the ability of RMFGP to reliably identify intrinsic dimensionality even when low-fidelity data introduce additional spurious directions.
When the reduction parameter is set to 1, the RMFGP framework first reduces the input dimension using the RMFGP-based rotation; a subsequent Gaussian process-based dimension reduction step then reduces it further to d = 1.
Table 6 and
Figure 4 present the relative error and mean squared error as functions of the number of high-fidelity samples for four different surrogate models. In all cases, RMFGP achieves lower prediction errors than the comparison methods. In particular, the reduced-order RMFGP surrogate exhibits superior accuracy in the small-sample regime, highlighting the effectiveness of the proposed framework in high-dimensional, data-scarce nonlinear settings.
Figure 5 further confirms this observation through correlation plots, where RMFGP predictions align closely with the ideal correlation line, while the comparison methods exhibit larger deviations.
Figure 6 shows the prediction plot on the test set. The black starred curve is the exact prediction computed with the true dimension reduction matrix. The red circled curve obtained by the reduced RMFGP model fits this curve well, whereas the blue squared curve exhibits large errors at some locations. The successful estimation of the central subspace, together with the accurate predictions on the test set, demonstrates the ability of the proposed method to exclude the effect of noise using a relatively small set of highly accurate data, which is valuable in real-world applications where high-fidelity data are expensive or difficult to collect.
4.3. Advection Equation
The third example evaluates the performance of RMFGP on a stochastic partial differential equation problem from [
29]. Consider the one-dimensional advection equation with random input, $u_t + a\,u_x = 0$, subject to a random initial condition, where
a is a constant coefficient and the initial condition depends on a random vector.
Under this setting, an analytical high-fidelity solution is available. The low-fidelity data are generated from a simplified model. The random input vector consists of i.i.d. uniformly distributed random variables, and the constant
a is fixed at 1. Compared to the previous examples, the low-fidelity model in this setting contains missing information, since it depends on only a subset of the random inputs rather than the full input vector.
In this experiment, the high- and low-fidelity function values are computed at fixed space–time locations. From the analytical expression of the high-fidelity solution, the true reduced dimension is known, and the corresponding reduction matrix can be written down explicitly.
Table 7 reports the BIC values; the criterion attains its maximum at the true reduced dimension, correctly identifying the intrinsic dimension of the problem.
For the numerical experiments, fixed numbers of low-fidelity training samples and test samples are used. For all cases, the experiment starts from a small initial set of high-fidelity samples, and the Bayesian active learning procedure adds five high-fidelity samples per iteration, with two iterations performed before termination.
Table 8 summarizes the accuracy of the estimated central subspace using the subspace distance metric. As expected, both methods improve as the number of high-fidelity samples increases; however, RMFGP consistently outperforms GP-SAVE across all sample sizes. In particular, when the number of high-fidelity samples is insufficient, GP-SAVE may fail to identify the central subspace accurately.
The relative errors reported in
Table 9 and the MSE curves in
Figure 7 further confirm this behavior. RMFGP achieves smaller prediction errors and exhibits faster convergence than the baseline GP model. The correlation plots in
Figure 8 and the prediction plots in
Figure 9 also demonstrate that RMFGP yields improved generalization performance and more accurate regression behavior than the comparison approach. Overall, this example illustrates that RMFGP can successfully approximate the central subspace in a stochastic differential equation problem, even when the low-fidelity model contains missing information.
4.4. Elliptic Equation
The final example illustrates the performance of the proposed RMFGP framework on a more challenging stochastic partial differential equation. We consider a one-dimensional elliptic equation with a random, high-order diffusion coefficient, subject to homogeneous Dirichlet boundary conditions.
The diffusion coefficient depends on a random input vector and is defined differently for the high- and low-fidelity models. The random variables are assumed to be independent and uniformly distributed. Compared with the high-fidelity model, the low-fidelity model introduces a systematic bias through a constant shift in the denominator, while retaining partial information about the stochastic structure of the coefficient.
For this elliptic problem, a deterministic solution representation exists in integral form. Applying the boundary conditions determines the constants in this representation, and the integrals appearing in the solution are evaluated using highly accurate numerical quadrature. Unlike the previous examples, no closed-form analytical expression for the response is available.
In this experiment, the solution is evaluated at a fixed spatial location. The exact central subspace cannot be inferred directly from the governing equation. Instead, it is approximated using the classical SAVE method applied to a large reference dataset consisting of 10,000 high-fidelity samples. The intrinsic dimension is determined via the Bayesian information criterion, and the corresponding BIC values are reported in
Table 10. The criterion attains its maximum at the estimated intrinsic dimension, which is taken as the true reduced dimension.
The numbers of low-fidelity training samples and test samples are fixed. For all cases, the experiment starts with a small set of high-fidelity samples; two iterations of Bayesian active learning are performed, with two samples added in the first iteration and three in the second.
Table 11 summarizes the accuracy of the estimated central subspace using the subspace distance metric.
Table 12 reports the relative prediction errors for different surrogate models.
Figure 10 presents the mean squared error as a function of the number of high-fidelity samples, while
Figure 11 shows the corresponding correlation plots. Across all metrics, RMFGP consistently outperforms the comparison Gaussian process models and GP-SAVE, exhibiting behavior similar to that observed in the previous examples.
As the number of high-fidelity samples increases, RMFGP with rotated inputs and no truncation achieves the best predictive performance. This observation reflects the fact that accurate identification of the principal directions improves the quality of the surrogate.
Finally, an uncertainty quantification analysis is performed for this model. With the dimension reduction matrix
M obtained by the RMFGP model, a new Gaussian process surrogate is built by pre-processing all training data with
M to reduce the input dimension. Then, 2000 samples of the random input are drawn from an i.i.d. uniform distribution.
Figure 12 presents the averages of the means and standard deviations (std) of these 2000 cases at 50 different x locations. The x-axis represents the indices of the different x values, and the y-axis shows the average mean and std for each case. The ground truth is shown as the black line. The green diamond line is obtained by the pure SAVE method given sufficiently large training data; here, 10,000 high-fidelity samples are used to produce these results. Note that only 35 high-fidelity samples are used in the RMFGP model. As shown in
Figure 12a, all four methods perform similarly in terms of the mean. However,
Figure 12b shows that the RMFGP method yields a smaller std than the GP-SAVE method, indicating that RMFGP is more confident in its predictions.
5. Conclusions
In this work, we proposed a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework for surrogate modeling, dimension reduction, and uncertainty quantification in high-dimensional settings with severely limited high-fidelity data. The central idea of the proposed approach is to tightly integrate multi-fidelity Gaussian process modeling, supervised dimension reduction, and Bayesian active learning within a unified iterative framework. By exploiting abundant low-fidelity data to guide an initial rotation of the input space and refining this rotation through surrogate-assisted dimension reduction and adaptive sampling, RMFGP enables reliable identification of the central subspace associated with the high-fidelity response.
A key feature of RMFGP is the distinction between input rotation and dimensional truncation. Rather than performing aggressive dimension reduction directly from scarce high-fidelity data, the proposed two-stage strategy first reorganizes the input space to concentrate informative directions and then constructs reduced-order surrogates in a well-conditioned coordinate system. This design alleviates the ill-posedness commonly encountered in high-dimensional Gaussian process regression and allows for accurate surrogate construction with substantially fewer high-fidelity evaluations.
The effectiveness of the RMFGP framework was demonstrated through a sequence of numerical examples, including linear and nonlinear algebraic models as well as stochastic partial differential equations. Across all test cases, RMFGP consistently achieved more accurate estimation of the central subspace, improved predictive performance, and more reliable uncertainty quantification compared to standard Gaussian process regression and GP-SAVE methods, particularly in the small-sample regime. The results also highlight the benefit of Bayesian active learning in enhancing both surrogate accuracy and dimension reduction quality under limited computational budgets.
Several directions for future research remain open. The present work focuses on problems with moderate input dimensionality and smooth response functions; extensions to problems with discontinuities or sharp gradients would be of interest. In addition, while the current framework employs Gaussian process surrogates, the rotation and refinement strategy could be combined with other probabilistic or operator-learning models. Finally, extending RMFGP to handle time-dependent problems and more complex multi-physics systems represents a promising direction for further investigation.