 
 
Article

RMFGP: A Rotated Multi-Fidelity Gaussian Process Framework for Supervised Dimension Reduction

1 Department of Mathematics, Purdue University, West Lafayette, IN 47906, USA
2 School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(2), 325; https://doi.org/10.3390/math14020325
Submission received: 23 December 2025 / Revised: 14 January 2026 / Accepted: 16 January 2026 / Published: 18 January 2026
(This article belongs to the Special Issue Machine Learning and Statistical Learning with Applications)

Abstract

High-dimensional surrogate modeling with limited high-fidelity data poses a major challenge in uncertainty quantification. Classical supervised dimension reduction methods often fail in this setting due to insufficient accurate observations, while low-fidelity data are abundant but biased. In this work, we propose a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework that enables reliable dimension reduction and surrogate construction under severe data scarcity. The proposed method integrates nonlinear multi-fidelity Gaussian process regression with sliced average variance estimation (SAVE) to iteratively identify informative input directions. Low-fidelity data are first used to extract coarse structural information, which is exploited to rotate the input space prior to multi-fidelity model training. Predictions generated by the trained RMFGP surrogate are then used to refine the dimension reduction, allowing accurate estimation of the central sufficient dimension reduction subspace even when high-fidelity data are scarce. A Bayesian active learning strategy based on predictive uncertainty is further incorporated to adaptively select new high-fidelity samples. Numerical examples, including stochastic partial differential equations, demonstrate that RMFGP significantly improves prediction accuracy, convergence, and uncertainty propagation compared to existing Gaussian process-based dimension reduction approaches, while requiring substantially fewer high-fidelity evaluations.

1. Introduction

High-dimensional uncertainty quantification problems arise frequently in scientific computing and engineering, where quantities of interest depend on a large number of uncertain parameters. In such settings, the number of model evaluations required to accurately explore the input space grows rapidly with dimension, leading to the well-known curse of dimensionality. This challenge is particularly severe when accurate model evaluations are computationally expensive, as is often the case for simulations governed by partial differential equations. Dimension reduction techniques play a central role in alleviating this difficulty by identifying low-dimensional structures that capture the dominant input–output dependence.
A common formulation of supervised dimension reduction assumes that a scalar response $Y$ depends on the input $X \in \mathbb{R}^p$ only through a low-dimensional linear subspace, i.e.,
$$Y = f(X) = g(\beta^T X),$$
where $\beta \in \mathbb{R}^{p \times d}$ with $d \ll p$. The span of $\beta$ is referred to as a sufficient dimension reduction (SDR) subspace, and the smallest such subspace is known as the central subspace [1,2]. A variety of methods have been developed to estimate this structure, including sliced inverse regression (SIR) [3], sliced average variance estimation (SAVE) [4], contour regression [5], and related approaches [6,7,8,9,10,11,12]. These techniques are effective when sufficient high-quality labeled data are available, but their performance can deteriorate rapidly when accurate observations are scarce.
In many practical applications, data are available from models of varying fidelity. Low-fidelity models are typically inexpensive to evaluate but may be biased or incomplete, while high-fidelity models provide accurate information at a significantly higher computational cost. In such multi-fidelity settings, classical supervised dimension reduction methods often fail because the number of high-fidelity samples is insufficient to reliably estimate conditional expectations or variances. Bayesian approaches have been proposed to mitigate data scarcity, such as the Bayesian inverse regression framework in [13], which employs Gaussian process regression to approximate the likelihood and infer dimension-reduced structure. However, these methods typically assume a single fidelity level and can degrade when accurate observations are extremely limited or when data originate from heterogeneous sources.
Multi-fidelity modeling provides a principled framework to combine information from models of differing accuracy and cost. Gaussian process-based multi-fidelity methods, beginning with the autoregressive formulation of Kennedy and O’Hagan [14] and its recursive implementation by Le Gratiet and Garnier [15], have been widely used for surrogate modeling and uncertainty quantification. More recently, nonlinear information fusion strategies based on deep or autoregressive Gaussian processes have been developed to capture complex relationships between fidelity levels [16]. While these approaches have demonstrated strong predictive performance, they primarily focus on forward surrogate accuracy and uncertainty propagation, and do not explicitly address supervised dimension reduction in high-dimensional input spaces.
Several recent works have explored dimension reduction in related contexts, including active subspace methods [6,17,18] and Gaussian process-based kernel dimension reduction techniques [8]. Although effective in moderate-dimensional problems, these methods typically require either a sufficiently large number of high-fidelity samples or direct optimization of high-dimensional projection operators. When both the input dimension is large and high-fidelity data are scarce, such optimization problems become ill-posed due to the large number of parameters involved.
In this work, we propose a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework that addresses these challenges by tightly coupling nonlinear multi-fidelity Gaussian process modeling with supervised dimension reduction and Bayesian active learning. The central idea is to exploit abundant low-fidelity data to extract coarse structural information about the input–output relationship using SAVE, and to use this information to rotate the input space prior to multi-fidelity model training. A nonlinear multi-fidelity Gaussian process surrogate is then constructed in the rotated space, enabling effective information fusion across fidelity levels. By sampling from the trained surrogate, the dimension reduction procedure is iteratively refined, allowing reliable estimation of the central subspace even when the number of high-fidelity samples is small (Figure 1).
To further improve efficiency, a Bayesian active learning strategy driven by predictive uncertainty is incorporated to adaptively enrich the high-fidelity dataset. Depending on user requirements, the proposed framework supports either a rotated surrogate model for improved predictive accuracy or a reduced-order surrogate obtained via a two-stage dimension reduction strategy that combines Bayesian information criterion-based dimension selection with Gaussian process kernel optimization.
The main contributions of this work are summarized as follows:
  • Multi-fidelity–informed supervised dimension reduction. We develop a dimension reduction framework that leverages nonlinear multi-fidelity Gaussian process surrogates to enable reliable estimation of the central sufficient dimension reduction subspace under severe high-fidelity data scarcity.
  • Rotated multi-fidelity Gaussian process model. A rotated multi-fidelity Gaussian process (RMFGP) is proposed, in which the input space is iteratively rotated using directions identified by supervised dimension reduction and refined through surrogate-generated samples, rather than treated as a fixed preprocessing step.
  • Two-stage dimension reduction strategy under data scarcity. To address the ill-posedness of directly learning high-dimensional projection operators with limited high-fidelity data, we introduce a two-stage dimension reduction procedure that combines RMFGP-based rotation with Gaussian process kernel-based optimization, with intrinsic dimension selected via a Bayesian information criterion.
  • Integration of Bayesian active learning in multi-fidelity settings. A Bayesian active learning strategy based on predictive uncertainty is incorporated to adaptively select new high-fidelity samples, improving both surrogate accuracy and dimension reduction quality while minimizing computational cost.
  • Applications to stochastic partial differential equations. The proposed framework is validated on a range of numerical examples, including stochastic partial differential equations, demonstrating improved prediction accuracy, convergence behavior, and uncertainty propagation compared to existing Gaussian process-based dimension reduction methods.
The remainder of the paper is organized as follows: Section 2 introduces the necessary background on Gaussian process regression, multi-fidelity modeling, supervised dimension reduction, and active learning. Section 3 presents the proposed RMFGP framework and associated algorithms. Section 4 demonstrates the performance of the method through several numerical examples. Conclusions and directions for future work are given in Section 5.

2. Background and Preliminaries

2.1. Gaussian Process Regression

Gaussian process regression (GPR) provides a flexible Bayesian framework for approximating unknown functions from data. The presentation in this section follows [19]. Let $z : \mathbb{R}^p \to \mathbb{R}$ denote an unknown scalar-valued function, and suppose that noisy observations are available in the form
$$y_i = z(x_i) + \varepsilon_i, \qquad i = 1, \dots, n,$$
where $\varepsilon_i$ represents independent Gaussian noise.
In the Gaussian process framework, the latent function $z(x)$ is modeled as a stochastic process,
$$z(x) \sim \mathcal{GP}\big(0, k(x, x'; \theta)\big),$$
where $k(\cdot, \cdot\,; \theta)$ is a positive-definite covariance kernel parameterized by hyperparameters $\theta$. This prior specification encodes assumptions about the smoothness and correlation structure of the function.
Let $X = [x_1, \dots, x_n]^T$ and $y = [y_1, \dots, y_n]^T$ denote the training inputs and outputs. Under the Gaussian likelihood assumption, the hyperparameters $\theta$ can be estimated by maximizing the marginal log-likelihood,
$$\log p(y \mid X, \theta) = -\tfrac{1}{2}\log|K| - \tfrac{1}{2}\, y^T K^{-1} y - \tfrac{n}{2}\log(2\pi),$$
where $K$ is the covariance matrix with entries $K_{ij} = k(x_i, x_j; \theta)$.
Given a new input $x^*$, the posterior predictive distribution of $z(x^*)$ is Gaussian,
$$p(z^* \mid y, X, x^*) = \mathcal{N}\big(\mu(x^*), \sigma^2(x^*)\big),$$
with mean and variance given by
$$\mu(x^*) = k_n K^{-1} y,$$
$$\sigma^2(x^*) = k(x^*, x^*) - k_n K^{-1} k_n^T,$$
where $k_n = [k(x^*, x_1), \dots, k(x^*, x_n)]$.
The posterior mean provides a surrogate prediction of the unknown function, while the posterior variance naturally quantifies predictive uncertainty, a feature that will be exploited later for active learning and multi-fidelity modeling.
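The posterior formulas can be implemented directly with a Cholesky factorization. The following self-contained sketch uses a squared-exponential kernel with fixed (not optimized) hyperparameters; the function names and the toy problem are our own illustration, not part of the paper.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance k(a,b) = s^2 exp(-|a-b|^2 / (2 l^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xstar, noise=1e-6, lengthscale=1.0, variance=1.0):
    """Posterior mean mu(x*) = k_n K^{-1} y and variance of a zero-mean GP."""
    K = rbf_kernel(X, X, lengthscale, variance) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xstar, X, lengthscale, variance)
    L = np.linalg.cholesky(K)                     # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks @ alpha                               # posterior mean
    v = np.linalg.solve(L, Ks.T)
    var = variance - (v ** 2).sum(0)              # k(x*,x*) - k_n K^{-1} k_n^T
    return mu, var

# Usage: recover a 1-D function from 20 noisy samples
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X[:, 0]) + 1e-3 * rng.standard_normal(20)
Xs = np.linspace(-3, 3, 50)[:, None]
mu, var = gp_posterior(X, y, Xs)
```

The Cholesky route avoids forming $K^{-1}$ explicitly and yields both the mean and the variance from the same factorization.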

2.2. Multi-Fidelity Gaussian Process Models

Multi-fidelity modeling aims to combine information from models or experiments of varying accuracy and computational cost. In many scientific computing applications, low-fidelity models are inexpensive to evaluate but may exhibit bias or missing physics, whereas high-fidelity models provide accurate predictions at a substantially higher cost. Gaussian process-based multi-fidelity models provide a flexible probabilistic framework for fusing such heterogeneous information.

2.2.1. Linear Autoregressive Multi-Fidelity Models

A widely used formulation is the linear autoregressive model proposed by Kennedy and O'Hagan [14]. Let $\{Z_t(x)\}_{t=1}^{r}$ denote a sequence of Gaussian processes corresponding to increasing levels of fidelity, where $t = r$ represents the highest fidelity. The autoregressive relationship between successive fidelity levels is given by
$$Z_t(x) = \rho_t Z_{t-1}(x) + \delta_t(x), \qquad t = 2, \dots, r,$$
where $\rho_t$ is a scalar correlation parameter and $\delta_t(x)$ is a Gaussian process independent of $Z_{t-1}(x)$, modeling the discrepancy between fidelity levels.
This construction implies a Markov property across fidelity levels and allows the high-fidelity response to be expressed as a linear transformation of the lower-fidelity model plus a correction term. To reduce computational complexity, Le Gratiet and Garnier [15] proposed a recursive implementation under a nested experimental design, transforming the original multi-fidelity problem into a sequence of standard Gaussian process regressions.
Under this formulation, the predictive mean and variance at fidelity level $t$ can be computed recursively as
$$\mu_t(x) = \rho_t\, \mu_{t-1}(x) + \mu_{\delta_t} + k_{n_t}^{(t)} K_t^{-1} \big( y_t - \rho_t\, \mu_{t-1}(x_t) - \mu_{\delta_t} \big),$$
$$\sigma_t^2(x) = \rho_t^2\, \sigma_{t-1}^2(x) + k_t(x, x) - k_{n_t}^{(t)} K_t^{-1} k_{n_t}^{(t)T},$$
where $K_t$ denotes the covariance matrix at fidelity level $t$.
While effective in problems with strong linear correlations between fidelity levels, this formulation can perform poorly when the relationship between models is highly nonlinear or varies across the input space.
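To make the recursion concrete, the sketch below fits a two-level model of the Kennedy–O'Hagan form on a nested design. The toy models, the fixed lengthscale, and the estimation of $\rho$ by least squares (rather than joint marginal-likelihood maximization) are simplifying assumptions of ours.

```python
import numpy as np

def rbf(A, B, l=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / l**2)

def gp_fit_predict(X, y, Xs, noise=1e-6, l=0.2):
    # Posterior mean of a zero-mean GP: k_* (K + noise I)^{-1} y
    K = rbf(X, X, l) + noise * np.eye(len(X))
    return rbf(Xs, X, l) @ np.linalg.solve(K, y)

# Two-level linear autoregressive fusion, Z_H(x) = rho * Z_L(x) + delta(x),
# with a nested design (high-fidelity inputs are a subset of low-fidelity ones).
f_L = lambda x: 0.5 * np.sin(8 * x) + 0.1   # cheap, biased model (hypothetical)
f_H = lambda x: np.sin(8 * x)               # accurate, expensive model (hypothetical)
X_L = np.linspace(0, 1, 40)[:, None]
X_H = X_L[::5]                              # 8 nested high-fidelity points
y_L, y_H = f_L(X_L[:, 0]), f_H(X_H[:, 0])

mu_L_at_H = gp_fit_predict(X_L, y_L, X_H)   # low-fidelity GP at the high-fi design
rho = (mu_L_at_H @ y_H) / (mu_L_at_H @ mu_L_at_H)   # least-squares scale estimate
delta = y_H - rho * mu_L_at_H               # observed discrepancy at X_H
Xs = np.linspace(0, 1, 100)[:, None]
# Recursive predictive mean: mu_H(x) = rho * mu_L(x) + mu_delta(x)
mu_H = rho * gp_fit_predict(X_L, y_L, Xs) + gp_fit_predict(X_H, delta, Xs)
```

Eight high-fidelity samples suffice here because the low-fidelity model already captures the oscillatory structure and only the scale and offset must be corrected.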

2.2.2. Nonlinear Information Fusion

To address these limitations, Perdikaris et al. [16] introduced a nonlinear autoregressive multi-fidelity Gaussian process model, also referred to as a nonlinear autoregressive Gaussian process (NARGP). In this framework, the relationship between fidelity levels is expressed as
$$Z_t(x) = g_t\big(x, Z_{t-1}(x)\big) + \delta_t(x),$$
where g t ( · , · ) is an unknown nonlinear function modeled as a Gaussian process and δ t ( x ) is an independent Gaussian process capturing residual discrepancies.
By augmenting the input space with predictions from the lower-fidelity model, this formulation allows flexible nonlinear correlations between fidelity levels to be learned from data. The resulting surrogate can be interpreted as a deep Gaussian process with a restricted architecture [20,21]. Since the posterior distribution is no longer Gaussian for $t \geq 2$, predictive moments are typically approximated using Monte Carlo sampling.
Nonlinear multi-fidelity Gaussian process models have demonstrated improved robustness and predictive accuracy in a wide range of applications, particularly when linear correlations between fidelity levels are weak or input-dependent. In the present work, this nonlinear information fusion framework serves as a fundamental building block for the proposed RMFGP methodology.
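A minimal sketch of the input-augmentation idea for two fidelity levels follows: the level-2 GP is trained on the augmented input $(x, \mu_L(x))$ so it can learn a nonlinear cross-fidelity map. The toy functions, fixed lengthscales, and the use of the low-fidelity posterior mean instead of Monte Carlo samples are simplifications of ours.

```python
import numpy as np

def rbf(A, B, l):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / l**2)

def gp_predict(X, y, Xs, l=0.3, noise=1e-6):
    K = rbf(X, X, l) + noise * np.eye(len(X))
    return rbf(Xs, X, l) @ np.linalg.solve(K, y)

# Z_H(x) = g(x, Z_L(x)) + delta(x): the high-fidelity GP sees the augmented
# input (x, mu_L(x)) and learns the nonlinear fidelity relationship from data.
f_L = lambda x: np.sin(8 * x)                  # cheap model (hypothetical)
f_H = lambda x: f_L(x) ** 2 + 0.2 * x          # nonlinearly related truth (hypothetical)
X_L = np.linspace(0, 1, 50)[:, None]
X_H = X_L[::3]                                 # 17 nested high-fidelity points
y_L, y_H = f_L(X_L[:, 0]), f_H(X_H[:, 0])

mu_L_H = gp_predict(X_L, y_L, X_H)             # level-1 posterior mean at X_H
aug_H = np.hstack([X_H, mu_L_H[:, None]])      # augmented training inputs
Xs = np.linspace(0, 1, 100)[:, None]
aug_s = np.hstack([Xs, gp_predict(X_L, y_L, Xs)[:, None]])
mu_H = gp_predict(aug_H, y_H, aug_s, l=0.5)    # level-2 GP in augmented space
```

Note that the high-fidelity response here is a quadratic transformation of the low-fidelity one, which a linear autoregressive model with a single scalar $\rho$ cannot capture.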

2.3. Supervised Dimension Reduction Methods

Supervised dimension reduction aims to identify low-dimensional structures in the input space that preserve the dependence between inputs and outputs. Let $X : \Omega \to \mathbb{R}^p$ be a random input vector and let $Y : \Omega \to \mathbb{R}$ denote a scalar response. A central concept in supervised dimension reduction is the notion of a sufficient dimension reduction (SDR) subspace.
Definition 1
(Sufficient dimension reduction). A subspace $\mathcal{S} \subseteq \mathbb{R}^p$ is called a sufficient dimension reduction subspace for $Y$ given $X$ if
$$Y \perp\!\!\!\perp X \mid B^T X,$$
where the columns of $B \in \mathbb{R}^{p \times d}$ form a basis of $\mathcal{S}$ and $d < p$.
Among all SDR subspaces, the smallest one in the sense of set inclusion is referred to as the central subspace, denoted by $\mathcal{S}_{Y|X}$ [22]. Once an estimate of the central subspace is obtained, the original high-dimensional regression problem can be approximated by
$$Y = f(X) \approx g(B^T X),$$
where $g : \mathbb{R}^d \to \mathbb{R}$ is an unknown low-dimensional function.
A variety of classical methods have been developed to estimate the central subspace, including sliced inverse regression (SIR) [3] and sliced average variance estimation (SAVE) [4]. These methods rely on discretizing the range of the response variable and estimating conditional moments of standardized inputs.

2.3.1. Sliced Inverse Regression

Sliced inverse regression (SIR) estimates the central subspace by exploiting variations in the conditional mean $\mathbb{E}[X \mid Y]$. After standardizing the input variables, the response $Y$ is partitioned into a finite number of slices, and the covariance of the slice-wise conditional means is used to construct an eigenvalue problem whose leading eigenvectors provide an estimate of the SDR directions.
The computational procedure of SIR adopted in this work is summarized in Algorithm 1. While computationally efficient, SIR may fail to detect symmetric or higher-order dependencies between the inputs and the response.
Algorithm 1 Sliced Inverse Regression (SIR)
1: Compute the sample mean and sample covariance, $\hat{\mu} = E_n(X)$ and $\hat{\Sigma} = \mathrm{var}_n(X)$, and form the standardized random vectors $Z_i = \hat{\Sigma}^{-1/2}(X_i - \hat{\mu})$, $i = 1, \dots, n$.
2: Discretize $Y$ as $\hat{Y} = \sum_{h=1}^{H} h\, I(Y \in J_h)$, where the intervals $\{J_1, \dots, J_H\}$ form a partition of the range of $Y$.
3: Approximate $\mathbb{E}[Z \mid Y \in J_h]$ by $E_n(Z \mid Y \in J_h) = E_n[Z\, I(Y \in J_h)] \,/\, E_n[I(Y \in J_h)]$, $h = 1, \dots, H$.
4: Approximate $\mathrm{var}[\mathbb{E}(Z \mid \hat{Y})]$ by $M = \sum_{h=1}^{H} E_n[I(Y \in J_h)]\, E_n(Z \mid Y \in J_h)\, E_n(Z \mid Y \in J_h)^T$.
5: Let $\hat{v}_1, \dots, \hat{v}_d$ be the first $d$ eigenvectors of $M$ and set $\hat{\beta}_k = \hat{\Sigma}^{-1/2} \hat{v}_k$, $k = 1, \dots, d$. The SDR predictors for sample $X_i$ are $[\hat{\beta}_1^T(X_i - \hat{\mu}), \dots, \hat{\beta}_d^T(X_i - \hat{\mu})]$.
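The SIR procedure condenses to a few lines of NumPy. The sketch below uses equal-count slices and a synthetic test problem of our own choosing; it is an illustration, not the paper's implementation.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=2):
    """SIR sketch: eigenvectors of the between-slice covariance of slice means."""
    n, p = X.shape
    mu = X.mean(0)
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    Sroot_inv = V @ np.diag(w ** -0.5) @ V.T       # Sigma^{-1/2}
    Z = (X - mu) @ Sroot_inv                       # standardized inputs
    order = np.argsort(y)                          # equal-count slices of Y
    M = np.zeros((p, p))
    for chunk in np.array_split(order, n_slices):
        m = Z[chunk].mean(0)                       # slice-wise conditional mean
        M += len(chunk) / n * np.outer(m, m)
    _, vecs = np.linalg.eigh(M)
    beta = Sroot_inv @ vecs[:, ::-1][:, :d]        # leading directions
    return beta / np.linalg.norm(beta, axis=0)

# Usage: y depends on x only through the direction (1, 1, 0, 0, 0)
rng = np.random.default_rng(2)
X = rng.standard_normal((2000, 5))
y = (X[:, 0] + X[:, 1]) ** 3 + 0.01 * rng.standard_normal(2000)
b = sir_directions(X, y, d=1)[:, 0]
```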

2.3.2. Sliced Average Variance Estimation

Sliced average variance estimation (SAVE) extends SIR by incorporating second-order conditional information through the conditional variance $\mathrm{Var}(X \mid Y)$. Following [23], the SAVE matrix is constructed as
$$M_{\mathrm{SAVE}} = \frac{1}{H} \sum_{h=1}^{H} \big( I_p - \widehat{\mathrm{Var}}(Z \mid Y \in J_h) \big)^2,$$
where $Z$ denotes the standardized input, $\{J_h\}_{h=1}^{H}$ is a partition of the response space, and $I_p$ is the identity matrix.
The full computational steps of SAVE used throughout this paper are summarized in Algorithm 2. Compared to SIR, SAVE is capable of detecting a broader class of dependencies, including symmetric relationships between inputs and outputs. For this reason, SAVE is adopted as the primary supervised dimension reduction tool in the proposed framework, although other SDR methods could also be incorporated.
Algorithm 2 Sliced Average Variance Estimation (SAVE)
1: Standardize $X_1, \dots, X_n$ to obtain $Z_i$ as in Algorithm 1.
2: Discretize $Y$ as $\hat{Y} = \sum_{h=1}^{H} h\, I(Y \in J_h)$, where the intervals $\{J_1, \dots, J_H\}$ form a partition of the range of $Y$.
3: For each slice $J_h$, compute the sample conditional variance of $Z$ given $Y \in J_h$: $\mathrm{var}_n(Z \mid \hat{Y} = h) = E_n[Z Z^T I(\hat{Y} = h)] \,/\, E_n[I(\hat{Y} = h)]$.
4: Compute the sample version of $M$: $M = H^{-1} \sum_{h=1}^{H} E_n[I(\hat{Y} = h)]\, \big[ I_p - \mathrm{var}_n(Z \mid \hat{Y} = h) \big]^2$.
5: Let $\hat{v}_1, \dots, \hat{v}_d$ be the first $d$ eigenvectors of $M$ and set $\hat{\beta}_k = \hat{\Sigma}^{-1/2} \hat{v}_k$, $k = 1, \dots, d$. The SDR predictors for sample $X_i$ are $[\hat{\beta}_1^T(X_i - \hat{\mu}), \dots, \hat{\beta}_d^T(X_i - \hat{\mu})]$, where $\hat{\mu} = E_n(X)$.
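A companion sketch of SAVE illustrates the detection of a symmetric link, which is precisely the case where SIR fails because the slice means vanish. The function name and the synthetic problem are ours.

```python
import numpy as np

def save_directions(X, y, n_slices=10, d=2):
    """SAVE sketch: eigenvectors of mean_h w_h (I - Var(Z | slice h))^2."""
    n, p = X.shape
    mu = X.mean(0)
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    Sroot_inv = V @ np.diag(w ** -0.5) @ V.T       # Sigma^{-1/2}
    Z = (X - mu) @ Sroot_inv
    order = np.argsort(y)                          # equal-count slices of Y
    M = np.zeros((p, p))
    for chunk in np.array_split(order, n_slices):
        C = np.eye(p) - np.cov(Z[chunk], rowvar=False)
        M += len(chunk) / n * C @ C                # slice-weighted (I - Var)^2
    _, vecs = np.linalg.eigh(M)
    beta = Sroot_inv @ vecs[:, ::-1][:, :d]        # leading directions
    return beta / np.linalg.norm(beta, axis=0)

# SAVE detects the symmetric link y = x1^2, for which the SIR slice means are ~0
rng = np.random.default_rng(3)
X = rng.standard_normal((4000, 5))
y = X[:, 0] ** 2 + 0.01 * rng.standard_normal(4000)
b = save_directions(X, y, d=1)[:, 0]
```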

2.3.3. Selection of Reduced Dimension

A key challenge in supervised dimension reduction is determining the intrinsic dimension $d$ of the central subspace. In this work, the reduced dimension is selected using the Bayesian information criterion (BIC) proposed in [24]. Let $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_p$ denote the eigenvalues of the SAVE-based matrix $M_{\mathrm{SAVE}} + I_p$. The BIC score is defined as
$$G(k) = \frac{n}{2} \sum_{l=k+1}^{p} \big( \log \lambda_l + 1 - \lambda_l \big) - C_n\, \frac{k(2p - k + 1)}{2},$$
where $C_n$ is a sequence satisfying the conditions specified in [24]. The estimated reduced dimension is then given by
$$d = \operatorname*{arg\,max}_{1 \leq k \leq p-1} G(k).$$
In the present work, Algorithms 1 and 2 are applied either directly to observed data or to surrogate-generated samples, depending on the stage of the proposed RMFGP framework. While effective in classical settings, the direct application of SAVE and BIC can be unreliable when the number of accurate observations is small. This limitation motivates the surrogate-assisted dimension reduction strategy developed in the subsequent sections.
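The score $G(k)$ can be evaluated directly from the spectrum. In the sketch below, $C_n = \log n$ is an assumed penalty sequence (any admissible choice from [24] would do), and the eigenvalues are synthetic: two above one (signal) and three near one (noise).

```python
import numpy as np

def bic_dimension(eigvals, n, C_n=None):
    """BIC-type selection of d from the spectrum of M_SAVE + I_p (a sketch).

    Maximizes G(k) = n/2 * sum_{l=k+1}^p (log lam_l + 1 - lam_l)
                     - C_n * k * (2p - k + 1) / 2  over k = 1, ..., p-1."""
    lam = np.sort(np.asarray(eigvals))[::-1]      # descending eigenvalues
    p = len(lam)
    C_n = np.log(n) if C_n is None else C_n       # assumed choice of penalty
    G = [n / 2 * np.sum(np.log(lam[k:]) + 1 - lam[k:])
         - C_n * k * (2 * p - k + 1) / 2
         for k in range(1, p)]
    return int(np.argmax(G)) + 1

# Two eigenvalues well above 1 indicate a two-dimensional central subspace
eig = [3.0, 2.5, 1.01, 1.005, 1.002]
d = bic_dimension(eig, n=500)
```

Since $\log \lambda + 1 - \lambda \leq 0$ with equality at $\lambda = 1$, trailing noise eigenvalues contribute almost nothing to the tail sum, so the maximizer separates signal from noise.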

2.4. Bayesian Active Learning

Bayesian active learning, also referred to as sequential experimental design, aims to improve model accuracy by adaptively selecting new data points that are expected to provide the greatest information gain [25]. In problems where accurate observations are expensive or time-consuming to obtain, active learning plays a crucial role in achieving reliable inference with limited computational resources.
In the context of this work, observations are available from multiple fidelity levels. Let
$$D_L = \{(x_i^L, y_i^L)\}_{i=1}^{N_L}, \qquad D_H = \{(x_i^H, y_i^H)\}_{i=1}^{N_H}$$
denote the low-fidelity and high-fidelity datasets, respectively, where typically $N_L \gg N_H$. A multi-fidelity Gaussian process surrogate is constructed using the combined dataset $D = \{D_L, D_H\}$, as described in Section 2.2.
The objective of active learning in this setting is to enrich the high-fidelity dataset $D_H$ by selectively querying new high-fidelity observations from a predefined candidate pool. In this work, the candidate pool is chosen to be the set of low-fidelity input locations $\{x_i^L\}_{i=1}^{N_L}$, which are inexpensive to evaluate and already available. At each iteration, a new high-fidelity sample location is selected by maximizing an acquisition function,
$$x_{N_H+1}^H = \operatorname*{arg\,max}_{x \in \{x_i^L\}_{i=1}^{N_L}} a(x).$$
Among the various acquisition functions proposed in the literature, such as expected improvement or probability of improvement [26,27,28], we adopt the predictive variance of the multi-fidelity Gaussian process surrogate,
$$a(x) = \sigma^2(x),$$
where $\sigma^2(x)$ denotes the posterior predictive variance at input $x$. This choice is motivated by the fact that predictive variance naturally quantifies epistemic uncertainty in Gaussian process models and directly reflects regions of the input space where additional high-fidelity information is most beneficial.
Once the new high-fidelity observation $(x_{N_H+1}^H, y_{N_H+1}^H)$ is obtained, it is added to the training dataset, and the multi-fidelity surrogate is retrained. This process is repeated until a stopping criterion is satisfied, such as reaching a prescribed prediction error threshold on a validation set or exhausting a predefined computational budget.
In the proposed framework, Bayesian active learning serves two complementary purposes. First, it improves the predictive accuracy of the multi-fidelity surrogate model with minimal additional high-fidelity evaluations. Second, by enhancing surrogate quality in regions of high uncertainty, it indirectly improves the reliability of surrogate-assisted supervised dimension reduction procedures employed in subsequent sections.
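The variance-driven loop can be sketched as follows. Here a single-fidelity GP stands in for the multi-fidelity surrogate, the candidate pool mimics the low-fidelity design, and a fixed iteration budget replaces a validation-based stopping rule; all of these are simplifying assumptions of ours.

```python
import numpy as np

def rbf(A, B, l=0.3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / l**2)

def gp_mean_var(X, y, Xs, noise=1e-6, l=0.3):
    K = rbf(X, X, l) + noise * np.eye(len(X))
    Ks = rbf(Xs, X, l)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mu, var

f_H = lambda x: np.sin(6 * x)                 # stand-in for the expensive model
pool = np.linspace(0, 1, 200)[:, None]        # candidate pool (the x_i^L)
idx = [0, 199]                                # initial high-fidelity design
for _ in range(8):
    X_H = pool[idx]
    y_H = f_H(X_H[:, 0])
    _, var = gp_mean_var(X_H, y_H, pool)
    var[idx] = -np.inf                        # exclude points already queried
    idx.append(int(np.argmax(var)))           # a(x) = sigma^2(x)
mu, _ = gp_mean_var(pool[idx], f_H(pool[idx][:, 0]), pool)
```

Because the predictive variance of a stationary GP peaks in the largest gaps of the current design, the selected points spread over the domain without any explicit space-filling criterion.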

3. Rotated Multi-Fidelity Gaussian Process Framework

3.1. Problem Setting

We consider a supervised regression problem in which a scalar quantity of interest depends on a high-dimensional input. Let
$$f : \mathbb{R}^p \to \mathbb{R}$$
denote an unknown high-fidelity response function, where the input dimension $p$ may be large. We assume that evaluations of $f$ are expensive and that only a limited number of accurate observations are available.
In addition to the high-fidelity model, we assume access to one or more low-fidelity models that provide approximate evaluations of the response at substantially lower computational cost. For clarity of exposition, we focus on the case of two fidelity levels, although the proposed framework can be extended naturally to more general multi-fidelity settings. Let
$$D_H = \{(x_i^H, y_i^H)\}_{i=1}^{N_H}, \qquad D_L = \{(x_i^L, y_i^L)\}_{i=1}^{N_L}$$
denote the high-fidelity and low-fidelity datasets, respectively, with $N_H \ll N_L$ in typical applications.
A central assumption of this work is that the high-fidelity response admits an approximate low-dimensional structure. Specifically, we assume that there exists a matrix $A \in \mathbb{R}^{d \times p}$ with $d \ll p$ such that
$$f(x) \approx g(Ax),$$
where $g : \mathbb{R}^d \to \mathbb{R}$ is an unknown function. The row space of $A$ defines the central sufficient dimension reduction subspace introduced in Section 2.3. Identifying this subspace is critical both for reducing the effective dimensionality of the problem and for constructing accurate surrogate models. Note that throughout this work, we use $A$ to denote rotation matrices and $M$ to denote final reduction matrices, while $B$ in Section 2 refers generically to SDR directions.
The primary challenge addressed in this work arises from the severe scarcity of high-fidelity data. Classical supervised dimension reduction techniques, such as SIR or SAVE, rely on an accurate estimation of conditional moments and typically require a sufficient number of high-fidelity observations. When N H is small, a direct application of these methods becomes unreliable. At the same time, although low-fidelity data are abundant, they may contain bias or missing information and cannot be used directly to infer the central subspace of the high-fidelity response.
The goal of this work is therefore twofold. First, we aim to develop a reliable strategy for identifying the central dimension reduction subspace of the high-fidelity model by exploiting both low- and high-fidelity data. Second, we seek to construct an accurate surrogate model for the high-fidelity response that can be used for prediction and uncertainty quantification with minimal reliance on expensive model evaluations.
To achieve these objectives, we introduce a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework that tightly integrates multi-fidelity Gaussian process modeling, supervised dimension reduction, and Bayesian active learning. The subsequent subsections describe the construction of the RMFGP model and the associated dimension reduction and sampling strategies in detail.

3.2. Input Rotation via Supervised Dimension Reduction

The first step of the proposed RMFGP framework is to identify informative directions in the input space and use them to define a rotated coordinate system. At this stage, the objective is not to perform aggressive dimension reduction, but rather to improve the representation of the input space so that subsequent multi-fidelity surrogate modeling can be carried out more effectively.
Since low-fidelity data are typically abundant and inexpensive to obtain, we begin by applying supervised dimension reduction to the low-fidelity dataset. Specifically, the sliced average variance estimation (SAVE) method introduced in Section 2.3 is applied to the low-fidelity observations $D_L = \{(x_i^L, y_i^L)\}_{i=1}^{N_L}$. The computational procedure follows Algorithm 2, resulting in an estimate of dominant directions that characterize the dependence between low-fidelity inputs and outputs.
Let $A_L \in \mathbb{R}^{p \times p}$ denote the orthogonal matrix whose columns are the eigenvectors obtained from the SAVE matrix constructed using the low-fidelity data, ordered according to decreasing eigenvalues. This matrix defines a rotation of the original input space,
$$\tilde{x} = A_L^T x,$$
where $\tilde{x}$ represents the rotated input coordinates. All available data, including low-fidelity, high-fidelity, and validation inputs, are transformed using the same rotation.
It is important to emphasize that this step performs a rotation rather than a truncation of the input space. All p dimensions are retained, and no information is discarded at this stage. The purpose of the rotation is to align the coordinate system with directions that are informative for the low-fidelity response, thereby improving the conditioning of subsequent multi-fidelity Gaussian process models.
Although the low-fidelity model may contain bias or missing physics, its response often captures coarse structural information about the underlying input–output relationship. By exploiting this information through SAVE-based rotation, the input space is reorganized so that informative directions are concentrated in the leading coordinates, while less influential directions are pushed toward higher-indexed components. This reorganization facilitates more effective information fusion across fidelity levels in later stages of the RMFGP framework.
The rotated input representation serves as the foundation for constructing the nonlinear multi-fidelity Gaussian process surrogate described in the next subsection. As will be shown, this initial rotation can be further refined using surrogate-generated predictions, leading to improved estimation of the central subspace associated with the high-fidelity response.
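The rotation step can be sketched as follows: the full eigenvector matrix of the low-fidelity SAVE matrix rotates the inputs while retaining all $p$ coordinates. The sketch standardizes internally and, as an assumption of ours, applies the resulting eigenvectors to the raw inputs (appropriate when the input covariance is close to the identity); the test problem is synthetic.

```python
import numpy as np

def save_rotation(X, y, n_slices=10):
    """Full p x p rotation: eigenvectors of the SAVE matrix, no truncation."""
    n, p = X.shape
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    Z = (X - X.mean(0)) @ (V @ np.diag(w ** -0.5) @ V.T)   # standardize
    M = np.zeros((p, p))
    for chunk in np.array_split(np.argsort(y), n_slices):
        C = np.eye(p) - np.cov(Z[chunk], rowvar=False)
        M += len(chunk) / n * C @ C
    return np.linalg.eigh(M)[1][:, ::-1]           # descending eigenvalue order

# Rotate low-fidelity inputs; the informative direction (x3 here) moves to the
# leading coordinate while the other directions are retained, not discarded.
rng = np.random.default_rng(4)
X_L = rng.standard_normal((3000, 6))
y_L = np.sin(X_L[:, 2]) ** 2 + 0.05 * rng.standard_normal(3000)
A_L = save_rotation(X_L, y_L)
X_rot = X_L @ A_L                                  # x_tilde = A_L^T x, row-wise
```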

3.3. Rotated Multi-Fidelity Gaussian Process Construction

After rotating the input space as described in Section 3.2, we construct a nonlinear multi-fidelity Gaussian process surrogate in the rotated coordinates. This surrogate serves as the core modeling component of the proposed RMFGP framework and provides the probabilistic foundation for subsequent dimension refinement and active learning.
Let $\tilde{x} = A_L^T x$ denote the rotated input associated with $x \in \mathbb{R}^p$. The low-fidelity and high-fidelity datasets in the rotated space are given by
$$\tilde{D}_L = \{(\tilde{x}_i^L, y_i^L)\}_{i=1}^{N_L}, \qquad \tilde{D}_H = \{(\tilde{x}_i^H, y_i^H)\}_{i=1}^{N_H}.$$
We adopt the nonlinear autoregressive multi-fidelity Gaussian process (NARGP) formulation introduced in [16] to fuse information across fidelity levels. In the rotated space, the high-fidelity response is modeled as
$$Z_H(\tilde{x}) = g\big(\tilde{x}, Z_L(\tilde{x})\big) + \delta(\tilde{x}),$$
where $Z_L(\tilde{x})$ denotes the low-fidelity Gaussian process surrogate, $g(\cdot, \cdot)$ is an unknown nonlinear function modeled as a Gaussian process, and $\delta(\tilde{x})$ is an independent Gaussian process capturing residual discrepancies between fidelity levels.
The low-fidelity surrogate Z L ( x ˜ ) is first trained using D ˜ L by standard Gaussian process regression. Predictions from this surrogate are then used as additional inputs to train the high-fidelity model Z H ( x ˜ ) using D ˜ H . This construction allows the multi-fidelity surrogate to learn complex, input-dependent correlations between fidelity levels while preserving a coherent probabilistic interpretation.
We emphasize that the defining feature of the RMFGP model is the coupling between input rotation and multi-fidelity Gaussian process regression. Unlike standard NARGP models constructed in the original input space, the RMFGP surrogate operates in a rotated coordinate system informed by supervised dimension reduction. This coupling improves the alignment between informative input directions and the covariance structure of the Gaussian process, leading to enhanced predictive performance and numerical stability in high-dimensional settings.
Since the posterior distribution of the high-fidelity surrogate is no longer Gaussian under the nonlinear autoregressive formulation, predictive moments are approximated using Monte Carlo sampling. In particular, for a given input $\tilde{x}$, samples of $Z_L(\tilde{x})$ are first drawn from the low-fidelity posterior and then propagated through the Gaussian process $g(\cdot, \cdot)$ to generate samples of $Z_H(\tilde{x})$. The predictive mean and variance are subsequently estimated from these samples.
The resulting RMFGP surrogate provides both point predictions and uncertainty quantification for the high-fidelity response. These uncertainty estimates play a crucial role in guiding Bayesian active learning and in enabling surrogate-assisted supervised dimension reduction, as described in the following subsections.

3.4. Iterative Refinement and Active Learning

The rotated multi-fidelity Gaussian process constructed in Section 3.3 provides an initial probabilistic surrogate for the high-fidelity response in the rotated input space. However, because the initial rotation is obtained solely from low-fidelity data, it may not accurately reflect the central subspace associated with the high-fidelity model. To address this limitation, the RMFGP framework employs an iterative refinement strategy that combines surrogate-assisted supervised dimension reduction with Bayesian active learning.

3.4.1. Surrogate-Assisted Refinement of Rotation

Once the RMFGP surrogate is trained, it can be used to generate additional response samples at arbitrary input locations. In particular, surrogate predictions are evaluated at the low-fidelity input locations { x i L } i = 1 N L , yielding a set of surrogate-generated high-fidelity responses
D̂_H = {(x_i^L, ŷ_i^H)}_{i=1}^{N_L} .
These surrogate-generated samples are inexpensive to obtain and provide an approximation of the input–output relationship of the high-fidelity model across the input domain.
The SAVE algorithm (Algorithm 2) is then reapplied using the surrogate-generated dataset D ^ H to estimate an updated set of dimension reduction directions. Let A ( k ) denote the rotation matrix obtained at iteration k. The input space is subsequently rotated according to
x̃^(k) = (A^(k))^T x ,
and the RMFGP surrogate is retrained in the updated rotated coordinates.
This surrogate-assisted refinement procedure allows information from the limited high-fidelity data to be propagated throughout the input space via the multi-fidelity surrogate, enabling more reliable estimation of the central subspace than would be possible using the high-fidelity observations alone. In practice, only a small number of refinement iterations is required before the rotation stabilizes.

3.4.2. Integration of Bayesian Active Learning

To further enhance surrogate accuracy and refinement reliability, Bayesian active learning is incorporated within the iterative framework. At each iteration, the predictive uncertainty of the RMFGP surrogate is evaluated over the candidate pool of low-fidelity input locations. New high-fidelity samples are selected by maximizing the predictive variance, as described in Section 2.4, and added to the high-fidelity dataset.
The inclusion of actively selected high-fidelity samples serves two complementary purposes. First, it improves the local accuracy of the RMFGP surrogate in regions of high uncertainty. Second, it enhances the quality of surrogate-generated samples used for supervised dimension reduction, leading to more accurate estimation of rotation directions in subsequent iterations.
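The variance-maximization rule itself is simple to sketch. The snippet below, a minimal stand-in assuming a zero-mean RBF Gaussian process with fixed hyperparameters, scores a candidate pool by predictive variance and picks the next high-fidelity sample:

```python
import numpy as np

def gp_var(Xtr, Xcand, ls=0.3, noise=1e-6):
    # predictive variance of a zero-mean RBF GP; depends only on input locations
    k = lambda A, B: np.exp(-0.5 * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / ls ** 2)
    K = k(Xtr, Xtr) + noise * np.eye(len(Xtr))
    Ks = k(Xcand, Xtr)
    return 1.0 - np.einsum('ij,ij->i', Ks, np.linalg.solve(K, Ks.T).T)

rng = np.random.default_rng(2)
X_H = rng.uniform(size=(5, 2))      # current high-fidelity inputs
cand = rng.uniform(size=(200, 2))   # candidate pool (e.g., low-fidelity locations)

v = gp_var(X_H, cand)
x_new = cand[np.argmax(v)]          # next high-fidelity sample: largest variance
```

The variance is near zero at existing training inputs and grows with distance from them, so the rule naturally targets unexplored regions.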
The combined process of surrogate-assisted rotation refinement and Bayesian active learning results in a feedback loop in which improved surrogates lead to improved dimension reduction, which in turn leads to improved surrogate construction. This coupling distinguishes the RMFGP framework from conventional multi-fidelity Gaussian process models and classical supervised dimension reduction methods, which typically treat surrogate modeling and dimension reduction as independent tasks.
The iterative refinement procedure is terminated when a prescribed stopping criterion is satisfied, such as stabilization of the estimated dimension reduction directions, convergence of surrogate prediction error, or exhaustion of the available computational budget.

3.5. Two-Stage Dimension Reduction Strategy

The iterative refinement procedure described in Section 3.4 yields a sequence of rotation matrices that progressively align the input coordinates with directions that are informative for the high-fidelity response. While this rotation significantly improves surrogate modeling and information fusion, it does not by itself reduce the dimensionality of the input space. In many applications, however, constructing a reduced-order surrogate model is desirable for efficiency, interpretability, and scalability.
A direct approach to supervised dimension reduction would be to estimate a projection matrix A R d × p by optimizing a Gaussian process kernel or likelihood function in the original high-dimensional space. When the number of high-fidelity samples is limited, such an approach is often ill-posed due to the large number of parameters involved. To address this challenge, the RMFGP framework adopts a two-stage dimension reduction strategy that decouples the identification of informative directions from the final reduction of dimensionality.

3.5.1. Stage I: RMFGP-Based Rotation

In the first stage, the RMFGP framework is used to identify and refine a rotation of the input space through surrogate-assisted SAVE and active learning, as described in Section 3.2, Section 3.3 and Section 3.4. Let A^(∗) ∈ ℝ^{p×p} denote the final rotation matrix obtained upon convergence of the iterative refinement procedure. This rotation concentrates the dominant input–output dependence into the leading coordinates of the transformed input
x̃ = (A^(∗))^T x ,
while retaining the full dimensionality of the problem.
The primary role of this stage is to reorganize the input space so that the effective dimensionality of the response is reduced, even though all p coordinates are preserved. By doing so, the subsequent dimension reduction step can be carried out in a significantly better-conditioned coordinate system.

3.5.2. Stage II: Reduced-Order Surrogate Construction

In the second stage, a reduced-order surrogate model is constructed by selecting a subset of the rotated coordinates. The intrinsic dimension d of the reduced representation is determined using the Bayesian information criterion described in Section 2.3, applied to the SAVE-based matrix computed from surrogate-assisted data. Once d is selected, the reduced input is defined as
x̃_r = (x̃₁, …, x̃_d)^T ,
corresponding to the leading d components of the rotated input vector.
A Gaussian process surrogate is then trained using x ˜ r as input. In this reduced space, kernel hyperparameter optimization becomes well-posed due to the substantially lower dimensionality. This allows for the construction of an accurate and efficient reduced-order surrogate for the high-fidelity response.
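In code, the two stages amount to one matrix rotation followed by a column truncation. The orthogonal matrix Q below is a hypothetical stand-in for the converged rotation matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
p, d = 6, 2

# hypothetical converged rotation: a random orthogonal matrix in place of A^(*)
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))
X = rng.uniform(size=(100, p))

X_rot = X @ Q          # Stage I: full-dimensional rotation, x~ = A^T x (per row)
X_red = X_rot[:, :d]   # Stage II: keep only the d leading rotated coordinates
# a standard GP surrogate would then be trained on (X_red, y)
```

Because the rotation is orthogonal, Stage I preserves all information; only the truncation in Stage II discards (ideally uninformative) directions.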

3.5.3. Algorithmic Summary

The proposed two-stage strategy offers several advantages. First, it avoids the direct optimization of high-dimensional projection operators using limited high-fidelity data. Second, it exploits the strengths of multi-fidelity modeling to guide dimension reduction in a data-efficient manner. Finally, it provides flexibility to the user: depending on application requirements, one may either use the rotated full-dimensional surrogate for maximum accuracy or employ the reduced-order surrogate for improved computational efficiency.
The complete RMFGP procedure, including input rotation via supervised dimension reduction, nonlinear multi-fidelity Gaussian process construction, surrogate-assisted refinement, Bayesian active learning, and the two-stage dimension reduction strategy described above, is summarized in Algorithm 3. This algorithm serves as a practical implementation of the RMFGP framework and highlights the interaction between surrogate modeling, dimension reduction, and adaptive sampling in high-dimensional, data-scarce multi-fidelity settings.
Algorithm 3 Rotated multi-fidelity Gaussian process model (RMFGP)
1:
Input: low-fidelity data set { X L , y L } , high-fidelity data set { X H , y H } , validation data set { X T , y T } , threshold ξ , maximum iteration number I, and reduction parameter flag = 0 or 1.
2:
Apply the SAVE method (Algorithm 2) to the low-fidelity data only to compute the first rotation matrix A^T, extracting the principal direction information contained in the low-fidelity data. Then apply A^T to all X = ( X L , X H , X T ) .
3:
Perform NARGP on the rotated training data { X ^ L , y L } and { X ^ H , y H } to obtain the prediction y ^ T at X ^ T . Then apply the SAVE method again to { X ^ T , y ^ T } to compute the rotation matrix A ^ , and apply A ^ to all Xs.
4:
Check whether the generalization-error threshold is met. If not, perform the Bayesian active learning method to locate the x where the prediction variance attains its maximum, sample { x , y H } , and add it to the high-fidelity training set.
5:
Repeat steps 2 and 3 until the generalization-error threshold is fulfilled or the maximum iteration number is reached.
6:
Compute the rotation matrix M_1 = ∏_i A_i^T Â_i, the composition of the rotations obtained over the iterations, and build the rotated model.
7:
If the dimension reduction parameter flag = 1, compute the intrinsic dimension d through BIC and form the reduction matrix M ^ 1 consisting of the first s principal columns of M 1 , where d < s < p .
8:
Apply the Gaussian process dimension reduction technique to compute the s × d reduction matrix M 2 . Then build the reduced model with the final reduction matrix M = M ^ 1 M 2 .

4. Numerical Results

In this section, we present four numerical examples to demonstrate the effectiveness of the proposed RMFGP framework in high-dimensional, data-scarce settings. We first assess the accuracy of the estimated rotation or reduction matrix using the subspace distance metric introduced in [24],
m(A, Â) = ‖P − P̂‖_F ,
where A and Â denote the true and estimated central subspace matrices, respectively; P and P̂ are the corresponding projection matrices; and ‖·‖_F denotes the Frobenius norm. To highlight the benefit of the proposed approach, we compare the reduction matrix computed by RMFGP (with flag = 1 ) against the GP-SAVE method introduced in [13], using the same number of high-fidelity samples.
Once the rotation or reduction matrix is obtained, new training and test datasets are generated to construct and evaluate Gaussian process surrogate models. To ensure a fair comparison, two cases are considered depending on the reduction parameter flag . When flag = 0 , the proposed RMFGP model and standard Gaussian process regression (GPR) are trained on the rotated and original input spaces, respectively. When flag = 1 , the reduced-order surrogate constructed using RMFGP is compared against GP-SAVE, where the same reduced dimension is used.
Model accuracy is quantified using the relative error
e = ‖u − û‖₂ / ‖u‖₂ ,
where u and û denote the exact and predicted high-fidelity responses evaluated on the test set.
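Both evaluation metrics are straightforward to implement; the sketch below follows the definitions above, with toy matrices used only to illustrate the behavior:

```python
import numpy as np

def subspace_distance(A, A_hat):
    # m(A, A_hat) = ||P - P_hat||_F, with P the orthogonal projector onto col(A)
    proj = lambda B: B @ np.linalg.pinv(B.T @ B) @ B.T
    return np.linalg.norm(proj(A) - proj(A_hat), ord='fro')

def relative_error(u, u_hat):
    # e = ||u - u_hat||_2 / ||u||_2
    return np.linalg.norm(u - u_hat) / np.linalg.norm(u)

# toy matrices: same column space, different bases
A = np.array([[1., 0.], [0., 1.], [0., 0.]])
A_same = np.array([[1., 1.], [1., -1.], [0., 0.]])
```

Since the distance is computed between projectors, it is invariant to the choice of basis within each subspace: A and A_same give distance zero.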

4.1. Linear Example: Poisson’s Equation

We first consider a linear multi-fidelity example in which the high-fidelity and low-fidelity models are related through additive high-dimensional perturbations. This example represents a simple setting in which the correlation between fidelity levels is predominantly linear.
Specifically, we define
f_H(x) = sin(π(x₁ + x₃)) + sin(π(x₁ + x₂)) + 2 ,
f_L(x) = f_H(x) + x₃x₄x₅x₆ ,
where x_i, i = 1, …, 6, are independent random variables uniformly distributed on [ 0 , 1 ] . Both functions can be interpreted as solutions of Poisson’s equation
−∇²f = h(x) ,
with forcing terms
h_H(x) = 2π² sin(π(x₁ + x₃)) + 2π² sin(π(x₁ + x₂)) ,
h_L(x) = h_H(x) + (1/6)(x₃³x₄x₅x₆ + x₃x₄³x₅x₆ + x₃x₄x₅³x₆ + x₃x₄x₅x₆³) .
From the analytical form of f_H, it is evident that the true intrinsic dimension is d = 2. The corresponding central subspace is spanned by β₁ = (1, 0, 1, 0, 0, 0)^T and β₂ = (1, 1, 0, 0, 0, 0)^T. The total input dimension is p = 6. In this experiment, we use N_L = 200 low-fidelity samples and N_T = 500 test points. For each case, the experiment starts from N_H − 10 high-fidelity samples, and two iterations of Bayesian active learning are performed, with five new high-fidelity samples added per iteration.
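The test functions of this example are easy to reproduce; the snippet below defines them and numerically confirms that f_H depends on the input only through the two ridge directions β₁ and β₂:

```python
import numpy as np

def f_H(x):
    # high-fidelity model of the linear example
    return (np.sin(np.pi * (x[..., 0] + x[..., 2]))
            + np.sin(np.pi * (x[..., 0] + x[..., 1])) + 2.0)

def f_L(x):
    # low-fidelity model: additive high-dimensional perturbation
    return f_H(x) + x[..., 2] * x[..., 3] * x[..., 4] * x[..., 5]

rng = np.random.default_rng(4)
x = rng.uniform(size=(1000, 6))
b1 = np.array([1., 0., 1., 0., 0., 0.])
b2 = np.array([1., 1., 0., 0., 0., 0.])
# f_H is a function of (b1^T x, b2^T x) alone, so the intrinsic dimension is d = 2
via_subspace = np.sin(np.pi * (x @ b1)) + np.sin(np.pi * (x @ b2)) + 2.0
```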
Table 1 reports the subspace distance m ( A , A ^ ) obtained with different numbers of high-fidelity samples when flag = 1 . Both RMFGP and GP-SAVE exhibit consistent improvement as N H increases. However, RMFGP achieves substantially higher accuracy across all sample sizes. This improvement can be attributed to the effective exploitation of low-fidelity information and the refinement enabled by Bayesian active learning, which together enhance the estimation of the central subspace under limited high-fidelity budgets.
Table 2 presents the values of the Bayesian information criterion G ( k ) . The criterion attains its maximum at k = 2 , correctly identifying the intrinsic dimension d ^ = 2 , consistent with the analytical structure of f H .
Table 3 summarizes the relative prediction errors on the test set. When flag = 0 , RMFGP trained on the rotated inputs consistently outperforms standard GPR trained on the original inputs. When flag = 1 , the reduced-order RMFGP surrogate also achieves lower errors than GP-SAVE. In RMFGP, the input dimension is first reduced from p = 6 to s = 3 , followed by a further reduction to d = 2 using Gaussian process-based dimension reduction. This hierarchical reduction alleviates the burden of hyperparameter optimization and allows accurate surrogate construction with fewer high-fidelity samples.
Figure 2 illustrates the mean squared error as a function of N H . In both the rotated and reduced settings, RMFGP exhibits faster convergence and lower error levels than the comparison methods. Figure 3 further confirms this observation through correlation plots at N H = 30 , where RMFGP predictions align closely with the ideal correlation line, while the comparison methods exhibit larger deviations.

4.2. Nonlinear Example

The second example demonstrates the capability of the proposed RMFGP framework to model nonlinear problems in which the low- and high-fidelity data are nonlinearly correlated, relationships that cannot be captured by linear autoregressive multi-fidelity models.
We consider the following pair of functions:
f_H(x) = exp(0.2 Σ_{i=1}^{10} x_i) ,
f_L(x) = x₄ f_H(x) ,
where x_i, i = 1, …, 10, are independent random variables uniformly distributed on [ 0 , 1 ] . From the analytical expressions of f_H and f_L, it follows that the intrinsic dimension of the low-fidelity model is two, whereas the intrinsic dimension of the high-fidelity model is one. Such a discrepancy may arise in practical applications when low-fidelity models contain additional noise or spurious dependencies that are absent in the high-fidelity response.
The original input dimension is p = 10, and the true central subspace of the high-fidelity model is spanned by the vector β₁ = (1, 1, …, 1)^T. In this experiment, we use N_L = 200 low-fidelity training samples and N_T = 500 test points. For all cases, the experiment starts from N_H − 5 initial high-fidelity samples. The Bayesian active learning scheme is then employed to add two high-fidelity samples in the first iteration and three additional samples in the second iteration, after which the stopping criterion is satisfied.
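As in the linear case, the model pair can be checked directly; the snippet below confirms that f_H is a one-dimensional ridge function of 1ᵀx, while the low-fidelity model carries the extra spurious dependence on x₄ (column index 3 with zero-based indexing):

```python
import numpy as np

def f_H(x):
    # high-fidelity model: a ridge function of 1^T x, intrinsic dimension 1
    return np.exp(0.2 * x.sum(axis=-1))

def f_L(x):
    # low-fidelity model: multiplied by x_4, adding a spurious second direction
    return x[..., 3] * f_H(x)

rng = np.random.default_rng(5)
x = rng.uniform(size=(500, 10))
b1 = np.ones(10)   # spanning vector of the true high-fidelity central subspace
```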
Table 4 reports the subspace distance m ( A , A ^ ) defined in (30) for different numbers of high-fidelity samples. Both RMFGP and standard Gaussian process regression exhibit consistent improvement as N H increases. However, RMFGP achieves significantly higher accuracy in estimating the central subspace with substantially fewer high-fidelity samples. In this example, the low-fidelity data contain complete but noisy information about the high-fidelity response, and the proposed RMFGP framework effectively exploits this structure to mitigate the impact of noise.
The Bayesian information criterion values G ( k ) are reported in Table 5. The criterion attains its maximum at k = 1 , correctly identifying the intrinsic dimension d ^ = 1 , consistent with the analytical structure of f H . This result demonstrates the ability of RMFGP to reliably identify intrinsic dimensionality even when low-fidelity data introduce additional spurious directions.
When the reduction parameter flag = 1 , the RMFGP framework first reduces the input dimension from p = 10 to s = 3 using the RMFGP-based rotation. A subsequent Gaussian process-based dimension reduction step is then applied to further reduce the dimension from s = 3 to d = 1 .
Table 6 and Figure 4 present the relative error and mean squared error as functions of N H for four different surrogate models. In all cases, RMFGP achieves lower prediction errors than the comparison methods. In particular, the reduced-order RMFGP surrogate exhibits superior accuracy in the small-sample regime, highlighting the effectiveness of the proposed framework in high-dimensional, data-scarce nonlinear settings. Figure 5 further confirms this observation through correlation plots at N H = 20 , where RMFGP predictions align closely with the ideal correlation line, while the comparison methods exhibit larger deviations.
Figure 6 shows the prediction plot at N H = 20 . The black star line is the exact prediction with x_d computed by the true dimension reduction matrix. The red circle line, obtained by RMFGP ( flag = 1 ), fits the curve well, whereas the blue square line exhibits large errors at some locations of x_d. The successful estimation of the central subspace, together with the predictions on the test set, demonstrates the ability of our method to exclude the effect of noise with a relatively small set of highly accurate data, which is valuable in many real-world applications where high-fidelity data are expensive or difficult to collect.

4.3. Advection Equation

The third example evaluates the performance of RMFGP on a stochastic partial differential equation problem adapted from [29]. Consider the one-dimensional advection equation with random input
∂u(x, t; ξ)/∂t + (a/4) (Σ_{i=1}^{5} ξ_i) ∂u(x, t; ξ)/∂x = 0 ,
with the initial condition
u(x, 0; ξ) = sin(π(x + 1)) + 1 ,
where a is a constant coefficient, x ∈ [0, 1], and ξ = (ξ₁, …, ξ₅) ∈ [0, 1]⁵ is a random vector.
Under this setting, an analytical high-fidelity solution is available and is denoted by u H ,
u_H(x, t; ξ) = sin(π(x − (a/4) t Σ_{i=1}^{5} ξ_i + 1)) + 1 .
The low-fidelity data are generated from
u_L(x, t; ξ) = sin(π(x − (a/4) t Σ_{i=3}^{5} ξ_i + 1)) + 1 .
Here, the components of ξ are i.i.d. random variables uniformly distributed on [0, 1], and the constant a is fixed at 1. Compared to the previous examples, the low-fidelity model in this setting contains missing information, since it depends only on ( ξ 3 , ξ 4 , ξ 5 ) rather than the full input vector.
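The analytical fidelity pair can be verified directly; the snippet below implements both solutions and checks the initial condition as well as the information gap between the two fidelity levels:

```python
import numpy as np

def u_H(x, t, xi, a=1.0):
    # high-fidelity analytical solution of the advection equation
    return np.sin(np.pi * (x - a / 4.0 * t * xi.sum(axis=-1) + 1.0)) + 1.0

def u_L(x, t, xi, a=1.0):
    # low-fidelity solution: the advection speed uses only (xi_3, xi_4, xi_5)
    return np.sin(np.pi * (x - a / 4.0 * t * xi[..., 2:].sum(axis=-1) + 1.0)) + 1.0

rng = np.random.default_rng(6)
xi = rng.uniform(size=(100, 5))
y_H = u_H(0.5, 1.0, xi)   # high-fidelity responses at x = 0.5, t = 1
y_L = u_L(0.5, 1.0, xi)   # biased low-fidelity responses at the same point
```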
In this experiment, the high- and low-fidelity function values are computed at x = 0.5 and t = 1. From the analytical expression of u_H, the true reduced dimension is d = 1, and the corresponding central subspace is span{β₁} with β₁ = (1, 1, 1, 1, 1)^T. Table 7 reports the BIC values G(k); the criterion attains its maximum at k = 1, indicating that the estimated intrinsic dimension is d̂ = 1.
For the numerical experiments, the number of low-fidelity training samples is set to N_L = 200, and the number of test samples is N_T = 500. For all cases with different N_H, the experiment starts from (N_H − 10) high-fidelity samples, and the Bayesian active learning procedure adds five high-fidelity samples per iteration, with two iterations performed before termination. Table 8 summarizes the accuracy of the estimated central subspace using the metric m ( A , A ^ ) . As expected, both methods improve as N_H increases; however, RMFGP ( flag = 1 ) consistently outperforms GP-SAVE across all sample sizes. In particular, when the number of high-fidelity samples is insufficient, GP-SAVE may fail to identify the central subspace accurately.
The relative errors reported in Table 9 and the MSE curves in Figure 7 further confirm this behavior. RMFGP achieves smaller prediction errors and exhibits faster convergence than the baseline GP model. The correlation plots in Figure 8 and the prediction plots in Figure 9 also demonstrate that RMFGP yields improved generalization performance and more accurate regression behavior than the comparison approach. Overall, this example illustrates that RMFGP can successfully approximate the central subspace in a stochastic partial differential equation problem, even when the low-fidelity model contains missing information.

4.4. Elliptic Equation

The final example illustrates the performance of the proposed RMFGP framework for a more challenging stochastic partial differential equation. We consider the one-dimensional elliptic equation with a random, high-order coefficient
−(d/dx)(a(x; ξ) du(x; ξ)/dx) = 1 , x ∈ ( 0 , 1 ) ,
subject to homogeneous Dirichlet boundary conditions
u ( 0 ; ξ ) = u ( 1 ; ξ ) = 0 .
The diffusion coefficient a(x; ξ) depends on a random input vector ξ = (ξ₁, …, ξ₆) and is defined differently for the high- and low-fidelity models as
a_H(x; ξ) = 1 / (ξ₁ + sin(x(ξ₁ + ξ₂ + ξ₃ + ξ₄)) + 1) ,
a_L(x; ξ) = 1 / (0.1 + sin(x(ξ₁ + ξ₂ + ξ₃ + ξ₄)) + 1) .
The random variables ξ i are assumed to be independent and uniformly distributed on [ 0 , 1 ] . Compared with the high-fidelity model, the low-fidelity model introduces a systematic bias through the constant shift in the denominator, while retaining partial information about the stochastic structure of the coefficient.
For this elliptic problem, a deterministic solution representation exists and is given by
u(x; ξ) = u(0; ξ) + ∫₀ˣ (a(0; ξ) u′(0; ξ) − y) / a(y; ξ) dy .
Applying the boundary conditions yields
a(0; ξ) u′(0; ξ) = (∫₀¹ y / a(y; ξ) dy) / (∫₀¹ 1 / a(y; ξ) dy) .
The integrals appearing in the solution are evaluated using highly accurate numerical quadrature. Unlike the previous examples, no closed-form analytical expression for u ( x ; ξ ) is available.
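A reference solution along these lines can be sketched with a simple composite trapezoidal rule. The coefficient a_H below follows our reading of the formula above, and the grid size is an illustrative choice rather than the quadrature actually used by the authors:

```python
import numpy as np

def a_H(x, xi):
    # assumed reading of the high-fidelity diffusion coefficient
    return 1.0 / (xi[0] + np.sin(x * (xi[0] + xi[1] + xi[2] + xi[3])) + 1.0)

def trap(f, y):
    # composite trapezoidal rule for samples f on the grid y
    return float(((f[1:] + f[:-1]) * np.diff(y)).sum() / 2.0)

def solve_u(x_eval, xi, a_fun, n_quad=2001):
    # u(x) = int_0^x (C - y)/a(y) dy, with C = a(0) u'(0) fixed by u(1) = 0
    y = np.linspace(0.0, 1.0, n_quad)
    inv_a = 1.0 / a_fun(y, xi)
    C = trap(y * inv_a, y) / trap(inv_a, y)
    m = y <= x_eval + 1e-12
    return trap((C - y[m]) * inv_a[m], y[m])

rng = np.random.default_rng(7)
xi = rng.uniform(size=6)
u07 = solve_u(0.7, xi, a_H)   # solution value at the spatial location x = 0.7
```

By construction the quadrature reproduces the homogeneous Dirichlet conditions, and the maximum principle guarantees the computed solution is positive in the interior.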
In this experiment, the solution is evaluated at the spatial location x = 0.7 . The exact central subspace cannot be inferred directly from the governing equation. Instead, it is approximated using the classical SAVE method applied to a large reference dataset consisting of 10,000 high-fidelity samples. The intrinsic dimension is determined via the Bayesian information criterion, and the corresponding BIC values G ( k ) are reported in Table 10. The criterion attains its maximum at k = 2 , indicating that the true reduced dimension is d ^ = 2 .
The number of low-fidelity training samples is fixed at N_L = 200, and the number of test samples is set to N_T = 500. For all cases with different values of N_H, the experiment starts with (N_H − 5) high-fidelity samples. Two iterations of Bayesian active learning are performed, with two samples added in the first iteration and three samples added in the second.
Table 11 summarizes the accuracy of the estimated central subspace using the metric m ( A , A ^ ) . Table 12 reports the relative prediction errors for different surrogate models. Figure 10 presents the mean squared error as a function of the number of high-fidelity samples, while Figure 11 shows the corresponding correlation plots at N H = 25 . Across all metrics, RMFGP consistently outperforms the comparison Gaussian process models and GP-SAVE, exhibiting behavior similar to that observed in the previous examples.
As the number of high-fidelity samples increases, RMFGP with flag = 0 (rotated inputs without truncation) achieves the best predictive performance. This observation reflects the fact that accurate identification of the principal directions improves the quality of the surrogate.
Finally, an uncertainty quantification analysis is performed for this model. Using the dimension reduction matrix M obtained by the RMFGP model with flag = 1 , a new Gaussian process surrogate is built by pre-processing all training data with M to reduce the input dimension to d = 2. Then, 2000 i.i.d. samples of ξ are drawn from the uniform distribution. Figure 12 presents the averages of the means and standard deviations (std) over these 2000 cases at 50 different values of x in [ 0 , 1 ] . The x-axis represents the indices of the x values; the y-axis shows the average mean in ( a ) and the average std in ( b ) for each case. The ground truth is the black line. The green diamond line is obtained by the pure SAVE method given sufficiently large training data, namely 10,000 high-fidelity samples, whereas the RMFGP model uses only 35 high-fidelity samples. As shown in Figure 12a, all four methods exhibit similar performance. However, Figure 12b shows that RMFGP attains a smaller std than the GP-SAVE method, indicating that RMFGP is more confident in its predictions.

5. Conclusions

In this work, we proposed a Rotated Multi-Fidelity Gaussian Process (RMFGP) framework for surrogate modeling, dimension reduction, and uncertainty quantification in high-dimensional settings with severely limited high-fidelity data. The central idea of the proposed approach is to tightly integrate multi-fidelity Gaussian process modeling, supervised dimension reduction, and Bayesian active learning within a unified iterative framework. By exploiting abundant low-fidelity data to guide an initial rotation of the input space and refining this rotation through surrogate-assisted dimension reduction and adaptive sampling, RMFGP enables reliable identification of the central subspace associated with the high-fidelity response.
A key feature of RMFGP is the distinction between input rotation and dimensional truncation. Rather than performing aggressive dimension reduction directly from scarce high-fidelity data, the proposed two-stage strategy first reorganizes the input space to concentrate informative directions and then constructs reduced-order surrogates in a well-conditioned coordinate system. This design alleviates the ill-posedness commonly encountered in high-dimensional Gaussian process regression and allows for accurate surrogate construction with substantially fewer high-fidelity evaluations.
The effectiveness of the RMFGP framework was demonstrated through a sequence of numerical examples, including linear and nonlinear algebraic models as well as stochastic partial differential equations. Across all test cases, RMFGP consistently achieved more accurate estimation of the central subspace, improved predictive performance, and more reliable uncertainty quantification compared to standard Gaussian process regression and GP-SAVE methods, particularly in the small-sample regime. The results also highlight the benefit of Bayesian active learning in enhancing both surrogate accuracy and dimension reduction quality under limited computational budgets.
Several directions for future research remain open. The present work focuses on problems with moderate input dimensionality and smooth response functions; extensions to problems with discontinuities or sharp gradients would be of interest. In addition, while the current framework employs Gaussian process surrogates, the rotation and refinement strategy could be combined with other probabilistic or operator-learning models. Finally, extending RMFGP to handle time-dependent problems and more complex multi-physics systems represents a promising direction for further investigation.

Author Contributions

Conceptualization, J.Z., S.Z. and G.L.; Methodology, J.Z., S.Z. and G.L.; Software, J.Z. and S.Z.; Validation, J.Z. and S.Z.; Formal analysis, J.Z. and S.Z.; Investigation, J.Z. and S.Z.; Resources, J.Z. and S.Z.; Data curation, J.Z. and S.Z.; Writing—original draft, J.Z., S.Z. and G.L.; Writing—review & editing, J.Z., S.Z. and G.L.; Visualization, J.Z. and S.Z.; Supervision, J.Z. and G.L.; Project administration, G.L.; Funding acquisition, G.L. All authors have read and agreed to the published version of the manuscript.

Funding

We would like to acknowledge the support received from the National Science Foundation (DMS-2533878, DMS-2053746, DMS-2134209, ECCS-2328241, CBET-2347401 and OAC-2311848), and U.S. Department of Energy (DOE) Office of Science Advanced Scientific Computing Research program DE-SC0023161, the SciDAC LEADS Institute, and DOE–Fusion Energy Science, under grant number: DE-SC0024583.

Data Availability Statement

The original contributions presented in this study are included in the article. For further inquiries, please contact the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ma, Y.; Zhu, L. A Review on Dimension Reduction. Int. Stat. Rev. 2013, 81, 134–150. [Google Scholar] [CrossRef]
  2. Fukumizu, K.; Bach, F.R.; Jordan, M.I. Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. J. Mach. Learn. Res. 2004, 5, 73–99. [Google Scholar]
  3. Li, K. Sliced Inverse Regression for Dimension Reduction. J. Am. Stat. Assoc. 1991, 86, 316–327. [Google Scholar] [CrossRef]
  4. Li, Y.; Zhu, L. Asymptotics for sliced average variance estimation. Ann. Stat. 2007, 35, 41–69. [Google Scholar] [CrossRef]
  5. Li, B.; Zha, H.; Chiaromonte, F. Contour Regression: A general approach to dimension reduction. Ann. Stat. 2005, 33, 1580–1616. [Google Scholar] [CrossRef]
  6. Constantine, P.G.; Dow, E.; Wang, Q. Active Subspace Methods in Theory and Practice: Application to Kriging Surfaces. SIAM J. Sci. Comput. 2014, 36, A1500–A1524. [Google Scholar] [CrossRef]
  7. Li, W.; Lin, G.; Li, B. Inverse regression-based uncertainty quantification algorithms for high-dimensional models: Theory and practice. J. Comput. Phys. 2016, 321, 259–278. [Google Scholar] [CrossRef]
  8. Tripathy, R.; Bilionis, I.; Gonzalez, M. Gaussian processes with built-in dimensionality reduction: Applications to high-dimensional uncertainty propagation. J. Comput. Phys. 2016, 321, 191–223. [Google Scholar] [CrossRef]
  9. Reich, B.J.; Bondell, H.D.; Li, L. Sufficient dimension reduction via Bayesian mixture modeling. Biometrics 2011, 67, 886–895. [Google Scholar] [CrossRef]
  10. Solonen, A.; Cui, T.; Hakkarainen, J.; Marzouk, Y. On dimension reduction in Gaussian filters. Inverse Probl. 2016, 32, 045003. [Google Scholar] [CrossRef]
  11. Xia, Y.; Tong, H.; Li, W.K.; Zhu, L.X. An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2002, 64, 363–410. [Google Scholar] [CrossRef]
  12. Xia, Y. A constructive approach to the estimation of dimension reduction directions. Ann. Stat. 2007, 35, 2654–2690. [Google Scholar] [CrossRef]
  13. Cai, X.; Lin, G.; Li, J. Bayesian inverse regression for dimension reduction with small datasets. arXiv 2019, arXiv:1906.08018v3. [Google Scholar] [CrossRef]
  14. Kennedy, M.C.; O’Hagan, A. Predicting the output from a complex computer code when fast approximations are available. Biometrika 2000, 87, 1–13. [Google Scholar] [CrossRef]
  15. Gratiet, L.L.; Garnier, J. Recursive co-kriging model for design of computer experiments with multiple levels of fidelity. Int. J. Uncertain. Quantif. 2014, 4, 365–386. [Google Scholar] [CrossRef]
  16. Perdikaris, P.; Raissi, M.; Damianou, A.; Lawrence, N.; Karniadakis, G.E. Nonlinear information fusion algorithms for data-efficient multi-fidelity modelling. Proc. R. Soc. A Math. Phys. Eng. Sci. 2017, 473, 20160715. [Google Scholar] [CrossRef] [PubMed]
  17. Lam, R.R.; Zahm, O.; Marzouk, Y.M.; Willcox, K.E. Multifidelity Dimension Reduction via Active Subspaces. SIAM J. Sci. Comput. 2020, 42, A929–A956. [Google Scholar] [CrossRef]
  18. Tripathy, R.; Bilionis, I. Deep active subspaces: A scalable method for high-dimensional uncertainty propagation. In Proceedings of the ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Anaheim, CA, USA, 18–21 August 2019. [Google Scholar]
  19. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006. [Google Scholar]
  20. Damianou, A.; Lawrence, N. Deep Gaussian Processes. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, Scottsdale, AZ, USA, 29 April–1 May 2013; pp. 207–215. [Google Scholar]
  21. Damianou, A. Deep Gaussian Processes and Variational Propagation of Uncertainty. Ph.D. Thesis, University of Sheffield, Sheffield, UK, 2015. [Google Scholar]
  22. Li, B. Sufficient Dimension Reduction: Methods and Applications with R; Chapman & Hall/CRC Press: Boca Raton, FL, USA, 2018. [Google Scholar]
  23. Cook, R.D.; Forzani, L. Likelihood-based sufficient dimension reduction. J. Am. Stat. Assoc. 2009, 104, 197–208. [Google Scholar] [CrossRef]
  24. Li, B.; Wen, S.; Zhu, L. On a Projective Resampling Method for Dimension Reduction with Multivariate Responses. J. Am. Stat. Assoc. 2008, 103, 1177–1186. [Google Scholar] [CrossRef]
  25. Chernoff, H. Sequential design of experiments. Ann. Math. Stat. 1959, 30, 755–770. [Google Scholar] [CrossRef]
  26. Dror, H.A.; Steinberg, D.M. Sequential experimental designs for generalized linear models. J. Am. Stat. Assoc. 2008, 103, 288–298. [Google Scholar] [CrossRef]
  27. Deng, X.; Joseph, V.R.; Sudjianto, A.; Wu, C.J. Active learning through sequential design, with applications to detection of money laundering. J. Am. Stat. Assoc. 2009, 104, 969–981. [Google Scholar] [CrossRef]
  28. Williams, B.J.; Santner, T.J.; Notz, W.I. Sequential design of computer experiments to minimize integrated response functions. Stat. Sin. 2000, 10, 1133–1152. [Google Scholar]
  29. Jardak, M.; Su, C.; Karniadakis, G.E. Spectral polynomial chaos solutions of the stochastic advection equation. J. Sci. Comput. 2002, 17, 319–338. [Google Scholar] [CrossRef]
Figure 1. A general framework for solving high-dimensional problems with insufficient data—an overview of the workflow.
Figure 2. MSE of linear examples: (a) models without dimension reduction: RMFGP (flag = 0) vs. GP; (b) models with dimension reduction: RMFGP (flag = 1) vs. GP-SAVE.
Figure 3. Correlation plots of linear examples at N_H = 30: (a) RMFGP (flag = 0) vs. GP; (b) RMFGP (flag = 1) vs. GP-SAVE.
Figure 4. MSE of nonlinear examples: (a) models without dimension reduction: RMFGP (flag = 0) vs. GP; (b) models with dimension reduction: RMFGP (flag = 1) vs. GP-SAVE.
Figure 5. Correlation plots of nonlinear examples at N_H = 20: (a) RMFGP (flag = 0) vs. GP; (b) RMFGP (flag = 1) vs. GP-SAVE.
Figure 6. Prediction plots of the nonlinear example at N_H = 20: RMFGP (flag = 1) vs. GP-SAVE. The x-axis shows the test inputs after dimension reduction, x_d, and the y-axis shows the corresponding observations.
Figure 7. MSE of the advection equation: (a) models without dimension reduction: RMFGP (flag = 0) vs. GP; (b) models with dimension reduction: RMFGP (flag = 1) vs. GP-SAVE.
Figure 8. Correlation plots of the advection equation at N_H = 30: (a) RMFGP (flag = 0) vs. GP; (b) RMFGP (flag = 1) vs. GP-SAVE.
Figure 9. Prediction plots of the advection equation at N_H = 30: RMFGP (flag = 1) vs. GP-SAVE.
Figure 10. MSE of the elliptic equation: (a) models without dimension reduction: RMFGP (flag = 0) vs. GP; (b) models with dimension reduction: RMFGP (flag = 1) vs. GP-SAVE.
Figure 11. Correlation plots of the elliptic equation at N_H = 25: (a) RMFGP (flag = 0) vs. GP; (b) RMFGP (flag = 1) vs. GP-SAVE.
Figure 12. Average prediction means and standard deviations of the elliptic equation at N_H = 35: (a) average mean; (b) average standard deviation.
Table 1. Accuracy of the dimension reduction matrix measured by the metric m(A, Â) with different numbers of high-fidelity samples for linear examples.

                     N_H = 25    N_H = 30    N_H = 35    N_H = 40
RMFGP (flag = 1)     0.133262    0.112705    0.066355    0.043337
GP-SAVE              0.396780    0.243222    0.221707    0.202306

Note: Bold numbers indicate the lowest error among all methods for each fixed N_H.
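The paper's exact definition of the subspace-accuracy metric m(A, Â) is not restated here; a common choice in the sufficient dimension reduction literature, sketched below as an assumption, is the Frobenius-norm distance between the orthogonal projection matrices onto the column spaces of the true matrix A and the estimate Â:

```python
# Hypothetical sketch of a subspace-distance metric m(A, A_hat): the Frobenius
# distance between projections onto col(A) and col(A_hat). This is a standard
# choice, but the paper's exact definition of m(A, A_hat) may differ.
import numpy as np

def subspace_distance(A, A_hat):
    """Frobenius distance ||P_A - P_A_hat||_F between projection matrices."""
    # Orthonormalize the columns via thin QR so the projections are well defined.
    Qa, _ = np.linalg.qr(np.asarray(A, dtype=float))
    Qb, _ = np.linalg.qr(np.asarray(A_hat, dtype=float))
    Pa = Qa @ Qa.T  # projection onto col(A)
    Pb = Qb @ Qb.T  # projection onto col(A_hat)
    return np.linalg.norm(Pa - Pb, ord="fro")

# Identical subspaces give distance 0; orthogonal 1D subspaces in 2D give sqrt(2).
A = np.array([[1.0], [0.0]])
B = np.array([[0.0], [1.0]])
print(subspace_distance(A, A))  # 0.0
print(subspace_distance(A, B))  # ≈ 1.414
```

A metric of this form is invariant to the choice of basis within each subspace, which matches how the tables compare reduction matrices rather than individual column vectors.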
Table 2. BIC: G(k) for linear examples.

        k = 1       k = 2       k = 3       k = 4       k = 5       k = 6
G(k)    0.936692    0.959209    0.927219    0.894635    0.860916    0.826732

Note: Bold number indicates the highest BIC value.
Table 3. Relative error e of the RMFGP model compared to the standard GP for linear examples. If flag = 0, the inputs are simply rotated by the rotation matrix from the RMFGP model before being fed into a new GP surrogate model, which is compared to a standard GP model. If flag = 1, the inputs are reduced to dimension d = 2. For comparison, the inputs for the standard GP are reduced to dimension d = 2 by a reduction matrix computed with the SAVE method using the same number of high-fidelity training points.

                     N_H = 25    N_H = 30    N_H = 35    N_H = 40
RMFGP (flag = 0)     0.051382    0.030260    0.019012    0.007531
GP                   0.060513    0.043898    0.027694    0.023358
RMFGP (flag = 1)     0.051358    0.039532    0.026999    0.020309
GP-SAVE              0.084310    0.077906    0.072945    0.066365

Note: Bold numbers indicate the relative error of the proposed method for each fixed N_H.
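The relative error e reported in these tables is not defined in this excerpt; the sketch below assumes it is the relative L2 error between surrogate predictions and the high-fidelity reference over the test set, a standard choice for surrogate benchmarks (the paper's exact definition may differ):

```python
# Hypothetical sketch of the relative error e, assumed to be the relative L2
# error ||y_pred - y_true||_2 / ||y_true||_2 on a held-out test set.
import numpy as np

def relative_l2_error(y_pred, y_true):
    """Relative L2 error of predictions against reference observations."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.linalg.norm(y_pred - y_true) / np.linalg.norm(y_true)

# Small illustrative check with made-up values.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.0])
print(relative_l2_error(y_pred, y_true))  # ≈ 0.0378
```

Normalizing by ||y_true||_2 makes the errors comparable across the linear, nonlinear, and PDE examples, whose outputs live on different scales.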
Table 4. Accuracy of the dimension reduction matrix measured by the metric m(A, Â) with different numbers of high-fidelity samples for nonlinear examples.

                     N_H = 10    N_H = 15    N_H = 20    N_H = 25
RMFGP (flag = 1)     0.375618    0.217788    0.032944    0.019125
GP-SAVE              1.234884    1.004467    0.177209    0.080237

Note: Bold numbers indicate the lowest error among all methods for each fixed N_H.
Table 5. BIC: G(k) for nonlinear examples.

        k = 1       k = 2       k = 3       k = 4       k = 5
G(k)    0.962213    0.935511    0.906146    0.876473    0.846605

        k = 6       k = 7       k = 8       k = 9       k = 10
G(k)    0.816363    0.785284    0.753946    0.722162    0.689382

Note: Bold number indicates the highest BIC value.
Table 6. Relative error e of the RMFGP model compared to the standard GP for nonlinear examples.

                     N_H = 10    N_H = 15    N_H = 20    N_H = 25
RMFGP (flag = 0)     0.037386    0.008324    0.001096    0.000810
GP                   0.171120    0.139623    0.034945    0.024370
RMFGP (flag = 1)     0.080942    0.045254    0.006634    0.003374
GP-SAVE              0.175399    0.133578    0.026675    0.021621

Note: Bold numbers indicate the relative error of the proposed method for each fixed N_H.
Table 7. BIC: G(k) for the advection equation.

        k = 1       k = 2       k = 3       k = 4       k = 5
G(k)    0.875394    0.846415    0.816565    0.786153    0.755256

Note: Bold number indicates the highest BIC value.
Table 8. Error of the dimension reduction matrix measured by the metric m(A, Â) with different numbers of high-fidelity samples for the advection equation.

                     N_H = 20    N_H = 25    N_H = 30    N_H = 35
RMFGP (flag = 1)     0.249067    0.114606    0.072898    0.066792
GP-SAVE              1.410894    1.269664    0.483262    0.450924

Note: Bold numbers indicate the lowest error among all methods for each fixed N_H.
Table 9. Relative error e of the RMFGP model compared to the standard GP for the advection equation.

                     N_H = 20    N_H = 25    N_H = 30    N_H = 35
RMFGP (flag = 0)     0.645469    0.084900    0.068312    0.061210
GP                   1.025144    0.964663    0.705019    0.672353
RMFGP (flag = 1)     0.277659    0.260726    0.216323    0.084661
GP-SAVE              0.743488    0.665575    0.406130    0.384018

Note: Bold numbers indicate the relative error of the proposed method for each fixed N_H.
Table 10. BIC: G(k) for the elliptic equation.

        k = 1       k = 2       k = 3       k = 4       k = 5       k = 6
G(k)    0.701881    0.732303    0.709489    0.685960    0.662142    0.637976

Note: Bold number indicates the highest BIC value.
Table 11. Error of the dimension reduction matrix measured by the metric m(A, Â) with different numbers of high-fidelity samples for the elliptic equation.

                     N_H = 20    N_H = 25    N_H = 30    N_H = 35
RMFGP (flag = 1)     0.159129    0.141972    0.122867    0.120856
GP-SAVE              0.921045    0.387880    0.250663    0.238877

Note: Bold numbers indicate the lowest error among all methods for each fixed N_H.
Table 12. Relative error e of the RMFGP model compared to the standard GP for the elliptic equation.

                     N_H = 20    N_H = 25    N_H = 30    N_H = 35
RMFGP (flag = 0)     0.015076    0.007989    0.004557    0.001545
GP                   0.060633    0.008441    0.006939    0.004409
RMFGP (flag = 1)     0.014057    0.011540    0.011276    0.010667
GP-SAVE              0.059538    0.026454    0.020915    0.014822

Note: Bold numbers indicate the relative error of the proposed method for each fixed N_H.
Share and Cite

Zhang, J.; Zhang, S.; Lin, G. RMFGP: A Rotated Multi-Fidelity Gaussian Process Framework for Supervised Dimension Reduction. Mathematics 2026, 14, 325. https://doi.org/10.3390/math14020325