Next Article in Journal
Affine Calculus for Constrained Minima of the Kullback–Leibler Divergence
Previous Article in Journal
Statistical Gravity and Entropy of Spacetime
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Optimal ANOVA-Based Emulators of Models With(out) Derivatives

by
Matieyendou Lamboni
1,2
1
Department DFR-ST, University of Guyane, Cayenne 97346, French Guiana
2
228-UMR Espace-Dev, University of Guyane, University of Réunion, IRD, University of Montpellier, 34090 Montpellier, France
Stats 2025, 8(1), 24; https://doi.org/10.3390/stats8010024
Submission received: 31 January 2025 / Revised: 1 March 2025 / Accepted: 12 March 2025 / Published: 17 March 2025
(This article belongs to the Section Statistical Methods)

Abstract

:
This paper proposes new ANOVA-based approximations of functions and emulators of high-dimensional models using either available derivatives or local stochastic evaluations of such models. Our approach makes use of sensitivity indices to design adequate structures of emulators. For high-dimensional models with available derivatives, our derivative-based emulators reach dimension-free mean squared errors (MSEs) and a parametric rate of convergence (i.e., O ( N 1 ) ). This approach is extended to cope with every model (without available derivatives) by deriving global emulators that account for the local properties of models or simulators. Such generic emulators enjoy dimension-free biases, parametric rates of convergence, and MSEs that depend on the dimensionality. Dimension-free MSEs are obtained for high-dimensional models with particular distributions from the input. Our emulators are also competitive in dealing with different distributions of the input variables and selecting inputs and interactions. Simulations show the efficiency of our approach.

1. Introduction

Derivatives are sometimes available in modeling either according to the nature of observations of the phenomena of interest ([1,2] and the references therein) or low-cost evaluations of exact derivatives for some classes of PDE/ODE-based models, thanks to adjoint methods [3,4,5,6,7,8]. Models are defined via their rates of change with respect to their inputs; implicit functions (defined via their derivatives) are instances. Additionally, and particularly for complex models or simulators, efficient estimators of gradients and second-order derivatives using stochastic approximations are provided in refs. [9,10,11,12,13]. Being able to reconstruct functions using such derivatives is worth investigating [14], as having a practical, fast-evaluated model that links stochastic parameters and/or stochastic initial conditions to the output of interest of PDE/ODE models remains a challenge due to numerous uncertain inputs [15].
Moreover, the first-order derivatives of models are used to quickly select non-relevant input variables of simulators or functions, leading to effective screening measures. Efficient variance-based screening methods rely on the upper bounds of generalized sensitivity indices, including Sobol’s indices (see refs. [16,17,18,19,20,21,22] for independent inputs and [23] for non-independent variables).
For high-dimensional simulators, dimension reductions via screening input variables are often performed before building their emulators using Gaussian processes (kriging models) [24,25,26,27,28] polynomial chaos expansions [29,30,31], SS-ANOVA [32,33], or other machine learning approaches [34,35]. Indeed, such emulators rely on nonparametric or semi-parametric regressions and struggle to reconstruct simulators for a moderate to large number of inputs (e.g., [35]). Often, nonparametric rates of convergence are achieved by such emulators (see ref. [33] for SS-ANOVA and [36,37,38] for polynomial chaos expansions). Regarding the stability and accuracy of polynomial chaos expansions, the number of model runs needed is first estimated at the square of the dimension of the basis used [36] and then reduced at that dimension up to a logarithm factor [37,38]. Note that for d inputs, such a dimension is about ( w + 1 ) d for the tensor-product basis, including, at most, the monomial of degree w.
For models with a moderate to large number of relevant inputs, Bayesian additive regression trees have been used for building emulators of such models using only the input and output observations (see ref. [35] and the references therein). Such approaches rely somehow on rule-ensemble learning approaches by constructing homogenous or local base learners (see ref. [34] and references therein). Combining the model outputs and model derivatives can help to build emulators that account for both local and global properties of simulators. For instance, including derivatives in the Gaussian processes (considered in ref. [2]) allows for improving such emulators. Emulators based on Taylor series (see ref. [14]) combine both the model outputs and derivative outputs with interesting theoretical results, such as dimension-free rates of convergence. However, concrete constructions of such emulators are not provided in that paper.
Note that the aforementioned emulators are part of the class of global approximations of functions. While global emulators can be used to approximate models at any point in entire domain, local or point-based emulators require building different emulators for the same model. Conceptually, the main issues related to such practical emulators are the truncation errors and the biases due to epistemic uncertainty. Indeed, none of the above emulators rely on exact and finite expansions of functions in general. Thus, additional assumptions about the order of such errors are necessary to derive the rates of convergence of emulators. For instance, decreasing eigenvalues of kernels are assumed in kernel-based learning approaches (see, e.g., ref. [39]).
So far, the recent derivative-based (Db) ANOVA provides exact expansions of smooth functions with different types of distributions from inputs using the model outputs, such as first-order and second-order derivatives up to d t h -order cross partial derivatives [22]. It was used in ref. [13] to derive the plug-in and crude emulators of models by replacing the unknown derivatives with their estimates. However, convergence analysis of such emulators is not known, and derivative-free methods are much more convenient for applications in which the computations of cross-partial derivatives are too expansive or impossible [9,10,11].
Therefore, for high-dimensional simulators for which all the inputs are important or not, it is worth investigating the development of their emulators based directly on available derivatives or derivative-free methods. The contribution of this paper is threefold:
  • We designed adequate structures of emulators based on information gathered from global and derivative-based sensitivity analysis, such as unbiased orders of truncations and the selection of relevant ANOVA components (inputs and interactions);
  • We constructed derivative-based or derivative-free global emulators that are easy to fit and compute and can cope with every distribution of continuous input variables;
  • We examined the convergence analysis of our emulators, with a particular focus on the (i) dimension-free upper bounds of the biases and MSEs; (ii) the parametric rates of convergence (i.e., O ( N 1 ) ); and (iii) the number of model runs needed to obtain the stability and accuracy of our estimations.
In this paper, flexible emulators of complex models or simulators and approximations of functions are proposed using exact and finite expansions of functions involving cross-partial derivatives, known as Db-ANOVA. Section 2 first deals with the general formulation of Db-ANOVA using different continuous distribution functions and emulators of models with available derivatives. Such emulators reach the parametric rates of convergence, and their MSEs do not depend on dimensionality (i.e., d). Second, adequate structures of emulators are investigated using Sobol’s indices and their upper bounds, as the components of such emulators are interpretable as the main effects and interactions of a given order. The orders of unbiased truncations have been derived, leading to the possibility of selecting the ANOVA components that were included in our emulators.
For non-highly smooth models and for high-dimensional simulators for which the computations of derivatives are too expansive or impossible, new, efficient simulator emulators have been considered and are shown in Section 3, including their statistical properties. First, we provide such emulators under the assumption of quasi-uniform distributions of inputs so as to (i) obtain practical conditions for using such emulators and (ii) derive emulators that enjoy dimension-free MSEs for particular distributions of inputs. Second, such an assumption is removed to cope with every distribution of inputs. The proposed emulators have dimension-free biases and reach the parametric rate of convergence as well. Numerical illustrations (see Section 4) and an application to a heat diffusion PDE model (see Section 5) have been considered to show the efficiency of our estimators, and we conclude this work in Section 6.

General Notation

For an integer, d > 0 , denote with X : = ( X 1 , , X d ) a random vector of d independent and continuous variables with marginal cumulative distribution functions (CDFs), F j , and probability density functions (PDFs), ρ j , j = 1 , , d .
For a non-empty subset, u { 1 , , d } , | u | stands for its cardinality, and ( u ) : = { 1 , , d } u . Additionally, X u : = ( X j , j u ) denotes a subset of such variables, and the partition X = ( X u , X u ) holds. Finally, we use · 2 for the Euclidean norm, E [ · ] for the expectation operator, and V [ · ] for the variance operator.

2. New Insight into Derivative-Based ANOVA and Emulators

Given an integer, n > 0 , and an open set, Ω R d , consider a weak partial differentiable function, f : Ω R n [40,41], and a subset, v { 1 , , d } , with | v | > 0 . Denote with D | v | f : = k v x k f the | v | th weak cross-partial derivatives of each component of f with respect to each x k with k v and L 2 ( Ω ) : = f : Ω R n : E f ( X ) 2 2 < + , e.g., the Hilbert space of functions. Consider the following Hilbert–Sobolev space:
W d , 2 : = f L 2 ( Ω ) : D | v | f L 2 ( Ω ) ; | v | d .
In what follows, assume the following:
Assumption A1.
X is a random vector of independent and continuous variables, supported on an open domain, Ω.
Assumption A2.
f ( · ) is a deterministic function with f ( · ) W d , 2 .

2.1. Full Derivative-Based Emulators

Under Assumption 2, every sufficiently smooth unction, f ( · ) , admits the derivative-based ANOVA (Db-ANOVA) expansion (see refs. [13,22]), that is, x Ω ,
f ( x ) = E X f ( X ) + v , v { 1 , , d } | v | > 0 E X D | v | f X k v G k ( X k ) 𝟙 [ X k x k ] g k ( X k ) ,
where X : = X 1 , , X d is a random vector of independent variables, having the CDFs X j G j and the PDFs g j : = d G j d x j . By evaluating f ( · ) at the random vector, X , and taking G j = F j , yields the unique and orthogonal Db-ANOVA decomposition of f ( X ) , that is,
f ( X ) = E f ( X ) + v { 1 , , d } | v | > 0 f v ( X v ) ; f v ( X v ) : = E X D | v | f ( X ) k v F k ( X k ) 𝟙 [ X k X k ] ρ k ( X k ) .
When analytical cross-partial derivatives are available, or the derivative datasets are observed (see refs. [1,2] and the references therein), we are able to derive emulators of complex models that are time-demanding, bearing in mind the method of moments. Indeed, given a sample, X i i = 1 N : = X i , 1 , , X i , d i = 1 N , from X , and the associated sample of the (analytical or observed) derivatives’ outputs, that is,
D | v | f ( X i ) i = 1 N , v { 1 , , d } ,
the consistent (full) emulator or estimator of f at any sample point, x , of X is
f N ^ ( x ) : = 1 N i = 1 N v , v { 1 , , d } | v | 0 D | v | f ( X i ) k v F k ( X i , k ) 𝟙 [ X i , k x k ] ρ k ( X i , k ) ,
with D | | f = f and k c k : = 1 for every real c k by convention. We can check that f N ^ ( x ) is an unbiased estimator and that it reaches the parametric mean squared error (MSE) rate of convergence, that is, E f N ^ ( x ) f ( x ) 2 = O ( N 1 ) . This rate is dimension-free, provided that V v , v { 1 , , d } | v | 0 D | v | f ( X 1 ) k v F k ( X 1 , k ) 𝟙 [ X 1 , k x k ] ρ k ( X 1 , k ) < + .
For complex models without cross-partial derivatives, optimal estimators of such derivatives (i.e., D | v | f ^ ) have been used to construct the plug-in consistent emulator of f [13]. Such an emulator is given by
f N , p ^ ( x ) : = 1 N i = 1 N v , v { 1 , , d } | v | 0 D | v | f ^ ( X i ) k v F k ( X i , k ) 𝟙 [ X i , k x k ] ρ k ( X i , k ) .
While the estimator, D | v | f ^ , provided in ref. [13] has a dimension-free upper bound for the bias and reaches the parametric rate of convergence, its MSE increases with d | v | , showing the necessity of using the number of model runs, N d | v | , to expect a significant reduction in the MSE for higher-order cross-partial derivatives.

2.2. Adequate Structures of Emulators and Truncations

In high-dimensional settings, it is common to expect a reduction in the dimensionality before building emulators. The use of truncations is common practice in the polynomial approximation of functions [29,32,33,37] and in ANOVA-practicing communities [17,42,43,44], leading to truncated errors. When using Db-ANOVA, controlling such errors is made possible according to the information gathered from global sensitivity analysis [13,17]. Indeed, the variances in the terms in Db-ANOVA expansions of functions are exactly the main part, and interactions with Sobol’s indices up to a normalized constant occur when G j = F j . Thus, we are able to avoid non-relevant terms in our emulators according to the values of Sobol’s indices, suggesting that adequate truncations will not have any impact on the MSEs and the above parametric rate of convergence. For the sake of simplicity, n = 1 is considered in what follows.
Definition 1.
Consider an integer, d 0 { 1 , , d } , and the full Db-ANOVA given by (2). The truncated Db-ANOVA of f (in the superpose sense [42,43,44,45]) of order d 0 is given by
f T , d 0 ( X ) : = E f ( X ) + v { 1 , , d } 0 < | v | d 0 f v ( X v ) .
While f T , d 0 is an approximation of f in general, the equality holds for some functions. Given an integer, α 0 , consider the space of functions:
L α , 0 : = f : R d R n : f ( x ) w { 1 , , d } | w | α E X D | w | f ( X ) k w G k X k 𝟙 X k x k g k X k = 0 .
We can see that L α , 0 is a class of functions having, at most, α -order interactions. Moreover, L α , 0 contains the class of functions given by
D | { v , j 0 } | f = 0 , v { 1 , , d } , j 0 ( v ) , | v | = α .
It is clear, then, that we have f ( X ) = f T , d 0 ( X ) when f L α = d 0 , 0 .
Definition 2.
A truncation of f given by f T , d 0 is said to be an unbiased one whenever f ( X ) = f T , d 0 ( X ) .
Thus, L α , 0 is the class of unbiased truncations of f up to the order of d 0 = α .
Formally, based on available derivatives, we are able to derive unbiased truncations for some classes of functions. Consider the following Db expressions of Sobol’s indices of the input variables X j , j = 1 , , d and their upper bounds (see ref. [22]):
S j : = 1 V f ( X ) E f x j ( X ) f x j ( X ) F j min ( X j , X j ) F j ( X j ) F j ( X j ) ρ j ( X j ) ρ j ( X j ) ,
S T j : = 1 V f ( X ) E f x j ( X ) f x j ( X j , X j ) F j min ( X j , X j ) F j ( X j ) F j ( X j ) ρ j ( X j ) ρ j ( X j ) ,
and
S j S T j U B j : = 1 2 V f ( X ) E f x j ( X ) 2 F j ( X j ) 1 F j ( X j ) ρ j ( X j ) 2 .
Note that the computations of S j and U B j are straightforward, using given first-order derivatives (see ref. [13]), whereas those of S T j require using i) nonparametric methods for given derivatives or ii) derivatives of specific input values. Using such indices, adequate structures of f ( · ) can be constructed. For instance, it is known that when j = 1 d S j = 1 , f is an additive function of the form f ( x ) = j = 1 d h j ( x j ) , with h j being a real-valued function. Thus, a truncation (in the superpose sense) of order d 0 = 1 is an unbiased truncation. Other values of d 0 are given in Proposition 1.
Proposition 1.
Consider the main and total indices given by S j , S T j with j = 1 , , d . Then,
  • j = 1 d ( S j + S T j ) = 2 implies that using d 0 = 2 leads to unbiased truncations;
  • j = 1 d ( 2 S j + S T j ) ] 2 , 3 ] implies that using d 0 = 3 leads to unbiased truncations;
  • If there exists an integer, α, such that ( α 1 ) j = 1 d S T j + j = 1 d S j α and j = 1 d S T j + ( α 1 ) j = 1 d S j α , then d 0 = α leads to unbiased truncations.
Proof. 
See Appendix A. □
In general, if D | { v , j 0 } | f = 0 , v { 1 , , d } , j 0 ( v ) , | v | = α , then d 0 = α leads to unbiased truncations. Often, low-order derivatives are available or can be efficiently computed using fewer model runs, leading to truncated emulators in the superpose sense. It is worth noting that our emulators still enjoy the above parametric rate of convergence for unbiased truncations. In the presence of truncated errors, it is usually difficult to derive the rates of convergence without additional assumptions about the order of such errors.
In addition to such truncations in the superpose sense, screening measures allow for quickly identifying non-relevant inputs (i.e., U B j 0 ), leading to possible dimension reductions. For instance, we can see the following:
  • S j = S T j or S j U B j implies removing all cross-partial derivatives or interactions involving X j ;
  • S j = 0 and U B j S T j 0 , j { 1 , , d } suggest removing the first-order terms, corresponding to d 0 = 1 ;
  • 1 ( S i + S T j ) S T { i , j } = 0 or, equivalently, S i + S T j = 1 and S T k = 0 , k { i , j } implies keeping only X i and X j .
For non-highly smooth functions and the models for which the computations of derivatives are too expansive or impossible, derivative-free methods combined with unbiased truncations remain an interesting framework.

3. Derivative-Free Emulators of Models

This section covers the development of the model emulators, even when all the inputs are important according to screening measures.

3.1. Stochastic Surrogates of Functions Using Db-ANOVA

Consider integers L > 0 , q > 0 ; β R with = 1 , , L ; h : = ( h 1 , , h d ) R + d , and denote with V : = ( V 1 , , V d ) a d-dimensional random vector of independent variables satisfying: j { 1 , , d } ,
E V j = 0 ; E V j 2 = σ 2 ; E V j 2 q + 1 = 0 ; E V j 2 q < + .
Any random variable that is symmetrically about zero is an instance of V j s. Additionally, denote β h V : = ( β h 1 V 1 ; , β h d V d ) . For concise reporting of the results, elementary symmetric polynomials (ESPs) were used (see, e.g., [46,47]).
Definition 3.
Given u { 1 , , d } with | u | > 0 and r u : = ( r k , k u ) R | u | , the p t h ESP of r u is defined as follows:
e p ( u ) ( r u ) : = 0 i f p > | u | o r p < 0 1 i f p = 0 w u | w | = p k w r k i f p = 1 , , | u | .
Note that r : = r { 1 , , d } = ( r 1 , , r d ) R d , e p ( 1 : d ) r : = e p ( { 1 , , d } ) r . In addition, given X j G j , j { 1 , , d } , define
R k x k , X k , V k : = G k ( X k ) 𝟙 [ X k x k ] g k ( X k ) h k σ 2 V k , k = 1 , , d ;
R u x u , X u , V u : = R k x k , X k , V k , k u ; R x , X , V : = R { 1 , , d } x , X , V .
Without loss of generality, we are going to focus on the modified output, that is, f c ( x ) : = f ( x ) E f ( X ) . Based on the above framework, Theorem 1 provides a new approximation of every function or surrogate of a deterministic simulator.
Theorem 1.
Consider distinct β s. If f is smooth enough and Assumption 1 holds, then there exists α d { 1 , , L } and real coefficients C 1 ( p ) , , C L ( p ) , p = 1 , , d such that
f c ( x ) = = 1 L p = 1 d C ( p ) E f X + β h V e p ( 1 : d ) R x , X , V + O h 2 2 α d .
Proof. 
See Appendix B. □
The setting L = 1 , β 1 = 1 , C 1 ( p ) = 1 with p = 1 , , d provides an approximation of order O h 2 2 . Equivalently, the same order is obtained using the constraints:
= 1 L C ( p ) β r = δ p , r ; r = 0 , , L 1 , i f p L 1 = 1 L C ( p ) β r = δ p , r ; r = 0 , , L 2 , p , o t h e r w i s e ,
with p = 1 , , d and L d + 1 . In the same sense, taking = 1 L C ( p ) β r = δ p , r ; r = p , , L + p 1 with p = 1 , , d yields an approximation of order O h 2 2 L . Such constraints implicitly define the coefficients C 1 ( p ) , , C L ( p ) , p = 1 , , d , and they rely on the (generalized) Vandermonde matrices. Distinct values of β s (i.e., β 1 β 2 ) ensure the existence and uniqueness of such coefficients, as such matrices are invertible (see refs. [47,48]).
To improve the approximations of lower-order terms (i.e., lower values of p) in Equation (4), we are given an integer r * { 0 , , d 1 } with r * L 2 and consider the following constraints:
= 1 L C ( p ) β r = δ p , r ; r = 0 , , r * , p + 2 λ p + 2 , , p + 2 λ p + 2 ( L 1 r * ) , i f p r * = 1 L C ( p ) β r = δ p , r ; r = 0 , , r * , p , p + 2 , p + 2 ( L r * 2 ) , o t h e r w i s e ,
where λ p : = r * p 2 stands for the largest integer that is less than r * p 2 . The above choice of coefficients requires L r * + 2 , and L * : = r * + 2 is the minimum number of model runs used for deriving surrogates of functions. Such coefficients are more suitable for truncated surrogates. For instance, the truncated surrogate of order d 0 d (in the superposition sense) is given by
f T , d 0 c ˜ ( x ) : = = 1 L p = 1 d 0 C ( p ) E f X + β h V e p ( 1 : d ) R x , X , V .
Note that in this case, one must require r * { 0 , , d 0 1 } , and we will see that r * = d 0 1 is the best choice to improve the MSEs.
Likewise, when X u I with u I { 1 , , d } is the vector of the most influential input variables according to variance-based sensitivity analysis (see Section 2.2), the following truncated surrogate should be considered:
f u I c ˜ x u I : = = 1 L p = 1 | u I | C ( p ) E f X + β h V e p ( u I ) R u I x u I , X u I , V u I .
Based on Equation (4), the method of moments allows for deriving the emulator of any simulator or the estimator of any function. To that end, we are given two independent samples of size N, that is, X i i = 1 N : = X i , 1 , , X i , d i = 1 N from X and V i i = 1 N : = V i , 1 , , V i , d i = 1 N from V . The full and consistent emulator is given by
f N c ^ ( x ) : = 1 N i = 1 N = 1 L p = 1 d C ( p ) f X i + β h V i e p ( 1 : d ) R x , X i , V i .
The derivations of the truncated emulators (i.e., f T , d 0 c ˜ ^ and f u I c ˜ ^ ) are straightforward. All these emulators rely on N L model runs with the possibility L d . This property is useful for high-dimensional simulators.

3.2. Statistical Properties of Our Emulators

While the emulator f N c ^ ( x ) does not rely on the model derivatives, structural and technical assumptions about f are needed to derive the biases of this emulator, such as the Hölder space of functions. Given ı : = ( i 1 , , i d ) N d , denote D ( ı ) f : = k = 1 d i k x k f , ( x ) ı : = k = 1 d x k i k , ı ! = i 1 ! i d ! and | | ı | | 1 = i 1 + i d . Given α 0 , the Hölder space of α -smooth functions is given by x , y R d ,
H α : = f : R d R : f ( x ) | | ı | | 1 = 0 α 1 D ( ı ) f ( y ) ı ! x y ı M α x y 2 α ,
with M α > 0 , and D ( ı ) f ( y ) is a (weak) cross-partial derivative.
Moreover, given B α 0 and CDFs G j s, define the following space of functions:
L α , B α : = f : R d R : f ( x ) w { 1 , , d } | w | α E X D | w | f ( X ) k w G k X k 𝟙 X k x k g k X k B α .
We can see that L α , B α contains constants; L α , 0 is a class of functions having, at most, α -order of interactions, and L d , 0 is a class of all smooth functions, as B d = 0 . Lemma 1 provides the links between both spaces. To that end, consider M | w | : = D | w | f for all w { 1 , , d } with · being the infinity norm.
Assumption A3.
g j ρ min > 0 for any j { 1 , , d } .
Assumption 3 aims to cover the class of quasi-uniform distributions and other distributions for which the event g j ρ min occurs with a high probability. It is the case for most unbounded distributions.
Lemma 1.
Consider 0 < d 0 d , and assume that f H α with α { 0 , d } and Assumptions 1 and 3 hold. Then, there exists γ 0 > 0 such that f L d 0 , D d 0 , ρ min , with
D d 0 , ρ min : = 2 γ 0 M 0 1 2 ρ min M d M 0 1 / d + 1 d 0 1 2 ρ min M d M 0 1 / d + 1 d d 0 1 .
Proof. 
See Appendix C. □
Note that Lemma 1 also provides the upper bound of the remaining terms when approximating f ( x ) using the truncated function
f d 0 ( x ) : = w { 1 , , d } | w | d 0 E X D | w | f ( X ) k w G k X k 𝟙 X k x k g k X k .
When 1 2 ρ min M d M 0 1 / d 0 , D d 0 , ρ min is equivalent to
D d 0 , ρ min ( d d 0 ) γ 0 M 0 ( d 1 ) / d M d 1 / d ρ min d 0 2 ρ min M d M 0 1 / d + 1 .

3.2.1. Biases of the Proposed Emulators

To derive the bias of f N , d 0 c ^ (i.e., an estimator of f d 0 c ) in Theorem 2 using the aforementioned spaces of functions, denote with Z : = ( Z 1 , , Z d ) a d-dimensional random vector of independent variables that are centered about zero and standardized (i.e., E [ Z k 2 ] = 1 , k = 1 , , d ), and R c denotes the set of such random vectors. For any r N and w { 1 , , d } , define
Γ r : = = 1 L C ( | w | ) β r ; K w , L : = inf Z R c E Z 2 2 L k w Z k 2 Γ | w | + 2 L ;
L w : = r * | w | 2 + L r * 𝟙 | w | r * + ( L r * 1 ) 𝟙 | w | > r * .
Theorem 2.
Assume f H α with α 0 , max ( d , d 0 + 2 ( L r * 1 ) ) and Assumptions 1 and 3 hold. Then, we have
E f N , d 0 c ^ ( x ) f c ( x ) w { 1 , , d } 0 < | w | d 0 σ 2 L w h 2 2 L w M | w | + 2 L w K w , L w 1 2 ρ min | w | + D d 0 , ρ min .
Moreover, if V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , then
E f N , d 0 c ^ ( x ) f c ( x ) w { 1 , , d } 0 < | w | d 0 ξ 2 L w | | h 2 | | 1 L w M | w | + 2 L w Γ | w | + 2 L w 1 2 ρ min | w | + D d 0 , ρ min .
Proof. 
See Appendix D. □
Using the fact h k 0 , the results provided in Theorem 2 have simple upper bounds (see Corollary 1). To provide such results, consider
K 1 , r * , d 0 max : = max w { 1 , , d } r * < | w | d 0 K w , ( L r * 1 ) M | w | + 2 ( L r * 1 ) ;
K 2 , r * , d 0 max : = max w { 1 , , d } r * < | w | d 0 M | w | + 2 ( L r * 1 ) Γ | w | + 2 ( L r * 1 ) ;
K 1 , ρ min , r * : = 2 ρ min d 2 ρ min r * + 1 d 2 ρ min d 0 r * 1 d 2 ρ min 𝟙 r * < d 0 1 + d d 0 1 2 ρ min d 0 𝟙 r * = d 0 1 .
Corollary 1. 
Assume f H α with α 0 , max ( d , d 0 + 2 ( L r * 1 ) ) and Assumptions 1 and 3 hold. If h k 0 , then
E f N , d 0 c ^ ( x ) f c ( x ) h 2 2 L r * 1 σ 2 ( L r * 1 ) K 1 , r * , d 0 max K 1 , ρ min , r * + D d 0 , ρ min + O h 2 2 L r * 1 .
Moreover, if V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , then
E f N , d 0 c ^ ( x ) f c ( x ) | | h 2 | | 1 L r * 1 ξ 2 ( L r * 1 ) K 2 , r * , d 0 max K 1 , ρ min , r * + D d 0 , ρ min + O | | h 2 | | 1 L r * 1 .
Proof. 
See Appendix E. □
Using the above results, the bias of the full emulator of f c is straightforward when taking d 0 = d and knowing that D d , ρ min = 0 . Moreover, Corollary 2 provides the bias of such an emulator under different structural assumptions about f so as to cope with many functions. To that end, define
K 1 : d : = inf Z R c E Z 2 k = 1 d Z k 2 .
Corollary 2.
Let d 0 = d ; r * d 1 and L = r * + 2 . Assume f H α with α 0 , d + 1 and Assumptions 1 and 3 hold. If h k 0 , then
E f N c ^ ( x ) f c ( x ) σ h 2 M d + 1 K 1 : d Γ d + 1 k = 1 d E E k + O h 2 .
Moreover, if V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , then
E f N c ^ ( x ) f c ( x ) ξ | | h | | 1 M d + 1 Γ d + 1 k = 1 d E E k + O | | h | | 1 .
Proof. 
See Appendix F. □
In view of Corollaries 1 and 2, Equation (12) can lead to a dimension-free upper bound of the bias. Indeed, using the uniform bandwidth h k = h and
ξ 1 d M d + 1 Γ d + 1 k = 1 d E E k ,
we can see that the upper bound of the bias is E f N c ^ ( x ) f c ( x ) h . Furthermore, it is worth noting that E f N , d 0 c ^ ( x ) f c ( x ) h L r * 1 for any function f L d 0 , 0 .

3.2.2. Mean Squared Errors

We start this section with the variance of the proposed emulators, followed by their mean squared errors and different rates of convergence. For the variance, define
ϝ r * , d 0 max : = max w { 1 , , d } | w | d 0 M min ( r * , | w | 1 ) + 1 Γ min ( r * , | w | 1 ) + 1 ; ϝ d 0 max : = max w { 1 , , d } | w | d 0 M | w | Γ | w | .
Theorem 3.
Consider the coefficients given by Equation (5), and assume f H α with α 0 , max ( d , d 0 + 2 ( L r * 1 ) ) and Assumptions 1–3 hold. Then,
V f N , d 0 c ^ ( x ) ϝ r * , d 0 max 2 N w { 1 , , d } 0 < | w | d 0 k w E E k 2 h k 2 σ 4 E V 1 2 h V 2 2 min ( r * , | w | 1 ) + 1 | w | .
Moreover, if r * = d 0 1 , h k = h and Z k = V k / σ , then
V f N , d 0 c ^ ( x ) d ϝ d 0 max 2 E Z 1 2 Z 2 2 N d E Z 1 2 Z 2 2 3 ρ min 2 d E Z 1 2 Z 2 2 3 ρ min 2 d 0 1 .
Proof. 
See Appendix G. □
It turns out that the upper bounds of the variance in Theorem 3 do not depend on the uniform bandwidths when r * = d 0 1 , leading to the derivations of the parametric MSEs of f N , d 0 c ^ ( x ) and f N c ^ ( x ) . To that end, consider the upper bound of the above variance, that is,
Σ ¯ d 0 : = d ϝ d 0 max 2 E Z 1 2 Z 2 2 N d E Z 1 2 Z 2 2 3 ρ min 2 d E Z 1 2 Z 2 2 3 ρ min 2 d 0 1 .
Remark 1.
Based on the expression of Σ ¯ d 0 , the random variable V j or Z j = V j / σ having the smallest value of fourth moments or kurtosis should be used. Under the additional condition E Z 1 2 Z 2 2 3 ρ min 2 , we can check that (see Appendix G)
Σ ¯ d 0 ϝ d 0 max 2 N 2 d E Z 1 2 Z 2 2 3 ρ min 2 d 0 𝟙 d 0 d 0 * + 2 d E Z 1 2 Z 2 2 3 ρ min 2 d 0 𝟙 d 0 > d 0 * ,
with d 0 * : = ( d 1 ) ln ( 2 ) ln ( d ) .
In what follows, we are going to use E f N , d 0 c ^ ( x ) f c ( x ) 2 for the MSE of f N , d 0 c ^ and
E f N , d 0 c ^ f c 2 : = E X E f N , d 0 c ^ ( X ) f c ( X ) 2 ,
for the expected or integrated MSE (IMSEs) of f N , d 0 c ^ .
Corollary 3.
Given (5), r * = d 0 1 , assume f H α with α 0 , max ( d , d 0 + 2 ( L d 0 ) ) ; h k = h 0 and Assumptions 1–3 hold. Then, the MSE and IMSE share the same upper bound, given as follows:
E f N , d 0 c ^ f c 2 2 D d 0 , ρ min h 2 2 L d 0 σ 2 ( L d 0 ) K 1 , d 0 1 , d 0 max d d 0 1 2 ρ min d 0 + D d 0 , ρ min 2 + Σ ¯ d 0 + h 2 2 2 ( L d 0 ) σ 4 ( L d 0 ) K 1 , d 0 1 , d 0 max 2 d d 0 2 1 2 ρ min 2 d 0 .
Moreover, if G j = F j with j = 1 , , d , then the IMSE is given by
E E f N , d 0 c ^ f c 2 h 2 2 2 ( L d 0 ) σ 4 ( L d 0 ) K 1 , d 0 1 , d 0 max 2 d d 0 2 1 2 ρ min 2 d 0 + D d 0 , ρ min 2 + Σ ¯ d 0 .
Proof. 
See Appendix H. □
The presence of D d 0 , ρ min in Corollary 3 is going to decrease the rates of convergence of our estimators without additional assumptions about D d 0 , ρ min 2 . Corollary 4 starts providing such conditions and the associated MSEs and IMSEs.
Corollary 4.
Under the conditions of Corolary 3, assume that f L d 0 , 0 . Then, the upper bound of the IMSE and MSE is given by
h 2 2 2 ( L d 0 ) σ 4 ( L d 0 ) K 1 , d 0 1 , d 0 max 2 d d 0 2 1 2 ρ min 2 d 0 + Σ ¯ d 0 .
Moreover, if V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , then this bound becomes
| | h 2 | | 1 2 ( L d 0 ) ξ 4 ( L d 0 ) M d 0 + 2 ( L d 0 ) Γ d 0 + 2 ( L d 0 ) 2 d d 0 2 1 2 ρ min 2 d 0 + Σ ¯ d 0 .
Proof. 
Using Corollary 1, the results are straightforward. □
Based on the upper bounds of Corollary 4, interesting choices of σ or ξ on one hand and ρ min and h on the other hand help in obtaining the parametric rates of convergence due to the fact that Σ ¯ d 0 does not depend on h.
Corollary 5.
Let r * = d 0 1 , L = d 0 + 1 . Assume f H α with α { 0 , max ( d , d 0 + 2 ) } ; f L d 0 , 0 and Assumptions 1–3 hold. If h k = h N η with η ] 1 4 , 1 [ and ξ 2 d M d 0 + 2 Γ d 0 + 2 d d 0 1 2 ρ min d 0 1 , then the upper bound of MSE and IMSE is
E f N , d 0 c ^ f c 2 ϝ d 0 max 2 N 2 d ( d + 0.8 ) 3 ρ min 2 d 0 𝟙 d 0 d 0 * + 2 d d + 0.8 3 ρ min 2 d 0 𝟙 d 0 > d 0 * + O ( N 1 ) ,
provided that d + 0.8 3 ρ min 2 .
Moreover, if ρ min = c 0 d ( d + 0.8 ) 3 with the real 1 < c 0 2 , then
E f N , d 0 c ^ f c 2 ϝ d 0 max 2 N ( c 0 1 ) + O ( N 1 ) .
Proof. 
See Appendix I. □
It is worth noting that the parametric rates of convergence are reached for any function that belongs to L d 0 , 0 (see Corollary 5). Moreover, taking c 0 ] 1 , 2 ] leads to the dimension-free MSEs, which hold for particular distributions of the inputs given by
X j F j * , ρ j * ρ min * : = c 0 d ( d + 0.8 ) 3 , j { 1 , , d } .
For uniform distributions, X j U a j , b j , we must have b j a j 1 ρ min * . Obviously, such conditions are a bit strong, as a few distributions are covered. In the same sense, using f N , d 0 c ^ as an emulator of f c for any sample point, x , of X F requires choosing X j such that its support contains that of X j , j = 1 , , d . Thus, Assumption 3, given by g j > ρ min , implicitly depends on the distribution of X j . For instance, given the bounded support of X j , that is, ( a j , b j ) , we must have 1 ρ min * > b j a j b j a j , with ( a j , b j ) being the support of X j , limiting our ability to deploy f N , d 0 c ^ as a dimension-free, global emulator for some distributions of inputs.
Nevertheless, the assumption g j ρ min * is always satisfied for the finite dimensionality, d, if we are only interested in estimating f ( x 0 ) for a given point x 0 , leading to local emulators. Indeed, taking X j G j to be depended on the point x 0 at which f must be evaluated allows for enjoying the parametric rate of convergence and dimension-free MSEs for sample points falling in a neighborhood of x 0 . An example of such a choice is
X j U x j 0 1 2 ρ min * , x j 0 + 1 2 ρ min * .
However, different emulators are going to be built in order to estimate f ( x ) for any value, x , of X . Constructions of balls of given nodes and the radius 1 / ρ min * are an interesting perspective.
Remark 2.
When f L d 0 , 0 , in-depth structural assumptions about f that should allow for enjoying the above MSEs concern the truncation error, resulting from keeping only all the | v | h interactions or cross-partial derivatives with | v | d 0 . One way to handle it consists of choosing d 0 = d 0 , N such that the residual bias is less than 1 / N (i.e., D d 0 , N , ρ min 2 < 1 / N ), thanks to sensitivity indices.
While truncations are sometimes necessary in higher dimensions, it is interesting to have the rates of convergence without any truncation to cover lower or moderate dimensional functions, for instance.
Corollary 6.
Let d 0 = d ; r * = d 1 , L = d + 1 and h k = h . If f H α with α { 0 , d + 1 } , h 0 , and Assumptions 1–3 hold, then the upper bound of MSE and IMSE is
E f N c ^ f c 2 σ 2 h 2 2 M d + 1 2 K 1 : d 2 Γ d + 1 2 1 2 ρ min 2 d + ϝ d max 2 N E Z 1 2 Z 2 2 3 ρ min 2 + 1 d 1 .
Moreover, if V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , then
E f N c ^ f c 2 ξ 2 | | h | | 1 2 M d + 1 2 Γ d + 1 2 1 2 ρ min 2 d + ϝ d max 2 N d + 0.8 3 ρ min 2 + 1 d 1 .
Proof. 
See Appendix J. □
In the case of the full emulator of f c , remark that
ϝ d max 2 N d + 0.8 3 ρ min 2 + 1 d 1 d ϝ d max 2 ( d + 0.8 ) N d ( d + 0.8 ) 3 ρ min 2 d ( d + 0.8 ) 3 ρ min 2 d 1 .
Based on Corollary 6, different rates of convergence can be obtained depending on the the support of the input variables X j F j via the choice of ρ min .
Corollary 7.
Let r * = d 1 and L = d + 1 . Assume f H α with α { 0 , d + 1 } ; ξ d M d + 1 Γ d + 1 1 2 ρ min d 1 ; h k = h N η with η ] 1 2 , 1 [ and Assumptions 1–3 hold. Then, the (MSE and IMSE) rates of convergence are given as follows:
E f N c ^ f c 2 ϝ d max 2 N d + 0.8 3 ρ min 2 + 1 d 1 + O N 1 .
(i) 
If ρ min = d + 0.8 3 , then E f N c ^ f c 2 ϝ d max 2 N ( 2 d 1 ) + O N 1 .
(ii) 
If ρ min = c 0 d ( d + 0.8 ) 3 with 1 c 0 2 , then
E f N c ^ f c 2 ϝ d max 2 N min 1 c 0 1 , 1 c 0 d + 1 d 1 + O N 1 ,
with 1 c 0 d + 1 d 1 1 c 0 when c 0 d .
Proof. 
It is straightforward bearing in mind Corollaries 5 and 6. □
Again, the assumptions in Points (i)–(ii) are satisfied for fewer distributions, but they are always satisfied if we are only interested in estimating f ( x 0 ) for a given point x 0 , leading to building different local emulators.

3.2.3. Mean Squared Errors for Every Distribution of Inputs

In this section, we are going to remove the assumption about the quasi-uniform distribution of inputs (Assumption 3) so as to cover any probability distribution of inputs. Note that Assumption 3 is used to derive E E k 2 1 3 ρ min 2 and E | E k | 1 2 ρ min . Such an assumption can be avoided by using the following inequalities:
E | E k | = E 1 G ( X k ) 𝟙 X k X k + G ( X k ) 𝟙 X k < X k g k ( X k ) κ 1 ,
with
κ 1 : = sup k { 1 , , d } sup x k Ω k E F k ( X k ) + G ( X k ) 2 G ( X k ) F k ( X k ) g k ( X k ) ;
and
E E k 2 = E G 2 ( X k ) + F k ( X k ) 2 G ( X k ) F k ( X k ) g k 2 ( X k ) κ 2 ,
with
κ 2 : = sup k { 1 , , d } sup x k Ω k E G 2 ( x k ) + F k ( x k ) 2 G ( x k ) F k ( x k ) g k 2 ( x k ) .
Using such inequalities, the following results are straightforward keeping in mind Corollaries 5–7.
Corollary 8.
Let r * = d 0 1 and L = d 0 + 1 . Assume f H α with α { 0 , max ( d , d 0 + 2 ) } ; f L d 0 , 0 and Assumption 1 and 2 hold. If h k = h N η with η ] 1 4 , 1 [ and ξ 2 d M d 0 + 2 Γ d 0 + 2 d d 0 κ 1 d 0 1 , then we have
E f N , d 0 c ^ f c 2 ϝ d 0 max 2 N 2 κ 2 d ( d + 0.8 ) d 0 𝟙 d 0 d 0 * + 2 d κ 2 ( d + 0.8 ) d 0 𝟙 d 0 > d 0 * ,
provided that κ 2 ( d + 0.8 ) 1 .
Corollary 9.
Let r * = d 1 and L = d + 1 . Assume f H α with α { 0 , d + 1 } ; ξ d M d + 1 Γ d + 1 κ 1 d 1 ; h k = h N η with η ] 1 2 , 1 [ ; and Assumption 1 and 2 hold. Then,
E f N c ^ f c 2 ϝ d max 2 N κ 2 ( d + 0.8 ) + 1 d 1 .
Regarding the choice of G k , from the expression
E E k 2 = E F k ( X k ) + G ( X k ) [ G ( X k ) 2 F k ( X k ) ] g k 2 ( X k ) ,
an interesting choice of G k must satisfy G k 2 F k in order to reduce the value of E E k 2 . The following proposition gives interesting choices of G k s. Recall that X k F k and X k G k are supported on Ω k , Ω k , respectively, with Ω k Ω k .
Proposition 2.
Consider a PDF, ρ k , supported on Ω k Ω k and τ ] 0 , 1 ] . The distribution, G k , defined as a mixture of ρ k and ρ k , that is,
g k = τ ρ k 𝟙 Ω k + ( 1 τ ) ρ k 𝟙 Ω k Ω k ,
allows for reducing E E k 2 , provided that sup { x : x Ω } y 0 , y 0 Ω k Ω k .
Proof. 
We can check that G k F k . □
In what follows, we are going to consider τ = 1 (i.e., G k = F k ) and τ < 1 with ρ k being the uniform distribution.

4. Illustrations

In this section, we deploy our derivative-free emulators to approximate analytical functions. For the setting of different parameters needed, we rely on the results of Corollary 5. Indeed, we use V k U ( ξ , ξ ) with ξ = d d d 0 1 2 ρ min d 0 1 / 2 ; L = d 0 + 1 for the identified d 0 . For each function, we use N L runs of the model to construct our emulators, corresponding to f ( X i + β h V i ) with the following:
  • i = 1 , , N = 500 ;
  • β { 0 , ± 2 k 1 : k = 1 , , L 1 2 } if L is odd and β { ± 2 k : k = 1 , , L 2 } ; otherwise, h = N 1 .
Then, we predict the output for N = 500 sample points, that is, f ( X i ) . Finally, sample values are generated using Sobol’s sequence, and we compare the predictions associated with the initial distributions, that is, G k = F k , with a mixture distribution of F k (i.e., τ = 0.9 ) (see Proposition 2).

4.1. Test Functions

4.1.1. Ishigami’s Function ( d = 3 )

The Ishigami function is given by
f ( x ) = sin ( x 1 ) + 7 sin 2 ( x 2 ) + 0.1 x 3 4 sin ( x 1 ) ,
with X j U ( π , π ) , j = 1 , 2 , 3 . The sensitivity indices are S 1 = 0.3139 , S 2 = 0.4424 , S 3 = 0.0 , S T 1 = 0.567 , S T 2 = 0.442 , and S T 3 = 0.243 . Thus, we have d 0 = 2 because j = 1 3 ( S j + S T j ) = 2.01 (see Proposition 1). Moreover, we are going to remove (in our emulator) the main effect term corresponding to X 3 , as S 3 = 0 . Figure 1 depicts the predictions versus observations (i.e., simulated outputs) for 500 sample points. We can see that our predictions are in line with the simulated values of the output for both distributions used.

4.1.2. Sobol’s g-Function ( d = 10 )

The g-function [49] is defined as follows:
f ( x ) = j = 1 d = 10 | 4 x j 2 | + a j 1 + a j .
with X j U ( 0 , 1 ) j = 1 , , d = 10 . This function is differentiable almost everywhere and has different properties according to the values of a = ( a j , j = 1 , 2 , , d ) [17]. Indeed, the following applies:
  • If a = [ 0 , 0 , 6.52 , 6.52 , 6.52 , 6.52 , 6.52 , 6.52 , 6.52 , 6.52 ] T (i.e., type A), we have S 1 = S 2 = 0.39 , S j = 0.0069 , j > 2 , S T 1 = S T 2 = 0.54 , and S T j = 0.013 , j > 2 . We have d 0 = 2 , as j = 1 d ( S j + S T j ) = 2.01 . Moreover, we have S 1 + S T 2 1 , suggesting that we should only include X 1 and X 2 in our emulator;
  • If a = [ 50 , 50 , 50 , 50 , 50 , 50 , 50 , 50 , 50 , 50 ] T (i.e., type B), we have S j = S T j = 0.1 and j { 1 , 2 , , d } , leading to d 0 = 1 ;
  • If a = [ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ] T (i.e., type C), we have S j = 0.02 and S T j = 0.27 , j { 1 , 2 , , d } , and we can check that d 0 = 4 and that all the inputs are important. Thus, we have to include a lot of ANOVA components in our emulator with small effective effects since the variance of that function is fixed. More information is needed to design the structure of this function better.
Figure 2 and Figure 3 depict the predictions versus the simulated outputs (i.e., observations) of the g-function of type A and type B, respectively, for 500 sample points. Even though we obtain quasi-perfect predictions in the case of the g-function of type B, those of type A face some difficulties in predicting small and null values.

5. Application: Heat Diffusion Models with Stochastic Initial Conditions

We deployed our emulators to approximate a high-dimensional model defined by the one-dimensional (1-D) diffusion PDE with stochastic initial conditions, that is,
f t D 2 f 2 x = 0 , x ] 0 , 1 [ , t [ 0 , T ] f ( x , t = 0 ) = Z ( x ) x [ 0 , 1 ] f ( x = 0 , t ) = 0 , f ( x = 1 , t ) = 1 , t [ 0 , T ] ,
where D = 0.0011 represents the diffusion coefficient. The quantity of interest (QoI) is given by J ( Z ( x ) ) : = 1 2 0 T 0 10 f ( x , t ) 2 d x d t . The spatial discretization consists of subdividing the spatial domain, [ 0 , 1 ] , into d equally-sized cells, leading to d initial conditions or inputs, that is, Z ( x j ) with j = 1 , , d . We assume that the d = 50 inputs are independent, and X j : = Z ( x j ) U sin ( 2 π x j ) 1.96 , sin ( 2 π x j ) + 1.96 . A time step of 0.025 was considered, starting from 0 up to T = 5 .
It is known [12] that the exact gradient can be computed as follows: Z J ( Z ( x ) ) = f a ( x , 0 ) , where f a ( x , 0 ) stands for the adjoint model of f evaluated at ( x , t = 0 ) . Note that only one evaluation of such a function is needed to obtain the gradient of the QOI. The adjoint model is given by (see ref. [12])
f a t D 2 f a 2 x = f , x ] 0 , 1 [ , t [ 0 , T ] f a ( x = 0 , t ) = f a ( x = 1 , t ) = 0 , t [ 0 , T ] f a ( x , T ) = 0 , x [ 0 , 1 ] .
The values of the hyper-parameters derived in this paper (considered at the beginning of Section 4) were used to compute the results below. Using the exact values of the gradient (i.e., f a ( x , 0 ) ), we computed the main indices ( S j s) and the upper bounds of the total indices ( U B j s) (see Figure 4, top-left panel). It appears that the upper bounds are almost equal to the main indices, showing the absence of interactions. This information is confirmed by the fact that j = 1 d = 50 S j = 1.09 , leading to d 0 = 1 . Based on this information, Figure 4 (top-right panel) depicts the predictions versus the observations (simulated outputs) using the derivative-based emulator with all the first-order partial derivatives. In the same sense, Figure 4 (bottom-left panel) depicts the observations versus predictions by including only the ANOVA components for which U B j > 0.01 , that is, 37 components. Both results are close together and are in line with the observations. Finally, Figure 4 (bottom-right panel) shows the observations versus predictions for derivative-free emulators using only the components for which U B j > 0.01 . It turns out that our emulators provide reliable estimations. As expected (see MSEs), the derivative-based emulator using exact values of derivatives performs better.

6. Conclusions

In this paper, we have proposed simple, practical, and sound emulators of simulators or estimators of functions using either the available derivatives and distribution functions of inputs or derivative-free methods, such as stochastic approximations. Since our emulators or estimators rely exactly on Db-ANOVA, Sobol’s indices and their upper bounds were used to derive the appropriate structures of such emulators so as to reduce epistemic uncertainty. The derivative-based and derivative-free emulators reach the parametric rate of convergence (i.e., O ( N 1 ) ) and have dimension-free biases. Moreover, the former emulators enjoy dimension-free MSEs when all cross-partial derivatives are available and, therefore, can cope with higher-dimensional models. However, the MSEs of the derivative-free estimators depend on dimensionality, and we have shown that the stability and accuracy of such emulators require about N ( d + 1 ) d model runs for full emulators and about N min d 2 d 0 , 2 d d d 0 runs for unbiased, truncated emulators of order d 0 .
To be able to deploy our emulators in practice, we have provided the best known values for the hyper-parameters needed. The numerical results have revealed that our emulators provide efficient predictions of models once the adequate structures of such models are used. While such results are promising, further improvements are going to be investigated in our next works by i) considering distributions of V j s that may help in reducing the dimensionality in MSEs, ii) taking into account the discrepancy errors by using the output observations (rather than their mean only), and iii) considering local emulators. It is also worth investigating adaptations of such methods in the presence of empirical data.

Funding

This research received no external funding, except the APC funded by stats-MDPI.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No empirical datasets are used in this paper. All simulated data are already in the paper.

Acknowledgments

We would like to thank the three reviewers for their suggestions and comments that have helped improve our manuscript.

Conflicts of Interest

The author has no conflicts of interest to declare regarding the content of this paper.

Appendix A. Derivations of Unbiased Truncations (Proposition 1)

Keeping in mind the Sobol indices, it is known that v { 1 , , d } | v | > 0 S v = 1 , which comes down to v { 1 , , d } | v | d 0 S v = 1 for functions of the form f ( X ) = v { 1 , , d } | v | d 0 f v ( X v ) . Thus, we have
d 0 = v { 1 , , d } | v | d 0 d 0 S v , j = 1 d S T j = v { 1 , , d } | v | d 0 | v | S v .
Taking the difference yields
d 0 j = 1 d S T j = v { 1 , , d } | v | d 0 1 ( d 0 | v | ) S v d 0 j = 1 d S T j ( d 0 1 ) j = 1 d S j = v { 1 , , d } 2 | v | d 0 1 ( d 0 | v | ) S v 0
which implies that j = 1 d S T j + ( d 0 1 ) j = 1 d S j d 0 .
Additionally, taking d 0 ( d 0 1 ) j = 1 d S T j = v { 1 , , d } | v | d 0 [ d 0 ( d 0 1 ) | v | ] S v yields
d 0 ( d 0 1 ) j = 1 d S T j j = 1 d S j = v { 1 , , d } 2 | v | d 0 [ d 0 ( d 0 1 ) | v | ] S v 0 ,
which implies ( d 0 1 ) j = 1 d S T j + j = 1 d S j d 0 .

Appendix B. Proof of Theorem 1

Denote ı : = ( i 1 , , i d ) N d , | | ı | | 1 = i 1 + + i d , ı ! : = i 1 ! i d ! , x ı : = x 1 i 1 x d i d , and D ( ı ) f : = 1 i 1 d i d f . The Taylor expansion of f x + β h V about x of order α is given by
f x + β h V = r = 0 α | | ı | | 1 = r D ( ı ) f ( x ) ı ! β r h V ı + O | | β h V | | 1 α + 1 .
For any w { 1 , , d } with the cardinality | w | ; using 𝟙 w ( · ) for the indicator function and w : = 𝟙 w ( 1 ) , , 𝟙 w ( d ) lead to D ( w ) f = D | w | f . Additionally, using E k : = G k X k 𝟙 X k x k g k X k implies that R k = V k h k σ 2 E k , k = 1 , , d .
First, by evaluating the above expansion at X and taking the expectation with respect to V , A : = = 1 L C ( | u | ) E V f X + β h V e u ( 1 : d ) R x , X , V can be written as
A = r 0 | | ı | | 1 = r D ( ı ) f ( X ) ı ! C ( | u | ) β r E V h V ı w { 1 , , d } | w | = | u | V h σ 2 E w = r 0 | | ı | | 1 = r w { 1 , , d } | w | = | u | D ( ı ) f ( X ) C ( | u | ) β r ı ! σ 2 | u | E V V ı + w h ı w E w
We can see that E V V ı + w h ı w 0 iff ı + w = 2 q , q N d . Equation ı + w = 2 q implies i k = 2 q k 0 if k w ; otherwise, i k = 2 q k 1 0 . The last quantity is equivalent to i k = 2 q k + 1 when k w with q k N , and it leads to ı = 2 q + w , q N d , which also implies that | | ı | | 1 | | w | | 1 = | u | .
When | | q | | 1 = 0 , we have ı = w ; D ( ı ) f = D w f and E V ı + w h ı w = E V 2 w = σ 2 | u | , by independence. Thus, we have
A = w { 1 , , d } | w | = | u | D | w | f ( X ) C ( | u | ) β | u | k w E k + s 1 | | q | | 1 = s w { 1 , , d } | w | = | u | × D ( 2 q + w ) f ( X ) C ( | u | ) β | u | + 2 s ( 2 q + w ) ! σ 2 | u | E V 2 ( q + w ) h 2 q E w ,
using the change of variable 2 s = r | u | . At this point, the setting L = 1 , β = 1 and C ( | u | ) = 1 gives the approximation of w { 1 , , d } | w | = | u | D | w | f ( X ) k w G k X k 𝟙 X k x k g k X k of order O ( h 2 2 ) for all u { 1 , , d } .
Second, taking the expectation with respect to X and the sum over | u | = 1 , , d , we obtain the result, that is,
| u | = 1 d E X A = | u | = 1 d w { 1 , , d } | w | = | u | E X D | w | f ( X ) k w G k X k 𝟙 X k x k g k X k + O h 2 2 ,
bearing in mind Equation (1).
Third, we can increase this order up to O h 2 2 L by using the constraints = 1 L C ( | u | ) β 2 s + | u | = δ 0 , s s = 0 , 1 , , L to eliminate some higher-order terms. Thus, the order O ( h 2 2 L ) is reached. Other constraints can lead to the order O ( h 2 2 α | u | ) with α | u | = 1 , , L .

Appendix C. Proof of Lemma 1

Recall that E k = G k X k 𝟙 X k x k g k X k . By using the definition of the absolute value and the fact that 0 G ( x ) 1 , we can check that E E k 1 2 ρ min . Additionally, using the following inequality (see Lemma 1 in ref. [10])
M | w | 2 γ 0 M 0 1 | w | / d M d | w | / d = 2 γ 0 M 0 M d M 0 | w | / d ,
for a given γ 0 , we can write
D d 0 , ρ min : = w { 1 , , d } | w | > d 0 M | w | 1 2 ρ min | w | 2 γ 0 M 0 w { 1 , , d } | w | > d 0 1 2 ρ min M d M 0 1 / d | w | = 2 γ 0 M 0 w { 1 , , d } 1 2 ρ min M d M 0 1 / d | w | w { 1 , , d } | w | d 0 1 2 ρ min M d M 0 1 / d | w | 2 γ 0 M 0 = 0 d d 1 2 ρ min M d M 0 1 / d = 0 d 0 d 0 1 2 ρ min M d M 0 1 / d = 2 γ 0 M 0 1 2 ρ min M d M 0 1 / d + 1 d 1 2 ρ min M d M 0 1 / d + 1 d 0 = 2 γ 0 M 0 1 2 ρ min M d M 0 1 / d + 1 d 0 1 2 ρ min M d M 0 1 / d + 1 d d 0 1 ,
and the result holds.

Appendix D. Proof of Theorem 2

Recall that R k = V k h k σ 2 E k with E k : = G k X k 𝟙 X k x k g k X k , k = 1 , , d . Using
A 1 : = p = 1 d 0 = 1 L C ( p ) E f X + β h V e p ( 1 : d ) R x , X , V ,
the bias B : = A 1 f c ( x ) becomes
B = w { 1 , , d } 0 < | w | d 0 = 1 L C ( | w | ) E f X + β h V k w V k h k σ 2 E k E X D | w | f ( X ) k w E k w { 1 , , d } | w | > d 0 E X D | w | f ( X ) k w E k = w { 1 , , d 0 } | w | > 0 E X k w E k = 1 L C ( | w | ) E V f X + β h V k w V k h k σ 2 D | w | f ( X ) w { 1 , , d } | w | > d 0 E X D | w | f ( X ) k w E k .
Note that the quantity = 1 L C ( | w | ) E V f X + β h V k w V k h k σ 2 D | w | f ( X ) has been investigated in ref. [13]. To make use of such results in our context given by Equation (5), let L w : = r * | w | 2 + L r * 𝟙 | w | r * + ( L r * 1 ) 𝟙 | w | > r * . Thus, we have
D | w | f ( x ) = 1 L C ( | w | ) E f x + β h V k w V k h k σ 2 σ 2 L w M | w | + 2 L w K 1 , L w h 2 2 L w .
When V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , then
D | w | f ( x ) = 1 L C ( | w | ) E f x + β h V k w V k h k σ 2 M | w | + 2 L w ξ 2 L w | | h 2 | | 1 L w Γ | w | + 2 L w .
Using Equation (A1); g k > ρ min and the fact that E E k 1 2 ρ min , we can write
| B | w { 1 , , d } 0 < | w | d 0 σ 2 L w M | w | + 2 L w K 1 , L w h 2 2 L w k w E E k + w { 1 , , d } | w | > d 0 M | w | k w E E k w { 1 , , d } 0 < | w | d 0 σ 2 L w M | w | + 2 L w K 1 , L w h 2 2 L w k w 1 2 ρ min + w { 1 , , d } | w | > d 0 M | w | k w 1 2 ρ min ,
where M | w | = D | w | f . The results hold by using Lemma 1.

Appendix E. Proof of Corollary 1

First, keeping in mind Theorem 2, we can see that the smallest value of L w is L r * 1 , which is reached when | w | > r * . Thus, the bias verifies
B h 2 2 L r * 1 σ 2 ( L r * 1 ) w { 1 , , d } r * < | w | d 0 K w , ( L r * 1 ) M | w | + 2 ( L r * 1 ) 1 2 ρ min | w | + D d 0 , ρ min + O h 2 2 L r * 1 .
Second, using K 1 , r * , d 0 max , we can write
A 3 : = w { 1 , , d } r * < | w | d 0 K w , ( L r * 1 ) M | w | + 2 ( L r * 1 ) 1 2 ρ min | w | K 1 , r * , d 0 max ı = r * + 1 d 0 d ı 1 2 ρ min ı = 2 K 1 , d 0 max ρ min d 2 ρ min r * + 1 d 2 ρ min d 0 r * 1 d 2 ρ min ,
because d ı d ı and ı = r * + 1 d 0 d ı 1 2 ρ min ı ı = r * + 1 d 0 d 2 ρ min ı = d 2 ρ min r * + 1 d 2 ρ min d 0 r * 1 d 2 ρ min 1 .
Finally, if V k U ( ξ , ξ ) with ξ > 0 and k = 1 , , d , the following bias is used to derive the result:
B | | h 2 | | 1 L r * 1 ξ 2 ( L r * 1 ) w { 1 , , d } r * < | w | d 0 M | w | + 2 ( L r * 1 ) Γ | w | + 2 ( L r * 1 ) 1 2 ρ min | w | + D d 0 , ρ min + O | | h 2 | | 1 L r * 1 .

Appendix F. Proof of Corollary 2

For r * < | w | d 1 , we can see that L w = 1 , and the order of approximation in Corollary 1 becomes O h 2 2 or O | | h 2 | | 1 because | w | + 2 d + 1 and L = r * + 2 . When | w | = d , the smallest order is obtained thanks to ref. [13], Corollary 2.

Appendix G. Proof of Theorem 3

For the variance of our emulator, we can write
V f N , d 0 c ^ ( x ) = 1 N V p = 1 d 0 = 1 L C ( p ) f X + β h V e p ( 1 : d ) R x , X , V 1 N E p = 1 d 0 = 1 L C ( p ) f X + β h V e p ( 1 : d ) R x , X , V 2 1 N E p = 1 d 0 w { 1 , , d } | w | = p k w V k h k σ 2 E k = 1 L C ( p = | w | ) f X + β h V 2 .
Using A 2 : = = 1 L C ( | w | ) f X + β h V , and keeping in mind Equation (5), we can write
A 2 = = 1 L C ( | w | ) f X + β h V r = 0 min ( r * , | w | 1 ) | | ı | | 1 = r D ( ı ) f ( X ) ı ! β r h V ı M min ( r * , | w | 1 ) + 1 h V 2 min ( r * , | w | 1 ) + 1 = 1 L C ( | w | ) β min ( r * , | w | 1 ) + 1 ,
because f H | w | + 1 . Keeping in mind that Γ min ( r * , | w | 1 ) + 1 = = 1 L C ( | w | ) β min ( r * , | w | 1 ) + 1 and using ϝ r * , | w | : = M min ( r * , | w | 1 ) + 1 Γ min ( r * , | w | 1 ) + 1 , the variance becomes
V f N , d 0 c ^ ( x ) 1 N E w { 1 , , d } 0 < | w | d 0 ϝ r * , | w | h V 2 min ( r * , | w | 1 ) + 1 k w V k h k σ 2 E k 2 ϝ r * , d 0 max 2 N E w { 1 , , d } 0 < | w | d 0 k w V k E k h k σ 2 h V 2 min ( r * , | w | 1 ) + 1 | w | 2 = ϝ r * , d 0 max 2 N E w { 1 , , d } 0 < | w | d 0 k w V k E k h k σ 2 h V 2 min ( r * , | w | 1 ) + 1 | w | 2 ,
because E [ V k ] = 0 , E V k h V 2 min ( r * , | w | 1 ) + 1 | w | = 0 and when w w ,
E k w V k E k h k σ 2 h V 2 min ( r * , | w | 1 ) + 1 | w | w V E h σ 2 h V 2 min ( r * , | w | 1 ) + 1 | w | = 0 .
For the second result, by expanding E k 2 and knowing that 0 G ( x ) 1 , we can see that E [ E k 2 ] 1 3 ρ min 2 . Additionally, using Z k = V k / σ and the fact that h k = h , r * = d 0 1 and | w | d 0 , we have
V f N , d 0 c ^ ( x ) ϝ d 0 max 2 N w { 1 , , d } 0 < | w | d 0 k w E V 1 2 V 2 2 σ 4 E [ E k 2 ] ϝ d 0 max 2 N w { 1 , , d } 0 < | w | d 0 k w E Z 1 2 Z 2 2 E [ E k 2 ] ϝ d 0 max 2 N w { 1 , , d } 0 < | w | d 0 E Z 1 2 Z 2 2 3 ρ min 2 | w | = ϝ d 0 max 2 N p = 1 d 0 d p E Z 1 2 Z 2 2 3 ρ min 2 p ϝ d 0 max 2 N d E Z 1 2 Z 2 2 d E Z 1 2 Z 2 2 3 ρ min 2 d E Z 1 2 Z 2 2 3 ρ min 2 d 0 1 .
Finally, when E Z 1 2 Z 2 2 3 ρ min 2 and knowing that p = 1 d 0 d p p = 1 d 0 d p = d d 0 1 d 1 d 2 d d 0 , we can write
V f N , d 0 c ^ ( x ) ϝ d 0 max 2 N p = 1 d 0 d p E Z 1 2 Z 2 2 3 ρ min 2 p ϝ d 0 max 2 N E Z 1 2 Z 2 2 3 ρ min 2 d 0 p = 1 d 0 d p = 2 ϝ d 0 max 2 N d E Z 1 2 Z 2 2 3 ρ min 2 d 0 .
Additionally, note that p = 1 d 0 d p 2 d .

Appendix H. Proof of Corollary 3

Since r * = d 0 1 and | w | d 0 , we have K 1 , ρ min , d 0 1 = d d 0 1 2 ρ min d 0 . The first result is obvious using the variance of the emulator provided in Theorem 3 and the bias from Corollary 1.
The second result is due to the fact that when G j = F j , the terms in the Db expansion of f are L 2 -orthogonal.

Appendix I. Proof of Corollary 5

As V k U ( ξ , ξ ) , then Z k U ( 3 , 3 ) and E Z 1 2 Z 2 2 = d + 4 / 5 . Thus, the first result holds.
For the second result, taking ρ min = c 0 d ( d + 0.8 ) 3 with 1 < c 0 2 yields
V f N , d 0 c ^ ( x ) Σ ¯ d 0 = ϝ d 0 max 2 N 1 ( 1 / c 0 ) d 0 c 0 1 ϝ d 0 max 2 N ( c 0 1 ) .

Appendix J. Proof of Corollary 6

Using Equation (A3) and the fact that $r^* = d - 1$ and $d_0 = d$, we can check that
$$ \mathbb{V}\left[\widehat{f^{c}_{N,d_0}}(x)\right] \le \frac{\left(\digamma^{\max}_{d_0}\right)^2}{N} \left[\left(\frac{\mathbb{E}\left[Z_1^2 \left\|Z\right\|_2^2\right]}{3\rho_{\min}^2} + 1\right)^{d} - 1\right] \le \overline{\Sigma}_d, $$
with
$$ \overline{\Sigma}_d := \frac{d \left(\digamma^{\max}_{d_0}\right)^2 \mathbb{E}\left[Z_1^2 \left\|Z\right\|_2^2\right]}{N \left(d\, \mathbb{E}\left[Z_1^2 \left\|Z\right\|_2^2\right] - 3\rho_{\min}^2\right)} \left[\left(\frac{d\, \mathbb{E}\left[Z_1^2 \left\|Z\right\|_2^2\right]}{3\rho_{\min}^2}\right)^{d} - 1\right]. $$
Thus, the results hold using Corollaries 2 and 3.
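The key identity behind the first display of this appendix is the binomial theorem, $\sum_{p=1}^{d} \binom{d}{p} x^p = (1+x)^d - 1$, applied with $x = \mathbb{E}\left[Z_1^2 \|Z\|_2^2\right]/(3\rho_{\min}^2)$; a minimal check with illustrative values:

```python
from math import comb

# Check of the binomial identity: sum_{p=1}^{d} C(d,p) x^p = (1 + x)^d - 1.
d, x = 8, 0.37   # illustrative values
lhs = sum(comb(d, p) * x**p for p in range(1, d + 1))
rhs = (1 + x) ** d - 1
print(lhs, rhs)  # equal up to floating-point round-off
```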

References

1. Morris, M.D.; Mitchell, T.J.; Ylvisaker, D. Bayesian Design and Analysis of Computer Experiments: Use of Derivatives in Surface Prediction. Technometrics 1993, 35, 243–255.
2. Solak, E.; Murray-Smith, R.; Leithead, W.; Leith, D.; Rasmussen, C. Derivative observations in Gaussian process models of dynamic systems. Adv. Neural Inf. Process. Syst. 2002, 15, 1–8.
3. Le Dimet, F.X.; Talagrand, O. Variational algorithms for analysis and assimilation of meteorological observations: Theoretical aspects. Tellus A Dyn. Meteorol. Oceanogr. 1986, 38, 97–110.
4. Le Dimet, F.X.; Ngodock, H.E.; Luong, B.; Verron, J. Sensitivity analysis in variational data assimilation. J. Meteorol. Soc. Jpn. 1997, 75, 245–255.
5. Cacuci, D.G. Sensitivity and Uncertainty Analysis—Theory; Chapman & Hall/CRC: Boca Raton, FL, USA, 2005.
6. Gunzburger, M.D. Perspectives in Flow Control and Optimization; SIAM: Philadelphia, PA, USA, 2003.
7. Borzì, A.; Schulz, V. Computational Optimization of Systems Governed by Partial Differential Equations; SIAM: Philadelphia, PA, USA, 2012.
8. Ghanem, R.; Higdon, D.; Owhadi, H. Handbook of Uncertainty Quantification; Springer International Publishing: Berlin/Heidelberg, Germany, 2017.
9. Agarwal, A.; Dekel, O.; Xiao, L. Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback. In Proceedings of the 23rd Annual Conference on Learning Theory (COLT), Haifa, Israel, 27–29 June 2010; pp. 28–40.
10. Bach, F.; Perchet, V. Highly-Smooth Zero-th Order Online Optimization. In Proceedings of the 29th Annual Conference on Learning Theory, New York, NY, USA, 23–26 June 2016; Volume 49, pp. 257–283.
11. Akhavan, A.; Pontil, M.; Tsybakov, A.B. Exploiting higher order smoothness in derivative-free optimization and continuous bandits. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS ’20), Vancouver, BC, Canada, 6–12 December 2020.
12. Lamboni, M. Optimal and Efficient Approximations of Gradients of Functions with Nonindependent Variables. Axioms 2024, 13, 426.
13. Lamboni, M. Optimal Estimators of Cross-Partial Derivatives and Surrogates of Functions. Stats 2024, 7, 1–22.
14. Chkifa, A.; Cohen, A.; DeVore, R.; Schwab, C. Sparse adaptive Taylor approximation algorithms for parametric and stochastic elliptic PDEs. ESAIM Math. Model. Numer. Anal. 2013, 47, 253–280.
15. Patil, P.; Babaee, H. Reduced-Order Modeling with Time-Dependent Bases for PDEs with Stochastic Boundary Conditions. SIAM/ASA J. Uncertain. Quantif. 2023, 11, 727–756.
16. Sobol, I.M.; Kucherenko, S. Derivative based global sensitivity measures and the link with global sensitivity indices. Math. Comput. Simul. 2009, 79, 3009–3017.
17. Kucherenko, S.; Rodriguez-Fernandez, M.; Pantelides, C.; Shah, N. Monte Carlo evaluation of derivative-based global sensitivity measures. Reliab. Eng. Syst. Saf. 2009, 94, 1135–1148.
18. Lamboni, M.; Iooss, B.; Popelin, A.L.; Gamboa, F. Derivative-based global sensitivity measures: General links with Sobol’ indices and numerical tests. Math. Comput. Simul. 2013, 87, 45–54.
19. Roustant, O.; Fruth, J.; Iooss, B.; Kuhnt, S. Crossed-derivative based sensitivity measures for interaction screening. Math. Comput. Simul. 2014, 105, 105–118.
20. Roustant, O.; Barthe, F.; Iooss, B. Poincaré inequalities on intervals—Application to sensitivity analysis. Electron. J. Stat. 2017, 11, 3081–3119.
21. Lamboni, M. Derivative-based integral equalities and inequality: A proxy-measure for sensitivity analysis. Math. Comput. Simul. 2021, 179, 137–161.
22. Lamboni, M. Weak derivative-based expansion of functions: ANOVA and some inequalities. Math. Comput. Simul. 2022, 194, 691–718.
23. Lamboni, M.; Kucherenko, S. Multivariate sensitivity analysis and derivative-based global sensitivity measures with dependent variables. Reliab. Eng. Syst. Saf. 2021, 212, 107519.
24. Krige, D.G. A Statistical Approach to Some Basic Mine Valuation Problems on the Witwatersrand. J. Chem. Metall. Min. Soc. S. Afr. 1951, 52, 119–139.
25. Currin, C.; Mitchell, T.; Morris, M.; Ylvisaker, D. Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments. J. Am. Stat. Assoc. 1991, 86, 953–963.
26. Oakley, J.E.; O’Hagan, A. Probabilistic sensitivity analysis of complex models: A Bayesian approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 2004, 66, 751–769.
27. Conti, S.; O’Hagan, A. Bayesian emulation of complex multi-output and dynamic computer models. J. Stat. Plan. Inference 2010, 140, 640–651.
28. Kennedy, M.C.; O’Hagan, A. Bayesian calibration of computer models. J. R. Stat. Soc. Ser. B Stat. Methodol. 2001, 63, 425–464.
29. Xiu, D.; Karniadakis, G. The Wiener–Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput. 2002, 24, 619–644.
30. Ghanem, R.G.; Spanos, P.D. Stochastic Finite Elements: A Spectral Approach; Springer: New York, NY, USA, 1991; pp. 1–214.
31. Sudret, B. Global sensitivity analysis using polynomial chaos expansions. Reliab. Eng. Syst. Saf. 2008, 93, 964–979.
32. Wahba, G. Spline Models for Observational Data; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1990.
33. Wong, R.K.W.; Storlie, C.B.; Lee, T.C.M. A Frequentist Approach to Computer Model Calibration. J. R. Stat. Soc. Ser. B Stat. Methodol. 2017, 79, 635–648.
34. Friedman, J.H.; Popescu, B.E. Predictive learning via rule ensembles. Ann. Appl. Stat. 2008, 2, 916–954.
35. Horiguchi, A.; Pratola, M.T. Estimating Shapley Effects in Big-Data Emulation and Regression Settings using Bayesian Additive Regression Trees. arXiv 2024, arXiv:2304.03809.
36. Migliorati, G.; Nobile, F.; von Schwerin, E.; Tempone, R. Analysis of Discrete L2 Projection on Polynomial Spaces with Random Evaluations. Found. Comput. Math. 2014, 14, 419–456.
37. Hampton, J.; Doostan, A. Coherence motivated sampling and convergence analysis of least squares polynomial chaos regression. Comput. Methods Appl. Mech. Eng. 2015, 290, 73–97.
38. Cohen, A.; Davenport, M.A.; Leviatan, D. On the stability and accuracy of least squares approximations. arXiv 2018, arXiv:1111.4422.
39. Tsybakov, A. Introduction to Nonparametric Estimation; Springer: New York, NY, USA, 2009.
40. Zemanian, A. Distribution Theory and Transform Analysis: An Introduction to Generalized Functions, with Applications; Dover Books on Advanced Mathematics; Dover Publications: Garden City, NY, USA, 1987.
41. Strichartz, R. A Guide to Distribution Theory and Fourier Transforms; Studies in Advanced Mathematics; CRC Press: Boca Raton, FL, USA, 1994.
42. Caflisch, R.E.; Morokoff, W.J.; Owen, A.B. Valuation of mortgage-backed securities using Brownian bridges to reduce effective dimension. J. Comput. Financ. 1997, 1, 27–46.
43. Owen, A. The dimension distribution and quadrature test functions. Stat. Sin. 2003, 13, 1–17.
44. Rabitz, H.; Aliş, Ö.F. General foundations of high dimensional model representations. J. Math. Chem. 1999, 25, 197–233.
45. Kuo, F.; Sloan, I.; Wasilkowski, G.; Woźniakowski, H. On decompositions of multivariate functions. Math. Comput. 2010, 79, 953–966.
46. Alatawi, M.S.; Martinucci, B. On the Elementary Symmetric Polynomials and the Zeros of Legendre Polynomials. J. Math. 2022, 2022, 1–9.
47. Arafat, A.; El-Mikkawy, M. A Fast Novel Recursive Algorithm for Computing the Inverse of a Generalized Vandermonde Matrix. Axioms 2023, 12, 27.
48. Rawashdeh, E. A simple method for finding the inverse matrix of Vandermonde matrix. Mat. Vesnik 2019, 71, 207–213.
49. Homma, T.; Saltelli, A. Importance measures in global sensitivity analysis of nonlinear models. Reliab. Eng. Syst. Saf. 1996, 52, 1–17.
Figure 1. Predictions versus simulated outputs (observations) for Ishigami’s function.
Figure 2. Predictions versus simulated outputs (observations) for the g-function of type A.
Figure 3. Predictions versus simulated outputs (observations) for the g-function of type B.
Figure 4. Main indices (∘) and upper bounds of total indices (+) of d = 50 inputs (top-left panel); observations versus predictions using either derivative-based emulators (see the top-right panel when including all components and the bottom-left panel for all other cases) or derivative-free emulators (bottom-right panel).