A Novel Global Sensitivity Measure Based on Probability Weighted Moments

: Global sensitivity analysis (GSA) is a useful tool to evaluate the inﬂuence of input variables in the whole distribution range. Variance-based methods and moment-independent methods are widely studied and popular GSA techniques despite their several shortcomings. Since probability weighted moments (PWMs) include more information than classical moments and can be accurately estimated from small samples, a novel global sensitivity measure based on PWMs is proposed. Then, two methods are introduced to estimate the proposed measure, i.e., double-loop-repeated-set numerical estimation and double-loop-single-set numerical estimation. Several numerical and engineering examples are used to show its advantages.


Introduction
The purpose of sensitivity analysis (SA) is to determine which of the inputs play more significant roles in reducing the uncertainty in the model output. It is useful because it can rank variables, simplify models, establish priorities for research, and so on [1]. Generally, according to different analysis purposes, the available SA techniques can be classified into local SA (LSA) and global SA (GSA). GSA is a more widely used sensitivity analysis method because it does not rely on the choice of a nominal point and the information of the full distribution range is investigated.
In the past few decades, many GSA methods have been studied. Among them, two categories of methods are widely studied and employed: variance-based global sensitivity measures and moment-independent global sensitivity measures.
Variance-based global sensitivity measures, which are usually called Sobol' indices, were proposed and developed by Sobol' [2], Iman [3], Homma [4], and Saltelli [5]. Besides, there are several mature techniques available to solve the above indices, for instance, Monte Carlo simulation, quasi-Monte Carlo simulation, high-dimensional model representation (HDMR), Markovian integration [6], and the surrogate-assisted method [7,8]. Variance-based global sensitivity measures have been widely applied in various fields such as the spray-drying process [9], heap leaching [10], the rainfall-runoff model [11], and polymeric material [12]. However, the variance is not fully representative of uncertainty, especially in some problems where higher moments have more of an effect. Cox [13] and Huber [14] pointed out that "mean-variance decision-making violates the principle that a rational decisionmaker should prefer higher to lower probabilities of receiving a fixed gain, all else being equal".
To include the whole information of output distribution, moment-independent global sensitivity measures were proposed [15]. In this regard, Chun's method preliminarily needs some assumptions [15]. Although the relative entropy-based method can show a ranking of relative importance, the value does not have an absolute physical meaning [16]. Additionally, the delta index introduced by Borgonovo is the most popularly used among them [17]. Cui [18] extended the delta index to the failure probability to deal with reliability models. However, these methods have a relatively large computational cost due to the estimation of probability density functions.
There are other alternative GSA measures. Derivative-based global sensitivity measures were introduced by Kucherenko et al. [19], which can serve as an upper bound on the Sobol' total sensitivity index [20]. Fort et al. proposed goal-oriented sensitivity indices, which depend on the quantity of interest [21]. Kala proposed new quantile-oriented sensitivity indices based on measuring the distance between a quantile and the average value of the model output [22]. Baroni et al. presented an effective strategy for combining varianceand distribution-based global sensitivity analysis [23].
In this paper, we introduce a novel sensitivity measure based on probability weighted moments (PWMs). PWMs are popular in many science and engineering areas [24]. PWMs are usually used to estimate parameters of a distribution. In contrast with classical moments, high-order PWMs can be accurately estimated from small samples. In addition, PWMs are fairly insensitive to outliers because they are linear combinations of samples [25]. They can serve as constraints of the maximum entropy method to describe the distribution feature of the output, which are similar to classical moments [26]. So, PWMs can reflect the influence of uncertainty and be applied in GSA as well.
The paper continues with a review of dominating global sensitivity measure systems in Section 2. Section 3 provides a brief introduction of probability weighted moments and their applications. Section 4 proposes a new global sensitivity measure based on probability weighted moments. Section 5 develops two numerical estimation methods, i.e., double-loop-repeated-set Quasi Monte Carlo (QMC) and double-loop-single-set QMC, for numerically estimating the presented measure. Section 6 provides three numerical examples and one engineering example. Finally, conclusions are provided.

Review of Dominating Global Sensitivity Measure Systems
Consider the computational model under investigation, which is represented by Y = g(X), where Y is the model output of interest and X = (X 1 , X 2 , . . . , X n ) is the ndimensional vector of the model input.

Variance-Based Global Sensitivity Measures
By the probabilistic approach, the first-order Sobol' indices (named the Sobol' main indices) can be defined as: whereas the Sobol' total indices are defined as: where X ∼i means all factors except X i . The main index of X i represents the degree of uncertainty of the variability of the output response when the variable X i acts alone. The total index represents the total degree of the influence of X i on the output response variance, including the interaction of X i with all other variables. Variance-based importance measures have been widely applied due to model independence, incorporating the effect from the full range of variation of each input factor and reflecting the influence of each input and interaction among inputs as well as the capacity to tackle groups of input factors. However, variance-based indices are not sufficient to describe output variability when the model output has a highly skewed or fat-tailed distribution [27] because the variance of the output cannot be considered as fully representative of its statistical characteristics. Despite many advantages, such measures show some inevitable limitations.

Moment-Independent Global Sensitivity Measures
The delta index proposed by Borgonovo, which is based on the probability density function (PDF) of the model output, is applied most extensively [17]: where f Y (y) is the unconditional PDF and f Y|X i (y) is the conditional PDF of output Y. f Y|X i (y) can be obtained by fixing the input variable X i at a realization value. Similarly, the moment-independent global sensitivity measure δ i 1 ,i 2 ,...,i r of a group of inputs X i 1 , X i 2 , . . . , X i r is defined as follows: The delta index is also a global, quantitative, and model-free measure. Its momentindependent property is more important. However, the delta index focuses on estimating only one single effect, and separating main, total effect and interactions are not targeted [23].

Probability Weighted Moments
The PWM of a random variable was formally defined by Greenwood et al. [24] as: where r, s, t are real numbers; F denotes the cumulative distribution function (CDF); and x(F) is the inverse of the CDF (also called the quantile function). The two following forms of PWMs are commonly used: Type 1: and Type 2: Because these two forms can be converted to each other, we only consider Type 2 PWMs in the following equation. From an ordered random sample of size N, x 1 ≤ x 2 ≤ · · · ≤ x N , unbiased estimatesβ k of β k can be obtained as [26]: where k = 0, 1, . . . , (N − 1) are non-negative integers and the binomial coefficient is given as: . Equation (7) can also be written as: From Equation (10), the PWM can be computed by multiplying each observed element by a weight coefficient that is proportional to its probability. Hence, probability weighted moments usually contain more information than classical moments. When comparing a first-order PWM with a first-order classical moment, the magnitudes of the observed elements of the former are adjusted according to probabilities before calculating the average value [28]. A PWM is more robust because of the insensitivity to outliers and the required sample size is smaller.
For a non-negative random variable, β k can be interpreted as moments of the inverse of the CDF [26]. So, they can serve as constraints in the maximum entropy (MaxEnt) or the minimization cross-entropy [25] approach to estimate the inverse of the CDF. Generally, the estimation using the first four moments often produces an acceptable description. Taking the generalized Pareto distribution (GPD) as an example, its CDF and the inverse of the CDF are given in Appendix A. In the case of c = −0.2 and d = 1, the MaxEnt approximation combined with the first fourth-order PWMs of the Pareto quantile function is shown in Figure 1.

Global Sensitivity Measure Based on Probability Weighted Moments
As stated above, not only does a probability weighted moment contain more information than conventional moments, but also it can be accurately estimated from small samples. So, we propose a new global sensitivity measure based on PWMs. It is defined as: where k denotes the order of the PWMs, and the maximal order is usually chosen as 4. Conditional PWMs β k X ∼i (Y|X i ) are computed with X i at a fixed value and X ∼i over their full ranges. E X i β k X ∼i (Y|X i ) is calculated over all possible X i , since X i is uncertain and its true value is unknown.
To scale the above-mentioned importance measure within the interval [0,1], the final form of the measure based on PWMs can be constructed: where n is the number of input variables. (11) can be explained as the difference between the original PWM and the average of conditional PWMs. When the contribution of uncertainty from X i to the difference is bigger, the main measure is bigger as well. Then, we can establish the following properties of the novel global sensitivity measure: Property 2. η k i = 0 means the input variable X i has no effect on the output.
Proof. When the computational model does not include X i , the conditional distribution (11) is equal to 0 and η k i = 0.
Property 3. η k i = 1 denotes that only input X i affects the output.
Proof. According to property 2, ω k i of all variables except X i is separately equal to 0. So, we can obtain η k i = 1 by introducing the values of other variables into Equation (12).

Numerical Estimation for Global Sensitivity Measures
Two numerical methods for estimating η k i indices are proposed in this section. Doubleloop-repeated-set numerical estimators are presented in Section 5.1 and double-loop-singleset numerical estimators are presented in Section 5.2.

Double-Loop-Repeated-Set Numerical Estimators
The most direct method of estimating presented measures is to use the double-loop numerical estimation method. Equations (8) and (9) show the estimator of PWMs, so the key issue is how to estimate conditional PWMs. The main procedures are summarized as follows: Step 1: Generate N 1 samples of the variable vector x = (x 1 , x 2 , . . . , x n ) by the joint PDF f X (x) and calculate the corresponding responses Y as well as its kth-order PWM. These samples can be generated by Monte Carlo sampling, Latin hypercube sampling, quasi-Monte Carlo simulation with sampling based on Sobol' sequences [29,30], or other sampling techniques. Here, quasi-Monte Carlo simulation is recommended for higher and faster convergence.
Step 2: Generate N 2 samples of the input variable X i by its PDF f X i (x i ) and denote these samples as x i 1 , x i 2 , . . . , x i N 2 .
Step 3: The variable X i should be fixed at x i m (m = 1, 2, . . . N 2 ) individually and generate N 3 samples of the remaining variables according to the joint PDF (all variables are independent in this paper). So, the conditional PWM of responses β k X ∼i (Y|X i = x i m ) can be obtained, and this step needs to be repeated N 2 times.
Step 4: Calculate the expectation of all conditional PWMs, i.e., E X i β k X ∼i (Y|X i ) , and ω k i can be easily obtained using Equation (11). After finishing the calculation of ω k i of all variables, the normalized versions also can be evaluated in Equation (12).
However, for estimating the presented measure of each individual input, the total number of sampled points is N 1 + n × (N 2 + N 2 × N 3 ). We denote it as double-looprepeated-set QMC (DLRS QMC) because this method needs repeated sampling of inputs and outputs in each inner loop. It is obvious that the computational cost would be too heavy.

Double-Loop-Single-Set Numerical Estimators
We introduce another double-loop QMC method that needs only one set of samples of inputs and outputs for computing presented measures, which is denoted as double-loopsingle-set QMC (DLSS QMC).
Step 1: Generate 2 × N 1 samples of whole variables by the joint PDF f X (x) and assign half of these samples to a sample matrix A: Calculate the corresponding responses and estimate the kth-order PWM of Y.
Step 2: Matrix B is generated by the remaining samples, which is: Then, generate a new sample matrix B (p,i) A by assigning the (p,i)th component of A to the ith column of B: Step 3: Calculate the conditional PWM of responses β k X ∼i (Y|X i = x i m ), which is based on the sample matrix B (p,i) A .
Steps 4 and 5 are identical to those for DLRS QMC. The total number of sampled points is only (2 + n) × N 1 , contributing to a large reduction in sampling time compared to DLRS QMC. N 1 , N 2 , and N 3 are in the same order of magnitude, which is not more than 4000 usually. Hence, to reduce the calculation cost, the following examples are all solved by the DLSS QMC method.

Numerical Example 1: Linear Function with Normal Distribution Variables
Consider the linear function: Y = a 1 X 1 + a 2 X 2 + · · · + a n X n , (16) where x i (i = 1, 2, · · · , n) are independent and normally distributed, i.e., X i ∼ N µ i , σ 2 i . It is easy to calculate the values of Sobol' indices, which are: The PDF of the model response is also a normally distributed variable, a 2 i σ 2 i , and the conditional response Y|X i is also normally distributed, So, we can obtain the analytical expression of the components of presented measures in such a situation: where Φ −1 denotes the inverse of the CDF of the standard normal distribution. Table 1 lists several representative cases and corresponding examples in detail. Then, three importance measures, i.e., Sobol' indices, delta moment-independent indices, and measures based on PWMs, are calculated, as shown in Table 2 Adding the consideration of Case 5, measures mainly depend on variance. In addition, measures estimated by the DLRS QMC are very close to the analytical values when the sample size N is equal to 3000 or 4000, which is much smaller than the sample size of the sampling method used to estimate Sobol' indices or delta indices.

Numerical Example 2: Linear Function with Exponential Distribution Variables
Consider the linear function with four independent inputs: where X i (i = 1, 2, 3, 4) all obey the exponential distribution, i.e., X i ∼ Exp(λ = 1). From the results shown in Table 3, variables have the same influence on the output according to not only the Sobol' method but also the moment-independent method. However, measures based on PWMs are somewhat different, as presented in Table 4 and Figure 2. The variables that have absolutely the same coefficient have the same importance when ignoring the calculation and sampling error, meaning the measures can reflect the influence of positive and negative coefficients with the increase in order k.  When comparing Case 1 of Example 1 with Example 2, the former does not show much difference, although the coefficients are the same, because the PDF of the normal distribution is symmetric, while the PDF of the exponential distribution is asymmetric.

Numerical Example 3: Ishigami Function
Consider the Ishigami function [31], which is a highly nonlinear function with three inputs: Y = g(X) = sin(X 1 ) + 7 sin 2 (X 2 ) + 0.1x 4 3 sin(X 1 ), where X i (i = 1, 2, 3) are uniformly distributed on interval [−π, π]. Table 5 lists the Sobol' indices calculated by the analytical method and delta indices. Measures based on PWMs calculated by the DLRS QMC are listed in Table 6, and the corresponding figure follows. By analyzing the results, the importance rankings obtained by the main Sobol' indices, moment-independent indices, and new measures are the same, whereas discrepancy exists with the Sobol' total indices. This phenomenon can be explained by comparing Equation (1) with Equation (11). The definitions of the Sobol' main index and the presented measure have a similar form, thus leading to a certain relationship between them. Consider the roof truss structure shown in Figure 3a [32]. The top boom and other compression bars are built using reinforced concrete, and the bottom boom and other tension bars are built using steel. A uniformly distributed load q is applied to the roof structure. We transform the uniform load q into a point load P = ql/4, where l is the length of the steel bars, as shown in Figure 3b. By using the first-order elastic analysis of the deformation, the vertical displacement ∆ C of the top node C can be derived as: where A C and E C represent the sectional area and elastic modulus of concrete bars, respectively; A S and E S are the sectional area and elastic modulus of steel bars, respectively, which are also denoted in Figure 3b. Considering the serviceability criterion that the vertical displacement ∆ C should not exceed an admissible maximal deflection of 3 cm, the performance function can be constructed as Y = 0.03 − ∆ C . Then, we assume that six random inputs q, l, A C , E C , A S , and E S are independently and normally distributed. Their distribution parameters are given in Table 7.  Through computing four kinds of indices and rankings separately, as shown in Tables 8 and 9, all methods have similar ranks of importance. The results show that load q is the most important input among all input variables. Thus, regarding reliability design, q should be primarily considered. Besides the load, the sectional area of concrete bars as well as the sectional area and elastic modulus of steel bars are important. The length of steel bars and the elastic modulus of concrete bars should be considered as unimportant variables, which can be treated as constant values in the optimal design. In addition, this example shows that the required sample size of measures based on PWMs is the least one among the three methods.  To summarize, the importance ranking of presented measures is usually in accordance with that of moment-independent indices δ i when the PDF of the model output is symmetric or approximately symmetric. However, differences could exist if the PDF of the model output is asymmetric. In such cases, the sign of coefficients may also have an influence on PWM-based measures.

Conclusions
This paper introduced a novel global sensitivity measure based on PWMs. Since a PWM includes more information than a classical moment and can be accurately estimated from small samples, the new measure takes advantage of its properties. PWM-based measures can reflect the influence of input variables on output uncertainty when the output is obviously asymmetric. Subsequently, the double-loop-repeated-set QMC method and the double-loop-single-set QMC method were presented to estimate the new measure. The latter method is recommended because it needs fewer sample points. Three numerical examples were mainly used to compare the new measure with Sobol' indices and the moment-independent delta index, thus demonstrating its advantages. Finally, an engineering example of a roof truss structure showed the possibility of realistic application. Besides, the measure can be used to analyze problems of groups of input factors.
There is still some further work that needs to be done. Detailed functions of influence factors of measures based on PWMs, i.e., PDFs and coefficients, should be found. From all the examples, the delta importance measure and presented measures often have the same ranking. So, it is possible to find a method to link them because they both identify the influence of the PDF.