1. Introduction
Our work addresses environmental security [1,2], which is one of the key areas of concern worldwide. By definition, a sensitivity analysis (SA) is an investigation of how much of the uncertainty in the output results of a model can be apportioned to the uncertainty in its input data [3,4,5,6,7]; see Figure 1. Multidimensional SA [8,9,10,11,12] is a very challenging task in modeling, but it is often the key tool for studying a complex phenomenon [13,14,15,16,17].
The main problem in SA is the evaluation of the total sensitivity indices (SIs) [18,19,20,21]. The mathematical formulation for estimating the SIs is represented by a set of multidimensional integrals (MIs) [22,23,24,25]. The Monte Carlo (MC) and quasi-Monte Carlo (QMC) methods are among the most effective tools for evaluating the MIs [22,26,27,28,29,30,31]. For a clearer explanation of the objectives of this paper, see [32,33,34].
The input data for SA were obtained with a large-scale model of the long-range transport of air pollutants, the Unified Danish Eulerian Model (UNI-DEM) [35,36,37,38,39]. UNI-DEM is also an important basic tool for the creation of a digital twin, namely, Digital Air (see [40]).
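By definition, the model can be introduced with a model function [41]:
$$u = f(\mathbf{x}), \quad \mathbf{x} = (x_1, \ldots, x_d) \in U^d \equiv [0,1]^d,$$
where $\mathbf{x}$ is the vector of input parameters.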
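The concept of the Sobol approach consists of the following representation of $f(\mathbf{x})$ and a constant $f_0$ [41]:
$$f(\mathbf{x}) = f_0 + \sum_{\nu=1}^{d} \sum_{l_1 < \cdots < l_\nu} f_{l_1 \ldots l_\nu}(x_{l_1}, \ldots, x_{l_\nu}).$$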
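The above description is noted as the ANOVA representation of $f(\mathbf{x})$ if [41]:
$$\int_0^1 f_{l_1 \ldots l_\nu}(x_{l_1}, \ldots, x_{l_\nu})\, dx_{l_k} = 0, \quad 1 \le k \le \nu, \quad \nu = 1, \ldots, d.$$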
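The quantities
$$D = \int_{U^d} f^2(\mathbf{x})\, d\mathbf{x} - f_0^2, \qquad D_{l_1 \ldots l_\nu} = \int f_{l_1 \ldots l_\nu}^2\, dx_{l_1} \cdots dx_{l_\nu}$$
are called total and partial variances [41]. The total variance decomposes in the same way as the model function:
$$D = \sum_{\nu=1}^{d} \sum_{l_1 < \cdots < l_\nu} D_{l_1 \ldots l_\nu}.$$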
The Sobol global SIs [6,41] are determined by:
$$S_{l_1 \ldots l_\nu} = \frac{D_{l_1 \ldots l_\nu}}{D}, \quad \nu \in \{1, \ldots, d\}.$$
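Then, the total sensitivity index (TSI) of the input parameter $x_i$, $i \in \{1, \ldots, d\}$, is determined by [42]:
$$S_i^{tot} = S_i + \sum_{l_1 \neq i} S_{i l_1} + \sum_{\substack{l_1, l_2 \neq i \\ l_1 < l_2}} S_{i l_1 l_2} + \cdots + S_{i l_1 \ldots l_{d-1}},$$
where $S_{i l_1 \ldots l_{j-1}}$ is the $j$th-order SI for $x_i$ ($1 \le j \le d$).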
According to the definition in [43], when the order $\nu$ of the index $S_{l_1 \ldots l_\nu}$ equals 1, the quantity $S_{l_1}$ is called the main effect of $x_{l_1}$; if $\nu = 2$, $S_{l_1 l_2}$ are called two-way interactions (second-order SIs); if $\nu = 3$, $S_{l_1 l_2 l_3}$ are called three-way interactions (third-order SIs), and so on. In this study, we are interested in the main effects and the two-way interactions.
The TSI of the output variance for an input parameter 
 is represented as: 
; see [18]. With this, multidimensional SA using Sobol’s approach reduces to the problem of evaluating MIs [44].
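To make the link between SIs and integrals concrete, the following is a minimal sketch of a plain Monte Carlo estimate of first-order and total Sobol indices. It is not the estimator used in this paper: the function name `sobol_indices`, the pick-and-freeze sampling scheme, and the choice of the Saltelli (first-order) and Jansen (total) estimators are assumptions made for illustration.

```python
import random


def sobol_indices(f, d, n, seed=0):
    """Estimate first-order and total Sobol indices of f on [0,1]^d.

    Illustrative sketch: two independent sample sets A and B are drawn,
    and for each input i the i-th coordinate of A is replaced by that of B.
    """
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(d)] for _ in range(n)]
    B = [[rng.random() for _ in range(d)] for _ in range(n)]
    fA = [f(x) for x in A]
    fB = [f(x) for x in B]

    # Sample estimates of the mean f_0 and the total variance D.
    f0 = sum(fA) / n
    D = sum((y - f0) ** 2 for y in fA) / n

    first, total = [], []
    for i in range(d):
        # Model evaluated at A with only the i-th coordinate taken from B.
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Saltelli-type estimator of the partial variance D_i.
        Vi = sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fABi)) / n
        # Jansen-type estimator of the total variance contribution of x_i.
        Ti = sum((ya - yab) ** 2 for ya, yab in zip(fA, fABi)) / (2 * n)
        first.append(Vi / D)
        total.append(Ti / D)
    return first, total
```

For the additive toy function $f(x_1, x_2) = x_1 + 2 x_2$, the exact values are $S_1 = 0.2$ and $S_2 = 0.8$, and the total indices coincide with the first-order ones because there are no interactions; the estimates from the sketch should approach these values as `n` grows.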
In this paper, we aim to suggest implementations for the fast and accurate evaluation of the sensitivity indices. As mentioned earlier, the problem of SA is transformed into a multidimensional integration task that is approached by using novel quasi-Monte Carlo methods, which are compared with the best available algorithms in an application to large-scale modeling of atmospheric pollution. The methods are described in the next section, followed by a thorough analysis of the computational results.
  2. Methods and Algorithms
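Consider the following multidimensional problem:
$$I = \int_{U^s} f(\mathbf{x})\, d\mathbf{x},$$
where $\mathbf{x} = (x_1, \ldots, x_s) \in U^s \equiv [0,1]^s \subset \mathbb{R}^s$ and $f \in C(U^s)$.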
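In the simplest possible MC approach, “Crude”, we introduce the random variable $\theta = f(\xi)$, for which
$$E\theta = \int_{U^s} f(\mathbf{x})\, p(\mathbf{x})\, d\mathbf{x},$$
and the random points $\xi_1, \xi_2, \ldots, \xi_N$ are independent realizations of the random point $\xi$ with a probability density function $p(\mathbf{x})$ and $\theta_i = f(\xi_i)$, $i = 1, \ldots, N$. For the uniform density $p \equiv 1$ on $U^s$, $E\theta$ coincides with $I$. The Crude MC approach for the integral $I$ is defined as [22]:
$$\bar{\theta}_N = \frac{1}{N} \sum_{i=1}^{N} \theta_i.$$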
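The Crude MC estimate can be sketched in a few lines. This is a minimal illustration that assumes the uniform density on the unit cube; the function name `crude_mc` and its parameters are hypothetical.

```python
import random


def crude_mc(f, dim, n, seed=0):
    """Average f over n independent uniform random points in [0,1]^dim."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        xi = [rng.random() for _ in range(dim)]  # one realization of the random point
        acc += f(xi)                             # theta_i = f(xi_i)
    return acc / n                               # sample mean of the theta_i
```

For example, `crude_mc(sum, dim=4, n=20000)` approximates the integral of $x_1 + x_2 + x_3 + x_4$ over the four-dimensional unit cube, whose exact value is 2; the error decreases as $O(N^{-1/2})$ in probability.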
Let  for  and let  be the representation of  in base .
Then, the discrepancy (star discrepancy) of the set is defined as [22,24]:
      where 
.
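In one dimension, the star discrepancy of a finite point set has a well-known closed form, $D_N^* = \frac{1}{2N} + \max_{1 \le i \le N} \left| x_{(i)} - \frac{2i-1}{2N} \right|$, where $x_{(1)} \le \cdots \le x_{(N)}$ are the sorted points. The sketch below evaluates it; the function name `star_discrepancy_1d` is hypothetical, and the formula applies only to the one-dimensional case.

```python
def star_discrepancy_1d(points):
    """Exact star discrepancy of a finite point set in [0, 1]."""
    xs = sorted(points)
    n = len(xs)
    # enumerate is 0-based, so (2*i + 1) / (2n) is the midpoint of the (i+1)-th cell.
    worst = max(abs(x - (2 * i + 1) / (2.0 * n)) for i, x in enumerate(xs))
    return 1.0 / (2 * n) + worst
```

The centered grid $\{(2i-1)/(2N)\}$ attains the smallest possible value, $1/(2N)$, which gives a simple sanity check for any one-dimensional sequence.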
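For the one-dimensional quasi-random number sequence, we introduce the radical inverse sequence [22,23]:
$$n = \sum_{i=0}^{\infty} a_{i+1}(n)\, b^{i}, \qquad \phi_b(n) = \sum_{i=0}^{\infty} a_{i+1}(n)\, b^{-(i+1)},$$
where $a_{i+1}(n) \in \{0, 1, \ldots, b-1\}$ are the digits of $n$ in base $b$.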
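The van der Corput sequence [45] is obtained when $b = 2$. Now, the multidimensional quasi-random number sequence is defined as $X_n = (\phi_{b_1}(n), \phi_{b_2}(n), \ldots, \phi_{b_s}(n))$, where $\phi_{b_i}(n)$ is the radical inverse of $n$ in base $b_i$ and the bases $b_i$ are all relatively prime: $b_i = p_i$, where $p_i$ denotes the $i$th prime.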
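The radical inverse and the resulting multidimensional point can be sketched as follows. The function names are hypothetical, and the starting index of the sequence (here any $n \ge 0$) is an assumption, since implementations differ on whether the point for $n = 0$ is used.

```python
def radical_inverse(n, base):
    """Reflect the base-`base` digits of n about the radix point."""
    inv, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)  # peel off the least significant digit
        inv += digit * scale        # and place it after the radix point
        scale /= base
    return inv


def halton_point(n, bases):
    """n-th point whose coordinates are radical inverses in the given bases."""
    return tuple(radical_inverse(n, b) for b in bases)
```

With base 2 this reproduces the van der Corput values 0.5, 0.25, 0.75 for $n = 1, 2, 3$; using the first primes as bases, for instance `halton_point(n, (2, 3, 5))`, gives a three-dimensional point.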
Now, the Halton sequence [46,47] is defined as:
      where 
, 
 designates the 
i-th prime, and 
 denotes the set of permutations on 
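The standard M-dimensional Hammersley sequence [48], which is based on N samples, is simply composed of a first component of successive fractions $0/N, 1/N, \ldots, (N-1)/N$ paired with $M-1$ one-dimensional van der Corput sequences by using the first $M-1$ primes as bases. More precisely, if $p_1, \ldots, p_{s-1}$ are the first $s-1$ prime numbers, then the Hammersley sequence $x_0, \ldots, x_{N-1}$ with N points in s dimensions (here $s = M$) is given by
$$x_n = \left( \frac{n}{N}, \phi_{p_1}(n), \ldots, \phi_{p_{s-1}}(n) \right), \quad n = 0, \ldots, N-1,$$
where $\phi_p(n)$ denotes the radical inverse of $n$ in base $p$.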
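A minimal sketch of the Hammersley construction follows: the first coordinate is the fraction $n/N$, and the remaining coordinates are radical inverses in the first primes. The helper names (`first_primes`, `hammersley`) are hypothetical, and the convention $n = 0, \ldots, N-1$ is an assumption.

```python
def radical_inverse(n, base):
    """Reflect the base-`base` digits of n about the radix point."""
    inv, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)
        inv += digit * scale
        scale /= base
    return inv


def first_primes(count):
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def hammersley(n_points, dim):
    """All n_points points of the dim-dimensional Hammersley set."""
    bases = first_primes(dim - 1)
    return [
        (n / n_points,) + tuple(radical_inverse(n, p) for p in bases)
        for n in range(n_points)
    ]
```

Unlike the Halton construction, the point set depends on the total number of samples, so it has to be regenerated whenever the sample size changes.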
      
The Faure sequence [49,50,51] is given by:
The Sobol sequence [52,53,54] is defined by:
      where 
 are the set of permutations on every 
 subsequent points of the van der Corput sequence, defined by 
 when 
.
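In binary, for the Sobol sequence, we have that $x_n^{(k)} = a_1(n)\, v_1^{(k)} \oplus a_2(n)\, v_2^{(k)} \oplus \cdots$, where $a_i(n)$ are the binary digits of $n$, $\oplus$ denotes the bitwise exclusive-or, and $v_i^{(k)}$ is the set of direction numbers [55].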
For the QMC algorithms based on the Halton, Faure, and Sobol sequences, it is known that the corresponding discrepancy is:
$$D_N^* = O\!\left( \frac{\log^s N}{N} \right).$$
According to several important works [56,57,58,59], the convergence rate for the scrambling algorithms essentially improves upon the convergence rate for the unscrambled nets, which is $N^{-1} (\log N)^{s-1}$. The scrambling itself is based on the randomization of a single digit at each iteration. Let
$$\mathbf{x}^{(i)} = (x_1^{(i)}, x_2^{(i)}, \ldots, x_s^{(i)})$$
be quasi-random numbers in $[0,1)^s$, and let
$$\mathbf{z}^{(i)} = (z_1^{(i)}, z_2^{(i)}, \ldots, z_s^{(i)})$$
be the corresponding scrambled version of the point $\mathbf{x}^{(i)}$. Let every $x_j^{(i)}$ be rewritten in base b as
$$x_j^{(i)} = (0.x_{j1}^{(i)} x_{j2}^{(i)} \ldots x_{jK}^{(i)} \ldots)_b,$$
with K being the number of digits for scrambling. For the scrambled Halton sequence HaltonScr, we apply a permutation of the radical inverse coefficients, which is obtained by applying a reverse-radix operation to each of the possible coefficient values [60]. For the scrambled Sobol sequence SobolScr, we use a random linear scramble combined with a random digital shift [61].
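The idea of permuting radical inverse coefficients can be sketched as follows. This is an illustration only: the permutation used here is an arbitrary random one that keeps the digit 0 fixed, not the reverse-radix permutation of [60], and all function names are hypothetical.

```python
import random


def random_digit_permutation(base, rng):
    """Random permutation of the digits 0..base-1 that keeps 0 fixed.

    Keeping 0 fixed ensures that the infinitely many leading zero digits
    of n contribute nothing after scrambling.
    """
    nonzero = list(range(1, base))
    rng.shuffle(nonzero)
    return [0] + nonzero


def scrambled_radical_inverse(n, base, perm):
    """Radical inverse of n in which every digit a is replaced by perm[a]."""
    inv, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)
        inv += perm[digit] * scale
        scale /= base
    return inv


def scrambled_halton_point(n, bases, perms):
    """n-th point with one digit permutation per coordinate."""
    return tuple(
        scrambled_radical_inverse(n, b, perm) for b, perm in zip(bases, perms)
    )
```

Because each digit is mapped bijectively, the first $b^k$ scrambled values in a given base still occupy every interval of length $b^{-k}$ exactly once, so the one-dimensional stratification of the unscrambled sequence is preserved.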
Now, we introduce a super-convergent modified Sobol sequence, SobolBurkardt, based on the INSOBL and GOSOBL routines in ACM TOMS Algorithm 647 and ACM TOMS Algorithm 659, as well as a Burkardt modification [52,53,54,55,62,63,64,65,66,67]. The original code can only compute the next element of the sequence. Our modification allows the user to specify the index of the desired element. The novelty is that this is the first time that the SobolBurkardt algorithm has been applied to a multidimensional sensitivity analysis of this important digital ecosystem.
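The difference between computing only the next element and computing the element at a given index can be illustrated on the first coordinate of a Gray-code-ordered Sobol sequence. The sketch below is not the INSOBL/GOSOBL code or the modification described above: it assumes the trivial direction numbers $v_i = 2^{-i}$ of the first dimension, and all names are hypothetical.

```python
BITS = 32
# Direction numbers of the first Sobol dimension, stored as 32-bit integers:
# V[i] represents v_{i+1} = 2^{-(i+1)}.
V = [1 << (BITS - 1 - i) for i in range(BITS)]


def sobol_1d_sequential(count):
    """Generate the first `count` values, each one from its predecessor."""
    x, out = 0, []
    for n in range(count):
        out.append(x / 2.0 ** BITS)
        c = 0
        while (n >> c) & 1:  # position of the lowest zero bit of n
            c += 1
        x ^= V[c]            # one exclusive-or yields the next element
    return out


def sobol_1d_at(n):
    """Compute the n-th value directly from the Gray code of n."""
    gray = n ^ (n >> 1)
    x, i = 0, 0
    while gray:
        if gray & 1:
            x ^= V[i]
        gray >>= 1
        i += 1
    return x / 2.0 ** BITS
```

Both routines produce the same values (0, 0.5, 0.75, 0.25 for the first four indices), but only the second can jump to an arbitrary index without generating all preceding elements.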
  3. Results and Discussion
In this section, the advanced stochastic algorithms described above (Crude, Sobol, Halton, SobolScr, HaltonScr, SobolBurkardt, Hammersley, and Faure) are applied to sensitivity studies of the concentrations of UNI-DEM pollutants with respect to emission levels (SSREL) and with respect to some chemical reaction rates (SSRCRR) [68,69]. We use the following notations: EQ refers to the estimated quantity, RV refers to the reference value, RE refers to the relative error, and AE refers to the approximate evaluation.
For SSREL, we investigate the sensitivity of the model output (the mean monthly concentrations of several important pollutants; in our case, the pollutant is ammonia in Milan) with respect to variation in the input anthropogenic emissions, which consist of four components,
:
The output of the model is the mean monthly concentration of the following three pollutants:
- ozone ($\mathrm{O_3}$);
- ammonia ($\mathrm{NH_3}$);
- ammonium sulfate and ammonium nitrate ($(\mathrm{NH_4})_2\mathrm{SO_4}$ and $\mathrm{NH_4NO_3}$).
			
For SSREL, the REs for the AE of the
 and 
when using Crude, Sobol, Halton, SobolScr, HaltonScr, SobolBurkardt, Hammersley, and Faure are shown in Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6, respectively. The quantity
 is represented by a four-dimensional integral, whereas the rest are represented by eight-dimensional integrals.
In the case of SSRCRR, we investigate the sensitivity of the ozone concentrations in Genova to variation in the rates of the following chemical reactions, namely reactions nos.
 (time-dependent) and 
 (time-independent) in the CBM-IV scheme [36]:
In the case of 
SSRCRR, the results for the REs for the AE of 
 and 
when using Crude, Sobol, Halton, SobolScr, HaltonScr, SobolBurkardt, Hammersley, and Faure are shown in Table 7, Table 8, Table 9, Table 10, Table 11 and Table 12, respectively. As in the first case study, the quantity
 is represented by a six-dimensional integral, whereas the rest are represented by twelve-dimensional integrals.
In the case of 
SSREL, one may observe the following. In 
Table 1, for the model function 
, the best algorithm for all numbers of samples is HaltonScr, followed by SobolBurkardt. For 
, for the total variance 
, the best algorithm is SobolBurkardt, followed by the Halton sequence—see the results in 
Table 2. However, for 
 and 
, the Halton sequence produces slightly better results—see 
Table 2. The behavior of the algorithm can also be seen in 
Figure 2. The RVs for the first and total SIs are presented in 
Table 3. From 
Table 4, 
Table 5 and 
Table 6, one can conclude that for all first-order SIs and TSIs, the best algorithm is SobolBurkardt, followed by HaltonScr and the Halton sequence. Notably, the scrambling procedure improves the results of the Sobol and Halton sequences by at least one order of magnitude in most cases; see the values for 
 and 
 in 
Table 6. In [42], it was pointed out that the accurate estimation of the smallest SIs is the most important aspect of such an analysis. In our case, these are 
 and 
—see 
Table 3. For them, SobolBurkardt significantly improved upon the results of the other sequences, and one can also see that the Hammersley and Faure sequences performed better than the Crude algorithm, as expected.
The performance of the algorithms in the case of SSREL can be summarized as follows: the algorithm that we implemented, SobolBurkardt, yielded the smallest relative errors for all SIs; the scrambled Halton and original Halton sequences were next, followed by the Hammersley, Faure, scrambled Sobol, and Sobol algorithms; the worst was the Crude algorithm.
In the case of 
SSRCRR, the following observations could be made. In 
Table 7, for the model function 
, the best algorithm for all numbers of samples except 
, 
, and 
 was HaltonScr; for 
, the best algorithm was SobolBurkardt, and in the other two cases, the best algorithm was the Hammersley sequence. The RVs of the first- and second-order SIs and TSIs are given in 
Table 9. For all numbers of samples except 
, for the total variance 
D, the best algorithm was SobolBurkardt—see the results in 
Table 8. However, for 
, the scrambled Halton sequence produced results that were one order of magnitude better than those of our SobolBurkardt algorithm. The behavior of the algorithm can also be seen in 
Figure 3. For a low number of samples, 
, as shown in 
Table 10, the Halton sequence was better than our SobolBurkardt algorithm for 
 and 
, and the scrambled Halton sequence was better than our algorithm for the total SIs 
, 
, and 
. As previously mentioned, the accurate estimation of the smallest SIs is the most important aspect of such an analysis. Here, these were 
, 
, and 
—see 
Table 9. For them, our SobolBurkardt implementation performed better than the other algorithms. However, for larger numbers of samples, as shown in 
Table 11 and 
Table 12, one can conclude that for all first-order SIs, second-order SIs, and TSIs, the best algorithm was SobolBurkardt, followed by the HaltonScr and scrambled Sobol sequence algorithms. Notably, the scrambling procedure significantly improved the results of the Sobol and Halton sequences in some of the cases; see the values for 
 and 
 in 
Table 12. For all of the cases, the Hammersley and Faure sequences performed better than the Crude algorithm, as expected.
The performance of the algorithms in the case of SSRCRR can be summarized as follows: the SobolBurkardt algorithm that we implemented yielded the smallest REs for all SIs; the scrambled Halton and the original Halton sequences were next, followed by the Hammersley, Faure, scrambled Sobol, and Sobol algorithms, and the worst was the Crude algorithm.
The overall conclusion is that the implemented SobolBurkardt algorithm was the best approach among the benchmarked algorithms: its relative errors were smaller than those of most of the existing methods when applied to multidimensional air pollution sensitivity analysis.