1. Introduction
Mathematical models [
1,
2] are utilized in many areas as compelling instruments for studying composite processes, but our understanding is generally restricted by uncertainties: inexact data, imperfect understanding, and subjective estimates, etc. Because of that, tools are necessary to estimate how uncertainties [
3] in the input affect certain output model results.
The goal of our work is in the field of environmental security [
4,
5,
6]. To establish itself as a reliable simulation tool, modern mathematical models of remote transport of air pollutants should involve lots of chemical and photochemical reactions [
7]. Monte Carlo algorithms [
8,
9,
10,
11] are the best apparatus for multidimensional air pollution modeling [
6].
By definition, Sensitivity analysis (SA) [
12,
13,
14,
15] is the study of how much the uncertainty in the input data of a model (because of inaccurate evaluations, estimations, and data compression, etc.) is reflected in the accuracy of the output results.
SA is a powerful procedure that verifies and improves the mathematical models [
16,
17]. It specifies the key role of SA when mathematical modeling is frequently the only strategy to investigate a complicated phenomenon. Furthermore, the authentic explanation of SA calculations relies on the effectiveness of methods supplying SA.
The input data for SA has been derived during runs of a large-scale model of long-range transport of pollutants in the air—
Unified 
Danish 
Eulerian 
Model (UNI-DEM) [
18,
19,
20,
21]), which includes a large geographical region (4800 × 4800 km) covering the whole Mediterranean, Europe, and parts of Asia and Africa.
The main physical, chemical and photochemical processes between the investigating species, contaminations, and rapidly changing meteorological conditions are taken into account by UNI-DEM. The reason for selecting it is because, in UNI-DEM, the chemical processes are taken into account in a very precise way. UNI-DEM is described mathematically [
19,
21] by the following system of partial differential equations:
      where
      
- —pollutant concentrations, 
- —wind components along the coordinate axes, 
- —diffusion coefficients, 
- —space emissions, 
- —dry and wet deposit coefficients, (), 
- —nonlinear functions expressed the reactions between pollutants. 
A fundamental role in UNI-DEM is played by the chemical reactions, and they connect the equations in the model in the system precisely. The presence of chemical processes determines the nonlinearity and the ‘stiffness’ property of the system of equations (see [
7]). The model’s chemical scheme is the condensed CBM-IV (Carbon Bond Mechanism) proposed in [
1] and improved in [
19]. UNI-DEM involves 35 pollutants and 116 chemical reactions—69 are time-dependent and 47 are time-independent.
Let us assume that the model can be introduced by a model function [
22]:  
The concept of the Sobol approach is given by the following representation of 
 and a constant 
 [
22]: 
The description (
1) is noted to as the ANOVA-representation of 
 if [
22]: 
It guarantees that the right-hand side functions of (
1) are determined in a distinctive way, where 
. The quantities
      
      are called total and partial variances [
22]. An analogical decomposition is fulfilled for the total variance:
Sobol global sensitivity indices [
22,
23] are determined by:  
Then the 
total 
sensitivity 
index (TSI) of input parameter 
 is determined by [
24]:  
      where 
 is the 
 order SI for 
.
By definition [
13], if 
, then the quantity 
 is named 
the main effect of 
; if 
, then 
 are named 
two-way interactions (second-order SIs); if 
, then 
 are the 
three-way interactions (third-order SIs) and so on. The set of input parameters is categorized with respect to their TSI [
13]: very important if 
; important if 
; unimportant if 
 and irrelevant if 
.
To introduce the 
total sensitivity of the output variance to an input parameter 
 one can exhibited its TSI: 
 (see [
2]). Therefore, performance SA using Sobol’s approach is turned into a problem for high-dimensional numerical integration [
25].
  2. Algorithms and Methods
Let us examine the following high-dimensional problem: 
 and .
Let 
 be two integers. By definition [
26,
27], an elementary 
s-interval in base 
b as a subset of 
 of the form 
, 
 for all 
. A 
-net in base 
b is a sequence 
 of 
 points of 
 such that 
 of 
. A 
-sequence in base 
b is an infinite points sequence 
 in such a way that for all 
, the sequence 
 is a 
-net in base 
b.
I. M. Sobol [
26] determines the 
-meshes and 
 sequences, which are 
-nets and 
-sequences in base 2, introduced in 1988 by H. Niederreiter [
27].
Now let us introduce the first algorithm that we are going to use; the well-known Sobol sequence (SOBOL-SEQ) [
28,
29]. To generate its 
j-th component, we will choose a polynomial: 
 and 
 are either 0 or 1. The sequence 
 is defined by
      
	  ⊕ is the bit-by-bit 
exclusive-or operator. For the merits 
 is fulfilled in that every 
 is odd. The direction numbers 
 are determined with 
. Then the 
j-th component of the 
i-th point in a SOBOL-SEQ, is determined with 
 where 
 is the 
k-th binary digit of 
. For generating the SOBOL-SEQ, the algorithm with Gray code implementation [
30] and SIMD-oriented Fast Mersenne Twister (SFMT) [
31] has been used.
We now briefly describe the three algorithms based on modified Sobol sequences. They produced the best results for the problem under consideration up to now [
32]. We use the shortcut 
MCA-MSS for the Monte Carlo algorithms based on Modified Sobol Sequences.
In the first algorithm 
MCA-MSS-1 [
32], we choose a Sobol point (SP) 
 and a pseudorandom point (p.p.) 
 uniformly distributed (u.d.) over the sphere. The sphere has a centrum 
 and a radius 
. If the 
i-th SP is
      
      then 
 with a p.d.f. 
 is determined in the following way
      
      where 
 is a unique u.d. vector in 
 and 
 is the 
shaking radius.
It is assumed that 
, 
. The unit cube 
 is divided into 
 disjoint sub-domains: 
      with 
 for all 
.
Now in every 
s-dimensional sub-domain 
 exactly one 
 point 
 is taken and 
. Then MCA-MSS-1 gives the following approximate value of (
2): 
For the second algorithm 
MCA-MSS-2 [
33], two SP 
 and 
 are taken so that 
 is chosen in the same way as used in MCA-MSS-1. The 
 is selected to be symmetric to 
 in correspondence with the central point 
 in 
. Now there are exactly 
 points. Now, we evaluate all 
 and 
, for 
, and MCA-MSS-2 gives the following approximate value of (
2): 
For 
MCA-MSS-1 and 
MCA-MSS-2, the Box–Muller transform is applied for the 
s-dimensional vector 
 [
34]: if 
s is even, then
      
      and 
 is a random point in 
. Therefore, a random point 
 over the 
s-dimensional sphere is determined by 
.
In the third algorithm 
MCA-MSS-2-S [
35], the original domain of integration is divided into 
 disjoint subdomains with equal volumes 
. Now, the first p.p. 
 is generated u.d. inside the subdomain 
. The second point 
 is computed to be symmetric to 
 in correspondence with the central point 
 in 
. This is similar to MCA-MSS-2 as the 
shaking is with different radiuses 
 in each subdomain. This algorithm is similar to the stratified symmetrized Monte Carlo [
26], which is why we use the notation MCA-MSS-2-S. The value of (
2) obtained approximately with MCA-MSS-2-S is given by: 
The algorithms MCA-MSS-2 and MCA-MSS-2-S have an optimal rate of convergence (
) for the class of continuous functions with continuous first derivatives and bounded second derivatives [
35].
Now to introduce lattice rules, let 
, and N is a natural number. By definition [
36], the point set 
 in 
 with 
 for all 
 is named a lattice point set and 
 is named its generating vector. The algorithm that uses a lattice point set is named a lattice sequence [
37].
We will use this rank-1 lattice sequence [
38]: 
      where 
 is an integer, 
 is the generating vector and 
 denotes the fractional part of 
z.
There exists an optimal choice of 
 according to [
39]:  
      for the function 
 [
39], 
 and 
 do not depend on 
N.
When 
, the following optimal construction exists [
40].
Consider the following generating vector [
38]: 
      using 
 and 
 (
) is the corresponding number of the generalized Fibonacci series [
38]. Then
      
Our generating vector (
3) is transformed into [
40]: 
	  We construct a special lattice sequence with the generating vector (
4), namely 
LATTICE-FIBO.
Now we will improve the convergence of the lattice sequence using the recently developed component and the Component Fast Construction method (CBC) [
41,
42,
43]. Now we construct a special lattice sequence 
LATTICE-CBC with an optimal generating vector following the idea in [
44,
45]. At the first step of LATTICE-CBC, 
 is generated by the CBC. After that, the points of LATTICE-CBC are obtained by the formula
      
The approximate value 
 of (
2) is evaluated in the following way:
Let 
 be the finite field with 
b elements for a prime power 
b (when b is a prime, 
 is the residue class ring modulo 
b) [
46]. Using 
 and a bijection 
 with 
, we generate a digital 
-sequence over 
. Now we take 
 matrices 
 over 
. The matrices are of the form
      
To now generate one of the points 
, of the 
-sequence, we record 
n in its 
b-adic (i.e., base 
b) expansion
      
      with 
 and 
 for all 
i large enough. Now we choose the column vector
      
To generate the point 
, for the i-th coordinate 
, the value of 
 is derived by multiplying the i-th matrix 
 by 
 in 
, which derives a column vector over 
: 
The elements 
, for 
, are now the 
b-adic digits of 
, i.e.,
      
Now the sequence derived in this way 
 defines a digital sequence by generating matrices 
 [
47].
Now we will consider two possible choices for the generating matrices. The first digital sequence is generating matrices based on an implementation of the Sobol sequence from [
48] with 21201 dimensions, and we called it 
DIGITAL-SOBOL. The second digital sequence uses the construction of Xing and Niederreiter [
49]. This sequence was implemented by Pirsic [
50] and can be considered as the currently optimal low-discrepancy sequences with a rate of convergence 
. We will call this sequence 
DIGITAL-XINGN.
  3. Calculations and Results
The efficient stochastic algorithms, namely SOBOl-SEQ, MCA-MSS-1, MCA-MSS-2, MCA-MSS-2-S, LATTICE-FIBO, LATTICE-CBC, DIGITAL-SOBOL and DIGITAL-XINGN, have been applied to sensitivity studies with respect to emission levels and some chemical reactions rates of the concentration variations of UNI-DEM pollutants. The estimated quantity is denoted by EQ and its reference value by RV. The relative error is denoted by RE and the approximate evaluation by AE.
In the case of the 
sensitivity studies with respect to emission levels, we will study the sensitivity of the model output (in terms of mean monthly concentrations of several important pollutants—in our case, this is ammonia in Milan) with respect to a variation of input emissions from the anthropogenic pollutants, which consists of four different components 
 as follows:
The output of the model is the mean monthly concentration of the following three pollutants:
In the case of the 
sensitivity studies with respect to emission levels, the results for REs for the AE of the 
 and 
, using 
SOBOl-SEQ, MCA-MSS-1, MCA-MSS-2, MCA-MSS-2-S, LATTICE-FIBO, LATTICE-CBC, DIGITAL-SOBOL and DIGITAL-XINGN are shown in 
Table 1, 
Table 2, 
Table 3, 
Table 4 and 
Table 5, respectively. The quantity 
 is represented by a four-dimensional integral, whereas the rest are represented by eight-dimensional integrals.
In the case of the 
sensitivity studies with respect to chemical reactions rates, we will study the ozone concentration in Genova according to the rate variation of these chemical reactions: ## 
 (time-dependent) and 
 (time-independent) of the CBM-IV scheme ([
19]) given as follows:
Here, “hv” represents the sunlight in the photochemical reactions and indicates that a photon is proportional to its frequency by Plank’s constant.
In the case of the 
sensitivity studies with respect to chemical reactions rates, the results for REs for the AE of the 
 and 
, using 
SOBOl-SEQ, MCA-MSS-1, MCA-MSS-2, MCA-MSS-2-S, LATTICE-FIBO, LATTICE-CBC, DIGITAL-SOBOL and DIGITAL-XINGN, are shown in 
Table 6, 
Table 7, 
Table 8, 
Table 9 and 
Table 10, respectively. As in the first case study, the quantity 
 is represented by a six-dimensional integral, whereas the rest are represented by twelve-dimensional integrals.
  4. Discussion of the Results
In the case of the 
sensitivity studies with respect to emission levels, a detailed discussion has been made with the previous best available approaches described in [
51,
52,
53].
For the number of samples 
 for the model function 
, the best algorithm is the DIGITAL-XINGN, followed by MCA-MSS-2-S and MCA-MSS-1—see the results in 
Table 1 and 
Table 2 for the maximum number of samples. A big improvement over the older best available result is obtained—the new result is 3e-10 vs. 2e-08 [
32,
35]. For a number of samples 
 for the total variance 
D, the best algorithm is the DIGITAL-XINGN, followed by MCA-MSS-2-S and DIGITAL-SOBOL—see the results in 
Table 3 and 
Table 4 for the maximum number of samples [
32,
35].
Our discussion here includes a detailed comparison with the previous results in the field in [
54,
55]. For the total and first-order sensitivity indices in 
Table 5, it can be seen that the two digital sequences improve the results produced by the best available approaches up to now [
32,
33]. The DIGITAL-SOBOL gives the best results for 
, 
, 
, 
 and 
, while the DIGITAL-XING gives the best results for 
, 
 and 
. The general conclusion is that our methods significantly improved the accuracy of the sensitivity indices described in [
56]. According to [
33,
54,
56], the best available approaches up to now were the SOBOL-SEQUENCE and MCA-MSS-1 for 
, 
 (these two give exactly the same result for all quantities except 
), and MCA-MSS-2 was the best available algorithm for 
, 
, 
, 
, 
, 
. From 
Table 5, one can conclude that between the two lattice sequences, LATTICE-CBC is more accurate by at least 1. Furthermore, we can observe that the worst are MCA-MSS-2-S and LATTICE-FIBO, and the last method gives unsatisfactory relative errors. According to [
32,
54,
56], the best improvements for the sensitivity indexes are obtained for 
, the new result is 4e-06 vs. 4e-03; for 
, the new result is 1e-05 vs. 1e-02. For all total sensitivity indices, there are serious improvements over the existing best available results. For 
, the new result is 3e-07 vs. 1e-05; for 
, 1e-05 vs. 1e-03; for 
, 2e-06 vs. 4e-05; and for 
, 1e-06 vs. 1e-02. In [
24], it is mentioned that having the smallest possible sensitivity indices is the most important aspect of the model. In our case, these are 
 and 
, and the improvement is with three and four orders, respectively, which will be crucial for taking into account the reliability of the results of the model.
The performance of the algorithms can be generalized in such a way: the two digital sequences proposed by the authors yield the smallest relative errors for all sensitivity indices; algorithm MCA-MSS-2 is after that, followed by SOBOL-SEQUENCE, MCA-MSS-1 and LATTICE-CBC. The worst results are produced by the MCA-MSS-2-S and LATTICE-FIBO. Comparing our results with the results obtained in [
32,
35,
54,
56] the main advantages of our methods include the significant improvement in the accuracy and lower computational complexity.
In the case of 
sensitivity studies with respect to chemical reaction rates, a precise discussion has been made with the previous best available approaches from [
4,
57,
58].
For the number of samples 
 for the model function 
, the best algorithm is the DIGITAL-XINGN, followed by MCA-MSS-2-S and MCA-MSS-1—see the results in 
Table 6 and 
Table 7 for the maximum number of samples. There is a huge improvement according to the sensitivity studies in [
32,
35]—the new result is 6e-11 vs. 2e-07. For the number of samples 
 for the total variance 
D, the best algorithm is the DIGITAL-SOBOL, followed by DIGITAL-XINGN, MCA-MSS-2 and SOBOL-SEQUENCE—see the results in 
Table 8 and 
Table 9 for the maximum number of samples [
4].
The discussion in this case includes a comparison between our obtained results with other studies and other papers already published [
32,
59,
60]. For all sensitivity indices in 
Table 10, it can be seen that the two digital sequences improve the results produced by the best available approaches except for the sensitivity index 
 obtained in [
32,
33]. The DIGITAL-SOBOL gives the best results for 
, 
, 
, 
, 
, 
, 
, 
, 
, 
, 
, 
 and 
, while the DIGITAL-XING gives the best results for 
, 
, 
, 
 and 
. The general conclusion is that our methods show a significant improvement in the accuracy vs. the other calculations in this field [
61]. According to [
4,
33,
59], the best available approaches up to now were the SOBOL-SEQEUNCE for 
, 
, 
, 
, 
, 
, 
, 
, 
, MCA-MSS-1 was the best for 
 and 
, and MCA-MSS-2 was the most accurate for 
, 
, 
, 
, 
, 
, 
. According to [
35,
59,
60], the best improvement for the relative error is that for the sensitivity index 
, the new result is 4e-06 vs. 3e-04; for 
, the new is 8e-07 vs. 4e-05; and for 
, the new result is 2e-05 vs. 2e-03. From 
Table 10, focusing on the relative errors, one can conclude that between the two lattice sequences, LATTICE-CBC is more accurate by at least two orders. Furthermore, we can observe that MCA-MSS-2-S is one order better than LATTICE-FIBO and one order worse than the LATTICE-CBC.
We can conclude that the two digital sequences proposed by the authors yield the smallest relative errors for all sensitivity indices except , and DIGITAL-SOBOL has the edge; the algorithm MCA-MSS-2 is closely behind them, followed by SOBOL-SEQUENCE, MCA-MSS-1, LATTICE-CBC and MCA-MSS-2-S. The lattice sequence LATTICE-FIBO gives unsatisfactory relative errors for most of the quantities.
Based on the described comparison with other sensitivity studies and other papers already published in the field [
16,
17,
62,
63], the general conclusion from our discussion is that the main advantages of the proposed methods 
LATTICE-FIBO, LATTICE-CBC, DIGITAL-SOBOL and DIGITAL-XINGN for the very important large-scale air pollution model UNI-DEM are the much higher accuracy and lower computational complexity.
  5. Conclusions
The present study is in a very important area of environmental safety. High levels of pollution can damage ecosystems and harm plants, animals and humans, and it is of great importance to study the levels of pollution precisely. That is why it is of great importance to recognize whether the pollution levels are below some critical values and, if so, to establish a well-founded control system to retain the pollution levels within these limits. These problems can be accomplished successfully by the proposed highly efficient stochastic approaches for studying the sensitivity of various pollution-related processes.
The computational efficiency in terms of relative error of one of the best available stochastic methods for high-dimensional integration has been performed for the sensitivity of the UNI-DEM model output to find the variation of rate constants of chosen chemical reactions and the variation of selected input emissions of the anthropogenic pollutants.
The two proposed digital sequences based on Sobol and Xing-Niederreiter matrices show an essential improvement over the existing best available results produced by the modified Sobol sequences. The proposed approaches by the authors LATTICE-CBC, DIGITAL-SOBOL and DIGITAL-XNING are established as the new best available stochastic methods for this particular very important large-scale air pollution model UNI-DEM.
The obtained results will help test and improve the mathematical models and enable a well-founded explanation by the relevant specialists. By recognizing the main chemical reactions that influence the performance of the system, physicists and chemists will be able to gather important information to improve the model, which will increase the reliability and sustainability of forecasts. Therefore, through our sensitivity results, the mathematical model will contribute to a more accurate evaluation of losses in agriculture and, most importantly, enable an estimation of the effects of harmful emissions on human health.