Parallel Hydrological Model Parameter Uncertainty Analysis Based on Message-Passing Interface

Parameter uncertainty analysis is one of the hot issues in hydrology studies, and the Generalized Likelihood Uncertainty Estimation (GLUE) is one of the most widely used methods. However, the scale of the existing research is relatively small, which results from computational complexity and limited computing resources. In this study, a parallel GLUE method based on a Message-Passing Interface (MPI) was proposed and implemented on a supercomputer system. The research focused on the computational efficiency of the parallel algorithm and the parameter uncertainty of the Xinanjiang model affected by different threshold likelihood function values and sampling sizes. The results demonstrated that the parallel GLUE method showed high computational efficiency and scalability. Through the large-scale parameter uncertainty analysis, it was found that within an interval of less than 0.1%, the proportion of behavioral parameter sets and the threshold value had an exponential relationship. A large sampling scale is more likely than a small sampling scale to obtain behavioral parameter sets at high threshold values. High threshold values may derive more concentrated posterior distributions of the sensitivity parameters than low threshold values.


Introduction
Accurate hydrological forecasting plays an important role in the scientific management of water resources, flood prevention and the efficient operation of hydraulic projects [1][2][3][4][5]. However, hydrological forecasts are made in the face of pervasive uncertainty, both objectively and subjectively. From the objective point of view, due to the inherent uncertainty of natural water cycle and of human observation and modeling processes, each link in hydrological forecasting, such as the model input, the model structure and the model parameters, has its own uncertainty [6,7]. From the subjective point of view, in the process of hydrological forecasting there are several decisions that have to be made by modelers, such as model scope, equations and parameter values [8]. All these decisions are inevitably influenced by subjective factors, such as values, ethics and politics [9], and will inevitably bring uncertainty to forecasting results [10]. At present, the uncertainty in hydrological forecasting has become a research hotspot.
Among the studies of hydrological uncertainty, the uncertainty of the model parameter is an important aspect [11][12][13][14]. In traditional hydrological forecasting studies, the model should be localized by parameter calibration [15][16][17][18]. However, many studies have found that it is difficult to find a certain set of parameters that can yield model performances much better than other parameter

The GLUE Method
The GLUE method was first proposed by Beven and Binley [19] to assess parameter uncertainty based on their research on model calibration. It was concluded that multiple parameter sets may yield similar model performances, i.e., "equifinality". The GLUE method includes several steps: (1) select a hydrological model and determine the model parameter space; (2) select a likelihood function and determine the threshold value and the priori probability distribution; (3) randomly extract parameter sets from the parameter space; (4) run the model with the extracted parameter sets, calculate the likelihood value according to the model results and save the behavioral parameter sets if the likelihood value exceeds the threshold; (5) rescale the threshold values to formulate a cumulative distribution. In this study, the Nash-Sutcliffe efficiency (NSE) [53] was selected as the likelihood function, as in many other studies [23,28,29,52]: where Q t obs and Q t sim represent the observed and simulated streamflow at time t, Q obs represents the mean value of the observed streamflow, and T is the total time steps of the observed and simulated streamflow. Different threshold values and sampling scales were considered to evaluate their impacts on the results of the GLUE method.

MPI
In this study, an MPI is used to implement the parallel GLUE algorithm. An MPI is a message passing library interface protocol that mainly aims at the message passing parallel programming model. In this model, data is transferred between the address spaces of different processes through performing a cooperative operation between processes. In addition to the traditional messaging model, an MPI also provides support for cluster operations, remote memory access, dynamic process creation and parallel I/O. An MPI has become an important part of parallel computing because of its high communication performance, good program portability and powerful functions. An MPI is not a language but a definition for functions, which are expressed as functions, subroutines or methods. At present, there are several implementations of MPI invited by multiple organizations, such as MPICH, OPEN MPI and Intel MPI (Intel Corporation, Santa Clara, CA, USA). The MPICH used in this study is an open source MPI developed by Argonne National Laboratory and Mississippi State University [54][55][56].

The Parallel GLUE Method
In the GLUE method, the processes of the model run with candidate parameters and the likelihood value calculation and evaluation are independent of each other after the parameter sampling process. In the original GLUE method, this part performs sequentially. In this study, this part of the GLUE method is parallelized with the MPI. Different processes simultaneously run the same code with the same input data, and only the model parameters are changed, forming a typical single program, multiple data (SPMD) scenario in parallel computing.
In addition, the parallel computing provided by the MPI is based on the processes. Each process uses an independent memory space, so it is able to build a more complex calculation process of each process. However, the disadvantage is that both the opening and destruction of the processes need to occupy computing resources, so the number of opening processes should be limited; it is better to ensure a certain amount of tasks of each process. Therefore, this study chooses the equivalent parallel modeling method to allocate the same amount of computations of each process to maximize the use of computing resources. A specific calculation process is shown in Figure 1, and its brief principle is stated below:

1.
Create multiple processes according to the sample size and available computing resources, and specify one root process, while the remaining processes are non-root processes.

2.
Read the model configuration file in the root process, and sample in the model parameter space to generate a group of candidate parameter sets.

3.
Assign the candidate parameter sets equally to all processes. 4.
On each process, perform the model run during the calibration period and likelihood value calculation with the sub-candidate parameter sets. Evaluate the likelihood value, and save the parameter sets and forecast results if they are above (or below according to the likelihood function) the threshold value. Among the processes, this step is performed simultaneously.

The Xinanjiang Model
The Xinanjiang model is a famous conceptual hydrological model proposed by Zhao et al. [57]. Many researchers have carried out successful practices and developed improvements in many basins around the world. The Xinanjiang model used in this study is integrated in the EasyDHM model system [4], of which the characteristic is, on the basis of retaining the original runoff generation algorithm, to calculate the routing process using the surface information obtained from DEM and other catchment data and adjust the process using the corresponding parameters. The parameters and their ranges are listed in Table 1. The priori probability distribution of the parameters is assumed as a uniform distribution, as in many other studies.

Study Area and Data Sets
In this study, the catchment above the Gaoan station in the Jinjiang River Basin is selected as the research area ( Figure 2). The Jinjiang River Basin, located in the southwest of the Jiangxi Province, is the second largest tributary of the Ganjiang River, with a total length of 307 km and a total drainage area of 7886 km 2 . The altitude of the Jinjiang River Basin ranges from 18 to 1096 m above sea level. The mountains and low hills are distributed in the northwest, and the plain lays in the east. The Jinjiang River Basin is located in the subtropical area, with a warm and humid climate. The average annual precipitation is approximately 1600 mm. The Gaoan hydrological station is located on the lower reaches of the Jinjiang River, above which the drainage area is 6215 km 2 . The daily precipitation data used in this study come from 20 rain gauges in and around the study area. Other meteorological data, including the temperature, wind speed, relative humidity, and solar radiation, were obtained from the 2 meteorological stations near the study area. The daily streamflow data obtained from the Gaoan hydrological station were provided by the Jiangxi Hydrological Bureau.
All the time series data range from 2000 to 2015, in which the period from 2000 to 2011 was regarded as the calibration period, and the period from 2012 to 2015 was regarded as the validation period.

Hardware and Software Environment
The hardware environment of this study is the "Yuan" supercomputing system of the Supercomputing Center of Chinese Academy of Sciences [58]. It is a new generation supercomputing system affiliated with the general supercomputing environment center of the Chinese Academy of Sciences. The total computing power of the "Yuan" supercomputing system is approximately 2.3 Pflops (the computing power refers to the double precision floating-point peak computing power), and the CPU general-purpose computing capacity is 700 Tflops. The maximum computing power used in this study is approximately 96 Tflops. The user terminal system version of the "Yuan" supercomputing system is CentOS 6.6, the compiler version is GCC-8.3.0, and the MPICH version is MPICH-3.3.1.

Evaluation Criteria
The indicators used in this study to evaluate the computing performance of the parallel GLUE method are the speedup ratio (Sp) and parallelism efficiency (E), which are given by the following expressions: where T s represents the execution time of the serial program, T p represents the execution time of the parallel program, t s is the time required to execute the serial part of the program, t p represents the time required to execute the parallel part of the program, and n represents the number of processes used in the parallel program. The value of Sp increases with increasing n, and the range is [1, n). If Sp = n, the parallel program is considered to have reached the linear speedup ratio (i.e., the ideal speedup ratio). The value range of E is (0,1), and the ideal value of E is 1, in which it is considered that the parallel program has achieved linear speedup, and the processors are fully utilized.

Analysis of Parallelism Efficiency
To analyze the computing efficiency of the parallel GLUE method, a GLUE-based Xinanjiang model parameter uncertainty analysis was performed with a sampling scale of 10,000. Different process numbers are set on the "Yuan" supercomputer for serial (1 process) or parallel (more than 1 process) computing. The number of processes ranges from 2 to 400 and is equal to the number of CPU cores allocated in the supercomputer. The operation time and evaluation indexes Sp and E are listed in Table 2 and Figure 3. The results show that in terms of the calculation time, the number of calculation processes increases from 1 in serial computing to 400, which is the maximum number of the case, and the total time consumption in 10,000 sampling is reduced from 7608 s to 22 s (also shown in Figure 3a). This shows that the parallel method greatly speeds up the operation speed of the GLUE method.   In addition, as shown in Figure 3b, the speedup ratio (Sp) of the parallel GLUE algorithm increases with the increase in the number of processes. The maximum Sp reaches 345.818 when the number of processes reaches the maximum of 400. Within the scale of the case, the relationship between the increase in Sp and the number of processes is close to 1:1, which shows that the number of tasks of a single process still has room to go further, reflecting the high parallelism and scalability of the parallel GLUE method. The change in the parallelism efficiency (E) shows that with the increase in the number of processes, the relative efficiency of the parallel GLUE method always maintains a high level, which is still more than 0.9 at 200 processes and 0.865 at 400 processes. All the above indexes prove the high efficiency of the present parallel GLUE method.
As a traditional method of model parameter optimization, a parameter calibration algorithm based on various optimization algorithms also needs many repeated calculations. At present, many studies have applied parallel computing technology to improve the speed of the parameter calibration algorithm. Table 3 shows some cases of the application of parallel computing technology in parameter calibration. Although the parallel scale and efficiency improvement range of these studies are quite different, they all show the same trend, in that, within a certain number of processes, the number of opening processes and the speedup ratio are close to a linear relationship, and the parallel efficiency is close to 1. However, if the number of opening processes continues to increase, the increase in the speedup ratio and the parallel efficiency will dramatically decrease.  In addition, as shown in Figure 3b, the speedup ratio (Sp) of the parallel GLUE algorithm increases with the increase in the number of processes. The maximum Sp reaches 345.818 when the number of processes reaches the maximum of 400. Within the scale of the case, the relationship between the increase in Sp and the number of processes is close to 1:1, which shows that the number of tasks of a single process still has room to go further, reflecting the high parallelism and scalability of the parallel GLUE method. The change in the parallelism efficiency (E) shows that with the increase in the number of processes, the relative efficiency of the parallel GLUE method always maintains a high level, which is still more than 0.9 at 200 processes and 0.865 at 400 processes. All the above indexes prove the high efficiency of the present parallel GLUE method.
As a traditional method of model parameter optimization, a parameter calibration algorithm based on various optimization algorithms also needs many repeated calculations. At present, many studies have applied parallel computing technology to improve the speed of the parameter calibration algorithm. Table 3 shows some cases of the application of parallel computing technology in parameter calibration. Although the parallel scale and efficiency improvement range of these studies are quite different, they all show the same trend, in that, within a certain number of processes, the number of opening processes and the speedup ratio are close to a linear relationship, and the parallel efficiency is close to 1. However, if the number of opening processes continues to increase, the increase in the speedup ratio and the parallel efficiency will dramatically decrease. This is because in the area of parallel computing, the efficiency will be affected by I/O, inter-process communication, resource contention, load balancing and other factors. With the increase in the number of processes, the impact of these factors will increase significantly. At the same time, due to the influence of the inherent principle of the optimization algorithm, many inter-process communications must be carried out to compare the fitness of different populations and generate new populations. This communication intensive scenario is more likely to be affected by the efficiency of inter-process communication when the number of open processes increases, which ultimately affects the scalability of parallel computing. In contrast, the parallel GLUE algorithm in this study needs very little inter-process communication during the calculations, which is only during parameter distribution and result collection. Compared with parallel parameter calibration, the calculation is less affected by inter-process communication, and the results of the parallelism efficiency analysis in this study also prove this. The E value is still more than 0.9 when allocating 200 processes and 0.865 when allocating 400 processes, which reflects the high scalability of the parallel GLUE algorithm. In a similar study of the parallel GLUE algorithm, Kan et al. [52] found that the speedup ratio of the parallel GLUE algorithm performed on one 8-core CPU of a single computer is more than 10 when the sampling scale is 8000 or below; however, the speedup ratio decreases rapidly when the sampling scale exceeds 8000, and it is only 1.72 when the sampling scale is 1,024,000. This is because a single computer is very limited in CPU computing power, memory capacity, memory bandwidth and hard disk reading speed. When the computing scale is expanded, the resource contention between processes is very obvious, which seriously affects the efficiency of the parallel algorithm.

Impacts of Sampling Scale and Likelihood Threshold on the Number of Behavioral Parameter Sets
To analyze the parameter uncertainty of the Xinanjiang model based on the parallel GLUE method, this study mainly investigated the influence of the likelihood function threshold and the parameter sets sampling scale on parameter uncertainty. The threshold of the likelihood function in this study is a total of 11 levels with a range of 0.60~0.80 and an interval of 0.02. For each level, a total of 10.2 million, 1.02 million and 102,000 parameter sets were sampled from the parameter space. The maximum computing resources occupied are 100 nodes and 2400 logical CPUs, and the total computing power is approximately 96 Tflops. The sampling method is the Latin hypercube sampling (LHS) method [61].
The number of behavioral parameter sets and their proportion in the total sample scale under 3 different parameter sets sampling scales and 11 different NSE likelihood thresholds are shown in Table 4 and Figure 4. It is shown that the numbers of behavioral parameter sets obtained by different sampling sizes under the same likelihood function threshold are different, but the proportions of the behavioral parameter sets in each sample size are basically the same. For example, when the threshold is 0.62, the numbers of behavioral parameter sets when the sampling scales are 10.2 million, 1.02 million and 0.102 million are 7105, 724 and 79, respectively, but the proportions of the behavioral parameter sets are between 0.07% and 0.08%. This means that since the proportions of behavioral parameter sets remain steady, the larger the sample size is, the more behavioral parameter sets there are. On one hand, a larger number of behavioral parameter sets can better reflect the posterior distribution of parameters, which is the distribution of parameters under the situation that the likelihood value is above the threshold, under a certain threshold but, on the other hand, can also provide more forecasting members for future hydrological ensemble forecasting research.   The results also show that the larger the sample size, the more likely it is to obtain the behavioral parameter sets under the better likelihood function threshold conditions. As shown in Table 4, the test performed with 10.2 million samples obtained 2 behavioral parameter sets at the likelihood function threshold of 0.78, but no behavioral parameter set was obtained in the 1.02 million and 102,000 samples at the same threshold condition. Moreover, the results show that if the behavioral parameter sets are obtained under the condition of a higher likelihood function threshold, the parameters can better reflect the posterior distribution of the model parameters when they approach the global optimal solution, which is the best simulation performance of all the possible parameter sets and can also provide high-quality forecast members for future hydrological ensemble forecasting. Figure 4 also shows that the proportion of behavioral parameter sets increases exponentially with the decrease in the likelihood function, which is NSE in this study, threshold. In the existing research on parameter uncertainty using the GLUE method, Li et al. [28] and Xue et al. [62] have studied the relationship between the likelihood function threshold in the GLUE method and the proportion of behavioral parameter sets. It is found that there is a good linear relationship between the likelihood function threshold and the proportion of behavioral parameter sets. Note that the main focus of these studies is on a relatively macro range where the proportion of behavioral parameter sets is greater than 0.1%. In this study, due to the large scale of parameter sets sampling, we mainly investigate the micro interval where the proportion of behavioral parameter sets is less than 0.1%. The results also show that the larger the sample size, the more likely it is to obtain the behavioral parameter sets under the better likelihood function threshold conditions. As shown in Table 4, the test performed with 10.2 million samples obtained 2 behavioral parameter sets at the likelihood function threshold of 0.78, but no behavioral parameter set was obtained in the 1.02 million and 102,000 samples at the same threshold condition. Moreover, the results show that if the behavioral parameter sets are obtained under the condition of a higher likelihood function threshold, the parameters can better reflect the posterior distribution of the model parameters when they approach the global optimal solution, which is the best simulation performance of all the possible parameter sets and can also provide high-quality forecast members for future hydrological ensemble forecasting. Figure 4 also shows that the proportion of behavioral parameter sets increases exponentially with the decrease in the likelihood function, which is NSE in this study, threshold. In the existing research on parameter uncertainty using the GLUE method, Li et al. [28] and Xue et al. [62] have studied the relationship between the likelihood function threshold in the GLUE method and the proportion of behavioral parameter sets. It is found that there is a good linear relationship between the likelihood function threshold and the proportion of behavioral parameter sets. Note that the main focus of these studies is on a relatively macro range where the proportion of behavioral parameter sets is greater than 0.1%. In this study, due to the large scale of parameter sets sampling, we mainly investigate the micro interval where the proportion of behavioral parameter sets is less than 0.1%. This shows that there is a different relationship between the threshold value and the proportion of behavioral parameter sets around the tipping point of 0.1%, which is a linear relationship in the range of more than 0.1% and an exponential relationship in the interval of less than 0.1%. Figure 5 illustrates the comparison of the posterior distribution of the sensitivity parameters derived from the experiment in Section 4.2 for the sampling scale of 10.2 million and the likelihood thresholds of 0.66 and 0.74. There are several aspects that can be interpreted from the results. First, the types of sensitivity parameters remain consistent in different threshold situations and thus in the sampling situations (plot not given). For both threshold values, the Xinanjiang model obtained the same 9 sensitive parameters through the parallel GLUE method. Second, the posterior distribution of some sensitivity parameters changes when the threshold value changes, while those of others remain unchanged. As shown in Figure 5, some of the sensitivity parameters, i.e. IMP, WM3, KKSS and PETM, get more concentrated posterior distributions when the threshold value increases from 0.66 to 0.74. However, the sampling scale has no impact on the posterior distributions. Third, the scatter plots in Figure 5 show that there is no linear relationship among the sensitive parameters, regardless of the threshold values or the sampling scale. This shows that there is a different relationship between the threshold value and the proportion of behavioral parameter sets around the tipping point of 0.1%, which is a linear relationship in the range of more than 0.1% and an exponential relationship in the interval of less than 0.1%. Figure 5 illustrates the comparison of the posterior distribution of the sensitivity parameters derived from the experiment in Section 4.2 for the sampling scale of 10.2 million and the likelihood thresholds of 0.66 and 0.74. There are several aspects that can be interpreted from the results. First, the types of sensitivity parameters remain consistent in different threshold situations and thus in the sampling situations (plot not given). For both threshold values, the Xinanjiang model obtained the same 9 sensitive parameters through the parallel GLUE method. Second, the posterior distribution of some sensitivity parameters changes when the threshold value changes, while those of others remain unchanged. As shown in Figure 5, some of the sensitivity parameters, i.e. IMP, WM3, KKSS and PETM, get more concentrated posterior distributions when the threshold value increases from 0.66 to 0.74. However, the sampling scale has no impact on the posterior distributions. Third, the scatter plots in Figure 5 show that there is no linear relationship among the sensitive parameters, regardless of the threshold values or the sampling scale.

Conclusions
In this study, an MPI-based parallel computing accelerated GLUE method was proposed and implemented in a supercomputing system. A smaller scale experiment was performed with different processors to test the computational efficiency of the parallel GLUE method. A larger scale experiment was performed to analyze the impact of the sampling size and the threshold value on the parameter uncertainty of the Xinanjiang model. The following conclusions were drawn from the study: 1.
The MPI-based parallel GLUE method can significantly improve the calculation efficiency and shorten the calculation time. For the sampling size of 100,000, the parallelism efficiency remains at 0.865, and the speedup ratio is close to a linear speedup ratio, even if the allocated processors are up to 400. The results show the great scalability of the present method.

2.
The proportion of behavioral parameter sets remains unchanged as the sampling scale changes. Within the interval of less than 0.1%, the proportion of behavioral parameter sets and the threshold value have an exponential relationship. The larger the sampling scale, the more likely it is that behavioral parameter sets are obtained at high threshold values.

3.
The threshold values impact the posterior distribution of the sensitivity parameters. Some parameters may yield more concentrated posterior distributions when the threshold is higher.