An Appraisal of Dynamic Bayesian Model Averaging-based Merged Multi-Satellite Precipitation Datasets Over Complex Topography and the Diverse Climate of Pakistan

Merging satellite precipitation products tends to reduce the errors associated with individual satellite precipitation products and has higher potential for hydrological applications. The current study evaluates the performance of merged multi-satellite precipitation dataset (daily temporal and 0.25◦ spatial resolution) developed using the Dynamic Bayesian Model Averaging algorithm across four different climate regions, i.e., glacial, humid, arid and hyper-arid regions, of Pakistan during 2000–2015. Four extensively evaluated SPPs over Pakistan, i.e., Tropical Rainfall Measurement Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42V7, Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), Climate Prediction Center MORPHing technique (CMORPH), and Era-Interim, are used to develop the merged multi-satellite precipitation dataset. Six statistical indices, including Mean Bias Error, Mean Absolute Error, Root Mean Square Error, Correlation Coefficient, Kling-Gupta efficiency, and Theil’s U coefficient, are used to evaluate the performance of merged multi-satellite precipitation dataset over 102 ground precipitation gauges both spatially and temporally. Moreover, the ensemble spread score and standard deviation are also used to depict the spread and variation of precipitation of merged multi-satellite precipitation dataset. Skill scores for all statistical indices are also included in the analyses, which shows improvement of merged multi-satellite precipitation dataset against Simple Model Averaging. The results revealed that DBMA-MSPD assigned higher weights to TMPA (0.32) and PERSIANN-CDR (0.27). TMPA presented higher skills in glacial and humid regions with average weights of 0.32 and 0.37 as compared to PERSIANN-CDR of 0.27 and 0.25, respectively. TMPA and Era-Interim depicted higher skills during pre-monsoon and monsoon seasons, with average weights of 0.31 and 0.52 (TMPA) and 0.25 and 0.21 (Era-Interim), respectively. Merged multi-satellite precipitation dataset overestimated precipitation in glacial/humid regions and showed poor performance, with the poorest values of mean absolute error (2.69 mm/day), root mean square error (11.96 mm/day), correlation coefficient (0.41), Kling-Gupta efficiency score (0.33) and Theil’s U (0.70) at some stations in glacial/humid regions. Higher performance is observed in hyper-arid region, with the best values of 0.71 mm/day, 1.72 mm/day, 0.84, 0.93, and 0.37 for mean absolute error, root mean square error, correlation coefficient, Kling-Gupta Efficiency score, and Theil’s U, respectively. Merged multi-Satellite Precipitation Dataset demonstrated significant improvements as compared to TMPA across all climate regions with average improvements of 45.26% (mean bias error), 30.99% (mean absolute error), 30.1% (root mean square error), 11.34% (correlation coefficient), 9.53% (Kling-Gupta efficiency score) and 8.86% (Theil’s U). The ensemble spread and variation of DBMA-MSPD calculated using ensemble spread score and standard deviation demonstrates high spread (11.38 mm/day) and variation (12.58 mm/day) during monsoon season in the humid and glacial regions, respectively. Remote Sens. 2020, 12, 10; doi:10.3390/rs12010010 www.mdpi.com/journal/remotesensing Remote Sens. 2020, 12, 10 2 of 30 Moreover, the improvements of DBMA-MSPD quantified against fixed weight SMA-MSPD reveals supremacy of DBMA-MSPD, higher improvements (40–50%) in glacial and humid regions.


Introduction
Precipitation is one of the integral components in climate studies and global water and energy cycles [1].Precipitation is used as a forcing data in hydrological modeling such as weather monitoring, streamflow forecasting, flood simulations and warnings, and hydrological trend analysis [2].Therefore, the accuracy of hydrological simulations and applications is contingent on the accuracy of precipitation.Conventional ground-based precipitation measurements, including rain (or snow) gauges networks and weather radar systems, are either sparsely distributed both spatially and temporally or do not exist at all due to climate variability, complex topography, human geography, and other limited conditions [3,4].However, ground precipitation gauges (GPGs) are considered the most accurate method for measuring the precipitation because they provide direct precipitation observations [5].
Recent advancements in remote sensing technologies provide an alternative solution to surmount the spatial and temporal limitations of conventional ground-based precipitation measurement networks.Satellite-based precipitation products (SPPs) have been utilized during the past thirty years for regional to global scale precipitation estimation [6,7].At present, SPPs have been extensively applied in hydrological simulations [8][9][10], regional and global precipitation pattern recognition, and drought monitoring [11][12][13].The applications of SPPs are complicated, and the proper selection of one SPP over another is debatable.Therefore, the evaluation of each SPP against the in-situ observations before the application is extremely important and considered a standard process [14,15].
Various studies have evaluated different SPPs against the GPGs both on regional and global scales at daily [16][17][18][19][20][21][22], monthly [17,19,23,24], and annual [17,20] temporal resolutions.It is concluded from the previous studies that all the SPPs have inevitable errors based on different statistical indices, including mean error, correlation with GPGs, and other categorical indices (probability of detection, critical success index, and false alarm ratio).The errors might be associated with precipitation retrieval algorithms or other factors such as the topographic, climatic and seasonal dependency of SPPs.These errors impact the quality of precipitation estimates, which propagate further into hydrological modeling and other applications [16].
Several efforts, such as developing new products with high spatial and temporal resolution [23,25], reducing sampling issues, improvements in precipitation estimation algorithms, merging different SPPs using relative weights and weight estimation using the dynamic methods, have been made to address the uncertainties of existing SPPs [7,16,26].Developing and application of a merged multi-satellite precipitation dataset (MSPD) based on different statistical models or even the same statistical model can significantly improve the accuracy of meteorological and hydrological models [27].The performance of MSPD is generally better than all or most of the individual merging members.Several methods are proposed in literature to develop MSPDs, which include Statistical Objective Analysis [28], Conditional Merging [29,30], Simple Scaling Method [30], data assimilation [31], Bayesian Model Averaging [26,27,32], probability density function [33], variation approach [34], and neural network analysis [35].Furthermore, Li and Shao [36] merged gauged precipitation and TRMM datasets over Australia using a non-parametric kernel merging method, and concluded high performance of the developed MSPD.Readers are referred to [36][37][38][39] for more details about techniques to develop MSPDs.
Many studies reported significantly better performance of MSPD developed for estimation of climate and hydrological variables using a simple model averaging (SMA) approach [24, 27,40].Shen et al. [41] evaluated the One-Outlier Removed (OOR) method and reported better performance of OOR than SMA for all seasons except the winter in Tibetan Plateau.Bayesian Model Averaging (BMA) is a more progressive merging technique, which calculates the optimum weights for individual merging members based on the ground observations.In this method, consensus predictions are derived from multiple competing predictors, and MSPD is generated to exploit the strength of every prediction.BMA method assigns weights to each member based on its predictive performance [27,42,43].It provides a more convincing portrayal of the predictive uncertainty that considers both within-model and between-model variances and also outperformed other merging techniques by producing reliable and precise results [40,42,43].This method has also been successfully applied in hydrological modeling and streamflow simulations [44][45][46].A MSPD using Dynamic BMA (DBMA) was developed recently to optimally merge four SPPs and evaluated over Tibetan Plateau both topographically and seasonally [26,47].It accounts for both spatial and temporal variability in the performances of the merging members.The DBMA-MSPD was found superior to both SMA and OOR merging methods [26].
Very limited literature is available on developing and evaluating the MSPDs across Pakistan [16,22,48].The results demonstrated significant improvements of MSPDs overall merging members.However, most of the literature used the fixed weight techniques [16,22] or the dynamic weight algorithms to merge different SPPs in each climate region [48], and the spatial variation of these weights are not fully considered.The present MSPD considers both spatial and temporal variation in weights for different SPPs over complex topography and diverse climate of Pakistan.
The key objectives of the current study are to develop a MSPD using the Dynamic Bayesian Model Averaging (DBMA-MSPD) algorithm for whole Pakistan, to evaluate and compare the performance of DBMA-MSPD against the individual members at seasonal and topographical scales.This experiment is performed using four comprehensively evaluated satellite products in Pakistan, including Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA), Climate Prediction Center MORPHing technique (CMORPH), ERA-Interim, and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Network (PERSIANN-CDR), at daily scale for sixteen years (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015).This study is organized as follows: Section 2 introduces the study area, data and methods, Section 3 represents the comprehensive evaluation and discussion, and Section 4 is the conclusion of the current research.

Study Area
Pakistan lies between 23.5 • -37.5 • N latitude and 62 • -75 • E longitude having an area of 803,940 km 2 (Figure 1a).Pakistan shares its boundaries with China at its north, India at east, Iran, and Afghanistan at west and the Arabian Sea at south.The study area is topographically very complex, with a maximum elevation of 8600 m above sea level (the Himalayas and Karakoram mountain ranges) to a minimum of 0 m (Arabian Sea).
Pakistan has a diverse climate with four distinct climate regions, i.e., glacial, humid, arid, and hyper-arid regions.Extreme north of the country is occupied by glacial region with the mean elevation of 4158 m and mean annual precipitation of 348 mm.This region is mostly covered with snow and glaciers.The Hindu Kush Himalaya (HKH) mountain ranges, which are very famous for snow and glaciers, also lie in the glacial region.Snow and glacier melt from this region in the summer season is the main source of water in the Indus river for domestic, industrial, and agriculture sectors of Pakistan.However, the excessive snow and glacier melt caused acute flood events in Pakistan, especially the 2010 flood, which took thousands of lives and severely damaged the infrastructure and economy of the country.The humid region consists of very high mountains of HKH and all major rivers of the country (Indus, Kabul, Gilgit, Chitral, Swat, Panjkora, Hunza, Kurram, and Jhelum).Mean elevation of the humid region is 1286 m, with a mean annual precipitation of 852 mm.The arid region is composed of major agricultural region of Punjab province.The Indus river and its tributaries flow through the region and considered as a primary source of water in the region.The average elevation and mean annual precipitation in arid region are 633 m and 322 mm, respectively.Hyper-arid region includes the Sindh and Balochistan provinces, and south part of the Punjab province.Most of the hyper-arid region consists of barren lands and dry mountains.The mean elevation and mean annual precipitation in the region are 444 m and 133 mm, respectively.

Ground Precipitation Gauge Data
Ground precipitation gauges (GPGs) are considered as standard precipitation data sources, which provide direct precipitation records for satellite precipitation calibration and validation processes.Pakistan Meteorology Department (PMD), and Snow and Ice Hydrology Project (SIHP) of

Ground Precipitation Gauge Data
Ground precipitation gauges (GPGs) are considered as standard precipitation data sources, which provide direct precipitation records for satellite precipitation calibration and validation processes.Pakistan Meteorology Department (PMD), and Snow and Ice Hydrology Project (SIHP) of Water and Power Development Authority (WAPDA) own the meteorological data of Pakistan.SIHP operates the meteorological stations at high altitudes, mostly situated in glacial and humid regions.The daily precipitation records from 2000 to 2015 at 102 GPGs have been collected from both organizations.Out of 102 GPGs, precipitation records of 79 GPGs are collected from PMD, while the remaining 23 GPGs from WAPDA. Figure 1b shows the geographical distribution of obtained meteorological stations.In the current study, GPGs are named with respect to the climate zones, i.e., GPGs in the glacial, humid, arid, and hyper-arid regions are represented as G-GPGs, H-GPGs, A-GPGs, and HA-GPGs.The area (km 2 )/number of stations in glacial, humid, arid, and hyper-arid regions are 72,774/19, 137,753/39, 270,484/19, and 322,929/25, respectively.
The precipitation records are manually collected by PMD and WAPDA, which might be subjected to personal and instrumental errors.Furthermore, GPGs located at high elevations may also be subjected to other errors due to splashing and wind.Therefore, PMD and WAPDA do the correction of precipitation records following the standard code of World Meteorological Organization (WMO-N).Moreover, data quality tests, Skewness and Kurtosis methods are performed after filling the missing data using the zero-order method [16].
The distribution of GPGs in Pakistan is non-homogenous and very limited/scarce, which is not enough to understand the precipitation pattern, its spatial and temporal variability over different climate regions [48].In the last few decades, the number of GPGs has significantly increased, but the density is still not able to cope with scientific requirements [49].Therefore, the application of SPPs and development of MSPDs are of paramount importance over Pakistan as an alternative precipitation source for hydrological and meteorological applications.

Satellite-based Precipitation Datasets
The experiment is conducted using three satellite-based precipitation datasets, including TMPA 3B42-v7, CMORPH, and PERSIANN-CDR, and one re-analysis precipitation dataset (Era-Interim).These datasets are selected because they are comprehensively evaluated in the previous studies over the study area (Pakistan).

TMPA 3B42-v7
The Tropical Rainfall Measurement Mission (TRMM), a first dedicated satellite precipitation product launched in late 1997, was capable of measuring moderate and heavy precipitation with reasonable accuracy.The TRMM Multi-Satellite Precipitation Analysis (TMPA) is one of the most suitable TRMM based multi-satellite precipitation products for near real-time (3B42-RT) and research (3B42-V6/3B42-V7) applications [50,51].To enhance the calibration process, 3B42-V7 uses the global real-time precipitation datasets from Global Precipitation Climatology Centre (GPCC).In the current study, TMPA 3B42-V7 (with spatial resolution of 0.25 • and daily temporal resolution) is used which has many advanced specifications as compared to previous versions such as a newer IR brightness temperature dataset, latitude band calibration system, additional satellite inputs, and a single uniformed processed surface precipitation analysis [31].

PERSIANN-CDR
Center of Hydrometeorology and Remote Sensing (CHRS) at the University of California Irvine developed the PERSIANN algorithm and has 0.25 • /daily spatial/temporal resolution.PERSIANN adjust the neural network parameters by using Passive Microwave (PMW) data (TMI, AMSU-B, and SSM/I) to enhance the accuracy of precipitation estimates.PERSIANN-CDR, developed by CHRS with the same spatial resolution and family, uses the same neural network used by previous products for precipitation estimation.The only difference is the use of input IR dataset with GridSat-B1 instead of CPC-IR.Moreover, PMW data is not used in PERSIANN-CDR [52].The global atmospheric reanalysis precipitation dataset, Era-Interim, is developed by the European Centre for Medium-Range Weather Forecasts (ECMWF).Era-interim provides the real-time global precipitation observations from 1979 to present with spatial resolution 0.25 • .Era-Interim data is generated from the data assimilation system based on IFS (Cy31r2) 2006 release.It includes a four-dimensional variational (4D-Var) analysis with 12 hours window.Era-Interim estimates the precipitation based on temperature and relative humidity while using weather forecast model [54].

Dynamic Bayesian Model Averaging (DBMA)
The term dynamic is used to consider both spatial and temporal variability of BMA weights.BMA is a statistical approach to combine the predictions and inferences based on multiple competing models to yield a highly skillful and reliable probabilistic merging [26,27].BMA uses Bayes theorem to model not only parameters uncertainty through prior distribution but also model the uncertainty by obtaining parameter and model's posterior.It is a robust technique for direct model selection, combined estimation, and prediction.BMA estimates weights as posterior probabilities, which reflect model's forecasting skills relative to other models in the training period [27].In the current study, BMA is applied to optimally merge four SPPs to improve satellite precipitation estimation and obtain best agreement with GPGs measurements by adjusting predictive probability density function (PDF).Figure 2 shows the flow chart of dynamic BMA (DBMA) algorithm for merging multi-satellite precipitation.MSPD in the current study is developed by following the methodology proposed by Ma et al. [26], to which readers are referred for a detailed description.
The calculated BMA weights are adjusted for all merged members at each calibrated location on each day.Note that BMA analyses are performed only when all the SPPs captured precipitation (hit cases only).Since BMA weights are applied at the point scale, pixels having at least a single GPG are checked for spatiotemporal SPPs coincidence.The next day weights of dynamic BMA are based on adding new sample data (e.g., 41st day) and deleting the oldest sample data (e.g., 1st day) from the training dataset.Therefore, there is considerable overlap (39 days) between the training data at GPG from one day to another.
Several studies demonstrated that the performance of dynamic MSPDs in terms of Root Mean Square Error (RMSE) and Correlation Coefficient (CC) increased till 40 days; however, no improvements are observed beyond 40 days interval [26,47,[55][56][57].Therefore, BMA is trained for a time window of three years using 40 days training period.The training was performed for each calibrated GPG on a daily temporal scale.Considering year 2002 as a basic (starting) year (for which previous two years of daily precipitation data is available), 40 days of training period in one year and 40 training days in each of the previous two years, with a total of 120 days are considered to optimize BMA weights.Hamill [55] also concluded that expanding the training period to include the same season days from previous year would likely improve model performance.The optimal BMA weights were then interpolated to the whole study area (Pakistan) using the ordinary kriging method.Finally, a blended satellite precipitation estimate for each grid is calculated based on individual data and corresponding optimal grid weights across Pakistan.
To check the importance and sensitivity of training time windows (TTW), we have considered four sets of TTW, i.e., 1-year, 2-years, 3-years, and 5-years.MAE and RMSE statistical indices calculated on daily temporal scale averaged over each year of TTW are used to evaluate the accuracy and performance of MSPD during each TTW (Figures 3 and 4, respectively).Figures 3 and 4 show the MSPD performance in glacial, humid, arid, and hyper-arid regions during four sets of TTW.The analyses show that performance of DBMA-MSPD increases with an increase in TTW duration, which supports the statement of Hamil [55].Higher fluctuations (low accuracy) are observed for 1-year TTW, which are decreasing with an increase in TTW duration, e.g., 3-years and 5-years.The analyses depict that there are no significant improvements when we expand the duration from 3-years to 5-years (only minor improvements in 5-years TTW as compared to 3-years).Therefore, 3-years of TTW is selected because of considerable improvements as compared to 1-year and 2-years TTW and less number of training datasets as compared to 5-years TTW.

Dynamic Bayesian Model Averaging (DBMA)
The term dynamic is used to consider both spatial and temporal variability of BMA weights.BMA is a statistical approach to combine the predictions and inferences based on multiple competing models to yield a highly skillful and reliable probabilistic merging [26,27].BMA uses Bayes theorem to model not only parameters uncertainty through prior distribution but also model the uncertainty by obtaining parameter and model's posterior.It is a robust technique for direct model selection, combined estimation, and prediction.BMA estimates weights as posterior probabilities, which reflect model's forecasting skills relative to other models in the training period [27].In the current study, BMA is applied to optimally merge four SPPs to improve satellite precipitation estimation and obtain best agreement with GPGs measurements by adjusting predictive probability density function (PDF).Figure 2 shows the flow chart of dynamic BMA (DBMA) algorithm for merging multi-satellite precipitation.MSPD in the current study is developed by following the methodology proposed by Ma et al. [26], to which readers are referred for a detailed description.The calculated BMA weights are adjusted for all merged members at each calibrated location on each day.Note that BMA analyses are performed only when all the SPPs captured precipitation (hit cases only).Since BMA weights are applied at the point scale, pixels having at least a single GPG are checked for spatiotemporal SPPs coincidence.The next day weights of dynamic BMA are based on adding new sample data (e.g., 41st day) and deleting the oldest sample data (e.g., 1st day) from the training dataset.Therefore, there is considerable overlap (39 days) between the training data at GPG from one day to another.
Several studies demonstrated that the performance of dynamic MSPDs in terms of Root Mean Square Error (RMSE) and Correlation Coefficient (CC) increased till 40 days; however, no improvements are observed beyond 40 days interval [26,47,[55][56][57].Therefore, BMA is trained for a time window of three years using 40 days training period.The training was performed for each calibrated GPG on a daily temporal scale.Considering year 2002 as a basic (starting) year (for which previous two years of daily precipitation data is available), 40 days of training period in one year and 40 training days in each of the previous two years, with a total of 120 days are considered to optimize BMA weights.Hamill [55] also concluded that expanding the training period to include the same season days from previous year would likely improve model performance.The optimal BMA weights were then interpolated to the whole study area (Pakistan) using the ordinary kriging method.Finally, a blended satellite precipitation estimate for each grid is calculated based on individual data and corresponding optimal grid weights across Pakistan.
To check the importance and sensitivity of training time windows (TTW), we have considered four sets of TTW, i.e., 1-year, 2-years, 3-years, and 5-years.MAE and RMSE statistical indices calculated on daily temporal scale averaged over each year of TTW are used to evaluate the accuracy and performance of MSPD during each TTW (Figure 3 and Figure 4, respectively).Figure 3 and Figure 4 show the MSPD performance in glacial, humid, arid, and hyper-arid regions during four sets of TTW.The analyses show that performance of DBMA-MSPD increases with an increase in TTW duration, which supports the statement of Hamil [55].Higher fluctuations (low accuracy) are observed for 1-year TTW, which are decreasing with an increase in TTW duration, e.g., 3-years and 5-years.The analyses depict that there are no significant improvements when we expand the duration from 3-years to 5-years (only minor improvements in 5-years TTW as compared to 3-years).Therefore, 3-years of TTW is selected because of considerable improvements as compared to 1-year and 2-years TTW and less number of training datasets as compared to 5-years TTW.

Performance Evaluation
The statistical indices, including Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Correlation Coefficient (CC) [58], Kling-Gupta efficiency (KGE score) [59], Theil's U coefficient [60], ensemble spread score (ESS), and standard deviation (SD) are used to assess and compare the performance of DBMA-MSPD against the GPG observations.Table 1 represents a complete description of the statistical indices.MBE demonstrates over/under-estimation of precipitation with positive/negative values.MAE represents the average absolute error between DBMA-MSPD and GPGs.RMSE evaluates the average squared error magnitude between the merged (DBMA-MSPD) and observed precipitation data.CC represents an agreement between the simulated and observed observations.KGE combines the variability ratio (γ), bias ratio (β), and correlation coefficient (R).The perfect values for KGE, β, and γ are all 1.Theil's U evaluates the accuracy of DBMA-MSPD forecasted precipitation associated with GPGs.Theil's U can be related to R2 but differs in boundary conditions, i.e., not bounded by zero and one.Value of Theil's U closer to zero represents perfect forecasting accuracy; assumes a value of 1 when models forecast the same errors as the naïve no-change extrapolation and value greater than 1 represent worst forecasting accuracy and has to be rejected [60].ESS is usually the deviation of DBMA-MSPD precipitation from its mean [61].The ideal DBMA merged precipitation will have the same size of ESS as its RMSE at the same time span [61,62].SD is used to describe the MSPD simulated precipitation variance.A higher SD value indicates a higher variation in simulated precipitation from its mean.
Moreover, we further compared the performance of DBMA-MSPD with SMA based MSPD.SMA is a fixed weight ensemble approach, which merges SPPs using the arithmetic mean of all members [41].The equation for SMA approach is given as: where i S represents individual SPP, n is the number of SPPs and MSPD SMA is the final MSPD developed using SMA approach.

Performance Evaluation
The statistical indices, including Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Correlation Coefficient (CC) [58], Kling-Gupta efficiency (KGE score) [59], Theil's U coefficient [60], ensemble spread score (ESS), and standard deviation (SD) are used to assess and compare the performance of DBMA-MSPD against the GPG observations.Table 1 represents a complete description of the statistical indices.MBE demonstrates over/under-estimation of precipitation with positive/negative values.MAE represents the average absolute error between DBMA-MSPD and GPGs.RMSE evaluates the average squared error magnitude between the merged (DBMA-MSPD) and observed precipitation data.CC represents an agreement between the simulated and observed observations.KGE combines the variability ratio (γ), bias ratio (β), and correlation coefficient (R).The perfect values for KGE, β, and γ are all 1.Theil's U evaluates the accuracy of DBMA-MSPD forecasted precipitation associated with GPGs.Theil's U can be related to R2 but differs in boundary conditions, i.e., not bounded by zero and one.Value of Theil's U closer to zero represents perfect forecasting accuracy; assumes a value of 1 when models forecast the same errors as the naïve no-change extrapolation and value greater than 1 represent worst forecasting accuracy and has to be rejected [60].ESS is usually the deviation of DBMA-MSPD precipitation from its mean [61].The ideal DBMA merged precipitation will have the same size of ESS as its RMSE at the same time span [61,62].SD is used to describe the MSPD simulated precipitation variance.A higher SD value indicates a higher variation in simulated precipitation from its mean.
Moreover, we further compared the performance of DBMA-MSPD with SMA based MSPD.SMA is a fixed weight ensemble approach, which merges SPPs using the arithmetic mean of all members [41].The equation for SMA approach is given as: where S i represents individual SPP, n is the number of SPPs and MSPD SMA is the final MSPD developed using SMA approach. where The performance of DBMA-MSPD is compared with fixed weight MSPD (SMA, hereinafter SMA-MSPD) using a skill score for each statistical index used in the analyses (shown in Table 1).For example, the skill scores of MBE and MAE are shown in Equations ( 2) and (3), respectively.

Data Normality and Box-Cox Transformation
BMA analyses require the normal distribution of precipitation data; however, the precipitation data is usually gamma-distributed.Since we have considered only the hit cases and 2 mm as a threshold for precipitation and no precipitation events, the data is closely normally distributed almost overall GPGs/SPPs in four climate regions.The Jarque-Bera test was used to determine the normality of precipitation distribution [63].Data is normally distributed when the skewness (symmetry of residuals) is close to zero, and kurtosis (peakedness of the curve) is closer to a value of 3.During our analyses, skewness of the data is ranging between 0.27 to 0.78, kurtosis between 2.18 to 3.99, and Jarque-Bera between 13.46 to 47.28.Besides the Jarque-Bera test, we also applied Box-Cox transformation before the application of BMA to ensure that the transformed individual is close to normal distribution.The Box-Cox transformation is reliable for non-normal distribution in hydrological variables [26,42,44,64].Box-Cox transformation has the capability to transform the non-normally distributed data to datasets either normally distributed or close enough to the normal distribution [65].The results confirm that precipitation is normally distributed after Box-Cox transformation (Figure 5).
normally distributed data to datasets either normally distributed or close enough to the normal distribution [65].The results confirm that precipitation is normally distributed after Box-Cox transformation (Figure 5).The application of Box-Cox transformation is dependent on λ values, which might be positive or negative.This demonstrates that the transformed results do not cover the entire range (-∞, +∞); therefore, approximately normal distribution of precipitation is expected.The λ is obtained by calculating a minimum of unconstrained multivariable function using the Nelder-Mead simplex technique [66].The λ value is dependent on length of the sampling data (number of precipitation observations), and its values for TMPA, PERSIANN-CDR, Era-Interim, and CMORPH are ranging between 0.02-1.57,−0.03-2.21,0.02-1.96,and −0.08-2.37,respectively.The λ values of TMPA at representative GPGs across each glacial, humid, arid, and hyper-arid regions are 1.352, 0.515, 0.619, and 1.386, respectively.

Spatiotemporal Distribution of DBMA-MSPD Weights over Pakistan
The weights of the merged members (TMPA, Era-Interim, PERSIANN-CDR, and CMORPH) for the DBMA-MSPD are averaged over four seasons during 2000-2015 and presented in Figure 6.The analysis shows that DBMA assigned higher weights to TMPA (Figure 6a) and PERSIANN-CDR (Figure 6c), followed by Era-Interim (Figure 6b) and CMOPRH (Figure 6d).The average weights of TMPA, PERSIANN-CDR, Era-Interim, and CMORPH during 2000-2015 are 0.32, 0.27, 0.22, and 0.19, respectively.On the regional scale, TMPA shows higher skills in glacial (0.32) and humid (0.37)The application of Box-Cox transformation is dependent on λ values, which might be positive or negative.This demonstrates that the transformed results do not cover the entire range (−∞, +∞); therefore, approximately normal distribution of precipitation is expected.The λ is obtained by calculating a minimum of unconstrained multivariable function using the Nelder-Mead simplex technique [66].The λ value is dependent on length of the sampling data (number of precipitation observations), and its values for TMPA, PERSIANN-CDR, Era-Interim, and CMORPH are ranging between 0.02-1.57,−0.03-2.21,0.02-1.96,and −0.08-2.37,respectively.The λ values of TMPA at representative GPGs across each glacial, humid, arid, and hyper-arid regions are 1.352, 0.515, 0.619, and 1.386, respectively.

Spatiotemporal Distribution of DBMA-MSPD Weights over Pakistan
The weights of the merged members (TMPA, Era-Interim, PERSIANN-CDR, and CMORPH) for the DBMA-MSPD are averaged over four seasons during 2000-2015 and presented in Figure 6.The analysis shows that DBMA assigned higher weights to TMPA (Figure 6a) and PERSIANN-CDR (Figure 6c), followed by Era-Interim (Figure 6b) and CMOPRH (Figure 6d).The average weights of TMPA, PERSIANN-CDR, Era-Interim, and CMORPH during 2000-2015 are 0.32, 0.27, 0.22, and 0.19, respectively.On the regional scale, TMPA shows higher skills in glacial (0.32) and humid (0.37) regions as compared to PERSIANN-CDR having DBMA weights of 0.27 and 0.25.Moreover, the average weights of Era-Interim vary from a maximum of 0.22 in glacial to 0.20 in humid region.Similarly, CMORPH weights    The temporal spans of four seasons are: April, May, and June (pre-monsoon), July, August, and September (monsoon), October and November (post-monsoon), and December, January, February, and March (winter).More than 60% of precipitation is received during the monsoon season in Pakistan, which varies with topography and climate in magnitude from low (<100 mm) in the glacial region (34-36 • N), higher (>700 mm) in the North-East (29-33 • N) and again low (around 100 mm) in the South (24-28 • N) [16,67].Spatial distribution of relative weights of merged members varies seasonally over all the climate regions.
During the pre-monsoon season (Figure 7), higher weights are assigned to TMPA (0.31), followed by Era-Interim (0.25).Weights of PERSIANN-CDR and CMORPH are 0.23 and 0.21.Highest average weight for TMPA is observed in the arid (0.36) and humid (0.33) regions.The distribution of weight is gradually decreasing from arid towards the hyper-arid region (0.31).However, for Era-Interim, highest average weight is observed in the humid (0.34) and glacial (0.25) regions.In contrast, for PERSIANN-CDR, the highest weight is observed in hyper-arid (0.28) and arid regions (0.26).Moreover, no specific trend is observed for CMORPH.  of Era-Interim (0.21), PERSIANN-CDR (0.14) and CMORPH (0.13), respectively.Similarly, on a regional scale, TMPA shows relatively higher skills in the humid and hyper-arid regions with average weights of 0.54 and 0.53, respectively.TMPA experienced almost similar average weights in all the climate regions.In the case of Era-Interim, highest weights are observed in hyper-arid and arid regions with average values of 0.24 and 0.23, respectively.However, no particular trend is observed for PERSIANN-CDR and CMORPH.PERSIANN-CDR and CMORPH experience higher skills in the glacial region with average weights of 0.16 and 0.17, respectively.During the pre-monsoon season (Figure 7), higher weights are assigned to TMPA (0.31), followed by Era-Interim (0.25).Weights of PERSIANN-CDR and CMORPH are 0.23 and 0.21.Highest average weight for TMPA is observed in the arid (0.36) and humid (0.33) regions.The distribution of weight is gradually decreasing from arid towards the hyper-arid region (0.31).However, for Era-Interim, highest average weight is observed in the humid (0.34) and glacial (0.25) regions.In contrast, for PERSIANN-CDR, the highest weight is observed in hyper-arid (0.28) and arid regions (0.26).Moreover, no specific trend is observed for CMORPH.
During the post-monsoon season (Figure 9), weights are shifted to PERSIANN-CDR and CMORPH with average weights of 0.33 and 0.25.The average weights of TMPA and Era-Interim are 0.20 and 0.22.PERSIANN-CDR and CMORPH show a relatively small variation in the distribution of weights across different climate regions.PERSIANN-CDR shows higher skills in arid (0.37) and hyper-arid (0.36) regions while relatively lower skills in glacial (0.33) and humid (0.32) regions.However, no significant variation is observed for CMORPH.The regional average weights for CMORPH are 0.26 (glacial), 0.27 (humid), 0.26 (arid) and 0.25 (hyper-arid).During the winter season (Figure 10), PERSIANN-CDR and TMPA show higher skills as compared to other merged members.The average weights of the merged members during the winter During the monsoon season (Figure 8), TMPA dominates other merging members having higher accuracy with an overall average weight of 0.52, which is significantly higher than average weights of Era-Interim (0.21), PERSIANN-CDR (0.14) and CMORPH (0.13), respectively.Similarly, on a regional scale, TMPA shows relatively higher skills in the humid and hyper-arid regions with average weights of 0.54 and 0.53, respectively.TMPA experienced almost similar average weights in all the climate regions.In the case of Era-Interim, highest weights are observed in hyper-arid and arid regions with average values of 0.24 and 0.23, respectively.However, no particular trend is observed for PERSIANN-CDR and CMORPH.PERSIANN-CDR and CMORPH experience higher skills in the glacial region with average weights of 0.16 and 0.17, respectively.season are 0.33 (TMPA), 0.14 (Era-Interim), 0.38 (PERSIANN-CDR), and 0.15 (CMORPH).In the glacial, arid and hyper-arid regions, PERSIANN-CDR dominated TMPA.However, TMPA shows slightly higher skills in the humid region.The average weights of PERSIANN-CDR in all the climate regions from glacial to hyper-arid region are 0.35, 0.38, 0.41, and 0.40, respectively.Similarly, the average weight of TMPA are 0.27 (glacial), 0.39 (humid), 0.28 (arid) and 0.35 (hyper-arid).The average DBMA-MSPD weights (Figure 11) at representative GPGs in each climate region are plotted against DOY (Day of the Year).Representative GPGs are those whose weights are close to the average temporal distribution of DBMA-MSPD weights in the corresponding climate region.The representative stations for glacial, humid, arid and hyper-arid regions are G-GPG10, H-GPG17, A-GPG13, and HA-GPG17, respectively.In the glacial region, significant variation is observed in During the post-monsoon season (Figure 9), weights are shifted to PERSIANN-CDR and CMORPH with average weights of 0.33 and 0.25.The average weights of TMPA and Era-Interim are 0.20 and 0.22.PERSIANN-CDR and CMORPH show a relatively small variation in the distribution of weights across different climate regions.PERSIANN-CDR shows higher skills in arid (0.37) and hyper-arid (0.36) regions while relatively lower skills in glacial (0.33) and humid (0.32) regions.However, no significant variation is observed for CMORPH.The regional average weights for CMORPH are 0.26 (glacial), 0.27 (humid), 0.26 (arid) and 0.25 (hyper-arid).
The average DBMA-MSPD weights (Figure 11) at representative GPGs in each climate region are plotted against DOY (Day of the Year).Representative GPGs are those whose weights are close to the average temporal distribution of DBMA-MSPD weights in the corresponding climate region.The representative stations for glacial, humid, arid and hyper-arid regions are G-GPG10, H-GPG17, A-GPG13, and HA-GPG17, respectively.In the glacial region, significant variation is observed in PERSIANN-CDR and Era-Interim weights.However, less volatility can be seen for TMPA and moderate for CMORPH.Peak weights of TMPA, PERSIANN-CDR, and Era-Interim are observed between DOY 156-232 during monsoon season (late pre-monsoon, monsoon, and early post-monsoon rainy days).In the humid region, higher weights are assigned to TMPA followed by PERSIANN-CDR.

Statistical Evaluation of DBMA-MSPD over Pakistan
Statistical evaluation of DBMA-MSPD is performed using pixel by pixel analysis (pixels having at least one GPG were selected) to ensure proper analysis of DBMA-MSPD over Pakistan.The DBMA-MSPD performance in daily precipitation estimation over 102 GPGs during the past 16 years over entire Pakistan is shown in Figure 12.
Figure 12a shows mean bias error (MBE) of DBMA-MSPD over Pakistan.DBMA-MSPD overestimated precipitation in the extreme north (glacial region) of Pakistan; however, the error is declining towards the south.The maximum overestimation is observed at HMS4 (+1.89 mm/day), while the maximum underestimation is -1.14 mm/day at HMS34.
Figure 12b shows the spatial distribution of mean absolute error (MAE) over Pakistan.A high magnitude of error is observed in the glacial and humid regions, which are gradually decreasing toward the hyper-arid region.The MAE for DBMA-MSPD is variable even in the glacial and humid regions ranging from 2.69 mm/day (at HMS4) to 1.36 mm/day at GMS5.However, MAE falls

Statistical Evaluation of DBMA-MSPD over Pakistan
Statistical evaluation of DBMA-MSPD is performed using pixel by pixel analysis (pixels having at least one GPG were selected) to ensure proper analysis of DBMA-MSPD over Pakistan.The DBMA-MSPD performance in daily precipitation estimation over 102 GPGs during the past 16 years over entire Pakistan is shown in Figure 12.   Figure 12a shows mean bias error (MBE) of DBMA-MSPD over Pakistan.DBMA-MSPD overestimated precipitation in the extreme north (glacial region) of Pakistan; however, the error is declining towards the south.The maximum overestimation is observed at HMS4 (+1.89 mm/day), while the maximum underestimation is −1.14 mm/day at HMS34.
Figure 12b shows the spatial distribution of mean absolute error (MAE) over Pakistan.A high magnitude of error is observed in the glacial and humid regions, which are gradually decreasing toward the hyper-arid region.The MAE for DBMA-MSPD is variable even in the glacial and humid regions ranging from 2.69 mm/day (at HMS4) to 1.36 mm/day at GMS5.However, MAE falls gradually from humid region to hyper-arid region ranging from 2.38 mm/day at HMS11 to 0.71 mm/day at HAMS18.
Figure 12c shows RMSE distribution, where a similar spatial distribution trend as MAE is observed, i.e., RMSE is decreasing from north to south.RMSE is ranging from 11.96 mm/day (at HMS11) to 1.72 mm/day (HAMS8).The figure depicts higher RMSE in the downstream part of the glacial region and upstream part of the humid region.Minimum RMSE is observed in the South-West of Pakistan.
DBMA-MSPD shows a higher correlation with the GPGs with an average CC of 0.70 (Figure 12d) over the entire Pakistan.Higher CC is mainly observed in the south-east of Pakistan (maximum CC is 0.84 at HAMS18), while lower CC is 0.41 observed at GMS2 (glacial region).DBMA-MSPD shows poor correlation with GPGs in the glacial region, which gradually increasing towards the South of Pakistan.The average CC in the glacial region is 0.55, which increases to 0.68 in humid region, 0.77 in arid, and 0.81 in hyper-arid regions.
Figure 12e shows the spatial distribution of KGE score compared to GPGs observations.Smaller KGE score is observed in extreme east and west of glacial region, which gradually increases toward the south of Pakistan (from glacial region to hyper-arid region).The maximum KGE score is observed in hyper-arid region, more specifically at HA-GPG20 (0.93) and HA-GPG19 (0.91).Smallest KGE score in glacial the region might be associated with poor performance of individual SPPs in glacial and humid regions.The smallest KGE score is 0.33 at G-GPG10 and 0.34 at G-GPG11.
Figure 12f represents the spatial distribution of Theil's U over Pakistan.Theil's U is used in this study to observe the merged dataset forecasting accuracy over the spatial span of study area.It also squares the deviation to give more weight to errors and to exaggerate errors that help to detect the regions with high errors.Theil's U values closer to or larger than 1 represent poor forecasting capability of the statistical model (DBMA).It is observed that the model's forecasting is considerably increasing from north to south of Pakistan.The performance of DBMA-MSPD is poor in the glacial region, which is increasing gradually towards the hyper-arid region.Theil's U is ranging from 0.70 at H-GPG4 to 0.37 at HA-GPG18.The average Theil's U overall climate regions are 0.53, representing better forecasting accuracy of DBMA-MSPD.
Table 2 shows the mean and median of ESS and SD scores of DBMA-MSPD and SMA-MSPD during different seasons across four climate regions of Pakistan.For DBMA-MSPD, the analyses show that high ensemble spread is observed during monsoon season followed by pre-and post-monsoon seasons.The minimum spread is depicted in hyper-arid region across all seasons due to low precipitation magnitude/intensity and higher SPPs performance.On the other hand, higher spread is observed in humid region (more specifically in the north-west of the region) with maximum average ESS values of 11.38 mm/day and 9.26 mm/day, respectively, during monsoon and pre-monsoon seasons.A similar trend is also observed for the glacial region, with the maximum average ESS values of 10.26 mm/day in monsoon, and 8.80 mm/day in pre-monsoon seasons.In contrast, higher ESS values for arid region are observed during the post-monsoon season, which might be associated with relatively frequent precipitation (moderate magnitude and intensity).
Results in Table 2 show that variation in precipitation is dependent on magnitude and intensity, i.e., higher variation during monsoon season and lower variation in winter season.SD statistic for DBMA-MSPD shows high variations in glacial region as compared with other regions.Lower variations are observed in hyper-arid region, having very minimal precipitation amount.The maximum mean SD values in glacial and humid regions are 12.58 mm/day and 11.43 mm/day during monsoon season.SMA-MSPD shows comparatively lower ensemble spread and precipitation variation as compared to DBMA-MSPD.The analyses show that ESS score is increasing with the increase in precipitation magnitude and intensity.Thus, a higher ESS score is observed during monsoon followed by post-monsoon and pre-monsoon seasons.The ensemble spread of SMA-MSPD also varies with topography, i.e., high ESS score (6.67 mm/day) across humid region followed by 5.33 mm/day in glacial region during monsoon season.Minimum ensemble spread is observed across hyper-arid region, which might be associated with minimal precipitation in the region.A minimum ESS score of 1.77 mm/day is observed during the winter season across hyper-arid region.The analyses show that SMA-MSPD is not as effective as DBMA-MSPD to express the ensemble spread in precipitation during all seasons across all climate regions.
SMA-MSPD has a relatively poor capacity to depict the variation in precipitation as DBMA-MSPD evaluated across all four climate regions during different precipitation seasons.The precipitation variation is dependent on seasonal variations in magnitude and intensity of precipitation.In glacial region, a higher SD (average value of 6.79 mm/day) is observed during monsoon season, while minimum SD in 6.07 mm/day in winter season.The higher variation in SMA-MSPD precipitation (with average SD value of 6.57 mm/day) in humid region is observed during the monsoon season, while minimum average SD (2.12 mm/day) is observed during winter season in hyper-arid region.
Figure 13 shows the skill scores (improvements) of DBMA-MSPD compared with SMA-MSPD across Pakistan.Figure 13a shows higher improvements in MBE in the center north of glacial region (45-50%), which is decreasing from north to south (from glacial to humid, arid, and hyper-arid regions).Higher skills in glacial and humid regions confirm the significant improvements of BMA algorithm as compared to SMA.Besides, the supremacy dynamic (spatial and temporal) variation of weights can also be depicted from glacial and humid regions.Lower improvements in MBE are observed in hyper-arid region, more specifically in the south-east.The lower skills in hyper-arid region are associated with higher performance of individual SPP in the region.is observed in glacial region, while SD score is ranging from 30% to 45% in humid region.Minimum improvements (20-30%) in terms of SD are observed in hyper-arid region.MAE skills show higher improvements in 40%-50% and 30-40% in glacial and humid regions, respectively (Figure 13b).Slightly lower improvements in MAE are observed in the west of arid and east of hyper-arid regions ranging between 30-35%.Figure 13c shows higher improvements in RMSE across glacial (40-50%) and humid (40-45%) region, while lower in the middle of hyper-arid region (30-35%).
Lower improvements as compared to other skill scores are observed in CC, KGE, and Theil's U (Figure 13d-f).DBMA-MSPD significantly improved the CC in east of glacial region as compared to SMA-MSPD, while moderate improvements are observed in the rest of glacial and humid regions.Lower CC skill score is depicted in the extreme west of hyper-arid region (10-15%).There is no particular trend observed in the KGE skill score in glacial and humid regions; however, higher improvements are observed in these two regions, which are declining towards the hyper-arid region (Figure 13e).Theil's U score has been significantly increased in the extreme north of the glacial region (22-24%).However, the improvements of DBMA-MSPD show a decreasing trend towards the south (hyper-arid region) of Pakistan.
The ESS and SD skill scores of DBMA-MSPD compared with SMA-MSPD are also calculated and presented in Figure 14.Higher improvements in ensemble spread are observed in humid (40-50%) and glacial (35-40%) regions.ESS score is decreasing towards the hyper-arid region, where the ESS score is ranging between 20-30% (Figure 14a).SD skill score (Figure 14b) shows contrasting figues as compared with ESS score in humid and glacial regions.Maximum SD skill score (35-45%) is observed in glacial region, while SD score is ranging from 30% to 45% in humid region.Minimum improvements (20-30%) in terms of SD are observed in hyper-arid region.Overall, the skill scores show high improvements of DBMA-MSPD in glacial and humid regions, which is declining towards the hyper-arid region.Lower improvements in hyper-arid/arid region are due to the high performance of individual SPPs, low magnitude of precipitation and low elevation.Higher improvements in glacial region are due to the robustness of BMA algorithm and dynamic variation of weights as compared to fixed weight SMA-MSPD.

Comparison of DBMA-MSPD with SMA-MSPD and Merging Members using GPGs Observations
The comparison and evaluation of DBMA-MSPD and four merged members against 102 GPGs Overall, the skill scores show high improvements of DBMA-MSPD in glacial and humid regions, which is declining towards the hyper-arid region.Lower improvements in hyper-arid/arid region are due to the high performance of individual SPPs, low magnitude of precipitation and low elevation.
Higher improvements in glacial region are due to the robustness of BMA algorithm and dynamic variation of weights as compared to fixed weight SMA-MSPD.

Comparison of DBMA-MSPD with SMA-MSPD and Merging Members using GPGs Observations
The comparison and evaluation of DBMA-MSPD and four merged members against 102 GPGs in four climate zones using selected six statistical indices are presented in Table 3.The analysis shows that DBMA-MSPD outperformed all the merged members and showing significant improvement in all climate zones.Among the merged members, TMPA shows supremacy in capturing precipitation with high forecasting accuracies and a high correlation with GPGs in all climate zones.On the other side, CMORPH presents the worst performance in all climate zones.The analysis confirms a significant reduction in errors for DBMA-MSPD as compared to individual merged members and shows highest potential to be utilized in hydrological applications [26,48].

Discussion
Accurate estimation and capturing higher spatial and temporal variation in precipitation is a challenging task over the diverse climate and complex topography of Pakistan [16].The sparse distribution of GPGs contributes to low accuracy in estimation of precise precipitation.Therefore, SPPs are used as an alternative which provides homogenous precipitation estimates on regional to global scales.With significant advancement in remote sensing techniques and continuous improvements in satellite-based retrieval algorithms provide cost-effective and reliable precipitation estimates [68,69].However, SPPs observations are subjected to uncertainties resulting from retrieval algorithms.To reduce these uncertainties and to obtain high-quality precipitation estimates, researchers have focused on merging individual SPPs [16,22,26,70].
Very limited studies are found which focused on the development of merged precipitation datasets over Pakistan.Very recently, Rahman et al. [48], Rahman et al. [16], and Muhammad et al. [22] developed and evaluated merged multi-satellite precipitation datasets (MSPDs) in Pakistan.Muhammad et al. [22] developed MSPD across Pakistan by assigning relative weights to SPPs, i.e., Global Precipitation Measurement (GPM)-based Integrated Multi-Satellite Retrievals for GPM (IMERG) research version (IMERG-IR), IMERG real-time (IT), and TRMM 2B42 (RT).The results demonstrated significant agreement with GPG observations and improvements as compared to individual SPPs.Rahman et al. [48] developed Dynamic Clustered Bayesian Model Averaging (DCBA-MSPD) across Pakistan and evaluated its performance across four climate regions (spatial evaluation) and four seasons (temporal evaluation) during 2000-2015.They concluded better performance of DCBA-MSPD as compared to individual SPP members.However, relatively poor performance was observed over high elevated areas of glacial and humid regions.Similarly, Rahman et al. [16] reported significant improvement in precipitation estimation and reduction in uncertainties both on regional and seasonal scales.In the current study, the performance of DBMA-MSPD shows elevation dependency.Besides the significant improvements in reduction of errors associated with individual SPPs, higher discrepancies are observed in glacial and humid zones (high elevated zones).Besides the improvements, all the MSPDs show a strong dependency on precipitation magnitude/intensity and elevation [16,26,41].Shen et al. [41] reported that precipitation estimation capability of MSPDs reduces significantly at an elevation higher than 4000 m.
The possible sources of errors are elevation dependency, climate variability, topographic complexity, impact (limitations) of different sensors used in SPPs, and the retrieval algorithms used to estimate precipitation [48,71].The topography of Pakistan is very complex, consisting of the permanent snow and glaciers in glacial region, high mountain peaks and hilly areas in northern parts and plain areas having arid and hyper-arid characteristics in the south, all in a very short spatial span.The topographic complexity also contributes to strong scattering of MW (microwave) signals, especially across cold climate and snow-dominated regions (i.e., glacial region) [31,72].
The poor performance of all MSPDs evaluated over Pakistan in the glacial region might be associated with the performance of IR (InfraRed) and MW sensors used in precipitation estimation.Information about precipitation estimates based on the lowest temperature on the top of cloud is provided by IR sensors, while PMW provides details about precipitation area rather than cloud temperature [73].The clouds over glacial region are relatively warm and hamper the capabilities of sensors to estimate precipitation from warmer clouds.This is because the cloud top temperature is too warm for the IR sensors' threshold to differentiate between precipitation and no precipitation clouds [74,75].Furthermore, the accuracy of IR sensors is also dependent on brightness temperature, and polarization properties vary with snow cover and exposure, which in turn depend on altitude and terrain of the mountainous region [72,76].Besides the limitations of IR sensors, PMW sensors have poor accuracy in detecting precipitation over mountains (glacial and humid regions in the current study), which is due to warmer clouds that could produce heavy precipitation without more ice aloft in PMW algorithm, dense vegetation cover, and relative coarse spatial resolution [73].
Intense precipitation during monsoon season across the humid region causes attenuation in signals, which is very frequent in such regions [77].In addition, the retrieval algorithms used to estimate precipitation also contributes to errors in SPPs [78,79].The algorithms used in SPPs uses IR brightness temperature at the top of the cloud and indirectly estimate the precipitation without taking into consideration the elevation impact and sub-cloud evaporation phenomena [72,73], hence significantly affect the quality and accuracy of SPPs [80].Besides the poor performance of SPPs due to retrieval algorithms over heavy precipitation regions, the processing schemes of MW and IR sensors contribute to further uncertainties [72].Moreover, external errors related to the data quality of GPGs such as splashing and wind effects, human-induced errors, impact of snow, and non-uniform/sparse/limited distribution of GPGs also plays a role in calibrating the SPPs [81].
The factors mentioned above are the causes of poor performance of MSPDs over glacial and humid regions of Pakistan, i.e., poor performance of SPPs in glacial and humid regions leads to greater uncertainties in the developed MSPDs [57].High performance of individual SPPs in arid and hyper-arid regions [16,82,83] contributes to the high performance of MSPDs.
MSPD developed using a dynamic algorithm (variation of weights over time and space) has many advantages as compared to fixed weight-based MSPD.The dynamic weights can account the regional topographic complexity as we all as extreme local climate that can influence the merging process.In comparison to fixed weight MSPDs, the improvements of DBMA-MSPD is quantified using a skill score calculated using SMA-MSPD as a reference forecast.The analyses show significant improvement of DBMA-MSPD across glacial and humid regions and relatively lower improvements in hyper-arid region.We investigated the DBMA-MSPD performance for a particular season at a specific location.Variation of weights over time in a dynamic algorithm demonstrates the performance of MSPDs based on precipitation intensity and magnitude, e.g., monsoon season in Pakistan having an intense and high magnitude of precipitation.Sometimes, the dynamic algorithms (such as DBMA and DCBA) randomly select an individual SPP rather than considering the SPP having a high correlation with GPGs and low errors (high weight) in a particular season.Thus, higher uncertainties are introduced during the merging process [56].

Conclusions
The current study proposed a merged multi-satellite precipitation dataset (MSPD) using Dynamic Bayesian Model Averaging (DBMA) algorithm from four satellite-based precipitation products (SPPs), including TMPA 3B42v7, PERSIANN-CDR, Era-Interim, and CMORPH.The MSPD was evaluated over four different climate regions of Pakistan, i.e., glacial, humid, arid and hyper-arid, both on spatial and seasonal (pre-monsoon, monsoon, post-monsoon, and winter) scale.Performance of MSPD is evaluated over 102 GPGs using six statistical indices at daily temporal scale, including Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Correlation Coefficient (CC), Kling-Gupta efficiency (KGE score) and Theil's U coefficient during 2000-2015.The key findings from the current study are listed below: (1) The MSPD has significantly improved the performance of individual SPPs.TMPA has higher accuracy as compared to other SPPs.
Remote Sens. 2019, 11, x FOR PEER REVIEW 4 of 29 region is composed of major agricultural region of Punjab province.The Indus river and its tributaries flow through the region and considered as a primary source of water in the region.The average elevation and mean annual precipitation in arid region are 633 m and 322 mm, respectively.Hyperarid region includes the Sindh and Balochistan provinces, and south part of the Punjab province.Most of the hyper-arid region consists of barren lands and dry mountains.The mean elevation and mean annual precipitation in the region are 444 m and 133 mm, respectively.

Figure 1 .
Figure 1.(a) Elevation of Pakistan (study area) obtained from Shuttle Radar Topography Model (SRTM), (b) Geographic location of rain gauges and four climate regions, and (c) Mean annual precipitation over Pakistan during 2000-2015.

Figure 1 .
Figure 1.(a) Elevation of Pakistan (study area) obtained from Shuttle Radar Topography Model (SRTM), (b) Geographic location of rain gauges and four climate regions, and (c) Mean annual precipitation over Pakistan during 2000-2015.

Figure 5 .
Figure 5. Distribution of precipitation of dominant ensemble member (TMPA) over representative GPG across (a) Glacial, (b) Humid, (c) Arid, and (d) Hyper-arid regions.Blue dots show the distribution of probability density function (PDF) along the normal (red) line.

Figure 5 .
Figure 5. Distribution of precipitation of dominant ensemble member (TMPA) over representative GPG across (a) Glacial, (b) Humid, (c) Arid, and (d) Hyper-arid regions.Blue dots show the distribution of probability density function (PDF) along the normal (red) line.
. The temporal spans of four seasons are: April, May, and June (pre-monsoon), July, August, and September (monsoon), October and November (post-monsoon), and December, January, February, and March (winter).More than 60% of precipitation is received during the monsoon season in

Figure 6 .
Figure 6.Spatial distribution of DBMA-MSPD average weights of merged members, (a) TMPA, (b) Era-Interim, (c) PERSIANN-CDR, and (d) CMORPH, during 2000-2015.The seasonal (pre-monsoon, monsoon, post-monsoon, and winter) distribution of DBMA-MSPD weights of four merged members over Pakistan during 2000-2015 are shown in Figures 7-10.The temporal spans of four seasons are: April, May, and June (pre-monsoon), July, August, and September (monsoon), October and November (post-monsoon), and December, January, February, and March (winter).More than 60% of precipitation is received during the monsoon season in Pakistan, which varies with topography and climate in magnitude from low (<100 mm) in the glacial region .
Higher fluctuations in weights are observed only for PERSIANN-CDR.The monsoon precipitation (DOY 181-273) is captured very well by TMPA and PERSIANN-CDR representing peak weights during the season.In contrast, less volatility is observed for Era-Interim and CMORPH.The arid region is dominated by PERSIANN-CDR, followed by TMPA and Era-Interim (almost similar weights).There is a significant fluctuation of PERSIANN-CDR and Era-Interim weights in the monsoon season.TMPA is showing a different trend in the arid region as compared to other regions, i.e., no significant fluctuation during the monsoon season.In the hyper-arid region, dominated by TMPA and Era-Interim, all the SPPs captured high precipitation events effectively.Remote Sens. 2019, 11, x FOR PEER REVIEW 16 of 29 PERSIANN-CDR and Era-Interim weights.However, less volatility can be seen for TMPA and moderate for CMORPH.Peak weights of TMPA, PERSIANN-CDR, and Era-Interim are observed between DOY 156-232 during monsoon season (late pre-monsoon, monsoon, and early post-monsoon rainy days).In the humid region, higher weights are assigned to TMPA followed by PERSIANN-CDR.Higher fluctuations in weights are observed only for PERSIANN-CDR.The monsoon precipitation (DOY 181-273) is captured very well by TMPA and PERSIANN-CDR representing peak weights during the season.In contrast, less volatility is observed for Era-Interim and CMORPH.The arid region is dominated by PERSIANN-CDR, followed by TMPA and Era-Interim (almost similar weights).There is a significant fluctuation of PERSIANN-CDR and Era-Interim weights in the monsoon season.TMPA is showing a different trend in the arid region as compared to other regions, i.e., no significant fluctuation during the monsoon season.In the hyper-arid region, dominated by TMPA and Era-Interim, all the SPPs captured high precipitation events effectively.
Remote Sens. 2019, 11, x FOR PEER REVIEW 17 of 29 HMS11) to 1.72 mm/day (HAMS8).The figure depicts higher RMSE in the downstream part of the glacial region and upstream part of the humid region.Minimum RMSE is observed in the South-West of Pakistan.

Table 1 .
Statistical indices used to evaluate the performance of DBMA-MSPD.M is the simulated (merged precipitation) data, O is the observed precipitation data from GPGs, n is the number of samples, X is the data element (X = M for the DBMA-MSPD while X = O for GPGs), CV is the coefficient of variation, and bars on variables represents the average values.