Identiﬁcation of the Optimum Rain Gauge Network Density for Hydrological Modelling Based on Radar Rainfall Analysis

: Rain gauges continue to be sources of rainfall data despite progress made in precipitation measurements using radar and satellite technology. There has been some work done on assessing the optimum rain gauge network density required for hydrological modelling, but without consensus. This paper contributes to the identiﬁcation of the optimum rain gauge network density, using scaling laws and bias-corrected 1 km × 1 km grid radar rainfall records, covering an area of 28,371 km 2 that hosts 315 rain gauges in south-east Queensland, Australia. Varying numbers of radar pixels (rain gauges) were repeatedly sampled using a unique stratiﬁed sampling technique. For each set of rainfall sampled data, a two-dimensional correlogram was developed from the normal scores obtained through quantile-quantile transformation for ordinary kriging which is a stochastic interpolation. Leave-one-out cross validation was carried out, and the simulated quantiles were evaluated using the performance statistics of root-mean-square-error and mean-absolute-bias, as well as their rates of change. A break in the scaling of the plots of these performance statistics against the number of rain gauges was used to infer the optimum rain gauge network density. The optimum rain gauge network density varied from 14 km 2 / gauge to 38 km 2 / gauge, with an average of 25 km 2 / gauge.


Introduction
Rainfall is a key forcing input for hydrological modelling, such as that used in studies on extreme events and climate impact analysis. However, the high spatial variability of rainfall is recognised, and thus data regarding the rainfall distribution in space and time is paramount for meaningful use of the outputs of hydrological models. Ground-based rain gauges have been the source of rainfall measurement for quite a long time, and are generally seen as the "ground truth". However, the poor gauge network density, as a result of limited resources, accessibility and maintenance [1], is a challenge. This is coupled with the fact that gauges provide information for a small area (e.g., 203 mm in diameter), and extrapolating to the spatial scale of several km 2 introduces high uncertainty (e.g., [2]). In recent years, gridded radar and satellite products data have been processed to obviate the limitations of the gauges, but these approaches have their challenges, including the spatial scale, which ranges from 1 km 2 to about 50 km 2 . The systematic bias issues that these products suffer range from sensor limitations to sampling errors and the algorithms for retrieval [3]). Although weather radar captures very well the spatial variability, the intensities suffer uncertainties stemming from factors such as beam blocking, ground clutter and signal attenuation [4]. As such, there is always the need to bias-correct these data sources with reference to the gauged measurements, and thus point rain gauge records and interpolation methods will continue to play a key role in hydrological modelling. This obviously It is located in south-east Queensland, and the radar is centred at a latitude of 27.718 • S and a longitude of 153.240 • E. The radar data has been processed by the Australian Bureau of Meteorology (BOM) (http://www.bom.gov.au/australia/radar/about/calculating_rainfall_accumulations. shtml, accessed on 12 August 2015), and supplied with a 6 min temporal resolution and a spatial unit of 1 km 2 , and from January 1, 2009 to June 30, 2015. However, the radar data were aggregated from 9 a.m. to 9 a.m. in order to conform to the observed daily rainfall sampling timescale. Within the study area, there are 324 rainfall gauges (Figure 1) that are managed by the BOM. Missing records within the daily rainfall data have been infilled by the Queensland Department of Environment and Science (https://www.longpaddock.qld.gov.au/silo/, accessed on 4 April 2020), and the complete records were used. Each rain gauge station is assigned to a 1-km 2 radar grid centre, and the values of grid centres with more than one rain gauge were averaged, reducing the daily rain gauge stations to 315. The rainfall data from the radar at the 315-gauge locations were extracted to constitute the collocated datasets. This means that for the day of interest, we have the radar (RAD), gauged (GAU) and collocated (COL) datasets for analysis. While the radar data has a minimum wet value of 0.01 mm, the minimum was set to 0.1 mm, to conform with the observed daily rainfall records. It needs to be underlined that partial radar coverage data have been previously used [22][23][24], and this is the first time that the complete landfall area coverage is being used.
A temperate climate, without a dry season and a hot summer, characterises the climate of the study area, in accordance with the classification of [25]. It is a subtropical region, with an average temperature of 26.5 • C, and with summer temperatures sometimes exceeding 29 • C. The region experiences an average annual rainfall of 990 mm, the majority of which occurs during the summer months, from December to March. The winter months from June to August are generally dry, whereas the hot summer months from December to February could experience elevated numbers of thunderstorms.

Marginal Distribution Fitting
A standard two-parameter right-skewed distribution is fitted to the daily rainfall amounts greater than zero from the 3 datasets (RAD, GAU and COL) separately. One standard distribution is chosen from the set of Generalized Pareto, Gamma, Gumbel, Log-Logistic, Log-Normal, Kappa and Weibull (R packages fitdistrplus, [26]; FAdist, [27]), using the Anderson-Darling statistic. These right-skewed distributions are considered appropriate for daily rainfall amounts as treated here, and they are commonly used in the literature [28][29][30][31][32]. The fitted distribution is used to transform the daily amounts to probabilities, and then to the standard Gaussian (N [0,1]) quantiles (Q-Q transformation) used in the ordinary kriging interpolation. However, there is a need to account for the dry gauges (zeros) that abound in daily rainfall records. Daily rainfall amounts r at a dry station k with spatial coordinates S k are assigned as: where d[s k ] is the minimum distance of the dry gauge located at S k from a wet gauge, d is the average of d, and 0.1 is the minimum wet gauge value. Assuming p o represents the proportion of the gauges that are dry, the fitted two-parameter distribution F R is zero inflated and used to transform the rainfall amounts r[s k ] into standard Gaussian quantiles (normal score) w[s k ], as: In Equation (2), the cumulative normal distribution N [0,1] is represented as Φ, and Φ −1 is the inverse. Given a normal score, the inverse of Equation (2), written as gives the rainfall amount.

Bias Correction
There could be significant differences between the marginal distributions of GAU, RAD and COL datasets for the same day. Hence a bias correction method was implemented. The traditional Quantile-Quantile (Q-Q) bias-correction method assumes that the observed gauge data provide the right distribution, and that the gridded datasets from radar, satellite or GCMs/RCMs therefore need to be adjusted to reflect the observed distribution. This idea is expressed mathematically as (e.g., [18]) and is used to correct the gridded radar rainfall amounts (R RAD ) to bias-corrected (R RAD−BC ) amounts, using the rain gauge data distribution (F GAU ) and radar data distribution (F RAD ) for the day, F −1 being the inverse function. However, the observed daily records as used in this paper are seen as a sample, and therefore require adjustment as well. As presented later, the number of rain gauges is not high enough to reproduce the spatial structure for a wet day. For a given probability p, the fitted distributions (F GAU , F RAD , F COL ) are used to estimate the rainfall amount for GAU, RAD and COL. Then, the difference between the RAD and COL amounts is added to the GAU in order to obtain amounts that follow the "true" distribution (F TRUE ) of daily rainfall (R) for the day. This is expressed mathematically as: Given the GAU rainfall amounts (R GAU ), the true distribution is then used to estimate the bias-corrected probabilities (p GAU−BC ), whereas, for the RAD, the probabilities [F RAD (R RAD )] are used to adjust the rainfall amounts, as: Therefore, the observed rainfall amounts are preserved, but their probabilities are reassigned, whereas the reverse is the case for the radar rainfall. Figure 2 illustrates the bias-correction scheme. Hereafter, the bias-corrected radar daily rainfall amounts are used. Figure 2. Bias correction scheme: black arrows show the errors in the collocated rainfall amount that is added to the gauged amount to obtain the "true" distribution (red curve); blue arrows indicate the correction of the probability of the gauge rainfall amount; brown arrows show the correction applied to the radar rainfall amounts.

Spatial Structure Modelling
The spatial structure required by the ordinary kriging interpolation is developed using the standard Gaussian quantiles, due to the normality assumption of kriging. It is based on the framework presented by [33], which uses the 'round-trip' fast Fourier transform approach on the empirical correlogram R[i, j], obtained as: In Equation (7), x and y are the separation distances, in km, in the eastern and northern directions, respectively, from the origin (0,0) of the empirical correlogram. Pair gauge locations at separation distance h within the bounded region of (x ± 1, y ± 1) are included in the calculation of the correlogram value at the grid point (x, y), with N h representing the number of pair gauges. The means of the pair of tail w[s k ] and head w[s k + h] values are denoted as m o and m +h , respectively.
Following [21,22], a 2D exponential distribution expressed as was fitted to the empirical correlogram data. Along an elliptical contour, u and v are the separation distances in the direction of the major and minor axes, respectively. The 3 parameters defining the 2D exponential distribution are the angle between the major axis and the horizontal direction (θ), the major axis length (L u ), and the minor axis length (L v ), the anisotropy ratio (η) being defined as L v /L u . These parameters are estimated using the global optimisation technique of [34] by matching the empirical and the analytical elliptical correlogram contours [21].

Stratified Sampling of Rain Gauge Locations
In order to investigate the effects of the number of rain gauges (radar pixels used interchangeably) on the spatial structure, a set number of rain gauges were sampled from the grid centres of the radar data. It is a known fact that rain gauges are by no means uniformly distributed over a study region, as exemplified in Figure 1 for the study region. Therefore, a stratified sampling approach was adopted to mimic the spatial distribution of the current rain gauges. These are the steps for the stratified sampling approach: • Firstly, the study region was overlaid with a 25 km × 25 km grid, and the resulting 63 blocks within, or intersecting, the study region are labelled in Figure 1; • Secondly, rain gauges within each grid were counted, and those blocks devoid of gauges were assigned a value of 0.5 times the fraction of the grid within the radar coverage, to allow for possible selection of gauges within the fractional grids, particularly for higher sampling numbers. The rain gauge network density of the grids varies from 1.7 to 48.6 gauges per 1000 km 2 , grid 45 recording the highest density; • Thirdly, the observed rain gauge counts within the grids were used to develop the weights for the stratified sampling; • Fourth, the number of rain gauges required were sampled with replacement from integers 1 to 63, representing the grids, in accordance with their weights; • Finally, the numbers of samples from each grid from the previous step were sampled randomly, without replacement from the subset of the grid, noting that the subset of each grid is the number of 1-km 2 radar grid centres it contains, which varies from 6 (grid 61) to 625 (the inner grids).

Performance Statistics
Ordinary kriging does not require a description here, as it has been well documented in the literature (e.g., [35,36]). However, it suffices to say that it estimates a variable at a target location using known values at several locations in space, and it is based on linear weighted least squares. For each set of a number of rain gauges sampled, one of the distributions discussed in Section 3.1 was fitted and used to convert the rainfall amounts to standard normal quantiles by means of Equation (2). Then, a 2D correlogram was fitted as explained in Section 3.3. Next, leave-one-out cross validation (LOOCV) was performed using the R package gstat [37]. LOOCV leaves one data point out at a time, and its prediction is made using the remaining data points. The predictions in the normal score were converted to rainfall amounts using Equation (3). The predicted values were evaluated using the root-mean-square-error (RMSE) and the mean-absolute-bias (MAB) performance statistics, defined as: where the observed and predicted values for the ith gauge are, respectively, V O (i) and V P (i), and N is the number of sampled rain gauges. The variation of the performance statistics with the number of rain gauges is used to identify the optimum rain gauge network density.

Results and Discussion
A total of 24 wet days, with varying statistical properties, were selected for the analysis (Table 1). MN-mean rainfall (pixels ≥ 1mm); SD-standard deviation of rainfall (pixels ≥ 1mm); WP-proportion of pixels with rainfall ≥ 1mm; MAX-maximum rainfall; LX-major axis length; LY-minor axis length; AR-anisotropy ratio; AA-anisotropy angle.
As seen in Table 1, the mean of pixels that registered rainfall amounts ≥ 1 mm varies between 3.1 mm and 119.7 mm, while the proportion of wet pixels ranges from 0.094 to 1. For the maximum pixel rainfall, the range is from 5.1 mm to 450 mm. Figure 4 shows the results of the bias-correction scheme applied to the marginal distributions of four selected wet days, and their bias-corrected rainfields are depicted in Figure 5. In many cases, radar overestimates (demonstrated by 20290402, 20101216) or underestimates (demonstrated by 20101011, 20120125) the observed rainfall [38,39], explaining why bias-correction is necessary (e.g., [40]). Some of the errors stem from the methods used for converting radar reflectivity to rainfall intensity and ground clutter. For day 20120125, the collocated distribution was in sync with that of the full radar, but different from the distribution of the rain gauge data. A such, the agreement between the full radar and the collocated marginal distributions depends on the spatial distribution of rainfall, which varies significantly across the wet days.

Marginal Distribution
Rainfall amounts were sampled from the bias-corrected radar images corresponding to a given number of rain gauges. This was repeated a number of times, as explained in Section 3.4. An empirical distribution is fitted to each sampled dataset. For a given probability, rainfall amounts from the repeated samples were used to define the median and the 95% prediction limits for that probability. Figure 6 compares the median empirical distribution with the one derived from the full radar. While the median distribution is quite close to the full radar for all numbers of gauges, the widths of the prediction limits decrease with increasing numbers of gauges. Again, the spatial distribution of rainfall for the day will determine how the median distribution of the smaller number of gauges matches the full radar case. As shown in the two examples, the higher the coefficient of variation for the rainfall data, the higher the variability around the median distribution.

Spatial Structure Parameters
For each set of sampled rainfall data for a fixed number of gauges, Equations (1) and (2) were used to convert rainfall data into normal scores. The Section 3.3 methodology was applied to derive the 2D correlogram parameters of the major and minor axis lengths, as well as the anisotropy direction and ratio. Figure 7 shows the variation of the correlogram parameters with the number of gauges for two wet days. Both the major and minor axis lengths increase similarly as the number of gauges increases, at a rate of over 40% for an additional gauge at 50 gauges, and drops sharply to less than 1% between 1000 and 1500 gauges, using the median values. While there is considerable variation in the anisotropy direction up to about 2000 gauges, the median stabilises quite well after 100 gauges. These observations point to the fact that daily rainfall varies considerably from wet day to wet day, and reflecting the true spatial structure entirely depends on the number and location of the gauges. It needs to be pointed out that the patterns displayed in Figure 7 were also observed by [21] using radar rainfall records from the Bethlehem station in South Africa, but no analyses, as done in this paper, were made.

The Optimum Rain Gauge Network Density
In this section, each set of sampled rainfall data for a fixed number of gauges is used to develop the marginal distribution and the 2D spatial structure required by the ordinary kriging interpolation. The LOOCV technique was used to simulate rainfall amounts, which were evaluated using RMSE and MAB, as presented in Section 3.5, thus incorporating uncertainties into the marginal distribution and the 2D spatial structure, because of the inadequate number of gauges sampled. Figure 8 shows plots of the performance statistics (RMSE and MAB) against the number of gauges for the different sampled datasets. The median values of the repeated samples for a fixed number of gauges are shown as solid circles. It is not surprising that the variability of the performance statistics, with regards to the median values, is higher for the smaller number of gauges.
As the rain gauge network density increases, the inter-gauge distances are decreased, thus increasing the correlation between the gauges that results in the observed decreasing performance statistics. After 2000 gauges, there is virtually no variability in the median. Of note is the perfect power law scaling beyond 2000 gauges for all performance statistics, as empirically observed for all wet days. Therefore, a power law of the form was fitted to the median values after the number of gauges passed 2000. In Equation (11), PS indicates a performance statistic which is either RMSE or MAB, N is the number of gauges, and A and B are the power law coefficient (normalising factor) and scaling exponent, respectively. Gyasi-Agyei et al. [41] used such a power law to relate the channel network average link slope to contributing catchments. They used the break in the scaling exponent (B) to delineate hillslope from the main channel network of a catchment. In the presentation here, the fitted power law line is extended to the lowest number of gauges, in order to determine at which number of gauges there is a break in the scaling, i.e., departure from the scaling law. This breaking point is identified as the optimum number of gauges, and thus there is no appreciable increase in information gained when the number of gauges is increased beyond this point. In Table 2, the values of the scaling coefficient and exponent of the fitted power law for the different wet days are shown. It is observed that the scaling exponent of RMSE and MAB are not significantly different for the same day at the 5% level (paired T test p-value = 0.49; F test p-value = 0.2), but the breaking point identified could be significantly different; about 460 on average. With respect to RMSE, the breaking point varies from 750 gauges (38 km 2 /gauge) to 2000 gauges (14 km 2 /gauge), while for the MAB it could be as high as 56 km 2 /gauge for a few wet days. Dwelling on the averages, RMSE yielded 18 km 2 /gauge and MAB 26 km 2 /gauge, and their combination yielded 22 km 2 /gauge. These average values translate to grid sizes of 4 km for RMSE, 5 km for MAB, and 4.7 km for the combination. There is no apparent correlation between the scaling exponents and the listed rainfall properties in Table 1, with the exception of the wet proportion, which exhibits correlation coefficients of −0.6 with RMSE and −0.7 with MAB. This means that as the wet proportion increases, the power law scaling slope becomes steeper.  Another way to estimate the representative threshold values was to use rate of change (ROC), estimated as where i and i+1 are the successive number of gauges indexed when arranged in increasing order, (N i+1 − N i ) is the difference in the number of gauges, and (PS i+1 − PS i ) represents the difference in performance statistics at the successive intervals. In comparison to RMSE and MAB, the ROC is rainfall magnitude-independent, meaning values for the different wet days could be compared. Rates of change (ROC) is commonly used in finance to measure the change in the price of a security over a fixed time interval, so the denominator (N i+1 − N i ) is not required, or set to 1, in that sense (https://www. ambroker.com/en/analysis/blog/what-rate-change-roc-indicator-and-how-use-trading/, accessed on 2 June 2020). The ROC was calculated progressively for all increasing numbers of gauges for each wet day. For a fixed number of gauges, the medians of the 24 values were selected and plotted in Figure 9. As was done for the individual days of RMSE, demonstrated in Figure 8, the power law curve (Equation (11)) was fitted to the number of gauges beyond 2000, and extrapolated to the lower number of gauges to identify the scaling breaking point. Because both the RMSE and MAB show decreasing trends with increasing numbers of gauges, resulting in negative ROC values, absolute values of ROC (or −ROC) were used to allow the fitting of the power law. Clearly, the breaking point is 1000 for the ROC of the RMSE, and 1250 gauges for that of MAB, the average of both performance statistics being 1125 gauges. These breaking points are slightly lower than the average of the breaking points of individual wet days, as presented in Table 2. A breaking point of 1125 gauges translates to an optimum rain gauge density of 25 km 2 /gauge, and a grid size of 5 km. Since the current rain gauge density of the case study site is 90 km 2 /gauge (28,371/315), the implication of our finding is that this density needs to be improved by at least a factor of three, to mimic the full-scale level. Due to economic constraints, this may not be the way to go, and it may be necessary to rely on blending radar and satellite records with whatever gauge network density is affordable, while remaining aware of the need to be wary of the consequences. Hence ground based rain gauges will continue to be widely used and play a significant role in hydrological analysis and modelling.
Girons Lopez et al. [10] used an inverse-distance weighting method to interpolate a varying number of rain gauges over a 6400-km 2 study area in north-eastern Switzerland. Using a Pearson correlation coefficient and the normalised RMSE (NRMSE), they concluded that increasing the rain gauge network density beyond 24 per 1000 km 2 (42 km 2 per gauge, grid size of 6.4 km) did not improve the performance statistics. Their threshold value for the optimum rain gauge density is not significantly different from what has been observed in our case study, our average optimum being a grid size of 5 km. However, a grid size of 4 km (2000 gauges) is ideal for all wet days. Villarini et al. [7] witnessed the power law type decrease in the NRMSE of catchment-wide average rainfall when the numbers of rain gauges were increased, although they did not fit a power-curve to investigate whether there is a breaking point in the scaling behaviour, as this was not their objective. However, they recommended a minimum of four gauges to evaluate satellite pixels of about 200 km 2 for daily rainfall, and they established a scaling behaviour (power law) between the NRMSE of rainfall accumulation and the sampling interval.
This study has provided one insight into the evaluation of the daily satellite precipitation products that come with different grid sizes. The grid size of 4 km for PERSIANN_CCS [42] and TASAT [43] satellite products may be ideal. Certainly, precipitation satellite products with grid sizes of 10 km or greater may require spatial downscaling to a finer grid size, for better hydrological modelling.

Conclusions
The rain gauge continues to be a valuable source of rainfall records, despite its primary limitation of having a small coverage area, of about 203 mm in diameter, and an inadequate network density, rendering it unable to capture the high spatial variability of rainfall. For these reasons, radar and satellite rainfall data sources are becoming popular, but can be cost-prohibitive for some areas. With the aid of bias-corrected radar daily rainfall records, this study has provided a framework for determining the optimum rain gauge density. The probabilities of the radar records are assumed to be correct, but the rainfall amounts were bias-corrected using observed daily rain gauge records within the study area. While there are many studies in the literature on the optimum rain gauge network density, there is no consensus on this. A simple practical approach is implemented to ascertain the optimum rain gauge network density.
The starting point is a unique stratified sampling technique, used to mimic the distribution of the current rain gauge locations that are employed to sample a fixed number of rain gauge locations from the bias-corrected radar data of the wet day, with the days considered independently. This was repeated a number of times for a fixed number of gauges. For each set of sampled locations, the daily rainfall amounts were transformed into normal scores that were used to develop the 2D correlogram (spatial structure) required by ordinary kriging interpolation. Then, LOOCV was carried out, and the simulated quantiles were evaluated using the performance statistics of RMSE and MAB. Plotting these performance statistics against the number of rain gauges revealed a break in scaling for all the 24 wet days analysed. Rates of change (ROC) per additional gauge of the performance statistics revealed the same break in scaling as that depicted by RMSE and MAB. It is the breaking point in the power law scaling that is used to infer the optimum rain gauge network density.
Generally speaking, the uncertainty concerning the median of the performance statistics decreases with the increasing number of gauges. This is due to the fact that the higher the number of gauges, the better the reproduction of the spatial structure of the full-scale region. The break in scaling varied between 750 and 2000 gauges, which translates to 38 km 2 /gauge (grid size~6 km) to 14 km 2 /gauge (grid size~4 km), respectively. However, no apparent reasons were established for the variations in the daily rainfall statistics. In the end, ROC gave an average optimum network density of 25 km 2 /gauge, corresponding to a grid size of 5 km. Thus, the case study site's rain gauge network density of 90 km 2 /gauge needs to be improved by at least a factor of three in order to mimic the full-scale level.
One implication is that there may not be a real advantage in downscaling daily satellite precipitation products with grid sizes finer than 5 km. However, this methodology needs to be duplicated in different regions in order to ascertain the effects of local conditions, such as orography and the spatiotemporal variability of rainfall, on the optimum rain gauge network density. While the breaking point of the number of gauges varied from day to day, there were no clear linkages between this and the storm properties, and this needs to be further investigated.