A Method for the Optimized Design of a Rain Gauge Network Combined with Satellite Remote Sensing Data

A well-designed rain gauge network can provide precise and detailed rainfall data for earth science research; meanwhile, satellite precipitation data have been developed to capture more realistic spatial features, which provides new data support for improving ground station network design methods. In this paper, satellite precipitation data are introduced into the design of a rain gauge network, and an optimized design method that comprehensively considers the information content, spatiotemporality, and accuracy (ISA) of the data is proposed. After screening the potential stations, the average spatial information index of the rain gauge network, which is calculated from remote sensing data, is used to address the shortcomings of deriving spatial information from measured data alone. Then, the greedy ranking algorithm is used to rank the order in which the rain gauges are added to the network. The results of the rain gauge network design in the upper reaches of the Chaobai River show that, compared with two methods that either do not consider spatiality or use only measured data to consider spatiality, the proposed method performs better in terms of spatial layout and accuracy verification. This study provides new ideas and references for the design of hydrological station networks and explores the use of remote sensing data for the layout of ground-based station networks.


Introduction
Ground-based observations of precipitation are the most basic data in hydrology and provide fundamental information for Earth sciences. Rain gauge networks are very important for water resource planning, management, development, and utilization. One of the key issues in hydrological measurement is to improve the scientific soundness and optimization of regional rain gauge network design. Although the World Meteorological Organization (WMO) has identified minimum monitoring density requirements for rain gauge networks [1], unified methodological recommendations regarding the number and spatial locations of stations have been difficult to establish because of differences in topography, local climate, meteorological conditions, and local economic conditions. Generally, the networks of hydrological stations in many regions are arranged according to a grid, or ground stations are added according to the spatial distribution characteristics of the existing stations based on regional interpolation data and funding conditions. The development of optical, active, and passive microwave remote sensing technology in recent years has provided a new source of data, but the spatiotemporal characteristics over a large range are not spatiotemporally continuous like local data; therefore, it is difficult to obtain a universal rain gauge layout method. Although the combined application of remote sensing and ground observation data is undoubtedly beneficial to the optimization of station network layouts, only a few related studies have been performed. For example, Wang et al. [44] proposed a temporally continuous maximum coverage network layout method based on the monitoring periods of ground-based measurements and space-borne rainfall sensors using the maximal covering location problem model [45].
Compared with the data measured at rain gauges, satellite precipitation products have better spatial continuity, which can make the siting of rain gauges more reasonable and effective. Local or global texture feature analyses of remote sensing data are widely used in image segmentation, object classification, and pattern recognition [46][47][48][49]. Among the many texture analysis methods, the gray-level co-occurrence matrix (GLCM) method is the most common. For example, Li et al. built GLCM texture features using five characteristics, i.e., standard deviation, correlation, mean, contrast, and angular second moment, and then generated differential images at different moments via a combination of the spectra to monitor ground object changes in very-high-resolution remote sensing images [50].
As Chacon-Hurtado et al. [5] noted in their review, although nontraditional data sources obtained by various sensing technologies have the potential to complement traditional network design, new network design methods are needed to address heterogeneous dynamic data. This paper comprehensively applies the advantages of satellite precipitation data and ground station data to explore an optimal ground rainfall station network layout method with optimal comprehensive indicators of the information content, spatiotemporality, and accuracy (ISA). The paper is organized into five sections. The study area and data are introduced in the second section. The basic research methods, including information entropy theory, the kriging interpolation method, and the optimization layout method proposed in this paper, are introduced in the third section. The calculation results and discussion are presented in the fourth section, and the summary and prospect are provided in the fifth section.

Study Area
The Chaobai River is one of the main tributaries of the Haihe River Basin in China. The Chaobai River originates from the North China Plain, which has a semi-humid and semi-arid continental monsoon climate. The two tributaries of the Chaobai River, the Chao River and the Bai River, flow into the Miyun Reservoir, which accounts for two-thirds of Beijing's drinking water supply, from the northwest and southeast, respectively, and merge into the Chaobai River after discharge. This paper takes the upper reaches of the Chaobai River, that is, the basin above the Miyun Reservoir, as the research area. The study area covers 13,600 km², and it has an average annual precipitation of 300-400 mm and an average temperature of 9-10 °C. The precipitation from June to October accounts for more than 85% of the annual precipitation. The terrain is high in the northwest and low in the southeast, with an average elevation of 1010 m. The location of the study area and its average rainy-season precipitation distribution are shown in Figure 1.

Satellite Remote Sensing Data
PERSIANN-CCS is a satellite precipitation product that has high spatiotemporal resolution, with an hourly temporal resolution, a spatial resolution of 0.04° × 0.04°, and a coverage area of 60° N to 60° S [51]. The algorithm first extracts local and regional cloud features from longwave infrared geostationary satellite imagery (10.7 µm) and then uses a neural network to relate the cloud classification and microwave data to estimate gridded rainfall. Detailed information is provided in [39,52,53]. Compared with other precipitation products, PERSIANN-CCS has the advantage of providing nearly real-time data at high spatial resolution. This product is widely used in flood warning and hydrological model simulations [54,55]. Nguyen et al. [56] reported, based on a simulated flood forecasting experiment, that PERSIANN-CCS data could accurately capture the observed hydrograph shape, and the minimum correlation coefficient with real data reached 0.86. Zeweldi et al. [57] used radar rainfall observation data to evaluate the PERSIANN-CCS results and showed that PERSIANN-CCS can capture the patterns of inter-annual rainfall variability well. Cánovas-García et al. [58] evaluated three storm events and determined that the temporal distribution of rainfall was captured well by PERSIANN-CCS. Although Chang et al. [59] concluded that the absolute value of precipitation data from PERSIANN-CCS must be further improved, and many attempts to improve the accuracy of PERSIANN-CCS have been undertaken by the Center for Hydrometeorology and Remote Sensing at UCI (University of California, Irvine) [60,61], it is worth affirming that PERSIANN-CCS accurately reflects the spatial distribution characteristics of precipitation.

Data Preprocessing
Because the observed rainfall data for 2014 were inaccessible, the optimum design for the station network was determined using the observed data during June-October from 2006 to 2016 (excluding 2014) and remote sensing daily precipitation products from the same period. In this study, we randomly selected 25 stations as the initial network of the study area (544 km²/station) according to the WMO minimum rainfall density standards (575 km²/station for hilly/undulating areas, 250 km²/station for mountainous areas) [1]. The interpolation results of the measured rain gauge data with sufficient density collected around the study area were taken as the true values to verify and evaluate the final results. The average number of stations for each interpolation is 80 (170 km²/station). Daily precipitation data are available continuously at 63 points in the study area, and discontinuous time intervals are observed at approximately 40 other stations.
To ensure that the results of the rain gauge network are reasonable and effective, the satellite precipitation products used in this study need to be screened to delete obviously invalid and abnormal data and thus eliminate the errors they cause. The satellite precipitation products are preprocessed by dividing the rainfall into six grades and then evaluating the correlation between the time series of the true values and the remote sensing rainfall products. The results are shown in Table 1. In statistics, the closer the absolute value of the correlation coefficient is to 1, the stronger the correlation. In this study, remote sensing products with medium correlation coefficients or above were selected for the subsequent experiments; in other words, the threshold value of the correlation coefficient was set to 0.5. After screening, the remote sensing precipitation products were reduced from 1530 days to 1196 days.
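The screening rule can be illustrated with a minimal Python sketch. It assumes one plausible reading of the procedure, namely that each day is kept when the Pearson correlation between the true field and the satellite field across grid cells is at least 0.5; the division of rainfall into six grades is omitted, and the function name `screen_days` and the array layout are hypothetical rather than taken from the paper.

```python
import numpy as np

def screen_days(truth, satellite, threshold=0.5):
    """Return a boolean mask of days to keep.

    truth, satellite: arrays of shape (n_days, n_grids).
    A day is kept when the Pearson correlation between the two
    fields across grid cells is at least `threshold` (assumed rule).
    """
    truth = np.asarray(truth, dtype=float)
    satellite = np.asarray(satellite, dtype=float)
    keep = np.zeros(truth.shape[0], dtype=bool)
    for d in range(truth.shape[0]):
        t, s = truth[d], satellite[d]
        # A constant field has no defined correlation; treat the day as invalid.
        if t.std() == 0 or s.std() == 0:
            continue
        r = np.corrcoef(t, s)[0, 1]
        keep[d] = r >= threshold
    return keep
```

Applying the mask to both data sets keeps the measured and satellite series aligned on the same retained days.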
In this paper, the spatial scale of the remote sensing data, i.e., 4 km × 4 km resolution, was used, and the research area was divided into 909 grids. The spatial distribution of the correlation coefficients between the remote sensing precipitation data and the true values of each grid, before and after filtering, is shown in Figure 2. It shows that, after filtering, the correlation coefficients of the spatial series in the study area were greater than 0.6.
Figure 2. Spatial distribution of the correlation coefficients for the measured precipitation data and remote sensing precipitation data. (a) Before screening; (b) after screening.

Entropy
Shannon [17] presented the information entropy theory in 1948 by calculating the uncertainty of random variables. A higher entropy value corresponds to greater uncertainty in the change in the representative variable and more information represented by the variable. In recent years, it has been widely used in hydrological sequence analyses [62,63], station network layouts [64], water resources assessment [65], and other aspects.
The marginal entropy of X is expressed as follows:

$$H(X) = -\sum_{i} p(x_i)\,\log_2 p(x_i) \quad (1)$$

The joint entropy of X and Y is expressed as follows:

$$H(X, Y) = -\sum_{i}\sum_{j} p(x_i, y_j)\,\log_2 p(x_i, y_j) \quad (2)$$

In this paper, X denotes a rain gauge, and $\{x_1, x_2, x_3, \ldots, x_n\}$ denotes the precipitation sequence values for this rain gauge. The entropy value of a network composed of n rain gauges is expressed by the joint entropy of the n rain gauges $X_1, X_2, \ldots, X_n$, and the calculation expression is as follows:

$$H(X_1, X_2, \ldots, X_n) = -\sum_{i_1}\cdots\sum_{i_n} p(x_{1,i_1}, \ldots, x_{n,i_n})\,\log_2 p(x_{1,i_1}, \ldots, x_{n,i_n}) \quad (3)$$

The joint entropy denotes the comprehensive information of two variables, while the transinformation denotes the information quantity shared by the two variables, i.e., the mutual information, or the amount of redundant information between the two variables. The mutual information of X and Y is expressed as follows:

$$T(X, Y) = H(X) + H(Y) - H(X, Y) \quad (4)$$

The redundant information [66] of a network comprising N rain gauges can be expressed by C, and the calculation expression is as follows:

$$C(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i) - H(X_1, X_2, \ldots, X_N) \quad (5)$$

Figure 3 shows the topological figures of the above indexes. To estimate the probabilities needed for the entropy calculation, the data are usually discretized into bins, because computing the joint entropy of non-Gaussian distributions directly is relatively complex [16,25,67]. This paper applies the method of calculating the optimal bin width proposed by Scott [68]. The expression is as follows:

$$h = 3.49\,\sigma(X)\,N^{-1/3} \quad (6)$$

where $\sigma(X)$ is the standard deviation of an observation series of X (in this paper, this variable indicates the daily series of a rain gauge), and N is the total number of values in the data series.
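As an illustration of how these quantities might be computed, the following Python sketch discretizes each series with Scott's bin width and estimates the entropies from bin frequencies. It is a minimal sketch under stated assumptions: the function names are hypothetical, the sample standard deviation is used for $\sigma(X)$, and bins are anchored at each series minimum, none of which is specified in the text.

```python
import numpy as np

def scott_bin_width(x):
    """Scott's bin width: 3.49 * sigma * N^(-1/3)."""
    x = np.asarray(x, dtype=float)
    return 3.49 * x.std(ddof=1) * x.size ** (-1.0 / 3.0)

def discretize(x):
    """Map each value to an integer bin label (bins start at the minimum)."""
    x = np.asarray(x, dtype=float)
    h = scott_bin_width(x)
    if h == 0:  # constant series: a single bin
        return np.zeros(x.size, dtype=int)
    return np.floor((x - x.min()) / h).astype(int)

def joint_entropy(*series):
    """Joint entropy in bits of one or more equally long series."""
    labels = np.column_stack([discretize(s) for s in series])
    _, counts = np.unique(labels, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(x, y):
    """Transinformation: H(X) + H(Y) - H(X, Y)."""
    return joint_entropy(x) + joint_entropy(y) - joint_entropy(x, y)

def total_correlation(*series):
    """Redundancy C: sum of marginal entropies minus the joint entropy."""
    return sum(joint_entropy(s) for s in series) - joint_entropy(*series)
```

With a single argument, `joint_entropy` reduces to the marginal entropy, so the same routine covers the marginal, joint, and network-level cases.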

Entropy, Variance, and Standard Deviation
Variance and standard deviation in statistics and entropy in information theory can all measure the amount of information in a group of random variables, and these parameters are widely used in machine learning and data mining algorithms. Variance and standard deviation measure the degree of dispersion of a data distribution, while information entropy measures the degree of disorder and uncertainty of the data. Although the two are related to each other, they are not equivalent [69][70][71]. Variance is expressed in the square of the data unit, which exaggerates the degree of dispersion and makes the numerical value difficult to interpret intuitively. Therefore, the arithmetic square root of the variance, i.e., the standard deviation, is usually taken as the indicator of the degree of dispersion.

Fundamental Theory of Kriging Interpolation
The kriging method is a widely used spatial interpolation method that was proposed by D. G. Krige, a mining engineer in South Africa (see [72]). Unlike the inverse distance weighting method and spline function method, which are directly based on the surrounding measured values or the degree of smoothness of the generated surface, the kriging method assigns different weights to each sampling point according to the spatial position and correlation by analyzing the spatial distribution of the variation in sampling point data. Subsequently, the estimated value of the interpolated point is obtained via a sliding weighted average, and the expression is as follows:

$$\hat{z}(x_0, y_0) = \sum_{i=1}^{n} \lambda_i\, z(x_i, y_i) \quad (7)$$

where $\hat{z}(x_0, y_0)$ is the estimated value at $(x_0, y_0)$ and $\lambda_i$ is the weight coefficient, which satisfies the minimum error and unbiased estimation of the estimated value and the real value:

$$\min\ \operatorname{Var}\left[\hat{z}(x_0, y_0) - z(x_0, y_0)\right], \qquad E\left[\hat{z}(x_0, y_0) - z(x_0, y_0)\right] = 0 \quad (8)$$

Different kriging interpolation methods rely on different assumptions. For example, ordinary kriging interpolation assumes that spatial attributes are homogeneous and have the same expectations and variances for any point in space. That is, the value at any point (x, y) comprises the mean value and the random deviation R(x, y) of that point. Half of the variance of the difference between the values of the regionalized variable z(x) at x and x + h is defined as the semivariance function. The semivariance function values of different distances h are calculated as follows:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[z(x_i) - z(x_i + h)\right]^2 \quad (9)$$

where $N(h)$ is the number of point pairs separated by distance h. Generally, the fitting models of the semivariance function include the circular model, spherical model, exponential model, Gaussian model, and linear model. The kriging variance can be calculated from the semivariance function, and the expression is as follows:

$$\sigma_K^2(x_0, y_0) = \sum_{i=1}^{n} \lambda_i\, \gamma(h_{i0}) + \phi \quad (10)$$

where $h_{i0}$ is the distance between sampling point i and the interpolated point, and $\phi$ is the Lagrange multiplier introduced by the unbiasedness constraint.
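To make the estimate and the kriging variance concrete, the sketch below solves the standard ordinary kriging system for one target point. It is a minimal illustration, not the authors' implementation: the exponential semivariogram and its parameters (`sill`, `rng`, `nugget`) are assumed values, whereas in practice the model would be fitted to the empirical semivariances.

```python
import numpy as np

def exp_variogram(h, sill=1.0, rng=10.0, nugget=0.0):
    """Exponential semivariogram model (assumed parameters); gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-h / rng)), 0.0)

def ordinary_kriging(xy, z, xy0, variogram=exp_variogram):
    """Return (estimate, kriging variance) at target location xy0.

    xy: (n, 2) sample coordinates; z: (n,) sample values.
    """
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    xy0 = np.asarray(xy0, dtype=float)
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)  # sample-sample
    d0 = np.linalg.norm(xy - xy0, axis=1)                        # sample-target
    # Kriging system: semivariances bordered by the unbiasedness constraint.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.append(variogram(d0), 1.0)
    sol = np.linalg.solve(A, b)
    lam, phi = sol[:n], sol[n]  # weights and Lagrange multiplier
    return float(lam @ z), float(lam @ b[:n] + phi)
```

With a zero nugget the interpolator is exact: at a sampled location the estimate equals the observation and the kriging variance is zero, which is why the variance grows with distance from existing gauges.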

Optimization of Network Design
The optimal rain gauge network design consists of two parts: the identification of potential sites and the selection of the optimal site network.

Filtering Potential Stations
Potential stations are the locations where stations are expected to be built. The screening for potential stations involves the selection of areas within the study area as potential site layout areas. Previous rain-measuring station network layout designs primarily divided the whole research area according to a certain grid and then considered all grids as potential candidate stations directly [22,43], or manually selected individual grids as potential points according to the kriging interpolation accuracy (kriging variance) and other constraints [32,73]. The former requires considerable redundant calculations, especially for large study areas, and data with high spatial and temporal resolution; thus, this method is often used with radar data [43]. The latter manual screening method selects relatively fewer potential sites; thus, it is not conducive to the overall planning of a regional network.
Rainfall is a hydrological variable with significant spatial and temporal characteristics; therefore, both the deficiencies of the existing network and the spatiotemporal characteristics of the site should be considered in the selection of potential sites. In practical applications, the kriging interpolation accuracy of the existing ground precipitation measurement stations can indicate whether the existing network needs to be supplemented [22,32,74]. In addition, remote sensing products provide more accurate relative spatial characteristics and temporal continuity, and thus can objectively represent local spatiotemporal characteristics [42].
When considering local spatiotemporal characteristics (window is z × z), because the number of local grids is small and the data changes are relatively simple, the local spatial dispersion degree, i.e., the spatial characteristics, is more conveniently described using the variance rather than entropy in terms of computational complexity and information accuracy. The changes in the local spatial characteristics over a time series represent the required spatiotemporal characteristics, which can be obtained by calculating the entropy of the standard deviation over time series l, namely, the spatiotemporal information $S_l(X)$. A larger value of $S_l(X)$ corresponds to more significant regional spatiotemporal characteristics and a greater number of rain gauges required for control. In summary, the expression for the screening of potential stations is as follows:

$$S_l(X) > t_S \quad \text{and} \quad \bar{\sigma}^2_l(X) > t_\sigma \quad (11)$$

where $S_l(X)$ denotes the local spatiotemporal information of grid X in local window z over the time series; $\bar{\sigma}^2_l(X)$ denotes the average kriging variance of grid X over time series l; and $t_S$ and $t_\sigma$ denote the thresholds. Connecting the two screening conditions with a logical "and" ensures that the selected potential stations both carry considerable spatiotemporal information and lie where accurate rainfall values cannot be obtained from the existing network. The qualified points are used as potential stations for the subsequent optimized network design.
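The screening rule can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the window standard deviation is computed per day, its entropy over time is estimated with Scott's bin width, and both thresholds are taken as quantiles of the respective maps. The function names, the edge handling (windows are clipped at the grid border), and the array layout are assumptions.

```python
import numpy as np

def entropy_bits(x):
    """Marginal entropy in bits, discretized with Scott's bin width."""
    x = np.asarray(x, dtype=float)
    h = 3.49 * x.std(ddof=1) * x.size ** (-1.0 / 3.0)
    if h == 0:  # constant series carries no information
        return 0.0
    labels = np.floor((x - x.min()) / h).astype(int)
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def local_spatiotemporal_info(rain, z=3):
    """Entropy over time of the daily z-by-z window standard deviation.

    rain: array of shape (n_days, n_rows, n_cols) of satellite rainfall.
    Returns an (n_rows, n_cols) map; windows are clipped at the border.
    """
    n_days, n_rows, n_cols = rain.shape
    r = z // 2
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            win = rain[:, max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1]
            sd = win.reshape(n_days, -1).std(axis=1)  # one value per day
            out[i, j] = entropy_bits(sd)
    return out

def potential_mask(info, krig_var, q=0.5):
    """Grids exceeding the q-quantile of BOTH maps (logical AND)."""
    return (info > np.quantile(info, q)) & (krig_var > np.quantile(krig_var, q))
```

The AND in `potential_mask` mirrors the two-condition rule: a grid qualifies only if it is both spatiotemporally informative and poorly constrained by the existing gauges.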

Optimized Conditions
The ideal optimal design scheme should add, remove, and change as little as possible based on the premise of maintaining the original network. Generally, because existing stations have a long time series of observation data and high data integrity, which are conducive to follow-up scientific research and actual situation analyses, it is recommended not to remove or change the existing stations when the station network is insufficient. This condition would ensure that the application requirements are met to maintain continuous observations when optimizing the design of the network. Several studies have also illustrated this perspective [16].
Assume that the network has n sites, expressed as $X_1, X_2, \ldots, X_n$, and that, according to the description above, m potential sites have been screened, expressed as $P_1, P_2, \ldots, P_m$. We consider that the new rain gauge network needs to meet the following three criteria after the stations are added. The first criterion is maximizing the total information content of the network over the time series, $I(N, l)$, which consists of (1) maximizing the joint entropy, which ensures that the network can capture enough information from the time series data in the study area; (2) maximizing the mutual information between the network and the other stations that are not selected, which ensures that the selected network covers the time series information characteristics of the unselected stations as much as possible; and (3) minimizing the information redundancy within the network, which ensures minimum information duplication among stations in the network. Inspired by the MIMR method [28], $I(N, l)$ is expressed as follows:

$$I(N, l) = \mu\left[H(N) + \sum_{k \neq j} T(N; P_k)\right] - (1 - \mu)\,C(N) \quad (12)$$

where $N = \{X_1, \ldots, X_n, P_j\}$ is the network after potential site $P_j$ is added; $H(N)$ is the joint entropy of the network; $T(N; P_k)$ is the mutual information between the network and an unselected potential site $P_k$; and $C(N)$ is the redundant information within the network, all calculated over time series l. According to previous studies [28,32], the tradeoff weight μ is set to provide the user with the option to consider additional knowledge; in this paper, this value is set to 0.9.
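A compact Python sketch of an MIMR-style information score for one candidate is given below. It follows the three criteria described above (joint entropy plus transinformation with unselected candidates, minus redundancy) but is only an approximation under stated assumptions: the helper names are hypothetical, the merged network is treated as a single multivariate variable, and Scott's bin width is used for discretization.

```python
import numpy as np

def _labels(x):
    """Integer bin labels using Scott's bin width."""
    x = np.asarray(x, dtype=float)
    h = 3.49 * x.std(ddof=1) * x.size ** (-1.0 / 3.0)
    if h == 0:
        return np.zeros(x.size, dtype=int)
    return np.floor((x - x.min()) / h).astype(int)

def H(series):
    """Joint entropy in bits of a list of equally long series."""
    lab = np.column_stack([_labels(s) for s in series])
    _, c = np.unique(lab, axis=0, return_counts=True)
    p = c / c.sum()
    return float(-(p * np.log2(p)).sum())

def information_content(network, candidates, j, mu=0.9):
    """Score of adding candidates[j] to network (lists of 1-D arrays)."""
    new_net = network + [candidates[j]]
    h_net = H(new_net)
    # Transinformation between the new network and each unselected candidate.
    trans = sum(h_net + H([c]) - H(new_net + [c])
                for k, c in enumerate(candidates) if k != j)
    # Redundancy (total correlation) inside the new network.
    redundancy = sum(H([s]) for s in new_net) - h_net
    return mu * (h_net + trans) - (1 - mu) * redundancy
```

In this form a candidate that duplicates an existing gauge adds no joint entropy and incurs a redundancy penalty, whereas an independent candidate raises the score.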
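The second criterion is maximizing the spatiotemporality of the network, $S(N, l)$, which ensures that the spatial characteristics of rainfall in the area are fully captured. The calculated indicator is the mean of the local spatiotemporal information $S_l(X)$ of all stations in the network based on remote sensing data. The higher the average spatiotemporal information of the rain gauge network, the stronger the spatiotemporality of precipitation captured by the whole site network. The expression of $S(N, l)$ is as follows:

$$S(N, l) = \frac{1}{|N|} \sum_{X \in N} S_l(X) \quad (13)$$

where $|N|$ is the number of stations in network N. The third criterion is maximizing the interpolation accuracy of the network, $A(N, l)$, i.e., minimizing the regional kriging variance, which ensures the accuracy of the network in the application. The expression of $A(N, l)$ is as follows:

$$A(N, l) = \frac{1}{k} \sum_{g=1}^{k} \bar{\sigma}^2_l(G_g) \quad (14)$$

where $\bar{\sigma}^2_l(G_g)$ is the average kriging variance of grid $G_g$ over time series l and k is the number of grids in the study area.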
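In summary, the conditional expressions for the new network are as follows:

$$\max I(N, l), \qquad \max S(N, l), \qquad \min A(N, l) \quad (15)$$

where $I(N, l)$, $S(N, l)$, and $A(N, l)$ denote the total information content, the spatiotemporality, and the average kriging variance of the network, respectively.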

Greedy Ranking Algorithm
The greedy ranking algorithm is a common and simple ranking method that can transform a multi-objective problem into a single-objective problem and simplify the calculation process [28,32]. The three optimization objectives are related to and influence one another. To avoid differences between different dimensions, min-max normalization of the variables is performed before the greedy ranking algorithm is used. Taking the total information content as an example, the expression is as follows:

$$I^*(N, l) = \frac{I(N, l) - I_{\min}(N, l)}{I_{\max}(N, l) - I_{\min}(N, l)} \quad (16)$$

where $I_{\max}(N, l)$ and $I_{\min}(N, l)$ represent the maximum and minimum values of the total information content in the sequence of all potential sites added to the original network, respectively. The final expression of the optimization objective of the greedy ranking algorithm is as follows:

$$\max\ \mathrm{ISA} = I^*(N, l) + S^*(N, l) - A^*(N, l) \quad (17)$$

where $S^*(N, l)$ and $A^*(N, l)$ are the normalized spatiotemporality and kriging variance of the network, respectively. This expression defines the best-ISA rule, by which potential sites are added to the network in turn. In theory, as the number of stations increases, the total information content, spatial index, and errors of the network converge to stable values [22,43,75].
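The normalization and ranking step can be sketched as below. The min-max scaling follows the description above; the way the three normalized terms are combined (equal weights, with the kriging variance subtracted because smaller is better) is an assumption made for illustration and is not confirmed by the text. All function names are hypothetical.

```python
import numpy as np

def min_max(v):
    """Min-max normalization to [0, 1]; a constant vector maps to zeros."""
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    return np.zeros_like(v) if span == 0 else (v - v.min()) / span

def isa_scores(info, spatio, krig_var):
    """Combined score per candidate (assumed equal-weight combination).

    The kriging variance enters negatively because it is minimized.
    """
    return min_max(info) + min_max(spatio) - min_max(krig_var)

def best_candidate(info, spatio, krig_var):
    """Index of the candidate with the highest combined score."""
    return int(np.argmax(isa_scores(info, spatio, krig_var)))
```

Each input holds one value per potential site, computed for the network obtained by adding that site, so the normalization is taken over the candidates of the current round.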
In summary, the detailed algorithm steps are as follows (the corresponding flow chart is shown in Figure 4). Steps 2-4 select the potential sites according to the local spatiotemporality and the degree of demand at each location, which filters out unsuitable grids and improves the efficiency of the subsequent network screening. Steps 5-8 select the best station network from the perspective of the comprehensive information content, spatiotemporality, and accuracy of the network. In this way, the advantages of measured data and remote sensing data are combined to design the station network reasonably.
1. Collect the time series data and remote sensing precipitation products for the existing stations in the study area and remove the abnormal data.
2. Calculate the spatiotemporal information $S_l(X)$ at X in window z × z based on the time series data of remote sensing precipitation products.
3. Perform kriging interpolation and determine the kriging variance of each grid in the study area based on the existing measured site data.
4. Count the potential point sets $P_1, P_2, \ldots, P_m$ satisfying the conditions throughout the area using Equation (11).
5. Calculate the information content of each potential point after joining the network according to Equation (12).
6. Calculate the average spatiotemporality of the station network after each potential point joins the network according to Equation (13).
7. Calculate the average kriging variance of the study area after each potential point joins the network according to Equation (14).
8. Calculate the ISA index according to Equation (17).
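The iterative structure of the steps above can be sketched as a generic greedy loop. This is a simplified skeleton under stated assumptions: `score` is a hypothetical callback standing in for the ISA index of a candidate given the current network, and the candidate pool is fixed, whereas the procedure described in this paper re-screens the potential stations as the network grows.

```python
def greedy_add(network, candidates, score, n_add):
    """Add up to n_add candidates, one at a time, by highest score.

    score(network, candidate) -> float is evaluated against the
    current network, so earlier picks influence later ones.
    Returns the candidates in the order in which they were added.
    """
    network = list(network)
    remaining = list(candidates)
    order = []
    for _ in range(min(n_add, len(remaining))):
        best = max(remaining, key=lambda c: score(network, c))
        network.append(best)
        remaining.remove(best)
        order.append(best)
    return order
```

For example, with one-dimensional station positions and a score equal to the distance to the nearest existing station, the loop first picks the most remote candidate and then the one that best fills the remaining gap.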

Computation and Distribution of Entropy
Keum and Coulibaly [16] applied the entropy method to network design using different time series lengths and concluded that a data sequence of at least 10 years could provide comprehensive information. The time series of this paper represents 10 years of rainy seasons, with each rainy season lasting for 5 months. After screening, the time series includes a total of 1196 days. The marginal entropy of a unit ranges from 0 to log2(1196) = 10.22 bits. Because of the frequent precipitation during the rainy season, the amount of information is relatively small. Figure 5 shows the change in entropy with increasing time series length. The marginal entropy first increases and then tends to be stable. When the time series is shorter than 300 days, the change in information entropy is significant and the variance is correspondingly high, which indicates that only 1 or 2 years of data are insufficient to capture the complete information. After a time series length of 500 days is reached, the standard deviation of the marginal entropy decreases to 0.1. The information value of each point then tends to be stable and can represent all the changes at the point over time. This result shows that the selected time series can satisfy the information entropy calculation requirements and suggests that a rainy-season time series shorter than 10 years may be applied to the design of a network with reasonable results. The spatial distribution of marginal entropy is calculated using the interpolation data of 25 randomly selected rain gauges, as shown in Figure 6. The marginal entropy in the map is high in the north and low in the south, and the relative elevation in the high-entropy area is also high. The high-entropy grids indicate that the precipitation has strong temporal variability and contains special information in particular grids.
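The upper bound quoted above and the kind of stability check shown in Figure 5 can be reproduced with a short Python sketch. The helper names are hypothetical, and the discretization (Scott's bin width recomputed for each prefix of the series) is an assumption about how the entropy was tracked as the series lengthened.

```python
import math
import numpy as np

def marginal_entropy(x):
    """Marginal entropy in bits, discretized with Scott's bin width."""
    x = np.asarray(x, dtype=float)
    h = 3.49 * x.std(ddof=1) * x.size ** (-1.0 / 3.0)
    if h == 0:
        return 0.0
    labels = np.floor((x - x.min()) / h).astype(int)
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def entropy_by_length(x, lengths):
    """Entropy of the first n values of x for each n in lengths."""
    return [marginal_entropy(x[:n]) for n in lengths]

# Upper bound for a 1196-day series: every day falls in its own bin.
max_bits = math.log2(1196)
```

Plotting `entropy_by_length` for each grid against the series length gives a curve that can be inspected for the point at which the entropy levels off.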
However, as described in reference [2], the grids with high entropy are not necessarily the best location for adding a rain gauge because the establishment of the network considers the comprehensive network information rather than the information of a single site.

Screening for Potential Stations
This section describes the process of screening potential stations, taking the screening for the 26th rain gauge as an example. As stated in Equation (11), the main objective is to screen for regions with inaccurate interpolation results and high local spatial heterogeneity. We tend to conduct a moderate screening because being too strict will lead to too few candidate sites and being too loose will decrease the significance of the screening. Therefore, the thresholds of the local spatiotemporal information $S_l(X)$ and the average kriging variance $\bar{\sigma}^2_l(X)$ are set to their 0.5 quantiles. Figure 7a shows the $\bar{\sigma}^2_l(X)$ distribution map from the interpolation results based on 25 observed rain gauges. The variance increases radially from the locations of the existing rain gauges to the edge of the study area. Figure 7b shows the potential station area screened when the kriging variance threshold equals the 0.5 quantile. Potential station areas are distributed in all areas except around the existing gauges. Figure 8a shows that the $S_l(X)$ value is high in the central and western parts of the study area, which means that the spatial heterogeneity is strong there, whereas the eastern part is relatively homogeneous. This feature is also consistent with the elevation. Figure 8b shows the screening results of potential stations when the threshold value equals the 0.5 quantile, and the results are distributed mostly in the west and central regions. The two results are combined with a logical AND, and the distribution of potential locations for the 26th rain gauge is obtained. We obtained 230 grids in total, as shown in Figure 9. As the number of stations in the network increases, the number of potential grids for the additional station decreases gradually. For example, the screening for potential locations of the 60th rain gauge yields 165 grids.

Optimal Network of Additional Stations
After screening out the potential rain gauge area, the best comprehensive station scenario is obtained according to the performance of the potential rain gauge when joining the rain gauge network, i.e., according to Equation (17). Figure 10 shows the results of screening for the 26th rain gauge.
Figure 10. Screening for the 26th rain gauge. (a)-(d) are the distributions of the total information content, spatiotemporality, accuracy, and ISA components after potential points join the network.
Figure 10a shows that the distribution of the total information content of the network is low in the west and high in the east after a new rain gauge is added. The interpolation data based on the measured data suggest that additional stations in the south and east can enrich the rainfall data obtained by the network. Spatiotemporal information from the new network based on remote sensing data (Figure 10b) describes the local spatiotemporal characteristics collected by the network. The addition of stations in the eastern, central, and western regions can help the network capture more spatial information, which is similar to the results of the precipitation contour map. Figure 10c shows that the kriging variance of the station network interpolation is uniformly distributed with the original network, and the best result is obtained where the network is sparse or along the edge. Based on the calculation results of the three aspects, the ISA distribution map (Figure 10d) is obtained, and the location of the maximum value is taken as the location of the 26th rain gauge according to the greedy algorithm.
In this paper, the network results are calculated in turn until the total number of rain gauges is 60. With the increase in the number of stations, the total information, spatiality, and kriging variance of the network gradually stabilize. The statistical results are shown in Figure 11, which shows that after adding 15-20 rain gauges, i.e., when the total number of rain gauges is approximately 45, the indicators of the rain gauge network no longer improve with the inclusion of additional rain gauges. In practice, these statistical results can guide the choice of how many rain gauges to add. The network design results for 60 rain gauges and 45 rain gauges are shown in Figure 12.

Comparison and Verification of Index Selection
The network design results of this paper were compared with the results from a method that considers only the total information content and the kriging interpolation accuracy of the network. This method does not conduct potential point screening; in other words, all points in the study area are regarded as potential rain gauges. For convenience of expression, the method in this paper is abbreviated as R-ISA and the comparison method is abbreviated as S-IA. The optimized expression of the S-IA method is as follows:

$$\max\ \left[\,I^*(N, l) - A^*(N, l)\,\right] \quad (18)$$

where $I^*(N, l)$ and $A^*(N, l)$ are the min-max-normalized total information content and average kriging variance of the network, respectively. The design results are shown in Figure 13. In terms of layout, the distribution of the S-IA network is more uniform, with the additional rain gauges located in the south-central part, which has fewer existing rain gauges, whereas the distribution of the R-ISA network appears to be more consistent with the precipitation distribution. Figure 14 is obtained by calculating the statistics for the various types of information, spatiality, and errors of the two networks. Figure 14b shows that the mutual information of the R-ISA network is slightly lower than that of the S-IA network as the number of stations increases. This result may be related to the fact that the S-IA method provides more potential stations than the R-ISA method. The redundancy of information within the R-ISA network is slightly higher than that within the S-IA network, which can be explained by the layout characteristics. The layout results of the S-IA method are relatively intensive in some areas, especially the southeast, compared with the R-ISA method, which affects the redundancy within the network. Significant differences in the spatial characteristics of the two networks are shown in Figure 14d.
As the number of stations in the S-IA network increases, the SI (spatiotemporal information) value does not change significantly, indicating that the spatiotemporal information captured by the S-IA network does not increase significantly. In the screening of the R-ISA network, local spatiotemporal properties are taken into account, and the SI value shows an upward trend before stabilizing. This difference arises because the S-IA method focuses too much on the numerical differences in the time series of the interpolated measured data during network screening and does not consider local spatiotemporal changes in the network. Figure 14e,f compare the accuracy of the two networks. Figure 14e shows the variation trend of the total variance of the average daily kriging interpolation throughout the screening process. The variance of both networks decreases to the same extent as the number of stations increases. Figure 14f shows the variation in the average relative error compared with the true value and indicates that the errors of the proposed method are lower than those of the S-IA method. This result is closely related to the distribution of stations. With the same number of stations, the distribution of the R-ISA network is more balanced with respect to the precipitation distribution than that of the S-IA network, which may be due to the higher rainfall in the southeast region and the significant spatiotemporal characteristics in the western region. The comparison of the results indicates that it is practical and effective to consider spatiality in the design of rain gauge networks.

Comparison and Verification of Data Selection
In this section, the same rules for potential point screening and station network screening are used to compare the optimized station network design obtained with the method proposed in this paper (R-ISA) and with a method that uses only measured data (S-ISA). The optimized expression of the S-ISA method is as follows: [equation missing in source] where S ( , ) represents the spatiotemporal information calculated based on the interpolation data of the existing station network. As before, screening for a potential site for the 26th rain gauge is taken as an example. The ( ) distribution obtained by the S-ISA method is shown in Figure 15. Because the interpolated data are derived from adjacent grid cells, the spatiotemporal information response is not uniform when the station network is not sufficiently dense, even though a semivariogram is used in the interpolation. The spatiotemporal information is high in places with high station density and low in places far from the existing stations, which is contrary to the requirements of station network design. This is also the basic defect of interpolating point-scale data to surface-scale data. As a result, the potential station distribution obtained by the S-ISA method combined with the two screening conditions is relatively concentrated (Figure 16). In contrast, remote sensing rainfall products are retrieved from spatial features, which makes them more objective and reasonable for characterizing spatial information.
Figure 15. Screening based on ( ) using measured data. (a) Spatial distribution of ( ) based on measured data and (b) 0.5 quantile potential station distribution from quantile screening based on measured data.
Figure 16. Distribution of the potential positions of the 26th rain gauge using the S-ISA method.
Figure 17 shows the results of the S-ISA method for the optimum design of the rain gauge network. Intuitively, the network selected by the S-ISA method is too concentrated in the area with better interpolation accuracy, and no stations are placed on the sparsely monitored eastern edge of the study area. Figure 18, analogous to Figure 14 in the previous section, compares the performance of the two methods.
Figure 18a,b show that the two networks do not differ significantly in the proportion of joint entropy or in the mutual information with the residual potential stations, which may be related to the two networks controlling the same rainfall contour region. In Figure 18c, because the stations are relatively concentrated in the middle, the redundancy of the S-ISA network is slightly higher than that of the R-ISA network. Figure 18d demonstrates that as the number of stations increases, the SI value of the S-ISA method slightly increases and then decreases. This is because the spatiotemporality calculated from the measured data is more accurate in the region with higher interpolation accuracy (that is, in the middle of the study area), whereas accurate spatiotemporality cannot be obtained for the study area as a whole. Consequently, adding rain gauges does not significantly improve the spatiotemporal information of the network. Compared with R-ISA, the S-ISA method cannot obtain accurate spatiotemporal properties for the whole area, and the spatiality of the R-ISA network is superior to that of the S-ISA network.
Although the difference in the mean variance of kriging interpolation between the two methods in Figure 18e fluctuates and is not very significant, the relative errors (Figure 18f) show that the error of the R-ISA method decreases faster and further than that of the S-ISA method. A comprehensive comparison indicates that a rain gauge network that considers spatiotemporality based on satellite rainfall products is more advantageous in terms of spatiotemporality and accuracy than one that considers spatiotemporality based on interpolation data.
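The information measures compared in Figures 14 and 18 (joint entropy, mutual information, and redundancy within the network) can be illustrated with a small sketch. It assumes that the rainfall series have already been discretized into integer classes (the discretization used in the paper is not restated in this section) and uses total correlation as a stand-in for within-network redundancy. All function names and the toy series are hypothetical.

```python
import numpy as np

def entropy(*series):
    """Joint Shannon entropy (bits) of one or more equal-length discrete series."""
    stacked = np.stack(series, axis=1)  # shape: (time steps, number of series)
    _, counts = np.unique(stacked, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def total_correlation(*series):
    """Sum of marginal entropies minus the joint entropy; used here as an
    assumed proxy for the redundancy inside a network."""
    return sum(entropy(s) for s in series) - entropy(*series)

# Toy discretized rainfall classes at three hypothetical stations.
x = np.array([0, 0, 1, 1])
y = np.array([0, 0, 1, 1])  # identical to x, so fully redundant with it
z = np.array([0, 1, 0, 1])  # independent of x in this sample

print(mutual_information(x, y))    # 1.0
print(mutual_information(x, z))    # 0.0
print(total_correlation(x, y, z))  # 1.0
```

In this toy case the duplicated station adds no joint entropy, which is the situation that a redundancy term in the objective is meant to penalize.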

Summary and Conclusions
In this paper, a method for optimizing the design of rain gauge networks using both remote sensing and measured data is presented. The design process involves two parts: potential rain gauge screening and network screening. Potential rain gauge screening identifies, from the perspective of individual points, the areas most in need of stations by considering high kriging accuracy of the measured data and low local spatial flux based on remote sensing data. In contrast, network screening selects the best combination of rain gauges from the potential rain gauges from the perspective of the network. This part of the method considers three main indicators: the total information content of the network, the spatiotemporality of the network, and the interpolation accuracy of the network. The information content and accuracy are calculated from ground-measured data. The information calculations include the joint entropy of the network, the mutual information outside the network, and the redundancy inside the network. The SI is calculated from remote sensing data to reflect the spatial and temporal characteristics of the network. The accuracy of the network is represented by the kriging variance of interpolation. Then, the greedy algorithm is used to rank the rain gauges in the network. This approach supports an appropriate design of station networks by leveraging the high accuracy of measured data and the good spatiality of remote sensing data. This paper compares the results of the proposed method with those of two other methods: S-IA, which considers only two indicators (information content and kriging variance), and S-ISA, which considers all three indicators but uses only in situ data. The results of the S-IA method show that the distribution of stations is too uniform because the spatial representativeness is not considered.
The distribution of stations in the S-ISA results is too centralized because the spatial interpolation of the measured data differs considerably between the middle and edge regions. The R-ISA results retain the advantages of the S-IA method. Moreover, introducing remote sensing data into the calculation of spatiality is more objective, and the resulting network performs better in all aspects. However, because two types of data sources are used, this method is still subject to limitations of spatial scale and a certain degree of uncertainty; moreover, remote sensing data still need to be cleaned before they can be used in this application. In summary, the rain gauge network design method combined with satellite data proposed in this paper not only ensures the information content and application accuracy obtained by the network but also ensures the ability of the station network to capture spatial features, and it offers a new solution in which satellite remote sensing data assist the design of ground station networks. In addition, the two-step procedure of potential point screening and station network screening improves the efficiency of the station network optimization.
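The greedy ranking step summarized above can be sketched as a forward selection: starting from the existing network, the candidate that most improves a network score is added at each step, and the order of addition gives the ranking. The sketch below is an illustration only. It uses joint entropy of discretized series as a stand-in score, whereas the paper's objective combines information content, SI, and kriging variance in a form that is not restated in this section. The function names, the `score` parameter, and the toy data are hypothetical.

```python
import numpy as np

def joint_entropy(columns):
    """Joint Shannon entropy (bits) of the rows of a 2-D array of discrete values."""
    _, counts = np.unique(columns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def greedy_rank(data, existing, candidates, score=joint_entropy):
    """Return the candidates in the order a greedy forward selection adds them.

    data: array of shape (time steps, stations) with discretized rainfall.
    existing: column indices of stations already in the network.
    candidates: column indices of potential stations to rank.
    score: stand-in objective evaluated on the columns of a trial network.
    """
    selected = list(existing)
    remaining = list(candidates)
    order = []
    while remaining:
        # Pick the candidate whose addition gives the highest network score.
        best = max(remaining, key=lambda c: score(data[:, selected + [c]]))
        selected.append(best)
        remaining.remove(best)
        order.append(best)
    return order

# Toy data: station 1 duplicates station 0; stations 2 and 3 add new information.
data = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
])
order = greedy_rank(data, existing=[0], candidates=[1, 2, 3])
print(order)  # [2, 3, 1]
```

Because each step only evaluates the remaining candidates against the current network, the greedy procedure is cheap but is not guaranteed to find the globally best combination.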
Several research questions remain open, such as how to better leverage the advantages of remote sensing data when combined with other hydrological elements, how the results of the greedy algorithm differ from those of the multi-objective optimization algorithms described in the literature, and how the scale of remote sensing data influences the design of the rain gauge network.