Modeling the Relationship of Precipitation and Water Level Using Grid Precipitation Products with a Neural Network Model

Abstract: Modeling the relationship between precipitation and water level is of great significance for flood disaster prevention. In recent years, the use of machine learning algorithms for precipitation–water level prediction has attracted wide attention in flood forecasting and other fields; however, a clear method for modeling the relationship between precipitation and water level using grid precipitation products with a neural network model is lacking. The open issues include how to select a neural network model and how different types and resolutions of remote sensing data influence the modeling results. The purpose of this paper is to provide some findings on these issues. We used the back-propagation (BP) neural network and a nonlinear autoregressive exogenous model (NARX) time series network, respectively, to model the relationship between precipitation and water level. The water level at the Pingshan hydrographic station, in a catchment area of the Jinsha River Basin, was simulated by the two network models using three different grid precipitation products. The results showed that when ground station data are missing, a grid precipitation product is a good alternative for constructing the precipitation–water level relationship. In addition, the NARX network with extra inputs fitted the model better than the BP neural network; the Nash efficiency coefficients of the former were all higher than 97%, while those of the latter were all lower than 94%. Furthermore, the spatial resolution of the input grid products had little effect on the modeling results.


Introduction
Modeling the relationship between precipitation (in this paper, "precipitation" refers to liquid precipitation) and water level is an important issue in hydrology and for flood disaster prevention and mitigation [1]. Generally, two common methods, a physical model-driven method and a data-driven method, are adopted to model this relationship.
The physical model-driven method simulates the relationship between precipitation and water level based on the physical process of the water cycle; examples include the Variable Infiltration Capacity (VIC) Macroscale Hydrologic Model [2], the Soil and Water Assessment Tool (SWAT) model [3], and the Xinanjiang model [4]. The physical model-driven method has a clear physical principle, but because the processes of the water cycle are complex, it is often hard to derive an exact model expression and obtain the complex physical parameters, which sometimes results in relatively low accuracy for modeling the relationship [5–7].
The data-driven method models the relationship from big data by using machine learning algorithms [8]. Sometimes, the data-driven method has higher accuracy than the physical model-driven method [9,10]. Many studies have modeled the relationship using precipitation data from observation stations and machine learning algorithms [1,11–13], and the data-driven method has achieved good results in meteorology and hydrology. Shilpa et al. [14] used the support vector machine (SVM) to reduce the error rate of rainfall prediction, but the SVM algorithm is generally applicable to binary classification, and it has some difficulties in fitting complex nonlinear continuous functions. Sahoo et al. used the backpropagation neural network (BPNN) and the adaptive neuro-fuzzy inference system (ANFIS) model to simulate observed runoff in Basantpur for flood forecasting [15]. The simulation accuracy of both models was more than 90%, but the ANFIS model performed better. That study used in situ data as input and mainly discussed the impact of different model structures and model parameters on the simulation results, but it neglected the influence of different input parameters on the prediction results. Sandeep et al. used ground station observations of precipitation, maximum temperature, and minimum temperature as input to predict runoff with radial basis function network, recurrent neural network (RNN), and BPNN models [16]. The final results showed that the RNN had the best prediction performance. In another article, they used an SVM and the BPNN to estimate the runoff of the Agalpur watershed. The results showed that both models estimated the runoff well, but the SVM performed better when lagged input variables were considered.
These two papers mainly discuss the influence of different models and training parameters on the simulation results, and their input data are ground station data rather than grid precipitation products [17]. Carlo proposed a data-driven method based on the random forest algorithm that predicts water level as an index for flood prediction. That paper used historical site data as the features for random forest construction and showed that the time delay effect affected the rainfall-water level prediction [18]. Generally speaking, a model constructed with the random forest algorithm is relatively simple, and because many features of the "tree" have to be discarded, there may be a problem of under-fitting. Therefore, it is common to use a neural network algorithm to simulate the complex precipitation-water level relationship. Chen et al. used an improved genetic algorithm coupled with a back propagation neural network (IGA-BPNN) model for water level prediction, with ground station data of the Hanjiang River as the model input [1]. Ramli Adnan et al. introduced the extended Kalman filter at the output of the BPNN to improve the prediction of the actual flood level [19], and they also used data from upstream and downstream stations of the river as input. Although many studies model with site data, many areas have sparse or even no site records, which makes it hard to apply the data-driven method there. Grid precipitation products, such as the tropical rainfall measuring mission (TRMM), climate hazards group infrared precipitation with station data (CHIRPS), and the global land data assimilation system (GLDAS), which have global coverage, acceptable spatial-temporal resolution, and decades of records, would be important data sources [20]. Therefore, it is of great significance to use grid precipitation products to establish a precipitation-water level model.
To model the relationship of precipitation and water level using grid precipitation products with the data-driven method, three basic issues need to be studied:
• What machine learning algorithm should be selected? The back-propagation (BP) neural network model is a commonly chosen model. The input of the BP model is the data at a single time point. However, the change of water level caused by precipitation is not an instantaneous process; it always has a time delay. Therefore, a neural network model that considers the time delay effect, such as the nonlinear autoregressive exogenous model (NARX), may achieve better results than the BP neural network model [21,22].
• What is the effect of different grid precipitation products on modeling the relationship? There are several types of grid precipitation products. Many studies have discussed the accuracy of precipitation products and their applicability in different regions and used them as data sources for data-driven methods [21,23], but the effect of using different data sources for modeling the relationship of precipitation and water level has not been discussed [24–26].
• What is the effect of different spatial resolutions of the same precipitation product on modeling the relationship?
These issues have rarely been studied explicitly. This paper studies the three issues by comparing the modeling accuracy of the BP model and the NARX model, exploring the effects of different types of grid precipitation products, and showing the effects of different spatial resolutions of the same grid precipitation product.
The main contributions of this paper are the following: (1) by comparing the precipitation-water level simulation results of the BP neural network and the NARX time series network, this paper shows that the NARX time series network is better than the BP neural network for modeling the relationship of precipitation and water level in the short term, as the time lag effect should be considered in the precipitation-water level model; and (2) we explore the influence of different kinds and different spatial resolutions of remote sensing precipitation products on modeling the relationship of precipitation and water level. These contributions should guide the modeling of the relationship of precipitation and water level using grid precipitation products with a neural network model.

Study Area and Data Used
The Jinsha River Basin was taken as the research area. The Jinsha River (24.46°N-35.76°N, 90.535°E-104.936°E, as shown in Figure 1) is located in the upper reaches of the Yangtze River, accounting for 77% of the total length of the upper reaches. The Yangtze River is one of the largest and most important rivers in China. The Jinsha River originates at the main peak of Tanggula Mountain in Qinghai Province and flows through Qinghai, Tibet, Sichuan, and Yunnan provinces in Western China. Its main stream has a total length of 2308 kilometers and a drainage area of about 340,000 square kilometers. The topography of the Jinsha River Basin is complex and the elevation fluctuation is obvious [27], with a maximum height difference of 3300 m. The annual average precipitation of the Jinsha River Basin is about 710 mm; it is about 900-1300 mm downstream and about 600-800 mm in the middle and upper reaches. The precipitation decreases gradually from the southeast to the northwest of the basin. Due to the wide area of the Jinsha River Basin, there are obvious regional differences in precipitation distribution and climate change. The complex geographical conditions also cause an uneven distribution of meteorological and hydrological stations in the basin, which leads to a lack of observations. Figure 1 shows the selected Pingshan station in the lower reaches of the Jinsha River, the hydrological situation of the Jinsha River, the topography of the Jinsha River Basin, and the sparse meteorological observation stations in the Jinsha River Basin. The gauge station selected for the study was the Pingshan hydrological station, which is the core-control hydrological station in the lower reaches of the Jinsha River Basin. The data set of 2006-2009 was selected because a dam at the site had not yet been completed.
The data set for this period is therefore unlikely to be affected by the upstream dam. The data were obtained from the Yangtze River Hydrological Bureau. They span 2006 to 2009 at a time granularity of 1 day and contain the average daily water level. The grid precipitation products used in this paper are CHIRPS, GLDAS-2, and TRMM-V7. All the data are listed in Table 1.
Table 1. Data list of this paper. Climate hazards group infrared precipitation with station data (CHIRPS); global land data assimilation system (GLDAS); tropical rainfall measuring mission (TRMM).
Three grid precipitation products were used for this paper. The first was CHIRPS from http://chg.geog.ucsb.edu/data/chirps/ in a grid format with a spatial resolution of 0.05° × 0.05° and a temporal resolution of 1 day. The second was GLDAS-2 from https://ldas.gsfc.nasa.gov/gldas/ with 0.25° × 0.25° spatial and 1 day temporal resolutions. The last was TRMM-V7 from https://pmm.nasa.gov/data-access/downloads/TRMM with 0.25° × 0.25° spatial and 1 day temporal resolutions.

Data
The CHIRPS data form a quasi-global precipitation dataset covering 1981 to 2014 and spanning 50°S-50°N (and all longitudes) [28]. CHIRPS is a fusion of satellite images and data from rain-gauge stations and is widely used for hydrological modeling where in situ rainfall stations are sparse. Bai et al. (2018) compared CHIRPS data with 2480 rainfall stations in China. Although the accuracy of CHIRPS data is not ideal when compared with a single rainfall station, it performs well at the watershed scale [29].
The GLDAS data use multiple land surface models to integrate satellite and ground-based observational data [30]. GLDAS products include weather forcing data, land surface state data, and flux data. The GLDAS-2 product uses the "Global Meteorological Forcing Dataset" from Princeton University [31] to achieve greater climatological consistency than GLDAS-1. Wang et al. (2016) evaluated the accuracy of GLDAS-1 and GLDAS-2 monthly data in major watersheds of China, and the results showed that GLDAS-2 data had better spatial and temporal accuracy than GLDAS-1 data for precipitation, temperature, and runoff [32].
The TRMM is a joint mission between NASA and the Japan Aerospace Exploration Agency (JAXA) to study rainfall for weather and climate research, and the TRMM dataset is intended to provide a relatively precise estimation of quasi-global precipitation [20]. TRMM-V6 and V7 data both show good agreement in the tropics, especially for moderate and heavy rain days. For heavy rain, the rainfall estimates of V6 are higher than those of V7 throughout the year [33]. For the Jinsha River Basin in the upper reaches of the Yangtze River, both TRMM-V6 and TRMM-V7 reflect the annual change trend of precipitation in the study basin well. The TRMM-V7 estimates are closer to the measured precipitation, and those for 2007, 2008, and 2009 are very close to it, whereas the V6 results are about 100 mm lower overall and the TRMM-RTV7 estimates are about 300 mm higher overall [24]. Thus, we chose the TRMM-V7 dataset for this paper.
In order to ensure the consistency of data sampling, the study used weather forcing data of GLDAS-2 as a grid input in combination with the grid precipitation product. Meanwhile, we used the Daily Climate Data Set of China Ground International Exchange Station (V3.0) for the control experiment. This dataset includes daily data of air pressure, temperature, precipitation, evaporation, relative humidity, wind direction, wind speed, sunshine hours, and 0 cm ground temperature for 166 stations in China from January 1951 to the near present, and it is provided by the National Meteorological Information Center of China (http://data.cma.cn/data/cdcdetail/dataCode/SURF_CLI_CHN_MUL_DAY_V3.0.html).

Methodology
In order to solve the problems of data-driven modeling, such as model selection, input variable selection, and the impact of different data sources, the method developed in this paper was divided into four parts (I, II, III, IV), as shown below (Figure 2).
• (I) Selection of input parameters: This part determined the inputs of the models used. The input variables of the neural network were selected by referring to the physical process of the hydrological model.
• (II) Train data preprocessing: This part prepared training data for the neural network models. To improve the performance of the neural network model, avoid over-fitting, and speed up convergence, unnecessary information and some noise were removed by principal component analysis (PCA) dimensionality reduction.
• (III) Precipitation-water level modeling: This part compared the modeling methods. For model fitting, we chose a BP neural network and a NARX time series network to build the relationship between precipitation and water level.

Selection of Input Parameters
Some existing studies use a data-driven method to construct the relationship of precipitation and water level. Usually, only the influence of precipitation on water level is considered when building the model, but the natural water cycle is a complex physical process. Soil moisture, vegetation transpiration, surface confluence, and other parameters also affect water level and runoff. Based on regression analysis, Jun Du et al. showed that the effects of annual precipitation and forest cover on runoff change were 69.8% and 17.3%, respectively [34]. Existing data-driven models cannot explain the physical phenomena of water level change; therefore, in this study we investigated existing hydrological models and screened the input parameters of the neural network model by referring to the inputs of those models.
Many hydrological models include a surface module, snow module, meteorological driving module, frozen soil module for calculating the effect of the freezing-thawing process on heat flux and humidity, lake and wetland module for calculating the water-heat balance of lakes and wetlands, carbon cycle module, and confluence algorithm for connecting the grids with river channels [35]. Due to the difficulty of data acquisition and computational complexity, the modules generally considered in estimating runoff yield and confluence using the hydrological model are the surface module, meteorological driving module, and confluence module; that is, the digital elevation model (DEM) grid data, meteorological data, soil data, and vegetation parameters data of the study area.
Hornik proved that the neural network algorithm is a universal approximator [36]. With only one hidden layer containing enough neurons, the multi-layer feedforward neural network can approximate any continuous function with arbitrary precision. According to the characteristics of the neural network algorithm, input parameters with constant values can be removed because constants can be merged into a function of any kind. The final input parameters of the model are the meteorological data, soil data, and evapotranspiration parameters of the Jinsha River Basin.
Remote Sens. 2020, 12, 1096

Train Data Preprocessing
In order to solve the problem of the high dimensionality of input parameters caused by remote sensing data input, principal component analysis (PCA) was adopted to reduce the dimensions of input data in our study. The purpose of dimension reduction is to use low-dimensional data to retain as much information as possible from the original high-dimensional data; that is, to project the original high-dimensional spatial data to the target low-dimensional subspace through linear mapping to achieve the effect of dimension reduction. PCA is computed as follows, using the covariance method.
3. Compute the covariance matrix of the samples, $XX^{T}$;
4. Perform eigenvalue decomposition of the covariance matrix $XX^{T}$;
5. Extract the eigenvectors corresponding to the d largest eigenvalues;
6. Output the projection matrix after dimension reduction.
After the dimensionality reduction of the input data, some information from the original data was inevitably discarded, but at the same time, some noise that deviated too far from the sample center could be removed. For the neural network model used in the study, too many input dimensions could easily lead to over-fitting and poor generalization, and thus to poor prediction results.
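The covariance-method steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `pca_covariance`, the row-per-sample layout, the explicit centering step, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pca_covariance(X, d):
    """Project samples onto the d leading principal components.

    Assumption of this sketch: X has shape (n_samples, n_features),
    i.e. one sample per row.
    """
    # Center the samples so the covariance matrix is well defined.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features.
    cov = X_centered.T @ X_centered / (X.shape[0] - 1)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors that belong to the d largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:d]
    W = eigvecs[:, order]          # projection matrix
    return X_centered @ W, W, eigvals[order]

# Synthetic example: 200 samples, 6 features, one nearly redundant feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 3] = 2.0 * X[:, 0] + 0.01 * rng.normal(size=200)
Z, W, top_eigvals = pca_covariance(X, d=3)
```

The number of retained components `d` then fixes the number of input nodes of the network; how `d` was chosen in the study is not stated here, so it is left as a free parameter.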

Precipitation-Water Level Modeling
In order to better simulate the delayed effect of the water level change process, modeling the relationship between precipitation and water level needs to consider time variation. A nonlinear autoregressive exogenous model (NARX) was selected in this paper. To show the effect of the time series network model, this paper used the BP neural network model as a comparison.

Back Propagation Neural Network Model
Back propagation neural network (BPNN) is a feedforward neural network trained by the error back propagation (BP) algorithm [37]. It is the most widely used network in the field of prediction and modeling. The key to the BP neural network is that signals propagate forward and errors propagate backward.
The BP algorithm is based on a gradient descent strategy. The goal of the gradient descent algorithm is to minimize the cumulative error on the training set. The gradient vector of the loss function is obtained by the chain rule of differentiation, which gives the error value of each output layer node. Using the error value of the output layer nodes as the reverse input, the connection weights and thresholds of the neural network are updated. The update process is repeated until a termination condition of network training (maximum training time, minimum error, maximum number of iterations, etc.) is reached.
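To make the forward and backward passes concrete, the following is a minimal sketch of batch gradient descent on a network with a single tanh hidden layer and a mean-squared-error loss. It is not the configuration used in the paper: the single hidden layer, the absence of a momentum term, the function names `train_bp` and `predict`, and the toy data are assumptions chosen to keep the example short.

```python
import numpy as np

def train_bp(X, y, hidden=5, lr=0.05, max_epochs=3000, tol=1e-4, seed=0):
    """Train a one-hidden-layer tanh network with plain gradient descent."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W1 = rng.normal(scale=0.5, size=(m, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    target = y.reshape(-1, 1)
    losses = []
    for _ in range(max_epochs):
        # Forward pass: signals propagate from input to output.
        H = np.tanh(X @ W1 + b1)
        out = H @ W2 + b2
        err = out - target
        loss = float(np.mean(err ** 2))
        losses.append(loss)
        if loss < tol:               # termination: minimum error reached
            break
        # Backward pass: errors propagate from output to input (chain rule).
        g_out = 2.0 * err / n
        gW2 = H.T @ g_out
        gb2 = g_out.sum(axis=0)
        g_hidden = (g_out @ W2.T) * (1.0 - H ** 2)   # derivative of tanh
        gW1 = X.T @ g_hidden
        gb1 = g_hidden.sum(axis=0)
        # Gradient descent update of weights and thresholds (biases).
        W2 -= lr * gW2
        b2 -= lr * gb2
        W1 -= lr * gW1
        b1 -= lr * gb1
    return (W1, b1, W2, b2), losses

def predict(params, X):
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Toy regression problem standing in for the real inputs.
X_demo = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y_demo = np.sin(3.0 * X_demo).ravel()
params, losses = train_bp(X_demo, y_demo)
y_hat = predict(params, X_demo)
```

The loop stops either when the error falls below `tol` or when `max_epochs` is reached, which mirrors two of the termination conditions named in the text.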

Nonlinear Autoregressive Exogenous Model
The nonlinear autoregressive exogenous model (NARX), developed by Tsungnan Lin [38], is a recurrent network that has exogenous inputs. This means that the output of the model depends both on past values of the output and on current and past values of the independent variable series.
Compared with the BP neural network, the NARX network relates past observed (or predicted) values to the current target value through a time delay variable (d), which allows it to model observation variables associated with a time series more accurately. In the NARX time series network model, y(t) is the output target value on day t, x(t) is the input variable value on day t, and d is the time delay in days. The delay time is often determined by multiple prediction experiments; here, we obtained the best prediction result through repeated experiments, which are recorded in Section 3.2.1. If the delay time was too long or too short, the NARX model under-fitted or oscillated.
NARX can take its feedback input from observed data (the parallel mode) or from the predicted target value (the serial mode). The parallel mode has higher prediction accuracy because it uses the exact values of observed data. As the ground stations can obtain the accurate daily water level, the parallel mode was adopted in this study. The hydrological process is usually a slow cumulative process, and water level changes are usually associated with past water levels, so the NARX time series network model can better explain the process of water level changes and can theoretically produce more accurate results.
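For orientation, one common way of writing a NARX model with the symbols defined above is given below. This is a generic textbook form stated under assumptions, not a reproduction of the paper's own equation: it assumes the same delay d for the output and the exogenous input, and it includes the current input x(t), which some formulations omit.

```latex
y(t) = f\bigl(y(t-1),\, y(t-2),\, \dots,\, y(t-d),\; x(t),\, x(t-1),\, \dots,\, x(t-d)\bigr)
```

Here f is the nonlinear mapping learned by the network.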

Evaluation of Precipitation-Water Level Modeling
The main criteria for evaluating the efficacy of prediction models are the correlation coefficient (R), percentage bias (PBias), root-mean-square error (RMSE), Nash-Sutcliffe efficiency (NSE), and mean absolute error (MAE), which are commonly used to assess the quality of hydrological models. In these validation coefficients, n is the number of predictions, $O_i$ is the observed (i.e., measured) value of water level, $P_i$ is the predicted value of water level output from the model, $\bar{O}$ is the average observed water level, and $\bar{P}$ is the average predicted water level. Through comparative experiments, the model accuracy of different inputs and different networks was evaluated.
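The five criteria can be computed as in the sketch below, which uses their widely used textbook definitions. These definitions are an assumption of the example rather than a transcription of the paper's formulas; in particular, the sign convention for PBias (positive when the model overestimates) varies between authors, and the function name `evaluate` is hypothetical.

```python
import numpy as np

def evaluate(obs, pred):
    """Return R, PBias (%), RMSE, NSE and MAE for observed/predicted series."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    return {
        # Pearson correlation coefficient between observations and predictions.
        "R": float(np.corrcoef(obs, pred)[0, 1]),
        # Assumed convention: positive PBias means overestimation.
        "PBias": float(100.0 * err.sum() / obs.sum()),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        # NSE compares the model error with the variance of the observations.
        "NSE": float(1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)),
        "MAE": float(np.mean(np.abs(err))),
    }

# Small check: a prediction that is uniformly 1 unit too high.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this check the constant offset leaves R at 1 while NSE drops to 0.2, which shows why NSE and the error measures are reported alongside the correlation coefficient.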

Results and Discussion
In order to study the possible influencing factors in the precipitation-water level model, we carried out several groups of comparative experiments. Considering the instantaneousness of water level observation, we used a 7-day (n-7) sliding window and divided the data into 1543 groups of 8 days (the first 7 days of each group for training and the eighth day as the prediction). The size of the sliding window was chosen after many attempts during parameter adjustment. The results of these groups of comparative experiments give some guidance on constructing the precipitation-water level relationship.
We divided the input of the neural network model into a precipitation part, which directly affected the water level change, and an extra meteorological input (EMI) part, which had an indirect effect on the water level change. The extra meteorological input data included soil moisture data, temperature data, wind speed data, and evapotranspiration data. We used in situ station precipitation observations, CHIRPS precipitation products, GLDAS-2 precipitation products, and TRMM-V7 precipitation products as precipitation inputs, and the meteorological inputs used the assimilation data in GLDAS-2. At the same time, in order to verify the results of surface precipitation modeling, we used a full station input for a contrast experiment (the last group). To explore the influence of different networks on the modeling results of the precipitation-water level model, each group of inputs was used with different network models for water level simulation. Furthermore, we resampled the grid precipitation products to different resolutions to explore the impact of resolution on the modeling results. Table 2 shows the accuracy of all the types of models mentioned above.
Table 2. Accuracy evaluation of precipitation-water level modeling using different types of models. Extra meteorological input (EMI); spatial resolution (SR); back propagation neural network (BPNN); nonlinear autoregressive exogenous model (NARX); percentage bias (PBias); root-mean-square error (RMSE); Nash-Sutcliffe efficiency (NSE); mean absolute error (MAE); correlation coefficient (R).
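The 8-day grouping described above can be sketched as follows. This is an illustrative reading of the text, assuming a one-dimensional daily series, a stride of one day, and the hypothetical helper name `make_windows`; the paper does not state how its 1543 groups were produced.

```python
import numpy as np

def make_windows(series, window=7):
    """Split a daily series into (window-day input, next-day target) pairs."""
    series = np.asarray(series, dtype=float)
    n_groups = len(series) - window
    # Each row holds `window` consecutive days.
    inputs = np.stack([series[i:i + window] for i in range(n_groups)])
    # The target is the day that immediately follows each window.
    targets = series[window:]
    return inputs, targets

# Ten days of dummy values: days 0..6 predict day 7, and so on.
inputs, targets = make_windows(np.arange(10), window=7)
```

With a stride of one day, a series of T days yields T - 7 groups.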

Comparison of Different Grid Precipitation Products with Respect to the Observed Data
In order to preliminarily verify the availability of grid precipitation products, this paper makes a simple evaluation of the accuracy of grid precipitation products. Figure 3 shows the scatter plots of the areal precipitation time series of CHIRPS, GLDAS-2, and TRMM-V7 against the rainfall data of the in situ station from 2006 to 2009. The CHIRPS precipitation product had a correlation of 0.606 with the in situ station data (Figure 3a) and the GLDAS-2 precipitation product had a correlation of 0.733 (Figure 3b), whereas the TRMM-V7 product had a correlation of 0.645 (Figure 3c).
Table 3 shows the comparison of the accuracy of daily precipitation mean values of three grid precipitation products (CHIRPS, GLDAS-2, TRMM-V7) and ground precipitation stations. It can be seen from the table that GLDAS-2 data had the highest correlation, but the percentage bias of the data was also the largest. Generally, grid precipitation products tended to underestimate the extreme values of rainfall, but the average value was relatively reliable.

Parameter Optimization of Neural Network
Although multilayer feedforward neural networks can approximate arbitrary continuous functions, there is usually no way to determine the exact number of hidden layers and the number of neurons in each hidden layer; however, previous studies have summarized some general parameter setting rules. Generally speaking, due to the structure limitations of the training data and network model, a feedforward neural network model with two hidden layers usually simulates better than one with only one hidden layer [39]. Here, we used different kinds and resolutions of grid precipitation products, and the number of input nodes in each group of neural networks was determined by the results of the principal component analysis (PCA) dimensionality reduction. The node number of the first hidden layer was $N_1 = \sqrt{D_i + 1} + 1$, where $D_i$ is the dimension of the independent input variable; that of the second was $N_2 = N_1 + 3$. Moreover, the transfer functions of the network were of the Tansig type. The parameters used in this paper were a relatively optimal solution obtained by repeated parameter adjustment, and all the training parameters used are shown in Table 4.
Table 4. Model parameters used for the neural network models.
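As a small worked example of the sizing rule, the sketch below reads the first-layer formula as the square root of (D_i + 1) plus 1, which is one plausible interpretation of the printed expression, and rounds up to a whole number of nodes. Both the reading of the formula and the rounding are assumptions, and `hidden_layer_sizes` is a hypothetical helper name.

```python
import math

def hidden_layer_sizes(d_in):
    """Return (N1, N2) for an input dimension d_in under the assumed rule."""
    # Assumed reading: N1 = sqrt(D_i + 1) + 1, rounded up to an integer.
    n1 = math.ceil(math.sqrt(d_in + 1) + 1)
    # The second hidden layer has three more nodes than the first.
    n2 = n1 + 3
    return n1, n2

sizes_8 = hidden_layer_sizes(8)    # sqrt(9) + 1 = 4 exactly
sizes_10 = hidden_layer_sizes(10)  # sqrt(11) + 1 is about 4.32, rounded up
```

For an input dimension of 8 this gives 4 and 7 nodes, and for 10 it gives 5 and 8.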

Study parameter
Learning rate = 0.01 momentum factor = 0.
In order to get better modeling results, we adjusted the parameters of the neural network manually: we performed preliminary experiments to determine the network parameters and then adjusted them further in the actual experiment. Table 5 shows the cross-correlation function between the rainfall gauge data and the water level of Pingshan station. When the time delay was 6, the correlation between the rainfall series and the water level series reached a maximum of 0.7559. Based on the cross-correlation function (CCF) results, we gradually adjusted the time delay parameter in the NARX network model. After many experiments, we found that the performance of the NARX model was best when the time lag was 5 days.
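The lag search described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `cross_correlation`, the synthetic rainfall and water level series, and the 3-day response built into them are all assumptions made for the example. It computes the Pearson correlation between rainfall shifted by k days and the water level, and takes the delay with the highest correlation as a starting point for tuning.

```python
import numpy as np

def cross_correlation(rain, level, max_lag):
    """Pearson correlation between rain[t - k] and level[t] for k = 0..max_lag."""
    rain = np.asarray(rain, dtype=float)
    level = np.asarray(level, dtype=float)
    ccf = []
    for k in range(max_lag + 1):
        # Pair rainfall on day t - k with the water level on day t.
        x = rain[: len(rain) - k]
        y = level[k:]
        ccf.append(np.corrcoef(x, y)[0, 1])
    return np.array(ccf)

# Synthetic data for illustration only (assumption): the level follows
# the rainfall of three days earlier, plus a little noise.
rng = np.random.default_rng(0)
rain = rng.gamma(shape=2.0, scale=5.0, size=400)
level = np.empty_like(rain)
level[:3] = rain.mean()
level[3:] = rain[:-3]
level = level + rng.normal(scale=0.5, size=rain.size)

ccf = cross_correlation(rain, level, max_lag=10)
best_lag = int(np.argmax(ccf))  # candidate delay to start tuning from
print(best_lag, ccf.round(3))
```

As in the text, the delay with the highest cross-correlation is only a starting value; the delay that gives the best model performance may differ from it.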

Modeling Results of BP Neural Network Model and NARX Network
The BP neural network is the most popular network in the field of prediction, but its performance in precipitation-water level modeling is not as good as that of the NARX network, as shown in Figure 4. Figure 4 shows line plots of the observed data and the water level simulated by the two neural network models with GLDAS-2+EMI as input: the red line is the modeling result of the BPNN and the blue line is the modeling result of the NARX network. From the graph, we can see that the water level modeled by the NARX network was closer to the observed value. Additionally, as shown in Table 2, the $R^2$ value of water level modeling was 0.93 to 0.94 for the BP neural network model and 0.97 to 0.98 for the NARX model. Although the four sets of experiments used different input variables, the NARX model gave better water level modeling results in every group.
Table 6 shows the average modeling results of the two neural networks and gives a more precise comparison of the BP neural network model and the NARX network model, from which we can see that the NARX model performed better than the BP model on the four performance evaluation criteria. Although the correlation coefficients of both the BP model and the NARX model were more than 0.95, the average correlation coefficient of the NARX prediction results was 2% higher than that of the BP model. However, the average percentage bias of the NARX model was 0.00032%, while that of the BP neural network model was 0.00041%. As for RMSE, the mean RMSE of the BP model was 1.072 m, but that of NARX was 0.606 m. The Nash-Sutcliffe efficiency (NSE) of both models was higher than 50%; thus, both kinds of neural network model were reliable, but the NARX model was more reliable, as its NSE was over 97%. All criteria for the performance evaluation show that the NARX model performed better than the BPNN model in short-term water level modeling, which was consistent with our expectations, but the percentage bias of the BP neural network was smaller on the whole. As the NARX model used past rainfall input values and water level observations, and thus considered the delayed effect of hydrological processes, its fitting results could get closer to the complex hydrological processes in the real world. Using the NARX network for model construction can also better explain the relationship between rainfall and water level, and it can be seen from Figure 4 that, compared with the BPNN, the NARX network produced less model oscillation and a lower probability of extreme values.
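The idea that past rainfall and past water level observations both enter the model can be illustrated with a small sketch. Note that this is a linear stand-in fitted by least squares, not the NARX neural network used in the paper; the helper name `lagged_design`, the synthetic series, and the train/test split are assumptions, and the five lags simply mirror the 5-day time lag reported above.

```python
import numpy as np

def lagged_design(rain, level, n_lags):
    """Build rows [rain[t-1..t-n], level[t-1..t-n], 1] to predict level[t]."""
    rows, targets = [], []
    for t in range(n_lags, len(level)):
        past_rain = rain[t - n_lags:t][::-1]    # most recent day first
        past_level = level[t - n_lags:t][::-1]
        rows.append(np.concatenate([past_rain, past_level, [1.0]]))
        targets.append(level[t])
    return np.array(rows), np.array(targets)

# Synthetic series for illustration only (assumption): the level decays
# slowly and is raised by the rainfall of two days earlier.
rng = np.random.default_rng(1)
n = 500
rain = rng.gamma(2.0, 5.0, size=n)
level = np.zeros(n)
for t in range(2, n):
    level[t] = 0.8 * level[t - 1] + 0.05 * rain[t - 2] + rng.normal(scale=0.05)

X, y = lagged_design(rain, level, n_lags=5)
split = 400  # fit on the first 400 rows, evaluate on the rest
coef, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
pred = X[split:] @ coef
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))

# Persistence baseline: predict today's level as yesterday's level.
persistence_rmse = float(np.sqrt(np.mean((X[split:, 5] - y[split:]) ** 2)))
print(rmse, persistence_rmse)
```

A NARX network replaces the linear map with a nonlinear one, but the input layout (delayed exogenous rainfall plus delayed water level) is the same.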

Comparison of Modeling Results using Grid Products and Station Data
In order to verify the availability of grid products in precipitation-water level modeling, we compared the modeling results of grid products with those of ground stations. Table 7 shows the impact of different types of rainfall data input on the water level modeling results. The average correlation coefficients of grid data modeling and station data modeling using the NARX network were both 0.988. Using GLDAS-2 0.25° with EMI as an input gave the best result, with an RMSE of 0.561 m and a Nash efficiency coefficient of 97.858%, while the Nash efficiency coefficient of using station data was 97.560%. Combined with Table 2, we found that the percentage bias of station data was relatively small, at −0.00060%. From Figure 5, we can see more intuitively that the accuracy of the modeling results with grid precipitation data differed little from that with station data. In addition, although using GLDAS-2 0.25° with EMI as an input showed the smallest RMSE and the highest correlation coefficient with the observed data, the average percentage bias of GLDAS-2 was larger than that of the other group. These results show that it is feasible to use grid rainfall products as a solution for areas where stations are sparse. Because the grid data integrated the satellite data and the ground station data, they were more reliable than conventional spatial interpolation of ground station data. In this experiment, the prediction result using grid data input was slightly better than that using only ground stations.
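For reference, the four evaluation criteria used in these comparisons (correlation coefficient, percentage bias, RMSE, and Nash efficiency coefficient) can be computed as in the following sketch. The function name `evaluate`, the sample water levels, and the sign convention for percentage bias (positive meaning overestimation) are assumptions; the paper's own definitions take precedence if they differ.

```python
import numpy as np

def evaluate(obs, sim):
    """Return R, percentage bias, RMSE and Nash efficiency for two series."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs
    r = np.corrcoef(obs, sim)[0, 1]
    # Assumed sign convention: positive values mean the model overestimates.
    pbias = 100.0 * err.sum() / obs.sum()
    rmse = np.sqrt(np.mean(err ** 2))
    # NSE = 1 is a perfect fit; NSE = 0 is no better than the observed mean.
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    return {"R": r, "PBIAS_percent": pbias, "RMSE": rmse, "NSE": nse}

# Hypothetical observed and simulated water levels in metres.
obs = np.array([290.0, 291.5, 293.0, 292.0, 290.5])
sim = np.array([290.2, 291.0, 293.4, 292.3, 290.3])
scores = evaluate(obs, sim)
print(scores)
```

Because the percentage bias is normalized by the sum of the observed water levels, it can be very small even when RMSE is not, so the criteria are best read together.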

Influence of using Different Precipitation Products
In order to verify the influence of different grid precipitation products on the model construction, we resampled the three products CHIRPS, GLDAS-2, and TRMM-V7 to 0.25° and constructed the model. The final experimental results are shown in Table 8. The Nash efficiency coefficients of using CHIRPS, GLDAS-2, and TRMM-V7 data with the BPNN model were 93%, 91%, and 92%, and those with the NARX model were all above 97%. Figure 6 shows the accuracy histogram of the three products; by comprehensive comparison, the GLDAS-2 modeling results were the best. The R of using GLDAS-2 with the NARX network was 0.989, the RMSE was 0.561 m, and the Nash efficiency coefficient was 97.858%. However, modeling with TRMM-V7 and CHIRPS data showed a smaller percentage deviation. For the Jinsha River Basin, GLDAS-2 was the best rainfall input among the three rainfall products, but the performance gap among the three grid precipitation products was not large. The absolute deviation of the CHIRPS group was the smallest, which may have been due to the better performance of CHIRPS data in estimating extreme rainfall values.

Influence of using Different Resolution Products
In order to explore the influence of grid product resolution on model building, we resampled each grid product to a different resolution: GLDAS-2 and TRMM data were resampled to 0.5° (the original resolution was 0.25°), and CHIRPS data were resampled to 0.25° and 0.5° (the original resolution was 0.05°), as shown in Figure 7. The Nash efficiency coefficients of using the BPNN with CHIRPS input at 0.05°, 0.25°, and 0.5° were 90.857%, 92.933%, and 92.727%, while those using the NARX network were 97.447%, 97.505%, and 97.277%. From Figure 7, we can see that grid product inputs with different resolutions had little influence on the modeling results. For the CHIRPS and TRMM data, the prediction accuracy of the higher-resolution products was slightly reduced, but for GLDAS-2 data, the high-resolution input gave slightly better modeling results. For different resolutions of the same precipitation product, the short-term water level prediction accuracy at spatial resolutions of 0.25° × 0.25° and 0.5° × 0.5° differed little, and the high-resolution CHIRPS input did not significantly improve the model prediction, which may have been due to over-fitting caused by the high input dimension. For model training, using 0.5° × 0.5° data as input can significantly improve the processing speed of the model, so low- and medium-resolution data can be considered for model training.
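Resampling a grid to a coarser resolution, as described above, can be sketched as follows. The text does not state which resampling method was used; block averaging is an assumption here, as are the function name `block_mean` and the hypothetical 1° × 1° tile of values.

```python
import numpy as np

def block_mean(grid, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = grid.shape
    if rows % factor or cols % factor:
        raise ValueError("grid shape must be divisible by the aggregation factor")
    blocks = grid.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical 1 deg x 1 deg tile of daily precipitation at 0.05 deg (20 x 20 cells).
fine = np.arange(400, dtype=float).reshape(20, 20)
quarter_deg = block_mean(fine, 5)   # 0.05 deg -> 0.25 deg, 4 x 4 cells
half_deg = block_mean(fine, 10)     # 0.05 deg -> 0.5 deg, 2 x 2 cells
print(quarter_deg.shape, half_deg.shape)
```

Averaging preserves the areal mean while reducing the number of cells by the square of the factor, which is consistent with the observation that coarser inputs lower the input dimension and speed up training.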

In total, when the BP neural network was chosen as the fitting model, using in situ stations, CHIRPS, or GLDAS-2 data as the input data source did not have an obvious influence on the prediction results, and grid precipitation products could achieve results similar to those of rain-gauge station inputs. When the NARX network was selected as the fitting model, the results of using the GLDAS-2 precipitation product as an input were generally better than those of using rain-gauge station inputs and CHIRPS data. There was little difference between the accuracy of the simulation results obtained with grid precipitation products and with ground stations. When GLDAS-2 data were used as the model input, the water level modeling result was slightly better than that of the ground stations. However, for both models, the percentage deviation obtained with the in situ stations as input was better than that obtained with precipitation products, because grid precipitation products tend to underestimate high rainfall values [40].

Figure 7. Comparison of performance of the water level modeling using different resolutions. (a) Nash efficiency coefficient for water level prediction using GLDAS-2 data. (b) Nash efficiency coefficient for water level prediction using TRMM-V7 data. (c) Nash efficiency coefficient for water level prediction using CHIRPS data.

Conclusions
Precipitation-water level modeling usually lacks a clear model-building mechanism. In this study, we showed a specific data-driven process for grid precipitation-water level modeling and evaluated the accuracy of the model. Different data were used as input variables for the two neural network models. The results of this study lead to the following conclusions:

1. Compared with the BP neural network, the NARX time series network can significantly improve the accuracy of water level modeling, because the NARX network accounts for the time lag effect.

2. Compared with the ground stations, the grid data give similar results in general, and the GLDAS-2 data are better than the ground stations in water level modeling. Therefore, in areas where ground station data are missing, the surface rainfall data can be used as an alternative to ground stations for water level modeling experiments.

3. At the same resolution, the water level modeling results with different data sources are similar, although the GLDAS-2 results are slightly better. With the same data source, the water level modeling results obtained from surface rainfall data of different resolutions differ little, so pursuing high-resolution surface rainfall products is of little significance in constructing a precipitation-water level model.

4. In this paper, by putting forward a method for building a precipitation-water level model, we discussed the factors that influence each part of the water level model, which has some guiding significance for future research into water level modeling.

At present, our research can only predict the short-term water level and does not consider the influence of the distance effect in the basin on the time delay of the rainfall-water level process. In the future, we will consider using machine learning methods to predict the medium- and long-term water level and will divide the distance according to the spatial correlation factors in the basin.