Estimation of Water-Use Rates Based on Hydro-Meteorological Variables Using Deep Belief Network

Abstract: This study proposed a deep learning-based model to estimate stream water-use rate (WUR) using precipitation (P) and potential evapotranspiration (PET). Correlations were explored to identify relationships among accumulated meteorological variables for various time durations (three-, four-, five-, and six-month cumulative) and WUR, which revealed that three-month cumulative meteorological variables and WUR were highly correlated. A deep belief network (DBN) based on iterative parameter tuning was developed to estimate WUR using P, PET, and antecedent stream water-use rate (DWUR). The training and validation periods were 2011–2016 and 2017–2019, respectively. The results showed that the PET-DWUR based model performed better in terms of Nash–Sutcliffe efficiency (NSE), root mean square error (RMSE), and determination coefficient (R²) than the P-PET-DWUR and P-DWUR models. Coupled with a weather forecast model, the framework in this study can provide a forecast model for deficiencies of stream water use.


Introduction
Freshwater resources are stressed within South Korea due to climate variability and climate change [1][2][3]. Unusually severe droughts from 2013 to 2015 in South Korea [4] have resulted in reduced runoff and depleted streams [5]. In addition, many studies have projected more frequent and more severe droughts in the future [6]. One solution for managing a varying hydroclimate is a hard path approach of constructing water supply infrastructure to meet human and environmental water demands. These hard path solutions are large scale, politically intractable, and require stringent environmental assessments before commissioning [7][8][9].
A soft path solution deals with technologies and policies that aim to improve the overall productivity of water use rather than new supply sources [10]. It delivers diverse water services matched to the users' needs and works with water users at the community scale level [7,11]. The stream water coordination council (SWCC) operates in South Korea and consists of individual water users and suppliers at the community level. The SWCC faces a substantial challenge, i.e., the improvement of water use efficiency against more intense drought.
A soft path solution has various sources of uncertainty; however, the uncertainties can be reduced by collecting and analyzing data [7]. Therefore, the Ministry of Environment in South Korea has collected

Methodology

Experimental Design
This study proposed a deep learning-based prediction model of WUR by considering weather variables and evaluated its reproducibility. We found relationships among WUR and meteorological variables such as precipitation (P) and potential evapotranspiration (PET) for nine years (2011–2019). Moreover, a new model using DBN, an unsupervised method to train with a relatively small dataset, was developed for predicting WUR. The training and prediction periods for the DBN were 2011 to 2016 and 2017 to 2019, respectively. The prediction variable was WUR, and the input variables were combinations of precipitation (P) and PET for various time durations (3-, 4-, 5-, and 6-month cumulative) including DWUR that accounts for a month before the prediction month [32]. Because drought occurs due to an excessive shortage of accumulative precipitation and continuously active evapotranspiration, monthly cumulative P and PET for 3, 4, 5, and 6 months were used as input datasets. Therefore, this study used the following three possible input datasets: (1) P, PET, and DWUR (P-PET-DWUR), (2) PET and DWUR (PET-DWUR), and (3) P and DWUR (P-DWUR). The objectives of this study were to identify the relationships among WUR and meteorological variables and to suggest the DBN-based prediction model with the best input dataset using different cumulative periods in the Yeongsan River basin. Moreover, this prediction model was applied to several water-use facilities for its validation (Figure 1).
Water 2020, 12, x FOR PEER REVIEW 3 of 14

Study Area and Data
To secure a reliable supply of stream water (i.e., agricultural water), the central government permits and manages stream water in Korea. The upper limit boundary of available stream water use is set as the reference low flow to prevent indiscriminate water use and reduce stream water deficit. The reference low flow is defined by the ten-year frequency streamflow that is maintained for more than 355 days per year [33]. Although use of stream water is permitted below the reference low flow, the water-use facility (WUF) controls the amount of stream water used according to drought severity.
The flood control office is responsible for managing stream water and adjusting the permitted amount for the WUFs. The permitted amount of stream water for each WUF can be changed every year by the authority of the institutions, considering both water demand and water availability. The WUR is defined as stream water use (demand) divided by stream water permit (supply), expressed as a percentage in the range 0 to 100, so unlike stream water use it remains comparable when the permitted amount changes. Therefore, the WUR was chosen as the prediction variable instead of stream water use. The Yeongsan River basin, located in southwestern Korea, is well known as a large-scale agricultural region even after industrialization. Demand for stream water has been high near the downstream part of the river basin, where the population is concentrated along the riverside (Figure 2).
In this study, the meteorological variables were precipitation and potential evapotranspiration (PET), observed from twelve automated surface observing systems in the Korea Meteorological Administration. The basin mean monthly precipitation was calculated by using the Thiessen method. Monthly PET was estimated by the Thornthwaite equation [34] using observed monthly mean temperature at the stations in the basin. The stream water use by WUFs was averaged and used for the model input data as antecedent WUR.
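The PET estimation described above can be illustrated with a minimal sketch of the standard Thornthwaite formula. The function name and the optional day-length and month-length corrections are assumptions for illustration; the source does not state which corrections, if any, were applied, so the defaults below use a 12-hour day and a 30-day month.

```python
def thornthwaite_pet(monthly_temp_c, day_length_h=None, days_in_month=None):
    """Monthly PET (mm) from 12 monthly mean air temperatures (deg C).

    Hypothetical helper: standard Thornthwaite formula with optional
    day-length and month-length corrections (assumed, not from the source).
    """
    if len(monthly_temp_c) != 12:
        raise ValueError("expected 12 monthly mean temperatures")
    # Months at or below 0 deg C contribute nothing to the heat index or to PET.
    temps = [max(t, 0.0) for t in monthly_temp_c]
    heat_index = sum((t / 5.0) ** 1.514 for t in temps)
    if heat_index == 0.0:
        return [0.0] * 12
    # Empirical exponent as a cubic function of the annual heat index.
    a = (6.75e-7 * heat_index ** 3 - 7.71e-5 * heat_index ** 2
         + 1.792e-2 * heat_index + 0.49239)
    day_length_h = day_length_h or [12.0] * 12
    days_in_month = days_in_month or [30] * 12
    pet = []
    for t, hours, days in zip(temps, day_length_h, days_in_month):
        unadjusted = 16.0 * (10.0 * t / heat_index) ** a
        pet.append(unadjusted * (hours / 12.0) * (days / 30.0))
    return pet


# Illustrative temperatures only (not observations from the basin).
example_temps = [0, 2, 6, 12, 17, 22, 25, 26, 21, 15, 8, 2]
example_pet = thornthwaite_pet(example_temps)
```

With equal day lengths, the warmest month always receives the largest PET, which matches the summer peak described for the basin.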

Deep Belief Network (DBN)
A DBN is a probabilistic generative model that consists of several stacked restricted Boltzmann machines (RBMs) and is trained using the greedy learning method. An RBM is a type of Markov random field that has no edges within a layer, unlike a general Boltzmann machine, which also has connections between units in the same layer [35]. An RBM expresses a joint probability distribution using a graph consisting of nodes corresponding to random variables and edges representing the relationships among the random variables, and it consists of one visible layer and one hidden layer. Here, the visible units are the input data, and the hidden units are outputs that improve learning ability (Figure 3a). A DBN stacks RBMs with a simple calculation process. Unlike a deep Boltzmann machine (DBM), which expresses all layers as a joint probability, a DBN expresses only the top two layers as a joint probability and the remaining layers as conditional probabilities, as follows:
$$P(\nu, h^{1}, \ldots, h^{L}) = \left( \prod_{l=1}^{L-1} P(h^{l-1} \mid h^{l}) \right) P(h^{L-1}, h^{L}) \quad (1)$$
where $h^{0} = \nu$, $L$ is the number of hidden layers, $P(h^{l-1} \mid h^{l})$ denotes the conditional distribution for the units of layer $(l-1)$ given the units of layer $l$, and $P(h^{L-1}, h^{L})$ corresponds to the joint distribution of the top two layers $(L-1)$ and $L$.
The input data are used to initialize the weights of the DBN in the pretraining step, before the backpropagation algorithm fine-tunes the weights in the learning phase. In pretraining, the initial weights of one layer are determined, and then the next layer is trained. This process is repeated until all the layers are trained (Figure 3b).
DBN training is a layer-wise pretraining technique that sequentially performs pretraining from the lower layer (the layer close to the input) to the upper layer. The initial weight is pretrained using a layer-by-layer strategy, and the higher-level features are trained from the previous layers.
The DBN training process can be summarized as follows:
1. Initialize the visible units to a training dataset.
2. Express the back-and-forth process as conditional probabilities:
Positive phase, $$P(h_j = 1 \mid \nu) = \sigma\left(c_j + \sum_i \nu_i \omega_{ij}\right) \quad (2)$$
Negative phase, $$P(\nu_i = 1 \mid h) = \sigma\left(b_i + \sum_j h_j \omega_{ij}\right) \quad (3)$$
where $\sigma$ is the activation transfer function, $c_j$ and $b_i$ are the biases, $\nu_i$ and $h_j$ are the states of the visible and hidden units, and $\omega_{ij}$ represents the connection weight between units $i$ and $j$.
3. Re-update all of the hidden units in parallel given the reconstructed visible units using Equation (2).
4. Repeat with all training examples and update the weights ($\omega_{ij}$) and biases ($c_j$ and $b_i$) using Equations (4)-(6):
$$\Delta \omega_{ij} = \alpha \left( \langle \nu_i h_j \rangle_{data} - \langle \nu_i h_j \rangle_{recon} \right) \quad (4)$$
$$\Delta b_i = \alpha \left( \langle \nu_i \rangle_{data} - \langle \nu_i \rangle_{recon} \right) \quad (5)$$
$$\Delta c_j = \alpha \left( \langle h_j \rangle_{data} - \langle h_j \rangle_{recon} \right) \quad (6)$$
where $\langle \cdot \rangle$ denotes the expectation over the training data, $\langle \nu_i h_j \rangle_{data}$ refers to the distribution of raw data input to the RBM, $\langle \nu_i h_j \rangle_{recon}$ refers to the distribution of data after the model has been reconstructed, and $\alpha$ is the learning rate.
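The four training steps listed above can be sketched for a single RBM as follows. This is a minimal illustration rather than the implementation used in the study: it assumes binary units, a logistic activation, and a one-step contrastive-divergence approximation of the expectations, and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    # Positive phase: probability that each hidden unit is on, given visible states.
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # Negative phase: reconstruction probabilities of visible units, given hidden states.
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_step(v0, W, b, c, alpha, rng):
    """One update on a single training example; W, b, and c are modified in place."""
    ph0 = hidden_probs(v0, W, c)                          # step 2, positive phase
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]  # sample hidden states
    v1 = visible_probs(h0, W, b)                          # step 2, negative phase
    ph1 = hidden_probs(v1, W, c)                          # step 3, re-update hidden units
    for i in range(len(b)):                               # step 4, weights and biases
        for j in range(len(c)):
            W[i][j] += alpha * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += alpha * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += alpha * (ph0[j] - ph1[j])
    return v1


# Toy usage: 4 visible units, 3 hidden units, two binary patterns.
rng = random.Random(0)
W = [[rng.gauss(0.0, 0.01) for _ in range(3)] for _ in range(4)]
b, c = [0.0] * 4, [0.0] * 3
for _ in range(50):
    for pattern in ([1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]):
        cd1_step(pattern, W, b, c, alpha=0.1, rng=rng)
```

In a DBN, the hidden probabilities of one trained RBM would then serve as the visible input for the next RBM in the stack, which is the layer-by-layer strategy described above.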
This study applied the DEEPNET library of the R program to estimate the WUR. The proposed DBN settings are shown in Table 1. The prediction variable was the WUR, and the input variables were combinations of cumulative precipitation (P) and PET for various time durations (3-, 4-, 5-, and 6-month cumulative), together with DWUR, the WUR of the month before the prediction month. Each model was tuned over different ranges of hidden units, learning rates, epochs, and batch sizes on the same underlying input set to select the optimal parameters. The training period was 2011–2016, and the DBN model was validated from 2017 to 2019. Therefore, twelve models (three input combinations for each of the four cumulative durations) were constructed to estimate the WUR (Figure 4).
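The construction of the twelve input datasets can be sketched as below. The helper names are hypothetical, and one detail is an assumption: the source does not say whether the cumulative window ends at the prediction month or at the month before it, so this sketch lets the window end at the prediction month.

```python
from itertools import product


def rolling_sum(values, window):
    """Total over the current and the preceding window-1 months (None until a full window exists)."""
    out = []
    for t in range(len(values)):
        out.append(sum(values[t - window + 1:t + 1]) if t >= window - 1 else None)
    return out


INPUT_SETS = {
    "P-PET-DWUR": ("P", "PET", "DWUR"),
    "PET-DWUR": ("PET", "DWUR"),
    "P-DWUR": ("P", "DWUR"),
}
DURATIONS = (3, 4, 5, 6)

# Three input combinations for each of four cumulative durations.
MODEL_GRID = list(product(INPUT_SETS, DURATIONS))


def build_samples(p, pet, wur, window, variables):
    """Return (inputs, target) pairs for one model configuration."""
    cumulative = {"P": rolling_sum(p, window), "PET": rolling_sum(pet, window)}
    samples = []
    for t in range(max(window - 1, 1), len(wur)):
        row = []
        for name in variables:
            # DWUR is the WUR of the month before the prediction month.
            row.append(wur[t - 1] if name == "DWUR" else cumulative[name][t])
        samples.append((row, wur[t]))
    return samples
```

Each entry of `MODEL_GRID` then identifies one model, for example `("PET-DWUR", 3)`, whose samples would be split by year into the training and validation periods.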

Relationship between Meteorological Variables and Stream Water-Use Rate
This study used the monthly precipitation, monthly PET, and monthly WUR from 2011 to 2019 (Figure 5). The monthly mean precipitation ranged from 22.9 to 262.4 mm and was mainly concentrated in July and August, which accounted for 40% (516.79 mm) of the annual precipitation (1301.9 mm) (Figure 5a). The monthly PET started to increase in April and largely occurred in July and August, which accounted for 55% (310.4 mm) of the total annual PET (787.8 mm) (Figure 5b). The WUR followed the annual cycle of agricultural activities, increasing from May to September (Figure 5c). The annual mean WUR was 26.7%, with a peak of 63.2% in June. The highest WUR occurred in 2015, when South Korea experienced extreme drought [4].
Correlation coefficients were obtained to examine the relationships of P and PET with WUR for different durations. The relationship became weaker as the accumulated period became longer (Figure 6). The correlations of WUR with P and PET for the three-month duration were 0.47 and 0.72, respectively; those for the four-, five-, and six-month cumulative durations were 0.28 and 0.54, 0.09 and 0.33, and −0.07 and 0.12, respectively. In particular, the WUR was strongly correlated with accumulated PET over three months, because increased PET caused the soil to be drier, which resulted in the manager of the WUF perceiving an increased water demand for the crops. As the cumulative period was extended, the slope of the regression line for precipitation changed considerably compared with that for PET. In particular, six-month cumulative precipitation showed a negative relationship with WUR.
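The correlation analysis by cumulative duration can be sketched as follows. The source does not name the type of correlation coefficient, so the Pearson coefficient is assumed here, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def correlations_by_duration(monthly, wur, durations=(3, 4, 5, 6)):
    """Correlate k-month cumulative totals of a variable with monthly WUR."""
    result = {}
    for k in durations:
        # k-month totals ending in each month; the first k-1 months lack a full window.
        totals = [sum(monthly[t - k + 1:t + 1]) for t in range(k - 1, len(monthly))]
        result[k] = pearson_r(totals, wur[k - 1:])
    return result
```

Calling `correlations_by_duration` once with the precipitation series and once with the PET series would yield one coefficient per duration for each variable, which is the layout of the values reported above.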

Estimation of Stream Water-Use Rate
We constructed a DBN-based model using precipitation and PET to estimate WUR. We selected the optimal parameters for the DBN based on the best-performing model, as shown in Table 2. The optimal parameters differed by the cumulative duration of the meteorological variables (3, 4, 5, and 6 months). Figure 7 shows time series of WUR for the 3-, 4-, 5-, and 6-month cumulative periods, comparing the estimates from the three combinations of input variables for each duration. The estimate based on precipitation, PET, and DWUR for the three-month cumulative period was the most similar to the observations. In addition, the WUR estimated by PET-DWUR for the three-month cumulative period was similar to the observations (Figure 7a). The WUR estimated by P-DWUR was underestimated compared with the observations and showed some lagged response. The longer the cumulative duration of the meteorological variables, the lower the skill of the model. In particular, P-DWUR showed lower predictability than the other models, even for the relatively shorter cumulative periods.
For quantitative evaluation, the DBN model for each input dataset and duration was assessed using root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), and determination coefficient (R²) (Table 3). The RMSEs and NSEs of the WUR estimated with P-PET-DWUR and PET-DWUR for the three-month cumulative period were similar, and the R² between observed and estimated WUR with PET-DWUR was 0.96, which showed a strong positive relationship. This result reflects the phenomenon that WUR increases as PET increases. The model performance decreased as the duration increased: the RMSE increased and the NSE decreased for longer durations. This is consistent with the correlations of cumulative precipitation and PET with WUR, which decreased as the duration increased, as shown in Figure 6b-d. The scores of the WUR inferred by PET-DWUR also decreased with increasing cumulative duration, but PET-DWUR performed better than P-DWUR for all performance indices.
The Taylor diagram represents the model performance for the estimated WUR in terms of the correlation, the ratio of the standard deviations, and the centered root mean square difference (CRMSD) [36]. The correlation coefficient between the modeled and the observed data is given by cos θ, where θ is the azimuth angle of each point, and the circles drawn around the reference point (REF) represent the CRMSD. The CRMSD is the RMSD that does not take into account the mean model bias, and here the CRMSD is divided by the observed standard deviation, as shown in Figure 8. Therefore, the ideal value is at REF, where the correlation coefficient is 1, the ratio of standard deviations is 1, and the CRMSD is 0.
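The performance scores used in this section can be sketched as follows. The function names are hypothetical, and one assumption is made: the determination coefficient is computed as the squared correlation between observed and estimated WUR, which the source does not state explicitly.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(_mean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def taylor_stats(obs, sim):
    """Correlation, standard deviation ratio, and CRMSD normalized by the observed SD."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    dev_obs = [o - mean_obs for o in obs]
    dev_sim = [s - mean_sim for s in sim]
    sd_obs = math.sqrt(_mean([d * d for d in dev_obs]))
    sd_sim = math.sqrt(_mean([d * d for d in dev_sim]))
    corr = _mean([a * b for a, b in zip(dev_obs, dev_sim)]) / (sd_obs * sd_sim)
    # Centered difference: the mean bias is removed before the RMS is taken.
    crmsd = math.sqrt(_mean([(b - a) ** 2 for a, b in zip(dev_obs, dev_sim)]))
    return {"correlation": corr, "sd_ratio": sd_sim / sd_obs, "crmsd": crmsd / sd_obs}


def r_squared(obs, sim):
    # Assumed definition: squared correlation coefficient.
    return taylor_stats(obs, sim)["correlation"] ** 2
```

A simulation that differs from the observations only by a constant offset illustrates the difference between the scores: its RMSE and NSE are penalized, while its normalized CRMSD stays at 0 because the mean bias is removed.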
The PET-DWUR model had a higher correlation coefficient than the P-PET-DWUR model for all cumulative periods, while the P-PET-DWUR model had a higher ratio of standard deviations than the PET-DWUR model for all cumulative periods. The CRMSDs of the PET-DWUR and P-PET-DWUR models for the three- and four-month cumulative durations were similar; however, the CRMSDs of PET-DWUR for the five- and six-month cumulative durations were lower than those of P-PET-DWUR (Figure 8). Thus, PET-DWUR showed better performance than P-PET-DWUR for the five- and six-month cumulative durations. Both P-PET-DWUR and PET-DWUR showed their best performance for the three-month cumulative duration: the correlations were 0.96 and 0.98, the ratios of standard deviations were 0.88 and 0.77, and the CRMSDs were 0.30 and 0.29, respectively. The WUR estimated by both P-PET-DWUR and PET-DWUR was close to the REF. The P-PET-DWUR and the PET-DWUR for the four-month cumulative duration were at a similar distance from the REF.
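On a Taylor diagram, the normalized CRMSD, the correlation, and the standard deviation ratio are tied together by the law of cosines, so the three-month values reported above can be cross-checked. The sketch below is only an illustrative consistency check; the function name is hypothetical, and the inputs are the rounded values quoted in the text.

```python
import math


def crmsd_from_taylor(correlation, sd_ratio):
    """Normalized CRMSD implied by the correlation and the standard deviation ratio."""
    return math.sqrt(1.0 + sd_ratio ** 2 - 2.0 * sd_ratio * correlation)


# (correlation, ratio of standard deviations, reported CRMSD) for the three-month models.
reported = {
    "P-PET-DWUR": (0.96, 0.88, 0.30),
    "PET-DWUR": (0.98, 0.77, 0.29),
}
implied = {name: crmsd_from_taylor(r, ratio) for name, (r, ratio, _) in reported.items()}
for name, value in implied.items():
    print(name, "implied:", round(value, 3), "reported:", reported[name][2])
```

Both implied values come out at about 0.29, which agrees with the reported CRMSDs of 0.30 and 0.29 to within the rounding of the inputs.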

Estimation of Stream Water-Use Rate on Stream Water-Use Facilities
The models using P-PET-DWUR and PET-DWUR for the three-month cumulative period as input data showed excellent performance for WUR. The inferred WUR is the average of all the WUFs in the Yeongsan River basin. To examine the applicability of the DBN model, WUR was also estimated for two individual facilities. We selected facilities that had a small annual variance of WUR in the Yeongsan River basin. The stream water-use permits of the selected WUFs, Si-Jong (st. 926) and Wol-Ho (st. 976), were 979,935 and 494,144 m³/day, and their WURs were 35.8% and 36.7%, respectively. To examine the effectiveness of the DBN model at each WUF, the WUR estimated by the DBN model using PET-DWUR for the three-month cumulative period, which had the best performance among the durations, was compared with the observations from 2017 to 2019 (Figure 9a,b).

The model performed well in predicting WUR at each WUF. However, the irrigation pumping facility, st. 927, withdrew stream water only from May to September and used no stream water in the remaining months, which caused the model to overestimate the WUR from October to April. In addition, the accuracy of the inferred WUR from May to September was very high because the WUR tended to be consistently high.

Conclusions
This study proposed a DBN-based model to estimate WUR in the Yeongsan River basin, which has a high proportion of agricultural land use. Specifically, on the basis of relationships among meteorological variables and WUR, a deep learning-based model was developed to estimate WUR using precipitation and PET. The relationships of WUR with precipitation and PET were strongest for the three-month cumulative duration. Overall, the relationship became weaker as the accumulated period became longer. Within the same cumulative period, the correlation of WUR with PET was higher than with precipitation because most of the basin consisted of well-irrigated paddy fields where water could be safely supplied even during droughts. The well-irrigated paddy was less sensitive to the lack of precipitation because water could be stably supplied for agricultural use even if the precipitation was relatively small. Therefore, stream water users demand more water when PET increases and dries the soil.
In this study, stream water use was estimated by applying a DBN to derive the relationships among the quantified meteorological variables and WUR. The best performance was obtained for three-month accumulated precipitation and PET. The results using P-PET-DWUR and PET-DWUR showed the best performance, and the P-DWUR based estimate of WUR performed relatively worse than PET-DWUR. When the cumulative period was over four months, the WUR estimated with PET-DWUR showed better performance than that estimated with both precipitation and PET. This matches the earlier results on the relationships between WUR and the two meteorological variables, precipitation and PET. The SWCC, the governance-based agency in South Korea, can operationally adjust stream water use with a soft-path approach. However, the operation of the SWCC has been ineffective due to the difficulty of predicting stream water use. The estimation method for WUR was developed based on relationships with meteorological variables, and its applicability was verified. The proposed method showed the possibility of forecasting WUR considering weather conditions, and accordingly, it could be actively used to adjust the amount of stream water use as a soft-path solution (Figure 10). A limitation of this study was that only agricultural water use was considered, and other water uses, i.e., residential, environmental, and industrial water use, were excluded. We confirmed that the proposed model using PET-DWUR could be effectively applied to predict agricultural water use. However, it is necessary to consider all types of crops that could cause different patterns of stream water use.

Conflicts of Interest:
The authors declare no conflict of interest.