A Study on Developing an AI-Based Water Demand Prediction and Classification Model for Gurye Intake Station

Abstract: Drought has significant impacts on both society and the environment, but it is a gradual and comprehensive process that affects a region over time. Therefore, non-structural measures are necessary to prepare for and respond to drought damage flexibly according to the stage of drought. In this study, an AI-based water demand prediction model was developed using deep neural network (DNN) and long short-term memory (LSTM) models. The model was trained on data from 2004 to 2015 and verified on data from 2016 to 2021. Model accuracy was evaluated against observed data, with the LSTM model achieving a correlation coefficient (CC) of 0.95 and a normalized root mean square error (NRMSE) of 8.38, indicating excellent performance. The probability of the random variable X falling within the interval [a, b], as described by the probability density function f(x), was calculated using the water demand data. The cumulative distribution function was used to calculate the probability of the random variable being less than or equal to a specific value. These calculations were used to establish the criteria for each stage of the crisis alert system. AI-based classification models, namely decision tree (DT) and random forest (RF) models, were used to predict water demand at the Gurye intake station. The models took into account the impact of water demand from the previous day, as well as the effects of rainfall, maximum temperature, and average temperature. Daily water demand data from the Gurye intake station and the previous day's rainfall, maximum temperature, and average temperature data from a nearby observatory were collected from 2004 to 2021. The models were trained on data from 2004 to 2015 and validated on data from 2016 to 2021. Model accuracy was evaluated using the F1-score, with the random forest model achieving a score of 0.88, indicating excellent performance.


Introduction
The increasing threat of drought resulting from climate change and abnormal weather has led to growing expectations for measures such as irrigation, embankment of intake stations, multipurpose dams, and water supply dams [1–3] to ensure stable water supply in the Yeongsan and Seomjin river basins. To address long-term water shortages at the national level, the Ministry of Land, Transport, and Maritime Affairs (MLTM) has been established, which predicts demand and supply for all basins nationwide. However, the focus is mainly on national rivers and multipurpose dams, while local and small rivers are relatively underdeveloped [4–7].
Water demand forecasting plays a critical role in determining the production plan for clean water at water treatment plants, the operation plan for water pumps, and the operation plan for reservoirs. Proper utilization of the predicted water demand value can lead to cost reductions in operation, production, and transportation. On the other hand, if high-accuracy water demand prediction is not achieved, excessive water may be transferred from the water treatment plant to the reservoir, leading to inefficient pump operation and excessive power consumption. Furthermore, the water level in the reservoir may not be properly adjusted due to the excessive water supply, which can result in various problems [6,8–11].
In the context of large-scale water supply management, it is important to have accurate water demand forecasts to plan pumping operations and optimize costs. Previous studies have compared the performance of the adaptive neuro-fuzzy inference system (ANFIS) and the auto-regressive (AR) model for water demand prediction, and found that the AR model provides better prediction results. Typically, AR models are used for short-term forecasting, where general trends and periodic patterns on an annual, weekly, and daily basis can be identified. However, the AR model is not suitable for predicting water demand with complex cycle components that are combined in various patterns. This has been highlighted in previous studies [6,12–15].
To address these issues, Tabesh and Dini (2009) [16] proposed a water demand prediction model that considers the influence of external factors such as weather and water level data. They applied an artificial neural network (ANN) model, which is a nonlinear model. Choi et al. (2009) [17] suggested that an AI-based model, specifically a multi-layer perceptron, would be appropriate. Firat et al. (2010) [18] applied generalized regression neural network (GRNN) and cascade correlation neural network (CCNN) isometric neural network models to predict water demand. Improved prediction results have been confirmed by comparing machine learning- and deep learning-based prediction models with AR models [10,19–21].
Since water demand forecasting is essential for optimal water resource management, many studies have applied and developed various methods for accurate forecasting. It is also necessary to study how decision makers can supply water quantitatively in response to demand. Research is needed to support decision makers in determining how sufficient or insufficient the supply will be relative to demand and how much supply there will be in the future.
The current state of drought situation management in Korea can be understood from the Drought Disaster standard manual for crisis management. A crisis alert is issued to detect signs related to a drought crisis or to assess the level of risk when a crisis is expected to occur. The four stages of crisis management are attention (blue) → caution (yellow) → alert (orange) → severe (red). In the case of a drought disaster, there are criteria for each level of crisis alert, and situation management takes these into consideration [3,10,22]. However, the current quantitative standards for drought management are ambiguous, making it difficult for decision makers to make judgments.
The purpose of this study is to minimize drought damage through prompt and efficient response, in line with the goals of the Drought Disaster Crisis Management Basic Direction [3–7,10,22]. A prediction model was developed using long short-term memory (LSTM) and deep neural network (DNN) models to enable decision makers to supply water quantitatively in response to demand. To establish a quantitative standard, the probability density function (PDF) of the water demand data was calculated, along with the probability that the random variable falls within a given interval. Based on this, the cumulative distribution function (CDF) was used to calculate the probability that the random variable is less than or equal to a specific value, and standards were established according to the drought crisis warning stage. Decision tree (DT) and random forest (RF) models were used to roughly estimate the scale of water demand in the near future based on the established criteria. Water demand prediction is essential for optimal water resource management and energy savings. Therefore, we applied machine learning and deep learning to accurately predict water demand. The model proposed in this study can be used to determine the amount of supply by predicting consumers' water demand and to establish optimal operation plans. It can also significantly contribute to reducing power and energy consumption at the national level.

Study Area
In the Seomjin river basin, Gokseong, Gurye, and Gwangyang occupy most of the area, and Suncheon, Hwasun, and Boseong make up some of it. The water quality of the Seomjin river is close to the first-class level at all points, including Gurye, the representative point, as well as Namwon and Hadong. In addition, since water for agricultural use is supplied through a water conveyance tunnel, continuous monitoring of drought and water quality is essential. Gurye intake station is distinctive in being an intermediate point between the Seomjin river dam and the Seomjin river estuary, and it has abundant flow as a confluence point of nearby small rivers. It is also an important facility that can respond to drought and water quality changes and stably supply high-quality water. The intake capacity of Gurye intake station is 11,000 m³/day, the intake volume is 9098 m³/day, the water supply area is Gurye, and the population served is 9881 people (Figure 1).
Water 2023, 15, x FOR PEER REVIEW

Data from 1 January 2004 to 31 December 2021 on variables such as water demand, average temperature, and minimum temperature were used. Table 1 shows the basic statistics of the dependent and independent variable data. Meteorological data and water demand data can be downloaded from the Meteorological Data Open Portal operated by the Korea Meteorological Administration (https://data.kma.go.kr/cmmn/main.do, accessed on 31 December 2021). In this study, an AI-based water demand forecasting model was developed to quantitatively predict the water demand of Gurye intake station. Crisis alert levels (scale) were set using the PDF and CDF, and an AI-based classification model was applied based on the set criteria. The flow of this study is explained in detail as follows (Figure 2).

(1) Water demand data of the Gurye intake station and six types of meteorological data were collected daily from 2004 to 2021 and used as the dependent and independent variables of the AI-based water demand forecasting model.
(2) Data from 2004 to 2015 were used for the learning period and data from 2016 to 2021 were used for the evaluation period. LSTM and DNN models were utilized to develop the predictive model. The predictive accuracy of each model was evaluated using the correlation coefficient (CC) and normalized root mean square error (NRMSE).
(3) Based on the water demand data of the Gurye intake station, a histogram was prepared to determine the frequency distribution. Quantitative risk warning standards were then set using the PDF and CDF.
(4) To develop the water demand classification model, data from 2004 to 2015 were used for the learning period and data from 2016 to 2021 were used for the evaluation period. DT and RF models were utilized to develop the predictive model. The predictive accuracy of each model was evaluated using the F1-score.
(5) A random search method was applied according to new input data without using fixed learning data and parameters, and the K-fold cross-validation method was applied to prevent overfitting.

Long Short-Term Memory
Long short-term memory (LSTM) was developed to remedy the disadvantages of recurrent neural networks (RNNs). It adds input gates ($i_t$), forget gates ($f_t$), and output gates ($o_t$) to the memory cells in the hidden layer, so that unnecessary memories are erased and the network decides what to remember [21,23–25]. These three gates have a sigmoid function in common. The sigmoid function outputs a value between 0 and 1, and each gate is adjusted with these values. In summary, LSTM has a slightly more complex formula for calculating the hidden state than RNNs and adds a value called the cell state. Compared to RNNs, LSTM shows excellent performance in processing long sequences of inputs (Figure 3).
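The gate mechanism can be sketched for a single time step. The following is a minimal illustration of the standard LSTM update with scalar input and state, not the configuration used in this study; the weight values and the names `sigmoid` and `lstm_step` are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step for a scalar input and scalar state.

    w maps each gate name to a hypothetical triple
    (input weight, recurrent weight, bias).
    """
    def pre(name):
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    i_t = sigmoid(pre("i"))          # input gate
    f_t = sigmoid(pre("f"))          # forget gate
    o_t = sigmoid(pre("o"))          # output gate
    g_t = math.tanh(pre("g"))        # candidate cell value
    c_t = f_t * c_prev + i_t * g_t   # cell state: keep part of the old, add part of the new
    h_t = o_t * math.tanh(c_t)       # hidden state
    return h_t, c_t, (i_t, f_t, o_t)

# Illustrative weights only (assumption, not fitted values).
weights = {name: (0.5, 0.1, 0.0) for name in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:
    h, c, gates = lstm_step(x, h, c, weights)
```

Because every gate passes through the sigmoid, its value stays strictly between 0 and 1, which is what lets the forget gate scale down the previous cell state and the input gate scale the new candidate value.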

Deep Neural Network
A deep neural network (DNN) is an artificial neural network (ANN) composed of several hidden layers between an input layer and an output layer. DNNs, like regular ANNs, can model complex non-linear relationships. DNNs have the advantage of being able to model complex data with fewer units (nodes) than ANNs of similar performance [21,25–28]. The DNN is trained using the standard error backpropagation algorithm, and the weights are updated through stochastic gradient descent. Deep neural networks are vulnerable to overfitting because the added layers allow rare dependencies in the training data to be modeled. Dropout is one of the regularization methods introduced to overcome overfitting. In dropout regularization, some units of the hidden layers are randomly omitted during training. This method helps to reduce reliance on rare dependencies that may occur in the training data (Figure 4).
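Dropout can be sketched in a few lines. The example below assumes the common "inverted dropout" variant, in which the surviving activations are rescaled during training so that no change is needed at prediction time; the source does not state which variant or dropout rate was used, and the function name `dropout` is hypothetical.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Randomly zero hidden-unit activations during training (inverted dropout).

    rate is the probability of omitting a unit. Surviving units are divided
    by (1 - rate) so that the expected activation is unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)      # no units are dropped at prediction time
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)                # fixed seed so the sketch is repeatable
hidden = [1.0, 2.0, 3.0, 4.0]         # illustrative activations
dropped = dropout(hidden, 0.5, rng)
```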


Probability Density Function and Cumulative Distribution Function
The law of probability is the basis for the statistical characterization of repeated observations. The probability $P(E_1)$ of a specific event $E_1$ is defined as the relative frequency with which the event occurs over repeated trials [29–31]:

$$P(E_1) = \frac{n_1}{N}$$

where $n_1$ is the frequency of the event $E_1$, $N$ is the number of trials and is a sufficiently large value, and $n_1/N$ is called the relative frequency or probability. Both continuous and discrete random variables are characterized by the probability distribution of the values of each variable. The probability density function (PDF) is a function representing the distribution of a random variable. For the probability density function $f(x)$ and the interval $[a, b]$, the probability $P(a \le X \le b)$ that the random variable $X$ falls within the interval is as follows.

$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$

A cumulative distribution function (CDF) is a function that gives the probability that a given random variable is less than or equal to a certain value. That is, the cumulative distribution function $F(x)$ is the probability that the variable $X$ is not larger than a specific value $x$.

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$

Therefore, $F(x)$ is a function that increases from 0 to 1, and it can be divided into class intervals to indicate the data belonging to each interval.
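For observed data, both quantities can be estimated directly from relative frequencies. The sketch below uses an empirical CDF rather than a fitted distribution, which is an assumption since the source does not describe the fitting procedure; the demand values are illustrative, not the Gurye record, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def empirical_cdf(samples):
    """Return F, where F(x) is the fraction of samples less than or equal to x."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(x):
        return bisect_right(ordered, x) / n

    return F

def interval_probability(samples, a, b):
    """Relative frequency of samples in the closed interval [a, b]."""
    ordered = sorted(samples)
    inside = bisect_right(ordered, b) - bisect_left(ordered, a)
    return inside / len(ordered)

# Illustrative daily demand values in m^3/day (assumption, not observed data).
demand = [5200.0, 5800.0, 6100.0, 6400.0, 6600.0, 6900.0, 7100.0, 7500.0]
F = empirical_cdf(demand)
p_6000_7000 = interval_probability(demand, 6000.0, 7000.0)  # 4 of 8 values -> 0.5
```

Here `F(6400.0)` is 0.5 because four of the eight illustrative values are at or below 6400.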

Decision Tree
A decision tree (DT) is a model that derives rules from the data, expresses them as a tree-like graph, and uses them to subdivide similar data and classify them by category [22,32–35]. DT is based on top-down induction: the data separated at an upper node are subdivided into similar groups according to the splitting criteria, and the subdivision is repeated until the final classification is completed. A decision tree consists of a root node, internal nodes, leaf nodes, and branches.
At every node except the leaf nodes, prediction results are derived by learning which cases satisfy and which do not satisfy the condition given by the classification criterion. Prediction accuracy depends on the degree of pruning applied while the model is learned. The complexity parameter (Cp) determines the tree size at which the error rate is lowest. That is, the accuracy obtained for each parameter value is identified for pruning, and the prediction result is expressed based on the optimal parameter (Figure 5).

Random Forest
Random forest (RF) is an ensemble-based classification model that adds randomness to the basic principle of bootstrap aggregation (bagging), a method that trains multiple decision tree models on bootstrap samples and aggregates their results. Random forest has high accuracy among classification models [22,36–39]. Following the basic principle of bagging, random forest randomly extracts learning data, independently constructs a decision tree on each sample, and generates a total of n trees. When each tree is built, the candidate variables considered at a split are chosen at random, and their number is defined by the parameter mtry. In the learning process, the model is repeatedly trained to select the optimal parameters and derive the best prediction results (Figure 6).
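The three random-forest ingredients (bootstrap sampling, random selection of candidate variables, and aggregation by voting) can be sketched separately. This is an illustration under stated assumptions, not the implementation used in the study: `mtry` is taken here to mean the number of variables sampled as split candidates, following the convention of common random forest libraries, and all function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the 'bootstrap' in bagging)."""
    return [rows[rng.randrange(len(rows))] for _ in rows]

def sample_features(n_features, mtry, rng):
    """Pick mtry distinct feature indices as candidates for one split."""
    return sorted(rng.sample(range(n_features), mtry))

def majority_vote(predictions):
    """Aggregate the class labels predicted by the individual trees."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)                     # fixed seed for repeatability
rows = list(range(10))                      # stand-in for training records
sample = bootstrap_sample(rows, rng)        # some rows repeat, some are left out
candidates = sample_features(4, 2, rng)     # 4 predictors, mtry = 2 (illustrative)
label = majority_vote([2, 3, 3, 1, 3])      # class 3 receives the most votes
```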


Evaluating the Predictive Power of the Model
Correlation analysis, which indicates the correlation between the observed data being measured at the station and the data predicted through the prediction model, is a method designed to quantitatively identify the relationship between two variables [21,25,40].

$$CC = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

where, in order to calculate the correlation, the deviations of $x$ and $y$ from their means, that is, $x_i - \bar{x}$ and $y_i - \bar{y}$, are first calculated for each $x_i$ and $y_i$. The sum of squared errors divided by $n$ is the mean square error (MSE), and the square root of the MSE is the root mean square error (RMSE). The normalized root mean square error (NRMSE) is a standardized form of the RMSE [21,25,40,41].

$$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, \qquad RMSE = \sqrt{MSE}$$

where $y_i$ is the $i$-th actual value and $\hat{y}_i$ is the $i$-th simulated value. The accuracy of the classification model is verified based on the confusion matrix (Table 2). In the confusion matrix, a true positive (TP) is a case where the observed value is 1 and the model result is 1, a false negative (FN) is a case where the observed value is 1 and the model result is 0, a false positive (FP) is a case where the observed value is 0 and the model result is 1, and a true negative (TN) is a case where the observed value is 0 and the model result is 0 [6,15,39].
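The evaluation measures can be sketched as follows. Two points are assumptions rather than statements from the source: the NRMSE is normalized here by the mean of the observations and expressed as a percentage (other conventions divide by the observed range), and the F1-score is computed with its standard definition as the harmonic mean of precision and recall. The function names are hypothetical.

```python
import math

def correlation_coefficient(obs, sim):
    """Pearson correlation between observed and simulated series."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def nrmse_percent(obs, sim):
    """RMSE divided by the observed mean, in percent (assumed normalization)."""
    n = len(obs)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / n)
    return 100.0 * rmse / (sum(obs) / n)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

cc = correlation_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # perfectly linear pair
err = nrmse_percent([100.0, 100.0], [110.0, 90.0])              # RMSE 10 on mean 100
f1 = f1_score(tp=8, fp=2, fn=2)                                 # precision = recall = 0.8
```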

Development of Water Demand Prediction Model Using DNN and LSTM Models
To effectively perform DNN model learning, we determined the optimal combination of learning rate, hidden layers, hidden nodes, optimizer, and activation for the prediction model. Similarly, to effectively perform LSTM model learning, we determined the optimal combination of activation, learning rate, epochs, optimizer, and loss for the prediction model. Additionally, by applying K-fold cross-validation to the dataset during the learning period, we observed an improvement in accuracy for a small dataset (Table 3). Figure 7 displays the water demand forecasted by the DNN and LSTM models, as well as the actual observed data at the Gurye intake station. The DNN model predicts the overall tendency of water demand well. However, its accuracy in predicting peak values, which are important in prediction, is low. The LSTM model, on the other hand, performs better than the DNN model. It not only captures the overall amount and variability of water demand well but also accurately predicts the time when the peak value occurs. In forecasting, it is important to predict the overall amount, but it is also important to determine how accurately the highest and lowest points and the peak values for each section are predicted.

Python-based software (version 3.12) was used, and processing was completed within 30 min. A well-designed AI infrastructure leverages high-performance computing capabilities, such as GPUs or TPUs, to perform complex calculations in parallel. This allows machine learning algorithms to process enormous datasets swiftly, leading to faster model training and inference.

Setting of Crisis Alert Standards
The types of crises considered in managing drought situations include crop damage, a reduction in river maintenance flow, and groundwater depletion due to the shortage of domestic, agricultural, and industrial water. The development of these crises occurs in stages. The first stage is drought caused by a lack of precipitation due to climate change. The second stage is a shortage of domestic, agricultural, and industrial water, along with crop damage in some areas. The third stage involves the expansion of shortages in domestic, agricultural, and industrial water, as well as crop damage, to large-scale areas.
According to drought forecasting and warning standards, the criteria for attention (blue), caution (yellow), alert (orange), and serious (red) are as follows. The attention stage is reached when the water level of the river and water resource facility is lower than normal, necessitating preparation for drought in terms of domestic and industrial water. The caution stage is reached when the river maintenance flow is insufficient, or the dam (reservoir) needs to restrict the supply of water for river maintenance. In the alert stage, it becomes necessary to limit water supply due to the occurrence or anticipation of a partial shortage of domestic and industrial water. The serious stage is reached when the shortage of domestic and industrial water has expanded, and supply restrictions have occurred or are necessary in rivers and dams (reservoirs). Regarding domestic and industrial water, no quantitative risk warning standard has been set, so restrictions on actions at each stage are based on qualitative judgment.
To establish a quantitative standard, a histogram was created to analyze the distribution of water demand data from the Gurye intake station between 2004 and 2021. The frequency of water demand at the Gurye intake station rose in the 5000 to 6000 m³/day interval, and demand was most frequently between 6000 and 7000 m³/day (Table 5, Figure 8). According to the Guidelines for Comprehensive Water Demand Management Plan, the demand for water for living, industrial, and agricultural use is calculated based on 70% to 80% of the maximum daily water supply, which is determined by facility standards. The permitted amount for the Gurye intake station is 9098 m³/day, which means 70% of the maximum daily water supply is 6368.6 m³/day, and 80% is 7278.4 m³/day.
In this study, the crisis warning standards for the Gurye intake station were established by referring to the drought forecasting and warning standards, the Guidelines for Comprehensive Water Demand Management Plan, and previous studies. Taking into account the permitted amount of the Gurye intake station and the maximum water supply per day, the standard for the serious level was set based on the maximum permitted amount of the Gurye intake station, with a standard value of 9098.0 m³/day. The standard for the alert stage was set at 75% of the daily maximum water supply, with a standard value of 6823.5 m³/day. The caution level was based on 50%, with a standard value of 4549.0 m³/day, and the attention level was based on 25%, with a standard value of 2274.5 m³/day. The crisis alert standards established in this study are presented in Table 6 below.
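The standard values follow directly from the permitted amount, as the sketch below shows. The percentages and the 9098 m³/day figure come from the text; the rule that assigns a demand value to the highest stage whose standard value it has reached is an assumption made for illustration, since Table 6 is not reproduced here, and the names `alert_thresholds` and `classify` are hypothetical.

```python
PERMITTED_INTAKE = 9098.0  # m^3/day, permitted amount for the Gurye intake station

# Stage name and fraction of the permitted amount, lowest stage first.
STAGES = [("attention", 0.25), ("caution", 0.50), ("alert", 0.75), ("serious", 1.00)]

def alert_thresholds(permitted=PERMITTED_INTAKE):
    """Standard value of each stage in m^3/day."""
    return {name: fraction * permitted for name, fraction in STAGES}

def classify(demand, permitted=PERMITTED_INTAKE):
    """Highest stage whose standard value is reached, or None below 'attention'.

    ASSUMPTION: this binning rule is illustrative, not taken from the source.
    """
    stage = None
    for name, threshold in alert_thresholds(permitted).items():
        if demand >= threshold:
            stage = name
    return stage

thresholds = alert_thresholds()
# {'attention': 2274.5, 'caution': 4549.0, 'alert': 6823.5, 'serious': 9098.0}
stage_for_7000 = classify(7000.0)  # 'alert' under the assumed rule
```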

Development of Water Demand Class Interval Classification Prediction Model
Classification is the process of predicting the dependent variable (class interval) that has the highest correlation with the independent variable. It is a method used to identify the class interval to which the data on water demand samples belong. Classification models can be divided into two categories. The first is the discriminant function model, which determines decision boundaries that divide data into different areas according to class intervals and calculates which intervals are distributed from these decision boundaries. The second is the stochastic model, which calculates the probability of distribution in the class interval for the input data. In this study, the DT and RF models were used to determine the scale of water demand in the near future.
The water demand at the Gurye intake station is affected by the water demand from the previous day, as well as the rainfall, maximum temperature, and average temperature of the surrounding rainfall stations. Taking these factors into account, we collected daily data on the observed water demand at the Gurye intake station and the rainfall, maximum temperature, and average temperature from the previous day at the rainfall stations from 2004 to 2021.
During the learning period, we applied K-fold cross-validation to 4383 data points from 2004 to 2015 and performed model learning and evaluation. Additionally, we used 2192 data points from 2016 to 2021 as the verification interval to assess the accuracy of the model. The DT model learns using one parameter, cp, which controls tree pruning. Tree pruning is a process that reduces overfitting of the DT and increases its generalizability. The results of the DT model according to the parameters are shown in Table 7, and we developed the model by selecting the optimal parameters. Tables 8 and 9 show the results of the water demand class interval model during the learning period using the DT model. The parameters of the DT model were optimized based on the input data, and we evaluated the predictive performance of the model. Tables 8 and 9 also show the results of the evaluation period for the water demand class interval classification model using the DT model (Figure 9). When examining the confusion matrix, we observed that Class 1, Class 2, Class 3, and Class 4 had low predictive power. Overall, the predictive power for all classes was found to be low, with an F1-score of 0.43 (Table 10).
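The input structure described above, where each day's demand is paired with the previous day's values, can be sketched as a simple lag construction. The function name `build_samples` and the numbers are hypothetical and only illustrate the pairing.

```python
def build_samples(demand, rainfall, tmax, tavg):
    """Pair each day's demand (target) with the previous day's predictors.

    All four lists are daily series of equal length in date order. The first
    day has no previous day, so it yields no sample.
    """
    samples = []
    for t in range(1, len(demand)):
        features = (demand[t - 1], rainfall[t - 1], tmax[t - 1], tavg[t - 1])
        samples.append((features, demand[t]))
    return samples

# Illustrative three-day record (assumption, not observed data).
demand = [6100.0, 6300.0, 6250.0]   # m^3/day
rainfall = [0.0, 12.5, 3.0]         # mm
tmax = [29.1, 26.4, 27.8]           # deg C
tavg = [24.0, 22.3, 23.1]           # deg C
samples = build_samples(demand, rainfall, tmax, tavg)
```

Three days of data give two samples; the first pairs the day-one predictors with the day-two demand.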
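The sample counts quoted above match the number of calendar days in each period, and the learning period can then be divided into folds. In the sketch below, the number of folds (k = 5) and the use of contiguous, unshuffled folds are assumptions, since the source does not state either; the function names are hypothetical.

```python
from datetime import date

def n_days(start, end):
    """Number of calendar days from start to end, both inclusive."""
    return (end - start).days + 1

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds.

    Returns a list of (train_indices, validation_indices) pairs; each index
    appears in exactly one validation fold.
    """
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, validation))
        start += size
    return folds

n_train = n_days(date(2004, 1, 1), date(2015, 12, 31))  # 4383
n_test = n_days(date(2016, 1, 1), date(2021, 12, 31))   # 2192
folds = k_fold_indices(n_train, k=5)                    # k = 5 is illustrative
```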
Table 10.An evaluation of the applicability of a flood damage classification prediction model using decision tree model.Predicting the scale of water demand in the near future is crucial for permanently expanding water supply and resolving demand management issues to reduce uncertainty.This requires approaching crisis management and implementing adaptive drought measures.At the risk management level, water demand management involves taking appropriate measures to prevent, prepare for, and respond to drought based on an understanding of drought forecasting and warning standards.Early warning to identify risks is crucial, and improving forecasting and warning capabilities should be a priority.
Drought forecasting and warning standards are currently established for weather drought (Meteorological Administration), water for living and industrial use of Environment), and water for agriculture (Ministry of Agriculture, Food, and Livestock).Because weather forecasting and warning standards for drought and agricultural water are presented quantitatively, implementing step-by-step national action plans based on this information can be carried out with little uncertainty.However, the standard for and warning drought for living and industrial water has not been presented quantitatively, leading to relatively high uncertainty in implementing step-by-step national action guidelines based on qualitative judgment.
Adaptive water demand management involves deriving near-future information through an iterative learning process in the presence of uncertainty. In this study, supply measures and demand management policies are approached at the level of adaptive management, and information is provided in advance so that management measures can be carried out quickly and accurately on the basis of the best available information. This information should be used to secure flexibility and promptness in forming and implementing policies.

Conclusions
Drought has serious social and environmental impacts, but it is widespread and develops gradually. Accordingly, in order to prepare for and respond in advance to damage caused by drought, it is necessary to establish non-structural measures that can be applied flexibly according to the drought stage [1][2][3]. Therefore, in this study, an AI model was applied to predict water demand in real time. Model accuracy was evaluated on the verification data, with the LSTM model achieving a CC of 0.95 and an NRMSE of 8.38, indicating excellent performance [3][4][5][6][7]10,22].
Standards for each stage of crisis warning were then set. AI-based classification models, namely the DT and RF models, were used to identify the scale of water demand based on the established standards. The water demand at the Gurye intake station is influenced by the water demand of the previous day, and the accuracy of the models was evaluated with this influence taken into account. The F1-score of the RF model was 0.81, showing excellent performance.
Because water demand prediction is essential for optimal water resource management, many studies have developed and applied various methods for accurate prediction. However, beyond predicting the strategic value of water demand, research is needed to determine how far water demand is insufficient or sufficient in the short term.
Adaptive water demand management derives information about the near future through an iterative learning process in situations where uncertainty exists. This study focuses on supply and demand management policies and approaches them from an adaptive management perspective, providing advance information so that management measures can be carried out quickly and accurately on the basis of the best available information. This information should be used to ensure flexibility and speed in policy formulation and implementation.

Figure 2. Conceptual diagram of a study on developing an AI-based water demand prediction and classification model for Gurye intake station.

Water 2023, 15, x FOR PEER REVIEW

Figure 7. Observed water demand at Gurye intake station and water demand predicted by the DNN and LSTM models. Red triangles represent the DNN model. Green X marks represent the LSTM model.

Figure 8. Histogram of water intake data.

Figure 9. Comparison of prediction results and observation results using DT.

Table 1 shows the basic statistics of the dependent and independent variable data. Meteorological data and water demand data can be downloaded from the Meteorological Data Open Portal operated by the Korea Meteorological Administration (https://data.kma.go.kr/cmmn/m accessed on 31 December 2021).

Table 1. Basic statistics for the dependent and independent variables.

Table 2. The structure of the confusion matrix.

Based on the confusion matrix, the accuracy, error rate, sensitivity, precision, and specificity can be calculated. The F-score is calculated from precision and sensitivity as Fβ = (1 + β²) × precision × sensitivity / (β² × precision + sensitivity), and β is generally set to 1, which gives the F1-score.
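To make the calculation concrete, the sketch below derives precision, sensitivity, and the F-score for each class of a square confusion matrix and then averages them. The matrix layout (rows are observed classes, columns are predicted classes), the macro averaging across classes, the function names, and the 2 x 2 example counts are all assumptions for illustration and are not taken from the study.

```python
def per_class_metrics(cm, beta=1.0):
    """Return {class: (precision, sensitivity, f_beta)} for a square matrix.

    Assumed layout: cm[i][j] counts samples observed in class i
    and predicted as class j.
    """
    n = len(cm)
    out = {}
    for c in range(n):
        tp = cm[c][c]
        predicted_c = sum(cm[i][c] for i in range(n))  # column total
        observed_c = sum(cm[c])                        # row total
        precision = tp / predicted_c if predicted_c else 0.0
        sensitivity = tp / observed_c if observed_c else 0.0
        denom = beta ** 2 * precision + sensitivity
        f_beta = (1 + beta ** 2) * precision * sensitivity / denom if denom else 0.0
        out[c] = (precision, sensitivity, f_beta)
    return out


def macro_f(cm, beta=1.0):
    """Unweighted mean of the per-class F-scores (macro average, assumed)."""
    metrics = per_class_metrics(cm, beta)
    return sum(f for _, _, f in metrics.values()) / len(metrics)


def accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Made-up counts, for illustration only.
example_cm = [[8, 2],
              [4, 6]]
print(per_class_metrics(example_cm))
print(macro_f(example_cm), accuracy(example_cm))
```

With β = 1 the score weights precision and sensitivity equally; a class that is never predicted gets a score of 0 here rather than raising a division error, which is one common convention.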

Table 3. Settings of hyper-parameters in DNN and settings of parameters in LSTM.

Table 4 shows the results of the evaluation of predictive power in the evaluation period from 2016 to 2021. The CC of the DNN model is 0.89 and the NRMSE is 12.42. The CC of the LSTM model is 0.95 and the NRMSE is 8.38.
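For readers who want to reproduce these two scores on their own series, a minimal Python sketch follows. CC is taken here as the Pearson correlation coefficient, and NRMSE as the RMSE divided by the range of the observations and expressed as a percentage; the normalization is an assumption, since NRMSE can also be normalized by the mean. The function names and the sample values are hypothetical.

```python
import math


def correlation_coefficient(obs, pred):
    """Pearson correlation between observed and predicted series."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def nrmse_percent(obs, pred):
    """RMSE divided by the observed range, in percent (assumed normalization)."""
    n = len(obs)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)
    return 100.0 * rmse / (max(obs) - min(obs))


# Made-up daily values, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 12.0, 13.0, 17.0, 18.0]

cc = correlation_coefficient(observed, predicted)
nrmse = nrmse_percent(observed, predicted)
print(round(cc, 3), round(nrmse, 2))
```

A CC closer to 1 and an NRMSE closer to 0 indicate a better fit, which is the sense in which the LSTM scores above compare favorably with the DNN scores.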

Table 4. Evaluation of predictive power using CC and NRMSE.

Table 5. Frequency and accumulation of water demand data.

Table 6. Setting the crisis alert standards at Gurye intake station.

Table 7. Derivation of parameters for a decision tree model.

Table 8. Water demand classification prediction model performance evaluation using decision tree (learning section).

Table 9. Water demand classification prediction model performance evaluation using decision tree (evaluation section).
