ARIMA-M: A New Model for Daily Water Consumption Prediction Based on the Autoregressive Integrated Moving Average Model and the Markov Chain Error Correction

Water resources are a significant factor in regional environmental and social development. Water consumption prediction provides an important decision basis for optimizing regional water supply scheduling. Given the periodicity and randomness of daily water consumption data, a Markov-modified autoregressive integrated moving average (ARIMA) model is proposed in this study. By combining ARIMA with a Markov chain, the proposed model corrects the prediction error, limits its continuous accumulation, and improves the prediction accuracy of future daily water consumption. Daily water consumption data from different monitoring points were used to verify the effectiveness of the model, and future water consumption in the study area was predicted. The results show that the proposed algorithm effectively reduces the prediction error compared with the ARIMA model alone.


Introduction
Water resources are a key factor for regional sustainable development in both developing and developed countries. With urbanization and rising living standards, the demand for water supply is increasing, and the shortage of water resources is becoming more serious. Water scarcity crises occur in many parts of the world. As the scope and scale of urban water supply systems expand, the complexity of water supply has increased significantly. Decision-making for water supply is based only on experience and judgment of the current water demand, which makes water supply difficult to predict and leads to excessive supply. In addition, excessive water supply increases the pressure on the water supply network, which raises the risk of leakage and pipe bursts. Therefore, the analysis of urban water supply and demand is of great significance for predicting urban water demand. Firstly, by quota analysis of water consumption in different regions, the allocation and management of water resources by the water administration department can be optimized. Effective forecasting of water consumption helps improve the emergency response ability of water resource management and provides technical support for the assessment and management of water resource conservation. Secondly, water consumption forecasting can improve the management and service quality of water supply enterprises. The water supply demand forecast can be used to ensure the demand of water supply and water pressure during various periods to improve Yellow River. Guarnaccia et al. [20] predicted the short-term tank water level in an urban water distribution network.
Artificial neural networks have a strong nonlinear approximation ability and can be used in data prediction and other fields [21]. Bennett et al. [22] built an urban water consumption prediction model based on an artificial neural network (ANN), using demographic, socio-economic, and water appliance stock information as inputs to predict future water consumption. Mouatadid and Adamowski [23] proposed a water consumption prediction method based on the extreme learning machine neural network. Adebiyi [24] compared the performance of ARIMA and a neural network model on stock price prediction. The results showed that the ARIMA-based approach captures the overall trend better, whereas the ANN-based approach fits the prediction details well. Similarly, Sebri [25] compared the Box-Jenkins ARIMA model and an ANN model for water consumption prediction in Tunisia, and the results indicated that the traditional Box-Jenkins method outperformed ANNs estimated on raw, degraded, or deseasonalized data in terms of forecasting accuracy. Thus, it is difficult for an ANN to capture the seasonal and periodic characteristics of water consumption data, and its strong nonlinear approximation ability makes it prone to overfitting on limited datasets [26], which reduces the prediction accuracy. Therefore, the ARIMA model is worth further study for predicting water consumption data.
However, owing to the randomness and volatility of water consumption data, the ARIMA model inevitably produces large errors when predicting nonlinear, non-stationary time series with trends and periodicity. In addition, data acquisition is tedious and involves many links, such as acquisition, transmission, storage, and exchange, so the integrity of the obtained data cannot be guaranteed, which greatly limits the accuracy of ARIMA model prediction.
To bridge the gap in the data modelling, this study presents a water consumption prediction model, combining the ARIMA and Markov model. On the basis of data analysis and pre-processing, the water consumption prediction was carried out on the basis of the ARIMA model. Aiming at the prediction error, this study proposes a prediction value correction method that is based on Markov chain.

Water Data Pre-Processing
The data are uploaded by sensors at the regional monitoring points and gathered on the data processing server to form a dataset covering a certain period of time. However, owing to failures of data collection points, noise, and other factors, the data often contain missing values or abnormally large or small values, which greatly affects the effectiveness of data processing. Therefore, abnormal data must be identified and processed before further analysis.
In the collected water consumption data, the identifiable abnormal features include missing or zero values, abrupt drops to zero, and abnormally large or small jumps. Zero values and missing values can be tested and identified directly. The 3δ criterion (i.e., the pauta criterion) is used to judge whether a value is abnormally large or small. Assuming that the sample data approximately obey the normal distribution and contain random errors, an error region is determined according to probability; an error beyond this region is considered a gross error, and the corresponding data point is regarded as an abnormal value. If δ is the standard deviation and µ is the mean value, the probability that a data point falls in (µ − 3δ, µ + 3δ) is 0.9973, and data beyond this range are abnormal value points. Here δ and µ are calculated from the dataset after eliminating the zero values and missing values. After the abnormal values are identified, they need to be restored to the normal range. The mean filling method is used: the mean value of the dataset is calculated and substituted for the outliers identified above, namely the zero values, missing values, abnormally large values, and abnormally small values.
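The detection and filling steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `clean_series`, the use of `None` to mark missing readings, and the choice to fill with the mean of the retained (non-flagged) points are assumptions made for the example.

```python
from statistics import mean, pstdev


def clean_series(values, k=3.0):
    """Flag zero, missing (None), and out-of-range readings, then mean-fill them.

    Hypothetical helper: names and the exact fill value are assumptions.
    """
    # Zero and missing values are excluded before estimating mu and delta.
    valid = [v for v in values if v is not None and v != 0]
    mu, delta = mean(valid), pstdev(valid)
    low, high = mu - k * delta, mu + k * delta

    # A reading is abnormal if it is missing, zero, or outside (mu - 3*delta, mu + 3*delta).
    flags = [v is None or v == 0 or not (low < v < high) for v in values]

    # Assumption: fill with the mean of the readings that were not flagged.
    fill = mean(v for v, bad in zip(values, flags) if not bad)
    cleaned = [fill if bad else v for v, bad in zip(values, flags)]
    return cleaned, flags


if __name__ == "__main__":
    readings = [100.0] * 20 + [1000.0, 0.0, None]
    cleaned, flags = clean_series(readings)
    print(sum(flags), "readings replaced; fill value:", cleaned[-1])
```

Note that with very short series a single extreme value inflates δ enough to hide itself, so the criterion is only meaningful when the sample is reasonably large.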
Even after abnormal value detection and processing, the water consumption monitoring process inevitably produces errors and noise. Using noisy data for water consumption prediction greatly degrades the prediction, so further processing is required to remove the noise.
Empirical mode decomposition (EMD) is a time-frequency analysis method that decomposes time-series data into multiple intrinsic mode function (IMF) components, where each component represents a certain local feature of the data. EMD has been widely used in signal de-noising, fault diagnosis, image processing, and other areas. However, EMD is prone to mode aliasing, in which features of different time scales appear in the same IMF, which hampers effective data processing [27,28]. Wu and Huang proposed the ensemble empirical mode decomposition (EEMD) method. During the decomposition process, white noise is introduced according to a certain signal-to-noise ratio, and the influence of the white noise is reduced through ensemble averaging, which gives the method its anti-aliasing advantage [29]. In this study, the EEMD method is used to remove the noise in the historical water consumption data. The outlier-processed water consumption data are decomposed by EEMD into n components, comprising n − 1 IMF components and one residual term $r_n$. The components are arranged by frequency from high to low; the highest-frequency component is then removed and the remaining components are summed to obtain the de-noised data.
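The final reconstruction step can be illustrated with a short sketch. It assumes the decomposition has already been produced by some EEMD routine (not shown here) and is available as a list of equal-length components ordered from highest to lowest frequency, with the residual last; the function name `denoise_from_components` is hypothetical.

```python
def denoise_from_components(components):
    """Sum all components except the highest-frequency one.

    `components` is assumed to be ordered from highest to lowest frequency,
    with the residual term as the last entry.
    """
    if len(components) < 2:
        raise ValueError("need at least one IMF and one residual")
    kept = components[1:]  # drop the highest-frequency IMF (treated as noise)
    return [sum(samples) for samples in zip(*kept)]


if __name__ == "__main__":
    imf1 = [1.0, -1.0, 1.0, -1.0]      # fastest oscillation, discarded
    imf2 = [0.5, 0.5, -0.5, -0.5]      # slower oscillation, kept
    residual = [10.0, 10.0, 10.0, 10.0]
    print(denoise_from_components([imf1, imf2, residual]))
```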

Prediction of Water Consumption Based on Markov Chain Modification
The daily water consumption data is nonlinear and uncertain, and interrelated to time. The daily water consumption data prediction is a time-series prediction problem. In this study, the ARIMA model was established for daily water consumption data. Furthermore, a modified Markov chain model was proposed to forecast the daily water consumption, which can reduce the error caused by the randomness nature of the water consumption data.

Prediction Model Based on ARIMA
The ARIMA model is widely used to forecast non-stationary time series data and can be used to forecast the trend of daily water consumption. In an ARIMA(p, d, q) model, AR denotes autoregression, p is the number of autoregressive terms, MA denotes the moving average, q is the number of moving average terms, and d is the number of differencing operations needed to make the data a stationary series. Firstly, the non-stationary historical data $x_t$ are differenced d times to obtain the stationary series $y_t$, which is fitted with an ARMA(p, q) model to predict consumption; the prediction on the original scale of $x_t$ is then recovered by d-fold inverse differencing. The ARMA model is expressed as follows:

$$y_t = \varphi_1 y_{t-1} + \cdots + \varphi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q} \quad (1)$$

where $\varphi_1, \cdots, \varphi_p$ and $\theta_1, \cdots, \theta_q$ are constants and $\varepsilon_t$ is a white noise sequence. The time series $y_t$ then follows the autoregressive moving average model of order (p, q), recorded as ARMA(p, q). When the original data sequence is non-stationary, the data are first differenced d times to obtain a stationary sequence; subsequently, the corresponding ARMA model is established for the stationary time series. The autocorrelation function (ACF) and the partial autocorrelation function (PACF) are analyzed. If the PACF is truncated at order p and the ACF is tailed, the AR(p) model can be established. If the PACF is tailed and the ACF is truncated at order q, the MA(q) model can be established. If both the PACF and the ACF are tailed, the ARMA model is established. Subsequently, the ARIMA(p, d, q) model is established for the time series after d-order differencing. Because the judgment of tailing and truncation is somewhat subjective, the model order can be determined according to the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), from which the parameters p and q of the model are obtained.
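The differencing step and its inverse, which bracket the ARMA fit, can be sketched as below. The helper names `difference` and `undifference` are hypothetical; the sketch only shows that keeping the leading value of each differencing pass is enough to restore the original scale.

```python
def difference(x, d=1):
    """Apply d rounds of first-order differencing; also return the leading values."""
    heads = []
    y = list(x)
    for _ in range(d):
        heads.append(y[0])  # needed later to undo this round
        y = [b - a for a, b in zip(y, y[1:])]
    return y, heads


def undifference(y, heads):
    """Invert `difference` by cumulative summation, last round first."""
    x = list(y)
    for head in reversed(heads):
        restored = [head]
        for step in x:
            restored.append(restored[-1] + step)
        x = restored
    return x


if __name__ == "__main__":
    series = [5, 7, 10, 14, 19]
    diffed, heads = difference(series, d=2)
    print(diffed, heads)
    print(undifference(diffed, heads))
```

In a forecasting setting the same cumulative summation is applied to the predicted differences, starting from the last observed values instead of the first.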
The regression coefficients, moving average coefficients, and white noise variance of the ARIMA(p, d, q) model are estimated by the least squares method and the moment estimation method, yielding the parameter estimates $\hat{\varphi}_1, \cdots, \hat{\varphi}_p$ and $\hat{\theta}_1, \cdots, \hat{\theta}_q$. Afterwards, a hypothesis test is carried out to determine whether the residual sequence is a white noise sequence. If the residual sequence is white noise, the model is considered valid, and a model that passes the test can be used for prediction. Table 1 shows the flow of the ARIMA-based prediction model; following the algorithm in Table 1, future data can be predicted. Forecasting future data: the valid ARIMA(p, d, q) model is used to predict the data for the next few days.

Markov Chain Theory
A Markov chain is a stochastic process with discrete time and discrete states. A Markov chain sequence has several different states, and the state at the next time step is determined by a random transition probability matrix [30]. From the initial probability and the transition probabilities of each state, the Markov chain predicts the trend of each state. The probability of the future state depends only on the current state and not on the states before it; that is, the process has no aftereffect.
A Markov model can be represented by the triple {S, π, P}, in which S is the state space of the random process, a finite set of states; π is the probability vector of the initial state; and P is the probability transfer matrix. The probability transfer matrix can be obtained by frequency estimation, or by minimizing the sum of squared errors between the probability vector of the current state and that of the theoretical state. Setting the state values of the random process as $S = \{S_1, S_2, \cdots, S_n\}$, the transfer characteristic of the Markov chain is determined by the conditional probability $P(X_{m+k} = S_j \mid X_m = S_i)$, that is, the probability that the variable X is in state $S_j$ after k steps, given that it is in state $S_i$ at time m.
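Whether a data series can be predicted by a Markov model is checked with a $\chi^2$ test. Let $f_{ij}$ be the number of transitions from state i to state j, and $P_{ij}$ be the probability of a transition from state i to state j. The statistic $\chi^2$ is expressed as Equation (2):

$$\chi^2 = 2 \sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij} \left| \ln \frac{P_{ij}}{P_{\bullet j}} \right| \quad (2)$$

where $P_{\bullet j}$ is the marginal probability of state j, which satisfies Equation (3):

$$P_{\bullet j} = \frac{\sum_{i=1}^{m} f_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij}} \quad (3)$$

If the data sequence satisfies $\chi^2 > \chi^2_{\alpha}\left((m-1)^2\right)$, where m is the number of states, then the Markov model can be used to predict the future trend of the data.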
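As an illustration of this test, the sketch below computes the statistic from a matrix of transition counts, assuming the commonly used form $\chi^2 = 2\sum_i \sum_j f_{ij}\,|\ln(P_{ij}/P_{\bullet j})|$ with $P_{ij} = f_{ij}/\sum_j f_{ij}$. The function name `markov_chi2` is hypothetical, and cells with zero count are skipped, which is an implementation choice of this example.

```python
import math


def markov_chi2(counts):
    """Chi-square statistic for the Markov property from transition counts.

    counts[i][j] is the number of observed transitions from state i to state j.
    """
    m = len(counts)
    total = sum(sum(row) for row in counts)
    # Marginal probability of arriving in state j.
    marginal = [sum(counts[i][j] for i in range(m)) / total for j in range(m)]

    stat = 0.0
    for i in range(m):
        row_total = sum(counts[i])
        for j in range(m):
            if counts[i][j] > 0:  # zero-count cells contribute nothing
                p_ij = counts[i][j] / row_total
                stat += 2 * counts[i][j] * abs(math.log(p_ij / marginal[j]))
    return stat


if __name__ == "__main__":
    print(markov_chi2([[9, 1], [1, 9]]))   # strongly state-dependent transitions
    print(markov_chi2([[5, 5], [5, 5]]))   # next state independent of current state
```

The result is compared with the critical value $\chi^2_{\alpha}((m-1)^2)$ taken from a table; the Python standard library does not provide chi-square quantiles.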
If the one-step transition probability of the Markov chain from state $S_i$ to $S_j$ is $P_{ij}$, then the one-step state transition probability matrix can be expressed as Equation (4):

$$P^{(1)} = \begin{pmatrix} P_{11} & \cdots & P_{1n} \\ \vdots & \ddots & \vdots \\ P_{n1} & \cdots & P_{nn} \end{pmatrix} \quad (4)$$

Water 2020, 12, 760

If the random process is in the i-th state at the current time, and the number of times it transfers to the j-th state at the next time is $f_{ij}$, then $f_i = \sum_{j} f_{ij}$. Using the frequency estimation method, the probability $P_{ij}$ of a transition from state i to state j can be calculated by Equation (5):

$$P_{ij} = \frac{f_{ij}}{f_i} \quad (5)$$
Let $\pi_0$ denote the initial vector of the stochastic process at time t, and let the parameters $p_1, p_2, \cdots, p_n$ denote the probability of each state at that time. Then the initial state vector is expressed as $\pi_0 = (p_1, p_2, \cdots, p_n)$, and the probability vector of the random process at t = m is $\pi_m = \pi_0 P^m$. When the value of m is large enough, the probability vector tends to a stable value, which is expressed as $Y = \pi_m \times S_i$. According to the characteristics of the Markov process, the future state of the stochastic process can be predicted from its historical state. The predicted value $D_{t+1}$ is expressed as Formula (6), which is the inner product of the state vector $X_{t+1}$ and the average value of each state:

$$D_{t+1} = \sum_{i=1}^{N} x_{t+1,i} E_i \quad (6)$$

where $E_i$ is the mean value of state i and $X_{t+1} = (x_{t+1,1}, x_{t+1,2}, \cdots, x_{t+1,i}, \cdots, x_{t+1,N})$. If the process is in state i, then $x_{t+1,i}$ is set to 1 and every other $x_{t+1,j}$ is set to zero, where j is any state other than i.
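The frequency estimate of the transition matrix, the one-step update of the state vector, and the prediction as an inner product with the state means can be sketched together. The function names and the uniform row used for a state that is never left in the training data are assumptions of this example, not part of the original method.

```python
def transition_matrix(states, n_states):
    """Estimate P_ij = f_ij / f_i from a sequence of integer state labels."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        row_total = sum(row)
        if row_total:
            matrix.append([c / row_total for c in row])
        else:
            # Assumption: a state with no outgoing transitions gets a uniform row.
            matrix.append([1.0 / n_states] * n_states)
    return matrix


def step(state_vector, matrix):
    """One-step update: X_{t+1} = X_t P."""
    n = len(matrix)
    return [sum(state_vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def predict(state_vector, state_means):
    """Predicted value as the inner product of state probabilities and state means."""
    return sum(p * e for p, e in zip(state_vector, state_means))


if __name__ == "__main__":
    labels = [0, 1, 0, 1, 0, 1]
    P = transition_matrix(labels, 2)
    x_next = step([1.0, 0.0], P)        # currently in state 0
    print(P, x_next, predict(x_next, [10.0, 20.0]))
```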

Modifying ARIMA Water Consumption Forecast Based on Markov Chain
A Markov chain can be used to predict the trend of the data, and its predicted value Y on the test dataset can be used to modify the ARIMA prediction and improve the accuracy of water consumption prediction. In this study, the future trend value of water consumption was predicted first; subsequently, the water consumption data obtained from the prediction model were adjusted proportionally by an error value to give the corrected water consumption data.
Let the data prediction series in the continuous time range be expressed as $D_r = [D_1, \cdots, D_R]$, and divide the data series $D_r$ into N states, $D_1, D_2, \cdots, D_N$. Because of the randomness of the water consumption data, the distribution law of the data is unclear. To divide the data sequence into several states without assuming a particular distribution, this study uses the k-means algorithm for state division.
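A one-dimensional k-means state division can be sketched as follows. This is an illustrative version only: the quantile-based initialization (chosen to make the result deterministic), the function names, and the convention that states are numbered by increasing mean are assumptions of this example.

```python
def assign(values, centers):
    """Label each value with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: abs(v - centers[c])) for v in values]


def kmeans_1d(values, k, iters=100):
    """Divide a 1-D series into k states; returns (labels, state means)."""
    ordered = sorted(values)
    # Assumption: start from evenly spaced quantiles instead of random points.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = assign(values, centers)
        new_centers = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    centers = sorted(centers)  # state 0 has the lowest mean
    return assign(values, centers), centers


if __name__ == "__main__":
    data = [1, 2, 3, 101, 102, 103, 201, 202, 203]
    labels, means = kmeans_1d(data, 3)
    print(labels, means)
```

The returned labels form the state sequence from which transition frequencies are counted, and the returned means play the role of the per-state average values.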
Let $y_{t+n}$ be the water consumption at time t+n predicted by the ARIMA model, $D_{te}$ be the average predicted value based on the Markov chain, and $y_{te}$ be the average predicted value of the ARIMA model. Because the error of the ARIMA prediction increases gradually with the prediction horizon, the correction coefficient $f_{t+n}$ is used to correct the predicted value at the future time t+n. The modified predicted water consumption at time t+n is then expressed as Formula (7). Because the future error of the ARIMA prediction accumulates step by step, the value of the correction factor is increased gradually; hence, Formula (8) is adopted to improve the prediction accuracy.
The daily water consumption prediction process based on the Markov chain-modified ARIMA model is shown in Figure 1, and the specific steps are presented as the algorithm in Table 2.
The algorithm flow is as follows in Figure 1.

Table 2. Algorithm of data forecast based on the Markov chain-modified ARIMA model.

A2.1 The water consumption data series $D_r$ is divided into N states. The k-means clustering algorithm is used to cluster the data sequence, giving the state of each value in the sequence, the partition into N states, and the mean value $E_i$ of state i.
A2.2 The one-step state transition matrix $P^{(1)}$ is calculated by Formula (4). According to the changes of state in the sequence, the state transition frequency $f_{ij}$ is obtained, and the transition probability $p_{ij}$ of each state is then obtained according to Formula (5).
A2.3 Select the time t as the initial state and obtain the initial state vector $X_t = (x_{t,1}, x_{t,2}, \cdots, x_{t,N})$. The data of the day before the forecast date are taken as the initial state.
A2.4 Calculate the state vector $X_{t+1}$ of the water consumption to be predicted at the next time. Let $x_{t+1,i}$ represent the probability of state i at time t+1; the state vector at time t+1 is then the product of the state vector at time t and the transfer matrix, $X_{t+1} = X_t P^{(1)}$.
A2.5 The Markov chain prediction value $D_{t+1}$ for the future time is calculated by Formula (6).
A2.6 Repeat steps A2.3-A2.5 to find the Markov chain prediction of water consumption at each time to be predicted.
A2.7 The prediction value of the water consumption at time t+n is obtained from the Markov chain prediction value and the ARIMA prediction value by Formula (7).

Data Analysis
The effectiveness of the proposed algorithm is verified by examples. The daily water intake data of some water monitoring points in Guangdong Province from 2016 to 2017 were selected for the experiment. The daily water consumption data from January to December 2016 were used to build the model, and the data from January 2017 were used to test the validity of the model.

Data Pre-Processing
The abnormal values in the daily water consumption data, such as noise, zero values, and abnormally large or small values, can easily cause errors in the prediction model. Therefore, the data must be pre-processed to remove the noise, abnormally large values, and other abnormalities. First, the abnormally large values of the water consumption data were identified on the basis of the pauta criterion, and the mean value was used to fill the abnormal values. For the noise, the mode decomposition method was used to remove the high-frequency component. Figures 2 and 3 show the original data and the data after outlier processing for the two monitoring points, respectively. Figures 4 and 5 show the outlier-processed data and the de-noised data for the two monitoring points, respectively.

Model Validation
Firstly, the ARIMA analysis was performed on the data of monitoring point 1. The water consumption data $X_1$ of monitoring point 1 fluctuated within a wide range. To eliminate the fluctuation trend of the time series, the data sequence $X_1$ was differenced to obtain the sequence $DX_1$. As can be seen from Figure 6, the sequence after the first-order difference fluctuated steadily around the mean value. Figure 7 displays the autocorrelation diagram after the first-order difference of the water consumption sequence. It can be seen from the figure that the autocorrelation coefficient is greater than zero for a long time, indicating strong correlation within the sequence. The Augmented Dickey-Fuller (ADF) unit root test was used to check the stationarity of the sequence (see Table 3). The p-value of the unit root test was less than 0.05, suggesting that the sequence after the first difference was stationary.

Further, it is necessary to judge whether there is correlation within the sequence data. If the sequence is a white noise sequence, there is no information to be extracted, and the analysis of the sequence needs to be terminated. A white noise test was conducted on the data after the first-order difference, and the results are shown in Table 4. The output p-value is far less than 0.05, so the first-order difference sequence is a stationary non-white noise sequence.

Table 4. White noise test results.

Stat      p-value
312.49    6.26e-70

The ARIMA model was fitted on the stationary non-white noise sequence obtained by first-order differencing. The relative optimal model identification method was used to calculate the BIC values of all combinations of ARIMA(p, 1, q) with p and q less than or equal to 5. The model parameters with the minimum BIC value were selected, and the BIC matrix bic_mat was as follows: When p is 2 and q is 2, the minimum BIC value is 5178.98. The sequence was then fitted and analyzed with the ARIMA(2, 1, 2) model. The p-value of the white noise test on the residuals was 0.93, so the residuals are white noise; therefore, the model is valid.
The same method was adopted to determine the fitting model for the water consumption data of monitoring point 2. The time sequence after the first-order difference of monitoring point 2 fluctuated stably around the mean value, as shown in Figure 8. Figure 9 displays the autocorrelation diagram after the first-order difference of the water consumption sequence at monitoring point 2. The ADF unit root test was used to check the stationarity of the sequence, and the results are shown in Table 5. The unit root test p-value was less than 0.05, which suggests that the sequence after the first-order difference was stationary. The white noise test was carried out on the data after the first-order difference, and the results are shown in Table 6. The output p-value was far less than 0.05; therefore, the sequence after the first-order difference was a stationary non-white noise sequence. The BIC values of all combinations of ARIMA(p, 1, q) with p and q less than or equal to 5 were calculated. The p and q values corresponding to the minimum BIC value were both 2, so the sequence was also fitted and analyzed with the ARIMA(2, 1, 2) model. The p-value of the white noise test on the residuals was 0.90, so the residuals are white noise; thus, the model passed the test and is valid.
The longer the prediction period of the ARIMA model, the larger the prediction error, which causes error accumulation. Therefore, the proposed error correction method based on the Markov chain was used to correct the prediction results from the ARIMA model.
Firstly, on the basis of the Markov model, the state transitions in the training data were counted, and the state transition matrix and the one-step state transition value under each state were obtained. Subsequently, the predicted future value was obtained as the future data trend. Then, the modified values were calculated on the basis of the prediction results of the Markov model.
The Markov chain one-step state probability matrices of the daily water consumption data were obtained for monitoring point 1 and monitoring point 2, respectively. Given the significance level α = 0.01, the critical value $\chi^2_{0.01}((5-1)^2) = 32$ can be obtained from the chi-square table. According to Equations (2) and (3), the $\chi^2$ statistics of monitoring points 1 and 2 are 700.81 and 1268.14, respectively. Both values far exceed the critical value, so the sequences satisfy the Markov property. Therefore, the Markov model can be used to predict the daily water consumption in the future.
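The test above can be illustrated with a short sketch. Equations (2) and (3) are not reproduced in this section, so the code assumes the statistic commonly used for checking the Markov property, $\chi^2 = 2\sum_i\sum_j f_{ij}\,|\ln(p_{ij}/p_{\cdot j})|$, where $f_{ij}$ are transition counts, $p_{ij}$ the row-normalised transition probabilities, and $p_{\cdot j}$ the marginal column probabilities. The function name and the 2-state count tables are hypothetical.

```python
import math

def markov_chi_square(counts):
    """Chi-square statistic for the Markov property (assumed form, see lead-in).
    `counts[i][j]` is the number of observed transitions from state i to state j."""
    m = len(counts)
    total = sum(sum(row) for row in counts)
    col_marginal = [sum(counts[i][j] for i in range(m)) / total for j in range(m)]
    stat = 0.0
    for i in range(m):
        row_total = sum(counts[i])
        for j in range(m):
            f_ij = counts[i][j]
            if f_ij == 0:
                continue  # zero-count cells contribute nothing
            p_ij = f_ij / row_total
            stat += 2.0 * f_ij * abs(math.log(p_ij / col_marginal[j]))
    return stat

# Hypothetical 2-state count tables, for illustration only.
persistent = markov_chi_square([[8, 2], [2, 8]])    # next state depends on current state
independent = markov_chi_square([[5, 5], [5, 5]])   # next state ignores current state
```

The statistic is compared with the critical value for $(m-1)^2$ degrees of freedom, where m is the number of states; with 5 states and α = 0.01 this is the value of 32 quoted above.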
If the water consumption of monitoring point 1 on a given day is known, the state vector is set according to that value as $P_0 = [0, 0, 0, 0, 1]$, and the state vector of the next day is $P_1 = P_0 \times P^{(1)}$. According to Equation (6), the predicted value is [127114.01, 88890.54, 121126.54, 109786.08, 137019.38]. In the same way, the predicted values for the next n days are calculated. The modified ARIMA model then combines the predicted value of the Markov chain with the ARIMA prediction to correct the ARIMA result in proportion.
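The state-vector update $P_1 = P_0 \times P^{(1)}$ can be sketched as follows. The transition matrix and the per-state representative values below are hypothetical, and the probability-weighted mean is only one plausible way to turn a state vector into a single value; Equation (6) and the proportional correction rule are not reproduced in this section and are not implemented here.

```python
def step(state_vector, matrix):
    """One-step update: row vector times transition matrix."""
    n = len(matrix)
    return [sum(state_vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]

# HYPOTHETICAL 5-state transition matrix (each row sums to 1).
P = [
    [0.6, 0.2, 0.1, 0.1, 0.0],
    [0.2, 0.5, 0.2, 0.1, 0.0],
    [0.1, 0.2, 0.4, 0.2, 0.1],
    [0.0, 0.1, 0.2, 0.5, 0.2],
    [0.0, 0.0, 0.1, 0.3, 0.6],
]
# HYPOTHETICAL representative value for each state (for example, cluster centers).
state_values = [90.0, 100.0, 110.0, 120.0, 130.0]

p0 = [0, 0, 0, 0, 1]   # today's consumption falls in the fifth state
p1 = step(p0, P)       # state distribution for the next day
p2 = step(p1, P)       # two days ahead, by applying the same update again

# Assumption: summarise the distribution as a probability-weighted mean.
expected_next = sum(p * v for p, v in zip(p1, state_values))
```

Repeating `step` n times gives the state distribution n days ahead, which is how the trend for a longer horizon would be obtained under these assumptions.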

Stat 5%
316.44 8.62e-71
To test the prediction performance of the proposed model, the following prediction algorithms were compared and analyzed: the ARIMA prediction, the Markov prediction, and the modified ARIMA model (ARIMA-M).
To measure the stability and adaptability of the prediction model, the root mean square error (RMSE), the coefficient of determination ($R^2$), and the relative prediction error (RE) were selected as the evaluation indexes. The RMSE reflects the difference between the original value and the estimated value: the smaller the RMSE, the closer the predicted value is to the real value, and the better the prediction effect. The $R^2$ represents the overall goodness of fit of the prediction model: the closer $R^2$ is to 1, the better the predicted values fit the observations, and the better the prediction performance of the model. The RE is the ratio of the absolute error to the real value and reflects the reliability of the prediction. If the real value and the predicted value of sample i are $T_i$ and $Y_i$, respectively, N is the number of predicted samples, and the average of all real values is $\bar{T}$, then the RMSE can be calculated through Equation (12), and $R^2$ and RE can be expressed by Equations (13) and (14).
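For orientation, the three indexes can be computed as in the sketch below. It follows the standard definitions of RMSE and $R^2$; for RE it assumes the mean of $|T_i - Y_i| / T_i$ expressed in percent, which may differ in detail from Equation (14). The function names and sample values are hypothetical.

```python
import math

def rmse(true, pred):
    """Root mean square error between real and predicted values."""
    return math.sqrt(sum((t - y) ** 2 for t, y in zip(true, pred)) / len(true))

def r_squared(true, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(true) / len(true)
    ss_res = sum((t - y) ** 2 for t, y in zip(true, pred))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

def mean_relative_error(true, pred):
    """Assumed form of RE: mean of |T_i - Y_i| / T_i, in percent."""
    return 100.0 * sum(abs(t - y) / t for t, y in zip(true, pred)) / len(true)

# Hypothetical real and predicted values, for illustration only.
T = [100, 110, 120, 130]
Y = [102, 108, 123, 127]
scores = (rmse(T, Y), r_squared(T, Y), mean_relative_error(T, Y))
```

Under this definition, $R^2$ becomes negative when the predictions fit worse than the mean of the real values, which is how a negative value on test data can arise.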
The prediction results and the relative error of the training data of monitoring point 1 are presented in Figures 12 and 13, respectively. The prediction results and relative error curves of the training data of monitoring point 2 are shown in Figures 14 and 15, respectively. The prediction results on the training data show that the daily water consumption of the two monitoring points predicted by the ARIMA was close to the real values, and that the overall trend predicted by the Markov model was consistent with the real data, although some errors were present. According to the error curves, the error of the ARIMA prediction was close to 0, whereas the error of the Markov prediction fluctuated between −12 and 15 at monitoring point 1 and between −8 and 14 at monitoring point 2.
Tables 7 and 8 show the prediction error of the ARIMA and the Markov model on the training dataset for monitoring point 1 and 2, respectively.
According to the prediction data of the monitoring points, the relative error (RE) of the ARIMA prediction was less than 0.2, and the coefficient of determination ($R^2$) was close to 1; therefore, the training dataset was well fitted by this model. The Markov model performed considerably worse than the ARIMA model on the training data in terms of root mean square error, coefficient of determination, and relative error. The relative errors of the Markov model for monitoring point 1 and monitoring point 2 were about 13 and 18 times that of the ARIMA, respectively. Nevertheless, the relative error RE of the Markov prediction was less than 2.5%, which can meet the requirements of daily water consumption prediction. Therefore, the combined ARIMA and Markov prediction model (ARIMA-M) can be used for daily water consumption prediction: the ARIMA model fits the training data with high accuracy, and the Markov model predicts the trend of the water consumption data on the basis of the training data.
On the basis of the training set of daily water consumption, the ARIMA and Markov prediction models were obtained by training. To verify the validity of the model, the ARIMA and the proposed ARIMA-M correction algorithm were used to predict the data of the 20 days from 1 to 20 January 2017. Table 9 shows the predicted values and errors of monitoring point 1 during the following 10 days. According to these forecast data, the relative error RE of the ARIMA-M forecast was reduced by 15.77% compared to the ARIMA forecast. Figure 16 shows the total water consumption change and the relative error curve of monitoring point 2 for the following 20 days. Figure 17 shows the prediction error curve of the water consumption of monitoring point 2 for the following 20 days using the ARIMA and ARIMA-M algorithms. It can be seen from the figure that the values predicted for the test data by the ARIMA-M model were closer to the real values and that the prediction error was lower.
The prediction errors of the ARIMA and the proposed ARIMA-M model on the overall test set of monitoring points 1 and 2 are presented in Tables 10 and 11, respectively. Compared to the training data, the prediction error on the test data increased greatly. At monitoring point 1, the RMSE reached 14,085, the $R^2$ value was only −0.04, and the relative error reached 8.07. Using ARIMA-M, the RMSE of the predicted values on the test set decreased by 25%, $R^2$ increased by more than 10 times, and the relative error decreased by 24.4% in comparison with the traditional ARIMA. For monitoring point 2, compared to the ARIMA, the RMSE and the relative error of the ARIMA-M predictions on the test set were reduced by 18.4% and 13%, respectively.
According to the above analysis, the ARIMA model can provide a better fit for the changes of daily water consumption data at the monitoring points, whereas the Markov model can predict the trend of daily water consumption data within a certain error range.
However, due to the random nature of the water consumption data, the prediction accuracy of the above models for unknown data decreased. The proposed ARIMA-M model can be used (1) to correct the deviation of the future daily water consumption predictions, (2) to reduce the overfitting of the ARIMA model to the training dataset, (3) to improve the prediction accuracy of the data, and (4) to provide data support for decision makers on the basis of the predicted daily water consumption.

Discussion and Conclusions
Water resource is an important factor affecting the sustainable development of regional environment and society. Water consumption prediction can provide an important decision basis for regional water supply scheduling optimization. The accurate prediction and quota analysis of water consumption are helpful to the design of regional water use strategy, the improvement of emergency response ability of water resource management, and the improvement of water resource management and service level.
Therefore, a daily water consumption prediction method is proposed in this study, in which the Markov model is used to modify the ARIMA prediction value. A complete set of schemes from actual data preprocessing to prediction analysis was provided. First, the abnormal values in the data were corrected and the data noise was effectively reduced by EEMD decomposition; further prediction and analysis were then carried out. The main idea of the method was to obtain the data prediction model by fitting the historical data with the ARIMA model. The Markov model was then used to predict the future trend of the data and to modify the ARIMA prediction, which corrected the large error caused by error superposition and improved the accuracy of data prediction.
By analyzing the actual data of two water consumption monitoring points, the results showed that the prediction model of ARIMA and Markov had a small error for the training data; however, the prediction error to the unknown data in the future increased greatly. This meant that the model was overfitted. The ARIMA-M method can effectively improve the prediction accuracy of the future daily water consumption data for the monitoring point.
The main findings of this study include: (1) The prediction error of the ARIMA model for unknown data can be corrected by using the data trend prediction results of the Markov model. (2) When the ARIMA model is used on a limited dataset, it can easily produce overfitting. With the hybrid model based on the ARIMA and Markov prediction models, the prediction error can be corrected and the prediction ability of the model can be improved. (3) A small prediction error on the training data does not mean that the prediction result of the model is good. Therefore, a hybrid model can be used to reduce the effect of overfitting.
For future research, the seasonal characteristics of water consumption data can be analyzed, with the aim of further improvement in prediction accuracy. In addition, the adaptability of the model to the annual water consumption data, as well as the early warning of regional water security by integrating regional economic, social, and environmental data, are all worthy of further exploration.