Prediction of Daily Water Consumption in Residential Areas Based on Meteorologic Conditions—Applying Gradient Boosting Regression Tree Algorithm
Abstract
1. Introduction
2. Materials and Methods
- (1) Extraction and preprocessing of the meteorological data and the historical water consumption data;
- (2) Correlation analysis between historical water consumption and the meteorological data, to select the factors most strongly correlated with water consumption;
- (3) Application of the GBRT model to characterize the relationship between the selected factors and water consumption, with a genetic algorithm (GA) used to optimize the GBRT hyperparameters during the analysis.
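Step (2) can be illustrated with a minimal sketch. It assumes Pearson correlation and an absolute-value cut-off; the coefficient type, the `threshold` value, the function names, and the toy series are all illustrative assumptions rather than details taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_factors(factors, consumption, threshold=0.5):
    """Keep the factors whose |r| with consumption reaches the threshold.

    `factors` maps a factor name to its daily series. The threshold is a
    hypothetical cut-off chosen for illustration only.
    """
    scores = {name: pearson(series, consumption) for name, series in factors.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


# Made-up daily values, for illustration only.
consumption = [100.0, 104.0, 109.0, 115.0, 120.0, 126.0]
factors = {
    "average temperature": [18.0, 19.5, 21.0, 23.5, 25.0, 27.0],
    "average barometric pressure": [1012.0, 1009.0, 1013.0, 1008.0, 1012.5, 1010.0],
}
selected = select_factors(factors, consumption)
print(selected)
```

In this toy input the temperature series rises with consumption and is retained, while the pressure series fluctuates without trend and is dropped.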
2.1. Data Acquisition and Preprocessing
2.1.1. Meteorological Data
2.1.2. Water Consumption Data
2.2. Analysis of Meteorological Factors Affecting Water Consumption
2.2.1. Meteorological Factors
2.2.2. Composite Meteorological Factor
2.3. Gradient Boosted Regression Tree
2.3.1. Algorithm Overview
2.3.2. Input and Output
2.3.3. Algorithmic Flow
- (a) Compute the residual between the current predicted value and the actual daily water consumption; this residual is the target for the next round of training;
- (b) Fit a decision tree using the current target value as the label, splitting each node into left and right subtrees based on the meteorological factors;
- (c) Determine the best partition feature and threshold for each node by minimizing the regression loss function (mean square error);
- (d) Allocate the samples to the corresponding subtree according to the partition result;
- (e) Calculate the optimal output value for each leaf node;
- (f) Update the current prediction of daily water consumption by adding the prediction of the new decision tree multiplied by a learning rate;
- (g) Repeat steps a–f to generate more decision trees. Training terminates when the preset number of trees or the maximum number of iterations is reached, or when the change in the residual becomes small.
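The flow above can be sketched in plain Python. This is a simplified illustration, not the study's implementation: each tree is reduced to a single split (a decision stump), the initial prediction is assumed to be the mean of the targets, and the names `fit_stump`, `fit_gbrt`, `predict_gbrt`, and the toy data are hypothetical.

```python
def fit_stump(X, targets):
    """Choose the (feature, threshold) split that minimises squared error.

    Returns (feature, threshold, left_value, right_value), where the leaf
    values are the mean targets on each side, or None if no split exists.
    """
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, targets) if row[j] <= threshold]
            right = [t for row, t in zip(X, targets) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = sum((t - left_mean) ** 2 for t in left)
            sse += sum((t - right_mean) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return None if best is None else best[1:]


def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_gbrt(X, y, n_trees=50, learning_rate=0.1, tol=1e-6):
    base = sum(y) / len(y)  # assumed initial prediction: mean consumption
    pred = [base] * len(y)
    stumps, prev_loss = [], None
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]       # residual targets
        loss = sum(r * r for r in residuals) / len(y)
        if prev_loss is not None and abs(prev_loss - loss) < tol:
            break                                              # residual change is small
        prev_loss = loss
        stump = fit_stump(X, residuals)                        # split and leaf values
        if stump is None:
            break
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row)  # shrunken update
                for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_gbrt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Made-up samples: [temperature, relative humidity] -> daily consumption.
X = [[18.0, 70.0], [21.0, 65.0], [24.0, 60.0], [27.0, 55.0], [30.0, 50.0], [33.0, 45.0]]
y = [100.0, 104.0, 110.0, 118.0, 125.0, 131.0]
model = fit_gbrt(X, y)
mse_model = sum((predict_gbrt(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
mean_y = sum(y) / len(y)
mse_baseline = sum((mean_y - t) ** 2 for t in y) / len(y)
print(mse_model, mse_baseline)
```

A full regression tree would apply the same split search recursively to each subtree; the stump keeps the sketch short while preserving the residual-fitting loop.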
2.3.4. Data Processing and Model Creation
Data Processing
- (1) Data loading: The dataset was first loaded from an external CSV (comma-separated values) file;
- (2) Data splitting: After loading, the data were divided into two components, the features (X) and the target variable (y). The first 10 columns of the dataset were the input features (X), and the eleventh column was the target variable (y).
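As a minimal sketch of these two steps, the snippet below reads comma-separated rows and separates the first 10 columns from the eleventh. The in-memory text stands in for the external file, and the header row, column names, and numeric values are hypothetical.

```python
import csv
import io

# Stand-in for the external CSV file (hypothetical header and values).
header = [f"factor_{i}" for i in range(1, 11)] + ["consumption"]
rows = [
    [20.1, 25.3, 15.2, 65.0, 0.0, 2.1, 7.5, 22.4, 30.1, 16.8, 118.4],
    [22.4, 27.9, 17.0, 58.0, 1.2, 3.0, 8.1, 24.0, 33.5, 18.2, 124.9],
]
csv_text = "\n".join(",".join(str(v) for v in line) for line in [header] + rows)


def load_xy(file_obj, n_features=10):
    """Return (X, y): the first `n_features` columns and the next column."""
    reader = csv.reader(file_obj)
    next(reader)  # assumes the file has a single header row
    X, y = [], []
    for record in reader:
        values = [float(v) for v in record]
        X.append(values[:n_features])
        y.append(values[n_features])
    return X, y


X, y = load_xy(io.StringIO(csv_text))
print(len(X), len(X[0]), y)
```

With a real file, `io.StringIO(csv_text)` would be replaced by an `open(path, newline="")` handle.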
Model Creation
- (1) Data splitting for training and testing: The dataset was further divided into training and testing sets to assess the model’s performance, using the train_test_split function from scikit-learn. Here, 90% of the data were allocated to training and the remaining 10% to testing;
- (2) Gradient boosting regressor initialization: A gradient boosting regressor was initialized with specific hyperparameters, namely the loss function, the learning rate, the number of estimators, the maximum depth of the trees, and the minimum number of samples required to split a node;
- (3) Model training: The gradient boosting regressor was trained on the training set. During this phase, the model learned to map the input features (X) to the target variable (y);
- (4) Model evaluation: The model’s performance was assessed on both the training and testing sets by calculating the R-squared (R²) score, which measures goodness of fit. The higher the R² score, the better the model fits the data;
- (5) Feature importance: The model was analyzed to determine the importance of each feature in making predictions. Feature importances indicate which input variables have the most influence on the target variable;
- (6) Prediction: After training, the model was applied to the entire dataset (X) to make predictions, which were used for further analysis.
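The six steps map onto scikit-learn roughly as follows. This is a sketch on synthetic data, not the study's script: the generated `X` and `y`, the random seeds, and the hyperparameter values are placeholders, and the loss function is left at the library default (squared error).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 10 features and the daily consumption target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 10))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=300)

# (1) 90% of the samples for training, 10% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

# (2) Initialise the regressor; these values are illustrative only.
model = GradientBoostingRegressor(
    learning_rate=0.1,
    n_estimators=100,
    max_depth=3,
    min_samples_split=2,
    random_state=0,
)

# (3) Train on the training set.
model.fit(X_train, y_train)

# (4) Evaluate with the R-squared score on both sets.
r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))

# (5) Impurity-based importance of each of the 10 features.
importances = model.feature_importances_

# (6) Predict over the entire dataset for further analysis.
predictions = model.predict(X)

print(round(r2_train, 3), round(r2_test, 3), int(importances.argmax()))
```

Because step (6) predicts over all samples, those predictions include the training rows; the test-set R² from step (4) is the figure that reflects performance on unseen data.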
2.3.5. Algorithm Parameter
3. Results
3.1. Analysis of the Importance of Meteorology Factors
3.1.1. Streamlining and Analysis of Drivers by Correlation Analysis
3.1.2. Discussion of SST and RGST
3.2. Analysis of Daily Water Consumption Using GBRT Algorithm
3.2.1. Hyperparameter Optimization for GBRT
3.2.2. Analysis of the Results of GBRT
4. Discussion
4.1. Analysis of the Importance of Meteorological Factors Based on Gini Index
4.2. Analysis of Factor Necessity
4.3. Comparison and Limitations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Meteorological Indicators | Resolution | Units |
---|---|---|
minimum evaporation | 0.1 | mm |
maximum evaporation | 0.1 | mm |
average barometric pressure | 0.1 | hPa |
highest barometric pressure | 0.1 | hPa |
lowest barometric pressure | 0.1 | hPa |
average temperature | 0.1 | °C |
maximum temperature | 0.1 | °C |
minimum temperature | 0.1 | °C |
average relative humidity | 0.1 | % |
daily precipitation | 0.1 | mm |
8–20 precipitation | 0.1 | mm |
20–20 precipitation | 0.1 | mm |
average wind speed | 0.1 | m/s |
maximum wind speed | 0.1 | m/s |
wind direction of maximum wind speed | ||
maximum wind speed | 0.1 | m/s |
wind direction of maximum wind speed | ||
sunshine duration | 0.1 | h |
average ground temperature | 0.1 | °C |
highest ground temperature | 0.1 | °C |
lowest ground temperature | 0.1 | °C |
Hyperparameter Name | Default | Range | Step Size |
---|---|---|---|
N_estimators | 100 | 10–150 | 1 |
Learning_rate | 1.00 | 0.01–1.00 | 0.01 |
Max_depth | 50 | 2–100 | 1 |
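A genetic search over the ranges and step sizes in the table might look like the sketch below. The GA configuration used in the study is not given here, so the population size, number of generations, truncation selection, uniform crossover, mutation rate, and elitism are all assumptions. The `toy_fitness` function is an arbitrary stand-in; in practice it would be replaced by a score such as the held-out R² of a GBRT trained with the candidate settings.

```python
import random

# Search space from the table: (low, high, step) for each hyperparameter.
SPACE = {
    "n_estimators": (10, 150, 1),
    "learning_rate": (0.01, 1.00, 0.01),
    "max_depth": (2, 100, 1),
}


def random_gene(name, rng):
    """Draw one value on the grid defined by the range and step size."""
    low, high, step = SPACE[name]
    n_steps = round((high - low) / step)
    value = low + rng.randint(0, n_steps) * step
    return round(value, 2) if isinstance(step, float) else value


def random_individual(rng):
    return {name: random_gene(name, rng) for name in SPACE}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: (a if rng.random() < 0.5 else b)[name] for name in SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        name: random_gene(name, rng) if rng.random() < rate else value
        for name, value in individual.items()
    }


def genetic_search(fitness, pop_size=20, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [ranked[0]] + children  # elitism keeps the current best
    return max(population, key=fitness), history


def toy_fitness(ind):
    """Arbitrary smooth surrogate, used only to exercise the search."""
    return -(
        ((ind["n_estimators"] - 80) / 140) ** 2
        + (ind["learning_rate"] - 0.1) ** 2
        + ((ind["max_depth"] - 5) / 98) ** 2
    )


best, history = genetic_search(toy_fitness)
print(best, history[-1])
```

Elitism guarantees that the best fitness never decreases between generations, which makes the search history easy to inspect for convergence.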
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, Z.; Peng, S.; Zheng, G.; Chu, X.; Tian, Y. Prediction of Daily Water Consumption in Residential Areas Based on Meteorologic Conditions—Applying Gradient Boosting Regression Tree Algorithm. Water 2023, 15, 3455. https://doi.org/10.3390/w15193455