# Application of Deep Learning in Drainage Systems Monitoring Data Repair—A Case Study Using Con-GRU Model


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Process Framework

In the first part, the monitoring data **S** that needs to be repaired is first identified, and other sensor monitoring data **F** that is closely related to **S** is then analyzed. These two elements are used to construct the dataset **D** for data repair, which is then partitioned into training, validation, and test datasets according to certain principles. The training dataset is used for model training, the validation dataset is used to guard against overfitting or underfitting during training, and the test dataset is used to evaluate how the model performs on unseen data.

The second part comprises the establishment and training of the model. Because this research aims to repair long-term abnormal monitoring data using an iterative prediction approach, the sequence length **OL** repaired by the model needs to be set. To ensure the effectiveness of each iteration, the appropriate input sequence length **IL** of the model also needs to be determined, as described in Section 3.3. A suitable deep learning model is then constructed based on the number of input features determined in the first part, as presented in Section 3.4. The model is trained on the training dataset, while the validation dataset is used to check the efficacy of the training.

The third part involves model testing and application. The test dataset is used to evaluate the effectiveness of both single-step and multistep iterative repairs. For single-step iterative repair, the model uses actual monitoring data in each repair iteration, as described in Section 4.1. For multistep iterative repair, actual monitoring data is used only in the initial iteration, whereas the repaired results are used to update the model input in subsequent iterations, which enables long-term abnormal data repair, as described in Sections 4.2 and 4.3.
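The windowing and iteration scheme described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `make_samples`, `iterative_repair`, and `persistence_predict` are hypothetical names, the placeholder predictor simply repeats the last value of **S** in place of a trained Con-GRU model, and the related-sensor data **F** is assumed to stay available throughout the gap being repaired.

```python
from typing import Callable, List, Sequence, Tuple

Window = List[List[float]]  # IL rows; each row is [S value, related-sensor values...]


def make_samples(S: Sequence[float], F: Sequence[Sequence[float]],
                 IL: int, OL: int) -> Tuple[List[Window], List[List[float]]]:
    """Slide over the series: IL steps of [S, F] as input, the next OL steps of S as target."""
    rows = [[s, *f] for s, f in zip(S, F)]
    X, y = [], []
    for start in range(len(rows) - IL - OL + 1):
        X.append(rows[start:start + IL])
        y.append(list(S[start + IL:start + IL + OL]))
    return X, y


def persistence_predict(window: Window, OL: int) -> List[float]:
    """Placeholder predictor (NOT the Con-GRU): repeats the last S value OL times."""
    return [window[-1][0]] * OL


def iterative_repair(S: Sequence[float], F: Sequence[Sequence[float]],
                     gap_start: int, gap_len: int, IL: int, OL: int,
                     predict: Callable[[Window, int], List[float]],
                     feed_back: bool) -> List[float]:
    """Repair S[gap_start:gap_start + gap_len] in blocks of at most OL values.

    feed_back=False: every input window is built from the actual series (single-step).
    feed_back=True: later windows reuse previously repaired values (multistep).
    """
    if gap_start < IL:
        raise ValueError("need at least IL values before the gap")
    repaired = list(S)
    # In multistep mode the window reads from the list that is being repaired.
    source = repaired if feed_back else list(S)
    gap_end = gap_start + gap_len
    for pos in range(gap_start, gap_end, OL):
        window = [[source[t], *F[t]] for t in range(pos - IL, pos)]
        block = predict(window, OL)[:gap_end - pos]  # trim the last block to the gap
        repaired[pos:pos + len(block)] = block
    return repaired


# Toy data: S is a ramp, F is a single constant related-sensor channel.
S = [float(t) for t in range(20)]
F = [[0.0] for _ in S]
X, y = make_samples(S, F, IL=4, OL=3)
single = iterative_repair(S, F, 10, 6, 4, 3, persistence_predict, feed_back=False)
multi = iterative_repair(S, F, 10, 6, 4, 3, persistence_predict, feed_back=True)
```

With the placeholder predictor, the single-step run refreshes its input from actual data before the second block, whereas the multistep run builds the second block from its own first-block output. That feedback is what allows a gap longer than **OL** to be repaired, and it is also where errors can accumulate.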

#### 2.2. One-Dimension Convolution (Conv1D)

#### 2.3. Gated Recurrent Unit (GRU)

#### 2.4. Data Sample Generation Strategy

#### 2.5. Model Establishment Strategy

#### 2.6. Performance Indicators

## 3. Case Study

#### 3.1. Research Data

#### 3.2. Data Normalization

#### 3.3. Sample Processing

#### 3.4. Model Development

#### 3.4.1. Con-GRU Model Design

#### 3.4.2. Hyperparameter Configuration

## 4. Results and Discussion

#### 4.1. Performance of 3 h Data Repair Results on the Validation Dataset

#### 4.2. Performance of 1 Day Data Iteratively Repaired Results on the Validation Dataset

#### 4.3. Performance of 1 Day Data Iteratively Repaired Results on the Test Dataset

#### 4.4. Influence of Pump Status and Rainfall on Repair Effect

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest


**Figure 3.** The calculation process of GRU: (**a**) calculation of reset gate; (**b**) calculation of update gate; (**c**) calculation of candidate hidden state; (**d**) calculation of hidden state.

**Figure 8.** Comparison of three models in iterative repair on validation dataset: (**a**) results of data repair on 6 March 2020; (**b**) results of data repair on 10 March 2020; (**c**) results of data repair on 23 April 2020; (**d**) results of data repair on 18 July 2020.

**Figure 9.** Comparison of the Con-GRU model in iterative repair on test dataset: (**a**) results of data repair on 24 August 2020; (**b**) results of data repair on 22 November 2020; (**c**) results of data repair on 27 November 2020; (**d**) results of data repair on 29 December 2020.

**Figure 11.** Comparison of the Con-GRU model and Con-GRU-I model on the validation dataset for 3 h data repair results.

**Figure 12.** Comparison of the Con-GRU model and Con-GRU-I model in iterative repair on test dataset: (**a**) results of data repair on 14 September 2020; (**b**) results of data repair on 29 October 2020; (**c**) results of data repair on 1 November 2020; (**d**) results of data repair on 10 November 2020.

| Mean Level (m) | Maximum (m) | Minimum (m) | Standard Deviation (m) | Skewness | Kurtosis |
|---|---|---|---|---|---|
| −5.502 | 6.452 | −7.382 | 2.187 | 1.586 | 2.051 |

| Hyperparameter | Search Range | Optimal Configuration |
|---|---|---|
| cw | [2, 4, 6, 8, 10, 12] | 6 |
| gd | [10, 12, 14, 16, 18] | 16 |
| lr | 0.0005 to 0.04 in steps of 0.0005 | 0.001 |
| bs | [256, 512, 1024, 2048] | 1024 |
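To give a sense of the size of this search space, the ranges in the table above can be written out as a small Python structure. This is only a sketch: the abbreviations `cw`, `gd`, `lr`, and `bs` are kept exactly as they appear in the table, the helper names `grid_size` and `sample_config` are hypothetical, and drawing configurations at random is an assumption made for illustration, since this section does not state how the space was searched.

```python
import random
from typing import Dict, List

# Search ranges copied from the hyperparameter table.
SEARCH_SPACE: Dict[str, List[float]] = {
    "cw": [2, 4, 6, 8, 10, 12],
    "gd": [10, 12, 14, 16, 18],
    # 0.0005, 0.0010, ..., 0.0400 (80 values); rounding removes float noise.
    "lr": [round(0.0005 * k, 4) for k in range(1, 81)],
    "bs": [256, 512, 1024, 2048],
}

# Optimal configuration reported in the table.
OPTIMAL = {"cw": 6, "gd": 16, "lr": 0.001, "bs": 1024}


def grid_size(space: Dict[str, List[float]]) -> int:
    """Number of distinct configurations in the full grid."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size


def sample_config(space: Dict[str, List[float]], rng: random.Random) -> Dict[str, float]:
    """Draw one configuration uniformly at random (assumed strategy, for illustration)."""
    return {name: rng.choice(values) for name, values in space.items()}


total = grid_size(SEARCH_SPACE)  # 6 * 5 * 80 * 4 combinations
candidate = sample_config(SEARCH_SPACE, random.Random(0))
```

The full grid contains 9600 combinations, which is why evaluating only a sample of configurations is a common choice for a space of this size.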

| Model | MAE (m) | RMSE (m) | NSE | AIC | BIC |
|---|---|---|---|---|---|
| ANN | 0.3981 | 0.4827 | 0.7791 | 221,622.3 | 510,494.1 |
| LSTM | 0.4168 | 0.5045 | 0.7613 | 96,271.2 | 221,850.8 |
| Con-GRU | 0.2843 | 0.3558 | 0.8136 | 27,579.3 | 63,772.9 |
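For readers who want to reproduce the MAE, RMSE, and NSE columns on their own data, the sketch below uses the standard textbook definitions of the three indicators. It assumes these match the definitions used in Section 2.6, which is not reproduced here. The function names and the toy series are illustrative, and AIC and BIC are left out because their exact form depends on how the likelihood and parameter count are defined.

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean absolute error between observed and repaired values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error between observed and repaired values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of observations."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance_sum


# Toy example (made-up numbers, not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
repaired = [1.0, 2.0, 3.0, 6.0]
scores = (mae(observed, repaired), rmse(observed, repaired), nse(observed, repaired))
```

Lower MAE and RMSE indicate a closer fit, while NSE approaches 1 for a perfect repair and falls to 0 when the repaired series is no better than the mean of the observations.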


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

He, L.; Ji, S.; Xin, K.; Chen, Z.; Chen, L.; Nan, J.; Song, C.
Application of Deep Learning in Drainage Systems Monitoring Data Repair—A Case Study Using Con-GRU Model. *Water* **2023**, *15*, 1635.
https://doi.org/10.3390/w15081635
