Harsh pollutants that are illegally disposed of in the sewer network may spread beyond it (e.g., through leakages into groundwater reservoirs) and may also impair the correct operation of wastewater treatment plants. Consequently, such pollutants pose serious threats to water bodies, to the natural environment and, therefore, to all life. In this article, we focus on the problem of identifying a wastewater pollutant and localizing its source point in the wastewater network, given a time series of wastewater measurements collected by sensors positioned across the sewer network. We address the problem by solving two linked sub-problems. The first sub-problem concerns the detection and identification of pollutants flowing in wastewater, i.e., assessing whether a given time series corresponds to a contamination event and determining which polluting substance caused it. This sub-problem is solved using random forest classifiers. The second sub-problem concerns estimating the distance between the point of measurement and the pollutant source, given the outcome of the substance identification sub-problem. The XGBoost algorithm is used to predict the distance from the source to the sensor. Both models are trained on simulated electrical conductivity and pH measurements of wastewater in the sewers of a sub-catchment area of a European city. Our experiments show that (a) precision and recall for the identification sub-problem can both be as high as 96%, and (b) the median error of the source distance estimate can be as low as 6.30 m.
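The way the two sub-problems link together can be illustrated with a minimal sketch. It is not the article's implementation: the names `localize`, `Localization`, and `NO_EVENT` are hypothetical, the models are passed in as plain callables (in the article, stage 1 uses random forest classifiers and stage 2 uses XGBoost), and selecting one distance model per identified substance is only one assumed way of "considering the outcome" of the identification step.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence, Tuple

# One sample = (electrical conductivity, pH) at a single time step.
Sample = Tuple[float, float]
Series = Sequence[Sample]

NO_EVENT = "none"  # hypothetical label meaning "no contamination event"


@dataclass(frozen=True)
class Localization:
    substance: str               # identified pollutant, or NO_EVENT
    distance_m: Optional[float]  # estimated sensor-to-source distance in metres


def localize(
    series: Series,
    identify: Callable[[Series], str],
    estimate_distance: Mapping[str, Callable[[Series], float]],
) -> Localization:
    """Run the two linked stages on one sensor time series."""
    # Stage 1: detect a contamination event and identify the substance.
    substance = identify(series)
    if substance == NO_EVENT:
        return Localization(NO_EVENT, None)
    # Stage 2 (assumption): pick the distance model by the stage-1 outcome.
    distance = estimate_distance[substance](series)
    return Localization(substance, distance)


# Stand-in models for illustration only: untrained and not from the article.
def toy_identify(series: Series) -> str:
    min_ph = min(ph for _, ph in series)
    return "acid" if min_ph < 6.0 else NO_EVENT


toy_regressors = {"acid": lambda series: 10.0 * len(series)}
```

Calling `localize([(1.2, 7.1), (1.3, 5.2)], toy_identify, toy_regressors)` runs both stages, whereas a series with no pH drop stops after stage 1 and yields no distance estimate. Keeping the models behind callables means a trained classifier and regressor could be swapped in without changing the control flow.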
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.