In this paper, a novel machine learning based algorithm for water supply pollution source identification is presented built specifically for high performance parallel systems. The algorithm utilizes the combination of Artificial Neural Networks for classification of the pollution source with Random Forests for regression analysis to determine significant variables of a contamination event such as start time, end time and contaminant chemical concentration. The algorithm is based on performing Monte Carlo water quality and hydraulic simulations in parallel, recording data with sensors placed within a water supply network and selecting a most probable pollution source based on a tournament style selection between suspect nodes in a network with mentioned machine learning methods. The novel algorithmic framework is tested on a small (92 nodes) and medium sized (865 nodes) water supply sensor network benchmarks with a set contamination event start time, end time and chemical concentration. Out of the 30 runs, the true source node was the finalist of the algorithm’s tournament style selection for 30/30 runs for the small network, and 29/30 runs for the medium sized network. For all the 30 runs on the small sensor network, the true contamination event scenario start time, end time and chemical concentration was set as 14:20, 20:20 and 813.7 mg/L, respectively. The root mean square errors for all 30 algorithm runs for the three variables were 48 min, 4.38 min and 18.06 mg/L. For the 29 successful medium sized network runs the start time was 06:50, end time 07:40 and chemical concentration of 837 mg/L and the root mean square errors were 6.06 min, 12.36 min and 299.84 mg/L. The algorithmic framework successfully narrows down the potential sources of contamination leading to a pollution source identification, start and ending time of the event and the contaminant chemical concentration.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.