This paper proposes a fusion of unsupervised clustering and incremental similarity tracking of hourly water demand series. Current research using unsupervised methods to detect anomalous water demand is limited, and existing methods may suffer from several limitations, such as the need for a large dataset, the need to select an optimal number of clusters, or low detection accuracy. The proposed approach addresses the need for a large dataset by detecting anomalies through (1) clustering points that are relatively similar at each time step, (2) clustering points at each time step by how similarly they vary between time steps, and (3) comparing incoming points with a reference shape for online detection of anomalous trends. In addition, the use of a Bayesian nonparametric approach, the Dirichlet Process Mixture Model, eliminates the need to choose an optimal number of clusters and provides a natural mechanism for ‘reserving’ an empty cluster for future anomalies. Of the 165 randomly generated anomalies, the proposed approach detected 159, along with other anomalous trends present in the data. As the data are unlabeled, the identified anomalous trends cannot be verified. However, the results show great potential for using a minimal amount of unlabeled water demand data for preliminary anomaly detection.
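The idea of ‘reserving’ an empty cluster can be illustrated with a minimal sketch of the assignment rule commonly used for Dirichlet Process mixtures (the Chinese restaurant process form). This is not the implementation used in this work; it assumes one-dimensional Gaussian clusters with a known variance `sigma2` and a Gaussian prior `N(mu0, tau2)` on each cluster mean, and the function names, parameter values, and demand values are hypothetical. An existing cluster attracts a new point in proportion to its size times its predictive density, while the empty cluster attracts it in proportion to the concentration parameter `alpha` times the prior predictive density.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def assignment_weights(x, clusters, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=100.0):
    """Normalised weights for assigning x to each existing cluster,
    with the weight of the reserved empty cluster as the last entry.

    Assumptions (illustrative only): known within-cluster variance sigma2
    and a N(mu0, tau2) prior on every cluster mean.
    """
    weights = []
    for members in clusters:
        n = len(members)
        xbar = sum(members) / n
        # Conjugate posterior of the cluster mean given its members.
        precision = 1.0 / tau2 + n / sigma2
        post_mean = (mu0 / tau2 + n * xbar / sigma2) / precision
        # Posterior predictive density of x, weighted by cluster size.
        weights.append(n * normal_pdf(x, post_mean, sigma2 + 1.0 / precision))
    # Empty cluster: prior predictive density, weighted by alpha.
    weights.append(alpha * normal_pdf(x, mu0, sigma2 + tau2))
    total = sum(weights)
    return [w / total for w in weights]


def most_likely_cluster(x, clusters, **params):
    """Index of the highest-weight cluster; len(clusters) means 'new cluster'."""
    w = assignment_weights(x, clusters, **params)
    return max(range(len(w)), key=w.__getitem__)


# Hypothetical demand values observed at one hour of the day.
clusters = [[10.0, 10.5, 9.8, 10.2], [20.1, 19.7, 20.4]]
params = dict(alpha=1.0, sigma2=1.0, mu0=15.0, tau2=100.0)

typical = most_likely_cluster(10.1, clusters, **params)  # joins cluster 0
unusual = most_likely_cluster(35.0, clusters, **params)  # opens the empty cluster
```

Under these assumptions, a point close to an existing cluster joins it, whereas a point far from all clusters is most likely to open the reserved empty cluster and can therefore be treated as a candidate anomaly. A full Dirichlet Process Mixture Model would sample assignments rather than take the maximum and would also infer the cluster parameters.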
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.