The Application of a Double CUSUM Algorithm in Industrial Data Stream Anomaly Detection
Abstract: The effectiveness of machine learning on data streams is degraded by concept drift, drift deviation, and noise interference. This paper proposes a data stream anomaly detection algorithm that combines control chart and sliding window methods. The algorithm is named DCUSUM-DS (Double CUSUM Based on Data Stream) because it uses a dual mean value cumulative sum. Built on nested sliding windows to address concept drift, DCUSUM-DS averages the data within the window twice, extracts new features, and then computes the cumulative sum and control chart so that isolated interference points do not mislead the detection. The algorithm is evaluated in simulation on industrial drilling engineering data. Compared with automatic outlier detection for data streams (A-ODDS) and sliding nest window chart anomaly detection based on data streams (SNWCAD-DS), DCUSUM-DS accounts for concept drift and shields a small amount of interference that deviates from the overall data. Although the algorithm's running time increased from 0.1 s to 0.19 s, the classification accuracy, measured by the receiver operating characteristic (ROC), increased from 0.89 to 0.95. The algorithm therefore meets the needs of oil drilling industry data streams sampled at 1 Hz while improving classification accuracy.
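The idea summarized in the abstract (average twice over nested sliding windows, then run a cumulative sum against a control limit) can be illustrated with a minimal Python sketch. This is not the authors' DCUSUM-DS implementation: the function name `double_mean_cusum`, the window sizes, the warm-up baseline, the standard two-sided tabular CUSUM, and the parameters `k` and `h` are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def double_mean_cusum(stream, inner=5, outer=5, warmup=20, k=0.5, h=5.0):
    """Hypothetical sketch: two-sided CUSUM on a twice-averaged stream.

    Returns the indices at which the cumulative sum crosses the limit h.
    All parameter names and defaults are illustrative assumptions.
    """
    inner_win = deque(maxlen=inner)   # raw samples
    outer_win = deque(maxlen=outer)   # means of the inner window
    reference = []                    # raw warm-up samples for the baseline
    mu = sigma = None
    s_hi = s_lo = 0.0                 # upper and lower cumulative sums
    alarms = []

    for i, x in enumerate(stream):
        # First averaging pass: mean of the most recent raw samples.
        inner_win.append(x)
        if len(inner_win) == inner:
            outer_win.append(mean(inner_win))

        # Assumed baseline: mean and std of the first `warmup` raw samples.
        if mu is None:
            reference.append(x)
            if len(reference) == warmup:
                mu = mean(reference)
                sigma = pstdev(reference) or 1.0  # guard against zero spread
            continue
        if len(outer_win) < outer:
            continue

        # Second averaging pass: mean of the inner-window means.
        feature = mean(outer_win)

        # Standard tabular CUSUM on the standardized feature.
        z = (feature - mu) / sigma
        s_hi = max(0.0, s_hi + z - k)
        s_lo = max(0.0, s_lo - z - k)
        if s_hi > h or s_lo > h:
            alarms.append(i)
            s_hi = s_lo = 0.0         # restart accumulation after an alarm
    return alarms
```

In this sketch, a single outlier is spread thinly by the two averaging passes, so it contributes little to the cumulative sum, whereas a sustained shift in the mean accumulates until it crosses `h`. The sketch never re-estimates its baseline, so it keeps alarming after a persistent shift; handling concept drift as the paper describes would require updating the reference, which is not shown here.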
Cite This Article
Li, G.; Wang, J.; Liang, J.; Yue, C. The Application of a Double CUSUM Algorithm in Industrial Data Stream Anomaly Detection. Symmetry 2018, 10, 264.
Li G, Wang J, Liang J, Yue C. The Application of a Double CUSUM Algorithm in Industrial Data Stream Anomaly Detection. Symmetry. 2018; 10(7):264.
Chicago/Turabian Style
Li, Guang; Wang, Jie; Liang, Jing; Yue, Caitong. 2018. "The Application of a Double CUSUM Algorithm in Industrial Data Stream Anomaly Detection." Symmetry 10, no. 7: 264.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.