Open Access Article

Targeted Adaptable Sample for Accurate and Efficient Quantile Estimation in Non-Stationary Data Streams

School of Computer Science, University of St Andrews, Fife KY16 9SX, Scotland, UK
Mach. Learn. Knowl. Extr. 2019, 1(3), 848-870; https://doi.org/10.3390/make1030049
Received: 30 June 2019 / Revised: 21 July 2019 / Accepted: 24 July 2019 / Published: 27 July 2019
The need to detect outliers or otherwise unusual data, which can be formalized as the estimation of a particular quantile of a distribution, is an important problem that arises frequently in applications of pattern recognition, computer vision and signal processing. Our work was most directly motivated by the practical limitations and requirements of many semi-automatic surveillance analytics systems that detect abnormalities in closed-circuit television (CCTV) footage using statistical models of low-level motion features. In this paper, we specifically address the problem of estimating the running quantile of a data stream with non-stationary stochasticity when the absolute (rather than asymptotic) memory for storing observations is severely limited. We make several major contributions: (i) we derive an important theoretical result which shows that the change in the quantile of a stream is constrained regardless of the stochastic properties of the data; (ii) we describe a set of high-level design goals for an effective estimation algorithm that emerge as a consequence of our theoretical findings; (iii) we introduce a novel algorithm that implements these design goals by retaining a sample of data values in a manner that adapts to changes in the distribution of the data and progressively narrows its focus during periods of quasi-stationary stochasticity; and (iv) we present a comprehensive evaluation of the proposed algorithm and compare it with the existing methods in the literature on both synthetic datasets and three large “real-world” streams acquired in the course of operation of an existing commercial surveillance system. Our results and their detailed analysis demonstrate that the proposed method is highly successful and vastly outperforms the existing alternatives, especially when the target quantile is high-valued and the available buffer capacity is severely limited.
Keywords: auxiliary; flexible; median; surveillance; abnormality; histogram
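
To make the quantity being estimated concrete, the following is a minimal sketch of an exact running quantile under the nearest-rank definition. It is not the algorithm proposed in the paper: it stores every observation, which is precisely what a severely limited buffer rules out, and serves only as the unlimited-memory reference that a constrained estimator would try to approximate. The function name `running_quantiles` and the choice of the nearest-rank definition are assumptions made for illustration.

```python
import bisect
import math


def running_quantiles(stream, q):
    """Yield the exact empirical q-quantile after each new observation.

    Illustrative baseline only (hypothetical helper, not the paper's method):
    every value is retained, so memory grows linearly with the stream length.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    seen = []  # all observations so far, kept in sorted order
    for x in stream:
        bisect.insort(seen, x)
        # Nearest-rank definition: the smallest stored value such that
        # at least q * n of the n observations are less than or equal to it.
        rank = max(1, math.ceil(q * len(seen)))
        yield seen[rank - 1]


if __name__ == "__main__":
    # Running median of a short stream.
    print(list(running_quantiles([5, 1, 3, 2, 4], 0.5)))
```

Each new observation can move the reported quantile, and the cost of tracking it exactly is a buffer as long as the stream itself, which is the constraint the paper sets out to remove.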

MDPI and ACS Style

Arandjelović, O. Targeted Adaptable Sample for Accurate and Efficient Quantile Estimation in Non-Stationary Data Streams. Mach. Learn. Knowl. Extr. 2019, 1, 848-870.
