Information-Theoretic Data Discarding for Dynamic Trees on Data Streams
Abstract: Ubiquitous automated data collection at an unprecedented scale is making available streaming, real-time information flows in a wide variety of settings, transforming both science and industry. Learning algorithms deployed in such contexts often rely on single-pass inference, where the data history is never revisited. Learning may also need to be temporally adaptive to remain up-to-date against unforeseen changes in the data generating mechanism. Online Bayesian inference remains challenged by such transient, evolving data streams. Nonparametric modeling techniques can prove particularly ill-suited, as the complexity of the model is allowed to increase with the sample size. In this work, we take steps to overcome these challenges by porting information-theoretic heuristics, such as exponential forgetting and active learning, into a fully Bayesian framework. We showcase our methods by augmenting a modern nonparametric modeling framework, dynamic trees, and illustrate its performance on a number of practical examples. The end product is a powerful streaming regression and classification tool, whose performance compares favorably to the state-of-the-art.
Anagnostopoulos, C.; Gramacy, R.B. Information-Theoretic Data Discarding for Dynamic Trees on Data Streams. Entropy 2013, 15, 5510-5535.
Anagnostopoulos C, Gramacy RB. Information-Theoretic Data Discarding for Dynamic Trees on Data Streams. Entropy. 2013; 15(12):5510-5535.
Anagnostopoulos, Christoforos; Gramacy, Robert B. 2013. "Information-Theoretic Data Discarding for Dynamic Trees on Data Streams." Entropy 15, no. 12: 5510-5535.