Open Access Article

Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study

Knight Equity Markets, L.P., Jersey City, NJ 07310, USA
Department of Informatics, Athens University of Economics and Business, Athens 10434, Greece
Division of Applied Mathematics and Department of Neuroscience, Brown University, Providence, RI 02912, USA
Author to whom correspondence should be addressed.
Entropy 2008, 10(2), 71-99
Received: 6 March 2008 / Revised: 9 June 2008 / Accepted: 17 June 2008 / Published: 17 June 2008


Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator.

METHODOLOGY: Three new entropy estimators are introduced; two new LZ-based estimators, and the “renewal entropy estimator,” which is tailored to data generated by a binary renewal process. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters in practice.

THEORY: We prove that, unlike their earlier versions, the two new LZ-based estimators are universally consistent, that is, they converge to the entropy rate for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state hidden Markov model (HMM) with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator.

SIMULATION: All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. The main conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. (iv) The main drawback of the plug-in method is its computational inefficiency; with small word-lengths it fails to detect longer-range structure in the data, and with longer word-lengths the empirical distribution is severely undersampled, leading to large biases.
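
To make the word-length trade-off of the plug-in method concrete, the following is a minimal sketch of a plug-in estimator for a binary sequence. It is not the authors' implementation: the function name `plugin_entropy_rate`, the use of overlapping words, and the choice of bits (base-2 logarithm) as the unit are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def plugin_entropy_rate(bits, word_length):
    """Plug-in estimate of the entropy rate, in bits per symbol.

    Assumption (not from the paper): words are taken as overlapping
    windows of `word_length` consecutive symbols.
    """
    n_words = len(bits) - word_length + 1
    if word_length < 1 or n_words < 1:
        raise ValueError("word_length must be between 1 and len(bits)")

    # Empirical distribution of the words of the given length.
    counts = Counter(
        tuple(bits[i:i + word_length]) for i in range(n_words)
    )

    # Entropy of the empirical word distribution (the "plug-in" step).
    block_entropy = -sum(
        (c / n_words) * log2(c / n_words) for c in counts.values()
    )

    # Normalize by the word length to get a per-symbol rate.
    return block_entropy / word_length


if __name__ == "__main__":
    # A deterministic alternating sequence: its true entropy rate is 0.
    alternating = [i % 2 for i in range(1000)]
    for w in (1, 2, 4, 8):
        print(w, plugin_entropy_rate(alternating, w))
```

On the alternating sequence, word length 1 sees only a balanced mix of zeros and ones and reports 1 bit per symbol, even though the sequence is fully predictable. The estimate shrinks roughly like 1/w as the word length w grows, because only two distinct words of each length occur. For data with real randomness, increasing w also increases the number of possible words exponentially, which is the undersampling problem the abstract refers to.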
Keywords: Entropy estimation; Lempel-Ziv coding; Context-Tree-Weighting; simulation; spike trains.

This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).

MDPI and ACS Style

Gao, Y.; Kontoyiannis, I.; Bienenstock, E. Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study. Entropy 2008, 10, 71-99.

Entropy, EISSN 1099-4300, published by MDPI AG, Basel, Switzerland