# A Super Fast Algorithm for Estimating Sample Entropy


## Abstract


## 1. Introduction

## 2. Sample Entropy via Monte Carlo Sampling

**Algorithm 1** Direct method for range counting

**Require:** Sequence $\mathbf{u}:=({u}_{i}:i\in {\mathbb{Z}}_{N+m})$, subset $\mathbf{s}\subset {\mathbb{Z}}_{N}$, template length $m$, and threshold $r$.

1. **procedure** DirectRangeCounting($\mathbf{u},\mathbf{s},m,r$)
2. Set $count=0$
3. Set $L=\#\mathbf{s}$
4. **for** $i=1$ **to** $L$ **do**
5. Set $\mathbf{a}=[{u}_{{s}_{i}+l-1}:l\in {\mathbb{Z}}_{m}]$
6. **for** $j=i+1$ **to** $L$ **do**
7. Set $\mathbf{b}=[{u}_{{s}_{j}+l-1}:l\in {\mathbb{Z}}_{m}]$
8. **if** $\rho (\mathbf{a}-\mathbf{b})\le r$ **then**
9. $count=count+1$
10. **return** $count$
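Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `direct_range_counting` is ours, indices are 0-based rather than the 1-based indexing of the pseudocode, and $\rho$ is taken to be the Chebyshev (maximum) norm, the standard choice for sample entropy.

```python
import numpy as np

def direct_range_counting(u, s, m, r):
    """Count pairs of m-length templates of u, taken at the 0-based start
    indices in s, whose Chebyshev distance is at most r (Algorithm 1 sketch)."""
    count = 0
    L = len(s)
    for i in range(L):
        a = u[s[i]:s[i] + m]            # template starting at s[i]
        for j in range(i + 1, L):
            b = u[s[j]:s[j] + m]        # template starting at s[j]
            if np.max(np.abs(a - b)) <= r:   # rho: Chebyshev (max) norm
                count += 1
    return count
```

Since only unordered pairs with $j>i$ are visited, the cost is $\binom{L}{2}$ distance evaluations for a subset of size $L$.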

**Algorithm 2** Monte-Carlo-based algorithm for evaluating sample entropy

**Require:** Sequence $\mathbf{u}=({u}_{i}:i\in {\mathbb{Z}}_{N+m})$, template length $m$, tolerance $r\in \mathbb{R}$, sample size ${N}_{0}$, number of experiments ${N}_{1}$, and probability space $\{\mathrm{\Omega},\mathcal{F},P\}$.

1. **procedure** MCSampEn($\mathbf{u},m,r,{N}_{0},{N}_{1}$)
2. Set ${\overline{A}}_{{N}_{1}}=0$ and ${\overline{B}}_{{N}_{1}}=0$
3. **for** $k=1$ **to** ${N}_{1}$ **do**
4. Select ${\mathbf{s}}^{(k)}\in \mathrm{\Omega}$ randomly, with uniform distribution
5. Compute $\tilde{A}({\mathbf{s}}^{(k)})$ by calling DirectRangeCounting($\mathbf{u},{\mathbf{s}}^{(k)},m,r$)
6. Compute $\tilde{B}({\mathbf{s}}^{(k)})$ by calling DirectRangeCounting($\mathbf{u},{\mathbf{s}}^{(k)},m+1,r$)
7. ${\overline{A}}_{{N}_{1}}={\overline{A}}_{{N}_{1}}+\frac{1}{{N}_{1}}\tilde{A}({\mathbf{s}}^{(k)})$
8. ${\overline{B}}_{{N}_{1}}={\overline{B}}_{{N}_{1}}+\frac{1}{{N}_{1}}\tilde{B}({\mathbf{s}}^{(k)})$
9. $entropy=-\log \frac{{\overline{B}}_{{N}_{1}}}{{\overline{A}}_{{N}_{1}}}$
10. **return** $entropy$
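A self-contained Python sketch of Algorithm 2 follows. Function names, the 0-based indexing, and the seeding interface are ours; the pair counting assumes the Chebyshev norm for $\rho$, and each random subset is drawn uniformly without replacement from the valid template start positions.

```python
import numpy as np

def _count_pairs(u, s, m, r):
    # Chebyshev-distance template matching (Algorithm 1), 0-based indices.
    count = 0
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            if np.max(np.abs(u[s[i]:s[i] + m] - u[s[j]:s[j] + m])) <= r:
                count += 1
    return count

def mc_samp_en(u, m, r, n0, n1, seed=None):
    """Monte-Carlo sample-entropy estimate (Algorithm 2 sketch).

    Draws n1 uniform random index subsets of size n0, averages the
    template-match counts for lengths m and m + 1, and returns -log(B/A).
    """
    rng = np.random.default_rng(seed)
    N = len(u) - m                    # valid start indices for (m+1)-templates
    a_bar = b_bar = 0.0
    for _ in range(n1):
        s = rng.choice(N, size=n0, replace=False)
        a_bar += _count_pairs(u, s, m, r) / n1       # m-template matches
        b_bar += _count_pairs(u, s, m + 1, r) / n1   # (m+1)-template matches
    return -np.log(b_bar / a_bar)
```

Because a pair matching at length $m+1$ also matches at length $m$, we always have ${\overline{B}}_{{N}_{1}}\le {\overline{A}}_{{N}_{1}}$, so the returned estimate is nonnegative whenever ${\overline{B}}_{{N}_{1}}>0$.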

**Theorem 1.**

**Theorem 2.**

**Proof.**

## 3. Error Analysis

**Theorem 3.**

**Theorem 4.**

**Theorem 5.**

## 4. Experiments

- **Long-Term AF Database (ltafdb)** [23]. This database includes 84 long-term ECG recordings of subjects with paroxysmal or sustained atrial fibrillation (AF). Each record contains two simultaneously recorded ECG signals digitized at 128 Hz with 12-bit resolution over a 20 mV range; record durations vary but are typically 24 to 25 h.
- **Long-Term ST Database (ltstdb)** [24]. This database contains 86 lengthy ECG recordings of 80 human subjects, chosen to exhibit a variety of events of ST segment changes, including ischemic ST episodes, axis-related non-ischemic ST episodes, episodes of slow ST level drift, and episodes containing mixtures of these phenomena.
- **MIT-BIH Long-Term ECG Database (ltecg)** [19]. This database contains 7 long-term ECG recordings (14 to 22 h each), with manually reviewed beat annotations.
- **BIDMC Congestive Heart Failure Database (chfdb)** [25]. This database includes long-term ECG recordings from 15 subjects (11 men, aged 22 to 71, and 4 women, aged 54 to 63) with severe congestive heart failure (NYHA class 3–4).
- **MGH/MF Waveform Database (mghdb)** [26]. The Massachusetts General Hospital/Marquette Foundation (MGH/MF) Waveform Database is a comprehensive collection of electronic recordings of hemodynamic and electrocardiographic waveforms of stable and unstable patients in critical care units, operating rooms, and cardiac catheterization laboratories. Note that only the ECG records were considered in our experiments.
- **RR Interval Time Series (RR).** The RR interval time series are derived from healthy subjects (RR/Health) and from subjects with heart failure (RR/CHF) and atrial fibrillation (RR/AF).
- **CHB-MIT Scalp EEG Database (chbmit)** [27]. This database contains EEG records of pediatric subjects with intractable seizures. The records are collected from 22 subjects, monitored for up to several days.
- **Gearbox Database (gearbox)** [20]. The gearbox dataset was introduced in [20] and was published on https://github.com/cathysiyu/Mechanical-datasets (accessed on 27 March 2022).
- **Meteorological Database (MD)** [22]. The meteorological database used in this section records the hourly weather data from the past 70 years in the Netherlands.

#### 4.1. Approximation Accuracy

To assess approximation accuracy, each experiment was repeated 50 times; we report the mean errors **(MeanErr)** and the root mean squared errors **(RMeanSqErr)** of the 50 outcomes.
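The two error metrics can be computed from the repeated estimates as below. The helper name `error_stats` is ours, and the reference value is assumed to be the exact sample entropy of the full sequence (e.g., from the direct $O(N^2)$ computation).

```python
import numpy as np

def error_stats(estimates, reference):
    """MeanErr: average signed deviation of repeated Monte-Carlo estimates
    from the exact value. RMeanSqErr: root mean squared deviation."""
    e = np.asarray(estimates, dtype=float) - reference
    return float(np.mean(e)), float(np.sqrt(np.mean(e ** 2)))
```

MeanErr reveals systematic bias, while RMeanSqErr also captures the spread of the estimator across the 50 runs.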

#### 4.2. Time Complexity

**S1**: Choose ${N}_{0}$ and ${N}_{1}$ to be independent of $N$, for example ${N}_{0}=2\times {10}^{3}$ and ${N}_{1}=150$.
**S2**: Choose ${N}_{0}=\max \{1024,\lfloor \sqrt{N}\rfloor \}$ and ${N}_{1}=\min \left\{5+{\log}_{2}N,\lfloor N/{N}_{0}\rfloor \right\}$, depending on $N$.
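The two parameter strategies translate directly into small helper functions; the names `choose_params_s1` and `choose_params_s2` are ours, introduced only for illustration.

```python
import math

def choose_params_s1():
    # Strategy S1: N0 and N1 fixed, independent of the data length N.
    return 2 * 10 ** 3, 150

def choose_params_s2(N):
    # Strategy S2: N0 = max{1024, floor(sqrt(N))},
    #              N1 = min{5 + log2(N), floor(N / N0)}.
    n0 = max(1024, math.isqrt(N))
    n1 = min(5 + math.log2(N), N // n0)
    return n0, int(n1)
```

Under S2, the total work ${N}_{1}\binom{{N}_{0}}{2}$ grows roughly like $N\log N$, which is what makes the method fast on long sequences.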

#### 4.3. Memory Usage

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Appendix A

**Theorem A1.**

**Lemma A1.**

**Proof.**

**The proof for Theorem 3 is shown as follows.**

**Proof.**

**Lemma A2.**

**Proof.**

**Lemma A3.**

**Proof.**

**Lemma A4.**

**Proof.**

**The proof for Theorem 4 is shown as follows.**

**Proof.**

**Theorem A2.**

**Lemma A5.**

**Proof.**

**Lemma A6.**

**Proof.**

**The proof of Theorem 5 is provided as follows.**

**Proof.**

## References

1. Pincus, S.M. Approximate entropy as a measure of system complexity. Proc. Natl. Acad. Sci. USA **1991**, 88, 2297–2301.
2. Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. **2000**, 278, 2039–2049.
3. Costa, M.; Goldberger, A.L.; Peng, C.-K. Multiscale entropy analysis of complex physiologic time series. Phys. Rev. Lett. **2002**, 89, 068102.
4. Jiang, Y.; Peng, C.-K.; Xu, Y. Hierarchical entropy analysis for biological signals. J. Comp. Appl. Math. **2011**, 236, 728–742.
5. Li, Y.; Li, G.; Yang, Y.; Liang, X.; Xu, M. A fault diagnosis scheme for planetary gearboxes using adaptive multi-scale morphology filter and modified hierarchical permutation entropy. Mech. Syst. Signal Proc. **2017**, 105, 319–337.
6. Yang, C.; Jia, M. Hierarchical multiscale permutation entropy-based feature extraction and fuzzy support tensor machine with pinball loss for bearing fault identification. Mech. Syst. Signal Proc. **2021**, 149, 107182.
7. Li, W.; Shen, X.; Li, Y. A comparative study of multiscale sample entropy and hierarchical entropy and its application in feature extraction for ship-radiated noise. Entropy **2019**, 21, 793.
8. Jiang, Y.; Mao, D.; Xu, Y. A fast algorithm for computing sample entropy. Adv. Adapt. Data Anal. **2011**, 3, 167–186.
9. Mao, D. Biological Time Series Classification via Reproducing Kernels and Sample Entropy. Ph.D. Dissertation, Syracuse University, Syracuse, NY, USA, August 2008.
10. Grassberger, P. An optimized box-assisted algorithm for fractal dimensions. Phys. Lett. A **1990**, 148, 63–68.
11. Theiler, J. Efficient algorithm for estimating the correlation dimension from a set of discrete points. Phys. Rev. A Gen. Phys. **1987**, 36, 4456–4462.
12. Manis, G. Fast computation of approximate entropy. Comput. Meth. Prog. Biomed. **2008**, 91, 48–54.
13. Manis, G.; Aktaruzzaman, M.; Sassi, R. Low computational cost for sample entropy. Entropy **2018**, 20, 61.
14. Wang, Y.H.; Chen, I.Y.; Chiueh, H.; Liang, S.F. A low-cost implementation of sample entropy in wearable embedded systems: An example of online analysis for sleep EEG. IEEE Trans. Instrum. Meas. **2021**, 70, 9312616.
15. Tomčala, J. New fast ApEn and SampEn entropy algorithms implementation and their application to supercomputer power consumption. Entropy **2020**, 22, 863.
16. Shekelyan, M.; Cormode, G. Sequential random sampling revisited: Hidden shuffle method. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, Virtually Held, 13–15 April 2021; pp. 3628–3636.
17. Karr, A.F. Probability; Springer: New York, NY, USA, 1993.
18. Luzia, N. A simple proof of the strong law of large numbers with rates. Bull. Aust. Math. Soc. **2018**, 97, 513–517.
19. Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.-K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation **2000**, 101, 215–220.
20. Shao, S.; McAleer, S.; Yan, R.; Baldi, P. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans. Ind. Inform. **2019**, 15, 2446–2455.
21. Case Western Reserve University Bearing Data Center. Available online: https://engineering.case.edu/bearingdatacenter (accessed on 27 March 2022).
22. Royal Netherlands Meteorological Institute. Available online: https://www.knmi.nl/nederland-nu/klimatologie/uurgegevens (accessed on 27 March 2022).
23. Petrutiu, S.; Sahakian, A.V.; Swiryn, S. Abrupt changes in fibrillatory wave characteristics at the termination of paroxysmal atrial fibrillation in humans. Europace **2007**, 9, 466–470.
24. Jager, F.; Taddei, A.; Moody, G.B.; Emdin, M.; Antolič, G.; Dorn, R.; Smrdel, A.; Marchesi, C.; Mark, R.G. Long-term ST database: A reference for the development and evaluation of automated ischaemia detectors and for the study of the dynamics of myocardial ischaemia. Med. Biol. Eng. Comput. **2003**, 41, 172–182.
25. Baim, D.S.; Colucci, W.S.; Monrad, E.S.; Smith, H.S.; Wright, R.F.; Lanoue, A.; Gauthier, D.F.; Ransil, B.J.; Grossman, W.; Braunwald, E. Survival of patients with severe congestive heart failure treated with oral milrinone. J. Am. Coll. Cardiol. **1986**, 7, 661–670.
26. Welch, J.; Ford, P.; Teplick, R.; Rubsamen, R. The Massachusetts General Hospital–Marquette Foundation hemodynamic and electrocardiographic database: Comprehensive collection of critical care waveforms. Clin. Monit. **1991**, 7, 96–97.
27. Shoeb, A.H. Application of Machine Learning to Epileptic Seizure Onset Detection and Treatment. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, September 2009.
28. Silva, L.E.V.; Filho, A.C.S.S.; Fazan, V.P.S.; Felipe, J.C.; Junior, L.O.M. Two-dimensional sample entropy: Assessing image texture through irregularity. Biomed. Phys. Eng. Expr. **2016**, 2, 045002.
29. DeGroot, M.H.; Schervish, M.J. Probability and Statistics, 4th ed.; Pearson Education: New York, NY, USA, 2012.

**Figure 2.** The values of $MeanErr$ and $RMeanSqErr$ for time series “mghdb/mgh001” with respect to the sample size ${N}_{0}$ and the number of computations ${N}_{1}$, where parameters $r=0.15$ and $m=4,5$. (**a**) $MeanErr$ with $m=4$. (**b**) $RMeanSqErr$ with $m=4$. (**c**) $MeanErr$ with $m=5$. (**d**) $RMeanSqErr$ with $m=5$.

**Figure 3.** The values of $MeanErr$ and $RMeanSqErr$ with respect to ${N}_{0}\in \{200i:i\in {\mathbb{Z}}_{20}^{+}\}$ and ${N}_{1}=150$, where parameters $r=0.15$ and $m=4,5$. (**a**) $MeanErr$ with $m=4$. (**b**) $RMeanSqErr$ with $m=4$. (**c**) $MeanErr$ with $m=5$. (**d**) $RMeanSqErr$ with $m=5$.

**Figure 4.** The values of $MeanErr$ and $RMeanSqErr$ with respect to ${N}_{0}=2\times {10}^{3}$ and ${N}_{1}\in \{10i:i\in {\mathbb{Z}}_{25}^{+}\}$, where parameters $r=0.15$ and $m=4,5$. (**a**) $MeanErr$ with $m=4$. (**b**) $RMeanSqErr$ with $m=4$. (**c**) $MeanErr$ with $m=5$. (**d**) $RMeanSqErr$ with $m=5$.

**Figure 5.** The values of $RMeanSqErr$ with respect to $p$, where parameters $r=0.15$, $m=4,5$, $N={2}^{20}$, ${N}_{0}=2000$, and ${N}_{1}=150$.

**Figure 6.** The left column shows computational time versus data length $N$ on different signals. In the right column, the values of $RMeanSqErr$ are presented by error bars “I”: the larger the value of $RMeanSqErr$, the longer the error bar. In this figure, we set $m=4,5$, ${N}_{0}=2\times {10}^{3}$, and ${N}_{1}=150$. (**a**) Time for “ltafdb/00”. (**b**) Sample entropy for “ltafdb/00”. (**c**) Time for $1/f$ noise. (**d**) Sample entropy for $1/f$ noise. (**e**) Time for “chbmit/chb07_01”. (**f**) Sample entropy for “chbmit/chb07_01”. (**g**) Time for “ltecg/14046”. (**h**) Sample entropy for “ltecg/14046”.

**Figure 7.** The left column shows computational time versus data length $N$ on different signals. The right column shows the values of $RMeanSqErr$ by error bars “I”: the larger the value of $RMeanSqErr$, the longer the error bar. In this figure, we set $m=4,5$, ${N}_{0}=\max \{1024,\lfloor \sqrt{N}\rfloor \}$, and ${N}_{1}=\max \{1,\lfloor N/{N}_{0}\rfloor \}$. (**a**) Time for “ltafdb/00”. (**b**) Sample entropy for “ltafdb/00”. (**c**) Time for $1/f$ noise. (**d**) Sample entropy for $1/f$ noise. (**e**) Time for “chbmit/chb07_01”. (**f**) Sample entropy for “chbmit/chb07_01”. (**g**) Time for “ltecg/14046”. (**h**) Sample entropy for “ltecg/14046”.

**Figure 8.** The results of computational time with respect to $p$, where $r=0.15$, $m=4,5$, and $N={2}^{20}$; ${N}_{0}$ and ${N}_{1}$ are selected such that the relative error satisfies $RMeanSqErr/SampEn\le 0.02$.

**Figure 9.** The results of memory usage versus data length $N$ with $m=4,5$. (**a**) Memory usage for $m=4$. (**b**) Memory usage for $m=5$.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Liu, W.; Jiang, Y.; Xu, Y.
A Super Fast Algorithm for Estimating Sample Entropy. *Entropy* **2022**, *24*, 524.
https://doi.org/10.3390/e24040524

**AMA Style**

Liu W, Jiang Y, Xu Y.
A Super Fast Algorithm for Estimating Sample Entropy. *Entropy*. 2022; 24(4):524.
https://doi.org/10.3390/e24040524

**Chicago/Turabian Style**

Liu, Weifeng, Ying Jiang, and Yuesheng Xu.
2022. "A Super Fast Algorithm for Estimating Sample Entropy" *Entropy* 24, no. 4: 524.
https://doi.org/10.3390/e24040524