# Small Order Patterns in Big Time Series: A Practical Guide

## Abstract

## 1. Order Patterns Fit Big Data

#### 1.1. The Need for New Methods

- Basic methods should be simple and transparent.
- Few assumptions should be made on the underlying process.
- Algorithms should be resilient with respect to outliers and artifacts.
- Computations should be very fast.

#### 1.2. Contents of the Paper

#### 1.3. A Typical Example

## 2. Pattern Frequencies

#### 2.1. Basic Idea

#### 2.2. Stationarity

#### 2.3. Calculation of Pattern Frequencies

## 3. Key Concepts and Viewpoints

#### 3.1. Permutation Entropy and ${\Delta}^{2}$

#### 3.2. Order Correlation Functions

#### 3.3. Relative Order Correlation Functions

#### 3.4. Two Types of Data

## 4. First Examples: Weather Data

#### 4.1. The Data

#### 4.2. Autocorrelation and Persistence

#### 4.3. $\beta ,\gamma ,$ and $\delta $

## 5. Properties of Correlation Functions

#### 5.1. Two Pattern Identities

#### 5.2. Marginal Errors

#### 5.3. Classical Autocorrelation

#### 5.4. Interpretation of $\beta ,\gamma ,\delta $

#### 5.5. Persistence and Turning Rate

#### 5.6. Symmetries of Order Functions

#### 5.7. Periodicities

#### 5.8. The Decomposition Theorem

**Theorem 1.** For a process with stationary increments and an arbitrary lag $d$, the quadratic distance $\Delta^{2}$ of the pattern probabilities to the uniform pattern frequencies $\frac{1}{6}$ of white noise, defined in (2), has the following representation:
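
Under the usual definitions of the four order functions in terms of the six pattern probabilities of length 3 (these definitions are not restated at this point, so the ones used in the code are assumptions, as are all function names), the representation reads $4\Delta^{2} = 3\tau^{2} + 2\beta^{2} + \gamma^{2} + \delta^{2}$. The following Python sketch checks this form numerically: exactly for probabilities that satisfy the stationarity constraint $p_{132}+p_{231}=p_{213}+p_{312}$, and up to a small boundary effect for frequencies estimated from a simulated random walk.

```python
import numpy as np

# Rank notation (assumed): 132 means x_t < x_{t+2d} < x_{t+d}.
PATTERNS = (123, 132, 213, 231, 312, 321)

def pattern_probs(x, d=1):
    """Relative frequencies of the six order patterns of (x_t, x_{t+d}, x_{t+2d}).
    Triples containing ties are skipped (an assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    a, b, c = x[:-2 * d], x[d:-d], x[2 * d:]
    keep = (a != b) & (b != c) & (a != c)
    a, b, c = a[keep], b[keep], c[keep]
    # Rank of each value inside its triple (1 = smallest, 3 = largest).
    codes = (100 * (1 + (a > b) + (a > c))
             + 10 * (1 + (b > a) + (b > c))
             + (1 + (c > a) + (c > b)))
    return {p: float(np.mean(codes == p)) for p in PATTERNS}

def decomposition_gap(p):
    """4*Delta^2 - (3*tau^2 + 2*beta^2 + gamma^2 + delta^2) for pattern probabilities p."""
    tau = p[123] + p[321] - 1 / 3                    # persistence (assumed definition)
    beta = p[123] - p[321]                           # up-down balance (assumed definition)
    gamma = p[213] + p[231] - p[132] - p[312]        # rotational asymmetry (assumed definition)
    delta = p[132] + p[213] - p[231] - p[312]        # up-down scaling (assumed definition)
    delta2 = sum((q - 1 / 6) ** 2 for q in p.values())
    return 4 * delta2 - (3 * tau**2 + 2 * beta**2 + gamma**2 + delta**2)

rng = np.random.default_rng(0)

# Exact probabilities obeying p132 + p231 = p213 + p312.
p123, p321, rest = rng.dirichlet([1.0, 1.0, 1.0])
w1, w2 = rng.random(2)
half = rest / 2
exact = {123: p123, 321: p321,
         132: half * w1, 231: half * (1 - w1),
         213: half * w2, 312: half * (1 - w2)}
exact_gap = decomposition_gap(exact)

# Estimated frequencies from a Gaussian random walk (stationary increments).
walk = np.cumsum(rng.standard_normal(5000))
sample_gap = decomposition_gap(pattern_probs(walk, d=1))

print(f"gap for exact probabilities: {exact_gap:.2e}")
print(f"gap for estimated frequencies: {sample_gap:.2e}")
```

For estimated frequencies, the gap in this sketch equals the squared mismatch between the number of ups counted on the first and on the second step of the triples. Only boundary terms can cause that mismatch, so it shrinks as the series gets longer.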

**Proof.**

## 6. Case Study: Speech and Music

#### Sliding Windows

## 7. Case Study: Tides

#### 7.1. The Data

#### 7.2. Order Correlation

#### 7.3. Relative Order Correlation

## 8. Case Study: Particulates

#### 8.1. The Data

#### 8.2. Sliding Windows Analysis

## 9. Brain and Heart Signals

#### 9.1. The Data

#### 9.2. Sleep Stages

## 10. Conclusions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Amigo, J.; Keller, K.; Kurths, J. Recent progress in symbolic dynamics and permutation complexity. Ten years of permutation entropy. Eur. Phys. J. Spec. Top. **2013**, 222, 247–257.
2. Amigo, J.M. Permutation Complexity in Dynamical Systems; Springer Series in Synergetics; Springer: Berlin, Germany, 2010.
3. Zanin, M.; Zunino, L.; Rosso, O.; Papo, D. Permutation entropy and its main biomedical and econophysics applications: A review. Entropy **2012**, 14, 1553–1577.
4. Carpi, L.C.; Saco, P.M.; Rosso, O.A. Missing ordinal patterns in correlated noises. Phys. A **2010**, 389, 2020–2029.
5. Martinez, J.H.; Herrera-Diestra, J.L.; Chavez, M. Detection of time reversibility in time series by ordinal patterns analysis. Chaos **2018**, 28, 123111.
6. Zanin, M.; Rodríguez-González, A.; Menasalvas Ruiz, E.; Papo, D. Assessing time series reversibility through permutation patterns. Entropy **2018**, 20, 665.
7. Parlitz, U.; Berg, S.; Luther, S.; Schirdewan, A.; Kurths, J.; Wessel, N. Classifying cardiac biosignals using ordinal pattern statistics and symbolic dynamics. Comput. Biol. Med. **2012**, 42, 319–327.
8. McCullough, M.; Small, M.; Iu, H.; Stemler, T. Multiscale ordinal network analysis of human cardiac dynamics. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. **2017**, 375, 20160292.
9. Bandt, C.; Shiha, F. Order patterns in time series. J. Time Ser. Anal. **2007**, 28, 646–665.
10. Bandt, C. Permutation entropy and order patterns in long time series. In Time Series Analysis and Forecasting; Rojas, I., Pomares, H., Eds.; Contributions to Statistics; Springer: Berlin, Germany, 2015.
11. Bandt, C. A new kind of permutation entropy used to classify sleep stages from invisible EEG microstructure. Entropy **2017**, 19, 197.
12. Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett. **2001**, 88, 174102.
13. Bandt, C. Autocorrelation type functions for big and dirty data series. arXiv **2014**, arXiv:1411.3904.
14. Rosso, O.; Larrondo, H.; Martin, M.T.; Plastino, A.; Fuentes, M. Distinguishing Noise from Chaos. Phys. Rev. Lett. **2007**, 99, 154102.
15. López-Ruiz, R.; Nagy, Á.; Romera, E.; Sañudo, J. A generalized statistical complexity measure: Applications to quantum systems. J. Math. Phys. **2009**, 50, 123528.
16. Deutscher Wetterdienst. Climate Data Center. Available online: ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany (accessed on 20 May 2019).
17. Brockwell, P.; Davies, R. Time Series, Theory and Methods, 2nd ed.; Springer: New York, NY, USA, 1991.
18. Shumway, R.; Stoffer, D. Time Series Analysis and Its Applications, 2nd ed.; Springer: New York, NY, USA, 2006.
19. Bandt, C. Crude EEG parameter provides sleep medicine with well-defined continuous hypnograms. arXiv **2017**, arXiv:1710.00559.
20. Ferguson, S.; Genest, C.; Hallin, M. Kendall’s tau for serial dependence. Can. J. Stat. **2000**, 28, 587–604.
21. National Oceanic and Atmospheric Administration. National Water Level Observation Network. Available online: https://www.tidesandcurrents.noaa.gov/nwlon.html (accessed on 20 May 2019).
22. California Air Resources Board. Available online: www.arb.ca.gov/adam (accessed on 20 May 2019).
23. Terzano, M.; Parrino, L.; Sherieri, A.; Chervin, R.; Chokroverty, S.; Guilleminault, C.; Hirshkowitz, M.; Mahowald, M.; Moldofsky, H.; Rosa, A.; et al. Atlas, rules, and recording techniques for the scoring of cyclic alternating pattern (CAP) in human sleep. Sleep Med. **2001**, 2, 537–553.
24. Goldberger, A.; Amaral, L.; Glass, L.; Hausdorff, J.; Ivanov, P.; Mark, R.; Mietus, J.; Moody, G.; Peng, C.K.; Stanley, H. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation **2000**, 101, e215–e220.

**Figure 1.** Our goal is to find structure in big data series. These are respiration data of a healthy volunteer, measured 50 times per second during 24 h of normal life. (**a**) One minute of clean data. (**b**) Mean flow for intervals of 30 s. (**c**) Order correlation function $\tilde{\tau}(d)$ for each minute. While Panel b does not provide much information, our method in Panel c shows the differences and tendencies of respiration during the activity phases and sleep stages. Data from a cooperation with Achim Beule.

**Figure 4.** Temperature in $^{\circ}$C for the first 10 days in January and July 1978, for 100 and 1000 days.

**Figure 5.** Relative humidity in % for the first 10 days in January and July 1978, for 100 and 1000 days.

**Figure 6.** Autocorrelation and persistence for temperature (left part) and relative humidity (right part). The curves correspond to 35 consecutive years. The lag $d$ runs from 1 to 49 h. Each row describes a two-month period, from January/February up to November/December.

**Figure 7.** The functions $\beta$, $\gamma$, and $\delta$ for $d=1,\dots,24$ h, for July–August of 35 consecutive years. The upper row corresponds to temperature and the bottom row to relative humidity. Although there is considerable variation over the years, some common structures can be seen.

**Figure 8.** Twelve seconds of the song “Hey Jude” by The Beatles. (**a**) The signal: mean of the absolute amplitude over non-overlapping windows of 50 ms. (**b**) The noisy places $(x,d)$ for which $T\Delta^{2}<15$, drawn in black. The vertical axis represents the lag $d=1,\dots,30$, considered as the wavelength, which ranges from 0 to 7 ms. Each column of the matrix corresponds to one window $x$.
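
The thresholding described in this caption can be sketched in Python. The sketch assumes that $T$ is the number of pattern triples available in a window and that triples with ties are skipped; neither is stated in the caption. The function names, the synthetic test signal, and the reduced lag range are illustrative only.

```python
import numpy as np

def delta_squared(x, d):
    """Squared distance of the six pattern frequencies at lag d from 1/6.
    Triples with ties are skipped (an assumption of this sketch)."""
    a, b, c = x[:-2 * d], x[d:-d], x[2 * d:]
    keep = (a != b) & (b != c) & (a != c)
    a, b, c = a[keep], b[keep], c[keep]
    if a.size == 0:
        return float("nan")
    # Encode the ranks inside each triple as a three-digit pattern code.
    codes = (100 * (1 + (a > b) + (a > c))
             + 10 * (1 + (b > a) + (b > c))
             + (1 + (c > a) + (c > b)))
    freqs = np.array([np.mean(codes == p) for p in (123, 132, 213, 231, 312, 321)])
    return float(np.sum((freqs - 1 / 6) ** 2))

def noise_map(signal, window, max_lag, threshold=15.0):
    """Boolean matrix: rows are lags d = 1..max_lag, columns are disjoint windows x.
    An entry is True where T * Delta^2 < threshold, i.e. the place looks like noise."""
    signal = np.asarray(signal, dtype=float)
    n_windows = signal.size // window
    flags = np.zeros((max_lag, n_windows), dtype=bool)
    for w in range(n_windows):
        segment = signal[w * window:(w + 1) * window]
        for d in range(1, max_lag + 1):
            T = segment.size - 2 * d  # assumed meaning of T: number of triples
            flags[d - 1, w] = T * delta_squared(segment, d) < threshold
    return flags

# Synthetic example: two windows of white noise followed by two windows of a pure tone.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4000)
tone = np.sin(2 * np.pi * np.arange(4000) / 97.3)
signal = np.concatenate([noise, tone])

flags = noise_map(signal, window=2000, max_lag=10)
print(flags.astype(int))
```

In this synthetic example the white-noise windows are flagged and the pure-tone windows are not. For a tone, lags close to one third of the period give nearly uniform pattern frequencies, so such places can be flagged as noisy even though the signal is regular; the lag range above was chosen to stay well below that value.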

**Figure 9.** Correlogram (upper panel) and persistence (middle panel) of 12 s of “Hey Jude”. The scale of $d$ was reversed and written as frequencies so that the melody can be read like musical notes. The bottom panel shows $\tilde{\tau}$, the percentage of $\Delta^{2}$ that is due to persistence.

**Figure 10.** Detail from Figure 9. The vowels of “Jude”, “don’t”, and the second syllable in “better” provide stationary parts of the time series lasting for 0.3, 0.1, and 0.3 s. Only 20 ms of each signal are shown in the top panel. Order correlation functions were calculated for six or, respectively, two disjoint windows of length 50 ms and drawn for one pitch period, which equaled 4.5 ms for all three sounds.

**Figure 11.** Tides form an almost deterministic process with periodicities on the scale of days, months, and years. Data from [21].

**Figure 12.** Water levels for three days in 2014, measured every six minutes at different stations in the U.S. Data taken from [21]; shifted and scaled for better visibility.

**Figure 13.** Correlation functions $\beta$ for one month in seven consecutive years, at the four places of Figure 12. For a given month, each ocean station has its specific $\beta$-profile.

**Figure 14.** Classical autocorrelation and order correlation functions for the station at Anchorage in January. Persistence reflects the diurnal rhythm. The asymmetry functions $\beta$, $\gamma$, $\delta$ all show a very specific structure, which remains stable through seven consecutive years. In contrast, $\rho$ does not seem to contain much structural information.

**Figure 15.** The division of $\Delta^{2}$ into the components described in Theorem 1, illustrated for the tides at Anchorage in the years 2013 and 2014 in sliding window analysis. From top to bottom, the four panels show $\tilde{\tau}$, which takes the largest part of $\Delta^{2}$, followed by $\tilde{\beta}$, $\tilde{\gamma}$, and $\tilde{\delta}$.

**Figure 16.** Particulate measurements are notoriously noisy. They show a weak daily and yearly rhythm, which can hardly be detected from the data. The PM10 measurements for Station 3215 Trona-Athol in San Bernardino, California, are from the public database [22].

**Figure 17.** Correlation functions for hourly particulate measurements in San Bernardino [22] in 2000–2011. The 12 curves correspond to years and show the consistency of the correlation structure over the years. The order correlation functions keep the structure from Days 1–6 better than autocorrelation.

**Figure 18.** Sliding window analysis of (**a**) persistence $\tau(d)$ and (**b**) up-down balance $\beta(d)$ for hourly particulate measurements in San Bernardino 1997–2011 [22]. The lag $d$ runs from 1 to 72, that is, three days, on the vertical axis. Overlapping windows of 50 days were used. Daily rhythm is present mainly in summer, in both $\tau$ and $\beta$.

**Figure 19.** Biomedical signals: 8 s of an electroencephalogram, an electrocardiogram, and a plethysmogram. Order correlation functions seem to apply to all of them.

**Figure 21.** Theorem 1 for the plethysmogram over a whole night. From top to bottom: $\tilde{\tau}$, $\tilde{\beta}$, $\tilde{\gamma}$, $\tilde{\delta}$. The bottom panel contains the error of the equation of Theorem 1 on a scale from 0 up to 1%. Data of Person n3 from Terzano et al. [23].

Function | Range | Min Assumed for | Max Assumed for |
---|---|---|---|
Autocorrelation $\rho$ | $[-1,1]$ | linear decreasing series | linear increasing series |
Spearman rank autocorr. | $[-1,1]$ | decreasing series | increasing series |
Up-down balance $\beta$ | $[-1,1]$ | decreasing series | increasing series |
Persistence $\tau$ | $[-\frac{1}{3},\frac{2}{3}]$ | alternating series | monotone series |
Turning rate $TR$ | $[0,1]$ | monotone series | alternating series |
Up-down scaling $\delta$ | $[-1,1]$ | $-(t+(-1)^{t})$ | $t+(-1)^{t}$ |
Rotational asymmetry $\gamma$ | $[-1,1]$ | $(-2)^{t}$ | $(-\frac{1}{2})^{t}$ |
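
The extremal series in this table can be tried out with a few lines of Python. The sketch below is not the author's code: the pattern-based definitions of $\tau$, $TR$, $\beta$, $\delta$, and $\gamma$ are written out as assumptions, and the function name is illustrative. The overall sign of $\gamma$ depends on the pattern notation, so the definition assumed here may differ in sign from the one behind the table; the sketch therefore only checks that the two geometric series reach the two opposite extremes.

```python
import numpy as np

def order_functions(x, d=1):
    """Order functions at lag d from the frequencies of the six patterns of
    (x_t, x_{t+d}, x_{t+2d}) in rank notation; triples with ties are skipped."""
    x = np.asarray(x, dtype=float)
    a, b, c = x[:-2 * d], x[d:-d], x[2 * d:]
    keep = (a != b) & (b != c) & (a != c)
    a, b, c = a[keep], b[keep], c[keep]
    codes = (100 * (1 + (a > b) + (a > c))
             + 10 * (1 + (b > a) + (b > c))
             + (1 + (c > a) + (c > b)))
    p = {k: float(np.mean(codes == k)) for k in (123, 132, 213, 231, 312, 321)}
    tau = p[123] + p[321] - 1 / 3                         # persistence (assumed definition)
    return {
        "tau": tau,
        "TR": 2 / 3 - tau,                                # turning rate (assumed definition)
        "beta": p[123] - p[321],                          # up-down balance
        "delta": p[132] + p[213] - p[231] - p[312],       # up-down scaling
        "gamma": p[213] + p[231] - p[132] - p[312],       # rotational asymmetry, sign assumed
    }

t = np.arange(40)
increasing = order_functions(t)
decreasing = order_functions(-t)
zigzag_up = order_functions(t + (-1.0) ** t)
zigzag_down = order_functions(-(t + (-1.0) ** t))
geometric_growing = order_functions((-2.0) ** t)
geometric_shrinking = order_functions((-0.5) ** t)

for name, f in [("increasing", increasing), ("decreasing", decreasing),
                ("t + (-1)^t", zigzag_up), ("-(t + (-1)^t)", zigzag_down),
                ("(-2)^t", geometric_growing), ("(-1/2)^t", geometric_shrinking)]:
    print(name, {k: round(v, 3) for k, v in f.items()})
```

The zigzag series $t+(-1)^{t}$ also serves as an alternating series: every triple has a turning point, so it attains the minimum of $\tau$ and the maximum of $TR$ at lag 1.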

Function | Time Reversal | Negative Function | Rotation |
---|---|---|---|
$\rho$, Spearman, $\tau$ | + | − | − |
$\beta$ and $\delta$ | − | − | + |
Rotational asymmetry $\gamma$ | − | + | − |
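
The rows for $\beta$, $\delta$, and $\gamma$, and the time-reversal entry for $\tau$, can be checked on any sample series, reading "+" as invariance and "−" as a sign change (an assumed reading). The Python sketch below does this for a simulated asymmetric series. The pattern-based definitions are assumptions, the function name is illustrative, and rotation is taken to mean time reversal combined with a sign flip of the values.

```python
import numpy as np

def order_functions(x, d=1):
    """Order functions at lag d from the frequencies of the six patterns of
    (x_t, x_{t+d}, x_{t+2d}) in rank notation; triples with ties are skipped."""
    x = np.asarray(x, dtype=float)
    a, b, c = x[:-2 * d], x[d:-d], x[2 * d:]
    keep = (a != b) & (b != c) & (a != c)
    a, b, c = a[keep], b[keep], c[keep]
    codes = (100 * (1 + (a > b) + (a > c))
             + 10 * (1 + (b > a) + (b > c))
             + (1 + (c > a) + (c > b)))
    p = {k: float(np.mean(codes == k)) for k in (123, 132, 213, 231, 312, 321)}
    return {
        "tau": p[123] + p[321] - 1 / 3,                   # persistence (assumed definition)
        "beta": p[123] - p[321],                          # up-down balance
        "delta": p[132] + p[213] - p[231] - p[312],       # up-down scaling
        "gamma": p[213] + p[231] - p[132] - p[312],       # rotational asymmetry
    }

# Asymmetric test series: random walk with skewed increments and a small upward drift.
rng = np.random.default_rng(1)
x = np.cumsum(rng.exponential(size=2000) - 0.8)

original = order_functions(x, d=3)
reversed_time = order_functions(x[::-1], d=3)     # time reversal
negated = order_functions(-x, d=3)                # negative function
rotated = order_functions(-x[::-1], d=3)          # rotation by 180 degrees

for name, f in [("original", original), ("time reversal", reversed_time),
                ("negative", negated), ("rotation", rotated)]:
    print(name, {k: round(v, 4) for k, v in f.items()})
```

Each transformation only permutes the six patterns within every triple, so the sign relations hold exactly for the estimated values and not just approximately.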

Function | Half Period $\frac{L}{2}$ | Period $L$ | Symmetry Type |
---|---|---|---|
$\rho$, Spearman | minimum | maximum | vertical line |
Persistence $\tau$ | minimum | bumped maximum | vertical line |
$\beta$, $\gamma$, and $\delta$ | zero | zero or discontinuity | symmetry center |

© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bandt, C. Small Order Patterns in Big Time Series: A Practical Guide. *Entropy* **2019**, *21*, 613.
https://doi.org/10.3390/e21060613
