*Entropy* **2018**, *20*(8), 612; doi:10.3390/e20080612

Article

Permutation Entropy Based on Non-Uniform Embedding

^{1} Department of Engineering Mechanics, Hohai University, Nanjing 210098, China

^{2} Center for Nonlinear Systems, Kaunas University of Technology, Studentu 50-146, LT-51368 Kaunas, Lithuania

^{*} Author to whom correspondence should be addressed.

Received: 11 July 2018 / Accepted: 8 August 2018 / Published: 17 August 2018

## Abstract

A novel visualization scheme for permutation entropy is presented in this paper. The proposed scheme is based on non-uniform attractor embedding of the investigated time series. A single digital image of permutation entropy is produced by averaging all possible planar projections of the permutation entropy measure in the multi-dimensional delay coordinate space. Computational experiments with artificially-generated and real-world time series are used to demonstrate the advantages of the proposed visualization scheme.

Keywords: permutation entropy; ordinal pattern; attractor embedding; multi-dimensional phase space

## 1. Introduction

Real-world time series and experimental data are usually contaminated with noise. System states of such time series are usually complex, nonstationary and difficult to identify. Computation of the complexity for a given time series helps to quantify the intricacy of the model that governs the evolution of that series.

Different well-known complexity measures can be used to describe a time series, including Lempel-Ziv complexity [1] and Kolmogorov complexity [2]. Chaotic features of a time series can be quantified by assessing its maximal Lyapunov exponent [3]. The complexity of a time series can be assessed by dimension estimation models (such as the information dimension [4] and the fractal dimension [5]); also by entropy assessment methods (such as Shannon entropy [6], approximate entropy [7], K-S entropy [8,9], sample entropy [10], conditional entropy [11], multiscale entropy [12]).

Arguably, the most powerful tool to measure the complexity of a time series is the permutation entropy (PE). Since its introduction in 2002 by Bandt and Pompe in their foundational paper [13], it has been successfully applied in a wide range of scientific areas and for a vast number of purposes [14]. The main advantages of PE are: (i) it is simple to use; (ii) calculations are fast; and (iii) it is robust against noise. PE is based on the permutation patterns or the order relations among values of a signal [15]. PE compares the order of neighboring relative values, rather than apportioning amplitudes according to different levels [15,16]. PE can be computed using fast PE algorithms [17]. These characteristics make PE an appealing tool used in a large number of real-world signal and image processing applications [18].

PE is based on the reconstruction of a one-dimensional time series into a D-dimensional space with the embedding delay $\tau $. The time series embedding scheme into the higher-dimensional space was first presented by Packard, Crutchfield, Farmer and Shaw [19,20]. This embedding scheme represents the optimal properties of a dynamical system if the embedding dimension D and the time delay $\tau $ (the difference between consecutive observations) are estimated adequately. Theoretical foundations by Takens [21], and expansions of his ideas by Sauer et al. [22], indicate that the embedding dimension of $D>2m+1$ (where m is the fractal dimension of the attractor) almost always ensures the reconstruction of the topology of the original attractor [20]. This surprising result states that time series output is sufficient to obtain complete information about hidden states of the dynamical system [23].

Commonly-used methods for finding the optimal time delay $\tau $ are the average mutual information method [24], the correlation sum method [25], the phase space expansion method [26,27] and the geometry-based method [28].

The selection of the optimal embedding dimension is usually based on the examination of some invariant on the reconstructed attractor. Usually, these invariants represent the dynamics of the system and are based on the geometrical properties of that system. An invariant value is computed by increasing the embedding dimension until that invariant value settles down [29]. Typical examples of such methods are the box-counting method [20,30], the correlation dimension method [25,31], the largest Lyapunov exponent method [32], the Kolmogorov–Sinai entropy method [33] and the false nearest neighbors (FNN) method [34,35].

The identification of the optimal embedding dimension D and the optimal time delay $\tau $ helps to reconstruct the attractor in the delay-coordinate space. However, it is well known that non-uniform embedding (when time delays are not equal) might lead to a better reconstruction of the attractor than uniform embedding (when all time delays are equal) [36,37]. The selection of the optimal vector of time delays for non-uniform embedding is, however, a difficult optimization problem that requires massive computational resources. Several approaches have been proposed to tackle this problem. The identification of a near-optimal vector of time delays employing genetic algorithms was proposed in [38,39]. Good near-optimal non-uniform embedding results were obtained by using the ant colony optimization algorithms reported in [40]. Non-uniform time series embedding with special target functions based on Fourier spectral analysis was presented in [41,42].

The main objective of this paper is to employ non-uniform time series embedding for the construction of a visualization scheme for PE. The optimal embedding dimension and the set of optimal time lags are used to design a computational algorithm for plotting PE as a single surface. The paper is structured as follows. The normalized PE and the target function used to identify optimal time delays are introduced in Section 2. The proposed visualization scheme for PE is introduced in Section 3. The results of computational experiments with the sine wave, the Rössler time series and a real-world time series are discussed in Section 4. A discussion and concluding remarks are given in the last section.

## 2. Preliminaries

#### 2.1. Permutation Entropy

For a given time series $\{{x}_{1},{x}_{2},\dots ,{x}_{N}\}$, uniform embedding yields a trajectory matrix:

$${Y}_{t}=\{{x}_{t},{x}_{t+\tau},{x}_{t+2\tau},\dots ,{x}_{t+(D-1)\tau}\},\phantom{\rule{0.277778em}{0ex}}\phantom{\rule{0.277778em}{0ex}}t=1,2,\dots ,N-(D-1)\tau ,\tag{1}$$

where D is the embedding dimension and $\tau $ is the time delay. PE quantifies the statistics of ordinal permutations in the rows of the trajectory matrix [29].

For example, the sequence $\{3,7,6\}$ has the ordinal pattern $\pi =1;3;2$, since ${x}_{1}\le {x}_{3}\le {x}_{2}$. The ordinal pattern of the sequence $\{6,3,7\}$ is $\pi =2;1;3$. In general, there are $D!$ possible ordinal patterns, which represent all unique orderings (permutations ${\pi}_{i},i=1,2,\dots ,D!$). The relative frequency with which each pattern occurs in the trajectory matrix is defined as follows:

$${P}_{i}=\frac{f\left({\pi}_{i}\right)}{N-(D-1)\tau},\phantom{\rule{0.277778em}{0ex}}\phantom{\rule{0.277778em}{0ex}}i=1,2,\dots ,D!,\tag{2}$$

where $f\left({\pi}_{i}\right)$ represents the number of occurrences of the pattern ${\pi}_{i}$. Normalized PE is defined here as:

$${H}_{D}\left(\tau \right)=-\frac{1}{\mathrm{ln}(D!)}\sum _{i=1}^{D!}{P}_{i}\mathrm{ln}\left({P}_{i}\right).\tag{3}$$

The PE values defined by Equation (3) range from 0 to 1. PE depends on two predefined parameters: the embedding dimension D and the time lag $\tau $.
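The definitions in Equations (1)–(3) can be illustrated with a minimal Python sketch. The function names `ordinal_pattern` and `permutation_entropy` are hypothetical, and the handling of tied values (ties keep their order of appearance) is an assumption, since the text does not specify it.

```python
import math
from collections import Counter


def ordinal_pattern(values):
    """Return the 1-based indices that sort `values` in ascending order."""
    # Python's sort is stable, so tied values keep their order of appearance
    # (an assumption; tie handling is not specified in the text).
    order = sorted(range(len(values)), key=lambda i: values[i])
    return tuple(i + 1 for i in order)


def permutation_entropy(x, D, tau):
    """Normalized PE for uniform embedding with dimension D and delay tau."""
    rows = len(x) - (D - 1) * tau  # number of rows of the trajectory matrix
    if rows <= 0:
        raise ValueError("time series is too short for this D and tau")
    counts = Counter(
        ordinal_pattern([x[t + k * tau] for k in range(D)]) for t in range(rows)
    )
    # Patterns that never occur contribute nothing to the sum.
    entropy = -sum((c / rows) * math.log(c / rows) for c in counts.values())
    return entropy / math.log(math.factorial(D))
```

With these definitions, `ordinal_pattern([3, 7, 6])` reproduces the pattern 1;3;2 from the example above, and a strictly increasing series yields a PE of zero because only one ordinal pattern occurs.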

#### 2.2. Non-Uniform Embedding

The trajectory matrix produced by non-uniform embedding reads:

$${Y}_{t}=\{{x}_{t},{x}_{t+{\tau}_{1}},{x}_{t+{\tau}_{1}+{\tau}_{2}},\dots ,{x}_{t+\Delta}\},\phantom{\rule{0.277778em}{0ex}}\phantom{\rule{0.277778em}{0ex}}t=1,2,\dots ,n-\Delta ,\tag{4}$$

where n is the length of the observable time series; $\Delta $ is the width of the observation window; $\Delta ={\tau}_{1}+\dots +{\tau}_{D-1}$; $\{{\tau}_{1},{\tau}_{2},\dots ,{\tau}_{D-1}\}$ is the vector of time delays.

The optimal time delays are computed as the $(D-1)$-dimensional argument of the following maximization problem [43]:

$$\underset{1\le {\tau}_{1},\dots ,{\tau}_{D-1}\le L}{\mathrm{max}}\left(\frac{1}{(n-\Delta )\sqrt{D}}\sum _{k=1}^{n-\Delta}\sqrt{{x}_{k}^{2}+{x}_{k+{\tau}_{1}}^{2}+\dots +{x}_{k+\Delta}^{2}}\right),\tag{5}$$

where L is the upper range for all time lags.
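A minimal Python sketch of the non-uniform trajectory matrix and of the target function in the maximization problem is given below. The function names are hypothetical, and the exhaustive enumeration in `optimal_delays` is only one possible way to solve the maximization problem; it is an assumption of this sketch, not a statement about the implementation used in the experiments.

```python
import itertools
import math


def nonuniform_rows(x, delays):
    """Rows of the trajectory matrix for a vector of time delays."""
    offsets = [0]
    for tau in delays:
        offsets.append(offsets[-1] + tau)  # cumulative delays
    window = offsets[-1]  # width of the observation window (Delta)
    return [[x[t + o] for o in offsets] for t in range(len(x) - window)]


def target_function(x, delays):
    """Average Euclidean norm of the rows, scaled by 1 / sqrt(D)."""
    rows = nonuniform_rows(x, delays)
    if not rows:
        raise ValueError("observation window is longer than the time series")
    D = len(delays) + 1
    total = sum(math.sqrt(sum(v * v for v in row)) for row in rows)
    return total / (len(rows) * math.sqrt(D))


def optimal_delays(x, D, L):
    """Exhaustive search over all delay vectors with 1 <= tau_k <= L."""
    candidates = itertools.product(range(1, L + 1), repeat=D - 1)
    return max(candidates, key=lambda delays: target_function(x, delays))
```

The enumeration visits $L^{D-1}$ delay vectors, so it is practical only for small D and L; this is the computational burden that motivates the heuristic optimization approaches cited in the Introduction.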

## 3. The Proposed Visualization Scheme for PE

Visualization of PE as a function of two time delays (at $D=3$) is proposed in [44]. Such an approach enables one to plot PE as a surface, and the graphical features of that surface reveal the underlying complexity of the analyzed time series. Visualizing higher-dimensional arrays is a considerable challenge, and a scheme for rationalizing the information contained in these arrays is of considerable practical benefit, since in practice, one often wishes to use $D>3$. The proposed visualization scheme for PE at $D>3$ reads:

- Determine the optimal embedding dimension D for a given time series.
- Set L (the upper range for all time lags). Determine the set of optimal time lags ${\tau}_{1}^{\star},{\tau}_{2}^{\star},\dots ,{\tau}_{D-1}^{\star}$ according to Equation (5).
- Average all $(D-1)(D-2)/2$ planar digital images of PE.

**Example 1.** In the case of $D=5$, the visualization procedure can be illustrated by a schematic diagram in Table 1.

Note that only two different time delays $({\tau}_{k},{\tau}_{l})$ are not fixed in each plane projection of ${H}_{D}$. In other words, ${H}_{D}$ is a two-dimensional image of PE when all time delays ${\tau}_{s}$, $s\ne k$, $s\ne l$, $k\ne l$ are fixed to the corresponding components of the vector of optimal time delays. Finally, a two-dimensional averaged digital image of PE is computed as an arithmetic average of all planar PE images. The upper range L should be high enough to ensure that the maximum point of the target function (Equation (5)) does not lie on the boundary of the search space (all optimal time lags should be lower than L).
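The enumeration of planar projections described above can be sketched in Python as follows. The function name `planar_projections` is hypothetical, and the delay values in the usage note are taken from Interval A of the EEG experiment in Section 4.3 purely for illustration.

```python
from itertools import combinations


def planar_projections(optimal_delays):
    """Yield (k, l, fixed) for every plane spanned by two free delays.

    k and l are the 1-based indices of the two free delays; `fixed` maps the
    index of every other delay to its optimal value.
    """
    m = len(optimal_delays)  # m = D - 1 time delays
    for k, l in combinations(range(m), 2):
        fixed = {s + 1: optimal_delays[s] for s in range(m) if s not in (k, l)}
        yield k + 1, l + 1, fixed
```

For $D=5$ and the delays $\{6,9,16,6\}$, `list(planar_projections((6, 9, 16, 6)))` contains $(D-1)(D-2)/2=6$ entries; the first one leaves ${\tau}_{1}$ and ${\tau}_{2}$ free and fixes ${\tau}_{3}=16$ and ${\tau}_{4}=6$.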

Note that the computation of all possible digital images of PE is not necessary. It is possible to compute the averaged two-dimensional digital image (a square matrix of $L\times L$ pixels) of PE directly:

$${\overline{H}}_{D}(\zeta =i,\eta =j)=\frac{2}{(D-1)(D-2)}\sum _{k=1}^{D-2}\sum _{l=k+1}^{D-1}{H}_{D}({\tau}_{k}=i,{\tau}_{l}=j),\tag{6}$$

where $\zeta $ and $\eta $ are the coordinates of a pixel along the x- and the y-axis; $i,j=1,\dots ,L$.
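The direct computation of the averaged image can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, tied values keep their order of appearance, and the image is returned as a nested list indexed as `image[i - 1][j - 1]`.

```python
import math
from collections import Counter
from itertools import combinations


def pe_nonuniform(x, delays):
    """Normalized PE of x for one vector of (possibly unequal) time delays."""
    offsets = [0]
    for tau in delays:
        offsets.append(offsets[-1] + tau)
    rows = len(x) - offsets[-1]  # n - Delta rows in the trajectory matrix
    if rows <= 0:
        raise ValueError("observation window is longer than the time series")
    counts = Counter(
        tuple(sorted(range(len(offsets)), key=lambda k: x[t + offsets[k]]))
        for t in range(rows)
    )
    entropy = -sum((c / rows) * math.log(c / rows) for c in counts.values())
    return entropy / math.log(math.factorial(len(offsets)))


def averaged_pe_image(x, optimal_delays, L):
    """Average PE over all planes; pixel (i, j) is stored at [i - 1][j - 1]."""
    m = len(optimal_delays)  # m = D - 1
    if m < 2:
        raise ValueError("at least two time delays (D >= 3) are required")
    pairs = list(combinations(range(m), 2))  # (D-1)(D-2)/2 planes
    image = [[0.0] * L for _ in range(L)]
    for i in range(1, L + 1):
        for j in range(1, L + 1):
            total = 0.0
            for k, l in pairs:
                delays = list(optimal_delays)  # all other delays stay optimal
                delays[k], delays[l] = i, j
                total += pe_nonuniform(x, delays)
            image[i - 1][j - 1] = total / len(pairs)
    return image
```

For $D=3$ there is a single plane, so the averaged image coincides with the PE surface over the two time delays.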

## 4. Computational Experiments

#### 4.1. The Sine Wave

The sine wave used in this computational experiment is generated by the formula ${x}_{t}=\mathrm{sin}\left(\frac{2\pi t}{40}\right)$, $t=1,\dots ,10{,}000$; L is set to 100. The optimal embedding dimension is determined by FNN; $D=4$. Full sort yields the optimal set of time delays ${\tau}_{1}={\tau}_{2}={\tau}_{3}=5$. Planar projections ${H}_{4}({\tau}_{1},{\tau}_{2},5)$, ${H}_{4}({\tau}_{1},5,{\tau}_{3})$, ${H}_{4}(5,{\tau}_{2},{\tau}_{3})$ and the averaged ${\overline{H}}_{4}(\zeta ,\eta )$ are shown in Figure 1.

It is well known that planar PE projections of the sine wave exhibit a pattern of periodic and symmetric cells (the size of these cells equals the ratio between the sampling frequency and the period of the sine wave) [44]. The averaged ${\overline{H}}_{4}$ based on the optimal embedding into the delay coordinate space also reveals the periodicity of the embedded time series. All three images ${H}_{4}({\tau}_{1},{\tau}_{2},5)$, ${H}_{4}({\tau}_{1},5,{\tau}_{3})$ and ${H}_{4}(5,{\tau}_{2},{\tau}_{3})$ are composed of periodic cells with the same geometric boundaries. Though the structure of the geometric pattern in every cell is similar, the distortion of the image and the angles of the characteristic inclination lines are different (Figure 1a–c). Each cell in the averaged image yields traces of the three different inclination lines (Figure 1d). Thus, in principle, it is possible to read the optimal embedding dimension from the averaged image. The three inclination lines visible in each cell of Figure 1d suggest that the averaged image is composed of three planar PE images, which in turn suggests that the optimal embedding dimension is four (Figure 2).

Computational experiments are continued with the sine wave using the technique presented in [44] (Figure 2b). The structure of the reconstructed image reveals the periodicity of the sine wave. However, the complexity of the image in Figure 2b is lower than the complexity of PE images depicted in Figure 1. This can be explained by the fact that the ordinal patterns are divided into 24 bins in Figure 1a–c versus six bins in Figure 2b.

#### 4.2. The Rössler Time Series

The Rössler system is a paradigmatic model of chaotic dynamics [45,46]:

$$\left\{\begin{array}{c}\frac{dx}{dt}=-y-z,\hfill \\ \frac{dy}{dt}=x+ay,\hfill \\ \frac{dz}{dt}=b+z(x-c),\hfill \end{array}\right.\tag{7}$$

where constants $a,b,c$ are set to $a=0.1,b=0.1,c=14$. The Rössler time series is generated by integrating the three coupled ordinary differential equations and by selecting every tenth value of x in the time domain $1\le t\le 1000$ (the time step is set to 0.01). FNN yields the optimal embedding dimension $D=6$; L is set to 100. Full sort yields the set of optimal time lags $\{38,11,42,21,23\}$. All planar projections of PE produced by fixing three of the five delays at the optimal values are illustrated in Figure 3.
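The generation of the Rössler time series can be sketched in Python as follows, with the second equation taken as $dy/dt=x+ay$. The classical fourth-order Runge–Kutta integrator and the initial state $(1,1,1)$ are assumptions of this sketch; the text specifies only the parameters, the time step and the selection of every tenth value of x. The function names are hypothetical.

```python
def rossler_rhs(state, a=0.1, b=0.1, c=14.0):
    """Right-hand side of the Rössler system."""
    x, y, z = state
    return (-y - z, x + a * y, b + z * (x - c))


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h * (p + 2.0 * q + 2.0 * r + w) / 6.0
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def rossler_series(steps, h=0.01, keep_every=10, state=(1.0, 1.0, 1.0)):
    """Integrate for `steps` steps and keep every `keep_every`-th value of x.

    The initial state is an assumption; it is not given in the text.
    """
    xs = []
    for n in range(steps):
        state = rk4_step(rossler_rhs, state, h)
        if (n + 1) % keep_every == 0:
            xs.append(state[0])
    return xs
```

With a time step of 0.01, calling `rossler_series(100_000)` covers a time span of 1000 and returns 10,000 samples of x.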

Finally, the averaged image of PE ${\overline{H}}_{6}(\zeta ,\eta )$ is depicted in Figure 4b. It can be observed that ${\overline{H}}_{6}$ integrates specific features represented in planar projections of PE and provides a single image representing the nonlinear properties of the Rössler time series. However, it is impossible to reconstruct the optimal embedding dimension from Figure 4a. First of all, this is due to the complexity of the time series. Secondly, the number of averaged images is large due to the fact that the optimal embedding dimension is large, as well. The Rössler time series represents a chaotic oscillator. Nevertheless, it is possible to assess an average period of chaotic oscillations (which is around 60 time steps in our computational experiment). The high-PE bands at delays of 60 (Figure 4b) denote this average period of oscillations (compare to Figure 1).

The averaged image ${\overline{H}}_{6}(\zeta ,\eta )$ can be compared with a similar reconstruction of PE with non-ideal parameters of the embedding. Uniform embedding yields the set of optimal time delays $\{9,9,9,9,9\}$; the optimal embedding dimension is the same ($D=6$; Table 2). The averaged ${\overline{H}}_{6}(\zeta ,\eta )$ produced by uniform embedding is depicted in Figure 4a. It is clear that the uniform embedding produces a much more regular image of averaged PE if compared to the non-uniform embedding. The high-PE bands at delays around 60 are still visible in images produced both by uniform and non-uniform embedding. However, the pattern of PE in Figure 4b is much less regular compared to the one in Figure 4a.

All real-world time series are contaminated by noise. Therefore, it is important to investigate the effects induced by noise on the proposed scheme of PE visualization. Computational experiments are continued with the Rössler time series, but different realizations of Gaussian random noise with zero mean are added to it. The standard deviation of the random noise is set to $\frac{\beta R}{100\%}$, where $\beta $ is the noise level in percent; R is the range of the stationary Rössler time series (the difference between the maximum and the minimum values of the time series in the observation window). ${\overline{H}}_{6}$ for the Rössler time series with different noise levels are depicted in Figure 4 (intermediate images of all possible planar projections of PE are omitted for brevity). The deterministic components of PE are gradually lost as the signal becomes buried in noise. ${\overline{H}}_{6}$ with 200% noise (Figure 4f) represents the image of random noise. It is known that PE is robust to noise [47]. However, the amount of noise added to the deterministic time series is so high that the time series becomes a random time series and correlations between data points are lost completely. It is interesting to observe that the average PE is more acutely impacted by noise when one of the coordinates ($\zeta ,\eta $) is low. Note that the addition of noise to the deterministic system changes the optimal embedding dimension and the optimal delays of the resulting time series. Therefore, the optimal embedding dimension and the set of optimal time delays are reconstructed for every different noise level (Table 2). Though PE is not a good measure to quantify the structural complexity, it is well suited to assess the randomness of a time series (PE increases with respect to noise level) [48]. This effect is clearly observed in Figure 4, where the range of PE rises from $[0.58;0.63]$ up to $[0.9937;0.9940]$.
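The noise model described above can be sketched in Python as follows. The function name `add_gaussian_noise` and the optional seeded generator are hypothetical conveniences; the only elements taken from the text are the zero mean and the standard deviation $\beta R/100$.

```python
import random


def add_gaussian_noise(x, beta_percent, rng=None):
    """Add zero-mean Gaussian noise with standard deviation beta * R / 100.

    R is the range of the series (maximum minus minimum) and beta_percent
    is the noise level in percent.
    """
    rng = rng or random.Random()
    R = max(x) - min(x)
    sigma = beta_percent * R / 100.0
    return [v + rng.gauss(0.0, sigma) for v in x]
```

Passing a seeded generator, for example `add_gaussian_noise(x, 200, random.Random(1))`, makes a particular noise realization reproducible.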

The presented algorithm for plotting the averaged PE is based not only on the optimal embedding dimension, but also on the set of optimal time delays. Non-uniform embedding results in a different pattern compared to uniform embedding (Figure 4a,b). However, it is also important to ask whether the process of optimizing time delays yields any benefit aside from the averaging effect. In other words, it is important to compare PE patterns generated by the optimal non-uniform set of time delays with patterns generated by a set of delays produced by a random number generator.

Computational experiments are continued with the Rössler time series without the additive noise. A standard random number generator is used to produce 10 real numbers uniformly distributed in the interval $[0,1]$. Every number is multiplied by 50 and rounded, resulting in two sets of random time delays: $\{7,18,39,27,12\}$ and $\{30,40,16,9,29\}$. The averaged ${\overline{H}}_{6}$ for each set of random time delays are depicted in Figure 5.

It can be seen that particular values of time delays do have a strong impact on the averaged pattern of PE. The averaged PE inherits the properties of individual planar sections of PE. The geometric coordinates of the particular set of time delays in the multi-dimensional coordinate space define specific features of planar projections, which influence the averaged image. The high-PE bands at delays of 60 are still clearly expressed in Figure 5a,b. However, the averaged PE patterns differ markedly; the proposed technique for plotting the averaged PE therefore relies on the set of optimal time delays.

#### 4.3. Real-World Time Series

Computational experiments are continued with the electroencephalogram (EEG) signal available from the Brain/Neural Computer Interaction (BNCI) Horizon 2020 project [49] (A01 time series from dataset P300 Speller); the graphical representation of the signal is shown in Figure 6.

Two different intervals are selected from the same EEG signal: $1\le k\le 100{,}000$ (Interval A) and $200{,}001\le k\le 300{,}000$ (Interval B). Every second measurement point is skipped in both intervals in order to make the optimal time delays smaller (the number of remaining points in each of Intervals A and B is 50,000).
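The selection of the two intervals can be sketched in Python as follows. The function name `subsample_interval` is hypothetical, and keeping the first point of each interval (rather than the second) is an assumption, since the text states only that every second point is skipped.

```python
def subsample_interval(signal, first, last, step=2):
    """Keep every `step`-th point for the 1-based indices first..last."""
    # Python slices are 0-based and exclude the end index.
    return signal[first - 1:last:step]
```

With `signal` holding the full recording, Interval A corresponds to `subsample_interval(signal, 1, 100_000)` and Interval B to `subsample_interval(signal, 200_001, 300_000)`; each call returns 50,000 points.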

FNN yields the optimal embedding dimension $D=5$ for Interval A. The set of optimal time lags is $\{{\tau}_{1}=6,{\tau}_{2}=9,{\tau}_{3}=16,{\tau}_{4}=6\}$; the parameter L is set to 60. The averaged PE for Interval A ${\overline{H}}_{5}$ is depicted in Figure 7a.

Computational experiments are continued with Interval A, but now, PE is visualized using the technique proposed in [44]. The embedding dimension D is always three; time delays ${\tau}_{1}$ and ${\tau}_{2}$ are varied from 1 to L (Figure 7b). Some important observations can be made by comparing the two digital images of PE in Figure 7. First of all, the process of averaging makes some features of the PE pattern appear cleaner. However, the averaging results in some information loss; for example, the stripe at ${\tau}_{2}=5$ in Figure 7b is not visible in Figure 7a. Clearly, some of the structure of the PE pattern in Figure 7a is replicated in Figure 7b, albeit with more noise. Furthermore, the difference in the PE values is a consequence of the different D and of the fact that ordinal patterns are divided into 120 bins in Figure 7a versus six bins in Figure 7b. The straightforward computation of PE (as shown in Figure 7b) yields a high level of noise, which results in a high uncertainty of PE reconstruction. The improvement in the signal-to-noise ratio of our PE reconstruction is due to the fact that the proposed algorithm averages over many two-dimensional images.

Analogous computations are performed with Interval B. FNN yields the optimal embedding dimension $D=5$ again, but the set of optimal time lags is now $\{{\tau}_{1}=5,{\tau}_{2}=6,{\tau}_{3}=14,{\tau}_{4}=6\}$. The averaged PE for Interval B ${\overline{H}}_{5}$ is shown in Figure 8a. The old visualization scheme for Interval B yields the digital image in Figure 8b. This confirms again that the optimal embedding dimension and the optimal set of time delays play a pivotal role in a representative visualization of PE. Comparison of the graphical features in digital images of PE represented in Figure 7a and Figure 8a helps to identify the differences occurring in the evolving time series. Such comparisons are much more difficult with the old visualization scheme (Figure 7b and Figure 8b).

## 5. Discussion

A novel visualization scheme for permutation entropy is presented in this paper. This scheme matches permutation entropy with the topological characteristics of the investigated time series: the optimal embedding dimension and the optimal set of time delays. The proposed scheme is based on non-uniform attractor embedding and uses different time delays, but results in a single digital image.

The proposed algorithm is based on the averaging of all possible planar projections of the permutation entropy measure in the multi-dimensional delay coordinate space. Such an approach is a natural extension of the technique used for the quantification of the phase space occupied by the reconstructed attractor [41]. The proposed scheme extends the visualization of permutation entropy from a three-dimensional phase space (with two time delays) to a multi-dimensional phase space.

The proposed scheme is well suited for real-world time series contaminated by additive noise. Arithmetic averaging of planar projections reduces the optical effects induced by the additive noise and increases the clarity of specific geometric features (which can be used for the interpretation of the investigated time series). It is well known that permutation entropy can be used to identify couplings between time series [50]. The applicability of the proposed visualization scheme for the identification of couplings and synchronization between time series remains a definite objective of future research.

## Author Contributions

Conceptualization, M.R.; Methodology, M.R. and K.P.; Software, M.T., K.P. and N.F.A.; Validation, M.C. and N.F.A.; Formal analysis, M.T. and K.P.; Investigation, K.P.; Resources, M.T. and M.C.; Data curation, K.P. and N.F.A.; Writing, original draft preparation, M.R.; Writing, review and editing, M.C. and M.R.; Visualization, K.P.; Supervision, M.R.; Project administration, M.T. and M.R.; Funding acquisition, M.C.

## Funding

This research was funded by the Jiangsu Provincial Recruitment Program of Foreign Experts (Type B, Grant No. JSB2017007).

## Conflicts of Interest

The authors declare no conflict of interest.

## Abbreviations

The following abbreviations are used in this manuscript:

| Abbreviation | Meaning |
|---|---|
| PE | Permutation Entropy |
| FNN | False Nearest Neighbors |
| EEG | Electroencephalogram |
| BNCI | Brain/Neural Computer Interaction |

## References

1. Lempel, A.; Ziv, J. On the complexity of finite sequences. IEEE Trans. Inform. Theory **1976**, 22, 75–81.
2. Kolmogorov, A.N. Three approaches to the definition of the concept “quantity of information”. Probl. Peredachi Inf. **1965**, 1, 3–11.
3. Kaspar, F.; Schuster, H.G. Easily calculable measure for the complexity of spatiotemporal patterns. Phys. Rev. A **1987**, 36, 842–848.
4. Farmer, D.J. Information dimension and the probabilistic structure of chaos. J. Phys. Sci. **1982**, 37, 1304–1325.
5. Termonia, Y.; Alexandrowicz, Z. Fractal dimension of strange attractors from radius versus size of arbitrary clusters. Phys. Rev. Lett. **1983**, 51, 1265–1268.
6. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. **1948**, 27, 379–423.
7. Pincus, S.M. Approximate entropy as a measure of system complexity. Proc. Natl. Acad. Sci. USA **1991**, 88, 2297–2301.
8. Walters, P. An Introduction to Ergodic Theory; Springer Publishing House: New York, NY, USA, 1982; p. 250. ISBN 978-0-387-95152-2.
9. Stolz, I.; Keller, K. A general symbolic approach to Kolmogorov-Sinai entropy. Entropy **2017**, 19, 675.
10. Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. **2000**, 278, 2039–2049.
11. Keller, K.; Unakafov, A.M.; Unakafova, V.A. Ordinal patterns. Entropy **2014**, 16, 6212–6239.
12. Zhou, J.; Xiao, J.; Xiao, H.; Zhang, W.; Zhu, W.; Li, C. Multifault diagnosis for rolling element bearings based on intrinsic mode permutation entropy and ensemble optimal extreme learning machine. Adv. Mech. Eng. **2014**, 6, 803–919.
13. Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett. **2002**, 88, 174102.
14. Zunino, L.; Olivares, F.; Scholkmann, F.; Rosso, O.A. Permutation entropy based time series analysis: Equalities in the input signal can lead to false conclusions. Phys. Lett. A **2017**, 381, 1883–1892.
15. Zanin, M.; Zunino, L.; Rosso, O.A.; Papo, D. Permutation entropy and its main biomedical and econophysics applications: A review. Entropy **2012**, 14, 1553–1577.
16. Li, Y.; Li, G.; Yang, Y.; Liang, X.; Xu, M. A fault diagnosis scheme for planetary gearboxes using adaptive multi-scale morphology filter and modified hierarchical permutation entropy. Mech. Syst. Signal Process. **2018**, 105, 319–337.
17. Unakafova, V.A.; Keller, K. Efficiently measuring complexity on the basis of real-world data. Entropy **2013**, 15, 4392–4415.
18. Azami, H.; Escudero, J. Amplitude-aware permutation entropy: Illustration in spike detection and signal segmentation. Comput. Meth. Prog. Biomed. **2016**, 128, 40–51.
19. Packard, N.H.; Crutchfield, J.P.; Farmer, J.D.; Shaw, R.S. Geometry from a time series. Phys. Rev. Lett. **1980**, 45, 712–716.
20. Theiler, J. Estimating the fractal dimension of chaotic time series. Lincoln Lab. J. **1990**, 3, 63–86.
21. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980; Rand, D.A., Young, L.S., Eds.; Springer: Berlin/Heidelberg, Germany, 1981; pp. 366–381.
22. Sauer, T.; Yorke, J.A.; Casdagli, M. Embedology. J. Stat. Phys. **1991**, 65, 579–616.
23. Yap, H.L.; Eftekhari, A.; Wakin, M.B.; Rozell, C.J. A first analysis of the stability of Takens’ embedding. In Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP), Anaheim, CA, USA, 3–5 December 2014; pp. 404–408.
24. Fraser, A.; Swinney, H. Independent coordinates for strange attractors from mutual information. Phys. Rev. A **1986**, 33, 1134–1140.
25. Grassberger, P.; Procaccia, I. Measuring the strangeness of strange attractors. Physica D **1983**, 9, 189–208.
26. Buzug, T.; Pfister, G. Comparison of algorithms calculating optimal embedding parameters for delay time coordinates. Physica D **1992**, 58, 127–137.
27. Buzug, T.; Pfister, G. Optimal delay time and embedding dimension for delay-time coordinates by analysis of the global static and local dynamical behavior of strange attractors. Phys. Rev. A **1992**, 45, 7073–7084.
28. Casdagli, M.; Eubank, S.; Farmer, J.; Gibson, J. State space reconstruction in the presence of noise. Physica D **1991**, 51, 52–98.
29. Bradley, E.; Kantz, H. Nonlinear time-series analysis revisited. Chaos **2015**, 25, 097610.
30. Falconer, K. Box-counting dimension. In Fractal Geometry: Mathematical Foundation and Applications; John Wiley and Sons: Chichester, UK, 1990; pp. 27–43. ISBN 10: 0471922870.
31. Grassberger, P.; Procaccia, I. Characterization of strange attractors. Phys. Rev. Lett. **1983**, 50, 346–349.
32. Wolf, A.; Swift, J.B.; Swinney, H.L.; Vastano, J.A. Determining Lyapunov exponents from a time series. Physica D **1985**, 16, 285–317.
33. Pesin, Y. Characteristic Lyapunov exponents and smooth ergodic theory. Russ. Math. Surv. **1977**, 32, 55–114.
34. Kennel, M.; Brown, R.; Abarbanel, H. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys. Rev. A **1992**, 45, 3403–3411.
35. Abarbanel, H.D.I.; Brown, R.; Sidorowich, J.J.; Tsimring, L.S. The analysis of observed chaotic data in physical systems. Rev. Mod. Phys. **1993**, 65, 1331–1392.
36. Huke, J.P.; Broomhead, D.S. Embedding theorems for non-uniformly sampled dynamical systems. Nonlinearity **2007**, 20, 2205–2244.
37. Manabe, Y.; Chakraborty, B. A novel approach for estimation of optimal embedding parameters of nonlinear time series by structural learning of neural network. Neurocomputing **2007**, 70, 1360–1371.
38. Small, M. Optimal time delay embedding for nonlinear time series modeling. arXiv, 2003.
39. Vitrano, J.B.; Povinelli, R.J. Selecting dimensions and delay values for a time-delay embedding using a genetic algorithm. In Proceedings of the GECCO’01 3rd Annual Conference on Genetic and Evolutionary Computation, San Francisco, CA, USA, 7–11 July 2001; pp. 1423–1430, ISBN 1-55860-774-9.
40. Shen, M.; Chen, W.N.; Zhang, J.; Chung, H.S.H.; Kaynak, O. Optimal selection of parameters for nonuniform embedding of chaotic time series using ant colony optimization. IEEE Trans. Cybern. **2013**, 43, 790–802.
41. Ragulskis, M.; Lukoseviciute, K. Non-uniform attractor embedding for time series forecasting by fuzzy inference systems. Neurocomputing **2009**, 72, 2618–2626.
42. Lukoseviciute, K.; Ragulskis, M. Evolutionary algorithms for the selection of time lags for time series forecasting by fuzzy inference systems. Neurocomputing **2010**, 73, 2077–2088.
43. Timofejeva, I.; Poskuviene, K.; Cao, M.S.; Ragulskis, M. Synchronization measure based on a geometric approach to attractor embedding using finite observation windows. Complexity **2018**, 2018, 8259496.
44. Little, D.J.; Kane, D.M. Permutation entropy with vector embedding delays. Phys. Rev. E **2017**, 96, 062205.
45. Rössler, O.E. An equation for continuous chaos. Phys. Lett. **1976**, 57, 397–398.
46. Letellier, C.; Messager, V. Influences on Otto E. Rössler’s earliest paper on chaos. Chaos **2010**, 20, 3585–3616.
47. Amigo, J.M. Permutation Complexity in Dynamical Systems; Springer: Berlin, Germany, 2010; p. 280. ISBN 978-3-642-04083-2.
48. Amigo, J.M.; Zambrano, S.; Sanjuan, M.A.F. Combinatorial detection of determinism in noisy time series. EPL **2008**, 83, 60005.
49. BNCI Horizon 2020 Project Database. Available online: http://bnci-horizon-2020.eu/database/data-sets (accessed on 1 May 2018).
50. Riedl, M.; Müller, A.; Wessel, N. Practical considerations of permutation entropy. Eur. Phys. J. Spec. Top. **2013**, 222, 249–262.

**Figure 1.** Permutation entropy for the sine wave: ${H}_{4}({\tau}_{1},{\tau}_{2},5)$ is depicted in (**a**); ${H}_{4}({\tau}_{1},5,{\tau}_{3})$ in (**b**); ${H}_{4}(5,{\tau}_{2},{\tau}_{3})$ in (**c**); and the averaged ${\overline{H}}_{4}(\zeta ,\eta )$ in (**d**). Numerical values of permutation entropy are indicated in color bars.

**Figure 2.** The geometric structure of a single periodic cell produced by the averaged permutation entropy (PE) reveals the three different inclination lines from the three planar projections of the PE (**a**). The PE reconstructed by the technique presented in [44] is depicted in (**b**).

**Figure 3.** All possible planar projections of PE for the Rössler time series ($D=6$): (**a**) ${H}_{6}({\tau}_{1},{\tau}_{2},42,21,23)$; (**b**) ${H}_{6}({\tau}_{1},11,{\tau}_{3},21,23)$; (**c**) ${H}_{6}({\tau}_{1},11,42,{\tau}_{4},23)$; (**d**) ${H}_{6}({\tau}_{1},11,42,21,{\tau}_{5})$; (**e**) ${H}_{6}(38,{\tau}_{2},{\tau}_{3},21,23)$; (**f**) ${H}_{6}(38,{\tau}_{2},42,{\tau}_{4},23)$; (**g**) ${H}_{6}(38,{\tau}_{2},42,21,{\tau}_{5})$; (**h**) ${H}_{6}(38,11,{\tau}_{3},{\tau}_{4},23)$; (**i**) ${H}_{6}(38,11,{\tau}_{3},21,{\tau}_{5})$; (**j**) ${H}_{6}(38,11,42,{\tau}_{4},{\tau}_{5})$.

**Figure 4.** Averaged PE for the Rössler time series: (**a**) uniform embedding with no additive noise; (**b**) non-uniform embedding with no additive noise; (**c**) ${\overline{H}}_{6}$ with $10\%$ noise; (**d**) ${\overline{H}}_{7}$ with $50\%$ noise; (**e**) ${\overline{H}}_{7}$ with $75\%$ noise; (**f**) ${\overline{H}}_{8}$ with $200\%$ noise.

**Figure 5.** Averaged PE for the Rössler time series with no additive noise: (**a**) the pattern produced by a random set of time delays $\{7,18,39,27,12\}$; (**b**) the pattern produced by a random set of time delays $\{30,40,16,9,29\}$.

**Figure 6.** The electroencephalogram (EEG) signal available from the Brain/Neural Computer Interaction (BNCI) Horizon 2020 project database [49]. Insets (**a**) and (**b**) depict the zoomed parts of the signal.

**Figure 7.** Digital images of PE reconstructed for Interval A. The proposed scheme yields the image in (**a**). The scheme without the assessment of the optimal embedding dimension and the optimal set of time lags results in the image in (**b**).

**Figure 8.** Digital images of PE reconstructed for Interval B. The proposed scheme yields the image in (**a**). The scheme without the assessment of the optimal embedding dimension and the optimal set of time lags results in the image in (**b**).

${\mathit{\tau}}_{1}$ | ${\mathit{\tau}}_{2}$ | ${\mathit{\tau}}_{3}$ | ${\mathit{\tau}}_{4}$ | ${\mathit{H}}_{5}\left(\mathit{\tau}\right)$ |
---|---|---|---|---|
$1,\dots ,L$ | $1,\dots ,L$ | ${\tau}_{3}^{\star}$ | ${\tau}_{4}^{\star}$ | ${H}_{5}\left({\tau}_{1},{\tau}_{2},{\tau}_{3}^{\star},{\tau}_{4}^{\star}\right)$ |
$1,\dots ,L$ | ${\tau}_{2}^{\star}$ | $1,\dots ,L$ | ${\tau}_{4}^{\star}$ | ${H}_{5}\left({\tau}_{1},{\tau}_{2}^{\star},{\tau}_{3},{\tau}_{4}^{\star}\right)$ |
$1,\dots ,L$ | ${\tau}_{2}^{\star}$ | ${\tau}_{3}^{\star}$ | $1,\dots ,L$ | ${H}_{5}\left({\tau}_{1},{\tau}_{2}^{\star},{\tau}_{3}^{\star},{\tau}_{4}\right)$ |
${\tau}_{1}^{\star}$ | $1,\dots ,L$ | $1,\dots ,L$ | ${\tau}_{4}^{\star}$ | ${H}_{5}\left({\tau}_{1}^{\star},{\tau}_{2},{\tau}_{3},{\tau}_{4}^{\star}\right)$ |
${\tau}_{1}^{\star}$ | $1,\dots ,L$ | ${\tau}_{3}^{\star}$ | $1,\dots ,L$ | ${H}_{5}\left({\tau}_{1}^{\star},{\tau}_{2},{\tau}_{3}^{\star},{\tau}_{4}\right)$ |
${\tau}_{1}^{\star}$ | ${\tau}_{2}^{\star}$ | $1,\dots ,L$ | $1,\dots ,L$ | ${H}_{5}\left({\tau}_{1}^{\star},{\tau}_{2}^{\star},{\tau}_{3},{\tau}_{4}\right)$ |
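
The table above lists, for $D=5$, the six planar projections obtained by letting two time delays run over $1,\dots ,L$ while the remaining delays stay at their optimal values. The enumeration and the subsequent averaging could be sketched in Python roughly as follows. This is only an illustrative sketch under several assumptions that are not taken from the article: the delays are treated as gaps between successive coordinates (so coordinate offsets are cumulative sums), PE is normalized by $\ln D!$, ties are broken by position, and the averaged image is a plain pixel-wise mean with the first free delay on the row axis. The function names and the delay values in the usage lines are hypothetical.

```python
from itertools import combinations
from math import factorial, log

import numpy as np


def permutation_entropy(x, delays):
    """Normalized PE of x for delays (tau_1, ..., tau_{D-1}).

    Assumption: each delay is the gap between successive coordinates,
    so the coordinate offsets are the cumulative sums of the delays.
    """
    x = np.asarray(x, dtype=float)
    offsets = np.concatenate(([0], np.cumsum(delays)))
    n_vectors = len(x) - offsets[-1]
    if n_vectors <= 0:
        raise ValueError("time series too short for these delays")
    # Each row is one delay vector (x[t], x[t + tau_1], ...).
    vectors = np.stack([x[o:o + n_vectors] for o in offsets], axis=1)
    # Ordinal pattern of every row; the stable sort breaks ties by position.
    patterns = np.argsort(vectors, axis=1, kind="stable")
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    # Assumption: normalize by ln(D!) so the result lies in [0, 1].
    return float(-(p * np.log(p)).sum() / log(factorial(len(offsets))))


def planar_projections(x, optimal_delays, max_delay):
    """Map each pair (i, j) of free delays to its max_delay x max_delay image."""
    images = {}
    for i, j in combinations(range(len(optimal_delays)), 2):
        image = np.empty((max_delay, max_delay))
        for a in range(1, max_delay + 1):
            for b in range(1, max_delay + 1):
                delays = list(optimal_delays)
                delays[i], delays[j] = a, b  # the other delays stay fixed
                image[a - 1, b - 1] = permutation_entropy(x, delays)
        images[(i, j)] = image
    return images


def averaged_pe(images):
    """Pixel-wise arithmetic mean of all planar projections (assumed form)."""
    return np.mean(list(images.values()), axis=0)


# Usage with hypothetical values: a sine wave and made-up "optimal" delays.
signal = np.sin(2 * np.pi * np.arange(400) / 25)
images = planar_projections(signal, optimal_delays=(3, 5, 2, 4), max_delay=6)
print(len(images), averaged_pe(images).shape)
```

With four delays the loop visits $\binom{4}{2}=6$ pairs, matching the six rows of the table; for $D=6$ the same loop would visit the ten pairs listed in Figure 3.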

**Table 2.** Optimal embedding dimensions and optimal time lags for the Rössler time series with different noise levels.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).