Symplectic Entropy as a Novel Measure for Complex Systems

Real systems are often complex, nonlinear, and noisy in various fields, including mathematics, natural science, and social science. We present the symplectic entropy (SymEn) measure as well as an analysis method based on SymEn to estimate the nonlinearity of a complex system by analyzing the given time series. The SymEn estimation is a kind of entropy based on symplectic principal component analysis (SPCA), which represents organized but unpredictable behaviors of systems. The key to SPCA is to preserve the global submanifold geometrical properties of the systems through a symplectic transform in the phase space, which is a kind of measure-preserving transform. The ability to preserve the global geometrical characteristics makes SymEn a test statistic for the detection of the nonlinear characteristics in several typical chaotic time series, and the stochastic characteristic in Gaussian white noise. The results are in agreement with findings in the approximate entropy (ApEn), the sample entropy (SampEn), and the fuzzy entropy (FuzzyEn). Moreover, the SymEn method is also used to analyze the nonlinearities of real signals (including the electroencephalogram (EEG) signals for Autism Spectrum Disorder (ASD) and healthy subjects, and the sound and vibration signals for mechanical systems). The results indicate that the SymEn estimation can be taken as a measure for the description of the nonlinear characteristics in the data collected from natural complex systems.


Introduction
Complex systems have been widely studied in various fields, such as mathematics, physics, engineering, economics, and biomedical engineering. Because the system structures are unknown and the measured time series are complicated and disordered, it has always been challenging to explore the characteristics of complex systems (such as the brain) [1,2]. The essential problem for a complex system is differentiating between nonlinearity and linear stochastics, because a complex system, especially one with chaos, is often very similar to a random process. Symplectic geometry is a geometry in an even-dimensional phase space with a closed nondegenerate skew-symmetric bilinear form. Due to the measure-preserving capability of the symplectic similar transform in phase space, symplectic geometry can preserve the system structure, even a nonlinear structure. The global submanifold geometrical properties of the system, or of the data from the system, remain unchanged under symplectic transforms in phase space [3-5]. The aim of this study is to employ the measure-preserving characteristics of symplectic geometry to propose an entropy measure for complex systems. As a statistical measure, entropy can distinguish complex systems from linear stochastic processes. Symplectic geometry has been used to study complex systems in two areas: solving symplectic differential equations for various nonlinear dynamical systems [6-11], and analyzing a time series in phase space geometry to test the nature of complex systems [3,4,12]. However, there has been little research on entropy estimation in the symplectic theory.
In fact, symplectic geometry has mainly been applied to solving symplectic differential systems in mathematics, physics, and theoretical and classical mechanics since Feng and his research group first developed a symplectic approach to accurately solve some nonlinear dynamical systems constructed using Hamiltonian models [9]. The mathematical fundamentals of symplectic geometry are based on the symplectic space, which is different from Euclidean space, although both spaces can deal with the Hamiltonian system. The symplectic similarities in the symplectic space can preserve the structures of the Hamiltonian matrices [13-15]. Some numerical algorithms based on symplectic geometry, such as symplectic Householder transformations, symplectic QR-like decomposition [16], or the symplectic Gram-Schmidt algorithm [17], have been proposed and modified to solve for the eigenvalues of Hamiltonian matrices, particularly for large, sparse structured matrices [18]. Symplectic eigensolutions have been proposed to perform the energy band analysis for a periodical waveguide by introducing symplectic mathematics into the electromagnetic waveguide theory [19]. A novel method based on symplectic mathematics and finite element analysis has been developed to deduce the dispersion relations for some typical cellular structures [20]. For some basic problems in solid mechanics and elasticity, many of the research works on symplectic elasticity are attributed to Lim, C.W. et al. [9], Zhong, W. et al. [20], Yao, W. [21] and Lü, C.F. et al. [22]. Subsequently, symplectic analysis has been applied to other applied engineering problems, such as control problems [23], piezoelectric materials [24], and bending problems of corner-supported rectangular thin plates [25,26].
Recently, symplectic geometry theory has been applied to analyze the time series from complex systems through the reconstructed phase space [4]. According to Takens' embedding theorem, the reconstructed phase space is equivalent (diffeomorphic) to the attractor of the original dynamical system generating the data, so it can reflect the dynamical characteristics of that system [27]. It is on this basis that our previous works extended symplectic geometry theory to the analysis of time series [28,29]. The dimension of a nonlinear dynamic system has been estimated based on symplectic geometry [4]. The symplectic principal component method has been developed to reduce noise in nonlinear systems [12,28]. Subsequently, some researchers have used and further developed the applications of symplectic theory to time series. Sprinter's surface electromyography (EMG) signals have been evaluated based on symplectic geometry [29]. The determinism characteristics in a time series have been detected by using the symplectic geometry method [30]. The symplectic geometry spectrum regression method has been proposed to predict noisy time series [3]. The above research works have shown that symplectic principal component analysis (SPCA)-based methodologies are superior to singular value decomposition (SVD)-based methods for complex systems. However, little work has been done to quantify the disorder or the randomness of a system from the point of view of the symplectic theory.
According to Boltzmann's definition, entropy is a measure of the number of possible microstates of a system. Entropy is used to describe and quantify the disorder, uncertainty, or randomness of a system, or the lack of information existing in a system. At present, many entropy estimation approaches, such as Shannon entropy, approximate entropy, sample entropy, and fuzzy entropy, have been proposed and widely used to quantify the complexity of time series in various fields [31-36]. As a broad and general concept, Shannon entropy has been widely used in information theory and in thermodynamics since Claude Shannon devised it in 1948 [37]. This paper presents a symplectic entropy (SymEn) approach based on SPCA and Shannon entropy. The SymEn approach measures the energy distribution of a dynamic system in symplectic space. To test the utility of the SymEn approach, we chose several typical time series (including noise data and three chaotic time series) and four kinds of engineering signals (electroencephalogram (EEG) signals from the brain systems of healthy subjects and Autism Spectrum Disorder (ASD) subjects, and the sound and vibration signals from mechanical systems).

Symplectic Entropy
Given a sampled time series $x_1, x_2, \ldots, x_N$ from a system $f(x)$, an attractor $X_{m \times d}$ in phase space can be reconstructed by the time-delay coordinates method:

$$X = \begin{bmatrix} X_1^T \\ X_2^T \\ \vdots \\ X_m^T \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_d \\ x_2 & x_3 & \cdots & x_{d+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_m & x_{m+1} & \cdots & x_N \end{bmatrix}$$

where $d$ is the embedding dimension, $m = N - d + 1$ is the number of d-dimensional reconstruction vectors (points) in the phase space, and $X_i^T$ $(i = 1, \ldots, m)$ denotes a point in the phase space. According to Takens' embedding theorem, $X$ reflects the characteristics of the system. Then, a Hamilton matrix $M$ of the system $f(x)$ can be built from $X$:

$$M = \begin{bmatrix} A & 0 \\ 0 & -A^T \end{bmatrix}$$

where $A = X^T X$. In symplectic geometry theory, a Hamilton matrix remains a Hamilton matrix under a symplectic similar transform; for the related proofs, see Appendix A in [4]. In symplectic geometry, the symplectic similar transform is a regular transform. Its essence depends on a bilinear, antisymmetric, nonsingular cross product, the symplectic cross product:

$$[x, y] = \langle x, J y \rangle, \qquad J = \begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix}$$

where $x$ and $y$ are 2n-dimensional vectors.
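The reconstruction step can be illustrated with a short NumPy sketch. It assumes a unit time delay, real-valued data, and the block-diagonal form M = [[A, 0], [0, −Aᵀ]] for the Hamilton matrix; the function names `delay_embed`, `hamilton_matrix`, and `is_hamiltonian` are hypothetical helpers introduced only for this illustration, not part of the original method description.

```python
import numpy as np


def delay_embed(x, d):
    """Return the m x d trajectory matrix whose i-th row is (x_i, ..., x_{i+d-1})."""
    x = np.asarray(x, dtype=float)
    m = x.size - d + 1
    if d < 1 or m < 1:
        raise ValueError("need 1 <= d <= len(x)")
    # Unit delay is an assumption of this sketch.
    return np.array([x[i:i + d] for i in range(m)])


def hamilton_matrix(X):
    """Build M = [[A, 0], [0, -A^T]] with A = X^T X (assumed block form)."""
    A = X.T @ X
    Z = np.zeros_like(A)
    return np.block([[A, Z], [Z, -A.T]])


def is_hamiltonian(M, tol=1e-9):
    """Check the defining property that J M is symmetric, J = [[0, I], [-I, 0]]."""
    n = M.shape[0] // 2
    I = np.eye(n)
    Z = np.zeros((n, n))
    J = np.block([[Z, I], [-I, Z]])
    JM = J @ M
    return bool(np.allclose(JM, JM.T, atol=tol * max(1.0, np.abs(M).max())))


if __name__ == "__main__":
    X = delay_embed([1, 2, 3, 4, 5, 6], d=3)
    print(X.shape)                             # m = N - d + 1 rows, d columns
    print(is_hamiltonian(hamilton_matrix(X)))
```

Because A = XᵀX is symmetric, the off-diagonal blocks of J M are equal and the check passes for any real time series under these assumptions.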
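There are also some properties, as follows:

Theorem 1. The product of symplectic matrices is also a symplectic matrix.

Theorem 2. Suppose the Householder matrix $H$ is

$$H = H(k, \omega) = \begin{bmatrix} P & 0 \\ 0 & P \end{bmatrix}$$

where

$$P = I_n - \frac{2\omega\omega^*}{\omega^*\omega}, \qquad \omega = (0, \ldots, 0, \omega_k, \ldots, \omega_n)^T \neq 0.$$

Then $H$ is a symplectic unitary matrix. Here $*$ denotes conjugate transposition.

Theorem 3. Let $S \in \mathbb{C}^{2n \times 2n}$ be a symplectic matrix. Then $S$ can be decomposed as $S = QR$, where $Q$ is a symplectic unitary matrix and $R$ is an upper triangular matrix.

Here, the Householder matrix $H$ can be used as the matrix $Q$. Then, the Hamilton matrix $M$ can be transformed into an upper Hessenberg form by the Householder matrix $H$:

$$H M H^T = \begin{bmatrix} P & 0 \\ 0 & P \end{bmatrix} \begin{bmatrix} A & 0 \\ 0 & -A^T \end{bmatrix} \begin{bmatrix} P & 0 \\ 0 & P \end{bmatrix}^T = \begin{bmatrix} B & 0 \\ 0 & -B^T \end{bmatrix}$$

where $B = P A P^T$ is an upper Hessenberg matrix ($b_{ij} = 0$ for $i > j + 1$). Meanwhile, $H$ can be obtained from the matrix $A$ [4].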
The eigenvalues of matrix $B$ are denoted by $\mu = \lambda(B)$. They are equal to the eigenvalues $\lambda(A)$ of matrix $A$; i.e., $\mu = \lambda(B) = \lambda(A)$. The eigenvalues $\mu = \{\mu_1, \mu_2, \ldots, \mu_d\}$ are sorted in descending order; that is,

$$\mu_1 \geq \mu_2 \geq \cdots \geq \mu_d$$

The $\mu$ values are called the symplectic principal components of $A$ with relevant symplectic orthonormal bases. They reflect the distribution of the energy of the system in different symplectic directions. The probability of the energy distribution in each direction can be defined, respectively, as $p_1, p_2, \ldots, p_d$:

$$p_i = \frac{\mu_i}{\sum_{j=1}^{d} \mu_j}, \qquad i = 1, \ldots, d$$

and the probability space can be expressed as:

$$\begin{bmatrix} \mu \\ p \end{bmatrix} = \begin{bmatrix} \mu_1 & \mu_2 & \cdots & \mu_d \\ p_1 & p_2 & \cdots & p_d \end{bmatrix}$$

where $d$ is the embedding dimension, $\sum_{i=1}^{d} p_i = 1$, and $0 \leq p_i \leq 1$. It represents the uncertainty in each direction. Then, an entropy can be defined as follows:

$$\mathrm{SymEn} = -\sum_{i=1}^{d} p_i \log p_i$$

SymEn gives the statistical average and measures the degree of uncertainty of the energy with respect to the underlying probability distribution over the different directions of a system in terms of the Shannon entropy; it is called the symplectic entropy.
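The computation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the eigenvalues of A = XᵀX are obtained directly with `numpy.linalg.eigvalsh` instead of the symplectic Householder reduction (the text states that B and A share eigenvalues), each probability is taken as an eigenvalue divided by the sum of all eigenvalues, the natural logarithm is used, the delay is one sample, and no preprocessing of the series is applied. The function name `symplectic_entropy` is hypothetical.

```python
import numpy as np


def symplectic_entropy(x, d):
    """Shannon entropy of the normalized eigenvalue spectrum of A = X^T X.

    X is the unit-delay trajectory matrix of x with embedding dimension d.
    """
    x = np.asarray(x, dtype=float)
    m = x.size - d + 1
    if d < 1 or m < 1:
        raise ValueError("need 1 <= d <= len(x)")
    X = np.array([x[i:i + d] for i in range(m)])
    A = X.T @ X
    # A is symmetric positive semidefinite, so eigvalsh applies; reverse the
    # ascending output to get the descending order used in the text.
    mu = np.linalg.eigvalsh(A)[::-1]
    mu = np.clip(mu, 0.0, None)        # remove tiny negative round-off values
    total = mu.sum()
    if total == 0.0:
        raise ValueError("series has zero energy")
    p = mu / total                      # assumed normalization of the energies
    nz = p[p > 0]                       # convention: 0 * log(0) = 0
    return float(-np.sum(nz * np.log(nz)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(1000)
    sine = np.sin(0.1 * np.arange(1000))
    print("noise:", symplectic_entropy(noise, 5))
    print("sine: ", symplectic_entropy(sine, 5))
```

With these choices the value is bounded above by log d, which is approached when the energy is spread evenly over all directions, and it drops toward zero when a single direction dominates.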

Materials
A complex system, especially one with chaos, is often random-like. Detecting the nonlinearity of a time series is one of the inverse problems for a complex system, particularly for a chaotic dynamical system. Here, to assess the utility of the SymEn method as a measure for complex systems, the method is used to test the nonlinearity in data from both synthetic and real experiments. In Section 2.2.1, we illustrate the four synthetic time series from the Gaussian white noise process and three chaotic systems whose underlying dynamics are known. In Section 2.2.2, we consider several real time series from practical systems. In Section 2.2.3, we introduce the null hypothesis technique in order to test the difference between the raw data and its surrogate data based on the entropy measures, including the SymEn, approximate entropy (ApEn), sample entropy (SampEn), and fuzzy entropy (FuzzyEn) methods.

Synthetic Time Series
In practical systems, noise is generally regarded as Gaussian white noise with an independent and identical probability distribution. Its probability distribution over the directions is uniform in the dimension d (see Figure 1). The probability value is calculated by Equation (10). For an equiprobable distribution, the entropy should increase with the embedding dimension d. Here, Gaussian white noise with mean 0 and variance 1 is used to test whether the SymEn estimate is suitable for evaluating noise. In order to determine whether the SymEn method can reflect the nonlinear characteristics of a time series, three typical chaotic systems are used to generate the corresponding nonlinear time series.
The three chaotic systems are given as follows:
• Lorenz chaotic system
• Rössler chaotic system
• Van der Pol chaotic system

Real Time Series
To further illustrate the application of the SymEn method to real data sets, we employ several real data sets, including two bioelectric signals (electroencephalogram (EEG) data from children with ASD and healthy subjects), and two mechanical signals (vibration signals of an air compressor and engine sound signals).
The brain system is extremely complicated. EEG data are the electrical signals recorded from the brain while it is at work. Therefore, brain function analysis based on EEG has been of enormous interest. A number of studies have also examined the nonlinear behavior of brain dynamics in EEG analysis. However, EEG signals are complicated and random-like, and it has been challenging to detect the nonlinear properties of EEG data [38]. Here, the EEG signals at the right occipital electrode (O2) are collected during the resting state with open eyes for the children with ASD and healthy subjects, respectively. The sampling conditions are detailed in previous papers [39,40].
In the mechanical engineering field, there are many complex mechanical systems, such as air compressors and car engines [41,42]. Here, the nonlinear characteristics in the abnormal vibration signals of the air compressor and the sound data of faulty diesel engines are detected by the SymEn method. For the air compressor, the vibration acceleration data are collected at a 25,600 Hz sampling frequency. The length of the measured data is 32,768 points. For the diesel engine, the sound time series are sampled at 48 kHz.


Surrogate Data and Null Hypothesis
Surrogate data have been widely employed to detect nonlinear dynamics in real data and typical chaotic time series [36,43,44]. The idea of surrogate data is to first specify, as a null hypothesis, a stochastic process with some linear properties that are consistent with the original data, then generate surrogate data sets according to this hypothesis, and calculate a test statistic for the original data and for each surrogate. If the statistic value of the original data is significantly different from those of its surrogate data, the null hypothesis is rejected, and nonlinearity is detected. Here, we apply a linear autocorrelated Gaussian noise process as the null hypothesis. This null hypothesis can generate surrogate data with the same mean, variance, and power spectrum as the original data. The algorithm for this null hypothesis has been detailed in our earlier work [44]. In this study, the SymEn measure is used as a test statistic, as are the ApEn, SampEn, and FuzzyEn measures. To quantify the degree of difference between the original data and its surrogate data, the z value is given:

$$z = \frac{\left| Q_{\mathrm{orig}} - \bar{Q}_s \right|}{\sigma_s}$$

where $Q_{\mathrm{orig}}$ is the test statistic value for the original data, $\bar{Q}_s$ is the mean of the statistic values for the surrogates, and $\sigma_s$ is the standard deviation (SD) of the statistic values for the surrogates. A value z > 1.96 means that the null hypothesis can be rejected for two-sided testing at the 95% confidence level (α = 0.05).
For α = 0.05, the number of surrogates is B = 2/α − 1 = 39 [36,38,45]. Accordingly, 39 sets of surrogate data are generated for each analyzed data set in this study. In addition, the analysis length for each data set is 1000 points.
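The surrogate test can be sketched as follows. The authors' surrogate algorithm is given in [44] and is not reproduced in this text, so the sketch substitutes a standard phase-randomized Fourier surrogate, which preserves the mean, variance, and power spectrum of the original series; this substitution is an assumption. The names `phase_randomized_surrogate` and `surrogate_z`, the sample (ddof = 1) standard deviation, and the fixed seed are likewise illustrative choices. Any scalar test statistic, such as an entropy estimate, can be passed as `statistic`.

```python
import numpy as np


def phase_randomized_surrogate(x, rng):
    """Return a surrogate with the same mean and power spectrum as x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    spec = np.fft.rfft(x - mean)
    phases = rng.uniform(0.0, 2.0 * np.pi, spec.size)
    phases[0] = 0.0                 # the zero-frequency bin must stay real
    if n % 2 == 0:
        phases[-1] = 0.0            # the Nyquist bin must stay real as well
    surrogate = np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=n)
    return surrogate + mean


def surrogate_z(statistic, x, n_surrogates=39, seed=0):
    """z = |Q_orig - mean(Q_s)| / std(Q_s) over n_surrogates surrogates."""
    rng = np.random.default_rng(seed)
    q_orig = statistic(x)
    q_surr = np.array([statistic(phase_randomized_surrogate(x, rng))
                       for _ in range(n_surrogates)])
    sigma = q_surr.std(ddof=1)      # sample SD; the choice of ddof is assumed
    if sigma == 0.0:
        raise ValueError("surrogate statistics have zero spread")
    return float(abs(q_orig - q_surr.mean()) / sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(1000)
    # A simple time-asymmetry statistic, used here only as a placeholder.
    stat = lambda v: float(np.mean(np.diff(v) ** 3))
    print("z =", surrogate_z(stat, x))
```

A z value above 1.96 would then be read as a rejection of the linear null hypothesis at the 95% level, as described in the text.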

Application to Synthetic Time Series
In this section, the proposed SymEn method is applied to analyze the synthetic time series (Gaussian white noise and three chaotic time series) described in Section 2.2.1. The results are compared with those given by the ApEn, SampEn, and FuzzyEn methods.

Tests on Gaussian White Noise Process
For the noise data from the Gaussian white noise process, we note that its surrogates, which are generated according to the null hypothesis of a linear autocorrelated Gaussian noise process, should give negative results. In other words, the null hypothesis should not be rejected. Figure 2 gives the analysis of the noise time series and its surrogate data for embedding dimensions d = 2 to 25 for the four entropy methods. The results show that the entropy values differ among the methods (see Figure 2). However, there is no significant difference between the noise and its surrogate data for any of the four entropy methods. These results indicate that the null hypothesis cannot be rejected at the 95% confidence level. In other words, the noise data are consistent with the same dynamical process as the surrogates from the null hypothesis. Considering that the noise has an equiprobable distribution across the embedding dimension d (see Figure 1), its entropy values should increase with the embedding dimension. The SymEn values of the noise and its surrogates do increase with the embedding dimension (see Figure 2a). However, for the approximate entropy, the values decrease as the embedding dimension increases (see Figure 2b). For the sample entropy, the values vary as the embedding dimension increases, and no values are obtained in the higher dimensions (see Figure 2c). For the fuzzy entropy, the values decrease slightly in the lower dimensions and then increase rapidly in the higher dimensions (see Figure 2d).
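The expected growth of the entropy with d in the equiprobable case can be made explicit with a short worked equation. It assumes that SymEn takes the Shannon form over the direction probabilities and that the natural logarithm is used; the base of the logarithm is an assumption here.

```latex
p_i = \frac{1}{d} \quad (i = 1, \ldots, d)
\quad\Longrightarrow\quad
\mathrm{SymEn} = -\sum_{i=1}^{d} \frac{1}{d} \log \frac{1}{d} = \log d
```

Under that assumption the ideal value would rise from log 2 ≈ 0.69 at d = 2 to log 25 ≈ 3.22 at d = 25. A finite-length noise record would sit slightly below log d, because the estimated energies are never exactly equal and log d is the maximum of the Shannon entropy over d outcomes.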

Tests on Chaotic Dynamical Systems
Three chaotic time series from three chaotic dynamical systems are analyzed by the SymEn, ApEn, SampEn, and FuzzyEn methods, respectively. Figures 3-5 give the results of the four entropy methods for these time series and their surrogate data. Figure 3 shows that the Lorenz time series is different from its surrogate data for embedding dimensions d = 2 to 25. The SymEn values of the original data increase and then level off as the embedding dimension increases, while those of its surrogate data keep increasing. The difference between the original data and its surrogates widens as the dimension increases. Moreover, the results of the ApEn, SampEn, and FuzzyEn methods are similar to that of the SymEn method, although the entropy values differ among the methods. That is, the entropy values of the raw data are also different from those of its surrogate data for the ApEn, SampEn, and FuzzyEn methods. The results indicate that the null hypothesis is rejected at the 95% confidence level for all four entropy methods. The Lorenz chaotic time series is not a random signal generated from a linear autocorrelated Gaussian process, and its dynamical properties are not the same as those of the surrogate data. The raw data should therefore contain some nonlinear components. Furthermore, the chaotic time series from the Rössler and Van der Pol chaotic dynamical systems are analyzed by the SymEn, ApEn, SampEn, and FuzzyEn methods (see Figures 4 and 5). There is a significant difference between the entropy values of the raw Rössler and Van der Pol chaotic data and those of their respective surrogate data. The null hypothesis is rejected at the 95% confidence level for both the raw Rössler and Van der Pol chaotic data. The results indicate that there are nonlinear characteristics in the raw Rössler and Van der Pol chaotic data.

By comparison with the ApEn, SampEn, and FuzzyEn methods, the SymEn method is able to detect the nonlinearity of the chaotic time series whose underlying dynamic property is known.

Application to Real Time Series
To further test the SymEn method on real data, results are reported for experimental time series from several sources. By comparison with the ApEn, SampEn, and FuzzyEn methods, we attempt to illustrate that SymEn can be regarded as a measure that reflects the dynamic characteristics of time series from real complex systems.


The EEG for ASD and Healthy Subjects
EEG signals have been widely used to investigate electrical brain activity. Although many nonlinear measures (such as ApEn and SampEn [32,46]) have been employed to deal with EEG signals, the research has mostly focused on the classification analysis of EEG signals. However, an exploration of the nonlinearity in the EEG data is lacking. In particular, few studies detect the nonlinearity of the raw EEG data of ASD. For children with ASD, some nonlinear measures have been applied to directly analyze the corresponding EEG data in the resting state in order to explore the brain mechanics of ASD [40]. In this paper, we perform nonlinearity tests based on the SymEn measure on the EEG signals (O2) of ASD and healthy subjects. Figure 6 shows the raw EEG data for an ASD subject in the resting state with open eyes, along with the SymEn functions of the raw EEG data and its surrogates. The SymEn values of the raw EEG are distinctly different from those of its surrogates, particularly in the higher dimensions (Figure 6b,c). The test of the raw EEG data can reject the null hypothesis of a linear stochastic process at the 95% level (z > 1.96). Meanwhile, the results of the ApEn, SampEn, and FuzzyEn methods are also shown in Figure 6. For the ApEn method, there is some difference between the raw data and its surrogate data in the low dimensions (see Figure 6d). Discrepancies also exist between the SampEn values of the raw data and its surrogate data in some embedding dimensions (see Figure 6e). The result of the FuzzyEn method is similar to that of the SampEn method (see Figure 6f). However, the differences between the raw data and its surrogates for the ApEn, SampEn, and FuzzyEn methods are no more obvious than that for the SymEn method, particularly in the higher embedding dimensions.

For the healthy subject, the results are similar to those of the ASD subject. Figure 7 presents the raw EEG signal as well as the SymEn, ApEn, SampEn, and FuzzyEn measures of the raw data and its surrogates. It can be observed that the SymEn values of the raw data do not belong to the distribution of the SymEn functions of the surrogates (see Figure 7b). Particularly in the dimension d = 25 (z > 1.96), the test of the raw data rejects the hypothesis of belonging to the same distribution as all 39 surrogates (see Figure 7b,c). The result indicates that the raw EEG data contain nonlinear dynamical properties. The raw EEG data should come from a nonlinear dynamical system. However, the results for the ApEn, SampEn, and FuzzyEn methods are not better than that of the SymEn method (see Figure 7b,d-f).

The Time Series for Diesel Engine and Air Compressor
Mechanical systems are also a kind of complex system; for example, diesel engines and air compressors.In engine systems, acoustic signals are often applied to detect the mechanical faults, because they can provide significant dynamic information on the engines [42].Many acoustic methods are employed to deal with engine faults, such as acoustic emission analysis [47].However, fault diagnosis based on acoustic signals is often easily affected by background noise [42].Few studies employ nonlinear analysis methods to explore the characteristics of the abnormal sound in diesel engines.In this paper, we first apply the proposed SymEn function to test the nonlinearity of the abnormal sound of the diesel engine.Figure 8 presents one episode of the original sound data in the faulty diesel engine and its SymEn values with those of its surrogates.Our SymEn-based tests applied to the sound data reject the null hypothesis with 95% confidence over 39 surrogates.In dimension d = 25, the SymEn value of the original data is obviously different from those of its surrogates with the difference degree z = 16.99 (z > 1.96).The results show that the dynamical characteristics of the raw sound data are not linear.The raw data should come from a nonlinear system.Meanwhile, the ApEn, SampEn, and FuzzyEn methods are also employed to analyze the sound data (see Figure 8d-f).In general, the results are worse than that of the SymEn method, although the ApEn values show some difference between the raw data and its surrogate in d = 3,4.
Finally, we examine the vibration acceleration data from an air compressor and its surrogates. Air compressors are widely used mechanical systems. The working state of an air compressor becomes unstable due to the interaction of the pistons and the pressure instability of the compressed air output under volatile gas-consumption conditions. In addition, environmental noise often has a strong impact on the vibration measurement. Hence, the measured vibration signals often show complex dynamical behavior. This paper considers the vibration acceleration collected in the fault state of the air compressor. Figure 9 gives the original series and its SymEn function, as well as those of its surrogate data. The SymEn of the raw data is clearly distinguishable from those of its surrogates. The null hypothesis is rejected at the 95% confidence level, especially at d = 25 (z > 1.96). The raw data therefore do not come from the linear process assumed in the null hypothesis, which indicates that the raw data are nonlinear. The ApEn, SampEn, and FuzzyEn methods are also used to analyze the raw vibration data (see Figure 9d-f); their results are worse than those of the SymEn method.
Entropy 2016, 18, 412 13 of 18
From the above analysis, the results have shown that the proposed SymEn function could be a suitable measure for nonlinearity tests, not only on synthetic data, but also on the limited samples of noisy measurements from real world systems.
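The decision rule used for the sound and vibration data can be sketched in a few lines of Python. The sketch assumes that the difference degree z takes the usual surrogate-test form, z = |Q_raw - mean(Q_surr)| / std(Q_surr); this form is an assumption here and should be checked against Equation (15). The function names and all numerical values are hypothetical and do not reproduce the measured data. The sketch also shows the rank-based reading of the test: with 39 surrogates, a raw value lying outside the whole surrogate range has probability 2/40 = 0.05 under the null hypothesis, which corresponds to a two-sided test at the 95% level.

```python
import numpy as np

def difference_degree(q_raw, q_surr):
    """Assumed form of the difference degree:
    z = |q_raw - mean(q_surr)| / std(q_surr), using the sample std (ddof=1)."""
    q_surr = np.asarray(q_surr, dtype=float)
    return abs(q_raw - q_surr.mean()) / q_surr.std(ddof=1)

def rank_test_rejects(q_raw, q_surr):
    """Two-sided rank test: reject the null hypothesis when the raw
    statistic lies outside the range spanned by the surrogates."""
    q_surr = np.asarray(q_surr, dtype=float)
    return bool(q_raw < q_surr.min() or q_raw > q_surr.max())

# Hypothetical entropy values for 39 surrogates and one raw series.
rng = np.random.default_rng(0)
q_surr = rng.normal(loc=2.0, scale=0.05, size=39)
q_raw = 1.2

z = difference_degree(q_raw, q_surr)
print("z > 1.96:", z > 1.96)
print("rank test rejects:", rank_test_rejects(q_raw, q_surr))
```

The z value assumes that the surrogate statistics are roughly Gaussian, whereas the rank test makes no such assumption, so reporting both gives a simple cross-check.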

Discussion
We propose the SymEn function method and test it on four types of synthetic time series. For Gaussian white noise, the distribution is equiprobable, and the probability distribution in the SymEn function is correspondingly equal across directions (Figure 1). The SymEn method thus reflects the distribution of the noise data effectively. For typical chaotic time series, the SymEn function separates the raw chaotic data from its surrogate data (Figures 3-5), so the test favors rejection of the null hypothesis. These results are consistent with those of the ApEn, SampEn, and FuzzyEn functions. The SymEn measure also performs well over a wider range of embedding dimensions. The ApEn and SampEn measures are unable to discriminate a chaotic system from its surrogate data sets in higher embedding dimensions. The FuzzyEn function performs better than these two, but worse than SymEn.
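The equiprobable behavior of white noise can be illustrated with a simplified stand-in for SymEn. The sketch below is not the symplectic algorithm of this paper: it replaces the symplectic transform with an ordinary symmetric eigendecomposition of the matrix built from the delay-embedded series, normalizes the eigenvalues into a probability distribution over directions, and takes the Shannon entropy. The helper names `embed` and `spectral_entropy`, the delay `tau`, and the series length are assumptions made only for illustration.

```python
import numpy as np

def embed(x, d, tau=1):
    """Delay-embed a 1-D series into an (n, d) trajectory matrix."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (d - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(d)])

def spectral_entropy(x, d, tau=1):
    """Shannon entropy of the normalized eigenvalue spectrum of X^T X / n.
    Stand-in for SymEn: ordinary eigenvalues, not a symplectic transform."""
    X = embed(x, d, tau)
    A = X.T @ X / X.shape[0]
    eig = np.clip(np.linalg.eigvalsh(A), 0.0, None)  # guard tiny negatives
    p = eig / eig.sum()
    p = p[p > 0]                                     # skip zero-probability terms
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
for d in (3, 8, 13):
    # For white noise the spectrum is nearly flat, so the entropy is close to log(d).
    print(d, spectral_entropy(noise, d), np.log(d))
```

For white noise the eigenvalues are nearly equal, so the entropy stays close to its maximum log(d) and grows with the embedding dimension. A strongly structured signal such as a pure sinusoid concentrates its energy in two directions and gives an entropy of at most log(2).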
For the nonlinearity test of EEG signals, previous studies found that the results varied with the method used and with the brain area recorded [43,48]. Theiler et al. [43] found that the EEG data at the left central site (C3) could reject the null hypothesis of a linear stochastic process, whereas the other EEG data (O1) could not. Kugiumtzis also pointed out that cross-examining the results with other methods seemed to be necessary [48]. To examine the practicability of the SymEn function method, the difference degree z values (see Equation (15)) of the four entropy methods are given in Figure 10 for the EEG signals of the ASD and healthy subjects. For a suitable embedding dimension (such as d = 5), the tests on the EEG signals are consistent across the four entropy methods (Figure 10). For the ApEn, SampEn, and FuzzyEn, the difference degrees vary with the embedding dimension. The SymEn method, however, performs better than the other three: for the embedding dimensions d = 2:25, all z values of the SymEn function are larger than 1.96, so the null hypothesis is rejected over a wide range of embedding dimensions. The EEG data may therefore come from a nonlinear brain system, which is consistent with the previous literature [43,48].
For the sound and vibration signals, the difference degree z values for the discrimination between the original data and its surrogate data are shown in Figure 11. The ApEn, SampEn, and FuzzyEn methods do not yield difference degrees larger than 1.96 for all d = 2:25. However, for some embedding dimensions, such as d = 3 or 4, the four methods give consistent discriminations for the sound and vibration signals (Figure 11a,b), which suggests rejection of the null hypothesis. For the SymEn method, the performance of the test is uniform across the embedding dimensions d = 2:25. The results indicate that the SymEn method is better than the ApEn, SampEn, and FuzzyEn methods.
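The comparison across embedding dimensions amounts to computing one z value per dimension and checking where it exceeds 1.96. A minimal sketch follows, again assuming the z-score form z = |Q_raw - mean(Q_surr)| / std(Q_surr) for the difference degree. The function name `z_profile` and the synthetic entropy values are hypothetical; they only mimic a method whose discriminating power fades as d grows and are not taken from Figures 10 or 11.

```python
import numpy as np

def z_profile(raw, surr):
    """Difference degree per embedding dimension.
    raw:  shape (D,)   statistic of the raw series for each dimension
    surr: shape (M, D) statistic of M surrogates for each dimension"""
    raw = np.asarray(raw, dtype=float)
    surr = np.asarray(surr, dtype=float)
    return np.abs(raw - surr.mean(axis=0)) / surr.std(axis=0, ddof=1)

dims = np.arange(2, 26)  # embedding dimensions d = 2:25
rng = np.random.default_rng(1)

# Hypothetical surrogate statistics: 39 surrogates clustered around 1.0.
surr = 1.0 + 0.02 * rng.standard_normal((39, dims.size))
# Hypothetical raw statistic whose offset from the surrogates shrinks with d.
raw = 1.0 + 0.5 / dims

z = z_profile(raw, surr)
rejected = dims[z > 1.96]
print(f"null hypothesis rejected in {rejected.size} of {dims.size} dimensions")
```

A method that is uniform in the sense described above would reject in every one of the 24 dimensions, whereas a dimension-sensitive method rejects only in part of the range.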

Conclusions
In this paper, the symplectic entropy (SymEn) measure has been proposed to measure the disorder of a system. Like the Shannon entropy, it is a logarithmic measure of the average amount of energy of the underlying probability distribution in different directions of a system. The higher the disorder of the system, the higher the entropy. For a Gaussian white noise process, the SymEn method reflects this point, because the SymEn values become larger as the embedding dimension increases. We also study the effectiveness and practicability of the SymEn method experimentally by testing the nonlinearity of synthetic chaotic data and real time series. In comparison with the ApEn, SampEn, and FuzzyEn entropy measures, the proposed SymEn method performs better. The SymEn method can reflect the characteristics of a system and can be used to detect whether there are nonlinear components in real data from a complex system.


Figure 1. The probability values of the Gaussian white noise in different directions in dimension d = 3:5:28.

Figure 2. The characteristic test of the Gaussian white noise process: (a) the symplectic entropy (SymEn) values of noise and its surrogate data; (b) the approximate entropy (ApEn) values of noise and its surrogate data; (c) the sample entropy (SampEn) values of noise and its surrogate data; (d) the fuzzy entropy (FuzzyEn) values of noise and its surrogate data.

Figure 3. Comparison of nonlinear tests on four entropy methods for the Lorenz chaotic dynamical system.

Figure 4. Comparison of nonlinear tests on four entropy methods for the Rössler chaotic dynamical system.

Figure 5. Comparison of nonlinear tests on four entropy methods for the Van der Pol chaotic dynamical system.

Figure 6. The nonlinearity test on the electroencephalogram (EEG) signal of the ASD subject based on the SymEn method by comparison with the ApEn, SampEn, and FuzzyEn methods: (a) The raw EEG data in the right O2; (b) The SymEn functions of the raw data and its surrogates in the embedding dimension d = 2:25; (c) The SymEn values in d = 25 for the raw data and its surrogates, * for the raw data; (d) The ApEn values of the raw data and its surrogates in the embedding dimension d = 2:25; (e) The SampEn values of the raw data and its surrogates in the embedding dimension d = 2:25; (f) The FuzzyEn values of the raw data and its surrogates in the embedding dimension d = 2:25.

Figure 7. The nonlinearity test on the EEG signal of the healthy subject: (a) The raw EEG data in the right O2; (b) The SymEn functions of the raw data and its surrogates in d = 2:25; (c) The SymEn measures in d = 25 for the raw data and its surrogates; (d) The ApEn values of the raw data and its surrogates in d = 2:25; (e) The SampEn values of the raw data and its surrogates in d = 2:25; (f) The FuzzyEn values of the raw data and its surrogates in d = 2:25.

Figure 8. The test on the abnormal sound in the diesel engine: (a) The raw 1000-point sound time series; (b) The SymEn functions of the raw data and its surrogates in d = 2:25; (c) In d = 25, the SymEn measures of the raw data and its surrogates, * for the raw data, histogram for its surrogates; (d) The ApEn values of the raw data and its surrogates in d = 2:25; (e) The SampEn values of the raw data and its surrogates in d = 2:25; (f) The FuzzyEn values of the raw data and its surrogates in d = 2:25.

Figure 9. The test on the acceleration time series in the vibration of the air compressor: (a) The raw acceleration signal; (b) The SymEn measures in d = 2:25 for the raw data and its surrogates; (c) The SymEn measure (*) of the raw data in d = 25, and the histogram of the SymEn values for its surrogates; (d) The ApEn values of the raw data and its surrogates in d = 2:25; (e) The SampEn values of the raw data and its surrogates in d = 2:25; (f) The FuzzyEn values of the raw data and its surrogates in d = 2:25.

Figure 10. The difference degree z values of the entropy values in Figures 6 and 7 for the four entropy methods. (a) ASD subject; (b) Healthy subject.

Figure 11. The difference degree z values of the entropy values in Figures 8 and 9 for the four entropy methods. (a) Air compressor; (b) Diesel engine.