On Entropy of Probability Integral Transformed Time Series

The goal of this paper is to investigate the changes of entropy estimates when the amplitude distribution of the time series is equalized using the probability integral transformation. The data we analyzed had known properties: pseudo-random signals with known distributions, mutually coupled using statistical or deterministic methods that include generators of statistically dependent distributions, linear and non-linear transforms, and deterministic chaos. The signal pairs were coupled using a correlation coefficient ranging from zero to one. The dependence of the signal samples was achieved by a moving average filter and non-linear equations. The applied coupling methods were checked using statistical tests for correlation. The changes in signal regularity were checked by a multifractal spectrum. The probability integral transformation was then applied to cardiovascular time series (systolic blood pressure and pulse interval) acquired from laboratory animals, and the results of the entropy estimations are presented. We derived an expression for the reference value of entropy in the probability integral transformed signals. We also experimentally evaluated the reliability of entropy estimates concerning the matching probabilities.


Introduction
The sampling theorem [1] paved a way for pervasive signal processing within the scientific fields where it was once inconceivable. Tools developed for classical thermodynamics or communications engineering found new multidisciplinary implementation.
The function developed to estimate the uncertainty of communication signals, entropy [2], attracted the attention of scientists from a range of different fields. Other entropy concepts were accepted as well, namely those of Kolmogorov-Sinai [3], Grassberger et al. [4] and Eckmann et al. [5], despite their difficult implementation and rigid theoretical framework.
To ensure easy implementation, Pincus [6] proposed the approximate entropy (ApEn), which avoids the rigid mathematical requirements of its theoretical predecessors (hence the name "approximate"). Researchers readily accepted ApEn and its modification SampEn (sample entropy [7]), as confirmed by the rapidly growing number of citations [8].
Medical researchers quickly realized the benefits of signal processing [9] and successfully applied ApEn and SampEn, in particular to cardiovascular signals: for heart rate variability (HRV) analysis in patients with type 2 diabetes [10], in patients with heart failure [11], in healthy subjects [12] during ...

Probability Integral Transform, Sklar's Theorem and Copula Density
The probability integral transform (PIT, or PI-transform) converts a random variable (RV) x with an arbitrary distribution function $F_x(x)$ into an RV y uniformly distributed on the segment [0, 1] [35]. The function used for the transformation is the distribution function of signal x, i.e., $y = F_x(x)$. The resulting distribution of y is uniform (Figure 1). The proof can be found in textbooks on probability and random variables (e.g., p. 139, [37]), but it is included here for completeness. From Figure 1 it is obvious that the probabilities $\Pr\{x \le x_0\}$ and $\Pr\{y \le y_0\}$ are equal. The same applies to the distribution functions: $F_x(x_0) = \Pr\{x \le x_0\}$ is equal to $F_y(y_0) = \Pr\{y \le y_0\}$. Additionally, the PIT transformation rule states that $y_0 = F_x(x_0)$, so the following may be written:
$$F_y(y_0) = \Pr\{y \le y_0\} = \Pr\{x \le x_0\} = \Pr\{x \le F_x^{-1}(y_0)\} = F_x\left(F_x^{-1}(y_0)\right) = y_0 \qquad (1)$$
The distribution function $F_y(y_0)$ is a linear function of $y_0$. Its derivative is a constant, so the distribution of signal y is indeed uniform.
An illustrative example of signals transformed by PIT is presented in Figure 2, showing the systolic blood pressure (SBP) and pulse interval (PI) of a laboratory rat before and after PIT application.
PIT gained its popularity during the early days of the digital era, as its inverse produces a random signal of arbitrary distribution. When software packages started to provide built-in distribution generators, PIT almost fell into oblivion, but not for long. Sklar's theorem [38], although derived in the early sixties, came into the research focus at the turn of the century and brought PIT to the forefront again.
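The transformation rule $y = F_x(x)$ can be estimated from a finite record by replacing the unknown distribution function with the empirical one. The following is a minimal sketch under that assumption; the function name `pit`, the rank/(N + 1) scaling and the gamma test signal are illustrative choices, not taken from the paper.

```python
import random


def pit(signal):
    """Empirical probability integral transform based on ranks.

    Each sample is replaced by rank / (N + 1), an estimate of Fx(x) that
    stays strictly inside (0, 1). Ties are broken by order of appearance,
    which is an assumption made for this sketch.
    """
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


random.seed(0)
# Skewed source signal (gamma with shape 1 and scale 2, chosen for illustration).
x = [random.gammavariate(1.0, 2.0) for _ in range(3000)]
u = pit(x)

# A uniform signal places the same share of samples in every decile.
counts = [0] * 10
for value in u:
    counts[int(value * 10)] += 1
print(counts)
```

Because the transform depends only on the ordering of the samples, the decile counts come out equal whatever the source distribution is.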
Sklar's theorem states that every D-dimensional (multivariate) distribution function $H(x_{01}, x_{02}, \ldots, x_{0D}) = \Pr\{x_1 \le x_{01}, \ldots, x_D \le x_{0D}\}$ can be expressed in terms of its marginals $F_{x_i}(x_{0i}) = \Pr\{x_i \le x_{0i}\}$, $i = 1, \ldots, D$, which are uniformly distributed, and a joint distribution, a copula C, that binds them, i.e.,
$$H(x_{01}, \ldots, x_{0D}) = C\left(F_{x_1}(x_{01}), \ldots, F_{x_D}(x_{0D})\right)$$
An alternative interpretation can be formulated if we recollect that each marginal is uniformly distributed, i.e., $F_{x_i}(x_{0i}) = u_i$, $i = 1, \ldots, D$:
$$C(u_1, \ldots, u_D) = H\left(F_{x_1}^{-1}(u_1), \ldots, F_{x_D}^{-1}(u_D)\right)$$
Despite the abstract theoretical definition, the implementation and interpretation of a copula are simple. Copulas are distribution functions, and their derivatives are probability density functions, called the copula density.
The copula density depicts the dependency structure (density of dependency) of the composite signals. The ability to visualize the dependency structure, especially for bivariate signals, is a unique advantage of the copula density. To estimate the empirical copula density, it is sufficient to apply the probability integral transform to the source signals and find their joint probability density function.
An example of empirical copula density is shown in Figure 3. The left panel shows the classical joint probability density function of the systolic blood pressure (SBP) and pulse interval (PI) signals of a laboratory rat. The right panel presents the copula density of the same signals (D = 2), in the abstract two-dimensional $[0, 1]^D$ copula plane. The dependency structure reveals the linear relationship between SBP and PI that corresponds to the baroreflex, a major regulatory feedback that helps to maintain blood pressure at a nearly constant level.
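The estimation recipe described above (PI-transform both signals, then take their joint normalized histogram) can be sketched as follows. This is an illustration on synthetic Gaussian data, not on the SBP and PI recordings; the helper names, the bin count and the coupling model are assumptions.

```python
import random


def pit(signal):
    """Rank-based empirical probability integral transform (values in (0, 1))."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def empirical_copula_density(x, y, bins=10):
    """Joint normalized histogram of the PI-transformed pair on [0, 1]^2."""
    ux, uy = pit(x), pit(y)
    n = len(ux)
    cell_area = (1.0 / bins) ** 2
    density = [[0.0] * bins for _ in range(bins)]
    for a, b in zip(ux, uy):
        density[int(a * bins)][int(b * bins)] += 1.0 / (n * cell_area)
    return density


random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(3000)]
y = [xi + 0.5 * random.gauss(0.0, 1.0) for xi in x]  # positively coupled pair

dens = empirical_copula_density(x, y)
diag = sum(dens[k][k] for k in range(10)) / 10       # mean density on the diagonal
anti = sum(dens[k][9 - k] for k in range(10)) / 10   # mean density on the anti-diagonal
print(diag, anti)
```

For independent signals the density would be close to 1 in every cell; a positive coupling concentrates it along the main diagonal.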
Entropy 2020, 22, x FOR PEER REVIEW 4 of 21
Figure 3. (a) Joint probability density function of systolic blood pressure (SBP) and pulse interval (PI) of a laboratory rat; (b) copula density of PI-transformed SBP and PI signals. Note that the linear dependency structure along the diagonal in (b) is consistent with the linear cardiovascular relationship of SBP and PI in healthy subjects.
Another property of copulas is that they quantify the strength of signal coupling. The copula approach differs from other similar procedures because it can handle more than two signals and capture non-linear dependencies and, above all, because it can be adapted to the properties of the observed signals. There are many copula sets ("copula families" [39]), and each one is adapted to a particular signal type. For example, Frank copulas are the most suitable for cardiovascular signals [34]. It is this feature that has brought copulas great popularity in many research domains, from finance [40], telecommunications [41], civil engineering [42], geodesy [43] and climatology [44] to medicine [45] and cardiology [34].

XEn, ApEn and SampEn
The procedures for estimating ApEn and SampEn, originally introduced in [6,7], are repeatedly described in most papers that implement them. We shall give a brief recap of XEn as a general procedure and outline the differences with respect to ApEn and SampEn.
If ApEn or SampEn is implemented, there is just a single series X, and Y = X in the remaining explanation.
Time series must be pre-processed before further analysis. Signal comparability is ensured by z-normalization (standard scaling, with the signal mean and standard deviation set to 0 and 1, respectively). The estimation of statistical moments requires stationary time series, which is ensured by a filter designed specifically for biomedical time series [47].
Time series are then divided into overlapping vectors of length m (m is usually 2, 3 or 4):
$$X_m^{(i)} = [x(i), x(i+1), \ldots, x(i+m-1)], \quad Y_m^{(j)} = [y(j), y(j+1), \ldots, y(j+m-1)], \quad i, j = 1, \ldots, N-m+1$$
A distance between each template $X_m^{(i)}$ and each follower vector $Y_m^{(j)}$ is defined as the maximal absolute sample difference:
$$d\left(X_m^{(i)}, Y_m^{(j)}\right) = \max_{k = 0, \ldots, m-1} \left| x(i+k) - y(j+k) \right|$$
If the distance is less than or equal to the predefined threshold r, the vectors are declared similar.
The template matching probability $\hat{p}_i^{(m)}(r)$ is estimated as:
$$\hat{p}_i^{(m)}(r) = \frac{1}{N-m+1} \sum_{j=1}^{N-m+1} I\left\{ d\left(X_m^{(i)}, Y_m^{(j)}\right) \le r \right\}, \quad i = 1, \ldots, N-m+1 \qquad (4)$$
The sign "ˆ" in (4) denotes an estimate, while I{·} is an indicator function that is equal to 1 if $d(X_m^{(i)}, Y_m^{(j)}) \le r$ and to zero otherwise. It is used as a mathematical description of the counting process, so (4) estimates the relative frequency of vectors similar to the template $X_m^{(i)}$. For SampEn and XSampEn, the last vector (index N − m + 1) is excluded from averaging. SampEn also excludes self-matching, which occurs when the template vector $X_m^{(i)}$ is compared to itself.
Averaging the probabilities is different for ApEn and SampEn, and for the corresponding cross-entropies. For (X)ApEn, the logarithms of the probabilities (the information contents of each template [2]) are averaged:
$$\Phi^{(m)}(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln \hat{p}_i^{(m)}(r) \qquad (5)$$
For (X)SampEn, the logarithm is taken of the averaged probabilities:
$$\Psi^{(m)}(r) = \ln \left( \frac{1}{N-m} \sum_{i=1}^{N-m} \hat{p}_i^{(m)}(r) \right)$$
The complete procedure is repeated for the vectors of length m + 1, yielding $\Phi^{(m+1)}(r)$ and $\Psi^{(m+1)}(r)$. The final entropy estimates are:
$$\mathrm{(X)ApEn}(m, r, N) = \Phi^{(m)}(r) - \Phi^{(m+1)}(r), \qquad \mathrm{(X)SampEn}(m, r, N) = \Psi^{(m)}(r) - \Psi^{(m+1)}(r)$$
While the sample entropy is a robust estimator [7], the approximate entropy can suffer from inconsistencies, as it is based on the logarithms of probability estimates with accumulating estimation errors (cf. Equation (5)). The time series length N and the threshold r are both pointed out as the primary causes of inconsistencies [7,14,22,25]. One of the main adverse outcomes is the zero matching probability (a template vector with no similar followers). The various corrections proposed in [8,22] converge towards the true entropy values when the time series length N converges to infinity. We implemented a simple correction, which turned out to be close to the true entropy value regardless of N [22].
As already mentioned, this paper aims to investigate the ApEn, SampEn, and XEn of PI-transformed cardiovascular time series, and PIT is closely related to the copula density. However, another entropy measure related to copulas, the copula entropy, already exists [48-50].
It is based on the Shannon entropy [2], with corresponding probabilities evaluated as normalized histograms of the empirical copula density.
The Shannon entropy applied to a time series is a static measure. If the order of signal samples is permuted in time, as in isodistributional surrogate signals [51], the Shannon entropy will remain unaltered because the probability density function remains the same, as it is derived from the amplitude values, regardless of their position in time. Hence, Shannon entropy reflects the level of orderliness in spatial, but not in the temporal domain.
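The static nature of the Shannon entropy can be checked numerically: any reordering of the samples leaves the normalized histogram, and hence the entropy, unchanged. The sketch below assumes a fixed number of equal-width bins and a natural logarithm; both are illustrative choices.

```python
import math
import random


def shannon_entropy(signal, bins=20):
    """Shannon entropy (in nats) of the normalized amplitude histogram."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in signal:
        k = min(int((v - lo) / width), bins - 1)  # the maximum goes to the last bin
        counts[k] += 1
    n = len(signal)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)


random.seed(2)
x = [random.gauss(0.0, 1.0) for _ in range(3000)]

shuffled = x[:]
random.shuffle(shuffled)   # destroys the original temporal order
ordered = sorted(x)        # imposes a perfectly regular temporal order

h_original = shannon_entropy(x)
h_shuffled = shannon_entropy(shuffled)
h_ordered = shannon_entropy(ordered)
print(h_original, h_shuffled, h_ordered)
```

All three values coincide, although the sorted series is far more regular in time than the other two.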
The ApEn-based entropies reveal the signal orderliness both in the spatial domain (via threshold r) and in the temporal domain (via the vector length m and its increase to m + 1). However, if the threshold r is set to zero, it was shown that ApEn is equivalent to the Shannon conditional entropy applied to the first-order Markov chain [6]. This is, however, a theoretical abstraction, as ApEn with r = 0 cannot be practically achieved.
It was also shown that ApEn of differentially coded binary time series for r = 0 is equivalent to Shannon's binary entropy [52]. This result, although feasible, has no practical application.
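The estimation procedure recapped in this section can be sketched for the self-entropy case (Y = X) as follows. It is a direct, quadratic-time illustration with m = 2 and r = 0.3; the function names and test signals are assumptions, and no correction for zero matching probabilities is included.

```python
import math
import random
import statistics


def znorm(x):
    """Standard scaling: zero mean and unit standard deviation."""
    mu, sd = statistics.mean(x), statistics.pstdev(x)
    return [(v - mu) / sd for v in x]


def similar(x, i, j, m, r):
    """True if the length-m vectors starting at i and j differ by at most r."""
    return all(abs(x[i + k] - x[j + k]) <= r for k in range(m))


def apen(x, m=2, r=0.3):
    """Approximate entropy: average of log-probabilities, self-matches included."""
    x = znorm(x)
    phi = []
    for mm in (m, m + 1):
        count = len(x) - mm + 1
        logs = 0.0
        for i in range(count):
            matches = sum(similar(x, i, j, mm, r) for j in range(count))
            logs += math.log(matches / count)  # the self-match keeps this finite
        phi.append(logs / count)
    return phi[0] - phi[1]


def sampen(x, m=2, r=0.3):
    """Sample entropy: log of averaged probabilities, self-matches excluded."""
    x = znorm(x)
    count = len(x) - m  # same number of templates for m and m + 1
    totals = []
    for mm in (m, m + 1):
        totals.append(sum(similar(x, i, j, mm, r)
                          for i in range(count) for j in range(count) if i != j))
    # Equal normalizing factors cancel; math.log raises if no match is found.
    return math.log(totals[0]) - math.log(totals[1])


random.seed(3)
noise = [random.gauss(0.0, 1.0) for _ in range(300)]
sine = [math.sin(2.0 * math.pi * i / 25.0) for i in range(300)]

results = {name: (apen(sig), sampen(sig))
           for name, sig in (("noise", noise), ("sine", sine))}
print(results)
```

The irregular noise signal yields higher values of both measures than the periodic one. The short series length (300 samples) is used only to keep the pure-Python loops fast.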

Artificial Time Series
To test the entropy of signals before and after the PI-transformation, we generated random signals with Gaussian, gamma and beta distributions. The gamma and beta distributions, with parameters (α, β) equal respectively to (1,2) and (3,1), are skewed distributions with amplitudes concentrated in different regions. For each example, we generated 20 signals or signal pairs, comprising N = 3000 samples each, and the distribution of each signal was tested by the Kolmogorov-Smirnov test.
Additionally, we generated signals of deterministic chaos, whose unpredictability is governed by the deterministic law of a simple non-linear equation, the logistic map [36]:
$$x(n+1) = RP \cdot x(n) \cdot \left(1 - x(n)\right)$$
The parameter RP was chosen to be 3.81, a value that guarantees chaotic behavior over the complete signal range without oscillations. Another value, RP = 3.58, generated a chaotic signal that omits some amplitude ranges. The probability density functions (normalized histograms) of these two signals are presented in Figure 4. The second signal (RP = 3.58) is an example of a signal that cannot be used with copulas: it is not continuous, so it does not fulfill the theoretical requirements.
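The difference between the two logistic map signals can be reproduced with a short simulation. The sketch below is illustrative: the initial value, the discarded transient and the 50-bin histogram are assumptions, not values from the paper.

```python
def logistic_map(rp, n, x0=0.4, burn_in=500):
    """Iterate x(n+1) = RP * x(n) * (1 - x(n)), discarding an initial transient."""
    x = x0
    out = []
    for k in range(n + burn_in):
        x = rp * x * (1.0 - x)
        if k >= burn_in:
            out.append(x)
    return out


def occupied_bins(signal, bins=50):
    """Number of equal-width amplitude bins on [0, 1] visited by the signal."""
    return len({min(int(v * bins), bins - 1) for v in signal})


full = logistic_map(3.81, 3000)     # covers its amplitude range
banded = logistic_map(3.58, 3000)   # leaves gaps between amplitude bands
print(occupied_bins(full), occupied_bins(banded))
```

The signal generated with RP = 3.58 visits noticeably fewer amplitude bins, which is the gap structure that makes its distribution unsuitable for the copula framework.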
The statistical dependency of signal samples is induced using a moving average (MA) filter.
Statistically dependent time series with distribution functions $F_x(x)$ and $F_y(y)$ were created using the copula method as follows: the original signal points (x, y) are PI-transformed into uniform signal points $(u_x, u_y)$, and then the corresponding joint distribution is created in the unit plane $[0, 1]^2$ using the Frank copula distribution [34,39]:
$$C_\theta(u_x, u_y) = -\frac{1}{\theta} \ln \left( 1 + \frac{\left(e^{-\theta u_x} - 1\right)\left(e^{-\theta u_y} - 1\right)}{e^{-\theta} - 1} \right)$$
Finally, each point $(u_x, u_y)$ is transformed back to the original signal plane using the transform $(x, y) = \left(F_x^{-1}(u_x), F_y^{-1}(u_y)\right)$. This method generates mutually dependent time series X and Y with distributions $F_x(x)$ and $F_y(y)$, where the dependency level is given by the copula parameter θ.
Figure 5 presents an illustrative example of signals X and Y with skewed gamma and beta distributions, generated using the copula generator with parameter θ = 3. From their joint probability density function (PDF) (Figure 5c) no conclusions can be drawn about the relationship of X and Y, but strong linear coupling is clearly visible in their dependency structure (empirical copula density, Figure 5d).
Deterministic non-linear dependency is introduced using the relationship $Y = a \cdot X^{EXP} + b$, where a = b = 1 are arbitrarily chosen parameters, while the parameter EXP ranges from 1.1 to 2.
Figure 6a,b show the linear signal coupling estimated by the Pearson, Kendall and Spearman tests [53]. All three tests use different procedures: the Pearson test uses classical moment theory, while the Kendall and Spearman tests use different ranking procedures. The aim was to examine whether the PI transformation changes the linear coupling of two data series. The corresponding data series are generated using a copula generator and an MA filter. Figure 6a,b show that the PI transformation does not cause changes in the linear dependence of the two signals. The dependence between adjacent samples of a single signal also remains unchanged after the PI-transformation, as shown by the autocorrelation function in Figure 6d. However, the Pearson, Spearman, and Kendall tests, designed to capture correlation, are unable to capture non-linear dependence (Figure 6c).
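A pair of coupled gamma and beta signals of the kind shown in Figure 5 can be generated with a few lines of code. The sketch below uses conditional inversion sampling of the Frank copula, a standard technique that may differ from the generator used in this study. It also assumes that the gamma parameters (1, 2) denote shape and scale, so that both inverse distribution functions have closed forms.

```python
import math
import random


def frank_conditional_sample(u, w, theta):
    """Return v such that Pr{V <= v | U = u} = w under the Frank copula."""
    g = math.exp(-theta) - 1.0
    a = math.exp(-theta * u)
    return -math.log(1.0 + w * g / (w + (1.0 - w) * a)) / theta


def coupled_gamma_beta(n, theta, seed=0):
    """Dependent X ~ gamma(shape 1, scale 2) and Y ~ beta(3, 1) (assumed parameters)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u, w = rng.random(), rng.random()
        v = frank_conditional_sample(u, w, theta)
        xs.append(-2.0 * math.log(1.0 - u))  # inverse CDF of gamma(1, scale 2)
        ys.append(v ** (1.0 / 3.0))          # inverse CDF of beta(3, 1)
    return xs, ys


def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def spearman(x, y):
    """Spearman rank correlation (formula valid when there are no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


xs, ys = coupled_gamma_beta(3000, theta=3.0)
rho = spearman(xs, ys)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
print(rho, mean_x, mean_y)
```

The sample means stay close to the theoretical values of the two marginal distributions (2 and 0.75), while the rank correlation is clearly positive, so the coupling is introduced without distorting the marginals.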
The correlation lines of source signals and their PI-transformed counterparts in Figure 6 overlap perfectly, showing that PIT does not alter the coupling of the signal pairs.
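For the rank-based tests, the overlap of the correlation lines follows directly from the fact that PIT is a monotonic transform and therefore preserves the ordering of the samples. The sketch below checks this for the Kendall coefficient on synthetic data; the coupling model and the helper names are assumptions. The Pearson coefficient is moment-based, carries no such guarantee in general, and is not examined here.

```python
import random


def pit(signal):
    """Rank-based empirical probability integral transform."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def sign(d):
    return (d > 0) - (d < 0)


def kendall_tau(x, y):
    """Kendall rank correlation computed over all sample pairs."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return 2.0 * s / (n * (n - 1))


random.seed(6)
x = [random.gammavariate(1.0, 2.0) for _ in range(400)]
y = [xi + random.gauss(0.0, 1.0) for xi in x]   # coupled pair, skewed marginal

tau_raw = kendall_tau(x, y)
tau_pit = kendall_tau(pit(x), pit(y))
print(tau_raw, tau_pit)
```

The two coefficients are identical, since every concordant or discordant pair keeps its status after the transform.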
Figure 5. Signals X and Y with gamma and beta distributions, generated using the copula generator with parameter θ = 3: (c) joint probability density function; (d) copula density. Note that the linear statistical dependency induced by the copula parameter θ = 3 cannot be recognized in the joint probability density function (c), but it is visible in the dependency structure of the copula density (d).
To test whether the differentiation would be possible at all, we estimated a multifractal spectrum that describes the fluctuation of the local regularity of the observed signals. The multifractal spectrum is estimated in terms of wavelet leaders [54].
The results in Figure 7 reveal that PIT induces changes in signal regularity. The inset in the upper right corner of Figure 7a shows that the spectra of the transformed pseudo-random signals differ from the original spectra; moreover, all transformed spectra overlap, as the signals acquire the same distribution. The spectrum of deterministic chaos reveals monofractal properties and remains monofractal after the PI-transform (Figure 7b). The spectra of signals filtered by the MA filter differ from those of the source signals, showing that the local regularity of the signals has changed.


Time Series Recorded from the Laboratory Rats Exposed to Shaker and Restraint Stress
The real cardiovascular signals we used to check the PIT entropy concept were recorded at the Laboratory of Cardiovascular Pharmacology, Medical Faculty, University of Belgrade, from outbred male Wistar rats weighing 330 ± 20 g.
Ten days before the experiments, radio-telemetric probes (TA11PA-C40, DSI, Transoma Medical, St. Paul, MN, USA) were implanted into the abdominal aorta under combined ketamine and xylazine anesthesia, along with gentamicin and followed by metamizole injections for pain relief. The arterial blood pressure (BP) signal was digitized at 1000 Hz and relayed to a computer equipped with Dataquest A.R.T. 4.0 software for analysis of cardiovascular signals.
Rats were randomized into two groups. The first group was exposed to shaker stress, with rats positioned on a platform shaking at 200 cycles/min. The second group was exposed to restraint stress, with rats placed in a Plexiglas restrainer tube (ID 5.5 cm with pores) in the supine position. Arterial blood pressure (BP) waveforms were recorded before (CONTROL) and after the first exposure to stress (SHAKER, RESTRAINT). Other phases of the experimental protocol are not relevant for this study [55].
Systolic blood pressure was derived from the arterial BP as the local maxima of the BP waveforms, while the pulse interval (PI) time series was derived as the time distance between successive maximal arterial blood pressure increases. Artifacts were removed semi-automatically, first using the filter designed for cardiovascular time series [56] and then by careful visual examination of the BP waveforms and residual artifacts. The signals from rats with traces of unstable health were completely excluded. A very slowly varying signal component (mostly the result of rat relocation) was removed using the filter proposed in [47]. De-trended time series should be stationary at least in the wide sense, i.e., their first and second statistical moments should be time-invariant. Only then are the mean value and standard deviation estimated from the time series equal to their statistical counterparts [37], and only then can the standard scaling be reliably implemented. Thus, the de-trended time series were checked using a stationarity test [57,58], and those that were not wide-sense stationary were eliminated.
The final number of remaining animals per experimental group was n = 6. This number was satisfactory given the variability of the parameters in the control group rats (as calculated with the statistical software "Power Sample Size Calculation"). All experimental procedures in this study conformed to the European Communities Council Directive of 24 November 1986 (86/609/EEC) and the School of Medicine, University of Belgrade, Guidelines on Animal Experimentation.

Results and Discussion
This section presents the results of our study. Each set of results is accompanied by the corresponding discussion.

Threshold Choice
As already pointed out, the threshold value is crucial for the consistency of entropy estimates, and its proper choice should be the first task. However, entropy is also a function of the time series length N: shorter time series require a larger threshold, and the relationship is not linear. A thorough analysis in [22] showed that a reliable estimation of the probabilities (4) is a key factor for stable entropy measures. This requires that the threshold values be higher than the generally accepted ones. One of the methods is to plot a threshold profile, i.e., to estimate the entropy for different threshold values and fixed N. Figure 8 presents the XEn profile estimated from the cardiovascular signals, as cross-entropy requires higher threshold values than self-entropy [14,22]. Besides, real signals are a better choice for threshold profiling than stationary artificial data. The vertical lines in Figure 8 show the threshold for which the entropy estimates become consistent. This threshold value is equal to r = 0.3 and is adopted for further entropy estimation. The threshold evaluation method proposed in [22] gives higher threshold values, but in the present study, we did not want to differ too much from the classical values.
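A threshold profile of the kind described above can be sketched as a loop over candidate thresholds for a fixed series length. The example below uses cross-sample entropy on two independent synthetic Gaussian signals instead of the cardiovascular recordings; the function names, the threshold grid and N = 300 are illustrative assumptions.

```python
import math
import random
import statistics


def znorm(x):
    """Standard scaling: zero mean and unit standard deviation."""
    mu, sd = statistics.mean(x), statistics.pstdev(x)
    return [(v - mu) / sd for v in x]


def xsampen(x, y, m, r):
    """Cross-sample entropy; returns NaN when no matches are found."""
    x, y = znorm(x), znorm(y)
    count = len(x) - m
    totals = []
    for mm in (m, m + 1):
        totals.append(sum(
            all(abs(x[i + k] - y[j + k]) <= r for k in range(mm))
            for i in range(count) for j in range(count)))
    if totals[0] == 0 or totals[1] == 0:
        return float("nan")  # threshold too small for this series length
    return math.log(totals[0]) - math.log(totals[1])


random.seed(5)
x = [random.gauss(0.0, 1.0) for _ in range(300)]
y = [random.gauss(0.0, 1.0) for _ in range(300)]

thresholds = [0.1, 0.2, 0.3, 0.4, 0.5]
profile = [(r, xsampen(x, y, 2, r)) for r in thresholds]
for r, value in profile:
    print(r, value)
```

Small thresholds leave few matching vectors, so the estimates there rest on small counts and fluctuate; the profile helps to locate the threshold from which the estimates behave consistently.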

Entropy Estimated from Artificial Data
The purpose of the artificial data, generated in controlled conditions, is to provide a reference for the entropy estimates of probability integral transformed signals.
Self-entropy estimates (ApEn and SampEn) are presented in Figure 9. As expected, the entropy of pseudo-random signals depends on their distribution (Figure 9a), but the PI-transform eliminates this dependency (Figure 9b). The chosen parameters (N = 3000, r = 0.3) are sufficient to ensure reliable entropy estimates. No correction is needed, as the original and corrected ApEn estimates overlap perfectly.
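Once the PI-transform has made every distribution uniform, a reference entropy level can be sketched under an idealized assumption of independent samples. For independent uniform samples on [0, 1], two samples match with probability $2r_u - r_u^2$, where $r_u = r/\sqrt{12}$ is the threshold rescaled from standard deviation units, and the sample entropy then equals the negative logarithm of that probability. The code below compares this value with a Monte Carlo estimate. It is an illustration under these assumptions and not necessarily the expression derived in this paper.

```python
import math
import random


def uniform_reference_sampen(r):
    """SampEn of independent uniform samples for a threshold r given in std units."""
    r_u = r / math.sqrt(12.0)   # std of a uniform RV on [0, 1] is 1 / sqrt(12)
    p = 2.0 * r_u - r_u ** 2    # Pr{|u1 - u2| <= r_u} for independent uniforms
    return -math.log(p)


random.seed(4)
r = 0.3
r_u = r / math.sqrt(12.0)
trials = 200000
hits = sum(abs(random.random() - random.random()) <= r_u for _ in range(trials))

analytic = uniform_reference_sampen(r)
simulated = -math.log(hits / trials)
print(analytic, simulated)
```

For independent samples the conditional probability that vectors matching over m samples also match at the next sample equals the single-sample matching probability, which is why the result does not depend on m.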

The dependency of signal samples, induced by the MA filter, causes an entropy decrease. The decrease is indeed due to the sample correlation, as the signal distribution remains Gaussian and the decrease of entropy is seen both for the original signals and for the PI-transformed signals (Figure 9c). On the other hand, the non-linear transform $Y = a \cdot X^{EXP} + b$ also decreases the entropy, but this decrease is due to the distribution change: the PI-transform converts the distribution into uniform, and the entropy of all converted signals remains stable, regardless of the exponent EXP (Figure 9d).
Figure 10 shows cross-entropy estimates. The correlation of the signal pairs in Figure 10a has been verified using the Spearman, Pearson and Kendall tests (shown in Figure 6a). However, their XEn estimates are constant, revealing that the entropy does not reflect the statistical correlation between two pseudo-random time series.
It is in accordance with the entropy procedure, where a template vector is compared with each one of its followers, while the dependency exists only with the followers in its vicinity. When correlation is induced by the MA filter (Figure 10b), both XApEn and PIT-XApEn decrease with the increase of filter length. SampEn decreases as well, but at a lower rate, while PIT-SampEn remains stable. SampEn is known to be a stable measure, due to the logarithm taken from the average matching probabilities [7]. However, this stability reduces the ability to recognize the subtle changes, and equalizing the distribution further reduces the possibility of recognition, so it might be a disadvantage.
The non-linear relationship between the signals is induced by the relation $Y = a \cdot X^{\mathrm{EXP}} + b$. The corresponding cross-entropies are independent of the exponent EXP if the reference signal is the pseudo-random signal X (source signal). If the reference signal is the signal Y, obtained by a non-linear transform of signal X, the cross-entropy decreases for XApEn and XSampEn, but the PIT counterparts remain constant. The reason is the same as for Figure 9d: the decrease is due to the distribution of the reference signal, which changes as a consequence of the non-linear transform. Had the decrease been due to the induced non-linear coupling, the PIT entropy would have changed as well; since it has not changed, the non-linear coupling is not responsible for the entropy decrease.
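The two mechanisms discussed for the pseudo-random signals can be illustrated with a minimal sketch. It assumes a rank-based (empirical CDF) PI-transform; the helper names `pi_transform` and `lag1_corr`, the seed, the positive source signal and the constants a = 2, EXP = 3, b = 1 are illustrative choices, not values taken from the paper. A monotone sample-by-sample transform leaves the ranks, and hence the PI-transformed signal, unchanged, whereas MA filtering leaves a dependence between successive samples that the PI-transform does not remove.

```python
import numpy as np

def pi_transform(x):
    """Empirical PI-transform: map each sample to its normalized rank,
    which gives an approximately uniform signal on (0, 1)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")  # 0 .. N-1
    return (ranks + 0.5) / x.size

def lag1_corr(s):
    """Pearson correlation between successive samples of s."""
    return np.corrcoef(s[:-1], s[1:])[0, 1]

rng = np.random.default_rng(0)
N = 3000

# Sample-by-sample transform Y = a * X**EXP + b; monotone because X > 0 here
# (positive X and the constants are assumptions made for this illustration).
x = rng.uniform(0.1, 1.0, N)
y = 2.0 * x**3 + 1.0
same_after_pit = np.allclose(pi_transform(x), pi_transform(y))

# MA filter of length 5: induces dependence between successive samples,
# which survives the PI-transform.
g = rng.standard_normal(N)
g_ma = np.convolve(g, np.ones(5) / 5, mode="valid")
rho_white = lag1_corr(pi_transform(g))
rho_ma = lag1_corr(pi_transform(g_ma))

print(same_after_pit, round(rho_white, 3), round(rho_ma, 3))
```

Under these assumptions the PI-transformed X and Y coincide, so any entropy estimate computed from them must coincide as well, while the filtered signal keeps a clearly non-zero lag-1 correlation after the transform.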
The artificial time series have so far been pseudo-random signals with a given distribution. Deterministic chaos exhibits unpredictability, but not randomness. Figure 11 presents the entropy estimated from the logistic map signals. The entropy of the non-continual chaotic signal (Figure 11a) is very low, revealing a low level of uncertainty. The second signal (Figure 11b) is genuinely chaotic and depends on the initial conditions. For this reason, the estimated entropy values are not constant, but their changes are not significant (Figure 11b).
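The contrast between a regular and a chaotic logistic-map signal can be sketched as follows. This is an illustration under stated assumptions, not the paper's setup: the map parameters R = 3.5 (periodic regime) and R = 4.0 (chaotic regime), the initial value, the signal length N = 1000 and the helper names `sample_entropy` and `logistic_map` are all hypothetical choices, and the threshold is taken as r times the standard deviation of the signal.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.3):
    """SampEn with Chebyshev distance; tolerance is r times the signal std
    (an assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    tol = r * x.std()
    n_templates = x.size - m  # same number of templates for lengths m and m + 1

    def count_matches(length):
        vecs = np.array([x[i:i + length] for i in range(n_templates)])
        total = 0
        for i in range(n_templates - 1):
            # compare template i with all later vectors (no self-matches)
            dist = np.max(np.abs(vecs[i + 1:] - vecs[i]), axis=1)
            total += int(np.sum(dist <= tol))
        return total

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b)

def logistic_map(n, R, x0=0.4, discard=500):
    """Iterate x[k+1] = R * x[k] * (1 - x[k]) and drop the initial transient."""
    x = np.empty(n + discard)
    x[0] = x0
    for k in range(1, n + discard):
        x[k] = R * x[k - 1] * (1.0 - x[k - 1])
    return x[discard:]

N = 1000
se_periodic = sample_entropy(logistic_map(N, R=3.5))  # assumed periodic regime
se_chaotic = sample_entropy(logistic_map(N, R=4.0))   # assumed chaotic regime
se_random = sample_entropy(np.random.default_rng(0).uniform(size=N))

print(se_periodic, se_chaotic, se_random)
```

With these choices the periodic signal gives an estimate close to zero, and the chaotic signal gives a value well below that of a pseudo-random uniform signal of the same length, in line with the observation that deterministic chaos is unpredictable but not random.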
The absolute values of cross-entropy estimates are similar to the self-entropy in Figure 11b. This means that the less predictable signal (with higher entropy) is dominant in cross-entropy estimates. On the other hand, the consistency of repeated entropy estimation is governed by the reference signal: the consistency of X vs. Y cross-entropy is similar to the consistency of X self-entropy (Figure 11a,c), while the variability of Y vs. X cross-entropy is proportional to the variability of Y self-entropy (Figure 11b,d).

Entropy Estimated from Cardiovascular Signals of Laboratory Rats Exposed to Stress
The parameters recorded from the laboratory rats are presented in Table 1. It reveals a significant decrease in pulse interval (increase in heart rate) in rats exposed to the restraint stress; the other changes are slight and not significant. The results are presented as mean ± standard deviation; * denotes the difference between the baseline and stressed signals at the significance level of p < 0.05.
The entropy estimates are given in Figure 12. The horizontal line in Figure 12 shows the theoretical value of ApEn, SampEn, and XEn that can be evaluated for random signals with uniform distribution; a detailed evaluation of the theoretical value is shown in Appendix A. This value is a result of perfect randomness and uniformity and can serve as a reference, without the need to run tedious simulation studies, e.g., surrogate data tests [51].
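The idea of a closed-form reference value can be illustrated with a short numerical check. This is a sketch under explicit assumptions and not the expression derived in Appendix A: the samples are taken as independent and uniform on [0, 1], and the threshold r is treated as an absolute tolerance on that scale (the paper may scale r differently). Under these assumptions two samples match with probability 2r − r², the conditional probability of extending a match by one sample is the same, and the entropy of such a signal tends to −ln(2r − r²). The function names are hypothetical.

```python
import numpy as np

def match_probability_uniform(r):
    """P(|U1 - U2| <= r) for independent U1, U2 ~ Uniform(0, 1), 0 <= r <= 1."""
    return 2.0 * r - r**2

def reference_entropy_uniform(r):
    """Limit value -ln(2r - r^2) for independent uniform samples
    (assumes r is an absolute tolerance on the [0, 1] scale)."""
    return -np.log(match_probability_uniform(r))

rng = np.random.default_rng(1)
r = 0.3
u1 = rng.uniform(size=200_000)
u2 = rng.uniform(size=200_000)

# Monte Carlo estimate of the single-sample matching probability
empirical = np.mean(np.abs(u1 - u2) <= r)
reference = reference_entropy_uniform(r)

print(empirical, match_probability_uniform(r), reference)
```

The Monte Carlo estimate agrees with 2r − r² to within sampling error, which is the kind of agreement that makes a formula-based reference usable in place of a simulation study.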
Considering the experimental results, the entropy of PI-transformed signals captured slightly more statistically significant differences between the cardiovascular parameters of the animals before and after exposure to stress. The classical entropies found differences in shaker stress, in SBP and in PI vs. SBP, and XSampEn found a difference in SBP vs. PI; PIT entropy found additional differences in XApEn of SBP vs. PI and, in restraint stress, in PI vs. SBP and in XSampEn of SBP vs. PI.
Figure 12. (a) Self-entropy, rats submitted to the restraint stress; (b) self-entropy, rats submitted to the shaker stress; (c) cross-entropy, rats submitted to the restraint stress; (d) cross-entropy, rats submitted to the shaker stress. Results are presented as mean ± SE (standard error of mean); * denotes the difference between the control and stressed signals at the significance level of p < 0.05. The horizontal line shows the theoretical value for perfect random signals with uniform distribution.
Contrary to the artificial signals with the controlled outcome, the reliability of the entropy estimated from real data sources is always a subject of discussion. As already stated, a failure in estimating the matching probabilities, Equation (4), leads to an inconsistent entropy estimation. The reliability of probabilities can be checked using the Jeruchim criterion, which defines the minimal signal length required to achieve a reliable estimate; it confirmed the traditional engineering rule that the signal length required for a reliable estimation of a binary event probability should be at least $10/\hat{p}_i^{(m)}(r)$ [59]. The ultimate case of unreliability is the matching probability equal to zero, $\hat{p}_i^{(m)}(r) = 0$. This occurs if the signal length N is too short, or if the threshold r is inadequate, or if the vector length m is too long [22]. However, zero probability can occur in XEn for a completely logical reason: the template vector can comprise amplitudes that can never be found in the other signal, so no follower vectors exist. In this case, the zero probability is not a result of an incorrect estimation, but a valid relationship between the two signals. Figure 13 shows the percentage of reliably estimated matching probabilities, while Figure 14 shows the estimated percentage of zero probabilities.
From both figures, it can be seen that PIT signals perform better than the source signals. The increased number of reliably estimated probabilities in Figure 13 is an outcome of the uniform distribution. The signal amplitudes are equally probable, so the probability that a template finds a matching follower is increased. In distributions with pronounced tails (source signals), some of the templates are less likely to find a matching follower.
The decreased number of zero-matching probabilities (Figure 14) is another benefit of the probability integral transform. As already said, the distributions of signal pairs for cross-entropy can have non-overlapping segments, so some of the templates will never find followers. After the PI-transform, the signals are mapped into the same [0, 1] segment, and non-overlapping segments no longer exist.
Figures 13 and 14 also reveal that the empirically obtained threshold r = 0.3, although slightly exceeding the traditional values from the literature (0.15-0.25), might not be sufficient for XEn, as the values of XApEn and XApEn with correction differ. This is in accordance with the theoretical findings from [22], but we preferred to use values that are more aligned with the traditional ones.
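The effect of the PI-transform on reliable and zero matching probabilities can be sketched on synthetic data. The example rests on assumptions that are not from the paper: two independent, standardized pseudo-random signals (Gaussian and shifted exponential) stand in for a signal pair with partly non-overlapping amplitude ranges, the tolerance is 0.3 times the standard deviation, the "at least 10/p" rule is applied to the number of follower vectors, and all function names are hypothetical.

```python
import numpy as np

def pi_transform(x):
    """Rank-based PI-transform to an approximately uniform signal on (0, 1)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    return (ranks + 0.5) / x.size

def matching_probabilities(template_sig, follower_sig, m, tol):
    """For each length-m template vector, the fraction of follower vectors
    within Chebyshev distance tol."""
    a = np.lib.stride_tricks.sliding_window_view(np.asarray(template_sig, float), m)
    b = np.lib.stride_tricks.sliding_window_view(np.asarray(follower_sig, float), m)
    probs = np.empty(len(a))
    for i, vec in enumerate(a):
        probs[i] = np.mean(np.max(np.abs(b - vec), axis=1) <= tol)
    return probs

def reliability_summary(probs, n_followers):
    """Fraction of probabilities satisfying n_followers >= 10 / p,
    and fraction of exactly zero probabilities."""
    reliable = float(np.mean(probs >= 10.0 / n_followers))
    zero = float(np.mean(probs == 0.0))
    return reliable, zero

rng = np.random.default_rng(0)
N, m = 2000, 2
x = rng.standard_normal(N)           # zero mean, unit variance, two tails
y = rng.exponential(size=N) - 1.0    # zero mean, unit variance, no values below -1
n_followers = N - m + 1

# Source signals: tolerance 0.3 (both signals have unit standard deviation)
p_src = matching_probabilities(x, y, m, tol=0.3)
reliable_src, zero_src = reliability_summary(p_src, n_followers)

# PI-transformed signals: same relative tolerance, 0.3 times their std
ux, uy = pi_transform(x), pi_transform(y)
p_pit = matching_probabilities(ux, uy, m, tol=0.3 * ux.std())
reliable_pit, zero_pit = reliability_summary(p_pit, n_followers)

print(reliable_src, zero_src, reliable_pit, zero_pit)
```

In this setting, templates of x that contain values below the range of y can never find a follower, so the source pair shows a sizeable share of zero probabilities; after the transform both signals occupy the same (0, 1) range and that share drops.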

Conclusions
The aim of this paper was to apply the ApEn-based entropies and cross-entropies to signals subjected to the probability integral transformation. PIT yields a signal with uniform distribution, keeping the signal fluctuations intact. The idea was to eliminate the influence of the amplitude distribution, and to estimate the entropy where each amplitude has equal opportunity. Then the true unpredictability of the signal could be estimated without the bias induced by the amplitude distribution.
The artificial environment revealed that PIT self-entropy estimates are insensitive to a linear or non-linear signal transformation, if the transformation is applied sample by sample (relationship $Y = a \cdot X^{\mathrm{EXP}} + b$). However, entropy estimates are sensitive to transformations that induce dependency along the signal itself, e.g., using the MA filter. As for the cross-entropy, its estimates remain constant when the correlation coefficient between the signals X and Y increases from 0 to 1, leading to the conclusion that statistical correlation cannot be measured by means of cross-entropy. Cross-entropy, on the other hand, does detect whether one of the signals is formed from the other by inducing correlation between its successive samples.
The chaotic signals were generated using the formula for deterministic chaos. "Chaos" did not deceive the entropy procedure, so the entropy estimates were quite low, showing a high level of signal predictability. Regardless of the apparently chaotic appearance of the signal, the "deterministic" component could not escape the unbiased entropy measure.
Estimates from the real signals showed that PIT entropies of signals recorded in stress reveal slightly more statistically significant differences than the classical entropy measures. However, the main outcome is the increased estimation reliability, compared to the classical measures. The increased reliability is a consequence of the uniform amplitude distribution over the [0, 1] segment and the reduced number of zero-matching probabilities.
The entropy estimates of PI-transformed signals are unbiased regarding the amplitude distribution. Their reliability is improved, and a reference value, i.e., a ground truth to which entropy estimates can be compared, can be obtained by a formula and not by a simulation study.
Future work will be devoted to the evaluation of errors in entropy estimation for ApEn, SampEn, XEn, and their PIT pairs, and to developing methods for error attenuation. It will also continue the study of the role of the threshold in the inconsistency of entropy estimation.