Estimating Phase Amplitude Coupling between Neural Oscillations Based on Permutation and Entropy

Cross-frequency phase–amplitude coupling (PAC) plays an important role in neuronal oscillations network, reflecting the interaction between the phase of low-frequency oscillation (LFO) and amplitude of the high-frequency oscillations (HFO). Thus, we applied four methods based on permutation analysis to measure PAC, including multiscale permutation mutual information (MPMI), permutation conditional mutual information (PCMI), symbolic joint entropy (SJE), and weighted-permutation mutual information (WPMI). To verify the ability of these four algorithms, a performance test including the effects of coupling strength, signal-to-noise ratios (SNRs), and data length was evaluated by using simulation data. It was shown that the performance of SJE was similar to that of other approaches when measuring PAC strength, but the computational efficiency of SJE was the highest among all these four methods. Moreover, SJE can also accurately identify the PAC frequency range under the interference of spike noise. All in all, the results demonstrate that SJE is better for evaluating PAC between neural oscillations.

Particularly, more reports focused on PAC because of its ubiquity in electrophysiology significance in brain functions. The PAC is generally defined as that the amplitude of a higher-frequency oscillation is modulated by the phase of a lower-frequency oscillation [17]. Moreover, the PAC has been studied in many functions, such as memory, attention selection, and sensory information detection [18,19]. It is widely observed in the hippocampus; neocortex; and basal ganglia from rats, mice, monkeys to humans [20][21][22]. Several quantitative measures have been proposed to characterize PAC, such as phase-locking value (PLV) [23], mean vector length (MVL) [24], Kullback-Liebler (KL) based modulation index (KLmi) [9], and generalized linear modeling (GLM) [25]. Overall, these different methods were proposed based on different principles and focused on different purposes, which can be found in comparative studies [9,[26][27][28][29]. Recently, based on permutation entropy [30] and mutual information theory, a new method called permutation mutual information (PMI) was applied to measure PAC [31]. Although the advantage of the PMI approach has been described, it failed to account for the multiple time scales inherent in the physiological systems. To solve this problem, Aziz et al. improved the permutation entropy (PE) and called this modified procedure multiscale permutation entropy (MPE). By comparing the analysis results of MPE and PE on physiological signals, it is proved that MPE is more robust [32]. Therefore, in the present study, we proposed a novel approach named MPMI. It measures the PAC by evaluating the mutual information of multiscale permutations between two neural oscillations, which is developed from the MPE on the basis of mutual information theory.
It is well known that the classical conditional mutual information (CMI) is initially proposed to characterize the weak interaction between two phase signals with similar rhythms [33]. Cheng et al. improved it and successfully applied it in measuring PAC [34]. Afterward, Li et al. presented a new method called PCMI for assessing the directionality between two signals generated by different neuronal populations [35]. The simulations show that this method is superior to the CMI method for identifying the coupling direction. Therefore, we will use PCMI to estimate PAC. Joint entropy is a nonlinear measure used to estimate the uncertainty between one time series and another [36]. In order to weaken or eliminate the influence generated by the probability distribution of the original series, we proposed a symbolized time series to replace the original time series to calculate the probability distribution, called SJE; it can be used to measure PAC. Cui et al. proposed a novel approach to quantify the synchronization strength of EEG, which was named as normalized weighted-permutation mutual information (WPMI) [37]. It overcomes the shortcomings of the PMI algorithm. Specifically, in the extraction of the ordinal patterns, only the information of the order structure is reserved, and a large amount of amplitude information is lost. The simulation model shows that WPMI can reflect the amplitude characteristics of the time series while estimating the synchronization strength of the time series. Therefore, we will use the WPMI method to measure PAC.
Finally, we will evaluate the performance of these four algorithms (MPMI, PCMI, SJE, and WPMI) through simulation data and hope that these methods can help researchers understand the characteristics of PAC in neural oscillations and further explore the structure and function of the brain.

Multiscale Permutation Mutual Information
The MPMI is developed based on MPE [32] and mutual information theory. For a time series {x(t); t = 1, 2, · · · , N}, we first average a successively increasing number of data points in non-overlapping windows to construct a continuous coarse-grained time series [38]. The coarse-grained time series X s k is calculated according to the following formula: where s denotes the scale factor and 1 ≤ k ≤ N/s. The length of each coarse-grained time series is equal to the length of the original time series divided by the scale factor s. When s = 1, the coarse-grained time series is the original time series, and the calculated entropy value is the permutation entropy value. An example of a coarse-graining procedure is presented in Figure 1. value is the permutation entropy value. An example of a coarse-graining procedure is presented in Figure 1.
where m is the embedding dimension, and τ is the time delay. Then, stance, if the embedding dimension is set as 3, it will take 6 motifs for each time series, which is illustrated in Figure 2b. Figure 2a shows an example of some permutation patterns that may appear in the signal. Then, we can calculate the number of occurrences of each ordinal pattern of the occurrence of each symbol is obtained from the reconstructed sequence:  Secondly, when the scale factor is s, calculate the coarse-grained time series X s k embedding delay expression Z m,τ,s where m is the embedding dimension, and τ is the time delay. Then, Z m,τ,s 1,l is arranged in increasing order of magnitudes. When the two values of Z m,τ,s 1,l are equal, they are sorted according to the size of the subscript. Therefore, each of the sub-vectors in m dimensional space is mapped into an ordinal pattern π m,τ i , i = 1, 2, · · · m!. Each of the T = M − (m − 1)τ sub-vectors corresponds to one of m! possible arrangements. For instance, if the embedding dimension is set as 3, it will take 6 motifs for each time series, which is illustrated in Figure 2b. Figure 2a shows an example of some permutation patterns that may appear in the signal. Then, we can calculate the number of occurrences of each ordinal pattern π m,τ i which is indicated by C i . Consequently, the probability P(X = π m,τ i ) of the occurrence of each symbol is obtained from the reconstructed sequence: value is the permutation entropy value. An example of a coarse-graining procedure is presented in Figure 1.
where m is the embedding dimension, and τ is the time delay. Then,   Following, we define the probability distribution of {x(t)} as p x , and then, the MPE of {x(t)} is defined as Consider two-time series {x(t)} and {y(t)}; let MPE(X) and MPE(Y) denote their MPE. The joint probability p xy = p(π m,τ i , π m,τ j ) of each symbol occurrence in the signal be calculated [35]. Thus, the multiscale joint permutation entropy MPE(X, Y) can be denoted by Afterwards, on the basis of MPE and mutual information theory, we can define MPMI as follows: Finally, the normalized multiscale permutation mutual information (MPMI Nor ) can be defined as where the MPMI Nor ranges from 0 to 1. The greater the value of MPMI Nor , the stronger the relationship between X and Y.

Permutation Conditional Mutual Information
Presumably, two-time series X = (X 1 , X 2 , · · · , X n ) T and Y = (Y 1 , Y 2 , · · · , Y n ) T , based on the Shannon information theory and the permutation method introduced in Section 2.1, the PE and the joint permutation entropy of the time series can be calculated by the following formulas: Mathematically, the PCMI can be defined by where Y δ is an observable derived from the state of the process Y's δ steps delay, i.e., Y δ : y t+δ = y t . In general, the value of δ ranges from 3 to 15 [39]. At some later time points, the information that is transmitted from one process X to another process Y can be defined as where the N is the maximal later points. PCMI X→Y means that the coupling strength of X to Y.

Symbolic Joint Entropy
Given the time series as X = [x(1), x(2), . . . , x(N)], sequence reconstruction is to convert X into non-overlapping sub-vectors X k . The conversion of the new vector is shown as . . .
where m represents the embed dimension, k = 1, 2, · · · , M = [N/m]. After generating X k , we can symbolize X k in accordance with certain rules to reduce the requirements for the original sequence. The basic principle of the permutation is to replace the original value with anon-negative integer. For a given window, it can be denoted by a symbol according to the permutation rule in the Table 1. We take m = 3 as an example, as demonstrated in Table 1. Table 1. The permutation rule in symbolic joint entropy.
Permutation Symbol Therefore, we can get a symbolized time series {s x (t); t = 1, 2, · · · , M * m}. Similarly, symbolized time series of the time series {y(t); t = 1, 2, · · · , N} can be expressed as s y (t); t = 1, 2, · · · , M * m . After that, calculate the probability of occurrence of the corresponding time π l in the two symbolized time series: where #π l denotes its frequency of occurrence in the time series. π l can be expressed as Subsequently, the symbolic joint entropy is defined as For two series without correlation, after symbolizing, the occurrence probability of all potential symbol vectors is approximately equal, i.e., 1 m 2 . In this case, the symbolic entropy achieves the theoretical maximum. It can be defined as On the contrary, the minimum symbol joint entropy (log 2 m) is obtained when the series are completely linearly correlated. In order to minimize the influence of the parameter m, we further defined a normalized coupling coefficient (C SJ ) as The index of C SJ is positively correlated with the coupling strength between the two series. The value 1 indicated the perfect inter-series coupling whereas the value 0 suggested no inter-series correlation.

Weighted-Permutation Mutual Information
Given two time series X and Y which are embedded into m dimensional space, each of the sub-vectors is assigned to an ordinal pattern π i . Specifically, the weighted probability p ω for each ordinal pattern of X can be calculated as where Π is a collection of ordinal patterns π i , i = 1, 2, · · · , m!. I π i (X i ) = 1, if the sub-vector X i can be mapped to ordinal pattern π i , otherwise I π i (X i ) = 0. ω i is a weight which is determined by a specific feature of each sub-vector X i . Different features are selected according to the needs, but the relationship ∑ i p w (π i ) = 1 is always present. The weight ω i is calculated by where X i is mean of sub-vector X i obtained by the following formula: Therefore, the probability distribution p ωx of weighted ordinal pattern in time series X is obtained. p ωy can be calculated in the same way. For these two-time series, π i and π j are used to indicate the ordinal patterns of X k and Y k , respectively. Thus, m! * m! joint ordinal patterns are available. A new method for calculating the probability of each joint ordinal pattern is proposed as follows: where Π is a collection of ordinal patterns π ij .I π ij (X i ; Y j ) = 1, if the sub-vector X i and Y j can be mapped into ordinal pattern π ij , otherwise I π ij (X i ; Y j ) = 0. Weight ω ij = ω i ω j , ω i and ω j denote the weight of the sub-vector X i and Y j , respectively. Thus, the weighted probability distribution of joint ordinal patterns p wxy is obtained. According to the information theory, the WPE of the time series X and Y can be calculated as Then, WPMI is defined as Afterward, the WPMI can be normalized as where WPMI Nor varies from 0 to 1. If X and Y are two perfect synchronous time series, WPMI Nor equals 1; if there is completely no synchronization between X and Y, WPMI Nor is close to 0. The stronger the synchronization between X and Y, the higher the WPMI Nor .

Acquisition of Simulation Data
First, multiple sine waves with different frequencies, amplitudes, and phases are added together to generate the simulation data [40,41]. The low-frequency oscillation (LFO) and high-frequency oscillation (HFO) are denoted by low(t) and high(t), respectively. The low(t) is a 4-8 Hz theta oscillation and the high(t) is a 60-70 Hz gamma oscillation, and the sampling frequency is 1000 Hz. The amplitude of the components was inversely proportional to their frequencies. Both oscillations are normalized to a range of −1 to 1; the purpose is to eliminate the interference caused by amplitude fluctuation and maintain the fluctuation of its instantaneous frequency. The phases were randomly selected from [−π, π]. After that, the Von Mises coupling method is used to generate the amplitude series of HFO [42]: The parameter c controls the maximum amplitude of HFO. λ is a concentration parameter.
The λ with bigger value generates the larger amplitude of HFO around the preferred phase. Specifically, the λ with a zero value caused the equal amplitudes of the HFO at all phases.
In the following simulation analysis, parameters are c = 1, λ = 1. Consequently, the raw data with PAC can be generated as below: A construction example of the simulation data is shown in Figure 3.  Afterward, to regulate the PAC strength, the raw data are combined with an interferential signal (IFS) and normalize in the same way. Therefore, the original signal can be reconstructed as The parameter k represents the coupling strength, ranging from 0 to 1. Figure 4 shows the simulation results under different coupling strengths. Afterward, to regulate the PAC strength, the raw data are combined with an interferential signal (IFS) and normalize in the same way. Therefore, the original signal can be reconstructed as The parameter k represents the coupling strength, ranging from 0 to 1. Figure 4 shows the simulation results under different coupling strengths.
Afterward, to regulate the PAC strength, the raw data are combined with an interferential signal (IFS) and normalize in the same way. Therefore, the original signal can be reconstructed as The parameter k represents the coupling strength, ranging from 0 to 1. Figure 4 shows the simulation results under different coupling strengths.

Calculating Phase-Amplitude Coupling
The coupling strength between the amplitude

Calculating Phase-Amplitude Coupling
The coupling strength between the amplitude A high (t) of HFO and phase θ low (t) of LFO can be measured by MPMI, PCMI, SJE, and WPMI methods. However, the phase series θ low (t) of the low(t) is a periodic function with monotonically increase in each cycle. Thus, the ordinal patterns of θ low (t) would be very simple, and it is inappropriate for analyzing the coupling between θ low (t) and A high (t).Conversely, if there is a prominent PAC between them, the magnitude orders of cos(θ low (t)) will be similar to that of A high (t). Correspondingly, it would be beneficial for calculating PAC between cos(θ low (t)) and A high (t).
First of all, by using Hilbert transformation: where θ low (t) and A low (t) are the phase and amplitude series of the slow oscillation, respectively. Thus, cos(θ low (t)) can be generated by dividing low(t) by its instantaneous amplitude A low (t): low(t) A low (t) = Re(e iθ low (t) ) = cos(θ low (t)), Therefore, the amplitude of cos(θ low (t)) is only dependent on the phase time series of the slow oscillation, which is more conducive to the measurement of PAC. Finally, after generating the phase time series cos(θ low (t)), we further use the Hilbert transform to produce the amplitude A high (t) of the fast wave oscillation high(t). Then, the PAC of two time series can be obtained by cos(θ low (t)) and A high (t).

Parameter Choices in the Algorithm
In each algorithm, two key parameters embed dimension m and time lag τ are included. The order m represents the number of data points involved in the motif. Bandt et al. suggested that the value range of m is 3-7 [30], we set m = 3. The lag is referred to as the number of sample points spanned by each motif. Regarding the choice of τ, we refer to the mutual information method proposed by Li et al. [35]. In the simulation data, by comparing the mutual information derived from the different τ values, the lag τ corresponding to the maximal value of mutual information is an optimum parameter. As shown in Figure 5, when τ = 10, the mutual information reaches the maximum value between cos(θ low (t)) and A high (t) with coupling strength k = 0.3, k = 0.5, and k = 0.8. the number of sample points spanned by each motif. Regarding the choice of τ, we refer to the mutual information method proposed by Li et al. [35]. In the simulation data, by comparing the mutual information derived from the different τ values, the lag τ corresponding to the maximal value of mutual information is an optimum parameter. As shown in Figure 5, when τ = 10, the mutual information reaches the maximum value between cos low θ t ( ( )) and high At () with coupling strength k = 0.3, k = 0.5, and k = 0.8.

Parameter Choice in MPMI
When analyzing the MPE, the coarse-grained process is very important. The specific operation is to truncate the raw time series into several small sequences with the same length and average each segment to obtain a new time series. Therefore, the analysis of signal complexity needs to consider the choice of the scale factor s in the coarse-graining process. The adjacent elements of the original time series contain sequence information. If the value of the scale factor s is too small, the information of the relevant fragments cannot

Parameter Choice in MPMI
When analyzing the MPE, the coarse-grained process is very important. The specific operation is to truncate the raw time series into several small sequences with the same length and average each segment to obtain a new time series. Therefore, the analysis of signal complexity needs to consider the choice of the scale factor s in the coarse-graining process. The adjacent elements of the original time series contain sequence information. If the value of the scale factor s is too small, the information of the relevant fragments cannot be extracted to the greatest extent, so the analysis effect is not obvious. However, when the complexity difference between the signals is small, the scale factor s should not be selected too large, otherwise, the difference between the signals may be eliminated in the averaging process. In this study, the MPMI value varies with coupling strength under different scale factors s is set. As shown in Figure 6, we can observe that MPMI is proportional to k. When s increases, the MPMI curve moves upward as a whole, but the overall change trend remains unchanged. When k = 0, to avoid a higher MPMI value, s = 3 is selected. be extracted to the greatest extent, so the analysis effect is not obvious. However, when the complexity difference between the signals is small, the scale factor s should not be selected too large, otherwise, the difference between the signals may be eliminated in the averaging process. In this study, the MPMI value varies with coupling strength under different scale factors s is set. As shown in Figure 6, we can observe that MPMI is proportional to k. When s increases, the MPMI curve moves upward as a whole, but the overall change trend remains unchanged. When k = 0, to avoid a higher MPMI value, s = 3 is selected.

Parameter Choice in PCMI
When applying the PCMI method, it is necessary to consider the selection of parameter δ because of its influence on the PCMI estimation. The algorithm itself determines that the value of δ cannot be less than the order m in PCMI [33]. In the simulation data, we plotted the variation curve of PCMI with the coupling strength k when δ takes from 3 to 10. As shown in Figure 7, we can see that with the increase of δ, the changing trend of the PCMI curve remains constant, and increases with k. However, when δ = 10, k = 0.7 to 1, the PCMI value dropped, causing inaccurate measurement results. That is, when δ takes from 3 to 9, the PCMI estimation is stable. Therefore, in this paper, we suggest that the choice of δ = 5 is an appropriate option.

Parameter Choice in PCMI
When applying the PCMI method, it is necessary to consider the selection of parameter δ because of its influence on the PCMI estimation. The algorithm itself determines that the value of δ cannot be less than the order m in PCMI [33]. In the simulation data, we plotted the variation curve of PCMI with the coupling strength k when δ takes from 3 to 10. As shown in Figure 7, we can see that with the increase of δ, the changing trend of the PCMI curve remains constant, and increases with k. However, when δ = 10, k = 0.7 to 1, the PCMI value dropped, causing inaccurate measurement results. That is, when δ takes from 3 to 9, the PCMI estimation is stable. Therefore, in this paper, we suggest that the choice of δ = 5 is an appropriate option. plotted the variation curve of PCMI with the coupling strength k when δ takes from 3 to 10. As shown in Figure 7, we can see that with the increase of δ, the changing trend of the PCMI curve remains constant, and increases with k. However, when δ = 10, k = 0.7 to 1, the PCMI value dropped, causing inaccurate measurement results. That is, when δ takes from 3 to 9, the PCMI estimation is stable. Therefore, in this paper, we suggest that the choice of δ = 5 is an appropriate option.

Dependence on Data Length
In order to analyze the influence of data length on four kinds of algorithms, simulated time series with data length from 5 s to 30 s (5000 to 30,000 sample points) are generated by using the simulation data. We conducted 100 experiments on each epoch set of data length when k = 0.3, 0.5, 0.8. It can be seen from Figure 8a,c,d that MPMI, SJE, and WPMI

Dependence on Data Length
In order to analyze the influence of data length on four kinds of algorithms, simulated time series with data length from 5 s to 30 s (5000 to 30,000 sample points) are generated by using the simulation data. We conducted 100 experiments on each epoch set of data length when k = 0.3, 0.5, 0.8. It can be seen from Figure 8a,c,d that MPMI, SJE, and WPMI all present three small fluctuation horizontal curves, indicating that these three algorithms are not sensitive to changes in the data length. In addition, it can also be explained that when the PAC intensity is constant within a period, the data length has almost no effect on the performance of these three algorithms. When k = 0.8, compared with MPMI and SJE, the WPMI curve is more stable, and the performance is the best. As demonstrated in Figure 8b, when k = 0.3, 0.5, 0.8, the three curves show small fluctuations, which can reflect the PAC strength effectively. In short, these four algorithms can reflect the PAC value under different k within 5 s to 30 s. all present three small fluctuation horizontal curves, indicating that these three algorithms are not sensitive to changes in the data length. In addition, it can also be explained that when the PAC intensity is constant within a period, the data length has almost no effect on the performance of these three algorithms. When k = 0.8, compared with MPMI and SJE, the WPMI curve is more stable, and the performance is the best. As demonstrated in Figure 8b, when k = 0.3, 0.5, 0.8, the three curves show small fluctuations, which can reflect the PAC strength effectively. In short, these four algorithms can reflect the PAC value under different k within 5 s to 30 s.

The Effects of Coupling Coefficient on Methods
To analyze MPMI, PCMI, SJE, and WPMI characteristics with the changing of coupling strength, coupling coefficient k gradually increased from 0 to 1 with the step of 0.05. Figure 9 plots the relationship between MPMI/PCMI/SJE/WPMI and coupling strength k, where the PAC estimators all gradually increase as the k increases. However, when 0 0.2 k  or 0.8 1 k , the PAC estimators increased comparatively slowly. It shows that when the coupling is weak or strong, it is difficult to distinguish any difference of PAC intensities.

The Effects of Coupling Coefficient on Methods
To analyze MPMI, PCMI, SJE, and WPMI characteristics with the changing of coupling strength, coupling coefficient k gradually increased from 0 to 1 with the step of 0.05. Figure 9 plots the relationship between MPMI/PCMI/SJE/WPMI and coupling strength k, where the PAC estimators all gradually increase as the k increases. However, when 0 ≤ k ≤ 0.2 or 0.8 ≤ k ≤ 1, the PAC estimators increased comparatively slowly. It shows that when the coupling is weak or strong, it is difficult to distinguish any difference of PAC intensities.
To analyze MPMI, PCMI, SJE, and WPMI characteristics with the changing of coupling strength, coupling coefficient k gradually increased from 0 to 1 with the step of 0.05. Figure 9 plots the relationship between MPMI/PCMI/SJE/WPMI and coupling strength k, where the PAC estimators all gradually increase as the k increases. However, when 0 0.2 k  or 0.8 1 k , the PAC estimators increased comparatively slowly. It shows that when the coupling is weak or strong, it is difficult to distinguish any difference of PAC intensities.

The Effects of Noise on Methods
To analyze the effects of noise on four methods, white Gaussian noise with an SNR varying from −5 dB to 20 dB in steps of 1 dB is added to the coupled signal. The data length is 10,000. In the case of k = 0.3, 0.5, 0.8, respectively, Figure 10 plots the variation of PAC with white Gaussian noise and coupling strengths of k = 0.3, 0.5, 0.8. In order to compare the four methods in the same coordinate system, the normalization curve is realized by dividing the coupling value with noise by the coupling value without noise.

The Effects of Noise on Methods
To analyze the effects of noise on four methods, white Gaussian noise with an SNR varying from −5 dB to 20 dB in steps of 1 dB is added to the coupled signal. The data length is 10,000. In the case of k = 0.3, 0.5, 0.8, respectively, Figure 10 plots the variation of PAC with white Gaussian noise and coupling strengths of k = 0.3, 0.5, 0.8. In order to compare the four methods in the same coordinate system, the normalization curve is realized by dividing the coupling value with noise by the coupling value without noise. It could be seen that in Figure 10a, when k = 0.3, SNR from −5 dB to 6 dB, compared with the other three methods, SJE method is closer to the coupling value when there is no noise, and the ability to suppress noise better. When the value of SNR ranges from 10 dB to 20 dB, SJE, WPMI and MPMI are all close to the noise-free coupling value, while the PCMI value increases with the increase of the SNR, until SNR = 20 dB, it still does not reach the coupling value under noise-free conditions value. Figure 10b shows that when k = 0.5 and SNR range from is −5 dB to 5 dB, the anti-noise performance of SJE is slightly better than the other three methods. In Figure 10c, when k = 0.8, SNR ranges from −5 dB to 10 dB, WPMI against noise is the best, exceeding SJE. This is because under the same data length, the higher the coupling strength, the better the performance of the WPMI. Moreover, when SNR is from 7 dB to 20 dB, SJE and MPMI have similar performance.

Detection of the Frequency Pairs with PAC
In the process of collecting EEG data, there will inevitably be some noise interference. Therefore, in this paper, white Gaussian noise is used to simulate measurement noise [43]. In addition, when analyzing EEG signals collected by humans or animals under different behaviors and cognitive situations, we do not know the magnitude of the PAC strength value and the coupling frequency range of each stage in advance. Therefore, it is necessary to measure the PAC to obtain the best possible frequency estimation of the phase and amplitude coupling components. The PAC value can also be used to evaluate the degree It could be seen that in Figure 10a, when k = 0.3, SNR from −5 dB to 6 dB, compared with the other three methods, SJE method is closer to the coupling value when there is no noise, and the ability to suppress noise better. When the value of SNR ranges from 10 dB to 20 dB, SJE, WPMI and MPMI are all close to the noise-free coupling value, while the PCMI value increases with the increase of the SNR, until SNR = 20 dB, it still does not reach the coupling value under noise-free conditions value. Figure 10b shows that when k = 0.5 and SNR range from is −5 dB to 5 dB, the anti-noise performance of SJE is slightly better than the other three methods. In Figure 10c, when k = 0.8, SNR ranges from −5 dB to 10 dB, WPMI against noise is the best, exceeding SJE. This is because under the same data length, the higher the coupling strength, the better the performance of the WPMI. Moreover, when SNR is from 7 dB to 20 dB, SJE and MPMI have similar performance.

Detection of the Frequency Pairs with PAC
In the process of collecting EEG data, there will inevitably be some noise interference. Therefore, in this paper, white Gaussian noise is used to simulate measurement noise [43].
In addition, when analyzing EEG signals collected by humans or animals under different behaviors and cognitive situations, we do not know the magnitude of the PAC strength value and the coupling frequency range of each stage in advance. Therefore, it is necessary to measure the PAC to obtain the best possible frequency estimation of the phase and amplitude coupling components. The PAC value can also be used to evaluate the degree of PAC affected under different behaviors or physiological and pathological mechanisms.
At present, several methods of exploring PAC have been adopted. Among them, the most popular analysis is the comodulogram, in which a "coupling palette" is computed and used to indicate the strength of the modulation of amplitude by phase within wide ranges of phase-giving and amplitude-giving frequencies, [21,44,45]. Its advantage is that the experimenter can directly observe the frequency range showing the phase and amplitude of strong coupling. However, before the analysis, the possible coupling frequency range must be chosen. We used a comodulogram to test whether these four algorithms can detect PAC frequency pairs in a given frequency range. In order to verify the accuracy of the algorithm, the frequency range of the simulation data was expanded to 1-100 Hz, and the actual PAC coupling frequency range was 4-8 Hz and 60-70 Hz. Usually, the abscissa represents the frequencies analyzed as phase-modulating f P , whereas amplitude-modulated f A is represented in the ordinate axis; that is, jet colors in a given coordinated (x, y) of the bi-dimensional map indicate that the phase of the x frequency modulates the amplitude of the y frequency.
In Figure 11, we show an example of applying a comodulogram to simulate data. The comodulograms were constructed by using f P calculated in 0.5 Hz steps with 4 Hz bandwidths and f A calculated in 2 Hz steps with 10 Hz bandwidths. Next, in order to simulate the measurement noise in the EEG signal more realistically, white Gaussian noise with SNR = 5 dB was added to the original data, and the PAC comodulogram of the four methods at k = 0.8 was obtained as shown below. The comodulograms of MPMI and WPMI are relatively clustered and can correctly show the phase-amplitude frequency range. Although SJE is more concentrated than the MPMI and WPMI, white Gaussian noise has an impact on it. The comodulogram obtained by PCMI is relatively scattered, the frequency range of the HFO becomes wider, and white Gaussian noise has a greater impact on it. Furthermore, the running time of the four algorithms was also compared. The SJE algorithm has the fastest running time, WPMI takes the longest time, and MPMI is slightly faster than PCMI.
Entropy 2021, 23, x FOR PEER REVIEW 13 of 17 abscissa represents the frequencies analyzed as phase-modulating P f , whereas amplitude-modulated A f is represented in the ordinate axis; that is, jet colors in a given coordinated (x, y) of the bi-dimensional map indicate that the phase of the x frequency modulates the amplitude of the y frequency.
In Figure 11, we show an example of applying a comodulogram to simulate data. The comodulograms were constructed by using P f calculated in 0.5 Hz steps with 4 Hz bandwidths and A f calculated in 2 Hz steps with 10 Hz bandwidths. Next, in order to simulate the measurement noise in the EEG signal more realistically, white Gaussian noise with SNR = 5 dB was added to the original data, and the PAC comodulogram of the four methods at k = 0.8 was obtained as shown below. The comodulograms of MPMI and WPMI are relatively clustered and can correctly show the phase-amplitude frequency range. Although SJE is more concentrated than the MPMI and WPMI, white Gaussian noise has an impact on it. The comodulogram obtained by PCMI is relatively scattered, the frequency range of the HFO becomes wider, and white Gaussian noise has a greater impact on it. Furthermore, the running time of the four algorithms was also compared. The SJE algorithm has the fastest running time, WPMI takes the longest time, and MPMI is slightly faster than PCMI.

Sensibility to Spurious Coupling
The instantaneous sharp edges in the time domain correspond to the broadband harmonic components in the frequency domain. They affect the phase of low-frequency components and the amplitude of high-frequency components. Therefore, even if there is no

Sensibility to Spurious Coupling
The instantaneous sharp edges in the time domain correspond to the broadband harmonic components in the frequency domain. They affect the phase of low-frequency components and the amplitude of high-frequency components. Therefore, even if there is no real physiological interaction, spurious CFC may be generated [46][47][48]. In the previous simulation analysis, we only considered the interference caused by measurement noise. In fact, in real EEG recordings, noise components may be more complex and intense, such as sharp waveforms and large artifacts. Using simulated data, Kramer and colleagues believe that in some cases, sharp edges in the data may result in coupling in a wide range of frequencies-for-amplitude [49]. However, this type of coupling has not been confirmed in real the scalp or intracranial EEG data until recently, and it is unclear under what circumstances, if at all, such a situation may arise [47,50]. In this study, the Hilbert transform is used to extract the analytic signal with the instantaneous amplitude and phase components. Therefore, it inevitably leads to non-meaningful amplitude estimates for PAC computation.
Currently, existing papers have studied the detection of spurious coupling phenomena in signals without interaction [49]. Thus, we studied the impact of spike noise on the detection of PAC frequency bands by these four algorithms in the case of strong coupling. In the simulation data, set k = 0.8, the data length is 10 s, and a train of a spike (3 standard deviations height, 20 ms width at half-maximum, 100 ms inter-peak interval) are superimposed on the background signal. In the presence of spurious coupling, the phase-amplitude comodulograms of the four methods are obtained, as shown in Figure 12. components. Therefore, it inevitably leads to non-meaningful amplitude estimates for PAC computation. Currently, existing papers have studied the detection of spurious coupling phenomena in signals without interaction [49]. Thus, we studied the impact of spike noise on the detection of PAC frequency bands by these four algorithms in the case of strong coupling. In the simulation data, set k = 0.8, the data length is 10 s, and a train of a spike (3 standard deviations height, 20 ms width at half-maximum, 100 ms inter-peak interval) are superimposed on the background signal. In the presence of spurious coupling, the phase-amplitude comodulograms of the four methods are obtained, as shown in Figure 12. We can observe that compared with other methods, the SJE has a higher degree of aggregation and can better reflect the true coupling frequency range, and spurious coupling has less influence on it. In the MPMI and WPMI methods, the range of high-frequency rhythms in their comodulograms becomes wider due to the influence of spurious coupling. PCMI loses the ability to detect the coupling frequency range in the presence of spike noise.

Conclusions
In this study, the methods based on permutation analysis were applied to measure the PAC strength. The comparison was carefully performed between the performances of SJE and that of MPMI, PCMI, and WPMI. The results demonstrated that compared with other methods, the SJE algorithm exhibited superior performance, including the lower We can observe that compared with other methods, the SJE has a higher degree of aggregation and can better reflect the true coupling frequency range, and spurious coupling has less influence on it. In the MPMI and WPMI methods, the range of high-frequency rhythms in their comodulograms becomes wider due to the influence of spurious coupling. PCMI loses the ability to detect the coupling frequency range in the presence of spike noise.

Conclusions
In this study, the methods based on permutation analysis were applied to measure the PAC strength. The comparison was carefully performed between the performances of SJE and that of MPMI, PCMI, and WPMI. The results demonstrated that compared with other methods, the SJE algorithm exhibited superior performance, including the lower complexity and the highest computational efficiency. Furthermore, the results also show that SJE can better resist additive white Gaussian noise except for high coupling strengths where WPMI is more effective. In addition, we also found that the SJE has the ability to detect the pairs of coupling frequency in a wide frequency range when the signal is mixed with noise of low signal-to-noise ratio. moreover, it is capable to depict the comodulogram of a signal containing spike noise. In conclusion, it suggests that SJE is possibly a better choice to evaluate the PAC under certain conditions.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.