Multivariate Multiscale Dispersion Entropy of Biomedical Times Series

Due to the non-linearity of numerous physiological recordings, non-linear analysis of multi-channel signals has been extensively used in biomedical engineering and neuroscience. Multivariate multiscale sample entropy (MSE–mvMSE) is a popular non-linear metric to quantify the irregularity of multi-channel time series. However, mvMSE has two main drawbacks: (1) the entropy values obtained by the original algorithm of mvMSE are either undefined or unreliable for short signals (300 sample points); and (2) the computation of mvMSE for signals with a large number of channels requires the storage of a huge number of elements. To deal with these problems and improve the stability of mvMSE, we introduce multivariate multiscale dispersion entropy (MDE–mvMDE), as an extension of our recently developed MDE, to quantify the complexity of multivariate time series. We assess mvMDE, in comparison with the state-of-the-art and most widespread multivariate approaches, namely, mvMSE and multivariate multiscale fuzzy entropy (mvMFE), on multi-channel noise signals, bivariate autoregressive processes, and three biomedical datasets. The results show that mvMDE takes into account dependencies in patterns across both the time and spatial domains. The mvMDE, mvMSE, and mvMFE methods are consistent in that they lead to similar conclusions about the underlying physiological conditions. However, the proposed mvMDE discriminates various physiological states of the biomedical recordings better than mvMSE and mvMFE. In addition, for both the short and long time series, the mvMDE-based results are noticeably more stable than the mvMSE- and mvMFE-based ones. For short multivariate time series, mvMDE, unlike mvMSE, does not result in undefined values. Furthermore, mvMDE is faster than mvMFE and mvMSE and also needs to store a considerably smaller number of elements. Due to its ability to detect different kinds of dynamics of multivariate signals, mvMDE has great potential to analyse various signals.


I. INTRODUCTION
M ULTIVARIATE techniques are needed to analyse data consisting of more than one time series [1].The majority of physiological and pathophysiological activities include interactions between different kinds of single processes.Thus, we expect that parameters or measures with different origins are considered in a multivariate way [1], [2].Furthermore, recent developments in sensor technology enabling routine recordings of multi-channel signals have led to an increasing popularity of this kind of analysis on biomedical data [1], [3].
Advances on information theory and non-linear dynamical approaches have recently allowed the study of different kinds of multivariate time series [4].Due to the intrinsic nonlinearity of diverse physiological processes, non-linear analysis of multivariate time series has been broadly used in biomedical engineering with the aim of studying the relationship between simultaneously recorded signals [4].
Multivariate multiscale entropy (mvMSE) as a powerful non-linear measure is based on a combination of multivariate sample entropy (SampEn -mvSE) and the coarse-graining process [5].mvMSE, by taking into account both the spatial and time domains, shows the complexity of multi-channel signals [5].Complexity reflects the degree of structural richness of time series [5], [6] and is different with that of irregularity or uncertainty defined from classical entropy methods such as SampEn [7], permutation entropy (PerEn) [8], and dispersion entropy (DisEn) [9].That is to say, neither completely regular or certain nor completely irregular (uncorrelated random) time series are truly complex, since none of them is structurally rich at a global level [5], [6], [10], [11].
The multivariate multiscale entropy-based analysis is interpreted based on: 1) the multivariate time series X is more complex than the multivariate time series Y, if for the most temporal scales, the mvSE measures for X are larger than those for Y; 2) a monotonic fall in the multivariate entropy values along the temporal scale factors shows that the signal only includes useful information at the smallest scale factors; and 3) a multivariate signal illustrating long-range correlations and complex creating dynamics is characterized by either a constant mvSE or this demonstrates a monotonic rise in mvSE with the temporal scale factor [5].
Although the mvMSE is a powerful and widely-used method, when applied to short signals, the results may be undefined or unreliable [12].To alleviate this shortcoming, multivariate multiscale fuzzy entropy (mvMFE) based on multivariate fuzzy entropy (mvFE) and the coarse-graining process was suggested [13].To decrease the running time of the mvMFE proposed in [13], we have recently proposed an mvMFE with a new fuzzy membership function [12].Nevertheless, the mvMFE is still slow for real-time applications and may lead to unreliable results for short signals, as shown later.
To overcome the problem of unreliable values for mvMFE and mvMSE, multivariate multiscale PerEn (mvMPE) was proposed [14].To have more information regarding the amplitude of multi-channel signals, multivariate weighted multiscale PerEn (mvWMPE) has recently been developed [15].However, both the mvMPE and mvWMPE do not take into account the cross-statistical properties between multiple input channels and do not follow the concept of complexity for some signals such as white Gaussian noise (WGN) and 1/f noise [5], [10], [12].mvMSE and mvMFE have growing appeal and broad use.They have been successfully used in a number of biomedical signal processing applications, such as, to characterise electroencephalogram (EEG) signals in Alzheimer's disease (AD) [16], [17], to analyze the multivariate cardiovascular time series [18], to characterize focal and non-focal EEG time series [12], to analyze the complexity of interbeat interval and interbreath signals [5], and to analyze the postural fluctuations in fallers and non-fallers older adults [19].
However, mvMSE and mvMFE have the following shortcomings: 1) mvMSE and mvMFE values may be unreliable and unstable for short signals; 2) they are not quick enough for real-time applications; and 3) computation of mvMSE and mvMFE of a signal with a large number of channels needs to have large memory space, as shown later.To address these drawbacks, we propose four algorithms to extend our recently developed MDE to its multivariate versions, termed multivariate MDE (mvMDE).To evaluate the mvMDE methods, we use both synthetic and real multivariate datasets.Our results indicate that mvMDE is noticeably faster than the existing methods, leads to more stable results, better discriminates different kinds of biomedical time series, does not lead to undefined values for short multivariate time series, and needs to store a considerably smaller number of elements in comparison with mvMSE and mvMFE.

II. MULTIVARIATE MULTISCALE DISPERSION ENTROPY (MVMDE)
In this study, we propose and explore three different alternative implementations of mvMDE until we arrive at a fourth and preferred one.All the mvMDE implementations include two main steps: 1) coarse-graining process for multivariate time series; and 2) multivariate dispersion entropy (mvDE), as an extension of our recently developed DisEn [9].It is worth noting that for all the mvMDE algorithms, the mapping based on the normal cumulative distribution function (NCDF) used in the calculation of mvDE for the first temporal scale factor is maintained fixed across all scales.In fact, in the mvMDE, µ and σ of the NCDF are respectively set at the average and standard deviation (SD) of the original time series and they remain constant for all temporal scale factors.This fact is similar to r in the mvMSE and mvMFE, setting at a certain percentage (usually 15%) of the SD of the original signal and remaining constant for all scales [5], [12].

A. Coarse-Graining Process for Multivariate Signals
Assume we have a p-channel time series U = {u k,b } b=1,2,... ,L k=1,2,... ,p of length L. In the mvMDE algorithms, for each channel, the original signal is first divided into nonoverlapping segments of length τ , named scale factor.Next, for each channel, the average of each segment is calculated to derive the coarse-grained signals as follows [5], [12]: (1) where N denotes the length of the coarse-grained signal.The second step of mvMDE is calculating the mvDE of each coarse-grained signal.

B. Background Information for the mvDE
We build four diverse alternative implementations of mvDE (mvDE-I to III and mvDE) until we arrive at a preferred (or optimal) one, i.e., mvDE.However, we here present all the simpler alternatives (mvDE-I to III), since they can still be useful in some settings and allow for clearer comparisons with other current approaches.

1) mvDE-I:
The mvDE-I of the multi-channel coarsegrained time series X = {x k,i } i=1,2,... ,N k=1,2,... ,p , which is based on the mvMPE algorithm [14], is calculated as follows: a) First, X = {x k,i } i=1,2,... ,N k=1,2,... ,p are mapped to c classes with integer indices from 1 to c. Since the amplitude values of each of series x k (k = 1, 2, . . ., p) may be dominated by the components of vectors coming from the time series with the largest amplitudes, we scale all of the data channels to the same amplitude range.To this end and to overcome the problem of assigning the majority of x k,i to only few classes when maximum or minimum values are noticeable larger or smaller than the mean/median value of the signal, the NCDF of each of x k is first calculated.In fact, the NCDF maps X into Y = {y k,i } i=1,2,... ,N k=1,2,... ,p from 0 to 1 as follows: where σ k and µ k are the SD and mean of time series x k , respectively.Then, we use a linear algorithm to assign each y k,i to an integer from 1 to c.To do so, for each member of the mapped signal, we use z c k,i = round(c • y k,i + 0.5), where z c k,i denotes the i th member of the classified signal in the k th channel and rounding involves either increasing or decreasing a number to the next digit.Note that, although this part is linear, the whole mapping approach is non-linear because of the use of NCDF.
b) Time series z m,c k,j are made with embedding dimension m and time delay d according to z m,c k,j = {z c k,j , z c k,j+d , [8].Each time series z m,c k,j is mapped to a dispersion pattern π v0v1...vm−1 , where The number of possible dispersion patterns that can be assigned to each time series z m,c k,j is equal to c m , since the signal has m members and each member can be one of the integers from 1 to c [9].where # means cardinality.In fact, p(π v0...vm−1 ) shows the number of dispersion patterns of π v0...vm−1 that is assigned to z m,c k,j , divided by the total number of embedded signals with embedding dimension m multiplied by the number of channels.
d) Finally, based on the Shannon's definition of entropy, the mvDE-I is calculated as follows: In case all possible dispersion patterns have equal probability value, the highest value of mvDE-I is obtained, which has a value of ln(c m ).In contrast, if there is only one p(π v0...vm−1 ) different from zero, which demonstrates a completely regular/certain signal, the smallest value of mvDE-I is obtained.In the algorithm of mvDE-I, we compare N p dispersion patterns of a p-channel signal with c m potential patterns.Thus, at least c m + N p elements are stored.
To work with reliable statistics to calculate MDE, it was recommended c m < L τmax [20].For mvMDE-I, since the mvDE-I counts the dispersion patterns for every channel of a multivariate time series, it is suggested c m < pL τmax .mvMDE-I works appropriately when the components of a multivariate signal are statistically independent.However, the mvMDE-I algorithm, like mvMPE [14], does not consider the spatial domain of time series.To overcome this problem, we propose mvMDE-II based on the Taken's theorem [12], [21].
2) mvDE-II: The algorithm of mvDE-II is as follows: a) First, like mvDE-I, X = {x k,i } i=1,2,... ,N k=1,2,... ,p are mapped to Z = {z k,i } i=1,2,... ,N k=1,2,... ,p based on the NCDF.b) To take into account both the spatial and time domains, multi-channel embedded vectors are generated according to the multivariate embedding theory [21].The multivariate embedded reconstruction of Z is defined as: The number of possible dispersion patterns that can be assigned to each time series Z m (j) is equal to c mp , since the signal has mp members and each member can be one of the integers from 1 to c. d) For each of c mp potential dispersion patterns π v0...vmp−1 , relative frequency is obtained based on the DisEn algorithm [9] as follows: e) Finally, based on the Shannon's definition of entropy, the mvDE-II is calculated as follows: In the algorithm of mvDE-II, at least c mp +N p elements are stored.Thus, when p is large, the algorithm needs huge space of memory to store elements.To work with reliable statistics to calculate mvMDE-II, it is recommended c mp < L τmax .Thus, although mvDE-II deals with both the spatial and time domains, the length of a signal and its number of channels should be very large and small, respectively, to reliably calculate mvDE-II values.To alleviate the problem, we propose mvDE-III.
3) mvDE-III: The algorithm of mvDE-III is as follows: a) First, like the mvDE-I and mvDE-II approaches, X = {x k,i } i=1,2,... ,N k=1,2,... ,p are mapped to Z = {z k,i } i=1,2,... ,N k=1,2,... ,p .b) Multivariate embedded vectors Z k,m (j) are generated according to the Taken's embedding theorem [21] with p embedding dimension vectors m k = [1, 1, . . ., m k , . . ., 1, 1] (k = 1, . . ., p) with length m + p − 1, where m k denotes the k th element of m.For simplicity, we assume m k = m and The number of possible dispersion patterns that can be assigned to each time series Z k,m (j) is equal to c m+p−1 , since the signal has m + p − 1 members and each member can be one of the integers from 1 to c [9].Since we count the number of patterns for each of p different m k , we have a p.c mp−m−p+1 times increase in the number of dispersion patterns in comparison with mvDE-II, leading to more reliable results for a signal with a small number of sample points, as shown later.
d) For each channel 1 ≤ k ≤ p and for each of c m+p−1 potential dispersion patterns π v0...vm+p−2 , relative frequency is obtained as follows: e) Finally, based on the Shannon's definition of entropy, the mvDE-II is calculated as follows: In the algorithm of mvDE-III, at least c m+p−1 + N p elements are stored.Although this number is noticeably smaller than that for mvDE-II, the algorithm still needs to have large memory space for a signal with a large number of channels.To work with reliable statistics to calculate mvMDE-III, it is recommended c m+p−1 < pL τmax .Therefore, albeit mvMDE-III takes into account both the spatial and time domains and needs to smaller number of sample points in comparison with mvMDE-II, there is a need to have a large enough number of samples and small number of channels.To alleviate these deficiencies, we propose mvDE.

C. Multivariate Dispersion Entropy (mvDE)
The mvDE algorithm is as follows: a) First, like mvDE-I to III, the multivariate signal X = {x k,i } i=1,2,... ,N k=1,2,... ,p is mapped to c classes with integer indices from 1 to c. b) Like mvDE-II, to consider both the spatial and time domains, multivariate embedded vectors Z m (j), 1 ≤ j ≤ N − (m − 1)d are created based on the Taken's embedding theorem [21].For simplicity, we assume c) For every Z m (j), all combinations of the p k=1 m k elements in Z m (j) taken m at a time, termed φ q (j) (q = 1, ... mp m ), are created.The number of the combinations is equal to mp m .Therefore, for all channels, we have In the mvDE algorithm, at least c m +N p elements are stored.This number is noticeably smaller than those for mvDE-I to III, leading to more stable results for signals with a short length and a large number of samples.As the number of patterns obtained by the mvMDE method is τmax to work with reliable statistics.

D. Parameters of the mvMDE, mvMSE, and mvMFE Methods
In addition to the maximum scale factor τ max described before, there are three other parameters for the mvMDE methods, including the embedding dimension vector m, the number of classes c, and the time delay vector d.In practice, it is recommended d k = 1, because some information with regard to the frequency may be ignored for 1 < d.We need 1 < c to keep away the trivial case of having only one dispersion pattern.For simplicity, we use c = 5 and m k = 2 for all signals used in this study, although the range 2 < c < 9 leads to similar results.For more information about c, m k , and d k , please refer to [9].
In this study, d k , m k , and r for the mvMSE and mvMFE were respectively set as 1, 2, and 0.15 of the SD of the original time series following recommendations in [5], [12].The maximum scale factor for mvMSE and mvMFE also follows [5], [12].In the algorithm of mvSE and mvFE, at least N p 2 + N p(pm + 1) elements are stored.Matlab codes of mvMFE and mvMSE are available at http://dx.doi.org/10.7488/ds/1432.Overall, the characteristics and limitations of the mvSE, mvFE, and mvDE algorithms for a p-channel signal with length N are summarized in Table I.

III. EVALUATION SIGNALS
In this section, the descriptions of correlated and uncorrelated noise signals and real time series used in this study are given.
To understand the behaviour of the mvMDE methods on uncorrelated WGN and 1/f noise, we first generated a trivariate time series, where originally all three data channels were realization of mutually independent 1/f noise.Then, we gradually decreased the number of data channels representing 1/f noise (from 3 to 0) and at the same time, increased the number of variates representing independent WGN (from 0 to 3) [22].The number of channels was always three.
To create correlated bivariate noise time series, we first generated a bivariate uncorrelated random time series H. Afterwards, H was multiplied with the standard deviation (hereafter, sigma) and then, the value of the mean (hereafter, mu) was added.Next, H was multiplied by the upper triangular matrix L obtained from the Cholesky decomposition of a defined correlation matrix R (which is positive and symmetric) to set the correlation.Here, we set R = 1 0.95 0.95 1 according to [5], [12].An in-depth study on the effect of correlated and uncorrelated 1/f noise and WGN on multiscale entropy approaches can be found in [5], [6].
Based on the fact that the larger the order of an AR process, the more complex the AR process [5], we evaluate the mvMDE, mvMSE, and mvMFE methods on a bivariate AR process describing the evolution of a set of two variables as a linear function of their past values according to: where y is the 2 × 1 vector of variables, α is the maximum lag in the bivariate AR model, A γ denotes the 2 × 2 matrix of parameters corresponding to lag order γ, and e n is the 2 × 1 vector of error terms assumed to be WGN [23].For simplicity, we set A γ = 0.15 0.15 0.15 0.15 .

B. Real Biomedical Datasets 1) Dataset of Stride Internal Fluctuations:
To investigate the ability of the proposed mvMDE methods to reveal the long-range correlations and dynamics of multivariate signals, the stride interval recordings are used [24], [25].The time series were recorded from ten young, healthy men.Mean age was 21.7 years, changing from 18 to 29 years.Height and weight were 1.77 ± 0.08 meters (mean ± SD) and 71.8 ± 10.7 kg (mean ± SD), respectively.All ten participants provided informed written consent walking for 1 hour at slow, normal, and fast paces and also walking a metronome set to each subject's mean stride interval.Three walking paces were considered as different variables from the same system.In this way, we expect to be able to discriminate between the metronomically-paced and self-spaced walking.For further information, please refer to [25].
2) Dataset of Focal and Non-focal Brain Activity: The ability of the mvMDE methods, in comparison with mvMFE and mvMSE, to differentiate focal from non-focal recordings is evaluated using a publicly-available EEG dataset [26].The dataset includes 5 patients and, for each patient, there are 750 focal and 750 non-focal bivariate signals.The length of each recording was 20 s with sampling frequency of 512 Hz (10240 sample points).Further information can be found in [26].Before computing the aforementioned methods, all recordings were digitally filtered employing an FIR band-pass filter with cut-off frequencies at 0.5 Hz and 40 Hz.
3) Surface MEG Recordings in Alzheimer's Disease: We analysed resting state MEG time series recorded with a 148channel whole-head magnetometer.All 62 participants agreed for the research, which was approved by the local ethics committee.To screen the cognitive status, a mini-mental state examination (MMSE) was done.There were 36 AD patients (age = 74.06± 6.95 years, all data given as mean ± SD, and MMSE score = 18.06 ± 3.36) and 26 controls (age = 71.77± 6.38 years, and MMSE score = 28.88 ± 1.18).The difference in age between two groups was not significant (pvalue = 0.1911, Student's t-test) [27].The distribution of MEG sensors is shown in Fig. 2 in [27].For each participant, five minutes of MEG resting state activity were recorded at a sampling frequency of 169.5 Hz.The signals were divided into 10 s segments (1695 samples) and visually inspected using an automated thresholding procedure to discard epochs noticeably contaminated with artifacts.All recordings were digitally band-pass filtered with a Hamming window FIR filter of order 200 and cut-off frequencies at 1.5 Hz and 40 Hz.For more information, please see [27].

A. Synthetic Signals
We first apply the proposed and existing methods to 40 independent realizations of uncorrelated trivariate WGN and 1/f noise, described in Section III.The number of sample points for each of the 1/f noise and WGN signals were 15000.The average and SD of the results for mvMDE-I, mvMDE-II, mvMDE-III, mvMDE, mvMSE, and mvMFE are depicted in Fig. 1(a) to 1(f), respectively.Using all the existing and proposed methods, the entropy values of trivariate WGN signals are higher than those of the other trivariate time series at low scale factors.However, the entropy values for the coarse-grained trivariate 1/f noise signals stay almost constant or decrease slowly along the temporal scale factor, while the entropy values for the coarse-grained WGN signal monotonically decreases with the increase of scale factors.When the length of WGN signals, obtained by the coarsegraining process, decreases (i.e., the scale factor increases), the mean value of inside each signal converges to a constant value and the SD becomes smaller.Therefore, no new structures are revealed at higher temporal scales.This demonstrates a multivariate WGN time series has information only in small temporal scale factors.In contrast, for trivariate 1/f noise signals, the mean value of the fluctuations inside each signal does not converge to a constant value.
For all the methods, the higher the number of variates representing 1/f noise, the higher complexity the trivariate signal, in agreement with the fact that multivariate 1/f noise is structurally more complex than multivariate WGN [5], [10], [12].Here, for multivariate 1/f noise and WGN, τ max was 20 for mvMDE, according to Section II.
To compare the results obtained by the mvMDE, mvMSE, and mvMFE methods, we used the coefficient of variation (CV) defined as the SD divided by the mean.In fact, CV permits comparison of variability estimates regardless of the magnitude values.We investigate the results obtained by uncorrelated noise signals at scale factor 10, as a trade-off between short and long scale factors.As can be seen in Table II To assess the ability of the mvMDE methods to characterize short signals in comparison with mvMFE and mvMSE, we use trivariate 1/f and WGN noise with length of 300 sample points.The results for the mvMDE, mvMSE, and mvMFE approaches at temporal scales 1 to 20 are depicted in Fig. 2(a) to 2(f), respectively.As can be seen in Fig. 2(a To evaluate the computational time of mvMSE, mvMFE, mvMDE-I to III, and mvMDE, we use uncorrelated multivariate WGN time series with different lengths, changing from 100 to 10,000 sample points, and different number of channels, changing from 2 to 8. The results are depicted in Table III.The simulations have been carried out using a PC with Intel (R) Xeon (R) CPU, E5420, 2.5 GHz and 8-GB RAM by MATLAB R2015a.The results show that the computation times for mvMSE and mvMFE are close.The slowest algorithm is mvMDE-II, while the fastest ones are mvMDE-I and mvMDE, in that order.For an 8-channel signal with 10,000 samples, using mvMSE, mvMFE, and mvMDE-II, the array exceeded the memory available.Overall, in terms of computation time and memory space, mvMDE outperforms all the existing and proposed methods taking into account both the time and spatial domains.
Univariate multiscale entropy approaches only consider every data channel separately and fail to take into account the cross-channel information of multivariate time series [5].Uncorrelated multi-channel WGN has less structural complexity and more irregularity compared with multi-channel 1/f noise.To assess the ability of the existing and proposed multivariate entropy methods to reveal the dynamics across the channels, we created 40 independent realizations of different combinations of bivariate 1/f noise and WGN time series with length 20,000 (according to [5], [12]), making the channels correlated.Fig. 3(a) to 3(d) respectively show the results obtained using the mvMDE-I, mvMDE-II, mvMDE-III, and mvMDE to model both the within-and cross-channel properties in multivariate signals.
mvMDE-I cannot discriminate the correlated from uncorrelated WGN or 1/f noise.This fact is revealed in Fig. 3 (a).Therefore, mvMDE-I should only be used when the components of a multi-channel time series are statistically independent.Multivariate multiscale entropy-based methods at scale factor 1 show the irregularity of multi-channel signals [5].The mvMDE-II, mvMDE-III, and mvMDE values at scale 1 show that the uncorrelated WGN is the most irregular and unpredictable time series in agreement with [6], while the most irregular signals using mvMFE and mvMSE are the correlated   WGN [5], [12], in contrast with the fact that correlated multichannel WGN signals are more predictable and regular than uncorrelated WGN ones [6], [20].
The correlated bivariate 1/f noise is the most complex signal using the mvMDE-II, mvMDE-III, and mvMDE.The second most complex signal is the uncorrelated bivariate 1/f noise, as can be seen in Fig. 3.The decreases of the uncorrelated bivariate WGN noise profiles using mvMDE-II, mvMDE-III, and mvMDE are the largest, evidencing the fact that the uncorrelated WGN is the least complex time series.These facts are also in agreement with the previous studies [5], [10], [12].Therefore, as desired, the mvMDE-II, mvMDE-III, and mvMDE deal with both the cross-and within-channel correlations.
The ability of the mvMDE methods to characterize multivariate AR processes is further evaluated using bivariate AR(1), AR(3), and AR (5).The results obtained by the mvMDE-I, mvMDE-II, mvMDE-III, and mvMDE are shown in Fig. 4(a) to 4(d).As expected, when the lag order increases, the complexity of the corresponding time series using the mvMDE approaches increases, in agreement with the fact that a larger lag order denotes a more complex time series [5].

B. Real Biomedical Datasets
Discrimination of aged and diseased individuals' from control or healthy subjects' time series is a long-lasting challenge in the physiological complexity literature [5], [6], [12], [28].To this end, we use the mvMDE methods, in comparison with mvMFE as an improved version of mvMSE [12], to detect different types of dynamical variability of multivariate recordings of three physiological datasets.Of note is that we do not use the mvMDE-I for biomedical signals, because it does not take into account both the spatial and time domains at the same time.1) Dataset of Stride Internal Fluctuations: For the self-paced versus metronomically-paced stride interval fluctuations, the results obtained by the mvMDE-III, mvMDE, and mvMFE, respectively depicted in Fig. 5(a), (b), and (c), show that the selfpaced unconstrained walk's fluctuations have more complexity and greater long-range correlations than the metronomicallypaced walk's series, in agreement with those obtained by mvMSE, and multivariate empirical mode decompositionenhanced by mvSE [24].We did not use mvMDE-II, as the signals do not follow the minimum number of samples required for mvMDE-II.To compare the results, the CV values for both the metronomically-and self-paced walk (MPW and SPW) at scale factor 4, as a trade-off between the long and short scales, are shown in Table IV   of elements stored by the use of the mvMFE algorithm.
The average and SD of mvMDE and mvMFE values for five regions are respectively shown in Fig. 7(a) and 7(b).As can be seen in Fig. 6 and Fig. 7, the average mvMDE and mvMFE values for AD patients are smaller than those for controls at lower scale factors (short-time scale factors), while at higher scales, the AD subjects' recordings have larger entropy values (long-time scale factors) for both the mvMFE and mvMDE, in agreement with [16], [30], [31].Because the larger the number of channels, the smaller the mvMSE and similarly mvMFE values [16], the entropy values for anterior region are larger than those for the other four regions.It is worth noting that we only use mvMDE, as the signals do not follow the minimum number of samples required for mvMDE-II and III.
The MannWhitney U-test was used to assess the differences between the mvMDE and mvMFE profiles at each temporal scale for AD patients versus controls, because the mvMDE and mvMFE values at each scale factor did not follow a normal distribution.The temporal scales with p-values smaller than 0.001 are shown with * in Fig. 6 and Fig. 7.The pvalues show that the mvMDE, compared with the mvMFE, significantly discriminated the controls from subjects with AD at a larger number of scale factors.Moreover, the smallest pvalue was achieved by the mvMDE, evidencing the superiority of mvMDE over mvMFE.
On the whole, the profiles for the real datasets evidence the advantage of mvMDE-II, mvMDE-III, and mvMDE over mvMFE to discriminate different types of dynamics of multichannel signals as well as the superiority of mvMDE over mvMFE in terms of ability to discriminate various dynamics of time series, computational time, and memory cost.As mentioned before, mvMPE does not consider the spatial domain.We have also refined the mvMPE [14] on the basis of mvMDE-II, mvMDE-III, and mvMDE.These approaches have the following advantages over the first version of mvMPE [14]: 1) they take into account both the spatial and time domains; 2) their results were more stable than the mvMPEbased ones; and 3) better distinguished different dynamics of multivariate signals.However, since the mvMDE methods are considerably faster, result in more stable profiles, and lead to larger differences between physiological conditions of recordings, for simplicity, we did not report the mvMPE-based results.Our future study will aim at proposing the refined composite mvMDE (RCmvMDE) approaches according to [12].Moreover, we will explore the mvMDE and RCmvMDE on other physiological and non-physiological time series.

V. CONCLUSIONS
To quantify the complexity of biomedical multivariate time series, we built four diverse alternative implementations of mvMDE as further developments of our recently introduced MDE [20].These insights help towards a comprehensive understanding of four strategies to extend a univariate-based entropy method to its multivariate versions and therefore, provide invaluable information for future studies on multivariate time series.Although mvMDE was the best algorithm in terms of ability to discriminate dynamics of multivariate signals, computational time, and memory cost, the simpler alternatives (mvDE-I to III) may still be useful in some settings.
We assessed their performance on the correlated and uncorrelated multivariate noise signals, the bivariate AR time series, and three physiological datasets.The results showed the similar behavior of mvMSE-, mvMFE-, and mvMDE-based profiles.However, mvMDE had the following advantages over the existing methods: 1) it was noticeably faster than the existing methods; 2) mvMDE, in comparison with mvMSE and mvMFE, resulted in more stable profiles; 3) mvMDE better discriminated different kinds of biomedical signals; 4) for short multivariate time series, mvMDE, unlike mvMSE, did not result in undefined values; and 5) mvMDE, compared with mvMSE and mvMFE, needed to store a considerably smaller number of elements.
Overall, we expect the mvMDE approach to play a key role in the assessment of complexity in multivariate time series.
) and 2(d), the mvMDE-I and mvMDE methods better discriminate different dynamics of the noise signals.However, the mvMSE values are undefined at higher scale factors.Although the mvMFE-and mvMDE-II-based values are defined at all scale factors, they cannot distinguish the dynamics of different noise signals.The profiles obtained by mvMDE-III are more distinguishable than mvMDE-II, as mentioned that mvMDE-III needs a smaller number of sample points.Nevertheless, the profiles obtained by mvMDE-III have overlaps at several scale factors.Overall, the results show the superiority of mvMDE-I and mvMDE over mvMDE-II, mvMDE-III, mvMSE, and mvMFE for short uncorrelated signals.
. The CV values for the mvMDE-III-and mvMDE-based profiles are smaller than those for mvMFE, showing the superiority of the proposed methods over mvMFE in terms of the stability of results.The smallest CV values are achieved by the mvMDE.2) Dataset of Focal and Non-focal Brain Activity: For the focal and non-focal EEG recordings, the results obtained by mvMDE-II, mvMDE-III, mvMDE, and mvMFE, respectively depicted in Fig. 5 (a), (b), (c), and (d), show that the focal time series are less complex than the non-focal ones, in agreement

Fig. 8 :
Fig. 8: Mean value and SD of the results obtained by (a) mvMDE and (b) mvMFE computed from 36 AD patients versus 26 elderly age-matched controls over five scalp regions.Red and blue indicate AD patients and controls, respectively.The scale factors with p-values smaller than 0.001 are shown with *.

TABLE I :
Ability to deal with spatial domain and characterization of short signals, minimum number of elements to be stored, and minimum number of samples needed for each of the mvSE, mvFE, and mvDE algorithms for a p-channel signal with length N.

TABLE II :
CV values of the proposed and existing multivariate multiscale entropy-based analyses at scale factor 10 for the uncorrelated trivariate 1/f noise and WGN.

TABLE V :
CV values of the entropy results at scale factor 6 using mvMDE-II, mvMDE-III, mvMDE, and mvMFE for focal and non-focal EEG recordings.To assess the ability of mvMDE, in comparison with mvMFE, we applied the methods to the 148-channel MEG signals to discriminate AD patients from controls.Because mvMFE needs to store a huge number of elements for a signal with a large number of channels, mvMFE was not able to simultaneously analyse all 148 time series.However, the results using mvMDE are depicted in Fig.6.It represents an advantage of mvMDE over mvMFE for signals with a large number of channels.To compare the mvMFE and mvMDE, we applied the methods to five main scalp regions, namely, anterior (17 channels), right (34 channels) and left lateral (34 channels), central (29 channels), and posterior (34 channels) areas, leading to the smaller number of channels to noticeably decrease the number Mean value and SD of the results obtained by mvMDE computed from 36 AD patients versus 26 elderly controls for all the 148 channels.Red and blue respectively indicate AD patients and controls.The scales with p-values smaller than 0.001 are shown with *.