An Analytical EM Algorithm for Sub-Gaussian Vectors

: The area in which a multivariate α -stable distribution could be applied is vast; however, a lack of parameter estimation methods and theoretical limitations diminish its potential. Traditionally, the maximum likelihood estimation of parameters has been considered using a representation of the multivariate stable vector through a multivariate normal vector and an α -stable subordinator. This paper introduces an analytical expectation maximization (EM) algorithm for the estimation of parameters of symmetric multivariate α -stable random variables. Our numerical results show that the convergence of the proposed algorithm is much faster than that of existing algorithms. Moreover, the likelihood ratio (goodness-of-ﬁt) test for a multivariate α -stable distribution was implemented. Empirical examples with simulated and real world (stocks, AIS and cryptocurrencies) data showed that the likelihood ratio test can be useful for assessing goodness-of-ﬁt.


Introduction
Several empirical studies confirm that the real financial market data are often skewed and heavy-tailed (e.g., [1][2][3]). Therefore, a Gaussian distribution rarely fits the datafor example, stock returns or risk factors are badly fit by this law [4,5]. Note that the α-stable distribution offers a reasonable improvement, if not the best choice, among the alternative distributions that have been proposed in the literature (e.g., [6,7]). Thus, normal distributions are usually substituted by more general stable distributions that allow us to model both the effects of leptokurtosis and asymmetry in the data.
In the one-dimensional case, the distribution of the random α-stable variable is characterized by four parameters: the tail (stability) parameter α ∈ (0; 2], skewness β ∈ [−1; 1], scale σ > 0 and position µ ∈ 1 . The stability parameter α is the most important, because it describes the behavior of the tails [2]. Moreover, when α → 2, the distribution of the random variable is equivalent to a normal distribution and β → 0 (normal tail behavior). If α → 0, then the distribution becomes degenerate (extremely heavy tails). For example, Fielitz and Smith [8] showed that a symmetric univariate α-stable distribution appears to be appropriate when stock prices are modeled. Rachev and Mittnik [2] estimated the parameters of autoregressive-moving-average models with asymmetric stable Paretian distributions. Kabasinskas et al. [9] proposed a mixed-stable model to model the intra-daily data from German DAX component stock returns. Furthermore, the parameters of the stable distribution collectively and the skewed t-distribution, calculated on the basis of stock data, were applied to predict the direction of the change in future Accounting and Governance Risk rating [10]. Ogata [11] revealed that cryptocurrencies are a special class of assets. During various clustering procedures, parameters of α-stable distribution (together with other) were used as the main features.
One of the most recent applications of α-stable and truncated α-stable distributions in pension fund selection was published by Kopa [12]. They used first-, second-and third-order stochastic dominance rules to select the best pension fund from second pillar in Lithuania on the basis of the risk profile of the participant.
Furthermore, the multivariate α-stable symmetric vector with the stability parameter α is expressed through a normally distributed random vector whose variance is distributed by an α 2 -stable distribution [13][14][15][16] where s 1 is a subordinator with the stability parameter α 2 < 1; s 2 is a random vector, distributed by the d-variate normal law N(0, Ω); and µ is a random vector of the means. The random α-stable variable with α 2 < 1 and β = 1 is called a subordinator. In general, a subordinator is a one-dimensional non-decreasing Lévy process [17]. The estimation of parameters of multivariate α-stable distributions has been a possibility for long time [18][19][20][21][22][23]; however, researchers emphasize that methods of the estimation of parameters still have weaknesses. For example, the authors of [20] gave an example of an application of the elliptical sub-Gaussian distribution to DAX 30 and reported how to deal with issues when the α values of underlying assets are different and data are asymmetric; however, they could not incorporate time dependencies. Daul [24] explained how to deal with large portfolios when assets have very different marginal distributions (αs are very different). Furthermore, Nolan [25] gave an example of how to use it for Dow Jones 30 portfolios and introduced how to solve tail and serial dependency issues. Moreover, Kouaissah [26] showed how to use parametric multivariate stochastic dominance in the case of a sub-Gaussian distribution, even with different tail parameters. Such results are crucial in portfolio selection and management problems. However, practical applications of the multivariate stable law are limited by the fact that its distribution and density functions are not expressed through elementary functions. That said, Ravishanker's group and Rezakhah's group [15,27] reported the estimation of parameters of sub-Gaussian vectors by stochastic EM algorithms. The MCMC approach for the maximum likelihood estimation of model parameters (µ, Ω, α) was considered by Ravishanker [15]. However, the accuracy of stochastic methods (e.g., [27]) is highly dependent on internal sample size, the error of stochastic integration and convergence criteria.
The analytical EM parameter estimation approach is considered in this paper. Since the probability density function (see the next section) of a random symmetric multivariate α-stable vector might be expressed through bivariate integrals, it may be calculated using Gaussian and Gauss-Laguerre quadratures instead of using stochastic integration. This approach increases accuracy, simplifies the calculation of likelihood functions and increases computational efficiency compared to regular MLE methods and stochastic EM algorithms. Now, let us discuss goodness-of-fit tests that were used in similar studies. The well known Kolmogorov-Smirnov test [28,29] is sensitive to differences in the position and shape of the sample distribution. It is extremely difficult to implement the proposed methods, as they require a transformation whose derivation is analytically intractable for most multivariate distributions. Another famous goodness-of-fit test is the Anderson-Darling test [30], which is used to estimate statistical differences between observed values and theoretical univariate distributions. However, there has been no multivariate case for this test created yet. Goodness-of-fit tests for univariate, symmetric, α-stable distributions were considered by Matsui and Takemura [31]. Their methodology requires complicated numerical integration in order to preform goodness-of-fit tests for symmetric stable distributions. A completely different approach based on the empirical characteristic function tests for the multivariate stable distributions was considered by Meintanis [32]. However, they used weighted L 2 -type statistics that are difficult to compute. In [33], specific goodness-of-fit test for multivariate normal distributions are proposed. The authors claimed that their test is overall the most powerful and is effective against skewed, long-tailed and shorttailed symmetric alternatives. Moreover, they claimed that this test can be used for testing fitness for an assumed p-variate non-normal distribution. However, the major weakness of the test is related to the proposed test statistics. McAssey [34] introduced a technique of a goodness-of-fit test for general multivariate distributions based on multivariate nor-mal, uniform, Student's t and beta distributions. This method is simple to implement, involves a reasonable computational burden and does not require analytically intractable derivations. It has sufficient power to detect a poor fit in most situations. However, the authors did not provide evidence that such a technique could be used for a multuvariate α-stable distribution.
As an alternative to the mentioned goodness-of-fit tests, in this paper, we suggest a likelihood ratio test (developed in [35]). This test has never been used in the scientific literature for the goodness-of-fit of multivariate α-stable distributions. However, this likelihood ratio test is computationally fast and does not require specific knowledge. The only two things that must be known are how to generate random variables and how to compute a likelihood function.
The paper is structured as follows: the likelihood function and the maximum likelihood method are described in Section 2. The EM algorithm is considered and applications of the Gaussian and Gauss-Laguerre quadrature formulas for bivariate integration are discussed in Section 2.3. The likelihood ratio test for distribution fitting with the estimated parameters is given in Section 2.4. In Section 3, we provide evidence of the convergence of the EM algorithm and real world examples with stock data, biomedical data and cryptocurrency data. Finally, we provide empirical recommendations in Section 3.5 and discuss our findings in Section 4.

Materials and Methods
In this section, we introduce maximum likelihood estimates of a multivariate α-stable distribution, the proposed expectation maximization (EM) algorithm, quadrature formulas and the likelihood ratio test. The experimental design is also presented at the end of this section.

Maximum Likelihood Estimation
Maximum likelihood estimates (MLE) of parameters of any model are obtained by maximizing the likelihood function, calculated for a sample consisting of independent vectors [5,15,36,37]. Let us consider the probability density of a random vector created according to Equation (1). Indeed, the density of the d-dimensional normal vector N(µ, s · Ω) is as follows: where |Ω| > 0, d 1. Now, let us write down the probability density of an α-stable subordinator [2,38]: where s 0, −1 y 1, 0 < α 2 and Thus, the probability density of a symmetric multivariate α-stable random vector with given parameters µ, Ω, α is expressed as a bivariate integral: However, it is very difficult to use this function in practice, even in a univariate case. Let us consider the sample X = X 1 , X 2 , . . . , X K consisting of independent d-variate α-stable vectors. The likelihood function by virtue of (2) is The maximum likelihood estimates can be obtained by maximizing the latter function with respect to µ, Ω and α. Note that very small or very large values of it can occur during numerical calculations. Therefore, from a computational point of view, it is reasonable to use the log-likelihood function (LLF): where B(·) is denoted as This optimization problem may be solved in many ways; however, it is very time consuming and inefficient. In the next section, we introduce the expectation maximization approach in MLE estimation.

The Expectation Maximization Algorithm
Let us consider a two-step recurrent procedure, in which the parameters µ and Ω of the multivariate α-stable distribution are estimated at each iteration by the expectation maximization (EM) algorithm, and then α is estimated by solving the corresponding univariate optimization problem.
Since LLF is an analytical function, its partial derivatives with respect to µ and Ω can be found. Hence, if α is fixed, then the Maximal Likelihood estimates of the parameters µ and Ω can be found by setting partial derivatives of (3) to zero and solving the corresponding system of nonlinear equations: Using the rules of differentiation with respect to vector or matrix [39], one can write the derivatives Let us denote a few important functions g i , f i , h(·) and w(·) Using the notation from above, the derivatives of LLF can be rewritten in the following way: By setting the later derivatives to zero, one can write down the fixed point equations for MLE:μ = h X,μ,Ω,α Assume the initial values µ 0 , Ω 0 and α 0 are chosen. Then, k iterations of the EM algorithm can be generated and estimates µ k , Ω k and α k at each iteration can be computed [40,41]. Following the two-step approach, estimates of the mean and covariance matrix are calculated for a particular tail parameter α k : Thus, an estimate of a stability parameter α at the (k + 1)th step is obtained by solving the univariate LLF minimization problem: Our empirical studies show that the LLF for fixed parameters µ and Ω is unimodal with respect to the stability parameter α (for examples, see Results Section). Hence, in the current study, the golden section search method is applied to the minimization of (9).
The sample mean vector and sample covariance matrix can be selected correspondingly as initial values of µ and Ω: and Hence, the initial value of the stability parameter can be taken as α 0 = 1.5. Many literature sources support this assumption, since, usually, for financial datasets, the tail parameter is from interval (1; 2) (e.g., see [4]). Mainly, the value is greater than 1.5 for assets from developed financial markets, and it is smaller for cryptocurrencies and emerging markets.

Gauss and Gauss-Laguerre Quadratures
Integrals in the expressions of the LLF (3), and corresponding estimates and minimization problems (4) or (7)-(9), can be computed by subroutines of mathematical programming systems such as MathCad and Maple, or using open source code, e.g., R or Python. Computational experiments with the α-stable models of several variables have shown that solving (4) or (7)-(9) using MathCad (on PC Intel(R) Core(TM) i5, CPU 2.40 GHz, RAM 4.00 GB, OS 64-bit) usually requires many hours of processing time (see [16], Section 6.1). Since such computational time can be unacceptable in many applications, it is wise to implement the quadrature formulas for integration. Let us consider the Gaussian and Gauss-Laguerre quadrature formulas for the calculation of integrals in LLF and related expressions [41][42][43][44].
Thus, to calculate the integrals in Equations (3), (5) and (6) with respect to y, Gaussian quadrature formulas are used: where f (y i ) is an integrated function, ϑ i and y i are the Gaussian quadrature weights and nodes over [−1, 1] and m is the number of nodes. The positions y i (termed nodes) and weighting coefficients ϑ i (termed weights) are chosen such that the integral is evaluated exactly by the quadrature (12), for some class of functions f (usually polynomials of a given degree). The Gaussian quadrature nodes y i are roots of polynomials P m (y) and can be computed by any root-finding algorithm [44].
To calculate the integrals in (3), (5) and (6) with respect to z, the Gauss-Laguerre quadrature formulas are applicable.
where f (z i ) is an integrated function, κ i and z i are the Gaussian-Laguerre quadrature weights and nodes (the ith zeros of Laguerre polynomials L γ n (z)) over [0, ∞] and n is the number of nodes. Note that the appropriate choice of parameter γ with respect to the dimensionality of the problem and the initial value of the tail parameter α can improve calculation time. In this study, we set γ = 0.
Computational experiments showed that it is enough to take Gauss-Laguerre quadrature with 20 nodes and 70 nodes for Gaussian quadrature to guaranty admissible accuracy of the corresponding calculations. Tables of the nodes and weights for Gaussian and Gaussian-Laguerre quadratures can be found in [42].

A Likelihood Ratio Test for Multivariate, α-Stable Distribution
There are any different and reliable goodness-of-fit tests that could be used to test whether data fit a univariate distribution well. However, this cannot be said about the multivariate case. Among other multivariate (Kolmogorov-Smirnov, based on the characteristic function, based on specific test statistics, based on a beta function, etc.) methods, the most simple and straightforward is the likelihood ratio test [35].
The likelihood ratio test [45] can be applied to asses the quality of the MLE estimation of the multivariate α-stable model too. The set of parameters (μ,Ω,α) of the model is assumed to be known when considering the likelihood ratio test for distribution fitting to some data sample X.
The method works as follows. At the beginning, N multivariate random vectors Y with α-stable parameter estimates (μ,Ω,α) are simulated. Then, the empirical distribution [45]. If the empirical probability F

Experimental Design
Now, let us describe the experiment. First, we generated multivariate α-stable random variables X using techniques described in [15]. Secondly, we estimated a set of parameters (μ,Ω,α) of sample X using different algorithms (direct minimization of (3) and analytical EM algorithm). We assumed that direct minimization of (3) gives "true" estimates of parametersμ,Ω andα. To assess the quality of the EM algorithm, we simulated more (M) data samples with the same parametersμ,Ω,α. From each iteration, we collected information and checked the convergence of the EM algorithm. If the LLF value and parameter estimates converged (with precision 10 −8 ) to a value obtained by direct MLE, then we could say that the proposed algorithm converged and was reliable. Later, we proceeded with a goodness-of-fit test (likelihood ratio test). Finally, we give three real word examples: the stock market, biomedicine and cryptocurrency.

Results
The estimation of MLE α-stable parameters by the EM algorithm is an iterative process that requires one to choose initial values, (10) and (11), and perform a certain number of iterations-(7)-(9)-while the values in adjacent steps differ insignificantly. To test the behavior of the designed algorithm, the experiments were performed with simulated and real world data. The integrals were calculated using the Gaussian (12) and the Gauss-Laguerre (13) quadratures.

Test Data Modeling
To test the convergence of the proposed analytical EM algorithm, the following experiment was conducted.
Then, for each two-dimensional vector, we estimated parameters of a bivariate, αstable distribution using the proposed EM algorithm and direct minimization of (3). We assume that the direct MLE method gives true estimates of a bivariate, α-stable distribution. During each iteration, we registered EM estimates of all parameters until their values deviated from the previous estimate by no more than 10 −8 . Moreover, the terminal estimate was compared with the corresponding true value (obtained by direct method), and the statistical significance of differences was tested. It turns out that such precision ensures that there are no statistically significant differences between the true value and the corresponding terminal estimate. Thus, the analytical EM algorithm converged for all 100 samples. The averaged results are shown in Figure 1.
The averages of LLF and EM estimates are shown as solid lines for each iteration; corresponding values of direct MLE are dashed lines; and dotted lines are 95% confidence intervals calculated using the normal approximation. As shown in Figure 1, the convergence of the EM algorithm is rather fast independently of the sample size. The likelihood function and location parameter converged in 5-7 iterations; however, tail parameter α and covariance matrix converged only in slightly fewer than 20 iterations. This is not a surprise, because LLF is much more sensitive to changes in α and Ω. In the case of shorter samples (K = 50, Figure 1a 5) and (6). It must be noted that the same convergence (as seen in the average case) was observed in each of the 100 cases, too. This means that, independently of the sample selected, the EM estimates always converged to the values obtained by the direct MLE method.
Finally, the likelihood ratio test was performed for all simulated datasets, where a particular LLF value in each case was compared with the empirical distribution function of the LLF with original parameters α, µ and Ω. If we set the significance level p to 0.01, then, according to the likelihood ratio test (Section 2.4), the empirical probability should be within the interval p 2 , 1 − p 2 , i.e., (0.005, 0.995). In this case, we cannot reject the hypothesis that the simulated data fit a symmetrical, multivariate α-stable distribution with given parameters α, µ and Ω. During our experiment, there were no cases when the hypothesis could be rejected.
It must be noted that analytical EM algorithm is 30 times faster than the direct MLE algorithm and could be used in practice.

Stock Market Data
In the previous section, we showed that the EM algorithm is fast and converges in approximately 20 iterations independently of the length of the dataset. Theoretically, if the datasets are symmetrically distributed with the same parameter α, then the EM algorithm should converge quickly and the parameter estimates should converge to true values. We now show a practical example with real stock price data from Yahoo [46]. The algorithm was applied to daily stock returns of four companies, AAPL, TSLA, LUMIN and ATnT, collected during 2019-2021. This experiment dataset consisted of four-dimensional vectors of length K = 545.
First, we investigate the marginal characteristics of the underlying returns (AAPL, TSLA, LUMIN and ATnT). Table 1 presents the empirical characteristics and estimates of marginal α-stable distributions. As we can see, the mean returns of AAPL and TSLA were positive, while those of LUMIN and ATnT were negative, which indicates that prices of the first two were mostly increasing during the period analyzed, while the prices of the latter two mainly decreased. We use the vector of empirical means as one of input parameters µ 0 for the EM algorithm. Furthermore, negative skewness and kurtosis above three indicate that the marginal empirical distributions for all the assets deviate from normality/symmetries. This assumption is supported by the fact that estimated marginal parameters α are different from two and β are different from zero. Moreover, Anderson-Darling statistics (A-D) that is calculated for each sample, if we assume an α-stable distribution with given parameter estimates, indicate that we cannot reject the non-parametric hypothesis about data fitness with α-stable distribution. It must be noted that α was slightly different for all assets analyzed, which indicates that we deviated from theoretical assumptions, but not too far.
Next, in Table 2, we show the correlation and covariance among the assets analyzed. The covariance matrix is input parameter Ω 0 for the EM algorithm, while the correlation allows us to understand how strong the linear relation between variables is. As can be seen, stock returns are positively but weakly correlated. Only ATnT and LUMIN have correlations above 0.5. Figure 2 shows that the LLF for fixed µ and Ω is unimodal with respect to the tail parameter α. Moreover, this property was observed in all iterations.
Therefore, the application of the golden section search method is reasonable in order to minimize LLF with respect to α. In this case, the EM algorithm converged in fewer than 30 iterations. Figure 3 shows the dependence of the id of iterations and LLF and the estimates of parameters α, µ and Ω.  As it may be observed, the convergence is slightly slower in the stock data case compared with the simulated datasets. Such behavior might be due to the fact that the real datasets are asymmetric and have different marginal parameters α. However, this assumption is not supported by any kind of experiments yet.
According to results of the EM algorithm, the parameter estimates of the multivariate α-stable distribution for stock data are as follows: α 14 = 1.51459, while the terminal value of the LLF 29 function was equal to −5193.62. The subscript above indicates the iteration at which the parameter estimate converged. As we can see, the LLF converged after 29 iterations so we say that the algorithm converged after 29 iterations too. The estimateα 14 = 1.51459 confirms the assumption (from literature) that α for stock returns from developed markets is above 1.5.
We demonstrated that, even if the data are not purely symmetrically distributed, EM converges quickly and allows us to obtain ML estimates of parameters of multivariate α-stable random variables. However, it was not shown yet how good the the estimation is. Now, let us show the likelihood ratio test as a goodness-of-fit test.
First, M = 100 four-dimensional testing samples with estimated parametersα 14 ,μ 27 andΩ 28 were generated, and LLF values for each sample were calculated. Then, the empirical distribution function from LLF values was constructed (shown as a solid line in Figure 4). Secondly, the empirical probability that the random value of LLF is less than or equal to the terminal value of LLF = −5193.62 (test statistics) was calculated (shown as a dashed line in Figure 4). In the case of stock market data, the empirical probability of the test statistics was equal to 0.020927.
Finally, if we set the significance level to 0.01, according to the likelihood ratio test, this probability is within the interval (0.005, 0.995). This means that we cannot reject the hypothesis that the data fit a multivariate, symmetric, α-stable distribution with parameterŝ α 14 ,μ 27 andΩ 28 .

AIS Data Modeling
In a similar manner, we give an example of the analysis of data from the Australian Institute of Sport (AIS), examined in [47] (the database is often used for statistical calculations). This dataset contains various biomedical measurements (body fat (Bfat, %), lean body mass (LBM, kg), height (Ht, cm) and total weight (Wt, kg)) of a group of female Australian athletes [48]. In this experiment, the data consisted of a four-dimensional vector of length K = 100. Table 3 gives the empirical characteristics and estimates of marginal α-stable distributions for female athletes. As shown in Table 3, the data are not leptokurtic (kurtosis < 3), which means that the data are light-tailed or lack outliers. Moreover, the negative kurtosis of Bfat indicates that its distribution has a lighter tail than the normal distribution. Bfat, LBM and Wt are fairly symmetrical, because their skewness is between −0.5 and 0.5. All mentioned observations indicate that Bfat, LBM and Wt could be normally distributed. This assumption is supported by goodness-of-fit statistics (A-D statistics), which, being less that 2, indicate that Bfat, LBM and Wt are α-stable distributed with α → 2 (normal distribution). However, the marginal estimate of parameter α for Ht is equal to 1.6858, and we cannot state that the four-dimensional vector is distributed by a multivariate normal distribution anymore. That is why we proceed with the estimation of parameters of the multivariate α-stable distribution. If the terminal α of the EM algorithm turned out to be close to 2, then we would run a test for multivariate normality.
Next, we give estimates of correlation and covariance matrices for AIS data (see Table 4). As shown in Table 4, AIS data are strongly correlated, with the exception of Bfat, which correlates weakly with LBM and Ht. When we initiated the EM algorithm (in the same vein as in previous section), we used the vector of means as the initial value of µ 0 and the covariance matrix as the initial value of Ω 0 . Similar to Figure 2, in Figure 5, we show that the LLF for a fixed µ and Ω is unimodal with respect to the stability parameter α. Using the EM algorithm described in this paper, the LLF value and parameter estimates were registered during 100 iterations (see Figure 6).
The optimal value of the parameter α was obtained after 14 iterations. However, the EM algorithm converged with respect to the mean and covariance matrix, respectively, after 40 and 47 (slightly more than in stock case) iterations. From the figure of parameters µ (middle column in Figure 6), it may appear that the convergence of the location parameter was achieved very early; however, as the initial value of µ was luckily selected very close to true value, the oscillations were very small. The greatest absolute deviation of µ in sequential iterations was in the interval [−0.00071, 0.00339] and visually cannot be seen. According to the EM algorithm, the estimates of the parameters of the multivariate α-stable distribution for AIS data are as follows:  22. It is interesting to note that the sample size in the AIS data case was five times smaller than that of the stock data; however, the number of iterations was nearly double. As we can see again, terminalα = 1.61459 is not very different form the smallest marginal α = 1.685812. Furthermore, the estimatedα was different from two, which is why we can reject the hypothesis about the multivariate normality of AIS data. Now, let us show a likelihood ratio test in the same manner as in the previous section. First, we constructed empirical distribution function from LLF values with parametersα 14 , µ 40 andΩ 47 (shown as a solid line in Figure 7). Secondly, we calculated the empirical probability that a random value of LLF is less than or equal to the terminal value LLF = 1039.22 (shown as a dashed line in Figure 7). In the case of AIS data, the empirical probability is equal to 0.3662.
Finally, we cannot reject the hypothesis that AIS data fit a symmetric multivariate α-stable distribution with parametersα 14 ,μ 40 andΩ 47 because this probability is within the interval (0.005, 0.995).

Cryptocurrency Data
Finally, we show how the EM algorithm can be applied to estimate the parameters of a four-dimensional cryptocurrency dataset. As an example, Bitcoin, Ethereum, Ripple and ADA (during 2019-2021 [46]) were used. Hence, the length of the vector is K = 791 (exchanges operate 24/7 the whole year). Table 5 gives empirical characteristics and estimates of marginal α-stable distributions of four cryptocurrencies. There is no big surprise that the characteristics of cryptocurrency data are different from those analyzed in Sections 3.2 and 3.3. The first obvious things that are very different from the previous sections are the skewness and kurtosis. The negative skewness is nearly twice greater than for stocks and AIS data, while the kurtosis is extremely large and exceeds 20. Such a composition indicates that we should expect fat left tails, i.e., α → 1 and β → −1. However, marginal estimates of the α-stable distribution show that such an assumption is not completely true. Parameter β is mainly close to 0, which indicates fair symmetry of the tails. Moreover, marginal αs are not as extreme as expected, but they are significantly different from stocks and AIS data.
Next, in Table 6, we check for empirical correlation and covariance. As we can see, cryptocurrencies are average (above 0.5) to strongly (above 0.8) correlated, which indicates that there are high chances that the prices of underlying assets will move in the same direction. Such a portfolio would have a higher risk than a less correlated one.
Next, we ran the EM algorithm to estimate the parameters of the multivariate α-stable distribution. There is no surprise that the LLF for fixed µ and Ω is unimodal with respect to the stability parameter α (see Figure 8). Using the EM algorithm, the LLF value and parameter estimates were registered for the first 100 iterations. However, the convergence conditions were reached much more quickly (see Figure 9).  while the terminal value of the LLF 50 function is equal to −7013.09. By contrast to the stock market, the estimate of parameter α for cryptocurrency is below 1.5, which is not a surprise. Now, let us show a likelihood ratio test in the same manner as in the two previous sections. First, we constructed an empirical distribution function from LLF values with parametersα 14 ,μ 39 andΩ 36 (shown as a solid line in Figure 10). Secondly, we calculated the empirical probability that a random value of LLF is less than or equal to the terminal value LLF = −7013.09 (shown as dashed line in Figure 10). In the case of cryptocurrency data, the empirical probability is equal to 0.08119 and is near the lower bound of non-rejection (0.005, 0.995) region. However, we cannot reject hypothesis that cryptocurrency data fits multivariate symmetric α-stable distribution with parameterŝ α 14 ,μ 39 andΩ 36 .

Recommendations
In the previous three sections, we demonstrate examples of how to estimate parameters of a multivariate α-stable distribution using an analytical EM algorithm. In the case of stock data, the estimated tail parameterα of the multivariate case was very close to the smallest marginal α estimate (given in Table 1). Moreover, a similar situation was observed in AIS and cryptocurrency data. This gives us an idea of how to chose the initial α 0 for applications of the EM algorithm in the future.
Furthermore, the estimateμ in the case of stock data was more similar to the vector of medians or vector µ of marginal estimates rather than the vector of empirical means. However, for AIS data, marginal empirical distributions were symmetrical (all empirical location parameters were very similar) so there was no difference if we chose a vector of empirical means, medians or marginal parameters µ as the initial value for µ 0 .
By summarizing two latest assumptions, we can recommend one to chose the smallest marginal α, vector of marginal µ parameters (or medians) and covariance matrix as initial values for the EM algorithm. This could increase efficiency and reduce the number of iterations required to converge.
By summarizing the findings from the last four sections, we can give the following recommendations for the implementation of an analytical EM algorithm:

1.
Check for empirical characteristics of datasets. The vectors of means or medians can be used as initial values for µ 0 , and the covariance matrix as an input parameter for Ω 0 .

2.
If estimates of a marginal α-stable distribution are available, then choose the minimal α as the initial value of α 0 and the vector of location parameters as µ 0 .

3.
Run the EM algorithm and check for convergence. 4.
(Optional) When one or more parameter estimates converge, stop updating them and optimize only with respect to parameters that did not converge. However, such a modification may cause the algorithm to get stuck in a local minimum.

5.
Use the likelihood ratio test to check the goodness-of-fit.

Discussion and Conclusions
Today, the research of distributions related to α-stable ones, including sub-Gaussian vectors, is more relevant, because they often occur in the analysis of economic data, information flows along computer networks, cryptocurrency markets, etc. The statistical maximum likelihood approach is constructed by the analytical EM algorithm in this paper, which allows us to estimate the parameters of the symmetric multivariate α-stable distribution with high precision. Computational experiments showed that multivariate α-stable distribution parameter estimates, obtained by the developed numerical method, are statistically adequate, because, after a certain number of iterations, the value of the likelihood function and estimates of the parameters converged to their corresponding ("true") values. However, due to the fact that real data are often asymmetric, it is very important to perform goodness-of-fit tests. The proposed likelihood ratio test (for assessment of the goodness-of-fit) showed that the data fit a multivariate α-stable model well with obtained estimates in all three cases analyzed (stock market data, biomedical measurements data and cryptocurrency data). In future research, this test could be compared with other multivariate goodness-of-fit tests, e.g., the one proposed in [34] or based on characteristic functions [32].
Moreover, it is more important to make sure that the methodology is robust with respect to the asymmetry and the fact that marginal αs in real data are different. This is also the future trend of our research.
Furthermore, in the Results Section, we emphasized that the number of iterations is nearly independent of sample size; however, the duration of calculations (real CPU time) is highly dependent on that. This is due to the fact that the likelihood function is sample dependent. We plan (in the near future) to implement parallel summation in LLF and parallel quadrature integration to reduce overall processing time.
Finally, it must be noted that the analytical EM algorithm is 30 times faster than the regular ML algorithm and could be very useful in practice. However, it is currently not available in R or Python, where the stochastic EM algorithm is already implemented.