A Universal, Simple, Circular Statistics-Based Estimator of α for Symmetric Stable Family

Abstract: The aim of this article is to obtain a simple and efficient estimator of the index parameter of the symmetric stable distribution that holds universally, i.e., over the entire range of the parameter. We appeal to directional statistics, using the classical result on wrapping of a distribution to obtain the wrapped stable family of distributions. The performance of the estimator obtained is better than that of the existing estimators in the literature in terms of both consistency and efficiency. The estimator is applied to model some real life financial datasets. A mixture of normal and Cauchy distributions is compared with the stable family of distributions when the estimate of the parameter α lies between 1 and 2. A similar approach can be adopted when α (or its estimate) belongs to (0.5, 1). In this case, one may compare with a mixture of Laplace and Cauchy distributions. A new measure of goodness of fit is proposed for the above family of distributions.


Introduction
Our motivation in this paper is to obtain a universal and efficient estimator of the tail index parameter α of the symmetric stable distribution (explained in Section 2; see Nolan (2003)). This is achieved by appealing to methods available in circular statistics. We recall that there exist two popular estimators of α in the literature. The Hill estimator proposed by Hill (1975), which uses a linear function of the order statistics, can be used to estimate α ∈ [1, 2] only. Furthermore, it is "extremely sensitive" to the choice of k (explained in Section 6) even for other values of α. Hill (1975) and Dufour and Kurz-Kim (2010) pointed out other drawbacks of the Hill estimator. The other estimator, proposed by Anderson and Arnold (1993), is based on the characteristic function approach. However, this estimator cannot be obtained in closed form and has to be computed numerically. Furthermore, neither its asymptotic distribution nor its variance and bias are available in the literature.
Our approach in this paper appeals to circular statistics and is based on the method of trigonometric moments as in SenGupta (1996) and later also discussed in Jammalamadaka and SenGupta (2001). This stems from the very useful result which presents a closed analytical form of the density of a wrapped (circular) stable distribution obtained by wrapping the corresponding stable distribution which need not have any closed form analytic representation for arbitrary α. This result shows that α is preserved as the same parameter even after the wrapping. Furthermore, this paper presents a goodness of fit test based on the wrapped probability density function, which may be used as a necessary condition to ascertain the fit of the stable distribution. We exploit this approach with the real life examples. This estimator has a simple and elegant closed form expression. It is asymptotically normally distributed with mean α and variance available in a closed analytical form. Furthermore, from extensive simulations under parameter configurations encountered in financial data, it is exhibited that this new estimator outperforms both the estimators mentioned above almost uniformly in the entire comparable support of α. In Section 2, the probability density function of the wrapped stable distribution and some associated notations are introduced. The moment estimator of the index parameter is also defined in this section. Section 3 presents the derivation of the asymptotic distribution of the moment estimator defined in Section 2. In Section 4, an improved estimator of the index parameter is obtained. Section 5 shows the derivation of the asymptotic distribution of the improved estimator using the multivariate delta method. In addition, the asymptotic variance is computed for various values of the parameters through simulation. 
In Section 6, the performance of the improved estimator is compared with those of the Hill estimator and the characteristic function-based estimator, based on their root mean square errors obtained through simulation. In Section 7, the procedure of the various computations is presented. In Section 8, the proposed estimator is applied to some real life data. We conclude with remarks on the performance of the various estimators and some comments on future scope in Section 9. Finally, the tables showing the various computations and the figures on the applications of data are given in Appendices A, B and C.

The Trigonometric Moment Estimator
The regular symmetric stable distribution is defined through its characteristic function given by
$$\varphi(t) = \exp\left(i\mu t - \sigma^{\alpha}|t|^{\alpha}\right),$$
where µ is the location parameter; σ is the scale parameter, which we take as 1; and α is the index or shape parameter of the distribution. Here, without loss of generality, we take µ = 0.
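From the stable distribution, we can obtain the wrapped stable distribution (the process of wrapping is explained in Jammalamadaka and SenGupta (2001)). Suppose $\theta_1, \theta_2, \ldots, \theta_m$ is a random sample of size m drawn from the wrapped stable distribution (given in Jammalamadaka and SenGupta (2001)), whose probability density function is given by
$$f(\theta) = \frac{1}{2\pi}\left[1 + 2\sum_{p=1}^{\infty}\rho^{p^{\alpha}}\cos p(\theta - \mu_0)\right], \quad 0 \le \theta < 2\pi, \qquad (1)$$
where $\rho = e^{-\sigma^{\alpha}}$ is the concentration parameter and $\mu_0 = \mu \bmod 2\pi$ is the mean direction.
It is known in general from Jammalamadaka and SenGupta (2001) that the characteristic function of θ at the integer p is defined as
$$\psi_\theta(p) = E\left[e^{ip\theta}\right] = \alpha_p + i\beta_p, \quad \alpha_p = E\cos p\theta, \quad \beta_p = E\sin p\theta.$$
Furthermore, from Jammalamadaka and SenGupta (2001), it is known that, for the p.d.f. given by Equation (1),
$$\alpha_p = \rho^{p^{\alpha}}\cos p\mu_0, \quad \beta_p = \rho^{p^{\alpha}}\sin p\mu_0.$$
We define the sample trigonometric moments $\bar{C}_j = \frac{1}{m}\sum_{i=1}^{m}\cos j\theta_i$ and $\bar{S}_j = \frac{1}{m}\sum_{i=1}^{m}\sin j\theta_i$, j = 1, 2. Then, we note that $\bar{R}_1 = \sqrt{\bar{C}_1^2 + \bar{S}_1^2}$ and $\bar{R}_2 = \sqrt{\bar{C}_2^2 + \bar{S}_2^2}$. By the method of trigonometric moments estimation, equating $\bar{R}_1$ and $\bar{R}_2$ to the corresponding functions of the theoretical trigonometric moments (ρ and $\rho^{2^{\alpha}}$, respectively), we get the estimator of the index parameter α as (see SenGupta (1996)):
$$\hat{\alpha} = \frac{1}{\ln 2}\ln\frac{\ln \bar{R}_2}{\ln \bar{R}_1}.$$
Then, we define $\bar{R}_j = \frac{1}{m}\sum_{i=1}^{m}\cos j(\theta_i - \bar{\theta})$, j = 1, 2, where $\bar{\theta}$ is the mean direction given by $\bar{\theta} = \arctan(\bar{S}_1/\bar{C}_1)$. Note that $\bar{R}_1 \equiv \bar{R}$. We consider two special cases.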
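The wrapped stable density referred to as Equation (1) can be evaluated numerically by truncating its cosine series. The following is a minimal sketch, assuming the standard series form with trigonometric moments ρ^(p^α); the function names and the truncation parameter `terms` are hypothetical and not taken from the paper. For small α the terms decay slowly, so many more terms than the default may be needed.

```python
import math


def wrapped_stable_pdf(theta, alpha, rho, mu0=0.0, terms=200):
    """Truncated-series approximation of the wrapped stable density.

    Assumed form: (1 / 2pi) * [1 + 2 * sum_p rho**(p**alpha) * cos(p * (theta - mu0))].
    `terms` is an illustrative truncation point, not a value from the paper.
    """
    total = 1.0
    for p in range(1, terms + 1):
        total += 2.0 * rho ** (p ** alpha) * math.cos(p * (theta - mu0))
    return total / (2.0 * math.pi)


def integrate_pdf(alpha, rho, grid=2000):
    """Riemann sum of the density over [0, 2*pi); should be close to 1."""
    step = 2.0 * math.pi / grid
    return sum(wrapped_stable_pdf(i * step, alpha, rho) for i in range(grid)) * step
```

For α = 1 the series sums to the wrapped Cauchy density (1 − ρ²) / (2π(1 + ρ² − 2ρ cos θ)), which gives a simple check of the truncation.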

Special Case 1: µ = 0, σ = 1
We now consider the case as treated by Anderson and Arnold (1993), specifically µ = 0 and σ = 1, and hence the concentration parameter ρ = exp(−1), as both parameters are known. This case may arise when one has historical data or prior information on the scale parameter. In such a case, the probability density function reduces to
$$f(\theta) = \frac{1}{2\pi}\left[1 + 2\sum_{p=1}^{\infty}e^{-p^{\alpha}}\cos p\theta\right], \quad 0 \le \theta < 2\pi.$$
In addition, by the method of trigonometric moments estimation, since $E\cos 2\theta = \rho^{2^{\alpha}} = \exp(-2^{\alpha})$, the estimator of the index parameter α is given by
$$\hat{\alpha}_1 = \frac{1}{\ln 2}\ln\left(-\ln \bar{C}_2\right).$$

Special Case 2: µ = 0, σ unknown
Next, we consider a general case when µ = 0 and σ is unknown, and hence the estimator of the concentration parameter is $\hat{\rho} = \bar{R}_1$. This case is especially useful in many real life applications; for example, for price changes in financial data, µ = 0 is a standard assumption. In such a case, the probability density function reduces to
$$f(\theta) = \frac{1}{2\pi}\left[1 + 2\sum_{p=1}^{\infty}\rho^{p^{\alpha}}\cos p\theta\right], \quad 0 \le \theta < 2\pi.$$
In addition, by the method of trigonometric moments estimation, the estimator of the index parameter α is given by
$$\hat{\alpha}_2 = \frac{1}{\ln 2}\ln\frac{\ln \bar{C}_2}{\ln \bar{C}_1}.$$
As is also seen in Anderson and Arnold (1993), for financial data, after using the log-ratio transformation, the location parameter of the transformed variable becomes zero. Hence, the case of µ ≠ 0 was not considered by Anderson and Arnold (1993), and accordingly not by us either, for the comparison made in this paper.
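The trigonometric moment estimators of this section can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the general estimator uses the centred form of R̄_j (cosines of θ_i − θ̄), the mean direction is computed with the quadrant-aware `atan2`, the σ = 1 estimator is taken as log₂(−ln C̄₂), and a NaN is returned whenever a sample moment falls outside (0, 1) so that the logarithms are undefined. All function names, including the wrapped Cauchy sampler used for checking, are hypothetical.

```python
import math
import random


def trig_moments(thetas):
    """Sample moments (C1, C2, S1, S2) of the angles."""
    m = len(thetas)
    c1 = sum(math.cos(t) for t in thetas) / m
    c2 = sum(math.cos(2.0 * t) for t in thetas) / m
    s1 = sum(math.sin(t) for t in thetas) / m
    s2 = sum(math.sin(2.0 * t) for t in thetas) / m
    return c1, c2, s1, s2


def log_ratio(second, first):
    """(1 / ln 2) * ln(ln(second) / ln(first)); NaN if a logarithm is undefined."""
    if not (0.0 < second < 1.0 and 0.0 < first < 1.0):
        return float("nan")
    return math.log(math.log(second) / math.log(first)) / math.log(2.0)


def alpha_hat(thetas):
    """General estimator from angles centred at the sample mean direction."""
    c1, _, s1, _ = trig_moments(thetas)
    theta_bar = math.atan2(s1, c1)
    m = len(thetas)
    r1 = sum(math.cos(t - theta_bar) for t in thetas) / m
    r2 = sum(math.cos(2.0 * (t - theta_bar)) for t in thetas) / m
    return log_ratio(r2, r1)


def alpha_hat_1(thetas):
    """Special Case 1 (mu = 0, sigma = 1, so rho = exp(-1) is known)."""
    _, c2, _, _ = trig_moments(thetas)
    if not 0.0 < c2 < 1.0:
        return float("nan")
    return math.log(-math.log(c2)) / math.log(2.0)


def alpha_hat_2(thetas):
    """Special Case 2 (mu = 0, sigma unknown)."""
    c1, c2, _, _ = trig_moments(thetas)
    return log_ratio(c2, c1)


def wrapped_cauchy_sample(m, seed=0):
    """Illustrative data: standard Cauchy draws (alpha = 1, sigma = 1), wrapped."""
    rng = random.Random(seed)
    return [math.tan(math.pi * (rng.random() - 0.5)) % (2.0 * math.pi) for _ in range(m)]
```

On a large wrapped Cauchy sample all three estimators should return values near 1, since the Cauchy law is the symmetric stable law with α = 1.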

Derivation of the Asymptotic Distribution of the Moment Estimator
$$\sqrt{m}\,(T_m - \mu) \xrightarrow{L} N_4(0, \Sigma),$$
where $T_m = (\bar{C}_1, \bar{C}_2, \bar{S}_1, \bar{S}_2)'$, µ is the mean vector given by $\mu = (\rho\cos\mu_0,\ \rho^{2^{\alpha}}\cos 2\mu_0,\ \rho\sin\mu_0,\ \rho^{2^{\alpha}}\sin 2\mu_0)'$ and Σ is the dispersion matrix, whose entries are denoted A, B, C, D, E, F, G, H, I and J.
Proof. The derivations for the proof are given in Appendix A.
Hence, assuming a large sample size, the central limit theorem (Feller (1971)) gives that $(\bar{C}_1, \bar{C}_2, \bar{S}_1, \bar{S}_2)'$ is asymptotically distributed as $N_4(\mu, \Sigma/m)$, where µ is the mean vector and Σ is the dispersion matrix, with A, B, C, D, E, F, G, H, I and J as defined above.
For Special Case 1, assuming a large sample size, the central limit theorem (Feller (1971)) gives $\bar{C}_2 \xrightarrow{L} N(\mu, \sigma^2/m)$, where µ is the mean given by $\mu = \rho^{2^{\alpha}}$ and σ² is the dispersion given by $\sigma^2 = mV(\bar{C}_2)$, that is,
$$\sigma^2 = \frac{1 + \rho^{4^{\alpha}}}{2} - \rho^{2\cdot 2^{\alpha}}.$$
Therefore, by the delta method (given in Casella and Berger (2002)), $\hat{\alpha}_1$ is asymptotically normally distributed.
For Special Case 2,
$$\sqrt{m}\,(T_m - \mu) \xrightarrow{L} N_2(0, \Sigma),$$
where $T_m = (\bar{C}_1, \bar{C}_2)'$, µ is the mean vector given by $\mu = (\rho, \rho^{2^{\alpha}})'$ and Σ is the dispersion matrix.
Proof. The derivations for the proof are given in Appendix C.
Hence, assuming a large sample size, the central limit theorem (Feller (1971)) gives that $(\bar{C}_1, \bar{C}_2)'$ is asymptotically distributed as $N_2(\mu, \Sigma/m)$, with µ and Σ as above. Therefore, by the delta method (given in Casella and Berger (2002)), $\hat{\alpha}_2$ is asymptotically normally distributed.
The above theorems imply that the estimator is consistent. Hence, in large samples, the performance of the estimator is reasonably good. Now, assuming the sample size is large, say 100, we calculate the asymptotic variance $\gamma'\Sigma\gamma/100$ of $g(T_m) = \hat{\alpha}$ for different values of α ranging from 0 to 2 and different values of ρ ranging from 0 to 1 in Table 1.
Table 1. Asymptotic variances of the moment estimator $\hat{\alpha}$ and the modified truncated estimator $\hat{\alpha}^*$.
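The delta-method variance γ'Σγ/m can be computed directly for the case µ = 0 with σ unknown. The sketch below is an assumption-laden illustration rather than the paper's derivation: it takes γ to be the gradient of g(c₁, c₂) = ln(ln c₂ / ln c₁)/ln 2 at the theoretical moments, and it builds Σ as the covariance matrix of (cos θ, cos 2θ) from E cos pθ = ρ^(p^α) using product-to-sum identities. The function names are hypothetical, and the values are not claimed to match Table 1.

```python
import math


def trig_moment(p, alpha, rho):
    """E[cos(p * theta)] for the wrapped stable law with mu = 0."""
    return rho ** (p ** alpha)


def asymptotic_variance_alpha2(alpha, rho, m):
    """Delta-method variance of ln(ln C2 / ln C1) / ln 2 for sample size m.

    Requires 0 < rho < 1 so that the logarithms are finite and non-zero.
    """
    a1, a2, a3, a4 = (trig_moment(p, alpha, rho) for p in (1, 2, 3, 4))
    # Covariance of (cos theta, cos 2 theta):
    # cos^2 x = (1 + cos 2x) / 2 and cos x cos 2x = (cos x + cos 3x) / 2.
    v11 = 0.5 * (1.0 + a2) - a1 * a1
    v22 = 0.5 * (1.0 + a4) - a2 * a2
    v12 = 0.5 * (a1 + a3) - a1 * a2
    # Gradient of g at (a1, a2).
    g1 = -1.0 / (math.log(2.0) * a1 * math.log(a1))
    g2 = 1.0 / (math.log(2.0) * a2 * math.log(a2))
    return (g1 * g1 * v11 + 2.0 * g1 * g2 * v12 + g2 * g2 * v22) / m
```

Because the result is a quadratic form in a covariance matrix, it is non-negative for every admissible (α, ρ); it grows quickly as ρ approaches 0 or 1, where the logarithms in the gradient become unstable.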

Improvement Over the Moment Estimator
The moment estimator need not always remain in the support of the true parameter α (that is, (0, 2]). Hence, the moment estimators proposed above need not be proper estimators of α. A modified estimator free from this defect, denoted $\hat{\alpha}^*$, is obtained by truncating $\hat{\alpha}$ to this support (since the support of α excludes non-positive values). Thus, the density $g(\hat{\alpha}^*)$ of $\hat{\alpha}^*$ is a mixed distribution of one atomic mass function and a continuous function.

Derivation of the Asymptotic Distribution of the Modified Truncated Estimators
Now, using the asymptotic normal distribution of $\hat{\alpha}$, we can derive the same results for the modified truncated estimator of the index parameter α as we have done for the method of moments estimator of α.
The mean and the asymptotic variance of $\hat{\alpha}^*$ are obtained from its density, where $f(\hat{\alpha})$ denotes the probability density function of $\hat{\alpha}$, which is asymptotically normal (as noted above). The means and asymptotic variances of $\hat{\alpha}_1^*$ and $\hat{\alpha}_2^*$ are obtained similarly.
Now, assuming the sample size m is large, say 100, the asymptotic variances of the modified truncated estimator $\hat{\alpha}^*$ for different values of α and different values of ρ (ranging from 0 to 1) are displayed in Table 1.

Comparison of the Proposed Estimator With the Hill Estimator and the Characteristic Function Based Estimator
Next, we want to compare the performance of this modified truncated estimator with that of a popular estimator known as the Hill estimator (Hill (1975); Dufour and Kurz-Kim (2010)), which is a simple non-parametric estimator based on order statistics. Given a sample of n observations $X_1, X_2, \ldots, X_n$, the Hill estimator is defined as
$$\hat{\alpha}_H = \left[\frac{1}{k}\sum_{j=1}^{k}\ln X_{n+1-j:n} - \ln X_{n-k:n}\right]^{-1},$$
where k is the number of observations which lie on the tails of the distribution of interest and is to be optimally chosen depending on the sample size n and the tail thickness α, as k = k(n, α), and $X_{j:n}$ denotes the jth order statistic of the sample of size n.
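The Hill estimator described above is short enough to sketch directly. This is a minimal illustration assuming the usual upper-tail form (mean log excess of the k largest observations over the (n − k)th order statistic, inverted); the function names are hypothetical, and the choice of k is left to the caller, since the optimal k(n, α) is not reproduced here. The Pareto sampler is only a convenient test case with a known tail index.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the tail index from the k largest observations."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = xs[n - k - 1]  # the (n - k)th order statistic
    if threshold <= 0.0:
        raise ValueError("the threshold order statistic must be positive")
    mean_log_excess = sum(math.log(x / threshold) for x in xs[n - k:]) / k
    return 1.0 / mean_log_excess


def pareto_sample(alpha, n, seed=0):
    """Illustrative data with exact tail index alpha: X = U**(-1/alpha), U in (0, 1]."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
```

For symmetric data the estimator is typically applied after relocating the sample and keeping one tail (or absolute values); that preprocessing step is outside this sketch.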
The asymptotic normality of the Hill estimator is provided by Goldie and Richard (1987), under the condition that a certain limit exists and is non-zero valued; combined with Equation (3), this gives the result we need for comparing the performances of the estimators for α.
In addition, we compare the performance of the modified truncated estimator $\hat{\alpha}_2$ with that of the characteristic function-based estimator of Anderson and Arnold (1993), which is obtained by minimization of an objective function (where µ = 0 and σ = 1). The performance of the modified truncated estimator $\hat{\alpha}_3$ is compared with that of the characteristic function-based estimator of Anderson and Arnold (1993), which is obtained by minimization of the corresponding objective function (where µ = 0 and σ is unknown). In these objective functions, $x_1, x_2, \ldots, x_n$ are realizations from the symmetric stable (α) distribution and $z_i$ is the ith zero of the mth degree Hermite polynomial $H_m(z)$.
It is to be noted that, for an estimate of α below 1, we do not know any explicit form of the probability density function. However, for a value of the estimator between 1 and 2, i.e., for $1 < \hat{\alpha}^* < 2$, we may compare the fit with the stable family by modeling a mixture of normal and Cauchy distributions and then using the method proposed in Anderson and Arnold (1993), with an objective function in which $\hat{\eta}(t)$ is the same as defined above, with the realizations taken from the mixture distribution.
Here, $\psi_{NC}$ denotes the corresponding theoretical characteristic function of the mixture, in which p denotes the mixture proportion and $\sigma_1$ and $\sigma_2$ are the scale parameters of the normal and Cauchy distributions, respectively (the location parameters are taken as zero, for the reason mentioned above). Finally, a measure for the goodness of fit is proposed as:

Index of Objective function (I.O.) = Objective function + Number of parameters estimated
The distribution for which I.O. is minimum gives the best fit to the data. The modified truncated estimator based on the moment estimator is free of the location parameter, since it is defined in terms of $\bar{R}_j = \frac{1}{m}\sum_{i=1}^{m}\cos j(\theta_i - \bar{\theta})$, j = 1, 2, that is, in terms of the quantity $(\theta_i - \bar{\theta})$, which is centered with respect to the mean direction $\bar{\theta}$, although it is not free of the nuisance parameter, namely the concentration parameter ρ. The Hill estimator is scale invariant, since it is defined in terms of logarithms of ratios, but it is not location invariant. Therefore, centering needs to be done in order to take care of the location invariance.

Computational Studies
The analytical variance of the untruncated moment estimator was compared with that of the modified truncated estimator, as presented in Table 1, for values of α < 1, which is more applicable in practical situations for volatile data.
The comparison of the performances of the two estimators is shown in Table 2. The parameter configurations were chosen as given by Hill (1975) and Dufour and Kurz-Kim (2010). The simulation is presented in Table 2 for the values of α = 1.01, 1.25, 1.5, 1.75, and 1.9, each with sample size n = 100, 250, 500, 1000, 2000, 5000, and 10,000, and for different values of ρ = 0.2, 0.4, 0.6, and 0.8, when the skewness parameter β = 0, the location parameter µ = 0, and the scale parameter $\sigma = (-\ln\rho)^{1/\alpha}$, i.e., the concentration parameter $\rho = e^{-\sigma^{\alpha}}$. For each combination of α and n, 10,000 replications were performed. In this simulation, the sample was relocated by three different relocations, viz. true mean = 0, estimated sample mean, and estimated sample median, and the root mean square errors (RMSEs) were compared.
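A simulation of this kind can be sketched with the standard library alone. The code below is an illustration under explicit assumptions, not the authors' simulation code: symmetric stable variates are drawn with the Chambers-Mallows-Stuck construction (a swapped-in, standard generator; the paper does not say which generator was used), the scale is set to σ = (−ln ρ)^(1/α) as stated above, and the estimator evaluated is the untruncated µ = 0 form based on C̄₁ and C̄₂. All function names are hypothetical.

```python
import math
import random


def symmetric_stable(alpha, sigma, rng):
    """One symmetric alpha-stable draw with characteristic function exp(-|sigma t|**alpha)."""
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    while w == 0.0:  # guard against a (very unlikely) zero exponential draw
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        x = math.tan(u)
    else:
        x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
             * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return sigma * x


def alpha_hat_2(thetas):
    """Moment estimator for mu = 0: ln(ln C2 / ln C1) / ln 2, NaN if undefined."""
    m = len(thetas)
    c1 = sum(math.cos(t) for t in thetas) / m
    c2 = sum(math.cos(2.0 * t) for t in thetas) / m
    if not (0.0 < c1 < 1.0 and 0.0 < c2 < 1.0):
        return float("nan")
    return math.log(math.log(c2) / math.log(c1)) / math.log(2.0)


def rmse_alpha_hat(alpha, rho, n, reps, seed=0):
    """Monte Carlo RMSE of alpha_hat_2 over `reps` wrapped samples of size n."""
    rng = random.Random(seed)
    sigma = (-math.log(rho)) ** (1.0 / alpha)
    squared_errors = []
    for _ in range(reps):
        thetas = [symmetric_stable(alpha, sigma, rng) % (2.0 * math.pi) for _ in range(n)]
        estimate = alpha_hat_2(thetas)
        if not math.isnan(estimate):  # skip replications where the estimator is undefined
            squared_errors.append((estimate - alpha) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A call such as `rmse_alpha_hat(1.5, 0.6, 2000, 20)` runs a small version of the experiment; the paper's 10,000 replications per configuration would take correspondingly longer in pure Python.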
Next, Table 3 compares the performance of the modified truncated estimator $\hat{\alpha}_2$ with that of the characteristic function-based estimator. The simulation is presented for the values of α = 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, and 2.0, each with sample size n = 20, 30, 40, and 50, and the values of σ were taken as 3, 5, and 10. For each combination of α and n, 10,000 replications were performed.
Table 3. Comparison of the RMSEs of the modified truncated estimator $\hat{\alpha}_3$ (RMSE3) and the characteristic function-based estimator (RMSE4) when µ = 0 and σ unknown.
The asymptotic variance of the characteristic function-based estimator, unlike that of the modified truncated estimator, is not available in any closed analytical form. We are thus unable to present the Asymptotic Relative Efficiency (ARE) of these estimators of α analytically. Instead, we compared them through their MSEs based on extensive simulations over small, moderate, and large sample sizes.

Applications
Inference on the Gold Price Data (In US Dollars)
Gold price data, say $x_t$, were collected per ounce in US dollars over the years 1980-2013. These were transformed as $z_t = 100(\ln x_t - \ln x_{t-1})$, which were then "wrapped" to obtain $\theta_t = z_t \bmod 2\pi$ and finally transformed to $\theta = (\theta_t - \bar{\theta}) \bmod 2\pi$, where $\bar{\theta}$ denotes the mean direction of $\theta_t$ and θ denotes the variable thetamod as used in the graphs. The Durbin-Watson test performed on the log ratio transformed data shows no significant autocorrelation. The test statistic of Watson's goodness of fit test (Jammalamadaka and SenGupta (2001)) for the wrapped stable distribution was obtained as 0.01632691 and the corresponding p-value as 0.9970284, which is greater than 0.05, so the wrapped stable distribution is not rejected for the transformed gold price data (in US dollars). The modified truncated estimate $\hat{\alpha}_1^*$ is 0.3752206, while the estimate by the characteristic function method is 0.401409. The value of the objective function using the characteristic function estimate is 2.218941, while that using our modified truncated estimate is 2.411018.
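The transformation applied to the price series (log returns scaled by 100, wrapped to the circle, then centred at the mean direction) can be sketched as follows. This is a minimal illustration; the function name is hypothetical, and `atan2` is used as the quadrant-aware form of arctan(S̄₁/C̄₁), which is an implementation choice not stated in the text.

```python
import math


def wrap_log_returns(prices):
    """Map a positive price series to centred angles in [0, 2*pi)."""
    two_pi = 2.0 * math.pi
    # z_t = 100 * (ln x_t - ln x_{t-1})
    z = [100.0 * (math.log(b) - math.log(a)) for a, b in zip(prices, prices[1:])]
    # theta_t = z_t mod 2*pi
    theta_t = [v % two_pi for v in z]
    # Mean direction of theta_t.
    s1 = sum(math.sin(t) for t in theta_t) / len(theta_t)
    c1 = sum(math.cos(t) for t in theta_t) / len(theta_t)
    theta_bar = math.atan2(s1, c1)
    # theta = (theta_t - theta_bar) mod 2*pi
    return [(t - theta_bar) % two_pi for t in theta_t]
```

After centring, the sine components of the returned angles sum to zero up to rounding, which is a quick way to confirm that the mean direction has been removed.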

Inference on the Silver Price Data (In US Dollars) (1980-2013)
Data on the price of silver in US dollars, collected per ounce over the same time period, underwent the same transformation. The Durbin-Watson test performed on the log ratio transformed data shows no significant autocorrelation. Watson's goodness of fit test for the wrapped stable distribution was also performed; the value of the statistic was obtained as 0.02530653 and the corresponding p-value is 0.9639666, which is greater than 0.05, so the wrapped stable distribution is not rejected for the transformed silver price data (in US dollars) either. The modified truncated estimate of the index parameter α is 0.4112475, while the estimate by the characteristic function method is 0.644846. The value of the objective function using the characteristic function estimate is 2.234203, while that using our modified truncated estimate is 2.234432.

Inference on the Silver Price Data (In INR)
Data on the price of silver in INR were also collected per 10 grams over the same time period. The p-value for the Durbin-Watson test performed on the log ratio transformed data is 0.3437, which indicates no significant autocorrelation. Watson's goodness of fit test was also performed on the transformed data; the value of the statistic was obtained as 0.03382334 and the corresponding p-value is 0.8919965, which is greater than 0.05, so the wrapped stable distribution is not rejected for the silver price data (in INR) either. The estimate $\hat{\alpha}_1^*$ is 1.142171, which is the same as the characteristic function estimate. The value of the objective function using the characteristic function estimate is 2.813234, while that using our modified truncated estimate is 2.665166. Since the estimate of α lies between 1 and 2, a mixture of normal and Cauchy distributions is fitted by the method of Anderson and Arnold (1993) to estimate the respective parameters. The initial value of the scale parameter ($\sigma_1$) for the normal distribution is taken as the sample standard deviation and that for the Cauchy distribution ($\sigma_2$) as the sample quartile deviation. In addition, different initial values of the mixing parameter p yield the same estimates of the parameters, viz. $\hat{p} = 0.165$, $\hat{\sigma}_1 = 14.38486$, and $\hat{\sigma}_2 = 0.077$, and the value of the objective function was found to be 0.9308165. Then, the value of I.O. using the modified truncated estimate (assuming the stable distribution) is 4.665166 (2.665166 + 2), using the characteristic function estimate (assuming the stable distribution) is 4.813234 (2.813234 + 2), and using the characteristic function estimate (assuming a mixture of normal and Cauchy distributions) is 3.9308165 (0.9308165 + 3). Thus, it can be observed using the I.O. measure that a mixture of normal and Cauchy distributions gives the best fit to the data. The maximum likelihood estimate of α assuming the wrapped stable distribution is 1.1421361. Akaike's information criterion (AIC) value assuming the wrapped stable distribution is 153.5426 and that assuming a mixture of normal and Cauchy distributions is 201.4.
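The I.O. comparison for the silver (INR) data is simple arithmetic, and can be reproduced as a sketch from the objective-function values and parameter counts reported above. The function and dictionary names are hypothetical; only the numbers come from the text.

```python
def index_of_objective(objective_value, n_params):
    """I.O. = objective function value + number of parameters estimated."""
    return objective_value + n_params


# (objective function value, number of estimated parameters), as reported in the text.
candidates = {
    "stable, modified truncated estimate": (2.665166, 2),
    "stable, characteristic function estimate": (2.813234, 2),
    "normal-Cauchy mixture": (0.9308165, 3),
}

io_values = {name: index_of_objective(value, k) for name, (value, k) in candidates.items()}
best_fit = min(io_values, key=io_values.get)  # the model with the smallest I.O.
```

The additive penalty means that a model with one more parameter is preferred only if it lowers the objective function by more than 1.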

Inference on the Box and Jenkins Stock Price Data
The Series B Box and Jenkins (IBM) common stock closing price data, obtained from Box et al. (2016), were transformed in the same way as the preceding datasets. The Durbin-Watson test performed on the log ratio transformed data shows no significant autocorrelation. Watson's test statistic for the goodness of fit test was obtained as 0.0554223 and the corresponding p-value as 0.6442058, which is greater than 0.05, so the wrapped stable distribution is not rejected for the stock price data. The estimates of the index parameter α and the concentration parameter ρ obtained by the modified truncation method are 1.102854 and 0.4335457, respectively.

Findings and Concluding Remarks
It can be observed from Table 1 that the asymptotic variance of the untruncated estimator is reduced for the corresponding truncated estimator, indicating the efficiency of the truncated estimator.
It can also be noted from Table 2 that, for α = 1.01, the RMSE of the modified truncated estimator is less than that of the Hill estimator when the sample is relocated by three different relocations, viz. true mean = 0, sample mean, and sample median, for higher values of the concentration parameter ρ = 0.5, 0.6, 0.8, and 0.9 for sample sizes n = 100, 250, 500, and 1000 and for ρ = 0.3, 0.4, 0.6, 0.8, and 0.9 for sample sizes n = 2000, 5000, and 10,000. Furthermore, it can be observed that, for α = 1.25, 1.5, 1.75 and 1.9, the RMSE of the modified truncated estimator is less than that of the Hill estimator for different relocations for ρ = 0.6, 0.7, 0.8, and 0.9 for smaller sample sizes and even for ρ = 0.5 for larger sample sizes. This clearly indicates the efficiency of the modified truncated estimator over the Hill estimator for higher values of the concentration parameter ρ.
It can be observed in Table 3 that the RMSE of the modified truncated estimator is less than that of the characteristic function-based estimator for almost all values of α corresponding to all values of σ.
The Hill estimator (Dufour and Kurz-Kim (2010)) is defined for 1 ≤ α ≤ 2, whereas the modified truncated estimator is defined for the whole range 0 < α ≤ 2. In addition, the overall performance of the modified truncated estimator compares well, in terms of efficiency and consistency, with both the Hill estimator and the characteristic function-based estimator.
Thus, we have established an estimator of the index parameter α that always lies in its parameter space (0, 2]. It can be observed from the above real life data applications that the modified truncated estimate is quite close to the characteristic function-based estimate. In addition, it is simpler and computationally easier than the estimator defined in Anderson and Arnold (1993). Thus, it may be considered a better estimator.
Again, when the estimate of α lies between 1 and 2, we attempt to model the data by a mixture of two distributions whose index parameters are the two extreme values, that is, a mixture of Cauchy (α = 1) and normal (α = 2) distributions when 1 < α < 2, or a mixture of double exponential (α = 1/2) and Cauchy (α = 1) distributions when 1/2 < α < 1. The fit is then compared with that of the stable family of distributions.
We could have used the usual technique of non-linear optimization, as used in Salimi et al. (2018), for estimation, but it is computationally demanding and the (statistical) consistency of the estimators obtained by such a method is unknown. In contrast, our proposed methods of trigonometric moment and modified truncated estimation are much simpler and computationally easier, they possess useful consistency properties, and their asymptotic distributions can be presented in simple and elegant forms, as already proved above.
Author Contributions: Problem formulation, formal analyses and data curation, A.S.; formal and numerical data analyses, M.R.

Acknowledgments:
The research of the second author of the paper was funded by the Senior Research Fellowship from the University Grants Commission, Government of India. She is also thankful to the Indian Statistical Institute and the University of Calcutta for providing the necessary facilities.

Conflicts of Interest:
The authors declare no conflicts of interest.