Diagnostic Analytics for an Autoregressive Model under the Skew-Normal Distribution

: Autoregressive models have played an important role in time series. In this paper, an autoregressive model based on the skew-normal distribution is considered. The estimation of its parameters is carried out by using the expectation–maximization algorithm, whereas the diagnostic analytics are conducted by means of the local inﬂuence method. Normal curvatures for the model under four perturbation schemes are established. Simulation studies are conducted to evaluate the performance of the proposed procedure. In addition, an empirical example involving weekly ﬁnancial return data are analyzed using the procedure with the proposed diagnostic analytics, which has improved the model ﬁt.


Introduction
For time series data analysis, autoregressive (AR) modeling is an essential technique and has been applied in many areas including biology, chemistry, earth sciences, economics, education, engineering, finance, health, medicine, and physics; see, for example, [1,2] for recent accounts of time series modeling and applications. Issues related to estimation and testing for AR models are extensive and well-established; see [3,4] for related issues to statistical diagnostics which are of equal importance. Local influence diagnostics is the study of how relevant minor perturbations impact the fit of the model and the results of statistical inference. This has become a useful statistical methodology after [5] introduced the idea of local influence to aid in the identification of potentially influential observations. Diagnostic analytics is used in a number of regression and time series models. Among others, Refs. [6][7][8][9][10][11][12][13][14][15] investigated the local influence of linear or nonlinear regression models under non-normal distributional assumptions. In a framework of time series data, Refs. [16][17][18] considered influence diagnostics for AR and vector AR models under normal or elliptical distributions.
A standard assumption for time series models is that their errors are mutually independent and follow normal or symmetric distributions, as studied in [16][17][18]. However, it is known that certain financial and other datasets feature errors with skewed distributions. In order to deal with such data, the skew-normal (SN) distribution and its scale-mixtures have provided an appealing alternative and can therefore be adopted. Their properties, extensions, and applications are becoming increasingly popular; see [19][20][21][22][23][24][25][26]. In addition, Refs. [27][28][29] studied SN linear, linear mixed, and nonlinear models. Ref. [8] analyzed diagnostics in the nonlinear model with scale mixtures of SN and AR errors. For particular financial applications, Ref. [30] looked at asset pricing issues with return data following SN models. To our knowledge, no study on influence diagnostics in AR time series models under SN distributions have been reported. Therefore, the objective of the present paper is to formulate an AR model under the SN distribution (SN AR model) and to derive diagnostic analytics with applications to financial data. We use the matrix differential calculus pioneered by [31] to establish the mathematical results used in our data analysis. We implement the maximum likelihood (ML) method with the expectation-maximization (EM) algorithm to estimate the SN AR model parameters, whereas the local influence method with four perturbation schemes is used for the diagnostic analytics. The EM algorithm has now become a popular iterative technique for the ML estimation method with incomplete data; see [32,33]. The paper proceeds as follows. In Section 2, the SN AR model is introduced and the ML estimations of the model parameters are derived. Section 3 presents the local influence method and establishes normal curvatures under four perturbation schemes. In Section 4, a simulation study and an empirical example involving an AR model are presented, while the effectiveness of the proposed diagnostics is illustrated and discussed. Our concluding remarks are addressed in Section 5. The derivations of the normal curvatures are presented in the Appendix A.

The SN AR Model
A random variable Y is said to follow an SN distribution with location parameter µ, scale parameter σ 2 and skewness parameter λ, which is denoted by Y ∼ SN(µ, σ 2 , λ), if its probability density function is given by where φ and Φ are the standard normal probability density and cumulative distribution functions, respectively. We see that, if λ = 0, then the probability density function of Y defined in (1) reduces to the normal probability density function. If Y ∼ SN(µ, σ 2 , λ), then E(Y) = µ + σδ √ 2/π and Var(Y) = σ 2 − (2/π)σ 2 δ 2 , where δ = λ/ √ 1 + λ 2 . Let Y t be generated by a stationary AR(p) process given by where Y t is a time series, with Y 1 , . . . , Y p being the p initial values for y t−j , such that x t = (y t−1 , . . . , y t−p ) is the p × 1 vector of SN AR(p) covariates; β j is a regression coefficient, for j = 1, . . . , p, such that β = (β 1 , . . . , β p ) is a p × 1 regression coefficient vector, and u t is the scalar error term which follows an SN distribution, that is, u t ∼ SN(0, σ 2 , λ), where σ 2 is the scale parameter and λ is the skewness parameter. Thus, θ = (β, σ 2 , λ) is the (p + 2) × 1 vector of SN AR(p) model parameters. Note that we do not include an intercept in the SN model above. The expected value of u t is not zero, due to the assumption that the expected value of the underlying normal distribution is zero. Then, we choose the non-zero expected value of u t as an approximate to replace an intercept for the SN model.

ML Estimation
Finding the ML estimate of the parameter vector θ by direct maximization of the log-likelihood can potentially be a difficult task, so we implement the EM algorithm for this estimation. Let y c = (y o , y m ) denote the complete data, with y o being the observed data and y m the missing data. We use the notation Y c , Y o , Y m for the random vectors associated with y c , y o and y m , respectively. Starting from θ (0) as the initial estimate, we can get θ (0) , θ (1) , . . . by running the two steps of the EM algorithm iteratively as defined below.
We suppose that (θ|ω) is twice continuously differentiable in a neighborhood of ( θ, ω 0 ). We are interested in the comparison of θ and θ ω using the local influence method idea, which is to investigate the degree of inference affected by those minor changes in the corresponding perturbations. Ref. [5] used likelihood displacement (LD) to assess the influence of the perturbation ω defined as LD(ω) = 2( ( θ) − ( θ ω )). Here, large values of LD(ω) indicate that θ and θ ω differ considerably in relation to the contours of the non-perturbed log-likelihood function (θ). The idea is based on studying the local behavior of LD(ω) and the normal curvature C l (θ) in a unit-length direction vector l, where ||l|| = 1. According to [5], the normal curvature used to examine the local influence of the perturbation vector at where l is a q × 1 vector of unit length, −¨ is the (p + 2) × (p + 2) observed information matrix for the postulated model, ∆ is the (p + 2) × q perturbation matrix for the perturbed model, and −¨ and ∆ are evaluated at θ = θ and ω = ω 0 . The suggestion is to make the local influence diagnostic analytics by finding the maximum curvature C max = max ||l||=1 C l , where C max corresponds to the largest absolute eigenvalue λ max and its associated eigenvector l max of the matrixF = ∆ ¨ −1 ∆. If the absolute value of the ith element of l max is the largest, then the ith observation in the data may be the most influential.
To examine the magnitude of influence, it is useful to have a benchmark value for C max and for the elements of l max , see [10,18,34].

Local Influence for the SN AR Model
Next, we conduct a local influence diagnostic analytics for the SN AR(p) model. Due to the complexity of the SN distribution, we obtain the ML estimates based on the EM algorithm. As suggested by [29,34], the Q function and a Q function obtained similarly as LD can be used to replacethe log-likelihood function and likelihood displacement in the method of [5], in order to assess the influence of the perturbation. Thus, the normal curvature should be changed to be where l is a q × 1 vector of unit length, andF,Q and ∆ are q × q, (p + 2) × (p + 2) and (p + 2) × q matrices, respectively. In addition,Q and ∆ need to be evaluated at θ = θ and ω = ω 0 . The method examines the total local influence C t = C l t (θ), where l t is a q × 1 unit-length vector with one at the tth position and zeros elsewhere. We denote S = −∆ Q −1 ∆. Since C l (θ) is not invariant under a uniform change of scale, Ref. [34] proposed the conformal normal curvature B l (θ) = C l (θ)/(2tr(S)). An interesting property of the conformal normal curvature is that, for any unit-length direction l, we have 0 ≤ B l (θ) ≤ 1, which allows comparison of curvatures among different models.
In order to determine if the tth observation is influential, Ref. [34] proposed to classify the tth observation as possibly influential if M(0) t = B l t is greater than the benchmark 1/q + c * SM(0), where SM(0) is the sample standard error of M(0) k , for k = 1, . . . , q, and c * is a certain constant. Depending on the specific application, c * may be taken to be a suitably selected positive value.
The forms given in (8) are used to obtain our normal curvature results under the four perturbation schemes, namely the case-weights, data, variance parameter, and skewness parameter schemes. The matricesQ and ∆ need to be established for each scheme.
We employ the matrix differential calculus proposed by [31] to establish these algebraic results in the following sections. We present their derivations in their respective Appendix A.

Perturbation of Case-Weights
Assume that a minor perturbation is made on the SN AR(p) model, with y t = x t β + u t being replaced by ω t y t = ω t x t β + ω t u t , where ω t is the weight. Let ω = (ω p+1 , . . . , ω T ) denote the (T − p) × 1 perturbation vector and ω 0 = (1, . . . , 1) denote the (T − p) × 1 non-perturbation vector. Then, the complete-data log-likelihood function of the perturbed model is given by Thus, the perturbed Q function obtained from (9) is expressed as For the SN AR(p) model in the perturbation of case-weights, the (p + 2) × (T − p) perturbation matrix ∆ must be evaluated at θ = θ and ω = ω 0 after taking derivatives, obtaining where u t = y t − x t β,

Perturbation of Data
Assume that a perturbation replaces y t by ω t + y t . Let ω = (ω p+1 , . . . , ω T ) denote the (T − p) × 1 perturbation vector and ω 0 = (0, . . . , 0) the (T − p) × 1 non-perturbation vector. The perturbed AR(p) model can be written as y t Then, the complete-data log-likelihood function of the perturbed model is given by Thus, the perturbed Q function is expressed as For the SN AR(p) model in the perturbation of data, we have where u t = y t − x t β, and h t is as in the data perturbation.

Perturbation of Scale
Assume that the variance parameter σ 2 in the model is replaced by ω t −1 σ 2 , meaning that Let ω = (ω p+1 , . . . , ω T ) denote the (T − p) × 1 perturbation vector and ω 0 = (1, . . . , 1) denote the (T − p) × 1 non-perturbation vector. Then, the complete-data log-likelihood function of the perturbed model is given by Thus, the perturbed Q function is expressed as For the SN AR(p) model with the perturbation of variance parameter, we have where u t = y t − x t β, and h t is as in data perturbation.

Perturbation of Skewness
Considering the particular skewed feature of the distribution, we may investigate the effect on the model fit by making a minor change of the skewness parameter λ. In our perturbed model, we propose to replace λ by √ ω i λ. Let ω = (ω p+1 , . . . , ω T ) denote the (T − p) × 1 perturbation vector and ω 0 = (1, . . . , 1) denote the (T − p) × 1 non-perturbation vector. Then, the complete-data log-likelihood function of the perturbed model is given by Thus, the perturbed Q function is expressed as For the SN AR(p) model with the perturbation of the skewness parameter, we have where u t = y t − x t β and h t , c 2 t are as in the data perturbation.

Simulation Study I: Effectiveness of the Diagnostics
For our simulation, we consider an AR(1) model as specified in Section 2 given by where we choose β = 0.12, σ 2 = 0.003 and λ = 0.1. We generate T = 400 observations. Now, we compare the performance of the ML estimates in the presence of five perturbed or shifted observations among three different scenarios with λ = 0.1, 0.2, 0.3. We perturb the value y t by y t * = y t + is the estimate of β under the perturbed data and E st is the estimate of β under the non-perturbed data. The above computation in MATLAB (version 9.3, MathWorks, US) runs five iterations to converge in less than 60 s on a PC. Our experience indicates that the computation runs up to 10 iterations to converge in less than 120 s for even T = 1000. Figure 1 shows the effectiveness of the influence diagnostics.

Simulation Study II: Comparing SN and Normal Distributions
We perform a numerical simulation to examine the effectiveness of our method under SN distributions. The results under the SN and normal distributions are compared as follows: Step 1. We use the simulated data (λ = 0.1) with y t perturbed by y t * = y t + βy t−1 d, where d = 5 and t = 200, 201, 202, 203, 204. We fit an AR(1) model under normality to the data, and obtain the fitted AR(1) model given by Y t = 0.1549y t−1 + u t , where u t ∼ N(0, 0.0151).
Step 2. We conduct a local influence diagnostic analytics under the normal distribution using the diagnostic results given by [18].
Step 3. We compare the local influence results for the normal distribution in Step 2. In Figure 2, twenty-four influence observations are detected under the SN distribution. These results are summarized in Table 1.
As presented, twenty-four influential observations are detected by the local influence diagnostic analytics under the SN distribution, which is more than the twenty influential values detected by the local influence analytics under the normal distribution. These results are informative and as expected. We see that the influential values (that is, cases #62, #153, #201 and #301) are less than zero.
As we understand, when λ for the SN distribution is larger than zero, it is easier to find possible influential values less than zero due to the difference in patterns between the SN distribution and the normal distribution. This indicates that the diagnostic results under the SN distribution established in Section 3 work well.

Real-World Data Analysis
We conduct an empirical example of financial data based on our results presented in Sections 2 and 3. We choose Chevron shares (hereinafter referred to as CVX weekly financial return data), which were collected from 12 January 2007 to 1 August 2014 to construct the AR models. Figure 3 shows the log transformation of this data set (a total of 395 observations). We first determine the order of the AR model, using the following steps: Step 1. We assume the data are subject to an AR(p) model, for p = 1, 2, . . ., given by Step 2. For the ith equation given in (17), we let β (i) j be the OLS estimate of β j , where the superscript (i) denotes the estimates for the AR(i) model. Then, the residual is defined as u i y t−i . The estimated variance for the AR(i) model is expressed as Step 3. We use the ith and (i − 1)th equations given in (17) to test H 0 : β i = 0 versus H 1 : β i = 0, that is, we test the AR(i) model versus the AR(i − 1) model. The test statistic is defined as For our model, M(i) follows an asymptotically chi-square distribution with one degree of freedom, that is, M(i) ∼ χ 2 (1).
Since the absolute values of β 1 , β 2 and β 3 all are less than 1, we can accept that the CVX time series is stationary for the AR(3) model. Thus, we obtain a predictive model in the fitted SN AR(3) structure given by with µ = 0, σ 2 = 0.0013, and λ = 0.0661. By applying the method in Section 3, we can conduct influence analysis for the SN AR(3) model. After calculating the observed Hessian matrix and the ∆ matrices for the four perturbation schemes of case-weights, data, variance parameter, and skewness parameter, we obtain the diagnostics matrices S 1 , S 2 , S 3 , and S 4 , respectively. In this case, we select the benchmark as 1/392 + 3SM(0), with the values of 0.1144, 0.0178, 0.0347, and 0.0791 established in the simulation study for the four perturbation schemes, respectively.
In Figure 4, the straight line represents the benchmark value which determines whether an observation is potentially influential. The observation is justified to be potentially influential when its diagnostic value exceeds the benchmark. Firstly, we only find case #92 to be potentially influential. The other potential influential observations may be masked by case #92. Similar to a step-wise diagnostic procedure proposed by [6], a second step of identification of influential observations is conducted. In the second rstep, we replace the value of case #92 by the average of cases #91 and #93 to get a new time series.
We re-fit an AR model in the same manner as done previously in the first step. For the new time series, the parameters in the SN AR model are again estimated by applying the EM-algorithm with the order also selected to be three. Thus, we present the new SN AR(3) model as with µ = 0, σ 2 = 0.0011, λ = 0.1015. Since the absolute values of β 1 , β 2 , and β 3 are all less than one, we can accept the CVX time series to be stationary for the AR(3) model. For the new AR(3) model, we conduct influence analysis. We select the benchmark as 1/392 + 3SM(0) again, with the four values of 0.0377, 0.0100, 0.0162, and 0.0203 for the corresponding perturbation schemes, respectively. Thus, twenty-two influence observations are detected in Figure 5, summarized in Table 3. It tallies the observations which are identified to be potentially influential for CVX, and an "*" in Table 3 indicates that the observation has been identified via the assigned perturbation scheme. It is worth noting that the points shown in Table 3 correspond to a number of material historical events. Many of these points relate to events around the 2008 global financial crisis. For example, on September 7 2008, the US Treasury Department announced to take over Fannie Mae and Freddie Mac. On 3 October 2008, the Bush administration signed a total of up to 700 billion dollars in a financial rescue plan. This shows the effectiveness of our procedure in identifying potentially influential observations to improve modeling outcomes.
The first-order derivatives related toQ( θ) in (7) are given by The second-order derivatives related toQ( θ) in (7) are given by The first-order derivatives related to ∆ in (13) are given by The second-order derivatives related to ∆ in (13) are given by Noting that ω 0 = (0, . . . , 0) , we obtain (13).