Article

A Blockwise Empirical Likelihood Test for Gaussianity in Stationary Autoregressive Processes

by Chioneso S. Marange 1,*, Yongsong Qin 2, Raymond T. Chiruka 1 and Jesca M. Batidzirai 3
1 Department of Statistics, Faculty of Science and Agriculture, Alice Campus, Fort Hare University, Alice 5700, South Africa
2 College of Mathematics and Statistics, Guangxi Normal University, Guilin 541004, China
3 School of Mathematics, Statistics and Computer Science, Pietermaritzburg Campus, University of KwaZulu-Natal, Pietermaritzburg 3201, South Africa
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(4), 1041; https://doi.org/10.3390/math11041041
Submission received: 30 December 2022 / Revised: 9 February 2023 / Accepted: 13 February 2023 / Published: 18 February 2023
(This article belongs to the Special Issue Nonparametric Statistical Methods and Their Applications)

Abstract:
A new and simple blockwise empirical likelihood moment-based procedure to test whether a stationary autoregressive process is Gaussian is proposed. The proposed test utilizes the skewness and kurtosis moment constraints to develop the test statistic. The test nonparametrically accommodates the dependence in the time series data whilst exhibiting some useful properties of empirical likelihood, such as the Wilks theorem, with the test statistic having a chi-square limiting distribution. A Monte Carlo simulation study shows that our proposed test has good control of the type I error. The finite sample performance of the proposed test is evaluated and compared to some selected competitor tests for different sample sizes and a variety of alternative distributions by means of a Monte Carlo study. The results reveal that our proposed test is on average superior under the log-normal and chi-square alternatives for small to large sample sizes. Some real data studies further illustrate the applicability and robustness of our proposed test in practice.
MSC:
62G10; 60G10; 60G15; 62E10

1. Introduction

In time series analysis, testing whether a time series process follows a Gaussian distribution is customarily conducted as preliminary inference on the data before further analysis can be performed. This makes the development and use of normality tests for time series data a vital area in the field of applied and theoretical statistics. Various tests for assessing consistency with Gaussianity in time series processes have been developed and widely reported in the literature (see [1,2,3,4,5], among others). These tests make use of various forms of mathematical characterization of the underlying time series process in developing the test statistics. For example, Epps [1] proposed a test based on the analysis of the empirical characteristic function. Lobato and Velasco [2] as well as Bai and Ng [3] developed tests based on the skewness and kurtosis coefficients. Moulines and Choukri [6] based their test on both the empirical characteristic function as well as the skewness and kurtosis coefficients. Bontemps and Meddahi [7] used moment conditions implied by Stein's characterization of the Gaussian distribution. Psaradakis and Vavra [5] proposed a test based on an Anderson–Darling distance statistic, using a sieve bootstrap to approximate its sampling distribution. On the other hand, Rao and Gabr [8] proposed their test based on the bispectral density function.
As alluded to by Lobato and Velasco [2] as well as Bai and Ng [3], in time series analysis, testing for normality is customarily performed by utilizing the skewness and kurtosis tests. This is because of their lower computational cost, popularity, simplicity and flexibility. The moment-based tests by Lobato and Velasco [2], as well as Bai and Ng [3], used classical measures of skewness and kurtosis involving standardized third and fourth central moments. To handle the issues of data dependence, Lobato and Velasco [2] used the skewness–kurtosis test statistic studentized by standard error estimators that are consistent under serial dependence of the observations. On the other hand, to cater for dependence, Bai and Ng [3] used the limiting distributions for the third and fourth moments when the data are weakly dependent. One possible way to alleviate the dependence problem in moment-based tests is to employ techniques that can address the correlation that exists between the time series observations. Such techniques may include some bootstrapping and resampling procedures. The blockwise empirical likelihood (BEL) technique is one such procedure that is widely used to address issues of correlated data in time series processes (see [9]).
The use of the empirical likelihood (EL) methodology (see [10,11] for more details) to develop simple, powerful and efficient moment-based tests for normality has received enormous attention (see [12,13,14,15] for more insight). The application of the EL methodology to independent and identically distributed (i.i.d.) data has been studied in a variety of contexts, including inference on the skewness and kurtosis coefficients (see [16] for more details), but our interest in this study concerns the application of the EL methodology to weakly dependent time series processes. Due to the underlying dependence structure in time series processes, the usual formulation of the EL fails, and in order to apply the EL methodology to time series data, serial dependence among observations cannot be ignored. As a remedy, Kitamura [9] proposed the BEL methodology for weakly dependent processes. This technique has been shown to provide valid inference in a wide range of time series problems (for example, see [17,18,19,20,21,22,23,24,25,26,27], among others). Similar to the i.i.d. EL version, the BEL method creates an EL log-ratio statistic with a chi-square limit for inference. However, the BEL construction crucially involves blocks of consecutive observations in time, which serves to capture the underlying time-dependence structure. It is important to note that the choice of block size is vital, as it determines the coverage performance of the standard BEL methodology [9]. Thus, the performance of the BEL is largely dependent on the choice of the block size $b$, which is an integer satisfying $1 \le b \le n$.
In this article, we propose a goodness of fit (GoF) test statistic for Gaussianity in weakly dependent stationary autoregressive processes of order one (AR(1)). We focused on AR(1) processes because they are commonly encountered in the field of econometrics as well as applied and theoretical statistics. The GoF test is constructed based on the third and fourth moments, employing the standard BEL methodology. Thus, our proposed procedure applies the standard BEL methodology that non-parametrically accommodates the dependency in the time series process whilst exhibiting some useful properties of EL, such as Wilks’s theorem. The next section will present the development of the proposed test statistic. The article will further present the finite sample performance of our proposed test in comparison with other existing competitor tests. Some real data applications will be conducted and presented. Lastly, conclusions and recommendations will be drawn based on the findings of the Monte Carlo (MC) simulations as well as the real data studies.

2. The Blockwise Empirical Likelihood Ratio Test Statistic

Let $X_1, X_2, \ldots, X_n$ be a sample of $n$ consecutive equally spaced observations from a strictly stationary, real-valued, discrete-time stochastic process $\{X_t : t \in \mathbb{Z}\}$ taking values in $\mathbb{R}^d$. Since this is a general definition, for this study we considered $\{X_t\}$ to be an autoregressive process of order 1 (AR(1)) that is assumed to be stationary. Thus, we define the AR(1) process by
$$X_t = \phi X_{t-1} + \varepsilon_t,$$
where $\phi$ is a constant such that $|\phi| < 1$. According to Nelson [28], Wei [29], as well as Box et al. [30], among others, the requirement $|\phi| < 1$ is called the stationarity condition for the AR(1) process. The problem of interest is to test the composite null hypothesis that the one-dimensional marginal distribution of $X_t$ is Gaussian, that is
$$H_0: X_t \sim N(\mu, \sigma^2). \qquad (1)$$
Based on the observed sample $X_i$'s, we are interested in the alternative hypothesis that the distribution of $X_t$ is non-Gaussian. Without loss of generality, we propose to work with standardized versions of the sample observations. To achieve this, we adopted the common transformation of standardizing the observed data points by subtracting the mean $\hat{\mu}$ and dividing by the standard deviation $\hat{\sigma}$. Thus, from the hypothesized framework in (1), we estimate $\mu$ and $\sigma^2$ by the sample mean and sample variance, i.e., $\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\hat{\sigma}^2 = S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$. Let $Z_i = (X_i - \bar{X})/S$, $i = 1, 2, \ldots, n$. Then, the composite null hypothesis becomes
$$H_0: Z_t \sim N(0, 1).$$
The parameter $\phi$ is unknown. Bai and Ng [3] derived and proved the limiting distributions of the sample skewness and kurtosis for a stationary time series process under arbitrary skewness $\tau$ and kurtosis $\kappa$ before specializing the general results to $\tau = 0$ (or, equivalently, $\mu_3 = 0$) and $\kappa = 3$ (or, equivalently, $\mu_4 = 3$) under normality. Following the empirical likelihood methodology, the $r$th moment of the transformed data $Z_t$, $E(Z_t^r)$, has sample-moment analogues of the form $\sum_{i=1}^{n} p_i Z_i^r$, where $Z_t \sim N(0,1)$ under the null. The probabilities $p_i$ are the components of the empirical likelihood $\prod_{i=1}^{n} p_i$, which is maximized subject to the empirical constraints. Under the null hypothesis that the standardized observations are from a Gaussian distribution, the unbiased empirical moment equations are
$$\sum_{i=1}^{n} p_i Z_i^3 - E(Z^3) = \sum_{i=1}^{n} p_i Z_i^3 = 0, \qquad \sum_{i=1}^{n} p_i Z_i^4 - E(Z^4) = \sum_{i=1}^{n} p_i Z_i^4 - 3 = 0, \qquad (2)$$
where the probability parameters $p_i$ fulfill the two fundamental properties of probability, namely $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$. We now consider $Z_1, Z_2, \ldots, Z_n$ for the problem of inference about the process mean $E(Z_t^r) = \mu_r \in \mathbb{R}^d$. Considering the unbiased empirical moment equations in (2), the hypotheses for the ELR test can be written as
$$H_0: E(Z^r) = \mu_r \quad \text{vs.} \quad H_a: E(Z^r) \neq \mu_r, \qquad (3)$$
where $r$ takes the values 3 and 4. Since there is dependence in the time series data, one cannot use the traditional EL of Owen [11], which was developed for independent, identically distributed data; the i.i.d. formulation of EL fails for dependent data because it ignores the underlying dependence structure. Therefore, following the work of Kitamura [9], we adopted the standard BEL (also discussed by Nordman et al. [26] and Kim et al. [31], among others) to construct the test statistic. The technique involves choosing an integer block length $1 \le b \le n$ and forming a collection of length-$b$ data blocks, which may be maximally overlapping (OL), given by
$$\{(Z_i, \ldots, Z_{i+b-1})\} \quad \text{for } i = 1, 2, \ldots, M \text{ with } M = n - b + 1, \qquad (4)$$
or non-overlapping (NOL) as given by
$$\{(Z_{b(i-1)+1}, \ldots, Z_{ib})\} \quad \text{for } i = 1, 2, \ldots, M \text{ with } M = \lfloor n/b \rfloor. \qquad (5)$$
In both cases, all blocks have constant length $b$ for a given sample size $n$. For inference on the mean parameter $\mu_r$, each block $i = 1, 2, \ldots, M$ in the OL collection contributes a centered block sum given by
$$B_{i,\mu_r} \equiv \sum_{j=i}^{i+b-1} (Z_j^r - \mu_r),$$
or with NOL blocks
$$B_{i,\mu_r} \equiv \sum_{j=b(i-1)+1}^{bi} (Z_j^r - \mu_r).$$
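As an illustration, a minimal base-R sketch of these centered block sums under the OL and NOL blocking schemes is given below; the helper name block_sums() is purely illustrative and not part of the formal development.

# Centered block sums for the r-th moment constraint.
# z: standardized series; r: moment order (3 or 4); mu_r: hypothesized r-th moment;
# b: block length; overlapping = TRUE gives the OL scheme (4), FALSE the NOL scheme (5).
block_sums <- function(z, r, mu_r, b, overlapping = TRUE) {
  n <- length(z)
  u <- z^r - mu_r                                  # centered summands Z_j^r - mu_r
  if (overlapping) {
    M <- n - b + 1                                 # number of OL blocks
    sapply(seq_len(M), function(i) sum(u[i:(i + b - 1)]))
  } else {
    M <- floor(n / b)                              # number of NOL blocks
    sapply(seq_len(M), function(i) sum(u[(b * (i - 1) + 1):(b * i)]))
  }
}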
The blocking schemes (4) and (5) aim to preserve the underlying dependence between neighboring time observations. We then consider the profile blockwise empirical likelihood function for $\mu_r$ given by
$$L_{BEL,n}(\mu_r) = \sup\left\{ \prod_{i=1}^{M} p_i : p_i \ge 0, \ \sum_{i=1}^{M} p_i = 1, \ \sum_{i=1}^{M} p_i B_{i,\mu_r} = 0_d \right\}, \qquad (6)$$
where $0_d = (0, 0, \ldots, 0)' \in \mathbb{R}^d$. Under the zero-expectation constraint, $L_{BEL,n}(\mu_r)$ assesses the plausibility of $\mu_r$ through probabilities $p_1, \ldots, p_M$, which are assigned to the centered block sums $B_{i,\mu_r}$ to maximize the multinomial likelihood $\prod_{i=1}^{M} p_i$. In the absence of the mean constraint in (6), the multinomial likelihood is maximized when each $p_i = 1/M$, which leads to the corresponding BEL ratio
$$R_{BEL,n}(\mu_r) = L_{BEL,n}(\mu_r) / M^{-M}.$$
The computation of $L_{BEL,n}(\mu_r)$ for the BEL version is similar to that described by Owen [10,11] for i.i.d. data. Thus, when $0_d$ lies within the interior of the convex hull of $\{B_{i,\mu_r} : i = 1, 2, \ldots, M\}$, standard Lagrange multiplier arguments imply that the maximum of $L_{BEL,n}(\mu_r)$ is attained at the probabilities
$$p_{i,\mu_r} = \frac{1}{M\left(1 + \lambda_{BEL,n,\mu_r}' B_{i,\mu_r}\right)} \in (0, 1), \quad i = 1, 2, \ldots, M,$$
with the Lagrange multiplier $\lambda_{BEL,n,\mu_r} \in \mathbb{R}^d$ satisfying
$$\sum_{i=1}^{M} \frac{B_{i,\mu_r}}{M\left(1 + \lambda_{BEL,n,\mu_r}' B_{i,\mu_r}\right)} = 0_d.$$
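For a scalar constraint ($d = 1$), the Lagrange multiplier equation above can be solved by one-dimensional root finding. The following minimal sketch (the helper name bel_log_ratio() is illustrative) returns $\log R_{BEL,n}(\mu_r)$ for a given vector of centered block sums.

# log BEL ratio for a scalar constraint: solve sum(B / (M * (1 + lambda * B))) = 0
# for lambda, then log R_BEL = sum(log(M * p_i)) = -sum(log(1 + lambda * B)).
bel_log_ratio <- function(B) {
  M <- length(B)
  if (min(B) >= 0 || max(B) <= 0) return(-Inf)     # 0 outside the convex hull: ratio degenerates
  # the constraint 1 + lambda * B_i > 0 for all i restricts lambda to an open interval
  lo <- -1 / max(B) + 1e-10
  hi <- -1 / min(B) - 1e-10
  g  <- function(lam) sum(B / (1 + lam * B))       # proportional to the estimating equation
  lambda <- uniroot(g, lower = lo, upper = hi, tol = 1e-10)$root
  -sum(log(1 + lambda * B))                        # log R_BEL,n(mu_r)
}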
For more computational details of this result, see Kitamura [9]. Further, Kitamura [9] noted that, under certain mixing and moment conditions, and under the traditional small-$b$ asymptotics $b^{-1} + b^2/n \to 0$ as $n \to \infty$, the log-EL ratio of the standard BEL has a chi-square limiting distribution. Thus, under regularity conditions that can be found in [9], one can show that
$$-\frac{2n}{bM} \log R_{BEL,n}(\mu_{r,0}) \xrightarrow{d} \chi_d^2, \qquad (7)$$
at the true mean parameter $\mu_{r,0} \in \mathbb{R}^d$ (for a detailed proof, see [9]). Nordman et al. [26] as well as Kim et al. [31] further provided a detailed rationale supporting (7), that is, that the log-EL ratio for weakly dependent data has a chi-square limiting distribution. In this log-EL ratio, $(bM)^{-1}$ is an explicit block adjustment factor in (7) that ensures the distributional limit for the log-EL ratio of the BEL. A block length of $b = 1$ recovers the EL distributional result of Owen [10,11]. Our choice of the ideal block length for the proposed test statistic is discussed in the next section. Now, consider the $-2$ log-likelihood ratio test statistic for the null hypothesis, which is given by
$$(-2LLR)_r = -\frac{2n}{bM} \log R_{BEL,n}(\mu_{r,0}). \qquad (8)$$
In order to decide whether or not to reject $H_0$, we compare this likelihood ratio statistic to size-adjusted critical values. Thus, for our proposed blockwise empirical likelihood ratio test, we reject the null hypothesis if
$$\mathrm{BELT} := \max_{r \in G} (-2LLR)_r > C_\alpha, \qquad (9)$$
where $C_\alpha$ is the critical value and $r \in G$, with $G = \{3, 4\}$. The elements of $G$ are the integers indexing the third and fourth moment constraints over which the test statistic is maximized. Our proposed test statistic (9) is a cumulative sum (CUSUM)-type test statistic, which is well accepted in the change point literature (for example, see [32,33,34,35]). Alternatively, one can consider a test statistic based on the Shiryayev–Roberts (SR) approach (for example, see [36]). However, for empirical likelihood moment-based GoF tests, it has been demonstrated that the CUSUM-type test statistic is superior in power to the SR-type test statistic [12,13,15].
The classical EL method can be considered a special case of the BEL method (without data blocking, i.e., $b = 1$). That is, when $b = 1$, both $M = n - b + 1$ for OL blocks and $M = \lfloor n/b \rfloor$ for NOL blocks reduce to $n$. Since the standard BEL method then mimics the i.i.d. case, one can show that under $b = 1$ the statistic (8) for the hypotheses (3) has a chi-square limiting distribution. Applying the EL methodology and the Wilks theorem [37], Shan et al. [12] showed that, in the usual classical EL case (with standardized data), a similar test statistic, $(-2LLR)_k$ for $k \in G$ with $G = \{3, 4, 5, 7\}$, has a chi-square limiting distribution (for further details, see Lemma 2.1 and Proposition 2.1 and their respective proofs in Shan et al. [12]). Referring to the proofs by Owen [10,11], Nordman et al. [20] noted that a key feature of the EL with i.i.d. data is that it allows a nonparametric casting of the Wilks theorem, meaning that, when evaluated at the true mean, the log-likelihood ratio has a chi-square limiting distribution. This was first extended to the BEL method for weakly dependent processes by Kitamura [9], who showed that a similar result to that for the classical EL method applies to the BEL method. However, according to [9], the BEL method requires choosing a suitable block size $b$ for optimal coverage accuracy. The next section presents an MC simulation-based approach to determine the ideal block size for the proposed test statistic.
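To fix ideas, the following minimal sketch assembles the pieces above into the BELT statistic in (9) with OL blocks, reusing the block_sums() and bel_log_ratio() helpers sketched earlier. The block size $b = \lceil C n^{1/3} \rceil$ anticipates the choice discussed in the next section, and the function name belt_stat() is illustrative rather than definitive.

# BELT statistic with OL blocks: maximum of the adjusted (-2LLR)_r over r in {3, 4}.
belt_stat <- function(x, C = 1) {
  n <- length(x)
  z <- (x - mean(x)) / sd(x)                       # standardized observations Z_i
  b <- max(1, ceiling(C * n^(1/3)))                # block length b = C n^(1/3)
  stats <- sapply(c(3, 4), function(r) {
    mu_r <- if (r == 3) 0 else 3                   # Gaussian moments: E(Z^3) = 0, E(Z^4) = 3
    B <- block_sums(z, r, mu_r, b, overlapping = TRUE)
    M <- length(B)
    -(2 * n / (b * M)) * bel_log_ratio(B)          # (-2LLR)_r as in (8)
  })
  max(stats)                                       # BELT = max over r in G = {3, 4}
}

# Example: value of the statistic for Gaussian white noise (to be compared with a
# size-adjusted critical value):
# set.seed(1); belt_stat(rnorm(500))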

3. Monte Carlo Simulation Procedures

3.1. Block Size Selection

The standard implementation of the BEL method typically involves data blocks of constant length for an observed time series and therefore requires a corresponding block length selection. In addition, the performance of the BEL method often depends critically on the choice of the block length. In the literature, little is known about the best block size selection for optimal coverage accuracy with the standard BEL method. In practice, researchers usually borrow from the block bootstrap literature, where optimal block sizes vary in powers of the sample size, such as $n^{1/3}$ or $n^{1/5}$ (see [38,39], among others). Additionally, data-driven block length choices also borrow from bandwidth selection for kernel spectral density estimators, such as the Bartlett kernel (see [9,40]), where such block selections are also based on a fixed block size order (i.e., $n^{1/3}$). Kitamura [9] recommended that empirical block selections for the BEL method should involve estimated adjustments to the block order $n^{1/3}$, although this may not in fact be optimal for the coverage accuracy of the standard BEL method. In implementation, the block order is usually adjusted by a constant factor, which is often set to $C = 1$ or 2 [31,39]. Several studies that looked at weakly dependent time series data (both autoregressive and moving average models) have adopted the use of the block order $C n^{1/3}$ (see [31,40], among others).
Using R, we conducted an extensive MC experiment to establish the ideal block sizes to use for an AR(1) process, which is our time series process of interest. To achieve this objective, we assessed the coverage accuracy of the BEL method for inference about the mean parameter ($\mu_0 = 0$, the mean of a stationary, weakly dependent time series) at the 0.05 nominal level for different sample sizes ($n$ = 100, 250, 500 and 1000) with varying choices of block sizes. As investigated by Nordman et al. [26], the coverage accuracy of the BEL method depends on the block length $b$. We borrowed from the work of Kim et al. [31] on the choice of block order adjustment $C n^{1/3}$ with constant factor $C$, which was set to $C$ = 0.5, 1, 2 and 3. In essence, we employed four different block sizes of the form $b = C n^{1/3}$, where $C$ = 0.5, 1, 2, 3. Thus, we considered varying the block sizes in powers of the sample size by utilizing $n^{1/3}$. Under the null hypothesis, we generate data from the following AR(1) process:
$$X_t = \phi X_{t-1} + \epsilon_t, \qquad (10)$$
where $\epsilon_t$ is i.i.d. $N(0,1)$, and the autoregressive parameter $\phi$ takes nineteen values from $-0.9$ to $0.9$ in steps of 0.1. We report the results for detailed grids of both negative and positive values of $\phi$ because our test statistic is proposed for applications with $-1 < \phi < 1$. Coverage probabilities for the BEL method with both OL and NOL blocks were assessed. However, it is important to note that NOL blocks are known to perform either similarly to or slightly worse than the OL block versions [31]. We only considered a normal distribution for the error term since the EL method has been reported to have small coverage errors in time series data for both normal and non-normal errors [41,42]. In order to assess the overall performance of the BEL method under the several simulated scenarios, the mean as well as the mean average deviation of the coverage probabilities were used.
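A minimal sketch of the data generation for one replicate of this experiment is given below. The generator sim_ar1() is an illustrative helper that discards a burn-in so that the retained series is effectively stationary (the same device as used for the alternatives in Section 3.2), and the four candidate block sizes are formed as $b = C n^{1/3}$ with one possible rounding.

# Simulate the AR(1) process in (10); rinnov draws the innovations (standard normal here).
sim_ar1 <- function(n, phi, rinnov = rnorm, burn = 1000) {
  e <- rinnov(n + burn)
  x <- numeric(n + burn)
  for (t in 2:(n + burn)) x[t] <- phi * x[t - 1] + e[t]
  x[(burn + 1):(burn + n)]                         # discard the burn-in segment
}

# The four block sizes considered in the coverage study: b = C * n^(1/3), C in {0.5, 1, 2, 3}
n <- 500
b_grid <- ceiling(c(0.5, 1, 2, 3) * n^(1/3))
b_grid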
From the results (see Table 1 and Table 2), we can see that under small sample sizes (i.e., $n$ = 100 and 250) the standard BEL method performs well with OL blocks and block sizes of $n^{1/3}$ and $2n^{1/3}$. Under these sample sizes, the BEL method with NOL blocks performed slightly worse than the BEL method with OL blocks. In addition, from Table 2, the coverage probabilities based on the BEL method with block sizes of $0.5n^{1/3}$ remain the least accurate. The performance of the BEL method with block sizes $n^{1/3}$, $2n^{1/3}$ and $3n^{1/3}$ is comparable for moderate to large sample sizes (i.e., $n$ = 500 and 1000). It is important to note that as $\phi$ approaches $-0.9$, the BEL method is generally conservative, and as $\phi$ approaches $0.9$, the BEL method becomes largely permissive or anti-conservative. This poor coverage performance was also reported in MC simulations conducted by Nordman et al. [26]. This is a major weakness of the standard BEL method, as it is sensitive to the strength of the underlying time-dependence structure (see [20,26,31]). However, due to the simplicity, flexibility, and wide range of applications of the standard BEL method in various applied and theoretical statistics problems, we adopted it for our proposed test statistic. From the simulation results, we decided to adopt block sizes $n^{1/3}$ and $2n^{1/3}$ for further investigation because of their better coverage accuracy for both the NOL and OL blocks.
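For clarity, the two summary measures reported in the last rows of Tables 1 and 2 are the mean coverage and the mean absolute deviation from the nominal 0.95 level. A one-line sketch, assuming cov_probs holds the empirical coverage probabilities across the $\phi$ grid for one block size:

# Mean coverage and mean average deviation from the 0.95 nominal level
c(Mean = mean(cov_probs), MAD = mean(abs(cov_probs - 0.95)))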

3.2. Finite Sample Performance

This section compares the finite sample behavior of the proposed testing procedure in different situations. The R statistical package was used for all MC simulations. In order to conduct the MC power study, we first had to compute the $\phi$- and size-adjusted critical values of the proposed test. The rationale behind having $\phi$-dependent critical values is that, in addition to the sample size, the type I error control for a weakly dependent time series process depends heavily on $\phi$ (for example, see [43]). We simulated an AR(1) model with normal errors. Without loss of generality, we generated 20,000 samples of sizes 100, 500 and 1000 with $\phi$ taking the values $0, \pm 0.1, \pm 0.2, \pm 0.3, \pm 0.4, \pm 0.5, \pm 0.6, \pm 0.7, \pm 0.8$ and $\pm 0.9$. These simulations were conducted for the BEL with both NOL and OL blocks for block sizes $n^{1/3}$ and $2n^{1/3}$. The upper $\alpha$ quantile of the empirical distribution of the test statistic was taken as the critical value for each simulated scenario. These critical values are correct only if the AR(1) model is the data-generating process and the errors are indeed normal (for example, see [44]).
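A minimal sketch of how such $\phi$- and size-adjusted critical values could be simulated is given below, reusing the sim_ar1() and belt_stat() helpers sketched earlier; the function name crit_values() is illustrative.

# Upper-alpha empirical quantiles of the BELT statistic under Gaussian AR(1) data,
# one critical value per value of phi in phi_grid.
crit_values <- function(n, phi_grid, nrep = 20000, alpha = 0.05) {
  sapply(phi_grid, function(phi) {
    stat <- replicate(nrep, belt_stat(sim_ar1(n, phi)))
    quantile(stat, 1 - alpha)                      # upper-alpha quantile = critical value
  })
}

# Example (small nrep for illustration only):
# crit_values(100, phi_grid = c(0, 0.5, 0.9), nrep = 200)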
Under the null hypothesis, we generate data from the AR(1) process defined in (10), where the autoregressive parameter $\phi \in \{0, \pm 0.5, 0.6, 0.7, 0.8, \pm 0.9\}$ (similar to [2,4]). We report the findings for a detailed grid of positive values of $\phi$ because positive autocorrelation is particularly relevant for many empirical applications. The error terms $\epsilon_t$ are i.i.d. random variables, which may follow any of the following seven alternative distributions:
  • Standard normal ($N(0,1)$),
  • Standard log-normal (Log N),
  • Student's $t$ with 10 degrees of freedom ($t_{10}$),
  • Chi-squared with 1 ($\chi_1^2$) and 10 degrees of freedom ($\chi_{10}^2$),
  • Beta with parameters (2, 1) ($\beta(2,1)$),
  • Uniform on [0, 1] ($U(0,1)$).
To simulate the process defined in (10), we generated independent realizations from these distributions. When $\phi \neq 0$, a simulated realization of (10) does not start in its stationary regime. To address this, we adopted the approach used by Nieto-Reyes et al. [4] of discarding some past observations. We set the number of discarded past observations to 1000 and took $n$ = number of generated observations minus the discarded past. These alternative distributions have been used before for similar purposes (see [2,4]). Before the main power study, we conducted an MC experiment to further establish the block size (i.e., $n^{1/3}$ or $2n^{1/3}$) and the BEL block structure (i.e., NOL or OL) that result in the optimal power for our proposed test. In Table 3, we report the empirical rejection probabilities for the proposed tests with NOL and OL blocks for block sizes $n^{1/3}$ and $2n^{1/3}$. We considered three sample sizes, $n$ = 100, 500 and 1000, with $\phi \in \{0, \pm 0.5, 0.6, 0.7, 0.8, \pm 0.9\}$. Four alternative distributions (i.e., Log N, $t_{10}$, $\chi_{10}^2$ and $U(0,1)$) for the error term were used. In these experiments, 5000 replications were carried out at a nominal level of $\alpha = 0.05$. The main conclusion derived from these experiments is that the BEL statistic with both NOL and OL blocks and a block size of $n^{1/3}$ was generally superior in almost all the simulated cases. The proposed procedure gave comparable power for NOL and OL blocks. Given the latter, we decided to adopt OL blocks with block size $n^{1/3}$. This choice of block size was also recommended and used by Kitamura [9]; moreover, as stated earlier, NOL blocks are known to perform either similarly to or slightly worse than the OL block versions [31].
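As an illustration, one entry of such a power table could be approximated with the following sketch, which reuses the sim_ar1() and belt_stat() helpers and assumes that cv is the appropriate $\phi$- and size-adjusted critical value for the given (n, phi) pair.

# Empirical rejection rate of the BELT under log-normal innovations.
power_lognormal <- function(n, phi, cv, nrep = 5000) {
  rej <- replicate(nrep, {
    x <- sim_ar1(n, phi, rinnov = function(m) rlnorm(m))   # Log N error term
    belt_stat(x) > cv
  })
  mean(rej)                                        # proportion of rejections
}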
For the main power comparison study, we considered some well-known existing procedures to test if a stationary process is Gaussian. We adopted three competitor tests, namely the Epps and Pulley (EPPS) test (see [1] for more details), the Lobato and Velasco (LV) test (see [2] for further insight) and the Psaradakis and Vavra (PV) test (see [5] for further insight). These tests have been considered in other studies where power comparisons have been conducted, including the studies by Lobato and Velasco [2] as well as Nieto-Reyes et al. [4]. Our choice of competitor tests was also limited to tests that are available in R and that reasonably control the type I error. The results of the MC power study are presented in Table 4, Table 5 and Table 6. We report the empirical rejection probabilities for the proposed blockwise empirical likelihood ratio test with OL blocks of size $n^{1/3}$ (referred to as BELT henceforth). As in the previous MC simulation experiment, we considered three sample sizes, $n$ = 100, 500 and 1000, with $\phi \in \{0, \pm 0.5, 0.6, 0.7, 0.8, \pm 0.9\}$. Seven alternative distributions for the error term were used (i.e., $N(0,1)$, Log N, $t_{10}$, $\chi_1^2$, $\chi_{10}^2$, $\beta(2,1)$ and $U(0,1)$, as defined earlier). Each simulation scenario was repeated 5000 times at a nominal level of $\alpha = 0.05$.
The findings in Table 4, Table 5 and Table 6 show that our proposed test, the BELT, has good control of the type I error compared to all other tests considered. For small samples (i.e., $n = 100$), the BELT test was superior under the $\chi_1^2$ alternative distribution (see Table 4). The BELT and PV tests were on average the most powerful under the Log N alternative distribution. The BELT test was superior under the $\chi_{10}^2$, whilst the EPPS test was superior under the $\beta(2,1)$ and $U(0,1)$ alternative distributions. The LV test was overall the most powerful test under the $t_{10}$ distribution. Our proposed test was on average the second most powerful test under the $\beta(2,1)$ and $U(0,1)$ distributions.
For medium sample sizes of $n = 500$ (see Table 5), our proposed test overall outperformed all tests under the Log N, $\chi_1^2$, $\chi_{10}^2$ and $\beta(2,1)$ alternatives. However, under the Log N distribution, our proposed test is comparable to the LV test. On the other hand, the LV test was generally superior to all other tests under the $t_{10}$ alternative distribution. Our proposed test was generally the second most powerful test under the $U(0,1)$ alternative distribution. The EPPS and PV tests were on average the least powerful tests.
For large samples ($n = 1000$ in Table 6), the BELT and LV tests were the most powerful under the Log N alternative distribution. When the alternative was $t_{10}$, the LV test was on average the most powerful test. Our proposed test, the BELT, was the most powerful under the $\chi_1^2$, $\chi_{10}^2$ and $\beta(2,1)$ alternative distributions. The EPPS test was on average the most powerful under the $U(0,1)$ distribution, with the BELT test being the second most powerful.
In order to obtain a clearer visualization of the performance of the different tests, a ranking procedure was used (for example, see [15]). Table 7 shows the ranking of all the tests considered in this study according to the average powers computed from the values in Table 4, Table 5 and Table 6. The ranking is based on the respective alternative distributions and sample sizes. Using average powers, we can select the tests that are, on average, most powerful against the respective alternative distributions. One of the major findings derived from the ranking is that, on average, our proposed test was superior under the Log N, $\chi_1^2$ and $\chi_{10}^2$ alternatives for small to large samples.
Lastly, we determined the computational cost of the new algorithm by comparing the computational time of the proposed test with that of the competitor tests. For assessing and comparing the computational times, we used an R benchmarking routine. These experiments were conducted on a notebook running 64-bit Windows 10 Home edition, with a 4th generation Intel Core i5-4210U processor (1.7 GHz) and 4 GB of PC3 DDR3L SDRAM. The sample size was set to 100 with 1000 replications for each test, where $\phi = 0.5$ under a chi-square alternative distribution with 1 degree of freedom. The results (see Table 8) show a clear advantage of our proposed approach over the PV test.
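A minimal sketch of such a timing experiment, using base R's system.time() rather than any particular benchmarking package, is shown below; it again reuses the sim_ar1() and belt_stat() helpers sketched earlier.

# Total elapsed time for 1000 evaluations of the BELT statistic with n = 100,
# phi = 0.5 and chi-square(1) innovations, as in the text.
timing <- system.time(
  replicate(1000, belt_stat(sim_ar1(100, 0.5, rinnov = function(m) rchisq(m, df = 1))))
)
timing["elapsed"]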

4. Real Data Applications

4.1. The Canadian Lynx Data

Firstly, we used the Canadian lynx dataset, which has been extensively used in various statistical applications and has previously been found to be non-Gaussian [1,4,8]. The dataset has been shown to be modeled well by an autoregressive time series process (see [45,46,47], among others). The Canadian lynx dataset consists of 114 observations of the annual record of the number of lynxes trapped in the Mackenzie River district of North-West Canada for the period from 1821 to 1934 (see [45] for more details). The Canadian lynx data are
269, 321, 585, 871, 1475, 2821, 3928, 5943, 4950, 2577, 523, 98, 184, 279, 409, 2285, 2685, 3409, 1824, 409, 151, 45, 68, 213, 546, 1033, 2129, 2536, 957, 361, 377, 225, 360, 731, 1638, 2725, 2871, 2119, 684, 299, 236, 245, 552, 1623, 3311, 6721, 4254, 687, 255, 473, 358, 784, 1594, 1676, 2251, 1426, 756, 299, 201, 229, 469, 736, 2042, 2811, 4431, 2511, 389, 73, 39, 49, 59, 188, 377, 1292, 4031, 3495, 587, 105, 153, 387, 758, 1307, 3465, 6991, 6313, 3794, 1836, 345, 382, 808, 1388, 2713, 3800, 3091, 2985, 3790, 674, 81, 80, 108, 229, 399, 1132, 2432, 3574, 2935, 1537, 529, 485, 662, 1000, 1590, 2657, 3396.
The goal of this section is to carry out a bootstrap study to assess the robustness and applicability of our proposed test in practice. The approach was to use a sample of size 100, randomly selected from the Canadian lynx data, and to test for normality at the 0.05 level of significance. Before the bootstrap study, we conducted the augmented Dickey–Fuller test for the stationarity assumption on the complete dataset. The augmented Dickey–Fuller test (Dickey–Fuller = −6.31, p-value = 0.01) revealed that the Canadian lynx data are stationary. We then used a graphical approach to assess normality, and the findings (see Figure 1) support the findings reported by Rao and Gabr [8], Epps [1] as well as Nieto-Reyes et al. [4] that the data are indeed non-normal. Having confirmed that the data are stationary and non-Gaussian, we then employed a bootstrap technique using the proposed test, where we randomly removed 14 observations from the Canadian lynx data and then derived the p-value from the remaining observations. We repeated this technique 5000 times, and also applied it to the EPPS and LV tests. We considered these tests because they performed quite well in our MC power study. The findings showed that the proposed BELT test had a p-value of $3.846 \times 10^{-4}$. The p-values obtained for the other tests, that is, $6.028 \times 10^{-5}$ for the EPPS test and $1.558 \times 10^{-3}$ for the LV test, all lead one to conclude that the Canadian lynx data are indeed non-Gaussian. The p-values obtained from the traditional tests as well as our proposed test proved to be consistent in illustrating the non-normality of the Canadian lynx data. Thus, our proposed test statistic has demonstrated robustness and applicability when applied to non-Gaussian real-life data.
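For reference, the resampling scheme can be sketched with the lynx series that ships with R (datasets::lynx holds exactly these 114 annual observations for 1821–1934). The sketch below only recomputes the BELT statistic on each subsample, reusing belt_stat() from Section 2; converting these statistics into p-values would additionally require the simulated null distribution of the statistic for n = 100.

# Bootstrap-style study: repeatedly drop 14 of the 114 observations and recompute BELT.
data(lynx)
x_full <- as.numeric(lynx)                         # 114 annual lynx trappings
set.seed(123)
boot_stats <- replicate(5000, {
  keep <- sort(sample(length(x_full), 100))        # randomly remove 14 observations, keep time order
  belt_stat(x_full[keep])
})
summary(boot_stats)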

4.2. The Souvenir Data

This real data study intends to demonstrate the practical applicability of our proposed test under a normally distributed time series process using 84 monthly sales records for a souvenir shop at a beach resort town in Queensland, Australia (see [48] for more details). These sales were recorded from January 1987 to December 1993 and have been used in various time series applications (for example, see [49,50], among others). However, the time series is neither consistent with the normal distribution nor stationary [49]. To verify these claims, we tested the Gaussianity and stationarity of the series using the k random projections test and the augmented Dickey–Fuller test, respectively. We supplemented these tests with diagnostic plots for assessing stationarity and normality in time series processes (see Figure 2).
The results revealed that the monthly sales for the souvenir shop do not follow a Gaussian process (k = 16, p-value < $2.2 \times 10^{-16}$) and are not stationary (Dickey–Fuller = −2.0809, p-value = 0.5427), in agreement with the graphical plots presented in Figure 2. Since the goal is to examine the performance of our proposed test under a normally distributed time series process, we used Holt–Winters exponential smoothing to obtain the forecast errors for the monthly sales of the souvenir shop, which are known to be stationary and consistent with normality [49]. Holt–Winters exponential smoothing was appropriate because the time series of the log of the monthly sales for the souvenir shop can be described using an additive model with a trend and seasonality. Thus, to obtain the forecasts, we fitted a predictive model for the log of the monthly sales. We then obtained the forecast errors ($n = 72$) and used the same testing procedures reported earlier to assess whether these errors are indeed normally distributed and stationary.
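A minimal sketch of this forecast-error extraction is given below. It assumes the 84 monthly sales are available as a numeric vector souvenir_sales (for example, the fancy series distributed with the fma package appears to correspond to this dataset); stationarity of the resulting errors could then be checked with tseries::adf.test() and Gaussianity with the proposed BELT.

# Holt-Winters one-step-ahead forecast errors for the log of the monthly sales.
log_sales <- ts(log(souvenir_sales), start = c(1987, 1), frequency = 12)
hw_fit    <- HoltWinters(log_sales)                # additive trend and seasonality
xhat      <- hw_fit$fitted[, "xhat"]               # one-step-ahead fitted values (starts after year 1)
fc_errors <- window(log_sales, start = start(xhat)) - xhat   # forecast errors, length 72
length(fc_errors)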
From the plots (see Figure 3), it is clear that the forecast errors are normally distributed and stationary. The k random projections test (k = 16, p-value = 0.7923) and the augmented Dickey–Fuller test (Dickey–Fuller = −4.5942, p-value = 0.01) also revealed that the forecast errors follow a Gaussian process and are stationary. To demonstrate the robustness and applicability of our proposed test, we conducted a bootstrap study (using 5000 replications) in which two observations were randomly deleted at a time before testing whether the forecast errors follow a Gaussian process. For the sake of comparison, this procedure was repeated for each of the selected competitor tests. The respective p-values were noted under the null hypothesis that the forecast errors follow a Gaussian process. At the 5% level of significance, our proposed test reported a p-value of 0.6418158, whilst the EPPS, LV and PV tests reported p-values of 0.3776963, 0.7008432 and 0.611926, respectively. Thus, our proposed test as well as the selected competitor tests suggest that the forecast errors of the monthly sales for the souvenir shop follow a Gaussian process. This is consistent with the graphical plots presented in Figure 3 as well as past applications [49]. This real data study has further demonstrated the robustness and applicability of our proposed test in practice.

5. Conclusions

A simple BEL-based procedure to test whether a stationary autoregressive process is Gaussian has been proposed. Coefficients of skewness and kurtosis provide convenient measures for characterizing the shape of the normal distribution in time series processes [2,3]. Our proposed test utilizes these moment constraints (i.e., the skewness and kurtosis coefficients) to develop the test statistic. The test applies the standard BEL methodology (see [9] for more details), which nonparametrically handles the dependence in the time series data. The test statistic has a chi-square limiting distribution and has good control of the type I error compared to the existing traditional competitor tests studied. Monte Carlo simulations have shown that our proposed test is overall powerful under the Log N, $\chi_1^2$ and $\chi_{10}^2$ alternatives for small to large sample sizes. Further, the real data studies have demonstrated the applicability of the proposed testing procedure in practice. This study has once again demonstrated the efficiency and power of the nonparametric empirical likelihood methodology in developing moment-based GoF tests, which until now had only been well established for i.i.d. data [12,13,14,15]. We utilized a CUSUM-type statistic to construct our test statistic, and we advocate for future studies to consider the common alternative to the CUSUM-type statistic, which is to utilize the Shiryayev–Roberts statistic [36].
Through MC simulation experiments, we have also found that the coverage performance of the standard BEL method depends on the strength of the underlying dependence structure of the time series process. However, the coverage performance improves with increasing sample size, and a similar finding was also reported by Nordman et al. [20]. The selection of an optimal block size for the standard BEL method is problematic and is a major drawback of this technique. As a remedy, a few recent studies have proposed various methods to address this drawback (see [20,26,31]). Nordman et al. [20] proposed a modified BEL method for handling both short- and long-range dependence in time series processes. On the other hand, in order to handle dependence in weakly dependent time series processes, Nordman et al. [26] as well as Kim et al. [31] proposed the expansive BEL (EBEL) method and the progressive BEL (PBEL) method, respectively. Unlike the standard BEL method, which depends critically on the choice of block length, the EBEL uses a simple and nonstandard data-blocking technique that considers every possible block length. The PBEL requires no block length selection; rather, it uses a data-blocking technique in which block lengths increase by an arithmetic progression. All these proposed methods exhibit better coverage accuracy than the standard BEL method, and we suggest that future research adopt these data-blocking schemes for our proposed testing procedure.

Author Contributions

Conceptualization, C.S.M., Y.Q. and R.T.C.; Methodology, C.S.M.; Software, C.S.M.; Validation, C.S.M., Y.Q. and R.T.C.; Formal analysis, C.S.M.; Investigation, C.S.M.; Resources, C.S.M.; Data curation, C.S.M.; Writing—original draft, C.S.M.; Writing—review & editing, C.S.M. and J.M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded through a research seed grant that was awarded to the main author by the Govan Mbeki Research and Development Centre, University of Fort Hare.

Data Availability Statement

The data presented in this study are available in the respective cited articles/sources.

Acknowledgments

We warmly thank both the associate editor and the anonymous reviewers for their constructive comments and suggestions that have allowed us to improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Epps, T.W. Testing that a stationary time series is Gaussian. Ann. Stat. 1987, 1683–1698. [Google Scholar] [CrossRef]
  2. Lobato, I.N.; Velasco, C. A simple test of normality for time series. Econom. Theory 2004, 20, 671–689. [Google Scholar] [CrossRef] [Green Version]
  3. Bai, J.; Ng, S. Tests for skewness, kurtosis, and normality for time series data. J. Bus. Econ. Stat. 2005, 23, 49–60. [Google Scholar] [CrossRef] [Green Version]
  4. Nieto-Reyes, A.; Cuesta-Albertos, J.A.; Gamboa, F. A random-projection based test of Gaussianity for stationary processes. Comput. Stat. Data Anal. 2014, 75, 124–141. [Google Scholar] [CrossRef]
  5. Psaradakis, Z.; Vávra, M. A distance test of normality for a wide class of stationary processes. Econom. Stat. 2017, 2, 50–60. [Google Scholar] [CrossRef] [Green Version]
  6. Moulines, E.; Choukri, K. Time-domain procedures for testing that a stationary time-series is Gaussian. IEEE Trans. Signal Process. 1996, 44, 2010–2025. [Google Scholar] [CrossRef]
  7. Bontemps, C.; Meddahi, N. Testing normality: A GMM approach. J. Econom. 2005, 124, 149–186. [Google Scholar] [CrossRef]
  8. Rao, T.S.; Gabr, M.M. A test for linearity of stationary time series. J. Time Ser. Anal. 1980, 1, 145–158. [Google Scholar] [CrossRef]
  9. Kitamura, Y. Empirical likelihood methods with weakly dependent processes. Ann. Stat. 1997, 25, 2084–2102. [Google Scholar] [CrossRef]
  10. Owen, A.B. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 1988, 75, 237–249. [Google Scholar] [CrossRef]
  11. Owen, A. Empirical likelihood ratio confidence regions. Ann. Stat. 1990, 18, 90–120. [Google Scholar] [CrossRef]
  12. Shan, G.; Vexler, A.; Wilding, G.E.; Hutson, A.D. Simple and exact empirical likelihood ratio tests for normality based on moment relations. Commun. Stat. Comput. 2010, 40, 129–146. [Google Scholar] [CrossRef]
  13. Marange, C.S.; Qin, Y. A simple empirical likelihood ratio test for normality based on the moment constraints of a half-Normal distribution. J. Probab. Stat. 2018, 2018, 8094146. [Google Scholar] [CrossRef]
  14. Marange, C.S.; Qin, Y. A new empirical likelihood ratio goodness of fit test for normality based on moment constraints. Commun. Stat. Simul. Comput. 2019, 50, 1561–1575. [Google Scholar] [CrossRef]
  15. Marange, C.S.; Qin, Y. An Empirical Likelihood Ratio-Based Omnibus Test for Normality with an Adjustment for Symmetric Alternatives. J. Probab. Stat. 2021, 2021, 6661985. [Google Scholar] [CrossRef]
  16. Zhao, Y.; Moss, A.; Yang, H.; Zhang, Y. Jackknife empirical likelihood for the skewness and kurtosis. Stat. Its Interface 2018, 11, 709–719. [Google Scholar] [CrossRef]
  17. Lin, L.; Zhang, R. Blockwise empirical Euclidean likelihood for weakly dependent processes. Stat. Probab. Lett. 2001, 53, 143–152. [Google Scholar] [CrossRef]
  18. Bravo, F. Blockwise empirical entropy tests for time series regressions. J. Time Ser. Anal. 2005, 26, 185–210. [Google Scholar] [CrossRef]
  19. Bravo, F. Blockwise generalized empirical likelihood inference for non-linear dynamic moment conditions models. Econom. J. 2009, 12, 208–231. [Google Scholar] [CrossRef]
  20. Nordman, D.J.; Sibbertsen, P.; Lahiri, S.N. Empirical likelihood confidence intervals for the mean of a long-range dependent process. J. Time Ser. Anal. 2007, 28, 576–599. [Google Scholar] [CrossRef] [Green Version]
  21. Nordman, D.J. Tapered empirical likelihood for time series data in time and frequency domains. Biometrika 2009, 96, 119–132. [Google Scholar] [CrossRef]
  22. Chen, S.X.; Wong, C.M. Smoothed block empirical likelihood for quantiles of weakly dependent processes. Stat. Sin. 2009, 71–81. [Google Scholar]
  23. Chen, Y.Y.; Zhang, L.X. Empirical Euclidean likelihood for general estimating equations under association dependence. Appl. Math. J. Chin. Univ. 2010, 25, 437–446. [Google Scholar] [CrossRef]
  24. Wu, R.; Cao, J. Blockwise empirical likelihood for time series of counts. J. Multivar. Anal. 2011, 102, 661–673. [Google Scholar] [CrossRef]
  25. Lei, Q.; Qin, Y. Empirical likelihood for quantiles under negatively associated samples. J. Stat. Plan. Inference 2011, 141, 1325–1332. [Google Scholar] [CrossRef]
  26. Nordman, D.J.; Bunzel, H.; Lahiri, S.N. A nonstandard empirical likelihood for time series. Ann. Stat. 2013, 3050–3073. [Google Scholar] [CrossRef] [Green Version]
  27. Nordman, D.J.; Lahiri, S.N. A review of empirical likelihood methods for time series. J. Stat. Plan. Inference 2014, 155, 1–18. [Google Scholar] [CrossRef]
  28. Nelson, C.R. Applied Time Series Analysis for Managerial Forecasting; Holden-Day Inc.: San Francisco, CA, USA, 1973. [Google Scholar]
  29. Wei, W. Time Series Analysis; Addison-Wesley Publishing Company Inc.: Reading, MA, USA, 1990. [Google Scholar]
  30. Box, G.E.; Jenkins, G.M.; Reinsel, G.C. Time Series Analysis: Forecasting and Control, 3rd ed.; Prentice Hall: Englewood Cliffs, NJ, USA, 1994. [Google Scholar]
  31. Kim, Y.M.; Lahiri, S.N.; Nordman, D.J. A progressive block empirical likelihood method for time series. J. Am. Stat. Assoc. 2013, 108, 1506–1516. [Google Scholar] [CrossRef]
  32. Ploberger, W.; Krämer, W. The CUSUM test with OLS residuals. Econom. J. Econom. Soc. 1992, 271–285. [Google Scholar] [CrossRef]
  33. Gombay, E.; Horvath, L. An application of the maximum likelihood test to the change-point problem. Stoch. Process. Their Appl. 1994, 50, 161–171. [Google Scholar] [CrossRef] [Green Version]
  34. Gurevich, G.; Vexler, A. Change point problems in the model of logistic regression. J. Stat. Plan. Inference 2005, 131, 313–331. [Google Scholar] [CrossRef]
  35. Vexler, A.; Wu, C. An optimal retrospective change point detection policy. Scand. J. Stat. 2009, 36, 542–558. [Google Scholar] [CrossRef] [Green Version]
  36. Vexler, A.; Liu, A.; Pollak, M. Transformation of Changepoint Detection Methods into a Shiryayev-Roberts Form; Department of Biostatistics, The New York State University at Buffalo: Buffalo, NY, USA, 2006. [Google Scholar]
  37. Wilks, S.S. The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Stat. 1938, 9, 60–62. [Google Scholar] [CrossRef]
  38. Hall, P.; Horowitz, J.L.; Jing, B.Y. On blocking rules for the bootstrap with dependent data. Biometrika 1995, 82, 561–574. [Google Scholar] [CrossRef]
  39. Lahiri, S.N. Resampling Methods for Dependent Data; Springer: New York, NY, USA, 2003. [Google Scholar]
  40. Andrews, D.W. Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econom. J. Econom. Soc. 1991, 59, 817–858. [Google Scholar] [CrossRef]
  41. Monti, A.C. Empirical likelihood confidence regions in time series models. Biometrika 1997, 84, 395–405. [Google Scholar] [CrossRef]
  42. Qin, Y.; Lei, Q. Empirical Likelihood for Mixed Regressive, Spatial Autoregressive Model Based on GMM. Sankhya A 2021, 83, 353–378. [Google Scholar] [CrossRef]
  43. Caner, M.; Kilian, L. Size distortions of tests of the null hypothesis of stationarity: Evidence and implications for the PPP debate. J. Int. Money Financ. 2001, 20, 639–657. [Google Scholar] [CrossRef] [Green Version]
  44. De Long, J.B.; Summers, L.H. Is Increased Price Flexibility Stabilizing? Natl. Bur. Econ. Res. 1986, 76, 1031–1044. [Google Scholar]
  45. Campbell, M.J.; Walker, A.M. A Survey of Statistical Work on the Mackenzie River Series of Annual Canadian Lynx Trappings for the Years 1821–1934 and a New Analysis. J. R. Stat. Soc. Ser. A 1977, 140, 411–431. [Google Scholar] [CrossRef]
  46. Tong, H. Some comments on the Canadian lynx data. J. R. Stat. Soc. Ser. A 1977, 140, 432–436. [Google Scholar] [CrossRef]
  47. Haggan, V.; Heravi, S.M.; Priestley, M.B. A study of the application of state-dependent models in non-linear time series analysis. J. Time Ser. Anal. 1984, 5, 69–102. [Google Scholar] [CrossRef]
  48. Makridakis, S.; Wheelwright, S.; Hyndman, R. Forecasting: Methods and Applications; John Wiley & Sons: New York, NY, USA, 1998. [Google Scholar]
  49. Coghlan, A. A Little Book of R for Time Series; Release 0.2; Parasite Genomics Group, Wellcome Trust Sanger Institute: Cambridge, UK, 2017. [Google Scholar]
  50. Truong, P.; Novák, V. An Improved Forecasting and Detection of Structural Breaks in time Series Using Fuzzy Techniques; Contribution to Statistics; Springer: Cham, Switzerland, 2022. [Google Scholar]
Figure 1. Diagnostic plots for the Canadian lynx data. The upper plot shows a time-series plot, which reveals evidence of stationarity. The middle plots are the histogram (middle-left) and the quantile–quantile plot (middle-right), and both plots suggest that the time series process has a non-normal distribution. The lower plots show the autocorrelation functions, and for both plots, the autocorrelations are close to zero, giving further evidence of stationarity.
Figure 2. Diagnostic plots for the souvenir data. The upper plot shows a time-series plot, which reveals evidence of non-stationarity. The middle plots are the histogram (middle-left) and the quantile–quantile plot (middle-right), and both plots suggest that the time series process has a non-normal distribution. The lower plots show the autocorrelation functions.
Figure 3. Diagnostic plots for the souvenir forecast errors. The upper plot shows a time-series plot, which reveals evidence of stationarity. The middle plots are the histogram (middle-left) and the quantile–quantile plot (middle-right), and both plots suggest that the time series process has a normal distribution. The lower plots show the autocorrelation functions, and for both plots, the autocorrelations are close to zero, giving further evidence of stationarity.
Table 1. Coverage probabilities for 95% BEL CIs for the mean of $X_t = \phi X_{t-1} + \epsilon_t$ ($\epsilon_t$ i.i.d. standard normal), with $n = 100, 250$ for NOL and OL blocks of size $b = C n^{1/3}$ using 5000 simulations. Means and mean average deviations (from 0.95) of coverage probabilities are indicated in bold.

ϕ | NOL $0.5n^{1/3}$ | NOL $n^{1/3}$ | NOL $2n^{1/3}$ | NOL $3n^{1/3}$ | OL $0.5n^{1/3}$ | OL $n^{1/3}$ | OL $2n^{1/3}$ | OL $3n^{1/3}$

n = 100
−0.9 | 0.9998 | 0.9970 | 0.9522 | 0.9208 | 1.0000 | 0.9988 | 0.9810 | 0.9708
−0.8 | 0.9976 | 0.9888 | 0.9374 | 0.9098 | 0.9992 | 0.9928 | 0.9716 | 0.9528
−0.7 | 0.9938 | 0.9742 | 0.9266 | 0.8926 | 0.9960 | 0.9846 | 0.9588 | 0.9401
−0.6 | 0.9850 | 0.9636 | 0.9230 | 0.8958 | 0.9914 | 0.9762 | 0.9512 | 0.9306
−0.5 | 0.9778 | 0.9564 | 0.9116 | 0.8892 | 0.9846 | 0.9704 | 0.9436 | 0.9228
−0.4 | 0.9706 | 0.9492 | 0.9076 | 0.8820 | 0.9784 | 0.9610 | 0.9384 | 0.9176
−0.3 | 0.9630 | 0.9420 | 0.9072 | 0.8802 | 0.9688 | 0.9546 | 0.9322 | 0.9130
−0.2 | 0.9562 | 0.9370 | 0.9026 | 0.8800 | 0.9600 | 0.9486 | 0.9272 | 0.9088
−0.1 | 0.9504 | 0.9340 | 0.9000 | 0.8756 | 0.9530 | 0.9440 | 0.9224 | 0.9046
0 | 0.9402 | 0.9322 | 0.8924 | 0.8624 | 0.9444 | 0.9400 | 0.9258 | 0.8998
0.1 | 0.9300 | 0.9342 | 0.9054 | 0.8580 | 0.9266 | 0.9324 | 0.9088 | 0.8952
0.2 | 0.9180 | 0.9170 | 0.8948 | 0.8558 | 0.9254 | 0.9288 | 0.9064 | 0.8894
0.3 | 0.9010 | 0.9046 | 0.8896 | 0.8582 | 0.9122 | 0.9200 | 0.9026 | 0.8816
0.4 | 0.8852 | 0.8964 | 0.8848 | 0.8466 | 0.8880 | 0.9040 | 0.8996 | 0.8752
0.5 | 0.8578 | 0.8860 | 0.8794 | 0.8448 | 0.8570 | 0.8972 | 0.8868 | 0.8736
0.6 | 0.8254 | 0.8584 | 0.8634 | 0.8398 | 0.8134 | 0.8628 | 0.8748 | 0.8614
0.7 | 0.7514 | 0.8116 | 0.8274 | 0.8180 | 0.7620 | 0.8114 | 0.8478 | 0.8266
0.8 | 0.6526 | 0.7350 | 0.7824 | 0.7780 | 0.6482 | 0.7428 | 0.7896 | 0.7830
0.9 | 0.4538 | 0.5416 | 0.6438 | 0.6508 | 0.4716 | 0.5456 | 0.6474 | 0.6510
Mean | 0.89 | 0.90 | 0.88 | 0.85 | 0.89 | 0.91 | 0.90 | 0.88
MAD | 0.09 | 0.07 | 0.07 | 0.10 | 0.09 | 0.06 | 0.06 | 0.07

n = 250
−0.9 | 0.9870 | 0.9966 | 0.9792 | 0.9602 | 0.9924 | 0.9984 | 0.9906 | 0.9836
−0.8 | 0.9832 | 0.9864 | 0.9608 | 0.9398 | 0.9890 | 0.9910 | 0.9772 | 0.9628
−0.7 | 0.9768 | 0.9740 | 0.9490 | 0.9302 | 0.9848 | 0.9794 | 0.9678 | 0.9538
−0.6 | 0.9744 | 0.9658 | 0.9398 | 0.9238 | 0.9804 | 0.9732 | 0.9630 | 0.9452
−0.5 | 0.9708 | 0.9594 | 0.9372 | 0.9220 | 0.9768 | 0.9656 | 0.9592 | 0.9392
−0.4 | 0.9668 | 0.9554 | 0.9338 | 0.9200 | 0.9696 | 0.9622 | 0.9552 | 0.9356
−0.3 | 0.9628 | 0.9504 | 0.9326 | 0.9182 | 0.9644 | 0.9570 | 0.9524 | 0.9318
−0.2 | 0.9586 | 0.9450 | 0.9310 | 0.9174 | 0.9592 | 0.9518 | 0.9502 | 0.9294
−0.1 | 0.9532 | 0.9422 | 0.9282 | 0.9156 | 0.9538 | 0.9482 | 0.9474 | 0.9274
0 | 0.9424 | 0.9428 | 0.9298 | 0.9148 | 0.9534 | 0.9456 | 0.9344 | 0.9330
0.1 | 0.9422 | 0.9354 | 0.9246 | 0.9118 | 0.9402 | 0.9344 | 0.9322 | 0.9252
0.2 | 0.9258 | 0.9324 | 0.9224 | 0.9154 | 0.9374 | 0.9318 | 0.9304 | 0.9224
0.3 | 0.9158 | 0.9270 | 0.9130 | 0.9098 | 0.9260 | 0.9310 | 0.9308 | 0.9230
0.4 | 0.9142 | 0.9210 | 0.9228 | 0.9018 | 0.9012 | 0.9256 | 0.9296 | 0.9180
0.5 | 0.8880 | 0.9062 | 0.9126 | 0.9020 | 0.8922 | 0.9136 | 0.9158 | 0.9144
0.6 | 0.8594 | 0.8850 | 0.9088 | 0.8946 | 0.8654 | 0.9006 | 0.9050 | 0.9060
0.7 | 0.8182 | 0.8652 | 0.8918 | 0.8938 | 0.8094 | 0.8794 | 0.8968 | 0.9006
0.8 | 0.7348 | 0.7998 | 0.8558 | 0.8706 | 0.7350 | 0.8084 | 0.8692 | 0.8730
0.9 | 0.5880 | 0.6746 | 0.7578 | 0.7906 | 0.5614 | 0.6742 | 0.7664 | 0.8020
Mean | 0.91 | 0.92 | 0.92 | 0.91 | 0.91 | 0.93 | 0.93 | 0.92
MAD | 0.06 | 0.05 | 0.04 | 0.04 | 0.06 | 0.04 | 0.03 | 0.03
Table 2. Coverage probabilities for 95% BEL CIs for the mean of X t = ϕ X t 1 + ϵ t ( ϵ t i.i.d. standard normal), with n = 500 , 1000 for NOL and OL blocks of size b = C n 1 / 3 using 5000 simulations. Means and mean average deviations (from 0.95) of coverage probabilities are indicated in bold.
Table 2. Coverage probabilities for 95% BEL CIs for the mean of X t = ϕ X t 1 + ϵ t ( ϵ t i.i.d. standard normal), with n = 500 , 1000 for NOL and OL blocks of size b = C n 1 / 3 using 5000 simulations. Means and mean average deviations (from 0.95) of coverage probabilities are indicated in bold.
NOL BlocksOL Blocks
ϕ 0 . 5 n 1 / 3 n 1 / 3 2 n 1 / 3 3 n 1 / 3 0 . 5 n 1 / 3 n 1 / 3 2 n 1 / 3 3 n 1 / 3
n = 500
−0.90.98900.98860.97680.96100.99380.98840.98260.9780
−0.80.98540.98000.96780.94920.99020.98210.97120.9668
−0.70.98100.97640.96000.94280.98700.97640.96440.9594
−0.60.97680.97040.95640.93980.98260.97070.96060.9556
−0.50.97240.96520.95180.93880.97840.96640.95600.9522
−0.40.96740.96300.94940.93600.97340.96310.95380.9506
−0.30.94600.95920.94700.93340.97040.96020.95220.9480
−0.20.95980.95660.94520.93160.96460.95640.95080.9462
−0.10.95600.95280.94440.93140.96000.95440.94900.9440
00.94860.94960.94400.92120.94980.95400.94600.9346
0.10.94260.94400.93640.93020.94780.94580.94240.9356
0.20.94420.93840.93160.92240.93240.94260.93840.9306
0.30.92340.92800.93600.92160.92380.93840.93760.9344
0.40.90960.93120.92880.91980.91420.93300.93660.9272
0.50.88360.92620.92660.91820.89260.92400.93120.9290
0.60.86760.90920.91920.91600.86900.91180.92940.9336
0.70.82640.87780.91140.91400.81680.89140.92400.9274
0.80.74080.83040.88920.90180.74480.84980.91500.9076
0.90.56260.70700.81780.83720.58620.72800.89640.8548
Mean0.910.930.930.920.910.930.940.94
MAD0.060.040.020.030.060.030.020.02
n = 1000
−0.9   0.9998   0.9860   0.9732   0.9638   1.0000   0.9868   0.9786   0.9720
−0.8   0.9944   0.9776   0.9660   0.9504   0.9966   0.9792   0.9668   0.9586
−0.7   0.9830   0.9720   0.9606   0.9470   0.9878   0.9732   0.9600   0.9524
−0.6   0.9750   0.9664   0.9562   0.9430   0.9776   0.9690   0.9564   0.9500
−0.5   0.9676   0.9616   0.9540   0.9416   0.9722   0.9642   0.9522   0.9482
−0.4   0.9618   0.9598   0.9524   0.9402   0.9680   0.9604   0.9502   0.9454
−0.3   0.9570   0.9568   0.9508   0.9386   0.9634   0.9572   0.9484   0.9450
−0.2   0.9530   0.9552   0.9490   0.9380   0.9592   0.9550   0.9474   0.9440
−0.1   0.9502   0.9524   0.9480   0.9374   0.9560   0.9534   0.9454   0.9418
 0     0.9504   0.9488   0.9422   0.9404   0.9486   0.9570   0.9442   0.9414
 0.1   0.9428   0.9442   0.9404   0.9406   0.9432   0.9494   0.9482   0.9442
 0.2   0.9402   0.9450   0.9368   0.9396   0.9380   0.9472   0.9452   0.9464
 0.3   0.9256   0.9354   0.9464   0.9348   0.9332   0.9428   0.9436   0.9372
 0.4   0.9216   0.9360   0.9430   0.9318   0.9250   0.9334   0.9360   0.9414
 0.5   0.9148   0.9324   0.9338   0.9368   0.9166   0.9296   0.9374   0.9402
 0.6   0.8856   0.9246   0.9294   0.9284   0.8722   0.9194   0.9344   0.9360
 0.7   0.8412   0.9064   0.9306   0.9288   0.8550   0.9012   0.9290   0.9264
 0.8   0.7870   0.8696   0.9066   0.9186   0.7814   0.8652   0.9030   0.9124
 0.9   0.6516   0.7660   0.8492   0.8882   0.6476   0.7626   0.8628   0.8842
Mean   0.92     0.94     0.94     0.94     0.92     0.94     0.94     0.94
MAD    0.05     0.03     0.02     0.02     0.05     0.03     0.02     0.01
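Coverage probabilities of the kind reported above can be approximated with a short simulation. The R sketch below is an illustration only and not the authors' code: it estimates the coverage of a 95% BEL confidence interval for the mean of a Gaussian AR(1) process using non-overlapping block means of length b = n^(1/3) and the el.test() routine from the emplik package; the block length, the number of replications and the use of emplik are choices made here for the example.

## Illustrative R sketch (not the authors' code): coverage of a 95% BEL CI
## for the mean of a Gaussian AR(1) process with NOL blocks of length n^(1/3).
## Requires the emplik package for el.test().
library(emplik)

set.seed(1)
n     <- 500
phi   <- 0.5
b     <- floor(n^(1/3))            # block length
nrep  <- 1000                      # 5000 replications were used in the paper
cover <- 0
for (r in seq_len(nrep)) {
  x  <- as.numeric(arima.sim(list(ar = phi), n = n))        # true mean is 0
  nb <- floor(n / b)
  block_means <- colMeans(matrix(x[1:(nb * b)], nrow = b))  # NOL block means
  lr <- el.test(block_means, mu = 0)[["-2LLR"]]             # -2 log EL ratio at mu = 0
  cover <- cover + (lr <= qchisq(0.95, df = 1))
}
cover / nrep   # empirical coverage probability, cf. Table 2

The overlapping-block version involves additional bookkeeping and is therefore omitted from this simple sketch.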
Table 3. Empirical rejection probabilities of the process defined in (10) for NOL and OL blocks with varying block sizes at the 0.05 nominal level for n = 100, 500, 1000, using 5000 replications.
                        NOL Blocks                                 OL Blocks
ϕ      b           Log N    t(10)    χ²(10)   U(0,1)    Log N    t(10)    χ²(10)   U(0,1)
n = 100
−0.9   n^(1/3)     0.0636   0.0474   0.0670   0.0512    0.0240   0.0460   0.0470   0.0564
       2n^(1/3)    0.0986   0.0502   0.0676   0.0468    0.0382   0.0486   0.0616   0.0652
−0.5   n^(1/3)     0.9884   0.0632   0.5088   0.2832    0.9928   0.0620   0.5098   0.2908
       2n^(1/3)    0.9806   0.0542   0.4582   0.2484    0.9846   0.0538   0.4578   0.2630
 0     n^(1/3)     1.0000   0.0690   0.8842   0.9996    1.0000   0.0772   0.8730   0.9988
       2n^(1/3)    1.0000   0.0518   0.8464   0.9948    1.0000   0.0586   0.8250   0.9938
 0.5   n^(1/3)     0.9996   0.0544   0.4812   0.4612    1.0000   0.0532   0.4456   0.4244
       2n^(1/3)    0.9984   0.0396   0.3624   0.4046    0.9954   0.0452   0.3272   0.3836
 0.6   n^(1/3)     0.9964   0.0544   0.3000   0.2486    0.9948   0.0456   0.3064   0.2146
       2n^(1/3)    0.9896   0.0314   0.2196   0.2328    0.9764   0.0470   0.2346   0.2208
 0.7   n^(1/3)     0.9672   0.0470   0.1978   0.1198    0.9542   0.0506   0.1924   0.1218
       2n^(1/3)    0.9226   0.0400   0.1204   0.1206    0.8884   0.0476   0.1282   0.1124
 0.8   n^(1/3)     0.7766   0.0516   0.1048   0.0656    0.7674   0.0518   0.0998   0.0612
       2n^(1/3)    0.6770   0.0260   0.0720   0.0760    0.6450   0.0420   0.0706   0.0728
 0.9   n^(1/3)     0.3268   0.0562   0.0596   0.0858    0.3284   0.1012   0.0595   0.0828
       2n^(1/3)    0.3118   0.0490   0.0558   0.0552    0.2928   0.0770   0.0520   0.0544
n = 500
−0.9   n^(1/3)     0.7696   0.0496   0.1158   0.0600    0.6694   0.0442   0.1060   0.0652
       2n^(1/3)    0.7984   0.0490   0.1288   0.0550    0.6254   0.0462   0.1146   0.0562
−0.5   n^(1/3)     1.0000   0.0974   0.9988   0.9186    1.0000   0.0974   0.9990   0.9282
       2n^(1/3)    1.0000   0.0808   0.9986   0.8860    0.9994   0.0892   0.9992   0.8962
 0     n^(1/3)     1.0000   0.4280   1.0000   1.0000    1.0000   0.4474   1.0000   1.0000
       2n^(1/3)    1.0000   0.3596   1.0000   1.0000    1.0000   0.4050   1.0000   1.0000
 0.5   n^(1/3)     1.0000   0.2174   0.9982   0.9906    1.0000   0.2280   0.9990   0.9914
       2n^(1/3)    1.0000   0.1954   0.9972   0.9880    1.0000   0.1818   0.9960   0.9842
 0.6   n^(1/3)     1.0000   0.1332   0.9778   0.7884    1.0000   0.1364   0.9730   0.7884
       2n^(1/3)    1.0000   0.1090   0.9656   0.7754    1.0000   0.1156   0.9618   0.7704
 0.7   n^(1/3)     1.0000   0.0802   0.8274   0.3436    1.0000   0.0802   0.8420   0.3524
       2n^(1/3)    1.0000   0.0738   0.7862   0.3622    1.0000   0.0730   0.7890   0.3512
 0.8   n^(1/3)     1.0000   0.0572   0.4684   0.1204    1.0000   0.0600   0.4742   0.1092
       2n^(1/3)    1.0000   0.0516   0.4388   0.1310    0.9998   0.0506   0.4166   0.1248
 0.9   n^(1/3)     0.9610   0.0576   0.1576   0.0558    0.9616   0.0518   0.1404   0.0490
       2n^(1/3)    0.9508   0.0484   0.1242   0.0564    0.9466   0.0462   0.1254   0.0516
n = 1000
−0.9   n^(1/3)     0.9832   0.0562   0.2254   0.0556    0.9836   0.0482   0.2190   0.0632
       2n^(1/3)    0.9842   0.0520   0.2448   0.0526    0.9696   0.0502   0.2130   0.0482
−0.5   n^(1/3)     1.0000   0.1926   1.0000   1.0000    1.0000   0.2130   1.0000   1.0000
       2n^(1/3)    1.0000   0.1862   1.0000   0.9974    1.0000   0.1770   1.0000   0.9986
 0     n^(1/3)     1.0000   0.8058   1.0000   1.0000    1.0000   0.8220   1.0000   1.0000
       2n^(1/3)    1.0000   0.7844   1.0000   1.0000    1.0000   0.7946   1.0000   1.0000
 0.5   n^(1/3)     1.0000   0.4934   1.0000   1.0000    1.0000   0.4788   1.0000   1.0000
       2n^(1/3)    1.0000   0.4656   0.9998   0.9970    1.0000   0.4860   1.0000   1.0000
 0.6   n^(1/3)     1.0000   0.2872   1.0000   0.9782    1.0000   0.3018   1.0000   0.9818
       2n^(1/3)    1.0000   0.2710   0.9936   0.9768    1.0000   0.2882   0.9998   0.9754
 0.7   n^(1/3)     1.0000   0.1354   0.9862   0.6346    1.0000   0.1422   0.9852   0.6174
       2n^(1/3)    1.0000   0.1296   0.9790   0.6280    1.0000   0.1426   0.9846   0.6170
 0.8   n^(1/3)     1.0000   0.0746   0.7962   0.1862    1.0000   0.0654   0.7786   0.1744
       2n^(1/3)    1.0000   0.0720   0.7476   0.1912    1.0000   0.0652   0.7372   0.1840
 0.9   n^(1/3)     0.9992   0.0474   0.2714   0.0610    0.9994   0.0530   0.2564   0.0658
       2n^(1/3)    0.9966   0.0472   0.2330   0.0608    0.9992   0.0500   0.2388   0.0606
Note: Bold represents the most powerful test under the respective simulated scenario.
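To make the construction behind these rejection rates concrete, the R sketch below computes a blockwise EL statistic from the skewness and kurtosis moment constraints E(X_t − μ)³ = 0 and E(X_t − μ)⁴ = 3σ⁴ for a simple AR(1) example. It is an illustration under our own simplifying choices (NOL block means of the estimating functions, the emplik::el.test() routine, and a χ²(2) calibration for the two constraints) rather than the authors' exact implementation, and the processes simulated below are not the process defined in (10).

## Illustrative R sketch of a blockwise EL Gaussianity statistic based on the
## skewness and kurtosis moment constraints (not the authors' implementation).
library(emplik)

bel_gauss_stat <- function(x, b = floor(length(x)^(1/3))) {
  n  <- length(x)
  m  <- mean(x)
  s2 <- mean((x - m)^2)
  ## estimating functions: third central moment and excess fourth moment
  g  <- cbind((x - m)^3, (x - m)^4 - 3 * s2^2)
  nb <- floor(n / b)
  ## non-overlapping block means of the estimating functions (nb x 2 matrix)
  G  <- apply(g[1:(nb * b), ], 2, function(col) colMeans(matrix(col, nrow = b)))
  ## -2 log EL ratio for H0: E(block mean) = (0, 0); compare with qchisq(0.95, 2)
  el.test(G, mu = c(0, 0))[["-2LLR"]]
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 500))              # Gaussian AR(1)
bel_gauss_stat(x) > qchisq(0.95, df = 2)                         # typically FALSE
y <- as.numeric(arima.sim(list(ar = 0.5), n = 500,
                          rand.gen = function(n) rexp(n) - 1))   # skewed innovations
bel_gauss_stat(y) > qchisq(0.95, df = 2)                         # often TRUE under this alternative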
Table 4. Empirical rejection probabilities of the process defined in (10) at the 0.05 nominal level for n = 100 using 5000 replications.
Rejection Rates for n = 100
ϕ      Test   N(0,1)   Log N    t(10)    χ²(1)    χ²(10)   β(2,1)   U(0,1)
−0.9   BELT   0.0474   0.0226   0.0478   0.0280   0.0684   0.0514   0.0574
       EPPS   0.1268   0.0534   0.1216   0.0728   0.1226   0.1438   0.1574
       LV     0.0284   0.1454   0.0316   0.0892   0.0400   0.0224   0.0234
       PV     0.0628   0.2972   0.0640   0.1642   0.0880   0.0824   0.0302
−0.5   BELT   0.0530   0.9950   0.0610   0.9966   0.4956   0.4862   0.2932
       EPPS   0.0712   0.6810   0.0532   0.8528   0.2044   0.4840   0.5538
       LV     0.0456   0.9994   0.1896   0.9988   0.4794   0.1698   0.0096
       PV     0.0482   0.9984   0.1286   0.9980   0.3584   0.3780   0.2664
 0     BELT   0.0468   1.0000   0.0552   1.0000   0.8868   0.9576   0.9978
       EPPS   0.0632   0.9672   0.0858   0.9960   0.5426   0.9706   0.9948
       LV     0.0428   1.0000   0.2950   1.0000   0.7820   0.7460   0.5446
       PV     0.0484   1.0000   0.1568   1.0000   0.8048   0.9820   0.9602
 0.5   BELT   0.0510   0.9998   0.0532   0.9996   0.4534   0.3762   0.4248
       EPPS   0.0732   0.8566   0.0646   0.9598   0.2658   0.5590   0.5668
       LV     0.0342   0.9978   0.1578   0.9984   0.4160   0.1030   0.0002
       PV     0.0384   0.9998   0.0862   0.9992   0.4040   0.4242   0.1100
 0.6   BELT   0.0554   0.9960   0.0520   0.9962   0.3084   0.2122   0.2214
       EPPS   0.0750   0.6182   0.0610   0.8188   0.1990   0.3596   0.3392
       LV     0.0332   0.9872   0.1204   0.9738   0.2848   0.0676   0.0020
       PV     0.0660   0.9932   0.0826   0.9864   0.2932   0.2108   0.0510
 0.7   BELT   0.0524   0.9624   0.0500   0.9494   0.1986   0.1540   0.1064
       EPPS   0.0798   0.3232   0.0664   0.4846   0.1462   0.2170   0.2158
       LV     0.0324   0.9050   0.0832   0.8292   0.1646   0.0382   0.0028
       PV     0.0600   0.9340   0.0868   0.8900   0.1520   0.0846   0.0326
 0.8   BELT   0.0482   0.7980   0.0488   0.7238   0.0952   0.0986   0.0606
       EPPS   0.1104   0.1464   0.0972   0.2038   0.1308   0.1576   0.1636
       LV     0.0154   0.6246   0.0408   0.4400   0.0658   0.0240   0.0050
       PV     0.0648   0.6964   0.0684   0.5388   0.0920   0.0692   0.0546
 0.9   BELT   0.0522   0.4330   0.0502   0.2026   0.0576   0.1026   0.0874
       EPPS   0.1708   0.1304   0.1474   0.1390   0.1516   0.1832   0.1844
       LV     0.0092   0.1750   0.0142   0.0836   0.0184   0.0048   0.0002
       PV     0.0782   0.3020   0.0744   0.1830   0.0840   0.0642   0.0324
Note: Bold represents the most powerful test under the respective simulated scenario.
Table 5. Empirical rejection probabilities of the process defined in (10) at the 0.05 nominal level for n = 500 using 5000 replications.
Rejection Rates for n = 500
ϕ      Test   N(0,1)   Log N    t(10)    χ²(1)    χ²(10)   β(2,1)   U(0,1)
−0.9   BELT   0.0490   0.6774   0.0478   0.4806   0.1046   0.0904   0.0538
       EPPS   0.0760   0.3736   0.0594   0.2214   0.0694   0.0896   0.0986
       LV     0.0692   0.8852   0.0866   0.6136   0.1222   0.0728   0.0562
       PV     0.0460   0.7134   0.0780   0.3820   0.0630   0.0660   0.0492
−0.5   BELT   0.0466   1.0000   0.0888   1.0000   0.9986   0.9992   0.9248
       EPPS   0.0604   0.9998   0.1372   1.0000   0.7692   0.9900   0.9928
       LV     0.0422   1.0000   0.4564   1.0000   0.9942   0.9968   0.9638
       PV     0.0558   1.0000   0.2216   1.0000   0.9680   0.9954   0.9570
 0     BELT   0.0560   1.0000   0.3856   1.0000   1.0000   1.0000   1.0000
       EPPS   0.0554   1.0000   0.3266   1.0000   0.9976   1.0000   1.0000
       LV     0.0452   1.0000   0.7436   1.0000   1.0000   1.0000   1.0000
       PV     0.0452   1.0000   0.4938   1.0000   1.0000   1.0000   1.0000
 0.5   BELT   0.0520   1.0000   0.1900   1.0000   0.9958   0.9982   0.9888
       EPPS   0.0610   1.0000   0.1494   1.0000   0.8748   0.9972   0.9948
       LV     0.0430   1.0000   0.4528   1.0000   0.9938   0.9960   0.9708
       PV     0.0384   1.0000   0.2072   1.0000   0.9930   0.9980   0.9196
 0.6   BELT   0.0542   1.0000   0.1218   1.0000   0.9766   0.9226   0.7842
       EPPS   0.0596   0.9994   0.1040   1.0000   0.7276   0.9060   0.8186
       LV     0.0482   1.0000   0.3238   1.0000   0.9558   0.8890   0.4742
       PV     0.0420   1.0000   0.1386   0.9990   0.9274   0.9172   0.4590
 0.7   BELT   0.0534   1.0000   0.0856   1.0000   0.8334   0.6234   0.3440
       EPPS   0.0634   0.9998   0.0824   1.0000   0.4724   0.5792   0.4224
       LV     0.0414   1.0000   0.2088   1.0000   0.7392   0.4152   0.0690
       PV     0.0480   0.9970   0.1074   0.9968   0.7186   0.5494   0.0842
 0.8   BELT   0.0512   1.0000   0.0742   1.0000   0.4812   0.2648   0.1168
       EPPS   0.0850   0.9812   0.0728   0.9622   0.2554   0.2584   0.1844
       LV     0.0410   1.0000   0.1174   0.9988   0.3758   0.1288   0.0166
       PV     0.0514   0.9896   0.0752   0.9920   0.3744   0.2070   0.0308
 0.9   BELT   0.0450   0.9582   0.0422   0.8414   0.1450   0.0878   0.0584
       EPPS   0.1174   0.5838   0.0962   0.4574   0.1516   0.1580   0.1430
       LV     0.0176   0.8318   0.0384   0.5552   0.0688   0.0218   0.0100
       PV     0.0452   0.5176   0.0720   0.3350   0.0212   0.0104   0.0072
Note: Bold represents the most powerful test under the respective simulated scenario.
Table 6. Empirical rejection probabilities of the process defined in (10) at the 0.05 nominal level for n = 1000 using 5000 replications.
Rejection Rates for n = 1000
ϕ      Test   N(0,1)   Log N    t(10)    χ²(1)    χ²(10)   β(2,1)   U(0,1)
−0.9   BELT   0.0510   0.9816   0.0474   0.9098   0.2172   0.1236   0.0646
       EPPS   0.0688   0.7914   0.0542   0.4806   0.0822   0.0786   0.0990
       LV     0.0910   0.9920   0.1182   0.8958   0.2408   0.1294   0.0684
       PV     0.0590   0.9240   0.0520   0.6160   0.0840   0.0450   0.0520
−0.5   BELT   0.0480   1.0000   0.2096   1.0000   1.0000   1.0000   1.0000
       EPPS   0.0544   1.0000   0.2508   1.0000   0.9784   1.0000   1.0000
       LV     0.0464   1.0000   0.6976   1.0000   1.0000   1.0000   1.0000
       PV     0.0520   1.0000   0.4010   1.0000   1.0000   1.0000   1.0000
 0     BELT   0.0496   1.0000   0.8112   1.0000   1.0000   1.0000   1.0000
       EPPS   0.0572   1.0000   0.5934   1.0000   1.0000   1.0000   1.0000
       LV     0.0488   1.0000   0.9422   1.0000   1.0000   1.0000   1.0000
       PV     0.0480   1.0000   0.7910   1.0000   1.0000   1.0000   1.0000
 0.5   BELT   0.0560   1.0000   0.4866   1.0000   1.0000   1.0000   1.0000
       EPPS   0.0560   1.0000   0.2586   1.0000   0.9940   1.0000   1.0000
       LV     0.0488   1.0000   0.6720   1.0000   1.0000   1.0000   1.0000
       PV     0.0490   1.0000   0.3960   1.0000   1.0000   1.0000   1.0000
 0.6   BELT   0.0442   1.0000   0.3028   1.0000   1.0000   1.0000   0.9798
       EPPS   0.0572   1.0000   0.1738   1.0000   0.9592   0.9972   0.9826
       LV     0.0438   1.0000   0.4970   1.0000   0.9996   1.0000   0.9722
       PV     0.0590   0.9994   0.2300   1.0000   0.9998   0.9992   0.8800
 0.7   BELT   0.0552   1.0000   0.1394   1.0000   0.9888   0.9202   0.6256
       EPPS   0.0618   0.9998   0.1016   1.0000   0.7678   0.8564   0.6302
       LV     0.0470   1.0000   0.3144   1.0000   0.9774   0.8898   0.3628
       PV     0.0490   1.0000   0.1404   0.9990   0.9580   0.9072   0.2490
 0.8   BELT   0.0468   1.0000   0.0690   1.0000   0.7744   0.4996   0.1740
       EPPS   0.0744   1.0000   0.0754   0.9996   0.4172   0.4036   0.2364
       LV     0.0510   1.0000   0.1512   1.0000   0.6782   0.3184   0.0408
       PV     0.0622   0.9930   0.0880   0.9924   0.6550   0.3826   0.0430
 0.9   BELT   0.0496   1.0000   0.0536   0.9886   0.2578   0.1454   0.0652
       EPPS   0.0902   0.9200   0.0882   0.7720   0.1780   0.1624   0.1220
       LV     0.0362   0.9924   0.0538   0.9008   0.1550   0.0624   0.0120
       PV     0.0514   0.5760   0.0560   0.4746   0.0008   0.0006   0.0002
Note: Bold represents the most powerful test under the respective simulated scenario.
Table 7. Ranking of tests using average powers computed from the empirical rejection probabilities in Table 4, Table 5 and Table 6 for n = 100, 500 and 1000.
Power Rankings
n      Ranking   Log N        t(10)       χ²(1)       χ²(10)   β(2,1)       U(0,1)
100    1         BELT, PV     LV          BELT        BELT     EPPS         EPPS
       2         LV           EPPS, PV    PV          LV, PV   BELT, PV     BELT
       3         EPPS         BELT        LV          EPPS     LV           PV
       4         –            –           EPPS        –        –            LV
500    1         BELT, LV     LV          BELT        BELT     BELT, EPPS   EPPS
       2         PV           PV          LV          LV       PV, LV       BELT
       3         EPPS         BELT        PV, EPPS    PV       –            PV, LV
       4         –            EPPS        –           EPPS     –            –
1000   1         BELT, LV     LV          BELT        BELT     BELT         EPPS
       2         PV, EPPS     PV, BELT    LV          LV       EPPS         BELT
       3         –            EPPS        EPPS, PV    PV       LV, PV       LV
       4         –            –           –           EPPS     –            PV
Table 8. Comparisons of computational times (in seconds) for the studied tests.
Test   Replications   Elapsed   Relative   User.self   Sys.self
BELT   1000           13.91     5.434      13.31       0.59
EPPS   1000           8.09      3.160      7.40        0.66
LV     1000           2.56      1.000      2.47        0.09
PV     1000           1394.86   544.867    1381.02     13.16
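Timings with this column layout can be produced with, for example, the rbenchmark package in R, whose benchmark() function reports the elapsed, relative, user.self and sys.self columns shown above. In the sketch below (our own illustration, not the authors' code), toy_belt() and toy_lv() are trivial stand-ins for the actual test implementations, which are not reproduced here.

## Illustrative R sketch of producing Table 8-style timings with rbenchmark.
## toy_belt() and toy_lv() are placeholders, not the actual BELT and LV tests.
library(rbenchmark)

toy_belt <- function(x) mean((x - mean(x))^3) / sd(x)^3       # sample skewness as placeholder work
toy_lv   <- function(x) mean((x - mean(x))^4) / sd(x)^4 - 3   # excess kurtosis as placeholder work

set.seed(1)
x <- rnorm(500)
benchmark(BELT = toy_belt(x), LV = toy_lv(x),
          replications = 1000,
          columns = c("test", "replications", "elapsed", "relative",
                      "user.self", "sys.self"))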