On the Performance of Wavelet Based Unit Root Tests

Abstract: In this paper, we apply wavelet methods to the popular Augmented Dickey-Fuller and M types of unit root tests. We provide an extensive comparison of wavelet based unit root tests that also includes recent contributions in the literature. Moreover, we derive the asymptotic properties of the wavelet based unit root tests under the generalized least squares detrending mechanism. We demonstrate that the wavelet based M tests exhibit better size performance even in problematic cases such as the presence of negative moving average innovations. However, the power performances of the wavelet based unit root tests are quite similar to each other.


Introduction
It is well known that many financial and economic time series exhibit non-stationary characteristics. Without treatment of these non-stationary characteristics, both univariate and multivariate analyses of these series may yield incorrect conclusions. Therefore, in numerous studies in both economics and finance, testing for a unit root in the time series is usually the first step before conducting the econometric analysis. The unit root testing procedure was first introduced by Dickey and Fuller (1979) and Dickey and Fuller (1981). Afterwards, many different unit root tests have been devised in the literature. Except for a few studies, these unit root tests are overwhelmingly constructed in the time domain. However, conclusions drawn from these tests remain controversial in many cases due to the low power of the tests in near unit root cases and severe size distortions, especially in the case of a large negative moving average (MA) root.
Even before the introduction of unit root testing, Granger (1966) pointed out that most economic time series have a spectral density characterized by significant power at low frequencies followed by an exponential decline at higher frequencies, especially in trending series. This observation implies that the variance of a unit root process mostly originates from the low frequencies. Capitalizing on this notion, Fan and Gencay (2010) developed a wavelet based unit root testing procedure. Using a wavelet spectrum, the contribution of each frequency to the overall variance can be decomposed, and therefore it is straightforward to construct a wavelet based unit root testing procedure. Fan and Gencay (2010) rely on the discrete wavelet transformation (DWT) to extract the most persistent component of the time series, called the scaling (approximation) coefficients, and use these coefficients, particularly the ratio of the variance from the unit scale to the total variance of the time series, to build their test statistics. Even though Fan and Gencay's (2010) unit root test enjoys considerable power, it suffers from size distortions when the MA part of the errors has a large negative root. Trokić (2016) improves upon Fan and Gencay's (2010) unit root test by constructing a nonparametric testing procedure and shows that size distortions can be treated by using a bootstrap-like procedure called wavestrapping. These two tests are currently the only wavelet based unit root tests in the literature.
Following the same logic behind the Fan and Gencay (2010) and Trokić (2016) unit root testing procedures, we propose wavelet based versions of the Dickey and Fuller (1981) and Ng and Perron (2001) tests. We use generalized least squares (GLS) detrending to remove the deterministic components from the observed data. As wavelet filtering does not alter the nature of a linear time series process, our wavelet based tests share the same asymptotic distributions as the original tests. Using Monte Carlo simulations, we evaluate the size and power properties of our tests against those of Fan and Gencay (2010) and Trokić (2016). In these simulations, we consider the Daubechies and Symlet filter families since the developed methodology is compatible with compactly supported wavelets. Among these filters, Daubechies filters are the compactly supported filters that have the maximum number of vanishing moments. Furthermore, Symlet filters are obtained by increasing the symmetry of Daubechies filters.
Our results show that the newly proposed unit root tests have smaller size distortions in sample than the tests of Fan and Gencay (2010) and Trokić (2016), without relying on a bootstrap routine. The power performance of the tests indicates that there is no single dominating test. Moreover, for medium length filters (filter length of 2 or 4), the type of wavelet does not alter the results drastically.
The rest of the paper is organized as follows. Section 2 introduces the wavelet theory. Section 3 explains our wavelet based tests as well as Fan and Gencay's (2010) and Trokić's (2016) methods. Section 4 presents Monte Carlo simulation results, Section 5 provides the conclusions, and Appendix A presents proofs of the theorems and the lemmas. All limits in the paper are taken as T → ∞, → denotes weak convergence in distribution, and [x] denotes the closest integer to x.

Wavelet Transform
Recently, wavelet filters have become frequently used tools in unit root and cointegration studies. In these studies, the authors utilize the fact that wavelet filters can operate in both the time and frequency domains. This feature helps wavelets capture nonstationarity across a wide range of frequencies (Fan and Gencay (2010)), which makes the wavelet transform a proper instrument for unit root and cointegration testing. Accordingly, we utilize wavelet methods for the construction of the new unit root tests. First, we briefly introduce the wavelet transformation. This section and the notation used in this paper mostly follow Fan and Gencay (2010) and Eroglu (2018).
A wavelet, $\psi(t)$, is a real-valued function oscillating in a finite domain with the following basic properties:
$$\int_{-\infty}^{\infty} \psi^2(t)\,dt = 1, \qquad \int_{-\infty}^{\infty} \psi(t)\,dt = 0.$$
The first property implies that a wavelet function must take non-zero values in a finite time period and the second property indicates that all the departures from zero should cancel out (Gençay et al. 2001). Using the function $\psi(t)$, we can design the continuous time wavelet transform (CWT) of a time series $x(t)$ as follows:
$$W(u, s) = \int_{-\infty}^{\infty} x(t)\,\psi_{u,s}(t)\,dt, \qquad \psi_{u,s}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right),$$
where $\psi_{u,s}(t)$ is $\psi(t)$ translated by $u$ and dilated by $s$. Note that $W(u, s)$ is called the wavelet coefficient in this transformation. Additionally, the parameter $s \in \mathbb{R}^+$ allows wavelets to work at different frequencies. However, the CWT has an important shortcoming: it is almost impossible to analyse all wavelet coefficients for all frequencies. Furthermore, in the CWT, the wavelet coefficients form a redundant transformation of the time series data. Hence, the CWT is not very appropriate for unit root testing. Nevertheless, wavelet theory is equipped with many other transformations that can solve the problems of the CWT, such as the DWT, the maximum overlap discrete wavelet transform, and the discrete wavelet packet transform. Among these techniques, the DWT, which shares the fundamental properties of the CWT, creates a non-redundant decomposition with a finite number of frequencies. Consequently, the DWT is a more suitable instrument for our study.
The DWT can be defined with two separate filters. The first filter $h = (h_0, h_1, \ldots, h_{L-1})$ is called the discrete wavelet (or high pass) filter with a finite length $L$, where $h_l$ corresponds to a filter coefficient for all $l = 0, \ldots, L-1$. The high pass filters satisfy the zero sum condition, $\sum_{l=0}^{L-1} h_l = 0$, and have unit energy, $\sum_{l=0}^{L-1} h_l^2 = 1$, as do the CWT filters. The high pass filter does not provide the full analysis of the observed series. However, we also have a complementary filter $g$ (the low pass filter). The low pass filter $g$ can be obtained by the quadrature mirror relationship. 1 Unlike the high pass filter, the low pass filter coefficients sum to $\sqrt{2}$, $\sum_{l=0}^{L-1} g_l = \sqrt{2}$, but they also have unit energy, $\sum_{l=0}^{L-1} g_l^2 = 1$. Using the convolution of the observed series with the filters defined above, we transform the time series process into its high frequency and low frequency components. Let $\{x_t\}_{t=1}^{T}$ be the observed time series process with dyadic length $T = 2^J$ for some integer $J$. Then, the matrix of the DWT coefficients can be defined as $W^L = (W^L_1, W^L_2, \ldots, W^L_J, V^L_J)$, where, for $j = 1, 2, \ldots, J$, $W^L_j$ is the column vector of $j$-th level wavelet coefficients and $V^L_J$ is the column vector of $J$-th level scaling (approximation) coefficients. In this decomposition, the approximation coefficients $V^L_J$ explain the fluctuations of $x_t$ on the scale $2^J$ (the largest scale among all the coefficients) and the wavelet coefficients $W^L_j$ are associated with the changes on the scale $2^{j-1}$. Note that scale and frequency are inversely proportional. As a result, $V^L_J$ captures the lowest frequency and $W^L_1$ captures the highest frequency components of the transformed series. Additionally, the approximation coefficient vector $V^L_J$ has a length of $T/2^J$ and $W^L_j$ has a length of $T/2^j$ for each $j = 1, 2, \ldots, J$.
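As a quick numerical check of the filter properties above, the following minimal sketch builds the low pass filter from a high pass filter through the quadrature mirror relationship $g_l = (-1)^{l+1} h_{L-1-l}$ and evaluates the sum and energy conditions. The Haar high pass coefficients and the helper name `low_pass_from_high_pass` are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def low_pass_from_high_pass(h):
    """Quadrature mirror relationship: g_l = (-1)**(l + 1) * h_{L-1-l}."""
    h = np.asarray(h, dtype=float)
    L = len(h)
    signs = np.array([(-1.0) ** (l + 1) for l in range(L)])
    return signs * h[::-1]          # h[::-1][l] equals h[L - 1 - l]

# Haar high pass filter (L = 2), used here only as an illustrative choice.
h_haar = np.array([1.0, -1.0]) / np.sqrt(2.0)
g_haar = low_pass_from_high_pass(h_haar)

# High pass: zero sum and unit energy.
print(h_haar.sum(), (h_haar ** 2).sum())
# Low pass: coefficients sum to sqrt(2) and have unit energy.
print(g_haar.sum(), (g_haar ** 2).sum())
```

The same helper can be applied to longer compactly supported filters, provided the high pass coefficients are supplied in the ordering that the relationship above assumes.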
In practice, the wavelet and the approximation coefficients for levels higher than 1 can be obtained by the pyramid algorithm, which was first proposed by Mallat (1989). However, in this study, we focus on the first level wavelet transformation. We can obtain this transformation as follows:
$$W^L_{1,t} = \sum_{l=0}^{L-1} h_l\, x_{2t-l \bmod T}, \qquad V^L_{1,t} = \sum_{l=0}^{L-1} g_l\, x_{2t-l \bmod T}, \qquad t = 1, \ldots, T/2, \qquad (1)$$
where the filtering is carried out by the convolution of the observed series with the high pass and low pass filters. In the construction of our test statistic, we only use the first level approximation coefficients of the observed time series process, $V^L_{1,t}$. Notice that $V^L_{1,t}$ corresponds to the lowest frequency data in the level 1 decomposition. In this regard, we separate the data from the high frequency components that contain short term fluctuations. As indicated by Fan and Gencay (2010), Trokić (2016) and Eroglu (2018), this separation also filters out the problematic short run dynamics in the process, such as innovations of the observed series with highly negative MA roots. Accordingly, the wavelet transform helps us to remove some problematic issues before the testing stage. In the literature, there are other variants of the wavelet transformation such as the maximum overlap discrete wavelet transform and the discrete wavelet packet transform. In simulations, we also utilize the maximum overlap discrete wavelet transform; however, the DWT has better performance overall, so we drop the maximum overlap discrete wavelet transform for brevity. 2 Another issue worth considering is the performance of higher level wavelet transformations. For instance, Trokić (2016) utilizes higher level transformations up to the 3rd level, but he achieves the best results in terms of power with the first level DWT, while the higher level DWT yields slight size improvements in the testing.
1 The quadrature mirror relationship can be characterized by: $g_l = (-1)^{l+1} h_{L-1-l}$ for $l = 0, \ldots, L-1$ (Fan and Gencay 2010).
2 The results for the maximum overlap discrete wavelet transform are available upon request.
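The first level transformation described in this section can be sketched as follows. This is a minimal illustration, assuming circular (mod T) indexing of the form $x_{2t-l \bmod T}$ for $t = 1, \ldots, T/2$ and the Haar filter pair; the function name `dwt_level1` and the toy series are hypothetical and only serve the example.

```python
import numpy as np

def dwt_level1(x, h, g):
    """First level DWT by circular filtering and downsampling by two.

    For t = 1, ..., T/2 (1-based time), computes
        W1_t = sum_l h[l] * x_{2t - l mod T}
        V1_t = sum_l g[l] * x_{2t - l mod T}
    and returns the two arrays of length T/2.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    if T % 2 != 0:
        raise ValueError("the series length T must be even")
    W1 = np.zeros(T // 2)
    V1 = np.zeros(T // 2)
    for t in range(1, T // 2 + 1):
        for l in range(len(h)):
            # x_{2t-l} in 1-based time is stored at position 2t - l - 1
            idx = (2 * t - l - 1) % T
            W1[t - 1] += h[l] * x[idx]
            V1[t - 1] += g[l] * x[idx]
    return W1, V1

# Haar filters (assumed for illustration).
h = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high pass
g = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low pass

x = np.array([1.0, 2.0, 3.0, 4.0])
W1, V1 = dwt_level1(x, h, g)
# The transform preserves energy: sum(x**2) == sum(W1**2) + sum(V1**2)
print(np.sum(x ** 2), np.sum(W1 ** 2) + np.sum(V1 ** 2))
```

Only `V1` would be carried forward to the testing stage, since the tests in this paper are built on the first level approximation coefficients.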

Regression Based Wavelet Unit Root Tests
We consider a basic unit root model:
$$x_t = \mu_t'\gamma + y_t, \qquad (2)$$
$$(1 - \rho B)\,y_t = u_t, \qquad (3)$$
where $\mu_t$ captures the deterministic component with coefficient vector $\gamma$, $y_t$ is the stochastic part of the observed series, $B$ denotes the back-shift or lag operator and the parameter $\rho$ governs the unit root process, where we assume $|\rho| \le 1$. For brevity, we only consider three scenarios for the deterministic component. We index these cases with the letter $j$. $j = 0$ indicates no deterministic component in the observed series, thus $\mu_t = 0$ for all $t$. When $j = 1$, we assume a mean, i.e., $\mu_t = 1$ for all $t$ and, when $j = 2$, we assume a mean and trend such that $\mu_t = (1, t)'$. As in classical unit root testing, we first need to remove the deterministic trends from the observed series. Otherwise, these components introduce nuisance parameters in the asymptotic distribution of the test statistics. In order to eliminate these nuisance parameters, we apply a GLS detrending algorithm to the observed series. To obtain the GLS detrended series, we first employ quasi-differencing on the observed series $x_t$ and $\mu_t$ with some positive constant $\bar c$, which is the quasi-differencing parameter. The quasi-differencing algorithm is as follows:
$$x_{\bar c,t} = x_t - (1 - \bar c/T)\,x_{t-1}, \qquad \mu_{\bar c,t} = \mu_t - (1 - \bar c/T)\,\mu_{t-1},$$
where $x_{\bar c,0} = x_0$ and $\mu_{\bar c,0} = \mu_0$. Nielsen (2009) demonstrates the GLS detrended series as:
$$\hat x_{\bar c,t} = x_t - \mu_t'\hat\gamma_{\bar c}, \qquad \hat\gamma_{\bar c} = \Big(\sum_{t} \mu_{\bar c,t}\,\mu_{\bar c,t}'\Big)^{-1} \sum_{t} \mu_{\bar c,t}\,x_{\bar c,t}.$$
After obtaining the GLS detrended series, we apply the first level wavelet transform with filter length $L$ to these series:
$$\hat V^L_{\bar c,1,t} = G(B)\,\hat x_{\bar c,2t}. \qquad (4)$$
For simplicity, we first assume $\mu_t = 0$. Notice that we can apply Equation (1) on $y_t$ to obtain:
$$\hat V_{\bar c,1,t} = \sum_{l=0}^{L-1} g_l\, y_{2t-l} = G(B)\,y_{2t},$$
where we drop the mod $T$ and $L$ notation for brevity and $G(B) = g_0 + g_1 B + \cdots + g_{L-1} B^{L-1}$. Now, consider $y_{2t} = \rho^2 y_{2t-2} + u_{2t} + \rho u_{2t-1} = \rho^2 y_{2t-2} + (1 + \rho B)u_{2t}$. Using this result, we can write:
$$G(B)\,y_{2t} = \rho^2\,G(B)\,y_{2t-2} + G(B)(1 + \rho B)\,u_{2t}.$$
In addition, note that $\hat V_{\bar c,1,t-1} = G(B)y_{2t-2}$; then, we can conclude that $\hat V_{\bar c,1,t} = \rho^2 \hat V_{\bar c,1,t-1} + G(B)(1 + \rho B)u_{2t}$.
This result implies that, if $y_t$ follows a unit root process, then $\hat V_{\bar c,1,t}$ also follows a unit root process, but the innovation structure of the wavelet transformed series carries further MA roots. However, these additional MA roots do not alter the stationarity of the innovation terms. Accordingly, writing $v_t = G(B)(1 + \rho B)u_{2t}$ for these innovations, we can claim that $v_t$ admits a stationary Wold decomposition
$$\alpha(B)\,v_t = \varepsilon^*_t, \qquad \alpha(z) = 1 - \sum_{k=1}^{\infty} \alpha_k z^k,$$
where $\varepsilon^*_t$ is an i.i.d. random variable. From Chang and Park (2002), we can approximate $v_t$ as a finite order autoregressive (AR) process:
$$v_t = \sum_{k=1}^{p} \alpha_k\, v_{t-k} + \varepsilon^*_{p,t},$$
where $\varepsilon^*_{p,t} = \varepsilon^*_t + \sum_{k=p+1}^{\infty} \alpha_k v_{t-k}$. We can use the following assumption from Chang and Park (2002) for the new innovations:
Assumption 1. Let $(\varepsilon_t, \mathcal{F}_t)$ be a martingale difference sequence, with some filtration $(\mathcal{F}_t)$, such that (a) $E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2$ and (b) $E|\varepsilon_t|^r < K$ with $r \ge 4$, where $K$ is a constant depending only on $r$.
Remark 1. Assumption 1 indicates that the innovation process $\varepsilon_t$ admits a stationary Wold decomposition. On the other hand, with simple algebra, it is possible to show that the innovations of the filtered $y_t$, say $\varepsilon^*_t$, also follow a stationary Wold decomposition. Accordingly, we can rewrite Assumption 1 for $\varepsilon^*_t$ as:
Assumption 2. Let $\alpha(z) \neq 0$ for all $|z| \le 1$, and $\sum_{k=0}^{\infty} |k|^s |\alpha_k| < \infty$ for some $s \ge 1$.
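The GLS detrending step described earlier in this section can be sketched as follows. This is a minimal illustration, assuming the usual quasi-differencing with $1 - \bar c/T$, the first observation kept in levels, and an OLS fit of the quasi-differenced series on the quasi-differenced deterministic terms. The function name `gls_detrend`, the `case` argument (mirroring the index $j$) and the value `c_bar=7.0` are hypothetical choices for the example and are not the values used in the paper.

```python
import numpy as np

def gls_detrend(x, c_bar, case=1):
    """GLS demeaning (case=1) or detrending (case=2); case=0 returns x unchanged."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    if case == 0:
        return x.copy()
    trend = np.arange(1, T + 1, dtype=float)
    if case == 1:
        mu = np.ones((T, 1))
    else:
        mu = np.column_stack([np.ones(T), trend])
    a = 1.0 - c_bar / T
    # Quasi-difference the data and the deterministic terms,
    # leaving the first observation in levels.
    x_qd = np.concatenate([x[:1], x[1:] - a * x[:-1]])
    mu_qd = np.vstack([mu[:1], mu[1:] - a * mu[:-1]])
    # Regress quasi-differenced data on quasi-differenced deterministics.
    gamma_hat, *_ = np.linalg.lstsq(mu_qd, x_qd, rcond=None)
    # Remove the fitted deterministic component from the original series.
    return x - mu @ gamma_hat

# Toy check: a series that is exactly a linear trend is removed completely.
t = np.arange(1, 51, dtype=float)
x_trend = 2.0 + 3.0 * t
resid = gls_detrend(x_trend, c_bar=7.0, case=2)
print(np.max(np.abs(resid)))
```

The detrended output would then be passed to the first level low pass filter, in line with Equation (4).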
Before presenting our theoretical results on a wavelet based unit root test, we review the recent methods that also deal with the unit root problem by utilizing wavelet theory. These recent methods include the contributions of Fan and Gencay (2010) and Trokić (2016). First, Fan and Gencay (2010) propose a unit root test based on the notion of Granger (1981), who argues that detrended time series generally have a peak in their power spectra at low frequencies followed by an exponential decline at higher frequencies. Fan and Gencay (2010) decompose the variance of the observed series into low and high frequency components via the DWT to test for a unit root. More specifically, their unit root test is based on the ratio of the variance of the low pass filtered series to the variance of the observed series.
Fan and Gencay's (2010) unit root test statistics are defined as follows: where $\hat\lambda_v^2 = 4\hat\omega^2$, $\hat\omega^2$ is the long run variance of $u_t$ in Equation (3), and $\hat\lambda_0$ is the estimate of the variance of $\varepsilon_t$. These parameters can be estimated by applying a nonparametric kernel estimator with the Bartlett kernel to the residuals obtained after applying a detrending procedure to $x_t$. We consider GLS detrending for this test in this study.
Trokić (2016) argues that, even though Fan and Gencay's (2010) test enjoys high statistical power, it suffers from severe size distortions in the presence of errors with negative MA roots and follows a parametric approach to correct for the long run variance of the observed series. In this regard, Trokić (2016) tries to improve the Fan and Gencay (2010) test by devising a parameter free unit root test that is more robust to size distortions. Trokić's (2016) test is based on the variance of the scaling coefficients and the variance of their fractionally differenced transform with some order d > 0. The test statistics of Trokić's (2016) unit root test are as follows: where $\tilde V_{\bar c,1,t} = \Delta_+^{-d}\hat V_{\bar c,1,t}$ is the fractional transform of $\hat V_{\bar c,1,t}$ and $\Delta_+^{-d}$ is the fractional differencing operator that can be written for some time series process $\{v_t\}_{t=1}^{T}$ as:
$$\Delta_+^{-d} v_t = \sum_{k=0}^{t-1} \frac{\Gamma(k+d)}{\Gamma(d)\,\Gamma(k+1)}\, v_{t-k}.$$
Note that this operator does not include the prehistoric observations of the time series process $v_t$, and $T_1 = T/2$, since every time we apply wavelet filters to the observed series, we lose half of the sample. Additionally, Trokić (2016) and Nielsen (2009) suggest that the parameter d can be chosen from the interval (0, 1) by the practitioner. While Nielsen (2009) sets d = 0.1 to obtain the best power performance, Trokić (2016) picks d = 0.05. The asymptotic distributions of Fan and Gencay's (2010) and Trokić's (2016) tests can be summarized as follows: where $W_{j,c}(s)$ is defined in Theorem 1 and $W_{j,1+d,c}(s)$ is the fractional Brownian motion that is demonstrated in Nielsen (2009). Although Trokić (2016) and Fan and Gencay (2010) do not explicitly derive the asymptotic results for GLS detrended series, one can easily obtain them by following Nielsen (2009), Fan and Gencay (2010), and Trokić (2016).
Now, we can illustrate our theoretical contribution on wavelet based unit root tests. Under Assumptions 1 and 2, the approximation error becomes small as p becomes large (Chang and Park 2002).
As a result, we can use the following augmented regression for unit root testing:
$$\Delta \hat V_{\bar c,1,t} = \delta\,\hat V_{\bar c,1,t-1} + \sum_{k=1}^{p} \alpha_k\,\Delta \hat V_{\bar c,1,t-k} + \varepsilon^*_{p,t}. \qquad (7)$$
Note that when δ = 0, $\hat V_{\bar c,1,t}$ is a unit root process and if δ < 0, then $\hat V_{\bar c,1,t}$ is a stationary process. We base our unit root test on Equation (7). This equation is similar to the conventional Augmented Dickey-Fuller (ADF) regression, thus we can use a similar procedure. Suppose that we estimate the model in Equation (7) with OLS and obtain the estimates $\hat\delta, \hat\alpha_1, \cdots, \hat\alpha_{p-1}$ and $\hat\alpha_p$. We construct the null hypothesis of a unit root in $x_t$ as $H_0 : \delta = 0$. This hypothesis can be tested with two different t statistics: where $se(\hat\delta)$ is the standard error of the OLS estimator of δ and $\hat\alpha(1) = 1 - \sum_{k=1}^{p} \hat\alpha_k$ in Equation (7). Additionally, we can also construct modified wavelet based Phillips and Perron (1988) tests. These are given as: where $s^{*2}_{AR}(p) = \hat\sigma^2/\hat\alpha(1)^2$ is the spectral AR estimate of the long run variance from the ADF regression in Equation (7). Note that both ADF and M type tests require the selection of the lag length p. We can apply an information criterion based method to select the optimal lag length.
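The augmented regression described above can be sketched with ordinary least squares as follows. This is a minimal illustration, assuming a regression of the first difference of the scaling coefficients on their lagged level and p lagged differences, without an intercept because the series is taken to be already detrended. The function name `adf_regression` and the simulated input are hypothetical; the sketch returns the ingredients ($\hat\delta$, its standard error, $\hat\alpha(1)$ and $\hat\sigma^2$) rather than the paper's exact statistics.

```python
import numpy as np

def adf_regression(V, p):
    """OLS fit of dV_t on V_{t-1} and dV_{t-1}, ..., dV_{t-p} (no intercept)."""
    V = np.asarray(V, dtype=float)
    dV = np.diff(V)                      # dV[i] = V[i + 1] - V[i]
    n = len(dV)
    y = dV[p:]                           # dependent variable
    cols = [V[p:-1]]                     # lagged level aligned with y
    for k in range(1, p + 1):
        cols.append(dV[p - k:n - k])     # k-th lagged difference
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    delta_hat = beta[0]
    se_delta = np.sqrt(cov[0, 0])
    alpha1_hat = 1.0 - beta[1:].sum()    # alpha_hat(1) = 1 - sum of AR estimates
    return delta_hat, se_delta, alpha1_hat, sigma2

# Illustrative input: a simulated random walk standing in for the scaling coefficients.
rng = np.random.default_rng(0)
V = np.cumsum(rng.standard_normal(500))
delta_hat, se_delta, alpha1_hat, sigma2 = adf_regression(V, p=2)
t_ratio = delta_hat / se_delta           # t ratio on the lagged level
s2_ar = sigma2 / alpha1_hat ** 2         # spectral AR long run variance estimate
print(t_ratio, s2_ar)
```

The t ratio and the spectral AR variance estimate are the building blocks mentioned in the text; how they enter the final ADF and M type statistics follows the definitions in the paper.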
Theorem 1. Let Assumptions 1 and 2 hold, then
Theorem 1 shows that the wavelet based tests share the same asymptotic distribution as the classical tests. This result is expected since wavelet filtering does not alter the nature of the linear time series process. Moreover, these results provide two new contributions to the wavelet based unit root testing literature. First, we derive the theoretical results for the GLS detrending mechanism in wavelet based unit root tests. Second, we modify the ADF and Ng and Perron's (2001) tests by utilizing wavelet theory.
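The fractional operator $\Delta_+^{-d}$ that enters Trokić's (2016) statistic earlier in this section can be sketched as follows. This is a minimal illustration, assuming the truncated binomial expansion of $(1 - B)^{-d}$ with coefficients $\Gamma(k+d)/(\Gamma(d)\Gamma(k+1))$, which are generated here by a simple recursion. The function names `frac_weights` and `frac_transform` are hypothetical.

```python
import numpy as np

def frac_weights(d, n):
    """Coefficients pi_k(d) = Gamma(k + d) / (Gamma(d) * Gamma(k + 1)), k = 0..n-1.

    Uses the recursion pi_0 = 1, pi_k = pi_{k-1} * (k - 1 + d) / k.
    """
    w = np.ones(n)
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 + d) / k
    return w

def frac_transform(v, d):
    """Truncated operator: out_t = sum_{k=0}^{t-1} pi_k(d) * v_{t-k}, t = 1..n."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    w = frac_weights(d, n)
    # v[t::-1] lists v_t, v_{t-1}, ..., v_1 (0-based position t down to 0)
    return np.array([np.dot(w[:t + 1], v[t::-1]) for t in range(n)])

v = np.array([1.0, 2.0, 3.0])
# With d = 1 all weights equal one, so the operator is a cumulative sum.
print(frac_transform(v, 1.0))
# A small order such as d = 0.05 gives a mild transformation of the input.
print(frac_transform(v, 0.05))
```

Because the sum stops at k = t - 1, no pre-sample observations are needed, which matches the remark in the text about prehistoric observations.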

Small Sample Properties
In this section, we evaluate the performance of different wavelet based unit root tests by Monte Carlo simulations. In these simulations, we consider five different wavelets, namely, Haar, Db2, Db4, sym2, and sym4. We can categorise these wavelets into two main groups. The first group consists of Daubechies wavelets, which are characterized by a maximal number of vanishing moments. In our exercise, we consider the Daubechies wavelets Db2 and Db4 with lengths 4 and 8, respectively. The second group consists of Symlets, which are modified versions of Daubechies wavelets with increased symmetry. The lengths of the Symlet wavelets sym2 and sym4 are 4 and 8, respectively. Finally, the Haar wavelet, which has a length of 2, is a special type of filter that can be placed in both the Daubechies and Symlet groups.
For simulations, we consider the following data generation process: where $e_t$ are i.i.d. standard normal random variables. Since the coefficient γ is asymptotically irrelevant, we set γ = 0 for all cases. Furthermore, for the size exercise, we set ρ = 1 and for the power exercise we use ρ = 0.99 and 0.9. 5 As we discussed in the previous sections, we compare three different families of wavelet based unit root test statistics. These are Trokić's (2016) variance ratio statistic, Fan and Gencay's (2010) statistic and the wavelet versions of Ng and Perron's (2001) test statistics. To evaluate the small sample and large sample properties, we use sample sizes T = 100 and T = 1000. Moreover, we examine three types of deterministic component adjustments. These are the no deterministic component, mean only, and mean and trend cases.
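A data generation process of this kind can be sketched as follows. This is a minimal illustration, assuming an AR(1) stochastic part with coefficient ρ, MA(1) innovations with coefficient θ driven by i.i.d. standard normal draws, a zero initial value, and γ = 0 by default. The exact form of Equations (13)-(15) is not reproduced here, so the function `simulate_series` and its arguments are hypothetical.

```python
import numpy as np

def simulate_series(T, rho, theta, rng, gamma=0.0):
    """Simulate x_t = gamma + y_t with y_t = rho * y_{t-1} + u_t and
    u_t = e_t + theta * e_{t-1}, e_t i.i.d. N(0, 1), y_0 = 0 (assumed forms)."""
    e = rng.standard_normal(T + 1)
    u = e[1:] + theta * e[:-1]           # MA(1) innovations
    y = np.zeros(T)
    prev = 0.0
    for t in range(T):
        prev = rho * prev + u[t]
        y[t] = prev
    return gamma + y

rng = np.random.default_rng(1)
x_size = simulate_series(T=100, rho=1.0, theta=-0.8, rng=rng)    # size design
x_power = simulate_series(T=100, rho=0.9, theta=-0.8, rng=rng)   # power design
print(x_size[:3], x_power[:3])
```

Setting `rho=1.0` gives a draw under the unit root null, while values such as 0.99 or 0.9 give draws under the alternatives considered in the power exercise.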
The newly proposed wavelet based M type and ADF tests require optimal lag length selection to remove the serial correlation present in the innovation process. In this study, we utilize the modified Akaike information criterion (MAIC) proposed by Ng and Perron (2001). Other information criteria can be considered; however, in our simulation studies, we observe that the best results are obtained with MAIC. Moreover, we also consider the modification of Perron and Qu (2007) for the lag selection procedure. Following Perron and Qu (2007), we utilize OLS instead of GLS detrended data to calculate MAIC, but use GLS detrended data in the testing phase.
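Lag selection by MAIC can be sketched as follows. This is a minimal illustration, assuming the Ng and Perron (2001) form of the criterion, $\ln(\hat\sigma_k^2) + 2(\tau_T(k) + k)/(T - k_{max})$ with $\tau_T(k) = \hat\delta^2 \sum y_{t-1}^2 / \hat\sigma_k^2$, evaluated on a common sample for all candidate lags. The function name `maic_lag` is hypothetical, the input is assumed to be already detrended, and the Perron and Qu (2007) modification is not implemented here.

```python
import numpy as np

def maic_lag(y, kmax):
    """Return the lag k in 0..kmax that minimizes the MAIC of the ADF regression."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    n = len(dy)
    level = y[kmax:-1]                   # lagged level on the common sample
    target = dy[kmax:]                   # dependent variable on the common sample
    m = len(target)
    best_k, best_val = 0, np.inf
    for k in range(kmax + 1):
        cols = [level]
        for j in range(1, k + 1):
            cols.append(dy[kmax - j:n - j])
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        sigma2 = resid @ resid / m
        tau = beta[0] ** 2 * np.sum(level ** 2) / sigma2
        val = np.log(sigma2) + 2.0 * (tau + k) / m
        if val < best_val:
            best_k, best_val = k, val
    return best_k

rng = np.random.default_rng(2)
y_sim = np.cumsum(rng.standard_normal(200))
k_star = maic_lag(y_sim, kmax=8)
print(k_star)
```

In the setting of this paper the input would be the first level scaling coefficients, and the selected lag would be used as p in the augmented regression.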
As mentioned in Section 3, $\bar c$ is used for GLS detrending. This parameter is chosen, for each test, so that at the local alternative ρ = 1 − $\bar c$/T the test attains 50% power with the critical values generated by the same value of $\bar c$. This value can be found for each test statistic by running an expensive grid search. We present the values of this parameter in Table 1. 6
First, we evaluate the size performance of the wavelet based tests with simulated data. In these simulations, we focus on MA(1) innovations for brevity. The MA(1) coefficient θ in Equation (15) is chosen from {0.8, 0, −0.8}. 7 The results of the size exercise can be found in Tables 2 and 3 for sample sizes 100 and 1000, respectively.
First, we discuss the over-size problem with negative MA innovations when T = 100. Almost every test statistic in Table 2 exhibits severe size distortions under this scenario. However, the M type unit root tests can eliminate the problem successfully, while the ADF tests also demonstrate smaller size distortions relative to Trokić's (2016) and Fan and Gencay's (2010) statistics. Additionally, Fan and Gencay's (2010) test statistic seems to suffer the severest size distortion among all statistics. These features also persist in larger samples (see Table 3). When T = 1000, we observe size distortions, but slightly less than those observed in small samples. Another important observation in these tables is that the size distortion problem becomes more severe when we consider deterministic component adjustments, especially in detrending cases. Nonetheless, the M type tests still provide satisfactory size correction even after the detrending procedure. For the no serial correlation (θ = 0) and positive MA innovation (θ = 0.8) cases, we observe that all wavelet based tests are either correctly sized or slightly undersized. For instance, Fan and Gencay's (2010) test is undersized by 0.03% when θ = 0.8 for all deterministic component cases. On the other hand, when θ = 0.8 and we have trend and mean as the deterministic component, the M tests show 0.02% size distortion. Finally, Trokić's (2016) test is the least affected by detrending algorithms in terms of size distortion. Again, these findings are also valid for the large sample size (T = 1000), but with slight improvement as expected.
5 The results for other intermediate values of ρ are available upon request.
6 In the simulation, we observe that, for all tests, the optimal $\bar c$ is very close. As a result, we use the same $\bar c$ for all tests. A similar approach is adopted by Ng and Perron (2001). The critical values for other significance levels are available upon request.
7 The simulations can be conducted under different ARMA innovations. These results are available upon request. Since they do not alter the findings, we skip them for brevity.
In another exercise, we compare the size performances of standard and wavelet based tests. In this exercise, we only consider GLS demeaned statistics with sample sizes T = 100 and 1000 for brevity and space constraints. Moreover, we utilize the same serial correlation scenarios as in the previous exercises. The results for this exercise can be found in Table 4. In this table, when T = 100 and θ = −0.8, standard tests are undersized and the wavelet based tests are oversized, but the size distortions are almost the same. However, when θ = 0.8, the wavelet based tests are much more successful than the standard tests. Although this result seems controversial, we know that Ng and Perron's (2001) M tests are quite successful without further modification. Additionally, the wavelet modification engenders better results relative to the standard ADF tests, especially ADF α . In the large sample case, all tests perform similarly, as expected. As a result, there is no single winner in the size contest for small samples. Moreover, we can attribute the difference that appears between the standard and wavelet M tests to the fact that wavelet based tests effectively utilize half of the sample. We expect this difference to be eliminated in moderate sample sizes.
In the current literature, GLS is generally preferred to OLS for demeaning and detrending series. Therefore, we also use GLS demeaning and detrending in our study. However, we also conduct a small simulation to compare the results of GLS and OLS in the case of demeaning with sample size T = 100. We use Haar, Db2, and sym2 as they usually perform quite well in our simulations. The results of this simulation are shown in Table 5. For θ = 0 and 0.8, tests based on OLS demeaning are significantly undersized and are clearly worse than their GLS demeaning based counterparts. For the negative MA root case, the tests are oversized except Trokić's (2016) test, and tests based on GLS demeaning have slightly better sizes than those based on OLS demeaning except Trokić's (2016) test and the $ADF^*_t$ test.
Finally, we present the size properties of the tests when different lengths of wavelets are selected. For the case of T = 100 and GLS demeaning, Figure 1 shows the sizes of the tests with wavelet lengths between 2 and 16 for θ = −0.8, 0, and 0.8, respectively. The results clearly show that, for θ = 0 and 0.8, when the wavelet length increases over 8, the tests become significantly oversized. For θ = −0.8, the sizes of the tests do not change much with the wavelet length. These results show that tests based on smaller wavelet lengths have better size properties.
Figure 1. The size comparison of tests with various wavelet lengths with sample size T = 100. Note: tvr, ram, mzaw, mztw, msbw, adfaw and adftw correspond to $\tau^*$, FG, $MZ^*_\alpha$, $MZ^*_t$, $MSB^*$, $ADF^*_\alpha$, and $ADF^*_t$, respectively.

The Size-Adjusted Power Performance of the Wavelet Based Tests
In this part, we investigate the size-adjusted power properties of the wavelet based tests. We use the model in Equations (13)-(15). As in the size exercise, we utilize the same data generation and detrending algorithms, but we set ρ to 0.99 and 0.9. The results for the size-adjusted power performance of the wavelet based unit root tests are summarized in Tables 6-8. These tables demonstrate a few interesting findings. First, Fan and Gencay's (2010) test suffers extreme power loss when θ = −0.8 and T = 100. We cannot observe a conventional power curve for this test since the power is decreasing with increasing values of ρ. This result is surprising in the unit root literature. The detrending or demeaning algorithm does not alter this conclusion, but a larger sample size approximately corrects this distortion. On the other hand, the other tests still maintain conventional power performance. Second, detrending or demeaning slightly reduces the power of the tests for both small and large samples. Third, the tests show similar power performance in the no serial correlation case. However, we observe slightly worse power for Trokić's (2016) test than for the other tests when θ = 0.8. Finally, when we compare the M tests and the ADF tests, the ADF tests exhibit better performance than the M tests in almost all cases.
These findings imply that there is no single dominant test in terms of size and size-adjusted power. While the M and ADF tests engender better size correction in problematic cases, Trokić's (2016) test generates more stable power properties. Moreover, the type of wavelet filter (being from the family of Daubechies or Symlets) does not matter in terms of size or size-adjusted power.
In the last two Monte Carlo exercises, we evaluate the large sample properties of the wavelet based and standard unit root tests under GLS demeaning. 8 First, we examine the asymptotic behaviour of the wavelet based test with different wavelet filters and lengths. In this exercise, we only consider the asymptotic power properties of the $MZ^*_\alpha$ test and seven different wavelet filters, namely Haar, Db2, Db4, Db8, sym2, sym4 and sym8. These results, which are generated under no serial correlation and sample size 1000, are presented in Figure 2. From this figure, it is clear that wavelet type and length do not matter asymptotically.
In another exercise, we compare the asymptotic power curves of the GLS demeaned standard and wavelet based tests. Among these tests, we consider $\tau^*$, FG, $MZ^*_\alpha$ and $MZ^*_t$ as the wavelet based tests, and $\tau$, $MZ_\alpha$ and $MZ_t$ as the standard unit root tests. The results of the simulations, which are run with no serial correlation and sample size 1000, are given in Figure 3. The findings are twofold: (1) Nielsen's (2009) test and its wavelet version are almost asymptotically equivalent; and (2) there are very slight deviations in the other tests. However, increasing the sample size further may eliminate the difference. On the other hand, the figure illustrates that the most powerful tests are the M tests, the second rank belongs to Fan and Gencay's (2010) test and the least powerful is Nielsen's (2009) test.
8 We also consider GLS detrending, but, for space considerations, we do not present the results. They are available from the authors upon request.
Figure 3. Asymptotic power curves of the wavelet based and standard unit root tests under GLS demeaning. Note: tvr, ram, mzaw and mztw correspond to $\tau^*$, FG, $MZ^*_\alpha$ and $MZ^*_t$, respectively. nvr, mza and mzt correspond to $\tau$, $MZ_\alpha$ and $MZ_t$, which are standard unit root tests without the wavelet application, respectively.
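The size-adjusted power calculation used throughout this section can be sketched as follows. This is a minimal illustration, assuming a left-tailed test (the null is rejected for small values of the statistic) and assuming that arrays of simulated statistics under the null and under the alternative are already available. The function name `size_adjusted_power` and the toy arrays are hypothetical and do not reproduce any result in the tables.

```python
import numpy as np

def size_adjusted_power(stats_null, stats_alt, level=0.05):
    """Empirical critical value from the null draws, then the rejection
    frequency of the alternative draws at that critical value."""
    stats_null = np.asarray(stats_null, dtype=float)
    stats_alt = np.asarray(stats_alt, dtype=float)
    crit = np.quantile(stats_null, level)        # left-tail critical value
    power = np.mean(stats_alt < crit)
    return crit, power

# Toy arrays standing in for Monte Carlo output.
stats_null = np.arange(100, dtype=float)
stats_alt = np.arange(-10, 10, dtype=float)
crit, power = size_adjusted_power(stats_null, stats_alt, level=0.05)
print(crit, power)
```

Using the empirical null quantile rather than an asymptotic critical value means that a test with size distortions is compared with the others at the same actual rejection rate under the null.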

Conclusions
In this study, we extend the results of Fan and Gencay (2010) in Ng and Perron's (2001) framework and we provide an analysis of the application of GLS detrending in the wavelet framework.
As a result of our comparison exercise, relative to existing wavelet based unit root tests, the newly proposed tests seem to be more robust to problematic innovation structures such as negative MA roots. Although all tests suffer size distortions from the presence of negative MA innovations, the M type tests in particular are almost correctly sized. Furthermore, our tests also exhibit local power, while there is no single test that dominates the power performance contest.
We also show that the wavelet type does not matter in unit root testing. However, using longer filters may distort the performance of wavelet based tests. Accordingly, we suggest wavelets of length 2 or 4 for wavelet based unit root tests.
For future work, we also consider a wavelet based Johansen cointegration test using a similar methodology. Recently, Eroglu (2018)
where $W(t)$ is a standard Brownian motion and $\phi(1)$ is the long run variance of $u_t$.
The proof of this lemma can be found in Trokić (2016) and Fan and Gencay (2010).
Lemma A2. Suppose that Assumptions 1-2 hold and $x_t$ is generated by Equations (2) and (3). Let $\hat V_{\bar c,1,t}$ be defined in Equation (4). The partial sum process of $\hat V_{\bar c,1,t}$ satisfies the following properties: where $W_{j,c}(s)$ is defined in Theorem 1.
Proof of Lemma A3. The proof of this lemma can be obtained from the consistency of $\hat\alpha(1)$, which is demonstrated in Lemma 3.5 of Chang and Park (2002), and the results of Lemma A1. First, note that $\hat\alpha(1) \to \alpha(1)$, thus $1/\hat\alpha(1) \to 1/\alpha(1)$. Additionally, $\hat\sigma^2$ is a consistent estimator of the variance of $\varepsilon^*_{p,t}$. However, from Fan and Gencay (2010) and Trokić (2016), we know the long run variance of $v_t$ is given as $2\sigma\phi(1)$, and then we obtain the result from the Continuous Mapping Theorem (CMT) since we also have $\hat\sigma^2 \to \sigma^{*2}$.
Proof of Theorem 1. The proof of results for the ADF test based on wavelet transformed series directly follows Chang and Park (2002). Note that the wavelet based augmented regression satisfies the same conditions as the classical ADF regression. As a result, we can use Lemmas A2 and A3 to obtain the results. The proof is the same as in Chang and Park (2002), and thus we skip the details.
The results for the wavelet based M tests follow from Lemmas A2 and A3. We simply apply CMT to reach the desired outcome.