A Note on the Asymptotic Normality of the Kernel Deconvolution Density Estimator with Logarithmic Chi-Square Noise

Zu, Yang

doi:10.3390/econometrics3030561

Open Access Short Note

Department of Economics, City University London, Northampton Square, EC1V 0HB London, UK

Econometrics 2015, 3(3), 561-576; https://doi.org/10.3390/econometrics3030561

Submission received: 29 April 2015 / Accepted: 17 July 2015 / Published: 21 July 2015

Abstract

This paper studies the asymptotic normality of the kernel deconvolution estimator when the noise distribution is logarithmic chi-square; both independent and identically distributed observations and strong mixing observations are considered. The dependent case of the result is applied to obtain the pointwise asymptotic distribution of the deconvolution volatility density estimator in discrete-time stochastic volatility models.

Keywords:

kernel deconvolution estimator; asymptotic normality; volatility density estimation

JEL classifications:

C13; C22; C46; C58

1. Introduction

Consider the measurement error model:

Y = X + ε,

where X is the signal, while ε is the noise. Assume X is independent of ε; X has density

f_{X}

, and ε has density k, so the density of Y, denoted as

f_{Y}

, is the convolution of

f_{X}

and k:

f_{Y} = f_{X} * k,

where the * denotes convolution.

Assume we observe the realizations

Y_{1}, ..., Y_{n}

of Y and that the function k is fully known; one possible estimator for

f_{X}

from the noisy observations

Y_{1}, ..., Y_{n}

is the kernel deconvolution estimator:

{\hat{f}}_{X} (x) = \frac{1}{2 π} \int_{- \infty}^{+ \infty} e^{- i t x} \frac{ϕ_{K} (t h) \hat{ϕ_{f_{Y}}} (t)}{ϕ_{k} (t)} d t,

(1)

where:

\hat{ϕ_{f_{Y}}} (t) = \frac{1}{n} \sum_{j = 1}^{n} e^{i t Y_{j}},

is the empirical characteristic function of density

f_{Y}

,

K (x)

is a kernel function,

ϕ_{K}

and

ϕ_{k}

are the Fourier transforms of K and k, respectively.¹ The kernel deconvolution estimator was first proposed for the measurement error model by Carroll and Hall [1] and Stefanski and Carroll [2].

Define the kernel deconvolution function as follows:

ν_{h} (x) : = \frac{1}{2 π} \int_{- \infty}^{+ \infty} \frac{ϕ_{K} (t)}{ϕ_{k} (t / h)} e^{- i t x} d t;

the kernel deconvolution estimator can be written compactly as:

{\hat{f}}_{X} (x) = \frac{1}{n h} \sum_{j = 1}^{n} ν_{h} (\frac{x - Y_{j}}{h}) .

(2)
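The estimator in (1) can be evaluated numerically. The following is a minimal sketch, not the author's implementation: it assumes the sinc kernel of Section 3 (so the integral is truncated to |t| ⩽ 1/h), uses the characteristic function of the logarithmic chi-square distribution from Section 2, and approximates the complex gamma function with a standard Lanczos series. The function names `cgamma`, `phi_k`, `deconv_density` and `simulate`, the standard normal signal in `simulate`, and the trapezoid grid size are all illustrative assumptions.

```python
import cmath
import math
import random

# Lanczos coefficients (g = 7, 9 terms): a standard approximation of Gamma(z).
_G = 7
_P = [0.99999999999980993, 676.5203681218851, -1259.1392167224028,
      771.32342877765313, -176.61502916214059, 12.507343278686905,
      -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7]


def cgamma(z):
    """Gamma function for complex z with Re(z) >= 0.5 (Lanczos approximation)."""
    z = z - 1
    s = _P[0]
    for i in range(1, _G + 2):
        s += _P[i] / (z + i)
    t = z + _G + 0.5
    return math.sqrt(2 * math.pi) * t ** (z + 0.5) * cmath.exp(-t) * s


def phi_k(t):
    """Characteristic function of log chi-square(1): 2^{it} Gamma(1/2 + it) / sqrt(pi)."""
    return cmath.exp(1j * t * math.log(2)) * cgamma(0.5 + 1j * t) / math.sqrt(math.pi)


def deconv_density(x, ys, h, n_grid=200):
    """Sinc-kernel deconvolution estimate at x.

    The integrand at -t is the complex conjugate of the integrand at t, so the
    integral over [-1/h, 1/h] equals (1/pi) times the integral over [0, 1/h]
    of the real part.
    """
    n = len(ys)
    dt = (1.0 / h) / n_grid
    total = 0.0
    for m in range(n_grid + 1):
        t = m * dt
        ecf = sum(cmath.exp(1j * t * y) for y in ys) / n  # empirical char. function
        val = (cmath.exp(-1j * t * x) * ecf / phi_k(t)).real
        weight = 0.5 if m in (0, n_grid) else 1.0         # trapezoid rule
        total += weight * val * dt
    return total / math.pi


def simulate(n, seed=0):
    """Hypothetical data: X ~ N(0, 1) plus independent log chi-square(1) noise."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) + math.log(rng.gauss(0, 1) ** 2) for _ in range(n)]
```

For example, `deconv_density(0.0, simulate(2000), h=0.5)` returns an estimate of the smoothed signal density at zero. Because 1/|ϕ_k(t)| grows exponentially in |t|, small bandwidths make the estimate very noisy, which is the behaviour the variance formula of Section 3 quantifies.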

In this paper, I show the asymptotic normality for the estimator

{\hat{f}}_{X} (x)

when the distribution of ε is logarithmic chi-square. The asymptotic distribution of the kernel deconvolution estimator has been considered in Fan [3], Fan and Liu [4], Van Es and Uh [5] and Van Es and Uh [6] for independent and identically distributed (i.i.d.) observations. Masry [7] and Kulik [8] consider various cases of weakly dependent observations. However, none of the above research allows the error distribution to be the logarithmic chi-square distribution. I consider both i.i.d. observations and strong mixing observations in this paper, which complements the above-mentioned literature.

The results obtained in this paper can be applied to obtain the asymptotic distribution of the deconvolution volatility density estimator. The problem of estimating the volatility density has been gaining increasing interest in econometrics in recent years; see, e.g., Van Es, Spreij, and Van Zanten [9] and Van Es, Spreij, and Van Zanten [10] for the kernel deconvolution estimator, Comte and Genon-Catalot [11] for the penalized projection estimator and Todorov and Tauchen [12] for a study in the context of high-frequency data. Kernel deconvolution with logarithmic chi-square noise arises naturally when estimating the volatility density in stochastic volatility (SV) models. Existing research (e.g., Van Es, Spreij, and Van Zanten [9] and Van Es, Spreij, and Van Zanten [10]) focuses on the convergence rates of the estimators, and the asymptotic distribution of the estimators is not available.

In Section 2, I review the probabilistic properties of the logarithmic chi-square distribution; Section 3 presents the asymptotic normality of the estimator, for both i.i.d. observations and dependent observations; Section 4 discusses the application of the results to volatility density estimation in SV models; Section 5 concludes the paper.

2. Logarithmic Chi-Square Distribution

The logarithmic chi-square distribution is obtained by taking the logarithm of a chi-square random variable with one degree of freedom. The density function of the logarithmic chi-square distribution is:

k (x) = \frac{1}{\sqrt{2 π}} e^{\frac{1}{2} x} e^{- \frac{1}{2} e^{x}} .

The density function of the logarithmic chi-square distribution is asymmetric and is plotted in Figure 1.

Figure 1. Density function of the logarithmic chi-square distribution.
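As a quick sanity check on the density above, the sketch below integrates it numerically and computes its mean. The helper names and the integration range [-40, 10] are illustrative choices; the reference value for the mean, -γ - log 2 ≈ -1.2704 with γ the Euler-Mascheroni constant, is the standard expression for the expectation of the logarithm of a chi-square variable with one degree of freedom and is not taken from this paper.

```python
import math


def log_chi2_density(x):
    """Density of log(chi-square with one degree of freedom) at x."""
    return math.exp(0.5 * x - 0.5 * math.exp(x)) / math.sqrt(2 * math.pi)


def trapezoid(func, a, b, steps):
    """Composite trapezoid rule for the integral of func over [a, b]."""
    width = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for m in range(1, steps):
        total += func(a + m * width)
    return total * width


# The density is negligible outside [-40, 10], so the truncation error is tiny.
total_mass = trapezoid(log_chi2_density, -40.0, 10.0, 5000)
mean = trapezoid(lambda x: x * log_chi2_density(x), -40.0, 10.0, 5000)
```

The mode of the density is at zero while the mean is negative, which is one way to see the asymmetry (a long left tail) mentioned above.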

The characteristic function of the logarithmic chi-square distribution is:

ϕ_{k} (t) = \frac{1}{\sqrt{π}} 2^{i t} Γ (\frac{1}{2} + i t),

where

Γ (.)

is the gamma function.

Fan [3] studies the quadratic mean convergence rate of the kernel deconvolution estimator; it turns out that the convergence rate of the estimator depends heavily on the type of error distribution. In particular, it is determined by the tail behaviour of the modulus of the characteristic function of the error distribution: the faster the modulus function goes to zero in the tail, the slower the convergence rate. The following lemma, which is from Van Es, Spreij, and Van Zanten [10], gives the tail behaviour of

| ϕ_{k} (t) |

.

Lemma 1.

(Lemma 5.1 of Van Es, Spreij, and Van Zanten [10]) For

| t | \to \infty

, we have:

| ϕ_{k} | = \sqrt{2} e^{- \frac{1}{2} π | t |} (1 + O (\frac{1}{| t |})),

(3)

and:

\begin{matrix} Re ϕ_{k} (t) & = & | ϕ_{k} | (cos [t log (\sqrt{1 + 4 t^{2}} - t)] + O (\frac{1}{| t |})), \end{matrix}

(4)

\begin{matrix} Im ϕ_{k} (t) & = & | ϕ_{k} | (sin [t log (\sqrt{1 + 4 t^{2}} - t)] + O (\frac{1}{| t |})) . \end{matrix}

(5)

From (3), it is known that the modulus of

ϕ_{k} (t)

decays exponentially fast as

| t | \to \infty

. The logarithmic chi-square density thus belongs to the class of super-smooth densities in the classification of Fan [13]. According to Fan [13], the optimal convergence rate of the estimator is

{(log n)}^{- 2}

, when

h = {(log n)}^{- 1}

. Figure 2 plots the modulus function

| ϕ_{k} |

and its approximation

\sqrt{2} e^{- \frac{1}{2} π | t |}

; we notice that the two functions almost coincide at both tails.
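The comparison between the modulus and its approximation can be reproduced numerically. The sketch below is an illustration under one added fact: the reflection formula for the gamma function gives |Γ(1/2 + it)|² = π / cosh(πt), so that |ϕ_k(t)| = 1 / sqrt(cosh(πt)). This closed form is a standard identity and is not stated in the paper; the function names are hypothetical.

```python
import math


def modulus_exact(t):
    """|phi_k(t)|, using |Gamma(1/2 + it)|^2 = pi / cosh(pi t)."""
    return 1.0 / math.sqrt(math.cosh(math.pi * t))


def modulus_approx(t):
    """Tail approximation sqrt(2) * exp(-pi |t| / 2) from (3)."""
    return math.sqrt(2.0) * math.exp(-0.5 * math.pi * abs(t))


def relative_gap(t):
    """Relative excess of the approximation over the exact modulus."""
    return modulus_approx(t) / modulus_exact(t) - 1.0
```

Under this identity the ratio of the two curves equals sqrt(1 + exp(-2π|t|)), so the approximation lies above the exact modulus everywhere, is largest relative to it at t = 0 (where it equals sqrt(2) against 1), and the gap closes very quickly in the tails.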

Figure 2. Modulus function of the characteristic function of logarithmic chi-square distribution and its approximation: the higher peak curve is the approximating function

\sqrt{2} e^{- \frac{1}{2} π | t |}

.

From (4) and (5), it is known that in both tails, neither the real part nor imaginary part of the characteristic function can dominate the other; this violates the assumptions in the previous works by, e.g., Fan [3] and Masry [7], on studying the asymptotic normality; for super-smooth error distributions, these papers assume either the real part or the imaginary part to be dominant.

3. Asymptotic Normality

In this paper, I consider one particular kernel function, namely the sinc kernel function:

(C1)

The sinc kernel function is defined as:

K (x) = \frac{sin (x)}{π x},

with Fourier transform²:

ϕ_{K} (t) = I {| t | ⩽ 1} .

The sinc kernel function is favoured in the theoretical literature because of the simplicity of its Fourier transform and is thus used here.³

3.1. i.i.d. Observations

In this section, I prove the asymptotic normality of the estimator when the observations are i.i.d.

Theorem 1.

When the observations are i.i.d. and ε is distributed as logarithmic chi-square, if Assumption (C1) holds, when

exp (1 / h) / n \to 0

as

n \to \infty

and

h \to 0

, it holds that,

\frac{{\hat{f}}_{X} (x) - K_{h} * f_{X} (x)}{\sqrt{\frac{1}{2 π^{2} n} exp (π / h) f_{Y} (x)}} \to^{d} N (0, 1),

where

K_{h} (x) : = (1 / h) K (x / h)

.

Proof.

Denote:

Z_{j} = \frac{1}{h} ν_{h} (\frac{x - Y_{j}}{h}),

then:

\hat{f} (x) = \frac{1}{n} \sum_{j = 1}^{n} Z_{j} .

First:

\begin{matrix} E \hat{f} (x) & = & E Z_{1} \\ = & E [\frac{1}{2 π} \int_{- \infty}^{+ \infty} e^{- i t x} \frac{ϕ_{K} (t h) \hat{ϕ_{f_{Y}}} (t)}{ϕ_{k} (t)} d t] \\ = & \frac{1}{2 π} \int_{- \infty}^{+ \infty} e^{- i t x} \frac{ϕ_{K} (t h) E [\hat{ϕ_{f_{Y}}} (t)]}{ϕ_{k} (t)} d t \\ = & \frac{1}{2 π} \int_{- \infty}^{+ \infty} e^{- i t x} \frac{ϕ_{K} (t h) ϕ_{f_{Y}} (t)}{ϕ_{k} (t)} d t \\ = & \frac{1}{2 π} \int_{- \infty}^{+ \infty} e^{- i t x} ϕ_{K} (t h) ϕ_{f_{X}} (t) d t \\ = & K_{h} * f_{X} (x), \end{matrix}

Second, I evaluate

Var Z_{1}

,

\begin{matrix} Var Z_{1} & = & Var (\frac{1}{h} ν_{h} (\frac{x - Y_{1}}{h})) \\ & = & \frac{1}{h^{2}} (E ν_{h} {(\frac{x - Y_{1}}{h})}^{2} - {(E ν_{h} (\frac{x - Y_{1}}{h}))}^{2}) \\ & = & \frac{1}{h^{2}} \int ν_{h} {(\frac{x - y}{h})}^{2} f_{Y} (y) d y - {(K_{h} * f_{X} (x))}^{2} \\ & = & \frac{1}{h} \int ν_{h} {(y)}^{2} d y f_{Y} (x) (1 + o (1)) - {(K_{h} * f_{X} (x))}^{2} \\ & = & \frac{1}{2 π^{2}} exp (\frac{π}{h}) f_{Y} (x) (1 + o (1)), \end{matrix}

(6)

where the last equality is obtained because

K_{h} * f_{X} (x) \to f_{X} (x)

as

h \to 0

, and

\int {|ν_{h} (x)|}^{2} d x = \frac{h}{2 π^{2}} exp (\frac{π}{h}) (1 + o (1))

. The latter result is shown as follows,

\begin{matrix} \int {|ν_{h} (x)|}^{2} d x & = & \frac{1}{2 π} \int {|ϕ_{ν_{h}} (u)|}^{2} d u \\ = & \frac{1}{2 π} \int {|\frac{ϕ_{K} (u)}{ϕ_{k} (u / h)}|}^{2} d u \\ = & \frac{h}{π} \int_{0}^{1 / h} {|\frac{1}{ϕ_{k} (u)}|}^{2} d u \\ = & \frac{h}{π} (\int_{0}^{M} {|\frac{1}{ϕ_{k} (u)}|}^{2} d u + \int_{M}^{1 / h} {|\frac{1}{ϕ_{k} (u)}|}^{2} d u), \end{matrix}

where M is a large constant. The first term in the brackets is a constant depending on M; the order of the second term can be evaluated as follows,

\begin{matrix} \int_{M}^{1 / h} {|\frac{1}{ϕ_{k} (u)}|}^{2} d u & = & \frac{1}{2 π} (exp (\frac{π}{h}) - exp (π M)) \\ = & \frac{1}{2 π} exp (\frac{π}{h}) (1 + o (1)), \end{matrix}

where I use the fact that when M is large,

| ϕ_{k} (u) |

can be replaced by its asymptotic approximation. The second term clearly dominates the first term, which is a constant, such that:

\int {|ν_{h} (x)|}^{2} d x = \frac{h}{2 π^{2}} exp (\frac{π}{h}) (1 + o (1)) .

(7)

Here, I use the argument of Butucea [15] to split the integral and show that the tail part of the integral dominates.
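The order in (7) can be checked numerically. The sketch below is an illustration that relies on the standard identity 1/|ϕ_k(u)|² = cosh(πu) (a consequence of the gamma reflection formula, not stated in the paper). Under that identity the integral has the closed form (h/π²) sinh(π/h), whose leading term is (h/(2π²)) exp(π/h). The function names and the number of trapezoid steps are assumptions.

```python
import math


def nu_sq_integral_numeric(h, steps=20000):
    """(h/pi) * integral over [0, 1/h] of 1/|phi_k(u)|^2 du, by the trapezoid rule."""
    upper = 1.0 / h
    du = upper / steps
    total = 0.5 * (math.cosh(0.0) + math.cosh(math.pi * upper))
    for m in range(1, steps):
        total += math.cosh(math.pi * m * du)
    return h / math.pi * total * du


def nu_sq_integral_closed(h):
    """Closed form under the cosh identity: h * sinh(pi/h) / pi^2."""
    return h * math.sinh(math.pi / h) / math.pi ** 2


def nu_sq_integral_leading(h):
    """Leading term in (7): h * exp(pi/h) / (2 pi^2)."""
    return h * math.exp(math.pi / h) / (2 * math.pi ** 2)
```

The ratio of the closed form to the leading term is 1 - exp(-2π/h), which is about 0.998 at h = 1 and indistinguishable from 1 at h = 0.25, consistent with the (1 + o(1)) factor.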

A sufficient condition for asymptotic normality is the Lyapunov condition, which reduces to:

\frac{E | Z_{1} - E Z_{1} |^{2 + δ}}{n^{δ / 2} {[Var (Z_{1})]}^{1 + δ / 2}} \to 0,

(8)

for i.i.d. data.

For an upper bound for the numerator,

\begin{matrix} E | Z_{1} - E Z_{1} |^{2 + δ} & ⩽ & E | Z_{1} |^{2 + δ} + {| E Z_{1} |}^{2 + δ} \\ ⩽ & 2 E | Z_{1} |^{2 + δ} \\ = & \frac{2}{h^{2 + δ}} \int_{- \infty}^{+ \infty} {|ν_{h} (\frac{x - y}{h})|}^{2 + δ} f_{Y} (y) d y \\ ⩽ & \frac{C}{h^{2 + δ}} \int_{- \infty}^{+ \infty} {|ν_{h} (\frac{x - y}{h})|}^{2 + δ} d y \end{matrix}

(9)

Now, notice the result from Van Es, Spreij, and Van Zanten [10] and Masry [16] that, for

p > 2

⁴,

\begin{matrix} {∥ν_{h}∥}_{p} & ⩽ & {∥ν_{h}∥}_{\infty}^{1 - 2 / p} {∥ν_{h}∥}_{2}^{2 / p} . \end{matrix}

An upper bound for

{∥ν_{h}∥}_{\infty}

is easy to get, as⁵:

\begin{matrix} {∥ν_{h}∥}_{\infty} & = & sup_{x} |\frac{1}{2 π} \int \frac{ϕ_{K} (t)}{ϕ_{k} (t / h)} e^{- i t x} d t| \\ ⩽ & \frac{1}{2 π} \int |\frac{ϕ_{K} (t)}{ϕ_{k} (t / h)}| d t \\ ⩽ & \frac{\sqrt{2}}{π^{2}} h exp (\frac{π}{2 h}), \end{matrix}

while

{∥ν_{h}∥}_{2}^{2}

is known from (7), such that:

\begin{matrix} \int {|ν_{h} (z)|}^{p} d z & ⩽ & {∥ν_{h}∥}_{\infty}^{p - 2} {∥ν_{h}∥}_{2}^{2} \\ ⩽ & C \times h^{p - 2} exp (\frac{π (p - 2)}{2 h}) \times h exp (\frac{π}{h}) \\ = & C \times h^{p - 1} exp (\frac{π p}{2 h}), \end{matrix}

for

p > 2

. Therefore, take

p = 2 + δ

and use the result in (9); it then holds that:

\begin{matrix} E | Z_{1} - E Z_{1} |^{2 + δ} & ⩽ & C \times exp (\frac{π (2 + δ)}{2 h}), \end{matrix}

(10)

this, together with (6), implies that Lyapunov’s condition (8) holds, which completes the proof. □
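The final step can be spelled out as a worked bound. The display below is a sketch that combines (6) and (10) in the Lyapunov ratio (8); it assumes f_Y(x) > 0, and the constant C' is introduced here only for illustration.

```latex
\frac{E\,|Z_1 - E Z_1|^{2+\delta}}{n^{\delta/2}\,[\mathrm{Var}(Z_1)]^{1+\delta/2}}
\le
\frac{C \exp\!\left(\frac{\pi(2+\delta)}{2h}\right)}
     {n^{\delta/2}\left[\frac{1}{2\pi^{2}}\exp\!\left(\frac{\pi}{h}\right) f_Y(x)\,(1+o(1))\right]^{1+\delta/2}}
= \frac{C'}{n^{\delta/2}}\,(1+o(1)) \to 0,
\qquad
C' = C\left(\frac{2\pi^{2}}{f_Y(x)}\right)^{1+\delta/2}.
```

The exponential factors cancel exactly because [exp(π/h)]^{1+δ/2} = exp(π(2+δ)/(2h)), so only the factor n^{-δ/2} remains.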

3.2. Strong Mixing Observations

In this section, I consider the model:

Y = X + ε,

(11)

where the realizations

X_{1}, \dots, X_{n}

of X are strictly stationary and strong mixing, while the noise realizations

ε_{1}, \dots, ε_{n}

are i.i.d. logarithmic chi-square variables, independent of X, such that the observations

Y_{1}, \dots, Y_{n}

are also strictly stationary and strong mixing.

There are various concepts of dependence; here, I consider α-mixing, also called strong mixing, which is the weakest among the commonly used mixing conditions.

Definition 1.

Let

{X_{t}}

,

t = \dots, - 1, 0, 1, \dots

be an infinite sequence of strictly stationary random variables and

F_{i}^{j}

be the σ-algebra generated by

{X_{t}, i ⩽ t ⩽ j}

; then, the α-mixing coefficient is defined as:

α (k) = sup_{A \in F_{- \infty}^{0}, B \in F_{k}^{+ \infty}} |P (A) P (B) - P (A B)| .

The sequence

{X_{t}}

,

t = \dots, - 1, 0, 1, \dots

, is called α-mixing if

α (k) \to 0

as

k \to \infty

.

For the dependent case, a boundedness assumption on the joint density of the observations is also needed.

(C2)

The probability density function of any joint distribution

(Y_{i}, Y_{j})

,

1 ⩽ i < j ⩽ n

, exists and is bounded by a constant.

Now, I give the asymptotic normality theorem. Notice that the mixing assumption here is a little weaker than that in Masry [7].

Theorem 2.

In model (11), let

X_{1}, X_{2}, \dots, X_{n}

be strictly stationary, α-mixing with:

\sum_{k = 1}^{\infty} α {(k)}^{1 - 2 / δ} < \infty,

(12)

for some

δ > 2

; the noises

ε_{1}, \dots, ε_{n}

are i.i.d. logarithmic chi-square variables, independent of X; if (C1) and (C2) hold, when

exp (1 / h) / n \to 0

as

n \to \infty

and

h \to 0

, then:

\frac{{\hat{f}}_{X} (x) - K_{h} * f_{X} (x)}{\sqrt{\frac{1}{2 π^{2} n} exp (π / h) f_{Y} (x)}} \to^{d} N (0, 1) .
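To get a feel for the summability condition (12), the sketch below evaluates it for a hypothetical geometric mixing rate α(k) = ρ^k with 0 < ρ < 1. The rate, the function names and the parameter values are assumptions chosen for illustration; the paper does not specify a particular rate. For this rate the series is geometric with ratio r = ρ^{1 - 2/δ} and sums to r / (1 - r).

```python
def mixing_partial_sum(rho, delta, terms):
    """Partial sum of alpha(k)^(1 - 2/delta) for alpha(k) = rho**k, k = 1..terms."""
    exponent = 1.0 - 2.0 / delta
    return sum((rho ** k) ** exponent for k in range(1, terms + 1))


def mixing_limit(rho, delta):
    """Closed-form limit r / (1 - r) with r = rho**(1 - 2/delta)."""
    r = rho ** (1.0 - 2.0 / delta)
    return r / (1.0 - r)
```

With `rho = 0.5` and `delta = 4`, the partial sums converge to about 2.414, so (12) holds for any δ > 2 under a geometric rate. For a polynomial rate α(k) = k^{-a}, the same condition would require a(1 - 2/δ) > 1.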

Proof.

First, by strict stationarity and using the ergodic theorem for strong mixing sequences, similarly as in the proof of Theorem 1,

E \hat{f} (x) = K_{h} * f_{X} (x) .

Next, the variance of the estimator is evaluated; first:

\begin{matrix} Var (\hat{f} (x)) & = & \frac{1}{n} Var (Z_{1}) + \frac{2}{n^{2}} \sum_{j = 1}^{n - 1} (n - j) Cov (Z_{1}, Z_{j + 1}) . \end{matrix}

From the proof of Theorem 1, the first term is:

\frac{1}{n} Var (Z_{1}) = \frac{1}{2 π^{2} n} exp (\frac{π}{h}) f_{Y} (x) (1 + o (1)) .

(13)

For the covariance term, first notice:

\begin{matrix} |Cov (Z_{1}, Z_{j + 1})| & ⩽ & |E (Z_{1} Z_{j + 1})| + {(K_{h} * f_{X} (x))}^{2} \\ ⩽ & |E (Z_{1} Z_{j + 1})| + O (1), \end{matrix}

(14)

as

h \to 0

. Now, because:

\begin{matrix} | E (Z_{1} Z_{j + 1}) | & = & \frac{1}{h^{2}} |E (ν_{h} (\frac{x - Y_{1}}{h}) ν_{h} (\frac{x - Y_{j + 1}}{h}))| \\ = & \frac{1}{h^{2}} |E \int \int \frac{ϕ_{K} (t) ϕ_{K} (t^{'})}{ϕ_{k} (t / h) ϕ_{k} (t^{'} / h)} e^{- i t \frac{x - Y_{1}}{h}} e^{- i t^{'} \frac{x - Y_{j + 1}}{h}} d t d t^{'}| \\ = & \frac{1}{h^{2}} |\int \int \frac{ϕ_{K} (t) ϕ_{K} (t^{'})}{ϕ_{k} (t / h) ϕ_{k} (t^{'} / h)} E (E (e^{- i t \frac{x - X_{1} - ε_{1}}{h}} e^{- i t^{'} \frac{x - X_{j + 1} - ε_{j + 1}}{h}} | X)) d t d t^{'}| \\ = & \frac{1}{h^{2}} |\int \int \frac{ϕ_{K} (t) ϕ_{K} (t^{'})}{ϕ_{k} (t / h) ϕ_{k} (t^{'} / h)} ϕ_{k} (t / h) ϕ_{k} (t^{'} / h) E (e^{- i t \frac{x - X_{1}}{h}} e^{- i t^{'} \frac{x - X_{j + 1}}{h}}) d t d t^{'}| \\ = & \frac{1}{h^{2}} |\int \int ϕ_{K} (t) ϕ_{K} (t^{'}) E (e^{- i t \frac{x - X_{1}}{h}} e^{- i t^{'} \frac{x - X_{j + 1}}{h}}) d t d t^{'}| \\ ⩽ & \frac{1}{h^{2}} |\int \int |ϕ_{K} (t) ϕ_{K} (t^{'})| E (|e^{- i t \frac{x - X_{1}}{h}} e^{- i t^{'} \frac{x - X_{j + 1}}{h}}|) d t d t^{'}| \\ ⩽ & \frac{1}{h^{2}} |\int \int |ϕ_{K} (t) ϕ_{K} (t^{'})| d t d t^{'}| \\ ⩽ & \frac{C}{h^{2}}, \end{matrix}

where C is a constant; continuing on (14), I get:

|Cov (Z_{1}, Z_{j + 1})| ⩽ C \frac{1}{h^{2}} (1 + o (1)) .

(15)

On the other hand, using the assumption on the α-mixing coefficients and the covariance inequality for strong mixing sequence in Proposition 2.5 in Fan and Yao [17], for

δ > 2

,

\begin{matrix} Cov (Z_{1}, Z_{j + 1}) & ⩽ & 8 α {(j)}^{1 - 2 / δ} {(E {|Z_{1}|}^{δ})}^{1 / δ} {(E {|Z_{j + 1}|}^{δ})}^{1 / δ} \\ = & 8 α {(j)}^{1 - 2 / δ} {(E {|Z_{1}|}^{δ})}^{2 / δ} \\ ⩽ & C^{'} α {(j)}^{1 - 2 / δ} exp (\frac{π}{h}) . \end{matrix}

(16)

Therefore, using (15) and (16),

\begin{matrix} |\sum_{j = 1}^{n - 1} Cov (Z_{1}, Z_{j + 1})| & ⩽ & |\sum_{j = 1}^{m_{n}} Cov (Z_{1}, Z_{j + 1})| + |\sum_{j = m_{n} + 1}^{n - 1} Cov (Z_{1}, Z_{j + 1})| \\ & ⩽ & C \frac{1}{h^{2}} m_{n} + C exp (\frac{π}{h}) \sum_{j = m_{n} + 1}^{n - 1} α {(j)}^{1 - 2 / δ}, \end{matrix}

if one chooses

m_{n} = \frac{1}{h |log h|}

, then

m_{n} \to \infty

and

m_{n} h \to 0

, then obviously the first term is

o (exp (\frac{π}{h}))

; the second term is also

o (exp (\frac{π}{h}))

by noticing the mixing assumption in (12). Then, it is shown that:

|\sum_{j = 1}^{n - 1} Cov (Z_{1}, Z_{j + 1})| = o (exp (\frac{π}{h})) .

(17)

From (13) and (17), it then follows that:

\begin{matrix} Var (\hat{f} (x)) & = & \frac{1}{2 π^{2} n} exp (\frac{π}{h}) f_{Y} (x) (1 + o (1)) . \end{matrix}

Now, I prove the central limit theorem, for which I use the classical large block-small block argument of proving the central limit theorem for the dependent sequence. First, I make some normalizations, define

σ_{0} = {(\frac{1}{2 π^{2}} exp (\frac{π}{h}) f_{Y} (x))}^{1 / 2}

, and:

Z_{j}^{'} = \frac{Z_{j} - K_{h} * f_{X} (x)}{σ_{0}},

then

Z_{j}^{'}

has mean zero and unit variance and:

\frac{1}{n} \sum_{j = 1}^{n} Z_{j}^{'} = \frac{\hat{f} (x) - K_{h} * f_{X} (x)}{σ_{0}};

and it will be shown that:

\sqrt{n} (\frac{1}{n} \sum_{j = 1}^{n} Z_{j}^{'}) \to^{d} N (0, 1),

which is the result that needs to be shown.

First, the set

{1, \dots, n}

is partitioned into

2 k_{n} + 1

subsets with large blocks of size

l_{n}

and small blocks of size

s_{n}

, such that

k_{n} = ⌊n / (l_{n} + s_{n})⌋

, so the last remaining block has size

n - k_{n} (l_{n} + s_{n})

. The sizes are such that

l_{n} \to \infty

,

s_{n} \to \infty

,

l_{n} / s_{n} \to \infty

. Then, we can write:

\sum_{j = 1}^{n} Z_{j}^{'} = \sum_{j = 1}^{k_{n}} ξ_{j} + \sum_{j = 1}^{k_{n}} η_{j} + ζ,

where:

\begin{matrix} ξ_{j} & = & \sum_{j^{'} = (j - 1) (l_{n} + s_{n}) + 1}^{(j - 1) (l_{n} + s_{n}) + l_{n}} Z_{j^{'}}^{'} \\ η_{j} & = & \sum_{j^{'} = (j - 1) (l_{n} + s_{n}) + l_{n} + 1}^{j (l_{n} + s_{n})} Z_{j^{'}}^{'} \\ ζ & = & \sum_{j^{'} = k_{n} (l_{n} + s_{n}) + 1}^{n} Z_{j^{'}}^{'}, \end{matrix}

which are the sums over the large blocks, the small blocks and the last block, respectively. Then, as is standard in the large block-small block argument, I show the following:

\begin{matrix} \frac{1}{n} E {(\sum_{j = 1}^{k_{n}} η_{j})}^{2} & = & o (1), \end{matrix}

(18)

\begin{matrix} \frac{1}{n} E ζ^{2} & = & o (1), \end{matrix}

(19)

\begin{matrix} |E exp (i t \sum_{j = 1}^{k_{n}} ξ_{j} / \sqrt{n}) - \prod_{j = 1}^{k_{n}} E exp (i t ξ_{j} / \sqrt{n})| & \to & 0, \end{matrix}

(20)

\begin{matrix} \frac{1}{n} \sum_{j = 1}^{k_{n}} E ξ_{j}^{2} & \to & 1, \end{matrix}

(21)

\begin{matrix} \frac{1}{n} \sum_{j = 1}^{k_{n}} E [ξ_{j}^{2} I (|ξ_{j}| ⩾ ε n^{1 / 2})] & \to & 0, \end{matrix}

(22)

for

\forall ε > 0

. (18) and (19) say that the small blocks and the last block are of smaller order. (20) says that the large blocks are as if independent in the sense of the characteristic function. Then, (21) and (22) are the Lindeberg-Feller condition for the asymptotic normality for

\sum_{j = 1}^{k_{n}} ξ_{j}

under independence.
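The index bookkeeping behind the large block-small block decomposition described above can be made concrete with a short sketch. The function name `block_partition` and the example sizes are illustrative assumptions; the index ranges follow the definitions of ξ_j, η_j and ζ given earlier, with 1-based indices.

```python
def block_partition(n, l_n, s_n):
    """Split {1, ..., n} into k_n large blocks, k_n small blocks and a remainder.

    Large block j covers (j-1)(l_n+s_n)+1, ..., (j-1)(l_n+s_n)+l_n;
    small block j covers (j-1)(l_n+s_n)+l_n+1, ..., j(l_n+s_n);
    the remainder covers k_n(l_n+s_n)+1, ..., n.
    """
    k_n = n // (l_n + s_n)
    large, small = [], []
    for j in range(1, k_n + 1):
        start = (j - 1) * (l_n + s_n) + 1
        large.append(range(start, start + l_n))
        small.append(range(start + l_n, j * (l_n + s_n) + 1))
    remainder = range(k_n * (l_n + s_n) + 1, n + 1)
    return large, small, remainder
```

For instance, `block_partition(100, 12, 3)` gives six large blocks of 12 indices, six small blocks of 3 indices and a remainder covering indices 91 to 100. The small blocks separate consecutive large blocks by s_n observations, which is what allows the mixing coefficient α(s_n) to control the dependence between them in (20).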

For (18) and (19), using the moment inequality for the α-mixing sequence in Proposition 2.7 (i) in Fan and Yao [17], it can be shown that:

\begin{matrix} E {(\sum_{j = 1}^{k_{n}} η_{j})}^{2} & = & O (k_{n} s_{n}), \\ E ζ^{2} & = & O (n - k_{n} (l_{n} + s_{n})), \end{matrix}

notice that the conditions for Proposition 2.7 (i) are satisfied, because by (10),

E {|Z_{j}^{'}|}^{δ} < \infty

for

δ > 2

; and the mixing assumption (12) implies that

α {(j)}^{1 - 2 / a} = j^{- b}

for

b > 1

, which is

α (j) = j^{- a b / (a - 2)} = j^{- \frac{1}{2} \times \frac{1}{1 / (2 b) - 1 / (a b)}}

; take

δ = a b

and

q = 2 b

, so the mixing condition is also satisfied.

For (20), using the covariance inequality in Proposition 2.6 in Fan and Yao [17], we have:

\begin{matrix} |E exp (i t \sum_{j = 1}^{k_{n}} ξ_{j} / \sqrt{n}) - \prod_{j = 1}^{k_{n}} E exp (i t ξ_{j} / \sqrt{n})| \\ ⩽ & 16 (k_{n} - 1) α (s_{n}); \end{matrix}

this is

o (1)

by choosing for example

l_{n} = {(n h^{γ_{1}})}^{1 / 2}

,

s_{n} = {(n h^{γ_{2}})}^{1 / 2}

for

1 < γ_{1} < γ_{2}

. Then,

k_{n} = O (n^{1 / 2} h^{- γ_{1} / 2})

, such that for some

q > 1

,

\begin{matrix} k_{n} α (s_{n}) & = & n^{1 / 2} h^{- γ_{1} / 2} \frac{1}{{(n h^{γ_{2}})}^{q / 2}} \\ = & n^{\frac{(1 - q)}{2}} h^{- \frac{(γ_{2} q + γ_{1})}{2}}; \end{matrix}

obviously, the above expression is

o (1)

by the assumption that

exp (1 / h) / n \to 0

, so (20) is proven.

For Feller’s condition (21), first use the same strategy as calculating the variance of the estimator; it holds that:

E ξ_{j}^{2} = l_{n} (1 + o (1)),

for any j, because

ξ_{j}

is also a sum of a growing number of the normalized observations. Therefore,

\frac{1}{n} \sum_{j = 1}^{k_{n}} E ξ_{j}^{2} = \frac{1}{n} k_{n} l_{n} (1 + o (1)) \to 1 .

Finally, for Lindeberg’s condition (22), first observe that:

\begin{matrix} E [ξ_{j}^{2} I (|ξ_{j}| ⩾ ε n^{1 / 2})] & ⩽ & {(E ξ_{j}^{4})}^{1 / 2} P (|ξ_{j}| ⩾ ε n^{1 / 2}) \\ ⩽ & {(E ξ_{j}^{4})}^{1 / 2} \frac{E ξ_{j}^{2}}{{(ε \sqrt{n})}^{2}}, \end{matrix}

where I first use Hölder’s inequality and then Markov’s inequality. Using again the moment inequality for the strong mixing sequence in Proposition 2.7 in Fan and Yao [17],

\begin{matrix} {(E ξ_{j}^{4})}^{1 / 2} \frac{E ξ_{j}^{2}}{{(ε \sqrt{n})}^{2}} & ⩽ & {(l_{n}^{2})}^{1 / 2} \times \frac{l_{n}}{{(ε \sqrt{n})}^{2}} \\ = & \frac{l_{n}^{2}}{ε^{2} n}, \end{matrix}

so:

\begin{matrix} \frac{1}{n} \sum_{j = 1}^{k_{n}} E [ξ_{j}^{2} I (|ξ_{j}| ⩾ ε n^{1 / 2})] & = & O (\frac{k_{n}}{n} \times \frac{l_{n}^{2}}{ε^{2} n}) \\ = & \frac{1}{ε^{2}} O (\frac{l_{n}}{n}) = o (1) . \end{matrix}

Using the Lindeberg-Feller condition and employing the standard argument for the proof of central limit theorems, it can be shown that:

\prod_{j = 1}^{k_{n}} E exp (i t ξ_{j} / \sqrt{n}) \to exp (- \frac{t^{2}}{2});

this together with (18)–(20) entail the stated result. □

4. Application to Density Estimation in the Stochastic Volatility Model

In this section, I consider applying the results of Theorem 2 to obtain the asymptotic distribution of the kernel deconvolution volatility density estimator in SV models. A generic SV model has the following form,

y_{t_{i}} = σ_{t_{i}} ε_{t_{i}}, i = 1, \dots, n,

(23)

where

ε_{t_{i}}

,

i = 1, \dots, n

are i.i.d.

N (0, 1)

;

{σ_{t_{i}}}

is a latent stochastic process called the volatility process;

{y_{t_{i}}}

is the observed financial returns. The SV model is a popular model used in financial econometrics to describe the evolution of financial returns. Model (23) incorporates popular discrete-time SV models (e.g., Taylor [18]) and the discretized continuous-time SV models, which assume the volatility process to be stationary as special cases (see, e.g., Shephard [19] for a review). Van Es, Spreij, and Van Zanten [9] and Van Es, Spreij, and Van Zanten [10] considered estimating the volatility density using the kernel deconvolution estimator in this model.

Remark 1.

By using the term “stochastic volatility”, here, I consider the so-called “genuine stochastic volatility” models, where the volatility process has a separate stochastic driving factor (see, e.g., Shephard and Andersen [20] and Andersen, Bollerslev, Diebold, and Labys [21] for detailed discussions). It thus does not include the ARCH/GARCH class models, where one has explicitly specified one-step-ahead predictive densities. Van Es, Spreij, and Van Zanten [10] considered estimating volatility density in the context of the ARCH/GARCH class of models and had given the rate of convergence for their estimator.

To apply the general theory derived in Section 3, it is further assumed that the volatility process

{σ_{t_{i}}}

is strictly stationary, and it is independent of

ε_{t_{i}}

for

i = 1, \dots, n .

The independence assumption rules out the leverage effect in stochastic volatility models and, thus, is suitable to apply to, say, exchange rate data, where the leverage effect is rarely observed. Extending the model to allow for the leverage effect is an important, yet challenging task, which is thus left for future research.

The SV model can be written as a measurement error model (11) by taking squares and logarithms on both sides of equation (23),

log y_{t_{i}}^{2} = log σ_{t_{i}}^{2} + log ε_{t_{i}}^{2}, i = 1, \dots, n,

(24)

such that the density of the variable

log y_{t_{i}}^{2}

is the convolution of the density of

log σ_{t_{i}}^{2}

with a completely known density, that of the logarithmic chi-square distribution. Following the notation of the previous sections, write the density functions of

log y_{t_{i}}^{2}

,

log σ_{t_{i}}^{2}

and

log ε_{t_{i}}^{2}

to be

f_{y}

,

f_{σ}

and k, respectively.
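The transformation in (24) can be illustrated with simulated data. The sketch below assumes a Gaussian AR(1) process for the log-variance, which is one common discrete-time SV specification and not a choice made in this paper; the function names and the parameters `phi` and `sigma_eta` are hypothetical.

```python
import math
import random


def simulate_sv(n, phi=0.95, sigma_eta=0.2, seed=0):
    """Simulate returns y_i = sigma_i * eps_i with log sigma_i^2 following an AR(1).

    Returns the returns, the latent log-variances and the N(0, 1) shocks.
    """
    rng = random.Random(seed)
    # Draw the initial log-variance from the stationary distribution of the AR(1).
    x = rng.gauss(0.0, sigma_eta / math.sqrt(1.0 - phi ** 2))
    returns, log_var, shocks = [], [], []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma_eta)
        eps = rng.gauss(0.0, 1.0)
        log_var.append(x)
        shocks.append(eps)
        returns.append(math.exp(0.5 * x) * eps)  # y = sigma * eps
    return returns, log_var, shocks


def log_squared(values):
    """Map each value v to log(v^2), the left-hand side of (24)."""
    return [math.log(v * v) for v in values]
```

Applying `log_squared` to the simulated returns gives observations of the form signal plus noise: the signal is the latent log-variance and the noise is log ε², which follows the logarithmic chi-square distribution regardless of the volatility dynamics.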

If we want to recover the density

f_{σ}

of

log σ_{t_{i}}^{2}

from the observations

{log y_{t_{i}}^{2}}

, this is a problem of deconvolution with logarithmic chi-square error, and the kernel deconvolution estimator can be used. Van Es, Spreij, and Van Zanten [9] and Van Es, Spreij, and Van Zanten [10] first noticed this connection. Define

Z_{j} : = log y_{j}^{2}

; they use the following estimator to recover

f_{σ} (x)

,

{\hat{f}}_{σ} (x) = \frac{1}{2 π} \frac{1}{n} \sum_{j = 1}^{n} \int_{- \infty}^{+ \infty} \frac{ϕ_{K} (t h)}{ϕ_{k} (t)} e^{- i t (x - Z_{j})} d t,

where

ϕ_{K}

is the Fourier transform of a kernel function K and

ϕ_{k} (t)

is the characteristic function of the

log χ_{1}^{2}

variable. Van Es, Spreij, and Van Zanten [9] and Van Es, Spreij, and Van Zanten [10] derive the convergence rate of the estimator, but a central limit theorem is missing.

If we assume that the transformed return sequence

{Z_{j}}

,

j = 1, \dots, n

, is generated by the SV model (23) with a strictly stationary, α-mixing volatility process satisfying (12) and i.i.d. errors, a simple application of Theorem 2 leads to the following corollary.

Corollary 1.

In the stochastic volatility model (23), when the volatility process

{σ_{j}}

,

j = 1, \dots, n

is α-mixing with (12) satisfied,

ε_{t_{i}}

’s are i.i.d.

N (0, 1)

, independent of the volatility process; when

exp (1 / h) / n \to 0

as

n \to \infty

and

h \to 0

, it holds that:

\frac{{\hat{f}}_{σ} (x) - K_{h} * f_{σ} (x)}{\sqrt{\frac{1}{2 π^{2} n} exp (π / h) f_{y} (x)}} \to^{d} N (0, 1) .

Since the density

f_{y} (x)

can be estimated with the observed return sequence

{log y_{t_{i}}^{2}}

consistently using the classical kernel density estimator for any x (see e.g., Fan and Yao [17]), the above result can be used to construct pointwise confidence intervals for the kernel deconvolution density estimator.
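A pointwise interval based on Corollary 1 could be computed along the following lines. This is a minimal sketch under stated assumptions: `f_hat` is the deconvolution estimate at x, `f_y_hat` is a separate consistent estimate of f_y(x) (for example from a classical kernel density estimator), and the function name and default 95% normal quantile are illustrative. The interval is centred for the smoothed density K_h * f_σ(x), as in the corollary, and makes no correction for smoothing bias.

```python
import math


def deconv_confidence_interval(f_hat, f_y_hat, n, h, z=1.959963984540054):
    """Asymptotic interval f_hat +/- z * sqrt(exp(pi/h) * f_y_hat / (2 pi^2 n))."""
    if n <= 0 or h <= 0 or f_y_hat < 0:
        raise ValueError("need n > 0, h > 0 and f_y_hat >= 0")
    std_err = math.sqrt(math.exp(math.pi / h) * f_y_hat / (2 * math.pi ** 2 * n))
    return f_hat - z * std_err, f_hat + z * std_err
```

Because the standard error grows like exp(π/(2h)), the interval widens very quickly as the bandwidth shrinks; for a fixed bandwidth its width decreases at the usual rate of one over the square root of n.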

5. Conclusions

In this paper, I have proven the asymptotic normality of the kernel deconvolution estimator with logarithmic chi-square noise. I consider both independent and identically distributed observations and strong mixing observations. The results are applied to obtain the asymptotic normality of the kernel deconvolution estimator of the volatility density in stochastic volatility models.

Acknowledgements

I am grateful to Kerry Patterson and the two anonymous referees for helpful comments, which greatly improved the paper. The usual disclaimer applies.

Conflicts of Interest

The author declares no conflict of interest.

References

1. R. Carroll, and P. Hall. “Optimal rates of convergence for deconvolving a density.” J. Am. Stat. Assoc. 83 (1988): 1184–1186.
2. L. Stefanski, and R. Carroll. “Deconvoluting kernel density estimators.” Statistics 21 (1990): 169–184.
3. J. Fan. “Asymptotic normality for deconvolution kernel density estimators.” Sankhya Indian J. Stat. A 53 (1991): 97–110.
4. Y. Fan, and Y. Liu. “A note on asymptotic normality for deconvolution kernel density estimators.” Sankhya Indian J. Stat. A 59 (1997): 138–141.
5. B. Van Es, and H. Uh. “Asymptotic normality of nonparametric kernel type deconvolution density estimators: Crossing the Cauchy boundary.” J. Nonparametr. Stat. 16 (2004): 261–277.
6. B. Van Es, and H. Uh. “Asymptotic normality of kernel-type deconvolution estimators.” Scand. J. Stat. 32 (2005): 467–483.
7. E. Masry. “Asymptotic normality for deconvolution estimators of multivariate densities of stationary processes.” J. Multivar. Anal. 44 (1993): 47–68.
8. R. Kulik. “Nonparametric deconvolution problem for dependent sequences.” Electron. J. Stat. 2 (2008): 722–740.
9. B. Van Es, P. Spreij, and H. van Zanten. “Nonparametric volatility density estimation.” Bernoulli 9 (2003): 451–465.
10. B. Van Es, P. Spreij, and H. van Zanten. “Nonparametric volatility density estimation for discrete time models.” Nonparametr. Stat. 17 (2005): 237–251.
11. F. Comte, and V. Genon-Catalot. “Penalized projection estimator for volatility density.” Scand. J. Stat. 33 (2006): 875–893.
12. V. Todorov, and G. Tauchen. “Inverse realized laplace transforms for nonparametric volatility density estimation in jump-diffusions.” J. Am. Stat. Assoc. 107 (2012): 622–635.
13. J. Fan. “On the optimal rates of convergence for nonparametric deconvolution problems.” Ann. Stat. 19 (1991): 1257–1272.
14. A. Delaigle, and I. Gijbels. “Practical bandwidth selection in deconvolution kernel density estimation.” Comput. Stat. Data Anal. 45 (2004): 249–267.
15. C. Butucea. “Asymptotic normality of the integrated square error of a density estimator in the convolution model.” Sort 28 (2004): 9–26.
16. E. Masry. “Multivariate probability density deconvolution for stationary random processes.” IEEE Trans. Inf. Theory 37 (1991): 1105–1115.
17. J. Fan, and Q. Yao. Nonlinear Time Series. Berlin, Germany: Springer, 2002, Volume 2.
18. S.J. Taylor. “Financial returns modelled by the product of two stochastic processes-a study of the daily sugar prices 1961–1975.” Time Ser. Anal. Theory Pract. 1 (1982): 203–226.
19. N. Shephard, ed. Stochastic Volatility: Selected Readings. Oxford, UK: Oxford University Press, 2005.
20. N. Shephard, and T.G. Andersen. “Stochastic volatility: Origins and overview.” In Handbook of Financial Time Series. Edited by T.G. Andersen, R.A. Davis, J.P. Kreiß and T. Mikosch. Berlin, Germany: Springer, 2009, pp. 233–254.
21. T.G. Andersen, T. Bollerslev, F.X. Diebold, and P. Labys. “Parametric and nonparametric volatility measurement.” Handb. Financ. Econom. 1 (2009): 67–138.

¹The characteristic function of a random variable with density f is defined as $ϕ_{f} = \int_{ℝ} e^{i t x} f (x) d x$ .
²In this paper, I follow the convention to define the Fourier transform of a function f to be $ϕ_{f} = \int_{- \infty}^{+ \infty} e^{i t x} f (x) d x .$
³Usually, for practical implementations, the following kernels:

$K_{1} (x) = \frac{48 cos x}{π x^{4}} (1 - \frac{15}{x^{2}}) - \frac{144 sin x}{π x^{5}} (2 - \frac{5}{x^{2}}),$

with Fourier transform:

$ϕ_{K_{1}} (t) = I {| t | ⩽ 1} {(1 - t^{2})}^{3},$

are used because they have better numerical properties; see Delaigle and Gijbels [14] for the discussions.
⁴This is easy to see by noticing $\int {|ν_{h} (x)|}^{p} d x ⩽ \int {|ν_{h} (x)|}^{2} {|{sup}_{x} ν_{h} (x)|}^{p - 2} d x$ for $p > 2$ .
⁵Here, again, the splitting integral argument as in proving (7) is used; I omit the details for the ease of exposition.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/4.0/).
