Article

Boosted Whittaker–Henderson Graduation

Graduate School of Humanities and Social Sciences, Hiroshima University, 1-2-1 Kagamiyama, Higashi-Hiroshima 739-8525, Japan
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(21), 3377; https://doi.org/10.3390/math12213377
Submission received: 2 September 2024 / Revised: 24 October 2024 / Accepted: 27 October 2024 / Published: 29 October 2024
(This article belongs to the Special Issue Recent Advances in Time Series Analysis)

Abstract: The Whittaker–Henderson (WH) graduation is a smoothing method for equally spaced one-dimensional data such as time series. It includes the Bohlmann filter, the Hodrick–Prescott (HP) filter, and the Whittaker graduation as special cases. Among them, the HP filter is the most prominent trend-cycle decomposition method for macroeconomic time series such as real gross domestic product. Recently, a modification of the HP filter, the boosted HP (bHP) filter, has been developed, and several studies have been conducted. The basic idea of the modification is to achieve more desirable smoothing by extracting long-term fluctuations remaining in the smoothing residuals. Inspired by the modification, this paper develops the boosted version of the WH graduation, which includes the bHP filter as a special case. Then, we establish its properties that are fundamental for applied work. To investigate the properties, we use a spectral decomposition of the penalty matrix of the WH graduation.

1. Introduction

The Whittaker–Henderson (WH) graduation is a smoothing method for equally spaced one-dimensional data such as time series. It includes the Bohlmann (1899) [1] filter, Hodrick–Prescott (HP) filter (Hodrick and Prescott, 1997 [2]), and Whittaker (1923) [3] graduation as its special cases [4,5,6,7]. Among them, the HP filter is the most prominent trend-cycle decomposition method for macroeconomic time series such as real gross domestic product. Recently, Phillips and Shi (2021) [8] developed its modification, the boosted HP (bHP) filter. The basic idea of the modification is to achieve more desirable smoothing by extracting long-term fluctuations remaining in the smoothing residuals [8,9].
Several studies concerning the bHP filter have since emerged: (i) Knight (2021) [10] derived the penalized least squares problems corresponding to the bHP filter, (ii) Tomal (2022) [11] and Trojanek et al. (2023) [12] published empirical studies using the bHP filter, (iii) Hall and Thomson (2024) [13] provided a way to use the bHP filter as a frequency-selective filter, (iv) Mei et al. (2024) [14] and Biswas et al. (2024) [15] extended the bHP filter theoretically, (v) Yamada (2024) [9] provided a perspective of the bHP filter, (vi) Yamada (2024) [16] established the properties of the bHP filter, and (vii) Bao and Yamada (2024) [17] studied the boosted version of the Bohlmann filter.
Inspired by Phillips and Shi (2021) [8], this paper develops the boosted version of the WH graduation, the bWH graduation, which includes the bHP filter as a special case. As in the case of the bHP filter, the new filter can also recover long-term fluctuations remaining in the smoothing residuals. Then, we establish its properties, which are fundamental for applied work. To examine them, we use a spectral decomposition of the penalty matrix of the WH graduation, which is a banded symmetric matrix. This direction of research is suggested by Yamada (2024) [16], and this paper can be considered as an extension of Yamada (2024) [16]. In addition, since the boosted version of the Bohlmann filter is identical to the bWH graduation of order 1, this paper can also be considered as an extension of Bao and Yamada (2024) [17].
The organization of this paper is as follows. In Section 2, we provide some preliminary remarks, including the spectral decomposition of the penalty matrix of the WH graduation. In Section 3, we define the bWH graduation and present its spectral representation. In Section 4, we establish several properties of the bWH graduation. In Section 5, we empirically illustrate the results obtained. Section 6 concludes the paper. In Appendix A, we provide some of the proofs.

2. Preliminaries

In this section, we provide some preliminary remarks.

2.1. Data

Let $y_t$ denote the realization of a variable $y$ at $v_t$ for $t=1,\dots,n$, where $v_1,\dots,v_n$ are equally spaced. In this paper, we assume that $y_t$ cannot be represented as $\sum_{k=0}^{p-1}\phi_k t^k$ for $t=1,\dots,n$, where $\phi_k$ for $k=0,\dots,p-1$ are real numbers. This is because there is no need for smoothing in this case.

2.2. Notations

Let $y=[y_1,\dots,y_n]^\top$, $x=[x_1,\dots,x_n]^\top$, $\iota$ be an $n$-dimensional column vector of ones, $0_{r,s}$ be an $r\times s$ matrix of zeros, and $I_r$ be an $r\times r$ identity matrix. For an $r$-dimensional column vector $\eta=[\eta_1,\dots,\eta_r]^\top$, $\|\eta\|^2=\eta^\top\eta=\sum_{t=1}^{r}\eta_t^2$. For a full-column-rank matrix $\Gamma\in\mathbb{R}^{r\times s}$, the column space of $\Gamma$ and its orthogonal complement are denoted by $S(\Gamma)$ and $S^{\perp}(\Gamma)$, respectively. Let $\Delta=1-L$, where $L$ is the lag operator such that $Lx_t=x_{t-1}$; accordingly, $\Delta$ is the difference operator such that $\Delta x_t=(1-L)x_t=x_t-x_{t-1}$ and $\Delta^2x_t=(1-L)^2x_t=x_t-2x_{t-1}+x_{t-2}$.

2.3. Key Matrices

2.3.1. $\Delta_p$ and $C_p$

For an $n$-dimensional column vector $\zeta=[\zeta_1,\dots,\zeta_n]^\top$, let $\Delta_p$ be an $(n-p)\times n$ matrix such that $\Delta_p\zeta=[\Delta^p\zeta_{p+1},\dots,\Delta^p\zeta_n]^\top$. Based on the binomial theorem, letting $a_h=(-1)^{p-h}\binom{p}{h}$ for $h=0,1,\dots,p$, $(1-L)^p$ can be expanded as follows:
$$(1-L)^p=\{(-1)L+1\}^p=\sum_{h=0}^{p}\binom{p}{h}(-1)^{p-h}L^{p-h}1^{h}=\sum_{h=0}^{p}a_hL^{p-h}. \qquad (1)$$
Then, it follows that
$$\Delta^px_t=(1-L)^px_t=\sum_{h=0}^{p}a_hL^{p-h}x_t=a_0x_{t-p}+a_1x_{t-p+1}+\dots+a_px_t,\quad t=p+1,\dots,n. \qquad (2)$$
Accordingly, $\Delta_p$ is an $(n-p)\times n$ Toeplitz matrix given as follows:
$$\Delta_p=\begin{bmatrix}a_0&\cdots&a_p&&&0\\&\ddots&&\ddots&&\\0&&&a_0&\cdots&a_p\end{bmatrix}. \qquad (3)$$
Here, $a_0=-1$ if $p$ is odd and $a_0=1$ if $p$ is even; therefore, the rank of $\Delta_p$ is $n-p$ and the nullity of $\Delta_p$ is $p$. For example, $\Delta_1\in\mathbb{R}^{(n-1)\times n}$ and $\Delta_2\in\mathbb{R}^{(n-2)\times n}$ are given as follows, respectively:
$$\Delta_1=\begin{bmatrix}-1&1&&&0\\&\ddots&\ddots&&\\0&&&-1&1\end{bmatrix},\qquad \Delta_2=\begin{bmatrix}1&-2&1&&&0\\&\ddots&\ddots&\ddots&&\\0&&&1&-2&1\end{bmatrix}.$$
Let $C_p$ be an $n\times n$ banded symmetric matrix defined as follows:
$$C_p=\Delta_p^\top\Delta_p. \qquad (4)$$
Then, $C_p$ is a positive semidefinite matrix whose rank is $n-p$. Incidentally, $C_1,\dots,C_4$ are explicitly shown in Anderson (1971, pp. 68–69) [18]. In addition, $C_1$ is $A_2$ in Strang (1999) [19], and it is also shown in Nakatsukasa et al. (2013, p. 3233) [20].
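As a concrete check on these definitions, the following NumPy sketch (our own illustration, not part of the paper or its replication code) assembles $\Delta_p$ row by row from the coefficients $a_0,\dots,a_p$ and forms $C_p=\Delta_p^\top\Delta_p$:

```python
import numpy as np
from math import comb

def diff_matrix(n: int, p: int) -> np.ndarray:
    """Return the (n - p) x n p-th-order difference matrix Delta_p."""
    # a_h = (-1)^(p - h) * binom(p, h), h = 0, ..., p
    a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
    D = np.zeros((n - p, n))
    for t in range(n - p):
        D[t, t:t + p + 1] = a  # row t holds a_0, ..., a_p, shifted right by t
    return D

n, p = 50, 3
Dp = diff_matrix(n, p)
Cp = Dp.T @ Dp                    # penalty matrix C_p = Delta_p' Delta_p
print(np.linalg.matrix_rank(Cp))  # n - p = 47, so the nullity of C_p is p = 3
```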

2.3.2. $\Pi_p$

Let
$$\Pi_p=[\tau^{(0)},\dots,\tau^{(p-1)}], \qquad (5)$$
where $\tau^{(k)}=[1^k,2^k,\dots,n^k]^\top$ for $k=0,\dots,p-1$. More specifically, it is a Vandermonde matrix given as follows:
$$\Pi_p=\begin{bmatrix}1^0&1^1&\cdots&1^{p-1}\\2^0&2^1&\cdots&2^{p-1}\\\vdots&\vdots&&\vdots\\n^0&n^1&\cdots&n^{p-1}\end{bmatrix}. \qquad (6)$$
Then, $\Pi_p$ is of full column rank, and the first column of $\Pi_p$, $\tau^{(0)}$, is equal to $\iota$. In addition, by assumption, $y\notin S(\Pi_p)$.
Let
$$\hat{\tau}_p=\Pi_p\hat{\beta}_p, \qquad (7)$$
where $\hat{\beta}_p=(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\arg\min_{\beta}\|y-\Pi_p\beta\|^2$. For example, $\hat{\tau}_1=\bar{y}\tau^{(0)}=\bar{y}\iota$, where $\bar{y}=\frac{1}{n}\sum_{t=1}^{n}y_t$, and $\hat{\tau}_2=\alpha^{\mathrm{ba}}\tau^{(0)}+\beta^{\mathrm{ba}}\tau^{(1)}$, where $\alpha^{\mathrm{ba}}$ and $\beta^{\mathrm{ba}}$ are explicitly shown in Kim et al. (2009, p. 341) [21]. $\hat{\tau}_p$ is the orthogonal projection of $y$ onto $S(\Pi_p)$.
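For illustration, $\hat{\tau}_p$ can be computed by ordinary least squares on the Vandermonde matrix $\Pi_p$; the sketch below (ours, with a simulated random walk as a stand-in for data) does exactly that:

```python
import numpy as np

def poly_trend(y: np.ndarray, p: int) -> np.ndarray:
    """tau_hat_p: OLS projection of y onto S(Pi_p), a degree-(p-1) polynomial trend."""
    n = len(y)
    t = np.arange(1, n + 1)
    Pi = np.vander(t, p, increasing=True).astype(float)  # columns t^0, ..., t^(p-1)
    beta, *_ = np.linalg.lstsq(Pi, y, rcond=None)        # beta_hat_p = argmin ||y - Pi beta||^2
    return Pi @ beta

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(50))  # simulated series standing in for data
tau3 = poly_trend(y, 3)                 # orthogonal projection of y onto S(Pi_3)
```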

2.3.3. $D_p$ and $U_p$

Denote a spectral decomposition of $C_p$, which is defined by (4), as follows:
$$C_p=U_pD_pU_p^\top, \qquad (8)$$
where $D_p=\operatorname{diag}(d_{p,1},\dots,d_{p,n})$ and $U_p=[u_{p,1},\dots,u_{p,n}]$ is an orthogonal matrix. Given that $C_p$ is a positive semidefinite matrix whose rank is $n-p$, we let
$$0=d_{p,1}=\dots=d_{p,p}<d_{p,p+1}\le\dots\le d_{p,n}. \qquad (9)$$
Here, the largest eigenvalue of $C_p$ is bounded above by $2^{2p}$; i.e., it follows that
$$d_{p,n}\le2^{2p}. \qquad (10)$$
A proof of the inequality in (10) is provided in Appendix A.1. Figure 1 depicts $d_{p,1},\dots,d_{p,n}$ for $p=3$ and $n=50$. Note that if $p=3$, it follows that $d_{p,n}\le2^{2p}=64$. We can confirm this from the figure. Since $C_p\iota=\Delta_p^\top\Delta_p\iota=0_{n,1}$, we can let $(d_{p,1},u_{p,1})=\left(0,\frac{1}{\sqrt{n}}\iota\right)$.
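The eigenvalue facts above are easy to check numerically. The following sketch (our own illustration, not from the paper) computes the spectral decomposition (8) with `numpy.linalg.eigh` and verifies (9) and (10) for $p=3$ and $n=50$:

```python
import numpy as np
from math import comb

n, p = 50, 3
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a
Cp = Dp.T @ Dp

d, U = np.linalg.eigh(Cp)        # ascending eigenvalues d, orthogonal U: Eq. (8)
print(np.allclose(d[:p], 0.0))   # True: d_{p,1} = ... = d_{p,p} = 0, Eq. (9)
print(d[-1] <= 2 ** (2 * p))     # True: largest eigenvalue is at most 2^(2p) = 64, Eq. (10)
```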

2.3.4. $D_{p,j}$ and $U_{p,j}$ for $j=1,2$

Let $D_{p,1}=\operatorname{diag}(d_{p,1},\dots,d_{p,p})$ and $D_{p,2}=\operatorname{diag}(d_{p,p+1},\dots,d_{p,n})$. Accordingly, it follows that
$$D_p=\begin{bmatrix}D_{p,1}&0_{p,n-p}\\0_{n-p,p}&D_{p,2}\end{bmatrix}.$$
In addition, from the inequalities in (9), $D_{p,1}=0_{p,p}$, and $D_{p,2}\in\mathbb{R}^{(n-p)\times(n-p)}$ is a positive definite matrix. Let $U_{p,1}=[u_{p,1},\dots,u_{p,p}]$ and $U_{p,2}=[u_{p,p+1},\dots,u_{p,n}]$. Accordingly,
$$U_p=[U_{p,1},U_{p,2}].$$

2.4. Results Regarding the Key Matrices

For $k=0,\dots,p-1$, $\tau^{(k)}$ in (5) belongs to the null space of $\Delta_p$ in (3); i.e., it follows that
$$\Delta_p\tau^{(k)}=0_{n-p,1}. \qquad (11)$$
A proof of (11) is provided in Appendix A.2. Accordingly, we have
$$\Delta_p\Pi_p=[\Delta_p\tau^{(0)},\dots,\Delta_p\tau^{(p-1)}]=0_{n-p,p}, \qquad (12)$$
from which it follows that
$$C_p\Pi_p=\Delta_p^\top\Delta_p\Pi_p=0_{n,p}. \qquad (13)$$
Thus, given that (i) the nullity of $C_p$ is $p$ and (ii) $\Pi_p=[\tau^{(0)},\dots,\tau^{(p-1)}]$ is of full column rank, $\{\tau^{(0)},\dots,\tau^{(p-1)}\}$ is a basis of the null space of $C_p$. On the other hand, from (8) and (9), it follows that
$$C_pU_{p,1}=0_{n,p},$$
which implies that $\{u_{p,1},\dots,u_{p,p}\}$ is an orthonormal basis of the null space of $C_p$. Combining these results, it follows that
$$S(\Pi_p)=S(U_{p,1}). \qquad (14)$$
From (14), we obtain the following two results:
$$U_{p,1}U_{p,1}^\top\;\bigl(=U_{p,1}(U_{p,1}^\top U_{p,1})^{-1}U_{p,1}^\top\bigr)=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top, \qquad (15)$$
$$U_{p,2}^\top\Pi_p=0_{n-p,p}. \qquad (16)$$
Given that $C_p=\Delta_p^\top\Delta_p$ and $C_pu_{p,i}=d_{p,i}u_{p,i}$ for $i=1,\dots,n$, we have
$$\|\Delta_pu_{p,i}\|^2=u_{p,i}^\top C_pu_{p,i}=d_{p,i},\quad i=1,\dots,n. \qquad (17)$$
Then, from (9) and (10), it follows that
$$0=\|\Delta_pu_{p,1}\|^2=\dots=\|\Delta_pu_{p,p}\|^2<\|\Delta_pu_{p,p+1}\|^2\le\dots\le\|\Delta_pu_{p,n}\|^2\le2^{2p}. \qquad (18)$$
These indicate that, with respect to $\Delta_p$, which is defined by (3),
(i)
the degree of smoothness of $u_{p,1},\dots,u_{p,p}$ is the highest, and
(ii)
the degree of smoothness of $u_{p,i}$ is higher than or equal to that of $u_{p,i+1}$ for $i=p,\dots,n-1$.
Figure 2 depicts $u_{p,4}$, $u_{p,8}$, $u_{p,12}$, and $u_{p,16}$ for $p=3$ and $n=50$. This figure is consistent with (ii).

2.5. WH Graduation

The WH($p$) graduation is a smoothing method given as follows:
$$\min_{x_1,\dots,x_n\in\mathbb{R}}\ \sum_{t=1}^{n}(y_t-x_t)^2+\lambda_p\sum_{t=p+1}^{n}(\Delta^px_t)^2, \qquad (19)$$
where $\lambda_p\in(0,\infty)$ is a smoothing parameter and $p$ is a positive integer such that $n>p$. It is a penalized least squares regression. When $p=1$, since $\Delta x_t=(1-L)x_t=x_t-x_{t-1}$, (19) is identical to the Bohlmann filter. When $p=2$, since $\Delta^2x_t=(1-2L+L^2)x_t=x_t-2x_{t-1}+x_{t-2}$, (19) is identical to the HP filter, given as follows:
$$\min_{x_1,\dots,x_n\in\mathbb{R}}\ \sum_{t=1}^{n}(y_t-x_t)^2+\lambda_2\sum_{t=3}^{n}(x_t-2x_{t-1}+x_{t-2})^2. \qquad (20)$$
When $p=3$, (19) is identical to the Whittaker graduation.
The WH($p$) graduation can be represented in matrix form as follows:
$$\min_{x}\ f_p(x)=\|y-x\|^2+\lambda_px^\top C_px. \qquad (21)$$
Then, $C_p$, which is defined by (4), is the penalty matrix of the WH graduation. Denoting the solution to the minimization problem in (21) by $\hat{x}_p$, it follows that
$$\hat{x}_p=A_py, \qquad (22)$$
where
$$A_p=(I_n+\lambda_pC_p)^{-1}. \qquad (23)$$
$A_p$ is referred to as the smoother matrix of the WH($p$) graduation.
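To fix ideas, the following sketch (our own illustration, with a simulated random walk standing in for data) computes the WH($p$) graduation by solving the linear system implied by (22) and (23). A dense solve is used for clarity; the banded structure of $I_n+\lambda_pC_p$ admits much faster algorithms (cf. Weinert 2007 [4]).

```python
import numpy as np
from math import comb

def wh_graduation(y: np.ndarray, p: int, lam: float) -> np.ndarray:
    """WH(p) graduation: solve (I_n + lam * C_p) x = y, cf. (22) and (23)."""
    n = len(y)
    a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
    Dp = np.zeros((n - p, n))
    for t in range(n - p):
        Dp[t, t:t + p + 1] = a
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) + lam * (Dp.T @ Dp), y)

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(100))       # simulated series standing in for data
hp_trend = wh_graduation(y, p=2, lam=1600.0)  # p = 2 with lambda = 1600 is the HP filter
```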

2.6. Spectral Representation of WH Graduation

Given the spectral decomposition of $C_p$ in (8), $A_p$, which is the smoother matrix of the WH($p$) graduation, can be spectrally decomposed as follows:
$$A_p=(I_n+\lambda_pC_p)^{-1}=U_p(I_n+\lambda_pD_p)^{-1}U_p^\top=U_pB_pU_p^\top, \qquad (24)$$
where $B_p=(I_n+\lambda_pD_p)^{-1}$. Let $B_p=\operatorname{diag}(b_{p,1},\dots,b_{p,n})$. Then, given that
$$b_{p,i}=(1+\lambda_pd_{p,i})^{-1},\quad i=1,\dots,n, \qquad (25)$$
from the inequalities in (9), it follows that
$$1=b_{p,1}=\dots=b_{p,p}>b_{p,p+1}\ge\dots\ge b_{p,n}>0. \qquad (26)$$

3. Boosted WH Graduation

3.1. Boosted WH Graduation

We define the boosted WH graduation of order $p$, the bWH($p$) graduation for short, as follows:
$$\hat{x}_p^{(m)}=A_p^{(m)}y, \qquad (27)$$
where
$$A_p^{(m)}=I_n-(I_n-A_p)^m. \qquad (28)$$
We refer to $A_p^{(m)}$ as the smoother matrix of the bWH($p$) graduation.
The bWH($p$) graduation is related to the existing filters as follows. In our notation, the boosted HP filter developed by Phillips and Shi (2021) [8] is represented as follows:
$$\hat{x}_2^{(m)}=A_2^{(m)}y. \qquad (29)$$
Thus, the bWH($p$) graduation is a generalization of the bHP filter. In addition, since
$$A_p^{(1)}=I_n-(I_n-A_p)=A_p, \qquad (30)$$
the bWH($p$) graduation is also a generalization of the WH($p$) graduation. Moreover, the bWH(1) graduation was dealt with in Bao and Yamada (2024) [17]. Finally, Hall and Thomson (2024) [13] refer to $\hat{x}_2^{(2)}$ as twicing.
We illustrate how boosting brings a gain. Given that $A_p^{(2)}=I_n-(I_n-A_p)^2=I_n-I_n+2A_p-A_p^2=A_p+A_p(I_n-A_p)$, it follows that
$$\hat{x}_p^{(2)}=\hat{x}_p+A_p(y-\hat{x}_p). \qquad (31)$$
Here, $A_p(y-\hat{x}_p)$ in (31) can be regarded as a gain from boosting. Given that $A_p$ is the smoother matrix of the WH graduation, $A_p(y-\hat{x}_p)$ is a trend recovered from the WH graduation residuals, $y-\hat{x}_p$.
MATLAB/GNU Octave, R, and Python user-defined functions for calculating x ^ p ( m ) in (27) are available from GitHub. The URL is https://github.com/HiroshiFromHiroshima/Boosted_Whittaker-Henderson_Graduation (accessed on 26 October 2024). We used MATLAB version R2018b, GNU Octave version 7.1.0, R version 4.2.3, and Python version 3.12.5 to verify these user-defined functions.
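As a complement to those functions, here is a minimal NumPy sketch of the boosting recursion implied by (31) (our own illustration, not the code from the repository above). It iterates $\hat{x}^{(j)}=\hat{x}^{(j-1)}+A_p(y-\hat{x}^{(j-1)})$, which yields $A_p^{(m)}y$ after $m$ steps:

```python
import numpy as np
from math import comb

def bwh_graduation(y: np.ndarray, p: int, lam: float, m: int) -> np.ndarray:
    """bWH(p) graduation via the recursion x^(j) = x^(j-1) + A_p (y - x^(j-1))."""
    n = len(y)
    a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
    Dp = np.zeros((n - p, n))
    for t in range(n - p):
        Dp[t, t:t + p + 1] = a
    M = np.eye(n) + lam * (Dp.T @ Dp)   # I_n + lam * C_p, so A_p = M^(-1)
    x = np.linalg.solve(M, y)           # x_hat^(1) = A_p y: the WH(p) graduation
    for _ in range(m - 1):
        x = x + np.linalg.solve(M, y - x)  # add the trend recovered from the residuals
    return x                            # equals A_p^(m) y = {I_n - (I_n - A_p)^m} y

rng = np.random.default_rng(2)
y = np.cumsum(rng.standard_normal(100))
x2 = bwh_graduation(y, p=3, lam=1160.0, m=2)  # bWH(3) graduation with two boosting steps
```

Again, a production implementation would exploit the banded structure of $I_n+\lambda_pC_p$ rather than the dense solve used here.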

3.2. Spectral Representation of bWH Graduation

Given the spectral decomposition of $A_p$ in (24), $A_p^{(m)}$ in (28), which is the smoother matrix of the bWH($p$) graduation, can be spectrally decomposed as follows:
$$A_p^{(m)}=I_n-(I_n-A_p)^m=I_n-(I_n-U_pB_pU_p^\top)^m=U_p\{I_n-(I_n-B_p)^m\}U_p^\top=U_pB_p^{(m)}U_p^\top, \qquad (32)$$
where
$$B_p^{(m)}=I_n-(I_n-B_p)^m. \qquad (33)$$
From (32), we have some useful results.
• Given that $B_p^{(m)}$ is a diagonal matrix, from (32), it follows that
$$(A_p^{(m)})^\top=(U_pB_p^{(m)}U_p^\top)^\top=U_pB_p^{(m)}U_p^\top=A_p^{(m)}. \qquad (34)$$
That is, $A_p^{(m)}$ is a symmetric matrix.
• Let $B_p^{(m)}=\operatorname{diag}(b_{p,1}^{(m)},\dots,b_{p,n}^{(m)})$. Then, it follows that
$$b_{p,i}^{(m)}=1-(1-b_{p,i})^m=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m,\quad i=1,\dots,n. \qquad (35)$$
Thus, it follows from (26) that
$$1=b_{p,1}^{(m)}=\dots=b_{p,p}^{(m)}>b_{p,p+1}^{(m)}\ge\dots\ge b_{p,n}^{(m)}>0. \qquad (36)$$
Figure 3 depicts $b_{p,1}^{(m)},\dots,b_{p,n}^{(m)}$ for $p=3$, $n=50$, $m=2$, and $\lambda_p=1000$; it illustrates the inequalities in (36). Then, given that $A_p^{(m)}$ is a symmetric matrix whose eigenvalues, $b_{p,i}^{(m)}$ for $i=1,\dots,n$, are all positive, $A_p^{(m)}$ is a positive definite matrix.
• Let $B_{p,1}^{(m)}=\operatorname{diag}(b_{p,1}^{(m)},\dots,b_{p,p}^{(m)})$ and $B_{p,2}^{(m)}=\operatorname{diag}(b_{p,p+1}^{(m)},\dots,b_{p,n}^{(m)})$. Then, given that $B_{p,1}^{(m)}=I_p$, the spectral decomposition of $A_p^{(m)}$ in (32) becomes
$$A_p^{(m)}=U_{p,1}U_{p,1}^\top+U_{p,2}B_{p,2}^{(m)}U_{p,2}^\top. \qquad (37)$$
Moreover, from (15), $A_p^{(m)}$ can be represented as follows:
$$A_p^{(m)}=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top+U_{p,2}B_{p,2}^{(m)}U_{p,2}^\top. \qquad (38)$$
• Given (16), postmultiplying (37) by $\Pi_p$ yields the following:
$$A_p^{(m)}\Pi_p=\Pi_p. \qquad (39)$$
Let us summarize the above results.
Lemma 1.
The smoother matrix of the bWH graduation, $A_p^{(m)}$, has the following properties.
(i)
$A_p^{(m)}$ is a symmetric matrix. Moreover, it is a positive definite matrix whose eigenvalues satisfy (36).
(ii)
$A_p^{(m)}$ can be represented as in (38).
(iii)
$A_p^{(m)}$ is a matrix such that $A_p^{(m)}\Pi_p=\Pi_p$.
Postmultiplying (38) by $y$, we immediately obtain the following result.
Lemma 2.
$\hat{x}_p^{(m)}$ in (27) can be spectrally represented as
$$\hat{x}_p^{(m)}=\hat{\tau}_p+b_{p,p+1}^{(m)}z_{p,p+1}u_{p,p+1}+\dots+b_{p,n}^{(m)}z_{p,n}u_{p,n}, \qquad (40)$$
where $z_{p,i}=u_{p,i}^\top y$ for $i=p+1,\dots,n$, and $\hat{\tau}_p\in S^{\perp}(U_{p,2})$ is the orthogonal projection of $y$ onto $S(\Pi_p)$.
Remark 1.
Given the inequalities in (18) and (36), the spectral representation of $\hat{x}_p^{(m)}$ in Lemma 2 shows how smoothing is performed. See also Figure 3, which illustrates (36).
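To make Lemma 2 concrete, the following NumPy sketch (our own illustration, not taken from the paper or its repository) builds $\hat{x}_p^{(m)}$ from the spectral form (40) and checks that it coincides with the direct definition in (27) and (28); the random-walk input is only a stand-in for real data:

```python
import numpy as np
from math import comb

n, p, lam, m = 50, 3, 1000.0, 2
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a
Cp = Dp.T @ Dp

d, U = np.linalg.eigh(Cp)          # C_p = U diag(d) U', Eq. (8)
b = 1.0 / (1.0 + lam * d)          # b_{p,i}, Eq. (25)
bm = 1.0 - (1.0 - b) ** m          # b_{p,i}^(m), Eq. (35); the first p entries equal 1

rng = np.random.default_rng(3)
y = np.cumsum(rng.standard_normal(n))
# U @ (bm * U'y) = tau_hat_p plus the shrunken high-frequency terms in Eq. (40)
x_spec = U @ (bm * (U.T @ y))

A = np.linalg.inv(np.eye(n) + lam * Cp)
Am = np.eye(n) - np.linalg.matrix_power(np.eye(n) - A, m)
print(np.allclose(x_spec, Am @ y))  # True: matches the definition in (27) and (28)
```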

4. Properties of the bWH Graduation

In this section, we establish several properties of the bWH graduation.
For this purpose, we introduce some notations. The $t$-th entry of $\hat{x}_p^{(m)}$ in (27) is denoted by $\hat{x}_{p,t}^{(m)}$ for $t=1,\dots,n$, and the $(i,j)$-th entry of $A_p^{(m)}$ in (28) is denoted by $a_{p,i,j}^{(m)}$ for $i,j=1,\dots,n$. Let
$$C_p^{(m)}=\frac{1}{\lambda_p}\left\{(A_p^{(m)})^{-1}-I_n\right\}. \qquad (41)$$
Here, we note that $C_p^{(m)}$ in (41) can be defined because $A_p^{(m)}$ is a positive definite matrix from Lemma 1(i). In addition, let
$$D_p^{(m)}=\frac{1}{\lambda_p}\left\{(B_p^{(m)})^{-1}-I_n\right\}. \qquad (42)$$
Here, $(B_p^{(m)})^{-1}$ in (42) is a diagonal matrix whose diagonal entries are all positive (see the inequalities in (36)).
Proposition 1.
The bWH graduation has the following properties.
(i) 
(a) The average of the entries of $\hat{x}_p^{(m)}$ in (27) is equal to that of $y$. That is, it follows that
$$\frac{1}{n}\sum_{t=1}^{n}\hat{x}_{p,t}^{(m)}=\frac{1}{n}\sum_{t=1}^{n}y_t. \qquad (43)$$
(b) The bWH graduation residuals, $y-\hat{x}_p^{(m)}$, sum to zero. That is, it follows that
$$\sum_{t=1}^{n}(y_t-\hat{x}_{p,t}^{(m)})=0. \qquad (44)$$
(ii) 
The result in (i)(a) can be generalized as follows:
$$\frac{1}{n}\sum_{t=1}^{n}t^k\hat{x}_{p,t}^{(m)}=\frac{1}{n}\sum_{t=1}^{n}t^ky_t,\quad k=0,\dots,p-1. \qquad (45)$$
(iii) 
Each row of the smoother matrix $A_p^{(m)}$ in (28) sums to unity. That is, it follows that
$$a_{p,i,1}^{(m)}+\dots+a_{p,i,n}^{(m)}=1,\quad i=1,\dots,n. \qquad (46)$$
(iv) 
$\hat{\tau}_p$ in (7) satisfies
$$A_p^{(m)}\hat{\tau}_p=\hat{\tau}_p. \qquad (47)$$
Accordingly, $\hat{x}_p^{(m)}$ in (27) can be represented by $\hat{\tau}_p$ as follows:
$$\hat{x}_p^{(m)}=\hat{\tau}_p+A_p^{(m)}(y-\hat{\tau}_p). \qquad (48)$$
(v) 
When $n$ and $m$ are fixed, as $\lambda_p\to\infty$, $\hat{x}_p^{(m)}\to\hat{\tau}_p$.
(vi) 
When $n$ and $m$ are fixed, as $\lambda_p\to0$, $\hat{x}_p^{(m)}\to y$.
(vii) 
When $n$ and $\lambda_p$ are fixed, as $m\to\infty$, $\hat{x}_p^{(m)}\to y$.
(viii) 
When $n$, $m$, and $\lambda_p$ are fixed, as $h\to\infty$, $(A_p^{(m)})^hy\to\hat{\tau}_p$.
(ix) 
$\hat{x}_p^{(m)}$ in (27) can be considered as the solution of a penalized least squares problem. More specifically, it follows that
$$\hat{x}_p^{(m)}=\arg\min_{x}\ f_p^{(m)}(x)=\|y-x\|^2+\lambda_px^\top C_p^{(m)}x=(I_n+\lambda_pC_p^{(m)})^{-1}y. \qquad (49)$$
(x) 
$\hat{x}_p^{(m)}$ in (27) can be represented by
$$\hat{x}_p^{(m)}=U_p\hat{\theta}_p^{(m)}, \qquad (50)$$
where $U_p$ is the $n\times n$ orthogonal matrix in (8) and
$$\hat{\theta}_p^{(m)}=\arg\min_{\theta}\ g_p^{(m)}(\theta)=\|y-U_p\theta\|^2+\lambda_p\theta^\top D_p^{(m)}\theta=(I_n+\lambda_pD_p^{(m)})^{-1}U_p^\top y. \qquad (51)$$
Here, $D_p^{(m)}$ is a diagonal matrix whose first $p$ diagonal entries are zeros; therefore, the first $p$ entries of $\theta$ are not penalized.
(xi) 
The penalty matrices $C_p^{(m)}$ in (49) and $D_p^{(m)}$ in (51) are similar; therefore, they have the same eigenvalues. Furthermore, $C_p^{(m)}$ is a non-negative definite matrix whose null space is identical to $S(\Pi_p)$.
Proof. 
The proofs of (i) to (xi) are, in turn, as follows.
(i)
$\iota$ is one of the columns of $\Pi_p$ in (6). Then, its orthogonal projection onto $S(\Pi_p)$ is itself; i.e., it follows that $\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\iota=\iota$. In addition, from (16), it follows that $\iota^\top u_{p,i}=0$ for $i=p+1,\dots,n$. Accordingly, premultiplying (40) by $\iota^\top$ yields
$$\iota^\top\hat{x}_p^{(m)}=\iota^\top\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\iota^\top y, \qquad (52)$$
which implies that the average of the entries of $\hat{x}_p^{(m)}$ equals that of $y$. In addition, from (52), it immediately follows that
$$\iota^\top(y-\hat{x}_p^{(m)})=0.$$
Thus, the bWH($p$) graduation residuals sum to zero.
(ii)
$\tau^{(k)}$ is one of the columns of $\Pi_p$ in (6). Then, it follows that $\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\tau^{(k)}=\tau^{(k)}$ for $k=0,\dots,p-1$. In addition, from (16), it follows that $(\tau^{(k)})^\top u_{p,i}=0$ for $k=0,\dots,p-1$ and $i=p+1,\dots,n$. Accordingly, premultiplying (40) by $(\tau^{(k)})^\top$ yields
$$(\tau^{(k)})^\top\hat{x}_p^{(m)}=(\tau^{(k)})^\top\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=(\tau^{(k)})^\top y,$$
from which we have (45).
(iii)
Given that $\iota$ is one of the columns of $\Pi_p$ in (6), it follows from (39) that
$$A_p^{(m)}\iota=\iota.$$
Thus, each row of the smoother matrix $A_p^{(m)}$ sums to unity.
(iv)
Again from (39), regarding $\hat{\tau}_p$ in (7), it follows that
$$A_p^{(m)}\hat{\tau}_p=A_p^{(m)}\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\hat{\tau}_p,$$
from which we immediately obtain
$$\hat{x}_p^{(m)}=\hat{\tau}_p+A_p^{(m)}(y-\hat{\tau}_p).$$
(v)
From (35), it follows that
$$b_{p,i}^{(m)}=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m\to0\quad(n,m:\text{fixed},\ \lambda_p\to\infty)$$
for $i=p+1,\dots,n$. Accordingly, it follows from Lemma 2 that
$$\hat{x}_p^{(m)}=\hat{\tau}_p+b_{p,p+1}^{(m)}z_{p,p+1}u_{p,p+1}+\dots+b_{p,n}^{(m)}z_{p,n}u_{p,n}\to\hat{\tau}_p\quad(n,m:\text{fixed},\ \lambda_p\to\infty).$$
(vi)
From (35) and (36), it follows that $b_{p,1}^{(m)}=\dots=b_{p,p}^{(m)}=1$ and
$$b_{p,i}^{(m)}=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m\to1\quad(n,m:\text{fixed},\ \lambda_p\to0)$$
for $i=p+1,\dots,n$. Accordingly, it follows from (32) that
$$A_p^{(m)}=U_pB_p^{(m)}U_p^\top\to U_pU_p^\top=I_n\quad(n,m:\text{fixed},\ \lambda_p\to0).$$
Therefore, we have
$$\hat{x}_p^{(m)}=A_p^{(m)}y\to y\quad(n,m:\text{fixed},\ \lambda_p\to0).$$
(vii)
From (35) and (36), it follows that $b_{p,1}^{(m)}=\dots=b_{p,p}^{(m)}=1$ and
$$b_{p,i}^{(m)}=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m\to1\quad(n,\lambda_p:\text{fixed},\ m\to\infty)$$
for $i=p+1,\dots,n$. Accordingly, it follows from (32) that
$$A_p^{(m)}=U_pB_p^{(m)}U_p^\top\to U_pU_p^\top=I_n\quad(n,\lambda_p:\text{fixed},\ m\to\infty).$$
Therefore, we have
$$\hat{x}_p^{(m)}=A_p^{(m)}y\to y\quad(n,\lambda_p:\text{fixed},\ m\to\infty).$$
(viii)
From (32) and (36), it follows that
$$(A_p^{(m)})^h=U_p(B_p^{(m)})^hU_p^\top\to U_{p,1}U_{p,1}^\top\quad(n,m,\lambda_p:\text{fixed},\ h\to\infty).$$
Accordingly, it follows from (15) that
$$(A_p^{(m)})^hy=U_p(B_p^{(m)})^hU_p^\top y\to U_{p,1}U_{p,1}^\top y=\hat{\tau}_p\quad(n,m,\lambda_p:\text{fixed},\ h\to\infty).$$
(ix)
From the definition of $C_p^{(m)}$ given by (41), it immediately follows that $A_p^{(m)}=(I_n+\lambda_pC_p^{(m)})^{-1}$. Accordingly, we have
$$\hat{x}_p^{(m)}=A_p^{(m)}y=(I_n+\lambda_pC_p^{(m)})^{-1}y=\arg\min_{x}\ f_p^{(m)}(x)=\|y-x\|^2+\lambda_px^\top C_p^{(m)}x.$$
(x)
Based on the spectral decomposition of $A_p^{(m)}$ in (32), i.e., $A_p^{(m)}=U_pB_p^{(m)}U_p^\top$, $C_p^{(m)}$ in (41) can be decomposed as follows:
$$C_p^{(m)}=\frac{1}{\lambda_p}\left\{U_p(B_p^{(m)})^{-1}U_p^\top-I_n\right\}=U_pD_p^{(m)}U_p^\top, \qquad (53)$$
by which we have
$$U_p\hat{\theta}_p^{(m)}=U_p(I_n+\lambda_pD_p^{(m)})^{-1}U_p^\top y=(I_n+\lambda_pU_pD_p^{(m)}U_p^\top)^{-1}y=(I_n+\lambda_pC_p^{(m)})^{-1}y=A_p^{(m)}y=\hat{x}_p^{(m)}.$$
Let
$$d_{p,i}^{(m)}=\frac{1}{\lambda_p}\left(\frac{1}{b_{p,i}^{(m)}}-1\right),\quad i=1,\dots,n. \qquad (54)$$
Then, $D_p^{(m)}=\operatorname{diag}(d_{p,1}^{(m)},\dots,d_{p,n}^{(m)})$, and from the inequalities given in (36), it follows that
$$0=d_{p,1}^{(m)}=\dots=d_{p,p}^{(m)}<d_{p,p+1}^{(m)}\le\dots\le d_{p,n}^{(m)}. \qquad (55)$$
(xi)
Given that $U_p$ in (8) is an orthogonal matrix, from (53), $C_p^{(m)}$ is similar to $D_p^{(m)}$, and they have the same eigenvalues. Then, based on (55), $C_p^{(m)}$ is a non-negative definite matrix such that its nullity is $p$. Based on Lemma 1(i) and (39), $A_p^{(m)}$ is a positive definite matrix such that $A_p^{(m)}\Pi_p=\Pi_p$. Then, it follows that
$$(A_p^{(m)})^{-1}\Pi_p=\Pi_p,$$
from which we have
$$C_p^{(m)}\Pi_p=\frac{1}{\lambda_p}\left\{(A_p^{(m)})^{-1}-I_n\right\}\Pi_p=\frac{1}{\lambda_p}(\Pi_p-\Pi_p)=0_{n,p}.$$
Thus, $\tau^{(0)},\dots,\tau^{(p-1)}$ belong to the null space of $C_p^{(m)}$. □
Remark 2.
Regarding Proposition 1, we make several remarks.
1. 
Some of the results in Proposition 1 are generalizations of those in Yamada (2020, Proposition 2.2) [22], which documents several properties of the HP filter. For example, Proposition 1(i) is a generalization of Yamada (2020, Proposition 2.2(iii)(a)) [22].
2. 
Since $A_p^{(m)}$ is symmetric, from $A_p^{(m)}\iota=\iota$, it follows that $\iota^\top A_p^{(m)}=\iota^\top$, from which we have
$$\iota^\top\hat{x}_p^{(m)}=\iota^\top A_p^{(m)}y=\iota^\top y.$$
This is another proof of Proposition 1(i)(a).
3. 
$\sum_{t=1}^{n}t^k\hat{x}_{p,t}^{(m)}=\sum_{t=1}^{n}t^ky_t$ for $k=0,\dots,p-1$ in Proposition 1(ii) is a generalization of $\sum_{t=1}^{n}t^k\hat{x}_{2,t}^{(1)}=\sum_{t=1}^{n}t^ky_t$ for $k=0,1$ in Weinert (2007, p. 960) [4]. Here, $\hat{x}_{2,t}^{(1)}$ equals the $t$-th entry of $\hat{x}_2=(I_n+\lambda_2\Delta_2^\top\Delta_2)^{-1}y$.
4. 
$\hat{x}_p^{(m)}=\hat{\tau}_p+A_p^{(m)}(y-\hat{\tau}_p)$ in Proposition 1(iv) is a generalization of the results in Kim et al. (2009, p. 342) [21] and Yamada (2018, Equation (3)) [23]. Given that $A_p^{(m)}$ is a low-pass filter, it indicates that $\hat{x}_p^{(m)}$ is the sum of the polynomial time trend estimated by OLS, $\hat{\tau}_p$, and the low-frequency components in the polynomial time trend residuals, $y-\hat{\tau}_p$.
5. 
Given that $\hat{x}_p^{(m)}=A_p^{(m)}y$, it follows, for example, that $(A_p^{(m)})^2y=A_p^{(m)}\hat{x}_p^{(m)}$. Thus, $(A_p^{(m)})^hy$ in Proposition 1(viii) represents the result of $h$ repeated applications of the bWH graduation. In addition, from Proposition 1(viii), it follows that
$$\Delta_p(A_p^{(m)})^hy\to0_{n-p,1}\quad(n,m,\lambda_p:\text{fixed},\ h\to\infty),$$
which is a generalization of the result in Weinert (2007, p. 961) [4].
6. 
Proposition 1(ix) and (x) are generalizations of the results in Knight (2021) [10]. Given that $A_p^{(1)}=A_p=(I_n+\lambda_pC_p)^{-1}$, by its definition, it follows that
$$C_p^{(1)}=\frac{1}{\lambda_p}\left\{(A_p^{(1)})^{-1}-I_n\right\}=\frac{1}{\lambda_p}(I_n+\lambda_pC_p-I_n)=C_p.$$
Therefore, if $m=1$, then (49) becomes
$$\hat{x}_p^{(1)}=\arg\min_{x}\ f_p^{(1)}(x)=\|y-x\|^2+\lambda_px^\top C_p^{(1)}x=(I_n+\lambda_pC_p^{(1)})^{-1}y,$$
which is the WH graduation. From (25), (35), and (54), it also follows that
$$d_{p,i}^{(1)}=\frac{1}{\lambda_p}\left(\frac{1}{b_{p,i}^{(1)}}-1\right)=\frac{1}{\lambda_p}\left(\frac{1}{b_{p,i}}-1\right)=d_{p,i},\quad i=1,\dots,n.$$
Accordingly, we obtain $D_p^{(1)}=D_p$, which is consistent with (58).
7. 
The penalized least squares problem in (51) is a generalized ridge regression representation of the bWH graduation.
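Several items of Proposition 1 can also be verified numerically. The sketch below (our own illustration, with simulated data standing in for an actual series) checks (i)(a), (ii), and (iii) for the bWH($p$) graduation:

```python
import numpy as np
from math import comb

n, p, lam, m = 21, 3, 1160.0, 2
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a
A = np.linalg.inv(np.eye(n) + lam * (Dp.T @ Dp))
Am = np.eye(n) - np.linalg.matrix_power(np.eye(n) - A, m)

rng = np.random.default_rng(4)
y = np.cumsum(rng.standard_normal(n))      # simulated stand-in for the data
x = Am @ y
t = np.arange(1, n + 1)

print(np.allclose(Am.sum(axis=1), 1.0))    # Proposition 1(iii): rows sum to unity
print(np.isclose(x.mean(), y.mean()))      # Proposition 1(i)(a): averages coincide
for k in range(p):                         # Proposition 1(ii): t^k moments preserved
    print(np.isclose(np.sum(t ** k * x), np.sum(t ** k * y)))
```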

5. An Empirical Illustration

In this section, we provide an empirical illustration of the bWH($p$) graduation. We use the same data as in Nocon and Scott (2012, Example 1) [6], which are annual data from 1989 to 2009; accordingly, $n=21$. The data are taken from Table 1 of their paper [6].
Figure 4 shows the results for the case where $\lambda_p=10^{6}$ and $p=3$. The top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}\,(=\hat{x}_p)$ (blue solid line). Since the value of $\lambda_p$ is huge, from Proposition 1(v), it follows that
$$\hat{x}_p^{(1)}\approx\hat{\tau}_p=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y.$$
Alternatively, we can see this as the result of the following fact:
$$A_p^{(1)}=A_p=(I_n+\lambda_p\Delta_p^\top\Delta_p)^{-1}=I_n-\Delta_p^\top\left(\frac{1}{\lambda_p}I_{n-p}+\Delta_p\Delta_p^\top\right)^{-1}\Delta_p\approx I_n-\Delta_p^\top(\Delta_p\Delta_p^\top)^{-1}\Delta_p=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top,$$
if $\lambda_p$ is huge. Here, the third equality follows from applying the Sherman–Morrison–Woodbury formula to $(I_n+\lambda_p\Delta_p^\top\Delta_p)^{-1}$. The last equality follows from the fact that $[\Pi_p,\Delta_p^\top]$ is nonsingular and $(\Delta_p^\top)^\top\Pi_p=\Delta_p\Pi_p=0_{n-p,p}$. The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). From the panel, it is observable that
$$A_p^{(1)}(y-\hat{x}_p^{(1)})\approx0_{n,1},$$
which shows that there is no gain from boosting if $\lambda_p$ is huge. The result is reasonable from
$$\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\left\{I_n-\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\right\}=0_{n,n}.$$
The bottom panel shows $\hat{x}_p^{(1)}$ (red dashed line) and $\hat{x}_p^{(2)}$ (blue solid line). In this case, since
$$\hat{x}_p^{(2)}=\hat{x}_p^{(1)}+A_p^{(1)}(y-\hat{x}_p^{(1)})\approx\hat{x}_p^{(1)}, \qquad (64)$$
we cannot observe the red dashed line in the panel. Note that the first equality in (64) follows from (31).
Figure 5 shows the results for the case where $\lambda_p=1160$ and $p=3$; $\lambda_p=1160$ is the value used in Nocon and Scott (2012) [6], who selected it by generalized cross-validation. Again, the top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}$ (blue solid line). The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). From the middle panel, it is observable that
$$A_p^{(1)}(y-\hat{x}_p^{(1)})\not\approx0_{n,1}.$$
Recall that $A_p^{(1)}(y-\hat{x}_p^{(1)})$ is the gain from boosting. Given that $\hat{x}_p^{(2)}=\hat{x}_p^{(1)}+A_p^{(1)}(y-\hat{x}_p^{(1)})$, due to this gain from boosting, in the bottom panel, $\hat{x}_p^{(2)}$ (blue solid line) is different from $\hat{x}_p^{(1)}$ (red dashed line). In addition, we report the following two results:
$$\frac{1}{n}\iota^\top y=\frac{1}{n}\iota^\top\hat{\tau}_p=\frac{1}{n}\iota^\top\hat{x}_p^{(1)}=\frac{1}{n}\iota^\top\hat{x}_p^{(2)}=30.695,\qquad\frac{1}{n}\iota^\top A_p^{(1)}(y-\hat{x}_p^{(1)})=0.$$
These are consistent with Proposition 1(i).
We also tried the case where $\lambda_p=10^{-6}$ and $p=3$. In that case, we obtained the following:
$$\hat{x}_p^{(m)}\approx y,\quad m=1,2,$$
$$A_p^{(1)}(y-\hat{x}_p^{(1)})\approx0_{n,1},$$
which are consistent with Proposition 1(vi).

6. Concluding Remarks

In this paper, we developed the boosted version of the WH graduation and established its properties. The theoretical results we obtained are summarized in Proposition 1 and Lemmas 1 and 2 and empirically illustrated in Section 5. See also Table 1, which lists the relationships between the main matrices, such as $A_p^{(m)}$ and $B_p^{(m)}$.
Finally, we give a remark. To use the bWH($p$) graduation in (27), the values of three parameters, $p$, $\lambda_p$, and $m$, must be specified. Among them, the specification of $\lambda_p$ is particularly important because, as empirically shown in Section 5, $\hat{x}_p^{(m)}$ depends strongly on the value of $\lambda_p$. One idea is to determine the value of $\lambda_p$ using the (approximate) gain function corresponding to the smoother matrix $A_p\,(=A_p^{(1)})$. However, it seems better to determine it by also considering the value of $m$, which could be achieved by extending the approach adopted by Hall and Thomson (2024) [13]. We are investigating this issue and will report our findings in the future.

Author Contributions

Writing—original draft, Z.J.; Writing—review & editing, H.Y.; Supervision, H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JST SPRING (JPMJSP2132) and JSPS KAKENHI (23K01377).

Data Availability Statement

The data used in this article are taken from Nocon and Scott (2012, Table 1) [6].

Acknowledgments

We thank the three anonymous referees for their valuable comments. We also thank Keith Knight for permission to reference his unpublished manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs

In this section, we provide two proofs.

Appendix A.1. Proof of (10)

Given $|a_k|=\left|(-1)^{p-k}\binom{p}{k}\right|=\binom{p}{k}$ for $k=0,\dots,p$ and $2^p=(1+1)^p=\sum_{k=0}^{p}\binom{p}{k}$, it follows that $\sum_{k=0}^{p}|a_k|=2^p$. For an $n$-dimensional vector $\eta=[\eta_1,\dots,\eta_n]^\top$, it follows that
$$\|\Delta_p\eta\|=\|a_0\eta_{1:n-p}+\dots+a_p\eta_{p+1:n}\|\le|a_0|\|\eta_{1:n-p}\|+\dots+|a_p|\|\eta_{p+1:n}\|\le\sum_{k=0}^{p}|a_k|\,\|\eta\|=2^p\|\eta\|,$$
where $\eta_{a:b}=[\eta_a,\dots,\eta_b]^\top$. Here, the first inequality follows from the triangle inequality, the second inequality follows from $\|\eta_{a:b}\|\le\|\eta_{1:n}\|=\|\eta\|$, and the final equality follows from $\sum_{k=0}^{p}|a_k|=2^p$. Then, we have
$$d_{p,n}=u_{p,n}^\top C_pu_{p,n}=\|\Delta_pu_{p,n}\|^2\le2^{2p}\|u_{p,n}\|^2=2^{2p},$$
which completes the proof.

Appendix A.2. Proof of (11)

Let $a$ and $b$ be integers such that $b>a\ge0$. Since $\Delta^at^a=a!$, where $\Delta^0=1$, it follows that
$$\Delta^bt^a=\Delta^{b-a}(\Delta^at^a)=\Delta^{b-a}a!=0. \qquad (A1)$$
Given that $\Delta_p\tau^{(k)}=[\Delta^p(p+1)^k,\dots,\Delta^pn^k]^\top\in\mathbb{R}^{n-p}$, the $t$-th entry of $\Delta_p\tau^{(k)}$ is
$$\Delta^p(t+p)^k=\Delta^p\sum_{h=0}^{k}\binom{k}{h}p^ht^{k-h}=\binom{k}{0}p^0\Delta^pt^k+\dots+\binom{k}{k}p^k\Delta^pt^0=0 \qquad (A2)$$
for $t=1,\dots,n-p$. Here, the third equality in (A2) follows from (A1). Recall that $k$ is a non-negative integer of at most $p-1$. Therefore, $\Delta_p\tau^{(k)}=0_{n-p,1}$ for $k=0,\dots,p-1$.
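This claim is also easy to confirm numerically. The short sketch below (our own illustration) verifies that $\Delta_p\tau^{(k)}=0_{n-p,1}$ for $k=0,\dots,p-1$:

```python
import numpy as np
from math import comb

n, p = 30, 4
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a

ts = np.arange(1, n + 1, dtype=float)
for k in range(p):                         # k = 0, ..., p - 1
    print(np.allclose(Dp @ ts ** k, 0.0))  # True: Delta_p tau^(k) = 0_{n-p,1}, Eq. (11)
```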

References

  1. Bohlmann, G. Ein Ausgleichungsproblem. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse 1899, 1899, 260–271.
  2. Hodrick, R.J.; Prescott, E.C. Postwar U.S. business cycles: An empirical investigation. J. Money Credit Bank. 1997, 29, 1–16.
  3. Whittaker, E.T. On a new method of graduation. Proc. Edinb. Math. Soc. 1923, 41, 63–75.
  4. Weinert, H.L. Efficient computation for Whittaker–Henderson smoothing. Comput. Stat. Data Anal. 2007, 52, 959–974.
  5. Phillips, P.C.B. Two New Zealand pioneer econometricians. N. Z. Econ. Pap. 2010, 44, 1–26.
  6. Nocon, A.S.; Scott, W.F. An extension of the Whittaker–Henderson method of graduation. Scand. Actuar. J. 2012, 1, 70–79.
  7. Biessy, G. Revisiting Whittaker–Henderson smoothing. arXiv 2023, arXiv:2306.06932.
  8. Phillips, P.C.B.; Shi, Z. Boosting: Why you can use the HP filter. Int. Econ. Rev. 2021, 62, 521–570.
  9. Yamada, H. Linear trend, HP trend, and bHP trend. SSRN 2024.
  10. Knight, K. The Boosted Hodrick–Prescott Filter, Penalized Least Squares, and Bernstein Polynomials. Unpublished Manuscript. 2021. Available online: https://utstat.utoronto.ca/keith/papers/hp-pls.pdf (accessed on 26 October 2024).
  11. Tomal, M. Testing for overall and cluster convergence of housing rents using robust methodology: Evidence from Polish provincial capitals. Empir. Econ. 2022, 62, 2023–2055.
  12. Trojanek, R.; Gluszak, M.; Kufel, P.; Tanas, J.; Trojanek, M. Pre and post-financial crisis convergence of metropolitan housing markets in Poland. J. Hous. Built Environ. 2023, 38, 515–540.
  13. Hall, V.B.; Thomson, P. Selecting a boosted HP filter for growth cycle analysis based on maximising sharpness. J. Bus. Cycle Res. 2024.
  14. Mei, Z.; Phillips, P.C.B.; Shi, Z. The boosted Hodrick–Prescott filter is more general than you might think. J. Appl. Econom. 2024.
  15. Biswas, E.; Sabzikar, F.; Phillips, P.C.B. Boosting the HP filter for trending time series with long-range dependence. Econom. Rev. 2024.
  16. Yamada, H. Boosted HP filter: Several properties derived from its spectral representation. In Computational Science and Its Applications—ICCSA 2024; Gervasi, O., Murgante, B., Garau, C., Taniar, D., Rocha, A.M.A.C., Faginas Lago, M.N., Eds.; Springer: Cham, Switzerland, 2024.
  17. Bao, R.; Yamada, H. Boosted Whittaker–Henderson Graduation of Order 1: A Graph Spectral Filter Using Discrete Cosine Transform. Contemp. Math. 2024. Forthcoming. Available online: https://www.researchgate.net/publication/384363420_Boosted_Whittaker-Henderson_Graduation_of_Order_1_A_Graph_Spectral_Filter_Using_Discrete_Cosine_Transform (accessed on 26 October 2024).
  18. Anderson, T.W. The Statistical Analysis of Time Series; John Wiley and Sons: New York, NY, USA, 1971.
  19. Strang, G. The discrete cosine transform. SIAM Rev. 1999, 41, 135–147.
  20. Nakatsukasa, Y.; Saito, N.; Woei, E. Mysteries around the graph Laplacian eigenvalue 4. Linear Algebra Its Appl. 2013, 438, 3231–3246.
  21. Kim, S.; Koh, K.; Boyd, S.; Gorinevsky, D. ℓ1 trend filtering. SIAM Rev. 2009, 51, 339–360.
  22. Yamada, H. A smoothing method that looks like the Hodrick–Prescott filter. Econom. Theory 2020, 36, 961–981.
  23. Yamada, H. Why does the trend extracted by the Hodrick–Prescott filtering seem to be more plausible than the linear trend? Appl. Econ. Lett. 2018, 25, 102–105.
Figure 1. Eigenvalues $d_{p,1},\dots,d_{p,n}$ for $p=3$ and $n=50$.
Figure 2. Eigenvectors $u_{p,4}$, $u_{p,8}$, $u_{p,12}$, and $u_{p,16}$ for $p=3$ and $n=50$.
Figure 3. Eigenvalues $b_{p,1}^{(m)},\dots,b_{p,n}^{(m)}$ for $p=3$, $n=50$, $m=2$, and $\lambda_p=1000$.
Figure 4. Empirical illustration, $\lambda_p=10^{6}$ and $p=3$. The top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}\,(=\hat{x}_p)$ (blue solid line). The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). The bottom panel shows $\hat{x}_p^{(1)}$ (red dashed line) and $\hat{x}_p^{(2)}$ (blue solid line).
Figure 5. Empirical illustration, $\lambda_p=1160$ and $p=3$. The top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}\,(=\hat{x}_p)$ (blue solid line). The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). The bottom panel shows $\hat{x}_p^{(1)}$ (red dashed line) and $\hat{x}_p^{(2)}$ (blue solid line).
Table 1. List of the relationships between the main matrices.

| | WH($p$) Graduation | bWH($p$) Graduation |
| --- | --- | --- |
| Smoother matrix | $A_p$ | $A_p^{(m)}$ |
| Spectral decomposition of smoother matrix | $U_pB_pU_p^\top$ | $U_pB_p^{(m)}U_p^\top$ |
| Penalty matrix | $C_p$ | $C_p^{(m)}$ |
| Spectral decomposition of penalty matrix | $U_pD_pU_p^\top$ | $U_pD_p^{(m)}U_p^\top$ |