Article

Boosted Whittaker–Henderson Graduation

Graduate School of Humanities and Social Sciences, Hiroshima University, 1-2-1 Kagamiyama, Higashi-Hiroshima 739-8525, Japan
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(21), 3377; https://doi.org/10.3390/math12213377
Submission received: 2 September 2024 / Revised: 24 October 2024 / Accepted: 27 October 2024 / Published: 29 October 2024
(This article belongs to the Special Issue Recent Advances in Time Series Analysis)

Abstract: The Whittaker–Henderson (WH) graduation is a smoothing method for equally spaced one-dimensional data such as time series. It includes the Bohlmann filter, the Hodrick–Prescott (HP) filter, and the Whittaker graduation as special cases. Among them, the HP filter is the most prominent trend-cycle decomposition method for macroeconomic time series such as real gross domestic product. Recently, a modification of the HP filter, the boosted HP (bHP) filter, has been developed, and several studies have been conducted. The basic idea of the modification is to achieve more desirable smoothing by extracting long-term fluctuations remaining in the smoothing residuals. Inspired by the modification, this paper develops the boosted version of the WH graduation, which includes the bHP filter as a special case. Then, we establish its properties that are fundamental for applied work. To investigate the properties, we use a spectral decomposition of the penalty matrix of the WH graduation.

1. Introduction

The Whittaker–Henderson (WH) graduation is a smoothing method for equally spaced one-dimensional data such as time series. It includes the Bohlmann (1899) [1] filter, Hodrick–Prescott (HP) filter (Hodrick and Prescott, 1997 [2]), and Whittaker (1923) [3] graduation as its special cases [4,5,6,7]. Among them, the HP filter is the most prominent trend-cycle decomposition method for macroeconomic time series such as real gross domestic product. Recently, Phillips and Shi (2021) [8] developed its modification, the boosted HP (bHP) filter. The basic idea of the modification is to achieve more desirable smoothing by extracting long-term fluctuations remaining in the smoothing residuals [8,9].
Several studies concerning the bHP filter have since emerged: (i) Knight (2021) [10] derived the penalized least squares problems corresponding to the bHP filter, (ii) Tomal (2022) [11] and Trojanek et al. (2023) [12] published empirical studies using the bHP filter, (iii) Hall and Thomson (2024) [13] provided a way to use the bHP filter as a frequency-selective filter, (iv) Mei et al. (2024) [14] and Biswas et al. (2024) [15] extended the bHP filter theoretically, (v) Yamada (2024) [9] provided a perspective of the bHP filter, (vi) Yamada (2024) [16] established the properties of the bHP filter, and (vii) Bao and Yamada (2024) [17] studied the boosted version of the Bohlmann filter.
Inspired by Phillips and Shi (2021) [8], this paper develops the boosted version of the WH graduation, the bWH graduation, which includes the bHP filter as a special case. As in the case of the bHP filter, the new filter can also recover long-term fluctuations remaining in the smoothing residuals. Then, we establish its properties, which are fundamental for applied work. To examine them, we use a spectral decomposition of the penalty matrix of the WH graduation, which is a banded symmetric matrix. This direction of research is suggested by Yamada (2024) [16], and this paper can be considered as an extension of Yamada (2024) [16]. In addition, since the boosted version of the Bohlmann filter is identical to the bWH graduation of order 1, this paper can also be considered as an extension of Bao and Yamada (2024) [17].
The organization of this paper is as follows. In Section 2, we provide some preliminary remarks, including the spectral decomposition of the penalty matrix of the WH graduation. In Section 3, we define the bWH graduation and present its spectral representation. In Section 4, we establish several properties of the bWH graduation. In Section 5, we empirically illustrate the results obtained. Section 6 concludes the paper. In Appendix A, we provide some of the proofs.

2. Preliminaries

In this section, we provide some preliminary remarks.

2.1. Data

Let $y_t$ denote the realization of a variable $y$ at $v_t$ for $t=1,\dots,n$, where $v_1,\dots,v_n$ are equally spaced. In this paper, we assume that $y_t$ cannot be represented as $\sum_{k=0}^{p-1}\phi_k t^k$ for $t=1,\dots,n$, where $\phi_k$ for $k=0,\dots,p-1$ are real numbers. This is because there is no need for smoothing in this case.

2.2. Notations

Let $y=[y_1,\dots,y_n]^\top$, $x=[x_1,\dots,x_n]^\top$, $\iota$ be an $n$-dimensional column vector of ones, $0_{r,s}$ be an $r\times s$ matrix of zeros, and $I_r$ be an $r\times r$ identity matrix. For an $r$-dimensional column vector $\eta=[\eta_1,\dots,\eta_r]^\top$, $\|\eta\|^2=\eta^\top\eta=\sum_{t=1}^{r}\eta_t^2$. For a full-column-rank matrix $\Gamma\in\mathbb{R}^{r\times s}$, the column space of $\Gamma$ and its orthogonal complement are denoted by $S(\Gamma)$ and $S^{\perp}(\Gamma)$, respectively. Let $\Delta=1-L$, where $L$ is the lag operator such that $Lx_t=x_{t-1}$; accordingly, $\Delta$ is the difference operator such that $\Delta x_t=(1-L)x_t=x_t-x_{t-1}$ and $\Delta^2x_t=(1-L)^2x_t=x_t-2x_{t-1}+x_{t-2}$.

2.3. Key Matrices

2.3.1. $\Delta_p$ and $C_p$

For an $n$-dimensional column vector $\zeta=[\zeta_1,\dots,\zeta_n]^\top$, let $\Delta_p$ be an $(n-p)\times n$ matrix such that $\Delta_p\zeta=[\Delta^p\zeta_{p+1},\dots,\Delta^p\zeta_n]^\top$. Based on the binomial theorem, letting $a_h=(-1)^{p-h}\binom{p}{h}$ for $h=0,1,\dots,p$, $(1-L)^p$ can be expanded as follows:
$$(1-L)^p=\{(-1)L+1\}^p=\sum_{h=0}^{p}\binom{p}{h}(-1)^{p-h}L^{p-h}1^{h}=\sum_{h=0}^{p}a_hL^{p-h}. \qquad (1)$$
Then, it follows that
$$\Delta^px_t=(1-L)^px_t=\sum_{h=0}^{p}a_hL^{p-h}x_t=a_0x_{t-p}+a_1x_{t-p+1}+\dots+a_px_t,\quad t=p+1,\dots,n. \qquad (2)$$
Accordingly, $\Delta_p$ is an $(n-p)\times n$ Toeplitz matrix given as follows:
$$\Delta_p=\begin{bmatrix}a_0&\cdots&a_p&&&0\\&\ddots&&\ddots&&\\0&&&a_0&\cdots&a_p\end{bmatrix}. \qquad (3)$$
Here, $a_0=-1$ if $p$ is odd and $a_0=1$ if $p$ is even; therefore, the rank of $\Delta_p$ is $n-p$ and the nullity of $\Delta_p$ is $p$. For example, $\Delta_1\in\mathbb{R}^{(n-1)\times n}$ and $\Delta_2\in\mathbb{R}^{(n-2)\times n}$ are given as follows, respectively:
$$\Delta_1=\begin{bmatrix}-1&1&&&0\\&\ddots&\ddots&&\\0&&&-1&1\end{bmatrix},\qquad \Delta_2=\begin{bmatrix}1&-2&1&&&0\\&\ddots&\ddots&\ddots&&\\0&&&1&-2&1\end{bmatrix}.$$
Let $C_p$ be an $n\times n$ banded symmetric matrix defined as follows:
$$C_p=\Delta_p^\top\Delta_p. \qquad (4)$$
Then, $C_p$ is a positive semidefinite matrix whose rank is $n-p$. Incidentally, $C_1,\dots,C_4$ are explicitly shown in Anderson (1971, pp. 68–69) [18]. In addition, $C_1$ is $A_2$ in Strang (1999) [19], and it is also shown in Nakatsukasa et al. (2013, p. 3233) [20].
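As a concrete check on these definitions, the following NumPy sketch (our own illustration, not part of the paper or its replication code) assembles $\Delta_p$ row by row from the coefficients $a_0,\dots,a_p$ and forms $C_p=\Delta_p^\top\Delta_p$:

```python
import numpy as np
from math import comb

def diff_matrix(n: int, p: int) -> np.ndarray:
    """Return the (n - p) x n p-th-order difference matrix Delta_p."""
    # a_h = (-1)^(p - h) * binom(p, h), h = 0, ..., p
    a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
    D = np.zeros((n - p, n))
    for t in range(n - p):
        D[t, t:t + p + 1] = a  # row t holds a_0, ..., a_p, shifted right by t
    return D

n, p = 50, 3
Dp = diff_matrix(n, p)
Cp = Dp.T @ Dp                    # penalty matrix C_p = Delta_p' Delta_p
print(np.linalg.matrix_rank(Cp))  # n - p = 47, so the nullity of C_p is p = 3
```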

2.3.2. $\Pi_p$

Let
$$\Pi_p=[\tau^{(0)},\dots,\tau^{(p-1)}], \qquad (5)$$
where $\tau^{(k)}=[1^k,2^k,\dots,n^k]^\top$ for $k=0,\dots,p-1$. More specifically, it is a Vandermonde matrix given as follows:
$$\Pi_p=\begin{bmatrix}1^0&1^1&\cdots&1^{p-1}\\2^0&2^1&\cdots&2^{p-1}\\\vdots&\vdots&&\vdots\\n^0&n^1&\cdots&n^{p-1}\end{bmatrix}. \qquad (6)$$
Then, $\Pi_p$ is of full column rank, and the first column of $\Pi_p$, $\tau^{(0)}$, is equal to $\iota$. In addition, by assumption, $y\notin S(\Pi_p)$.
Let
$$\hat{\tau}_p=\Pi_p\hat{\beta}_p, \qquad (7)$$
where $\hat{\beta}_p=(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\arg\min_{\beta}\|y-\Pi_p\beta\|^2$. For example, $\hat{\tau}_1=\bar{y}\tau^{(0)}=\bar{y}\iota$, where $\bar{y}=\frac{1}{n}\sum_{t=1}^{n}y_t$, and $\hat{\tau}_2=\alpha^{\mathrm{ba}}\tau^{(0)}+\beta^{\mathrm{ba}}\tau^{(1)}$, where $\alpha^{\mathrm{ba}}$ and $\beta^{\mathrm{ba}}$ are explicitly shown in Kim et al. (2009, p. 341) [21]. $\hat{\tau}_p$ is the orthogonal projection of $y$ onto $S(\Pi_p)$.
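For illustration, $\hat{\tau}_p$ can be computed by ordinary least squares on the Vandermonde matrix $\Pi_p$; the sketch below (ours, with a simulated random walk as a stand-in for data) does exactly that:

```python
import numpy as np

def poly_trend(y: np.ndarray, p: int) -> np.ndarray:
    """tau_hat_p: OLS projection of y onto S(Pi_p), a degree-(p-1) polynomial trend."""
    n = len(y)
    t = np.arange(1, n + 1)
    Pi = np.vander(t, p, increasing=True).astype(float)  # columns t^0, ..., t^(p-1)
    beta, *_ = np.linalg.lstsq(Pi, y, rcond=None)        # beta_hat_p = argmin ||y - Pi beta||^2
    return Pi @ beta

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(50))  # simulated series standing in for data
tau3 = poly_trend(y, 3)                 # orthogonal projection of y onto S(Pi_3)
```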

2.3.3. $D_p$ and $U_p$

Denote a spectral decomposition of $C_p$, which is defined by (4), as follows:
$$C_p=U_pD_pU_p^\top, \qquad (8)$$
where $D_p=\operatorname{diag}(d_{p,1},\dots,d_{p,n})$ and $U_p=[u_{p,1},\dots,u_{p,n}]$ is an orthogonal matrix. Given that $C_p$ is a positive semidefinite matrix whose rank is $n-p$, we let
$$0=d_{p,1}=\dots=d_{p,p}<d_{p,p+1}\le\dots\le d_{p,n}. \qquad (9)$$
Here, the largest eigenvalue of $C_p$ is bounded above by $2^{2p}$; i.e., it follows that
$$d_{p,n}\le2^{2p}. \qquad (10)$$
A proof of the inequality in (10) is provided in Appendix A.1. Figure 1 depicts $d_{p,1},\dots,d_{p,n}$ for $p=3$ and $n=50$. Note that if $p=3$, it follows that $d_{p,n}\le2^{2p}=64$. We can confirm this from the figure. Since $C_p\iota=\Delta_p^\top\Delta_p\iota=0_{n,1}$, we can let $(d_{p,1},u_{p,1})=\left(0,\frac{1}{\sqrt{n}}\iota\right)$.
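The eigenvalue facts above are easy to check numerically. The following sketch (our own illustration, not from the paper) computes the spectral decomposition (8) with `numpy.linalg.eigh` and verifies (9) and (10) for $p=3$ and $n=50$:

```python
import numpy as np
from math import comb

n, p = 50, 3
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a
Cp = Dp.T @ Dp

d, U = np.linalg.eigh(Cp)        # ascending eigenvalues d, orthogonal U: Eq. (8)
print(np.allclose(d[:p], 0.0))   # True: d_{p,1} = ... = d_{p,p} = 0, Eq. (9)
print(d[-1] <= 2 ** (2 * p))     # True: largest eigenvalue is at most 2^(2p) = 64, Eq. (10)
```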

2.3.4. $D_{p,j}$ and $U_{p,j}$ for $j=1,2$

Let $D_{p,1}=\operatorname{diag}(d_{p,1},\dots,d_{p,p})$ and $D_{p,2}=\operatorname{diag}(d_{p,p+1},\dots,d_{p,n})$. Accordingly, it follows that
$$D_p=\begin{bmatrix}D_{p,1}&0_{p,n-p}\\0_{n-p,p}&D_{p,2}\end{bmatrix}.$$
In addition, from the inequalities in (9), $D_{p,1}=0_{p,p}$, and $D_{p,2}\in\mathbb{R}^{(n-p)\times(n-p)}$ is a positive definite matrix. Let $U_{p,1}=[u_{p,1},\dots,u_{p,p}]$ and $U_{p,2}=[u_{p,p+1},\dots,u_{p,n}]$. Accordingly,
$$U_p=[U_{p,1},U_{p,2}].$$

2.4. Results Regarding the Key Matrices

For $k=0,\dots,p-1$, $\tau^{(k)}$ in (5) belongs to the null space of $\Delta_p$ in (3); i.e., it follows that
$$\Delta_p\tau^{(k)}=0_{n-p,1}. \qquad (11)$$
A proof of (11) is provided in Appendix A.2. Accordingly, we have
$$\Delta_p\Pi_p=[\Delta_p\tau^{(0)},\dots,\Delta_p\tau^{(p-1)}]=0_{n-p,p}, \qquad (12)$$
from which it follows that
$$C_p\Pi_p=\Delta_p^\top\Delta_p\Pi_p=0_{n,p}. \qquad (13)$$
Thus, given that (i) the nullity of $C_p$ is $p$ and (ii) $\Pi_p=[\tau^{(0)},\dots,\tau^{(p-1)}]$ is of full column rank, $\{\tau^{(0)},\dots,\tau^{(p-1)}\}$ is a basis of the null space of $C_p$. On the other hand, from (8) and (9), it follows that
$$C_pU_{p,1}=0_{n,p},$$
which implies that $\{u_{p,1},\dots,u_{p,p}\}$ is an orthonormal basis of the null space of $C_p$. Combining these results, it follows that
$$S(\Pi_p)=S(U_{p,1}). \qquad (14)$$
From (14), we obtain the following two results:
$$U_{p,1}U_{p,1}^\top\;\bigl(=U_{p,1}(U_{p,1}^\top U_{p,1})^{-1}U_{p,1}^\top\bigr)=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top, \qquad (15)$$
$$U_{p,2}^\top\Pi_p=0_{n-p,p}. \qquad (16)$$
Given that $C_p=\Delta_p^\top\Delta_p$ and $C_pu_{p,i}=d_{p,i}u_{p,i}$ for $i=1,\dots,n$, we have
$$\|\Delta_pu_{p,i}\|^2=u_{p,i}^\top C_pu_{p,i}=d_{p,i},\quad i=1,\dots,n. \qquad (17)$$
Then, from (9) and (10), it follows that
$$0=\|\Delta_pu_{p,1}\|^2=\dots=\|\Delta_pu_{p,p}\|^2<\|\Delta_pu_{p,p+1}\|^2\le\dots\le\|\Delta_pu_{p,n}\|^2\le2^{2p}. \qquad (18)$$
These indicate that, with respect to $\Delta_p$, which is defined by (3),
(i)
the degree of smoothness of $u_{p,1},\dots,u_{p,p}$ is the highest, and
(ii)
the degree of smoothness of $u_{p,i}$ is higher than or equal to that of $u_{p,i+1}$ for $i=p,\dots,n-1$.
Figure 2 depicts $u_{p,4}$, $u_{p,8}$, $u_{p,12}$, and $u_{p,16}$ for $p=3$ and $n=50$. This figure is consistent with (ii).

2.5. WH Graduation

The WH($p$) graduation is a smoothing method given as follows:
$$\min_{x_1,\dots,x_n\in\mathbb{R}}\ \sum_{t=1}^{n}(y_t-x_t)^2+\lambda_p\sum_{t=p+1}^{n}(\Delta^px_t)^2, \qquad (19)$$
where $\lambda_p\in(0,\infty)$ is a smoothing parameter and $p$ is a positive integer such that $n>p$. It is a penalized least squares regression. When $p=1$, since $\Delta x_t=(1-L)x_t=x_t-x_{t-1}$, (19) is identical to the Bohlmann filter. When $p=2$, since $\Delta^2x_t=(1-2L+L^2)x_t=x_t-2x_{t-1}+x_{t-2}$, (19) is identical to the HP filter, given as follows:
$$\min_{x_1,\dots,x_n\in\mathbb{R}}\ \sum_{t=1}^{n}(y_t-x_t)^2+\lambda_2\sum_{t=3}^{n}(x_t-2x_{t-1}+x_{t-2})^2. \qquad (20)$$
When $p=3$, (19) is identical to the Whittaker graduation.
The WH($p$) graduation can be represented in matrix form as follows:
$$\min_{x}\ f_p(x)=\|y-x\|^2+\lambda_px^\top C_px. \qquad (21)$$
Then, $C_p$, which is defined by (4), is the penalty matrix of the WH graduation. Denoting the solution to the minimization problem in (21) by $\hat{x}_p$, it follows that
$$\hat{x}_p=A_py, \qquad (22)$$
where
$$A_p=(I_n+\lambda_pC_p)^{-1}. \qquad (23)$$
$A_p$ is referred to as the smoother matrix of the WH($p$) graduation.
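To fix ideas, the following sketch (our own illustration, with a simulated random walk standing in for data) computes the WH($p$) graduation by solving the linear system implied by (22) and (23). A dense solve is used for clarity; the banded structure of $I_n+\lambda_pC_p$ admits much faster algorithms (cf. Weinert 2007 [4]).

```python
import numpy as np
from math import comb

def wh_graduation(y: np.ndarray, p: int, lam: float) -> np.ndarray:
    """WH(p) graduation: solve (I_n + lam * C_p) x = y, cf. (22) and (23)."""
    n = len(y)
    a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
    Dp = np.zeros((n - p, n))
    for t in range(n - p):
        Dp[t, t:t + p + 1] = a
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) + lam * (Dp.T @ Dp), y)

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(100))       # simulated series standing in for data
hp_trend = wh_graduation(y, p=2, lam=1600.0)  # p = 2 with lambda = 1600 is the HP filter
```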

2.6. Spectral Representation of WH Graduation

Given the spectral decomposition of $C_p$ in (8), $A_p$, which is the smoother matrix of the WH($p$) graduation, can be spectrally decomposed as follows:
$$A_p=(I_n+\lambda_pC_p)^{-1}=U_p(I_n+\lambda_pD_p)^{-1}U_p^\top=U_pB_pU_p^\top, \qquad (24)$$
where $B_p=(I_n+\lambda_pD_p)^{-1}$. Let $B_p=\operatorname{diag}(b_{p,1},\dots,b_{p,n})$. Then, given that
$$b_{p,i}=(1+\lambda_pd_{p,i})^{-1},\quad i=1,\dots,n, \qquad (25)$$
from the inequalities in (9), it follows that
$$1=b_{p,1}=\dots=b_{p,p}>b_{p,p+1}\ge\dots\ge b_{p,n}>0. \qquad (26)$$

3. Boosted WH Graduation

3.1. Boosted WH Graduation

We define the boosted WH graduation of order $p$, the bWH($p$) graduation for short, as follows:
$$\hat{x}_p^{(m)}=A_p^{(m)}y, \qquad (27)$$
where
$$A_p^{(m)}=I_n-(I_n-A_p)^m. \qquad (28)$$
We refer to $A_p^{(m)}$ as the smoother matrix of the bWH($p$) graduation.
The bWH($p$) graduation is related to the existing filters as follows. In our notation, the boosted HP filter developed by Phillips and Shi (2021) [8] is represented as follows:
$$\hat{x}_2^{(m)}=A_2^{(m)}y. \qquad (29)$$
Thus, the bWH($p$) graduation is a generalization of the bHP filter. In addition, since
$$A_p^{(1)}=I_n-(I_n-A_p)=A_p, \qquad (30)$$
the bWH($p$) graduation is also a generalization of the WH($p$) graduation. Moreover, the bWH(1) graduation was dealt with in Bao and Yamada (2024) [17]. Finally, Hall and Thomson (2024) [13] refer to $\hat{x}_2^{(2)}$ as twicing.
We illustrate how boosting brings a gain. Given that $A_p^{(2)}=I_n-(I_n-A_p)^2=I_n-I_n+2A_p-A_p^2=A_p+A_p(I_n-A_p)$, it follows that
$$\hat{x}_p^{(2)}=\hat{x}_p+A_p(y-\hat{x}_p). \qquad (31)$$
Here, $A_p(y-\hat{x}_p)$ in (31) can be regarded as a gain from boosting. Given that $A_p$ is the smoother matrix of the WH graduation, $A_p(y-\hat{x}_p)$ is a trend recovered from the WH graduation residuals, $y-\hat{x}_p$.
MATLAB/GNU Octave, R, and Python user-defined functions for calculating x ^ p ( m ) in (27) are available from GitHub. The URL is https://github.com/HiroshiFromHiroshima/Boosted_Whittaker-Henderson_Graduation (accessed on 26 October 2024). We used MATLAB version R2018b, GNU Octave version 7.1.0, R version 4.2.3, and Python version 3.12.5 to verify these user-defined functions.
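As a complement to those functions, here is a minimal NumPy sketch of the boosting recursion implied by (31) (our own illustration, not the code from the repository above). It iterates $\hat{x}^{(j)}=\hat{x}^{(j-1)}+A_p(y-\hat{x}^{(j-1)})$, which yields $A_p^{(m)}y$ after $m$ steps:

```python
import numpy as np
from math import comb

def bwh_graduation(y: np.ndarray, p: int, lam: float, m: int) -> np.ndarray:
    """bWH(p) graduation via the recursion x^(j) = x^(j-1) + A_p (y - x^(j-1))."""
    n = len(y)
    a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
    Dp = np.zeros((n - p, n))
    for t in range(n - p):
        Dp[t, t:t + p + 1] = a
    M = np.eye(n) + lam * (Dp.T @ Dp)   # I_n + lam * C_p, so A_p = M^(-1)
    x = np.linalg.solve(M, y)           # x_hat^(1) = A_p y: the WH(p) graduation
    for _ in range(m - 1):
        x = x + np.linalg.solve(M, y - x)  # add the trend recovered from the residuals
    return x                            # equals A_p^(m) y = {I_n - (I_n - A_p)^m} y

rng = np.random.default_rng(2)
y = np.cumsum(rng.standard_normal(100))
x2 = bwh_graduation(y, p=3, lam=1160.0, m=2)  # bWH(3) graduation with two boosting steps
```

Again, a production implementation would exploit the banded structure of $I_n+\lambda_pC_p$ rather than the dense solve used here.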

3.2. Spectral Representation of bWH Graduation

Given the spectral decomposition of $A_p$ in (24), $A_p^{(m)}$ in (28), which is the smoother matrix of the bWH($p$) graduation, can be spectrally decomposed as follows:
$$A_p^{(m)}=I_n-(I_n-A_p)^m=I_n-(I_n-U_pB_pU_p^\top)^m=U_p\{I_n-(I_n-B_p)^m\}U_p^\top=U_pB_p^{(m)}U_p^\top, \qquad (32)$$
where
$$B_p^{(m)}=I_n-(I_n-B_p)^m. \qquad (33)$$
From (32), we have some useful results.
• Given that $B_p^{(m)}$ is a diagonal matrix, from (32), it follows that
$$(A_p^{(m)})^\top=(U_pB_p^{(m)}U_p^\top)^\top=U_pB_p^{(m)}U_p^\top=A_p^{(m)}. \qquad (34)$$
That is, $A_p^{(m)}$ is a symmetric matrix.
• Let $B_p^{(m)}=\operatorname{diag}(b_{p,1}^{(m)},\dots,b_{p,n}^{(m)})$. Then, it follows that
$$b_{p,i}^{(m)}=1-(1-b_{p,i})^m=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m,\quad i=1,\dots,n. \qquad (35)$$
Thus, it follows from (26) that
$$1=b_{p,1}^{(m)}=\dots=b_{p,p}^{(m)}>b_{p,p+1}^{(m)}\ge\dots\ge b_{p,n}^{(m)}>0. \qquad (36)$$
Figure 3 depicts $b_{p,1}^{(m)},\dots,b_{p,n}^{(m)}$ for $p=3$, $n=50$, $m=2$, and $\lambda_p=1000$; it illustrates the inequalities in (36). Then, given that $A_p^{(m)}$ is a symmetric matrix whose eigenvalues, $b_{p,i}^{(m)}$ for $i=1,\dots,n$, are all positive, $A_p^{(m)}$ is a positive definite matrix.
• Let $B_{p,1}^{(m)}=\operatorname{diag}(b_{p,1}^{(m)},\dots,b_{p,p}^{(m)})$ and $B_{p,2}^{(m)}=\operatorname{diag}(b_{p,p+1}^{(m)},\dots,b_{p,n}^{(m)})$. Then, given that $B_{p,1}^{(m)}=I_p$, the spectral decomposition of $A_p^{(m)}$ in (32) becomes
$$A_p^{(m)}=U_{p,1}U_{p,1}^\top+U_{p,2}B_{p,2}^{(m)}U_{p,2}^\top. \qquad (37)$$
Moreover, from (15), $A_p^{(m)}$ can be represented as follows:
$$A_p^{(m)}=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top+U_{p,2}B_{p,2}^{(m)}U_{p,2}^\top. \qquad (38)$$
• Given (16), postmultiplying (37) by $\Pi_p$ yields the following:
$$A_p^{(m)}\Pi_p=\Pi_p. \qquad (39)$$
Let us summarize the above results.
Lemma 1.
The smoother matrix of the bWH graduation, $A_p^{(m)}$, has the following properties.
(i)
$A_p^{(m)}$ is a symmetric matrix. Moreover, it is a positive definite matrix whose eigenvalues satisfy (36).
(ii)
$A_p^{(m)}$ can be represented as in (38).
(iii)
$A_p^{(m)}$ is a matrix such that $A_p^{(m)}\Pi_p=\Pi_p$.
Postmultiplying (38) by $y$, we immediately obtain the following result.
Lemma 2.
$\hat{x}_p^{(m)}$ in (27) can be spectrally represented as
$$\hat{x}_p^{(m)}=\hat{\tau}_p+b_{p,p+1}^{(m)}z_{p,p+1}u_{p,p+1}+\dots+b_{p,n}^{(m)}z_{p,n}u_{p,n}, \qquad (40)$$
where $z_{p,i}=u_{p,i}^\top y$ for $i=p+1,\dots,n$, and $\hat{\tau}_p\in S^{\perp}(U_{p,2})$ is the orthogonal projection of $y$ onto $S(\Pi_p)$.
Remark 1.
Given the inequalities in (18) and (36), the spectral representation of $\hat{x}_p^{(m)}$ in Lemma 2 shows how smoothing is performed. See also Figure 3, which illustrates (36).
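To make Lemma 2 concrete, the following NumPy sketch (our own illustration, not taken from the paper or its repository) builds $\hat{x}_p^{(m)}$ from the spectral form (40) and checks that it coincides with the direct definition in (27) and (28); the random-walk input is only a stand-in for real data:

```python
import numpy as np
from math import comb

n, p, lam, m = 50, 3, 1000.0, 2
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a
Cp = Dp.T @ Dp

d, U = np.linalg.eigh(Cp)          # C_p = U diag(d) U', Eq. (8)
b = 1.0 / (1.0 + lam * d)          # b_{p,i}, Eq. (25)
bm = 1.0 - (1.0 - b) ** m          # b_{p,i}^(m), Eq. (35); the first p entries equal 1

rng = np.random.default_rng(3)
y = np.cumsum(rng.standard_normal(n))
# U @ (bm * U'y) = tau_hat_p plus the shrunken high-frequency terms in Eq. (40)
x_spec = U @ (bm * (U.T @ y))

A = np.linalg.inv(np.eye(n) + lam * Cp)
Am = np.eye(n) - np.linalg.matrix_power(np.eye(n) - A, m)
print(np.allclose(x_spec, Am @ y))  # True: matches the definition in (27) and (28)
```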

4. Properties of the bWH Graduation

In this section, we establish several properties of the bWH graduation.
For this purpose, we introduce some notations. The $t$-th entry of $\hat{x}_p^{(m)}$ in (27) is denoted by $\hat{x}_{p,t}^{(m)}$ for $t=1,\dots,n$, and the $(i,j)$-th entry of $A_p^{(m)}$ in (28) is denoted by $a_{p,i,j}^{(m)}$ for $i,j=1,\dots,n$. Let
$$C_p^{(m)}=\frac{1}{\lambda_p}\left\{(A_p^{(m)})^{-1}-I_n\right\}. \qquad (41)$$
Here, we note that $C_p^{(m)}$ in (41) can be defined because $A_p^{(m)}$ is a positive definite matrix from Lemma 1(i). In addition, let
$$D_p^{(m)}=\frac{1}{\lambda_p}\left\{(B_p^{(m)})^{-1}-I_n\right\}. \qquad (42)$$
Here, $(B_p^{(m)})^{-1}$ in (42) is a diagonal matrix whose diagonal entries are all positive (see the inequalities in (36)).
Proposition 1.
The bWH graduation has the following properties.
(i) 
(a) The average of the entries of $\hat{x}_p^{(m)}$ in (27) is equal to that of $y$. That is, it follows that
$$\frac{1}{n}\sum_{t=1}^{n}\hat{x}_{p,t}^{(m)}=\frac{1}{n}\sum_{t=1}^{n}y_t. \qquad (43)$$
(b) The bWH graduation residuals, $y-\hat{x}_p^{(m)}$, sum to zero. That is, it follows that
$$\sum_{t=1}^{n}(y_t-\hat{x}_{p,t}^{(m)})=0. \qquad (44)$$
(ii) 
The result in (i)(a) can be generalized as follows:
$$\frac{1}{n}\sum_{t=1}^{n}t^k\hat{x}_{p,t}^{(m)}=\frac{1}{n}\sum_{t=1}^{n}t^ky_t,\quad k=0,\dots,p-1. \qquad (45)$$
(iii) 
Each row of the smoother matrix $A_p^{(m)}$ in (28) sums to unity. That is, it follows that
$$a_{p,i,1}^{(m)}+\dots+a_{p,i,n}^{(m)}=1,\quad i=1,\dots,n. \qquad (46)$$
(iv) 
$\hat{\tau}_p$ in (7) satisfies
$$A_p^{(m)}\hat{\tau}_p=\hat{\tau}_p. \qquad (47)$$
Accordingly, $\hat{x}_p^{(m)}$ in (27) can be represented by $\hat{\tau}_p$ as follows:
$$\hat{x}_p^{(m)}=\hat{\tau}_p+A_p^{(m)}(y-\hat{\tau}_p). \qquad (48)$$
(v) 
When $n$ and $m$ are fixed, as $\lambda_p\to\infty$, $\hat{x}_p^{(m)}\to\hat{\tau}_p$.
(vi) 
When $n$ and $m$ are fixed, as $\lambda_p\to0$, $\hat{x}_p^{(m)}\to y$.
(vii) 
When $n$ and $\lambda_p$ are fixed, as $m\to\infty$, $\hat{x}_p^{(m)}\to y$.
(viii) 
When $n$, $m$, and $\lambda_p$ are fixed, as $h\to\infty$, $(A_p^{(m)})^hy\to\hat{\tau}_p$.
(ix) 
$\hat{x}_p^{(m)}$ in (27) can be considered as the solution of a penalized least squares problem. More specifically, it follows that
$$\hat{x}_p^{(m)}=\arg\min_{x}\ f_p^{(m)}(x)=\|y-x\|^2+\lambda_px^\top C_p^{(m)}x=(I_n+\lambda_pC_p^{(m)})^{-1}y. \qquad (49)$$
(x) 
$\hat{x}_p^{(m)}$ in (27) can be represented by
$$\hat{x}_p^{(m)}=U_p\hat{\theta}_p^{(m)}, \qquad (50)$$
where $U_p$ is the $n\times n$ orthogonal matrix in (8) and
$$\hat{\theta}_p^{(m)}=\arg\min_{\theta}\ g_p^{(m)}(\theta)=\|y-U_p\theta\|^2+\lambda_p\theta^\top D_p^{(m)}\theta=(I_n+\lambda_pD_p^{(m)})^{-1}U_p^\top y. \qquad (51)$$
Here, $D_p^{(m)}$ is a diagonal matrix whose first $p$ diagonal entries are zeros; therefore, the first $p$ entries of $\theta$ are not penalized.
(xi) 
The penalty matrices $C_p^{(m)}$ in (49) and $D_p^{(m)}$ in (51) are similar; therefore, they have the same eigenvalues. Furthermore, $C_p^{(m)}$ is a non-negative definite matrix whose null space is identical to $S(\Pi_p)$.
Proof. 
The proofs of (i) to (xi) are, in turn, as follows.
(i)
$\iota$ is one of the columns of $\Pi_p$ in (6). Then, its orthogonal projection onto $S(\Pi_p)$ is itself; i.e., it follows that $\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\iota=\iota$. In addition, from (16), it follows that $\iota^\top u_{p,i}=0$ for $i=p+1,\dots,n$. Accordingly, premultiplying (40) by $\iota^\top$ yields
$$\iota^\top\hat{x}_p^{(m)}=\iota^\top\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\iota^\top y, \qquad (52)$$
which implies that the average of the entries of $\hat{x}_p^{(m)}$ equals that of $y$. In addition, from (52), it immediately follows that
$$\iota^\top(y-\hat{x}_p^{(m)})=0.$$
Thus, the bWH($p$) graduation residuals sum to zero.
(ii)
$\tau^{(k)}$ is one of the columns of $\Pi_p$ in (6). Then, it follows that $\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\tau^{(k)}=\tau^{(k)}$ for $k=0,\dots,p-1$. In addition, from (16), it follows that $(\tau^{(k)})^\top u_{p,i}=0$ for $k=0,\dots,p-1$ and $i=p+1,\dots,n$. Accordingly, premultiplying (40) by $(\tau^{(k)})^\top$ yields
$$(\tau^{(k)})^\top\hat{x}_p^{(m)}=(\tau^{(k)})^\top\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=(\tau^{(k)})^\top y,$$
from which we have (45).
(iii)
Given that $\iota$ is one of the columns of $\Pi_p$ in (6), it follows from (39) that
$$A_p^{(m)}\iota=\iota.$$
Thus, each row of the smoother matrix $A_p^{(m)}$ sums to unity.
(iv)
Again from (39), regarding $\hat{\tau}_p$ in (7), it follows that
$$A_p^{(m)}\hat{\tau}_p=A_p^{(m)}\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y=\hat{\tau}_p,$$
from which we immediately obtain
$$\hat{x}_p^{(m)}=\hat{\tau}_p+A_p^{(m)}(y-\hat{\tau}_p).$$
(v)
From (35), it follows that
$$b_{p,i}^{(m)}=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m\to0\quad(n,m:\text{fixed},\ \lambda_p\to\infty)$$
for $i=p+1,\dots,n$. Accordingly, it follows from Lemma 2 that
$$\hat{x}_p^{(m)}=\hat{\tau}_p+b_{p,p+1}^{(m)}z_{p,p+1}u_{p,p+1}+\dots+b_{p,n}^{(m)}z_{p,n}u_{p,n}\to\hat{\tau}_p\quad(n,m:\text{fixed},\ \lambda_p\to\infty).$$
(vi)
From (35) and (36), it follows that $b_{p,1}^{(m)}=\dots=b_{p,p}^{(m)}=1$ and
$$b_{p,i}^{(m)}=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m\to1\quad(n,m:\text{fixed},\ \lambda_p\to0)$$
for $i=p+1,\dots,n$. Accordingly, it follows from (32) that
$$A_p^{(m)}=U_pB_p^{(m)}U_p^\top\to U_pU_p^\top=I_n\quad(n,m:\text{fixed},\ \lambda_p\to0).$$
Therefore, we have
$$\hat{x}_p^{(m)}=A_p^{(m)}y\to y\quad(n,m:\text{fixed},\ \lambda_p\to0).$$
(vii)
From (35) and (36), it follows that $b_{p,1}^{(m)}=\dots=b_{p,p}^{(m)}=1$ and
$$b_{p,i}^{(m)}=1-\left(\frac{\lambda_pd_{p,i}}{1+\lambda_pd_{p,i}}\right)^m\to1\quad(n,\lambda_p:\text{fixed},\ m\to\infty)$$
for $i=p+1,\dots,n$. Accordingly, it follows from (32) that
$$A_p^{(m)}=U_pB_p^{(m)}U_p^\top\to U_pU_p^\top=I_n\quad(n,\lambda_p:\text{fixed},\ m\to\infty).$$
Therefore, we have
$$\hat{x}_p^{(m)}=A_p^{(m)}y\to y\quad(n,\lambda_p:\text{fixed},\ m\to\infty).$$
(viii)
From (32) and (36), it follows that
$$(A_p^{(m)})^h=U_p(B_p^{(m)})^hU_p^\top\to U_{p,1}U_{p,1}^\top\quad(n,m,\lambda_p:\text{fixed},\ h\to\infty).$$
Accordingly, it follows from (15) that
$$(A_p^{(m)})^hy=U_p(B_p^{(m)})^hU_p^\top y\to U_{p,1}U_{p,1}^\top y=\hat{\tau}_p\quad(n,m,\lambda_p:\text{fixed},\ h\to\infty).$$
(ix)
From the definition of $C_p^{(m)}$ given by (41), it immediately follows that $A_p^{(m)}=(I_n+\lambda_pC_p^{(m)})^{-1}$. Accordingly, we have
$$\hat{x}_p^{(m)}=A_p^{(m)}y=(I_n+\lambda_pC_p^{(m)})^{-1}y=\arg\min_{x}\ f_p^{(m)}(x)=\|y-x\|^2+\lambda_px^\top C_p^{(m)}x.$$
(x)
Based on the spectral decomposition of $A_p^{(m)}$ in (32), i.e., $A_p^{(m)}=U_pB_p^{(m)}U_p^\top$, $C_p^{(m)}$ in (41) can be decomposed as follows:
$$C_p^{(m)}=\frac{1}{\lambda_p}\left\{U_p(B_p^{(m)})^{-1}U_p^\top-I_n\right\}=U_pD_p^{(m)}U_p^\top, \qquad (53)$$
by which we have
$$U_p\hat{\theta}_p^{(m)}=U_p(I_n+\lambda_pD_p^{(m)})^{-1}U_p^\top y=(I_n+\lambda_pU_pD_p^{(m)}U_p^\top)^{-1}y=(I_n+\lambda_pC_p^{(m)})^{-1}y=A_p^{(m)}y=\hat{x}_p^{(m)}.$$
Let
$$d_{p,i}^{(m)}=\frac{1}{\lambda_p}\left(\frac{1}{b_{p,i}^{(m)}}-1\right),\quad i=1,\dots,n. \qquad (54)$$
Then, $D_p^{(m)}=\operatorname{diag}(d_{p,1}^{(m)},\dots,d_{p,n}^{(m)})$, and from the inequalities given in (36), it follows that
$$0=d_{p,1}^{(m)}=\dots=d_{p,p}^{(m)}<d_{p,p+1}^{(m)}\le\dots\le d_{p,n}^{(m)}. \qquad (55)$$
(xi)
Given that $U_p$ in (8) is an orthogonal matrix, from (53), $C_p^{(m)}$ is similar to $D_p^{(m)}$, and they have the same eigenvalues. Then, based on (55), $C_p^{(m)}$ is a non-negative definite matrix such that its nullity is $p$. Based on Lemma 1(i) and (39), $A_p^{(m)}$ is a positive definite matrix such that $A_p^{(m)}\Pi_p=\Pi_p$. Then, it follows that
$$(A_p^{(m)})^{-1}\Pi_p=\Pi_p,$$
from which we have
$$C_p^{(m)}\Pi_p=\frac{1}{\lambda_p}\left\{(A_p^{(m)})^{-1}-I_n\right\}\Pi_p=\frac{1}{\lambda_p}(\Pi_p-\Pi_p)=0_{n,p}.$$
Thus, $\tau^{(0)},\dots,\tau^{(p-1)}$ belong to the null space of $C_p^{(m)}$. □
Remark 2.
Regarding Proposition 1, we make several remarks.
1. 
Some of the results in Proposition 1 are generalizations of those in Yamada (2020, Proposition 2.2) [22], which documents several properties of the HP filter. For example, Proposition 1(i) is a generalization of Yamada (2020, Proposition 2.2(iii)(a)) [22].
2. 
Since $A_p^{(m)}$ is symmetric, from $A_p^{(m)}\iota=\iota$, it follows that $\iota^\top A_p^{(m)}=\iota^\top$, from which we have
$$\iota^\top\hat{x}_p^{(m)}=\iota^\top A_p^{(m)}y=\iota^\top y.$$
This is another proof of Proposition 1(i)(a).
3. 
$\sum_{t=1}^{n}t^k\hat{x}_{p,t}^{(m)}=\sum_{t=1}^{n}t^ky_t$ for $k=0,\dots,p-1$ in Proposition 1(ii) is a generalization of $\sum_{t=1}^{n}t^k\hat{x}_{2,t}^{(1)}=\sum_{t=1}^{n}t^ky_t$ for $k=0,1$ in Weinert (2007, p. 960) [4]. Here, $\hat{x}_{2,t}^{(1)}$ equals the $t$-th entry of $\hat{x}_2=(I_n+\lambda_2\Delta_2^\top\Delta_2)^{-1}y$.
4. 
$\hat{x}_p^{(m)}=\hat{\tau}_p+A_p^{(m)}(y-\hat{\tau}_p)$ in Proposition 1(iv) is a generalization of the results in Kim et al. (2009, p. 342) [21] and Yamada (2018, Equation (3)) [23]. Given that $A_p^{(m)}$ is a low-pass filter, it indicates that $\hat{x}_p^{(m)}$ is the sum of the polynomial time trend estimated by OLS, $\hat{\tau}_p$, and the low-frequency components in the polynomial time trend residuals, $y-\hat{\tau}_p$.
5. 
Given that $\hat{x}_p^{(m)}=A_p^{(m)}y$, it follows, for example, that $(A_p^{(m)})^2y=A_p^{(m)}\hat{x}_p^{(m)}$. Thus, $(A_p^{(m)})^hy$ in Proposition 1(viii) represents the result of $h$ repeated applications of the bWH graduation. In addition, from Proposition 1(viii), it follows that
$$\Delta_p(A_p^{(m)})^hy\to0_{n-p,1}\quad(n,m,\lambda_p:\text{fixed},\ h\to\infty),$$
which is a generalization of the result in Weinert (2007, p. 961) [4].
6. 
Proposition 1(ix) and (x) are generalizations of the results in Knight (2021) [10]. Given that $A_p^{(1)}=A_p=(I_n+\lambda_pC_p)^{-1}$, by its definition, it follows that
$$C_p^{(1)}=\frac{1}{\lambda_p}\left\{(A_p^{(1)})^{-1}-I_n\right\}=\frac{1}{\lambda_p}(I_n+\lambda_pC_p-I_n)=C_p.$$
Therefore, if $m=1$, then (49) becomes
$$\hat{x}_p^{(1)}=\arg\min_{x}\ f_p^{(1)}(x)=\|y-x\|^2+\lambda_px^\top C_p^{(1)}x=(I_n+\lambda_pC_p^{(1)})^{-1}y,$$
which is the WH graduation. From (25), (35), and (54), it also follows that
$$d_{p,i}^{(1)}=\frac{1}{\lambda_p}\left(\frac{1}{b_{p,i}^{(1)}}-1\right)=\frac{1}{\lambda_p}\left(\frac{1}{b_{p,i}}-1\right)=d_{p,i},\quad i=1,\dots,n.$$
Accordingly, we obtain $D_p^{(1)}=D_p$, which is consistent with (58).
7. 
The penalized least squares problem in (51) is a generalized ridge regression representation of the bWH graduation.
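Several items of Proposition 1 can also be verified numerically. The sketch below (our own illustration, with simulated data standing in for an actual series) checks (i)(a), (ii), and (iii) for the bWH($p$) graduation:

```python
import numpy as np
from math import comb

n, p, lam, m = 21, 3, 1160.0, 2
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a
A = np.linalg.inv(np.eye(n) + lam * (Dp.T @ Dp))
Am = np.eye(n) - np.linalg.matrix_power(np.eye(n) - A, m)

rng = np.random.default_rng(4)
y = np.cumsum(rng.standard_normal(n))      # simulated stand-in for the data
x = Am @ y
t = np.arange(1, n + 1)

print(np.allclose(Am.sum(axis=1), 1.0))    # Proposition 1(iii): rows sum to unity
print(np.isclose(x.mean(), y.mean()))      # Proposition 1(i)(a): averages coincide
for k in range(p):                         # Proposition 1(ii): t^k moments preserved
    print(np.isclose(np.sum(t ** k * x), np.sum(t ** k * y)))
```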

5. An Empirical Illustration

In this section, we provide an empirical illustration of the bWH($p$) graduation. We use the same data as in Nocon and Scott (2012, Example 1) [6], which are annual data from 1989 to 2009; accordingly, $n=21$. The data are taken from Table 1 of their paper [6].
Figure 4 shows the results for the case where $\lambda_p=10^{6}$ and $p=3$. The top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}\,(=\hat{x}_p)$ (blue solid line). Since the value of $\lambda_p$ is huge, from Proposition 1(v), it follows that
$$\hat{x}_p^{(1)}\approx\hat{\tau}_p=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top y.$$
Alternatively, we can see this as the result of the following fact:
$$A_p^{(1)}=A_p=(I_n+\lambda_p\Delta_p^\top\Delta_p)^{-1}=I_n-\Delta_p^\top\left(\frac{1}{\lambda_p}I_{n-p}+\Delta_p\Delta_p^\top\right)^{-1}\Delta_p\approx I_n-\Delta_p^\top(\Delta_p\Delta_p^\top)^{-1}\Delta_p=\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top,$$
if $\lambda_p$ is huge. Here, the third equality follows from applying the Sherman–Morrison–Woodbury formula to $(I_n+\lambda_p\Delta_p^\top\Delta_p)^{-1}$. The last equality follows from the fact that $[\Pi_p,\Delta_p^\top]$ is nonsingular and $(\Delta_p^\top)^\top\Pi_p=\Delta_p\Pi_p=0_{n-p,p}$. The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). From the panel, it is observable that
$$A_p^{(1)}(y-\hat{x}_p^{(1)})\approx0_{n,1},$$
which shows that there is no gain from boosting if $\lambda_p$ is huge. The result is reasonable from
$$\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\left\{I_n-\Pi_p(\Pi_p^\top\Pi_p)^{-1}\Pi_p^\top\right\}=0_{n,n}.$$
The bottom panel shows $\hat{x}_p^{(1)}$ (red dashed line) and $\hat{x}_p^{(2)}$ (blue solid line). In this case, since
$$\hat{x}_p^{(2)}=\hat{x}_p^{(1)}+A_p^{(1)}(y-\hat{x}_p^{(1)})\approx\hat{x}_p^{(1)}, \qquad (64)$$
we cannot observe the red dashed line in the panel. Note that the first equality in (64) follows from (31).
Figure 5 shows the results for the case where $\lambda_p=1160$ and $p=3$; $\lambda_p=1160$ is the value used in Nocon and Scott (2012) [6], who selected it by generalized cross-validation. Again, the top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}$ (blue solid line). The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). From the middle panel, it is observable that
$$A_p^{(1)}(y-\hat{x}_p^{(1)})\not\approx0_{n,1}.$$
Recall that $A_p^{(1)}(y-\hat{x}_p^{(1)})$ is the gain from boosting. Given that $\hat{x}_p^{(2)}=\hat{x}_p^{(1)}+A_p^{(1)}(y-\hat{x}_p^{(1)})$, due to this gain from boosting, in the bottom panel, $\hat{x}_p^{(2)}$ (blue solid line) is different from $\hat{x}_p^{(1)}$ (red dashed line). In addition, we report the following two results:
$$\frac{1}{n}\iota^\top y=\frac{1}{n}\iota^\top\hat{\tau}_p=\frac{1}{n}\iota^\top\hat{x}_p^{(1)}=\frac{1}{n}\iota^\top\hat{x}_p^{(2)}=30.695,\qquad\frac{1}{n}\iota^\top A_p^{(1)}(y-\hat{x}_p^{(1)})=0.$$
These are consistent with Proposition 1(i).
We also tried the case where $\lambda_p=10^{-6}$ and $p=3$. In that case, we obtained the following:
$$\hat{x}_p^{(m)}\approx y,\quad m=1,2,$$
$$A_p^{(1)}(y-\hat{x}_p^{(1)})\approx0_{n,1},$$
which are consistent with Proposition 1(vi).

6. Concluding Remarks

In this paper, we developed the boosted version of the WH graduation and established its properties. The theoretical results we obtained are summarized in Proposition 1 and Lemmas 1 and 2 and empirically illustrated in Section 5. See also Table 1, which lists the relationships between the main matrices, such as $A_p^{(m)}$ and $B_p^{(m)}$.
Finally, we give a remark. To use the bWH($p$) graduation in (27), the values of three parameters, $p$, $\lambda_p$, and $m$, must be specified. Among them, the specification of $\lambda_p$ is particularly important because, as empirically shown in Section 5, $\hat{x}_p^{(m)}$ depends strongly on the value of $\lambda_p$. One idea is to determine the value of $\lambda_p$ using the (approximate) gain function corresponding to the smoother matrix $A_p\,(=A_p^{(1)})$. However, it seems better to determine it by also considering the value of $m$, which could be achieved by extending the approach adopted by Hall and Thomson (2024) [13]. We are investigating this issue and will report our findings in the future.

Author Contributions

Writing—original draft, Z.J.; Writing—review & editing, H.Y.; Supervision, H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JST SPRING (JPMJSP2132) and JSPS KAKENHI (23K01377).

Data Availability Statement

The data used in this article are taken from Nocon and Scott (2012, Table 1) [6].

Acknowledgments

We thank the three anonymous referees for their valuable comments. We also thank Keith Knight for permission to reference his unpublished manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs

In this section, we provide two proofs.

Appendix A.1. Proof of (10)

Given $|a_k|=\left|(-1)^{p-k}\binom{p}{k}\right|=\binom{p}{k}$ for $k=0,\dots,p$ and $2^p=(1+1)^p=\sum_{k=0}^{p}\binom{p}{k}$, it follows that $\sum_{k=0}^{p}|a_k|=2^p$. For an $n$-dimensional vector $\eta=[\eta_1,\dots,\eta_n]^\top$, it follows that
$$\|\Delta_p\eta\|=\|a_0\eta_{1:n-p}+\dots+a_p\eta_{p+1:n}\|\le|a_0|\|\eta_{1:n-p}\|+\dots+|a_p|\|\eta_{p+1:n}\|\le\sum_{k=0}^{p}|a_k|\,\|\eta\|=2^p\|\eta\|,$$
where $\eta_{a:b}=[\eta_a,\dots,\eta_b]^\top$. Here, the first inequality follows from the triangle inequality, the second inequality follows from $\|\eta_{a:b}\|\le\|\eta_{1:n}\|=\|\eta\|$, and the final equality follows from $\sum_{k=0}^{p}|a_k|=2^p$. Then, we have
$$d_{p,n}=u_{p,n}^\top C_pu_{p,n}=\|\Delta_pu_{p,n}\|^2\le2^{2p}\|u_{p,n}\|^2=2^{2p},$$
which completes the proof.

Appendix A.2. Proof of (11)

Let $a$ and $b$ be integers such that $b>a\ge0$. Since $\Delta^at^a=a!$, where $\Delta^0=1$, it follows that
$$\Delta^bt^a=\Delta^{b-a}(\Delta^at^a)=\Delta^{b-a}a!=0. \qquad (A1)$$
Given that $\Delta_p\tau^{(k)}=[\Delta^p(p+1)^k,\dots,\Delta^pn^k]^\top\in\mathbb{R}^{n-p}$, the $t$-th entry of $\Delta_p\tau^{(k)}$ is
$$\Delta^p(t+p)^k=\Delta^p\sum_{h=0}^{k}\binom{k}{h}p^ht^{k-h}=\binom{k}{0}p^0\Delta^pt^k+\dots+\binom{k}{k}p^k\Delta^pt^0=0 \qquad (A2)$$
for $t=1,\dots,n-p$. Here, the third equality in (A2) follows from (A1). Recall that $k$ is a non-negative integer of at most $p-1$. Therefore, $\Delta_p\tau^{(k)}=0_{n-p,1}$ for $k=0,\dots,p-1$.
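This claim is also easy to confirm numerically. The short sketch below (our own illustration) verifies that $\Delta_p\tau^{(k)}=0_{n-p,1}$ for $k=0,\dots,p-1$:

```python
import numpy as np
from math import comb

n, p = 30, 4
a = np.array([(-1) ** (p - h) * comb(p, h) for h in range(p + 1)], dtype=float)
Dp = np.zeros((n - p, n))
for t in range(n - p):
    Dp[t, t:t + p + 1] = a

ts = np.arange(1, n + 1, dtype=float)
for k in range(p):                         # k = 0, ..., p - 1
    print(np.allclose(Dp @ ts ** k, 0.0))  # True: Delta_p tau^(k) = 0_{n-p,1}, Eq. (11)
```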

References

  1. Bohlmann, G. Ein Ausgleichungsproblem. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse 1899, 1899, 260–271.
  2. Hodrick, R.J.; Prescott, E.C. Postwar U.S. business cycles: An empirical investigation. J. Money Credit Bank. 1997, 29, 1–16.
  3. Whittaker, E.T. On a new method of graduation. Proc. Edinb. Math. Soc. 1923, 41, 63–75.
  4. Weinert, H.L. Efficient computation for Whittaker–Henderson smoothing. Comput. Stat. Data Anal. 2007, 52, 959–974.
  5. Phillips, P.C.B. Two New Zealand pioneer econometricians. N. Z. Econ. Pap. 2010, 44, 1–26.
  6. Nocon, A.S.; Scott, W.F. An extension of the Whittaker–Henderson method of graduation. Scand. Actuar. J. 2012, 1, 70–79.
  7. Biessy, G. Revisiting Whittaker–Henderson smoothing. arXiv 2023, arXiv:2306.06932.
  8. Phillips, P.C.B.; Shi, Z. Boosting: Why you can use the HP filter. Int. Econ. Rev. 2021, 62, 521–570.
  9. Yamada, H. Linear trend, HP trend, and bHP trend. SSRN 2024.
  10. Knight, K. The Boosted Hodrick–Prescott Filter, Penalized Least Squares, and Bernstein Polynomials. Unpublished Manuscript. 2021. Available online: https://utstat.utoronto.ca/keith/papers/hp-pls.pdf (accessed on 26 October 2024).
  11. Tomal, M. Testing for overall and cluster convergence of housing rents using robust methodology: Evidence from Polish provincial capitals. Empir. Econ. 2022, 62, 2023–2055.
  12. Trojanek, R.; Gluszak, M.; Kufel, P.; Tanas, J.; Trojanek, M. Pre and post-financial crisis convergence of metropolitan housing markets in Poland. J. Hous. Built Environ. 2023, 38, 515–540.
  13. Hall, V.B.; Thomson, P. Selecting a boosted HP filter for growth cycle analysis based on maximising sharpness. J. Bus. Cycle Res. 2024.
  14. Mei, Z.; Phillips, P.C.B.; Shi, Z. The boosted Hodrick–Prescott filter is more general than you might think. J. Appl. Econom. 2024.
  15. Biswas, E.; Sabzikar, F.; Phillips, P.C.B. Boosting the HP filter for trending time series with long-range dependence. Econom. Rev. 2024.
  16. Yamada, H. Boosted HP filter: Several properties derived from its spectral representation. In Computational Science and Its Applications—ICCSA 2024; Gervasi, O., Murgante, B., Garau, C., Taniar, D., Rocha, A.M.A.C., Faginas Lago, M.N., Eds.; Springer: Cham, Switzerland, 2024.
  17. Bao, R.; Yamada, H. Boosted Whittaker–Henderson Graduation of Order 1: A Graph Spectral Filter Using Discrete Cosine Transform. Contemp. Math. 2024. Forthcoming. Available online: https://www.researchgate.net/publication/384363420_Boosted_Whittaker-Henderson_Graduation_of_Order_1_A_Graph_Spectral_Filter_Using_Discrete_Cosine_Transform (accessed on 26 October 2024).
  18. Anderson, T.W. The Statistical Analysis of Time Series; John Wiley and Sons: New York, NY, USA, 1971.
  19. Strang, G. The discrete cosine transform. SIAM Rev. 1999, 41, 135–147.
  20. Nakatsukasa, Y.; Saito, N.; Woei, E. Mysteries around the graph Laplacian eigenvalue 4. Linear Algebra Its Appl. 2013, 438, 3231–3246.
  21. Kim, S.; Koh, K.; Boyd, S.; Gorinevsky, D. ℓ1 trend filtering. SIAM Rev. 2009, 51, 339–360.
  22. Yamada, H. A smoothing method that looks like the Hodrick–Prescott filter. Econom. Theory 2020, 36, 961–981.
  23. Yamada, H. Why does the trend extracted by the Hodrick–Prescott filtering seem to be more plausible than the linear trend? Appl. Econ. Lett. 2018, 25, 102–105.
Figure 1. Eigenvalues $d_{p,1},\dots,d_{p,n}$ for $p=3$ and $n=50$.
Figure 2. Eigenvectors $u_{p,4}$, $u_{p,8}$, $u_{p,12}$, and $u_{p,16}$ for $p=3$ and $n=50$.
Figure 3. Eigenvalues $b_{p,1}^{(m)},\dots,b_{p,n}^{(m)}$ for $p=3$, $n=50$, $m=2$, and $\lambda_p=1000$.
Figure 4. Empirical illustration, $\lambda_p=10^{6}$ and $p=3$. The top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}\,(=\hat{x}_p)$ (blue solid line). The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). The bottom panel shows $\hat{x}_p^{(1)}$ (red dashed line) and $\hat{x}_p^{(2)}$ (blue solid line).
Figure 5. Empirical illustration, $\lambda_p=1160$ and $p=3$. The top panel shows $y$ (red circle) and $\hat{x}_p^{(1)}\,(=\hat{x}_p)$ (blue solid line). The middle panel shows $y-\hat{x}_p^{(1)}$ (red circle) and $A_p^{(1)}(y-\hat{x}_p^{(1)})$ (blue solid line). The bottom panel shows $\hat{x}_p^{(1)}$ (red dashed line) and $\hat{x}_p^{(2)}$ (blue solid line).
Table 1. List of the relationships between the main matrices.

| | WH($p$) Graduation | bWH($p$) Graduation |
| --- | --- | --- |
| Smoother matrix | $A_p$ | $A_p^{(m)}$ |
| Spectral decomposition of smoother matrix | $U_pB_pU_p^\top$ | $U_pB_p^{(m)}U_p^\top$ |
| Penalty matrix | $C_p$ | $C_p^{(m)}$ |
| Spectral decomposition of penalty matrix | $U_pD_pU_p^\top$ | $U_pD_p^{(m)}U_p^\top$ |