Abstract
ℓ1 polynomial trend filtering, a filtering method formulated as an ℓ1-norm penalized least-squares problem, is promising because it enables the estimation of a piecewise polynomial trend in a univariate economic time series without prespecifying the number and location of knots. This paper presents some theoretical results on the filtering, one of which is that a small modification of the filtering provides not only the same trend estimates as the original filtering but also extrapolations of the trend beyond both sample limits.
Keywords:
ℓ1 trend filtering; Hodrick–Prescott filtering; Whittaker–Henderson method of graduation; Lasso regression; basis pursuit denoising; total variation denoising
MSC:
62G05
JEL Classification:
C22
1. Introduction
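The ℓ1-norm penalized least-squares problem, defined as:
$$\min_{x_1,\ldots,x_T}\ \sum_{t=1}^{T}(y_t-x_t)^2+\lambda\sum_{t=3}^{T}|\Delta^2 x_t|, \tag{1}$$
where $y_1,\ldots,y_T$ are observed time-series data, was developed by Kim et al. (2009), who called it ℓ1 trend filtering.1 Here, $\lambda>0$ is a tuning parameter and $\Delta$ denotes the backward difference operator such that $\Delta x_t=x_t-x_{t-1}$. Accordingly, $\Delta^2 x_t=x_t-2x_{t-1}+x_{t-2}$. Recall that $\sum_{t=3}^{T}|\Delta^2 x_t|$ in (1) is the ℓ1-norm of $[\Delta^2 x_3,\ldots,\Delta^2 x_T]'$. Unlike Hodrick and Prescott (1997) (HP) filtering, which is defined as the following squared ℓ2-norm penalized least-squares problem:
$$\min_{x_1,\ldots,x_T}\ \sum_{t=1}^{T}(y_t-x_t)^2+\psi\sum_{t=3}^{T}(\Delta^2 x_t)^2, \tag{2}$$
where $\psi>0$ is a smoothing/tuning parameter, the solution of ℓ1 trend filtering becomes a continuous piecewise linear trend. The relationship between HP filtering and ℓ1 trend filtering corresponds to that between ridge regression of Hoerl and Kennard (1970) and Lasso (least absolute shrinkage and selection operator) regression of Tibshirani (1996)/BPDN (basis pursuit denoising) of Chen et al. (1998). Econometric applications of ℓ1 trend filtering include (), (), (), and ().
It is well known that HP filtering is a form of the Whittaker–Henderson (WH) method of graduation, which is defined as:
$$\min_{x_1,\ldots,x_T}\ \sum_{t=1}^{T}(y_t-x_t)^2+\psi\sum_{t=p+1}^{T}(\Delta^p x_t)^2. \tag{3}$$
For historical surveys of WH filtering, see (), (), and (). Likewise, as shown in (), (), and (), ℓ1 trend filtering may be generalized as:
$$\min_{x_1,\ldots,x_T}\ \sum_{t=1}^{T}(y_t-x_t)^2+\lambda\sum_{t=p+1}^{T}|\Delta^p x_t|. \tag{4}$$
We refer to it as ℓ1 polynomial trend filtering.2 This filtering method is promising because it enables us to estimate a piecewise $(p-1)$-th order polynomial trend of a univariate economic time series without prespecifying the number and location of knots. For more details, see ().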
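As a small illustration of the penalty term, the following Python sketch computes p-th order backward differences and evaluates a fit-plus-ℓ1-penalty objective of the kind described above. It is not the authors' code: the function names `backward_diff` and `l1_objective` and the sample series are assumptions made for this example, and Python is used only so the check runs without extra toolboxes.

```python
from math import comb

def backward_diff(x, p):
    """p-th order backward differences of x.

    Uses the binomial expansion: the p-th difference at position t is
    sum_{k=0..p} (-1)^k * C(p, k) * x[t-k]. Returns len(x) - p values.
    """
    return [sum((-1) ** k * comb(p, k) * x[t - k] for k in range(p + 1))
            for t in range(p, len(x))]

def l1_objective(y, x, lam, p):
    """Squared-error fit term plus lam times the l1 norm of the p-th differences."""
    fit = sum((yt - xt) ** 2 for yt, xt in zip(y, x))
    penalty = sum(abs(d) for d in backward_diff(x, p))
    return fit + lam * penalty

# A continuous piecewise linear series whose slope changes from 1 to 2
# at the fourth point (hypothetical data).
x = [1, 2, 3, 4, 6, 8, 10]
second_diffs = backward_diff(x, 2)
print(second_diffs)
print(l1_objective(x, x, 2.0, 2))
```

With p = 2, the second differences of this series vanish everywhere except at the single slope change, which is why an ℓ1 penalty on them favors piecewise linear trends with few knots.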
Let denote the solution of (3) and define , where h denotes the length of extrapolation by:
Recently, () introduced the following three modifications of the WH method of graduation:3
where for . Denote the solution of (a), (b), and (c) by for and . () showed that, for and , it follows that:
Among the above results, is of practical use because it provides not only a smoothed series identical to that of the WH graduation, but also an extrapolation beyond the sample limit of current data. Also, is of interest because it shows that based on (5) are useless to reduce the end-point problem of the WH graduation.4 In addition, () proved that, for and :
where .
In this paper, we present three modifications of ℓ1 polynomial trend filtering and show that they provide not only the same trend estimates as ℓ1 polynomial trend filtering, but also extrapolations of the trend beyond both sample limits. In addition, we show some other results on the modified filtering. We also provide a MATLAB function for calculating the solution of one of the modified filtering methods.
The paper is organized as follows. In Section 2, we present three modifications of polynomial trend filtering. In Section 3, we state the main results of the paper. In Section 4, we make some remarks on the results provided in Section 3. Section 5 provides some concluding remarks.
Notation. Let and be the identity matrix. For an n-dimensional column vector, , , , and . is the p-th order difference matrix such that . We denote by . is a Vandermonde matrix, defined by
and we denote , which is a matrix, by .
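The following Python sketch builds a p-th order difference matrix and a Vandermonde matrix and checks numerically that the former annihilates the latter, that is, that p-th differences of polynomials of degree at most p − 1 vanish. The Vandermonde convention used here (entry t^j for t = 1, ..., n and j = 0, ..., p − 1) is an assumption, as are the function names; the difference matrix follows the same construction as MATLAB's `diff(speye(n),p)`.

```python
from math import comb

def difference_matrix(n, p):
    """(n-p) x n matrix whose i-th row applies the p-th order backward difference."""
    rows = []
    for i in range(n - p):
        row = [0] * n
        for k in range(p + 1):
            # Coefficient of x[t-k] in the p-th difference at t = i + p (0-based).
            row[i + p - k] = (-1) ** k * comb(p, k)
        rows.append(row)
    return rows

def vandermonde(n, p):
    """n x p matrix with entry t**j for t = 1..n, j = 0..p-1 (assumed convention)."""
    return [[t ** j for j in range(p)] for t in range(1, n + 1)]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

D = difference_matrix(6, 3)
Pi = vandermonde(6, 3)
print(D[0])
print(matmul(D, Pi))
```

Because the product is the zero matrix, any series lying in the column space of the Vandermonde matrix incurs no penalty under the p-th order difference operator.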
2. Three Modifications of ℓ1 Polynomial Trend Filtering
Let denote the solution of (4) and define and , where g and h denote the length of extrapolations:
For example, , defined by (12) for , are explicitly expressed as follows:
For a proof of (15), see Appendix A.
Consider the following three modifications of polynomial trend filtering:
where for and for . Note that (16) is equivalent to polynomial trend filtering if . We denote the solution of (d), (e), and (f) by for and .
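Among (16)–(18), the objective function of (16) may be represented in matrix notation as:
$$\|y-Sz\|_2^2+\lambda\|Dz\|_1,$$
where, in the notation of the MATLAB function below, $S=[0_{T\times g},\,I_T,\,0_{T\times h}]$ is a $T\times(g+T+h)$ matrix, $D$ is the $(g+T+h-p)\times(g+T+h)$ p-th order difference matrix, and $z$ is a $(g+T+h)$-dimensional column vector. Let $z=[x_g',x',x_h']'$, where $x_g$, $x$, and $x_h$ are g-, T-, and h-dimensional column vectors, respectively. The MATLAB function for calculating $x_g$, $x$, and $x_h$, which depends on CVX developed by Grant and Boyd (2013), is as follows: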
function [x_g,x,x_h]=m_l1_pt_filtering(y,lambda,p,g,h)
% y: T-dimensional column vector
% lambda: positive real number
% p, g, h: positive integer
% x_g: g-dimensional column vector
% x: T-dimensional column vector
% x_h: h-dimensional column vector
T=length(y);
S=[sparse(T,g),speye(T),sparse(T,h)];
D=diff(speye(g+T+h),p);
cvx_begin
variables z(g+T+h)
minimize(sum((y-S*z).^2)+lambda*norm(D*z,1))
cvx_end
x_g=z(1:g); x=z(g+1:g+T); x_h=z(g+T+1:g+T+h);
end
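To make the roles of the matrices in the function concrete, the following worked case writes them out for T = 3, g = h = 1, and p = 2. The sizes are chosen by us purely for illustration; S and D are what `[sparse(T,g),speye(T),sparse(T,h)]` and `diff(speye(g+T+h),p)` produce for these values.

```latex
S=\begin{bmatrix}0&1&0&0&0\\0&0&1&0&0\\0&0&0&1&0\end{bmatrix},\qquad
D=\begin{bmatrix}1&-2&1&0&0\\0&1&-2&1&0\\0&0&1&-2&1\end{bmatrix},\qquad
z=(z_1,z_2,z_3,z_4,z_5)'.

\|y-Sz\|_2^2+\lambda\|Dz\|_1
=\sum_{t=1}^{3}(y_t-z_{t+1})^2
+\lambda\bigl(|z_1-2z_2+z_3|+|z_2-2z_3+z_4|+|z_3-2z_4+z_5|\bigr).
```

Here x_g = z_1, x = (z_2, z_3, z_4)', and x_h = z_5. The extrapolated values z_1 and z_5 do not enter the fit term; each appears in exactly one penalty term.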
3. Main Results
Theorem 1.
Proof.
Because the objective function of (4) is coercive and strictly convex with respect to , are the unique global minimizer of the function. It follows that:
where the equality holds only if for .5 In addition, from (11) and (12), for , and for , we have the following inequalities:
Combining (21)–(23) yields
where the equality in (26) holds only if for , which proves that for . Likewise, combining (21)–(25) proves that for and combining (21), (24) and (25) proves that for . ☐
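The idea behind the proof can be sketched numerically. The added points enter only the penalty, so they can be chosen to make every added p-th order difference zero, which leaves the in-sample objective unchanged. The Python sketch below performs that extension for a given trend estimate. It is an illustration under this reading of the argument, not the authors' implementation; the function name `extend_trend` and the sample series are hypothetical, and the input must have at least p points.

```python
from math import comb

def extend_trend(x, p, g, h):
    """Extend x by g points backward and h points forward so that every
    added p-th order backward difference is zero (polynomial extrapolation).

    Returns (x_g, x, x_h) as three lists.
    """
    z = list(x)
    for _ in range(h):
        # Solve sum_{k=0..p} (-1)^k C(p,k) z[t-k] = 0 for the new last point z[t].
        z.append(-sum((-1) ** k * comb(p, k) * z[-k] for k in range(1, p + 1)))
    for _ in range(g):
        # Same recursion run backward: solve for the point preceding the current first one.
        z.insert(0, -sum((-1) ** k * comb(p, k) * z[k - 1] for k in range(1, p + 1)))
    n = len(x)
    return z[:g], z[g:g + n], z[g + n:]

# Piecewise linear trend (p = 2) extended by 2 points backward and 3 forward.
x_g, x, x_h = extend_trend([1, 2, 3, 4, 6, 8, 10], p=2, g=2, h=3)
print(x_g, x_h)
```

For p = 2 the recursion reduces to continuing the first and last line segments, so the in-sample values are returned unchanged and the extension adds no new knots.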
As an illustration of the above theorem, we give a numerical example. Consider the case where , , and . Suppose that we obtained
by applying ℓ1 polynomial trend filtering of order 2 (i.e., ℓ1 trend filtering) to T-dimensional time-series data.6 Because , the line plot of for becomes a continuous piecewise linear line such that is a knot. for are explicitly . Then, from the above theorem, in the case, for and are as follows:
Theorem 2.
If , for and , it follows that
where .
Proof.
Because is a -diagonal Toeplitz matrix, such that:
where for , it may be expressed as
where is a upper triangular matrix, is a matrix, is an matrix, and is an unit lower-triangular matrix. For example, when , , and :
Let , , , and , which is a -dimensional column vector. Then, by definition of and , it follows that:
which leads to:
From (), if , it follows that , where . Recalling that , we obtain if , which implies that may be represented as . Because , must equal . Therefore, if , then . ☐
Theorem 3.
Suppose that , where is a p-dimensional column vector. Then, for , it follows that:
where .
Proof.
If , it follows that: . Accordingly, , which indicates that may be represented as . Because if , must equal . Therefore, we obtain if . ☐
Corollary 1.
Let for .
- (i)
- Denote the -th column of Π and that of , respectively, by and by for . If , then for any .
- (ii)
- Let be a T-dimensional column vector. If , then for any .
4. Some Remarks on the Main Results
First, we make a remark on Theorem 1. Because , from (29), may be expressed with as . Likewise, because , from (30), may be expressed with as . Thus, the modified polynomial trend filtering, (16), may be characterized as a filtering that calculates
from .7 In addition, from (), it follows that as . Therefore, we obtain:
Second, we provide a remark on Theorems 2 and 3. () recently showed that:
where and , which is a -dimensional column vector, is the solution of the following Lasso regression/BPDN:
Because , in (35) represents an orthogonal decomposition of . Here, we show that we may prove Theorems 2 and 3 by using (35) and (36). Premultiplying (35) by yields . We accordingly obtain:
- (i)
- From Osborne et al. (2000, p. 324), if , then . Therefore, we obtain and , which proves Theorem 2.
- (ii)
- If , where , then , which implies that . Again, from (), we obtain if . Therefore, if , it follows that and , which proves Theorem 3.
Third, we give an example of Corollary 1 (i). For the case where and , it follows that for any .
5. Concluding Remarks
The ℓ1 polynomial trend filtering method is a promising piecewise polynomial curve-fitting method because it does not require prespecifying the number and location of knots. We have shown some theoretical results on this method. One of them is that a small modification of the filtering provides the same trend estimates as well as extrapolations of the trend beyond both sample limits. Another is that based on (12) are useless to improve the trend estimates of ℓ1 polynomial trend filtering. We also provided a MATLAB function for calculating the solution of one of the modified filtering methods. The main results of the paper are summarized in Theorems 1–3 and Corollary 1.
Finally, we remark that applying the modified ℓ1 polynomial trend filtering (16)–(18) requires specification of the value of the tuning parameter $\lambda$. For this purpose, the methods proposed in Yamada and Yoon (2016) and Yamada (2018) are applicable.
Author Contributions
H.Y. contributed mainly to the paper. R.D. joined the project and contributed to its completion.
Funding
This work was supported in part by the Japan Society for the Promotion of Science KAKENHI Grant Number 16H03606.
Acknowledgments
We thank two anonymous referees for their valuable suggestions and comments. An earlier draft entitled “A Small But Practically Useful Modification to the Trend Filtering” was presented at the 12th International Symposium on Econometric Theory and Applications & 26th New Zealand Econometric Study Group 2016 in Hamilton, New Zealand, 17–19 February 2016. We also thank the participants for their useful comments. The usual caveat applies.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proof of (15)
Because , from for , we obtain for . Then, because for and , it follows that
Furthermore, because for and , we finally obtain:
References
- Beck, Amir. 2014. Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB. Philadelphia: SIAM.
- Chen, Scott Shaobing, David L. Donoho, and Michael A. Saunders. 1998. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing 20: 33–61.
- Grant, M., and Stephen Boyd. 2013. CVX: Matlab Software for Disciplined Convex Programming, Version 2.0 Beta. Available online: http://cvxr.com/cvx (accessed on 9 July 2018).
- Harchaoui, Zaïd, and Céline Lévy-Leduc. 2010. Multiple change-point estimation with a total variation penalty. Journal of the American Statistical Association 105: 1480–93.
- Hodrick, Robert J., and Edward C. Prescott. 1997. Postwar U.S. business cycles: An empirical investigation. Journal of Money, Credit and Banking 29: 1–16.
- Hoerl, Arthur E., and Robert W. Kennard. 1970. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12: 55–67.
- Kim, Seung-Jean, Kwangmoo Koh, Stephen Boyd, and Dimitry Gorinevsky. 2009. ℓ1 trend filtering. SIAM Review 51: 339–60.
- Koenker, Roger, Pin Ng, and Stephen Portnoy. 1994. Quantile smoothing splines. Biometrika 81: 673–80.
- Miller, Morton D. 1946. Elements of Graduation. Philadelphia: Actuarial Society of America and American Institute of Actuaries.
- Mohr, Matthias F. 2005. A Trend-Cycle(-Season) Filter. European Central Bank Working Paper, No. 499. Frankfurt am Main, Germany: European Central Bank.
- Nocon, Alicja S., and William F. Scott. 2012. An extension of the Whittaker–Henderson method of graduation. Scandinavian Actuarial Journal 2012: 70–79.
- Osborne, Michael R., Brett Presnell, and Berwin A. Turlach. 2000. On the lasso and its dual. Journal of Computational and Graphical Statistics 9: 319–37.
- Phillips, Peter C. B. 2010. Two New Zealand pioneer econometricians. New Zealand Economic Papers 44: 1–26.
- Schuette, Donald R. 1978. A linear programming approach to graduation. Transactions of the Society of Actuaries 30: 407–31.
- Tibshirani, Robert. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B 58: 267–88.
- Tibshirani, Robert, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. 2005. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B 67: 91–108.
- Tibshirani, Ryan J., and Jonathan Taylor. 2011. The solution path of the generalized lasso. Annals of Statistics 39: 1335–71.
- Tibshirani, Ryan J. 2014. Adaptive piecewise polynomial estimation via trend filtering. The Annals of Statistics 42: 285–323.
- Winkelried, Diego. 2016. Piecewise linear trends and cycles in primary commodity prices. Journal of International Money and Finance 64: 196–213.
- Weinert, Howard. 2007. Efficient computation for Whittaker–Henderson smoothing. Computational Statistics and Data Analysis 52: 959–74.
- Yamada, Hiroshi. 2017a. Estimating the trend in US real GDP using the ℓ1 trend filtering. Applied Economics Letters 24: 713–16.
- Yamada, Hiroshi. 2017b. A trend filtering method closely related to ℓ1 trend filtering. Empirical Economics.
- Yamada, Hiroshi. 2017c. A small but practically useful modification to the Hodrick–Prescott filtering: A note. Communications in Statistics–Theory and Methods 46: 8430–34.
- Yamada, Hiroshi. 2018. A new method for specifying the tuning parameter of ℓ1 trend filtering. Studies in Nonlinear Dynamics and Econometrics.
- Yamada, Hiroshi, and Ruixue Du. 2018. A modification of the Whittaker–Henderson method of graduation. Communications in Statistics–Theory and Methods. forthcoming.
- Yamada, Hiroshi, and Lan Jin. 2013. Japan’s output gap estimation and ℓ1 trend filtering. Empirical Economics 45: 81–88.
- Yamada, Hiroshi, and Gawon Yoon. 2014. When Grilli and Yang meet Prebisch and Singer: Piecewise linear trends in primary commodity prices. Journal of International Money and Finance 42: 193–207.
- Yamada, Hiroshi, and Gawon Yoon. 2016. Selecting the tuning parameter of the ℓ1 trend filter. Studies in Nonlinear Dynamics and Econometrics 20: 97–105.
| 1. | ℓ1 trend filtering is supported in several standard software packages such as MATLAB, R, Python, and EViews. |
| 2. | (4) with $p=1$ has been known as total variation denoising in signal processing, which may be regarded as a form of the fused Lasso of Tibshirani et al. (2005). Harchaoui and Lévy-Leduc (2010) proposed using the filtering to detect multiple change points. (4) may be regarded as a form of the generalized Lasso of Tibshirani and Taylor (2011). In addition, we note that there exist some pioneering works on filtering that uses the ℓ1-norm penalty. (, sct. 1.7) mentioned that could be an alternative measure of smoothness to , () introduced a filtering, defined as:
|
| 3. | See also (). |
| 4. | An argument similar to this is given by (, p. 20). |
| 5. | In the objective function of (4), $\sum_{t=1}^{T}(y_t-x_t)^2$ is coercive because it is a quadratic function whose Hessian matrix is positive definite. See, e.g., Beck (2014, Lemma 2.42). |
| 6. | In the case, is expected to become sparse, as in the numerical example, because is included as a penalty. |
| 7. | Let us calculate for the case where , , and . From (28), it follows that
Accordingly, we obtain:
|
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).