Next Article in Journal
Comparison of Knee Muscular Strength Balance among Pre- and Post-Puberty Adolescent Swimmers: A Cross-Sectional Pilot Study
Next Article in Special Issue
Machine Learning Techniques to Predict Timeliness of Care among Lung Cancer Patients
Previous Article in Journal
Additional Active Movements Are Not Required for Strength Gains in the Untrained during Short-Term Whole-Body Electromyostimulation Training
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Managing Mortality and Aging Risks with a Time-Varying Lee–Carter Model

1
College of Economics and Management, Hengyang Normal University, Hengyang 421002, China
2
Department of Actuarial Studies and Business Analytics, Macquarie University, Sydney, NSW 2109, Australia
3
Business School, Hunan University, Changsha 410012, China
*
Author to whom correspondence should be addressed.
Healthcare 2023, 11(5), 743; https://doi.org/10.3390/healthcare11050743
Submission received: 8 January 2023 / Revised: 17 February 2023 / Accepted: 27 February 2023 / Published: 3 March 2023

Abstract

:
Influential existing research has suggested that rather than being static, mortality declines decelerate at young ages and accelerate at old ages. Without accounting for this feature, the forecast mortality rates of the popular Lee–Carter (LC) model are less reliable in the long run. To provide more accurate mortality forecasting, we introduce a time-varying coefficients extension of the LC model by adopting the effective kernel methods. With two frequently used kernel functions, Epanechnikov (LC-E) and Gaussian (LC-G), we demonstrate that the proposed extension is easy to implement, incorporates the rotating patterns of mortality decline and is straightforwardly extensible to multi-population cases. Using a large sample of 15 countries over 1950–2019, we show that LC-E and LC-G, as well as their multi-population counterparts, can consistently improve the forecasting accuracy of the competing LC and Li–Lee models in both single- and multi-population scenarios.

1. Introduction

The ongoing improvement of human life expectancy around the world has far-reaching influences on many aspects of our society. For instance, Maestas et al. [1] highlight the importance of considering demographic changes, including mortality projections, in analyzing labor supply. Mortality projections are used to predict future changes in the size and age structure of the labor force, which can help inform policy decisions related to workforce development and training, immigration, and other factors that affect the labor market. For the healthcare system, Fenton et al. [2] indicate the importance of considering changes in mortality projections in analyzing healthcare utilization and expenditure. Specifically, prediction of future demand for health and medical care services is dependent on mortality projections. Such predictions may inform policy decisions related to the allocation of resources for healthcare and the design of related systems. The importance of accurate mortality projections is also witnessed in retirement and pensions and the insurance industry (see, for example, ref. [3], among others).
To better model the underlying patterns of mortality improvement is, therefore, essential to enhance accuracy of mortality projection. Among the existing studies, Lee and Carter [4] is the mostly widely used factor-based mortality model, known as the LC model. Despite its popularity, the LC model assumes constant age-specific mortality decline speeds over time. Therefore, the LC model produces increasingly large proportional differences in the projected death rates, even at adjacent ages, in the long run. Specifically, the projected mortality rates of LC are implausibly low for infants and younger ages relative to older ages. This is in contrast to the findings in influential studies, including Li et al. [5], that found that mortality declines at younger ages have been slowing while declines at older ages have been accelerating.
A “rotation” approach of age-specific mortality decline speed was later proposed by Li et al. [5] to enhance the LC model. However, this rotation is of an ad hoc nature rather than being data-driven [6]. To address this, we investigate a time-varying coefficients extension of the LC model. Simply speaking, we employ a local constant kernel smoothing method [7] to model and forecast dynamic age-specific mortality declines. In this paper, we consider two commonly adopted kernel functions: Epanechnikov and Gaussian, and the corresponding extensions of the LC model are denoted by LC-E and LC-G, respectively. Technical details of the modelling approach are provided, and our empirical results consider data of 15 low-mortality populations sourced from the famous Human Mortality Database [8], including annual mortality rates of age 0–100 spanning 1950–2019.
The contributions of this paper are threefold. First, the proposed time-varying coefficients extension of the LC model effectively complements the rotation adopted in Li et al. [5]. The proposed flexible framework ensures that the speed of rotation is data-driven, rather than ad hoc. The existence of closed-form solutions further makes the proposed extension computationally efficient. Secondly, using a comprehensive population dataset, we systematically demonstrate the outstanding performance of the time-varying models. The overall superiority of the LC-E and LC-G models is robust when the age and temporal setting are changed. The analysis of life expectancies up to 2100 further shows that the LC-G model can well address the issue of underestimated mortality improvements at old ages, existing for the LC model [5]. Thus, the LC-E and LC-G models may provide potentially more accurate forecasts than the LC model in the research regarding long-term mortality projection. Finally, we describe the multi-population extensions of the employed models and illustrate their forecasting performances. Consistent with the single-population case, the newly proposed multi-population extensions are still overall the best performing model among all investigated specifications.
The rest of this paper is structured as below. Section 2 describes the data and models. We present empirical results with discussions in Section 3. Section 5 concludes the paper.

2. Data and Methods

2.1. Database

To study the mortality rates, the data of this paper are sourced from the famous Human Mortality Database [8], or HMD. HMD provides detailed mortality and population data for over 40 developed countries. We employ the annual age-specific central mortality rates, which are derived from official statistics from each country. HMD is maintained by the Max Planck Institute for Demographic Research in Rostock, Germany, and the University of California, Berkeley, United States of America (USA). As stated in its official website, the purpose of HMD is to make mortality data available for scientific research and to promote research in social sciences, such as actuarial studies, demographics, labor economics, and public health and care.
In order to study the consistent mortality declines, we explore 15 low-mortality countries as considered in Li and Lee [9]: Australia (instead of West Germany), Austria, Canada, Denmark, the United Kingdom, Finland, France, Italy, Japan, the Netherlands, Norway, Spain, Sweden, Switzerland, and the USA. For the low-mortality nature, mortality rates are more likely to consistently decline across all ages. This enables one to observe the potential rotating patterns on age-specific mortality rates, as argued by Li et al. [5]. Following Booth et al. [10], we choose an opportune range of data starting from 1950 to 2019 in order to have a reliable and complete dataset. The crude uni-sex (total) mortality data for ages 0–100 are studied in this paper.
We present logged mortality rates averaged across the 15 populations in Figure 1. Consistent mortality declines are observed at all ages over time. Appropriate modelling of those patterns is critical to credible mortality projections, which is the key to the accurate actuarial, demographic, and healthcare practices (see, for example, refs. [11,12,13,14], among others).

2.2. The Lee–Carter and Related Models

Suppose we have mortality data of N ages, each age with T years of observations. The Lee–Carter (LC) model (1992) ([4]) summarizes the systematic mortality trends of the N ages by a common factor. Formally, the log central mortality rate at age x in year t, ln m x , t , follows the specification given by:
ln m x , t = a x + b x k t + ε x , t ,
where a x is the mortality level, i.e., the average mortality rate over time at age x, k t is the period effect, i.e., the systematic mortality trend common to all ages, b x is the age effect at x, i.e., the sensitivity of ln m x , t to k t , and ε x , t is the normal residual term with mean 0 and variance σ ε x 2 . As noted by [4], Equation (1) is not identifiable without normalization constraints. For example, one could multiply b x with a constant c and divide k t by the same constant and reach the same fitting results. In [4], the following normalization constraints are imposed:
t k t = 0 , and x b x = 1 .
Given the constraint of k t , a x is set to the mean of ln m x , t over the sample considered. The LC model is then estimated by singular value decomposition (SVD) instead of the usual ordinary least square approach in the original paper. (A maximum likelihood estimation method may also be employed to calibrate the parameters [15]. Alternatively, a Bayesian approach is proposed in Li et al. [16]. Compared to those estimation methods, SVD is much more computationally efficient and stable, which is critical to the proposed time-varying framework of this paper.)
While a x and b x are assumed to be constant, the period effect is often modeled by a time-series process. In particular, ref. [4] assumes a random walk with drift specification, which is adopted by many later studies:
k t = k t 1 + d + e t ,
where the drift term d measures the average annual change in k t , and e t i . i . d . N ( 0 , σ e 2 ) . Based on the time-series specification in Equation (2), future mortality rates can be projected by extrapolating the period effect k t . Specifically, the expected h-step-ahead mean forecasts of the period effect and the log central death rate are given by:
k ^ T + h = k T + h d , ln m ^ x , T + h = a ^ x + b ^ x k ^ T + h ,
where T is the last year of the sample.
Remark 1.
Since the forecast k ^ T + h is just a linear trend, Booth et al. [17] consider two adjustments in estimating the optimal k ^ t to fit this trend. First, the modeling of k t is modified by considering the age distribution of deaths via a Poisson model. Second, a method is implemented to identify the most appropriate fitting period to address potential departure of the linearity of k t . The resulting extension is known as the Booth–Maindonald–Smith (BMS) model. In addition, the influential work of Brouhns et al. [18] points out the inappropriateness of using the simple SVD with Gaussian errors when dealing with mortality data. The first adjustment of the BMS model essentially estimates the drift term d k in Equation (2) in a generalized linear model with Poisson family, which effectively resolves this issue. Consequently, we adjust d k as in the BMS approach for all models considered in this paper.

2.3. Issues with the Lee–Carter Model and Related Literature

For the long-spanning nature of mortality data, the static parametric structure of the LC model may not be compatible with reliable long-term demographic research. To see this, the specification of the LC model decomposes mortality improvements into the product of latent non-stationary mortality trend factors ( k t ), assumed as a random walk with drift, and the corresponding age specific loadings of these trends ( b x ), assumed constant over time. Considering the data displayed in Figure 1, we plot the fitted b x using a rolling window of a 30-year width in Figure 2. (There are no available data for the first 29 time points (i.e., 1950–1978) using the rolling window approach.) Clearly, the fitted b x becomes roughly “flatter” over time, with decreasing (increasing) weights for the younger (older) ages. Ignoring such a dynamic pattern therefore reduces the credibility of the forecast mortality rates and life expectancies of LC. Even when the BMS model is used, the static parametric nature is not necessarily sufficient to address this issue.
The insufficiency of the static b x over time is recognized in the influential work by Li et al. [5]. To resolve it, a “rotation” approach is proposed, which specifies that the out-of-sample b ^ x will be dynamic and gradually converge to an ultimate structure. However, such an ultimate structure is determined by expert (ad hoc) judgments rather than being data-driven [6].

2.4. A Time-Varying Coefficients Extension of the Lee–Carter Model

To systematically address the ad hoc issue of Li et al. [5], we investigate a time-varying coefficients extension of the LC model. Specifically, we employ a local constant kernel smoothing method [7], also known as the Nadaraya–Watson estimator, to accommodate the dynamics in the age-specific mortality declines. A novel application of such an estimator in mortality forecasting can be found in Chang and Shi [19]. This approach allows b x to be time-dependent over both the in-sample (denoted by b x , t ) and out-of-sample (denoted by b x , T + h ) periods. In this paper, we consider two commonly adopted kernel functions: Epanechnikov and Gaussian, and the corresponding extensions of the LC model are denoted by LC-E and LC-G, respectively. The technical details are described in this section.

2.4.1. A Time-Varying Framework Using Kernels

To incorporate the time-varying coefficients, we extend the LC model as follows:
ln m x , t = a x + b x , t k t + ε x , t ,
such that the age-specific declines are allowed to be dynamic. This suggests that the relative age-specific mortality decline speed is now time-dependent. In terms of the forecasting, we have that
ln m ^ x , T + h = a ^ x + b ^ x , T + h k ^ T + h ,
where b ^ x , T + h is the forecast b x , t at step h, and k ^ T + h is identically defined as in Equation (3). Consequently, to obtain forecasts of this time-varying framework, one needs to additionally model and forecast the time-dependent b x , t . (Note that we do not require the assumption b x , t > 0 . Negative values mean that mortality rates at certain ages may (temporarily) increase (relative to other ages), rather than decline. As explained above, since this paper works with low-mortality populations, consistent declines are expected and more likely to occur.
For the estimation of the time-varying b x , t , many effective methods have been discussed in the statistics literature, including polynomial splines [20,21], smoothing splines [22,23], and kernel-local polynomial smoothing [24]. Among them, one of the most widely used methods is the kernel-local polynomial smoothing, which includes the local constant and the local linear estimation. In this paper, we illustrate our method using the local constant kernel smoothing method [7], also known as the Nadaraya–Watson estimator.
Remark 2.
Regarding the selection of kernels, an influential criterion is to compare the resulting asymptotic mean integrated squared error. Among all popular kernels, Epanechnikov minimizes this error and is therefore recognized as the most efficient kernel [25]. The Epanechnikov kernel is also investigated in various mortality research including Chang and Shi [19]. For comparison purposes, the Gaussian kernel is also investigated, which is widely used in modern machine learning methodology. The Gaussian kernel is especially popular for techniques involving the principal component analysis (related to the SVD in LC), for its attractive technical features. See Theodoridis [26] for a review on this topic.
To implement the estimation, at each time t, the usual SVD will be applied once to K b ( τ s τ ) ln m x , s , where s = 1 , 2 , . . . , T , τ = t / T and τ s = s / T . The resulting estimated “weighted” age-specific declines will be the corresponding b ^ x , t . In addition, K b ( τ s τ ) is a known kernel function with b known as the bandwidth, and K b ( τ s τ ) is essentially a weight for the sth observation. For instance, with the widely employed Gaussian kernel, we have that
K b ( τ s τ ) = 1 2 π b e [ ( τ s τ ) / b ] 2 2
which is essentially the Gaussian density evaluated at ( τ s τ ) / b and further scaled by b. The bandwidth b then determines the weights distributed to each observation when estimating b x , t . A larger b means weights are more evenly distributed. Other than the Gaussian kernel, we also examine the popular Epanechnikov kernel in this paper, which is specified as
K b ( τ s τ ) = 0.75 b { 1 [ ( τ s τ ) / b ] 2 } + .
This kernel is also investigated in various mortality research, including Chang and Shi [19]. In this paper, we denote the time-varying coefficients LC model estimated by the Gaussian and Epanechnikov kernels as LC-G and LC-E, respectively.
We now employ the LC-E and LC-G models to revisit the mortality rates presented in Figure 1, and the resulting fitted b ^ x , t are plotted in Figure 3a,b, respectively. respectively. (The bandwidths are chosen in a similar manner as that described in Section 2.4.3. In short, we employ the same hold-out sample strategy for data spanning 1950–2019. Forecasts of the age-specific mortality declines ( b ^ x , t ) are produced using the exponential smoothing state space model discussed in Feng and Shi [27]. The bandwidths are then chosen as those minimize the forecasting errors of logged mortality rates in the hold-out sample). Both the fitted curves and time-varying patterns of the LC-E and LC-G models are similar to each other. Compared to the preliminary results displayed in Figure 1b with a running window, less randomness is observed. This may be explained by the fact that the entire (weighted) sample is used to produce each individual b ^ x , t . Roughly speaking, dynamics produced by kernels and the running window are consistent. However, despite that we do observe the rotation of b ^ x , t over most age groups as argued by Li et al. [5], the mortality declines at very old ages seem to decelerate rather than accelerate. Thus, an adjustment needs to be adopted before reliable mortality modelling and forecasting can be implemented with the dynamic b ^ x , t . Otherwise, the forecast mortality improvements at old ages may gradually decline, leading to a larger gap between mortality rates of relatively younger and older ages. This is contrary to the rotation proposed by Li et al. [5] and the concept of (age) coherence studied in the demographic and actuarial literature (see, for example, [6,9,28], among others).

2.4.2. Coherent Modelling of the Dynamic Age-Specific Mortality Declines

To adjust b ^ x , t , we require that forecast mortality rates across ages will not diverge in the long run. This idea, known as age coherence, is much in-line with the rotation argued in Li et al. [5] and is critical to producing biologically reasonable forecasts of mortality rates in the long-term analysis [6]. The age coherence is formally defined as follows.
Definition 1.
Age coherence means that for the h-step-ahead forecasts, | ln m ^ i , T + h ln m ^ j , T + h | = O p ( 1 ) , i , j ( 1 , . . . , N ) , where if X n = O p ( a n ) then for any ε > 0 , there exists a finite M > 0 and a finite N > 0 such that P ( | X n / a n | M ) ε n > N . That is, when h , | ln m ^ i , T + h ln m ^ j , T + h | will not diverge to infinity.
To accommodate the age coherence in mortality forecasting, it is worth noting that ln m x , t is usually assumed to be a non-stationary I(1) process (see, for example, [6,19,28,29,30], among others). Thus, let y x , t = ln m x , t ln m x , t 1 , which is known as the mortality improvement of age x at time t, its long-run mean must be time-independent due to the stationarity. It can then be shown that
ln m ^ x , T + h = E ( ln m x , T + h | Ω T ) = ln m x , T + k = 1 h E ( y x , T + k | Ω T ) = O p ( 1 ) + h μ ^ x
when h goes large, where Ω T is the information set at time T. Thus, we have that
| ln m ^ i , T + h ln m ^ j , T + h | = O p ( 1 ) + | h ( μ ^ i μ ^ j ) | ,
where μ ^ x is the fitted long-run mean of the stationary series y x , t . It can then be seen that the only scenario to ensure the age coherence is that μ ^ i μ ^ j for all i , j = 1 , . . . , N . From Equation (5), when h , it is easy to see that y ^ x , T + h = b ^ x , T + h d , where d is identically defined as in Equation (2). Further, since y x , t is assumed stationary, the condition of μ ^ i μ ^ j is equivalent to b ^ i , T + h b ^ j , T + h for all i , j = 1 , . . . , N . With the constraint of the LC model, such that x b ^ x , T + h = 1 , we require that b ^ x , T + h converges to 1 / N for all x = 1 , . . . , N when h .
We now discuss an appropriate model to study b x , t such that the age coherence of the forecast rates is ensured. Note that the condition all b ^ x , T + h converging to the same constant 1 / N implies that b x , t 1 / N is a stationary sequence with 0 mean for all x. Hence, we denote this sequence by b x , t * , an appropriate parametric structure analogous to that discussed in Li and Lu [6] and described below.
b 1 , t * = α 1 b 1 , t 1 * + ε 1 , t b b 2 , t * = α 2 b 2 , t 1 * + β 2 b 1 , t 1 * + ε 2 , t b b i , t * = α i b i , t 1 * + β i b i 1 , t 1 * + γ i b i 2 , t 1 * + ε i , t b
where i = 3 , . . . , N , and t = 1 , . . . , T . Essentially, Equation (6) is a vector autoregressive (VAR) model with constrained coefficient matrix. According to Li and Lu [6], the constraint is to emphasize the impact of the temporal effect (measured by α i ) and those of the cohorts effect (measured by β i for the same cohort and γ i for the nearest younger cohort). The forecasts of b ^ x , T + h * are produced in the same way as for a usual VAR model, based on which we can derive b ^ x , T + h = b ^ x , T + h * + 1 / N . It can be seen that as long as the VAR of Equation (6) is stationary (a sufficient condition would be that summations of absolute values of α i , β i , and γ i are all smaller than or equal to 1 for all i = 1 , . . . , N ), b ^ x , T + h * will converge to 0 for all x in the long run, and thus the forecast mortality rates will be age coherent. In addition, at each forecasting step, we can rescale b ^ x , T + h to sum up to 1 to satisfy the constraint x b ^ x , T + h = 1 .
In terms of the estimation, Equation (6) can be solved using the penalized least squares (PLS), and its loss function is defined below:
L F = i = 3 N t = 2 T [ b i , t * α i b i , t 1 * β i b i 1 , t 1 * γ i b i 2 , t 1 * ] 2 + t = 2 T [ b 2 , t * α 2 b 2 , t 1 * β 2 b 1 , t 1 * ] + t = 2 T ( b 1 , t * α 1 b 1 , t 1 * ) 2 + λ α i = 2 N ( α i α i 1 ) 2 + λ β i = 3 N ( β i β i 1 ) 2 + λ γ i = 4 N ( γ i γ i 1 ) 2 ,
where λ α , λ β , and λ γ are pre-selected smoothing parameters of α i , β i , and γ i , respectively.
Remark 3.
With greater values of those λ’s, the fitted coefficients will be smoother across ages. When all λ’s are equal to 0, this is a special case of PLS with no penalty and reduces to the usual ordinary leas square case. With more smoothly changed coefficients, the forecast b ^ i , T + h * is expected to vary smoother from i 1 to i. This aims to reduce the roughness among fitted and forecast b i , t * , which is inevitable since mortality data are often of a small sample size [6]. The quadratic nature of L F , α ^ i , and β ^ i will have closed-form solutions, which are computationally efficient. Nevertheless, it is important to note that both LC-E and LC-G models will introduce additional 3 N 3 parameters in the estimation, compared to the original LC model.

2.4.3. Tuning Parameters Selection and the Estimation Procedure

The only outstanding issue now is how to select the tuning parameters, i.e., smoothing penalties of the time-varying coefficients LC models. In a usual case, a cross-validation technique can be employed to prevent the model from overfitting. It is worth noting that a usual cross-validation technique for time-series data, such as the expanding-window approach explained in Hyndman and Athanasopoulos [31], is not quite applicable in our case. The reason is that this expanding-window approach normally considers a very short forecasting step. In contrast, age-coherent adjustment of the LC model emphasizes the long-term forecast. Thus, we employ a hold-out-sample approach to select the tuning parameters (b and three λ ’s), which minimizes
RMSFE = 1 N ( T / 3 ) i = 1 N h = 1 T / 3 ln m ^ i , 2 T / 3 + h ln m i , 2 T / 3 + h 2 ,
where RMSFE is the root of mean squared forecasting errors (since logged mortality rates are modeled, the RMSFE also considers the logged rather than original values. If an exponential transformation is applied, the predicted original mortality rates are unavoidably exposed to the famous Jensen’s inequality issue. Thus, the accuracy of forecasting is inherently lower than the logged rate is considered), and the evaluation period is given by the last third ( [ 2 T / 3 , T ] ) of the data in our study. (Note that the choice of the length of the test sample (one-third) is common among existing studies. Adopting other popular alternatives, such as the last fourth, fifth, and tenth sample will lead to robust results.)
In summary, forecasting with the LC-E and LC-G models can be implemented with the following steps:
  • Fit the original LC model with the full sample to obtain a ^ x , k ^ T , and d;
  • For the training sample spanning [ 1 , 2 T / 3 1 ] , use the Epanechnikov (LC-E) or Gaussian (LC-G) kernel to obtain b x , t , based on which the constrained VAR model of Equation (6) is fitted for b x , t * ;
  • Select the optimal tuning parameters b, λ α , λ β , and λ γ via a grid search, such that the RMSFE of the test sample over [ 2 T / 3 , T ] is minimized;
  • With the selected tuning parameters, fit the LC-E or LC-G model for the full sample;
  • Forecast b ^ x , T + h with the fitted model, and then produce ln m ^ x , T + h with Equation (5), where a ^ x and k ^ T + h = k ^ T + h d sourced from Step 1.
To study the uncertainties of the forecast, the same procedure of the LC model can be applied for the LC-E and LC-G models proposed in this section. For instance, assuming that e t in Equation (2) follows a Gaussian distribution, the prediction interval (PI) of the estimated life expectancies will be produced based on simulations. More specifically, we use method (I) described in Li [32] to implement the simulation, and a 95% PI is then comprises the 2.5th and 97.5th percentiles of the simulated results.
Remark 4.
The proposed specification and estimation of the LC-E and LC-G models have four advantages. First, from an empirical point of view, this is in line with the rotation investigated in Li et al. [5] for finite steps of forecasting. More specifically, the convergence of b ^ x , T + h to an identical constant enables mortality decline to decelerate at younger ages and accelerate at older ages, which effectively adjusts the inconsistence observed in Figure 1b and Figure 3. Second, unlike that determined by expert opinions as in Li et al. [5], the speed of rotation/convergence is data-driven for both LC-E and LC-G and is therefore not ad hoc. Third, the computational cost is low for the existence of closed-form solutions compared to other potentially more flexible but (much) more computationally intensive parametric frameworks. Fourth, different from the ultimate pattern of b ^ x , T + h investigated in Li et al. [5], the rotations achieved by the LC-E and LC-G models realize the age coherence in the long run. This ensures the non-divergence of mortality rates across all ages, which is biologically reasonable and thus theoretically appealing.

2.5. Related Mortality Models

Other than the three factors ( a x , b x , and k t ) used in the LC model, existing research has proposed alternative specifications to model mortality rates. For instance, Renshaw and Haberman [15], Cairns et al. [33], Renshaw and Haberman [34], and Plat [35] have considered additional factors, such as the cohort impacts and/or other temporal factors. Brouhns et al. [18], Currie et al. [36], Wang and Lu [37], and Debon et al. [38] calibrate mortality rates with different statistical methodologies. Machine learning techniques are considered in Hainaut [39], Levantesi and Pizzorusso [40], Nigri et al. [41], and Doukhan et al. [42]. Other studies, such as [43,44,45], introduce different time-series models to forecast the temporal patterns. Comprehensively reviewing those models is out of the scope of this paper, and please see specifications therein for details.

3. Results and Discussion

We model the mortality data and present the results in this section. As demonstrated in Figure 1, on average, the mortality rates consistently decline across all ages. This is also true for individual populations, as can be observed in Figure 4, where four countries are inspected for illustrative purposes. Despite the data volatility in some cases (e.g., Sweden, due to its small population size), substantial mortality improvements take place for both young and old ages. For instance, the “accidental hump” of ages 20–25 which was outstanding in the 1980s, has been flattening over time in all countries.
Following the steps described in Section 2, we fit all models over the dataset 1950–2000 as the training set, and forecast mortality rates from 2000 to 2019 as the test set. The selected bandwidths are displayed in Table 1 for the LC-E and LC-G models. To ensure the validity of the VAR(1) model to study b x , t , the assumptions of stationarity and adequacy of VAR(1) need to be tested for the in-sample b ^ x , t . The I ( 0 ) feature is tested via the usual augmented Dickey–Fuller (ADF) unit root test, and the adequacy of VAR(1) is tested by the Granger causality (GC) test. Recall that for each population, we have a total of 101 age groups, and the b x , t is estimated in 2 sets via the LC-E and LC-G. All tests are performed at the 5% level, and the number out of 101 violated assumptions are presented in Table 1. It is clear that in the vast majority of cases, both the stationarity and model adequacy assumptions hold. The actual percentages of negative outcomes are close to expected (i.e., 5%). Those results validate the application of the VAR(1) framework for both the LC-E and LC-G models.

3.1. Out-of-Sample Forecasting Analysis

We firstly consider the forecasting performance of the proposed LC-E and LL-G models. In particular, we compare their out-of-sample forecasting accuracy with those of the LC and BMS models. Following Li and Lu [6], Feng et al. [28], and Chang and Shi [19], recall that the training sample is set to 1950–2000, and the test sample is 2001–2019. To measure the accuracy, the RMSFE is computed as follows, where the mortality data of each country are fitted into each of the four models individually:
RMSFE h = 1 101 × h i = 1 101 s = 1 h ln m ^ i , 2000 + s ln m i , 2000 + s 2 .
Thus, RMSFE 19 is the overall forecasting accuracy measure covering the entire test period, whereas RMSFE h with 1 h 19 is the accumulated forecasting accuracy measure up to the hth step.
Our baseline results, including single-year age mortality rates over 1950–2019, are presented in Table 2. Overall, except for the UK data, the RMSFE 19 of LC-E and LC-G are uniformly smaller than those of the LC model for all the examined populations. This suggests the improved forecasting performance considering the time-varying b ^ x , T + h . The BMS model, with adjustments on both shortened training sample period and estimation of k t , improves the forecasting accuracy of LC in seven countries. Despite this, except for the UK, Japan, the Netherlands and the USA, both the LC-E and LC-G models consistently outperforms the BMS counterparty in 13 cases. This suggests that appropriately allowing for dynamic age-specific mortality declines is more effective than the two adjustments in BMS to improve the forecasting accuracy. Although the LC-E model outperforms LC-G in six cases, the differences between the two approaches are marginal. This indicates that the forecasting performance of the time-varying coefficients framework described in Section 2.4 is robust against specific kernel functions. Finally, the RMSFE 19 averaged over the fifteen countries of the LC, BMS, LC-E, and LC-G models are 0.2576, 0.2610, 0.2202, and 0.2197, respectively. Thus, on average, BMS cannot improve the forecasting performance of LC, whereas the LC-E and LC-G models can improve it by 14.7%. From a statistical point of view, if an unknown population is randomly chosen, we expected that LC-E or LC-G is more likely to improve the out-of-sample forecasting, with the forecasting error expected to be reduced by roughly 15%.
To gain more insights of the four models’ performances at different horizons, we plot the RMSFE h at each of the sixteen forecasting steps. For demonstration purposes, those of Australia, Canada, Spain, and Sweden are presented in Figure 5. Results of other countries are available upon request. In the vast majority of cases, we see that the RMSFEs of the LC-E and LC-G models are smaller than those of the other two models. This indicates that LC-G and LC-H are able to generate relatively more accurate forecasting results at different horizons.
We now evaluate the robustness of the forecasting results by performing the out-of-sample forecasting analysis under two major variant settings: (1) we follow Li et al. [5] and model the logged mortality rates of the five-year ages instead of the single-year groups; and (2) we consider a shorter training sample over 1970–2000 to avoid fluctuations in 1950–1969 for certain populations (e.g., Spain). A summary of the RMSFE 19 across the 15 countries is reported in Table 3. Overall, we conclude that both LC-E and LC-G are able to produce satisfying forecasting performances that are robust across different settings. In addition, using different kernel functions in the time-varying framework only leads to marginal differences in the out-of-sample forecasts.

3.2. Long-Term Analyses: 2020–2100

We now conduct long-term analyses beyond the full sample period using the proposed time-varying LC models. Since the produced forecasts are robust against the selected kernels, it is sufficient to focus on one model only. More specifically, we follow the steps described in Section 2.4.3 to fit the LC-G model using the full sample (1950–2019) and produce b ^ x , T + h over 2020–2100. We report the forecasts for Australia, Canada, Spain, and Sweden in Figure 6. Overall, all b ^ x , T + h are converging to the long-run mean 1 / 101 . In other words, all the curves are “rotating” to a flat horizontal line. The dynamics are consistent with those of Li et al. [5] (see, for example, Figure 5), with younger ages faster and older ages slower. Unlike in Li et al. [5], the speed of rotation is data-driven for the LC-G model and does not suffer the potential ad hoc issue when determined by expert judgment [6]. It is worth mentioning that the impact of the so-called “accident hump” is more obvious for the Australian population, since b ^ 25 , T + h is lower than b ^ x , T + h of the neighboring ages. Such a difference, however, is much smaller when it is closer to the end of the forecasting period (2100).
Using those obtained b ^ x , T + h , we can derive forecast mortality rates and thus produce the life expectancy using the LC-G model. Since life expectancy at birth ( e 0 ) considers mortality rates at all ages, we contrast the forecast e ^ 0 of LC-G and LC over 2020–2100. Both the point and interval forecasts of Australia, Canada, Spain, and Sweden are plotted in Figure 7. It can be seen that e ^ 0 generated by LC-G are uniformly larger than those of the LC model. This is consistent with the fact that the LC-G model implements a rotation [5] of b ^ x , and with the argued age-coherent property of the forecast mortality rates in the long run [6]. In addition, despite the PIs of LC-G being slightly wider than those of LC, their widths are at similar levels. This may be explained by the fact that k ^ T + h are identical for both models. As of 2100, the point forecast e ^ 0 of LC-G has exceeded the upper bound of the 95% PI of LC for all countries except Australia, implying a significant difference. More specifically, the point forecast e ^ 0 of LC-G (LC) for Australia, Canada, Spain, and Sweden grow from 83.4, 82.9, 82.9, and 82.7, respectively, in 2020 to 94.4, 94.3, 94.9, and 94.6 (92.3, 91.8, 91.9, and 90.1), respectively, in 2100.
To analyze the reasons contributing to those deviations in e ^ 0 of LC and LC-G in 2100, ranging from 2.1 (Australia) to 4.5 (Sweden) years, we plot the forecast logged mortality rates in Figure 8. Comparing with the true rates in 2019, it is observed that little improvements are gained at old ages for the forecasts produced by the LC model. In contrast, due to the rotation and age-coherent property, forecasts produced by LC-G demonstrate much more obvious mortality improvements at old ages. In addition, since b ^ x , T + h sum up to one at each step h, less mortality declines are distributed to the young ages. This contributes to the rotation-like difference when contrasting the ln m ^ x , T + h produced by LC-G to those by LC. Overall, the difference in e ^ 0 produced by LC and LC-G, as presented in Figure 7, may be explained by the potentially large impacts of mortality rates at old ages in the calculation of life expectancy. Nevertheless, due to the influence of the smoothing penalties in (7), the long-run forecast logged mortality rates of LC-G are much smoother than those of the LC model.

4. Extensions to the Multi-Population Modelling: An Illustrative Example

A multi-population extension of the LC model is studied in the seminal work of Li and Lee [9]. An additional common factor which controls the relationships between populations is included in the proposed Li–Lee (LL) model. Specifically, the logged mortality rate is modeled as:
ln m x , t , j = a x , j + B x K t + b x , j k t , j + ε x , t , j ,
where a x , j represents the average of the age-specific mortality level for the jth population, B x and K t represent the age effect and period effect of the common factor, k t , j is the time component of the ith population with age response b x , j , and ε x , t , j is the population-specific error term. The common factor B x K t describes the mortality trend of all modeled populations. To obtain the estimates, in this paper, we follow Li and Lee [9] to apply a usual LC model to the logged mortality rates averaged over all the fifteen populations. Consistent with LC, a ^ x , j is the average of ln m x , t , j over t. The population-specific factor b x , i k t , i can be estimated by applying SVD to the residual matrix ( ln m x , t , j a ^ x , j B ^ x K ^ t ) .
Similar to the case under LC, the common mortality index K t is assumed non-stationary and can be modeled as a random walk with drift process. In contrast, the group-specific time component k t , i is fitted by a stationary autoregressive process (set to AR(1) in this paper, as considered in Li and Lee [9]), such that long-term forecasts are coherent (i.e., non-divergent) across the included populations. Given the data observed in the last year T, the h-step-ahead forecast of the logged mortality rate is given as follows:
ln m ^ x , T + h , j = a ^ x , j + B ^ x K ^ T + h + b ^ x , j k ^ T + h , j .
Despite its desirable coherence among populations, ln m ^ x , T + h , j of the LL model is still not age coherent. To see this, without identical B ^ x across ages, the non-stationary temporal term K ^ T + h in Equation (9) can lead to divergent long-run forecasts even for neighboring ages of the same population.
The multi-population extensions of the BMS, LC-E, and LC-G are straightforward to derive. Essentially, instead of fitting an LC model to forecast the common trend B ^ x K ^ T + h , BMS, LC-E, and LC-G models can be fitted for the averaged rates. Consequently, forecasts of their multi-population extensions can be produced below.
ln m ^ x , T + h , j B = a ^ x , j + B ^ x K ^ T + h B + b ^ x , j k ^ T + h , j ln m ^ x , T + h , j E = a ^ x , j + B ^ x , T + h E K ^ T + h + b ^ x , j k ^ T + h , j ln m ^ x , T + h , j G = a ^ x , j + B ^ x , T + h G K ^ T + h + b ^ x , j k ^ T + h , j
where the superscripts B, E, and G indicate the forecast of the multi-population extension of the BMS (LL-BMS), LC-E (LL-E), and LC-G (LL-G), respectively. It can be seen that in all cases, the population-specific average rate a ^ x , j and stationary trend b ^ x , j k ^ T + h , j are unchanged from that of the LL model. For the LL-BMS, the same two adjustments, selecting an optimal sample period and using a Poisson model to obtain K ^ T + h B , as considered in the single-population BMS model are adopted for the averaged population. For the LL-E and LL-G models, B ^ x , T + h E and B ^ x , T + h G will be forecast using the constrained VAR model as for the single-population case described in Equation (6), based on the averaged population. The tuning parameters will be selected in the same way as described in Section 2.4.3, which will minimize the RMSFE of the averaged population for the hold-out sample.
Without imposing a long-run restriction, LL-BMS will lead to population-coherent forecasts as the LL, which are, however, not age coherent when h . In contrast, following the discussions in Section 2.4, B ^ x , T + h E and B ^ x , T + h G are age invariant in the long run. Since k ^ T + h , j is a stationary process, b ^ x , j k ^ T + h , j will converge to a constant for all ages. Thus, the logged mortality rates forecast by LL-E and LL-G are both age- and population-coherent in the long run across all ages and populations.
For illustration purposes, we consider the baseline results only as in Section 3.1 and Section 3.2 for the multi-population models. The RMSFE 19 of the out-of-sample forecasts for each population are reported in Table 4. First, contrasting the results of Table 2 and Table 4, it is worth noting that in 10, 12, 8, and 9 out of the 15 countries, respectively, the forecasting performances of the LL, LL-BMS, LL-E, and LL-G are superior to those of the corresponding single-population counterparties. Second, LL-E and LL-G outperform the LL model in all the 15 populations. Compared to the LL-BMS model, LL-G and LL-H lead to more accurate forecasts in 11 cases. Finally, although LL-G beats LL-E in all cases except for Japan, the differences between the two time-varying models are still marginal in most cases. On average, LL-E and LL-G improve the forecasting performance of the LL by 9.1% and 11.3%, respectively. Thus, those results are very consistent with the single-population out-of-sample forecasting metrics, suggesting that allowing for dynamic age-specific mortality declines can produce the best-performing forecasts in the multi-population case.
We now examine the forecast life expectancy at birth over 2020–2100. In Figure 9, for the datasets of Australia, Canada, Spain, and Sweden, we present the point forecasts together with the 95% PIs produced by the LL and LL-G models. Consistent with Figure 7, the e ^ 0 generated by LL-G are uniformly larger than those by the LL. This can still be explained by the desirable age-coherence property of the LL-G model. In addition, all point forecasts of the LL-G model are above the corresponding upper bounds of the 95% PIs of the LC counterparty, suggesting a significant difference. Nevertheless, due to the population coherence, LL leads to similar forecasts of e 0 in 2100, ranging from 91.2 (Sweden) to 91.8 (Australia). For the same reason, the forecast e 0 in 2100 are also close to each other for the LL-G model, which vary from 93.0 (Sweden) to 94.8 (Canada). Nevertheless, for both the LL and LL-H models, the widths of the resulting PIs are narrower than those observed in Figure 7 for the single-population models. This can be attributed to the fact that more information is used in the multi-population framework, which can therefore reduce the uncertainty of the unexplained random disturbances.

5. Concluding Remarks

This paper investigates a time-varying coefficients extension of the Lee–Carter (LC) model. As argued in the influential work of Li et al. [5], mortality declines decelerate at young ages and accelerate at old ages. The lack of considering such dynamics makes the LC model less accurate in forecasting mortality rates in the long-term analysis.
To effectively resolve this issue without imposing computationally intensive burdens, we employ the kernel time-varying parametric approach as examined in a recent work of Chang and Shi [19]. Specifically, in-sample time-dependent b x , t are first estimated using the Epanechnikov (LC-E) and Gaussian (LC-G) kernel methods. To project the out-of-sample b ^ x , T + h , a VAR-type model similar to that considered in Li and Lu [6] is further adopted. The forecasting of logged mortality rates is therefore conducted analogously to that for the LC model. For the proposed LC-E and LC-G models, we can draw five key conclusions in this paper. First, the long-run age-specific patterns of mortality declines are forced (asymptotically) identical across all ages. This ensures the desirable age coherence as proposed in the recent literature [6,19,28,29]. Second, the LC-E and LC-G models are computationally efficient and data-driven. Specifically, both models adopt the penalized least square, which results in closed-form solutions. In addition, the parameters and smoothing penalties relevant to the speeds of rotation in b ^ x , T + h are estimated/determined in a data-driven fashion, which is different from the ad hoc judgment of the rotation used in Li et al. [5]. Thirdly, using a large sample of uni-sex data of 15 countries over 1950–2019, we demonstrate the outstanding out-of-sample forecasting performance of the LC-E and LC-G models. The differences in the selection of the kernel function is deemed marginal. This conclusion is robust when a five-year instead of a one-year age mortality rate is modeled, and the training period is shortened to 1970–2000. Therefore, we recommend modelling and forecasting mortality rates with the time-varying coefficients LC model in demographic practices. Fourthly, our long-term analysis supports the effectiveness of the age-coherent forecasts produced by the recommended model. When forecast to 2100, significantly longer life expectancies at birth are generated by the LC-G model in most illustrated cases, which may be caused by its larger forecast mortality improvements at old ages. Finally, multi-population extensions of the LC-E (LL-E) and LC-G (LL-G) are straightforward to derive, which are based on the specification of the Li–Lee (LL) model Li and Lu [9]. On one hand, our proposed extension can achieve coherence across both ages and populations, whereas LL can only forecast non-divergent mortality rates across populations. On the other hand, consistent conclusions hold as those in the single-population scenario. Such conclusions include the more accurate out-of-sample forecasts and larger long-range projections of life expectancies led by the LL-G than the LL model.
There are some directions that are worth examining for future research. First, it is of interest to investigate female and male populations separately. For each country, differences between the selected tuning parameters of the LC-E and LC-G models may indicate important variations of the short-term speeds of mortality rotations for the two sexes. In addition, recent studies, including Vekas [46], have outlined that the rotation proposed by Li et al. [5] is notably more prevalent in populations of women than among men. This strongly motivates the exploration of sex differences using LC-E and LC-G models. Secondly, the fitted d in modelling the temporal patterns of k t may be made further time-dependent. Third, it is worth exploring the applicability of our specification to alternative models as listed in Section 2.5. For factor-based models, such as Renshaw and Haberman [34], our time-varying specification may be straightforwardly extensible. It will be more challenging to incorporate dynamics in other frameworks. Fourth, in the multi-population case, we may allow the population-specific b x , i to be dynamic in the forecasting steps. This more flexible specification might further improve the forecasting accuracy and provide additional information on the population-specific rotation of the b x , i . Finally, impacts of COVID-19 may have significantly delayed mortality declines, especially for the old ages. After more data become available, our kernel-based models can be used to study how such shocks evolve and whether they are short-lived (and if so, when they eventually die out).

Author Contributions

Z.C.: writing—review and editing, formal analysis, visualization, validation; Y.S.: writing—original draft, formal analysis, methodology, data curation, software; A.S.: writing—review and editing, formal analysis, project administration, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Humanities and Social Sciences Project Funded by the Ministry of Education of the People’s Republic of China [grant number: 20YJA840001]; and the Research Foundation of Education Bureau of Hunan Province, China [grant number: 19A064].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Maestas, N.; Mullen, K.J.; Powell, D. The Effect of Population Aging on Economic Growth, the Labor Force and Productivity; Technical report; National Bureau of Economic Research: Cambridge, MA, USA, 2016. [Google Scholar]
  2. Fenton, J.J.; Jerant, A.F.; Bertakis, K.D.; Franks, P. The cost of satisfaction: A national study of patient satisfaction, health care utilization, expenditures, and mortality. Arch. Intern. Med. 2012, 172, 405–411. [Google Scholar] [CrossRef] [PubMed]
  3. Huang, W.; Zhang, C. The power of social pensions: Evidence from China’s new rural pension scheme. Am. Econ. J. Appl. Econ. 2021, 13, 179–205. [Google Scholar] [CrossRef]
  4. Lee, R.D.; Carter, L.R. Modeling and forecasting US mortality. J. Am. Stat. Assoc. 1992, 87, 659–671. [Google Scholar]
  5. Li, N.; Lee, R.; Gerland, P. Extending the Lee–Carter method to model the rotation of age patterns of mortality decline for long-term projections. Demography 2013, 50, 2037–2051. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Li, H.; Lu, Y. Coherent Forecasting of Mortality Rates: A Sparse Vector-Autoregression Approach. ASTIN Bull. J. IAA 2017, 47, 563–600. [Google Scholar] [CrossRef]
  7. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications: Monographs on Statistics and Applied Probability; CRC Press: Boca Raton, FL, USA, 1996; Volume 66. [Google Scholar]
  8. Human Mortality Database. University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). 2023. Available online: https://www.mortality.org/ (accessed on 28 January 2023).
  9. Li, N.; Lee, R. Coherent mortality forecasts for a group of populations: An extension of the Lee–Carter method. Demography 2005, 42, 575–594. [Google Scholar] [CrossRef] [Green Version]
  10. Booth, H.; Hyndman, R.; Tickle, L.; De Jong, P. Lee–Carter mortality forecasting: A multi-country comparison of variants and extensions. Demogr. Res. 2006, 15, 289–310. [Google Scholar] [CrossRef]
  11. Li, H.; Li, J.S.H. Optimizing the Lee–Carter approach in the presence of structural changes in time and age patterns of mortality improvements. Demography 2017, 54, 1073–1095. [Google Scholar] [CrossRef]
  12. Boonen, T.J.; Li, H. Modeling and forecasting mortality with economic growth: A multipopulation approach. Demography 2017, 54, 1921–1946. [Google Scholar] [CrossRef]
  13. Li, H.; Shi, Y. Forecasting mortality with international linkages: A global vector-autoregression approach. Insur. Math. Econ. 2021, 100, 59–75. [Google Scholar] [CrossRef]
  14. Li, H.; Shi, Y. Mortality forecasting with an age-coherent sparse var model. Risks 2021, 9, 35. [Google Scholar] [CrossRef]
  15. Renshaw, A.E.; Haberman, S. Lee–Carter mortality forecasting with age-specific enhancement. Insur. Math. Econ. 2003, 33, 255–272. [Google Scholar] [CrossRef]
  16. Li, H.; De Waegenaere, A.; Melenberg, B. The choice of sample size for mortality forecasting: A Bayesian learning approach. Insur. Math. Econ. 2015, 63, 153–168. [Google Scholar] [CrossRef]
  17. Booth, H.; Maindonald, J.; Smith, L. Applying Lee–Carter under conditions of variable mortality decline. Popul. Stud. 2002, 56, 325–336. [Google Scholar] [CrossRef] [PubMed]
  18. Brouhns, N.; Denuit, M.; Vermunt, J.K. A Poisson log-bilinear regression approach to the construction of projected lifetables. Insur. Math. Econ. 2002, 31, 373–393. [Google Scholar] [CrossRef] [Green Version]
  19. Chang, L.; Shi, Y. Dynamic modelling and coherent forecasting of mortality rates: A time-varying coefficient spatial-temporal autoregressive approach. Scand. Actuar. J. 2020, 843–863. [Google Scholar] [CrossRef]
  20. Huang, J.Z.; Wu, C.O.; Zhou, L. Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Stat. Sin. 2004, 14, 763–788. [Google Scholar]
  21. Huang, J.Z.; Shen, H. Functional coefficient regression models for non-linear time series: A polynomial spline approach. Scand. J. Stat. 2004, 31, 515–534. [Google Scholar] [CrossRef] [Green Version]
  22. Hastie, T.; Tibshirani, R. Varying-coefficient models. J. R. Stat. Soc. Ser. B Methodol. 1993, 55, 757–779. [Google Scholar] [CrossRef]
  23. Chiang, C.T.; Rice, J.A.; Wu, C.O. Smoothing spline estimation for varying coefficient models with repeatedly measured dependent variables. J. Am. Stat. Assoc. 2001, 96, 605–619. [Google Scholar] [CrossRef]
  24. Fan, J.; Zhang, W. Statistical estimation in varying coefficient models. Ann. Stat. 1999, 27, 1491–1518. [Google Scholar] [CrossRef]
  25. Wand, M.P.; Jones, M.C. Kernel Smoothing; CRC Press: Boca Raton, FL, USA, 1994. [Google Scholar]
  26. Theodoridis, S. Machine Learning: A Bayesian and Optimization Perspective; Academic Press: Cambridge, MA, USA, 2015. [Google Scholar]
  27. Feng, L.; Shi, Y. Forecasting mortality rates: Multivariate or univariate models? J. Popul. Res. 2018, 35, 289–318. [Google Scholar] [CrossRef]
  28. Feng, L.; Shi, Y.; Chang, L. Forecasting mortality with a hyperbolic spatial temporal VAR model. Int. J. Forecast. 2020, 37, 255–273. [Google Scholar] [CrossRef]
  29. Li, H.; Lu, Y.; Lyu, P. Coherent mortality forecasting for less developed countries. Risks 2021, 9, 151. [Google Scholar] [CrossRef]
  30. Guibert, Q.; Lopez, O.; Piette, P. Forecasting mortality rate improvements with a high-dimensional VAR. Insur. Math. Econ. 2019, 88, 255–272. [Google Scholar] [CrossRef] [Green Version]
  31. Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice; OTexts: Melbourne, Australia, 2018. [Google Scholar]
  32. Li, J. A quantitative comparison of simulation strategies for mortality projection. Ann. Actuar. Sci. 2014, 8, 281. [Google Scholar] [CrossRef]
  33. Cairns, A.J.; Blake, D.; Dowd, K. A two-factor model for stochastic mortality with parameter uncertainty: Theory and calibration. J. Risk Insur. 2006, 73, 687–718. [Google Scholar] [CrossRef]
  34. Renshaw, A.E.; Haberman, S. A cohort-based extension to the Lee–Carter model for mortality reduction factors. Insur. Math. Econ. 2006, 38, 556–570. [Google Scholar] [CrossRef]
  35. Plat, R. On stochastic mortality modeling. Insur. Math. Econ. 2009, 45, 393–404. [Google Scholar] [CrossRef]
  36. Currie, I.D.; Durban, M.; Eilers, P.H. Smoothing and forecasting mortality rates. Stat. Model. 2004, 4, 279–298. [Google Scholar] [CrossRef]
  37. Wang, D.; Lu, P. Modelling and forecasting mortality distributions in England and Wales using the Lee–Carter model. J. Appl. Stat. 2005, 32, 873–885. [Google Scholar] [CrossRef]
  38. Debón, A.; Montes, F.; Mateu, J.; Porcu, E.; Bevilacqua, M. Modelling residuals dependence in dynamic life tables: A geostatistical approach. Comput. Stat. Data Anal. 2008, 52, 3128–3147. [Google Scholar] [CrossRef]
  39. Hainaut, D. A neural-network analyzer for mortality forecast. ASTIN Bull. J. IAA 2018, 48, 481–508. [Google Scholar] [CrossRef] [Green Version]
  40. Levantesi, S.; Pizzorusso, V. Application of machine learning to mortality modeling and forecasting. Risks 2019, 7, 26. [Google Scholar] [CrossRef] [Green Version]
  41. Nigri, A.; Levantesi, S.; Marino, M.; Scognamiglio, S.; Perla, F. A deep learning integrated Lee–Carter model. Risks 2019, 7, 33. [Google Scholar] [CrossRef] [Green Version]
  42. Richman, R.; Wüthrich, M.V. A neural network extension of the Lee–Carter model to multiple populations. Ann. Actuar. Sci. 2021, 15, 346–366. [Google Scholar] [CrossRef]
  43. Haberman, S.; Renshaw, A. A comparative study of parametric mortality projection models. Insur. Math. Econ. 2011, 48, 35–55. [Google Scholar] [CrossRef] [Green Version]
  44. Cadena, M. Mortality Models based on the Transform ∖log (-∖log x). arXiv 2015, arXiv:1502.07199. [Google Scholar]
  45. Doukhan, P.; Pommeret, D.; Rynkiewicz, J.; Salhi, Y. A class of random field memory models for mortality forecasting. Insur. Math. Econ. 2017, 77, 97–110. [Google Scholar] [CrossRef] [Green Version]
  46. Vékás, P. Rotation of the age pattern of mortality improvements in the European Union. Cent. Eur. J. Oper. Res. 2020, 28, 1031–1048. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Averaged logged mortality rates.
Figure 1. Averaged logged mortality rates.
Healthcare 11 00743 g001
Figure 2. Dynamic age-specific declines b ^ x , t .
Figure 2. Dynamic age-specific declines b ^ x , t .
Healthcare 11 00743 g002
Figure 3. Fitted age-specific declines b ^ x , t of the averaged population, 1950–2019: (a) Epanechnikov kernel; (b) Gaussian kernel.
Figure 3. Fitted age-specific declines b ^ x , t of the averaged population, 1950–2019: (a) Epanechnikov kernel; (b) Gaussian kernel.
Healthcare 11 00743 g003
Figure 4. Mortality rates ( ln m x , t ) over 1950–2019:(a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Figure 4. Mortality rates ( ln m x , t ) over 1950–2019:(a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Healthcare 11 00743 g004
Figure 5. RMSFE over forecasting steps: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Figure 5. RMSFE over forecasting steps: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Healthcare 11 00743 g005
Figure 6. Projected rotation of age-specific mortality decline ( b ^ x , T + h ), 2020–2100: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Figure 6. Projected rotation of age-specific mortality decline ( b ^ x , T + h ), 2020–2100: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Healthcare 11 00743 g006
Figure 7. Forecast e 0 over 2020–2100, single-population models: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Figure 7. Forecast e 0 over 2020–2100, single-population models: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Healthcare 11 00743 g007
Figure 8. Forecast ln m x , t over 2020–2100: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Figure 8. Forecast ln m x , t over 2020–2100: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Healthcare 11 00743 g008
Figure 9. Forecast e 0 over 2020–2100, Multi-population models: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Figure 9. Forecast e 0 over 2020–2100, Multi-population models: (a) Australia; (b) Canada; (c) Spain; (d) Sweden.
Healthcare 11 00743 g009
Table 1. Unit root/Granger causality tests and bandwidths.
Table 1. Unit root/Granger causality tests and bandwidths.
LC-ELC-G
ADFGCBWADFGCBW
Australia345.07333.73
Austria653.13552.18
Canada365.23363.55
Denmark744.26752.73
UK339.79339.81
Finland335.15331.55
France433.37442.27
Italy443.21452.18
Japan679.71669.88
Netherlands565.96574.18
Norway545.23543.73
Spain533.21532.18
Sweden433.86442.64
Switzerland442.57451.82
USA449.92448.18
Mean4.44.2-4.34.5-
Note: This table presents the augmented Dickey–Fuller (ADF) unit root and Granger causality test results, as well as the chosen bandwidths (BW). In all cases, we report the number of negative results (stationarity assumption does not hold for ADF, and VAR(1) is not adequate for GC) out of 101 age groups at 5% significant level.
Table 2. Connecting rod parameter of PUMA560 robot.
Table 2. Connecting rod parameter of PUMA560 robot.
RMSFERanking
CountryLCBMSLC-ELC-GLCBMSLC-ELC-G
Australia0.1970.2120.1750.1753421
Austria0.2500.1910.1830.1834312
Canada0.1570.1660.1400.1413412
Denmark0.3890.4130.3450.3453412
UK0.1710.1670.1660.1644321
Finland0.2660.2800.2530.2483421
France0.2300.2010.1680.1684321
Italy0.2300.2950.1290.1293421
Japan0.4680.3540.4600.4604132
Netherlands0.2340.2200.2230.2234132
Norway0.2970.2710.2640.2644321
Spain0.2410.4310.1630.1633412
Sweden0.2440.2490.2150.2153412
Switzerland0.3590.3310.2960.2964312
USA0.1300.1330.1230.1223421
Mean0.2580.2610.2200.2203421
Table 3. Summary of robustness checks: Age and temporal variations.
Table 3. Summary of robustness checks: Age and temporal variations.
MeanStd. Dev.MedianQ 1 Q 3
Panel A: five-year age groups
LC0.2560.0890.2390.1980.288
BMS0.2650.0900.2370.2020.326
LC-E0.2200.0830.1890.1680.265
LC-G0.2190.0820.1890.1670.262
Panel B: 1970–2000
LC0.2400.0800.2180.1920.262
BMS0.2390.0840.2200.1800.288
LC-E0.2000.0620.1870.1530.240
LC-G0.2000.0620.1870.1520.238
Note: This table presents summary statistics of the root of mean squared forecasting error (RMSFE) with two variations in the sample: five-year age groups and period truncated to 1970–2019. Bold numbers are the smallest quantity for each statistic across the four models. Std. Dev. is the standard deviation. Q1 and Q3 are the first and third quartile, respectively.
Table 4. Multi-population out-of-sample forecasting performance.
Table 4. Multi-population out-of-sample forecasting performance.
Error MeasureRanking
CountryLLLL-BMSLL-ELL-GLLLL-BMSLL-ELL-G
Australia0.1850.1790.1760.1754321
Austria0.2440.2340.2140.2104321
Canada0.1630.1270.1430.1364132
Denmark0.2760.2990.2520.2503421
UK0.1670.1200.1580.1534132
Finland0.2680.2520.2630.2604132
France0.1950.2040.1470.1353421
Italy0.2330.2260.2170.2164321
Japan0.1840.1080.1680.1694123
Netherlands0.1790.1650.1510.1414321
Norway0.2470.2440.2330.2324321
Spain0.2710.2680.2490.2484321
Sweden0.2360.2210.2110.1984321
Switzerland0.3040.3130.2710.2633421
USA0.1850.1750.1710.1694321
Mean0.2220.2090.2020.1974321
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, Z.; Shi, Y.; Shu, A. Managing Mortality and Aging Risks with a Time-Varying Lee–Carter Model. Healthcare 2023, 11, 743. https://doi.org/10.3390/healthcare11050743

AMA Style

Chen Z, Shi Y, Shu A. Managing Mortality and Aging Risks with a Time-Varying Lee–Carter Model. Healthcare. 2023; 11(5):743. https://doi.org/10.3390/healthcare11050743

Chicago/Turabian Style

Chen, Zhongwen, Yanlin Shi, and Ao Shu. 2023. "Managing Mortality and Aging Risks with a Time-Varying Lee–Carter Model" Healthcare 11, no. 5: 743. https://doi.org/10.3390/healthcare11050743

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop