Application of Continuous Non-Gaussian Mortality Models with Markov Switchings to Forecast Mortality Rates

Featured Application: The proposed new methods of modelling and forecasting mortality rates are used, among others, to estimate life expectancy depending on the type of death as a fundamental life insurance factor. Abstract: The ongoing pandemic has resulted in the development of models dealing with the rate of virus spread and the modelling of mortality rates µ x , t . A new method of modelling the mortality rates µ x , t with different time intervals of higher and lower dispersion has been proposed. The modelling was based on the Milevski–Promislov class of stochastic mortality models with Markov switches, in which excitations are modelled by second-order polynomials of results from a linear non-Gaussian ﬁlter. In contrast to literature models where switches are deterministic, the Markov switches are proposed in this approach, which seems to be a new idea. The obtained results conﬁrm that in the time intervals with a higher dispersion of µ x , t , the proposed method the empirical data more accurately than the commonly used the Lee–Carter model.

Nowadays, the high dynamics of changes also force adjusting their parameters to the forecasting models' situations. Therefore, it can be observed that there is still a need to search for models that consider the variability of parameters over time. Some authors have proposed using the methodology of stochastic dynamic hybrid (switched) systems ( [22,23]). They considered the dynamic systems to consist of several subsystems described by deterministic or stochastic differential equations. These subsystems have the same structures and different parameters. The submodels can change over time according to a given switching rule, creating a hybrid system.
Recently, this idea was developed by [24][25][26], where the authors proposed measure changes (probability distributions) for LC-models. Another approach was developed S1. The family of extended Milevsky and Promislov mortality models with Gaussian linear scalar filters (GLSF) is studied. This family is described by stochastic processes representing a mortality rate µ x,t for a person aged X at time t. The solutions of the mentioned stochastic differential equations are considered with switches (Section 2.1). S2. Considering the ln-function of µ x,t , and applying the Ito formula, a new vector state with unknown parameters is introduced (Section 2.2). S3. Using moment equations for GLSF with Markov switches, the first-and the secondorder moments of equations for a particular case of two subsystem models (stationary and nonstationary solutions) are obtained, and approximate solutions are analysed (Sections 2.2.1-2.2.3). S4. A similar analysis to step S3 for the non-Gaussian linear scalar filters (nGLSF) model with Markov switches is repeated (Section 2.3). S5. The estimation procedure of the parameters (introduced in step S2) is applied (Section 2.4).
In Section 3, we have compared empirical mortality rates with theoretical ones obtained from the standard LC model and the models proposed in Section 2.1. Section 4 contains the discussion. Conclusions research are then given in Section 5.

Mathematical Preliminaries
Throughout this paper, we use the following notation. Let | · | and <·> be the Euclidean norm and the inner product in R n , respectively. We mark R + = [0, ∞), T = [t 0 , ∞), t 0 ≥ 0. Let Ξ = (Ω, F , {F t } t≥0 , P) be a complete probability space with a filtration {F t } t≥0 satisfying usual conditions. Let σ(t) : R + → S be the switching rule, where S = {1, . . . , N} is the set of states. We denote switching times as τ 1 , τ 2 , . . . and assume that there is a finite number of switches on every finite time interval. Let W k (t) be the independent Brownian motions. We assume that processes W k (t) and σ(t) are both {F t } t≥0 adapted.
By the stochastic hybrid system, we call the vector Itô stochastic differential equations with a switching rule described by where x ∈ R n is the state vector, (σ 0 , x 0 ) is an initial condition, t ∈ T and M is a number of Brownian motions. f(x(t), t, σ(t)) and g k (x(t), t, σ(t)) are defined by sets of { f (x(t), t, l)} and {g k (x(t), t, l)}, respectively, i.e., f(x(t), t, σ(t)) = f(x(t), t, l), g k (x(t), t, σ(t)) = g k (x(t), t, l) for σ(t) = l. Functions f : R n × T × S → R n and g k : R n × T × S → R n are locally Lipschitz and such that ∀l ∈ S, t ∈ T f(0, t, l) = g k (0, t, l) = 0, k = 1, . . . , M. These conditions together with these enforced on the switching rule σ ensure that there exists a unique solution of hybrid system (1). Hence, it follows that Equation (1) can be treated as a family (set) of subsystems defined by where x(t, l) ∈ R n is the state vector of the l-subsystem.
We assume that processes W k (t) and σ(t) are mutually independent and both are {F t } t≥0 adapted. Let r(t), t ≥ 0, be a right-continuous Markov chain on the probability space taking values in a finite state space S with the generator Γ = [γ ij ] N×N , i.e., where We assume that the Markov chain is irreducible, i.e., rank(Γ) = N − 1, and has a unique stationary distribution P = [p 1 , p 2 , . . . , p N ] T ∈ R N , which can be determined by solving

Model with Gaussian Linear Scalar Filter (GLSF)
We consider a family of extended Milevsky and Promislow mortality models ( [34]) with Gaussian linear scalar filters described by where µ x (t, l) is a stochastic process representing a mortality rate for a person aged x (x-fixed) at time t, α l x , β l x 1 , q l x 1 , µ l x0 , γ l x 1 , are constant parameters, l ∈ S; W(t) is a standard Wiener process, and y 1 (t) is an output of a linear filter with an input process W(t).
Taking the natural logarithm of both sides of Equation (5) and applying the Ito formula we find Introducing a new vector state: z x (t, l) = [z 1 (t, l), z 2 (t, l)] T = [ln µ x (t, l), y 1 (t, l)] T Equations (6) and (7) can be rewritten in a vector form The unknown parameters of this model are: ln µ l 0 , α l x , β l x 1 , q l x 1 , γ l x 1 , l ∈ S (further in the text, the parameters q l x i are denoted by q l i , i = 1, 2, 3).

Analysis of First Order Moments
Equating the derivatives in Equations (11) and (12) to zero, we find From Equations (19) and (20), it follows that We denote Substituting (21) to Equations (9) and (10), we obtain where The general solution of Equation (23) has the form after algebraic manipulation, one can find the solutions for both coordinations of the vector u(t) in the form If we assume u 10 = α 1 x 0 and u 20 = α 2 x 0 are constant parameters, and we use an approximation then we obtain approximate formulas for first-order moments in the form of linear time depending functions

Analysis of Second-Order Moments
Similar analysis may be derived for second-order moments E[z 2 then Equations (13) and (14) can be represented in the form Using Equalities (29) and (30) we find where The general solution of Equation (23) has the form If we assume φ 10 = c 1 0 x and φ 20 = c 2 0 x are constant parameters, and we use approximation (28), then after algebraic manipulation, we obtain approximate formulas for second-order moments in the form of quadratic linear time-dependent functions We note that in the case of nonstationary solutions of the first and second moment of the process z x 1 (t, l) without Markov switchings (ρ 1 = ρ 2 = 0), we obtain equalities where α l 0 x and c l 0 x are constants of integration and l = 1, 2.

Model with a Non-Gaussian Linear Scalar Filters (nGLSF)
We consider a family of mortality model with a continuous non-Gaussian scalar linear filter described by Introducing new variables: y 1 (t, l) = y(t, l), y 2 (t, l) = y 2 (t, l), y 3 (t, l) = y 3 (t, l) and applying the Ito formula, we obtain where µ x (t, l) is a stochastic process representing a mortality rate for a person aged x at time t, α l x , β l x 1 , q l 1 , q l 2 , q l 3 , µ l x0 , γ l x 1 are constant parameters, l ∈ S; W(t) is a standard Wiener process.
Taking natural logarithm of both sides of Equation (45) and applying the Ito formula, we find Introducing a new vector state Equations (46)-(49) can be rewritten in a vector form and z x (0, l) = [ln µ l 0 , 0, 0, 0] T . The unknown parameters are: ln µ l 0 , α l x , β l x 1 , γ l x 1 , q l i . Similar to Section 2.2, using the method of the moment equations (see, e.g., Appendix B.1), we find the nonstationary solutions of the first and second moment of the process z x 1 (t, l) (see, e.g., Appendix B.2) where α l 0 x and c l 0 x are constants of integration. To obtain the differential equations for first-and second-order moments with Markov switchings, we differentiate Equations (52) and (53) and denote in the forms (22)-(24) for first-order moments and (31)-(39) for second-order moments.
In the case of first-order moments with Markov switchings, we obtain exactly the same equations and solutions as in the previous chapter. In the case of second-order moments with Markov switchings, we find where The general solution of Equation (23) has the form If we assume φ 10 = c 1 0 x and φ 20 = c 2 0 x are constant parameters, and we use approximation (28), then, after algebraic manipulation, we obtain approximate formulas for second-order moments in the form of quadratic linear time-dependent functions

Procedure
Based on the following procedure (steps S1-S3), the time intervals with significantly higher and lower dispersion of the empirical µ x,t were determined: S1: ∑ y k (µ x,y k − E(µ x,y k )) 2 based on the empirical µ x,t mortality data was computed.
S2. The statistical hypothesis was verified: H 0 : σ 2 y k = σ 2 y k+1 (versus H 1 : ¬H 0 ), which assumes no statistically significant difference in dispersion between y k and y k+1 segments, based on the F statistic and the Fisher-Snedecor distribution. Rejection of H 0 indicates a significant change in dispersion between y k and y k+1 segments (selection based on the lowest p-value < α = 0.05, [37]).
The random search algorithm was used to estimate the parameters of the models mentioned in S3 and described in more detail, for example, in [29].
Due to the simultaneous occurrence of parameters a l , b l , c l 0 x and ρ l in Formulas (61)-(62), the parameter estimation was performed iteratively.
The empirical data of mortality rates µ x,t used in the article are contained in the Statistics Poland and The Human Mortality Database (source: [38,39]).

Results
Based on the observations µ x,t and the procedure described in Section 2.4, the variability of dispersion throughout the intervals seemed to occur mainly among women and men between 30 and 40 years of age (with some exceptions for both women and men). Apart from a woman/man 30-40 years old, the Fisher F-test for variance did not show significant differences in terms of lower and higher dispersion periods based on the F statistic. The obtained results are divided into two subgroups: one of them with significant differences between lower and higher dispersion and the second one in another case. The first group includes, for example, a 37-year-old woman (F 37 ), a 32-year-old man (M 32 ), a 36-year-old man (M 36 ), and a 37-year-old man (M 37 ). The second group, for example, consists of a 35-year-old woman (F 35 ) and a 22-year-old man (M 22 ). The above results are presented in Tables 1 and 2 and Figures 1-7 (source of empirical data: [38,39]     In Figures 2-7, the observed (blue circular points) and theoretical values of the µ x,t (using the Formulas (61) and (62)) mortality rates based on the Lee-Carter (red squares), nGs or nGMs (green diamonds) models from 1958-2019 (HMD) are presented. Moreover, the forecast (purple diamonds) and the yellow and sienna lines define 95% (Cl95) and 90% (Cl90) confidence intervals. In the case of F 37 , M 32 , and M 36 , the data were split into three periods of lower, higher, and again lower dispersion. In the case of M 37 , it is possible to distinguish an initial period with a higher dispersion (in the years 1958-1963), but the number of observations is not sufficient to estimate all parameters of the model (61)-(62) (Section 2.3). In this case, the nG 2 m model with switches in 1963 and 1991 was used [28]. For the F 35 , M 22 , the F-test showed no significant change in variance concerning all 10-year time intervals described in Section 2.4 (p-value> 0.05). Therefore, in this case, the sequence of observations in the period 1958-2019 cannot be divided into areas with significantly lower and higher dispersion. In this case, the nGs model with one switch (Chow test: F = 11.54, p-value = 7.88 ×10 −4 in the case of F 35 and F = 9.99, p-value = 0.0016 in the case of M 22 ) was applied to the complete set of observations (see Figures 6 and 7). In other cases, the nGMs model was better suited to the empirical data than the LC model or the nG model without switches. The LC model is generally accepted and treated as a benchmark model (see the Mean Squared Error (MSE) in Table 1). The 90% and 95% confidence intervals (CI) for the 2020-2027 forecast of the mortality rates based on nGMs model (F 37 Table 2.

Discussion
Based on Figures 1-7, the values of MSE contained in Table 1, the CI values contained in Table 2, and the results of the research contained in papers [28][29][30][31][32], the following observations can be made: • All the results obtained in the article regarding the proposed model were compared with the benchmark, i.e., the LC model, • In all the studied cases, the MSE value for the nGMs model, which took into account the divisions of periods into higher and lower dispersion, is lower than for the LC model, • As a consequence of the above observation, the respective confidence intervals are narrower, thereby resulting in more accurate forecasts, • In cases that neither contained higher or lower periods of dispersion nor switchings, the LC model works as a better model, one better suited to the empirical data, the LC model usually fits the empirical data better (F 35 , M 22 ), • In cases that neither contained higher or lower periods of dispersion nor switchings, the nGs model fits the empirical data better (M 37 ), • In cases of determining periods with significantly smaller and higher dispersion, the proposed method of modelling µ x,t reflects the shape of the empirical data better than the LC model and the nGs model, • The weakness of the proposed nGMs model can be found in the minimum number of observations that uniquely estimate all model parameters in the period with higher dispersion (in the case of the model (52)-(53) and (61)-(62), a minimum of seven observations, this condition is not met by M 37 ).
Based on the above observations, the nGMs model is recommended for modelling and forecasting mortality rates in time intervals with a higher dispersion of µ x,t . The model proposed in this paper provides closer estimates of empirical values than the model based on the Lee-Carter model. The proposed model can be applied, among other things, to estimate life expectancy by age group, taking into account the type of death. This fundamental life insurance factor allows the insurance company to evaluate risks more accurately and thus increase competitiveness in the financial market.

Conclusions
The aim of the article was to verify the validity of the non-Gaussian linear scalar filter model with homogeneous Markov switchings concerning empirical µ x,t . The obtained MSE results from Section 3 confirm the usefulness of the proposed model, by dividing the data into periods of lower and higher dispersion. To determine the time intervals with the significantly different variances of µ x,t , the Fisher test (F) was used.
Based on the lowest p-value, the limits of these intervals with statistically significantly higher and lower dispersion have been determined, according to the procedure described in Section 2.4. The article focuses mainly on the range between 30 and 40 years of age because in this interval, based on historical data, periods of lower (±1958-1970 and 1995-2019) and higher  dispersion, regardless of gender, were observed. Currently, in the case of the increased number of deaths from the end of 2019 to nowadays, also caused by different types of the virus, the modelling approach proposed in the article should find wide applications. However, the current epidemic time series is too short to model annual mortality rates. The paper opens possibilities of research on the heterogeneity of the Markov chain in the case of the nGLSF model with Markov switchings.
Additionally, one unresolved issue remains the selection of appropriate methods for estimating the parameters of models. While the parameters have been proposed already earlier in this article, further research could fine-tune the methods that would aid in estimating parameters. One of the proposals could be using MCMC methods, while another could be the use of machine learning techniques. Finally, the use of fractional differential equations for µ x,t modelling remains an area of inquiry for researchers.

Acknowledgments:
The authors would like to thank the anonymous reviewers for their valuable comments.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Model with Gaussian Linear Scalar Filter
Appendix A. 1

. Moment Equations for GLSF Model
The moment equations in this model for all l ∈ S are

. Partially Stationary Solutions for GLSF Model
Equating the derivatives in Equations (A1)-(A3) to zero, we find the following conditions for stationary moments Hence, and from Equation (A1), we find the nonstationary solution for the first moment of the process z x 1 (t, l): where α l 0 is a constant in the integration of Equation (A1). Next, equating the derivatives in Equations (A5)-(A9) to zero and taking into account conditions (A10), we obtain Substituting quantities (A12)-(A14) to Equation (A4), we obtain Hence, and from Equations (A11) and (A15), we find the nonstationary solution for the second moment of the process z x 1 (t, l) where c l 0 x is a constant of integration.

Appendix B. Model with Non-Gaussian Linear Scalar Filter
Appendix B.