Sex Differential Dynamics in Coherent Mortality Models

Abstract: The main purpose of coherent mortality models is to produce plausible, joint forecasts for related populations, avoiding, e.g., crossing or diverging mortality trajectories; however, the coherence assumption is very restrictive and enforces trends that may be at odds with data. In this paper we focus on coherent, two-sex mortality models and we prove, under suitable conditions, that the coherence assumption implies sex gap unimodality, i.e., that the difference in life expectancy between women and men will first increase and then decrease. Moreover, we demonstrate that, in the model, the sex gap typically peaks when female life expectancy is between 30 and 50 years. This explains why coherent mortality models predict narrowing sex gaps for essentially all Western European countries and all jump-off years since the 1950s, despite the fact that the actual sex gap was widening until the 1980s. In light of these findings, we discuss the current role of coherence as the gold standard for multi-population mortality models.


Introduction
The aim of coherent multi-population mortality models is to forecast the mortality of related populations while preserving the structural differences observed in the past, e.g., differences between countries, regional differences within a country, or higher mortality for men than for women. The idea is to prevent divergence or implausible crossings of mortality trajectories that can arise from forecasting each population individually. The concept of coherence was introduced by Li and Lee [1] as an extension to the popular Lee-Carter model [2], and later formalized by Hyndman et al. [3]. Since then, a number of coherent models have been proposed: within the Lee-Carter framework [4][5][6][7], with added cohort effects [8][9][10], and based on the functional data approach [3,11]. Today, coherence still serves as the gold standard for multi-population mortality models.
Technically, coherence means that the ratio of age-specific mortality rates of the populations being forecasted converges (to finite, age-specific constants). This requirement ties the forecasts together and ensures that differences in aggregate characteristics, e.g., survival probabilities and life expectancies, remain bounded as desired. The flip side, however, is that converging mortality sex ratios may not be supported by data, and enforcing convergence can, despite the intention, lead to unrealistic continuations of historic trends, see [12,13]. In this paper, we investigate and exemplify the implications of coherence in two-sex mortality models, in particular the implications for the dynamics of the life expectancy difference between the sexes.
The sex differential in life expectancy is a key statistic for summarizing and communicating discrepancies in sex-specific mortality curves. The differential has varied considerably over time, although its general shape has been fairly consistent from country to country. In the Western world, the gap has the shape of a (unimodal) hill; the differential widened substantially in favor of women throughout most of the 20th century, but the trend reversed around the 1980s and the gap has continued to narrow since then [14,15].
Studies have sought to explain the observed trends in the sex gap through changes in behavioral, socioeconomic, and health factors, for example, smoking and drinking habits [16][17][18], labor market participation [19,20], risk behavior [21], and cause-of-death contributions [22][23][24]. The rationale is that changes in external risk factors explain changes in mortality sex ratios, which in turn explain trends in the life expectancy differential. This common-sense reasoning assumes that changing mortality sex ratios are the main driver of the sex gap, both when it increases and when it decreases. However, demographic analyses have challenged this assumption. In particular, Glei and Horiuchi [15] and Cui et al. [25] show that while the widening sex gap was indeed caused by changing mortality sex ratios, the narrowing of the sex gap was primarily caused by general mortality improvements for both sexes in combination with heterogeneity in the death distributions. In simpler terms, the expanding sex gap in the Western world was caused primarily by female mortality improving faster than male mortality (i.e., changing mortality sex ratios), while the subsequent narrowing of the gap was caused primarily by improvements in mortality for both sexes, with women retaining their relative advantage (i.e., improvements under stable mortality sex ratios).
In this paper, we demonstrate that coherent two-sex models generally imply unimodal sex gap dynamics. At first sight, this might seem an attractive feature given that the sex gap in the Western world has also been unimodal, as described above. It turns out, however, that in practice the forecasts are almost always on the declining part of the sex gap trajectory. This in turn implies that coherent models are ill-suited to forecasting mortality in periods with increasing sex gaps. Coincidentally, the concept of coherence was introduced after a prolonged period of narrowing sex gaps, and the continuation of this trend was seen as an argument in favor of coherent models, see Hyndman et al. [3]. That coherent models produce sensible forecasts only in periods with narrowing sex gaps does, however, call into question the status of coherence as a universally desirable feature in multi-population models. We will return to this point towards the end of the paper.
The rest of the paper is organized as follows. First, we illustrate the evolution of the sex gap in Western Europe since the 1950s and we survey the existing sex gap decomposition methods, formalizing the ratio and level effects responsible for the widening and the narrowing of the gap, respectively. Second, we present our main mathematical result showing sex gap unimodality in (strongly) coherent models under certain conditions; we discuss the intuition behind the conditions and show by example when multi-modality can occur. Third, we analyze a dynamic Gompertz model as a simple example of a coherent model satisfying the conditions; we compute the typical sex gap trajectories that can arise under this model and we use this to explain why the forecasted sex gap will almost always be narrowing. Fourth, we apply the coherent models of Li and Lee [1] and Hyndman et al. [3] to Western European countries at selected jump-off years during the period with expanding sex gaps; with reference to the mathematical results, we discuss why the majority of these forecasts predict narrowing gaps. Finally, we end with some concluding remarks.

Data and Notation
Data is obtained from the Human Mortality Database [26] and consists of death counts, D(x, t), and central exposure-to-risk estimates, E(x, t), on Lexis A-sets, that is, age-period cells of the form [x, x + 1) × [t, t + 1) for integer ages x ∈ {0, ..., 110} and calendar years t ∈ {1950, ..., 2020} for countries in Western Europe. The empirical death rate is estimated as m(x, t) = D(x, t)/E(x, t).
Death rates for Western Europe are obtained by pooling death and exposure counts across individual countries. Throughout, (period) life expectancies are calculated by numerical integration whenever a continuous mortality curve is available, and under the assumption of piecewise constant mortality whenever the mortality curve is only available at integer ages. That is, $e(0,t) = \int_0^\omega S(x,t)\,dx = \sum_{x=0}^{\omega-1} S(x,t)\,\frac{1 - e^{-\mu(x,t)}}{\mu(x,t)}$ (2), where $S(x,t) = \exp\left(-\int_0^x \mu(y,t)\,dy\right)$ is the (period) survival function, µ(x, t) the force of mortality at age x and time t, and ω ∈ (0, ∞) the age at truncation. Note that ω is not necessarily the maximum attainable age, and we do not assume that µ(x) = ∞ for x > ω. We return to the role of the truncation age later in the paper. The last equality in Equation (2) assumes piecewise constant mortality; a full derivation is given in Appendix C.
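The piecewise-constant calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `life_expectancy`, the `start_age` argument, and the convention that the length of the input list plays the role of the truncation age ω are assumptions made here.

```python
import math


def life_expectancy(rates, start_age=0):
    """Period life expectancy at start_age under piecewise-constant mortality.

    rates[x] is the force of mortality on the age interval [x, x + 1).
    The calculation is truncated at omega = len(rates) (an assumption of
    this sketch).
    """
    survival = 1.0  # probability of surviving from start_age to the current age
    total = 0.0
    for mu in rates[start_age:]:
        # Person-years lived in [x, x + 1) per survivor at age x:
        # the integral of exp(-mu * s) over s in [0, 1].
        years = (1.0 - math.exp(-mu)) / mu if mu > 0 else 1.0
        total += survival * years
        survival *= math.exp(-mu)
    return total


if __name__ == "__main__":
    # Hypothetical flat schedule: hazard 0.05 at every age, truncated at 100.
    print(life_expectancy([0.05] * 100))
```

For a flat schedule the sum telescopes to (1 − exp(−µω))/µ, which gives a convenient sanity check of the routine.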

Changes in Mortality, Life Expectancy, and Sex Differentials
In this section, we briefly survey the existing methods for decomposing changes in life expectancy and the sex differential. This provides the basis for the subsequent treatment of sex differentials in coherent mortality models.

Decomposing Changes in Life Expectancy
There are two main approaches for decomposing changes in life expectancy into constituent parts. The first approach, pioneered by Pollard [27] and Arriaga [28], focuses on the effect of changing mortality from one age-specific schedule to another, and is typically used to assess how different age groups contribute to life expectancy progress between two distinct points in time. The second approach, popularized by Keyfitz [29], examines the effect of a local change to the mortality curve by quantifying how age-specific improvements in mortality translate into changes in life expectancy, $\dot e(0,t) = \int_0^\omega \rho(x,t)\,\mu(x,t)\,S(x,t)\,e(x,t)\,dx$ (3), where $\rho(x,t) = -\dot\mu(x,t)/\mu(x,t)$ is the age-specific rate of mortality improvement and e(x, t) is the remaining life expectancy at age x. In (3) and throughout, a dot over a function is used to denote its derivative with respect to time as in Vaupel and Romo [30].
Analyzing the effects of local change has been instrumental for understanding the linkage between the age pattern of mortality and the trends in e(0, t) observed in data. Indeed, studies have shown that the dispersion of the life table death distribution is a main determinant of the pace at which life expectancy improves [29,31,32].
The average number of life-years lost due to death (lifespan disparity) is given by $e^\dagger(t) = \int_0^\omega w(x,t)\,dx$ (4), where w(x, t) = µ(x, t)S(x, t)e(x, t) is the (life table) probability of dying at age x times the remaining life expectancy at that age. Lifespan disparity, e†, is a dispersion measure, i.e., it quantifies the extent to which deaths are spread across ages. At one extreme, if e† is zero then everyone dies at the same age. Conversely, if e† is large then the population experiences a high number of premature deaths and large gains in e(0, t) can be made by reducing mortality. Improvements among the young are particularly important as more life-years are lost upon death at these ages. This relation was formalized by Keyfitz [29], who showed that if the same rate of mortality improvement, ρ(t), applies to all ages, then the change in life expectancy can be expressed as ė(0, t) = ρ(t)e†(t). The absolute change in e(0, t) can thus be interpreted as the product of the proportion of deaths averted (ρ) and the average number of life-years gained (e†) by those who now survive. Vaupel and Romo [30] generalized Keyfitz's formula to the case of age-dependent improvement rates, and suggested that a change in life expectancy at birth be decomposed into two components, $\dot e(0,t) = \bar\rho(t)\,e^\dagger(t) + \mathrm{Cov}(\rho, e)$ (5), where the first term captures the main effect of improvement, while the second term arises due to heterogeneity in ρ(x, t) at different ages. In (5), $\bar\rho$ denotes the average rate of improvement and Cov(ρ, e) is the covariance between improvement rates and life expectancy, see Vaupel and Romo [30] for details. Equation (5) is often taken as the basis for deriving the dynamics of life expectancy sex differentials.
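Keyfitz's relation ė(0, t) = ρ(t)e†(t) can be checked numerically. The sketch below assumes a Gompertz hazard µ(x, t) = exp(α + βx − ρt) with hypothetical parameter values, and it uses the identity e† = ∫ S(x, t) I(x, t) dx, where I is the cumulative hazard; this identity follows from the definition of w(x, t) by changing the order of integration. All function names and constants are illustrative choices.

```python
import math

# Hypothetical Gompertz parameters: mu(x, t) = exp(ALPHA + BETA*x - RHO*t)
ALPHA, BETA, RHO, OMEGA = -9.0, 0.09, 0.02, 110.0


def cumulative_hazard(x, t):
    """I(x, t): the integral of mu(u, t) over u in [0, x], in closed form."""
    return math.exp(ALPHA - RHO * t) * math.expm1(BETA * x) / BETA


def integrate(f, n=2000):
    """Trapezoidal rule on [0, OMEGA] with n steps."""
    h = OMEGA / n
    inner = sum(f(i * h) for i in range(1, n))
    return h * (0.5 * f(0.0) + inner + 0.5 * f(OMEGA))


def e0(t):
    """Life expectancy at birth, truncated at OMEGA."""
    return integrate(lambda x: math.exp(-cumulative_hazard(x, t)))


def e_dagger(t):
    """Lifespan disparity, computed as the integral of S(x, t) * I(x, t)."""
    return integrate(
        lambda x: math.exp(-cumulative_hazard(x, t)) * cumulative_hazard(x, t)
    )


def e0_dot(t, dt=1e-4):
    """Central finite-difference approximation of the time derivative of e0."""
    return (e0(t + dt) - e0(t - dt)) / (2 * dt)


if __name__ == "__main__":
    # The two printed numbers should agree closely.
    print(e0_dot(0.0), RHO * e_dagger(0.0))
```

Because the improvement rate is uniform across ages in this sketch, the covariance term of the Vaupel and Romo decomposition vanishes and only the main effect remains.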

The Rise and Fall of Sex Differentials in Western Europe
The difference in life expectancy at birth between females and males in a given population (the sex gap) is defined as $\theta(t) = e_f(0,t) - e_m(0,t)$ (6), where the subscripts f and m denote female and male quantities, respectively. In applications we might also be interested in (remaining) life expectancy and the sex differential at ages other than 0; all formulas apply, mutatis mutandis, for a general age, but for ease of notation we develop the theory for age 0 only. Figure 1 shows the evolution of the sex gap since 1950 in Western Europe. The overall pattern is the same across all countries, with the gap shaped like a hill. That is, the gap initially widened, but has since declined, with differentials currently around 3-6 years. The timing of the turning point varies by country, occurring first in the United Kingdom circa 1970 and last in Spain towards the end of the 1990s. The rise and fall of sex differentials observed in the data has recently been studied using demographic decompositions. The principal method of Glei and Horiuchi [15] separates the change in θ into effects due to changing mortality ratios and effects due to changing mortality levels, namely $\dot\theta(t) = \int_0^\omega \frac{\dot c(x,t)}{c(x,t)}\,\frac{w_f(x,t) + w_m(x,t)}{2}\,dx + \int_0^\omega \left(-\frac{\dot\zeta(x,t)}{\zeta(x,t)}\right)\left[w_f(x,t) - w_m(x,t)\right]dx$ (7), where the first term is the ratio effect and the second term is the level effect, $c(x,t) = \mu_m(x,t)/\mu_f(x,t)$ is the mortality sex ratio, $\zeta(x,t) = \sqrt{\mu_m(x,t)\,\mu_f(x,t)}$ the geometric average of the two mortality rates, and w_f and w_m are as in Equation (4) for females and males, respectively. Note that $\dot c(x,t)/c(x,t) = \rho_f(x,t) - \rho_m(x,t)$, where $\rho_g(x,t) = -\dot\mu_g(x,t)/\mu_g(x,t)$ is the rate of mortality improvement for sex g. It follows that the ratio effect is positive when ρ_f > ρ_m, while the sign of the level effect depends on the relative size of mortality dispersion for the two sexes (as measured by w_f and w_m); thus, the two effects can work either in the same direction or against each other. Using (7), Glei and Horiuchi [15] show that the initial widening of θ seen in Figure 1 is caused primarily by women experiencing comparatively larger rates of mortality improvement than men (ρ_f > ρ_m), while the subsequent narrowing is largely attributable to differential dispersion between the sexes (w_m > w_f) in combination with general improvements. These results are echoed by Cui et al. [25], who use (5) to separate $\dot\theta$ into three components of change through which they obtain conditions for the sex gap to be widening, narrowing, or at a turning point.

Pollard's Paradox
Glei and Horiuchi [15] and Cui et al. [25] both stress that differential dispersion plays a pivotal role in determining changes in θ. In fact, differential dispersion may give rise to somewhat counter-intuitive changes. Pollard [27], for example, demonstrated that two populations may experience the same absolute change in mortality but, at the same time, a widening life expectancy differential. The argument is as follows. For i ∈ {f, m}, consider an absolute decrease in mortality at all ages, that is, $\tilde\mu_i(x) = \mu_i(x) - \varepsilon$ for some ε > 0. Let $\tilde S_i(x) = S_i(x)\,e^{\varepsilon x}$ denote the new probability of surviving to age x. Assuming µ_f(x) < µ_m(x) for all x, the new life expectancy differential is then larger than before, $\theta(\varepsilon, \varepsilon) = \int_0^\omega \left[S_f(x) - S_m(x)\right] e^{\varepsilon x}\,dx > \int_0^\omega \left[S_f(x) - S_m(x)\right] dx = \theta$ (8), where θ(ε_f, ε_m) denotes the life expectancy differential when female mortality is decreased by ε_f and male mortality by ε_m. If we subsequently increase female mortality slightly, we obtain a situation with 0 < ε_f < ε and θ < θ(ε_f, ε), i.e., a narrowing mortality differential with a widening life expectancy differential. This phenomenon is sometimes referred to as Pollard's paradox.
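A small numerical illustration of Pollard's paradox can be obtained with constant hazards, for which truncated life expectancy has the closed form (1 − exp(−µω))/µ. The hazard levels, the truncation age, and the function names below are hypothetical values chosen for this sketch.

```python
import math

OMEGA = 100.0  # truncation age (hypothetical)


def e0_constant(mu):
    """Life expectancy at birth under a constant hazard mu, truncated at OMEGA."""
    return (1.0 - math.exp(-mu * OMEGA)) / mu


def sex_gap(mu_f, mu_m, eps_f=0.0, eps_m=0.0):
    """theta(eps_f, eps_m): the gap after absolute reductions eps_f and eps_m."""
    return e0_constant(mu_f - eps_f) - e0_constant(mu_m - eps_m)


mu_f, mu_m, eps = 0.02, 0.03, 0.005
theta_before = sex_gap(mu_f, mu_m)               # no change in mortality
theta_equal = sex_gap(mu_f, mu_m, eps, eps)      # same absolute reduction for both sexes
theta_paradox = sex_gap(mu_f, mu_m, 0.004, eps)  # smaller reduction for females

if __name__ == "__main__":
    print(theta_before, theta_equal, theta_paradox)
```

With these values the three gaps are roughly 11.6, 15.1, and 13.2 years. In the last scenario the absolute mortality difference shrinks from 0.010 to 0.009, yet the life expectancy differential is still wider than before.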
In a similar fashion we can argue that a narrowing life expectancy differential does not guarantee narrowing mortality ratios. For instance, starting from Keyfitz's formula, when the rate of improvement is the same at all ages, we have $\dot\theta(t) = \rho_f(t)\,e_f^\dagger(t) - \rho_m(t)\,e_m^\dagger(t)$ (9). In particular, if $1 < \rho_f(t)/\rho_m(t) < e_m^\dagger(t)/e_f^\dagger(t)$, then $\dot\theta(t) < 0$, even though ρ_m(t) < ρ_f(t). This situation could occur if female life expectancy is close to ω, in which case e†_f(t) is small, but male life expectancy is not, in which case e†_m(t) is comparatively larger. In this scenario, female mortality can improve at a faster pace than male mortality, but because males benefit from the improvements across a larger span of ages, they gain life expectancy faster than females. Generally, constant mortality ratios can occur together with both increasing and decreasing life expectancy differentials. Therefore, on its own, a narrowing life expectancy differential cannot be interpreted as male mortality rates "catching up" to female mortality rates.

Sex Differentials in Coherent Mortality Models
In this section, we present our main mathematical result about sex gap unimodality under coherence. Since coherence implies that the modeled mortality schedules evolve in parallel, it is clear that the forecasted life expectancies approach the same maximal age, or age of truncation, when time approaches infinity. Likewise, it is clear that if we backcast, i.e., "run the model backwards", both mortality schedules will degenerate and the life expectancies converge to zero. Hence, in both limits the sex gap converges to zero. The question is, what happens in between these limits? It might appear obvious that the sex gap will first increase and then decrease, i.e., be unimodal. However, in full generality, this is not true; assuming only coherence, the sex gap can in general exhibit an arbitrary number of modes. The condition we provide indicates that the mortality schedules have to have the same "shape", in a sense to be made precise later, to guarantee unimodality of the sex gap. Although the mathematical result may not cover most coherent mortality models used in practice, the implications of the result in terms of the location of the peak seem to be valid in much greater generality than proven. The result thereby provides an insight as to why coherent mortality models almost always forecast closing sex gaps.

Coherent Mortality Modeling
In demographic applications it is often required to make forecasts of related populations, e.g., Western, low-mortality countries, or males and females in a given population. Separate forecasting of even very similar populations runs the risk of exaggerating short-term differences leading to diverging projections, but such outcomes seem implausible if the populations have evolved in parallel in the past. For instance, in the case of females and males, we expect the mortality of both groups to keep improving, but we also expect shorter life spans of men relative to women to persist despite converging social and lifestyle factors [1,13,33,34]. The intuitively appealing property that forecasts of related populations should "stay together" is formalized by the concept of coherence and its use is often motivated by a desire for preserving historic relationships.
The notion of coherence was introduced by Li and Lee [1] and later formalized by Hyndman et al. [3]. Given a model for the mortality of two populations, µ_i(x, t), we let $\bar\mu_i(x,t)$ denote the forecast for population i ∈ {1, 2}. The distinction is that µ_i is typically a stochastic process, while $\bar\mu_i$ is a deterministic forecast, e.g., obtained as the median projection of µ_i. The mortality forecasts are said to be coherent if their ratio converges to positive, finite, age-specific constants c(x), that is, $\lim_{t\to\infty} \bar\mu_1(x,t)/\bar\mu_2(x,t) = c(x)$ for all x (10). Forecasts for a group of populations are coherent if the forecasts are pairwise coherent (Box 1).

Box 1. On the definition of coherence
The literature is marked by some confusion regarding the precise, mathematical definition of coherence. Scholars seem to agree on the property as one ensuring non-diverging forecasts, but one can find contradictory definitions depending on the context in which the concept is used. In particular, it is often unclear whether coherence is a property concerning deterministic or stochastic forecasts, especially when authors define coherence as a property related to the mean forecast but apply the concept in a stochastic setting. In the original paper by Li and Lee [1], coherence was directed at deterministic forecasts, namely "to avoid long-run divergence in mean mortality forecasts" (p. 577) by "imposing shared rates of change by age" (p. 575). The definition given by Hyndman et al. [3] is often quoted as the one formalizing coherence and labels "mortality forecasts as coherent when the forecast age-specific ratios of death rates for any two subpopulations converge to a set of appropriate constants" (p. 262). This definition, however, seems inappropriate for stochastic models. The proper, mathematical definition of coherence in the spirit of Hyndman et al. [3] would be to label forecasts as coherent if the age-specific mortality ratio converges to a stationary distribution π, that is, $\mu_1(x,t)/\mu_2(x,t) \xrightarrow{d} \pi$ as $t \to \infty$ (11). In this paper, however, we use the usual deterministic definition given in (10).
The model proposed by Li and Lee [1], namely the augmented common-factor model or colloquially the Li-Lee model, models the observed death rate in population i as $\log m_i(x,t) = \alpha_{x,i} + B_x K_t + \beta_{x,i}\,\kappa_{t,i} + \varepsilon_{x,t,i}$ (12), where K_t and κ_{t,i} are stochastic processes modeling common and population-specific secular trends, respectively, and ε_{x,t,i} is the observation error, i.e., the difference between the underlying mortality rates, µ_i(x, t), and the observed death rates, m_i(x, t). Median forecasts are obtained by inserting the estimates for the age-specific loadings, $\hat\alpha_{x,i}$, $\hat B_x$ and $\hat\beta_{x,i}$, and turning off the error terms, i.e., by $\log \bar\mu_i(x,t) = \hat\alpha_{x,i} + \hat B_x \bar K_t + \hat\beta_{x,i}\,\bar\kappa_{t,i}$ (13), where $\bar K_t$ and $\bar\kappa_{t,i}$ denote median forecasts of the corresponding processes. The model is coherent (i.e., it produces coherent forecasts) when the κ_{t,i}'s are modeled as stationary, zero-mean processes, e.g., AR(1)-processes. This assumption ensures that each $\bar\kappa_{t,i}$ converges to zero, implying that asymptotically all population mortalities are subject to the same age-specific rates of improvement, which is the content of (10). The Li-Lee model is an archetypical coherent mortality model, and we recall it here to remind the reader of the type of models that we are considering. We return to this model in Section 5.
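The coherence mechanism of the Li-Lee model can be illustrated for a single age with a short Python sketch. It assumes that the common index follows a random walk with drift and that each population-specific index is a zero-mean AR(1) process, so that the median forecasts are K_T + d·h and φ^h·κ_T; these time-series choices, the function names, and all parameter values are hypothetical and serve only as an illustration.

```python
import math


def median_log_rate(alpha, B, beta, K_T, drift, kappa_T, phi, h):
    """Median log death rate h years ahead for one age and one population.

    Assumes a random walk with drift for K and a zero-mean AR(1) process
    with coefficient |phi| < 1 for kappa (illustrative choices).
    """
    K_bar = K_T + drift * h            # median of the common index
    kappa_bar = (phi ** h) * kappa_T   # median of the population-specific index
    return alpha + B * K_bar + beta * kappa_bar


def forecast_ratio(h):
    """Forecast male/female mortality ratio at one age (hypothetical numbers)."""
    shared = dict(B=0.01, K_T=-20.0, drift=-1.0)
    log_m = median_log_rate(alpha=-4.0, beta=0.02, kappa_T=5.0, phi=0.9, h=h, **shared)
    log_f = median_log_rate(alpha=-4.6, beta=0.03, kappa_T=-4.0, phi=0.9, h=h, **shared)
    return math.exp(log_m - log_f)


if __name__ == "__main__":
    for h in (0, 10, 50, 200):
        print(h, forecast_ratio(h))
```

Since the common term cancels in the ratio and the AR(1) medians decay geometrically, the forecast ratio in this sketch approaches exp(α_m − α_f) = exp(0.6) as the horizon grows.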
For the rest of the paper we focus on coherent, two-sex mortality models. Of course, mathematically, it makes no difference whether the mortalities are interpreted as sex-specific or not, but with the applications in mind and for ease of presentation, from now on we phrase everything in terms of female and male mortality.

Example: Sex Gap Unimodality for Truncated Exponential Distributions
To gain some intuition for the problem and method of proof, we start with a simple example in which the calculations can be made explicit. Assume female and male mortality are given by µ_f(x, t) = µ/t and µ_m(x, t) = cµ/t, respectively, for 0 ≤ x ≤ ω < ∞ and t ∈ (0, ∞), where µ > 0 and c > 1 are given constants. Hence, the period life times are distributed as truncated exponential variates with life expectancy expressed as a function of the level of mortality. Defining θ(t) as in (6), we then have $\theta(t) = \int_0^\omega \left(e^{-x\mu/t} - e^{-xc\mu/t}\right) dx = \frac{t}{\mu}\left(1 - e^{-\omega\mu/t}\right) - \frac{t}{c\mu}\left(1 - e^{-\omega c\mu/t}\right)$ (14). For t tending to zero, both sexes die instantaneously after birth, while for t tending to infinity, both sexes become immortal on [0, ω]. Thus, θ is zero in both limits, while strictly positive for 0 < t < ∞. Consequently, since θ is smooth, it must have at least one stationary point, i.e., there must exist a t such that $\dot\theta(t) = 0$. If we can prove that this is the only stationary point, it follows that θ is unimodal. Now, note that if θ were to have more than one stationary point, they could not all be local maxima; some of them would have to be local minima or points of inflection. From this observation, it follows that if we can prove that all stationary points are (local) maxima, there can be only one stationary point and we are finished. This in turn follows if we can prove that $\ddot\theta(t) < 0$ whenever $\dot\theta(t) = 0$. By direct calculations we find $\dot\theta(t) = \frac{\mu}{t^2}\int_0^\omega x\left[e^{-x\mu/t} - c\,e^{-xc\mu/t}\right] dx$ (15) and $\ddot\theta(t) = -\frac{\mu\omega^2}{t^3}\left[e^{-\omega\mu/t} - c\,e^{-\omega c\mu/t}\right]$ (16). Let t be a stationary point for θ, and define the functions $p(x) = e^{-x\mu/t}$ and $q(x) = c\,e^{-xc\mu/t}$ for x ∈ [0, ω]. We want to prove p(ω) > q(ω), since that implies $\ddot\theta(t) < 0$ by (16). By assumption, $\dot\theta(t) = 0$ and it therefore follows from (15) that $\int_0^\omega x\,[p(x) - q(x)]\,dx = 0$. Since p(0) = 1 < c = q(0) and since p and q can cross at most once (consider the straight lines x → log p(x) and x → log q(x)), we must have that p(ω) > q(ω), because otherwise the integrand would be strictly negative Lebesgue almost everywhere on [0, ω], contradicting that the integral is zero. This concludes the proof.
In this specific example we could of course also have investigated the monotonicity properties of θ more directly to arrive at the same conclusion. However, the approach of examining stationary points better extends to the general situation in which explicit expressions for θ are not available.
Note that the existence of a maximum for θ hinges on θ being zero as time tends to infinity. In the example, both sexes experience rectangularization of the survival curve such that their life expectancies converge to the same, finite upper limit. However, if we remove the truncation the situation changes. Without truncation, the sex gap is θ(t) = t/µ − t/(cµ) = t(c − 1)/(cµ) with $\dot\theta(t) = (c - 1)/(c\mu) > 0$. In this case the sex gap is monotonically increasing to infinity as t tends to infinity. Assuming that an upper limit on life expectancy exists is probably uncontroversial, but it is worth noting that it is this assumption that forces a diminishing sex gap in the limit.
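The truncated exponential example is easy to explore numerically. The sketch below evaluates the closed-form sex gap for µ_f = µ/t and µ_m = cµ/t on a time grid and counts interior local maxima; the parameter values (µ = 1, c = 2, ω = 100), the grid, and the helper names are assumptions of this illustration.

```python
import math


def sex_gap(t, mu=1.0, c=2.0, omega=100.0):
    """Closed-form gap for hazards mu/t (female) and c*mu/t (male), truncated at omega."""
    # -expm1(-a) equals 1 - exp(-a) and stays accurate for small a.
    female = -(t / mu) * math.expm1(-omega * mu / t)
    male = -(t / (c * mu)) * math.expm1(-omega * c * mu / t)
    return female - male


def count_peaks(values):
    """Number of strict interior local maxima in a sequence."""
    return sum(
        1
        for prev, cur, nxt in zip(values, values[1:], values[2:])
        if cur > prev and cur > nxt
    )


grid = [float(t) for t in range(1, 2001)]
gaps = [sex_gap(t) for t in grid]
peak_time = grid[max(range(len(gaps)), key=gaps.__getitem__)]

if __name__ == "__main__":
    print("peaks:", count_peaks(gaps), "peak at t =", peak_time)
```

On this grid the gap rises to a single peak and then declines towards zero, in line with the argument above; removing the truncation (letting ω grow without bound) makes the peak move out indefinitely.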

Main Result
We are now ready to state our main mathematical result concerning sex gap unimodality in coherent models. Since our main purpose is to provide insights into the role of coherence, we will only prove the result under the assumption of uniform rates of improvement. More general versions of the result exist, but the conditions become less intuitive, harder to interpret, and tedious to verify.
A twice continuously differentiable function f is called (strictly) unimodal on A ⊆ R if there exists an a ∈ A such that f is (strictly) increasing for t < a and (strictly) decreasing for t > a. Let µ : [0, ω] × R → (0, ∞) be a twice continuously differentiable function satisfying $\lim_{t\to-\infty} \mu(x,t) = \infty$ and $\lim_{t\to\infty} \mu(x,t) = 0$ for all x (17). We think of µ as the backcasted/forecasted mortality surface in a given mortality model, but for ease of notation we leave out the bar over µ that we used in Section 3.1. We assume that the relevant time limits are plus/minus infinity, but other limits could also have been used, e.g., zero and infinity as in the example in Section 3.2.
Let µ_f(x, t) = µ(x, t) and µ_m(x, t) = µ(x, t)c(x) denote female and male mortality, respectively. Thus, µ_m(x, t)/µ_f(x, t) = c(x) and the mortality forecasts are therefore strongly coherent, in the sense that the limit in (10) is replaced with an equality. In particular, females and males have the same rates of improvement at all ages and times. As previously, let θ(t) = e_f(0, t) − e_m(0, t) denote the sex gap. Finally, for g ∈ {f, m}, let $I_g(x,t) = \int_0^x \mu_g(u,t)\,du$, and let $I_g^{-1}(z,t)$ denote the age, x, for which I_g(x, t) = z.
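Theorem 1. Assume that $-\dot\mu(x,t)/\mu(x,t) = \rho(t) > 0$ for all x, i.e., the rate of improvement is the same for all ages and strictly positive at all times. If c(x) > 1 for all x, and $\frac{\partial}{\partial x}\left[\log \mu_m\!\left(I_m^{-1}(x,t), t\right) - \log \mu_f\!\left(I_f^{-1}(x,t), t\right)\right] \le 0$ (18) for all t ∈ R, and all x where the argument is defined, then θ is strictly unimodal on R.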
The proof of Theorem 1 relies on a decision-theoretic argument and is given in Appendix A. As in the example in the previous section, the proof consists in showing that stationary points are local maxima. In brief, the stationarity assumption, $\dot e_f(0,t) = \dot e_m(0,t)$, is used to construct two probability measures, and the monotonicity assumption (18) is used to show that these two measures are stochastically ordered, from which $\ddot e_f(0,t) < \ddot e_m(0,t)$, and thereby local maximality, can be deduced.
As demonstrated by the counterexample in Section 3.4, coherence on its own is not enough to ensure sex gap unimodality. Specifically, exceedingly large jumps in mortality levels can cause sex gap multimodality and, intuitively, the role of Equation (18) is to prevent such jumps. Increases in mortality levels are revealed by the cumulative death rate I(x), which can be interpreted as the expected number of deaths that would have occurred by age x had the event been repeatable. Thus $I^{-1}(x)$ is the age at which we would have experienced x deaths. Equation (18) looks at the rate of change in log-mortality differences, but with the age input transformed by $I^{-1}(\cdot)$. Because female mortality is lower than male mortality, $I_f^{-1}(x)$ will be higher than $I_m^{-1}(x)$. Loosely speaking, condition (18) states that "high age" female mortality must increase faster than "low age" male mortality. This is typically, but not always, satisfied. As age and thereby mortality increases, $I^{-1}(x)$ flattens. If there is a sharp transition in the age-specific mortality curve, this flattening occurs at a comparatively much lower x-value for females than for males, which means that (18) is comparing µ_f evaluated at a high age to µ_m evaluated at a much lower age. Figure 2 shows $I^{-1}(x)$ and $\mu(I^{-1}(x))$ in the two-level mortality example from Figure 3 and for two Gompertz mortality curves. Condition (18) is violated in the former case as $\mu(I^{-1}(x))$ does not "jump" at the same time for both sexes, turning the left-hand side of (18) positive once the male rate "jumps". Although counterexamples such as these can be constructed, in practice, (modelled) female and male mortality schedules are sufficiently aligned that (18) is satisfied.
Figure 2. $I^{-1}(x)$ and $\mu(I^{-1}(x))$ for the two-level mortality curve and for the Gompertz mortality curve.
Theorem 1 formalizes Pollard's paradox; under the stated conditions we have a situation with fixed mortality ratios and a sex gap that is both widening and narrowing, in two distinct epochs. In the terminology of Section 2, both the initial widening and the subsequent narrowing of the sex gap are caused by the level effect, since the ratio effect is absent by construction. Moreover, it is possible to characterize the point where the sex gap peaks in terms of the life expectancy of one of the sexes, e.g., females. This will be illustrated in Section 4.

Figure 3. Mortality rates and the resulting life expectancy differential over time when the same, positive mortality improvement rate is applied to both sexes. In the right panel, the superimposed dashed line is the sex gap had the life expectancy calculation been truncated at x_0. The curves are calculated using µ(x) = µ when x ∈ [0, x_0) and µ(x) = µδ when x ∈ (x_0 + ε, ω] with µ = e^{−1}, x_0 = 60, δ = 50, ε = 5, and ω = 120. Continuity between x_0 and x_0 + ε is achieved by quadratic interpolation as in (19). The means of interpolation is not important for the calculation; using any C^2-interpolating curve or applying a bump function to make µ(x) smooth and continuous yields virtually the same result.

Counterexample
The sex gap is not unimodal for arbitrary mortality profiles. As a counterexample, consider a two-level piecewise constant mortality curve made continuous through quadratic interpolation for fixed constants µ, δ, ε > 0. That is, suppose the mortality schedule between ages 0 and x_0 is constant at level µ, but is "bumped" to a new level µδ over the age range x_0 to x_0 + ε for some large δ and small ε. The left panel in Figure 3 shows an example mortality curve of this form, while the right panel shows the resulting life expectancy differential when mortality is subject to the same fractional improvement in the age-specific death rate at all ages, corresponding to µ(x, t) = µ(x) exp(−ρt) for some ρ > 0. For reference, the unimodal curve shown as the dashed line is the life expectancy differential had ω = x_0.
The gap takes on a multimodal ("roller-coaster") shape. The additional humps are created by the jump in mortality after age x_0, which creates a false life expectancy barrier (if δ is sufficiently large). This barrier is broken by the females first, whereby the life expectancy differential starts to rise again, creating another mode. Thus, even with monotonically increasing hazards and µ_m(x, t)/µ_f(x, t) > 1, unimodality is not guaranteed. The example could be extended to create any number of modes by introducing more barriers. It is clear, however, that counterexamples are necessarily somewhat contrived and that they do not arise in "normal" situations.

The Dynamic Gompertz Model
Parametric models can be used for mortality projections by letting calendar year enter the model through its parameters. That is, for each calendar year the parameters of the model are estimated and then viewed as stochastic processes, forecasted by standard time series methods. By linking the stochastic processes driving the models of different populations, coherence and other dependency structures can be achieved, see, e.g., [8,9,13].
In this section, we analyze a particularly simple, coherent mortality model of this type. The model is too simple to be of practical use, but it is useful as an illustration of the sex gap trajectories implied by coherence.

The Model
The Gompertz mortality law prescribes that mortality increases exponentially with age, $\mu(x) = \exp(\alpha + \beta x)$ (20). This mortality profile is a remarkably good fit to adult mortality from age 20, say, onwards. For younger ages, and in fact also for very old ages, the profile is not appropriate. A simple two-sex model for adult mortality can be constructed by fitting a Gompertz law to period, sex-specific mortality data, resulting in estimated coefficients (α_{t,g}, β_{t,g}) for g ∈ {f, m} and t ranging over the years in the estimation window.
If the parameters are modeled as random walks with the same drift for both sexes, the median forecast for population $g$ becomes
$$\hat{\mu}_g(x, T + h) = \exp\big(\alpha_{T,g} + h\xi_\alpha + (\beta_{T,g} + h\xi_\beta)x\big), \qquad (21)$$
where $T$ denotes the projection jump-off year, $h \in \mathbb{N}_0$ is the forecasting horizon, and $\xi_\alpha$ and $\xi_\beta$ are the (shared) drift terms of the $\alpha$- and $\beta$-processes, respectively. By design, the forecasts in (21) are (strongly) coherent since the mortality sex ratios do not depend on $h$,
$$\frac{\hat{\mu}_m(x, T + h)}{\hat{\mu}_f(x, T + h)} = \exp\big(\alpha_{T,m} - \alpha_{T,f} + (\beta_{T,m} - \beta_{T,f})x\big). \qquad (22)$$
More complicated time-series models could also be used, but the current structure suffices for our purposes. Note that although $h$ is assumed non-negative, we can evaluate (21) for all values of $h$. Specifically, we can think of a forecast as the right tail of an implied surface obtained by letting $h$ range over all integers, negative and positive.
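As a minimal sketch of the median forecast and of the coherence property it implies, the snippet below evaluates the forecast for both sexes with shared drifts and checks that the male-to-female ratio at a given age is the same at every horizon. The jump-off parameters and drift values are hypothetical and serve only as an illustration.

```python
import math

def gompertz_median_forecast(alpha_T, beta_T, xi_alpha, xi_beta, age, h):
    """Median forecast exp(alpha_T + h*xi_alpha + (beta_T + h*xi_beta) * age)."""
    return math.exp(alpha_T + h * xi_alpha + (beta_T + h * xi_beta) * age)

# Hypothetical jump-off estimates for each sex (illustration only).
FEMALE = {"alpha_T": -10.5, "beta_T": 0.100}
MALE = {"alpha_T": -9.3, "beta_T": 0.090}
# Drift terms shared by both sexes; sharing them is what makes the forecast coherent.
XI_ALPHA, XI_BETA = -0.015, 0.0001

def sex_ratio(age, h):
    """Male-to-female ratio of forecasted death rates at a given age and horizon."""
    male = gompertz_median_forecast(
        MALE["alpha_T"], MALE["beta_T"], XI_ALPHA, XI_BETA, age, h)
    female = gompertz_median_forecast(
        FEMALE["alpha_T"], FEMALE["beta_T"], XI_ALPHA, XI_BETA, age, h)
    return male / female

ratios_age_60 = [sex_ratio(60, h) for h in (0, 10, 50)]
print(ratios_age_60)
```

The shared drift terms cancel in the ratio, so only the jump-off differences in level and slope remain, whatever the horizon.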

Sex Gap Unimodality under Uniform Rates of Improvement
Assuming uniform rates of improvement, the continuous-time analog of the dynamic Gompertz model is given by
$$\mu(x, t) = \exp(\alpha + \beta x - \rho t), \qquad (23)$$
with level parameter $\alpha$, slope parameter $\beta > 0$, and rate of improvement $\rho > 0$. As in Section 3.3, let $\mu_f(x, t) = \mu(x, t)$ and $\mu_m(x, t) = \mu(x, t)c(x)$, where $c(x) = \exp(\Delta_\alpha + \Delta_\beta x)$, with $\Delta_\beta > -\beta$, such that $\mu_m$ is also of form (23) with positive slope. Thus, mortality for both sexes follows a Gompertz mortality schedule with, in general, differing levels and slopes.
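The cumulative death rate is
$$I(x, t) = \int_0^x \mu(y, t)\,dy = \frac{\exp(\alpha - \rho t)}{\beta}\left(e^{\beta x} - 1\right), \qquad (24)$$
while its inverse for fixed time $t$ is
$$I^{-1}(y, t) = \frac{1}{\beta}\log\big(1 + \beta y \exp(\rho t - \alpha)\big). \qquad (25)$$
It follows that
[...] (26)
In practice, we typically find excess male mortality relative to female mortality with $\Delta_\alpha > 0$ and $\Delta_\beta < 0$, that is, the level is higher for males than for females, but the curve is less steep. In this situation, the right-hand side of (26) is negative, since the numerator is negative while the denominator is positive. It then follows from Theorem 1 that the sex gap is strictly unimodal.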
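The closed forms for the cumulative death rate and its inverse can be checked numerically. The sketch below assumes the Gompertz hazard $\mu(x, t) = \exp(\alpha + \beta x - \rho t)$ and derives both expressions from it; the function names and the parameter values are hypothetical.

```python
import math

def cumulative_hazard(x, t, alpha, beta, rho):
    """Integral of exp(alpha + beta*y - rho*t) over y in [0, x]."""
    return math.exp(alpha - rho * t) * math.expm1(beta * x) / beta

def inverse_cumulative_hazard(y, t, alpha, beta, rho):
    """Age at which the cumulative hazard reaches y, for fixed time t."""
    return math.log1p(beta * y * math.exp(rho * t - alpha)) / beta

def midpoint_integral(f, lower, upper, steps=20000):
    """Plain midpoint rule, used only as an independent numerical check."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))

# Hypothetical parameter values (illustration only).
ALPHA, BETA, RHO, T = -10.0, 0.10, 0.02, 5.0
age = 70.0

closed_form = cumulative_hazard(age, T, ALPHA, BETA, RHO)
numeric = midpoint_integral(
    lambda y: math.exp(ALPHA + BETA * y - RHO * T), 0.0, age)
round_trip = inverse_cumulative_hazard(closed_form, T, ALPHA, BETA, RHO)
print(closed_form, numeric, round_trip)
```

Using `expm1` and `log1p` keeps the expressions accurate when the hazard is very small, which is the regime reached after long periods of improvement.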
We note that Theorem 1 only provides sufficient conditions for unimodality. For instance, having $c(x) < 1$ at high ages does not imply that unimodality of the gap will not occur. In fact, using empirical estimates we typically have this situation, but still the sex gap is unimodal.

Sex Gap Trajectories
We define the sex gap trajectory as the pairs of female life expectancy and sex gap that can occur together on a given two-sex mortality surface,
$$\mathcal{T} = \{(e_f(0, t), \theta(t)) : t \in \mathbb{R}\}, \qquad (27)$$
where as usual $\theta(t) = e_f(0, t) - e_m(0, t)$ denotes the life expectancy differential (sex gap).
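In the continuous-time version of the dynamic Gompertz model, we have
$$\mu_f(x, t) = \exp(\alpha - \rho t)\exp(\beta x), \qquad \mu_m(x, t) = \exp(\alpha - \rho t)\exp\big(\Delta_\alpha + (\beta + \Delta_\beta)x\big). \qquad (28)$$
Since $\rho > 0$, we see that as $t$ goes from minus infinity to plus infinity the common factor, $\exp(\alpha - \rho t)$, takes all values in $(0, \infty)$, regardless of the values of $\alpha$ and $\rho$. It follows that for the dynamic Gompertz model the sex gap trajectory is a function of the female slope parameter and excess mortality only, i.e., $\mathcal{T} = \mathcal{T}(\beta, \Delta_\alpha, \Delta_\beta)$.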
Based on mortality data from Western Europe 1950-2020, estimates of $\beta$ are in a narrow range around 0.10, while excess mortality shows greater variability from country to country and over time. Figure 4 shows two sets of trajectories for typical values of $\beta$; the left plot corresponds to the average male excess mortality profile in the data, and the right plot corresponds to very high male excess mortality, as seen in, e.g., Finland. We notice that although the size of the sex gap is different in the two plots, the overall shape of the trajectory is almost the same in all cases. In particular, we notice that the sex gap peaks when female life expectancy is in the range 30 to 50 years. This means that if a dynamic Gompertz model is fitted to data and forecasts are produced from a jump-off year where female life expectancy is larger than 50 years, then the model will forecast a closing sex gap, at least if the slope parameters are relatively constant (i.e., if $\xi_\beta \approx 0$).
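The location of the peak can be reproduced with a short numerical scan. The sketch below is an illustration under stated assumptions rather than the computation behind Figure 4: it applies the Gompertz hazard from age 0, integrates survival with a trapezoid rule up to an assumed age of 150, and scans the common factor $\exp(\alpha - \rho t)$ over a hypothetical log grid. The values $\beta = 0.10$, $\Delta_\alpha = 1.25$ and $\Delta_\beta = -0.012$ are those quoted in the text and in the caption of Figure 4 (left plot).

```python
import math

def gompertz_life_expectancy(level, slope, max_age=150.0, step=0.1):
    """Life expectancy at birth for hazard level * exp(slope * x), trapezoid rule."""
    def survival(x):
        return math.exp(-level * math.expm1(slope * x) / slope)
    n = int(round(max_age / step))
    total = 0.5 * (survival(0.0) + survival(max_age))
    total += sum(survival(i * step) for i in range(1, n))
    return total * step

def sex_gap_trajectory(beta, delta_alpha, delta_beta, levels):
    """Pairs (female life expectancy, sex gap) as the common factor varies."""
    points = []
    for k in levels:
        e_f = gompertz_life_expectancy(k, beta)
        e_m = gompertz_life_expectancy(k * math.exp(delta_alpha), beta + delta_beta)
        points.append((e_f, e_f - e_m))
    return points

# The common factor exp(alpha - rho*t) covers (0, inf); scan a log grid of it.
levels = [math.exp(-12 + 0.05 * i) for i in range(241)]
trajectory = sex_gap_trajectory(0.10, 1.25, -0.012, levels)

# Zenith: the point of the trajectory where the sex gap is largest.
a_max, theta_max = max(trajectory, key=lambda point: point[1])
print(f"female life expectancy at zenith: {a_max:.1f}, gap: {theta_max:.1f}")
```

Neither $\alpha$ nor $\rho$ appears in the scan; only the slope and the excess mortality parameters enter. Under these assumptions the located zenith should fall inside the 30 to 50 year band.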
Note that the speed with which the sex gap trajectory is traversed in the forecast depends on how fast female life expectancy evolves. We only know that as female life expectancy increases beyond 50 years, the sex gap closes. Since female life expectancy will not increase linearly, the shape of the sex gap in the forecast will typically not resemble the shape of the sex gap trajectories in Figure 4.
That the coherent, dynamic Gompertz model typically produces narrowing sex gaps also for age-dependent rates of improvement is illustrated in Figure 5. The forecasts are produced by (21) for dynamic Gompertz models fitted to different countries in different periods. The Gompertz models are calibrated using data in the age range {20, . . ., 100} to prevent the poor fit at younger ages from influencing the parameter estimates; for the same reason we look at the life expectancy differential at age 20, instead of at birth. Parameters of the Gompertz models are estimated by maximum likelihood assuming independent, Poisson-distributed death counts, $D(x, t) \mid E(x, t) \sim \mathrm{Pois}(E(x, t)\exp(\alpha_t + \beta_t x))$. The model's parameters are calibrated to the latest 30 years of data available before projection jump-off. If data do not exist for all 30 years, only available data are used. We estimate and project the models for all countries in Western Europe and for periods with jump-off in 1960, 1980, 2000, and 2020, respectively.
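The estimation step for a single calendar year can be sketched as follows. This is a minimal illustration, assuming Newton's method on the Poisson log-likelihood with least-squares starting values; the paper does not state which optimizer was used, and the function name and the noise-free synthetic data are hypothetical.

```python
import math

def fit_gompertz_poisson(ages, deaths, exposures, iterations=25):
    """Maximize sum(D*(a + b*x) - E*exp(a + b*x)) over (a, b) by Newton's method."""
    # Starting values: least squares of log death rates on age (cells with D > 0).
    pts = [(x, math.log(d / e))
           for x, d, e in zip(ages, deaths, exposures) if d > 0]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    b = sum((x - mean_x) * (y - mean_y) for x, y in pts) / sxx
    a = mean_y - b * mean_x
    for _ in range(iterations):
        fitted = [e * math.exp(a + b * x) for x, e in zip(ages, exposures)]
        # Score vector (gradient of the log-likelihood).
        g_a = sum(d - m for d, m in zip(deaths, fitted))
        g_b = sum(x * (d - m) for x, d, m in zip(ages, deaths, fitted))
        # Observed information (negative Hessian), a 2x2 matrix.
        h_aa = sum(fitted)
        h_ab = sum(x * m for x, m in zip(ages, fitted))
        h_bb = sum(x * x * m for x, m in zip(ages, fitted))
        det = h_aa * h_bb - h_ab ** 2
        # Newton update: add the inverse information times the score.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Noise-free synthetic data (hypothetical): expected deaths for ages 20..100.
ages = list(range(20, 101))
exposures = [10000.0] * len(ages)
deaths = [e * math.exp(-10.0 + 0.10 * x) for x, e in zip(ages, exposures)]
alpha_hat, beta_hat = fit_gompertz_poisson(ages, deaths, exposures)
print(alpha_hat, beta_hat)
```

Repeating the fit for each year and sex in the estimation window would give the parameter series $(\alpha_{t,g}, \beta_{t,g})$ that are then projected as random walks with drift.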
As seen, almost all forecasts in Figure 5 produce narrowing gaps. In some periods the forecasts provide a good description of the future, e.g., the recent forecasts for Finland, France, Portugal and Spain, but in most cases the forecasted sex gap does not provide a sensible continuation of the historic trend. The problem is that the behavior of data leading up to the jump-off year matters very little; it is the assumed coherence in the forecasts that produces the (implied) unimodal sex gap trajectory and it is the fact that female life expectancy is larger than 50 years at all jump-offs that places us on the declining part of that trajectory. We note in closing that the dynamic Gompertz model does not always forecast closing sex gaps. Periods with little or no mortality improvements for males (Denmark, 1980), or male mortality being very close to female mortality, can lead to an increasing sex gap at jump-off. In the latter case, the peak of the implied sex gap trajectory can occur at a much higher age than indicated in Figure 4, cf. Box 2 "The outlier Ireland" for an example of this rare situation.

The Forecast of Closing Sex Gaps by Coherent Mortality Models
In the previous section we demonstrated that the dynamic Gompertz model typically forecasts closing sex gaps, both under uniform rates of improvement, as predicted by Theorem 1, and under age-dependent rates of improvement. In this section, we show that more "realistic", coherent mortality models behave qualitatively similarly to the dynamic Gompertz model. On this basis, we conclude that closing sex gaps are indeed a general feature of coherent models. The analysis is in two parts. First, we relax the parametric structure of the Gompertz curve but keep the uniform rate of improvement. Second, we show closing sex gaps for coherent, semi-parametric models with age-dependent rates of improvement.

Location of the Sex Gap Zenith under Uniform Rates of Improvement
Based on the dynamic Gompertz model, we concluded in Section 4.3 that for Western European data the implied sex gap peaks at an earlier age than the observed female life expectancy, leading to closing sex gap forecasts. One might object to this analysis that the Gompertz model allows for only a very limited set of mortality profiles, and that the conclusion implicitly rests on this fact. Below, we extend the analysis to general, smooth mortality profiles, retaining the assumption of uniform rates of improvement.
Consider a (strongly) coherent mortality surface with female and male mortality of the form $\mu_f(x, t) = \mu_f(x, T)\exp(-\rho(t - T))$ and $\mu_m(x, t) = \mu_f(x, t)c(x)$, respectively, where $T$ is a fixed year, $\mu_f(x, T)$ is the female mortality at age $x$ in year $T$, $c(x)$ is the male excess mortality at age $x$, and $\rho > 0$ is the (uniform) rate of improvement.
Recall from Equation (27) that the sex gap trajectory is defined as the set of female life expectancies and sex gaps that can occur together when $t$ varies. By the same argument as in Section 4.3, this set does not depend on the rate of improvement. In other words, assuming a uniform rate of improvement, the sex gap trajectory depends only on the female mortality profile in any given year and the excess mortality profile. In particular, for any female mortality profile, $\mu_f(\cdot, T)$, and any $c$-profile, we can define the (implied) sex gap zenith,
$$(a_{\max}, \theta_{\max}) = \big(e_f(0, t_{\max}), \theta(t_{\max})\big), \quad \text{where } t_{\max} = \arg\max_t \theta(t), \qquad (29)$$
as the female life expectancy when the sex gap is at its maximum and the corresponding (maximal) value of the sex gap. The notation $(a_{\max}, \theta_{\max})$ is chosen to reflect the abscissa and ordinate at the maximum point, cf. Figure 4.
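The independence of the rate of improvement can be made explicit with a short derivation. The sketch below is not taken from the paper; the symbols $s$, $E_f$, $E_m$ and $s_{\max}$ are introduced here only for illustration, and the upper integration limit may be replaced by a finite maximum age without changing the argument.

```latex
\begin{align*}
  % Uniform improvement acts as a single scale factor on the year-T profile.
  s(t) &= \exp\big(-\rho (t - T)\big), \qquad
  \{\, s(t) : t \in \mathbb{R} \,\} = (0, \infty) \ \text{for every } \rho > 0, \\
  % Life expectancies written as functions of the scale factor only.
  E_f(s) &= \int_0^\infty \exp\Big(-s \int_0^x \mu_f(y, T)\,dy\Big)\,dx, \\
  E_m(s) &= \int_0^\infty \exp\Big(-s \int_0^x \mu_f(y, T)\,c(y)\,dy\Big)\,dx, \\
  % The trajectory and its zenith therefore involve no rho.
  \mathcal{T} &= \big\{\, \big(E_f(s),\, E_f(s) - E_m(s)\big) : s > 0 \,\big\}, \\
  s_{\max} &= \arg\max_{s > 0}\, \big\{ E_f(s) - E_m(s) \big\}, \qquad
  (a_{\max}, \theta_{\max}) = \big(E_f(s_{\max}),\, E_f(s_{\max}) - E_m(s_{\max})\big).
\end{align*}
```

Since $e_f(0, t) = E_f(s(t))$ and $e_m(0, t) = E_m(s(t))$, the rate $\rho$ only determines how fast the trajectory is traversed, not which points it contains.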
We compute and compare $(a_{\max}, \theta_{\max})$ for (i) a Gompertz model and (ii) a graduated mortality curve, where the mortality curves obtained by graduation closely represent the true underlying death rates. In short, the graduation procedure smooths the empirical death rates by a cubic smoothing spline, while the Kannisto model of old-age mortality is used at ages 80 and above. Further details are provided in Appendix B.
The sex gap zenith is presented in Table 1 for the two models for the countries in Western Europe. For a given year, mortality data for that year only are used to estimate female and male mortality profiles from which the $c$-profile is derived. Next, assuming a uniform rate of improvement, the implied sex gap trajectory (27) can be computed. Finally, the female life expectancy and the size of the sex gap at the zenith (29) of the trajectory are found. The computation is performed on data from three calendar years that capture the different epochs of the period. In the first year, 1950, all countries are on the widening part of the observed $\theta$-curve. The second year, 1985, is the middle of the period where the observed $\theta$-curve peaks, while the third year, 2020, is on the narrowing part of the curve, cf. Figure 1. If a country does not have data at one or both of the endpoints, the nearest data point is used instead.

Table 1. Female life expectancy ($a_{\max}$) when the implied sex gap trajectory is at its maximum ($\theta_{\max}$). For each country and model, the year refers to the data used to compute the implied sex gap trajectory, see main text for details. Life expectancies are truncated at the age of 110.

Box 2. The outlier Ireland.
The only outlier in Table 1 is Ireland, 1950. In this year, the observed life expectancy for females is 66.7 years, but according to Table 1, this is well below $a_{\max}$. Ireland is rather unique in a demographic context. Men outlived women at the beginning of the 20th century, and by the 1930s, Ireland was the only country in the West with higher survival rates for men than for women at any age [35]. This pattern has taken a while to reverse, and even though female mortality is lower than male mortality in 1950 (at most ages), the curves lie almost directly on top of each other, as seen in Figure 6. Interestingly, this fact makes the sex gap widen for a prolonged period, that is, the slope of $\theta$ is rather moderate before reaching the turning point. Once the turning point is passed, however, the gap will close quite rapidly; the slope on the narrowing part is much steeper than the slope on the widening part of the curve, as seen in Figure 6. This feature needs to be understood in the context of life span disparity, $e^\dagger$, and the maximum attainable age, $\omega$. Because the mortality profiles are similar not only in slope but also in level, $e^\dagger_f$ stays higher than $e^\dagger_m$ for longer. Female life expectancy at the turning point, $a_{\max}$, is, therefore, closer to $\omega$ than had the mortality levels been further apart, causing the subsequent "catch-up" to happen much faster.

Gompertz Mortality
The Gompertz model predicts a female life expectancy of around 30-40 years when the sex gap is at its zenith, varying slightly from country to country and period to period. For comparison, observed female life expectancy in 1950 is between 61.0 (Portugal) and 73.6 (Iceland) and increasing over the period. Thus, all countries are on the declining part of the implied $\theta$-curve, and continued improvements in mortality will further narrow the gap. The only exception is Ireland, see Box 2.
Because the Gompertz curve constitutes a highly stylized mortality profile, the life expectancy predictions from this model are fairly consistent across the different countries and periods. The graduated mortality curve is more flexible, resulting in more diverse predictions, in particular, regarding the female life expectancy at the sex gap zenith. The overall result is, however, qualitatively the same; with the exception of Ireland, the implied sex gap peaks when female life expectancy is below that observed at the estimation year (not shown). The takeaway message is that for essentially all countries and jump-off years, both models project narrowing sex gaps.

Coherence Implies Closing Sex Gaps
Until now we have demonstrated sex gap unimodality and closing sex gap forecasts under conditions imposed for mathematical tractability, in particular, strong coherence and uniform rates of improvement. Of course, coherent mortality models used in practice are neither strongly coherent, nor do they have uniform rates of improvement. Nevertheless, the conclusion extends to "realistic", coherent models also, and for the same reason. Calibrated to Western European mortality data, the sex gap implied by the models peaks when female life expectancy is (much) lower than levels observed after the Second World War.
Table 2 presents the slope of the sex gap following projection jump-off based on the dynamic Gompertz model (21), the Li-Lee model (11), and the product ratio model of Hyndman et al. [3] for different jump-off years. For the Li-Lee model (11), we use AR(1)-processes to describe the sex-specific time-varying indices. The product ratio model is fitted and forecasted using the demography package (available on CRAN) in R.
Except for the United Kingdom, the actual sex gap was widening at all jump-off years, cf. Figure 1. Nevertheless, essentially all forecasts predict a narrowing sex gap. Further, for those (few) forecasts where the gap is projected to widen, the historical trend is not reproduced. Rather, the apex of $\theta$ is predicted to be in the near future with a gap that remains approximately constant on a short horizon (not shown); the forecast trajectories resemble that of Denmark, 1980 seen in Figure 5 (green, dashed line).
In semi-parametric mortality models, e.g., the Lee-Carter model [2] or the Li-Lee model [1], the fit to data is typically good due to a large number of parameters. Even parsimonious model structures such as the dynamic Gompertz model can capture many observed mortality patterns when parameters are allowed to vary freely. Consequently, in the estimation window, most mortality models allow flexible, if not freely varying, rates of improvement, $\rho(x, t)$. In the forecasting region, however, the improvement rates are often, directly or indirectly, constrained by (i) assuming temporal constancy, e.g., in models of the Lee-Carter type, and (ii) imposing coherence. Both assumptions might be at odds with (recent) trends in data, resulting in death rate (median) trajectories that may not conform with those observed in the past. Relaxing the assumption of time-invariant improvement rates can be achieved by various techniques, for example, by imposing convergence to a long-term target [36] or by applying frailty theory [37]. This, however, is not the focus of the present paper. Although formally only a condition in the limit, the coherence assumption built into many multi-population mortality models all but eliminates the ratio effect, identified by Glei and Horiuchi [15] and Cui et al. [25] as the main driver of the widening sex gap in Western Europe until the 1980s, cf. Section 2.2. In periods where the sex gap is narrowing, the ratio effect is less important and the coherence assumption agrees more with observed data. Narrowing sex gap projections are produced as intended but are not guaranteed to replicate the latest trends, cf. Figure 5. In summary, the implied sex gap trajectory of coherent models, being driven mainly by the level effect, is too inflexible to adequately describe and forecast the sex gap evolution since 1950.

Does Coherence Deserve Its Special Status?
The fact that the sex gap narrows in coherent projections at essentially all levels of mortality observed in the West over the last 70 years challenges the desirability of coherence as a modeling goal and adds to the recent criticism of coherence raised by Hunt and Blake [12] and Jarner and Jallbjørn [13]. At the time of its development, coherence coincidentally appeared as a desirable property that (to some extent) continued the narrowing of the sex gap that had been observed for some time, unlike most independent methods that projected widening or diverging gaps, see for example Figure 5 in Hyndman et al. [3]. One can understand that Hyndman et al. [3] and others concluded that coherent forecasts were an improvement over independent forecasts, but this verdict seems somewhat anchored in the sex gap decline of the time. Imagine standing in a year between 1950 and 1980, observing sharp transitions from widening to narrowing gaps such as those in Figure 5. It then seems far less obvious that coherence is a property one should impose on a model, and it certainly calls into question whether coherence deserves the special status it has been given.
From the adversarial perspective, it is, at least in theory, possible for coherent models to accommodate temporary ratio effects of varying (but bounded) size and thereby produce trends in the sex gap that are in keeping with those recently observed. In most applications, however, the ratio effect diminishes quickly, leading to a misalignment between historic and projected trends as demonstrated in Table 2. Arguably, a diminishing ratio effect is intentional, because otherwise the rationale for imposing coherence would be undermined. Even though coherence may be a sensible restriction to impose when extrapolating the present mortality regime (for some populations), it does not seem to be so historically and may not be so in the future either. In our view, this suggests a revision of coherence as the guiding principle for multi-population mortality modeling.
Moreover, as is also pointed out by Jarner and Jallbjørn [13], coherence seems too tailored to the log-linear common factor models for which the property was introduced. Even though single population models can be coupled via cointegration techniques to achieve coherence, the choice of scale, i.e., ratios of mortality rates, remains somewhat arbitrary and effectively restricts the coherence label to models of the Lee-Carter type. Therefore, we see a need for a broader definition that covers a larger class of models and permits less restrictive dependency structures.
As a minimum, the coherence requirement should be relaxed to allow for observed patterns of covariation to continue in forecasts. A possible approach to tackle this issue could be to redefine coherence into a property that concerns modeling on the parameter scale rather than modeling on the (log) data scale. In particular, it seems more natural to identify cointegrating relations between (time-varying) parameters, i.e., identifying linear combinations of the parameters that are stationary, rather than requiring mean-reversion of mortality log differences. For further discussion on this point, see Jarner and Jallbjørn [13].

Conclusions
The notion of coherence has been one of the most influential ideas in multi-population mortality modeling. When projecting groups of related populations, e.g., males and females, or countries of similar affluence, which have evolved in parallel in the past, it is natural to expect these populations to "stay together" in the future also. Coherence solves the problem of diverging, or crossing, forecasts that can arise when projecting, even very similar, populations separately. It does so by requiring converging mortality ratios, or, equivalently, asymptotically equal rates of improvement at each age. At first sight, this seems an innocent and reasonable requirement, but in practice coherent models enforce a rigid structure on the forecasts that can be at odds with trends in data.
In this paper we discussed the implications of coherence in two-sex mortality models with focus on the dynamics of the sex differential in life expectancy (sex gap). We provided both theoretical and empirical evidence to support the conclusion that coherent models forecast closing sex gaps for Western European countries for almost all jump-off years since 1950, despite the fact that the actual sex gap was widening until the 1980s. Coincidentally, coherence was introduced after almost 20 years of narrowing sex gaps, and a continued sex gap closing was seen as a desirable feature of coherent models. However, the inadequacy of coherent models in the first half of the period from 1950 until today leads us to question coherence as a general modeling principle.
Technically, we prove in the paper that strongly coherent models with a uniform rate of improvement produce a unimodal sex gap trajectory. Further, we demonstrate that for Western European levels of male excess mortality (relative to female mortality) the implied sex gap trajectory typically peaks when female life expectancy is in the range 30 to 50 years. Although formally proven only for a specific subclass of coherent models, this insight applies in much greater generality and it explains why forecasts from coherent two-sex models are almost always on the declining part of the trajectory.
We also discussed the effects responsible for the observed widening and narrowing sex gap in Western Europe from 1950 until today. In particular, with appeal to Glei and Horiuchi [15] we identified changing sex ratios (the ratio effect) as the main driver of the widening sex gap, and differential dispersion combined with general improvements (the level effect) as the main driver of the subsequent narrowing of the sex gap. In light of the similarity between the observed and implied sex gap trajectories in Figures 1 and 4, respectively, it is a priori surprising that coherent models cannot be used to describe the observed sex gap dynamics over the entire period. However, the implied sex gap trajectory peaks too early, in terms of female life expectancy, and therefore cannot be aligned with the widening part of the observed sex gap trajectory. Moreover, even on the narrowing part of the observed sex gap trajectory, coherent forecasts often have a kink at jump-off, see Figure 5. In conclusion, the implied sex gap trajectory of coherent models, being driven purely, or mainly, by the level effect, is not sufficiently flexible to describe the observed sex gap dynamics.
To prove unimodality of the sex gap, we relied on the assumption that life expectancies are bounded from above, implying that we eventually approach a life expectancy plateau and that the sex gap therefore vanishes in the limit. Even though life expectancy does not appear to be approaching a maximum at present [38] and even though women have been shown to "always" live longer than men [33,34], it does seem reasonable that a biological or genetic barrier to an infinite lifespan exists; however, if an upper limit does not exist, a vanishing sex gap is not guaranteed. It is also conceivable that the present level of old-age mortality acts similarly to a de facto barrier, but that continued improvements can "break through" the barrier and move it to even higher ages.
Although we have presented the results and the analysis as pertaining to coherent models and their properties, the unimodality result can also be given a real-world interpretation. In Western Europe, males have historically had lower rates of improvement than females, but currently the two sexes enjoy similar rates of improvement. Combined with the unimodality result, this indicates that the observed sex gap will continue to close in the future, but it also points to factors that need to change for the gap to start widening again. The gap could widen again (i) if mortality undergoes a (perhaps temporary) regime change in which female rates of improvement substantially outweigh male improvements, for instance, if certain diseases that primarily target females are reduced or eradicated, or if sex-specific risk behavior changes in favor of women; or (ii) if the current lifetime barrier is broken through, in which case the sex gap dynamics could "start over" and result in a multimodal curve similar to that exemplified in Figure 3. It is also conceivable that a widening gap in favor of men could be brought about. This can happen, e.g., if the sex-specific mortality curves continue to steepen in such a way that high-age male mortality falls below high-age female mortality.
The paper focused on coherence in two-sex models. However, our main result applies also to other coherent, multi-population models whenever an ordering of the population-specific mortality rates exists. A noteworthy instance is mortality rates between different socio-economic groups, a case that has attracted much recent attention, see, e.g., Bennett et al. [39] and Cairns et al. [40]. Historically, the life expectancy gap between the most and least affluent has widened for some time, and the trend is expected to continue over the coming years by domain experts. However, under the assumption of coherence, the gap may inadvertently be projected to close right after projection jump-off. Other applications with mortality orderings expected to persist over time include modeling of rich countries relative to poor countries, insured lives relative to non-insured lives, and smokers relative to non-smokers. In all these cases, coherent models are likely to produce immediately narrowing gaps irrespective of the historic development leading up to jump-off.
Finally, since the main result hinges on excess mortality of one population relative to another, it cannot be applied to situations where such an ordering does not exist. For example, when modeling a group of neighboring countries of similar affluence, we cannot conclude that this will generally lead to closing gaps for all pairs of countries. Arguably, even if we could, narrowing gaps would perhaps be less of a worry in this context, at least in the long run.

Figure 2. Illustration of $I^{-1}(x)$ and $\mu(I^{-1}(x))$ for the two-level mortality curve depicted in Figure 3 (left panels) and a Gompertz mortality curve (right panels).

Figure 4. The sex gap trajectory implied by a dynamic Gompertz model for different values of the female slope parameter, $\beta$, for average and high excess male mortality. Left plot has $\Delta_\alpha = 1.25$ and $\Delta_\beta = -0.012$; right plot has $\Delta_\alpha = 2.2$ and $\Delta_\beta = -0.021$.

Figure 5. Forecast of the life expectancy sex gap in Western Europe using a dynamic Gompertz model with parameters forecasted as a random walk with drift, cf. Equation (21), and varying jump-off years.

Figure 6. Mortality and implied sex gap for Ireland, 1950. In the left panel, the Gompertz fit is superimposed as solid lines.

Table 2. Direction of the sex gap five years after projection jump-off for different coherent models and jump-off years. The table shows whether the gap is narrowing or widening. The models are calibrated to data in the period 1950 (or the earliest year where data are available) to the listed jump-off year. No direction is shown if the model cannot be calibrated (i.e., lack of data).