Dependence Modelling of Lifetimes in Egyptian Families

In this study, we analyse a large sample of Egyptian social pension data which covers, by law, the policyholder’s spouse, children, parents and siblings. This data set uniquely enables the study and comparison of pairwise dependence between multiple familial relationships beyond the well-known husband and wife case. Applying Bayesian Markov Chain Monte Carlo (MCMC) estimation techniques with the two-step inference functions for margins (IFM) method, we model dependence between lifetimes in spousal, parent–child and child–parent relationships, using copulas to capture the strength of association. Dependence is observed to be strongest in child–parent relationships and, in comparison to the high-income countries of data sets previously studied, of lesser significance in the husband and wife case, often referred to as broken-heart syndrome. Given the traditional use of UK mortality tables in the modelling of mortality in Egypt, the findings of this paper will help to inform appropriate mortality assumptions specific to the unique structure of the Egyptian scheme.


Introduction
The existence of dependence between individual lifetimes presents the need to refine the independence assumption traditionally used in the pricing and reserving of life insurance products that involve multiple lives and mortality assumptions. Joint lifetime research in the existing literature largely considers dependence between husband and wife. Commonly referred to as broken-heart syndrome, this short-term dependence causes an immediate increase in the mortality of the surviving spouse upon the death of their partner, with the significance of the impact decreasing over time. Early work by Rees and Lutkins (1967), Parkes et al. (1969), and Ward (1976) finds the increase to be of greatest severity during the first 6 to 12 months of bereavement, with mortality eventually decreasing to that of a non-widowed sample in some cases.
In studying joint life dependence, Denuit and Cornet (1999) and Denuit et al. (2001) use data from the Belgian National Institute of Statistics for the estimation of the marginal force of mortality in a Markovian model. Bivariate data is, however, more difficult to obtain, with the data for copula estimation in these studies sampled from the gravestones of couples in Belgian cemeteries. Here, an increase in mortality among widowed individuals is observed, with a more significant deviation from the non-widowed mortality in bereaved males. The impact of marriage status on mortality is also presented in Maeder (1995). Many studies, including those by Frees et al. (1996), Carriere (2000), Youn and Shemyakin (1999, 2001), Shemyakin and Youn (2001, 2006), Luciano et al. (2008), Spreeuw and Wang (2008), Spreeuw and Owadally (2013), Ji et al. (2011), Dufresne et al. (2018) and Arias and Cirillo (2021) consider a generation-based joint life data set from a large Canadian insurer in their analysis of joint life dependence. Joint annuity data from a French insurer is analysed in Lu (2017), French genealogy data in Cabrignac et al. (2020) and Dutch census data on married couples in Sanders and Melenberg (2016). Henshaw et al. (2020) consider Ghanaian survey data, and Walter et al. (2021) consider joint life and last survivor annuity data from a Kenyan insurer. To the best of our knowledge, the latter two studies are the only studies assessing dependence in an alternative socioeconomic context.
Copula-based approaches are widely used in the lifetime dependence literature. Frees et al. (1996) use standard maximum likelihood techniques to fit the one-parameter Frank copula to the aforementioned Canadian joint life data. Considering the impact of dependence on joint and r annuities, whose benefits reduce to a proportion r of the original benefit after the death of one annuitant, for varying r, a reduction in annuity values of approximately 5% is observed when dependence is assumed. Youn and Shemyakin (1999) implement a Weibull-Hougaard copula model with Weibull marginals and a Gumbel-Hougaard (or Gumbel) copula, and introduce age difference as a determinant of lifetime dependence, where the copula association parameter is a function of the difference in age. Luciano et al. (2016) observe reduced dependence among a younger sample of joint lives when splitting data through a generation-based model. In this study, the pseudo-maximum likelihood approach is adopted for the estimation of parameters of both one- and two-parameter copulas. Comparison is made between the reversionary annuity prices under dependence and independence for varying benefit levels. The age difference of policyholders is again incorporated by Dufresne et al. (2018) in their copula-based study, with inference functions for margins and pseudo-maximum likelihood approaches adopted for estimation. Dependence is found to be a decreasing function of age difference by both Youn and Shemyakin (1999) and Dufresne et al. (2018). However, the impact of incorporating age difference on product pricing that is observable for an individual liability is mitigated when considering the total liability of an insurer, given a portfolio of policyholders.
Frailty models are an alternative method for dependence modelling, which account for unobserved heterogeneities between individuals in a population. First introduced by Vaupel et al. (1979) and developed by Hougaard (1984) and Oakes (1989), among others, in the lifetime dependence context, frailty-type models have been used in studies including those by Clayton (1978), Hougaard et al. (1992), Klein (1992), Nielsen et al. (1992), Gourieroux and Lu (2015) and Walter et al. (2021). In the latter study, dependence life tables are created. Lu (2017) implements a mixed proportional hazards model to account for observed and unobserved frailties, with a treatment effect capturing the mortality jump characteristic of broken-heart syndrome. The impact of losing a spouse is found to be asymmetric between males and females, a finding also observed in Dufresne et al. (2018). Dependence induced by the occurrence of an event experienced simultaneously by two lifetimes, due to, for example, a car accident or natural disaster, can be modelled using the common shock model of Marshall and Olkin (1967). Gobbi et al. (2019) consider the extended Marshall-Olkin model of Pinto and Kolev (2015), which combines the copula and common shock approaches.
A four-state Markovian mortality model dependent on marital status is proposed in Norberg (1988). Using Fréchet-Hoeffding bounds to estimate the maximum impact of dependence under the assumption of Norberg's model, Denuit and Cornet (1999) observe a reduction of approximately 10% when analysing the impact of dependence on a widow's pension premiums. This study is further developed in Denuit et al. (2001). Norberg's model is extended by Spreeuw and Wang (2008) and Spreeuw and Owadally (2013) to account for the typically short-term nature of broken-heart syndrome. Through the inclusion of an additional state, the mortality of the survivor is assumed to be dependent on the time elapsed since the first death. Ji et al. (2011) extend the model to a semi-Markov model that incorporates instantaneous, short- and long-term dependence, where broken-heart syndrome is a decreasing function of time since bereavement.
Stochastic mortality models are a further class of joint mortality models that appear in the literature. Adopting a credit risk-type approach, the remaining lifetime of an individual is assumed to be a doubly stochastic stopping time with an intensity that is equivalent to the force of mortality. Although well-established for single cohort studies (see, for example, Dahl 2004; Biffis 2005; Luciano and Vigna 2005, 2008 and Schrager 2006), the use of stochastic mortality models for joint life dependence is limited. Luciano et al. (2008) implement the approach with dependence induced through copulas, while Jevtić and Hurd (2017), and Henshaw et al. (2020) use a probabilistic mechanism with correlated mortality intensities to capture the dependence structure. An alternative approach to correlating stochastic processes in the credit risk setting is proposed by Zhang and Brockett (2020). In this study, individual mortalities are modelled as Brownian motions with drift, and have time indices that move according to correlated subordinators. Dependence is induced through this correlation, where the subordinators are structured to capture both shared frailties and idiosyncratic risks in a similar manner to Jevtić and Hurd (2017) and Henshaw et al. (2020). In a recent study, Arias and Cirillo (2021) propose the use of the non-parametric bivariate reinforced urn process, which learns from the lifetime experiences of individuals and uses the information obtained to make inference about others. In line with Bayesian approaches, prior knowledge on the data can be incorporated into the model and updated at the end of each lifetime, thus facilitating improvements in the model over time.
Previous research beyond the context of lifetimes paired through marriage includes the study of dependence in disease incidence among fathers and sons (Clayton 1978), lifetime dependence and disease heritability in adult twins (Hougaard et al. 1992; van den Berg and Drepper 2022; Wienke et al. 2002 (Denmark); Iachine et al. 1998; Lichtenstein et al. 2000 (Denmark, Sweden, Finland)), where there is a large body of genetics literature, and familial dependence and its impact on child mortality among siblings (Zenger 1993 (Bangladesh); Guo 1993 (Guatemala); Sastry 1997 (Brazil)). However, the implications for insurance are not specifically considered in these works.
In this paper, the existence of dependence between the lifetimes of multiple family members is, therefore, assessed for the first time on this scale and in this socioeconomic context. Pairwise dependence between the lifetimes of husband, wife, son, daughter, father and mother is considered through analysis of Egyptian pension data. Dependence within these relationships spans each of the three classifications of lifetime dependence structures presented by Hougaard (2000). Broken-heart syndrome is characteristic of short-term dependence.
Socioeconomic influences on the determinants of an individual's lifetime, including living circumstances, health, education, religious beliefs and the associated approaches to bereavement and loss, are widely accepted. Yet the study of dependence and its impact on insurance is largely limited to high-income countries. Given that many low and lower-middle income countries rely on mortality tables from high-income countries for the pricing of their mortality-based products, it is critical to know whether the patterns observed in samples from countries such as the UK and Canada can also be seen in different socioeconomic environments. Henshaw et al. (2020) propose a coupled stochastic mortality model with a tempered volatility to reflect the impact of close familial and community structures in low-income countries on the severity of broken-heart syndrome. The findings in this paper provide evidence in support of the propositions of Henshaw et al. (2020), where a Ghanaian data set is considered.
Differences in familial structures across socioeconomic environments are highlighted by the structure of the Egyptian social pension scheme. In the event of a pensioner's death, social pension schemes typically pay out to a spouse or child; however, in Egypt, siblings and parents are also listed as beneficiaries (see Section 2 for details). This policy aligns with the fact that children often live with their parents until marriage, and for male children, in some cases, even after marriage. In addition, many families remain financially dependent on the main income provider or breadwinner, typically the father or eldest son. Emotional ties and living circumstances that influence dependence between family members are strongly reinforced by such traditions and norms. Wide age differences in marital relationships, polygamous partnerships and large families are further features of the environment that may change the strength of dependence in comparison to the samples considered in previous studies. This paper contributes to the dependence literature by expanding the study of dependence within marital relationships through analysis of a large data set in a previously unstudied socioeconomic context. The study focuses on male pensioners and their beneficiaries, assessing the impact of the death of a father or son, i.e., the main income provider, on the lifetimes of their relatives. Five samples are included in the analysis, where each sample considers a different relationship. Samples are collected from the Egyptian social pension scheme for pensioners covered under the General Social Insurance System (Egyptian Social Insurance and Pension Law 79 1975), a compulsory scheme with two funds, covering the government sector, and the public and private sectors, respectively. This system is also studied in detail in Khalil (2006). Throughout the paper, those covered by the pension scheme will be referred to as the policyholder or pensioner, and the beneficiaries.
Copula-based analysis is used to capture the dependence between the lifetimes in each relationship. Comparisons are made between four Archimedean copulas, in line with the widespread use of the Archimedean family in the modelling of bivariate lifetimes. As in Dufresne et al. (2018), the Clayton, Frank, Gumbel and Joe copulas are assumed. Copula dependence parameters, which determine the level of association between two lifetimes, are estimated using the two-step inference functions for margins (IFM) method. In each of the five samples, the marginal distribution parameters are first estimated independently before estimation of the copula dependence parameter. All marginal distributions are fitted with the informative reparametrised Gompertz law (Carriere 1992, 1994).
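As a rough illustration of the two-step structure described above, the classical (maximum likelihood) form of IFM for $n$ fully observed pairs can be sketched as follows. The symbols $\theta_j$ (marginal parameters), $\alpha$ (copula dependence parameter), $f_j$, $F_j$ (marginal density and distribution function) and $c$ (copula density) are notation assumed here for illustration only; the estimation in this paper is Bayesian and the likelihood actually used may differ, for example to account for censored observations.

```latex
% Step 1: estimate each margin separately (j = 1 for the pensioner, j = 2 for the beneficiary)
\hat{\theta}_j = \arg\max_{\theta_j} \sum_{i=1}^{n} \log f_j\bigl(t_{ij};\theta_j\bigr), \qquad j = 1, 2,

% Step 2: plug the fitted margins into the copula density and estimate the dependence parameter
\hat{\alpha} = \arg\max_{\alpha} \sum_{i=1}^{n} \log c\bigl(F_1(t_{i1};\hat{\theta}_1),\, F_2(t_{i2};\hat{\theta}_2);\alpha\bigr).
```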
For parameter estimation, the Bayesian Markov Chain Monte Carlo (MCMC) Metropolis-Hastings (MH) algorithm is implemented. Classical estimation techniques such as maximum likelihood estimation (MLE) provide point estimates of unknown parameters. However, Bayesian MCMC algorithms treat the unknown parameters as random variables and derive estimates of their distribution using random sampling techniques, thus capturing parameter uncertainty. MCMC methods enable the inclusion of prior parameter information and reduce the risk of obtaining local rather than global maxima or minima when random walk sampling is not used, a benefit that is particularly useful for high-dimensional problems. The computation time can, however, be high in comparison to MLE for problems with many parameters and complex likelihood functions. For a more detailed discussion of the workings of MCMC, the interested reader may refer to Robert and Casella (1999), Roberts and Rosenthal (2004) and van Ravenzwaaij et al. (2018). In the analysis of this paper, all MCMC results are compared with MLE point parameter estimates.
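To make the idea of estimating a parameter distribution by sampling concrete, a minimal Metropolis-Hastings sketch is given below. It is not the algorithm of this paper: the random-walk proposal, the toy normal likelihood with a flat prior, the synthetic data and all names (`metropolis_hastings`, `log_post`, `step`) are assumptions made purely for illustration.

```python
import numpy as np

def metropolis_hastings(log_post, init, n_iter=5000, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings for a scalar parameter (illustrative)."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    current = init
    current_lp = log_post(current)
    for i in range(n_iter):
        # Symmetric normal proposal centred on the current value
        proposal = current + step * rng.normal()
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio); the proposal
        # densities cancel because the proposal is symmetric
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

# Synthetic data (hypothetical): normal observations with unit variance
data = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=200)

def log_post(mu):
    # Log-posterior of the mean under a flat prior, up to an additive constant
    return -0.5 * np.sum((data - mu) ** 2)

chain = metropolis_hastings(log_post, init=0.0)
posterior = chain[1000:]  # discard the first 1000 draws as burn-in
print(posterior.mean(), posterior.std())
```

The retained draws approximate the posterior distribution of the parameter, so their spread gives a direct measure of parameter uncertainty, in contrast to a single MLE point estimate.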
Bayesian MCMC techniques are well-used in the literature on copula-based dependence analysis. Huard et al. (2006) and Silva and Lopes (2008) adopt Bayesian analysis to study copula selection criteria, reparameterising the problem such that the prior distribution is on the Kendall's tau correlation coefficient rather than the copula parameter. A comparison of one- and two-step Bayesian estimation techniques is made in Silva and Lopes (2008) and Ausin and Lopes (2010). Almeida and Czado (2012) use MCMC sampling to estimate the parameters of a stochastic copula autoregressive model with time-varying dependence, again reparametrising to estimate Kendall's tau. A common approach in the copula literature, this reparametrisation enables a clear comparison of dependence across the copula families by unifying the domains of the estimated dependence parameters. Following the IFM two-step method with MH in the second step, Thongkairat et al. (2019) observe a more accurate estimation of mixed copula models when using Bayesian rather than ML estimation. MCMC methods have also been applied in problems including claim reserve and loss prediction (da Rocha Neves and Migon 2007; de Alba 2002; Hong and Martin 2017; Ntzoufras and Dellaportas 2002); survival analysis (Arjas and Gasbarra 1996, where a coupled MH algorithm with joint prior distribution is used to account for stochastic ordering with known differences in the lifetimes of samples) and mortality modelling (Antonio et al. 2015; Cairns et al. 2011; Czado et al. 2005; Fung et al. 2019; Li and Lu 2018). For a non-exhaustive list of the early use of MCMC techniques in actuarial modelling, see Scollnik (2001).
The remainder of the paper is organised as follows. In Section 2, the data set is introduced and empirical correlation measures for the five samples presented. Section 3 describes the Gompertz survival model and the copula models used for dependence estimation. The MCMC algorithm is introduced in Section 4, and the IFM method in Section 5. Results are presented in Section 6, and the concluding remarks in Section 7.

Data Set
Between 1975 and 1980, a number of fundamental laws were issued to ensure the coverage of all working Egyptian citizens, both inside and outside of Egypt. These laws provide compulsory coverage, funded by the State, for employees in government, public and private sectors (Egyptian Social Insurance and Pension Law 79 1975), coverage for employers and the self-employed (Egyptian Social Insurance and Pension Law 108 1976), regulation of the voluntary social insurance system for Egyptians working abroad (Egyptian Social Insurance and Pension Law 50 1978) and pay-as-you-go (PAYG) coverage for all working individuals excluded under the three aforementioned laws (Egyptian Social Insurance and Pension Law 112 1980). Each law covers beneficiaries against old age, disability and death. The data analysed in this paper consists of lifetime data for individuals covered by Law 79, focusing on those working in the government. Law 79 is a defined benefit system that provides additional benefits including injury at work, health, unemployment and social patronage insurance, which offers benefits such as the provision of housing and monetary discounts. Law 79 was restructured under Law 148 in 2019 to cover all four social security laws. Although no significant changes were observed, in this study, only Law 79 is relevant.
The laws determining the structure of the Egyptian social security system reflect the nature of living circumstances within families in Egypt. The Egyptian social pension scheme is designed to provide benefits to participating workers when they become of pension age, where contributions are made by the worker throughout their employment. A worker exits the scheme through death, partial permanent disability, total disability, or reaching retirement age, where retirement age is to be increased from 60 (Egyptian Social Insurance and Pension Law 79 1975, Section 18(3)) to 65 in 2040, in line with Section 41 of Egyptian Social Insurance and Pension Law 148 (2019).
Following the death of a pensioner, benefits are distributed among their beneficiaries. By law, beneficiaries are defined as the widow or widower, sons, daughters, parents, brothers and sisters of the pension policyholder (Egyptian Social Insurance and Pension Law 148 2019, Section 98). Payments cease and beneficiaries exit the scheme through, for example, death, marriage for a widow, daughter or sister, and reaching the age of 21 for a son or brother, except for those incapable of earning, students not yet aged 26 and unemployed, university degree holders not yet aged 26 and unemployed and those with lower-level qualifications not yet aged 24 (Egyptian Social Insurance and Pension Law 148 2019, Section 105). In the event that an individual is listed as a beneficiary of multiple pensioners, they receive only one benefit. The order in which the selected benefit is received is: personal pension, spouse's pension, parents' pension, son's pension, and brother or sister's pension (Egyptian Social Insurance and Pension Law 148 2019, Section 102).
Data for this analysis were collected from the Egyptian social pension scheme, with an observation period of 10 years, from 2010 to 2019. A pair is included in the data set only if the policyholder dies within the observation period. The observed distribution of the survival time of the policyholder is therefore conditional on their death within this period. The sample consists of 20,863 male pensioners (the policyholders) and their dependents, where the dependents are either a spouse, parent, son or daughter. On average, the male policyholder dies at age 62.9, with 80% dying between the ages of 53 and 74. Further descriptive statistics for the full sample are given in Table 1. Classifying the data according to the pensioner-beneficiary relationship, five samples are observed. These include husband and wife (H,W), father and son (F,S), father and daughter (F,D), son and father (S,F) and son and mother (S,M). The most commonly studied relationship in the existing literature, the husband and wife sample contains 19,475 males and 19,937 females. The discrepancy in size indicates instances of polygamy. Participation in such a relationship could be a determinant of the strength of dependence between husband and wife. However, although polygamy is an interesting feature of the Egyptian social context and one that does not appear among the subjects of previous research in this area, the sample is small (462 duplicated husbands); polygamous relationships are therefore removed, such that only one spouse beneficiary is considered. The average entry age is approximately 58 and 50.5 for husbands and wives, respectively, with corresponding deaths at ages 62.6 and 65. Of the 19,937 wives in the sample, just 955 (4.79%) died within the 10-year observation period.
A total of 76 sons (0.56%) and 57 daughters (0.40%) exited the observation due to death, where 13,655 father-son and 14,274 father-daughter relationships were included in the sample. The average ages at death of these beneficiaries were just 28 and 36, respectively. Son-father and son-mother constitute the smallest observed samples, with 218 son-father relationships and 1067 son-mother relationships. Since the data is from a pension scheme, the age at entry and age at death of the child in child-parent relationships are relatively high in comparison to parent-child relationships. However, within the child-parent samples, in comparison to the average age at death of the parent, the average age at death of a son is low. This is likely due to the fact that only policyholders that die within the observation period with a parent that is still alive are included in these samples. As such, pensioner sons dying at older ages outside of this period and those who have already lost the corresponding parent are not accounted for. The full summary statistics for all five samples are provided in Table 2.
Empirical dependence measures for the relationships in each sample are provided in Table 3. Here, the Pearson, Spearman and Kendall's tau correlation coefficients between the lifetimes of each member of a pair are calculated. Note that in obtaining these measures, only pairs in which both members die within the observation period are considered, and as such, the measures do not represent the dependence exhibited between all lifetimes in each sample. However, the empirical results indicate the existence of dependence in the data and thus motivate further exploration of the strength of the lifetime association in this setting.
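The three empirical measures named above can be computed along the following lines. This is a sketch only: the ages at death are synthetic values generated for illustration (they are not the Egyptian pension data), and the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic ages at death for 500 pairs in which both members have died.
# A shared component induces positive dependence (illustrative only).
shared = rng.normal(0.0, 5.0, size=500)
age_policyholder = 63.0 + shared + rng.normal(0.0, 5.0, size=500)
age_beneficiary = 65.0 + shared + rng.normal(0.0, 5.0, size=500)

# Each function returns the coefficient together with a p-value
pearson_r, _ = stats.pearsonr(age_policyholder, age_beneficiary)
spearman_rho, _ = stats.spearmanr(age_policyholder, age_beneficiary)
kendall_tau, _ = stats.kendalltau(age_policyholder, age_beneficiary)

print(f"Pearson: {pearson_r:.3f}, Spearman: {spearman_rho:.3f}, Kendall: {kendall_tau:.3f}")
```

Pearson's coefficient measures linear association, whereas Spearman's and Kendall's coefficients are rank-based and therefore invariant to monotone transformations of the lifetimes, which is why Kendall's tau pairs naturally with copula models.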
Age differences among married couples vary significantly in the data. The sample is, therefore, split by age difference (d) to test whether there is an observable impact on the respective correlations, where a positive age difference indicates that the policyholder is the elder member of the pair. For the husband and wife sample, the correlation measures are also provided for samples that are split by the sex of the elder spouse (Table 4). Although the minimum age difference between husband and wife is 0 years and the maximum is 59 years, 80% of the sample differs in age by between 1 and 15 years, where the husband is the elder spouse. While the impact of age difference may be less significant in parent-child relationships, a comparison is also made in these cases.
A high positive correlation is observed between the lifetimes of husband and wife, father and son, father and daughter, and son and father. The correlation decreases with increasing age difference in each of these four samples, in line with the results for spousal dependence in the literature (Dufresne et al. 2018; Youn and Shemyakin 1999). In contrast, correlation between the lifetimes of son and mother increases with increasing age difference, indicating a greater reliance of mother upon son with age. Although the two sample sizes are not comparable, couples in which the wife is the older spouse exhibit an increased correlation, aligning with the findings of asymmetric mortality experience in Lu (2017) and Dufresne et al. (2018). Distributions of age difference and the survival time of the beneficiary after the death of the policyholder, i.e., the time between the first and second deaths, are presented in Figure 1 for each of the five samples. Increased correlation between lifetimes in child-parent relationships can also be observed in the survival time distribution plots, with a greater proportion of bereaved deaths occurring in the early years of bereavement, specifically years 2 and 3. This trend appears with less significance in the husband and wife data set; however, in both parent-child relationships, the association between survival probability and years since bereavement is less clear. This perhaps aligns with their fairly small sample sizes. The father-son sample experiences a gradual increase in mortality, which falls after the fifth year of bereavement, while the father-daughter sample experiences the same year three peak as observed in the three other samples, with an additional peak much later in the bereavement. In comparison to the Ghanaian data set considered in Henshaw et al. (2020), where 13.1% of the sample dies within the first year of bereavement, lifetime dependence is of much lower initial significance. In the Egyptian data set presented here, the impact of losing a spouse appears to be delayed.

Model Description
In this section, the survival and copula models for dependence are presented. Let (x) denote an individual aged x. Then, the survival function of (x) is defined by
$$S_x(t) = P(\tau_x > t) = P(X > x + t \mid X > x),$$
where $\tau_x$ is the remaining lifetime of (x), given their survival to age x, and $X$ denotes the remaining lifetime of an individual at birth. Many mortality models exist and are implemented in the literature. For the purpose of this study, Gompertz's law of mortality is adopted. Gompertz's law is a classical model of mortality experience first proposed in Gompertz (1825), which states that after a given age, the logarithm of mortality intensity is a linear function of age. The law is specified to reflect mortality behaviours above a sufficiently high level (observed to be approximately 30 years of age); the suitability of Gompertz's law for old-age mortality is, however, also widely debated. Many studies argue for the existence of a deceleration in the increase in mortality at the highest ages (above approximately 80-90 years), with mortality observed to curve away from the Gompertzian trend and to plateau at very high ages; see, for example, Thatcher and Kannisto (1998) and Thatcher (1999). However, more recently, developments in the reporting of age and mortality data have been proposed as improvements that could contest the non-Gompertzian nature of old-age mortality; see, for example, Gavrilov and Gavrilova (2019).
In line with this ongoing debate, extensions of the classical Gompertz model and alternative mortality models have been developed to allow for flexibility in the modelling of mortality behaviours. The simplest extension of the Gompertz model is the Gompertz-Makeham model (Makeham 1860, 1867), where the addition of a constant term is introduced to capture age-independent mortality. Willemse and Koppelaar (2000) and Willemse and Kaas (2007) propose generalisations of the Gompertz distribution in the context of frailty-based mortality models, extending the model beyond classical age-dependent considerations. In more recent work, El-Gohary et al. (2013) propose an alternative generalised Gompertz distribution that allows for flexibility in the specification of the hazard rate to overcome the monotonic requirement of the Gompertz hazard function. Li et al. (2021) alternatively capture the old-age mortality curvature and plateau through proposition of a multi-factor exponential model based on the approximation of mortality measures with Laguerre functions. For a thorough overview of mortality models and their suitability for capturing the mortality experience of different age ranges, see, for example, Booth and Tickle (2008), and the references therein. Cairns et al. (2009) specifically focus on the mortality of pension-age lives (60-89 years), comparing the performances of eight stochastic mortality models in explaining mortality improvements at older ages. Their analysis compares alternatives to the classical Gompertz model, including Lee-Carter and Age-Period-Cohort type models, in addition to generalisations of the model of Cairns et al. (2006).
With the exception of survival data in the son and daughter samples, Table 2 shows that the age at entry largely lies within the Gompertz range. As such, the marginal distributions of the individuals studied in this analysis are assumed to follow the Gompertz law. MCMC relies on the idea that the Markov chain describing the transient behaviour of the parameters accepted by the algorithm converges to its stationary distribution after a sufficient number of iterations, where the stationary distribution resembles the desired probability distribution of the estimated parameters. For the small samples of sons and daughters with ages that are largely outside of the Gompertz range, inaccurate estimates, and thus, greater parameter uncertainty, are more likely to appear. This is exemplified in the results of Table 5 (Section 6). However, the parameter distribution obtained in the first IFM step captures this uncertainty. Sampling from this distribution to estimate the marginal parameters for the second IFM step, paired with the assumed convergence of the Markov chain to its stationary distribution, therefore mitigates the significance of errors in the marginal estimation. Future work could involve the fitting of a more appropriate model for the age range of child beneficiaries. In addition, while the simple construction of the Gompertz model provides a good starting point for the exploration of mortality experience, it would be interesting to consider the impact on the dependence results of fitting a more comprehensive model of mortality.
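The force of mortality $\lambda_x$ and survival function $S(x)$ associated with $X$ are given by
$$\lambda_x = B c^x \quad \text{and} \quad S(x) = \exp\left(-\frac{B}{\ln c}\left(c^x - 1\right)\right),$$
respectively, for all samples, where $B > 0$, $c > 1$ and $x \geq 0$. Reparametrising the Gompertz law such that the estimated parameters are informative (Carriere 1992, 1994), let
$$\frac{1}{\sigma} = \ln c \quad \text{and} \quad e^{-m/\sigma} = \frac{B}{\ln c},$$
where $m > 0$ is the mode of the density and $\sigma > 0$ is the dispersion of the density about the mode. Then,
$${}_t p_x = \exp\left(e^{(x-m)/\sigma}\left(1 - e^{t/\sigma}\right)\right),$$
where ${}_t p_x = P(X > x + t \mid X > x) = S(x+t)/S(x)$. The probability that (x) dies at a given time t, i.e., the probability density function of the remaining lifetime of (x), is then derived by
$$f_x(t) = -\frac{\mathrm{d}}{\mathrm{d}t}\,{}_t p_x = \frac{1}{\sigma}\exp\left(\frac{x + t - m}{\sigma} + e^{(x-m)/\sigma}\left(1 - e^{t/\sigma}\right)\right).$$
Copulas are widely used across a broad set of disciplines for the study of dependence between random variables. Since this paper focuses on the estimation of dependence between two lifetimes, bivariate copula functions will be used throughout the analysis.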
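A short numerical sketch of the reparametrised Gompertz law may help to fix ideas. It assumes the standard mode-dispersion form, in which the force of mortality at age x is (1/σ)exp((x − m)/σ) and the conditional survival probability is exp(e^{(x−m)/σ}(1 − e^{t/σ})). The function names and the parameter values (x = 60, m = 75, σ = 10) are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import numpy as np
from scipy.integrate import quad

def gompertz_survival(t, x, m, sigma):
    """Probability that a life aged x survives a further t years."""
    return np.exp(np.exp((x - m) / sigma) * (1.0 - np.exp(t / sigma)))

def gompertz_density(t, x, m, sigma):
    """Density of the remaining lifetime of a life aged x, evaluated at t."""
    # Force of mortality at attained age x + t
    hazard = np.exp((x + t - m) / sigma) / sigma
    return hazard * gompertz_survival(t, x, m, sigma)

# Hypothetical parameter values, for illustration only
x, m, sigma = 60.0, 75.0, 10.0

ten_year_survival = gompertz_survival(10.0, x, m, sigma)
# The density should integrate to (approximately) one over the remaining lifetime
total_mass, _ = quad(gompertz_density, 0.0, 80.0, args=(x, m, sigma))
print(ten_year_survival, total_mass)
```

Checking that the density integrates to one is a simple sanity test that the survival function and density are consistent with each other.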
First introduced by Sklar (1959), copula functions provide a link between the marginal and bivariate distributions of two random variables, thus facilitating a tractable analysis of the associated dependence structures. The definition of the bivariate copula can be extended for the definition of multivariate copulas of dimensions greater than two if the dependence between a higher number of random variables is of interest.
The Archimedean copula family is a class of copulas that are well-used in the modelling of bivariate survival functions, due to their analytical tractability and their relation with informative measures of association, such as Kendall's tau.Copulas in the Archimedean family are particularly useful in high-dimensional studies as they facilitate the modelling of dependence with a single parameter.In addition, in their theoretical study, Genest and Kolev (2021) introduce an extension of the law of uniform seniority to two dependent lives, proving that for a bilinear averaging function, paired lifetimes exhibit Archimedean dependence and have marginal distributions from the same scale family.
For a thorough discussion of copula families, and their definitions and properties, see Nelsen (2006).The analysis in the remainder of this paper focuses on the Clayton, Frank, Gumbel and Joe copulas.To improve the interpretability of the results, estimates of Kendall's tau correlation coefficient, given the copula dependence parameter estimates, will also be provided in Section 6.Details of the structure of each copula and the associated Kendall's tau are provided in Appendix A.
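As a rough illustration of the four copulas named above, the sketch below codes their standard textbook forms (as in Nelsen 2006) together with Kendall's tau for the Clayton, Gumbel and Frank cases. The function names are hypothetical, the parametrisations are assumed to match those of Appendix A, and Kendall's tau for the Joe copula is omitted here because it has no equally simple expression.

```python
import math

def clayton(u, v, a):
    """Clayton copula, a > 0."""
    return (u ** -a + v ** -a - 1.0) ** (-1.0 / a)

def frank(u, v, a):
    """Frank copula, a != 0."""
    num = math.expm1(-a * u) * math.expm1(-a * v)
    return -math.log1p(num / math.expm1(-a)) / a

def gumbel(u, v, a):
    """Gumbel copula, a >= 1."""
    s = (-math.log(u)) ** a + (-math.log(v)) ** a
    return math.exp(-s ** (1.0 / a))

def joe(u, v, a):
    """Joe copula, a >= 1."""
    p, q = (1.0 - u) ** a, (1.0 - v) ** a
    return 1.0 - (p + q - p * q) ** (1.0 / a)

def tau_clayton(a):
    return a / (a + 2.0)

def tau_gumbel(a):
    return 1.0 - 1.0 / a

def tau_frank(a, steps=2000):
    """1 - (4/a) * (1 - D1(a)), with the Debye function D1 by the midpoint rule."""
    h = a / steps
    integral = sum(((i + 0.5) * h) / math.expm1((i + 0.5) * h) for i in range(steps)) * h
    return 1.0 - 4.0 / a * (1.0 - integral / a)

print(tau_clayton(2.0), tau_gumbel(2.0), round(tau_frank(5.736), 3))
```

Converting each dependence parameter to Kendall's tau puts the four copulas on a common scale, which is what makes the comparison across families meaningful.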
To fit the copulas in Table A1 to the Egyptian pension data set of Section 2, let $\tau_{x_1}$ and $\tau_{x_2}$ denote the remaining lifetimes of the first (pensioner) and second (beneficiary) member of each pair, respectively, given their current ages $x_1$ and $x_2$. Then, by Sklar's theorem (Sklar 1959), if $\tau_{x_1}$ and $\tau_{x_2}$ are positive and continuous, there exists a unique copula $C : [0, 1]^2 \to [0, 1]$ that describes the joint distribution function of the bivariate pair of random variables $(\tau_{x_1}, \tau_{x_2})$, such that
$$F_{\tau_{x_1}, \tau_{x_2}}(t_1, t_2) = P(\tau_{x_1} \leq t_1, \tau_{x_2} \leq t_2) = C\left(F_{\tau_{x_1}}(t_1), F_{\tau_{x_2}}(t_2)\right),$$
where $F_{\tau_{x_1}}(t_1)$ and $F_{\tau_{x_2}}(t_2)$ are the marginal distribution functions of $\tau_{x_1}$ and $\tau_{x_2}$, respectively. The joint survival function of $(\tau_{x_1}, \tau_{x_2})$ is similarly given by
$$S_{\tau_{x_1}, \tau_{x_2}}(t_1, t_2) = P(\tau_{x_1} > t_1, \tau_{x_2} > t_2) = C\left(S_{\tau_{x_1}}(t_1), S_{\tau_{x_2}}(t_2)\right),$$
where $C$ here denotes the copula linking the marginal survival functions $S_{\tau_{x_k}} = 1 - F_{\tau_{x_k}}$, $k = 1, 2$.
Considering marginal distributions conditional upon survival to observation means that lifetimes are coupled at the beginning of the observation period. Coupling lifetimes at an earlier date would imply the existence of dependence prior to the observation period. In the husband and wife case, the date of marriage would therefore be an appropriate starting point; however, this data is not readily available. Similarly, coupling from the date of birth of the child would be relevant in the parent-child and child-parent relationships; however, for consistency, we couple at the outset of the observation for all samples.
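To make the coupling of two lifetimes concrete, the following sketch evaluates a joint survival probability by passing Gompertz marginal survival probabilities through a Clayton copula. It is an illustration under assumed inputs: the helper names, ages and parameter values are hypothetical, and the Clayton family is chosen only because its formula is short.

```python
import math

def gompertz_survival(t, x, m, sigma):
    """t_p_x under the (m, sigma) Gompertz parametrisation."""
    return math.exp(math.exp((x - m) / sigma) * (1.0 - math.exp(t / sigma)))

def clayton(u, v, alpha):
    """Clayton copula, alpha > 0."""
    return (u ** -alpha + v ** -alpha - 1.0) ** (-1.0 / alpha)

def joint_survival(t1, t2, x1, x2, theta1, theta2, alpha):
    """P(both lives survive past t1 and t2), coupled at the start of observation."""
    u = gompertz_survival(t1, x1, *theta1)  # marginal survival, member 1
    v = gompertz_survival(t2, x2, *theta2)  # marginal survival, member 2
    return clayton(u, v, alpha)

# Assumed ages and parameters, for illustration only.
theta_1, theta_2 = (78.0, 10.0), (82.0, 9.0)
dependent = joint_survival(10.0, 10.0, 65.0, 60.0, theta_1, theta_2, alpha=1.0)
independent = (gompertz_survival(10.0, 65.0, *theta_1)
               * gompertz_survival(10.0, 60.0, *theta_2))
print(dependent, independent)
```

Under positive dependence the joint survival probability exceeds the product of the marginals, which is the effect that the independence assumption ignores.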

Metropolis-Hastings MCMC
In this paper, the model parameters are estimated using Bayesian Markov Chain Monte Carlo (MCMC) techniques. Through this approach, Bayes' theorem is used to update the conditional probability of an event, given some known information, as more information is obtained. Given a sample of observed data $y \in \mathbb{R}^n$, with distribution $p(y, \theta)$, Bayes' theorem states that
$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)},$$
where $\theta$ is the vector of parameters to be estimated. Since $p(y)$ does not change with $\theta$, it holds that
$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta). \quad (11)$$
The posterior distribution $p(\theta \mid y)$, which describes the distribution of the parameters given the observed data, is therefore proportional to the product of the likelihood $p(y \mid \theta)$ and the prior distribution of parameters $p(\theta)$. Analytical and numerical evaluation of the normalising constant $p(y)$ is, however, largely intractable in higher dimensions, which restricts the full estimation of the posterior.
MCMC methods provide algorithms for constructing Markov chains, with stationary distributions replicating that of the posterior. Ensuring convergence to the target distribution, these Markov chains are ergodic and stationary with respect to the posterior distribution. As such, the state of the chain after a sufficient number of steps can be used to approximate the target distribution, the quality of which increases with the number of iterations. Early chain values are highly dependent on the initial value of the chain, due to the Markovian nature of the algorithm, and are thus typically discarded. Through estimation of (11), MCMC algorithms enable random sampling from any probability distribution defined up to a normalisation factor, thus eliminating the limitations associated with integral evaluation.
The Metropolis-Hastings MCMC algorithm (Robert and Casella 1999) proposes a simple method for constructing such a Markov chain $(\theta_t)_{t \geq 0}$ on the state space of the posterior distribution, where $\theta_t \in \mathbb{R}^d$ for a parameter vector of dimension $d$. The algorithm explores the state space of the posterior, progressively constructing an approximation of the target distribution. To implement the algorithm, a proposal kernel $q(\theta' \mid \theta)$ must first be selected as the distribution from which the potential parameters are sampled. This kernel describes the probability of transitioning to a new point in space $\theta'$, given that the chain is currently in state $\theta$, and thus describes the movement of the Markov chain. Once specified, the algorithm proceeds as follows:
• Initialise, i.e., draw $\theta_0$ from the prior distribution.
• For each iteration $t = 1, 2, \ldots$:
- Draw a proposal $\theta' \sim q(\theta' \mid \theta_{t-1})$.
- Compute
$$A = \min\left\{1, \frac{p(\theta' \mid y)\, q(\theta_{t-1} \mid \theta')}{p(\theta_{t-1} \mid y)\, q(\theta' \mid \theta_{t-1})}\right\},$$
where $A$ defines the acceptance probability.
- Draw $u \sim U(0, 1)$. If $u < A$, accept the proposal, fixing $\theta_t = \theta'$. Else, fix $\theta_t = \theta_{t-1}$.
Note that if the proposal kernel is specified such that it is symmetric in distribution, the acceptance probability simplifies to
$$A = \min\left\{1, \frac{p(\theta' \mid y)}{p(\theta_{t-1} \mid y)}\right\},$$
since $q(\theta \mid \theta') = q(\theta' \mid \theta)$ for all $\theta, \theta'$.
In the analysis of this paper, a normal proposal distribution is selected, such that $\theta' = \theta_{t-1} + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2)$, where $\sigma$ is the standard deviation (step-size) parameter selected by the user to ensure sufficient exploration of the parameter space. Due to the inclusion of the noise term, such a proposal is referred to as a random walk proposal. Prior distributions in all simulations undertaken in the estimations of this paper are assumed to be noninformative uniform priors.
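A random walk Metropolis-Hastings sampler of this kind can be sketched as follows for a single parameter. This is a minimal illustration, not the implementation used for the results of this paper: the function names, the toy target (a standard normal log-density under a uniform prior on (-10, 10)), the step size and the burn-in length are all assumptions, and the chain is started from a fixed value inside the prior support rather than from a prior draw.

```python
import math
import random

def metropolis_hastings(log_post, theta0, step, n_iter, rng):
    """Random walk Metropolis-Hastings for a scalar parameter.

    log_post returns the log of the unnormalised posterior, or -inf outside
    the support of the uniform prior. theta0 must lie inside the support.
    """
    chain = [theta0]
    current_lp = log_post(theta0)
    accepted = 0
    for _ in range(n_iter):
        proposal = chain[-1] + rng.gauss(0.0, step)   # normal random walk proposal
        proposal_lp = log_post(proposal)
        # Symmetric proposal: A = min(1, p(proposal | y) / p(current | y)).
        log_a = min(0.0, proposal_lp - current_lp)
        if math.log(1.0 - rng.random()) < log_a:      # u drawn from (0, 1]
            chain.append(proposal)
            current_lp = proposal_lp
            accepted += 1
        else:
            chain.append(chain[-1])
    return chain, accepted / n_iter

def log_post(theta):
    """Toy target: N(0, 1) log-likelihood kernel with a U(-10, 10) prior."""
    if not -10.0 < theta < 10.0:
        return -math.inf
    return -0.5 * theta * theta

rng = random.Random(1)
chain, rate = metropolis_hastings(log_post, 0.0, 2.4, 20000, rng)
kept = chain[2000:]                  # discard early, start-dependent values
post_mean = sum(kept) / len(kept)
print(round(rate, 3), round(post_mean, 3))
```

Working on the log scale avoids numerical underflow when the likelihood is a product over many observations, and only the ratio of posterior values is needed, so the normalising constant never has to be computed.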
The proportion of parameters sampled from the proposal distribution that are accepted by the MH algorithm is the acceptance rate. This measure is used to assess the efficiency of the algorithm, with an acceptance rate of 0.234 considered optimal (Gelman et al. 1997). The integrated autocorrelation time (IAT) score is a further indicator of the robustness of an MCMC simulation. The IAT estimates the number of iterations, on average, needed for an independent sample to be drawn. When running the analysis, an estimate was therefore selected if the acceptance rate of the chain was sufficiently close to the optimal level and if the associated IAT score was low. For the purpose of this study, parameter estimates are given by the mean of the estimated probability distributions.
The standard error of the MCMC sampler is given by
$$\mathrm{SE} = \frac{\mathrm{SD}}{\sqrt{N / \mathrm{IAT}}},$$
where SD is the standard deviation of the sampled parameter values, $N$ is the number of iterations of the MCMC algorithm and $N/\mathrm{IAT}$ is the effective sample size, which estimates the number of independent draws that would achieve the same level of precision as the correlated MCMC sample.
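The diagnostics above can be computed directly from a stored chain. The sketch below is one common way of estimating the IAT, summing sample autocorrelations until the first non-positive value; this truncation rule, the lag cap and the function names are assumptions, since the estimator used in this paper is not specified. An autoregressive chain stands in for MCMC output.

```python
import math
import random

def integrated_autocorrelation_time(chain, max_lag=200):
    """IAT = 1 + 2 * sum of autocorrelations, truncated at the first non-positive lag."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [c - mean for c in chain]
    var = sum(c * c for c in centred) / n
    if var == 0.0:
        return 1.0
    iat = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        iat += 2.0 * rho
    return iat

def mcmc_standard_error(chain, max_lag=200):
    """Return (SE, effective sample size, IAT) with SE = SD / sqrt(N / IAT)."""
    n = len(chain)
    mean = sum(chain) / n
    sd = math.sqrt(sum((c - mean) ** 2 for c in chain) / (n - 1))
    iat = integrated_autocorrelation_time(chain, max_lag)
    ess = n / iat
    return sd / math.sqrt(ess), ess, iat

# Stand-in for MCMC output: an AR(1) chain with strong positive autocorrelation.
rng = random.Random(0)
ar_chain = [0.0]
for _ in range(19999):
    ar_chain.append(0.9 * ar_chain[-1] + rng.gauss(0.0, 1.0))
se, ess, iat = mcmc_standard_error(ar_chain)
print(round(se, 4), round(ess), round(iat, 1))
```

A chain with a high IAT carries far fewer effectively independent draws than its length suggests, which is why a low IAT was used as a selection criterion alongside the acceptance rate.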

Inference Functions for Margins
The inference functions for margins (IFM) approach of Joe and Xu (1996) is adopted to specify the likelihood function for maximisation. The use of IFM for dependence estimation in copula-based models in the actuarial literature has been observed in studies including those by da Silva Filho et al. (2012) and Brechmann et al. (2013) for dependence between international financial markets, Krämer et al. (2013) and Lee and Shi (2019) for dependence between the number and size of insurance claims and Wang et al. (2015), Lin et al. (2015) and Dufresne et al. (2018) for dependence in mortality models. IFM for estimation of dependence between mortalities modelled with affine processes is also implemented by Xu et al. (2020). For a $d$-dimensional multivariate distribution, IFM involves first estimating the vectors of marginal distribution parameters $\theta_1, \ldots, \theta_d$, then substituting the marginal estimates to maximise the associated likelihood function for the parameters of the joint distribution, which is given by
$$L(\theta_1, \ldots, \theta_d, \alpha) = \sum_{i=1}^{n} \log f(x_{i1}, \ldots, x_{id}; \theta_1, \ldots, \theta_d, \alpha),$$
where $x_i = (x_{i1}, \ldots, x_{id})$ is the observed data, $\alpha$ the vector of parameters of the joint distribution, and for joint distributions captured with copula-based models,
$$f(x_1, \ldots, x_d; \theta_1, \ldots, \theta_d, \alpha) = c\left(F_1(x_1; \theta_1), \ldots, F_d(x_d; \theta_d); \alpha\right) \prod_{j=1}^{d} f_j(x_j; \theta_j),$$
where $c(F_1(x_1; \theta_1), \ldots, F_d(x_d; \theta_d); \alpha)$ is the copula density and $f_j(x_j; \theta_j)$ the marginal density for variate $j$. Splitting parameter estimation in this way is particularly useful for reducing the computation time for multivariate problems in which large numbers of parameters are to be estimated.
Following this two-step approach, two sets of parameter pairs, $\theta_1 = (m_1, \sigma_1)$ and $\theta_2 = (m_2, \sigma_2)$, are estimated for the marginal distributions in each of the five samples, where the subscripts differentiate between the first and second members of each pair. The univariate likelihood for the estimation of $\theta_k$, where $k = 1, 2$, is given by
$$L(\theta_k) = \prod_{i=1}^{N} \left[f_{x_k^i}(t_k^i; \theta_k)\right]^{\mathbb{1}\{t_k^i < c_k^i\}} \left[{}_{t_k^i} p_{x_k^i}(\theta_k)\right]^{\mathbb{1}\{t_k^i = c_k^i\}}, \quad (17)$$
where $N$ is the number of pairs in the sample; and $x_k^i$, $t_k^i$ and $c_k^i$ are the age at entry, remaining lifetime and censoring point of member $k$ of pair $i$, respectively, where the censoring point marks the time between entry into the sample and the terminal time of the observation. The remaining lifetime of an individual $(x_k^i)$ in the observed period is then
$$t_k^i = \min\left(\tau_{x_k^i}, c_k^i\right).$$
Inclusion of the censoring point and conditioning on an individual's survival to their age at entry ensures that left truncation and right censoring in the data are accounted for.
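The first IFM step can be illustrated with a censored Gompertz log-likelihood. In this sketch, an observed death contributes the log-density of the remaining lifetime and a censored life contributes the log-survival probability, both conditional on survival to the age at entry. The function names, the simulated data and the censoring time of 15 years are assumptions made for illustration only.

```python
import math
import random

def gompertz_log_survival(t, x, m, sigma):
    """log of t_p_x under the (m, sigma) Gompertz parametrisation."""
    return math.exp((x - m) / sigma) * (1.0 - math.exp(t / sigma))

def gompertz_log_density(t, x, m, sigma):
    """log density of the remaining lifetime of (x) at time t."""
    return gompertz_log_survival(t, x, m, sigma) + (x + t - m) / sigma - math.log(sigma)

def marginal_log_likelihood(m, sigma, data):
    """data holds (age at entry, observed time, died) triples for one family member."""
    total = 0.0
    for x, t, died in data:
        if died:
            total += gompertz_log_density(t, x, m, sigma)    # death observed
        else:
            total += gompertz_log_survival(t, x, m, sigma)   # right-censored
    return total

def simulate(n, m, sigma, rng, censor=15.0):
    """Simulate right-censored Gompertz remaining lifetimes by inverse transform."""
    data = []
    for _ in range(n):
        x = rng.uniform(50.0, 70.0)
        u = 1.0 - rng.random()                                # u in (0, 1]
        tau = sigma * math.log(1.0 - math.log(u) * math.exp(-(x - m) / sigma))
        data.append((x, min(tau, censor), tau < censor))
    return data

sample = simulate(2000, 80.0, 10.0, random.Random(0))
print(marginal_log_likelihood(80.0, 10.0, sample))
```

The same function can serve either as the objective for classical MLE or, with a uniform prior, as the log-posterior passed to a Metropolis-Hastings sampler.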
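In estimating the dependence between lifetimes, the survival time is the variable of interest. As such, the likelihood function for the estimation of the copula dependence parameters is constructed in relation to the joint survival function $C(u_i, v_i)$, where
$$u_i = {}_{t_1^i} p_{x_1^i}(\hat{\theta}_1) \quad \text{and} \quad v_i = {}_{t_2^i} p_{x_2^i}(\hat{\theta}_2)$$
for the member 1 and member 2 marginal estimates $\hat{\theta}_1$ and $\hat{\theta}_2$, respectively. Having obtained the parameter estimates for the marginal distributions of each pair, copula dependence parameters are estimated through the maximisation of the following likelihood function:
$$L(\alpha) = \prod_{i=1}^{N} \left[f_{x_1^i}(t_1^i)\, f_{x_2^i}(t_2^i)\, \frac{\partial^2 C(u_i, v_i)}{\partial u_i\, \partial v_i}\right]^{\delta_1^i \delta_2^i} \left[f_{x_1^i}(t_1^i)\, \frac{\partial C(u_i, v_i)}{\partial u_i}\right]^{\delta_1^i (1 - \delta_2^i)} \left[f_{x_2^i}(t_2^i)\, \frac{\partial C(u_i, v_i)}{\partial v_i}\right]^{(1 - \delta_1^i) \delta_2^i} \left[C(u_i, v_i)\right]^{(1 - \delta_1^i)(1 - \delta_2^i)}. \quad (20)$$
The four terms in (20) correspond to the likelihood of the death of both $(x_1^i)$ and $(x_2^i)$, the death of $(x_1^i)$ and survival of $(x_2^i)$, the survival of $(x_1^i)$ and death of $(x_2^i)$, and the survival of both $(x_1^i)$ and $(x_2^i)$, respectively, where $\delta_k^i = \mathbb{1}\{t_k^i < c_k^i\}$ indicates that the death of member $k$ of pair $i$ is observed,
$$\frac{\partial C(u_i, v_i)}{\partial u_i} = P\left(\tau_{x_2^i} > t_2^i \mid \tau_{x_1^i} = t_1^i\right) \quad (21)$$
and
$$\frac{\partial^2 C(u_i, v_i)}{\partial u_i\, \partial v_i} = c(u_i, v_i; \alpha)$$
is the copula density. The partial derivative of $C(u_i, v_i)$ with respect to $v_i$ is analogous to (21). Given that all pensioners die within the observation period, for the data considered in this paper, (20) reduces to the product of only the first and second terms.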
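The four-term structure of the copula likelihood can be sketched for the Clayton copula, whose partial derivatives have simple closed forms. This is an illustrative sketch with hypothetical function names: the inputs are the fitted marginal survival probabilities and death indicators of each pair, and the marginal density factors are left out because they do not depend on the dependence parameter and so do not affect its estimate.

```python
import math

def clayton_terms(u, v, alpha):
    """Return C, dC/du, dC/dv and the copula density for the Clayton copula."""
    s = u ** -alpha + v ** -alpha - 1.0
    c = s ** (-1.0 / alpha)
    dc_du = u ** (-alpha - 1.0) * s ** (-1.0 / alpha - 1.0)
    dc_dv = v ** (-alpha - 1.0) * s ** (-1.0 / alpha - 1.0)
    density = (1.0 + alpha) * (u * v) ** (-alpha - 1.0) * s ** (-1.0 / alpha - 2.0)
    return c, dc_du, dc_dv, density

def copula_log_likelihood(alpha, pairs):
    """pairs holds (u, v, d1, d2): marginal survival probabilities and death flags."""
    total = 0.0
    for u, v, d1, d2 in pairs:
        c, dc_du, dc_dv, density = clayton_terms(u, v, alpha)
        if d1 and d2:
            total += math.log(density)   # both deaths observed
        elif d1:
            total += math.log(dc_du)     # member 1 dies, member 2 censored
        elif d2:
            total += math.log(dc_dv)     # member 1 censored, member 2 dies
        else:
            total += math.log(c)         # both censored
    return total

# Two made-up pairs: one with both deaths observed, one with the beneficiary censored.
example = [(0.5, 0.5, True, True), (0.5, 0.5, True, False)]
print(copula_log_likelihood(1.0, example))
```

For a data set in which every pensioner's death is observed, only the first two branches are ever reached, matching the reduction of the likelihood to its first and second terms.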
When presenting the results in Section 6, labels $k = 1, 2$ will be replaced by labels corresponding to the identity of each family member. The likelihood functions (17) and (20) are also used in the comparison of the MCMC estimation with classical MLE. In this case, the standard error of the parameter estimates is calculated via the inverse of the information matrix $I(\theta)$, where $I(\theta) = -E[H(\theta)]$, the negative of the expected value of the Hessian matrix.
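For a single parameter, the standard error calculation can be sketched numerically. Note the assumptions: the sketch uses the observed information (the negative second derivative of the log-likelihood at the estimate, by central differences) in place of the expected information, and the exponential log-likelihood is only a toy example with a known answer. The function names are hypothetical.

```python
import math

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def mle_standard_error(log_lik, estimate, h=1e-4):
    """Standard error as the square root of the inverse observed information."""
    information = -second_derivative(log_lik, estimate, h)
    return math.sqrt(1.0 / information)

# Toy check: exponential rate with n = 100 observations summing to 50.
n_obs, total_time = 100, 50.0

def exp_log_lik(rate):
    return n_obs * math.log(rate) - rate * total_time

rate_hat = n_obs / total_time            # closed-form MLE
se_hat = mle_standard_error(exp_log_lik, rate_hat)
print(rate_hat, se_hat)
```

In the multi-parameter marginal case the same idea applies with the full Hessian matrix, whose negative is inverted and whose diagonal gives the squared standard errors.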

Results
Table 5 displays the MCMC and MLE marginal parameter estimation results, with acceptance rate, IAT score, and standard error (SE), as defined in Section 4. Note that in all cases, MCMC and MLE produce almost the same results. Standard errors are generally low for both estimation techniques, but they are lower when MCMC is used. In all samples, the modal age at death of the beneficiary is greater than that of the pensioner. In the husband and wife case, this reflects the higher life expectancy of females. The modal age at death is particularly high among beneficiaries in the son and father, and son and mother samples. This observation could be due to the fact that here, the parent is alive at the time of the child's death, and so may already be of high age. Each of the marginal estimates may also be influenced by the level of censoring, with many survivors observed relative to the respective sample sizes (see Table 2). The effect of censoring and the associated small sample sizes can also be seen in the IAT, with Markov chains corresponding to samples with fewer data points exhibiting higher scores. However, due to the stationary behaviour of the Markov chain, any increased error in the marginal distribution estimate associated with a limited sample size will be overcome in the second MCMC step.
Comparison of the non-parametric Kaplan-Meier distribution and the Gompertz distribution obtained from the survival function in (5) with MCMC parameters as in Table 5 is presented in Figure 2. Note that in all cases, the Gompertz and Kaplan-Meier distributions fit more closely for the marginals of the beneficiaries. Bias in the data induced by the fact that all pensioners die within the observation period (otherwise, neither pensioner nor beneficiary is observed) could be a determinant of this observation. Confidence intervals for the son and daughter samples are also much wider at higher ages, aligning with their small sample sizes, and thus, increased uncertainty. The limited number of observation points resulting from the data's annual reporting of deaths could also be associated with inaccuracies in the fitting of the continuous marginals.
The empirical dependence measures presented in Table 3 suggest that the lifetimes of family members in all relationships considered exhibit strong dependence, aligning with the findings in the literature. However, a large number of censored data points appear in all samples, particularly husband and wife, father and son, and father and daughter. In contrast to the empirical correlation estimates, which consider only those who have died, and so a biased sample of the data, the assumption of copula models for dependence enables censoring in the data to be captured.
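The non-parametric benchmark used in the marginal comparison above can be illustrated with a small Kaplan-Meier estimator. This is a generic textbook sketch with a hypothetical function name and made-up data; it does not reproduce the confidence intervals shown in Figure 2.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times: observed times (death or censoring); events: truthy if a death was observed.
    Returns a list of (time, survival probability) at each distinct death time.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(order) and order[i][0] == t:
            deaths += 1 if order[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Made-up example: four lives, the second of which is censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored lives leave the risk set without triggering a drop in the curve, which is how the estimator uses partial information from survivors; with annual reporting of deaths, many lives share the same time, so the curve has few, comparatively large steps.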
The estimation results for the dependence parameters of the Clayton, Frank, Gumbel and Joe copulas defined in Section 3 are presented in Table 6. MCMC and MLE techniques are compared, with the estimates aligning consistently as in the marginal case. The IAT and SE are low for all MCMC estimates, with increased errors in the marginal distribution estimates unobservable in the copula parameter estimation, as expected.
In comparison to the findings of Dufresne et al. (2018), the dependence parameters are relatively low across all of the samples. Dependence is greatest between the lifetimes of son, and mother or father, which may be expected due to the typically unnatural ordering of the deaths. In addition, with increasing age, elder members of Egyptian families are traditionally taken care of by their children. As such, the loss of a son could impact the living circumstances of the bereaved parent, particularly in cases where the son is the breadwinner. Dependence between the lifetimes of husband and wife is stronger than in parent-child and weaker than in child-parent relationships. Focusing on age at death dependence in historical French genealogy data, Cabrignac et al. (2020) consider parent-child and grandparent-child dependencies in addition to the classical marital case, noting a very weak but significant association between lifetimes in the alternative relationships, in line with the parent-child findings of this study.
Kendall's tau correlation coefficient estimates obtained from the MCMC dependence parameter estimates in Table 6 are given in Table 7. Here, when comparing between relationships, the same trends in dependence strength as those discussed for the copula estimation are observed. Correlation between lifetimes modelled with a Clayton copula is much lower than for the Frank, Gumbel and Joe copulas. This may suggest that the Clayton copula is not the most appropriate copula for estimation of dependence within the data set of this paper. This finding was also observed in Dufresne et al. (2018) through a comparison of IFM with the omnibus semi-parametric procedure (or pseudo-maximum likelihood) approach.
Figure 3 presents a selection of MCMC simulation results for the copula dependence parameter estimation. The density of the estimated parameter distribution, the traceplot of accepted parameters, and the copula likelihood (20) with estimated parameter indicated are presented, where the traceplot depicts the behaviour of the Markov chain, and is thus a plot of the parameters accepted by the MH algorithm. The results for the husband and wife, and father and daughter samples are selected for all four copulas in order to highlight the insignificance of inaccuracies in the marginal estimation step. In contrast to the husband, wife and father marginal samples, the increased IAT score associated with estimation of the daughter marginal parameters (Table 5) indicates non-stationary behaviour in the chain. However, despite the risk of inaccurate marginal estimation resulting from this non-convergent behaviour, stationarity in the Markov traceplots presented in Figure 3 is observed in both data sets for all copulas. In addition, plotting the likelihood function (20) for varying $\alpha$ shows that the algorithm maximises the likelihood well in all cases. The MCMC estimate consistently lies close to the ML estimate, with the ML estimate always lying within the estimated MCMC parameter distribution.

Goodness-of-Fit
Various methods for assessing the goodness-of-fit of copula models are discussed in the literature. A thorough overview and comparison of the performance of blanket-type goodness-of-fit tests, which can be applied to all copula structures, is presented in Genest et al. (2009). Given the right-censoring of the data set of this study, an empirical copula capturing censoring in the data is, however, required. As such, aligning with the observation of the survival time of all policyholders, we implement the non-parametric copula proposed by Gribkova and Lopez (2015) for the case where censoring acts on one of two variables. This gives a semiparametric estimator of the copula function $C(F_{\tau_{x_1}}(t_1), F_{\tau_{x_2}}(t_2))$ in which $W_{jn}$ is a random weight reflecting the jump of the Kaplan-Meier estimator of the distribution function of the remaining lifetime $\tau_{x_2}$, incorporated to account for right censoring, and $F^{-1}_{x_k}(u_k; \hat{\theta}_k)$ is a Gompertz realisation of the remaining lifetime of $(x_k^j)$, i.e., $t_k^j$. The estimator is consistent, and so converges in probability to the true distribution as the sample size tends to infinity, if the censoring point $c_2$ is independent of the remaining lifetime

Conclusions
In this paper, copula dependence parameters were estimated for five different relationships within Egyptian families, using data from the Egyptian social pension scheme. MCMC techniques with likelihood specified using IFM were implemented and compared with classical MLE. Copula dependence parameters were found to be low in comparison to those in the literature for all of the five relationships considered. However, the corresponding Kendall's tau correlation estimates imply that dependence in this data set and in this socioeconomic context should not be ignored when pricing the associated pension products. Dependence is greatest among child-parent relationships, with non-negligible correlation estimates of between 0.3 and 0.4. The dependence between husband and wife is lower than that of child-parent, with parent-child relationships exhibiting the lowest levels of dependence. Goodness-of-fit testing is highly time-inefficient under the selected semiparametric copula estimator. Future research will involve selecting a more appropriate estimator to sufficiently test the accuracy of the estimation for all sample sizes.
The results presented in this paper cannot be compared with those of previous studies for all samples, due to the absence of research into the dependence between the lifetimes of varying family members. However, in the husband and wife case, the Canadian insurance data largely considered in previous studies exhibits higher levels of dependence than the Egyptian sample. Dependence is also of less significance here than in the Ghanaian data set of Henshaw et al. (2020). Socioeconomic influences on dependence and the characteristics specific to Egypt introduced in Sections 1 and 2 likely contribute to this observed difference.
Furthermore, joint life data, such as the joint and last-survivor annuity data of the Canadian insurer, consists of lifetime data for individuals who specifically sought a joint life policy. In contrast to the compulsory nature of the Egyptian pension scheme, this optional participation in such a policy over a single life policy implies the existence of a relationship (and hence dependence) between the policyholders, which may align with the increased dependence observed in the data set. This supports the findings of Sanders and Melenberg (2016), where a reduced significance of dependence and the associated pricing impacts is observed among married couples under analysis of census data. Since the Egyptian pension scheme is compulsory for all working individuals, the data also spans all social classes. Although this cannot be considered in detail here, given the accessible data, it may further impact the strength of lifetime dependence, and is an interesting area for further study.

Figure 1. Age difference and survival time distributions for all samples.

Figure 2. Comparison of Kaplan-Meier marginal distribution functions with 95% confidence intervals (black) and Gompertz marginal distribution functions with MCMC marginal parameter estimates as in Table 5 (red).

Figure 3. MCMC posterior density, accepted parameter ($\alpha$) traceplots and likelihood function for estimation of the Clayton, Frank, Gumbel and Joe dependence parameters. Results for (H,W) and (F,D), given in rows 1-3 and 4-6, respectively. MCMC estimates given by blue solid line, and MLE estimates by red dashed line.

Table 1. Descriptive statistics of the male pensioners.

Table 2. Descriptive statistics for age at entry (Entry) and age at death (Death) for each of the five samples. "Death *" gives the descriptive statistics for policyholders whose beneficiaries have also died.

Table 3. Empirical dependence measures for each of the five samples, split by age difference $d$.

Table 4. Empirical dependence measures for the husband and wife sample, split by the sex of the elder spouse, where $X_h$ represents the lifetime of the husband and $X_w$ the lifetime of the wife.

Table 5. Marginal distribution parameter estimation results for all five data sets. MCMC: estimate, acceptance rate, standard deviation (SD), integrated autocorrelation score (IAT) and standard error (SE); MLE: estimate, SE.

Table 6. Copula dependence parameter estimation results for all five data sets. MCMC: estimate, acceptance rate, standard deviation (SD), integrated autocorrelation score (IAT) and standard error (SE); MLE: estimate, SE.

Table 7. Kendall's tau correlation coefficient corresponding to MCMC $\alpha$ dependence parameter estimates.