Temporal Clustering of the Causes of Death for Mortality Modelling

Actuaries utilize demographic features such as mortality and longevity rates for pricing, valuation, and reserving life insurance and pension contracts. Capturing accurate mortality estimates requires factual mortality assumptions in mortality models. However, the dynamic and uncertain nature of mortality improvements and deteriorations necessitates better approaches to tracking mortality changes, for instance, using cause-of-death features. This paper aims to determine temporal homogeneous clusters using unsupervised learning: a clustering approach that groups causes of death based on (dis)similarity measures to set representative clusters for detecting and monitoring death trends. The causes of death dataset was derived from the World Health Organization Global Health Estimates for males and females in Kenya, from 2000 to 2019. A hierarchical agglomerative clustering technique was implemented with modified Dynamic Time Warping distance criteria. Between 6 and 14 clusters were optimally achieved for both males and females. Using visualisations, principal clusters were detected. Over time, the causes of death trends of these clusters have demonstrated a correlated association with mortality and longevity rates, rationalizing why insurance and pension offices may include this approach as a preliminary step to undertake mortality and longevity modelling.


Introduction
Mortality data associated with information such as cause of death is helpful, not only in the medical field (Foreman et al. 2012), but also in insurance and pension funds (Cox 1976). Actuaries in life insurance companies and pension funds analyze mortality and longevity risks using death and survival data to evaluate pricing, valuation, and reserving life insurance products. According to Ashley et al. (2019), the patterns and frequency of causes of death can be a leading indicator of insurance claims. Studies by Kwon and Nguyen (2019) using data from the United States and South Korea have demonstrated that improvements in mortality should be tracked and monitored. Furthermore, the United States population observation report (Holman and MacDonald 2021) concluded by clarifying the importance of considering the prevalence of the causes of death in mortality improvement assumptions for insurance to track mortality trends. Therefore, incorporating causes of death features can be beneficial in monitoring mortality changes over time to explain the specific drivers of increased insurance claims.
Mortality and longevity models form the core of actuarial work in tracking the mortality and survival of policyholders. Life insurance companies face increased death claims due to higher-than-expected mortality experiences, while pension funds are negatively affected by increased longevity rates. Correct estimation of these models is of vital interest to insurance and pension firms because it directly impacts profit or loss. Various models that incorporate data on the causes of death have been employed to model mortality (Caselli et al. 2019; Tabeau et al. 1999; McNown and Rogers 1992). Arnold and Sherris (2013) have shown that models that incorporate causes of death improve the assessment of mortality and longevity risks. These models are in contrast to those reviewed by Booth and Tickle (2008), which extrapolate historical trends to predict mortality. Cause of death approaches have been the alternative to extrapolation models (Janssen 2018) and are being considered due to their perspectives on the underlying process of aggregate mortality (Robertson et al. 2013; Olshansky et al. 2002).
However, modeling mortality by cause faces critical challenges. Firstly, cause-of-death data are non-stationary. That is, their means and variances change continuously, rendering them more difficult to model than stationary series. Secondly, cause-of-death models rely on the assumption of independence (Chiang 1968), whereas in practice one cause of death influences another. These issues have led to the development of newer approaches to cause-of-death modeling, referred to as co-integration analyses (Arnold and Glushko 2021; Gaille and Sherris 2011), where econometric approaches such as Vector Error Correction Models (VECM) are applied to overcome the independence assumption among the causes of death by identifying co-integrated relationships among the variables over the short and long term. Other approaches include copula-type models, which incorporate the dependence relationship among cause-specific rates, as described by Li and Lu (2018). Such methods benefit from a preliminary understanding of death trends and of the relationships among the causes of death before usage; we aim to fill this gap.
Causes of death data are dynamic and unique to the country of origin. The adoption of the International Classification of Diseases (ICD) by the World Health Organization (WHO) created standardization in classifying causes of death globally. Newer causes of death, such as COVID-19, as described by Shaylika (2020), have emerged, while others, such as smallpox, have been declared eradicated by the World Health Organization (Meyer et al. 2020). Furthermore, HIV/AIDS (Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome) has been a significant contributor to deaths in Kenya, although it is on a declining trend. In 2022, countries are required to implement the ICD 11 framework according to Medicare Centers for Medicaid Services and National Center for Health Statistics (2019). Most developing countries, however, do not participate in the ICD framework, which perpetuates their disadvantage. Therefore, there is a need for such countries to have a reliable framework for accounting for causes of death in mortality models.
Besides exclusion from key reporting frameworks such as ICD, most developing countries lack coherent approaches to mortality modeling, mainly because of unreliable data (Arnold and Sherris 2015). These countries have insufficient historical data, a key input in mortality models, especially for extrapolation techniques. Given this, cause-of-death models would seem relevant and suitable in the short run, regardless of the newer modeling approaches. As a preliminary strategy to undertaking cause-of-death modeling, this research offers a complementary approach based on an exploratory clustering technique, in order to understand the dynamics of trending causes of death in terms of homogeneous clusters.
Therefore, the aim of this paper is to examine how data on deaths, specifically the causes of death, influence the trend of aggregate mortality rates over time, to aid methodical detection, quantification, and monitoring of the causes of death where a standardized classification, sufficient data, and modeling frameworks are nonexistent. For these reasons, an exploratory approach will be employed to identify and gauge the temporal causes of death in Kenya and to analyze the fluctuations in the various causes of death. Eventually, the causes of death will be clustered into representative groups using a temporal clustering technique, which is an unsupervised learning algorithm. These groups may then be used as informational benchmarks for mortality modeling by analyzing their trending structures.
Risks 2022, 10, 99
The contributions of this paper are:
• The addition of a clustering approach of the causes of death that allows for temporality. This gap is essential because it would enable actuaries to incorporate causes of death features in their judgment for future mortality experience.
• Applying the causes of death features in a developing country setting to expand mortality modeling literature in such jurisdictions.

Clustering
Clustering is a machine learning technique, as pointed out by Richman (2018). It is categorized as unsupervised learning because it uses traits within the data to detect and classify key observations into similar groupings using set criteria (Han et al. 2011).
There are five main types of clustering: partitioning, hierarchical, density-based, grid-based, and model-based techniques (Charrad et al. 2019). Partition (pam) clustering algorithms are further subdivided into hard (Crisp) and soft (Fuzzy) clustering. In the case of hard clustering, observations belong to just one cluster. Examples of hard clustering include: K-means, K-medoids, and Clustering Large Applications (CLARA) algorithms. In the case of soft clustering methods, data points can belong to any cluster with a level of likelihood, for instance, the fanny clustering approach described by Gan and Valdez (2020).
Clustering techniques have been incorporated into many fields such as biology, finance, agriculture, and Geographic Information Systems (GIS) (Lamb et al. 2020). In actuarial applications, Yao (2016) applied clustering in non-life insurance in the ratemaking of car insurance by explaining the general approach in territory clustering. In valuing life insurance products such as variable annuity contracts, Gan and Huang (2017), as well as Gan and Valdez (2016), selected representative policies using clusters to build predictive models. O'Hagan and Ferrari (2017) applied clustering in actuarial science as a data compression procedure where complex assets and liabilities were divided into several clusters, each represented by a single policy. These policies were subsequently used to model the performance of policy portfolios.

DTW Barycenter Averaging-DBA
According to Charrad et al. (2019), there are over 30 clustering algorithms; however, the best option depends on the type of the dataset, the clustering goal, and the compression level. Conventional clustering approaches do not perform well in the presence of moving objects relative to time. In the case of time series data, static clustering methods ignore the similarity of subsequent series, which may be utilized to compare objects more effectively (Guijo-Rubio et al. 2020). This shortcoming calls for a suitable model when dealing with time-series data.
According to Aghabozorgi et al. (2015), clustering applications in the field of timestamped data are based on sequential data measurements taken across a period from the same source and are used to track change over time, e.g., Dynamic Time Warping, DTW (Lee et al. 2020; Sakoe 1971). This approach tracks the evolution of data over time, creates clusters that follow observations through time, and forms clusters based on the (dis)similarity distance measurement relevant to the given time series. It computes a dynamic distance by analyzing two sequences and obtaining an optimal warping path between them, while adhering to specific criteria such as monotonicity (Sard 2019). DTW has been used to overcome some of the drawbacks of the standard Euclidean and Manhattan distances shown in Table 1. That is, it accommodates the dynamic evolution of data points with time. Time-series clustering methods have evolved over the years, aiming to minimize the computational cost and improve accuracy. However, the classic DTW has remained effective (Wang et al. 2013). DTW has been implemented on many fronts, such as water quality monitoring in hydrology (Lee et al. 2020), gene expression in bioinformatics (Aach and Church 2001), and finance (Tsinaslanidis et al. 2014).
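To make the warping-path idea concrete, the following is a minimal sketch of the classic DTW distance computed by dynamic programming. It is an illustration only: the function name `dtw_distance` and the absolute-difference local cost are assumptions made here, and no warping window or other modification used in the paper is included.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |a_i - b_j|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Monotonic steps only: advance in a, advance in b, or advance in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because one point may be matched to several points of the other series, two series that differ only by a repeated value are at distance zero, which a lock-step distance cannot express when the lengths differ.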

Table 1. Distance criteria (columns: Distance Criteria, Description, Reference). References: (Aghabozorgi et al. 2015; Sard 2019; Zhao and Itti 2018; Sakoe 1971).
This paper extends the DTW methodology by incorporating a prototype function known as DTW Barycenter Averaging (DBA), which repeatedly refines an average sequence to minimize its squared distance from the original sequences (Petitjean et al. 2011). Furthermore, its evaluation has been shown to compare favorably with other prototyping functions in the literature (Soheily-Khah et al. 2015; Zhao and Itti 2018). It is a suitable prototyping function that complements the centroid linkage criteria to capture the overall mean of the centroids over time.
The paper is outlined as follows. Section 2 will describe the source and elements of the dataset and the methods implemented. The results and discussion will be presented in Section 3 together with their interpretations, which will outline the implications of the results based on the research question. The conclusion and future extensions will be presented in Section 4.
The universal set of 131 causes of death will be denoted by $C$, and $S$ will be used to represent the gender: male and female. The years of interest will be 2000 to 2019, denoted by $T$. Two sets of age brackets will be used, $20 \le x < 60$ and $x \ge 60$. The ages of 20 to 60 will be key in monitoring mortality risk and include the minimum legal age to be eligible for life insurance in many countries. Additionally, individuals over the age of 60 have repercussions on pensions and are often retirees affected by longevity risk.
The fundamental quantity of interest will be the death rate $m_{x,s,c,t}$, which is the ratio of deaths to the mid-year population for each age ($x$), sex ($s$), cause ($c$), and year ($t$), given by $m_{x,s,c,t} = d_{x,s,c,t} / P_{x,s,t}$. Using death rates rather than the number of deaths enables the time differential to be factored into the clusters.
Before clustering, the deaths data will be transformed by scaling to reduce the variability of the magnitude of death rates generated by leading causes of deaths and low-tiered causes. Using hierarchical clustering and visualizations, the research questions will be answered from the data by exploring the trends and patterns of the leading causes of death based on gender {male, female}, age {20 ≤ x < 60 and x ≥ 60}, and year (period) {2000-2019}, grouped annually.
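As a small illustration of the two preprocessing steps described above, the sketch below computes cause-specific death rates and then rescales each series. The text does not state which scaling was used, so z-normalization (zero mean, unit standard deviation per series) is an assumption here, as are the function names.

```python
from statistics import mean, pstdev


def death_rates(deaths, population):
    """Death rate per year: deaths divided by the mid-year population."""
    return [d / p for d, p in zip(deaths, population)]


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation.

    Assumption: z-normalization stands in for the unspecified scaling step.
    A constant series is mapped to zeros to avoid dividing by zero.
    """
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(value - mu) / sigma for value in series]
```

After such scaling, a leading cause and a low-tiered cause with the same shape over 2000 to 2019 produce similar series, so the clustering compares trends rather than magnitudes.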

Notations
A clustering set is denoted as a set of collections, also called a power set, such that [...]. Let $C_i$ represent the $i$th cluster while $c_i$ stands for the center of cluster $C_i$, and let $C_j$ represent the $j$th cluster while $c_j$ stands for the center of cluster $C_j$. Further, $n_i$ and $n_j$ are the cardinalities of clusters $C_i$ and $C_j$, denoted $|C_i|$ and $|C_j|$, respectively. The distance $d(x_i, y_i)$ is the distance between the objects $x_i$ and $y_i$ in cluster $C_i$, and $d(x_j, y_j)$ is the distance between the objects $x_j$ and $y_j$ in cluster $C_j$.

Clustering Tendency
Before performing the cluster analysis, the clusterability of the data was assessed using the Hopkins statistic (Lawson and Jurs 1990) for each age group. The existence of clusters in the dataset was determined by measuring the probability that the data come from a uniform distribution. A value of about 0.5 indicates that the data are uniformly distributed. Values less than 0.5 and closer to zero indicate non-clusterable data. Accordingly, the aim is to achieve an H value closer to one. It is expressed as below:
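For orientation, one commonly used form of the Hopkins statistic is sketched here. It compares nearest-neighbour distances from uniformly generated points (u) with nearest-neighbour distances among sampled data points (w), giving H = sum(u) / (sum(u) + sum(w)), so that H near one suggests clustered data. This form, the bounding-box sampling, the Euclidean distance, and the function name `hopkins` are assumptions of this sketch and may differ from the exact expression used in the paper.

```python
import math
import random


def hopkins(points, m, seed=0):
    """Hopkins statistic H = sum(u) / (sum(u) + sum(w)) for a list of points.

    u: distance from each of m uniform random points (drawn inside the
       bounding box of the data) to its nearest data point.
    w: distance from each of m sampled data points to its nearest other
       data point.
    """
    rng = random.Random(seed)
    dims = len(points[0])
    lows = [min(p[k] for p in points) for k in range(dims)]
    highs = [max(p[k] for p in points) for k in range(dims)]

    def nearest(query, exclude=None):
        return min(math.dist(query, p) for idx, p in enumerate(points) if idx != exclude)

    u = [nearest([rng.uniform(lows[k], highs[k]) for k in range(dims)]) for _ in range(m)]
    sampled = rng.sample(range(len(points)), m)
    w = [nearest(points[i], exclude=i) for i in sampled]
    return sum(u) / (sum(u) + sum(w))
```

With two tight, well-separated groups of points, the w distances are tiny relative to the u distances, which pushes H towards one.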

Hierarchical Agglomerative Clustering
A hierarchical agglomerative clustering mechanism was applied to obtain homogeneous groups of the causes of death that are distinct by age, sex, and time. Using the agglomerative (combining) technique, a bottom-up approach, individual causes of death are continually merged into successive clusters of the hierarchy. The causes of death data are classified over the period 2000 to 2019. The results will enable us to select the ideal groups applicable to insurance companies in the long run. These techniques will allow scaling to incorporate newer causes of death and cluster groups. The method involves the following step-wise procedure on the data, as shown by Algorithm 1.

1. Designate all data points as individual single clusters.
3. Combine the clusters using linkage criteria.
4. Update the distance matrix.
5. Iterate the procedure until all data points are merged into a single cluster.
The key parameters of interest are the distance measure criteria and the linkage criteria, presented as follows. Table 1 shows some of the distance criteria used in the literature. We will implement the DTW distance with a DBA prototype function. Table 2 gives some of the linkage criteria described by Gan et al. (2007) and Lance and Williams (1967). The centroid linkage criterion will be implemented because our results will require averaged centroid extractions.
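The bottom-up procedure can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: it uses a plain DTW distance and average linkage on the precomputed DTW matrix instead of the centroid linkage with DBA prototypes described above, and all function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerative(series, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Assumption: average linkage (mean pairwise DTW distance) replaces the
    centroid linkage used in the paper, to keep the sketch short.
    """
    # Start with every series in its own cluster.
    clusters = [[i] for i in range(len(series))]
    # Pairwise DTW distances between the original series.
    dist = [[dtw_distance(s, t) for t in series] for s in series]

    def linkage(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        # Merge cluster q into cluster p; between-cluster distances are
        # recomputed from the base matrix on the next pass.
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

Stopping at `n_clusters=1` reproduces the full hierarchy; cutting earlier corresponds to selecting a number of clusters with the validity indices discussed later.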

Stepwise Procedure for DTW Barycenter Averaging (DBA)
In conjunction with the DTW distance, individual cause-of-death sequences may be modeled with underlying means. DBA is an iterative process that commences by randomly selecting one of the series in the data as a reference (centroid). Subsequently, it computes the DTW alignment between each series in the cluster and the centroid series. Each centroid point is then updated to the average of the values aligned to it, and the procedure is repeated until the centroid converges or a specified number of iterations is reached.
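The averaging steps described above can be illustrated with a compact sketch. The names `dtw_path` and `dba` are hypothetical, the squared difference is assumed as the local cost, and the first series is used as the initial centroid (instead of a random one) so that the example is deterministic.

```python
def dtw_path(a, b):
    """Return the optimal DTW alignment as a list of index pairs (i, j)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def dba(series, n_iter=10, init=None):
    """DTW Barycenter Averaging: refine a centroid by averaging aligned values."""
    centroid = list(series[0] if init is None else init)
    for _ in range(n_iter):
        # buckets[k] collects every value aligned to centroid point k.
        buckets = [[] for _ in centroid]
        for s in series:
            for ci, si in dtw_path(centroid, s):
                buckets[ci].append(s[si])
        updated = [sum(b) / len(b) for b in buckets]
        if updated == centroid:  # no change: converged
            break
        centroid = updated
    return centroid
```

Every centroid point is aligned to at least one point of each series, so no bucket is empty and the average is always defined.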

Table 2. Linkage criteria (columns: Linkage Criteria, Equation, Reference). Recovered descriptions, in row order beginning with the Single linkage:
• Minimum pair distance between points in clusters i and j
• Average pair distance between points in clusters i and j
• Maximum pair distance between points in clusters i and j
• Pair distance between cluster centroid i (mean vector of length p features) and cluster centroid j
• Euclidean distance between weighted centroids of the two clusters
• Weighted mean of the between-cluster dissimilarities between the points in clusters i and j
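For reference, the first four descriptions correspond to the standard textbook definitions of the single, average, complete, and centroid linkages. The block below writes them in the notation of the Notations section, with $D(C_i, C_j)$ introduced here as an assumed symbol for the between-cluster distance; these are the usual forms and are not recovered from the paper's own table.

```latex
\begin{align}
D_{\text{single}}(C_i, C_j)   &= \min_{x \in C_i,\; y \in C_j} d(x, y) \\
D_{\text{average}}(C_i, C_j)  &= \frac{1}{n_i n_j} \sum_{x \in C_i} \sum_{y \in C_j} d(x, y) \\
D_{\text{complete}}(C_i, C_j) &= \max_{x \in C_i,\; y \in C_j} d(x, y) \\
D_{\text{centroid}}(C_i, C_j) &= d(c_i, c_j)
\end{align}
```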

Cluster Validity
Cluster validity is the process of determining the optimal number of clusters in a dataset and subsequently assessing the quality of the resultant clusters. Cluster validity indices (CVIs) fall into three categories: internal, external, and relative (Gan et al. 2007). External measures compare the resulting partition to the correct one, whereas internal measures analyze the partitioned data and measure cluster purity. External CVIs are therefore valid only if the ground truth is known. A heuristic approach will be preferred when selecting the optimal number of clusters, using a set of internal cluster validity indices and visualization techniques. The majority of internal indices calculate a quality measure by combining cluster cohesiveness (within-cluster or intra-variance) and cluster separation (between-cluster or inter-variance) (Arbelaitz et al. 2013). They are implemented by the dtwclust and tsclust packages in R, by Sard (2019) and Montero and Vilar (2015), respectively. The seven cluster validity indices and their objective criteria, as described by Wang and Zhang (2007), are shown in Table 3. Both the maximizing and minimizing cluster validation functions are implemented with the aim of enhancing the optimality of the achieved clusters.
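As an example of how an internal index combines cohesion and separation, the sketch below computes the Silhouette index from a precomputed distance matrix (which could hold DTW distances) and a list of cluster labels. The function name and the convention of scoring singleton clusters as zero are assumptions of this sketch; it is not the dtwclust implementation.

```python
def silhouette_index(dist, labels):
    """Mean silhouette width; higher is better. Requires at least two clusters.

    dist:   square matrix of pairwise distances between observations.
    labels: cluster label of each observation.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: cohesion, the mean distance to the other members of the own cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: separation, the smallest mean distance to any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)
```

Evaluating the index for each candidate number of clusters and keeping the maximum mirrors the "maximize" objective criterion; indices with a "minimize" criterion are used in the same way with the smallest value.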

Cluster Elimination Approach
To present the applicability of the proposed clustering methodology in life insurance and pensions, a cause elimination approach will be adopted. This approach is based on the multiple decrement model under competing risks (Chiang 1968). It has previously been applied in studies by Kwon and Nguyen (2019), Li et al. (2019), Kaishev et al. (2007), and Alai et al. (2015). This approach will be extended to clusters assuming independence of the clusters holds.
Let the probability of dying due to cluster $c$ be $q^{c}_{x,t}$. The mortality adjusted as a result of eliminating cluster $c$ for the age group $x$ in the year $t$ is represented by $q^{*(c)}_{x,t}$, where $\varphi$ represents the mortality change factor such that $\varphi \in \mathbb{Q}$, bounded by $-1 < \varphi < 1$. This factor represents the improvement or deterioration of mortality relative to the expected mortality. If $\varphi$ is negative or positive, the modified mortality will increase or decrease, respectively. Under the assumption of independence among the causes of death, the extra mortality resulting from the cluster elimination will be redistributed to the remaining clusters using proportional weights, as explained by Alai et al. (2015). Furthermore, the central death rates $m_x$ derived from the population data will be transformed to annualized probabilities $q_x$, consistent with life tables, by applying the following transformation formula: [...] be used interchangeably. The following quantities will be derived and computed from the dataset: [...]
Hypothetical temporary life assurance and life annuity products payable in arrears will be developed for males and females aged 20 and 60 years based on the Dickson et al. (2019) approach. The respective Actuarial Present Values (APV) will be obtained from Equations (4) and (5) for n = 0, 1, ..., 9. In line with the prevailing government bond yields, an effective interest rate of 13% per annum will be applied. This rate reflects the interest rate risk for Kenya. The 10-year window assumes that historical trends will continue in line with current mortality rates. The behavior of the achieved clusters with respect to overall mortality will be monitored based on different values of the mortality shock: 0, ±5%, ±10%, and ±15%. This scenario-based approach is also motivated by the ±10% mortality shock recommended by legislation in Kenya (Insurance Regulatory Authority 2017).
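The valuation step can be illustrated with a short sketch under stated assumptions. The conversion q = m / (1 + m/2) assumes a uniform distribution of deaths within the year and may not be the paper's transformation; `apply_shock` scales the whole mortality table by (1 - φ), which follows the sign convention above but is not the cluster elimination formula; and the APV functions use the standard textbook definitions for a temporary assurance paid at the end of the year of death and a temporary annuity paid in arrears. All function names are hypothetical.

```python
def central_rate_to_probability(m):
    """Convert a central death rate to an annual death probability.

    Assumption: deaths are uniformly distributed within each year.
    """
    return m / (1 + 0.5 * m)


def apply_shock(q, phi):
    """Scale death probabilities by (1 - phi): positive phi lowers mortality."""
    return [qk * (1 - phi) for qk in q]


def term_assurance_apv(q, i):
    """APV of 1 paid at the end of the year of death within len(q) years."""
    v = 1 / (1 + i)
    survival = 1.0  # probability of being alive at the start of year k
    total = 0.0
    for k, qk in enumerate(q):
        total += v ** (k + 1) * survival * qk
        survival *= 1 - qk
    return total


def annuity_arrears_apv(q, i):
    """APV of 1 per year paid at the end of each year survived, for len(q) years."""
    v = 1 / (1 + i)
    survival = 1.0
    total = 0.0
    for k, qk in enumerate(q):
        survival *= 1 - qk  # probability of surviving to the end of year k
        total += v ** (k + 1) * survival
    return total
```

For example, `term_assurance_apv(apply_shock(q, 0.10), 0.13)` values the assurance under a 10% mortality improvement at the 13% rate; lower mortality reduces the assurance value and raises the annuity value.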
The influence of eliminating a cluster in each age group and gender will be quantified and assessed in conjunction with the derived Actuarial Present Values (APV) and assumption rates using visualization techniques.
Table 3. Cluster Validity Index (Internal).
Columns: Index, Description, Objective Criteria. Recovered entries: Silhouette (Sil); Minimum; Score Function (SF).

Cluster Tendency
Based on Table 4, the clustering tendency of the dataset is set out by the Hopkins statistic. For both male and female age groups, the values are closer to one, indicating the existence of clusters in the data for all age groups. This finding shows that the data is clusterable and appropriate for clustering.
Table 5 shows the optimal cluster results based on age for males and females. More clusters emerged for younger males and females than for their older counterparts. This finding is consistent with lower life expectancies in developing countries due to more causes of death at younger ages (Roser et al. 2013).

Cluster Validity Indices
This section presents the results of seven cluster validity indices ranked either by maximizing or minimizing their objective functions (refer to Table 3). The best cluster is selected from the highest and lowest ranking indices depending on the criteria of the objective function, which is either maximization or minimization. Tables 6-9 show the aggregate ranking based on all the objective functions for a given cluster. Because this is an iterative and parameterized approach, the range of the cluster limits was set between 2 and 15, which was also the default in the tsclust package algorithm.
Visually, Figures 1 and 2 represent the outcomes, with the black vertical broken line signifying the optimal cluster points. They show that some validation indices performed abnormally. For instance, the Score Function (SF) chose 2 clusters as the best cluster representation for all ages and genders, but this was not an optimum choice compared to the rest of the indices. The probable reason might be that the Score Function index works well with hyper-spheroidal data structures and not time series, as Saitta et al. (2007) investigated.

Comparison of the Dynamic Time Warping-DBA with the Euclidean (l2 Norm) and the Manhattan (l1 Norm) Distance Metrics
This section presents the results of the proposed modified DTW approach in comparison to the Euclidean (l2 norm) and the Manhattan (l1 norm) distance metrics, referred to in Table 1, based on age, gender, and clusters. The comparisons are represented in Figures 3-6. Generally, the DTW has shown superior performance based on the seven validity indices outlined in Table 3. For males aged 20 to 60 and over 60 years, six out of the seven indices identified DTW as the best distance metric. Similarly, for females aged 20 to 60, five out of the seven indices supported the DTW. However, DTW and the Euclidean distance jointly lead with three out of seven indices each among females aged over 60, with Manhattan scoring best only under the Dunn index.
Despite the results among older females, these results suggest that the DTW distance metric is the best performing metric and is suitable for detecting optimal clusters in temporal datasets. Studies by Bartkowiak et al. (2018) have confirmed that the performance accuracy of DTW measures on smaller datasets is better than that of the lock-step measures, which include both the Euclidean and Manhattan distance criteria, because of the dilating alignments of the warping window with time. Furthermore, Cassisi et al. (2012) demonstrated that the Euclidean distance was limited because it could only compare observations with similar lengths, unlike DTW, which could incorporate varying series lengths. DTW overcomes the one-to-one comparison by allowing many-to-one comparisons. This shows that DTW accepts various alignments of the series because it is less sensitive to non-uniform amplitude scaling and captures structural distortions among non-linear datasets.

Centroid Cluster Extraction Results
The centroid extractions and the comparative column charts were also obtained for each age, gender, time, and cause and visually inspected. In general, the causes detected in these clusters have shown trending structures based on co-movement that exists among causes: trending upwards (increasing), trending downwards (declining), outliers, and insignificant ones. For the complete cluster member list, see Tables A2-A5.

Females Aged 20 to 60
Figure 7 illustrates extracted clusters in younger females. Cluster 4 represents upward trending causes, while clusters 1 and 5 are declining. Clusters 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, and 14 are outliers. Cluster 3 represents the insignificant causes. Figure 8 displays the average death rates of the causes of death in cluster 1 among females aged 20 to 60 years partitioned in 2000, 2010, and 2019. The average death rates are generally declining over time. HIV/AIDS is shown to experience the most significant reduction in causing deaths compared to other causes. Maternal conditions, tetanus, stroke, meningitis, lower respiratory infections, diarrheal diseases, and cirrhosis of the liver are also shown to be fairly significant.
This result demonstrates that HIV/AIDS is on a declining trend among females aged 20 to 60 years, as shown in cluster 1. The other declining causes of death that form cluster 1 in females aged 20 to 60 are also detected and can be quantified based on their decreasing rates in the short run. Figure 9 also represents the declining trend of the average death rates of the causes in cluster 5 among females aged 20 to 60 years. Tuberculosis, cervix uteri cancer, stomach cancer, and larynx cancer have experienced a general decline despite a trend break, meaning that their rates did not decline consistently from 2000 to 2019, as shown. This result illustrates one of the shortcomings of using this approach in monitoring one-directional trends.
Figure 10 represents the average death rates of the causes of death in cluster 4 among females aged 20 to 60 years partitioned in 2000, 2010, and 2019. Notably, all the increasing causes in this cluster are cancers. Breast and esophagus cancers have significantly increased, while thyroid cancer has increased the least. This result implies that cancers are increasingly the leading cause of death among females aged 20 to 60. Similar studies such as Mahase (2019) suggest that cancer will be the most prevalent cause of death not only in high-income countries but also globally. Furthermore, Hamdi et al. (2021) have specifically identified esophagus cancer as the leading cancer cause of death in Kenya and its region, as confirmed by these results.

Females Aged over 60
Figure 11 represents clusters for older females. Cluster 2 is trending upwards while cluster 1 is trending downwards. Clusters 3, 5, 6, 7, 8, 9, 10, and 11 are outliers, while cluster 4 is insignificant. Figure 12 shows the average death rates of the causes of death in cluster 1 among females aged over 60 years partitioned in 2000, 2010, and 2019. Similar to females aged 20 to 60, the most significant decline is seen in HIV/AIDS. Diarrheal diseases, stroke, protein-energy malnutrition, meningitis, cirrhosis of the liver, chronic obstructive pulmonary disease, and asthma have shown slower declines. This result suggests that declining death rates from most of these causes are the leading reason for increased life expectancy.
These findings imply a greater longevity risk because the rate of deaths associated with this age set is slowing. Specifically, studies have confirmed that stroke deaths decline more among older individuals than younger ones (Aparicio et al. 2019). A notable finding for tuberculosis and stroke deaths among females aged over 60 indicates an inconsistent decline or misclassification that needs further investigation. On the other hand, Figure 13 shows the increasing average death rate due to cluster 2 among those aged over 60. Compared to females aged 20 to 60, most increases are not linked to cancer, implying that cancer is either a new cause of death among this age group or that most surviving females recovered from or were not diagnosed with cancer at earlier ages. However, lower respiratory infections, ischemic heart disease, and hypertensive heart disease associated with cardiovascular diseases (CVD) are increasing, as confirmed by studies by Roth et al. (2015) in lower and middle-income regions. Road injury, falls, diabetes mellitus, Alzheimer's disease and other dementias, gall bladder and biliary diseases, breast cancer, cervix uteri cancer, and esophagus and stomach cancers have shown steady increases. Kidney diseases, similar to CVDs, have also witnessed increased prevalence.
The implication of this finding for longevity risk will depend on the rates of increase and decrease of deaths due to these causes. As shown by both the declining and the increasing trends, slower declines and steady increases in deaths will require different approaches at these higher ages, for instance, to resolve the inconsistent findings for stroke and tuberculosis among older females.

Figure 11. Cluster extraction for females aged over 60.

Males Aged 20 to 60
Regarding males aged 20 to 60, cluster 1 represents causes trending downwards, while cluster 4 combines upward-trending and recently declining causes, with a trend change observed in 2010, as shown in Figure 14. Clusters 2, 5, 6, 7, 8, 9, and 10 are outliers, while the causes in cluster 3 are insignificant. Figure 15 displays the average death rates of the causes of death in cluster 1 among males aged 20 to 60 years, partitioned into 2000, 2010, and 2019. As observed in their female counterparts, HIV/AIDS shows the most significant reduction in deaths. Causes of death such as stroke, tetanus, self-harm, ischemic heart disease, lower respiratory infections, diarrheal diseases, diabetes mellitus, and cirrhosis of the liver have also declined significantly, although not to the same extent as HIV/AIDS. Figure 16 shows causes of death that first increased and then decreased, with a break around 2010; from 2010 onwards, these causes have been declining. Tuberculosis, road injury, malaria, interpersonal violence, esophagus cancer, and mouth and oropharynx cancer belong to this group. This age group has experienced an increased number of causes with a trend change. This result calls for additional investigation into these causes.
This result demonstrates that HIV/AIDS is declining among males aged 20 to 60, as shown by cluster 1. The other declining causes of death that form cluster 1 in males aged 20 to 60 are also detected. Self-harm is an external cause of death linked to intentional injuries and is unique to males aged 20 to 60. This shows that the clustering approach can detect such complexities unique to gender. However, this approach has not detected a trend similar to cluster 5 in females aged 20 to 60, probably due to the dynamic nature of the causes. This illustrates one of the shortcomings of this approach in monitoring trends. One remedy is to undertake clustering periodically to reduce the risk of misclassification.

Males Aged over 60
Among males aged over 60, cluster 2 is trending upwards while cluster 1 is declining, as shown in Figure 17. Clusters 3, 5, and 6 are outliers, while cluster 4 is insignificant. Figure 18 shows the average death rates of the causes of death in cluster 1 among males aged over 60 years, partitioned into 2000, 2010, and 2019. Here, the leading cause of death is stroke, which shows a slower rate of decline. Comparing the HIV/AIDS reduction between younger and older males, we note that men over 60 experience slower reductions. This pattern is replicated among the other causes of death in these age groups. This finding suggests that deaths of older men are steady and implies increased survival of men over age 60. In contrast, fewer causes are increasing than declining, as shown in Figure 19. The majority of deaths in this cluster are due to cancer, with the primary cause being prostate cancer. As with females, tuberculosis does not depict a one-directional trend in males aged over 60. This shortcoming appears for both males and females.
Figure 17. Centroid extraction of males aged over 60.

Cause of Death Classification Based on the Proposed Clustering Approach

Trending Upwards
Upward-trending clusters imply a lower longevity risk for older individuals. Conversely, among the young, upward-trending clusters signify increased mortality risk. Insurers need to pay closer attention to such clusters because they are likely to affect mortality profit and loss in the future.

Trending Downwards
On the other hand, declining trends imply a higher longevity risk among the old and a lower mortality risk among the young. For example, HIV/AIDS has a declining trend for males and females in all age groups under cluster 1. For this reason, mortality improvements are expected and portray a higher longevity risk. This mortality improvement led to revisions of life expectancy projections for Kenya (United Nations and Social Affairs 2017). The United Nations uses a probabilistic approach with an HIV/AIDS factor in forecasting life expectancy in Kenya (Raftery et al. 2013). Such models may therefore consider this clustering approach as a preliminary step.
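The upward and downward labels in this section were assigned visually from the cluster plots. As a minimal sketch of how such a label could be made reproducible, the Python snippet below fits a least-squares slope to a cluster centroid and maps its sign, together with the age group, to the risk implication stated in the text. The function names, the tolerance `tol`, and the example centroid values are hypothetical and are not taken from the study.

```python
def ols_slope(values):
    """Least-squares slope of values against time index 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def classify(centroid, age_group, tol=1e-6):
    """Return (trend label, risk implication) for one cluster centroid.

    age_group is either "20 to 60" or "over 60"; tol is an assumed
    threshold below which the slope is treated as flat.
    """
    slope = ols_slope(centroid)
    old = age_group == "over 60"
    if slope > tol:
        return "upward", "lower longevity risk" if old else "higher mortality risk"
    if slope < -tol:
        return "downward", "higher longevity risk" if old else "lower mortality risk"
    return "flat", "no clear signal"


# Hypothetical centroids (death rates per period), for illustration only.
declining_old = classify([5.0, 4.0, 3.0, 2.0], "over 60")
rising_young = classify([1.0, 2.0, 3.0], "20 to 60")
```

A slope rule of this kind would not capture clusters with a trend break, such as the change around 2010 reported for males aged 20 to 60, so it complements rather than replaces visual inspection.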

Outliers
These clusters comprise individual causes that are highly dissimilar to the trending causes, as shown in Table 10. For example, deaths from collective violence and legal intervention, for both males and females, peaked during the 2007/2008 electioneering period, when Kenya witnessed post-election violence. Additionally, breast cancer among males has been higher in Kenya than in the East African region (Sawe et al. 2016). Mesothelioma, a cause of death linked to asbestos usage, is another outlier; according to Delgermaa et al. (2011), it should be anticipated in the coming decades, including in developing countries. All these instances imply that this temporal clustering approach may also capture otherwise unnoticed cases that result in changes in mortality and longevity trends, including mortality shocks.

Insignificant clusters, in contrast, capture causes with negligible deaths or those unrelated to gender or age. For instance, maternal conditions and ovarian cancer, observed in cluster 3 for young males, are insignificant because they only affect females and therefore have zero death rates. However, one drawback of this methodology is observed in cluster 4 for males aged over 60, where several causes are misclassified together with the insignificant causes. Further studies should be conducted to understand the reason for this.

Quantifying the Detected Clusters Based on Cause-Elimination Approach
Figures 20 and 21 represent the actuarial present values (APVs) of a hypothetical annuity for males and females aged over 60, grouped by clusters as formulated in the methods section. The lowest APV is given by cluster 1. This result represents the causes of death that have the most significant impact on longevity risk for males and females aged over 60, that is, the causes that depict declining trends, as shown for each longevity assumption rate. Eliminating cluster 1 contracts the APV significantly because it quantifies future expectations based on the longevity assumption. However, eliminating cluster 6 for males and cluster 11 for females has the least effect. Consequently, the reduction of reserves would be underestimated. Notice that the APV is more affected by the clusters than by the longevity assumption rates, as shown by the slopes, implying the importance of monitoring the causes of death.
Figures 22 and 23 represent the actuarial present values of a hypothetical assurance for males and females aged 20 to 60, grouped by clusters. The highest APV is represented by cluster 1. This is because cluster 1 contains downward-trending causes that reduce the risk of mortality in the future; hence, the removal of this cluster for both males and females results in a higher APV. Conversely, eliminating cluster 4 for females aged 20 to 60 would result in a lower APV. This is attributable to removing the causes with the highest risk of mortality in the future, thereby reducing the APV and consequently the reserves. From this finding, the APV is more affected by the clusters than by the mortality assumption rates.

Figure 20. APV for males aged over 60 clusters against longevity assumption rates.
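To make the cause-elimination idea concrete, the following Python sketch computes annuity and assurance APVs with and without one cluster of causes. It is a textbook-style construction under stated assumptions, not the formulation from the paper's methods section: causes are taken as independent, each cluster contributes a constant annual force of mortality, and the assumption rate is applied as a yearly proportional reduction of the remaining force. All names and numerical values (`forces`, `improvement`, `interest`, the 10-year term) are hypothetical.

```python
import math


def survival_probs(forces, eliminated=(), improvement=0.0, years=10):
    """One-year survival probabilities for each of `years` future years.

    forces: dict of cluster label -> annual force of mortality (hypothetical).
    eliminated: labels of clusters whose deaths are removed (independence assumed).
    improvement: assumed annual proportional reduction of the remaining force.
    """
    mu = sum(v for k, v in forces.items() if k not in eliminated)
    return [math.exp(-mu * (1.0 - improvement) ** t) for t in range(years)]


def annuity_apv(p, interest=0.05):
    """APV of 1 paid at the end of each year survived."""
    v = 1.0 / (1.0 + interest)
    apv, alive = 0.0, 1.0
    for t, p_t in enumerate(p, start=1):
        alive *= p_t                 # probability of surviving t years
        apv += v ** t * alive
    return apv


def assurance_apv(p, interest=0.05):
    """APV of 1 paid at the end of the year of death (term assurance)."""
    v = 1.0 / (1.0 + interest)
    apv, alive = 0.0, 1.0
    for t, p_t in enumerate(p, start=1):
        apv += v ** t * alive * (1.0 - p_t)   # death in year t
        alive *= p_t
    return apv


# Hypothetical cluster-level forces of mortality, for illustration only.
forces = {"cluster_1": 0.020, "cluster_2": 0.010, "cluster_3": 0.001}

base = survival_probs(forces, improvement=0.10)
no_c1 = survival_probs(forces, eliminated={"cluster_1"}, improvement=0.10)

annuity_base, annuity_no_c1 = annuity_apv(base), annuity_apv(no_c1)
assurance_base, assurance_no_c1 = assurance_apv(base), assurance_apv(no_c1)
```

In this simplified construction, removing a cluster lowers the total force of mortality, which raises the annuity APV and lowers the term assurance APV. The cluster ordering reported in the figures follows the paper's own formulation and is not reproduced by this sketch.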

Application of Causes of Death Cluster Results in Actuarial Literature
The partition into the two age brackets of 20 to 60 years and over 60 years reflects the two demographic structures of a population (young and old) and the two age sets that are eligible for insurance and pensions (working and retired) in the majority of jurisdictions. Mortality and longevity risks are usually defined in the context of age and gender. Insurance and pension firms are more concerned with the risk of mortality among the young and the risk of longevity among the old, as mentioned by Brouhns et al. (2002). Actuaries in life insurance and pension companies set mortality change factors, called mortality or longevity assumption rates, based on regulatory frameworks (in our case, 10%), actuarial judgement, and derivation from published tables. The actuary's assumptions of mortality improvement or deterioration are subjective when based on expert opinion and objective when derived by extrapolating historical trends. Therefore, a complementary application of this methodology is sensible for narrowing the gap between these two types of analysis, by using optimal representative clusters to define mortality trends based on these classifications.

Figure 21. APV for females aged over 60 clusters against longevity assumption rates.

Figure 22. APV for males aged 20 to 60 clusters against mortality assumption rates.

Figure 23. APV for females aged 20 to 60 clusters against mortality assumption rates.

Limitations of the Study
One of the study's main limitations is the assumption of independence of the causes of death (Arnold and Glushko 2021; Chiang 1968). In practice, causes of death are correlated and exhibit co-integration tendencies. This study may be extended by incorporating approaches that relax this assumption before clustering.

Conclusions
A temporal clustering approach was used to explore causes of death over 20 years based on age, sex, and period. The study aimed to obtain key clusters in the context of these three features. The hierarchical agglomerative clustering approach was applied using a Dynamic Time Warping distance criterion with a barycenter averaging modification. Objectively, 14 and 11 clusters were obtained among younger and older females, respectively, while 10 and 6 were detected among younger and older males, respectively. Clustering quality was assessed using seven internal cluster validity indices (CVIs).
With respect to age, period, and sex, the causes of death were classified into four types of clusters: upward trending, downward trending, outlier, and insignificant. In combination with other mortality models, this approach may be used to identify trends in causes of death and to monitor the future evolution of mortality and longevity assumption rates for pricing and valuation in insurance and pension offices.
Because the causes of death change over time, it is essential that clustering be undertaken periodically to keep the classifications up to date. As a further study, risk factors that lead to these causes of death, such as alcohol use, smoking status, and obesity, may be incorporated to understand the patterns of these causes of death. Furthermore, the rate of increase or decline of each trend has not been established and could be an area of further study.
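For readers who wish to repeat the clustering periodically, the sketch below illustrates the core idea summarised in these conclusions: pairwise Dynamic Time Warping (DTW) distances between cause-of-death series, followed by agglomerative merging. It is a simplified illustration in plain Python, not the implementation used in the study. It uses classic unconstrained DTW and average linkage (both assumptions), and it omits the barycenter averaging modification and the CVI-based choice of the number of clusters. The series names and values are hypothetical.

```python
import math


def dtw(a, b):
    """Classic DTW distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def agglomerate(series, k):
    """Merge the closest pair of clusters (average linkage) until k remain."""
    names = list(series)
    dist = {(p, q): dtw(series[p], series[q]) for p in names for q in names}
    clusters = [[name] for name in names]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(dist[p, q] for p in clusters[i] for q in clusters[j])
                link = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or link < best[0]:
                    best = (link, i, j)
        _, i, j = best
        merged = clusters.pop(j)     # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Hypothetical death-rate series for four causes, for illustration only.
rates = {
    "cause_a": [9.0, 8.0, 7.0, 6.0, 5.0],
    "cause_b": [10.0, 9.0, 7.0, 6.0, 4.0],
    "cause_c": [1.0, 2.0, 3.0, 4.0, 5.0],
    "cause_d": [1.0, 2.0, 4.0, 5.0, 6.0],
}
clusters = agglomerate(rates, k=2)
```

With these illustrative series, the two declining causes and the two rising causes end up in separate clusters. In practice the series would typically be normalised first so that clusters reflect the shape of the trend rather than the level of the death rate.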

Data Availability Statement:
The data that support the findings of this study are available from https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates/ghe-leading-causes-of-death (accessed on 1 December 2021).
Acknowledgments: Nicholas Bett wishes to thank the African Center of Excellence in Data Science (ACE-DS) for funding this research.

Conflicts of Interest:
The authors declare no conflict of interest.