Assessing COVID-19-Related Excess Mortality Using Multiple Approaches—Italy, 2020–2021

Introduction: Excess mortality (EM) is a valid indicator of COVID-19’s impact on public health. Several studies regarding the estimation of EM have been conducted in Italy, and some of them have shown conflicting values. We focused on three estimation models and compared their results with respect to the same target population, which allowed us to highlight their strengths and limitations. Methods: We selected three estimation models: model 1 (Maruotti et al.) is a Negative-Binomial GLMM with seasonal patterns; model 2 (Dorrucci et al.) is a Negative Binomial GLM epidemiological approach; and model 3 (Scortichini et al.) is a quasi-Poisson GLM time-series approach with temperature distributions. We extended the time windows of the original models until December 2021, computing various EM estimates to allow for comparisons. Results: We compared the results with our benchmark, the ISS-ISTAT official estimates. Model 1 was the most consistent, model 2 was almost identical, and model 3 differed from the two. Model 1 was the most stable towards changes in the baseline years, while model 2 had a lower cross-validation RMSE. Discussion: Presently, an unambiguous explanation of EM in Italy is not possible. We provide a range that we consider sound, given the high variability associated with the use of different models. However, all three models accurately represented the spatiotemporal trends of the pandemic waves in Italy.


Introduction
As of the end of October 2022, over 600 million people worldwide had been diagnosed with SARS-CoV-2 infection, resulting in over 6.5 million reported deaths associated with COVID-19 [1]. WHO data indicate that Italy, the first country to tackle the SARS-CoV-2 epidemic in Europe, was one of the European countries with the highest number of newly reported deaths associated with COVID-19 [1]. While measuring both the direct and indirect impacts of this pandemic on mortality is indisputably crucial for any retrospective and comparative assessment, the most reliable method for achieving this is still debated. Mortality through all causes and excess mortality are the most-used health status-related population indicators for monitoring the impact of any catastrophic event on public health [2]. Excess mortality is defined as the difference between the number of all-cause deaths and the number of expected deaths in the same study period [3]. It includes all effects potentially associated with several factors-often unmeasurable-that could have directly or indirectly affected total mortality. During the first two years of the COVID-19 pandemic, many studies and reports were published worldwide to estimate excess mortality from all causes [4][5][6]. The problem of estimating the expected number of We obtained data on all-cause mortality in the years 2020 and 2021 and comparable (by week or day) data for earlier years (i.e., 2011-2019) from the National Institute of Statistics (Istituto Nazionale di Statistica, ISTAT). Every year, ISTAT releases daily estimates of the number of deaths by single age, sex, and municipality of residence, drawing information from population and taxpayers' registries (from the National Register of Resident Population and the information provided by municipalities). The number of deaths registered daily is updated quite regularly (every month) and made publicly available at https://www.istat.it/it/archivio/240401. We downloaded the data used in this study on 18 May 2022. Since July 2020, ISTAT and ISS have periodically published reports in which simple descriptive estimates of excess mortality are estimated as the difference between the number of deaths observed per month and the expected deaths per month, computed as the average number of deaths during the previous five years (2015-2019) [6]. These estimates were adopted as official statistics for the assessment of excess mortality, and, for this reason, we consider them as the benchmark for this study.

Choice of the Models and Description of the Three Models
We chose three models, each representing commonly used approaches for estimating excess mortality: the models created by Maruotti et al. [24], Dorrucci et al. [25], and Scortichini et al. [8]. Some of the coauthors contributed to the implementation of the first two models that have been applied to the same ISTAT data at different spatial resolutions and implemented using the R-language (scripts are publicly available). Further, the three models can be applied at the national and/or regional levels. All three models generate an estimate of excess mortality using different approaches: both Maruotti [25] model uses the average number of deaths as a baseline and produces an estimate of the excess mortality. The years used for estimating/computing the baseline are also different between the three models: 2011-2019 for Maruotti et al. [24], 2015-2019 for Dorrucci et al. [25], and 2015-2021 for Scortichini et al. [8]. We will refer to the three models as model 1 for Maruotti et al. [24]; model 2 for Dorrucci et al. [25]; and model 3 for Scortichini et al. [8]. The formalization of the three models is as follows.
Model 1: The model is a Negative Binomial GLMM g(E[y|b]) = Xβ + Zb, where g is a logarithmic function that defines the relationship between the mean response µ = E[y|b] and the linear combination of the predictors; y is the n × 1 (n = 260 using the years 2015-2019 as a baseline and n = 468 using the years 2011-2019 as a baseline) response vector containing the weekly number of deaths in the baseline years; b is a Gaussian random effect vector that corresponds to b ∼ N(0, σ_b); X is an n × 16 fixed-effects design matrix containing vectors defining the 8 Fourier basis functions describing the time dynamic; β is a 16-by-1 fixed-effects vector; and Z is an n × 1 annual random-effects design vector.
Model 2: Again, the model is a Negative Binomial GLM g(E(y)) = β 0year + Xβ, where g is a logarithmic function; y is an n × 1 (n = 159) vector containing the weekly (truncated) average number of deaths in the baseline years and the weekly number of deaths in 2020 and 2021; β 0year is the annual intercept; X is a design matrix containing the b-spline basis function describing the time dynamic; and β is the vector of basis coefficients.
Model 3: The model is a quasi-Poisson [26] GLM g(E(y|X)) = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 3 + β 4 x 4 + β 5 x 5 , where g is a logarithmic function; y is an n × 1 (n= 2557 using the years 2015-2019 as a baseline and n = 4018 using the years 2011-2019 as a baseline) vector containing the daily number of deaths in the years under study; x 1 is an n × 1 vector containing the dates; x 2 is an n × 4 matrix containing a cyclic cubic B-spline with three equally spaced knots for the day of the year; x 3 is an n × 1 vector containing the day of the week; x 4 is an n × 6 matrix constrained quadratic B-spline with four equally spaced knots for the days from 1 January 2020 to 31 December 2021 used to define the excess risk in mortality during the COVID-19 outbreak; and x 5 is an n × 25 matrix defined through a distributed lag non-linear model over 0-21 lag days using a cross-basis parameterization of the mean daily temperature. Table 1 shows, in brief, the main characteristics of the models. A linear term corresponding to time to model long-term trends, a cyclic cubic B-spline with three equally spaced knots for the day of the year used to model seasonality, and indicators for day of the week to account for weekly variations in mortality.

Estimate of the number of expected deaths
Mortality baseline estimated over Smooth functions that define a baseline risk accounting for temporal trends and variation in temperature distribution.

Estimate of excess deaths
Difference from the estimated baseline along with 95% prediction intervals. If zero is included in the intervals, no difference from the expected number of deaths is hypothesized.
Difference in the number of deaths in 2020/21 adjusted by seasonality with the number of expected deaths.
The excess risk In mortality during the COVID-19 outbreak defined through a constrained quadratic B-spline with four equally spaced knots.

Strengths of the model
Simple interpretation and very good in-sample fitting performance are obtained for all Italian regions.

Simple interpretation.
Presence of covariates containing the information regarding mean temperatures.

Limitation of the model
Socio-demographic and hospital-related information may improve the accuracy of the estimates and may contribute to explaining heterogeneity across regions. No harvesting effect is considered.
No secular trend estimate.
Low number of cases prevents the full application of the two-stage modelling process in age groups <50 years old.
In this work, we conducted analyses by following the reference articles, maintaining the formalization of the models, the time windows, and the baseline years. Part of the work consisted of extending the time windows until December 2021 and computing the excess mortality estimates using different models on the populations, thereby allowing for comparison. Specifically, we applied the models to the same geographical units, Italian macro-areas (northern Italy, central Italy, and southern Italy/islands), and age groups (0-49; 50-64; 65-79; 80+). Additionally, since the National Health System is organized at the regional level in Italy, we estimated regional models for all age groups together according to the twenty-one regional health systems (more precisely, from North to South, the healthregional systems represented the following regions: Piemonte, Valle d'Aosta, Lombardia, Provincia Autonoma di Bolzano-Alto Adige, Provincia Autonoma di Trento, Veneto, Friuli-Venezia Giulia, Liguria, Emilia-Romagna, Toscana, Umbria, Marche, Lazio, Abruzzo, Molise, Campania, Puglia, Basilicata, Calabria, Sardegna, and Sicilia). We computed the excess mortalities and their confidence intervals for the three models, using different two-time windows to estimate the baseline. We obtained the confidence intervals via bootstrap method for model 1, through parametric point-wise estimation for model 2, and through a Monte-Carlo simulation for model 3. We also calculated the relative excess mortality, called the P-score [27], and discarded the information regarding the confidence intervals. In the following formula, P it = ex it b it , the P-score P it for the i-th region, macro-area, or age group, in the year t-th, is obtained by dividing the excess mortality estimated in the same i-th group and t-th year, ex it , for the baseline itself, b it .
Given the importance of the baseline in estimating excess mortality, we analyzed possible differences due to changes in the baseline period as follows:

Cross-Validation Analysis
We validated models 1 and 2 through cross-validation across pre-pandemic years from 2011 to 2019. The cross-validation procedure consisted of two steps: first, we estimated the two models to compute the excess mortality for each year (year of interest) from 2011 to 2019, using the period 2011-2019 (baseline) and removing the year of interest; second, we evaluated these estimates by computing the Root Mean Square Error (RMSE) [28]. This analysis was performed for Italy without stratifying the data by class age, geographical macro-area, or region. The characteristics of model 3 did not allow for cross-validation since the model included a covariate describing the risk of death from COVID-19.

Excess Mortality Estimates
The number of weekly excess deaths is shown in Figure 1 for the three models: the respective trends were similar for models 1 and 2, highlighting the gap in the observed total mortality during the pandemic years 2020 and 2021, especially in early 2020. Model 3 showed a different temporal pattern of expected deaths with a much less evident distance between the observed total mortality and the baseline, i.e., expected deaths. This was particularly apparent during the peak months. Table 2 illustrates the results regarding the excess number of deaths estimated by the three models. The excess mortality estimates published in the last ISTAT-ISS report, that is, our benchmark, are also shown for comparison. In detail,

Excess Mortality Estimates
The number of weekly excess deaths is shown in Figure 1 for the three models: the respective trends were similar for models 1 and 2, highlighting the gap in the observed total mortality during the pandemic years 2020 and 2021, especially in early 2020. Model 3 showed a different temporal pattern of expected deaths with a much less evident distance between the observed total mortality and the baseline, i.e., expected deaths. This was particularly apparent during the peak months. In 2021, the same pattern as 2020 is followed: model 2 provided estimates overlapping the ISTAT-ISS official figures, whilst both model 1 and model 3 differed from the benchmark; both model 1 and model 3 showed higher estimates compared to the excess deaths for the age classes 0-49, 50-64, and 65-79. For the age class 80+, model 1 shows a lower estimate compared to the benchmark, while the opposite was shown for model 3.
Regarding the relative estimates (Table 2c,d), the larger differences with the benchmark were found with model 3, especially for the 0-49 and 65-79 age groups. The discrepancies among the relative estimates across the models were similar to those observed among the raw estimates of excess mortality.
It should be noted that model 1 provides rigorously evaluated confidence bands without using 2020-2021 data in its baseline estimation. For this reason, it returns wide confidence intervals for the excess mortality estimates that mostly include the ISTAT-ISS estimates; the ISTAT-ISS estimate is included in the confidence interval 28 times. The 12 estimates for which model 1's confidence intervals do not include the ISTAT-ISS estimates correspond to the age groups 0-49 and 65-79. Figure S1 of the supplementary materials compares the ISTAT-ISS and models 1 and 3's relative estimates (p-scores). We did not consider model 2 in the graph since it is the most similar to the ISTAT-ISS results. For each geographical macro-area and age class considered in this study, the difference between the models and the ISTAT-ISS model is represented.
We estimated the values of the same three models at the regional level, producing regional excess mortality estimates. In the supplementary materials (Table S1a,b), we report the results by year and region. All three models' confidence intervals regarding excess deaths estimated included the ISTAT-ISS official estimates. Differences from the benchmark resulted bigger in smaller regions and in regions with lower mortality. The relative excess mortality estimates (Supplementary Table S1c,d) were higher for models 1 and 3 and lower for the ISTAT-ISS report and model 2. The highest relative mortality excesses were found in 2020, mainly in the northern regions (such as Lombardy, Piemonte, Valle D'Aosta, and the Autonomous Provinces of Trento and Bolzano). A visual representation of regional excess mortality has been provided through maps in Figure S2 of the supplementary materials.    Figure 2 shows the total excess mortality estimates for the three models at the national level, using different time windows to compute/estimate the baseline, as explained before. In model 1, the excess mortality estimates were stable for both years regardless of the time window, that is, 2011-2019 or 2015-2019. Whereas the estimates from models 2 and 3 were substantially affected by the choice of the reference period for the baseline estimation. Indeed, with the reference period 2011-2019, the estimates for models 2 and 3 were much higher than those obtained with reference to 2015-2019. The differences found were similar for both years 2020 and 2021.

Discussion
During the first two years of the COVID-19 pandemic, several studies on excess mortality were published (refer to the introduction). The quantification of COVID-19's impact on mortality was one of the most debated topics by the scientific community due to its high sensitivity among members of the public. In relation to the official numbers and the majority of the literature, some studies have produced significantly higher estimates [29,30], while others produced the opposite [13,31]. These different The root-mean-square error of model 1 was higher compared with model 2 (804.5 vs. 653.5); again, the latter was largely due to the use of the 2020 and 2021 mortality data in the baseline estimation in model 2.

Discussion
During the first two years of the COVID-19 pandemic, several studies on excess mortality were published (refer to the introduction). The quantification of COVID-19's impact on mortality was one of the most debated topics by the scientific community due to its high sensitivity among members of the public. In relation to the official numbers and the majority of the literature, some studies have produced significantly higher estimates [29,30], while others produced the opposite [13,31]. These different quantifications of the pandemic's impact caused misleading interpretations of the subject; therefore, it is very important to measure the associated uncertainty. This study compared three models with respect to estimating excess mortality in Italy using the same study period, age groups, geographical areas, and mortality data.
To the best of our knowledge, this is the first time that such a comparison has been carried out in order to understand how different methodologies lead to variations in estimates.
We found different estimates for 2020 and 2021, with significant differences within age groups and geographical areas. We compared each model with the official national estimates [6], which are used by the Italian Government to outline public health policy strategies and interventions.
In this study, we decided to include the negative values of excess mortality in the results as well, as per the WHO indications: "Negative excess deaths could be observed if deaths that would have happened in the absence of the pandemic were averted due to measures taken to deal with the pandemic".

Findings
The three models showed a similar temporal trend in excess deaths, reflecting the SARS-CoV-2 epidemic waves since the beginning of the epidemic in Italy, with the highest peak in excess mortality observed during the first months (March and April 2020), followed by a substantial reduction during the summer of 2020 and the start the second epidemic wave in late 2020 [25]. Since March 2021, there has been a decline in excess deaths coinciding with higher vaccination coverage, thereby underlining the effectiveness of the vaccination campaign [32,33].
The total excess deaths estimated by year for model 1 and model 2 were quite similar. Both estimates were close to the ISTAT-ISS benchmark, with model 2's results being almost identical. Model 1's approach was found to be the most consistent and stable with respect to variations in the baseline years, the population under analysis, and the complexity of the covariates. Model 2 yields results very close to the benchmark, as the model's structure substantially reflects the simple ISTAT-ISS approach. Further, the adjusted average model quantifies the degree of uncertainty, which is not available in the ISTAT-ISS estimates.
We found an about 3% difference in model 3's estimates for both years, making it the one that differs the most from the other two models and the ISTAT-ISS benchmark. Model 3 applied a two-stage design and statistical model that allowed for a flexible estimation of the excess mortality risk and a definition of the baseline risk that considers temporal trends and variations in temperature distribution. These factors seemed to influence the results, especially in 2021, regarding the benchmark and the other two models. Notably, the SARS-CoV-2 epidemic pattern is not typically seasonal in Italy, with the recurring presence of summer epidemic peaks in 2021 [34] and 2022 [35].
When estimating the baseline mortality, the influence of historical excess mortality on the parameter estimates, mainly due to heat waves and seasonal influenza epidemics, may play a role in model 1, a point currently under investigation. In this regard, a weighted model could be further considered; accordingly, downweighing historical excess mortality for standardized conditional residuals could be a simple solution [36].
Differentiation among the considered approaches emerged when stratifying them by macro-area and age groups. The point estimates became more pronounced for the extreme age groups while the prediction intervals overlapped. Model 1 and model 3's prediction intervals are wider than that of model 2.
When comparing the regional estimates in absolute terms, discrepancies were again observed between the three models, especially in regions with an older population and higher expected deaths, for example, the Liguria region in 2021 (Demo-Geodemo-Mappe, Popolazione, Statistiche Demografiche dell'ISTAT). We found that model 2 was susceptible to the choice of the reference period for the baseline estimation; it produced considerably higher estimates when using a longer reference period. In fact, the non-inclusion of a covariate considering the annual trend in this model likely reflected a non-significant annual trend in the yearly count of deaths in recent years, which is in line with the report by the ISTAT-ISS [6]. This assumption was most likely no longer valid when considering a larger period as a baseline. In contrast, model 1 captures cyclical mortality patterns within years, introducing the Fourier series in the linear predictor. The number of Fourier terms is determined via a model selection procedure applied to the historical mortality data. The variation in the cyclic pattern from year to year is modelled by including random effects, resulting in a mixed-effects model. Notably, it is known that random effects can often represent serial correlation among the measurements. Thus, including random effects may also be sufficient. This assumption was most likely no longer valid when considering a larger period as a baseline. Importantly, it is known that random effects are often able to represent the serial correlation among measurements; thus, including random effects may also be sufficient for capturing the serial correlation between mortality with respect to consecutive weeks.
Excess mortality has been one of the most used indicators to describe the impact of the COVID-19 pandemic; the early official estimates were based on simple assumptions in Italy [6] and the rest of Europe [37]. Later, many mathematical models were published with the same approaches applied to significantly different contexts and with substantially different estimates than the official ones [29], often disorienting scientific communities [30,38]. Among several publications on excess mortality [12,22,31,39], the estimations for Italy ranged widely, with a great amount of uncertainty, further fueling a national scientific, political, and public debate on the effectiveness of the pandemic response. For this reason, the estimation and comparison of national excess mortality is an extremely sensitive topic, as shown by a recent article in which World Health Organization researchers explain the errors in the high-profile mortality estimates of Germany and Sweden [27].
More specifically, our cumulated models' estimates for Italy in the 2020-2021 period were in the range of 164,000-193,000 excess deaths, largely different from the estimates published in The Lancet by the Global Burden of Disease group in Seattle [29], which estimated an excess of 259,000 excess deaths for Italy. As already mentioned in the literature, the excess deaths estimated by Wang et al. were on average two times higher than those recorded in several high-income countries, including Italy [40]. Including specific covariates in the prediction models may distort this phenomenon [29].
Another study of relevance was conducted by the World Health Organization (WHO) in March 2022 [41]. The WHO's model differs according to each nation's availability of all-cause mortality data. For Italy, a Negative Binomial Generalized Additive Model (GAM) was built, with two-time parameters modelled through spline functions. The production of these estimates is a result of a global collaboration between the WHO's "Technical Advisory Group for COVID-19 Mortality Assessment" and the specific countries. The produced annual estimates for Italy in this case included the ISTAT-ISS baseline within their confidence interval (the global excess deaths associated with COVID-19 (modelled estimates) (who.int)).

Research Limitations
We limited our choice to only three models to compare excess-mortality estimates, despite the great number of models/analyses applied in the literature [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18]; however, our selection was based on reproducibility, and all three models had already been applied to the Italian official database, which is recorded every month by the official statistics bureau (ISTAT), while two models were based on a time-series approach, which is the most-used method.
No matter which type of modelling is adopted to estimate the excess mortality attributable to this pandemic, the modeling approach will bring about complexity [42]. Many co-factors played a role in determining the observed excess mortality in Italy, namely, a comparatively older population, different regional patterns, and the measures taken by the government during the various stages of the epidemic. However, not all these co-factors are measurable. None of the three models included, for instance, the degree to which the aging Italian population influenced the estimate of the overall excess mortality, and we cannot exclude that this affected the estimates, especially in 2021. An accentuated and prolonged aging phase has characterized Italy for many years. The ISTAT estimates that in 2050, the percentage of people over 65 will rise by more than ten points [6]. Therefore, this factor must be introduced into future models for estimating excess mortality. In future studies on excess mortality, it will be important to discuss the way in which to treat the COVID-19 years with respect to baseline estimations. The pandemic may have modified the structure of the population across the world, and this factor should be considered in excess mortality estimations for future years. A comparison between the excess mortality estimations cited in this article is reported in Table S2 of the supplementary materials.

Conclusions
The data and code used in this study are available on Github, and the full set of results is shown in the appendix, making our analysis entirely reproducible. As we mentioned in the methods, we selected those models that allowed us to replicate the results.
An overview of excess mortality in Italy in 2020 and 2021 at different spatial resolutions and focusing on different age classes was presented in this work. Different methods for estimating excess mortality were studied, thus deepening the methodological aspects, even though we have not considered all possible approaches/models for estimating excess deaths. We cannot give an unambiguous answer as to the excess mortality in Italy; nonetheless, our in-depth study of the three models and of other approaches regarding COVID-19-related excess mortality provides a range that we consider trustworthy. It is important to stress that estimates vary greatly depending on the model used in the baseline estimation and its assumptions, which should be made carefully and, above all, verified.
All mathematical models make several assumptions that should be evaluated, including covariates that are not always significant for the event of interest, i.e., COVID-19-related mortality. This study did not assess the effects of covariates such as temperature; therefore, we cannot assess the way in which this contributed to the difference in the excess mortality estimates observed. While this aspect could have been interesting, it was beyond the objective of this study. What has emerged is that age played an important role, as excess mortality increased corresponding to the age groups in Italy. The latter is expected, given the old age of the Italian population. At the same time, the geographic macro-area showed different behaviors given the different territorial incidences of COVID-19. However, all three models accurately represented the spatiotemporal trends that characterized the pandemic waves in Italy. This work is addressed to all researchers in the field of public health, and is to be used as a reference for future studies on excess mortality Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ijerph192416998/s1, Table S1: Excess mortality and relative excess mortality estimates in 2020 and 2021 in the different models by region; Table S2: comparison between various excess mortality estimations; Figure S1: difference between the relative excess mortality of models 1-2, and the ISTAT-ISS model, per geographical macorarea and age class; Figure  S2: graphical representation of regional relative excess mortality in Italy in 2020 and 2021 according to the models in study.