1. Introduction
The COVID-19 pandemic began in late 2019, and quickly manifested itself as a massive increase in global mortality. However, there were problems related to attribution and causation. As such, when it comes to analyzing or modeling COVID-19 mortality data, there are two approaches. The first is to specifically use COVID-19-attributed mortality. This has the benefit of a clear causal structure, where patterns in the data can be more easily connected to the spread of the pandemic (
Lei and Shemyakin 2023). However, there are notable problems with respect to proper attribution. Deaths related to COVID-19 are not attributed to the disease absent a positive COVID-19 test, which cannot always be obtained where testing capacity is limited (
Wang et al. 2022). This is to say nothing of other infrastructure issues or potentially missing deaths not directly caused by COVID-19, but instead by complications from an existing condition or a response to the pandemic (
Ahamad et al. 2020;
Zińczuk et al. 2023). Nevertheless, mortality data directly attributed to COVID-19 have the considerable benefit that the recorded cause of death is unambiguous.
The second approach is to analyze overall mortality, usually via the concept of excess mortality. This is defined as the normalized difference between an expected (historical) death count and aggregate deaths (
Britt et al. 2023). Although there is some ambiguity in calculating the expected number of deaths for a particular location, this approach has the benefit of capturing systemic effects the pandemic might have had (
Martinez-Folgar et al. 2021;
Zińczuk et al. 2023). For example, it allows the data to include increased mortality among those with existing conditions who contract COVID-19 (
Martinez-Folgar et al. 2021), or possible increases in suicides (
Yan et al. 2023). However, the data will also reflect a decrease in vehicle-related deaths due to lock-downs (
Wang et al. 2022). This approach therefore captures the net effect of the pandemic on mortality. However, because such decreases offset some of the additional deaths, it also risks masking the magnitude of the increase in mortality.
Regardless, multiple approaches are used to model the resulting data. The Centers for Disease Control and Prevention (CDC) has suggested ensemble models (
Johannson et al. 2020), used also in
Imperial College COVID Response Team (
2022), and in
Wang et al. (
2022).
Martinez-Folgar et al. (
2021) and
Basellini et al. (
2021) used generalized additive models to relate different demographic or location data with mortality. These models usually address multiple covariates and require massive data collection. The approach of this paper is different. It applies ARIMA analysis to excess mortality for different geographical areas, and then uses vine copulas with Bayesian pair copula model selection to study the dependence patterns between these areas. As a result, the model relies only on open-source mortality data and does not require a detailed analysis of mortality covariates.
ARIMA models have frequently been used in COVID studies. For listed examples, see
Britt et al. (
2023) and
Wang et al. (
2022). Copula models have also been used to determine the relationships between mortality and other time series data, such as the correlation of interstate trends (
Kim 2022), or by combining mortality data with temperature (
Alanazi 2021); however, they were rarely used in conjunction with ARIMA modeling.
Here, we follow the method laid out by
Lei and Shemyakin (
2023), in which ARIMA models are developed for individual locations, and the model residuals are then related to each other via copula analysis. This allows for seasonality and intra-country effects to be accounted for before addressing cross-correlation between countries. It also accommodates non-normal residuals: although non-normality is technically a violation of ARIMA assumptions, fat-tailed residual distributions can be interpreted as an indication of a more complicated dependence structure.
The present paper has an objective of analyzing excess mortality time series $X_{i,t}$, $i = 1, \dots, k$, for different countries during the period of $T$ weeks, $t = 1, \dots, T$, and modeling the dependence patterns in the vector $(X_{1,t}, \dots, X_{k,t})$ through the ARIMA residuals $\varepsilon_{i,t}$.
Mortality statistics are particularly difficult to compare across countries for several reasons. First and foremost, different countries have different standards for recording deaths. For example, England records only the date a death is “registered,” while the United States records mortality statistics using the date of a death (
Basellini et al. 2021;
National Center for Health Statistics 2025). This means comparisons involving countries that do not record the date of a death are difficult, as actual mortality experience will not be reflected in the data. While a close reconstruction of weekly data is possible (see
Martinez-Folgar et al. 2021), it still leaves open the potential problem of a death being registered several weeks after it occurs, making assigning the week it occurred impossible. Second, countries differ in how they define a “week” and how many there are in the calendar year. For example, the European countries in this study (France, Germany, Norway, Sweden) record their weekly mortality data as the sum of deaths occurring from Monday-Sunday, while the United States and Canada record theirs as the sum of deaths from Sunday to Saturday (
National Center for Health Statistics 2025). This makes interpretations of resulting models somewhat weaker, but absent massive spikes for one day only, it should not affect overall trends.
2. ARIMA
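Box–Jenkins models, more commonly known as ARIMA models, are autoregressive integrated moving average models of a time series. An autoregressive model with lag (order) $p$ for the single-variable time series $X_t$,
$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t, \qquad (1)$$
is integrated with the moving average model with lag $q$,
$$X_t = \mu + \varepsilon_t + \sum_{j=1}^{q} \psi_j \varepsilon_{t-j}, \qquad (2)$$
which is applied to the differences $\nabla^d X_t$ of $X_t$ with order $d$, where $d > 0$ allows the specification of non-stationary models, denoted ARIMA$(p, d, q)$. Here, $d = 0$ corresponds to the stationary model ARIMA$(p, 0, q)$, also known as ARMA$(p, q)$.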
ARIMA model selection requires the estimation both of the orders $(p, d, q)$ and of the subsequent parameters in the regression models. This is usually performed via Bayesian or maximum likelihood methods. This yields fitted values $\hat{X}_t$ for $t = 1, \dots, T$ and residuals $\varepsilon_t = X_t - \hat{X}_t$.
Unit root tests or other stationarity tests can be used to determine the differencing order $d$, which can be further informed by the behavior of the ACF or PACF of the time series data. Afterwards, the lag order parameters can be determined via information criteria, such as the Akaike (AIC) and Bayesian (Schwarz, BIC) information criteria. Note that in the case of either the AIC or BIC, there is a penalty term for the number of parameters, leading to more parsimonious models if the information criteria are used for model selection. Note also that changing the value of $d$ does not change the number of parameters to be estimated.
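For reference, the two criteria have the standard forms below. The symbols are introduced here for illustration and are not taken from the original text: $\hat{L}$ is the maximized likelihood, $m$ the number of estimated parameters, and $n$ the number of observations.

```latex
\mathrm{AIC} = 2m - 2\ln\hat{L},
\qquad
\mathrm{BIC} = m\ln n - 2\ln\hat{L}.
```

Since $\ln n > 2$ once $n \ge 8$, the BIC penalizes each additional parameter more heavily than the AIC for any realistically sized weekly series, and so tends to select lower lag orders.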
Distribution Analysis of ARIMA Residuals
In general, ARIMA methods are efficient under the assumption of normality of the residuals $\varepsilon_t$. However, in many applications, especially survival analysis and finance, one has to deal with asymmetric and fat-tailed residual distributions failing the normality assumption. Therefore, a skewed t-distribution model, such as the one put forward by
Fernandez and Steel (
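1996) may be suitable to describe the distribution of residuals. The PDF defined therein is as follows:
$$f(x \mid \gamma) = \frac{2}{\gamma + 1/\gamma} \left[ f_0\!\left(\frac{x}{\gamma}\right) I_{[0, \infty)}(x) + f_0(\gamma x)\, I_{(-\infty, 0)}(x) \right],$$
where $f_0$ is a symmetric density (here, Student’s t) and $\gamma > 0$ is the skewness parameter.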
Regardless, fitted distributions should be compared to residual data to ensure accuracy. A common goodness-of-fit test is the Kolmogorov–Smirnov test, which compares empirical CDFs between two different samples of data and/or distributions. Once a distribution has been chosen, the residuals can be appropriately modeled as random variables.
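As a minimal sketch of the Fernández–Steel skewing construction applied to a standard Student t density, the code below evaluates the skewed density and checks two of its known properties numerically. The degrees of freedom `nu`, the skewness value `skew`, and the absence of location and scale parameters are assumptions made for illustration, not values fitted in this paper.

```python
import math


def student_t_pdf(x, nu):
    """Density of the symmetric Student t distribution with nu degrees of freedom."""
    norm = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return norm * (1 + x * x / nu) ** (-(nu + 1) / 2)


def skew_t_pdf(x, nu, skew):
    """Fernandez-Steel skewing: stretch the right side by `skew`, shrink the left."""
    scale = 2.0 / (skew + 1.0 / skew)
    if x >= 0:
        return scale * student_t_pdf(x / skew, nu)
    return scale * student_t_pdf(x * skew, nu)


def integrate(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


nu, skew = 5.0, 1.5  # illustrative values only
total_mass = integrate(lambda x: skew_t_pdf(x, nu, skew), -200.0, 200.0, 40000)
right_mass = integrate(lambda x: skew_t_pdf(x, nu, skew), 0.0, 200.0, 20000)
print(round(total_mass, 4), round(right_mass, 4))
```

The total mass should be close to one, and the mass to the right of zero close to $\gamma^2 / (1 + \gamma^2)$, which is how the single parameter $\gamma$ controls the asymmetry.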
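In the case of several dependent time series $X_{1,t}, \dots, X_{k,t}$, $k$ separate models can be developed for the marginal distributions of the residuals $\varepsilon_{1,t}, \dots, \varepsilon_{k,t}$, which will help further construction of the joint distribution of $(\varepsilon_{1,t}, \dots, \varepsilon_{k,t})$, which is the ultimate goal.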
3. Copula Analysis
Copula analysis is commonly used to model non-linear statistical dependence between two or more random variables. Copulas are special functions that can describe dependence of random variables as an association between their marginal distributions. In the present paper, copula analysis is applied to model the joint distribution of ARIMA residuals using the marginal distributions obtained in the previous section.
Let $X$ and $Y$ be random variables, with CDFs $F(x)$ and $G(y)$. Then their joint distribution $H(x, y)$ can be represented using a copula function $C_r(u, v)$ as $H(x, y) = C_r(F(x), G(y))$, where $r$ is some set of parameters measuring the strength of dependence between the two variables.
Sklar’s theorem states that any copula function of $F(x)$ and $G(y)$ is a valid joint CDF, and also, any joint distribution function can be represented as a copula function of its marginals. Therein lies the advantage of copulas: the copula framework allows for modeling of the joint distribution in two steps. First, one models the marginals, and then, one uses an appropriate copula function for modeling their association.
There are many different types of copulas to be used for this purpose. For an in-depth list and definitions, see
Brechmann (
2010). The most popular copulas used in practice are Archimedean copulas or elliptical copulas. The former are easier to estimate parameters for, while the latter are easier to extend to higher dimensionality, i.e., more marginals.
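The most popular one-parametric Archimedean pair copulas combining marginal distribution functions, and their dual versions combining marginal survival functions, are listed below, each with a single association parameter $\theta$:
Clayton’s Copula
$$C^{C}_{\theta}(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, \quad \theta > 0 \qquad (5)$$
Gumbel–Hougaard’s Copula
$$C^{G}_{\theta}(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \quad \theta \ge 1 \qquad (6)$$
Dual (Survival) Clayton’s Copula
$$C^{SC}_{\theta}(u, v) = u + v - 1 + C^{C}_{\theta}(1 - u, 1 - v) \qquad (7)$$
Dual (Survival) Gumbel–Hougaard’s Copula
$$C^{SG}_{\theta}(u, v) = u + v - 1 + C^{G}_{\theta}(1 - u, 1 - v) \qquad (8)$$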
The first and fourth choices are especially good for addressing the lower-tail dependence, while the second and the third work well for the upper-tail dependence.
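The tail behavior described here can be checked numerically. The sketch below implements the standard Clayton and Gumbel–Hougaard CDFs and their survival versions, then approximates the lower- and upper-tail dependence coefficients at a small quantile `q`. The parameter value `theta = 2` and the quantile `q = 1e-4` are arbitrary choices for illustration.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula CDF, theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, v, theta):
    """Gumbel-Hougaard copula CDF, theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def survival(cdf):
    """Dual (survival) version of a pair copula: u + v - 1 + C(1 - u, 1 - v)."""
    def dual(u, v, theta):
        return u + v - 1.0 + cdf(1.0 - u, 1.0 - v, theta)
    return dual


survival_clayton_cdf = survival(clayton_cdf)
survival_gumbel_cdf = survival(gumbel_cdf)


def tail_ratios(cdf, theta, q=1e-4):
    """Finite-q approximations of the lower and upper tail dependence coefficients."""
    lower = cdf(q, q, theta) / q
    p = 1.0 - q
    upper = (1.0 - 2.0 * p + cdf(p, p, theta)) / q
    return lower, upper


theta = 2.0  # illustrative value only
clayton_lower, clayton_upper = tail_ratios(clayton_cdf, theta)
gumbel_lower, gumbel_upper = tail_ratios(gumbel_cdf, theta)
sc_lower, sc_upper = tail_ratios(survival_clayton_cdf, theta)
print(clayton_lower, clayton_upper, gumbel_lower, gumbel_upper)
```

For the Clayton copula the lower ratio approaches $2^{-1/\theta}$ while the upper ratio vanishes; for Gumbel–Hougaard the upper ratio approaches $2 - 2^{1/\theta}$, and the lower ratio decays to zero, although slowly. Taking the survival version swaps the two tails.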
Several techniques exist for estimating copula parameters. One way of doing it is to use non-parametric measures of sample correlation, namely Kendall’s concordance $\tau$ or Spearman’s $\rho$, as many two-parameter copulas and all single-parameter copulas can have their parameters expressed as a function of a non-parametric correlation measure, allowing for a direct substitution (
Brechmann 2010). This relationship also allows for Bayesian analysis based on sample correlation. This is discussed later. Another approach is to use maximum likelihood estimation, though this could carry computational issues relating to a lack of closed-form estimators (
Brechmann 2010;
Huang and Shemyakin 2020). For other parametric approaches, see
Brechmann et al. (
2012) and
Manner (
2007). For a non-parametric method, see
Manner (
2007).
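The substitution approach can be sketched as follows for the Clayton and Gumbel–Hougaard families, using the standard relations $\tau = \theta / (\theta + 2)$ and $\tau = 1 - 1/\theta$. The toy data are invented for illustration, and the simple $O(n^2)$ estimator below ignores ties.

```python
def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2); valid for 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1 / theta; valid for 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)


# Invented toy data: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
tau = kendall_tau(x, y)
print(tau, clayton_theta_from_tau(tau), gumbel_theta_from_tau(tau))
```

The same mapping applies to the survival versions of each family, since rotating a copula by 180 degrees leaves Kendall’s $\tau$ unchanged.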
Methods used to determine copula selection will be discussed later.
3.1. Vine Copulas
Elliptical copulas allow for a logical extension from dimension two discussed above to higher dimensions. A one-parameter Archimedean copula rarely provides an adequate model in the multivariate case due to its requirement of symmetric association between all pairs of variables. When the degree of association differs between pairs, a vine copula, also known as a pair-copula structure, may be preferable. A vine is a graphical tool for establishing the dependence structure in high-dimensional probability distributions. A regular vine is a special case for which all constraints are two-dimensional or conditional two-dimensional. Regular vines generalize trees. Vine copulas work by establishing the structure of association between the variables, where individual edges correspond to different pair copulas. Using Sklar’s theorem, the joint distribution of the data can be represented using a copula function of the marginals.
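$$F(x_1, \dots, x_n) = C\big(F_1(x_1), \dots, F_n(x_n)\big).$$
From here, differentiation yields the following expression, involving a copula density $c$ and marginal densities $f_1, \dots, f_n$:
$$f(x_1, \dots, x_n) = c\big(F_1(x_1), \dots, F_n(x_n)\big) \prod_{i=1}^{n} f_i(x_i).$$
For two variables, this simplifies to the expression
$$f(x_1, x_2) = c_{12}\big(F_1(x_1), F_2(x_2)\big)\, f_1(x_1)\, f_2(x_2),$$
which using basic properties of conditional probability can be rewritten as
$$f(x_1 \mid x_2) = c_{12}\big(F_1(x_1), F_2(x_2)\big)\, f_1(x_1).$$
Extending this to three variables yields the following result involving the conditional copula:
$$f(x_1 \mid x_2, x_3) = c_{13|2}\big(F(x_1 \mid x_2), F(x_3 \mid x_2)\big)\, c_{12}\big(F_1(x_1), F_2(x_2)\big)\, f_1(x_1),$$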
which in turn can be extended to much higher dimensions. For that and more details, see
Aas et al. (
2009) and
Bedford and Cooke (
2001). Regardless, this implies that any joint distribution can be represented as the product of marginals, pair copulas of the component vectors, and conditional pair copulas. In the case of independence between two variables, the corresponding pair copula density is identically one, $c_{ij}(u, v) \equiv 1$, substantially simplifying the resulting structure.
Note that when working with vine copulas, the structure of the model must be specified before pair copulas can be estimated. In other words, it has to be determined first which variables are independent of each other, which variables are conditioned on the others, and in what order. For a detailed explanation why this approach is advantageous, see
Aas et al. (
2009) and
Bedford and Cooke (
2001). To estimate the vine structure, one can use the method put forth by
Dissmann et al. (
2013). First, the unconditional copulas are selected from the list of all possible structures, which can be exhaustive, based on which structure minimizes the reference statistic, such as AIC or BIC. Then pair copulas and their parameters are estimated for each non-independent pair. Then a variable is selected to be conditioned on, and the process repeats until all variables are exhausted.
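The first-level selection step can be sketched as a spanning-tree problem over pairwise weights. The sketch below uses Kruskal's algorithm with absolute Kendall’s $\tau$ as the edge weight and keeps the strongest connections; this weight is swapped in here for illustration, and another reference statistic could be used in its place. The node labels and weights are hypothetical and are not results from this study.

```python
def first_tree(nodes, weights):
    """Kruskal's algorithm for a maximum spanning tree.

    weights maps an unordered pair (a, b) to a dependence weight;
    larger values mean a stronger pairwise association.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for (a, b), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this edge does not create a cycle
            parent[root_a] = root_b
            tree.append((a, b))
    return tree


# Hypothetical absolute Kendall's tau values for four variables.
nodes = ["A", "B", "C", "D"]
abs_tau = {
    ("A", "B"): 0.60, ("A", "C"): 0.10, ("A", "D"): 0.05,
    ("B", "C"): 0.45, ("B", "D"): 0.20, ("C", "D"): 0.50,
}
tree = first_tree(nodes, abs_tau)
print(tree)
```

Higher levels of the vine repeat the same idea on conditional pairs, with the constraint that two edges can only be joined if they share a node in the previous tree.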
3.2. Pair Copula Selection
Once a model structure has been specified, there are several ways to select the optimal pair copula(s). Most involve specifying a potential set of hyperparameters defined for each copula family, and then comparing them. This can be performed using the AIC (
Brechmann 2010;
Brechmann and Schepsmeier 2013;
Manner 2007), BIC (
Brechmann 2010), or other information criteria. Various goodness-of-fit tests can also be used for this purpose, allowing their statistics to be compared to select a copula (
Huard et al. 2006). However, as
Huard et al. (
2006) point out, this approach compares single copula models with given parameter values chosen from each parametric family, instead of selecting a copula based on multiple possible parameter values.
A solution to this is to select copulas using Bayesian inference.
Wifvat et al. (
2020) describe the following method, which was suggested in
Huard et al. (
2006) and also used in
Shemyakin and Kniazev (
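2017). First, let $H_1, \dots, H_M$ be the hypotheses that the data come from one of $M$ copula families, and for each pair of variables test $H_m$, $m = 1, \dots, M$. These hypotheses can be assumed to be mutually exclusive and exhaustive. Then, let $\tau$ be Kendall’s concordance. If all copulas considered can be written as functions of $\tau$, the posterior probabilities of the hypotheses given the data $D$ may be rewritten as
$$P(H_m \mid D) = \frac{P(H_m)}{P(D)} \int_{-1}^{1} P(D \mid H_m, \tau)\, \pi(\tau \mid H_m)\, d\tau,$$
where $P(H_m)$ is the prior probability of $H_m$ and $\pi(\tau \mid H_m)$ is the prior density of $\tau$.
Wifvat et al. (
2020) show that this method still yields good results even for vague or non-informative priors on $\tau$. Since $-1 \le \tau \le 1$, and in the case of positive dependence $0 \le \tau \le 1$, the uniform Beta$(1, 1)$ prior will be a suitable choice for $\pi(\tau \mid H_m)$. Since the posterior probabilities are only to be used for selection purposes, $P(D)$ does not need to be calculated. With the discrete uniform prior on the hypothesis choice, it suffices to calculate the weights, with $c_m(u, v \mid \tau)$ denoting the respective copula p.d.f. and $(u_t, v_t)$, $t = 1, \dots, T$, denoting the paired observations transformed by their marginal CDFs:
$$W_m = \int_0^1 \prod_{t=1}^{T} c_m(u_t, v_t \mid \tau)\, d\tau,$$
or, using a Monte Carlo approach and drawing $N$ samples $\tau_1, \dots, \tau_N$ from the uniform prior, evaluate
$$W_m \approx \frac{1}{N} \sum_{j=1}^{N} \prod_{t=1}^{T} c_m(u_t, v_t \mid \tau_j),$$
and then the posteriors
$$P(H_m \mid D) = \frac{W_m}{\sum_{l=1}^{M} W_l}$$
for each pair of variables.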
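A minimal sketch of the Monte Carlo selection step is given below for four candidate families (Clayton, Gumbel–Hougaard, and their survival versions). It makes several assumptions that are not taken from the paper: the uniform prior on Kendall’s concordance is truncated to $(0.01, 0.95)$ for numerical stability, the weights are accumulated on the log scale, and the demonstration data are simulated from a Clayton copula rather than taken from mortality residuals. The function names are hypothetical.

```python
import math
import random


def log_clayton_density(u, v, theta):
    """Log density of the Clayton copula, computed on the log scale."""
    a, b = -theta * math.log(u), -theta * math.log(v)
    m = max(a, b)
    # log(u^-theta + v^-theta - 1) via a shifted sum to avoid overflow
    log_s = m + math.log(math.exp(a - m) + math.exp(b - m) - math.exp(-m))
    return math.log1p(theta) + (a + b) * (1 + 1 / theta) - (2 + 1 / theta) * log_s


def log_gumbel_density(u, v, theta):
    """Log density of the Gumbel-Hougaard copula."""
    x, y = -math.log(u), -math.log(v)
    s = x ** theta + y ** theta
    r = s ** (1 / theta)
    return (-r + x + y + (theta - 1) * (math.log(x) + math.log(y))
            + (1 / theta - 2) * math.log(s) + math.log(r + theta - 1))


def log_density(family, u, v, tau):
    """Log pair-copula density, parameterized by Kendall's tau."""
    if family in ("SC", "SG"):  # survival copula density is c(1 - u, 1 - v)
        u, v = 1 - u, 1 - v
    if family in ("C", "SC"):
        return log_clayton_density(u, v, 2 * tau / (1 - tau))
    return log_gumbel_density(u, v, 1 / (1 - tau))


def posterior_probabilities(data, families=("C", "G", "SC", "SG"), n_draws=300, seed=0):
    """Posterior family probabilities under equal prior weights on the families."""
    rng = random.Random(seed)
    taus = [rng.uniform(0.01, 0.95) for _ in range(n_draws)]  # truncated prior
    log_w = {}
    for fam in families:
        logliks = [sum(log_density(fam, u, v, t) for u, v in data) for t in taus]
        top = max(logliks)
        log_w[fam] = top + math.log(sum(math.exp(l - top) for l in logliks) / n_draws)
    best = max(log_w.values())
    unnorm = {fam: math.exp(lw - best) for fam, lw in log_w.items()}
    total = sum(unnorm.values())
    return {fam: w / total for fam, w in unnorm.items()}


def sample_clayton(n, theta, seed=1):
    """Simulate pairs from a Clayton copula by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u, w = rng.uniform(0.001, 0.999), rng.uniform(0.001, 0.999)
        v = ((w ** (-theta / (1 + theta)) - 1) * u ** -theta + 1) ** (-1 / theta)
        pairs.append((u, v))
    return pairs


demo_data = sample_clayton(200, 3.0)  # simulated, lower-tail dependent
posterior = posterior_probabilities(demo_data)
print({fam: round(p, 3) for fam, p in posterior.items()})
```

Because the simulated data have lower-tail dependence, most of the posterior mass is expected to fall on the two lower-tail families (Clayton and survival Gumbel–Hougaard). With real data, the pairs would be the residuals of two countries transformed by their fitted marginal CDFs.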
4. Case Study
As an illustration of the suggested methodology, let us consider an example of COVID-19 development in several countries of Europe and North America. As stated above, data from certain countries during the pandemic may be unreliable, due to lack of infrastructure, intentional misreporting, missing data, etc. This study chose to focus on mortality data from the United States, Canada, France, Germany, Norway, and Sweden because of the easy availability of their mostly reliable data recorded in a similar time frame. For each country in this study, the following sources were used: The United States’ mortality data were obtained through the website of the Centers for Disease Control and Prevention (CDC). Canada’s data were obtained through
Statistics Canada (
2025), and they adhere to the same standards as the US (
Human Mortality Database 2025). The European countries’ data were obtained entirely through Eurostat, the official statistics body for the European Union (
Eurostat 2025).
To compute excess mortality, the difference between each country’s pandemic data and historical data was recorded as a percentage of the historical data for a given week in a time series. Consistent with
Britt et al. (
2023), the mortality rates are used rather than counts, and the weekly excess mortality rate for week $w$ of the year $y$ is defined as the ratio
$$E_{y,w} = \frac{D_{y,w} - B_w}{B_w},$$
where $D_{y,w}$ is the weekly death count for every country in the study. The baseline death count $B_w$ is calculated as the average for non-perturbed (pre-COVID) years. Then the time series $X_t$ for each country is obtained by concatenation of the values $E_{y,w}$ over consecutive years $y$ and weeks $w$. This approach to excess mortality helps to alleviate the effect of seasonality if it is generally consistent with the pre-COVID patterns.
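The calculation can be sketched as below, with the baseline taken as the week-wise average over the pre-pandemic years that contain that week (so a week 53 present in only one baseline year is averaged over that year alone). The function names and the death counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def weekly_baseline(history):
    """Week-wise average death count over baseline years.

    history is a list of per-year lists of weekly counts; years may have
    52 or 53 weeks, and each week is averaged over the years that have it.
    """
    n_weeks = max(len(year) for year in history)
    baseline = []
    for w in range(n_weeks):
        values = [year[w] for year in history if len(year) > w]
        baseline.append(sum(values) / len(values))
    return baseline


def excess_mortality(deaths, baseline):
    """Relative excess (deaths - baseline) / baseline for each week."""
    return [(d - b) / b for d, b in zip(deaths, baseline)]


# Hypothetical counts: two baseline years, the second with an extra week.
history = [[100, 110], [120, 130, 90]]
baseline = weekly_baseline(history)
excess = excess_mortality([121, 108, 99], baseline)
print(baseline, excess)
```

Concatenating the yearly outputs of `excess_mortality` in chronological order gives the single weekly series that is passed on to the ARIMA stage.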
All six countries in this study count the same number of weeks in a year, namely 52 (or 53) full 7-day weeks (
Human Mortality Database 2025). This resolves the problem of weeks being assigned to different years, as the data differ by one day only between the North American and European time series. Mortality data from 2014 to 2018 were used to make 5-year weekly averages. With 2014 being the only year with 53 weeks, the week-53 mortality average for each country is simply the last week of 2014.
The weekly excess mortality time series from the first week of 2019 through the 52nd week of 2022 were used for all six countries. The data are summarized in
Table 1. To avoid clutter, the time series are plotted separately for three pairs of countries in
Figure 1,
Figure 2 and
Figure 3.
According to the methodology of
Section 2 and
Section 3, the case data are analyzed with the final goal of building a parametric joint distribution model for weekly mortality in the six countries listed, taking into account both serial and cross correlation to provide a forecasting tool. The flow chart in
Figure 4 delineates the steps of model construction.
4.1. ARIMA
At the first stage, ARIMA models were constructed separately for all six countries in the study. To begin, the time series $X_{i,t}$ for $k = 6$ countries were tested for stationarity using the Augmented Dickey–Fuller and KPSS tests from the
tseries package in R 4.4. The
p-values are summarized in
Table 2. It is worth noting that none of the moving average coefficients in Equation (
2) proved to be statistically significant.
The mixed results do not give a clear indication as to whether the time series are stationary or not.
From here, partial autocorrelation functions for each time series were analyzed. The United States did not show a significant correlation after a lag of 3. France, Germany, and Sweden showed no significant correlation for a lag of 2, but did have significant correlation for higher-order lags. Canada and Norway showed no significant correlation after a lag of 2. After this, ARIMA analysis was performed using the
arima function from the R 4.4 package
stats to generate potential models. Final models were selected first by requiring a non-zero number of statistically significant coefficients, and then by BIC. The models were built according to the structure of the following equation:
$$\nabla^d X_t = c + \sum_{i=1}^{p} \phi_i \nabla^d X_{t-i} + \sum_{j=1}^{q} \psi_j \varepsilon_{t-j} + \varepsilon_t,$$
where $\varepsilon_t$ are the error terms, $c$ is the intercept, and $\phi_i$ and $\psi_j$ are the autoregressive and moving average coefficients. The results are summarized in
Table 3.
After the models were selected, the residuals of each model were subjected to a Box–Ljung test using the
stats R package. The results are summarized in
Table 4.
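For reference, the Ljung–Box statistic behind this test is $Q = n(n+2) \sum_{k=1}^{h} \hat{\rho}_k^2 / (n - k)$, where $\hat{\rho}_k$ is the lag-$k$ sample autocorrelation of the residuals. A minimal sketch follows; the alternating series is an artificial example of strongly autocorrelated "residuals", and the p-value (from a chi-squared distribution whose degrees of freedom are reduced by the number of fitted ARMA coefficients) is not computed here.

```python
def autocorrelation(x, k):
    """Lag-k sample autocorrelation of the series x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Artificial, strongly autocorrelated series: +1, -1, +1, -1, ...
alternating = [1.0, -1.0] * 20
q_stat = ljung_box_q(alternating, 1)
print(q_stat)
```

A large $Q$ relative to the chi-squared reference indicates remaining autocorrelation, which would suggest that the chosen lag orders are too low.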
4.2. ARIMA Residual Analysis
At the second stage, the parametric models were developed for the distributions of ARIMA residuals. The Shapiro–Wilk test was performed on the residuals of each model to determine their normality. The results are summarized in
Table 5.
This suggests that only the residuals for Norway’s model were normal. Fitting the distribution using the
fitdistr function from the R 4.4 package
MASS yielded a normal distribution with
and
. To fit a distribution to the others, the skewed t-distribution as defined in
Fernandez and Steel (
1996) was used. The results of the fitted distributions are summarized in
Table 6.
Results were verified using the Kolmogorov–Smirnov test, which is summarized in the following
Table 7.
D is the test statistic. The
ks.test function was not compatible with the fitted distribution functions and produced errors, so samples were instead drawn at random from each fitted distribution and compared with the residuals, with a sample size of
.
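The sampling workaround described above amounts to a two-sample Kolmogorov–Smirnov comparison. A minimal sketch of the statistic $D$ is shown below; the "residuals", the fitted normal distribution, and the sample size of 1000 are all stand-ins chosen for illustration, not the values used in this study.

```python
import bisect
import random


def ks_two_sample(a, b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(200)]     # stand-in for model residuals
fitted_draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # draws from a "fitted" model
d_stat = ks_two_sample(residuals, fitted_draws)
print(d_stat)
```

Because the reference sample is itself random, the resulting $D$ varies from run to run; a larger reference sample reduces that variability.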
4.3. Vine Structure Selection
At the third stage of the analysis, the pairwise dependence patterns between the countries were determined for ARIMA residuals. The structure of the copula model for the vector of ARIMA residuals $(\varepsilon_{1,t}, \dots, \varepsilon_{k,t})$
was determined using the
RVineStructureSelect function from the
VineCopula R package. Later procedures determined all copulas after the first level of conditioning to be independence copulas, so only the structure of the first level of the vine will be shown.
Figure 5 illustrates the vine graph, where the edges correspond to the closest connections established between the nodes. It is reasonable to believe that these connections are mostly due to geographical reasons, but they may also reflect the similarity between health policies. This may explain why Norway and Sweden, which are geographically close but differed in health policy, do not share an edge at the first level of the vine.
4.4. Pair Copula Selection
Finally, the pair copula selection was performed using the Bayesian method outlined in
Shemyakin and Kniazev (
2017) for the choices defined in Equations (5)–(8). First, the assumption was made that each copula would fall into one of four hypothesized copula families.
Hypothesis 1 (H1). Clayton’s Copula, C.
Hypothesis 2 (H2). Gumbel–Hougaard’s Copula, G.
Hypothesis 3 (H3). Dual (Survival) Clayton’s Copula, SC.
Hypothesis 4 (H4). Dual Gumbel–Hougaard’s Copula, SG.
From here, the Monte Carlo approach described above was used, with $N = 10{,}000$ draws. The results are summarized in
Table 8, with the maximum values in each column (for each pair of countries) boldfaced.
Next, optimal copula parameter values were selected using MLE, resulting in the following structure. In
Figure 6, along with the primary connections established in
Figure 5 by vine structure estimation, the selected copula and the corresponding parameter value are also provided, with a larger value corresponding to a closer association between the nodes of the graph.
4.5. Validation
Validation of the model with the later (post-COVID) data was carried out on three levels: ARIMA models, distribution models for ARIMA residuals, and the vine copula structure. To validate the results, mortality data from 2023 were used. While Eurostat maintained weekly death counts throughout 2023, the CDC only maintained weekly counts for 37 weeks, and Statistics Canada only maintained weekly counts for 33 weeks. This does not impact the analysis of ARIMA models, since it is performed country-by-country, meaning that all available data can be used.
To validate the ARIMA model, the mean absolute error (MAE) of each country’s ARIMA model was calculated. This is summarized in
Table 9. The relatively low errors suggest that Canada, France, and Sweden’s ARIMA models are accurate. The larger errors of the other series make sense, as the COVID pandemic was winding down in the USA by the time the CDC stopped updating its weekly death count in 2023. As such, models developed from the earlier pandemic experience would probably not be as accurate for US mortality in 2023. For Germany and Norway, excess mortality temporarily plummeted at the beginning of 2023, which could explain the large average error observed.
Figure 7,
Figure 8,
Figure 9,
Figure 10,
Figure 11 and
Figure 12, which show the actual out-of-sample values versus the point forecasts with 90% and 95% confidence bounds, corroborate this.
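The error measure used here is straightforward to compute. In the sketch below the comparison is truncated to the shorter of the two series, which is one simple way to handle countries with fewer than 52 weeks of 2023 data; the numbers are invented for illustration.

```python
def mean_absolute_error(actual, forecast):
    """Mean absolute error over the weeks available in both series."""
    n = min(len(actual), len(forecast))
    return sum(abs(a - f) for a, f in zip(actual[:n], forecast[:n])) / n


# Invented excess mortality values: three observed weeks, four forecast weeks.
actual = [0.10, 0.20, 0.00]
forecast = [0.00, 0.00, 0.00, 0.05]
mae = mean_absolute_error(actual, forecast)
print(mae)
```

Since excess mortality is already a relative quantity, the MAE can be read directly as an average error in units of the baseline death count.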
To validate the marginal analysis of ARIMA residuals, the residuals generated for 2023 were compared to previously fitted marginal distributions using the Kolmogorov–Smirnov test. The results are summarized in
Table 10.
This suggests that the marginal distributions of the residuals fitted in
Section 4 are accurate for Canada, France, Germany, Sweden, and Norway, and less accurate for the USA where the COVID pandemic mortality abruptly dropped in 2023. This loss of predictive power is also illustrated in
Figure 7. The accuracy of the other models is illustrated by the other figures, despite an initial trough in some series.
Finally, to validate the copula model proposed in
Section 4, the data length had to be adjusted for each country’s residuals. To accomplish this, each dataset was limited to the first 33 weeks. Then, this sample was tested using the
VineCopula R package’s
RVineGofTest function against the copula structure estimated in
Section 4 using the Kolmogorov–Smirnov test (evaluated asymptotically) and the Cramér–von Mises test (evaluated with 200 bootstrap steps). The results are summarized in
Table 11. For more details on how these tests are implemented in software, see
Schepsmeier (
2015).
This suggests that the dependence model put forth by the vine copula found in
Section 4 describes the connections between the countries’ mortality experience well, even when the mortality experience differs from what is expected.
5. Discussion and Conclusions
To summarize the distinctive features of the modeling approach described in
Section 2 and
Section 3 and then illustrated in
Section 4, we will discuss the advantages provided and the most likely applications related to these advantages. Then we will discuss the limitations of this approach. First, we suggest using separate ARIMA models for the time series of weekly excess mortality for the distinct zones (countries, geographical areas, etc.). This approach is well established in mortality analysis and used in COVID-19 studies (
Alabdulrazzaq et al. 2021;
Ilie et al. 2020). Then, we use parametric distribution models for ARIMA residuals since they appear to be skewed and fat-tailed. This approach is quite common for financial time series but also has been applied to mortality data (
Campolieti 2021). Finally, we model the joint distribution of ARIMA residuals for the zones considered above using vine copulas with the marginal distributions obtained at the previous step. This approach has been used for financial time series, see, e.g.,
Shemyakin and Kniazev (
2017) for the general introduction and more extensive literature review, but appears to be new for mortality analysis related to epidemics (
D’Urso et al. 2022).
The main advantage of using single time series ARIMA is the ease of data collection. One can use open-source data on mortality, which are often available in real time. Therefore, this approach is convenient for short-term forecasts, defined by the CDC as one to four weeks ahead (
National Center for Health Statistics 2025). The most popular and powerful models, see
Johannson et al. (
2020) and
Imperial College COVID Response Team (
2022), use the ensemble approach and datasets including multiple predictors, which makes data collection more difficult. Using excess mortality also bypasses the necessity to properly attribute the cause of death, which can be problematic for epidemic data, especially at the epidemic’s onset, as discussed above.
Parametric distribution models for ARIMA residuals help to address extreme events of abnormally high or low weekly mortality, which can be critical in epidemic contexts. This provides more realistic tail probabilities. However, the example in
Section 4 demonstrates that the multivariate effect of cross-correlation is also important. For short-term prediction of zone mortality, vine copula structure allows for an effective use of the recent history of the related zones. It also provides for a more realistic prediction of the mortality peaking simultaneously in several countries. In this regard it works like vector autoregression and network autoregression models (
Britt et al. 2023;
Sioofy et al. 2021). Unlike vector autoregression, however, it is not limited to linear association and is more effective in analyzing the tail behavior of joint distribution. One can also notice that the approach of the paper is not COVID-specific and can be applied to a wide range of excess mortality contexts.
The main limitation of this modeling approach is related to its strength: it does not directly allow for the use of covariates or external predictors. Therefore, there is no explanation of future mortality other than through previous experience. There is also no room for structural changes during pandemics. That makes it less suitable for longer-term forecasts or developing a realistic explanation of a pandemic’s progression versus traditional SIR and regression models (
Chaurasia and Pal 2020). A possible problem is also the use of aggregated data which does not allow for the study of such factors as age, gender and socioeconomic status. The use of disaggregated data is one of the possible future directions of model development.
As we see from the illustration in
Section 4, time series models of COVID-19 excess mortality built from open source countrywide mortality data are viable even without addressing mortality covariates. ARIMA models have low lags and no residual autocorrelation, but model residuals tend to be non-normal, being skewed with fat tails. In addition, there appears to exist some cross-correlation between countries not otherwise captured by ARIMA models. This cross-correlation can be modeled using vine copula structures, with pair copulas appropriately selected via Bayesian analysis of different hypothesized families. The end result also demonstrates a geographic component in determining the association between the residuals of different countries. Neighboring countries tend to have higher correlations with each other compared to countries separated by an ocean. However, this is not always the case, as Norway and Sweden’s model residuals appeared to be independent of each other and were more closely related to Germany’s residuals. It appears that copula models applied to ARIMA residuals provide an effective way to address cross-correlation between the countries and may help to predict one country’s mortality based on the others. The copula approach allows for addressing non-linear dependence patterns such as tail dependence, which cannot be captured with individual ARIMA modeling or vector autoregression. Model validation showed that real-world experience toward the end of the pandemic differed somewhat from the model predictions, possibly due to decreased mortality. However, the dependence structure held, suggesting that the conclusions derived from the vine copula model were accurate. Overall, the three-stage modeling approach (ARIMA, distribution analysis of ARIMA residuals, vine copula model for the vector of residuals) seems suitable for creating joint short-term forecasts of pandemic mortality for several countries.