Using the SARIMA Model to Forecast the Fourth Global Wave of Cumulative Deaths from COVID-19: Evidence from 12 Hard-Hit Big Countries

: The COVID-19 pandemic is a serious threat to all of us. It has caused an unprecedented shock to the world’s economy, and it has interrupted the lives and livelihood of millions of people. In the last two years, a large body of literature has attempted to forecast the main dimensions of the COVID-19 outbreak using a wide set of models. In this paper, I forecast the short- to mid-term cumulative deaths from COVID-19 in 12 hard-hit big countries around the world as of 20 August 2021. The data used in the analysis were extracted from the Our World in Data COVID-19 dataset. Both non-seasonal and seasonal autoregressive integrated moving averages (ARIMA and SARIMA) were estimated. The analysis showed that: (i) ARIMA/SARIMA forecasts were sufﬁciently accurate in both the training and test set by always outperforming the simple alternative forecasting techniques chosen as benchmarks (Mean, Naïve, and Seasonal Naïve); (ii) SARIMA models outperformed ARIMA models in 46 out 48 metrics (in forecasting future values), i.e., on 95.8% of all the considered forecast accuracy measures (mean absolute error [MAE], mean absolute percentage error [MAPE], mean absolute scaled error [MASE], and the root mean squared error [RMSE]), suggesting a clear seasonal pattern in the data; and (iii) the forecasted values from SARIMA models ﬁtted very well the observed (real-time) data for the period 21 August 2021–19 September 2021 for almost all the countries analyzed. This article shows that SARIMA can be safely used for both the short- and medium-term predictions of COVID-19 deaths. Thus, this approach can help government authorities to monitor and manage the huge pressure that COVID-19 is exerting on national healthcare systems.


Introduction
The COVID-19 pandemic is one of the most severe and dangerous challenges that the world has faced. As a result, the human and socio-economic costs of the COVID-19 pandemic have been dramatically high. As of 20 August 2021, the global death toll from COVID-19 had reached more than 4.4 million people, and several countries had effectively entered the fourth wave of the pandemic (Worldometer 2021). In fact, the virus that caused COVID-19 has mutated multiple times, resulting in highly alarming and contagious variants, such as Alpha, Beta, Delta, and Gamma, which first appeared in the UK, South Africa, Brazil (and Japan), and India, respectively (Centers for Disease Control and Prevention [CDC] 2021).
In such a situation, it becomes crucial to provide reliable forecasts of the patterns of the pandemic so healthcare facilities and personnel can be managed better. Thus, in the last two years, a wide body of studies has attempted to forecast the main dimensions of the COVID-19 pandemic, such as the number of confirmed cases, deaths, hospitalizations, recovered, and vaccinated.
The most used prediction techniques were the seasonal and non-seasonal autoregressive integrated moving average (SARIMA and ARIMA) models (Alzahrani et al. 2020;

Brief Review of the Literature
The ARIMA model, also known as the Box-Jenkins method (Box and Jenkins 1976), is one of the most widely used statistical methods for forecasting stationary time series. It has been extensively employed in many areas of research, including environmental pollution (Sen et al. 2016;Zhang et al. 2018), meteorological factors (Valipour 2015;Liu et al. 2021), financial markets (Chung et al. 2009;Adebiyi et al. 2014), and especially for predicting trends and patterns of infectious disease (Earnest et al. 2005;Gaudart et al. 2009;Li et al. 2012;Kane et al. 2014;Liu et al. 2016;Wang et al. 2018;Singh et al. 2020;Ala'raj et al. 2021). The ARIMA model has very good properties. It is easy to fit and manage, and it is understandable even for non-professional users. It can deal with many common practical situations and complex patterns such as calendar variation, cyclicity, seasonality, trends, external or exogenous interventions, outliers, randomness caused by other factors and/or diseases, and other relevant real aspects of time series (Pack 1990; Barnett and Dobson 2010). Moreover, it does not assume any knowledge of underlying models or structure as do some other forecasting methods (Adebiyi et al. 2014). It simply allows the prediction of a given time series by considering its own lags, i.e., the previous values of the observed time series and the lagged forecast errors. Table 1 lists 32 studies that used an ARIMA/SARIMA framework to forecast the patterns of infectious diseases over the last 16 years. Table 1. Thirty-two selected studies on infectious disease forecasting, which used non-seasonal and seasonal ARIMA model.

Data
The data used to forecast the cumulative deaths from COVID-19 in the 12 selected countries were extracted from the Our World in Data COVID-19 dataset (https://ourworldindata. org/coronavirus, accessed on 25 September 2021), which relies on data collected by The Johns Hopkins University (JHU). Table 2 reports for each country the start date, the end date, and the number of observations. As suggested by several authors (Box and Tiao 1975;McCleary et al. 1980;Box et al. 1994), a reasonable ARIMA model requires at least 40-50 observations. Since the time series for this paper range from a minimum of 386 observations (Vietnam) to a maximum of 549 (Iran), this condition is met. All the time series are plotted in Figure 1, and they suggest an upward trend in the cumulative deaths from COVID-19 in all 12 countries. The COVID-19 daily deaths-obtained by first-differencing each time series-show that 10 of the countries experienced multiple waves, and Thailand and Vietnam are undergoing the first severe wave of COVID-19 ( Figure 2). This seems to suggest the presence of complex patterns and seasonality in the dynamics of deaths from COVID-19. Figure 3 shows plots of the numbers of cumulative deaths from COVID-19 per 100,000 inhabitants. As of 20 August 2021, Argentina, Brazil, and Mexico had reached the highest values, with 269. 81, 242.57, and 195.51 deaths per 100,000 inhabitants, respectively. By contrast, Vietnam, Thailand, and Bangladesh had the lowest values, with 7.75, 12.65, and 15.19 deaths per inhabitant, respectively. This is a matter of concern, especially for American countries.

ARIMA and SARIMA Models
The non-seasonal ARIMA model is classified as "ARIMA(p,d,q)," where: p is the order of the autoregressive (AR) process, d is the order of differencing required by the time series to get stationary, and q is the order of the moving average (MA) process. By multiplying the seasonal terms by the non-seasonal terms in the ARIMA model, it is possible to get a seasonal ARIMA (SARIMA) model. It assumes the notation "SARIMA (p,d,q)(P,D,Q)m," where: m is the frequency of data, and the lowercase and uppercase notations refer to the non-seasonal and seasonal components of the model, respectively.
The analysis used the following steps: i. First, I split the original dataset into training and test sets, and I ran the model with the training set. Its output was compared with the target, i.e., the test set. In particular, the training set was used to predict the last 20 observations of the original dataset. 1 The best ARIMA and SARIMA 2 models were identified using the "auto.arima( )" function included in the package "forecast" (in the R software), developed by Hyndman and Khandakar (2008). 3 This function follows sequential steps to identify the best model to fit. It finds the best model by using the unit root test to assess the non-seasonal and seasonal degrees of difference necessary to make the time series stationary 4 and by looking at the minimization of the Akaike's information criterion (AIC) and the maximum likelihood estimation (MLE). 5 This procedure was used to prevent issues of overfitting and underfitting and to evaluate the overall performance of the model, i.e., its ability to predict unseen data. In addition, as suggested by Hyndman and Athanasopoulos (2021, Section 5.2), I also compared my preferred methods to three simple forecasting methods, i.e., Mean, Naïve, and Seasonal Naïve approaches. 6 To assess the suitability of each model, I used the mean absolute percentage error (MAPE) metric. In fact, it is the most widely used error metric (Kim and Kim 2016;Hyndman and Athanasopoulos 2018, section 3.4), and it is not scale-

ARIMA and SARIMA Models
The non-seasonal ARIMA model is classified as "ARIMA(p,d,q)", where: p is the order of the autoregressive (AR) process, d is the order of differencing required by the time series to get stationary, and q is the order of the moving average (MA) process. By multiplying the seasonal terms by the non-seasonal terms in the ARIMA model, it is possible to get a seasonal ARIMA (SARIMA) model. It assumes the notation "SARIMA (p,d,q)(P,D,Q) m ", where: m is the frequency of data, and the lowercase and uppercase notations refer to the non-seasonal and seasonal components of the model, respectively.
The analysis used the following steps: i. First, I split the original dataset into training and test sets, and I ran the model with the training set. Its output was compared with the target, i.e., the test set. In particular, the training set was used to predict the last 20 observations of the original dataset. 1 The best ARIMA and SARIMA 2 models were identified using the "auto.arima( )" function included in the package "forecast" (in the R software), developed by Hyndman and Khandakar (2008). 3 This function follows sequential steps to identify the best model to fit. It finds the best model by using the unit root test to assess the non-seasonal and seasonal degrees of difference necessary to make the time series stationary 4 and by looking at the minimization of the Akaike's information criterion (AIC) and the maximum likelihood estimation (MLE). 5 This procedure was used to prevent issues of overfitting and underfitting and to evaluate the overall performance of the model, i.e., its ability to predict unseen data. In addition, as suggested by Hyndman and Athanasopoulos (2021, sct. 5.2), I also compared my preferred methods to three simple forecasting methods, i.e., Mean, Naïve, and Seasonal Naïve approaches. 6 To assess the suitability of each model, I used the mean absolute percentage error (MAPE) metric. In fact, it is the most widely used error metric (Kim and Kim 2016;Hyndman and Athanasopoulos 2018, sct. 3.4), and it is not scale-dependent. Thus, it is easily comparable, immediately giving a good approximation of the accuracy of the models. 7 ii. Second, I forecasted the time window of specific interest, from 21 August 2021 to 4 October 2021, and I compared the best ARIMA and SARIMA models on the minimization of AIC and four common measures of the accuracy of models: the mean absolute error (MAE), MAPE, mean absolute scaled error (MASE) and the root mean squared error (RMSE). After identifying the best models with the "auto.arima( )" function, I fitted the SARIMA models with Gretl-2021-c software, using the exact MLE approach and standard errors of parameters based on the Hessian matrix. iii.
Then, I investigated the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the residuals for the first 14 lags to establish if the residuals described a white noise process. If signs of autocorrelation were present, as suggested by Hyndman and Athanasopoulos (2018, sct. 8.7), I graphically investigated ACF and PACF of the original time series (after differencing), and I added enough parameters until the residuals showed to be randomly distributed. This iterative process was based on the minimization of AIC and four common measures of the accuracy of models: MAE, MAPE, MASE, and RMSE. 8 iv.
Finally, I compared 30-day forecasts, from 21 August 2021 to 19 September 2021, with the actual trends (real-time data) to assess the overall reliability of the models by looking at the MAPE between them.
The steps of this procedure are summarized in Figure 4. The estimated baseline equation for the ARIMA models with (p,d,q) non-seasonal order terms was the following (Davidson 2000) 9 : where ∆ d is the difference operator, 10 y t means the forecasted values, p is the lag order of the AR process, φ is the coefficient of each parameter p, q is the order of the MA process, γ is the coefficient of each parameter q, and ε t denotes the residuals of the errors at time t.
Econometrics 2022, 10, x FOR PEER REVIEW 9 of 25 Figure 4. Nine sequential steps to identify and evaluate the best forecasting models for cumulative deaths from COVID-19.

Evaluation Metrics
I used four common metrics-MAE, MAPE, MASE, and RMSE-to evaluate the over-Start: Set the frequency of the time series to 1 for ARIMA and 7 for SARIMA.
Split the original dataset and train the model by using the "auto.arima( )" function, and evaluate the accuracy.
Forecast the window of interest (30 observations out of sample) for each country using the "auto.arima( )" function.
Choose the best ARIMA and SARIMA models for each country, based on AIC minimization.
Compare ARIMA and SARIMA models on AIC minimization and MAE, MAPE, MASE, and RMSE metrics. Choose the best models.
Plot ACF and PACF of the reisduals of the best models and detect any significant spikes.
Add autoregressive (AR) and/or moving average (MA) parameters until residuals yield white noise (on minimization of AIC).
Fit the final best model with Gretl-2021-c, using the exact maximum likelihood approach and standard errors based on the Hessian matrix.
Finally, compare 30-day forecasts with real-time data to assess the reliability of the models in the short to midterm. The estimated basic equation for the SARIMA models with (p,d,q) non-seasonal order terms and (P,D,Q) seasonal order terms was the following (Chatfield 2000;Clarke and Clarke 2018): where d is the order of non-seasonal differencing, D is the order of seasonal differencing, s is the number of seasons per year, B is the backshift operator, φ p (B) and θ q (B) denote the non-seasonal polynomials of order p and q in B, Φ P B S and Θ Q B S denote the seasonal polynomials of order P and Q in B S , and ε t denotes the residuals of the errors at time t.

Evaluation Metrics
I used four common metrics-MAE, MAPE, MASE, and RMSE-to evaluate the overall accuracy of the forecasted models. In fact, since each of these error measures has specific characteristics and criticalities, I safely considered them jointly in the analysis (omitted reference). The formulae used to calculate each of these metrics were: where n represents the number of observations, y i denotes the actual values, and y i indicates the forecasted values.

Results and Discussion
Figures 5 and 6 show the results of the training and test sets for each country by fitting ARIMA and SARIMA models, respectively. The training set and the test set exhibited, in most cases, very low and similar MAPE for ARIMA and SARIMA models. The only exception was the ARIMA model for Vietnam, where MAPE for the test set was 15 times larger than MAPE for the training set. In this case, MAPE for the test set was definitively greater than that for the training set, suggesting overfitting issues. However, this is not particularly worrying because the SARIMA model for Vietnam exhibited much better performance than the ARIMA model. Moreover, SARIMA outperformed both ARIMA models and the simple forecasting methods used as benchmarks, i.e., Mean, Naïve, and Seasonal Naïve (Table 3). Notably, MAPE for SARIMA models was always lower or very close to 1%, except for Vietnam. This could be deemed a satisfying output considering that many factors were not included in the forecasting process, such as climate and environmental conditions, the efficiency and capacity of the health systems, non-pharmaceutical interventions (lockdowns, physical distancing, quarantine), the age structure of the population, and vaccination campaigns. 11 Thus, the models seem able to learn from previous data, and they can be effective in predicting unseen observations.  In Table 4, I present the ARIMA models for forecasting the cumulative deaths from COVID-19 for each country in the period from 21 August 2021 to 19 September 2021, chosen by using the "auto.arima( )" function. Adding the seasonal effect in the "auto.arima( )" algorithm, I attained the SARIMA models for each country. However, looking at the plots of the ACF and PACF for lags up to 14, it seems that there is structure left in the residuals of the SARIMA fitted models for Bangladesh, Brazil, Iran, the Philippines, Russia, South Africa, Thailand, the US, and Vietnam ( Figure S1, in Supplementary Materials S1). Therefore, I adjusted model parameters until I attained a white noise process. 12 The final optimal SARIMA models are reported in Table 5. 13 Econometrics 2022, 10, x FOR PEER REVIEW 13 of 25  Therefore, the optimal number of parameters for predicting cumulative the deaths from COVID-19 for each country were the following (Table 5): Argentina (0,2,1)(2,0,2)7, Bangladesh (3,1,3)(1,1,2)7, Brazil (1,1,8)(0,1,1)7, India (0,2,1)(2,0,2)7, Iran (6,2,2)(2,0,1)7, Mexico (0,2,1)(4,0,0)7, the Philippines (6,2,4)(3,0,4)7, Russia (4,2,4)(4,0,3)7, South Africa (5,1,8)(4,1,4)7, Thailand (4,2,10)(4,0,2)7, the US (6,1,1)(0,1,1)7, and Vietnam (5,2,4)(0,0,1)7.
Since MASE was always much lower than 1, the actual forecast performance is much better than the naïve method (Table 5). 14 In other words, the proposed method yields smaller errors than one-step errors from the average naïve method (Hyndman and Koehler 2006). According to Lewis (1982), the results of MAPE indicated that SARIMA models had very high accuracy. In fact, the MAPE difference between the observed and fitted data was much smaller than 10%, ranging from 0.34% for Iran to 1.92% for Vietnam. Notably, except for Argentina, India, and Vietnam, the remaining countries had a MAPE smaller than 1% (Table 5). The excellent goodness of fit is confirmed by the analysis of ACF and PACF of the models (Figure 7). In fact, both functions did not show any significant spike, suggesting that residuals were not correlated in all the countries analyzed. That is, the fitted data described a white noise process. PACF of the models (Figure 7). In fact, both functions did not show any significant spike, suggesting that residuals were not correlated in all the countries analyzed. That is, the fitted data described a white noise process. Figure 7. ACF and PACF plot of the residuals of the best SARIMA models (reported in Table 5).
In Figures 8 and 9, I graphically represent the optimal SARIMA models for forecasting cumulative deaths from COVID-19 in the 12 hard-hit big countries in the next 30 days, from 21 August 2021 to 19 September 2021. The light blue area identifies the prediction interval at a 95% level of confidence. The red dashed line represents the forecasted values, and the light green continuous line identifies the original time series until 20 August 2021.
Although the predictions seem to stress a common upward trend for cumulative deaths from COVID-19 in the next 30 days for all the countries, the fitted curves of the  Table 5).
In Figures 8 and 9, I graphically represent the optimal SARIMA models for forecasting cumulative deaths from COVID-19 in the 12 hard-hit big countries in the next 30 days, from 21 August 2021 to 19 September 2021. The light blue area identifies the prediction interval at a 95% level of confidence. The red dashed line represents the forecasted values, and the light green continuous line identifies the original time series until 20 August 2021.
Although the predictions seem to stress a common upward trend for cumulative deaths from COVID-19 in the next 30 days for all the countries, the fitted curves of the forecasted values exhibit different slopes. A slowdown in the growth curve of the deaths from COVID-19 seems to be possible in Argentina, Bangladesh, Brazil, and India. In this respect, the predicted values underline the likelihood of a flattening in the curves of cumulative deaths from COVID-19 around the end of September 2021. On the contrary, Iran, Mexico, the Philippines, South Africa, Russia, Thailand, the US, and Vietnam appear to be characterized by sustained growth of the total deaths from COVID-19 in the next 30 days. Among them, Thailand and Vietnam show a possible explosive growth in the number of deaths from COVID-19 in the same period.   Notably, Brazil, Iran, the Philippines, Russia, Thailand, and the US had the smallest prediction intervals, suggesting a low uncertainty in estimating deaths from COVID-19 at the 95% level of confidence, while Argentina, Bangladesh, India, and Vietnam had the largest prediction intervals. Notably, Brazil, Iran, the Philippines, Russia, Thailand, and the US had the smallest prediction intervals, suggesting a low uncertainty in estimating deaths from COVID-19 at the 95% level of confidence, while Argentina, Bangladesh, India, and Vietnam had the largest prediction intervals.
Finally, in Figures 10 and 11, I compare the estimated models to the real-time data over the period 20 August 2021 to 19 September 2021. The predictions from the SARIMA models seemed to fit very well with the observed data over that time window. The only exception was Thailand, whose forecasts-although also increasing-overestimated the real trend. 15 Econometrics 2022, 10, x FOR PEER REVIEW 18 of 25 Finally, in Figures 10 and 11, I compare the estimated models to the real-time data over the period 20 August 2021 to 19 September 2021. The predictions from the SARIMA models seemed to fit very well with the observed data over that time window. The only exception was Thailand, whose forecasts-although also increasing-overestimated the real trend. 15    Table 7 shows that MAPE difference between forecasted and observed data tended, on average, to grow in all countries. However, this is not particularly worrying because the absolute values of MAPE generally remained low over the whole forecasting window. On 19 September 2021 (i.e., after 30 days), MAPE was lower than 1% for ten out of twelve countries, that is, 83.33% of the sample ( Table 7). The highest values were reached by Vietnam and Thailand, with a MAPE of 4.21% and 10.69%, respectively. Russia, Argentina, and the US showed the lowest MAPE after 30 days, with average differences between forecasted and observed data of 0.03%, 0.11%, and 0.15%, respectively. Thus, the models proved to be not only accurate but also reliable enough in short-to mid-term. This is consistent with the recent literature (Khan and Gupta 2020;Alabdulrazzaq et al. 2021;Al-Turaiki et al. 2021;ArunKumar et al. 2021) and suggests the suitability of the SARIMA models to predict the trend of cumulative deaths from COVID-19 around the world.  Table 7 shows that MAPE difference between forecasted and observed data tended, on average, to grow in all countries. However, this is not particularly worrying because the absolute values of MAPE generally remained low over the whole forecasting window. On 19 September 2021 (i.e., after 30 days), MAPE was lower than 1% for ten out of twelve countries, that is, 83.33% of the sample ( Table 7). The highest values were reached by Vietnam and Thailand, with a MAPE of 4.21% and 10.69%, respectively. Russia, Argentina, and the US showed the lowest MAPE after 30 days, with average differences between forecasted and observed data of 0.03%, 0.11%, and 0.15%, respectively. Thus, the models proved to be not only accurate but also reliable enough in short-to mid-term. This is consistent with the recent literature (Khan and Gupta 2020;Alabdulrazzaq et al. 2021;Al-Turaiki et al. 2021;ArunKumar et al. 2021) and suggests the suitability of the SARIMA models to predict the trend of cumulative deaths from COVID-19 around the world.

Conclusions
In this paper, I attempted to forecast the cumulative deaths from COVID-19 in 12 hardhit big countries for the period 21 August 2021-19 September 2021. The results showed that: (i) the implemented forecasting procedures proved to have a good prediction accuracy both in the training and the test set, by outperforming the simple alternative methods (Mean, Naïve, and Seasonal Naïve); (ii) SARIMA models outperformed ARIMA models (in predicting future values) on AIC and almost all the considered forecast accuracy measures (MAE, MAPE, MASE, and RMSE), suggesting the existence of strong seasonal patterns in the time series; and (iii) the 30-day forecasts from the SARIMA models fitted very well the observed data over the period 21 August 2021-19 September 2021 in almost all the countries analyzed.
Thus, SARIMA models were shown to be accurate and reliable tools for forecasting cumulative deaths from COVID-19. They adapted very well to the implemented data, even with complex patterns and seasonality. This is consistent with the extensive and successful use of this approach in the recent literature for predicting the outcomes of the COVID-19 disease (ArunKumar et al. 2021;Malki et al. 2021;Satpathy et al. 2021). Although predictions beyond 15 or 20 days should be taken with some caution, the models estimated in this article may give a reliable approximation of the pattern of growth of the main dimensions of the COVID-19 pandemic and other similar diseases. In particular, SARIMA models proved that they could be safely used for both the short-and mid-term. Therefore, these predictions can help the government authorities to monitor and manage the huge pressure that COVID-19 is exerting on national healthcare systems.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/econometrics10020018/s1, Figure S1: ACF and PACF plot of the residuals of the SARIMA models obtained using the "auto.arima( )" function; Table S1: Comparison between SARIMA models obtained using "auto.arima( )" function and adjusted SARIMA models considering the minimization of AIC, MAE, MAPE, MASE, and RMSE metrics (in percentage), for cumulative deaths from COVID-19; Acknowledgments: I would like to thank two anonymous reviewers for their constructive comments and suggestions.

Conflicts of Interest:
The author declares no conflict of interest.

1
In fact, as suggested by Hyndman and Athanasopoulos (2021, sct. 5.8), in the first stage, it is crucial to ensure that models perform well on data that are not used to predict the future, and splitting the original dataset into two different subsets is a very common practice to do this. The choice of 20 observations for the test set was due to the fact that my predictive analysis was focused on the medium term. 2 In this case, as suggested by Hyndman (2013), since the time series had daily observations, the frequency was set to 7. This is the easiest approach, and, in this case, it gives the most accurate results. 3 The "auto.arima( )" function is discussed in detail in Hyndman and Athanasopoulos (2018, sct. 8.7). 4 Specifically, the function uses as default the repeated Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test (Kwiatkowski et al. 1992) to determine the appropriate non-seasonal order of differencing. As suggested by Hyndman (2014), this is generally more accurate than the two alternative tests, the augmented Dickey-Fuller (ADF) test (Dickey and Fuller 1979) and the Phillips-Perron (PP) test (Phillips and Perron 1988). To identify the appropriate seasonal order of differencing, the algorithm uses, as default, the test "seas". This is a measure of seasonal strength developed by Wang et al. (2006). 5 For the ARIMA models, I used the following script: auto.arima(training_data,stationary=FALSE,seasonal=FALSE,ic=c("aic"), stepwise=FALSE,nmodels=1000,approximation=FALSE,test=c("kpss")). While for the SARIMA models, I used the following script: auto.arima(train_argentina,stationary=FALSE,seasonal=TRUE,ic=c("aic"),stepwise=FALSE,nmodels=1000,approximation =FALSE,test=c("kpss"),seasonal.test=c("seas")). The same procedure was also applied to forecast the window of interest (from 21 August 2021 to 4 October 2021). 6 They were used as benchmarks, i.e., to ensure that ARIMA/SARIMA models were better than simple alternatives and, thus, worthy of being considered. 7 In this regard, it is useful to stress that MAPE also has some disadvantages, such as giving infinite or undefined results when one or more time series data point equals 0 or close-to-zero actual values. Moreover, it puts a heavier penalty on negative errors (i.e., when predicted values are higher than actual values) than on positive errors. In this case, the mean arctangent absolute percentage error (MAAPE) suggested by Kim and Kim (2016) could be implemented. However, since it did not modify the results of this paper, I preferred not to include it in the analysis. The output of MAAPE is available upon request. 8 The "auto.arima( )" function does not consider the functional form of the residuals. Thus, residuals could not be described as a white noise process. In this case, a manual adjustment is required (Hyndman and Athanasopoulos 2018, sct. 8.7). 9 The drift is omitted because all the models reported in Table 4 had a second difference operator (Hyndman and Athanasopoulos 2018, sct. 8.7). Moreover, a drift in first differences would imply the presence of a linear trend in levels, and that did not seem likely (Figures 1 and 2). 10 I.e., the order of differencing needed to achieve stationarity. 11 To this regard, several studies showed the importance of demographic, environmental, healthcare, and lockdown policies in explaining COVID-19 deaths (Conyon et al. 2020;Sarkodie and Owusu 2020;Perone 2021a). 12 In Table S2 (Supplementary Materials S2), I compared the SARIMA models obtained using the "auto.arima( )" function and the adjusted SARIMA models on the minimization of AIC and four error measures (MAE, MAPE, MASE, and RMSE). The results showed that the latter outperformed the models obtained using the "auto.arima( )" function in 35 out 40 metrics, i.e., on 87.5% of all the forecast accuracy measures. The outcomes were not straightforward for Vietnam; however, the AIC, the ACF, and PACF clearly favored the adjusted SARIMA model. 13 The parameter values of the best SARIMA models are reported in Table S3 (Supplementary Materials S3). 14 Only the SARIMA model for Philippines exhibited a MASE close to 1 (0.9385). However, since it was lower than 1, SARIMA model was better than the naïve method. 15 It is necessary to stress that also the SARIMA model for Vietnam tended to overestimate the real trend. However, the MAPE difference between forecasted and observed data (after 30 days) is significantly lower (4.21%) than that for Thailand (10.69%). Thus, it does not appear to be a matter of serious concern.