Convolutional Neural Network–Component Transformation (CNN–CT) for Confirmed COVID-19 Cases

Abstract: The COVID-19 disease constitutes a global health contingency. This disease has left millions of people infected, and its spread has dramatically increased. This study proposes a new method based on a Convolutional Neural Network (CNN) and temporal Component Transformation (CT) called CNN–CT. This method is applied to confirmed cases of COVID-19 in the United States, Mexico, Brazil, and Colombia. The CT changes daily predictions and observations to weekly components and vice versa. In addition, CNN–CT adjusts the predictions made by CNN using AutoRegressive Integrated Moving Average (ARIMA) and Exponential Smoothing (ES) methods. This combination of strategies provides better predictions than most of the individual methods by themselves. In this paper, we present the mathematical formulation for this strategy. Our experiments encompass the fine-tuning of the parameters of the algorithms. We compared the best hybrid methods obtained with CNN–CT versus the individual CNN, Long Short-Term Memory (LSTM), ARIMA, and ES methods. Our results show that our hybrid method surpasses the performance of LSTM, and that it consistently achieves competitive results in terms of the MAPE metric, as opposed to the individual CNN and ARIMA methods, whose performance varies widely across scenarios.


Introduction
Coronaviruses are a large family of viruses characterized by having crown-shaped spikes on their surface. Currently, there are seven identified types of coronaviruses that can be transmitted among humans. The most dangerous coronaviruses known until recent years were SARS-CoV and MERS-CoV, which caused the severe diseases SARS and MERS in 2003 and 2012, respectively [1]. However, at the end of 2019, in Wuhan, China, the new epidemiological outbreak of COVID-19 emerged; it was caused by the new coronavirus called SARS-CoV-2.
The importance of mathematical models and algorithms to analyze this disease has grown because they allow one to find patterns, make predictions, and understand fluctuations. Epidemiological models can be classified into two groups [2]:
• Dynamic Models. These are older models that usually divide the population into several subsets known as compartments, for instance, the Susceptible, Infectious, Recovered (SIR) model. The SIR model was proposed in 1902 by Sir Ronald Ross and then expanded by Kermack and McKendrick in 1927 [3].
• Forecasting models using time series. Here, we find classical methods such as ARIMA and Exponential Smoothing (ES) [4]. Furthermore, Machine Learning methods like Support Vector Machines [5] and Deep Learning [6] are also in this group.
This work presents a new method of the second group, based on a Convolutional Neural Network (CNN) [7] and a proposed Component Transformation (CT), which we named CNN-CT, whose mathematical formulation is presented. The CNN-CT method is applied to forecast the number of COVID-19 confirmed cases for the United States (US), Mexico, Brazil, and Colombia [8]. The CT changes daily observations into weekly data and back. The forecast made by our hybrid CNN-CT method is further adjusted with either the ARIMA or the ES method. We compared the proposed hybrid method versus the individual methods. Our results show that the combined method consistently achieves competitive results in terms of the MAPE metric, as opposed to any of its elements (CNN, ARIMA, or ES), whose performance as individual methods varies widely across countries. Moreover, the proposed CNN-CT method also outperforms the Long Short-Term Memory (LSTM) [9], which is among the most used methods for dealing with time series.
Both CNN and LSTM are Deep Learning methods. The first is equipped with convolutional filters and the second with recurrent operations, but in both cases the parameters are learned through gradient-descent-like methods, in a scenario where data are used for training as they become available. In contrast, ARIMA and ES are traditional regression methods that consider a full set of training data at once, thus having the potential of better approximating such a training set, but losing the ability to adjust to newly available data as CNN and LSTM can. The proposed CNN-CT method exploits both the potential of incorporating newly available data and the strength of looking at a complete set at once, which results in an enriched forecast method.
We chose to use CNNs, given that the signal processing literature states that convolutional filters are more stable than recurrent operations like LSTM [10]. Moreover, the superior performance of CNNs over traditional methods, like ARIMA, has been confirmed by previous work focused on text classification [11] and sequence modeling [12], where convolutions obtained higher performance with respect to other methods.
The rest of this paper is organized as follows. In Section 2, we discuss works related to the forecast of confirmed cases of COVID-19. In Section 3, we show the proposed forecasting method for daily confirmed cases of COVID-19, highlighting the application of Deep Learning, ARIMA, and ES methods. In Section 4, we present details about the data and tools used to validate our method. Finally, Sections 5 and 6 present results and conclusions of this work.

Related Works
COVID-19 is a disease with a high rate of spread, which has led to an interest in estimating and forecasting the number of infected people. Recently, several works have been presented with traditional epidemiological models or Dynamic Models. The Susceptible, Exposed, Infectious, Recovered (SEIR) model [13] was used to forecast confirmed cases in the United Kingdom, and the SIR and SEIR models were applied to forecast cumulative infected and recovered cases in Santiago de Cuba [14]. The Susceptible, Exposed, Infectious, Recovered, Dead (SEIRD) model [15] was used to forecast confirmed and death cases in Mexico. In Chen [16], a comparative study was conducted to predict 11 days of confirmed cases in some regions of Canada and the United States using SIR, Neural Network, and ARIMA models.
ARIMA and ES have been used as adjusting methods to improve the results obtained with other models, such as SIR models, Neural Networks, and Support Vector Regression algorithms [2,17]. However, in most cases, the number of days forecast is short. For instance, the authors of [18] used ARMA to forecast confirmed cases for three days in Chinese provinces, Asian countries, and a few Western countries (Germany, US, Italy, and Spain). Parvez et al. compared an Adaptive Neuro-fuzzy Inference System versus ARIMA to predict ten days of COVID-19 confirmed cases in Bangladesh [19].
Furthermore, Petropoulus et al. [20] used the ES method known as Holt-Winter to forecast ten days of globally accumulated COVID-19 confirmed cases. Hussain et al. [21] used an ES to estimate twelve days of confirmed cases and the parameter $R_0$, known as the basic reproduction number.
ARIMA and Deep Learning methods have also been used on their own to forecast COVID-19 cases. Chimmula [22] used LSTM to predict daily cases, obtaining a MAPE of eight percent. In Chandraa [23], LSTM, BiLSTM, and EDLSTM were used to forecast the spread of COVID-19 infections among selected states in India. The work presented by Zeroul et al. [24] used deep learning to predict the number of infected people over 10 days, obtaining a MAPE between 1.28% and 59%. Saba et al. [25] compared polynomial regression, Holt-Winter, ARIMA, and SARIMA models to predict confirmed cases and deaths. Parbat et al. [26] proposed using an SVR-Radial model to forecast total deaths and recovered, daily confirmed cumulative, and confirmed daily deaths in India; this method obtained a MAPE of around thirteen percent for the entire country.
Moreover, classical forecast methods have been combined with Machine Learning techniques [2,17,27]. Katris [27] used ARIMA, ES, Neural Network, and MARS models, where the combined methods performed better than the individual methods.
In general, ARIMA and ES methods are used to forecast cases with short-term periods, while Machine Learning and Deep Learning models are able to predict cases over more extended periods. However, the latter do not always obtain good results when used as individual methods.

CNN-CT Method
We show the proposed CNN-CT method in Figure 1, where a Convolutional Neural Network is used as the primary forecasting method for daily confirmed cases of COVID-19, and it is complemented by ARIMA or ES, which are used as adjusting methods against daily errors. Our method's training stage is composed of two phases, each of which is formed by three internal sub-processes, plus one global integration sub-process, as shown in Figure 1.
In the first sub-process of phase 1, we start by transforming daily values $y_t$ into weekly components $w_\tau$, where $t$ is a day index and $\tau$ is a component index. These $w_\tau$ components represent weekly averages of the daily values. In the second sub-process, a CNN is used to forecast the component $\hat{w}_\tau$. Finally, in the third sub-process, we convert the component estimation $\hat{w}_\tau$ back into daily estimations $\hat{y}_{\tau,t}$.
In phase 2, the adjusting methods are trained. First, we obtain the residual $e_t$ as the difference between the ground truth value and its corresponding daily prediction, i.e., $y_t - \hat{y}_{\tau,t}$. We scale these residual values to the range [1,10], as required by the Holt-Winter methods, which gives the normalized residuals $\varepsilon_t$.
In the second sub-process of phase 2, we use the normalized residuals $\varepsilon_t$ to train an autoregressive model using either ARIMA or ES, which is used to forecast residual values $\hat{\varepsilon}_t$ (concretely, $\hat{\varepsilon}_{t,es}$ and $\hat{\varepsilon}_{t,arima}$ for ES and ARIMA, respectively).
Later, in the third sub-process of phase 2, the residual forecasts $e_{t,es}$ or $e_{t,arima}$ are obtained by mapping the previously computed forecast values $\hat{\varepsilon}_t$ back to the non-normalized domain. Finally, this residual forecast $e_{t,X}$ is added to the daily estimation $\hat{y}_{\tau,t}$ obtained from the CNN, resulting in the final prediction value $F_t$.

Data Transformation
Prediction models reflect an increased error as the number of forecasting periods increases. We chose to forecast more cases by transforming daily records into weekly components with the CT module, which maps the daily cases $y_t$ into components $w_\tau$ that represent a weighted average of the daily cases obtained within a week. The values $w_\tau$ are calculated with Equation (1).
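The daily-to-weekly mapping can be illustrated with a minimal sketch. It is not Equation (1) itself: the weights of that equation are not reproduced here, so the helper `daily_to_weekly` (a hypothetical name) assumes equal weights by default, which reduces each component to a plain weekly mean. The daily counts are made-up numbers.

```python
from typing import List, Optional, Sequence


def daily_to_weekly(daily: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> List[float]:
    """Collapse consecutive 7-day blocks into one weighted-average component each."""
    if weights is None:
        # Assumption: equal weights, i.e. a plain weekly mean.
        weights = [1.0] * 7
    if len(weights) != 7:
        raise ValueError("expected one weight per day of the week")
    total_weight = sum(weights)
    n_weeks = len(daily) // 7  # an incomplete trailing week is dropped
    components = []
    for tau in range(n_weeks):
        week = daily[7 * tau: 7 * tau + 7]
        weighted_sum = sum(w * y for w, y in zip(weights, week))
        components.append(weighted_sum / total_weight)
    return components


# Two illustrative weeks of made-up daily counts (not taken from the dataset).
example_daily = [10, 12, 14, 16, 18, 20, 22, 30, 30, 30, 30, 30, 30, 30]
example_weekly = daily_to_weekly(example_daily)
print(example_weekly)
```

Passing a different `weights` vector would turn the plain mean into a weighted average, which is closer in spirit to the description above.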

CNN Forecast Component
We used a CNN as the component forecasting method. The training and validation stages are composed of $w_\tau$ values. The CNN architecture contains an input convolutional layer with 50 neurons, a max-pooling layer of size 2, a fully connected (MLP) layer of 50 neurons, and an output layer with a single neuron. The convolutional layer uses the ReLU activation function. The training configuration parameters are as follows: Adam optimizer [28], mean absolute error as the loss function, 100 epochs, and a batch size of 10. The above configuration is used to forecast the weekly components $\hat{w}_\tau$.
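A network with this layout could be written in Keras roughly as follows. This is a sketch under assumptions, not the authors' code: the input window of three lagged values, the kernel size of 2, and the ReLU activation on the dense layer are not stated in this section, and `build_cnn` and `TRAIN_CONFIG` are hypothetical names.

```python
# Training settings taken from the text: 100 epochs, batch size 10.
TRAIN_CONFIG = {"epochs": 100, "batch_size": 10}


def build_cnn(window: int = 3):
    """Build a small 1D CNN: Conv1D(50) -> MaxPooling1D(2) -> Dense(50) -> Dense(1).

    `window` (number of lagged components per sample) is an assumption.
    """
    # Imported lazily so the configuration above can be inspected
    # even where TensorFlow is not installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(window, 1)),
        # 50 convolutional filters with ReLU; kernel_size=2 is an assumption.
        layers.Conv1D(filters=50, kernel_size=2, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        # Fully connected layer of 50 neurons; its activation is an assumption.
        layers.Dense(50, activation="relu"),
        layers.Dense(1),
    ])
    # Adam optimizer and mean absolute error loss, as described in the text.
    model.compile(optimizer="adam", loss="mae")
    return model
```

With inputs `X` of shape `(samples, window, 1)` and targets `y`, training would then be `build_cnn().fit(X, y, **TRAIN_CONFIG)`.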

Daily Estimations
The reverse transformation, or daily estimation, involves converting the weekly components $w_\tau$ back into daily values. For this, it is necessary to calculate the subcomponents of a component, which we define as shown in Table 1. The segmentation of the week into two subcomponents provides insights about the social behavior of countries, separated into the beginning and the end of a week. The distribution of the daily cases with respect to their subcomponents can be obtained by Equations (2) and (3).
where $\delta_{\tau,1}$ and $\delta_{\tau,2}$ are the subcomponents ADS-1 (Monday to Thursday) and ADS-2 (Friday to Sunday) for the component $\tau$. The daily ratio $\mu_{\tau,t}$ represents the proportion of the original daily values within subcomponents 1 and 2 of the component $\tau$ (Equation (4)). The daily ratio $\mu_{\tau,t}$ lets us determine the weekday normalized cases $x_t$ (Equation (5)) of the training phase. In other words, $x_1 = \text{Mondays}_{avg}, \ldots, x_7 = \text{Sundays}_{avg}$ are the average confirmed cases of each day of the week throughout the time series.
The weighting of the daily cases obtained with the ratio $\mu_{\tau,t}$ provides a statistical estimation of the relevance of the persons infected in the first and second subcomponents $(\tau, j)$ of each component $\tau$ throughout the training period. The inverse transformation determines the daily cases predicted from the components using Equations (6) and (7).
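The idea of spreading a weekly component back over the days of the week can be sketched as below. This is a simplified stand-in, not a reproduction of Equations (2) to (7): it estimates one average share per weekday from the training weeks and assumes each component is a plain weekly mean. The function names and the sample data are hypothetical, and the series is assumed to start on a Monday.

```python
from typing import List, Sequence


def weekday_shares(daily: Sequence[float]) -> List[float]:
    """Average share of a week's total cases that falls on each weekday."""
    n_weeks = len(daily) // 7
    if n_weeks == 0:
        raise ValueError("need at least one full week of data")
    shares = [0.0] * 7
    for tau in range(n_weeks):
        week = daily[7 * tau: 7 * tau + 7]
        total = sum(week)
        if total == 0:
            raise ValueError("a week with zero cases has no defined shares")
        for d in range(7):
            shares[d] += week[d] / total / n_weeks
    return shares


def weekly_to_daily(component: float, shares: Sequence[float]) -> List[float]:
    """Spread a weekly mean over seven days according to the weekday shares."""
    weekly_total = 7 * component  # assumption: the component is a weekly mean
    return [weekly_total * s for s in shares]


# Made-up training data: two identical weeks, heavier from Friday to Sunday.
training = [1, 1, 1, 1, 2, 2, 2] * 2
shares = weekday_shares(training)
ads1_share = sum(shares[:4])  # Monday to Thursday (ADS-1)
ads2_share = sum(shares[4:])  # Friday to Sunday (ADS-2)
daily_estimates = weekly_to_daily(10.0, shares)
print(ads1_share, ads2_share, daily_estimates)
```

By construction the seven daily estimates sum to seven times the component, so the weekly level forecast by the CNN is preserved while the within-week pattern comes from the training data.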

Residual Transformation
A residual value is given by the difference between the ground truth and the predicted value, as shown in Equation (8).
$$e_t = y_t - \hat{y}_t \tag{8}$$
where $y_t$ is the ground truth at time $t$ and $\hat{y}_t$ is the forecast value at time $t$. Using Equation (8), the residuals $e_t$ are obtained by subtracting $\hat{y}_{\tau,t}$ from $y_t$, as shown in Equation (9).
$$e_t = y_t - \hat{y}_{\tau,t} \tag{9}$$
where $\hat{y}_{\tau,t}$ is the forecast value at time $t$ of component $\tau$. The ARIMA and ES methods were used with positive numbers; because of this, the residuals were normalized as shown in Equation (10).
where $|\cdot|$ represents the normalization of $e_t$ to the range [1,10], which yields the normalized residuals $\varepsilon_t$.
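As an illustration of this step, the sketch below computes residuals as ground truth minus forecast and maps them to [1,10]. The exact form of Equation (10) is not reproduced here, so a min-max scaling is assumed; the function names and the numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def residuals(truth: Sequence[float], forecast: Sequence[float]) -> List[float]:
    """Residuals as ground truth minus forecast."""
    return [y - y_hat for y, y_hat in zip(truth, forecast)]


def scale_to_range(values: Sequence[float],
                   low: float = 1.0,
                   high: float = 10.0) -> Tuple[List[float], float, float]:
    """Min-max scale values into [low, high] (assumed form of the normalization).

    Returns the scaled values plus the minimum and maximum needed to invert it.
    """
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot scale a constant series")
    span = v_max - v_min
    scaled = [low + (v - v_min) * (high - low) / span for v in values]
    return scaled, v_min, v_max


# Made-up daily values, for illustration only.
truth = [100, 120, 90]
forecast = [110, 100, 95]
e = residuals(truth, forecast)
eps, e_min, e_max = scale_to_range(e)
print(e, eps)
```

Keeping `e_min` and `e_max` is what later allows forecasts made in the normalized domain to be mapped back to signed residuals.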

Residual Forecast
We used the ARIMA and ES forecasting methods as adjusting methods. The training and validation sets are composed of $\varepsilon_t$ values.

Residual Estimations
We use the residual transformations $\varepsilon_t$ to train ARIMA and ES, from which we obtained four hybrid methods: CNN-ARIMA, CNN-ES, LSTM-ARIMA, and LSTM-ES. The forecasts $\hat{\varepsilon}_{t,es}$ and $\hat{\varepsilon}_{t,arima}$ from these hybrid methods are transformed into residuals $e_{t,es}$ and $e_{t,arima}$, which are in the non-normalized domain.

Forecasting
Finally, we evaluated the forecast values of the validation phase $F_t$, which are composed of the daily forecasts $\hat{y}_{\tau,t}$ of the CNN and the adjustment forecasts $e_{t,best}$, as shown in Equation (11).
$$F_t = \hat{y}_{\tau,t} + e_{t,\mathrm{best}} \tag{11}$$
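The final combination can be sketched in a few lines. The example assumes that the residual forecasts were produced in a [1,10] domain obtained by min-max scaling (an assumption about the normalization) and must be mapped back before being added to the CNN output. The helper names and all numbers are hypothetical.

```python
from typing import List, Sequence


def unscale(scaled: Sequence[float], v_min: float, v_max: float,
            low: float = 1.0, high: float = 10.0) -> List[float]:
    """Invert a min-max scaling from [low, high] back to [v_min, v_max]."""
    return [v_min + (s - low) * (v_max - v_min) / (high - low) for s in scaled]


def final_forecast(daily_cnn: Sequence[float],
                   residual_forecast: Sequence[float]) -> List[float]:
    """Add the residual forecast to the daily CNN estimate, day by day."""
    return [y_hat + e for y_hat, e in zip(daily_cnn, residual_forecast)]


# Made-up values: CNN daily estimates and residual forecasts in the [1, 10] domain.
cnn_daily = [110.0, 100.0, 95.0]
eps_forecast = [1.0, 5.5, 10.0]
# Bounds of the training residuals before scaling (illustrative).
e_forecast = unscale(eps_forecast, v_min=-10.0, v_max=20.0)
F = final_forecast(cnn_daily, e_forecast)
print(e_forecast, F)
```

Because the residual is defined as ground truth minus forecast, a positive residual forecast raises the CNN estimate and a negative one lowers it.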

Experimental Setup
The source of the data, the pre-processing applied, the data separation criterion in training, validation, and testing are described below. Finally, the evaluation metrics are described.

Data
The COVID-19 database used in this work is the Novel Coronavirus 2019 dataset [8], whose records report the number of infected, recovered, and deceased people in each country of the world. From this database, we used the time series called Time_Series_Covid_19_confirmed, which starts on 22 January 2020. We selected the records corresponding to the US, Mexico, Brazil, and Colombia.
We used data records from 2 March 2020 until 28 June 2020 for training (17 weeks); from 29 June 2020 to 19 July 2020 for validation (3 weeks); and from 20 July 2020 to 9 August 2020 for test (3 weeks). Figure 2 shows a scheme for this split of data.
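The split boundaries can be checked with a short script. It uses only the dates given above; the names `SPLITS` and `weeks_in` are hypothetical.

```python
from datetime import date

# Inclusive date ranges of the three splits, as stated in the text.
SPLITS = {
    "training": (date(2020, 3, 2), date(2020, 6, 28)),
    "validation": (date(2020, 6, 29), date(2020, 7, 19)),
    "test": (date(2020, 7, 20), date(2020, 8, 9)),
}


def weeks_in(start: date, end: date) -> int:
    """Number of whole weeks in an inclusive date range."""
    days = (end - start).days + 1
    if days % 7 != 0:
        raise ValueError("range does not cover a whole number of weeks")
    return days // 7


weeks = {name: weeks_in(start, end) for name, (start, end) in SPLITS.items()}
print(weeks)
```

Each range starts on a Monday and ends on a Sunday, so every weekly component covers one full calendar week.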
With this split, the training of the CNN-CT method for the US was carried out with 17 weekly components $w_\tau$, as explained in Section 3.1. In the case of Mexico, Brazil, and Colombia, we used only 15 weekly components since the data corresponding to the first week were discarded due to the lack of significant information; that is, the values of the first week were considerably low with respect to the rest of the series. We noticed that processing this first week results in underestimation of the forecast values. Although training is conducted using weekly components $w_\tau$, the forecast for the validation and test stages is made in daily values $\hat{y}_{\tau,t}$, as explained in Section 3.3.
Residual forecasts allow the daily forecasts to be adjusted with ARIMA and ES. The adjusting methods were trained with the residuals of the daily validation forecasts: the $\hat{w}_\tau$ forecasts obtained in the validation phase were transformed into daily estimations $\hat{y}_{\tau,t}$, which were then used in the training and validation phases of the adjusting methods. Figure 3 shows a scheme for this split of data for the adjusting methods. Given that the problem we address corresponds to a scenario of auto-regression, the actual structure of the data is such that each output variable $y_t$ depends on a vector of past values $x = [y_{t-1}, y_{t-2}, \ldots, y_{t-T}]$. For this work, we used lags of up to three past values, $t-3$, $t-2$, and $t-1$.
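The auto-regressive structure described above, where each target depends on its three preceding values, can be built as follows. This is a minimal sketch; `make_lagged` is a hypothetical helper and the series is made up.

```python
from typing import List, Sequence, Tuple


def make_lagged(series: Sequence[float],
                n_lags: int = 3) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) pairs where each input is [y_{t-1}, ..., y_{t-n_lags}]."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        # Most recent value first, matching x = [y_{t-1}, y_{t-2}, ..., y_{t-T}].
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets


# Made-up series, for illustration only.
X, y = make_lagged([1, 2, 3, 4, 5])
print(X, y)
```

A series of length n yields n minus 3 training pairs, which is one reason short series leave few samples for fitting.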

Metrics
The proposed hybridized CNN-CT method and its individual composing methods are evaluated by the Mean Absolute Percentage Error (MAPE) [29], as it has been widely used in the works discussed in Section 2. The MAPE computes the average percentage error of the predicted values with respect to the ground truth. The closer to zero, the more accurate the forecast is. Another common metric is the Root Mean Square Percentage Error (RMSPE) [4], which is also used in part of this paper.
where $y_t$ is the ground truth, $\hat{y}_t$ is the predicted value, and $n$ indicates the total number of samples.
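Both metrics can be computed with the standard textbook definitions, sketched below. Reporting RMSPE as a percentage (multiplied by 100) is an assumption about the convention used in the tables; the sample values are made up.

```python
import math
from typing import Sequence


def mape(truth: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent. Ground truth must be non-zero."""
    n = len(truth)
    return 100.0 / n * sum(abs((y - y_hat) / y) for y, y_hat in zip(truth, forecast))


def rmspe(truth: Sequence[float], forecast: Sequence[float]) -> float:
    """Root Mean Square Percentage Error, in percent (scaling is an assumption)."""
    n = len(truth)
    mean_sq = sum(((y - y_hat) / y) ** 2 for y, y_hat in zip(truth, forecast)) / n
    return 100.0 * math.sqrt(mean_sq)


# Made-up values: both forecasts are off by 10% of the ground truth.
example_truth = [100.0, 200.0]
example_forecast = [110.0, 180.0]
example_mape = mape(example_truth, example_forecast)
example_rmspe = rmspe(example_truth, example_forecast)
print(example_mape, example_rmspe)
```

RMSPE penalizes large individual errors more heavily than MAPE, so the two agree only when all percentage errors have the same magnitude, as in this example.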

Tools
This work was developed on a computer with a macOS operating system, 8 GB of RAM, and a 2.3 GHz Dual-Core Intel Core i5 processor. We used Python 3.7.1, and the CNN model was built using the TensorFlow and Keras libraries [30].

Results
This section shows the results of the CNN-CT method proposed for daily forecasting cases of COVID-19 in the US, Mexico, Brazil, and Colombia. First, we compare the performance of using CNN and LSTM as the main forecasting methods with ARIMA and ES (Holt-Winter, HW) as adjusting methods. Then, we present the comparison of the CNN-CT model versus the individual CNN, LSTM, ARIMA, and Holt-Winters models for each country.
Figure 4 compares the best-performing forecast models for the United States, Mexico, Brazil, and Colombia. In the US (Figure 4a), the forecasts of LSTM-ARIMA manage to maintain the trend and seasonality patterns with respect to the ground truth. However, the CNN-HW forecast is well below the actual data. We can see in Table 2 that LSTM-ARIMA achieves the lowest MAPE for the US. Likewise, Figure 4b shows the behavior of the forecasts for daily cases of COVID-19 in Mexico. We can see that all four models are able to maintain trend and seasonality patterns with respect to the ground truth. However, LSTM-ARIMA shows a high error rate because of its difference with respect to the actual data. On the other hand, the forecast of CNN-HW is very close to the real data, which gives it a better performance than the other methods. The average MAPE and its standard deviation are shown in Table 2, where we can see that CNN-HW achieves the best average performance among the four models.
Similarly, Figure 4c shows the comparative Brazil forecast for all the models. We can see that LSTM-ARIMA manages to maintain seasonality patterns with respect to the ground truth, while CNN-HW follows both the trend and the seasonality patterns. The average MAPE and its standard deviation are shown in Table 2; as in the case of Mexico, CNN-HW has the best performance.
We can see in Figure 4d that LSTM-ARIMA manages to maintain seasonality patterns with respect to the ground truth for Colombia, while CNN-HW follows both the trend and the seasonality patterns. According to Table 2, CNN-ARIMA shows the best MAPE performance, as its curve is the closest to the ground truth.
In general, our experiments show that smoothing with ARIMA or ES helps obtain a lower MAPE in the case of CNN. This is not the case with LSTM. Table 2 shows a summary of the MAPE and RMSPE daily forecasting values of CNN-CT and LSTM-CT for the US, Mexico, Brazil, and Colombia. In the case of the US, the method with the best performance is LSTM-ARIMA, with a MAPE ≈ 14%. In the case of Mexico and Brazil, CNN-HW is better, with a MAPE of 14.18% and 29.3%, respectively. It is possible to see that LSTM-ARIMA and CNN-HW obtain better results in different countries. In Colombia, CNN-ARIMA obtains the best MAPE and RMSPE.
We averaged the MAPE of all the countries for each method in Table 2. We observed that the CNN-CT methods perform better than the LSTM-CT methods. Furthermore, for each country, we determined the standard deviation of the error metrics. We noticed that CNN-CT has the lowest deviation, which indicates that its performance is consistent across countries. We show the comparison of CNN-HW versus the four individual methods in Table 3. We can see that CNN-HW surpasses all of these individual methods for Brazil and Colombia. For the case of Mexico, CNN-HW is below the best-performing method (CNN) by only 0.14 MAPE points. Furthermore, CNN-HW achieves competitive results for the US.

Conclusions
This paper investigates the problem of forecasting confirmed daily cases of COVID-19 in Mexico, Brazil, Colombia, and the US. Given the limited amount of data available at the time of conducting our experiments, several limitations of the prediction methods became evident. These limitations were even more obvious due to the presence of noise in the daily data, which might very well be a consequence of the restrictions on the flow of data imposed by the sanitary crisis related to COVID-19 worldwide.
In particular, most prediction methods decrease their accuracy as the periods for forecast become larger. To mitigate this issue, we proposed a component transformation that converts daily values into weekly components for correct prediction in those cases.
We present a hybrid forecasting method termed Convolutional Neural Network-Component Transformation (CNN-CT), which uses CNN and LSTM as the main prediction method and ES and ARIMA as adjusting methods for daily error correction. As a result, there are two variants of the proposed method: CNN-CT with Holt-Winters, and LSTM-CT with ARIMA.
We compared the prediction performance of the individual methods that compose the proposed CNN-CT using the MAPE metric. We noticed that CNN and LSTM are very good at learning the trend and seasonality of the time series; however, LSTM forecasts tend to generate increasing and decreasing trends, which causes the error to increase. Our experiments show that smoothing with ARIMA or ES helps obtain a lower MAPE in the case of CNN. This is not the case with LSTM.
As future work, we propose applying this methodology to other popular forecasting methods such as SVR and Recurrent Neural Networks; measuring the performance quality in more countries; and applying powerful data cleaning as a preprocessing stage. Furthermore, it could be interesting to use different adjusting methods. Finally, we propose testing whether the proposed methodology is completely general, or determining which strategy applies in different forecast scenarios.