1. Introduction
COVID-19 was declared a global pandemic by the World Health Organization (WHO) on 11 March 2020. The pandemic is ongoing: as of 19 January 2021, more than 95.5 million cases had been confirmed, with more than 2.03 million deaths attributed to COVID-19 across 190 countries [1,2]. The coronavirus was first identified in December 2019 in Wuhan, China. COVID-19 has since spread globally, with America, Europe, and countries in Asia reporting high numbers of cases. The government of China quickly implemented policies such as lockdown, physical distancing, mandatory masks, and quarantine to mitigate the spread of the virus. China controlled the outbreak rapidly and effectively, but many countries around the world are still struggling to control the spread of the virus. The virus reached Southeast Asia on 13 January 2020, when a 61-year-old woman from Wuhan tested positive in Thailand [3]. Indonesia, a country with a population of 273 million, is the worst-hit nation in the region, with a rapid increase in cases since its first case was reported in March 2020.
The COVID-19 pandemic has disrupted not only the normal way of life of communities, businesses, and government operations, but also the economy. It has affected all levels of society and all areas of life. Hospitals and doctors are struggling to provide care for COVID-19 patients, and businesses are affected by lockdowns. The pandemic has also forced many activities to be carried out online, and governments have enforced new standard operating procedures (SOPs) to ensure safety protocols for the public and for business operations. The pandemic is also causing an economic recession. The governments of many countries are allowing some economic activity while still enforcing strict health safety protocols for the public and business owners to follow. In any health crisis, predicting the number of cases is of utmost importance because it helps the relevant authorities take strategic action to mitigate a rise in cases or control the spread of the disease.
Accurate forecasts are needed to provide useful information for mitigating a global infectious disease pandemic. Forecasting the number of COVID-19 cases over the next few days is therefore most useful for decision making, including the provision of personal protective equipment (PPE), preparation of economic policies, preparation of health facilities, lockdown policies, and the opening of schools or businesses.
Currently, there are two approaches to forecasting COVID-19 cases. The first approach is forecasting COVID-19 using mathematical and statistical models. This approach requires knowledge of epidemiology and statistical assumptions regarding the distribution of the data. Mathematical and statistical model approaches include the autoregressive integrated moving average (ARIMA) [4,5,6], seasonal ARIMA (SARIMA) [4], the susceptible-infected-recovered (SIR) model [5,7], the logistic growth model [7], and the Richards model, which is an extension of a simple logistic growth model [8].
The second approach is forecasting COVID-19 using artificial intelligence. One artificial intelligence approach is machine learning, a computational method whose algorithms learn patterns in data to solve forecasting problems. Machine learning algorithms used for forecasting COVID-19 include the multi-layer perceptron, random forest, support vector regression, the Elman neural network [9,10,11], and the recurrent neural network (RNN) [9,10,12,13]. Shahid et al. [9] concluded that the RNN outperformed support vector regression and ARIMA. Hao et al.’s [10] experimental results showed that the RNN is more suitable for the prediction of the cumulative confirmed cases than for death and cured cases.
The RNN uses a network architecture that is suitable for processing sequential data. Qiu, Wang, and Zhou [14] applied an RNN with long short-term memory (LSTM) architecture and an attention mechanism for stock price forecasting. Uras et al. [15] applied an RNN with LSTM architecture for Bitcoin closing price forecasting. Yao and Guan [16] applied an RNN with an improved LSTM for natural language processing. The RNN is also widely applied for speech recognition [17] and to solve fuzzy non-linear programming [18]. Hewamalage, Bergmeir, and Bandara’s [19] experimental studies concluded that the RNN is a good algorithm for obtaining reliable forecasts.
Another artificial intelligence approach to forecasting is the use of metaheuristic optimization algorithms. The flower pollination algorithm (FPA) is a robust and adaptive metaheuristic optimization algorithm inspired by the pollination of flowering plants. The FPA balances global and local search and uses the Lévy flight distribution for better global search performance. The FPA was created by Yang [20] in 2014 and has been reported to outperform other nature-inspired methods such as the genetic algorithm and particle swarm optimization [20]. The FPA has been deployed to estimate transportation energy demand [21], to forecast Organization of the Petroleum Exporting Countries (OPEC) petroleum consumption [22], to forecast electricity energy consumption [23], and to solve combined economic and emission dispatch problems [24].
In this paper, the FPA was used to determine the optimal coefficients of the variables in the forecasting function of cumulative confirmed COVID-19 cases in Indonesia. In other words, the FPA was used to perform optimization for curve fitting of cumulative confirmed COVID-19 cases. We compare the performance of the FPA with that of a machine learning method popular for forecasting, the recurrent neural network (RNN). Experimental results showed that the FPA performed better than the RNN in both long-term (two-week) and short-term (one-week) forecasting. This research provides state-of-the-art results to support the mitigation of the COVID-19 pandemic in Indonesia. This paper is structured as follows: after this introduction, the second section covers related works on forecasting COVID-19 cases. The data and the methodology are explained in the third and fourth sections. The results and discussion are presented in the fifth section, and the conclusion is provided in the last section.
2. Related Works
This section presents related work on forecasting COVID-19 cases. As explained in the first section, there are two approaches to forecasting COVID-19 cases. The first, the mathematical and statistical model approach, is presented here [4,5,6,7,8,25,26].
Mishra et al. [4] applied the ARIMA, SARIMA, and Prophet models to forecast the cumulative deaths, cumulative cases, and new cases of COVID-19 in India. The models were used to forecast the COVID-19 cases for the next 15–20 days starting on 1 September 2020.
Abuhasel, Khadr, and Alquraish [5] applied SIR and ARIMA models to analyze and forecast the daily COVID-19 cases in the Kingdom of Saudi Arabia. The deterministic SIR model was applied to analyze the COVID-19 spread in Saudi Arabia, while the ARIMA model was used to forecast the daily COVID-19 cases. The two models were applied to the daily data from 3 March until 30 June 2020.
Ali et al. [6] applied the ARIMA model to forecast the cumulative confirmed cases, recovered cases, and deaths in Pakistan from COVID-19. The training data to develop the ARIMA model were from 27 February until 24 June 2020, and then the ARIMA model was used to forecast the next 10 days (25 June 2020 to 4 July 2020).
Malavika et al. [7] developed mathematical model approaches to forecast COVID-19 in India. The SIR models were applied to forecast the maximum number of active cases and the peak time, the logistic growth curve model was applied for short-term prediction, and the time interrupted regression model was used to analyze the effect of lockdown and other policies. The models were used to forecast the COVID-19 epidemic in India by the end of May 2020.
Zuhairoh and Rosadi [8] applied the Richards model, which is an extension of a simple logistic growth model, to forecast daily cases of COVID-19 in South Sulawesi Province, Indonesia. In addition to forecasting, the objective of this research was to predict when the pandemic would reach the peak of its spread and when it would end. The data used in this paper were compiled as of 24 June 2020.
Anastassopoulou et al. [25] developed a mathematical model approach to estimate the fatality ratio (death rate) and recovery case ratio based on time series of positive case data, death rate, and recovered cases from COVID-19 in Hubei, China. The model was based on data distribution from Middle East respiratory syndrome (MERS) and severe acute respiratory syndrome (SARS) cases that occurred previously. The model was applied to forecast the COVID-19 cases by the end of February 2020.
Petropoulos and Makridakis [26] applied a simple time series model from the exponential smoothing family to forecast the global number of positive cases, the number of deaths, and the number of patients who have been cured of COVID-19 infection. The model was used to forecast the COVID-19 cases from February until March 2020.
The second approach to forecasting COVID-19 cases uses artificial intelligence, especially machine learning methods [9,10,11,12,13]. Shahid, Zameer, and Muneeb [9] applied four different machine learning methods and the well-known ARIMA method to forecast the confirmed cases, recovered cases, and death cases in 10 major countries affected by COVID-19. The machine learning methods were RNN with bidirectional LSTM (Bi-LSTM) architecture, RNN with LSTM architecture, RNN with gated recurrent unit (GRU) architecture, and support vector regression (SVR). The data used in this research were from 22 January until 10 May 2020 for training, and from 11 May until 27 June 2020 for testing. The RNN models outperformed SVR and ARIMA for forecasting COVID-19. The models’ ranking, from the best to the worst performance, was: RNN Bi-LSTM, RNN LSTM, RNN GRU, SVR, and ARIMA.
Hao et al. [10] applied three machine learning methods to forecast the cumulative confirmed cases, cumulative deaths, and cumulative cured cases in Wuhan, Hubei Province, China. The machine learning methods were the Elman neural network, RNN-LSTM, and support vector machine (SVM). The data used in this research were from 23 January 2020 to 6 April 2020. Based on the experimental results, the RNN-LSTM model is more suitable for the prediction of the cumulative confirmed cases than for death and cured cases.
Balli [11] applied four different machine learning time series methods to forecast the weekly cumulative confirmed COVID-19 cases for the United States of America (USA), Germany, and the world. The machine learning methods were linear regression, multi-layer perceptron, random forest, and support vector machine (SVM). The data used in this research were from 20 January to 18 September 2020 and consist of weekly cumulative confirmed cases for 35 weeks. SVM outperformed the other methods for forecasting the COVID-19 cases.
Hawas [12] developed an RNN to forecast daily COVID-19 infections in Brazil. The training data to develop the RNN model were from 7 April until 6 May 2020, and then the RNN model was used to forecast the next 54 days (7 May 2020 until 29 June 2020). In this research, two alternative timesteps were used for the RNN, 30 and 40.
Shastri et al. [13] developed an RNN to forecast the confirmed cases and death cases of COVID-19 in India and the USA. In this research, variants of the LSTM architecture of the RNN were developed, including stacked LSTM, bi-directional LSTM, and convolutional LSTM. The data of confirmed cases used in this research, for both India and the USA, were from 7 February until 7 July 2020, while the data of death cases for India were from 12 March until July 2020, and for the USA were from 26 February until 7 July 2020. The training data constituted 80% of the total, while the validation data were 20%.
In COVID-19 research, machine learning has also been used for tasks other than forecasting, in particular on COVID-19 patient data. Zoabi et al. [27] used a gradient-boosting machine model built with decision-tree base learners to predict COVID-19 positive cases based on symptoms, while Kim et al. [28] evaluated several machine learning models to predict the need for intensive care. Recently, Ahmad et al. [29] proposed Shallow Single-Layer Perceptron Neural Network (SSLPNN) and Gaussian Process Regression (GPR) models for the classification and prediction of confirmed COVID-19 cases. Elzeki et al. [30] proposed a computer-aided model using deep learning to classify COVID-19-positive cases based on chest X-ray image data.
The results of closely related works are summarized in Table 1. In this research, a metaheuristic optimization algorithm, the FPA, is used to forecast the cumulative confirmed COVID-19 cases in Indonesia. The FPA is a robust and adaptive method to perform optimization for curve fitting of COVID-19 cases. The performance of the FPA was evaluated and compared with that of a machine learning method popular for forecasting, the RNN.
4. Methods
4.1. Forecasting Using Flower Pollination Algorithm
The flower pollination algorithm (FPA) is a nature-inspired metaheuristic algorithm proposed by Yang [20]. The FPA is based on the pollination process of flowering plants. Flower pollination can occur by self-pollination or cross-pollination. Self-pollination refers to pollination from the same flower, or from a different flower, of a single plant. It takes place when no reliable pollinator is available and is usually aided by wind. Self-pollination is also referred to as abiotic pollination. Cross-pollination, on the other hand, refers to pollination from a flower of a different plant. Cross-pollination is aided by pollinators, such as bees, bats, birds, and flies, which can fly long distances. The pollinators may exhibit Lévy flight behavior: they jump or fly with distance steps that obey a Lévy distribution. Cross-pollination is also referred to as biotic pollination. Cross-pollination is considered to be global pollination, while self-pollination is considered to be local pollination.
There are four rules for the FPA, based on the above flower pollination process of flowering plants:
Rule 1: biotic, cross-pollination, or pollination between flowers is global pollination following a Lévy distribution. This first rule is represented mathematically in Equation (1), where $x_i^t$ is the pollen $i$ or solution vector $x_i$ at iteration $t$, $g_*$ is the current best solution found among all solutions at the current iteration, and $L$ is the strength of the pollination (step size). A Lévy flight is used to mimic the long-distance movement of pollinators; therefore, $L$ is drawn from a Lévy distribution with a value greater than 0. The Lévy distribution is represented in Equation (2). It uses the standard gamma function $\Gamma(\lambda)$ and is valid for large steps $s > 0$.
Rule 2: abiotic, self-pollination, or pollination of flowers from the same plant is local pollination. Local pollination is represented mathematically in Equation (3), where $x_j^t$ and $x_k^t$ are two pollens of the same plant but from different flowers, and ε is a random value drawn from a uniform distribution in the range [0,1].
Rule 3: flower constancy, which pollinators often develop, is equivalent to a reproduction probability proportional to the similarity of the two flowers involved.
Rule 4: a switch probability $p \in [0,1]$ is used to switch between local pollination and global pollination.
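For reference, the global and local pollination updates of Rules 1 and 2 are commonly written as below in the standard presentation of the FPA. This is a sketch under assumptions: the scaling factor $\gamma$, the exponent $\lambda$, and the minimum step $s_0$ follow that common presentation, and the notation and sign convention may differ from those of Equations (1)–(3) in this paper.

```latex
\begin{align}
x_i^{t+1} &= x_i^{t} + \gamma\, L(\lambda)\,\bigl(g_* - x_i^{t}\bigr)
  && \text{(global pollination)} \\
L &\sim \frac{\lambda\,\Gamma(\lambda)\,\sin(\pi\lambda/2)}{\pi}\,
  \frac{1}{s^{1+\lambda}}, \qquad s \gg s_0 > 0
  && \text{(L\'evy step)} \\
x_i^{t+1} &= x_i^{t} + \varepsilon\,\bigl(x_j^{t} - x_k^{t}\bigr),
  \qquad \varepsilon \sim U[0,1]
  && \text{(local pollination)}
\end{align}
```

Here $x_i^t$ is solution $i$ at iteration $t$, $g_*$ is the current best solution, and $x_j^t$, $x_k^t$ are two other solutions from the same population.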
In this study, the FPA was used to forecast cumulative cases of COVID-19. The FPA was used to obtain the best solution $g_*$ from the set of solutions $x$. Each $x$ consists of multilinear regression coefficients $w_j$, where $j = 1, \ldots, N$, and a bias $b$ to predict the cumulative daily cases of COVID-19 for day $d$ based on the previous $N$ days, so that $x = (w_1, \ldots, w_N, b)$. Each $w_j$ is multiplied by the corresponding observed value $y_{d-j}$, the products are summed, and the bias $b$ is added to the result. Formally, the multilinear regression in this research is represented in Equation (4):
$$\hat{y}_d = \sum_{j=1}^{N} w_j\, y_{d-j} + b \tag{4}$$
The objective for each solution $x$ is to minimize the difference between the predicted cumulative cases $\hat{y}$ and the actual cumulative cases $y$. In this research, the root-mean-square error (RMSE) is used to measure the difference. RMSE is presented in Equation (5), where $m$ is equal to the length of the time series record:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2} \tag{5}$$
Based on the objective function that has been determined, the fitness function for each solution to be evaluated is represented mathematically in Equation (6). The best solution for each generation is $g_*$, and will be used as the final solution:
For each generation $t$, $n$ solutions are generated as a population. From the initial generation $t = 0$, the best solution in the population is designated $g_*$. In generation $t$, where $t > 0$, if a solution is better than $g_*$, that solution replaces the existing $g_*$. Because $g_*$ is updated iteratively in each generation, a dynamic approach is required. The solutions in generation $t + 1$ are formed from the pollination of the solutions in generation $t$ (either global pollination or local pollination, as stated in Equations (1) and (3), respectively). The switch between global and local pollination in generation $t$ is controlled by the switch probability $p$, as stated in Rule 4.
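The procedure in this subsection can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' released implementation: the population size, number of generations, switch probability, scaling factor `gamma`, initial coefficient range, greedy replacement of each solution, and the use of Mantegna's algorithm to draw Lévy steps are all choices made here for illustration. The absolute value of the Mantegna draw is taken so that the step is greater than 0, as the text requires. The names `levy_steps`, `rmse_of_solution`, and `fpa_fit` are hypothetical.

```python
import math
import numpy as np

def levy_steps(rng, size, lam=1.5):
    """Draw positive Levy-distributed step sizes (Mantegna's algorithm)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return np.abs(u) / np.abs(v) ** (1 / lam)

def rmse_of_solution(solution, X, y):
    """RMSE of the lagged linear model: the last entry of `solution` is the bias."""
    pred = X @ solution[:-1] + solution[-1]
    return float(np.sqrt(np.mean((y - pred) ** 2)))

def fpa_fit(X, y, n_pop=25, n_gen=500, p_switch=0.8, gamma=0.1, seed=0):
    """Search for regression coefficients and bias that minimize RMSE."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] + 1                      # N coefficients plus one bias
    pop = rng.uniform(-1.0, 1.0, (n_pop, dim))
    fit = np.array([rmse_of_solution(s, X, y) for s in pop])
    best, best_fit = pop[fit.argmin()].copy(), float(fit.min())
    for _ in range(n_gen):
        for i in range(n_pop):
            if rng.random() < p_switch:
                # Global pollination: Levy-scaled move toward the current best.
                cand = pop[i] + gamma * levy_steps(rng, dim) * (best - pop[i])
            else:
                # Local pollination: move along the difference of two random solutions.
                j, k = rng.choice(n_pop, size=2, replace=False)
                cand = pop[i] + rng.random() * (pop[j] - pop[k])
            cand_fit = rmse_of_solution(cand, X, y)
            if cand_fit < fit[i]:             # keep the candidate only if it improves
                pop[i], fit[i] = cand, cand_fit
                if cand_fit < best_fit:
                    best, best_fit = cand.copy(), cand_fit
    return best, best_fit
```

Each row of `X` is expected to hold the previous N observations and `y` the value to be predicted. For real cumulative counts, which grow over several orders of magnitude, rescaling the series before fitting is likely to help; that preprocessing step is an assumption and is not described in the text above.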
4.2. Forecasting Using Recurrent Neural Network
The second method applied is the recurrent neural network (RNN). The RNN is a neural network architecture suited to processing sequential data. An advantage of the RNN architecture is that it is flexible and can be adapted to the number of sequences in the input or output. The RNN uses iterative function cycles to store information [31]. The architecture is constructed so that the network remembers previous information and applies it when calculating the current output. In the RNN, the nodes of the hidden layer are connected to one another across timesteps, and the hidden layer’s input includes not only the output of the input layer but also the output of the hidden layer at the previous timestep; thus, the RNN can preserve, learn, and record historical information in sequence data [32].
The RNN has a forward pass similar to that of a multilayer perceptron with a single hidden layer. The difference is that the RNN hidden layer receives activations from both the current external input and the hidden layer at the previous timestep [31]. As shown in Figure 2, the structure of the RNN includes the input layer, hidden layer, output layer, the weights from the input layer to the hidden layer, the weights from the hidden layer to the output layer, and learnable weights for the previous hidden state. These recurrent connections pass values across the timesteps of the sequence.
With this architecture, the current output in the RNN depends on the previous state. In a simple RNN, hidden units receive the input at the current timestep and the output of the previous hidden state. The current hidden unit and the output can be defined mathematically in Equations (7) and (8), respectively:
$$h_t = f\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right) \tag{7}$$
$$y_t = W_{hy}\, h_t + b_y \tag{8}$$
For Equation (7), $h_t$ is the hidden state and $x_t$ is the input at the current timestep. $W_{xh}$ is the learnable weight from the input layer to the hidden layer, while $W_{hh}$ are the learnable weights for the previous hidden state’s input. $f$ is an activation function and $b_h$ is the bias for the hidden layer. The activation function can be chosen to suit the problem; its purpose is to make the model non-linear. Common choices are the sigmoid, tanh, and ReLU functions. For Equation (8), $y_t$ is the output state, $h_t$ is the hidden state, $W_{hy}$ is the learnable weight from the hidden layer to the output layer, and $b_y$ is the bias for the output layer.
The complete sequence of hidden activations can be calculated by starting at the first timestep and recursively applying Equation (7), incrementing the time at each step. At the first timestep, the previous hidden state can either be set manually to a chosen value or set to zero. It has been reported that RNN stability and performance can be improved by using non-zero initial values. As for the weights, the norm is to initialize them randomly when no prior information about the data is available. However, they can be set to particular values to help avoid overfitting [31].
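The recursive forward pass described above can be sketched with NumPy as follows. This is a minimal illustration rather than the model used in the experiments: the function name `rnn_forward`, the weight names `W_xh`, `W_hh`, `W_hy`, the biases `b_h`, `b_y`, and the zero default for the initial hidden state are assumptions introduced here, and ReLU is used as the default activation only because it is the activation this paper reports using.

```python
import numpy as np

def relu(z):
    """Rectified linear unit applied element-wise."""
    return np.maximum(z, 0.0)

def rnn_forward(inputs, W_xh, W_hh, b_h, W_hy, b_y, h0=None, activation=relu):
    """Run a simple RNN over a sequence and return (outputs, hidden_states)."""
    # Initial hidden state: zeros unless the caller supplies a value.
    h = np.zeros(W_hh.shape[0]) if h0 is None else np.asarray(h0, dtype=float)
    hidden_states, outputs = [], []
    for x_t in inputs:
        # Hidden update: current input plus previous hidden state, then activation.
        h = activation(W_xh @ x_t + W_hh @ h + b_h)
        hidden_states.append(h)
        # Linear readout from the hidden state.
        outputs.append(W_hy @ h + b_y)
    return np.array(outputs), np.array(hidden_states)
```

The same weights are reused at every timestep, which is why the hidden state at a late step depends on every earlier input in the sequence.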
In neural networks, the error of the prediction with respect to the target is calculated after the output is obtained. This error is normally measured by a differentiable loss function, and the partial derivatives of the loss with respect to the weights are used to update the weights. There are two well-known algorithms for calculating the loss derivatives for RNNs: real-time recurrent learning (RTRL) and backpropagation through time (BPTT). BPTT is known to be simpler and more efficient in computation time, particularly since its process is similar to standard backpropagation in neural networks [31].
In this research, the data of cumulative COVID-19 daily cases are represented sequentially. Each sequence consists of data from the previous N days. This sequence is fed to the RNN architecture to predict the cumulative COVID-19 cases of the following day. The experiments are conducted using several combinations of hyperparameters, such as the number of hidden layers, the number of neurons, the learning rate, and the dropout ratio, to determine the best model with minimum RMSE. ReLU is used as the activation function and Adam is used as the optimizer.
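The sequence construction described above can be sketched as follows. This is a minimal illustration, not the authors' published code: the function name `make_windows`, the argument `n_lags` (standing in for N), and the toy series are assumptions introduced here.

```python
import numpy as np

def make_windows(series, n_lags):
    """Turn a 1-D series into (previous n_lags values, next value) pairs."""
    series = np.asarray(series, dtype=float)
    if len(series) <= n_lags:
        raise ValueError("series must be longer than n_lags")
    # Row i holds days i .. i+n_lags-1 (oldest first); its target is day i+n_lags.
    inputs = np.stack([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    targets = series[n_lags:]
    return inputs, targets

# Toy cumulative counts (made-up numbers, for illustration only).
toy_cases = [2, 4, 6, 9, 13, 18]
inputs, targets = make_windows(toy_cases, n_lags=3)
# Many RNN libraries expect a trailing feature axis: shape (samples, n_lags, 1).
rnn_inputs = inputs[..., np.newaxis]
```

The same windows can also serve as the lagged inputs of a linear model, so both forecasting methods can be trained on identical input and target pairs.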
4.3. Model Performance Measurement
To measure the performance of the forecasting models, two performance measurements are used in this research: root-mean-square error (RMSE) and mean absolute percentage error (MAPE). RMSE is represented mathematically in Equation (5). The smaller the RMSE value, the more accurate the forecasting model; conversely, the larger the RMSE value, the less accurate the model [33]. The RMSE is an absolute error in the units of the data and provides no information about the size of the error relative to the actual value. MAPE, in contrast, is a widely used evaluation metric for forecasting methods that expresses the error as a percentage. MAPE is represented mathematically in Equation (9), where $y_i$ is the actual value, $\hat{y}_i$ is the forecast value, and $m$ is the length of the time series record:
$$\mathrm{MAPE} = \frac{100\%}{m}\sum_{i=1}^{m}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{9}$$
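As a small worked illustration of the two measurements, the sketch below computes RMSE and MAPE with NumPy using their standard definitions. The function names `rmse` and `mape` and the example numbers are assumptions introduced here, not taken from the experiments.

```python
import numpy as np

def rmse(actual, forecast):
    """Root-mean-square error, in the same units as the data."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(100.0 * np.mean(np.abs((actual - forecast) / actual)))

# Made-up example: both forecasts miss by 10 cases.
example_actual = [100.0, 200.0]
example_forecast = [110.0, 190.0]
example_rmse = rmse(example_actual, example_forecast)   # 10.0 cases
example_mape = mape(example_actual, example_forecast)   # 7.5 percent
```

The example shows why both measures are reported: the two misses are equal in absolute terms, but the miss on the smaller actual value contributes twice as much to MAPE (10% versus 5%).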
The code of both forecasting models, the FPA and RNN, is publicly available at http://ugm.id/covidforecasting (accessed on 29 November 2022). The code is written in the Python programming language.
4.4. Training, Validation, and Testing Data
This study involved two phases, which are Phase 1: Development of FPA and RNN Model, and Phase 2: Evaluation of the Forecast Performance of the FPA and RNN Model Developed in Phase 1.
In Phase 1: Model Development, the data period is from 2 March 2020 to 10 July 2020. The dataset is divided in a ratio of 80:20, with 80% for training data and 20% for validation data. Therefore, the training data are from 2 March 2020 to 14 June 2020, while the validation data are from 15 June 2020 to 10 July 2020, as presented in Table 2. The validation process is carried out to determine the appropriate hyperparameters for the model.
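The chronological 80:20 split can be checked with a short calculation. The sketch below is an illustration under one assumption: the number of training days is obtained by rounding 80% of the total to the nearest day, a rule that is not stated in the text. The function name `chronological_split` is hypothetical.

```python
from datetime import date, timedelta

def chronological_split(start, end, train_ratio=0.8):
    """Split an inclusive date range into consecutive training and validation periods."""
    n_days = (end - start).days + 1              # inclusive number of daily records
    n_train = round(n_days * train_ratio)        # assumed rounding rule
    train_end = start + timedelta(days=n_train - 1)
    return (start, train_end), (train_end + timedelta(days=1), end)

train_period, validation_period = chronological_split(date(2020, 3, 2), date(2020, 7, 10))
```

With 131 daily records, rounding gives 105 training days, which places the boundary between 14 June and 15 June 2020, in agreement with the stated start of the validation data.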
In Phase 2: Model Evaluation, after the appropriate hyperparameters for the FPA and RNN models are obtained, the testing process is conducted. The FPA and RNN models are tested for short- and long-term forecasts of the cumulative COVID-19 cases. Several references [4,5,6,10,26] conducted forecasting for the next 7–14 days. Therefore, we used a one-week forecast for short-term forecasting and a two-week forecast for long-term forecasting:
Long-term forecast, which forecasts the cumulative cases of COVID-19 over the next 14 days (2-week forecast);
Short-term forecast, which forecasts the cumulative COVID-19 cases for the next 7 days (1-week forecast).
In order to obtain more comprehensive results of the performance of the models, the testing (forecast) process is conducted in several rounds or iterations. Long-term testing is conducted in 5 iterations, while short-term testing is conducted in 10 iterations. The model is updated with the relevant training data in each iteration using the hyperparameters defined in the validation sample in Phase 1.
Table 3 presents the period of training data and testing data for long-term testing, while Table 4 presents the period of training data and testing data for short-term testing.
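The repeated testing rounds can be sketched as a rolling-origin scheme. This is only one plausible reading, offered as an assumption: it supposes that the training window grows to include all earlier data and that the test blocks are consecutive and non-overlapping. The actual periods are those given in Tables 3 and 4. The function name `rolling_origin_splits` and the series length used below are hypothetical.

```python
def rolling_origin_splits(n_obs, horizon, n_rounds):
    """Yield (train_indices, test_indices) for successive forecast rounds."""
    first_test = n_obs - horizon * n_rounds
    if first_test <= 0:
        raise ValueError("not enough observations for the requested rounds")
    for r in range(n_rounds):
        start = first_test + r * horizon
        # Train on everything before the test block; test on the next `horizon` days.
        yield range(0, start), range(start, start + horizon)

# Hypothetical series of 201 daily records.
long_term_rounds = list(rolling_origin_splits(201, horizon=14, n_rounds=5))
short_term_rounds = list(rolling_origin_splits(201, horizon=7, n_rounds=10))
```

Both settings cover the same 70 final days in this sketch (5 × 14 and 10 × 7), which makes the long-term and short-term errors comparable over one period.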