Probabilistic Approach to COVID-19 Data Analysis and Forecasting Future Outbreaks Using a Multi-Layer Perceptron Neural Network

The present outbreak of COVID-19 is a worldwide calamity for healthcare infrastructures. On a daily basis, a fresh batch of perplexing datasets on the numbers of positive and negative cases, individuals admitted to hospitals, mortality, hospital beds occupied, ventilation shortages, and so on is published. Infections have risen sharply in recent weeks, corresponding with the discovery of a new variant from South Africa (B.1.1.529 also known as Omicron). The early detection of dangerous situations and forecasting techniques is important to prevent the spread of disease and restart economic activities quickly and safely. In this paper, we used weekly mobility data to analyze the current situation in countries worldwide. A methodology for the statistical analysis of the current situation as well as for forecasting future outbreaks is presented in this paper in terms of deaths caused by COVID-19. Our method is evaluated with a multi-layer perceptron neural network (MLPNN), which is a deep learning model, to develop a predictive framework. Furthermore, the Case Fatality Ratio (CFR), Cronbach’s alpha, and other metrics were computed to analyze the performance of the forecasting. The MLPNN is shown to have the best outcomes in forecasting the statistics for infected patients and deaths in selected regions. This research also provides an in-depth analysis of the emerging COVID-19 variants, challenges, and issues that must be addressed in order to prevent future outbreaks.


Introduction
The present COVID-19 outbreak is a serious global crisis for healthcare infrastructures. The pandemic has triggered a crisis in which schools, administrative institutions, and financial institutions such as banks have been shut down in many major countries. Notably, such disruptions not only cause problems for people in the short term but also have long-term effects, for example, an increase in unemployment [1]. According to a study [2], the situation causes a 2.5-3% decline in global GDP every month. Furthermore, based on previous crises, it appears that younger and less-educated workers are the most financially impacted [2]. COVID-19 is thought to have originated from animals.
If communities do not follow preventive policies for this highly contagious disease, COVID-19 can spread easily to healthy humans through close contact. Traveling has been the main cause of the huge spread [3,4]. In the early days of the COVID-19 pandemic, almost all reported cases were symptomatic. In a study by Noh J and Danuser G [5] of 50 countries, the number of actual COVID-19 patients in 25 of those countries was predicted to be 5 to 20 times larger than confirmed infected cases. In several European countries in March 2020, the number of total cases/infected patients was around 2.5 times higher than actual reported patients, and currently, it is estimated that the number of unseen infected cases is still 1.5 times higher than reported cases because undiscovered or unseen patients could be symptom-less or exhibit very subtle illness symptoms [6]. Researchers in the fields of pharmacy, chemistry, mathematics, physics, statistics, economics, computer science, geophysics, and medicine have joined hands to fight against COVID-19. However, no one has reached a firm conclusion yet on how to overcome this problem. Furthermore, the structure and symptoms are always mutating. The flu, body temperature, coughing, and shortness of breath are the initial indications of the COVID-19 virus. The severe side effects of this infection may cause acute respiratory disorder (a severe form of asthma), pneumonia, heart failure, renal failure, and possibly death in the subsequent stages [7]. The COVID-19 spread could be significantly slowed down by employing precautionary measures such as minimizing direct contact, social isolation, and smart lockdowns [8].
The accurate and robust forecasting of COVID-19 cases and deaths can assist government interventions and encourage the general public to consider effective actions to slow down the spread of this disease [9]. Researchers have conducted multiple studies to explore the COVID-19-associated risk factors and emotional effects, covering various categories such as nature, health, lockdown, etc., using different models [8,10]. Machine learning models, such as random forest, support vector machine (SVM), K-nearest neighbors (KNN), artificial neural networks, and many others, have also been used to predict the COVID-19 situation [11]. The reproduction rate of a disease is of great concern to epidemiologists because it determines whether an outbreak grows; a reproduction rate greater than one indicates that the disease is spreading in the population [12]. The nature of COVID-19 has been studied by taking a variety of mathematical models into account. The most used model for analyzing disease dynamics is the Susceptible-Infectious-Recovered (SIR) model. This model uses a system of time-dependent differential equations to predict epidemic growth. Researchers have extensively employed the SIR model and different modified forms to study Ebola and AIDS [13,14]. Godio et al. [15] studied the recent SARS-CoV-2 outbreak using data from Italy and the SEIR epidemiological model. They used a Particle Swarm Optimization (PSO) solver to create a stochastic method to fit the model parameters, which improved the reliability of the predictions over a medium-term horizon of thirty days. Their findings matched Spanish and South Korean statistics and forecasts. Baleanu et al. [16] and some other researchers [17,18] used the Caputo-Fabrizio derivative to create a COVID-19 fractional differential equation model. The data on COVID-19 form a sequence of observations over time, so time-series prediction approaches, e.g., artificial neural network-based methods and meta-predictors, are well suited to these data [19,20].
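To illustrate how the SIR model mentioned above turns a system of time-dependent differential equations into an epidemic curve, the following is a minimal Python sketch using a forward-Euler integration. The function name `simulate_sir`, the parameter values, and the population size are hypothetical and are not taken from this study or from the cited works.

```python
def simulate_sir(beta, gamma, s_init, i_init, r_init, days, dt=0.1):
    """Integrate the SIR equations with a simple forward-Euler scheme.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    n = s_init + i_init + r_init
    s, i, r = float(s_init), float(i_init), float(r_init)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Hypothetical scenario with reproduction rate beta / gamma = 2.5 (> 1).
growing = simulate_sir(beta=0.5, gamma=0.2, s_init=9990, i_init=10, r_init=0, days=120)
# Hypothetical scenario with reproduction rate beta / gamma = 0.5 (< 1).
fading = simulate_sir(beta=0.1, gamma=0.2, s_init=9990, i_init=10, r_init=0, days=120)

peak_growing = max(i for _, _, i, _ in growing)
peak_fading = max(i for _, _, i, _ in fading)
print(f"peak infections, rate > 1: {peak_growing:.0f}")
print(f"peak infections, rate < 1: {peak_fading:.0f}")
```

In this sketch the number of infectious individuals rises well above its initial value only when the ratio beta / gamma exceeds one, which is the threshold behaviour described in the text.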
For time-series forecasting, ANNs are frequently used [21]. ANN-based techniques have many advantages over traditional machine learning techniques, and one of the key advantages is that an ANN can be fed raw data and discover the desired features automatically [22]. ANNs perform well with respect to numerous factors such as accuracy, latency, speed, convergence, and size [23,24]. It is important to note that this research relies on ANNs for forecasting the COVID-19 situation in certain countries.
In this paper, we propose a model to forecast future COVID-19 scenarios in major countries and provide insights for government bodies and policymakers. This work also provides a detailed look at the current COVID-19 variants, challenges, and guidelines for preventing the outbreak effectively. This forecasting is intended to assist organizations, legislators, and the general public in implementing new tactics and reinforcing ongoing COVID-19 precautionary actions. Additionally, this study could aid in relieving the socio-economic and psychological distress caused by COVID-19. The key contributions of this study are given below.

• Awareness about emerging variants of COVID-19: We have collected information about COVID-19 including its types and emerging variants. It is important to note that some of the variants can appear without any prior symptoms.
• Literature review: This article gives a brief overview of the related work recently undertaken in the field of COVID-19 forecasting using data mining approaches including machine learning, and deep learning techniques.
• Proposed Methodology: We proposed an artificial neural network-based methodology for the statistical analysis of the current pandemic situation in some eastern and western countries. The results show that our approach works well in terms of precision and model fitting to statistical data.
• Challenges and future directions: We discussed the current issues associated with utilizing Artificial Intelligence methods to resolve the COVID-19 pandemic. Furthermore, we demonstrate how machine learning and deep learning can assist in preventing the spread of COVID-19 in the future. We also address the potential future contributions of AI and blockchain-based solutions to analyze the outbreak response.

Coronavirus
Coronaviruses are a large family of viruses found in both humans and animals [25]. Seven different types have been identified, including the ones that caused COVID-19 and the SARS and MERS illnesses. According to initial estimations, the virus that causes COVID-19 seemed to be more contagious than the one that caused SARS, although it appeared less likely to cause severe illness. We still have a lot to learn about the novel coronavirus (COVID-19) [26].

Symptoms of COVID-19
COVID-19 has been related to a variety of symptoms, ranging from simple headaches to life-threatening illness. Symptoms and signs may appear 2 to 14 days after exposure to the virus [27]. The severity of the symptoms varies from mild to severe. COVID-19 can cause a wide range of symptoms in patients, and any list of symptoms and manifestations is not exhaustive. The CDC [27] continues to update the list of possible symptoms whenever new information becomes available from research labs or other academic sources. COVID-19 infection appears to put elderly persons and persons with serious medical conditions, such as diabetes, heart disease, or respiratory problems, at an increased risk of developing more serious conditions.

Types of Coronavirus
In a new study on COVID-19, UK-based scientists discovered that there are six different varieties of COVID-19 infection, each with its own set of symptoms.

1. Flu-like without a temperature: Fatigue, muscle aches, absence of smell, sore throat, coughing, shortness of breath, and no temperature are some of the additional symptoms.

2. Flu-like with temperature: Fatigue, absence of smell, sore throat, coughing, uncontrollable shaking, a decrease in hunger, and a temperature.

3. Gastrointestinal: Fatigue, absence of smell, sore throat, a decrease in hunger, chest pain, no coughing, and diarrhea.

4. Extreme level one, severe exhaustion: Fatigue, loss of smell, cough, chest pain, a temperature, and hoarseness.

Extreme level three, abdominal and pulmonary: Fatigue, absence of smell, a decrease in hunger, coughing, sore throat, chest pain, a temperature, hoarseness, and muscle pain.

Emerging Variants of COVID-19
New variants are emerging with time. For example, recently, a new mutant (B.1.1.529 also known as Omicron) has emerged, which is fast spreading and can pose a big threat to the effectiveness of COVID-19 vaccinations [28]. Researchers are closely monitoring this novel mutant of COVID-19. This variant contains various changes, which were earlier reported in other mutants, particularly Delta. This new variant has been observed to be expanding rapidly within South Africa. Nowadays, the main goal is to focus on its expansion. The said mutation was identified in Botswana on 11 November 2021 [29] and was identified in a South African traveler who traveled to Hong Kong. Omicron was added to the list of "variants of concern" by the WHO, which also contains Alpha, Beta, Gamma, and Delta. Viruses transform themselves all the time and the majority of mutations are minor. Some of these mutations may be harmful to the virus itself, whereas others can make the infection more aggressive or dangerous. Table 1 illustrates the alterations with the highest risk, which are described as the "variants of concern" and are regularly observed by healthcare practitioners. Regarding vaccinations against COVID-19, the vaccinations from Chinese Sinopharm, Pfizer, and AstraZeneca are very efficacious against the variations after two doses, whereas resistance after one dosage appears to be diminished [30].
There are several variants of SARS-CoV-2, including a new, extremely contagious variant that was detected in the United Kingdom [26]. Another of these new variants is known as VOC202101/02 or P.1 and was reported in visitors from Brazil who traveled to Japan in January 2021. This variant contains a 1-4 nt insertion, three deletions, four synonymous mutations, and 17 distinct amino acid changes [31]. Travel restrictions were implemented in an effort to stop the spread of P.1 throughout the nation after it was discovered in the United Kingdom [32]. However, another variant from Brazil (known in the UK as VUI202101/01) was discovered in the UK and comprises a minor recessive mutation. Eight instances of this type, which appeared to be of minimal significance, had been reported as of 14 January 2021. The "expansion and importance of this mutation continues under investigative process", according to Public Health England (PHE). At the same time as the English variant, the South African variant appeared and has since been found in at least 20 countries. According to South African genomic data, the 501Y.V2 variant swiftly supplanted other circulating lineages in the country because it appeared to have a greater infection rate and hence is more transmissible. The N501Y and E484K spike protein mutations are present in this variant, as they are in the English and Brazilian variants.

Variants of Interest (VOI)
There is significant proof that the differences in the variants have a massive effect on infectivity, disease intensity, and/or resistance, affecting the epidemiologic scenario in the EU/EEA [30]. There is at least reasonable certainty in the findings for these features, which included genetic, epidemiologic, and in vitro investigations. Additionally, all of the prerequisites for the variants of concern and under investigation listed in Table 2 apply. The indications are labeled to show whether they come from the variants themselves (v) or from mutations linked to the variants (m). Evidence with a "low confidence" rating is labeled to highlight that it is inconclusive. Blank fields or null fields indicate that there are no existing evaluations or scientific evidence for the category, whereas "no" means that there has been no change associated with the feature. B.1 is the comparable virus that is presumed to be "wild-type" (with D614G and no other spike protein modifications) [27].

Variants under Observation
SARS-CoV-2 variants under observation were discovered as indications through outbreak intelligence, rules-based genomic variant screening, and initial technical data [38]. There is some indication that they are similar to the VOIs in terms of quality; however, the evidence is either inadequate or is still to be examined by the ECDC [27]. One or more outbreaks in communities or proof of the communal spread of the mutation elsewhere in the world must have been established for the mutations mentioned in Table 3.

Related Work
Machine learning algorithms often employ data sequences collected over time as the input data to forecast the COVID-19 pandemic situation. The COVID-19 spread has been predicted using a variety of methodologies. The Long Short-Term Memory (LSTM) algorithm is one of the methodologies that has been used. The multi-layer perceptron (MLP), for example, is now being used to forecast the spread of COVID-19. This strategy has made it easier to anticipate the maximum number of COVID-19 victims, the highest proportion of survivors, and the highest number of fatalities per region in a specific time period [44].
Al-Qanes et al. [45] developed a more advanced form of the adaptive neuro-fuzzy inference system (ANFIS) to estimate the number of infected patients in four different countries: the United States, Iran, Italy, and Korea. Their approach was founded on the marine predators algorithm, a novel nature-inspired optimization method. The ANFIS variables were optimized using this technique, improving prediction accuracy. The model has shown efficient prediction performance in terms of MAE, RMSE, MAPE, and R² [45]. Other research used an improved ANFIS model by integrating the flower pollination algorithm (FPA) and salp swarm algorithm (SSA). The proposed FPASSA-ANFIS framework was evaluated by employing verified data obtained from the WHO website. Additionally, the proposed model's performance was evaluated using two different datasets of weekly infected patients [20].
The Susceptible-Exposed-Infectious-Recovered (SEIR) approach was used by Alsayed et al. [46] to forecast pandemic peaks in Malaysia. The researchers utilized the ANFIS approach to anticipate the number of infected people in the short term. Additionally, they hypothesized that extending the treatment time may lessen the severity of the pandemic at its height. The MAPE, RMSE, and R² values for this study were 2.79, 46.87, and 0.9973, respectively [46]. Behnood et al. [47] evaluated the influence of several climate-related elements and the size of the population on the spread of COVID-19 by integrating the viral optimization algorithm (VOA) and ANFIS. They showed that the density of the population had a surprising impact on how well their constructed scenarios operated, highlighting the critical role that social distancing plays in reducing the rate as well as the spread of COVID-19. They reported the RMSE as 22.47, MAE as 7.33, and R² as 0.83 [47].
Aora et al. [48] employed RNN-related LSTM variations to predict the number of positive patients in India. The LSTM model was chosen for forecasting daily as well as weekly COVID-19 patients with approximated errors of three percent for daily cases and eight percent for weekly cases based on the lowest false alarm rate. Depending on the volume of confirmed patients and everyday progression of the designation of COVID-19 hotspots, they divided Indian states into various zones [48]. A bidirectional LSTM network was used by Fokas et al. [49] to produce a reliable generalization of RNNs. This technique was used to forecast new COVID-19 infected individuals in the United States, Spain, Italy, Germany, France, and Sweden [49].
The regression model proposed by Yadav et al. [50] for the forecasting of COVID-19 cases was based on six regression analyses including quadratic, third-degree, fourth-degree, fifth-degree, sixth-degree, and exponential polynomials. The sixth-degree polynomial regression method was the best model for the forecasting of short-term new cases [50]. Geographical hierarchies were employed by Kim et al. [51] to develop Hi-COVIDNet in accordance with a neural network of two-level machinery based on information gathered from the continent and at the country level. This approach comprehended the complex connections between far-off nations and connected their unique risks of infection to the targeted community [51].
Three hybrid techniques for COVID-19 time-series forecasting were developed by Abbasimehr and Paki [52] by combining the Bayesian optimization algorithm with the multi-head attention, LSTM, and CNN deep learning techniques. Their findings revealed that deep neural networks outperformed the benchmark model in terms of both the short-term and long-term predictions. In addition, the best deep learning model's average SMAPE was 0.25 for short-term forecasts and 2.59 for long-term forecasts [52]. Additionally, deep neural networks (DNNs) have been proposed as a technique for prediction. This approach is a significant substitute for estimating a partial differential equation's solution [11]. Based on the distribution of COVID-19 over three time periods, a recent work employed the K-means approach to group countries into various clusters [11].

Methods
The proposed model for this work is the multi-layer perceptron neural network (MLPNN), whose flowchart/structure is illustrated in Figure 1. For this study, we collected data from the website of the World Health Organization [53]. The data used for this research were statistical data and contained no personally identifiable human photos, audio, videos, or other materials. Additionally, all procedures were conducted in accordance with the necessary rules and laws. As shown in Figure 1, the downloaded dataset was pre-processed using feature extraction. We considered the categorical features (infected cases, number of deaths, and number of weeks) for this study. We tuned the model by removing the disconnected features that were causing the class imbalance; for example, we did not consider features describing other conditions such as heart disease, cancer, diabetes, or old age. These features were causing a class imbalance because, e.g., not all COVID-19-infected patients are heart patients and vice versa. After removing the disconnected features, we normalized the data and split the input data into subsets, i.e., 80% for training and 20% for testing. This splitting is typically made in a stratified or randomized way to ensure the data are well distributed across the subsets, which minimizes biases or deviations in the data. The model was trained using the training data, and the test data were used to evaluate its performance over an unobserved subset of the data. We applied a three-layered feedforward network (multi-layer perceptron neural network) model for training, testing, and validation. The MLPNN is discussed briefly in the following sections.
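The normalization and 80/20 splitting step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper does not specify the normalization scheme, so min-max scaling is assumed here, and the helper names `min_max_normalize` and `train_test_split` as well as the weekly figures are hypothetical.

```python
import random


def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Randomly split rows into a training and a testing subset."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


# Hypothetical weekly records: week number, infected cases, deaths.
weeks = list(range(1, 11))
cases = [120, 150, 210, 340, 560, 800, 760, 640, 500, 410]
deaths = [2, 3, 5, 9, 14, 22, 25, 21, 16, 12]

rows = list(zip(min_max_normalize(weeks),
                min_max_normalize(cases),
                min_max_normalize(deaths)))
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), "training rows;", len(test_rows), "testing rows")
```

A randomized split is used here because the text mentions it; for forecasting tasks a chronological split, in which the most recent weeks are held out, is a common alternative.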

Multi-Layer Perceptron Neural Network
We employed a multi-layer perceptron neural network [54], i.e., a feed-forward neural network with an input layer, a hidden layer, and an output layer (see Figure 2). In this research, two separate multi-layer perceptron neural networks were trained, i.e., one for each of the goals: infected cases and deaths. The data on infected cases and deaths were taken from various countries including China, Bangladesh, Germany, Italy, India, Iran, Pakistan, and the United Kingdom. Ten hidden neurons were used in a single hidden layer with a sigmoid activation function, which is specified as
$$N_i = \frac{1}{1 + \exp\left(-\sum_{k} w_{ki} x_k\right)} \tag{1}$$
where $x_k$ are the input values, $w_{ki}$ are the weights of the input values, and $N_i$ is the value of hidden neuron $i$.
In the output layer, the neurons give the number of deaths and the number of active cases. Furthermore, Equation (2) defines the output of a hyperbolic tangent transfer function that ranges from −1 to +1, that is,
$$N_j = \tanh\left(\sum_{i} w_{ij} N_i\right) \tag{2}$$
where $w_{ij}$ is the weight between the hidden neuron $i$ and the output neuron $j$, and $N_j$ is the output of neuron $j$. The supervised learning approach is used to calculate the best values for all the neural network variables, for example, the input and output weights. Establishing these parameters yields the ANN model. Training through observed values and optimization is known as supervised learning (see Figure 3).
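The forward pass of the network described above (a single hidden layer of ten sigmoid neurons followed by a hyperbolic tangent output) can be sketched in Python as follows. This is a minimal illustration rather than the trained model: the weights are random, bias terms are omitted because the text mentions only weights, and the single normalized input and the names `forward`, `hidden_weights`, and `output_weights` are assumptions.

```python
import math
import random


def sigmoid(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, output_weights):
    """Compute hidden activations N_i and outputs N_j for one input vector.

    hidden_weights[i][k] links input k to hidden neuron i;
    output_weights[j][i] links hidden neuron i to output neuron j.
    """
    hidden = [sigmoid(sum(w * xk for w, xk in zip(row, x)))
              for row in hidden_weights]
    outputs = [math.tanh(sum(w * h for w, h in zip(row, hidden)))
               for row in output_weights]
    return hidden, outputs


# Hypothetical setup: one normalized input (the week), ten hidden neurons,
# one output (e.g., normalized weekly deaths). Weights are random, not trained.
rng = random.Random(42)
n_inputs, n_hidden, n_outputs = 1, 10, 1
hidden_weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
output_weights = [[rng.uniform(-1, 1) for _ in range(n_hidden)]
                  for _ in range(n_outputs)]

hidden, outputs = forward([0.35], hidden_weights, output_weights)
print("hidden activations:", [round(h, 3) for h in hidden])
print("network output:", round(outputs[0], 3))
```

Supervised training would then adjust `hidden_weights` and `output_weights` so that the outputs match the observed values; that optimization step is not shown here.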

Mortality/Fatality Rate
The seriousness of a pandemic can be inferred from the fatality (case fatality ratio) rates/ratios, defined by
$$\mathrm{CFR} = \frac{\text{Deaths}}{\text{Confirmed cases}} \times 100 \tag{3}$$
where CFR is the case fatality ratio.
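As a small worked example of the case fatality ratio, the following Python sketch evaluates it for hypothetical counts; the function name `case_fatality_ratio` and the numbers are illustrative only and are not results from this study.

```python
def case_fatality_ratio(deaths, confirmed_cases):
    """Return the case fatality ratio as a percentage."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


# Hypothetical counts: 250 deaths among 10,000 confirmed cases.
cfr = case_fatality_ratio(deaths=250, confirmed_cases=10_000)
print(f"CFR = {cfr}%")  # CFR = 2.5%
```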

Cronbach's Alpha
Cronbach's alpha is a risk-adjusted evaluation metric that shows us how much the expected case returns differ from the actual case returns and whether deaths from COVID-19 are above or below the active cases/deaths. We calculated the actual cases and death ratio using Cronbach's formula [55] (Equation (4)) as follows:
$$C_\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i S_i^2}{S_y^2}\right) \tag{4}$$
where $C_\alpha$ denotes the actual cases and deaths, $k$ describes the number of samples, $S_y^2$ represents the variance in the total score, $S_i^2$ is the variance of the individual week, and $\sum_i S_i^2$ is the sum of the variances of the individual weeks.
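The computation of Cronbach's alpha can be illustrated with a short Python sketch. It assumes the standard form of the formula, with each column of the score table treated as one week and each row as one observation (for example, one region); the function name `cronbach_alpha` and the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one row per observation and one column per item
    (here, one column per week).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items (columns) are required")
    # Variance of each individual item (week) across observations.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the total score of each observation.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical weekly case counts: four regions (rows) over three weeks (columns).
scores = [
    [120, 150, 210],
    [80, 95, 140],
    [200, 260, 330],
    [60, 70, 100],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Because the same variance estimator is used in the numerator and the denominator, the choice between sample and population variance does not change the result.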

Mean Absolute Error (MAE)
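We used the mean absolute error (MAE) (see Equation (5)) to achieve forecasting with minimized errors. Based on the MAE's values, the mean absolute scaled error (MASE) (Equation (6)) was calculated for the actual infected cases/deaths and predicted cases/deaths for future weeks.
$$\mathrm{MAE} = \frac{1}{k}\sum_{y=1}^{k} \left|e_y\right| \tag{5}$$
where $k$ is the sample size, $y \le k$, and the $y$th error $e_y$ is denoted by $e_y = x_y - \hat{x}_y$, with $x_y$ the actual value and $\hat{x}_y$ the forecasted value.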

Mean Absolute Scaled Error (MASE)
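We computed the MASE (mean absolute scaled error) using the actual numbers of infected cases and deaths and the forecasted values of the cases and deaths using the following equation (Equation (6)), given here in the standard form of the MASE.
$$\mathrm{MASE} = \frac{\frac{1}{k}\sum_{y=1}^{k}\left|e_y\right|}{\frac{1}{k-1}\sum_{y=2}^{k}\left|x_y - x_{y-1}\right|} \tag{6}$$
where $k$ represents the sample size, $x_y$ indicates the actual values of the infected cases/deaths, and $\hat{x}_y$ indicates the forecasted values of the cases/deaths. Here $y \le k$, and the $y$th error $e_y$ is denoted by $e_y = x_y - \hat{x}_y$.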

Root Mean Square Error (RMSE)
The RMSE measures the error between the actual values and the forecasted values. We compared (a) the predicted values and (b) the observed values. We summed the squared differences between them and divided this sum by the total number of observations. Finally, we calculated the root mean square error (RMSE) (Equation (8)) as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - E_i\right)^2} \tag{8}$$
where $n$ represents the total number of observations, $O_i$ denotes the observed values of actual cases, and $E_i$ represents the expected (forecasted) values.
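The three error measures discussed above (MAE, MASE, and RMSE) can be computed with a few lines of Python. The sketch below uses the standard definitions; in particular, the MASE is scaled by the one-step naive forecast error of the same series, which is an assumption, and the weekly values are hypothetical.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


def mase(actual, forecast):
    """Mean absolute scaled error, scaled by the naive one-step forecast."""
    naive_error = sum(abs(actual[t] - actual[t - 1])
                      for t in range(1, len(actual))) / (len(actual) - 1)
    return mae(actual, forecast) / naive_error


# Hypothetical weekly deaths: observed values and model forecasts.
actual = [10, 12, 14, 16]
forecast = [11, 12, 13, 18]

print("MAE :", mae(actual, forecast))
print("RMSE:", round(rmse(actual, forecast), 4))
print("MASE:", mase(actual, forecast))
```

A MASE below one indicates that the forecast has a smaller average error than simply repeating the previous week's value.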

Data Pre-Processing and Experimental Setup
Authentic sources [53] were used to collect the data. We used the datasets of various countries including China, Bangladesh, Germany, Italy, India, Iran, Pakistan, and the United Kingdom. This study contains no personally identifiable human photos, audio, videos, or other materials. All procedures were followed in compliance with the necessary rules and regulations. A Windows 10, 64-bit operating system with 16 GB of RAM was employed. For the training and validation datasets, we used CSV files. We normalized the data and split the input data into subsets, i.e., 80% for training and 20% for testing. This splitting is typically made in a stratified or randomized way to ensure the data are well distributed across the subsets, which minimizes biases or deviations in the data. K-fold validation was used to validate the performance of our proposed framework.
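The K-fold validation mentioned above can be sketched as follows. The number of folds is not stated in the text, so five folds are assumed here, and the generator name `k_fold_indices` and the sample count of 60 weekly records are illustrative.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation


# Hypothetical dataset of 60 weekly records split into five folds.
splits = list(k_fold_indices(n_samples=60, k=5))
for fold_number, (training, validation) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(training)} training, "
          f"{len(validation)} validation samples")
```

Each record is used for validation exactly once, and the model's scores are then averaged over the folds.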

Model Forecasting
Time-series analysis is an important component of deep learning and is utilized for forecasting. In time-series data, which are a type of univariate regressive data, time is the only input variable (independent feature) used to forecast the target feature (dependent feature). Time-series analysis is used to predict the future values of coming occurrences and is crucial for predicting the occurrence of respiratory diseases such as COVID-19. Positive cases are growing every day; thus, it is important to predict whether the rate of growth will continue based on earlier data. Based on forecasts, governments can mobilize resources to prevent disease transmission and take action to slow the pace of infection before more citizens are affected. Forecast numbers cannot be guaranteed because predictions depend entirely on past patterns. To counter a pandemic emergency such as COVID-19, governments can use this approximate projection of occurrences to plan future resource management. This section discusses the actual situation with COVID-19-infected cases and forecasts future situations for infected cases and deaths [53]. Due to the large number of deaths at the beginning of the pandemic, China had the highest CFR among the selected countries; however, after May, China's fatalities decreased as a result of the lockdowns used to contain the pandemic. It is worth noting that the CFR is influenced by the number of tests performed and the size of the population. Therefore, a solid approach should be developed to avoid this constraint. The CFR changes when new cases of infection and fatalities appear. Tables 5 and 6 show the results for the alpha, MASE, SMAPE, MAE, and RMSE for actual cases and deaths, respectively. Alpha returned a value between 0 and 1. MASE returned the mean absolute scaled error of the forecasting. SMAPE returned the symmetric mean absolute percentage error.
The MAE returned the mean absolute error and the RMSE returned the root mean squared error metric. Figure 3 denotes a detailed visualization of the weeks, that is, 60 weeks on the x-axis and the number of infected patients plus the number of deaths on the y-axis. Graph (A) shows the data from Bangladesh, graph (B) from China, graph (C) from Germany, graph (D) from India, graph (E) from Iran, graph (F) from Italy, graph (G) from Pakistan, and graph (H) shows the data from the United Kingdom.  Table 7 shows the test results of the best models for the death forecasting. Table 8 shows the weekly death forecasts for the upcoming months. The model forecast results for India show an increase in weekly deaths at a faster rate compared to the other specified countries. Consequently, if the same strategy is maintained, COVID-19 will be completely out of control in India and fatalities could reach more than 121 thousand by the start of the upcoming year. The weekly death forecasts for Pakistan, Bangladesh, and Iran show decreases but at a relatively slow rate. The forecasts indicate that for Pakistan, COVID-19 deaths in the 1st week of the upcoming month in 2022 are 380, and this number will not exceed 537, with a confidence level of 95%. However, weekly deaths will reduce to 316, indicating a reasonably considerable difference in a couple of months. For Iran, the forecast for deaths is 1367 and will not exceed 1732, whereas for Bangladesh, it is 198 and will not exceed 292. The forecasting results for Germany are also declining at a slower rate. The forecast results show that in the last week of the first month, the weekly deaths will be 775 and will not exceed 5812, with a confidence level of 95%. The upper limit suggests an alarming situation. It is highly recommended for their governments to take steps and implement new policies as preventive measures regarding the pandemic situation. 
The forecast for the UK shows that weekly deaths will increase and in the last week of the upcoming month will be 126 and not exceed 8876. The results indicate that these countries' current strategies are working effectively in controlling the pandemic but the future situation may worsen, as shown by the upper limit of the forecast; it is highly recommended that they revise their policies in a timely manner. Finally, regarding Italy's future scenario, the situation will not be as difficult as in India. However, there is a considerably high weekly deaths forecast (more than a couple of hundred) for the end of the current year and the start of the next year. Table 7 gives a brief overview of the best models' test results for death forecasting. The WHO should give special consideration and help countries, such as India, Italy, and others with high mortality forecasts for COVID-19, to fight against the pandemic.

The Model's Performance
The best accuracy, training, testing, and validation results of our framework are briefly summarized in Figure 6. The results show a 99.60% accuracy, which means that the validation effectiveness is satisfactory. These outcomes were seen when initializing the input parameters for the model, indicating that the model was properly trained and the data were error-free. Figure 7 gives a brief visualization of the output results. The correlation coefficient between the targets and the outputs was observed to be 99.44% for training, 99.77% for validation, 64.16% for testing, and 90.6% overall, which means that our model was efficient. The correlation quantifies the strength of a linear relationship between two variables. We used the correlation to investigate whether a relationship existed between the variables in order to fit a specific model to our data. A value close to 1 (0.906, i.e., 90.6%, in this research) indicated that there was a positive linear relationship between the data columns, which means that our proposed model was working accurately on the given dataset.
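The correlation coefficient used above to compare targets and model outputs can be illustrated with a short Python sketch of the Pearson correlation. The function name `pearson_r` and the target and output values are hypothetical and are not the results reported in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical targets (observed values) and network outputs (forecasts).
targets = [100, 150, 220, 310, 400]
outputs = [110, 140, 230, 300, 410]

r = pearson_r(targets, outputs)
print(f"correlation coefficient = {r:.4f} ({100 * r:.2f}%)")
```

A value of `r` close to 1 indicates a strong positive linear relationship between the targets and the outputs, whereas a value near 0 indicates no linear relationship.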

Challenges and Future Directions
We discuss the current issues associated with utilizing Artificial Intelligence methods to address the COVID-19 pandemic. Furthermore, we outline how machine learning and deep learning can assist in reducing the transmission rate of COVID-19 in the future.

Challenges
AI-based applications for investigating COVID-19 presently face numerous hurdles, for example: the scarcity and inaccessibility of substantial data and the legislation governing it; large amounts of noisy data and false feedback; inadequate awareness at the juncture between medicine and computer science; and issues of security and data privacy.

Policies and Regulations
As the epidemic spreads and the numbers of reported affected and deceased people rise, several measures to limit the outbreak have been discussed, for example, social distancing and lockdowns. Authorities have an important role in establishing regulations and rules to motivate citizens, experts, educators, entrepreneurs, medical centers, technology giants, and large corporations to cooperate in COVID-19 mitigation during an outbreak.

Large-scale training data are scarce and unavailable
Many Artificial Intelligence deep learning (AIDL) systems rely on large-scale datasets, including diagnostic image processing, with a variety of environmental variables. Yet, because of COVID-19's explosive expansion, there are inadequate resources for AI. In practice, analyzing datasets is a time-consuming task that demands the support of trained health professionals.

Noisy data and speculation on the internet
These problems arise from a reliance on easily available online social networking sites; vast amounts of audio/video, fake information, and misleading news have been reported in thousands of different online channels without any substantial modifications. Artificial Intelligence-based techniques tend to be slow when evaluating and processing noisy data. Furthermore, the outputs of Artificial Intelligence ML and DL techniques become skewed with noisy data. These issues reduce the performance and efficiency of Artificial Intelligence algorithms, especially for epidemic forecasts and spreading analyses.

Lack of integration between computer science and medicine arenas
Numerous Artificial Intelligence experts have a strong grasp of computer science applications, but considerable expertise in diagnostic imaging, epidemiology, pharmacology, and other relevant domains is also required to incorporate medical information into Artificial Intelligence methods in the fight against COVID-19. To handle COVID-19, it will be essential for specialists from different disciplines to work together and integrate data from numerous works.

Data security and privacy
In the era of Artificial Intelligence, the cost of acquiring confidential data is incredibly low. During healthcare crises such as the current pandemic, several government agencies have sought to gather a wide range of personal data, including contact numbers, ID numbers, and medical data. How to properly maintain individual confidentiality and human rights during Artificial Intelligence discovery and handling is a topic worth tackling.

Unstructured data or incorrect structural data (e.g., numerical, text, and image data)
Working with incorrect facts and ambiguous data in textual material can be challenging. Large amounts of data from several sources may be erroneous. Furthermore, the sheer volume of data makes it difficult to extract valuable bits of metadata.

Early detection of COVID-19 via image analysis such as chest X-rays and CT scans
Handling unbalanced datasets results in insufficient diagnostic imaging, extensive training periods, and difficulty in explaining the resulting outcomes.

Risk assessments of older people and patients with other diseases
Older people should be screened, functioning treatments and cures should be discovered, risk assessments should be conducted, survival projections should be made, healthcare should be provided, and medical resource planning should be conducted. The task at hand is to obtain the physical features and therapeutic outcomes for patients. An additional challenge is dealing with low-quality data, which can lead to skewed and incorrect predictions for older people and people with other diseases, for example, heart disease, diabetes, asthma, and so on.

Future Research Direction
Artificial Intelligence and blockchain-based solutions can also contribute to fighting the outbreak in the following ways.

Non-contact illness diagnostics
Using automatic feature categorization in X-ray and CT imaging during COVID-19 outbreaks will successfully limit the outbreaks. A patient's posture can be detected and CT image detection, X-rays, and smart camera facilities can all be utilized in AI-based systems.

Video diagnostics and consulting remotely
To deliver COVID-19 hospital admissions and early diagnosis data, a mix of Artificial Intelligence and natural language processing (NLP) modules can be utilized to construct remote diagnostic programs and automation systems.

Bio-technological research
AI-based algorithms can be utilized to accurately examine biomedical knowledge in terms of biotechnological research, such as major protein structures, genomic sequencing, and viral itineraries, to determine protein compositions and viral components.

Vaccination and drug development
AI-based algorithms can be used to find prospective medications and vaccinations, as well as replicate drug-protein and vaccine-receptor pairings, allowing for the prediction of future drug and vaccine responses in COVID-19 patients.

Fake information must be identified and screened
In order to provide real, accurate, and comprehensive COVID-19 statistics, Artificial Intelligence models must be used to filter out erroneous news and material online. Blockchain-based [56] systems can be used to track and trace the actual information source.

Impact analysis and appraisal
Machine learning and deep learning techniques can be used in various sorts of computations to evaluate the influence of different social management strategies on the spread of the pandemic. The resulting data could then be used to evaluate logical and efficient strategies for disease prevention and control in the general public.

Tracking of patients' contacts
By establishing social networking sites and an information architecture, blockchain-based federated learning can be used to detect and track the characteristics of individuals residing in close proximity to COVID-19 sufferers, effectively anticipating and tracking the pandemic progression.

Smart robots
Robotic systems are likely to be used in activities such as public sanitation, deliveries, and supply chains, and in healthcare infrastructures that do not require human resource management, e.g., medical treatment. This can help stop the COVID-19 virus from spreading.

Future work with descriptive federated learning methods
The effectiveness of federated learning methods and the graphic properties that distinguish COVID-19 from other lung diseases such as tuberculosis must be determined. This will aid radiologists and doctors in being more conscious of the infection and effectively analyzing probable COVID-19 X-rays and CT imaging data.

Importance of COVID-19 diagnostic tools and treatment
These are both necessary, but the early detection of COVID-19 is far more important. Substantial future study efforts based on ML and DL are needed in order to identify COVID-19 therapies.

Conclusions
Applications of operational research that use mathematical, statistical, and demographic modeling are crucial in assisting decision makers in education, health, socioeconomic, and other aspects of daily life. By adopting preventative measures beforehand, the transmission of COVID-19 could be considerably slowed. In order to maintain attention on the most sensitive location, country, or region, scientists, research professionals, and global leaders must be informed in advance of emergency scenarios. For forecasting the pandemic situation, this study proposed a multi-layer perceptron neural network (MLPNN) with the integration of Cronbach's alpha and the MAE, MASE, SMAPE, RMSE, and CFR. We also focused on the current challenges in preventing the outbreak from spreading further and on what is needed in the future to normalize social and economic activities. High accuracy was observed in estimating the percentages of afflicted patients and deaths. According to the MLPNN model's results, the number of COVID-19 patients in India will rise in the upcoming weeks and the death rate will also rise. This was evident from the upper limit of the 95% confidence interval, which became wider for subsequent weeks. In general, forecasts for the near future were more precise than those for the longer term. Furthermore, a breakdown of the forecasts for each of the past COVID-19 variants could be a very interesting contribution to the research and will be explored in future studies. For this research, we could not find actual data about the numbers of patients who were affected by the particular variants in the selected countries.
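For readers who wish to reproduce the evaluation, the error metrics named above can be sketched as follows. The sketch assumes common textbook definitions (SMAPE on the 0 to 200% scale, MASE scaled by the in-sample one-step naive forecast, and CFR as deaths per 100 confirmed cases); the exact variants used in this study may differ. All function names and numbers are illustrative and are not results from the paper.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0-200 scale)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        # Treat a 0/0 term as zero error (an assumption; conventions vary).
        terms.append(0.0 if denom == 0 else abs(a - f) / denom)
    return 100 * sum(terms) / len(terms)


def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    naive_mae = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    if naive_mae == 0:
        raise ValueError("MASE is undefined when the training series is constant")
    return mae(actual, forecast) / naive_mae


def cfr(deaths, confirmed_cases):
    """Case Fatality Ratio, in percent."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100 * deaths / confirmed_cases


# Hypothetical weekly death counts: training history, held-out actuals, forecasts.
train = [100, 110, 130, 120]
actual = [140, 150]
forecast = [130, 160]

print("MAE  :", mae(actual, forecast))
print("RMSE :", rmse(actual, forecast))
print("SMAPE:", round(smape(actual, forecast), 2), "%")
print("MASE :", mase(actual, forecast, train))
print("CFR  :", cfr(deaths=50, confirmed_cases=2000), "%")
```

A MASE below 1 indicates that the forecast beats the naive "repeat last week" baseline on average, which makes it a convenient scale-free complement to MAE and RMSE when comparing countries with very different death counts.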