Computers
  • Editor’s Choice
  • Article
  • Open Access

7 February 2023

Monkeypox Outbreak Analysis: An Extensive Study Using Machine Learning Models and Time Series Analysis

1 School of Information, University of California, Berkeley, CA 94704, USA
2 Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
3 Department of Computer Science and Engineering, GIET University, Odisha 765022, India
4 Faculty of Information Technology, Monash University, Wellington Rd, Clayton, VIC 3800, Australia
This article belongs to the Special Issue Computational Science and Its Applications 2022

Abstract

The sudden, unexpected rise in monkeypox cases worldwide has become an increasing concern. The zoonotic disease, characterized by smallpox-like symptoms, has already spread to nearly twenty countries and several continents and has been labeled a potential pandemic by experts. Monkeypox infections do not have specific treatments. However, since smallpox viruses are similar to monkeypox viruses, antiviral drugs and vaccines against smallpox could be used to prevent and treat monkeypox. Since the disease is becoming a global concern, it is necessary to analyze its impact on population health. Analyzing key outcomes, such as the number of people infected, deaths, medical visits, hospitalizations, etc., could play a significant role in preventing the spread. In this study, we analyze the spread of the monkeypox virus across different countries using machine learning techniques such as linear regression (LR), decision trees (DT), random forests (RF), elastic net regression (EN), artificial neural networks (ANN), and convolutional neural networks (CNN). The performance of these models is evaluated using statistical parameters such as mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), and R-squared (R2), and our study shows that CNNs perform the best. The study also presents a time-series-based analysis using autoregressive integrated moving average (ARIMA) and seasonal auto-regressive integrated moving average (SARIMA) models for measuring the events over time. Comprehending the spread can lead to understanding the risk, which may be used to prevent further spread and may enable timely and effective treatment.

1. Introduction

Monkeypox (MPX) is a zoonotic infection caused by the monkeypox virus. While smallpox-like signs and symptoms characterize it, it is less contagious than smallpox [1]. The virus, an enveloped double-stranded DNA virus, is part of the Orthopoxvirus genus of the Poxviridae family, which also includes the Variola (smallpox) virus and the Vaccinia virus (smallpox vaccine) [2]. Initially discovered in 1958 in monkeys, the disease spread across different regions of Africa and the USA. Recently, in May 2022, several cases of the monkeypox virus were reported by as many as twelve non-endemic countries, some of which were Australia, Belgium, Spain, Portugal, the United Kingdom, and the USA. The maximum number of cases was reported in Spain, Portugal, and the UK. Recent studies show that Austria, Israel, Switzerland, Taiwan, and India also have monkeypox cases.
The case fatality ratio of monkeypox is significantly lower than that of smallpox, and the incubation period can last up to three weeks. The major symptoms include headache, fever, muscle aches, respiratory symptoms, and chills. The infection spreads through contact, body fluids (saliva, secretions, droplets), and contaminated objects [3]. While there are limited studies to understand the epidemiology, transmission, sources, and patterns, the re-emerging nature of the disease demands more information for implementing strategies to prevent and cure the zoonotic disease. While researchers and practitioners worldwide are investigating the disease, the source of the ongoing outbreak and the natural animal reservoir for monkeypox are yet to be confirmed. Clinical studies show that the disease may spread through human-to-human and/or animal-to-human transmission of the monkeypox virus (MPV) [4]. There is active research on epidemiological investigations, genome sequencing, and travel links between countries [5].
As the disease has spread to more than seventy countries, the World Health Organization (WHO) declared monkeypox a global emergency, demanding a coordinated response due to the limited supply of vaccines. Smallpox vaccines have been observed to be effective against monkeypox if administered promptly. With the infection spreading rapidly and the limited supply of vaccines, it is imperative to analyze the burden and impact of the disease on the population, epidemiology, trend, patterns, etc. The 2019 novel coronavirus was declared a global emergency days after it was first identified and claimed millions of lives worldwide [6]. A combination of treatments and lockdown restrictions across the globe contained the spread and saved many lives. As experts warn that monkeypox may potentially become a pandemic in the future, it is necessary to take timely action to monitor the spread and analyze the trends [7].
While previous studies have highlighted the causes, comparisons, potential treatments, etc., for monkeypox, there is still a need to perform an extensive analysis of the spread. The analysis could perform epidemiological investigations, genome sequencing, and phylogenetic analysis. It may assist in understanding the transmission patterns and essentially guide policymakers to define effective policies and strategies to curb the spread further.
In this study, we analyze the spread and pattern of monkeypox using machine learning algorithms. Traditional machine learning methods, such as linear regression, decision trees, random forests, elastic net regression, and neural networks, such as artificial neural networks and convolutional neural networks, have been used to perform the analysis. We have relied on statistical parameters, such as mean absolute error, mean squared error, mean absolute percentage error, and R-squared error, to evaluate these models’ performance. To understand the trends and systematic patterns over time, we have performed a time series analysis using autoregressive integrated moving average (ARIMA) and seasonal auto-regressive integrated moving average (SARIMA).
We performed an extensive data visualization to infer patterns and trends and make observations. To the best of our knowledge, this is the first article to have explored monkeypox using a two-fold approach. The novelty and main contributions are as follows:
  • We combined the traditional machine learning methods and deep learning techniques to explore the spread of monkeypox;
  • We performed a time series analysis for the monkeypox using the ARIMA and SARIMA models.
The rest of the article is arranged as follows. Section 2 highlights materials and methods in which we focus on the related works and the overall methodology of the study. Section 3 describes the experimental analysis concerning the datasets used and the evaluation parameters. Section 4 includes results depicting observations, comparative analysis, and discussions. Finally, Section 5 highlights the overall conclusion of the study.

3. Methodology

The overall methodology is divided into two sections. In the first section, we analyze the data using traditional machine learning and deep learning models. In the second section, we perform a time series analysis of the monkeypox data using two different models.

3.1. Analysis Using Machine Learning Models

Machine learning is heavily used in epidemiology to identify trends and patterns associated with the spread of infection. We conducted this study over a defined population to find established patterns of infection in specific groups to suggest potential solutions that would be appropriate concerning the outbreak. We processed and analyzed data from a sample population and observed how certain factors (features) affect the outcome of the operations. We start with the process of data collection (Figure 1).
Figure 1. Analysis using machine learning.
As data quality determines the model’s accuracy, it is necessary to ensure that the data are reliable, as irrelevant and incorrect data may lead to incorrect predictions. Once the data are collected, the next step is to preprocess them, which includes randomizing them and cleaning them to remove unwanted values or handle missing and duplicate data. It is advised to visualize the data, which aids in understanding the overall structure and relationships between variables and classes. Once the visualization is performed, the data are split into training and test sets. Here, we split the data into 80% training and 20% test data. The training set is used for the model to learn, while the test set checks the model’s performance. Once the data have been split, the next step is to choose appropriate machine learning models to run on the processed data. In this study, we have used a combination of traditional machine learning algorithms and deep learning algorithms:
  • The traditional machine learning algorithms considered for the study are linear regression (LR), decision trees (DT), random forests (RF), and elastic net regression (EN).
  • The deep learning algorithms used in the study are convolutional neural networks (CNN) and artificial neural networks (ANN).
The aim is to deduce which model among these makes the best prediction of the monkeypox outbreak. During training, the model learns patterns in the data, which it then uses to make predictions. Once training is complete, the model is evaluated on the test set, which includes unseen data. The model performance is evaluated using several statistical parameters. Based on the results, we can identify the model most suited for the analysis. As hyperparameters influence machine learning models, the goal is to find optimum values for them so that the model’s performance increases. We therefore perform hyperparameter tuning to minimize the loss function. The process is automated and is used to achieve the maximum possible accuracy of the algorithm. While several methods can be used to tune parameters, in this study, we rely on the grid search technique, which builds a model for every combination of hyperparameters specified and evaluates each model. The hyperparameter-tuning process has been applied to the deep learning algorithms used in the study, i.e., ANN and CNN. The hyperparameters considered for tuning are the number of neurons, activation function, optimizer, learning rate, batch size, and epochs for both algorithms.
The grid search process involves obtaining the output from the final pooling layer, which is flattened and fed as input into the fully connected layer, and applying grid search to the overall procedure. The output generated was used to deduce which model predicts the outbreak most accurately. Hyperparameter tuning, or optimization, involves finding a set of optimal hyperparameters, which are set before model training. The most common methods are the grid search and random search techniques. In the grid search technique, every combination from a preset list of hyperparameter values is evaluated. The best combination is chosen depending on the cross-validation score. On the other hand, in the random search technique, the model is trained on random combinations, so the number of combinations evaluated can be controlled. This can cover a wide range of values quickly; however, it may not guarantee the best combination of parameters. In contrast, grid search may take a significant amount of time but will find the best combination among the values specified. Table 1 shows the hyperparameters employed in the study.
Table 1. Grid search hyperparameters.
The following hyperparameters have been used in the analysis:
  • The batch size represents the number of samples processed during training before the model parameters are updated;
  • The number of epochs specifies how many times the algorithm passes over the training data. In one epoch, each sample in the training dataset has had an opportunity to update the internal model parameters. There may be multiple batches in an epoch, which affects the time taken to train the neural network. Multiple epochs lead to repeated updates of the weights during the training process. While too few epochs may lead to underfitting, an excessive number of epochs may lead to overfitting;
  • The optimization algorithm (optimizer) determines how the model parameters are updated in order to reduce the loss function. Root mean square propagation (RMSprop), stochastic gradient descent (SGD), Adam, Adamax, etc., are optimizers used during the hyperparameter-tuning process;
  • Dropout randomly selects neurons to be dropped during the training process. When neurons are dropped, the corresponding weights are not applied;
  • Neurons in a hidden layer represent the number of neurons in a layer. Inputs are fed from the initial layers to the next layers, and the final layer presents the output. The overall network performance depends on the number of neurons in a layer.
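The difference between grid search and random search described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the search space `GRID` and the scoring function `toy_score` are hypothetical stand-ins (the values actually used are those of Table 1), and a real run would train the ANN or CNN for each combination and return its cross-validation score.

```python
import itertools
import random

# Hypothetical search space for illustration; not the values from Table 1.
GRID = {
    "neurons": [16, 32, 64],
    "learning_rate": [0.001, 0.01],
    "batch_size": [16, 32],
}


def toy_score(params):
    """Stand-in for a cross-validation score (higher is better).

    Assumption: a real implementation would train the network with
    `params` and return its validation score instead.
    """
    penalty = (
        abs(params["neurons"] - 32) / 32
        + abs(params["learning_rate"] - 0.01) * 100
        + abs(params["batch_size"] - 16) / 16
    )
    return -penalty


def grid_search(grid, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(grid, score_fn, n_trials, seed=0):
    """Evaluate only n_trials randomly drawn combinations."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(GRID, toy_score)
sampled_params, sampled_score = random_search(GRID, toy_score, n_trials=4)
```

Grid search here evaluates all 12 combinations, whereas random search evaluates only the four it samples, so its result can be no better than the grid optimum and may be worse.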

3.2. Time Series Analysis

A time series is a sequence of observations recorded in time order at successive, discrete intervals over a given period. Time series are used in epidemiology to investigate associations between variables and outcomes. The time variable can be used as a reference point to estimate the target variable during forecasting [37]. Different factors affect a certain variable at different points in time. The series is assumed to be stationary, i.e., its statistical properties should not depend on the origin of time. Once data are collected and cleaned, they may be visualized with respect to time (Figure 2).
Figure 2. Time series analysis of data.
The stationarity of the series may be observed, followed by model building. We rely on two models for the study, i.e., autoregressive integrated moving average (ARIMA) and seasonal auto-regressive integrated moving average (SARIMA). Time series may be trending, seasonal, cyclical, or irregular, depending on the intervals in the series. In the case of stationary time series data, the mean and variance should be constant, while the covariance between two observations should depend only on the lag between them. Statistical tests, such as the augmented Dickey–Fuller (ADF) test, can be used to check data stationarity.
In this study, we perform the ADF test, whose null hypothesis is that the series is non-stationary and whose alternative hypothesis is that the series is stationary. If the p-value is greater than 0.05, the test fails to reject the null hypothesis; if the p-value is less than or equal to 0.05, the null hypothesis is rejected in favor of the alternative. If the time series data are non-stationary, they may be converted to stationary using specific methods. For this study, we use the technique of differencing, which is a simple transformation of the existing series into an altogether new series. Here, we eliminate the series’ dependence on time and stabilize the mean. This leads to a significant decrease in the trend and seasonality during transformation. Once the data are converted into a stationary series, we can implement ARIMA and SARIMA models to perform the analysis.
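As a minimal sketch of the two steps just described, the snippet below applies first-order differencing to a series and encodes the ADF decision rule at the 0.05 threshold. The function names and the sample case counts are hypothetical; computing the ADF p-value itself is left to a statistics library and is not shown.

```python
def difference(series, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def is_stationary(p_value, alpha=0.05):
    """ADF decision rule: reject the null (non-stationary) when p <= alpha."""
    return p_value <= alpha


# Hypothetical cumulative case counts with a steady upward trend.
cases = [10, 12, 14, 16, 18, 20]

# First-order differencing (d = 1) removes the linear trend,
# leaving a constant series of day-to-day changes.
daily_change = difference(cases)
```

With a linear trend, one round of differencing already yields a constant series; a series with a stronger trend or seasonal pattern may need a second or a seasonal difference before the test rejects the null hypothesis.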

3.3. Datasets

The dataset has been taken from Kaggle (Monkeypox Dataset Daily Updated) and incorporates three different files, i.e., daily country-wise confirmed cases, monkeypox cases worldwide, and worldwide case detection timeline. The data were collected from January 2022 to August 2022. The first file incorporates a daily number of confirmed cases for all countries where cases have been identified. The second file includes confirmed cases, suspected cases, the number of people hospitalized, travel history, etc. The third file includes information such as date, city, age, gender, isolation, etc. We rely on all the files for carrying out the extensive analysis.
The first part of our analysis incorporates data visualization, observations, and inferences based on the data patterns. This is followed by applying machine learning models to the data to analyze which model performs the best using statistical parameters. Finally, we perform a time series analysis using ARIMA and SARIMA models and observe the results from the forecast.

4. Experimental Analysis

In this section, we discuss the evaluation parameters and results following the experimental analysis performed for the study.

4.1. Evaluation Parameters

The evaluation parameters used in the study are as follows:
MAE: mean absolute error or MAE defines the average magnitude of the errors given a set of predictions. It may also be defined as the average over the test sample of the absolute differences between prediction and actual observation such that all individual differences have equal weight.
MSE: mean squared error or MSE defines the proximity between a regression line and its corresponding data points. It is calculated by taking the distances between the points and the regression line, squaring them, and averaging the squared values.
MAPE: mean absolute percentage error (MAPE) defines a forecast system’s accuracy. It is the average absolute difference between prediction and actual observation, expressed as a percentage of the actual value. It can measure the performance of regression models.
R2: R-squared value or R2 determines how close data are to the fitted regression line. Also known as the coefficient of determination, it can find the strength of the relationship between the linear model and dependent variables.
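The four evaluation parameters can be written out directly from their standard definitions. The sketch below uses plain Python; the function names and the sample values are hypothetical and are not results from the study.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted case counts.
actual = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
```

For MAE, MSE, and MAPE, lower is better; for R2, higher is better. Computed this way, R2 can fall below zero on held-out data when a model predicts worse than the mean of the actual values.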

4.2. Results

This section presents the observations, comparative analysis, and discussions. Based on the analysis performed, we present data visualization graphs, machine-learning-model-based analysis, and time series graphs in the observations section. The comparative analysis section compares our proposed work and similar related works. Finally, we discuss the results and key takeaways in the Discussions Section.

4.2.1. Initial Observations

In this section, we present our initial observations. These include total confirmed cases by country, travel history, symptom analysis, cases worldwide, correlation using the Pearson method, and the number of cases by age group.
Figure 3 shows the trend across the top fifteen countries, although the spread extends beyond these countries. The United States has the highest number of confirmed cases, followed by Spain, Germany, England, and Brazil, which all have nearly the same number of cases, while the numbers for Switzerland, Austria, and Israel differ significantly from these.
Figure 3. Total confirmed cases based on country.
From Figure 4, we observed that the United States has the maximum number of patients with travel history, followed by Brazil, Italy, and Germany. Countries such as India, Singapore, and South Africa have fewer cases. We also observe that Portugal has the highest number of cases with no travel history, i.e., Travel_History_No is zero for Portugal. This increases the chance that Portugal might be the origin of the outbreak.
Figure 4. Travel history based on country.
The common symptoms of monkeypox disease are muscle pain, fatigue, chills, respiratory discomfort, etc. Based on the analysis shown in Figure 5, we observed that most patients exhibited signs of fever followed by rashes and skin lesions. Some patients developed genital ulcer lesions, headaches, and fatigue. Few patients suffered from swollen lymph nodes and muscle pain.
Figure 5. Monkeypox symptom analysis.
Figure 6 depicts a heatmap of the global outbreak. As is evident in Figure 3 and the heat map, the United States has the maximum number of monkeypox cases, followed by Europe. Countries such as India, Australia, and China have fewer confirmed cases.
Figure 6. Monkeypox cases worldwide.
Pearson’s correlation method is used to find the relationship between features in a dataset. The values are between −1 and +1, denoting negative and positive linear correlation, respectively. Hence, the strength of the linear relationship between the two variables can be assessed. Based on the study shown in Figure 7, we observed that people with a travel history correlate positively with hospitalization, and hospitalized cases correlate with confirmed cases.
Figure 7. Pearson’s correlation of features.
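For reference, Pearson's correlation coefficient between two features can be computed as in the sketch below. The function name and the sample values are hypothetical; this is not the code used to produce Figure 7.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-deviations and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical feature columns: a perfectly increasing and a
# perfectly decreasing linear relationship.
r_positive = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson([1, 2, 3, 4], [8, 6, 4, 2])
```

A value near +1 or −1 indicates a strong linear relationship, and a value near 0 indicates little or no linear relationship. The coefficient is undefined when either feature is constant.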
We observed in Figure 8 that the maximum cases are in the age group between 40 and 50 years. The number of cases is relatively lower for people in their 20s and 30s, and significantly lower for people in the age group between 10 years and 20 years, and 50 years and 70 years.
Figure 8. Monkeypox cases by age group.

4.2.2. Monkeypox Outbreak Analysis Using Machine Learning Models

In the initial stages of our data analysis, we observed that our data were highly skewed. Skewed data often lead to incorrect pattern analysis and prediction. Therefore, we handled the skewness using the min–max normalization technique. In min–max normalization, for every feature, the minimum value is transformed to 0, and the maximum value is transformed to 1. All the other values are placed as decimals between 0 and 1. After normalization and data splitting (80% training data and 20% test data), we deployed a few traditional machine learning algorithms (linear regression, decision tree, random forest, elastic net regression) and neural networks (convolutional neural networks or CNN, artificial neural networks or ANN). The performance was evaluated using MAE, MSE, MAPE, and R2; the results are depicted in Table 2.
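The normalization and splitting steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the function names, the random seed, and the sample values are hypothetical.

```python
import random


def min_max_normalize(values):
    """Rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no range; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


# Hypothetical confirmed-case counts for five countries.
confirmed = [5, 20, 80, 400, 1500]
scaled = min_max_normalize(confirmed)

# 80% training data, 20% test data.
train, test = train_test_split(range(10))
```

In practice, the minimum and maximum would typically be computed on the training set only and then reused for the test set, so that no information from the test data enters the preprocessing.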
Table 2. Results from applying ML models to monkeypox data.
Based on the results from Table 2, we observed a good range of values for the statistical parameters. MAE, MSE, and MAPE values can range from zero to infinity, and lower values indicate better performance. We deduce that CNN has the lowest values regarding MAE, MSE, and MAPE, suggesting that it performs the best compared to the other models. The second best model is ANN, followed by random forests. In terms of R2, the range lies between 0 and 1, and a higher value depicts better performance. We observe that CNN consistently performs better regarding R2, followed by ANN and the random forest model. Table 3 depicts the performance of the models after applying grid search.
Table 3. Results from applying ML models to monkeypox data with grid search.
We observe that grid search improves the overall performance of ANN and CNN; hence, we apply grid search to all the models to analyze the efficiency.
Once grid search is applied to all models, we observe significant performance improvement. We also observe that CNN with grid search performs the best, followed by random forest with grid search and ANN with grid search, respectively. Linear regression with grid search shows the least efficiency.

4.2.3. Time Series Analysis

Figure 9 depicts the outbreak of monkeypox over the last few months (February 2022–October 2022). We observe that the number of cases was almost constant in the first few months of 2022 until the first few cases were observed after May 2022. While the numbers were still low in May 2022, after June, the numbers increased steadily. The number of cases in July was higher than in June, and the number in August seemed higher than in July. Now that we have data regarding monkeypox cases over time (time series data), we can deconstruct the data to understand the data’s nature better. The data can be deconstructed into various components to analyze the hidden patterns and categories. After deconstruction, we obtain our data’s trend, seasonal, and residual components (see Figure 10).
Figure 9. Monkeypox outbreak over the last few months.
Figure 10. Trend, seasonal, and residual components.
Figure 10 shows that the seasonal component of our time series data has regular cycles over time. Hence, due to seasonality, we assert that the time series data are non-stationary. The non-stationarity of the data needs to be validated using statistical analysis. We rely on the augmented Dickey–Fuller test for this. In this test, the null hypothesis is that the series is non-stationary, and the alternative hypothesis is that the series is stationary. For the data to be considered stationary, the p-value should be less than or equal to 0.05. The p-value measures the probability of obtaining results at least as extreme as those observed, assuming that the null hypothesis is true. Lower p-values demonstrate greater statistical significance. Hence, a p-value of 0.05 or lower denotes statistical significance.
While performing the test, we obtained a p-value of 0.82259. As the p-value is higher than 0.05, the null hypothesis cannot be rejected, and the time series data are treated as non-stationary. Non-stationary data denote that the statistical properties are changing through time. Since stationarity tremendously influences how data are analyzed or predicted, it is necessary to convert non-stationary data into stationary data. Stationary time series incorporate statistical properties, such as mean and variance, that do not vary in time and can provide a better analysis. We converted non-stationary data into stationary data using the concept of differencing.
Differencing leads to the transformation of the series to a new form such that the series’ dependence on time is eliminated. Using this technique, we found the difference between current and previous-day cases. Since we calculated the difference only once, the order of differencing was d = 1. Once the differencing was applied, the p-value was found to be 0.03607, which is less than 0.05, indicating that the data have become stationary. Figure 11 depicts the new time series with the eliminated seasonal components.
Figure 11. New time series after differencing.
Once the new time series is established, we deploy the ARIMA model on the new data. The ARIMA model relies on three parameters, i.e., d, p, and q, where d is the number of nonseasonal differences for obtaining stationarity, p is the number of autoregressive terms, and q is the number of lagged forecast errors in the prediction equation. The value of d is 1, as obtained from the differencing technique. To obtain the values of q and p, we need to analyze the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot, respectively. Figure 12 shows that q and p are 7 and 9, respectively. Once we deployed the ARIMA model using these values and the new data, we obtained the Akaike information criterion (AIC) value and the Bayesian information criterion (BIC) value as 1437.86 and 1482.65, respectively. These values are used to evaluate model performance.
Figure 12. ACF and PACF plots.
We must evaluate a few criteria to assert that the model is a good fit. The residuals must not have any patterns; therefore, their mean must be zero, and their variance must be uniform. The kernel density estimate (KDE) plot is used to visualize the distribution of the residuals and should be similar to the normal distribution. The normal Q-Q plot (see Figure 13) indicates univariate normality: if the data are normally distributed, the points fall on a 45-degree reference line. Hence, the data points must be in a straight line. In the ACF plot, if data points lie outside the confidence band, they are statistically significant. Our study shows only a few data points lie outside the band, which shows that the model may require additional parameters for better accuracy.
Figure 13. Evaluating the ARIMA model.
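The ACF check on the residuals can be illustrated with a short sketch. It assumes the common large-sample approximation in which the 95% confidence band for white noise is ±1.96/√n; the band drawn in Figure 13 may have been computed differently, and the function names and the sample series are hypothetical.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    band = 1.96 / math.sqrt(len(series))
    values = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(values[k]) > band]


# Hypothetical residuals with an obvious alternating pattern left in them.
residuals = [1, -1] * 20
lags_outside_band = significant_lags(residuals, max_lag=2)
```

For residuals that resemble white noise, few or no lags should fall outside the band; lags that do fall outside it suggest structure the model has not captured.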
Figure 14 depicts the deployed ARIMA model on the test set. The forecast is lower than the actual number of cases, thus indicating that more parameters may be needed for better accuracy, which we have already established from the ACF plot in Figure 13.
Figure 14. Testing the ARIMA model.
We deployed the SARIMA model on the same dataset, as it is much more efficient in handling seasonal data. In addition to the non-seasonal parameters, the SARIMA model incorporates the seasonal parameters P (number of seasonal autoregressive terms), D (number of seasonal differences for obtaining stationarity), Q (number of seasonal lagged forecast errors in the prediction equation), and S (seasonal length of the data). We obtain P = 5, Q = 7, D = 1, and S = 30. Based on the analysis, the AIC and BIC values are 1079.98 and 1137.24, respectively.
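The AIC and BIC values quoted for both models follow the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n is the number of observations, and ln L is the maximized log-likelihood. The sketch below encodes these formulas; the function names and the inputs in the example are hypothetical, since the log-likelihoods of the fitted models are not reported.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fit: log-likelihood of -100 with 5 parameters on 100 observations.
example_aic = aic(-100, 5)
example_bic = bic(-100, 5, 100)
```

Both criteria reward goodness of fit and penalize the number of parameters, with BIC penalizing extra parameters more heavily for all but very small samples. Lower values are preferred, and on that basis the SARIMA values reported here (1079.98 and 1137.24) are lower than the ARIMA values (1437.86 and 1482.65).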
Figure 15 denotes all the criteria to determine whether the model fits. We do not see any obvious patterns, the KDE looks similar to a normal distribution, and the normal Q-Q plot looks good as most data points lie on a straight line. In the ACF plot, if data points lie outside the confidence band, they are statistically significant. Our study shows only a few data points outside the band, which shows that the model may require additional parameters for better accuracy.
Figure 15. Evaluating the SARIMA model.
Figure 16 depicts the deployed SARIMA model on the test set. The forecast is lower than the actual number of cases, thus indicating that more parameters may be needed for better accuracy, which we have already established from the ACF plot in Figure 15.
Figure 16. Testing the SARIMA model.

4.3. Comparative Analysis

In this section, we present a comparative analysis of our proposed work with some previous related research works. Table 4 presents a comparative analysis depicting our findings.
Table 4. Comparative analysis.
Most of the research works performed in the past are related to clinical trials and case studies. Our research focuses on the machine learning aspect of the study. The analysis has been performed in several ways, i.e., traditional machine learning methods, neural networks, and time series analysis, thereby highlighting the novelty of the research.

4.4. Discussions

In this section, we discuss three aspects of the study, i.e., the study’s main contributions, the key findings associated with the overall analysis, and the study’s limitations. The study’s main contributions are as follows:
  • Compared to the previous research, this study provides an interesting analysis of the monkeypox outbreak by deploying machine learning techniques;
  • This study deploys two different types of machine learning methods, i.e., traditional machine learning methods, such as linear regression, decision trees, random forests, and elastic net regression, and neural networks, such as artificial neural networks and convolutional neural networks;
  • The study also incorporates a time-series-based analysis using two different models, ARIMA and SARIMA;
  • The study performs extensive data visualization to find patterns in data for making inferences.
The key findings of the research are as follows: based on the analysis, the maximum number of confirmed cases is in the United States, followed by Spain and Germany. The United States has the maximum number of patients with a travel history, and Portugal has the maximum number of patients without a travel history. This suggests that Portugal could be the origin of the outbreak. Patients’ most observed symptoms are fever, rashes, and genital ulcer lesions. Muscle pain is observed in fewer patients. Based on the correlation matrix, people with a travel history correlate positively with hospitalization, and hospitalized cases correlate with confirmed cases. The maximum number of cases belongs to people in the age group between 40 and 50 years.
Machine learning analysis shows that CNNs perform better than other models. The evaluation is based on MAE, MSE, MAPE, and R2. Time-series analysis shows that the performances of ARIMA and SARIMA models are satisfactory.
The study’s limitations are as follows: data availability is limited, i.e., the total number of observations and the number of features in the dataset are insufficient for an extensive analysis. As was observed in the time series analysis, data points that lie outside the confidence band in the ACF plot are statistically significant. Our study shows only a few data points outside the band, which shows that the model may require additional parameters for better accuracy.

5. Conclusions

Over the last few months, the monkeypox outbreak has spread across different parts of the world, and the rising number of cases has become a global concern. From clinical studies to modes of transmission and travel histories, researchers are trying to identify the critical aspects of the outbreak before the potential pandemic spreads further. In this study, we analyzed the outbreak through data visualization, machine learning algorithms, and time series analysis. The data visualization supports several inferences related to countries, age, symptoms, travel history, etc. We deployed four traditional machine learning algorithms and two neural networks to analyze the data and observed that CNNs perform the best. Moreover, a time series analysis was conducted using the ARIMA and SARIMA models. Given the limitations discussed in this study, in future work we would like to apply additional deep learning techniques to a larger dataset. It would also be interesting to include additional features in the machine learning analysis. Moreover, advanced methods, such as transfer learning and transformers, could be introduced for analysis.

Author Contributions

Conceptualization, I.P. and P.M.; methodology, R.K.; software, D.T.; validation, I.P., R.K., P.M. and D.T.; formal analysis, D.T.; investigation, I.P.; resources, P.M.; data curation, P.M.; writing—original draft preparation, I.P.; writing—review and editing, D.T.; visualization, R.K.; supervision, D.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hraib, M.; Jouni, S.; Albitar, M.M.; Alaidi, S.; Alshehabi, Z. The outbreak of monkeypox 2022: An overview. Ann. Med. Surg. 2022, 79, 104069. [Google Scholar] [CrossRef] [PubMed]
  2. Peters, S.M.; Hill, N.B.; Halepas, S. Oral manifestations of monkeypox: A report of two cases. J. Oral Maxillofac. Surg. 2022, 80, 1836–1840. [Google Scholar] [CrossRef]
  3. Alakunle, E.F.; Okeke, M.I. Monkeypox virus: A neglected zoonotic pathogen spreads globally. Nat. Rev. Genet. 2022, 20, 507–508. [Google Scholar] [CrossRef] [PubMed]
  4. Adler, H.; Gould, S.; Hine, P.; Snell, L.B.; Wong, W.; Houlihan, C.F.; Osborne, J.C.; Rampling, T.; Beadsworth, M.B.; Duncan, C.J.; et al. Clinical features and management of human monkeypox: A retrospective observational study in the UK. Lancet Infect. Dis. 2022, 22, 1153–1162. [Google Scholar] [CrossRef]
  5. Adegboye, O.A.; Castellanos, M.E.; Alele, F.O.; Pak, A.; Ezechukwu, H.C.; Hou, K.; Emeto, T.I. Travel-Related Monkeypox Outbreaks in the Era of COVID-19 Pandemic: Are We Prepared? Viruses 2022, 14, 1283. [Google Scholar] [CrossRef]
  6. Priyadarshini, I.; Mohanty, P.; Kumar, R.; Son, L.H.; Chau, H.T.M.; Nhu, V.-H.; Thi Ngo, P.T.; Tien Bui, D. Analysis of Outbreak and Global Impacts of the COVID-19. Healthcare 2020, 8, 148. [Google Scholar] [CrossRef] [PubMed]
  7. Ciccozzi, M.; Petrosillo, N. The Monkeypox Pandemic as a Worldwide Emergence: Much Ado? Infect. Dis. Rep. 2022, 14, 597–599. [Google Scholar] [CrossRef] [PubMed]
  8. Sharma, A.; Priyanka; Fahrni, M.L.; Choudhary, O.P. Monkeypox outbreak: New zoonotic alert after the COVID-19 pandemic. Int. J. Surg. 2022, 104, 106812. [Google Scholar] [CrossRef] [PubMed]
  9. Lai, C.-C.; Hsu, C.-K.; Yen, M.-Y.; Lee, P.-I.; Ko, W.-C.; Hsueh, P.-R. Monkeypox: An emerging global threat during the COVID-19 pandemic. J. Microbiol. Immunol. Infect. 2022, 55, 787–794. [Google Scholar] [CrossRef]
  10. Kumar, N.; Acharya, A.; Gendelman, H.E.; Byrareddy, S.N. The 2022 outbreak and the pathobiology of the monkeypox virus. J. Autoimmun. 2022, 131, 102855. [Google Scholar] [CrossRef]
  11. Bhattacharya, M.; Dhama, K.; Chakraborty, C. Recently spreading human monkeypox virus infection and its transmission during COVID-19 pandemic period: A travelers’ prospective. Travel Med. Infect. Dis. 2022, 49, 102398. [Google Scholar] [CrossRef]
  12. Minhaj, F.S.; Ogale, Y.P.; Whitehill, F.; Schultz, J.; Foote, M.; Davidson, W.; Wong, M. Monkeypox outbreak—Nine states, May 2022. Morb. Mortal. Wkly. Rep. 2022, 71, 764. [Google Scholar] [CrossRef] [PubMed]
  13. Alshahrani, N.Z.; Alzahrani, F.; Alarifi, A.M.; Algethami, M.R.; Alhumam, M.N.; Ayied, H.A.M.; Awan, A.Z.; Almutairi, A.F.; Bamakhrama, S.A.; Almushari, B.S.; et al. Assessment of Knowledge of Monkeypox Viral Infection among the General Population in Saudi Arabia. Pathogens 2022, 11, 904. [Google Scholar] [CrossRef] [PubMed]
  14. Yang, Z.-S.; Lin, C.-Y.; Urbina, A.N.; Wang, W.-H.; Assavalapsakul, W.; Tseng, S.-P.; Lu, P.-L.; Chen, Y.-H.; Yu, M.-L.; Wang, S.-F. The first case of monkeypox virus infection detected in Taiwan: Awareness and preparation. Int. J. Infect. Dis. 2022, 122, 991–995. [Google Scholar] [CrossRef] [PubMed]
  15. Vouga, M.; Nielsen-Saines, K.; Dashraath, P.; Baud, D. The monkeypox outbreak: Risks to children and pregnant women. Lancet Child Adolesc. Health 2022, 6, 751–753. [Google Scholar] [CrossRef] [PubMed]
  16. Gong, Q.; Wang, C.; Chuai, X.; Chiu, S. Monkeypox virus: A re-emergent threat to humans. Virol. Sin. 2022, 37, 477–482. [Google Scholar] [CrossRef]
  17. Altindis, M.; Puca, E.; Shapo, L. Diagnosis of monkeypox virus–An overview. Travel Med. Infect. Dis. 2022, 50, 102459. [Google Scholar] [CrossRef]
  18. Dashraath, P.; Nielsen-Saines, K.; Mattar, C.; Musso, D.; Tambyah, P.; Baud, D. Guidelines for pregnant individuals with monkeypox virus exposure. Lancet 2022, 400, 21–22. [Google Scholar] [CrossRef]
  19. Quarleri, J.; Delpino, M.V.; Galvan, V. Monkeypox: Considerations for the understanding and containment of the current outbreak in non-endemic countries. Geroscience 2022, 44, 2095–2103. [Google Scholar] [CrossRef]
  20. Gruber, M.F. Current status of monkeypox vaccines. Npj Vaccines 2022, 7, 1–3. [Google Scholar] [CrossRef]
  21. Kriss, J.L.; Boersma, P.M.; Martin, E.; Reed, K.; Adjemian, J.; Smith, N.; Carter, R.J.; Tan, K.R.; Srinivasan, A.; McGarvey, S.; et al. Receipt of first and second doses of JYNNEOS vaccine for prevention of monkeypox—United States, May 22–October 10, 2022. Morb. Mortal. Wkly. Rep. 2022, 71, 1374–1378. [Google Scholar] [CrossRef] [PubMed]
  22. McCarthy, M.W. Therapeutic strategies to address monkeypox. Expert Rev. Anti-Infect. Ther. 2022, 20, 1249–1252. [Google Scholar] [CrossRef] [PubMed]
  23. Tarín-Vicente, E.J.; Alemany, A.; Agud-Dios, M.; Ubals, M.; Suñer, C.; Antón, A.; Arando, M.; Arroyo-Andrés, J.; Calderón-Lozano, L.; Casañ, C.; et al. Clinical presentation and virological assessment of confirmed human monkeypox virus cases in Spain: A prospective observational cohort study. Lancet 2022, 400, 661–669. [Google Scholar] [CrossRef] [PubMed]
  24. Mailhe, M.; Beaumont, A.-L.; Thy, M.; Le Pluart, D.; Perrineau, S.; Houhou-Fidouh, N.; Deconinck, L.; Bertin, C.; Ferré, V.M.; Cortier, M.; et al. Clinical characteristics of ambulatory and hospitalized patients with monkeypox virus infection: An observational cohort study. Clin. Microbiol. Infect. 2022. online ahead of print. [Google Scholar]
  25. Català, A.; Clavo-Escribano, P.; Riera-Monroig, J.; Martín-Ezquerra, G.; Fernandez-Gonzalez, P.; Revelles-Peñas, L.; Simon-Gozalbo, A.; Rodríguez-Cuadrado, F.J.; Castells, V.G.; Gomar, F.J.D.L.T.; et al. Monkeypox outbreak in Spain: Clinical and epidemiological findings in a prospective cross-sectional study of 185 cases. Br. J. Dermatol. 2022, 187, 765–772. [Google Scholar] [CrossRef]
  26. Miura, F.; van Ewijk, C.E.; Backer, J.A.; Xiridou, M.; Franz, E.; de Coul, E.O.; Brandwagt, D.; van Cleef, B.; van Rijckevorsel, G.; Swaan, C.; et al. Estimated incubation period for monkeypox cases confirmed in the Netherlands, May 2022. Eurosurveillance 2022, 27, 2200448. [Google Scholar] [CrossRef] [PubMed]
  27. Harapan, H.; Ophinni, Y.; Megawati, D.; Frediansyah, A.; Mamada, S.S.; Salampe, M.; Bin Emran, T.; Winardi, W.; Fathima, R.; Sirinam, S.; et al. Monkeypox: A Comprehensive Review. Viruses 2022, 14, 2155. [Google Scholar] [CrossRef]
  28. Poland, G.A.; Kennedy, R.B.; Tosh, P.K. Prevention of monkeypox with vaccines: A rapid review. Lancet Infect. Dis. 2022, 22, e349–e358. [Google Scholar] [CrossRef]
  29. Kmiec, D.; Kirchhoff, F. Monkeypox: A New Threat? Int. J. Mol. Sci. 2022, 23, 7866. [Google Scholar] [CrossRef]
  30. Wang, L.; Shang, J.; Weng, S.; Aliyari, S.R.; Ji, C.; Cheng, G.; Wu, A. Genomic annotation and molecular evolution of monkeypox virus outbreak in 2022. J. Med Virol. 2022, 95, e28036. [Google Scholar] [CrossRef]
  31. Shafaati, M.; Zandi, M. Monkeypox virus neurological manifestations in comparison to other orthopoxviruses. Travel Med. Infect. Dis. 2022, 49, 102414. [Google Scholar] [CrossRef]
  32. De Baetselier, I.; Van Dijck, C.; Kenyon, C.; Coppens, J.; Michiels, J.; de Block, T.; Smet, H.; Coppens, S.; Vanroye, F.; Bugert, J.J.; et al. Retrospective detection of asymptomatic monkeypox virus infections among male sexual health clinic attendees in Belgium. Nat. Med. 2022, 28, 2288–2292. [Google Scholar] [CrossRef] [PubMed]
  33. Farahat, R.A.; Abdelaal, A.; Shah, J.; Ghozy, S.; Sah, R.; Bonilla-Aldana, D.K.; Rodriguez-Morales, A.J.; McHugh, T.D.; Leblebicioglu, H. Monkeypox outbreaks during COVID-19 pandemic: Are we looking at an independent phenomenon or an overlapping pandemic? Ann. Clin. Microbiol. Antimicrob. 2022, 21, 1–3. [Google Scholar] [CrossRef] [PubMed]
  34. Priyadarshini, I.; Chatterjee, J.M.; Sujatha, R.; Jhanjhi, N.; Karime, A.; Masud, M. Exploring Internet Meme Activity during COVID-19 Lockdown Using Artificial Intelligence Techniques. Appl. Artif. Intell. 2021, 36, 1–24. [Google Scholar] [CrossRef]
  35. Lum, F.-M.; Torres-Ruesta, A.; Tay, M.Z.; Lin, R.T.P.; Lye, D.C.; Rénia, L.; Ng, L.F.P. Monkeypox: Disease epidemiology, host immunity and clinical interventions. Nat. Rev. Immunol. 2022, 22, 597–613. [Google Scholar] [CrossRef] [PubMed]
  36. Manjurul Ahsan, M.; Abu Abdullah, T.; Shahin Ali, M.; Jahora, F.; Khairul Islam, M.; Alhashim, A.G.; Datta Gupta, K. Transfer learning and Local interpretable model agnostic based visual approach in Monkeypox Disease Detection and Classification: A Deep Learning insights. arXiv 2022, arXiv:2211.05633. [Google Scholar]
  37. Dansana, D.; Kumar, R.; Das Adhikari, J.; Mohapatra, M.; Sharma, R.; Priyadarshini, I.; Le, D.-N. Global Forecasting Confirmed and Fatal Cases of COVID-19 Outbreak Using Autoregressive Integrated Moving Average Model. Front. Public Health 2020, 8, 580327. [Google Scholar] [CrossRef]
  38. Girometti, N.; Byrne, R.; Bracchi, M.; Heskin, J.; McOwan, A.; Tittle, V.; Gedela, K.; Scott, C.; Patel, S.; Gohil, J.; et al. Demographic and clinical characteristics of confirmed human monkeypox virus cases in individuals attending a sexual health centre in London, UK: An observational analysis. Lancet Infect. Dis. 2022, 22, 1321–1328. [Google Scholar] [CrossRef]
  39. Thornhill, J.P.; Barkati, S.; Walmsley, S.; Rockstroh, J.; Antinori, A.; Harrison, L.B.; Palich, R.; Nori, A.; Reeves, I.; Habibi, M.S.; et al. Monkeypox Virus Infection in Humans across 16 Countries—April–June 2022. N. Engl. J. Med. 2022, 387, 679–691. [Google Scholar] [CrossRef]
  40. Rao, A.K.; Schulte, J.; Chen, T.H.; Hughes, C.M.; Davidson, W.; Neff, J.M.; Markarian, M.; Delea, K.C.; Wada, S.; Liddell, A.; et al. Monkeypox in a traveler returning from Nigeria—Dallas, Texas, July 2021. Morb. Mortal. Wkly. Rep. 2022, 71, 509. [Google Scholar] [CrossRef]
  41. Mileto, D.; Riva, A.; Cutrera, M.; Moschese, D.; Mancon, A.; Meroni, L.; Giacomelli, A.; Bestetti, G.; Rizzardini, G.; Gismondo, M.R.; et al. New challenges in human monkeypox outside Africa: A review and case report from Italy. Travel Med. Infect. Dis. 2022, 49, 102386. [Google Scholar] [CrossRef]
  42. Orviz, E.; Negredo, A.; Ayerdi, O.; Vázquez, A.; Muñoz-Gomez, A.; Monzón, S.; Clavo, P.; Zaballos, A.; Vera, M.; Sánchez, P.; et al. Monkeypox outbreak in Madrid (Spain): Clinical and virological aspects. J. Infect. 2022, 85, 412–417. [Google Scholar] [CrossRef] [PubMed]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
