Meta-Heuristic Optimization of LSTM-Based Deep Network for Boosting the Prediction of Monkeypox Cases

Recent technologies such as artificial intelligence, machine learning, and big data are essential for supporting healthcare monitoring systems, particularly for monitoring Monkeypox confirmed cases. Infected and uninfected cases around the world have contributed to a growing dataset, which is publicly available and can be used by artificial intelligence and machine learning to predict the confirmed cases of Monkeypox at an early stage. Motivated by this, we propose in this paper a new approach for accurate prediction of the Monkeypox confirmed cases based on an optimized Long Short-Term Memory (LSTM) deep network. To fine-tune the hyper-parameters of the LSTM-based deep network, we employed the Al-Biruni Earth Radius (BER) optimization algorithm; thus, the proposed approach is denoted by BER-LSTM. Experimental results show the effectiveness of the proposed approach when assessed using various evaluation criteria, such as Mean Bias Error, which is recorded as (0.06) using BER-LSTM. To prove the superiority of the proposed approach, six different machine learning models are included in the conducted experiments. In addition, four different optimization algorithms are considered for comparison purposes. The results of this comparison confirmed the superiority of the proposed approach. On the other hand, several statistical tests are applied to analyze the stability and significance of the proposed approach. These tests include one-way Analysis of Variance (ANOVA), Wilcoxon, and regression tests. The results of these tests emphasize the robustness, significance, and efficiency of the proposed approach.


Introduction
After the global impact of COVID-19 in 2020, numerous countries reported that Monkeypox had emerged in 2022, presenting a new global health crisis. Even though the effects of COVID-19 on the worldwide economy and healthcare have been felt for over two years, a second viral outbreak has now emerged, with the Monkeypox virus (MPV) as its etiological agent. MPV is a zoonotic Orthopoxvirus, closely related to cowpox and smallpox, that causes the contagious disease known as Monkeypox [1]. It is a member of the Poxviridae family and the Orthopoxvirus genus. Monkeypox is not a recent discovery: the virus was first identified in 1958 in monkeys kept in a laboratory in Copenhagen, Denmark, which is how it received its name [2,4]. Although rodents and monkeys are the primary carriers, human-to-human transmission is also very common [3]. In 1970, the Democratic Republic of the Congo (DRC) reported the first instance of zoonotic MPV transmission from animal to human [5]. Monkeypox typically affects people who live close to tropical rain forests in Central and Western Africa. The virus spreads when a person comes into close contact with an infected individual, animal, or object. Direct bodily contact, animal bites, respiratory droplets, or mucus from the eyes, nose, or mouth can all spread it [6]. Fever, body aches, and exhaustion are a few early signs of Monkeypox infection in patients, with red bumps on the skin as the longer-term result [7].
According to data collected thus far, Monkeypox is not nearly as contagious as COVID-19; however, the number of reported cases is climbing. In 1990, only 50 cases of Monkeypox were reported in West and Central Africa. By 2020, the number of reported cases had increased to five thousand. In the past, Monkeypox was thought to be found only in Africa. However, in 2022, people in several countries outside Africa, including the United States and countries in Europe, were found to have the virus [8]. Consequently, widespread fear is gradually developing among the general population; this is frequently reflected in the opinions expressed by individuals on social media. According to the Centers for Disease Control and Prevention (CDC) recommendations, there is currently no effective treatment for the Monkeypox virus. Many countries' healthcare systems and experts are struggling under the pressure of a shortage of medical supplies due to a growing patient population [9]. Therefore, understanding the outbreak's growth and making accurate predictions regarding its future evolution is one of the most critical steps that can be taken to stop its progression. This is especially true in nations with a sizable population, such as India. Accurate forecasting of Monkeypox distribution patterns can assist in predicting the outbreak and help governments become better equipped to tackle it. Additionally, precise forecasting can offer feedback on how well an implemented policy works to reduce the burden on a nation's healthcare system [10]. The government can then assess the effectiveness of mitigation plans and implement policy regulations based on the predicted impact zones. For instance, researchers have effectively anticipated the reproduction parameter of COVID-19 in Indonesia using mathematical models such as the SIR and SEIR models, demonstrating the need for accurate forecasting methods [11].
In recent years, artificial intelligence (AI) has shown promising results as a decision support system that aids in detecting diseases and establishing accurate medical diagnoses, among many other applications. Researchers and governments have concentrated on machine learning (ML), a subset of AI that can learn from past data to solve a real-world problem and make accurate predictions about the number of pandemic cases, which is crucial for controlling the virus's incubation and transmission. For example, in the COVID-19 pandemic, ML can forecast an outbreak by assessing the virus's riskiness, so that the level of the countermeasures can be stepped up. Many countries that used ML to detect COVID-19 have seen a decline in the virus's propagation. Many academics have developed models and systems to predict diseases using ML and deep learning (DL) approaches. ML algorithms are widely applicable in the field of medical analysis; for example, in the prediction of COVID-19 [12], the progression of Alzheimer's disease [13], brain tumor [14], breast cancer [15], and other diseases [16,17]. ML and DL are crucial in diagnosing diseases and finding solutions to health threats.
In this paper, we propose a new approach for boosting the prediction accuracy of Monkeypox infections. The proposed approach is based on an LSTM deep network whose parameters are optimized using the BER optimization algorithm. The contributions of this work can be summarized as follows:

• A new approach is proposed based on optimized LSTM prediction to improve the accuracy of Monkeypox infection prediction.

• The proposed approach is compared with other ML models and optimization algorithms, and the results are recorded.

• The recorded results are analyzed using statistical methods such as Wilcoxon's rank-sum test and one-way analysis of variance to evaluate the statistical difference and significance of the proposed approach.

• The proposed approach can be generalized and tested for other datasets.

Related Works
ML and DL are crucial in diagnosing diseases and finding solutions to health issues. Many academics have developed models and systems to predict different diseases using ML and DL approaches. Alzheimer's disease cannot be diagnosed with a single specific test. Instead, the diagnosis should draw on the clinical history, cognitive and laboratory tests, and electroencephalography (EEG). Therefore, new methods are required to ensure earlier and more precise diagnosis and to monitor treatment outcomes. With the goal of distinguishing Alzheimer's disease patients from controls, the authors in [17] employed an ML technique called support vector machine (SVM) to search EEG epochs for distinguishing features. A quantitative EEG (qEEG) processing method was created for automatically differentiating patients with Alzheimer's from healthy persons. The analysis performed at the level of each patient's diagnosis achieved high accuracy.
Diseases of the heart rank among the world's top five leading causes of death in the modern era. A significant problem in clinical data analysis is the prediction of cardiovascular disease. ML has been shown to support predictions and decisions based on the vast amount of data generated by the healthcare sector. Several studies have only partially explored the use of ML approaches to predict cardiac disease. The authors in [18] proposed a unique approach that improves the precision of cardiovascular disease prediction by identifying key features using ML techniques. The prediction model is presented with a variety of feature combinations as well as a number of well-established classification methods.
It is usual practice to base a diagnosis of Parkinson's disease (PD) on medical observations and an evaluation of clinical signs. This evaluation often involves the characterization of a wide range of motor symptoms. However, there is a risk of misclassification with conventional diagnostic methods, since they rely on the evaluation of motions that can be too subtle for human eyes. For diagnosing PD, ML also enables the combination of several modalities, such as magnetic resonance imaging (MRI) and single-photon emission computed tomography (SPECT) data [19]. By applying ML algorithms, pertinent traits that are not often used in the clinical diagnosis of PD may be discovered, and these alternative measures can then be relied on to detect PD in preclinical stages or atypical forms.
Fatty liver disease (FLD) is a frequent clinical complication that is linked to high morbidity and mortality. An earlier prediction of FLD makes it possible to develop a suitable plan for prevention, early diagnosis, and therapy. The authors of [20] created an ML model that could predict FLD and help doctors identify high-risk patients, establish a new diagnosis, and prevent and manage FLD. To predict FLD, four classification models were created and analyzed: logistic regression, random forest, naive Bayes, and artificial neural networks (ANN). The performance of the four models was compared using the area under the receiver operating characteristic (ROC) curve. Of these, the random forest model performed better than the other classification methods. A random forest model could therefore be used in the clinical setting to help doctors categorize patients with fatty livers for primary prevention, surveillance, early treatment, and management.
An increasing number of people worldwide are developing chronic kidney disease (CKD), which significantly impacts overall health and well-being. In the beginning stages of CKD, there are no noticeable symptoms; therefore, many people do not realize they have it. When CKD is diagnosed in its earliest stages, patients can receive medication that slows the disease's progression. Due to their quick and precise recognition abilities, ML models can help clinicians accomplish this goal. The authors in [21] suggested an ML approach to CKD diagnosis. The CKD dataset was obtained from the University of California Irvine (UCI) ML repository [22] and contains a significant number of missing values. Since patients may miss particular measurements for a variety of reasons, missing data are frequently observed in real-world medical settings. After the missing values were filled in, six ML algorithms were employed to create models. With a diagnosis accuracy rate of 99.75%, random forest outperformed the other ML models. By assessing the errors of the established models, an integrated model was suggested that combines logistic regression and random forest using a perceptron, which reached an average accuracy of 99.83% after ten simulations. The authors therefore hypothesized that this methodology could be applied to clinical data for more complex disease diagnoses.
The authors in [23] presented a novel ML approach that makes it possible to accurately diagnose coronary artery disease (CAD). Ten classic ML methods were evaluated. Data standardization and preprocessing were performed to increase the efficiency of these methods. A genetic algorithm and particle swarm optimization, combined with stratified ten-fold cross-validation, were employed for parallel feature selection and classifier parameter optimization. Results demonstrated that the proposed method could significantly improve the accuracy of ML models for clinical and research applications.
Recently, Monkeypox has become a rapidly spreading disease around the world, with outbreaks already documented in 75 different nations outside of Africa. The similarities between Monkeypox, chickenpox, and measles make early clinical diagnosis difficult. Monitoring and quick identification of patients infected with Monkeypox may be aided by computer-assisted detection of lesion morphology in circumstances where confirmatory Polymerase Chain Reaction (PCR) assays are not easily accessible. When enough training examples are available, DL techniques have been demonstrated to be useful for automatically detecting skin lesions. There was already a knowledge gap among medical experts worldwide due to the rarity of Monkeypox before the current outbreak. The accomplishments of supervised ML in the detection of COVID-19 serve as inspiration for scientists as they work to find a solution to this difficult problem. However, there is a shortage of Monkeypox skin images, which creates a bottleneck in applying ML to the detection of Monkeypox from photographs of patients' skin.
The authors in [24] presented the largest dataset of Monkeypox skin images in their research. Photographs of healthy and infected skin were gathered through web scraping to create a complete image database available to the public. Infected skin images included those with measles, cowpox, chickenpox, smallpox, and Monkeypox. The authors in [25] developed the Monkeypox Skin Lesion Dataset (MSLD), which includes pictures of skin lesions caused by measles, chickenpox, and Monkeypox. Most photographs were gathered from websites, news portals, and case reports that are available to the general public. In the first phase, a three-fold cross-validation experiment is set up, and the sample size is increased through data augmentation. The second phase classifies diseases such as Monkeypox using several pretrained DL models, including VGG-16, ResNet50, and InceptionV3. ResNet50 achieves the highest overall accuracy. The authors in [26] proposed image data collection and a DL-based implementation for detecting Monkeypox disease using a modified VGG16. The dataset was developed by gathering photos from various open-source and internet resources, providing a safer approach to utilizing and disseminating such data for developing and deploying any ML model. The modified VGG16 model was applied in two separate studies. According to their findings, this model recognizes patients with Monkeypox disease with high accuracy in both studies. The model's prediction and feature extraction helped to provide a deeper insight into specific features of the Monkeypox virus.

The Proposed Methodology
The LSTM-based neural network is an effective approach to handling time series. LSTM memory cells are responsible for facilitating efficient data transfer. Figure 1 depicts the overall framework of the LSTM prediction model developed in this paper, which consists of the following five functional modules: input layer, hidden layer, output layer, network training, and network prediction. This framework is developed by taking into account the data characteristics of the finite sample points of the Monkeypox infection time series and the design principle of simplifying the deep neural network. The LSTM cells are used to construct a single-layer neural network, whose output layer is then used to make predictions.
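As an illustration of how a finite case series can be presented to the input layer, the sketch below turns a univariate series into fixed-length input windows paired with the next value as the target. The function name `make_windows`, the window length, and the sample numbers are assumptions made for this example; the paper does not state how its inputs were windowed.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1-D series into (samples, window, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    if window < 1 or window >= len(series):
        raise ValueError("window must be in [1, len(series) - 1]")
    # Each input holds `window` consecutive values; the target is the value that follows.
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y

# Hypothetical cumulative case counts, used only to show the resulting shapes.
cases = [1, 3, 6, 10, 15, 21, 28]
X, y = make_windows(cases, window=3)
```

The trailing axis of size one is the feature dimension that recurrent layers commonly expect for a univariate series.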

LSTM
Recurrent neural networks (RNNs) have emerged as a popular choice for representing time series. To better predict future states at the output layer, RNNs implement a context layer that works as memory. Elman RNNs, although just one of several RNN designs, have been used to describe temporal sequences and dynamical systems. Training RNNs, however, has been difficult. RNNs have been trained using a variant of the backpropagation technique called backpropagation-through-time (BPTT). BPTT applies gradient descent with backpropagation of error to the network unrolled over time, so that the time-evolving RNN looks a lot like a multilayer perceptron with several discrete "hidden" layers. Learning long-term dependencies in the face of vanishing and exploding gradients has been a significant challenge for BPTT applied to basic RNNs. In response to this shortcoming, the LSTM network introduced a layer of hidden memory cells to improve its ability to recall long-term dependencies [27]. As can be seen in Figure 2, the memory cells are useful for keeping track of the long-term relationships in data.
Traditional RNNs and LSTMs only use the context state from the past to predict the future. In contrast, bidirectional RNNs (BD-RNNs) process data in both directions by using two independent hidden layers that each transmit information to a single output layer. As a result, two separate RNNs are combined in order to provide both forward and backward sequence information at each time step. By iterating the backward layer from t = T to t = 1 and the forward layer from t = 1 to t = T, we are able to calculate the forward hidden sequence $h_f$, the backward hidden sequence $h_b$, and the output sequence $y$. Like BD-RNNs, BD-LSTM may retrieve long-term context or state from both directions. A number of practical sequence processing problems, including phoneme classification, continuous voice recognition, and speech synthesis, have benefited from the use of BD-LSTM networks since their first proposal for word embedding in natural language processing. Since it is important to keep track of future state information, BD-LSTM networks take data in two directions: forward, from the present to the future, and backward, from the future to the past. When the network is given two hidden states that are joined at each time step, it can store data from both the past and the future. The hidden state output $h_t$ is calculated in the LSTM from its gate activations, where, at time $t$, $i_t$ denotes the input gate, $f_t$ denotes the forget gate, and $o_t$ denotes the output gate. The memory cell is referred to as $c$. The hidden state, whose size equals the number of hidden units, is denoted by $h_t$, and the input features are denoted by $x_t$.

Al-Biruni Earth Radius Optimization Algorithm
The goal of optimization algorithms is to find the best possible solution to a problem under given constraints. In the BER optimization algorithm, an individual from the population is represented as a vector $S = \{S_1, S_2, \ldots, S_d\} \in \mathbb{R}^d$, where $d$ is the dimension of the search space and each component $S_i$ is a parameter or feature in the optimization problem [28,29]. A fitness function $F$ is used to assess the quality of each individual. The steps of the optimization technique search the population for an optimal vector $S^*$ that optimizes the fitness. The method begins by selecting a random group of individuals (solutions) from the population. The fitness function, the lower and upper limits for each solution, the dimension, and the population size are all required before BER can begin the optimization process. The optimization algorithm used to optimize the parameters of the LSTM is depicted in Algorithm 2.
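The inputs listed above (fitness function, lower and upper limits, dimension, and population size) can be illustrated with a minimal initialization sketch. The names `init_population` and `sphere` are hypothetical, and the sphere function is only a placeholder for the real fitness, which in this setting would be the prediction error of an LSTM trained with the candidate hyper-parameters.

```python
import numpy as np

def init_population(pop_size, lower, upper, rng):
    """Draw pop_size random solutions uniformly inside [lower, upper] per dimension."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + rng.random((pop_size, lower.size)) * (upper - lower)

def sphere(solution):
    """Placeholder fitness F to be minimized (assumption: stands in for the LSTM error)."""
    return float(np.sum(solution ** 2))

rng = np.random.default_rng(0)
population = init_population(10, lower=[-5, -5, -5], upper=[5, 5, 5], rng=rng)
fitness = np.array([sphere(s) for s in population])
best = population[np.argmin(fitness)]  # current candidate for the optimal vector S*
```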

Exploration Operation
Exploration is responsible both for identifying interesting regions of the search space and for avoiding stagnation in local optima by making forward progress towards the optimal solution, as explained below.

• Moving towards the best solution: Using this strategy, an individual explorer in the group looks for promising new areas to explore in the immediate vicinity of its current position. This is achieved by iteratively looking for a better choice (in terms of fitness) among the many possible alternatives in the immediate area. The BER performs this step with update equations in which $0 < x \le 180$, $h$ is a number randomly selected from the range [0, 2], $r_1$ and $r_2$ are coefficient vectors whose values are computed by Equation (2), $S(t)$ is the solution vector at iteration $t$, and $D$ is the diameter of the circle in which the search agent looks for promising areas.
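The exploration move can be sketched in code. The update equations themselves are not reproduced in this text, so the forms used below, $r = h\cos(x)/(1-\cos(x))$, $D = r_1(S(t) - 1)$, and $S(t+1) = S(t) + D(2r_2 - 1)$, are assumptions based on how the BER algorithm is commonly stated; treating $x$ as an angle in degrees, the small numerical guard, and the clipping to the search bounds are likewise choices made for this example.

```python
import numpy as np

def ber_coefficient(rng, size):
    """Assumed form of the coefficient: r = h*cos(x) / (1 - cos(x)).

    h is uniform in [0, 2] and x is an angle in (0, 180] degrees.
    """
    h = rng.uniform(0.0, 2.0, size)
    # rng.random is in [0, 1), so 1 - rng.random is in (0, 1] and x is in (0, 180].
    x = np.radians(180.0 * (1.0 - rng.random(size)))
    denom = np.maximum(1.0 - np.cos(x), 1e-12)  # numerical guard, not part of the formula
    return h * np.cos(x) / denom

def explore(solution, lower, upper, rng):
    """One exploration move (assumed form): S(t+1) = S(t) + D*(2*r2 - 1), D = r1*(S(t) - 1)."""
    r1 = ber_coefficient(rng, solution.shape)
    r2 = ber_coefficient(rng, solution.shape)
    D = r1 * (solution - 1.0)
    candidate = solution + D * (2.0 * r2 - 1.0)
    return np.clip(candidate, lower, upper)  # keep the move inside the search bounds

rng = np.random.default_rng(42)
moved = explore(np.array([0.5, 2.0, -1.0]), lower=-5.0, upper=5.0, rng=rng)
```

Because the coefficient grows without bound as $x$ approaches zero, a single move can jump far from the current position, which is why the candidate is clipped before it is evaluated.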

Exploitation Operation
The exploitation group is responsible for improving existing solutions. At the end of each cycle, the BER algorithm identifies the individuals with the best fitness and rewards them accordingly. The BER uses two distinct methods, described below, to accomplish the goal of exploitation.

• Moving towards the best solution: To move in the direction of the best solution, an update rule is employed in which $r_3$ is a random vector calculated using Equation (2) that controls the movement steps towards the best solution, $S(t)$ is the solution vector at iteration $t$, $L(t)$ is the best solution vector, and $D$ refers to the distance vector.

• Searching the area around the best solution: The most promising region is the area around the best solution (the leader). For this reason, some individuals look for improvements by exploring areas close to the best solution. The BER uses the following equation to carry out this procedure: $S'(t+1) = r\,(S^{*}(t) + k)$, where $S^{*}(t)$ refers to the best solution. After comparing $S(t+1)$ and $S'(t+1)$, the best solution $S^{*}$ can be selected. If the best fitness has not changed for the last two iterations, the solution is mutated using an equation in which $z$ is a random number in the range [0, 1] and $t$ is the iteration number.
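The two exploitation moves can be sketched as follows. Only $S'(t+1) = r(S^{*}(t) + k)$ appears in this text; the forms $D = r_3(L(t) - S(t))$, $S(t+1) = r_2(S(t) + D)$, and the schedule $k = 1 + 2t^2/Max_{iter}^2$ are assumptions based on how the BER algorithm is commonly stated. The coefficients are passed in as arguments so that the example stays deterministic, and the helper names are hypothetical.

```python
import numpy as np

def move_towards_best(solution, best, r2, r3):
    """Assumed form: S(t+1) = r2 * (S(t) + D), with D = r3 * (L(t) - S(t))."""
    D = r3 * (best - solution)
    return r2 * (solution + D)

def search_around_best(best, r, t, max_iter):
    """S'(t+1) = r * (S*(t) + k); k = 1 + 2*t^2/max_iter^2 is an assumed schedule."""
    k = 1.0 + 2.0 * t ** 2 / max_iter ** 2
    return r * (best + k)

def pick_better(a, b, fitness):
    """Keep whichever candidate has the lower fitness (minimization)."""
    return a if fitness(a) <= fitness(b) else b

# With r2 = r3 = 1 the first move lands exactly on the leader.
leader = np.array([1.0, 1.0])
step = move_towards_best(np.array([0.0, 2.0]), leader, r2=1.0, r3=1.0)
```

Comparing the outputs of the two moves with `pick_better` mirrors the comparison of $S(t+1)$ and $S'(t+1)$ described in the text.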

Selection of the Best Solution
To guarantee that the solutions are of high quality, the BER selects the best solution to carry into the following cycle. However, although this elitism approach increases efficiency, it may cause multimodal functions to converge too quickly [30][31][32]. The BER provides outstanding exploration capabilities by using a mutation approach and by searching around members of the exploration group. Strong exploration capabilities allow the BER to delay convergence. The BER pseudocode is given in Algorithm 2. We begin by feeding the BER some information, such as the population size, mutation rate, and the number of iterations. The BER then divides the individuals into two groups: those that perform exploration and those that perform exploitation. During the iterative process of finding the optimal solution, the BER algorithm dynamically adjusts the size of each group. Each group uses two methods to perform its tasks. Between iterations, the BER shuffles the order of the solutions to guarantee diversity and thorough exploration. For example, a solution that is part of the exploration group in one iteration might move to the exploitation group in the next. The elitist approach taken by the BER ensures that the leader is not replaced during the process unless a better solution is found.
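The shuffling, group split, and elitist selection described above can be sketched as two small helpers. This is an illustrative sketch rather than Algorithm 2 itself: the function names are hypothetical, the exploration fraction is a fixed argument rather than a dynamically adjusted one, and fitness is assumed to be minimized.

```python
import numpy as np

def split_groups(population, explore_fraction, rng):
    """Shuffle the population, then split it into exploration and exploitation groups."""
    order = rng.permutation(len(population))
    n_explore = max(1, int(round(explore_fraction * len(population))))
    shuffled = population[order]
    return shuffled[:n_explore], shuffled[n_explore:]

def apply_elitism(population, fitness_values, leader, leader_fitness):
    """Return the best solution seen so far; the leader is replaced only by a better one."""
    idx = int(np.argmin(fitness_values))
    if fitness_values[idx] < leader_fitness:
        return population[idx].copy(), float(fitness_values[idx])
    return leader, leader_fitness

rng = np.random.default_rng(1)
pop = np.arange(20.0).reshape(10, 2)
explorers, exploiters = split_groups(pop, explore_fraction=0.2, rng=rng)
```

Because the population is reshuffled on every call, a solution that explores in one iteration can end up in the exploitation group in the next, while `apply_elitism` keeps the leader intact across iterations.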

Experimental Results
In this paper, we conducted a set of experiments to assess the effectiveness and superiority of the proposed BER-LSTM approach in predicting Monkeypox cases. To justify the achieved results, a set of ML models along with four optimization algorithms were incorporated into the conducted experiments. The next sections present the dataset employed in the conducted experiments, discuss the achieved results, and then summarize the comparisons.
The hardware platform for the experiments is as follows: an Intel Core i7 CPU, a GeForce RTX2070 Super GPU (graphics processing unit) with 8 GB of memory, and 16 GB of DDR4 RAM for general processing and data storage. The software platform is Ubuntu 20.04 with CUDA 9.0, cuDNN 7.1, TensorFlow 1.15, and the Spyder IDE with Python 3.7. In order to complete the model training process quickly, trials are run with a batch size ≥ 16.

Dataset
The dataset employed in the conducted experiments is publicly available on Kaggle [33]. The records of this dataset are updated daily to include the up-to-date confirmed cases around the world. Figure 3 shows the world map, with colored regions showing the spread of Monkeypox infections. In addition, the plot in Figure 4 shows the timeline of the confirmed cases with respect to the date of infection, up to the date of writing this article. As shown in this plot, the number of confirmed cases is increasing, which demands accurate prediction to help governments prepare the necessary precautions.
The Pearson correlation of the features recorded in the dataset is shown in Figure 5. In this figure, there is a high correlation between the travel history and the number of confirmed cases, which indicates the relevant features that affect the accurate prediction of the confirmed cases.

Configuration Parameters
To assess the proposed approach and demonstrate its superiority, four optimization algorithms were included in the conducted experiments to optimize the parameters of the LSTM-based deep network. The configuration parameters of these optimization algorithms, along with the proposed BER-based algorithm, are presented in Table 1. These parameters are used to set up the operation of each algorithm to achieve the best prediction results.

Optimization of Parameters in LSTM
In this work, we adopted the BER algorithm to optimize the parameters of the neural network in the LSTM model, namely the number of hidden layers, the number of hidden nodes, and the learning rate. In the BER algorithm, we set the number of iterations to 500, the number of runs to 30, the mutation probability to 0.5, the exploration percentage to 20, and K to 1. Figure 6 exhibits the objective function value at each iteration step; the algorithm finds the best objective function value of 4.21 at the 15th iteration step. The upper and lower bounds of the search space for the three parameters are presented in Table 2.
Table 2. The optimized set of parameters of the LSTM model.
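Since the optimizer works on real-valued vectors while two of the three parameters are integers, each candidate has to be mapped back to a valid LSTM configuration before it is evaluated. The sketch below shows one way to do this. The bounds in `BOUNDS` are hypothetical placeholders (the values of Table 2 are not reproduced in this text), and `decode_solution` is a name introduced only for this example.

```python
# Hypothetical search bounds; the actual values belong to Table 2.
BOUNDS = {
    "hidden_layers": (1, 3),
    "hidden_nodes": (8, 128),
    "learning_rate": (1e-4, 1e-1),
}

def decode_solution(solution, bounds=BOUNDS):
    """Clip a 3-element real vector to the bounds and round the integer-valued entries."""
    names = ("hidden_layers", "hidden_nodes", "learning_rate")
    clipped = {
        name: min(max(float(value), bounds[name][0]), bounds[name][1])
        for name, value in zip(names, solution)
    }
    return {
        "hidden_layers": int(round(clipped["hidden_layers"])),
        "hidden_nodes": int(round(clipped["hidden_nodes"])),
        "learning_rate": clipped["learning_rate"],
    }

config = decode_solution([2.4, 500.0, 0.01])
```

The fitness of a candidate would then be the validation error of an LSTM built from `config`, which is the expensive step that the optimizer tries to call as few times as possible.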

Evaluation Criteria
The proposed approach is evaluated in terms of the metrics listed in Table 3, which presents the formula used to calculate each metric. The metrics are Root Mean Square Error (RMSE), Relative RMSE (RRMSE), Mean Absolute Error (MAE), Mean Bias Error (MBE), Pearson's correlation coefficient (r), Nash Sutcliffe Efficiency (NSE), coefficient of determination (R2), and Willmott's index of agreement (WI), where $N$ is the number of observations in the dataset, $\hat{V}_n$ and $V_n$ are the $n$th estimated and observed values, and $\bar{V}$ and $\bar{\hat{V}}$ are the arithmetic means of the observed and estimated values [38,39].
Table 3. Performance evaluation metrics [40].

Metric
Value
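Several of the listed metrics can be computed directly from paired observed and estimated values, as in the sketch below. The formulas are the standard textbook definitions and are assumed to match those of Table 3, which is not reproduced in this text; RRMSE and R2 are left out because their normalization conventions vary between sources. The function name and the sample values are hypothetical.

```python
import numpy as np

def regression_metrics(observed, estimated):
    """Return common error metrics for paired observed/estimated values."""
    v = np.asarray(observed, dtype=float)
    v_hat = np.asarray(estimated, dtype=float)
    err = v_hat - v
    v_mean = v.mean()
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    mbe = float(np.mean(err))  # positive means over-estimation on average
    r = float(np.corrcoef(v, v_hat)[0, 1])
    # Nash Sutcliffe Efficiency: 1 minus error variance relative to observed variance.
    nse = float(1.0 - np.sum(err ** 2) / np.sum((v - v_mean) ** 2))
    # Willmott's index of agreement.
    wi = float(1.0 - np.sum(err ** 2)
               / np.sum((np.abs(v_hat - v_mean) + np.abs(v - v_mean)) ** 2))
    return {"RMSE": rmse, "MAE": mae, "MBE": mbe, "r": r, "NSE": nse, "WI": wi}

# Hypothetical values, used only to exercise the function.
metrics = regression_metrics([1, 2, 3, 4], [1.5, 2.5, 3.5, 4.5])
```

In this toy case every estimate is too high by the same amount, so the correlation is perfect while MBE, MAE, and RMSE all equal the constant offset, which shows why a bias metric is reported alongside the correlation.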

The Achieved Results
To evaluate the performance of the proposed approach, the evaluation metrics were calculated for the prediction results achieved by the proposed BER-LSTM model and six other models, namely, standard LSTM [41], bidirectional LSTM (BILSTM) [42], gated recurrent unit (GRU) [43], multiple LSTMs [44], multiple BILSTMs [42], and convolutional LSTMs (CONVLSTMs) [45]. The results achieved on the training set are presented in Table 4. As shown in this table, the proposed approach outperforms the other six models on all metrics. These results demonstrate the superiority of the proposed approach over the standard approaches in predicting confirmed Monkeypox cases. The results achieved on the testing set are presented in Table 5. The evaluated value of MSE, for instance, is 480.53 using the proposed approach, whereas the best value of this metric among the standard approaches was 503.24, obtained by the BILSTMs. Similarly, the values of the other evaluation metrics achieved by the proposed approach are better than those of the standard approaches on the testing set of the Monkeypox cases.
The proposed approach is also studied from the statistical perspective, and the results are shown in Table 6. The Python packages and other programs used in this statistical analysis include SciPy, Matplotlib, OriginPro, and DataMelt, in addition to SPSS. For a fair comparison, the results presented in this table were calculated using the average of eight runs of the proposed approach, and the predictions were analyzed. A total of 20 iterations per run ($Max_{iter}$ in Algorithm 2) is used to ensure the statistical significance of the proposed approach when compared with the other competing approaches in the ANOVA and Wilcoxon rank-sum tests.
The proposed approach is compared with different algorithms, and the ANOVA test results are presented in Table 7 to determine the statistical significance of the differences between them. The dependent variable in the ANOVA test is RMSE. The hypothesis testing is formulated using two hypotheses: the null hypothesis (H0: $\mu_A = \mu_B = \mu_C = \mu_D = \mu_E$), where A is the BER-LSTM algorithm, B is the PSO-LSTM algorithm, C is the GWO-LSTM algorithm, D is the GA-LSTM algorithm, and E is the WOA-LSTM algorithm; and the alternate hypothesis (H1: means are not all equal). In addition, Wilcoxon's rank-sum statistical analysis of the proposed algorithm in comparison with the other algorithms is shown in Table 8. Hypothesis testing is again formulated by two hypotheses: the null hypothesis (H0: $\mu_{BER-LSTM} = \mu_{PSO-LSTM}$, $\mu_{BER-LSTM} = \mu_{GWO-LSTM}$, $\mu_{BER-LSTM} = \mu_{GA-LSTM}$, and $\mu_{BER-LSTM} = \mu_{WOA-LSTM}$); and the alternate hypothesis (H1: means are not all equal). The dependent variable in Wilcoxon's rank-sum test is RMSE. The resulting p-values below 0.05 indicate that the superiority of the BER-LSTM algorithm is statistically significant. Thus, the alternate hypothesis H1 is accepted.
Regression testing bears the primary responsibility for the reliability and effectiveness of the currently implemented features. To ensure that the proposed approach remains robust under ongoing updates to the feature values, regression testing is performed after a series of updates to feature values. Modifications to the features might cause unintended consequences, such as broken features or reliance on a faulty dependency. The regression test results of the BER-LSTM with respect to the compared methods are shown in Table 9. As shown in this table, the regression testing shows a reliable performance of the proposed BER-LSTM algorithm.
Tukey's honestly significant difference (HSD) test, also known as the Tukey test, is a post hoc statistical test used to determine whether there is a statistically significant difference between the means of two sets of data. Once an ANOVA has found a significant difference in the means of three or more sets of data, this test is carried out based on the studentized range distribution. The average RMSE with the Tukey test is shown in Figure 7. In this figure, the proposed approach achieves the smallest value of RMSE, which reflects its effectiveness in predicting the Monkeypox cases robustly. The visual plots depicted in Figure 8 highlight the performance of the proposed approach in predicting the Monkeypox cases. In this figure, four plots are shown, namely, residual, homoscedasticity, quantile-quantile (QQ), and heatmap plots. A QQ plot is typically constructed by charting the quantiles of two distributions against each other and evaluating the differences between them. The QQ points are shown to roughly follow a straight line. As a result, the linear relationship between the observed and predicted residuals supports the claimed effectiveness of the proposed BER-LSTM. The histogram of RMSE values for the BER-LSTM and compared algorithms is shown in Figure 9. A sample ROC curve based on the proposed BER-LSTM approach versus one of the competing approaches, namely, PSO-LSTM, is shown in Figure 10. This curve supports the superiority and effectiveness of the proposed approach.
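To make the two significance tests concrete, the sketch below computes the one-way ANOVA F statistic and the Wilcoxon rank-sum z statistic (normal approximation) by hand. The RMSE samples are hypothetical and do not come from Tables 7 or 8; the rank-sum helper assumes there are no tied values, and neither helper computes a p-value.

```python
import numpy as np

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), all_values.size
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return float((ss_between / (k - 1)) / (ss_within / (n - k)))

def rank_sum_z(a, b):
    """Wilcoxon rank-sum z statistic via the normal approximation (assumes no ties)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    combined = np.concatenate([a, b])
    ranks = combined.argsort().argsort() + 1  # ranks 1..N, valid only without ties
    w = ranks[:a.size].sum()                  # rank sum of the first sample
    n1, n2 = a.size, b.size
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return float((w - mean_w) / sd_w)

# Hypothetical RMSE values from repeated runs of three optimizers.
rmse_runs = {
    "BER-LSTM": [1.0, 2.0, 3.0],
    "PSO-LSTM": [2.0, 3.0, 4.0],
    "GWO-LSTM": [5.0, 6.0, 7.0],
}
f_stat = one_way_anova_f(list(rmse_runs.values()))
z_stat = rank_sum_z([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In practice, `scipy.stats.f_oneway` and `scipy.stats.ranksums` return the corresponding statistics together with their p-values; the manual versions here only show what is being compared.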

Conclusions
Using a publicly available and daily updated Monkeypox dataset, this research proposed a new approach to accurately predict the confirmed cases of Monkeypox infection. The proposed approach is based on optimizing the parameters of the LSTM-based deep network using the BER optimization algorithm. The better balance between exploration and exploitation in the BER algorithm allows for better optimization of the network parameters, thus achieving better performance. Eight evaluation criteria were considered to assess the performance of the proposed approach. The recorded values of these criteria show the effectiveness of the proposed approach. In addition, to prove the superiority of the proposed approach, six different ML models and four optimization methods were included in the conducted experiments. Furthermore, the statistical significance of the proposed approach was studied using ANOVA, Wilcoxon, and regression tests. The recorded results of these tests confirmed the robustness, significance, and effectiveness of the proposed approach. The limitation of the proposed approach is that, when tested on a large dataset, balancing the exploration and exploitation processes of the optimization algorithm is time-consuming. This limitation is currently under investigation and will be addressed in future work on the BER optimization algorithm.

Figure 1. The proposed methodology.

Once the prediction model has been specified, the optimizer is configured and a loss function is adopted to optimize the parameters of the LSTM model. The BER optimization algorithm is adopted as the optimizer. Through the use of the loss function, the error values are calculated to control the fine-tuning of the model parameters. Once all of the settings have been adjusted, training data are collected in batches and fed into the model iteratively. Algorithm 1 provides the pseudocode for the proposed methodology.
$x_t$. Learning also involves adjusting the bias $b$ and the weight matrices $W$ and $U$. Keep in mind that the size of the hidden state determines the dimension $d_h$ of each gate. In this context, $C_t$ refers to the memory of the current cell, while $\hat{C}_t$ stands for the intermediate cell state. Initial conditions at time zero are specified by $C_0 = 0$ and $h_0 = 0$.
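The role of $W$, $U$, $b$, $C_t$, $\hat{C}_t$, and the zero initial conditions can be illustrated with a minimal sketch of one LSTM cell step. The gate equations below follow the standard LSTM formulation, which is assumed here because the equations themselves are not reproduced in this part of the text; the names `lstm_step`, `affine`, and `params`, as well as the toy weights, are hypothetical and chosen only for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, U, b, x, h):
    """Compute W x + U h + b, with matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(W[r], x))
        + sum(u * hi for u, hi in zip(U[r], h))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(params, x_t, h_prev, c_prev):
    """One step of a standard LSTM cell (assumed formulation).

    params maps a gate name ("f", "i", "o", "c") to its (W, U, b) triple;
    every gate has dimension d_h, the size of the hidden state.
    """
    f = [sigmoid(z) for z in affine(*params["f"], x_t, h_prev)]        # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], x_t, h_prev)]        # input gate
    o = [sigmoid(z) for z in affine(*params["o"], x_t, h_prev)]        # output gate
    c_hat = [math.tanh(z) for z in affine(*params["c"], x_t, h_prev)]  # intermediate cell state
    c_t = [fk * ck + ik * chk for fk, ck, ik, chk in zip(f, c_prev, i, c_hat)]
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


if __name__ == "__main__":
    d_h = 2  # hidden size; the input size is 1 in this toy setting
    gate = ([[0.1], [0.2]], [[0.0, 0.1], [0.1, 0.0]], [0.0, 0.0])
    params = {name: gate for name in ("f", "i", "o", "c")}
    h, c = [0.0] * d_h, [0.0] * d_h  # initial conditions h_0 = 0 and C_0 = 0
    for x in ([0.5], [0.7], [0.9]):  # toy input sequence
        h, c = lstm_step(params, x, h, c)
    print(h, c)
```

During training, the entries of every `(W, U, b)` triple are the quantities that are adjusted.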

Figure 2. The structure of a neural network based on LSTM cells.

Figure 3. The confirmed Monkeypox cases across the countries.

Figure 4. The timeline of the confirmed cases to the date of this article.

Figure 6. Plot of the objective function values versus the number of iterations.

Figure 7. The average RMSE with Tukey test for the BER-LSTM and compared algorithms.

Figure 8. Visual analysis of the results achieved by the BER-LSTM and compared algorithms.

Figure 9. Histogram of RMSE values for the BER-LSTM and compared algorithms.

Figure 10. ROC curve for the BER-LSTM and PSO-LSTM algorithms.

One more experiment was conducted to show how smoothly the parameters of the proposed algorithm converge over time compared with the other competing algorithms. The results of this experiment are depicted in Figure 11. These results confirm the superiority of the proposed algorithm in predicting the Monkeypox cases efficiently.

Figure 11. Smoothness of convergence time of BER-LSTM parameters with respect to other methods.

Initialize BER population $S_i$ ($i = 1, 2, \ldots, d$) with size $d$, iterations $Max_{iter}$, fitness function $F_n$, $t = 1$, BER parameters
Calculate fitness function $F_n$ for each $S_i$
Find best solution as $S^*$
while $t \le Max_{iter}$ do
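The initialization steps listed above (create a population, evaluate the fitness of each solution, record the best solution, and iterate while $t \le Max_{iter}$) can be sketched as a generic population-based search. The body of the loop is not reproduced in this listing, so the update rule below is a plain Gaussian perturbation around the best solution, used purely as a placeholder; it is not the BER update rule. The function name `population_search`, its arguments, and the sphere fitness function are hypothetical.

```python
import random


def population_search(fitness, dim, pop_size, max_iter, bounds=(-1.0, 1.0), seed=0):
    """Generic population-based minimizer (placeholder update, NOT the BER rule)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialize the population S_i, i = 1, ..., pop_size.
    population = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    # Calculate the fitness of each S_i.
    scores = [fitness(s) for s in population]
    # Find the best solution S*.
    best_idx = min(range(pop_size), key=scores.__getitem__)
    best, best_score = list(population[best_idx]), scores[best_idx]
    history = [best_score]
    t = 1
    while t <= max_iter:
        # Placeholder: step size shrinks as t grows (an assumption of this sketch).
        step = 0.1 * (hi - lo) * (1 - t / (max_iter + 1))
        for k in range(pop_size):
            candidate = [min(hi, max(lo, b + rng.gauss(0.0, step))) for b in best]
            score = fitness(candidate)
            if score < scores[k]:
                population[k], scores[k] = candidate, score
            if score < best_score:
                best, best_score = list(candidate), score
        history.append(best_score)  # best fitness after iteration t
        t += 1
    return best, best_score, history


if __name__ == "__main__":
    def sphere(s):
        """Toy fitness function with its minimum at the origin."""
        return sum(v * v for v in s)

    best, score, history = population_search(sphere, dim=3, pop_size=5, max_iter=20)
    print(score, len(history))
```

When such a search is used for hyper-parameter tuning, each solution would encode one candidate configuration of the network and the fitness function would return the resulting prediction error.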

Table 1. Configuration parameters of the competing algorithms.

Table 4. Prediction results using the training set.

Table 5. Prediction results using the test set.

Table 6. Statistical analysis of the achieved Monkeypox prediction results compared with the results achieved by the other approaches.

Table 7. Results of the one-way analysis of variance (ANOVA) test.

Table 8. Results of Wilcoxon's signed rank test for the BER-LSTM and compared algorithms.

Table 9. Regression test results of the BER-LSTM with respect to compared methods.