On the Classification of MR Images Using "ELM-SSA" Coated Hybrid Model

Abstract: Computer-aided diagnosis permits biopsy specimen analysis by creating quantitative images of brain diseases, which enables pathologists to examine the data properly. It has been observed from other image classification studies that the Extreme Learning Machine (ELM) demonstrates superior performance in terms of computational effort. In this study, a hybridized Salp Swarm Algorithm-based ELM (ELM-SSA) is proposed to classify brain Magnetic Resonance Images as either normal or diseased. The SSA is employed to optimize the parameters of the ELM model, whereas the Discrete Wavelet Transformation and Principal Component Analysis are used for feature extraction and reduction, respectively. The performance of the proposed "ELM-SSA" is evaluated through a simulation study and compared with standard classifiers such as the Back-Propagation Neural Network, Functional Link Artificial Neural Network, and Radial Basis Function Network. All experimental validations have been carried out using two different brain disease datasets: Alzheimer's and Hemorrhage. The simulation results demonstrate that the "ELM-SSA" is potentially superior to other hybrid methods in terms of ROC, AUC, and accuracy. To achieve better performance and to reduce randomness and overfitting, each algorithm has been run multiple times and a k-fold stratified cross-validation strategy has been used.


Introduction
To facilitate doctors in the diagnosis of brain disease, proper analysis and classification of various types of brain images are required. The conventional imaging modalities used for this purpose are Computed Tomography, Positron Emission Tomography, Ultrasonography, X-ray, and Magnetic Resonance Imaging (MRI). Of these techniques, MRI serves as the best source of information for brain study, as it helps to recognize tissues with a higher spatial resolution. However, its analysis is complex and time-consuming. Consequently, a computerized framework needs to be developed for automatic diagnosis using an appropriate Computer-Aided Diagnosis (CAD) arrangement.
MRI is a useful tool for imaging the human cerebrum that uses radio waves and magnetic fields. For biomedical research and clinical examination, it provides adequate data on soft brain tissues [1]. MRI gives better contrast for the various cerebral tissues and creates fewer artifacts [2][3][4] compared to other imaging processes. CAD-based analysis of MR images has gained increasing importance among researchers [5]. For the appropriate classification of MR images, the extracted features play an important role.
The Extreme Learning Machine (ELM) is chosen for the classification of brain images because of its fast processing. In the ELM architecture, the input weights and biases are initialized randomly, and the output weights of the model are obtained using the Moore-Penrose (MP) generalized inverse. The ELM model has proven to be a useful and reliable classifier for many applications [6]. To further improve the performance of the ELM network, its conventional weights have been optimized using the bioinspired Salp Swarm Algorithm (SSA) [7]. The major challenge lies in the extraction of the required features from the MR images and the subsequent reduction of less important features, which helps to reduce the classification complexity.
To achieve better final weights for various Artificial Neural Network (ANN) structures during the training phase, different evolutionary computing techniques have been employed [8][9][10]. The single hidden layer feedforward network is a standard ANN structure trained by back-propagation (BP) learning, which uses a gradient descent strategy to minimize the cost function. However, BP-based learning is sensitive to the initial weights, and its convergence rate is slow. To improve the learning rate, momentum and step-size variation have been introduced [11][12][13][14]. The ELM is a faster model proposed by Huang et al. [15], which uses arbitrary initial weights and biases; the final weights are obtained using the MP generalized inverse.
In a recent work [3], a Support Vector Machine (SVM) classifier using Two-Dimensional Discrete Wavelet Transformation (2D-DWT)-based features of MR images is reported to provide 4% higher accuracy than the Self-Organizing Map model. In another study, El-Dahshan et al. [16] used 2D-DWT and PCA for feature extraction and reduction, and the K-Nearest Neighbor (KNN) and Back-Propagation Neural Network (BPNN) for classifying MR images; their simulations show that the average performance of KNN is superior to that of the BPNN. To improve accuracy, swarm intelligence techniques have been used to tune model parameters [17,18]. The authors of [19,20] suggested training the ELM using Differential Evolution (DE) and Particle Swarm Optimization (PSO). For the classification task in [21], a BPNN model used different filters for preprocessing the images, a Feedback Pulse Coupled Neural Network for segmentation, and both 2D-DWT and PCA for feature extraction and reduction, respectively. Dehuri et al. [22] proposed an improved particle swarm optimization technique to train a Functional Link Artificial Neural Network (FLANN) for classifying brain tumor MR images as malignant or benign. As the FLANN is a single-layered structure, it requires fewer parameters to be tuned; the authors optimized the FLANN weights and achieved good classification accuracy compared to a FLANN with gradient descent learning and an SVM with a radial basis function kernel. In another communication, Chao Ma et al. [23] applied the Artificial Bee Colony optimization technique to adjust the parameters of the ELM architecture and obtained better generalization performance in terms of classification accuracy. Eusuff et al. [24] suggested a Shuffled Frog Leaping Algorithm approach to tune the ELM parameters [25].
The authors of [26] have employed the Salp Swarm Algorithm (SSA) [27] in SVM + RBF hybridized model for classification of MR images. They have reported an accuracy of 0.9833, sensitivity of 1, and specificity of 0.9818 which are better than those obtained from SVM classifier.
Different variants of the 2D-DWT have been used for feature extraction in classification tasks; however, to obtain higher accuracy, the authors have used smaller datasets. The motivation and objectives of this work are as follows: (i) to develop an automatic biomedical image classification model offering satisfactory and reliable performance on large MR image datasets; (ii) to further improve performance through hybridized models whose parameters are tuned using bioinspired optimization techniques; (iii) the scarcity of salp-inspired algorithms in the literature is a further motivation for this paper.
In this study, the classification task is carried out using the ELM [28][29][30], FLANN, RBFN, and BPNN machine learning models. The SSA, PSO, and DE optimization methods have been utilized for finding the best possible parameters of the models. The main contributions of this paper are listed below.
(i) It is identified which activation function of the ELM network yields faster convergence during training. (ii) It is determined which bioinspired technique optimizes the ELM parameters best. (iii) The performance improvement achieved with proper classification models is quantified. (iv) ELM-SSA exhibits superior performance over the FLANN, RBFN, and BPNN models hybridized with the PSO, DE, and SSA schemes. (v) On average, the ELM-SSA model yields a lower execution time than the other models. (vi) In general, the proposed ELM-SSA model outperforms other hybridized classification models such as FLANN-SSA, with accuracy improvements of 5.31% and 1.02% for the Alzheimer's and Hemorrhage datasets, respectively. (vii) The ELM-SSA model produces 8.79% and 2.06% better accuracy than RBFN-SSA on the two datasets. (viii) ELM-SSA also shows 7.6% and 1.02% higher accuracy on the two datasets, respectively, compared to BPNN-SSA.
The proposed ELM-SSA utilizes Contrast-Limited Adaptive Histogram Equalization (CLAHE) [31,32] for preprocessing; in addition, 2D-DWT and PCA are chosen for feature extraction and reduction, respectively. A 5 × 5-fold stratified cross-validation scheme is used to preserve the class distribution. Simulations on the different datasets show that the proposed approach achieves better accuracy than the other hybridized classifiers.
The rest of the article is organized as follows. Section 2 describes the different methodologies adopted in this work. A concise presentation of the various sub-blocks of the hybrid model is given in Section 3. Section 4 provides the detailed simulation-based experiments, and finally, Section 5 summarizes the conclusions and findings of this work.

Materials and Methods
This section provides a detailed description of the datasets and the methodologies adopted in this study. The methodology comprises five substages: (1) a level-3 2D-DWT for feature extraction, (2) feature reduction using PCA, (3) overfitting control using stratified k-fold cross-validation, (4) development of the ELM classifier, and (5) use of the SSA to adjust the parameters of the ELM.

2D-DWT for Feature Extraction
The CLAHE [32] preprocessing method, based on histogram equalization, is used to retain the sharpness of the images and to enhance the edges. The Wavelet Transform (WT) has the ability to retain time-frequency information and is therefore well suited for extracting the features.
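The contrast-limiting idea behind CLAHE can be illustrated with a minimal sketch. This is a simplified global (single-tile) version, not the tiled, interpolated CLAHE of [32]; the function name, clip limit, and bin count are our own illustrative choices:

```python
import numpy as np

def clipped_hist_equalize(img, clip_limit=0.01, n_bins=256):
    """Simplified contrast-limited histogram equalization.

    Global (single-tile) sketch of the contrast-limiting step of CLAHE:
    the normalized histogram is clipped at `clip_limit` and the clipped
    excess is redistributed uniformly before building the equalization
    mapping.  `img` is a 2-D uint8 array.
    """
    hist, _ = np.histogram(img.ravel(), bins=n_bins, range=(0, 255))
    hist = hist.astype(float) / img.size              # normalized histogram
    excess = np.maximum(hist - clip_limit, 0).sum()   # mass above the limit
    hist = np.minimum(hist, clip_limit) + excess / n_bins  # redistribute (one pass)
    cdf = np.cumsum(hist)
    cdf = cdf / cdf[-1]                               # normalize CDF to [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)        # intensity mapping table
    bin_idx = np.clip((img.astype(int) * n_bins) // 256, 0, n_bins - 1)
    return lut[bin_idx]
```

True CLAHE additionally applies this mapping per tile and bilinearly interpolates between tile mappings, which is what limits noise amplification in homogeneous brain regions.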
The WT of a continuous function f(x), with respect to a wavelet function ϕ(x), is described by Equations (1) and (2). ϕ_s,t(x) is built from the mother wavelet ϕ(x) using the dilation factor s and the translation parameter t. Restricting the parameters s and t to s = 2^j and t = 2^j k yields the discrete version of Equation (1), given in Equation (3).
where P_j,k(n) and Q_j,k(n) refer to the approximate and detail components, respectively. The parameters j and k represent the wavelet scale and translation factors, respectively, and the functions G(n) and H(n) represent the low-pass and high-pass filters, respectively. The filter bank combines these high-pass and low-pass filters with downsamplers (DS) of factor two. When the filter-bank DWT is applied to an MR image, the sub-bands obtained after two stages are shown in Figure 1: AA (low-low), AB (low-high), BA (high-low), and BB (high-high). Algorithm 1 iterates over the n MR images to build the feature matrix FM. Before the features are reduced by PCA, FM is normalized by the Z-score, which is calculated by subtracting the mean from each observed feature value and dividing by the standard deviation.
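The filter-bank decomposition and Z-score normalization can be sketched as follows. This sketch uses the Haar wavelet (pairwise mean and difference) since the wavelet family is not fixed above; note that a Haar level-3 approximation of a 256 × 256 image is 32 × 32 = 1024 features, whereas the paper's wavelet and boundary handling yield 36 × 36 = 1296:

```python
import numpy as np

def haar_dwt2_level(x):
    """One level of a 2-D Haar DWT (unnormalized mean/difference pairs).

    Returns the four sub-bands (AA, AB, BA, BB)."""
    lo_r = (x[:, 0::2] + x[:, 1::2]) / 2.0   # row low-pass
    hi_r = (x[:, 0::2] - x[:, 1::2]) / 2.0   # row high-pass
    aa = (lo_r[0::2, :] + lo_r[1::2, :]) / 2.0
    ba = (lo_r[0::2, :] - lo_r[1::2, :]) / 2.0
    ab = (hi_r[0::2, :] + hi_r[1::2, :]) / 2.0
    bb = (hi_r[0::2, :] - hi_r[1::2, :]) / 2.0
    return aa, ab, ba, bb

def dwt_features(img, levels=3):
    """Level-`levels` approximation (AA) sub-band, flattened and Z-scored."""
    a = img.astype(float)
    for _ in range(levels):
        a, _, _, _ = haar_dwt2_level(a)      # keep only the AA band
    f = a.ravel()
    return (f - f.mean()) / (f.std() + 1e-12)  # Z-score normalization
```

Only the approximation band is kept here, mirroring the common choice of using the low-frequency content of MR images as the feature vector.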

Principal Component Analysis (PCA) for Reducing Features
Independent Component Analysis (ICA) and PCA are well-known methods for transforming a higher-dimensional feature vector into a lower-dimensional one [33]. In contrast to ICA, PCA involves less computational complexity, which is why it is widely used. The algorithmic steps of PCA are explained in Algorithm 2.
Algorithm 2 Feature reduction using PCA [34]
Input: Primary feature vector. Output: Reduced feature vector.
Let X be an input data set of N points, each of p dimensions, represented by Equation (4).
Step 1: Compute the mean of X (X̄), represented by Equation (5).
Step 2: Find the deviation from the mean: Q = X − X̄.
Step 3: Calculate the covariance matrix C_Q given in Equation (6), where the correlation between two dimensions i and j is represented by Equation (7).
Step 4: Compute the eigenvectors and eigenvalues of C_Q.
Step 5: Sort the eigenvectors by eigenvalue: λ_1 ≥ λ_2 ≥ … ≥ λ_n.
Step 6: The eigenvectors with the largest eigenvalues span a new space consisting of the essential coefficients, represented by Equation (8).
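The steps of Algorithm 2 can be sketched directly in a few lines (a minimal numpy sketch; the function name and interface are our own):

```python
import numpy as np

def pca_reduce(X, k):
    """PCA following Algorithm 2: center, covariance, eigendecomposition,
    sort by eigenvalue, project onto the top-k components.

    X is (N samples, p features); returns an (N, k) array."""
    Q = X - X.mean(axis=0)             # Step 2: deviation from the mean
    C = np.cov(Q, rowvar=False)        # Step 3: covariance matrix C_Q
    vals, vecs = np.linalg.eigh(C)     # Step 4 (eigh: C_Q is symmetric)
    order = np.argsort(vals)[::-1]     # Step 5: sort lambda_1 >= lambda_2 >= ...
    W = vecs[:, order[:k]]             # Step 6: top-k eigenvectors
    return Q @ W                       # projection onto the new space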

k-Fold Cross-Validation
During the training of a classifier, overfitting may occur: the training error is small, but the error on new data is high. To avoid this, a k-fold cross-validation scheme is used. In this scheme, the whole dataset is randomly divided into k folds, where one fold is used for testing and the remaining folds for training. This process is repeated until each fold has been used for testing, and finally the mean of the errors over all folds is calculated.
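A stratified variant of this scheme, as used in this work, additionally preserves the class ratio in every fold. A minimal sketch (function name and round-robin assignment are our own choices):

```python
import numpy as np

def stratified_kfold_indices(y, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for stratified k-fold CV.

    Each class's indices are shuffled and dealt round-robin into k folds,
    so every fold preserves the overall class ratio."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(y):
        idx = rng.permutation(np.where(y == cls)[0])
        for i, j in enumerate(idx):
            folds[i % k].append(int(j))
    for i in range(k):
        test = np.array(folds[i])
        train = np.array([j for f in range(k) if f != i for j in folds[f]])
        yield train, test
```

With the balanced datasets used here (e.g., 50 normal and 50 abnormal Alzheimer's images), each of the 5 folds then contains an equal number of images from both classes.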

Classification Using ELM
The ELM-SSA classifier is employed on the brain MR images for binary classification. The ELM is one of the best learning frameworks, proposed by Huang et al. in 2006 [15]. It offers good generalization performance along with a faster learning rate compared to traditional learning techniques such as the SVM, back-propagation ANN, and Least Square-SVM. It contains only one hidden layer, whose weights and bias values are adjusted. Instead of gradient-descent-based back-propagation learning, it uses the MP generalized inverse technique to find the output weights. Figure 2 represents the schematic diagram of the ELM model. The actual output of this schematic structure is expressed by Equation (9).
where x and y are the input and output vectors, respectively; b_i is the bias of the ith hidden neuron; the weight vector between the input and hidden layer is represented by w_i = [w_i1, w_i2]; and m and L denote the number of training samples and hidden nodes, respectively. O_j represents the output vector for the jth input vector. The output weight matrix (β) joins the ith hidden node with the output neurons, and g(x) is the activation function. For m distinct training samples, the ELM can approximate them with zero error, which is represented by Equation (10).
Given a training set (x_j, y_j), where Q and N are the target values and the number of variables, respectively, the relationship between the hidden layer output matrix (H), the output weights (β), and the target training matrix (Y) is expressed by Equation (11). The target training matrix (Y) and the output weight matrix (β) are represented by Equation (12).
where Equation (13) represents the hidden layer output matrix H and β can be calculated using MP generalized inverse of H which is represented by Equation (14).
where H + denotes MP generalized inverse of H.
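The whole training procedure, random input weights and biases followed by a pseudoinverse solve for β, can be sketched as follows (a minimal numpy sketch with a sigmoid activation; function names and the one-hot target layout are our assumptions):

```python
import numpy as np

def elm_fit(X, Y, L=50, seed=0):
    """Train a basic ELM: random input weights/biases, sigmoid hidden
    layer, and output weights via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, (X.shape[1], L))   # random input weights w_i
    b = rng.uniform(-1, 1, L)                 # random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))    # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ Y              # beta = H+ Y  (Eq. (14))
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass: H beta, as in Eq. (11)."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because only β is solved for (in closed form), training reduces to one matrix pseudoinverse, which is the source of the ELM's speed advantage over iterative BP learning.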
As the input weights and biases of the ELM are randomly selected, two issues may arise: (i) the model may respond slowly during the testing phase, and (ii) with a large number of hidden neurons, it may produce poor generalization performance. Nature- and bioinspired learning strategies are good candidates for tackling these issues; thus, the SSA is chosen for optimizing the weights and bias values of the ELM model.

Salp Swarm Algorithm (SSA)
Mirjalili et al. [7] proposed the SSA in 2017, based on the swarming behavior of salps in the ocean. In this work, the position vector of a salp encodes the weights and biases. Each salp, whether leader or follower, updates its position vector.
Let X represents salp population dimension of N × d, where N denotes the number of salps with d-dimensions. It is expressed as a matrix, which is shown in Equation (15).
The leader's position X^1_j is calculated as

X^1_j = F_j + C_1((U_j − L_j)C_2 + L_j), if C_3 ≥ 0.5
X^1_j = F_j − C_1((U_j − L_j)C_2 + L_j), if C_3 < 0.5

where F_j is the position of the food source in the jth dimension, U and L are the upper and lower limits, respectively, and C_2 and C_3 are random numbers between 0 and 1. C_1 is expressed as

C_1 = 2e^{−(4m/M)^2}

where M is the maximum number of iterations and m is the current iteration; the value of C_1 decreases as the iteration count increases, which puts more emphasis on exploration in the early stages. The positions of the followers are updated as

X^i_j = (X^i_j + X^{i−1}_j)/2

where X^i_j represents the position of the ith follower in the jth dimension. Algorithm 3 presents the pseudocode of the SSA.
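The leader and follower updates above can be sketched as a small minimizer (a numpy sketch with scalar bounds; the interface and clipping to the search box are our own choices):

```python
import numpy as np

def ssa_minimize(cost, d, lb, ub, n_salps=30, max_iter=100, seed=0):
    """Salp Swarm Algorithm sketch.

    The leader moves around the food source (best solution found so far)
    using C1, C2, C3; each follower takes the mean of its own position
    and that of the preceding salp."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_salps, d))      # salp population (N x d)
    food, food_cost = None, np.inf
    for m in range(1, max_iter + 1):
        fitness = np.array([cost(x) for x in X])
        i_best = int(fitness.argmin())
        if fitness[i_best] < food_cost:        # update the food source
            food, food_cost = X[i_best].copy(), float(fitness[i_best])
        c1 = 2.0 * np.exp(-((4.0 * m / max_iter) ** 2))
        for i in range(n_salps):
            if i == 0:                         # leader update
                c2 = rng.random(d)
                c3 = rng.random(d)
                step = c1 * ((ub - lb) * c2 + lb)
                X[i] = np.where(c3 >= 0.5, food + step, food - step)
            else:                              # follower update
                X[i] = (X[i] + X[i - 1]) / 2.0
        X = np.clip(X, lb, ub)                 # keep salps inside the box
    return food, food_cost
```

The shrinking C1 yields broad exploration early and fine local exploitation around the food source later, which is the behavior exploited when tuning the ELM parameters.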

Data Set Description
In this study, two standard MR image datasets, Alzheimer's and Hemorrhage, containing 100 and 200 images, respectively, have been used. Each image in the two datasets has a resolution of 256 × 256. The images were obtained from the Kaggle website and fall into two categories: normal and abnormal brain images. Table 1 provides the details of the datasets. Each dataset contains 1296 features and two class labels. In this article, a 5 × 5-fold stratified cross-validation technique has been used; Table 2 presents the ratio of training to testing images for the Alzheimer's and Hemorrhage datasets. Five trials have been taken for each dataset. For the Alzheimer's dataset, 80 images (40 normal and 40 abnormal) are used in the training phase and the remaining 20 images (10 normal and 10 abnormal) for testing; similarly, for the Hemorrhage dataset, 160 images (80 normal and 80 abnormal) are used for training and the remaining 40 images (20 normal and 20 abnormal) for testing. Sample images of each category, a normal brain MR image, an Alzheimer's brain MRI, and a Hemorrhage brain MRI, are shown in Figure 3a-c, respectively. In this study, the performance of the proposed hybrid model, the ELM-based SSA (ELM-SSA), is compared with that of the BPNN, FLANN, and RBFN models hybridized with other optimization algorithms such as PSO and DE. The performance indices used are accuracy, the Receiver Operating Characteristic (ROC) curve, and the Area Under the ROC Curve (AUC).

Proposed Methodology
The overall scheme of the proposed hybrid technique (ELM-SSA) for the classification of brain MRI is shown in Figure 4. The methodology of the SSA-based ELM classifier is dealt with in this section. Each salp of the SSA denotes a candidate ELM model: a salp consists of the weights and biases of the ELM network, and the salp length is n × N + N. Figure 5 shows the structure of a salp vector of the SSA [35]. In this work, the SSA is employed to minimize the misclassification rate, which is taken as the cost function defined in Equation (19).
Here, Accuracy is computed as in Equation (20), where c represents the total number of classes and n the total number of instances; f(i, j) = 1 if instance i is of class j, else 0; and c(i, j) = 1 if the predicted class of instance i is j, else 0.
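The mapping from a salp vector to ELM parameters and the misclassification-rate cost can be sketched as follows (a minimal Python sketch; the function names, sigmoid activation, and one-hot target encoding are our assumptions):

```python
import numpy as np

def decode_salp(salp, n, N):
    """Split a salp position vector of length n*N + N into the ELM input
    weights W (n features x N hidden nodes) and biases b (N,)."""
    W = salp[: n * N].reshape(n, N)
    b = salp[n * N:]
    return W, b

def elm_cost(salp, X, y, n, N):
    """Cost of Eq. (19): the misclassification rate of an ELM whose input
    weights/biases come from the salp; beta is solved by pseudoinverse."""
    W, b = decode_salp(salp, n, N)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # hidden-layer outputs
    Y = np.eye(2)[y]                         # one-hot binary targets
    beta = np.linalg.pinv(H) @ Y             # closed-form output weights
    pred = (H @ beta).argmax(axis=1)
    return 1.0 - float((pred == y).mean())   # 1 - Accuracy
```

The SSA then searches over the flat salp vectors, evaluating this cost for each candidate, so only the input-side parameters are evolved while β always remains the least-squares optimum.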

Experimental Results and Discussion
This section discusses the experimental details of the proposed model, including the system configuration, datasets used, parameter settings, validation strategies, and result analysis.

System Configuration
The simulation-based experiments of the proposed classification model are carried out in this section. The configuration of the computing system used is presented in Table 3: an Intel(R) Core(TM) i3-6006U CPU at 2 GHz, 8 GB of main memory, the Windows 7 operating system, and the statistical toolbox of the MATLAB R2015a platform.

Performance Evaluation
The various performance measures used are given below.
• Accuracy: It measures how many brain images are classified correctly out of the total images tested:
Accuracy = (tpe + tne) / (tpe + tne + fpe + fne) (21)
where tpe = True Positives, tne = True Negatives, fpe = False Positives, and fne = False Negatives.
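The counts in Equation (21) can be obtained directly from the label vectors (a small sketch; function names are our own):

```python
import numpy as np

def confusion_counts(y_true, y_pred, positive=1):
    """Count tpe, tne, fpe, fne for a binary problem."""
    tpe = int(np.sum((y_pred == positive) & (y_true == positive)))
    tne = int(np.sum((y_pred != positive) & (y_true != positive)))
    fpe = int(np.sum((y_pred == positive) & (y_true != positive)))
    fne = int(np.sum((y_pred != positive) & (y_true == positive)))
    return tpe, tne, fpe, fne

def accuracy(tpe, tne, fpe, fne):
    """Accuracy = (tpe + tne) / (tpe + tne + fpe + fne), Eq. (21)."""
    return (tpe + tne) / (tpe + tne + fpe + fne)
```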
where S_old is the execution time of the standard classifier and S_new is the execution time of the proposed hybrid classifier.

Parameters Setting
Each bioinspired algorithm and classifier structure has various parameters that need to be adjusted suitably during the training phase. Table 4 lists the parameters used for training. In this study, a Gaussian activation function and 10 hidden nodes have been used for both the RBFN and the BPNN. As the FLANN has a single-layer architecture, there is no hidden layer, and its expansion size has been set to 8, which gives the best results. The DE optimization algorithm has two common control parameters, crossover and mutation, with values of 0.2 and 0.4, respectively. For PSO, the first acceleration coefficient is set to 1.5 and the second to 2. For a uniform comparison, a population size of 50 and a maximum of 100 iterations have been used for all bioinspired algorithms. The first experiment was conducted to decide the appropriate activation function and number of hidden nodes for the ELM structure. Five activation functions (sine, sigmoid, tribas, radbas, and hardlim) with 10 to 50 hidden nodes, in increments of 5, have been considered. The simulation results reveal that the best performance on both datasets is achieved with the sigmoid activation function and 50 hidden nodes, as shown in Figure 6. To discard the effect of the arbitrary inputs to the ELM architecture, each experiment comprised 20 trials per fold with the same hidden nodes and activation function.

Features Extraction and Reduction
The initial dimension of each image is 256 × 256 = 65,536. In this work, a 3-level DWT is used for feature extraction, which produces 36 × 36 = 1296 features. To further reduce the feature count, PCA reduces the dimension to 39 and 70 for the Alzheimer's and Hemorrhage datasets, respectively. The reduced feature sets are nearly (3%, 5.4%) of the DWT features and (0.05%, 0.1%) of the original dimension for the two datasets, respectively. Figure 7 shows the variance against the number of Principal Components (PCs). The simulation results for both datasets demonstrate that 39 and 70 PCs retain more than 95% of the total variance.
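The 95%-variance criterion used to pick 39 and 70 PCs can be sketched as follows (a numpy sketch; the function name and threshold parameter are our own):

```python
import numpy as np

def n_components_for_variance(X, threshold=0.95):
    """Smallest number of principal components whose cumulative explained
    variance ratio exceeds `threshold` (95% in this work)."""
    Q = X - X.mean(axis=0)                                    # center the data
    vals = np.linalg.eigvalsh(np.cov(Q, rowvar=False))[::-1]  # eigenvalues, descending
    ratio = np.cumsum(vals) / vals.sum()                      # cumulative variance ratio
    return int(np.searchsorted(ratio, threshold) + 1)
```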

Performance Comparison
The reduced sets of 39 and 70 features have been applied to four different classifiers: ELM, FLANN, RBFN, and BPNN. The testing accuracy of the base classifiers with conventional learning is shown in Figure 8.
From the above figure, it is observed that the conventional ELM achieves 87% accuracy on the Alzheimer's dataset and 88% on the Hemorrhage dataset, which is superior to the accuracy values of the other classifiers. To further improve the accuracy, the parameters of the four models have been learned using PSO, DE, and SSA. The accuracies of the hybrid models are compared in Figures 9 and 10 for the two datasets, respectively. Comparing the classification performance of the proposed ELM-SSA (Figure 9b) with the others (Figure 9a,c,d) on the Alzheimer's dataset shows that the ELM-SSA achieves higher accuracy than the other classifiers and offers the fastest convergence. Similarly, comparing the ELM-SSA (Figure 10c) with the other classification results (Figure 10a,b,d) on the Hemorrhage dataset shows that the ELM-SSA model yields higher accuracy and faster convergence. Table 5 presents the accuracy and AUC of all models; the classification accuracy of the ELM-SSA model for both the Alzheimer's and Hemorrhage datasets is found to be 99%. Figures 11 and 12 compare the ROC plots of the hybrid classifiers for the two datasets; the ROC of ELM-SSA lies closest to the true-positive-rate axis. Figures 13 and 14 display the corresponding AUC comparison plots, from which it is found that the ELM-SSA model provides the highest AUCs of 0.9695 and 0.9659, respectively, for the two datasets, which are superior to the other hybridized models. The performance improvement of the ELM-SSA model over the other classification models is listed in Table 6.
From Table 6, it is observed that the hybrid classification models employing optimization techniques provide better accuracy than the basic classification models without optimization. The ELM-SSA model provides 13.79% and 12.5% better accuracy than the basic ELM on the Alzheimer's and Hemorrhage datasets, respectively. Similarly, the ELM-SSA model outperforms the other hybrid ELM models, with accuracy improvements of 10% and 3.12% over ELM-DE and ELM-PSO on the Alzheimer's dataset, and 8.79% and 3.12% on the Hemorrhage dataset, respectively. The ELM-SSA also exhibits superior classification compared to FLANN, RBFN, and BPNN, with enhanced accuracy of 22.22%, 30.26%, and 26.92% on the Alzheimer's dataset and 22.22%, 23.75%, and 25.31% on the Hemorrhage dataset, respectively. The results of Table 6 also show that ELM-SSA outperforms the other hybridized classification models: its accuracy is 5.31% and 1.02% higher than FLANN-SSA, 8.79% and 2.06% higher than RBFN-SSA, and 7.6% and 1.02% higher than BPNN-SSA for the Alzheimer's and Hemorrhage datasets, respectively.
From the analysis of all the results, the ELM model with SSA-based parameter tuning outperforms the basic ELM, FLANN, RBFN, and BPNN classification models. Further, ELM-SSA also exhibits superior performance over the FLANN, RBFN, and BPNN models hybridized with the PSO, DE, and SSA schemes. In Table 7, the accuracy of the proposed model is compared with that of other models in the field for Alzheimer's and Hemorrhage disease classification. The reported accuracies are 93.18%, 98.01%, 96.36%, and 96.50% for the Alzheimer's dataset [36][37][38][39] and 95.73%, 94.26%, and 95.5% for the Hemorrhage dataset [40][41][42]. In essence, the proposed model achieves an average accuracy improvement of about 4% over previously reported hybrid classification models. Table 8 compares the overall execution times of the proposed and other hybridized models; on average, the ELM-SSA model yields a lower execution time than the other models.

Analysis of Computational Time
The computational time of each phase of proposed hybrid classifier (DWT + PCA + ELM-SSA) has been calculated for both the datasets. Figure 15 represents the comparison of the average time required for feature extraction, feature reduction, and classification over both the datasets in seconds (s) for a single brain image having a size of 256 × 256.
For the Alzheimer's dataset, each brain MR image takes 0.015 s, 0.004 s, and 0.001 s for feature extraction, feature reduction, and classification, respectively, whereas the corresponding times are 0.014 s, 0.005 s, and 0.0018 s for the Hemorrhage dataset. The overall computational time for processing each MR image of the Alzheimer's and Hemorrhage datasets is 0.0202 s and 0.0205 s, respectively, and the feature extraction phase consumes more time than either the feature reduction or the classification step. The main findings are summarized below.
• The ELM-SSA model exhibits the best ROC plots for the two datasets used.
• In Table 6, it is shown that the ELM-SSA combined model produces better classification accuracy than the basic ELM as well as the other classification schemes using the FLANN, RBFN, and BPNN models.
• The feature extraction phase consumes more computational time than either the feature reduction or the classification stage.
• In general, the proposed ELM-SSA model outperforms the other hybridized classification models: its accuracy is 5.31% and 1.02% higher than FLANN-SSA, 8.79% and 2.06% higher than RBFN-SSA, and 7.6% and 1.02% higher than BPNN-SSA for the Alzheimer's and Hemorrhage datasets, respectively.
• In general, the proposed ELM-SSA model executes faster than the other models on both datasets.

Conclusions and Future Work
This paper has proposed an efficient and fast hybrid classification model, evaluated on two standard MR image datasets. Through an exhaustive simulation study, it is demonstrated that the proposed ELM-SSA model outperforms other competitive conventional and hybrid ML models. A recent and promising metaheuristic, the Salp Swarm Algorithm, has been adopted in this work: since the output weights of the ELM depend on arbitrary input weights and biases, the hidden-neuron parameters need to be optimized for better results, and the SSA is used to tune them. The ELM-SSA model has shown 13.79% and 12.5% improved accuracy compared to the basic ELM model on the two datasets. Similarly, the ELM-SSA model outperforms the other hybrid ELM models, with accuracy improvements of 10% and 3.12% over ELM-DE and ELM-PSO on the Alzheimer's dataset, and 8.79% and 3.12% on the Hemorrhage dataset, respectively. The ELM-SSA also exhibits superior classification compared to FLANN, RBFN, and BPNN, with enhanced accuracy of 22.22%, 30.26%, and 26.92% on the Alzheimer's dataset and 22.22%, 23.75%, and 25.31% on the Hemorrhage dataset, respectively. The ELM-SSA likewise outperforms the other hybridized classification models: its accuracy is 5.31% and 1.02% higher than FLANN-SSA, 8.79% and 2.06% higher than RBFN-SSA, and 7.6% and 1.02% higher than BPNN-SSA for the Alzheimer's and Hemorrhage datasets, respectively. The ELM-SSA model provides the highest AUCs of 0.9695 and 0.9659, respectively, for the two datasets, which are superior to the other hybridized models, and its classification accuracy on both datasets is found to be 99%.
It was also found that the ELM-based hybrid classification model ELM-SSA is a better classifier than the FLANN, RBFN, and BPNN classifiers optimized by SSA, PSO, and DE with respect to classification accuracy, ROC, and AUC.
In the future, the potential of the SSA can be explored for the Kernel ELM (K-ELM) and Regularized ELM models. Classification accuracy can be improved by using different deep learning techniques and ensemble models. The performance of the proposed model could be further improved by adding more samples and different augmentation techniques.