Hyperspectral Image Classification with Deep CNN Using an Enhanced Elephant Herding Optimization for Updating Hyper-Parameters

Abstract: Deep learning approaches based on convolutional neural networks (CNNs) have recently achieved success in computer vision, demonstrating significant superiority in the domain of image processing. For hyperspectral image (HSI) classification, which is often based on spectral information, CNNs are an efficient option that achieves greater performance. However, the complex computation in CNNs requires hyper-parameters to attain high-accuracy outputs, and validating these hyper-parameters demands significant computational time and effort. In this paper, a bio-inspired metaheuristic strategy based on an enhanced form of elephant herding optimization is proposed, which automatically searches for suitable values of the CNN hyper-parameters. To design an automatic system for HSI classification, the enhanced elephant herding optimization (EEHO) with the AdaBound optimizer is implemented for the tuning and updating of the CNN hyper-parameters (CNN–EEHO–AdaBound). The experiments are carried out on benchmark datasets (Indian Pines and Salinas) for evaluation. The proposed methodology outperforms state-of-the-art methods in a comparative performance analysis, with the findings proving its effectiveness. The results show the improved accuracy of HSI classification obtained by optimising and tuning the hyper-parameters.


Introduction
Deep learning techniques based on convolutional neural networks (CNNs) have recently made significant progress in computer vision, demonstrating high efficiency in image processing [1,2]. As a result, there has been considerable interest in CNN models, which has led to the use of CNNs in a variety of image processing contexts, such as remote sensing image processing [3]. Hyperspectral image classification has long been a central task in the remote sensing sector. Meanwhile, CNN-based hyperspectral classification algorithms are becoming increasingly popular [4]. Researchers face issues with large numbers of spectral bands, larger data sizes, high redundancy, and limited training samples while working with hyperspectral images [5].
Due to the versatility of their conceptual model structures and their ability to avoid local optima in global optimization problems, meta-heuristic optimization methods are recommended for image classification. Single-solution-based and population-based techniques are the two types of meta-heuristic techniques. The population-based category includes swarm intelligence (SI) algorithms [6]. Swarms, natural colonies, herds, and other natural phenomena provide the basis for SI approaches. Particle swarm optimization (PSO) [7], ant colony optimization (ACO) [8], the cuckoo search algorithm (CS) [9], the artificial bee colony (ABC) algorithm [10], and elephant herding optimization (EHO) [11] are some of the most prevalent SI algorithms. These optimization techniques are highly functional for classical optimization problems, feature extraction, and weight tuning in neural networks.
Several research studies have shown how to optimize spatial-spectral HSI classification across several phases, starting with input data and sampling configurations and finishing with classifier parameter tuning. Some of them concentrate on enhancing the precision of the input data by modifying the training sample, data size, and balanced distribution, and by clipping the outline of the auxiliary data [12]. Deep learning methodologies such as CNNs have the ability to extract low-, mid-, and high-level spatial properties. Many CNN-based models have been applied to HSI classification with limited labelled samples. In order to appropriately train a CNN in the context of few labelled samples and fine-tuned hyper-parameters, many approaches have been proposed to either increase the training set or decrease the network's parameters. CNN is appropriate for HSI classification because of its local layer interconnections and shared weights, which make it effective in capturing feature correlations. CNN-based HSI classification approaches can be split into three types based on the input data of the models: spectral-based CNN, spatial-based CNN, and spectral-spatial-based CNN. Pixel vectors are used as input for spectral CNN-based HSI classification, which employs CNN to exclusively characterize the HSI in the spectral domain. To extract the spectral properties of HSI, Hu et al. suggested a 1D CNN with five convolutional layers [13]. Furthermore, [14] provided a valuable work in which CNN was used to extract pixel-pair un-mixing features for HSI classification, resulting in a higher classification rate.
Spatial CNN-based techniques are the next category of HSI classification methodology. As HSI data contain significant spatial information in addition to spectral information, it is important to extract the spatial features of HSI to obtain a full-fledged classification. The majority of available spatial CNN-based HSI classification techniques are based on principal components. For instance, in [15], spatial patches of the initial principal components were clipped at the centre pixel, and the neighbouring pixels were used to build a 2D CNN for HSI classification. The most popular and trending CNN-based HSI classification methods are spectral-spatial CNN-based approaches, which attempt to exploit both spectral and spatial HSI information in a single structure. The input HSI is a 3D tensor, and 3D convolution is utilized to classify it [16]. In [17], He et al. developed a 3D deep CNN to concurrently extract spatial and spectral features using multiscale features. To retrieve spectral-spatial information and standardize the model, the 3D convolutional layer and batch normalization layer were illustrated in [17]. Hyungtae Lee et al. [18] developed a CNN architecture to strengthen HSI's spectral and spatial information at the same time. They used a residual structure to improve CNN performance, which was mostly driven by minimal training data. CNN-based approaches are the preferred standard for HSI classification today due to their high classification performance.
In [19], Res-3D CNN, which was developed by the authors, attempted to enhance the extraction of spatial-spectral features by adding residual interconnections to 3D CNN. Although feature extraction with a small number of training samples can cause serious information leakage, this technique also advises using a limited amount of training data.
This calls for the model to be tuned with hyper-parameters. Zhong [20] constructed an SSRN (spectral-spatial residual network) from unstructured hyperspectral data without dimensionality reduction. They partitioned the fully convolutional learning procedure into independent spatial feature learning and spectral feature extraction, and then added residual interconnections to the existing system. SSRN acquired more prominent features, and this extracted-feature training strategy will have a growing role in future hyperspectral classification studies. It has also been noted that, in some instances, classifying spatial information tends to lose small amounts of substantial information, although the classification performance relies on the proposed classifier. In the paper of Sharma [8], a spatial-spectral HSI classification is presented, using nature-inspired ant colony optimization. Improved classification accuracy was attained by combining two separate supervised classifiers: the Spectral Angle Mapper (SAM) and the Support Vector Machine (SVM). One major contributing aspect was the loss of minimal spatial information on classification due to the small training samples. The EHO technique was followed by Jayanth et al. [21] to classify high-spatial-resolution multispectral images. EHO determines the information class and the multispectral image fitness evaluation function. The experimental findings show that the proposed approach enhances overall accuracy by 10.7% for the Arsikere taluk dataset and 6.63% for the National Institute of Technology Karnataka (NITK) campus dataset, when contrasted with the SVM algorithm. The classification of hyperspectral images was strengthened by the substantially optimised hyper-parameters. An optimized algorithm that can compute fast and deliver efficient performance despite the constraints is needed.
The effectiveness of the optimization algorithm is strongly affected by the hyper-parameter values. The most suitable values for hyper-parameters in optimization algorithms are determined using a variety of techniques, such as evolutionary algorithms, trial and error (TE), and random search gradients. The adaptive-moment estimation method (Adam) [22] is frequently used for weight updates in deep learning neural networks. However, in this study, a newer adaptive optimizer called AdaBound [23] was applied to achieve faster hyper-parameter convergence. At the same time, the AdaBound optimizer can minimize the generalization gap between existing adaptive methods and SGD optimizers, while maintaining a faster dynamic learning rate early in the training phase. The proposed method is based on enhanced EHO optimization with the AdaBound optimizer for HSI classification.
Elephant herding optimization (EHO) is a method for tackling global optimization issues that is based on elephant herding behaviour [24]. Elephants from different families live under the authority of a matriarch elephant, and when the male elephants reach adulthood, they leave their family group. This behaviour of elephants is modelled by clan-updating operators and separating operators. The present position of an elephant is modified by the clan-updating operator; afterwards, the separating operator is applied. The applications of the EHO algorithm demonstrate its outstanding performance in solving optimization challenges. However, due to the sheer stochastic character of EHO and an improper balance between exploration and exploitation, it is confined to the local optimum. This is considered a key drawback of EHO. As a result, the EHO's exploration capability is constrained, and thus its convergence speed is slower [25].
To fix the slow convergence of EHO towards the origin and ensure an effective balance between the exploration and exploitation stages, this research proposes a spatial-spectral enhanced elephant herding optimization algorithm with the AdaBound optimizer on a CNN classifier for supervised HSI classification, combining spatial-spectral features. The proposed method uses spectral classifier capabilities to provide effective results with a limited training data set. To test the efficiency of our suggested strategy, we analysed two standardized hyperspectral image datasets, Indian Pines and Salinas, with their respective ground truths. When compared to other existing classification algorithms, the suggested technique outperforms them in terms of computation time and accuracy rate when deployed on hyperspectral images. The following are the key contributions of the research.

• To provide efficient accuracy for hyperspectral images, an improved and enhanced EHO method with an AdaBound optimizer for updating the hyper-parameters was developed. The fittest elephant in the clan, with the most recent position, is chosen as matriarch. Fixing the clan operator in EEHO improved the evaluation by randomly enhancing its population and removing inappropriate convergence towards the origin. In EEHO-AdaBound, the algorithm's global convergence performance is improved; it has a better convergence speed and a higher convergence accuracy rate than traditional optimization techniques, and it can also determine the best CNN hyper-parameters.

• In this study, the EEHO-AdaBound was designed to optimize the CNN's initial threshold values and weights. The results of the experiments reveal that the proposed method achieves the best accuracy on classification problems while also overcoming the drawbacks of CNNs, which are easily trapped in local minima and have low stability. In addition, when compared to other CNN approaches, CNN-EEHO-AdaBound classification is greatly enhanced.

• The proposed enhanced elephant herding optimization with the AdaBound optimizer on the CNN classifier is verified and validated on HSI datasets, and shows superiority over existing optimization algorithms.
The rest of the article is organized as follows: the basic literature of the EHO algorithm is presented in Section 2; Section 3 explains the methodology of the proposed work, as well as the enhanced EHO with the AdaBound optimizer for updating the hyper-parameters; Section 4 depicts the proposed work's experimental analysis, Section 5 the results and their discussion, and Section 6 the work's conclusion.

Related Work: EHO
Metaheuristic optimization approaches are used as solutions in a variety of situations where exhaustive variable selection techniques are either too expensive or require efficient alternatives. Swarm intelligence optimization algorithms are global, powerful optimization procedures that try to address a variety of issues that can be reduced to the optimization of a fitness function [26]. In recent research, these are frequently employed for time-series signal processing, analysis, and image classification applications [27]. The ability to obtain the finest classification models and feature sets in a short period of computation time is key to the success of swarm algorithms in image classification. In previous studies, SI methods have been used to classify land cover by utilising metaheuristic optimization techniques, such as particle swarm optimization (PSO), with CNNs [28] and SVM [29]. In an impressive empirical investigation, SVM was integrated with ant colony optimization, genetic algorithms, and artificial bee colony optimization [30]. SI methods have indeed been improving over time, and there are now a variety of upgraded methods and applications with improved search techniques. The EHO optimization algorithm is a new technique used in a hybrid model for hyperspectral image classification with the objective of fine-tuning the hyper-parameters and making appropriate feature selections. Wang et al. were the first to propose the EHO method [11]. It was combined with the SVM classifier to create a hybrid system for identifying human behaviour [31]. A further study [32], in which the researchers presented a customised form of EHO as an independent classifier to increase hyperspectral image classification accuracy, used EHO with long short-term memory (LSTM) for spatial-spectral hyperspectral image classification enhancement. Whereas the EHO technique can approximate ideal accuracy with dimensionality reduction as the primary goal, it does not guarantee it. When an SI method such as EHO combines feature reduction and feature selection in the same phase, it becomes a great optimizer. Hence, this paper proposes a spatial-spectral enhanced elephant herding optimization algorithm with an AdaBound optimizer on the CNN classifier, in order to achieve improved accuracy and relatively reduced computational time. The research proposes an enhanced EHO optimization technique with parameter tuning, spatial-spectral feature extraction, and selection stages linked, in order to avoid feature set selection dependencies in system hyper-parameter tuning.

Basics of EHO
Elephants, as communal animals, live in matriarchal societies with females and offspring. An elephant clan comprises several elephants and is led by a matriarch. Female members wish to reside with their families, whilst male elephants prefer to remain outside and will progressively gain complete independence from their family. Figure 1 depicts the elephant population devised in paper [11] after observing genuine elephant herding behaviour. In EHO, the following assumptions are factored in:
(1) The elephant population is confined to a specific number of elephants in each clan.
(2) In each generation, a predetermined number of male elephants will leave their family group and live alone in a remote location.
(3) Each clan's elephants are governed by a matriarch.


Clan-Updating Operator
As per elephant habit, each clan has a matriarch who governs the elephants. As a result, each elephant's new position is determined by the matriarch. Equation (1) shows the calculation of the new position of an elephant m in the clan Cn:

p_new,Cn,m = p_Cn,m + s × (p_best,Cn − p_Cn,m) × f (1)

The new and old positions for elephant m in clan Cn are represented by p_new,Cn,m and p_Cn,m, respectively. p_best,Cn is the matriarch of the clan, and she represents the best and fittest elephant. s ∈ [0, 1] is a scale factor determining the influence of the matriarch, and f ∈ [0, 1] is a random value. The best and fittest elephant in the clan is updated by Equation (2):

p_new,Cn,m = ω × p_center,Cn (2)

where ω ∈ [0, 1] is a factor that determines the influence of the clan centre p_center,Cn on the best-fit elephant p_new,Cn,m. The clan centre individual p_center,Cn is calculated using Equation (3):

p_center,Cn,z = (1/g_Cn) × Σ_{m=1}^{g_Cn} p_Cn,m,z (3)

where 1 ≤ z ≤ Z, g_Cn denotes the number of elephants in clan Cn, and p_Cn,m,z denotes the z-th dimension of the individual elephant p_Cn,m. Hence, p_center,Cn is the new best position of an elephant in clan Cn, and it is updated using Equation (3).
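The clan-updating operator described above can be sketched in a few lines. This is a minimal NumPy illustration; the convention that row 0 of the array holds the clan's current best elephant, and the parameter values, are assumptions of the sketch rather than part of the original formulation:

```python
import numpy as np

def clan_update(clan, best, s=0.5, omega=0.1, rng=None):
    """Move every elephant towards the matriarch (Eq. (1)); the best
    elephant itself is repositioned towards the clan centre (Eq. (2))."""
    if rng is None:
        rng = np.random.default_rng(0)
    center = clan.mean(axis=0)                  # clan centre, Eq. (3)
    f = rng.uniform(0.0, 1.0, size=clan.shape)  # random factor f in [0, 1]
    new_clan = clan + s * (best - clan) * f     # Eq. (1)
    new_clan[0] = omega * center                # Eq. (2), row 0 = current best
    return new_clan
```

Note that the matriarch's update (Eq. (2)) overrides her Eq. (1) move, matching the special-case treatment of the fittest elephant in the text.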

Separating Operator
When tackling optimization issues, the parting process by which male elephants depart their family group can be simulated as a separating operator. As indicated in Equation (4), the separating operator is applied to the elephant member with the lowest fitness in each generation:

p_worst,Cn = p_min + (p_max − p_min + 1) × rand (4)

where p_max denotes the upper bound and p_min denotes the lower bound of each individual elephant position in the family, and p_worst,Cn denotes the worst member of clan Cn. rand ∈ [0, 1] is a stochastic value drawn from a uniform distribution between 0 and 1.
The mainframe of EHO is summarised based on the descriptions of the clan-updating and separating operators. Algorithm 1 corresponds to the following:

Algorithm 1: EHO
Initialization: set the generation counter K = 1 and the maximum generation GenMax; randomly initialize the population.
While K < GenMax do
Sort all elephants according to their fitness.
For each clan Cn do
Update each elephant in clan Cn with the clan-updating operator using Equations (1)-(3).
Replace the worst elephant in clan Cn with the separating operator using Equation (4).
End for
Estimate each elephant individual in the clan for its new position, respectively.
Increment the generation count K = K + 1.
End while
Output: the optimal best elephant position.
End
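Putting the clan-updating and separating operators together, the EHO mainframe can be sketched as follows on a toy objective (the sphere function). All parameter values here are illustrative, not those used in the paper:

```python
import numpy as np

def eho(fitness, n_clans=3, clan_size=5, dim=2, p_min=-5.0, p_max=5.0,
        s=0.5, omega=0.1, gen_max=50, seed=0):
    """Minimise `fitness` with a compact EHO loop: clan updating
    (Eqs. (1)-(3)) followed by the separating operator (Eq. (4))."""
    rng = np.random.default_rng(seed)
    clans = rng.uniform(p_min, p_max, size=(n_clans, clan_size, dim))
    for _ in range(gen_max):
        for c in range(n_clans):
            clan = clans[c]
            fit = np.array([fitness(e) for e in clan])
            best = clan[fit.argmin()].copy()
            center = clan.mean(axis=0)                  # Eq. (3)
            f = rng.uniform(0.0, 1.0, size=clan.shape)
            clan = clan + s * (best - clan) * f         # Eq. (1)
            clan[fit.argmin()] = omega * center         # Eq. (2)
            # Separating operator: replace the worst elephant, Eq. (4)
            fit = np.array([fitness(e) for e in clan])
            clan[fit.argmax()] = p_min + (p_max - p_min + 1) * rng.uniform(0.0, 1.0, dim)
            clans[c] = np.clip(clan, p_min, p_max)
    flat = clans.reshape(-1, dim)
    return flat[np.argmin([fitness(e) for e in flat])]

best = eho(lambda x: float(np.sum(x ** 2)))  # toy objective: sphere function
```

The separating operator keeps injecting random individuals, which is exactly the exploration/exploitation tension the enhanced variant below aims to balance.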

AdaBound Optimizer as Hyper-Parameter Updating Method in Enhancing EHO
The EHO method is a generalised stochastic search algorithm created by Wang et al. [11] and based on research on elephant behavioural biases. The EHO algorithm is frequently used in machine learning and deep learning optimization. The spatial-spectral hyperspectral image classification in the literature [31] reveals that the classification performance is greatly enhanced when using the EHO algorithm to optimise neural networks. The EHO algorithm, on the other hand, has limitations such as:

• Unreasonable convergence of the clan-updating operator towards the origin, which limits further exploration.

• The initial allocation of elephant positions is uneven.
For the abovementioned reasons, this paper proposes an enhanced EHO with an AdaBound optimizer. Here, the AdaBound optimizer is used for hyper-parameter updating.

AdaBound Optimizer
To train the proposed CNN model with enhanced EHO, the hyper-parameters are updated using the AdaBound optimizer. The advantage of using the AdaBound optimizer is that it can use dynamic bounds on learning rates to achieve a gradual transition from adaptive optimization to stochastic gradient descent (SGD), which lowers the generalisation gap between adaptive and SGD approaches while retaining a high learning rate early in training. Here, α is used as the algorithm's starting step size, and α/√L_t is the learning rate. The AdaBound optimizer parameters are updated according to the following equations:

m_t = β1 × m_{t−1} + (1 − β1) × g_t (6)
L_t = β2 × L_{t−1} + (1 − β2) × g_t² (7)
η_t = clip(α/√L_t, η_l(t), η_u(t)) (8)

where g_t is the gradient at step t, and the momentum values β1 and β2 are typically 0.9 and 0.999. clip(α/√L_t, η_l(t), η_u(t)) denotes that the learning rate α/√L_t is clipped between these bounds to avoid gradient instability at the higher and lower ends. Instead of a constant lower and upper bound, the hyper-parameters η_l and η_u are specified as functions of t. In addition, the parameter update is given as follows:

θ_{t+1} = θ_t − η_t × m_t (10)

In Equation (10), the learning rate is a function of t; hence, the difference between the lower and upper bounds becomes much smaller over time. According to this characteristic, the method behaves like Adam at the beginning, with the bounds having minimal effect on the learning rate; later, the method behaves like SGD with constrained bounds. With this advantage, AdaBound with the updated hyper-parameters is implemented in EHO to enhance it further. Algorithm 2 presents EEHO with the AdaBound optimizer. The hyper-parameters α and β are considered in the EEHO method, and the initial values of α and β are randomly set within 0 and 1. The convergence rate of the algorithm depends mainly on the learning rate derived from L_t; on the other hand, β1 and β2 have less of an impact on classification accuracy. Thus, updating the hyper-parameters and enhancing the EHO algorithm is the key factor in improving classification performance. The AdaBound coefficients are set as L_t = 0.001, β1 = 0.9, and β2 = 0.999, with a considerable number of iterations. Further, with these updated hyper-parameters, the minimum error rate is observed; thus, these values are termed the optimal hyper-parameters. Figure 2 shows the flow chart of EEHO-AdaBound.
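A minimal sketch of one AdaBound-style update as used here. The bound schedules η_l(t) and η_u(t) follow the example schedules from the AdaBound paper, converging towards a final SGD-like step size `final_lr`; the specific constants are illustrative:

```python
import numpy as np

def adabound_step(theta, grad, m, L, t, alpha=0.001,
                  beta1=0.9, beta2=0.999, final_lr=0.1):
    """One AdaBound update: Adam-style moment estimates with the step
    size clipped between dynamic bounds eta_l(t) and eta_u(t)."""
    m = beta1 * m + (1 - beta1) * grad       # first moment
    L = beta2 * L + (1 - beta2) * grad ** 2  # second moment
    eta_l = final_lr * (1 - 1 / ((1 - beta2) * t + 1))  # lower bound
    eta_u = final_lr * (1 + 1 / ((1 - beta2) * t))      # upper bound
    eta = np.clip(alpha / np.sqrt(L + 1e-8), eta_l, eta_u)  # clipped rate
    return theta - eta * m, m, L             # parameter update
```

Early on (small t), eta_l is near 0 and eta_u is very large, so the clip rarely binds and the step behaves like Adam; as t grows, both bounds tighten towards `final_lr` and the step behaves like SGD.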

The EEHO-CNN Approach
The design of classifiers is a vital aspect of hyperspectral image classification. With the advancement of machine learning, CNN as a classifier has strong self-learning and self-adaptive capabilities, can deal with difficult nonlinear issues, and has become widely used in the domain of image classification. This section describes how a convolutional neural network based on the enhanced elephant herding algorithm with the AdaBound optimizer is used to classify the hyperspectral images. Figure 3 presents a convolutional neural network in which each node instantly and adaptively selects the distinctive features and extracts all of the key feature parameters at the same time, ensuring that the image processing accuracy is not limited by the order in which they are used. The CNN classifier in this paper has a three-layer network topology. The numbers of nodes in the input and output layers are defined by the numbers of input and output features, respectively. Equation (11) gives the number of hidden-layer nodes of the proposed method:

h = √(r̂ × ŝ) + k (11)

where r̂ is the number of input features, ŝ is the number of output features, and k ∈ [1, 100] is a constant, such that the CNN has an integer range of hidden-layer nodes from [√(r̂ × ŝ) + 1, √(r̂ × ŝ) + 100]. Moreover, the CNN's preliminary thresholds and weights are initialised to values between −1 and 1, which has an impact on the training duration; with low robustness, this affects the outcomes and convergence of the CNN. As a result, choosing the best initial weights and thresholds will considerably improve the CNN's performance. The EEHO with the AdaBound optimizer is therefore used in this paper to optimise the initial threshold values and weights of the CNN.
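Assuming the hidden-node rule of Equation (11) takes the empirical form √(r̂ × ŝ) + k with k ∈ [1, 100] (the exact form of the rule is an assumption here), the admissible range of hidden-layer node counts can be computed as:

```python
import math

def hidden_node_range(n_in, n_out, k_min=1, k_max=100):
    """Integer range of hidden-layer nodes: sqrt(n_in * n_out) + k,
    for k in [k_min, k_max]."""
    base = math.sqrt(n_in * n_out)
    return math.ceil(base + k_min), math.floor(base + k_max)

# Hypothetical sizes: 200 spectral bands in, 16 land-cover classes out.
lo, hi = hidden_node_range(200, 16)
```

The optimizer then searches within this integer interval rather than over all possible widths.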
The input feature set is used to train the CNN in order to predict the system output, and the fitness function's aim is to minimise the mean absolute error (MAE) between the CNN output layer and the corresponding expected results:

f(θ) = (1/n) × Σ_{i=1}^{n} |Ŷ_i − P_i|

where θ = (θ_1, θ_2, ..., θ_D) is the feature vector that merges the initial weights and threshold values of the CNN. The weights between the input layer and the hidden layer are (θ_1, ..., θ_{d1}), with threshold values t_1 = (θ_{d1+1}, ..., θ_{d2}); the weights between the hidden layer and the output layer are (θ_{d2+1}, ..., θ_{d3}), with threshold values t_2 = (θ_{d3+1}, ..., θ_D). D is the total number of parameters in the CNN, given by D = n × ρ + ρ + ρ × r + r, with ρ the number of hidden-layer nodes. Ŷ = (Ŷ_1, Ŷ_2, ..., Ŷ_n) is the required expected feature output, and P = (P_1, P_2, ..., P_n) is the predicted output. The work flow of the CNN based on EEHO with the AdaBound optimizer for HSI classification is shown in Figure 3; the procedural explanation of the design and analysis of the proposed method is as follows:
Step 1: Set the parameters: the total number of elephants Z, the number of clans n, and the number of elephants M in each clan Cn, such that Z = M × n; the maximum number of group iterations is GenMax. Consider the impact factor ε, the qubit mutation probabilities q_1 and q_2, and the maximum number of iterations Gen. Randomly generate each elephant's starting position in the domain.
Step 2: Using Equation (6), map the best position to the present position; compute the actual fitness value f(θ) for each elephant using Equation (7). Arrange the elephants in ascending order of the evaluated fitness values; θ_g denotes the elephant's new position with the global best fitness value.
Step 3: Split all the elephants into clans; compute the best and worst fitness of the elephants in each clan Cn.
Step 4: Apply the clan-updating operator to update the position of each elephant in clan Cn using Equations (1)-(3).
Step 5: Evaluate the separating operator to replace the individual elephant with the worst fitness in Cn using Equations (4) and (10).
Step 6: Integrate the elephants of each clan; use Equation (10) to compute each elephant's fitness value f(θ). To obtain the elephants' new locations with the global optimal fitness value, arrange the elephants in increasing order of their fitness.
Step 7: Repeat from Step 3 until the last elephant obtains its position; otherwise, compute the global position θ = (θ_1, θ_2, ..., θ_D) and stop the algorithm.
Step 8: Once the network is trained with the best initial threshold values and weights, the trained CNN model achieves HSI classification.
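The fitness used throughout these steps is the mean absolute error between the CNN's predictions and the expected outputs. A minimal sketch, with the CNN forward pass stubbed out by a hypothetical `forward` function (a plain linear map here, purely for illustration):

```python
import numpy as np

def fitness(theta, forward, X, Y):
    """MAE fitness of one elephant: theta encodes the network's initial
    weights and thresholds; lower fitness is better."""
    P = forward(theta, X)                 # predicted outputs P_1, ..., P_n
    return float(np.mean(np.abs(Y - P)))  # mean absolute error

# Toy stand-in for the CNN forward pass: a single linear map.
forward = lambda theta, X: X @ theta
X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([5.0, 11.0])
err = fitness(np.array([1.0, 2.0]), forward, X, Y)  # exact fit
```

In the full method, `forward` would decode θ into the CNN's initial weights and thresholds and run the trained network on the HSI feature set.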

Experimental Results and Analysis
In this section, the experimental setting provided for the proposed method and the parameterized algorithm are explained. Using two HSI datasets, the usefulness of the proposed method for automatically designing CNNs for HSI classification is demonstrated.

Dataset
In this section, the proposed method is tested on two standard hyperspectral datasets [33]. Figure 4 presents a diverse vegetation area over the Indian Pines test site in north-western Indiana, USA (Indian Pines), and the Salinas Valley in California, USA (Salinas). The comprehensive data of the training samples of each class are presented in Table 1.

In this section, the proposed method is tested on two standard hyperspectral datasets [33].Figure 4 presents a diverse vegetation area over the Indian Pines test environment in north-eastern Indiana, USA (Indian Pines), and the Salinas Valley in California, USA (Salinas Valley) (Salinas).The comprehensive data of the training samples of each class are presented in Table 1.From the AVIRIS sensor, a 220-band sensor was used to capture images of the Indian Pines test environment.After removing the water absorption bands, the usable dataset includes a large number of bands ( 200

Experiments Compared with Existing Approaches
Different CNN classification methods based on spatial-spectral information were compared with the proposed method, and the CNN-EEHO-AdaBound approach was evaluated to assess its performance. To validate the suggested techniques, numerous hand-crafted CNN models with spectral-spatial information were analysed on hyperspectral datasets. The 2D-3D CNN [34] underwent extensive trials with various numbers of training samples, and it was found that the CNN model frequently degrades as the sample size decreases. The residual-based approaches, the spectral-spatial residual network (SSRN) [20] and ResNet [35], can obtain better classification accuracy. For contrast, DenseNet [36], which exploits shortcut connections between CNN layers, was utilised. e-CNN [37], an automatic CNN design method that uses AdaBound optimizers to explore spatial-spectral information, achieved good accuracy and was also compared with the proposed method. The existing approaches were compared with previously created CNN models in terms of classification accuracy and computational complexity.


Experiment Parameter Settings
This section details the experiment settings. Each dataset was divided into three components: a training set, a validation set, and a test set. The training and validation proportions for Indian Pines and Salinas are 5% and 1%, respectively, with the remaining pixels serving as the test set. Tables 1 and 2 illustrate the per-class sample distribution of the two datasets according to their ground truth. Table 3 depicts the parameter settings for the proposed CNN-EEHO-AdaBound method. In the experiments, the training settings of 2D-3D CNN, SSRN, ResNet, DenseNet, and e-CNN, such as filter size and training epochs, were the same as in the corresponding papers.
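The 5%/1% per-class split described above can be sketched as follows. This is a hypothetical helper, not the authors' code, and it assumes every label value is a valid class (background pixels would be filtered out beforehand):

```python
import numpy as np

def stratified_split(labels, train_frac=0.05, val_frac=0.01, seed=0):
    """Split labelled pixels per class into train/val/test index sets.

    labels: 1-D array of class labels, one per labelled pixel.
    Returns index arrays for the training, validation, and test sets.
    """
    rng = np.random.default_rng(seed)
    train, val, test = [], [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_tr = max(1, int(len(idx) * train_frac))   # at least one per class
        n_va = max(1, int(len(idx) * val_frac))
        train.extend(idx[:n_tr])
        val.extend(idx[n_tr:n_tr + n_va])
        test.extend(idx[n_tr + n_va:])
    return np.array(train), np.array(val), np.array(test)
```
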

Results Analysis and Discussion
To demonstrate the usefulness of the proposed method, the classification results are compared in terms of classification accuracy, number of parameters, and time complexity on benchmark hyperspectral datasets. The proposed method shows the optimal structures, and its convergence is examined to show that the proposed EEHO with the AdaBound optimizer is feasible. Finally, testing samples from the hyperspectral datasets are used to validate the effectiveness of the CNN-EEHO-AdaBound techniques.

Accuracy of HSI Classification
The performance of the models was measured using three metrics: overall accuracy (OA), average accuracy (AA), and the Kappa coefficient (Kappa). OA denotes the ratio of samples correctly identified by the model. AA denotes the average per-class accuracy over all ground objects. Kappa is an accuracy score based on the confusion matrix; it indicates the proportion of errors avoided by the classification relative to an essentially random classification.
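All three metrics follow directly from the confusion matrix; a minimal sketch, taking rows as true labels and columns as predictions:

```python
import numpy as np

def oa_aa_kappa(conf):
    """Overall accuracy, average accuracy, and Cohen's kappa from a
    confusion matrix (rows = true class, columns = predicted class)."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()
    oa = np.trace(conf) / n                                   # correct / total
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))            # mean per-class recall
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / n**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa
```
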
Tables 4 and 5 exhibit the comprehensive classification results on the HSI datasets for the proposed method and the existing methods. As shown in Tables 4 and 5, CNN with EEHO and the AdaBound optimizer significantly outperforms previous approaches such as 2D-3D CNN, the spectral-spatial residual network (SSRN), the residual network (ResNet), DenseNet, and e-CNN in terms of classification accuracy. On the Indian Pines dataset, the CNN-EEHO-AdaBound approach achieved the best OA, AA, and Kappa, with improvements of 0.11%, 0.18%, and 1.62%, respectively. The best OA, AA, and Kappa results for the Salinas dataset also came from CNN-EEHO-AdaBound, with increases of 0.98%, 0.39%, and 1.08%, respectively. There may be significant discrepancies in the accuracy of individual classes: in the first-class classification on Salinas, CNN-EEHO-AdaBound outperformed 2D-3D CNN by 5.56%.
To summarise the classification accuracy analysis, the proposed CNN-EEHO-AdaBound method outperformed state-of-the-art CNN models such as 2D-3D CNN, SSRN, ResNet, DenseNet, and e-CNN. Using the AdaBound optimizer, the proposed methodology can also identify more optimised architectures. Moreover, the tuned hyper-parameters resulted in improved classification performance and reduced computation time.

Convergence Analysis of CNN-EEHO-AdaBound Approach
To significantly speed up convergence to the optimal value, a convergence analysis of the CNN-EEHO-AdaBound technique was carried out. The HSI classification accuracy of the optimised convolutional neural network, comprising the architecture and the biased weight parameters, was used as the fitness. The number of architectural characteristics and the positions of the elephants, on the other hand, were related only to the architecture. As a result, the number of architectural parameters and the positions of all elephants were the crucial criteria in the architecture convergence study of the CNN-EEHO-AdaBound approach.
The number of hyper-parameters in an architecture was inversely proportional to the number of operations in that architecture. The number of hyper-parameters fluctuated as the operations in the models changed, indicating that the designs had converged once the number of hyper-parameters remained constant over the iterations. Figure 6 shows the accuracy and the number of hyper-parameters during the iterations of the CNN-EEHO-AdaBound technique on the HSI datasets. According to the number of hyper-parameters, the architectures converged at seven, nine, and eleven iterations on the Salinas and Indian Pines datasets, respectively. After the convergence of the designs, the accuracy on the testing dataset further improved. The fundamental reason is that the hyper-parameters of the architectures retained from EEHO-AdaBound were optimised while the CNN was trained until the maximum number of iterations was reached.
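The convergence criterion described above, an unchanged hyper-parameter count across iterations, can be sketched as follows; the patience window is an assumption, not a value from the paper:

```python
def converged(param_counts, patience=3):
    """Treat an architecture search as converged once the recorded number of
    hyper-parameters has stayed identical for the last `patience` iterations.

    param_counts: list of hyper-parameter counts, one entry per iteration.
    """
    return (len(param_counts) >= patience
            and len(set(param_counts[-patience:])) == 1)
```
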

HSI Classification Maps
The complete HSI classification maps of all the models effectively represent the classification results. Figures 7 and 8 demonstrate the classification maps generated by the models on the two benchmark datasets. In comparison with the other models, the proposed CNN-EEHO-AdaBound produced less dispersion in the class with a wide area, implying that it can achieve more precise classification in this category. The CNN-EEHO-AdaBound method achieves better results in the classification of the various classes in the HSI data.



Comparisons of CNN-EEHO-AdaBound Performance with Other Optimization Algorithms
The PSO-CNN cell-based approach [7], the CSO-CNN approach [9], and the ACO approach [8] were studied in order to compare the accuracy and effectiveness of the proposed CNN-EEHO-AdaBound with other optimization algorithms. The overall classification accuracy of the four optimization techniques is depicted in Table 6. As can be seen in Figure 9, the classification accuracy of the CNN-EEHO-AdaBound method is significantly higher than that of the existing algorithms, implying that the performance of CNN-EEHO-AdaBound can be enhanced by utilising optimization techniques. The CNN-EEHO-AdaBound method is slightly more precise than the other optimization methods, and all of them can achieve greater than 99 percent accuracy. The fundamental reason is that these algorithms determine the ideal fitness value over the population, which is a global optimization strategy, decreasing the likelihood that CNN-EEHO-AdaBound falls into a local minimum. Moreover, EEHO-AdaBound is an enhancement of the EHO algorithm, a bio-inspired technique that is very simple to apply and achieves positive efficacy; a faster convergence speed is another advantage of EEHO-AdaBound. The best fitness value of each optimised generation of every method is shown in Figure 9, under the condition that the characteristic exponent is α = 1.5. As also shown in Figure 9, the EEHO-AdaBound algorithm presented in this paper performs much better than the other three algorithms in terms of convergence speed and convergence accuracy. As EEHO-AdaBound is based on the EHO algorithm, it evolves each elephant's current state towards the best position using the tuned hyper-parameters. Individuals' previous metadata are successfully utilised in the evolutionary process, and the algorithm's global convergence capability is strengthened further.


Conclusions
In this research, a high-precision EHO-based algorithm is employed to classify hyperspectral images with a CNN, using the AdaBound optimizer as a fast-converging optimizer. The enhanced version of EHO with the AdaBound optimizer provides much improved classification accuracy with the CNN. EEHO-AdaBound improves performance by updating the hyper-parameters. To classify the 16 classes in the HSI datasets, the CNN is optimised using the EEHO-AdaBound approach. The experimental results reveal that the adaptive weight has a good damping effect on the error rate and convergence of the CNN-EEHO-AdaBound approach, considerably improving accuracy on the HSI datasets. The proposed CNN-EEHO-AdaBound classifier greatly increases classification accuracy compared with existing classic CNN classifiers. Furthermore, the EEHO-AdaBound algorithm proposed in this work improves the EHO's global convergence ability; compared with other traditional optimization algorithms, EEHO-AdaBound has a faster convergence speed and higher convergence accuracy, demonstrating its versatility and ease of application to other optimization problems. With the updated hyper-parameters, the CNN-EEHO-AdaBound-based classifier achieves a maximum classification accuracy of 99.6%. The classification performance can be further enhanced in the future by modifying EHO. For the HSI classification problem, the EEHO-AdaBound algorithm used within a CNN as a technique to update hyper-parameters achieves good performance.

Figure 1 .
Figure 1. Elephant behaviour in a clan.


Algorithm 1 :
Elephant herding optimization algorithm
Start
Initialize: set the iteration counter E = 1; initialize the population P; choose the maximum generation GenMax and the elephant count.
While E < GenMax do
    Sort the population according to the fitness of individuals.
    For all clans Cn do
        For elephant m in clan Cn do
            Generate p new,Cn,m and update p Cn,m by Equation (1).
            If p Cn,m = p best,Cn then
                Generate p new,Cn,m and update p Cn,m by Equation (2).
            End if
        End for
    End for
    For all clans Cn do
        Replace the worst individual elephant of Cn by Equation (4).
    End for
    E = E + 1
End while
End
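A minimal sketch of one EHO generation, assuming Equations (1), (2), and (4) follow the standard EHO formulation (clan members move towards the clan's best individual, the matriarch moves to the clan centre scaled by β, and the worst elephant is re-initialised at random); α, β, and the bounds below are illustrative values, not the paper's settings:

```python
import numpy as np

def eho_step(clans, alpha=0.5, beta=0.1, lo=-1.0, hi=1.0, rng=None):
    """One generation of basic elephant herding optimization.

    clans: list of (n, D) arrays, each sorted best-first by fitness.
    Returns the list of updated clans.
    """
    rng = rng if rng is not None else np.random.default_rng()
    new_clans = []
    for clan in clans:
        best, center = clan[0], clan.mean(axis=0)
        # Eq. (1): move every elephant towards the clan's best individual.
        updated = clan + alpha * (best - clan) * rng.random(clan.shape)
        # Eq. (2): the best elephant moves to the scaled clan centre.
        updated[0] = beta * center
        # Eq. (4): separating operator, re-initialise the worst elephant.
        updated[-1] = lo + (hi - lo + 1.0) * rng.random(clan.shape[1])
        new_clans.append(updated)
    return new_clans
```
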

Figure 4 .
Figure 4. Indian Pines dataset with colour codes.


Figure 9 .
Figure 9. Overall accuracy for optimization algorithms.
θ = θ1, θ2, . . ., θD is the feature vector that merges the initial weights and threshold values of the CNN. Secondly, the features between the input layer and the hidden layers are given by the initial weights and threshold values, where D is the sum of all nodes in the CNN, together with the required expected feature output and the predicted output.

Table 1 .
Training and test samples of Indian Pines dataset.


Table 2 .
Training and test samples of Salinas dataset.

Table 4 .
Class-wise accuracies, overall accuracy (OA%), average accuracy (AA%), and kappa for the Indian Pines dataset.

Table 5 .
Class-wise accuracies, overall accuracy (OA%), average accuracy (AA%), and kappa for the Salinas dataset.

Table 6 .
Overall accuracy of Salinas dataset with other optimization algorithms.
