1. Introduction
Swarm intelligence algorithms (SIAs) [1,2,3,4] are usually inspired by the collective behavior of natural animal swarms, and they can be applied to complex optimization problems abstracted from real-world applications. In addition, the behaviors and characteristics of biological individuals are considered in algorithm design. Some widely used SIAs include Particle Swarm Optimization (PSO) [1], Grey Wolf Optimizer (GWO) [5], Salp Swarm Algorithm (SSA) [6], Aquila Optimizer (AO) [7], and Duck Swarm Algorithm (DSA) [3]. Beyond numerical optimization problems, engineering constrained optimization problems are also an effective means of verifying proposed algorithms. In theory, all optimization problems can be abstracted such that they can be addressed by SIAs and their variants.
The Salp Swarm Algorithm (SSA) is a metaheuristic optimization technique modeled after the swarming and foraging behavior of salps in marine environments [6,8]. Its main steps are driven by a leader and followers, giving it the advantages of a simple structure and few parameters. The basic SSA has been studied with different strategies that help it avoid becoming stuck in local optima. Wang et al. [9] proposed a modified SSA to solve the node coverage task of wireless sensor networks (WSNs) with tent chaotic population initialization, T-distribution mutation, and an adaptive position update strategy. Zhou et al. [10] combined an improved A* algorithm with SSA using a refined B-spline interpolation strategy for the path planning problem. Mahdieh et al. [11] designed an improved SSA with a robust search strategy and a novel local search method for solving feature selection (FS) tasks. Zhang et al. [12] proposed cosine opposition-based learning (COBL) to modify SSA and used it to address the FS problem. Wang et al. [13] designed a spherical evolution algorithm with spherical and hypercube search in a two-stage strategy; multi-perspective initialization, Newton interpolation inertia weight, and follower update model strategies were also used in the improved SSA. A Gaussian Mixture Model (GMM) optimized via SSA [14] was proposed for data clustering in big data processing. Wang et al. [15] used symbiosis theory and the Gaussian distribution to improve the exploitation capability of SSA, which was applied to optimize multiple parameters in fuel cell optimization systems. However, the performance of the basic SSA can still be improved by knowledge-enhanced strategies, where a novel Gaussian mutation, dynamic hyperparameter adjustment, and mirror learning strategies can be investigated for specific tasks.
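For context on the mechanics these works modify, the canonical SSA position updates from [6] can be sketched as follows. This is a minimal illustration of the leader/follower scheme only, not any of the improved variants above:

```python
import numpy as np

def ssa_step(x, food, lb, ub, l, L, rng):
    """One iteration of the canonical SSA: the leader (row 0) moves
    around the food source, and each follower takes the midpoint with
    its predecessor."""
    n, d = x.shape
    c1 = 2.0 * np.exp(-((4.0 * l / L) ** 2))  # exploration/exploitation balance
    for j in range(d):
        c2, c3 = rng.random(), rng.random()
        step = c1 * ((ub[j] - lb[j]) * c2 + lb[j])
        x[0, j] = food[j] + step if c3 >= 0.5 else food[j] - step
    for i in range(1, n):  # follower chain update
        x[i] = 0.5 * (x[i] + x[i - 1])
    return np.clip(x, lb, ub)
```

The `c1` coefficient decays exponentially with the iteration counter, so early iterations take large exploratory steps around the food source while later iterations refine it.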
For classification tasks, SIAs are usually utilized to optimize the hyperparameters of a classifier network [16,17,18]. An improved pelican optimization algorithm (IPOA) [16] was designed to optimize a combined model of variational mode decomposition (VMD) and long short-term memory (LSTM) for ultra-short-term wind speed forecasting. Li et al. [17] proposed an improved Parrot Optimizer (IPO) with an aerial search strategy to train a multilayer perceptron (MLP), which enhanced the exploration and optimization ability of the basic optimizer; its performance was evaluated on CEC benchmark functions and an oral English teaching quality classification dataset. Song et al. [18] proposed a modified pelican optimization algorithm with multiple strategies and used it for a high-dimensional feature selection task with K-nearest neighbor (KNN) and support vector machine (SVM) classifiers. Panneerselvam et al. [19] proposed a dynamic salp swarm algorithm with a weighted extreme learning machine to handle class imbalance in classification datasets with higher accuracy. Wang et al. [20] proposed a hierarchical and distributed strategy, inspired by the structure of Multi-Layer Perceptrons (MLPs), to enhance the Gravitational Search Algorithm, yielding significantly improved performance compared with existing methods. Yang et al. [21] proposed a self-learning salp swarm algorithm (SLSSA) to train an MLP classifier on UCI datasets; its performance was also verified on CEC2014 benchmark functions, albeit with a longer computational time than the basic SSA.
In the field of smart agriculture, applying SIAs can effectively enhance the efficiency of classifier-based plant disease diagnosis [22] and seed classification [23,24], significantly improving work efficiency and thus reducing economic costs. Pranshu et al. [25] applied the five most popular machine learning approaches to the rice varieties classification problem. Din et al. [26] used a deep convolutional neural network named RiceNet, with a pre-training strategy, to identify rice grain varieties, achieving better prediction accuracy than traditional machine learning (ML) methods. Iqbal et al. [27] used three lightweight networks for the rice varieties classification task, which can be extended to mobile devices. However, deep neural network methods need more computing resources than a joint ML classifier and SIA method.
To address the aforementioned challenges, this study proposes an enhanced-knowledge Salp Swarm Algorithm (EKSSA) to optimize the critical parameters of SVM in seed classification tasks. The proposed algorithm incorporates several strategic improvements to enhance its optimization performance. Specifically, adaptive adjustment mechanisms for the control parameters are introduced to effectively balance the exploration and exploitation capabilities of the salp population. Furthermore, a novel position update strategy based on Gaussian walk theory is applied after the basic position update phase to significantly enhance the global search ability of individual salps. Additionally, a dynamic mirror learning strategy is designed to prevent premature convergence to local optima by creating mirrored search regions, thereby substantially improving local search efficiency. The effectiveness of the EKSSA is comprehensively evaluated through experiments on thirty-two CEC benchmark functions and two practical seed classification datasets, demonstrating its superior performance in both optimization accuracy and classification tasks. In summary, the contributions of the designed EKSSA are as follows:
To enhance the performance of the basic SSA, an enhanced-knowledge Salp Swarm Algorithm (EKSSA) is proposed, and its effectiveness is rigorously evaluated through comparisons with other state-of-the-art optimization algorithms.
The exploration and exploitation of the followers are balanced using different adjustment strategies for the control parameters based on exponential functions.
A novel Gaussian mutation strategy and a dynamic mirror learning strategy are introduced to enhance global search capability and prevent EKSSA from becoming trapped in local optima.
Thirty-two CEC benchmark functions are applied to evaluate the performance of the designed EKSSA, and two seed classification datasets are also used with the combination of EKSSA and the SVM classifier, which is named EKSSA-SVM.
The remainder of this paper is structured as follows:
Section 2 presents the basic SSA.
Section 3 introduces the mathematical model of the proposed EKSSA.
Section 4 presents and discusses the experimental results of comparative algorithms. In
Section 5, the application of EKSSA for optimizing hyperparameters of SVM in seed classification is described. Finally,
Section 6 concludes the study and suggests potential future research directions.
3. The Proposed Enhanced Knowledge Salp Swarm Algorithm
In this study, to overcome the tendency of SSA to fall into local optima, we use the improved method to optimize the hyperparameters of the SVM classifier for seed classification tasks. We propose an adjustment strategy for the control parameter to balance the optimization process of the follower position. In addition, a Gaussian mutation strategy and a mirror learning strategy are employed to enhance the overall performance of the proposed EKSSA. Moreover, the position update strategy of the follower is also a novel approach. The improved strategies are introduced in detail below.
To enhance the performance between exploration and exploitation of the follower, the adjustment strategy of the parameter
is calculated by:
where
denotes the adjustment strategy parameter, which is used to balance the follower position.
l indicates the current iteration.
is the maximum number of iterations. The hyperparameter
curve is depicted in
Figure 1, because nonlinear strategies can effectively enhance the search ability of EKSSA during the optimization process. In the early stage of the salp search, a larger
value (
) can help followers obtain a better search space. In the later stage of the salp search, reducing the
value (
) can help followers search for the optimal value of the optimization problem.
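A hypothetical sketch of such an exponentially decaying control parameter is given below; the functional form and the bounds `p_max = 2.0` and `p_min = 0.0` are illustrative assumptions, not the paper's exact formula:

```python
import math

def balance_param(l, L, p_max=2.0, p_min=0.0):
    """Exponentially decaying control parameter: large in early
    iterations (exploration), small in late iterations (exploitation).
    l is the current iteration, L the maximum number of iterations."""
    return p_min + (p_max - p_min) * math.exp(-((4.0 * l / L) ** 2))
```

The value starts at `p_max` at iteration 0 and decays smoothly toward `p_min`, matching the large-early/small-late behavior described above.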
Then, the follower’s position from Equation (
4) can be redefined as:
where
denotes the position of the
ith follower in the
jth dimension when the
.
is a random number in (0,1) according to the Gaussian law.
indicates the
jth dimensional food source, which is the best position found during the search process.
Notably, a new Gaussian mutation strategy is proposed to avoid falling into local optima, and its expression is:
where
denotes the standard deviation of the Gaussian mutation, which is set to 0.5.
denotes the position of the
ith follower in the
jth dimension.
indicates the
jth dimensional food source. The expression of the Gaussian walk
is defined as:
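A hedged Python sketch of such a Gaussian-walk mutation, with σ = 0.5 as stated; the exact way the follower position and the food source are combined here is an illustrative assumption, not the paper's precise equation:

```python
import numpy as np

def gaussian_mutation(x_i, food, rng, sigma=0.5):
    """Perturb a follower with a Gaussian walk centered on the food
    source (sigma = 0.5), plus a random pull toward the best position."""
    walk = rng.normal(loc=food, scale=sigma)   # Gaussian walk term
    return walk + rng.random() * (food - x_i)  # bias toward the food source
```

Because the walk is centered on the food source, the mutated position explores a neighborhood of the current best solution rather than a purely random region.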
In addition, the mirror learning strategy is employed to prevent the EKSSA from converging on local optima, thereby strengthening its global search performance, which is defined as:
where
is a random number in (0,1) according to the Gaussian law.
k denotes the scaling factor of the mirror learning strategy.
and
indicate the upper and lower boundary values of the individual, respectively. The adjustment strategy of the scaling factor
k is defined as:
where
denotes a random number in (0,1) according to the Gaussian law.
l is the current iteration.
is the maximum number of iterations.
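A common realization of such a mirror (lens-imaging) learning operator is sketched below; the reflection about the midpoint of [lb, ub] scaled by k is an assumption based on standard formulations, not necessarily the paper's exact equation:

```python
import numpy as np

def mirror_learning(x, lb, ub, k):
    """Lens-imaging mirror learning: reflect x about the midpoint of the
    search range; k controls the scale of the mirrored region. With
    k = 1 this reduces to standard opposition-based learning,
    x* = lb + ub - x."""
    mid = (lb + ub) / 2.0
    return mid + mid / k - x / k
```

Evaluating both the original and mirrored candidates and keeping the better one gives the population a chance to jump out of a local optimum on the opposite side of the search range.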
3.1. Computational Complexity Analysis
The test platform can influence the optimization time consumed by the same algorithm, so the computational complexity of the designed EKSSA should be analyzed. Assuming that
N is the population size of the EKSSA,
T indicates the maximum number of iterations, and
D is the dimension. The computational complexity of the proposed EKSSA algorithm is analyzed as follows: the initialization of the salp population requires O(N × D) operations; the position update in the basic global and local search phase has a complexity of O(T × N × D); and the Gaussian mutation and mirror learning strategies contribute an additional O(T × N × D) to the update complexity. Furthermore, the complexity of the fitness sorting is O(T × N log N). Therefore, the overall computational complexity of the EKSSA can be expressed as O(N × D + 2T × N × D + T × N log N). However, the computational complexity of the basic SSA is O(N × D + T × N × D + T × N log N); hence, the added strategies do not change the asymptotic order.
3.2. Flowchart and Pseudo-Code of the EKSSA
Figure 2 illustrates the flowchart of the EKSSA algorithm, detailing its optimization process. From
Figure 2, it can be summarized in four stages. Stage 1 is the initialization of the salp positions; Stage 2 includes the parameter update and the leader and follower position updates of the proposed EKSSA; Stage 3 involves individual position updates through the Gaussian mutation and mirror learning strategies; and Stage 4 records the best salp and its corresponding fitness during the optimization process. After the
iterations of the designed method, the best solution and fitness value are generated.
To understand the main architecture of the method, the pseudo-code of the designed EKSSA is displayed in Algorithm 1. In particular, input, output, and the main code of the EKSSA are listed.
Algorithm 1: Pseudo-code of EKSSA.
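A hedged Python skeleton of the four stages above is given below; the parameter schedule, mutation form, and k schedule here are simplified assumptions rather than the authors' exact formulas:

```python
import numpy as np

def ekssa_optimize(f, lb, ub, n=30, T=100, seed=0):
    """Simplified EKSSA loop: initialize, update leader/followers,
    then apply Gaussian mutation and mirror learning with greedy
    acceptance, tracking the best solution found."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    d = len(lb)
    x = lb + rng.random((n, d)) * (ub - lb)            # Stage 1: initialization
    fit = np.apply_along_axis(f, 1, x)
    best = x[fit.argmin()].copy()
    best_fit = float(fit.min())
    for l in range(1, T + 1):
        c1 = 2.0 * np.exp(-((4.0 * l / T) ** 2))       # Stage 2: parameter update
        for j in range(d):                             # leader update
            step = c1 * ((ub[j] - lb[j]) * rng.random() + lb[j])
            x[0, j] = best[j] + step if rng.random() >= 0.5 else best[j] - step
        for i in range(1, n):                          # follower update
            x[i] = 0.5 * (x[i] + x[i - 1])
        x = np.clip(x, lb, ub)
        for i in range(n):                             # Stage 3: mutation + mirror
            cand = np.clip(x[i] + rng.normal(0.0, 0.5, d) * (best - x[i]), lb, ub)
            if f(cand) < f(x[i]):
                x[i] = cand
            k = 1.0 + l / T                            # hypothetical k schedule
            mirror = np.clip((lb + ub) / 2 + (lb + ub) / (2 * k) - x[i] / k, lb, ub)
            if f(mirror) < f(x[i]):
                x[i] = mirror
        fit = np.apply_along_axis(f, 1, x)             # Stage 4: track the best
        if fit.min() < best_fit:
            best_fit = float(fit.min())
            best = x[fit.argmin()].copy()
    return best, best_fit
```

Greedy acceptance after the mutation and mirror steps guarantees that the tracked fitness never worsens across iterations.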
5. Results of EKSSA-SVM for Seed Classification
High-quality seeds can help farmers achieve better profits and safe food. Classifying seeds with intelligent technology not only improves efficiency but also reduces the cost of manual screening. Thus, it is necessary to study ML algorithms for identifying seed varieties. The efficiency and accuracy of existing methods still leave room for improvement. Notably, there is also a gap in the seed classification task concerning the optimization of SVM hyperparameters using SIAs. In this research, two seed classification datasets were used to verify the performance of the proposed EKSSA. The SVM [
37] served as the baseline model for seed classification, and the EKSSA was employed to optimize its hyperparameters, specifically the penalty coefficient
c and the kernel parameter
g.
Figure 7 presents the diagram of EKSSA-SVM for seed classification tasks using two rice varieties datasets from open resources. There are two categories in Rice Varieties Dataset 1: 1630 Cammeo and 2180 Osmancik samples [38]. Seven features were used: Area, Diameter, Major-axis length, Minor-axis length, Eccentricity, Convex-area, and Extent. In addition, Rice Varieties Dataset 2 has five categories, Arborio, Basmati, Ipsala, Jasmine, and Karacadag [39]; 10,000 samples were used for each category in this study. Sixteen features were used: Area, Perimeter, Major-Axis, Minor-Axis, Eccentricity, Eqdiasq, Solidity, Convex-Area, Extent, Aspect-Ratio, Roundness, Compactness, Shapefactor1, Shapefactor2, Shapefactor3, and Shapefactor4. The metric accuracy (Acc/%) is defined as Acc = (TP + TN)/(TP + TN + FP + FN) × 100,
where TP indicates the true positive number of the seed samples; TN denotes the true negative number of the seed samples; and FP and FN are the false positive and false negative numbers, respectively.
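The metric can be computed directly from the four counts:

```python
def accuracy(tp, tn, fp, fn):
    """Acc (%) = (TP + TN) / (TP + TN + FP + FN) * 100."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)
```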
Notably, the features for the seed classification task are first normalized, mapping them between 0 and 1, before being input into the classifier. The optimization of the hyperparameters c and g significantly influences the predictive accuracy of the SVM model in the seed classification task. In these experiments, the dataset was partitioned into training and testing sets with a ratio of 7:3. The search intervals for c and g were set to [0.1, 5] and [0.1, 10], respectively. The population size was set to 10, and the number of iterations was set to 15. The comparison methods were selected from the top four of the nine ranked algorithms: EKSSA, HBA, SSA, and GWO. The results for Rice Varieties Dataset 1 and Dataset 2 are listed in Table 8 and Table 9, respectively. In addition, the true and predicted labels of the five methods are depicted in Figure 8.
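A hedged sketch of the tuning wrapper follows. The fitness callable would typically wrap an SVM (e.g., `sklearn.svm.SVC(C=c, gamma=g)`) evaluated on the held-out 30% split; here it is supplied by the caller, and a simplified SSA-style search over the stated intervals stands in for the full EKSSA:

```python
import numpy as np

def tune_svm_params(fitness, seed=0, pop=10, iters=15,
                    c_bounds=(0.1, 5.0), g_bounds=(0.1, 10.0)):
    """Search (c, g) within the intervals used in the experiments,
    maximizing the classification accuracy returned by fitness(c, g)."""
    rng = np.random.default_rng(seed)
    lo = np.array([c_bounds[0], g_bounds[0]])
    hi = np.array([c_bounds[1], g_bounds[1]])
    x = lo + rng.random((pop, 2)) * (hi - lo)          # initial (c, g) swarm
    score = np.array([fitness(c, g) for c, g in x])
    best = x[score.argmax()].copy()
    best_s = float(score.max())
    for l in range(1, iters + 1):
        c1 = 2.0 * np.exp(-((4.0 * l / iters) ** 2))   # shrinking step size
        for i in range(pop):
            cand = np.clip(best + c1 * rng.normal(size=2) * (hi - lo) * 0.1,
                           lo, hi)
            s = fitness(*cand)
            if s > score[i]:                           # greedy replacement
                x[i], score[i] = cand, s
        if score.max() > best_s:
            best_s = float(score.max())
            best = x[score.argmax()].copy()
    return best, best_s
```

With the dataset fixed, the returned pair `(c, g)` is the best hyperparameter combination found, analogous to the optimized values reported in Tables 8 and 9.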
For Rice Varieties Dataset 1 in Table 8, the test Acc (%) results of KNN, SVM, HBA-SVM, GWO-SVM, SSA-SVM, and EKSSA-SVM are 88.58, 90.5512, 90.6387, 90.8136, 90.7262, and 90.8136, respectively (the corresponding optimized c and g values are listed in Table 8). Although the classification accuracies of EKSSA-SVM and GWO-SVM are the same, their optimized c and g values differ.
For Rice Varieties Dataset 2 in Table 9, the accuracy of the basic SVM is 97.7867%, SSA-SVM achieves 98.0667%, GWO-SVM achieves 97.9600%, and HBA-SVM achieves 98.0667%. The proposed EKSSA-SVM achieves 98.1133% Acc for the rice varieties classification task when c is set to 3.89383 and g is set to 6.03773. This is 0.3266, 0.0466, 0.1533, and 0.0466 percentage points higher than SVM, HBA-SVM, GWO-SVM, and SSA-SVM, respectively.
From Table 8 and Table 9, the optimized values of the parameters c and g produced by the proposed EKSSA-SVM differ across datasets. In addition, the true and predicted labels for the five rice varieties of Dataset 2 are depicted in Figure 9 for SVM, HBA-SVM, GWO-SVM, SSA-SVM, and EKSSA-SVM. Except for Jasmine, there are obvious misclassified samples in the other rice variety classes of Dataset 2. Thus, the predicted results of EKSSA-SVM could be further improved, for example, by a quantum-inspired strategy.
Figure 9 of the SVM, HBA-SVM, GWO-SVM, SSA-SVM, and EKSSA-SVM. Except for Jasmine, there are obvious misclassification samples in the other types of Rice Varieties classification using Dataset 2. Thus, the predicted results of the EKSSA-SVM can be further modified by the quantum strategy.