Improved Salp Swarm Optimization Algorithm: Application in Feature Weighting for Blind Modulation Identification

1 Laboratory Sys'Com-ENIT (LR-99-ES21), Tunis El Manar University, Tunis 1002, Tunisia; sarra.benchaabane@enit.utm.tn (S.B.C.); ammar.bouallegue@enit.rnu.tn (A.B.)
2 Laboratory RISC-ENIT (LR-16-ES07), Tunis El Manar University, Tunis 1002, Tunisia; akram.belazi@enit.utm.tn
3 Centre for Digital Systems, IMT Lille Douai, Institut Mines-Télécom, University of Lille, F-59000 Lille, France; laurent.clavier@imt-lille-douai.fr
* Correspondence: sofiane.kharbech@imt-lille-douai.fr or sofiane.kharbech@enit.utm.tn

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


Introduction
Feature weighting is a crucial preprocessing step in building a machine learning (ML) model. It is a technique that estimates the optimal degree of influence of each feature individually, using a training set. When features are well weighted, high weights are assigned to important features and low weights to irrelevant or noisy ones, i.e., features that are likely to deteriorate the accuracy of the ML model [1,2]. In the literature, the performance of metaheuristic optimization algorithms in feature weighting has received little investigation.
Owing to their good performance on a wide range of real-world optimization problems, including in ML [3][4][5], metaheuristic algorithms are good candidates for dataset preprocessing techniques such as feature weighting. These algorithms fall into three basic categories: physics-based [6], evolutionary-based [7], and swarm-based [8]. The Salp Swarm Algorithm (SSA) is a recent swarm-based optimization algorithm proposed by Mirjalili et al. [9]. In this paper, we propose an improved version of the SSA, referred to as the Improved Salp Swarm Algorithm (ISSA), which we later employ for weighting features.
The main contributions of this paper are four-fold: (i) a weight factor is introduced into the position update formula of the SSA; this factor varies dynamically to balance the abilities of exploration and exploitation; (ii) a control parameter is applied throughout the whole search process of the SSA (instead of at initialization only) to improve the accuracy of the solution; (iii) to overcome the premature convergence and evolution stagnation of the SSA, an Opposition-Based Learning (OBL) technique is employed during the search process; and (iv) the improved optimization algorithm is applied to a feature weighting example, yielding a lower misclassification rate for the studied classification problem. The classification problem considered in this paper is blind digital modulation identification (DMI) for multiple-antenna systems.
Although several attempts have been made to improve SSA's performance, many still suffer from stagnation at local minima in some cases. To highlight its effectiveness, the ISSA is compared first with the original SSA as well as with two algorithms derived from it: the Space Transformation SSA (STSSA) [10], which uses space transformation search, and the Inertia Weight SSA (IWSSA) [11], which embeds an inertia weight.
The rest of the paper is organized as follows. In Section 2, we formulate the optimization problem. Section 3 presents a description of the original SSA and introduces our motivation and improvements. Section 4 gives an overview of the comparison methodology and the materials used. In Section 5, the proposed algorithm is evaluated on sixteen optimization benchmark functions. The performance of the ISSA in feature weighting is discussed in Section 6. Finally, Section 7 summarizes the main findings of this study and suggests directions for future research.

Problem Formulation
To show its scalability to real-world applications, the ISSA is applied to feature weighting in the context of blind digital modulation recognition for a multi-antenna system. The signal model and the identification system are those employed in [12]. The system model considers a frequency-flat block-fading multiple-input multiple-output (MIMO) channel with m transmitting antennas and n receiving antennas. The identification process follows one of the most popular strategies for the blind DMI issue in MIMO systems [13,14]. The identification is three-staged: (i) blind source separation is performed first, (ii) features are then extracted from each of the separated streams, and (iii) the modulation scheme of each separated stream is identified through a minimum distance (MD) classifier. The features used to estimate the modulation type on the separated streams are Higher-Order Statistics (HOS). The MD classifier identifies the modulation scheme by computing the Euclidean distance between a feature vector and all the theoretical ones, then selecting the closest. To improve the performance of the identification system, the authors in [12] introduced a feature-denoising approach. In this paper, for the same purpose, we embed a feature weighting approach, i.e., we add weights to the initially extracted HOS so that the misclassification rate is minimized. Therefore, the optimization problem can be formulated as follows:

min_α (1/|S|) Σ_{s_i ∈ S} [ ( arg min_{φ_j ∈ φ} || α ⊙ s_i − α ⊙ v_j || ) ≠ ϕ_i ],    (1)

where α is the weighting vector, the Iverson bracket [.] takes a truth value inside and returns 1 or 0 accordingly, |.| denotes the cardinality of the dataset S, and ||.|| indicates the Euclidean norm of a vector. s_i ∈ R^(1×nf) represents the vector of nf features, and φ is the set of possible modulation schemes. The expression inside the big parentheses returns the ith estimated modulation type corresponding to the feature vector s_i (i.e., a vector of estimated HOS).
v_j is the theoretical HOS vector for the modulation φ_j, and ϕ_i is the true modulation scheme of sample i of the dataset (i.e., the ith estimated HOS vector). The operator ⊙ denotes component-wise multiplication.
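The objective above can be sketched numerically. The following is a minimal illustration only, not the authors' implementation: the function name `misclassification_rate` and the toy array shapes are assumptions made for this sketch.

```python
import numpy as np

def misclassification_rate(alpha, S, labels, V):
    """Objective of the weighting problem (illustrative sketch).

    alpha  : (nf,)  candidate weight vector
    S      : (N, nf) estimated HOS feature vectors, one row per sample
    labels : (N,)   index of the true modulation for each sample
    V      : (M, nf) theoretical HOS vectors, one row per modulation
    """
    errors = 0
    for s_i, true_j in zip(S, labels):
        # MD classifier on weighted features: pick the closest theoretical vector
        d = np.linalg.norm(alpha * s_i - alpha * V, axis=1)
        if np.argmin(d) != true_j:
            errors += 1
    return errors / len(S)
```

An optimizer such as the ISSA would then search over α so as to minimize this rate on the training set.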

SSA, the Basic Algorithm
SSA simulates the swarming mechanism of salps foraging in oceans. In massive oceans, salps usually form a swarm known as a salp chain. In the SSA, the leader is the salp at the front of the chain, and the remaining salps are called followers. Equation (2) updates the leader's position at iteration k + 1:

x^1_i = { f_i + c_1((u_i − l_i) c_2 + l_i),   c_3 ≥ 0.5
        { f_i − c_1((u_i − l_i) c_2 + l_i),   c_3 < 0.5,    (2)

where x^1_i denotes the position of the first salp (leader) in the ith dimension and f_i is the position of the food source in the ith dimension. u_i and l_i indicate the upper and lower bounds of the ith dimension, respectively. c_2 and c_3 are random numbers generated in [0, 1]. The coefficient c_1 is the most critical parameter in SSA because it balances exploration and exploitation; it is defined as follows:

c_1 = 2 e^(−(4k/T_max)^2),    (3)

where k is the current iteration and T_max represents the maximum number of iterations. Equation (4) updates the positions of the followers at iteration k + 1:

x^j_i = (x^j_i + x^(j−1)_i) / 2,    (4)

where j ≥ 2, x^j_i is the jth follower salp's position in the ith dimension, and x^(j−1)_i is the (j − 1)th follower salp's position in the ith dimension.
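The update rules above can be condensed into a short sketch. This is a hedged, minimal reimplementation of the basic SSA under the common convention that the leader moves toward or away from the food source according to whether c_3 exceeds 0.5; the function name `ssa` and its defaults are illustrative, not the authors' code.

```python
import numpy as np

def ssa(obj, lb, ub, n_agents=40, t_max=200, seed=0):
    """Minimal SSA sketch following Eqs. (2)-(4).

    obj    : objective function taking a (dim,) vector, returning a float
    lb, ub : (dim,) lower and upper bounds of the search space
    """
    rng = np.random.default_rng(seed)
    dim = len(lb)
    X = rng.uniform(lb, ub, (n_agents, dim))      # the salp chain
    fit = np.apply_along_axis(obj, 1, X)
    food = X[np.argmin(fit)].copy()               # best solution so far
    food_fit = float(fit.min())
    for k in range(1, t_max + 1):
        c1 = 2 * np.exp(-(4 * k / t_max) ** 2)    # Eq. (3)
        for i in range(dim):                      # leader update, Eq. (2)
            c2, c3 = rng.random(), rng.random()
            step = c1 * ((ub[i] - lb[i]) * c2 + lb[i])
            X[0, i] = food[i] + step if c3 >= 0.5 else food[i] - step
        X[1:] = 0.5 * (X[1:] + X[:-1])            # follower update, Eq. (4)
        X = np.clip(X, lb, ub)
        fit = np.apply_along_axis(obj, 1, X)
        if fit.min() < food_fit:                  # food source keeps the best
            food_fit = float(fit.min())
            food = X[np.argmin(fit)].copy()
    return food, food_fit
```

Note that this vectorized follower update uses the previous iteration's positions for all predecessors, one of the two conventions seen in SSA implementations.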
It is worth noting that the dimension of all vectors described above equals the number of objective-function variables. Figure 1 illustrates the flowchart of the SSA, where f denotes the best fitness obtained over previous iterations and the subscript new refers to the values of the current iteration.

Motivation and Improvements
SSA stores the best solution in the food source variable and is thus very competitive in exploiting the search space. However, like many other optimization algorithms, SSA still suffers from local stagnation and low convergence speed. Figures 2a and 3a show the evaluated solutions using the SSA for optimizing f_3 and f_4 (two selected benchmark functions [9]), respectively; their corresponding convergence curves are plotted in Figures 2b and 3b. As shown in these figures, the algorithm fails to reach the best solution (i.e., the (0, 0) pair), confirming the inability of the SSA to converge to the global optimum. To overcome these drawbacks, we propose an improved SSA called ISSA. The main improvements in the ISSA are as follows. First, an inertia weight w ∈ [0, 1] is introduced into the SSA. This parameter accelerates convergence during the search, and it balances the exploitation and exploration capabilities so as to escape local solutions. Second, since the performance of the ISSA is highly influenced by c_1, this coefficient is now decreased throughout both the exploration and the exploitation of the search space, which yields a more precise approximation of the optimal solution. The new update formula is displayed in (5). Finally, because the convergence rate of the algorithm is unstable and slow in most cases, we apply the OBL technique [15,16]. This technique can bring the algorithm closer to the global optimum, creating more flexibility in exploring the search space and converging quickly towards an optimal value. Mathematically, OBL can be represented as in (6):

x̃_i = l_i + u_i − x_i,    (6)

where x̃_i is the opposite of a candidate solution x_i within the bounds [l_i, u_i], and the fitter of x_i and x̃_i is retained. Figures 2c and 3c show the evaluated solutions using the ISSA for optimizing f_3 and f_4, respectively; their corresponding convergence curves are illustrated in Figures 2d and 3d. As can be observed from these figures, the ISSA succeeds in attaining a satisfactory approximation of the best solution.
Moreover, it converges faster toward the global optimum. All the above results confirm that these improvements enhance the SSA. Figure 4 illustrates the flowchart of the ISSA.
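The OBL step of Eq. (6) reduces to reflecting a candidate solution through the midpoint of the search bounds and keeping the fitter of the pair. The sketch below illustrates this under an assumed sphere objective; all names are illustrative.

```python
import numpy as np

def opposite(x, lb, ub):
    """Opposition-Based Learning, Eq. (6): the opposite of x within [lb, ub]."""
    return lb + ub - x

rng = np.random.default_rng(1)
lb, ub = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
x = rng.uniform(lb, ub)
x_opp = opposite(x, lb, ub)

# Keep whichever of x / x_opp has the better (lower) fitness
sphere = lambda v: float(np.sum(v ** 2))
best = x if sphere(x) <= sphere(x_opp) else x_opp
```

Evaluating both a point and its opposite doubles the chance of landing near the global optimum early on, which is what gives the ISSA its extra exploration flexibility.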

Comparison Methodology
In the literature, there are many metaheuristic optimization algorithms and many attempts to improve the SSA. As mentioned in the Introduction, many of them still suffer from weaknesses, chiefly the inability to stay away from local minima. Among the remainder, our choice of benchmarking algorithms is based on the following criteria: the original version of the algorithm (SSA) and some recent improvements in the closest application context. Since there is no prior work on feature weighting as yet, the closest application context is machine learning in general and feature selection in particular. Consequently, we selected STSSA and IWSSA, two improved variants of the SSA, for comparison.
In order to have a complete comparison, i.e., a comparison that is not limited to the application alone (e.g., feature selection in the context of this paper), the ISSA should first exhibit good performance when tested on a set of mathematical functions (functions that are commonly used to evaluate metaheuristic optimization algorithms). Indeed, a good optimization algorithm should not be tailored to a specific function or application; for that reason, we first evaluated the designed optimization algorithm on a wide and diversified set of benchmark functions: unimodal functions (one local minimum, configurable number of spatial dimensions), multimodal functions (many local minima, configurable number of spatial dimensions), and fixed-dimension multimodal functions (many local minima, fixed number of spatial dimensions). This study is conducted in terms of both accuracy and convergence speed, and forms the content of Section 5.
The comparison within the application context is carried out once the algorithm shows fine performance against the other algorithms used for comparison. Therefore, in Section 6, we assess the performance of the ISSA against the other optimization algorithms in the context of feature weighting.

Materials
All computer simulations in this paper were run under Matlab version 9.9.0.1538559 (R2020b) Update 3. To allow the results to be replicated and built upon, the source code for this work has been available since 7 August 2021 at https://github.com/sofiane-kharbech/Feature-Weightingfor-DMI under the MIT license.

Benchmarking of SSA and ISSA
The performance of the proposed ISSA is tested by solving the 16 benchmark functions of dimension 30 (the dimension of an agent) reported in [17]. These functions are grouped into unimodal functions (f_1 − f_7) with one local optimum, multimodal functions (f_8 − f_13) with many local optima, and fixed-dimension multimodal functions (f_14 − f_16). For all tests, the number of search agents is set to 40. In addition to the original SSA, the proposed ISSA is compared with two other improvements of the SSA, STSSA [10] and IWSSA [11]. Table 1 reports the performance of the ISSA through the mean of the best values (Mean), the standard deviations (SD), and the standard errors of the means (SEM). The unimodal functions (f_1 − f_7) allow evaluation of the exploitation capability of the studied metaheuristic algorithms; on most of these functions, ISSA is the best optimizer and succeeds in reaching the global optimum, hence providing excellent exploitation. Unlike unimodal functions, the multimodal functions (f_8 − f_16) include many local optima; this kind of test function is therefore useful for evaluating a given algorithm's exploration capability. From the reported results, ISSA outperforms the SSA as well as the algorithms derived from it.
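To make the unimodal/multimodal distinction concrete, here are two representative functions from the commonly used suite of [17]: the sphere function and Rastrigin's function. The f_1/f_9 numbering follows that standard suite and is assumed here for illustration.

```python
import numpy as np

def f1_sphere(x):
    """Unimodal: a single optimum, value 0 at the origin."""
    return float(np.sum(x ** 2))

def f9_rastrigin(x):
    """Multimodal: many local optima; global optimum 0 at the origin."""
    return float(10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

x0 = np.zeros(30)  # dimension 30, as in the benchmark setup above
```

A good optimizer should exploit its way down f_1 quickly, while f_9 punishes any algorithm that cannot explore past the grid of local minima surrounding the origin.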

Comparison Based on Convergence
The convergence rates of the four algorithms are listed in Table 2. These rates are estimated using the mean number of function evaluations (MeanFES) and the success rate (SR); when an algorithm cannot reach an acceptable solution over the fixed number of runs, the value is marked as 'NaN'. For most benchmark functions, ISSA presents the highest SR and the lowest MeanFES required to reach an acceptable solution. The exceptions are f_6, f_8, f_12, f_13, and f_16; despite the difficulty these multimodal functions pose to convergence, ISSA nearly matches the original SSA on them. For f_7 and f_15, the IWSSA has the best convergence speed. To illustrate the convergence behavior more intuitively, Figure 5 shows the convergence curves of the tested algorithms on the benchmark functions used. The ISSA presents the fastest convergence speed and the highest convergence precision on most test functions, searching out an optimal approximation and stabilizing faster than the other algorithms.
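The two metrics above are straightforward to compute from a batch of independent runs. This sketch assumes each run is summarized as a (best fitness, evaluation count) pair and that a run "succeeds" when it reaches a fixed target accuracy; the function name and data layout are illustrative.

```python
import numpy as np

def convergence_stats(runs, target):
    """SR and MeanFES over independent runs (illustrative sketch).

    runs   : list of (best_fitness, n_evaluations) pairs, one per run
    target : fitness threshold defining an 'acceptable solution'
    SR      = fraction of runs reaching the target
    MeanFES = mean evaluations over successful runs (NaN if none succeed)
    """
    succ = [(f, n) for f, n in runs if f <= target]
    sr = len(succ) / len(runs)
    mean_fes = float(np.mean([n for _, n in succ])) if succ else float("nan")
    return sr, mean_fes

# Three hypothetical runs: two reach the target, one stagnates
sr, mean_fes = convergence_stats([(1e-9, 4000), (1e-8, 5000), (0.3, 8000)], target=1e-6)
```

The NaN convention for MeanFES when no run succeeds matches the 'NaN' entries in Table 2.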

Statistical Analysis
Statistical analysis is conducted to quantitatively compare the outcomes obtained from the different optimization algorithms. Since no distributional assumptions can be made about the results, we used non-parametric tests: the Friedman and Quade tests [18,19]. Figure 6 shows the average rankings of the tested algorithms based on the standard errors of the means (SEM); as shown in this figure, ISSA is ranked best. In summary, the computer simulations indicate that the ISSA balances the exploration and exploitation phases well and improves the overall performance of the SSA in solving the benchmark functions.
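The average rankings behind a Friedman-style comparison are simple to reproduce: rank the algorithms per problem (lower SEM is better), then average the ranks per algorithm. The SEM values below are hypothetical, and ties are ranked naively in this sketch.

```python
import numpy as np

def friedman_ranks(results):
    """Average rank per algorithm, as used in a Friedman test.

    results : (n_problems, n_algorithms) array of scores, lower is better.
    Ties are broken arbitrarily here; a full test would use average ranks.
    """
    n_prob, n_alg = results.shape
    ranks = np.empty_like(results, dtype=float)
    for p in range(n_prob):
        order = np.argsort(results[p])        # best (smallest) score first
        ranks[p, order] = np.arange(1, n_alg + 1)
    return ranks.mean(axis=0)

# Hypothetical SEMs: 3 benchmark problems x 4 algorithms (SSA, STSSA, IWSSA, ISSA)
sem = np.array([[0.9, 0.5, 0.4, 0.1],
                [0.8, 0.6, 0.3, 0.2],
                [0.7, 0.4, 0.5, 0.1]])
avg_rank = friedman_ranks(sem)                # ISSA (last column) ranks best
```

The Friedman statistic itself is then a function of these average ranks; the ranking step shown here is what Figure 6 visualizes.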

ISSA for Feature Weighting in DMI
To measure the performance of the proposed ISSA in providing the best feature weights for DMI, we consider the modulation pool φ = {B-PSK, Q-PSK, 8-PSK, 4-ASK, 8-ASK, 16-QAM}, a MIMO system configuration m × n = 2 × 6, and a signal-to-noise ratio of 5 dB. Table 3 reports the solution accuracy for all algorithms; one can see that ISSA achieves the best mean. The convergence rate is depicted in Table 4: both ISSA and IWSSA perform better in terms of MeanFES and SR. However, the plots of Figure 7 show that, for a higher number of iterations, the accuracy of ISSA is much better than that of IWSSA; the latter converges earlier but saturates at a larger value. For further comparison, the classification performance of all optimization algorithms is compared against two additional cases: (i) without feature weighting (w/o FW) and (ii) with z-score normalization, one of the most common preprocessing techniques. From Table 5, we note that the ISSA-based feature weighting approach remains the best. The confusion matrices shown in Figure 8 give in-depth results and confirm that the proposed feature weighting approach is the most efficient of the compared methods in the considered context.
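For reference, the z-score baseline used in case (ii) is the standard column-wise standardization of the feature matrix; a minimal sketch (the function name and toy data are illustrative):

```python
import numpy as np

def zscore(S):
    """Column-wise z-score normalization: zero mean, unit variance per feature."""
    return (S - S.mean(axis=0)) / S.std(axis=0)

# Toy feature matrix: 2 samples x 2 features
Z = zscore(np.array([[1.0, 2.0], [3.0, 4.0]]))
```

Unlike learned feature weights, z-score normalization rescales every feature uniformly toward unit variance regardless of its relevance to the classification task, which is why a weighting scheme tuned to the misclassification rate can outperform it.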

Conclusions
In this paper, we proposed an improved version of the SSA, dubbed ISSA, to optimize feature weighting in the context of modulation identification using an MD classifier. The ISSA relies mainly on a good balance between local and global searches through the OBL technique. Simulation results on benchmark functions showed that the proposed algorithm widely outperforms the other algorithms used for comparison in terms of solution accuracy and convergence. This validation of the ISSA over the set of benchmark functions supports its use in a wide range of optimization problems. When used for feature weighting in DMI, as a case study, the ISSA again showed better results than the other compared approaches, in terms of both solution accuracy and convergence. In fact, on average, and for moderate SNR conditions (5 dB), feature weighting using the ISSA yields the following gains in correct classification rate: about 20% over the approach without feature weighting, 3% over the most used feature normalization technique, 30% over the original version of the algorithm (the SSA), and nearly 1% over the other optimization algorithms. Given its strong showing in the DMI case study, the ISSA is worth applying to other wireless communications problems, such as the detection of other signal parameters. Moreover, this makes the ISSA a promising candidate for further preprocessing techniques in ML, such as feature selection, especially since feature weighting is a generalization of feature selection.