Hybrid Grasshopper Optimization Algorithm and Differential Evolution for Multilevel Satellite Image Segmentation

An efficient satellite image segmentation method based on a hybrid grasshopper optimization algorithm (GOA) and minimum cross entropy (MCE) is proposed in this paper. The proposal is known as GOA–jDE, and it merges GOA with self-adaptive differential evolution (jDE) to improve the search efficiency, preserving the population diversity especially in the later iterations. A series of experiments is conducted on various satellite images for evaluating the performance of the algorithm. Both low and high levels of the segmentation are taken into account, increasing the dimensionality of the problem. The proposed approach is compared with the standard color image thresholding methods, as well as the advanced satellite image thresholding techniques based on different criteria. Friedman test and Wilcoxon’s rank sum test are performed to assess the significant difference between the algorithms. The superiority of the proposed method is illustrated from different aspects, such as average fitness function value, peak signal to noise ratio (PSNR), structural similarity index (SSIM), feature similarity index (FSIM), standard deviation (STD), convergence performance, and computation time. Furthermore, natural images from the Berkeley segmentation dataset are also used to validate the strong robustness of the proposed method.


Introduction
Image segmentation is one of the most important techniques in image processing, which partitions a given image into several unique and disjoint classes according to color, texture, edge, and other parameters [1][2][3][4][5]. In the last few decades, many segmentation methods have been proposed by researchers, such as clustering, edge detection, region growing, and thresholding [6][7][8]. Among the available methods, thresholding is extensively used due its simplicity and efficiency [9]. In fact, the thresholding method can be divided into two categories: bi-level thresholding and multilevel thresholding. Bi-level thresholding is suitable for simple segmentation tasks, and splits the image into two classes, namely, foreground and background [10]. However, if the given image is informative and contains multiple objects, then classical bi-level thresholding is insufficient. The bi-level thresholding is, therefore, extended into multilevel thresholding to enhance applicability.
Several techniques have been introduced into this domain to determine the segmentation thresholds. Minimum cross entropy (MCE) is one of the most popular techniques and has attracted widespread attention in recent years [11]. The optimal thresholds were selected by minimizing the and update stages of the optimization process, which can enhance the search efficiency and avoid local optimum to some extent. In [39], a natural selection strategy and democratic decision-making mechanism were used to avoid local best trap. In addition, the dynamic feedback mechanism based on the 1/5 principle was also introduced to adjust the parameter c for better balancing global and local search [39]. The above phenomena motivate us to combine GOA with an efficient strategy to overcome the drawbacks, such as slow convergence speed, unbalanced exploration-exploitation, and local optimization.
Another interesting technique for optimization is differential evolution (DE) [16]. DE is a simple but powerful algorithm, which has been combined with other algorithms to enhance search efficiency and maintain population diversity. In 2017, Jadon et al. proposed a hybridization of artificial bee colony (ABC) and DE algorithms to develop a new meta-heuristic algorithm with better convergence speed and a better balance between exploration and exploitation abilities [40]. Dash et al. introduced a hybrid meta-heuristic technique known as hybrid firefly differential evolution (HFDE) [41]. The firefly movement was used as an additional search of DE to avoid local optimization and improve convergence performance. In 2018, Xiong et al. presented a hybrid WOA with DE, called DE/WOA, for extracting the accurate parameters of photovoltaic (PV) models [42]. DE also served as an additional search technique, and the crossover probability (CR) was the criterion for using DE. To sum up, these promising results indicate the remarkable performance of DE in hybrid meta-heuristic algorithms. In fact, some of the available hybrid models still have drawbacks, for instance, a two-layer position update mechanism is often adopted, that is, after the first strategy completes the position update, the second one will be utilized to continue to update the position that has been updated once [43]. This type of approach does have certain advantages, especially the introduction of greedy selection strategy, which can prevent population stagnation and degradation caused by additional searching. However, the computational complexity of these methods is high, which is determined by the hybrid model used. In this paper, an alternative hybrid algorithm of GOA and jDE is proposed to overcome the above shortcoming.
Satellite images are one of the important sources of information which have been widely used in the field of geographic information systems, astronomy, and earth science [44]. Generally speaking, it is difficult to locate objects and boundaries in satellite images with poor illumination and complex backgrounds. Therefore, developing a high-performance color satellite image segmentation method is of significance and can provide reliable theoretical support for engineering practices. In this paper, a novel hybrid GOA with jDE is introduced for multilevel satellite image segmentation. The MCE technique is used to select the segmentation thresholds. In order to evaluate the algorithm performance objectively, a series of experiments are conducted on various satellite images. Both low and high levels of the thresholds are considered. The proposed method is compared with the standard color image thresholding methods as well as the advanced satellite image thresholding techniques based on different criteria. Comprehensive statistical analyses, including Friedman test and Wilcoxon's rank sum test, are performed to assess the significant difference between the algorithms. The superiority of the proposed hybrid algorithm is illustrated from different aspects, such as average fitness function value, peak signal to noise ratio (PSNR), structural similarity index (SSIM), feature similarity index (FSIM), standard deviation (STD), convergence performance, and computation time. Besides, natural images from the Berkeley segmentation dataset are also used to validate the strong robustness of the proposed method.
The main contributions of this study have three aspects: 1.
Propose an efficient satellite image segmentation method.

2.
Apply the hybrid algorithm of GOA and DE to the multilevel thresholding domain. 3.
Introduce an alternative hybrid model for a meta-heuristic algorithm.
The structure of this paper is presented as follows: Section 2 briefly reviews the GOA. The definition of MCE thresholding technique is presented in Section 3. The proposed GOA-jDE-based Remote Sens. 2019, 11,1134 4 of 26 multilevel satellite image thresholding method is illustrated in Section 4. In Section 5, a series of experiments are conducted, along with a detailed relevant discussion. Finally, Section 6 summarizes the conclusions of this paper and future work directions.

Grasshopper Optimization Algorithm
The grasshopper optimization algorithm is inspired by the swarming behavior of grasshoppers in nature. It was first proposed by Saremi et al. to determine the optimal shape of architectural structures [33]. There are three main factors that influence the movement of the grasshopper individual: social interaction, gravity force, and wind advection [34,45]. The mathematical model of their swarm behavior is given as follows: where X i is the position of the ith grasshopper. S i , G i , and A i indicate the social interaction, gravity force, and wind advection on the ith grasshopper, respectively. However, Equation (1) cannot be directly utilized to solve optimization problems. The authors presented an improved version that was defined as where ub d and lb d represent the upper bound and lower bound in the dth dimension, respectively. T d indicates the value of the dth dimension in the best solution obtained so far. d ij = x j − x i shows the distance between the ith grasshopper and jth grasshopper. s is a designed function that can be calculated by s(r) = f e −r/l − e −r . f and l are two constants. In order to investigate the influence of f and l on s function, two different cases were considered. The relevant results are shown in Figure 1.

Grasshopper Optimization Algorithm
The grasshopper optimization algorithm is inspired by the swarming behavior of grasshoppers in nature. It was first proposed by Saremi et al. to determine the optimal shape of architectural structures [33]. There are three main factors that influence the movement of the grasshopper individual: social interaction, gravity force, and wind advection [34,45]. The mathematical model of their swarm behavior is given as follows: where is the position of the ith grasshopper. , , and indicate the social interaction, gravity force, and wind advection on the ith grasshopper, respectively.
However, Equation (1) cannot be directly utilized to solve optimization problems. The authors presented an improved version that was defined as where and represent the upper bound and lower bound in the dth dimension, respectively. � indicates the value of the dth dimension in the best solution obtained so far.
= � − � shows the distance between the ith grasshopper and jth grasshopper. is a designed function that can be calculated by ( ) = − / − − . and are two constants. In order to investigate the influence of and on function, two different cases were considered. The relevant results are shown in Figure  1. is a significant parameter of GOA that can balance the exploration and exploitation of optimization. It decreases with the number of iterations and can be computed by where and are the maximum and minimum values, respectively. is the current iteration and indicates the maximum number of iterations. Pseudocode of traditional GOA for global optimization is given in Algorithm.

Mutation
The mutation operation of DE algorithm can be calculated as where m t+1 i is the mutant individual in the (t + 1)th iteration. x t r1 , x t r2 , and x t r3 indicate three different individuals in the population. More specifically, r 1 , r 2 , and r 3 are not equal. The scaling factor denoted by SF is a constant here.

Crossover
After the process of mutation, the trial individual c t+1 i is chosen from the current individual x t i or the mutant individual m t+1 i with the purpose of improving the population diversity. The crossover operation of DE algorithm is mathematically defined as follows: where rand is a uniformly distributed random number in the interval [0, 1]. CR is a constant that indicates the crossover probability.

Selection
During the process of selection, a comparison between the trial individual c t+1 i and the current individual x t i is made to obtain an individual of (t + 1)th generation. For a minimization problem, the selection operation is defined as where f indicates the objective function of the given optimization problem.

Self-Adapting Differential Evolution (jDE)
As discussed above, scaling factor SF and crossover probability CR are two significant parameters in the DE algorithm, the value of which can affect the optimization capability. In the standard DE, these two control parameters are constant throughout the entire evolutionary process. However, the fixed value cannot adapt well to various problems, especially complex high-dimensional problems. Therefore, Brest et al. [49] introduced a modified version of DE that can set the control parameters adaptively. In the modified DE, known as jDE, the control parameters SF and CR can be expressed as follows: where τ 1 and τ 2 show the transition probability, and are set to 0.1 here. SF l and SF u indicate the bound of scaling factor. The initial value of SF and CR are set to 0.5 and 0.9, respectively, which can help the algorithm to obtain a remarkable performance. It can be found from Equations (8) and (9) that each search agent has its own SF and CR. During the process of iteration, the value of these two parameters may change, but the transition probability denoted by τ is small. In other words, the parameter value of each search agent is appropriate in many cases, while that value will also become random to increase population diversity (see Figure 2). The pseudocode of the jDE algorithm for an optimization problem has been given in Algorithm 2.

11.
Selection: Select the better individual that will be preserved for the next generation using Equation (7);

end for
13. end while 14. return , which represents the optimal position of optimization; 15. End

Hybrid Algorithm of GOA and jDE (GOA-jDE)
GOA is a novel meta-heuristic algorithm that has been applied to many engineering problems, however, it still has some drawbacks, such as unbalanced exploration-exploitation, sometimes slow convergence speed, and low population diversity [34,39]. DE is a simple but powerful algorithm which has been embedded into other algorithms, as an operator, to enhance local search capability. For example, Elaziz et al. introduced a moth search differential evolution (MSDE) algorithm for task scheduling in cloud computing [50]. The DE algorithm served as a local search method in MSDE, and a roulette wheel approach was used to determine which strategy would be adopted. The experimental results indicate that the proposed algorithm can efficiently solve the cloud task scheduling problem and is superior to other methods in terms of various performance measures [50]. In 2018, a hybrid algorithm based on self-adaptive gravitational search algorithm (GSA) and differential evolution was proposed, which is known as SGSADE [51]. In this paper, a DE algorithm for i = 1 : n 6.
Calculate the objective value of each search agent f i ; 7.
Update the best search agent x best ; 8.
Evaluate the control parameters SF and CR of each search agent using Equations (8)-(9); 9.
Mutation: Generate a mutant individual using Equation (5), and then check the position; 10.
Crossover: Choose the trial individual from current individual and mutant individual using Equation (6); 11.
Selection: Select the better individual that will be preserved for the next generation using Equation (7); 12.
end for 13. end while 14. return x best , which represents the optimal position of optimization; 15. End

Hybrid Algorithm of GOA and jDE (GOA-jDE)
GOA is a novel meta-heuristic algorithm that has been applied to many engineering problems, however, it still has some drawbacks, such as unbalanced exploration-exploitation, sometimes slow convergence speed, and low population diversity [34,39]. DE is a simple but powerful algorithm which has been embedded into other algorithms, as an operator, to enhance local search capability. For example, Elaziz et al. introduced a moth search differential evolution (MSDE) algorithm for task scheduling in cloud computing [50]. The DE algorithm served as a local search method in MSDE, and a roulette wheel approach was used to determine which strategy would be adopted. The experimental results indicate that the proposed algorithm can efficiently solve the cloud task scheduling problem and is superior to other methods in terms of various performance measures [50]. In 2018, a hybrid algorithm based on self-adaptive gravitational search algorithm (GSA) and differential evolution was proposed, which is known as SGSADE [51]. In this paper, a DE algorithm was used with the purpose of improving the exploitation ability and maintaining population diversity [51]. The relevant simulation results demonstrate the superiority of SGSADE over other variant algorithms of GSA. In 2017, Zhao et al. presented a hybrid optimization algorithm of chaotic differential evolution and estimation of distribution [52]. The proposed algorithm, called cDE/EDA, can obtain high precision solutions due to the combination of strong exploration ability in EDA and good exploitation ability in DE. These successful applications, using hybrid algorithm with DE, motivate us to take advantage of them to obtain a modified version of GOA. In this paper, the GOA algorithm is combined with jDE to improve search efficiency while maintaining population diversity, especially in the later iterations. It is worth noting that although the DE algorithm serves as a local search strategy in the proposed algorithm, this does not mean, however, that its exploration capability can be ignored or even discarded. After all, a good balance between exploration and exploitation is crucial for any meta-heuristic algorithm, even as part of a hybrid algorithm. Therefore, an improved version of DE, namely jDE, is used instead of the standard DE.
In the proposed GOA-jDE algorithm, a simple and efficient hybrid model is introduced to improve the optimization ability without increasing the computational complexity [53]. In the authors' opinion, the average objective value of the population represents the overall quality of the search agents in the current generation. For a problem to be minimized, the fitness function value of individual should be less than the average, which indicates that the adjacent search region of the particle is potential and promising. Therefore, a strategy to enhance the local search should be adopted. On the contrary, if the fitness function value is greater than the average, the local search strategy will not be used. Furthermore, the introduction of jDE will not make the algorithm prematurely converge because scaling the difference between individuals makes the population more randomly distributed. This characteristic can better maintain the diversity of populations, especially in the late evolution process. Moreover, the flowchart of GOA-jDE algorithm-based multilevel thresholding is given in Figure 3, and the pseudocode is presented in Algorithm 3.  In addition, experiments were performed on eight satellite images (see Figure 4) to determine the optimal parameter settings of the proposed algorithm. The average rank with different combinations of and are given in Table 1. Note that the values of the parameters and are the initial values here. Other parameters, such as , , etc., are the same as in the original
The sum of the entropies of all parts represents the fitness function value; In addition, experiments were performed on eight satellite images (see Figure 4) to determine the optimal parameter settings of the proposed algorithm. The average rank with different combinations of SF and CR are given in Table 1. Note that the values of the parameters SF and CR are the initial values here. Other parameters, such as τ 1 , τ 2 , etc., are the same as in the original literature [49], in which the authors have discussed, in detail, the impact of the selected parameters on the performance. In this experiment, a lower ranking value indicates better performance. As can be observed, SF = 0.5 and CR = 0.9 present the best results in most cases; thus, this combination of parameter values was adopted in the following experiments.
Otsu methods for fair comparison, and the resulting two techniques were named GOA-jDE-Tsallis and GOA-jDE-Otsu. For the third experiment, the SIPI Image Database (37 color satellite images and 1 grayscale satellite image) was used for further testing [59].
Hence, the performance of each approach can be comprehensively evaluated through the above experiments. In addition, Figure 4 presents the original satellite image and corresponding histogram. The images are named in order (i.e. "Image1" to "Image8"), and the relevant explanation is given in Table 2.

Computational Complexity
In this subsection, the computational complexity of the GOA-jDE algorithm is presented, and the details of the relevant analysis are also given.
It is worth mentioning that this paper only discusses the computational complexity in a single iteration because the number of search agents using the GOA or jDE method varies uncertainly during the evolution process. The computational complexity of the GOA-jDE and standard GOA algorithm can be calculated as follows: where n 1 and n 2 indicate the number of search agents that update their position using jDE and GOA in one iteration, respectively. n is the population size, which is equal to the sum of n 1 and n 2 , namely n = n 1 + n 2 . d is the dimension of given problem, Co f is the cost of function evaluation. Coc and Cos are the cost of crossover and selection operations in DE, respectively. It can be found, from the two above formulations, that the cost of "objective function evaluation" is equal. However, the cost of "position update" is significantly different. Therefore, only the latter needed to be analyzed to assess the differences between them. In GOA-jDE, the population is divided into two parts according to the average fitness value, while the computational complexity of these two parts is different. Mutation, crossover, and selection operations are the main costs of jDE, and the computational complexity is approximately equal to O(n 1 * (d + Coc + Cos)). For the position update using GOA, the computational complexity is approximately equal to O(n 2 * n * d), since each grasshopper needs to compute its social interaction, while the calculation of the distance between two individuals is essential, as shown in Equation (2). Similarly, in the GOA algorithm, the cost of the position update is equal to O n 2 * d , which can be rewritten as O n 1 2 * d + n 2 2 * d + 2n 1 n 2 d . It can be seen from the final versions of Equation (10) and Equation (11) that they all contain three terms representing the cost of position update. Obviously, the first two terms of GOA-jDE cost less than the standard GOA. Considering the third term, it is difficult to determine which algorithm has higher computational complexity due to the unknown cost of crossover and selection operations represented by Coc and Cos, respectively. In fact, the computational complexity of GOA-jDE is much lower than the standard GOA, which will be verified in subsequent experiments.

Experimental Setup
In this paper, a series of experiments are conducted to validate the performance of GOA-jDE algorithm, which can be mainly divided into three parts. For the first experiment, eight satellite color images are selected from "Landsat imagery courtesy of NASA Goddard Space Flight Center and U.S. Geological Survey" [54]. It is worth noting that all images are resized, and the height is fixed to 481 pixels. Both state-of-the-art and conventional multilevel color image segmentation techniques are employed, such as the standard GOA and DE, modified GOA (MGOA) [55], hybrid jDE (hjDE) [21], βDE (BDE) [56], bat algorithm (BA) [17], and particle swarm optimization (PSO) [15]. For the second experiment, the GOA-jDE-based method using MCE (GOA-jDE-MCE) with other threshold-based satellite image segmentation methods, such as modified artificial bee colony using Tsallis entropy (MABC-Tsallis) [57], improved the differential search algorithm using Otsu between-class variance (IDSA-Otsu) [58], and cuckoo search using MCE (CS-MCE) [13]. Considering that these methods use different thresholding techniques, the GOA-jDE algorithm was combined with Tsallis entropy and Otsu methods for fair comparison, and the resulting two techniques were named GOA-jDE-Tsallis and GOA-jDE-Otsu. For the third experiment, the SIPI Image Database (37 color satellite images and 1 grayscale satellite image) was used for further testing [59].
Hence, the performance of each approach can be comprehensively evaluated through the above experiments. In addition, Figure 4 presents the original satellite image and corresponding histogram. The images are named in order (i.e., "Image1" to "Image8"), and the relevant explanation is given in Table 2. Glacier cover in the mountainous region of northwestern Venezuela.

3.
Dukan Lake in the Zagros Mountains, the largest lake in Iraqi Kurdistan.
The waters of Foxe Basin, which have been choked with sea ice for most of the year.
6. The Port of Busan at the southeastern tip of the Korean Peninsula, which has been a trading hub since at least the 15th century. 7.
A fire in Northern California during the summer of 2018. 8.
The Ebro Delta, located more than 200 kilometers (120 miles) southwest of Barcelona.
The parameter values were set according to the original literature, as shown in Table 3, except for the population size N, set to 30, and the number of iterations t max , set to 500 for fair comparison. All experiments were performed 30 times to eliminate errors, and the best result in each case is highlighted in boldface. The experimental environment is given as follows: MATLAB 2017 and Microsoft Windows 10 operating system.

Performance Measures
In this section, several quantitative indicators are introduced to evaluate the performance of algorithms, such as average fitness values, standard deviation (STD), peak signal to noise ratio (PSNR), structural similarity index (SSIM), and feature similarity index (FSIM), as well as Wilcoxon's rank sum test and Friedman test. The description of these metrics can be found in Table 4.
Furthermore, another performance metric called segmentation success rate (SSR) is introduced in this paper, which takes PSNR, SSIM, and FSIM indexes into consideration for comprehensively measuring the similarity between the segmented image and the original image. SSR determines how many times the technique successfully reached a specified value that is called value-to-reach (VTR): where NVTR is the number of times that the technique reached the VTR.
Reflects the degree of dispersion in a dataset. A lower value shows better performance. [46] 3. Peak Signal to Noise Ratio (PSNR) PSNR = 10 log 10 The ratio of the maximum possible power of the signal to the destructive noise power. [60] 4. Mean Squared Error (MSE) Computes the difference between the predicted value. [60] Defines the similarity between the original image and the segmented image. [61] 6. Feature Similarity Index (FSIM) Reflects the similarity of feature structure, the maximum value is 1. [62]

Average Computation Time
Indicates the operating efficiency of each method. [14] 8.

Wilcoxon's Rank Sum Test
Whether there is a significant difference between two algorithms. [63] 9.
Friedman Test Detects significant differences between the behaviors of two or more algorithms. [64]

Results and Discussions
The experimental results can be found in Tables 5-9 and Figures 5-11. Figures 5 and 6 give the segmented results of "Image4" and "Image7" at different threshold levels, respectively. It can be found, from the figures, that the segmented images using high threshold levels contain more detail and information than those using low threshold levels. To be more specific, the local zoom map of "Image4" illustrates that the segmentation operation of higher dimensions makes the texture of the rock clearer. On the contrary, in the case of lower threshold, the given image is only divided into a few classes, which cannot reflect much of the image information, resulting in a poor segmentation effect. Considering the segmentation of "Image7", it can be found that the lower threshold mistakenly classifies the fire source and the smoke, as their gray levels are relatively close (but not the same); when K = 12, the effect is improved because a segmentation threshold between the two gray levels is obtained. The reason for all the above phenomena is that as the number of threshold values increase, more classes with different characteristics are acquired, which can maintain the local features of the original image. In fact, it is difficult for the human eye to determine the performance of each algorithm at the same threshold level, especially the complex image with multiple objects. Therefore, the quality of the segmented images needs to be further evaluated using some specific indicators.         Table 5 presents the average objective function (minimum cross entropy) value obtained by each algorithm over all satellite images (bolded results are best). Note that a lower function value indicates better performance. It can be seen from the table that the proposed method presents the lowest value in most cases. For example, under the conditions of "Image3" (for K = 12), the fitness values are −646.3107, −646.2853, −646.3092, −646.3039, −646.3077, −646.3098, −646.2148, and −646.2988 for GOA-jDE, GOA, DE, MGOA, hjDE, BDE, BA, and PSO, respectively. It is obvious that GOA-jDE outperforms the compared algorithms, indicating its remarkable optimization ability and balanced exploration-exploitation. At the same time, this convincingly demonstrates that the segmented image obtained by the proposed approach is detailed, informative, and of high quality because the entropy of a given image indicates its average information content [65]. More specifically, the reason for the high precision is the use of a powerful hybrid model. In GOA-jDE, the whole population is divided into two parts according to the average fitness value. Compared to the standard GOA and DE, the exploration and exploitation stages of optimization are well balanced, which increases the probability of obtaining the global optimum. Compared to other algorithms, the GOA-jDE algorithm can maintain the population diversity in the late iteration, thus enhancing the ability to avoid local optimization. For example, the main drawback of PSO is the parameter ω. If the starting and ending value of parameter ω for the current problem are not determined in the best form, the PSO may converge prematurely and fall into a local optimum. For the BA algorithm, more parameters need to be adjusted for different problems, such as r i (rate of pulse emission), A i (loudness value), and v max (maximum velocity). The determination of these parameters greatly reduces the universality of BA algorithm, which affects the performance to some extent.
Moreover, the convergence property is also an indicator for evaluating the performance of each algorithm. Some typical drawbacks of the meta-heuristic algorithm can be analyzed from the convergence curve, including premature convergence and slow convergence speed. The convergence curve of each algorithm on "Image2", "Image4", "Image6", and "Image8" at 12 threshold levels is shown in Figure 7 because the difference between algorithms is obvious at a higher threshold. It can be observed from the results that, in most cases, the GOA-jDE algorithm performs better than other methods. In other words, the curve given by the proposed algorithm is relatively low, overall. As discussed above, the main disadvantages of GOA are slow convergence speed and unbalanced exploration-exploitation, which are reflected clearly in the curves. For instance, under the conditions of "Image8", it can be observed that the optimal value obtained by the GOA is continuously updated throughout the iterative process, showing no tendency to converge. This phenomenon illustrates the slow convergence drawback of GOA. However, the GOA-jDE algorithm can better balance the exploration and exploitation stages, showing neither premature convergence nor slow convergence. In fact, the superiority of the proposed algorithm is not only reflected in the segmentation task of "Image8", but also in other images. The promising result indicates the remarkable performance of GOA-jDE, which is suitable for the problem of satellite image segmentation.
Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 26 "Image8", but also in other images. The promising result indicates the remarkable performance of GOA-jDE, which is suitable for the problem of satellite image segmentation. Figures 8-10 present the PSNR, SSIM, and FSIM values obtained by all algorithms, respectively. As discussed above, the combination of thresholds determines the quality of segmented images, which is reflected in the indicator value of the experiment. A higher value means more similarity to the original image. It can be found, from the figure, that the value of the PSNR indicator increases as the number of thresholds increases, and the position of the line graph given by the proposed method is significantly higher than the compared algorithms, except for "Image5". The reason for this phenomenon is that each image corresponds to a different optimization problem, and it is difficult to find a single algorithm that is universally applicable. Therefore, GOA-jDE can still be considered efficient and powerful from a holistic perspective, although it does not give the best results in all cases. Considering the SSIM and FSIM values obtained over all available images at 12 threshold levels, the algorithms can be ranked, in our study, as GOA-jDE > hjDE > BDE~ MGOA > GOA > DE > PSO > BA. When the threshold level is low (K = 4 and 6), the difference between the algorithms is not obvious, while the difference becomes significant at high threshold levels (K = 12). Table 6 gives the SSR values obtained by each method at 12 threshold levels. Note that the specified values of PSNR, SSIM, and FSIM, here, are 29.5, 0.93, and 0.95 respectively, which is a good reflection of the  Figures 8-10 present the PSNR, SSIM, and FSIM values obtained by all algorithms, respectively. As discussed above, the combination of thresholds determines the quality of segmented images, which is reflected in the indicator value of the experiment. A higher value means more similarity to the original image. It can be found, from the figure, that the value of the PSNR indicator increases as the number of thresholds increases, and the position of the line graph given by the proposed method is significantly higher than the compared algorithms, except for "Image5". The reason for this phenomenon is that each image corresponds to a different optimization problem, and it is difficult to find a single algorithm that is universally applicable. Therefore, GOA-jDE can still be considered efficient and powerful from a holistic perspective, although it does not give the best results in all cases. Considering the SSIM and FSIM values obtained over all available images at 12 threshold levels, the algorithms can be ranked, in our study, as GOA-jDE > hjDE > BDE~MGOA > GOA > DE > PSO > BA. When the threshold level is low (K = 4 and 6), the difference between the algorithms is not obvious, while the difference becomes significant at high threshold levels (K = 12). Table 6 gives the SSR values obtained by each method at 12 threshold levels. Note that the specified values of PSNR, SSIM, and FSIM, here, are 29.5, 0.93, and 0.95 respectively, which is a good reflection of the performance differences between the algorithms in this study. It can be found that GOA-jDE-based technique shows excellent performance in "Image 3" and "Image 7" (i.e., its advantages are obvious). Considering "Image 5", although each algorithm does not achieve the expected results, it can be found from Figures 8-10 that the proposed method is relatively advantageous. The performance of GOA-jDE benefits from its accurately determined threshold and the resulting high-quality segmented image. On the contrary, several compared algorithms, such as DE, BA, and PSO, could not obtain the ideal segmented images due to improper determination of the threshold, and the similarity with the original image is not high.         Due to the random property of the meta-heuristic algorithm, it was necessary to perform detailed stability analysis. Boxplots were used to evaluate the stability of each algorithm. The relevant results of four randomly selected images can be seen in Figure 11. It can be observed that the proposed method again outperforms the others. More specifically, the boxplot obtained was lower in position, more compact, and had no bad points, indicating better stability, continuity, and consistency. However, the performance of the other algorithms was not ideal. For example, considering the segmentation of "Image8", the data presented by BA fluctuate greatly, and the center of the boxplot was obviously higher than other algorithms, showing poor stability. Moreover, in the conditions of "Image7", BDE gives a low median and relatively compact graph, but its accuracy is seriously affected by the existence of bad points. After all, algorithms with unstable performance are difficult to apply to practical engineering problems. In GOA-jDE, a powerful hybrid model was used to balance the exploration and exploitation stages. Thus, its performance was not greatly affected by the random characteristic of the meta-heuristic algorithm, showing remarkable stability in most cases.
Time complexity is also an important indicator for evaluating the performance of meta-heuristic algorithms. In engineering practices, high-precision but high-complexity algorithms are often not the best choice, but algorithms that are fast and can obtain approximate optimal solutions are generally used. Hence, the standard for measuring an algorithm cannot be just computation time or precision and, instead, the evaluation should be comprehensive and objective. The average CPU time of each algorithm can be found in Table 7. It can be observed that PSO is the fastest among available algorithms, but it cannot achieve the best values of the above indicators. The DE algorithm gives competitive results in some cases and is slightly slower than PSO in the overall consideration. In addition, the computation time of GOA-jDE is between GOA and DE, which is determined by the used hybrid model. It is worth noting that this result does not mean that the proposed algorithm is inefficient. As discussed above, the evaluation of algorithm performance should be more comprehensive and objective. The proposed algorithm has shown remarkable performance in many aspects, despite not being the fastest. In summary, GOA-jDE is a high-performance algorithm which can improve the precision significantly while maintaining runtime. Furthermore, the proposed algorithm has a few adjustable parameters, which enables it to adapt to the segmentation of most images. After all, tuning the algorithm parameters may be more complicated than optimizing the problem itself [25].
of the boxplot was obviously higher than other algorithms, showing poor stability. Moreover, in the conditions of "Image7", BDE gives a low median and relatively compact graph, but its accuracy is seriously affected by the existence of bad points. After all, algorithms with unstable performance are difficult to apply to practical engineering problems. In GOA-jDE, a powerful hybrid model was used to balance the exploration and exploitation stages. Thus, its performance was not greatly affected by the random characteristic of the meta-heuristic algorithm, showing remarkable stability in most cases. Time complexity is also an important indicator for evaluating the performance of meta-heuristic algorithms. In engineering practices, high-precision but high-complexity algorithms are often not the best choice, but algorithms that are fast and can obtain approximate optimal solutions are generally used. Hence, the standard for measuring an algorithm cannot be just computation time or precision and, instead, the evaluation should be comprehensive and objective. The average CPU time of each algorithm can be found in Table 7. It can be observed that PSO is the fastest among available algorithms, but it cannot achieve the best values of the above indicators. The DE algorithm gives competitive results in some cases and is slightly slower than PSO in the overall consideration. In

Statistical Tests
Since the experiments of all available methods were the same, it was necessary to perform statistical tests. Friedman test [64] and Wilcoxon's rank sum test [63] were used as non-parametric statistical tests for evaluating the performance of all algorithms, with 5% considered as the significant level. The null hypothesis (H 0 ) in Wilcoxon's rank sum test assumes that there is no significant difference between the two algorithms being compared, and the alternative hypothesis (H 1 ) indicates the difference. In addition, H 0 in the Friedman test states the equality of medians between the algorithms, while H 1 also indicates the difference. A more detailed description of Friedman test and Wilcoxon's rank sum test can be found in the literature [66].
The experimental results can be seen in three tables. From Table 8, the proposed method gives better results (p < 0.05 and h = 1) in all cases, which indicates that the indicator values given by GOA-jDE are not appearing randomly. For the Friedman test, it is worth mentioning that calculating the rankings is the first step, the relevant results of which are presented in Table 9. It can be found from the table that the proposed method ranks first in all cases. The chi-square χ 2 value and p-value are presented in Table 10. According to the chi-square distribution table, the critical value for 7 (8 algorithms − 1) degrees of freedom with 5% significant level is 14.067 [67,68]. As shown in Table 10, the chi-square values obtained at all threshold levels were much larger than the critical value, and the p-values acquired for all number of thresholds were far less than 0.05. This promising result indicates that H 0 can be rejected in all cases, and there is a significant difference among the algorithms. The experiments in this section demonstrate the statistical significance of the performance of GOA-jDE algorithm. In this section, the GOA-jDE-MCE method is compared with other advanced satellite image thresholding methods, such as MABC-Tsallis [57], IDSA-Otsu [58], and CS-MCE [13]. Considering the different criteria used in each method, the GOA-jDE algorithm was combined with MCE, Tsallis entropy, and Otsu's method for fair comparison. In other words, three sets of comparisons were performed, namely GOA-jDE-Tsallis versus MABC-Tsallis, GOA-jDE-Otsu versus IDSA-Otsu, and GOA-jDE-MCE versus CS-MCE. The selected performance measure was the average objective function value. Furthermore, two natural images from the Berkeley segmentation dataset were also used to test the robustness of the algorithm, which are given in Figure 12 [69]. It is worth noting that higher threshold levels (20, 25, and 30) were adopted in this experiment because the satellite images contained more information and detail than normal images. More critically, the objects (targets) that needed to be determined and segmented were, in general, multiple and varied [70]. Therefore, in this section, the threshold level was increased to assess the algorithm performance more practically.

Experimental Series 2: Performance on Other Objective Functions
In this section, the GOA-jDE-MCE method is compared with other advanced satellite image thresholding methods, such as MABC-Tsallis [57], IDSA-Otsu [58], and CS-MCE [13]. Considering the different criteria used in each method, the GOA-jDE algorithm was combined with MCE, Tsallis entropy, and Otsu's method for fair comparison. In other words, three sets of comparisons were performed, namely GOA-jDE-Tsallis versus MABC-Tsallis, GOA-jDE-Otsu versus IDSA-Otsu, and GOA-jDE-MCE versus CS-MCE. The selected performance measure was the average objective function value. Furthermore, two natural images from the Berkeley segmentation dataset were also used to test the robustness of the algorithm, which are given in Figure 12 [69]. It is worth noting that higher threshold levels (20, 25, and 30) were adopted in this experiment because the satellite images contained more information and detail than normal images. More critically, the objects (targets) that needed to be determined and segmented were, in general, multiple and varied [70]. Therefore, in this section, the threshold level was increased to assess the algorithm performance more practically.  Table 11 gives the average objective function value obtained by each thresholding technique (bolded results are best). In the comparison with MABC-Tsallis, the GOA-jDE algorithm exhibits a remarkable performance based on the Tsallis entropy thresholding technique, which is reflected in all the obtained values. Moreover, considering the experiment on natural images, GOA-jDE-Tsallis presents better results in most cases, indicating the robustness and universality of the algorithm. For the comparison with the other two methods, it can be seen that the proposed algorithm-based methods are, again, superior to other methods. In general, the compared method based on MCE is  Table 11 gives the average objective function value obtained by each thresholding technique (bolded results are best). In the comparison with MABC-Tsallis, the GOA-jDE algorithm exhibits a remarkable performance based on the Tsallis entropy thresholding technique, which is reflected in all the obtained values. Moreover, considering the experiment on natural images, GOA-jDE-Tsallis presents better results in most cases, indicating the robustness and universality of the algorithm. For the comparison with the other two methods, it can be seen that the proposed algorithm-based methods are, again, superior to other methods. In general, the compared method based on MCE is relatively more competitive than the compared method based on Tsallis entropy and Otsu's method, because the first one gives better results in more cases.

Experimental Series 3: Further Evaluation on SIPI Image Database
In this section, the performance of the proposed algorithm is further verified on the SIPI Image Database (37 color satellite images and 1 grayscale satellite image). The experiment was carried out at high threshold levels, which are the same as in Section 5.4. Note that the compared selected algorithms are based on different criteria, thus, the PSNR, SSIM, and FSIM indicators are utilized instead of average fitness values. Table 12 presents the average rank of the results for each algorithm. A lower ranking value indicates better performance. It can be found that the GOA−jDE ranks first in all cases, and the advantage is more obvious at higher threshold levels (K = 30). Another interesting result is that the MCE method is more competitive than the Tsallis entropy thresholding technique in the field of satellite image segmentation. More specifically, the results of MABC-Tsallis are not superior to those of MCE-based approaches. However, this does not mean that the methods based on Tsallis entropy are inefficient or meaningless. As stated in the no free lunch (NFL) theorem [71], no algorithm can solve all engineering problems. GOA-jDE-Otsu and GOA-jDE-Tsallis thresholding methods may exhibit superior performance in other image segmentation fields, including CT and MR images. Therefore, applications of the proposed algorithm are potential and meaningful.