Face Image Segmentation Using Boosted Grey Wolf Optimizer

Image segmentation methods have received widespread attention in face image recognition; they divide the pixels of an image into different regions and effectively distinguish the face region from the background for further recognition. Threshold segmentation, a common image segmentation method, suffers from computational complexity that grows exponentially with the number of segmentation threshold levels. Therefore, in order to improve the segmentation quality and obtain the segmentation thresholds more efficiently, a multi-threshold image segmentation framework based on a meta-heuristic optimization technique combined with Kapur's entropy is proposed in this study. A meta-heuristic optimization method based on an improved grey wolf optimizer variant is proposed to optimize the 2D Kapur's entropy of the greyscale and nonlocal mean 2D histograms generated from the image. To verify the advancement of the method, experiments comparing it with state-of-the-art methods on the IEEE CEC2020 benchmark and public face image segmentation datasets were conducted. The proposed method achieved better results than the other methods in various tests at 18 thresholds, with an average feature similarity of 0.8792, an average structural similarity of 0.8532, and an average peak signal-to-noise ratio of 24.9 dB. It can be used as an effective tool for face segmentation.


Introduction
Face-based research has received increasing attention, as the face is one of the most common biometric features that can be collected without contact. One of the hot topics in face image processing technology is object recognition [1][2][3]. Face segmentation and face recognition technologies have been used in various places, such as banks [4], schools [5], and libraries [6]. However, face recognition, person detection, and image processing techniques often depend on the quality of image segmentation. Rangayya et al. [7] used a kernelized total Bregman divergence-based K-Means clustering segmentation technique in their face recognition method to reduce noise interference with the segmentation, effectively improving face recognition. Khan et al. [8] developed an automatic facial image segmentation model based on conditional random fields to improve classification accuracy. Segundo et al. [9] embedded a segmentation technique based on edge detection, region clustering, and shape analysis into a face detection system to improve face recognition performance. Zhang et al. [10] proposed a multistep iterative segmentation algorithm to achieve fine segmentation of obscured characters and improve recognition accuracy. Efficient and accurate image segmentation techniques can help improve the performance of face recognition systems [11][12][13].
At present, there are many image segmentation methods, such as multi-threshold [14], region growing [15], edge detection [16], and deep learning [17,18]. At the same time, unsupervised learning-based image segmentation methods are available for unlabeled training data. For example, Xia et al. [19] proposed a new unsupervised segmentation network by combining two fully convolutional networks into one autoencoder, inspired by the idea of semantic segmentation. Kim et al. [20] designed an end-to-end unsupervised image segmentation network consisting of argmax functions for normalization and differentiable clustering. Segmentation methods based on unsupervised learning do not need to train the neural network in advance and can directly segment a single image without supervision, which greatly saves computational resources and segments salient targets more accurately. However, such methods are not stable enough for the image segmentation task and cannot effectively extract texture features to segment the image into regions of overall significance when the same target has significant color differences. In some images they confuse foreground and background, over-rely on color information, and give little consideration to the spatial feature information of the target. There is also the problem of poor robustness in unsupervised learning: it is difficult for such methods to restrain the neural network from speculatively outputting a result containing only one category, a design that is more prone to overfitting. Threshold-based image segmentation methods do not require a priori knowledge; their segmentation effect is robust and excellent, making them an efficient means of image segmentation. On the other hand, the typical exhaustive method for determining the best thresholds increases computational complexity and decreases computational efficiency. Therefore, using a meta-heuristic optimization algorithm to search for the optimal thresholds has become an effective approach. Li et al. [21] proposed a threshold segmentation technique based on particle swarm optimization. Liu et al. [22] used the firework algorithm to find the optimal threshold set. Li et al. [23] used the biogeographic optimization algorithm to enhance multi-threshold image segmentation. Dutta et al. [24] proposed a multi-level image thresholding method based on the quantum genetic algorithm. The threshold segmentation method based on meta-heuristic optimization can obtain the optimal set of thresholds for segmentation more efficiently and is considered a promising threshold segmentation method.
In recent times, optimization methods have experienced a surge in prominence, capturing the sustained interest of the community in swarm-based optimization, distributed optimization [25], robust optimization [26], multi-objective optimization [27], many-objective cases [28], fuzzy optimization [29], etc. Optimization methods can be classified into two fundamental classes, deterministic and approximative techniques, rendering them amenable to a wide spectrum of problem scenarios [30,31]. Meta-heuristic approaches stand as a pivotal category of optimization techniques grounded in concepts such as mutation, crossover, and various iterative procedures. These methodologies enable the exploration of solution spaces independently of gradient information, provided that the newly generated solutions adhere to prescribed optimality criteria. Among the most well-known approaches are genetic algorithms (GAs), which are based on selection of the fittest and survival in nature [32,33]. However, these classes of swarm-based methods are prone to several risks, including weak mathematical models, low robustness, immature convergence, and the possibility of stagnation [34,35]. Meta-heuristic optimization algorithms can be utilized to solve complex problems, such as image segmentation [36], feature selection [37], real-world optimization problems [38], bankruptcy prediction [39], scheduling optimization [40], multi-objective optimization [41], global optimization [42,43], target tracking [44], economic emission dispatch [45], feed-forward neural networks [46], and numerical optimization [47][48][49]. They have become some of the most popular optimization methods due to their excellent optimization ability. Common optimization algorithms include particle swarm optimization (PSO) [50], the sine and cosine algorithm (SCA) [51], the whale optimization algorithm (WOA) [52], the slime mould algorithm (SMA) [53,54], hunger games search (HGS) [55], Harris hawks optimization (HHO) [56], the colony predation algorithm (CPA) [57], the rime optimization algorithm (RIME) [58], the weighted mean of vectors (INFO) [59], the Runge Kutta optimizer (RUN) [60], the grey wolf optimization algorithm (GWO) [61], and other optimization methods. As an example, the Sine and Cosine Algorithm (SCA) was utilized to develop a method for evaluating task offloading strategies within the context of Mobile Edge Computing (MEC) [62].
According to the No Free Lunch theorems for optimization [63], no single optimization method can perform well on all problems. Therefore, more and more improved optimization algorithms based on optimization strategies have been proposed to address the shortcomings of different optimization algorithms in the search for the global optimal solution. To adapt algorithms to more complex problems, the improvement of original algorithms has become another research hotspot. For example, Yang et al. [64] utilized roundup search, elite Lévy mutation, and decentralized foraging optimization techniques to enhance the performance of differential evolution for multi-threshold image segmentation. Zhang et al. [65] proposed an adaptive differential evolution with an optional external archive (JADE). Guo et al. [66] proposed a self-optimization approach for L-SHADE (SPS_L_SHADE_EIG). Qu et al. [67] proposed a modified sine cosine algorithm based on a neighborhood search and a greedy Lévy mutation (MSCA). Han and Li [68] proposed an improved genetic algorithm based on adaptive crossover probability and adaptive mutation probability.
PSO is an excellent optimization algorithm, with few parameters and easy implementation, and PSO and its variants have received extensive attention from researchers. Among the many PSO variants, GWO not only inherits the advantages of PSO but also attempts to improve global optimization ability and convergence [69]. Therefore, improved GWOs based on optimization strategies have been proposed recently. Cai et al. [70] optimized a kernel extreme learning machine with an enhanced GWO. Choubey et al. [71] optimized the parameters of a multi-machine power system stabilizer. Li et al. [72] monitored a robot path through an enhanced GWO-optimized area. Mehmood et al. [73] applied an improved grey wolf optimizer based on chaotic mapping to problems such as autoregressive exogenous structural parameter optimization.
In threshold optimization, local optimal or suboptimal solutions mistakenly considered the best set of thresholds can lead to incorrect segmentation; critical information about the target is then lost, degrading image segmentation quality. To obtain a method with high optimization accuracy and a strong ability to jump out of local optima, this study proposes a GWO improvement based on the cosmic wormhole strategy, denoted WGWO. To distinguish background and object more efficiently and to consider the spatial information of pixels, Kapur's entropy is used as the objective function of WGWO and combined with a nonlocal mean two-dimensional histogram to achieve high-quality segmentation of face images. From the Berkeley dataset [74] and the Flickr-Faces-High-Quality (FFHQ) dataset [75], eight face images were selected for comparative experiments, and the segmentation effects were verified with three image evaluation metrics. The experimental results show that the WGWO multi-threshold segmentation method achieves satisfactory segmentation results. The main contributions of this paper are as follows:

• A multi-threshold image segmentation method based on an optimization technique and a 2D histogram is proposed and used to segment face images;
• An enhanced grey wolf optimizer based on the cosmic wormhole strategy is proposed and used to obtain the optimal segmentation thresholds for the image.
The remainder of the paper is organized as follows. Section 2 describes the proposed WGWO. Section 3 presents the WGWO-based image segmentation method. In Section 4, the segmentation results are verified and discussed. Section 5 summarizes the current work and outlines future research directions.

The Proposed WGWO
To improve the efficiency of searching for the optimal threshold set, this section details an improved grey wolf optimizer for image segmentation, called WGWO.

Original GWO (a Variant of PSO)
As a PSO algorithm variant [76], GWO mainly relies on three intermediate solutions S_1^t, S_2^t, and S_3^t to update the positions of the other particles in the population at the t-th iteration. S_1^t, S_2^t, and S_3^t are obtained by biasing the positions of individual particles in the population towards the top three optimal individuals in the population, g_1^t, g_2^t, and g_3^t, respectively, as shown in Equations (1)-(3), where r_1^t, r_2^t, r_3^t, q_1^t, q_2^t, q_3^t all denote random numbers obeying a uniform distribution between 0 and 1, ϕ^t denotes an acceleration weighting factor decreasing from 2 to 0, ϕ^t = 2(1 − t/T), and X_i^t denotes the position vector of the i-th particle at the t-th iteration. The position information of the three intermediates S_1^t, S_2^t, and S_3^t is then combined to update the position of the i-th particle according to a weighted-averaging rule, as shown in Equation (4).
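Since Equations (1)-(4) are not reproduced here, the following Python sketch (the paper's experiments used MATLAB) shows the textbook GWO update with the symbols defined above (ϕ, r, q, g_1..g_3); the equal-weight average in the last line is one common instantiation of the weighted-averaging rule of Equation (4).

```python
import numpy as np

def gwo_update(X, g1, g2, g3, t, T, rng):
    """One textbook GWO position update (sketch of Equations (1)-(4)).

    X        : (N, D) positions of the particles/wolves
    g1,g2,g3 : top three individuals found so far
    t, T     : current iteration and maximum number of iterations
    """
    phi = 2.0 * (1.0 - t / T)        # acceleration weight, decreasing from 2 to 0
    S = []
    for g in (g1, g2, g3):
        r = rng.random(X.shape)      # r^t, uniform in [0, 1)
        q = rng.random(X.shape)      # q^t, uniform in [0, 1)
        A = phi * (2.0 * r - 1.0)    # step coefficient in [-phi, phi]
        C = 2.0 * q
        D = np.abs(C * g - X)        # distance to the leader
        S.append(g - A * D)          # intermediate solution S_k^t
    # weighted (here equal-weight) average of the three intermediates
    return (S[0] + S[1] + S[2]) / 3.0
```

Note that at the final iteration ϕ = 0, so every particle collapses onto the average of the three leaders, which is the exploitation end of the schedule.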

Improved GWO (WGWO)
GWO has shown excellent optimization performance in various fields, e.g., power load forecasting [77,78]. However, GWO still has insufficient accuracy for the image segmentation task involved in this study. In this study, GWO is enhanced by combining the cosmic wormhole technique with population diversity to improve its ability to avoid falling into local optima. Equations (5)-(8) show the mathematical model of the cosmic wormhole strategy [79].
In Equations (5)-(8), Ada is a probability parameter, Ada ∈ [0.2, 1]; whether a candidate solution is updated is decided by Ada. M is a weight parameter, which controls the influence of the random search on the current candidate solution across different search epochs. C is a constant with value 6. r_4^t, r_5^t, r_6^t represent random numbers between 0 and 1.
To obtain a better solution, we propose the WGWO method by introducing the cosmic wormhole strategy after the wolf population update; the flowchart of WGWO is shown in Figure 1 (code is publicly available at https://github.com/Forproject1111/WGWO, accessed on 8 October 2023). To calculate the time complexity of WGWO, we need to consider the maximum number of iterations of the algorithm (T), the size of the population (N), and the variable size of the individuals (D). WGWO consists of population initialization, searching for prey, and the cosmic wormhole strategy. Therefore, the time complexity of WGWO is O(((2N + 1) × D) × T).

The Basic Theory of Multi-Threshold Image Segmentation
The multi-threshold image segmentation method based on WGWO, NML 2-D histogram, and Kapur's entropy reduces noise interference and improves segmentation efficiency.

NML 2-D Histogram
The NML 2-D histogram [13] is composed of grayscale values and nonlocal means in a digital image, which reflects both the grayscale magnitude of pixels and the information relating each pixel to its neighborhood. Assume that an image I of size M × N exists; p and q represent pixel points, and X(·) represents the pixel value. The NML mathematical model of I is shown in Equations (9)-(12), where ω(p, q) is a Gaussian weighting function, σ denotes the standard deviation, µ(p), µ(q) represent the local means of pixels p and q, x is a pixel in the image I, and S(x) is a filter matrix of size n × n around the pixel x. For each pixel I(x, y) in I, x ∈ [1, M], y ∈ [1, N], the corresponding grayscale f(x, y) and the nonlocal mean g(x, y) can be calculated. Then, i and j are used to denote f(x, y) and g(x, y), respectively. h(i, j) denotes the vertical coordinate of the two-dimensional histogram, i.e., the number of occurrences of the gray-NML pair. Finally, h(i, j) is normalized to obtain P_ij, which constructs the greyscale nonlocal mean two-dimensional histogram, as shown in Figure 2.
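The construction above can be sketched in Python, assuming a simplified nonlocal mean whose Gaussian weights ω(p, q) compare the local means µ(p), µ(q) as described; the window sizes, σ, and the box-filter local mean are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def nml_2d_histogram(img, search=5, patch=3, sigma=10.0, bins=256):
    """Sketch of the grey-level / nonlocal-mean (NML) 2-D histogram P_ij."""
    img = img.astype(np.float64)
    M, N = img.shape
    # local mean mu() via a simple box filter of size patch x patch
    pad = patch // 2
    padded = np.pad(img, pad, mode="reflect")
    mu = np.zeros_like(img)
    for dy in range(patch):
        for dx in range(patch):
            mu += padded[dy:dy + M, dx:dx + N]
    mu /= patch * patch
    # nonlocal mean g(): Gaussian-weighted average over a search window,
    # with weights comparing the local means mu(p) and mu(q)
    r = search // 2
    mu_pad = np.pad(mu, r, mode="reflect")
    img_pad = np.pad(img, r, mode="reflect")
    num = np.zeros_like(img)
    den = np.zeros_like(img)
    for dy in range(search):
        for dx in range(search):
            w = np.exp(-((mu - mu_pad[dy:dy + M, dx:dx + N]) ** 2) / (2 * sigma ** 2))
            num += w * img_pad[dy:dy + M, dx:dx + N]
            den += w
    g = num / den
    # h(i, j): counts of (grey value, NML value) pairs, normalized to P_ij
    h, _, _ = np.histogram2d(img.ravel(), g.ravel(),
                             bins=bins, range=[[0, 256], [0, 256]])
    return h / h.sum()
```

The returned matrix sums to 1 and plays the role of P_ij in the entropy computation that follows.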

Kapur's Entropy
To ensure that the segmented image retains the maximum amount of information about both the background and the target, the concept of Kapur's entropy [13,80] was introduced in this study. Kapur's entropy is used as a physical quantity to measure the amount of information distributed over the target and background regions: the greater Kapur's entropy, the better the image segmentation quality. The following describes the process of segmenting images with several thresholds using Kapur's entropy. The objective function is expressed as computing the entropy of the L − 1 image segments and summing them. The expression of the objective function F of Kapur's entropy is shown in Equations (13)-(15).
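As a sketch of this objective in its one-dimensional form, the code below assumes the standard Kapur definition: each segment's entropy H_i is computed from the normalized histogram mass of that segment and the values are summed; the paper applies the same idea to the 2-D grey/NML histogram.

```python
import numpy as np

def kapur_entropy(P, thresholds):
    """Multi-threshold Kapur's entropy (1-D form for clarity).

    P          : normalized grey-level histogram, P.sum() == 1
    thresholds : threshold list splitting [0, len(P)) into segments
    """
    edges = [0] + sorted(thresholds) + [len(P)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        seg = P[lo:hi]
        w = seg.sum()                      # probability mass of segment i
        if w <= 0:
            continue
        p = seg[seg > 0] / w               # within-segment distribution
        total += -np.sum(p * np.log(p))    # segment entropy H_i
    return total
```

An optimizer such as WGWO would then search for the threshold set maximizing this value, since higher entropy means more information is retained across segments.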

Here, H_i represents the entropy of the i-th image segment, {t_1, t_2, …, t_{L−1}} represents the grayscale values (thresholds) of the grayscale image, and {s_1, s_2, …, s_{L−1}} characterizes the grayscale values of the nonlocal mean image.

Image Segmentation Method
The flowchart of segmentation based on WGWO, Kapur's entropy, and the NML 2-D histogram is shown in Figure 3. As shown in Figure 3, the input image is converted into a grayscale image and a nonlocal mean filtered image. Then, a 2D histogram is calculated using the grayscale information of the grayscale image and the nonlocal mean filtered image. The Kapur's entropy of the two-dimensional histogram is used as the objective function, and the proposed WGWO algorithm optimizes the segmentation thresholds of the image, which ultimately segments the image into multiple regions. The pseudo-code for segmentation is shown in Algorithm 1.

Algorithm 1 The flow of image segmentation method
Step 1: Input digital image I, which has a size of M × N. The grayscale image F is obtained by graying out the image I;
Step 2: The grayscale image F is nonlocal mean filtered to obtain the nonlocal mean image G according to Equations (9)-(12);
Step 3: A two-dimensional image histogram is constructed using the grayscale values and nonlocal means in F and G;
Step 4: Compute the two-dimensional Kapur's entropy according to Equations (13)-(15);
Step 5: Kapur's entropy of the two-dimensional histogram is optimized using WGWO;
Step 6: Multi-threshold image segmentation is performed according to the optimal threshold set to obtain pseudo-color and gray images.
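Step 6 can be sketched as follows: given the optimal threshold set, each grey value is mapped to its segment index and then to a representative grey level. Mapping each segment to its interval midpoint is an illustrative choice for producing a viewable gray output image.

```python
import numpy as np

def apply_thresholds(gray, thresholds):
    """Step 6 sketch: turn an 8-bit grayscale image into a segmented image."""
    t = np.sort(np.asarray(thresholds))
    labels = np.digitize(gray, t)                  # segment index 0 .. len(t)
    # representative level per segment: midpoint of its grey-value interval
    edges = np.concatenate(([0], t, [256]))
    reps = ((edges[:-1] + edges[1:]) // 2).astype(np.uint8)
    return reps[labels]
```

A pseudo-color output would use the same `labels` array with a color lookup table instead of grey midpoints.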


Experiment Simulation and Analysis
In this section, the performance of the WGWO-based multi-threshold segmentation method proposed in this paper is verified. All programs were run using Matlab 2018b on a Windows 10 computer with an Intel i5-11400H CPU (2.70 GHz) and 16 GB of RAM.

IEEE CEC2020 Benchmark Dataset Experiment
In this subsection, an ablation experiment and a parameter experiment based on the IEEE CEC2020 [81] test suite were conducted to present the global optimization capability of WGWO. The details of IEEE CEC2020 are shown in Table 1. In addition, to ensure the fairness of the experiments, the public parameters of all test methods were set uniformly: the maximum number of function evaluations was set to 300,000, the population size to 30, the dimensionality to 30, and the number of independent runs of the program to 30. First, the ablation experiment justifies the improved strategy, where WGWO1 used only X_i^{t+1} = X_i^t + weight as the update formula in Equation (5), and WGWO2 used only X_i^{t+1} = X_i^t − weight. Table 2 shows the rankings on the ten benchmark functions and the final ranking in the ablation experiment; the optimal results are bolded. From the table, it can be seen that WGWO and WGWO2 each obtained the best results in four cases. However, the average ranking of 1.90 for WGWO was the best among the four methods. Therefore, WGWO was selected as the threshold optimization method for image segmentation in this study. Second, parameter values are one of the factors that affect the performance of the algorithm; among them, Ada and C are two key parameters in WGWO. Therefore, this study focused on testing the parameter sensitivity of Ada and C. Tables 3 and 4 show the ranking results and the final rankings for Ada and C, respectively. As can be seen in Table 3, the combined ability was significantly better than that of the other versions of WGWO, despite the relatively poor results for values in the range [0.3, 1] on F1 and F5. Furthermore, the final ranking in Table 4 shows that WGWO was optimized the most when the C value was 6. Based on the results of the sensitivity tests of the two parameters, we fine-tuned Ada and C.
A randomized value range of [0.3, 1] for Ada and a value of 6 for C were finally determined as the parameter settings for the final version of WGWO in the subsequent experiments.
In conclusion, based on the results of the ablation experiment and the parameter sensitivity experiment described above, it can be concluded that the best version of WGWO was applied to optimize the threshold for multi-threshold image segmentation.

Multi-Threshold Face Image Segmentation Experiment
To provide quality face recognition data, this subsection reports the testing of image segmentation performance of the proposed method.

Experimental Settings
Test images of faces from the Berkeley dataset and the Flickr-Faces-High-Quality dataset were selected as validation materials, as shown in Figure 4. The size of the images was 321 × 481; images A to D were from the Berkeley dataset, and images E to H were from the FFHQ dataset. Threshold values were set in the range of 0 to 255. The maximum number of iterations was set to 100, the population size to 30, and the number of independent runs to 30.
WGWO was compared to GWO [61], PSO [50], WOA [52], BLPSO [82], IGWO [70], HLDDE [64], SCADE [83], and IWOA [84] in the multi-threshold image segmentation experiments. The initialization parameters of the nine algorithms are shown in Table 5. To verify the segmentation effect of WGWO, this paper examined the segmentation quality through three evaluation metrics: feature similarity (FSIM) [85], structural similarity (SSIM) [86], and peak signal-to-noise ratio (PSNR) [87]. The details of these metrics are shown in Table 6. The segmentation results were analyzed for the significance of differences using the Wilcoxon signed-rank test (WSRT) [88]. In WSRT, if the p-value is less than 0.05 and WGWO is superior to the comparison method, the performance advantage of WGWO is statistically significant and is denoted by '+'. If the p-value is less than 0.05 and WGWO is inferior to the comparison method, the performance advantage of the comparison method is statistically significant and is denoted by '−'. If the p-value is greater than or equal to 0.05, the performance of WGWO and the comparison method can be considered approximately equal and is denoted by '='.
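The '+'/'−'/'=' decision rule just described can be sketched as below; the p-value itself would come from a Wilcoxon signed-rank test over the paired runs (e.g. `scipy.stats.wilcoxon`), which is not re-implemented here. The helper name and `maximize` flag are illustrative; `maximize=True` corresponds to metrics such as FSIM, SSIM, and PSNR, where larger is better.

```python
def wsrt_label(p_value, mean_wgwo, mean_other, maximize=True):
    """Sketch of the '+'/'-'/'=' labeling rule for WSRT comparisons."""
    if p_value >= 0.05:
        return "="       # difference not statistically significant
    # significant difference: decide which method is better
    wgwo_better = mean_wgwo > mean_other if maximize else mean_wgwo < mean_other
    return "+" if wgwo_better else "-"
```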
Table 6. Details of the three image evaluation metrics.

FSIM [85]: an image quality assessment method based on phase consistency features and gradient features complementing each other.
SSIM [86]: a similarity assessment based on the luminance, contrast, and structure of the original image and the segmented image; a full-reference image quality evaluation index more in line with human visual judgment of image quality.
PSNR [87], PSNR = 10·log10(peak²/MSE): the ratio of the maximum possible power of a signal to the destructive noise power affecting its representation accuracy; an objective full-reference image quality evaluation index.
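The PSNR formula above can be computed directly; `peak = 255` is the usual choice for 8-bit images.

```python
import numpy as np

def psnr(original, segmented, peak=255.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((original.astype(np.float64) - segmented.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")    # identical images: no distortion
    return 10.0 * np.log10(peak ** 2 / mse)
```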

Image Segmentation Experiment
This experiment first demonstrates the nine algorithms segmenting the visual images in the Berkeley and FFHQ datasets at a threshold level of 18. Figures 5 and A1-A3 show the pseudo-color segmented images and gray segmented images for image A to image H. It can be seen directly from the figures that SCADE and IWOA (in Figures 5 and A1-A3) had poor segmentation effects, and HLDDE (in Figures A2 and A3) also had a poor segmentation effect on image F and image H. Subsequently, Figure 6 shows that WGWO had excellent segmentation results for all eight images at different thresholds. In this case, important information was lost in the segmented images at lower thresholds, while high-threshold segmentation retained more image detail. It is important to note that the visual results only show the segmentation effect; the experimental results based on FSIM, PSNR, and SSIM reflect the quality of the segmented image more objectively. As a result, the FSIM, PSNR, and SSIM results of WGWO were further analyzed and discussed.
the Berkeley and FFHQ datasets at a threshold level of 18. Figures 5 and A1-A3 show the results of pseudo-color segmented images and gray segmented images from image A to image H.It can be seen directly from the figures that SCADE and IWOA (in Figures 5 and  A1-A3) had poor segmentation effects, and HLDDE (in Figures A2 and A3) also had a poor segmentation effect on image F and image H. Subsequently, Figure 6 shows that WGWO had excellent segmentation results for all eight images at different thresholds.In this case, important information was lost in the segmented images at lower thresholds, while the high threshold segmentation retained more image detail.It is important to note that the visual result only shows the segmentation effect.The experimental results based on FSIM, PSNR, and SSIM reflect the quality of the segmented image more objectively.As a result, the FSIM, PSNR, and SSIM results of WGWO were further analyzed and discussed.Table 7 demonstrates the comparison results of FSIM between the proposed method and other methods at 4 threshold levels.From all 4 threshold levels, WGWO had the best average rankings among the compared algorithms, and even though the results of WGWO were weaker than those of GWO at 3 images in the segmentation results at 5-level and 8level thresholds, WGWO was better as a whole.It indicates that the segmented images obtained based on the WGWO method are better able to portray the local features of the target, as well as being more in line with the human visual system's perception of low- Table 7 demonstrates the comparison results of FSIM between the proposed method and other methods at 4 threshold levels.From all 4 threshold levels, WGWO had the best average rankings among the compared algorithms, and even though the results of WGWO were weaker than those of GWO at 3 images in the segmentation results at 5-level and 8-level thresholds, WGWO was better as a whole.It indicates that the segmented images obtained based on the WGWO 
method are better able to portray the local features of the target, as well as being more in line with the human visual system's perception of low-level features.Table 8 demonstrates the PSNR comparison results of each segmentation method.A comprehensive analysis of the data in Table 8 shows that the WGWO-based segmentation method was also excellent and stable, indicating that the WGWO-based segmentation method has less distortion and higher image quality in the segmentation process compared with the original image.Table 9 demonstrates the SSIM results of these compared methods.As can be seen from Table 9, the proposed WGWO-based segmentation method dominated, which indicates that the proposed method has less distortion and is more in line with the requirements of the human visual system.It is worth noting that as a variant of PSO, GWOs were more suitable for solving the optimization problem of segmentation thresholds compared to the PSO baseline method (in Tables 7-9).In conclusion, the segmentation performance of WGWO has been validated by three image segmentation quality assessment metrics at multiple threshold levels.In addition to exploring the image segmentation performance of WGWO in a conventional setting, this study further explored the stability of the proposed method when dealing with a larger number of population agents dealing with high threshold segmentation problems.In this study, we set the initial population size to 100 and the threshold segmentation level to 18. 
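For context, FSIM, PSNR, and SSIM are standard full-reference quality measures comparing the segmented image against the original. As a minimal illustration of how such a score is computed, the following sketch computes PSNR for 8-bit greyscale images (the helper name is ours, not from the paper's code):

```python
import numpy as np

def psnr(original, segmented, max_val=255.0):
    """Peak signal-to-noise ratio between two 8-bit greyscale images, in dB."""
    mse = np.mean((original.astype(np.float64) - segmented.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher PSNR means less distortion relative to the original, which is why larger values in Table 8 indicate better segmentation quality.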
Table 10 shows the rankings of the three segmentation metrics and the comparison results based on the WSRT. The table shows that the WGWO-based multi-threshold image segmentation method is still the best performer, with stable segmentation performance under a larger population size. In this study, the time cost and convergence of the threshold optimization methods are also used to evaluate algorithm performance; convergence of Kapur's entropy in particular is the key to the algorithm obtaining the optimal segmentation thresholds. Figure 7 shows the average time cost over 30 runs of the nine algorithms at different threshold levels for all processed images. The proposed method ranked third in time cost at each threshold level, and the time cost of every algorithm grew as the threshold level increased. Figure 8 depicts the convergence curves of the nine compared techniques for Kapur's entropy optimization on the eight images. The optimization of Kapur's entropy is a maximization problem, so a higher entropy value means that more useful information is retained and the segmentation is better. The following points can be seen from the convergence curves of the eight images. First, an examination of WGWO's eight convergence curves reveals that, although its convergence speed was not the fastest, its convergence accuracy was superior to that of the other algorithms; notably, WGWO has a strong ability to prevent premature convergence. Second, the convergence curve of the WOA algorithm lay above those of the other algorithms in the early and middle iterations, but the slope of the WGWO curve increased around the 90th iteration, after which WGWO attained the best fitness among all methods. Third, compared with PSO, GWO was more suitable for the image segmentation problem based on Kapur's entropy, with higher convergence accuracy and a better ability to escape local optima. Finally, comparing the convergence curves of WGWO and GWO shows very similar trends, but WGWO's convergence accuracy was better, which shows that the introduced strategy enhances the optimization ability of the algorithm.
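The objective that the optimizers maximize can be sketched as follows. This is the classical one-dimensional form of Kapur's entropy over a greyscale histogram; the paper actually optimizes a 2D variant over greyscale and nonlocal-mean histograms, and the function name is ours:

```python
import numpy as np

def kapur_entropy(hist, thresholds):
    """Kapur's entropy of a greyscale histogram for a given threshold set.

    1D sketch only; higher values mean more information is retained,
    so the optimizer treats this as a maximization problem.
    """
    p = hist / hist.sum()                       # normalize to probabilities
    bounds = [0] + sorted(thresholds) + [len(p)]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()                      # class probability mass
        if w <= 0:
            continue
        q = p[lo:hi] / w                        # within-class distribution
        q = q[q > 0]
        total += -np.sum(q * np.log(q))         # within-class entropy
    return total
```

A metaheuristic such as WGWO searches the space of threshold vectors for the one maximizing this sum of within-class entropies.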
In conclusion, WGWO has the best optimization ability among the compared algorithms and can perform higher-quality multi-threshold image segmentation. It can also be applied in many other fields, such as machine learning models [89], image denoising [90], medical signals [91], structured sparsity optimization [92], renal pathology image segmentation [93], mental health prediction [94], lung cancer diagnosis [95], computer-aided medical diagnosis [96], MRI reconstruction [97], and power distribution networks [98].
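Once the optimizer returns a threshold vector, producing the segmented image is straightforward: with k thresholds, each pixel is mapped to one of k + 1 regions. A minimal sketch (function name hypothetical):

```python
import numpy as np

def apply_thresholds(image, thresholds):
    """Map each pixel of a greyscale image to its region index.

    k thresholds partition the intensity range into k + 1 classes,
    matching the multi-threshold segmentation step described above.
    """
    return np.digitize(image, sorted(thresholds))
```

The resulting label map can then be rendered as a grey or pseudo-color segmented image, as in Figures 5 and A1-A3.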

Conclusions
This study proposed a grey wolf optimization algorithm based on the cosmic wormhole strategy. The population position update mechanism was optimized to improve the convergence accuracy of the algorithm and help it escape local optima. A multi-threshold image segmentation method based on WGWO was then used to segment the face images. The experimental results show that WGWO makes it easier to obtain a set of threshold values suitable for face image segmentation, and three image quality evaluation criteria verify that the proposed method has better segmentation performance than the other methods. In conclusion, the proposed method can support intelligent library face recognition technology more effectively.
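For reference, the position update that WGWO builds on can be sketched as below. This reproduces only the canonical GWO step, in which each wolf moves toward an average of positions guided by the three best wolves (alpha, beta, delta); the wormhole strategy proposed in this paper is not included, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def gwo_step(wolves, fitness, a):
    """One canonical grey wolf optimizer position update (maximization).

    `a` decreases over iterations, shifting the search from
    exploration toward exploitation. Standard GWO only; the paper's
    wormhole-strategy modification is not reproduced here.
    """
    order = np.argsort(fitness)[::-1]           # best wolves first
    alpha, beta, delta = wolves[order[:3]]
    new = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        guided = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            A = 2 * a * r1 - a                  # exploration/exploitation coefficient
            C = 2 * r2
            D = np.abs(C * leader - x)          # distance to the leader
            guided.append(leader - A * D)
        new[i] = np.mean(guided, axis=0)        # average of the three guides
    return new
```

In the segmentation pipeline, each wolf encodes a candidate threshold vector and the fitness is the (2D) Kapur's entropy of the resulting partition.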
Although the WGWO-based image segmentation method proposed in this paper can provide better-quality segmented images for face recognition systems, some shortcomings remain. First, there is still potential to improve the optimization performance of WGWO. In addition, this paper does not explore the optimal threshold level. Both points require further investigation by the authors. It would also be interesting to incorporate parallel computing into the multi-threshold image segmentation framework to boost computational efficiency.

Figure 3 .
Figure 3.The flow chart of the multi−threshold image segmentation process.


Figure 4 .
Figure 4. Face images from the image segmentation dataset.

Figure 7 .
Figure 7. Time cost of each algorithm.

Figure 8 .
Figure 8. Convergence curves of each algorithm.


Table 1 .
Details of IEEE CEC2020 benchmark functions.

Table 2 .
Comparative results of the ablation experiment.

Table 3 .
Sensitivity experiment of the parameter Ada.

Table 4 .
Sensitivity experiment of the parameter C.

Table 5 .
Nine algorithm parameter settings.

Table 7 .
FSIM ranking of each algorithm at four thresholds.

Table 8 .
PSNR ranking of each algorithm at the four thresholds.

Table 9 .
SSIM ranking of each algorithm at the four thresholds.

Table 10 .
Segmentation stability ranking and WSRT comparison results in FSIM, PSNR, and SSIM.