Metaheuristic Algorithms Applied to Color Image Segmentation on HSV Space

In this research, we propose an unsupervised method for the segmentation and edge extraction of color images in the HSV space. The approach consists of two phases, each of which applies a metaheuristic algorithm: the Firefly Algorithm (FA) in the first phase and the Artificial Bee Colony (ABC) algorithm in the second. In the first phase, we perform a pixel-based segmentation on each color channel, applying the FA algorithm and the Gaussian Mixture Model (GMM). The FA algorithm automatically detects the number of clusters, given by the histogram maxima of each single-band image. The detected maxima define the initial means for the parameter estimation of the GMM. By applying Bayes' rule, the posterior probabilities of the GMM can be used for assigning pixels to clusters. After processing each color channel, we recombine the segmented components into the final multichannel image. A further reduction in the resultant cluster colors is obtained using the inner product as a similarity index. In the second phase, once all pixels have been assigned to the corresponding classes of the HSV space, we carry out a region-based segmentation of the corresponding grayscale image. For this purpose, the bioinspired Artificial Bee Colony algorithm is used for edge extraction.


Introduction
Image segmentation is the decomposition of an image into meaningful structures. It is a key step in image processing, with the main purpose of facilitating tasks at higher levels, such as object detection, recognition and classification, in passing from image processing to image analysis and image understanding [1]. The basic goal of any image segmentation process is to subdivide an image into components belonging to different objects or to different parts of an object. Theoretically, pixels belonging to the same component should have similar properties and form a connected region [2]. During recent decades, many color segmentation methods have been proposed in the literature; for an in-depth overview, refer to [3][4][5]. A general and broad classification of image segmentation techniques is reported in [6], where a low-level taxonomy is introduced that distinguishes spatially guided methods from spatially blind ones: the former take the spatial arrangement of pixels into account, unlike the latter.
Color segmentation techniques can also be divided into three main categories: feature-space-based methods, image-domain-based methods and physics-based methods [7]. In the first class, cluster segments are generated in such a way that they are homogeneous with respect to the characteristics of the feature space, such as intensity level, color or texture. After pixels are mapped into a color space, they are allocated to clusters on the basis of their features, according to predefined similarity criteria. Generally, feature-space techniques are spatially blind: they ignore the spatial distribution of pixel colors. Histogram thresholding techniques can be ascribed to this category. In thresholding techniques, pixels are partitioned according to their intensity or color levels, resorting to a global thresholding [...] pled by chromaticity, hence RGB color space does not produce satisfactory segmentation results [21].
As for the HSV space, hue is the chromatic feature describing a pure color (red, yellow, etc.), and saturation quantifies the amount of gray in a particular color: if saturation ranges from 0 to 1, 0 represents a gray color whereas 1 is a primary color. Value is the intensity or brightness of the color, where 0 is completely black and 1 is the brightest. The hue, saturation and value (or, alternatively, intensity) components emulate the human perception of color. More precisely, hue is the dominant wavelength, whereas saturation is its purity or, more specifically, the inverse of the amount of white light contained in the color. The value component is separate from chromaticity, so it is decoupled from hue and saturation.
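As a small illustration of the three components described above, the following sketch converts a few 8-bit RGB colors to HSV with Python's standard `colorsys` module. The helper name `rgb8_to_hsv` and the sample colors are illustrative assumptions and are not taken from the original work, which was developed in Matlab.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB values to (hue, saturation, value), each in [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


# Pure red: a primary color, so saturation and value are both 1.
print(rgb8_to_hsv(255, 0, 0))
# Mid gray: saturation is 0, so the hue carries no chromatic information.
print(rgb8_to_hsv(128, 128, 128))
# Black: value is 0.
print(rgb8_to_hsv(0, 0, 0))
```

Note that the value component changes with brightness alone, which is why it can be processed separately from hue and saturation.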
After defining the color space, an illumination equalization was performed, applying a Gaussian blurring to the value channel of the image with standard deviation sigma equal to 1.5; this gives us a local average for the illumination. We have to keep in mind that color representation is sensitive to illumination, so two colors with the same chromaticity can be recognized as different if they have different lighting intensity. This fact makes the clustering processes inefficient, because pixels from the same class but with different illumination can be identified as pixels from separate classes.
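The local illumination average mentioned above can be sketched as a separable Gaussian blur of the value channel with sigma = 1.5. This is only a minimal pure-Python illustration: the kernel radius (three standard deviations), the border handling (edge replication) and the function names are assumptions, and the way the local average is subsequently used for equalization is not specified in the text, so it is not shown.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma (assumed radius)."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), n - 1)  # clamp to the border
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(value_channel, sigma=1.5):
    """Separable blur: rows first, then columns (via transposition)."""
    kernel = gaussian_kernel(sigma)
    rows_done = blur_rows(value_channel, kernel)
    return transpose(blur_rows(transpose(rows_done), kernel))
```

Here `value_channel` is assumed to be a list of rows of floats in [0, 1]; a uniformly lit region is left unchanged by the blur, while isolated bright pixels are spread over their neighborhood.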
After finishing this preprocessing step, we proceeded to a histogram-based segmentation approach applied to each color channel, which uses a metaheuristic algorithm to automatically define the number of clusters and the histogram maxima [22][23][24][25]. Metaheuristic algorithms are a class of approximate methods that explore a search space in order to find near-optimal solutions. They are iterative processes developed to search for a solution that is good enough in a time that is small enough [26]. These algorithms are frequently nature-inspired, and they have the advantage of being able to find global optima thanks to the action of multiple, randomly generated search agents [27]. Solving an optimization problem with a metaheuristic algorithm involves an initialization step that generates one or more random solutions. In each iteration step, the current solution is replaced by a new one, created by search operators, following a global optimization approach composed of two schemes: the exploitation of new solutions, with the goal of improving the quality of the solutions, and the exploration of the entire search space, to avoid being trapped in local optima.
In searching for the maxima of histogram distributions, we suggest the use of this class of optimization algorithms because they improve the chances of converging to global optima despite the presence of numerous local maxima, thanks to the simultaneous action of multiple agents moving randomly throughout the search space [28,29]. To this end, we applied the Firefly Algorithm, which employs fireflies as search agents and makes use of their idealized flashing characteristics to locate the most significant peaks of the grayscale histogram of each component [30,31].
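The peak search described above can be sketched with a textbook firefly update applied to a one-dimensional histogram, where the brightness of a firefly is the histogram count at its position. This is a hedged sketch rather than the authors' implementation: the parameter values (`beta0`, `gamma`, `alpha`, swarm size, iteration count), the damping of the random step, and the `distinct_peaks` grouping rule used to count the clusters are all assumptions introduced for illustration.

```python
import math
import random


def firefly_peaks(hist, n_fireflies=25, n_iter=60, beta0=1.0, gamma=0.01,
                  alpha=2.0, seed=0):
    """Move fireflies along the histogram axis toward brighter fireflies."""
    rng = random.Random(seed)
    hi = len(hist) - 1

    def brightness(x):
        return hist[int(round(x))]

    pos = [rng.uniform(0, hi) for _ in range(n_fireflies)]
    lum = [brightness(x) for x in pos]
    history = [max(lum)]  # best brightness seen by the swarm over time
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if lum[j] > lum[i]:
                    r = pos[j] - pos[i]
                    # Attractiveness decays with the squared distance.
                    beta = beta0 * math.exp(-gamma * r * r)
                    step = beta * r + alpha * (rng.random() - 0.5)
                    pos[i] = min(max(pos[i] + step, 0.0), float(hi))
                    lum[i] = brightness(pos[i])
        alpha *= 0.97  # assumed damping of the random walk
        history.append(max(lum))
    return pos, lum, history


def distinct_peaks(positions, hist, min_sep=5):
    """Group nearby fireflies and keep the brightest bin of each group."""
    bins = sorted(set(int(round(p)) for p in positions))
    groups = []
    for b in bins:
        if groups and b - groups[-1][-1] <= min_sep:
            groups[-1].append(b)
        else:
            groups.append([b])
    return [max(g, key=lambda b: hist[b]) for g in groups]
```

Because the attractiveness decays with distance, fireflies far from the global maximum tend to gather around nearby local peaks; the number of groups returned by `distinct_peaks` then gives a candidate number of clusters, and the selected bins give candidate initial means.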
Subsequently, the detected maxima are used as initial values of the cluster means for the parameter estimation of a Gaussian Mixture Model (GMM) [32]. A GMM is a parametric probability density function expressed as a weighted sum of Gaussian component densities, whose parameters are estimated from the data with the iterative Expectation-Maximization (EM) algorithm [33]. The coefficients of the linear combination of Gaussians can be seen as the prior probabilities of each component, while the posterior probabilities, derived by Bayes' rule, can be used for assigning pixels to clusters without resorting to a thresholding process. A univariate Gaussian density is expressed by:

$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \qquad (1)$$

where $\mathcal{N}(x \mid \mu, \sigma^2)$ represents the Gaussian or normal distribution with mean $\mu$ and standard deviation $\sigma$. A Gaussian mixture model is a weighted sum of K Gaussian component densities, analytically given by:

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k^2) \qquad (2)$$

Equation (2) represents a linear superposition of Gaussian probability densities, and the mixing coefficients $\pi_k$ indicate the weight of each distribution. After the initialization phase, the Gaussian parameters $\mu_k$, $\sigma_k^2$ and the coefficients $\pi_k$ of the linear combination are evaluated using the EM algorithm. At each iteration, the EM algorithm computes the responsibilities $\gamma_k$, defined as:

$$\gamma_k(x) = \frac{\pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k^2)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x \mid \mu_j, \sigma_j^2)} \qquad (3)$$

where $\gamma_k(x)$ represents the posterior probability that a given intensity x belongs to the k-th cluster, according to Bayes' rule. In short, the responsibilities estimate the grades of membership, indicating the degree to which data points belong to each cluster.
Consequently, a pixel of gray level $x_i$ is assigned to the cluster k whose responsibility $\gamma_k(x_i)$ is the largest as k varies, given that, by definition, $\gamma_k(x_i)$ indicates the probability that the k-th component of the GMM has generated the value $x_i$.
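The estimation and assignment steps described above can be sketched for one-dimensional intensities as follows. This is a minimal illustration under stated assumptions: the initial variance, the variance floor, the fixed number of iterations and the uniform fallback for numerically empty densities are choices made here, and the initial means are simply passed in, as they would be after the histogram peak search.

```python
import math


def normal_pdf(x, mu, var):
    """Univariate Gaussian density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, pis, mus, variances):
    """Posterior probability of each component for intensity x (Bayes' rule)."""
    weighted = [p * normal_pdf(x, m, v) for p, m, v in zip(pis, mus, variances)]
    total = sum(weighted)
    if total == 0.0:  # every density underflowed; fall back to uniform
        return [1.0 / len(pis)] * len(pis)
    return [w / total for w in weighted]


def fit_gmm(data, init_means, init_var=100.0, n_iter=50, var_floor=1e-3):
    """EM for a 1-D mixture; init_means come from the detected maxima."""
    k = len(init_means)
    pis = [1.0 / k] * k
    mus = list(init_means)
    variances = [init_var] * k
    for _ in range(n_iter):
        # E-step: responsibilities of every component for every sample.
        gammas = [responsibilities(x, pis, mus, variances) for x in data]
        # M-step: re-estimate weights, means and variances.
        for c in range(k):
            nk = sum(g[c] for g in gammas)
            if nk == 0.0:
                continue
            mus[c] = sum(g[c] * x for g, x in zip(gammas, data)) / nk
            var = sum(g[c] * (x - mus[c]) ** 2 for g, x in zip(gammas, data)) / nk
            variances[c] = max(var, var_floor)
            pis[c] = nk / len(data)
    return pis, mus, variances


def assign(x, pis, mus, variances):
    """Index of the component with the largest responsibility for x."""
    g = responsibilities(x, pis, mus, variances)
    return max(range(len(g)), key=g.__getitem__)
```

For example, with intensities concentrated around two gray levels and initial means placed near them, `fit_gmm` refines the means and `assign` labels each gray level without any explicit threshold.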
The results of the segmentation process applied to each color channel are obtained independently and are then recombined to compose the final segmented color image. After that, we reduce the number of distinct colors by exploiting the parallelism of the vectors that represent them in the HSV space. In order to do so, we recall the fundamental laws of colorimetry, which state [34]:

1. Any color can be defined by three values, and the combination of the three components is unique;
2. Two colors are equivalent after multiplying or dividing the three components by the same number;
3. The luminance of a mixture of colors is equal to the sum of the luminance of each color.
According to the second statement, we have chosen the cross product to identify parallel vectors in the HSV space. Indeed, if two vectors have the same direction, or equivalently if they are linearly dependent, their cross product is zero. In this context, if the cross product is approximately null it implies that colors are very similar, so they can be considered as belonging to the same class.
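The parallelism test described above can be sketched as follows. The tolerance `tol`, the normalization of the cross product by the two vector lengths (which yields the sine of the angle between the vectors) and the greedy merging strategy are assumptions introduced for illustration; the text only states that an approximately null cross product marks two colors as similar.

```python
import math


def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def nearly_parallel(a, b, tol=0.05):
    """True when |a x b| / (|a||b|), the sine of the angle, is below tol."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return na == nb  # two null vectors are treated as the same color
    return norm(cross(a, b)) / (na * nb) <= tol


def merge_similar(colors, tol=0.05):
    """Greedy merge: each color joins the first nearly parallel representative."""
    representatives, labels = [], []
    for c in colors:
        for idx, rep in enumerate(representatives):
            if nearly_parallel(c, rep, tol):
                labels.append(idx)
                break
        else:
            representatives.append(c)
            labels.append(len(representatives) - 1)
    return representatives, labels
```

Since the three components are non-negative, two vectors with a vanishing cross product point in the same direction rather than in opposite ones, so the test does not confuse opposite vectors.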
At the end of the process, the performance of a clustering algorithm must be estimated. The academic literature in this field has suggested several performance metrics to assess the validity of cluster partitions [35,36]. Basically, three different techniques are implemented for evaluating the efficiency of clustering algorithms: external criteria, internal criteria and relative criteria [37]. The external validity methods evaluate the clustering results based on their comparison to an externally known result, such as manual image segmentation performed by human users. The internal measures estimate the goodness of a clustering process without considering external information but using the data set itself. Finally, relative clustering validation evaluates the clustering structure by varying different parameter values for the same algorithm (for example changing the number of clusters).
In this work, the algorithm efficiency is computed through an internal clustering validation approach based on the mean-squared error (MSE) [38]. Usually, the MSE is used for assessing the distortion between the original image and the resulting image. For color images, the formula is extended to include the three components:

$$\mathrm{MSE} = \frac{1}{p \, n \, m} \sum_{k=1}^{p} \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ I(i,j,k) - \hat{I}(i,j,k) \right]^2$$

where $I(i,j,k)$ and $\hat{I}(i,j,k)$ are, respectively, the original and the segmented image, p is the number of image components (p = 3 for color spaces), and n·m is the size of each component. A low MSE value means that the predicted values are close to the real values. In practice, the RMSE will be used, defined by $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$. The RMSE depends on the order of magnitude of the observed values; therefore, it can vary significantly from one application to the next. To resolve this problem, we could consider the relative absolute error (RAE). Small values of these validation indices imply that the estimated image is close to the initial one.
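A minimal sketch of the MSE and RMSE computation for a multichannel image is given below. It assumes that an image is stored as a list of p channels, each a list of n rows of m values; this layout and the function names are illustrative assumptions. The RAE is not included because its definition is not reproduced here.

```python
import math


def mse(original, segmented):
    """Mean-squared error averaged over all channels and pixels."""
    total, count = 0.0, 0
    for chan_o, chan_s in zip(original, segmented):
        for row_o, row_s in zip(chan_o, chan_s):
            for a, b in zip(row_o, row_s):
                total += (a - b) ** 2
                count += 1
    return total / count


def rmse(original, segmented):
    """Root-mean-square error, expressed in the units of the pixel values."""
    return math.sqrt(mse(original, segmented))
```

An image compared with itself gives an MSE of zero, and the value grows as the segmented image departs from the original one.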

Edge Extraction Applying Artificial Bee Colony Algorithm
After the color segmentation process, we proceed to a region-based segmentation of the corresponding gray image in order to extract the edges of homogeneous components. The first step in region growing is to select a set of seed points; each region then begins to grow from the location of its seed. In the present work, the selection of the initial seed points was achieved through the Artificial Bee Colony algorithm.
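Region growing from a seed point can be sketched as a breadth-first flood fill over pixels with the same gray level. The use of 4-connectivity, the exact-equality criterion (reasonable for an already segmented image with uniform regions) and the `labels` grid with `None` marking unassigned pixels are assumptions made for this illustration.

```python
from collections import deque


def grow_region(gray, seed, labels, label):
    """Grow a region of uniform gray level from seed; return its pixel count.

    labels is a grid of the same shape as gray, holding None for pixels that
    have not yet been assigned to any region. It is updated in place.
    """
    rows, cols = len(gray), len(gray[0])
    r0, c0 = seed
    if labels[r0][c0] is not None:
        return 0  # the seed already belongs to a region
    target = gray[r0][c0]
    labels[r0][c0] = label
    queue = deque([seed])
    size = 1
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-connectivity
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and labels[nr][nc] is None and gray[nr][nc] == target):
                labels[nr][nc] = label
                queue.append((nr, nc))
                size += 1
    return size
```

Calling the function repeatedly with new seeds and new labels partitions the image into connected regions; the boundary of each labeled region then provides its edges.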
The ABC algorithm is a swarm-based metaheuristic algorithm that was introduced by Karaboga in 2005 [39] for optimizing numerical problems. It is based on a model of the intelligent foraging behavior of honeybee colonies in nature [40,41]. In the ABC algorithm, the colony of artificial bees consists of three groups: employed bees, onlookers and scouts [25]. Employed bees are those that have discovered a food source. An employed bee whose food source is abandoned becomes a scout bee and starts a new random search around the hive. The exchange of information among bees is the most important occurrence in the formation of collective knowledge. Communication among bees about the quality of food sources takes place in the dancing area of the hive; this dance is called the waggle dance. After localizing a source, employed bees share the nectar and position information of the food sources with the onlooker bees by executing the waggle dance. An onlooker bee evaluates the nectar information taken from the employed bees and chooses a food source with a probability related to its nectar amount, so that the most profitable sources are favored [42]. In this context, we have adapted the ABC method by considering as food sources the areas of the gray image whose pixels have not yet been assigned to any cluster. The larger the area, the more profitable the source. The onlooker bees come to the aid of the employed bees in a number proportional to the size of the identified food source and to the number of its pixels not yet grouped into any homogeneous region.
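The proportional recruitment of onlookers can be sketched as follows, assuming that the nectar amount of each food source is an integer count of unassigned pixels. The largest-remainder rounding used to distribute a fixed number of onlookers is an assumption made here so that the counts add up exactly; the text only states that the number of onlookers is proportional to the size of the source.

```python
def selection_probabilities(nectar):
    """Probability of choosing each source, proportional to its nectar."""
    total = sum(nectar)
    if total == 0:
        return [1.0 / len(nectar)] * len(nectar)
    return [n / total for n in nectar]


def allocate_onlookers(nectar, n_onlookers):
    """Split n_onlookers among sources proportionally to integer nectar."""
    total = sum(nectar)
    if total == 0:
        return [0] * len(nectar)
    counts = [(n * n_onlookers) // total for n in nectar]
    remainders = [(n * n_onlookers) % total for n in nectar]
    # Hand the leftover bees to the sources with the largest remainders.
    order = sorted(range(len(nectar)), key=lambda i: remainders[i], reverse=True)
    for i in order[: n_onlookers - sum(counts)]:
        counts[i] += 1
    return counts
```

With three sources holding 50, 30 and 20 unassigned pixels and ten onlookers, the allocation is five, three and two bees, respectively.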

Results of the Color Image Segmentation Method
As an initial test image, we considered the image #295087 extracted from the Berkeley Segmentation Dataset BSDS500. The original image contains 61,258 unique colors (Figure 1).
During the segmentation of the hue component, the FA algorithm identified four different clusters, with intensities 16, 35, 124 and 147, respectively; the final outcome is shown in Figure 2. As we can see, two predominant hues appear in the original image: the first ranges from ocher to dark brown, and the second is due to the blue sky in the background. The validation of the grayscale segmentation was performed using the Root-Mean-Square Error (RMSE) and the Normalized Correlation Coefficient (NK) [43][44][45]. For the hue component, we obtained RMSE = 0.0247 and NK = 0.98.
Regarding the value component, the great variability of the histogram gives rise to seven different clusters, with gray intensities equal to 44, 81, 115, 146, 162, 183 and 224. Even in this case, we obtained very reliable results, with RMSE = 0.0363 and NK = 0.9899 (Figure 4).
As previously asserted, the variability in adequate solutions for image segmentation is an intrinsic and unavoidable feature, primarily due to the differences in the level of attention, the degree of detail perceived by one human observer compared to another, and the type of represented objects in which the user is interested. However, when pixel colors are projected onto three components, the color information is widely scattered; therefore, one of the difficulties of color image processing is how to employ this great amount of information. To address this, we performed a color reduction based on the evaluation of the inner product among vectors in the HSV space. Figure 6 allows us to compare the segmented images after cluster reduction. Applying the procedure iteratively, we first obtained a reduction of 34%, passing from an initial 132 clusters to 87, and then a further reduction of 34.4% with respect to the previous result, reducing the colors to 57 and finally to 37 (Figure 6). Table 1 shows the values of RMSE and RAE between the original image (Figure 1) and the successive segmented ones (Figure 6a-c).
The BSD image #295087 represents a case with a low color content, essentially blue, brown and green, but with a high texture content, as we can notice by observing the chromatic distribution of the original image shown in Figure 7.
In Figures 8 and 9, the significant color reduction is highlighted through the three-dimensional scatter diagrams and the two-dimensional chromatic distributions related to the segmented images with 132, 87, 57 and 37 distinct colors, respectively. The marked decrease in the number of colors may prevent over-segmentation, since pixels with similar colors are merged.
This method has also been applied to some other images extracted from the Berkeley Segmentation Dataset BSDS500. In the test image #118035 of BSD, the initial 23,786 unique colors are reduced to 19 (Figure 10).
The training image #35010 of BSD contains 61,267 colors, whereas the final image is represented with only 219 different colors. Nevertheless, the basic chromatic characteristics of the butterfly and the surroundings are preserved (Figure 11).
Analyzing the training image #296059, the initial 27,871 colors are reduced to 48 different colors, and the resulting segmented image is shown in Figure 12.
The complexity of the ground texture and the elephant skin is strongly simplified, while the tusks are still clearly distinguishable.
For image #198023 with 31,863 colors, the reduction gives rise to 157 colors. With a further reduction to 124 different colors, the small squares of the grating behind the woman are no longer distinguishable, which makes the foreground easier to separate from the background (Figure 13).

Results of Edge Extraction Applying Artificial Bee Colony Algorithm
In this paper, scout bees initially move in the search space, which is the gray image, describing random paths, each of which is a piecewise linear curve composed of a connected sequence of M arbitrary line segments. The trajectories of the scout bees are defined by parametric equations in which v_0 is the initial velocity, θ ∈ [0; 2π] and rand(1) is a generator of random numbers uniformly distributed in the interval [0; 1]; the end points of each line segment determine the positions of unemployed bees during their flights. Initially, the paths are confined inside the image space, so as not to go beyond its borders. If a scout bee finds a food source along its path, that is, a zone with unclassified pixels, the growing process is activated starting from the current position; otherwise, the bee keeps going undisturbed. Once the region with uniform gray intensity is outlined, its edges are extracted and the bounding box of the boundaries is determined (Figure 14).
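A random piecewise linear scout path can be sketched as follows. The update rule used here is an assumption and not the parametric equations of the original work: each segment is given a random direction θ drawn uniformly from [0, 2π] and a length equal to v0 multiplied by a uniform random number in [0, 1], and the end points are clamped so that the path stays inside the image.

```python
import math
import random


def scout_path(width, height, n_segments, v0, rng):
    """Return the n_segments + 1 end points of a random piecewise linear path."""
    x = rng.uniform(0, width - 1)
    y = rng.uniform(0, height - 1)
    path = [(x, y)]
    for _ in range(n_segments):
        theta = rng.uniform(0.0, 2.0 * math.pi)  # random heading
        step = v0 * rng.random()                 # assumed random segment length
        # Clamp the end point so the bee never leaves the image.
        x = min(max(x + step * math.cos(theta), 0.0), width - 1)
        y = min(max(y + step * math.sin(theta), 0.0), height - 1)
        path.append((x, y))
    return path


# Example: a path of 10 segments inside a 64 x 48 image.
example = scout_path(64, 48, 10, 15.0, random.Random(1))
print(len(example))
```

Each end point, rounded to pixel coordinates, can then be tested as a candidate seed: if the pixel is still unclassified, region growing starts from it.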
At this point, employed bees share their food source information with the onlooker bees waiting in the hive, and the onlooker bees then choose their food sources depending on this information. The scout bees come back to the hive to execute the waggle dance in order to involve the onlooker bees in the exploitation phase. In the present application of the ABC algorithm, the fitness values are computed as the percentage of pixels not yet assigned that are included inside the bounding box of the extracted regions. Then, the onlooker bees give rise to a local search, rushing to the scouts' aid in proportion to the number of unclassified pixels and to the size of the rectangle containing the extracted boundary (Figures 15 and 16).
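The fitness described above can be sketched as the fraction of unassigned pixels inside the bounding box of an extracted region. The representation of the label grid (with `None` for unassigned pixels) and the inclusive box convention `(row_min, col_min, row_max, col_max)` are assumptions made for this illustration.

```python
def bounding_box(pixels):
    """Inclusive bounding box (row_min, col_min, row_max, col_max) of pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def fitness(labels, box):
    """Fraction of pixels inside box that are not yet assigned to a region."""
    r0, c0, r1, c1 = box
    area = (r1 - r0 + 1) * (c1 - c0 + 1)
    unassigned = sum(1
                     for r in range(r0, r1 + 1)
                     for c in range(c0, c1 + 1)
                     if labels[r][c] is None)
    return unassigned / area
```

A box that still contains many unclassified pixels has a high fitness and therefore attracts more onlookers for the local search.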
The extracted edges of the segmented image #295087 of BSD with 37 clusters are displayed in Figure 17. The algorithms, developed with Matlab, are able to detect even regions of low significance, which could subsequently be excluded on the basis of their area or perimeter length.

Discussion
This work performed a color image segmentation relying on metaheuristic and nature-inspired algorithms. The algorithms are applied to the hue, saturation and value components separately. Thus, this method belongs to the category of monochrome segmentation approaches, which can be considered as a dimensional extension of grayscale image methods.
The metaheuristic Firefly Algorithm automatically evaluates the number of clusters and their initial centroids. Subsequently, the outcomes of FA are used as initial means to estimate the Gaussian Mixture Model. The multilevel image is obtained by recombining the three segmented components. A further color reduction is performed through the use of the inner product, as an index of similarity among colors. The validation analysis has been carried out using different standard measures, showing that the method is fairly robust and reliable.
Concerning the spatial segmentation, the application of the probabilistic ABC algorithm extracts the boundaries of the segmented regions quickly. The erratic motion of the scout bees makes it possible to detect edges even if the size of the regions is very small. This is due to the local search activated by the onlookers rushing to the scouts' aid, which makes the search more effective and detailed. In this context, the region-based approach is performed on the segmented grayscale image rather than on the color one. This is a limitation of the method applied here, which will have to be overcome in future research. While Balasubramanian et al. [46] applied the region-growing method to color images with a dynamic color gradient thresholding, in this research the choice of operating with grayscale images is crucial for the application of the ABC metaheuristic algorithm in image processing, which is also one of the aims of the present work.