Fuzzy Color Aura Matrices for Texture Image Segmentation

Fuzzy gray-level aura matrices have been developed from fuzzy set theory and the aura concept to characterize texture images. They have proven to be powerful descriptors for color texture classification. However, using them for color texture segmentation is difficult because of their high memory and computation requirements. To overcome this problem, we propose to extend fuzzy gray-level aura matrices to fuzzy color aura matrices, which allows us to apply them to color texture image segmentation. Unlike the marginal approach, which requires one fuzzy gray-level aura matrix per color channel, a single fuzzy color aura matrix suffices to locally characterize the interactions between colors of neighboring pixels. Furthermore, all previous works on fuzzy gray-level aura matrices consider the same neighborhood function for each site. Another contribution of this paper is to define an adaptive neighborhood function based on information about neighboring sites provided by a pre-segmentation method. For this purpose, we propose a modified simple linear iterative clustering algorithm that incorporates a regional feature in order to partition the image into superpixels. All in all, the proposed color texture image segmentation boils down to a superpixel classification using a simple supervised classifier, each superpixel being characterized by a fuzzy color aura matrix. Experimental results on the Prague texture segmentation benchmark show that our method outperforms classical state-of-the-art supervised segmentation methods and performs comparably to recent methods based on deep learning.


Introduction
Color texture segmentation consists of partitioning an image into homogeneous regions with respect to color and texture properties. It is involved in various fields, such as medical image analysis [1], remote sensing [2], synthetic aperture radar [3], and fruit detection [4]. Although a wide variety of techniques have been developed, color texture segmentation remains an open and challenging problem due to the high variability of textures, combined with the great diversity of colors. Most approaches characterize each pixel by a set of texture and color features, and use them to assign one color texture class to each pixel thanks to classification algorithms [5,6]. Sections 1.1 and 1.2 discuss the state of the art of color texture features and classification algorithms used for color texture segmentation.

Color Texture Features
The characterization of color textures is a fundamental problem in computer vision, where a texture is generally described by some visual cues represented by statistical structures, hereafter called texture features. Pixel colors are often represented by the values or by statistic measures of color components (R, G, B) or by those derived from color spaces such as HSV, L * u * v * , or L * a * b * [7]. Various texture features are designed to characterize texture appearance. Widely used ones are based on Gabor filters, wavelet transform, Markov random field model, local binary patterns, and co-occurrence matrices [8][9][10].
Color texture classification/segmentation techniques can be broadly classified into two approaches according to how they combine color texture and color features [5,8]. In the first approach, color and texture features are computed separately and are then combined by the clustering or classification process. For example, in [11], L * , u * , and v * components were used as color features and the output of Gabor filters as texture features. In [12], the authors used the mean, standard deviation, and skewness of each color channel, H, S, and V, as color features, and the co-occurrence matrices computed on the intensity image were considered as texture features.
In the second approach, color and texture features are assumed to be mutually dependent. With this approach, marginal, opponent, and vector strategies have been developed to compute color texture features [5,8]. The marginal strategy assumes that texture can be separately described within each color channel, where pixels are characterized by only one color component. Texture features designed for gray-level images are then computed for each color channel and aggregated into a global feature representation. In [13], for instance, texture features were extracted from wavelet transform and co-occurrence matrices for each color channel. In [14], Leung-Malik filter banks were applied on each color channel for color texture feature extraction. The opponent strategy extends texture feature extraction to color images thanks to both within- and between-channel analyses. The within-channel part consists of computing features from each color channel separately (as in the marginal strategy), whereas between-channel (opponent) features are obtained by processing pairs of color channels jointly. This strategy was adopted by Panjwani et al. [15], who used Markov random field models to characterize a texture in terms of spatial interaction within each color channel and interaction between different color channels. The vector strategy takes advantage of color vector information. It makes it possible to analyze relationships between colors of neighboring pixels, as in the case of gray-level images. Examples of this strategy can be found in [9,16], where local binary patterns were computed from a color image. It should be noted that the vector strategy is more suitable for characterizing color texture because it is less memory-consuming and fully takes the correlation between colors of neighboring pixels into account.

Color Texture Image Segmentation by Pixel Classification
Color texture segmentation methods can also be categorized into three groups, depending on whether the pixel classification is performed in a supervised, unsupervised, or semi-supervised manner [1,11,17]. In supervised segmentation, prior knowledge about the training samples and their class labels is needed to classify the input image pixels. Some classical supervised classifiers are used for texture segmentation [1,4,17]. Among them are the K-nearest neighbor (KNN) [18] and the Bayesian [19] classifiers, the support vector machine (SVM) [20], random forest [12], Markov random field [11], and neural networks [14]. Supervised color texture segmentation based on deep learning has been developed in the last decade [21][22][23]. These efficient methods generally use convolutional neural networks (CNN), such as the U-Net [24], deep visual model (DA) [25], pyramid scene parsing network (PSP-Net) [26], supervised fully convolutional network for texture (FCNT) [21], and empirical wavelet transform-based fully convolutional network for texture (EWT-FCNT) [23].
Unsupervised segmentation does not require prior knowledge and discovers different classes by clustering pixels from their features only [5,6]. For example, the popular K-means clustering algorithm is mainly employed to perform the classification of pixels [6,27].
Semi-supervised segmentation is suitable where only partial prior knowledge about the training samples and their class labels is available. For instance, in [17], the constrained spectral clustering algorithm was applied for semi-supervised classification of pixels characterized by color texture features.
Supervised segmentation methods, especially deep learning based methods, provide better results than unsupervised and semi-supervised approaches thanks to prior information [17]. It should also be noted that some of the above methods perform segmentation at the pixel level and others at the "superpixel" level. This is the case for the methods discussed in [1,3]. Recall that superpixels group neighboring pixels based on their spatial and color similarity, and are used as samples in order to speed up color texture segmentation.

Fuzzy Color Texture Features
All aforementioned methods assume that images are crisp and free from vagueness. However, in practice, color images carry more or less fuzziness because of the imprecision inherent to the discretization of both the spatial domain (sampling) and the color component levels (quantization). Therefore, boundaries separating the various image regions are not precisely defined, and pixel levels are imprecise measures of the reflectance of surfaces observed by the camera. Furthermore, the assumption that texture images are mainly represented by spatial repetitions of a pattern may not be valid any longer. Texture analysis techniques based on fuzzy concepts have then been proposed in order to take this imprecision into account.
For instance, fuzzy histograms [30], fuzzy local binary patterns [31], and local fuzzy patterns [32] are extracted from gray-level texture images. Fuzzy gray-level co-occurrence matrices (FGLCMs) have also been proposed to characterize spatial interactions between gray levels of neighboring pixels [33][34][35][36][37]. However, characterizing a color texture by at least three FGLCMs (one per channel) is memory expensive. To overcome this drawback, Ledoux et al. [38] proposed to extend FGLCMs to color images by defining fuzzy color sets. A color image is then represented by one single fuzzy color co-occurrence matrix (FCCM) that characterizes the local interactions between colors of neighboring pixels. Moreover, FGLCMs and FCCMs only consider spatially-invariant neighborhoods. Hammouche et al. [39] showed that adaptive neighborhoods are useful for texture analysis and provided an elegant formalism to deal with spatially-variant neighborhoods thanks to the aura concept.
In the framework based on the aura set concept, Elfadel and Picard [40] proposed a generalization of GLCMs called gray-level aura matrices (GLAMs). A GLAM quantifies the presence of a set of pixels with a specified level in the neighborhood of another set of pixels having another level. The amount of neighboring pixels with the specified level is quantified by means of the aura measure. GLAMs are used for texture representation and synthesis [41,42], image retrieval [43,44], classification [45][46][47][48][49][50], and segmentation [51][52][53]. A generalization of GLAMs to the fuzzy framework has been proposed by Hammouche et al. [39]. Representing each color channel by a fuzzy GLAM (FGLAM) outperforms the FGLCM representation for texture classification. Recently, FGLAMs have been used to improve the accuracy of wood species classification [54]. However, as for FGLCMs, the computation of FGLAMs for each pixel is memory and time expensive, which makes their use for color image segmentation a challenge.
To circumvent these constraints, we adopt a vector strategy and propose to extend FGLAMs to fuzzy color aura matrices (FCAMs). An FCAM makes it possible to locally characterize the interactions between colors of neighboring pixels. While one FGLAM must be computed for each color channel, a single low-dimensional FCAM is required to describe the color texture.

FCAM for Image Segmentation by Superpixel Classification
In this study, we applied FCAM to color texture image segmentation. As FCAM is based on a locally-adaptive neighborhood, it can characterize the texture represented by small connected pixel subsets with different shapes, i.e., by superpixels. We use the simple linear iterative clustering (SLIC) scheme [55] to generate superpixels from a color image. These are then classified using a simple supervised classifier to segment the color texture image.
The remainder of this paper is organized as follows. In Section 2, we first give an overview of the basic SLIC algorithm; then we propose a modified version that incorporates regional information. Section 3 introduces some definitions of the fuzzy color aura concept, and explains how to characterize the color texture of each superpixel by an FCAM. In Section 4, we first present the datasets used in the experiments; then we give details about the proposed color texture segmentation method based on FCAMs. Next, we assess the regional SLIC algorithm and discuss its parameter settings. Last, we compare the segmentation results achieved by our supervised segmentation approach with those obtained by fuzzy texture features and by several other state-of-the-art color texture segmentation methods. Concluding remarks about the contribution of this paper are given in Section 5.

Superpixel
The proposed color texture image segmentation is based on the classification of superpixels. A superpixel is a compact set of connected sites with similar properties. It is usually adopted to replace the pixel grid in an image in order to reduce the computational burden of subsequent processing. Superpixels are generated by over-segmentation algorithms using color, spatial, and/or texture information. Among superpixel generation algorithms, simple linear iterative clustering (SLIC) is widely used due to its simplicity, speed, and ability to adhere to image boundaries. In this section, we briefly give an overview of the SLIC algorithm used to generate superpixels. Then, we propose a modified version that incorporates regional information.

Basic SLIC
The SLIC algorithm [55] generates a desired number of regular superpixels, with a low computational overhead, by clustering sites based on their spatial and color features. Let I be an RGB image defined on a lattice S, such that each site s ∈ S is characterized by three color components: I(s) = (I_R(s), I_G(s), I_B(s)). The RGB color components are transformed into the L*a*b* color space, so that SLIC represents each site s(x_s, y_s) by a five-dimensional feature vector (I_L(s), I_a(s), I_b(s), x_s, y_s). SLIC follows a k-means clustering strategy but searches the nearest cluster center according to the distance D(s, s′) = √(d_c²(s, s′) + m² · d_s²(s, s′)/S²) between two sites s and s′, where d_c²(s, s′) = (I_L(s) − I_L(s′))² + (I_a(s) − I_a(s′))² + (I_b(s) − I_b(s′))² and d_s²(s, s′) = (x_s − x_{s′})² + (y_s − y_{s′})² measure the color and spatial proximity, respectively. The compactness parameter m is set to its default value of 1, and the maximum spatial distance within a cluster is defined as the sampling step S = √(|S|/P), where |S| is the total number of sites in the image and P is the desired number of superpixels. Figure 1b shows the segmentation achieved by SLIC on the color image of Figure 1a. It should be emphasized that SLIC applies a post-processing step to enforce connectivity by merging small isolated superpixels with nearby larger ones. Therefore, the actual number of superpixels produced by SLIC can be slightly lower than the desired number of superpixels.
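As an illustration of this clustering criterion, the SLIC distance between two sites can be sketched in a few lines of Python (a minimal sketch; the function name and default values are ours, not from [55]):

```python
import numpy as np

def slic_distance(f1, f2, m=1.0, S=25.6):
    """SLIC distance between two sites given as (L, a, b, x, y) vectors.

    Combines the color proximity d_c and the spatial proximity d_s as
    D = sqrt(d_c^2 + m^2 * d_s^2 / S^2), with compactness m and sampling step S.
    """
    f1, f2 = np.asarray(f1, float), np.asarray(f2, float)
    d_c2 = np.sum((f1[:3] - f2[:3]) ** 2)  # squared color distance in L*a*b*
    d_s2 = np.sum((f1[3:] - f2[3:]) ** 2)  # squared spatial distance (x, y)
    return np.sqrt(d_c2 + m ** 2 * d_s2 / S ** 2)
```

A small compactness m (here 1, as in the paper) favors color proximity over spatial regularity of the superpixels.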

Regional SLIC
The basic SLIC algorithm achieves good pre-segmentation on color images but may fail to find the boundaries of textures. Indeed, it successfully detects homogeneous regions but somewhat fails to separate areas with different textures (see Figure 1a and the arrows in Figure 1b). To improve this pre-segmentation, we propose a modified version of the algorithm, referred to as regional SLIC, that takes color, spatial, and regional information into account. From the basic SLIC pre-segmentation result, we compute the regional feature f_R of each superpixel R as in [56] (Equation (1)), where p_R = |R|/|S| is the area ratio of R to the whole image and N_R is the set of superpixels adjacent to R. The regional feature f_R is the sum of two terms: the first reflects the size of the superpixel R and monotonically increases with it (because p_R ∈ ]0, 1[), and the second reflects the superpixel context by taking the influence of adjacent superpixels into account. A small superpixel surrounded by small superpixels yields a low value of f_R; conversely, if both R and its adjacent superpixels are large, f_R is high (see Figure 1c). At each site s of superpixel R, we replace the luminance component I_L(s) by the regional feature f_R and we apply the SLIC algorithm again with (f_R, I_a(s), I_b(s), x_s, y_s) as the feature vector (see Algorithm 1). Thanks to this regional SLIC method, the lattice S is partitioned into P superpixels {P_p}_{p=1}^P, i.e., ∪_{p=1}^P P_p = S and P_p ∩ P_{p′} = ∅ for p ≠ p′. A superpixel P_p is defined as a set of connected sites, i.e., any two sites s and s′ in P_p are connected by at least one path composed of sites in P_p. Figure 1d shows the final superpixels extracted from the color texture image of Figure 1a.
Note (see arrows) that areas with different textures are better delineated by the regional SLIC than by its basic version, even though the regional feature only describes the superpixel contextual information in the pre-segmented image and cannot be considered as a texture feature.

Algorithm 1 Regional SLIC.
Input: RGB image I
1. Convert I from RGB to the L*a*b* color space;
2. Apply SLIC to S, where each site s is characterized by (I_L(s), I_a(s), I_b(s), x_s, y_s);
3. Extract the regional feature f_R of each superpixel R of the SLIC map using Equation (1);
4. Apply SLIC to S, where each site s is characterized by (f_R, I_a(s), I_b(s), x_s, y_s).
Output: Partition of I into superpixels {P_p}_{p=1}^P
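The two ingredients of the regional feature, the area ratio p_R and the adjacency set N_R, can be collected from a SLIC label map as follows (a sketch in Python; the exact combination of both terms into f_R is given by Equation (1), following [56], and is not restated here):

```python
import numpy as np

def superpixel_stats(labels):
    """From a superpixel label map, compute the area ratio p_R of each
    superpixel R and its adjacency set N_R (labels of the superpixels
    touching R), i.e., the two ingredients of the regional feature f_R."""
    n = labels.size
    p = {int(r): np.sum(labels == r) / n for r in np.unique(labels)}
    adj = {r: set() for r in p}
    # scan horizontally and vertically adjacent site pairs with different labels
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        edge = a != b
        for u, v in zip(a[edge].ravel(), b[edge].ravel()):
            adj[int(u)].add(int(v))
            adj[int(v)].add(int(u))
    return p, adj
```

Replacing the luminance channel by f_R then only requires assigning, at each site, the value computed for the superpixel it belongs to.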

Fuzzy Color Aura
In this section, we extend the aura concept introduced by Elfadel and Picard for gray-level images [40] to color images. We explain how it is used to characterize a superpixel thanks to a matrix of aura cardinals. As this matrix is huge when all RGB colors are considered, we propose to extend the color aura concept to fuzzy colors, whose number is much smaller.

Color Aura Set
Let I be an RGB image defined on a lattice S, such that each site s ∈ S is characterized by its color I(s). For a given color x ∈ RGB, we define the set of sites with this color as S_x = {s ∈ S, I(s) = x}. Then, {S_x}_{x∈RGB} is a partition of S, i.e., ∪_{x∈RGB} S_x = S and S_x ∩ S_{x′} = ∅ for x ≠ x′. Given two colors (x, x′) ∈ RGB², we define the color aura set A_{S_x}(S_{x′}) of S_{x′} with respect to S_x as:

A_{S_x}(S_{x′}) = ∪_{s∈S_{x′}} (N_s ∩ S_x),  (2)

where N_s ∩ S_x is the set of neighboring sites of color x of each site s ∈ S_{x′}, and

N_s = {r ∈ S, ‖r − s‖_∞ ≤ d}  (3)

is the neighborhood of a site s. In this paper, d = 1, such that N_s is only composed of the eight closest sites of s.
A_{S_x}(S_{x′}) is the subset of S_x composed of the sites that are present in the neighborhood of those of S_{x′}. It quantifies the presence of S_x in the neighborhood of S_{x′}. The color aura set is a generalization to the color case of the gray-level aura set introduced by Elfadel and Picard [40]. Figure 2a shows a color image, defined on a lattice S of 7 × 7 pixels, and three color sets, S_R, S_G, and S_B. For the neighborhood N_s, the aura set A_{S_G}(S_B) of S_B with respect to S_G is composed of the green sites marked as circles. Note that it differs from the aura set A_{S_B}(S_G) of S_G with respect to S_B, composed of the blue sites marked as diamonds.

Color Aura Set in a Superpixel
Rather than building an aura set from all image sites, we focus on a superpixel P_p and define the color aura set A^p_{S_x}(S_{x′}) of S_{x′} with respect to S_x within the superpixel P_p as:

A^p_{S_x}(S_{x′}) = ∪_{s ∈ S_{x′} ∩ P_p} (N^{p,s} ∩ S_x),  (4)

where N^{p,s} is the neighborhood of each site s restricted to the superpixel P_p:

N^{p,s} = N_s ∩ P_p.  (5)

The image of Figure 2a contains two textures, one represented by red and green vertical stripes on the three left columns, and another represented by red and blue horizontal stripes on the three right columns. These two textures are separated by the fourth column, composed of red, green, and blue sites. Figure 2b shows the partitioning of this image into two superpixels delimited by a vertical black line: P_1 covers the four left columns and P_2 the three right ones. The aura sets A¹_{S_G}(S_B) and A¹_{S_B}(S_G) are not empty, since P_1 contains neighboring green and blue sites, but A²_{S_G}(S_B) and A²_{S_B}(S_G) are empty because P_2 contains no green site. Figure 2c shows another partition where P_1 covers the three left columns and P_2 the four right ones. In that case, A¹_{S_G}(S_B) and A¹_{S_B}(S_G) are empty because P_1 contains no blue site. This example illustrates that color aura sets depend on superpixel edges. Indeed, along the fourth column that separates the two textures, only one green site (on the first row) and one blue site (on the second row) belong to the aura sets of superpixels in both Figure 2b,c.

Color Aura Cardinal
The aura measure was introduced by Elfadel and Picard [40] to characterize an aura set by a number that expresses the amount of mixing between neighboring site sets. We use here a simpler measure [39] and quantify the color aura set of a color site set S_{x′} with respect to another color site set S_x within the superpixel P_p thanks to its cardinal, defined as:

m_p(x, x′) = |A^p_{S_x}(S_{x′})|.  (6)

The aura cardinal measures for all the possible pairs of color sets, (S_x, S_{x′}), (x, x′) ∈ RGB², are gathered in a matrix. This color aura cardinal matrix [m_p(x, x′)] can then be considered as the texture feature of the superpixel P_p. However, when the color components of the image are defined on 256 levels, as is classically the case, the number of possible colors reaches 256³ and m_p would be of size 256³ × 256³. Such a huge memory requirement is intractable in practice, so we propose to decrease the number of analyzed colors by introducing fuzzy colors.
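For a quantized color image, the color aura cardinal matrix of one superpixel can be computed directly from its definition (a brute-force Python sketch; variable names are ours): each site of color x contributes once to m_p(x, x′) for every distinct color x′ found in its neighborhood within the superpixel.

```python
import numpy as np

def color_aura_matrix(img_q, mask, n_colors, d=1):
    """Color aura cardinal matrix of one superpixel.

    img_q : 2-D array of quantized color indices in [0, n_colors).
    mask  : boolean array, True inside the superpixel P_p.
    m[x, xp] counts the sites of color x that have, within the superpixel,
    at least one neighbor (Chebyshev radius d) of color xp.
    """
    h, w = img_q.shape
    m = np.zeros((n_colors, n_colors), dtype=int)
    for i, j in zip(*np.nonzero(mask)):
        neigh_colors = set()
        for di in range(-d, d + 1):
            for dj in range(-d, d + 1):
                if di == 0 and dj == 0:
                    continue
                r, c = i + di, j + dj
                if 0 <= r < h and 0 <= c < w and mask[r, c]:
                    neigh_colors.add(int(img_q[r, c]))
        for xp in neigh_colors:       # each aura site is counted once per pair
            m[img_q[i, j], xp] += 1
    return m
```

The quadratic memory growth with the number of colors is visible here: the matrix has n_colors² entries, hence the need to reduce the color set.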

Fuzzy Color
To form a small subset C of C colors among the 256³ possible ones, we use the uniform quantization technique for its simplicity of implementation. For each color component k ∈ {R, G, B}, the 256 levels are divided into C_k intervals of respective width L_k = 256/C_k. The C_k centers {(L_k − 1)/2, (3L_k − 1)/2, ..., 256 − (L_k + 1)/2} of these intervals define the k-th component of the colors in C. The numbers C_R, C_G, and C_B are chosen so that the number of colors C = C_R · C_G · C_B is much lower than 256³.
In the fuzzy framework [38], a fuzzy color c̃ is characterized by its membership function μ_c̃ : RGB → [0, 1]. The membership degree μ_c̃(x) of any color x ∈ RGB is defined using its infinity-norm or Euclidean distance ‖x − c‖ to the crisp counterpart c ∈ C of c̃, thanks to either:
• the crisp membership function: μ_c̃(x) = 1 if ‖x − c‖ ≤ β/2, and 0 otherwise;  (7)
• the symmetrical Gaussian function: μ_c̃(x) = exp(−‖x − c‖²/(2α²));  (8)
• the triangular function: μ_c̃(x) = max(0, 1 − ‖x − c‖/β);  (9)
• or the fuzzy C-means (FCM) membership function: μ_c̃(x) = [Σ_{c′∈C} (‖x − c‖/‖x − c′‖)^{2/(ζ−1)}]^{−1}.  (10)
Here, α and β are real positive constants used to control the span of the fuzzy color, and ζ is any real number greater than 1. In this paper, we set β = (L_R, L_G, L_B), α = β/(2√(2 ln 2)), and ζ = 2, the distances being applied component-wise with these vector parameters. The parameters α and β are chosen to ensure that μ_c̃(x) = 0.5 at the bounds of each color domain of width (L_R, L_G, L_B) centered at c. Figure 3 shows the shapes of the four membership functions computed at c = (192, 64, 64), for the (R, G) plane.
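As an example, the Gaussian membership degree can be sketched as follows (an illustrative reading, assuming the infinity norm applied component-wise and α set so that the degree is 0.5 at a distance β/2 from the center; the function name is ours):

```python
import numpy as np

def gaussian_membership(x, c, beta):
    """Gaussian membership degree of an RGB color x to the fuzzy color
    centered at the crisp color c, where beta holds the quantization
    interval widths (L_R, L_G, L_B).  alpha is chosen so that the degree
    equals 0.5 at the bound of the color domain, i.e., at |x - c| = beta/2.
    """
    x, c, beta = (np.asarray(v, float) for v in (x, c, beta))
    alpha = beta / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    d = np.abs(x - c) / alpha                     # component-wise scaled distance
    return float(np.exp(-0.5 * np.max(d) ** 2))   # infinity-norm variant
```

At the center the degree is 1, and it decays smoothly so that neighboring fuzzy colors overlap, which is what makes the representation robust to quantization effects.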

Fuzzy Color Aura Set in a Superpixel
A fuzzy color site set S_c̃, c ∈ C, is defined by its membership function μ_{S_c̃}. The membership degree μ_{S_c̃}(s) of each site s ∈ S to S_c̃ is the membership degree μ_c̃(I(s)) of its color I(s) ∈ RGB to the fuzzy color c̃ [38]:

μ_{S_c̃}(s) = μ_c̃(I(s)).  (11)

From there, we define the fuzzy color aura set A^p_{S_c̃}(S_c̃′) of S_c̃′ with respect to S_c̃, for any color pair (c, c′) ∈ C², as the fuzzy site set with the following membership degree at each site r ∈ S:

μ_{A^p_{S_c̃}(S_c̃′)}(r) = min( μ_{S_c̃}(r), sup_{s∈S} min( n_p(r, s), μ_{S_c̃′}(s) ) ).  (12)

The neighborhood function n_p(r, s) expresses the membership degree of r to the neighborhood of s within the superpixel P_p, p ∈ [[1, P]]. This function may have any support size and shape, and may take any real value between 0 and 1. In the simplest binary case, we design it as:

n_p(r, s) = 1 if r ∈ N_s and (r, s) ∈ P_p², and 0 otherwise.  (13)

As a justification of Equation (12), we consider the fuzzy color aura set by analogy with the crisp case [39]. The crisp set union operator of Equation (4) is transcribed by the fuzzy operator sup to get the membership degree of the fuzzy color aura set of S_c̃′ with respect to S_c̃ at each site r ∈ S:

μ_{A^p_{S_c̃}(S_c̃′)}(r) = sup_{s∈S} min( μ_{S_c̃′}(s), μ_{Ñ^{p,s}∩S_c̃}(r) ).  (14)

The crisp set intersection operator ∩ is transcribed by the fuzzy operator min, such that the fuzzy counterpart of the color site set N^{p,s} ∩ S_x of Equation (4) is defined by its membership function given by:

μ_{Ñ^{p,s}∩S_c̃}(r) = min( μ_{Ñ^{p,s}}(r), μ_{S_c̃}(r) ).  (15)

The fuzzy counterpart Ñ^{p,s} of N^{p,s} (see Equation (5)) is defined by the following membership degree at each site r ∈ S:

μ_{Ñ^{p,s}}(r) = n_p(r, s).  (16)

Plugging Equation (16) into (15) and the result into (14) provides the definition (12) of the fuzzy aura set after swapping the first two operators.

Fuzzy Color Aura Cardinal
The fuzzy color aura cardinal of S_c̃′ with respect to S_c̃ within P_p is exactly defined as in the crisp case (see Equation (6)) and directly follows from Equation (12):

m_p(c̃, c̃′) = Σ_{r∈S} μ_{A^p_{S_c̃}(S_c̃′)}(r).  (17)

Following the crisp scheme, we define a fuzzy color aura matrix (FCAM) as the collection of all fuzzy color aura cardinals of a superpixel for the C² color pairs (c, c′) ∈ C². Note that, to define the FCAM, we only consider a few of the 256³ colors on which the RGB image is defined, namely, the set C of C colors such that C ≪ 256³. This provides a compact FCAM of small size C × C that is a suitable color texture descriptor of a superpixel for image segmentation purposes.
To make superpixels of different sizes comparable, their FCAMs are normalized element-wise so that their elements sum up to one:

m̂_p(c̃, c̃′) = m_p(c̃, c̃′) / Σ_{(ã, b̃)∈C²} m_p(ã, b̃).  (18)

Finally, each superpixel P_p of an image I is characterized by the C² features that are the elements of its normalized FCAM m̂_p.
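A direct, unoptimized computation of the fuzzy color aura cardinal with the binary neighborhood function can be sketched as follows (variable names are ours; mu_c and mu_cp denote the membership maps of the two fuzzy color site sets):

```python
import numpy as np

def fuzzy_aura_cardinal(mu_c, mu_cp, mask, d=1):
    """Fuzzy color aura cardinal within one superpixel: at each site r, the
    aura membership is min(mu_c(r), max over the neighbors s of r inside
    the superpixel of mu_cp(s)), i.e., the min/sup transcription of the
    crisp union/intersection; the degrees are then summed over all sites.

    mu_c, mu_cp : 2-D membership maps of the two fuzzy color site sets.
    mask        : boolean map of the superpixel P_p.
    """
    h, w = mask.shape
    total = 0.0
    for i, j in zip(*np.nonzero(mask)):
        best = 0.0
        for di in range(-d, d + 1):
            for dj in range(-d, d + 1):
                if di == 0 and dj == 0:
                    continue
                r, c = i + di, j + dj
                if 0 <= r < h and 0 <= c < w and mask[r, c]:
                    best = max(best, float(mu_cp[r, c]))
        total += min(float(mu_c[i, j]), best)
    return total
```

With crisp (0/1) membership maps, this reduces to the crisp aura cardinal of Equation (6); normalizing all C² cardinals by their overall sum then yields the FCAM elements.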

Experiments
In this section, we first present the experimental dataset and explain how FCAM features are used to segment its color texture images. To evaluate the performance of the proposed approach, we then assess the regional SLIC algorithm, discuss parameter settings, and study the relevance of FCAM features. We finally compare the segmentation results achieved by our supervised segmentation approach with those obtained by several state-of-the-art color texture segmentation methods.

Dataset
As the experimental dataset, we used the challenging Prague texture segmentation benchmark [57]. It contains 20 color texture mosaics to be segmented (input test images), some of which are shown in Figure 4 (top) with their corresponding ground truth segmentation images. Each of these images represents from 3 to 12 classes and has been synthetically generated from the same number of original color texture images. The original Prague dataset is composed of 89 images (one image per texture class) grouped into 10 categories of natural and artificial textures. Figure 4 (bottom) shows some of these images, which form the training dataset. All images of the Prague dataset are of size 512 × 512 pixels. Figure 4. Examples of test images (top two rows) and training images (from "flowers" and "manmade" categories, bottom two rows) from the Prague dataset. Note that the framed test image represents K_I = 6 classes from the sole "flowers" category.

Color Texture Image Segmentation Based on FCAMs
Our method of color texture image segmentation based on FCAMs is a supervised superpixel classification procedure. Each test image I represents K I classes whose pixels should be retrieved by the segmentation. To this end, we used a simple artificial neural network known as the extreme learning machine (ELM) that comes with a very fast learning algorithm [58]. It consists of three fully-connected neuron layers: an input layer of C 2 neurons that receives FCAM elements, a single hidden layer, and an output layer with K I neurons. The number of neurons in the hidden layer was empirically set to 100 · K I . The initial weights of hidden neurons were set to random values, and the output weights were determined according to a least-square solution [58].
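The ELM described above admits a compact implementation: hidden weights stay random and only the output weights are fitted by least squares (a minimal sketch; the layer sizes and tanh activation are illustrative choices, not the exact configuration of [58]):

```python
import numpy as np

class ELM:
    """Minimal extreme learning machine: one random, fixed hidden layer;
    output weights obtained by a least-squares fit (no iterative training)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_in, n_hidden))  # fixed random input weights
        self.b = rng.normal(size=n_hidden)          # fixed random biases
        self.beta = np.zeros((n_hidden, n_out))     # trainable output weights

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        # least-squares solution for the output weights (Y is one-hot)
        H = self._hidden(np.asarray(X, float))
        self.beta, *_ = np.linalg.lstsq(H, np.asarray(Y, float), rcond=None)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(np.asarray(X, float)) @ self.beta, axis=1)
```

In our setting, X would hold the C² FCAM elements of each training patch or superpixel and Y the one-hot class labels of the K_I texture classes.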
The proposed color texture segmentation of any test image I based on FCAMs follows the two successive stages outlined in Figure 5. The ELM is first trained with a set of K_I · T training samples, where T is the number of prototype sites per class that are randomly selected on each of the K_I training images. Each training sample is the FCAM feature computed over a square patch W_t, t ∈ {1, . . . , T}, centered at a prototype site and of size (2W + 1)². The trained ELM is then used to segment I as follows. First, I is segmented into superpixels using the regional SLIC method. Then, the FCAM of each superpixel is fed into the trained ELM, whose output provides the estimated texture class. All sites in the superpixel are finally assigned to this class. A refinement procedure can be applied to further improve the segmentation accuracy: superpixels smaller than 0.5% of the image size are reassigned to the label of the largest adjacent superpixel [21]. All steps of the proposed color texture image segmentation are summarized in Algorithm 2.

Algorithm 2 Color texture image segmentation.
Input: Test image I, K_I training images
Parameters: Number T of prototypes per class, number P of superpixels, number C of fuzzy colors, patch half-width W, membership function μ_c̃.
Training stage:
1. In each training image, randomly select T prototype sites.
2. At each prototype site, compute the normalized FCAM m̂_t over a square patch W_t of size (2W + 1)² using Equation (18).
3. Train the ELM classifier with the K_I · T normalized FCAMs of the prototypes.
Segmentation stage:
1. Run regional SLIC (Algorithm 1) on I to provide P superpixels {P_p}_{p=1}^P.
2. Compute the normalized FCAM m̂_p of each superpixel P_p using Equation (18).
3. Feed m̂_p into the trained ELM and assign each pixel in P_p to the ELM output class.

Regional SLIC-Preliminary Assessment
To demonstrate the relevance of the proposed regional SLIC algorithm, we compare its results with those achieved by the basic SLIC algorithm thanks to four standard metrics. The achievable segmentation accuracy (ASA) quantifies the segmentation performance achievable by assigning each superpixel to the ground truth region with the highest overlap [55,59]. The boundary recall (BR) assesses the boundary adherence with respect to the ground truth boundaries [60]. The under-segmentation error (UE) evaluates the segmentation boundary accuracy as the overlap between superpixels and ground truth regions [55,59]. The compactness (COM) measures the compactness of superpixels [60]. Higher BR, ASA, and COM, and smaller UE, indicate better pre-segmentation.
Both the basic and regional SLIC algorithms require setting the number of superpixels. This number must be small enough for the superpixels to be large enough to contain homogeneous textures, and large enough for the superpixels to finely fit the boundaries between textures. We empirically found that P = 400 is a good trade-off for the Prague dataset. Table 1 shows the average and standard deviation of the four metrics obtained by the basic and regional SLIC algorithms over the 20 images of the Prague dataset. From this table, we can see that the results achieved by the regional SLIC are better than those obtained by the basic SLIC according to all metrics.

Parameter Settings
The proposed color texture segmentation requires setting several parameters. For the training stage, the number of prototypes was set to T = 1000, since CNN-based methods use 1000 training texture mosaic images for each image to segment [23]. The size (2W + 1)² of the patch centered at each prototype site and used for FCAM computation during the training stage was set according to the image size N and the number P of superpixels, such that (2W + 1)² ≈ N/P. As N = 512 × 512 and P = 400, the patch size was set to 27 × 27 pixels.
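The patch-size rule (2W + 1)² ≈ N/P can be checked with a few lines (a sketch under our reading of the rule: we take the smallest odd width whose square covers the mean superpixel area):

```python
import math

def patch_width(n_sites, n_superpixels):
    """Smallest odd patch width 2W + 1 such that (2W + 1)^2 covers the
    mean superpixel area N/P."""
    side = math.sqrt(n_sites / n_superpixels)  # ideal patch side, sqrt(N/P)
    half = math.ceil((side - 1) / 2)           # half width W
    return 2 * half + 1

print(patch_width(512 * 512, 400))  # prints 27, the value used in the paper
```

For the Prague images, sqrt(N/P) = sqrt(512²/400) = 25.6, which rounds up to the odd width 27.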
To characterize a site by an FCAM in the segmentation stage, we have to set both the number C of fuzzy colors and the membership function μ_c̃ that defines the membership degree μ_{S_c̃}(s) of each site s to any fuzzy color c̃ according to its color I(s). We consider very few colors in order to evaluate how much the memory cost of FCAMs can be reduced while preserving their relevance as texture features. Specifically, we tested several values of the number C of fuzzy colors. We retained a given number of fuzzy colors and a membership function on the grounds of segmentation accuracy. To make this choice independent of the classification algorithm, we used the nearest-neighbor classifier (1NN) instead of the ELM, with no refinement step. Figure 6 shows the average accuracy obtained with each membership function according to the number C of fuzzy colors over the 20 images. From this figure, we can see that the fuzzy membership functions (Gaussian, triangular, and FCM) largely outperform the crisp one, especially when the number of fuzzy colors is low. However, when C ≥ 12, all fuzzy membership functions lead to similar segmentation accuracies, even if the Gaussian membership function seems to perform slightly better. Moreover, accuracies do not vary significantly beyond C = 16. In the following, all experiments were therefore performed with C = 16 fuzzy colors and the Gaussian membership function.

Comparison with Other Fuzzy Texture Features
Here, we compare the relevance of FCAMs with respect to other fuzzy texture features, namely, the fuzzy gray-level co-occurrence matrix (FGLCM) [33], the fuzzy color co-occurrence matrix (FCCM) [38], and the fuzzy gray-level aura cardinal matrix (FGLAM) [39]. Note that FGLCM and FCCM are similar to a fuzzy color aura matrix computed with a fuzzy aura local measure, as shown in [39]. FGLCM and FGLAM features are formed by the concatenation of the marginal features of the three R, G, and B components, whereas a single FCCM captures the interactions among neighboring pixel colors. For a fair comparison, all these matrices were computed with the same parameters as the FCAM (Gaussian membership function, CR = CG = CB = 16 for FGLCM and FGLAM, and C = 16 for FCCM).
The comparison is based on the segmentation accuracy as the main performance measure and on the computation time required by each feature extraction method. Experiments were carried out using the 1NN classifier instead of the ELM, on a computer with an Intel Core i7 3.60 GHz CPU and 8 GB of RAM. The results over the 20 images of the Prague dataset are summarized in Table 2. They clearly show that FCAM is more relevant, is three times faster to compute, and requires less memory than the marginal features (FGLCM and FGLAM). FCAM is also more efficient and slightly faster to compute than FCCM.

Table 2. Average accuracy (%) and computation time (s) for different fuzzy features on the Prague dataset. "nr" means no segmentation refinement, and "wr" means with segmentation refinement.
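The 1NN rule used in this comparison can be sketched as follows, with synthetic data standing in for the flattened fuzzy matrix descriptors; all dimensions, names, and values are illustrative.

```python
import numpy as np

# Minimal sketch of the 1NN rule used for the feature comparison: each
# superpixel descriptor (a flattened C x C matrix) is assigned the label of
# its nearest training prototype in Euclidean distance. Data are synthetic.
def predict_1nn(prototypes, labels, queries):
    d = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return labels[d.argmin(axis=1)]

rng = np.random.default_rng(1)
prototypes = rng.normal(size=(100, 16 * 16))   # 100 training descriptors, flattened
labels = rng.integers(0, 5, size=100)          # 5 hypothetical texture classes
queries = prototypes[:10] + 0.01 * rng.normal(size=(10, 256))  # perturbed copies
pred = predict_1nn(prototypes, labels, queries)
```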

Comparison with State-of-the-Art Supervised Segmentation Methods
In this section, we compare the results obtained by the proposed segmentation method with those of several state-of-the-art segmentation methods on the Prague dataset. The assessment of segmentation performance was based on the conventional measures provided on the Prague texture segmentation website [57], which include: (1) region-based criteria: correct segmentation (CS), over-segmentation (OS), under-segmentation (US), missed error (ME), and noise error (NE); (2) pixel-wise criteria: omission error (O), commission error (C), class accuracy (CA), recall (CO), precision (CC), type I error (I.), type II error (II.), mean class accuracy estimate (EA), mapping score (MS), root mean square proportion estimation error (RM), and comparison index (CI); (3) consistency-error criteria: global consistency error (GCE) and local consistency error (LCE).
The segmentation results of EWT-FCNT, FCNT, MRF, COF, and Con-Col were taken from the Prague benchmark website [57]; those of U-Net, DA, and PSP-Net were taken from [23]. All these methods except MRF, COF, and Con-Col involve a refinement step to improve performance. Regarding the number of prototypes, no information is available for MRF, COF, and Con-Col. In contrast, the CNN-based methods used a training set of 1000 texture mosaic images of size 512 × 512 pixels, specifically created from the original Prague dataset, for each image to segment.
From Table 3, we can see that our proposed method, with and even without refinement, outperforms the classical supervised methods, namely, MRF, COF, and Con-Col. In contrast, EWT-FCNT and the other CNN-based methods provided better segmentation results than our method. However, by carefully analyzing these results, we found that, except for the EWT-FCNT method, which provided exceptional results, our method provided results similar to those of U-Net, FCNT, DA, and PSP-Net. For example, the gaps between its accuracy (CO = 95.20% with refinement) and those of U-Net, FCNT, DA, and PSP-Net were lower than 1.7%. We can also see that our FCAM-based method did not suffer from under-segmentation or over-segmentation (the US and OS measures are equal to 0), unlike the deep learning methods. It is also noticeable that our method without refinement outperformed FCNT without refinement according to most of the criteria.

Table 3. Results of supervised methods on the Prague dataset (20 test images). Arrows ↑, ↓ denote the required criterion direction; "nr" means no segmentation refinement, and "wr" means with segmentation refinement. Best results are in bold, and the second best ones in italic.

To thoroughly analyze the segmentation results, Table 4 presents the accuracy obtained on each individual test image by our proposed FCAM method; the handcrafted supervised methods MRF, COF, and Con-Col; and the two CNN-based methods, FCNT (with refinement) and EWT-FCNT. Only the segmentation results of individual test images obtained by the compared methods (EWT-FCNT, FCNT, MRF, COF, and Con-Col) are available on the Prague benchmark website [57]. Table 4 shows that the accuracy obtained by our FCAM method was higher than 96% for half of the 20 test images and outperformed MRF, COF, and Con-Col for almost all images. For 6 of the 20 tested images, our method also outperformed FCNT and ranked second behind EWT-FCNT. Even better, it provided the best accuracy for image 11.
For the other images, the accuracy obtained by our method is close to that of FCNT, though for images 7, 8, 10, and 13, the obtained accuracy is poor (below 91%). This explains why the average accuracy dropped to 95.20%.

Table 4. Accuracy (CO) for each test image of the Prague dataset. "nr" means no segmentation refinement, and "wr" means with segmentation refinement. Best results are in bold, and the second best ones in italic.

Figure 7 shows a visual comparison of two handcrafted methods (COF and Con-Col) and two deep learning-based ones (FCNT wr and EWT-FCNT) with the proposed method. The results of these methods are publicly available online [57]. Overall, FCAM provided satisfactory visual segmentation results that are close to the ground truth. Unlike the other methods, FCAM is not prone to over-segmentation, and classification errors occurred almost only at region boundaries. This is mainly due to our pre-segmentation based on the regional SLIC, which, despite its superiority over the basic SLIC, lacks the accuracy to correctly determine the boundaries between two or more different texture regions.

However, it is important to underline that deep learning-based methods (including EWT-FCNT) need to create a large training set (1000 images for each image to segment) from the original training database. Moreover, they require an expensive learning step to extract features and classify pixels. In order to get an overview of the computational costs of our approach compared to CNN-based methods, the following section gives the computational times measured during both the training and segmentation phases.

Processing Time
In Section 4.2, we estimated the FCAM computation time only at superpixels. In order to provide an overview of the computational requirements of the entire FCAM-based color texture segmentation method, we estimated its overall runtime over the Prague dataset on a 3.60 GHz Intel Core i7 computer with 8 GB of RAM. Table 5 displays the average computing time of our proposed method and those of the CNN-based methods (EWT-FCNT, FCNT, U-Net, DA, and PSP-Net). These processing times are separated into two parts: the training time and the segmentation time. The segmentation times for the CNNs were taken from [23]. They were measured on a laptop computer with a 2.5 GHz quad-core Intel Core i7 processor and 16 GB of memory, equipped with a GTX 1080Ti external GPU with 11 GB of memory. The training times of these CNNs were unfortunately not provided. From this table, we can see that the processing time required by our method during the segmentation stage is the highest, at about 41.14 s for a 512 × 512 image. FCAM computation for superpixels consumes most of this time (33.12 s). The remaining time is mainly shared between the regional SLIC computation (8 s) and the classification of superpixels by the ELM (0.02 s). It should be noted that the reported computing time for our method was obtained with MATLAB code without any GPU acceleration; with such an acceleration, our method could therefore become suitable for real-time applications. In contrast, the total duration of the training stage for our method remained lower than 5 min: 260 s to compute the FCAMs at the prototypes and 0.9 s to train the ELM. This overall training time (260.9 s) is much lower than that of CNNs, which typically require several hours on powerful computers equipped with large-memory GPUs.
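The sub-second ELM training time reported above is explained by the ELM's closed-form learning: the hidden layer is random and fixed, and only the output weights are solved by least squares. The following is a minimal generic sketch on synthetic data, not the paper's exact implementation; all dimensions and names are illustrative.

```python
import numpy as np

# Minimal ELM sketch: a random, untrained hidden layer followed by output
# weights solved in closed form by least squares -- which is why training
# takes well under a second. Synthetic data; not the paper's exact setup.
rng = np.random.default_rng(2)

def elm_train(X, Y, hidden=200):
    W = rng.normal(size=(X.shape[1], hidden))      # random input weights (fixed)
    b = rng.normal(size=hidden)                    # random biases (fixed)
    H = np.tanh(X @ W + b)                         # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)   # closed-form output weights
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta).argmax(axis=1)

X = rng.normal(size=(1000, 256))                   # e.g., 1000 flattened descriptors
labels = rng.integers(0, 10, size=1000)            # 10 hypothetical texture classes
Y = np.eye(10)[labels]                             # one-hot targets
model = elm_train(X, Y)
acc = (elm_predict(model, X) == labels).mean()
```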

Conclusions
In this paper, we introduced fuzzy color aura matrices (FCAMs) to locally characterize colors and textures, and applied them to color texture image segmentation. The FCAM feature makes it possible to locally characterize the interactions between the colors of neighboring sites. A single low-dimensional FCAM is required to describe the color texture at each site, unlike in the marginal approach, where a fuzzy gray-level aura matrix (FGLAM) must be computed for each color channel.
The proposed color texture image segmentation is based on the classification of superpixels, generated by a modified version of the SLIC algorithm that incorporates regional information. An FCAM is then computed for each superpixel thanks to a locally-adaptive neighborhood function. The superpixels are finally classified using a simple supervised ELM classifier. Experiments on the Prague texture segmentation benchmark showed that the proposed color texture segmentation based on FCAMs outperforms the classical state-of-the-art segmentation methods and is competitive with recent methods based on deep learning. However, unlike CNN-based approaches, which require an expensive learning procedure and a large training set of segmented texture images whose construction is time-consuming, our method is applied straightforwardly from a much smaller database using ELM-based classification.
Despite its respectable performance, our segmentation method remains sensitive to pre-segmentation results. Although the regional SLIC improves segmentation results in comparison with the basic SLIC, the detection of color texture boundaries is still not accurate enough. In future work, we intend to use a texture-aware superpixel procedure that takes into account the properties of the available color spaces. The proposed segmentation method was developed in a supervised context; we also plan to adapt it to unsupervised color texture segmentation.