1. Introduction
Texture classification is an active research topic in image processing and computer vision. It has received significant attention in many applications such as content-based image retrieval, medical image analysis, face recognition, and biometrics. Texture classification is typically decomposed into two subproblems [1,2]: the representation, which aims to characterize an image with a set of texture features, and the decision, which assigns this image to one of the available texture classes. This paper focuses on the first subproblem, and particularly on techniques that reduce the dimensionality of the feature space. Many approaches reduce the feature space to transform high-dimensional data into a meaningful representation of reduced dimensionality [3,4,5]. By retaining only the most discriminant features, these approaches aim to improve the classification accuracy while decreasing the processing time.
Dimensionality reduction techniques can be divided into two categories [6]. (1) Feature extraction builds a low dimensional subspace where the new features are usually combinations of the original features. The main drawback of this strategy is that all candidate features must be computed during the classification stage to build the new feature space, which can be time-consuming. (2) Feature selection strategies select the most relevant original features. Hence, only a reduced number of selected features have to be computed during the classification stage. Among the feature selection techniques, we are particularly interested in those based on individual ranking. These algorithms rank the candidate features with respect to a score that measures their relevance. They are relatively inexpensive in computation time since no subspace generation procedure is used.
In the supervised context, the information about the class distribution is available. Supervised feature selection scores, such as the Fisher and the Supervised Laplacian scores, use the class labels to determine the relevance of each feature. However, it is worth investigating whether a soft measure of the similarity between images could be relevant. A soft value does not use any information about the class label of the images but measures the similarity in a graded way, instead of being binary (same class or not). This may provide powerful discriminating information since it should better reflect the geometric structure of the different classes. The Variance and the Laplacian scores measure the ability of a feature to preserve the intrinsic data structure without considering any information about the class label of the images [7]. Accordingly, they can be considered unsupervised. These two scores, originally designed for feature selection, have been successfully used in the context of image classification to select relevant features and improve the classification accuracy [8]. In this paper, we investigate whether the soft similarity measure used in these two unsupervised scores is relevant for selecting histograms.
To describe texture, the local patterns contained in an image are usually represented by histograms, such as sum and difference histograms [9], histograms of equivalent patterns [10] or bag-of-words histograms [11]. A set of cross-channel histograms is then computed to represent a color texture. The Local Binary Pattern (LBP) is a texture descriptor belonging to this scheme [12]. The LBP operator transforms an image by thresholding the levels of the $P$ neighboring pixels around each pixel of the image and coding the result as a binary number. Usually, the histogram of this LBP image is then used for texture analysis. Many authors have sought to reduce this $2^P$-dimensional LBP histogram in order to improve texture classification performance [13]. Ojala et al. propose the uniform LBP operator, where 59 discriminant pattern types (or bins) are chosen a priori among the 256 available ones [14]. Mäenpää et al. consider a method based on beam search to select a reduced number of discriminant bins [15]. Boosting has become a very popular approach for feature selection and has been widely adopted for LBP feature selection in various tasks [13]. Liao et al. introduce the Dominant Local Binary Pattern (DLBP), which considers the most frequently occurring patterns to improve the recognition accuracy [16]. Because DLBP is only based on the pattern frequency, information about the type (label) of the selected patterns is lost. For this reason, this texture descriptor was later improved by labelling the most frequent patterns [17], as in the Labelled Dominant Local Binary Pattern (L-DLBP) [18], the Highest-Variance Dominant Local Binary Pattern (HV-DLBP) [19] or, more recently, the Highest-Rank Dominant Local Binary Pattern (HR-DLBP) [20]. Guo et al. also propose a labelled model of the DLBP based on the Fisher separation criterion [21,22]. The most reliable and robust dominant bins are thus determined by considering intra-class similarity and inter-class dissimilarity.
Many other extensions or variants of the LBP operator have been proposed in recent decades for gray level images [12]. However, extensions of this operator to color images have remained relatively limited since 2002, when the Extended Opponent Color LBPs (EOCLBP) were proposed by Pietikäinen et al. [13]. In EOCLBP, the LBP operator is applied to each color component of a given color space independently, and also to pairs of color components according to a cross-channel strategy. This yields nine different histograms, three within-component and six between-component LBP histograms, which raises the question of whether all the information contained in these histograms is relevant to discriminate the textures. Paradoxically, reducing the dimensionality of LBP histograms is much less frequent in color texture analysis, even though the dimension of the feature space is higher. A first solution, proposed by Chan et al., uses linear discriminant analysis to project high-dimensional color LBP bins into a discriminant space [23]. A second solution is proposed by Hussain et al., who exploit the complementarity of Histograms of Oriented Gradients [24], Local Binary Patterns, and Local Ternary Patterns [25] and apply partial least squares to solve their visual object detection problem [26]. More recently, Porebski et al. proposed a different approach which selects, out of the nine LBP histograms extracted from a color texture, those which are the most discriminant [27]. This strategy, which selects histograms in their entirety, fundamentally differs from all the previous LBP selection approaches, which select the bins of the LBP histograms or project them into a discriminant space. To evaluate the relevance of the LBP histograms, Porebski et al. propose a supervised approach where an Intra-Class Similarity score (ICS-score) is computed for each histogram. This score measures the ability of a histogram to characterize the similarity of the textures within each class. Inspired by this approach, Kalakech et al. propose another score (the ASL-score) based on the supervised Laplacian score designed for feature ranking and selection [28]. In [29], histogram selection and bin selection schemes have been extended to the multi-color space domain and compared with each other in the framework of color texture classification. It has been shown that the classification accuracy reached with histogram selection is slightly higher than the accuracy provided by bin selection, with a similar classification computation time. The encouraging results obtained with the two supervised ICS- and ASL-scores lead us to propose in this paper two new histogram selection scores: the Adapted Variance score (AV-score) and the Adapted Laplacian score (AL-score). As the names suggest, these scores are respectively adapted from the unsupervised Variance and Laplacian scores, which were originally designed for feature selection and which use a soft measure of the similarity between images. In this paper, we propose to extend these scores in order to rank and select LBP histograms extracted from a color image.
First, the traditional unsupervised feature selection scores are presented in Section 2. The corresponding adapted histogram selection scores are then detailed in Section 3 and the LBP histogram selection approach is described in Section 4. In order to compare these two new scores with each other and with the state of the art, experiments are performed on widely used benchmark databases in Section 5.
3. Histogram Selection Scores
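In the histogram selection context, we have a dataset of $N$ color texture images. Each image $I_i$ ($i = 1, \ldots, N$) is characterized by $D$ histograms. The whole dataset is summarized by the matrix $H$:
$$H = \begin{pmatrix} H_1^1 & \cdots & H_1^r & \cdots & H_1^D \\ \vdots & & \vdots & & \vdots \\ H_i^1 & \cdots & H_i^r & \cdots & H_i^D \\ \vdots & & \vdots & & \vdots \\ H_N^1 & \cdots & H_N^r & \cdots & H_N^D \end{pmatrix}$$
where $H_i^r$ is the $r$th histogram computed from the $i$th color texture image $I_i$. It is defined by $H_i^r = [H_i^r(1), \ldots, H_i^r(Q)]$, where $Q$ is the histogram bin number.
The $i$th row of $H$ represents the set of $D$ histograms corresponding to the image $I_i$, whose dimension is $D \times Q$. The $r$th column, denoted $H^r$, groups the values of the $r$th histogram across the $N$ images.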
The histogram selection scheme evaluates each histogram $H^r$ in order to select the most discriminant ones among the $D$ candidate histograms. For this purpose, we propose to adapt the feature selection scores presented in Section 2 in order to define histogram selection scores. Distance and similarity measures are two critical terms used for feature selection. Distance measures are low when the images are close to each other, whereas similarity measures are high when the considered images are similar. To adapt the traditional feature selection scores to rank and select histograms, it is necessary to consider either a distance measure or a similarity measure between histograms, depending on whether the term to adapt has to be minimized or maximized for similar images.
Several measures of similarity and distance between histograms have been used in computer vision and pattern recognition [33]. Since the objective of this paper is to show the interest of the proposed scores, we retain two simple measures: the histogram intersection as similarity measure and the Jeffrey distance as distance measure. The histogram intersection is used to adapt the similarity term, which has to be maximized (the kernel is maximized when the images are similar), and the Jeffrey distance is used to extend the Euclidean distance, which has to be minimized for similar images.
The intersection between the histograms extracted from two images $I_i$ and $I_j$ is defined as follows:
$$Int(H_i^r, H_j^r) = \sum_{k=1}^{Q} \min\left(H_i^r(k), H_j^r(k)\right) \tag{3}$$
For histograms of pixel counts, the result of the intersection is the number of pixels of the first image that have a corresponding pixel in the second image with the same characteristic (the same specific pattern in the case of LBP histograms). Thus, the more similar the considered images are, the higher the histogram intersection is. Since the histograms are normalized by the number of pixels in the image, the value of this measure varies between 0 and 1.
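As an illustration, the histogram intersection can be sketched in a few lines of NumPy. The function name and the toy histograms below are illustrative assumptions and are not taken from the paper; the sketch only assumes that both histograms are normalized and have the same number of bins.

```python
import numpy as np

def histogram_intersection(h_a, h_b):
    """Sum of the bin-wise minima of two histograms of equal length."""
    h_a = np.asarray(h_a, dtype=float)
    h_b = np.asarray(h_b, dtype=float)
    return float(np.minimum(h_a, h_b).sum())

# Toy normalized histograms (hypothetical values, Q = 4 bins).
h1 = np.array([0.5, 0.25, 0.25, 0.0])
h2 = np.array([0.25, 0.25, 0.0, 0.5])

print(histogram_intersection(h1, h2))  # 0.5
print(histogram_intersection(h1, h1))  # 1.0 (identical normalized histograms)
```

With normalized histograms, the value 1 is reached only when the two histograms are identical, and 0 when they share no populated bin.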
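The Jeffrey distance between the histograms of two images $I_i$ and $I_j$ is defined as follows:
$$J(H_i^r, H_j^r) = \sum_{k=1}^{Q} H_i^r(k) \log \frac{2 H_i^r(k)}{H_i^r(k) + H_j^r(k)} + \sum_{k=1}^{Q} H_j^r(k) \log \frac{2 H_j^r(k)}{H_i^r(k) + H_j^r(k)} \tag{4}$$
As with all distance measures, the value of the Jeffrey distance is low when the images are close to each other in the histogram space.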
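The following sketch shows one possible way to compute the Jeffrey distance. It assumes the usual form of this divergence, in which each histogram is compared with the bin-wise mean of the two histograms, and it adopts the convention 0 log 0 = 0 for empty bins; both the function name and this convention are assumptions made for the example.

```python
import numpy as np

def jeffrey_distance(h_a, h_b):
    """Jeffrey distance between two histograms, taking 0 * log(0) as 0."""
    h_a = np.asarray(h_a, dtype=float)
    h_b = np.asarray(h_b, dtype=float)
    m = (h_a + h_b) / 2.0  # bin-wise mean of the two histograms

    def term(h):
        mask = h > 0  # empty bins contribute nothing; m > 0 wherever h > 0
        return np.sum(h[mask] * np.log(h[mask] / m[mask]))

    return float(term(h_a) + term(h_b))

print(jeffrey_distance([0.5, 0.5], [0.5, 0.5]))  # 0.0 for identical histograms
print(jeffrey_distance([1.0, 0.0], [0.0, 1.0]))  # 2 * log(2) for disjoint ones
```

The measure is symmetric and, for normalized histograms, bounded above by 2 log 2, a value reached when the histograms have no populated bin in common.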
In order to clarify the adaptation of the different scores to histogram selection, we summarize the terms and the scores used in this section in Table 1, where the formulas are applied to evaluate the score of the $r$th histogram. The left column groups the feature selection terms while the right one summarizes the corresponding histogram selection adaptations. Readers can refer to this table while reading the next subsections.
3.1. Adapted Variance Score
Using the Jeffrey distance defined in Equation (4), we extend the Variance score of Equation (1) in order to select histograms rather than features. The Adapted Variance score $AV^r$ of the histogram $H^r$ is defined as follows:
$$AV^r = \frac{1}{N} \sum_{i=1}^{N} J(H_i^r, \bar{H}^r)$$
where $\bar{H}^r$ is the mean histogram, evaluated by averaging each bin of the histogram $H^r$ across the $N$ images: $\bar{H}^r = [\bar{H}^r(1), \ldots, \bar{H}^r(Q)]$, with $\bar{H}^r(k) = \frac{1}{N} \sum_{i=1}^{N} H_i^r(k)$.
The histograms are sorted in decreasing order of $AV^r$ in order to select the most relevant ones.
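A minimal sketch of this ranking is given below. It assumes that the Adapted Variance score of a histogram is the average Jeffrey distance between the histogram of each image and the mean histogram, by analogy with the variance of a feature. The array layout (N images, D histograms, Q bins) and all function names are assumptions made for the example.

```python
import numpy as np

def jeffrey(a, b):
    """Jeffrey distance along the last axis, taking 0 * log(0) as 0."""
    a, b = np.broadcast_arrays(np.asarray(a, dtype=float),
                               np.asarray(b, dtype=float))
    m = (a + b) / 2.0
    with np.errstate(divide="ignore", invalid="ignore"):
        ta = np.where(a > 0, a * np.log(a / m), 0.0)
        tb = np.where(b > 0, b * np.log(b / m), 0.0)
    return (ta + tb).sum(axis=-1)

def adapted_variance_scores(H):
    """H has shape (N, D, Q). Returns one score per histogram, shape (D,)."""
    H = np.asarray(H, dtype=float)
    mean_hist = H.mean(axis=0, keepdims=True)   # mean histogram, shape (1, D, Q)
    return jeffrey(H, mean_hist).mean(axis=0)   # average over the N images

def rank_by_adapted_variance(H):
    """Histogram indices sorted by decreasing score (most relevant first)."""
    return np.argsort(-adapted_variance_scores(H), kind="stable")

# Toy data: histogram 0 is constant across images, histogram 1 varies.
H = np.array([[[0.5, 0.5], [1.0, 0.0]],
              [[0.5, 0.5], [0.0, 1.0]]])
print(adapted_variance_scores(H))
print(rank_by_adapted_variance(H))  # histogram 1 is ranked first
```

A histogram that is identical for all images gets a score of zero and is ranked last, which matches the intuition that it carries no information to separate the images.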
3.2. Adapted Laplacian Score
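Using the Jeffrey distance and the intersection similarity measure defined in Equations (3) and (4), we extend the Laplacian score of Equation (2) in order to select the most discriminant histograms. The Adapted Laplacian score $AL^r$ of the histogram $H^r$ is defined as follows:
$$AL^r = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} J(H_i^r, H_j^r) \, Int(H_i^r, H_j^r)}{\sum_{i=1}^{N} J(H_i^r, \bar{H}^r) \, d_i}$$
The degree $d_i$ of the image $I_i$ is defined by $d_i = \sum_{j=1}^{N} Int(H_i^r, H_j^r)$, and $\bar{H}^r$ is the weighted histogram average: $\bar{H}^r = [\bar{H}^r(1), \ldots, \bar{H}^r(Q)]$, with $\bar{H}^r(k) = \frac{\sum_{i=1}^{N} H_i^r(k) \, d_i}{\sum_{i=1}^{N} d_i}$.
As for feature selection using the Laplacian score, the histograms are sorted in ascending order of $AL^r$ in order to select the most relevant ones.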
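The sketch below illustrates one way such a score could be computed. It is written by analogy with the Laplacian score: the numerator sums the Jeffrey distances between all pairs of images weighted by their histogram intersection, and the denominator sums the Jeffrey distances to the degree-weighted average histogram, weighted by the degrees. This exact form, the inclusion of the diagonal in the degrees, the value returned when the denominator is zero, and the function names are all assumptions made for the example.

```python
import numpy as np

def jeffrey(a, b):
    """Jeffrey distance along the last axis, taking 0 * log(0) as 0."""
    a, b = np.broadcast_arrays(np.asarray(a, dtype=float),
                               np.asarray(b, dtype=float))
    m = (a + b) / 2.0
    with np.errstate(divide="ignore", invalid="ignore"):
        ta = np.where(a > 0, a * np.log(a / m), 0.0)
        tb = np.where(b > 0, b * np.log(b / m), 0.0)
    return (ta + tb).sum(axis=-1)

def adapted_laplacian_score(Hr):
    """Hr has shape (N, Q): the r-th histogram of each of the N images."""
    Hr = np.asarray(Hr, dtype=float)
    # Pairwise similarity (histogram intersection) and distance (Jeffrey).
    S = np.minimum(Hr[:, None, :], Hr[None, :, :]).sum(axis=-1)  # (N, N)
    J = jeffrey(Hr[:, None, :], Hr[None, :, :])                  # (N, N)
    d = S.sum(axis=1)                                            # degrees
    weighted_mean = (d[:, None] * Hr).sum(axis=0) / d.sum()
    numerator = (J * S).sum()
    denominator = (jeffrey(Hr, weighted_mean) * d).sum()
    # Assumption: a histogram with no spread at all is treated as least relevant.
    return float(numerator / denominator) if denominator > 0 else float("inf")

# Toy data: two tight groups of images versus a histogram that barely separates them.
grouped = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
blurred = np.array([[0.6, 0.4], [0.4, 0.6], [0.6, 0.4], [0.4, 0.6]])
print(adapted_laplacian_score(grouped))  # 0.0: similar images are also close
print(adapted_laplacian_score(blurred))  # strictly positive
```

Under these assumptions, a low score means that images considered similar by the intersection are also close in terms of Jeffrey distance, while the histogram still varies across the dataset, which is why the ranking is ascending.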
4. LBP Histogram Selection for Color Texture Classification
The adapted scores previously presented are used in an LBP histogram selection approach described in this section (see Section 4.2). The candidate color LBP histograms are first presented.
4.1. Candidate Color Texture Descriptors
The LBP operator is one of the most successful descriptors used to characterize texture images due to its ease of implementation, its invariance to monotonic illumination changes, and its low computational complexity. Many variants of the original LBP operator have been proposed in the literature since Ojala’s original definition [12]. Since the goal of this paper is to reveal the relevance of the proposed histogram selection scores, no more sophisticated texture descriptors are needed. For this reason, the color textures are here characterized by the EOCLBP histograms, which are a simple extension to color of the original LBP operator. Obviously, the classification results are expected to improve with more elaborate descriptors, such as the Improved Opponent Color LBP [34] or the Median Robust Extended LBP [35], which is a gray level descriptor that has obtained the best overall performance on thirteen texture image sets and which could be extended to color.
To compute the EOCLBP histograms, each image is first coded in a 3-dimensional color space, denoted here $(C_1, C_2, C_3)$. The LBP histograms are then computed from the so-coded images: three within-component LBP histograms (one for each of the components $C_1$, $C_2$, and $C_3$) and six between-component LBP histograms (one for each ordered pair of distinct components) are extracted from each image. Following Ojala et al. in their introduction of the original LBP operator, the $3 \times 3$ pixel neighborhood ($P = 8$ neighbors) is considered here. A color texture is thus represented by a ($9 \times 256$)-dimensional feature space.
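To make the descriptor concrete, the sketch below computes nine 256-bin LBP histograms from a three-component image with NumPy. It assumes the basic 8-neighbor LBP on a 3 × 3 neighborhood and the usual opponent-color construction, in which the center pixel is taken from one component and the neighbors from another. The neighbor ordering, the ordering of the nine histograms, the handling of image borders, and the function names are assumptions for illustration only.

```python
import numpy as np

# Offsets (row, column) of the 8 neighbors; this ordering is an arbitrary choice.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(center_chan, neigh_chan):
    """Normalized 256-bin LBP histogram; centers and neighbors may come from
    different color components (between-component case)."""
    h, w = center_chan.shape
    c = center_chan[1:-1, 1:-1]                 # border pixels are skipped
    codes = np.zeros(c.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(OFFSETS):
        n = neigh_chan[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes += (n >= c).astype(np.int64) << bit  # threshold, then weight by 2^bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def color_lbp_histograms(image):
    """image has shape (rows, cols, 3). Returns an array of shape (9, 256):
    three within-component histograms followed by six between-component ones."""
    chans = [image[..., k].astype(np.int64) for k in range(3)]
    pairs = [(0, 0), (1, 1), (2, 2),
             (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
    return np.stack([lbp_histogram(chans[a], chans[b]) for a, b in pairs])

rng = np.random.default_rng(0)
toy_image = rng.integers(0, 256, size=(32, 32, 3))  # synthetic stand-in image
H_i = color_lbp_histograms(toy_image)
print(H_i.shape)  # (9, 256)
```

Each row of the returned array is one candidate histogram of an image, so stacking the outputs for all images gives the data matrix used by the selection scores.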
It is well known that the performance of a classifier generally depends on the dimension of the feature subspace due to the curse of dimensionality [36]. To reach a satisfying classification accuracy while decreasing the computation time, we propose to reduce the number of candidate LBP histograms by selecting the most discriminating ones using the histogram selection scores previously presented.
4.2. Histogram Selection
To evaluate a supervised color texture classification scheme, it is usual to divide the considered database into a learning and a testing image subset. The learning subset is used to train the classifier during the learning stage, whereas the testing subset is used during the classification stage to evaluate the performances of the proposed method. In the histogram selection framework, the learning stage aims to build a low dimensional discriminating subspace thanks to labelled or unlabelled training data.
Different models have been proposed in order to evaluate the relevance of the candidate subspaces [37]. The wrapper model uses the classification accuracy as the discriminating power of the candidate subspaces. When a classifier such as the nearest neighbor is considered, this requires decomposing the learning subset into a training and a validation subset. Although this model is time-consuming and classifier-dependent, it gives good results and easily determines the dimension of the selected subspace by searching for the best classification accuracy. In contrast, filter models evaluate the relevance of the candidate subspaces without classifying the images. They are less time-consuming, but determining the dimension of the subspace to be selected is not as easy. To obtain a good compromise between dimension selection, computation time and classification result, embedded models are preferred [38]. These approaches combine a filter model to determine the most discriminating subspaces at different dimensions and a wrapper model to determine the dimension of the selected subspace [6].
The approach used in this paper is an embedded histogram selection scheme, which requires splitting the initial image database into a training, a validation and a testing image subset, according to a holdout decomposition. During the learning stage, candidate histograms are generated from the training images and ranked with respect to a score which measures the efficiency of each candidate histogram. This score can be computed without considering the class label of the images, like the unsupervised AV-score and AL-score, or by taking the information about the class distribution into account, like the ICS-score and the ASL-score.
Once the score has been computed for each of the $D$ candidate histograms, a ranking is performed. The candidate subspaces (composed, at the first step, of the histogram with the best score, at the second step, of the two first ranked histograms, and so on) are then evaluated to determine the relevant histogram subspace. For this purpose, a classifier operates in each candidate subspace in order to classify the validation images. For each subspace dimension $d$, the classification accuracy is estimated as the percentage of the validation images that have been correctly classified. This rate of well-classified validation images is denoted $R_d$.
The dimension $\hat{d}$ of the selected subspace is the one for which the value of $R_d$ is the highest:
$$\hat{d} = \arg\max_{1 \le d \le D} R_d$$
During the classification stage, the relevant histograms previously selected are computed for each testing image and compared to the training images in the selected histogram subspace to determine the testing image label. The purpose of this paper being to show the contribution of the two new histogram selection scores, independently of the considered classifier, its parameters and its metric, the nearest neighbor classifier associated with the histogram intersection as a similarity measure is here considered.
5. Experiments
In this section, the proposed histogram selection scores are compared thanks to three benchmark color texture image sets: Outex-TC-00013, USPTex, and NewBarkTex.
Outex-TC-00013 is composed of 68 color texture images acquired under controlled conditions by a 3-CCD digital color camera and the size of which is
pixels [
39]. Each of these 68 textures is split up into 20
disjoint sub-images. Among these 1360 sub-images, 680 are used for the training subset and the remaining 680 are considered as testing images. The Outex-TC-00013 image test suite can be downloaded at
http://www.outex.oulu.fi/index.php?page=classification.
USPTex set is a more recent database [
40]. It contains 191 natural color textures acquired under an unknown but fixed light source. As for Outex-TC-00013, these images are split up into
disjoint sub-images. Since the original image size is here
pixels, this makes a total of 12 sub-images by a texture. For our experiments, this initial dataset of 2292 sub-images is split up in order to build a training and a testing image subset: 6 images are considered for the training and the 6 others are used as testing images. This decomposition is available at
https://www-lisic.univ-littoral.fr/~porebski/USPtex.zip.
The Barktex database includes six tree bark classes, with 68 images per class [
41]. Even if the number of classes of this database is limited to six, the textures of these different classes are close to each other and their discrimination is not easy. To build the NewBarkTex set, a region of interest, centered on the bark and whose size is
pixels, is first defined. Then, four sub-images whose size is
pixels are extracted from each region. We thus obtain a set of $68 \times 4 = 272$ sub-images per class. To ensure that the color texture images used for training and testing are as weakly correlated as possible, the four sub-images extracted from the same original image all belong either to the training subset or to the testing one [42]: 816 images are thus used as training images and the remaining 816 as testing images. The NewBarkTex image test suite can be downloaded at https://www-lisic.univ-littoral.fr/~porebski/NewBarkTex.zip.
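The subset sizes quoted for the three image sets can be checked with a few lines of arithmetic. The snippet only uses the counts stated in the text; the variable names are illustrative.

```python
# Outex-TC-00013: 68 textures, each split into 20 sub-images, half for training.
outex_total = 68 * 20
outex_train = outex_total // 2

# USPTex: 191 textures, 12 sub-images each, 6 for training and 6 for testing.
usptex_total = 191 * 12
usptex_train = 191 * 6

# NewBarkTex: 6 classes, 68 images per class, 4 sub-images per image,
# with a balanced split between training and testing images.
newbarktex_per_class = 68 * 4
newbarktex_total = 6 * newbarktex_per_class
newbarktex_train = newbarktex_total // 2

print(outex_total, outex_train)            # 1360 680
print(usptex_total, usptex_train)          # 2292 1146
print(newbarktex_total, newbarktex_train)  # 1632 816
```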
These sets do not call for specific illuminant- or rotation-invariant texture descriptors, which suits the goal of this paper: to reveal the contribution of the proposed histogram selection scores independently of the invariance of the texture descriptor to the observation conditions.
Note that the considered texture benchmark databases are composed of only two image subsets according to a holdout evaluation method, whereas the considered histogram selection scheme needs three subsets, as explained in Section 4.2. We thus propose to use one subset as the training subset and the second as both the validation and the testing subset to evaluate the performance of the proposed scores. Therefore, the dimensionality of the selected feature space is ideally determined, and the classification results should be interpreted as optimistic. This solution was nevertheless chosen in order to allow the comparison with other works using the same split into training and testing subsets.
Moreover, in order to evaluate the impact of the used color space, four color spaces are considered for experiments:
,
,
, and
. These color spaces are respectively representative of the four color space families (the primary spaces, the luminance-chrominance spaces, the independent color component spaces, and the perceptual spaces) and do not require to know illumination conditions like the
color space for example [
4].
Section 5.1 presents a comparison of the performances achieved by the proposed histogram selection scores. An analysis of the histogram ranks is then carried out in Section 5.2. Finally, in Section 5.3, the classification results obtained by the proposed approach are compared with the state of the art.
5.1. Comparison of the Histogram Selection Scores
In this section, four histogram selection scores are compared on the Outex-TC-00013, USPTex, and NewBarkTex sets:
the unsupervised Adapted Variance score (AV-score),
the unsupervised Adapted Laplacian score (AL-score),
the Adapted Supervised Laplacian score (ASL-score) proposed by Kalakech et al. [28],
and the supervised Intra-Class Similarity score (ICS-score) proposed by Porebski et al. [27].
Figure 1, Figure 2 and Figure 3 show the rate $R_d$ of well-classified validation images according to the number $d$ of ranked histograms on the Outex-TC-00013, USPTex, and NewBarkTex sets, respectively, for each considered color space.
These figures show that the accuracy obtained with the unsupervised AL-score globally exceeds that obtained with the AV-score, for the three databases and whatever the considered color space. In the same way, the ASL-score outperforms the ICS-score in the supervised context. These results confirm the high performances obtained with the Laplacian scores in the context of feature selection [8]. For histogram selection, the interest of the similarity term to capture the intrinsic properties of the data is also demonstrated.
These figures also show that the ASL-score globally gives the highest accuracy, followed very closely by the unsupervised AL-score. These scores reach a high accuracy with a lower dimensional histogram subspace. The unsupervised AL-score, which is computed without considering the class label of the images, globally outperforms the supervised ICS-score, which takes the information about the class distribution into account. This confirms again the relevance of the similarity matrix used in the Laplacian scores to perform the selection.
Table 2, Table 3 and Table 4 show the accuracies $R_{\hat{d}}$ obtained with the $\hat{d}$-dimensional selected LBP histogram subspaces, by using the different supervised and unsupervised scores on the Outex-TC-00013, USPTex, and NewBarkTex sets, respectively. The accuracy reached without performing any color LBP histogram selection is also presented. The bold values represent the best rates obtained with each color space and the boxed values indicate the best rate obtained for each color texture set.
These tables confirm the interest of selecting LBP histograms: The selection improves the classification accuracy by on average 0.52% on OuTex, 7.70% on USPTex and 6.32% on BarkTex, while reducing the number of considered histograms. We can also see that the performances reached thanks to the different scores are very close to each other, especially for the OuTex and USPTex databases. For the color space that gives the best rates ( for Outex-TC-00013 and NewBarkTex and for USPTex), several scores give the higher performances and the and the scores always appear among the best scores.
For NewBarkTex which is a more challenging set, the
,
, and
scores give exactly the same best accuracy with the same optimal dimension. The difference between these three scores appears more for subspaces with a little dimension: From
Figure 3, we can notice that the
and the
scores seek faster the better histograms specially for the
and
color spaces.
It is also interesting to notice that the unsupervised AL-score appears among the best scores 10 times out of 12. It outperforms the other unsupervised score, the AV-score, and even the supervised ICS-score. Its performances are remarkable since they are similar or very close to those reached by the ASL-score, even though it does not consider the class label of the images.
5.2. Comparison of the Histogram Ranks
In this section, the histogram ranking is analyzed.
Table 5 shows the histogram ranking obtained with the considered scores on the Outex-TC-00013, USPTex and NewBarkTex sets. The numbers 1, 2 and 3 represent the three within-component LBP histograms (
,
, and
), and the six between-component LBP histograms (
,
,
,
,
, and
) are respectively numbered 4, 5, 6, 7, 8, and 9.
Each row of this table shows the histogram ranking in the considered color space using the specified histogram selection score, for the three image sets. For example the first row shows that, in the RGB color space, using the -score and the OuTex database, the first selected histogram is the number 2 (), followed by the histogram 4 (), ... and finally the histogram 9 () is the last selected. The bold values correspond to the selected histogram subspace for which the best accuracy is achieved for each of the three color texture sets.
This table shows that the histogram ranking varies strongly with the considered color space and score. This clearly shows the interest of performing a histogram selection, since the most relevant histogram subspace cannot be determined a priori, even for the same database.
5.3. Comparison with the State of the Art
In this section, we compare the accuracy obtained using the proposed unsupervised AL-score with the results reported in the state of the art on the three considered sets. For a fair comparison, only the works that follow the same experimental protocol (number of classes, image size, number of images for each class, total number of images, and accuracy evaluation method) and that apply a single color space strategy are mentioned. In addition to the nearest neighbor classifier, we also propose to use the SVM classifier during the classification stage of our approach, since the best accuracy reported in the state of the art on Outex-TC-00013 and NewBarkTex with a single color space strategy was reached with this classifier. A one-versus-one SVM classifier with a linear kernel is considered here. The results are summarized in Table 6, Table 7 and Table 8.
From these tables, we can see that the second best accuracy obtained on the Outex-TC-00013 set (95.4%) is achieved with a simple 3D color histogram, although it only characterizes the color distribution within the color space and does not take into account the spatial relationships between neighboring pixels, as a color texture feature should. This inconsistency is due to the fact that the Outex-TC-00013 and USPTex sets present a major drawback: the partitioning used to build these two sets consists in extracting the training and the testing sub-images from the same original image. Such a partitioning, when it is combined with a classifier such as the nearest neighbor classifier, leads to biased classification results [42]. Indeed, testing images are spatially close to training images. They are thus correlated, and a simple 3D color histogram reaches a high classification accuracy [43]. For the NewBarkTex set, the training and the testing sub-images come from different original images to ensure that the color texture images are as weakly correlated as possible. The analysis of the results is thus more reliable and interpretable on this image set. The best accuracy rate (89.6%) is obtained with the dominant and minor sum and difference histograms [57]. Selecting LBP histograms with our proposed AL-score makes it possible to get close to this highest rate, particularly when an SVM classifier is used to classify the testing images. In this case, the classification accuracy reaches the promising result of 84.9%. This additional experiment highlights the merit of the unsupervised AL-score when it is associated with the SVM classifier.
6. Conclusions
We have proposed to adapt the traditional unsupervised feature selection scores in order to rank and select LBP histograms for color texture classification: the Adapted Variance score (AV-score) and the Adapted Laplacian score (AL-score) have thus been presented.
For each one of the nine LBP histograms extracted from a color texture, a score is assigned using one of the proposed adapted scores. The histograms are then ranked in order to select the most discriminant ones and thus build a low dimensional relevant subspace, in which a classifier operates.
Experiments on the Outex-TC-00013, USPTex and NewBarkTex sets have shown the interest of performing an LBP histogram selection before classifying the different images. This selection improves the classification accuracy while reducing the dimension of the histogram subspace. The AL-score outperforms the AV-score and gives performances comparable to or even better than those of the supervised ICS-score and ASL-score.
For future research directions, we propose to associate the AL-score with a multi color space approach [29]. Moreover, an additional experiment can be carried out in the short term: similarity can be derived from a given distance by kernelization (exponential with the Euclidean distance in the conventional approach of the Laplacian score). As the Jeffrey divergence can also be kernelized, it would be interesting to study the trend of the results when a kernelized Jeffrey measure is used as similarity measure and, more generally, the impact of the distance and similarity measures on the classification performances.