A Novel Texture Feature Description Method Based on the Generalized Gabor Direction Pattern and Weighted Discrepancy Measurement Model

Symmetry 2016, 8, 109


Introduction
In recent years, image feature description methods have received significant attention in the fields of computer vision and pattern recognition. A number of image feature extraction methods have been proposed, which can be divided into two categories: holistic and local image feature extraction. Holistic feature extraction methods produce a statistical information template from a large number of training sample images. One typical method is principal component analysis (PCA) [1]. Based on the PCA model, some improved methods have been presented, including 2D PCA [2,3], incremental PCA [4], block PCA [5,6], etc. Moreover, many methods using matrix decomposition and linear combination have become very popular, such as linear discriminant analysis (LDA) [7][8][9][10][11], independent component analysis (ICA) [12][13][14][15][16], singular value decomposition (SVD) [17][18][19], discrete wavelet transform (DWT) [20,21], etc. Also, a method called k-LDA, which combines LDA with PCA, has been proposed for image classification [22]. Because they do not fully take local detailed information into account, these holistic feature extraction methods are sensitive to geometric shape changes and to some illumination and noise variations.

Local image feature extraction methods, however, can effectively overcome those drawbacks. Ojala [23] proposed a texture feature description method called the local binary pattern (LBP), which achieves superior results for image recognition. The original LBP method computes a binary sequence by comparing the central pixel value with each of its neighbor pixel values in a 3 × 3 neighborhood, and then uses the LBP histogram as the texture description feature. The fixed neighborhood cannot capture larger neighborhood structures, which is an obvious disadvantage of the original LBP method. Afterwards, Ojala improved the original method and suggested an extension, LBP_{P,R} [24], with neighborhoods of different sizes, where P is the number of sample points on a circle of radius R. Since LBP is a two-value model, which cannot describe more detailed information, Tan [25] extended the two-value model to a three-value model and proposed a novel local feature description method, local ternary patterns (LTP). Furthermore, many variants of the basic LBP have been presented, including local phase quantization (LPQ) [26], local derivative pattern (LDP) [27], local difference binary (LDB) [28], local line directional pattern (LLDP) [29], local binary pattern of pyramid transform domain (PLBP) [30], local tetra patterns (LTrPs) [31], dominant local binary pattern (DLBP) [32], binary robust independent elementary features (BRIEF) [33], local tri-directional patterns (LTPs) [34], local convex-and-concave pattern (LCP) [35], multi-scale local binary patterns (MSLBP) [36], etc. In addition, motivated by image moments and local binary patterns, some novel texture descriptors have been proposed, such as local Tchebichef moments (LTMs) [37], moment-based local binary patterns (MLBP) [38], etc.
Nanni [39,40] has presented region-based approaches with a co-occurrence matrix, which have shown promising results on several medical datasets. Gabor wavelet filters provide an excellent feature representation that is insensitive to illumination and expression changes. Many Gabor feature extraction methods have shown remarkable performance and found wide application [41][42][43][44][45][46][47][48], such as local normalization entropy-like weighted Gabor features [42], local Gabor binary patterns (LGBP) [43], local Gabor XOR patterns (LGXP) [44], Gabor wavelets combined with the local binary pattern [45], Gabor wavelets combined with volumetric fractal dimension [46], and a combined method that fuses the local binary pattern (LBP), local phase quantization (LPQ) and Gabor filters [26]. Since the computational cost of Gabor frames is very high, some accelerated Gabor methods have been studied, such as accelerated Gabor frames [47] and the fusion of multi-channel classifiers [48].
Motivated by the LBP structure and Gabor filters, we propose a novel texture feature description method based on the generalized Gabor direction pattern (GGDP) and the weighted discrepancy measurement model (WDMM). The contributions of this paper can be summarized as follows: (1) Conventional LBP computes the relationship between the center pixel value and its neighbor pixel values, and only utilizes the direction information of the center pixel. LBP cannot obtain more detailed direction information from the other neighborhood pixels, and is thus sensitive to noise. To overcome these defects, we propose a novel patch-structure direction pattern (PDP) method, which extracts richer feature information and is insensitive to noise. (2) To further improve the effectiveness of PDP, we introduce it into the multi-channel Gabor space and obtain an improved method called GGDP, which better describes multi-direction and multi-scale texture information. (3) In the traditional classification process, the GGDP features of all Gabor sub-images are concatenated and then measured. To make the measurement of the feature distance more accurate, WDMM is proposed, which measures the distance for the GGDP feature of every Gabor sub-image and computes the final distance as a weighted combination based on the information content of each sub-image.
This paper is composed of four sections. The texture feature extraction background and our contributions are introduced in Section 1. Section 2 describes the proposed method and its corresponding algorithms, including PDP, GGDP and WDMM. Simulated experiments are conducted in Section 3. Section 4 gives the conclusion and introduces future work.

PDP
We suppose the sample image is X and m_0 is the pixel value of the center pixel in the neighborhood. The neighborhood is set as 3 × 3 and the center pixel's adjacent pixel values are marked as m_i (i = 1, 2, ..., 8), as depicted in Figure 1. The patch with the central pixel m_0 is computed as the average value of m_i (i = 1, 2, ..., 8). The other adjacent pixel values are computed according to Equation (1), as shown in Figure 2. Then, we acquire the patch-structure, marked as X_p, with a size of 3 × 3.

Next, we use Kirsch masks to find information on the eight directions. The Kirsch masks, marked as K_i (i = 1, 2, ..., 8), are shown in Figure 3. The direction information X_p^d of the patch-structure X_p is defined in Equation (3), where the notation "×" indicates the sum of the products of the elements at corresponding positions in the two matrices. The result X_p^d is given in Equation (4), where R_i (i = 1, 2, ..., 8) denotes the ith direction information in the neighborhood; the R_i are generally not equal to each other in the direction feature description. In this paper, we select the maximum and minimum Kirsch responses, marked as R_max and R_min respectively, as defined in Equation (5). The PDP code is then computed by Equation (6), where S(R_i) is defined in Equation (7).

Based on Equation (7), we can generate the PDP code for the whole image. In order to reduce the PDP dimensions and further extract the PDP feature, the PDP histogram is used to describe the image feature, as defined in Equation (8), where x and y denote the horizontal and vertical coordinates in the whole image, and the function I() is defined in Equation (9).
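The first steps of PDP can be illustrated with a minimal Python sketch. It rests on several assumptions that are not confirmed by the text: every entry of the patch-structure is taken to be the plain average of its eight neighbors, image borders are clamped, and the Kirsch masks are the standard compass masks with weights 5 and -3. The function names are hypothetical. The sketch stops at the responses R_i and the positions of R_max and R_min, because the exact coding rule of Equations (6) and (7) is not reproduced here.

```python
# Border positions of a 3x3 window, clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def kirsch_masks():
    """Return the eight standard Kirsch compass masks (centre weight 0)."""
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    masks = []
    for k in range(8):
        mask = [[0] * 3 for _ in range(3)]
        for idx, (r, c) in enumerate(RING):
            mask[r][c] = base[(idx - k) % 8]  # rotate the weights by k steps
        masks.append(mask)
    return masks


def neighbor_mean(img, r, c):
    """Average of the 8 neighbours of (r, c); borders are clamped (assumption)."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr = min(max(r + dr, 0), rows - 1)
            cc = min(max(c + dc, 0), cols - 1)
            total += img[rr][cc]
    return total / 8.0


def patch_structure(img, r, c):
    """3x3 patch X_p around (r, c); each entry is a neighbour average (assumption)."""
    return [[neighbor_mean(img, r + dr, c + dc) for dc in (-1, 0, 1)]
            for dr in (-1, 0, 1)]


def kirsch_responses(patch):
    """R_i: sum of element-wise products of the patch with each mask K_i."""
    return [sum(mask[r][c] * patch[r][c] for r in range(3) for c in range(3))
            for mask in kirsch_masks()]


def extreme_directions(responses):
    """0-based indices of the maximum and minimum responses (R_max, R_min)."""
    return responses.index(max(responses)), responses.index(min(responses))


# Usage: a horizontal intensity ramp has its strongest response towards the right.
ramp = [[c for c in range(5)] for _ in range(5)]
responses = kirsch_responses(patch_structure(ramp, 2, 2))
print(extreme_directions(responses))
```

Averaging before applying the masks is what distinguishes the patch-structure from a direct Kirsch filter on raw pixels; in this sketch the averaging acts as a small smoothing step, which is consistent with the claim that PDP is less sensitive to noise.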

GGDP
Gabor wavelet filters can express image direction and scale information owing to their spatial and orientation selectivity. The mathematical model of 2D Gabor wavelet filtering is given in Equation (10), where X(x, y) is the sample image, x and y denote the horizontal and vertical coordinates, and φ_o(x, y) and φ_e(x, y) are the odd- and even-symmetric Gabor filters, respectively. Isotropic Gabor filters φ_o and φ_e usually use simplified models whose parameters θ, f and σ represent the spatial phase, spatial frequency and spatial constant, and in which g(x, y, σ) is a Gaussian function. Since θ and f are multi-channel, we suppose F(i) and θ(j) are the multi-channel scale and orientation functions. Herein, we set the number of scales to 4 (i = 1, 2, 3, 4) and the number of orientations to 6 (j = 1, 2, 3, 4, 5, 6). Thus, the multi-scale and multi-orientation outputs of the sample image are marked as φ^e_{F(i),θ(j)}(x, y) and φ^o_{F(i),θ(j)}(x, y) with i = 1, 2, 3, 4 and j = 1, 2, 3, 4, 5, 6.
Suppose A_{F(i),θ(j)} indicates the amplitude of the filtered images, defined in Equation (13). Next, we generate the PDP histogram for each Gabor filter image A_{F(i),θ(j)}, where i = 1, 2, 3, 4 and j = 1, 2, 3, 4, 5, 6, by Equation (8). This histogram, H_PDP(A_{F(i),θ(j)}), is the GGDP feature of the sample image X(x, y), that is, GGDP(i, j) = H_PDP(A_{F(i),θ(j)}) (Equation (14)). In the typical process for Gabor features, the GGDP(i, j) would be concatenated. However, simple concatenation is not appropriate because the Gabor features of different scales and orientations are not equally important. Instead, we design a novel discrepancy measurement model to measure the similarity of two groups of GGDP features.
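As an illustration of the multi-channel filtering step, the following Python sketch builds a bank of 4 scales and 6 orientations and returns one amplitude image per channel. The filter forms are assumptions, since the simplified models are not reproduced in the text: the even and odd filters are taken as a Gaussian multiplied by a cosine and a sine carrier, and the amplitude as the root of the sum of their squared responses. The kernel size, σ, the frequency values and all function names are hypothetical.

```python
import math


def gabor_pair(size, f, theta, sigma):
    """Even (cosine) and odd (sine) Gabor kernels; assumed simplified isotropic form."""
    half = size // 2
    even, odd = [], []
    for y in range(-half, half + 1):
        even_row, odd_row = [], []
        for x in range(-half, half + 1):
            g = math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)
            phase = 2 * math.pi * f * (x * math.cos(theta) + y * math.sin(theta))
            even_row.append(g * math.cos(phase))
            odd_row.append(g * math.sin(phase))
        even.append(even_row)
        odd.append(odd_row)
    return even, odd


def filter_at(img, kernel, r, c):
    """Response of the kernel centred at (r, c); borders are clamped (assumption)."""
    half = len(kernel) // 2
    rows, cols = len(img), len(img[0])
    acc = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            rr = min(max(r + dy, 0), rows - 1)
            cc = min(max(c + dx, 0), cols - 1)
            acc += kernel[dy + half][dx + half] * img[rr][cc]
    return acc


def amplitude_image(img, f, theta, sigma=2.0, size=7):
    """Amplitude of the even and odd responses at every pixel."""
    even, odd = gabor_pair(size, f, theta, sigma)
    return [[math.hypot(filter_at(img, even, r, c), filter_at(img, odd, r, c))
             for c in range(len(img[0]))] for r in range(len(img))]


def gabor_bank(img, freqs, n_orient=6):
    """Amplitude images keyed by (scale i, orientation j), both 1-based."""
    return {(i, j): amplitude_image(img, f, (j - 1) * math.pi / n_orient)
            for i, f in enumerate(freqs, start=1)
            for j in range(1, n_orient + 1)}


# Usage: 4 hypothetical frequencies and 6 orientations give 24 amplitude images.
image = [[(r * c) % 5 for c in range(6)] for r in range(6)]
bank = gabor_bank(image, freqs=[0.05, 0.1, 0.2, 0.4])
print(len(bank))
```

Each amplitude image in the returned dictionary would then be passed to the PDP histogram step to obtain GGDP(i, j); keeping the channels separate rather than concatenating them is what allows a per-channel weight to be applied later.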

WDMM
Suppose the training sample is X_t and the testing sample is X_s. The main objective of classification is to define the distance between X_t and X_s. The weighted discrepancy measurement model is defined in Equation (15), where f_s^{i,j} denotes the ith-scale and jth-orientation GGDP feature of X_s, given in Equation (16), in which A^s_{F(i),θ(j)} are the amplitudes of the Gabor filter images of X_s. Likewise, f_t^{i,j} is the ith-scale and jth-orientation GGDP feature of X_t, given in Equation (17), in which A^t_{F(i),θ(j)} are the amplitudes of the Gabor filter images of X_t.
Since image entropy can represent the image texture information, we adopt image entropy to describe the importance of the Gabor filter images. The image entropy is computed as follows. Suppose the probability of the random variable x (x_1, x_2, x_3, ..., x_n) is p(x) (p_1(x), p_2(x), p_3(x), ..., p_n(x)). The entropy H(x) is defined in Equation (18). For a Gabor filter image A_{F(i),θ(j)}, its 2D entropy H(A_{F(i),θ(j)}) is defined in Equation (19), where m is the number of image gray levels and p_i is the probability of the ith gray level in the whole image. The weighted coefficient ω_{i,j}, which denotes the importance of the Gabor filter image with the ith scale and jth orientation, is defined by Equation (20).
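The weighting idea can be sketched in Python as follows. Only the Shannon entropy of the gray-level distribution is standard; the rest is assumed, because Equations (15) and (20) are not reproduced in the text. Here the weights are taken to be the entropies normalized to sum to one, the per-channel histogram distance is taken to be a chi-square distance, and the images are assumed to be already quantized to integer gray levels. All function names are hypothetical.

```python
import math
from collections import Counter


def image_entropy(img):
    """Shannon entropy (in bits) of the gray-level distribution of an integer image."""
    values = [v for row in img for v in row]
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def entropy_weights(images):
    """Weights per (i, j) channel: entropies normalised to sum to 1 (assumption)."""
    entropies = {key: image_entropy(img) for key, img in images.items()}
    total = sum(entropies.values())
    if total == 0:
        return {key: 1.0 / len(images) for key in images}  # fall back to uniform
    return {key: h / total for key, h in entropies.items()}


def chi_square(h1, h2):
    """Chi-square distance between two histograms (an assumed choice of distance)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def wdmm_distance(feats_s, feats_t, weights):
    """Weighted sum over channels of the per-channel histogram distances."""
    return sum(weights[key] * chi_square(feats_s[key], feats_t[key]) for key in weights)


# Usage with two toy channels: the more textured channel receives the larger weight.
channels = {(1, 1): [[0, 0], [1, 1]], (1, 2): [[0, 1], [2, 3]]}
w = entropy_weights(channels)
test_feats = {(1, 1): [1, 2, 3], (1, 2): [0, 4, 1]}
train_feats = {(1, 1): [1, 2, 3], (1, 2): [0, 2, 3]}
print(w, wdmm_distance(test_feats, train_feats, w))
```

Under these assumptions a test sample would be assigned to the training sample with the smallest weighted distance, so channels with little texture information contribute little to the decision.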

Experiments
To verify the effectiveness and stability of the proposed method, simulated experiments were conducted on several public face databases, including the ORL, CMUPIE and YALE B databases, whose images contain different poses, different expressions and various illumination conditions. The proposed method is compared with other state-of-the-art methods, the abbreviations for which are listed in Table 1.
Table 1. Method abbreviations and their explanations.

Discussion of Computational Time
First, the computational time of the compared methods is discussed in this section. In our test, the size of the testing image is set to 128 × 128. Table 2 gives the corresponding results, which indicate that LBP costs the least time and has a low feature dimension. However, the recognition rate of LBP is also the lowest. In addition, LGBP and GGDP have relatively low feature dimensions. According to the following experiments, GGDP achieves the best results. When balancing effectiveness with efficiency, our proposed method has a considerable advantage over the other methods.

Discussion on Classification
In order to evaluate the effectiveness of the classifier, we used the CMUPIE face database, which contains images of 68 individuals, and each individual has 60 different poses, expressions and illumination conditions. Some images from CMUPIE are shown in Figure 4.
The results reported in Table 3 show that the nearest neighbor (NN) classifier has the worst performance because of its simple processing capacity. In contrast, WDMM achieves slightly better results than a support vector machine (SVM) with GGDP.

Experiments and Analysis on CMUPIE Database
To further evaluate the stability of the proposed method under different poses and illumination variations, we conducted experiments on the CMUPIE database. In these experiments, one subset of CMUPIE is selected, which contains 60 individuals, and each individual has 13 different poses and 4 different expressions. Moreover, 1, 2, 4, 6, 8 and 10 images are randomly chosen from each person's images as the training set, and the remaining images of the same person are used for testing. The comparison results of these methods are given in Table 4 and Figure 5. It is clear that the recognition rates of all methods increase as the number of training images increases. With 10 training images, the recognition rate of GGDP exceeds those of LGBP and LLDP by 1.96% and 2.55%, respectively; the gap to LLDP arises because LLDP mainly focuses on images with line structures (e.g., palmprints). Again, GGDP demonstrates its superior performance.
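The random training and testing protocol described above can be sketched as follows. This is only an illustration of the per-person split; the function name, the seed handling and the dictionary layout are hypothetical and are not taken from the paper.

```python
import random


def split_per_person(images_by_person, n_train, seed=0):
    """Randomly pick n_train images per person for training; the rest are for testing."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    train, test = {}, {}
    for person, images in images_by_person.items():
        chosen = set(rng.sample(range(len(images)), n_train))
        train[person] = [im for k, im in enumerate(images) if k in chosen]
        test[person] = [im for k, im in enumerate(images) if k not in chosen]
    return train, test


# Usage: 3 hypothetical persons with 12 image identifiers each, 4 used for training.
data = {p: ["p%d_img%d" % (p, k) for k in range(12)] for p in range(3)}
train_set, test_set = split_per_person(data, n_train=4)
print(len(train_set[0]), len(test_set[0]))
```

Repeating the split with several seeds and averaging the recognition rates would reduce the influence of any single random draw, although the text does not state whether this was done.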

Experiments and Analysis on the ORL Database
The ORL face database contains 400 grayscale images in PNG format of 40 individuals, with 10 images per individual. There are different facial expressions and poses in this database. Some images from ORL are shown in Figure 6. All face images are normalized to a size of 128 × 128.

To evaluate the effectiveness of the GGDP texture descriptor, experiments were conducted on the ORL database, which covers different poses and facial expressions. In this paper, 1, 2, 3, 4, 5 and 6 images are randomly chosen from each person's set for training, and the remaining images of the same person are used for testing. Table 5 and Figure 7 show the recognition results of the proposed method and the other benchmark methods with different training numbers. The recognition rates of all the compared methods increase as the training number increases. GGDP achieves the best results compared with the other methods because, in short, it can extract richer and more detailed features. With 6 training images, the recognition rate of GGDP exceeds that of its nearest competitor, LLDP, by 1.5%. In addition, GGDP outperforms LBP and LGBP by 13.75% and 6.5%, respectively.

To validate the effectiveness of the proposed method under various illuminations, we adopted the YALE B database to conduct experiments. The facial images of a total of 50 individuals in the Yale B database were selected to form a new sub-database called YALE B SET1, and each person has 64 images with different illuminations. In these experiments, 1, 2, 4, 8, 16 and 32 images are randomly chosen from each group for training and the remaining images are used for testing. The recognition rates of the proposed method and the other benchmark methods are shown in Table 6 and Figure 9. In general, the recognition rates of all methods again increase as the training number increases. Furthermore, GGDP achieves the best results once again, for the same reason as in the former experiments on the ORL database.

Conclusions
In this paper, we propose a texture feature description method based on GGDP and WDMM. First, a novel method called PDP is proposed, which extracts rich feature information and is insensitive to noise. Then, motivated by the search for a richer and more discriminative texture feature description method and by the need to reduce the high dimension of the local Gabor feature vector, we extend PDP to the multi-channel Gabor space to form the GGDP method. Furthermore, WDMM, which can effectively measure the feature distance between two images, is also presented for image sample classification and recognition. Simulated experiments demonstrate that the proposed recognition system achieves superior results. In future work, we will test our proposed method on other image databases to further validate its effectiveness, such as texture databases (e.g., PhoTex, ALOT and RawFooT), medical datasets (e.g., Histopathology and Pap smear), and so on. It may also be valuable to expand our research scope from face recognition to other practical applications, such as medical analysis, fingerprint recognition, image retrieval, etc.

Figure 2. Diagram of PDP (patch-structure direction pattern) descriptor.

Figure 5. Recognition rates of methods on CMUPIE with different training sample numbers.

Figure 7. Recognition rates of methods on ORL with different training sample numbers.

Experiments and Analysis on YALE B Database
The Yale B database has 10 subjects, and each subject contains 73 viewing conditions with 9 different poses and 64 different illumination conditions. The extended Yale B dataset contains 16,128 images of 28 individuals. Some images from the YALE B database are shown in Figure 8.

Figure 8. Part images of YALE B.

Figure 9. Recognition rates of methods on YALE B SET1 with different training sample numbers.

Table 2. Time cost for different feature extraction methods.

Table 3. Recognition rates of different classification methods.

Table 4. Recognition rates of methods on CMUPIE (Carnegie Mellon University pose, illumination, and expression) with different training sample numbers.

Table 5. Recognition rates of methods on ORL with different training sample numbers.

Table 6. Recognition rates of methods on YALE B SET1 with different training sample numbers.