A High Precision Feature Based on LBP and Gabor Theory for Face Recognition

How to describe an image accurately, with the most useful information and the least useless information, is a basic problem in the recognition field. In this paper, a novel high-precision feature called BG2D2LRP is proposed, together with a corresponding face recognition system. The feature contains both static texture differences and dynamic contour trends. It is based on Gabor and LBP theory and is built through a series of transformations: blocking, second derivative, directed orientation, layering and, finally, fusion. Seven well-known face databases, including FRGC, AR and FERET, are used to evaluate the accuracy and robustness of the proposed feature. A maximum improvement of 29.41% is achieved compared with other methods, and the ROC curves are also satisfactory. These experimental results demonstrate the feasibility and superiority of the new feature and method.


Background
The Gabor wavelet has been widely used to extract texture features [15] since J.G. Daugman showed that simple cells in the human visual cortex can be approximated by 2D Gabor filters and are selective to orientation and spatial frequency [16]. A detailed introduction to its performance is given in [17]. In this paper, the Gabor kernel G(x, y) is defined by function (4), where μ and ν are the orientation and scale of the kernel. In most cases, Gabor wavelets are used at five different scales (ν = 0, 1, 2, 3, 4) and eight different orientations (μ = 0, 1, 2, 3, 4, 5, 6, 7). Given a gray-level image T(x, y) that has already been pre-processed, we convolve it with the Gabor kernel for feature extraction and image representation:
$$P(x, y) = T(x, y) * G(x, y)$$
Local Binary Patterns (LBP) were introduced as a texture descriptor by Ojala [18]. The operator takes a point as the center point and computes the differences between it and the surrounding points. If a difference is larger than 0, the result is set to 1; otherwise it is set to 0. An example is given in Figure 1. The fixed square neighborhood was later replaced by a circular region and extended to different neighborhood sizes through two parameters, R and N, as shown in Figure 2. Here R is the radius of the region and N is the number of sampling points around the center point. Many methods based on LBP have been discussed, for example [19,20].
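The basic LBP operator described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: the function name `lbp_code`, the clockwise neighbor order starting at the top-left corner, and the choice of subtracting the center from the neighbor are all assumptions made for the example.

```python
# Offsets (row, col) of the 8 neighbors in a 3 x 3 window, clockwise from the
# top-left corner. The starting point and direction are an assumed convention.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col).

    `image` is a list of rows of gray levels; (row, col) must not lie on
    the image border.
    """
    center = image[row][col]
    code = 0
    for d_row, d_col in NEIGHBOR_OFFSETS:
        # Assumed direction of the difference: neighbor minus center.
        diff = image[row + d_row][col + d_col] - center
        bit = 1 if diff > 0 else 0  # rule from the text: difference > 0 gives 1
        code = (code << 1) | bit    # first neighbor ends up as the top bit
    return code


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch, 1, 1))  # bits 00001111, i.e. 15
```

Only the sign of each difference is kept, so adding a constant to every gray level in the window leaves the code unchanged.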

Layer Directed Derivative Local Radii-Changed Binary Pattern Feature (BG2D2LRP)
The Gabor and LBP features have been used successfully in face recognition because they are robust to illumination and expression changes. However, they have some shortcomings: (a) they are too simple and vague to be discriminative, since they describe only a rough, general outline, and many details and much of the unique information contained in an image are not fully used; (b) although they overcome the influence of changing light to a certain extent, they are not robust to severe changes. Motivated by this, we use not only the static local binary pattern of the pixels but also the dynamic texture changes; the latter are even more important, as they are more distinctive between persons. We compare the Gabor value of each point with multiple nearby points to judge its shape and changing trend using the Double LBP model. The radii are variable so that the radii giving the best performance can be identified (GLRP). Unlike the traditional way of derivation [14], we use eight orientations, since directed orientation enhances distinguishability (directed). We then take the dynamic changing trend of the face along a set direction as its main characteristic, which captures the unique contour information (derivative). The static differences are used as supplementary information. Note that the images are divided into blocks for better extraction performance, as mentioned in [21] (blocking). After fusion by the layer method (layer), the resulting feature (BG2D2LRP), which consists of both dynamic contour trends and static texture differences, carries information that is almost unique to each sample face. More details are given below.
First we choose a region containing two circles centered at the same point but with different radii. Each circle is sampled as described in Section 2, with N fixed at 8. The radii are variable, with different values and proportions. We use C1 and R1 to denote the outer circle and its radius, and C2 and R2 to denote the inner circle and its radius. Figure 3 shows the Double Radii LBP model. As Figure 3 shows, when N = 8 there are exactly eight orientations clockwise from the center point: 0°, 45°, 90°, 135°, 180°, 225°, 270° and 315°. The information of the local region can therefore be computed in the way shown in Figure 4, which not only filters out unnecessary information but also emphasizes the key information.
In Figure 4, there are eight iterations for a center point, and each one involves three small blocks: (1) a 5 × 5 box with three different ovals; (2) a 1 × 3 rectangular lattice with three arrows; (3) a 1 × 4 rectangular lattice. Each oval region, which covers three unit squares, is extracted from the 5 × 5 double radii LBP model. The thick arrows indicate the operations between the squares in each oval region. The pixel value differences between the related squares are then written into the 1 × 4 rectangular lattice, so each square obtains 4 × 8 features. Considering that the sign plays a more important role than the magnitude of the difference, as discussed in [22], we encode each difference by its sign and derive the code of P(i) from the codes of two successive differences. Here Code(P − C2(i)) is the first derivative at the central square, and Code(C2(i) − C1(i)) is the change in the surroundings of the center point. P(i) can therefore be regarded as the second directed derivative of the center square, which describes the texture variation tendency along a certain orientation around it. If the code is "1", the values increase or decrease monotonically from the center outward. If the code is "0", the values first increase and then decrease, or first decrease and then increase, from the center outward.
Here we list the encoded features in turn, as shown in Figure 5. The detailed result in Figure 5 is an example based on the model in Figure 3. For computational convenience, we convert the eight binary codes at each layer into a decimal number. These four decimal numbers contain almost all the required information for a square. In Figure 5, L1 is the ensemble of P(i). It reflects the essential attributes of the texture distribution for a person, which makes it easy to distinguish person A from person B. However, it is sometimes ambiguous: for features P1 and P2 in Figure 5, the L1 values are both 0, yet they represent opposite shapes.
This is why L2 and L3 are needed. The combination of L2 and L3 overcomes this defect and helps L1 pinpoint the change trend. L2 represents the difference between the local region C1 and the center square, while L3 represents the difference between the local regions C1 and C2 next to the center square. They describe the change trend outward from the center point step by step, whereas L1 only describes the global monotonic trend outward from the center point. In other words, when "1" appears in L1, the corresponding codes in L2 and L3 tell us whether the values are monotonically increasing or decreasing from the center outward.
L4 is the LBP code of the center point with a new radius. L4 and L2 together make use of two kinds of LBP information at the same time, which makes the feature more comprehensive and discriminative. We varied R1 and R2 in the experiments and compared the results.
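The four layers for a single center point can be sketched as follows. The sketch rests on several assumptions that the text does not fix: differences are encoded as 1 when positive and 0 otherwise; L1 is 1 when Code(P − C2(i)) and Code(C2(i) − C1(i)) agree; L2, L3 and L4 subtract the inner value from the outer one; the circles are sampled on the pixel grid without interpolation, with R2 = 1 and R1 = 2 as in the 5 × 5 model; and all function names are hypothetical. In the paper the input would be the Gabor response rather than raw gray levels.

```python
# Unit steps (row, col) for the eight orientations 0, 45, ..., 315 degrees,
# listed clockwise on screen. Which step counts as 0 degrees is an assumption.
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def sign_code(diff):
    """Binary sign code: 1 for a positive difference, otherwise 0 (assumed)."""
    return 1 if diff > 0 else 0


def bits_to_decimal(bits):
    """Pack a list of bits (most significant first) into one integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def layer_codes(image, row, col, r2=1, r1=2):
    """Return the decimal codes (L1, L2, L3, L4) of the pixel at (row, col).

    r2 is the radius of the inner circle C2, r1 that of the outer circle C1.
    """
    p = image[row][col]
    l1, l2, l3, l4 = [], [], [], []
    for d_row, d_col in DIRECTIONS:
        c2 = image[row + r2 * d_row][col + r2 * d_col]  # inner-circle sample
        c1 = image[row + r1 * d_row][col + r1 * d_col]  # outer-circle sample
        first = sign_code(p - c2)    # Code(P - C2(i))
        second = sign_code(c2 - c1)  # Code(C2(i) - C1(i))
        l1.append(1 if first == second else 0)  # 1: same trend on both steps
        l2.append(sign_code(c1 - p))   # outer circle versus center
        l3.append(sign_code(c1 - c2))  # outer circle versus inner circle
        l4.append(sign_code(c2 - p))   # inner circle versus center
    return tuple(bits_to_decimal(bits) for bits in (l1, l2, l3, l4))


def ring_image(center, inner, outer):
    """5 x 5 test image whose value depends only on the ring around (2, 2)."""
    levels = {0: center, 1: inner, 2: outer}
    return [[levels[max(abs(r - 2), abs(c - 2))] for c in range(5)]
            for r in range(5)]


print(layer_codes(ring_image(0, 10, 20), 2, 2))  # ramp:   (255, 255, 255, 255)
print(layer_codes(ring_image(0, 10, 0), 2, 2))   # ridge:  (0, 0, 0, 255)
print(layer_codes(ring_image(10, 0, 10), 2, 2))  # valley: (0, 0, 255, 0)
```

The ridge and valley patterns both give L1 = 0, which mirrors the ambiguity between P1 and P2 noted for Figure 5; under these assumptions the remaining layers separate the two shapes.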

Our Approach
Our approach is illustrated in Figure 6. First we use a homomorphic filter and histogram specification to pre-process the image. The Adaboost algorithm with Haar features [23] is then applied to locate an accurate facial contour, which prepares for the BG2D2LRP feature extraction. After the face has been detected and resized to 120 × 120, the BG2D2LRP feature is extracted from it. We choose PCA [24] and LDA [25] for dimension reduction because they enhance the recognition performance of our feature. Finally, given two vectors after the PCA and LDA transformations, we use cosine similarity [26] to measure the distance between them, because it is insensitive to differences in magnitude between vectors of the same individual and works well with the BG2D2LRP feature.
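The final matching step can be illustrated with a short sketch of cosine similarity between two reduced feature vectors. The function name and the example vectors are hypothetical; the PCA and LDA projections are assumed to have been applied already.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


probe = [0.2, -1.0, 0.5]
gallery = [0.4, -2.0, 1.0]   # same direction as probe, twice the magnitude
other = [1.0, 0.2, 0.0]      # orthogonal to probe

print(cosine_similarity(probe, gallery))
print(cosine_similarity(probe, other))
```

Because the measure depends only on the angle between the vectors, scaling a vector by a positive constant does not change the score.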

Control Group Design
Since our new feature is based on Gabor and LBP, we choose Gabor, LBP and LDP as comparison methods. To further improve recognition performance, we introduce PCA, LDA and cosine similarity and construct a complete face recognition system, so PCA and LDA are also included in the comparison. In addition, we compare our method with a similar method introduced in [12] to illustrate its advantages.

Database Sets
In the recognition module, we choose six well-known databases and a self-made database to test our new feature. Each has a different emphasis, as shown in Figure 7. The recognition rates are shown in Tables 1 and 2, the time consumed by each method is given in Table 3, and the ROC curves are shown in Figure 8.
From Tables 1 and 2 we can see that one of the major advantages of our new feature is that it greatly improves the recognition rate. The recognition rate of our method always remains above 95%. The recognition rates of most methods on the YALE database are even lower than on FRGC or FERET, although YALE seems relatively simple. The reason is that many of its images involve partial occlusions of the eyes or nose and thus provide limited information and strong interference, as shown in Figure 7. Nevertheless, the recognition rates of our feature and our approach still reach 96.98% and 97.27% in these cases. In addition, the recognition rate of the GLDP method is 83.42%, which is higher than that of PCA and KGWRCM. This demonstrates that the dynamic contour trend is more effective than static texture differences. Our feature gives an improvement of almost 12% over the GLDP method. This mainly results from two aspects: (a) our directed dynamic changing trend feature is better than the undirected LDP; (b) our static texture difference feature provides extra useful information to some extent.
Expression variation is a real challenge for face recognition. However, the recognition rate of our approach with our feature still reaches 95.11% and 97.28% on FERET. Expression changes can be regarded as a partial translation in the original two-dimensional (x, y) image; that is, the translation changes the point values but not the shape. LBP is popular in expression recognition because these changes can be eliminated by subtraction. Our feature, on the one hand, contains a double LBP texture difference, which provides more information; on the other hand, its dynamic changing trend remains stable however the expression varies, so it is markedly robust to expression changes.
Our feature performs well even with a single eye or a side face, as it extracts the most intrinsic characteristics of the image, which addresses the pose variation problem. The recognition rate of our feature on the AR database, which is known for its severe pose changes, is 96.32%, which supports this claim. The ABERDEEN database was collected under strongly varying lighting conditions. Our feature is insensitive to illumination because the difference operation eliminates the illumination changes. On the FRGC and FSTAR databases, which contain complicated changes, the advantage of our feature is even greater, with significant improvement over the existing related methods. In addition, the data show that the difference between our feature and our approach is always less than 1%. This means that although our approach improves the recognition rate, the high accuracy is mainly due to the feature itself.
We need the ROC curve to evaluate recognition performance, because the recognition rate only reflects the accuracy of the result. The smaller the area between the curve and the coordinate axes, the better the method. In Figure 8, the curves of our method and our feature always lie at the bottom, far below the curves of the other methods. The two curves are close and even overlap in some cases, which again supports the conclusion that the high accuracy is mainly due to the feature itself. The curve remains smooth even on the FRGC database, where the curves of other methods vary sharply. This should be attributed to the L2 and L3 parts of our new feature: although the light changes, the differences remain unchanged because the variation is eliminated by subtraction. The robustness to expression is mainly due to the L1 part of the new feature, which captures the texture changing trend of a person. Although the expression may change the gray values of the pixels, the trend of the values within a local region may remain stable. All results of our new feature and approach show a clear superiority over the results of LBP, Gabor features and other similar methods.
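The statement that a smaller area under the curve is better suggests that the curves plot one error rate against another. As an illustration only, the following sketch assumes that the axes are the false acceptance rate (FAR) and the false rejection rate (FRR) and that a pair is accepted when its similarity score reaches a threshold; the function names, scores and thresholds are hypothetical and are not taken from the experiments.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when pairs scoring >= threshold are accepted.

    `genuine` holds similarity scores of same-person pairs and `impostor`
    holds scores of different-person pairs.
    """
    false_accepts = sum(1 for s in impostor if s >= threshold)
    false_rejects = sum(1 for s in genuine if s < threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


def error_curve(genuine, impostor, thresholds):
    """Return one (FAR, FRR) point per threshold."""
    return [error_rates(genuine, impostor, t) for t in thresholds]


genuine_scores = [0.9, 0.8, 0.4]
impostor_scores = [0.5, 0.2, 0.1]
for far, frr in error_curve(genuine_scores, impostor_scores, [0.0, 0.45, 1.0]):
    print(far, frr)
```

Under this reading, a curve that stays close to both axes means that few impostors are accepted and few genuine users are rejected over the whole range of thresholds.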
In general, ORL, YALE and AR are relatively simple, so the recognition rates on them are high and the ROC curves are better than on the FRGC, FERET, FSTAR and ABERDEEN databases. It should be noted that the more difficult the database, the greater the difference between our feature and other methods. This means that our feature tolerates a wider range of changes than other methods. Static texture differences and dynamic contour trends together constitute our high-accuracy feature, which extracts the maximum intra-class similarity and inter-class dissimilarity.
In addition, to find out how the size of the local region affects the recognition rate, we conducted a further group of experiments in which the radii of C1 and C2 were changed each time and the results recorded. The results of this experiment are shown in Figure 8(h). We can see that the radii should be neither too big nor too small. The best choice is when the two circles are the two adjacent layers around the center point. If both radii are too small, the two circles are too close and the information is not representative, especially after subtraction. In contrast, if the local region is too sparse, the information from those pixels shows no regularity or connection.
Table 3 gives the time consumed by each method per 300 images. It refers to the average time that the training samples need to form a characteristic template. The table shows that the Gabor and LBP features are fairly fast, whereas PCA and LDA take more time because they involve complicated matrix operations. Our feature takes only 4.1 s, even though it contains many operations. This is because the operations in our feature are simple subtractions and logical operations, which consume less time than matrix operations.

Conclusions
In this paper, we introduced a new feature for face recognition. It absorbs the essence of the Gabor and LBP features and contains both static texture differences and dynamic contour trends, so it reflects the substantive characteristics of facial texture. It remains robust to changes in illumination, expression, pose and makeup. We tested the feature on seven well-known databases. The results show that the recognition rate is always above 95%, which gives us confidence that our feature can substantially improve the quality of face recognition. Furthermore, it takes only 4.1 s to complete the whole feature extraction procedure. However, our approach still has some deficiencies that could be improved. From the ROC curves, we find that the unrecognized images are mainly due to failed face detection, so improvements to the detection stage will be investigated in the future. Moreover, we will modify the BG2D2LRP feature, for example by finding the optimal radii for each database automatically or by changing the pixels sampled on the circles, in order to improve the recognition rate. A sparse representation classification model is also being investigated to replace PCA + LDA + cosine similarity; we believe it would considerably improve the performance of our feature. In addition, we will turn to 3D face recognition by adding depth information for better recognition performance.