A Kernel Gabor-Based Weighted Region Covariance Matrix for Face Recognition

This paper proposes a novel image region descriptor for face recognition, named the kernel Gabor-based weighted region covariance matrix (KGWRCM). As different parts of a face differ in how effectively they characterize and recognize it, we construct a weighting matrix by computing the similarity of each pixel within a face sample, so as to emphasize discriminative features. We then incorporate the weighting matrices into a region covariance matrix, yielding the weighted region covariance matrix (WRCM), to obtain discriminative face features for recognition. Finally, to further preserve discriminative features in a higher-dimensional space, we develop the kernel Gabor-based weighted region covariance matrix (KGWRCM). Experimental results show that the KGWRCM outperforms other algorithms, including the kernel Gabor-based region covariance matrix (KGRCM).


Introduction
Feature extraction from images or image regions is a key step in image recognition and video analysis. Recently, matrix-based feature representations [1][2][3][4][5][6] have been developed and employed for feature extraction. Tuzel et al. [3] introduced the region covariance matrix (RCM) as a new image region descriptor and applied it to object detection and texture classification. The RCM is the covariance matrix of basic features extracted from a region. The diagonal entries of the covariance matrix represent the variance of each feature, while the off-diagonal entries represent the correlations between features. Using the RCM as a region descriptor has several advantages. Firstly, the RCM provides a natural fusion method because it can fuse multiple basic features without any normalization or weighting operations. Secondly, the RCM can be invariant to rotations. Thirdly, its computational cost does not depend on the size of the region. Due to these advantages, the RCM has been employed to detect and track objects [3,5] and has achieved promising results. The RCMs in [3] and [5] were constructed from basic features including the pixel locations, color values, and the norms of the first- and second-order derivatives. However, directly employing the RCM for human face recognition does not achieve high recognition rates. To improve face recognition rates, Pang et al. [4] proposed the Gabor-based RCM (GRCM) method, which uses pixel locations and Gabor features to construct the region covariance. As Gabor features carry more discriminating information, GRCM displayed better performance. Subsequently, they also proposed a kernel Gabor-based RCM (KGRCM) method [7] to capture higher-order statistics beyond the original low-dimensional space. Their experimental results demonstrated that KGRCM can improve classification performance. Recently, KGRCM has also been applied to object detection and tracking [6].
This nonlinear descriptor can capture nonlinear relationships within image regions because the region covariance matrix is built in a kernel-induced feature space.
However, previous RCM-based methods consider each pixel in the training image to contribute equally when constructing the RCM, i.e., the contribution of each pixel pair is set to 1/N², where N is the number of pixels in a local region. This assumption of equal contribution does not hold in real-world applications, because pixels in different image parts may have different discriminative powers. For example, pixels at important facial features such as the eyes, mouth, and nose should be emphasized, while others, such as those on the cheeks and forehead, should be deemphasized.
Motivated by these observations, we propose in this paper a weighted region covariance matrix (WRCM) that explicitly exploits the different importance of each pixel of a sample. The WRCM, however, can only extract linear face features, whereas nonlinear features can achieve higher performance in face recognition tasks [7][8][9]. To further preserve nonlinear features, we develop the kernel Gabor-based weighted region covariance matrix (KGWRCM). Experimental results on the ORL Face database [10], the Yale Face database [11], and the AR database [12] show that the KGWRCM algorithm outperforms the RCM, the WRCM, the RCM with Gabor features (GRCM) [4], the kernel RCM with Gabor features (KGRCM) [7], and the conventional KPCA [9], Gabor + PCA [13], and Gabor + LDA [13] algorithms in terms of recognition rate.

Region Covariance Matrix (RCM)
The RCM [3] is the covariance matrix of features computed inside a region of an image. Let F be a two-dimensional image of size h × w, where h and w are the height and width of the image region, so the number of pixels in the region is N = h × w. Define a mapping φ that maps each pixel (k, l) of F onto a d-dimensional feature vector x_i:

$$x_i = \phi(F, k, l), \quad i = 1, \ldots, N \tag{1}$$

As a result, there are N d-dimensional feature vectors (x_i)_{i=1,…,N}. For an intensity image I, the feature mapping function is defined by the pixel locations, the gray value, and the norms of the first- and second-order derivatives of the intensities with respect to k and l:

$$x_i = \left[\, k, \; l, \; I(k,l), \; \left|\frac{\partial I}{\partial k}\right|, \; \left|\frac{\partial I}{\partial l}\right|, \; \left|\frac{\partial^2 I}{\partial k^2}\right|, \; \left|\frac{\partial^2 I}{\partial l^2}\right| \,\right]^{T} \tag{2}$$

The image region can then be represented by the d × d covariance matrix of the basic feature vectors x_i:

$$C = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)(x_i - \mu)^{T} \tag{3}$$

where μ is the mean of the feature vectors x_i:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{4}$$

Equation (3) can also be expressed in the equivalent pairwise form:

$$C = \frac{1}{2N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}(x_i - x_j)(x_i - x_j)^{T} \tag{5}$$

The computation process is given in Appendix A.
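As an illustration, the basic feature mapping and the covariance computation can be sketched in Python with NumPy. The function names and the use of `np.gradient` for the image derivatives are our own choices for this sketch, not part of the original method:

```python
import numpy as np

def basic_features(img):
    """Map each pixel of a grayscale image to the d=7 feature vector
    [k, l, I, |I_k|, |I_l|, |I_kk|, |I_ll|] used by the RCM descriptor."""
    h, w = img.shape
    k, l = np.mgrid[0:h, 0:w].astype(float)   # pixel coordinates
    Ik = np.gradient(img, axis=0)             # first-order derivatives
    Il = np.gradient(img, axis=1)
    Ikk = np.gradient(Ik, axis=0)             # second-order derivatives
    Ill = np.gradient(Il, axis=1)
    feats = np.stack([k, l, img, np.abs(Ik), np.abs(Il),
                      np.abs(Ikk), np.abs(Ill)])
    return feats.reshape(7, -1)               # d x N matrix of feature vectors

def rcm(X):
    """Region covariance matrix C = (1/N) * sum_i (x_i - mu)(x_i - mu)^T."""
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    return Xc @ Xc.T / X.shape[1]
```

The pairwise form with equal weights 1/(2N²) produces the same matrix, which is the identity exploited in the next section.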

Weighted Region Covariance Matrix (WRCM)
Based on the feature vectors x_i, the d × d weighted region covariance matrix of the image region is defined as:

$$C_W = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} S_{ij}\,(x_i - x_j)(x_i - x_j)^{T} = X(D - S)X^{T} \tag{6}$$

where X = [x_1, …, x_N] and S is a similarity matrix [14], which is chosen as:

$$S_{ij} = \exp\left(-\frac{\|x_i - x_j\|^{2}}{\sigma^{2}}\right) \tag{7}$$

with values ranging from 0 to 1, where σ is a suitable constant and D is a diagonal matrix whose entries are the column (or row) sums of S, i.e., D_{ii} = Σ_j S_{ij}. Comparing Equations (5) and (6), we can see that the WRCM is just the RCM if S_{ij} = 1/N², which implies that the RCM is a special case of the WRCM. However, as all the weights in the RCM equal 1/N², the RCM cannot exploit the different importance of each pixel of a sample. The WRCM, on the other hand, assigns a different weight to each pixel pair of a sample, so it can preserve more discriminative information than the RCM. As C_W in Equation (6) is a matrix-form feature, the commonly used vector distances do not apply. The generalized-eigenvalue-based distance proposed by Forstner [15] is hence used to measure the distance/dissimilarity between two WRCMs C_W^1 and C_W^2:

$$\rho\left(C_W^{1}, C_W^{2}\right) = \sqrt{\sum_{i=1}^{d}\ln^{2}\lambda_i\left(C_W^{1}, C_W^{2}\right)} \tag{8}$$

where λ_1, …, λ_d are the generalized eigenvalues of the two covariance matrices, computed from:

$$\lambda_i C_W^{1} u_i = C_W^{2} u_i, \quad i = 1, \ldots, d \tag{9}$$

To preserve both local and global patterns, similarly to [3,4], we represent a face image with five WRCMs computed from five different regions (R_1, R_2, R_3, R_4, and R_5) (Figure 1). The five WRCMs (C_{W_1}, C_{W_2}, C_{W_3}, C_{W_4}, and C_{W_5}) are constructed from the five regions. As C_{W_1} is the weighted region covariance matrix of the entire image region R_1, it is a global representation of the face. C_{W_2}, C_{W_3}, C_{W_4}, and C_{W_5} are extracted from the four local image regions (R_2, R_3, R_4, and R_5), so they are part-based representations of the face. After obtaining the WRCMs of each region, it is necessary to measure the distance between the gallery and probe sets. Let C_{W_i}^G and C_{W_i}^P be the WRCMs of the i-th region from the gallery and probe sets. The distance between a gallery face and a probe face is computed as follows:

$$d(G, P) = \sum_{i=1}^{5} \rho\left(C_{W_i}^{G}, C_{W_i}^{P}\right) \tag{10}$$
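A minimal sketch of the WRCM construction and the generalized-eigenvalue distance, assuming the Gaussian similarity S_ij = exp(−‖x_i − x_j‖²/σ²) and the graph-Laplacian form X(D − S)Xᵀ described above; the function names and the choice of σ are illustrative:

```python
import numpy as np
from scipy.linalg import eigh

def wrcm(X, sigma=1.0):
    """Weighted region covariance C_W = X (D - S) X^T, where
    S_ij = exp(-||x_i - x_j||^2 / sigma^2) and D holds the row sums of S."""
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise sq. dists
    S = np.exp(-sq / sigma**2)
    D = np.diag(S.sum(axis=1))
    return X @ (D - S) @ X.T

def forstner_distance(C1, C2):
    """Distance sqrt(sum_i ln^2 lambda_i), where lambda_i are the
    generalized eigenvalues of the pair (C1, C2)."""
    lam = eigh(C1, C2, eigvals_only=True)  # solves C1 u = lambda C2 u
    return np.sqrt(np.sum(np.log(lam) ** 2))
```

Note that `scipy.linalg.eigh` requires the second matrix to be positive definite; for degenerate regions a small ridge would be needed, as in the kernel formulation later in the paper.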

Kernel Weighted Region Covariance Matrix (KWRCM)
To generalize the WRCM to the nonlinear case, we use a nonlinear mapping Φ: ℝ^d → Ω to map the feature vectors into a higher-dimensional space Ω. A linear WRCM is then computed to preserve the intrinsic geometric structures in Ω. Suppose that R_1 and R_2 are two rectangular regions in the gallery and probe set images, respectively, and let m and n be the numbers of pixels located in regions R_1 and R_2, respectively. Φ_X = [Φ(x_1), Φ(x_2), …, Φ(x_m)] and Φ_Y = [Φ(y_1), Φ(y_2), …, Φ(y_n)] are the higher-dimensional features extracted from regions R_1 and R_2, where x_1, …, x_m and y_1, …, y_n are the basic feature vectors of the two regions.
Let C_{R_1}^Φ and C_{R_2}^Φ be the kernel weighted region covariance matrices of regions R_1 and R_2, respectively. They are computed as follows:

$$C_{R_1}^{\Phi} = \Phi_X L^{*} \Phi_X^{T} \tag{11}$$

where L^* = D^* − S^*, S^*_{ij} = exp(−‖x_i − x_j‖²/σ²), and D^*_{ii} = Σ_j S^*_{ij};

$$C_{R_2}^{\Phi} = \Phi_Y L^{\#} \Phi_Y^{T} \tag{12}$$

where L^# = D^# − S^#, S^#_{ij} = exp(−‖y_i − y_j‖²/σ²), and D^#_{ii} = Σ_j S^#_{ij}.
The generalized eigenvalues of the two kernel WRCMs are defined by

$$\lambda\, C_{R_1}^{\Phi} u = C_{R_2}^{\Phi} u \tag{13}$$

Since each eigenvector u lies in the span of the mapped features, it can be expanded as

$$u = \Phi_X \alpha + \Phi_Y \beta = [\Phi_X, \Phi_Y]\begin{bmatrix}\alpha \\ \beta\end{bmatrix} \tag{14}$$

Combining Equations (13) and (14), the generalized eigenvalue task in Equation (13) can be expressed in the form of block kernel matrices:

$$\lambda \begin{bmatrix} K(X,X) \\ K(Y,X) \end{bmatrix} L^{*} \begin{bmatrix} K(X,X) & K(X,Y) \end{bmatrix} \begin{bmatrix}\alpha \\ \beta\end{bmatrix} = \begin{bmatrix} K(X,Y) \\ K(Y,Y) \end{bmatrix} L^{\#} \begin{bmatrix} K(Y,X) & K(Y,Y) \end{bmatrix} \begin{bmatrix}\alpha \\ \beta\end{bmatrix} \tag{15}$$

The detailed derivation of Equation (15) is given in Appendix B. We define the matrices U, B, and A as

$$U = \begin{bmatrix}\alpha \\ \beta\end{bmatrix} \tag{16}$$

$$B = \begin{bmatrix} K(X,X) \\ K(Y,X) \end{bmatrix} L^{*} \begin{bmatrix} K(X,X) & K(X,Y) \end{bmatrix} \tag{17}$$

$$A = \begin{bmatrix} K(X,Y) \\ K(Y,Y) \end{bmatrix} L^{\#} \begin{bmatrix} K(Y,X) & K(Y,Y) \end{bmatrix} \tag{18}$$

Equation (15) can then be rewritten as:

$$\lambda B U = A U \tag{19}$$

When B is positive definite, the generalized eigenvalues are obtained through solving the following eigenvalue problem:

$$\lambda U = B^{-1} A U \tag{20}$$

However, in many cases B is a singular matrix; we hence incorporate a regularization parameter u > 0:

$$\lambda (B + uI) U = A U \tag{21}$$

where I is an identity matrix. When u is large enough, (B + uI) is positive definite, and Equation (21) becomes a standard eigenvalue problem:

$$\lambda U = (B + uI)^{-1} A U \tag{22}$$

Based on the eigenvalues obtained by Equation (9) or Equation (22), we compute the distance between the two image regions R_1 and R_2 using Equation (8).
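The regularized eigenvalue problem can be sketched generically in NumPy; this is an illustrative sketch rather than the authors' implementation, and the ridge value u is an arbitrary choice:

```python
import numpy as np

def regularized_gen_eigvals(A, B, u=1e-3):
    """Generalized eigenvalues of lambda * (B + uI) U = A U, solved as the
    standard eigenvalue problem (B + uI)^{-1} A U = lambda U.
    The ridge u > 0 keeps the problem well-posed when B is singular."""
    n = B.shape[0]
    M = np.linalg.solve(B + u * np.eye(n), A)  # (B + uI)^{-1} A
    return np.linalg.eigvals(M)
```

When A and B are symmetric positive semidefinite (as the block kernel matrices are in exact arithmetic), the eigenvalues of (B + uI)⁻¹A are real and nonnegative, so their logarithms can be fed into the distance of Equation (8).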

Kernel Gabor-Based Weighted Region Covariance Matrix (KGWRCM)
In Equation (2), features such as the pixel locations (k, l), the intensity values, and the norms of the first- and second-order derivatives of the intensities with respect to k and l are effective for tracking and detecting objects. However, their discriminating ability is not strong enough for face recognition [4]. To further improve the performance, Gabor features are added to the feature space. A 2-D Gabor wavelet kernel is the product of an elliptical Gaussian envelope and a complex plane wave, defined as:

$$\psi_{u,v}(z) = \frac{\|k_{u,v}\|^{2}}{\sigma^{2}} \exp\left(-\frac{\|k_{u,v}\|^{2}\|z\|^{2}}{2\sigma^{2}}\right)\left[\exp\left(i\,k_{u,v}\cdot z\right) - \exp\left(-\frac{\sigma^{2}}{2}\right)\right] \tag{23}$$

where u and v define the orientation and scale of the Gabor kernels, z = (x, y), ‖·‖ denotes the norm operator, and the wave vector k_{u,v} is defined as follows:

$$k_{u,v} = k_v e^{i\phi_u}, \quad k_v = \frac{k_{\max}}{f^{v}}, \quad \phi_u = \frac{\pi u}{8} \tag{24}$$

The Gabor feature representation of an image I is the magnitude of its convolution with the Gabor kernels:

$$G_{u,v}(k, l) = \left| I(k, l) * \psi_{u,v}(k, l) \right| \tag{25}$$

where * denotes the convolution operator and |·| is the magnitude operator. Therefore, a feature mapping function based on Gabor features is obtained by:

$$x_i = \left[\, k, \; l, \; G_{0,0}(k,l), \; G_{1,0}(k,l), \; \ldots, \; G_{7,4}(k,l) \,\right]^{T} \tag{26}$$

As the Gabor wavelet representation can capture salient visual properties such as spatial localization, orientation selectivity, and spatial-frequency characteristics, Gabor-based features can carry more important information. The proposed KGWRCM method can be briefly summarized as follows: (1) partition a face image into five regions (R_1, R_2, R_3, R_4, and R_5), and extract the basic features of the five regions using Equation (26); (2) compute the two weight matrices L^* and L^# using Equations (11) and (12), obtain the four kernel matrices K(X,X), K(X,Y), K(Y,X), and K(Y,Y), and from these compute the matrices A and B using Equations (17) and (18); (3) solve the eigenvalue problem of Equation (22) and compute the distance between corresponding regions using Equation (8); (4) compute the distance between faces using Equation (10), and employ the nearest-neighbor classifier to perform classification.
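The Gabor feature mapping can be sketched as follows, with 8 orientations and 5 scales. The parameter values σ = 2π, k_max = π/2, f = √2, and the 31 × 31 kernel size are common choices assumed for this sketch, not values stated in this paper:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(u, v, size=31, sigma=2 * np.pi,
                 kmax=np.pi / 2, f=np.sqrt(2)):
    """2-D Gabor wavelet psi_{u,v}: elliptical Gaussian envelope times a
    complex plane wave, for orientation u (of 8) and scale v (of 5)."""
    k = kmax / f**v                      # wave number k_v = kmax / f^v
    phi = np.pi * u / 8                  # orientation phi_u
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    z2 = x**2 + y**2
    envelope = (k**2 / sigma**2) * np.exp(-k**2 * z2 / (2 * sigma**2))
    wave = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2)
    return envelope * wave

def gabor_magnitudes(img, scales=5, orients=8):
    """Magnitude responses |I * psi_{u,v}| for all scales and orientations,
    stacked with the pixel coordinates into a (2 + 40) x N feature matrix."""
    h, w = img.shape
    k, l = np.mgrid[0:h, 0:w].astype(float)
    feats = [k, l]
    for v in range(scales):
        for u in range(orients):
            resp = fftconvolve(img, gabor_kernel(u, v), mode='same')
            feats.append(np.abs(resp))
    return np.stack(feats).reshape(2 + scales * orients, -1)
```

Each column of the result is one pixel's 42-dimensional Gabor-based feature vector, from which the weighted region covariance matrices are then built.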

Experimental Results
We tested the KGWRCM algorithm on the ORL Face database [10], the Yale Face database [11], and the AR Face database [12]. The ORL Face database comprises 400 images of 40 distinct subjects. Each subject provides 10 images that include variations in pose and scale. To reduce computational cost, each original image is resized to 56 × 46 by nearest-neighbor interpolation. A random subset with five labeled images per individual is taken as the training set, and the remaining images form the testing set. There are in total 252 different ways of selecting five images for training and five for testing. We select 20 random subsets with five images for training and five for testing.
The Yale Face database contains 165 grayscale images, with 11 images for each of 15 individuals. These images are subject to expression and lighting variations. In this recognition experiment, all face images of size 80 × 80 were resized to 40 × 40. Five images of each subject were randomly chosen for training, and the remaining six images were used for testing. There are hence 462 possible selections. We select 20 random subsets with five images for training and six for testing.
The AR database consists of over 4,000 images of 126 people (70 men and 56 women). These images include more facial variations, including illumination changes and facial occlusions (sunglasses and scarf). For each individual, 26 pictures were taken in two separate sessions, and each session contains 13 images. In the experiment, we chose a subset of the data set consisting of 50 male and 50 female subjects with seven images for each subject. The size of the images is 165 × 120. From the seven images, we select two for training and five for testing, giving 21 different possible selections. Figure 2 shows some example images of the first subject in each database.

These results clearly show that the proposed KGWRCM method can capture more discriminative information than the other methods for face recognition. In particular, KGWRCM and WRCM outperform KGRCM and RCM, which implies that the weighted approaches better emphasize the more important parts of faces while deemphasizing the less important ones, and thus preserve discriminative information for face recognition.
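The evaluation protocol above (random gallery/probe splits scored with a nearest-neighbor classifier) can be sketched generically; the Euclidean distance in the usage below is only a stand-in for the matrix distance of Equation (8), and all names are illustrative:

```python
import numpy as np

def nn_recognition_rate(gallery, g_labels, probe, p_labels, dist):
    """Nearest-neighbor classification: each probe sample takes the label
    of its closest gallery sample under the distance `dist`; returns the
    fraction of probe samples labeled correctly (the recognition rate)."""
    correct = 0
    for x, y in zip(probe, p_labels):
        d = [dist(x, g) for g in gallery]
        correct += g_labels[int(np.argmin(d))] == y
    return correct / len(probe)
```

In the experiments, `dist` would be the region-summed WRCM distance, and the rate would be averaged over the 20 random training/testing splits.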

Conclusions
In this paper, an efficient image representation method for face recognition, called KGWRCM, is proposed. Considering that some pixels in a face image are more effective than others in representing and recognizing faces, we have constructed the KGWRCM based on a weighted score for each pixel within a sample, so as to duly emphasize the facial features. As the weighting matrix carries more important information, the proposed method has shown good performance. Experimental results also show that the proposed KGWRCM method outperforms other approaches in terms of recognition accuracy. However, similar to KGRCM, the computational cost of KGWRCM is high due to the computation of high-dimensional kernel matrices. In future work, an efficient KGWRCM method with low computational complexity will be developed for face recognition.