Fingerprint Liveness Detection Based on Fine-Grained Feature Fusion for Intelligent Devices

Abstract: Currently, intelligent devices with fingerprint identification are widely deployed in our daily life. However, they are vulnerable to attack by fake fingerprints made of special materials. To elevate the security of these intelligent devices, many fingerprint liveness detection (FLD) algorithms have been explored. In this paper, we propose a novel detection structure to discriminate genuine fingerprints from fake ones. First, to describe the subtle differences between them and take advantage of texture descriptors, three different fine-grained texture feature extraction algorithms are used. Next, we develop a feature fusion rule, comprising five operations, to better integrate the above features. Finally, the fused features are fed into a support vector machine (SVM) classifier for classification. Data analysis on three standard fingerprint datasets indicates that our method outperforms other FLD methods proposed in recent literature. Moreover, results for blind materials are also reported.


Introduction
Protecting digital information from illegal attacks is becoming increasingly important in our daily life [1][2][3]. With the coming of the information age, intelligent devices with fingerprint identification are used in various information management systems, such as quick payment and bank attendance systems. The development of image technology and the application of intelligent devices enable us to capture significant amounts of high-resolution images. Among these, biometric images have attracted considerable attention owing to the popularity of intelligent devices with biometric authentication. Unlike conventional authentication methods based on passwords and tokens, biometrics has the advantage of being hard to forget, copy, lose, or forge. Thus, as an important biometric technique, fingerprint identification is widely employed for unlocking intelligent devices or using them for payment. However, as a result of this widespread use, fingerprints are becoming the targets of attackers or imposters. Scholars [4] have shown that intelligent devices with fingerprint identification are vulnerable to artificial replicas made from common materials, such as silica, gelatin, clay, and Play-Doh, and that attackers or imposters can deceive optical and capacitive sensors by pressing these forged fingerprints on the surface of the scanners. Thus, one of the common problems with these intelligent devices is that they cannot guarantee the authenticity of fingerprints before identification; specifically, they cannot distinguish between genuine and fake fingerprints [5].
The fingerprint liveness detection (FLD) method aims to solve the problem of spoofing attacks. In recent years, several researchers have devoted considerable effort to distinguishing genuine fingerprints from fake ones based on different physical or physiological characteristics [5]. Our experiments indicate that the detection performance of the proposed method outperforms other algorithms, and that it also achieves better detection performance in blind material detection.
The remainder of this paper is organized as follows. Section 2 presents the methodology, including different feature extraction algorithms and particulars of the proposed model. Section 3 describes the database and design of the experiments. Section 4 analyzes the results of the experiments. Section 5 concludes.

Feature Extraction
To better describe the differences between genuine and fake fingerprints, we establish feature fusion rules that combine the features extracted by three feature extraction algorithms (SIFT, LBP, and HOG). As shown in Figure 1, our framework consists of two processes, namely, the training process and the testing process. The former obtains the model classifier from the training set, while the latter uses the testing set to verify the performance of the model classifier. First, the training set and the testing set of fingerprint images are used as the inputs of the feature extraction stage (comprising three feature extractors: SIFT, LBP, and HOG). Because the dimensions of the extracted features differ, it is hard to splice them directly. Thus, before feature fusion, the shorter feature vectors are padded with 0. Next, the above features are processed using the feature fusion operation proposed in this paper. Then the fused features of the training set are input into the SVM classifier for training, and the model classifier is obtained. Finally, in the evaluation stage, the testing set is used to verify the performance of the model classifier.

SIFT is a local feature descriptor that can detect the key subtle differences between genuine and fake fingerprints. As a stable descriptor of local features, SIFT remains unchanged when images are rotated and zoomed, even when the intensity changes. First, the image scale is reconstructed using gray-scale transformation to obtain the multi-scale space representation sequences of the image, and the main contour of the scale space is extracted from these sequences, which are regarded as a feature vector to realize the extraction of key points in edge and corner detection at different resolutions.
Then, to ensure that the detected key points are local extreme points in both the scale space and the two-dimensional image space, each pixel is compared with its adjacent points, and the key points are located. In addition, the stable extreme points are extracted in the space of different scales to guarantee the scale invariance of the key points. To make the key points invariant to image angle and rotation, direction assignment is realized by finding the gradient of each extremum. Finally, the key point descriptor generates a unique vector by dividing the pixel area around the key point into blocks and calculating the gradient histogram within each block; this vector is an abstract representation of the image information in the area.

Model
In the above calculation, the scale space L(x, y, σ) is obtained by convolving the original image I(x, y) with a variable-scale two-dimensional Gaussian function G(x, y, σ), whose distribution is as follows:
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right) \qquad (1)$$
The scale space of the image is then the convolution of this Gaussian with the original image:
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y) \qquad (2)$$
LBP [18,19] is an operator used to describe the local texture features of images, and has the obvious advantages of rotation invariance and gray invariance. The aim is to measure the local contrast of the fingerprints and describe the local texture information of the image. Before constructing the local texture, we preprocess the given image, transform it into a gray-scale image, and analyze its pixels. The LBP operator is defined in a window of size 3 × 3, and the threshold is the pixel in the center of the window. The central pixel value is compared with those of the 8 adjacent pixels. If a surrounding pixel is larger than the central pixel value, its position is marked as 1, otherwise as 0. In this way, comparing the 8 adjacent points in the 3 × 3 window generates 8 bits, which are arranged in sequence to form a binary number. This value is taken as the LBP value of the center pixel and reflects the texture information of the 3 × 3 window. Usually, the image after the LBP operation is divided into many square regions, such as 4 × 4, 10 × 10, or 16 × 16, giving 16, 100, or 256 histograms, respectively, and the fingerprint image is represented by the histograms of these regions. The equation of the LBP is as follows:
$$\mathrm{LBP}(x_c, y_c) = \sum_{i=0}^{7} s(p_i - p_c)\, 2^{i} \qquad (3)$$
where [x_c, y_c] represents the position of the center pixel in a 3 × 3 window, p_i and p_c denote the gray values of the neighbor pixel and center pixel, respectively, and s[·] represents the symbolic function. The formula of the symbolic function is as follows:
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
HOG [20] features are local features built from the gradient direction histograms of the given image. Since HOG features capture the structure of edges (gradients), they can describe local shape information and are therefore a commonly used feature descriptor. The quantization of position and direction space restrains the influence of translation and rotation to some extent. In addition, normalizing the histogram in the local region partially offsets the influence of illumination change. Before calculation, gray-scale and brightness correction are carried out to reduce the influence of local shadow and light changes in the image; this also suppresses noise to some extent. Then, to obtain a histogram of gradients, the horizontal and vertical gradients of the image are calculated by filtering the image with the kernel matrix. Next, the gradient magnitude and direction of each pixel are calculated. Each cell consists of 4 × 4 pixels, and the histogram of gradients is accumulated over the pixels in the cell. To make the generated feature robust to light, shadow, and edge changes, it is also necessary to normalize the HOG features of each block. Finally, a block consists of 4 × 4 cells, and the features of the blocks are concatenated to obtain the final feature of the image, which is employed for subsequent classification.
Figure 2 visualizes the HOG features of genuine and fake fingerprints. The features of the genuine fingerprints are evenly distributed, while those of the fake ones are more damaged, with stains and other blurred regions.
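To make the 3 × 3 LBP operator described in this section concrete, the following is a minimal pure-Python sketch rather than the implementation used in the experiments. The function names (`lbp_value`, `lbp_image`, `lbp_histogram`) and the clockwise neighbor ordering are assumptions made for illustration; the comparison follows the text (a neighbor larger than the center is marked 1).

```python
def lbp_value(window):
    """LBP code of the center pixel of a 3x3 window (list of three rows)."""
    center = window[1][1]
    # Neighbors in clockwise order from the top-left corner.
    # The ordering is an assumption; any fixed ordering gives a valid code.
    neighbors = [window[0][0], window[0][1], window[0][2],
                 window[1][2], window[2][2], window[2][1],
                 window[2][0], window[1][0]]
    code = 0
    for i, p in enumerate(neighbors):
        bit = 1 if p > center else 0  # neighbor larger than center -> 1
        code += bit << i              # weight the i-th bit by 2**i
    return code


def lbp_image(gray):
    """LBP codes for all interior pixels of a gray-scale image (list of rows)."""
    height, width = len(gray), len(gray[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            window = [r[x - 1:x + 2] for r in gray[y - 1:y + 2]]
            row.append(lbp_value(window))
        codes.append(row)
    return codes


def lbp_histogram(codes):
    """256-bin histogram of LBP codes for one region."""
    hist = [0] * 256
    for row in codes:
        for c in row:
            hist[c] += 1
    return hist


example = [[6, 5, 2],
           [7, 6, 1],
           [9, 8, 7]]
example_code = lbp_value(example)
```

In practice the image would be split into regions and one histogram computed per region, as the text describes; the sketch computes a single histogram for whatever region it is given.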

Feature Fusion Rule
Different features can be fused in many ways. To better represent the differences between genuine and fake fingerprints, we construct feature fusion rules to combine the extracted features. Different feature extractors produce feature vectors of different dimensions. For example, in our method, the feature extracted by SIFT has 128 dimensions, the feature extracted by HOG has 379 dimensions, and the feature extracted by LBP has 312 dimensions. Thus, it is difficult to splice them directly. To fuse these features and unify their dimensions, we pad the shorter feature vectors with 0 before fusion. That is, before the splicing operation, the end of each feature vector is filled with 0 so that all features have the same dimension; in our method, this dimension is 379. Then, we design five feature fusion rules, namely, the addition operation, maximum operation, minimum operation, average operation, and concatenation operation. Table 1 reports the specific operation for each feature fusion rule, where F_x denotes the feature obtained with feature extractor x (x is SIFT, LBP, or HOG), and the addition, maximum, minimum, average, and concatenation operations are abbreviated as Add, Max, Min, Ave, and Con, respectively. The detailed procedure is given in Algorithm 1.
Algorithm 1:
Step 1: Extract the feature of the image using the corresponding feature extractor x (x is SIFT, LBP, or HOG), denoted F_x;
Steps 2-3: [...]
Step 4: For the kth image, implement feature fusion via the matrix operation of the selected rule;
Step 5: Use the SVM to train on the fused features;
Step 6: Use the testing set to validate the performance of the model classifier.
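The zero padding and the five fusion rules can be sketched as follows. Table 1 is not reproduced here, so the element-wise definitions of Add, Max, Min, and Ave, and the choice to concatenate the padded vectors for Con, are assumptions; the helper names `pad` and `fuse` are hypothetical and not taken from the paper.

```python
def pad(vec, length):
    """Fill the end of a feature vector with 0 up to the given length."""
    return list(vec) + [0.0] * (length - len(vec))


def fuse(features, rule):
    """Fuse several feature vectors with one of the rules Add/Max/Min/Ave/Con.

    Assumption: Add, Max, Min, and Ave act element-wise on the zero-padded
    vectors, and Con concatenates the padded vectors end to end.
    """
    target = max(len(f) for f in features)
    padded = [pad(f, target) for f in features]
    if rule == "Con":
        return [v for f in padded for v in f]
    columns = list(zip(*padded))  # one tuple per feature dimension
    if rule == "Add":
        return [sum(c) for c in columns]
    if rule == "Max":
        return [max(c) for c in columns]
    if rule == "Min":
        return [min(c) for c in columns]
    if rule == "Ave":
        return [sum(c) / len(c) for c in columns]
    raise ValueError("unknown fusion rule: " + rule)


# Toy vectors standing in for two descriptors of different length.
f_a = [1.0, 2.0, 3.0]
f_b = [4.0, 5.0]
fused_add = fuse([f_a, f_b], "Add")
```

Note that with zero padding the Min rule returns 0 in every padded position, which is one way the more expressive components of a longer descriptor could be lost.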

Parameter Optimization
After fusing the features using our proposed rules, the generated features are fed into an SVM classifier for the subsequent training and testing.
The basic model of SVM is a binary classification model, which suits binary fingerprint liveness detection. SVM is a model classifier based on the criterion of structural risk minimization [22,23], and is divided into two categories depending on the kernel function: linear or nonlinear. Because the fused features are high-dimensional and not linearly separable in the original feature space, we choose an RBF (radial basis function) [21] kernel to realize the nonlinear mapping. To eliminate the adverse effects caused by outlying feature dimensions, a standardization operation is performed first. Then, to obtain a robust and effective model classifier, two parameters must be found: the penalty coefficient C and gamma. Parameter C, which is common to all SVM kernels, trades off the misclassification of training samples against the simplicity of the decision surface. A smaller C makes the decision surface smoother, while a larger C aims to classify all training samples correctly. The parameter gamma defines how far the influence of a single training sample reaches. It can be considered as the inverse of the radius of influence of the samples selected by the model as support vectors, so a larger gamma confines the influence of each sample to a smaller neighborhood. Finally, using the optimal parameter pair <C, gamma>, we obtain the model classifier and test its performance on the testing samples.
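The role of gamma can be illustrated with the standard RBF kernel, K(x, y) = exp(-gamma · ||x - y||²). The formula is the usual textbook definition and is not quoted from the paper; the function name `rbf_kernel` and the sample values are chosen only for illustration.

```python
import math


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two points at distance 1: a small gamma keeps them strongly related,
# a large gamma makes the kernel value (the "influence") almost vanish.
point_a = [0.0, 0.0]
point_b = [1.0, 0.0]
k_small_gamma = rbf_kernel(point_a, point_b, 0.1)
k_large_gamma = rbf_kernel(point_a, point_b, 10.0)
```

This matches the reading of gamma as the inverse of a radius of influence: the larger gamma is, the faster the kernel value decays with distance.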

Databases
The performance of our proposed method is evaluated using the benchmark fingerprint datasets LivDet 2011 [13], 2013 [24], and 2015 [6], which were derived from the 2011, 2013, and 2015 FLD competitions, respectively, and can be downloaded publicly after registration. Each set consists of real and fake fingerprints and is procured using four different flat optical sensors. Each real or fake dataset also consists of two parts: a training set and a testing set. The detailed description of the LivDet 2011, 2013, and 2015 datasets is given in Tables 2-4, which show the distribution of fingerprint images. It is worth emphasizing that there is no overlap between the training and testing sets.

Experimental Process and Evaluation Metrics
First, we adopt an image gray processing operation to eliminate the influence of light and other factors on the fingerprints. Then, the features of the fingerprints are extracted via three classical feature extraction algorithms, namely, SIFT, LBP, and HOG. However, the detection performance of the fingerprint liveness based on a single feature method is unsatisfactory, and our experimental results also confirm this. To solve the problem, one possible solution is to fuse the features to make up for the shortcomings of a single feature algorithm, thereby further enhancing the final performance.
Because of the difference between the three algorithms, the dimensions of the features extracted are inconsistent. To successfully perform the five feature fusion operations in Section 2.2, insufficient parts need to be filled with 0. Since the distributions and ranges of each feature are different, it is necessary to map these features extracted to the same interval using normalization operations to make the components of features consistent. Moreover, rescaling to the appropriate range can make training and testing faster. Then, it is necessary to optimize parameters to find the best C and gamma, which are employed for the subsequent model training. Finally, the classification result is obtained using a trained model classifier.
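As a sketch of the normalization step, the following maps every feature component to a common interval using minimum and maximum values estimated on the training set. The paper does not state the target interval or the scaling tool, so the interval [0, 1] and the helper names `fit_min_max` and `apply_min_max` are assumptions.

```python
def fit_min_max(train_vectors):
    """Per-dimension minimum and maximum over the training feature vectors."""
    columns = list(zip(*train_vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return lows, highs


def apply_min_max(vec, lows, highs):
    """Map each component to [0, 1]; constant dimensions are mapped to 0."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(vec, lows, highs)]


# Toy training set with three feature dimensions; the last one is constant,
# as would happen for a zero-padded position.
train = [[0.0, 10.0, 5.0],
         [2.0, 20.0, 5.0]]
lows, highs = fit_min_max(train)
scaled = apply_min_max([1.0, 15.0, 5.0], lows, highs)
```

The same `lows` and `highs` would be reused for the testing set, so that training and testing features share one scale.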
To verify the performance of the feature extraction algorithm in this paper, we adopt the average classification error (ACE) [24][25][26] as the performance metric. It is defined as follows:
$$\mathrm{ACE} = \frac{\mathrm{FAR} + \mathrm{FRR}}{2} \qquad (5)$$
where FAR (false accept rate) denotes the rate at which a fake fingerprint is mistaken for a genuine one, while FRR (false reject rate) is the rate at which a genuine fingerprint is improperly rejected as a fake one; these can be expressed as follows:
$$\mathrm{FAR} = \frac{\text{number of fake fingerprints classified as genuine}}{\text{total number of fake fingerprints}} \times 100 \qquad (6)$$
$$\mathrm{FRR} = \frac{\text{number of genuine fingerprints classified as fake}}{\text{total number of genuine fingerprints}} \times 100 \qquad (7)$$
Each of these quantities may take any value between 0 and 100. Finally, we obtain the performance of our proposed algorithm using Equation (5). The smaller the ACE, the better the detection performance of the algorithm.
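The metric can be computed from ground-truth labels and predictions as in the short sketch below. The label convention (1 for genuine, 0 for fake) and the function name `ace_metrics` are assumptions for illustration; FAR and FRR follow the definitions given in the text and are expressed in percent.

```python
def ace_metrics(y_true, y_pred):
    """Return (FAR, FRR, ACE) in percent; labels: 1 = genuine, 0 = fake."""
    fake_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    live_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    # FAR: share of fake fingerprints accepted as genuine.
    far = 100.0 * sum(1 for p in fake_preds if p == 1) / len(fake_preds)
    # FRR: share of genuine fingerprints rejected as fake.
    frr = 100.0 * sum(1 for p in live_preds if p == 0) / len(live_preds)
    return far, frr, (far + frr) / 2.0


# Four genuine and four fake samples: one genuine rejected, two fakes accepted.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 1, 1]
far, frr, ace = ace_metrics(truth, preds)
```

For this toy input, FAR is 50, FRR is 25, and ACE is 37.5.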

Parameter Optimization
Before training an SVM with an RBF kernel, it is necessary to find the optimal parameter pair <C, gamma> in order to obtain a more robust model classifier. We directly use the grid.py program in the libsvm [23] toolkit to search over candidate pairs, and take the pair <C, gamma> with the highest classification accuracy as the optimal parameters.
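The search can be pictured as an exhaustive loop over an exponentially spaced grid. The sketch below is not grid.py itself: the exponent ranges (log2 C from -5 to 15 in steps of 2, log2 gamma from 3 to -15 in steps of -2) are the defaults as we recall them and should be treated as an assumption, and `toy_score` is a hypothetical stand-in for the cross-validation accuracy that a real run would measure by training an SVM.

```python
import math


def exponent_grid(begin, end, step):
    """Values 2**e for e = begin, begin + step, ... up to and including end."""
    values = []
    e = begin
    while (step > 0 and e <= end) or (step < 0 and e >= end):
        values.append(2.0 ** e)
        e += step
    return values


def grid_search(score, c_values, gamma_values):
    """Return (best_score, best_C, best_gamma) over all <C, gamma> pairs."""
    best = None
    for c in c_values:
        for g in gamma_values:
            s = score(c, g)
            if best is None or s > best[0]:
                best = (s, c, g)
    return best


def toy_score(c, gamma):
    """Hypothetical accuracy surface peaking at C = 2**3, gamma = 2**-5."""
    return -abs(math.log2(c) - 3) - abs(math.log2(gamma) + 5)


c_grid = exponent_grid(-5, 15, 2)       # assumed default range for log2(C)
gamma_grid = exponent_grid(3, -15, -2)  # assumed default range for log2(gamma)
best_score, best_c, best_gamma = grid_search(toy_score, c_grid, gamma_grid)
```

In a real experiment, `score` would train the classifier on part of the training set for each pair and return the validation accuracy, which is considerably more expensive than the toy function used here.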


Classification Accuracy Discussion
In this section, we first analyze and evaluate the performance of our method on the LivDet 2011, LivDet 2013, and LivDet 2015 datasets under different feature fusion rules; the detailed results are reported in Tables 5-7. In general, the detection results after feature fusion are better than those of a single feature algorithm. As shown in Table 5, for the Digital sensor in the LivDet 2011 dataset, the classification accuracies of SIFT, LBP, and HOG are 87.8%, 91.2%, and 92.1%, respectively. After the feature fusion operation, the classification accuracy of SIFT + HOG is 96.9%, a significant improvement. For the Sagem sensor in the LivDet 2011 dataset, the classification accuracies of SIFT, LBP, and HOG are 92.3%, 99.8%, and 99.9%, respectively. After the feature fusion operation, the classification accuracies of Add, Max, Ave, and Con are all 99.9%. The maximum, minimum, and average operations run more quickly than the algorithm for a single feature. The same conclusion can be drawn for the Biometrika and Italdata sensors. However, abnormal results may also occur. For example, for the Digital sensor, the detection results of the fusion operations Max, Min, and Ave are weaker than those of a single feature. Based on our analysis, it is possible that the more expressive texture features are discarded by these three matrix operations, resulting in weaker final classification performance. The fusion operation Con gives the best results. Although there are some outliers, the overall picture shows that the detection performance after feature fusion is higher than that of a single feature.
As shown in Table 6, for the Biometrika scanner in the LivDet 2013 dataset, the classification accuracies of SIFT, LBP, and HOG are 86.7%, 94.0%, and 93.8%, respectively. After the feature fusion operation, the classification accuracy of LBP + HOG is 99.9%. Thus, feature fusion can improve the identification of genuine and fake fingerprints. For the CrossMatch scanner in the LivDet 2013 dataset, the classification accuracies of SIFT, LBP, and HOG are 88.8%, 90.6%, and 90.5%, respectively. After the feature fusion operation, the classification accuracy of SIFT + LBP is 93.6%. The results once again show that the proposed feature fusion method can improve the performance of fingerprint liveness detection.
As shown in Table 7, for the Hi_Scan sensor in the LivDet 2015 dataset, the average classification accuracies of SIFT, LBP, and HOG are 85.0%, 98.8%, and 75.8%, respectively, while the classification accuracies of the addition and minimum operations are both 99.2%, slightly higher than that of any single algorithm. For the Digital-Persona sensor, the classification accuracies of SIFT, LBP, and HOG are 83.2%, 90.6%, and 77.8%, respectively. The average classification accuracies of SIFT + LBP and LBP + HOG are 90.7% and 92.4%, respectively, higher than those of the single feature methods. As in the previous tables, Table 7 contains some outliers, but the overall picture still suggests that the detection performance after feature fusion is higher than that of a single feature.
In addition, the time required for testing all datasets is also listed in Tables 5-7, and is acceptable. Moreover, testing a single fingerprint completes without the user noticing any delay, which indicates that our proposed method is also applicable to real life.
Existing FLD methods are based on known fake fingerprint materials. However, in practice the material of a fake fingerprint is not known at test time. Thus, we also carried out a cross-material evaluation on the fingerprint image sets used in this paper. For each dataset, the fake fingerprints in the training set and in the testing set are made of different materials, and we compared the accuracy of our feature fusion method with that of Nogueira et al. [25]. Table 8 provides the results, where '-' indicates that the experiment was not performed in [25]. As with the results for the sensors described above, the experiment shows that the detection accuracy of feature fusion is higher than that of a single feature in blind material detection. When the material of a fake fingerprint is unknown, simple guessing yields equal accuracy and error rates; the experiments indicate that the results of the method proposed in this work are more accurate than those obtained by guessing.

Comparisons of Algorithms
Tables 9-11 list the detailed comparison results when we perform the concatenation operation. To provide a clear comparison of the algorithms, the optimal result for each fingerprint sensor is shown in bold in each row. The smaller the ACE, the better the method. The results for each table are described below. In Table 9, the average classification error (ACE) of our method is the lowest, at 4.6%. Looking at the individual scanners, the result for the Sagem scanner in LivDet 2011 is close to 0. That is, when the fingerprint scanner is known to be Sagem, we can decide with about 99% confidence whether the fingerprint under test is genuine or fake, and the performance is significantly better than that of other algorithms. Moreover, the ACE of our method is 2.64% lower than the best result of [29]. Although our result for the Biometrika scanner is 7.45% higher than one result of [29], the result for the Digital sensor is 7.32% lower than that of [29].
In Table 10, the ACE of our method is the lowest, at 3.48%. The result for the Biometrika scanner in the LivDet 2013 dataset is close to 0. That is, when the fingerprint scanner is known to be Biometrika, we can decide with about 99% confidence whether the fingerprint under test is genuine or fake, and the performance is significantly better than that of other algorithms. Although our result for the Italdata scanner is 2.75% higher than one result of [29], the ACE of our method is still 3.76% lower than the best result of [29], and the result for CrossMatch is 3.92% lower than the result of [29].
In Table 11, the ACE of our method is the lowest, at 4.03%. The result for the CrossMatch scanner in the LivDet 2015 dataset is close to 0, and the ACE of our method is still 3.21% lower than the best result of [29]. Although our result for the GreenBit scanner is 3.85% higher than one result of [29], the result for the Hi_Scan sensor is 7.32% lower than that of [29]. To sum up, Tables 9-11 again show that, to obtain better FLD performance, different feature fusion methods can be used when the type of fingerprint scanner is known.

Conclusions
The development of image technology and the application of intelligent devices enable us to capture many high-resolution images. Among these, intelligent devices with fingerprint identification are most popular. However, the study found that they are vulnerable to attack by fake fingerprints made of special materials. To elevate the security of these intelligent devices, in this study we propose a data analysis method to distinguish genuine fingerprints from fake ones. It is well-known that the SIFT feature descriptor is characterized by invariance to rotation, scale, and brightness; the HOG feature descriptor ignores the influence of light on the image, reducing the dimension of the feature for the image; and the LBP feature descriptor is insensitive to light and fast to operate. Combining the advantages of SIFT features, LBP features, and HOG features can make up for the shortcomings of each algorithm and improve the final detection performance. Finally, the fused features are fed into an SVM classifier for the subsequent training and testing. From comparison by experiment, the classification performance based on fused features using SIFT, HOG, and LBP is better than other FLD methods, and our method is more suitable for fingerprint liveness detection to prevent spoof attacks related to these artificial replicas.
Since the feature fusion operation can achieve better detection performance than a single feature, we will try to explore more feature fusion schemes in future work, such as the linear combination of features, to further improve the FLD performance.