Verifying the Effectiveness of New Face Spoofing DB with Capture Angle and Distance

Abstract: Face recognition is a representative biometric that can be easily used; however, spoofing attacks threaten the security of face biometric systems by generating fake faces. It is not advisable to consider only sophisticated spoofing cases, such as three-dimensional masks, because they require additional equipment, thereby increasing the implementation cost. To prevent easy face spoofing attacks through print and display, a two-dimensional (2D) image analysis method using existing face recognition systems is reasonable. Therefore, we propose a new database called the "pattern recognition-face spoofing advancement database" that can be used to prevent such attacks based on 2D image analysis. To the best of our knowledge, this is the first face spoofing database that considers changes in both the angle and distance. Therefore, it can be used to train various positional relationships between a face and camera. We conducted various experiments to verify the efficiency of this database. The spoofing detection accuracy of our database using ResNet-18 was found to be 96.75%. The experimental results for various scenarios demonstrated that the spoof detection performances were better for images with a pitch angle, near distance images, and replay attacks than for front images, far distance images, and print attacks, respectively. In the cross-database verification, the performance when testing with other databases (DBs) after training with our DB was better than in the opposite case. The results of cross-device verification in terms of camera type showed a negligible difference; thus, it was concluded that the type of image sensor does not affect the detection accuracy. Consequently, it was confirmed that the proposed DB, which considers various distances, capture angles, lighting conditions, and backgrounds, can be used as a training DB to detect spoofing attacks in general face recognition systems.


Introduction
Nowadays, biometrics provide reliable indicators for individual recognition and authentication problems [1]. As the biometric identifiers are inherent to individuals, it is difficult to manipulate, share, or overlook these traits [2]. Therefore, these systems have been used in various fields such as cell phone encryption and internet banking authentication. Among biometric methods, the technique used by a face recognition system, which includes face detection and recognition, is one of the most convenient and useful practices [3][4][5][6]. The face recognition system uses a non-invasive method, and the face images have more complex biometric features compared to others. Various features that are used to detect fake face data can be extracted from each instance of data via local binary pattern (LBP), convolutional neural network (CNN), discrete cosine transform (DCT), and Laplacianfaces [7-10].
These reasons have led to the growth of the market size associated with face recognition, and the development of relevant robust systems is, therefore, required [11]. However, in the past few years, potentially vulnerable spoofing attacks have been reported [12]. These attacks occur when people attempt to pretend to be someone else by using fake data, thereby gaining illegitimate access and advantage [13]. Therefore, the face anti-spoofing task has attracted massive attention with the aim to assure reliability of security. In short, the necessity of detecting spoofing attacks in face recognition has increased. In conventional studies, to prevent spoofing attacks, its various types were divided into 2D attacks forged by displaying printed photos, replay attacks using recorded videos on mobile devices, and complex 3D facial mask attacks [14]. As known, there are several public databases where each has unique or characteristic data that has been collected in terms of various aspects. For example, the NUAA PI database and Yale Face Database B, which are among well-known face anti-spoofing databases, use only printed photo attacks [15,16]. Although the face images are obtained through various elements, such as movement, rotation, bending, and lighting manipulation of the photographs, practical requirements might not be satisfied for detection of counterfeit data because of the relatively small number of subjects, i.e., 15 and 10. Further, the Unicamp video-attack dataset (UVAD) prevents only the replay video attacks using 17,076 video clips by capturing 404 subjects from outdoor as well as indoor sites [17]. Additionally, databases such as CASIA-FASD, REPLAY-ATTACK, MSU-MFSD, MSU-USSA, REPLAY-MOBILE, and OULU-NPU include both printed photo and replay video attacks and can be used to consider more situations [18][19][20][21][22][23]. In particular, the MSU-USSA uses a unique factor that is not present in other databases. 
It has 1140 subjects and includes not only the face data collected by Wang [21] from web images but also data from the REPLAY-ATTACK, CASIA-FASD, and MSU-MFSD public databases. Here, each image obtained from the web face database shows a single celebrity, and no duplicate images exist. Finally, the ROSE-Youtu Face database has data that prevents masking attacks as well as printed photo and replay video attacks [24]. These public databases have significantly contributed to the field of face spoofing detection. Research results have led to commercialized face recognition systems that include anti-spoofing technologies for detecting fake data. Figure 1 shows samples from the public databases. Table 1 presents a comparison between previous face spoofing databases (DBs) and the proposed DB (Pattern Recognition-Face Spoofing Advancement Database (PR-FSAD)); to the best of our knowledge, the PR-FSAD is the only DB that considers variations in both the distance and angle.
Until now, most studies have used only public face databases for fake detection. In such cases, because face data that are absent from the public databases may involve new environmental factors, the fake detection performance for new data may decrease. Consequently, we created our own face database called PR-FSAD using an RGB image sensor to prevent sophisticated spoofing attacks based on printed photo and replay video attacks. In this database, we considered distance and angle conditions that were not applied in the previous public databases. Further, PR-FSAD consists of 30 subjects with an age range from 13 to 32 years regardless of gender. Thus, the requirements in terms of training and evaluation are met in a better manner compared to other databases. Once the entire data was obtained, the real and fake face data were preprocessed using a face detection algorithm.
Four protocols were designed for performance evaluation, and classification was conducted using ResNet-18. Additionally, the cross-database scenarios were tested for evaluating detection accuracy among different DBs. Figure 2 shows the real and fake face data acquisition scheme of PR-FSAD using capture devices. Our research had the following advantages over previous studies. First, a new face spoofing DB was constructed considering the distance and angle variations. Second, the efficiency of the proposed DB considering the deviations in distance and angle was verified by evaluating the spoofing detection accuracy through an algorithm based on deep neural networks. Third, the performance of the proposed DB was compared with those of previous face spoofing DBs by combining the training and test data from heterogeneous DBs to confirm the possibility of generalizing the proposed DB.
The rest of the paper is organized as follows. The detailed design of PR-FSAD and the four protocols for evaluation are described in Section 2. Section 3 shows the experimental results for the protocols for evaluation of PR-FSAD and cross-database and cross-device scenarios. The analysis of these results is shown in Section 4. Finally, Section 5 gives the conclusion, expected benefits, and future works.

Materials and Methods
In this section, we describe the camera devices, environmental conditions (such as posture, illumination, and background), and considerations for constructing PR-FSAD. Further, the real and fake face databases and protocols designed for evaluation are described in detail.

PR-FSAD
The PR-FSAD has various characteristics, such as unfixed backgrounds and three distance and angle cases, for capturing face images. These features can be distinguished from those of other conventional public face databases and may affect the process of real and fake face classification. To construct PR-FSAD, the real and counterfeit face images were obtained from 30 subjects. For the fake database, the printed photo and replay video attacks were used as attack methods to prevent spoofing attacks. The camera system, capture environment, consideration in terms of the PR-FSAD design, and the detailed information about real and fake face data are described in the following sections. Additionally, PR-FSAD is publicly available and can be obtained by submitting a request at our website [25].

Camera System
For capturing the robust real and fake face data, we used four photographing devices consisting of two smartphones and two tablets. The subjects captured the images using the front camera of the devices and recorded videos using the basic camera application that is built into each device. In addition, the camera was set to automatically adjust the brightness based on the change in illumination. The frames per second (fps) were set to be the same. The detailed information for all the devices is shown in Table 2.

Environmental Conditions
This section describes the considerations during capture, such as posture, expression, illumination, and background. All the images were captured using the camera system described in Section 2.1.1.
Firstly, each subject was asked to sit comfortably and look at the front, not toward the side, diagonal, or at the device. The capturing task was performed by keeping the subject's head at the center of the subject's arm. Most of the subjects took pictures by holding the capturing device, while some of the subjects who found capturing difficult were assisted by the researcher. As is the case with most face recognition circumstances, the facial expressions were required to be natural, not that of laughing or frowning. Further, we explained the diversity of unfixed backgrounds, which is one of the distinct characteristics of PR-FSAD when compared to other existing public face databases. In a conventional face recognition system, the face data are completely segmented from the backgrounds during the preprocessing stage. This is the reason why previous face recognition systems might not have considered the system of not being affected by the backgrounds. However, the face data can often be modified by using various functions of the camera device such as automatic brightness balance or white balance applications. These factors can distort the facial appearance due to the varying illumination conditions of the background environment and can even affect facial recognition. Therefore, for similar real-world situations, the photographs were taken at various unfixed places, including cafes, restaurants, the lobby of a building, and lecture classroom. In other words, the subjects captured images in natural environments without arbitrarily fixed backgrounds.

Consideration of PR-FSAD
The two distinguishing features of PR-FSAD are distance and angle. These two factors between the subject and camera may differ depending on the environment in which the face recognition system is used, and the differences may affect face recognition through changes in lighting or image texture. Therefore, we considered both factors. Firstly, for distance, as each subject has a different body, we used the relative ratio of the device display occupied by the face. To apply the ratio consistently to all the subjects, the display was divided into a 3 × 3 grid during capture. At the near distance, the subject's face fills approximately 90% of the screen. At the halfway distance, the face fills approximately two-thirds of the entire screen; in other words, the face is located at the center of the display and occupies approximately 50% of the eight rectangles surrounding the center rectangle. Finally, at the distant distance, the face is photographed so that it fills only the center rectangle of the 3 × 3 grid. For angles, the top and bottom positions were offset by approximately 30° from the middle angle. When capturing a face from the various angles, to match a real-world situation, the subject's gaze was kept in the same direction as for the middle angle rather than directed at the device. While acquiring the face data of PR-FSAD, each subject had to capture the three preset distances and angles. In addition, as a face is rarely yawed or rolled in actual use cases of face recognition, we captured the face images only with different pitches. Once captured, the face data were stored using the distance tags "near", "halfway", and "distant" and the angle tags "bottom", "middle", and "top". Figure 2 shows the capture method with these considerations.
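The tagging scheme above can be illustrated with a short sketch. This is not the authors' tooling: the function name, the dictionary layout, and the sign convention for the pitch offset (negative for "bottom", positive for "top") are assumptions made only for illustration; the tag strings and the approximate 30° offset come from the text.

```python
from itertools import product

# Tags as stored in PR-FSAD according to the text.
DISTANCES = ("near", "halfway", "distant")
ANGLES = ("bottom", "middle", "top")

# Approximate pitch offset in degrees relative to the middle angle.
# The sign convention is an assumption made for this sketch.
PITCH_DEG = {"bottom": -30, "middle": 0, "top": 30}


def capture_conditions():
    """Enumerate every distance/angle combination a subject has to capture."""
    return [
        {"distance": d, "angle": a, "pitch_deg": PITCH_DEG[a]}
        for d, a in product(DISTANCES, ANGLES)
    ]


if __name__ == "__main__":
    for condition in capture_conditions():
        print(condition)
```

Enumerating the combinations this way gives three distances times three angles per subject, session, and capture type, which may be convenient when checking that no condition is missing from a capture session.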

Real Face Database
The PR-FSAD consists of 30 subjects (male: 19, female: 11). All the subjects except one captured face images in two sessions separated by at least six hours. As time difference is an important factor that decreases the classification performance, subjects took pictures in the first session during the daytime and in the second session at night [26]. During each session, different backgrounds were applied to each subject. The other capture conditions followed those described in Section 2.1.3.
In addition, the accuracy of face detection while constructing the PR-FSAD was checked using the multitask cascaded convolutional network (MTCNN) face detection algorithm [27]. This is because spoofing detection must be based on accurately detected face data to obtain a meaningful result. If face detection was not accurate, the data were recaptured to construct precise real face data for the PR-FSAD. Figures 3 and 4 show the real face data of the PR-FSAD and the results of the MTCNN method at three angles.

Fake Face Database
When creating a fake face for PR-FSAD for spoofing attacks, we used two categories of attacks, namely, printed photo and replay video attacks. The capturing angles and distances used were the same as in the case of the real face data.
Firstly, for the printed photo attack, we used the real face images of all the subjects photographed with the four devices. For the counterfeit face data to be as similar as possible to the real face, the frame with the most natural look was chosen among the images taken at the halfway distance from the middle angle. The selected frame was printed using a high-quality color printer (Samsung SL-C483W, Fuji Xerox CP115W) so that the spoofing attack would have a high probability of deceiving the face recognition system. While keeping the subject's gaze in the printed image directed to the front, the counterfeit face images were captured in the same way as the real face data. The other capture conditions followed those described in Section 2.1.3. However, unlike the case for real people, detection of a face in printed photos might be difficult due to the reflection of unexpected light from behind the paper. Therefore, the printed photo had to be held as if it were the real face of the person holding the image. Once the fake data were captured, the same face detection check was applied as in the case of the real face database. Figure 5 shows the printed photo attack of PR-FSAD, and Figure 6 shows the results of the MTCNN method.
Electronics 2020, 9, x FOR PEER REVIEW 7 of 17
Further, we used photographed real face videos of all the subjects for the replay video attack. However, a drawback with smartphones was the relatively small face scale on the screen. Therefore, when the spoofing attacks were attempted at close distances, the device used for face recognition often could not focus properly. To prevent this problem, two tablet devices were used for the replay video attack. Further, as only the face of the subjects in the tablet's display had to be detected, the tablet was held approximately 0.1 m below the shoulder of the person holding the device. In contrast, the smartphone was kept next to the shoulder of the person holding the device. Other capture conditions and procedures were similar to those of the printed photo attack. Figure 7 shows the replay video attack of PR-FSAD, and Figure 8 shows the results of the MTCNN method.

Evaluation Protocols
We considered various backgrounds, distances, and angles as the features of PR-FSAD. Protocols consisting of eight scenarios were designed to evaluate and verify the performance of face spoofing attack detection using PR-FSAD. For classification evaluation, the distances and angles were divided into three and two cases, respectively. The variables T, M, and B were used to represent top, middle, and bottom of the angle factor, and N, H, and D were used to represent near, halfway, and distant of the distance factor. The background, however, was not considered in the protocol, because it is configured differently and, hence, difficult to divide. In the test, 1, 2, and 3 indicate the real, printed photo, and replay video attack data, respectively. The detailed description of the protocols is as follows:
1. Angle test: At each of the three different angles, real and fake data for all the three distances are used.
[...]
Replay video attack protocol: this uses real and replay video attack data at all the angles and distances (or uses 1 and 3 at all the angles and distances).
4. Overall test: The evaluation test is conducted using all the angles and distances of PR-FSAD:
a. Entire data protocol: all the real and fake face data are used.
To perform the designed protocols, PR-FSAD was divided into training, validation, and test sets, which included 7, 10, and 13 subjects, respectively. The composition of the data used for the experiment and the processing time required for training and testing are presented in Table 3. The processing time was measured using Intel (R) Core (TM) i7-6700HQ quad-core CPU, 2.60 GHz with 16 GB RAM and NVIDIA GeForce GTX 1070 GPU with 16 GB RAM. In the training process, the early stopping strategy was adopted before 20 epochs to ensure that the training was completed relatively quickly.
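A subject-disjoint split with the stated sizes (7, 10, and 13 subjects) can be sketched as follows. The text does not say how subjects were assigned to each set, so the seeded random shuffle, the function name `split_subjects`, and the integer subject IDs are assumptions made only for illustration.

```python
import random


def split_subjects(subject_ids, n_train=7, n_val=10, n_test=13, seed=0):
    """Split subject IDs into disjoint training, validation, and test groups.

    The set sizes follow the text; the random assignment is an assumption.
    """
    ids = list(subject_ids)
    if len(ids) != n_train + n_val + n_test:
        raise ValueError("number of subjects does not match the split sizes")
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_subjects(range(1, 31))
    print(len(train), len(val), len(test))
```

Splitting by subject rather than by image keeps every image of a given person in a single set, so the test result is not inflated by the model having already seen the same face during training.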

Face Spoofing Detection Method
Adjacent frames in the PR-FSAD face videos constructed for evaluating spoofing detection performance are highly similar. Therefore, to eliminate this unnecessary similarity and use varying data, sampling was performed. In particular, only 20 images per video were sampled by extracting images at intervals of 2 to 3 frames. Next, the extracted images were preprocessed to crop the face area and exclude the background. In this study, we resized the images to 224 × 224 pixels. The preprocessed images are shown in Figure 9.
One of the deep neural network models, ResNet-18, was used for the final real and fake face classification based on the processed face data [28,29]. Compared with conventional neural network models, ResNet-18 is less prone to vanishing or exploding gradients as the network deepens. This effect is due to the shortcut connection that passes the input of a specific layer directly to the output, making it easier to detect and learn fine-grained changes during a model's training process. The preprocessed image was normalized to 224 × 224 pixels and input into the ResNet-18 model. Because it is a three-channel color image, the input feature was defined in 150,528 dimensions (224 × 224 × 3). The feature vector obtained by average pooling in the ResNet-18 model had 512 dimensions. Finally, an output value that determines whether the input face image was real or fake was calculated using the sigmoid function. The procedure of counterfeit face detection using the proposed method is shown in Figure 10.
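The frame sampling step described above can be sketched as follows. The text only states that 20 images were extracted per video at intervals of 2 to 3 frames; starting at frame 0 and choosing each interval at random with a fixed seed are assumptions of this sketch, as is the function name `sample_frame_indices`.

```python
import random


def sample_frame_indices(n_frames, n_samples=20, min_step=2, max_step=3, seed=0):
    """Pick up to n_samples frame indices spaced min_step..max_step frames apart."""
    rng = random.Random(seed)
    indices = []
    idx = 0  # assumption: sampling starts at the first frame of the video
    while len(indices) < n_samples and idx < n_frames:
        indices.append(idx)
        idx += rng.randint(min_step, max_step)
    return indices


# Dimensionality of one preprocessed input image (width x height x channels).
INPUT_DIM = 224 * 224 * 3

if __name__ == "__main__":
    print(sample_frame_indices(n_frames=150))
    print(INPUT_DIM)
```

With a maximum interval of 3 frames, a video needs at least 58 frames for all 20 samples to be collected; shorter videos yield fewer indices in this sketch.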

Results
In this study, the ResNet-18 model with a learning rate of 0.01 was used for face spoofing attack detection. The half total error rate (HTER) was calculated to evaluate the test results. The HTER is the average of the false acceptance rate (FAR) and the false rejection rate (FRR) of the validation set; a smaller value indicates better classification performance. The HTER was obtained from the confusion matrix, which is one of the most intuitive and simple methods for measuring the performance of binary classification models. Table 4 lists the HTER for protocols 1-4. Among them, the confusion matrix for protocol 4, which uses the total PR-FSAD, is shown in Figure 11. In this case, the HTER calculated using Equation (1) is approximately 3.25%.

$$\mathrm{HTER} = \frac{1}{2}\left(\frac{FP}{TN + FP} + \frac{FN}{FN + TP}\right) \qquad (1)$$

It is evident from Table 4 that the replay attack in the protocol 3 experiment showed better spoofing attack detection performance than the print attack. In the experiment, we used a single frame image for face spoofing attack detection. Information related to the changes in the time series of the face video in the replay attack was not used. Therefore, the texture information might have been used as the most important feature in both replay and print attacks. The screen of the smart device used in the replay attack was made of a glass material that could cause specular reflection from the ambient light source. Such specular reflection can be observed in Figures 7 and 8. This characteristic was markedly different from the actual skin surface. In contrast, paper materials did not produce specular reflection. This difference may explain the performance variation.
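Equation (1) can be sketched in a few lines of code. The form of the equation implies that an accepted (real) face is the positive class, so FP counts fake faces accepted as real; the function name and the counts in the usage example are illustrative and are not the values of Figure 11.

```python
def hter(tp, fp, tn, fn):
    """Half total error rate from confusion-matrix counts, as in Equation (1).

    FAR = FP / (TN + FP): share of fake faces accepted as real.
    FRR = FN / (FN + TP): share of real faces rejected as fake.
    """
    if tn + fp == 0 or fn + tp == 0:
        raise ValueError("both classes need at least one sample")
    far = fp / (tn + fp)
    frr = fn / (fn + tp)
    return 0.5 * (far + frr)


# Illustrative counts only (not taken from the paper).
example = hter(tp=95, fp=10, tn=90, fn=5)
print(f"HTER = {100 * example:.2f}%")
```

Because FAR and FRR are each normalized by their own class size before averaging, the HTER is not dominated by whichever class has more samples.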
In addition, the results for protocol 1 show that the performance for the front face (middle) was significantly lower than those for the top and bottom positions, where a pitch angle difference existed. A likely explanation is that the top and bottom images were distorted by the vertical perspective of the face, so that the three-dimensional features of the face were reflected in the training process along with other features. In other words, in the middle case, only the texture feature was reflected, without the perspective property.
However, because the above results cover only intra-database scenarios of PR-FSAD, they may not be sufficient to demonstrate the effectiveness of PR-FSAD, which includes features that differentiate it from previous public databases. Therefore, we used the public databases MSU-MFSD and REPLAY-ATTACK for cross-database scenarios. In all the scenarios, face spoofing attack detection was performed using the ResNet-18 model, and the HTER obtained from the confusion matrix was used as the performance indicator. Table 5 lists the results of the cross-database scenarios. To confirm that the PR-FSAD is also effective with other spoofing detection algorithms, we considered DenseNet [30] and LBP [31], along with ResNet-18, for the cross-database experiments. The experimental results revealed that the spoofing attack detection performance obtained by training with the PR-FSAD was the best for ResNet-18, DenseNet, and LBP. This shows that although the PR-FSAD contains an angle variation element, it is feasible as training data that can be generalized and used in other face recognition systems. In addition, the relatively good performance of the LBP feature, which uses only the texture property, suggests that the textural features of spoofing attack media can be captured independently of the image sensor.
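The cross-database protocol described above, in which a model is trained on one database and tested on each of the others, can be sketched as follows. The `train_fn` and `eval_fn` callables are hypothetical placeholders for the actual training and HTER evaluation routines, which are not specified at this level of detail in the text.

```python
from itertools import permutations

DATABASES = ["PR-FSAD", "MSU-MFSD", "REPLAY-ATTACK"]


def cross_database_pairs(databases):
    """All ordered (train, test) pairs with different databases."""
    return list(permutations(databases, 2))


def run_cross_database(databases, train_fn, eval_fn):
    """Train once per database and evaluate on every other database.

    train_fn(db_name) -> model and eval_fn(model, db_name) -> score
    are hypothetical callables supplied by the caller.
    """
    results = {}
    for train_db in databases:
        model = train_fn(train_db)
        for test_db in databases:
            if test_db != train_db:
                results[(train_db, test_db)] = eval_fn(model, test_db)
    return results


for train_db, test_db in cross_database_pairs(DATABASES):
    print(f"train on {train_db}, test on {test_db}")
```

With three databases this yields six train/test combinations; running the same loop for each algorithm (ResNet-18, DenseNet, LBP) would give one such grid per algorithm.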
Although the previous DBs (MSU-MFSD, REPLAY-ATTACK) used in the comparison included distance variations, our DB had both distance and angle variations. In the cross-database scenarios presented in Table 5, our DB used both distance and angle variations for training and testing. Therefore, we performed cross-database experiments for a fair comparison using images with only distance variation in the front (such as distant, halfway, and near in the middle position, as shown in Figure 2). The results are presented in Table 6. In comparison to the experiments wherein the variations in both the angle and distance were included (Table 5), improved results were obtained as demonstrated in Table 6. These results indicate that our DB reflects the perspective and resolution characteristics via angle and distance variations, respectively, to generalize the data to other capturing environments. However, because Table 6 presents the results for training and testing performed only with frontal face images from the PR-FSAD, an absolute comparison in terms of the training dataset with the results of Table 5 and Figure 12 is not possible.
Next, we performed experiments to measure the effect of using the full PR-FSAD, which contains nine times more data than the frontal face images alone because of the distance and angle variations, on the processing time and classification accuracy. Training with nine times more images increases the training time, which can be regarded as a computational cost. However, because training is performed only once, comparing the training times would be inconsequential. Instead, we measured the time required for spoofing detection with one face image. The processing time was measured using an Intel(R) Core(TM) i7-6700HQ quad-core CPU at 2.60 GHz with 16 GB RAM and an NVIDIA GeForce GTX 1070 GPU with 16 GB RAM. The measured times, each of which is the average over 500 images, are presented in Table 7 for the case in which only the CPU was used and the case in which the GPU was used together with it. In addition, Table 7 includes the test accuracy when the nine-times-extended DB considering the angle and distance was used for training and when only the front face was used.

Table 7. Spoofing detection processing times and classification accuracies for a single image based on distance and angle variations (using only CPU/using GPU together).

| | Training with only Front Face Image | Training with Total Image |
|---|---|---|
| Processing time (ms) | 320/20 | 321/20 |
| Accuracy (HTER (%)) | 5.12 | 3.25 |

The results show that the time difference in the actual face spoofing detection process is insignificant, whereas the accuracy is significantly improved. In other words, the benefit of using nine times more image data that reflect the distance and angle variations was confirmed.
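The per-image timing procedure described above (an average over a fixed set of images) can be sketched as follows. The `detect` callable is a hypothetical stand-in for the spoofing detector; the actual measurement code and its handling of warm-up or GPU synchronization are not described in the text.

```python
import time


def mean_inference_ms(detect, images):
    """Average wall-clock time per image, in milliseconds.

    `detect` is any callable that processes one image; it is a
    placeholder for the actual spoofing detector.
    """
    if not images:
        raise ValueError("at least one image is required")
    start = time.perf_counter()
    for image in images:
        detect(image)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(images)


# Usage with a dummy detector and 500 placeholder "images".
dummy_images = list(range(500))
print(mean_inference_ms(lambda image: image * 2, dummy_images))
```

Timing the whole loop and dividing by the number of images, rather than timing each call separately, reduces the influence of timer overhead on short inferences.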
Finally, we assessed the face spoofing attack detection performance for cross-device scenarios on images captured with the four types of capturing devices specified in Table 2. The experimental results are presented in Table 8. In the experiment, the images acquired with each device were divided into training and test sets. This test was performed using ResNet-18. As evident from Table 8, the intra-device and inter-device facial spoofing detection performances were not significantly different. In some cases, the intra-device HTER was larger than the inter-device HTER. Thus, it can be concluded that the characteristics of the media (paper or display) used for the spoofing attack and the geometric positional relationships, rather than the differences in the image sensor of each device, were reflected in the training process of ResNet-18 as the main features for spoofing detection.

Discussion
The forgery detection results obtained using only the face data from PR-FSAD showed excellent performance, with an HTER of less than 5% for all the protocols except one, which had an HTER of 5.36%. As shown by the confusion matrix for protocol 4, which uses the entire PR-FSAD, the misclassification ratios for real and fake face data were almost identical. Although the numbers of misclassified samples differed, the ratios were the same because PR-FSAD has a 1:2 ratio between real and fake data. This indicates that the ResNet-18 model was trained without bias toward either class. Next, significant results were obtained for the cross-database scenarios using three databases, two of which are public.
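The relationship between misclassification counts and ratios under a 1:2 class split can be illustrated with a small worked example. The sample sizes and error counts below are illustrative placeholders and are not the values of the protocol 4 confusion matrix.

```python
def error_ratio(misclassified, total):
    """Fraction of samples of one class that were misclassified."""
    if total <= 0:
        raise ValueError("total must be positive")
    return misclassified / total


# Illustrative numbers only: a 1:2 split between real and fake samples.
n_real, n_fake = 1000, 2000
real_errors, fake_errors = 30, 60  # counts differ by a factor of two

print(error_ratio(real_errors, n_real))  # ratio for the real class
print(error_ratio(fake_errors, n_fake))  # ratio for the fake class
```

In this example the fake class has twice as many errors, yet both classes have the same per-class error ratio, which is the sense in which the classifier treats the two classes evenly.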
First, the best spoofing detection result for MSU-MFSD was obtained using the ResNet-18 model trained with PR-FSAD. For the test using REPLAY-ATTACK, the classification result using a model trained with MSU-MFSD, which has a similar data distribution, was the best, and the performance using the model trained with PR-FSAD was the second best. Although a difference of only 3% was observed, because the PR-FSAD demonstrated its training effect on heterogeneous DBs, it can be considered advantageous as a generalized DB that can overcome variations in face image capturing conditions. Further, although the two public databases yielded good classification results when tested on each other, their results when tested on PR-FSAD were relatively poor. In contrast, our database is considered to have better generalization performance because its HTER values for the public databases were approximately 20%. In other words, PR-FSAD appears to cover a wider range of face features.
The main contribution of this study was the introduction of a new face spoofing DB that reflects distance and angle differences. We divided the distance and angle into three phases to construct facial spoofing images for nine combinations. In addition, the data were configured in various environments without controlling the lighting or background to reflect the actual environment. In our experiments with the proposed database, a face spoofing detection accuracy of 96.75% was observed. Using the proposed database, the resolution variations in the facial region can be consistently reflected as learned features in the deep neural network learning process by capturing the data at different distances and including perspective variations by including different angles in the images.
The limitation of our DB, however, is that data are obtained by dividing the positional relationship between the face and camera into nine types according to the angle and distance. This limitation was intended to provide a clear guide to the subject in the process of acquiring images. In the future, we plan to incorporate additional data by changing the positional relationship between the camera and face.
Consequently, PR-FSAD, which shows relatively good classification performance on other data as well as on itself, is a meaningful face database for preventing spoofing attacks. For result visualization, we used the receiver operating characteristic (ROC) curve, which is useful for visualizing performances [32]. The larger the area under the curve (AUC), the better the performance of the classification model. Figure 12 shows the ROC curves for the cross-database scenarios. The true positive rate (sensitivity or recall) and the false positive rate (1 − specificity) are plotted on the y-axis and x-axis, respectively. As can be seen in Figure 12, PR-FSAD showed the best classification performance among the four databases.
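The construction of an ROC curve and its AUC can be sketched as follows. This is a generic illustration with made-up scores and labels, not the procedure or data behind Figure 12; the function names are hypothetical, and label 1 is assumed to denote the positive class.

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) obtained by lowering the decision threshold.

    labels: 1 for the positive class, 0 for the negative class.
    Samples with tied scores are added together as one step.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores and labels for illustration.
pts = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(pts, auc(pts))
```

An AUC of 1.0 corresponds to perfectly separated scores, while 0.5 corresponds to chance-level ranking.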

Conclusions
In this study, we reported a new face DB called the PR-FSAD and verified its effectiveness in terms of face spoofing detection accuracy. The real and fake face data were obtained using four capture devices, and the spoofing attacks consisted primarily of printed photo attacks and replay video attacks. In particular, compared to public face databases, PR-FSAD incorporates two new factors: distance and angle. To the best of our knowledge, a DB that considers both the distance and angle has not been proposed in the existing literature, and this is the main contribution of this study. Combinations of three distances and three angles were used to construct the real and fake face database, followed by sampling of images from the captured videos. Finally, the face region was cropped, and the processed face data were applied to the ResNet-18 neural network model for classification. To verify the effectiveness of PR-FSAD, 10, 7, and 13 of the 30 subjects were used for training, validation, and testing, respectively. Specifically, 41,040 (32.2%), 30,240 (23.7%), and 56,160 (44.1%) images were used for training, validation, and testing, respectively. In addition, the classification was performed using RGB images without any additional equipment or sensors to detect the spoofing attacks. As a result, the HTER, which was used to evaluate the classification performance, was 3.25%. This result demonstrated good classification performance in comparison to the existing DBs. For further verification of the effectiveness of PR-FSAD, we designed cross-database scenarios using three face databases, including two public databases. The test performances of the three algorithms, namely, ResNet-18, DenseNet, and LBP, were the best when they were trained using the proposed DB. Although comparisons across a wider range of algorithms are still lacking, it can be concluded that our DB is more applicable to face recognition systems in other environments.
In future studies, more accurate spoofing detection methods using PR-FSAD may be considered for other applications, such as low-quality face data. Further, we will implement additional procedures, such as registering face data, with new feature elements. Furthermore, PR-FSAD can be used in released applications with face recognition systems to prevent counterfeit attacks. By acquiring more face images through further subdivision of the distance and angle variations, the proposed PR-FSAD will be improved into a more generalized face spoofing DB. Additionally, we plan to add face spoofing data generated from frontal face images, along with face spoofing images obtained through attack media that already reflect the angle and distance variations. This may be an attack case that is more difficult to filter than face spoofing with angle variations in the frontal face image. Moreover, by comparing various deep neural network models, we will analyze the effects of texture and perspective on face spoofing detection. With respect to facial spoofing systems, we plan to conduct research on the prevention of disturbances related to deep learning or new attack models, such as Deepfake and adversarial perturbations, which have recently become issues.

Conflicts of Interest:
The authors declare no conflicts of interest.