Fingerprints Classification through Image Analysis and Machine Learning Method

Abstract: A system that automatically identifies anthropometric fingerprints interacts directly with its users and is supplied with diverse new data every day. The system must therefore be optimized to meet user needs such as fast processing time, near-absolute accuracy and error-free operation in practice. In this paper, we propose applying machine learning methods to develop fingerprint classification algorithms based on the singularity feature. The goal is to reduce the number of comparisons in automatic fingerprint recognition systems with large databases. Although using computer vision algorithms in the image pre-processing stage increases the calculation time, it improves the quality of the input images, which makes feature extraction highly effective and classification fast and accurate. The classification results on 3 datasets, evaluated with Precision, Recall, Accuracy and ROC analysis, show that the Random Forest (RF) algorithm has the best accuracy (≥ 96.75%) on all 3 databases, and that the Support Vector Machine (SVM) has the best results (≥ 95.5%) on 2 of the 3 databases.


Introduction
Recently, machine learning algorithms [1][2][3] have outpaced previous approaches in many problems in object classification [4], object tracking [5] and image segmentation [6]. Practical problems in energy systems [7] that require extremely high accuracy and very low system error are thoroughly solved by machine learning. Fingerprint classification is another system with strict requirements of this kind. Despite the progress made in using machine learning for object classification and object tracking, its use in real-world applications remains error-prone, particularly on quality-degraded input images, where the degradation mostly results from human error. Almost no fingerprint image is of perfect quality. Images can be deformed or damaged by noise arising from the skin properties of the subject (temperature, humidity of the skin), the position of the finger (fingerprint angle, fingerprint area) and the pressure applied (too strong or too light) when rolling the finger to take the sample. In this study, we apply computer vision techniques [8] to improve image quality, focusing on noise filtering and on enhancing edge information. We then apply machine learning algorithms, namely the Random Forest algorithm [9] and the Support Vector Machine (SVM) algorithm [10], to classify fingerprints into 3 types: arch, loop and whorl. The purpose is to increase the performance of the automatic fingerprint recognition system.

Related Work
Machine learning is increasingly proving its importance in areas of practical application [11]. Researchers are therefore developing machine learning methods that are becoming increasingly optimal. One application attracting growing interest is fingerprint classification. One approach to the fingerprint classification problem is studied in Reference [12]. The fingerprint is segmented into regions in a way that reduces the variance of the element directions. A relational graph is built from the segmentation of the directional image and compared with a model graph, so the approach can be adapted to graph matching techniques. Reference [13] presented a fingerprint classification system, and its performance in an identification system, using a fuzzy-neural network classifier. The classification scheme is based on fingerprint feature extraction, which involves encoding the singular points together with their relative positions and directions obtained from a binarized fingerprint image. Image analysis is carried out in four stages: segmentation, directional image estimation, singular-point extraction and feature encoding. A method based on analyzing singularities and the ridges relating singular points is proposed in Reference [14]. Because low-quality images make it very difficult to obtain the correct positions of singular features, the authors used ridge tracing and curve features to classify fingerprints. SVM is a machine learning algorithm that takes a robust approach to fingerprint classification. It has produced highly accurate classification systems, and its advantage shows in the classification of fingerprints into multiple classes. The authors of Reference [15] combined the SVM algorithm with the naive Bayes method to classify fingerprints based on the number of core and delta points. Many researchers have tried to extract singular points in the flow of the ridges [16]. The authors of References [17][18][19] proposed a heuristic algorithm with singularities to classify fingerprints; the disadvantage of these studies is that they do not focus on improving image quality. Because they use the positions of singularity points as features, this greatly affects the accuracy of the system. The Galton-Henry classification scheme is used to classify fingerprints in Reference [20]. The authors use a rotation-invariant distance and match this distance between the FingerCode sample set and the new fingerprint pattern. During classification, the rotation-invariant distance extraction takes place in parallel with the training process, which makes the system fast. The Random Forest (RF) algorithm is used to handle large amounts of data and multi-class classification problems [21]. It is an ensemble of randomized decision trees that can solve regression tasks [22] and classification tasks [23][24][25]. The most prominent application of random forests is the detection of human body parts from depth data [26]. This application demonstrates the practicability of random forests for real-world machine learning problems.
There are many known ways to improve the performance of fingerprint classification systems. In this paper, we propose a combination of computer vision algorithms and traditional machine learning methods to treat such problems, an approach we have successfully researched in other fields [27][28][29]. Moreover, we designed and implemented image processing methods that denoise the image and enhance edges using the Canny method.

Fingerprint Image Pre-Processing and Enhancement
There have been many previous studies on the problem of noisy or distorted image datasets. Nevertheless, image pre-processing remains a challenging field in which researchers continue to develop image quality enhancement algorithms. We focus on image pre-processing based on computer vision techniques, such as image denoising, to handle real-world images from imperfect image-capturing devices.
Traditionally, image-denoising methods are used to treat noisy images [30]. Denoising suppresses small details and perturbations and enhances the edges. The operation can essentially be represented as a blurring of the image followed by enhancement of the edges. The resulting image therefore emphasizes edges and suppresses details, thereby suppressing the noise in the image.
A dataset collected directly contains much noise, especially impulse noise at an extremely high rate, which poses a significant challenge for image denoising. In the noise filtering step we therefore propose a filter method based on a convolutional neural network (CNN) [31]. The process has two main steps. First, we pre-process the noisy images using non-local information. Then, the pre-processed images are divided into overlapping patches that are used as input to train the CNN, giving a CNN denoising model for future noisy images that detects the noisy pixels of an image, which are then smoothed with a Gaussian filter. Our network has three layers; in each layer, we define a set of filters and operators to generate mappings. The convolution result of each patch corresponds to an n-dimensional feature map. In the third layer, we define a convolutional layer that predicts the enhanced patches and reconstructs them into the result image. In this work, the Gaussian algorithm [32] is used to filter noise with a 3 × 3 mask, and the Canny algorithm [33] is then applied to enhance edge information. As a result, once the morphology operation has processed the whole image, the result is better. The image pre-processing steps are shown in Figure 1.
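The Gaussian smoothing step with a 3 × 3 mask can be illustrated with a minimal sketch. The text states only the mask size, so the binomial weights below, the edge-replication border handling and the function name `gaussian_blur_3x3` are all assumptions made for illustration, not the authors' implementation.

```python
# Assumed 3x3 Gaussian mask (binomial approximation); the paper gives only the size.
GAUSSIAN_3X3 = [
    [1, 2, 1],
    [2, 4, 2],
    [1, 2, 1],
]
KERNEL_SUM = 16  # sum of all mask weights, used to normalize the result


def gaussian_blur_3x3(image):
    """Smooth a grayscale image (list of rows of numbers) with a 3x3 mask.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption; other border policies are equally plausible).
    """
    height = len(image)
    width = len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp neighbor coordinates to stay inside the image.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += GAUSSIAN_3X3[dy + 1][dx + 1] * image[yy][xx]
            result[y][x] = acc / KERNEL_SUM
    return result
```

Applied to an image with a single bright pixel, the filter spreads that impulse over its neighborhood, which is the blurring effect that an edge detector such as Canny would then build on.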

Types of Fingerprint and Features Extraction
In this work, we classify fingerprints into 3 classes: arch, loop and whorl.

• Arch: These occur in about 5% of the fingerprints encountered. The identifying features of the arch are overlapping ridge shapes that form layers and have a mountain-like peak. Arch fingerprints are divided into several categories: AS (the ridge lines are stacked on top of each other without intersecting), AE (a combination of the whorl and arch groups; the distance from the center to the intersection is less than 5 ridge lines), AU, and AR (a combination of the loop and arch groups; the distance from the center to the intersection is less than 5 ridge lines), as shown in Figure 3.
• Whorl: This type accounts for about 25% to 35% of fingerprints worldwide. A whorl pattern is identified by one circuit and 2 deltas (intersections), as shown in Figure 5.

Feature extraction of singularity. On a fingerprint there are areas whose structure is unusual compared with other areas, which usually have a parallel ridge structure. These unusual areas are called singularities. There are two types of singularity: core and delta. To extract singularity characteristics, we proceed as follows:
• Step 1. Input the image and resize it to 256 × 256.
• Step 2. Apply fingerprint pre-processing and enhancement.
• Step 3. At each pixel, calculate the gradients G_x and G_y in the x and y directions according to Formula (1).

• Step 4. Identify singularity points using the Poincaré index [15]. The Poincaré index at the pixel with coordinates (i, j) is the sum of the direction deviations of adjacent points, calculated as in Equation (2), where ϕ is the gradient at pixels in two directions. Based on the Poincaré index, singularity points are identified as in Equation (4).

• Step 5. Create and save the fingerprint feature vector.
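Step 4 above can be sketched as follows. This is a minimal illustration of the commonly used formulation of the Poincaré index, not a reproduction of the paper's Equations (2) and (4), which are not available here. It assumes the ridge orientation field has already been computed from the gradients and is given as angles in [0, π); the 8-neighbor path, the tolerance and the names `poincare_index` and `classify_singularity` are hypothetical. The sign of the index depends on the traversal direction and axis convention chosen.

```python
import math

# Closed path around a pixel, as (row offset, column offset) pairs.
NEIGHBOR_PATH = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                 (1, 1), (1, 0), (1, -1), (0, -1)]


def wrap_orientation_diff(delta):
    """Wrap a difference of ridge orientations into (-pi/2, pi/2]."""
    if delta > math.pi / 2:
        return delta - math.pi
    if delta <= -math.pi / 2:
        return delta + math.pi
    return delta


def poincare_index(orientation, i, j):
    """Sum the wrapped orientation changes along the closed path around (i, j).

    `orientation` is a 2D list of angles in [0, pi); (i, j) must not lie on
    the image border. The result is expressed in full turns (sum / 2*pi).
    """
    total = 0.0
    for k in range(len(NEIGHBOR_PATH)):
        di0, dj0 = NEIGHBOR_PATH[k]
        di1, dj1 = NEIGHBOR_PATH[(k + 1) % len(NEIGHBOR_PATH)]
        current = orientation[i + di0][j + dj0]
        following = orientation[i + di1][j + dj1]
        total += wrap_orientation_diff(following - current)
    return total / (2 * math.pi)


def classify_singularity(index, tolerance=0.1):
    """Map an index to a label using the usual convention (assumed here):
    +1/2 -> core, -1/2 -> delta, +1 -> whorl centre, otherwise no singularity."""
    if abs(index - 0.5) < tolerance:
        return "core"
    if abs(index + 0.5) < tolerance:
        return "delta"
    if abs(index - 1.0) < tolerance:
        return "whorl"
    return "none"
```

Counting the pixels labelled "core" and "delta" over the whole image would give one simple way to fill the feature vector of Step 5, although the exact vector layout used by the authors is not specified.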

Fingerprint Classification Based on Random Forest and Decision Tree with Singularity Features
Fingerprint classification is a multi-class classification problem: the features obtained by unsupervised learning, together with the fingerprint types (labels), are used as the input data for training a multi-class classifier in a supervised manner. In this work there are 3 labels (arch, loop and whorl). For fingerprint classification, a relatively small number of features is extracted from the fingerprint images. Here, we choose the orientation field based on the gradient and identify singularity features using the Poincaré index as our classification features. The machine learning algorithms chosen for the training module are Random Forest [34] and Support Vector Machine [35] (Figure 6).

Parameters of Random Forest for Fingerprint Classification.

• Number of trees used to train the model = 2000.

• Function to measure the quality of a split: Gini impurity.
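For reference, the Gini impurity used as the split criterion can be computed as in the short sketch below. It follows the standard definition (one minus the sum of squared class proportions); the function names and the example labels are illustrative assumptions, not taken from the authors' code.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_k^2) over the class proportions p_k of `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def split_gini(left_labels, right_labels):
    """Weighted Gini impurity of a candidate split into two child nodes."""
    n = len(left_labels) + len(right_labels)
    if n == 0:
        return 0.0
    return (len(left_labels) / n) * gini_impurity(left_labels) \
        + (len(right_labels) / n) * gini_impurity(right_labels)
```

A node that contains only one fingerprint class has impurity 0, while a node with equal shares of arch, loop and whorl has impurity 2/3; a decision tree prefers the split with the lowest weighted impurity.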

Parameters of Support Vector Machine for Fingerprint Classification.
• Penalty parameter of the error term = 1.0.
• Kernel: basis functions.

Database
Three databases were used in this paper. The first comprises the Fingerprint Verification Competition (FVC) 2000, 2002 and 2004 databases [36-38] and consists of 788 learning images and 100 test images. The second, our own database (HLG), has 500 fingerprint images taken from 100 different fingers, with each finger imaged 5 times. The resolution of each image is 300 dpi and the size is 256 × 256. We divide these images into 400 training images and 100 images for matching. The third, the NIST-DB4 database, is used by most algorithms for testing classification accuracy and consists of 4000 fingerprint images of size 512 × 512 (Figure 7). All experiments were carried out on a Windows 10 PC with an Intel Core i3 CPU @ 3.00 GHz and 2.00 GB RAM.

Analysis Results of Experimentation
In this article, three measures (Accuracy, Precision and Recall) are used to evaluate the performance of the fingerprint classification system using machine learning methods (the Random Forest algorithm and the Support Vector Machine).

• Accuracy measures the degree of closeness of measurements of a quantity to that quantity's actual (true) value:
Acc = (TruePositive + TrueNegative)/(Total number of elements)
• Precision is the fraction of retrieved documents that are relevant to the query:
Precision = (TruePositive)/(TruePositive + FalsePositive)
• Recall in information retrieval is the fraction of the documents relevant to a query that are successfully retrieved:
Recall = (TruePositive)/(TruePositive + FalseNegative)

The results shown in Figures 8-11 demonstrate that the algorithms work stably and with high accuracy. For example, NIST-DB4 consists of a large number of images (4000), so the algorithms are provided with a wide range of input data, which allows them to learn more and achieve an accuracy of >97%. To assess the robustness of the two proposed machine learning algorithms, Random Forest (RF) and Support Vector Machine (SVM), we compared them on the 3 data sets with 2 other models: CNN and k-NN.
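The three evaluation measures can be computed per class in a one-vs-rest fashion, as in the following sketch. Precision is taken here as TruePositive/(TruePositive + FalsePositive), the standard definition. The function name, the one-vs-rest treatment of the three fingerprint classes and the example labels are assumptions for illustration; the paper does not state how the multi-class counts were aggregated.

```python
def classification_metrics(true_labels, predicted_labels, positive_class):
    """Accuracy, precision and recall for one class treated as 'positive'."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = fn = tn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive_class and truth == positive_class:
            tp += 1
        elif predicted == positive_class:
            fp += 1
        elif truth == positive_class:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Usage with made-up labels (not data from the paper):
truth = ["arch", "loop", "whorl", "loop", "arch", "loop"]
predicted = ["arch", "loop", "loop", "loop", "whorl", "arch"]
loop_scores = classification_metrics(truth, predicted, "loop")
```

Note that this one-vs-rest accuracy counts a sample as correct whenever the "loop or not loop" decision is right, so it can differ from the overall multi-class accuracy.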
The CNN model is trained with a learning rate of 0.01, using a softmax activation function in the pooling layer. The ReLU activation function is used for all hidden convolutional layers, and the maximum number of training epochs is 100. The basic network architecture consists of: input layer, layers [convolution layer, max-pooling layer, activation layer], output layer. When training the network, we chose a batch size of 32 as the number of samples learned per parameter update in the network layers, to obtain the best accuracy. The model with the best classification results on the training set is selected and evaluated on the test sets.
We use the k-NN model with several values of k (k = 1, 3, 5, 7) and choose the best result. For the ensemble method, we ran experiments with different numbers of decision trees, from 150 to 300, to select the best result. The experiments show that an RF built from 200 decision trees ensures high accuracy. The SVM algorithm with a linear multiplier function has the best results for MGE data sets. The results obtained from the classification models (with optimal parameters) are shown in Tables 1-3, and the ROC analysis in Figure 12. The tables show the effect of increasing the number of trees in the ensemble. For both, increasing the number of trees requires more time to learn but also provides better results in terms of Mean Squared Error (MSE), which is calculated as in Equation (5):
MSE = \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2 (5)
where n is the number of test examples, f(x_i) is the classifier's probabilistic output on x_i and y_i are the actual labels. The classification results on the 3 databases, evaluated by Precision, Recall and Accuracy, show that the Random Forest algorithm has the best accuracy (>96.75%) for all 3 criteria. The SVM algorithm and CNN models have the best accuracy (>95.5%) on 2 of the 3 databases. The k-NN model has the lowest accuracy (<93%). The RF algorithm and CNN models need a lot of training time because of the tree construction process and the interpretation of many cases.
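The Mean Squared Error described above, with n test examples, probabilistic outputs f(x_i) and actual labels y_i, can be computed as in this small sketch. It assumes the labels are encoded as 0 or 1 for a single class (one-vs-rest), which the text does not specify; the function name and the sample values are hypothetical.

```python
def mean_squared_error(probabilities, labels):
    """Average of squared differences between f(x_i) and y_i over n examples."""
    if len(probabilities) != len(labels):
        raise ValueError("inputs must have the same length")
    n = len(labels)
    if n == 0:
        raise ValueError("at least one test example is required")
    return sum((p - y) ** 2 for p, y in zip(probabilities, labels)) / n


# Usage with made-up outputs: a confident, correct classifier scores near 0,
# while an undecided one (all outputs 0.5) scores 0.25.
confident = mean_squared_error([1.0, 0.0, 1.0], [1, 0, 1])
undecided = mean_squared_error([0.5, 0.5], [1, 0])
```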

Conclusions
A fingerprint classification method based on two machine learning algorithms, Random Forest and Support Vector Machine, is proposed in this paper. Both algorithms demonstrate the advantage of machine learning in the classification of objects (accuracy ≥96%). Computer vision techniques are combined with CNN technology to improve image pre-processing, which increases the quality of the input images for the processing system. This study used the singularity feature of fingerprints. In future work, we will develop algorithms to extract more features and use deep learning methods to optimize the system.

Figure 1. Block diagram of fingerprint image pre-processing.
The result of image pre-processing of a fingerprint is shown in Figure 2. Figure 2a,b show the results of histogram equalization and the CNN-based noise filter; Figure 2c,d show the results of edge detection and morphology.

Figure 2. The result of image pre-processing of a fingerprint: (a,b) the results of histogram equalization and the CNN-based noise filter; (c,d) the results of edge detection and morphology.

Figure 3. Example of an arch fingerprint.
• Loop: This type can be seen in almost 60% to 65% of fingerprints worldwide. It is called a loop because it is shaped like a water wave, with the following features: the ridges make a backward turn in loops, and the pattern is triangular with a center and an intersection. Loops are divided into two types. RL (Radial Loop): the top of the triangle faces the little finger, and the pattern looks like a stream of water flowing downwards (toward the little finger). This type accounts for about 6% of fingerprints worldwide. UL (Ulnar Loop): the top of the triangle faces the thumb, and the pattern is shaped like a stream of water flowing backward (toward the thumb). This type accounts for only 2% of fingerprints worldwide. A loop pattern has only one delta, as shown in Figure 4.

Figure 4. Example of a loop fingerprint.

Figure 5. Example of a whorl fingerprint.

Figure 6. Block diagram of fingerprint classification using the proposed method.

Figure 7. Examples of fingerprints in the database.

Figure 8. Classification results of the Support Vector Machine algorithm on the 3 databases.

Figure 10. Classification results of the Random Forest algorithm on the 3 databases.

Table 2. Results of the classification algorithms RF, SVM, CNN and k-NN on the FVC database.