FMnet: Iris Segmentation and Recognition by Using Fully and Multi-Scale CNN for Biometric Security

Featured Application: This work proposes an iris segmentation and recognition method, "FMnet", that can reduce computational complexity for clinical investigations.

Abstract: Recent work in deep learning shows that neural networks have high potential in the field of biometric security. The advantage of this type of architecture, in addition to its robustness, is that the network learns the feature vectors automatically by creating intelligent filters through its convolution layers. In this paper, we propose an algorithm, "FMnet", for iris recognition using a Fully Convolutional Network (FCN) and a Multi-scale Convolutional Neural Network (MCNN). By exploiting the ability of convolutional neural networks to learn and work at different resolutions, the proposed method overcomes a limitation of classical methods, which rely only on handcrafted feature extraction, by performing feature extraction and classification together. The proposed algorithm shows better classification results than other state-of-the-art iris recognition approaches.


Introduction
Iris recognition is becoming an increasingly important feature of biometric security systems. Biometrics refers to the science of biological variation and related phenomena. A system is called a biometric recognition system if it can automatically determine the identity of a human being from measurements of biological traits that vary between individuals. Biometric systems recognize a person (or authenticate his or her identity) whose identity has previously been registered in a database of N "authorized" persons.
The general terms authentication and recognition cover both identification and verification. Identification, in the strict sense of the term, implies a closed-group context: the user of the biometric system is known to be one of the N authorized persons, so the task is to determine which of the N best matches the user, and it is therefore not necessary to set a threshold for accepting or rejecting the user. This is often called "1-among-N" identification. Verification, in the strict sense of the term, operates in an open-group context. In other words, we are not at all certain that the identity of the user is actually known to the system. In practice, the user claims the identity of one of the N individuals in the database; if the user is not recognized as such, he or she is an impostor. This is often called "1-for-1" verification. Beyond military and police applications, more and more consumer biometric systems are being deployed to replace (or secure) the personal codes, passwords, cards and keys that today meet the various authentication needs of individuals.
In general, the algorithms applied to recognition tasks, e.g., SVM, Random Forest and HMM [1][2][3], depend on pre-processing of the training data. However, these previous works were mainly carried out in controlled environments, are not very robust, and are sometimes very demanding in terms of computation time and memory.
Deep Neural Networks (DNNs) [4] have been successfully applied to various problems such as gesture recognition and handwriting classification [5], where DNNs are applied directly to the data flow without a pre-processing or feature selection step. We have analyzed the results obtained with the neural network to better understand how it works.
The convolutional neural network (ConvNet or CNN) is widely used in computer vision, for example on ImageNet and PASCAL VOC, which are object detection and classification problems. CNNs have also been very successful in biometrics. The convolutional neural network specializes in processing matrix and signal data. A ConvNet introduces two new layer types: the convolutional layer and the pooling layer, which we describe in the following sections. Fully automatic processing is performed in four steps: segmentation, normalization, feature extraction for each eye image, and finally classification. In this paper, we propose an approach based on deep neural networks for vision. The first step and the last two steps are handled by deep neural networks, which perform the automatic segmentation and the joint extraction and classification of features. Moreover, a multi-scale approach is proposed, using several deep networks at different scales (input sizes) whose outputs are merged.

Why Deep Learning Is Used in Iris Recognition
Deep learning is a branch of machine learning, a field in which the computer learns from data and improves by itself. Machine learning is not new; its roots date back to the mid-20th century. Over the last two decades, several artificial intelligence techniques, such as neural networks, have been developed. Algorithms based on neural networks underpin deep learning, which plays a pivotal role in recognition and computer vision problems. Inspired by the neurons that make up the human brain, neural networks combine neurons into several interconnected layers. Suppose, for example, that we want a neural network to recognize images that contain at least one eye of an individual. Several factors may affect iris recognition, such as the different colors, sizes and shapes of eyes, and different lighting and backgrounds. We therefore need to compile a training set of images containing thousands of examples of human eyes. The neural network is then fed with these images, which are transformed into data that flows between the neurons of the network. In the end, the last layer gathers all these pieces of information to produce a result that identifies the person.
The rest of the paper is organized as follows: Section 2 describes the background of iris recognition and of deep neural network methods. Section 3 presents the related work together with the basic concepts of convolutional neural networks (CNNs) and different CNN architectures. Section 4 briefly describes the proposed model, while the experimental results and discussion are presented in Section 5. Finally, conclusions are drawn in Section 6.

Background
The iris, placed behind the cornea of the eye, is a variable diaphragm pierced by a circular hole called the pupil, and is controlled by a sphincter and a dilator formed of antagonistic smooth muscle fibers, radial and circular. Practical observation through an optical system only allows the macroscopic edges to be detected, without going down to the level of the elementary tubes. The random patterns of the iris are unique to each individual: they constitute a kind of human bar code formed by the filaments, hollows and streaks in the colored ring that surrounds the pupil of each eye. In addition, because the iris is an internal tissue, these iris patterns are relatively safe from lesions.
The neural network appeared in the 1940s under the name "multilayer perceptron" [6,7]. It was defined by Warren McCulloch and Walter Pitts, who showed that any arithmetic or logical function could be approximated with multilayer perceptrons. In 1969, it was shown that perceptrons could only perform binary classification and had difficulty with the classification of the XOR logic gate. In the 1980s, the backpropagation algorithm was developed by David E. Rumelhart et al. [8]. In the same period, a group of researchers led by Geoffrey Hinton at the University of Toronto, Canada, found a way to train neural networks without falling into the local minima problem [9]. In this era, graphics processing units (GPUs) were also developed, which facilitated the handling of images on the PC. Before this, researchers used supercomputers just to experiment with learning from images. A change occurred in 2007, when computer scientists Fei-Fei Li of Stanford University and Kai Li of Princeton University launched ImageNet, a database that contains millions of images described by humans. Yann LeCun popularized neural networks with LeNet, the first convolutional neural network for character recognition, in 1998 [8,10].
In 2001, Paul Viola and Michael Jones at the Mitsubishi Electric Research Laboratory in the United States used a machine learning algorithm to recognize faces in images in real time. Instead of using neurons with different weights, they passed images through a series of simple decisions, such as: does the image contain a light-colored point between dark spots, which may be the top of the nose? Are there two dark areas above a large pale area, which would represent the eyes and cheeks in black and white pictures? The laborious task of describing images was carried out through Amazon Mechanical Turk, a service developed by Amazon that pays a few cents per image described by a user. Today, with the improved computing power of processors and graphics processors, deep learning [4] has become a much more affordable method. Neural networks have been successful in many disciplines, in particular the network of Krizhevsky applied to images from the ImageNet and CIFAR databases [11,12]. The use of such pre-trained image networks as feature generators has been shown to be more effective than the use of expert features with an SVM [13]. In this way, Lagrange et al. [14] achieved excellent mapping results by combining superpixels and deep features in the Data Fusion Contest 2015 [15].
The use of multi-scale convolutional neural networks [16], and then of fully convolutional ones [17], improved on this first approach. Indeed, Fully Convolutional Networks (FCNs) learn not only the semantics of individual pixels but also the spatial structures that connect them, which makes them particularly suitable for mapping from images. Many recent works have shown the effectiveness of fully convolutional networks in this setting, following the principles introduced by Long et al. [18]. By densifying the last layers of a classical classifier network, it is possible to obtain not a vector of probabilities but heat maps indicating the probability of the different classes at every pixel. Several subsequent adaptations have been proposed to preserve the resolution, such as the removal of subsampling [19] and dilated convolutions [20]. Many models are also inspired by convolutional auto-encoders [21] and have a symmetrical encoder/decoder architecture, such as DeconvNet [22] and SegNet [23]. The integration of structured models such as Markov random fields has also been studied [24,25]. Hassan et al. [26] developed a model that relies on the concept of visual attention, which offers a considerable improvement over simple CNN transfer learning, and developed an algorithm combining a CNN and an RNN.

Related Work
Our approach builds on recent CNN architectures for image classification [11,27-30] and on transfer learning. Transfer was first demonstrated on a set of visual recognition tasks [31,32], and then on detection and segmentation, as in [33]. We re-architect and refine classification nets for direct prediction. We map out the FCN space, relate the previous segmentation models to it, and extract features for recognition with a multi-scale feature extraction method.

Convolutional Neural Networks (CNNs)
CNNs, also called multi-layer neural networks, are mostly used in pattern recognition tasks [34,35]. CNNs make the algorithm more robust to small input variations. They require little pre-processing and no choice of specific feature extractors. The proposed architecture is based on different deep neural networks that alternate convolution and aggregation (pooling) layers. These DNNs belong to a class of models inspired by the work of Hubel and Wiesel on the primary visual cortex of the cat [36]. The first part of the architecture, a succession of convolution and aggregation layers, is dedicated to automatic feature extraction, while the second part, composed of fully connected layers of neurons, is dedicated to classification.

Convolution Layer
A convolutional layer $C^i$ (layer $i$ of the network) is parameterized by its number $N$ of convolution maps $M^i_j$ ($j \in \{1, \ldots, N\}$), by the size $K_x \times K_y$ of the convolution kernels (often square), and by the connection scheme $L^{i-1}$ to the previous layer. Each convolution map $M^i_j$ is the result of a sum of convolutions of the maps $M^{i-1}_k$ of the previous layer with their respective convolution kernels. A bias $b^i_j$ is then added and the result is passed through a nonlinear transfer function $\Phi(x)$ [35]. When the layer is completely connected to the previous one, the map can be calculated as follows:

$$M^i_j = \Phi\Big(b^i_j + \sum_{k} M^{i-1}_k * K^i_{jk}\Big) \qquad (1)$$

where $*$ denotes the convolution operator and $K^i_{jk}$ is the convolution kernel that links map $k$ of layer $i-1$ to map $j$ of layer $i$.
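The computation of one convolution map described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `conv2d_valid` and `conv_layer_map`, the "valid" border handling and the default choice of `tanh` as transfer function are assumptions made for the example.

```python
import math

def conv2d_valid(image, kernel):
    """'Valid' 2D convolution (kernel flipped) of two lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            s = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Flipping the kernel distinguishes convolution from correlation.
                    s += image[y + u][x + v] * kernel[kh - 1 - u][kw - 1 - v]
            out[y][x] = s
    return out

def conv_layer_map(prev_maps, kernels, bias, phi=math.tanh):
    """One output map: phi(bias + sum over k of prev_maps[k] * kernels[k])."""
    acc = None
    for prev_map, kernel in zip(prev_maps, kernels):
        conv = conv2d_valid(prev_map, kernel)
        if acc is None:
            acc = conv
        else:
            acc = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(acc, conv)]
    # Add the bias, then apply the nonlinear transfer function element-wise.
    return [[phi(bias + value) for value in row] for row in acc]
```

A full layer would call `conv_layer_map` once per output map, each with its own set of kernels and its own bias.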

Transfer Learning
Transfer learning [8] is a strategy that seeks to improve the performance of one learning machine by using knowledge acquired by another learning machine on other tasks. In practice, following Yosinski et al. [37], training a ConvNet model from scratch (with random initialization) is not recommended, because it requires a lot of data and takes a lot of time. Instead, it is more usual to take an already trained ConvNet model and adapt it to the problem at hand. This is called transfer learning: learning is transferred from a model that deals with one problem to another type of problem. There are two types of transfer learning:

• Feature extraction from the ConvNet: here, the ConvNet is used as a feature extractor [8,38], i.e., a vector is extracted from a certain layer of the model without modifying its structure or its weights, and the extracted vector is then used for a new task.

• Fine-tuning of the ConvNet model [39-41]: here, the new ConvNet is initialized with the weights and the structure of the pre-trained model. The structure is slightly modified for the new task, and the new model is then trained on that task.

Max-Aggregation Layer
In convolutional neural network architectures, the convolutional layers are followed by subsampling layers. A subsampling layer reduces the size of the maps and introduces invariance to the rotations and translations that may appear in the input. A max-pooling layer is a variant of such a layer that has shown some advantages over it [42]. The max-aggregation layer outputs the maximum activation value of its input over non-overlapping regions of size $K_x \times K_y$.
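As an illustration of the max-aggregation operation just described, the following sketch pools a single feature map over non-overlapping regions. The function name `max_pool` and the choice to drop incomplete border regions are assumptions made for this example.

```python
def max_pool(feature_map, kx, ky):
    """Maximum over non-overlapping kx-by-ky regions of a 2D feature map.

    Border rows or columns that do not fill a complete region are dropped.
    """
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[y + dy][x + dx] for dy in range(ky) for dx in range(kx))
            for x in range(0, width - kx + 1, kx)
        ]
        for y in range(0, height - ky + 1, ky)
    ]
```

For example, pooling a 4 × 4 map with 2 × 2 regions yields a 2 × 2 map, so the spatial size is halved along each axis.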

Classic Neural Layer
The convolution and pooling layers are parametrized so that the activation maps of the last layer have size 1, which yields a 1D feature vector. Classical fully connected layers of neurons are then added to the network to perform the classification. In the case of supervised learning, the last layer contains as many neurons as there are desired classes.

Different Architectures of Convolutional Neural Networks (CNNs)
AlexNet: the winner of ILSVRC 2012, which took first place with a top-5 error of 16.4%. Krizhevsky et al. [11] achieved a breakthrough in the large-scale ILSVRC competition in 2012 by using a Deep Convolutional Neural Network (DCNN). AlexNet is a scaled-up version of the convolutional LeNet; it has 25 layers (including five convolutional layers). Here, CNN features are generated for the iris recognition task by extracting the outputs of the 2 fully connected layers and of the 5 convolutional layers.
GoogLeNet: the winner of ILSVRC 2014, with a top-5 error of 6.7%. Szegedy et al. [27] introduced the Inception V1 architecture, called GoogLeNet, with the new insight of using (1 × 1) convolutional blocks to reduce and aggregate the number of features. Further improvements were obtained by employing more inception modules and redesigning the filters in these modules [43,44]. Here, CNN features are generated for the iris recognition task by extracting the outputs of 12 inception layers and of 1 convolutional layer.
VGG: the runner-up of ILSVRC 2014. Simonyan and Zisserman [28] used smaller 3 × 3 filters in each convolutional layer to improve performance. Different versions of VGG have been introduced over the last 5 years, but the 2 most popular are VGG16 and VGG19, which contain 16 and 19 layers, respectively. Here, CNN features are generated for the iris recognition task by extracting the outputs of the 2 fully connected layers and of the 16 convolutional layers.
ResNet: the winner of ILSVRC 2015, also a CNN trained on ImageNet, with an ensemble of residual networks. This network won with an error of only 3.6% [29]. Here, CNN features are generated for the iris recognition task by extracting the outputs of 17 bottleneck layers and of 1 convolutional layer.
DenseNet: in 2016, Huang et al. [30] proposed DenseNet, which connects each layer of the CNN to the others in a feed-forward fashion. Here, CNN features are generated for the iris recognition task by extracting the outputs of 15 selected dense layers.

Pre-Processing of Data
High-resolution images cannot be processed in a single pass by a CNN because of their large size. For example, the average tile size of the Vaihingen ISPRS dataset is 2493 × 2063, while the majority of CNNs are limited to an input of 256 × 256 pixels. Given the limited amount of GPU memory, we divide the high-resolution images into tiles using a sliding window. When the stride of the sliding window is smaller than its size, the predictions are averaged over the pixels where windows overlap. This makes it possible to refine the predictions along the edges of the window and to smooth any discontinuities that may appear. Figure 1 shows the flowchart of our proposed method.
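The sliding-window scheme with averaging over overlapping pixels can be sketched as follows. This is a simplified illustration under stated assumptions: `predict_tile` is a hypothetical stand-in for the network (it must return one score per pixel of the tile), the stride is assumed not to exceed the window size, and a final window is placed flush with the image border so that every pixel is covered.

```python
def window_starts(length, size, stride):
    """Start offsets of windows covering [0, length), last one flush to the edge."""
    starts = list(range(0, max(length - size, 0) + 1, stride))
    if starts[-1] + size < length:
        starts.append(length - size)
    return starts

def predict_tiled(image, size, stride, predict_tile):
    """Average per-pixel predictions of overlapping size-by-size tiles.

    Assumes stride <= size so that every pixel is covered at least once.
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for y0 in window_starts(height, size, stride):
        for x0 in window_starts(width, size, stride):
            tile = [row[x0:x0 + size] for row in image[y0:y0 + size]]
            prediction = predict_tile(tile)
            for dy, pred_row in enumerate(prediction):
                for dx, value in enumerate(pred_row):
                    total[y0 + dy][x0 + dx] += value
                    count[y0 + dy][x0 + dx] += 1
    # Pixels seen by several windows receive the mean of their predictions.
    return [[t / c for t, c in zip(t_row, c_row)] for t_row, c_row in zip(total, count)]
```

With a stride equal to the window size the tiles do not overlap and no averaging takes place; a smaller stride trades extra computation for smoother predictions at tile borders.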

Segmentation Using FCN
In computer vision, segmentation is a process in which each coherent region of the image is assigned a class. This can be done, in particular, by classifying each pixel of the eye image. The size of the images varies and is not necessarily square, so a size of 256 × 256 is retained for the input images. All images of the database are then padded to this size.
In this work, we use an FCN, which consists of convolutional, pooling and upsampling layers and thereby minimizes computation time and the number of parameters. Unlike a traditional CNN, an FCN has no fully connected layers. Furthermore, the network does not require a fixed input size: the FCN takes the entire eye image as input, without any repeated computation on overlapping regions. It thus removes two bottlenecks of CNNs and makes iris segmentation very fast and more accurate. The final output O is computed by a softmax layer, where O(p,q) represents the output of the softmax layer at coordinate (p, q), θ is the parameter of the model, M is the number of classes, f indexes the column of θ associated with each of the M classes, and x(p,q) is the response at coordinate (p, q). The output O is a heat map of size m × n × c, where m and n are the width and height of the input image and c is the number of label classes. The loss at any pixel (p, q) between the final output and the ground truth map can then be calculated (Equation (3)), and the losses over all pixels are summed to measure the accuracy. Figure 1 illustrates the FCN used in this work, which contains 23 convolutional layers and 5 pooling layers. In our model, a fusion layer sums the outputs of these layers after they have been upsampled to the input size. Figure 2 shows the segmentation results obtained by the FCN.
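To make the per-pixel softmax output and loss more concrete, here is a minimal sketch. It assumes the standard softmax over class scores and a cross-entropy loss against the ground truth label; the exact form of the equations used by the authors is not recoverable from the text, so the function names and the choice of cross-entropy are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def pixel_loss(scores, true_class):
    """Cross-entropy loss at one pixel (assumed loss, see lead-in)."""
    return -math.log(softmax(scores)[true_class])

def mean_segmentation_loss(score_map, label_map):
    """Average per-pixel loss; score_map[p][q] is a list of class scores."""
    losses = [
        pixel_loss(score_map[p][q], label_map[p][q])
        for p in range(len(score_map))
        for q in range(len(score_map[0]))
    ]
    return sum(losses) / len(losses)
```

For iris segmentation there would typically be two classes (iris and non-iris), so each `score_map[p][q]` would hold two scores.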

Normalization
To compare two irises, the segmented iris region is brought to a fixed size and aligned by normalizing it with Daugman's rubber sheet model [45]. The benefit of this model is that the circular region can easily be transformed into a rectangular shape. In this process, the center of the circles across the iris region is taken as the reference point. During the encoding process, a bit pattern containing the information bits is obtained, together with a corresponding noise mask whose bits mark the corrupted areas within the iris pattern. Each pixel of the iris in the Cartesian domain is assigned a correspondent in the pseudo-polar domain according to the distance of the pixel from the centers of the circles and the angle that it makes with these centers, as shown in Figure 3. More precisely, the transformation is done according to the following equations:

$$x(r, \theta) = (1 - r)\, x_p(\theta) + r\, x_s(\theta) \qquad (4)$$

$$y(r, \theta) = (1 - r)\, y_p(\theta) + r\, y_s(\theta) \qquad (5)$$

where $r \in [0, 1]$ is the normalized radial coordinate, $x_p(\theta)$ and $y_p(\theta)$ denote the coordinates of the point on the detected boundary of the segmented pupil, and $\theta$ is the angle measured about the center of the pupil. $x_s(\theta)$ and $y_s(\theta)$ denote the coordinates of the points on the outer contour of the iris, obtained by the same principle. Figure 4 shows a normalized image obtained by this process, which is rectangular and of constant size; the size generally chosen is 80 × 512 pixels. The width of the image represents the variation along the angular axis, while the height represents the variation along the radial axis.
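The rubber sheet mapping can be illustrated with a short sketch that linearly interpolates, for each angle, between the pupil boundary and the outer iris boundary. As simplifying assumptions, both boundaries are modeled as circles given as `(cx, cy, radius)`, sampling is nearest-neighbour, and the small default output resolution is chosen only for readability.

```python
import math

def rubber_sheet(image, pupil, iris, radial_res=8, angular_res=16):
    """Unwrap the iris ring into a radial_res x angular_res rectangle.

    pupil, iris: (cx, cy, radius) circles (assumed boundary model).
    radial_res must be at least 2.
    """
    height, width = len(image), len(image[0])
    out = []
    for i in range(radial_res):
        r = i / (radial_res - 1)  # 0 on the pupil boundary, 1 on the iris boundary
        row = []
        for j in range(angular_res):
            theta = 2 * math.pi * j / angular_res
            xp = pupil[0] + pupil[2] * math.cos(theta)
            yp = pupil[1] + pupil[2] * math.sin(theta)
            xs = iris[0] + iris[2] * math.cos(theta)
            ys = iris[1] + iris[2] * math.sin(theta)
            # Linear interpolation between the two boundary points.
            x = (1 - r) * xp + r * xs
            y = (1 - r) * yp + r * ys
            xi = min(max(int(round(x)), 0), width - 1)
            yi = min(max(int(round(y)), 0), height - 1)
            row.append(image[yi][xi])
        out.append(row)
    return out
```

Each row of the result corresponds to one radius and each column to one angle, which matches the convention that the width encodes the angular axis and the height the radial axis.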

MCNN Feature Extraction
In this work, we use N convolutional neural networks with different retina sizes (the retina size of a network is the size of its input images). A given input image is resized N times to match the retina of each network. How to combine classifier outputs is a recurring question in the field of pattern recognition. For iris recognition, it has been shown [46] that a simple average of the outputs of the classifiers gives classification rates at least as good as a linear combination whose weights are learned by cross-validation [47]. In [46], the classifiers act at the same resolution and differ only in the distortions applied to their inputs, so there is no reason to weight them. In this work, by contrast, we assume that the classifiers with the lowest resolution are not as important as those with the highest resolution. The final results are therefore computed as a linear combination of the N CNN classifiers. Figure 5 shows the overall architecture of the MCNN proposed in this work. The networks are built similarly to the LeNet model [35] and use different sizes of convolution kernel and retina. In our recognition system, each CNN uses 6 layers for feature extraction (Figure 6, Table 1). After the extraction, classification is performed to classify the images. Here, the last layer has one output neuron per class (each neuron represents the class of eyes of one person) (Figure 7).
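The multi-scale scheme can be sketched in two small steps: deriving the retina sizes and fusing the per-network class scores. The sizes 80, 56, 40 and 28 are those reported with Figure 5 and correspond to successive divisions by √2; the fusion weights are not given in the text, so the weights used in any call to `fuse` are purely hypothetical, and a plain average is used when none are supplied.

```python
def retina_sizes(base=80, n_scales=4):
    """Input sizes obtained by dividing the base size by sqrt(2) repeatedly."""
    return [int(base / 2 ** (k / 2)) for k in range(n_scales)]

def fuse(outputs, weights=None):
    """Linear combination of per-network class scores.

    outputs: one list of class scores per network.
    weights: one weight per network (hypothetical values; equal by default).
    """
    n_networks = len(outputs)
    if weights is None:
        weights = [1.0 / n_networks] * n_networks
    n_classes = len(outputs[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, outputs))
        for c in range(n_classes)
    ]

def predict_class(outputs, weights=None):
    """Index of the class with the highest fused score."""
    fused = fuse(outputs, weights)
    return max(range(len(fused)), key=fused.__getitem__)
```

Giving a larger weight to the high-resolution network reflects the assumption stated above that low-resolution classifiers matter less; how such weights should be chosen is left open here.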

Experimental Results and Discussion
In this section, the proposed model is applied to the task of segmenting and extracting the iris from eye images. It is compared at the image level with other state-of-the-art methods on different databases. We begin by introducing the databases and their composition, and then present the experimental details.

Databases
In this section, we show experimental results of our proposed method on three iris databases (Figure 8): CASIA-Iris-Thousand, UBIRIS.v2 and LG2200.

CASIA-Iris-Thousand
This database contains more than 20,000 iris images of about 1000 people, with 20 images for each person (10 of the right iris and 10 of the left iris), at a resolution of 640 × 480 pixels in JPG format. The images were collected using an IKEMB-100 camera produced by IrisKing. The IKEMB-100 is a dual-eye iris camera with visual feedback: the bounding boxes shown on its front LCD screen allow users to adjust their position so that a high-quality iris image can be acquired [48].

UBIRIS.v2
The UBIRIS [49] database was published by SOCIA Lab (Soft Computing and Image Analysis Group of the University of Beira Interior, Portugal) and was established in 2004 for the purpose of evaluating images taken under uncontrolled conditions. The peculiarity of this database is that its images are acquired in visible light, contrary to the usual databases, which are acquired in the near infrared (NIR). UBIRIS.v2 [50] contains a total of 11,102 images with a resolution of 300 × 400 pixels in TIFF format, taken with a Canon EOS 5D camera and representing 500 subjects of different origins (70 countries). UBIRIS.v2 images are acquired at a distance (between 4 and 8 m).

LG2200
This database was originally published for the cross-sensor iris recognition challenge. The ND-CrossSensor dataset consists of 27 data sessions with 676 people; a session contains on average 160 unique people, each with multiple images. There are 116,564 images captured with the LG2200 device, and each person appears in at least two sessions over the entire dataset. The resolution of the images is 640 × 480 pixels [51].

Discussion
The proposed model avoids manual feature extraction and is more robust in design than traditional approaches, which are typically unable to address difficult problems. The approach of using multiple convolutional neural networks for the same task has already been used in [46]. The idea is that each network learns different characteristics of the same input. To reach this goal, the authors of [47] apply different distortions (color, shape) to the training set. This approach cannot be used here, since applying distortions can change their chromatin size or distribution, which is not viable for classification. Our model has the following advantages:

• The design of the feature extraction is robust and easy to compute.

• It performs the selection, calculation and evaluation of the relevant characteristics, and of their relevance for class separation, at different levels.

• It avoids the difficulties that arise in the delicate step of designing the characteristics.

Results
All the experiments were carried out several times to ensure the reliability of the results. The error rates presented in Table 2 are the average error rates over these runs. As expected, these rates decrease when the resolution of the input images increases, because high resolutions capture more information. Table 2 presents the numerical results obtained at the different stages of our method. We compare the results obtained when the network is trained only on "network-images" with the results obtained when the network is refined on the entire images. Regarding the transition to learning on whole images, we noted that the results produced by the refined network contain fewer false positives. The left and right iris images of each subject are treated as two different classes. For each class, we randomly choose 70% of the data for training and 30% for testing.
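The per-class random 70%/30% split can be sketched as below. The function name `split_per_class`, the fixed seed and the use of a dictionary keyed by class label (for instance one key per subject and eye) are assumptions made for the illustration.

```python
import random

def split_per_class(samples_by_class, train_fraction=0.7, seed=0):
    """Randomly split each class into training and testing subsets.

    samples_by_class: dict mapping a class label to its list of samples.
    Returns two lists of (sample, label) pairs.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        train += [(sample, label) for sample in shuffled[:cut]]
        test += [(sample, label) for sample in shuffled[cut:]]
    return train, test
```

Splitting inside each class, rather than over the pooled data, keeps the 70/30 proportion for every class, which matters when each class has only a handful of images.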
To segment the iris, we first use the FCN and then normalize the segmented region into a rectangle of 64 × 256 pixels using Daugman's rubber sheet method [45]. The normalized iris region is then resized to 256 × 256, and biometric features are extracted with the MCNN, which defines and extracts local patterns that fit the iris texture, in order to increase the recognition accuracy of the proposed method.
The experimental results show that the FCN and MCNN classification achieves the best accuracy rates on the three databases. Our FCN achieves average segmentation error rates of 0.63%, 0.56% and 0.96% on CASIA-Iris-Thousand, UBIRIS.v2 and LG2200, respectively, computed with Equation (6), and the MCNN achieves average feature extraction error rates of 0.71%, 0.85% and 1%, respectively. This not only exceeds the performance of traditional methods and of other previously proposed deep learning-based architectures, but is also competitive with the most recent state-of-the-art iris recognition algorithms.

$$E = \frac{1}{N \times m \times n} \sum_{x, y} G(x, y) \oplus M(x, y) \qquad (6)$$

where N, m and n are the number, width and height of the tested images, respectively, G is the ground truth, M is the generated iris mask, and x, y are the coordinates in G and M. The ⊕ symbol represents the XOR operation, which counts the pixels on which G and M disagree.
In addition, we also compared our results with those of a multiclass Support Vector Machine (SVM) classifier [52], which takes the features produced by the pre-trained MCNN, without modification, and produces its output as a linear combination in the kernel feature space (Figure 9). The training images are used to train the SVM [52]. For the comparison, we used the same training and testing data. Our proposed MCNN classifier shows a lower error than the SVM, because the optimal features can be extracted more easily with more convolutional layers.
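The XOR-based segmentation error can be computed directly from binary masks, as in the following sketch. It assumes that ground truths and generated masks are lists of 2D arrays of 0/1 integers of identical shape; the function name `segmentation_error` is chosen for the example.

```python
def segmentation_error(ground_truths, masks):
    """Fraction of pixels on which binary ground truths and masks disagree."""
    disagreeing = 0
    pixels = 0
    for truth, mask in zip(ground_truths, masks):
        for truth_row, mask_row in zip(truth, mask):
            for g, m in zip(truth_row, mask_row):
                disagreeing += g ^ m  # XOR is 1 only where the labels differ
                pixels += 1
    return disagreeing / pixels
```

Multiplying the result by 100 gives a percentage comparable to the error rates quoted in the text.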

Performance Metric and Baseline Method
In this work, the recognition rate is reported at FAR = 0.1% and is used as the baseline for performance comparison between feature descriptors. For matching, a 1D Log-Gabor filter with the Adaptive Hamming Distance matching operator is used, as in our previous work [53]. Our method shows recognition accuracies of 95.63%, 99.41% and 93.17%, and FRRs of 4.27%, 0.49% and 6.73%, on the three databases CASIA-Iris-Thousand, UBIRIS.v2 and LG2200, respectively. Quantitative results for the previous methods and our proposed method are shown in Table 3.
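The quantities involved in this evaluation can be illustrated with a small sketch. Note that it uses a plain masked Hamming distance between binary iris codes, not the Adaptive Hamming Distance of [53], whose details are not given here; the function names and the convention that a comparison is accepted when its distance is at or below the threshold are assumptions.

```python
def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fraction of differing bits among positions valid in both masks."""
    valid = [ma and mb for ma, mb in zip(mask_a, mask_b)]
    n_valid = sum(valid)
    if n_valid == 0:
        return 1.0  # nothing comparable: treat as a maximal distance
    differing = sum(1 for a, b, v in zip(code_a, code_b, valid) if v and a != b)
    return differing / n_valid

def far_frr(genuine, impostor, threshold):
    """False accept and false reject rates at a given distance threshold.

    genuine: distances between codes of the same eye.
    impostor: distances between codes of different eyes.
    """
    far = sum(d <= threshold for d in impostor) / len(impostor)
    frr = sum(d > threshold for d in genuine) / len(genuine)
    return far, frr
```

Reporting a recognition rate at FAR = 0.1% amounts to choosing the threshold for which `far` is 0.001 on the impostor comparisons and then reading off the corresponding `frr`.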

Performance Analyses
For the qualitative performance analysis, we use the three databases CASIA-Iris-Thousand, UBIRIS.v2 and LG2200. For each dataset, the analysis is performed and tested on different layers. The results in terms of recognition accuracy are shown in Figure 10. For all CNNs, the accuracy peaks in some middle layer. For the CASIA-Iris-Thousand dataset: VGG on layer 9, Inception on layer 10, ResNet on layer 11 and DenseNet on layer 5. For the UBIRIS.v2 dataset: VGG on layer 9, Inception on layer 10, ResNet on layer 13 and DenseNet on layer 4. For the LG2200 dataset: VGG on layer 9, Inception on layer 10, ResNet on layer 11 and DenseNet on layer 6. On all three databases, the Daugman matcher reaches a rather low rate of 91%. Inception, which peaks at layer 10, converges to its peak more quickly than the other networks. ResNet allows the gradient to flow throughout the network thanks to its skip connections, and this gradient flow makes its iris recognition accuracy much better. DenseNet allows neurons to interact easily thanks to its rich dense connections. It can be observed that the peak results do not occur toward the later layers of the CNNs (Figure 10). The main benefit of normalization is that the normalized iris image is less complex than the original iris image. Hence, encoding the normalized iris image does not require a large number of layers, and the best accuracy can be reached in the middle layers. Among all five CNNs, DenseNet achieves the best recognition accuracy: 99.2% at layer 6 on the CASIA-Iris-Thousand database, 99.5% at layer 4 on the UBIRIS.v2 database and 98.9% at layer 6 on the LG2200 database.

Conclusions
In this paper, we present an iris segmentation and recognition method based on a set of convolutional neural networks (CNNs) acting at different resolutions. Our approach avoids the tedious steps of manual segmentation, by using an FCN, and of manual feature extraction, by using an MCNN. The proposed approach shows the efficiency of convolutional networks for iris recognition, and our results achieve the best classification rates on three different databases (CASIA-Iris-Thousand, UBIRIS.v2 and LG2200) compared with conventional state-of-the-art methods. Future work is moving towards the implementation of better feature extractors for classification (residual network models, wide area networks, etc.). The application of the proposed methods to other modalities of iris databases is also being considered.

Figure 1. The framework for iris recognition: FCN segmentation, then normalization, followed by feature extraction using the MCNN.

Figure 2. Segmentation results obtained by the FCN. The first two, middle two and last two rows correspond to the three databases CASIA-Iris-Thousand, UBIRIS.v2 and LG2200, respectively. The first, second and third columns are the input images, our results and the ground truth, respectively.

Figure 3. Daugman's rubber sheet model for the normalization of the iris.

Figure 6. Example of the feature extraction phase by a CNN.

Figure 9. The framework for iris recognition using SVM classification.

Figure 10. Quantitative analysis: recognition accuracy at different layers of the CNNs for the three databases, from left to right: CASIA-Iris-Thousand, UBIRIS.v2 and LG2200.

Figure 5. The global architecture of the multi-scale convolutional neural network (MCNN). We build 4 CNNs acting at different resolutions. The input images are kept at a resolution of 80 × 80 and are then divided by a factor of √2, 2 and 2√2 along each axis to obtain images of size 56 × 56, 40 × 40 and 28 × 28, respectively. The networks, called CNN80, CNN56, CNN40 and CNN28, are built similarly to the LeNet model.

Table 1. Network architectures. In each box of the table, top: size of the segmented eye image; bottom: size of the map.

Table 2. Error rate evaluation of the different architectures used in our method.

Table 3. Recognition accuracy comparison of the proposed method with previous methods.