Robust Iris Segmentation with Deep CNNs for Detecting Fully or Nearly Closed Eyes in Non-Ideal Biometric Systems

Jan, Farmanullah

doi:10.3390/computers15040253

by

Farmanullah Jan

Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia

Computers 2026, 15(4), 253; https://doi.org/10.3390/computers15040253

Submission received: 18 March 2026 / Revised: 14 April 2026 / Accepted: 15 April 2026 / Published: 17 April 2026

Abstract

This study proposes a robust hybrid framework for iris segmentation in covert biometric systems, specifically addressing the challenge of non-ideal images featuring fully or nearly closed eyes. To overcome the limitations of traditional geometric methods, this study implements a SqueezeNet-based Deep Convolutional Neural Network (DCNN) for rapid eye-state classification. Comparative analysis with various pretrained DCNN models indicates that SqueezeNet provides an optimal balance of accuracy and efficiency, requiring only 1.24 million parameters and a minimal memory footprint of 5.2 MB. For iris contour demarcation, the proposed algorithm combines the Circular Hough Transform (CHT) with global gray-level statistics and anatomical constraints to facilitate reliable iris localization. Utilizing image decimation, percentile-based thresholding, and Canny edge detection, it systematically delineates the limbic and pupillary boundaries. This improved search methodology ensures precise contour delineation, even under sub-optimal imaging circumstances. The proposed algorithm was validated on a novel dataset encompassing challenging conditions such as specular reflections, blur, non-uniform illumination, and varying degrees of occlusion, including nearly or fully closed eyes. Experimental results demonstrate superior segmentation accuracy and significant computational efficiency, underscoring the model’s potential for real-time biometric applications in unconstrained environments.

Keywords:

iris segmentation; deep learning; convolutional neural network; transfer learning; machine learning; smart systems

1. Introduction

To prevent criminal misuse, reliable verification of an individual’s identity is a central concern at any point of identification [1]. Numerous organizations, corporations, and governmental agencies, especially in underdeveloped countries, still use traditional security measures for human identification. In this regard, token-based (e.g., keys) and knowledge-based (e.g., passwords) techniques are used to verify a person’s claimed identity [1,2]. Reports in both print and social media reveal that these security measures are not trustworthy, because they can be lost, stolen, or hacked [3,4,5,6].

To address these issues, the research community turned to biometric technology, which is today considered more secure than traditional security measures. In addition, an individual is freed from the burden of carrying something (e.g., an identity card) to the point of identification or remembering something such as a personal identification number. Biometric technology is considered a secure shield against cyber attackers in diverse applications such as smart cities, offices, and residential buildings [7,8,9].

Many smart cities around the globe still rely on traditional security means [10,11]. A recent report published by the Fast Identity Online (FIDO) Alliance acknowledges that weak, forgotten, or reused passwords are the root causes of the most expensive cybercrimes [8]. Trivial passwords (e.g., a date of birth) are commonly used to secure online identities and reportedly account for around 80% of the loopholes [12]. In alliance with FIDO, Nok Nok, Lenovo, and PayPal launched a research project in 2013 on the usage of passwords [13,14,15,16,17,18,19]. It was reported that the majority of users do not know how to use passwords properly. Due to these risks and hazards, numerous stakeholders such as Visa, Alibaba, and Samsung are now working to equip their products with high-end sensors in order to put an end to the password culture [13,14,15,16,17,18,19].

Biometric technology uses the physiological (e.g., face) and behavioral (e.g., gait) traits of humans for their precise recognition. Unlike traditional security measures, these biometric traits are hard to compromise, except through sophisticated attacks such as medical surgery. At present, different biometric technologies (e.g., fingerprint, palm, and face) are in use. Amongst these technologies, however, the iris biometric has gained more attention from researchers around the globe due to its stability and reliability. After acquiring human eye images or videos through a non-invasive procedure, an iris biometric system applies a set of digital signal processing and pattern recognition techniques to identify the subject with great precision. Figure 1 shows a typical image of a human eye. An iris is a visible organ safeguarded by the cornea. Its complex structure comprises corona, arching ligaments, furrows, ridges, crypts, and freckles [20,21]. The structure of the iris remains stable throughout a person’s life, except for some minute variations occurring in infancy.

For overt applications (e.g., passport and immigration), iris biometric technology is successfully functioning in many countries, including the Kingdom of Saudi Arabia, the United Arab Emirates, the USA, and the UK, among others. Due to current security threats and terrorist activities, this technology is also demanded for covert applications such as monitoring criminal activities in public places [23,24].

Iris biometric technology in covert applications is ideally expected to operate in a fully unconstrained environment while acquiring a subject’s eye images or video data. As depicted in Figure 2, a subject in such a setup may be moving at a normal pace and/or be at a distance from the image acquisition unit. Sometimes, subjects are even unaware that the system is installed. The image acquisition units of these systems have improved greatly and now use near-infrared (NIR) or visible spectrum (VS) illuminators while acquiring eye images or video data.

Figure 3 shows some typical NIR and VS eye images acquired in non-ideal iris biometric setups [26]. These images are selected from the public datasets [27], which are used to simulate non-ideal iris biometric systems. The acquired images are not of excellent quality and have non-ideal factors such as the non-uniform illumination, closed eyes, blur, defocus, off-axis/off-angle eyes, and so on.

Amongst the above-highlighted issues, fully or nearly (almost) closed eyes are a major challenge for most contemporary iris segmentation and localization schemes [6,7]. This is because non-ideal iris biometric systems generally mark the inner (pupillary) and outer (limbic) iris contours using algorithms built around the Circular or Elliptical Hough Transform (CHT/EHT), the integro-differential operator (IDO), active contour models (ACMs), or histogram- and thresholding-based schemes [3,27,28]. Most of these schemes mark iris contours quite effectively, but they may take significantly longer or get stuck when the eye in the target image is fully or almost closed.

Research Gap

Non-ideal iris biometrics is still an open research area because of its strong potential for covert applications, e.g., monitoring criminal activities at private and public places. Due to this reason, research communities around the globe are consistently working on the development of resilient and robust iris biometric systems, specifically for covert applications. For example, a lot of research work is still going on to improve the range of image acquisition setups using NIR and VS illumination [3,25,27,28]. This is imperative to maintain the minimum iris resolution at longer distances.

Also, a plethora of research work is going on to improve iris segmentation modules. As mentioned before, this module plays a very decisive role in the overall system’s performance. It is responsible for demarcating actual iris contours, marking noise (e.g., light reflections, hair, and eyelids) in the valid iris part, and feeding it to the next modules, i.e., feature extraction, matching and recognition. However, if output from this module is not correct (e.g., closed eye or highly occluded iris), then the system concerned may reject the identity of a genuine subject.

Traditional iris recognition systems, particularly those proposed by John Daugman [29] and Richard P. Wildes [21], rely on geometry-based segmentation using IDO and CHT for iris boundary localization, respectively. While effective under ideal conditions, these schemes assume visible circular iris contours and degrade when the eye is fully or nearly closed. Because these schemes are segmentation-driven rather than state-aware, they lack an explicit procedure for detecting eye openness, making them unpredictable in the presence of eyelid occlusion and non-ideal capture conditions.

To cope with the aforementioned issue, researchers have developed schemes to distinguish closed from open eyes. However, most of these schemes are based on simple image processing techniques, edge operators (e.g., CHT and IDO), image binarization, or texture analysis. Because image data in non-ideal iris biometric systems may have non-uniform illumination or dimensions, these techniques may not be equally useful when the image acquisition setup is inconsistent. A more promising solution to this issue therefore lies in deep learning [30,31].

Currently, there is a strong trend toward using machine learning (ML) techniques in iris biometrics [4,6,14,32,33,34,35,36,37]. Unfortunately, many of these techniques focus only on the iris segmentation, feature extraction, or matching and recognition modules, and little to no attention has been given to the issue of fully or nearly closed eyes. For example, in [30], the authors proposed a Deep Convolutional Neural Network (DCNN)-based solution to classify open and closed eyes in VS images. This scheme is based on a pretrained model, ResNet50, using the transfer learning approach. It has shown satisfactory performance when the eye is either fully closed or open. However, it does not reject eyes that are almost closed, and accepting such images would deteriorate the overall system performance.

Shakarchy and Ali [31] proposed a DCNN-based method to detect fully closed and open eyes. It includes two main stages. In the first stage, it uses a multi-task cascaded CNN (MTCNN) module to mark the eyes’ locations in images. This module is tolerant of face tilting, occlusion, and non-uniform illumination. In the second stage, it uses an eye closed and open detection network (ECODN), which receives the output of the MTCNN. The ECODN itself has two parts. For feature extraction, the authors used a custom-designed CNN module including convolutional, ReLU, and max-pooling layers. For binary classification, they used a fully connected layer followed by a sigmoid activation function in the output layer. Although it reports good accuracy, it inherits the same flaw as the method in reference [30].

The literature also reveals that accurate iris recognition depends on several image quality factors, such as iris diameter, visible iris area, illumination, focus, and occlusion [29,38,39,40,41,42]. Dependable feature extraction and phase encoding require an iris diameter of at least 100 pixels, and the visible iris area should be 70–90 percent to ensure adequate unobstructed texture. Moderate rotation of the head and pupils is acceptable, but excessive angles can impair recognition. Focus quality affects both classical and deep learning methods. Eyelids and eyelashes must be removed, as increased occlusion reduces the usable area of the iris. Low-resolution images diminish discriminative capability, which also affects DL models. Images that do not meet the diameter and visibility requirements are prone to segmentation errors and a higher rate of false rejections [29,38,39,40,41,42]. These requirements imply that fully or nearly closed eyes should be filtered out before they enter a functional iris biometric system; instead, a usable sample should be acquired if feasible.

To fill the research gap highlighted above, this study proposes a real-time iris segmentation algorithm to demarcate iris contours in the presence of noise such as eyelids, reflections, eyeglasses, and to detect fully or nearly (almost) closed eyes. For fully or nearly closed eye detection, this scheme applies a well-trained deep CNN (DCNN) model, which is developed via transfer learning [37,43]. In the transfer-learning approach, the base model (a modified DCNN) benefits from the knowledge provided by the pretrained DCNN (P-DCNN) while learning new patterns in target data. No doubt, fine-tuning a P-DCNN with transfer learning is easier and faster than training from scratch. To date, there exist numerous P-DCNN models such as GoogleNet, SqueezeNet, VGG16, VGG19, and MobileNet-v2, which are well trained on a large set of images taken from ImageNet-1K, a subset of ImageNet [44]. Following that, it applies an iterative segmentation scheme comprising CHT, anatomical constraints of the iris, and global gray level image statistics to demarcate the iris inner and outer contours. Major highlights of this study are as follows:

Efficient detection of fully or nearly closed eyes in covert setups.
Real-time robust iris segmentation in the presence of noise.
Improved overall efficiency of the resulting iris biometric system.
Using the proposed DCNN model as a preprocessing step, contemporary iris segmentation and localization schemes can also qualify for real-time applications.
The proposed approach works equally well for both visible spectrum (VS) and near-infrared (NIR) images.

The remainder of this article is organized as follows. Section 2 explicitly details the proposed iris segmentation algorithm. In addition, it introduces an enhanced iris biometric system integrating the trained DCNN model. Section 3 discusses experimental results of the proposed algorithm. Finally, Section 4 concludes the study, emphasizing its main contributions and directions for future work.

2. Proposed Algorithm

This section is subdivided into the following sub-sections:

○: DCNN model for fully/nearly closed eyes detection;
○: Enhanced iris biometric system.

2.1. DCNN Model for Fully/Nearly Closed Eyes Detection

As highlighted earlier, most contemporary iris biometric systems cannot reliably detect fully or nearly (almost) closed eyes. As a result, the system may wrongly reject the claimed identity of a genuine subject. To resolve this issue, the author proposes embedding a well-trained DCNN model between the image acquisition and iris segmentation modules.

With this add-on, the resulting iris biometric system can efficiently detect fully or nearly closed eyes, stop feeding such images to later modules, and request another sample from the dataset or image acquisition device if possible. This section has the following sub-tasks:

○: Preprocessing;
○: Dataset augmentation;
○: Transfer learning.

2.1.1. Preprocessing

Preprocessing is an essential step in machine learning and digital image processing. Its main job is to clean the target data and transform it into a proper format before feature extraction. A proper selection of preprocessing steps (e.g., downscaling, color correction, noise suppression, and augmentation) can significantly increase the robustness and efficiency of DCNN models. The following steps are proposed to preprocess input images before training the pretrained (P-DCNN) models via transfer learning on the new data:

Step-a: Color (RGB) requirements: The P-DCNN models considered here (e.g., GoogleNet, SqueezeNet, and VGG19) accept only three-plane RGB (i.e., color) images. In this step, the proposed scheme checks the format of the input image, say ∅ (x, y), as follows:
If ∅ (x, y) is not an RGB image, then its gray-level values are copied into the red, green, and blue planes to construct an RGB-equivalent image C (x, y), such that

$C (x, y) \leftarrow \mathrm{Concatenate} (\emptyset (x, y), \emptyset (x, y), \emptyset (x, y))$

(1)

Otherwise,

$C (x, y) \leftarrow \emptyset (x, y)$

(2)

As shown in Figure 4, Equation (1) does not change the visual appearance of a gray-level image ∅ (x, y); the image simply has three planes, like an RGB image. Because C (x, y) holds the original gray-level values of ∅ (x, y) in its red, green, and blue planes, the pretrained DCNN models accept C (x, y) as a color input image. Without this step, these models would raise an input-dimension error.
Step-b: Resolution requirements: As P-DCNN models accept images with a specific resolution, this step adjusts the resolution of C (x, y) accordingly. The proposed scheme first inspects the input layer of the target P-DCNN model, which specifies the required image resolution and number of color planes. For example, the input layer of “GoogleNet” specifies $(224 \times 224 \times 3)$ as the input size. This means C (x, y) must have 224 rows, 224 columns, and 3 planes (i.e., RGB); otherwise, it is rejected by the input layer.
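
Steps a and b can be illustrated with a minimal Python sketch. It is not the author's implementation (the study used MATLAB); the function names `to_rgb`, `resize_nearest`, and `preprocess` are hypothetical, and nearest-neighbour resampling is an assumption standing in for whatever resizing routine is actually used.

```python
import numpy as np

def to_rgb(image: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 array, replicating a gray-level plane into R, G, B (Eq. 1)."""
    if image.ndim == 2:                          # gray-level input
        return np.stack([image, image, image], axis=-1)
    if image.ndim == 3 and image.shape[2] == 3:  # already RGB (Eq. 2)
        return image
    raise ValueError(f"unsupported image shape {image.shape}")

def resize_nearest(image: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Nearest-neighbour resize to (rows, cols); an illustrative stand-in resizer."""
    src_rows, src_cols = image.shape[:2]
    r_idx = np.arange(rows) * src_rows // rows   # source row for each output row
    c_idx = np.arange(cols) * src_cols // cols   # source column for each output column
    return image[r_idx[:, None], c_idx[None, :]]

def preprocess(image: np.ndarray, input_size=(224, 224)) -> np.ndarray:
    """Apply Step-a (plane replication) and Step-b (resolution adjustment)."""
    return resize_nearest(to_rgb(image), *input_size)

# Synthetic gray-level image standing in for an NIR eye image.
gray = np.random.default_rng(0).integers(0, 256, size=(480, 640), dtype=np.uint8)
rgb = preprocess(gray)
print(rgb.shape)  # (224, 224, 3)
```

The default `input_size` matches the GoogleNet example in the text; other models may declare a different input size in their first layer.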

2.1.2. Dataset Augmentation

If a target dataset is too small, or its images do not cover all scenarios of a target object, then P-DCNN models may overfit. This can happen because it is difficult or time-consuming to capture all real-world scenarios for a target object, e.g., fully or nearly closed eyes and fully or nearly open eyes. To address data scarcity, the proposed algorithm utilizes a data augmentation strategy that simulates diverse ocular states. Given the binary nature of identifying occluded versus non-occluded eyes, this approach allows the model to generalize effectively across challenging imaging conditions without the need for an exceptionally large primary dataset [45,46].

Augmenting images not only increases dataset size but also introduces new instances that may otherwise be hard to find. This extends the input data to other situations, which may result in better training of the model concerned. New instances can be produced by translating sample images in the original dataset along the x- and/or y-axis, rotating them, changing their contrast, and converting them from RGB to gray-level format, among others. The proposed scheme uses the following steps while augmenting the target dataset:

○: Horizontal Translation: The proposed algorithm generates new images by randomly translating each sample image horizontally to both left and right sides; the translation range is preset to twenty pixels.
○: Vertical Translation: The proposed algorithm also generates new images by randomly translating each sample image vertically up and down; the translation range is preset to twenty pixels.
○: Rotation: In this step, the proposed algorithm generates new images by randomly rotating each sample image in both clockwise and anticlockwise directions; the range is preset to $0^{\circ}$ to $90^{\circ}$.

As augmentation introduces significant variations in the dataset images, this process is applied to both training and validation datasets while training P-DCNN models. Figure 5 shows some sample eye images after the augmentation process.
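
The three augmentation steps can be sketched as follows. This is a minimal NumPy-only illustration, not the author's code: the helper names `translate`, `rotate`, and `augment` are hypothetical, the nearest-neighbour rotation and zero fill for uncovered pixels are assumptions, and combining all three transforms in a single call is one possible design rather than something the text specifies.

```python
import numpy as np

def translate(image, dx, dy, fill=0):
    """Shift by dx columns and dy rows (positive = right/down); assumes |dx|, |dy| < image size."""
    out = np.full_like(image, fill)
    h, w = image.shape[:2]
    src = image[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def rotate(image, angle_deg, fill=0):
    """Rotate about the image centre using nearest-neighbour inverse mapping."""
    h, w = image.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # For every output pixel, locate the source pixel it is copied from.
    x0 = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    y0 = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    xi, yi = np.rint(x0).astype(int), np.rint(y0).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.full_like(image, fill)
    out[valid] = image[yi[valid], xi[valid]]
    return out

def augment(image, rng, max_shift=20, max_angle=90.0):
    """Random shift of up to 20 px per axis and rotation of up to 90 degrees either way."""
    dx = int(rng.integers(-max_shift, max_shift + 1))
    dy = int(rng.integers(-max_shift, max_shift + 1))
    angle = rng.uniform(0.0, max_angle) * rng.choice([-1.0, 1.0])
    return rotate(translate(image, dx, dy), angle)

rng = np.random.default_rng(42)
sample = np.arange(224 * 224, dtype=np.int64).reshape(224, 224)  # synthetic stand-in image
augmented = augment(sample, rng)
print(augmented.shape)  # (224, 224)
```

Each call draws a fresh random transform, so repeated calls on the same sample yield different training instances.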

2.1.3. Transfer Learning: Training P-DCNN Models on New Images

As mentioned earlier, deep learning (DL) is a branch of machine learning (ML) [47,48]. It teaches computing machines to do what comes naturally to human beings, i.e., learn from experience. ML algorithms use computational schemes to learn the desired information directly from target data, instead of relying on predetermined equations as a model. Today, DL is considered particularly suitable for applications such as facial recognition, autonomous driving, motion detection, and pedestrian detection.

Table 1 shows a list of publicly available pretrained DCNN models (P-DCNN) [49]. Network depth stands for the largest number of sequential convolutional (or fully connected) layers on a path from the input layer to the output layer. Input images for all P-DCNN are RGB (i.e., color) images. In addition to the storage size, it also shows the number of trainable parameters against each model. In brief, this table is a quick reference while selecting a P-DCNN model for a target application.

All P-DCNN models were previously trained on images taken from ImageNet-1K, which is a subset of ImageNet [44]. ImageNet is also referred to as ImageNet-21K (or ImageNet-22k); it has 14,197,122 images, which are divided into 21,841 classes. Hundreds of relevant images are supplied for each class. Its most widely used subset is the “ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012–2017 image classification and localization dataset”, i.e., ImageNet-1K or ILSVRC2017. ImageNet-1K has a total of 1,281,167 training images, 50,000 validation images, and 100,000 test images, spread over 1000 classes.

Transfer learning is often used in different DL applications, e.g., general image classification. A P-DCNN is used as a beginning point for learning a new task. Fine-tuning a P-DCNN via transfer learning is much easier and faster than training a similar network, initialized with randomly selected weights, from scratch. The deeper layers in a P-DCNN are fine-tuned by training it on a new dataset. As P-DCNNs have already learned a rich set of image features [49], these can learn features more specific to a new dataset when such networks are fine-tuned. Figure 6 illustrates a block diagram of the proposed transfer learning method. It includes the following steps:

○: Loading a P-DCNN model;
○: Preparing base model (BM);
○: Freezing BM initial layers;
○: Training BM on the new dataset;
○: Fine-tuning BM.

Specifically, the modified architecture of the proposed Base Model (BM) differs from standard P-DCNNs through a targeted structural redesign of the final stages. While standard models like SqueezeNet are configured for 1000 object categories, the BM replaces the final classification layer (fullyConnectedLayer) with a task-specific layer containing only two neurons: FullyAlmostClosedIrises and OpenIrises. Furthermore, the final convolution2dLayer is replaced to enable the extraction of high-level features unique to ocular geometry. To preserve the generic feature-detection capabilities (edges and gradients) of the early layers while minimizing computational overhead, these initial layers are frozen, ensuring that weight updates during training are concentrated solely on the newly integrated task-specific components.

Step-a: Loading P-DCNN Model: As P-DCNN models are saved networks, this step loads one such model (e.g., SqueezeNet) along with its weights. All P-DCNN models have a layered architecture; therefore, their complexity depends on the number of hidden layers, overall architecture, trainable parameters, etc.
Step-b: Preparing Base Model (BM): As P-DCNN models are already trained on images of 1000 classes, the loaded P-DCNN model has more neurons in the final output layer than required for the current case. For this reason, the author replaced the output layer (called classification layer or fullyConnectedLayer) with a new layer having only two classes, i.e., FullyAlmostClosedIrises and OpenIrises. The first one represents fully or nearly closed eyes, whereas the second one represents fully or almost open eyes. Additionally, the author removed the final convolutional layer (known as convolution2dLayer) and replaced it with a new one. Convolution layers extract the vital low- and high-level features from target images during training and update their weights accordingly. The resultant P-DCNN model is now called the base model (BM).

For the closed eye shown in Figure 7a, Figure 7b shows a montage of sample outputs from the convolution2dLayer (fire3-expand1x1). Similarly, Figure 7c is a montage showing sample outputs from the convolution2dLayer after passing through the ReluLayer (fire5-relu_squeeze1x1). The job of the ReluLayer is to set negative values in its input to zero.

Step-c: Freezing BM Initial Layers: The initial layers of BM hold knowledge of generic features such as edges, color, and gradients, whereas its final layers hold knowledge of features specific to the target task. For this reason, it is wise to freeze the initial layers and avoid retraining them. Unfreezing the initial layers lets them update their weights again (i.e., relearn) on the training dataset; this is little different from training the BM from scratch and wastes computational resources and time. It is therefore better to either freeze these layers or reduce their learning rate to a minimum relative to the last learnable layers.
Step-d: Training BM: In this step, BM is trained on a newly developed dataset (see Section 3.2) of 600 eye images. Half of the images have closed eyes, and the remaining half have open eyes. This dataset is divided into training (70%) and validation (30%) sets. Though this dataset is relatively small, transfer learning and the augmentation scheme make it sufficient to obtain good results while limiting overfitting. Before training begins, it is necessary to initialize the hyperparameters of the new fullyConnectedLayer and convolution2dLayer. Additionally, a Solver (i.e., optimizer) needs to be selected. Table 2 shows a summary of hyperparameter initialization.
With three choices for each hyperparameter, the total number of combinations $(n)$ for six hyperparameters is $n = 3^{6} = 729$ . This implies that 729 trained clones of BM would be produced for each P-DCNN; for the 19 P-DCNN models, the total number of expected clones is $729 \times 19 = 13,851$ . In this step, each hyperparameter is initialized to the first choice from its respective domain, and the training process is initiated.
Step-e: BM Fine-tuning: For each trial, the hyperparameter values are recorded together with the mini-batch accuracy, validation accuracy, time elapsed, mini-batch loss, validation loss, etc. To automate this task, the author utilized the MATLAB Experiment Manager (R2026), which can execute trials in parallel or sequentially. It also records all necessary training plots and confusion matrices. Fine-tuning of BM is essential for finding the best-trained DCNN model for the current application. Figure 8 supports this claim, because the validation accuracy of BM jumps from 50% to 98.89% simply by changing the value of InitialLearnRate from 0.1 to 0.001.
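
The size of the search space described in Step-d can be checked with a short Python sketch. Table 2 is not reproduced here, so the search space below is hypothetical: only `InitialLearnRate` and its values 0.1 and 0.001 appear in the text, while the third learning rate, the solver names, and the generically named remaining hyperparameters are placeholders chosen for illustration.

```python
from itertools import product

# Hypothetical search space: six hyperparameters with three choices each.
search_space = {
    "InitialLearnRate": [0.1, 0.01, 0.001],   # 0.01 is an assumed middle value
    "Solver": ["sgdm", "adam", "rmsprop"],    # assumed choices, not taken from Table 2
    "hyperparameter_3": ["a", "b", "c"],      # placeholders for the remaining entries
    "hyperparameter_4": ["a", "b", "c"],
    "hyperparameter_5": ["a", "b", "c"],
    "hyperparameter_6": ["a", "b", "c"],
}

# Every combination of one choice per hyperparameter corresponds to one trial.
grid = [dict(zip(search_space, combo)) for combo in product(*search_space.values())]

num_models = 19                        # number of P-DCNN models compared in the study
total_clones = len(grid) * num_models

print(len(grid), total_clones)         # 729 13851
print(grid[0]["InitialLearnRate"])     # 0.1, the first choice from its domain
```

The first grid entry takes the first choice from every domain, which corresponds to the initial trial described in Step-d.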

Figure 9a shows the training and validation performance of the SqueezeNet model on the target dataset. It shows the progression of the model’s accuracy and loss over 680 iterations, spanning 20 epochs. Notably, the SqueezeNet model demonstrated rapid convergence, with the training accuracy crossing 90% within the first 50 iterations and stabilizing near 100% shortly thereafter. Its final validation accuracy reached 99.36%, which indicates a high degree of generalization to the unseen image data. Correspondingly, the loss exhibited a sharp, roughly exponential decay and stayed near zero for much of the training duration.

In addition to training plots, a confusion matrix was generated for the validation dataset to provide a comprehensive evaluation of the model’s classification performance, as shown in Figure 9b. This matrix reveals strong discrimination between the two classes: “FullyAlmostClosedIrises” and “OpenIrises.” Specifically, the SqueezeNet model correctly identified all 546 instances of the “FullyAlmostClosedIrises” class, achieving 100% recall for this class. In the “OpenIrises” class, 539 instances were correctly classified, and only 7 instances were misclassified as “FullyAlmostClosedIrises”. These results produce an overall validation accuracy of approximately 99.36%, which underscores the model’s robustness and its ability to minimize false negatives in a binary classification context.
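
The reported figures can be reproduced directly from the confusion-matrix counts. The sketch below treats “FullyAlmostClosedIrises” as the positive class, which is a convention assumed here for illustration; precision and specificity are derived from the same counts and are not quoted in the text.

```python
# Confusion-matrix counts from the validation set (positive class = closed eyes).
tp = 546   # closed eyes classified as closed
fn = 0     # closed eyes classified as open
fp = 7     # open eyes misclassified as closed
tn = 539   # open eyes classified as open

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)          # sensitivity for closed eyes
precision = tp / (tp + fp)
specificity = tn / (tn + fp)     # recall for open eyes

print(f"accuracy    = {accuracy:.2%}")     # 99.36%
print(f"recall      = {recall:.2%}")       # 100.00%
print(f"precision   = {precision:.2%}")    # 98.73%
print(f"specificity = {specificity:.2%}")  # 98.72%
```

Under this convention, the 7 errors are false positives: open eyes that would be rejected and re-acquired, rather than closed eyes passed on to segmentation.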

To ensure a rigorous evaluation of the model’s discriminative capabilities, the validation dataset was intentionally balanced with an equal distribution of 546 instances for both ‘FullyAlmostClosedIrises’ and ‘OpenIrises’ classes. This balanced approach prevents metric bias and provides a clear baseline for the model’s sensitivity. While real-world ocular data is typically imbalanced, with open eyes occurring more frequently, the 100% recall achieved for closed irises in this study demonstrates the model’s robustness. By utilizing the data augmentation strategy described in Section 2.1.2, the proposed scheme effectively prepares the P-DCNN to identify rare, challenging ocular states in unconstrained environments, regardless of their frequency in a natural distribution.

2.2. Proposed Enhanced Iris Biometric System

Figure 10 illustrates a block diagram of the proposed enhanced iris biometric system. To prevent the entry of fully or nearly closed eyes into the concerned iris biometric system, the author incorporates a well-trained DCNN model (e.g., SqueezeNet) between the image acquisition setup and iris segmentation modules. Due to this add-on, existing iris biometric systems become capable of detecting fully or nearly closed eyes, rejecting that sample, and acquiring a good sample from datasets or an image acquisition setup. A typical Daugman’s [20] or Wildes’ [21] iris biometric system comprises the following four modules:

○: Image acquisition setup,
○: Iris segmentation,
○: Features extraction, and
○: Matching and recognition.

Each of these modules plays an indispensable role in the overall system’s efficiency; however, the iris segmentation module plays the most critical role [3,50,51].
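
The gating role of the DCNN in this pipeline can be sketched as a simple control loop. This is an illustrative outline under stated assumptions, not the system's actual interface: `acquire_usable_sample`, the `is_closed_eye` callable, and the `max_attempts` limit are all hypothetical names and parameters.

```python
from itertools import islice
from typing import Callable, Iterable, Optional, TypeVar

Image = TypeVar("Image")

def acquire_usable_sample(
    samples: Iterable[Image],
    is_closed_eye: Callable[[Image], bool],
    max_attempts: int = 5,
) -> Optional[Image]:
    """Return the first sample the classifier accepts, or None if none is found.

    `samples` stands in for a dataset or an image acquisition device, and
    `is_closed_eye` stands in for the trained DCNN's decision.
    """
    for image in islice(samples, max_attempts):
        if not is_closed_eye(image):
            return image      # forward to iris segmentation
    return None               # no usable sample within the attempt budget

# Toy usage with strings standing in for eye images.
stream = ["closed", "nearly_closed", "open", "open"]
chosen = acquire_usable_sample(stream, lambda img: img != "open")
print(chosen)  # open
```

Only the accepted sample reaches the segmentation, feature extraction, and matching modules, so the costly contour search is never run on an eye that is fully or nearly closed.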

Proposed Iris Segmentation

In this section, the author proposes a robust and effective method for iris localization and segmentation that combines Circular Hough Transform, anatomical constraints of the iris, and gray-level statistical analysis. The approach exploits the approximately circular structure of pupil and limbic boundaries while using image intensity statistics to guide and refine the search space for circular detection. This combination improves the robustness of iris localization in non-ideal eye images that may contain occlusions, noise artifacts, or illumination variations. It includes the following steps:

Step 1: The author converts the input eye image $I (x, y)$ to grayscale if it is an RGB (color) image. The resultant image is $g (x, y)$ , see Figure 11b. To reduce computational cost, the author then resizes (decimates) the resultant image to 60% of its original size in each dimension, which effectively discards about 40% of the rows and columns. This lowers the data volume and significantly accelerates the subsequent processing steps, particularly CHT, which is a computationally intensive algorithm. The decimated image $G (x^{'}, y^{'})$ can be expressed as follows:

$G (x^{'}, y^{'}) = g (m \cdot s, n \cdot s), m = 0, 1, \dots, M^{'}, n = 0, 1, \dots, N^{'}$

(3)

where $s = 0.6$ is the scaling factor, $M^{'}$ and $N^{'}$ denote the dimensions of the resized image, and $x^{'}, y^{'}$ are the coordinates in the decimated image. This operation reduces the row and column dimensions uniformly. After resizing, the author applies a median filter of size $15 \times 15$ pixels to the grayscale image $G (x^{'}, y^{'})$ to suppress noise and small artifacts such as eyelashes, reflections, and sensor noise. The median filtering operation is represented as follows:

$M (x, y) = \operatorname{median} \{G (i, j) \mid (i, j) \in W (x^{'}, y^{'})\},$

(4)

where $W (x^{'}, y^{'})$ is the $15 \times 15$ neighborhood window centered at pixel $(x^{'}, y^{'})$ . The resultant image $M (x, y)$ is shown in Figure 11c. The author chose a $15 \times 15$ median filter to suppress noise such as eyelashes, hair, and light reflections. Unlike a Gaussian filter, a median filter removes such impulsive artifacts without blurring edges, so the inner and outer iris boundaries, in particular the limbic boundary, stay sharp enough for accurate localization. This balance between noise removal and shape preservation makes the system more reliable on noisy images.
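A minimal pure-Python sketch of the decimation and median-filtering operations follows. It rests on two assumptions that the text does not state: $s$ is read as the ratio of output size to input size (so the source index is taken as $\lfloor m / s \rfloor$), and the filter window is clipped at the image borders. The function names are illustrative, and the usage lines use a small $3 \times 3$ window only to keep the toy example short.

```python
from statistics import median


def decimate(img, s=0.6):
    """Nearest-neighbour resize of a 2-D list to round(s * size) per axis."""
    rows, cols = len(img), len(img[0])
    out_rows = max(1, round(rows * s))
    out_cols = max(1, round(cols * s))
    return [
        [img[min(int(m / s), rows - 1)][min(int(n / s), cols - 1)]
         for n in range(out_cols)]
        for m in range(out_rows)
    ]


def median_filter(img, k=15):
    """k x k median filter; the window is clipped at the image borders."""
    rows, cols = len(img), len(img[0])
    h = k // 2
    out = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            window = [
                img[i][j]
                for i in range(max(0, r - h), min(rows, r + h + 1))
                for j in range(max(0, c - h), min(cols, c + h + 1))
            ]
            row_out.append(median(window))
        out.append(row_out)
    return out


# Toy usage: a flat image with one bright speck (e.g., a tiny reflection).
g = [[100] * 10 for _ in range(10)]
g[3][3] = 255
G = decimate(g, s=0.6)        # 10 x 10 -> 6 x 6
M = median_filter(G, k=3)     # small window for the toy image
```

In this toy case the speck survives decimation but is removed by the median filter, while a flat region stays flat, which is the edge-preserving behavior the text attributes to the filter.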
Now, the author computes the global gray-level statistics of $g (x, y)$ to estimate candidate regions corresponding to the pupil and the iris. These statistical parameters include the lower saturated limit $α$ , the upper saturated limit $β$ , and the average gray-level intensity $μ$ . The lower and upper saturated limits offer robust approximations of the darkest and brightest regions in the input eye image, respectively, while the average intensity characterizes the overall gray-level distribution.

○: $α$ : It represents gray-level intensity below which the bottom 5% of pixels lie. It basically refers to dark image regions, such as the pupil, eyelashes, eyebrows, or hair. It is expressed mathematically as follows:

$α = Percentile (g (x, y), 5\%),$

(5)
○: $β$ : It represents the intensity above which the top 5% of pixels lie. This parameter generally corresponds to bright regions such as the sclera or specular reflections. It is expressed mathematically as follows:

$β = Percentile (g (x, y), 95\%),$

(6)
○: $μ$ : It represents the average gray-level intensity across the entire image, serving as a reference for intermediate gray regions, such as skin or iris regions. It is expressed mathematically as follows:

$μ = \frac{1}{M N} \sum_{x = 1}^{M} \sum_{y = 1}^{N} g (x, y),$

(7)

where $M$ and $N$ denote the dimensions of $g (x, y)$ .
These parameters greatly facilitate the segmentation process by offering reference thresholds for characterizing dark, intermediate, and bright regions, which is specifically useful for localizing pupil and iris in non-ideal imaging conditions.
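The three statistics of Equations (5)–(7) can be computed as in the following sketch. It assumes a nearest-rank percentile definition; library routines (for example, interpolating percentile functions) may return slightly different values. The function name `gray_level_stats` is illustrative.

```python
def gray_level_stats(img, low_pct=5, high_pct=95):
    """Return (alpha, beta, mu) for a 2-D list of gray levels."""
    pixels = sorted(v for row in img for v in row)
    n = len(pixels)

    def percentile(p):
        # Nearest-rank definition: smallest value with at least p% of
        # the pixels at or below it (integer ceiling of p * n / 100).
        rank = max(1, (p * n + 99) // 100)
        return pixels[rank - 1]

    alpha = percentile(low_pct)    # lower saturated limit (dark regions)
    beta = percentile(high_pct)    # upper saturated limit (bright regions)
    mu = sum(pixels) / n           # average gray-level intensity
    return alpha, beta, mu


# Toy usage: a 10 x 10 image holding the gray levels 0..99 once each.
img = [[r * 10 + c for c in range(10)] for r in range(10)]
alpha, beta, mu = gray_level_stats(img)
print(alpha, beta, mu)  # 4 94 49.5
```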

Step 2:: The author now applies the Canny edge detector to $M (x, y)$ to produce an edge map $E (x, y)$ , which highlights the main intensity transitions in the eye, specifically at the pupil-iris and iris-sclera boundaries. These edges are crucial because the next stage, CHT, relies on them to locate the circular iris boundaries accurately. This process can be represented as follows:

$E (x, y) = Canny (M (x, y)),$

(8)

The Canny method follows several steps:

○: Smooth $M (x, y)$ with a Gaussian filter to decrease noise.
○: Compute intensity gradients to detect areas of abrupt change.
○: Apply non-maximum suppression to thin the edges.
○: Apply double thresholding with hysteresis to keep the strong edges and the weak edges connected to them.

This method suppresses artifacts, e.g., eyelashes, reflections, or shadows, while keeping the main anatomical boundaries of the iris intact. The resulting edge map $E (x, y)$ is clean and provides a reliable foundation for accurate iris localization in the following step. Figure 11d shows $E (x, y)$ .
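The final thresholding stage of the Canny procedure can be sketched in isolation. The snippet below is an illustration, not the detector used in the experiments: it takes a precomputed gradient-magnitude grid and hypothetical `low`/`high` thresholds, keeps strong pixels, and keeps weak pixels only when they are 8-connected to a strong one.

```python
from collections import deque


def hysteresis(mag, low, high):
    """Double-threshold edge linking on a 2-D gradient-magnitude grid.

    Pixels >= high are strong edges; pixels in [low, high) are kept only
    if 8-connected (directly or through other weak pixels) to a strong one.
    """
    rows, cols = len(mag), len(mag[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if mag[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))
    while queue:                       # grow edges outward from strong pixels
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not edges[rr][cc] and mag[rr][cc] >= low):
                    edges[rr][cc] = True
                    queue.append((rr, cc))
    return edges


# Toy usage: one strong pixel, two linked weak pixels, one isolated weak pixel.
mag = [[0, 0, 0, 0, 0],
       [9, 5, 5, 0, 5],
       [0, 0, 0, 0, 0]]
E = hysteresis(mag, low=4, high=8)
print(E[1])  # [True, True, True, False, False]
```

The isolated weak response on the right is discarded, which is how small disconnected artifacts such as stray eyelash fragments tend to be dropped from the edge map.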

Step 3:: In this step, the author applies CHT to $E (x, y)$ to detect circular contours representing the inner and outer iris boundaries. First, the candidate circles corresponding to the outer iris boundary are marked. Next, for each candidate, a circular region is extracted from the grayscale image $g (x, y)$ and its average intensity $μ_{r}$ is computed. If $μ_{r} \leq μ$ , that is, if the enclosed region is no brighter than the image average, the circle is considered a valid iris boundary; otherwise, the next candidate is examined. The center coordinates $(a_{i}, b_{i})$ and radius $r_{i}$ of the accepted circle are recorded. This circular boundary can be expressed as follows:

$(x - a_{i})^{2} + (y - b_{i})^{2} = r_{i}^{2},$

(9)

CHT itself operates using an accumulator function that is defined as follows:

$H (a, b, r_{i}) = \sum_{x, y} δ ((x - a)^{2} + (y - b)^{2} - r_{i}^{2}),$

(10)

where $H (a, b, r_{i})$ counts the number of edge-pixels (x, y) supporting a circular contour with center at (a, b) and having radius $r_{i}$ , and symbol $δ (\cdot)$ indicates the Dirac delta function. Generally, peaks in $H (a, b, r)$ correspond to the target circular boundary.

Figure 11e shows the surf image of the CHT accumulator for a particular radius and center of the iris outer boundary’s candidate. The maximum peak in this image refers to the potential location of the iris outer boundary.
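A minimal sketch of the accumulator in Equation (10) for a single radius is given below. It uses a discretized voting scheme (each edge pixel votes for the cells on a circle of the given radius around it), which is a common way to approximate the delta-function sum; the function names and the synthetic test circle are illustrative assumptions.

```python
import math


def hough_circle_accumulator(edge_points, shape, radius, n_angles=360):
    """Vote for circle centres (a, b) of a fixed radius.

    edge_points: iterable of (x, y) edge-pixel coordinates.
    shape: (width, height) of the accumulator, same as the edge map.
    """
    width, height = shape
    acc = [[0] * width for _ in range(height)]
    for x, y in edge_points:
        seen = set()  # at most one vote per (edge pixel, centre cell)
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            a = round(x - radius * math.cos(theta))
            b = round(y - radius * math.sin(theta))
            if 0 <= a < width and 0 <= b < height and (a, b) not in seen:
                seen.add((a, b))
                acc[b][a] += 1
    return acc


def accumulator_peak(acc):
    """Return ((a, b), votes) for the highest accumulator cell."""
    votes, a, b = max(
        (acc[b][a], a, b)
        for b in range(len(acc))
        for a in range(len(acc[0]))
    )
    return (a, b), votes


# Toy usage: edge pixels sampled from a circle of radius 8 centred at (20, 15).
angles = [2 * math.pi * k / 72 for k in range(72)]
points = {(round(20 + 8 * math.cos(t)), round(15 + 8 * math.sin(t)))
          for t in angles}
acc = hough_circle_accumulator(points, shape=(40, 30), radius=8)
centre, votes = accumulator_peak(acc)
print(centre, votes)
```

In a full search the accumulator would be built for each radius in the allowed range and the strongest peaks kept as candidates; here a single radius is used to mirror the surf plot described above.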

After localizing the outer iris boundary, a region of interest (ROI) is defined (Figure 11f), masking all edges outside this area; edges corresponding to the outer iris contour are also removed from it. After that, CHT is applied again within this ROI to mark the inner pupil boundary. Marked circles are restricted such that $r_{i}$ is greater than the pupil radius $r_{p}$ ; typically, $r_{p}$ ranges from one-third to two-thirds of $r_{i}$ , which guarantees iris anatomical consistency. Figure 11g shows a surf image of the CHT accumulator for a particular radius and center of the iris inner boundary’s candidate. The maximum peak in this image refers to the potential location of the iris inner boundary.
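The two acceptance rules used in this step, the intensity test $μ_{r} \leq μ$ for outer-boundary candidates and the radius constraint on the pupil, can be sketched as follows. The function names are illustrative, integer pixel coordinates are assumed, and the requirement that the pupil center lie inside the iris circle is an added assumption that stands in for the ROI masking.

```python
def disc_mean(img, a, b, r):
    """Mean gray level inside the circle centred at (a, b) with radius r."""
    total = count = 0
    for y in range(max(0, b - r), min(len(img), b + r + 1)):
        for x in range(max(0, a - r), min(len(img[0]), a + r + 1)):
            if (x - a) ** 2 + (y - b) ** 2 <= r * r:
                total += img[y][x]
                count += 1
    return total / count if count else float("inf")


def is_valid_iris(img, circle, mu):
    """Accept an outer-boundary candidate if its disc is no brighter than mu."""
    a, b, r = circle
    return disc_mean(img, a, b, r) <= mu


def is_valid_pupil(iris, pupil):
    """Check r_i/3 <= r_p <= 2*r_i/3 and that the pupil centre is inside the iris."""
    (ai, bi, ri), (ap, bp, rp) = iris, pupil
    ratio_ok = ri / 3 <= rp <= 2 * ri / 3
    centre_ok = (ap - ai) ** 2 + (bp - bi) ** 2 < ri ** 2  # assumption, see lead-in
    return ratio_ok and centre_ok


# Toy usage: bright background (200) with a dark disc (50) of radius 6.
img = [[200] * 21 for _ in range(21)]
for y in range(21):
    for x in range(21):
        if (x - 10) ** 2 + (y - 10) ** 2 <= 36:
            img[y][x] = 50
mu = sum(map(sum, img)) / (21 * 21)
print(is_valid_iris(img, (10, 10, 6), mu))       # candidate on the dark disc
print(is_valid_iris(img, (3, 3, 2), mu))         # candidate on the background
print(is_valid_pupil((10, 10, 6), (10, 10, 3)))  # radius ratio 0.5
```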

Experiments demonstrate that this method allows robust and precise localization of both iris and pupil boundaries, even in non-ideal images with reflections, eyelashes, and other artifacts. Figure 11h,i shows the iris outer and inner contours marked in $I (x, y)$ , respectively.

Step 4:: Noise elements, including eyelids, eyelashes, and specular reflections, can be detected and marked at the recognition stage using masking techniques and morphological filtering [27].

3. Results and Discussion

This section explicitly details experimental results and a discussion of the proposed algorithm.

3.1. Experimental Setup

To test and validate the proposed iris segmentation algorithm, the author utilized the following experimental setup:

○: Operating system: Windows 10 Education.
○: Simulation tool: MATLAB Version (R2020).
○: Processor type: Intel(R) Core (TM) i7-8650U CPU 1.90 GHz 2.11 GHz.
○: Installed RAM: 16.00 GB.
○: Dataset: the main dataset has two folders, each representing the labeled data.
○: OpenEyes: it has 300 fully/partially open eye images.
○: FullyAlmostClosedEyes: it has 300 fully/almost closed eye images.
○: P-DCNN models: GoogleNet, SqueezeNet, VGG16, VGG19, MobileNet-v2, ResNet18, ResNet50, ResNet101, Inception-v3, InceptionResNet-v2, AlexNet, DenseNet201, Xception, ShuffleNet, NASNet-Large, NASNet-Mobile, DarkNet19, DarkNet53, and EfficientNet-b0; in experimentation, these P-DCNN models are numbered from 1 to 19, respectively.
○: Public Datasets: MMU [52], IITD [53], CASIA [54], UBIRIS [22], SGGSI&T [55], and CEW [56].

3.2. Preparing Target Dataset: Fully and Nearly Closed/Open Eyes

For training, testing, and validation of the P-DCNN models via a transfer learning approach, the author compiled a new dataset by manually collecting images from numerous public eye and face datasets. The images feature different states of eye openness, e.g., fully and nearly closed eyes, and almost and fully open eyes. The dataset contains both NIR and VS images. These images also exhibit several non-ideal factors such as eyelashes, contact lenses, blurriness, uneven illumination, light reflections, and irises captured from off-axis and off-angle perspectives. Collectively, these noise factors are sufficient to simulate unconstrained iris biometric systems [57].

Table 3 gives a brief overview of the public datasets [48]. Figure 12a–f shows some sample images taken from the MMU [52], IITD [53], CASIA [54], UBIRIS [22], SGGSI&T [55], and CEW [56] datasets, respectively. For brevity, details of these datasets are not provided here; readers are referred to [58] for comprehensive details. For training and validation of the proposed DCNN model, the compiled dataset is divided into two classes, as explained in the following text.

3.2.1. OpenEyes: Open Irises

As stated earlier, the diameter of the iris should be at least 90 pixels to achieve reliable human identification via iris patterns [20]. Consequently, when an iris is heavily occluded by noisy regions such as eyelids, eyeglass frames, or other obstructions, the performance of the iris biometric system degrades significantly [20,21]. To ensure data quality, eye images containing at least 60% noise-free iris regions are subjectively included in this class.

Initially, a pool of 800 eye images was manually collected from the MMU, IITD, CASIA, UBIRIS, SGGSI&T, and CEW datasets. The author then screened this collection to exclude nearly closed eyes and eyes whose irises exhibit less than 60% valid iris texture, which left 600 images. In addition, as both the CASIA-IrisV4-Distance and CEW datasets primarily contain facial-level images, a single eye image was manually cropped from each facial image and added to this class. Sample examples from this collection are shown in Figure 13a, which illustrates the diversity of image dimensions and modalities, encompassing near-infrared (NIR), grayscale, and visible-spectrum (VS) formats.

3.2.2. FullyAlmostClosedEyes

The CEW dataset has 1192 facial images containing closed eyes [56]. Although it provides a dedicated folder of closed-eye images extracted algorithmically from the original facial images, these eye images have low resolution. For this reason, the author first selected 600 facial images and manually cropped one closed-eye image (either left or right) from each. In addition, the author searched various other eye and face datasets for images containing fully or almost closed eyes; 208 such images were added to this class, resulting in a total of 808 images. Figure 13b shows some sample images of this class.

For experimentation, the author recommends interested readers utilize the CEW dataset [56], which contains a thorough and varied collection of 4846 eye images specifically selected for effective eye-status classification (open vs. closed) in uncontrolled settings. By including a balanced range of samples across different lighting conditions, head positions, and various ethnic backgrounds, the dataset accurately represents real-world ‘wild’ situations. This enables a broader validation of the proposed model’s capability to filter out occluded eye samples before proceeding to the segmentation process.

3.3. DCNN Training and Validation Accuracy

This section elaborates on the experimental results of training the P-DCNN models on the labeled data: OpenEyes and FullyAlmostClosedEyes. With the hyperparameter settings given in Table 2, the total number of models to train is 13,851. Initially, these trials ran serially on the CPU mentioned above rather than on a powerful GPU, so execution was slow: it took a couple of days to complete all 13,851 trials.

However, there is room to skip redundant settings and target only those parameters that give more realistic results. Hence, the author narrowed the hyperparameter ranges and excluded some of them from the later experimentation. Table 4 shows the revised version of Table 2. Now, the total number of trials is 76, i.e., $(19 \times 2 \times 1 \times 2 \times 1 \times 1)$ . Here, 19 is the total number of P-DCNN models under experimentation. Although the results for all 76 trials could be included, only the most prominent results of these models are highlighted here.

Table 5 shows the best performance of the 19 models, along with the hyperparameter values. SqueezeNet, MobileNetV2, InceptionResNet-V2, and DarkNet19 achieved excellent training and validation accuracy. NASNet-Large, in contrast, did not perform well and raised errors due to memory shortage, mainly because it has far more trainable parameters than the other models. P-DCNN models showed higher accuracy results during training as compared to results obtained during the validation stage.

This is expected, because training accuracy is measured on data the models have already seen, whereas validation accuracy is measured on unseen data. In addition, the parameter $t_{t v}$ shows the total time taken during the training and validation stages for each model. By this measure, AlexNet and Xception are the least and most time-consuming models, respectively.

To ensure a fair comparison across the 19 investigated P-DCNN models, several experimental controls were implemented during the retraining phase. First, a standardized set of hyperparameters was applied to all architectures. Each model was trained using the Stochastic Gradient Descent with Momentum (SGDM) optimizer for exactly eight epochs ( $N_{EPCS}$ ), ensuring equal exposure to the training dataset. Additionally, the Weight Learn Rate Factor ( $α_{WLRF}$ ) and Bias Learn Rate Factor ( $β_{BLRF}$ ) were both set to 10 for the newly added classification and convolutional layers across all models. This uniform scaling ensured that the adaptation of pretrained weights to the specific iris dataset was conducted under consistent conditions.

To mitigate the risk of models converging at local optima, the author systematically varied several critical hyperparameters across trials. Specifically, the initial learning rate ( $γ_{ILR}$ ) was toggled between 0.001 and 0.0001, while the Mini-Batch Size ( $N_{MBS}$ ) was varied between 35 and 40. Rather than presenting only the best-case scenarios, both Table 5 and Table 6 are included to contrast the highest and lowest performance metrics for each architecture. This reporting of the performance range helps verify the stability of the investigated models. It further indicates that the high accuracies achieved by candidates like SqueezeNet and MobileNetV2 are consistent findings rather than statistical outliers resulting from a single advantageous initialization.
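The experimental grid described in the two paragraphs above can be enumerated with a short sketch. It assumes that the learning rate and the mini-batch size each take exactly the two quoted values and that the remaining settings are fixed; the dictionary keys are illustrative names, not MATLAB option names.

```python
from itertools import product

MODELS = [
    "GoogleNet", "SqueezeNet", "VGG16", "VGG19", "MobileNet-v2", "ResNet18",
    "ResNet50", "ResNet101", "Inception-v3", "InceptionResNet-v2", "AlexNet",
    "DenseNet201", "Xception", "ShuffleNet", "NASNet-Large", "NASNet-Mobile",
    "DarkNet19", "DarkNet53", "EfficientNet-b0",
]

# Assumed grid: two values for the varied settings, one for the fixed ones.
GRID = {
    "initial_learn_rate": [1e-3, 1e-4],
    "mini_batch_size": [35, 40],
    "epochs": [8],
    "weight_learn_rate_factor": [10],
    "bias_learn_rate_factor": [10],
}


def enumerate_trials(models, grid):
    """Return one configuration dict per (model, hyperparameter combination)."""
    keys = list(grid)
    return [
        {"model": m, **dict(zip(keys, values))}
        for m in models
        for values in product(*(grid[k] for k in keys))
    ]


trials = enumerate_trials(MODELS, GRID)
print(len(trials))  # 19 models x 2 learning rates x 2 batch sizes = 76
```

Under these assumptions each architecture is trained four times, so the best and worst runs reported per model are drawn from four configurations.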

Similarly, Table 6 shows the lowest performance of the 19 P-DCNN models, along with the hyperparameter values. VGG16 clearly performs poorly compared with the other models, possibly because of its simple architecture. Furthermore, architectural constraints were documented to explain performance outliers. For instance, the failure of NASNet-Large was attributed to memory shortages caused by its high number of trainable parameters, whereas the lower performance of VGG16 has been linked to its simpler architectural design rather than a failure to converge. These records help ensure that the comparative analysis reflects the inherent capabilities of the DCNN models in the hardware-constrained environment being studied. In this case, SqueezeNet and Xception are the least and most time-consuming models, respectively.

To summarize, the results are quite good, mainly because the dataset consists of two clearly distinct classes of eye images: the eye is either fully/almost open or fully/nearly closed. The overlap between these two classes is therefore low. Owing to this negligible interclass overlap, most of the BM (based on P-DCNN) models have shown very high performance. Finally, the author recommends only four models (Figure 14): (i) SqueezeNet, (ii) MobileNetV2, (iii) InceptionResNet-V2, and (iv) DarkNet19 for real-time applications in non-ideal iris biometric systems.

3.4. Comparative Analysis: Proposed DCNN Model

For the comparative analysis, the author prepared a test dataset of 500 eye images. Half of the images show fully or almost open eyes, while the rest show fully closed, almost closed, or, in a few cases, partially closed eyes. These unseen images were acquired from the public datasets.

As shown in Figure 14, the author finally recommends four models out of the nineteen retrained P-DCNN models. The selection of the best P-DCNN architecture is based on a comparative analysis of accuracy, training time, and model complexity. While SqueezeNet, MobileNetV2, InceptionResNet-V2, and DarkNet19 all maintained validation accuracies exceeding 99.4%, SqueezeNet emerged as the strongest candidate for deployment. It achieved a good balance of performance and efficiency, requiring only 4.55 min for the training and validation process, substantially less than InceptionResNet-V2, which took 78.53 min. Also, SqueezeNet’s lean architecture, characterized by only 1.24 million parameters and a 5.2 MB memory footprint, supports high-speed processing and suits hardware-constrained environments without compromising recognition accuracy.

On the dataset of 500 images, the SqueezeNet model achieved 99.6% accuracy for open eyes and 99% accuracy for the other class. The accuracy for the second class is lower because the author intentionally included some partially closed eyes, so the intersection between the two classes is no longer empty. Figure 15a–d shows typical examples of the open eye, fully closed eye, almost closed eye, and partially open eye, respectively.
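Per-class accuracy of the kind reported above can be computed as in this small sketch. The labels in the usage lines are toy values chosen for illustration, not the experimental data, and the function name is hypothetical.

```python
def per_class_accuracy(y_true, y_pred):
    """Return {label: percentage of that label's samples predicted correctly}."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (t == p)
    return {label: 100.0 * hits[label] / totals[label] for label in totals}


# Toy usage: one closed-class sample (e.g., a partially closed eye)
# is predicted as open.
y_true = ["open"] * 4 + ["closed"] * 4
y_pred = ["open"] * 4 + ["closed"] * 3 + ["open"]
acc = per_class_accuracy(y_true, y_pred)
print(acc)  # {'open': 100.0, 'closed': 75.0}
```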

The eye in the image shown in Figure 15d is partially open, which is not useful for recognizing the person concerned, and hence it should be rejected. Unfortunately, this type of eye image is predicted as an open eye image. This is not a fault of the model, because all BM models were trained on only two distinct classes in the target dataset, i.e., fully/almost open eyes and fully/nearly closed eyes. This issue can be tackled in future studies. In addition, the SqueezeNet model took around 150 ms per image, on average, to return its prediction.

As mentioned earlier, researchers have used different schemes to distinguish between open and closed eyes. For example, some authors used edge operators (e.g., CHT/IDO), image binarization, or texture analysis [30]. Because the proposed scheme is purely based on a deep learning approach, such schemes are not suitable baselines for the comparative analysis. Due to the unavailability of the relevant code for contemporary DCNN-based schemes that detect open and fully/almost closed eyes, and the lack of access to a powerful GPU platform, the author prototyped the method in [30] in the same experimental setup. As stated earlier, the authors of [30] also used transfer learning to retrain a P-DCNN (i.e., ResNet50, with a slight modification in its structure) to classify closed/open eyes in VS images. Table 7 shows a comparison of the proposed DCNN model (SqueezeNet) with the DCNN model of ref. [30].

The proposed scheme clearly outperformed the scheme of ref. [30] in terms of prediction accuracy. ResNet50 is more complex than SqueezeNet: it has a depth of 50 and 25.6 million trainable parameters, whereas SqueezeNet has a depth of 18 and only 1.24 million trainable parameters. The authors of ref. [30] slightly modified the structure of ResNet50 to speed up training, validation, and prediction. However, it may perform poorly for almost closed eyes, on which the authors might not have trained this model. Note that accuracy has been computed subjectively because it is a binary case.

3.5. Comparative Analysis: Proposed Segmentation Algorithm

As mentioned in Section 2.2, a well-trained DCNN model needs to be incorporated into the non-ideal iris biometric systems to cope with the issue of fully or nearly closed eyes (or irises) in input images. To demonstrate this act, the author embedded SqueezeNet (as the proposed DCNN model) between the image acquisition and iris segmentation module. For experimentation, the author utilized the same dataset as gathered in Section 3.4.

Table 8 shows iris segmentation results with and without the proposed DCNN model, i.e., SqueezeNet. Without this model, the proposed iris segmentation scheme obtained 48% accuracy and took around 50 s, on average. The main reason for this inferior performance is that the scheme cannot deal efficiently with fully/nearly closed eyes without the DCNN model. In addition, it uses CHT for iris contour demarcation, which fails when proper iris contours are not present in the input image or the iris is severely occluded by eyelids; in such cases, it may mark the wrong iris location, as shown in Figure 16h,j.

Figure 16a–d shows typical examples of the open eye, fully closed eye, almost closed eye, and partially open eye, respectively. Figure 16e–h illustrates iris segmentation results of the segmentation module without the DCNN model for the images given in Figure 16a–d. Figure 16i,j shows iris segmentation results using the DCNN model before the iris segmentation module. It is evident that the DCNN model successfully rejected the fully/nearly closed eyes (Figure 16b,c) and passed on both fully and partially open eyes to the iris segmentation module.

As demonstrated by the comparative analysis in Table 9, the proposed iris segmentation algorithm significantly outperforms contemporary schemes, largely due to the integration of the proposed DCNN model, SqueezeNet. To maintain experimental integrity, the author re-implemented these baseline schemes within a unified framework, utilizing the exact image datasets referenced in Section 3.4. Without using the proposed DCNN model in front of contemporary schemes, these methods struggled to accurately demarcate iris boundaries in challenging, unconstrained scenarios, most notably when dealing with nearly or fully closed eyes.

A marked shift in performance was observed when the author integrated the proposed DCNN as a front-end processing component for these legacy methods. While this integration improved results across all benchmarks, the proposed segmentation algorithm remained superior, achieving a peak accuracy of 99.5%. This advantage is mainly due to its multifaceted strategy, which blends CHT with anatomical iris priors and image gray-level statistics. By exploiting these features, the proposed iris segmentation algorithm mitigates the ocular artifacts common in non-ideal imaging environments, which supports its robustness.

To evaluate the precision of the proposed iris segmentation algorithm, the author used the Manual Visual Inspection metric [58]. It is a qualitative metric that is widely accepted in biometric research for validating boundary detection in unconstrained ocular images [34]. Under this protocol, segmentation accuracy is determined by a human observer who manually inspects the localized contours; a result is classified as accurate only if the detected outer (limbic) and inner (pupillary) boundaries align precisely with the anatomical iris texture without significant deviation or inclusion of the sclera. This method serves as a critical verification of the algorithm’s robustness, ensuring that the predicted contours are biometrically plausible, and is used to calculate the Success Rate (SR) according to the equation:

$S R = \frac{N_{Corrected}}{N_{Total}} \times 100,$

(11)

where $N_{Corrected}$ represents the number of correctly localized images and $N_{Total}$ denotes the total number of images in the dataset.
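Equation (11) amounts to a simple ratio over the observer's accept/reject decisions. The sketch below assumes those decisions are recorded as one boolean per image; the function name and the toy values are illustrative.

```python
def success_rate(inspection_results):
    """Eq. (11): percentage of images judged correctly localized.

    inspection_results: iterable of booleans, one per image, recording the
    human observer's accept (True) or reject (False) decision.
    """
    results = list(inspection_results)
    if not results:
        raise ValueError("no images were inspected")
    return 100.0 * sum(results) / len(results)


# Toy usage: three of four localizations accepted by the observer.
print(success_rate([True, True, True, False]))  # 75.0
```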

As shown in Figure 10, the proposed iris segmentation scheme uses the devised DCNN model (SqueezeNet) to detect and reject fully closed or nearly closed (almost closed) eyes, because these images are useless for a biometric machine. However, some images contain partially open eyes, which lie in the intersection between the two classes: (i) fully/nearly closed eyes and (ii) fully/almost open eyes. The proposed DCNN model has been tuned to detect fully/nearly closed eyes, but in some cases an eye status may fall in this intersection, and such images can be misclassified. To mitigate this issue, the probability score returned by the DCNN model can be used to decide whether such images are allowed to enter the subsequent modules.

Figure 17a,b shows some typical instances of the accurate and false iris localization results of the proposed scheme, respectively. In Figure 17b, the contrast between the pupil and iris regions is relatively low. In addition, the pupil region is also polluted by the subject’s hair. In the second image, the pupil is elliptical, due to which CHT did not mark it precisely.
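One way to use the probability score for borderline samples is a simple acceptance threshold, sketched below. The threshold value of 0.9 and the function name are hypothetical choices for illustration; the source does not specify how the score should be thresholded.

```python
def gate_by_probability(p_open, accept_at=0.9):
    """Map the classifier's open-eye probability to a pipeline decision.

    accept_at is a hypothetical threshold; samples scoring below it are
    rejected so that a new image can be acquired.
    """
    if not 0.0 <= p_open <= 1.0:
        raise ValueError("p_open must be a probability in [0, 1]")
    return "accept" if p_open >= accept_at else "reject"


# Toy usage: a confident open eye versus a borderline (partially open) eye.
print(gate_by_probability(0.98))  # accept
print(gate_by_probability(0.62))  # reject
```

Raising the threshold trades more re-acquisitions for fewer partially open eyes reaching the segmentation module.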

4. Conclusions

Research in iris biometrics often struggles with a major unaddressed problem: images containing fully or nearly closed eyes. When these images enter a biometric system, performance drops significantly. This leads to the rejection of genuine identities or failure to identify individuals in critical public zones. In real-world or covert settings, eye images are often of poor quality: the eyes are blurred, affected by reflections, or hidden by eyelashes, eyelids, and eyeglass frames. This study provides a robust solution to these challenges. The author developed a deep learning-based convolutional neural network (CNN) model that acts as a gatekeeper, reliably detecting closed-eye images and blocking them from entering the processing pipeline. To mark iris contours, the author uses an iterative scheme based on the Circular Hough Transform (CHT), anatomical constraints, and image gray-level intensity statistics. The author validated this method using a new dataset of challenging, non-ideal images collected from a diverse set of publicly available datasets. The experimental results show a clear improvement in both the accuracy and the speed of the system. The proposed algorithm is therefore well suited to real-time applications and provides a reliable tool for the biometric community. By ensuring that only high-quality data is processed, this algorithm strengthens the security and efficiency of iris recognition technology.

While the proposed system demonstrates high accuracy, a primary limitation remains the classification of images at the intersection of nearly closed and partially open eyes. In such cases, the iris often overlaps with the eyelid borders, which can lead to system errors. Addressing these critical edge cases through refined segmentation techniques is reserved for future work.

Funding

The research received no external funding.

Data Availability Statement

Data is available within the text. The datasets used are downloadable from the links given in the references.

Acknowledgments

The author is thankful to the Chinese Academy of Sciences, Malaysia Multimedia University, Indian Institute of Technology Delhi (IITD), University of Beira Interior (UBI), and the authors of the Closed Eyes In The Wild (CEW) dataset, whose datasets have been utilized in this research work. In addition, the author is thankful to MathWorks for keeping the MATLAB platform updated and ready to use for scientific work in any realm. Finally, the author is thankful to those researchers whose work has been used or cited in this work.

Conflicts of Interest

The corresponding author declares that there are no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

P-DCNN	Pretrained Deep Convolutional Neural Network.
NIR	Near infrared.
VS	Visible spectrum.
CHT/EHT	Circular/Elliptical Hough transform.
ACM	Active contours models.
IDO	Integro-differential operator.
ML	Machine learning.
DCNN	Deep convolutional neural network.
CNN	Convolutional neural network.
MTCNN	Multi-task cascaded CNN.
ECODN	Eye closed/open detection network.
MMU	Malaysia Multimedia University.
IITD	Indian Institute of Technology Delhi.
CASIA	Chinese Academy of Sciences’ Institute of Automation.
UBI	University of Beira Interior.
BM	Base model.

References

Rajaraman, S.; Antani, S.; Thoma, G.R. Deep Learning for Iris Recognition: A Survey and Analysis. Image Vis. Comput. 2024, 130, 104680. [Google Scholar]
Chen, Y.; Wang, W.; Zeng, Z.; Wang, Y. An Adaptive CNNs Technology for Robust Iris Segmentation. IEEE Access 2019, 7, 64517–64532. [Google Scholar] [CrossRef]
Gautam, G.; Mukhopadhyay, S. Challenges, taxonomy and techniques of iris localization: A survey. Digit. Signal Process. 2020, 107, 102852. [Google Scholar] [CrossRef]
Wang, C.; Muhammad, J.; Wang, Y.; He, Z.; Sun, Z. Towards Complete and Accurate Iris Segmentation Using Deep Multi-Task Attention Network for Non-Cooperative Iris Recognition. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2944–2959. [Google Scholar] [CrossRef]
Khaki, A.; Aghagolzadeh, A.; Cami, B.R. ISUR: Iris Segmentation based on UNet and ResNet. In Proceedings of the 2021 11th International Conference on Computer Engineering and Knowledge (ICCKE), Mashhad, Iran, 28–29 October 2021; pp. 1–7. [Google Scholar]
Shanto, S.H.; Ali, M.N.; Ahsan, S.M.M. An Advanced CNN Based Iris Recognition and Segmentation for Visible Spectrum Images. In Proceedings of the 2022 International Conference on Advancement in Electrical and Electronic Engineering (ICAEEE), Gazipur, Bangladesh, 24–26 February 2022; pp. 1–5. [Google Scholar]
Agarwal, A.; Noore, A.; Vatsa, M.; Singh, R. Generalized Contact Lens Iris Presentation Attack Detection. IEEE Trans. Biom. Behav. Identity Sci. 2022, 4, 373–385. [Google Scholar] [CrossRef]
Le-Tien, T.; Phan-Xuan, H.; Nguyen-Duy, P.; Le-Ba, L. Iris-based Biometric Recognition using Modified Convolutional Neural Network. In Proceedings of the 2018 International Conference on Advanced Technologies for Communications (ATC), Ho Chi Minh City, Vietnam, 18–20 October 2018; pp. 184–188. [Google Scholar]
Rafik, H.D.; Boubaker, M. A Multi Biometric System Based on the Right Iris and the Left Iris Using the Combination of Convolutional Neural Networks. In Proceedings of the 2020 Fourth International Conference on Intelligent Computing in Data Sciences (ICDS), Fez, Morocco, 21–23 October 2020; pp. 1–10. [Google Scholar]
Saraf, T.O.Q.; Fuad, N.; Taujuddin, N.S.A.M. Feature Encoding and Selection for Iris Recognition Based on Variable Length Black Hole Optimization. Computers 2022, 11, 140. [Google Scholar] [CrossRef]
Galterio, M.G.; Shavit, S.A.; Hayajneh, T. A Review of Facial Biometrics Security for Smart Devices. Computers 2018, 7, 37. [Google Scholar] [CrossRef]
Mayer, P.; Zou, Y.; Lowens, B.M.; Dyer, H.A.; Le, K.; Schaub, F.; Aviv, A.J. Awareness, Intention, (In)Action: Individuals’ Reactions to Data Breaches. ACM Trans. Comput. Hum. Interact. 2023, 30, 77. [Google Scholar] [CrossRef]
Biometric_SmartCity. Available online: https://www.smartcity.press/cybersecurity-with-biometric-technology (accessed on 29 May 2023).
Fu, K.; Zhao, Q.; Gu, I.; Yang, J. Deepside: A General Deep Framework for Salient Object Detection. Neurocomputing 2019, 356, 69–82.
Fu, K.; Fan, D.-P.; Ji, G.-P.; Zhao, Q. JL-DCF: Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 3049–3059.
Fan, D.-P.; Liu, J.-J.; Gao, S.; Hou, Q.; Borji, A.; Cheng, M.-M. Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
Wang, T.; Piao, Y.; Lu, H.; Li, X.; Zhang, L. Deep Learning for Light Field Saliency Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8837–8847.
Zhang, L.; Zhang, J.; Lin, Z.; Lu, H.; He, Y. CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 6017–6026.
Zhang, J.; Fan, D.-P.; Dai, Y.; Anwar, S.; Saleh, F.; Zhang, T.; Barnes, N. UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders; IEEE: New York, NY, USA, 2020; pp. 8579–8588.
Daugman, J.G. High confidence visual recognition of persons by a test of statistical independence. IEEE Trans. Pattern Anal. Mach. Intell. 1993, 15, 1148–1161.
Wildes, R.P. Iris Recognition: An Emerging Biometric Technology. Proc. IEEE 1997, 85, 1348–1363.
UBIRIS_Database. Available online: https://iris.di.ubi.pt/ (accessed on 29 May 2023).
Nsaif, A.K.; Ali, S.H.M.; Nseaf, A.K.; Jassim, K.N.; Al-Qaraghuli, A.; Sulaiman, R. Robust and Swift Iris Recognition at distance based on novel pupil segmentation. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 9184–9206.
Donida Labati, R.; Muñoz, E.; Piuri, V.; Ross, A.; Scotti, F. Non-ideal iris segmentation using Polar Spline RANSAC and illumination compensation. Comput. Vis. Image Underst. 2019, 188, 102787.
Nguyen, K.; Fookes, C.; Jillela, R.; Sridharan, S.; Ross, A. Long range iris recognition: A survey. Pattern Recognit. 2017, 72, 123–143.
Jan, F. Development and Analysis of Robust Iris Segmentation Algorithms for Non Ideal Iris Recognition System. Ph.D. Thesis, COMSATS University Islamabad, Islamabad, Pakistan, 2014.
Jan, F.; Ahmed, M.I.B.; Min-Allah, N. Databases for Iris Biometric Systems: A Survey. SN Comput. Sci. 2020, 1, 324.
Kumari, P.; Seeja, K.R. Periocular biometrics: A survey. J. King Saud Univ. Comput. Inf. Sci. 2019, 34, 1086–1097.
Daugman, J. How Iris Recognition Works. IEEE Trans. Circuits Syst. Video Technol. 2004, 14, 21–30.
Kim, K.W.; Hong, H.G.; Nam, G.P.; Park, K.R. A Study of Deep CNN-Based Classification of Open and Closed Eyes Using a Visible Light Camera Sensor. Sensors 2017, 17, 1534.
Al-Shakarchy, N.D.; Israa Hadi, A. Open and Closed Eyes Classification in Different Lighting Conditions Using New Convolution Neural Networks Architecture. J. Theor. Appl. Inf. Technol. 2019, 97, 1970–1979.
Jian, L.; Li, Z.; Yang, X.; Wu, W.; Ahmad, A.; Jeon, G. Combining Unmanned Aerial Vehicles With Artificial-Intelligence Technology for Traffic-Congestion Recognition: Electronic Eyes in the Skies to Spot Clogged Roads. IEEE Consum. Electron. Mag. 2019, 8, 81–86.
Yiu, Y.-H.; Aboulatta, M.; Raiser, T.; Ophey, L.; Flanagin, V.L.; zu Eulenburg, P.; Ahmadi, S.-A. DeepVOG: Open-source pupil segmentation and gaze estimation in neuroscience using deep learning. J. Neurosci. Methods 2019, 324, 108301–108307.
Le, A.D.; Nguyen, H.T.; Nakagawa, M. An End-to-End Recognition System for Unconstrained Vietnamese Handwriting. SN Comput. Sci. 2019, 1, 7.
Hajjami, A.; Khalid, A.; Arsalane, Z. Iris Localisation and segmentation using Convolutional neural network. In Proceedings of the 2019 Third International Conference on Intelligent Computing in Data Sciences (ICDS), Marrakech, Morocco, 28–30 October 2019; pp. 1–6.
Ribeiro, E.; Uhl, A.; Alonso-Fernandez, F. Iris super-resolution using CNNs: Is photo-realism important to iris recognition? IET Biom. 2019, 8, 69–78.
Al-Waisy, A.S.; Qahwaji, R.; Ipson, S.; Al-Fahdawi, S.; Nagem, T.A.M. A multi-biometric iris recognition system based on a deep learning approach. Pattern Anal. Appl. 2018, 21, 783–802.
Hollingsworth, K.; Bowyer, K.W.; Flynn, P.J. The Best Bits in an Iris Code. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 964–973.
Bowyer, K.W.; Hollingsworth, K.; Flynn, P.J. Image Understanding for Iris Biometrics: A Survey. Comput. Vis. Image Underst. 2008, 110, 281–307.
ISO/IEC 19794-6; Information Technology—Biometric Data Interchange Formats—Part 6: Iris Image Data. International Organization for Standardization: Geneva, Switzerland, 2011.
Proença, H.; Alexandre, L.A. Iris Recognition: Analysis of the Error Rates Regarding the Accuracy of the Segmentation Stage. Image Vis. Comput. 2010, 28, 202–213.
Proença, H.; Alexandre, L.A. UBIRIS: A Noisy Iris Image Database. In Proceedings of the International Conference on Image Analysis and Processing (ICIAP), Cagliari, Italy, 6–8 September 2005.
Li, Y.-H.; Huang, P.-J.; Juan, Y. An Efficient and Robust Iris Segmentation Algorithm Using Deep Learning. Mob. Inf. Syst. 2019, 2019, 4568929.
ImageNet. Available online: https://www.image-net.org/ (accessed on 29 May 2023).
Fan, Y.; Li, H.; Bao, Y.; Xu, Y. Cycle-consistency-constrained few-shot learning framework for universal multi-type structural damage segmentation. Struct. Health Monit. 2026, 25, 874–893.
Xu, Y.; Zhang, C.; Li, H. Transformer-based large vision model for universal structural damage segmentation. Autom. Constr. 2025, 176, 106256.
Tong, Y.; Zeng, Y.; Lu, Y.; Huang, Y.; Jin, Z.; Wang, Z.; Wang, Y.; Zang, X.; Chang, L.; Mu, W.; et al. Deep learning-enhanced microwell array biochip for rapid and precise quantification of Cryptococcus subtypes. View 2024, 5, 20240032.
Tian, X.; Jiang, Y.; Zhu, S.; Liu, X.; Anwaier, A.; Ye, S.; Chang, K.; Qu, Y.; Gu, Y.J.; Zhang, H.; et al. A multimodal deep learning framework for predicting sunitinib response in advanced clear cell renal cell carcinoma. View 2025, 7, 20250157.
Mathworks. Available online: http://www.mathworks.com/ (accessed on 29 May 2023).
Min-Allah, N.; Qureshi, M.B.; Jan, F.; Alrashed, S.; Taheri, J. Deployment of real-time systems in the cloud environment. J. Supercomput. 2020, 77, 2069–2090.
Jan, F.; Min-Allah, N. An effective iris segmentation scheme for noisy images. Biocybern. Biomed. Eng. 2020, 40, 1064–1080.
MMU_Iris_Database. Available online: https://www.kaggle.com/datasets/naureenmohammad/mmu-iris-dataset (accessed on 29 May 2023).
IITD_Iris_Databases. Available online: https://www4.comp.polyu.edu.hk/~csajaykr/IITD/Database_Iris.htm (accessed on 29 May 2023).
CASIA_Database. Available online: https://www.kaggle.com/swoyam2609/datasets (accessed on 14 April 2026).
SGGSIE&T_iris_database. Available online: https://sggs.ac.in/home/page/electronics-and-telecommunication-engineeringl (accessed on 14 April 2026).
CEW_dataset. Closed Eyes in the Wild (CEW). Available online: http://parnec.nuaa.edu.cn/_upload/tpl/02/db/731/template731/pages/xtan/ClosedEyeDatabases.html (accessed on 29 May 2023).
Jan, F.; Usman, I.; Agha, S. Iris localization in frontal eye images for less constrained iris recognition systems. Digit. Signal Process. 2012, 22, 971–986.
Jan, F.; Alrashed, S.; Min-Allah, N. Iris segmentation for non-ideal Iris biometric systems. Multimed. Tools Appl. 2021, 83, 15223–15251.
Ibrahim, M.T.; Khan, T.M.; Khan, S.A.; Khan, M.A.; Guan, L. Iris localization using local histogram and other image statistics. Opt. Lasers Eng. 2012, 50, 645–654.
Khan, T.M.; Aurangzeb Khan, M.; Malik, S.A.; Khan, S.A.; Bashir, T.; Dar, A.H. Automatic localization of pupil using eccentricity and iris using gradient based method. Opt. Lasers Eng. 2011, 49, 177–187.
Daugman, J. New Methods in Iris Recognition. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2007, 37, 1167–1175.

Figure 1. A human eye image [22].

Figure 2. Typical non-ideal iris biometric setups [25].

Figure 3. Typical NIR and VS eye images.

Figure 4. (a,b) ∅(x, y) and C(x, y), respectively.

Figure 5. Sample eye images produced with augmentation techniques.

Figure 6. Block diagram of transfer learning using BM, i.e., P-DCNN.

Figure 7. (a) Input eye image to the SqueezeNet (P-DCNN) model. (b,c) Sample intermediate feature images taken from the layers “fire3-squeeze1x1” and “fire5-relu_squeeze1x1”.

Figure 8. (a) Training progress when WLRF = 10, BLRF = 10, MBS = 35, and ILR-S = 0.1. (b) Training progress when WLRF = 10, BLRF = 10, MBS = 35, and ILR-S = 0.001. Note that WLRF and BLRF are used for the newly added convolution and output layers.

Figure 9. (a,b) Training and validation plots of the SqueezeNet model on the target dataset and confusion matrix on the validation set, respectively.

Figure 10. Proposed enhanced iris biometric system.

Figure 11. (a–i) Original image, gray-level image, median-filtered image, edge map, surface plot of the CHT accumulator for a particular radius and center of the candidate outer iris contour, edge map for the pupil, surface plot of the CHT accumulator for a particular radius and center of the candidate inner iris contour, outer boundary marked, and inner boundary marked, respectively.

Figure 12. (a–f) Images taken from the MMU, IITD, CASIA, UBIRIS, SGGSIE&T, and CEW datasets, respectively.

Figure 13. (a,b) Sample images taken from the folders marked as OpenEyes and FullyAlmostClosedEyes, respectively.

Figure 14. Training and validation accuracy (along with the time in minutes taken for both steps) for the best-performing retrained BM models.

Figure 15. (a–d) Typical examples of a fully open eye, a fully closed eye, an almost closed eye, and a partially open eye, respectively.

Figure 16. (a–d) Typical examples of an open eye, a fully closed eye, an almost closed eye, and a partially open eye, respectively. (e–h) Results of the segmentation module without the DCNN model for the images in the first row. (i,j) Iris segmentation results when the DCNN model is applied before the segmentation module.

Figure 17. (a,b) A few instances of accurate and incorrect iris segmentation results of the proposed scheme, respectively.
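
The CHT accumulator panels described in the Figure 11 caption can be illustrated with a minimal sketch of Hough voting at one fixed radius. This is not the author's implementation: the function names, the angular sampling, and the synthetic test circle are assumptions made for illustration, and the decimation, percentile thresholding, and anatomical constraints mentioned in the abstract are omitted.

```python
import math
from collections import Counter


def cht_votes(edge_points, radius, n_angles=360):
    """Accumulate center votes for circles of one fixed radius.

    Every edge pixel votes for all candidate centers lying at distance
    `radius` from it; the true center collects the most votes.
    """
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            a = round(x - radius * math.cos(theta))
            b = round(y - radius * math.sin(theta))
            votes[(a, b)] += 1
    return votes


def synthetic_circle(cx, cy, radius, n_points=720):
    """Hypothetical stand-in for an edge map: pixels on a circle outline."""
    points = set()
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        points.add((round(cx + radius * math.cos(t)),
                    round(cy + radius * math.sin(t))))
    return points


edge_points = synthetic_circle(cx=25, cy=30, radius=12)
votes = cht_votes(edge_points, radius=12)
center, count = votes.most_common(1)[0]
print("estimated center:", center, "votes:", count)
```

In a full search the same voting would be repeated over a range of radii, with the strongest peak across all radii taken as the boundary candidate.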

Table 1. Pretrained Deep Convolutional Neural Network (P-DCNN) models.

P-DCNN Models	Depth	Size (MB)	Trainable Parameters (Millions)	Image Dimensions (Pixels)
SqueezeNet	18	5.2	1.24	$(227 \times 227)$
GoogleNet	22	27	7.0	$(224 \times 224)$
Inception-v3	48	89	23.9	$(299 \times 299)$
DenseNet201	201	77	20.0	$(224 \times 224)$
MobileNetV2	53	13	3.5	$(224 \times 224)$
ResNet18	18	44	11.7	$(224 \times 224)$
ResNet50	50	96	25.6	$(224 \times 224)$
ResNet101	101	167	44.6	$(224 \times 224)$
Xception	71	85	22.9	$(299 \times 299)$
InceptionResNet-V2	164	209	55.9	$(299 \times 299)$
ShuffleNet	50	5.4	1.4	$(224 \times 224)$
NASNet-Mobile	*	20	5.3	$(224 \times 224)$
NASNet-Large	*	332	88.9	$(331 \times 331)$
DarkNet19	19	78	20.8	$(256 \times 256)$
DarkNet53	53	155	41.6	$(256 \times 256)$
EfficientNet-b0	82	20	5.3	$(224 \times 224)$
AlexNet	8	227	61.0	$(227 \times 227)$
VGG16	16	515	138	$(224 \times 224)$
VGG19	19	535	144	$(224 \times 224)$

* Depth not specified.
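
The Size column in Table 1 tracks the parameter count closely. As a rough cross-check, the sketch below estimates weight storage under the assumption that each parameter is stored as a 32-bit float; the source does not state the storage format, so the helper name and the 4-byte figure are illustrative only. The parameter counts and listed sizes are copied from Table 1.

```python
def float32_weight_mib(params_millions: float) -> float:
    """Estimate weight storage in MiB, assuming 4 bytes per parameter."""
    return params_millions * 1e6 * 4 / 2**20


# (parameters in millions, size in MB as listed in Table 1)
table1_rows = {
    "SqueezeNet": (1.24, 5.2),
    "ResNet50": (25.6, 96),
    "AlexNet": (61.0, 227),
    "VGG16": (138, 515),
}

for name, (params, listed_mb) in table1_rows.items():
    estimate = float32_weight_mib(params)
    print(f"{name}: estimated {estimate:.1f} MiB, listed {listed_mb} MB")
```

Under this assumption the estimates land within roughly 10% of the listed sizes, which is consistent with the footprint being dominated by the weights.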

Table 2. Hyperparameters initialization.

Hyperparameter	Value (or Range)	Comment
$N_{MBS}$	[35, 39, 44]	Mini-batch size: the subset of the training dataset used to evaluate the gradient of the loss function and update the weights.
$N_{EPCS}$	[8, 12, 16]	Number of epochs; an epoch is one full pass over the target data.
$γ_{Sol}$	[sgdm, rmsprop, adam]	Solver: sgdm uses the Stochastic Gradient Descent with Momentum (SGDM) optimizer, rmsprop uses the RMSProp optimizer, and adam uses the Adam optimizer.
$γ_{ILR}$	[0.1, 0.001, 0.0001]	Initial learning rate used by the solver.
$α_{WLRF}$	[1, 5, 10]	The software multiplies this factor by the global learning rate to obtain the learning rate for the weights in this layer.
$β_{BLRF}$	[1, 5, 10]	The software multiplies this factor by the global learning rate to obtain the learning rate for the biases in this layer.

Note: $N_{MBS}$ = MiniBatchSize, $N_{EPCS}$ = No. of epochs, $γ_{Sol}$ = Solver, $γ_{ILR}$ = InitialLearnRate needed for Solver, $α_{WLRF}$ = WeightLearnRateFactor, $β_{BLRF}$ = BiasLearnRateFactor. $α_{WLRF}$ and $β_{BLRF}$ are used for the newly added classification and convolutional layers.

Table 3. Public eye/face datasets.

Dataset	Illumination	Size (Pixels)	Images	Non-Ideal Factors	Setup
MMU V1.0	NIR	$320 \times 240$	460	Specular reflections, off-axis irises, and eyeglasses.	I
MMU V2.0	NIR	$320 \times 240$	995	Off-axis irises, non-uniform illumination, blurring, reflections, contact lenses, eyelids, eyelashes, hair, and eyeglasses.	LC
IITD V1.0	NIR	$320 \times 240$	1120	Contact lenses, reflections, focus, rotated iris, eyelids, eyelashes, and eyebrows.	LC
CASIA-IrisV3-Interval	NIR	$320 \times 280$	2639	Reflections, non-uniform illumination, partially open eye, blurring, natural and cosmetic eyelashes, eyeglasses, low-contrast, hair, eyelids, rotated-iris, non-circular boundaries, and lenses.	LC
CASIA-IrisV3-Twins		$640 \times 480$	3118
CASIA-IrisV3-Lamp		$640 \times 480$	16,212
CASIA-IrisV4-Thousand	NIR	$640 \times 480$	20,000	Synthesized images, defocus, reflections, non-uniform illumination, eyeglasses, eyelids, eyelashes, face images, tilted face images, and hair.	LC/UC
CASIA-IrisV4-Syn		$640 \times 480$	10,000
CASIA-IrisV4-Distance		$2352 \times 1728$	2567
UBIRIS V1.0	NIR/VS	$800 \times 600$ ; $200 \times 150$	1877	Closed eyes, low contrast, and reflections.	LC/UC
UBIRIS V2.0	VS	$300 \times 400$	1102	Low contrast, closed eye, reflections, off-axis and off-angle irises, and eyebrows.	LC/UC
SGGSIE&T Iris Image Dataset	NIR	$240 \times 200$	1200	Eyebrows, eyelids, off-axis and off-angle irises, and reflections.	LC
CEW Dataset	VS	Non-uniform	2423 face images	Closed/open eyes, eyeglasses, contact lenses, eyelids, eyelashes, off-axis and off-angle irises, blur, low resolution, and poor contrast.	UC

Note: I = ideal, LC = less constrained, and UC = unconstrained.

Table 4. Basic hyperparameters initialization—trimmed.

Hyperparameter	Value (Range)	Comment
$N_{MBS}$	[35, 39]	Mini-batch size: the subset of the training dataset used to evaluate the gradient of the loss function and update the weights.
$N_{EPCS}$	[8]	Number of epochs; an epoch is one full pass over the target data.
$γ_{Sol}$	[sgdm]	Solver: Stochastic Gradient Descent with Momentum (SGDM) optimizer.
$γ_{ILR}$	[0.001, 0.0001]	Initial learning rate used by the solver, i.e., sgdm.
$α_{WLRF}$	[10]	The software multiplies this factor by the global learning rate to obtain the learning rate for the weights in this layer.
$β_{BLRF}$	[10]	The software multiplies this factor by the global learning rate to obtain the learning rate for the biases in this layer.

Note: $N_{MBS}$ = MiniBatchSize, $N_{EPCS}$ = No. of epochs, $γ_{Sol}$ = Solver, $γ_{ILR}$ = InitialLearnRate needed for Solver, $α_{WLRF}$ = WeightLearnRateFactor, $β_{BLRF}$ = BiasLearnRateFactor. $α_{WLRF}$ and $β_{BLRF}$ are used for the newly added classification and convolutional layers.
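
To give a sense of how much the trimming in Table 4 shrinks the search space relative to Table 2, the following sketch enumerates every combination in each range. The dictionary keys and the `enumerate_grid` helper are hypothetical names, and the source does not state that an exhaustive grid search was actually run, so this only counts the configurations the two tables would allow.

```python
from itertools import product

# Ranges copied from Table 2 (full) and Table 4 (trimmed).
FULL_GRID = {
    "mini_batch_size": [35, 39, 44],
    "epochs": [8, 12, 16],
    "solver": ["sgdm", "rmsprop", "adam"],
    "initial_learn_rate": [0.1, 0.001, 0.0001],
    "weight_learn_rate_factor": [1, 5, 10],
    "bias_learn_rate_factor": [1, 5, 10],
}
TRIMMED_GRID = {
    "mini_batch_size": [35, 39],
    "epochs": [8],
    "solver": ["sgdm"],
    "initial_learn_rate": [0.001, 0.0001],
    "weight_learn_rate_factor": [10],
    "bias_learn_rate_factor": [10],
}


def enumerate_grid(grid):
    """Return one dict per combination of the listed hyperparameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


full = enumerate_grid(FULL_GRID)
trimmed = enumerate_grid(TRIMMED_GRID)
print(len(full), "configurations in the full range")
print(len(trimmed), "configurations in the trimmed range")
```

Each configuration would then be applied per candidate model, so the trimmed range reduces the number of training runs per model by about two orders of magnitude.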

Table 5. Models’ highest validation accuracy, along with the hyperparameters’ values.

BM Model	TA (%)	VA (%)	$γ_{Sol}$	$α_{WLRF}$	$β_{BLRF}$	$γ_{ILR}$	$N_{EPCS}$	$N_{MBS}$	$t_{tv}$ (min)
GoogleNet	100.00	99.40	sgdm	10	10	0.001	8	35	10.36
SqueezeNet	100.00	99.70	sgdm	10	10	0.001	8	35	4.55
MobileNetV2	100.00	99.65	sgdm	10	10	0.001	8	40	16.78
ResNet18	100.00	99.40	sgdm	10	10	0.001	8	35	9.82
ResNet50	100.00	99.40	sgdm	10	10	0.001	8	40	25.53
ResNet101	97.14	99.40	sgdm	10	10	0.001	8	35	47.75
Inception-v3	100.00	98.89	sgdm	10	10	0.001	8	35	39.18
InceptionResNet-V2	100.00	99.45	sgdm	10	10	0.001	8	40	78.53
AlexNet	100.00	99.40	sgdm	10	10	0.0001	8	35	3.82
VGG16	97.14	99.40	sgdm	10	10	0.001	8	35	50.1
VGG19	100.00	98.89	sgdm	10	10	0.0001	8	35	62.1
EfficientNet-b0	100.00	99.40	sgdm	10	10	0.001	8	35	23.92
DenseNet201	100.00	98.89	sgdm	10	10	0.0001	8	35	54.4
Xception	100.00	98.89	sgdm	10	10	0.001	8	35	91.47
ShuffleNet	100.00	98.89	sgdm	10	10	0.001	8	35	8.26
NASNet-Large	Ran out of memory in every trial
NASNet-Mobile	97.14	98.33	sgdm	10	10	0.001	8	35	34.33
DarkNet19	100.00	99.50	sgdm	10	10	0.001	8	40	19.93
DarkNet53	100.00	99.43	sgdm	10	10	0.001	8	35	45.27

Note: TA (%) = training accuracy in percentage, VA (%) = validation accuracy in percentage, $t_{tv}$ = total time taken in minutes for both training and validation.
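
One way to read Table 5 as an accuracy versus cost trade-off is to keep only the models that no other model beats on both validation accuracy and training time (a Pareto filter). The sketch below is an illustrative reading of the table, not a selection procedure described in the source; the `pareto_front` helper is a hypothetical name, and the values are copied from the VA (%) and $t_{tv}$ columns.

```python
# (validation accuracy in %, training + validation time in minutes) from Table 5
table5 = {
    "GoogleNet": (99.40, 10.36), "SqueezeNet": (99.70, 4.55),
    "MobileNetV2": (99.65, 16.78), "ResNet18": (99.40, 9.82),
    "ResNet50": (99.40, 25.53), "ResNet101": (99.40, 47.75),
    "Inception-v3": (98.89, 39.18), "InceptionResNet-V2": (99.45, 78.53),
    "AlexNet": (99.40, 3.82), "VGG16": (99.40, 50.1),
    "VGG19": (98.89, 62.1), "EfficientNet-b0": (99.40, 23.92),
    "DenseNet201": (98.89, 54.4), "Xception": (98.89, 91.47),
    "ShuffleNet": (98.89, 8.26), "NASNet-Mobile": (98.33, 34.33),
    "DarkNet19": (99.50, 19.93), "DarkNet53": (99.43, 45.27),
}


def pareto_front(rows):
    """Keep models not dominated on (higher accuracy, lower time)."""
    front = []
    for name, (acc, minutes) in rows.items():
        dominated = any(
            other_acc >= acc and other_min <= minutes
            and (other_acc > acc or other_min < minutes)
            for other, (other_acc, other_min) in rows.items() if other != name
        )
        if not dominated:
            front.append(name)
    return front


front = pareto_front(table5)
print(front)
```

On these numbers only SqueezeNet and AlexNet remain: SqueezeNet has the highest validation accuracy, and AlexNet is the only model that trains faster.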

Table 6. Models’ lowest validation accuracy, along with the hyperparameters’ values.

BM Model	TA (%)	VA (%)	$γ_{Sol}$	$α_{WLRF}$	$β_{BLRF}$	$γ_{ILR}$	$N_{EPCS}$	$N_{MBS}$	$t_{tv}$ (min)
GoogleNet	97.14	95.56	sgdm	10	10	0.0001	8	35	9.91
SqueezeNet	100.00	98.33	sgdm	10	10	0.0001	8	40	4.82
MobileNetV2	100.00	96.11	sgdm	10	10	0.0001	8	35	16.38
ResNet18	100.00	96.67	sgdm	10	10	0.0001	8	35	9.70
ResNet50	100.00	97.22	sgdm	10	10	0.0001	8	35	25.42
ResNet101	97.14	96.11	sgdm	10	10	0.0001	8	35	42.45
Inception-v3	100.00	94.44	sgdm	10	10	0.0001	8	35	37.37
InceptionResNet-V2	97.14	93.89	sgdm	10	10	0.0001	8	35	73.88
AlexNet	97.50	98.33	sgdm	10	10	0.001	8	40	4.99
VGG16	50.00	50.00	sgdm	10	10	0.001	8	40	68.12
VGG19	100.00	96.11	sgdm	10	10	0.001	8	40	60.60
EfficientNet-b0	97.50	92.78	sgdm	10	10	0.0001	8	40	23.83
DenseNet201	100.00	98.33	sgdm	10	10	0.001	8	35	51.53
Xception	95.00	93.89	sgdm	10	10	0.0001	8	40	97.68
ShuffleNet	97.50	96.11	sgdm	10	10	0.0001	8	40	14.50
NASNet-Large	Ran out of memory in every trial
NASNet-Mobile	92.00	92.78	sgdm	10	10	0.0001	8	40	55.53
DarkNet19	100.00	98.89	sgdm	10	10	0.0001	8	40	32.45
DarkNet53	100.00	98.89	sgdm	10	10	0.0001	8	40	47.34

Note: TA (%) = training accuracy in percentage, VA (%) = validation accuracy in percentage, $t_{tv}$ = total time taken in minutes for both training and validation.

Table 7. Comparison of the proposed DCNN model with contemporary schemes.

Ref	P-DCNN Model	Open Eyes (%)	Fully/Nearly Closed Eyes (%)	Average Time (ms)
[30]	ResNet50	98.4	97	250
Proposed	SqueezeNet	99.6	99	150

Table 8. Iris segmentation results with and without proposed DCNN model.

Mode	Accuracy (%)	Time (s)
Without SqueezeNet	48	50
With SqueezeNet	99.5	0.6
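
The "With SqueezeNet" row corresponds to running the eye-state classifier before the segmentation module, so that closed or nearly closed eyes never reach the boundary search. A minimal sketch of that gating step follows; the function name, the callable interfaces, and the circle representation are assumptions for illustration, and the stand-in classifier and segmenter only exist so the example runs without a trained model.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

Image = TypeVar("Image")
# Hypothetical boundary representation: (center_x, center_y, radius).
Circle = Tuple[int, int, int]


def gated_segmentation(
    image: Image,
    is_eye_open: Callable[[Image], bool],
    segment_iris: Callable[[Image], Tuple[Circle, Circle]],
) -> Optional[Tuple[Circle, Circle]]:
    """Run the segmentation step only when the classifier reports an open eye."""
    if not is_eye_open(image):
        return None  # closed or nearly closed eye: skip the boundary search
    return segment_iris(image)


# Stand-in callables (not the trained SqueezeNet or the CHT-based module).
segmented: List[str] = []


def fake_classifier(image: str) -> bool:
    return image != "closed"


def fake_segmenter(image: str) -> Tuple[Circle, Circle]:
    segmented.append(image)
    return (160, 120, 90), (160, 120, 30)  # (limbic, pupillary) placeholders


results = [gated_segmentation(img, fake_classifier, fake_segmenter)
           for img in ["open", "closed", "open"]]
print(results)
```

Returning `None` for rejected frames lets the caller decide whether to request a new capture or simply drop the frame.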

Table 9. Iris segmentation comparative analysis.

Ref.	Accuracy (%) Without DCNN Model	Accuracy (%) With DCNN Model
[59]	38	86.6
[60]	44	78.4
[61]	48	88.9
[21]	34	75.5
Proposed	48	99.5

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
