1. Introduction
In recent years, road traffic safety has become an increasingly serious issue. More than 1.3 million people die in road traffic accidents annually, and more than 20 million suffer non-fatal injuries as a result of these accidents [1]. Statistics show that fatigue driving is one of the leading causes of traffic accidents [2]. Consequently, the effective monitoring of driver fatigue has become critical for improving road safety and reducing traffic-related fatalities. With the rapid development of computer vision technology, eye-tracking-based driver fatigue monitoring systems [3] have gradually become a research hotspot. Eye trackers determine the center of the pupil to calculate indicators such as the driver’s gaze point and eye movement speed, enabling the real-time monitoring of the driver’s condition. However, factors such as poor road conditions can cause vehicle vibrations, resulting in blurred images being captured by the eye tracker [4]. In addition, the field of view of the eye tracker is large, so the eye region occupies only a small portion of the image pixels, and its resolution is relatively low [5]. These factors may cause the eye tracker to capture blurred images of the human eye, which affects the precise localization of the pupil center and, consequently, reduces the reliability of driver fatigue monitoring. Therefore, developing real-time pupil localization algorithms for blurred images is of great value in monitoring driver fatigue.
Pupil localization algorithms can be classified into two categories: data-driven methods and knowledge-based methods [6,7]. Data-driven methods learn the inherent patterns and features of data through extensive training on large datasets, thereby accomplishing the task of pupil localization [8]. These methods demonstrate superior performance when processing complex scenes and blurred images. For instance, a robust pupil localization algorithm based on a modified dense fully connected network has been proposed, enabling the rapid and accurate detection of the pupil center even in the presence of reflections and occlusions [9]. Additionally, a Pupil-Locator Network (PLNet), which utilizes domain-specific data augmentation, has been developed to address challenging conditions such as reflections, exposure variations, and blur, further enhancing the robustness of pupil localization [10]. However, data-driven methods rely on the support of datasets [11], and existing datasets struggle to fully cover human eye images with varying levels of blur, resulting in limited model generalization. In addition, the model inference and data processing steps require more computational resources, which may affect the response times of driver fatigue monitoring systems when hardware resources are constrained.
Traditional knowledge-based methods perform pupil localization based on prior knowledge, such as low grayscale values, circular shapes, and a lack of texture within the pupil. These methods can be categorized into three types based on the features utilized by the algorithms: grayscale-based [12,13], gradient-based [14,15,16], and shape-based methods [17,18]. A grayscale-based approach using the Gray Projecting Integral Method (GPIM) [13] for pupil localization offers flexibility but suffers from reduced accuracy under poor lighting and in complex scenes. To improve localization precision, a gradient-based approach utilizing the Hough Transform (HT) [14] detects the pupil center with high accuracy and robustness against interference. However, its one-to-many mapping results in high computational complexity and poor real-time performance. In contrast, shape-based approaches, such as the Fast Radial Symmetry Transform (FRST) [18], enhance real-time efficiency by reducing algorithmic complexity through parameter space conversion. These algorithms achieve accurate pupil localization in clear eye images.
However, traditional methods designed for clear images face significant challenges when applied to blurred images of human eyes. Although the overall grayscale distribution trend and large-scale shape information are preserved, high-frequency components such as texture and edge features undergo significant losses, and local grayscale details and small-scale shape features are weakened. As a result, conventional pupil localization algorithms are ineffective in processing blurred images. To address this, Zheng et al. [19] proposed a human eye location method based on blurred infrared images. First, the image is enhanced by adaptive histogram equalization to compensate for the loss of gray information, and then the human eye region is located through template matching. While this method yields good localization results for blurred images, it targets the human eye region rather than the pupil center. Li et al. [20] proposed a coarse-to-fine positioning algorithm that combines grayscale, gradient, and other information. The algorithm consists of four steps: coarse positioning using the Otsu method, edge positioning using gradient features, edge repositioning using sub-pixel methods, and ellipse fitting. The algorithm effectively improves the pupil localization accuracy of blurred images, but it is time-consuming and does not meet the real-time requirements for monitoring driver fatigue.
To achieve accurate pupil localization in blurred images while ensuring real-time performance, this paper proposes a pupil localization algorithm based on double constraints. First, the rough pupil area is extracted from the whole image based on grayscale constraints to reduce the interference of invalid information. Subsequently, a specifically designed pupil shape index serves as the key condition for screening the pupil area, and the pupil area is refined based on geometric constraints. This approach avoids the complex computations involved in edge fitting and mitigates the loss of high-frequency components caused by image blur. Finally, the geometric moments of the region are directly used to locate the pupil center, reducing computational time. The experimental results demonstrate that the proposed algorithm achieves precise pupil localization in blurred human eye images, improving both the localization speed and accuracy, thereby enhancing the technology used for real-time driver fatigue monitoring.
2. Proposed Method
Near-infrared imaging is often used for on-board monitoring [21] because it can stably capture pupil features under various lighting conditions, especially in accident-prone nighttime environments. In addition, the eye reflects near-infrared light more strongly than visible light. Therefore, the pupil localization algorithm proposed in this study targets infrared images.
The features of a clear human eye image are shown in Figure 1, and the features of a blurred human eye image are shown in Figure 2. As shown in Figure 1 and Figure 2, a pupil localization algorithm for blurred images must accurately separate the pupil area from the iris and sclera areas and detect the pupil center in the presence of blurred edges and reduced contrast.
Based on the analysis above, the flowchart of the pupil localization algorithm proposed in this article is shown in Figure 3. The algorithm consists of three steps: extracting the rough pupil area based on grayscale constraints, refining the pupil region based on geometric constraints, and determining the pupil center using geometric moments. First, the statistical information of the grayscale histogram is combined with grayscale integral projection as grayscale constraints to extract the rough pupil area. Next, connected components are extracted, and morphological hole filling is performed to mitigate the impact of reflective artifacts. The designed pupil shape index is then used as the screening condition for the connected components, and the pupil area is refined through geometric constraints. Finally, the geometric moments of the pupil area are calculated to output the coordinates of the pupil center.
2.1. The Extraction of the Rough Pupil Area Based on Grayscale Constraints
Due to factors such as uneven illumination, dark areas and reflected spots appear in the image, and interference from hair and eyebrows may also be present. It is therefore necessary to find the rough pupil region within the whole image. Because the scenes in the images to be processed are highly complex, a fixed-position, fixed-size region mask cannot provide satisfactory results. We therefore propose a method to extract the rough pupil region based on grayscale constraints, as shown in Figure 4. This step relies solely on grayscale statistical information, thereby minimizing interference caused by image blurring. Moreover, only one traversal of the image pixels is required, allowing for efficient use of the grayscale information and reducing computational complexity.
First, the segmentation threshold is determined using the statistical information from the gray histogram. Based on prior knowledge, the pixel values of the pupil area are low, and the pupil occupies a specific region of the image. Regardless of the ambient lighting conditions or image clarity, the pixels of the pupil region are generally contained in the spike at low gray values. Therefore, we set the segmentation threshold at the gray value corresponding to the valley following the lowest-gray peak in the histogram. The binarization results generated using different threshold segmentation methods are shown in Figure 5. The result of threshold segmentation using the histogram statistical method (Figure 5d) generally includes the pupil area along with dark areas of similar grayscale values. This method adapts to the pupil region’s gray range based on statistical information, making it both simple and effective.
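As a concrete sketch of this step (our own minimal implementation, not the paper's code; the histogram smoothing window and the exact peak/valley criteria are assumptions), the threshold can be taken at the valley that follows the lowest-gray peak:

```python
import numpy as np

def histogram_valley_threshold(gray: np.ndarray, win: int = 5) -> int:
    """Segmentation threshold at the valley following the lowest-gray peak.

    The pupil pixels concentrate in a spike at low gray values, so we
    smooth the 256-bin histogram, find the first local peak, and return
    the gray level of the valley just after it.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    kernel = np.ones(win) / win
    smooth = np.convolve(hist, kernel, mode="same")   # suppress spurious spikes
    # first local maximum = lowest-gray peak (assumed to hold the pupil pixels)
    peak = next(i for i in range(1, 255)
                if smooth[i] >= smooth[i - 1] and smooth[i] > smooth[i + 1])
    # valley = first local minimum after that peak
    valley = next(i for i in range(peak + 1, 255)
                  if smooth[i] < smooth[i - 1] and smooth[i] <= smooth[i + 1])
    return valley
```

Pixels with gray values below the returned threshold are then taken as the binary pupil candidates for the projection step that follows.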
After obtaining the binarization result using the histogram statistical method (Figure 5d), the image is subjected to grayscale integral projection along both the vertical and horizontal directions. By traversing and accumulating the number of pixels with zero grayscale value in these directions, two one-dimensional grayscale cumulative distribution curves are generated. The coordinate values corresponding to the main regions of the two curves determine the rough pupil region in both dimensions, resulting in a rectangular mask. To account for potential interference, such as reflected light spots near the pupil boundary, which may lead to an undersized rough pupil area, the rectangular mask is expanded by 1/20 of the number of pixels along the long side of the image.
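This projection step can be sketched as follows (a minimal NumPy version; treating the span of nonzero projection bins as the curves' main regions is our assumption):

```python
import numpy as np

def rough_pupil_mask(binary: np.ndarray) -> tuple[int, int, int, int]:
    """Bounding rectangle of the rough pupil area from integral projections.

    `binary` holds 0 for candidate pupil pixels and 255 elsewhere.
    Counts of zero-valued pixels are accumulated along rows and columns;
    the spans where the counts are nonzero bound the rough pupil region,
    which is then expanded by 1/20 of the image's long side.
    """
    zeros = (binary == 0)
    col_proj = zeros.sum(axis=0)        # vertical projection (per column)
    row_proj = zeros.sum(axis=1)        # horizontal projection (per row)
    cols = np.flatnonzero(col_proj)
    rows = np.flatnonzero(row_proj)
    margin = max(binary.shape) // 20    # guard against an undersized region
    x0 = max(int(cols[0]) - margin, 0)
    x1 = min(int(cols[-1]) + margin, binary.shape[1] - 1)
    y0 = max(int(rows[0]) - margin, 0)
    y1 = min(int(rows[-1]) + margin, binary.shape[0] - 1)
    return x0, y0, x1, y1
```

Both projections can be accumulated in the same pixel traversal as the histogram, which is the basis of the real-time claim made below.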
The method of extracting the rough pupil area based on grayscale constraints leverages the statistical characteristics of gray information in the human eye image, and the threshold does not need to be manually selected. It is both robust and adaptable, effectively reducing the impact of image blur on pupil localization. Furthermore, the calculations of the grayscale histogram and the grayscale integral projection can be performed simultaneously during the pixel traversal process, allowing for significant improvements in the algorithm’s real-time performance through parallelization, especially for large-scale human eye images.
2.2. Refinement of the Pupil Area Based on Geometric Constraints
The rough pupil region extracted using grayscale constraints does not meet the accuracy requirements of pupil localization for monitoring driver fatigue, so further refinement of the pupil area is needed. In this study, we use morphological operations and the pupil shape index as geometric constraints to refine the pupil area; the process is shown in Figure 6. First, the rough pupil area is preprocessed via morphological operations: the connected regions are extracted, and a hole-filling algorithm is applied to eliminate the influence of reflected light spots. Then, the connected components are screened according to the proposed pupil shape index, with the region exhibiting the largest pupil shape index selected for further processing.
First, the connected components are extracted from the rough pupil area. The shape of the connected components depends on the chosen connectivity criterion, of which there are two types: 4-connectivity and 8-connectivity. The 4-connectivity criterion is suitable for simple object segmentation, whereas 8-connectivity is more effective for segmenting complex objects with irregular boundaries; 8-connectivity is therefore preferable for extracting the pupil region. After extracting the connected components, an opening operation is applied to suppress small-scale noise. Subsequently, a hole-filling algorithm is used to fill holes that are not connected to the image boundary, thereby eliminating the interference of reflection spots. Finally, multiple complete connected components are obtained as pupil candidates.
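The component extraction and hole filling can be sketched as follows (a pure-NumPy illustration; the opening operation is omitted for brevity, and the BFS-based labeling merely stands in for whatever implementation is used in practice):

```python
from collections import deque

import numpy as np

NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]

def label_8(mask: np.ndarray) -> tuple[np.ndarray, int]:
    """8-connected component labeling via breadth-first search."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        current += 1
        labels[sy, sx] = current
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBORS_8:
                ny, nx = y + dy, x + dx
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    queue.append((ny, nx))
    return labels, current

def fill_holes(mask: np.ndarray) -> np.ndarray:
    """Fill holes not connected to the image border (removes glints)."""
    h, w = mask.shape
    outside = np.zeros((h, w), dtype=bool)
    # seed the flood fill with every background pixel on the border
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if (y in (0, h - 1) or x in (0, w - 1)) and not mask[y, x])
    for y, x in queue:
        outside[y, x] = True
    while queue:
        y, x = queue.popleft()
        for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:  # 4-connected background
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and not mask[ny, nx] and not outside[ny, nx]):
                outside[ny, nx] = True
                queue.append((ny, nx))
    # anything not reachable from the border is foreground or a filled hole
    return mask | ~outside
```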
Since the pupil area is not a perfect circle and its edge exhibits a certain degree of convexity and concavity, using a single geometric feature to screen the connected components yields unsatisfactory results: no pupil candidate may meet the requirements because the restrictions are too strict, or the wrong candidate may be selected because the restrictions are insufficient. To address this issue, this paper designs a pupil shape index w to screen the multiple connected components and determine the pupil area. The expression for the pupil shape index w is as follows:

w = α · (I2 / I1) + β · 1 / (1 + e)   (1)

where I2 is the secondary moment of inertia, I1 is the main moment of inertia, e is the error between the connected component and the ellipse fitted to it, and α and β are weighting coefficients. The pupil shape index w is composed of two terms, which capture the inherent shape characteristics of the pupil and are robust to noise and other disturbances. The first term, I2/I1, is the inertia ratio of the connected component, which describes the uniformity of its shape: when the shape of the region is more homogeneous, the main and secondary moments of inertia are closer, and the value of the first term is larger. The second term, 1/(1 + e), represents the ellipticity of the connected component and the smoothness of its boundary: when the shape of the connected component is closer to an ellipse and its edges are smoother, e becomes smaller, and the value of the second term increases. In general, the closer the shape of the connected component is to a circle, the larger the value of the pupil shape index w.

Because the geometric shape of the pupil is evenly distributed, increasing the weight of the first term encourages w to prioritize regions with a uniform distribution, while reducing the weight of the second term allows a pupil area with slight occlusion or local noise to still receive a high score. In this paper, α = 0.6 and β = 0.4. The pupil shape index of each connected component is calculated, and the connected component with the highest index is selected as the pupil area.
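A sketch of computing such an index is given below. Since the exact functional form of w is described only verbally in the text, the formula here (an inertia ratio plus a fitting-error term, with the stated weights 0.6 and 0.4) is our reconstruction, and the moment-derived ellipse comparison stands in for explicit ellipse fitting:

```python
import numpy as np

ALPHA, BETA = 0.6, 0.4   # weights from the paper

def pupil_shape_index(mask: np.ndarray) -> float:
    """Pupil shape index w = ALPHA * (I2/I1) + BETA / (1 + e).

    I2/I1 is the ratio of the secondary to the main moment of inertia
    (eigenvalues of the second central moment matrix); e is the fraction
    of region pixels falling outside the ellipse implied by the region's
    moments. Treat the exact form as illustrative, not the paper's.
    """
    ys, xs = np.nonzero(mask)
    m00 = len(xs)
    cx, cy = xs.mean(), ys.mean()
    dx, dy = xs - cx, ys - cy
    cov = np.array([[np.mean(dx * dx), np.mean(dx * dy)],
                    [np.mean(dx * dy), np.mean(dy * dy)]])
    evals, evecs = np.linalg.eigh(cov)        # ascending eigenvalues
    inertia_ratio = evals[0] / evals[1]       # I2 / I1, in (0, 1]
    # Ellipse implied by the moments: semi-axes are 2*sqrt(eigenvalue)
    axes = 2.0 * np.sqrt(evals)
    # Project every pixel onto the principal axes, test ellipse membership
    proj = evecs.T @ np.stack([dx, dy])
    inside = (proj[0] / axes[0]) ** 2 + (proj[1] / axes[1]) ** 2 <= 1.0
    e = np.count_nonzero(~inside) / m00       # region pixels outside the ellipse
    return ALPHA * inertia_ratio + BETA / (1.0 + e)
```

A compact circular region scores close to 1, while an elongated or ragged region scores much lower, which is the behavior the screening step relies on.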
Geometric features commonly used in image processing include the radius, area, circularity, inertia, convexity, and others. In this paper, five geometric constraints (namely, area, circularity, inertia ratio, convexity, and the pupil shape index w) are employed to screen the pupil area. The pupil recognition accuracy achieved using different geometric constraints is shown in Figure 7. Accuracy is defined as the ratio of the number of samples in which the pupil area is correctly identified to the total number of samples. As shown in Figure 7, the accuracy achieved when using the pupil shape index w is higher than that obtained using any single feature.
The refinement of the pupil area based on geometric constraints not only enhances the accuracy of pupil recognition but also avoids the complex operation of extracting edges to calculate roundness and shape symmetry, providing a solid foundation for locating the pupil center in the subsequent step. Through the morphological operations, the pupil candidates become smoother and more complete. The pupil area is then screened using the proposed pupil shape index, which is robust to variations in pupil shape and aligns well with the characteristics of the pupil. The method for refining the pupil area through geometric constraints is applicable to both regular and irregular pupil regions. The results of refining the pupil area are shown in Figure 8.
2.3. Determination of the Pupil Center Using Geometric Moments
After obtaining the refined pupil area, the geometric moments [22] are calculated directly to determine the pupil center. The grayscale values of the binary image are treated as a two-dimensional density distribution function, and the center and radius of the pupil region are then derived from the geometric moments of the connected component. The moment of a pupil area S can be expressed as:

M_ij = Σ_{(x,y)∈S} x^i y^j f(x, y)   (2)

where i and j are non-negative integers, (x, y) ∈ S, and i + j represents the order of the moment M. f(x, y) represents the grayscale value of the pixel at coordinate (x, y) in the image. The center and size of the pupil are determined by calculating the zero-order moment and the first-order moments. For the binary pupil region, the zero-order moment gives the pupil area:

A = M_00   (3)

and the pupil center is calculated as the ratio of the first-order moments to the zero-order moment:

x_c = M_10 / M_00,  y_c = M_01 / M_00   (4)
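For a binary pupil region this computation reduces to a few array operations (a minimal sketch of the moment-based center, not the paper's exact code):

```python
import numpy as np

def pupil_center(mask: np.ndarray) -> tuple[float, float]:
    """Pupil center as the ratio of first-order to zero-order moments.

    For a binary region, M00 is the pixel count (area) and
    (M10/M00, M01/M00) is the centroid, i.e. the pupil center.
    Runs in O(P) time for P region pixels.
    """
    ys, xs = np.nonzero(mask)
    m00 = xs.size                      # zero-order moment: area
    m10 = xs.sum()                     # first-order moment in x
    m01 = ys.sum()                     # first-order moment in y
    return m10 / m00, m01 / m00
```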
The central position of the pupil can be determined using the geometric moments described above. According to Equations (2)–(4), the pupil size can be calculated directly from the zero-order moment, and the pupil center can be obtained as the ratio of the first-order moments to the area. The time complexity of locating the pupil center using geometric moments is O(P), where P is the number of pixels in the pupil area. This method exhibits low time complexity and excellent real-time performance.
This algorithm can handle different gray levels and pupil areas, demonstrating strong robustness to noise. The accuracy of the localization results is ensured through grayscale and geometric constraints. Since the algorithm does not require iterative operations or edge fitting, it minimizes the interference from noise points and improves the processing speed. Furthermore, the use of geometric moments for pupil center localization provides defocus adaptability, enabling reliable performance even in blurred images. In contrast, methods that rely on edge extraction and circle fitting are prone to errors in blurred images due to deviations in edge point selection, which have a significant impact on the accuracy of the pupil center localization. The proposed method calculates the geometric moments of the connected component, reducing the impact of edge contour diffusion in blurred images, thereby minimizing errors caused by deviations in edge point selection. Therefore, the proposed method enables the accurate localization of the pupil in blurred images.
4. Conclusions
In this study, we designed a real-time pupil localization algorithm for blurred images, based on double constraints, for driver fatigue monitoring. The algorithm first extracts the rough pupil area based on grayscale constraints and then refines the pupil area using a designed pupil shape index based on geometric constraints; finally, it locates the pupil center using geometric moments. By avoiding the need to fit a circle through feature points, the proposed algorithm mitigates the effects of edge diffusion in blurred images and reduces errors resulting from biased edge point selection. The algorithm demonstrates low time complexity, minimal parameter configuration requirements, and satisfactory real-time performance, making it suitable for applications in real-time driver fatigue monitoring. In future work, we plan to further investigate the performance of the pupil localization algorithm under various lighting conditions, such as visible light, to enhance its applicability. Additionally, the algorithm will be optimized for practical scenarios, including addressing challenges posed by drivers wearing glasses, prior to its deployment.