Pupil and Glint Detection Using Wearable Camera Sensor and Near-Infrared LED Array

This paper proposes a novel pupil and glint detection method for gaze tracking systems using a wearable camera sensor and a near-infrared LED array. A novel circular ring rays location (CRRL) method is proposed for detecting pupil boundary points. First, improved Otsu optimal threshold binarization, an opening-and-closing operation and projection of the 3D gray-level histogram are used to estimate the rough pupil center and radius. Second, a circular ring area that contains the pupil edge is determined from the rough pupil center and radius. Third, a series of rays is shot from the inner ring to the outer ring to collect pupil boundary points, and interference points are eliminated by calculating the gradient amplitude. Finally, an improved total least squares method is proposed to fit the collected pupil boundary points. In addition, the improved total least squares method is used to solve the deformed Gaussian function and calculate the glint center. The experimental results show that the proposed method is more robust and accurate than conventional detection methods. When interference factors such as glints and natural light reflection are located on the pupil contour, the pupil boundary points and center can still be detected accurately. The proposed method helps to enhance the stability, accuracy and real-time quality of gaze tracking systems.


Introduction
Human beings acquire 80% to 90% of outside information through the eyes. Human visual perception information can be acquired through eye gaze tracking. With the development of computer and machine vision technology, gaze tracking has been applied more and more widely in medicine [1], production tests [2], human-machine interaction [3,4], military aviation [5,6], etc.
As one of the traditional gaze tracking methods [7–12], the pupil center-corneal reflection (PCCR) technique has been steadily developed and improved in recent years [13–18]. Pupil and glint (corneal reflection) center detection plays a crucial role in gaze tracking methods based on PCCR. Images acquired by a CCD camera always contain interference factors such as eyelashes, eyelids, shadows and natural light reflection, which cause false boundary points around the pupil contour. To ensure the accuracy of gaze estimation, a robust and accurate method of pupil and glint detection is essential.
Previous scholars have done a great deal of research on pupil and glint detection. Ebisawa [19] proposes a pupil detection technique using two alternating infrared light sources and the image difference of bright and dark eye images. The bright/dark eye image is acquired by switching on the light source coaxial/non-coaxial with the camera during the odd/even fields alternately, which limits the sampling time. The glint position stays almost fixed. To detect it, pupil brightness should be as low as possible. Although the image difference method is simple, switching the light sources on and off may affect its stability. To overcome the limitations of this technique, methods that use a single eye image have been proposed.
In [13], double ellipse fitting (rough and detailed) is performed to eliminate false boundary points and obtain an accurate pupil center position. However, false boundary points around the pupil contour are difficult to eliminate, and double ellipse fitting takes a long time. The glint is detected by searching near the pupil, and its centroid is taken as the center position. The uncertain searching time and result for the glint can make the method unstable. Yoo et al. [20] acquire a rough pupil bound by iterative projection. Snakes are then used to converge to the pupil boundary, but elimination of false boundary points is not considered. The glint searching region is limited by the rough pupil bound. Finally, the pupil and glint center positions are determined by ellipse fitting. Gwon et al. [21] locate the approximate pupil area using the CED method and then obtain the precise pupil center by calculating the geometric center of the black pixels. Before pupil detection, glints are erased using neighboring pixels in the horizontal direction. This erasure introduces errors at pixels around the pupil contour and reduces the accuracy of pupil center location. To better locate the pupil boundary, Li et al. [22] develop a feature-based method. During feature detection, pupil contour candidates are detected along a series of rays shot from a best guess of the pupil center and are marked with crosses. RANSAC is applied to differentiate pupil contour points (inliers) from interference points (outliers). When interference factors such as glints and natural light reflection lie on or around the pupil contour, some interference points are mixed with pupil contour points. In this case, RANSAC cannot reliably differentiate them, and the location accuracy of the pupil center is affected. Krishnamoorthi and Annapoorani [23] propose a boundary extraction technique to localize the pupil. An orthogonal polynomials model is adopted to analyze the structure of an eye image. Hartley's statistical hypothesis test is employed in edge map extraction. A where-to-go approach is proposed to find pupil boundary points with the assistance of weightage assignment. Although the algorithm can locate pupil boundary points accurately, it is limited by its boundary assumption.
The remainder of this paper is organized as follows: Section 2 presents the proposed method in detail. Section 3 describes the experiments and shows the experimental results. Section 4 concludes the whole work.

Proposed Method
A novel and robust method of pupil and glint detection using a wearable camera sensor and a near-infrared LED array for gaze tracking systems is proposed in this paper. Compared with the original Starburst, the proposed circular ring rays location (CRRL) method has higher stability, accuracy and real-time quality. The method overcomes the location uncertainty of the initial shooting point of the rays. The step of shooting rays back towards the start point to collect more pupil boundary points is omitted. RANSAC is also omitted because the interference points can be eliminated effectively without it. The pupil center can be detected accurately even when interference points are located on or around the pupil contour. The improved Otsu method is employed to acquire the binary image of the eye. Part of the remaining interference factors (including eyelashes and eyelids) is eliminated by an opening-and-closing operation with structure elements of different sizes. Projections of the 3D gray-level histogram are used to estimate the rough pupil radius and center position. The circular ring area is determined from this provisional pupil radius and center. A series of equally spaced rays is shot from the inner to the outer ring to detect pupil boundary points by calculating the gradient amplitude. The gradient amplitude of each pixel is used to eliminate false boundary points. Spline interpolation is performed in the neighborhood of the boundary points to obtain subpixel-precise ones. Improved total least squares is developed to fit an ellipse, and the pupil center position is then calculated from the fitted elliptic equation. Because the gray levels of glint pixels are higher than anywhere else, the rough glint region is estimated by binarization with a fixed threshold level. Since the glint's illumination intensity follows a Gaussian distribution, a deformed Gaussian function solved by improved total least squares is used to calculate the glint center.

Proposed Gaze Tracking Device
In this study, we develop a wearable gaze tracking device composed of a helmet, a monitor, an array of four near-infrared light emitting diodes (NIR LEDs) and a microspur camera, as shown in Figure 1. Because the imaging distance is limited to 3 to 5 cm, a microspur camera is adopted to acquire the eye image. The image resolution is 640 × 480 pixels (CCD sensor). The wavelength of the NIR LEDs is 850 nm and the power is less than 5 mW. The experimental system brings no harm to human eyes [24].

Binarization and Opening-and-Closing Operation
An improved Otsu method is employed in this paper to obtain the binary eye image. Proposed by Otsu in 1979, the Otsu method is based on adaptive threshold selection [25]. The original eye image is shown in Figure 2a, and its gray-level histogram is shown in Figure 2b. Let the number of pixels with gray level $i$ in the eye image be $n_i$. All gray levels are divided into 3 groups, as shown in Figure 2b: group $G_0$ mainly contains the gray levels of black areas such as the pupil and eyelashes; group $G_1$ mainly contains the gray levels of the iris and shadows; group $G_2$ mainly contains the gray levels of the cornea and the surrounding skin. Let the occurrence probabilities of $G_0$, $G_1$, $G_2$ be $\omega_0$, $\omega_1$, $\omega_2$ and the corresponding gray levels be $h_0$, $h_1$, $h_2$, from which the within-class variance is defined. We develop an improved and fast method for solving the optimal thresholds. According to Equations (3) and (4), the within-class variance is transformed into the integral form of Equation (5).
Partial derivatives with respect to $T_1$ and $T_2$ are taken on both sides of Equation (5). The result is shown in Equation (6).
The formula for solving the thresholds in the Otsu method is expanded as Equation (7).
According to Equations (6) and (7), the optimal thresholds can be solved. For each pixel in the original eye image, the mean gray level of the 3 × 3 neighborhood around it replaces its original gray level. The occurrence probabilities of the new gray levels are then calculated and used to solve the optimal segmentation thresholds $T_1$ and $T_2$ according to Equations (6) and (7). Following the regular distribution of the eye image's gray-level histogram, $T_1$ is limited to the range 0 to 50 and $T_2$ to the range $T_1$ to 150. The maximum value of $g(T_1, T_2)$ is calculated according to Equation (7), and the corresponding $(T_1, T_2)$ is the optimal threshold pair.
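The constrained two-threshold search can be illustrated with a minimal Python sketch. It is not the fast solution of Equations (6) and (7): it assumes a plain exhaustive search that maximizes the standard between-class variance over $0 \le T_1 \le 50$ and $T_1 < T_2 \le 150$, applied to the histogram of the 3 × 3 mean-filtered image. The function names, the border handling of the mean filter and the convention that a gray level equal to a threshold belongs to the lower class are assumptions made for illustration.

```python
def mean_filter_3x3(img):
    """Replace each pixel by the rounded mean of its 3x3 neighborhood.

    Assumption: at the image border only the available neighbors are averaged.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = round(sum(vals) / len(vals))
    return out


def histogram(img, levels=256):
    """Count the pixels n_i at every gray level i."""
    hist = [0] * levels
    for row in img:
        for g in row:
            hist[g] += 1
    return hist


def two_threshold_otsu(hist, t1_max=50, t2_max=150):
    """Exhaustive search for (T1, T2) with 0 <= T1 <= t1_max, T1 < T2 <= t2_max.

    Maximizing sum_k m_k^2 / n_k over the three classes is equivalent to
    maximizing the between-class variance (the total mean is constant).
    """
    last = len(hist) - 1
    cum_n, cum_m = [0], [0]          # prefix sums of counts and of i * n_i
    for g, n in enumerate(hist):
        cum_n.append(cum_n[-1] + n)
        cum_m.append(cum_m[-1] + g * n)

    def class_term(lo, hi):          # gray levels lo..hi inclusive
        n = cum_n[hi + 1] - cum_n[lo]
        m = cum_m[hi + 1] - cum_m[lo]
        return m * m / n if n else 0.0

    best_score, best = -1.0, None
    for t1 in range(0, min(t1_max, last - 2) + 1):
        for t2 in range(t1 + 1, min(t2_max, last - 1) + 1):
            score = (class_term(0, t1) + class_term(t1 + 1, t2)
                     + class_term(t2 + 1, last))
            if score > best_score:
                best_score, best = score, (t1, t2)
    return best
```

A typical call would be `two_threshold_otsu(histogram(mean_filter_3x3(img)))`, where `img` is a list of rows of integer gray levels. The exhaustive search visits at most a few thousand threshold pairs because of the range limits, which is the same restriction that the text uses to keep the search cheap.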
The new method for solving the optimal thresholds has lower computational complexity. As shown in Table 1, the segmentation time of the improved method is less than that of the original Otsu method, which contributes to the real-time quality of eye gaze tracking. To extract the pupil, threshold $T_1$ is used for binarization. The binary image of the eye is shown in Figure 3. To eliminate interference points (mainly remnant eyelashes and eyelids), opening and closing operations with structure elements of different sizes are employed. According to the shape and size of the interference factors shown in Figure 3, a $0.3T_1 \times 0.3T_1$ square structure element is used in the opening operation, and a $0.7T_1 \times 0.7T_1$ square structure element is used in the closing operation. The result is shown in Figure 4.
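The opening-and-closing step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the binary image is a list of rows of 0/1 values with pupil pixels equal to 1, the side length `k` of the square structure element is passed in by the caller and must be odd, and pixels outside the image are treated as background.

```python
def _window_op(img, k, op):
    """Apply min (erosion) or max (dilation) over a k x k square window."""
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd structure element size")
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i] if 0 <= j < h and 0 <= i < w else 0
                      for j in range(y - r, y + r + 1)
                      for i in range(x - r, x + r + 1)]
            out[y][x] = op(window)
    return out


def erode(img, k):
    return _window_op(img, k, min)


def dilate(img, k):
    return _window_op(img, k, max)


def opening(img, k):
    """Erosion followed by dilation: removes structures smaller than k x k."""
    return dilate(erode(img, k), k)


def closing(img, k):
    """Dilation followed by erosion: fills holes and gaps smaller than k x k."""
    return erode(dilate(img, k), k)


def open_then_close(img, k_open, k_close):
    """Opening with a smaller element, then closing with a larger one."""
    return closing(opening(img, k_open), k_close)
```

Opening with the smaller element removes thin remnants such as eyelashes, while closing with the larger element fills the holes that glints leave inside the pupil, which is why the two operations use different sizes.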

Rough Location of Pupil Area and Center
The pupil image obtained after the opening-and-closing operation has an elliptical shape (irregular at glints and natural light reflection). The 3D gray-level histogram of the opening-and-closing result is shown in Figure 5a, and its projections along the x and y axes are shown in Figure 5b. The rough location of the pupil area and center is determined from the distribution of gray levels in the projection images. The rough pupil area lies in a rectangular box with length $l_2$ and width $l_4$. The estimated pupil center is defined as $o'_p = (l_1 + l_2/2,\ l_3 + l_4/2)$, and the estimated pupil radius is defined as $r'_p = (l_2 + l_4)/4$.
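The rough location step can be sketched as follows. The sketch assumes that the pupil pixels are 1 in the binary image, that $l_1$ and $l_3$ are the first column and row with a nonzero projection, and that $l_2$ and $l_4$ are the distances between the first and last nonzero columns and rows; these conventions, like the function name, are illustrative assumptions rather than details taken from Figure 5.

```python
def rough_pupil(binary):
    """Estimate the rough pupil center and radius from axis projections.

    binary: list of rows with 1 for pupil pixels and 0 elsewhere.
    Returns ((x, y), radius), or None if the image has no pupil pixels.
    """
    col_proj = [sum(col) for col in zip(*binary)]   # projection onto x axis
    row_proj = [sum(row) for row in binary]         # projection onto y axis
    xs = [i for i, v in enumerate(col_proj) if v > 0]
    ys = [j for j, v in enumerate(row_proj) if v > 0]
    if not xs:
        return None
    l1, l2 = xs[0], xs[-1] - xs[0]                  # box origin and length
    l3, l4 = ys[0], ys[-1] - ys[0]                  # box origin and width
    center = (l1 + l2 / 2, l3 + l4 / 2)
    radius = (l2 + l4) / 4
    return center, radius
```

For a box that spans columns 10 to 24 and rows 5 to 15, the sketch returns the center (17.0, 10.0) and the radius 6.0.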

Collection of Pupil Boundary Points
A novel circular ring rays location (CRRL) method, based on a modified Starburst, is proposed for detecting pupil boundary points. The proposed method has the following advantages over the original Starburst method. First, a series of rays is shot from the inner circular ring to the outer circular ring, instead of from a guessed point to the detected points. In the original Starburst, a second round of ray shooting is needed to collect more pupil contour candidates. In the proposed method, a single round of ray shooting collects enough pupil boundary points to fit an ellipse, which shortens the collection of pupil boundary points. This style of ray shooting also saves calculation time because the rays are shorter than those in the original Starburst method. Second, the original Starburst uses RANSAC to separate pupil contour points (inliers) from interference points (outliers), which is time-consuming. Instead, we calculate in advance the gradient amplitude at pixels near the pupil boundary from the gray values of the pupil and iris regions. A threshold of gradient amplitude is then set to detect pupil boundary points, and the number of pupil boundary points detected on each ray is counted to eliminate interference points. The experimental results show that this elimination of interference points is suitable and effective in the CRRL method. Third, cubic spline interpolation is applied in the neighborhood of the collected pupil boundary points to determine subpixel-precise pupil boundary points, which improves the accuracy of pupil center location.
Collection steps of pupil boundary points are presented in detail below: Input: Gray-level eye image. Output: Point set of pupil boundary points.
Step 1: Building of circular ring area. As shown in Figure 6, to build a circular ring area that contains the pupil boundary, the estimated pupil center $o'_p$ is taken as the center of the inner and outer rings (green lines), with radii $0.5r'_p$ and $1.5r'_p$, respectively.
Step 2: Location of pupil boundary points. 36 rays (with an equal gap of 10°) are shot from the inner to the outer circular ring. The gradient $\nabla f = [g_x \;\; g_y]^T$ is calculated at each pixel location $(x, y)$ along the shooting direction of each ray, and $M(x, y) = \sqrt{g_x^2 + g_y^2}$ is calculated as the gradient amplitude. According to the variation range of the gradient amplitude near the pupil contour, the threshold of gradient amplitude is set in advance as the range $[1.3\delta, 1.5\delta]$ ($\delta = T_2 - T_1$) to select pupil boundary points. If the gradient amplitude at pixel location $(x, y)$ along the shooting direction is within this range, pixel $(x, y)$ is recorded as a pupil boundary point. The pixels matching the threshold on each ray are counted.
Step 3: Elimination of interference points. When interference factors (glints and natural light reflection) are located on or around pupil contour, number of pixels matching threshold δ on the ray may be more than 1. In this case, all boundary points recorded on the ray are eliminated to avoid interference caused by glints and natural light reflection.
Step 4: Subpixel-precise location of pupil boundary points. To enhance the location accuracy of pupil boundary points, cubic spline interpolation [26] is applied in the neighborhood of the pupil boundary points collected in Step 2 to determine subpixel-precise pupil boundary points.
Step 5: Mark of pupil boundary points. As shown in Figure 6, determined pupil boundary points are marked with yellow "+". All the determined candidates of pupil boundary points are collected into one point set for ellipse fitting.
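Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under several assumptions: the image is a list of rows of gray levels, the gradient is approximated by forward differences, each ray is sampled at unit radial steps and rounded to the nearest pixel, and the accepted gradient amplitude range `(lo, hi)` is supplied by the caller (in the text it is $[1.3\delta, 1.5\delta]$). The subpixel refinement of Step 4 is not included.

```python
import math


def gradient_amplitude(img, x, y):
    """Forward-difference approximation of M(x, y) = sqrt(gx^2 + gy^2)."""
    gx = img[y][x + 1] - img[y][x]
    gy = img[y + 1][x] - img[y][x]
    return math.hypot(gx, gy)


def crrl_boundary_points(img, center, radius, lo, hi, n_rays=36):
    """Collect at most one boundary point per ray inside the circular ring.

    Rays run from 0.5 * radius to 1.5 * radius around `center`.
    A ray with zero or several matching pixels contributes no point,
    which discards rays disturbed by glints or reflections.
    """
    cx, cy = center
    h, w = len(img), len(img[0])
    r_in, r_out = 0.5 * radius, 1.5 * radius
    points = []
    for k in range(n_rays):
        theta = math.radians(k * 360.0 / n_rays)
        hits, last = [], None
        for s in range(int(r_out - r_in) + 1):
            r = r_in + s
            x = round(cx + r * math.cos(theta))
            y = round(cy + r * math.sin(theta))
            # Skip repeated pixels and pixels without a forward neighbor.
            if (x, y) == last or not (0 <= x < w - 1 and 0 <= y < h - 1):
                continue
            last = (x, y)
            if lo <= gradient_amplitude(img, x, y) <= hi:
                hits.append((x, y))
        if len(hits) == 1:
            points.append(hits[0])
    return points
```

Because each ray only covers the ring between the two radii, it is short, and the one-hit rule replaces the inlier/outlier separation that RANSAC performs in the original Starburst.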

Ellipse Fitting
Total least squares (TLS) [27,28] was first proposed in 1980. An improved total least squares method is developed in this paper to fit the collected pupil boundary points. Unlike the least squares (LS) method, total least squares takes the errors of both the independent and the dependent variables into account. In TLS, the matrix equation $Ax = b$ is solved by considering errors in both the data matrix $A$ and the observation vector $b$. To compensate for the errors in $A$ and $b$, a perturbation vector $e$ is used to perturb the observation vector $b$ and, simultaneously, a perturbation matrix $E$ is used to perturb the data matrix $A$. Both $e$ and $E$ are kept as small as possible.
Assuming the elliptic equation of the pupil is $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$, the constraint condition is set as $A + C = 1$ [29] in order to obtain higher fitting accuracy. The elliptic equation is then rewritten as Equation (8).
where $i = 1, 2, \cdots, n$ and $n$ is the number of pupil boundary points extracted. With the errors in pixel position $(x, y)$ defined as $(v_x, v_y)$, the ideal form of Equation (8) is given by Equation (9). Writing the system in matrix form as $M\tau = Y$, let the augmented matrix be $H = [-Y, M]$; its singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min}$ are calculated by SVD. According to the subspace interpretation of total least squares, the total least squares solution of the matrix equation $M\tau = Y$ is then deduced in terms of $\sigma_{\min}$, the minimal singular value of the augmented matrix $H$. Consequently, $\sigma_{\min}^2$ is the common variance of each component in the perturbation matrix $D = [-e, E]$.
Because the row of constants in the coefficient matrix $M$ cannot be considered in SVD, we propose an improved method for the SVD solution. New variables are introduced by setting $\alpha_{1i} = x_i y_i$, $\alpha_{2i} = y_i^2 \ldots$, and the coefficient $F$ is then expressed in terms of these variables. Substituting Equation (14) into Equation (12) gives a reduced system, and the total least squares solution of the matrix equation $\varepsilon = X\tau' - Z$ is described by Equation (16). The new augmented matrix is defined as $L = [-Z, X]$. To improve the fitting accuracy and stability of TLS, a novel and fast SVD solution method is used to obtain the singular values of matrix $L$. $L$ is written in SVD form in Equation (17).
Matrix $Q$ is then defined, and Equation (19) shows the product of two different rows $L_s$, $L_t$ of matrix $L$,
where $1 \le s \le 4$, $1 \le t \le 4$, $s \ne t$. The eigenvalue matrix $\Sigma_{st}$ is calculated, and the rows of matrix $L$ are redefined as $[L_s, L_t]\Delta_{st}$. An orthogonal transformation is applied to any two redefined rows of matrix $L$, so that the non-diagonal elements of matrix $Q$ are eliminated. The eigenvalue matrix of $Q$ is then solved, and $\tau' = [B \;\; C \;\; D \;\; E]^T$ is calculated according to Equation (16). The pupil center is obtained through Equation (22).
where $A = 1 - C$. The sensitivity of the TLS problem depends on the ratio $r = (\tilde{\sigma}_p - \sigma_{p+1}) / \tilde{\sigma}'_p$, where $\tilde{\sigma}_p$, $\sigma_{p+1}$ and $\tilde{\sigma}'_p$ are the smallest singular values of the matrices $X$ (or $M$), $L$ (or $H$) and $X_0$ (or $M_0$), the coefficient matrix in the corresponding LS problem, respectively. The larger the ratio $r$, the more accurate TLS is relative to LS. In the ellipse fitting of pupil boundary points, the ratios $r$ of the TLS problem solved by SVD and by improved SVD are 0.82 and 0.94, respectively, so the improved TLS achieves a higher accuracy than the original TLS. The improved TLS method compensates for errors in pixel location, and the fitting result is closer to the ideal form of the elliptic equation (Equation (9)). The result of ellipse fitting is shown in Figure 7. The red ellipse represents the fitted pupil contour, and the red "•" represents its center.
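The role of the constraint $A + C = 1$ and of the center formula can be illustrated with a short Python sketch. It deliberately swaps in ordinary least squares (normal equations solved by Gaussian elimination) for the improved total least squares of this section, so it does not compensate for errors in the pixel coordinates. Substituting $A = 1 - C$ gives one linear equation per boundary point, $B x_i y_i + C (y_i^2 - x_i^2) + D x_i + E y_i + F = -x_i^2$, and the center follows from setting both partial derivatives of the conic to zero. Centering the coordinates on their mean and all function names are assumptions made for illustration.

```python
def least_squares(rows, rhs):
    """Solve the normal equations (R^T R) p = R^T z by Gaussian elimination."""
    n = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * z for r, z in zip(rows, rhs))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= f * m[col][c]
    p = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * p[j] for j in range(i + 1, n))
        p[i] = (m[i][n] - s) / m[i][i]
    return p


def fit_ellipse_center(points):
    """Fit Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 with A + C = 1 (plain LS).

    Returns the center (x0, y0) and the coefficients (A, B, C, D, E, F),
    the latter expressed in coordinates centered on the mean of the points.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    rows, rhs = [], []
    for px, py in points:
        x, y = px - mx, py - my          # centering improves conditioning
        rows.append([x * y, y * y - x * x, x, y, 1.0])
        rhs.append(-x * x)
    B, C, D, E, F = least_squares(rows, rhs)
    A = 1.0 - C
    den = 4 * A * C - B * B              # positive for an ellipse
    x0 = (B * E - 2 * C * D) / den + mx
    y0 = (B * D - 2 * A * E) / den + my
    return (x0, y0), (A, B, C, D, E, F)
```

At least five boundary points in general position are needed. With noisy points, the plain LS estimate attributes all error to the right-hand side, which is the limitation that the TLS formulation is meant to remove.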

Glint Detection
Because the glint region contains only a small number of pixels and a halo surrounds the glint contour, the proposed pupil detection method is not suitable for glints. Improved Gaussian fitting is used instead to locate the glint center.

Rough Location of Glint Region
Because the illumination intensity of a glint is high and its gray levels are close to 255, a threshold of 240 is adopted in the binarization of the eye image to extract glints. A 2 × 2 square structure element is used in the opening-and-closing operation to filter the binary image. As shown in Figure 8, red rectangular boxes mark the rough glint regions. Figure 9a shows the enlarged glint region, and the 3D gray-level histogram of the enlarged glint is shown in Figure 9b. The glint's illumination intensity follows a Gaussian distribution [30].

Gaussian Fitting
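The gray-level distribution of the glint is modeled by the Gaussian function of Equation (23), in which $I(x, y)$ is the gray level of pixel $(x, y)$ in the glint region, $H$, the amplitude of the Gaussian distribution, is the highest gray level in the glint region, $(x_g, y_g)$ is the glint center to be calculated, and $\sigma_x$ and $\sigma_y$ are the standard deviations of the gray level in the horizontal and vertical directions, respectively. Taking the logarithm of Equation (23) and rearranging gives a second-order polynomial in $x$ and $y$ for $z = \ln I(x, y)$, in which $a = -1/(2\sigma_x^2)$ and $b = -1/(2\sigma_y^2)$ are the coefficients of $x^2$ and $y^2$, and $c$ and $d$ are the coefficients of $x$ and $y$. The glint center then follows as

$$x_g = -\frac{c}{2a}, \qquad y_g = -\frac{d}{2b} \qquad (25)$$

Figure 10 shows the detection result of the glint center (marked with a green "+").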
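The logarithmic deformation can be illustrated with a short Python sketch. It assumes an axis-aligned Gaussian, so that $z = \ln I(x, y)$ is a polynomial $a x^2 + b y^2 + c x + d y + f$ (the name $f$ for the constant term is chosen here for illustration), and it solves for the coefficients by ordinary least squares rather than by the improved total least squares used in this paper. Coordinates are relative to the top-left corner of the rough glint region.

```python
import math


def least_squares(rows, rhs):
    """Solve the normal equations (R^T R) p = R^T z by Gaussian elimination."""
    n = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * z for r, z in zip(rows, rhs))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= f * m[col][c]
    p = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * p[j] for j in range(i + 1, n))
        p[i] = (m[i][n] - s) / m[i][i]
    return p


def glint_center(region):
    """Estimate (x_g, y_g) from the gray levels of a rough glint region.

    region: list of rows of positive gray levels; non-positive pixels are
    skipped because their logarithm is undefined.
    """
    rows, rhs = [], []
    for y, line in enumerate(region):
        for x, g in enumerate(line):
            if g > 0:
                rows.append([x * x, y * y, x, y, 1.0])
                rhs.append(math.log(g))
    a, b, c, d, _ = least_squares(rows, rhs)
    return -c / (2 * a), -d / (2 * b)     # Equation (25)
```

The region must contain at least three distinct columns and three distinct rows for the five coefficients to be determined. Saturated pixels (gray level 255) flatten the top of the Gaussian, so in practice they may need to be excluded or down-weighted before fitting.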

Pupil Detection of Single Subject
The process of pupil detection is shown in Figure 11. Figure 11a–d shows four original eye images, acquired from a single subject, with different relative positions of the pupil and glints; $a_1$–$d_1$ show the binary eye images obtained with the improved Otsu optimal threshold; $a_2$–$d_2$ show the results of the opening-and-closing operation with a 5 × 5 square structure element; $a_3$–$d_3$ show the extracted pupil boundary points (marked with yellow "+"); $a_4$–$d_4$ show the pupil fitting results (red ellipses). The center of the fitted pupil contour is marked with a red "•". Table 2 shows the parameters of pupil detection, including the thresholds $T_1$ and $T_2$, the rough pupil center and the final fitted pupil center.

Pupil Detection of Different Subjects
To verify the applicability of the proposed circular ring rays location (CRRL) method, original eye images of another four subjects are acquired. The experimental results are shown in Figure 12. In Section 2.2.1, a larger structure element is set for the closing operation than for the opening operation. For subjects with heavy eyelashes and eyelids, using different structure element sizes in the opening-and-closing operation ensures the complete elimination of remnant interference factors caused by eyelashes and eyelids. Table 3 shows the parameters of pupil detection, including the thresholds $T_1$ and $T_2$, the rough pupil center and the final fitted pupil center.

Glint Detection
Glint detection is implemented for Figures 11a-d and 12a-d. The process of detection is shown in Figure 13.

Stability and Error
To evaluate the stability and accuracy of the proposed method, 105 eye images of each subject are acquired for pupil and glint detection. The stability, RMS error and processing time of the proposed method are shown in Table 5. As a reference, the stability, RMS error and processing time of the detection methods in [13,20–22] are also listed in Table 5. As the experimental results in Table 5 show, the stability, accuracy and real-time quality of the proposed method are better than those of the methods in [13,20–22].

Conclusions
A novel and robust method of pupil and glint detection using a wearable camera sensor and a near-infrared LED array for gaze tracking systems is proposed in this paper. A circular ring rays location (CRRL) method is proposed for the detection of pupil boundary points. An improved Otsu method is proposed for threshold segmentation. The experimental results show that the segmentation time of the improved method is less than that of the original Otsu method, which contributes to the real-time quality of eye gaze tracking. The size of the gradient amplitude and the number of matching pixels on each ray are employed to eliminate interference factors. To compensate for errors of pupil boundary points in the horizontal and vertical directions, improved total least squares is developed to fit the ellipse. The experimental results show that the improved total least squares has a higher accuracy than the original total least squares in pupil ellipse fitting. To locate the glint more accurately, improved total least squares is used to solve the deformed Gaussian function and calculate the glint center. The experimental results show that the stability, accuracy and real-time quality of the proposed method are better than those of existing methods for pupil and glint detection. When interference factors such as glints and natural light reflection are located near the pupil boundary, the resulting interference points can be eliminated quickly and effectively. The proposed method contributes to the enhancement of the stability, accuracy and real-time quality of gaze tracking systems.