# Multi-User Identification-Based Eye-Tracking Algorithm Using Position Estimation

## Abstract

The proposed algorithm improves the F_{1} score by up to 0.490 compared with benchmark algorithms.

## 1. Introduction

## 2. Proposed Algorithm

#### 2.1. Pre-Processing Module

In the bilinear interpolation, each new pixel is a weighted average of its nearest original neighbors:

I_{x+1/2,y} = λ_{1}(I_{x,y} + I_{x+1,y}), I_{x,y+1/2} = λ_{2}(I_{x,y} + I_{x,y+1}), I_{x+1/2,y+1/2} = λ_{3}(I_{x,y} + I_{x+1,y} + I_{x,y+1} + I_{x+1,y+1}),

where λ_{1}, λ_{2}, and λ_{3} denote the horizontal, vertical, and diagonal weights (which are 0.5, 0.5, and 0.25, respectively), and I_{x+1/2,y}, I_{x,y+1/2}, and I_{x+1/2,y+1/2} denote the horizontal, vertical, and diagonal interpolated pixels, respectively.
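As a concrete illustration, a minimal NumPy sketch of this 2× bilinear upscaling might look as follows (assuming a grayscale image; midpoints are inserted between existing pixels, so the output is 2N − 1 per axis, and boundary extrapolation is omitted):

```python
import numpy as np

# Horizontal, vertical, and diagonal weights from the text.
L1, L2, L3 = 0.5, 0.5, 0.25

def upscale_2x(img):
    """Double the resolution of a grayscale image by bilinear interpolation.

    New pixels are weighted averages of their nearest original neighbors:
    horizontal/vertical midpoints combine two pixels with weight 0.5,
    diagonal midpoints combine four pixels with weight 0.25.
    """
    h, w = img.shape
    out = np.zeros((2 * h - 1, 2 * w - 1), dtype=float)
    out[0::2, 0::2] = img                                 # original pixels
    out[0::2, 1::2] = L1 * (img[:, :-1] + img[:, 1:])     # I_{x+1/2, y}
    out[1::2, 0::2] = L2 * (img[:-1, :] + img[1:, :])     # I_{x, y+1/2}
    out[1::2, 1::2] = L3 * (img[:-1, :-1] + img[:-1, 1:]  # I_{x+1/2, y+1/2}
                            + img[1:, :-1] + img[1:, 1:])
    return out
```

For example, upscaling the 2 × 2 block [[0, 2], [4, 6]] produces a 3 × 3 block whose center pixel is 0.25 × (0 + 2 + 4 + 6) = 3.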

#### 2.2. Face-Detection Module
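The cascading structure shown in Figure 5 can be sketched generically: each stage is a cheap test that rejects most non-face windows, and only windows that pass every stage are reported as faces. The `score_fn`/`threshold` pairs below are hypothetical placeholders, not the paper's actual classifiers:

```python
def cascade_classify(window, stages):
    """Run one candidate window through a cascade of classifier stages.

    `stages` is a list of (score_fn, threshold) pairs ordered from cheap to
    expensive; a window is rejected at the first stage it fails, so most
    non-face windows exit early and only likely faces reach later stages.
    """
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False  # rejected: window fails this stage
    return True           # accepted: window passed every stage
```

In a real detector, the stages would be boosted classifiers over features such as LBP or Haar-like responses [5,7], evaluated at every window position and scale.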

#### 2.3. User-Classification Module

The horizontal and vertical gradients, G_{x} and G_{y}, are first computed for I_{F}, where I_{F} denotes a detected face block. Using these gradients, the HOGs of magnitude and orientation for each pixel are generated as follows:

m_{x,y} = √(G_{x}^{2} + G_{y}^{2}), θ_{x,y} = tan^{−1}(G_{y}/G_{x}),

where m_{x,y} and θ_{x,y} denote the magnitude and orientation of the pixel, respectively. Histograms of the two properties are generated, and the histograms of several blocks are combined into one feature vector. The feature vector is then classified with a support vector machine (SVM) [13], which maximally separates the classes, thereby assigning the exact class to the input face.
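A minimal NumPy sketch of the per-pixel magnitude and orientation histogram is shown below; central-difference gradients and nine unsigned orientation bins are illustrative choices, not the paper's stated settings:

```python
import numpy as np

def hog_histogram(block, n_bins=9):
    """Compute a gradient-orientation histogram for one face block.

    m_{x,y} and theta_{x,y} are the per-pixel gradient magnitude and
    orientation; the histogram accumulates magnitude into orientation bins.
    """
    gx = np.zeros_like(block, dtype=float)
    gy = np.zeros_like(block, dtype=float)
    gx[:, 1:-1] = block[:, 2:] - block[:, :-2]    # central differences
    gy[1:-1, :] = block[2:, :] - block[:-2, :]
    mag = np.hypot(gx, gy)                        # m_{x,y}
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # theta_{x,y}, unsigned
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    return hist
```

Histograms from several blocks would then be concatenated into one feature vector and passed to the SVM. On a pure horizontal intensity ramp, all gradient energy falls into the first (0°) orientation bin.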

#### 2.4. Three-Dimensional Eye-Position Extraction Module

where x_{i} and y_{i} denote an initial pixel point in the detected facial region, α and β denote the horizontal and vertical offsets, respectively, I_{max} and I_{depth} denote the maximum intensity level and the intensity level of the detected face, respectively, and d_{max} denotes the real maximum distance. Using these parameters, the final left and right eye positions are computed as follows:
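As an illustration of how these parameters could combine, the sketch below assumes that the viewing distance scales linearly with the face's depth intensity and that the eyes sit at symmetric offsets from the initial point; both assumptions are hypothetical placeholders, not the paper's stated formulas:

```python
def eye_positions(x_i, y_i, alpha, beta, i_depth, i_max=255, d_max=3.5):
    """Hypothetical 3-D eye-position estimate from the listed parameters.

    Assumed (not taken from the paper): distance scales linearly with the
    face's depth intensity I_depth, and the eyes lie at symmetric horizontal
    offsets (alpha) and a common vertical offset (beta) from (x_i, y_i).
    """
    z = d_max * i_depth / i_max  # assumed linear depth-to-distance mapping
    left = (x_i - alpha, y_i + beta, z)
    right = (x_i + alpha, y_i + beta, z)
    return left, right
```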

## 3. Simulation Results

To evaluate detection accuracy, we measured the precision, recall, and F_{1} scores [14,15], which are derived as follows:

precision = TP/(TP + FP), recall = TP/(TP + FN), F_{1} = 2 × (precision × recall)/(precision + recall),

where TP, FP, and FN denote the numbers of true positives, false positives, and false negatives, respectively. From the precision and recall, the F_{1} score was calculated, for which a value of one indicates perfect accuracy. For the test sequences, we used several sequences at different distances (ranging from 1 m to 3.5 m) between the camera and multiple users.

Table 2 lists the average F_{1} score, combining precision and recall at different distances. In terms of precision, the total averages of benchmark Algorithms 1, 2, and 3 were 0.669, 0.849, and 0.726, respectively, whereas the proposed algorithm achieved a perfect score of 1.000. In terms of recall, the total averages of benchmark Algorithms 1, 2, and 3 were 0.988, 0.993, and 0.738, whereas the proposed algorithm achieved 0.988. Accordingly, the average F_{1} score of the proposed algorithm was up to 0.294, 0.151, and 0.490 higher than those of Algorithms 1, 2, and 3, respectively, meaning that its detection accuracy exceeded that of every benchmark. Figure 8 shows the same trend: the precision and recall values of the proposed algorithm are higher than those of the benchmark algorithms. This is because the proposed algorithm accurately separates the foreground from the background by using several cascade classifiers after calibrating the RGB and depth images.
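The reported F_{1} values can be spot-checked directly from the precision and recall entries in Table 1:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# "Random" row of Table 1: proposed algorithm (P = 1.000, R = 0.994)
# versus Algorithm 1 (P = 0.544, R = 0.994).
print(round(f1_score(1.000, 0.994), 3))  # 0.997, matching the reported value
print(round(f1_score(0.544, 0.994), 3))  # 0.703, matching the reported value
```

The difference of 0.294 for the "Random" sequences follows directly: 0.997 − 0.703 = 0.294.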

## 4. Conclusions

Simulation results showed that the proposed algorithm achieved an F_{1} score up to 0.490 higher than those of the benchmark algorithms.

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

- Lopez-Basterretxea, A.; Mendez-Zorrilla, A.; Garcia-Zapirain, B. Eye/Head Tracking Technology to Improve HCI with iPad Applications. *Sensors* **2015**, 15, 2244–2264.
- Lee, J.W.; Heo, H.; Park, K.R. A Novel Gaze Tracking Method Based on the Generation of Virtual Calibration Points. *Sensors* **2013**, 13, 10802–10822.
- Chen, Y.-S.; Su, C.-H.; Chen, J.-H.; Chen, C.-S.; Hung, Y.-P.; Fuh, C.-S. Video-based eye tracking for autostereoscopic displays. *Opt. Eng.* **2001**, 40, 2726–2734.
- Li, L.; Xu, Y.; Konig, A. Robust depth camera based multi-user eye tracking for autostereoscopic displays. In Proceedings of the 9th International Multi-Conference on Systems, Signals & Devices, Chemnitz, Germany, 20–23 March 2012.
- Ojala, T.; Pietikainen, M.; Mäenpää, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. *IEEE Trans. Pattern Anal. Mach. Intell.* **2002**, 24, 971–987.
- Bilaniuk, O.; Fazl-Ersi, E.; Laganiere, R.; Xu, C.; Laroche, D.; Moulder, C. Fast LBP face detection on low-power SIMD architectures. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28 June 2014.
- Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001.
- Jain, A.; Bharti, J.; Gupta, M.K. Improvements in OpenCV's Viola-Jones algorithm in face detection: tilted face detection. *Int. J. Signal Image Proc.* **2014**, 5, 21–28.
- Kang, S.-J.; Jeong, Y.-W.; Yun, J.-J.; Bae, S. Real-time eye tracking technique for multiview 3D systems. In Proceedings of the 2016 IEEE International Conference on Consumer Electronics, Las Vegas, NV, USA, 8–11 January 2016.
- Lehmann, T.M.; Gonner, C.; Spitzer, K. Survey: Interpolation methods in medical image processing. *IEEE Trans. Med. Imaging* **1999**, 18, 1049–1075.
- Crow, F. Summed-area tables for texture mapping. *ACM SIGGRAPH Comput. Graph.* **1984**, 18, 207–212.
- Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–26 June 2005.
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. *Int. J. Comput. Vis.* **2004**, 60, 91–110.
- Kang, S.-J.; Cho, S.I.; Yoo, S.; Kim, Y.H. Scene change detection using multiple histograms for motion-compensated frame rate up-conversion. *J. Disp. Technol.* **2012**, 8, 121–126.
- Yang, Y.; Liu, X. A re-examination of text categorization methods. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, CA, USA, 15–19 August 1999; pp. 42–49.

**Figure 1.** Various examples in a vehicle application: (**a**) a drowsiness warning system; and (**b**) an interface control system using multi-user eye tracking.

**Figure 4.** Pixel arrangement in the bilinear interpolation algorithm when the input image resolution is doubled.

**Figure 5.** Concept for the cascading structure of the face-detection module in the proposed algorithm.

**Figure 8.** The data distribution of the precision-recall graph for the proposed and benchmark algorithms.

**Figure 9.** Comparing the detection accuracy of the proposed and benchmark algorithms at a distance of 2.5 m from the RGB and depth cameras (top: RGB image; bottom: depth image): (**a**) Algorithm 1; (**b**) Algorithm 2; (**c**) Algorithm 3; and (**d**) proposed algorithm.

**Figure 10.** Comparing the detection accuracy of the proposed and benchmark algorithms at a distance of 3.5 m from the RGB and depth cameras (top: RGB image; bottom: depth image): (**a**) Algorithm 1; (**b**) Algorithm 2; (**c**) Algorithm 3; and (**d**) proposed algorithm.

**Table 1.** Average precision and recall values for the proposed and benchmark algorithms at different distances.

| Distance (m) | Algorithm 1 Precision | Algorithm 1 Recall | Algorithm 2 Precision | Algorithm 2 Recall | Algorithm 3 Precision | Algorithm 3 Recall | Proposed Precision | Proposed Recall |
|---|---|---|---|---|---|---|---|---|
| 1.000 | 0.741 | 0.981 | 0.877 | 0.991 | 0.730 | 0.619 | 1.000 | 0.981 |
| 1.500 | 0.573 | 0.985 | 0.732 | 0.991 | 0.493 | 0.514 | 1.000 | 0.985 |
| 2.000 | 0.637 | 0.975 | 0.833 | 0.981 | 0.825 | 0.789 | 1.000 | 0.975 |
| 2.500 | 0.664 | 1.000 | 0.853 | 1.000 | 0.713 | 0.938 | 1.000 | 1.000 |
| 3.000 | 0.717 | 0.991 | 0.886 | 1.000 | 0.824 | 0.828 | 1.000 | 0.991 |
| 3.500 | 0.806 | 0.991 | 0.972 | 0.995 | 0.708 | 0.800 | 1.000 | 0.991 |
| Random | 0.544 | 0.994 | 0.792 | 0.994 | 0.792 | 0.677 | 1.000 | 0.994 |

**Table 2.** Average F_{1} scores for the proposed and benchmark algorithms at different distances, with each benchmark's difference from the proposed algorithm.

| Distance (m) | Algorithm 1 F_{1} Score | Algorithm 1 Difference | Algorithm 2 F_{1} Score | Algorithm 2 Difference | Algorithm 3 F_{1} Score | Algorithm 3 Difference | Proposed F_{1} Score |
|---|---|---|---|---|---|---|---|
| 1.000 | 0.844 | −0.147 | 0.931 | −0.060 | 0.674 | −0.320 | 0.991 |
| 1.500 | 0.725 | −0.268 | 0.842 | −0.151 | 0.503 | −0.490 | 0.993 |
| 2.000 | 0.771 | −0.216 | 0.901 | −0.086 | 0.807 | −0.180 | 0.987 |
| 2.500 | 0.798 | −0.202 | 0.921 | −0.079 | 0.811 | −0.189 | 1.000 |
| 3.000 | 0.832 | −0.164 | 0.939 | −0.057 | 0.826 | −0.170 | 0.996 |
| 3.500 | 0.889 | −0.107 | 0.983 | −0.013 | 0.751 | −0.245 | 0.996 |
| Random | 0.703 | −0.294 | 0.882 | −0.115 | 0.731 | −0.266 | 0.997 |

**Table 3.** Detection numbers and ratios for each of the three faces at different distances.

| Distance (m) | Face 1 Detection Number | Face 1 Detection Ratio | Face 2 Detection Number | Face 2 Detection Ratio | Face 3 Detection Number | Face 3 Detection Ratio |
|---|---|---|---|---|---|---|
| 1.000 | 70/70 | 1.000 | 68/70 | 0.970 | 70/70 | 1.000 |
| 1.500 | 70/70 | 1.000 | 69/70 | 0.980 | 69/70 | 0.980 |
| 2.000 | 64/68 | 0.940 | 67/68 | 0.980 | 68/68 | 1.000 |
| 2.500 | 70/70 | 1.000 | 70/70 | 1.000 | 70/70 | 1.000 |
| 3.000 | 70/70 | 1.000 | 70/70 | 1.000 | 70/70 | 1.000 |
| 3.500 | 69/70 | 0.980 | 70/70 | 1.000 | 70/70 | 1.000 |
| Random | 89/90 | 0.990 | 87/90 | 0.970 | 90/90 | 1.000 |

© 2016 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

Kang, S.-J. Multi-User Identification-Based Eye-Tracking Algorithm Using Position Estimation. *Sensors* **2017**, *17*, 41. https://doi.org/10.3390/s17010041