Proceeding Paper

Real-Time Head Orientation and Eye-Tracking Algorithm Using Adaptive Feature Extraction and Refinement Mechanisms †

Ming-Chang Ye and Jian-Jiun Ding
Graduate Institute of Communication Engineering, National Taiwan University, Taipei 106319, Taiwan
*
Author to whom correspondence should be addressed.
Presented at the 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering, Yunlin, Taiwan, 15–17 November 2024.
Eng. Proc. 2025, 92(1), 43; https://doi.org/10.3390/engproc2025092043
Published: 30 April 2025
(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)

Abstract

We propose a fast eye-tracking method that processes a depth image and a gray-scale infrared (IR) image using traditional image processing algorithms. Given an IR image containing one face and the corresponding depth image, the method locates the real-world coordinates of the eyes relative to the camera at a high speed (>90 frames per second) and with an acceptable error. The method takes advantage of the depth information to quickly locate the face and shrink the search region for the eyeballs, which decreases the error rate while accelerating the operation. After the face region is found, less complicated computer vision algorithms can be applied at a high execution speed. Refinement mechanisms for extracting features and analyzing the edge distribution are used to locate the eyeballs' positions and to transform the pixel coordinates of the image into real-world coordinates.

1. Introduction

Eye tracking is used to find the positions of the eyeballs. The eye-tracking method is designed for applications that use the coordinate (x, y, z) of the eyes [1]. The coordinate is determined immediately after the camera captures an image containing the eyes and is updated for every frame. The frame rate of the camera is therefore the lower bound of the required execution speed. The camera used in this study captures frames at a rate of 90 images per second, while the proposed method processes 270–320 frames per second, about three times faster than the camera's capture rate.
The flow chart and the visualization of the method in this study are shown in Figure 1 and Figure 2. The output three-dimensional coordinate of the sample images was (1.43514, 4.08796, 58.1) cm. The method achieves a high execution speed by shrinking the image in the preprocessing step. Because the rotation of the face and the presence of a mask affect the detection of the eyes, a head-state classification is used to check the orientation of the head and whether a mask is worn. Finally, the positions of the eyes on the shrunken image are determined by detection and postprocessing, and the midpoint of the detected pair is transformed into the coordinate system set by the camera.

2. Method

2.1. Input Images

In the method, two input images were captured at a speed of 90 frames per second. The pixels of the depth image ranged from 0 to 65,535 (16 bits), and the gray-scale infrared image had a range of 0–255. Because the two images were captured at the same resolution and at the same time, an object appeared at the same position and with the same size in both images. Therefore, if the midpoint of the eyeballs is at (r, c) on the image, the depth value at (r, c) can be used to compute the 3D coordinate. This property enables the face to be located quickly by decreasing the detection region during image preprocessing.
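As a minimal sketch of this property, the following C++ snippet back-projects the eye midpoint (r, c) of an aligned depth/IR pair to camera coordinates with a pinhole model (the model used later in Section 2.5). The intrinsics fx, fy, cx, cy and the millimeter depth unit are assumptions of this sketch, not values reported in the paper.

```cpp
#include <cstdint>
#include <vector>

// Back-project a pixel (r, c) of the aligned depth image to a 3D point in
// the camera coordinate system. The intrinsics and the depth unit (mm) are
// illustrative assumptions.
struct Point3D { double x, y, z; };

Point3D pixelToCamera(const std::vector<uint16_t>& depth, // 16-bit depth image, row-major
                      int width, int r, int c,
                      double fx, double fy, double cx, double cy)
{
    double z = static_cast<double>(depth[r * width + c]); // depth at (r, c)
    double x = (c - cx) * z / fx;                         // pinhole back-projection
    double y = (r - cy) * z / fy;                         // image convention: y downward
    return { x, y, z };
}
```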

2.2. Preprocessing

The size of the input image was 848 columns by 480 rows, which was too large for the system to achieve a high execution speed. Apart from the speed, the error rate had to be reduced: by highlighting the facial region, confusing objects such as the background or the user's clothes were excluded. Figure 3 shows the flow chart of image preprocessing. The region of interest (ROI) contained the facial region, and other parameters such as the average intensity were also recorded (Table 1). The eye-tracking method assumes that the distance between the camera and the user is less than 700 mm, so the thresholding was defined as (1).
$F_0 = (D < 800)$    (1)
Then, binary opening and closing were used to remove noise and fill holes in the image, as in (2) and (3).
$F_0 \leftarrow (F_0 \ominus k_o) \oplus k_o$    (2)
$F_0 \leftarrow (F_0 \oplus k_c) \ominus k_c$    (3)
We then searched the image as shown in Figure 4 to determine the ROI and the other outputs; the resulting face image is denoted F and has M rows and N columns (Figure 5).
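The following OpenCV-based C++ sketch illustrates Eqs. (1)–(3); the paper does not name a library, so the OpenCV calls, the 5×5 kernel size, and the use of a simple bounding box for the ROI (instead of the row-limited search of Procedure 1 in Table 1) are assumptions of this illustration.

```cpp
#include <opencv2/opencv.hpp>

// Preprocessing sketch: depth threshold (Eq. (1)), binary opening (Eq. (2))
// and closing (Eq. (3)), then a rough face ROI from the resulting mask.
cv::Rect preprocess(const cv::Mat& depth16u, cv::Mat& F0)
{
    cv::Mat nearMask = depth16u < 800;                        // F0 = (D < 800), 8-bit mask
    cv::Mat k = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Mat opened;
    cv::morphologyEx(nearMask, opened, cv::MORPH_OPEN, k);    // remove small noise
    cv::morphologyEx(opened, F0, cv::MORPH_CLOSE, k);         // fill small holes
    return cv::boundingRect(F0);                              // bounding box of the face mask
}
```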

2.3. Head State Classification

The system determines whether the face is rotated and whether a mask is worn, since either state changes how the detection is conducted. If the head is tilted, the pair of eyeballs does not lie at the same horizontal position; therefore, it is necessary to check the state of the face before detecting the eyeballs. Each type of state is represented by a state variable. The names and brief descriptions of the state variables are listed in Table 2. The variables pitch, roll, and yaw are of type "char", and the variable mask is of type "bool". Figure 6 shows examples of four head states.

2.3.1. Roll

In step 7 of Procedure 1 in Table 1, a tilting angle θ was determined; rotating the image by −θ makes the head look as if it were not tilted. Figure 7 shows the result of aligning the orientation of the head with the vertical axis by this rotation.
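A minimal sketch of this step is given below: PCA is applied to the sampled face-pixel coordinates P (step 7 of Procedure 1), and the angle between the principal axis and the vertical (row) axis is taken as the tilting angle θ. The plain 2×2 eigen-analysis used here in place of a library PCA call is an implementation choice of this sketch.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Estimate the roll (tilt) angle of the face from the sampled coordinate
// set P via the principal axis of its 2x2 covariance matrix.
double tiltAngle(const std::vector<std::pair<double, double>>& P) // (row, col) pairs
{
    double mr = 0.0, mc = 0.0;
    for (const auto& p : P) { mr += p.first; mc += p.second; }
    mr /= P.size();
    mc /= P.size();

    double srr = 0.0, scc = 0.0, src = 0.0;                   // covariance entries
    for (const auto& p : P) {
        double dr = p.first - mr, dc = p.second - mc;
        srr += dr * dr; scc += dc * dc; src += dr * dc;
    }
    // Orientation of the major axis of the 2x2 covariance matrix, measured
    // from the row (vertical) axis; rotate the face by -theta to cancel roll.
    return 0.5 * std::atan2(2.0 * src, srr - scc);
}
```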

2.3.2. Pitch-Up, Yaw, and Mask

We checked whether the head was lifted or turned and whether a mask was worn using the edge distribution of the lower part of F. Although wearing a mask is not a rotation, it is grouped with these rotation states because it also affects the edge distribution on the bottom part of the face.
To obtain the edges of F, we applied the Sobel operator (4) to F [3] and computed the edge strength using (5), where I is the gray-scale image and S is the integer-valued edge-strength map. In (6), Eg is the edge distribution represented as a binary image, and p80 is the 80th percentile of S, found with a histogram rather than sorting, which would require at least O(n log n) time. The 1-pixels of Eg therefore represent the top 20% most-likely-to-be-edge pixels.
$G_X = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * I, \qquad G_Y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} * I$    (4)
$S(r, c) = \mathrm{round}\left(\sqrt{G_X(r, c)^2 + G_Y(r, c)^2}\right)$    (5)
$E_g = (S \geq p_{80})$    (6)
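A sketch of Eqs. (4)–(6) is shown below: the Sobel gradient magnitude is rounded to integers, and the 80th percentile is found by walking a histogram instead of sorting. The OpenCV calls are an assumption of this sketch.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Compute the binary edge map Eg that keeps the top ~20% strongest edges.
cv::Mat edgeMask(const cv::Mat& gray8u)
{
    cv::Mat gx, gy, mag, S;
    cv::Sobel(gray8u, gx, CV_32F, 1, 0, 3);                  // G_X
    cv::Sobel(gray8u, gy, CV_32F, 0, 1, 3);                  // G_Y
    cv::magnitude(gx, gy, mag);                              // sqrt(G_X^2 + G_Y^2)
    mag.convertTo(S, CV_32S);                                // round to integer edge strength

    double maxVal = 0.0;
    cv::minMaxLoc(S, nullptr, &maxVal);
    std::vector<int> hist(static_cast<int>(maxVal) + 1, 0);  // histogram of edge strengths
    for (int r = 0; r < S.rows; ++r)
        for (int c = 0; c < S.cols; ++c)
            ++hist[S.at<int>(r, c)];

    // Walk the histogram until 80% of the pixels lie at or below the bin: p80.
    const int total = S.rows * S.cols;
    int seen = 0, p80 = 0;
    for (int v = 0; v < static_cast<int>(hist.size()); ++v) {
        seen += hist[v];
        if (seen >= static_cast<int>(0.8 * total)) { p80 = v; break; }
    }
    return S >= p80;                                         // Eg: binary mask of strong edges
}
```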
The edge distributions of the lower part of different cases are shown in Table 3.
When the mask was not worn and the head was not lifted, there were "thick" edges at the bottom part of the image because the mouth and nose are there. When the mask was not worn and the head was lifted, there were no edges in the bottommost region, but several edges of the mouth or nose were visible below the middle of the image. When the mask was worn and the head was not rotated, a horizontal edge was found across the face, caused by the border between the mask and the face, and the number of edges in the lower part was significantly smaller than that in the upper part. When the mask was worn and the head was turned, a thin edge was visible on the border between the mask and the cheek. When the mask was worn and the head was lifted, no visible edges were observed except in the side regions. The edges of the mouth are thick in the x direction, whereas the vertical edge between the mask and the cheek is thinner; the thin edges were excluded by simply opening the edge image with a one-row kernel.
In the method, we determined the three state variables by Procedure 2 in Table 4 using the above principle.
Steps 1 and 2 keep only the edges on the mask or the face and remove the edge between the background and the user. Steps 3 to 5 check whether the bottom of the face contains thick edges. If so, the system judges that the user does not wear a mask and does not lift the head, and then determines from the position of the thick edges whether the head is turned. If there are no or few thick edges, either the head is lifted or a mask is worn, and steps 6 to 8 are applied to judge the status of the face by examining edges close to the center and located in the bottom half of F. The quantity ub in step 6 indicates whether the number of edges in the upper part is significantly larger than that in the lower part.
In steps 6 and 7, the standard deviations of the y coordinates of all edges and of the x coordinates of the thick edges are calculated; they measure the dispersion of the edge pixels in the y and x directions and are used in step 8.
State classification is then performed according to the criteria in Table 5. The edges of the mouth and the nose are concentrated compared with the horizontal edge of the mask. Therefore, if the standard deviation of the x coordinates is small, the edges are relatively concentrated, and the system judges that a face without a mask was lifted.
When the standard deviation of the x coordinates is large, the dispersion of the y coordinates is examined. When the mask is worn and the face is not turned, the vertical edge of the mask is invisible; although the edges spread along the x-axis, they are concentrated along the y-axis. When the face is turned, on the other hand, the vertical edge is visible and the edges spread over the entire y-axis.
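The decision logic behind steps 6–8 of Procedure 2 (Table 4) can be sketched as follows; the thresholds (0.125N, 0.1M, 0.8) are those listed in the table, while the function signature and the comparison of μc against L and R are simplifications of this illustration rather than the authors' exact code.

```cpp
#include <cmath>

// Classify mask/pitch/yaw from the spread of the detected edge pixels.
// sigmaTc: std. dev. of the x (column) coordinates of the thick edges,
// sigmaR : std. dev. of the y (row) coordinates of all edges,
// ub     : normalized difference between upper- and lower-half edge counts,
// muC    : mean column of the edges; L and R are the landmarks of Table 4;
// M, N   : numbers of rows and columns of F.
struct HeadState { bool mask; char pitch; char yaw; };

HeadState classifyStep8(double sigmaTc, double sigmaR, double ub,
                        double muC, double L, double R, int M, int N)
{
    HeadState s{ false, '\0', '\0' };
    if (sigmaTc > 0.125 * N && (sigmaR < 0.1 * M || ub > 0.8)) {
        s.mask = true;                                   // mask on, head not rotated
    } else if (sigmaTc > 0.125 * N) {
        s.mask = true;                                   // mask on, head turned
        s.yaw = (std::fabs(muC - L) < std::fabs(muC - R)) ? 'R' : 'L';
    } else {
        s.pitch = 'U';                                   // no mask, head lifted
    }
    return s;
}
```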

2.3.3. Pitch Down

To detect whether the head was pitched down, the edge distribution of the upper part of the face was examined. When a person pitches the head down, the hair occupies the upper part of the face region, so the number of edges in the upper part decreases.

2.4. Detection and Postprocessing

After size reduction and the determination of the face states, the system detected the eyeballs from F. The first step was to find Mids[i], the mid-column coordinate of the non-zero pixels in the i-th row. Then, the boundaries of the eye ROI were calculated in the four directions by Procedure 4 in Table 6, and the candidate map was obtained by thresholding using (7),
$E_0 = \mathrm{bitwise\_and}(F < thr,\ F' \ominus k_e)$    (7)
where F′ denotes the non-zero region of F and k_e is the erosion kernel; the threshold thr for each case is listed in Table 7.
After thresholding, each connected component of 1-pixels was treated as an eye candidate, and all candidates whose area was larger than 0.015MN or that were not located inside the eye ROI were removed.
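A sketch of this candidate-filtering step with OpenCV's connected-component analysis is given below; the library call and the Candidate structure are assumptions of this illustration.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Extract eye candidates from the thresholded 8-bit binary map E0: every
// connected component of 1-pixels is a candidate, and components that are
// too large (> 0.015*M*N) or outside the eye ROI are discarded.
struct Candidate { cv::Rect box; double area; cv::Point2d centroid; };

std::vector<Candidate> eyeCandidates(const cv::Mat& E0, const cv::Rect& eyeRoi)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(E0, labels, stats, centroids);

    const double maxArea = 0.015 * E0.rows * E0.cols;
    std::vector<Candidate> out;
    for (int i = 1; i < n; ++i) {                            // label 0 is the background
        double area = stats.at<int>(i, cv::CC_STAT_AREA);
        cv::Rect box(stats.at<int>(i, cv::CC_STAT_LEFT),
                     stats.at<int>(i, cv::CC_STAT_TOP),
                     stats.at<int>(i, cv::CC_STAT_WIDTH),
                     stats.at<int>(i, cv::CC_STAT_HEIGHT));
        cv::Point2d cen(centroids.at<double>(i, 0), centroids.at<double>(i, 1));
        if (area <= maxArea && eyeRoi.contains(cv::Point(cvRound(cen.x), cvRound(cen.y))))
            out.push_back({ box, area, cen });
    }
    return out;
}
```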
Finally, we selected the pair of candidates most likely to be the eyes. Every pair was composed of a candidate on the left of the midline of the face and a candidate on the right, and every pair had a score calculated using (8),
$\mathrm{score}(L_i, R_j) = V \cdot A_{L_i} \cdot A_{R_j} \cdot w_{L_i} \cdot w_{R_j}$    (8)
where $A_{L_i}$ and $A_{R_j}$ are the areas of the candidates in the pair; $w_{L_i}$ and $w_{R_j}$ are weights explained below; and V is the validity of the pair: if the pair cannot possibly be the pair of eyes, V = 0, and Table 8 lists the conditions under which V = 1.
The uppermost and bottommost row coordinates of the two elements $L_i$ and $R_j$ in the pair are denoted $u_{L_i}$, $u_{R_j}$, $b_{L_i}$, and $b_{R_j}$. The two candidates are said to overlap in rows by a range k if the two integer intervals $(u_{L_i} - k, b_{L_i} + k)$ and $(u_{R_j} - k, b_{R_j} + k)$ overlap. The weights $w_{L_i}$ and $w_{R_j}$ are either 1 or 4 when pitch != 'D'. The weight of a side was set to 4 when the candidate was the lowest one on that side that formed a nonzero-score pair (i.e., its row coordinate was the largest) or when it overlapped in rows, within a range of 2, with the lowest candidate of its side; otherwise, the weight was 1. Because eyebrows are dark objects, they also become candidates, and their positions and areas are close to those of the eyeballs, which can cause misdetection. The weight therefore prevents the eyebrows, which are positioned higher than the eyeballs, from being chosen. In the pitch-down cases, the eyeball regions shrink compared with the non-rotated case, so the weight is set higher.
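The following sketch illustrates the pair selection with the score in (8). The Cand structure, the validity callback standing in for Table 8, and the exhaustive loop over left/right candidates are placeholders of this illustration, not the authors' data structures.

```cpp
#include <utility>
#include <vector>

// Score of a (left, right) candidate pair, Eq. (8): validity flag times the
// two areas times the two side weights (1 or 4), so that lower candidates
// (eyeballs rather than eyebrows) win the comparison.
struct Cand { double area; double weight; };

double pairScore(int V, const Cand& left, const Cand& right)
{
    return V * left.area * right.area * left.weight * right.weight;
}

// Pick the pair (i, j) with the highest score; isValid(i, j) encodes the
// validity conditions of Table 8.
template <class ValidFn>
std::pair<int, int> bestPair(const std::vector<Cand>& lefts,
                             const std::vector<Cand>& rights, ValidFn isValid)
{
    std::pair<int, int> best(-1, -1);
    double bestScore = 0.0;
    for (int i = 0; i < static_cast<int>(lefts.size()); ++i)
        for (int j = 0; j < static_cast<int>(rights.size()); ++j) {
            double s = pairScore(isValid(i, j) ? 1 : 0, lefts[i], rights[j]);
            if (s > bestScore) { bestScore = s; best = std::make_pair(i, j); }
        }
    return best;                                             // (-1, -1) if no valid pair
}
```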
A candidate may consist of an eye connected to an eyebrow; this problem was solved by postprocessing. When the intensity range ΔI within a candidate was larger than 3, the pixels whose intensity exceeded (maximum intensity − ΔI/2) were reset to 0. Because this operation may break a candidate into several pieces, the system then selected the two candidates again (Table 8).
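A minimal sketch of this postprocessing rule is given below; applying it to a per-candidate mask with OpenCV is an implementation choice of this sketch, and ΔI is taken as the intensity range inside the candidate.

```cpp
#include <opencv2/opencv.hpp>

// If the intensity range inside a candidate exceeds 3, zero out its
// relatively bright pixels (intensity > max - range/2) so that an eyeball
// fused with an eyebrow is split into separate components for re-selection.
void splitBrightPart(cv::Mat& candMask, const cv::Mat& gray8u)
{
    double minI = 0.0, maxI = 0.0;
    cv::minMaxLoc(gray8u, &minI, &maxI, nullptr, nullptr, candMask); // range inside the candidate
    double range = maxI - minI;
    if (range > 3.0) {
        cv::Mat brightHalf = gray8u > (maxI - range / 2.0);          // brighter part of the range
        cv::Mat toDrop;
        cv::bitwise_and(brightHalf, candMask, toDrop);
        candMask.setTo(0, toDrop);                                   // remove it from the candidate
    }
}
```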

2.5. Coordinate Transformation and Correction

After determining the two candidates, we transformed the midpoint of the centroids of the two candidates to the global coordinate system by using the pinhole camera model. The raw result may contain many non-smooth peaks (oscillations), so a correction was applied as follows.
The coordinate of the nth sample, f[n], is regarded as an oscillation if (9) is satisfied,
$|f''[n]| > 0.3 \ \text{and} \ m > 9$    (9)
where f″[n] is the second-order difference of f and m is the number of samples since the last reset of the correction step. The correction was reset (m set to 0) when the number of continuous corrections exceeded 5; this was necessary because otherwise the error shown in Figure 8 occurred.
When a coordinate was judged to be an oscillation, a linear regression [4] was fitted to the nine samples preceding it, and the value of the coordinate was corrected by treating it as the tenth sample and predicting it with the linear regression model. The result after correction is shown as the orange curve in Figure 9, where the blue curve is the x curve in Figure 10.
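A sketch of this correction is shown below. The sample-history handling and the class interface are assumptions of this illustration; the thresholds (0.3, 9 samples, 5 continuous corrections) follow the description above.

```cpp
#include <array>
#include <cmath>

// Detect oscillations with the second-order difference (Eq. (9)) and
// replace them with the prediction of a least-squares line fitted to the
// previous nine samples.
class OscillationFilter {
public:
    double correct(double x) {
        if (count_ >= 9) {
            double secondDiff = x - 2.0 * hist_[8] + hist_[7];        // f''[n]
            if (std::fabs(secondDiff) > 0.3 && m_ > 9) {
                x = predict();                                        // value of the 10th point on the fitted line
                if (++consecutive_ > 5) { m_ = 0; consecutive_ = 0; } // reset after too many corrections
            } else {
                consecutive_ = 0;
            }
        }
        push(x);
        ++m_;
        return x;
    }

private:
    double predict() const {
        // Least-squares line y = a*t + b over t = 0..8, evaluated at t = 9.
        double st = 0, sy = 0, sty = 0, stt = 0;
        for (int t = 0; t < 9; ++t) {
            st += t; sy += hist_[t]; sty += t * hist_[t]; stt += t * t;
        }
        double a = (9.0 * sty - st * sy) / (9.0 * stt - st * st);
        double b = (sy - a * st) / 9.0;
        return a * 9.0 + b;
    }
    void push(double x) {
        for (int i = 0; i < 8; ++i) hist_[i] = hist_[i + 1];
        hist_[8] = x;
        if (count_ < 9) ++count_;
    }
    std::array<double, 9> hist_{};
    int count_ = 0, m_ = 0, consecutive_ = 0;
};
```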

3. Discussions

The execution time of the method was measured on an Intel Core i7-6700K @ 4.00 GHz, with the code written in ISO C++14 and built with Visual Studio 2022 (v143 toolset). The developed method processed 270–320 frames per second, about three times faster than the camera's capture rate.

3.1. Detection Rates in Different Cases

A camera was placed in front of the user, who assumed different head states, and images were captured at a rate of 90 frames per second. A detection was regarded as successful if the pair of candidates overlapped the eyeballs on F. Figure 11 shows an example of successful detection, and the detection rates are listed in Table 9. Because the detection was successful only if the head-state classification was correct, the detection rate can be regarded as a lower bound on the success rate of the head-orientation classification.

3.2. Accuracy and Stability

We evaluated the method using randomly selected face images: a printed face with a mask and glasses was placed at 27 different positions in front of the camera, at distances of 50, 60, and 70 cm with 9 angles per distance. Then, 9000 frames (100 s) were captured at each position. Three parameters were defined using (10) to (12), and the results are shown in Table 10, Table 11 and Table 12.
$\mathrm{accuracy} = \mathrm{mean}(\mathrm{data}) - \text{reference position}$    (10)
$\mathrm{precision} = \mathrm{std}(\mathrm{data})$    (11)
$\mathrm{maxshift} = \max\big(\max(\mathrm{data}) - \mathrm{mean}(\mathrm{data}),\ \mathrm{mean}(\mathrm{data}) - \min(\mathrm{data})\big)$    (12)
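The three metrics can be computed per axis as in the following sketch (the signed mean error in (10) matches the signed acc values reported in Tables 10–12).

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Evaluation metrics of Eqs. (10)-(12) along one axis.
struct Metrics { double accuracy, precision, maxshift; };

Metrics evaluate(const std::vector<double>& data, double reference)
{
    const double mean = std::accumulate(data.begin(), data.end(), 0.0) / data.size();
    double var = 0.0;
    for (double d : data) var += (d - mean) * (d - mean);
    var /= data.size();

    auto mm = std::minmax_element(data.begin(), data.end());
    Metrics m;
    m.accuracy  = mean - reference;                               // signed mean error, Eq. (10)
    m.precision = std::sqrt(var);                                 // standard deviation, Eq. (11)
    m.maxshift  = std::max(*mm.second - mean, mean - *mm.first);  // Eq. (12)
    return m;
}
```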

4. Conclusions

We propose a fast eye-tracking method based on computer vision and image processing algorithms. The input image is shrunk with the assistance of depth information, which significantly increases the speed of the system and removes interference. For eyes in a rotated face, edge detection and PCA are adopted to determine the status of the face. After these two preparatory processes, the eyes are detected using connected-component analysis, and the average of the centers of the eyes is transformed into camera coordinates with the pinhole camera model. The system runs stably at 270 to 320 frames per second. The standard deviation (precision) of the method in detecting the eyes of a static face was lower than 1 mm, and the maximum deviation (maxshift) was generally lower than 5 mm. The influence of camera distortion was not removed, but the mean error (accuracy) seldom exceeded 10 mm, and the detection rate surpassed 90% when the face was not rotated.

Author Contributions

Conceptualization, M.-C.Y. and J.-J.D.; methodology, M.-C.Y.; software, M.-C.Y.; validation, M.-C.Y.; formal analysis, J.-J.D.; investigation, J.-J.D.; resources, M.-C.Y.; data curation, M.-C.Y.; writing—original draft preparation, M.-C.Y.; writing—review and editing, J.-J.D.; visualization, M.-C.Y.; supervision, J.-J.D.; project administration, J.-J.D.; funding acquisition, J.-J.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the AUO Corporation.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

No new data were created.

Acknowledgments

The authors thank the AUO Corporation for its support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Satoh, K.; Kitahara, I.; Ohta, Y. 3D Image Display with Motion Parallax by Camera Matrix Stereo. In Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems, Hiroshima, Japan, 17–23 June 1996; pp. 349–357.
2. Wijewickrema, S.N.R.; Papliński, A.P. Principal Component Analysis for the Approximation of an Image as an Ellipse. In Proceedings of the 13th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision 2005, Plzen, Czech Republic, 31 January–4 February 2005; pp. 69–70.
3. Vincent, O.R.; Folorunso, O. A Descriptive Algorithm for Sobel Image Edge Detection. Informing Sci. IT Educ. Conf. 2009, 40, 97–107.
4. Jiang, B.N. On the Least-Squares Method. Comput. Methods Appl. Mech. Eng. 1998, 152, 239–257.
Figure 1. Flow chart of the developed method.
Figure 2. Depth input and IR input, after preprocessing, after detection, and postprocessing from left to right.
Figure 3. Flow chart of preprocessing.
Figure 4. (a) Input depth image, (b) after thresholding, and (c) after morphology.
Figure 5. Sample outputs of the preprocessing step.
Figure 6. (a) Roll right, (b) pitch up, (c) yaw left, and (d) mask on.
Figure 7. Canceled effect of "roll".
Figure 8. Error caused by continuous "corrections".
Figure 9. Corrected x position in Figure 10.
Figure 10. Sample output of the 3D coordinates.
Figure 11. Successful detection.
Table 1. Searching step of preprocessing (Procedure 1).
Step 1: Find the upmost 1-pixel, F_0(r_u, c_u).
Step 2: For each 1-pixel F_0(r, c), if |c − c_u| < 150, call the 1-pixel "valid".
Step 3: When both r and c are multiples of 5, push (r, c) into a set of pixel coordinates, P.
Step 4: Stop condition: the number of rows searched exceeds max(10, 75,000 / (average depth up to that point)).
Step 5: "roi" is the smallest rectangle that contains all "valid" 1-pixels.
Step 6: The average depth and intensity are also recorded.
Step 7: Perform PCA on P to find the tilting angle θ [2].
Step 8: Resize the image by (average depth)/70.
Table 2. State variables.
State | Meaning | Possible Values | Meanings
Pitch | The head is lifted or lowered. | '\0', 'U', 'D' | None, Up, Down
Roll | The head is tilted left or right. | '\0', 'L', 'R' | None, Left, Right
Yaw | The head is turned left or right. | '\0', 'L', 'R' | None, Left, Right
Mask | Whether a mask is worn. | False, True | No, Yes
Table 3. Edge distribution in different cases.
(Mask, Pitch, Yaw) | IR Image | Edge Distribution
(0, '\0', '\0') | [IR image] | [edge distribution]
(0, '\0', 'L' or 'R') | [IR image] | [edge distribution]
(0, 'U', '\0') | [IR image] | [edge distribution]
(1, '\0', '\0') | [IR image] | [edge distribution]
(1, '\0', 'L' or 'R') | [IR image] | [edge distribution]
(1, 'U', '\0') | [IR image] | [edge distribution]
Table 4. Procedure 2: state classification, part 2.
Step 1: Erode (F != 0) by a 1-row by 23-column kernel to obtain a binary image F′.
Step 2: E_g ← bitwise_and(E_g, F′), E_gT ← bitwise_and(E_gT, F′), where E_gT is the thick-edge map obtained by opening E_g with the one-row kernel.
Step 3: Find Cen = (1/M) Σ_{i=0}^{M−1} cen_i, L = Cen/2, R = (M + Cen)/2, where cen_i is the midpoint of the 1-pixels in the i-th row of F′ and M is the number of rows of F.
Step 4: Search E_gT(r, c) in the three regions:
(a) r > floor(4M/5) and Cen − 5 ≤ c ≤ Cen + 5;
(b) r > floor(4M/5) and L ≤ c ≤ L + 5;
(c) r > floor(4M/5) and R − 5 ≤ c ≤ R.
Then calculate (number of 1-pixels)/(number of all pixels) in each region and denote the results as e_c, e_l, and e_r, respectively.
Step 5: (1) If e_c > 0.1 && (e_l and e_r < 0.15 || abs(e_l − e_r) < 0.1): Mask ← False, Yaw ← '\0', Pitch ← '\0', procedure stops.
(2) If (e_l or e_r ≥ 0.15) && abs(e_l − e_r) ≥ 0.1: Mask ← False, Pitch ← '\0'. If e_l ≥ e_r: Yaw ← 'L', procedure stops; otherwise Yaw ← 'R', procedure stops.
Step 6: Search E_g(r, c) for Cen − 0.3N ≤ c ≤ Cen + 0.3N and r > floor(M/2), where N is the number of columns of F. Then calculate σ_r and μ_c: the former is the standard deviation of the row coordinates of all found 1-pixels, and the latter is the average of their column coordinates. In addition, calculate n_u and n_b, the numbers of 1-pixels located in the upper and bottom halves of the search region, and then ub = (n_u − n_b)/(number of all 1-pixels).
Step 7: Search E_gT(r, c) over the same range as in step 6. If (number of 1-pixels)/(number of all pixels) ≤ 0.01: Mask ← True, Pitch ← 'U', Yaw ← '\0', procedure stops. Also calculate σ_Tc, the standard deviation of the column (x) coordinates of all 1-pixels found in this step.
Step 8: (1) If σ_Tc > 0.125N && (σ_r < 0.1M || ub > 0.8): Mask ← True, Pitch ← '\0', Yaw ← '\0', procedure stops.
(2) If σ_Tc > 0.125N && σ_r ≥ 0.1M: Mask ← True, Pitch ← '\0'. If μ_c is closer to L: Yaw ← 'R'; otherwise Yaw ← 'L'. Procedure stops.
(3) If σ_Tc ≤ 0.125N: Mask ← False, Pitch ← 'U', Yaw ← '\0', procedure stops.
Table 5. Procedure 3: state classification, part 3.
Step 1: Search E_gT(r, c) for floor(4M/10) > r > floor(M/10) and all c.
Step 2: Check whether (number of 1-pixels)/(number of all pixels) < 0.02. If so, Pitch ← 'D'.
Table 6. Procedure 4: find the ROI of eyes.
Step 1: Search E_g along Mids[i], starting at i = 0.1M; if Yaw != '\0' or Pitch == 'D', start at i = 0.2M.
Step 2: When E_g(i, Mids[i]) != 0, the procedure stops and records i.
Step 3:
Pitch == 'U': up ← i + 0.05M; if up > 0.3M, set it to 0.3M.
Pitch == 'D': up ← i + 0.1M; if up > 0.7M or < 0.5M, set it to 0.7M or 0.5M.
Yaw != '\0': up ← i + 0.1M; if up > 0.4M or < 0.3M, set it to 0.4M or 0.3M.
No pitch and no yaw: up ← i + 0.1M; if up > 0.4M or < 0.25M, set it to 0.4M or 0.25M.
Step 4:
Pitch == 'U': low ← up + 0.25M.
Not pitch up: low ← up + 0.3M.
Step 5: left ← Mids[(up + low)/2] − 0.35N, right ← Mids[(up + low)/2] + 0.35N.
Table 7. Thresholds in different cases.
Mask off | No Yaw & Pitch | thr = p_11 in roi + 1
Mask off | Pitch up | thr = p_11 in roi + 2
Mask off | Pitch down or Yaw | thr = min(roi) + 5
Mask on | No Pitch & Yaw | thr = p_11 in roi + 1 + a
Mask on | Pitch up | thr = p_11 in roi + 3 + a/2
Mask on | Pitch down | thr = min(roi) + 5
Mask on | Yaw | thr = min(roi) + 5 + a
Table 8. Requirements for V = 1.
State | Distance | Y-Position | Other
Yaw != '\0' | 0.15N–0.4N | Overlapped by range 2 | Width of both < 0.1N
Pitch == 'U' | 0.3N–0.5N | Overlapped by range 0 | —
else | 0.3N–0.5N | — | |Horizontal Angle| < 20°
Table 9. Detection rates in different cases.
Mask | Orientation | Successful Frames/All Frames | Detection Rate
off | Normal | 957/988 | 0.968623
off | Yaw | 820/902 | 0.909091
off | Pitch up | 843/938 | 0.898721
off | Pitch down | 476/581 | 0.819277
on | Normal | 898/999 | 0.898899
on | Yaw | 772/911 | 0.84742
on | Pitch up | 777/901 | 0.862375
on | Pitch down | 326/407 | 0.800983
Table 10. Results in 50 cm.
Degree | acc_x (mm) | acc_y (mm) | acc_z (mm) | pre_x (mm) | pre_y (mm) | pre_z (mm) | shift_x (mm) | shift_y (mm) | shift_z (mm)
−18.88 | −2.4492 | −3.4629 | 0.8525 | 0.6194 | 0.6164 | 0.555 | 2.5382 | 3.7481 | 1.6015
−14.51 | −2.6547 | −4.1006 | −0.1107 | 0.5741 | 0.5359 | 0.6304 | 2.4783 | 3.0624 | 2.5647
−9.85 | −1.8524 | −4.0964 | 0.1772 | 0.6375 | 0.5396 | 0.6323 | 3.0258 | 3.2125 | 2.2768
−4.98 | −0.431 | −4.3965 | −0.1065 | 0.7668 | 0.7185 | 0.6496 | 3.037 | 8.8755 | 2.5605
0 | 0.1263 | −4.5644 | −0.2726 | 0.6634 | 0.6055 | 0.5406 | 4.6565 | 2.8621 | 1.7316
4.98 | 0.4748 | −5.0363 | −0.17 | 0.6156 | 0.5305 | 0.5882 | 2.6783 | 2.4536 | 2.624
9.85 | 1.512 | −5.292 | −0.6269 | 0.6749 | 0.5539 | 0.6578 | 2.9627 | 3.0016 | 3.0809
14.51 | 2.4336 | −5.2404 | −2.8617 | 0.7007 | 0.6676 | 0.7664 | 2.9136 | 3.1101 | 2.3307
18.88 | 3.2321 | −5.7841 | −0.4962 | 0.6968 | 0.5597 | 0.6014 | 3.8441 | 3.5357 | 3.0198
Table 11. Results in 60 cm.
Degree | acc_x (mm) | acc_y (mm) | acc_z (mm) | pre_x (mm) | pre_y (mm) | pre_z (mm) | shift_x (mm) | shift_y (mm) | shift_z (mm)
−18.88 | −3.8767 | −6.0773 | −0.4985 | 0.6471 | 0.6978 | 0.5193 | 5.3883 | 2.6787 | 3.5215
−14.51 | −2.8117 | −6.5566 | −0.1195 | 0.635 | 0.5305 | 0.6568 | 6.9853 | 2.3665 | 2.9055
−9.85 | −2.5407 | −6.4374 | −0.9341 | 0.5689 | 0.5563 | 0.5425 | 5.1153 | 9.8616 | 2.0909
−4.98 | −1.879 | −6.9238 | −0.0603 | 0.5661 | 0.5834 | 0.4523 | 2.4781 | 2.9914 | 2.0093
0 | −0.3749 | −7.5495 | −0.4131 | 0.8067 | 0.7756 | 0.5818 | 2.6884 | 2.7945 | 2.6119
4.98 | 0.3135 | −7.5778 | −0.8175 | 0.701 | 0.5564 | 0.596 | 2.7689 | 2.3434 | 2.2075
9.85 | 1.3503 | −8.0553 | −0.8716 | 0.5376 | 0.556 | 0.6401 | 2.4033 | 2.3747 | 2.8206
14.51 | 2.9756 | −8.915 | −0.4168 | 0.7161 | 0.5543 | 0.6403 | 3.0826 | 2.753 | 2.6082
18.88 | 4.7939 | −9.1908 | −0.0723 | 0.8076 | 0.6773 | 0.6404 | 4.5021 | 3.1392 | 4.9427
Table 12. Results in 70 cm.
Degree | acc_x (mm) | acc_y (mm) | acc_z (mm) | pre_x (mm) | pre_y (mm) | pre_z (mm) | shift_x (mm) | shift_y (mm) | shift_z (mm)
−18.88 | −3.57 | −7.8049 | 1.938 | 0.5845 | 0.595 | 0.7452 | 2.739 | 2.4911 | 2.492
−14.51 | −1.2507 | −8.3596 | 0.6858 | 0.4244 | 0.4474 | 0.7636 | 5.9303 | 2.1004 | 2.7492
−9.85 | −1.9073 | −8.8737 | 1.919 | 0.6381 | 0.6587 | 0.7262 | 2.2437 | 2.5373 | 2.511
−4.98 | 0.6765 | −8.9088 | 2.012 | 0.5839 | 0.5578 | 0.6365 | 2.3947 | 2.5722 | 2.557
0 | 0.3695 | −9.1156 | 1.2228 | 0.5503 | 0.5205 | 0.8112 | 2.5646 | 2.7014 | 2.2122
4.98 | 0.5622 | −9.9922 | 1.5319 | 0.4461 | 0.4612 | 0.8425 | 2.0999 | 2.1098 | 2.0769
9.85 | 2.1014 | −10.3007 | 0.1522 | 0.7928 | 0.6795 | 0.8221 | 4.5986 | 2.6173 | 3.2828
14.51 | 2.9638 | −10.4852 | 0.8016 | 0.7589 | 0.6964 | 0.8911 | 2.5768 | 2.7318 | 2.6334
18.88 | 4.5946 | −11.1624 | 1.2776 | 0.8217 | 0.8634 | 0.9154 | 3.1266 | 3.3616 | 2.8176
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
