Article

Gaze-Based Vehicle Driving Evaluation of System with an Actual Vehicle at an Intersection with a Traffic Light

Takumi Shimauchi, Keiko Sakurai, Lindsey Tate and Hiroki Tamura
Faculty of Engineering, University of Miyazaki, 1-1, Gakuen Kibanadai-nishi, Miyazaki-shi 889-2192, Japan
* Author to whom correspondence should be addressed.
Electronics 2020, 9(9), 1408; https://doi.org/10.3390/electronics9091408
Submission received: 30 July 2020 / Revised: 27 August 2020 / Accepted: 28 August 2020 / Published: 1 September 2020
(This article belongs to the Special Issue Applications of Bioinspired Neural Network)

Abstract

Due to population aging in Japan, more elderly people are retaining their driver's licenses, and the increase in the number of car accidents caused by elderly drivers has become a social problem. To address this problem, an objective, data-based method to evaluate whether elderly drivers can continue driving is needed. In this paper, we propose a car driving evaluation system based on gaze as calculated from eye and head angles. We used an eye tracking device (TalkEye Lite) made by the Takei Scientific Instruments Cooperation. Using image processing with deep learning (YOLOv2-tiny), we propose a gaze fixation condition. By combining the eye tracking device with the proposed gaze fixation condition, we built a system that evaluates drivers during actual car operation. We describe this system in this paper. In order to evaluate our proposed method, we conducted experiments from November 2017 to November 2018 in which drivers were evaluated by our system while driving an actual car. The subjects were 22 general drivers (two were 80–89 years old, four were 70–79 years old, six were 60–69 years old, three were 50–59 years old, five were 40–49 years old and two were 30–39 years old). We compared the subjects' gaze information with the subjective evaluation by a professional driving instructor. As a result, we confirmed that the subjects' gaze information is related to the subjective evaluation by the instructor.

1. Introduction

Due to population aging in Japan, more elderly people are retaining their driver's licenses [1]. However, elderly people may show a decrease in dynamic visual acuity, difficulty processing multiple pieces of information at the same time and an inability to make instantaneous decisions, resulting in delayed steering and braking operations. Cessation of driving simply due to advanced age may hinder independence in life, increase the risk of illnesses such as depression and shorten life expectancy [2,3,4]. Accordingly, a method to evaluate whether elderly drivers can continue driving safely is needed. However, establishing criteria for judging the driving ability of elderly people is difficult [5]. When driving a car, 90% of relevant information is generally visual [6]. In addition, Owsley et al. [7] modeled the relationship between eye health, effective field of view and car accidents involving elderly drivers, finding that the effective field of view was most strongly related to these accidents. Therefore, the effective field of view is deeply involved in a person's ability to drive a motor vehicle.
In this paper, we propose a car driving evaluation system that uses both the gaze information obtained from an eye tracking device (TalkEye Lite [8]) and the positional information of a traffic light obtained using image processing technology. The sampling frequency of the video and gaze information used in this paper is 30 Hz, but because image processing takes time, we downsampled the video and gaze information to 10 Hz when the evaluation was performed. In order to estimate the score near the traffic light, multiple regression analysis (MRA) was performed using the gaze information obtained from our proposed method as inputs and the subjective evaluation by a professional driving instructor as the target output. This paper presents an automatic driving evaluation method whose output closely matches a professional driving instructor's evaluation during actual car driving.
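As a minimal illustration of the downsampling step described above (30 Hz to 10 Hz), the sketch below keeps every third sample; the array name and values are illustrative, not the recorded data.

```python
import numpy as np

def downsample_to_10hz(samples_30hz):
    """Decimate 30 Hz gaze/video records to 10 Hz by keeping every third sample."""
    return np.asarray(samples_30hz)[::3]

# Example: 90 samples recorded at 30 Hz (3 s of data) become 30 samples at 10 Hz.
gaze_x = np.linspace(0.0, 1.0, 90)
print(downsample_to_10hz(gaze_x).shape)  # (30,)
```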

2. Previous Studies on Gaze Estimation Methods for Car Driving

With the increase in car driving accidents, studies on detecting fatigue during driving using speech recognition [9] and detecting sleepiness using the percentage of eyelid closure (PERCLOS) [10] or the driver's head location and eye status [11] have been conducted. Although fatigue and sleepiness can be detected using these methods, they are insufficient for determining driving ability because a judgment of the driver's aptitude is needed. Reference [12] evaluated car driving in a driving simulator with a view toward automated driving. However, reference [12] did not study car driving evaluation that includes gaze information during actual car driving.
The driving aptitude of an elderly driver is judged by measuring driving skill, the presence or absence of visual field abnormalities and cognitive ability. Usually, driving aptitude is judged from each measurement (driving skill, visual field abnormality, cognitive ability) separately. However, determining a driver's driving suitability by measuring visual information and reaction speed with a gaze tracking device has been researched using both driving simulators and actual vehicles.
Regarding visual field abnormalities during driving, studies focusing on the number of fixations [13], the average fixation time and the average eye movement distance, and on the narrowing of the gaze range and peripheral visual field [14,15,16], have been conducted. Although distraction can be detected using these methods, they are insufficient for determining the cause of distraction. In addition, these studies were carried out using driving simulators, which is less ecologically valid and generalizable than evaluating on-road driving in an actual vehicle. Moreover, in references [13,14,15], it was not possible to know whether the driver was aware of the information on the road (e.g., pedestrians, traffic lights, road signs). Our proposed method can determine what specifically the driver is looking at using the image from the view camera. Therefore, our system can calculate the visual attention given to the traffic light while driving, which is a useful way to help prevent car accidents.

3. Measurement System

We used a GPS device and an eye tracking device to obtain gaze information in a specific area. In this section, we introduce the eye tracking and GPS devices. The eye tracking device is the same as in reference [16].

3.1. Measurement System Using TalkEye Lite

TalkEye Lite [8] is a wearable eye movement measurement system that connects directly to the processing computer and uses a USB camera for eye detection and visual field recording. Through its USB camera, TalkEye Lite can track the pupil of the subject; therefore, using the TalkEye Lite software allows us to see what the subject is looking at. The overlay display on the visual field image includes not only the left and right viewpoints but also the center axis of the gaze of both eyes. Figure 1a is a picture of the TalkEye Lite goggle with arrows indicating its view camera and its eyeball camera and Figure 1b is a picture of a person wearing the TalkEye Lite goggle.
The blue cross points in Figure 2 indicate the gaze of both eyes. The eye motion analysis program can analyze visual videos recorded by the TalkEye Lite, enabling us to determine which target the subject was looking at. It is also possible to calculate the angle of convergence using the angle data of both eyes. The angles are expressed in degrees, with values ranging from −180 to +180 degrees.

3.2. The Head Angle Estimation Method Using Template Matching

The head angle estimation method [16] is used in this paper. We used the viewing camera of the TalkEye Lite to obtain the head movement information. Whereas a head angle estimation method based on an RGB-D sensor requires installing an additional sensor, the proposed template matching method uses only the field-of-view camera attached to the TalkEye Lite. Therefore, our proposed method has the merit that it can be easily used in various environments. Since the line of sight and the head angle are estimated during actual car driving, it is necessary to use a method that can estimate the line of sight given the restricted physical space available for equipment and measurement; template matching accomplishes this. We installed four templates as shown in Figure 3 (Marker1, an upward triangle; Marker2, a circle; Marker3, a downward triangle; and Marker4, a star) on the front glass of the car.
This system required calibration prior to data collection. We defined the forward-facing position of the head as 0 degrees. We placed the markers (Marker1, Marker2, Marker3 and Marker4) on the dashboard of the car as shown in Figure 4, with one of the markers directly in front of the driver (0 degrees). We asked the driver to face forward for five seconds in order to record the reference position of each marker; these reference positions were used for template matching. The driver's head angle was calculated after determining the head movement amount a, using the difference between the current marker position and the reference position. Marker4 was excluded from calibration because it did not appear on the screen when the driver faced the front. Therefore, the reference position of the star marker was calculated using the ratio of α and β, as shown in Figure 4. The coordinate C_x was obtained by transforming Equation (1) through Equations (2) and (3) into Equation (4).
$B_x = \dfrac{\beta A_x + \alpha C_x}{\alpha + \beta}$ (1)
$(\alpha + \beta) B_x = \beta A_x + \alpha C_x$ (2)
$\alpha C_x = \alpha B_x + \beta B_x - \beta A_x$ (3)
$C_x = \dfrac{1}{\alpha}\{\beta (B_x - A_x) + \alpha B_x\}$ (4)
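The sketch below illustrates the Equation (4) computation for the reference position of Marker4; the function name is ours, and the numbers in the example (markers at x = 100 and x = 300 pixels, a 1:1 ratio) are illustrative, not measured values.

```python
def marker4_reference_x(a_x, b_x, alpha, beta):
    """Equation (4): estimate the reference x-coordinate C_x of Marker4.

    a_x, b_x : x-coordinates of two markers visible during calibration.
    alpha, beta : distance ratios between the markers (Figure 4).
    """
    return (beta * (b_x - a_x) + alpha * b_x) / alpha

# With markers at x = 100 and x = 300 px and a 1:1 ratio, the hidden marker's
# reference position is extrapolated to x = 500 px.
print(marker4_reference_x(100.0, 300.0, alpha=1.0, beta=1.0))  # 500.0
```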

3.2.1. Estimating Inclination of Head

As for the inclination of the head, there are three axis directions: the yaw axis, the pitch axis and the roll axis (Figure 5). The yaw angle and the pitch angle are necessary for gaze estimation, and both were calculated from differences in the recorded marker coordinates. If the yaw axis and the pitch axis changed while the roll axis of the head was inclined, the head orientation angle could not be estimated correctly. An example is shown in Figure 6. When the head was tilted by the roll angle θ (Figure 6) and the head turned b degrees to the right, the marker position moved by a pixels. Therefore, when the roll axis rotated, we first needed to determine the roll angle. After calculating the roll angle, the coordinate transformation was performed. The deviation of the estimated angle due to the roll angle is shown in Figure 5: the deviation along the yaw axis can be calculated as $a \cos\theta$ and the deviation along the pitch axis as $a \sin\theta$. By considering these deviations, we obtained coordinates converted to a zero roll angle, solving the problem caused by the inclination of the head.

3.2.2. Calculation of the Roll Angle

The roll angle is calculated by Equation (5). Letting the coordinates of the two detected markers be (x_1, y_1) and (x_2, y_2), the roll angle is obtained from Equation (5), under the condition that x_2 > x_1.
$\theta = \tan^{-1}\left(\dfrac{y_2 - y_1}{x_2 - x_1}\right)$ (5)
where θ is the roll angle and (x_1, y_1) and (x_2, y_2) are the coordinates of the two detected markers.

3.2.3. Calculation of Yaw Angle and Pitch Angle

The yaw angle and the pitch angle were calculated from the marker's movement amount in pixels, using the coordinates (x', y') obtained by the rotation conversion in Equations (6) and (7). Here, x, y denote the coordinates of the marker before the coordinate transformation and x', y' the coordinates after the transformation. Equations (8) and (9) give the yaw angle and the pitch angle. For example, if the marker's coordinates move a pixels horizontally, the head is inclined by the yaw angle given in Equation (8).
$x' = x \cos\theta - y \sin\theta$ (6)
$y' = x \sin\theta + y \cos\theta$ (7)
$\mathrm{Yaw\ face\ angle} = \dfrac{a}{\mathrm{Camera\ resolution}} \times \mathrm{Camera\ viewing\ angle\ (deg)}$ (8)
$\mathrm{Pitch\ face\ angle} = \dfrac{b}{\mathrm{Camera\ resolution}} \times \mathrm{Camera\ viewing\ angle\ (deg)}$ (9)
where a is the horizontal movement amount of the marker (pixel) and b is the vertical movement amount of the marker (pixel).
The recognition accuracy of the template matching was 90.2% and the average error of the head angle estimation was 4.1 degrees.
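The sketch below expresses Equations (5) to (9) as code: the roll angle from two marker positions, the roll compensation of a marker displacement and the conversion into yaw and pitch angles. The camera resolution and viewing angles used in the example are assumed values, not the TalkEye Lite specifications.

```python
import math

def roll_angle(p1, p2):
    """Equation (5): roll angle (radians) from two marker coordinates, assuming x2 > x1."""
    (x1, y1), (x2, y2) = p1, p2
    return math.atan2(y2 - y1, x2 - x1)

def compensate_roll(x, y, theta):
    """Equations (6) and (7): rotate a marker displacement by theta."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def yaw_pitch_from_displacement(a_px, b_px, resolution_px, view_angle_deg):
    """Equations (8) and (9): convert pixel displacements a, b into head angles (degrees).

    resolution_px and view_angle_deg are (horizontal, vertical) pairs for the
    view camera; the values below are assumptions for illustration only.
    """
    yaw = a_px / resolution_px[0] * view_angle_deg[0]
    pitch = b_px / resolution_px[1] * view_angle_deg[1]
    return yaw, pitch

# Illustrative usage: estimate the roll from two markers, undo it on a (30, 5) px
# marker displacement, then convert to yaw/pitch with assumed camera parameters.
theta = roll_angle((100, 242), (540, 250))
dx, dy = compensate_roll(30.0, 5.0, -theta)   # rotate by -theta to remove the roll tilt
print(yaw_pitch_from_displacement(dx, dy, (640, 480), (60.0, 45.0)))
```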

3.3. GPS Information

In this paper, we used high-precision Global Positioning System (GPS) information to extract the information within a specific area of the driving course, which contained a traffic light. The video segment extracted by GPS was about 30 seconds long. Figure 7a shows the plot map from the GPS information, with the red rectangle representing the traffic light area, the white point representing the car and the white line representing the driving trajectory. Figure 7b is a still image of the intersection with the traffic light. In our driving experiment, the drivers hardly moved their heads backward or forward, and such changes in head position in the driver's seat did not change the yaw and pitch angles of the head orientation. Therefore, our proposed method did not take the backward and forward movement of the head position into account.
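A minimal sketch of the GPS-based extraction is shown below, assuming each synchronized record carries a latitude/longitude fix and the evaluation area is approximated by a rectangular bounding box; the coordinates and record format are placeholders, not the actual course data.

```python
def in_intersection_area(lat, lon, bbox):
    """True if a GPS fix lies inside the rectangular evaluation area.

    bbox = (lat_min, lat_max, lon_min, lon_max); the corner values used below
    are placeholders, not the real coordinates of the intersection.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def extract_area_records(records, bbox):
    """Keep only the synchronized (frame, gaze, GPS) records recorded inside the area."""
    return [r for r in records if in_intersection_area(r["lat"], r["lon"], bbox)]

# Illustrative usage with two dummy records and a placeholder bounding box.
records = [{"frame": 0, "lat": 31.8270, "lon": 131.4100},
           {"frame": 1, "lat": 31.8310, "lon": 131.4150}]
area = (31.8300, 31.8320, 131.4140, 131.4160)
print([r["frame"] for r in extract_area_records(records, area)])  # [1]
```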

4. Proposed Method

Humans constantly move and stop their eyes in order to recognize the shapes and colors of objects in their environments. In the eye movement field, gaze fixation is defined as maintaining the visual gaze on a single location, i.e., the gaze interval between two saccades. A saccade is a rapid eye movement performed by humans to obtain information. Humans require a fixed latency of 150 to 250 milliseconds at the start of a saccade [17], so the gaze must remain on a location for at least 150 milliseconds between saccades. In order to detect gaze fixation using saccade information, the sampling frequency should be over 14 Hz [18]. However, the sampling frequency of the data recorded by the TalkEye Lite is 10 Hz, which is not high enough to use saccades to identify the gaze fixation condition. In this paper, we therefore propose identifying the gaze fixation condition not from saccades but from the positional relationship between the gaze point and the traffic light point, as described in Figure 8. The conditions labeled 4.1 to 4.4 in Figure 8 are described below.

4.1. The Traffic Light in the Video is Recognized by Image Processing

In order to recognize the traffic light in the videos, we focused on the recognition rate under environmental changes (clear, cloudy, rain) and compared six image processing methods. Table 1 shows the comparison results. Regarding the recognition methods using color features, it was confirmed that the traffic light could not always be recognized because of the taillights of preceding vehicles, surrounding signboards, strong sunlight, etc. [19,20,21,22]. We also confirmed the recognition rates and environmental responses of YOLOv2 and YOLOv2-tiny [23]. The 11,034 traffic light images such as the one in Figure 7b were learned by the YOLOv2 and YOLOv2-tiny models in the learning environment (Table 2). To confirm the accuracy of YOLOv2-tiny, we used three daytime videos. In this paper, the recognition rate refers to the ratio of frames in which the traffic light was recognized to the frames in which the traffic light appeared in the video. For YOLOv2-tiny, the recognition rate was over 93.0%, and it was almost 100% for the 22 subjects' videos used in this paper. The processing time was 0.07 seconds/frame in the execution environment (Table 2).
As indicated in Table 1, YOLOv2-tiny had the highest recognition rate and the best environmental response.
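For reference, the sketch below shows one way a trained YOLOv2-tiny model can be run with OpenCV's DNN module to obtain the traffic light center used in Section 4.3; the .cfg/.weights file names are placeholders and the post-processing is a generic single-output example, not the authors' exact pipeline.

```python
import cv2

# Placeholder model files; a YOLOv2-tiny network trained on traffic light images
# is assumed to be available in Darknet .cfg/.weights format.
net = cv2.dnn.readNetFromDarknet("yolov2-tiny-trafficlight.cfg",
                                 "yolov2-tiny-trafficlight.weights")

def detect_traffic_light(frame_bgr, conf_threshold=0.5):
    """Return the center (x_t, y_t) of the most confident traffic light box, or None."""
    h, w = frame_bgr.shape[:2]
    blob = cv2.dnn.blobFromImage(frame_bgr, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    detections = net.forward()  # YOLOv2 region layer: one row per box, [x, y, w, h, obj, scores...]
    best = None
    for det in detections:
        confidence = float(det[4]) * float(det[5:].max())
        if confidence > conf_threshold and (best is None or confidence > best[0]):
            best = (confidence, det[0] * w, det[1] * h)  # box center in image pixels
    return None if best is None else (best[1], best[2])
```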

4.2. The Driver Faces the Front

In this condition, the face angle estimation method [16] explained in Section 3.2 was used. We defined the front as the area within ±15 degrees of the average yaw angle at the intersection, based on reference [24]; right and left confirmation states were defined as being outside this range. This condition combined the driver's face angle obtained with the technology of reference [16] and knowledge from the evaluation of driving behavior based on the ease of overlooking pedestrians, which considers the effective human visual field, visual characteristics and head posture [24]. Figure 9 shows the yaw angle of an example driver's face while at the intersection. When the driver faced to the right, the yaw angle had a positive value over +15 degrees. Figure 9 shows that this driver did not face left but only right when turning to the right.

4.3. The Traffic Light Exists Within the Visual Effective Field

First, we obtained the traffic light center coordinates (x_t, y_t) by detecting the traffic light (Section 4.1). Second, the pixel distance between the traffic light center coordinates (x_t, y_t) and the gaze coordinates (x_g, y_g) obtained from the eye tracking device was calculated by Equation (10).
$\mathrm{pixel\ distance} = \sqrt{(x_t - x_g)^2 + (y_t - y_g)^2}$ (10)
The system could then calculate a numerical value that determines whether the traffic light was within the effective field of view; the amount of change in the visual angle per pixel in the video was 0.07 degrees/pixel at the specific driving school. Figure 10 shows the positional relationship between the gaze point and the traffic light at the intersection; it shows that this driver looked near the traffic light or looked around. Because the effective field narrows with aging [25], we varied the threshold of the effective field of view from 10 to 25 degrees in steps of 5 degrees, in relation to the decrease in fixation time.
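A short sketch of this check is given below; it uses the 0.07 degrees/pixel conversion stated above, and the 20-degree threshold in the example is only one of the values examined (10 to 25 degrees).

```python
import math

DEG_PER_PIXEL = 0.07  # change in visual angle per pixel of the view video (stated above)

def gaze_to_light_angle_deg(gaze_xy, light_xy):
    """Convert the pixel distance of Equation (10) into a visual angle in degrees."""
    return math.hypot(light_xy[0] - gaze_xy[0], light_xy[1] - gaze_xy[1]) * DEG_PER_PIXEL

def light_in_effective_field(gaze_xy, light_xy, threshold_deg=20.0):
    """True if the traffic light falls inside the assumed effective field of view."""
    return gaze_to_light_angle_deg(gaze_xy, light_xy) <= threshold_deg

# Example: a gaze point 150 px from the light center corresponds to 10.5 degrees,
# which is inside a 20-degree effective field of view.
print(light_in_effective_field((320, 240), (470, 240)))  # True
```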

4.4. The Conditions from 4.1 to 4.3 are Satisfied for 300 Milliseconds or More

This condition is based on visual information processing, which can be described in terms of a two-level model [26,27,28,29]. At the first, "ambient" level, objects are dynamically localized in the 3D environment. These objects, or rather "blobs", are then identified at the second level. Fixation duration ranges from 150 to 250 milliseconds at the first level and increases up to 500 milliseconds at the second level. Additionally, according to previous research [30,31], the number and duration of eye fixations can be important metrics of the visual recognition range when facing a dangerous traffic situation.
Our proposed method recorded the total eye fixation time (seconds), the number of eye fixations, and the minimum and average gaze-to-light distances during gaze fixation. Using this gaze information and the driver's age, we examined whether gaze evaluation at intersections was possible. Hereafter, variables are named according to this example: "the number of eye fixations 20" is the number of eye fixations when the effective field threshold is 20 degrees.
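The sketch below shows one way the per-frame conditions 4.1 to 4.3 and the 300-millisecond duration requirement could be aggregated into these metrics at 10 Hz; the record format and the fixed 20-degree threshold are our illustrative assumptions, not the authors' implementation.

```python
def fixation_metrics(frames, sample_period_s=0.1, min_duration_s=0.3, threshold_deg=20.0):
    """Aggregate gaze fixation metrics from per-frame flags and gaze-to-light angles.

    Each frame is assumed to be a dict with keys 'light_detected' (condition 4.1),
    'facing_front' (condition 4.2) and 'angle_deg' (gaze-to-light angle for
    condition 4.3, or None when no light was detected).
    """
    runs, current = [], []
    for f in frames:
        ok = (f["light_detected"] and f["facing_front"]
              and f["angle_deg"] is not None and f["angle_deg"] <= threshold_deg)
        if ok:
            current.append(f["angle_deg"])
        else:
            if len(current) * sample_period_s >= min_duration_s:  # at least 3 samples at 10 Hz
                runs.append(current)
            current = []
    if len(current) * sample_period_s >= min_duration_s:
        runs.append(current)

    angles = [a for run in runs for a in run]
    return {
        "number_of_fixations": len(runs),
        "total_fixation_time_s": sum(len(run) for run in runs) * sample_period_s,
        "minimum_distance_deg": min(angles) if angles else None,
        "average_distance_deg": sum(angles) / len(angles) if angles else None,
    }
```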

5. Experimental Methods and Results

In this section, we investigate whether our proposed method was able to evaluate gaze information near the traffic light. The subjects were 22 general drivers (two were 80–89 years old, four were 70–79 years old, six were 60–69 years old, three were 50–59 years old, five were 40–49 years old and two were 30–39 years old). Each subject performed only one test drive without training and drove for six to fifteen minutes on the driving school course. Table 3 and Table 4 show the subjects' gaze information obtained with our proposed method. We investigated the relationship between the gaze information and the three-level subjective evaluation (0/0.5/1) given by the driving school instructors. The evaluation area was limited to the traffic signal area. The subjective evaluation covered the following items: signal confirmation, left and right safety confirmation, stopping over the stop line, oncoming lane confirmation, and right side and right rear confirmation. Regarding the evaluation, 0 was bad, 0.5 was intermediate and 1 was good.
Here, we explain S.D.A.P. (Smart Driving Assessment Program) [32], one of the driving skill evaluation systems developed by OFA SUPPORT INC. for driving schools. S.D.A.P. evaluates driving technique from the operation of the steering wheel, accelerator, brake, etc. The score obtained from the S.D.A.P. (S.D.A.P. score) is a deduction from 0. In this paper, the S.D.A.P. score is used as reference data. Table 3 shows the subjective evaluation (0/0.5/1) together with the corresponding S.D.A.P. score. The multiple regression analysis (MRA) used the four types of gaze information (the total eye fixation time in seconds, the number of eye fixations, the minimum distance and the average distance) and the driver's age as inputs, and the subjective evaluation (0/0.5/1) as the target output. We used an alpha level of 0.05 to evaluate the statistical significance of the predictors. With this methodology, the proposed method automatically obtained almost the same evaluation as the driving instructor.
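A minimal sketch of this MRA step with statsmodels is shown below; the few example rows follow the format of Tables 3 and 4 but are illustrative and do not reproduce the full 22-subject dataset.

```python
import numpy as np
import statsmodels.api as sm

# Inputs: age and number of fixations 20 (cf. MRA model 2); target: the 0/0.5/1 rating.
X = np.array([[82, 10], [66, 4], [30, 29], [79, 28], [40, 13], [38, 8]], dtype=float)
y = np.array([0.5, 0.0, 1.0, 1.0, 0.5, 1.0])

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.params)    # intercept and coefficients (cf. Table 6)
print(model.pvalues)   # per-coefficient p-values, compared against alpha = 0.05
print(model.rsquared)  # coefficient of determination (cf. Figure 12)
```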

5.1. Result Using Various Variables as Inputs

As shown in Table 5, when MRA was performed using age, number of fixations 20, minimum distance 15, average distance 15, minimum distance 20, average distance 20 and average distance 25 as inputs and the subjective evaluation (0/0.5/1) as the target output, the p-value of each coefficient was less than our alpha of 0.05. The number following each variable name in Table 5 represents the threshold of the effective field of view. As shown in Figure 11, the coefficient of determination was 0.7382, so these seven variables were related to the subjective evaluation by the instructor. Additionally, the recognition rate was 100% for the instructor's evaluation of a score of 0.5 (threshold: maximum of 0, minimum of 1). However, the sample size of 22 subjects is small, so this result may not be reliable. Therefore, we ran a second MRA model with fewer variables.

5.2. Result Using Subjects’ Age and the Number of Fixations as Inputs

As shown in Table 6, when MRA was performed using only age and number of fixations 20 as inputs and the subjective evaluation (0/0.5/1) as the target output, the p-value of each coefficient was less than 0.05. As shown in Figure 12, the coefficient of determination was 0.5517 so these two variables were related to the subjective evaluation by the instructor. Additionally, the recognition rate was 86.4% for the instructor’s evaluation of a score of 0.5 (threshold: maximum of 0, minimum of 1).

5.3. Discussion

In this section, we discuss the results and compare our previous work [16] with our updated proposed method.
From the results of MRA model 1 (Section 5.1), the recognition rate of the subjective evaluation level of 0.5 was 100%. However, seven input variables were used for MRA model 1 while the sample size used in this paper was only 22 subjects. Therefore, we consider that this evaluation does not have high reliability, and we must increase the number of subjects in order to evaluate MRA model 1.
From the results of MRA model 2 (Section 5.2), the recognition rate of the subjective evaluation level of 0.5 was 86.4%. The subject with the lowest rating among those with a subjective evaluation of 0.5 was considered a valid case, because we verified that this subject's driving skill was low, as indicated by the driving ability evaluation system (S.D.A.P. [32]).
Sakurai et al. [16] proposed a method wherein the driver’s gaze range was estimated while driving the entire test course. Our proposed method used GPS information to extract only the gaze data near the intersection on the test course and performed the evaluation within this specific area. Therefore, our proposed method could not evaluate the subjects’ gaze over the entire course.

6. Conclusions

In this paper, we proposed applying the experimental method of our previous study [16]. The proposed method used GPS information to extract the video and eye information at an intersection with a traffic light. In addition, we extended the functions of our car driving ability evaluation system using image processing and eye information. We mainly carried out three things:
(1)
We defined the condition for eye fixation, one of the eye movements, using the coordinates of the traffic light in the video obtained by the image processing and the coordinates of the gaze obtained from the eye tracking device.
(2)
We constructed the system to extract the gaze information about eye fixation.
(3)
We investigated the ability of gaze information and drivers’ ages to predict the three-level subjective evaluation given by the professional driving instructor.
The coefficient of determination was more than 0.5, indicating a relationship between the gaze information obtained by our proposed system and the subjective evaluation by the driving instructor. We consider that this result regarding gaze information for the traffic light will help prevent car accidents. Our driving evaluation system runs automatically and helps to review elderly drivers' driving ability. For accurate evaluation, it is also valuable to be able to evaluate drivers in a free, normal environment without instructors. The final objective of this paper was to construct a car driving evaluation system that could determine whether elderly drivers are safe to continue driving.
As future work, we should confirm the results when adding other objects (a stop sign, a pop-up doll and so on). We are also considering using NIRS (near-infrared spectroscopy) as additional objective data to improve the accuracy of our proposed method.

Author Contributions

T.S. from University of Miyazaki developed the system, performed the experiments and wrote the manuscript; H.T. from University of Miyazaki managed the research project and revised the manuscript; K.S. from University of Miyazaki managed the research project and revised the manuscript; L.T. from University of Miyazaki revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by OFA SUPPORT INC.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Cabinet Office in Japan. Preventing Traffic Accidents Involving the Elderly. The Current Situation Surrounding the Elderly. Available online: https://www8.cao.go.jp/koutu/taisaku/h29kou_haku/zenbun/genkyo/feature/feature_01.html (accessed on 16 March 2020). (In Japanese).
2. Negative Effects of Operation Suspension. Available online: https://www.ncgg.go.jp/cgss/department/cre/gold/about/page2.html (accessed on 16 March 2020). (In Japanese).
3. Shimada, H.; Makizako, H.; Tsutsumimoto, K.; Hotta, R.; Nakakubo, S.; Doi, T. Driving and Incidence of Functional Limitation in Older People: A Prospective Population-Based Study. Gerontology 2016, 62, 636–643.
4. Shimada, H.; Makizako, H.; Doi, T.; Lee, S. Lifestyle activities and the risk of dementia in older Japanese adults. Geriatr. Gerontol. Int. 2018, 18, 1491–1496.
5. Schultheis, M.T.; Deluca, J.; Chute, D.L. Handbook for the Assessment of Driving Capacity; Academic Press: San Diego, CA, USA, 2009.
6. Hartman, E. Driver vision requirements. Soc. Automot. Eng. 1970, 629–630.
7. Owsley, C.; Ball, K.; Sloane, M.E.; Roenker, D.L.; Bruni, J.R. Visual/cognitive correlates of vehicle accidents in older drivers. Psychol. Aging 1991, 6, 403–415.
8. Takei Scientific Instruments Cooperation. Available online: https://www.takei-si.co.jp/en/productinfo/detail/65.html (accessed on 16 March 2020).
9. Krajewski, J.; Trutschel, U.; Golz, M.; Sommer, D.; Edwards, D. Estimating fatigue from predetermined speech samples transmitted by operator communication systems. In Proceedings of the 5th International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, Big Sky, MT, USA, 22–25 June 2009.
10. Daza, I.G.; Hernandez, N.; Bergasa, L.M.; Parra, I.; Yebes, J.J.; Gavilan, M.; Quintero, R.; Llorca, D.F.; Sotelo, M.A. Drowsiness monitoring based on driver and driving data fusion. In Proceedings of the 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, DC, USA, 5–7 October 2011; pp. 1199–1204.
11. Han, C.C.; Pai, Y.J.; Lee, C.H. A Fast Implementation Framework for Drowsy Driving Detection on Embedded Systems. In Proceedings of the 2019 International Conference on Machine Learning and Cybernetics (ICMLC), Kobe, Japan, 7–10 July 2019; pp. 854–860.
12. Gao, F.; He, B.; He, Y. Detection of Driving Capability Degradation for Human-machine Cooperative Driving. Sensors 2020, 20, 1968.
13. Kunishige, M.; Fukuda, H.; Iida, T.; Kawabata, N.; Ishizuki, C.; Miyaguchi, H. Spatial navigation ability and gaze switching in older drivers: A driving simulator study. Hong Kong J. Occup. Ther. 2019, 32, 22–31.
14. Van Leeuwen, P.M.; Happee, R.; de Winter, J.C.F. Changes of driving performance and gaze behavior of novice drivers during a 30-min simulator-based training. Procedia Manuf. 2015, 3, 3325–3332.
15. Reimer, B. Impact of Cognitive Task Complexity on Drivers' Visual Tunneling. Transp. Res. Rec. J. Transp. Res. Board 2009, 2138, 13–19.
16. Sakurai, K.; Tamura, H. A Study on Gaze Range Calculation Method during an Actual Car Driving Using Eyeball Angle and Head Angle Information. Sensors 2019, 19, 4774.
17. Koga, K. The Eye Movement Research Handbook; Japan Institute for Science of Labor: Tokyo, Japan, 1998. (In Japanese)
18. Ohsuga, T.; Tanaka, M.; Niiyama, Y.; Inoue, H. Experimental Study on Eye Fixation Time in Opinion Test with Food Pictures. Trans. Soc. Instrum. Control. Eng. 2013, 49, 880–886.
19. Matsuo, H.; Kimura, K. Traffic Lights Recognition Using Learning and Detecting Shape and Color, IPSJ SIG Technical Report. IPSJ SIG Notes CVIM 2014, 2014, 1–7. (In Japanese)
20. Research on Traffic Light Recognition Method for Tsukuba Challenge. Available online: http://www.ail.cs.gunma-u.ac.jp/ailwiki/index.php (accessed on 16 March 2020). (In Japanese).
21. Omachi, M.; Omachi, S. Fast Detection of Traffic Light with Color and Edge Information; The Institute of Image Electronics Engineers of Japan: Tokyo, Japan, 2009; Volume 38, pp. 673–679. (In Japanese)
22. Moizumi, H.; Sugaya, Y.; Omachi, M.; Omachi, S. Traffic Light Detection Considering Color Saturation Using In-Vehicle Stereo Camera. J. Inf. Process. 2016, 24, 349–357.
23. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
24. Yamasaki, A.; Raksincharoensak, P.; Shino, M. Extraction of Driver's Gaze Region by Face Direction Estimation Using On-board Cameras. In Transactions of the Society of Automotive Engineers of Japan; Society of Automotive Engineers of Japan: Tokyo, Japan, 2017; Volume 48, pp. 1113–1119. (In Japanese)
25. Salvi, S.M.; Akhtar, S.; Currie, Z. Ageing changes in the eye. Postgrad. Med. J. 2006, 82, 581–587.
26. Hoffman, J.E. Stages of processing in Visual Search and Attention. In Stratification in Cognition and Consciousness; Challis, B.H., Velichkovsky, B.M., Eds.; John Benjamins: Amsterdam, The Netherlands; Philadelphia, PA, USA, 1999; pp. 43–71.
27. Trevarthen, C. Two visual systems in primates. Psychologische Forschung 1969, 31, 321–337.
28. Velichkovsky, B.M. Visual Cognition and its Spatial-Temporal Context. In Cognitive Research in Psychology; Klix, F., Hoffmann, J., van der Meer, E., Eds.; North Holland: Amsterdam, The Netherlands, 1982; pp. 30–48.
29. Pomplun, M. Analysis and Models of Eye Movements in Comparative Visual Search; Cuvillier: Göttingen, Germany, 1998.
30. Velichkovsky, B.M.; Rothert, A.; Miniotas, D.; Dornhoefer, S.M.; Joos, M.; Pannasch, S. Visual Fixations as a Rapid Indicator of Hazard Perception. In Operator Functional State and Impaired Performance in Complex Work Environments; Hockey, G.H.R., Gaillard, A.W.K., Burov, O., Eds.; IOS Press: Amsterdam, The Netherlands, 2003; pp. 313–321.
31. Sun, Q.; Xia, J.C.; He, J.; Foster, J.K.; Falkmer, T.; Lee, H. Towards unpacking older drivers' visual-motor coordination: A gaze-based integrated driving assessment. Accid. Anal. Prev. 2018, 113, 85–96.
32. OFA Support. S.D.A.P. Available online: http://minamikyusyu-car.main.jp/sdap/ (accessed on 17 March 2020). (In Japanese).
Figure 1. The TalkEye Lite measurement system: (a) the TalkEye Lite goggle; (b) a person wearing the TalkEye Lite goggle.
Figure 2. The TalkEye Lite operation image.
Figure 3. Types of markers.
Figure 4. Between range to be calibrated, marker distance ratio and the coordinate positions of the marker. α, β means the ratio between the markers.
Figure 5. Head direction axes used for template matching.
Figure 6. The flow of head angle estimation using template matching.
Figure 7. Experimental environment: (a) the plot map from the GPS information; (b) an image of the intersection with a light.
Figure 8. The flow of our proposed method.
Figure 9. The yaw angle of a driver's face at the intersection.
Figure 10. The positional relationship between the gaze point and the traffic light point.
Figure 11. Results of MRA model 1.
Figure 12. Results of MRA model 2.
Table 1. Comparative results for various image processing methods.
Method | Recognition Rate
Haar-like feature + Adaboost [19] | 80.0%
RGB → HSV + Extraction of specific color + Noise removal [20] | 84.0%
RGB → Normalized RGB + Extraction of candidate region + Extraction of edge + Apply to the circle equation [21] | 86.6%
Histogram + Kalman filter [22] | 86.0%
YOLOv2 [23] | 87.4%
YOLOv2-tiny [23] | 93.0%
Table 2. PC environment for recognizing the traffic light.
Learning environment: Memory 16 GB; CPU Core i7 8700 (3.2 GHz); GPU GeForce RTX 2080 (VRAM: 8 GB); CUDA 10.
Execution environment: Memory 8 GB; CPU Core i7 8700 (3.2 GHz); OpenCV 3.4.0.
Table 3. The subjects' gaze information from our proposed method.
SubjectAgeWeatherTime ZoneSubjective EvaluationS.D.A.P. Score [32]Number of Fixations 10Total of Fixation Time 10Minimum Distance 10Average Distance 10Number of Fixations 15Total of Fixation Time 15
A82Rain2 p.m.0.5−766107.80.825.081212.6
B66Rain3 p.m.0−1650412.516.19122.5
C77Fine4 p.m.0.5−91041.64.147.50103.8
D30Fine4 p.m.1−49101.85.207.96227.5
E79Cloudy2 p.m.1−390186.80.365.372610.8
F40Fine4 p.m.0.5−235103.31.607.31105.8
G38Fine4 p.m.1−60540.83.886.8561.1
H77Fine3 p.m.0.5−7170040.6
I53Cloudy3 p.m.1−440102.61.556.30225.7
J67Cloudy3 p.m.0.5−195103.24.496.93106.2
K59Fine4 p.m.0.5−355102.72.114.39122.7
L57Fine1 p.m.0.5−1931212.50.886.19815
M44Fine1 p.m.0.5−73881.32.487.96125.1
N84Cloudy1 p.m.0−95740.62.555.85122.1
O67Cloudy2 p.m.0.5−450129.41.446.761014.5
P49Cloudy11 a.m.0.5−217163.93.107.19145.6
Q69Cloudy11 a.m.0.5−131261.73.435.37144.4
R74Fine10 a.m.0.5−785166.11.725.7687.5
S40Fine2 p.m.0.5−733102.80.875.77206.3
T67Cloudy1 p.m.0.5−82264.83.757.0645
U67Cloudy1 p.m.0.5−64541.71.996.9021.5
V46Cloudy4 p.m.0.5−3600020.4
Table 4. The subjects' gaze information from our proposed method.
SubjectMinimum Distance 15Average Distance 15Number of Fixations 20Total of Fixation Time 20Minimum Distance 20Average Distance 20Number of Fixations 25Total of Fixation Time 25Minimum Distance 25Average Distance 25
A0.827.671015.10.708.35615.60.829.57
B2.519.3145.12.1511.372062.5114.99
C4.1410.23125.83.5511.02126.84.1413.29
D5.2011.172915.34.4612.481816.55.2015.20
E0.367.352813.80.318.003415.20.369.72
F1.228.17139.81.058.71148.71.2211.10
G3.888.0382.53.3311.78102.93.8814.59
H12.4114.2231.89.0114.7182.99.0116.81
I1.558.851710.31.5511.984015.51.5514.50
J4.499.901010.64.4912.621613.54.4914.67
K2.114.3983.82.116.47206.90.6014.00
L0.887.09315.61.177.46415.80.887.65
M2.4810.9257.92.4812.96108.42.4813.39
N2.559.4263.12.9411.55103.82.5513.50
O1.448.481716.91.248.4912181.4410.48
P3.108.8287.23.1010.74147.73.1011.51
Q3.439.0297.52.9410.34169.23.4313.26
R1.726.80147.51.477.4612100.079.53
S0.879.0646.10.7410.10149.50.8712.06
T3.757.6186.13.2210.78209.63.7513.56
U1.997.92135.81.7112.82126.51.9916.14
V9.9411.9091.211.6315.9042.19.9418.42
Table 5. Results of MRA model 1.
Variable | Coefficient | p-Value
Intercept | 1.444 | 0.003
Age | −0.007 | 0.008
Number of fixations 20 | 0.027 | 0.000
Minimum distance 15 | 0.161 | 0.012
Average distance 15 | −0.133 | 0.020
Minimum distance 20 | −0.120 | 0.041
Average distance 20 | 0.122 | 0.041
Table 6. Results of MRA model 2.
Variable | Coefficient | p-Value
Intercept | 1.444 | 0.003
Age | −0.007 | 0.008
Number of fixations 20 | 0.027 | 0.000
