Gaze-Based Vehicle Driving Evaluation of System with an Actual Vehicle at an Intersection with a Traffic Light

Abstract: Due to the population aging in Japan, more elderly people are retaining their driver's licenses, and the increase in the number of car accidents caused by elderly drivers is a social problem. To address this problem, an objective, data-based method to evaluate whether elderly drivers can continue driving is needed. In this paper, we propose a car driving evaluation system based on gaze as calculated from eye and head angles. We used an eye tracking device (TalkEye Lite) made by the Takei Scientific Instruments Corporation. For our image processing technique, we propose a gaze fixation condition using deep learning (YOLOv2-tiny). By combining the eye tracking device and the proposed gaze fixation condition, we built a system with which drivers can be evaluated during actual car operation. We describe our system in this paper. In order to evaluate our proposed method, we conducted experiments from November 2017 to November 2018 in which elderly people were evaluated by our system while driving an actual car. The subjects were 22 general drivers (two were 80–89 years old, four were 70–79 years old, six were 60–69 years old, three were 50–59 years old, five were 40–49 years old and two were 30–39 years old). We compared the subjects' gaze information with the subjective evaluation by a professional driving instructor. As a result, we confirmed that the subjects' gaze information is related to the instructor's subjective evaluation.


Introduction
Due to the population aging in Japan, more elderly people are retaining their driver's licenses [1]. However, elderly people may experience a decrease in dynamic visual acuity, difficulty processing multiple pieces of information at the same time and an inability to make instantaneous decisions, resulting in delayed steering and brake operations. Cessation of driving simply due to advanced age may hinder independence in life, increase the risk of illnesses such as depression and shorten life expectancy [2][3][4]. Accordingly, a method to evaluate whether elderly drivers can continue driving safely is needed. However, establishing criteria for judging the driving ability of elderly people is difficult [5]. When driving a car, 90% of relevant information is generally visual [6]. In addition, Owsley et al. [7] modeled the relationship between eye health, the effective field of view and car accidents involving elderly drivers, finding that the effective field of view was most strongly related to these accidents. Therefore, the range of the effective field of view is strongly involved in a person's ability to drive a motor vehicle.
In this paper, we propose a car driving evaluation system using both the gaze information obtained from an eye tracking device (TalkEye Lite [8]) and the positional information of a traffic light obtained by image processing. With the increase in car driving accidents, studies on detecting fatigue during driving using speech recognition [9] and detecting sleepiness using the percentage of eyelid closure (PERCLOS) [10] or the driver's head location and eye status [11] have been conducted. Although fatigue and sleepiness can be detected using these methods, they are insufficient for determining driving ability because an assessment of the driver's aptitude is also needed. Paper [12] presented a car driving evaluation in a driving simulator intended for automated driving applications; however, it did not evaluate driving that includes gaze information in an actual car.
The driving aptitude of elderly drivers is judged by measuring driving skill, the presence or absence of visual field abnormalities and cognitive ability. Usually, each of these is measured separately. However, determining a driver's suitability by measuring visual information and reaction speed with a gaze tracking device has been researched using both driving simulators and actual vehicles.
Regarding visual field abnormalities during driving, studies have focused on the number of fixations [13], average fixation time, average eye movement distance and the narrowing of the gaze range and peripheral visual field [14][15][16]. Although distraction can be detected using these methods, they are insufficient for determining the cause of distraction. In addition, these studies were carried out using driving simulators, which are less ecologically valid and generalizable than evaluating on-road driving in an actual vehicle. Moreover, in references [13][14][15], it was not possible to know whether the driver was aware of the information on the road (e.g., pedestrians, traffic lights, road signs). Our proposed method can determine what specifically the driver is looking at using the image from the view camera. Therefore, our system can calculate the visual attention to the traffic light while driving, which is a useful way to help prevent car accidents.

Devices
We used a GPS device and an eye tracking device to obtain eye information in a specific area. In this chapter, we introduce the eye tracking and GPS devices. The eye tracking device is the same as in reference [16].

Measurement System Using TalkEye Lite
TalkEye Lite [8] is a wearable eye movement measurement system that connects directly to the processing computer and uses a USB camera for eye detection and visual field recording. Through its USB camera, TalkEye Lite can track the pupil of the subject; therefore, using the TalkEye Lite software allows us to see what the subject is looking at. The overlay display on the visual field image includes not only the left and right viewpoints but also the center axis of the gaze of both eyes. Figure 1a is a picture of the TalkEye Lite goggle with arrows indicating its view camera and its eyeball camera and Figure 1b is a picture of a person wearing the TalkEye Lite goggle. The blue cross points in Figure 2 indicate the gaze of both eyes. The eye motion analysis program can analyze visual videos recorded by the TalkEye Lite, enabling us to determine at which target the subject was looking. It is possible to calculate the angle of convergence using the angle data of both eyes. The angle is expressed in degrees in values ranging from -180 degrees to +180 degrees.
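The convergence angle mentioned above is derived from the yaw angles of both eyes. The following is a minimal sketch under an assumed sign convention (positive yaw to the right for both eyes, center axis as the midpoint); the TalkEye Lite software performs this internally, so the function names and convention here are ours:

```python
def convergence_angle(left_yaw_deg, right_yaw_deg):
    """Convergence angle and central gaze axis from the yaw angles of both
    eyes (degrees, -180 to +180). Assumed convention: converging eyes give
    left_yaw_deg > right_yaw_deg."""
    convergence = left_yaw_deg - right_yaw_deg
    center_axis = (left_yaw_deg + right_yaw_deg) / 2.0
    return convergence, center_axis
```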

The Head Angle Estimation Method Using Template Matching
The head angle estimation method [16] is used in this paper. We used the view camera of the TalkEye Lite to obtain the head movement information. Whereas head angle estimation with an RGB-D sensor requires installing a separate sensor, the proposed template matching method uses the field-of-view camera attached to the TalkEye Lite. Therefore, our proposed method has the merit that it can be easily used in various environments. Since the line of sight and the head angle are estimated during actual car driving, it is necessary to use a method that can estimate the line of sight given the restricted physical space available for equipment and measurement; template matching accomplishes this. We installed four templates as shown in Figure 3 (Marker1, an upward triangle; Marker2, a circle; Marker3, a downward triangle; and Marker4, a star) on the front glass of the car. This system required calibration prior to data collection. We defined the forward-facing position of the head as 0 degrees. We placed the markers (Marker1, Marker2, Marker3 and Marker4) on the dashboard of the car as shown in Figure 4, with one of the three markers directly in front of the driver (0 degrees). We asked the driver to face forward for five seconds in order to record the reference position for each of the three markers. These reference positions were used for template matching. The driver's head angle was calculated after determining the head movement amount "a" as the difference between the current marker position and the reference position. Marker4 was excluded from calibration because it did not appear on the screen when the driver faced the front. Therefore, the reference position of the star marker was calculated using the ratio of α and β as shown in Figure 4. The coordinates of Cx were obtained by transforming Equation (1), as in Equation (2), to Equation (3).
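The marker tracking and displacement step can be sketched as follows. This is a minimal, self-contained NumPy illustration using a brute-force sum-of-squared-differences search; in practice a library routine such as OpenCV's matchTemplate would be used, and the function names here are ours:

```python
import numpy as np

def find_marker(frame, template):
    """Exhaustively match the template against the frame using the sum of
    squared differences (SSD); return (x, y) of the best-matching top-left
    corner. A didactic stand-in for a library template matcher."""
    th, tw = template.shape
    fh, fw = frame.shape
    best_ssd, best_xy = None, None
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            patch = frame[y:y + th, x:x + tw].astype(float)
            ssd = float(np.sum((patch - template) ** 2))
            if best_ssd is None or ssd < best_ssd:
                best_ssd, best_xy = ssd, (x, y)
    return best_xy

def head_movement(current_pos, reference_pos):
    """Pixel displacement of a marker relative to its calibrated reference
    position: 'a' (horizontal) and 'b' (vertical)."""
    a = current_pos[0] - reference_pos[0]
    b = current_pos[1] - reference_pos[1]
    return a, b
```

The displacement (a, b) is then converted to head angles as described in the following subsections.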

Estimating Inclination of Head
As for the inclination of the head, there are three axis directions: the yaw axis, the pitch axis and the roll axis (Figure 5). The yaw angle and the pitch angle are necessary for gaze estimation and were calculated using differences in the recorded marker coordinates. If the yaw axis and the pitch axis changed while the roll axis of the head was inclined, the head orientation angle could not be estimated correctly. An example is shown in Figure 6. When the head roll axis was tilted by θ (Figure 6) and the head turned b degrees to the right, the marker position moved by a pixels. Therefore, when the roll axis rotated, we first needed to determine the roll angle; after calculating the roll angle, the coordinate transformation was performed. The deviation of the estimated angle due to the roll angle is shown in Figure 5: the deviation of the yaw axis can be calculated as a − a cos θ and the deviation of the pitch axis as a sin θ. By taking these deviations into account, we obtained coordinates converted to a zero roll angle, solving the problem caused by the inclination of the head.


Calculation of the Roll Angle
The formula for calculating the roll angle is shown in Equation (5). Letting the coordinates of the two found markers be (x1, y1) and (x2, y2), with the condition x2 > x1, the roll angle could be calculated by Equation (5):

θ = tan^(-1)((y2 − y1) / (x2 − x1)) (5)

where θ is the roll angle and (x1, y1) and (x2, y2) are the coordinates of the two markers found.

Calculation of Yaw Angle and Pitch Angle
The yaw angle and the pitch angle were calculated by obtaining the pixel movement amount of the marker from the coordinates (x', y') given by the rotation conversion in Equations (6) and (7), where (x, y) are the coordinates of the marker before the coordinate transformation and (x', y') are the coordinates after the transformation:

x' = x cos θ − y sin θ (6)
y' = x sin θ + y cos θ (7)

Equations (8) and (9) calculate the yaw angle and the pitch angle. For example, if the marker coordinate moved a pixels horizontally, the head inclined with respect to the yaw angle by the angle given in Equation (8):

Yaw face angle = a × Camera viewing angle (deg) / Camera resolution (8)
Pitch face angle = b × Camera viewing angle (deg) / Camera resolution (9)

where a is the horizontal movement amount of the marker (pixels) and b is the vertical movement amount of the marker (pixels). The recognition accuracy of the template matching was 90.2% and the average error of the head angle estimation was 4.1 degrees.
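The roll compensation and pixel-to-angle conversion of this section can be sketched as follows (a minimal Python sketch; the function names are ours):

```python
import math

def roll_angle(p1, p2):
    """Roll angle θ from the two marker coordinates (x1, y1) and (x2, y2),
    assuming x2 > x1 as in Equation (5)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.atan2(y2 - y1, x2 - x1)

def rotate(x, y, theta):
    """Rotation conversion of Equations (6) and (7): marker coordinates
    after compensating the head roll so the roll angle becomes zero."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def pixels_to_deg(pixels, resolution, viewing_angle_deg):
    """Equations (8)/(9): convert a marker displacement in pixels into a
    yaw or pitch head angle via the camera viewing angle and resolution."""
    return pixels * viewing_angle_deg / resolution
```

For example, with a camera resolution of 640 pixels and a 60-degree viewing angle (hypothetical values), a 320-pixel horizontal displacement corresponds to a 30-degree yaw angle.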

GPS Information
In this paper, we used high-precision Global Positioning System (GPS) information to extract the information within a specific area of the driving course, which contained a traffic light. The video segment extracted by GPS was about 30 seconds long. Figure 7a shows the plot map from the GPS information, with the red rectangle representing the traffic-light area, the white point representing the car and the white line representing the driving trajectory. Figure 7b is a still image of the intersection with the traffic light. In our driving experiment, drivers hardly changed their head position by moving backward or forward, and such changes of head position did not alter the yaw and pitch angles of the head orientation in the driver's seat. Therefore, our proposed method did not use the backward and forward movement of the head position.
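The GPS-based extraction amounts to keeping only the samples whose coordinates fall inside the rectangle around the intersection. A minimal sketch, with a hypothetical bounding-box representation and sample format:

```python
def in_area(lat, lon, area):
    """area = (lat_min, lat_max, lon_min, lon_max): a bounding rectangle
    around the intersection with the traffic light (hypothetical format)."""
    lat_min, lat_max, lon_min, lon_max = area
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def extract_segment(track, area):
    """Keep only the GPS samples (timestamp, lat, lon) recorded inside the
    traffic-light area; the matching video frames form the ~30 s segment."""
    return [(t, la, lo) for (t, la, lo) in track if in_area(la, lo, area)]
```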


Proposed Method
Humans constantly move and stop their eyes in order to recognize the shapes and colors of objects in their environment. In the eye movement field, gaze fixation is defined as maintaining the visual gaze on a single location, or the gaze interval between two saccades. A saccade is a rapid eye movement performed by humans to obtain information. Humans require a fixed latency of 150 to 250 milliseconds at the start of a saccade [17], so the gaze must remain for at least 150 milliseconds between saccades. In order to detect gaze fixation using saccade information, the sampling frequency should be over 14 Hz [18]. However, the sampling frequency of the data recorded by the TalkEye Lite is 10 Hz, which is not high enough to use saccades to identify the gaze fixation condition. In this paper, we therefore identify the gaze fixation condition using not saccades but the positional relationship between the gaze point and the traffic light point, as described in Figure 8. The conditions labeled 4.1 to 4.4 in Figure 8 are further described below.


The Traffic Light in the Video is Recognized by Image Processing
In order to recognize the traffic signal in the videos, we focused on the recognition rate under environmental changes (clear, cloudy, rain) and compared six image processing methods. Table 1 shows the comparison results. Regarding the recognition methods using color features, we confirmed that the traffic light could not be recognized reliably due to the taillights of preceding vehicles, surrounding signboards, strong sunlight, etc. [19][20][21][22]. We confirmed the recognition rates and environmental robustness of YOLOv2 and YOLOv2-tiny [23]. The 11,034 traffic light images such as the one in Figure 7b were used to train the YOLOv2 and YOLOv2-tiny models in the learning environment (Table 2). To confirm the accuracy of YOLOv2-tiny, we used three daytime videos. In this paper, the recognition rate refers to the ratio of frames in which the traffic light was recognized to the frames in which the traffic light appeared in the video. For YOLOv2-tiny, the recognition rate was over 93.0%; it was almost 100% for the 22 subjects' videos used in this paper. The processing time was 0.07 seconds/frame in the execution environment (Table 2).
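The recognition rate as defined above can be computed per video from two per-frame annotations. A minimal sketch (the detector itself is not reproduced here; the boolean inputs are assumed to come from YOLOv2-tiny detections and ground-truth visibility labels):

```python
def recognition_rate(detected, visible):
    """detected[i]: the detector reported a traffic light in frame i.
    visible[i]: a traffic light actually appears in frame i.
    Rate = frames recognized / frames in which the light appears."""
    appearing = [d for d, v in zip(detected, visible) if v]
    return sum(appearing) / len(appearing) if appearing else 0.0
```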

The Driver Faces the Front
In this condition, the face angle estimation method [16] explained in Chapter 3.2 was used. We defined the front as the area within ±15 degrees of the average yaw angle at the intersection, based on reference [24]; right and left confirmation states were outside of this range. This condition draws on the face angle estimation of reference [16] and on the evaluation of driving behavior based on how easily pedestrians are overlooked, considering the effective human visual field, visual characteristics and head posture [24]. Figure 9 shows the yaw angle of an example driver's face at the intersection. When the driver faced to the right, the yaw angle had a positive value over +15 degrees. Figure 9 shows that this driver faced right but did not face left when turning right.
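The front/left/right classification described above reduces to a threshold test on the yaw angle. A minimal sketch (the function name and parameter names are ours):

```python
def head_state(yaw_deg, front_center=0.0, half_width=15.0):
    """Classify the head direction from the yaw angle (degrees). 'front' is
    within ±15 degrees of the average yaw at the intersection; a larger
    positive yaw is a right confirmation, a larger negative yaw a left one."""
    if yaw_deg > front_center + half_width:
        return "right"
    if yaw_deg < front_center - half_width:
        return "left"
    return "front"
```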

The Traffic Light Exists Within the Visual Effective Field
First, we obtained the traffic light center coordinates (x_t, y_t) by detecting the traffic light (4.1). Second, the pixel distance d between the traffic light center coordinates (x_t, y_t) and the gaze coordinates (x_g, y_g) obtained from the eye tracking device was calculated by Equation (10):

d = sqrt((x_t − x_g)^2 + (y_t − y_g)^2) (10)

The system could then calculate the numerical value used to determine whether the traffic light was within the effective field of view; the amount of change in the angle per pixel in the video was 0.07 degrees/pixel at the specific driving school. Figure 10 shows the positional relationship between the gaze point and the traffic light at the intersection; this driver looked near the traffic light or looked around. The effective field narrows with aging [25], so we varied the threshold of the effective field of view from 10 to 25 degrees in 5-degree steps.
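The angular check of this condition combines the pixel distance of Equation (10) with the degrees-per-pixel factor stated above. A minimal sketch (function names are ours; the 0.07 value is the one reported for the specific driving school):

```python
import math

DEG_PER_PIXEL = 0.07  # angular change per pixel of the view video (from the text)

def gaze_to_light_deg(gaze, light):
    """Angular distance between the gaze point (xg, yg) and the traffic
    light center (xt, yt): the Euclidean pixel distance of Equation (10)
    scaled by the degrees-per-pixel factor."""
    d_pixels = math.hypot(light[0] - gaze[0], light[1] - gaze[1])
    return d_pixels * DEG_PER_PIXEL

def within_effective_field(gaze, light, threshold_deg):
    """Condition 4.3: the traffic light lies within the effective field,
    for a threshold of 10, 15, 20 or 25 degrees."""
    return gaze_to_light_deg(gaze, light) <= threshold_deg
```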

The Conditions from 4.1 to 4.3 are Satisfied for 300 Milliseconds or More
This condition is based on visual information processing and can be described in terms of a two-level model [26][27][28][29]. At the first level, objects are dynamically localized in the 3D environment; this level is 'ambient'. These objects, or rather 'blobs', are identified at the second level. Fixation duration ranges from 150 to 250 milliseconds at the first level and increases to 500 milliseconds at the second level. Additionally, according to previous research [30,31], the number and duration of eye fixations can be important metrics for the visual recognition range when facing a dangerous traffic situation.
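At the 10 Hz sampling rate of the TalkEye Lite, the 300 ms requirement corresponds to three or more consecutive samples in which conditions 4.1 to 4.3 all hold. A minimal sketch of detecting such runs (function name is ours):

```python
SAMPLE_HZ = 10            # TalkEye Lite sampling frequency
MIN_FIXATION_S = 0.3      # conditions 4.1-4.3 must hold for 300 ms or more
MIN_FRAMES = int(MIN_FIXATION_S * SAMPLE_HZ)   # = 3 consecutive samples

def fixation_runs(flags):
    """Given per-sample booleans (conditions 4.1-4.3 all satisfied), return
    (start_index, length) for every run of at least MIN_FRAMES samples."""
    runs, start = [], None
    for i, ok in enumerate(list(flags) + [False]):  # sentinel closes last run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= MIN_FRAMES:
                runs.append((start, i - start))
            start = None
    return runs
```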

The Traffic Light Exists Within the Visual Effective Field
First, we obtained the traffic light center coordinates (x t , y t ) by detecting the traffic light (4.1). Second, the pixel distance between the traffic light center coordinates (x t , y t ) and the gaze coordinates (x g , y g ) obtained from the eye tracking device was calculated by Equation (10). (10) The system could calculate the numerical value that could be used to determine whether the traffic light was within the effective field of view; the amount of change in the angle per one pixel in the video was 0.07/pixel in the specific driving school. Figure 10 shows the positional relationship between the gaze point and the traffic light at the intersection. Figure 10 shows that this driver looked near a traffic light or looked around. The effective field narrows with aging [25] so we changed the threshold of the effective field of view from 10 to 25 every 5 relates to d fixation time decreasing.
Electronics 2020, 9, x FOR PEER REVIEW 8 of 16 Figure 9. The yaw angle of a driver's face at the intersection.

The Traffic Light Exists Within the Visual Effective Field
First, we obtained the traffic light center coordinates ( , ) by detecting the traffic light (4.1). Second, the pixel distance between the traffic light center coordinates ( , ) and the gaze coordinates ( , ) obtained from the eye tracking device was calculated by Equation (10).
The system could calculate the numerical value that could be used to determine whether the traffic light was within the effective field of view; the amount of change in the angle per one pixel in the video was 0.07/pixel in the specific driving school. Figure 10 shows the positional relationship between the gaze point and the traffic light at the intersection. Figure 10 shows that this driver looked near a traffic light or looked around. The effective field narrows with aging [25] so we changed the threshold of the effective field of view from 10 to 25 every 5 relates to d fixation time decreasing.

The Conditions from 4.1 to 4.3 are Satisfied for 300 Milliseconds or More
This condition is based on visual information processing and can be described in terms of a two-level model [26][27][28][29]. At the first level, objects are dynamically localized in the 3D environment; this level is 'ambient'. These objects, or rather 'blobs', are identified at the second level. Fixation duration ranges from 150 to 250 milliseconds at the first level and increases to 500 milliseconds at the second level. Additionally, according to previous research [30,31], the number and duration of eye fixations can be important metrics for the visual recognition range when facing a dangerous traffic situation.
Our proposed method recorded the total eye fixation time (seconds), the number of eye fixations, the minimum distance and the average distance during gaze fixation. Using this gaze information and the driver's age, we examined whether gaze evaluation at intersections was possible. Hereafter, variable names follow this convention: "the number of eye fixations 20" denotes the number of eye fixations when the effective field of view is 20 degrees.
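The extraction of these four metrics can be sketched as below. This is a simplified illustration under assumed inputs: the function name, the per-frame `(condition_met, distance)` representation and the frame rate are our assumptions; the 300 ms minimum duration is the condition stated in the paper.

```python
def fixation_metrics(frames, fps=30, min_duration=0.3):
    """`frames` is a per-frame list of (condition_met, distance) pairs,
    where `condition_met` says whether conditions 4.1-4.3 all hold and
    `distance` is the gaze-to-light distance in degrees.  Returns
    (number of fixations, total fixation time in seconds, minimum
    distance, average distance), counting only runs of consecutive
    frames lasting at least `min_duration` seconds (300 ms)."""
    min_frames = int(min_duration * fps)
    runs, current = [], []
    for met, dist in frames:
        if met:
            current.append(dist)
        else:
            if len(current) >= min_frames:
                runs.append(current)   # run long enough -> one fixation
            current = []
    if len(current) >= min_frames:
        runs.append(current)
    if not runs:
        return 0, 0.0, None, None
    all_d = [d for run in runs for d in run]
    total_time = len(all_d) / fps
    return len(runs), total_time, min(all_d), sum(all_d) / len(all_d)
```

A run shorter than 300 ms is discarded rather than counted as a fixation, which matches the "satisfied for 300 milliseconds or more" condition.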

Experimental Methods and Results
In this chapter, we investigate whether our proposed method was able to evaluate gaze information near the traffic light. The subjects were 22 general drivers (two were 80-89 years old, four were 70-79 years old, six were 60-69 years old, three were 50-59 years old, five were 40-49 years old and two were 30-39 years old). Each subject took only one test drive without training and drove for six to fifteen minutes on the driving school course. Tables 3 and 4 show the subjects' gaze information obtained by our proposed method. We investigated the relationship between the gaze information and the three-level subjective evaluation (0/0.5/1) by the driving school instructors. The evaluation area was limited to the traffic signal area. The subjective evaluation covered the following items: signal confirmation, left and right safety confirmation, over the stop line, oncoming lane confirmation, and right side and right rear confirmation. A score of 0 was bad, 0.5 was intermediate and 1 was good.
Here, we explain S.D.A.P. (Smart Driving Assessment Program) [32], one of the driving-skill evaluation systems developed by OFA SUPPORT INC for driving schools. S.D.A.P. evaluates driving technique from the operation of the steering wheel, accelerator, brake, etc. The score obtained from S.D.A.P. (the S.D.A.P. score) is deduction-based, starting from 0. In this paper, the S.D.A.P. score is used as reference data. Table 3 shows the relationship between the subjective evaluation (0/0.5/1) and the S.D.A.P. score. The multiple regression analysis (MRA) used four pieces of gaze information (the total eye fixation time in seconds, the number of eye fixations, the minimum distance and the average distance) and the driver's age as inputs, and the subjective evaluation (0/0.5/1) as the target output. We used an alpha value of 0.05 to evaluate the statistical significance of the predictors. With this methodology, the proposed method automatically obtained almost the same evaluation as the driving instructor.
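The MRA step is a standard ordinary-least-squares fit; a minimal NumPy sketch is shown below. The function name is our assumption, and the coefficient p-values reported in the paper would additionally require a statistics package (e.g. statsmodels) rather than this bare solver; here we only compute the coefficients and the coefficient of determination R².

```python
import numpy as np

def fit_mra(X, y):
    """Ordinary least squares for multiple regression analysis.
    X: (n_subjects, n_features) matrix of inputs (gaze metrics, age);
    y: length-n vector of target outputs (subjective evaluation 0/0.5/1).
    Returns (coefficients with intercept first, predictions, R^2)."""
    X1 = np.column_stack([np.ones(len(X)), X])     # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # OLS solution
    pred = X1 @ beta
    ss_res = np.sum((y - pred) ** 2)               # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)         # total sum of squares
    return beta, pred, 1.0 - ss_res / ss_tot       # R^2 = 1 - SSres/SStot
```

The R² value returned here corresponds to the coefficients of determination reported for the two MRA models (0.7382 and 0.5517).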

Result Using Various Variables as Inputs
As shown in Table 5, when MRA was performed using age, number of fixations 20, minimum distance 15, average distance 15, minimum distance 20, average distance 20 and average distance 25 as inputs and the subjective evaluation (0/0.5/1) as the target output, the p-value of each coefficient was less than our alpha of 0.05. The number following the variable name in Table 5 represents the threshold of the effective field of view. As shown in Figure 11, the coefficient of determination was 0.7382 so these seven variables were related to the subjective evaluation by the instructor. Additionally, the recognition rate was 100% for the instructor's evaluation of a score of 0.5 (threshold: maximum of 0, minimum of 1). However, the sample size was low at 22 subjects so this result is unreliable. Therefore, we ran a second MRA model with fewer variables.

Figure 11. Results of MRA model 1.


Result Using Subjects' Age and the Number of Fixations as Inputs
As shown in Table 6, when MRA was performed using only age and number of fixations 20 as inputs and the subjective evaluation (0/0.5/1) as the target output, the p-value of each coefficient was less than 0.05. As shown in Figure 12, the coefficient of determination was 0.5517, so these two variables were related to the subjective evaluation by the instructor. Additionally, the recognition rate was 86.4% for the instructor's evaluation of a score of 0.5 (threshold: maximum of 0, minimum of 1).

Figure 12. Results of MRA model 2.
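One plausible reading of the "(threshold: maximum of 0, minimum of 1)" rule is sketched below; this interpretation is our assumption, not stated explicitly in the paper: the lower cut-off is the largest MRA prediction among subjects the instructor rated 0, the upper cut-off is the smallest prediction among subjects rated 1, and predictions falling strictly between the two are classified as 0.5.

```python
def recognition_rate_mid(preds, labels):
    """Fraction of subjects truly rated 0.5 whose MRA prediction falls
    between the two data-derived cut-offs: the maximum prediction of the
    0-rated group and the minimum prediction of the 1-rated group."""
    lo = max(p for p, l in zip(preds, labels) if l == 0)    # "maximum of 0"
    hi = min(p for p, l in zip(preds, labels) if l == 1)    # "minimum of 1"
    mids = [p for p, l in zip(preds, labels) if l == 0.5]
    hit = sum(1 for p in mids if lo < p < hi)
    return hit / len(mids)
```

Under this reading, a recognition rate of 86.4% means that most, but not all, of the 0.5-rated subjects' predictions landed between the two cut-offs.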

Discussion
In this section, we will discuss the results and compare our previous work [8] with our updated proposed method.
From the results of MRA model 1 (Section 4.1), the recognition rate of the subjective evaluation level of 0.5 was 100%. While there were seven input variables used for MRA model 1, the sample size used in this paper was underpowered at 22 subjects. Therefore, we consider that this evaluation does not have high reliability and we must increase the number of subjects in order to evaluate MRA model 1.
From the results of MRA model 2 (Section 4.2), the recognition rate of the subjective evaluation level of 0.5 was 86.4%. The subject with the lowest rating among subjective evaluation of 0.5 was considered valid because we verified that the subject's driving skill was low as indicated by the driving ability evaluation system (S.D.A.P. [32]).
Sakurai et al. [16] proposed a method wherein the driver's gaze range was estimated while driving the entire test course. Our proposed method used GPS information to extract only the gaze data near the intersection on the test course and performed the evaluation within this specific area. Therefore, our proposed method could not evaluate the subjects' gaze over the entire course.

Conclusions
In this paper, we proposed to apply the experimental method of our previous study [16]. The proposed method used GPS information to extract the video and eye information at an intersection with a traffic light. In addition, we extended the functions of our car driving ability evaluation system using image processing and eye information. We mainly carried out three things: (1) We defined the condition for eye fixation, one of the eye movements, using the coordinates of the traffic light in the video obtained by the image processing and the coordinates of the gaze obtained from the eye tracking device.