Research on Assessing Driving Ability of Older Drivers Based on Cognitive Tests: A Case Study of Beijing, China

Abstract: Research on cognitive tests for older drivers helps to identify unsafe drivers accurately and to reduce the risk that older drivers pose to themselves and other road users. This study designs and evaluates a comprehensive cognitive test comprising memory, reaction and judgment ability tests. A total of 72 drivers from Beijing, China, were recruited in 2020 to take the test, and their response time and accuracy were recorded. A one-way ANOVA was used to examine the significance of differences among age and crash record groups. The comprehensive cognitive test proved effective in identifying at-risk older drivers: 96.7% of the safe young group and 100% of the safe older group passed the test, and 89.5% of the at-risk older group failed it. The study clarified the efficiency and accuracy of each question as well as of the whole test. It also confirmed that driving ability decreased with increasing age. The comprehensive cognitive test provides a scientific basis for standardizing the management of licensed older drivers and for guiding older drivers to understand traffic elements and rules.


Introduction
In recent years, the number of car crashes involving older drivers has been increasing, and this has become a problem that society must address. In 2019, the crash rate of drivers aged 60 and older in China was three times higher than that of drivers aged 21 to 25 and 36 to 50. At the same time, the older population is growing, and the corresponding increase in the number of older drivers on road systems poses challenges for driving authorities and the community [1]. In China, the number of drivers aged 60 and older exceeded 15 million by 2020. A new policy cancelled the upper age limit of 70 years for applying for a driver's license, which increased the risk that older drivers pose to themselves and other road users [2].
Crash statistics show that older drivers are more likely than middle-aged and young drivers to make mistakes in driving and to be at fault in a crash [3,4]. In China, the causes of road traffic crashes can be classified into three categories, where the driver's reaction process accounts for about 70-80%, the judgment process accounts for about 20-30% and the operation process accounts for only 10-20% [1]. Meanwhile, older drivers had significantly slower reaction times, had more collisions, drove more slowly, deviated less in speed and exhibited greater performance inconsistency [5,6] in studies where different dangerous scenarios were designed to measure the driving performance parameters of older drivers using a driving simulator [7,8]. As mental and physical functioning decreases, older drivers are more likely than younger drivers to be involved in a crash. In addition, the physical vulnerability of older drivers makes them more susceptible to injuries. Therefore, it is essential to apply cognitive tests to older drivers to accurately identify those who are unsafe.
Driving safely requires adequate cognitive and psychological skills, as well as self-awareness regarding one's own driving abilities. Due to the inconsistency of older drivers' performance, researchers have referred them for neuropsychological evaluations to determine their fitness to drive, including brief mental status examinations, executive functioning tests and awareness tests [9]. The brief mental status examinations mainly consist of the Montreal Cognitive Assessment (MoCA) and the Mini Mental State Examination (MMSE) [10,11]. Awareness allows an individual to engage in appropriate self-monitoring while driving, and subsequently to make important decisions regarding compensatory strategies [12-17].
The measures of visual perception and visual spatial abilities, which are of great importance in predicting unsafe older drivers [18-20], include the UFOA test, the Delayed Match to Sample Test (DMS), the Wechsler Digit Symbol Substitution Test (DSST), the Digit Symbol Matching Test (DSM) and the Clock Drawing Test (CDT) [21,22]. Speed of information processing appears to be an important aspect of safe driving, and the Trail Making Test Part A (TMT-A) was the most useful tool for researchers in danger prediction [23-25]. Executive functioning has many valid implications for carrying out the complex behaviors associated with safe driving, and its measures mainly include the Trail Making Test Part B (TMT-B), the hazard perception test and the Wisconsin Card Sorting Test (WCST) [26,27].
Moreover, a few studies have focused on evaluating attention, memory and specific driving measures for distinguishing unsafe older drivers [28,29]. Comprehensive tests have also been evaluated to assess driving-relevant cognition in older drivers. Rahel compared the Bern Cognitive Screening Test (BCST), which consists of visual-spatial attention, executive functions, eye-hand coordination, distance judgment and speed regulation tests, with paper and pencil cognitive screening tests. The research indicated that the BCST is more accurate than paper and pencil screening tests [30]. Si-Woon Park built the Cognitive Perceptual Assessment for Driving (CPAD) to measure visual perception, attention, working memory and executive function, and the CPAD result was associated with performance in steering, vehicle positioning and making lane changes, as well as with the occurrence of car crashes [31].
Generally, the efficiency of various tests has been evaluated, each focusing on different driving abilities. It is tedious for older drivers to take all these tests to obtain their driving license, and the content of the tests may overlap. In order to improve the efficiency of testing, this study designs a comprehensive test for measuring the cognitive ability of older drivers as a novel approach. Based on the crash categories in China, the cognitive test is divided into three subtests, namely memory ability, reaction ability and judgment ability tests, where each subtest may combine the current test classifications and include parts of existing tests. Moreover, a one-way ANOVA is used to examine the significance of differences in driving behavior between the safe and at-risk groups. Finally, the proposed test is applied to identify unsafe older drivers in the city of Beijing, China.

Study Sample
A total of 72 drivers (mean age 58.68 years, SD = 16.29) with driving certificates were recruited to participate. The drivers were divided into three groups. Whether the driver had experienced crashes in the last five years in which they were partly or completely at fault was selected as a criterion to evaluate a driver's ability [3,30]. In order to test the effectiveness of the test for older drivers, age was used as a second criterion to divide the groups. The young group (under 60 years old) consisted of drivers with good historical driving behavior and no recent major traffic crashes. All of them were classified into the safe younger group, whose mean age was 42.17 years (range = 19-56, SD = 10.35). The older drivers (60 years and older) were classified into two subgroups, the safe older group and the at-risk older group. The average age of the safe older group was 69.73 years (range = 60-87, SD = 6.11), and these drivers had not borne the main responsibility for any traffic crash in the last 5 years. The at-risk older group consisted of drivers who had been involved in crashes and borne the main responsibility in the last 5 years; their average age was 72.05 years (range = 66-78, SD = 3.30).
This study was conducted with approval from the Road Safety Research Institute of the Ministry of Public Security, and the tests were conducted in the experiment center in Beijing, China. The participants were recruited by text messages and recruitment information published on the Internet. After being informed of the intention of this study, the recruited participants volunteered to participate without additional compensation and signed the informed consent form.

Measures
The measures of cognitive ability in this study consisted of the memory, reaction and judgment ability tests, where the question types were mainly true or false and single choice. The questions were specifically designed based on the characteristics of the traffic environment.
The memory ability test included 3 types of questions (short-term memory, figure recognition, and object recognition). In the short-term memory questions, three or four numbers were presented on the screen within four seconds, and the participants were required to choose the right reciprocal of these numbers from the options. The figure recognition questions displayed a traffic sign with an arrow that vanished after two seconds, and the participants were then asked for the direction of the arrow. In the object recognition questions, participants were instructed to memorize three items that appeared on the screen and then select the items that had not been shown before.
The reaction ability test included 3 types of questions (concentration, anti-interference, and quick position recognition). In the concentration questions, four balls of different colors were presented on the screen, moving randomly. After 8 s, the balls disappeared randomly into one of four quadrants of a grid on the screen, and the participants had to specify the quadrant where a specific ball had disappeared. To test anti-interference ability, the participants were required to identify the real color of a word that was written in another color. The quick position recognition questions presented three overlapping figures on the screen, and the participants had to pick the closest one; these questions were specially adapted from the MMSE.
The judgment ability test included 6 types of questions (color judgment, spatial judgment, calculation, clock recognition, information processing, and reasoning judgment). The color judgment questions required the participants to describe the color of an object, while the spatial judgment questions asked about the location relationships between vehicles in a diagram. The calculation ability questions were based on the MoCA. The participants were tested on their quantity ability by counting the number of black spots in four graphs, and on their clock judgment ability by reading the time shown in a given picture, based on the CDT. The information processing and reasoning judgment questions presented a group of traffic signs repeated in a regular pattern, and the participants had to infer the next traffic sign from the underlying pattern.
Based on an analysis of the time tolerance of older drivers, 20 questions were specially designed for the cognitive test, with a total score of 100 points and 5 points for each question. The reaction process was evaluated by the reaction and memory ability questions, so they accounted for 70-80% of the whole test, while the judgment ability questions accounted for nearly 30%. Considering that the driving process of older drivers relies mainly on concentration, anti-interference, short-term memory, object recognition and spatial judgment ability [1], we increased the proportion of these five question types. Each of these five question types accounted for 10-20% of the total score, and each of the other question types accounted for 5%. The final composition of the comprehensive cognitive test is shown in Table 1. The tests were completed on a laptop, and all participants in this study were required to answer the questions within 20 min, where the reading time of each question was 5-8 s and the test time was about 1 min. After the test was submitted on the computer, assistants confirmed the participant's personal information, crash history, recorded response time and accuracy. The crash history was also confirmed through the public security system.
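The scoring scheme described above can be illustrated with a minimal sketch. The function and variable names are hypothetical, the question counts per subtest (7, 6 and 7) are those reported in the Results section, and treating a score of exactly 80 as a pass is an assumption, since the text only gives the 80-point threshold.

```python
# Hypothetical scoring sketch: 20 questions, 5 points each, 100 points in total.
POINTS_PER_QUESTION = 5
# Question counts per subtest as reported in the Results section.
QUESTIONS_PER_SUBTEST = {"memory": 7, "reaction": 6, "judgment": 7}
# 80-point threshold from the ROC analysis; ">=" is an assumption.
PASS_MARK = 80


def score_test(correct_counts):
    """Return (subtest scores, total score, passed) from counts of correct answers."""
    subtest_scores = {}
    for subtest, n_questions in QUESTIONS_PER_SUBTEST.items():
        n_correct = correct_counts[subtest]
        if not 0 <= n_correct <= n_questions:
            raise ValueError(f"{subtest}: expected 0..{n_questions} correct answers")
        subtest_scores[subtest] = n_correct * POINTS_PER_QUESTION
    total = sum(subtest_scores.values())
    return subtest_scores, total, total >= PASS_MARK


# Maximum points per subtest implied by the question counts.
max_scores = {name: n * POINTS_PER_QUESTION for name, n in QUESTIONS_PER_SUBTEST.items()}
```

Under these counts, the memory and reaction questions together carry 65 of the 100 points and the judgment questions carry 35.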

Statistical Analysis
ROC analysis was selected to determine the threshold of the cognitive test for diagnosing the decreased ability of older drivers. The null hypothesis was that the test scores were the same in the safe and at-risk groups, that is, that differences in participants' scores were unrelated to differences in driving ability. Then, a one-way ANOVA was used to examine the significance of differences among the groups, where the whole comprehensive test, the three sub-ability tests and each single question score were all analyzed. The homogeneity of variance and the Brown-Forsythe correction coefficient were calculated. A significance level of α = 0.05 (95% confidence level) was used for the whole comprehensive test and the three sub-ability tests, and the significance level for the single question scores was relaxed to α = 0.1 (90% confidence level). Finally, whether cognitive ability was affected by age was determined using a linear test of the scores against continuous age. All analyses were performed in SPSS Statistics.
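For readers who want to see what the one-way ANOVA computes, the following is a minimal sketch of the F statistic in plain Python. It is not the SPSS procedure used in the study, it omits the Brown-Forsythe correction, and the scores are made-up illustrative values rather than study data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up illustrative scores, NOT data from the study.
safe_young = [95, 100, 90, 95]
safe_older = [90, 95, 90, 85]
at_risk_older = [70, 60, 85, 75]
f_stat, df_b, df_w = one_way_anova_f([safe_young, safe_older, at_risk_older])
```

The p-value is then obtained from the F distribution with (df_between, df_within) degrees of freedom; a statistics package such as SPSS, or `scipy.stats.f_oneway` in Python, reports it directly.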

Results
The inflection point of the ROC curve is at (0.2, 1.0), where the vertical axis is the sensitivity, i.e., the true positive rate (TPR), and the horizontal axis is 1 − specificity, with specificity being the true negative rate (TNR). The point (0, 1), where TPR = 100% and TNR = 100%, indicates that all the predicted results agree with reality. So, the closer an ROC curve is to the upper left corner, the more efficient the test is. The score of 80 points, at 1 − specificity = 0.2, is the best point to identify the at-risk older drivers; therefore, the test threshold is set to 80 points (Figure 1).
Figure 2 shows the statistical description of the scores of the cognitive test; the average score is 87.57 (SD = 12.94). Of the 72 participants who finished the cognitive test, 54 passed it, a 75% passing rate. In the safe younger group, the average score of the 30 young drivers was 95 (SD = 5.92), and the passing rate was 96.7%. In the safe older group, 23 older drivers participated in the test, with an average score of 91.52 (SD = 3.44), and all of these older drivers with no recent crash record passed the test. The at-risk older group performed poorly: the average score was 71.05 (SD = 13.62) and the passing rate was 10.5%. Only two of the 19 older drivers in the at-risk older group scored more than 80 points. Compared with the safe groups, the passing rate and the score of the at-risk group were much lower, and the standard deviation of the at-risk group score was larger. Additionally, within the safe groups, the average score of the younger drivers was higher than that of the older drivers.
Table 2 shows the detailed scores and standard deviations of the different groups. The comprehensive cognitive scores of the safe and at-risk groups differ significantly (p < 0.001), and all three subtests also show significant differences, including the memory ability test (p < 0.001), the reaction ability test (p < 0.001) and the judgment ability test (p < 0.001).
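The reported passing rates can be cross-checked with a short calculation. The sketch below derives pass counts from the reported group sizes and rates (29 of 30 for the safe younger group is inferred from 96.7%), treats "at-risk" as the positive class and "failed the test" as a positive result, and computes the sensitivity and specificity at the 80-point threshold. The variable names are illustrative, and the values describe only this single cut-off, not the full ROC curve.

```python
# Group sizes and pass counts taken or derived from the reported results.
groups = {
    "safe_young": {"n": 30, "passed": 29},     # 96.7% of 30, derived
    "safe_older": {"n": 23, "passed": 23},     # all passed
    "at_risk_older": {"n": 19, "passed": 2},   # two scored above the threshold
}

total_n = sum(g["n"] for g in groups.values())
total_passed = sum(g["passed"] for g in groups.values())
overall_pass_rate = total_passed / total_n

# "At-risk" is the positive class; failing the test is a positive result.
true_pos = groups["at_risk_older"]["n"] - groups["at_risk_older"]["passed"]
false_neg = groups["at_risk_older"]["passed"]
true_neg = groups["safe_young"]["passed"] + groups["safe_older"]["passed"]
false_pos = groups["safe_young"]["n"] + groups["safe_older"]["n"] - true_neg

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)

# Size-weighted mean of the reported group mean scores.
group_means = {"safe_young": 95.0, "safe_older": 91.52, "at_risk_older": 71.05}
overall_mean = sum(groups[k]["n"] * group_means[k] for k in groups) / total_n

print(f"pass rate {overall_pass_rate:.1%}, sensitivity {sensitivity:.1%}, "
      f"specificity {specificity:.1%}, mean score {overall_mean:.2f}")
```

The derived counts reproduce the reported 54 passes out of 72 (75%), the 89.5% failure rate of the at-risk group and the overall mean score of 87.57.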
There were seven questions in the memory ability test, accounting for 35 points. As listed in Table 3, the memory ability test showed a significant difference between the safe and the at-risk groups (p < 0.001). The average memory test scores of the safe younger and safe older groups were 32.83 and 31.74, respectively, while the at-risk group scored 26.32 on average. The safe groups showed significantly higher memory ability than the at-risk group, especially in short-term memory questions 1 to 3 and object recognition question 2. The p-values of short-term memory questions 1 to 3 were all smaller than 0.005, indicating that short-term memory can better distinguish the differences in driving ability of the at-risk group. Compared with a single memory question, a comprehensive judgment of memory ability that includes the object recognition questions can better distinguish the ability differences between drivers, whereas the figure recognition question failed to identify the at-risk older drivers effectively.
Six questions were designed for the reaction ability test, accounting for 30 points. The correct rate of each question was higher in the safe groups than in the at-risk group, and the p-value of the total reaction ability score was smaller than 0.001. The average reaction test scores of the safe younger and safe older groups were 28.67 and 25.65, respectively, while the at-risk group scored 16.32 on average. The safe groups showed significantly higher reaction ability than the at-risk group, especially in the concentration and quick position recognition questions. The p-values of concentration questions 1 and 2 were both below 0.001, and the average score of the safe groups was nearly twice that of the at-risk group, indicating that comprehensive reaction ability is better revealed by the concentration questions. The at-risk group showed lower reaction ability in the questions on quick position recognition, anti-interference and anti-interference ability of colors, where the p-values were all smaller than 0.1.
The judgment ability test included seven questions and accounted for 35 points. The accuracy of each question in the safe groups was significantly higher than that in the at-risk group (p < 0.001). The average judgment test scores of the safe younger and safe older groups were 33.50 and 34.13, respectively, while the at-risk group scored 28.42 on average. The p-values of the spatial 2, calculation, quantity, clock and information processing questions were smaller than 0.1, indicating that judgment ability can be distinguished by the combination of these questions.
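Because the total score is the sum of the three subtest scores, the reported subtest means should add up to the reported group means. The following sketch checks this with the values quoted above; the dictionary layout is illustrative only.

```python
# Mean scores as reported in the text for each group.
reported = {
    "safe_young": {"memory": 32.83, "reaction": 28.67, "judgment": 33.50, "total": 95.00},
    "safe_older": {"memory": 31.74, "reaction": 25.65, "judgment": 34.13, "total": 91.52},
    "at_risk_older": {"memory": 26.32, "reaction": 16.32, "judgment": 28.42, "total": 71.05},
}

gaps = {}
for group, s in reported.items():
    subtotal = s["memory"] + s["reaction"] + s["judgment"]
    # Difference between the summed subtest means and the reported total mean.
    gaps[group] = abs(subtotal - s["total"])
    print(f"{group}: subtests sum to {subtotal:.2f}, reported total {s['total']:.2f}")
```

The sums agree with the reported totals to within rounding (a gap of about 0.01 for the at-risk group), which also confirms that the two values quoted for the safe group in each subtest refer to the safe younger and safe older groups, in that order.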
The linear test showed that age had a significant effect on the comprehensive cognitive test scores, where the p-value was 0.03 and the R-squared coefficient was 0.727. There are also significant relationships between age and the three subtests. The p-values of all three subtests were smaller than 0.05, and the R-squared coefficients of the memory, reaction and judgment tests were 0.367, 0.591 and 0.210, respectively. The estimated parameter values were all negative (comprehensive cognitive test, −0.397; memory test, −0.111; reaction test, −0.221; judgment test, −0.065), which demonstrates that the driving ability evaluated by the cognitive test decreases with age.
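The reported coefficients can be checked for internal consistency. Since the comprehensive score is the sum of the three subtest scores, the least-squares slope of the total score on age equals the sum of the subtest slopes. The sketch below demonstrates this property on made-up data (not study data) and then sums the reported subtest coefficients; the helper `ols_slope` is a hypothetical name.

```python
def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Made-up ages and subtest scores, NOT data from the study.
ages = [25, 40, 55, 65, 72, 80]
memory = [35, 35, 30, 30, 25, 25]
reaction = [30, 30, 25, 25, 20, 15]
judgment = [35, 35, 35, 30, 30, 30]
total = [m + r + j for m, r, j in zip(memory, reaction, judgment)]

# Slope of the total equals the sum of the subtest slopes.
slope_sum = ols_slope(ages, memory) + ols_slope(ages, reaction) + ols_slope(ages, judgment)
slope_total = ols_slope(ages, total)

# Sum of the subtest coefficients reported in the text.
reported_sum = round(-0.111 + -0.221 + -0.065, 3)
```

The three reported subtest coefficients sum to −0.397, which matches the magnitude of the coefficient reported for the comprehensive test and is consistent with a negative slope for the total score.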

Discussion
The cognitive test proved effective in identifying the at-risk older drivers, with the passing score set at 80 points based on ROC analysis. In total, 96.7% of the safe young group and 100% of the safe older group passed the test, while 89.5% of the at-risk older group failed the test. The result was consistent with that obtained in previous research, and the p-values of the whole test and the three subtests were all smaller than 0.001. The goodness of fit was considerably better than that achieved in other studies [18,30], where the threshold of 80 points showed a high sensitivity and a low false positive rate (1 − specificity) in the ROC analysis. Joanne M. Wood evaluated screening tests to predict the potential for safe and unsafe driving, where the multi-disciplinary laboratory-based assessment, hazard perception and hazard change detection tests took approximately 2 h to complete [18]. The average test time in this study was 15 min, so the test differentiated unsafe drivers with high efficiency and accuracy.
A combination of memory, reaction and judgment ability tests proved effective and valid in predicting at-risk older drivers, and showed both similarities to and differences from the test methods in previous studies. Rahel and Si-Woon developed comprehensive test methods to identify unsafe older drivers, but they did not highlight the proportion of each content of the test [30,31]. By contrast, this study described the detailed considerations and proportions of the questions in the test, taking into account the behavior of the drivers and the characteristics of the traffic environment. The questions that were optimized from existing tests showed high significance, including the calculation and clock questions in the judgment test and the quick position recognition question in the reaction test. The questions that showed no significance mainly shared a type with questions that did: the short-term memory 1 to 3, anti-interference 2 and spatial 2 questions were all able to significantly distinguish the at-risk older drivers, but the short-term memory 4, anti-interference 1 and spatial 1 questions failed to do so. This may be explained by the fact that older drivers learn from their first experience of answering the same type of question. Moreover, most researchers have focused on evaluating the efficiency of the original cognitive test methods, including MMSE, UFOA, DSST, TMT-B and so on. Wagner and Anstey confirmed that the aging of physiological function does not necessarily mean a decline in driving ability, while a decline in cognitive ability caused by physical aging will lead to driving danger [17,32]. The MMSE test was evaluated as a small effect size test compared to the other neuropsychological evaluations [19,26,27]. Therefore, a single neurological ability test is insufficient for predicting the driving ability of elderly drivers. It is necessary to evaluate the neuropsychological status of older drivers from a comprehensive perspective, which supports the view that the comprehensive cognitive test is more effective and scientific than a single cognitive test.
In this study, both the results of the linear test and the average scores of the cognitive test indicated that driving ability decreased with increasing age, agreeing with previous findings that age-related deficits in attentional and executive control may affect the consistency of driving performance in older persons [5]. Bunce stated that age had a significant relationship with cognitive responses and visual tracking capabilities. In this research, further examination of the cognitive tests revealed that memory, reaction and judgment ability were all associated with the age of the drivers, because all the test scores differed significantly between the two groups. This was also consistent with Robert's conclusion that differences in cognitive function increase with age [15].

Conclusions
There were two main findings in this study. On the one hand, the comprehensive cognitive test proved effective in identifying the at-risk older drivers, where 96.7% of the safe young group and 100% of the safe older group passed the test and 89.5% of the at-risk older group failed the test. On the other hand, the average score of the cognitive test indicated that driving ability decreased with increasing age.
The results of this study have important policy implications for the protection of older drivers and other drivers. By effectively reflecting the driving-related cognitive ability of older drivers, this test provides a scientific basis for standardizing the management of licensed older drivers. In addition, the questions are specially designed based on the characteristics of the traffic environment, including road traffic signs, signal lights and road scenes, and therefore reflect the cognitive abilities of older drivers in specific situations.
This study is an initial step in predicting unsafe older drivers using a cognitive test. There are nevertheless three potentially important limitations in the present study. Firstly, the sample is relatively small and may not be fully representative of the broader population of drivers, and in this sample the young drivers may have been better functioning on average than the general population. Secondly, the data of this research are cross-sectional, whereas panel data recording driving behavior after the test are needed to verify the accuracy of the cognitive test. Thirdly, the non-discriminating questions need to be optimized and analyzed in further study to improve the identification of unsafe older drivers. Future research could also explore how to help older drivers stay on the road as safe drivers and maintain their community mobility.

Figure 1. ROC curve of the cognitive test.

Figure 2. Passing rate of cognitive test.

Funding:
This work is supported by the Special Fund of Chinese Central Government for Basic Scientific Research Operations in Commonwealth Research Institutes, Grant number 111041000000180001220402, the Ministry of Public Security Technology Research Program, Grant number 2021JSYJC19, and the Public Security Theory and Soft Science Research Project, Grant number 2022LL82.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of the Ministry of Public Security of the People's Republic of China (protocol code 2021JSYJC19, July 2021).

Table 1. The proportion of specific test questions.

Table 2. Descriptive statistics for the scores of the cognitive test.

Table 3. Descriptive statistics for the scores of the memory test, reaction test and judgment test.