2. Materials and Methods
This study aimed to evaluate the participants’ driving performance, driving behavior, and situational awareness under various controlled but challenging conditions.
2.1. Experiment
This study involved 30 participants divided into two groups: 15 civilian drivers and 15 professional drivers, the latter being trained delegation drivers from the Personal Protection Division of the Hungarian Police Service. Participant recruitment did not account for confounding factors such as age, experience, or gender. The only inclusion criterion was that participants drove regularly, either as civilians or as professionals. The participants were tested in a simulation environment designed to assess their driving performance under various challenging conditions.
Since the primary goal of this pilot study was to explore the hypotheses and assess feasibility, we selected a moderate sample size (15 participants per group). A previous preliminary simulator study exploring drivers’ responses to driving system reminders on four stimuli in vehicles involved only six participants, each of whom took part in four sessions [29]. Another study involved 17 participants to investigate the effect of time-on-task on drivers’ mental workload and driving performance during a simulated driving task [30]. Therefore, the sample size selected in this study was consistent with that in previous simulator studies.
The simulation setup consisted of a custom-built BeamNG.tech v0.32 (BeamNG GmbH, Bremen, Germany) simulation environment, specifically tailored to include unique scenarios relevant to the study objectives. The hardware used for the experiment included a high-performance Intel i9 PC equipped with an NVIDIA GeForce GTX GPU and three 32-inch monitors providing a 180-degree visual experience. Additionally, a Genius G29 steering wheel and pedal set were employed to ensure realistic driving inputs.
The experiment featured several driving scenarios categorized into cone avoidance and motorway tasks. The cone avoidance scenarios tested the participants’ precision and control in the following exercises (Figure 1):
Narrow passage: Navigating through a confined space with cones to assess precision under restricted conditions.
Slalom: Driving in a zigzag pattern around cones.
Center of gravity displacement: Simulating vehicle stability challenges requiring precise handling techniques.
Double obstacle avoidance: Navigating around two obstacles in quick succession.
In addition, a motorway scenario was designed to test high-speed driving skills and the ability to respond to in-car and road-related events. This scenario included the following (Figure 2 and Figure 3):
High-speed driving: Maintaining speed and safe positioning in the inner lane.
Stationary vehicles blocking the road (a simulated accident).
Unexpected event management (UEM): Responding to a simulated incident where a vehicle partially shifted into the inner lane due to an accident, testing situational awareness.
2.2. Measurement System
The testing method utilized a Pupil Neon eye-tracking device (Pupil Labs GmbH, Berlin, Germany) to observe visual focus. The device was selected for its accuracy (1.8° without individual calibration and 1.3° with a basic offset adjustment) and its robustness to different lighting conditions and head movements, which make it dependable for practical applications [31]. The eye-tracking information was captured using a mobile app that transmitted raw data to a cloud-based service. For post-processing purposes, the Pupil Neon Player v5.0.5 desktop application (Pupil Labs GmbH, Berlin, Germany) was used, and all data were exported for further calculations.
This head-mounted system includes binocular glasses with dual-infrared eye cameras for precise eye movement tracking, complemented by a wide-angle RGB camera that captures the driver’s field of vision. The Pupil Labs Neon records a video with the world-view and infrared eye cameras, along with raw data such as timestamps, pupil locations, and gaze coordinates (x- and y-positions).
A full-HD webcam was mounted above the participants’ heads to record hand movements throughout the tasks. The webcam data captured specific hand positions and movements, including the following:
Two hands on the steering wheel;
One hand on the steering wheel;
No hand on the steering wheel.
The Polar H9 wearable heart rate-monitoring device (Polar Electro Oy, Kempele, Finland) was attached to all participants throughout the experiment.
2.3. Procedure
Before starting the experiment, all participants provided written consent to participate. Following this, detailed instructions regarding the tasks and procedures were delivered to each participant via machine voice to ensure consistency in instruction delivery. A 5 min free-driving session was provided to allow participants to familiarize themselves with the simulation environment and controls.
The experiment consisted of four cone avoidance tasks followed by a motorway scenario, all conducted in the same order for every participant to maintain uniformity. For the cone avoidance tasks, the number of cone hits was recorded at the end of each task.
In the motorway scenario, participants were required to complete an NDRT involving an increase in the interior temperature by 3 degrees. The test assistant initiated this task at a predetermined point on the road, just before a simulated accident scenario. The participant’s ability to complete the NDRT and their response to the upcoming accident were carefully observed and noted to assess their task performance.
After the driving tasks, participants completed a General Questionnaire to provide essential background information. The questionnaire collected data on demographics and driving experience. Additionally, the questionnaire included specific questions about the simulation, such as prior experience with driving simulators and comfort with the setup.
This structured procedure ensured that all participants underwent identical conditions, allowing for standardized comparison across tasks and scenarios.
2.4. Statistical Analysis
We processed and analyzed the collected data using R version 4.5.0 (R Core Team, R Foundation for Statistical Computing, Vienna, Austria). The acclimation period, i.e., the familiarization session with the driving simulator, was excluded from the analysis. Descriptive statistics were used to determine the participants’ profiles (characteristics). As noted in Section 2.1, recruitment did not control for confounders such as age, experience, or gender; the only inclusion criterion was regular driving, either as a civilian or as a professional. However, the collected data revealed a significant age difference between professional and civilian drivers. To address this, age was included as a covariate in a mixed ANCOVA to statistically control for its potential effect on the dependent variables. For the cone avoidance test, we used the Wilcoxon rank-sum test, also called the Mann–Whitney U test, to compare the two driver groups on each metric (cone avoidance precision rate, speed, steering intensity, hands on steering wheel, throttle, fixation frequency, mean duration of fixation, and heart rate).
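To make the rank-sum comparison concrete, the following is a minimal Python sketch of the Mann–Whitney U statistic with average ranks for ties and a tie-corrected normal approximation. It is an illustration only, not the analysis code of this study, which was run in R. The function names and the speed values are hypothetical, no continuity correction is applied, and R may instead compute an exact p-value for small samples without ties.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b, two-sided p) using a normal approximation."""
    pooled = list(group_a) + list(group_b)
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    ranks = average_ranks(pooled)
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    u_b = n1 * n2 - u_a
    # Tie correction for the variance of U.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, u_b, 1.0
    z = (u_a - n1 * n2 / 2.0) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, u_b, p_two_sided


# Hypothetical mean speeds (km/h) for four drivers per group.
civilian_speed = [34.0, 30.0, 29.0, 36.0]
professional_speed = [26.0, 24.0, 23.0, 21.0]
u_civ, u_pro, p_value = mann_whitney_u(civilian_speed, professional_speed)
```

Here `u_civ` counts the civilian–professional pairs in which the civilian value is larger (ties count as one half), so a value near `n1 * n2` or near zero indicates strong separation between the groups.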
Furthermore, several mixed-design analyses of covariance (mixed ANCOVA) were conducted to examine the effects of class and scenario on the collected metrics while controlling for age. The mixed ANCOVA was fitted using the lme() function from R’s nlme package. For each metric, the model assumptions were checked: linearity, homogeneity of regression slopes, normality of residuals, absence of outliers, and homogeneity of residual variance. Where the residual variance differed between scenarios, we included the varIdent() variance function from the nlme package in the model to reflect the heterogeneous residuals. The variance function was used in all the models except the one for the hands on steering wheel dependent variable, where the assumption of homogeneity of residual variance was met. The results of the assumption checks can be found in Appendix A (Table A1 and Table A2). The statistical significance level was set at 0.05.
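For orientation, one plausible way to write the model described above is given below. This is a sketch under stated assumptions and not the authors' exact specification: the text does not report the random-effects structure, so the per-participant random intercept b_i is an assumption, and the symbols are introduced here for illustration only. Index i denotes the participant, c(i) the participant's class, and j the scenario.

```latex
\begin{aligned}
y_{ij} &= \mu + \alpha_{c(i)} + \tau_j + (\alpha\tau)_{c(i)\,j}
          + \beta\,\mathrm{age}_i + b_i + \varepsilon_{ij},\\
b_i &\sim \mathcal{N}\!\left(0,\ \sigma_b^{2}\right),\qquad
\varepsilon_{ij} \sim \mathcal{N}\!\left(0,\ \sigma_j^{2}\right).
\end{aligned}
```

Under this reading, the scenario-specific residual variance \sigma_j^2 corresponds to what a varIdent() variance function stratified by scenario would estimate; with homogeneous residuals, all \sigma_j^2 collapse to a single \sigma^2.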
In the case of the motorway scenario, either a t-test or the Wilcoxon rank-sum test was used to assess the difference between the two driver classes on each collected metric. The normality of each metric was first tested using the Shapiro–Wilk test to select the appropriate test: a t-test was employed when normality was not rejected (p-value > 0.05); otherwise, the Wilcoxon rank-sum test was applied. Moreover, the Wilcoxon rank-sum test was used to compare how both driver groups assessed the different scenarios (cone avoidance vs. motorway) in terms of the realism of the scenario, the subjective control over the vehicle, and the self-assessed success.
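The test-selection rule can be expressed as a small decision function. The sketch below assumes that the Shapiro–Wilk test is run separately for each group and that both groups must pass for the t-test to be chosen; the text does not state this detail, so it is an assumption, and the function name is hypothetical. The Shapiro–Wilk p-values are taken as inputs rather than computed here.

```python
def choose_two_sample_test(p_normal_civilian, p_normal_professional, alpha=0.05):
    """Pick the group-comparison test from two Shapiro-Wilk p-values.

    Assumption: normality is checked per group, and the t-test is used
    only if normality is not rejected in either group.
    """
    for p in (p_normal_civilian, p_normal_professional):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    if p_normal_civilian > alpha and p_normal_professional > alpha:
        return "t-test"
    return "Wilcoxon rank-sum"


# Hypothetical Shapiro-Wilk p-values for one metric.
selected = choose_two_sample_test(0.31, 0.02)
```

In this hypothetical case the professional group's p-value falls below 0.05, so the non-parametric test is selected.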
3. Results
This section presents an analysis of the experimental results, highlighting the differences between civilian and professional drivers. The results are reported with 95% confidence intervals (α = 0.05 level of significance).
3.1. Participants’ Profiles
The mean age of all participants was 38.66 years (min. = 21, max. = 51, and SD = 10.49). However, professional drivers were older than civilian drivers, with average ages of 45.4 and 31.9 years, respectively. Similarly, professionals reported a higher average number of years since license acquisition (27 years) than civilians (13.7 years). Additionally, the average kilometers driven by each category are illustrated in Figure 4. Professionals traveled approximately twice the monthly distance and three times the distance to work as civilians. In total, professional drivers covered three times the average distance driven by civilian drivers.
In terms of driving frequency, professional drivers uniformly reported daily driving. In contrast, civilian drivers were distributed into daily (47%) and weekly (53%) driving frequencies.
3.2. Cone Avoidance Scenarios
To aid interpretation, the collected metrics were assigned to three categories: driving performance, driving behavior, and visual and physiological metrics. The driving performance metrics were scenario-related and comprised the cone avoidance precision rate and the duration of the experiment. The driving behavior metrics comprised speed, steering intensity, hands on the steering wheel, and throttle. Lastly, the visual and physiological metrics reflected human responses, including fixation frequency, mean duration of fixation, and heart rate. These metrics served for the comparative assessment of civilian and professional drivers across four experimental tasks. The descriptions of the metrics are presented in Table 1.
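As an illustration of the two fixation metrics, the sketch below derives fixation frequency (per minute) and mean fixation duration (in ms) from a list of fixation intervals. The `Fixation` record, the function name, and the example values are hypothetical; the actual export format of the eye-tracking software and the exact definitions in Table 1 may differ.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_s: float  # fixation onset, seconds from task start (hypothetical field)
    end_s: float    # fixation offset, seconds from task start (hypothetical field)


def fixation_metrics(fixations, task_duration_s):
    """Return (fixations per minute, mean fixation duration in ms)."""
    if task_duration_s <= 0:
        raise ValueError("task duration must be positive")
    if not fixations:
        return 0.0, 0.0
    durations_ms = [(f.end_s - f.start_s) * 1000.0 for f in fixations]
    frequency_per_min = len(fixations) / (task_duration_s / 60.0)
    mean_duration_ms = sum(durations_ms) / len(durations_ms)
    return frequency_per_min, mean_duration_ms


# Hypothetical 30 s task containing three fixations.
example = [Fixation(0.0, 0.5), Fixation(1.0, 1.25), Fixation(2.0, 2.75)]
freq_per_min, mean_ms = fixation_metrics(example, task_duration_s=30.0)
```

Normalizing the count by task duration keeps the frequency comparable between participants who needed different amounts of time to complete a task.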
3.2.1. Descriptive Statistics of Cone Avoidance Tasks
We summarize the collected metrics of the four scenarios for civilian and professional drivers in Table 2 and Table 3, respectively. In addition, Figure 5 was derived from both tables and summarizes the same set of metrics. Civilians performed well, with an average cone avoidance rate of 92.35% across the four scenarios. Professionals performed similarly, with 91.1%. Civilians also drove faster than professional drivers: their average speed was 28.75 km/h, compared with 23.5 km/h for professionals, i.e., professionals were approximately 18% slower. The same pattern held in every scenario: (a) narrow passage (34.7 vs. 26.4 km/h); (b) slalom (29.5 vs. 23.8 km/h); (c) center of gravity displacement (28.9 vs. 22.8 km/h); and (d) double obstacle avoidance (22.8 vs. 21 km/h). Steering intensity was consistently higher among professionals (overall mean: 309.9) than among civilians (268.7), a difference of approximately 15% relative to civilians. In terms of throttle, mean values decreased across scenarios, with higher values for civilians. For visual behavior, the mean fixation duration differed markedly, with civilians averaging 583.75 ms compared with 456.25 ms for professionals, approximately 28% longer. Across scenarios, fixation durations were consistently longer for civilians (e.g., slalom: 675 vs. 483 ms). As for heart rate, both driver classes showed a slight increase in the mean in scenario (b) compared with scenario (a), followed by a decrease in the remaining scenarios. Civilians exhibited a higher heart rate, averaging 88.9 bpm compared with 84.77 bpm for professionals, a difference of about 5%.
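Because a relative difference depends on which group serves as the reference, the short sketch below recomputes the percentages from the overall means reported in this section. The helper name is hypothetical; the input values are the means stated above.

```python
def percent_change(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference * 100.0


# Overall means reported in this section (civilians, professionals).
speed_civ, speed_pro = 28.75, 23.5            # km/h
steering_civ, steering_pro = 268.7, 309.9     # steering intensity
fixation_civ, fixation_pro = 583.75, 456.25   # ms
heart_civ, heart_pro = 88.9, 84.77            # bpm

speed_civ_vs_pro = percent_change(speed_civ, speed_pro)           # about +22.3
speed_pro_vs_civ = percent_change(speed_pro, speed_civ)           # about -18.3
steering_pro_vs_civ = percent_change(steering_pro, steering_civ)  # about +15.3
steering_civ_vs_pro = percent_change(steering_civ, steering_pro)  # about -13.3
fixation_civ_vs_pro = percent_change(fixation_civ, fixation_pro)  # about +27.9
heart_civ_vs_pro = percent_change(heart_civ, heart_pro)           # about +4.9
```

The same pair of means therefore yields, for example, either "about 22% faster" (civilians relative to professionals) or "about 18% slower" (professionals relative to civilians), so the reference group should always be stated.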
3.2.2. Wilcoxon Test Results
In the first stage, Wilcoxon tests were used in an initial exploratory analysis to evaluate the difference between the groups in each scenario independently. The preliminary results are summarized in Table 4. In terms of driving performance metrics, the only significant difference between civilians and professionals was in experiment duration in scenario (c) (p = 0.0343). The driving behavior parameters differed significantly between the two groups in several scenarios, most notably in scenario (c) (all p-values < 0.05). Speed differences were statistically significant in scenarios (b) and (c), whereas throttle differences were statistically significant in scenarios (c) and (d). Steering intensity and hands on the steering wheel differed significantly only in scenario (c). As for the visual and physiological metrics, the only significant difference between civilians and professionals was in mean fixation duration in scenario (b) (p = 0.0381).
3.2.3. Mixed-Design ANCOVA Results
Following preliminary non-parametric comparisons using Wilcoxon tests, in the second stage, we conducted mixed-design ANCOVA to more comprehensively assess the effects of scenarios (within-subject factor) and class (between-subjects factor), while controlling for age as a covariate, on the collected metrics. Age was not the main factor of interest in this work, and as the difference in age was significant between the two driver groups, it was initially treated as a covariate to adjust for the imbalance statistically. However, when the homogeneity of regression slopes assumption of a model was violated (indicated by a significant scenario × age interaction (p-value < 0.05)), age was retained in the interaction with scenario and class factors. Retaining the interaction allowed its effect to vary across scenarios rather than treating it as a uniform covariate. In this study, for the speed, steering intensity, heart rate, and experiment duration dependent variables, the homogeneity of regression slopes assumption was violated. Thus, the models were refitted by accounting for age as an interacting covariate.
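The idea behind the homogeneity of regression slopes assumption can be illustrated descriptively by fitting a separate least-squares slope of a metric on age within each scenario. The sketch below is only a descriptive aid with hypothetical data and function names; the check reported in this study was the significance of the scenario × age interaction within the mixed model, which this code does not reproduce.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_by_scenario(rows):
    """rows: iterable of (scenario, age, value); returns {scenario: slope}."""
    grouped = {}
    for scenario, age, value in rows:
        ages, values = grouped.setdefault(scenario, ([], []))
        ages.append(age)
        values.append(value)
    return {s: ols_slope(a, v) for s, (a, v) in grouped.items()}


# Hypothetical (scenario, age in years, speed in km/h) observations.
rows = [
    ("a", 25, 36.0), ("a", 35, 33.0), ("a", 45, 30.0),
    ("d", 25, 22.0), ("d", 35, 22.0), ("d", 45, 22.0),
]
slopes = slopes_by_scenario(rows)
```

In this made-up example, speed falls with age in scenario (a) but is flat in scenario (d); clearly different slopes of this kind are what motivates letting age interact with scenario instead of entering as a single uniform covariate.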
Each collected metric was analyzed separately as a dependent variable. The modeling results of the three categories are illustrated in Table 5, Table 6, and Table 7.
Driving Behavior Metrics
Our speed model results (Table 6) indicate a significant effect of the drivers’ class on speed (F(1,28) = 8.19; p = 0.008). The scenario also had a significant effect on speed (F(3,80) = 9.65; p < 0.001), meaning that mean speed differed between at least some scenarios. The interaction between the class and scenario variables was not significant (F(3,80) = 1.57; p = 0.204); both driver groups followed a similar pattern of change across scenarios. Regarding age, the scenario × age interaction term was significant (F(3,80) = 6.46; p < 0.001), which means that the relationship between age and speed differed across scenarios. Based on the post hoc pairwise comparison of the estimated marginal means of speed for each scenario within each group, among civilians, only scenario (d) scored significantly higher than scenario (c) (estimated difference = 5.77; p = 0.044). Among professional drivers, scenario (d) again scored significantly higher than scenarios (a) and (b) (estimated difference = 11.15, p = 0.008, and estimated difference = 6.58, p = 0.002, respectively). The influence of age on speed depended on the scenario. The estimated marginal trend analysis indicated a negative effect in scenarios (a) and (b), where older participants had lower speeds. No significant effect of age was found in scenarios (c) and (d).
In terms of the steering intensity modeling results, there was no significant overall mean difference between professional and civilian drivers in steering intensity across the four scenarios (p = 0.2404). However, a group difference emerged only in scenario (c), where professional drivers showed a significantly higher steering intensity than civilian drivers (estimate = 131.70; p = 0.0098). Thus, the difference between the two groups was scenario-dependent, and scenario was the key factor (p < 0.0001). In this case, the age effect was not significant.
Regarding the ‘hands on steering wheel’ metric, our modeling results revealed that this metric was only significantly influenced by the scenario to which the data belonged (p < 0.0001). Class membership and age had no significant effect on the metric. Professional drivers tended to have slightly lower values for hands on the steering wheel on average compared with civilian drivers (estimated at 0.116), but the difference was not statistically reliable. Pairwise comparisons indicated that scenario (c) differed significantly from scenario (a) (estimated difference = 0.5; p = 0.009).
As for the throttle metric, the mixed ANCOVA results (Table 6) indicate that class membership and scenario independently affected the score (p = 0.0014 and p < 0.0001, respectively). Pairwise comparisons revealed that scenarios (b), (c), and (d) scored significantly lower on the throttle metric than scenario (a). This can be seen in the distribution of the throttle score in Figure 6. However, the class × scenario interaction had no significant effect on the throttle metric (p = 0.4544), indicating that the difference between professional and civilian drivers did not vary across the four scenarios. In this model, age showed a marginal, non-significant negative effect (p = 0.076). Although not statistically reliable, this trend suggests that throttle decreased slightly with increasing age.
Driving Visual Behavior and Physiological Response Metrics
In terms of visual behavior, the modeling results indicate that the scenario and age main effects were not significant for fixation frequency or the mean duration of fixation. The interaction between class and scenario also had no statistically significant effect on either metric (p = 0.4526 and p = 0.1298, respectively). Overall, the visual behavior of both driver groups followed the same pattern across the four scenarios, as the distribution of the visual metrics in Figure 7 illustrates. Notably, the class main effect on mean fixation frequency was significant (p = 0.0441). However, the post hoc pairwise comparison between civilian and professional drivers did not reach statistical significance (estimate = −14.3, SE = 12.9, t(27) = −1.11, and p = 0.28). Professionals showed a higher mean fixation frequency than civilians, but the difference between the two classes was not strong enough to be statistically significant in the post hoc comparison.
Regarding the heart rate results, there was no significant difference between the two groups (p = 0.3453). However, the mean heart rate differed significantly across scenarios (p = 0.0021), with lower values in scenarios (c) and (d) compared with (a). The post hoc pairwise comparisons were not significant, though the contrast between scenarios (b) and (d) approached significance (p = 0.055). Furthermore, the class × scenario interaction had no significant effect on the mean heart rate (p = 0.8392), suggesting similar patterns in both groups across all the scenarios (Figure 8). For this metric, age was a significant negative predictor of heart rate (p = 0.0188). In addition, the interaction between scenario and age had a significant effect on heart rate (p = 0.0001), meaning that the relationship between age and heart rate depended on the scenario. The model estimates indicated that, although older participants exhibited lower heart rates in scenario (a), they had relatively higher heart rates in scenarios (b), (c), and (d) compared with younger participants.
3.3. Motorway Test Scenario
Since the motorway scenario was a longer and more complicated task, involving the management of an unexpected event, it was analyzed separately. Except for the cone avoidance precision rate, the same metrics as cone avoidance scenarios were collected and analyzed. Moreover, the brake metric was analyzed. The descriptive statistics, parametric and non-parametric analyses, and subjective assessments are presented as follows.
3.3.1. Descriptive Statistics of Motorway Task
For the comparison of civilian and professional drivers in the motorway scenario, the descriptive statistics (means, medians, and standard deviations) are summarized in Table 8. Overall, the mean values of the driving metrics of civilian drivers were close to those of professional ones.
Based on the participants’ recorded videos, we identified four cases of how drivers managed the unexpected event. The cases are illustrated in Table 9. Notably, 20% of both civilian and professional drivers collided despite braking. Disregarding the braking maneuvers, 73% of civilians and 60% of professionals succeeded in the task by passing the accident on the left. While only one civilian driver braked and then passed in the same lane, 20% of professionals did so.
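The outcome shares quoted above can be cross-checked against group sizes of 15. In the sketch below, the counts are back-calculated from the reported percentages (an assumption, since the per-case counts of Table 9 are not reproduced here), the braking distinction is collapsed as in the text, and the category labels are hypothetical.

```python
def outcome_shares(counts):
    """Convert outcome counts into whole-number percentages of the group."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("group has no participants")
    return {outcome: round(100.0 * n / total) for outcome, n in counts.items()}


# Counts inferred from the reported percentages with n = 15 per group.
civilians = {"collided": 3, "passed_on_left": 11, "braked_same_lane": 1}
professionals = {"collided": 3, "passed_on_left": 9, "braked_same_lane": 3}

civilian_shares = outcome_shares(civilians)
professional_shares = outcome_shares(professionals)
```

Under these inferred counts, the single civilian who braked and stayed in the lane corresponds to roughly 7% of that group, which is why the three civilian shares sum to 100% after rounding.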
3.3.2. Inferential Analysis
Interestingly, none of the statistical tests revealed a significant difference between the two driver groups in terms of driving metrics (Table 10). On average, professionals required more time to complete the motorway segment (M = 1.14 min) than civilians (M = 0.97 min), an increase of approximately 17.5%. Civilians maintained a higher mean speed (M = 88.7 km/h) than professionals (M = 79.2 km/h), a 12% increase that did not reach statistical significance (p = 0.08). Based on the speed distribution illustrated in Figure 9, professionals also had higher variability. Although not statistically significant, these descriptive differences suggest that professionals reduced their speed, resulting in longer completion times and pointing to a more controlled driving strategy.
As for the remaining set of indicators, including steering, throttle, braking, and hands-on-wheel behaviors, professionals demonstrated a 40% increase in steering intensity, a 7% reduction in throttle input, and a 33% decrease in braking activity compared with civilians. While most group differences did not reach statistical significance, braking behavior showed a marginal trend (p = 0.08). This finding suggests that professionals used less sudden deceleration, consistent with anticipatory control and smoother vehicle handling. Hands-on-wheel contact was 12% lower among professionals, indicating a more relaxed grip, although this difference was not statistically significant. Overall, higher variability could be noticed among professionals compared with civilian drivers.
For physiological measures, professionals exhibited a 3% lower mean heart rate, a difference that was not statistically significant (p = 0.67523). Fixation metrics showed nearly identical mean fixation durations (518 ms for professionals and 513 ms for civilians). This scanning pattern was consistent with enhanced situational awareness. The fixation frequency per minute was 4% lower for professionals, likely due to their longer task durations. Graphically, the distribution of these metrics among civilians was similar to that of professionals, with higher variability among professionals (Figure 10).
3.3.3. Subjective Assessment
Overall, participants rated the motorway test higher than the cone avoidance scenarios. Participants rated their performance success with an average score of 5.23/7. Figure 11 illustrates the average scores of participants’ subjective assessments of the realism of the scenario, subjective control over the vehicle, and self-assessed success on a seven-point Likert scale. In terms of the realism of the scenario, the results of the Wilcoxon rank-sum test revealed a statistically significant difference between the driver groups (p-value = 0.017 < 0.05). Professional drivers showed a larger increase in the rating from the cone avoidance scenarios to the motorway scenario compared with civilian drivers. As for subjective control over the vehicle, the Wilcoxon rank-sum test indicated no statistically significant difference between the driver groups in how they rated the cone avoidance and motorway tests (p-value = 0.49 > 0.05). Similar results were found for the self-assessed success scores (p-value = 0.20 > 0.05).
4. Discussion
The main goal of this study was to thoroughly compare the driving performance, behavior, and underlying physiological and visual responses of civilian and professional drivers in a high-fidelity simulated environment. To achieve this, we used a new, two-part approach, including (I) a structured cone avoidance test designed to challenge technical maneuvering skills, and (II) a realistic motorway test that incorporated both NDRT and UEM. In the next section, we examine the data in light of our initial hypotheses to draw conclusions about how experience differently affects various driving demands.
4.1. Hypotheses
4.1.1. Hypothesis 1: Civilian Drivers Have Lower Driving Performance than Professional Drivers
The results from both case (I) and case (II) collectively present a significant and unexpected finding: contrary to the established literature on the benefits of extensive driving experience, civilian drivers performed comparably to their professional counterparts in terms of overall precision and collision rate. Although professional drivers possessed twice the average years of experience, the cone avoidance precision rates were statistically close, and in the motorway test, collision rates were identical. This finding necessitates a deep exploration into the specific constraints of the experimental design that may have suppressed the expected advantage of professional expertise.
The lack of difference in performance is primarily explained by the interaction between scenario difficulty and participant characteristics, as shown by the mixed ANCOVA results (Table 7). Case (I) scenarios were likely insufficiently difficult or too short in duration to fully activate the complex, long-term anticipation and planning skills that define professional driving. Specifically, the cone avoidance test functioned more as a measure of acute vehicle control and reaction rather than strategic navigation. Scenario (b), which required the most technical precision and was the first complex task, consistently yielded a higher score. Psychologically, this novelty may have boosted the motivation and engagement of the civilian group, compensating for their lack of experience. Furthermore, civilians’ tendency to finish tasks faster than professionals suggests a distinct motivation: prioritizing speed over deliberate precision, which happened to yield comparable success in the low-consequence simulated environment [32].
A critical factor obscuring the professional/civilian distinction was the high variability in age and the inherent limitations of the driving simulator. In line with the findings of Lee and Kawabata, our results show that increasing age significantly and negatively impacts driving skills, with older participants exhibiting lower cone avoidance precision and longer task durations due to slower reaction times [32,33]. Crucially, the limited ecological validity of the simulator may have disproportionately affected older participants, who may struggle more with the unfamiliar controls, display systems, or the lack of vestibular feedback, thus compounding age-related performance decline, regardless of their professional status [32]. The simulator’s characteristics, therefore, acted as a confounding variable that attenuated the expected benefit of professional experience.
This result, the failure of professionals to significantly outperform civilians, contrasts sharply with foundational studies in driving expertise, which have often found that professionals demonstrate superior hazard detection, scanning strategies, and vehicle stability [34]. The difference suggests a ceiling effect for the performance metrics measured, particularly in case (II), where most participants in both groups successfully managed the NDRT and the subsequent collision event. The task design, emphasizing quick, localized maneuvers rather than long-term hazard prediction, likely failed to capture the superior mental models and risk perception habits that professional training instills [35]. This warrants future research using tasks specifically calibrated to stress the high-level strategic and anticipatory skills of experts.
4.1.2. Hypothesis 2: In Terms of Driving Behavior, Civilian Drivers Maintain Significantly Lower Control and Stability Compared with Professional Drivers
The results of case (I) reveal distinct findings. Civilians’ speed and throttle metrics were significantly higher than those of professional drivers, with both class membership and scenario as the main explanatory factors. The effect of age on speed depended on the scenario: in scenarios (a) and (b), older participants drove more slowly, whereas no significant effect of age was found in scenarios (c) and (d). A possible explanation of this pattern is the progression of the experiment, where participants tended to begin cautiously and later became more engaged and confident. Regarding the steering intensity and hands on steering wheel modeling results, there was no significant overall mean difference between professional and civilian drivers across the four scenarios. However, the significant variation in steering intensity between scenarios is logical given the nature of the scenarios. In the first scenario, there was no cone to avoid in the middle of the path, and the driver only needed to drive in a straight line, requiring minimal steering input. In the other scenarios, the tasks involved avoiding cones, which meant making turns and following curves and naturally required greater use of the steering wheel. Professional drivers had a higher mean steering intensity in scenarios (b) and (c), which may reflect more precise and tactile control of the vehicle based on their experience.
In the motorway scenario, the descriptive metrics suggest that professional drivers approached the task with greater deliberation and control, as indicated by slightly longer completion times and reduced braking. Civilians prioritized speed, as shown by higher speed values and increased reliance on braking.
4.1.3. Hypothesis 3: Civilian Drivers Have Significantly Lower Visual Attention and Higher Heart Rate Responses Compared with Professional Drivers
The physiological and oculomotor data demonstrate that professionals maintained a calmer physiological state (lower heart rate) and employed a scanning strategy with more frequent fixations. However, the low heart rate is attributable not only to participants’ confidence or experience but also to an age-related reduction in heart rate [36]. Although older participants exhibited lower heart rates in scenario (a), they had relatively higher heart rates in scenarios (b), (c), and (d) compared with younger participants. This pattern may be explained by the complexity of the latter scenarios, the challenging driving, and the increasing perceived risk. Furthermore, professionals showed a higher mean fixation frequency than civilians, even though the difference between the two classes was not strong enough to be statistically significant. These results support the conclusion that professional drivers emphasize anticipation, stability, and safety, whereas civilians display a faster and more reactive driving style. In the motorway scenario, neither the visual behaviors nor the physiological responses revealed significant differences between the groups.
4.2. Self-Assessment
Subjectively, in terms of the realism assessment, professional drivers showed a significantly greater increase in ratings from the cone avoidance scenarios to the motorway scenario than civilian drivers. This difference may reflect professionals’ familiarity and experience with motorway conditions. Professional drivers reported longer driving distances, and their motorway experience may have accustomed them to sustained focus and rapid decision making. Another potential explanation is an order effect: since the motorway scenario was tested after the cone avoidance scenarios, the earlier scenarios may have facilitated participants’ cognitive activation, serving as a “warm-up”. Furthermore, a recency effect may have influenced the ratings, as professionals may have remembered the last scenario more precisely and rated it higher [37].
4.3. Limitations and Future Work
The findings of this study should be viewed in light of several important methodological limitations. First, our non-randomized recruitment led to a small and uneven sample size, resulting in a high degree of variability and notable group differences between professional and civilian drivers, especially regarding age and baseline experience. This high potential for age bias is a critical concern, as older participants were overrepresented in certain groups. Although we attempted to reduce the effect of age by including it as a covariate in the ANCOVA, the lack of initial control limits the broader applicability of the results.
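To illustrate why covariate adjustment matters when groups differ in age, the following minimal sketch computes classical one-way ANCOVA adjusted means with a single covariate, using the pooled within-group regression slope. It is not the analysis pipeline used in this study. The function name `ancova_adjusted_means` and all data values are hypothetical, and the numbers are synthetic and chosen only so that the arithmetic is easy to follow.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Covariate-adjusted group means for a one-way ANCOVA with one covariate.

    groups: dict mapping a group name to a list of (covariate, outcome) pairs.
    Returns (pooled_within_group_slope, {group name: adjusted mean}).
    """
    # Grand mean of the covariate over all participants.
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)

    sxy = 0.0  # pooled within-group sum of cross-products
    sxx = 0.0  # pooled within-group sum of squares of the covariate
    group_means = {}
    for name, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_means[name] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)

    slope = sxy / sxx
    # Shift each group's raw mean to the value expected at the grand-mean covariate.
    adjusted = {
        name: my - slope * (mx - grand_x)
        for name, (mx, my) in group_means.items()
    }
    return slope, adjusted


# Synthetic (age, outcome) pairs: the hypothetical professional group is older.
synthetic = {
    "civilian": [(20, 90), (30, 85), (40, 80)],
    "professional": [(40, 85), (50, 80), (60, 75)],
}
slope, adjusted = ancova_adjusted_means(synthetic)
raw_gap = mean(y for _, y in synthetic["professional"]) - mean(
    y for _, y in synthetic["civilian"]
)
adjusted_gap = adjusted["professional"] - adjusted["civilian"]
print(slope, raw_gap, adjusted_gap)
```

In this synthetic example the raw group difference and the age-adjusted difference have opposite signs, which shows how an unbalanced age distribution can mask or even reverse an apparent group effect. The adjustment relies on the assumption of a common within-group slope, which cannot be taken for granted in a small sample.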
Second, relying on a simulator raises concerns about limited real-world relevance. Despite the high-fidelity setup, the simulated environment could not fully capture the complex sensory and psychological demands of actual driving. This limitation might have particularly affected the performance and physiological responses of older drivers or professionals familiar with a vehicle’s specific tactile feedback.
Third, due to data collection limitations, we were unable to measure reaction times during the critical motorway scenario, which leaves a significant gap in understanding cognitive processing speed related to hazard avoidance (UEM). Finally, because the cone avoidance scenarios had to be evaluated collectively in the subjective assessment, subtle perceptual differences tied to the complexity of individual scenarios may have been overlooked.
Future work should address these limitations by recruiting a more representative sample; balanced gender representation would support a more comprehensive analysis. Additionally, reaction time measurement should be incorporated to gain a fuller understanding of cognitive and behavioral responses. Since Hypothesis 1 was not supported (the groups performed similarly), future work should explicitly test scenarios with greater complexity or longer duration that could better distinguish experienced professional drivers. Because visual attention (fixation frequency) showed a non-significant trend (Hypothesis 3), future research should use more sensitive eye-tracking metrics (e.g., pupil diameter and scan path complexity) to detect subtle differences. It would also be valuable to test the effect of several types of NDRTs. Lastly, more structured qualitative measures would enhance the applicability of the research findings.
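As a rough planning aid for the sample size limitation noted above, the sketch below estimates the number of participants per group needed for a two-sided, two-group comparison of means using the standard normal approximation, n = 2(z_{1-α/2} + z_{power})² / d². The function name `n_per_group` is hypothetical, and the standardized effect sizes passed to it are illustrative assumptions, not estimates derived from this study.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided two-sample comparison.

    Uses the normal approximation; an exact t-test calculation gives a
    slightly larger n for small samples.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


# Illustrative (assumed) standardized effect sizes: medium, large, very large.
for d in (0.5, 0.8, 1.0):
    print(d, n_per_group(d))
```

Under these assumptions, groups of 15 would be adequately powered only for very large effects, which is consistent with treating the present results as preliminary.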
5. Conclusions
This exploratory study conducted a comprehensive behavioral and performance analysis of civilian and professional drivers in a high-fidelity simulated environment. Our findings revealed a degree of performance and physiological similarity between the two groups that largely contradicted our initial hypotheses. In the cone avoidance scenarios, civilian drivers exhibited comparable precision, visual behavior, and physiological responses, with significant differences noted only in metrics such as speed and throttle input. Crucially, the ecologically valid motorway scenario yielded no significant differences across most key metrics, including collision rates.
While these findings, derived from a sample with heterogeneity in age and experience, do not invalidate professional training, they highlight the specific conditions under which expertise manifests or is obscured. The observed interaction between participant age and simulator performance strongly suggests that the simulation environment acted as a confounding variable, attenuating the expected benefits of professional experience. Furthermore, the task design, which emphasized acute maneuvering over strategic planning, may have failed to fully capture the superior mental models and risk anticipation that professional training aims to instill.
These results have significant practical implications for driver evaluation and training design. Training curricula should shift their focus from simple, technical precision drills toward complex, prolonged, and ecologically valid scenarios to specifically target and strengthen the advanced strategic planning and risk management skills that differentiate experts in real-world driving. Accordingly, driver evaluation protocols should move beyond basic task completion rates to incorporate more sensitive behavioral indicators of deliberation and control (e.g., consistent speed management and minimal unnecessary braking) as primary markers of professional competency. To validate these preliminary results, future research must prioritize large, age-matched cohorts and employ tasks with a significantly higher cognitive load and longer duration. This approach will help overcome potential ceiling effects and allow for the detection of subtle differences through more sensitive physiological measures (e.g., pupil metrics), ultimately leading to actionable recommendations for revising training programs and better supporting older professional drivers.