The Effectiveness of Different Incentive Programs to Encourage Safe Driving

This study examined the effectiveness of various financial incentive schemes for improving drivers’ safety performance, specifically in regard to speeding, tailgating, and frequent lane changing without signaling. The study examined the hypothesis that, with regard to modifying unsafe driving behavior in a sample of professional bus drivers in Israel, small yet reliable rewards are more effective than rewards that are large but rarely obtained. While this hypothesis has been tested and partially supported in laboratory studies, the current study is the first to test it in real-world conditions. This study demonstrates that a combination of surveillance, rewards (monetary compensation), and informing the drivers about their driving performance in real time produces a lasting and significant decline in traffic violations. The results show that financial incentives are effective for encouraging safe driving behavior. Simultaneously, the results show some indications that small yet probable rewards may be more effective than large but uncertain ones. This study also demonstrates that the improvement in behavior continued during the period immediately after the experiment.


Introduction
Road accidents carry substantial economic and social costs. Therefore, improving road safety is essential for achieving sustainable development [1]. Driver behavior and traffic violations in particular have long been identified as major contributors to road accidents [2][3][4][5], and some sources suggest that traffic offenses were responsible for as much as 90% of road accidents [6]. Interventions seeking to modify driver behavior have usually relied on the deterrence paradigm and focused on preventing risky behavior. However, the facilitation of good or appropriate behavior is rarely highlighted in the road safety literature [7].
Moreover, research about how driving behavior is influenced by the type, value, and probability of rewards has not received much attention. The current study examined the effectiveness of various financial incentive schemes for improving driver safety performance, specifically investigating the degree of influence of small yet reliable rewards compared to large but rarely obtained rewards on a sample of professional bus drivers in Israel. While this hypothesis has been tested and partially supported in laboratory studies, the current study was the first to test it in real-world conditions.
The deterrence paradigm suggests that humans fear sanctions and will modify their behavior to avoid them [8,9]. Attempts to reduce the potential benefits derived from risky driving include fines of various magnitudes, suspension or revocation of drivers' licenses, mandatory participation in driver rehabilitation courses, community service, imprisonment, and other attempts to modify the social acceptability of such behavior among young driver peer groups [10]. The effectiveness of such sanctions remains a matter of debate. While sanctions reduce the number of offenses, their influence on crashes seems limited [9,11,12] and not necessarily well-maintained over the long term [9]. While numerous studies examined the effectiveness of penalties and enforcement as a tool for changing driver behavior, good or appropriate behavior is rarely highlighted in the road safety field [13].
Many countries experimented with rewarding young drivers. In Norway, for instance, part of the insurance premium was returned to young drivers if they remained accident-free for a specific period. This resulted in far fewer crash reports than were made for young drivers who did not participate in this scheme [31]. In Sweden, young drivers participating in an experiment were credited with a starting bonus. The bonus was reduced for each minute that they drove over the speed limit. At the end of the month, each participant received the cash value of the remaining amount. The study showed that the participants committed fewer speeding violations overall and that the number of serious speeding violations was substantially reduced [32]. In the Netherlands, the cars of young drivers had a device that registered the length, speed, and time of each drive. Insurance premiums for these drivers were decided based on the outcome; safer drivers paid a lower premium. The group receiving positive incentives committed fewer speeding violations than the control group [33].
The effectiveness of various reward schemes is determined by, among other factors, the value and type of reward, the probability of receiving the reward, and the type of behavior that is rewarded [27]. While there have been a few studies on the use of rewards to enhance road safety, studies regarding the influence of these factors in the context of road safety remain rare.
Existing studies generally show that small-scale reward schemes among relatively homogeneous groups (such as employees of the same company) yield better results than large-scale schemes (such as those in which all car drivers of a specific region are the target group) [9]. Cognitive dissonance theory suggests that the reward should be large enough to induce a change in behavior, but not so large that people reason that it was their sole motivation. In that case, the improved behavior would be expected to end once the reward was removed [34]. On the other hand, one may initially engage in a behavior because of an external reward but later quite willingly maintain this behavior despite termination of the reward [35]. For example, it is possible that some people initially started to use seatbelts with the expectation of an external reward and maintained this behavior even when rewards were withdrawn, possibly because they viewed this measure as important and acceptable [35,36].
The literature contains many approaches, most of them in the area of industrial relations, to the probability of receiving a reward and its relationship to the value of that reward [37,38]. According to [38], the behavior a subject follows depends on the perceived likelihood that the behavior will lead to the goal and on the goal's subjective value. Hence, motivation to follow a specific behavior is greater when the subject believes that the goal will be attained and that the goal has a high incentive value. Furthermore, when making decisions, people tend to give more weight to certain short-term advantages than to uncertain rewards expected in the long term [26]. According to Myerson et al. [38], individuals often prefer a smaller immediate reward over a larger delayed reward, primarily on the assumption that the delayed reward's subjective value is discounted, whereas the value of the immediate reward is not. Wine, Chen, and Brewer [39] recently examined the effects of delays in transferring earned monetary rewards in an employee support program. The results showed a decrease in responses when the probability of receiving the rewards decreased. Arazi [40] compared small, immediate incentives with larger, deferred incentives in a laboratory setting. That study showed that prizes had a small positive effect that was not statistically significant.
In line with the above, this study examined the effectiveness of various financial incentive schemes to improve drivers' safety performance, specifically with regard to speeding, tailgating, and frequent lane changing without signaling. This study examined the hypothesis that, with regard to unsafe driving behavior in a sample of professional bus drivers in Israel, small yet reliable rewards are more effective than large but rarely obtained rewards. While this hypothesis has been tested and partially supported in laboratory studies, the current study was the first to test it in real-world conditions.

Method
The research methodology was based on a field experiment conducted in collaboration with Metropoline Public Transportation Ltd. (henceforth Metropoline), one of Israel's largest public bus companies. As part of their routine procedures, all Metropoline buses featured electronic driver assistance (EDA) systems that reported on the bus's location, speed, trailing distance, and frequency of changing lanes without signaling. Notably, EDAs have been previously found to be highly effective in reducing high-speed driving [41], injuries, and casualties [42,43].
One hundred thirty-three drivers participated in the study after receiving a recruitment letter that explained the experiment's process, although only ninety-two of them participated in all aspects of the study. The letter included information about the experiment and specifically explained the competition and how investigators planned to determine scores and points. The drivers knew that scores would be based on multiple driving characteristics, including speed, changing lanes without signaling, and distance-keeping (tailgating). The drivers also knew that during the study, they would be eligible to earn rewards in varying amounts. The number of drivers eligible to receive each reward type was also communicated to them. The drivers received advance notice of the dates and hours of the study.
Each evening during the second and third stages of the study, each driver received a text message advising him about whether he had earned a reward the previous day, the amount of that reward, and the cumulative amount he had earned since the beginning of the study. Note that Metropoline administrative requirements determined that any earned rewards would be paid to the drivers after the study period's conclusion.
Participation was mandatory. The experimental period consisted of 86 workdays, excluding weekends. (In Israel, workdays are Sunday through Thursday; the Friday half-day of work was not counted.) Every workday, the drivers with the lowest scores on the safe driving index (lower scores indicate safer driving) were rewarded with incentives according to the research plan detailed below.

The Safe Driving Index
The safe driving index was constructed from data obtained from the EDAs and included the following:

• The total number of times that the driver exceeded the speed limit. Driving at a speed of 20 km/h or more over the limit was classified as a severe speeding offense;
• The number of tailgating alerts, active only when the speed exceeded 30 km/h;
• The number of daytime alerts warning of a collision with pedestrians;
• The number of alerts warning of a collision with another vehicle;
• The number of times a driver changed lanes without signaling;
• Driving time per day. This included the time allocated to preparing the bus;
• Time driving over the speed limit (exceeding the limit by 10, 20, or 30 km/h or more).

Calculating the Safe Driving Score
When determining the final score, each component was weighted according to its contribution to the occurrence of accidents. The index comprised time spent speeding (at three levels of speeding), alerts about near collisions with other vehicles, alerts about tailgating, and alerts about near collisions with pedestrians, as elaborated in Table 1. All variables were normalized by the overall driving time.
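The weighting and normalization described above can be sketched as follows. This is only an illustration: the weights in `ASSUMED_WEIGHTS` and the component names are hypothetical placeholders, since the actual weights are given in Table 1, which is not reproduced here. The function name `safe_driving_score` is also an assumption.

```python
# Hypothetical weights for illustration only; the study's real weights
# are listed in its Table 1 and are NOT reproduced here.
ASSUMED_WEIGHTS = {
    "speeding_10_min": 1.0,              # minutes driven 10+ km/h over the limit
    "speeding_20_min": 2.0,              # minutes driven 20+ km/h over the limit
    "speeding_30_min": 3.0,              # minutes driven 30+ km/h over the limit
    "vehicle_collision_alerts": 3.0,     # near-collision alerts (other vehicles)
    "tailgating_alerts": 2.0,            # tailgating alerts
    "pedestrian_collision_alerts": 3.0,  # near-collision alerts (pedestrians)
}


def safe_driving_score(components, driving_hours, weights=ASSUMED_WEIGHTS):
    """Weighted sum of risk components, normalized by driving time.

    Lower values indicate safer driving. Components missing from
    `components` are treated as zero.
    """
    if driving_hours <= 0:
        raise ValueError("driving_hours must be positive")
    weighted_sum = sum(
        weight * components.get(name, 0.0) for name, weight in weights.items()
    )
    return weighted_sum / driving_hours


# Example day: 4 minutes at 10+ km/h over the limit and 2 tailgating
# alerts during an 8-hour shift.
example_score = safe_driving_score(
    {"speeding_10_min": 4, "tailgating_alerts": 2}, driving_hours=8
)
```

Dividing by driving time keeps drivers with long shifts from being penalized simply for spending more hours on the road.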
During the experiment, two files were received from the company after every workday: an offense file and a travel file. Trips in these files that took less than 15 min were not analyzed, because they probably represented system faults or involved vehicle maintenance activities. Trips for which no offenses were registered were also excluded, as it was likely that the EDA had failed to register offenses for that trip. Because the definition of offenses was relatively inclusive, with the ability to record speed limit deviations of as little as 5 km/h (although such instances were not taken into account in the scoring), the likelihood that a driver had a complete workday with no offenses was so small that we treated a report with zero offenses as a system failure.
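A minimal sketch of the two exclusion rules above, assuming each trip record carries a duration and an offense count. The `Trip` record and its field names are hypothetical; the layout of the company's offense and travel files is not described in the text.

```python
from dataclasses import dataclass

MIN_TRIP_MINUTES = 15  # trips shorter than this were not analyzed


@dataclass
class Trip:
    # Hypothetical record layout, for illustration only.
    driver_id: str
    duration_min: float
    offense_count: int


def usable_trips(trips):
    """Keep trips that pass both exclusion rules from the text:
    at least 15 minutes long, and at least one registered offense
    (zero offenses was treated as a likely EDA failure)."""
    return [
        trip
        for trip in trips
        if trip.duration_min >= MIN_TRIP_MINUTES and trip.offense_count > 0
    ]


sample = [
    Trip("d1", 10, 3),   # too short: excluded
    Trip("d1", 45, 0),   # no offenses: treated as a system failure
    Trip("d2", 60, 5),   # kept
]
kept = usable_trips(sample)
```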

Sample Planning
Because participation in the study was mandatory, the study was free of distortions that could stem from drivers' willingness to participate and from drivers' personal characteristics. Beyond acknowledging that all the drivers were men, Metropoline protected the confidentiality of its drivers' identities and characteristics. The only other condition was that drivers had to work at least 40 hours per month for three months, starting from the beginning of the experiment. We randomly divided the drivers into two groups and exposed each group to two reward types:
• Type A (large rewards): The driver who had the best driving score for the day among his group received NIS 500, or USD 140, during the study period. (Initially, the intention was that if several drivers attained the same top score, they would split the sum among themselves, but in practice, no such reward-dividing incidents occurred in this group.);
• Type B (small rewards): The 50% of drivers with the best scores that day split an NIS 500 reward among themselves. (For example, in a group of 40 drivers, the top 20 drivers would receive NIS 25, or USD 7, each.)
During the experiment, the groups alternated so that each was exposed to both reward types. (After completing Stage 2, Condition I, the group that had been working for a large reward began to work for a smaller reward; the other group switched in the opposite direction.) However, it is possible that the order itself also had an impact. Each driver was notified daily about his safety score, whether he had won the previous day's reward, how much (if anything) he had won, and the total rewards he had won to date. Notably, due to administrative constraints, drivers received all their daily rewards in one check at the end of the experiment.
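The two reward rules can be sketched as below, assuming that a lower score means safer driving (as the paper notes for the index) and that the daily pot is NIS 500 under both schemes. The function names are hypothetical, and the handling of ties at the Type B cutoff (resolved by sort order) is an assumption, since the text does not describe it.

```python
DAILY_POT_NIS = 500  # same daily pot under both schemes


def type_a_rewards(scores):
    """Type A: the driver with the best (lowest) score takes the whole
    pot; drivers tied for the top score split it equally.
    `scores` maps driver id to that day's score and must be non-empty."""
    best = min(scores.values())
    winners = [driver for driver, score in scores.items() if score == best]
    return {driver: DAILY_POT_NIS / len(winners) for driver in winners}


def type_b_rewards(scores):
    """Type B: the best-scoring half of the group splits the pot equally.
    Ties at the cutoff are broken by sort order (an assumption)."""
    ranked = sorted(scores, key=scores.get)  # lowest (safest) score first
    winners = ranked[: len(ranked) // 2]
    return {driver: DAILY_POT_NIS / len(winners) for driver in winners}


# A group of 40 drivers with distinct scores 1..40 (illustrative values).
group_scores = {f"driver_{i}": float(i) for i in range(1, 41)}
payout_a = type_a_rewards(group_scores)
payout_b = type_b_rewards(group_scores)
```

With 40 drivers, Type A pays one driver NIS 500, while Type B pays 20 drivers NIS 25 each, matching the example in the text. The expected payout per driver per day is the same under both schemes; only the probability and size of the reward differ.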

The Experiment
The experiment included four stages:
a. Baseline: The first stage, which lasted 13 workdays, was used to generate a baseline for drivers' behavior. Drivers were not aware of this stage; they were neither rewarded nor informed of their safety scores. While drivers were aware that their behavior was being recorded, it was only as part of the routine surveillance carried out by Metropoline;
b. Condition I: Drivers were randomly assigned to one of the experimental group types described above (Type A or B). This stage lasted 30 days;
c. Condition II: Drivers were reassigned to the alternative experimental group type. This stage also lasted 30 days;
d. Post-experimental period: The fourth stage of the experiment consisted of 13 days of dry testing to examine the experiment's long-term impact on drivers' behavior. This stage took place after the reward stages of the experiment ended. At this stage, drivers were not aware that the investigators continued to monitor their driving, although they knew throughout that Metropoline buses were equipped with EDA monitors.
Table 2 describes the number of participants in each stage, the average workdays, and the reward scheme that was applied to the driver groups in each stage.
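The four-stage crossover design can be summarized in a short sketch. It assumes the stages ran back to back over workdays numbered 1 to 86, that the 30-day stages are counted in workdays, and that Group 1 started on Type A while Group 2 started on Type B (as reported in the results). The function names are hypothetical.

```python
# Stage lengths in workdays, as described in the text.
STAGES = [
    ("Baseline", 13),
    ("Condition I", 30),
    ("Condition II", 30),
    ("Post-experimental period", 13),
]

# Crossover: the groups swap reward schemes between Conditions I and II.
SCHEME = {
    ("Condition I", 1): "A",
    ("Condition I", 2): "B",
    ("Condition II", 1): "B",
    ("Condition II", 2): "A",
}

TOTAL_WORKDAYS = sum(length for _, length in STAGES)


def stage_for_workday(day):
    """Return the stage name for a 1-based workday number."""
    if day < 1:
        raise ValueError("workdays are numbered from 1")
    last_day_of_stage = 0
    for name, length in STAGES:
        last_day_of_stage += length
        if day <= last_day_of_stage:
            return name
    raise ValueError("day is beyond the experimental period")


def scheme_for(group, day):
    """Return 'A' or 'B' for a group on a given workday, or None when
    no reward scheme applies (baseline and post-experimental period)."""
    return SCHEME.get((stage_for_workday(day), group))
```

The stage lengths sum to the 86 workdays reported for the experimental period.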

Procedure
Descriptive statistics were employed to test our hypothesis. We used Pearson's correlation test to examine the relation between the total reward and the daily rewards accumulated by drivers, on the one hand, and the driving index, on the other, in both the second and the third stages of the experiment. In addition, we developed a linear regression model to examine the relationships between the total rewards, daily rewards, driver group, experimental condition, and the safe driving index.
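For readers unfamiliar with these tools, the sketch below shows how a Pearson correlation coefficient and a least-squares line are computed. It is a simplified illustration, not the study's model: the regression here has a single predictor, whereas the paper's model includes several variables, and the numbers are made up for demonstration and are not study data.

```python
from math import sqrt


def _centered_sums(x, y):
    """Return the means of x and y plus their centered sums of squares
    and cross-products."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    _, _, sxy, sxx, syy = _centered_sums(x, y)
    return sxy / sqrt(sxx * syy)


def ols_fit(x, y):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up illustrative values (NOT study data): accumulated reward in NIS
# and the corresponding driving index for five hypothetical drivers.
total_reward = [0.0, 50.0, 100.0, 150.0, 200.0]
driving_index = [9.0, 8.0, 7.5, 6.0, 5.5]

r = pearson_r(total_reward, driving_index)
slope, intercept = ols_fit(total_reward, driving_index)
```

A negative `r` and a negative `slope` in this toy data correspond to the pattern the paper reports: larger rewards go together with a lower, that is safer, driving index.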

Changes in Driving Behavior
The study results showed that participating drivers reduced the amount of time they drove at speeds above the posted limits in every speed range category. Most importantly, the results indicate that the speed reductions continued even after the study period ended. Figure 1 illustrates all four stages for the part of the study that focused on speeds more than 20 km/h over the posted limit. Stage 1 represents the baseline findings. Stage 2 shows the overall results after the study began and before the groups alternated between large and small reward categories. Stage 3 shows the same type of information as Stage 2, but for the period after the groups had switched reward schemes. Stage 4 shows that both groups continued to reduce excessive-speed driving during the period after the drivers believed that the experiment was over, when neither group expected to receive any further rewards. Figure 2 displays the same type of information as shown in Figure 1, but it focuses on those drivers who initially operated their vehicles at speeds that were more than 25 km/h over the posted limit. Figure 3 displays the same type of information as shown in Figures 1 and 2, but it focuses on those drivers who initially operated their vehicles at speeds that were 30 km/h or more over the posted limit.
Figures 1-3 demonstrate that drivers in both experimental groups spent more time driving 30 km/h over the speed limit than driving 20 or 25 km/h over it. The decline in this level of speeding (30 km/h over the limit) was the most substantial, showing a reduction of about 70 min among drivers in Group 1 (from 193.6 to 123.3 min) and about 55 min among drivers in Group 2 (from 161.3 to 106.4 min).
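The reductions quoted above for driving 30 km/h over the limit can be checked with simple arithmetic. The sketch below uses only the four values given in the text; the helper name `reduction` is hypothetical, and the percentage figures are derived here rather than reported in the paper.

```python
def reduction(before_min, after_min):
    """Return the absolute drop in minutes and the drop as a percentage
    of the starting value, both rounded to one decimal place."""
    drop = before_min - after_min
    return round(drop, 1), round(100 * drop / before_min, 1)


group_1_drop, group_1_pct = reduction(193.6, 123.3)
group_2_drop, group_2_pct = reduction(161.3, 106.4)
```

This gives a drop of 70.3 min (about 36%) for Group 1 and 54.9 min (about 34%) for Group 2, so the two groups improved by similar proportions even though Group 1 started from a higher level.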
Table 3 demonstrates that overall, the experiment brought about a significant change in travel speeds. However, it is difficult to draw clear conclusions about the relative effectiveness of the two reward schemes. The decline between the first and second stages of the experiment was more significant for the first group of drivers than for the second group across the three speeding categories. This disparity may be because the first group was first assigned to the Type A scheme (high but rare rewards). As discussed, maintaining the behavior change is crucial to the success of any new scheme. As Table 3 demonstrates, among the second group of drivers, who were initially assigned to the Type B condition (small yet reliable rewards), the decrease in the average time of driving 30 km/h over the limit from Stage C to Stage D was statistically significant, while the decrease among the first group was not. This result supports our original hypothesis, suggesting that a Type B scheme (small but probable rewards) is more effective for maintaining the behavior change over time. Alternatively, it is also possible that drivers in this group had more room for improvement.
Table 3. Review of the statistical significance of differences in average over-the-speed-limit driving time between research stages (ANOVA).
Figure 4 presents the average number of tailgating alert incidents for the various research stages for the total sample. A decreasing trend in the average number of tailgating alert incidents was evident throughout the research stages in both groups. The decline between the stages was not statistically significant.
Table 4 presents the average driving indices and standard deviations for the various research stages for the total sample. Table 5 presents the significance of the differences between these indices. Note that lower index values are associated with reductions in risky driving behavior.
Table 5. Statistical significance of differences in the index values between the four stages of research (ANOVA).
A decreasing trend for the index was evident throughout the stages of research. The decline from Stage A to Stage B was statistically significant. This trend continued between the second and third stages of the research, when each group was reassigned to the incentive method of the other. Notably, the decreasing trend continued even after the experiment ended. Overall, the improvement in driving safety was evident.

Table columns: Stage Difference, Average Difference, Standard Deviation, p-Value.
Driving Score Data Analysis by Research Stage and Incentive Method
Table 6 presents the daily average driving score throughout the stages of the research per group. Table 7 presents the differences in driving scores between the research stages and shows the statistical significance of these differences.
As is shown, the scores of both driver groups declined significantly throughout the experiment. The average score of the first group (assigned to the Type A reward scheme under Condition I) declined significantly between Stages A and B. While the scores for this group continued to decline in the following stages, the later changes were not statistically significant. The average score of the second group (assigned to a Type B reward scheme under Condition I) also declined between Stages A and B, although this decline was not statistically significant. However, the decline between Stages B and C (during the shift to Condition II) was significant. The index continued to decline for both groups in the post-experiment stage (Stage D), but not by statistically significant amounts. These results suggest that monitoring, along with rewards, does bring about a significant improvement in driving safety. However, it is not possible to draw clear conclusions about the relative effectiveness of either reward scheme.
Table 8 presents the per-group distribution of drivers who received rewards at least once. This variable was converted into a binary variable: never rewarded versus rewarded at least once. As anticipated, participants under the small-reward method had the highest probability of receiving any reward. Interestingly, drivers in both groups were more successful in achieving at least one reward at Stage C, having already experienced one of the incentive structures at Stage B. Thus, Group 1 drivers were more successful in getting rewards at Stage C (small but probable rewards) than Group 2 drivers were at Stage B under the same method (96.49% vs. 91.80%). Additionally, Group 2 drivers were more successful in getting rewards at Stage C (high but rare rewards) than Group 1 drivers were at Stage B under the same method (29.51% vs. 27.42%). This disparity may reflect the impact of participating in the experiment, which renders rewards tangible and expected and encourages safer driving.
Table 9 presents the results of the driving index's linear regression model. The results demonstrated a negative correlation between the daily reward's size and the driving score; a higher daily reward was associated with safer driving. The total reward had a more significant impact on the decline in the driving index than the daily reward. The more rewards drivers accumulated, the more their driving performance improved. The Pearson's correlation test results indicate that a small yet probable reward was more effective than a large yet rare reward. When drivers were under the former scheme, the decline in the safe driving index was significantly larger. Similarly, we saw a positive, significant correlation between being in Group 1 and the driving score. As we have seen, the Group 1 drivers began with a higher index, and this variable served as the control for the impact of the reward method.
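The binary variable used for Table 8 (never rewarded versus rewarded at least once) can be derived from daily reward records as sketched below. The data layout, a mapping from driver id to that driver's daily reward amounts within one stage, and the function name are assumptions for illustration; the amounts shown are made up and are not study data.

```python
def share_rewarded_at_least_once(daily_rewards_by_driver):
    """Percentage of drivers who received a reward on at least one day.

    `daily_rewards_by_driver` maps each driver id to a list of daily
    reward amounts for one stage (0 on days without a reward).
    """
    if not daily_rewards_by_driver:
        raise ValueError("no drivers supplied")
    rewarded = sum(
        1
        for amounts in daily_rewards_by_driver.values()
        if any(amount > 0 for amount in amounts)
    )
    return 100 * rewarded / len(daily_rewards_by_driver)


# Made-up illustrative records (NOT study data) for four drivers.
stage_rewards = {
    "d1": [0, 25, 0],
    "d2": [0, 0, 0],
    "d3": [25, 25, 25],
    "d4": [0, 0, 25],
}
share = share_rewarded_at_least_once(stage_rewards)
```

Collapsing the daily amounts to a yes/no flag makes the two schemes comparable on how many drivers they reach, independently of the size of each payout.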

Discussion and Conclusions
The present study examined the influence of incentives in general and of two incentive distribution schemes in particular on real-life driving behavior among a group of professional bus drivers in Israel. To our knowledge, it is the first study to examine this question in the field rather than under laboratory conditions.
The research demonstrated that a combination of monitoring, incentives (in this case, financial rewards), and real-time notification of drivers regarding their performance resulted in a significant and continuous decline in traffic offenses. This is consistent with Merrikhpour et al. [44], who investigated the effects of a feedback-reward system on speed limit compliance rates in a field trial among Canadian drivers. As in the current study, they found that providing feedback increased compliance with speed limits, and that this positive effect was still apparent after the experiment was over, albeit to a weaker degree. Our results are also consistent with Elias [13], who demonstrated that the use of rewards could complement or possibly replace the reliance on negative sanctions to modify behavior.
In contrast to Hurst [45], who argued that attempts to reward safe driving were not likely to produce any useful results, our results support the hypothesis that financial rewards can demonstrably improve driving practices. The effects also persisted into the period after the incentive scheme ended. This impact may stem from drivers' awareness of being monitored and the competitive environment created in the experiment. Future studies should examine these effects over a more extended period than was possible in the current study.
As was shown, both groups demonstrated a continuous decline in driving scores, making it impossible to draw unequivocal conclusions about the most effective incentive scheme. However, there are some indications that a Type B scheme (small yet probable rewards) is a more effective motivator of good driving behavior. First, drivers assigned to a Type B scheme in the latter stage were more likely to maintain the change in behavior in the post-experimental stage. Second, the results demonstrated a higher correlation between driving scores and daily rewards for groups under the Type B scheme. These results are also consistent with Erev [46], who examined the impact of different enforcement conditions on the level of compliance with safety regulations in a factory. He found that small incentives were the most effective scheme for increasing workers' compliance rates and for maintaining the trend after completion of the study.
Advanced technology offers new ways of improving monitoring, although it may bring about unexpected reactions [41][42][43][47]. While driver participation was mandatory in the present research, in other applications, and to expand the use of the approach, we would like drivers to participate voluntarily, hence the importance of the incentive. Notably, western driving culture historically views the vehicle as an extension of the home, a private place where the driver is entirely autonomous [48]. Consequently, many drivers refuse to install monitoring technologies in their vehicles [49]. The aim of new traffic safety policies is to gain the public's willing collaboration [50], especially for those policies that rely on an incentive scheme. Unlike policies centered on negative sanctions, incentive schemes assume that the driver will be rewarded for routine behavior. Thus, maintenance of such schemes requires constant monitoring. The use of incentives may contribute to a reframing of driving practices and, consequently, of driver-monitoring activities under the heading of fairness, as part of a rights and duties package. This approach is in line with the growing preference for a holistic approach to traffic safety, a reframing of the responsibility for car accidents, and a shift to a shared responsibility paradigm [51]. Implementing such a system within a commercial company may improve safety and bring down safety expenditures while contributing to the company's public image as innovative and safety-promoting. Adoption of this approach on the national level could substantially cut down expenses and improve public safety. Such action will contribute to achieving sustainable development and sustainable public health.
This study had several characteristics that limited the generalizability of its results. First, the sample was relatively small. Second, for bureaucratic reasons, rewards could not be distributed to drivers daily or even weekly. We are unsure about the extent to which daily text messages were able to replace immediate rewards. Third, we had no way of checking whether drivers read those messages. Perhaps we should have implemented a mechanism that would have required drivers to respond to the messages, thus confirming they read them. We refrained from this, as we wished to minimize the burden on the drivers in the study's framework.