Understanding the Interaction between Cyclists’ Traffic Violations and Enforcement Strategies: An Evolutionary Game-Theoretic Analysis

An evolutionary game-theoretic analysis method is developed in this study to understand the interactions between cyclists’ traffic violations and the enforcement strategies. The evolutionary equilibrium stabilities were analysed under a fixed (FPS) and a dynamic penalty strategy (DPS). The simulation-based numerical experiments show that: (i) the proposed method can be used to study the interactions between traffic violations and the enforcement strategies; (ii) FPS and DPS can reduce cyclists’ probability of committing traffic violations when the perceived traffic violations’ relative benefit is less than the traffic violation penalty and the enforcement cost is less than the enforcement benefit, and using DPS can yield a stable enforcement outcome for law enforcement compared to using FPS; and (iii) strategy-related (penalty amount, enforcement effectiveness, and enforcement cost) and attitudinal factors (perceived relative benefit, relative public image cost, and cyclists’ attitude towards risk) can affect the enforcement strategy’s impacts on reducing cyclists’ traffic violations.


Introduction
Owing to growing congestion and pollution, demand is increasing for mobility alternatives to driving, such as riding a human-powered bike or an electric bike (i.e., e-bikes equipped with bicycle pedals and e-bikes in scooter form, hereafter referred to as "e-bike"). E-bike ownership has skyrocketed around the world over the past decade. In China, 3 million e-bikes were sold and over 200 million were on the road in 2016, compared to just a few thousand in 1998 [1,2]. Between 2005 and 2012, annual sales of e-bikes in Switzerland increased from 1792 to 52,941, an average annual growth rate of 62.2% [3]. Although the increasing use of human-powered bikes and e-bikes offers a cheap, convenient, and environmentally friendly travel alternative, traffic accidents involving them have also increased significantly. Traffic violations by cyclists (i.e., bike and e-bike riders) are one of the primary reasons for the increase in cyclist-related traffic accidents around the world. A traffic violation in this study is defined as intentionally disobeying a traffic rule to gain some personal benefit, such as reducing travel time. For example, a 2005 survey of cyclists and pedestrians in Florida reported that nearly 15% of cyclist-related crashes were caused by cyclists' right-of-way violations [4]. In North Carolina municipalities between 2008 and 2012, over 10% of the total cyclist-related accidents were

Table 1. Payoff matrix for cyclists and law enforcement. Each cell lists (cyclists' payoff, law enforcement's payoff).
Cyclists \ Law enforcement | Enforce traffic rules | Do not enforce traffic rules
Commit traffic violations | ($\Delta V - rC_p$, $rC_p - C_e$) | ($\Delta V$, $-\Delta I$)
Do not commit traffic violations | ($0$, $-C_e$) | ($0$, $0$)

It is important to note that the payoffs to cyclists and law enforcement are simplified for illustration purposes. Most of the assumptions related to the payoffs can be relaxed. The main objective of this study is to analyze the evolutionarily stable strategies (ESSs) of the game and their conditions, regardless of the exact utility (e.g., income or cost), as long as the nature of the utility meets the rationality constraints required in a real traffic scenario.
Given the payoffs in Table 1, the models for violation and enforcement decisions follow Expected Utility Theory [37-39], whereby one party's decision utility depends upon the expected action of the other party. The payoff functions of the two parties are therefore interrelated. The expected utility decision models are shown in Equations (1)-(6).
Each party's expected utility is weighted by its belief about the other party's decision, expressed as the probabilities p and q (p ∈ [0, 1] and q ∈ [0, 1]). p is law enforcement's estimated probability that cyclists will commit traffic violations in the period. q is cyclists' estimated probability that law enforcement will enforce traffic rules; it represents the monitoring intensity of law enforcement, such as police patrols and traffic cameras.
The expected utility decision models of committing (Equation (1)) and not committing traffic violations (Equation (2)), and the average expected utility decision model for cyclists (Equation (3)), can be calculated as follows:
$$U_{v1} = q(\Delta V - rC_p) + (1 - q)\Delta V = \Delta V - qrC_p \tag{1}$$
$$U_{v2} = q(0) + (1 - q)(0) = 0 \tag{2}$$
$$U_v = pU_{v1} + (1 - p)U_{v2} = p(\Delta V - qrC_p) \tag{3}$$
where $U_{v1}$ is the expected utility of cyclists for committing traffic violations, $U_{v2}$ is the expected utility of cyclists for not committing traffic violations, and $U_v$ is the average expected utility of cyclists' decision.
The expected utility decision models of enforcing (Equation (4)) and not enforcing traffic rules (Equation (5)), and the average expected utility decision model for law enforcement (Equation (6)), can be calculated as follows:
$$U_{e1} = p(rC_p - C_e) + (1 - p)(-C_e) = prC_p - C_e \tag{4}$$
$$U_{e2} = p(-\Delta I) + (1 - p)(0) = -p\Delta I \tag{5}$$
$$U_e = qU_{e1} + (1 - q)U_{e2} = q(prC_p - C_e) - p\Delta I(1 - q) \tag{6}$$
where $U_{e1}$ is the expected utility of law enforcement for enforcing traffic rules, $U_{e2}$ is the expected utility of law enforcement for not enforcing traffic rules, and $U_e$ is the average expected utility of law enforcement's decision.
The replicator equation is then used to capture the evolutionary process through selection dynamics [40,41]. It measures how a party's probability of choosing a strategy changes across time periods. The replicator equation of cyclists' decision is denoted as $f(p) = dp/dt$, and the replicator equation of law enforcement's decision is denoted as $f(q) = dq/dt$ [42]. The replicator equations can be written as:
$$f(p) = \frac{dp}{dt} = p(U_{v1} - U_v) = p(1 - p)(\Delta V - qrC_p) \tag{7}$$
$$f(q) = \frac{dq}{dt} = q(U_{e1} - U_e) = q(1 - q)\left[p(rC_p + \Delta I) - C_e\right] \tag{8}$$
In Equations (7) and (8), the penalty amount ($C_p$) can be flexible. It can follow a fixed penalty strategy (FPS) or a dynamic penalty strategy (DPS). Under FPS, the penalty amount is fixed during the study period, while under DPS, the penalty amount changes at the end of every time unit (one day or several days) within the study period. The evolutionary equilibrium stabilities under these two penalty strategies are discussed in detail in Sections 2.2 and 2.3.
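The step from the Table 1 payoffs to the replicator dynamics can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names and the parameter values in the usage example are hypothetical, and the replicator form (growth proportional to the utility advantage over the population average) is the standard one assumed here.

```python
def cyclist_utilities(p, q, dV, r, Cp):
    """Expected utilities of cyclists, read off the Table 1 payoffs."""
    u_v1 = q * (dV - r * Cp) + (1 - q) * dV  # commit violations
    u_v2 = 0.0                               # do not commit violations
    u_v = p * u_v1 + (1 - p) * u_v2          # population average
    return u_v1, u_v2, u_v


def enforcement_utilities(p, q, r, Cp, Ce, dI):
    """Expected utilities of law enforcement, read off the Table 1 payoffs."""
    u_e1 = p * (r * Cp - Ce) + (1 - p) * (-Ce)  # enforce traffic rules
    u_e2 = p * (-dI)                            # do not enforce traffic rules
    u_e = q * u_e1 + (1 - q) * u_e2             # population average
    return u_e1, u_e2, u_e


def replicator(p, q, dV, r, Cp, Ce, dI):
    """Return (dp/dt, dq/dt): each strategy share grows in proportion to
    its utility advantage over the population average."""
    u_v1, _, u_v = cyclist_utilities(p, q, dV, r, Cp)
    u_e1, _, u_e = enforcement_utilities(p, q, r, Cp, Ce, dI)
    return p * (u_v1 - u_v), q * (u_e1 - u_e)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(replicator(0.5, 0.5, dV=25.0, r=1.0, Cp=40.0, Ce=15.0, dI=10.0))
```

Keeping the utilities separate from the dynamics makes it easy to swap in a different penalty rule for $C_p$ without touching the replicator step.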
The perceived relative benefit ($\Delta V$) can be influenced by cyclists' attitude towards risk. To capture the impacts of these attitude-related factors on the perceived relative benefit of traffic violations, Cumulative Prospect Theory (CPT) is used [43,44]. CPT is a descriptive model of decisions under risk and uncertainty. According to CPT, people tend to evaluate the perceived benefits or costs of possible outcomes relative to a reference point (most often the status quo) rather than by the outcome's absolute value. People can have different risk attitudes towards gains (i.e., outcomes above the reference point) and losses (i.e., outcomes below the reference point). Most people are risk-averse, meaning that they weigh potential losses more heavily than potential gains of the same amount when exposed to uncertainty [45,46]. The cyclists' perceived relative benefit is formulated as follows:
$$\Delta V = \omega(p_1)v(x_1) + \omega(p_2)v(x_2) \tag{9}$$
where $\omega(p_i)$ is cyclists' decision weight (i.e., how much weight they assign to the probability of each risky outcome) on their subjective perception of the probabilities of the two risky outcomes ($p_i$), $i = 1$ or $2$; $p_1$ refers to the probability of not being punished for traffic violations, and $p_2$ refers to the probability of being punished for traffic violations. Following Kahneman and Tversky [43], $\omega(p_1) + \omega(p_2) < 1$.
$v(x_i)$ is cyclists' value function for the benefit of committing a traffic violation with or without being punished, relative to the reference point. $x_1$ refers to the difference between the benefit received without being punished and the reference point, and $x_2$ refers to the difference between the benefit received when punished and the reference point. The reference point is the benefit of not committing traffic violations. The value function proposed by Tversky and Kahneman [44] is applied in this study. It can be written as follows:
$$v(x_i) = \begin{cases} x_i^{\alpha}, & x_i \ge 0 \\ -\lambda(-x_i)^{\beta}, & x_i < 0 \end{cases} \tag{10}$$
where $\alpha$ and $\beta$ are the risk attitude coefficients that determine the convexity or concavity of the value function, with $\alpha, \beta \in [0, 1]$. These two coefficients reflect cyclists' risk attitude towards traffic violations: the smaller the risk attitude coefficients, the higher the risk that cyclists perceive in committing violations. $\lambda$ is the loss aversion coefficient, which captures cyclists' sensitivity to possible losses when punished for committing traffic violations, with $\lambda \ge 1$. Combining Equations (9) and (10) expresses $\Delta V$ in terms of $E$, the benefit cyclists receive from each violation (Equation (11)). Assuming $C_p > E$, the final expression for $\Delta V$ is given by Equation (12).
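The following sketch shows how a perceived relative benefit could be computed from decision weights and the Tversky-Kahneman value function. It rests on an assumption that the text does not confirm: the unpunished outcome is taken as $x_1 = E$ and the punished outcome as $x_2 = E - C_p$, with the reference point at zero. The function names and all numerical inputs are hypothetical.

```python
def value(x, alpha, beta, lam):
    """Tversky-Kahneman value function relative to the reference point:
    concave for gains, steeper (scaled by lam) for losses."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta


def perceived_relative_benefit(E, Cp, w1, w2, alpha, beta, lam):
    """Weighted value of the two risky outcomes.

    Assumption (not confirmed by the source): x1 = E when the violation
    goes unpunished and x2 = E - Cp when it is punished.
    """
    x1 = E
    x2 = E - Cp
    return w1 * value(x1, alpha, beta, lam) + w2 * value(x2, alpha, beta, lam)


if __name__ == "__main__":
    # Hypothetical inputs; the decision weights sum to 0.9 (< 1).
    for lam in (1.0, 1.5, 2.25):
        dV = perceived_relative_benefit(E=30.0, Cp=40.0, w1=0.7, w2=0.2,
                                        alpha=0.88, beta=0.88, lam=lam)
        print(f"lambda={lam:.2f}  dV={dV:.2f}")
```

Under this assumed form, a larger loss aversion coefficient lowers the perceived relative benefit whenever $C_p > E$, which is the direction the text describes.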

Evolutionary Equilibrium Stability under FPS
Under FPS, the fixed penalty amount is calculated as follows:
$$C_p = C_{p(f)} \tag{13}$$
By substituting Equation (13) into Equations (7) and (8), the replicator equations under FPS can be written as:
$$f(p) = \frac{dp}{dt} = p(1 - p)(\Delta V - qrC_{p(f)}) \tag{14}$$
$$f(q) = \frac{dq}{dt} = q(1 - q)\left[p(rC_{p(f)} + \Delta I) - C_e\right] \tag{15}$$
In an evolutionary game model, a trajectory that starts from an arbitrarily small neighborhood of an asymptotically stable equilibrium point evolves towards that point, which is called an ESS [47]. If a sufficient proportion of the parties adopts a strategy that achieves an ESS, the system will remain stable. Based on the definition of ESSs, the rate of change in Equations (14) and (15) should be zero, i.e., $f(p) = dp/dt = 0$ and $f(q) = dq/dt = 0$. The five potential ESSs are therefore:
$$E_1 = (0, 0),\quad E_2 = (0, 1),\quad E_3 = (1, 0),\quad E_4 = (1, 1),\quad E_5 = (p^*, q^*) = \left(\frac{C_e}{rC_{p(f)} + \Delta I},\ \frac{\Delta V}{rC_{p(f)}}\right)$$
Friedman [48] provided the ESS condition for the evolutionary game. Specifically, for an equilibrium state to be asymptotically stable, the determinant of the Jacobian matrix $J$ should be positive ($\det J > 0$) and its trace should be negative ($\operatorname{tr} J < 0$). Any state that meets this condition is an ESS. Using the replicator Equations (14) and (15), the Jacobian matrix $J$ can be written as:
$$J = \begin{bmatrix} \partial f(p)/\partial p & \partial f(p)/\partial q \\ \partial f(q)/\partial p & \partial f(q)/\partial q \end{bmatrix} = \begin{bmatrix} (1 - 2p)(\Delta V - qrC_{p(f)}) & -p(1 - p)rC_{p(f)} \\ q(1 - q)(rC_{p(f)} + \Delta I) & (1 - 2q)\left[p(rC_{p(f)} + \Delta I) - C_e\right] \end{bmatrix} \tag{16}$$
Then $\det J$ and $\operatorname{tr} J$ are given by:
$$\det J = (1 - 2p)(1 - 2q)(\Delta V - qrC_{p(f)})\left[p(rC_{p(f)} + \Delta I) - C_e\right] + p(1 - p)q(1 - q)rC_{p(f)}(rC_{p(f)} + \Delta I) \tag{17}$$
$$\operatorname{tr} J = (1 - 2p)(\Delta V - qrC_{p(f)}) + (1 - 2q)\left[p(rC_{p(f)} + \Delta I) - C_e\right] \tag{18}$$
Based on Equations (17) and (18), the ESSs are conditioned upon the values of the following parameters: $\Delta V$, $C_{p(f)}$, $r$, $C_e$, and $\Delta I$, where $\Delta V$ is determined by cyclists; $C_{p(f)}$, $r$, and $C_e$ are determined by law enforcement; and $\Delta I$ is determined by the public. Under FPS, we discuss all possible conditional constraints for the equilibrium stability analysis. Table 2 summarizes the determinants and traces of the Jacobian matrix $J$ for the five potential ESSs. The local stability of the equilibria in three situations is shown in Tables 3-5.
Situation 1: if $\Delta V > rC_{p(f)}$ and $rC_{p(f)} + \Delta I > C_e$, $E_4 = (1, 1)$ is an ESS, which corresponds to a pure strategy (i.e., one party can only adopt one strategy at a time) in which cyclists commit traffic violations and law enforcement enforces traffic rules.
Situation 2: if $\Delta V > 0$ and $rC_{p(f)} + \Delta I < C_e$, $E_3 = (1, 0)$ is an ESS, which corresponds to a pure strategy in which cyclists commit traffic violations and law enforcement does not enforce traffic rules.
Situation 3: if $0 < \Delta V < rC_{p(f)}$ and $rC_{p(f)} + \Delta I > C_e$, $E_5 = (p^*, q^*)$ is an unstable center, which corresponds to a mixed strategy (i.e., each party adopts each strategy with some probability). This means that the strategy probabilities of cyclists and law enforcement will fluctuate around $(p^*, q^*)$ and cannot converge.
The other situations (i.e., $\Delta V = rC_{p(f)}$ or $rC_{p(f)} + \Delta I = C_e$) are less likely to occur [49].
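The determinant-and-trace test can be applied numerically to the five candidate points. The sketch below assumes the FPS replicator form $dp/dt = p(1-p)(\Delta V - qrC_p)$ and $dq/dt = q(1-q)[p(rC_p + \Delta I) - C_e]$ derived from the Table 1 payoffs. The function names and the three parameter sets are hypothetical, chosen only so that each set satisfies the inequalities of one situation.

```python
def jacobian_fps(p, q, dV, r, Cp, Ce, dI):
    """Jacobian entries of the assumed FPS replicator system."""
    j11 = (1 - 2 * p) * (dV - q * r * Cp)
    j12 = -p * (1 - p) * r * Cp
    j21 = q * (1 - q) * (r * Cp + dI)
    j22 = (1 - 2 * q) * (p * (r * Cp + dI) - Ce)
    return j11, j12, j21, j22


def interior_fps(dV, r, Cp, Ce, dI):
    """Mixed-strategy rest point (p*, q*) where both brackets vanish."""
    return Ce / (r * Cp + dI), dV / (r * Cp)


def classify(p, q, tol=1e-9, **params):
    """Label a rest point using det J and tr J."""
    j11, j12, j21, j22 = jacobian_fps(p, q, **params)
    det = j11 * j22 - j12 * j21
    tr = j11 + j22
    if det < -tol:
        return "saddle"
    if det > tol and tr < -tol:
        return "ESS"
    if det > tol and tr > tol:
        return "unstable"
    if det > tol:
        return "center"      # det > 0 and tr = 0: no convergence
    return "degenerate"


if __name__ == "__main__":
    # Hypothetical parameter sets, one per situation.
    situations = {
        "Situation 1": dict(dV=45.0, r=1.0, Cp=40.0, Ce=15.0, dI=10.0),
        "Situation 2": dict(dV=45.0, r=1.0, Cp=40.0, Ce=60.0, dI=10.0),
        "Situation 3": dict(dV=25.0, r=1.0, Cp=40.0, Ce=15.0, dI=10.0),
    }
    for name, prm in situations.items():
        points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
        p_s, q_s = interior_fps(**prm)
        if 0 < p_s < 1 and 0 < q_s < 1:
            points.append((p_s, q_s))
        for p, q in points:
            print(name, (round(p, 3), round(q, 3)), classify(p, q, **prm))
```

At the interior point the trace is zero under FPS, so the point fails the $\operatorname{tr} J < 0$ requirement even though the determinant is positive. This is why the strategy probabilities keep cycling in Situation 3.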

Evolutionary Equilibrium Stability under DPS
The volatility and recurrence of traffic violations (i.e., in Situation 3 under FPS, the probability of committing traffic violations fluctuates constantly) may cause law enforcement to make overly optimistic or pessimistic estimations. For example, law enforcement may introduce a penalty that appears sufficient to reduce traffic violations based on an optimistic estimation under FPS. This means that an ideal penalty strategy should not only reduce the total number of traffic violations but also allow law enforcement to correctly assess and predict the effectiveness of its strategies (i.e., yield a stable equilibrium solution). Hence, DPS, in which the penalty amount is correlated with the probability of committing traffic violations, is proposed and studied. The dynamic penalty amount is calculated as follows:
$$C_p = kC_{p(d)} \tag{19}$$
where $k$ is the dynamic penalty coefficient, which depends on $p$. In this study, we first select one form of $k$ for demonstration purposes, i.e., $k = 1 + p$, $k \in [1, 2]$. This means that the minimum penalty amount for traffic violations is $C_{p(d)}$ and the maximum penalty amount is $2C_{p(d)}$, reached when the probability of committing traffic violations reaches 1.
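By substituting Equation (19) into Equations (7) and (8), the replicator equations under DPS can be given as:
$$f(p) = \frac{dp}{dt} = p(1 - p)\left[\Delta V - qr(1 + p)C_{p(d)}\right] \tag{20}$$
$$f(q) = \frac{dq}{dt} = q(1 - q)\left[p\left(r(1 + p)C_{p(d)} + \Delta I\right) - C_e\right] \tag{21}$$
Similarly, the five candidate ESSs are $E_1 = (0, 0)$, $E_2 = (0, 1)$, $E_3 = (1, 0)$, $E_4 = (1, 1)$, and $E_5 = (p^*, q^*)$, where
$$p^* = \frac{-(rC_{p(d)} + \Delta I) + \sqrt{(rC_{p(d)} + \Delta I)^2 + 4rC_{p(d)}C_e}}{2rC_{p(d)}}, \qquad q^* = \frac{\Delta V}{r(1 + p^*)C_{p(d)}}$$
Using the replicator Equations (20) and (21), the Jacobian matrix $J$ is given by
$$J = \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} = \begin{bmatrix} (1 - 2p)\left[\Delta V - qr(1 + p)C_{p(d)}\right] - p(1 - p)qrC_{p(d)} & -p(1 - p)r(1 + p)C_{p(d)} \\ q(1 - q)\left[r(1 + 2p)C_{p(d)} + \Delta I\right] & (1 - 2q)\left[p\left(r(1 + p)C_{p(d)} + \Delta I\right) - C_e\right] \end{bmatrix} \tag{22}$$
Then $\det J$ and $\operatorname{tr} J$ can be given by
$$\det J = J_{11}J_{22} - J_{12}J_{21} \tag{23}$$
$$\operatorname{tr} J = J_{11} + J_{22} \tag{24}$$
Under DPS, we discuss all possible conditional constraints for the equilibrium stability analysis. Table 6 summarizes the determinants and traces of the Jacobian matrix $J$ for the five potential ESSs. The local stability of the equilibria in three situations is shown in Tables 3-5.
Situation 1: if $\Delta V > 2rC_{p(d)}$ and $2rC_{p(d)} + \Delta I > C_e$, $E_4 = (1, 1)$ is an ESS, which corresponds to a pure strategy in which cyclists commit traffic violations and law enforcement enforces traffic rules.
Situation 2: if $\Delta V > 0$ and $2rC_{p(d)} + \Delta I < C_e$, $E_3 = (1, 0)$ is an ESS, which corresponds to a pure strategy in which cyclists commit traffic violations and law enforcement does not enforce traffic rules.
Situation 3: if $0 < \Delta V < 2rC_{p(d)}$ and $2rC_{p(d)} + \Delta I > C_e$, $E_5 = (p^*, q^*)$ is an ESS, which corresponds to a mixed strategy in which cyclists' strategy probability converges to $p^*$ and law enforcement's strategy probability converges to $q^*$.
The other situations (i.e., $\Delta V = 2rC_{p(d)}$ or $2rC_{p(d)} + \Delta I = C_e$) are less likely to occur [49].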
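To see why the mixed-strategy point becomes stable once the penalty depends on $p$, the interior rest point and its Jacobian can be evaluated numerically. The sketch assumes $k = 1 + p$ and the replicator form implied by the Table 1 payoffs. The function names and parameter values are hypothetical.

```python
import math


def interior_dps(dV, r, Cd, Ce, dI):
    """Interior rest point for k = 1 + p.

    p* is the positive root of r*Cd*p**2 + (r*Cd + dI)*p - Ce = 0,
    and q* = dV / (r * (1 + p*) * Cd).
    """
    a = r * Cd
    b = r * Cd + dI
    p_star = (-b + math.sqrt(b * b + 4 * a * Ce)) / (2 * a)
    q_star = dV / (r * (1 + p_star) * Cd)
    return p_star, q_star


def jacobian_dps(p, q, dV, r, Cd, Ce, dI):
    """Jacobian entries of the assumed DPS replicator system."""
    j11 = (1 - 2 * p) * (dV - q * r * (1 + p) * Cd) - p * (1 - p) * q * r * Cd
    j12 = -p * (1 - p) * r * (1 + p) * Cd
    j21 = q * (1 - q) * (r * (1 + 2 * p) * Cd + dI)
    j22 = (1 - 2 * q) * (p * (r * (1 + p) * Cd + dI) - Ce)
    return j11, j12, j21, j22


if __name__ == "__main__":
    # Hypothetical Situation 3 parameters.
    prm = dict(dV=25.0, r=1.0, Cd=20.0, Ce=15.0, dI=10.0)
    p_s, q_s = interior_dps(**prm)
    j11, j12, j21, j22 = jacobian_dps(p_s, q_s, **prm)
    print(f"p*={p_s:.4f} q*={q_s:.4f}")
    print(f"det J={j11 * j22 - j12 * j21:.4f} tr J={j11 + j22:.4f}")
```

The extra term $-p(1-p)qrC_{p(d)}$ in the first diagonal entry, which comes from differentiating $k = 1 + p$, makes the trace strictly negative at the interior point. Under a fixed penalty that term is absent and the trace is zero.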
To evaluate the performance of the proposed framework, numerical experiments are conducted. In addition, the impacts of the penalty amount, enforcement effectiveness, perceived relative benefit, enforcement cost, relative public image cost, and cyclists' attitude towards risk (risk attitude coefficients, loss aversion coefficient, and decision weight) on the interactions between cyclists and law enforcement are discussed. The Runge-Kutta algorithm [50,51] is used to solve the replicator equations of the proposed framework numerically.
MATLAB R2014b is used to conduct the numerical experiments. The values of each factor in the proposed models are shown in Table 7. In this study, t represents the smallest time unit (one day or a few days) within the study period. We assume that cyclists and law enforcement evaluate their behavior at the end of each time unit and determine their action for the next time unit; their behavior does not change within a time unit. For simplicity, we define a time unit as one day. The results of the numerical experiments are presented in Sections 3 and 4.
Table 6. Determinants and traces of the Jacobian matrix J for five potential ESSs under DPS.
Table 7. Values of each factor in the proposed models in Situations 1, 2, and 3.
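A numerical experiment of this kind can be sketched with a classical fourth-order Runge-Kutta integrator. This is only an illustration in Python, not the MATLAB code used in the study. The step size, time horizon, initial point, and parameter values are all assumptions, and the right-hand side is the replicator form implied by the Table 1 payoffs, with $k = 1 + p$ under DPS.

```python
def rhs(p, q, dV, r, C, Ce, dI, dynamic):
    """Replicator right-hand side; the penalty is C under FPS and
    (1 + p) * C under DPS (assumed coefficient k = 1 + p)."""
    k = 1.0 + p if dynamic else 1.0
    Cp = k * C
    dp = p * (1 - p) * (dV - q * r * Cp)
    dq = q * (1 - q) * (p * (r * Cp + dI) - Ce)
    return dp, dq


def simulate(dynamic, C, p0=0.5, q0=0.5, t_end=30.0, dt=0.005,
             dV=25.0, r=1.0, Ce=15.0, dI=10.0):
    """Integrate with classical RK4 and return the list of (p, q) states."""
    prm = dict(dV=dV, r=r, C=C, Ce=Ce, dI=dI, dynamic=dynamic)
    p, q = p0, q0
    path = [(p, q)]
    for _ in range(int(round(t_end / dt))):
        k1p, k1q = rhs(p, q, **prm)
        k2p, k2q = rhs(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q, **prm)
        k3p, k3q = rhs(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q, **prm)
        k4p, k4q = rhs(p + dt * k3p, q + dt * k3q, **prm)
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        path.append((p, q))
    return path


if __name__ == "__main__":
    # Hypothetical Situation 3 settings for both strategies.
    fps = simulate(dynamic=False, C=40.0)
    dps = simulate(dynamic=True, C=20.0)
    tail = [p for p, _ in fps[-2000:]]
    print("FPS late range of p:", round(max(tail) - min(tail), 3))
    print("DPS final state:", tuple(round(x, 3) for x in dps[-1]))
```

With these assumed settings the FPS trajectory should keep oscillating late in the run, while the DPS trajectory should settle at the mixed-strategy point.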

Numerical Experiments and Discussion in Situations 1 and 2
In Situations 1 and 2 under FPS or DPS, the strategy choices of both cyclists and law enforcement are pure strategies. The simulation results are shown in Appendix A (Figures A1-A8). As shown in Appendix A (Figures A1, A3, A5 and A7), if the perceived relative benefit of committing traffic violations is larger than the traffic violation penalty, and the enforcement cost is smaller than the enforcement benefit (the sum of the traffic violation penalty and the relative public image cost), the strategy probabilities of both parties converge to (1, 1). This means that cyclists will commit traffic violations and law enforcement will enforce traffic rules. If the perceived relative benefit of committing traffic violations is larger than zero and the enforcement cost is larger than the enforcement benefit, the strategy probabilities of both parties converge to (1, 0) (Appendix A, Figures A2, A4, A6 and A8). This means that cyclists will commit traffic violations while law enforcement will not enforce traffic rules. Table 8 summarizes the changes in converging speed (i.e., the time it takes to reach the equilibrium solution) of cyclists' probability of committing traffic violations and law enforcement's probability of enforcing traffic rules in Situations 1 and 2. The benefit of committing traffic violations for cyclists increases when the penalty amount or the relative public image cost decreases, or when the perceived relative benefit or the enforcement cost increases, and vice versa. The benefit of enforcing traffic rules for law enforcement increases when the penalty amount, the relative public image cost, or the perceived relative benefit increases, or when the enforcement cost decreases, and vice versa.
These results suggest that (i) in Situations 1 and 2, cyclists need less time (i.e., a faster converging speed) to settle on committing traffic violations when the benefit of doing so increases, and vice versa; and (ii) in Situation 1, law enforcement needs less time to settle on enforcing traffic rules when the benefit of enforcement increases, and vice versa, while in Situation 2, law enforcement needs more time to settle on not enforcing traffic rules when the benefit of enforcement increases, and vice versa. These results show that an increasing benefit of committing traffic violations can incentivize cyclists to commit them. For law enforcement, when the increasing benefit of enforcing traffic rules can cover the cost of enforcement, it can incentivize law enforcement to enforce them. When the increasing benefit is not sufficient to cover the cost of enforcement, law enforcement would choose not to enforce them. In Situation 1, the converging speed for cyclists is much faster under DPS than under FPS, while the converging speed for law enforcement is relatively slower under DPS than under FPS. In Situation 2, the converging speed for both cyclists and law enforcement is faster under DPS than under FPS. Overall, the differences between the converging speeds of FPS and DPS in Situations 1 and 2 are small (i.e., less than 10% in all cases). For example, as shown in Appendix A (Figure A1), it takes about 73 days for the probability of committing traffic violations to reach 1 under FPS (when $C_{p(f)} = 40$), while it takes around 67 days under DPS (when $C_{p(d)} = 20$) in Situation 1 (Table 8). The main reason for such differences is that the penalty amount under DPS increases gradually, whereas the penalty under FPS is fixed. This means that cyclists benefit more from committing traffic violations and law enforcement benefits less from enforcing traffic rules under DPS than under FPS. That is why it takes less time for cyclists to reach a final decision (Situations 1 and 2), and why it takes more time for law enforcement to reach a final decision in Situation 1 but less time in Situation 2.
To sum up, neither FPS nor DPS can effectively reduce cyclists' traffic violations as long as the traffic violation penalty is lower than the perceived relative benefit or the enforcement cost is higher than the benefit of enforcing traffic rules.

Numerical Experiments and Discussion in Situation 3
In Situation 3, the perceived relative benefit of committing traffic violations is less than the traffic violation penalty, and the enforcement cost is smaller than the enforcement benefit. Under FPS or DPS, the strategy choices of both cyclists and law enforcement are mixed strategies. The first-order partial derivatives of $p^*$, $q^*$, and $\Delta V$ were derived to illustrate the impacts of various factors on the probabilities of committing traffic violations and enforcing traffic rules, and the numerical experiment results shown in Figures 1-10 further illustrate these impacts. As shown in Figures 1a, 5a, 9a and 10a, the strategy choice path fluctuates under FPS. This can result in uncertainty about the expected outcome of the enforcement. As shown in Figures 1b, 2, 4, 5b, 6, 7, 8, 9b and 10b, a stable expected outcome of the enforcement can be achieved under DPS. Table 9 summarizes the changes in the probabilities and stabilities of committing traffic violations and enforcing traffic rules in Situation 3.

Analyzing Factors Affecting Cyclists and Law Enforcement Behavior under DPS
We solve the first-order partial derivatives of $p^*$ and $q^*$ with respect to $C_{p(d)}$, $r$, $\Delta V$, $C_e$, and $\Delta I$, as well as the first-order partial derivatives of $\Delta V$ with respect to $\alpha/\beta$, $\lambda$, and $\omega(p_1)$. The results are shown in Appendix B. From Equations (A1) to (A9), when the traffic violation penalty is larger than the perceived relative benefit and the enforcement cost is smaller than the enforcement benefit, the probability of committing traffic violations decreases if $C_{p(d)}$, $r$, or $\Delta I$ increases or $C_e$ decreases, and the probability of enforcing traffic rules decreases if $C_{p(d)}$, $r$, or $C_e$ increases or $\Delta V$ or $\Delta I$ decreases. From Equations (A10) to (A12), when the penalty amount exceeds the benefit cyclists receive from each violation, the perceived relative benefit decreases if $\lambda$ increases or $\alpha/\beta$ and $\omega(p_1)$ decrease, which in turn reduces the probability of enforcing traffic rules.
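The signs of these partial derivatives can be checked numerically without the closed-form expressions in Appendix B. The sketch below estimates them by central finite differences at the DPS mixed-strategy point, assuming $k = 1 + p$ and the replicator form implied by the Table 1 payoffs. The function names and the baseline parameter values are hypothetical.

```python
import math


def equilibrium_dps(dV, r, Cd, Ce, dI):
    """Mixed-strategy rest point under the assumed k = 1 + p."""
    a = r * Cd
    b = r * Cd + dI
    p = (-b + math.sqrt(b * b + 4 * a * Ce)) / (2 * a)
    q = dV / (r * (1 + p) * Cd)
    return p, q


def sensitivity(name, base, h=1e-6):
    """Central finite-difference estimate of (dp*/dx, dq*/dx) for the
    parameter called `name`, holding the other parameters at `base`."""
    up = dict(base)
    up[name] += h
    dn = dict(base)
    dn[name] -= h
    p_up, q_up = equilibrium_dps(**up)
    p_dn, q_dn = equilibrium_dps(**dn)
    return (p_up - p_dn) / (2 * h), (q_up - q_dn) / (2 * h)


if __name__ == "__main__":
    # Hypothetical baseline values.
    base = dict(dV=15.0, r=0.5, Cd=25.0, Ce=15.0, dI=10.0)
    for name in ("Cd", "r", "dV", "Ce", "dI"):
        dp, dq = sensitivity(name, base)
        print(f"{name:>2}: dp*/dx = {dp:+.4f}, dq*/dx = {dq:+.4f}")
```

Because $p^*$ does not depend on $\Delta V$ in this form, changes in the perceived relative benefit move only the enforcement probability.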

The Effects of Traffic Violation Penalty
The total traffic violation penalty received depends on the individual penalty amount and the enforcement effectiveness under both FPS and DPS. Under DPS, it is also affected by the form of the dynamic penalty coefficient function. Under FPS, the simulation results show that the strategy probabilities of the two parties fluctuate periodically around (0.38, 0.83), (0.30, 0.63), and (0.25, 0.50) when three levels of individual penalty amount are introduced (Table 9 and Figure 1a). This means that as the penalty amount increases, the centers of fluctuation of the two parties' strategy probabilities gradually decrease, but the probabilities of committing traffic violations and enforcing traffic rules both remain unstable. Under DPS, the strategy probabilities of the two parties converge to the ESSs (i.e., (0.67, 1), (0.40, 0.90), and (0.34, 0.74)) (Table 9 and Figure 1b). This means that as the penalty amount increases, the probabilities of committing traffic violations and enforcing traffic rules are both significantly reduced, and the strategy probabilities reach an equilibrium solution. These results show that increasing the penalty amount and using DPS can reduce cyclists' probability of committing traffic violations and achieve a stable enforcement outcome, as long as the traffic violation penalty is higher than the perceived relative benefit and the enforcement cost is smaller than the enforcement benefit. Such results are consistent with most previous studies. Kim and Kim [36] concluded that increasing the penalty amount can reduce the probability of speeding among drivers. Wong et al. [32] suggested that increasing the penalty (e.g., a higher fine and a demerit point system) for running a red light can effectively reduce such behavior among public light bus drivers. Paola et al. [33] suggested, based on empirical data from Italy, that a demerit point system can be more effective than a money-only system in reducing traffic violations among drivers.
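The effect of the penalty level on the fluctuation centers (FPS) and on the equilibria (DPS) can be illustrated with a short calculation. The sketch assumes the replicator form implied by the Table 1 payoffs and $k = 1 + p$ under DPS. The parameter values are assumptions chosen for illustration and are not taken from Table 7; under them the computed points should come out close to the values quoted above. The treatment of the case $q^* > 1$ as a boundary equilibrium at $q = 1$ is likewise an assumption of this sketch.

```python
import math


def fps_center(dV, r, Cp, Ce, dI):
    """Center of the FPS oscillation: the mixed-strategy rest point."""
    return Ce / (r * Cp + dI), dV / (r * Cp)


def dps_equilibrium(dV, r, Cd, Ce, dI):
    """DPS rest point for k = 1 + p, with enforcement capped at q = 1."""
    a = r * Cd
    b = r * Cd + dI
    p = (-b + math.sqrt(b * b + 4 * a * Ce)) / (2 * a)
    q = dV / (r * (1 + p) * Cd)
    if q <= 1.0:
        return p, q
    # Interior point infeasible: enforcement saturates at q = 1 and
    # cyclists settle where dV = r * (1 + p) * Cd, capped at p = 1.
    return min(dV / (r * Cd) - 1.0, 1.0), 1.0


if __name__ == "__main__":
    # Assumed values, for illustration only.
    common = dict(dV=25.0, r=1.0, Ce=15.0, dI=10.0)
    for Cp in (30.0, 40.0, 50.0):
        p, q = fps_center(Cp=Cp, **common)
        print(f"FPS Cp={Cp:.0f}: center ({p:.2f}, {q:.2f})")
    for Cd in (15.0, 20.0, 25.0):
        p, q = dps_equilibrium(Cd=Cd, **common)
        print(f"DPS Cd={Cd:.0f}: equilibrium ({p:.2f}, {q:.2f})")
```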
These results differ from those of a related study by Bjørnskau and Elvik [35], who concluded that increasing the penalty amount had no effect on the probability of speeding among drivers. A possible reason for this difference is that Bjørnskau and Elvik [35] did not consider penalties issued as a payoff for law enforcement.
To sum up, DPS can be more effective when the traffic violation penalty is more than the perceived relative benefit and the enforcement cost is less than the enforcement benefit. In addition, various measures that can decrease cyclists' perceived relative benefit (e.g., school-based education, advertisements, and training programs related to traffic violations) can be effective in achieving the same level of probability of committing traffic violations with a lower probability of enforcing traffic rules. These measures should be designed to influence cyclists' (i) risk attitudes towards committing traffic violations, (ii) sensitivity to the possible loss when punished for committing traffic violations, and (iii) perceived likelihood of being punished.
These results show that as the enforcement cost decreases, the probability of committing traffic violations significantly reduces while the probability of enforcing traffic rules gradually increases. Possible measures for reducing the enforcement cost include adopting more cost-effective technologies and improving system operation. In addition, when the enforcement cost is less than the enforcement benefit, DPS can be more effective than FPS in achieving a stable reduction in the probability of committing traffic violations as the enforcement cost falls.

The Effects of Relative Public Image Cost
Under FPS, the simulation results show that the strategy probabilities of the two parties fluctuate. The results indicate that as the relative public image cost increases, the probability of committing traffic violations significantly reduces, while the probability of enforcing traffic rules gradually increases. These results show that increasing the relative public image cost (as long as it exceeds the enforcement cost minus the traffic violation penalty) and using DPS can reduce the probability of committing traffic violations with a slightly increasing probability of enforcing traffic rules. When the relative public image cost increases, the probability of committing traffic violations can be effectively reduced because law enforcement has more incentive to enforce traffic rules. The relative public image cost is calculated as the cost of the negative public image from not punishing violations plus the benefit of the positive public image from enforcing traffic rules. Thereby, policymakers should leverage media and education campaigns to influence public opinion.
Additional simulation results show that as the enforcement effectiveness (r) increases (from 0.5 to 1), law enforcement can effectively reduce the probability of traffic violations (from 52% to 34%) with fewer intersections/segments to monitor (from 79% to 45%) in the equilibrium solution under DPS (Table 9 and Figure 2). These results show the importance of enforcement effectiveness in reducing traffic violations. Motor vehicle safety [52] reported that several measures that improve enforcement effectiveness (e.g., automated red-light enforcement and automated speed-camera enforcement) can reduce drivers' traffic violations (e.g., red-light running and speeding). Additional studies are needed to evaluate the costs and benefits of reducing traffic violations by improving enforcement effectiveness.
To evaluate how different forms of DPS affect the equilibrium solution, additional simulations are conducted by introducing three types of dynamic penalty coefficient functions (i.e., $k = 1 + p$, $k = 1 + p^2$, and $k = 1 + 2p - p^2$). Figure 3 shows the relationship between the probability of committing traffic violations and the dynamic penalty coefficient. As shown in Figure 4, the strategy probabilities of the two parties converge to the ESSs (i.e., (0.34, 0.45), (0.39, 0.52), and (0.31, 0.39)). These results illustrate that when the dynamic penalty coefficient in DPS follows $k = 1 + 2p - p^2$, the probabilities of committing traffic violations and enforcing traffic rules are both the lowest. These results suggest that the effectiveness of DPS depends on the growth rate of the dynamic penalty coefficient, and that a decreasing growth rate can yield a more effective result. This means that using a stricter penalty at the beginning and then gradually relaxing it might achieve a better enforcement outcome.
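The comparison between coefficient functions can be reproduced in outline by solving for the mixed-strategy point under each form of $k(p)$. The sketch assumes the replicator form implied by the Table 1 payoffs, so that $p^*$ solves $p\,(r\,k(p)\,C_{p(d)} + \Delta I) = C_e$ and $q^* = \Delta V / (r\,k(p^*)\,C_{p(d)})$. The bisection helper and the parameter values are assumptions for illustration and are not taken from Table 7.

```python
def equilibrium_for(k, dV, r, Cd, Ce, dI, tol=1e-12):
    """Mixed-strategy rest point for a penalty coefficient function k(p).

    p* is the root of g(p) = p * (r * k(p) * Cd + dI) - Ce on (0, 1),
    found by bisection; q* = dV / (r * k(p*) * Cd).
    """
    def g(p):
        return p * (r * k(p) * Cd + dI) - Ce

    lo, hi = 0.0, 1.0
    if g(lo) >= 0 or g(hi) <= 0:
        raise ValueError("no sign change on (0, 1) for these parameters")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    p = 0.5 * (lo + hi)
    return p, dV / (r * k(p) * Cd)


FORMS = {
    "k = 1 + p": lambda p: 1 + p,
    "k = 1 + p^2": lambda p: 1 + p ** 2,
    "k = 1 + 2p - p^2": lambda p: 1 + 2 * p - p ** 2,
}

if __name__ == "__main__":
    # Assumed values, for illustration only.
    prm = dict(dV=15.0, r=1.0, Cd=25.0, Ce=15.0, dI=10.0)
    for label, k in FORMS.items():
        p, q = equilibrium_for(k, **prm)
        print(f"{label:<18} p*={p:.2f} q*={q:.2f}")
```

Under these assumed values the concave form $k = 1 + 2p - p^2$ should give the lowest $p^*$ and $q^*$ of the three, because its coefficient is the largest at every interior $p$.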
To sum up, DPS can be more effective when the traffic violation penalty is more than the perceived relative benefit and the enforcement cost is less than the enforcement benefit. In addition, increasing penalties (e.g., higher fines and a demerit point system), improving enforcement effectiveness (e.g., automatic detection and response systems for speeding, red-light running, and retrograding violations), and adopting a DPS with a decreasing growth rate in the dynamic penalty coefficient rather than an FPS can be more effective in reducing traffic violations.

The Effects of Perceived Relative Benefit
The total amount of perceived relative benefit depends on risk attitude coefficients, loss aversion coefficient, and decision weight. Under FPS, the simulation results show that the strategy probabilities of two parties fluctuate periodically (i.e., unstable) around (0.43, 0.20), (0.43, 0.40), and (0.43, 0.60) with three levels of perceived relative benefit (Table 9 and Figure 5a). It means that as the perceived relative benefit decreases, the centers of the fluctuation of cyclists' strategy probabilities remain unchanged and those of law enforcement's strategy probabilities significantly decrease. Under DPS, the strategy probabilities of two parties converge to the ESSs (i.e., (0.52, 0.27), (0.52, 0.53), and (0.52, 0.79)) ( Table 9 and Figure 5b). It means that the probabilities of committing traffic violations remain unchanged, while the probabilities of enforcing traffic rules are significantly reduced as the perceived relative benefit decreases, and their strategy probabilities will reach an equilibrium solution. These results show that decreasing perceived relative benefit within a certain range (less than the traffic violation penalty) and using DPS can gain a stable result in terms of reducing cyclists' probability of committing traffic violations compared to using FPS.
The perceived relative benefit depends on several attitude-related factors, such as risk attitude coefficients, loss aversion coefficient, and decision weight. These factors are often acquired by analyzing stated preference surveys and field observations [53,54]. In addition, we assume ω(p₁) + ω(p₂) = 0.9, consistent with the property that the decision weights of complementary events sum to less than 1 [43].
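The property that decision weights of complementary events sum to less than 1 can be checked with a short sketch. It uses the one-parameter probability weighting function of Tversky and Kahneman (1992) with their estimate γ = 0.61 for gains. Both the functional form and the parameter value are assumptions made here for illustration; the study sets ω(p₁) + ω(p₂) = 0.9 directly and may not use this function. The name `tk_weight` is hypothetical.

```python
def tk_weight(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting function.

    gamma = 0.61 is their reported estimate for gains; the functional form
    and the value are assumptions here, not taken from this study.
    """
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


# p1: probability of not being punished; 1 - p1: probability of being punished.
for p1 in (0.1, 0.3, 0.5, 0.7, 0.9):
    total = tk_weight(p1) + tk_weight(1 - p1)
    print(f"p1 = {p1:.1f}: w(p1) + w(1 - p1) = {total:.3f}")
```

For every p₁ in this list the two weights sum to less than 1, so a fixed total of 0.9 is of the same order as what this assumed weighting function produces.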
Under DPS, the simulation results show that the strategy probabilities of the two parties converge to the ESSs (i.e., (0.30, 0.07), (0.30, 0.23), and (0.30, 0.50)) when three levels of risk attitude coefficients are introduced (Table 9 and Figure 6). These results show that as the risk attitude coefficient decreases, both cyclists and law enforcement take much longer to reach the equilibrium solution; the equilibrium probability of committing traffic violations remains the same, whereas the equilibrium probability of enforcing traffic rules decreases significantly. These results suggest that the higher the risk cyclists perceive in committing traffic violations, the lower the probability of enforcing traffic rules needed to maintain the same probability of committing traffic violations, and the longer it takes to reach the equilibrium solution.
Under DPS, the simulation results show that the strategy probabilities of the two parties converge to the ESSs (i.e., (0.31, 0.50), (0.31, 0.48), and (0.31, 0.46)) with three levels of loss aversion coefficient (Table 9 and Figure 7). As the loss aversion coefficient increases, the probability of committing traffic violations remains the same, while the probability of enforcing traffic rules decreases slightly. These findings indicate that if cyclists are more sensitive to the possible losses from being punished for committing traffic violations, law enforcement can hold the probability of committing traffic violations at the same level with a lower probability of enforcing traffic rules.
Under DPS, the simulation results show that the strategy probabilities of the two parties converge to the ESSs (i.e., (0.31, 0.19), (0.31, 0.34), and (0.31, 0.50)) given three levels of the decision weight of p₁ (the probability of not getting punished for traffic violations) (Table 9 and Figure 8). This means that as the decision weight of p₁ increases, it takes less time for cyclists and law enforcement to reach the equilibrium solution; the equilibrium probability of committing traffic violations remains the same, but the equilibrium probability of enforcing traffic rules increases significantly. The results suggest that when cyclists underestimate the likelihood of being punished for traffic violations, law enforcement should adopt a higher probability of enforcing traffic rules to achieve the same probability of committing traffic violations.
To sum up, DPS can be more effective when the traffic violation penalty is more than the perceived relative benefit and the enforcement cost is less than the enforcement benefit. In addition, various measures that can decrease cyclists' perceived relative benefit (e.g., school-based education, advertisements, and training programs related to traffic violations) can be effective in achieving the same probability of committing traffic violations with a lower probability of enforcing traffic rules. These measures should be designed to influence cyclists' (i) risk attitudes towards committing traffic violations, (ii) sensitivity to the possible loss when punished for committing traffic violations, and (iii) perceived likelihood of being punished.

The Effects of Enforcement Cost
Under FPS, the simulation results show that the strategy probabilities of the two parties fluctuate periodically around (0.43, 0.60), (0.71, 0.60), and (0.86, 0.60) under three levels of enforcement cost (Table 9 and Figure 9a). This means that as the enforcement cost decreases, the centers of the fluctuation of cyclists' probability of committing traffic violations decrease significantly, while those of law enforcement's probability of enforcing traffic rules remain the same. Under DPS, the strategy probabilities of the two parties converge to the ESSs (i.e., (0.52, 0.79), (0.78, 0.67), and (0.89, 0.62)) (Table 9 and Figure 9b).
These results show that as the enforcement cost decreases, the probability of committing traffic violations decreases significantly while the probability of enforcing traffic rules gradually increases. Possible measures for reducing enforcement cost include adopting more cost-effective technologies and improving system operation. In addition, provided that the enforcement cost is less than the enforcement benefit, DPS can be more effective than FPS in achieving a stable reduction in the probability of committing traffic violations as the enforcement cost decreases.
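Why only the cyclists' coordinate of the FPS fluctuation center moves with the enforcement cost can be seen from a short worked derivation. It rests on an assumed simplified replicator model, not on the payoff matrix of Table 1. The symbol C_p for the fixed penalty and the values r C_p = 5, ΔI = 2, and ΔV = 3 are illustrative assumptions, as are the three cost levels.

```latex
% Assumed simplified replicator dynamics under FPS (not the study's Table 1):
% x = probability of violating, y = probability of enforcing.
\begin{align*}
\dot{x} &= x(1-x)\,\bigl(\Delta V - y\,r C_p\bigr), \\
\dot{y} &= y(1-y)\,\bigl(x\,(r C_p + \Delta I) - C_e\bigr).
\end{align*}
% Setting both brackets to zero gives the interior rest point:
\[
x^{*} = \frac{C_e}{r C_p + \Delta I}, \qquad y^{*} = \frac{\Delta V}{r C_p}.
\]
% Illustrative values r C_p = 5, \Delta I = 2, \Delta V = 3:
\[
C_e = 3,\,5,\,6 \;\Longrightarrow\;
x^{*} = \tfrac{3}{7},\,\tfrac{5}{7},\,\tfrac{6}{7} \approx 0.43,\,0.71,\,0.86,
\qquad y^{*} = \tfrac{3}{5} = 0.60 .
\]
```

In this sketch the enforcement cost enters only the condition that makes law enforcement indifferent, which pins down x*, while y* is fixed by the cyclists' indifference condition. The requirement x* < 1 corresponds to the enforcement cost being less than the enforcement benefit, and y* < 1 to the perceived relative benefit being less than the penalty.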

The Effects of Relative Public Image Cost
Under FPS, the simulation results show that the strategy probabilities of the two parties fluctuate periodically around (0.43, 0.60), (0.38, 0.60), and (0.30, 0.60) given three levels of the relative public image cost (Table 9 and Figure 10a). As the relative public image cost increases, the centers of the fluctuation of cyclists' probability of committing traffic violations gradually decrease, and those of law enforcement's probability of enforcing traffic rules remain the same. Under DPS, the strategy probabilities of the two parties converge to the ESSs (i.e., (0.52, 0.79), (0.45, 0.83), and (0.36, 0.88)) (Table 9 and Figure 10b). This means that as the relative public image cost increases, the probability of committing traffic violations decreases significantly, while the probability of enforcing traffic rules gradually increases. These results show that, as long as the relative public image cost exceeds the enforcement cost minus the traffic violation penalty, increasing it and using DPS can reduce the probability of committing traffic violations with only a slight increase in the probability of enforcing traffic rules, because a higher relative public image cost gives law enforcement more incentive to enforce traffic rules. The relative public image cost is calculated as the cost of the negative public image from not punishing violations plus the benefit of the positive public image from enforcing traffic rules. Therefore, policymakers should leverage media and education campaigns to influence public opinion and foster a social environment in which the public image cost of not enforcing traffic rules plays an important role in designing penalty strategies.
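The effect of the relative public image cost on the DPS equilibrium can be sketched in closed form. The sketch assumes a simplified replicator model in which the dynamic penalty is rC·(1 + x), so that law enforcement's indifference condition becomes the quadratic rC·x² + (rC + dI)·x − Ce = 0. This is not the payoff matrix of Table 1; the function name `dps_rest_point`, the values dV = 3, rC = 2.5, and Ce = 3, and the three levels dI = 2, 3, 5 are illustrative assumptions.

```python
import math


def dps_rest_point(dI, dV=3.0, rC=2.5, Ce=3.0):
    """Interior rest point under an assumed DPS with penalty rC * (1 + x).

    Enforcement indifference gives rC*x**2 + (rC + dI)*x - Ce = 0; its
    positive root is x, and cyclist indifference then gives y.
    """
    disc = (rC + dI) ** 2 + 4 * rC * Ce
    x = (-(rC + dI) + math.sqrt(disc)) / (2 * rC)
    y = dV / (rC * (1 + x))
    return x, y


for dI in (2.0, 3.0, 5.0):
    x, y = dps_rest_point(dI)
    print(f"dI = {dI:.0f}: x = {x:.2f}, y = {y:.2f}")
```

With these assumed levels the rest points come out near (0.52, 0.79), (0.45, 0.83), and (0.36, 0.88): a larger dI lowers the violation probability at which enforcement breaks even, and because the dynamic penalty is then smaller, a slightly higher enforcement probability is needed to keep cyclists indifferent.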

Conclusions
In this paper, we developed an evolutionary game theory framework to understand the interaction between cyclists' traffic violations and enforcement strategies. To evaluate the proposed framework, numerical experiments were conducted to analyze the evolutionary equilibrium stability under two law enforcement penalty strategies (FPS and DPS). Based on the cost differences among the penalty amount, enforcement effectiveness, perceived relative benefit, enforcement cost, and relative public image cost, three potential situations were studied. When the perceived relative benefit is larger than the traffic violation penalty or the enforcement cost is larger than the enforcement benefit (Situations 1 and 2), the equilibrium state is very similar under FPS and DPS, and neither strategy is able to reduce the probability of committing traffic violations. When the perceived relative benefit of committing traffic violations is less than the traffic violation penalty and the enforcement cost is smaller than the enforcement benefit (Situation 3), the equilibrium state (i.e., a stable expected outcome of the enforcement) can only be achieved under DPS. The numerical experiments also show that the penalty amount, enforcement effectiveness, perceived relative benefit, enforcement cost, relative public image cost, and cyclists' attitude towards risk (risk attitude coefficients, loss aversion coefficient, and decision weight) have significant impacts on both parties' choice of strategy.
These findings have several policy implications that can help to reduce traffic violations among cyclists, particularly when law enforcement resources are limited. First, cyclists are more likely to commit traffic violations if their perceived relative benefit of traffic violations is larger than the traffic violation penalty, or if the enforcement cost is larger than the enforcement benefit. It is important for law enforcement not only to introduce stiffer penalties (e.g., higher fines and a demerit point system) for traffic violations but also to improve enforcement effectiveness (e.g., automatic speeding/red-light running/retrograding violation detection and response systems) to increase the traffic violation penalty. Second, DPS can achieve a more stable reduction in the probability of committing traffic violations compared to FPS. Law enforcement may consider leveraging mobile technologies (e.g., smartphone apps) to update the penalties quickly with limited infrastructural investment and to keep cyclists informed of the penalty changes. Third, it is important to develop educational programs and media campaigns to reduce traffic violations by influencing cyclists' risk attitude towards traffic violations, sensitivity to the possible loss when punished for traffic violations, and estimated likelihood of being punished, as well as the cost to the public image of not enforcing traffic rules. Last but not least, adopting more cost-effective technologies and improving system operation can potentially reduce the cost of enforcing traffic rules, which leads to fewer traffic violations.
This study has a few limitations that can be addressed through future studies. First, some of the assumptions related to the payoffs to cyclists and law enforcement can be relaxed. Additional studies are needed to consider the impacts of the potential cost of committing traffic violations (such as increased safety risk) and the potential benefits of not committing traffic violations (such as reduced safety risk). Second, cyclists of regular bikes and e-bikes are considered similar and are represented as one agent. However, regular bike cyclists and e-bike cyclists are very different in their sociodemographic and behavioral characteristics [19,21,24,28]. In many countries, these two types of cyclists are managed differently and some e-bikes are classified as motor vehicles [55]. A potential future direction is to study the differences between regular bike cyclists' and e-bike cyclists' interactions with law enforcement. Third, additional studies are needed to calibrate the proposed model using real-world data. A four-phase study has been planned to address this issue. In Phase I, a self-reported survey will be conducted to study the potential influence of the socio-demographic variables (e.g., age, gender, and education background) of cyclists on traffic violations. In Phase II, an interactive bicycling simulator study will be conducted to evaluate cyclists' traffic violations and enforcement strategy. Detailed post-study interviews will also be conducted to identify possible additional factors affecting the interaction between cyclists' traffic violations and enforcement strategies. In Phase III, two arterials located at Nantong University, China (one for FPS and the other for DPS) will be used to evaluate real-world interactions between cyclists' traffic violations and enforcement strategies.
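In the final phase, we plan to collaborate with law enforcement agencies in Nantong City, China to implement the enforcement strategies designed on the basis of the aforementioned studies in the city and validate the study results. Lastly, considering the differences among road users, the proposed approach can be applied to investigate the interaction between drivers or pedestrians and law enforcement related to traffic violations.

$$
\begin{aligned}
&= \frac{\Delta V r \sqrt{r^{2} C_{p(d)}^{2} + r\left(2 C_{p(d)} \Delta I + 4 C_{p(d)} C_{e}\right) + \Delta I^{2}} + \Delta V r^{2} C_{p(d)} + \left(2 C_{e} + \Delta I\right) \Delta V r}{\left(r^{2} C_{p(d)}^{2} + 2 r C_{p(d)} C_{e} + \Delta I^{2}\right) \sqrt{r^{2} C_{p(d)}^{2} + r\left(2 C_{p(d)} \Delta I + 4 C_{p(d)} C_{e}\right) + \Delta I^{2}} + r^{3} C_{p(d)}^{3} + r^{2}\left(C_{p(d)}^{2} \Delta I + 4 C_{p(d)}^{2} C_{e}\right) - r\left(C_{p(d)} \Delta I^{2} + 4 C_{p(d)} C_{e} \Delta I\right) - \Delta I^{3}} < 0 \\[4pt]
&\left(r^{2} C_{p(d)}^{2} + 2 r C_{p(d)} C_{e} + \Delta I^{2}\right) \sqrt{r^{2} C_{p(d)}^{2} + r\left(2 C_{p(d)} \Delta I + 4 C_{p(d)} C_{e}\right) + \Delta I^{2}} + r^{3} C_{p(d)}^{3} + r^{2}\left(C_{p(d)}^{2} \Delta I + 4 C_{p(d)}^{2} C_{e}\right) - r\left(C_{p(d)} \Delta I^{2} + 4 C_{p(d)} C_{e} \Delta I\right) - \Delta I^{3} > 0
\end{aligned}
\tag{A9}
$$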