# On the Impact of the Rules on Autonomous Drive Learning


## Abstract


## 1. Introduction

## 2. Related Works

## 3. Model

#### 3.1. Road Graph

#### 3.2. Cars

#### 3.3. Drivers

#### 3.3.1. Observation

#### 3.3.2. Action

#### 3.4. Rules

#### 3.4.1. Intersection Rule

#### 3.4.2. Distance Rule

#### 3.4.3. Right Lane Rule

#### 3.5. Reward

#### 3.6. Policy Learning

## 4. Experiments

- (a) cars are kept with status $s=\mathrm{dead}$ in the road graph for ${t}_{\mathrm{dead}}$ time steps, and are then removed; or
- (b) cars are kept with status $s=\mathrm{dead}$ in the road graph for ${t}_{\mathrm{dead}}$ time steps, and then their status is changed back to $s=\mathrm{alive}$.
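The two variants above can be sketched as a single update step over the population of cars. This is a minimal illustration, not the authors' implementation: the `Car` class, the `dead_timer` field, and the `restore` flag are hypothetical names introduced here; only the behavior (remove after ${t}_{\mathrm{dead}}$ steps vs. restore to alive) follows the text.

```python
from dataclasses import dataclass

ALIVE, DEAD = "alive", "dead"

@dataclass
class Car:
    status: str = ALIVE
    dead_timer: int = 0  # time steps spent with status "dead"

def step_dead_cars(cars, t_dead, restore=False):
    """Advance the dead-status timers of collided cars.

    With restore=False (variant (a)), a car whose timer reaches t_dead
    is removed from the road graph; with restore=True (variant (b)),
    its status is changed back to alive instead.
    """
    survivors = []
    for car in cars:
        if car.status == DEAD:
            car.dead_timer += 1
            if car.dead_timer >= t_dead:
                if restore:
                    car.status, car.dead_timer = ALIVE, 0
                else:
                    continue  # variant (a): drop the car from the road graph
        survivors.append(car)
    return survivors
```

Calling this once per simulation time step keeps the two experimental settings identical except for the single `restore` switch.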

## 5. Results

#### Robustness to Traffic Level

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References


**Figure 4.** Training results with cars removed after ${t}_{\mathrm{dead}}$ time steps. We plot the training values of R, E, and C at each training episode, averaged over ${n}_{\mathrm{trial}}$ experiments. Solid lines show the mean of R, E, and C among the ${n}_{\mathrm{car}}$ vehicles; shaded areas show their standard deviation among the ${n}_{\mathrm{car}}$ vehicles.

**Figure 5.** Training results with cars restored after ${t}_{\mathrm{dead}}$ time steps. We plot the training values of R, E, and C at each training episode, averaged over ${n}_{\mathrm{trial}}$ experiments. Solid lines show the mean of R, E, and C among the ${n}_{\mathrm{car}}$ vehicles; shaded areas show their standard deviation among the ${n}_{\mathrm{car}}$ vehicles.

**Figure 6.** Overall number of collisions in the simulation against the overall traveled distance in the simulation, averaged across simulations with the same ${n}_{\mathrm{car}}$. Each dot is the sum of the values computed over the ${n}_{\mathrm{car}}$ vehicles.
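The per-episode aggregation described in the captions of Figures 4 and 5 (average over trials, then mean and standard deviation among vehicles) can be sketched as follows. The array layout `(n_trial, n_episode, n_car)` and the function name `aggregate` are assumptions made here for illustration.

```python
import numpy as np

def aggregate(metric):
    """metric: array of shape (n_trial, n_episode, n_car).

    Returns, per episode, the mean and standard deviation among the
    n_car vehicles of the trial-averaged values.
    """
    per_car = metric.mean(axis=0)  # average over trials -> (n_episode, n_car)
    return per_car.mean(axis=1), per_car.std(axis=1)

rng = np.random.default_rng(0)
R = rng.normal(size=(20, 500, 40))  # n_trial=20, 500 episodes, n_car=40
mean_R, std_R = aggregate(R)        # each of shape (500,)
```

The mean would then be drawn as the solid line and the standard deviation as the shaded band.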

Param | Meaning | Value
---|---|---
${l}_{\mathrm{car}}$ | Car length | 7
${t}_{\mathrm{coll}}$ | Impact duration | 10
${t}_{\mathrm{dead}}$ | Dead status duration | 20
${d}_{\mathrm{view}}$ | Driver’s view distance | 50
${v}_{\mathrm{max}}$ | Driver’s maximum speed | 50
${a}_{\mathrm{max}}$ | Driver’s acceleration (deceleration) | 2
$\Delta t$ | Time step duration | 0.2
$\left|S\right|$ | Number of road sections | 12
$\left|I\right|$ | Number of road intersections | 9
$w\left(p\right),p\in G$ | Number of lanes | $\in \{1,2\}$
$l\left(p\right),p\in S$ | Section length | 100
${n}_{\mathrm{car}}$ | Cars in the simulation | 40
$T$ | Simulation time steps | 500

Param | Meaning | Value
---|---|---
${n}_{\mathrm{trial}}$ | Number of trials | 20
${n}_{\mathrm{train}}$ | Training iterations | 500
${n}_{\mathrm{car}}$ | Cars in the simulation | 40
$\gamma$ | Discount factor | 0.999
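For reference, the simulation and learning parameters in the two tables could be collected into plain configuration dictionaries. The dictionary names and key spellings below are hypothetical; the values are those reported in the tables.

```python
# Simulation parameters (Table of Section 4); keys are illustrative names.
SIM_PARAMS = {
    "l_car": 7,           # car length
    "t_coll": 10,         # impact duration
    "t_dead": 20,         # dead status duration (time steps)
    "d_view": 50,         # driver's view distance
    "v_max": 50,          # driver's maximum speed
    "a_max": 2,           # driver's acceleration (deceleration)
    "dt": 0.2,            # time step duration
    "n_sections": 12,     # |S|, number of road sections
    "n_intersections": 9, # |I|, number of road intersections
    "section_length": 100,
    "n_car": 40,          # cars in the simulation
    "T": 500,             # simulation time steps
}

# Learning parameters.
LEARN_PARAMS = {
    "n_trial": 20,   # number of trials
    "n_train": 500,  # training iterations
    "n_car": 40,     # cars in the simulation
    "gamma": 0.999,  # discount factor
}
```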

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Talamini, J.; Bartoli, A.; De Lorenzo, A.; Medvet, E.
On the Impact of the Rules on Autonomous Drive Learning. *Appl. Sci.* **2020**, *10*, 2394.
https://doi.org/10.3390/app10072394
