# LiDAR-as-Camera for End-to-End Driving

## Abstract

## 1. Introduction

- We demonstrate that LiDAR images, as produced by Ouster OS1-128 LiDAR firmware, contain sufficient information for road-following on complex and narrow rural roads, hence validating their usefulness for self-driving.
- We compare LiDAR-image-based driving with camera-based driving and show it adds robustness to light and weather conditions in this task.
- We study the correlation between off-policy and on-policy performance metrics, which has not been studied before in a real car context.
- We collect and publish a real-world dataset of more than 500 km of driving data on challenging rural roads, with LiDAR and camera sensors and centimeter-level accurate GNSS trajectory. The dataset covers a diverse set of weather conditions, including snowy winter conditions.

## 2. Methods

#### 2.1. Behavioral Cloning

#### 2.2. Data Collection

#### 2.3. Data Preparation

#### 2.4. Architecture and Training Details

**Table 1.** Details of the architecture. Batch normalization always precedes the activation function. All convolutions are applied with no padding. The resulting output dimensions can be seen in Figure 3.

Layer | Hyperparameters | BatchNorm | Activation |
---|---|---|---|
Input | (264, 68, 3) | - | - |
Conv2d | filters = 24, size = 5, stride = 2 | BatchNorm2d | LeakyReLU |
Conv2d | filters = 24, size = 5, stride = 2 | BatchNorm2d | LeakyReLU |
Conv2d | filters = 36, size = 5, stride = 2 | BatchNorm2d | LeakyReLU |
Conv2d | filters = 48, size = 3, stride = 1 | BatchNorm2d | LeakyReLU |
Conv2d | filters = 64, size = 3, stride = 1 | BatchNorm2d | LeakyReLU |
Flatten | - | - | - |
Linear | nodes = 100 | BatchNorm1d | LeakyReLU |
Linear | nodes = 50 | BatchNorm1d | LeakyReLU |
Linear | nodes = 10 | none | LeakyReLU |
Linear | nodes = 1 | none | none |
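The layer sizes implied by Table 1 can be checked with a few lines of arithmetic. The sketch below is not the authors' code: it assumes the first two entries of the input tuple (264, 68, 3) are the spatial dimensions, and it uses the standard output-size formula for an unpadded convolution. The helper names `conv_out` and `trace_shapes` are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded convolution along one spatial axis."""
    return (size - kernel) // stride + 1


# (filters, kernel size, stride) for the five Conv2d rows of Table 1.
CONV_LAYERS = [(24, 5, 2), (24, 5, 2), (36, 5, 2), (48, 3, 1), (64, 3, 1)]


def trace_shapes(dim_a: int = 264, dim_b: int = 68, channels: int = 3):
    """Return the (dim_a, dim_b, channels) shape after each conv layer."""
    shapes = [(dim_a, dim_b, channels)]
    for filters, kernel, stride in CONV_LAYERS:
        dim_a = conv_out(dim_a, kernel, stride)
        dim_b = conv_out(dim_b, kernel, stride)
        channels = filters
        shapes.append((dim_a, dim_b, channels))
    return shapes


shapes = trace_shapes()
# Number of features entering the first Linear layer after Flatten.
flat_features = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

for shape in shapes:
    print(shape)
print("flattened features:", flat_features)
```

Under these assumptions the second spatial dimension shrinks to 1 after the last convolution, so the flattened vector is simply the remaining width times the 64 filters. The printed shapes can be compared against the boxes in Figure 3.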

#### 2.5. Evaluation Metrics

#### 2.6. On-Policy Evaluation Procedure

- In the last week of November 2021: the weather conditions and vegetation levels were very similar to the most recent training data recorded at the end of October. Due to a missing parameter in the inference code, the RGB models were run on BGR input, and the results had to be discarded. Hence, only LiDAR-based models were adequately tested in this session. Night driving was performed with dipped-beam headlights on. Results from these tests are marked with (Nov) in Table 2.
- In the first week of February 2022: with snow coverage on the road. This constitutes clearly out-of-distribution scenery for the camera models. For the LiDAR models too, the surface shapes and reflectivity of snow piles differ from those of vegetation and constitute out-of-distribution conditions. LiDAR and camera images from summer, autumn, and winter are given in Appendix A. From this trial, marked with (Jan), we report only the driving performance with LiDAR, as the camera still operated in the BGR mode.
- In the first week of May 2022: early spring, which constitutes a close-to-training-distribution condition. Camera models were evaluated with the correct inference code. The location of LiDAR on the car had changed before this trial compared to the training data. LiDAR-based models underperformed during this test, despite our efforts to adjust the inputs.

## 3. Results

#### 3.1. Driving on an Unseen Track

#### 3.2. Overfitting Setting

#### 3.3. Night Driving and Winter Driving

#### 3.4. Informativeness of Individual LiDAR Channels

#### 3.5. Correlation Study between On- and Off-Policy Metrics

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
LiDAR | Light detection and ranging |
FOV | Field of view |
WRC | World Rally Championship |
RGB/BGR | Red green blue/blue green red |
GNSS | Global Navigation Satellite System |
ReLU | Rectified linear unit |

## Appendix A. LiDAR and Camera Image Stability Across Seasons

**Figure A1.** LiDAR and camera images in summer, autumn, and winter (from top to bottom for LiDAR, left to right for the camera). The area used for model inputs is marked with a red rectangle. In LiDAR images, the red channel corresponds to intensity, green to depth, and blue to ambient radiation.

## Appendix B. Visualization of Individual LiDAR Channels

**Figure A2.** LiDAR channels at the same location across the three seasons, in order from top to bottom: summer, autumn, winter. (**a**) In the intensity channel, we see a significant difference in how the road itself looks, while vegetation is surprisingly similar despite deciduous plants having no leaves in autumn and winter. (**b**) The depth image looks stable across seasons but is rather uninformative, as road and low-vegetation areas are hard to discern. (**c**) Ambient radiation images vary strongly in brightness across the seasons and also display strong noise. The noise resembles white noise or salt-and-pepper noise, and the authors do not know its cause.

## Appendix C. Intervention Locations in the Winter

**Figure A3.** Interventions of a LiDAR v1 model in the winter. The interventions are far more frequent in open fields, whereas the model can handle driving in the forest much better. Furthermore, the middle section of the route, which contains bushes by the roadside, is driven well.

## Appendix D. Dataset Details

**Table A1.** Training dataset. The recording names start with dates in yyyy-mm-dd format, followed by the hour. Certain recordings failed to record GNSS locations but captured camera and LiDAR feeds and steering, so we can still use them for model training.

# | Recording Name | Rally Estonia 2021 Stage Name | Complete | Length (km) |
---|---|---|---|---|
1 | 2021-05-20-12-36-10_e2e_sulaoja_20_30 | Shakedown | 1 | 6.8 |
2 | 2021-05-20-12-43-17_e2e_sulaoja_20_30 | Shakedown, backwards | 0 | GNSS error |
3 | 2021-05-20-12-51-29_e2e_sulaoja_20_30 | Shakedown, backwards | 0 | sums to 6.8 km with above |
4 | 2021-05-20-13-44-06_e2e_sulaoja_10_10 | Shakedown | 1 | 6.83 |
5 | 2021-05-20-13-51-21_e2e_sulaoja_10_10 | Shakedown, backwards | 0 | 6.14 |
6 | 2021-05-20-13-59-00_e2e_sulaoja_10_10 | Shakedown, backwards | 0 | 1.12 |
7 | 2021-05-28-15-07-56_e2e_sulaoja_20_30 | Shakedown | 1 | 6.77 |
8 | 2021-05-28-15-17-19_e2e_sulaoja_20_30 | Shakedown, backwards | 0 | 1.43 |
9 | 2021-06-09-13-14-51_e2e_rec_ss2 | SS12/SS16 | 1 | 22.04 |
10 | 2021-06-09-13-55-03_e2e_rec_ss2_backwards | SS12/SS16 backwards | 1 | 24.82 |
11 | 2021-06-09-14-58-11_e2e_rec_ss3 | SS4/SS8 backwards | 1 | 17.74 |
12 | 2021-06-09-15-42-05_e2e_rec_ss3_backwards | SS4/SS8 | 1 | 17.68 |
13 | 2021-06-09-16-24-59_e2e_rec_ss13 | RE2020 stage, overlaps SS13/17 & SS3/7 | 1 | 13.96 |
14 | 2021-06-09-16-50-22_e2e_rec_ss13_backwards | RE2020 stage, overlaps SS13/17 & SS3/7 | 1 | 13.99 |
15 | 2021-06-10-12-59-59_e2e_ss4 | RE2020 stage, overlaps SS13/17 & SS3/7 | 1 | 10.03 |
16 | 2021-06-10-13-19-22_e2e_ss4_backwards | RE2020 stage, overlaps SS13/17 & SS3/7 | 1 | 10.14 |
17 | 2021-06-10-13-51-34_e2e_ss12 | RE2020 stage, overlaps with SS2/6 | 1 | 6.9 |
18 | 2021-06-10-14-02-24_e2e_ss12_backwards | RE2020 stage, overlaps with SS2/6 | 1 | 6.85 |
19 | 2021-06-10-14-44-24_e2e_ss3_backwards | SS4/SS8 | 0 | 16.23 |
20 | 2021-06-10-15-03-16_e2e_ss3_backwards | SS4/SS8 | 0 | 1.14 |
21 | 2021-06-14-11-08-19_e2e_rec_ss14 | RE2020 stage, overlaps with SS5/9 | 0 | 6.98 |
22 | 2021-06-14-11-22-05_e2e_rec_ss14 | RE2020 stage, overlaps with SS5/9 | 0 | 10.77 |
23 | 2021-06-14-11-43-48_e2e_rec_ss14_backwards | RE2020 stage, overlaps with SS5/9 | 1 | 18.85 |
24 | 2021-09-24-11-19-25_e2e_rec_ss10 | SS10/SS14 | 0 | 12.11 |
25 | 2021-09-24-11-40-24_e2e_rec_ss10_2 | SS10/SS14 | 0 | 6.06 |
26 | 2021-09-24-12-02-32_e2e_rec_ss10_3 | SS10/SS14 | 0 | 3.36 |
27 | 2021-09-24-12-21-20_e2e_rec_ss10_backwards | SS10/SS14 backwards | 1 | 23.9 |
28 | 2021-09-24-13-39-38_e2e_rec_ss11 | SS11/SS15 | 1 | 12.26 |
29 | 2021-09-30-13-57-00_e2e_rec_ss14 | SS5/SS9 | 0 | 0.93 |
30 | 2021-09-30-15-03-37_e2e_ss14_from_half_way | SS5/SS9 | 0 | 7.89 |
31 | 2021-09-30-15-20-14_e2e_ss14_backwards | SS5/SS9, backwards | 1 | 19.26 |
32 | 2021-09-30-15-56-59_e2e_ss14_attempt_2 | SS5/SS9 | 0 | 19.2 |
33 | 2021-10-07-11-05-13_e2e_rec_ss3 | SS4/SS8 | 1 | 17.62 |
34 | 2021-10-07-11-44-52_e2e_rec_ss3_backwards | SS3/SS7 | 1 | 17.47 |
35 | 2021-10-07-12-54-17_e2e_rec_ss4 | SS3/SS7 | 1 | 9.16 |
36 | 2021-10-07-13-22-35_e2e_rec_ss4_backwards | SS3/SS7 backwards | 1 | 9.14 |
37 | 2021-10-11-16-06-44_e2e_rec_ss2 | SS12/SS16 | 0 | GNSS error |
38 | 2021-10-11-17-10-23_e2e_rec_last_part | SS12/SS16 paved section | 0 | sums to 12.6 km with above |
39 | 2021-10-11-17-14-40_e2e_rec_backwards | SS12/SS16 backwards | 0 | GNSS error |
40 | 2021-10-11-17-20-12_e2e_rec_backwards | SS12/SS16 backwards | 0 | sums to 12.6 km with above |
41 | 2021-10-20-13-57-51_e2e_rec_neeruti_ss19_22 | SS19/22 | 1 | 8 |
42 | 2021-10-20-14-15-07_e2e_rec_neeruti_ss19_22_back | SS19/22 | 1 | 8 |
43 | 2021-10-20-14-55-47_e2e_rec_vastse_ss13_17 | SS13/SS17 | 1 | 6.67 |
44 | 2021-10-25-17-06-34_e2e_rec_ss2_arula_back | SS2/SS6 | 1 | 12.83 |
45 | 2021-10-25-17-31-48_e2e_rec_ss2_arula | SS2/SS6 | 1 | 12.82 |

Total distance: 465.89 km

**Table A2.** Validation dataset used for early stopping. The recording names start with dates in yyyy-mm-dd format, followed by the hour. The last two recordings are also used in the training set of models overfitted to the evaluation track. Sections of the last two recordings (sections corresponding to the on-policy testing route) were used for obtaining seasonal off-policy metrics for models tested in the autumn.

# | Recording Name | Rally Estonia 2021 Stage Name | Complete | Length (km) | Comment |
---|---|---|---|---|---|
1 | 2021-05-28-15-19-48_e2e_sulaoja_20_30 | Shakedown, backwards | 0 | 4.81 | |
2 | 2021-06-07-14-06-31_e2e_rec_ss6 | SS20/SS23 | 0 | 1.45 | |
3 | 2021-06-07-14-09-18_e2e_rec_ss6 | SS20/SS23 | 0 | 2.06 | |
4 | 2021-06-07-14-20-07_e2e_rec_ss6 | SS20/SS23 | 1 | 11.46 | |
5 | 2021-06-07-14-36-16_e2e_rec_ss6 | SS20/SS23, backwards | 1 | 11.41 | |
6 | 2021-09-24-14-03-45_e2e_rec_ss11_backwards | SS11/SS15 backwards | 1 | 9.83 | used in autumn val. set |
7 | 2021-10-11-14-50-59_e2e_rec_vahi | not RE stage | 1 | 4.97 | used in autumn val. set |
8 | 2021-10-14-13-08-51_e2e_rec_vahi_backwards | not RE stage | 1 | 4.9 | used in autumn val. set |
9 | 2021-10-20-15-11-29_e2e_rec_vastse_ss13_17_back | SS13/SS17 | 1 | 7.11 | used in autumn val. set |
10 | 2021-10-26-10-49-06_e2e_rec_ss20_elva | SS20/SS23 | 1 | 10.91 | used in “overfit”, autumn val. set |
11 | 2021-10-26-11-08-59_e2e_rec_ss20_elva_back | SS20/SS23 backwards | 1 | 10.89 | used in “overfit”, autumn val. set |

Total distance: 79.8 km

**Table A3.** Winter recordings used for seasonal off-policy metrics computation for models tested in winter. Only 4.3 km sections from each recording were used, corresponding to the on-policy test route. The recording names start with dates in yyyy-mm-dd format, followed by the hour.

# | Recording Name | Rally Estonia 2021 Stage Name | Complete | Length (km) |
---|---|---|---|---|
1 | 2022-01-28-14-47-23_e2e_rec_elva_forward | SS20/SS23 | 1 | 10.52 |
2 | 2022-01-28-15-09-01_e2e_rec_elva_backward | SS20/SS23 | 1 | 10.82 |

**Table A4.** Spring recordings used for seasonal off-policy metrics computation for models tested in spring. The recording names start with dates in yyyy-mm-dd format, followed by the hour.

# | Recording Name | Rally Estonia 2021 Stage Name | Complete | Length (km) |
---|---|---|---|---|
1 | 2022-05-04-10-54-24_e2e_elva_seasonal_val_set_forw | SS20/SS23 | 0 | 4.3 |
2 | 2022-05-04-11-01-40_e2e_elva_seasonal_val_set_back | SS20/SS23 | 0 | 4.3 |

## Appendix E. On-Policy and Off-Policy Metrics

**Table A5.** On-policy test sessions, on-policy metrics recorded, and the corresponding off-policy metrics from the same track in the same season. These values serve as the basis for correlation calculations between distance per intervention (DpI) and other metrics. DpI for tests with 0 interventions was set to 10 km. Combined refers to the sum of $\mathit{MAE}_{\text{steer}}$ and $W_{\text{off-policy}}$, both standardized to zero mean and standard deviation one.

Model (Session) | DpI | $\mathit{MAE}_{\text{trajectory}}$ | Failure Rate | $W_{\text{effective}}$ | $W_{\text{on-policy}}$ | $\mathit{MAE}_{\text{steer}}$ | $W_{\text{off-policy}}$ | Combined |
---|---|---|---|---|---|---|---|---|
Camera v1 BGR (Nov) | 665.00 m | 0.2715 m | 2.82% | 109.99 °/s | 41.36 °/s | 7.23° | 99.00 °/s | 3.703541 |
Camera v2 BGR (Nov) | 743.87 m | 0.2906 m | 3.31% | 58.91 °/s | 27.27 °/s | 6.91° | 74.50 °/s | 2.577991 |
Camera v3 BGR (Nov) | 685.03 m | 0.2526 m | 1.54% | 94.10 °/s | 35.42 °/s | 7.46° | 90.50 °/s | 3.633461 |
Camera overfit BGR (Nov) | 1185.92 m | 0.2707 m | 4.83% | 68.16 °/s | 28.36 °/s | 6.86° | 116.00 °/s | 4.038257 |
LiDAR v1 (Nov) | 4221.26 m | 0.2164 m | 0.42% | 60.72 °/s | 22.96 °/s | 5.93° | 52.40 °/s | 1.202971 |
LiDAR v2 (Nov) | 4232.96 m | 0.2395 m | 0.98% | 40.79 °/s | 17.71 °/s | 6.05° | 30.10 °/s | 0.509354 |
LiDAR v3 (Nov) | 2810.78 m | 0.2536 m | 2.18% | 56.74 °/s | 18.84 °/s | 5.88° | 41.00 °/s | 0.745176 |
LiDAR overfit (Nov) | 10,000.00 m | 0.2599 m | 4.38% | 38.54 °/s | 19.21 °/s | 4.02° | 27.30 °/s | −1.221416 |
Camera v1 RGB (May) | 4232.17 m | 0.2309 m | 0.96% | 65.03 °/s | 24.63 °/s | 6.26° | 63.20 °/s | 1.745492 |
Camera v2 RGB (May) | 2090.79 m | 0.2382 m | 0.69% | 107.73 °/s | 33.46 °/s | 6.72° | 54.70 °/s | 1.910144 |
Camera v3 RGB (May) | 1198.55 m | 0.2403 m | 0.66% | 67.55 °/s | 29.70 °/s | 6.68° | 63.66 °/s | 2.076987 |
Camera overfit RGB (May) | 2829.71 m | 0.2453 m | 1.39% | 65.94 °/s | 23.07 °/s | 5.96° | 59.91 °/s | 1.405479 |
LiDAR v1 shifted (May) | 436.15 m | 0.2649 m | 1.52% | 49.76 °/s | 23.66 °/s | 7.54° | 64.41 °/s | 2.775023 |
LiDAR v2 shifted (May) | 481.37 m | 0.2673 m | 3.52% | 52.64 °/s | 27.25 °/s | 8.03° | 59.80 °/s | 3.155804 |
LiDAR v3 shifted (May) | 251.56 m | 0.2786 m | 3.20% | 92.61 °/s | 40.73 °/s | 9.64° | 90.70 °/s | 5.404183 |
LiDAR overfit shifted (May) | 603.82 m | 0.2783 m | 4.69% | 56.62 °/s | 31.01 °/s | 7.88° | 73.70 °/s | 3.388144 |
LiDAR v2 day 2 (Nov) | 10,000.00 m | 0.2193 m | 1.97% | 33.89 °/s | 19.17 °/s | 6.05° | 30.10 °/s | 0.509354 |
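The Combined column is described as a sum of two standardized metrics. A minimal sketch of that construction follows; it is an illustration under stated assumptions, not the authors' code. The function names are hypothetical, the population standard deviation is an assumption, and the input values are made-up toy numbers. The Combined values in Table A5 do not average to zero over the listed rows, so the mean and standard deviation were presumably computed over a different reference set, which is not stated here; this sketch therefore will not reproduce the column.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sigma for v in values]


def combined_metric(mae_steer, w_off_policy):
    """Sum of the two standardized off-policy metrics, model by model."""
    z_mae = standardize(mae_steer)
    z_w = standardize(w_off_policy)
    return [a + b for a, b in zip(z_mae, z_w)]


# Toy values for three hypothetical models (not taken from Table A5).
toy_mae_steer = [4.0, 6.0, 8.0]        # degrees
toy_w_off_policy = [30.0, 60.0, 90.0]  # degrees per second
toy_combined = combined_metric(toy_mae_steer, toy_w_off_policy)
print(toy_combined)
```

Lower is better for both inputs, so a lower combined score indicates a model that is both more accurate and smoother off-policy.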

## Appendix F. Qualitative Observations of Sensitivity


**Figure 1.** The location of sensors used in this work. There are other sensors on the vehicle not illustrated here.

**Figure 2.** Input modalities. The red box marks the area used as model input. Top: surround view LiDAR image, with red: intensity, blue: depth, and green: ambient. Bottom: 120-degree FOV camera.

**Figure 3.** The modified PilotNet architecture. Each box represents the output from a layer, with the first box corresponding to the input of size (264, 68, 3). The model consists of 5 convolutional layers and 4 fully connected layers. The flattening operation is not made visible here. See the filter sizes, usage of batch normalization, and activation functions in Table 1.

**Figure 4.** Safety-driver interventions in the experiments where the test track was not included in the training set. Interventions from 3 test runs with different versions of the same model and from both driving directions are overlaid on one map. Interventions due to traffic are not filtered out from these maps, unlike in Table 2. Left: camera models v1–v3 (first 3 rows of Table 2). Middle: LiDAR models v1–v3 (rows 4–6 of Table 2). Right: an example of a situation where the safety driver has to take over due to traffic. Such situations are not counted as interventions in Table 2.

**Table 2.** Results of on-policy evaluations. Evaluations interrupted due to a high frequency of interventions are marked with *. The Experiment column groups the rows by the results subsection they illustrate.

Experiment | Model (Session) | Distance | Interventions | DpI | $\mathit{MAE}_{\text{trajectory}}$ | Failure Rate | $W_{\text{on-policy}}$ |
---|---|---|---|---|---|---|---|
Generalization | Camera v1 (May) | 8464.33 m | 2 | 4232.17 m | 0.2309 m | 0.96% | 24.63 °/s |
Generalization | Camera v2 (May) | 8363.17 m | 4 | 2090.79 m | 0.2382 m | 0.69% | 33.46 °/s |
Generalization | Camera v3 (May) | 8389.88 m | 7 | 1198.55 m | 0.2403 m | 0.66% | 29.70 °/s |
Generalization | LiDAR v1 (Nov) | 8442.5 m | 2 | 4221.3 m | 0.22 m | 0.42% | 23.0 °/s |
Generalization | LiDAR v2 (Nov) | 8465.9 m | 2 | 4233.0 m | 0.24 m | 0.98% | 17.7 °/s |
Generalization | LiDAR v3 (Nov) | 8432.3 m | 3 | 2810.8 m | 0.25 m | 2.18% | 18.8 °/s |
Overfitting | Camera overfit (May) | 8489.14 m | 3 | 2829.71 m | 0.2453 m | 1.39% | 23.07 °/s |
Overfitting | LiDAR overfit (Nov) | 8436.9 m | 0 | >8436.9 m | 0.26 m | 4.38% | 19.2 °/s |
Night | LiDAR v2 (Nov) | 8216.3 m | 8 | 1027.0 m | 0.24 m | 1.27% | 25.6 °/s |
Night | LiDAR v2 #2 (Nov) | 8376.6 m | 3 | 2792.2 m | 0.23 m | 1.55% | 20.3 °/s |
Night | LiDAR overfit (Nov) | 8521.5 m | 1 | 8521.5 m | 0.25 m | 1.52% | 21.5 °/s |
Winter | LiDAR v1 (Jan) | 8080.5 m | 19 | 425.3 m | 0.24 m | 0.94% | 38.4 °/s |
Winter | LiDAR v2 (Jan) | 8001.4 m | 22 | 363.7 m | 0.28 m | 3.10% | 38.7 °/s |
Winter | LiDAR v3 (Jan) | 7698.9 m | 34 | 226.4 m | 0.26 m | 1.64% | 42.2 °/s |
LiDAR channels (next day) | LiDAR v2 (Nov) | 8491.6 m | 0 | >8491.6 m | 0.22 m | 1.97% | 19.2 °/s |
LiDAR channels (next day) | LiDAR intensity (Nov) | 8446.2 m | 2 | 4223.1 m | 0.33 m | 7.02% | 24.0 °/s |
LiDAR channels (next day) | LiDAR depth (Nov) | 1679.0 m * | 22 * | 76.3 m * | 0.61 m * | 19.95% * | 29.9 °/s * |
LiDAR channels (next day) | LiDAR ambience (Nov) | 329.5 m * | 19 * | 17.3 m * | 0.73 m * | 17.49% * | 168.2 °/s * |
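The DpI column in Table 2 is consistent with dividing the driven distance by the number of interventions. The sketch below shows that arithmetic. The function name is hypothetical, and the handling of zero interventions follows the Table A5 caption (DpI set to 10 km), whereas Table 2 itself reports such runs as a lower bound (">distance").

```python
# Substitute DpI used for runs without interventions, per the Table A5 caption.
ZERO_INTERVENTION_DPI_M = 10_000.0


def distance_per_intervention(distance_m: float, interventions: int) -> float:
    """Distance per intervention in meters; 10 km when there were none."""
    if interventions < 0:
        raise ValueError("interventions must be non-negative")
    if interventions == 0:
        return ZERO_INTERVENTION_DPI_M
    return distance_m / interventions


# Two rows from Table 2: LiDAR v1 (Jan) and LiDAR overfit (Nov).
winter_dpi = distance_per_intervention(8080.5, 19)
overfit_dpi = distance_per_intervention(8436.9, 0)
print(round(winter_dpi, 1), overfit_dpi)
```

Capping zero-intervention runs at a fixed value keeps DpI finite so that it can enter a correlation analysis, at the cost of understating how far such a model might actually drive unaided.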

**Table 3.** Pearson correlations of the main driving quality metric distance per intervention (DpI) with other on- and off-policy metrics. The highest-correlating metrics of both types are highlighted in bold.

Measure | Type | Pearson R |
---|---|---|
$\mathit{MAE}_{\text{trajectory}}$ | On-policy | −0.56 |
Failure Rate | On-policy | −0.06 |
$W_{\text{on-policy}}$ | On-policy | −0.56 |
$W_{\text{effective}}$ | On-policy | **−0.67** |
$W_{\text{off-policy}}$ | Off-policy | −0.72 |
$\mathit{MAE}_{\text{steer}}$ | Off-policy | **−0.76** |
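The correlations in Table 3 can be recomputed from the per-session values in Table A5. The sketch below is an illustration rather than the authors' analysis code: it implements the textbook Pearson coefficient in plain Python and applies it to the DpI and $\mathit{MAE}_{\text{steer}}$ columns of Table A5, copied in row order. The function name `pearson_r` is hypothetical. On these 17 rows the result should land close to the −0.76 reported for $\mathit{MAE}_{\text{steer}}$.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# DpI (m) and MAE_steer (degrees) from Table A5, in row order.
dpi_m = [665.00, 743.87, 685.03, 1185.92, 4221.26, 4232.96, 2810.78,
         10000.00, 4232.17, 2090.79, 1198.55, 2829.71, 436.15, 481.37,
         251.56, 603.82, 10000.00]
mae_steer_deg = [7.23, 6.91, 7.46, 6.86, 5.93, 6.05, 5.88, 4.02, 6.26,
                 6.72, 6.68, 5.96, 7.54, 8.03, 9.64, 7.88, 6.05]

r_mae_steer = pearson_r(dpi_m, mae_steer_deg)
print(round(r_mae_steer, 2))
```

The two sessions capped at 10 km contribute a large share of the covariance, so the coefficient is sensitive to the substitute value chosen for zero-intervention runs.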

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Tampuu, A.; Aidla, R.; van Gent, J.A.; Matiisen, T.
LiDAR-as-Camera for End-to-End Driving. *Sensors* **2023**, *23*, 2845.
https://doi.org/10.3390/s23052845
