Point Cloud Wall Projection for Realistic Road Data Augmentation
Abstract
1. Introduction
- To enhance the representation of distant objects through a refined spherical point projection, without the need for complex extrapolation techniques.
- To prevent excessive loss of detail in virtual objects via the point-wall method, so that the shapes of distant objects more closely resemble real sensor data.
- To accurately depict the orientation of synthetic object points via the pose determination module, supporting realistic platooning scenarios on roadways.
2. Related Works
3. Proposed Method
3.1. Position Determination
3.1.1. Ground Filtering
3.1.2. Collision Handling
- Collision between virtual objects: After generating candidate coordinates for the synthetic objects (from the region of interest, or RoI, which is the ground point cloud), the algorithm first ensures that virtual objects do not overlap. It does so by selecting one candidate coordinate at random and removing all other coordinates within a certain distance, known as the “collision threshold”. This ensures that virtual objects are spaced out properly to prevent overlaps as shown in Figure 4b.
- Collision with non-ground points: Once the first collision check (between virtual objects) is completed, the remaining candidate coordinates are checked against non-ground points (vegetation, sidewalks, etc.). Each candidate is compared with the non-ground point cloud, which was previously separated by the ground segmentation algorithm, and any candidate that lies too close to a non-ground point (within a specified distance) is discarded. In this second step, the collision threshold is half the value used for virtual object-to-virtual object collisions, giving a finer level of collision avoidance near non-ground features, as shown in Figure 4c. Distance-based thresholds are less precise than mesh-based pruning of candidate positions; although a mesh-based approach would improve placement precision, its computational cost is hard to justify for large-scale dataset generation, where speed and scalability are prioritized. The current implementation therefore relies on distance-based thresholds, with mesh-based collision handling left for future work.
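Both checks reduce to nearest-neighbour distance tests. The sketch below illustrates this two-stage filtering under stated assumptions: candidate positions and the non-ground cloud are given as 2D NumPy arrays, and the function name and the 4 m default threshold are placeholders, not values taken from the paper.

```python
import numpy as np

def filter_candidates(candidates, non_ground_xy, collision_threshold=4.0, rng=None):
    """Two-stage, distance-based collision handling (illustrative sketch only).

    candidates          : (N, 2) candidate XY positions sampled from the ground RoI
    non_ground_xy       : (M, 2) XY positions of non-ground points (vegetation, sidewalks, ...)
    collision_threshold : assumed minimum spacing (m) between virtual objects
    """
    rng = rng or np.random.default_rng()
    remaining = np.asarray(candidates, dtype=float)
    non_ground_xy = np.asarray(non_ground_xy, dtype=float)
    accepted = []

    # Stage 1: virtual object vs. virtual object.
    # Pick one candidate at random, keep it, and drop every candidate
    # (including itself) that lies within the collision threshold.
    while len(remaining) > 0:
        chosen = remaining[rng.integers(len(remaining))]
        accepted.append(chosen)
        dists = np.linalg.norm(remaining - chosen, axis=1)
        remaining = remaining[dists > collision_threshold]

    # Stage 2: virtual object vs. non-ground points,
    # using half the threshold for a finer proximity check.
    kept = []
    for pos in accepted:
        if (len(non_ground_xy) == 0
                or np.linalg.norm(non_ground_xy - pos, axis=1).min() > collision_threshold / 2.0):
            kept.append(pos)
    return np.asarray(kept)
```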
3.1.3. Pose Determination
- Pose decision area: The initial yaw value is set based on the object’s Y-coordinate (y) in the LiDAR sensor’s coordinate system, which is divided into three areas:
  - Straight pose area: the yaw angle is set to 0 (same direction) or π (opposite direction), chosen randomly.
  - Intersection pose area: the yaw angle is set to 0, π, or a random value θ between 0 and 2π.
  - Random pose area: the yaw angle is set randomly between 0 and 2π.
- Update with nearby real vehicles: Once the yaw angle is assigned based on the pose decision area, it can be updated according to the orientation of nearby real vehicles. The input point clouds in this context are typically labeled with object categories such as “car”, “truck”, and “pedestrian”. If a real vehicle lies within a certain proximity of the synthetic object, the yaw of the virtual object is updated to match the yaw of the nearest real vehicle, taken from the position and orientation of its bounding box. This step reflects the real-world phenomenon that vehicles on the road often drive in the same direction (or opposite directions) in clustered groups, such as on highways or in dense traffic.
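The two-step pose logic can be sketched as follows. This is a minimal illustration, not the authors’ code: the area boundaries (y_straight, y_intersection), the matching radius, and the function name are assumed placeholders, since the paper’s exact thresholds are not reproduced here.

```python
import numpy as np

def assign_yaw(position, real_vehicles,
               y_straight=3.5, y_intersection=10.0, match_radius=15.0, rng=None):
    """Illustrative yaw assignment for one virtual object.

    position       : (x, y) of the synthetic object in LiDAR coordinates
    real_vehicles  : list of (center_xy, yaw) taken from labeled bounding boxes
    y_straight, y_intersection : assumed |y| boundaries of the pose decision areas
    match_radius   : assumed proximity (m) for copying a real vehicle's yaw
    """
    rng = rng or np.random.default_rng()
    y = position[1]

    # 1) Area-based initial yaw.
    if abs(y) <= y_straight:            # straight pose area
        yaw = rng.choice([0.0, np.pi])
    elif abs(y) <= y_intersection:      # intersection pose area
        yaw = rng.choice([0.0, np.pi, rng.uniform(0.0, 2.0 * np.pi)])
    else:                               # random pose area
        yaw = rng.uniform(0.0, 2.0 * np.pi)

    # 2) Update with the nearest labeled real vehicle, if one is close enough.
    if real_vehicles:
        centers = np.array([c for c, _ in real_vehicles], dtype=float)
        dists = np.linalg.norm(centers - np.asarray(position, dtype=float), axis=1)
        nearest = int(np.argmin(dists))
        if dists[nearest] <= match_radius:
            yaw = real_vehicles[nearest][1]
    return float(yaw)
```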
3.2. Object Generation
3.2.1. Spherical Point Cloud Projection
- Coordinate transformation: The module first converts the input LiDAR point cloud data from Cartesian (orthogonal) coordinates to spherical coordinates. This step is necessary for applying the spherical point-tracking technique.
- Spherical point tracking: Once the data are in spherical coordinates, the module uses spherical point tracking and point cloud wall creation techniques to generate a synthetic model of the virtual object. This process defines the structure of the synthetic object based on the real-world point cloud data.
- Projection and final transformation: After applying the spherical point-tracking method, the resulting synthetic model is converted back into Cartesian coordinates. This step finalizes the projection of the virtual object into the synthetic point cloud data, effectively augmenting the original data with the new object.
- Integration into synthetic annotation: The augmented point cloud data are then passed into the synthetic annotation module, which processes the data further to fit the required data format, reflecting object occlusion and other relevant information like object type, position, and orientation.
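The coordinate transforms in steps 1 and 3 can be written compactly. The sketch below covers only this Cartesian-to-spherical round trip, not the point-tracking or point-wall steps of Algorithm 1; the convention of azimuth measured around the Z-axis and elevation measured from the XY plane is an assumption.

```python
import numpy as np

def cartesian_to_spherical(points):
    """Convert (N, 3) XYZ points to (range, azimuth, elevation)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(y, x)                       # horizontal angle around the sensor
    elevation = np.arcsin(z / np.maximum(r, 1e-9))   # vertical angle from the XY plane
    return np.stack([r, azimuth, elevation], axis=1)

def spherical_to_cartesian(points):
    """Inverse transform: (range, azimuth, elevation) back to XYZ."""
    r, az, el = points[:, 0], points[:, 1], points[:, 2]
    x = r * np.cos(el) * np.cos(az)
    y = r * np.cos(el) * np.sin(az)
    z = r * np.sin(el)
    return np.stack([x, y, z], axis=1)
```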
Algorithm 1. Spherical Point Projection Algorithm
3.2.2. Point Wall
3.3. Synthetic Annotation
4. Experimental Results
4.1. Experimental Environment
Dataset and Model
4.2. Synthetic LiDAR Point Cloud Generation
4.2.1. Car Class
4.2.2. Pedestrian and Cyclist Classes
4.2.3. Platooning
4.3. Deep Learning Model Training
4.3.1. Car Class
4.3.2. Pedestrian and Cyclist Classes
5. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Esmorís, A.M.; Weiser, H.; Winiwarter, L.; Cabaleiro, J.C.; Höfle, B. Deep learning with simulated laser scanning data for 3D point cloud classification. ISPRS J. Photogramm. Remote Sens. 2024, 215, 192–213.
- Beltrán, J.; Cortés, I.; Barrera, A.; Urdiales, J.; Guindel, C.; García, F.; de la Escalera, A. A method for synthetic LiDAR generation to create annotated datasets for autonomous vehicles perception. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 1091–1096.
- Yue, X.; Wu, B.; Seshia, S.A.; Keutzer, K.; Sangiovanni-Vincentelli, A.L. A lidar point cloud generator: From a virtual world to autonomous driving. In Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, Yokohama, Japan, 11–14 June 2018; pp. 458–464.
- Wang, F.; Zhuang, Y.; Gu, H.; Hu, H. Automatic generation of synthetic LiDAR point clouds for 3-D data analysis. IEEE Trans. Instrum. Meas. 2019, 68, 2671–2673.
- Hossny, M.; Saleh, K.; Attia, M.; Abobakr, A.; Iskander, J. Fast synthetic LiDAR rendering via spherical UV unwrapping of equirectangular Z-buffer images. In Proceedings of the Computer Vision and Pattern Recognition, Image and Video Processing, Glasgow, UK, 23–28 August 2020.
- Chitnis, S.A.; Huang, Z.; Khoshelham, K. Generating Synthetic 3D Point Segments for Improved Classification of Mobile LIDAR Point Clouds. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2021, 43, 139–144.
- Yin, T.; Gastellu-Etchegorry, J.P.; Grau, E.; Lauret, N.; Rubio, J. Simulating satellite waveform Lidar with DART model. In Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium-IGARSS, Melbourne, Australia, 21–26 July 2013; pp. 3029–3032.
- Yin, T.; Lauret, N.; Gastellu-Etchegorry, J.P. Simulation of satellite, airborne and terrestrial LiDAR with DART (II): ALS and TLS multi-pulse acquisitions, photon counting, and solar noise. Remote Sens. Environ. 2016, 184, 454–468.
- Gastellu-Etchegorry, J.P.; Yin, T.; Lauret, N.; Cajgfinger, T.; Gregoire, T.; Grau, E.; Feret, J.B.; Lopes, M.; Guilleux, J.; Dedieu, G.; et al. Discrete anisotropic radiative transfer (DART 5) for modeling airborne and satellite spectroradiometer and LIDAR acquisitions of natural and urban landscapes. Remote Sens. 2015, 7, 1667–1701.
- Gastellu-Etchegorry, J.P.; Yin, T.; Lauret, N.; Grau, E.; Rubio, J.; Cook, B.D.; Morton, D.C.; Sun, G. Simulation of satellite, airborne and terrestrial LiDAR with DART (I): Waveform simulation with quasi-Monte Carlo ray tracing. Remote Sens. Environ. 2016, 184, 418–435.
- Yang, X.; Wang, Y.; Yin, T.; Wang, C.; Lauret, N.; Regaieg, O.; Xi, X.; Gastellu-Etchegorry, J.P. Comprehensive LiDAR simulation with efficient physically-based DART-Lux model (I): Theory, novelty, and consistency validation. Remote Sens. Environ. 2022, 272, 112952.
- Yang, X.; Wang, C.; Yin, T.; Wang, Y.; Li, D.; Lauret, N.; Xi, X.; Wang, H.; Wang, R.; Wang, Y.; et al. Comprehensive LiDAR simulation with efficient physically-based DART-Lux model (II): Validation with GEDI and ICESat-2 measurements at natural and urban landscapes. Remote Sens. Environ. 2025, 317, 114519.
- Fang, J.; Zuo, X.; Zhou, D.; Jin, S.; Wang, S.; Zhang, L. Lidar-aug: A general rendering-based augmentation framework for 3D object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 4710–4720.
- Xiao, A.; Huang, J.; Guan, D.; Zhan, F.; Lu, S. Transfer learning from synthetic to real lidar point cloud for semantic segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; Volume 36, pp. 2795–2803.
- Xiang, Z.; Huang, Z.; Khoshelham, K. Synthetic lidar point cloud generation using deep generative models for improved driving scene object recognition. Image Vis. Comput. 2024, 150, 105207.
- Zhao, J.; Zheng, P.; Ma, R. D-Aug: Enhancing Data Augmentation for Dynamic LiDAR Scenes. arXiv 2024, arXiv:2404.11127.
- Zhang, Y.; Ding, M.; Yang, H.; Niu, Y.; Ge, M.; Ohtani, K.; Zhang, C.; Takeda, K. LiDAR Point Cloud Augmentation for Adverse Conditions Using Conditional Generative Model. Remote Sens. 2024, 16, 2247.
- Park, J.; Kim, K.; Shim, H. Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather. arXiv 2024, arXiv:2407.02286.
- Reichardt, L.; Uhr, L.; Wasenmüller, O. Text3DAug–Prompted Instance Augmentation for LiDAR Perception. arXiv 2024, arXiv:2408.14253.
- López, A.; Ogayar, C.J.; Jurado, J.M.; Feito, F.R. A GPU-accelerated framework for simulating LiDAR scanning. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–18.
- Winiwarter, L.; Pena, A.M.E.; Weiser, H.; Anders, K.; Sánchez, J.M.; Searle, M.; Höfle, B. Virtual laser scanning with HELIOS++: A novel take on ray tracing-based simulation of topographic full-waveform 3D laser scanning. Remote Sens. Environ. 2022, 269, 112772.
- Anand, V.; Lohani, B.; Pandey, G.; Mishra, R. Toward Physics-Aware Deep Learning Architectures for LiDAR Intensity Simulation. arXiv 2024, arXiv:2404.15774.
- Zyrianov, V.; Che, H.; Liu, Z.; Wang, S. LidarDM: Generative LiDAR Simulation in a Generated World. arXiv 2024, arXiv:2404.02903.
- Eggert, M.; Schade, M.; Bröhl, F.; Moriz, A. Generating Synthetic LiDAR Point Cloud Data for Object Detection Using the Unreal Game Engine. In Proceedings of the International Conference on Design Science Research in Information Systems and Technology, Trollhättan, Sweden, 3–5 June 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 295–309.
- Manivasagam, S.; Wang, S.; Wong, K.; Zeng, W.; Sazanovich, M.; Tan, S.; Yang, B.; Ma, W.C.; Urtasun, R. Lidarsim: Realistic lidar simulation by leveraging the real world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11167–11176.
- Li, R.; Li, X.; Heng, P.A.; Fu, C.W. Pointaugment: An auto-augmentation framework for point cloud classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 6378–6387.
- Lee, S.; Lim, H.; Myung, H. Patchwork++: Fast and robust ground segmentation solving partial under-segmentation using 3D point cloud. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 13276–13283.
- Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for autonomous driving? The kitti vision benchmark suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3354–3361.
- Caesar, H.; Bankiti, V.; Lang, A.H.; Vora, S.; Liong, V.E.; Xu, Q.; Krishnan, A.; Pan, Y.; Baldan, G.; Beijbom, O. Nuscenes: A multimodal dataset for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11621–11631.
- Lang, A.H.; Vora, S.; Caesar, H.; Zhou, L.; Yang, J.; Beijbom, O. Pointpillars: Fast encoders for object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 12697–12705.
- Shi, S.; Guo, C.; Jiang, L.; Wang, Z.; Shi, J.; Wang, X.; Li, H. Pv-rcnn: Point-voxel feature set abstraction for 3D object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10529–10538.
- Deng, J.; Shi, S.; Li, P.; Zhou, W.; Zhang, Y.; Li, H. Voxel r-cnn: Towards high performance voxel-based 3D object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 1201–1209.
| Research Study | Methodology | Key Aspects | Limitations |
|---|---|---|---|
| Esmorís et al. [1] | Trains models with virtual laser scanning data | Automated scene and model training | Limited to specific real data applications |
| Yin et al. [7] | Extended DART model with Monte Carlo methods for satellite LiDAR simulation | Efficient scattering models; supports data fusion with other sensors | Focuses on vegetation and urban scenes |
| Zhao et al. [16] | LiDARsim uses real data, ray casting, and neural networks | Realistic LiDAR for autonomous testing | High fidelity needed, weather simulation issues |
| Zhang et al. [17] | Conditional generative model with segmentation maps | Large dataset, domain adaptation strategies | Cannot generalize to more severe settings |
| Park et al. [18] | Generative models for long-tail object recognition | Generative augmentation for minority classes | Focuses only on object recognition |
| Reichardt et al. [19] | Virtual laser scanning for semantic segmentation | Automated training, real-world data reliance | Accuracy gaps, dynamic scene challenges |
| López et al. [20] | GPU-based LiDAR simulator generates dense semantic point clouds for DL training | High speed (99% faster); large-scale labeling; procedural scene generation | Limited to procedural or static environments |
| Winiwarter et al. [21] | HELIOS++ simulates terrestrial, airborne, and mobile LiDAR with modular scene modeling | Handles vegetation, supports Python, has fast runtime, and creates training data | Slightly less accurate than ray-tracing models |
| Anand et al. [22] | Physics-informed DL for LiDAR intensity simulation using U-NET and Pix2Pix architectures | Adds incidence angle as input; improves intensity prediction accuracy | Lacks material property integration |
| Zyrianov et al. [23] | LidarDM generates realistic, layout-aware, temporally coherent 4D LiDAR sequences | 4D generation; driving scenario guidance; high realism for simulations | Not real time; no intensity modeling yet |
| Eggert et al. [24] | Synthetic point cloud generation using Unreal Engine for object detection | High-quality clouds; suitable for industrial datasets | Sparse datasets; lacks specific real-world equivalency |
| Manivasagam et al. [25] | Simulates LiDAR with real data, simulations, and ML | Realistic LiDAR simulations | Requires large datasets, domain challenges |
| Fang et al. [13] | Rendering-based LiDAR augmentation | Point distribution representation | Low performance for long-distance objects |
| Xiao et al. [14] | SynLiDAR dataset creation, translation via PCT | Large synthetic dataset, transfer learning | Focused on segmentation, overfitting risk |
| Xiang et al. [15] | Generative models for LiDAR object recognition | Synthetic point clouds, minority class focus | Limited generalization, needs tailored models |
| Li et al. [26] | PointAugment auto-optimizes point cloud samples via adversarial learning | Sample-specific augmentation; improves shape classification | Limited to certain transformations |
| Values | Category | Data | Type |
|---|---|---|---|
| 1 | Class | Describes the type of object | string |
| 3 | Position | 3D object location in LiDAR coordinates (in meters), Ex. [x, y, z] | float[] |
| 3 | Rotation | 3D object rotation in LiDAR coordinates, Ex. [0, 0, yaw] | float[] |
| 3 | Dimension | 3D object dimensions in LiDAR coordinates (in meters), Ex. [height, width, length] | float[] |
| 1 | Occluded | Integer (0, 1, 2, 3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown | int |
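For illustration, the fields above map naturally onto a small record type. This is a hedged sketch only: the class name, field names, and example values below are assumptions, not the paper’s data structures.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SyntheticAnnotation:
    """One synthetic object label, following the field layout in the table above."""
    obj_class: str         # object type, e.g. "Car"
    position: List[float]  # [x, y, z] in LiDAR coordinates (meters)
    rotation: List[float]  # [0, 0, yaw] in LiDAR coordinates
    dimension: List[float] # [height, width, length] in meters (assumed ordering)
    occluded: int          # 0 = fully visible ... 3 = unknown

# Example record (values are illustrative only):
ann = SyntheticAnnotation("Car", [12.3, -4.5, -1.6], [0.0, 0.0, 1.57], [1.5, 1.7, 4.2], 0)
```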
| | | Velodyne HDL-64E (KITTI 360 Dataset) | Velodyne HDL-32E (nuScenes Dataset) |
|---|---|---|---|
| Range | | ∼120 m | ∼100 m |
| Resolution | Horizontal | 0.08° | 0.08∼1.33° |
| | Vertical | 0.4° | 1.33° |
| Field of View | Horizontal | 360° | 360° |
| | Vertical | 26.8° (−24.8°∼+2°) | 41.33° (−30.67°∼+10.67°) |
| Values | Name | Description |
|---|---|---|
| 1 | Type | Describes the type of object: ‘Car’, ‘Van’, ‘Truck’, ‘Pedestrian’, ‘Person_sitting’, ‘Cyclist’, ‘Tram’, ‘Misc’, or ‘DontCare’ |
| 1 | Truncated | Float from 0 (non-truncated) to 1 (truncated), where truncated refers to the object leaving the image boundaries |
| 1 | Occluded | Integer (0, 1, 2, 3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown |
| 1 | Alpha | Observation angle of the object, in the range [−π, π] |
| 4 | Bbox | 2D bounding box of the object in the image (0-based index): left, top, right, and bottom pixel coordinates |
| 3 | Dimensions | 3D object dimensions: height, width, and length (in meters) |
| 3 | Location | 3D object location x, y, z in camera coordinates (in meters) |
| 1 | Rotation_y | Rotation ry around the Y-axis in camera coordinates, in the range [−π, π] |
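A KITTI label file stores these fields as whitespace-separated values in the order listed above. The minimal parser below assumes the standard 15-value KITTI layout; the function name and dictionary keys are illustrative.

```python
def parse_kitti_label(line: str) -> dict:
    """Parse one line of a KITTI-format label file (standard 15-value layout)."""
    v = line.split()
    return {
        "type": v[0],
        "truncated": float(v[1]),
        "occluded": int(v[2]),
        "alpha": float(v[3]),
        "bbox": [float(x) for x in v[4:8]],         # left, top, right, bottom (pixels)
        "dimensions": [float(x) for x in v[8:11]],  # height, width, length (meters)
        "location": [float(x) for x in v[11:14]],   # x, y, z in camera coordinates
        "rotation_y": float(v[14]),                 # yaw around camera Y-axis, [-pi, pi]
    }

# Example (values illustrative):
# parse_kitti_label("Car 0.00 0 -1.58 587.0 173.3 614.1 200.1 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59")
```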
| Methods | AP3D Easy (%) | AP3D Moderate (%) | AP3D Hard (%) | mAP |
|---|---|---|---|---|
| PointPillars with KITTI | 85.41 | 73.59 | 68.76 | 75.92 |
| PointPillars with LiDAR-Aug | 87.75 | 77.83 | 74.90 | 80.16 |
| PointPillars with proposed work (Ours) | 88.75 | 80.43 | 77.52 | 82.23 |
| PV-RCNN with KITTI | 88.86 | 78.83 | 78.30 | 82.00 |
| PV-RCNN with LiDAR-Aug | 90.18 | 84.23 | 78.95 | 84.45 |
| PV-RCNN with proposed work (Ours) | 92.64 | 84.54 | 82.98 | 86.72 |
| Voxel R-CNN with KITTI | 92.24 | 85.01 | 82.51 | 86.59 |
| Voxel R-CNN with LiDAR-Aug | NB | NB | NB | NB |
| Voxel R-CNN with proposed work (Ours) | 92.67 | 85.34 | 83.13 | 87.05 |
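The mAP column corresponds to the arithmetic mean of the Easy, Moderate, and Hard AP3D values; for example, for PointPillars trained on KITTI alone, mAP = (85.41 + 73.59 + 68.76) / 3 ≈ 75.92.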
AP3D (%) results by class and difficulty:

| Methods | Car Easy | Car Moderate | Car Hard | Pedestrian Easy | Pedestrian Moderate | Pedestrian Hard | Cyclist Easy | Cyclist Moderate | Cyclist Hard |
|---|---|---|---|---|---|---|---|---|---|
| PointPillars with KITTI | 85.41 | 73.59 | 68.76 | 47.51 | 43.82 | 42.20 | 84.64 | 64.26 | 60.69 |
| PointPillars with LiDAR-Aug | 87.75 | 77.83 | 74.90 | 59.99 | 55.15 | 52.66 | - | - | - |
| PointPillars with proposed work (Ours) | 87.99 | 78.42 | 75.35 | 56.28 | 50.50 | 46.07 | 84.68 | 64.63 | 61.12 |
AP3D (%) results by class and difficulty:

| Methods | Car Easy | Car Moderate | Car Hard | Pedestrian Easy | Pedestrian Moderate | Pedestrian Hard | Cyclist Easy | Cyclist Moderate | Cyclist Hard |
|---|---|---|---|---|---|---|---|---|---|
| PV-RCNN with KITTI | 88.86 | 78.83 | 78.30 | 60.56 | 53.75 | 51.90 | 90.24 | 71.83 | 68.33 |
| PV-RCNN with LiDAR-Aug | 90.18 | 84.23 | 78.95 | 65.05 | 58.90 | 55.52 | - | - | - |
| PV-RCNN with proposed work (Ours) | 92.00 | 84.68 | 82.67 | 66.41 | 59.45 | 53.62 | 90.85 | 72.42 | 69.04 |