# 3D Scene Reconstruction Using Omnidirectional Vision and LiDAR: A Hybrid Approach

Michiel Vlaminck, Hiep Luong, Werner Goeman and Wilfried Philips

## Abstract


## 1. Introduction

- We propose a surface analysis technique that builds a topological space on top of the point cloud, providing each point with its optimal neighborhood while taking the underlying surface into account.
- We maintain a global 3D map that is continuously updated by means of a surface reconstruction technique that allows us to resample the point cloud. This way, the global 3D map is continuously improved: noise is reduced, and subsequent point clouds can be aligned more accurately.
- The topological space is used to compute low-level features, which are incorporated in an adapted version of the ICP algorithm to make the latter more robust. Both the alignment of consecutive local point clouds and the alignment of a local point cloud with the global 3D map are conducted using this improved ICP algorithm.
- We incorporate the residual of the ICP process in the loop closure process. Based on this residual, we can predict the share of each pose estimate in the final pose error once a loop has been detected.

## 2. Related Work

## 3. System

#### 3.1. Acquisition Platform

#### 3.2. Terminology

#### 3.3. Reference Coordinate System

#### 3.4. Problem Statement

## 4. Approach

#### 4.1. 2D Projection

#### 4.2. Surface Analysis

#### 4.3. Local Registration

#### 4.4. Global Registration

#### 4.5. Map Fusion

#### 4.6. Loop Closure

## 5. Evaluation

#### 5.1. Our Dataset

#### 5.2. Kitti Vision Benchmark

## 6. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Mur-Artal, R.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robot. **2015**, 31, 1147–1163.
2. Caruso, D.; Engel, J.; Cremers, D. Large-scale direct SLAM for omnidirectional cameras. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 141–148.
3. Billings, S.D.; Boctor, E.M.; Taylor, R.H. Iterative Most-Likely Point Registration (IMLP): A Robust Algorithm for Computing Optimal Shape Alignment. PLoS ONE **2015**, 10, e0117688.
4. Pomerleau, F.; Colas, F.; Siegwart, R.; Magnenat, S. Comparing ICP Variants on Real-World Data Sets. Auton. Robot. **2013**, 34, 133–148.
5. Han, J.; Yin, P.; He, Y.; Gu, F. Enhanced ICP for the Registration of Large-Scale 3D Environment Models: An Experimental Study. Sensors **2016**, 16, 228.
6. Zlot, R.; Bosse, M. Efficient Large-Scale 3D Mobile Mapping and Surface Reconstruction of an Underground Mine. Field Serv. Robot. **2012**, 92, 479–493.
7. Bosse, M.; Zlot, R.; Flick, P. Zebedee: Design of a Spring-Mounted 3-D Range Sensor with Application to Mobile Mapping. IEEE Trans. Robot. **2012**, 28, 1104–1119.
8. Nüchter, A.; Lingemann, K.; Hertzberg, J.; Surmann, H. 6D SLAM—3D Mapping Outdoor Environments. J. Field Robot. **2007**, 24, 699–722.
9. Hong, S.; Ko, H.; Kim, J. VICP: Velocity updating iterative closest point algorithm. In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA), Anchorage, AK, USA, 3–8 May 2010; pp. 1893–1898.
10. Gressin, A.; Mallet, C.; Demantké, J.; David, N. Towards 3D LiDAR point cloud registration improvement using optimal neighborhood knowledge. ISPRS J. Photogramm. Remote Sens. **2013**, 79, 240–251.
11. Moosmann, F.; Stiller, C. Velodyne SLAM. In Proceedings of the IEEE Intelligent Vehicles Symposium, Baden-Baden, Germany, 5–9 June 2011; pp. 393–398.
12. Sarvrood, Y.B.; Hosseinyalamdary, S.; Gao, Y. Visual-LiDAR Odometry Aided by Reduced IMU. ISPRS Int. J. Geo-Inf. **2016**, 5.
13. Segal, A.; Hähnel, D.; Thrun, S. Generalized-ICP. In Robotics: Science and Systems; Trinkle, J., Matsuoka, Y., Castellanos, J.A., Eds.; The MIT Press: Cambridge, MA, USA, 2009.
14. Biber, P.; Straßer, W. The normal distributions transform: A new approach to laser scan matching. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 27 October–1 November 2003; pp. 2743–2748.
15. Sun, Y.X.; Li, J.L. Mapping of Rescue Environment Based on NDT Scan Matching. In Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013), Hangzhou, China, 22–23 March 2013; Volume 760, pp. 928–933.
16. Magnusson, M.; Lilienthal, A.; Duckett, T. Scan registration for autonomous mining vehicles using 3D-NDT. J. Field Robot. **2007**, 24, 803–827.
17. Einhorn, E.; Gross, H.M. Generic NDT Mapping in Dynamic Environments and Its Application for Lifelong SLAM. Robot. Auton. Syst. **2015**, 69, 28–39.
18. Zhang, J.; Singh, S. LOAM: LiDAR Odometry and Mapping in Real-time. In Proceedings of the Robotics: Science and Systems Conference (RSS), Berkeley, CA, USA, 13–15 July 2014.
19. Zhang, J.; Singh, S. Visual-LiDAR Odometry and Mapping: Low-drift, Robust, and Fast. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015.
20. Pathak, K.; Birk, A.; Vaskevicius, N.; Poppinga, J. Fast Registration Based on Noisy Planes with Unknown Correspondences for 3D Mapping. IEEE Trans. Robot. **2010**, 26, 424–441.
21. Grant, W.; Voorhies, R.; Itti, L. Finding Planes in LiDAR Point Clouds for Real-Time Registration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan, 3–7 November 2013; pp. 4347–4354.
22. Xiao, J.; Adler, B.; Zhang, J.; Zhang, H. Planar Segment Based Three-dimensional Point Cloud Registration in Outdoor Environments. J. Field Robot. **2013**, 30, 552–582.
23. Low, K.L. Linear Least-Squares Optimization for Point-to-Plane ICP Surface Registration; Technical Report 4; University of North Carolina: Chapel Hill, NC, USA, 2004.
24. Levin, D. The Approximation Power of Moving Least-Squares. Math. Comput. **1998**, 67, 1517–1531.
25. Galvez-Lopez, D.; Tardos, J.D. Bags of Binary Words for Fast Place Recognition in Image Sequences. IEEE Trans. Robot. **2012**, 28, 1188–1197.
26. Shoemake, K. Animating Rotation with Quaternion Curves. SIGGRAPH Comput. Graph. **1985**, 19, 245–254.
27. Geiger, A.; Lenz, P.; Urtasun, R. Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012.
28. Wu, C. Towards Linear-Time Incremental Structure from Motion. In Proceedings of the 2013 International Conference on 3D Vision, Sydney, Australia, 1–8 December 2013; pp. 127–134.

**Figure 1.** Two pictures of our mobile acquisition platform Vellady mounted on a kitchen cart (**right**) and a close-up (**left**). The platform consists of a Ladybug panoramic camera (mounted at the top) and a Velodyne HDL-32e LiDAR scanner (mounted at the bottom).

**Figure 2.** A 360${}^{\circ}$ point cloud acquired by the Velodyne HDL-32e LiDAR scanner, associated with the stitched Ladybug image of Figure 3, captured at the Dow Chemical company.

**Figure 3.** An image obtained by stitching the six images of the Ladybug together. The image shows the starting point of a video sequence captured at the Dow Chemical company.

**Figure 4.** Schematic drawing of the rotation performed after nullifying the offset between the sensor origins. The angle about the x-axis, i.e., the pitch angle, is computed from the output of the two-axis accelerometer $\mathbf{a}=({a}_{y},{a}_{z})$ of the Velodyne HDL-32e. In this figure, the subscript $\mathrm{L}$ denotes the coordinate system of the Ladybug, whereas $\mathrm{V}$ denotes the coordinate system of the Velodyne.
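As a worked example of this pitch computation, the angle can be recovered with a quadrant-aware arctangent of the two accelerometer readings. This is a minimal sketch; the sign convention (gravity along the positive z-axis when the sensor is level) is an assumption, not taken from the paper:

```python
import math

def pitch_from_accelerometer(a_y, a_z):
    """Pitch angle (rotation about the x-axis, in radians) from the gravity
    vector a = (a_y, a_z) measured by the two-axis accelerometer.
    Assumes gravity lies along +z when the sensor is level (pitch = 0)."""
    return math.atan2(a_y, a_z)
```

For a level sensor measuring $(0, g)$ this returns 0, while the 66${}^{\circ}$ mounting angle of the Velodyne would show up as a correspondingly rotated gravity vector.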

**Figure 5.** Overview of the problem statement. Each time a new point cloud ${\mathcal{P}}_{k}$ arrives, its sensor pose ${\mathrm{S}}_{k}$ in relation to the world reference coordinate system $\mathrm{W}={\mathrm{L}}_{0}$ is determined by concatenating the previous pose ${\mathrm{S}}_{k-1}$ and the transformation ${\mathrm{T}}_{k,k-1}$.

**Figure 6.** Overview of the system, which is implemented as an incremental process. Every time a new point cloud ${\mathcal{P}}_{k}$ arrives, it is first projected onto a 2D grid (Step 1). Next, it is analyzed to obtain low-level surface features (Step 2), after which it is aligned (registered) with the point cloud ${\mathcal{P}}_{k-1}$ acquired during the previous sweep $k-1$ (Step 3). The obtained transformation serves as an initial guess to subsequently align ${\mathcal{P}}_{k}$ with the current world model ${\mathcal{W}}_{k-1}$ (Step 4) and to fuse them together to form ${\mathcal{W}}_{k}$ (Step 5). This fusion consists of re-sampling the point cloud by means of a surface reconstruction technique. Optionally, when a loop has been detected, loop closure is performed to cope with the drift and to preserve global consistency (Step 6).

**Figure 7.** The 3D points generated by the Velodyne scanner are projected onto a 2D spherical grid. This way, we can exploit the adjacency in the 2D domain, e.g., to quickly find neighboring points in 3D.
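A minimal sketch of such a spherical projection is given below. The grid resolution and vertical field of view are illustrative values derived from the HDL-32e specifications (32 laser rings, 41.3${}^{\circ}$ vertical FOV), not the paper's actual parameters:

```python
import numpy as np

def project_to_grid(points, rows=32, cols=1080, v_fov=(-30.67, 10.67)):
    """Project an N x 3 point cloud onto a 2D spherical grid (a range image).
    rows, cols and v_fov are illustrative defaults based on the HDL-32e."""
    pts = np.asarray(points, dtype=float)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    r = np.linalg.norm(pts, axis=1)                         # range of each point
    azimuth = np.arctan2(y, x)                              # horizontal angle in [-pi, pi]
    elevation = np.degrees(np.arcsin(z / np.maximum(r, 1e-12)))
    col = ((azimuth + np.pi) / (2.0 * np.pi) * cols).astype(int) % cols
    row = (elevation - v_fov[0]) / (v_fov[1] - v_fov[0]) * (rows - 1)
    row = np.clip(np.round(row).astype(int), 0, rows - 1)
    grid = np.full((rows, cols), np.inf)                    # inf marks empty cells
    np.minimum.at(grid, (row, col), r)                      # keep the nearest point per cell
    return grid
```

Once the points live on this grid, a point's neighborhood can be gathered from adjacent cells in constant time instead of querying a 3D spatial index.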

**Figure 8.** Example of a 360${}^{\circ}$ range image obtained by projecting the point cloud of Figure 2 onto a 2D grid. Blue denotes points that are nearby, whereas red denotes points that are located farther away.

**Figure 9.** Visual representation of the low-level surface features. The values ${\sigma}_{i}=\sqrt{{\lambda}_{i}}$ represent the standard deviation along the eigenvectors. When ${\sigma}_{1}\approx {\sigma}_{2}\approx {\sigma}_{3}$, the points are volumetric, often representing scatter, such as vegetation. When ${\sigma}_{1}\gg {\sigma}_{2},{\sigma}_{3}$, the respective points lie on a line between planar surfaces or are part of thin structures, such as pipelines. Finally, when ${\sigma}_{1},{\sigma}_{2}\gg {\sigma}_{3}$, the points lie on a planar surface.
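This eigenvalue analysis can be sketched as follows: the covariance matrix of a point's neighborhood is decomposed and the ratios of the resulting standard deviations decide the class. The 0.1 ratio threshold is an illustrative assumption, not a value from the paper:

```python
import numpy as np

def surface_features(neighborhood, ratio=0.1):
    """Classify a local neighborhood (N x 3 array) as 'scatter', 'linear'
    or 'planar' from the eigenvalues of its covariance matrix, following
    the sigma_i = sqrt(lambda_i) convention of Figure 9."""
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts.T)                                   # 3 x 3 covariance
    lam = np.sort(np.linalg.eigvalsh(cov))[::-1]          # lambda_1 >= lambda_2 >= lambda_3
    s1, s2, s3 = np.sqrt(np.maximum(lam, 0.0))            # std. dev. per eigenvector
    if s1 > 0 and s2 / s1 < ratio:
        return "linear", (s1, s2, s3)                     # sigma_1 >> sigma_2, sigma_3
    if s2 > 0 and s3 / s2 < ratio:
        return "planar", (s1, s2, s3)                     # sigma_1, sigma_2 >> sigma_3
    return "scatter", (s1, s2, s3)                        # sigma_1 ~ sigma_2 ~ sigma_3
```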

**Figure 10.** Graphical representation of the ICP algorithm. In every iteration, four main steps are conducted: (1) point selection; (2) correspondence estimation; (3) weighting; and (4) transformation estimation. The transformation is updated in every iteration until convergence has been reached and the two scans are aligned.
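The four steps can be illustrated with a minimal point-to-point ICP that selects all points, matches by brute-force nearest neighbor, uses uniform weights, and solves step 4 with the closed-form SVD (Kabsch) solution. This is a generic sketch, not the paper's adapted variant, which additionally incorporates the surface features of Section 4.2:

```python
import numpy as np

def icp(source, target, iters=20):
    """Minimal point-to-point ICP following the four steps of Figure 10."""
    src = np.asarray(source, dtype=float).copy()
    tgt = np.asarray(target, dtype=float)
    R_tot, t_tot = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # (1)+(2) nearest target point for every source point (brute force)
        d = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)
        matched = tgt[np.argmin(d, axis=1)]
        # (3) weighting: uniform here (the paper weights by surface features)
        # (4) best rigid transform src -> matched via the SVD of the cross-covariance
        mu_s, mu_t = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                  # sign correction guards against reflections
        t = mu_t - R @ mu_s
        src = src @ R.T + t                 # apply the incremental update
        R_tot, t_tot = R @ R_tot, R @ t_tot + t
    d = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)
    residual = d.min(axis=1).mean()         # mean closest-point distance
    return R_tot, t_tot, residual
```

The returned residual is exactly the quantity that the loop closure step (Figure 11) reuses as a per-link confidence weight.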

**Figure 11.** Overview of the loop closure procedure. When a loop has been detected, its loop transform Δ is considered as an error and is propagated back through the pose graph. To this end, we use the residual of the minimization step (cf. Section 4.3) to assign a weight to each link in the pose graph. We assign a higher weight to those transformations that had a high residual in the previous minimization step. The idea is that a high residual indicates that two consecutive point clouds were potentially inaccurately aligned.
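A translation-only sketch of this residual-weighted error distribution is shown below; the interface is hypothetical, and the paper also corrects the rotational part (e.g., via spherical linear interpolation of quaternions):

```python
import numpy as np

def distribute_loop_error(positions, residuals, delta):
    """Distribute the loop-closure error `delta` (a 3-vector) over a chain
    of pose positions, giving each link a share proportional to the ICP
    residual of its registration step (higher residual -> larger correction)."""
    delta = np.asarray(delta, dtype=float)
    w = np.asarray(residuals, dtype=float)
    shares = np.cumsum(w / w.sum())            # fraction of delta absorbed up to link i
    corrected = [np.asarray(positions[0], dtype=float)]
    for pos, frac in zip(positions[1:], shares):
        corrected.append(np.asarray(pos, dtype=float) + frac * delta)
    return corrected
```

The first pose is kept fixed; by the end of the chain, the full error Δ has been absorbed, so the loop closes exactly while unreliable links move the most.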

**Figure 12.** Part of the ground truth 3D point cloud of the Dow chemical site acquired using a Leica static terrestrial LiDAR scanning system.

**Figure 13.** Bird's-eye view of the ground truth (black) and reconstructed (blue) point clouds aligned with each other. Visually, one can see that the walls and edges of both point clouds overlap. This is best visible in areas where the ground truth point cloud has a LiDAR shadow due to occlusion. The average distance between corresponding dominant planes (cf. Table 2) is approximately 1 cm, whereas the average deviation in angle is $0.84^{\circ}$.

**Figure 14.** 2D projection on the ground plane of both the reconstructed and the ground truth 3D model, used to evaluate the accuracy qualitatively. To obtain the image, the ground plane was removed from both point clouds, and a quantization into 2D cells was conducted. Brighter areas denote a higher point density in the cell and, hence, emphasize the overlap of both point clouds. Visually, one can notice little to no discrepancy, as the walls and edges of both models overlap.

**Figure 15.** Five plots of the trajectories estimated by the visual structure from motion (SfM) framework presented in [28] using the images of each Ladybug camera separately (red color). The result of the camera pointing upwards is omitted. The blue plot represents the trajectory estimated using our method. For the visual SfM approach, many poses are missing due to the lack of good feature matches. Moreover, many outlier poses are present.

**Figure 16.** Bird's-eye view of the obtained 3D reconstruction and trajectory of a data sequence captured at the Dow Chemical site using LiDAR data acquired by the Vellady platform. The blue line represents the estimated trajectory of the mobile observer.

**Figure 17.** Some example images of the 3D reconstructions obtained by applying our algorithm on the recordings of a campus of Ghent University.

**Figure 18.** Eleven plots of the estimated (blue) and ground truth (red) trajectories of the Kitti benchmark presented in [27]. The sequences "00" to "10" are presented from left to right and from top to bottom. The algorithm was run without the loop detection and closure step. The majority of the results are satisfactory.

**Figure 19.** Results of the Kitti vision benchmark showing the average rotation error of all test sequences. Left: the rotation error as a function of the path length. Right: the rotation error as a function of the speed.

**Figure 20.** Results of the Kitti vision benchmark showing the average translation error of all test sequences. Left: the translation error as a function of the path length. Right: the translation error as a function of the speed.

**Figure 21.** Results of the Kitti vision benchmark showing 3D reconstructions of the validation sequences "11", "13", "15" and "18" (left to right, top to bottom), for which no ground truth is provided.

**Table 1.** Specifications of the two sensors mounted on the acquisition platform.

Sensor | Features
---|---
Ladybug system | 6 cameras in a pentagonal prism, one pointing upwards; resolution of 1600 × 1200 per image; 1 frame (6 images) per second; mounted perpendicular w.r.t. the ground plane
Velodyne HDL-32e | 360${}^{\circ}$ horizontal FOV, 41.3${}^{\circ}$ vertical FOV; 32 lasers spinning at 10 sweeps per second; ±700,000 points per sweep; mounted at an angle of 66${}^{\circ}$ w.r.t. the ground plane

**Table 2.** The difference in angle between two corresponding planes, $\varphi ({H}_{s},{H}_{t})\triangleq \mathrm{acos}({\mathbf{n}}_{s}\cdot {\mathbf{n}}_{t})$ (in degrees), and the average distance between two corresponding planes, ${d}_{\mu}({H}_{s},{H}_{t})\triangleq \frac{1}{|{H}_{t}|}{\sum}_{{\mathbf{p}}_{t}\in {H}_{t}}\left(\langle {\mathbf{p}}_{t},{\mathbf{n}}_{s}\rangle -{d}_{s}\right)$ (in meters), for the eight most dominant planes in the scene. The last row reports the mean over all eight planes.

Plane | $\varphi({H}_{g},{H}_{r})$ (${}^{\circ}$) | ${d}_{\mu}({H}_{g},{H}_{r})$ (m)
---|---|---
1 | 0.759761 | 0.00818437
2 | 0.276246 | 0.00724192
3 | 1.12969 | 0.008299
4 | 0.286674 | 0.0115731
5 | 0.292083 | 0.0164202
6 | 0.492578 | 0.0106246
7 | 1.80667 | 0.0118678
8 | 1.68458 | 0.00993411
Mean | **0.841036** | **0.0105181**
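The two metrics of Table 2 can be computed directly from the fitted plane parameters. A sketch assuming unit normals $\mathbf{n}$ and plane offsets $d$ with $H=\{p:\langle p,\mathbf{n}\rangle =d\}$:

```python
import numpy as np

def plane_angle(n_s, n_t):
    """phi(H_s, H_t) = acos(n_s . n_t), in degrees, for unit normals."""
    return np.degrees(np.arccos(np.clip(np.dot(n_s, n_t), -1.0, 1.0)))

def plane_distance(points_t, n_s, d_s):
    """d_mu(H_s, H_t): mean signed distance of the points of plane H_t
    to the plane H_s = {p : <p, n_s> = d_s}."""
    pts = np.asarray(points_t, dtype=float)
    return float(np.mean(pts @ np.asarray(n_s, dtype=float) - d_s))
```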

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Vlaminck, M.; Luong, H.; Goeman, W.; Philips, W.
3D Scene Reconstruction Using Omnidirectional Vision and LiDAR: A Hybrid Approach. *Sensors* **2016**, *16*, 1923.
https://doi.org/10.3390/s16111923
