Scene Wireframes Sketching for Drones

The increasing use of autonomous UAVs inside buildings and around human-made structures demands new, accurate, and comprehensive representations of their operating environments. Most 3D scene abstraction methods rely on invariant feature point matching; however, the resulting sparse 3D point clouds do not concisely represent the structure of the environment. Likewise, line clouds built from short, redundant segments with inaccurate directions limit the understanding of scenes such as environments with poor texture, or whose texture resembles a repetitive pattern. The presented approach is based on observation and representation models that use straight line segments, which resemble the boundaries of an urban indoor or outdoor environment. The goal of this work is a complete method based on the matching of lines that provides a complementary approach to state-of-the-art methods for the 3D scene representation of poorly textured environments for future autonomous UAVs.


Introduction
The vast majority of current approaches for 3D scene reconstruction are based on point clouds. Commonly, points are matched between pairs of views based on their descriptors, then triangulated [1] to obtain an initial estimate of their location in 3D space, and finally their poses are adjusted by least squares minimization [2]. A number of efficient point detectors and descriptors have made it possible to obtain robust and detailed 3D reconstructions based on feature point clouds [3][4][5][6]. These algorithms made it possible to evolve from simple 3D reconstructions of surfaces [7] to dense point reconstructions of extensive landscapes and cities [8].
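As a concrete illustration of the triangulation step in such point-based pipelines, the following minimal sketch recovers a 3D point from two views with the standard linear (DLT) method. The function names and the synthetic camera setup are illustrative assumptions, not the implementation used in this work:

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.

    P1, P2 : 3x4 camera projection matrices.
    x1, x2 : (u, v) pixel observations of the same point.
    """
    # Each observation contributes two rows of the homogeneous system A X = 0.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The 3D point is the null vector of A, found via SVD.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    """Pinhole projection of a 3D point with camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic two-camera setup (all values illustrative).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])               # first camera at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # second camera, 1 m baseline

X_true = np.array([1.0, 2.0, 5.0])
X_est = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
```

With exact (noise-free) observations the DLT solution recovers the point up to numerical precision; with noisy matches it provides the initial estimate that the subsequent least squares adjustment refines.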
The goal of this work is to obtain a real-time three-dimensional representation of a scene using a limited number of matched straight segments. Our approach takes advantage of multi-scale line detection and matching [9] to increase the accuracy of the triangulation of line endpoints among pairs of line-matched frames. Secondly, our method goes one step further in the least squares adjustment of cameras and lines by exploiting the geometrical relationships between coplanar lines. After classifying the spatial lines according to their coplanarity, the intersections of the observed lines are brought into a second run of the Sparse Bundle Adjustment (SBA) process.
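The coplanarity classification mentioned above rests on a simple geometric fact: two 3D lines lie on a common plane exactly when the scalar triple product of the vector joining them and their two direction vectors vanishes. A minimal sketch (the function name and tolerance are illustrative assumptions, not the paper's code):

```python
import numpy as np

def segments_coplanar(seg_a, seg_b, tol=1e-9):
    """Test whether two 3D line segments lie on a common plane.

    Each segment is a pair of 3D endpoints. The test checks that the
    scalar triple product (p2 - p1) . (d1 x d2) is (numerically) zero.
    """
    p1, q1 = map(np.asarray, seg_a)
    p2, q2 = map(np.asarray, seg_b)
    d1, d2 = q1 - p1, q2 - p2          # direction vectors of the two segments
    triple = np.dot(p2 - p1, np.cross(d1, d2))
    return abs(triple) < tol

# Two segments in the z = 0 plane are coplanar...
a = ((0.0, 0, 0), (1.0, 0, 0))
b = ((0.0, 1, 0), (1.0, 2, 0))
# ...while a tilted third one is skew with respect to segment a.
c = ((0.0, 1, 1), (1.0, 2, 3))
```

In practice a tolerance well above machine precision would be needed, since the triangulated endpoints are themselves noisy.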

Materials and Methods
The first problem to solve for the computation of a 3D sketch from the matched observations is that the camera poses P are unknown. These can be estimated from the endpoint correspondences of l or from a feature-point-based SfM pipeline. The first camera is assigned the canonical pose P1 = K[I | 0], with K the calibration matrix. The remaining cameras are registered incrementally from this position in the world reference frame. Once the camera matrices for the first pair of cameras are available, a linear triangulation method [10] can be used to retrieve the first estimates of the 3D lines, i.e., the members of Γ with an observed counterpart on both camera planes. The final spatial segments are obtained as the centre of gravity of the estimates obtained from the stereo triangulations. Finally, a Bundle Adjustment is performed to optimize the relative poses of the cameras and spatial lines. The flow of the proposed method is depicted in Figure 1.
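The final adjustment step can be illustrated on the simplest possible case: refining a single triangulated point against fixed cameras by minimizing its reprojection error. This is a toy stand-in for the full Bundle Adjustment over cameras and lines described above; the synthetic setup, function names, and use of `scipy.optimize.least_squares` are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import least_squares

def project(P, X):
    """Pinhole projection of a 3D point with camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def reprojection_residuals(X, cameras, observations):
    """Stacked pixel residuals of one 3D point over all cameras."""
    res = []
    for P, (u, v) in zip(cameras, observations):
        u_hat, v_hat = project(P, X)
        res.extend([u_hat - u, v_hat - v])
    return np.asarray(res)

# Synthetic setup: first camera canonical P1 = K[I | 0], second with a 1 m baseline.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])
cameras = [P1, P2]

X_true = np.array([1.0, 2.0, 5.0])
observations = [project(P, X_true) for P in cameras]

# Start from a deliberately perturbed initial guess and refine.
X0 = X_true + np.array([0.3, -0.2, 0.5])
X_refined = least_squares(reprojection_residuals, X0,
                          args=(cameras, observations)).x
```

The actual pipeline optimizes the camera poses jointly with the line endpoints, and the second SBA run additionally constrains the intersections of coplanar lines.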

Results
Figure 1 shows the result of the proposed method employing 8 images from the public Ground Truth dataset [11]. It is compared to the results of Line3D++ [12], shown in Figure 2. The result shows that the proposed method is able to recover a number of structures of the house from a low number of images while still maintaining reasonable accuracy. On the other hand, Line3D++ [12] returns sparse, short segments. This sparsity complicates the understanding of what the spatial line cloud represents and hinders the alignment to the Ground Truth mesh. Note that this method also fails to retrieve any long segment of the house for this test case.

Conclusions
This work presents a novel integration of a set of algorithms to create a line-based spatial sketch showing the main structures of the man-made environment lying in front of a camera. It takes as input the camera's intrinsic parameters and at least 3 pictures. The set of methods includes novel observation relations between groups of straight segments captured from different poses. Quantitative results have been obtained and compared with another state-of-the-art line-based SfM method. Future work might include the exploitation of weak epipolar constraints during the line matching process.

Figure 1. Resulting line matching using the proposed method for images {5,8} of the dataset [11]. Resulting 3D abstraction and measurements of distances to the Ground Truth mesh.

Figure 2. Results with the method Line3D++ [12], using the same set of images as input.