Mobile Laser Scanned Point-Clouds for Road Object Detection and Extraction: A Review

The mobile laser scanning (MLS) technique has attracted considerable attention for providing high-density, high-accuracy, unstructured, three-dimensional (3D) geo-referenced point-cloud coverage of the road environment. Recently, there has been an increasing number of applications of MLS in the detection and extraction of urban objects. This paper presents a systematic review of the existing MLS-related literature and consists of three parts. Part 1 presents a brief overview of the state-of-the-art commercial MLS systems. Part 2 provides a detailed analysis of on-road and off-road information inventory methods, including the detection and extraction of on-road objects (e.g., road surfaces, road markings, driving lines, and road cracks) and off-road objects (e.g., pole-like objects and power lines). Part 3 presents a refined integrated analysis of challenges and future trends. Our review shows that MLS technology is well proven in urban object detection and extraction, since improvements in hardware and software have increased the efficiency and accuracy of data collection and processing. Unlike other review papers focusing on MLS applications, we review the state-of-the-art road object detection and extraction methods using MLS data and discuss their performance and applicability. The main contribution of this review is to demonstrate that MLS systems are suitable for supporting road asset inventory, intelligent transportation system (ITS)-related applications, high-definition maps, and other highly accurate localization services.


Introduction
The advancement in mobile laser scanning (MLS) technology, which integrates laser scanners, location and navigation sensors (e.g., Global Navigation Satellite Systems (GNSS) and Inertial Measurement Units (IMU)), and imagery data acquisition sensors (e.g., panoramic and digital cameras) on moving platforms, has enhanced the performance of MLS in the detection, extraction, and modeling of static urban objects [1]. These tasks using various geospatial point cloud data with different geometric

Mobile Laser Scanning: The State-of-the-Art
Since GNSS services became publicly accessible several decades ago, MLS technologies have shown great commercial potential as advanced surveying and mapping techniques for effective and rapid geospatial data collection [6]. In comparison with terrestrial laser scanning (TLS) systems, airborne laser scanning (ALS) systems, and digital satellite imaging technologies, MLS systems offer flexible mobility and a proven ability to collect highly dense point cloud data in a cost-saving and time-efficient manner.
In general terms, an MLS platform is a vehicle-mounted mobile mapping system that integrates multiple onboard sensors, including light detection and ranging (LiDAR) sensors, positioning and navigation units (e.g., a GNSS antenna, an IMU, and a distance measurement indicator (DMI)), advanced digital cameras, and a centralized computing system for data synchronization and management [7]. Together, the GNSS, the IMU, and the DMI constitute a Position and Orientation System (POS) for mobile mapping applications. By using profiling scanning techniques and detecting the intensity of laser pulses reflected from target surfaces, MLS systems can directly support localization, navigation, object detection, accurate perception, object tracking, and three-dimensional (3D) urban digital modeling missions. The measurement range is calculated from the speed of light and the time of flight of the laser pulses. Accordingly, highly accurate, geo-referenced 3D coordinates of MLS point clouds are derived from the angular measurement, the range measurement, and the position and orientation information [8]. In addition, the MLS data acquisition rate is determined by the deflection pattern of the scanning sensor and the laser pulse repetition rate. Currently, the highest-end commercial MLS systems (e.g., the RIEGL VMX-2HA MLS system) provide 500 scan lines and two million measurements per second, which allows the collection of high-density 3D point clouds for a wide range of applications, including city modeling, tunnel surveying, intelligent transportation systems (ITS), architecture and façade measurement, and civil engineering.
In order to fully describe the feasibility of MLS systems for urban object detection and extraction, this part details (1) the components of MLS systems, (2) direct geo-referencing, and (3) error analysis.

Components of MLS System
MLS systems acquire point cloud data of urban road networks and of objects on both roadsides using multiple sensors onboard moving vehicles. Figure 2 presents the crucial subsystems of a Trimble MX-9 MLS system and a RIEGL VMX-2HA MLS system, respectively.
(1) Laser scanner. Laser scanners continuously emit laser pulses at a near-infrared wavelength toward the surfaces of specified targets and digitize the backscattered signals. Then, based on the light travel time, the distances between the scanners and the scanned targets are calculated to obtain coordinates and intensity information. Currently, two state-of-the-art techniques are applied to range measurement in the majority of MLS systems: time of flight (TOF) and phase shift [9].
In a given medium, laser pulses travel at a finite and constant velocity [9]. Therefore, the range between the laser sensor and the target can be determined from the time of flight (also known as the time delay) between the transmission of a light pulse and the return of its backscatter from the target [10]. Most commercial MLS systems are equipped with TOF scanners, since such scanners provide a longer measurement range than phase-based scanners [11]. In contrast to TOF scanners, phase-based scanners emit continuous amplitude-modulated waves, and the distance between the MLS system and the target is calculated from the phase shift between the transmitted and received light beams.
(2) Digital camera. Advances in the development of digital cameras have driven the applicability of MLS systems [12]. In order to provide visual coverage, most commercial MLS systems are integrated with high-end digital cameras, so that point clouds can be realistically rendered using optical color imagery. Additionally, stable image geometry provides detailed information that enhances the reliability and applicability of point cloud data [13]. Therefore, most optical camera manufacturers focus on the research and development of customized digital cameras that meet the specific application requirements of MLS systems. For instance, RIEGL VMX-2HA MLS systems are equipped with nine digital cameras to capture rich information (e.g., color and texture) about targets. Since most laser scanners provide the reflectivity at the laser wavelength as texture information, while color imagery lacks a coordinate reference, the two types of data can be combined to generate colorized point clouds, which are widely applied in elaborate visualization and 3D city model construction.
However, digital cameras play a secondary role, being used mainly for visualization, while the laser scanners integrated into MLS systems are the primary source of precisely georeferenced data [14]. Additionally, perspective distortions of images are caused by changes in imaging distances and angles while the vehicle-based MLS system is moving. Moreover, it is challenging for manufacturers of MLS systems to select a digital camera of suitable format (e.g., a medium-format digital camera) for both the point density and the driving speed of such systems.
(3) Global navigation satellite system and orientation system. As mentioned before, the GNSS, the IMU, and the DMI (or odometer) constitute a POS, which is used to determine the position of the MLS system. The GNSS provides highly precise positioning information, with up to centimeter accuracy, and offers two additional observations: velocity and time. However, the positioning accuracy declines because of multipath propagation and signal blockage caused by obstacles (e.g., tall trees, buildings, and tunnels), for instance, when a GNSS receiver is surrounded by buildings or located under tree canopies. To mitigate multipath effects and to overcome GNSS signal loss, an IMU is used to provide the instantaneous position, velocity, and attitude of the MLS system. Typically, an IMU can interpolate stable and continuous positioning information during GNSS dropouts [11]. An IMU contains three gyroscopes and three accelerometers, which measure three-axis angular rotations and three-axis accelerations. Since the positioning and orientation precision of an IMU decreases as the measurement time increases, combining GNSS and IMU improves positioning accuracy [8]. Consequently, the IMU provides relative position and orientation information during periods of weak satellite signals, while the GNSS supplies continuous positioning information to the IMU. Furthermore, a DMI, which is installed on a wheel of the vehicle and connected by a data transmission cable, provides observations that constrain error drift, especially in urban areas with unstable GNSS signals.
(4) Central control system. A highly integrated control system, a sophisticated software unit, is designed to control all sensors (e.g., laser scanners and digital cameras), synchronize the data from these sensors, and store the trajectory information obtained from the POS.
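
To make the two ranging principles described under (1) concrete, the following minimal sketch computes a range from a round-trip travel time (TOF) and from a measured phase difference (phase shift). The function names, the use of the vacuum speed of light, the 10 MHz modulation frequency, and the `whole_cycles` ambiguity parameter are illustrative assumptions and are not taken from any scanner specification.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s); in air it is marginally lower


def tof_range(round_trip_time_s: float) -> float:
    """Range from the two-way travel time of a laser pulse: d = c * t / 2."""
    return C * round_trip_time_s / 2.0


def phase_shift_range(phase_rad: float, modulation_freq_hz: float,
                      whole_cycles: int = 0) -> float:
    """Range from the phase difference of an amplitude-modulated wave.

    d = (c / (2 f)) * (N + phase / (2 pi)), where N is the (assumed known)
    number of whole modulation cycles, i.e. the range ambiguity.
    """
    unambiguous_interval = C / (2.0 * modulation_freq_hz)
    return unambiguous_interval * (whole_cycles + phase_rad / (2.0 * math.pi))


# A pulse returning after 1 microsecond corresponds to roughly 150 m.
print(f"TOF range: {tof_range(1e-6):.3f} m")
# A phase shift of pi at 10 MHz is half of the ~15 m unambiguous interval.
print(f"Phase-shift range: {phase_shift_range(math.pi, 10e6):.3f} m")
```

The sketch also hints at why TOF scanners reach farther: with a single modulation frequency, a phase-based range is only unambiguous within c / (2 f).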

Direct Geo-Referencing
The mechanism of direct geo-referencing is illustrated in Figure 3. Given the scanning angle α and the scanning range d of a point P, its position is determined in the laser scanner coordinate system. This position can then be transformed from the laser scanner coordinate system into the mapping frame. Table 1 lists the parameters involved in the direct geo-referencing transformation, and the position vector of target P is calculated using Equation (1) [14]. In Table 1, [X_P, Y_P, Z_P]^T is the position of target P in the mapping system; [X_GNSS, Y_GNSS, Z_GNSS]^T is the position of the GNSS antenna in the same mapping system; ω, φ, κ are the roll, pitch, and yaw of the IMU in the mapping coordinate system; ∆ω, ∆φ, ∆κ are the boresight angles that bring the scanners into correspondence with the IMU; α and d denote the incident angle and range of the laser pulses; and the other parameters are determined via system calibration.

Table 1. Parameters used in the direct geo-referencing transformation.

| Parameters | Representation | Source |
|---|---|---|
| (X_P, Y_P, Z_P) | Coordinates of the laser point P in the mapping system | Mapping frame |
| Rotation (ω, φ, κ) | Rotation information aligning the mapping frame with the IMU | IMU |
| R_S^IMU (∆ω, ∆φ, ∆κ) | Transformation parameters from the laser scanner to the IMU coordinate system | System calibration |
| (α, d) | Relative position information of point P in the laser sensor coordinate system | Laser scanners |
| Lever-arm offset (IMU to laser scanner) | Offsets between the IMU origin and the laser scanner origin | System calibration |
| Lever-arm offset (GNSS to IMU) | Offsets between the GNSS origin and the IMU origin | System calibration |
| (X_GNSS, Y_GNSS, Z_GNSS) | Position of the GNSS sensor in the mapping frame | GNSS antenna |
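
Since the transformation combines all of the quantities listed in Table 1, a small numerical sketch may help. It assumes the commonly used form P = GNSS position + R(ω, φ, κ) · (R(∆ω, ∆φ, ∆κ) · r(α, d) + scanner lever arm − GNSS lever arm), with both lever arms expressed in the IMU frame, a Z-Y-X rotation order, and a scan profile lying in the scanner's x-y plane. These conventions, like all function and variable names, are assumptions made for illustration; the exact form of Equation (1) and the conventions of a particular MLS system may differ.

```python
import math


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation(roll, pitch, yaw):
    # Assumed rotation order: R = Rz(yaw) * Ry(pitch) * Rx(roll).
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))


def georeference(alpha, d, boresight, lever_scanner, lever_gnss,
                 attitude, gnss_position):
    """Map one laser measurement (alpha, d) into the mapping frame."""
    # Point in the scanner frame (assumed: profile in the scanner x-y plane).
    r_scanner = [d * math.cos(alpha), d * math.sin(alpha), 0.0]
    # Scanner frame -> IMU frame: boresight rotation plus lever arms.
    p_imu = matvec(rotation(*boresight), r_scanner)
    p_imu = [p_imu[i] + lever_scanner[i] - lever_gnss[i] for i in range(3)]
    # IMU frame -> mapping frame: attitude rotation plus GNSS position.
    p_map = matvec(rotation(*attitude), p_imu)
    return [p_map[i] + gnss_position[i] for i in range(3)]


# Hypothetical values: vehicle heading rotated by 90 degrees, GNSS antenna
# mounted 1 m above the IMU, and a target 10 m away along the scanner x-axis.
point = georeference(alpha=0.0, d=10.0,
                     boresight=(0.0, 0.0, 0.0),
                     lever_scanner=(0.0, 0.0, 0.0),
                     lever_gnss=(0.0, 0.0, 1.0),
                     attitude=(0.0, 0.0, math.pi / 2),
                     gnss_position=(100.0, 200.0, 50.0))
print([round(c, 3) for c in point])
```

In this sketch, an error in any single input (range, angle, attitude, boresight, or lever arm) propagates directly into the mapped coordinates, which is the motivation for the error analysis that follows.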

Error Analysis for MLS Systems
As shown in Equation (1), the relationship among the observation parameters is defined to generate directly geo-referenced point cloud data. Thus, typical errors in these parameters reduce the accuracy of data acquisition and correction.
(1) Laser scanning errors. Based on Equation (1), the data acquisition accuracy is affected by the ranging error and the incident angle error. The ranging error arises from system errors and from uncertainty in the time intervals used to measure the TOF and the width of the output laser pulses. Moreover, the incident angle error arises from the angular resolution and the uncertainty in laser beam divergence [15].
(2) IMU errors. An IMU in an MLS system measures the roll, pitch, and yaw that define the rotation matrix between the IMU and the mapping coordinate system. An IMU mainly includes two components: (1) three orthogonal accelerometers that measure accelerations along three axes, from which changes in position are derived; and (2) three orthogonal gyroscopes that maintain an absolute angular reference according to the principle of conservation of angular momentum. Typically, common IMU errors include accelerometer biases and gyroscope drifts, such as gravity misalignment, scale factor errors, and random walk (sensor noise). Generally, the IMU accuracy is given in the manufacturer's technical specification. For instance, an Applanix POS LV 520, which is configured within a RIEGL VMX-2HA MLS system, provides 0.005° in both roll and pitch angles and 0.015° in yaw angle within one standard deviation.
(3) Localization errors. The localization accuracy of GNSS systems is affected by many factors, including multipath effects, atmospheric errors, receiver errors, and baseline length [16]. Thus, determining the absolute localization accuracy of a GNSS measurement is always challenging. By using differential GNSS and real-time kinematic (RTK) methods, the localization accuracy is often expected to be 1 cm + 1 ppm horizontally and 2 cm + 1 ppm vertically for a short kinematic baseline with local GNSS reference stations.
(4) Lever-arm offset errors. Highly precise geo-referenced MLS data can be obtained only if the lever-arm offsets are known. Calibration and physical measurement are the two methods commonly applied to determine such offsets. The first approach is not extensively applied because it is more challenging to implement than the second. However, physical measurement errors can arise from the assumption that the axes of the two sensors are aligned [8].
The discussion of error sources above shows that the overall performance of an MLS system relies on the accuracy of the positions delivered by the laser scanners and the GNSS/IMU subsystem. Lever-arm offset errors can be effectively eliminated by appropriate system calibration and measurement. Moreover, the navigation solution affects the overall accuracy of MLS data collection. Multipath propagation effects and GNSS signal loss caused by trees and buildings along the roadsides can decrease GNSS localization accuracy [16]. Accordingly, the trajectory data are postprocessed to increase the accuracy of MLS data.
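
The "1 cm + 1 ppm" notation used for RTK accuracy under (3) denotes a constant part plus a part that grows by 1 mm for every kilometre of baseline. The short sketch below evaluates this for a 10 km baseline; the function name and the baseline length are illustrative assumptions.

```python
def rtk_accuracy_m(constant_m: float, ppm: float, baseline_m: float) -> float:
    """Expected accuracy (m): constant part plus ppm of the baseline length."""
    return constant_m + ppm * 1e-6 * baseline_m


baseline_m = 10_000.0  # assumed 10 km baseline to the reference station
horizontal = rtk_accuracy_m(0.01, 1.0, baseline_m)  # 1 cm + 1 ppm
vertical = rtk_accuracy_m(0.02, 1.0, baseline_m)    # 2 cm + 1 ppm
print(f"horizontal: {horizontal * 100:.1f} cm, vertical: {vertical * 100:.1f} cm")
# horizontal: 2.0 cm, vertical: 3.0 cm
```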

Introduction of Several Commercial MLS Systems
MLS systems are attracting increasing worldwide attention. To date, based on the design and development of MLS-related technologies (e.g., laser scanning, positioning devices, and digital imaging), many MLS systems have come onto the market. Manufacturers in the surveying and digital mapping industries worldwide (e.g., Leica, RIEGL, and Velodyne) are well established as suppliers of laser scanners or MLS systems. Table 2 summarizes several commercially available MLS systems, and Figure 4 illustrates their configurations.
As illustrated in Table 2, the Faro Focus3D laser scanner integrated within a Road-Scanner C MLS system provides the most accurate mobile measurements by using the phase shift measuring technique, while the SICK LMS 291 laser scanners have the lowest measurement accuracy, with an absolute measurement accuracy of 35 mm. Furthermore, RIEGL series and Trimble series laser scanners can achieve measurement ranges over 420 m because they employ the TOF measuring technique. In particular, the scanning range of a RIEGL VQ-450 scanner is approximately 800 m. Additionally, RIEGL VMX-2HA, VQ-450, Trimble MX-9, and Lynx HS-600 scanners can operate with a full 360° field of view, while SICK LMS 291 and Faro Focus 350 scanners provide less than 360° coverage. Besides, system portability is an important factor when suppliers design and develop their MLS systems, since it enables customers to conduct surveys conveniently. An IP-S3 MLS system is compact and highly integrated within a small case for easy installation on the roofs of various motorized vehicles. Similarly, the VMX-2HA, VMX-450, Trimble MX-9, and Lynx SG are relatively easy to move, although they have a portable control unit mounted inside the trunk. However, the portability of the Road-Scanner C MLS system is hindered by its large control unit.
Moreover, point density is a significant attribute in many applications of MLS point clouds, including road information inspection, landslide detection, deformation monitoring, and forest health inspection. The point densities of different laser scanners are determined by the scan rate and the driving speed of the specified MLS platform [11]. For instance, a RIEGL VQ-450 scanner is able to acquire highly dense point clouds of over 8000 points/m² at a driving speed of 60 km/h. Moreover, point density increases as the measurement range decreases. As shown in Figure 4, all MLS systems can provide very high point cloud densities within short measuring distances (e.g., the distance to roadside buildings). Accordingly, the RIEGL VQ-450 and VUX-1HA and the Optech Lynx HS-600 Dual laser scanners can produce relatively large data volumes compared with the SICK LMS 291 scanners integrated within Topcon's IP-S3 Compact+ system.
Meanwhile, driving speed has a great impact on point density. For this reason, users are able to obtain the desired point density by adjusting the driving speed and the incremental angle. For instance, for a Trimble MX-9 MLS system, the scan rate can reach 500 profiles per second from a vehicle moving at regular traffic speeds. When compared with the other laser scanners listed in Table 2, the RIEGL VMX-2HA and Trimble MX-9 systems provide the best point density specifications. Such high-density point clouds enable a wide variety of applications of MLS systems for monitoring and detecting road conditions, especially in complex urban road environments.
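
The dependence of point density on driving speed, scan rate, and range can be illustrated with a simplified geometric sketch. It assumes a single scanner with a full 360° profile, a surface perpendicular to the laser beam, and inputs borrowed from the figures quoted in the text (500 profiles per second and two million measurements per second, treated here as if they came from one scanner). The function names and the 5 m range are illustrative assumptions, so the resulting numbers are indicative only and are not manufacturer specifications.

```python
import math


def along_track_spacing_m(speed_kmh: float, scan_lines_per_s: float) -> float:
    """Distance the vehicle travels between two consecutive scan profiles."""
    return (speed_kmh / 3.6) / scan_lines_per_s


def across_track_spacing_m(range_m: float, pulses_per_s: float,
                           scan_lines_per_s: float) -> float:
    """Spacing of neighbouring points within one profile at a given range."""
    points_per_line = pulses_per_s / scan_lines_per_s
    angular_step_rad = 2.0 * math.pi / points_per_line  # assumes a 360° profile
    return range_m * angular_step_rad


def point_density_per_m2(speed_kmh: float, scan_lines_per_s: float,
                         pulses_per_s: float, range_m: float) -> float:
    """Approximate density on a surface perpendicular to the beam."""
    along = along_track_spacing_m(speed_kmh, scan_lines_per_s)
    across = across_track_spacing_m(range_m, pulses_per_s, scan_lines_per_s)
    return 1.0 / (along * across)


for speed in (30.0, 60.0, 90.0):
    density = point_density_per_m2(speed, 500.0, 2_000_000.0, range_m=5.0)
    spacing_cm = along_track_spacing_m(speed, 500.0) * 100.0
    print(f"{speed:.0f} km/h: profile spacing {spacing_cm:.1f} cm, "
          f"about {density:.0f} points/m² at 5 m")
```

Under these assumptions, halving the driving speed doubles the density, and doubling the range halves it, which matches the qualitative statements above.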

Open Access MLS Datasets
To further expand the applications of MLS technologies, more and more national governments, research institutions, and universities are releasing and sharing their MLS datasets free to the public. The Robotics Laboratory (CAOR) at MINES ParisTech, France, released the Paris-rue-Madame dataset [22], which contains MLS point clouds of an approximately 160 m long urban street. This dataset, which was acquired by the LARA2-3D MLS system, has been classified and segmented, which makes point-wise evaluation of MLS point-based detection, segmentation, and classification methods possible. More recently, MINES ParisTech also made the Paris-Lille-3D dataset [23] publicly available. This dataset has been entirely labeled with 50 object classes to assist the related research communities in developing automated point cloud detection and classification methods. The TUM City Campus MLS point cloud dataset [24] was collected by Fraunhofer IOSB using two Velodyne HDL-64 laser scanners. The following classes have been assigned and labeled: unlabeled, high vegetation, low vegetation, natural terrain, artificial terrain, building, hardscape, artefact, and cars. This urban street dataset contains 1.7 billion points, occupying over 62 GB of storage, and is suitable for the development of methods for 3D object detection, SLAM-based navigation, and transportation-related applications, including ITS and city modeling.
Moreover, the Perceptual Robotics Lab (PeRL) at the University of Michigan provides the Ford Campus Vision and LiDAR Data Set [25], which was acquired around the Ford Research campus and downtown Dearborn, Michigan, by a pickup truck-based autonomous vehicle integrated with high-end RIEGL LiDAR sensors. This dataset is approximately 100 GB in size and contains high-density point clouds to support research on real-time perceptual sensing, autonomous navigation, and high-definition maps for autonomous vehicles in previously unknown traffic environments. Furthermore, the 147.4 km long North Campus Long-Term Vision and LiDAR Dataset [26], collected by the PeRL, is publicly available to facilitate research and development focusing on long-term autonomous driving in dynamic urban street environments.
Based on a wagon-based MLS system with high-resolution color and grayscale video cameras and Velodyne laser scanners, the KITTI Vision Benchmark Suite [27] project, conducted by the Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago, released massive image and point cloud datasets and benchmarks. These datasets were collected in urban areas, in rural areas, and on highways around the city of Karlsruhe, and provide detailed road information (e.g., cars and pedestrians) to support computer vision tasks, such as 3D tracking and 3D object detection. Meanwhile, the Cityscapes Datasets [28], which focus on the semantic understanding of urban street scenes, were made publicly accessible, covering 50 cities and 30 classes. More recently, the Large-scale Point Cloud Classification Benchmark dataset [29], a classification benchmark with eight class labels, was released to boost applications in robotics, augmented reality, and urban planning. These datasets contain fine details of urban streets and provide labeled 3D point clouds with over four billion points in total. These publicly accessible MLS datasets boost the development of city modeling, transportation engineering, autonomous driving, and high-definition navigation maps.

On-Road Information Extraction
The traffic information and inventory on the road, including road surfaces, road markings, and manholes, are significant and necessary elements in the geometric design of roadways, providing guidance, warnings, and prohibitions for all road users (e.g., drivers, cyclists, and pedestrians). According to [8], MLS data can be utilized for accurately detecting and extracting various types of on-road object inventory, which are mainly classified into five categories: road surfaces, road markings, driving lines, road cracks, and road manholes. This section contains a systematic and comprehensive literature review of the background and studies related to urban on-road object detection and extraction using MLS point clouds. Figure 5 shows the applications and related experimental results of on-road information extraction using MLS point clouds.

Road Surface Extraction
Road surfaces, as an important urban road design structure, play an important role in ITS development, urban planning, 3D digital city modeling, and high-definition roadmap generation [30]. The relatively large data volumes obtained by MLS systems in an urban environment consist of different kinds of objects, including roads, buildings, pole-like objects (e.g., trees, traffic signs, and poles), cars, and pedestrians. In general, efficient detection and accurate extraction of road surfaces are the prerequisites for the extraction and classification of other on-road inventories (e.g., road manholes) [31].
Road surface extraction aims to reduce the data volume and to make subsequent processing straightforward and reliable [7]. Furthermore, filtering out off-ground point clouds can minimize the computational load and save memory space. A variety of methods and algorithms have been developed for road surface detection and extraction using MLS data. According to the data format used, such methods are mainly categorized into three groups: 3D point-based extraction, two-dimensional (2D) feature image-based extraction, and other data source-based extraction [8]. Figure 6 presents several road surface extraction algorithms and a few representative experimental results.

3D Point-Based Extraction
Road surface extraction is conducted using either MLS point clouds directly or 2D feature imagery derived from 3D point clouds. The majority of algorithms extracted the pavement directly, while the others first extracted road edges or road boundaries [32–34]. By applying the derivatives of the Gaussian function to MLS point clouds, Kumar et al. [32] effectively extracted road edges. Based on the parametric active contour and snake models, they implemented a revised automated method for road edge extraction from MLS data [33]. Additionally, Zai et al. [34] proposed a supervoxel generation algorithm that automatically extracted road boundaries and pavement surfaces from MLS point clouds. Wang et al. [35] developed a computationally efficient approach that integrates the Hough forest framework with supervoxels to remove pavement from raw 3D mobile point clouds for further object detection. By calculating angular distances to the ground normals, Hervieu and Soheilian [36] effectively extracted road edges and road surfaces from MLS point clouds. Hata et al. [37] extracted ground surfaces by applying various filters, including a differential filter and a regression filter, to 3D MLS points. Guan et al. [38] proposed a curb-based road extraction method that separates pavement surfaces from roadsides under the assumption that road curbstones represent the boundaries of the pavement. Xu et al. [39] proposed an automated road curb extraction method using MLS point clouds. First, candidate points of pavement curbs were extracted by an energy function. Then, these candidate points were refined based on the proposed least-cost path model. Riveiro et al. [40] implemented road segmentation using a curvature analysis derived directly from MLS point clouds. Additionally, a voxel-based upward growing method was performed for direct ground point segmentation from entire MLS point clouds [7].
In order to enhance computational efficiency, trajectory data were commonly used [31,38,41]. Wu et al. [42] proposed a step-wise method for off-ground point removal. Firstly, raw point clouds were vertically partitioned along the vehicle's trajectory. Then, the Random Sample Consensus (RANSAC) method was employed for ground point extraction by determining the average height of ground points.
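
As an illustration of the RANSAC step, the following minimal sketch fits a plane to one block of points and reports the average height of the points labeled as ground. It is a generic RANSAC plane fit written for this review, not the implementation of Wu et al. [42]; the partitioning along the trajectory is omitted, and the distance threshold, iteration count, random seed, and synthetic test scene are all assumptions.

```python
import math
import random


def plane_from_points(p, q, r):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        return None
    a, b, c = (comp / length for comp in n)
    return a, b, c, -(a * p[0] + b * p[1] + c * p[2])


def ransac_ground(points, threshold=0.05, iterations=200, seed=0):
    """Indices of the points within `threshold` metres of the best-supported plane."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate (collinear) sample
        a, b, c, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best


# Synthetic block: a flat 10 x 10 ground patch plus a vertical pole.
ground = [(0.5 * i, 0.5 * j, 0.0) for i in range(10) for j in range(10)]
pole = [(2.0, 2.0, 0.5 + 0.1 * k) for k in range(20)]
points = ground + pole

ground_idx = ransac_ground(points)
mean_height = sum(points[i][2] for i in ground_idx) / len(ground_idx)
print(f"{len(ground_idx)} ground points, mean height {mean_height:.2f} m")
```

Applying such a fit per trajectory-aligned block, rather than to the whole scene, keeps the planar assumption reasonable on roads with slopes or curvature.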
As mentioned, MLS systems collect data along multiple scan lines. The spatial configuration of the scan lines depends on the user-defined parameters of a specific MLS system, such as driving speed, sensor trajectory, and sensor orientation. Consequently, processing the high-density pavement points scan line by scan line decreases the computational complexity of pavement segmentation [38,43–45]. Based on scan lines, Cabo et al. [43] proposed a method that relies on line cloud grouping from MLS point clouds. The input point clouds were first organized into lines covering the pavement. Subsequently, these lines were clustered based on a group of quasi-planar restrictions. Finally, road edges were extracted from the end points of the lines. Scan line-based methods have been demonstrated to be efficient, but they still face challenges in processing unordered MLS data that lack the sequential characteristics of scan lines [30].
Furthermore, some studies focus on directly extracting road surfaces from MLS data according to data characteristics and road properties, including point intensity, road width, elevation, and smoothness [32,46,47]. According to the MLS system mechanism, point density is negatively correlated with the range from the onboard laser scanners. Kumar et al. [32] filtered out off-ground points from raw MLS point clouds by taking point densities into consideration. Guo et al. [48] removed non-road surface points by combining road width and elevation data. Other valuable data characteristics and road properties include local point patterns, laser pulse information, and height-related data (e.g., slope, mean height, and height variance). In many existing methods, certain characteristics of MLS point clouds and road properties were combined to segment pavement points. Such methods extracted road surfaces at either local or global scales, but they were computationally intensive and time-consuming.
Some studies concentrated on a single type of road object, while others classified the whole 3D point-based scene into different types of objects, including traffic lights, roadside trees, building facades, and road pavement [41,46,49,50]. By considering shape, size, orientation, and topology, Pu et al. [41] automatically extracted road surfaces, ground objects, and off-ground objects within a 3D point-based scene. Fan et al. [46] classified point clouds into six groups (roads, cars, trees, traffic signs, poles, and buildings) based on point density, height histograms, and fine geometric details. Moreover, Díaz-Vilariño et al. [50] proposed a method using roughness descriptors to automatically segment and classify asphalt pavements from MLS point clouds. Although these studies indicated promising solutions for road surface extraction from MLS point clouds, other data sources, including optical RGB imagery, are of vital importance for classifying more types of roads and detecting defects on pavements.

2D Feature Image-Based Extraction
Converting 3D MLS point clouds into 2D geo-referenced feature (GRF) images reduces the computational complexity of road surface extraction. By using existing computer vision and image processing methods, road boundaries and pavements can be efficiently detected and extracted [51]. With pavement segmentation of range scan lines implemented in 2D, Cabo et al. [43] conducted an experiment to differentiate road inventories (e.g., trees, high-rise buildings, and road surfaces) based on height deviation. In order to minimize the computational effort, Kumar et al. [52] proposed a 2D approach that projects the 3D point clouds onto the XOY plane and assigns them to a lattice of adjacent squares covering the whole 3D point scene. Subsequently, the generated 2D elevation image was processed to detect road curbs by using active contour models. Riveiro et al. [40] first projected the entire MLS point clouds onto a 2D space. Principal component analysis (PCA) was then employed to detect peaks (e.g., altitude and deflection angle) denoting the transversal limits of the road. Under the assumptions that road surfaces are large planes at a certain distance from the trajectory of the MLS system and that the normal vectors of road surfaces are approximately parallel to the Z-axis, Yang et al. [53] segmented road surfaces by generating GRF images to filter out off-ground objects from the raw MLS point clouds.
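
The projection of 3D points onto a lattice in the XOY plane can be sketched as follows. This is a generic rasterization written for illustration, not the implementation of Kumar et al. [52] or Yang et al. [53]; the choice of storing the maximum elevation per cell, the cell size, and the function name are assumptions.

```python
def rasterize_elevation(points, cell_size):
    """Project (x, y, z) points onto the XOY plane and keep the highest z per cell.

    Returns a row-major grid (rows follow y, columns follow x); empty cells are None.
    """
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    max_x = max(p[0] for p in points)
    max_y = max(p[1] for p in points)
    cols = int((max_x - min_x) // cell_size) + 1
    rows = int((max_y - min_y) // cell_size) + 1
    grid = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        col = int((x - min_x) // cell_size)
        row = int((y - min_y) // cell_size)
        if grid[row][col] is None or z > grid[row][col]:
            grid[row][col] = z
    return grid


# Four hypothetical points rasterized with 1 m cells.
points = [(0.0, 0.0, 1.0), (0.4, 0.4, 2.0), (1.2, 0.1, 3.0), (0.1, 1.9, 4.0)]
for row in rasterize_elevation(points, cell_size=1.0):
    print(row)
```

Once the points are stored in such a grid, standard image processing operators can be applied to it, at the cost of losing the vertical structure within each cell; this loss is one reason why steep terrain is difficult for 2D approaches.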
Although studies on road surface extraction from MLS point clouds using image processing algorithms have been pursued for years, fully automated road surface extraction is still a huge challenge. Additionally, it is still difficult to deal with steep terrain environments in 2D GRF images.

Other Data-Based Extraction
To overcome the limitations of MLS point clouds (e.g., data occlusion and unorganized data distribution), other data sources, including high-resolution satellite imagery, unmanned aerial vehicle (UAV) imagery, ALS data, and TLS data, were integrated with MLS point clouds. Choi et al. [54] created detailed 3D geometric models of road surfaces based on the fusion of MLS point clouds, video data, and scanning profiles. Jaakkola et al. [49] applied image processing algorithms to intensity and height images for road curbstone segmentation, and the road surfaces were modeled as triangulated irregular networks (TIN) using 3D point clouds. With the assistance of such data, the extraction accuracy can be boosted by the additional pavement texture information. Boyko and Funkhouser [55] proposed a method that combines multiple aerial scans with highly dense TLS point clouds of urban road environments to extract road surfaces completely. With the assistance of ALS data, Zhou and Vosselman [56] fitted a sigmoid function to the 3D points near the detected curbstones and then adapted the approach to the processing of MLS data. Similarly, Cavegn and Haala [57] proposed a method to efficiently extract pavement surfaces based on TLS points, images, and scaled numeric maps.
Combining other data sources with MLS point clouds enables both accuracy and correctness enhancement of road surface extraction.However, different input data might be collected by different sensors at multiple times, under different weather conditions, with different densities and resolutions, and under various illuminations, it makes the data registration, data calibration, and data fusion challenging.
Table 3 is a summary of three categories of road surface extraction methods. Although the existing image processing algorithms can be efficiently performed on 2D GRF imagery derived from 3D MLS point clouds [44], it is challenging to deal with steep terrain environments. Thus, the accuracy of pavement extraction results can be improved based on the additional detailed road information supported by other data sources. When compared to 2D GRF image-based road surface extraction, 3D point-based methods are capable of extracting road surfaces directly from MLS data, and segmenting road surfaces in the local-global scales [35,38,58]. However, these methods are mostly computationally intensive and time-consuming.

… Second, use an octree data structure to organize the voxels in each grid. Third, each voxel grows upward to its nine neighbors, then continues to search upward through their corresponding neighbors, and terminates when no more voxels can be found. Finally, the voxel with the highest elevation is compared with a threshold to label it as ground or off-ground. [7,59-63]
• Adaptive to large scenes with strong ground fluctuations.
• May not function well for data with a lot of noise.

Ground segmentation [63,64]. First, voxelize the input data into blocks. Second, cluster neighboring voxels whose vertical mean and variance differences are less than given thresholds and choose the largest partition as the ground. Third, select ground seeds using the trajectory. Finally, use the K-Nearest Neighbor algorithm to determine the closest voxel for each point in the trajectory.
• Makes the region growing process faster.
• Suitable for simple scenes.
• Implemented at the global scale.

Trajectory-based filter [41,65]. First, use the trajectory data and the distance to filter the original data. Second, rasterize the filtered data onto the horizontal plane and then compute different features of the point cloud in each voxel. The raster data structure is visualized as an intensity image. Finally, binarize the image using different features and run an AND operation between the binary images to filter the ground points.
• Fit for data with high intensity.
• May cause over-segmentation due to rasterization.

Terrain filter [66]. First, partition the input data into grids along the horizontal XY-plane. Second, determine a representative point based on the given percentile, then use it and the representative points of its neighboring cells to estimate a local plane. Third, define a rectangular box with a known distance to the estimated plane. Points located outside the box are labeled as "off-terrain" points. The box is partitioned into four small groups. Repeat Steps 1, 2, and 3 until the termination condition is met. Finally, apply Euclidean clustering to group the off-terrain points into clusters.
• Suitable for complex scenes.
• Time-consuming.
• Heavy computational burden.

Voxel-based ground removal method [67,68]. First, the point clouds are vertically segmented into voxels with equal length and width on the XOY plane. Then, the ground points are removed by an elevation threshold in each voxel.
• Fit for scenes with small fluctuations.
• Simple and time-saving.

RANSAC-based filter. First, divide the entire point cloud into partitions along the trajectory and use plane fitting to estimate the average height of the ground. Second, iteratively apply the RANSAC algorithm to fit a plane. The iteration terminates when the elevation of one point is higher than the mean elevation or the number of points in the plane generated by RANSAC remains unchanged.
• Quick and effective.
• Not suitable for complex scenes.
• Cannot work for scenes with multiple planes.
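As a rough illustration of the simplest entry above, the voxel-based ground removal method, the following sketch partitions points into vertical columns on the XOY plane and removes points close to the lowest point of each column. The function name `remove_ground`, its parameters, and the use of the column minimum as the local ground reference are assumptions for illustration, not the published implementation.

```python
def remove_ground(points, cell_size, height_threshold):
    """Return the off-ground points of a list of (x, y, z) tuples.

    Points are grouped into vertical columns with equal length and width
    on the XOY plane. Assumption: within a column, every point less than
    `height_threshold` above the lowest point is treated as ground.
    """
    columns = {}
    for p in points:
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        columns.setdefault(key, []).append(p)

    off_ground = []
    for column in columns.values():
        z_min = min(p[2] for p in column)
        off_ground.extend(p for p in column if p[2] - z_min > height_threshold)
    return off_ground


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.05), (0.2, 0.2, 2.0),
             (5.0, 5.0, 10.0), (5.1, 5.1, 10.02)]
    print(remove_ground(cloud, cell_size=1.0, height_threshold=0.2))
```

Because the ground reference is local to each column, the second column in the usage example is handled correctly even though it lies 10 m higher than the first, while a single global elevation threshold would fail there.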

Road Marking Extraction and Classification
Road markings, including lane lines, center lines, zebra crossings, and arrows, provide necessary road information (e.g., warning and guidance) for all road users. Identifying and extracting road markings accurately is significant for advanced driver assistance systems (ADAS) and autonomous driving systems to plan reliable navigation routes and prevent collisions in complex urban road networks. When considering urban road conditions and road topography, the availability and clarity of road markings are critical elements in traffic management systems and in accidents where road networks themselves are the cause. For instance, damage to road markings contributes to a high fatal accident rate, especially in highly complex urban road and highway environments [59].
Road markings are highly retro-reflective materials painted on asphalt concrete pavements. Thus, their relatively high intensity values are considered a distinguishing characteristic for road marking extraction from point clouds [40]. Accordingly, the existing methods are mainly categorized into two classes depending on semantic knowledge (e.g., shape) and MLS intensity properties: 2D GRF image-driven extraction and 3D point-driven extraction. Figure 7 shows several experimental results obtained with different road marking extraction methods.

2D GRF Image-Driven Extraction
The majority of studies focused on road marking extraction using 2D GRF images generated from 3D points. According to semantic information (e.g., size, orientation, and shape) of road markings, the existing image processing approaches were widely used, including Hough Transform, multi-thresholding segmentation, morphology, and Multi-Scale Tensor Voting (MSTV) [13,69-71]. Toth [13] conducted a simplified intensity threshold segmentation for the extraction with regard to the intensity distribution in different searching windows. Li et al. [70] implemented a global intensity filter to roughly segment road markings based on the generated density-based images. In order to efficiently extract lane-shaped road markings (e.g., broken lane lines and continuous edge lines), Yang et al. [71] applied a Hough Transform method in four connected regions of the georeferenced intensity images. However, the application of the Hough Transform to road marking extraction is limited by the need to define the number of road marking types to be segmented, especially when dealing with multiple types of road markings (e.g., arrows). In contrast, multi-thresholding segmentation methods were developed by exploiting the relationship between the intensity values and scanning ranges, and were commonly applied to overcome intensity inconsistency due to different scanning patterns and the non-uniformity of point cloud distribution [69]. Accordingly, Guan et al. [38] and Ma et al. [47] dynamically segmented road markings using multiple thresholds for different scanning distances based on point densities, followed by a morphological operation with a linear structuring element. Kumar et al. [12] performed a scanning range dependent thresholding algorithm to identify road markings from the georeferenced intensity and range imagery. Furthermore, the MSTV approach can suppress high-level noise and preserve complete road markings. In [14], road marking segmentation was improved by applying weighted neighboring difference histogram based dynamic thresholding and MSTV algorithms to the noisy GRF images.
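The idea behind range-dependent thresholding discussed above can be sketched as follows. Intensity samples are grouped into bins by scanning range and each bin receives its own threshold. The function name `range_dependent_threshold`, the bin width, and the rule "mean plus k standard deviations" are illustrative assumptions; the cited studies derive their thresholds differently.

```python
from statistics import mean, pstdev


def range_dependent_threshold(samples, bin_width, k=1.0):
    """Flag likely road marking samples with one threshold per range bin.

    samples: list of (scanning_range_m, intensity) pairs.
    Assumption: the threshold of a bin is mean + k * standard deviation
    of the intensities that fall into that bin.
    """
    bins = {}
    for rng, intensity in samples:
        bins.setdefault(int(rng // bin_width), []).append(intensity)

    thresholds = {b: mean(v) + k * pstdev(v) for b, v in bins.items()}
    return [intensity > thresholds[int(rng // bin_width)]
            for rng, intensity in samples]


if __name__ == "__main__":
    data = [(1.0, 100), (1.5, 100), (2.0, 100), (2.5, 200),     # near range
            (11.0, 20), (12.0, 20), (13.0, 20), (14.0, 60)]     # far range
    print(range_dependent_threshold(data, bin_width=10.0))
```

In the usage example the far-range marking has an intensity of only 60, lower than the near-range pavement, so a single global threshold could not separate both markings from both pavement regions, whereas the per-bin thresholds can.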

3D Point-Driven Extraction
Meanwhile, other studies extracted road markings directly from MLS point clouds rather than from the generated GRF images. Chen et al. [72] and Kim et al. [73] performed a profile-based intensity analysis algorithm to quickly segment pavement markings from 3D MLS points. First, raw MLS point clouds were partitioned into point cloud data slices based on the trajectory of vehicles. Then, pavements were detected and extracted by considering the geometric properties of road edges, boundaries, and barriers. Finally, line-shaped pavement markings were efficiently extracted by determining the peak value of intensity in each scan line. Additionally, Yu et al. [74] directly derived road markings from MLS point clouds and classified them into large-sized (e.g., stop line and centerline), small-sized (e.g., arrow), and rectangular-shaped (e.g., zebra crossing) road markings. According to trajectory data and road curb lines, large-sized road markings were first extracted by using multiple threshold segmentation and spatial point density filtering methods, while Otsu's thresholding approach [75] was adopted to select multiple optimal thresholds. Then, small-sized road markings were extracted and classified using Deep Boltzmann Machine (DBM)-based neural networks. Finally, rectangular-shaped markings were efficiently classified by performing a PCA-based method. Soilán et al. [76] computed several discriminative features for each individual road marking based on the geometric parameters and pixel distribution. Subsequently, each road marking was described by a Geometry Based Feature (GBF) vector that was the input of a supervised neural network for road marking classification.
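Otsu's thresholding, mentioned above, selects the threshold that maximizes the between-class variance of a histogram. The sketch below is a minimal single-threshold version written for illustration; the function name `otsu_threshold` and the histogram size are assumptions, and the multi-threshold selection used in the cited work is not reproduced here.

```python
def otsu_threshold(values, bins=256):
    """Return an intensity threshold that maximizes between-class variance.

    Values greater than or equal to the returned threshold form the
    high-intensity class (e.g., road marking candidates).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins

    # Build the histogram; the maximum value is clamped into the last bin.
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_score = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(bins):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_bin = score, t
    # Upper edge of the last bin assigned to the low-intensity class.
    return lo + (best_bin + 1) * width


if __name__ == "__main__":
    intensities = [10] * 50 + [200] * 50
    print(otsu_threshold(intensities))
```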
Table 4 summarizes road marking extraction and classification methods in terms of 2D image-driven and 3D MLS point-driven extraction. Converting MLS point clouds into 2D GRF images is effective in overcoming intensity inconsistency and density variance issues due to different scanning patterns. However, complex types of road markings (e.g., words) make the extraction process via 2D feature image processing algorithms challenging. By comparison, MLS point-driven extraction methods, which segment road markings directly from raw MLS point clouds, can improve both completeness and correctness. In addition, detailed geospatial information of road markings can be preserved for further applications. However, automated road marking extraction from large-volume MLS point clouds, especially those with strong concavo-convex features and unevenly distributed points, is still a very difficult task [75]. Accordingly, many studies were conducted on automated road marking classification by using deep learning based neural networks (e.g., PointNet [77] and PointCNN [78]), but such methods need massive numbers of labeled road markings for training and have limitations in processing the extra-large MLS data volumes of urban road networks.

Driving Line Generation
Driving lines, defined as the driving routes for motorized vehicles or autonomous vehicles, play a critical role in the generation of high-definition roadmaps and fully autonomous driving systems [42]. When considering the turning speed limitation, driving lane departure, and vehicle-handling capability issues, one of the most challenging tasks in achieving fully autonomous driving is to enable autonomous vehicles to navigate and maneuver themselves in complex urban environments without direct human intervention [79]. Therefore, generating highly accurate driving lines with centimeter-level localization and navigation accuracy can boost the development of high-definition roadmaps and autonomous vehicles.
Recently, many worldwide automotive manufacturers (e.g., BMW, Ford, and Mercedes-Benz) and digital map suppliers (e.g., HERE, Civil Maps, Ushr, and nuTonomy) have been generating driving lines from MLS data. Ma et al. [47] proposed a three-step process to generate horizontally curved driving lines. First, with the assistance of the trajectory of the MLS system, curb-based road surface extraction algorithms were adopted to segment road surfaces from raw MLS point clouds. Second, road markings were directly extracted by using the multi-threshold segmentation method, followed by a statistical outlier removal (SOR) filtering algorithm for noise removal. Finally, the conditional Euclidean clustering algorithm was performed, followed by a nonlinear least-squares curve fitting algorithm. Consequently, the driving lines were efficiently generated with 15 cm-level localization accuracy.
Additionally, Li et al. [80] first implemented a voxel-based upward growing algorithm to extract the ground points from the entire MLS point clouds. Then, road markings were categorized into two types (i.e., lane markings and textual road markings). These road markings were separately extracted based on the distance-to-road-edge thresholding algorithm and related road design standards. Lastly, 3D high-definition roadmaps were generated with detailed road edge, road marking, and estimated lane centerline information. Furthermore, Li et al. [81] proposed a step-wise method for road transition line generation using MLS data. A region growing algorithm was first applied to enhance the curb-based road surface extraction performance. Subsequently, the multi-thresholding segmentation and geometric feature filtering algorithms were conducted for lane marking extraction. Finally, a node structure generation algorithm was developed to generate lane geometries and lane centerlines, followed by a cubic Catmull-Rom spline approach for transition line generation. Figure 8 shows the driving lines generated from MLS data for high-definition map generation.
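A cubic Catmull-Rom spline, as used above for transition line generation, interpolates between two control points while using their two outer neighbors to set the tangents. The sketch below evaluates the uniform variant; the function name `catmull_rom` is hypothetical, and whether the cited work uses the uniform or another parameterization is an assumption.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate a uniform cubic Catmull-Rom segment between p1 and p2.

    p0..p3 are control points (tuples of equal dimension, e.g., lane
    centerline nodes); t in [0, 1] moves from p1 (t = 0) to p2 (t = 1).
    """
    return tuple(
        0.5 * (2 * b
               + (-a + c) * t
               + (2 * a - 5 * b + 4 * c - d) * t ** 2
               + (-a + 3 * b - 3 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


if __name__ == "__main__":
    nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (3.0, 1.0)]
    # Sample the transition line between the two middle nodes.
    for i in range(5):
        print(catmull_rom(*nodes, t=i / 4))
```

The curve passes exactly through the two middle nodes, which is convenient when the nodes are measured lane centerline positions that the transition line must honor.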

Road Crack Detection
Road cracks, as a common type of distress in asphalt concrete pavement, are caused by road surface fractures due to overloaded heavy-duty trucks, thermal deformation, moisture corrosion, and road slippage or contraction. Rapid and accurate perception of road cracks not only enables the local transportation departments to perform monitoring and repair for traffic efficiency, but also provides the ITS with necessary information on the probabilities of potential road surface distresses and traffic risks [82].
With the development of laser imaging techniques, MLS data are commonly utilized in road crack detection applications. In addition to the high-resolution digital imagery that offers an efficient and promising solution for road crack detection, MLS point clouds are valuable data sources for evaluating road surface distresses. Mathematical morphology based algorithms were implemented for crack extraction from pavement GRF images [83]. Tsai et al. [84] summarized six commonly used pavement crack extraction and classification methods: Canny edge detection, regression or relaxation thresholding, crack seed verification, multi-scale wavelets, iterative clipping, and the dynamic optimized method. However, most methods are computationally intensive and time-consuming. Because road cracks normally have lower intensity values than their neighbors, Yu et al. [85] employed the Otsu thresholding algorithm to extract crack candidates. Next, a spatial density filtering algorithm was performed for outlier removal. Finally, crack points were clustered into crack-lines, and crack skeletons were extracted by performing an L1-medial skeleton extraction algorithm. However, due to either grayscale similarity or discontinuity, the correctness and robustness of the thresholding-based segmentation algorithms mainly rely on road surface materials and surrounding environments, which can cause unreliable crack extraction results.
Meanwhile, many studies focused on extracting road cracks using advanced data mining, artificial intelligence (AI)-based, and neural network methods [86,87]. However, the training process needs a lot of labeled data, and the selection of parameters mostly depends upon data quality and crack variations. Accordingly, Guan et al. [14] developed a framework, called ITVCrack, to automatically extract pavement cracks by using the iterative tensor voting (ITV) method. The curvilinear cracks were efficiently delineated based on the ITV-based extraction algorithm, and a four-pass-per-iteration morphological thinning algorithm was afterward performed to remove noise. Additionally, Chen et al. [82] proposed a method for detecting asphalt pavement cracks by generating a Digital Terrain Model (DTM) from MLS point clouds. Then, local height changes that may relate to cracks were detected based on a high-pass filter. Finally, a two-step matched filter (i.e., Gaussian-shaped kernel) was used to extract crack features.
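To make the high-pass filtering step on a DTM more concrete, the sketch below subtracts a local mean from each cell so that smooth terrain cancels out and small depressions remain. The function name `high_pass`, the box-shaped window, and the `radius` parameter are assumptions for illustration; the specific filter used in [82] is not described here.

```python
def high_pass(dtm, radius=1):
    """Subtract the local mean elevation from each DTM cell.

    dtm: 2D list of elevations (rows of equal length).
    Assumption: the low-pass component is a box mean over a
    (2 * radius + 1) square window, truncated at the borders.
    Strongly negative outputs mark local depressions such as cracks.
    """
    rows, cols = len(dtm), len(dtm[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [dtm[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = dtm[r][c] - sum(window) / len(window)
    return out


if __name__ == "__main__":
    surface = [[1.0] * 5 for _ in range(5)]
    surface[2][2] = 0.9  # a 10 cm depression in an otherwise flat surface
    for row in high_pass(surface):
        print([round(v, 3) for v in row])
```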

Road Manhole Detection
Road manholes, a major type of road infrastructure in the urban environment, provide entry to conduits used for drainage, steam, liquefied gas, communication cables, optical fiber, and other underground utility networks. Typically, road manholes are covered with concrete or metal covers to prevent road users from falling into the wells. Such manhole covers can be detected using MLS systems, which is safer, faster, and more cost-effective than human field surveying. Based on the data format, the majority of existing methods for road manhole cover detection are classified into two types: 2D imagery-based methods and 3D point-based methods.
Due to the complex background of pavement images, Cheng et al. [88] proposed an optimized Hough transform approach for manhole cover detection. First, all the contours were determined by contour tracking from binary edge imagery. False detections were then filtered out using contour filters. Finally, circular-shaped manhole covers were extracted using the revised Hough transform algorithm. Additionally, in [89], single view and multiple view processing methods were performed for manhole cover detection from road surface images. Subsequently, Niigaki et al. [90] developed a method to identify blurred manhole covers from low-quality road surface images. By focusing on the uniformity and separability of the intensity distribution, the Bhattacharyya coefficient was fine-tuned and three operators (i.e., uniformity operator, oriented separability operator, and circular object operator) were defined to detect manhole covers.
Meanwhile, MLS point clouds have also been widely used for road manhole cover detection. Guan et al. [91] proposed an MSTV algorithm to detect road manhole covers from MLS point clouds. First, ground points were extracted and rasterized into 2D GRF images. Next, the manhole cover candidates were determined based on distance-dependent intensity thresholding. Then, an MSTV algorithm was implemented for noise suppression and manhole cover pixel preservation. Finally, manhole covers were extracted through distance-based clustering and morphological operations. Yu et al. [92] presented a method for automated road manhole cover detection using MLS data. First, in order to improve the computational efficiency, off-ground points were filtered out through a curb-based road surface segmentation algorithm, and the road surface points were rasterized into GRF intensity images via the inverse distance weighted (IDW) interpolation approach. Then, a supervised deep learning neural network was trained to build a multi-layer feature generation network to determine high-level features of local image patches. Next, the relationships between the high-level patch features and the probabilities of the presence of manhole covers were learned by training a random forest model. Finally, based on the multi-layer feature generation network and random forest model, road manhole covers were efficiently extracted from GRF intensity images.
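The IDW interpolation used to rasterize points into a GRF intensity image can be sketched for a single pixel as follows. The function name `idw_intensity`, the `power` exponent of 2, and the handling of coincident points are illustrative assumptions rather than the settings of the cited study.

```python
import math


def idw_intensity(cell_xy, neighbors, power=2.0, eps=1e-9):
    """Interpolate the intensity of one image cell from nearby points.

    cell_xy: (x, y) center of the cell.
    neighbors: non-empty list of (x, y, intensity) points near the cell.
    Each point is weighted by 1 / distance**power, so closer points
    contribute more to the cell value.
    """
    numerator = denominator = 0.0
    for x, y, intensity in neighbors:
        d = math.hypot(x - cell_xy[0], y - cell_xy[1])
        if d < eps:
            # A point located at the cell center determines the value directly.
            return intensity
        w = 1.0 / d ** power
        numerator += w * intensity
        denominator += w
    return numerator / denominator


if __name__ == "__main__":
    print(idw_intensity((0.0, 0.0), [(1.0, 0.0, 10.0), (2.0, 0.0, 40.0)]))
```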

Off-Road Information Extraction
In mobile LiDAR systems, laser scanners continually collect 3D points along the roads, which produces complicated, multi-dimensional, large-volume data that require specific processing. Thus, efficient storage and management of such data are necessary for complicated archiving, rapid data dissemination, and processing for multi-level applications, including interrogation, filtering, and product derivation [93]. In scanned MLS data, the ground points always occupy the largest part; however, the objects of interest stand on the ground. Thus, data preprocessing, which roughly segments the raw data into different groups and then selects the related information locally to extract off-road objects, is particularly challenging.
Most researchers segment the raw data into three groups: ground, on-ground, and off-ground points [42,61,94-96]. Table 3 lists six representative ground removal methods. Among these methods, voxelization and segmentation along the trajectory are commonly used. Voxelization is applied to simplify the huge amount and heterogeneous distribution of data. The computing cost is reduced by mapping the whole data into a three-dimensional grid; with a correctly chosen scale, this causes no significant information loss, because the reduced version is stored and directly linked to the original data [97]. Trajectory data record the precise real-time position of the vehicle, which provides localization and orientation information of MLS systems [98]. Such trajectory data are commonly used as a filter to remove points in the raw point clouds that are far away from the road.
Once the ground and on-ground points are removed, the remaining off-ground points are greatly reduced compared with the raw data. Within the off-road parts, the objects with various intensity values are still unorganized. A rough clustering is commonly performed to group the points into different clusters. Euclidean clustering and DBSCAN are two main density-based spatial clustering methods.

• Euclidean clustering. This method is based on Euclidean space to calculate the distances between discrete points and their neighbors. Then the distances are compared with a threshold to group nearest points together. This method is simple and efficient. However, some disadvantages include no initial seeding system and no over- and under-segmentation control.
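Euclidean clustering as described above can be sketched in a few lines: starting from an arbitrary point, a cluster grows by repeatedly adding every unvisited point closer than a distance threshold to a point already in the cluster. The function name `euclidean_clustering` and its parameters are hypothetical, and the brute-force neighbor search is an assumption made to keep the example self-contained; practical implementations use a k-d tree or octree for the search.

```python
import math
from collections import deque


def euclidean_clustering(points, tolerance, min_size=1):
    """Group point indices into clusters by a distance threshold.

    points: list of (x, y, z) tuples.
    tolerance: maximum distance between neighboring points of a cluster.
    min_size: clusters with fewer points are discarded as noise.
    Returns a list of clusters, each a sorted list of point indices.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()  # arbitrary start point: no seeding strategy
        queue, cluster = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            # Brute-force neighbor search (quadratic in the number of points).
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= tolerance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(sorted(euclidean_clustering(cloud, tolerance=0.6)))
```

The sketch also shows the limitation noted in the text: the result depends entirely on the single tolerance value, so two touching objects merge into one cluster and a sparsely sampled object splits into several.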

Traffic Sign Detection
Accurate recognition and localization of traffic signs are significant in intelligent traffic-related applications (e.g., autonomous driving [101] and driver-assisted systems), which help machines to respond in a timely and accurate manner in different situations. The high accuracy of points (about 1 cm) in meter-level areas, along with imagery, provides promising solutions for traffic sign detection (TSD) and recognition (TSR) using mobile laser scanning techniques. Accurate geometry and localization information is provided by point clouds, whereas detailed texture and semantic information is contained in digital images [96]. Existing algorithms apply geometric and spatial features of traffic signs [65,102,103] or learn these features automatically [95,104-106] to achieve TSD. TSR is commonly achieved by integrating imagery data and 3D point clouds. These methods can be grouped into TSD using 3D point clouds and TSR using imagery data [1,2,94,95,104], as follows:
1. TSD using point clouds
Traffic sign point clouds have obvious geometric and spatial features. Table 5 lists their attributes, which are employed in most of the existing traffic sign detection methods using point clouds [65,97,99,105,108]. The metallic material of traffic signs gives their surfaces a strong reflectance intensity, because laser pulses hitting the retroreflective surfaces return to the receivers without much signal loss. Besides, traffic signs are commonly installed at a certain height along the roadsides. Their sizes are limited to a certain range and their shapes are regular, such as rectangles, circles, and triangles. Wen et al. [65] set a minimum number of points per cluster (50) to remove small parts. Points with intensities below 60,000 were also filtered out as non-sign objects. Yu et al. [66] also used an intensity filter to extract traffic sign clusters. Before that, however, an improved voxel-based normalized cut segmentation method was applied to separate occluded and overlapping objects when Euclidean distance clustering failed. Ai et al. [94] used scanning distances and incidence angles to normalize the retro-intensity value of each cluster. The median of the normalized retro-intensity population was compared with a selected threshold to assess the traffic sign retro-reflectivity condition. In [63], traffic sign attributes, such as planar surfaces and an enclosed range of heights, were used in detection solutions. A height filter was proposed to eliminate clusters with heights smaller than 25 cm. Planar filters removed objects with reflective features that were non-planar or small. Huang et al. [61] used the same two attributes to remove irrelevant points, and then applied shape-based filtering to exclude pole-like objects. Guan et al. [108] detected traffic signs using intensity and geometrical structure, filtering objects by intensity and removing non-planar parts.

TSD Using Point Clouds and Imagery
After finishing TSD tasks, TSR is commonly achieved by integrating point clouds and imagery data. The fusion of point clouds and imagery is composed of two steps: (1) transform the 3D point clouds from the world coordinate system to the camera coordinate system and (2) map the 3D points from the camera coordinate system to the 2D image plane [65]. After that, the TSR task is completed based on the projected area. One example of TSD and TSR is shown in
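The two fusion steps listed above, the world-to-camera transformation followed by the camera-to-image mapping, can be illustrated with a standard pinhole camera model. The function name `project_point`, the rotation matrix `R`, the translation `t`, and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are assumptions for illustration; lens distortion and the calibration procedure of any specific MLS system are not modeled.

```python
def project_point(p_world, R, t, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates.

    Step 1: camera = R * world + t (world frame to camera frame).
    Step 2: pinhole projection onto the image plane using the focal
            lengths (fx, fy) and the principal point (cx, cy).
    Returns None for points behind the camera.
    """
    xc, yc, zc = (
        sum(R[i][k] * p_world[k] for k in range(3)) + t[i]
        for i in range(3)
    )
    if zc <= 0:
        return None
    return (fx * xc / zc + cx, fy * yc / zc + cy)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # A sign point 4 m in front of a camera placed at the world origin.
    print(project_point((1.0, 2.0, 4.0), identity, (0.0, 0.0, 0.0),
                        fx=100.0, fy=100.0, cx=50.0, cy=60.0))
```

Projecting all points of a detected sign cluster in this way yields the image region on which the recognition step operates.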

Light Pole Detection
Street light poles, used to adjust illumination conditions to help pedestrians and drivers at night, should be cost-effectively monitored, maintained, and managed by transportation agencies [110,111]. MLS techniques, which provide high-density and high-accuracy point clouds, are applied effectively in light pole detection [15]. Existing light pole detection algorithms can be grouped into two types: knowledge-driven and data-driven methods.
(1) Knowledge-driven
Knowledge-driven solutions have been widely used in pole-like object extraction. Figure 11 lists different types of light poles, such as one-side lights, two-side lights, and beacon lights. Knowledge of the light attributes in Figure 11 can be used to recognize and model pole-like objects. For example, Li et al. [112] set two constraints to extract light poles: (1) poles are commonly installed near the road curbs; and (2) poles stand vertically to the ground within a certain height range. Knowledge-driven methods can be classified into two types: matching-based extraction and rule-based extraction.
Matching-based extraction is one of the representative knowledge-driven methods, which is achieved by matching the extracted objects with pre-selected modules. Yu et al. [66] extracted light poles using a pairwise 3D shape context, where a shape descriptor was designed to model the geometric structure of a 3D point-cloud object. One-by-one matching, local dissimilarity, and global dissimilarity were calculated to extract light poles. In [7], light poles were extracted by matching the segmented batch with a pre-selected point cloud object prototype. The prototype was chosen based on prior knowledge. The feature point of each voxel was chosen from the points nearest the centroid of the voxel. Then, the matching cost for feature points between the prototype and each grouped object was computed via a 3D object matching framework. Finally, the extraction results were derived from the filtered matching costs of all segmented objects. Cabo et al. [97] spatially discretized the initial point clouds and chose the initial 20-30% representative point clouds as the inputs. Reducing the input data kept the model accessible without losing any information, because each point was linked to its voxel. Then, they selected and clustered small 2D segments that are compatible with a component of a target pole. Finally, the representative 3D voxels of the detected pole-like objects were determined and the points from the initial point clouds linked to individual pole-like objects were derived. Wang et al. [113] also introduced a similar matching method to recognize lamp poles and street signs. First, the raw data within an octree were grouped into multiple levels via the 3D SigVox descriptor. Then, PCA was applied to calculate the eigenvalues of points in each voxel, which were projected onto the suitable triangle of a sphere-approximating icosahedron. This step was iterated over multiple scales. Finally, street lights were identified by measuring the similarity of 3D SigVox descriptors between candidate point clusters and training objects. However, these three algorithms rely heavily on the parameters and the matching module selection. Thus, it is challenging to achieve automatic extraction. One-by-one matching has a high computational cost when compared with other methods. These methods are not suitable for large-volume data processing.

• Rule-based extraction. Rule-based extraction is the other typical knowledge-driven method. Geometric and spatial features of light poles are set as rules in the process of extraction. Teo et al. [114] applied several geometric features to remove the non-pole-like road objects. This algorithm includes feature extraction for each object and non-pole-like object removal using geometric features (e.g., area of cross-section, position, size, orientation, and shape). Besides, pole parameters (e.g., location, radius, and height) were computed as features, and hierarchical filtering was used to filter out the non-pole-like objects. A height threshold (3 m) was used to classify the data into two parts, which were then segmented in the horizontal and vertical directions. However, not all pole-like objects are higher than 3 m, so the parameter choice is subjective. Rodríguez-Cuenca et al. [115] utilized the specific characteristics of poles in pole-like object extraction. Since poles stand vertically to the ground, this attribute is exploited by the Reed and Xiaoli (RX) anomaly detection method to detect linear pole structures in pre-organized point clouds. Finally, the vertical elements were classified as man-made poles or trees using a clustering method. However, if pole-like objects are overlapped by other objects, the disconnected upper parts are discarded as noise. Li et al. [116] used a slice-based method to extract pole-like road objects. A set of common rules was applied to split and merge the groups. For example, the connected components that contain vertical poles were analyzed for integrating or separating, and the detached components as well as their nearby components were checked for further merging. Rule-based methods are rapid and efficient. However, the number of rules is hard to determine. The data would be over-segmented with numerous rules or under-segmented with insufficient rules. In particular, some true objects with poor data distribution are usually removed by strict rules.
(2) Data-driven. Many data-driven methods have been proposed for light pole extraction using MLS techniques in the past decade. A set of features can be designed and used to train a pole extraction model on massive labeled training data. Thus, these methods can also be regarded as learning-based methods. Guan et al. [117] chose a set of 50 training datasets, each covering a road patch of about 50 m, to construct a contextual visual vocabulary for depicting the features of light poles. Based on the generated contextual visual vocabulary, a bag-of-contextual-visual-words was constructed to detect pole-like objects from the filtered off-ground points. The extraction stage is similar to that of Yu et al. [62]. Wu et al. [42] proposed a localization method based on the integration of 2D images and point clouds. It has three main steps: raw localization map generation, "ball falling", and position detection. The output of these three steps was then used as prior information for segmentation. Once the first segmentation was finished, several features were computed to describe the pole: the height, the average height, the standard deviation of height, the estimated volume, the number of pole points, and the number of the super-voxels whose area of the convex hull. Once the second segmentation was achieved, a set of features was computed to describe the global objects: four kinds of height-related features, the pixel intensity related to the location map, the area of the convex hull for all mapped points, the approximated volume, etc. These local and global features were fed into SVM and random forest classifiers. Yan et al. [67] used the Ensemble of Shape Functions (ESF) as well as geometric characteristics to describe light poles. The ESF includes three shape functions (point distance, area, and angle) and a ratio function. Geometric features of the objects were also considered, including the linearity, planarity, and scattering derived from the covariance matrix and the ratio of the minor axis to the major axis of the oriented bounding box, among others. These characteristics were then input into a random forest to train a pole-like object classifier. Yan et al. [100] proposed an unsupervised segmentation algorithm to group the above-ground point clouds. Several decision rules were defined to extract potential light poles. However, learning-based methods cannot interpret data on their own, which means that human-designed feature extraction is needed before training.
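The covariance-based geometric features mentioned above (linearity, planarity, and scattering) can be illustrated with a minimal sketch. It assumes the commonly used eigenvalue definitions, with eigenvalues l1 >= l2 >= l3 of the 3D covariance matrix; the cited works may define or normalize these features differently, and all function names here are hypothetical.

```python
from math import acos, cos, pi, sqrt


def covariance_3x3(points):
    """Population covariance matrix of a list of (x, y, z) tuples."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]


def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = acos(r) / 3.0
    e1 = q + 2.0 * p * cos(phi)
    e3 = q + 2.0 * p * cos(phi + 2.0 * pi / 3.0)
    e2 = 3.0 * q - e1 - e3
    return [e1, e2, e3]


def shape_features(points):
    """Linearity, planarity, and scattering of a point cluster (assumed definitions)."""
    eig = symmetric_eigenvalues(covariance_3x3(points))
    # Clamp tiny negative values caused by rounding, then re-sort.
    l1, l2, l3 = sorted((max(e, 0.0) for e in eig), reverse=True)
    if l1 == 0.0:  # all points coincide
        return {"linearity": 0.0, "planarity": 0.0, "scattering": 0.0}
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "scattering": l3 / l1,
    }


# A vertical, pole-like cluster should have linearity close to 1.
pole = [(0.0, 0.0, 0.1 * k) for k in range(50)]
print(shape_features(pole))
```

In a learning-based pipeline, such values would form part of the feature vector passed to a classifier such as a random forest; the closed-form eigenvalue routine is used here only to keep the sketch free of external dependencies.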

Roadside Trees Detection
With the improvement of urban environments, many landscape trees are planted along urban streets. Timely measurement and management of trees are required to ensure their healthy growth without interfering with traffic. MLS point clouds have a high sampling density and rich information, which provides a promising solution for single tree extraction and tree species classification. Existing methods can be classified into two categories: rule-based and deep learning methods. Figure 12 illustrates some of the tree classification methods.
Rule-based methods commonly separate trees into two parts: leaf-off trunks and tree crowns. Xu et al. [118] presented a bottom-up hierarchical segmentation algorithm to merge non-photosynthetic clusters. Merging was conducted based on the dissimilarity between two groups. Euclidean distance was used to calculate the proximity matrix, whereas the principal direction was applied to compute the direction. The key contributions are the optimization of cluster combination with a minimum energy function and the extraction of non-photosynthetic components using automatic hierarchical clustering. Li et al. [119] extracted individual trees from MLS data based on the general constitution of trees. The trunk and crown are the two components of each tree, which can be detected using the dual growing method. A coarse classification was conducted first to remove human-made objects. Automatic seed selection was the second step, which avoids manual setting of initial parameters. Then, the trees were separated using a dual growing process, which circumscribed a trunk in a scalable growing radius and segmented a crown in limited growing regions. Rule-based methods need pre-defined thresholds to process the data in several steps, which is time-consuming.
Recently, deep learning methods have been used in roadside tree extraction. Guan et al. [60] presented a method with two steps: tree preprocessing and tree classification. The tree preprocessing is similar to the work in [62]. Then, the geometric structures of trees were modeled using a waveform representation. After that, high-level feature abstractions of the trees' waveform representations were generated via deep learning architectures. Finally, these features were fed into an SVM to train a tree classifier. Zhou et al. [120] also proposed a deep learning method to extract trees and their species. At first, individual trees were extracted by analyzing the density of the point clouds. Then, voxel-based rasterization was applied to generate a low-level feature representation. Finally, the tree species were classified by a deep learning model. However, in these methods the deep learning architecture was applied only to generate low-level or high-level features, which were later fed into a machine learning classifier. This operation does not fully exploit the strength of deep learning methods and merely uses them as feature generators. Automated, end-to-end extraction based on deep learning should be considered in future research.
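The voxel-based rasterization step mentioned above can be sketched as follows. This is a generic occupancy-grid construction under assumed conventions (grid origin at the minimum coordinates, a single cubic voxel size) and is not necessarily the representation used in [120]; the function names are hypothetical.

```python
from collections import Counter
from math import floor


def voxelize(points, voxel_size):
    """Count points per voxel; keys are integer (ix, iy, iz) indices."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    # Assumption: the grid origin is the minimum corner of the point set.
    origin = [min(p[k] for p in points) for k in range(3)]
    counts = Counter()
    for p in points:
        idx = tuple(floor((p[k] - origin[k]) / voxel_size) for k in range(3))
        counts[idx] += 1
    return counts


def occupancy_grid(counts):
    """Dense binary grid, indexed grid[iz][iy][ix], covering all occupied voxels."""
    nx, ny, nz = (max(idx[k] for idx in counts) + 1 for k in range(3))
    return [[[1 if (ix, iy, iz) in counts else 0 for ix in range(nx)]
             for iy in range(ny)] for iz in range(nz)]


tree_points = [(0.0, 0.0, 0.0), (0.2, 0.1, 0.1), (1.2, 0.0, 0.0), (0.0, 0.0, 2.4)]
counts = voxelize(tree_points, voxel_size=1.0)
grid = occupancy_grid(counts)
print(dict(counts))
print(grid)
```

A grid of this kind gives every tree a regular, fixed-layout input, which is what allows a learning model to consume irregular point clouds; per-voxel point counts can be used instead of binary occupancy when density matters.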

Power Lines Extraction
Power lines play an important role in connecting nationwide grids and distributing regional electricity. Accurate and timely detection and monitoring are essential for electric equipment management and for decision-making in electric power engineering. Recently, MLS data have been widely applied in power line extraction. Power lines acquired by MLS systems have the following attributes: (1) power lines are placed above the ground; (2) power lines have a certain clearance distance; and (3) power lines have a highly linear distribution [121]. Existing methods are commonly designed based on these features.
Zhu et al. [122] proposed an approach focusing on the locations and heights of the pylons to model the 3D power lines. Gradients were used to group similar points, which were mapped into a binary image. The processing of the binary image was constrained by image region properties (e.g., the region's shape, length, and area), which proved effective in removing non-powerline objects. Guan et al. [123] extracted the power lines by mapping the point clouds onto the 2D plane. They utilized incidence angles to estimate road ranges and applied elevation-difference and slope criteria to separate off-road points from ground points. Then, height, spatial density, and a combination of size and shape filters were used to extract power line point clouds from the clustered off-road objects. Individual power lines were extracted using the Hough transform and Euclidean distance clustering. Finally, the 3D power line was modeled as a horizontal line in the x-y plane and as a vertical catenary curve, derived from a hyperbolic cosine function, in the x-z plane. Cheng et al. [121] used a voxel-based hierarchical method to calculate geometric features of individual voxels to extract power lines. A bottom-up method was applied to filter the power lines. The experimental results of these methods demonstrate the applicability of MLS data for power line extraction.
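The catenary model of a power line in the vertical plane can be written with a hyperbolic cosine. The sketch below uses one common parameterization, z = a * cosh((s - s0) / a) + c, where s is the horizontal distance along the line; the symbols a, s0, and c and the helper names are assumptions for illustration and are not taken from [123].

```python
from math import cosh


def catenary_height(s, a, s0, c):
    """Height of the catenary at horizontal position s.

    a  : curvature parameter (> 0); larger values give a flatter curve
    s0 : horizontal position of the lowest point
    c  : vertical offset
    """
    return a * cosh((s - s0) / a) + c


def sag(span, a):
    """Sag at mid-span for two supports of equal height separated by `span`."""
    return a * (cosh(span / (2.0 * a)) - 1.0)


def residuals(samples, a, s0, c):
    """Vertical differences between measured (s, z) samples and the model."""
    return [z - catenary_height(s, a, s0, c) for s, z in samples]


# Hypothetical span of 20 m with the lowest point at mid-span.
a, s0, c = 50.0, 10.0, 5.0
samples = [(s, catenary_height(s, a, s0, c)) for s in (0.0, 5.0, 10.0, 15.0, 20.0)]
print(sag(20.0, a))
print(residuals(samples, a, s0, c))
```

The residuals are what a least-squares fit would minimize when estimating a, s0, and c from extracted power line points; the fitting step itself is omitted here.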

Multiple Road Objects Extraction
Instead of focusing on the extraction of a single urban object class, some researchers are dedicated to extracting multiple objects in urban areas. Building on the implicit shape model and random forests, Hough forests are capable of constructing an effective model that maps the features of object components to the center of objects. Wang et al. [106] utilized the Implicit Shape Model to label object types and applied the Hough Forest framework to detect off-road objects: trees, light poles, cars, and traffic signs. Structural and reflective characteristics were employed to describe a 3D local patch, which was then mapped to predict the approximate position of the object centroid. Then, the peak points in the 3D Hough voting space were used to classify objects. A circular voting strategy was used to maintain the invariance of objects. Based on [106], Wang et al. [35] integrated super-voxels with the Hough forest framework for detecting objects (e.g., cars, light poles, and traffic signs) from 3D laser scanning point clouds. There are, however, two differences. First, over-segmentation was conducted to group the point clouds into spatially consistent super-voxels. Individual super-voxels, as well as their first-order neighborhoods, were segmented into small components. Second, a random forest classifier was used to predict the possible location of the object center. Subsequently, a set of multiscale Hough forests was applied by Yu et al. [59] to encode high-order features of 3D local patches and estimate the voxel centroids of car point clouds. Then, a visibility estimation model was used to predict the completeness of cars by integrating Hough voting.
Yu et al. [62] proposed a contextual visual vocabulary approach integrated with spatial contextual information of feature regions to encode local abstract features of point cloud objects (e.g., light poles, traffic signposts, and cars). Then, related objects were extracted by measuring the similarity of the bag of contextual-visual words between the query object and the segmented semantic objects. Moreover, Luo et al. [124] used patch similarities to label objects in road scenes, such as trees, lamp posts, traffic signs, cars, pedestrians, and hoardings, from 3D point cloud data. First, 3D components derived from point clouds were used to construct a 3D patch-based match graph structure (3D-PMG), which effectively transfers category labels from labeled to unlabeled point cloud road scenes. Second, the contextual information of 3D components was analyzed by integrating the 3D-PMG with Markov random fields, which helps rectify the transfer errors resulting from local patch similarities in multiple classes. Furthermore, Lehtomäki et al. [6] used several geometry-based features, i.e., local descriptor histograms (LDHs), spin images, general shape, and point distribution features, in the classification of the following roadside objects: trees, lamp posts, traffic signs, cars, pedestrians, and hoardings.
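The Hough voting idea described above, in which local patches vote for an object center and peaks in the voting space indicate objects, can be sketched in a few lines. This is only the accumulation step under simplifying assumptions: the predicted offsets are given directly rather than produced by a trained forest, votes are binned on a regular grid, and the names `hough_vote` and `peak_centre` are hypothetical.

```python
from collections import Counter
from math import floor


def hough_vote(patches, cell_size):
    """Accumulate weighted votes for object centres in a 3D grid.

    patches: iterable of (position, predicted_offset, weight), where position
    and predicted_offset are (x, y, z) tuples. In a Hough forest the offset
    and weight would come from the leaf reached by the patch descriptor.
    """
    votes = Counter()
    for position, offset, weight in patches:
        centre = tuple(position[k] + offset[k] for k in range(3))
        cell = tuple(floor(centre[k] / cell_size) for k in range(3))
        votes[cell] += weight
    return votes


def peak_centre(votes, cell_size):
    """Return the centre of the highest-scoring cell and its score."""
    cell, score = max(votes.items(), key=lambda kv: kv[1])
    return tuple((i + 0.5) * cell_size for i in cell), score


patches = [
    ((2.0, 3.0, 0.0), (0.2, 0.1, 1.0), 1.0),
    ((2.5, 3.5, 2.0), (-0.2, -0.3, -0.8), 1.0),
    ((1.0, 3.0, 1.0), (1.4, 0.4, 0.3), 0.5),
    ((9.0, 9.0, 0.0), (0.0, 0.0, 0.0), 1.0),  # stray vote
]
votes = hough_vote(patches, cell_size=1.0)
print(peak_centre(votes, cell_size=1.0))
```

Three of the four patches agree on the same cell, so that cell wins despite the stray vote; real systems typically smooth the accumulator and extract several peaks rather than a single maximum.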

Challenges and Trends
This literature review reveals that MLS systems have proven their potential in the extraction and modeling of static urban objects, especially for road-related equipment inventory and condition detection. Given the steady advance of MLS technology and the declining cost of system components, this part discusses technical and algorithmic challenges and future trends.
(1) System design and cost. MLS techniques have developed rapidly in recent years, especially with the advancement of laser scanners, high-resolution cameras, and accurate positioning systems. However, existing laser scanners are relatively expensive, which limits the wide application of MLS. In the future, cost-effective laser scanners that provide dense point clouds with multiple attributes (e.g., colors and normal directions) will be in high demand. For digital cameras, high-resolution and high-quality output images are needed for the extraction of semantic objects. Improvements in GNSS and IMU positioning techniques can provide highly accurate position information for point clouds and images. Upgraded hardware with powerful computation and low cost affects the efficiency and accuracy of handling large-volume point clouds and imagery data.
(2) Software developments. Few existing point cloud processing programs are capable of processing large volumes of 3D points in one or two steps. Complicated data processing, difficult fusion of multivariate data, and inconsistent data output formats hinder the cross-application of point cloud data in different disciplines. Thus, software for large-volume data processing, seamless integration of multi-source data, and consistent data output is urgently needed. Besides, there is no public platform for fusing data obtained from MLS systems. MLS systems produce 3D point clouds, imagery, position, and localization data. These data can be integrated to generate advanced products. The construction of an efficient data fusion platform should be considered in the future to improve data utilization.
(3) Algorithm trend. Currently, learning-based point cloud processing algorithms are the mainstream trend. Machine learning methods [77,125-127] train efficient classifiers by designing an effective feature descriptor. This design relies heavily on the prior knowledge of human operators, which is a challenging task in complex urban environments. Therefore, designing a proper feature descriptor for point clouds of various sizes is the future focus for machine learning methods. Recently published point cloud processing algorithms mainly focus on deep learning architectures and demonstrate excellent performance, such as PointNet [71], PointNet++ [128], PointCNN [78], VoxelNet [129], and MSNet [130]. These methods mainly process small-scene point clouds, and it is challenging for them to deal with large scenes. Thus, the direct construction of efficient neural networks on 3D points of various sizes and volumes is an interesting topic for future studies.
(4) Data storage and updating. With the advancement of data acquisition and data resolution, the data collected by MLS systems, including point clouds and images, are very large in volume [131]. Accordingly, cost-effective and efficient data storage and updating affect the working continuity and efficiency of MLS systems. Moreover, the storage and updating of post-processed data are important for MLS-related applications with regard to modeling, mapping, and managing. Large-volume data storage and timely data updating are the primary concerns when LiDAR manufacturers worldwide develop new types of LiDAR sensors and when information and communication technology (ICT) companies design commercial data processing and management software. However, data quality can be degraded by system errors of MLS systems, ambient occlusion, and data calibration. Data obtained from multiple sensors at different times, in various weather conditions, under different illuminations, and with different sampling densities make data calibration and data fusion challenging [132]. Furthermore, evaluating the results of data updating and data correction is cumbersome. Therefore, reliable, efficient, and cost-effective data management is one of the major challenges in the development and application of MLS techniques.

Conclusions
The urban road network plays an important role in urban planning, design, construction, management, and maintenance. The inventory of road features contains both pavement geometrical properties (e.g., road width, slopes, and lanes) and road infrastructure (e.g., road markings, pavement cracks, road manholes, traffic signs, roadside trees, and buildings). In order to enhance traffic safety and efficiency, managers and decision-makers in cities and countries should periodically survey urban spatial structures and road asset inventories. Such periodically surveyed data are used not only in ITS-related development to reconstruct and maintain the existing road networks, but also by city managers to formulate traffic policies and advance traffic management. As a vehicle-based mobile mapping system, an MLS system integrated with laser scanners, GNSS sensors, and digital cameras collects high-density 3D point clouds of the surrounding objects in urban road environments and provides detailed georeferenced coordinate information of roads. To date, owing to their high flexibility and accurate data acquisition over large surveyed areas, MLS systems have proven industrially applicable, with higher time-efficiency and better cost savings than aerial imagery, terrestrial laser scanning, and human fieldwork. Given these strengths of MLS systems, as demonstrated in previous studies, journal publications, and projects, an increasing number of governments and transportation departments have been applying mobile laser scanning techniques to urban road asset inventory.
In this paper, a literature review described the feasibility and applications of MLS systems for road asset inventory. The MLS technique, system components, direct geo-referencing, error analysis, and current applications of on-road and off-road inventory were discussed. The state-of-the-art road object detection and extraction algorithms for mobile laser scanning point clouds were reviewed, and their performance and adaptability were discussed. Compared with other review papers focusing on MLS applications, the main contribution of this review is to demonstrate that MLS systems are suitable for supporting road asset inventory, ITS-related applications, high-definition maps, and other highly accurate localization services, such as road feature extraction and accurate perception.
MLS systems have shown great potential for rapid commercialization. With the further advancement of laser scanning technologies and the decreasing cost of new MLS-related hardware, it can be predicted that MLS systems and big data management technologies will dominate the advanced remote sensing and photogrammetry fields, as well as accurate localization-based services, such as high-definition maps and autonomous driving. To summarize, this review indicates that MLS systems provide an effective and reliable solution for conducting road asset inventory in complex urban road environments.

Figure 1. (a) Ratio of publications related to 'road object', using airborne laser scanning (ALS) point cloud, imagery, and mobile laser scanning (MLS) point cloud, respectively, from 2008 to 2018. Source: MDPI; and (b) number of publications related to 'mobile laser scanning' from 2010 to 2018.

Figure 5. Applications and experimental results of on-road information extraction using MLS point clouds.

Figure 6. Methods and experimental results for road surface extraction from MLS point clouds.

Figure 7. Methods and experimental results for road marking extraction from MLS point clouds.
[2,97], Euclidean clustering was applied to segment the off-terrain points into different clusters. Due to the lack of under-segmentation control, Yu et al. [7,66,96] proposed an extended voxel-based normalized cut segmentation method to split the outputs of Euclidean distance clustering that contain overlapping or occluded objects.
• DBSCAN. DBSCAN is the most commonly used clustering method that requires no a priori knowledge of the number of clusters. In addition, the algorithm identifies the remaining data as noise or outliers. Riveiro et al. [99] and Soilán et al. [64] used the DBSCAN algorithm to obtain and cluster a down-sampled set of points. Before that, a Gaussian mixture model (GMM) was used to determine the distribution of those points that matched sign panels. For points that were not related to traffic signs, PCA was used to analyze the curvature and filter the noise. Yan et al. [100] used the module provided by Scikit-learn to perform DBSCAN. The points with height-normalized above-ground features were clustered and then identified and classified as potential light poles and towers.
After acquiring the rough off-road object clusters, more detailed extraction solutions are analyzed in the following sections: pole-like object extraction, power line extraction, and multiple object extraction. Attributes of off-road urban objects are listed in Figure 9. These features play important roles in static urban object extraction in the following algorithms.
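The density-based clustering described above can be sketched as follows. This is a minimal, brute-force illustration of the DBSCAN logic in pure Python; the parameter names `eps` and `min_pts` are assumptions, and in practice a library implementation (such as the Scikit-learn module used in [100]) with a spatial index would be preferred for MLS-sized data.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps. Brute-force neighbour search, O(n^2).
    """
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbours(j)
            if len(nj) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nj)
        cluster += 1
    return labels


# Two small off-terrain clusters and one isolated outlier (hypothetical data).
pts = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
       (5.0, 5.0, 0.0), (5.1, 5.0, 0.0), (5.0, 5.1, 0.0),
       (20.0, 20.0, 20.0)]
print(dbscan(pts, eps=0.5, min_pts=3))  # → [0, 0, 0, 1, 1, 1, -1]
```

The number of clusters is never specified; it emerges from `eps` and `min_pts`, and the isolated point is reported as noise, which matches the two properties highlighted in the text.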

Figure 9. Off-road object attributes, including geometric attributes and spatial attributes.

Figure 10.
Learning-based methods, such as machine learning methods and deep learning algorithms, are commonly used techniques in the TSR process.
• Machine learning methods. Machine learning based TSR methods have been the most commonly used in imagery-based TSR tasks in past years. Wen et al. [65] assigned traffic sign types using a support vector machine (SVM) classifier trained on integral features composed of Histogram of Oriented Gradients (HOG) and color descriptors. Tan et al. [109] developed a latent structural SVM-based weakly supervised metric learning (WSMLR) method for TSR and learned a distance metric between the captured images and the corresponding sign templates. For each sign, recognition was done via soft voting over the recognition results of its corresponding multi-view images. However, machine learning methods need manually designed features, which are subjective and rely heavily on the prior knowledge of researchers.
• Deep learning methods. Recently, deep learning methods have been applied in the MLS field for their powerful computation capacity and superior performance without human-designed features. By learning multilevel feature representations, deep learning models are effective in TSR. Yu et al. [96] applied the Gaussian-Bernoulli deep Boltzmann machine (GB-DBM) model to TSR tasks. The detected traffic sign point clouds were projected onto 2D images, which produced a bounding box on the image. TSR was performed within the bounding box. To train a TSR classifier, they arranged the normalized image into a feature vector and then input it into the hierarchical classifier. Then, a GB-DBM model was constructed to classify traffic signs. This model contains three hidden layers and one logistic regression layer, which contains a number of activation units. These units were linked to the predefined classes. Guan et al. [108] also conducted TSR using the GB-DBM model. Their method was shown to handle large-volume MLS point clouds rapidly for TSD and TSR. Arcos-García et al. [63] applied a DNN model to perform TSR on the segmented RGB images. This DNN model combined several convolutional, spatial transformer, non-linearity, contrast normalization, and max-pooling layers, which proved useful in the TSR stage. These deep learning methods have been shown to generate good experimental results. However, the point cloud traffic sign detection results affect the recognition results. If a traffic sign is missed in the detection stage, the subsequent TSR results will be inaccurate.
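The soft voting over multi-view images mentioned above for [109] can be illustrated with a short sketch. It assumes that each view yields a per-class probability dictionary and that the views are averaged with equal weight; the actual weighting in [109] may differ, and `soft_vote` and the class labels are hypothetical.

```python
def soft_vote(view_probabilities):
    """Combine per-view class probabilities by averaging.

    view_probabilities: list of dicts mapping class label -> probability,
    one dict per image view of the same traffic sign.
    Returns (best_label, mean_probabilities).
    """
    if not view_probabilities:
        raise ValueError("at least one view is required")
    labels = set().union(*view_probabilities)
    n = len(view_probabilities)
    mean = {label: sum(v.get(label, 0.0) for v in view_probabilities) / n
            for label in labels}
    best = max(mean, key=mean.get)
    return best, mean


# Three views of one sign; the second view alone would pick the wrong class.
views = [
    {"stop": 0.6, "yield": 0.4},
    {"stop": 0.3, "yield": 0.7},
    {"stop": 0.8, "yield": 0.2},
]
label, mean = soft_vote(views)
print(label, mean)
```

Averaging probabilities, rather than counting hard per-view decisions, lets a confident view outweigh an ambiguous one, which is the usual motivation for soft over hard voting.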

Figure 12. Methods and experimental results for tree extraction from MLS point clouds.

Table 2. Product specifications of several MLS systems.

Table 3. Summary of several road surface extraction methods.

Table 4. Comparison of road marking extraction methods.

Table 5. Attributes of traffic signs.