Outdoor Scene Understanding Based on Multi-Scale PBA Image Features and Point Cloud Features

Outdoor scene understanding based on the results of point cloud classification plays an important role in mobile robots and autonomous vehicles equipped with a light detection and ranging (LiDAR) system. In this paper, a novel model named Panoramic Bearing Angle (PBA) image is proposed, which is generated from 3D point clouds. In a PBA model, laser point clouds are projected onto a spherical surface to establish the correspondence between laser ranging points and image pixels, and the relative locations of the laser points in 3D space are then used to calculate the gray values of the corresponding pixels. To extract robust features from 3D laser point clouds, an image pyramid model and a point cloud pyramid model are utilized to extract multi-scale features from the PBA images and the original point clouds, respectively. A Random Forest classifier is used to perform feature screening on the extracted high-dimensional features and to obtain the initial classification results. Moreover, reclassification is carried out to correct misclassified points by remapping the classification results into the PBA images and applying superpixel segmentation, which makes full use of the contextual information between laser points. Within each superpixel block, reclassification is performed based on the initial classification results, so as to correct some misclassified points and improve the classification accuracy. Two datasets published by ETH Zurich and MINES ParisTech are used to test the classification performance, and the results are reported in terms of precision and recall.


Introduction
Outdoor scene understanding based on mobile laser scanning (MLS) point cloud data is a fundamental capability for unmanned vehicles and autonomous mobile robots navigating in urban environments. Recently, a variety of laser point cloud processing methods have been presented to recognize the main elements of the road environment [1], to accomplish robust place recognition [2], to extract parameters of trees [3], and so on. Moreover, the point clouds obtained from a laser scanner can also be utilized to accomplish real-time shape acquisition [4], outdoor 3D laser data classification [5], and outdoor scene understanding [6]. A state-of-the-art review of object recognition, segmentation, and classification of MLS point clouds was given in [7].
In order to reduce the computational complexity of feature extraction and classification, some scholars have converted 3D laser point clouds into 2D images and used image processing methods to process the 3D point clouds, such as the range image [8], the reflectance image [9], and the bearing angle (BA) image [10]. The BA image was originally used to solve the calibration problem between a camera and a laser scanner [10], and it has clearer texture details than the range image. The other approach is to use a 3D laser scanner to obtain the 3D point clouds directly. However, these 3D point clouds are composed of several groups of scanning data and are usually unordered when stored, so they cannot be represented by a matrix. In addition, in most public laser scanning datasets, no scan sequence relationships are stored between different laser scans. To solve this problem, a novel Panoramic Bearing Angle (PBA) image model is proposed in this paper and introduced as follows.

Projection of 3D Laser Point Cloud to Pixel Plane
Viewpoint selection is a crucial step for generating 2D images from 3D laser point clouds. For fixed-point scanning, the location of the rotating 2D laser range finder is selected as the viewpoint. For on-the-fly scanning, the viewpoint is usually selected on the trajectory of the moving laser range finder. Suppose that a selected viewpoint of a 3D point cloud is V(x_v, y_v, z_v), a laser point in the cloud is P_i(x_i, y_i, z_i), and the matrix size of the 2D image to be generated is M × N. As shown in Figure 2, a spherical coordinate system is established in which the viewpoint V is the center of the sphere. It should be noted that the size of the panoramic image is only related to the resolution of the image (the size of the image matrix), regardless of the size of the projection surface. According to (1), the original 3D laser point P_i(x_i, y_i, z_i) is converted from the global coordinate system to the spherical coordinate system with the viewpoint V as the center of the sphere, giving the point P_i(r_i, θ_i, φ_i):

r_i = \sqrt{(x_i - x_v)^2 + (y_i - y_v)^2 + (z_i - z_v)^2}, \quad \theta_i = \arccos\left(\frac{z_i - z_v}{r_i}\right), \quad \varphi_i = \arctan\left(\frac{y_i - y_v}{x_i - x_v}\right)    (1)

where θ_i ∈ [0, π] and φ_i ∈ [0, 2π].
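The conversion in Equation (1) can be sketched directly in code. This is a minimal sketch; the atan2-based azimuth, wrapped into [0, 2π), is our reading of the arctan term, since a plain arctan cannot cover the full azimuth range.

```python
import math

def to_spherical(p, v):
    # Equation (1): express laser point p in spherical coordinates
    # (r, theta, phi) centered at the viewpoint v.
    dx, dy, dz = p[0] - v[0], p[1] - v[1], p[2] - v[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    theta = math.acos(dz / r)                   # polar angle, in [0, pi]
    phi = math.atan2(dy, dx) % (2 * math.pi)    # azimuth, in [0, 2*pi)
    return r, theta, phi
```

For example, a point two meters directly above the viewpoint maps to r = 2 with θ = 0, i.e., the north pole of the projection sphere.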

According to (2), M warps l_m and N + 1 wefts l_p are drawn, which divide the sphere into M × N independent grids:

l_m: \varphi = \frac{2\pi}{M} m, \; m = 0, 1, \ldots, M - 1; \qquad l_p: \theta = \frac{\pi}{N} p, \; p = 0, 1, \ldots, N    (2)

The left image in Figure 3 is a spherical coordinate system which is divided into 64 grids by eight warps and nine wefts (the two poles are included).

Take the center of the sphere V as the starting point and make a ray through each laser scanning point P_i(r_i, θ_i, φ_i), so that the laser point is projected onto a grid of the sphere. If more than one laser point projects into a grid, the one closest to the center of the sphere is retained. Then cut the spherical surface along the 0-degree warp and spread it onto the horizontal plane to obtain the 2D matrix of the PBA image (see the right image of Figure 3).
As shown in Figure 4a, a 3D laser point cloud is obtained at the fixed scanning point V, and Figure 4b is the corresponding panoramic image, displayed in binary form. A white pixel indicates that a laser scanning point corresponds to it, while a black pixel indicates that no laser point corresponds to it.
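The projection and nearest-point retention described above can be sketched as follows. This is a minimal sketch; the row/column index layout (rows over θ, columns over φ) is our assumption.

```python
import math

def project_to_grid(points, v, M, N):
    """Project points onto an M x N spherical grid around viewpoint v,
    keeping only the point nearest to the sphere center in each cell."""
    nearest = {}  # (row, col) -> (range, point)
    for p in points:
        dx, dy, dz = p[0] - v[0], p[1] - v[1], p[2] - v[2]
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        theta = math.acos(dz / r)
        phi = math.atan2(dy, dx) % (2 * math.pi)
        row = min(int(theta / math.pi * M), M - 1)
        col = min(int(phi / (2 * math.pi) * N), N - 1)
        if (row, col) not in nearest or r < nearest[(row, col)][0]:
            nearest[(row, col)] = (r, p)
    return nearest

def binary_image(nearest, M, N):
    # White (255) where a laser point projects into the cell, black (0) otherwise,
    # as in the binary panoramic image of Figure 4b.
    return [[255 if (i, j) in nearest else 0 for j in range(N)] for i in range(M)]
```

Two points on the same ray fall into the same cell, and only the closer one survives, which matches the occlusion behavior of a real panoramic scan.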



Calculation of Image Gray Value
There are many classical image models for representing laser points stored in a 2D matrix, such as the reflectance image, the range image, and the bearing angle (BA) image. However, the reflectance image is less robust, and the edge description in the range image is not clear enough, especially in large-scale scenes. The quality of the BA image depends on the selection of the viewpoint position. In addition, grayscale changes may appear in the BA image. As shown in Figure 5, the gray values for the same railing are inconsistent, which is not beneficial to feature extraction and classification.

In order to overcome the above limitations, a novel PBA image model is proposed in this paper, inspired by the BA model, which is not related to the selection of viewpoints. Moreover, the PBA image model can provide stable gray values for the same object and also ensure clear texture and high image contrast with high computational efficiency.
Here we explain how to calculate the gray value of each pixel in the PBA image. As shown in Figure 6, there are M rows in the image matrix, and the image pixel corresponding to the laser scanning point P is defined as P_{x,y}, located in row x and column y. Two neighboring laser points P_l and P_r of point P are chosen according to the image position, where Ψ(·) represents the image pixel of a laser point: if the pixel is in the upper part of the image, its upper-left and upper-right pixels are selected as neighboring pixels; otherwise, the lower-left and lower-right pixels are selected. The pixel gray value of P_{x,y} is then defined as a function of the angle α between P and its neighboring laser scanning points P_l and P_r, which is obtained from the distances VP, VP_l, and VP_r between the center of the sphere V and the laser points P, P_l, and P_r, respectively.
An example of a PBA gray image is given in Figure 7. Compared with the BA image in Figure 5, the gray values for the same railing are consistent, and the boundaries of the objects in the scene are clearer.
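Since the gray value is driven by the angle between a point and its neighbors, a BA-style realization can be sketched as follows. This is a sketch under stated assumptions: the law-of-cosines angle, the averaging over the two neighbors, and the linear scaling of α from [0, π] to [0, 255] are ours, not the paper's exact formula.

```python
import math

def bearing_angle(vp, vq, gamma):
    # Angle between the viewing ray V->P and the segment P->Q, given the
    # ranges vp = |VP|, vq = |VQ| and the angular step gamma between the
    # two beams (law of cosines in triangle V-P-Q).
    pq = math.sqrt(vp * vp + vq * vq - 2.0 * vp * vq * math.cos(gamma))
    return math.acos((vp - vq * math.cos(gamma)) / pq)

def pba_gray(vp, vl, vr, gamma):
    # Average the angles toward the left and right neighbors and scale
    # linearly to [0, 255] (both choices are our assumptions).
    alpha = 0.5 * (bearing_angle(vp, vl, gamma) + bearing_angle(vp, vr, gamma))
    return int(round(alpha / math.pi * 255))
```

For two equidistant points separated by 90 degrees, the angle is π/4, i.e., a mid-dark gray; points on a smooth surface seen at a grazing angle produce very different α than points on a frontal surface, which is what gives the image its texture.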


Laser Point Clouds Classification Using Multi-Scale PBA Image Features and Point Cloud Features
It is important to select the neighborhood range of the laser points in the feature extraction step. In our work, the image pyramid model is adopted to extract the texture features of PBA images on multiple scales. The point cloud pyramid model is then used to extract the local features of the 3D point cloud on multiple scales.

Multi-Scale PBA Image Feature Extraction
In our work, feature extraction is accomplished in 2D gray images on multiple scales. When the scale is large, the computational cost is very high. Therefore, the PBA image is downsampled by using the image pyramid model [19]. The image pyramid model for PBA images is given in Figure 8. It should be noted that the image in each layer of the pyramid model is generated directly from the 3D laser point cloud, rather than from the downsampling of the original image.
Local Binary Pattern (LBP) is a kind of image texture feature, which is extracted from the multi-resolution PBA images. For the classic LBP feature, eight fixed neighborhood pixels are selected (see Figure 9a). In order to extract multi-scale texture features, an improved neighborhood selection method [20] is adopted for LBP feature extraction in our work, in which a circular neighborhood is selected with variable radius r. The pixel coordinates of the neighborhood points (x_p, y_p) are obtained as

x_p = x_c + r\cos(2\pi p/P), \qquad y_p = y_c - r\sin(2\pi p/P), \qquad p = 0, 1, \ldots, P - 1

where (x_c, y_c) is the pixel coordinate of the center pixel and P is the number of sampling points. As shown in Figure 9a,b, r is selected as 1 and 2, respectively.
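The circular-neighborhood LBP can be sketched as follows. Nearest-pixel sampling is used instead of bilinear interpolation for brevity; that simplification is ours.

```python
import math

def lbp_code(img, xc, yc, r, P=8):
    # LBP with a circular neighborhood of radius r (Figure 9): neighbors lie
    # at x_p = x_c + r*cos(2*pi*p/P), y_p = y_c - r*sin(2*pi*p/P), rounded to
    # the nearest pixel; each neighbor >= center contributes one bit.
    center = img[yc][xc]
    code = 0
    for p in range(P):
        a = 2.0 * math.pi * p / P
        xp = int(round(xc + r * math.cos(a)))
        yp = int(round(yc - r * math.sin(a)))
        if img[yp][xp] >= center:
            code |= 1 << p
    return code
```

On a perfectly flat gray patch every neighbor equals the center, so all eight bits are set and the code is 255; edges and texture break this symmetry and produce distinctive codes.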
The PBA image is further quantized into a simplified 3-level image by thresholding the original pixel gray value G_old (black-0; gray-127; white-255). Figure 10 shows an example of the simplified 3-level PBA image, in which four categories of typical local scenes show distinct texture features: artificial ground (top left), natural ground (bottom left), buildings (top right), and vegetation (bottom right).

When feature extraction in the different layers of the image pyramid model for the PBA images is completed, the features in the different layers need to be fused. Starting from the top-layer image of the pyramid, the image features are upsampled and then superimposed with the image features of the next layer. These two steps are repeated until the features in all layers are superimposed on the image at the bottom layer of the image pyramid model.
In summary, the (P + 1)-layer image pyramid model of PBA images is built from the original laser point cloud, and each layer of PBA images is converted to a 3-level gray image. LBP features are then extracted for each image pixel on m scales. Finally, the features in the different layers are superimposed together from the top layer to the bottom layer. Therefore, there are m × (P + 1) image features for every pixel in the original PBA image.
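The top-down fusion described above can be sketched as follows. This is a minimal sketch; nearest-neighbor upsampling by pixel-index halving and simple per-pixel feature concatenation are our assumptions, and each layer is assumed to be exactly half the resolution of the one below it.

```python
def fuse_pyramid(layers):
    """Superimpose multi-scale features from the top pyramid layer down:
    upsample the accumulated feature vectors 2x, then append the next
    layer's feature. `layers` is ordered bottom (full resolution) to top;
    returns one feature vector per bottom-layer pixel."""
    fused = [[[v] for v in row] for row in layers[-1]]        # start at the top
    for fmap in reversed(layers[:-1]):
        fused = [[fused[i // 2][j // 2] + [fmap[i][j]]        # upsample + append
                  for j in range(len(fmap[0]))]
                 for i in range(len(fmap))]
    return fused
```

With a 2x2 bottom layer and a 1x1 top layer, every bottom pixel ends up with a two-element vector: the coarse top-layer value followed by its own fine-scale value.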

Multi-Scale Point Cloud Feature Extraction
In our work, features are extracted from 3D laser point clouds on multiple scales. However, when the neighborhood radius is expanded at a linear rate, the number of neighborhood points of a laser point is approximately increased at a cubic speed, which greatly increases the computational burden. In order to solve this problem, the point cloud pyramid model is derived which is inspired by the image pyramid model in image processing.
Similar to the image pyramid model, a downsampling algorithm is applied to the original point cloud to build the point cloud pyramid model. The voxel model is used to divide the laser point cloud to be downsampled into different 3D grids. Then the center of gravity of the laser points in each voxel (3D grid) is calculated to represent all the points in the voxel. An illustration of the point cloud pyramid model is shown in Figure 11, in which the bottom layer is the original laser point cloud. A fixed number of laser points is then selected as neighborhood points in each layer of the point cloud pyramid model.
After determining the neighborhood range of each laser point, feature extraction is performed, which includes statistical features, geometric morphological features, and histogram features.
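The voxel-grid downsampling step can be sketched as follows; this is a minimal pure-Python sketch, and the voxel edge length is a free parameter.

```python
def voxel_downsample(points, voxel):
    """Build one pyramid layer: group points into cubic voxels of edge
    length `voxel` and replace each group by its center of gravity."""
    cells = {}
    for x, y, z in points:
        key = (int(x // voxel), int(y // voxel), int(z // voxel))
        cells.setdefault(key, []).append((x, y, z))
    out = []
    for pts in cells.values():
        n = len(pts)
        out.append((sum(p[0] for p in pts) / n,
                    sum(p[1] for p in pts) / n,
                    sum(p[2] for p in pts) / n))
    return out
```

Applying the function repeatedly with a growing voxel size yields the successive, sparser layers of the point cloud pyramid.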

Statistical Features
Let the total number of laser points in the current neighborhood be (k + 1), and the height of the lowest point in the neighborhood be h_min. In our work, five statistical features are extracted, which are: • h, the absolute height of the laser point; • ∆h = h − h_min, the relative height between the laser point and the lowest laser point in the neighborhood; • σ_h, the standard deviation of the laser points' heights in the neighborhood; • r, the radius of the maximum bounding sphere of the neighborhood; • d = (k + 1) / ((4/3)πr³), the point density in the neighborhood.
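Under these definitions, the five features can be computed as below. This is a sketch; the neighborhood list is assumed to contain the query point itself, so it has k + 1 entries, and heights are the z coordinates.

```python
import math

def statistical_features(neighborhood, query):
    """Five statistical features of a laser point: absolute height h,
    relative height dh = h - h_min, height standard deviation sigma_h,
    radius r of the bounding sphere around the query point, and the
    point density d = (k+1) / ((4/3)*pi*r^3)."""
    heights = [p[2] for p in neighborhood]
    k1 = len(neighborhood)                      # k + 1 points in total
    h = query[2]
    dh = h - min(heights)
    mean_h = sum(heights) / k1
    sigma_h = math.sqrt(sum((z - mean_h) ** 2 for z in heights) / k1)
    r = max(math.dist(p, query) for p in neighborhood)
    d = k1 / ((4.0 / 3.0) * math.pi * r ** 3)
    return h, dh, sigma_h, r, d
```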

Morphological Features
According to the summary in [15], a covariance matrix is adopted to describe the 3D laser point cloud in the neighborhood, where p_c is the current query point and p_i is a neighborhood point around the query point. The covariance matrix can be expressed as

C = \frac{1}{k} \sum_{i=1}^{k} (p_i - p_c)(p_i - p_c)^T

which is a three-dimensional positive semi-definite matrix. By eigen decomposition, three eigenvalues λ_1, λ_2, λ_3 (with λ_1 ≥ λ_2 ≥ λ_3 ≥ 0) and the corresponding eigenvectors e_1, e_2, e_3 are obtained. In our work, nine morphological features are extracted, which are Linearity L_λ, Planarity P_λ, Sphericity S_λ, Omnivariance O_λ, Anisotropy A_λ, Eigenentropy E_λ, Sum Σ_λ, Change of Curvature C_λ, and Verticality V_λ. Following [15], these features can be calculated as:

L_λ = (λ_1 − λ_2)/λ_1, \quad P_λ = (λ_2 − λ_3)/λ_1, \quad S_λ = λ_3/λ_1, \quad O_λ = (λ_1 λ_2 λ_3)^{1/3}, \quad A_λ = (λ_1 − λ_3)/λ_1,
E_λ = −\sum_{i=1}^{3} λ_i \ln λ_i, \quad Σ_λ = λ_1 + λ_2 + λ_3, \quad C_λ = λ_3/(λ_1 + λ_2 + λ_3), \quad V_λ = 1 − |⟨[0, 0, 1]^T, e_3⟩|.
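A sketch of the eigenvalue feature computation follows (using numpy). Centering the covariance on the query point p_c follows the text above, though some formulations center on the neighborhood mean instead; the small floor on the eigenvalues, which keeps the entropy finite, is our addition.

```python
import numpy as np

def eigen_features(neighborhood, pc):
    # Covariance of the neighborhood around the query point pc, followed
    # by the eigenvalue-based morphological features (after [15]).
    P = np.asarray(neighborhood, dtype=float)
    d = P - np.asarray(pc, dtype=float)
    C = d.T @ d / len(P)
    lam = np.sort(np.linalg.eigvalsh(C))[::-1]       # l1 >= l2 >= l3 >= 0
    l1, l2, l3 = np.maximum(lam, 1e-12)              # floor avoids log(0)
    e3 = np.linalg.eigh(C)[1][:, 0]                  # eigenvector of smallest l
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
        "omnivariance": (l1 * l2 * l3) ** (1.0 / 3.0),
        "anisotropy": (l1 - l3) / l1,
        "eigenentropy": -sum(l * np.log(l) for l in (l1, l2, l3)),
        "sum": l1 + l2 + l3,
        "curvature": l3 / (l1 + l2 + l3),
        "verticality": 1.0 - abs(e3[2]),
    }
```

A neighborhood lying on a straight line should score near 1 in linearity and near 0 in sphericity, which is the behavior these features are designed to capture.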

Histogram Features
Fast point feature histograms (FPFH) are a set of 33-dimensional histogram features [21]. Compared to the morphological features, FPFH can describe the geometry of the query point's neighborhood in more detail and represent the roughness of a surface effectively, which can be used to distinguish the two typical road surfaces (artificial ground and natural ground). As shown in Figure 12a, FPFH consists of two Simplified Point Feature Histograms (SPFH): one is computed from the query point p and its neighborhood points p_k (the points in the red circle), and the other from each neighborhood point p_k and its own neighborhood points (the points in the blue circle). FPFH is defined as

FPFH(p) = SPFH(p) + \frac{1}{k} \sum_{i=1}^{k} \frac{1}{w_i} \, SPFH(p_i)    (11)

where k stands for the number of neighborhood points around the query point p and w_i is a distance weight, which measures the density between the neighborhood points and the query point. SPFH is composed of Simplified Point Features (SPF). SPF is a three-dimensional angular feature descriptor that represents the positional relationship between two laser points. As shown in Figure 12b, P_2 is a laser point in the neighborhood of P_1, and n_1 and n_2 are the normal vectors of P_1 and P_2. According to (12), the UVW coordinate system is established with P_1 as the coordinate origin:

u = n_1, \qquad v = u \times \frac{P_2 - P_1}{\|P_2 - P_1\|}, \qquad w = u \times v    (12)

The angular parameters δ, α, θ are used to describe the positional relationship between the two laser points, which can be defined as follows:

δ = u \cdot \frac{P_2 - P_1}{\|P_2 - P_1\|}, \qquad α = v \cdot n_2, \qquad θ = \arctan(w \cdot n_2, \, u \cdot n_2)    (13)

Although FPFH can describe the geometric characteristics of the laser point cloud in more detail, it increases the computational burden significantly. Therefore, we only extract FPFH features for the laser points at the bottom of the point cloud pyramid, while the other 14 point cloud features (five statistical features and nine morphological features) are extracted for each laser point in every layer of the point cloud pyramid.
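The SPFH weighting scheme of Equation (11) can be sketched as follows. Histograms are plain lists here, and w_i is taken as the query-to-neighbor distance; the function name is ours.

```python
def fpfh(spfh, neighbor_spfhs, weights):
    """Combine SPFH histograms into an FPFH for a query point:
    FPFH(p) = SPFH(p) + (1/k) * sum_i SPFH(p_i) / w_i,
    where w_i is the distance between the query point and neighbor i."""
    k = len(neighbor_spfhs)
    out = list(spfh)                       # copy the query point's own SPFH
    for hist, w in zip(neighbor_spfhs, weights):
        for b in range(len(out)):
            out[b] += hist[b] / (k * w)    # distance-weighted contribution
    return out
```

Distant neighbors (large w_i) contribute less, so the descriptor stays dominated by the query point's immediate surface geometry.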

Classification with Random Forest and Reclassification Based on the Contextual Information
In this paper, the Random Forest classifier is used to perform feature screening on the extracted high-dimensional features, and the initial classification of the 3D laser point clouds is implemented. Since this method does not consider the contextual information between laser points, the credibility of the classification results is low for objects with similar local features (such as eaves and vegetation). In order to make full use of the contextual information between laser points, the classification results are remapped into the PBA images, and superpixel segmentation is performed on the PBA images. Within each superpixel block, classification is performed again based on the initial classification results, so as to correct some misclassified points and further improve the classification accuracy.
The Random Forest classifier is composed of multiple decision tree classifiers. In the training stage, training samples are randomly selected to train each decision tree. In the classification stage, a number of decision trees are randomly selected, and the mode of their output categories is taken as the final classification result. Figure 13 shows the classification results obtained with the Random Forest classifier together with the ground truth. Seven different colors are used to represent the seven categories: dark gray for artificial ground, yellow for natural ground, dark green for high vegetation, light green for low vegetation, red for buildings, dark brown for railings, and silver for cars. From the classification results, we can see that the main objects, such as buildings, ground, cars, and vegetation, can be effectively classified.
By comparing the classification results with the ground truth, we find that a large number of laser points that do not belong to vegetation are classified as vegetation. This is due to the cluttered distribution of these laser points, whose local features are very close to those of vegetation. Therefore, the laser point clouds are reclassified by considering the contextual information of the 3D laser point clouds based on the PBA images.
In this paper, SEEDS-based superpixel segmentation is performed on the PBA images [22]. For each superpixel block, if the pixel proportion of vegetation is less than a threshold, the laser points corresponding to vegetation are reclassified into the category with the highest pixel proportion in the block. This strategy makes full use of the contextual information of the 3D laser point cloud in 2D images, which can reduce the error rate of the point cloud classification. As shown in Figure 14, the initial classification result based on the Random Forest classifier is at the top left and the reclassification result is at the top right. The bottom left and bottom right images show enlarged local details of the initial classification result and the reclassification result, respectively. After reclassification, most of the previously misclassified point clouds are corrected.
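The per-block reclassification rule can be sketched as follows. The label strings and the 1/8 default threshold follow the experiments described later; the function name is ours.

```python
from collections import Counter

def reclassify_block(labels, target="vegetation", threshold=0.125):
    """Within one superpixel block: if the proportion of `target` labels
    falls below `threshold`, reassign those pixels to the block's most
    frequent remaining category."""
    counts = Counter(labels)
    if counts[target] == 0 or counts[target] / len(labels) >= threshold:
        return labels                       # nothing to correct in this block
    others = [c for c in counts if c != target]
    if not others:
        return labels                       # block is pure `target`
    majority = max(others, key=counts.get)
    return [majority if l == target else l for l in labels]
```

A block that is almost entirely building with a few stray vegetation pixels gets cleaned up, while a genuinely vegetated block is left untouched.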


Classification Results of 3D Point Clouds Obtained in Fixed-Point Scanning Mode
In this subsection, a 3D laser point cloud dataset published by ETH Zurich is selected to verify the algorithm. This dataset includes 15 typical scenes. Two typical scenes are selected for testing, and the remaining scenes are used for training. The two testing sets contain seven categories, which are represented by seven different colors: dark gray for artificial ground, yellow for natural ground, dark green for high vegetation, light green for low vegetation, red for buildings, dark brown for railings, and silver for cars. The category distribution of the two testing sets is shown in Table 1.
According to the initial classification results, it can be seen that the recall rates of vegetation and natural ground are very low. A large number of laser points that belong to cars and buildings are misclassified as vegetation, and a large number of laser points that belong to artificial ground are misclassified as natural ground. For the misclassified categories (vegetation and natural ground), reclassification is carried out.
Superpixel segmentation is used for the reclassification. In this paper, the PBA image is segmented into 2025 superpixel blocks. For each superpixel block, if the pixel proportion of vegetation is less than 1/8, the laser points corresponding to vegetation are reclassified into the category with the highest pixel proportion in the block. If a superpixel block contains both natural ground and artificial ground, the laser points belonging to the category with the smaller proportion are reclassified into the category with the larger proportion. The images in the middle of Figures 15 and 16 are the reclassification results, and the images on the right of Figures 15 and 16 show the ground truth. The recall and precision rates are given in Tables 4 and 5.
After reclassification, the recall rates of vegetation and natural ground are improved. However, for Testing Set A, the recall rate of low vegetation is still not high: a large number of laser points belonging to motorcycles are classified as low vegetation. Since motorcycles are not considered as a separate category, the lower recall rate for low vegetation is acceptable.
In addition, for Testing Set B, the precision rate of natural ground declines markedly owing to the large disparity in area between artificial and natural ground: after reclassification, some laser points belonging to natural ground are classified as artificial ground. Although this strategy sacrifices the precision of natural ground classification, it greatly improves the precision of artificial ground classification, and the classification of the whole scene is better overall.
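The per-class recall and precision figures discussed above follow the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP)). A minimal sketch, assuming ground-truth and predicted labels are given as integer arrays; the function name `per_class_metrics` is a hypothetical helper, not part of the paper's code:

```python
import numpy as np

def per_class_metrics(y_true, y_pred, classes):
    """Return {class: (precision, recall)} computed per category."""
    metrics = {}
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics[c] = (precision, recall)
    return metrics
```

This makes the trade-off in Testing Set B explicit: moving natural-ground points into artificial ground lowers natural-ground recall's complement (its precision) while raising artificial-ground precision.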

Classification Results of 3D Point Clouds Obtained in On-the-Fly Scanning Mode
In this subsection, a 3D laser point cloud dataset published by MINES ParisTech is selected to verify the algorithm. Since the data are obtained by on-the-fly scanning, pre-processing is performed to filter out laser points with large errors; simple cropping and downsampling are also performed to remove points scanned into the interior of buildings. A typical scene is selected for testing, and its category distribution is shown in Table 6. In on-the-fly scanning mode, multiple PBA images are needed to fully represent the 3D scene. As shown in Figure 17, the red ray approximates the trajectory of the data acquisition vehicle, whose length is about 80 m, and the five red triangles are viewpoints selected on the trajectory. The images at the top and bottom of Figure 17 are the PBA images obtained from the five viewpoints. Due to the low density of data acquired in on-the-fly scanning, the resolution of the image at the bottom of the image pyramid is set to 720 × 360 and FPFH features are not extracted. Since the scene contains only four categories, reclassification is not carried out. The classification results are shown in Figure 18.
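The per-viewpoint PBA images are built by projecting the points onto a spherical grid centred at each viewpoint. The sketch below shows only this spherical mapping to a 720 × 360 pixel grid; it is a simplified illustration under assumed conventions (azimuth/elevation axes, function name `spherical_projection`), and it omits the paper's bearing-angle computation that assigns the actual gray value from the relative location of neighboring ranging points.

```python
import numpy as np

def spherical_projection(points, viewpoint, width=720, height=360):
    """Map 3D points to (u, v) pixel indices on a spherical image
    centred at `viewpoint`; also returns the range of each point."""
    rel = points - viewpoint
    r = np.linalg.norm(rel, axis=1)
    azimuth = np.arctan2(rel[:, 1], rel[:, 0])                  # [-pi, pi]
    elevation = np.arcsin(np.clip(rel[:, 2] / np.maximum(r, 1e-9), -1.0, 1.0))
    # Azimuth spans the image width, elevation the height (top = zenith).
    u = ((azimuth + np.pi) / (2 * np.pi) * (width - 1)).astype(int)
    v = ((np.pi / 2 - elevation) / np.pi * (height - 1)).astype(int)
    return u, v, r
```

With one such grid per selected viewpoint, the five PBA images of Figure 17 together cover the 80 m trajectory.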
We also compare the classification results with Weinmann's work [14]. Weinmann selected a fixed neighborhood scale for point clouds and 21-dimensional features were extracted for each laser point. The comparison of classification results is shown in Table 7. It can be seen that the method proposed in this paper has obvious advantages for the classification of small objects such as railings and cars.

Conclusions
This paper presents an approach to 3D laser point cloud classification for outdoor scene understanding in urban environments. To improve the performance of point cloud classification, a new transformation model is proposed to transform point clouds into PBA images. Owing to the correspondence between the original point cloud and the PBA image, multiple-scale features are extracted from both the point clouds and the PBA images, and a Random Forest classifier is then adopted to obtain the initial classification results. To correct misclassified points, reclassification is performed by remapping the classification results into the PBA images and applying superpixel segmentation. Finally, a series of experiments on two public datasets, published by ETH Zurich and MINES ParisTech, demonstrates the validity and robustness of the proposed method.