
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license.

Curb detection is an essential component of Autonomous Land Vehicles (ALVs), and is especially important for safe driving in urban environments. In this paper, we propose a fusion-based curb detection method that exploits 3D-Lidar and camera data. More specifically, we first fuse the sparse 3D-Lidar points and high-resolution camera images to recover a dense depth image of the captured scene. Based on the recovered dense depth image, we propose a filter-based method to estimate the surface normal direction at each pixel. Then, using multi-scale normal patterns derived from the curb's geometric properties, curb point features fitting these patterns are detected in the normal image row by row. After that, we construct a Markov Chain to model the consistency of curb points, which exploits the continuity of the curb, so that the optimal curb path linking the curb points can be efficiently estimated by dynamic programming. Finally, we perform post-processing to filter outliers, parameterize the curbs, and assign confidence scores to the detected curbs. Extensive evaluations clearly show that our method detects curbs robustly and at real-time speed in both static and dynamic scenes.

Curb detection is a crucial component in both Autonomous Land Vehicles (ALV) and Advanced Driver Assistance Systems (ADAS). Robust curb detection in real environments can undoubtedly improve driving safety and benefit those systems in various tasks. In urban environments, curbs delimit the drivable area; in effect they act like obstacles, since vehicles should not drive across them. Curbs can also support map building and vehicle localization [

Curbs are continuous objects in the road scene, and thus they have specific geometric and visual properties that serve as the backbone for robust curb detection. Traditionally, these properties are captured separately by different sensors: range sensors measure the geometric property, while visual sensors capture the visual appearance. We now introduce the properties of curbs in detail and then explain how our method uses them to detect curbs. From the geometric model shown in

From the road images containing curbs, as shown in

Though curbs have the aforementioned specific properties, curb detection in various environments remains a challenging problem, even with state-of-the-art sensors. The major difficulty is that curbs exhibit only a subtle variation in height compared with obstacles. Such subtle range changes are easily confused with noise and are hard to detect with range sensors. On the other hand, for visual sensors, the edges of curbs are prone to being confused with other objects, and this is even worse in cluttered scenes.

Thus, range-visual fusion seems to be a promising approach for robust curb detection. Traditional range-visual fusion methods [

A diagram illustrating the above four steps is given in

The major contributions of this paper can be briefly summarized as follows:

To the best of our knowledge, we are among the first to use a dense depth image, obtained by effectively fusing the 3D-Lidar and camera data, for curb detection. The advantages of such data fusion for curb detection are well demonstrated in the experiments.

We propose a novel filter-based method for efficient surface normal estimation, and we show that the normal image can be used to accurately detect curb point features.

We build a Markov Chain model on the curb point detection results, which elegantly captures the consistency property for curb point linking. The optimal curb path for each side is then found by simple dynamic programming, which is computationally cheap.

We propose several effective and feasible post-processing steps to filter out the outliers, to parameterize the curbs, and to obtain the confidence scores.

Based on the above proposed methods, we obtain robust curb detection results efficiently in various conditions (including quite adverse ones) within a long range, up to 30 m from the sensors. To the best of our knowledge, this is the best result achieved for curb detection so far.

The remainder of the paper is organized as follows: in Section 2, we briefly review the existing curb detection methods with different sensors. In Section 3, we provide a detailed description of our proposed method for curb detection. In Section 4, comprehensive experiments are demonstrated in various scenes along with the quantitative and qualitative illustrations of our method. Finally, we give our conclusions and the future directions in Section 5.

The research on curb detection has a long history [

Camera-based curb detection methods have the advantage of low cost, but their performance is highly sensitive to outdoor conditions. For example, in low-illumination, textureless, or cluttered scenes, their curb detection results may be unstable. Monocular methods utilize edge cues, and/or use texture information combined with machine learning techniques [

In contrast with camera-based methods, Lidar-based methods can achieve reliable and accurate results within their valid range. However, commonly used Lidar sensors can only provide sparse data within a certain range, so their applicability is severely restricted to a limited area. For instance, a 2D-Lidar can only measure the distance in one scanning plane at a time [

Fusion-based methods generally achieve better results than pure camera- or Lidar-based methods by integrating different sources of information. The principle of existing fusion-based methods is to first estimate reliable curbs in the near region with range data, and then extend this result to farther regions using the image data [

No matter what sensors are used, all the above methods need a curb model. There are various models used in different methods, such as the straight line and line segment chain model [

Other cues can also help to improve curb detection, such as the static property of the curb and the detection results of obstacles. Curbs are static relative to the road; hence, multi-frame data can be aligned to keep the persistent detections for denoising [

An overview of our method is provided in the Introduction. In this section, we introduce the details of the four components of our proposed method along with the implementation details.

Geometric properties are important cues for curb detection. However, as mentioned, the geometric properties measured solely from the sparse range data are not robust and reliable for curb detection. In order to more effectively utilize the geometric properties of curbs, in this work, we propose to first recover a dense depth image of the scene from the sparse range data and the camera image.

The recovered depth image is located within two coordinate systems: the image coordinate system (U, V) and the camera coordinate system, whose X_{C}, Y_{C}, and Z_{C} axes are parallel to the U-axis, the V-axis, and the optical axis, respectively. The units of the image coordinates are pixels, while the units of the camera coordinates are meters.

u = f_{x} X_{C}/Z_{C} + c_{x},  v = f_{y} Y_{C}/Z_{C} + c_{y}

In the projection model, f_{x} and f_{y} denote the focal lengths (in pixels) along the two image axes, and c_{x} and c_{y} denote the coordinates of the principal point.

Let _{C} in

Here we provide an illustrating example in

A real example of depth recovery in outdoor scenes is demonstrated in
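The depth recovery step can be sketched as follows. This is a simplified stand-in, not the GPU-based recovery algorithm used in the paper: the hypothetical helper `densify_depth` simply interpolates the sparse Lidar depths, after projection into pixel coordinates, over the full image grid.

```python
import numpy as np
from scipy.interpolate import griddata

def densify_depth(sparse_uv, sparse_z, height, width):
    """Interpolate sparse Lidar depths (already projected to (u, v)
    pixel coordinates) into a dense depth image.  Linear interpolation
    inside the convex hull of the samples; nearest-neighbor fill for
    any remaining holes."""
    grid_v, grid_u = np.mgrid[0:height, 0:width]
    dense = griddata(sparse_uv, sparse_z, (grid_u, grid_v), method="linear")
    holes = np.isnan(dense)
    if holes.any():
        dense[holes] = griddata(sparse_uv, sparse_z,
                                (grid_u[holes], grid_v[holes]),
                                method="nearest")
    return dense
```

A real recovery method would additionally respect image edges (e.g. guided or bilateral upsampling); this sketch only illustrates the sparse-to-dense idea.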

With the depth image, it is easy to calculate the world coordinates and camera coordinates for each point. The world coordinate system is fixed with respect to the vehicle, with its origin located at the projection of the camera centre onto the ground plane, its X_{W}-axis pointing to the right, its Y_{W}-axis pointing ahead, and its Z_{W}-axis pointing up. The extrinsic parameters translating world coordinates to camera coordinates are the rotation matrix R and the translation vector T, which can be estimated in advance and are assumed to be known here. With the camera coordinates (X_{C}, Y_{C}, Z_{C}) computed for each pixel, the X_{C} and Y_{C} images are shown in

When z_{W} is known, it is easy to identify the road region, whose z_{W} values are close to zero.
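The coordinate computations above can be sketched as follows, assuming a standard pinhole model with intrinsics (f_x, f_y, c_x, c_y) and the extrinsic convention P_C = R·P_W + T; the function names are illustrative, not from the paper.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Pinhole back-projection: per-pixel camera coordinates from a
    depth image that stores Z_C for each pixel (u, v)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return np.stack([X, Y, depth], axis=-1)   # H x W x 3, camera frame

def camera_to_world(p_c, R, T):
    """Invert the extrinsic transform P_C = R @ P_W + T, i.e.
    P_W = R^T (P_C - T); written for row-vector points."""
    return (p_c - T) @ R
```

Thresholding the resulting z_{W} channel then yields the road-region mask used later for restricting the curb search.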

After recovering the depth image and the point coordinates in the different coordinate systems, we now proceed to curb point detection. First, we devise a filter-based normal estimation method using the depth image. Then, we use the curb pattern in the normal image to detect curb point features row by row. The height property of curbs, namely that curbs rise 5 to 35 cm above the road surface, is also utilized to filter out the non-road region.

In 3D information processing, surface normal estimation is of great importance for robotics/ALV systems to describe objects and understand scenes. For unorganized 3D points, a statistics-based method is commonly used, which fits a plane to each point and its neighboring points. However, this method is time-consuming for dense data. In contrast, in [

In particular, for one point with camera coordinates (

In camera coordinates, the normal estimation is formulated as

Using the chain rule, the partial derivatives in

We apply the above partial derivatives onto

This can be written in a compact matrix form:

In the above matrix form, K_{x} and K_{y} denote the convolution kernels that compute the partial derivatives along the x and y image directions, respectively.

To suppress noise, we first smooth the depth image with a Gaussian kernel of width σ_{s}.

The normal estimation results, using the above convolution kernels and different smoothing kernel widths σ_{s}, are shown in

Note that this normal estimation method only needs three spatial convolutions with small kernels and some pixel-level operations, so the computational cost is quite low. The method achieves accurate surface normal estimation for each point in the image. When displaying normal directions, we use the following color code throughout the paper: the normal components (n_{x}, n_{y}, n_{z}) are mapped to the three color channels.
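A filter-based normal estimator in the same spirit can be sketched as follows. This is an assumption-laden variant, not the paper's exact chain-rule derivation: it smooths the dense point map, takes the two tangent directions with small central-difference kernels, and uses their cross product as the surface normal.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, correlate

def estimate_normals(points, sigma_s=2.0):
    """Filter-based surface normal estimation from a dense point map
    (H x W x 3 array of camera coordinates).  The point map is smoothed
    with a Gaussian of width sigma_s, derivatives along the image axes
    are taken with 3-tap central-difference kernels, and the cross
    product of the two tangents gives the (unnormalized) normal."""
    kx = np.array([[-0.5, 0.0, 0.5]])   # derivative along u (columns)
    ky = kx.T                           # derivative along v (rows)
    sm = np.stack([gaussian_filter(points[..., c], sigma_s)
                   for c in range(3)], axis=-1)
    du = np.stack([correlate(sm[..., c], kx) for c in range(3)], axis=-1)
    dv = np.stack([correlate(sm[..., c], ky) for c in range(3)], axis=-1)
    n = np.cross(du, dv)                # normal = tangent_u x tangent_v
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-9, None)
```

As in the paper's method, only a few small spatial convolutions plus per-pixel arithmetic are needed, so the cost is low compared with plane fitting on unorganized points.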

As shown in the figure, the normals of the road surface and of the curb differ mainly in the X_{W} and Z_{W} directions. In particular, in

Based on the above observations, we design multi-scale row patterns to better detect curb features, which are illustrated in

We define the curb feature for each side of the curb based on the pattern responses in the normal projection maps (X_{W}, Y_{W}, and Z_{W}), as shown in
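The row-wise, multi-scale pattern matching can be illustrated as follows. The step template and scales here are hypothetical stand-ins for the paper's curb patterns; the sketch only shows the mechanics of correlating each image row with a template at several scales and keeping the strongest response.

```python
import numpy as np

def row_pattern_response(channel, scales=(2, 4, 8)):
    """Multi-scale row-wise pattern response (illustrative).
    For each row of a normal-component map, correlate with a step
    template [-1...-1, +1...+1] at several scales and keep, per pixel,
    the response of largest magnitude.  A left-curb edge flips sign
    relative to a right-curb edge."""
    h, w = channel.shape
    best = np.zeros((h, w))
    for s in scales:
        t = np.concatenate([-np.ones(s), np.ones(s)]) / (2 * s)
        for r in range(h):
            # reverse the template so np.convolve performs correlation
            resp = np.convolve(channel[r], t[::-1], mode="same")
            keep = np.abs(resp) > np.abs(best[r])
            best[r][keep] = resp[keep]
    return best
```

The sign of the strongest response then distinguishes the left-side from the right-side curb feature, mirroring the per-side features defined above.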

Using the consistency property of the curbs, we build a Markov Chain model for linking the curb points. After transforming the feature responses into node and edge probabilities in the Markov Chain, we can link the best curb path with a highly efficient dynamic programming algorithm [

We build a Markov Chain model for the curb on each side of the road, to make full use of the feature responses and exploit the consistency property of the curbs.

Denote the curb point position in each row of the image as a random variable x_{i}; each x_{i} has N different states, each corresponding to one of the N columns in the image. Here we restrict our model to the road region, which is identified by the value z_{W} as in Section 3.1.

The node probability, which describes how likely a given column contains a curb point in row i, is defined from the curb feature response at that position.

The edge probability which describes the consistency property is defined as:

The major consideration in defining this edge probability is that the best path should be smooth in terms of both position and feature. The feature smoothness and position smoothness are controlled by the bandwidth parameters σ_{f} and σ_{x}, respectively.

With the node and edge probabilities defined above, the best path (the one linking curb points with the largest total probability) can be obtained by dynamic programming. The linking process consists of forward and backward search steps.

In the forward search, the algorithm selects the path from top to bottom. We calculate the probability from the top row (in the road region) to the current point with

In the backward step, we choose the point with the maximum probability in the bottom row, and trace back to the top using the stored links. In this way, the best paths can be linked and isolated noise filtered out. The Markov Chain based curb point linking results are shown in
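The forward/backward search over the Markov Chain is standard Viterbi-style dynamic programming, which can be sketched as follows (log-probabilities are assumed for numerical stability; the dense `edge_logp` matrix is an illustrative choice, not the paper's exact parameterization).

```python
import numpy as np

def best_curb_path(node_logp, edge_logp):
    """Dynamic programming over the row-wise Markov Chain.
    node_logp: R x N log node probabilities (R image rows, N columns).
    edge_logp: N x N log edge probabilities between consecutive rows.
    Returns the column index selected in each row."""
    R, N = node_logp.shape
    score = node_logp[0].copy()
    back = np.zeros((R, N), dtype=int)
    for r in range(1, R):                   # forward pass
        cand = score[:, None] + edge_logp   # transitions from every prev column
        back[r] = np.argmax(cand, axis=0)
        score = cand[back[r], np.arange(N)] + node_logp[r]
    path = [int(np.argmax(score))]          # backward pass: trace stored links
    for r in range(R - 1, 0, -1):
        path.append(int(back[r][path[-1]]))
    return path[::-1]
```

The cost is O(R·N²) in this dense form, which is why the linking step remains computationally cheap for image-width-sized N.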

In this subsection, we introduce the details of employed post-processing in this work for further refining the curb detection results.

By analyzing the positions and the curb point features along the best path, we detect suitable break points to cut the best path into several segments, and choose the best segment as the final output, filtering out the non-curb parts. After average smoothing, we calculate the curvature and the feature variation along the best path. Throughout the experiments, break points are defined as points with curvature greater than 10 or feature variation greater than 0.003. We sum the probabilities along each segment and output the segment with the largest total probability.

We use a polynomial model (up to second order) in the curb modeling, which is given in

We use weighted least squares to estimate the parameters (a, b, c), using the nodePot as the weight for each point. The parameters are obtained by minimizing the objective function in

Here (u_{i}, v_{i}) is a curb point in the best segment, and w_{i} is its weight (the node probability).
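The weighted least squares fit can be sketched as follows, assuming the second-order model takes the form u = a·v² + b·v + c in image coordinates (the exact form of the paper's model equation is not reproduced here).

```python
import numpy as np

def fit_curb_wls(u, v, w):
    """Weighted least squares fit of u = a*v^2 + b*v + c, weighting
    each curb point by its node probability (nodePot).  Solves the
    weighted normal equations (A^T W A) x = A^T W u."""
    A = np.stack([v**2, v, np.ones_like(v)], axis=1)
    W = np.diag(w)
    abc = np.linalg.solve(A.T @ W @ A, A.T @ W @ u)
    return abc  # (a, b, c)
```

Points with low node probability thus contribute little to the fitted curb model, which makes the parameterization robust to weak or spurious curb features.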

We give our confidence scores for the detected curbs based on both the node probability and the accuracy of the model:

That is to say, more points lying on the path, stronger curb features, and a smaller model error lead to a higher confidence score for the detection result. The refined results are shown in

In this section, we present the experimental evaluations of our proposed method for curb detection. Here we use the widely used KITTI dataset [

We use the KITTI dataset in this section to evaluate our method. The KITTI dataset is one of the most comprehensive datasets for ALV applications, and is commonly used as a test bed for various tasks [

We conduct comprehensive evaluations of our method on the KITTI dataset, where it achieves strong results. In this section, we present evaluation results under different conditions.

From

The statistical results of each experiment are summarized in

For detecting both curved and straight curbs of different lengths, our method provides reliable detection results, as shown in

For broken curbs, our method can provide reasonable results, as shown in

With the confidence scores, our method can judge whether there is a curb on each side. In our experiments, if the confidence score is lower than 10, we discard the result.

In this subsection, we provide some further discussions and illustrations on our proposed method.

In the edge probability design, we use both position and feature. In this subsection, we compare different probability strategies: our method, and a variant that only uses position information.

In each step of our method, we take implementation efficiency into account. Depth recovery and normal estimation take the major part of the computational cost. For depth recovery, we use the Graphics Processing Unit (GPU) implementation from [

One of the most important advantages of our method is its robustness. By using the dense depth image, our method achieves reliable results even in quite noisy scenes. Our method also achieves a larger detection range: on the KITTI dataset, without occlusion, the detection range is about 30 m ahead and 8 m to each side for common curbs of about 10 cm height. Typical results are shown in

In this paper, we have proposed a curb detection method based on fusing 3D-Lidar and camera data. Using the dense depth image from range-visual fusion, we derived a filter-based method for efficient surface normal estimation. By using the specifically designed pattern of curbs, curb point features were detected in the normal image row by row. We then formulated the curb point linking process as a best-path search on a corresponding Markov Chain, which was solved via dynamic programming. We also designed several post-processing steps to filter the noise, parameterize the curb models, and compute their confidence scores. Comprehensive evaluations on the KITTI dataset showed that our method achieved good results in both static and dynamic scenes, and processed the data at a speed of 15 Hz. Under obstacle occlusion and strong shadows, our method showed strong robustness. In typical scenes without occlusion, our detection range reached 30 m to the front and 8 m to each side.

In the future, we are going to apply this curb detection method for other ALV applications, such as map building, vehicle localization and so on. To date, there is no widely accepted benchmark for curb detection. A comprehensive benchmark with different sensors for curb detection is needed for fair comparison, and this could be our future work.

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 91220301 and 90820302. The first author would like to acknowledge the financial support from the China Scholarship Council for his visiting research at the National University of Singapore, Singapore. The authors would like to acknowledge Jiashi Feng and Li Zhang from the National University of Singapore for their advice on this work. The authors would also like to acknowledge the reviewers for their constructive and helpful suggestions.

The authors declare no conflict of interest.

Geometric model of curbs. The short arrows indicate the surface normal directions in different positions.

Curb images. (

Overview of our proposed method. The method consists of four steps. For more details, please refer to the text.

Illustration of a depth image within both the camera coordinate and image coordinate systems.

(

(

X_{C} and Y_{C} images.

Normal images with different Gaussian kernel widths σ_{s}.

(_{W}; (_{W}.

(

X_{W}, Y_{W}, and Z_{W} projection maps.

(

(

Best path for each side. Red points indicate the left curb, blue points indicate the right curb.

Final results. The confidence score for left curb is 36.6604, and the right is 12.7776.

(

(

(

(

(

(

(

Detection range and confidence scores of each experiment.

| Side | X_{W} start (m) | X_{W} end (m) | Y_{W} start (m) | Y_{W} end (m) | Confidence score |
|---|---|---|---|---|---|
| Left | −6.19 | −4.99 | 6.08 | 27.58 | 24.93 |
| Right | 0.99 | 1.34 | 6.26 | 30.30 | |
| Left | −5.47 | −5.28 | 8.05 | 15.63 | 19.58 |
| Right | 0.47 | 0.79 | 7.43 | 62.14 | |
| Left | −7.89 | 10.88 | 25.04 | 12.89 | |
| Right | 3.65 | 4.72 | 6.52 | 42.16 | |
| Left | −2.33 | −2.01 | 28.14 | 38.60 | |
| Right | 1.42 | 2.47 | 6.06 | 16.16 | 18.08 |
| Left | −6.55 | −5.17 | 6.88 | 46.43 | |
| Right | 3.43 | 5.06 | 6.31 | 46.24 | |
| Left | −5.84 | −5.39 | 6.50 | 19.28 | 26.17 |
| Right | 0.97 | 1.11 | 6.38 | 10.71 | 36.69 |
| Left | −4.90 | −4.67 | 6.51 | 11.53 | 33.47 |
| Right | 2.51 | 4.07 | 28.63 | 24.47 | |
| Left | −4.11 | 5.56 | 6.48 | 22.98 | 28.39 |
| Right | 3.76 | 6.60 | 22.62 | 63.25 | |
| Left | −5.92 | −4.77 | 6.19 | 20.48 | 43.45 |
| Right | 0.99 | 1.13 | 8.79 | 17.93 | |
| Left | −4.00 | −3.45 | 6.23 | 20.06 | 32.06 |
| Right | - | - | - | - | 0 |
| Left | −5.98 | −4.27 | 6.55 | 12.68 | 25.76 |
| Right | - | - | - | - | 0 |
| Left | −5.63 | −4.29 | 6.31 | 12.48 | 29.11 |
| Right | 0.24 | 1.43 | 8.70 | 14.14 | 21.30 |