Vision-Based Real-Time Traversable Region Detection for Mobile Robot in the Outdoors

Environment perception is essential for autonomous mobile robots in human-robot coexisting outdoor environments. One of the important tasks for such intelligent robots is to autonomously detect the traversable region in an unstructured 3D real world. The main drawback of most existing methods is their high computational complexity. Hence, this paper proposes a binocular vision-based, real-time solution for detecting traversable regions outdoors. In the proposed method, an appearance model based on a multivariate Gaussian is quickly constructed from a sample region in the left image, adaptively determined by the vanishing point and dominant borders. Then, a fast, self-supervised segmentation scheme is proposed to classify the traversable and non-traversable regions. The proposed method is evaluated on public datasets as well as on a real mobile robot. Implementation on the mobile robot has demonstrated its applicability to real-time navigation.


Introduction
In recent decades, the robotics community has made great efforts to develop mobile robots with complete autonomy. Traversable region detection is one of the fundamental problems for such autonomous navigation systems. Numerous vision-based approaches have been proposed for structured road detection. In recent years, some researchers have attempted to tackle more challenging unstructured road conditions, where unstructured roads refer to roads that have arbitrary surfaces and various shapes without painted markers or distinguishable borders [1]. However, due to the demand for a good tradeoff between time efficiency and accuracy, it is still challenging for a ground robot to autonomously locate a variety of traversable areas and safely navigate itself with respect to human-defined rules (i.e., keep off the non-road area) in real time [2].
Kong et al. [3] proposed a method that decomposes the detection process into two steps: vanishing point estimation and road segmentation based on the detected vanishing point. This approach can detect various types of roads, but it is limited by the high computational complexity of the vanishing point estimation. Many algorithms have attempted to speed up vanishing point estimation. Moghadam et al. [4] proposed an optimal local dominant orientation method using the joint activities of only four Gabor filters and an adaptive distance-based voting scheme for estimating the vanishing point. Miksik [5] investigated expanding Gabor wavelets into a linear combination of Haar-like functions to perform fast filtering, and used superpixels in the voting scheme to speed up the process. Besides vanishing point based road detection methods, many methods directly differentiate road pixels from the background using appearance models [1,6-9]. Tan et al. [6] adopted multiple color histograms to capture the variability of the road surface and a single color histogram to model the background in RGB space, and Ramstrom and Christensen [7] also followed this appearance-based line of work.
The main contributions of this paper are as follows:
• A new method is proposed to robustly estimate the vanishing point, which outperforms the state of the art in the tradeoff between time efficiency and accuracy. The vanishing point is detected by the voting of a few line segments formed from dominant pixels, rather than by the voting of all pixels as in most existing methods.
• A fast, self-supervised segmentation scheme is proposed for unstructured traversable region detection. An appearance model based on a multivariate Gaussian is constructed from a sample region adaptively determined by the vanishing point and dominant borders in the input image. This scheme allows real-time performance on a mobile robot.
The remainder of the paper is organized as follows. The proposed method is described in detail in Section 2. Experimental results and discussion are presented in Section 3. Finally, a conclusion is drawn in Section 4.

Traversable Region Detection
Figure 1 depicts the pipeline of the proposed traversable region detection. First, texture orientation is computed using Gabor filters with eight directions for each pixel of the input image. Secondly, instead of directly using all pixels to vote for the vanishing point based on their orientations, we group only some dominant pixels with the same orientation into line segment candidates. Thirdly, the vanishing point is estimated by the voting of those line segment candidates, and a seed pixel belonging to the road area is located through the constraints of the vanishing point and two road border candidates. Lastly, a sample region surrounding the seed pixel is selected to model the road by combining RGB, IIS and LBP features in a multivariate Gaussian, so that the road can be classified from the background based on this appearance model.


Vanishing Point Estimation
In this paper, Gabor filters are used to estimate the local dominant orientation for each pixel because of their well-known accuracy. A 2D Gabor filter for an orientation θ_n and radial frequency ω is defined as in [20], where a = x cos θ_n + y sin θ_n, b = −x sin θ_n + y cos θ_n, c = π/2, ω = 2π/λ, and λ is set to 4√2. Let I(p) be the grayscale value of the input image at p(x, y). The input image I(p) is convolved with a bank of Gabor filters with orientations θ_n and radial frequency ω, where N is the total number of orientations, and the square norm of the complex Gabor filter response is computed at each pixel. The local dominant texture orientation θ_max is then defined as the orientation corresponding to the strongest Gabor response across all orientations. More precise angular resolution can be achieved with a larger number of orientations (N = 36 in [3]), but at the cost of computational complexity. In this paper, only 8 orientations are used, giving a resolution of 22.5°. A confidence-rated technique similar to the work of Kong et al. [3] is used to provide a confidence level for the local texture orientation θ_max(p) at pixel p. Suppose E_1(p) > … > E_8(p) are the ordered values of the Gabor responses for the 8 predefined orientations; the confidence in the orientation θ_max(p) is computed from this ordering. Pixels with a confidence level smaller than a threshold T_th, i.e., conf(θ_max) < T_th, are discarded. In our experiments, the optimal T_th is set to 35.
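As a rough illustration, the orientation estimation above can be sketched in Python. The Gabor kernel form and the confidence formula below are assumptions reconstructed from the surrounding text (the display equations are omitted in this version); λ = 4√2, c = π/2, 8 orientations, and T_th = 35 follow the stated parameters.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(theta, lam=4 * np.sqrt(2), size=17):
    """Complex Gabor kernel at orientation theta (radians).
    Kernel form is an assumed standard Gabor; c = pi/2, omega = 2*pi/lam."""
    omega = 2 * np.pi / lam
    c = np.pi / 2
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    a = x * np.cos(theta) + y * np.sin(theta)
    b = -x * np.sin(theta) + y * np.cos(theta)
    envelope = (omega / (np.sqrt(2 * np.pi) * c)) * \
        np.exp(-omega**2 * (4 * a**2 + b**2) / (8 * c**2))
    return envelope * np.exp(1j * omega * a)

def dominant_orientation(img, n_orient=8, t_th=35.0):
    """Per-pixel dominant texture orientation and a confidence mask.
    The confidence 100*(1 - mean(E2..EN)/E1) is an assumption chosen so
    that T_th = 35 lies on a 0-100 scale."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    # squared norm of the complex Gabor response per orientation
    resp = np.stack([np.abs(fftconvolve(img, gabor_kernel(t), mode='same'))**2
                     for t in thetas])
    order = np.sort(resp, axis=0)[::-1]          # E1 >= E2 >= ... per pixel
    theta_max = thetas[np.argmax(resp, axis=0)]
    conf = 100.0 * (1 - order[1:].mean(axis=0) / (order[0] + 1e-9))
    return theta_max, conf >= t_th
```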
We observe that on many paved roads, the pixels contributing most to the voting have similar orientations and can be grouped into line segments. Therefore, we propose to first group these dominant pixels into line segment candidates and then use the line segments to vote for the vanishing point.
Instead of using the gradient angle as in the existing approach [21], the texture orientation is used in this paper, following the concept of line-support regions for line segment detection. A region-growing algorithm groups connected pixels (8-connected neighborhood in this paper) within a constant orientation tolerance into a line-support region. A small orientation tolerance results in overly narrow line-support regions, while a large one tends to include too many outliers; hence, in our experiments, the orientation tolerance is empirically set to 22.5°. Small line-support regions are rejected by the following criterion [21], n_reg < −log10(11(X_m Y_m)^(5/2))/log10(θ_th/180) (5), where n_reg is the number of pixels in the region, X_m and Y_m are the dimensions of the input image, and θ_th is the resolution of each orientation. Once a line-support region R_i is found, least-squares fitting is applied to obtain a line l_i. A line-support region R_i (a set of pixels) must be associated with a line segment, in practice a narrow rectangle with a length and width; thus, a rectangular approximation is constructed to evaluate the fitted line. In this paper, we use the gradient-weighted center of mass as the center (c_i,x, c_i,y) of the rectangle, where g(j) is the gradient magnitude of pixel j. The main direction of the rectangle is set to the direction of the fitted line l_i. The width L_i,w and length L_i,l of the rectangle are then set to the smallest values that cover the full line-support region R_i. The fitted line l_i (and the line-support region R_i) is rejected if the rectangle is not narrow enough according to the ratio of length to width, where r_th is a threshold set to 2 in our experiments.
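The rectangular approximation and narrowness test can be sketched as follows. This is a sketch, not the paper's exact procedure: a gradient-weighted PCA supplies the main direction here, standing in for the least-squares line fit.

```python
import numpy as np

def rectangle_check(region, grad_mag, r_th=2.0):
    """Approximate a line-support region (list of (x, y) pixels) by a
    rectangle: gradient-weighted center of mass, a principal direction,
    and the minimal extents along/across that direction. Returns the
    center and whether the rectangle is narrow enough (length/width >= r_th)."""
    pts = np.asarray(region, float)
    w = np.asarray([grad_mag[int(y), int(x)] for x, y in pts])
    center = (w[:, None] * pts).sum(axis=0) / w.sum()
    centered = pts - center
    # weighted second-moment matrix; its leading eigenvector is the main direction
    cov = (w[:, None, None] * centered[:, :, None] * centered[:, None, :]).sum(axis=0) / w.sum()
    eigval, eigvec = np.linalg.eigh(cov)
    along = centered @ eigvec[:, -1]    # largest-spread direction
    across = centered @ eigvec[:, 0]
    length = along.max() - along.min()
    width = across.max() - across.min()
    return center, length / max(width, 1e-9) >= r_th
```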
To further evaluate the fitted lines l_i and select better line segment candidates for the vanishing point estimation, additional constraints need to be considered. As Figure 2 shows, after region growing and line fitting are applied, only a few line segments are extracted. Less important line segments are further excluded, since they can be viewed as noise for the vanishing point estimation. Suppose the set of line segments is {l_i}_M and their slopes and centers are {K_i}_M and {(c_i,x, c_i,y)}_M, respectively. It is assumed that the vanishing point is not located on the left/right edges of the image, and only line segments satisfying this criterion are selected as candidates for vanishing point estimation.
Finally, the set of line segment candidates {l_i}_M obtained above is used to estimate the vanishing point by a weighted voting scheme. A distance parameter D_i,j is defined for the voting of arbitrary line pairs l_i and l_j in {l_i}_M, where L_i,l and L_j,l are the lengths of lines l_i and l_j respectively, and L_l is the sum of the lengths of all the lines.

The vanishing point is then estimated as the intersection point minimizing the voting distance, p_vp(x_i,j, y_i,j) = argmin_{p(x_i,j, y_i,j)} D_i,j.
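A minimal sketch of the pairwise-intersection voting follows. It assumes a slope/center line representation (vertical lines excluded for simplicity) and uses a simple length-based weight in place of the paper's exact D_i,j, which is omitted in this version.

```python
import numpy as np

def line_intersection(l1, l2):
    """Intersection of two lines, each given as (slope k, center (cx, cy))."""
    k1, (x1, y1) = l1
    k2, (x2, y2) = l2
    if np.isclose(k1, k2):
        return None                      # parallel lines cast no vote
    x = (k1 * x1 - k2 * x2 + y2 - y1) / (k1 - k2)
    y = k1 * (x - x1) + y1
    return x, y

def estimate_vanishing_point(lines):
    """lines: list of (slope, center, length). Return the pairwise
    intersection with the best length-weighted score, so that longer
    line-segment pairs contribute more to the vote."""
    total_len = sum(l[2] for l in lines)
    best, best_score = None, -np.inf
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            p = line_intersection(lines[i][:2], lines[j][:2])
            if p is None:
                continue
            score = (lines[i][2] + lines[j][2]) / total_len
            if score > best_score:
                best, best_score = p, score
    return best
```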

Sample Region Selection
Since the appearance of a traversable region varies significantly, it is more plausible to build the appearance model adaptively from the input image than from off-line training images [1]. To this end, the selection of a road sample region plays an important role: with non-adaptive methods, the selected sample region tends to include non-traversable regions. In our approach, the detected vanishing point is used to adaptively define the sample region because it provides a strong clue to the true location of the road area.
In this paper, a technique similar to that in Ref. [3] is used to find the two most dominant borders from a set of imaginary rays originating from the initially estimated vanishing point. The difference is that we only roughly estimate the borders to define the sample region, rather than to segment the traversable area. Specifically, we consider only 17 evenly distributed imaginary rays with an angle of 10° between neighboring rays, as shown in Figure 3. Suppose A_i,L and A_i,R are the two neighboring regions on either side of ray i. The color difference between A_i,L and A_i,R is computed for each channel of the color space (RGB in this paper); the right and left borders are then defined as the rays j and k that maximize this difference. Once the right and left borders are obtained, a new imaginary ray b is constructed as the bisector of rays j and k. A seed point p_seed is located at 2/3 of the length of the bisector. Finally, a region R_S of K × K pixels (K = 15 in our experiments) surrounding the seed is selected as the sample region, as shown in the right image of Figure 3.
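The seed and sample-region selection can be sketched as below. The coordinate conventions (ray angles in radians, image y-axis pointing downward) are assumptions of this sketch.

```python
import numpy as np

def select_sample_region(img, vp, left_angle, right_angle, k=15):
    """Place the seed at 2/3 of the bisector of the two dominant border
    rays cast from the vanishing point vp = (x, y), then cut a K x K
    sample region around it (clipped to the image)."""
    h, w = img.shape[:2]
    bisector = (left_angle + right_angle) / 2.0
    vx, vy = np.cos(bisector), np.sin(bisector)
    # ray length until the bisector leaves the bottom of the image
    t_max = (h - 1 - vp[1]) / vy if vy > 0 else 0.0
    seed = (int(vp[0] + (2 / 3) * t_max * vx),
            int(vp[1] + (2 / 3) * t_max * vy))
    half = k // 2
    x0, x1 = max(seed[0] - half, 0), min(seed[0] + half + 1, w)
    y0, y1 = max(seed[1] - half, 0), min(seed[1] + half + 1, h)
    return seed, img[y0:y1, x0:x1]
```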


Segmentation Method
In this paper, a computationally efficient model based on a multivariate Gaussian with complementary features (RGB, IIS and LBP) is used for segmentation. Given the sample region, the mean feature vector μ_C and the covariance matrix Σ_C are first obtained over the 7 channels of each pixel, where n_s is the total number of pixels in the sample region and p_C,i is the value of the ith pixel for channel C (r, g, b for RGB space; c_1, c_2, c_3 for IIS space; and c_LBP for LBP).
Then, the likelihood of a pixel p belonging to the road/non-road region is measured as the Mahalanobis distance between the pixel and the learned model. For all the pixels in the sample region, the initial mean μ_D0 and variance σ_D0 are computed from these distances.
A pixel p_j is classified into the road region if it satisfies the following condition, where λ is a parameter depending on the location of the pixel, and μ_Dk and σ_Dk are the mean and variance of the Mahalanobis distance over all pixels currently in the road region. μ_Dk and σ_Dk are adaptively updated as new road pixels outside the sample region are found. In this paper, the segmentation process starts from the seed pixel with a region-growing method (8-connected neighborhood), as shown in Figure 4. The whole image is divided into three parts by the border candidates and the horizontal line through the vanishing point. To make the segmentation accurate and robust to noise, we apply an adaptive threshold by changing the parameters λ, μ_Dk and σ_Dk. The parameter λ changes with the pixel location because the likelihood of a pixel belonging to the road region varies across different areas of the image; in our experiments, λ is set accordingly for each of these parts. Furthermore, once a new road pixel p_j is found and added to the road region, the parameters μ_Dk and σ_Dk are updated, where n_s,k−1 is the total number of pixels in the current road region and, in particular, n_s,0 is the total number of pixels in the sample region.
where  is a parameter depending on the location of the pixel. be adaptively updated as the new road pixel outside the sample region is found. In this paper, the segmentation process starts from the seed pixel with region growing method (8-connected neighborhood), as shown in Figure 4. The whole image is divided into three parts by the border candidates and the horizontal line on which the vanishing point is located. To make the segmentation be accurate and robust to the noise, we apply adaptive threshold by changing the parameters  , where , 1 s k n  is the total number of pixels in the current road region, specifically, ,0 s n is the total number of pixels in the sample region. The segmentation algorithm is summarized as Algorithm 2. The segmentation algorithm is summarized as Algorithm 2. Find the seed pixel p seed and the sample region R S surrounding the seed by the vanishing point.

2.
For each pixel p i ∈ R S on each channel, compute the mean feature vector µ C and covariance matrix ∑ C .

3.
For each pixel p i ∈ R S , compute Mahalanobis distance D(p i ) and then compute the initial mean µ D 0 and variance σ D 0 4. Start segmentation from the seed pixel p seed with region growing 4.1. add p seed to the traversable region R trav , 4.2.
for each p i ∈ R trav do for each p j neighbor of p i and status(p j ) = used do add the pixel p j to R trav , update parameters µ D k and σ D k , status p j = used end end end
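Algorithm 2 can be sketched in Python as below. This is a minimal sketch under stated assumptions: λ is kept constant rather than location-dependent, the adaptive threshold is taken as D ≤ μ_D + λσ_D, and a standard incremental running mean/variance update stands in for the paper's elided update equations.

```python
import numpy as np
from collections import deque

def fit_model(sample):
    """Multivariate Gaussian over the sample's 7-channel feature vectors."""
    mu = sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False) + 1e-6 * np.eye(sample.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis(f, mu, cov_inv):
    d = f - mu
    return float(np.sqrt(d @ cov_inv @ d))

def grow_road(feat, seed, k=15, lam=2.0):
    """Region growing (8-neighborhood) from the seed with the adaptive
    threshold D(p) <= mu_D + lam*sigma_D, updating mu_D/sigma_D
    incrementally as road pixels are added. feat is (H, W, C)."""
    h, w, _ = feat.shape
    r, c = seed
    half = k // 2
    sample = feat[r - half:r + half + 1, c - half:c + half + 1].reshape(-1, feat.shape[2])
    mu, cov_inv = fit_model(sample)
    dists = [mahalanobis(f, mu, cov_inv) for f in sample]
    mu_d, var_d, n = np.mean(dists), np.var(dists), len(dists)
    road = np.zeros((h, w), dtype=bool)
    road[r, c] = True
    queue = deque([(r, c)])
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not road[ny, nx]:
                    d = mahalanobis(feat[ny, nx], mu, cov_inv)
                    if d <= mu_d + lam * np.sqrt(var_d):
                        road[ny, nx] = True
                        queue.append((ny, nx))
                        n += 1                       # running mean/variance update
                        delta = d - mu_d
                        mu_d += delta / n
                        var_d += (delta * (d - mu_d) - var_d) / n
    return road
```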

Experimental Results and Discussion
Three experiments have been conducted to evaluate the proposed method. Firstly, vanishing point detection was tested on an image dataset for unstructured pedestrian lane detection and vanishing point estimation (PLVP) [1]. This dataset consists of 2000 images of unstructured lanes under various environmental conditions. Another more challenging image dataset from Ref. [3] (referred to as Challenge dataset in the following section) was also used for more intensive tests. This Challenge dataset contains 1003 images in total including 430 images taken along a Grand Challenge route in Southern California desert. All the images were normalized to the same size 240 × 180 and all the algorithms were run on a standard personal laptop (Intel i5-3230 CPU) without optimization or GPU acceleration. Then, the traversable region segmentation was evaluated on the PLVP dataset as well as KITTI road benchmark [22] to demonstrate the performance of the proposed method on different unstructured scenarios. Lastly, to show the effectiveness of the method for real-time application on robotics, the whole proposed framework was implemented on a Summit XL mobile robot platform with a binocular sensor (baseline 7 cm) in an unstructured campus environment.

Vanishing Point Detection
To evaluate the performance of the vanishing point estimation algorithm, we compare the proposed method with two other related methods. One is a Gabor-based method presented by Kong et al. [3], in which the vanishing point was estimated directly by the voting of all pixels based on their local orientations, computed using Gabor filters in 36 directions. The MATLAB source code provided by the authors of Ref. [3] was used for comparison. The other is a Hough-based method proposed by Wang et al. [23], in which a Hough transform was first used to extract line segments from the edge map, and the vanishing point was then detected by voting on the intersections of line pairs. We implemented this method in C++ since the source code is not publicly available.
To quantitatively assess the vanishing point estimation, the estimation error is defined as follows [4], where p_d and p_g are the detected vanishing point and the ground truth, respectively, and L is the diagonal length of the image. Some examples of vanishing point estimation with different methods are shown in Figure 5. Table 1 shows the vanishing point estimation performance on both the PLVP and Challenge datasets for the different methods in terms of accuracy and runtime.
According to Figure 5, the Gabor-based method was easily affected by clutter pixels in images with complex backgrounds (e.g., Figure 5b,e,k,m) because all pixels in a half-disk region were directly used to vote for the vanishing point candidates. In contrast, the proposed method takes only those dominant pixels that contribute more to the voting of vanishing point candidates and is thus more robust to cluttered, noisy pixels. Hence, the proposed method performs better than the Gabor-based method on the PLVP dataset. However, only a few line segment candidates are utilized to estimate the vanishing point with a simple voting scheme in the proposed method. Too few candidates affect the estimation accuracy, especially in very challenging scenarios (e.g., the desert regions in Figure 5q-x). Conversely, the Gabor-based method outperforms the proposed one on the Challenge dataset because plenty of voting points are always available to it. According to Table 1, the average errors of the proposed method on the PLVP and Challenge datasets are 0.0734 ± 0.0858 and 0.1023 ± 0.1085, respectively, while those of the Gabor-based method are 0.0812 ± 0.1042 and 0.0909 ± 0.1010, respectively. We conclude that the two methods have comparable accuracy.
In addition, the Hough-based method easily failed for natural scenes containing noisy edges or many short line segments (e.g., Figure 5f,g,i,k,m,n,p) because the vanishing point was estimated simply from all the straight lines extracted from the image.
Instead, the proposed method forms pixels into a line segment based on their texture orientations and employs a strict rejection scheme to keep only a small number of valid line segment candidates.
As shown in Table 1, the average computation time of the proposed method was significantly shorter than that of the Gabor-based method. The average error of Hough-based method was much higher than that of the proposed method although it was about two times faster than the proposed method.
In summary, the proposed method can achieve good tradeoff between accuracy and time efficiency for real time implementation.
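For reference, the normalized error metric used in this comparison can be computed as follows (Euclidean distance to the ground truth divided by the image diagonal, per the definition above):

```python
import math

def vp_error(p_detected, p_truth, img_w, img_h):
    """Normalized vanishing-point error: distance between the detected
    point and the ground truth over the image diagonal length."""
    return math.dist(p_detected, p_truth) / math.hypot(img_w, img_h)
```

For a 240 × 180 image, a detection on the opposite corner from the ground truth yields an error of 1.0, and a perfect detection yields 0.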


Traversable Region Segmentation
Two comparable methods were used to make comparisons for road segmentation. One is a boundary-based method presented in Ref. [3] while the other is a pixel-based method presented in Ref. [10].
To quantitatively evaluate the road segmentation accuracy, we employ an approach similar to the one in Ref. [3]. Suppose that A_d is the segmented road area and A_g is the binarized ground truth.
The matching score is calculated as follows, where traversable areas in A_d and A_g are set to 1 while non-road areas are set to 0. The matching score η reaches its maximum value of 1 only when the detected road area completely coincides with the ground truth. To show the road segmentation performance on the dataset, we vary the matching-score threshold from 0 to 1 and compute the rate of correctly segmented images (Figure 6). Six examples of road segmentation with different methods are shown in Figure 8. According to Figure 6, the proposed method outperforms the other two methods. In the pixel-based method [10], a training region at the bottom of the image was selected to construct GMMs for the road appearance. However, the training region might contain non-road pixels and cannot always represent the true road area. For instance, the segmentation for the image in the sixth row of Figure 7 is not satisfying because the sample region included a small portion of non-road pixels.
Moreover, it was difficult to determine an appropriate number of Gaussian models and the threshold for segmentation (e.g., the image in the fourth row of Figure 7 is over-segmented).
In contrast, the proposed method tends to select an optimal sample region based on the vanishing point and dominant borders. In addition, a multivariate Gaussian model combining complementary features and an adaptive threshold are used in the proposed method to robustly segment the road from the background.
As for the boundary-based method [3], the segmentation depended on the detection of the dominant borders, and inaccurate dominant borders would lead to unexpected road segmentation. For instance, the road segmentation for the image in the first row of Figure 7 includes much of the non-road region because of false borders. Furthermore, this method classified all pixels located between the two straight-line borders as road pixels and was thus unsuitable for most curved roads (e.g., the segmentations of the curved roads in the second, fourth, fifth and sixth rows of Figure 7 either included some non-road pixels or excluded some road pixels). In comparison, the proposed method also utilizes the dominant borders but does not depend on them strongly. If there is a big tree or building on the left and near the camera, the edges of the tree trunk or building might form a line segment and become a candidate used for voting. Such noisy candidates can be rejected to some degree by the strict criteria (Equations (5), (7) and (8)). Moreover, as can be seen in Figure 8, even when the vanishing point estimation or the dominant border detection was not good enough, a seed pixel of the road could still be correctly found, and the correct road sample region surrounding the seed could then be used to build the appearance model. Thus, the subsequent segmentation is not influenced by such inaccurate vanishing points or borders.
The KITTI road benchmark consists of urban unmarked, marked, and multiple marked lanes. Because this paper mainly focuses on unmarked traversable region detection, the unmarked-lane subset of KITTI was used for further evaluation. According to Figure 9, the proposed method achieved good performance on this public dataset compared with the other two methods.
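The matching score η used in this evaluation can be implemented as intersection-over-union, which has the stated property of reaching 1 only when the detected area coincides exactly with the ground truth (the paper's exact expression is omitted in this version):

```python
import numpy as np

def matching_score(a_d, a_g):
    """Matching score between a detected road mask a_d and a binarized
    ground-truth mask a_g (boolean arrays): intersection over union."""
    inter = np.logical_and(a_d, a_g).sum()
    union = np.logical_or(a_d, a_g).sum()
    return 1.0 if union == 0 else inter / union
```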
The average processing time of the proposed method for traversable region detection (including vanishing point estimation and segmentation) was 28.16 ms, which is significantly faster than that of the pixel-based method [10] (2.93 s) and the boundary-based method [3] (63.56 s). In other words, our algorithm can run in real time at over 30 fps on a standard CPU, although its efficiency could be further improved by parallel computing.

Real-Time Implementation for Robot Navigation
The proposed framework has been implemented on a real mobile robot in campus environments with unstructured pedestrian lanes (Figure 10). The robot was only allowed to travel on the normal pedestrian lanes, like a human being. In our experiments, the lanes were 1.5–2 m wide. The robot must accurately localize itself and simultaneously build the traversable map of the environment.
The depth of the traversable region can be recovered from the binocular images by stereo matching. We adopt a fast stereo matching algorithm suitable for embedded real-time systems described in [24]. Once the disparity d for a pixel p(x, y) of the traversable region is found by stereo matching, the 3D point in the sensor coordinates can be calculated by standard stereo triangulation:

X = (x − cx)·b/d,  Y = (y − cy)·b/d,  Z = f·b/d,

where f is the focal length in pixels, b is the stereo baseline, and (cx, cy) is the principal point of the rectified camera.
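The back-projection step can be sketched as follows. This is a minimal sketch assuming a rectified stereo pair; the calibration values used in the usage note (focal length f in pixels, baseline b in meters, principal point (cx, cy)) are illustrative, not the robot's actual parameters.

```python
def disparity_to_point(x, y, d, f, b, cx, cy):
    """Back-project image pixel (x, y) with disparity d into sensor
    coordinates via standard stereo triangulation.

    f: focal length in pixels, b: baseline in meters,
    (cx, cy): principal point in pixels.
    Returns (X, Y, Z) in meters.
    """
    if d <= 0:
        raise ValueError("disparity must be positive")
    Z = f * b / d          # depth along the optical axis
    X = (x - cx) * b / d   # lateral offset from the optical center
    Y = (y - cy) * b / d   # vertical offset from the optical center
    return X, Y, Z
```

For example, with f = 700 px, b = 0.12 m, and principal point (320, 240), a pixel at (390, 240) with disparity 7 px back-projects to a point 12 m ahead and 1.2 m to the side.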

Conclusions
This paper proposes a novel method for real-time detection of unstructured traversable regions from a single image in complex outdoor environments. The method utilizes a vanishing point, estimated by a new fast voting scheme, to adaptively determine a road sample region of the input image. A self-supervised segmentation approach based on a multivariate Gaussian model built from the sample region is used to classify the road and background rapidly and robustly. Experimental results on the public dataset have shown that the proposed method is able to detect various unstructured roads in real time. Furthermore, implementation on a real mobile robot in challenging environments has shown the effectiveness of the proposed framework for robot navigation, where the traversable region detection could be performed at a frame rate of 30 fps. Future work will focus on combining the proposed traversable region detection method with new localization and mapping algorithms to facilitate robot navigation in more challenging large-scale environments.

Author Contributions: Fucheng Deng proposed the algorithms, analyzed the data and wrote the paper; Xiaorui Zhu conceived and designed the experiments, and partly wrote the paper; Chao He performed the experiments and contributed to the robot localization algorithm.