Article

A 3D Lidar SLAM System Based on Semantic Segmentation for Rubber-Tapping Robot

Mechanical and Electrical Engineering College, Hainan University, Haikou 570228, China
* Authors to whom correspondence should be addressed.
Forests 2023, 14(9), 1856; https://doi.org/10.3390/f14091856
Submission received: 24 July 2023 / Revised: 23 August 2023 / Accepted: 11 September 2023 / Published: 12 September 2023
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

Abstract:
Simultaneous localization and mapping (SLAM) in rubber plantations is a challenging task for rubber-tapping robots. Exploiting the long-term stability of tree trunks in rubber plantations, a SLAM system based on semantic segmentation, called Se-LOAM, is proposed in this work. 3D lidar point cloud datasets of trunks collected in rubber plantations of Hainan University are used to train the semantic model, and the model is used to extract features of trunk point clouds. After clustering the trunk point clouds, each single rubber tree instance is segmented based on the Viterbi algorithm. The point clouds of tree instances are fitted to cylindrical trunk models for semantic cluster association and pose estimation, which are used for lidar odometry and mapping. The experimental results show that the present SLAM system is accurate in online mapping, and the locations of trunks in the map are clearer. Specifically, the average relative pose error is 0.02 m, which is better than the positioning performance of LOAM and LeGO-LOAM. The average error of estimating the diameter at breast height (DBH) is 0.57 cm, and it takes only 401.4 kB to store a map of an area of approximately 500 m², about 10% of the storage required by other classic methods. Therefore, Se-LOAM meets the requirements of online mapping and provides a robust SLAM method for rubber-tapping robots.

1. Introduction

Rubber-tapping robots need to walk autonomously and safely through rubber plantations, moving from one row to the next and stopping within a fixed range to tap natural rubber trees [1]. Therefore, rubber-tapping robots are required to have forest navigation capability. SLAM is one of its core technologies, and accuracy and robustness are the key performance requirements of forest navigation [2]. Accurate pose estimation helps robots avoid collisions during operation and better complete navigation and other tasks. At the same time, robustness means that rubber-tapping robots can adapt to changes in the variable tapping environment, such as light, climate, or temperature [3,4,5].
In the complex environment of rubber plantations, GPS-based positioning systems are degraded by multilayered foliage cover and remote geographical location [6]. Existing autonomous navigation robots use different classes of sensors to build maps, including lidar, cameras, ultrasonic sensors, and so on [7,8,9,10]. Among them, lidar sensors are more robust to changes in light and can meet the requirements of night and bad-weather operation. Many effective and efficient mapping methods based on lidar point clouds have been proposed [11,12,13]. However, due to noise from leaves, lawns, etc., the constructed maps suffer from unclear feature information and high memory consumption [14], and it is not easy to generalize these methods to large-scale scenarios. Using more distinctive features to build maps in rubber plantations can significantly reduce the size of the map and improve the stability of the mapping [15].
Traditional point cloud preprocessing methods are mainly based on Euclidean clustering, which relies on feature information such as shape and texture [16,17]. Zhou et al. [18] designed a SLAM system based on plane–line–cylinder adjustment (PLCA) and registered local landmark features to the global frame for real-time attitude estimation, effectively solving large-scale PLCA problems. Zhao et al. [19] argued that traditional geometry-based clustering algorithms remain worth considering and were the first to apply them to segment and cluster point cloud panoramas. Pierzchala et al. [20] used the Pratt method to fit cylindrical trunks and extract trunk diameter information from an offline 3D lidar forest reconstruction, providing a cost-effective forest mapping method of acceptable quality. These clustering segmentation methods are easily affected by noise and roughness on object surfaces, which may lead to unstable fitting.
Some point cloud preprocessing algorithms use more advanced semantic feature information. To improve the efficiency and accuracy of identification, Fan et al. [21] classified tree species based on the PointNet++ algorithm and verified the feasibility of classifying point clouds collected by an airborne lidar. Wang et al. [22] designed a clustering association algorithm based on geometric consistency for the long-term positioning of driverless cars in dynamic urban scenarios. Milioto et al. [23] proposed RangeNet++, a semantic segmentation algorithm for laser point clouds that extracts features on range images with a 2D convolutional neural network, providing a new approach to 3D point cloud semantic segmentation. At present, semantic segmentation algorithms are mainly applied to indoor [24,25], urban traffic [26,27,28], field operation [29,30], and other scenarios. These existing algorithms are useful references for developing a semantic SLAM system for rubber plantations.
The above preprocessing methods constitute the front end of a SLAM system. For example, Hess et al. [11] designed the Cartographer algorithm based on graph optimization, which adopted the classic framework of front-end local mapping, closed-loop detection, and back-end global optimization. LOAM [31] used a high-frequency, low-precision odometer to estimate the pose of the lidar and removed the motion distortion (assuming uniform motion) generated during operation. However, when a vehicle runs in real scenarios, its motion is not smooth due to jolting, which distorts the data. Thus, LeGO-LOAM [32] separated out ground points on the basis of LOAM [31] to improve the adaptability of ground vehicles to variable terrain. LIO-SAM [33], an extended version of LeGO-LOAM [32], added GPS factors and IMU pre-integration factors; it proposed a tightly coupled lidar inertial odometry framework based on smoothing and mapping to achieve more accurate real-time trajectory estimation and map construction for mobile robots. Departing from the single matching of plane points and edge points, Pan et al. [34] proposed an efficient point cloud local registration algorithm, MULLS-ICP, which realizes linear least squares optimization of point-to-point (plane, straight line) error metrics on roughly classified geometric feature points.
Although many SLAM algorithms have been proposed, few empirical studies have shown the effectiveness of lidar descriptors for long-term positioning in complex rubber plantations. Zhang et al. [1] adopted a low-cost 2D lidar and a gyroscope to obtain the center point of each tree and then simulated the navigation path of the robot in the forest, providing an initial low-cost, real-time solution for forest navigation of rubber-tapping robots. Nie et al. [35] later took an optimized DBSCAN trunk recognition method as the front-end input of their SLAM system, making simultaneous localization and mapping in forests based on rubber trunk landmarks possible. Fang et al. [36] used a distance-adaptive Euclidean clustering method combined with cylinder fitting to identify tree trunks. However, due to the complexity of the operating environment in the forest and the fragility of geometry-dependent Euclidean cylindrical clustering, it is difficult for the robot to separate trunk point clouds from background point clouds.
To solve these problems, a semantic SLAM system based on semantic clustering and segmentation of tree trunks, called Se-LOAM, is proposed in this work. It is the first semantic SLAM system applied to rubber-tapping robots. The system uses RangeNet++ [23], a lightweight semantic segmentation network based on range images, as the model framework. To ensure the accuracy of feature association, Se-LOAM clusters the point cloud of tree trunks after semantic segmentation and synthesizes cylinders with the same radius and position to participate in feature matching. The pose transformation is calculated by searching the nearest neighbor points between the current point cloud and the corresponding cylindrical landmark in the projected frame. To verify the performance of Se-LOAM, it is compared with LOAM [31], which uses edge points and planar points as matching features, and LeGO-LOAM [32], which is based on ground point optimization. The main contributions of this paper are as follows: (1) The trunk detection method based on semantic segmentation, combined with nearest neighbor matching between the cylinder and feature points in each frame, improves the accuracy of pose estimation and the efficiency of mapping; (2) The problem of pose estimation and mapping for rubber-tapping robots during long-term operation is solved; (3) Se-LOAM can adapt to complex terrain while estimating landmark parameters (radius and centroid coordinates).
The rest of the structure of this article is as follows: Section 2 introduces the composition of the system; Section 3 describes the details of the system; Section 4 compares, analyzes, and discusses the experimental results; finally, conclusions are put forward in Section 5.

2. System Overview

The Se-LOAM system consists of a tracked mobile platform that includes an odometer, a mechanical arm, a rubber-tapping end-effector, and a 64-line 3D lidar, as shown in Figure 1. A semantic lidar SLAM system based on rubber tree trunk landmarks is proposed to ensure that the rubber-tapping robot can achieve accurate and fast segmentation of rubber trunk point clouds and make timely decisions during autonomous operation.
The lidar SLAM system proposed in this paper uses only a 3D lidar to complete lidar odometry and mapping, and uses an odometer sensor as the ground truth against which the resulting trajectory is compared during the experimental phase. As shown in Figure 2, the Se-LOAM pipeline is divided into five stages: (A) the original 3D point cloud output from the lidar is converted into a range image representation; (B) semantic segmentation of the 2D range image extracts semantic labels for the trunk point clouds; (C) the labeled points are projected back from 2D to 3D; (D) individual trees are clustered, and the pose transformation is computed between key frames; (E) the local map is matched to the global map, and the global map is output.
The notation commonly used in this paper is described here to ease understanding of Section 3. Define $k$, $k \in Z^{+}$, to index a sweep, and let $t_k$ denote the beginning of sweep $k$. $P_k$ denotes the point cloud perceived during sweep $k$. Define $\bar{T}_k$, $\bar{S}_k$, $\bar{P}_k$ as the projections of $T_k$, $S_k$, $P_k$ during $[t_k, t_{k+1}]$ to the frame at $t_{k+1}$ [31]. The sensor pose transform during sweep $k$ in the lidar coordinate system $\{L\}$ is denoted as $T_k^L$. Similarly, the pose transformation during sweep $k$ in the world coordinate system $\{W\}$ is denoted as $T_k^W$.

3. Materials and Methods

3.1. Point Cloud Processing

Point cloud preprocessing refers to a series of processing and optimization operations on point cloud data for better use in feature extraction. The preprocessing steps of the Se-LOAM system include point cloud filtering, the storage structure of datasets, and the construction of range images for model training.
(a) Point Cloud Filtering
When the rubber-tapping robot walks along the rubber trees, the complex environment of the plantation produces various point noises, which affect the mapping performance. Therefore, the original point cloud must be denoised to make landmark structures easier to extract. In this paper, clusters containing fewer than five points are treated as noise and removed from subsequent processing. In addition, the lidar range is set to 20 m to retain only the point cloud within the map range that the rubber-tapping robot needs to build.
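As a rough sketch of this filtering step (not the authors' C++/ROS implementation; the clustering method and its neighborhood radius `eps` are assumptions, since the paper specifies only the five-point threshold and the 20 m range):

```python
import numpy as np
from sklearn.cluster import DBSCAN

def filter_scan(points, max_range=20.0, min_cluster_size=5, eps=0.3):
    """Crop a scan to max_range, then drop clusters smaller than min_cluster_size."""
    pts = points[np.linalg.norm(points, axis=1) <= max_range]
    # min_samples=1 assigns every point to some cluster (no -1 noise label),
    # so small clusters can be removed purely by their size.
    labels = DBSCAN(eps=eps, min_samples=1).fit_predict(pts)
    counts = np.bincount(labels)
    return pts[counts[labels] >= min_cluster_size]
```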
(b) Storage Structure
Octrees and 3D KD-trees [37] are often used in 3D space scene management, such as spatial partitioning of point clouds and fast search of nearby points [38,39]. Nearest neighbor search is used in Section 3.3 to find feature point correspondences. To speed it up, KD-tree structures in {L} and {W} are built to store point sets for fast indexing. In the lidar mapping module, only the index of trunks is stored.
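A minimal sketch of this indexing scheme, assuming SciPy and hypothetical trunk centroids (the actual system stores point sets for both {L} and {W}):

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical trunk centroids stored from the projected previous frame {L}.
prev_trunks = np.array([[3.1, 4.2, 0.0], [5.7, -2.3, 0.0], [8.4, 1.1, 0.0]])
tree = cKDTree(prev_trunks)            # built once per sweep

query = np.array([3.0, 4.3, 0.1])      # a current-frame trunk feature point
dist, idx = tree.query(query, k=1)     # index of the nearest stored trunk
```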
(c) Range Image Construction
Spherical projection is used to generate a 2D range image from each frame of the 3D point cloud as input for subsequent semantic network training [23]. Each point $p_i = (x, y, z)$ is mapped to 2D image coordinates $(u, v)$ as follows:

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} \frac{1}{2}\left[1 - \arctan(y, x)\,\pi^{-1}\right] w \\ \left[1 - \left(\arcsin(z\,r^{-1}) + f_{\mathrm{down}}\right) f^{-1}\right] h \end{pmatrix} \tag{1}$$

where $(u, v)$ are the image coordinates; $(h, w)$ are the height and width of the converted 2D image; $f = f_{\mathrm{up}} + f_{\mathrm{down}}$ is the vertical field of view of the sensor; and $r = \lVert p_i \rVert_2$ is the range of each point. This process generates a list of $(u, v)$ tuples, one pair of image coordinates for each $p_i$. Using this index, the range $r$, the $x$, $y$, and $z$ coordinates, and the remission of each $p_i$ are extracted and stored in the range image, creating a $[5 \times h \times w]$ tensor for feature extraction.
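As a concrete illustration of Formula (1), the following Python sketch builds the $[5 \times h \times w]$ tensor; the 64 × 1024 image size is a hypothetical choice, and the 22.5° fields of view are taken from Section 4:

```python
import numpy as np

def build_range_image(points, remission, h=64, w=1024,
                      f_up=np.radians(22.5), f_down=np.radians(22.5)):
    """Project one scan into a [5, h, w] tensor of (range, x, y, z, remission)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    f = f_up + f_down                                  # vertical field of view
    u = (0.5 * (1.0 - np.arctan2(y, x) / np.pi) * w).astype(int)
    v = ((1.0 - (np.arcsin(z / r) + f_down) / f) * h).astype(int)
    u, v = np.clip(u, 0, w - 1), np.clip(v, 0, h - 1)
    img = np.zeros((5, h, w), dtype=np.float32)
    img[:, v, u] = np.stack([r, x, y, z, remission])   # later points overwrite
    return img
```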

3.2. Feature Extraction

3.2.1. Semantic Network Training

Semantic segmentation networks require abundant datasets to achieve good detection performance, so the system needs plenty of trunk feature points for training. As shown in Figure 3a, the point clouds acquired in this paper were collected in Danzhou City, Hainan Province, China (19°30′ N, 109°29′ E). The data were obtained by a vehicle-mounted lidar, and the moving range of the rubber-tapping robot is shown in the yellow area in Figure 3b. A 3D point cloud semantic annotation tool is used to attach semantic labels to trunk clouds in the self-recorded datasets, as shown in Figure 3c. Area 1 shows the path of the robot; the position of the robot can be selected by clicking on the box. Area 2 selects the color of the annotation. In this study, brown points represent tree trunk point clouds, and black points represent all other point clouds. This area can also adjust the size of each mouse click to reduce the labor of large-area annotation. Area 3 shows the point cloud map of the current lidar frame. In the current frame, only complete and obvious point clouds are selected, and fuzzy point clouds are ignored. Because the area moves with the position of the robot, blurred features can be found in a nearby frame with a clearer view as the frames are played in series. With this annotation tool, users can rotate, translate, and scale the point cloud using the keyboard and mouse, and can annotate successive frames to speed up label acquisition. The resulting semantic clusters include point clouds, label names, and storage addresses.
The point clouds collected in a natural environment contain many kinds of noise, and a simple graphical clustering method makes it difficult to accurately segment tree trunk point cloud information in a large and complex environment [40]. Thus, the Se-LOAM system uses a deep convolutional neural network to segment the point cloud data. Considering indicators such as light weight and running speed, RangeNet++ [23] is preferred to obtain tree trunk point cloud information in this work. The network adopts an encoder–decoder hourglass architecture, with Darknet53 as the backbone. It converts the input point cloud into a range image and completes convolutional semantic segmentation on the 2D image. Then, the semantic labels are transferred from 2D back to the whole 3D point cloud. Finally, 3D post-processing based on the range image removes unsatisfactory point clouds, and a GPU-accelerated k-Nearest Neighbor (KNN) search algorithm eliminates redundant discretization errors.
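The final KNN cleanup runs on the GPU over the range image in RangeNet++; as a simplified CPU stand-in (an approximation, not the paper's exact post-processing), each 3D point can be relabeled by a majority vote over its nearest neighbors:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_label_cleanup(points, labels, k=5):
    """Relabel each 3D point by majority vote over its k nearest neighbors.
    labels must be non-negative integer class ids."""
    _, idx = cKDTree(points).query(points, k=k)
    neigh = labels[idx]                              # (N, k) neighbor labels
    return np.apply_along_axis(lambda row: np.bincount(row).argmax(), 1, neigh)
```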

3.2.2. Semantic Feature Clustering

The Euclidean clustering algorithm [41] is applied to the range image $R_k$ obtained after semantic segmentation of the point cloud $P_k$ at frame $k$. The cluster label index at $t_k$ is $I_k \equiv \{I_k^i\}_{i=0}^{N_i}$, where $N_i$ is the number of cluster labels and $I_k^i$ denotes each Euclidean clustering category. The dynamic Viterbi algorithm [42] is used to efficiently solve the shortest path that identifies a tree trunk instance, as shown in Algorithm 1. Specifically, each frame of the depth image is sliced horizontally. The point cloud feature of each tree appears in a row as a circle that can be synthesized within a certain range, and the column feature appears as a continuous point cloud in the vertical direction.
Firstly, for each cluster label of a frame point cloud, the cluster participates in subsequent tree trunk clustering only when the number of points in it exceeds $Min_{pts}$. Then, the coordinates of each point are indexed along a row; when the number of coordinates is more than three, a unique circle can be synthesized (an illustrative circle-fit sketch is given after Algorithm 1), and the set of single-layer slices is denoted as $C$. On this basis, the vertical ordinate of the circle of each slice is indexed, and the points are added to the structure $S$. The minimum height $Min_{height}$ of a point cloud sequence that can constitute a tree also needs to be defined. If the ordinate sequence of $S$ exceeds $Min_{height}$, the points are marked as the set of a tree instance $T_k$. The ordered trunk labels output from $P_k$ are defined as $T_k \equiv \{T_k^j\}_{j=0}^{N_{j,k}}$, where $N_{j,k}$ is the number of tree trunks in the range image $R_k$. $T_k^j \equiv \{p_k^l\}_{l=0}^{N_{l,k}}$ is one of the trunks that make up the trunk labels $T_k$, and $p_k^l$ is a single point that makes up a trunk $T_k^j$ in the current frame. The feature extraction module outputs the ordered tree trunk labels $T_k$ along with associated parameters such as radius and centroid coordinates.
Algorithm 1: Tree-Trunk Point Cloud Segmentation Based on the Viterbi Algorithm
Input: semantic range image $R_k$ of size $w \times h$; label index $I_k$; minimum number of points $Min_{pts}$; minimum height of trees $Min_{height}$
Output: tree point clouds with ordered index $T_k$
1:  if at the beginning of a sweep $P_k$ then
2:      $T_k = \emptyset$
3:  end if
4:  for each label index $I_k^i$ do
5:      if $I_k^i.size > Min_{pts}$ then
6:          create a new tree structure $S$
7:          for each $u \in 1:h$ do
8:              for each $v \in 1:w$ do
9:                  if $R_k(u, v).label = I_k^i$ then
10:                     create a new container $Q$; add $R_k(u, v)$ to $Q.points$
11:                 end if
12:             end for
13:             if $Q.points$ can fit a unique circle $C$ then
14:                 create a new node $V$
15:                 $V.data = C$
16:                 $V.index = I_k^i$
17:                 add $V$ to $S$
18:             end if
19:         end for
20:         if $S.size > Min_{height}$ then
21:             add $S$ to $T_k$
22:         end if
23:     end if
24: end for
25: return $T_k$
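Line 13 of Algorithm 1 fits a circle to the points of one horizontal slice. The paper does not state which fit is used; one common choice, shown here purely as an illustrative assumption, is the algebraic (Kåsa) fit:

```python
import numpy as np

def fit_circle(xy):
    """Algebraic (Kasa) circle fit to >= 3 non-collinear 2D points.
    Solves x^2 + y^2 + D*x + E*y + F = 0 in the least-squares sense."""
    A = np.column_stack([xy[:, 0], xy[:, 1], np.ones(len(xy))])
    b = -(xy[:, 0] ** 2 + xy[:, 1] ** 2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    center = np.array([-D / 2.0, -E / 2.0])
    radius = np.sqrt(center @ center - F)
    return center, radius
```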

3.3. Lidar Odometry

3.3.1. Semantic Feature Fitting

Given a set of trunk points $T_k^j \equiv \{p_k^l\}_{l=0}^{N_{l,k}}$, the nonlinear least squares method is applied to the 3D point cloud for cylinder fitting: the cylinder parameters are optimized continuously by minimizing the summed distances between the feature points and the cylinder surface. For a trunk point $p_k^l$ and a fitted cylinder $S_k^j$, the approximate distance from the point to the cylinder is calculated as follows [43]:

$$d_s(s, p) = \left\lVert \left(p - \left(\rho + \tfrac{1}{k}\right) n\right) \times a \right\rVert - \tfrac{1}{k} = \sqrt{\left\lVert p - \left(\rho + \tfrac{1}{k}\right) n \right\rVert^{2} - \left\langle p - \left(\rho + \tfrac{1}{k}\right) n,\; a \right\rangle^{2}} - \tfrac{1}{k} \tag{2}$$

where $n = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$; $\theta$ is the angle between $n$ and the Z-axis, and $\varphi$ is the angle between the projection of $n$ onto the X–O–Y plane and the X-axis. Let $n_{\theta} = (\cos\theta\cos\varphi, \cos\theta\sin\varphi, -\sin\theta)$ be the partial derivative of $n$ with respect to $\theta$. Similarly, let $\bar{n}_{\varphi} = (-\sin\varphi, \cos\varphi, 0)$. The axis direction of the cylinder is $a$, where $\lVert a \rVert = 1$, and the radius of the cylinder is $1/k$, as shown in Figure 4. Following reference [43], $a$ can be parameterized as follows:

$$a = n_{\theta} \cos\alpha + \bar{n}_{\varphi} \sin\alpha \tag{3}$$

where $\alpha$ is the angle subtended between $a$ and $n_{\theta}$. To sum up, the nonlinear least squares method transforms the problem of fitting a cylinder $S_k^j$ to the point cloud $\{p_k^l\}$ of a tree trunk $T_k^j$ into minimizing:

$$\sum_{l=0}^{N_{l,k}} d_s(S_k^j, p_k^l) \tag{4}$$
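A sketch of Formulas (2)–(4) using SciPy's least squares solver; the initial radius `r0` and the near-vertical axis guess (`alpha = 0`, valid when the sensor views the trunk roughly horizontally) are assumptions, not values from the paper:

```python
import numpy as np
from scipy.optimize import least_squares

def cylinder_residuals(params, pts):
    """d_s of Formula (2) for the parameterization (rho, theta, phi, alpha, k)."""
    rho, theta, phi, alpha, k = params
    n = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi), np.cos(theta)])
    n_theta = np.array([np.cos(theta) * np.cos(phi),
                        np.cos(theta) * np.sin(phi), -np.sin(theta)])
    n_phi = np.array([-np.sin(phi), np.cos(phi), 0.0])
    a = n_theta * np.cos(alpha) + n_phi * np.sin(alpha)  # unit axis, Formula (3)
    c = (rho + 1.0 / k) * n                              # a point on the axis
    # perpendicular distance to the axis (||a|| = 1) minus the radius 1/k
    return np.linalg.norm(np.cross(pts - c, a), axis=1) - 1.0 / k

def fit_trunk_cylinder(pts, r0=0.15):
    """Minimize Formula (4); r0 is an assumed initial trunk radius in metres."""
    centroid = pts.mean(axis=0)
    d = np.linalg.norm(centroid)
    x0 = np.array([d - r0,                                # rho
                   np.arccos(centroid[2] / d),            # theta
                   np.arctan2(centroid[1], centroid[0]),  # phi
                   0.0,                                   # alpha
                   1.0 / r0])                             # curvature k
    sol = least_squares(cylinder_residuals, x0, args=(pts,))
    return sol.x, 1.0 / sol.x[4]   # fitted parameters and radius
```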

3.3.2. Feature Point Correspondence and Motion Estimation

The lidar odometry module solves for the change in the sensor motion state between two scans. The module takes $\bar{T}_k$ and $\bar{S}_k$ from $\bar{P}_k$ as input, and outputs the 6-DOF pose estimate $T_k^L = [t_x, t_y, t_z, \theta_x, \theta_y, \theta_z]$, where $t_x, t_y, t_z$ are translations in the lidar coordinate system $\{L\}$ and $\theta_x, \theta_y, \theta_z$ are rotations under the right-hand rule. The framework follows the LOAM system [31]; the semantic segmentation module is added at the front end, and the feature point matching method is changed.
Specifically, at the beginning of a sweep $k$, let $T_k^L = 0$ and $S_k = \emptyset$. Supposing $\bar{P}_{k-1}$ and $P_k$ are available, the pose is estimated by matching corresponding feature points in the two frames. Firstly, for each rubber tree $T_k^j \equiv \{p_k^l\}_{l=0}^{N_{l,k}}$, the cylinder model $S_k^j$ is updated by solving Formula (4). Then, for each trunk feature point $p_k^l$, the closest neighbor model in $\bar{S}_{k-1}$ is found; the trunk dataset $\bar{S}_{k-1}$ is stored in a 3D KD-tree for fast searching in frame $\{L\}$. After finding the correspondence, the point-to-cylinder distance is computed by solving Formula (2), and the equations are stacked into the formula below:

$$\underset{T_k^L}{\arg\min} \sum_{j=1}^{N_{j,k}} \sum_{l=1}^{N_{l,k}} d_s(\bar{S}_k^j, p_k^l) \tag{5}$$

At the end of a sweep $k$, $S_k$ and $T_k$ are projected to the next sweep $k+1$, denoted $\bar{S}_k$ and $\bar{T}_k$. In addition, $T_k^L$ is returned for the next odometry step.
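A sketch of how the residuals of Formula (5) can be stacked over a 6-DOF pose; the data association (KD-tree lookup) and sweep projection are omitted, and the cylinder representation here, an axis point with a unit direction and radius, is an illustrative simplification:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def point_to_cylinder(p, axis_pt, axis_dir, radius):
    """d_s of Formula (2), with the cylinder given as axis point/direction/radius."""
    return np.linalg.norm(np.cross(p - axis_pt, axis_dir), axis=1) - radius

def odometry_residuals(pose, trunks, cyls):
    """Stacked residuals of Formula (5). pose = [tx, ty, tz, rx, ry, rz];
    trunks: list of (N_l, 3) trunk point sets from the current sweep;
    cyls: matched (axis point, unit axis, radius) models from sweep k-1."""
    R = Rotation.from_euler('xyz', pose[3:]).as_matrix()
    res = [point_to_cylinder(pts @ R.T + pose[:3], c, a, r)
           for pts, (c, a, r) in zip(trunks, cyls)]
    return np.concatenate(res)

# 6-DOF pose solved per sweep, starting from the identity transform:
# sol = least_squares(odometry_residuals, np.zeros(6), args=(trunks, cyls))
```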

3.4. Lidar Mapping

At the end of a sweep $k$, the lidar odometry module outputs the projected $\bar{T}_k$ and $\bar{S}_k$, and a pose transformation $T_k^L$ between $t_{k-1}$ and $t_k$. The lidar mapping module matches and registers the trunk points $\bar{T}_k$ to the world frame $\{W\}$ once per sweep, denoted as $\bar{T}_k^W$. At the same time, this step extends $T_k^W$ to $T_{k+1}^W$ by using the output of the lidar odometry module, and matches $\bar{T}_k^W$ to $T_{k-1}^W$ by optimizing the sensor pose $T_k^W$. The method of the lidar mapping module follows the description in LOAM [31].
At the start, $t_k = 0$, the map of trunks is empty, so let $T_k^W = \emptyset$. Next, during each scan $P_k$, feature points are extracted as in Section 3.2. The cylinder sets $\bar{S}_k$ output from the lidar odometry module are combined with the models $S_k^W$ in the global map. Each feature point in $\bar{T}_k$ is added to the associated trunk $T_k^W$ in the global map using the feature point correspondence method of Section 3.3.2.

4. Results

This paper describes a series of experiments to qualitatively and quantitatively analyze three SLAM methods, LOAM, LeGO-LOAM, and Se-LOAM, on an embedded system, the Jetson TX2. The algorithms are implemented in C++ and executed using the robot operating system (ROS) on Ubuntu 18.04 [44]. The system uses a lidar scanning at a frequency of 10 Hz, with a vertical field of view of $f_{\mathrm{up}} = 22.5°$ and $f_{\mathrm{down}} = 22.5°$. The lidar is installed on the platform of the self-propelled rubber-tapping robot. The experiment was carried out in the national natural rubber forest located in Danzhou City, Hainan Province, China (19°30′ N, 109°29′ E). The average annual temperature is 21 °C~29 °C, and the average annual precipitation is 1815 mm. The segmentation network is trained on 2000 scans collected by a vehicle-borne lidar. The network runs on a ThinkStation P720 workstation with an Intel Xeon(R) Silver 4210 CPU at a base frequency of 2.20 GHz, 32 GB RAM, and an NVIDIA Quadro RTX 4000 GPU. The network model adopts the PyTorch framework, and the CUDA version is 11.0. The final trained model achieves a 0.69 average IoU score under 10-fold cross-validation, as shown in Figure 5.
The evaluation indicators of the whole system are as follows:
  • Comparison of map construction results using different SLAM algorithms.
  • Estimation error of rubber tree DBH and location error.
  • Running time and memory consumption of global map construction.

4.1. Mapping Performance

The final maps from LOAM, LeGO-LOAM, and Se-LOAM, when run on the Jetson TX2 embedded system, are shown in Figure 6. The results show that LOAM struggles to build a map from the complex rubber forest datasets, and its point cloud map is not clear. Both LOAM and LeGO-LOAM are prone to frequent ghosting and fuzzy mapping. Because the rubber-tapping operation needs high-precision estimates of tree diameter and tree location, this ambiguity is unsatisfactory. Se-LOAM, on the other hand, produces a clear map that preserves the characteristics of the rubber tree trunks.

4.2. Localization Accuracy

The effectiveness of Se-LOAM in estimating trunk DBH is evaluated. The DBH can be estimated directly because the radius $1/k$ of each landmark $S_j^W$ can be extracted by clustering the tree instances and fitting them to the cylinder model (Section 3.3.1). For the vehicle-borne lidar datasets of the three scenarios described in Figure 7a, 34 trees along the robot's path were manually measured, and the model generated by Se-LOAM was used to estimate the DBH. The estimated DBH values are then compared with the manual measurements. The circumference of each rubber tree at a height of 1.3 m is measured to obtain the real DBH value, as shown in Figure 7b [45]. Three sets of lidar point clouds collected in the rubber plantations were input into the system, and the error results are shown in Figure 7c. The average DBH error of Se-LOAM is 0.57 cm, which conforms to the industrial tree diameter measurement error standard [46]: the number of sample trees with a measurement error greater than 1.5% DBH (and less than 3% DBH) should not exceed 5% of the total number of measured trees, which for the three sequences allows five such trees. The experimental results have three points exceeding 1.5% DBH, which is fewer than five, so the criterion is met.
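For illustration, the DBH comparison and the 1.5% criterion can be computed as follows (a sketch under the assumption that the fitted radius is $1/k$ in metres and the reference measurement is a tape circumference):

```python
import numpy as np

def dbh_error_stats(fitted_radius_m, tape_circumference_m):
    """Compare DBH from the fitted cylinder radius against tape measurements."""
    est_dbh = 2.0 * fitted_radius_m              # diameter from radius 1/k
    ref_dbh = tape_circumference_m / np.pi       # circumference -> diameter
    err = np.abs(est_dbh - ref_dbh)
    over_1_5_pct = int(np.sum(err / ref_dbh > 0.015))  # trees exceeding 1.5% DBH
    return err.mean() * 100.0, over_1_5_pct      # mean error in cm, and count
```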
To qualitatively evaluate the positioning accuracy of the algorithm, trajectory analysis experiments were carried out. An airborne lidar traversing a dense rubber plantation environment was used to record datasets, and an odometry sensor recorded the trajectory as the relative ground truth. The dataset was collected from an area of 50 m × 100 m, as shown in Figure 8a. We used 2520 frames of lidar data to plot the trajectories. The trajectories output by Se-LOAM and the other methods are shown in Figure 8b. The results show that our algorithm achieves better performance than the others: Se-LOAM achieves the lowest drift.
To quantitatively evaluate the localization accuracy of the algorithm, error analysis experiments were carried out. We recorded the absolute pose error (APE) and relative pose error (RPE) in Table 1 and Table 2. APE is the direct difference between the estimated pose and the real pose, which directly reflects the accuracy and consistency of the global trajectory. Table 1 shows that Se-LOAM achieves an average pose error of 0.36 m, which is lower than that of the other methods, proving that our algorithm has better global trajectory consistency. RPE describes the local accuracy of the trajectory over fixed intervals, including translation and rotation. Our algorithm obtains the minimum average value: as shown in Table 2, it is no more than 0.18 m and about 0.02 m on average. The metrics of our algorithm show superior performance. More specifically, the relationship between pose error and timestamp is shown in Figure 9. As can be seen from the graphs, a peak in pose error tends to appear at the turn (around the middle of the timestamps), reflecting the impact of complex terrain on pose estimation.
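As a sketch of how the translational parts of these metrics are computed (full APE/RPE evaluation also aligns the trajectories and includes rotational error, which is omitted here):

```python
import numpy as np

def ape_translation(est, gt):
    """Translational APE between time-aligned (N, 3) trajectories."""
    err = np.linalg.norm(est - gt, axis=1)
    return {'mean': err.mean(), 'rmse': np.sqrt((err ** 2).mean()),
            'max': err.max(), 'min': err.min(), 'std': err.std()}

def rpe_translation(est, gt, delta=1):
    """Translational RPE over a fixed interval of delta frames."""
    err = np.linalg.norm((est[delta:] - est[:-delta]) -
                         (gt[delta:] - gt[:-delta]), axis=1)
    return {'mean': err.mean(), 'max': err.max()}
```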

4.3. Experiment on Efficiency

To verify the efficiency of the algorithm, this section also compares the running times of the different algorithms when handling a single scan. As shown in Table 3, the Se-LOAM algorithm takes longer to extract the trunk features from the original point cloud. However, after this preprocessing step, its localization and mapping stages are much faster than those of the other algorithms. Although the front-end time of Se-LOAM is longer, the preprocessing is highly repeatable and can be accelerated with multi-threading or CPU parallel computing techniques.
In addition, Table 4 shows the map memory consumption of the different algorithms. Due to the extraction of trunk features, the amount of data used for positioning and mapping is greatly reduced. In the same environment, the memory consumption of the map is on the order of kilobytes, much lower than that required by the other algorithms.

5. Discussion

In this paper, a semantic SLAM system based on landmarks is proposed. The system aims to provide continuous and accurate pose estimation, as well as to output the radius of trunks. Semantic segmentation is added at the front end of the system; this preprocessing step extracts landmarks to reduce the redundancy of mapping. As shown in Figure 6, LOAM has poor performance in qualitative map construction, and the point clouds of trees are not clear. LeGO-LOAM adds ground-based detection, which can make the map more accurate. However, noise such as weeds and shrubbery will be detected in the complex environment of rubber plantations, which causes problems in tree detection. Our algorithm only builds the map of the tree trunks, which improves the efficiency and clarity of the map.
The back end of the classical LOAM architecture matches the edge points and plane points of adjacent frames. Comparing the LOAM algorithm with LOAM combined with RangeNet++ in Table 1 and Table 2, it can be seen that semantic segmentation preprocessing alone reduces the number of feature points, which degrades the accuracy of global and local pose estimation. Thus, instead of the lidar odometry method of LOAM, Se-LOAM clusters the point cloud of tree trunks after semantic segmentation and synthesizes cylinders with the same radius and position to participate in feature matching. The pose transformation is calculated by searching the nearest neighbor points between the current point cloud and the corresponding cylindrical landmark in the projected frame. Comparing Se-LOAM with LOAM combined with RangeNet++ in Table 1 and Table 2, it can be seen that our method improves on the accuracy of the LOAM pipeline, which relies only on edge points and surface points.
Under the comprehensive evaluation criteria, the average APE of Se-LOAM is 0.34 m, and the RPE is 0.02 m, which is more accurate than other methods. In addition, because of the extraction of trunk features, our method has better construction efficiency and consumes less memory.

6. Conclusions

Owing to the instability of the environment in rubber plantations, simultaneous localization and mapping there faces great challenges. Accurate mapping is significant for rubber-tapping robots to stop operating and divert in time. Aiming at the limitations of classic SLAM algorithms, this paper adds a semantic segmentation network model to the front end of the LOAM structure to extract meaningful trunk features. This method reduces redundant noise and improves the robustness of the system. The semantic model is combined with a Viterbi-algorithm-based method to extract each rubber tree instance and associated attributes, such as radius and center coordinates. The standardized tree model is used to estimate the sensor position and pose transformation. The experimental results show that the Se-LOAM system can construct a clear map with landmarks and measure the DBH with an average error of 0.57 cm, which meets the industry criterion. When mapping an environment of about 500 m², the running time of lidar odometry is 5.6 ms and the running time of lidar mapping is 94.7 ms, which is 50% less than LeGO-LOAM. The memory consumption is only 401.4 kB, about 10% of that of LOAM and LeGO-LOAM.
In conclusion, the optimized Se-LOAM system has faster running speed, less memory consumption, and better mapping performance than the other two classical SLAM algorithms. A semantic segmentation model with geometry-based feature association is the key to achieving accurate, large-scene mapping in challenging rubber plantations. Future research will focus on the following areas: (1) The system currently outputs the fitted cylinder radius of the trunk feature point clouds, so the DBH estimate may contain some error; by using the sensor pose transformation, the system could later output the perimeter of trees at a height of 1.3 m directly. (2) With the development of semantic segmentation networks, lightweight networks with higher performance can be adopted to improve the accuracy of feature detection. (3) A multi-sensor fusion (lidar and vision) SLAM system can be combined with the semantic segmentation network and feature matching algorithm of this paper to improve adaptability to varied and complex working conditions.

Author Contributions

Conceptualization, H.Y. and Z.Z.; methodology, H.Y.; software, H.Y. and J.L.; validation, H.Y.; formal analysis, H.Y.; investigation, Y.C.; resources, X.Z.; data curation, H.Y.; writing—original draft preparation, H.Y.; writing—review and editing, Z.Z. and J.L.; visualization, H.Y. and Y.C.; funding acquisition, Z.Z. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “Key technology research and demonstration of intelligent copying advanced rubber-tapping machine, grant no. YSPTZX202109”, “International Scientific and Technological Cooperation R&D Project in Hainan Province, grant no. GHYF2023002”, “Key Research and Development Project of Hainan Province, grant no. ZDYF2021XDNY198” and “National Modern Agricultural Industry Technology System Post Scientist Project, grant no. CARS-33-JX2”.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank Zhengbin Liang for his help in recording the rosbag. We would like to thank Zejin Sun for helping to measure the DBH.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, C.L.; Yong, L.Y.; Chen, Y.; Zhang, S.L.; Ge, L.Z.; Wang, S.; Li, W. A Rubber-Tapping Robot Forest Navigation and Information Collection System Based on 2D LiDAR and a Gyroscope. Sensors 2019, 19, 2136. [Google Scholar] [CrossRef] [PubMed]
  2. Palieri, M.; Morrell, B.; Thakur, A.; Ebadi, K.; Nash, J.; Chatterjee, A.; Kanellakis, C.; Carlone, L.; Guaragnella, C.; Agha-Mohammadi, A.A. LOCUS: A Multi-Sensor Lidar-Centric Solution for High-Precision Odometry and 3D Mapping in Real-Time. IEEE Robot. Autom. Lett. 2021, 6, 421–428. [Google Scholar] [CrossRef]
  3. Bauwens, S.; Bartholomeus, H.; Calders, K.; Lejeune, P. Forest Inventory with Terrestrial LiDAR: A Comparison of Static and Hand-Held Mobile Laser Scanning. Forests 2016, 7, 127. [Google Scholar] [CrossRef]
  4. Polewski, P.; Yao, W.; Cao, L.; Gao, S. Marker-free coregistration of UAV and backpack LiDAR point clouds in forested areas. ISPRS J. Photogramm. Remote Sens. 2019, 147, 307–318. [Google Scholar] [CrossRef]
  5. Shao, J.; Zhang, W.M.; Mellado, N.; Wang, N.; Jin, S.G.; Cai, S.S.; Luo, L.; Lejemble, T.; Yan, G.J. SLAM-aided forest plot mapping combining terrestrial and mobile laser scanning. ISPRS J. Photogramm. Remote Sens. 2020, 163, 214–230. [Google Scholar] [CrossRef]
  6. Yang, H.; Sun, Z.J.; Liu, J.X.; Zhang, Z.F.; Zhang, X.R. The Development of Rubber Tapping Machines in Intelligent Agriculture: A Review. Appl. Sci. 2022, 12, 9304. [Google Scholar] [CrossRef]
  7. Zhou, Y.; Tuzel, O. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 4490–4499. [Google Scholar]
  8. Galceran, E.; Carreras, M. A survey on coverage path planning for robotics. Robot. Auton. Syst. 2013, 61, 1258–1276. [Google Scholar] [CrossRef]
  9. Jones, E.S.; Soatto, S. Visual-inertial navigation, mapping and localization: A scalable real-time causal approach. Int. J. Robot. Res. 2011, 30, 407–430. [Google Scholar] [CrossRef]
  10. Labbe, M.; Michaud, F. RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation. J. Field Robot. 2019, 36, 416–446. [Google Scholar] [CrossRef]
  11. Hess, W.; Kohler, D.; Rapp, H.; Andor, D. Real-Time Loop Closure in 2D LIDAR SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 1271–1278. [Google Scholar]
  12. Ye, H.Y.; Chen, Y.Y.; Liu, M. Tightly Coupled 3D Lidar Inertial Odometry and Mapping. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Montreal, Canada, 20–24 May 2019; pp. 3144–3150. [Google Scholar]
  13. Weiss, U.; Biber, P. Plant detection and mapping for agricultural robots using a 3D LIDAR sensor. Robot. Auton. Syst. 2011, 59, 265–273. [Google Scholar] [CrossRef]
  14. Chen, S.W.; Nardari, G.V.; Lee, E.S.; Qu, C.; Liu, X.; Romero, R.A.F.; Kumar, V. SLOAM: Semantic Lidar Odometry and Mapping for Forest Inventory. IEEE Robot. Autom. Lett. 2020, 5, 612–619. [Google Scholar] [CrossRef]
  15. Han, S.Q.; Xi, Z.H. Dynamic Scene Semantics SLAM Based on Semantic Segmentation. IEEE Access 2020, 8, 43563–43570. [Google Scholar] [CrossRef]
  16. Wang, X.T.; Fan, X.N.; Shi, P.F.; Ni, J.J.; Zhou, Z.K. An Overview of Key SLAM Technologies for Underwater Scenes. Remote Sens. 2023, 15, 2496. [Google Scholar] [CrossRef]
  17. Debeunne, C.; Vivet, D. A Review of Visual-LiDAR Fusion based Simultaneous Localization and Mapping. Sensors 2020, 20, 2068. [Google Scholar] [CrossRef]
  18. Zhou, L.P.; Huang, G.Q.; Mao, Y.N.; Yu, J.C.; Wang, S.Z.; Kaess, M. PLC-LiSLAM: LiDAR SLAM With Planes, Lines, and Cylinders. IEEE Robot. Autom. Lett. 2022, 7, 7163–7170. [Google Scholar] [CrossRef]
  19. Zhao, Y.M.; Zhang, X.; Huang, X.M. A Technical Survey and Evaluation of Traditional Point Cloud Clustering Methods for LiDAR Panoptic Segmentation. In Proceedings of the 18th IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 11–17 October 2021; pp. 2464–2473. [Google Scholar]
  20. Pierzchala, M.; Giguere, P.; Astrup, R. Mapping forests using an unmanned ground vehicle with 3D LiDAR and graph-SLAM. Comput. Electron. Agric. 2018, 145, 217–225. [Google Scholar] [CrossRef]
  21. Fan, Z.; Wei, J.; Zhang, R.; Zhang, W. Tree Species Classification Based on PointNet++ and Airborne Laser Survey Point Cloud Data Enhancement. Forests 2023, 14, 1246. [Google Scholar] [CrossRef]
  22. Wang, Z.H.; Li, S.L.; Cao, M.; Chen, H.Y.; Liu, Y.H. Pole-like Objects Mapping and Long-Term Robot Localization in Dynamic Urban Scenarios. In Proceedings of the 2021 IEEE International Conference on Robotics and Biomimetics (IEEE-Robio 2021), Sanya, China, 27–31 December 2021; pp. 998–1003. [Google Scholar] [CrossRef]
  23. Milioto, A.; Vizzo, I.; Behley, J.; Stachniss, C. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 4–8 November 2019; pp. 4213–4220. [Google Scholar]
  24. Zhang, J.D.; Wang, W.; Qi, X.Y.; Liao, Z.W. Social and Robust Navigation for Indoor Robots Based on Object Semantic Grid and Topological Map. Appl. Sci. 2020, 10, 8991. [Google Scholar] [CrossRef]
  25. Zhu, J.S.; Li, Q.; Cao, R.; Sun, K.; Liu, T.; Garibaldi, J.M.; Li, Q.Q.; Liu, B.Z.; Qiu, G.P. Indoor Topological Localization Using a Visual Landmark Sequence. Remote Sens. 2019, 11, 73. [Google Scholar] [CrossRef]
  26. Chen, X.Y.L.; Milioto, A.; Palazzolo, E.; Giguere, P.; Behley, J.; Stachniss, C. SuMa++: Efficient LiDAR-based Semantic SLAM. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 4–8 November 2019; pp. 4530–4537. [Google Scholar]
  27. Li, S.X.; Li, G.Y.; Wang, L.; Qin, Y.C. SLAM integrated mobile mapping system in complex urban environments. ISPRS J. Photogramm. Remote Sens. 2020, 166, 316–332. [Google Scholar] [CrossRef]
  28. Zhao, S.B.; Fang, Z.; Li, H.L.; Scherer, S. A Robust Laser-Inertial Odometry and Mapping Method for Large-Scale Highway Environments. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 4–8 November 2019; pp. 1285–1292. [Google Scholar]
  29. Milioto, A.; Lottes, P.; Stachniss, C. Real-time Semantic Segmentation of Crop and Weed for Precision Agriculture Robots Leveraging Background Knowledge in CNNs. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 2229–2235. [Google Scholar]
  30. Lin, J.R.; Zhang, F. Loam_livox: A fast, robust, high-precision LiDAR odometry and mapping package for LiDARs of small FoV. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–15 June 2020; pp. 3126–3131. [Google Scholar]
  31. Zhang, J.; Singh, S. Low-drift and real-time lidar odometry and mapping. Auton. Robot. 2017, 41, 401–416. [Google Scholar] [CrossRef]
  32. Shan, T.X.; Englot, B. LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain. In Proceedings of the 25th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4758–4765. [Google Scholar]
  33. Shan, T.X.; Englot, B.; Meyers, D.; Wang, W.; Ratti, C.; Rus, D. LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October–24 January 2020; pp. 5135–5142. [Google Scholar]
  34. Pan, Y.; Xiao, P.C.A.; He, Y.J.; Shao, Z.L.; Li, Z.S. MULLS: Versatile LiDAR SLAM via Multi-metric Linear Least Square. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Xi'an, China, 30 May–5 June 2021; pp. 11633–11640. [Google Scholar]
  35. Nie, F.Y.; Zhang, W.M.; Wang, Y.; Shi, Y.L.; Huang, Q. A Forest 3-D Lidar SLAM System for Rubber-Tapping Robot Based on Trunk Center Atlas. IEEE/ASME Trans. Mechatron. 2022, 27, 2623–2633. [Google Scholar] [CrossRef]
  36. Fang, J.; Shi, Y.; Cao, J.; Sun, Y.; Zhang, W. Active Navigation System for a Rubber-Tapping Robot Based on Trunk Detection. Remote Sens. 2023, 15, 3717. [Google Scholar] [CrossRef]
  37. de Berg, M.; van Kreveld, M.; Overmars, M.; Schwarzkopf, O. Computational Geometry: Algorithms and Applications, 3rd ed.; Springer: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
  38. Wang, W.; Zhang, Y.; Ge, G.Y.; Jiang, Q.; Wang, Y.; Hu, L.H. A Hybrid Spatial Indexing Structure of Massive Point Cloud Based on Octree and 3D R*-Tree. Appl. Sci. 2021, 11, 9581. [Google Scholar] [CrossRef]
  39. Lu, B.; Wang, Q.; Li, A. Massive Point Cloud Space Management Method Based on Octree-Like Encoding. Arab. J. Sci. Eng. 2019, 44, 9397–9411. [Google Scholar] [CrossRef]
  40. Zhao, Z.H.; Zhang, W.Q.; Gu, J.F.; Yang, J.J.; Huang, K. Lidar Mapping Optimization Based on Lightweight Semantic Segmentation. IEEE Trans. Intell. Veh. 2019, 4, 353–362. [Google Scholar] [CrossRef]
  41. Liu, X.Y.; Wang, Y.X.; Kang, F.; Yue, Y.; Zheng, Y.J. Canopy Parameter Estimation of Citrus grandis var. Longanyou Based on LiDAR 3D Point Clouds. Remote Sens. 2021, 13, 1859. [Google Scholar] [CrossRef]
  42. Hayashi, A.; Iwata, K.; Suematsu, N. Marginalized Viterbi algorithm for hierarchical hidden Markov models. Pattern Recognit. 2013, 46, 3452–3459. [Google Scholar] [CrossRef]
  43. Lukacs, G.; Marshall, A.; Martin, R. Geometric least-squares fitting of spheres, cylinders, cones and tori. Deliv. Doc. 1997, 23, 1–20. [Google Scholar]
  44. Quigley, M.; Gerkey, B.; Conley, K.; Faust, J.; Foote, T.; Leibs, J.; Berger, E.; Wheeler, R.; Ng, A. ROS: An open-source robot operating system. In Proceedings of the ICRA Workshop on Open Source Software, Kobe, Japan, 12–17 May 2009. [Google Scholar]
  45. Liu, H.S. Discussion of DBH (diameter at breast height). China Terminol. 2011, 13, 44–45. [Google Scholar]
  46. Guan, B.F. Discussion on improving the measuring accuracy of DBH of sample wood in continuous inventory of forest resources. Inn. Mong. For. Investig. Des. 2010, 33, 63–64. [Google Scholar] [CrossRef]
Figure 1. Rubber-tapping robot of Se-LOAM system.
Figure 2. Block diagram of the semantic SLAM system. Each of the arrows represents the flow of the system.
Figure 3. Semantic network training. (a) Spatial distribution of the training data. (b) The moving area of rubber-tapping robot during data acquisition process. (c) Point cloud labeling tool.
Figure 4. Geometric parameterization of the fitted cylinder model.
Figure 5. Training results of different input sizes.
Figure 6. Mapping results of different algorithms. (a) Rubber plantation scene. (b) LOAM algorithm. (c) LeGO-LOAM algorithm. (d) Se-LOAM algorithm.
Figure 7. Circumference measurement in the plantation. (a) Scenes of different sequences. (b) Circumference measurement. (c) The radius error of the tree.
Figure 8. Trajectory comparison between Se-LOAM system and other methods. (a) The real scene in a normalized rubber plantation. (b) Trajectory diagram in rubber plantation scenario.
Figure 9. The APE and RPE of different methods. (a) Se-LOAM algorithm. (b) LeGO-LOAM algorithm. (c) LOAM algorithm. (d) LOAM combined with RangeNet++ algorithm.
Table 1. Quantitative results on absolute pose error (m).

Methods             Max         Mean       RMSE       Min        Std
Ours                1.956140    0.364323   0.449655   0.009857   0.294151
LeGO-LOAM           1.953773    0.619162   0.699907   0.007934   0.326356
LOAM                12.382312   5.769255   6.632797   0.215692   3.272567
LOAM + RangeNet++   10.659616   5.964735   5.401386   1.052012   2.1275.0
Table 2. Quantitative results on relative pose error (m).

Methods             Max        Mean       RMSE       Min        Std
Ours                0.184636   0.022946   0.031493   0.000557   0.021571
LeGO-LOAM           0.191242   0.024456   0.033738   0.000499   0.023241
LOAM                0.208188   0.054568   0.068320   0.010457   0.041109
LOAM + RangeNet++   0.228146   0.041609   0.054472   0.001694   0.035155
Table 3. Runtime for processing one scan (in milliseconds).

Algorithm    Feature Extraction        Odometry                  Mapping
             Seq1    Seq2    Seq3      Seq1    Seq2    Seq3      Seq1    Seq2    Seq3
LOAM         46.6    31.5    35.6      56.4    25.3    33.5      977.4   424.6   870.5
LeGO-LOAM    11.2    7.8     9.9       10.9    7.3     9.3       149.9   60.3    124.0
Se-LOAM      18.7    11.2    15.7      5.6     2.0     3.6       94.7    35.7    78.9
Table 4. Memory consumption of map.

Algorithm    Seq 1      Seq 2      Seq 3
LOAM         4.1 MB     3.3 MB     5.6 MB
LeGO-LOAM    3.2 MB     2.6 MB     4.4 MB
Se-LOAM      401.4 kB   324.0 kB   511.4 kB