Research on Visual Positioning of a Roadheader and Construction of an Environment Map

Abstract: The autonomous positioning of tunneling equipment is the key to the intellectualization and robotization of a tunneling face. In this paper, a method based on simultaneous localization and mapping (SLAM) is presented to estimate the body pose of a roadheader and to build a navigation map of a roadway. In terms of pose estimation, an RGB-D camera is used to collect images, a pose calculation model of the roadheader is established based on random sample consensus (RANSAC) and iterative closest point (ICP), and a pose graph optimization model with closed-loop constraints is constructed. An iterative equation based on the Levenberg–Marquardt method is derived, which achieves the optimal estimation of the body pose. In terms of mapping, LiDAR is used to experimentally construct grid maps based on open-source algorithms, such as Gmapping, Cartographer, Karto, and Hector. A point cloud map, an octree map, and a compound map are experimentally constructed based on the open-source library RTAB-MAP. By setting parameters, such as the expansion radius of an obstacle and the update frequency of the map, a cost map for the navigation of a roadheader is established. Combined with algorithms such as Dijkstra and timed-elastic-band, simulation experiments show that the combination of the octree map and the cost map can support global path planning and local obstacle avoidance.


Introduction
With the rapid development of the industrial Internet, big data, and artificial intelligence, the intellectualization and robotization of equipment have become an inevitable trend in industrial development. Based on literature data from the Web of Science database, a cluster analysis of hotspots in intelligent coal mine research, from 2010 to 2020, obtained using VOSviewer software, is shown in Figure 1. It reveals that the in-depth integration of intelligent technology and the development of the coal industry is inevitable. For China's coal industry, the degree of automation and intelligence of fully mechanized mining faces has been significantly improved over years of innovative development, and has been applied practically in many mining areas [1,2]. However, the intelligent development of fully mechanized excavation faces is still in a bottleneck period due to the poor working environments and complex excavation processes; the imbalance between mining and tunneling seriously restricts the safety and efficiency of coal mine production. Therefore, it is urgent to carry out research on the intelligence of tunneling equipment and systems for a fully mechanized excavation face [3,4]. As the main equipment for coal roadway driving, autonomous positioning of roadheaders is a key link to realize intelligent tunneling, and many scholars have done a great deal of original work in this area. An indoor global position system (IGPS)-based pose measurement method for roadheaders was proposed by Tao et al. [5]. The position coordinates of a receiver can be obtained from a single transmitter station, which realized the autonomous measurement of the body pose in a multi-point time-sharing manner. A remote-controlled system for roadheaders was developed by Roman et al. [6]. The system had built-in active lighting infrared sensors, which could assist in the positioning of a roadheader without visual control by workers. Du et al. 
[7] studied the pose measurement method of roadheaders based on machine vision. A laser target was fixed on a roadheader to receive the laser beam. After the imaging features were collected using a monocular camera, the roadheader pose can accurately be solved by solving the pose model and performing error compensation. Mao et al. [8] explored a multi-source fusion measurement method based on ultrasound, inertial navigation, a laser, and geomagnetic information. The method could overcome the shortcoming of excessive accumulated errors in a single inertial navigation measurement and effectively monitor the roadheader pose. Fu et al. [9,10] developed a set of roadheader pose measurement systems based on an ultra-wideband. A time-of-arrival model was built after the distance between the positioning node of a roadheader and ultra-wideband (UWB) positioning base station was measured. After the positioning algorithm was solved, the roadheader's pose information could be obtained. Yang et al. [11] presented a vision measurement method for the pose of roadheader's cutting head, based on infrared LEDs. A 16-point infrared target was fixed on the roadheader, and then the target image was collected and processed using a monocular camera; the pose of the cutting head could then be estimated. Simulation and experimental results showed that the accuracy of pose measurements could meet the requirements of underground roadway construction. Yan et al. [12] investigated a roadheader pose measurement method based on multi-source information, such as a gyroscope, a total station, and an inertial measurement system, and built a positioning model. Experimental results showed that this method could meet precision requirements. Xue et al. [13] put forward a positioning method for roadheaders based on binocular vision technology. 
By collecting information with a binocular vision sensor, the coordinate model between a roadheader and the roadway space was established, and then the pose was calculated. Simulations and experiments showed that the accuracy of the calculated positioning parameters was good and the established positioning model was reasonable. Yang et al. [14] proposed a vision measurement method for the pose of a roadheader body based on laser beams. The laser beams were used for the image feature, and a measurement system of roadheader body pose, based on monocular vision, was established. Experimental results showed that this method could effectively solve body pose information of roadheaders. A pose measurement method for a roadheader, combining total station and strapdown inertial navigation, was proposed by Zhang et al. [15]. Total station and inertial navigation technology were used to measure the pose of roadheaders, respectively, and the pose was determined using fusion data, which reduced the influence of dust in the mine.
The above research covers the main schemes of present pose measurement of roadheaders. A total station has high installation flexibility, but its light path is easily blocked, and its real-time performance is poor, so it cannot adapt to the dynamic working conditions of a roadheader. An inertial sensor is generally used as an auxiliary measurement method because of its large cumulative error. UWB technology has strong real-time performance, but its signal is greatly affected by the complex electromagnetic environment of coal mines. IGPS technology has a large measurement range, but the cost of installation and construction is high. Visual measurements based on target imaging are susceptible to the influence of light beam scattering, and the installation of a light source emitter is limited by complicated roadway structures. The composition of a combined measurement system is complex, and the multi-source data are difficult to synchronize accurately in time and space; thus, the data reliability is not high. State perception, trajectory planning, and motion control are the three keys to realizing intelligent and robotized roadheaders. As a core technology of autonomous positioning and navigation for mobile devices, simultaneous localization and mapping (SLAM) is widely used in mobile robot navigation, unmanned driving, and other fields [16][17][18]. For instance, the open-source library Real-Time Appearance-Based Mapping (RTAB-MAP) provides a positioning and mapping solution that is independent of time and scale and can be applied to various setups [19]. According to sensor type, SLAM technology is mainly divided into laser-based and vision-based approaches [20]. The point cloud obtained by LiDAR is sparse, the created map lacks semantic information, and the linearization error is large. In contrast, vision sensors can obtain far more state information than LiDAR.
Furthermore, the RGB-D (red, green, blue, and depth) camera is based on active ranging technology; compared with monocular and binocular cameras, it has the advantage of being largely unaffected by ambient lighting changes and surface texture.
In summary, SLAM technology is a promising solution for realizing the autonomous positioning and navigation of roadheaders and roadway mapping in a GPS-free roadway environment. At present, there are no reports on the application of visual SLAM technology to coal roadway roadheaders, and research on this technology is still lacking. Therefore, comprehensively considering the positioning requirements of roadheaders and objective conditions such as the tunnel environment and sensor attributes, this paper proposes a method of body pose estimation for roadheaders based on visual SLAM and tests multiple schemes of environmental map construction based on laser and visual SLAM. These can provide information for pose correction and navigation of roadheaders and for the three-dimensional reconstruction of roadways.

Definition of Coordinate System and Pose
The positioning problem of a roadheader is transformed into a state estimation problem in this paper; that is, to solve for the optimal pose. To accurately describe the pose of a roadheader in three-dimensional (3D) space, a series of coordinate systems needs to be established: an earth coordinate system O e X e Y e Z e , a roadway coordinate system O t X t Y t Z t , a world coordinate system O w X w Y w Z w , and a camera coordinate system O c X c Y c Z c . The O e system is fixed on the earth and is the basis of the design of the roadway; the O t system is the absolute reference of the location in the roadway, and its position is determined by the structure of the roadway; the O w system is the reference frame for the pose description of the airborne camera when it is moving; and the O c and O w systems are related by rotation and translation.
As the input device of visual SLAM, an RGB-D camera can be installed in the sensing system of a roadheader through intrinsic safety technology. Considering that the camera is connected to the roadheader body, the pose of the roadheader's body is expressed by the pose of the RGB-D camera relative to the O w system. If the absolute pose of the roadheader body in the O t system or the O e system is required, it can be obtained using Euclidean coordinate transformation. In addition, the pose of the cutting head of a cantilever roadheader can be solved according to the Denavit-Hartenberg model and forward kinematics in robotics [21]. In sum, the pose acquisition of a roadheader body in the O w system is the basis for realizing other functions.
At the initial moment, it is stipulated that the O c and O w systems coincide. The X w axis of the O w system is located in the horizontal plane and is perpendicular to the roadway midline, pointing to the right of the heading direction. The Y w axis is located in the horizontal plane and is parallel to the roadway midline, pointing in the heading direction; the Z w axis is perpendicular to the horizontal plane and points to the roadway roof. After the roadheader moves, the pose of the O c system changes relative to the O w system, and the pose of the roadheader body (camera) is described by axis deflection angles α, β, and γ, and axis displacements t x , t y , and t z , as shown in Figure 2. α is defined as the included angle between the projection of the axis in the tunneling direction on the O w X w Y w plane and the Y w axis, namely the yaw angle (Figure 2a). β is defined as the included angle between the projection of the axis in the tunneling direction on the O w Y w Z w plane and the Y w axis, namely the pitch angle (Figure 2b). γ is defined as the included angle between the projection of the transverse axis of the roadheader on the O w X w Z w plane and the X w axis, namely the roll angle (Figure 2c); t x , t y , and t z are, respectively, the displacements of the roadheader body in the directions of the X w , Y w , and Z w axes (Figure 2d). According to rigid body kinematics, the motion of a roadheader can be decomposed into rotation and translation. The rotation of the body (camera) in the O w system is described by rotation matrix R; R cw means the rotation transformation from the O w system to the O c system.
Specifically, point P w (x w , y w , z w ) in the O w system is rotated and transformed to point P c (x c , y c , z c ) in the O c system, which can be described as: (1) rotating at angle α along the Z w axis, as shown in Equation (1); (2) rotating at angle β along the X w axis, as shown in Equation (2); and (3) rotating at angle γ along the Y w axis, as shown in Equation (3). R has a definite transformation relationship with the α, β, and γ, as shown in Equation (4) [22].
The translation of the body (camera) is described by 3D vector t, and t cw represents the translation transformation from the O w system to the O c system, as shown in Equation (5). After homogeneous transformation, R cw and t cw can be unified into transformation matrix T cw , as shown in Equation (6) [22]. In the absence of ambiguity, R cw , t cw , and T cw below are referred to as R, t, and T without subscripts.
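The matrices referenced as Equations (1)–(6) did not survive extraction. In standard form, with the rotation sequence listed above (α about the Z w axis, β about the X w axis, γ about the Y w axis), the elemental rotations read as follows; the sign conventions of the original could not be recovered verbatim, so the common right-handed convention is assumed:

```latex
R_{Z}(\alpha)=\begin{bmatrix}\cos\alpha&-\sin\alpha&0\\\sin\alpha&\cos\alpha&0\\0&0&1\end{bmatrix},\quad
R_{X}(\beta)=\begin{bmatrix}1&0&0\\0&\cos\beta&-\sin\beta\\0&\sin\beta&\cos\beta\end{bmatrix},\quad
R_{Y}(\gamma)=\begin{bmatrix}\cos\gamma&0&\sin\gamma\\0&1&0\\-\sin\gamma&0&\cos\gamma\end{bmatrix}
```

Composing in the listed order gives the relationship of Equation (4), and the homogeneous form of Equation (6) stacks R cw and t cw into a single transformation:

```latex
R = R_{Y}(\gamma)\,R_{X}(\beta)\,R_{Z}(\alpha),\qquad
T_{cw}=\begin{bmatrix}R_{cw}&t_{cw}\\\mathbf{0}^{T}&1\end{bmatrix}\in SE(3)
```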

SLAM Mathematical Model of Roadheader
In the moving process of the roadheader, the airborne camera observes the environmental signposts and estimates its own motion. As shown in Figure 3, the SLAM process of a roadheader equipped with a camera can be visualized: the pose states and signposts are represented by nodes, and the processes of camera movement and observation are represented by edges. Further, at time i, let the pose variable of the airborne camera be x i , the position variable of waypoint j be y j , the motion input be u i , and the observation data of the airborne camera be z i,j ; then, the mathematical model of visual SLAM for the roadheader is shown as Equation (7).
x_i = f_p(x_{i−1}, u_i) + w_i,  z_{i,j} = g_o(x_i, y_j) + v_{i,j}    (7)

where f_p and g_o are the process function and observation function, respectively; w_i and v_{i,j} are the motion noise and observation noise at time i, respectively.

Pose Estimation Scheme
The pose estimation scheme of the roadheader (camera) is shown in Figure 4. Specifically, (1) the images of the roadway environment are collected by the airborne RGB-D camera and depth information is obtained; (2) feature extraction and matching are performed on the color images, and 3D matching pairs are formed after depth mapping; (3) combined with the random sample consensus (RANSAC) strategy and the iterative closest point (ICP) strategy, the pose of the roadheader is estimated; (4) the pose graph, based on keyframe and closed-loop constraints, is constructed, and the global pose optimization is carried out to obtain the optimal pose estimation of the roadheader.


Image Feature Extraction and Matching
Images collected by the RGB-D camera need feature calculations and data associations, and commonly used methods include the direct method, optical flow method, and feature point method [23]. Both the direct method and the optical flow method rely too heavily on the assumption of a constant gray level, so it is difficult for them to be directly used in a roadway environment; the feature point method, however, is less affected by motion and illumination and is relatively stable. In coal mine roadways, the forming quality of a section is affected by the real-time and accurate measurement of the roadheader pose. Compared with other features, ORB (oriented fast and rotated brief) features perform better in terms of real-time performance and accuracy, and ORB features are not sensitive to rotation and translation [24,25]. Consequently, ORB features are used for extraction and matching in this paper.
To verify the effect of ORB feature extraction and matching, two images from the public dataset, TUM-RGBD, were selected for the experiment. The experimental results are shown in Figure 5. Figure 5a shows the effect obtained using the Brute-Force matcher, and Figure 5b shows the effect after match filtering, where a match is kept only if its Hamming distance is less than twice the minimum distance over all matches.
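The filtering rule above can be sketched in a few lines of NumPy. The function names are our own, and the 30-count floor (a common OpenCV-tutorial heuristic guarding against a zero minimum distance) is an addition not stated in the paper; the experiment itself would more likely use OpenCV's `BFMatcher` with `NORM_HAMMING`:

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two uint8 descriptor vectors (e.g., 32-byte ORB)."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def brute_force_filter(des1, des2):
    """Brute-force nearest-neighbour matching, then keep only matches whose
    Hamming distance is below twice the minimum distance over all matches."""
    matches = []
    for i, d1 in enumerate(des1):
        dists = [hamming(d1, d2) for d2 in des2]
        j = int(np.argmin(dists))
        matches.append((i, j, dists[j]))  # (index in des1, index in des2, distance)
    d_min = min(m[2] for m in matches)
    # the floor of 30 is an added guard, not part of the paper's rule
    return [m for m in matches if m[2] < max(2 * d_min, 30)]
```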
Figure 5. Effect of ORB feature extraction and matching.


Pose Solution
Based on image feature extraction and matching, an initial motion transform with a high confidence is iteratively solved using the RANSAC method, which aims to provide reliable initial values for pose estimation based on ICP. The process is as follows: (1) According to the matched 3D point set, the initial model is generated by randomly sampling some point pairs; (2) the model is evaluated according to the given threshold by calculating the point-pair error; and (3) through calculating the number of interior point pairs and the average value of the error, the model is updated and iterated based on the given threshold. After several iterations, a relatively optimal motion transformation can be obtained [26,27].
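Steps (1)–(3) above can be sketched as follows, assuming matched 3D point arrays P and Q. The function names, the minimal sample size of three, the iteration count, and the inlier threshold are illustrative choices, not values from the paper:

```python
import numpy as np

def fit_rigid(P, Q):
    """Closed-form least-squares R, t with P ≈ Q @ R.T + t (SVD-based)."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    W = (P - p_bar).T @ (Q - q_bar)
    U, _, Vt = np.linalg.svd(W)
    R = U @ Vt
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        U[:, -1] *= -1
        R = U @ Vt
    return R, p_bar - R @ q_bar

def ransac_rigid(P, Q, iters=200, thresh=0.05, seed=0):
    """RANSAC over matched 3D pairs: sample, score, keep the best model."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)  # (1) random minimal sample
        R, t = fit_rigid(P[idx], Q[idx])
        err = np.linalg.norm(P - (Q @ R.T + t), axis=1)  # (2) point-pair error
        inliers = err < thresh                           # (3) score vs. threshold
        if inliers.sum() > best.sum():
            best = inliers
    R, t = fit_rigid(P[best], Q[best])                   # refine on the inlier set
    return R, t, best
```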
After depth mapping and RANSAC iteration, a set of exactly matched 3D point sets is obtained, i.e., P = {p_1, p_2, . . . , p_n} and Q = {q_1, q_2, . . . , q_n}. The body pose transformation between the i-th pair of 3D matching points in the point sets can be written as:

p_i = R q_i + t, i = 1, 2, . . . , n    (8)

According to the SLAM mathematical model of a roadheader (Equation (7)), there are noises in the motion equation and observation equation of the airborne camera; thus, the state estimation is uncertain. The inter-frame error function, e_(3D-3D)i, is defined as [28]:

e_(3D-3D)i = p_i − (R q_i + t)    (9)

For the n pairs of 3D matching points in the point set, the least squares expression of the objective function, F(R, t), can be expressed as [29,30]:

F(R, t) = (1/2) Σ_{i=1}^{n} ||p_i − (R q_i + t)||²    (10)

To solve Equation (10), let the de-centroid coordinates of point sets P and Q be p'_i and q'_i, respectively, that is:

p'_i = p_i − p̄, q'_i = q_i − q̄, where p̄ = (1/n) Σ_{i=1}^{n} p_i and q̄ = (1/n) Σ_{i=1}^{n} q_i    (11)

Combining Equations (10) and (11), and noting that the cross terms vanish because the de-centroid coordinates sum to zero, the de-centroid form can be obtained as:

F(R, t) = (1/2) Σ_{i=1}^{n} ( ||p'_i − R q'_i||² + ||p̄ − R q̄ − t||² )    (12)

Equation (12) can be decomposed into two functions, f_1(R) and f_2(t), namely:

f_1(R) = (1/2) Σ_{i=1}^{n} ||p'_i − R q'_i||²    (13)

f_2(t) = (1/2) Σ_{i=1}^{n} ||p̄ − R q̄ − t||²    (14)

Further, Equation (13) can be expanded and simplified; since ||p'_i||² and ||q'_i||² do not depend on R, minimizing f_1(R) is equivalent to minimizing:

g_1(R) = −Σ_{i=1}^{n} p'_i^T R q'_i = −tr( R Σ_{i=1}^{n} q'_i p'_i^T )    (15)

where tr() is the trace operator of a matrix.

To solve for the R contained in Equation (15), a method based on singular value decomposition is adopted; that is, let [31,32]:

W = Σ_{i=1}^{n} p'_i q'_i^T = U D V^T    (16)

where U and V are orthogonal matrices and D is the diagonal matrix of singular values of W. According to the principle of the Cauchy–Schwarz inequality, when the objective function g_1(R) in Equation (15) takes its minimum value and W is full rank, R can be obtained; the expression is:

R = U V^T    (17)

After R is solved, let f_2(t) in Equation (14) be zero, and t can be solved, i.e.:

t = p̄ − R q̄    (18)

So far, through the construction of the least squares problem of the pose solution and singular value decomposition, the pose parameters, R and t, of the roadheader body can be solved.
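A minimal NumPy sketch of this closed-form solution follows; the function names are hypothetical, and the determinant guard against a reflection solution is a standard safeguard not spelled out in the text:

```python
import numpy as np

def pose_from_svd(P, Q):
    """De-centroid, SVD of W = sum_i p'_i q'_i^T, then R = U V^T and
    t = p_bar - R q_bar (Arun/Umeyama closed form)."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p_bar, Q - q_bar        # de-centroid coordinates
    W = Pc.T @ Qc                        # 3x3 correlation matrix
    U, D, Vt = np.linalg.svd(W)          # W = U D V^T
    S = np.eye(3)
    S[2, 2] = np.sign(np.linalg.det(U @ Vt))  # force det(R) = +1 (proper rotation)
    R = U @ S @ Vt
    t = p_bar - R @ q_bar
    return R, t

def objective(R, t, P, Q):
    """Least squares objective: 1/2 * sum_i ||p_i - (R q_i + t)||^2."""
    return 0.5 * np.sum(np.linalg.norm(P - (Q @ R.T + t), axis=1) ** 2)
```

For an exact correspondence the recovered transform drives the objective to (numerical) zero, which is a quick sanity check of the derivation.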

Pose Optimization
Pose optimization must be carried out because of the accumulated errors in pose solutions. Waypoints are regarded as constraints of pose optimization rather than as optimization variables; the optimization model of a body pose graph is established by selecting the image key frame. Additionally, closed-loop constraints are added to carry out global pose optimization, so as to ensure the effectiveness and real-time performance of pose estimation, and then the deviation correction is completed in time.

Pose Graph Construction and Closed-Loop Detection
The selection principle of key frames is based on the evaluation of the distance between frames. After R and t are given different weights, the motion transformation index representing the distance between frames can be calculated. If the index is less than a given threshold, the current frame is determined to be a key frame.
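This selection rule can be sketched as follows. The weights and threshold are illustrative assumptions (the paper does not give values), and the rule mirrors the text's convention that a frame whose index falls below the threshold is taken as a key frame:

```python
import numpy as np

def motion_index(R, t, w_r=1.0, w_t=1.0):
    """Weighted inter-frame distance: rotation angle (rad) plus translation norm.
    The weights w_r and w_t are illustrative; the paper does not specify them."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return w_r * angle + w_t * np.linalg.norm(t)

def is_keyframe(R, t, threshold=0.3):
    """Per the text, a frame is a key frame when its index is below the given
    threshold; the threshold value here is an assumption."""
    return motion_index(R, t) < threshold
```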
After selecting the key frame, the pose graph of the motion transformation constraint can be constructed. To further modify the computed pose, closed-loop detection is needed and closed-loop constraints are added to the pose graph. The key of closed-loop detection lies in the judgment of the similarity between two frames of images, i.e., if the similarity between the current frame and the historical key frame exceeds a certain value of similarity between the current frame and the previous-moment key frame, and the consistency of space and time is satisfied, a closed loop can be determined.
Based on the SLAM motion model of the roadheader, shown in Figure 3, the pose graph optimization model, which eliminates the waypoint variables and adds closed-loop constraints, is shown in Figure 6. The figure shows two processes of motion transformation and closed-loop detection; T i(i−1) is the motion transformation constraint between pose nodes x i−1 and x i , and T (i+1)n is the closed-loop constraint between pose nodes x i+1 and x n .


Pose Graph Optimization
According to the pose graph optimization model shown in Figure 6, let the inter-frame transformation matrix obtained in the pose solution stage or closed-loop detection be T_ij, and the pose estimation error be e_ij; then, e_ij can be expressed as [33]:

e_ij = ln( T_ij^{−1} T_i^{−1} T_j )^∨    (19)

where the sign ∨ means to transform the matrix into a vector.

To optimize the global pose graph, the least squares expression of the objective function, G(T), based on the error term is constructed, that is [34]:

G(T) = (1/2) Σ_{i,j} e_ij^T Σ_ij^{−1} e_ij    (20)

where Σ_ij is the covariance matrix of the pose measurement, which is used to represent the uncertainty of the measurement. Equation (20) is nonlinear and is solved using an iterative method in this paper. Let Δx be the iterative increment and x* be the initial value of the pose estimation; then the first-order Taylor expansion of the error, e_ij, at x* can be expressed as:

e_ij(x* + Δx) ≈ e_ij(x*) + J_ij Δx    (21)

where J_ij is the Jacobian matrix of e_ij at x*. Combining Equations (20) and (21), Equation (22) can be obtained after approximate processing, that is:

( Σ_{i,j} J_ij^T Σ_ij^{−1} J_ij ) Δx = −Σ_{i,j} J_ij^T Σ_ij^{−1} e_ij    (22)

By introducing the Lagrange multiplier ρ on the basis of Equation (22), the increment equation based on the Levenberg–Marquardt method is shown in Equation (23):

( Σ_{i,j} J_ij^T Σ_ij^{−1} J_ij + ρI ) Δx = −Σ_{i,j} J_ij^T Σ_ij^{−1} e_ij    (23)

where I is the identity matrix. After multiple iterations of Equation (23), when the termination condition of the iteration is satisfied, the optimized pose estimation can be obtained. Subsequently, closed-loop control and deviation correction can be carried out for the traveling of the roadheader according to the optimized body pose, and trajectory planning of the cutting arm can be performed according to robotics to ensure the forming quality of the roadway section.
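The Levenberg–Marquardt increment step can be sketched on a toy residual. The residual model, damping value, and iteration count below are invented purely for illustration; a real pose graph would use SE(3) Jacobians via a solver such as g2o or Ceres:

```python
import numpy as np

def lm_increment(J, e, Sigma_inv, rho):
    """One damped increment: (J^T Σ^{-1} J + ρ I) Δx = -J^T Σ^{-1} e."""
    H = J.T @ Sigma_inv @ J
    b = J.T @ Sigma_inv @ e
    return np.linalg.solve(H + rho * np.eye(H.shape[0]), -b)

# Toy residual e(x) = (x0^2 - 4, x0*x1 - 6), with minimum at x = (2, 3);
# chosen only to exercise the increment equation.
def residual(x):
    return np.array([x[0] ** 2 - 4.0, x[0] * x[1] - 6.0])

def jacobian(x):
    return np.array([[2.0 * x[0], 0.0], [x[1], x[0]]])

def solve_lm(x0, rho=1e-3, iters=50):
    x = np.array(x0, dtype=float)
    Sigma_inv = np.eye(2)  # unit measurement covariance for the toy problem
    for _ in range(iters):
        x = x + lm_increment(jacobian(x), residual(x), Sigma_inv, rho)
    return x
```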

Overview of Mapping Scheme
Although the acquired pose can provide a priori information for the moving of the roadheader, its navigation also depends on the environment map. Based on SLAM technology, a roadway map is established, which can help the roadheader identify obstacles and judge roadway edge shapes, and provide support for remote man-machine interactions, roadheader travel prediction, section forming control, and so on.
Common mapping schemes are shown in Table 1, including laser-based and vision-based schemes; a solid triangle symbol indicates that the algorithm supports the relevant interface. It can be seen from Table 1 that the RTAB-MAP algorithm can provide a variety of sensor input interfaces and can output a dense 3D point cloud map, which makes it currently an excellent application-oriented mapping solution [19]. Comprehensively considering the underground environment of a coal mine as well as the existing conditions of the laboratory, a long corridor is adopted to simulate an underground roadway, and some of the mapping algorithms listed in Table 1 are tested. The long corridor used for the tests, shown in Figure 7, is as close as possible to the dimly lit, narrow, and closed roadway environment.

Principle of Laser Mapping
To construct an occupancy grid map based on laser SLAM, the key is to use the prior information to calculate the posterior probability of the map, and then identify the grid state, i.e., unknown, idle, or occupied [41,42]. The grid state schemas formed after two-dimensional (2D) LiDAR scanning are shown in Figure 8; O g X g Y g and O b X b Y b are the grid coordinate system and body coordinate system, respectively. The white area in the grid is idle, the black area is occupied, and the gray area is unknown, which is the target area for further exploration using SLAM technology. The measured data after the laser beam is emitted by the 2D LiDAR are denoted as z_L:

z_L = (d, θ)    (24)

where d represents the distance of the measuring point, and θ represents the included angle between the laser beam and the body orientation. According to probability theory, s = 1 means that the grid is occupied, and s = 0 means that the grid is idle; the update rule of state S for any grid is then:

S(s | z_L) = [ p(s = 1 | z_L) / p(s = 0 | z_L) ] · S(s)    (25)

where p() is the probability calculation expression.
The map constructed using laser SLAM can provide a priori information for navigation and obstacle avoidance of the roadheader, but it is not safe to move in an occupancy grid map described only by Boolean values [42]. To realize global navigation of the roadheader and to avoid dynamic obstacles in real time, an obstacle map layer and an inflation layer should be added on top of the static map layer constructed using SLAM; that is, a cost map for navigation should be constructed. The value assignment principle of a cost map is shown in Figure 9, where each grid state is represented by a gray value from 0 to 255. The dashed frame in the figure represents the contour of the roadheader; a grid value of 254 indicates that an obstacle coincides with the center of the body, which will inevitably lead to a collision. The dotted circle represents the inscribed circle of the body contour; the grid value within it is 253, and a collision is also inevitable in this area. The dash-dot circle represents the circumscribed circle of the body contour; the grid values within it range from 128 to 252, and a collision may occur in this area. The dash-dot-dot circle represents the buffer area, with grid values ranging from 1 to 128, which should be avoided. The grid value outside the buffer circle is 0, and free passage is allowed in that area. A gray block with a grid value of 255 represents an unknown area, which needs to be updated according to sensor information [43].
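The gray-value assignment described above can be sketched as a distance-to-cost function. The fragment below assumes the ROS costmap_2d-style convention (254 lethal, 253 within the inscribed radius, an exponential decay across the inflation band, 0 beyond it); the scaling factor is an illustrative parameter, not one taken from this paper:

```python
import math

LETHAL, INSCRIBED, UNKNOWN, FREE = 254, 253, 255, 0

def inflation_cost(distance, inscribed_radius, inflation_radius,
                   scaling_factor=10.0):
    """Grid cost as a function of distance to the nearest obstacle.

    distance == 0 means the obstacle coincides with the cell (lethal);
    inside the inscribed radius a collision is still certain; in the
    inflation band the cost decays exponentially from 252; beyond the
    band the cell is free.
    """
    if distance == 0.0:
        return LETHAL
    if distance <= inscribed_radius:
        return INSCRIBED
    if distance <= inflation_radius:
        # decays from 252 just outside the inscribed circle
        return int(252 * math.exp(-scaling_factor *
                                  (distance - inscribed_radius)))
    return FREE
```

A larger scaling factor makes the cost fall off faster, so the planner hugs obstacles more closely; a smaller one widens the effective buffer.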

Appl. Sci. 2021, 11, x FOR PEER REVIEW


Mapping Test Based on Laser SLAM
Laser SLAM is divided into filtering-based and optimization-based approaches; the Gmapping algorithm [35] in Table 1 belongs to the former, while Cartographer [36], Karto [37], and Hector [38] belong to the latter. In this paper, the four abovementioned laser SLAM algorithms are used to carry out mapping tests of the simulated roadway along the long corridor shown in Figure 7.
The test results of the four algorithms are shown in Figure 10. Figure 10a shows a map of the simulated roadway built with the Gmapping algorithm, which relies on the Rao-Blackwellized particle filter and takes laser data and odometer data as inputs. The map constructed with the Cartographer algorithm is shown in Figure 10b; Cartographer consists of a local and a global SLAM phase, with the first phase establishing and maintaining grid submaps and the second phase eliminating cumulative errors and carrying out global optimization. Figure 10c shows a map built with the Karto algorithm, which is based on graph optimization theory, represents the map with a pose graph, and runs the front-end and back-end in a single thread. The map constructed with the Hector algorithm is shown in Figure 10d; Hector creates a grid map by matching laser data alone, without the need for odometer data.
The Gmapping algorithm, based on the filtering framework, has a lower computational complexity, but it relies too heavily on odometer data. Although the Hector algorithm does not depend on the odometer, it requires a high laser frequency. In addition, the Cartographer algorithm is relatively resource intensive. Consequently, based on the 2D occupancy grid map constructed using the Karto algorithm, the cost map for the navigation of the roadheader is constructed by introducing parameters such as the expansion radius of obstacles and the update frequency of the map. The simulation test results are shown in Figure 11, where the red arrow represents the body pose.

Principle of Vision Mapping
According to Table 1, the map constructed using the ORB-SLAM1/2 algorithm is a sparse point cloud map, which cannot be used for navigation and obstacle avoidance of the roadheader. In addition, the algorithm can easily lose the target when the airborne camera rotates, and it is sensitive to noise. Considering the complex environment of underground roadways and the requirements of tunneling work, the running performance, real-time capability, and accuracy of the algorithm should be given more priority. The RTAB-MAP framework provides a map construction scheme that is independent of scale and time. As shown in Figure 12 [19], its internal memory management is implemented through three memory nodes, i.e., WM (working memory), LTM (long-term memory), and STM (short-term memory). The weight of a registration point is updated in STM, and the closed-loop detection of the registration point is completed in WM. When the number of registration points exceeds a certain threshold, the registration point with the longest storage time is moved from STM into WM.
To ensure real-time Bayesian estimation of the number of registration points, two functional nodes, namely fetch and transfer, are defined between WM and LTM during closed-loop detection in WM. When a closed loop is detected in the WM, the adjacent location points in the closed loop are retrieved from the LTM and placed in the WM. When the processing time for closed-loop detection exceeds a certain threshold, the anchor point with the lowest weight is transferred to the LTM [19]. Based on the above memory management mechanism, the RTAB-MAP framework avoids the problem of the map data growing too large, which can meet the engineering needs of long-term, real-time, online mapping in a large-scale environment. The RTAB-MAP algorithm can build a dense point cloud map.
Considering the disadvantage of the large amount of data in a dense point cloud, a portable octree map can be formed after octree compression [44,45], which can be used for stereoscopic navigation of the roadheader. The data structure of the octree is shown in Figure 13; each parent node contains eight child nodes, and the leaf-layer nodes correspond to the map with the highest resolution. In addition, each node points to a cubic voxel, and the probability that the voxel corresponding to the n-th node is occupied is calculated according to Bayesian theory, as shown in Equation (26) [45].
where z_c represents the observation data of the airborne camera at the time indicated by the corresponding subscript.
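The recursive Bayesian voxel update of Equation (26) can be sketched directly; the measurement probability 0.7 in the usage below is an illustrative value, not one from this paper:

```python
def update_node(p_prev, p_meas, p_prior=0.5):
    """Recursive Bayesian occupancy update for one octree voxel, in the
    form commonly given for octree maps (cf. Equation (26)):

    P(n|z_1:t) = [1 + (1-P(n|z_t))/P(n|z_t)
                    * (1-P(n|z_1:t-1))/P(n|z_1:t-1)
                    * P(n)/(1-P(n))]^(-1)

    p_prev  -- P(n | z_1:t-1), the voxel's probability so far
    p_meas  -- P(n | z_t), probability implied by the new observation
    p_prior -- P(n), the prior (0.5 = unknown)
    """
    odds = ((1.0 - p_meas) / p_meas) * \
           ((1.0 - p_prev) / p_prev) * \
           (p_prior / (1.0 - p_prior))
    return 1.0 / (1.0 + odds)

# Starting from an unknown voxel, two consistent "occupied" readings
# raise the probability monotonically.
p = update_node(0.5, 0.7)
p = update_node(p, 0.7)
```

With a uniform prior of 0.5 the prior term cancels, and the update reduces to the same log-odds sum used for the 2D grid cells above.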

Mapping Test Based on Vision SLAM
Based on the RTAB-MAP framework, the image data collected by the RGB-D camera are input, the dense 3D point cloud map is constructed through the mapping test of the simulated roadway (shown in Figure 14), and the octree map obtained via octree compression is shown in Figure 15. The octree map is convenient for navigation of the roadheader because of its portable characteristics. Based on the octree map shown in Figure 15, the multi-layer cost map for navigation is constructed by introducing parameters such as the expansion radius of obstacles and the update frequency of the map, as shown in Figure 16. The red arrow in the figure represents the body pose; the darkly colored expansion area around the body is the local cost map, which is used for local path planning, and the rest of the area is the global cost map, which is used for global path planning. In this test, the Dijkstra algorithm is used for global path planning, and the red lines in the figure are the trajectory of the global path planning. The TEB (timed-elastic-band) algorithm is used for local path planning, and the blue line is the trajectory of the local path planning.
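Global planning on the cost map can be illustrated with a minimal Dijkstra implementation over a 2D grid; the grid values and obstacle encoding below are illustrative, not the planner configuration actually used in the test:

```python
import heapq

def dijkstra(grid, start, goal):
    """Minimum-cost path on a 2D cost grid via Dijkstra's algorithm.

    grid        -- 2D list of non-negative traversal costs (None = obstacle)
    start, goal -- (row, col) tuples
    Returns the list of cells on a minimum-cost path, or None if
    the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            # walk the predecessor chain back to the start
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return path[::-1]
        if d > dist.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                nd = d + grid[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    return None
```

Feeding the inflated cost values (rather than uniform costs) into such a planner is what makes the global path keep a safe distance from walls and obstacles.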
As can be seen from Figure 16, when an obstacle is temporarily added during the navigation test, the dynamic obstacle expansion region can be marked by updating the cost map, and obstacle avoidance based on local path planning is successfully realized.

Mapping Test Based on Fusion of Laser and Vision
Although the mapping accuracy of laser SLAM is high and its computational complexity is low, the constructed map is a sparse point cloud that lacks semantic information and has large linearization errors. In contrast, the environmental information obtained by the camera is much richer than that obtained by LiDAR, and the semantic information is abundant. Moreover, the RGB-D camera has the advantage of not being affected by changes in ambient light and texture, but its mapping accuracy is affected by cumulative error. Based on the above considerations, the two mapping schemes are complementary. Specifically, in the dynamic environment of a coal mine roadway with strong interference, a mapping scheme based on laser and vision fusion has stronger robustness.
The RTAB-MAP framework provides an interface between the camera and LiDAR [19], as shown in Figure 17. First, the laser and visual data are input; then, the multi-source information is initialized and aligned; finally, a map of the environment is output.
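The alignment step can be illustrated by projecting a single LiDAR return into the camera image using an extrinsic calibration; the extrinsic matrix T_cl and intrinsic matrix K below are assumed illustrative values, not calibration results from this paper or the exact procedure RTAB-MAP applies internally:

```python
import math

def laser_to_pixel(d, theta, T_cl, K):
    """Project one planar LiDAR return (range d, bearing theta) into
    the camera image.

    T_cl -- 4x4 camera-from-laser extrinsic transform (assumed known)
    K    -- 3x3 pinhole camera intrinsic matrix (assumed known)
    Returns pixel coordinates (u, v).
    """
    # laser return in homogeneous laser-frame coordinates (planar scan)
    p_l = [d * math.cos(theta), d * math.sin(theta), 0.0, 1.0]
    # transform into the camera frame (first three rows of T_cl)
    p_c = [sum(T_cl[i][j] * p_l[j] for j in range(4)) for i in range(3)]
    # pinhole projection
    u = K[0][0] * p_c[0] / p_c[2] + K[0][2]
    v = K[1][1] * p_c[1] / p_c[2] + K[1][2]
    return u, v

# Illustrative extrinsic: laser x-axis maps to camera z (forward),
# laser y to camera -x, laser z to camera -y.
T_cl = [[0, -1, 0, 0], [0, 0, -1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
K = [[500, 0, 320], [0, 500, 240], [0, 0, 1]]
u, v = laser_to_pixel(2.0, 0.0, T_cl, K)
```

A point straight ahead of the laser lands at the principal point; once laser returns can be associated with pixels (and hence with depth image values), the two streams can constrain a common map.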

The mapping result of the simulated roadway, based on the fusion of laser and visual data, is shown in Figure 18. The map is in a composite form and includes a dense 3D point cloud and a 2D occupancy grid. Compared with Figure 14, it can be seen that the SLAM mapping scheme integrating laser and vision has more advantages in the detail processing of the point cloud, and the map is more refined. Additionally, the point cloud can be used for 3D reconstruction of the environment after gridding and texture mapping at a later stage, and the 2D occupancy grid map can be used for roadheader navigation after being converted into a cost map.
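The conversion of a point cloud into a 2D occupancy grid can be sketched by projecting points within a height band onto the grid plane; the resolution and height limits below are illustrative assumptions, not parameters from this paper:

```python
def cloud_to_grid(points, resolution, z_min, z_max, size):
    """Flatten a 3D point cloud into a 2D occupancy grid.

    Only points whose height falls in [z_min, z_max] (roughly the
    vertical extent the roadheader could collide with) mark a cell
    as occupied.

    points     -- iterable of (x, y, z) coordinates in metres
    resolution -- side length of one grid cell in metres
    size       -- (rows, cols) of the output grid
    Returns a grid of 0 (free/untouched) and 1 (occupied).
    """
    rows, cols = size
    grid = [[0] * cols for _ in range(rows)]
    for x, y, z in points:
        if z_min <= z <= z_max:
            r, c = int(y / resolution), int(x / resolution)
            if 0 <= r < rows and 0 <= c < cols:
                grid[r][c] = 1
    return grid
```

Points above the height band (e.g., the roadway roof) are deliberately dropped, so overhead structure does not block paths the machine could actually traverse.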


Conclusions
In this paper, according to the development requirements of the intellectualization and robotization of fully mechanized excavation faces, a method of estimating roadheader poses and constructing navigation maps using vision and laser SLAM technology is proposed. The main conclusions are as follows: (1) A positioning method based on visual SLAM technology is proposed; the environment data are collected by an airborne RGB-D camera, the RANSAC+ICP model is constructed, and the autonomous measurement of the pose is realized by establishing and solving a pose graph optimization model with closed-loop constraints. The globally consistent pose of the roadheader can be obtained, which effectively solves the difficult positioning and orientation problem of roadheaders in a restricted space.