HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement

Wang, Jingyao; Ruan, Xiaogang; Huang, Jing

doi:10.3390/app12094695

Open AccessArticle

HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement

by

Jingyao Wang

,

Xiaogang Ruan

and

Jing Huang

^*

Beijing Key Laboratory of Computational Intelligence and Intelligence System, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(9), 4695; https://doi.org/10.3390/app12094695

Submission received: 11 February 2022 / Revised: 7 April 2022 / Accepted: 27 April 2022 / Published: 6 May 2022

(This article belongs to the Topic Motion Planning and Control for Robotics)

Download

Browse Figures

Versions Notes

Abstract

:

Algorithms such as RRT (Rapidly exploring random tree), A* and their variants have been widely used in the field of robot path planning. A lot of work has shown that these detectors are unable to carry out effective and stable results for moving objects in high-dimensional space, which generate a large number of multi-dimensional corner points. Although some filtering mechanisms (such as splines and valuation functions) reduce the calculation scale, the chance of collision is increased, which is fatal to robots. In order to generate fewer but more effective and stable feature points, we propose a novel multi-scale positioning method to plan the motion of the high-dimensional target. First, a multi-scale feature extraction and refinement scheme for waypoint navigation and positioning is proposed to find the corner points that are more important to the planning, and gradually eliminate the unnecessary redundant points. Then, in order to obtain a stable planning effect, we balance the gradient of corner point classification detection to avoid over-optimizing some of them during the training phase. In addition, considering the maintenance cost of the robot in actual operation, we pay attention to the mechanism of anti-collision in the model design. Our approach can achieve a complete obstacle avoidance rate for high-dimensional space simulation and physical manipulators, and also work well in low-dimensional space for path planning. The experimental results demonstrate the superiority of our approach through a comparison with state-of-the-art models.

Keywords:

path planning; high-dimensional; machine learning; multi-scale positioning; waypoint refinement; ROS

1. Introduction

Path planning is a basic task of computer vision and robot kinematics. It is necessary to predict the target position and gait at the same time. With the great development of deep learning, path planning is successfully integrated into more and more real-world application systems, such as autonomous driving, human-machine collaboration, and remote control [1,2]. Among them, motion planning in high-dimensional space becomes a hot topic, which is a key for realizing the slogan of “machine serving society” (for example, palletizing in delivery warehouses, underwater exploration, aerial cruise) [3,4]. However, today’s high-dimensional spatial planning can only accomplish simple motions in specific environments with few obstacles, and this planning process is lengthy. Hence, it is necessary to explore an efficient path planning approach, especially for high-dimensional space, to fill the gap, which is a prominent and interdisciplinary research area.

However, it is a challenging task to deal with high-dimensional data for path planning. Firstly, most current works are rarely studied in high-dimensional environments, and the vulnerability of general path detectors has been revealed by planning in more complex environments [5,6]. For example, although the Walk To algorithm [7] can be directly used in any dimension, it is completely unable to deal with obstacles and other situations. Bugs [8] need to rotate clockwise (or counterclockwise) around the obstacle, and it is difficult to define the direction in high-dimensional space. The ant colony algorithm [9] will bring about slow convergence, easy to over-fitting problems. Other search-based or sampling-based methods, such as RRT and A*, can predict waypoints in high-dimensional space but cannot be executed in the real world due to the huge scale of time cost for even analyzing a single dimension. The existing, more mature robotic arm planning algorithms mostly rely on sensors [10]. They iteratively update the planned waypoints by continuously scanning obstacles through devices such as laser and radar, which has a high cost but low efficiency. Secondly, path planning methods based on research [11,12] can be used just in simple known scenes but tend to crash in unknown environments. The paths generated by these methods are so limited that such a priori detection is unpredictable for emergent factors, which is devastating for robots. Some training-based waypoint detection models (such as gray-level co-occurrence matrices [13]) are not efficient enough to generate the best waypoints and are likely to cause overfitting. Thirdly, although today’s high-dimensional path planning of robots has made some progress in targets such as drones [14], there is still a certain risk of collision. To date, not much research has been conducted on employing the balance between planning speed and obstacle avoidance accuracy.

Therefore, we conclude that a stable and efficient high-dimensional space path planning approach can be designed from the following aspects by analyzing the situations mentioned above: regional obstacle detection planning to determine the current dynamic sequence is a dynamic process; a screening mechanism for waypoints to ensure safety and efficiency; operations to reduce the number of iterations and computing time. In light of this, we launched an exploration of HDPP.

A novel multi-scale positioning and waypoint refinement approach—HDPP—is proposed for high-dimensional dynamic path planning. Specifically, it is obvious that for complex space planning, the less waypoints of the image that are detected while the path planning effect does not degenerate [15], the better. In order to separate viable areas in the sensory field from obstacles as quickly as possible, a multi-scale model is proposed to dynamically refine and learn adaptively assigned feature maps at different scales with multiple predictions. Secondly, we empirically observe that the importance of different areas in the receptive field varies greatly for high-dimensional navigation. Based on this, a patch selection and refining scheme is proposed to continuously generate redundant waypoints through the cumulative gradients of key regions based on multi-scale feature maps since we learn that the key-areas are extremely difficult to be distinguished at the beginning and would change during the procession. The insignificant waypoints will be adaptively deleted in each region which is divided according to the point distribution until the key point calibration becomes stable. Experiments show that our approach converges the results in less than 40% of the number of iterations of other baseline algorithms and reach the same or even a higher obstacle avoidance rate.

In addition, from the perspective of application scenarios, recent works can only plan for two-degree-of-freedom or three-degree-of-freedom robots, while our proposed approach can simultaneously perform multiple-degree-of-freedom robots. Specifically, we balance the gradient of multi-category waypoints (for example, road boundary, road center, close to obstacle) planning to avoid over-optimizing one of them during the training phase. Figure 1 illustrates an example that our HDPP greatly reduces the scale of waypoints based of efficient path planning. The algorithm can realize migration applications in complex environments, especially high-dimensional space. In summary, the contribution of this work is fourfold:

A novel multi-scale positioning model of high-dimensional path planning is proposed to adaptively extract and learn multi-scale feature maps with multiple predictions, while dynamically refine them through fast positioning. The multi-scale strategy is used to improve the characterization ability of unknown environmental factors. The results demonstrate the superiority of our approach through comparison with state-of-the-art models;
A patch selection and optimization scheme are proposed to dynamically find and divide category areas, and gradually eliminate insignificant waypoints for multi-degree-of-freedom planning;
We balance the gradients of multi-category feature points to process obstacles and road segment edges at the same time to avoid over-optimizing one of them during the training phase, eliminating the redundancy and collapse of previous research planning methods;
Extensive experiments conducted on both manually annotated and machine weakly labeled datasets demonstrate the effectiveness of the proposed approach. We have verified the effect of HDPP not only in high-dimensional space, but also in a planar simulation environment.

The remainder of the article is structured as follows. We present the related work in Section 2. The proposed approach—HDPP—is then introduced in detail in Section 3. Next, the experimental results are elaborated in Section 4. Finally, we make conclusions and describe future work in Section 5.

2. Related Work

2.1. Path Planning Methods

In recent years, great progress has been made in the field of general automatic driving and path planning. Current methods can already plan and process more than 90% of planar moving targets, which can be roughly divided into search-based (for example, A*, Dijkstra and Jump Point Search) [16], sampling-based (for example, RRT, RRT* and S-RRT) [17,18] and reinforcement Learning-based (such as Q-learning and Monte Carlo) [19,20] methods. The representative detectors of the three common path planning methods are detailed as follows:

Search-based. The research for search-based path planning has been studied broadly (for example, A*, Dijkstra, LPA* etc.) [16,21,22,23]. These algorithms are based on heuristic functions that calculate the prior loss at different locations on the map and thus search for the optimal path; but this is usually a static, flat search. Saleem et al. [21] propose a multi-heuristic framework to guide the search along the relevant classes for dynamic inspection. Elhoseny et al. [22] built an efficient, Bezier curve-based approach for the path planning in a dynamic field using a Modified Genetic Algorithm (MGA), which aims to boost the diversity of the generated solutions of the standard GA. Dang et al. [23] present a novel search-based strategy for autonomous graph-based exploration path planning in subterranean environments. These search-based methods break the limitations of being limited to static searches;
Sampling-based. Similarly to search-based methods, sampling-based path planning also has been widely used (for example, RRT, RRT*, S-RRT) [17,24,25,26,27,28]. It uses an incremental method to build a search tree to gradually improve the resolution ability without setting any resolution parameters. This sampling-based strategy has a good effect on dynamic programming. Summers [24] propose a distributional, robust incremental sampling-based method—DR-RRT—for dynamic motion planning under uncertainty. Lai et al. [25] propose a novel framework—PDMP—that uses bijective and differentiable mappings or diffeomorphisms to transform sampling distributions of sampling-based motion planners. Some researchers use multilevel abstractions to solve motion planning problems involving high-dimensional state spaces and develop novel multilevel motion planning algorithms, which are called QRRT* and QMP* [26,27]. Those algorithms are proved to be probabilistically complete and almost-surely asymptotically optimal. Meanwhile, a learning method based on Monte-Carlo Tree Search is proposed to learn assignment orders for the variable-subsets [28]. The strategy can efficiently compute a set of diverse valid robot configurations for mode-switches within sequential manipulation tasks, which are waypoints for subsequent trajectory optimization or sampling-based motion planning algorithms;
Reinforcement Learning-based. With the development of deep learning, many methods [29,30,31,32,33] have been proposed for robot’s path planning with reinforcement learning strategy (for example, Q-learning, Monte Carlo, DQN, C51, HER etc.). Chen et al. [30] propose a path planning and manipulating approach based on Q-learning, which can drive a cargo ship by itself without requiring any input from human experiences. STAPP [31]—a multi-agent reinforcement learning algorithm is proposed for multi-UAV target assignment and path planning (MUTAPP) based on a multi-agent deep deterministic policy gradient. Yan et al. [32] present a Deep Reinforcement Learning (DRL) approach for UAV path planning based on the global situation information. RLGWO [33]—a novel reinforcement learning based grey wolf optimizer algorithm is also proposed for unmanned aerial vehicles (UAVs) path planning.

However, not much research has been conducted on multi-degree-of-freedom targets path planning moving in higher dimensions. The existing algorithms have many limitations. For search-based methods, although some works have been able to achieve dynamic programming, it is difficult to achieve fast programming in high dimensional space due to the huge search scale. For sampling-based works, these dynamic programming methods are usually two-stage, inefficient and difficult to determine whether they are optimal. For reinforcement Learning, thanks to deep learning strategies, these approaches can adapt to complex environments and are often used with robots such as UAVs, but there are many problems: low sampling efficiency, hard design for reward function, local optimization, poor stability etc. It is necessary to explore a stable and efficient high-dimensional path planning approach from three aspects: the efficient path point calculation, optimal path evaluation and comprehensive feature detection.

2.2. Multi-Degree-of-Freedom Target Motion Planning

Multi-degree-of-freedom (multi-DOF) robots have been widely used in social production and life (for example, 6-DOF robot arm, human service robot, industrial production line etc.). An example of path planning in a complex environment is the palletizing activity of a six-degree-of-freedom manipulator. Although the scope of activity is small, it is prone to excessive calculations due to multiple dimensions in a single environment, causing the system to crash or stagnate. Meanwhile, under relatively complex background conditions, wrong planning occurs due to obstacle detection deviations. The result of this situation is devastating to the equipment. L Previous work on motion planning mainly focused on posture adjustment and Slam Mapping [34]. Firstly, the environment needs to be scanned, and a priori processing is performed in advance. Therefore, it is not suitable for motion planning that requires the position and gait of object instances in an unknown environment task.

Motion and path planning of multi-degree-of-freedom targets has attracted increasing interest in recent years. DeepMapping [35] proposed a novel registration framework that uses DNN (Deep Neural Networks) [36] to simulate the highly non-convex mapping process, which can usually achieve more robust and accurate multi-point cloud global registration, thereby planning high-dimensional spatial moving objects. However, the data generated by it has a large amount of calculation and is inefficient for real-time tasks and requires high sensors. Real-time path planning [37] integrates the PRM (Probabilistic Roadmap) used for changing the environment with 3D sensor data, uses high-speed sensors to scan real-time obstacles, and then performs high-dimensional spatial motion planning. This method is safe and efficient but depend on hardware.

In view of the previous local planning, HDPP designed a hierarchical discrete distribution decomposition framework to learn probability points and regional matching, which can simulate not only matching uncertainty, but also regional propagation. Compared with these algorithms, our approach can generate fewer key waypoints, and at the same time has a better migration planning effect in high-dimensional space planning tasks.

3. Materials and Methods

In this section, we will introduce our model in detail. First, we provide the problem statement in Section 3.1. After that, the HDPP process is introduced in Section 3.2. The framework of the proposed HDPP is shown in Figure 2. Then, we will introduce a novel patch selection and optimization scheme in Section 3.3. for dynamically finding and dividing category areas. Finally, in Section 3.4., we discuss how to balance the gradients from different types of waypoint detection to stabilize the overall planning results and use multi-scale features to predict and plan the target’s motion.

3.1. Problem Formulation

In this work, we explored the HDPP for multi-degree-of-freedom robot motion planning under high-dimensional conditions. We try to adjust the complexity of the environment by setting background patterns and obstacles, exploring the movement of robots with different degrees of freedom, and comparing them horizontally with the search-based path planning algorithm and many other algorithms. Search algorithms such as A* or Cuckoo Search use raster maps [38] to calculate the cost value of the shortest path, and the performance of the algorithm can be weighted and analyzed using the running time of the recursive function and the memory occupied. Our goal is to perform efficient path planning for moving objects in this complex environment with the fewest waypoints, so that the algorithm can obtain SOTA (State of the Art) in high-dimensional motion planning tasks.

3.2. Multi-Scale Positioning for Waypoint Navigation

This section introduces the detailed path planning process of our HDPP. In order to accurately distinguish the different types of feature points in the forward road, thereby distinguishing road sections and roadblocks to further correct the gait, we propose to perform multi-scale positioning on the input image through iterative refinement.

The impact caused by the misidentification of obstacles will cause unpredictable damage to the equipment, so it is necessary to accurately identify the obstacles and pathways in the receptive field. We perform multi-scale fusion and regional supplementation on the receptive field image. The framework of this approach is shown in Figure 2. First, we fill the image collected by the robot for contour filling to prevent the loss of boundary information.

X^{″} = d r a w C o n t o u r s (X, k^{(..)})

(1)

k

is the image contour filling operator. There are two contour matrices in this algorithm, focusing on patch and image, respectively.

X

is the input image, while

X^{″}

is the filled result.

Next, we input the filled result

X^{″}

into the 2-channel feature extraction pyramid network and, respectively, obtain 5 different-sized corner feature maps {C1, C2, C3, C4, C5} for obstacles and path information. The features contain various categories of waypoints, such as road boundary, road center, close to obstacle etc. Then, we segment the two-channel multi-scale feature maps separately to obtain different a priori boxes and position predictions for each contour. These feature maps contain both strongly represented semantic information and geometric details. We will accomplish multi-scale feature fusion in a serial multi-branch structured feature pyramid, mainly through layer-hopping connections. We return the feature rescaling obtained at low scale (for example, {C1, C2, C3}) to the high scale feature layer (for example, C4) by weighted averaging to improve the feature resolution. In addition, Gaussian non-local attention is used to increase the features. Residual connectivity will also be used in the information integration calculation for multi-scale fusion. This process is described in detail in the pseudo-code of Algorithm 1. In order to improve the model representation ability, we obtained multiple results through multiple rounds of cross fusion to prevent performance degradation caused by interference.

After obtaining high-level and detailed information, instance-level segmentation and position are performed on the image, which means that we can obtain pixel positions for obstacles of different sizes. The gradient of one iteration cannot accurately reveal the actual position and contour size of the obstacle, so we repeat the positioning until all target predicted positions in the receptive field are not changing to obtain a stable and reliable gradient heat map to obtain the initial positioning.

F_{*} (p) = w_{*}^{T} ϕ_{S} (p)

(2)

F_{*} (p)

is the predicted position coordinate,

\emptyset_{S} (P)

is the feature vector of the position, where

p

represents the region proposal,

w_{*}

is the matrix consist of HDPP’s parameters for positioning. And the loss function and optimization objective are as follows:

L o s s = \sum_{i}^{n} {(t_{*}^{i} - {\hat{w}}_{*}^{T} ϕ_{S} (P^{i}))}^{2}

(3)

ω_{*} = \arg \min_{w_{*}} \sum_{i}^{n} {(t_{*}^{i} - {\hat{w}}_{*}^{T} ϕ_{S} (P^{i}))}^{2} + α {‖ {\hat{w}}_{*} ‖}^{2}

(4)

Here our

L o s s

uses L2 loss, where

t_{*}^{i}

represents the true value and

{\hat{w}}_{*}^{T} \emptyset_{S} (P^{i})

is the predicted result.

w_{*}

is our optimization goal, which can be solved by gradient descent or least square method. Moreover,

n

is the total number of iterations in the actual runtime (the upper limit of iterations

M a x_I

is designed to be 3000 in this paper) We also introduce a regularization term

α ‖ {\hat{w}}_{*} ‖^{2}

in order to prevent overfitting, which is the parameters of FCN.

Specifically, we use the initial positioning frame obtained from the cumulative gradient heat map to map the receptive field to the actual space. In the mapping process, we adaptively find the most suitable position transformation matrix to modify in each Nk iterations, and subtract the irrelevant when the center of the position suggestion frame is stable. For the results obtained from multi-scale feature fusion mentioned above, we use the following scale-invariant mean square error

L (y, y^{*})

for fusion, where

y

is the further refinement for

F_{*} (p)

.

L (y, y^{*}) = \frac{1}{2 n} \sum_{i = 1}^{n} (\log y_{i} - \log y_{i}^{*} + β (y, y^{*}))^{2}

(5)

where

y

and

y^{*}

respectively represent the true value and our predicted value. And the parameter

β

can be calculated as follow:

β (y, y^{*}) = \frac{1}{n} \sum_{i} (\log y_{i}^{*} - \log y_{i})

(6)

Finally, in order to generate fewer waypoints, thereby reducing the time required to calculate and update the gait, we perform a second screening to further remove the interference corner points that have no effect on the result. After the above process, we can obtain a marked point image with less disturbance and better performance. Throughout the training process, we use the following function to obtain the result:

L o s s (y, y^{*}) = \frac{1}{n} \sum_{i} d i f f_{i}^{2} - \frac{σ}{n^{2}} {(\sum_{i} d i f f_{i})}^{2}

(7)

where

d i f f_{i}^{2}

represents the difference between the final true value and the predicted value. The

σ

parameter in disappear loss is set to 0.5 in this algorithm.

The initial key pixel is determined by the cumulative gradient. Specifically, we divide each

H \times W

instance box predicted from the original image into a grid of

H / m \times W / m

cells (the size of each grid is

m \times m

). Then, we sum the absolute gradient values in each cell of all the instance boxes and select the top K (we set K = 5 in the experiment) cell. Since in our experiment, the grid-shaped corner area patch can successfully superimpose the confidence level of the feature point category of the receptive field, so as to efficiently perform threshold classification. Therefore, we transfer K cells containing path information to grid-shaped cells and treat the pixels on the grid lines as waypoint information. Then, we transfer K cells containing roadblock information to the annotated image and treat the pixels on the grid line as roadblock points with a large cost.

Algorithm1. Multi-scale positioning for waypoint navigation

Input:

F

(image of robot receptive field),

c

(movement cost),

\sum_{j = 0}^{n} G_{j}

(

G

for gait sequence,

j

for gait sequence subscript,

n

for overall number of action groups),

s

(start point),

g_{s}

(the planning result for start point)

g_{i}

(the planning result for

i

th iteration),

M a x_I

(maximum iteration, which is 3000 in this paper),

X_{i}

(Robot gait planning result after

i

th iteration),

N e t 1

,

N e t 2

(the modular combination network including instance segmentation and position, and the network for patch-based selection and refining mentioned in Section 3.3),

ε

(the lower limit of the difference between the target prediction results)
Output: Robot gait planning result

X

after optimization
/* INITIALIZATION */
1: Initialize

X_{0} \Leftarrow g_{s}

,

j \Leftarrow r a n d o m n u m

,

X \Leftarrow \emptyset

,

G \Leftarrow \emptyset

,

P a r e n t (g_{s}) \Leftarrow \emptyset

,

i \Leftarrow 0

/* MULTISCALE FEATURE EXTRATION */
2: while

g_{i} \in \sum_{j = 0}^{n} G_{j}

and

i \in [1, I]

do:
3:

F^{'} \Leftarrow

Get and fill the

F

4:

F_{1}, F_{2}, F_{3}, F_{4}, F_{5} \Leftarrow

two stream feature extraction pyramid

F^{'} \cap (X_{i} \Leftarrow (X_{i - 1} \cup g_{i})

5:

X_{i} \Leftarrow N e t (F_{1}, F_{2}, F_{3}, F_{4}, F_{5})

6: while

X_{i} - X_{i - 1} < ε

do:
7:

X_{i} \Leftarrow N e t (F_{1}, F_{2}, F_{3}, F_{4}, F_{5})

8:      end while
/* OPTIMIZATION                 */
9:      Optimize

g_{i}

∈

\sum_{j = 0}^{n} G_{j}

based on Equation (8)–(10)
10: Update and Extract

g_{i}

/* INTEGRATION */
11: for each

g_{i} \in \sum_{j = 0}^{n} G_{j}

do:
12: Update

X_{i} \Leftarrow X_{i - 1} \cup g_{i}

,

P a r e n t (g_{s}) \Leftarrow g_{i}

13: end for
14: end while
15: end
16:

X \Leftarrow

Integration (Parent,

X_{i}

).

Our HDPP is based on the iterative fast gradient multi-scale positioning method, with a small step size αto locate the multi-category feature points. In addition, in order to stabilize the integrated planning process, we introduce additional parameters to balance the gradients of the different types of detector modules in the network. The details are introduced in Section 3.3.

3.3. Patch-Based Selection and Refining of Waypoints

In order to eliminate the distortion of the visual sensor and improve the planning efficiency, we propose a novel patch selection and refinement scheme, as shown in Figure 3. The calculation process and code details are shown in Algorithm 2.

Although we can obtain the initial positioning frame of the multi-category feature points according to the cumulative gradient heat map, the key pixels will change when we update the robot state. Therefore, we propose an adaptive method to solve these problems, that is, we add correction patches to the image according to the current gradient in each Nk iteration. In this way, we can find the influence of the current detected waypoints on the planning effect and realize the directional deletion of the waypoints by arranging obstacles.

Not all the waypoints we detect play a key role in the robot’s movement, but the resulting gait planning time is very large. However, deleting some of these edge points during the planning process may result in a decrease in overall performance due to changes in the shortest path and optimal gait. Removal of insignificant waypoints and maintaining the planning scheme can fully improve the efficiency of robot operation. In our work, when the planning strategy becomes stable (the number of bounding boxes reaches the minimum value at least Dk times), we perform random elimination, removing the points whose pixel change value is less than the average change value divided by 3. We repeat this process until the maximum number of iterations, I, is reached.

Algorithm 2. Patch-based Selection and Refining

Input:

P

(perturbation),

N P

(the number of patches),

M a x_P

(the max number of patches),

M i n_P

(the min number of patches),

A_{k}

(the frequency of adding perturbation),

D_{k}

(the threshold of decreasing perturbation),

I

(maximum iteration),

S A

(the flag for increased classification area),

S D

(the flag for decreased classification area),

x

(input image),

O p t i m i z e

(optimize method which returns new waypoints and the number of bounding boxes)
Output: Classification result

X

after optimization
1:

S A \Leftarrow f a l s e

,

S D \Leftarrow f a l s e

,

a_{k} \Leftarrow 0

,

d_{k} \Leftarrow 0

2:

m i n_b b_n u m \Leftarrow I N F

3:

f o r e a c h i \in [1, I] d o

4:

P, N \Leftarrow O p t i m i z e (x)

5:

i f N > 0 t h e n

6:

a_{k} \Leftarrow a_{k + 1}

7:

i f a_{k} % A_{k} = 0 a n d N P < M a x_P t h e n

8:

N P \Leftarrow N P + 1

9:

e n d i f

10:

i f P N < M i n_P t h e n

11:

M i n_P \Leftarrow P N

12:

e n d i f

13:

i f P N = M i n_P t h e n

14:

d_{k} \Leftarrow d_{k + 1}

15:

i f d_{k} % D_{k} = 0 t h e n

16:

S D \Leftarrow t r u e

17:

e n d i f

18:

i f S D = t r u e t h e n

19:

X \Leftarrow d e c r e a s e c l a s s i f i c a t i o n a r e a (x)

20:

e n d i f

21:

e n d i f

22:

e n d f o r

23:

e n d

3.4. Position Correction and Eliminate Camera Distortion

In high-dimensional space planning, there is a deviation between the robot camera coordinate system and the actual coordinate system, as well as the influence of tangential distortion, which leads to planning errors. In order to adjust the gait more accurately, our HDPP performs coordinate system mapping to achieve the purpose of correction. In order to stabilize the calibration and cope with different robot vision sensor installation conditions, we use the following method to perform coordinate mapping:

x_{k + 1} = x_{k} + \sum_{i = 1}^{N} ω_{i} \cdot μ \cdot s i g n (\nabla_{x}, J_{i} (x_{k}, y))

(8)

J (α) = - \frac{1}{N} \sum_{i = 1}^{N} L (α_{i}, 0)

(9)

ω_{i} = \max (1, D_{i} (x) - D_{i} (x^{*}))

(10)

where

x_{k}

is Optimized image in kth iteration,

J (α)

is the loss function we use to optimize i-th image, where L is the Mean Square Error (MSE),

α_{i}

is the confidence score of i-th bounding box. And μ is the updated step, the weight

ω_{i}

is the parameter of balancing gradients.

4. Experiment

In this section, we first describe the experimental datasets. Next, due to space limitations, we have carried out detailed methods settings in the second section. In the third section, we introduce the implementation details to provide scientific and detailed evaluation indicators for the experiment. In the experimental link, we first conducted an extensive experiment to evaluate HDPP’s planning of moving objects in different high-dimensional spaces and horizontally compared it with other algorithms. At the same time, in order to explore the impact of HDPP on search optimality and efficiency trade-off, we also conducted a qualitative analysis, namely how HDPP effectively finds the optimal gait sequence optimization problem.

4.1. Datasets

We adopted the following path-planning datasets include self-constructed and public with obstacle annotations to collect planning demonstrations.

High-dimensional Space Map Dataset. This is a data set we created. In the scene, we manually control the robotic arm to calibrate the position of the steering gear and perform image acquisition at a fixed frame rate. This data set is based on palletizing application scenarios, with a total of ten environments. Each environment contains 6 different specifications of obstacles (3 × 2, there are 3 shapes of obstacles, and each shape has 2 sizes). A single environment group consists of 600 training images, 100 verification images, and 100 test images. Obstacles of the same type are placed in different layouts. At the same time, we use robots with different degrees of freedom (as shown in Table 1) to complete the entire experiment in a reasonable time. We follow the original settings to independently train and evaluate each environment group. In addition, in order to increase the amount of data and provide a basis for more robust ablation experiments, we have performed data enhancement operations. We cropped multiple random blocks with a size of 64 × 64 from each map and randomly generated them in the dataset scene. We use 30 of the 10 scene maps to generate 1600 training and 300 validation maps randomly. At the same time, we also constructed 200 data for testing through mapping, so as to ensure that there is no overfitting caused by the occurrence of shared maps. The entire data set contains more than 6000 sets of data for robust detection, 1900 sets of data for modular detection, and more than 2000 sets of collected data for safety collision detection. We hope to achieve the purpose of feasibility experiment and robustness detection through this original high-dimensional spatial data set.
Motion Planning Dataset [39]. This data set was selected by us as the data for qualitative experiments. The MP dataset is a collection of eight grid world environments with unique obstacle shapes created by Bhardwaj et al. Each environment group is composed of 800 training maps, 100 verification maps and 100 test maps, and obstacles of the same type are placed in different layouts. We resize the data set to the same size scene specification of the High-dimensional Space Map Dataset to provide data support for the feasibility of the experiment. The entire data set contains a total of 8000 sets of data. We hope to use this data set to provide a sufficient and objective test for the experimental conclusions.

In addition to the above two data sets, we also introduced a simulation environment for feasibility testing in the experiment. The collection environment of the experiment is the gazebo simulation environment under the Linux system, which is similar to the environment of the High-dimensional space map and is constructed to eliminate the environmental impact of the entity. We set up the digital manipulator and manually control it to move, so as to realize the parameter recording of the steering gear. This data set contains ten independent environments, and each combination of scenes and obstacles collects ten steering gear parameters and gait sequences, which provides reliable support for qualitative analysis and feasibility analysis. We should notice that these data are dynamic. In addition, we added randomly generated obstacle detection in the test session of the experiment as a way to verify the effectiveness of HDPP in dynamic planning.

4.2. Methods for Experiment

The proposed HDPP is compared with the baseline methods of multi-DOF path planning. The experiment-related algorithms and improvement details are shown below:

HDPP. As the backbone of this algorithm, we adopted U-Net [40] with the VGG-16 [41] backbone network, the last layer of which is activated by Relu (Rectified Linear Unit) constrain the guiding graph. For the multi-scale feature module, we use the feature pyramid to analyze the receptive field and use the Chebyshev distance as a heuristic function suitable for the eight-neighborhood grid world. In addition, in order to achieve high-precision and high-dimensional planning, we introduce a patch optimization and selection module. In this module, we use MSE (mean-square error) loss to filter the Top K (in this paper, K = 10) patches, and introduce the disappear parameter to control the number of patches to adjust the detection accuracy and calculations. The relationship between the quantity. We also introduce gradient balance in the network structure to prevent overfitting.

Baseline. In order to evaluate the superiority of HDPP in high-dimensional space motion planning, we adopted the typical path planner and the following data-driven planner as benchmark competitors. These competitors have the same guarantee of planning success in design as we do. We have extended the author’s online implementation.

Tangent Bug Algorithms [42]. This is a real-time path planning algorithm for detecting surrounding obstacles, also a variant of Bug1 and Bug2. Tangent Bug is different from the previous two algorithms. Only when the distance from the obstacle is 0, the characteristics of the obstacle are found to be different. The variable R is defined. As long as there is an obstacle within the distance R, it will be detected. Next, take the robot as a single point to detect whether there is an obstacle on the line between the current position and the target point, and respond to it. We adjusted the R parameters of Tangent Bug, introduced μ parameters, and calculated R according to the robot’s motion environment. Compared with the traditional method, this method has certain advantages in the real-time obstacle avoidance scene of the robotic arm. We hope to introduce this algorithm to compare the effect of HDPP at the obstacle avoidance level.
SAIL (Summation-based incremental learning). High search efficiency is achieved by learning heuristic functions from demonstrations. Distinct from HDPP, SAIL uses hand-designed features to introduce prior information into the algorithm, such as the nodes of each target and nearest obstacle. We adjusted SAIL to adapt to the detection and planning of high-dimensional space and evaluated the effect of SAIL on the data set. We hope that this method can be used to compare the impact of introducing a priori information on preliminary path planning.
Cuckoo Search Algorithm [43]. A parallel compact cuckoo search algorithm for three-dimensional path planning. The algorithm uses an improved cuckoo search based on compact parallel technology to find the target in a complex three-dimensional environment. Cuckoo search proposes a new parallel communication strategy that saves the memory of the robot. We train the encoder by treating the parallel search module as a black box function and optimizing it through the black box, while keeping other settings. We transformed the drone in the original application scenario of the algorithm into a multi-joint robot, which is consistent with the HDPP operating environment. And check the plan in a high-dimensional environment. This algorithm is often used in UAVs (unmanned aerial vehicle), underwater robots and other equipment, and has reference significance for high-dimensional space planning.

Traditional comparisons. In order to conduct a comprehensive comparative evaluation of HDPP, we selected the most classical methods and variants of the three traditional algorithms (search-based, sampling-based and reinforcement learning) to participate in the experiment:

Search-based [16,21,22,23]. We mainly choose Dijkstra’s Algorithm, A* and D* lite for experiment. These algorithms are representative of traditional path planning based on search strategy, which will find the optimal path by visiting and calculating map nodes and costs. However, the difference is that the search strategies of these three algorithms are constantly refined, and their performance has a progressive relationship.
Sampling-based [17,24,25,26,27,28]. RRT and its improved algorithm—RRT*—are regarded as representative of sample-based path planning. For this strategy, the optimal path point was found by random sampling and added into the planned path tree. RRT* is improved on the basis of RRT, mainly by reselecting the parent node and rewiring operations.
Reinforcement learning [29,30,31,32,33]. Classical Monte Carlo algorithm and DQN are transferred to the experiment to explore the effect of reinforcement learning ideas in high-dimensional space, especially for multi-degree-of-freedom objects. Since our data set is dynamic and variable, and time thresholds are designed to emphasize practical applications, underfitting or local optimality can be expected with this strategy of calculating returns by receiving continuous environmental feedback. Two of the most typical algorithms are introduced to test this judgment.

4.3. Implementation Details

We tested ABB’s six-axis robotic arm and four-legged trolley in the experiment. Among them, the experimental environment of ABB’s six-axis robotic arm is natural light, and the background floor has a single color. There are 10 rectangular paper boxes of different sizes and different colors as the target. During the operation of the robotic arm, there will be random interference of black squares. Our HDPP performs path planning for the robot to run from any position to the target carton to grab this segment of the travel process. For the simulation of the quadruped car, the obstacle avoidance is carried out based on the complex environment of the laboratory.

For the parameter setting of our method, we set the maximum number of iterations of the high-dimensional manipulator I = 5 K, the quadruped car’s I = 2 K, the frequency of patch refinement and gradient optimization Nk = 100, and the threshold Mk for reducing redundant waypoints = 10, the patch size is 50 × 50 (the resize of the receptive field is 1024 × 1024), and the upper limit of waypoints is N = 20.

In order to fully evaluate our algorithm, we have introduced the following indicators:

A C C_{g a i t - \cos t} = {\begin{matrix} α \sum_{r = 1}^{N} P_{n o w} / P_{o r i g i n a l} P o {int}_{n o w} < P o {int}_{o r i g i n a l} \\ 0 o t h e r w i s e \end{matrix}

(11)

C O S T_{planning} = {\begin{matrix} 0 S U C C E S S \\ σ F A I L U R E \end{matrix}

(12)

C O S T_{collision} = {\begin{matrix} τ H I T \\ 0 o t h e r w i s e \end{matrix}

(13)

where

A C C_{g a i t - c o s t}

is the number of points after fitting and the number of planned waypoints without HDPP, and the

C O S T_{p l a n n i n g}

and

C O S T_{c o l l i s i o n}

, respectively, represents the planning loss and collision loss. The σ and τ can be modified according to the application scenario, which is set to 10 and 1000 in our experiment.

However, while these indicators have an outstanding effect on optimization results, they are not standard. In order to intuitively analyze the experimental data, we transform and combine these three aspects into clearer data representation, and also add an index for time cost (the name of the indicator is qualitatively determined by the function of performance quantification, which use superscript “*” to distinguish):

O P T^{*} = \sum_{i = 1}^{n} A C C_{g a i t - cost} / \sum_{i = 1}^{n} α

(14)

A C C^{*} = (\sum_{i = 1}^{n} σ - C O S T_{p l a n n i n g} + \sum_{i = 1}^{n} τ - C O S T_{c o l l i s i o n}) / (\sum_{i = 1}^{n} σ + \sum_{i = 1}^{n} τ)

(15)

T S^{*} (T i m e - S c o r e) = \sum_{i = 1}^{n} t_{i} (H D P P) / T

(16)

where

n

means the number of rounds tested and

i

is the current iteration number,

T

means the average n rounds planning time of the baseline algorithms;

O P T^{*}

(waypoint optimization) represents the efficiency of remove the redundancy based on waypoint importance, also the efficiency of gait updating;

A C C^{*}

quantify the condition for collision and whether path planning is successful, while

T S^{*}

is used to evaluate the real-time performance of HDPP.

Obviously, to achieve a high level of planning, we need to minimize the number of waypoints and reduce the number of re-planning gait under the premise of ensuring that the path is successfully planned without collision. Due to the different operating environments of the robots, we use

O P T^{*}

,

A C C^{*}

, and

T S^{*}

in the results to represent the average score of each image.

4.4. Results of HDPP

Fast and efficient waypoint planning. We conducted experiments on multi-degree-of-freedom robots (for example, 2-DOF Mobile robot, 3-DOF and 6-DOF Robotic arm) to verify HDPP’s effectiveness in eliminating redundant computational path points to improve overall performance. The results of HDPP are shown in Table 1. Our HDPP can completely achieve anti-collision path planning, but only use other algorithms with less than 60% waypoints. If the waypoint coefficient is increased from 1 to 100, the number of waypoints can be reduced to 40% or less of other algorithms, and the anti-collision plan can still reach more than 90%. Compared with the traditional method based on search, it reduces the amount of data calculation by 68.73%, and achieves the same better and faster exercise effect. These results show that our approach can improve the efficiency of robot motion.

Comparisons with Baselines. In order to conduct a comprehensive and reliable evaluation of HDPP, we conducted a comparative experiment. Table 2 shows the improvement of our algorithm in redundant waypoint screening and security, and also shows that our strategy can dynamically find and divide very well compared with search or sampling-based methods. Category area for waypoint optimization.

We noticed that some recent work based on reinforcement learning are also dedicated to reducing the waypoints in robot path planning. They used clever strategies and achieved good results in robot navigation. These works have the same focus and task as ours but put the improvement of motion planning speed in the first place, instead of paying too much attention to the safety issues in robot motion. In order to make a fair comparison with these works, we follow their experimental settings and generate no more than 70% of the waypoints based on the search algorithm for our HDPP. Despite the addition of such constraints, the experimental results show that the proposed HDPP works well in anti-collision, and the scale of waypoints and the number of gait updates are not too different. At the same time, we also test the hypothesis that reinforcement learning is less efficient in calculating rewards under time constraints in high-dimensional path planning. Table 3 shows the performance of HDPP algorithm on three data sets.

As shown in Table 2 and Table 3, on the basis of ensuring the robustness of planning, our HDPP has greatly improved in the two links of waypoint optimization and safety. It can obtain stable waypoint planning results in a short time and minimize the probability of collision with obstacles, which is crucial in the actual robot path planning.

Qualitative Results. Figure 4 shows the selected path planning results of the qualitative experiment on six-degree-of-freedom robotic arm. We visualized the path planning for different algorithms. Specifically, the higher guidance cost was allocated to the entire obstacle area that caused the dead end, and it was drawn on the map in green form, and the bypass that was not searched and traversed to the target was not planned to pass through. The map is displayed in gray. Figure 4 depicts how methods including HDPP adaptively perform searches for targets in the same map.

The Result of Optimization and Gradient Balance. We set up experiments to explore the effects of patch optimization and gradient balance modules on model training. In this section, we selected the initial high-dimensional spatial motion scene for testing. Since the module needs to be rewritten, we adjusted the original experimental parameter I = 2 K to 3 K. In this way, it can evaluate the effects of patch optimization and gradient balance more comprehensively. From the experimental curve, we found that the addition of modules can significantly improve the accuracy of the model and speed up the training process and iteration speed. The result is shown in Figure 5.

Ablation Study. We demonstrate the effectiveness of our method on three-dimensional six-degree-of-freedom and two-dimensional two-degree-of-freedom robots (only a single background and no more than 10 connected domains). The results are reported in Table 4.

For these two cases, we used two baselines: on the basis of ensuring the collision avoidance rate, we cross-adjusted the number of waypoints. In the case where the boundary between the obstacle feature point and the path feature point is obvious, whether to choose the initial optimization position of the patch, the result is not much different. This also proves that the key category boundaries cannot be accurately found at the beginning of the planning process. On the contrary, our patch selection and refinement scheme improves performance and gradually eliminates insignificant waypoints, thereby reducing AS, BS, and CS indicators, and greatly improving planning efficiency. In addition, using them at the same time can reduce the number of gait updates by more than 40% on basis of ensuring safe exercise.

For the case where the number of obstacle areas is restricted, the waypoint classification optimization patch determined by the gradient heat map obtains better performance, especially when planning objects with six degrees of freedom and above. In addition, the proposed strategy and the stable gradient both improve BS, and the former also eliminates many inefficient waypoints. These results further prove the effectiveness of the proposed method.

5. Conclusions

In this article, we investigate the problem of exploiting the multi-scale and discriminative information for multi-degree-of-freedom dynamic motion planning in high dimensional space, and our novel approach—HDPP—is proposed. A multi-scale positioning framework with patch-based waypoint refinement is employed to learn high-level features, which contain strongly represented semantic and geometric information, and dynamically fuse them. This mechanism can adaptively locate important waypoints, and then refine the planning results gradually through iteration and eliminate redundancy. In order to narrow the scope of the model and make quick calculations, a patch-based selection and refining scheme is proposed to dynamically capture key regions for path planning by computing gradient accumulation. For the selection of planning feature points, we balance the gradients from different types of waypoints to avoid over-optimizing one of them. The experiments conducted on both in reality and simulation environment demonstrate the effectiveness of the proposed method.

However, although our HDPP can achieve more than 95% efficient obstacle avoidance for multi-degree-of-freedom motion targets, there are still some limitations. For example, collisions still exist in high-dimensional path planning due to the anticipation of fast path planning, even though we have focused on collision costs in design. Due to the existence of multiple efficient paths in multi-DOF and multi-dimensional space, it is difficult to evaluate the optimal scheme. These problems will influence the performance of our model. In the future, we want to find an appropriate approach to balance the relationship between planning efficiency and equilibrium collision, which is devastating to robots, and design a set of evaluation rules to standardize the analysis of high-dimensional path planning. In addition, we hope that our work can arouse more people’s attention to the safety of high-dimensional spatial motion planning and provide a way to improve the real-time performance of planning.

Author Contributions

Conceptualization, J.W. and J.H.; methodology, J.W.; software, J.W.; validation, J.W.; formal analysis, J.W.; investigation, J.W.; resources, J.W. and J.H.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W. and J.H.; visualization, J.W.; supervision, J.H.; project administration, J.H. and X.R.; funding acquisition, J.H. and X.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Key Research and Development Program Project (No. 2020YFB1005900), National Natural Science Foundation of China (No. 61773027 and 62076014), and Industrial Internet Innovation and Development Project (No. 135060009002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Wang, J.; Liu, J.; Kato, N. Networking and communications in autonomous driving: A survey. IEEE Commun. Surv. Tutor. 2018, 21, 1243–1274. [Google Scholar] [CrossRef]
Jhaver, S.; Birman, I.; Gilbert, E.; Bruckman, A. Human-machine collaboration for content regulation: The case of reddit automoderator. ACM Trans. Comput. Human Interact. TOCHI 2019, 26, 1–35. [Google Scholar] [CrossRef]
Bao, L.G.; Dang, T.G.; Anh, N.D. Storage assignment policy and route planning of agvs in warehouse optimization. In Proceedings of the 2019 International Conference on System Science and Engineering (ICSSE), Dong Hoi City, Quang Binh Province, Vietnam, 20–21 July 2019; pp. 599–604. [Google Scholar]
Tsai, C.H.; Elibol, A.; Chong, N.Y. A UAV-UUV Transformative Housing for Minimal Logistics Underwater Exploration. In Proceedings of the 2021 18th International Conference on Ubiquitous Robots (UR), Jeju, Korea, 4–6 July 2021. [Google Scholar]
Agrawal, A.; Verschueren, R.; Diamond, S.; Boyd, S. A rewriting system for convex optimization problems. J. Control. Decis. 2018, 5, 42–60. [Google Scholar] [CrossRef]
Zhou, Q.Y.; Park, J.; Koltun, V. Open3D: A modern library for 3D data processing. arXiv 2018, arXiv:1801.09847. [Google Scholar]
Lei, Z.; Han, S.; Bouferguène, A.; Taghaddos, H.; Hermann, U.; Al-Hussein, M. Algorithm for mobile crane walking path planning in congested industrial plants. J. Constr. Eng. Manag. 2015, 141, 05014016. [Google Scholar] [CrossRef]
Sharma, N.; Thukral, S.; Aine, S.; Sujit, P.B. A virtual bug planning technique for 2D robot path planning. In Proceedings of the IEEE 2018 Annual American Control Conference (ACC), Milwaukee, WI, USA, 27–29 June 2018; pp. 5062–5069. [Google Scholar]
Liu, J.; Yang, J.; Liu, H.; Tian, X.; Gao, M. An improved ant colony algorithm for robot path planning. Soft Comput. 2017, 21, 5829–5839. [Google Scholar] [CrossRef]
Chou, J.S.; Cheng, M.Y.; Hsieh, Y.M.; Yang, I.T.; Hsu, H.T. Optimal path planning in real time for dynamic building fire rescue operations using wireless sensors and visual guidance. Autom. Constr. 2019, 99, 1–17. [Google Scholar] [CrossRef]
Ammirato, P.; Poirson, P.; Park, E.; Košecká, J.; Berg, A.C. A dataset for developing and benchmarking active vision. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 1378–1385. [Google Scholar]
Banino, A.; Barry, C.; Uria, B.; Blundell, C.; Lillicrap, T.; Mirowski, P.; Pritzel, A.; Chadwick, M.J.; Degris, T.; Modayil, J.; et al. Vector-based navigation using grid-like representations in artificial agents. Nature 2018, 557, 429–433. [Google Scholar] [CrossRef]
Ummenhofer, B.; Zhou, H.; Uhrig, J.; Mayer, N.; Ilg, E.; Dosovitskiy, A.; Brox, T. Demon: Depth and motion network for learning monocular stereo. In Proceedings of the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5038–5047. [Google Scholar]
Modares, J.; Ghanei, F.; Mastronarde, N.; Dantu, K. Ub-anc planner: Energy efficient coverage path planning with multiple drones. In Proceedings of the 2017 IEEE international conference on robotics and automation (ICRA), Singapore, 29 May–3 June 2017; pp. 6182–6189. [Google Scholar]
Fraccaro, M.; Rezende, D.; Zwols, Y.; Pritzel, A.; Eslami, S.A.; Viola, F. Generative temporal models with spatial memory for partially observed environments. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1549–1558. [Google Scholar]
Zhou, K.; Yu, L.; Long, Z.; Mo, S. Local path planning of driverless car navigation based on jump point search method under urban environment. Future Internet 2017, 9, 51. [Google Scholar] [CrossRef] [Green Version]
Chen, B.; Dai, B.; Lin, Q.; Ye, G.; Liu, H.; Song, L. Learning to plan in high dimensions via neural exploration-exploitation trees. arXiv, 2019; arXiv:1903.00070. [Google Scholar]
Wei, K.; Ren, B. A method on dynamic path planning for robotic manipulator autonomous obstacle avoidance based on an improved RRT algorithm. Sensors 2018, 18, 571. [Google Scholar] [CrossRef] [Green Version]
Clifton, J.; Laber, E. Q-learning: Theory and applications. Annu. Rev. Stat. Its Appl. 2020, 7, 279–301. [Google Scholar] [CrossRef] [Green Version]
Betancourt, M. A conceptual introduction to Hamiltonian Monte Carlo. arXiv 2017, arXiv:1701.02434. [Google Scholar]
Saleem, M.S.; Sood, R.; Onodera, S.; Arora, R.; Kanazawa, H.; Likhachev, M. Search-based Path Planning for a High Dimensional Manipulator in Cluttered Environments Using Optimization-based Primitives. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 8301–8308. [Google Scholar]
Elhoseny, M.; Tharwat, A.; Hassanien, A.E. Bezier curve based path planning in a dynamic field using modified genetic algorithm. J. Comput. Sci. 2018, 25, 339–350. [Google Scholar] [CrossRef]
Dang, T.; Mascarich, F.; Khattak, S.; Papachristos, C.; Alexis, K. Graph-based path planning for autonomous robotic exploration in subterranean environments. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 3105–3112. [Google Scholar]
Summers, T. Distributionally robust sampling-based motion planning under uncertainty. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 6518–6523. [Google Scholar]
Lai, T.; Zhi, W.; Hermans, T.; Ramos, F. Parallelised diffeomorphic sampling-based motion planning. In Proceedings of the Conference on Robot Learning, Online, 11 January 2022; pp. 81–90. [Google Scholar]
Orthey, A.; Toussaint, M. Rapidly-exploring quotient-space trees: Motion planning using sequential simplifications. arXiv 2019, arXiv:1906.01350. [Google Scholar]
Orthey, A.; Akbar, S.; Toussaint, M. Multilevel motion planning: A fiber bundle formulation. arXiv 2020, arXiv:2007.09435. [Google Scholar]
Ortiz-Haro, J.; Hartmann, V.N.; Oguz, O.S.; Toussaint, M. Learning efficient constraint graph sampling for robotic sequential manipulation. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 4606–4612. [Google Scholar]
Schmitt, P.S.; Wirnshofer, F.; Wurm, K.M.; Wichert, G.V.; Burgard, W. Planning reactive manipulation in dynamic environments. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 136–143. [Google Scholar]
Chen, C.; Chen, X.Q.; Ma, F.; Zeng, X.J.; Wang, J. A knowledge-free path planning approach for smart ships based on reinforcement learning. Ocean. Eng. 2019, 189, 106299. [Google Scholar] [CrossRef]
Qie, H.; Shi, D.; Shen, T.; Xu, X.; Li, Y.; Wang, L. Joint optimization of multi-UAV target assignment and path planning based on multi-agent reinforcement learning. IEEE Access 2019, 7, 146264–146272. [Google Scholar] [CrossRef]
Yan, C.; Xiang, X.; Wang, C. Towards real-time path planning through deep reinforcement learning for a UAV in dynamic environments. J. Intell. Robot. Syst. 2020, 98, 297–309. [Google Scholar] [CrossRef]
Qu, C.; Gai, W.; Zhong, M.; Zhang, J. A novel reinforcement learning based grey wolf optimizer algorithm for unmanned aerial vehicles (UAVs) path planning. Appl. Soft Comput. 2020, 89, 106099. [Google Scholar] [CrossRef]
Ichter, B.; Pavone, M. Robot motion planning in learned latent spaces. IEEE Robot. Autom. Lett. 2019, 4, 2407–2414. [Google Scholar] [CrossRef] [Green Version]
Ding, L.; Feng, C. DeepMapping: Unsupervised map estimation from multiple point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8650–8659. [Google Scholar]
Ichter, B.; Harrison, J.; Pavone, M. Learning sampling distributions for robot motion planning. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 7087–7094. [Google Scholar]
Kunz, T.; Reiser, U.; Stilman, M.; Verl, A. Real-time path planning for a robot arm in changing environments. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 5906–5911. [Google Scholar]
Paden, B.; Čáp, M.; Yong, S.Z.; Yershov, D.; Frazzoli, E. A survey of motion planning and control techniques for self-driving urban vehicles. IEEE Trans. Intell. Veh. 2016, 1, 33–55. [Google Scholar] [CrossRef] [Green Version]
Bhardwaj, M.; Choudhury, S.; Scherer, S. Learning heuristic search via imitation. In Proceedings of the Conference on Robot Learning, Online, October 2017; pp. 271–280. [Google Scholar]
Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar]
Qassim, H.; Verma, A.; Feinzimer, D. Compressed residual-VGG16 CNN model for big data places image recognition. In Proceedings of the 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2018; pp. 169–175. [Google Scholar]
Xu, Q.L.; Yu, T.; Bai, J. The mobile robot path planning with motion constraints based on Bug algorithm. In Proceedings of the 2017 Chinese Automation Congress (CAC), Jinan, China, 20–22 October 2017; pp. 2348–2352. [Google Scholar]
Song, P.C.; Pan, J.S.; Chu, S.C. A parallel compact cuckoo search algorithm for three-dimensional path planning. Appl. Soft Comput. 2020, 94, 106443. [Google Scholar] [CrossRef]

Figure 1. Illustration of an example. Boxes with red lines and blue block are main scan area. We show path planning result by HDPP (ours), sampling-based methods and search-based methods. Noticeably, the proposed HDPP has generated the least waypoints, also the most efficient in high-dimensional space. In addition, we show comparison before and after optimization in order to clearly see the effect of HDPP’s gradient optimization operation.

Figure 2. The framework of the proposed HDPP.

Figure 3. Illustration of the process to eliminate redundant points and obtain key waypoints.

Figure 4. Selected Path Planning Results for Qualitative Experiment. The figures are the planning results of different algorithms on six-degree-of-freedom robotic arm. The three rows are the original scanned map, the normalized map and the simplified map. The black pixels in the figure represent obstacles. The start node (represented by “S”) and the target node (represented by “G”) have been marked in the first column of the first row. The optimal path planned by the algorithm is represented by red nodes, and other explored nodes are displayed in green.

Figure 5. Learning Curve. The horizontal axis of the figure in the figure represents the number of iterations. The vertical axis represents the training accuracy. The blue line is the training curve without patch optimization and gradient balance, and the red line is the training result after the module is introduced. It can be clearly seen that the addition of modules makes the model learn faster and better, and the optimization node ushered in at an earlier time.

Table 1. HDPP waypoint effect in high-dimensional space.

Degree of Freedom	Original	Removing Redundancy
2-DOF Four-legged car	37	14
3-DOF Robotic arm	59	31
6-DOF Robotic arm	93	56
10-DOF Robotic arm	152	72

Table 2. Comparison on 6 DOF (degree-of-freedom) robotic arm. We conducted 50 tests on sports in 10 scenarios. We use the three metrics mentioned in Section 4.3. to quantify performance.

High-Dimensional Space Map Dataset
Methods	OPT*	ACC*	TS*
Dijkstra’s Algorithm	0	0.71	0.49
A*—A-star algorithm	0.13	0.93	0.58
D* Lite	0.59	0.77	0.38
RRT—Rapidly-exploring Random Trees	0.04	0.89	0.74
RRT*	0.16	0.87	0.81
Bug Algorithms	0.48	0.83	0.78
Monte Carlo	0.60	0.43	0.24
DQN	0.75	0.62	0.43
Tangent Bug Algorithms	0.56	0.89	0.81
SAIL	0.63	0.92	0.91
Cuckoo Search Algorithm.	0.48	0.93	0.94
HDPP	0.72	0.91	0.97
Motion planning Dataset
Methods	OPT*	ACC*	TS*
Dijkstra’s Algorithm	0	0.63	0.51
A*—A-star algorithm	0.14	0.91	0.45
D* Lite	0.58	0.74	0.49
RRT—Rapidly-exploring Random Trees	0.10	0.92	0.63
RRT*	0.12	0.94	0.57
Bug Algorithms	0.52	0.79	0.61
Monte Carlo	0.35	0.31	0.47
DQN	0.52	0.54	0.66
Tangent Bug Algorithms	0.60	0.79	0.77
SAIL	0.64	0.89	0.83
Cuckoo Search Algorithm.	0.46	0.90	0.94
HDPP	0.81	0.87	0.94
Simulation Dataset
Methods	OPT*	ACC*	TS*
Dijkstra’s Algorithm	0	0.60	0.55
A*—A-star algorithm	0.13	0.92	0.64
D* Lite	0.69	0.78	0.43
RRT—Rapidly-exploring Random Trees	0.11	0.89	0.79
RRT*	0.21	0.90	0.72
Bug Algorithms	0.48	0.84	0.78
Monte Carlo	0.65	0.60	0.57
DQN	0.87	0.84	0.68
Tangent Bug Algorithms	0.56	0.91	0.86
SAIL	0.62	0.91	0.93
Cuckoo Search Algorithm.	0.46	0.95	0.98
HDPP	0.83	0.93	0.99

Table 3. Performance of HDPP algorithm on three data sets. We have summarized the test results of HDPP on three data sets. Each data set contains three rows of data, the first row is the minimum value, the second row is the maximum value, and the third row is the average.

OPT*	Dataset 1	Dataset 2	Dataset 3
MIN	0.65	0.79	0.80
MAX	0.83	0.94	0.85
AVG	0.72	0.81	0.83
ACC*	Dataset 1	Dataset 2	Dataset 3
MIN	0.79	0.67	0.77
MAX	1	0.91	1
AVG	0.91	0.87	0.93
TS*	Dataset 1	Dataset 2	Dataset 3
MIN	0.92	0.93	0.93
MAX	1	1	1
AVG	0.97	0.94	0.99

Dataset 1: High-dimensional Space Map Dataset; Dataset 2: Motion planning Dataset; Dataset 3: Simulation Dataset.

Table 4. Performance of HDPP algorithm on three data sets. We have summarized the test results of HDPP on three data sets. Each data set contains three rows of data, the first row is the minimum value, the second row is the maximum value, and the third row is the average.

Stabilizing Gradient	√					√
Gradient-based Location		√				√
Patch Selection and Refining			√	√	√
Eliminate camera distortion				√
Position correction					√	√
3-6 Robot	0.83	0.87	0.69	0.71	0.73	0.94
2-2 Robot	0.91	0.89	0.83	0.91	0.95	0.94

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, J.; Ruan, X.; Huang, J. HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement. Appl. Sci. 2022, 12, 4695. https://doi.org/10.3390/app12094695

AMA Style

Wang J, Ruan X, Huang J. HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement. Applied Sciences. 2022; 12(9):4695. https://doi.org/10.3390/app12094695

Chicago/Turabian Style

Wang, Jingyao, Xiaogang Ruan, and Jing Huang. 2022. "HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement" Applied Sciences 12, no. 9: 4695. https://doi.org/10.3390/app12094695

APA Style

Wang, J., Ruan, X., & Huang, J. (2022). HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement. Applied Sciences, 12(9), 4695. https://doi.org/10.3390/app12094695

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

HDPP: High-Dimensional Dynamic Path Planning Based on Multi-Scale Positioning and Waypoint Refinement

Abstract

1. Introduction

2. Related Work

2.1. Path Planning Methods

2.2. Multi-Degree-of-Freedom Target Motion Planning

3. Materials and Methods

3.1. Problem Formulation

3.2. Multi-Scale Positioning for Waypoint Navigation

3.3. Patch-Based Selection and Refining of Waypoints

3.4. Position Correction and Eliminate Camera Distortion

4. Experiment

4.1. Datasets

4.2. Methods for Experiment

4.3. Implementation Details

4.4. Results of HDPP

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI