Obstacle Avoidance in a Three-Dimensional Dynamic Environment Based on Fuzzy Dynamic Windows

Abstract: This paper presents a real-time path planning approach for controlling the motion of space-based robots. The algorithm can plan three-dimensional trajectories for agents in a complex environment that includes numerous static and dynamic obstacles, path constraints, and/or performance constraints. The approach extends the dynamic window approach (DWA). As a classic reactive method for obstacle avoidance, DWA uses an optimization function to select the best motion command. The original DWA optimization function consists of three weighted terms, and changing the weights of these terms changes the behavior of the algorithm. In this paper, to improve the evaluation ability of the optimization function and the robot's ability to adapt to the environment, a new optimization function is designed and combined with fuzzy logic, which adjusts the weight of each term of the optimization function. Because DWA suffers from local minima, which make it hard for the robot to escape U-shaped obstacles, a dual dynamic window method and local goals are adopted to help the robot escape local minima. In comparison tests, the proposed method is superior to traditional DWA and fuzzy DWA (F_DWA) in terms of computational efficiency, smoothness, and safety.


Introduction
As space technology develops, autonomous space robots are expected to take on complex and dangerous tasks in space [1]. To complete a mission, a robot needs to plan a safe trajectory from a starting point to a target point. Collision avoidance and path optimization are the main issues in planning the trajectory (i.e., navigation). Currently, navigation algorithms fall into two categories: global planning algorithms and local navigation algorithms. Previous studies have also combined the two types of algorithms to achieve better navigation, but such a combination depends on the robot's autonomy and environmental factors [2].
Some notable global navigation algorithms avoid collisions by building a global environment map: e.g., A* [3], D* [4], FD* [5], and RRT [6]. When planning a path on a map, the optimization goal is usually the shortest path or the lowest energy use. However, because the speed and accuracy of graph-based search methods depend on the granularity of the search space, those approaches are not suitable for real-time application.
To improve real-time performance, some studies have proposed local navigation algorithms. Currently, the simplest local navigation algorithms are bug-based algorithms, e.g., bug1, bug2 [7], and tangent bug [8]. When the robot encounters an obstacle, the robot [...] optimizer) method, which transforms the complex cost function into a simple convex form, and applied it to the problem of optimizing real-time three-dimensional (3D) trajectories [33]. Nonetheless, when the constraints are complex, the numerical method still suffers from local minima in the optimization process.
To address the navigation problem in a three-dimensional environment, this paper proposes a fuzzy dynamic window approach (DF_DWA). For the traditional DWA, weights of 0.8, 0.1, and 0.1 for the azimuth, obstacle clearance, and speed terms, respectively, are reported to give good results in some situations [2]. Nevertheless, as the environment becomes more complex, this set of weights is not suitable in all situations. Using the same weight coefficients in different working environments can cause obstacle avoidance to fail or cause the machine to stop working [15]. Zhang, H. et al. suggested using fuzzy logic to update the weights of the DWA objective function by analyzing the distribution of eight obstacles [34]. Abubakr, O. A. et al. simplified the distribution of obstacles to three situations, thereby reducing the number of fuzzy logic rules and improving computational efficiency [35]. However, both methods update the weights of the objective function based only on the distribution of obstacles, without considering the distance to the obstacles. As a result, the robot can easily become trapped in U-shaped obstacles or local minima [36,37]. Moreover, the existing objective function does not consider the pointing problem when the agent approaches the target. In this article, a new objective function is established by adding a distance-to-the-goal term, which eliminates the influence of angle orientation. The weight parameters are adjusted by fuzzy logic so that the space robot can adapt to environmental changes. The dual dynamic window method and the local goal method are used to prevent the agent from becoming trapped in local minima and U-shaped obstacles.
The paper is organized as follows: Section 2 describes the collision-avoidance problem. Section 3 designs fuzzy rules and describes the dynamic-window approaches. The effectiveness of the method is analyzed by simulation in Section 4. Finally, conclusions are drawn from the results of the study in Section 5.

Formulation of the Collision-Avoidance Problem
In this paper, we focus on 3D real-time trajectory planning for space-based robots such as SPHERES [38] and astronaut assistant robots [39]. The agent works in a space station or within a specific working range in which the flying distance is much shorter than the radius of the orbit. The algorithm can also be extended to 3D trajectory planning for other types of agents. This section provides the mathematical model of the motion and the attitude of the space agent. In a 3D environment, the attitude of the agent is represented by the pitch angle θ, yaw angle ψ, and roll angle φ, as shown in Figure 1. In the reference coordinate system (X_f, Y_f, Z_f), the motion of the agent is described by Equation (1), in which ∆t is the sampling time. The conversion between the reference coordinate system oX_fY_fZ_f and the body coordinate system oX_bY_bZ_b follows the 3-2-1 rotation sequence, which determines the transformation matrix from the body coordinate system to the reference coordinate system; in that matrix, c and s denote cosine and sine, respectively. The relationship between the angular velocity and the Euler angles is expressed through ω_x, ω_y, and ω_z, the components of the angular velocity ω along the axes of the body coordinate system. The attitude of the agent is described by Equation (5), and the trajectory of the agent is generated by Equations (1) and (5). Due to sensor limitations, the range that the agent can perceive at a given time is limited; for example, the effective detection distance of a vehicle-mounted lidar is within 6-8 m.
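Since the equations themselves are not reproduced in this text, the following is a minimal sketch of the standard 3-2-1 (yaw-pitch-roll) body-to-reference rotation and a forward-Euler position update with sampling time ∆t. The function names and the use of a body-frame velocity vector are assumptions made for illustration; they are not taken from the paper.

```python
import math

def rotation_body_to_ref(roll, pitch, yaw):
    """Standard 3-2-1 rotation matrix (body frame -> reference frame), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def step_position(position, v_body, roll, pitch, yaw, dt):
    """Assumed update: P(t + dt) = P(t) + R * v_body * dt (forward Euler)."""
    rot = rotation_body_to_ref(roll, pitch, yaw)
    v_ref = [sum(rot[i][j] * v_body[j] for j in range(3)) for i in range(3)]
    return [p + v * dt for p, v in zip(position, v_ref)]
```

With zero attitude angles the body and reference axes coincide, so a body-frame velocity along x moves the agent along X_f; a yaw of 90 degrees turns the same velocity onto Y_f.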
Taking the geometric center of the agent as the origin, the detection range is the set of points P(t) that lie within the detection radius r_s of the agent position P = [x, y, z]. Within this range, the space occupied by each obstacle is determined by the position of the obstacle and its radius r_o. In this paper, obstacles and agents are assumed to be spherical, with radii R_obstacle and R_agent, respectively. Users can also refer to reference [40] to configure agents and obstacles with different boundary constraints. In addition, the obstacle information can be obtained by the sensors [41]. The admissible positions of the agent are those outside the obstacle space and outside S_b(t), an added safety zone that accounts for measurement and process uncertainty. Because of the limitations of the sensors, new obstacles appear in the search area and the environment within the search range keeps changing, which requires the algorithm to adapt to the change. In addition, some obstacles may be dynamic, so the admissible range of the agent is a time-varying area. In response to these problems, we propose a fuzzy dynamic window method, which is discussed in Section 3.
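The detection range and admissible positions described above can be sketched as two simple geometric checks for spherical agents and obstacles. This is an illustrative sketch only: the function names are hypothetical, and the safety zone is modeled here as a scalar margin added to the radii, which is an assumption about how S_b(t) might be applied.

```python
import math

def detected_obstacles(agent_pos, obstacles, r_s):
    """Keep obstacles (center, radius) whose surface lies within the sensor radius r_s."""
    return [(c, r) for (c, r) in obstacles if math.dist(agent_pos, c) - r <= r_s]

def is_admissible(point, obstacles, r_agent, safety_margin):
    """True if a spherical agent centered at `point` clears every obstacle plus the margin."""
    return all(
        math.dist(point, c) >= r + r_agent + safety_margin
        for (c, r) in obstacles
    )
```

Because the obstacle list is re-evaluated at every call, the same check works for a time-varying admissible region: the caller only needs to pass the current (or predicted) obstacle centers.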

Proposed Dynamic Window Approach
DWA is one of the most commonly used methods for local obstacle avoidance. In each iteration, the algorithm calculates all feasible speed groups for the next iteration. The optimal motion combination is then selected by the evaluation function. The algorithm adopted in this paper differs from the original algorithm as follows:
(1) DWA is extended to a 3D dynamic environment.
(2) New evaluation items are added to the original evaluation function, so that a new evaluation function is established. Fuzzy logic is also introduced to adjust the weight of each evaluation item according to the working conditions. The distance from the agent to the goal and the distance between the agent and the obstacle are the basis for the adjustment.
(3) Two different velocity windows are evaluated at the same time. To ensure the safety of the trajectory, the maximum velocity window is used to calculate the braking distance and the distance to the obstacle.
(4) Local goals are used to avoid large obstacles or U-shaped obstacles.

The Two Velocity Windows
Due to acceleration and velocity limitations, the agent cannot always move as required. Meanwhile, a high iteration rate or nearby obstacles may make the agent's speed extremely low, which may cause the speed group to be ignored and lead to run-time errors. Therefore, two-level velocity windows are used to deal with this problem, as shown in Figure 3. Figure 3a shows the two-level velocity windows. The black cube represents the agent and the blue cube represents the 'local window'. The local window contains the velocities that the agent can reach in the next iteration, which are limited by the agent's acceleration and maximum velocity. The light cube represents the maximum-speed window, that is, the window of the maximum speed that the agent can reach. The maximum-speed window ignores the kinematic constraints of the agent. In Figure 3b, according to the elevation and azimuth angles, the agent's motion is divided into different motion planes to form a 3D fan-shaped area. The dense part of the 3D sector represents the predicted positions that the agent can reach if it maintains its velocity for 3 s. The dark-blue area represents the positions that the agent can reach at the maximum speed. The red rectangular prism represents the position of the agent selected by the evaluation function. With the two-level velocity windows, when the agent's speed is extremely low, the best trajectory within the maximum-speed window is determined first. The agent then moves toward the optimal position in the next iteration. Based on this, we can further analyze how the agent determines the best local goal when it encounters a large obstacle or a U-shaped obstacle.
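As a rough illustration of the two windows for a single velocity component, the sketch below computes a local window bounded by the acceleration limit and a maximum-speed window that ignores it. The function names, the explicit lower velocity limit, and the uniform sampling are assumptions for this example; the paper's windows are three-dimensional.

```python
def local_window(v_now, v_min, v_max, a_max, dt):
    """Velocities reachable within one step dt, clipped to the speed limits."""
    return (max(v_min, v_now - a_max * dt), min(v_max, v_now + a_max * dt))

def max_window(v_min, v_max):
    """Maximum-speed window: the full speed range, ignoring acceleration limits."""
    return (v_min, v_max)

def sample_window(window, n):
    """Return n evenly spaced candidate velocities across a window (n >= 1)."""
    lo, hi = window
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]
```

For example, with the limits used later in the tests (2 m/s and 0.5 m/s^2) and an assumed step of 0.1 s, an agent moving at 1 m/s has a local window of about 0.95 to 1.05 m/s, while the maximum-speed window always spans 0 to 2 m/s.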

The Candidate Local Goal
As shown in Figure 4, the candidate local goal points are generated in a safe fan-shaped area in which the agent can fly, and the best candidate is then selected. In the definition of the candidates, goal_local(i) is the i-th candidate local goal point, φ_i(t) and θ_i(t) are the azimuth and elevation of the candidate, X(t) is the position of the agent, and R_S is the safe separation between the agent and obstacles. The indices j and k, together with the number m, enumerate the candidate local goals. The larger the value of m, the more candidate local goal points are available and the better the selected local goal can be. R_S is defined in terms of R_r, the agent radius; R_braking, the braking distance of the agent; v_t, the current velocity of the agent; and a_max, the maximum acceleration that the agent can achieve.
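The defining equations are not reproduced here, so the sketch below rests on two labeled assumptions: the braking distance is taken as the constant-deceleration stopping distance v_t^2 / (2 a_max), with R_S = R_r + R_braking, and each candidate is placed at distance R_S from X(t) in the direction given by its azimuth and elevation. All function names are hypothetical.

```python
import math

def braking_distance(v_t, a_max):
    """Assumed stopping distance under constant deceleration a_max."""
    return v_t ** 2 / (2.0 * a_max)

def safe_separation(r_agent, v_t, a_max):
    """Assumed form R_S = R_r + R_braking."""
    return r_agent + braking_distance(v_t, a_max)

def candidate_local_goals(position, r_safe, azimuths, elevations):
    """Place one candidate per (azimuth, elevation) pair, angles in radians."""
    x, y, z = position
    candidates = []
    for az in azimuths:
        for el in elevations:
            candidates.append((
                x + r_safe * math.cos(el) * math.cos(az),
                y + r_safe * math.cos(el) * math.sin(az),
                z + r_safe * math.sin(el),
            ))
    return candidates
```

Under these assumptions, an agent of radius 0.2 m moving at 2 m/s with a_max = 0.5 m/s^2 would need a safe separation of 4.2 m, which shows why the braking distance dominates at high speed.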

Distance-to-the-Virtual Goal Term
This evaluation term is used to select the candidate virtual local goal point closest to the real goal.
where P_goal is the position of the real goal, P_goallocal is the position of the virtual local goal, and d_1max is the distance between the real goal and the current position of the agent.
A safe area is established between the agent and the local goal; it is defined as a cuboid area, as shown in Figure 5. If there are obstacles in the safe area between the agent and a candidate local goal, that candidate is deleted from goal_local(i).

Orientation-to-the-Local Goal Term
This evaluation term is used to select the candidate local goal point with the smallest angle between the real goal and the local goal points.

Distance-to-the-Obstacle Term
This evaluation term is used to select a local goal point far away from obstacles.
where P_obstacle is the position of the obstacles detected by the agent and m is the number of obstacles. The optimal local goal is selected by the evaluation function in Equation (17), where µ_dist1, µ_dist2, and µ_head1 are the weights of the three terms. The candidate goal_local(i) with the largest value of g_end is used as the local goal.
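The selection of the local goal can be sketched as a weighted sum of three normalized terms followed by an argmax. The exact form of Equation (17) is not available in this text, so the normalizations below (distance scaled by d_1max, angle scaled by π, clearance capped at a hypothetical `clearance_cap`) are assumptions chosen only so that each term lies between 0 and 1.

```python
import math

def angle_between(u, v):
    """Angle in radians between two 3D vectors (0 if either is zero)."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos_angle = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def score_candidate(agent, goal, candidate, obstacles, weights, clearance_cap):
    """Assumed g_end = mu_dist1 * dist1 + mu_head1 * head1 + mu_dist2 * dist2."""
    mu_dist1, mu_head1, mu_dist2 = weights
    d_1max = math.dist(agent, goal)
    # Closer to the real goal is better.
    dist1 = max(0.0, 1.0 - math.dist(candidate, goal) / d_1max) if d_1max > 0 else 1.0
    # Smaller angle between the goal direction and the candidate direction is better.
    to_goal = [g - a for g, a in zip(goal, agent)]
    to_cand = [c - a for c, a in zip(candidate, agent)]
    head1 = 1.0 - angle_between(to_goal, to_cand) / math.pi
    # Farther from the nearest obstacle is better, up to the cap.
    nearest = min((math.dist(candidate, o) for o in obstacles), default=clearance_cap)
    dist2 = min(nearest, clearance_cap) / clearance_cap
    return mu_dist1 * dist1 + mu_head1 * head1 + mu_dist2 * dist2

def select_local_goal(agent, goal, candidates, obstacles, weights, clearance_cap):
    """Return the candidate with the largest score."""
    return max(
        candidates,
        key=lambda c: score_candidate(agent, goal, c, obstacles, weights, clearance_cap),
    )
```

In this sketch, a candidate that sits on an obstacle loses its entire clearance term, so a large obstacle weight steers the selection sideways even when that candidate points straight at the goal.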

The Evaluation Function for DF_DWA
Once the dynamic window is set, each set of speeds (linear and angular velocities) is evaluated to select the best combination in the iteration. This paper uses the speed, goal distance, goal orientation, and obstacle clearance as components of the evaluation function. The detailed content of the DF_DWA algorithm is given in Algorithm 1, whose main loop runs for up to 5000 iterations (for i = 1:5000) and, at each iteration, checks whether norm(x(i) - goal) > p_e, where p_e is the allowable position error.

Speed Term
The speed term evaluates whether the agent advances at the optimal speed. There are two scenarios when an agent flies toward a goal: the agent either faces the goal or faces away from it, and each scenario has its own expression for the reward value. If the goal is in front of the agent, the first expression applies. When the agent faces away from the goal, it only needs to rotate, which saves energy, and the second expression applies. In both expressions, α and β are the proportional parameters of the linear velocity and the angular velocity, respectively, and their values are between 0 and 1. The angular velocity has two maximum values: as the linear velocity increases, the rotation speed decreases until v_t = v_max and ω_t = 0.

Distance-to-the-Goal Term
This term rewards trajectories that move the agent toward the goal. The reward is normalized between 0 and 1 using d_goal, the distance from the position of the agent to the goal, and d_max, the maximum distance, which is defined as the distance from the starting point to the goal.
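One plausible normalization consistent with this description is 1 - d_goal / d_max, clipped to the interval from 0 to 1. The paper's exact expression is not reproduced in this text, so the form below is an assumption, and the function name is hypothetical.

```python
import math

def goal_distance_reward(position, goal, start):
    """Assumed reward 1 - d_goal / d_max, clipped to [0, 1]."""
    d_max = math.dist(start, goal)
    if d_max == 0.0:
        return 1.0  # start and goal coincide, so the goal is already reached
    d_goal = math.dist(position, goal)
    return min(1.0, max(0.0, 1.0 - d_goal / d_max))
```

With this form the reward is 0 at the starting point, 1 at the goal, and stays at 0 if the agent drifts farther from the goal than it started.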

Orientation-to-the-Goal Term
This heading term rewards curvature arcs that head the agent toward the goal. The direction angle between the end of the trajectory and the goal is compared with the azimuth of the agent to give the errors φ_error and θ_error, as shown in Figure 6. The reward function is given by 180 - φ_error and 180 - θ_error, normalized between 0 and 1. Here, (x_goal, y_goal, z_goal) is the coordinate of the goal point and (th_pitch, th_yaw) is the attitude angle of the agent.
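A minimal sketch of the heading term follows, assuming that φ_error is the azimuth error measured against th_yaw, that θ_error is the elevation error measured against th_pitch, and that all angles are in degrees. These pairings and the helper names are assumptions; only the 180 - error form comes from the text.

```python
import math

def wrap_deg(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def heading_rewards(position, th_pitch, th_yaw, goal):
    """Return (pitch reward, yaw reward), each (180 - error) / 180 in [0, 1]."""
    dx, dy, dz = (g - p for g, p in zip(goal, position))
    azimuth = math.degrees(math.atan2(dy, dx))
    elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    phi_error = abs(wrap_deg(azimuth - th_yaw))
    theta_error = abs(wrap_deg(elevation - th_pitch))
    return (180.0 - theta_error) / 180.0, (180.0 - phi_error) / 180.0
```

An agent pointing straight at the goal scores 1 on both components, and one pointing directly away in azimuth scores 0 on the yaw component.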

Distance-to-the-Obstacle Term
This term rewards trajectories that keep the agent far from obstacles. If the value of this term is small, the agent is close to obstacles; if the value is large, the agent is far from obstacles. Here, (x_obstacle, y_obstacle, z_obstacle) are the coordinates of the obstacles. The distance to obstacles includes the distance to static obstacles (dist_static) and the predicted distance to dynamic obstacles (dist_dynamic). For a dynamic obstacle, the position and velocity direction of the obstacle are measured by the sensor in real time, and the positions that the obstacle can reach at each time interval (∆t) over the next prediction time are calculated as its predicted positions. Among these positions, the shortest distance to the agent is used as the dynamic obstacle distance (dist_dynamic), as shown in Figure 7. By considering this distance, the agent plans the next motion command. In actual operation, the polynomial fitting algorithm [19] and the extended Kalman filtering method [33] can be used to predict the trajectory of obstacles. In addition, choosing the shortest distance as the distance to the dynamic obstacle emphasizes the threat that the dynamic obstacle poses to the agent. The fuzzy rules can therefore increase the weight of the distance-to-the-obstacle item so that the agent avoids the obstacle as quickly as possible.
where (x_preob, y_preob, z_preob) is the predicted position of the dynamic obstacle. The reward value of the iteration is calculated using Equation (29). Among all the speed groups being evaluated, the speed pair with the largest evaluation function value is selected as the motion instruction.
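The dynamic obstacle distance can be illustrated with a constant-velocity prediction, which is an assumption made here for simplicity; the text notes that polynomial fitting or extended Kalman filtering can be used instead. The function name and parameters are hypothetical.

```python
import math

def dynamic_obstacle_distance(agent_pos, obs_pos, obs_vel, dt, horizon):
    """Shortest distance from the agent to the obstacle's predicted positions.

    The obstacle is propagated at constant velocity in steps of dt over the
    prediction horizon; its current position is included as step 0.
    """
    steps = int(round(horizon / dt))
    shortest = math.dist(agent_pos, obs_pos)
    for k in range(1, steps + 1):
        predicted = [p + v * k * dt for p, v in zip(obs_pos, obs_vel)]
        shortest = min(shortest, math.dist(agent_pos, predicted))
    return shortest
```

An obstacle approaching head-on therefore registers a distance well below its current range, whereas a receding obstacle registers its current range, which matches the intent of treating approaching obstacles as the larger threat.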
where ε_speed is the reward value of speed, ε_goal is the reward value of the distance from the candidate points to the goal, ε_obstacle is the reward value of the distance from the agent to the obstacles, and (ε_pitchheading, ε_yawheading) is the heading reward value.
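Assuming the evaluation function is a weighted sum of the reward values listed above (its exact form is not reproduced in this text), the selection of the motion instruction can be sketched as follows. The dictionary keys and function names are hypothetical.

```python
def evaluation_value(rewards, weights):
    """Weighted sum of the reward terms; both arguments share the same keys."""
    return sum(weights[name] * rewards[name] for name in weights)

def select_command(candidates, weights):
    """Pick the command whose rewards give the largest evaluation value.

    `candidates` is a list of (command, rewards) pairs, where `command` is a
    speed pair such as (linear velocity, angular velocity).
    """
    best_command, _ = max(candidates, key=lambda item: evaluation_value(item[1], weights))
    return best_command
```

Changing only the weights changes which speed pair wins: in this sketch an obstacle-dominant weight set prefers the slower, safer command, and a speed-dominant set prefers the faster one.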

The Fuzzy Rule for the DF_DWA
According to the evaluation function (26), the agent selects the best speed group in the predicted trajectory as the action command. The reward value in the evaluation function determines the weight of each evaluation term in the process of predicting the trajectory. Therefore, with different weight settings, the selected trajectories are also different. In the following paragraphs, we discuss the impact of the weight settings on trajectory generation and obstacle avoidance; each reward value of the evaluation function is set in turn either to zero or as the predominant weight.
To test the influence of the reward value weights in the evaluation function on trajectory generation, we set up the scenarios shown in Figures 8 and 9; the test data are given in Table 1. The starting point of the agent is (0,0,0) and the goal point of the agent is (4,4,4). In Table 1, α is the number of iterations in each case. υ and ω are the average linear velocity and the average angular velocity of the agent, respectively. Meanwhile, the variances of the linear velocity (σ_v^2) and the angular velocity (σ_ω^2) are used to evaluate the smoothness of the trajectory. In the test, the speed of the agent is limited to 2 m/s (linear velocity) and 60 degree/s (angular velocity). The acceleration is limited to 0.5 m/s^2 (linear velocity) and 90 degree/s^2 (angular velocity). It can be seen from Figure 8 that when ε_obstacle is set as the predominant weight, the planned trajectory avoids obstacles. If other terms are set as the predominant weight, the trajectory can pass through obstacles, so obstacle avoidance fails. Meanwhile, Figure 9 shows that if ε_obstacle is set to zero, the trajectory cannot avoid obstacles. When ε_speed is set as the predominant weight, Table 1 shows that the number of iterations is the smallest. If ε_speed is set to zero, the agent can still safely reach the target position, but the number of iterations doubles. Figures 8 and 9 show that the same situation occurs when ε_heading is set to zero or ε_goal is set as the predominant weight: the trajectory reaches the goal only after a full circle, which increases the number of iterations and makes the trajectory less smooth.
Figure 9 also shows that when ε_pitchheading is absent, the trajectory has a horizontal rotation, and when ε_yawheading is absent, there is a vertical rotation. Therefore, the weights of these two evaluation items should not differ too much. In addition, Figure 9a shows that if ε_goal is zero, the agent falls into local minima. The above analysis shows that, in the evaluation function, a single or constant combination of reward values cannot guarantee the safety of the agent's trajectory: when the environment changes, the agent cannot adapt to it. Therefore, based on the traditional DWA, this paper introduces fuzzy rules to adjust the reward values.
When setting a fuzzy rule to adjust the weights of parameters, the distance to the obstacle and the distance to the target are used as the inputs of the fuzzy rule, and the evaluation coefficients of the speed, distance to the goal, distance to the obstacle, and the direction to the goal are used as outputs. The fuzzy rule is shown in Figures 10 and 11.
(1) When the distance to the goal and the distance to the obstacles are both long (i.e., longer than 5 agent radii), the agent does not need to avoid obstacles first. The agent should accelerate to a certain speed and approach the goal with moderate attention. ε_obstacle can therefore be set at a small value, ε_goal can be increased moderately, and ε_heading can be set at a large value.
(2) When the distance to the goal is long and the distance to obstacles is short (i.e., shorter than 2 agent radii), the agent should avoid obstacles first and not rush to approach the goal. The weight of the agent speed should be reduced to guarantee that the agent has enough time to adjust its forward direction. Therefore, ε_obstacle should be set at a large value, while ε_goal, ε_heading, and ε_speed should be set at small values.
(3) When the distance to the goal is short and the distance to obstacles is long, the agent should approach the goal first and not rush to avoid obstacles. The weight of the agent speed should be appropriately reduced to ensure that it reaches the goal safely. ε_obstacle and ε_speed can therefore be set at small values, while ε_goal is set as the predominant weight and ε_heading is set at a medium value.
(4) When the distance to the goal and the distance to obstacles are both short, the agent should avoid obstacles first and approach the goal with moderate attention. The weight of the agent's speed should be reduced. ε_speed and ε_heading should therefore be set at small values, ε_obstacle should be set as the predominant weight, and ε_goal should be set at a medium value.
The fuzzy rules relating Distgoal and Distobstacle to ε_goal, ε_obstacle, ε_speed, and ε_heading are shown in Tables 2-5. The Mamdani fuzzy inference algorithm is used, and defuzzification adopts the center-of-gravity method.
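To make the inference step concrete, the sketch below applies Mamdani inference (minimum for rule firing, maximum for aggregation) with center-of-gravity defuzzification to a single output, the obstacle weight. It is a simplified illustration, not the rule base of Tables 2-5: the membership functions, the use of only two output sets ("small" and "large"), and the reuse of the 2-radii and 5-radii breakpoints for both inputs are all assumptions.

```python
def short_membership(d, lo, hi):
    """Degree to which distance d is 'short': 1 below lo, 0 above hi, linear between."""
    if d <= lo:
        return 1.0
    if d >= hi:
        return 0.0
    return (hi - d) / (hi - lo)

# Assumed output sets for the obstacle weight on the domain [0, 1].
OUTPUT_SETS = {
    "small": lambda w: max(0.0, 1.0 - 2.0 * w),
    "large": lambda w: max(0.0, 2.0 * w - 1.0),
}

# (goal distance, obstacle distance) -> obstacle weight, following cases (1)-(4).
RULES = [
    ("long", "long", "small"),
    ("long", "short", "large"),
    ("short", "long", "small"),
    ("short", "short", "large"),
]

def obstacle_weight(d_goal, d_obstacle, r_agent, samples=200):
    """Mamdani inference with centroid defuzzification, computed by sampling."""
    lo, hi = 2.0 * r_agent, 5.0 * r_agent
    goal = {"short": short_membership(d_goal, lo, hi)}
    goal["long"] = 1.0 - goal["short"]
    obst = {"short": short_membership(d_obstacle, lo, hi)}
    obst["long"] = 1.0 - obst["short"]
    numerator = denominator = 0.0
    for i in range(samples + 1):
        w = i / samples
        # Clip each rule's output set at its firing strength, then aggregate with max.
        mu = max(min(goal[g], obst[o], OUTPUT_SETS[out](w)) for g, o, out in RULES)
        numerator += w * mu
        denominator += mu
    return numerator / denominator if denominator > 0.0 else 0.5
```

In this sketch the defuzzified weight rises smoothly from about 1/6 to about 5/6 as the obstacle distance shrinks from 5 to 2 agent radii, which is the qualitative behavior that cases (1) to (4) describe.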
The evaluation function is then obtained with the fuzzy-adjusted weights. Under the fuzzy rules, if the minimum value of the input obstacle distance is set larger, the agent reacts to obstacles earlier. Meanwhile, the minimum values of the ε_speed and ε_heading fuzzy domains are not set to zero, which prevents the agent from falling into local minima because of a low velocity. The specific steps of the DF_DWA algorithm are given in Algorithm 1.

Simulation Results
In this section, the simulation results are discussed. The simulations were run with 64-bit MATLAB R2018a on an Intel Core i7-9750H 2.6 GHz processor. The parameters used in the numerical simulations are listed in Table 6.
Table 6. Agent and obstacle parameters.

Static Obstacle Avoidance
In this scenario, there are multiple static obstacles in the space, including an L-shaped obstacle with a height of 2.4 m and a length of 1.8 m. The performance of the planned trajectory when the reward values take different weights was discussed in the preceding section. In this test, fuzzy logic is introduced to adjust the weights of the parameters. Figure 12a shows the trajectories planned by the three methods. Figure 12b shows the velocity change of the agent. Table 7 shows the number of iterations and the minimum distance between the agent and obstacles. In terms of the time spent in planning the trajectory, Figure 12b and Table 7 show that F_DWA takes the most time and DWA takes the least. Regarding safety, the trajectory planned by DF_DWA stays the farthest from obstacles and is therefore the safest. The minimum distance between the trajectory planned by DWA and the obstacles is 0.28 m, less than the safety distance of 0.3 m; although the agent reaches the goal in the least time when following the DWA trajectory, safety is reduced. F_DWA takes the most time because the velocity falls into local minima many times. Therefore, DF_DWA is superior to F_DWA and DWA in terms of computational efficiency and safety. The simulation results show that, compared with the DW, BUG, and PF algorithms in the literature [19], DF_DWA keeps the agent's trajectory safer when escaping local minima. Compared with the hybrid dynamic window method [18], DF_DWA keeps the agent at a safer distance from obstacles when escaping U-shaped traps. Compared with DW4DO [15], DF_DWA can adjust the parameter weights of the optimization function according to changes in the obstacle scene.

Dynamic Obstacle Avoidance
In space activities, multiple agents are sometimes required to complete tasks at the same time. For one agent, the other agents are regarded as dynamic obstacles. Each agent is equipped with sensors that can perceive the external environment around it. Therefore, even in abnormal situations, such as one agent chasing the agent ahead, the chasing agent can be perceived by the agent ahead.
In this section, the agent can detect the location of other agents by sensors and calculate the distance between them [19,33,42]. When using DF_DWA, although the speed does not change much, it is not constant. Therefore, the safety distance between obstacles must be strictly ensured. In this paper, the current maximum window speed is used to calculate the braking distance (i.e., R braking ).
(1) Two agents
To further test the obstacle avoidance ability of the algorithm in a dynamic environment, we set up a scenario in which two agents move toward each other and act as obstacles to each other. As shown in Figure 13a, the starting point of Agent 1 is (0,0,3) and its goal point is (8,0.5,5). The starting point of Agent 2 is (8,0,3) and its goal point is (0,0.7,5). The test result shows that this situation is handled safely: Agent 1 moves upward and Agent 2 moves downward when the separation between the two agents is short. The black dotted line is the trajectory of Agent 2 as predicted by Agent 1. It can be seen from Figure 13b that ε_speed is set as the predominant weight when the two agents are initially far apart. As the two agents approach each other, the weight of ε_obstacle gradually increases. When the agents reach the obstacle avoidance range, the weight of ε_obstacle exceeds 0.9. After escaping the threat posed by one another, the first task of each agent is to reach its goal. Therefore, ε_goal gradually increases. When the goal is about to be reached, the weight of the agent speed is decreased to prevent the agent from overshooting the goal while traveling at an extremely high speed.
(2) Four agents
In the previous case, when two agents move toward each other, each agent has sufficient space for movement. However, when multiple agents fly close to each other, the space available for avoiding obstacles shrinks. To verify the obstacle avoidance ability of the agent in this situation, two scenarios are set up in this section: in one, four agents fly to four locations within the same plane that are close to each other (FFLSP); in the other, four agents cross each other and fly to the designated locations (FFCL).
In the FFLSP scenario, four agents fly to different positions at the same height to complete tasks similar to formation or assembly. The starting point of Agent 1 is (0,2,0) and its goal point is (5,4,5). The starting point of Agent 2 is (0,0,0) and its goal point is (5,6,5). The starting point of Agent 3 is (4,2,0) and its goal point is (3,4,5). The starting point of Agent 4 is (4,0,0) and its goal point is (3,6,5). As shown in Figure 14a, the agents safely reach their goal positions. Meanwhile, Figure 14c shows the minimum distance between agents, that is, the smallest value in the set of distances between each agent and the other agents at the same moment during the flight. The minimum distances of Agent 1 and Agent 2 are similar before the 95th iteration, which shows that Agent 1 and Agent 2 are the main obstacles to each other. At the 78th iteration, the minimum distance is 0.5324 m, which is greater than the safety distance of 0.3 m.
In the FFCL scenario, four agents fly from four diagonal directions. This setting simulates multiple agents reaching their positions at the same time to carry out their respective operational tasks. During the process, the agents pass through similar path points. The four agents are unknown dynamic obstacles to each other. The starting position of Agent 1 is (7,1,1) and its goal point is (8,1,1). The starting position of Agent 2 is (0,2,0) and its goal point is (5,4,5). The starting position of Agent 3 is (0,7,4) and its goal position is (8,1,1). The starting position of Agent 4 is (6,5,4) and its goal position is (1,1,1). It can be seen from Figure 14b that the agents safely reach their respective goal points. Figure 14d shows the minimum distance between agents at the same moment. At the 69th iteration, the minimum distance between Agent 1 and Agent 3 is 1.258 m, which is far greater than the safety distance of 0.3 m.
The analysis of the minimum distance shows that the planned trajectories are safe and reliable. Figure 15 shows how the weights of the reward values change in the FFLSP and FFCL scenarios. In the FFLSP scenario, when an agent starts moving, the distance to its goal and the distance to the obstacles are relatively long. Therefore, in Figure 15a, the weight of ε_goal is set as the predominant weight at the beginning. As the agents approach the goal point, the weight of ε_obstacle of Agent 1 and Agent 2 gradually increases before the 75th iteration, and ε_obstacle remains the predominant weight until the 95th iteration. As Figure 14c shows, the minimum obstacle distance of Agent 1 and Agent 2 also falls in this iteration interval, so the agents avoid obstacles first. Then, when Agents 1 and 2 escape the collision threat, the weight of ε_obstacle gradually decreases, whereas the weight of ε_goal gradually increases. For Agent 3, the weight of ε_goal is always the predominant weight, which indicates that Agent 3 is unlikely to collide during operation. As for Agent 4, the weight of ε_obstacle gradually increases from the 59th iteration and reaches its maximum at the 91st iteration. Figure 14c shows that Agent 4 reaches its minimum distance to the other agents at the 91st iteration. The agent needs to avoid obstacles first and then fly to the goal, so the weight of ε_obstacle is at its maximum.
It can be seen from Figure 14d that, because the minimum obstacle distance is far longer than the safe distance of 0.3 m, the distance between the agents is safe. Therefore, in Figure 15b, the weights of the reward values fluctuate only slightly. In the process of approaching the goal, the weights of ε_speed and ε_obstacle gradually decrease, while the weight of ε_goal gradually increases. When the agent is close to the goal, the weights of ε_speed and ε_obstacle reach their minimum and the weight of ε_goal reaches its maximum. The analysis of the parameter changes shows that DF_DWA is effective, because the weights of the reward values are adjusted as the environment changes, thereby ensuring the safety of the agent's trajectory. Compared with DW4DO [15], DF_DWA can ensure the smoothness of operation by accounting for moving obstacles in advance. Compared with ASTRO [33], it reduces the mathematical constraints between two moving agents, making it more conducive for multiple agents to operate together and complete space tasks through data sharing. Table 8 shows all the numerical results of the dynamic obstacle avoidance test. The planned trajectory is smooth and suitable for application in the navigation framework.

Conclusions
This paper proposes a three-dimensional obstacle-avoidance approach for dynamic environments based on DWA. The weight of each parameter in the evaluation function is adjusted in real time in different working environments to ensure that an agent selects a safe trajectory. In the simulation tests, the approach responds well to large obstacles and dynamic obstacles and plans a safe trajectory. In addition, the numerical results show that the planned trajectory is smooth and applicable in the navigation framework. Compared with other methods, the developed DF_DWA does not require complicated constraint formulas. It also reduces calculation requirements and increases the ability to respond to environmental changes. In future work, we will continue to study obstacle avoidance algorithms for dynamic obstacles and improve experimental conditions.
Author Contributions: Conceptualization, C.X.; methodology, C.X.; validation, C.X.; formal analysis, C.X.; investigation, C.X.; data curation, C.X.; writing-original draft preparation, C.X.; writing-review and editing, M.X.; funding acquisition, Z.X. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.