A Hybrid Approach for Autonomous Collision-Free UAV Navigation in 3D Partially Unknown Dynamic Environments

: In the past decades, unmanned aerial vehicles (UAVs) have emerged in a wide range of applications. Owing to the advances in UAV technologies related to sensing, computing, power, etc., it has become possible to carry out missions autonomously. A key component to achieving this goal is the development of safe navigation methods, which is the main focus of this work. A hybrid navigation approach is proposed to allow safe autonomous operations in three-dimensional (3D) partially unknown and dynamic environments. This method combines a global path planning algorithm, namely RRT-Connect, with a reactive control law based on sliding mode control to provide quick reﬂex-like reactions to newly detected obstacles. The performance of the suggested approach is validated using simulations.


Introduction
Unmanned aerial vehicles (UAVs), also known as aerial drones, have emerged in many applications where it is required that some repetitive tasks are performed in a certain environment. Autonomous operation is highly desirable in these applications, which adds more requirements on the vehicle to achieve safe navigation towards areas of interest. Navigation methods can be generally classified as global path planning (deliberative), local path planning (sensor-based), and hybrid. A subset of sensor-based methods include reactive approaches [1].
Global path planning requires an overall knowledge about the environment to produce optimal and efficient paths which can be tracked by the vehicle's control system. There exist many different techniques to address global planning problems, including roadmap methods [2,3], cell decomposition [4], potential field [5,6], probabilistic roadmap (PRM) [7], rapidly exploring random tree (RRT) [8], and optimization-based techniques [9][10][11][12]. Some of these techniques become computationally challenging when dealing with unknown and dynamic environments since a complete updated map is required a priori. As a workaround to handle such environments, extensions to some of these approaches were proposed by adding an additional layer to continuously refine the initial path locally around detected obstacles. This still may be less efficient in highly complex and dynamic environments.
On the other hand, sensor-based methods generate local paths or motion commands in real-time based on a locally observed fraction of the environment interpreted directly from sensors measurements. Search-based methods, such as those used for global planning, and optimization-based methods can be used with a local map to generate paths locally where computational complexity depends on the selected map size. On the contrary, reactive methods offer better computational solutions by directly coupling sensors observations into control inputs providing quick reactions to perceived obstacles. Hence, they can be more suitable in unknown and dynamic environments. Examples of classical reactive methods used in unknown environments are dynamic window [13] and curvature velocity [14]. Other classical examples of reactive methods dealing with dynamic obstacles include collision cones [15] and velocity obstacles [16]. A class of reactive approaches adopt a boundary following paradigm to circumvent obstructing obstacles; for examples, see [17][18][19][20][21][22][23]. The low computational cost of such methods comes at the expense of being prone to trapping situations. Some researchers suggested a combination of a randomized behavior with the boundary following approach to escape such situations [24]. However, this may sometimes produce very unpredictable motions and even inefficient ones [25] without utilizing previous sensors observation acquired through the motion.
Hybrid strategies tend to address the afromentioned drawbacks by combining both deliberative and reactive approaches for a more efficient navigation behavior in unknown and dynamic environments. There exists a body of literature on hybrid approaches; for examples, see [25][26][27][28][29][30][31] and references therein. A hybrid approach was suggested in [25] for navigation in dynamic environments which combined a potential field-based local planner with the A* algorithm as a global planner based on a topological map. Similarly, the work [26] adopted the A* algorithm with binary grid maps while using a variant of the Bug algorithm as a reactive component to address navigation in partially unknown environments. The authors of [27] suggested another hybrid approach for micro aerial vehicles that uses A* for both deliberative and local planning components where the global path gets refined locally around obstacles through replanning processes. The hybrid approach presented in [28] adopted a fuzzy logic-based boundary following technique to implement the reactive layer while an optimal reciprocal collision avoidance (ORCA) algorithm was used in [29]. Sampling-based search methods were also used in some approaches such as [30], where a global planner based on the dynamic rapidly exploring random tree (DRRT) was suggested. Real-time obstacle avoidance was then dealt with by choosing a best candidate trajectory from a sampled set. In [31], the parallel elliptic limitcycle approach was adopted to implement both global and local planning components.
Many of the existing three-dimensional (3D) hybrid navigation methods consider search-based methods to implement the local planning component. Among those that adopt reactive-based approaches, many have just considered two-dimensional methods by constraining UAV movement to a fixed altitude which does not utilize the full capabilities of UAVs. Therefore, the main contribution of this work is to propose a hybrid 3D navigation strategy for UAVs to allow efficient navigation in partially unknown/dynamic environments. The suggested strategy combines a global path planning layer with a reactive obstacle avoidance control law developed based on a general 3D kinematic model. The global path planning layer, based on RRT-Connect, can produce efficient paths based on the available knowledge about the environment. The sliding mode technique is adopted to implement a boundary following behavior in the reactive layer. This choice provides quick reactions to obstacles with a cheap computational cost compared to search-based and optimization-based local planners. To develop a proper hybrid navigation strategy, implementation of a switching mechanism is presented to handle the transition between the two control laws. Overall, the proposed method can overcome the shortcomings of relying purely on a deliberative or reactive approaches. This paper is organized as follows. Section 2 provides a formulation of the tackled navigation problem. The suggested hybrid strategy is then presented in Section 3. The performance of this approach is confirmed through different simulation scenarios, which is shown in Section 4. Finally, concluding remarks are made in Section 5.

Problem Statement
A general 3D navigation problem is considered here, where a UAV is required to navigate safely in a partially known environment E ⊂ R 3 . The environment E contains a set of n obstacles O = {O 1 , O 2 , · · · , O n } ⊂ E . These obstacles can either be static and known O k or unknown static/dynamic O u such that O = O k ∪ O u . The main objective is to safely guide the UAV to reach some goal position p goal ∈ E \{O} represented in a world coordinate frame W with obstacle avoidance capability. The UAV starts with an initial map of the environment containing only information about O k based on previous knowledge. Unknown environments can also be considered where the UAV can start building a map as it traverses the environment. In this case, O k = ∅ initially. In addition, let p(t) = [x(t), y(t), z(t)] T be the UAV Cartesian coordinates expressed in W. Note that we represent vector quantities as column vectors; however, they can be written as the transpose of row vectors [·] T to simplify writing. A safety requirement for the UAV is defined as keeping a safe distance d sa f e > 0 from all obstacles according to the following: where d(t) is the distance to the closest obstacle, and || · || is the standard Euclidean norm of a vector in R 3 . We consider a general 3D nonholonomic kinematic model which is applicable to different types of UAVs and autonomous underwater vehicles. A description of this model is given as follows:ṗ where V(t) ∈ R + is the linear speed, and u(t) ∈ R 3 is a 2 degree-of-freedom (DOF) control input related to angular velocity. The motion direction at any given time instant t (i.e., orientation) is characterized by a unit vector r(t). The condition (4) indicates that the input u(t) is always perpendicular to r(t) which generates a steering-like behavior in 3D. The UAV velocities V(t) and u(t) are regarded as control inputs with some upper bounds (denoted by V max , u max > 0) due to physical limitation. These constraints can be expressed as follows: Notice that we consider only forward motions by not allowing negative values for V(t). For constant-speed applications, V(t) is kept constant at some valueV > 0. Additionally, the following assumptions are made. Assumption 1. The UAV senses a fraction of the surroundings by onboard sensors, and it can determine the distance to closest obstacle as part of its perception system. An abstract sensing model is considered where obstacles within a distance of d sensing > 0 from the UAV can be detected, and bounding geometric primitive can be used to represent the sensed fraction of the object (ex. box, sphere, cylinder, and/or ellipsoid).

Assumption 2.
Estimates of the UAV's position p(t) and orientation vector r(t) are available.
The following statement summarizes the considered problem in this work. Problem 1. Consider a UAV whose motion can be described by the model (2)-(4). Under assumptions 1 and 2, design control laws for V(t) and u(t) to ensure a collision-free navigation through an unknown or partially known environment E to reach a goal position p goal by satisfying the safety requirement in (1) and the constraints in (5).

Proposed Hybrid Navigation Strategy
Generally, the design of autonomous navigation methods adopts modular structures. The suggested structure in this work for the overall system from a software perspective includes subsystems for perception, high-level navigation, and low-level control. The perception subsystem is responsible for processing onboard sensor measurements to provide meaningful information about the environment. Thus, it can generate an updated global map representation of the environment as well as providing distance and direction to the closest obstacle as required by our reactive control law.
The design of the high-level navigation subsystem is the main contribution of this work, and the proposed design combines a few main components, namely global path planning, motion supervisor, path following control, reactive control, and a switching mechanism. The architecture of the overall hybrid navigation strategy showing these components is presented in Figure 1. The global path planning component is responsible for generating feasible and safe geometric paths based on the currently available global map. It is triggered to generate new paths by the motion supervisor whenever a new goal position is assigned or whenever a trapping situation (i.e., the UAV gets stuck) is detected as new portions of the environment are discovered. Initially, the map can be empty in unknown environments or partially filled in the case of partially known environments. As the vehicle navigates through the environment, the perception system updates the map based on sensors measurements. To execute the planned global paths, a path following control is adopted based on the pure pursuit guidance laws.
In unknown and dynamic environments, the vehicle may find that the planned global paths are unsafe due to the detection of new obstacles. In order to provide quick responses, a reactive control component is used to generate locally safe motions around unknown/dynamic obstacles by directly coupling command velocities into current sensors observations. The quick reactions to obstacles provided by reactive control is due to its low computational cost compared to replanning new paths. A mechanism is then used to switch between both path following and reactive navigation modes based on sensors measurements.
A detailed description of these components is given in the following subsections. Note that the high-level navigation can normally be implemented on an onboard mission computer. The mission computer is normally connected to a Flight control unit (FCU) which is directly connected to all motors. A low-level control subsystem is then used to generate actuators/motors commands to execute the high-level velocity commands generated by the high-level navigation subsystem.

Global Path Planning
A global path planning algorithm requires a map representation of the environment. There exist many algorithms in the literature where the choice of appropriate method can vary according to the UAV sensing and computing capabilities. Optimality of the planned paths and the computational complexity of the overall algorithm could be key factors in determining which method to consider for a specific application. Note that it is also possible to adopt more than one path planning algorithm where each planner is used under certain conditions. A typical example for this case is the use of optimizationbased algorithms to initially generate optimal paths while implementing a computationally efficient algorithm to modify the initial path when replanning is needed.
Our proposed hybrid navigation strategy is not restricted to a specific path planning algorithm. However, this work considers a sampling-based approach as a backbone for implementing the global planner which scales very well when planning in three-dimensional spaces. Namely, a variant of the rapidly exploring random tree (RRT) algorithm is used, which is called RRT-Connect [32]. This choice is due to the algorithm's popularity and low computational cost which is favorable since online replanning might be needed during the motion whenever trapping scenarios are detected. Moreover, we follow an optimistic approach in our implementation of the RRT-Connect algorithm where unknown space is considered to be unoccupied (i.e., obstacle-free). Based on this assumption, the reactive controller will be responsible for handling obstacle avoidance whenever an unknown space is found to be occupied.
RRT algorithms are randomized sampling-based path planning methods which can find collision-free paths if exist with a probability that will reach one as the runtime increases (i.e., probabilistically complete). In practice, these algorithms can provide quick solutions. The basic concept of an RRT planner can be summarized as follows. Let G be a search tree (a graph) initialized with an initial configuration (defined later). An iterative approach is used to extend the search tree through the configuration space until a feasible solution is found. At each iteration, a configuration is sampled either randomly or using some heuristics which can help biasing the growth of the search tree. Biasing the sampling process to select the goal configuration with some probability (0 < pb goal < 1) was found to enhance RRT growth. The algorithm then tries to extend the search tree G to the sampled configuration by connecting it to the nearest one within the tree. Different methods could be used to connect configurations within the configuration space especially when trying to satisfy some constraints. However, a common approach is to use straight lines to connect two configurations especially when dealing with Euclidean spaces which is considered here. A collision checker is then used to check the feasibility of each extension based on the available environment map where each feasible extension results in growing the search tree by adding the new configuration. The sampling process gets repeated iteratively until a feasible path between the initial and goal configurations is found or until a stopping criteria is met (for example, exceeding a predefined planning time limit).
RRT-Connect follows the same idea; however, it maintains two trees originating from both initial and goal configurations. At each iteration, the planner attempts to extend one of the trees followed by an attempt to find a collision-free connection to one of the vertices in the other tree. This can provide rapid convergence to a solution in complex environments compared to the standard RRT algorithm.
Due to the sampling nature of RRT algorithms, the generated paths are non-optimal in terms of the overall length. Furthermore, these paths do not satisfy nonholonomic constraints if straight lines were considered when extending the search tree. Therefore, we follow a common practice by refining the obtained paths through two post-processing stages, namely pruning and smoothing, as was done in [33,34].
For the pruning stage, redundant waypoints are removed from the planned path to improve the overall path quality. Let W = {w 1 , w 2 , · · · , w k } be a set of waypoints representing a path generated by RRT-Connect, and let W p be the pruned path obtained after this stage. Redundant waypoints can then be removed using Algorithm 1 based on [33].

Algorithm 1 RRT Path Prunning
Input: W = {w 1 , w 2 , · · · , w k } Output: W p Stop when a collision-free segment is found 9: insert w j into W p 10: i ← j 11: until i = 1 A smoothing algorithm is applied next to the pruned path to ensure that the final path satisfies the vehicle's nonholonomic constraints (minimum radius of curvature). To that end, parametric Bezier curves were used to generate continuous-curvature smooth paths following the approach suggested in [34].

Pure Pursuit Path Following Control
A path following control design is needed to ensure that the UAV can accurately track the planned path. The proposed design adopts the pure pursuit tracking (PP) algorithm which is known for its stability and simplicity [35]. Assuming that a geometric path τ ⊂ R 3 is available, the PP algorithm steers the vehicle to follow a virtual target p v = (x v , y v , z v ) ∈ τ moving along the path. This target is usually selected to be at some lookahead distance L away from the closest path point p c ∈ τ to the vehicle's current position. For more stability, a modified version of the PP algorithm is considered in this work based on [36] which suggested using an adaptive lookahead distance instead of a fixed value. This PP variant provides more stability when the vehicle is further from the planned path which is needed here since the vehicle can sometimes deviate from the planned path when avoiding obstacles with the reactive control component. The following control law is used for path following: where γ > 0. The mapping function F(w 1 , w 2 ) : R 3 × R 3 → R 3 acts as a steering function which produces a vector that is perpendicular to w 1 in the direction of w 2 , and it is defined as follows:

Reactive Control Law
Navigation in unknown and dynamic environments requires more safety measures as the planned path from the global planner can become unsafe whenever new obstacles are detected, especially if they are dynamic. Hence, a reactive control law is used to generate reflex-like reactions to detected obstacles by navigating around them until it is safe to continue following the previously planned path.
In our implementation, we adopt a reactive control law utilizing a sliding mode control technique based on [37]. The distance to closest obstacle d(t) is the only information needed to implement this controller and can be obtained from onboard sensors. This reactive control is based on a boundary following paradigm, and it is given as follows: where a constant forward speed is considered, d 0 > d sa f e > 0 is a desired distance, Γ = ±1 determines the avoidance maneuver direction (i.e., clockwise or counter clockwise with respect to the axis of rotation), and sgn(α) is the signum function. In addition, χ(β) is a saturation function which is defined as: for some design parameters γ, δ > 0. In addition, i a (t * ) represents an avoidance plane normal associated with a certain obstacle as explained in [37]. This normal can be different for each obstacle, and it can be determined at the time instant t * when a new obstacle is detected. The reactive navigation law (9)-(11) ensures that the vehicle will maintain a fixed distance d 0 while navigating around the nearest obstacle under some assumptions as was mathematically proven in [20].

Switching Rules
A switching mechanism is important for hybrid navigation methods. It is responsible for deciding whether it is safe to follow the planned path or to reactively avoid a newly detected obstacle. In general, two navigation modes will be used, and a switching mechanism is adopted to switch between those two modes. The two modes are path following mode M 1 (control law (6) and (7)) and obstacle avoidance/reactive mode M 2 (control law (9)-(11)).
Assuming that the vehicle initially starts in mode M 1 , we consider the following switching rules: R1: switch to the reactive mode M 2 when the distance to the closest obstacle O i drops below some threshold distance C (i.e., d(t) = C andḋ(t) < 0). R2: switch to the path following mode M 1 from M 2 when |d(t) − d 0 | < 0 ∀i for some small value 0 > 0 and the vehicle's heading is targeted towards the virtual target p v on the planned path which can be determined according to the following condition: where 1 > 0.

Motion Supervisor
The motion supervisor is responsible for detecting trapping situations whenever the vehicle gets stuck in a local minimum due to a newly discovered fraction of the environment. An example of such scenario can be seen when navigating in maze-like environments, concave obstacles, and/or long blocking walls [38]. The proposed approach to tackle this problem is by issuing a replanning command to the global path planner to acquire a new path based on the updated knowledge about the environment.

Simulation Results and Discussion
The developed hybrid navigation approach was tested using simulations considering two different cases, namely static and dynamic environments. Some knowledge about these environments were assumed to be known, a priori, to show the role of the global planner; however, the developed approach can work well even when no such knowledge is available. The RRT-Connect algorithm was implemented based on [32] to plan global paths.
The simulations were carried out using the MATLAB software running on a 2.6 GHz Intel Core i7 CPU. We considered a maximum linear speed of V = 0.75 m/s and a maximum angular velocity of u max = 1.75 and 2.5 rad/s for the first and second cases, respectively. The design parameters were also selected as follows: L = 0.5, δ = 0.5, γ = 1, d 0 = 1, and C = 2. An arbitrary choice was made for i a (t * ) used in (11) which vary for each obstacle. In addition, the update rate of the control was set to be 0.01 s. A description and the results of these simulations are presented next.

Case I: Unknown Static Obstacles
In the first simulation case, we consider an environment in which the UAV performs some repetitive tasks. Hence, some knowledge about the environment is known a priori, such as wall locations. However, new static obstacles which are not known to the vehicle can be added to the environment at different times. A real-life example of this scenario is when operating in a warehouse or inside a building. The warehouse/building layout can be known in advance or from an initial mapping process while objects can be moved around all the time. Figure 2 shows the considered environment for this simulation case. The initial map available to the vehicle includes only information about the two walls (shown in gray/black). There are also 4 unknown cylindrical shaped obstacles with different sizes. A bounding shape can usually be estimated in practice by the vehicle's perception system to represent nearby obstacles where very close obstacles can be represented by a single bounding object to satisfy the control assumptions.
It can clearly be seen from Figure 2 that the proposed hybrid navigation strategy can safely guide the vehicle starting from some initial position (green circular marker) to reach the goal location (blue star marker). This figure shows both the path generated by the global planner based on the initial knowledge about the environment as well as the actual executed path by the vehicle. The vehicle can successfully track the planned path whenever it has good clearance from obstacles. Once an obstacle is detected by the vehicle's sensors, the vehicle switches to the reactive mode M 2 to move around the obstacle. Then, it goes back to the path following mode M 1 whenever it is clear to do so according to the switching rule R2 as described earlier. The distance to the closest obstacle during the motion is given in Figure 3, which clearly shows that the vehicle can satisfy the safety requirement by maintaining a proper clearance from all obstacles. In addition, the linear speed of the vehicle is shown in Figure 4. Overall, these results confirm that the proposed method can guide the vehicle safely among unknown static obstacles.

Case II: Unknown Dynamic Obstacles
The second simulation scenario considers the case where there are several unknown moving obstacles, which makes the environment dynamic. Similar to the previous case, some initial knowledge about the environment layout is assumed to be known, which is shown in gray/black in Figure 5. Moreover, there are multiple unknown dynamic spherical obstacles with random sizes between 0.2 and 1 m in radius and arbitrary linear speeds of 0.1, 0.2, and 0.25 m/s (less than 0.75 m/s). Note that for simplicity, collisions between different obstacles are not considered in this case.
Based on the initial map, the global planner finds a safe path using RRT-Connect, which is shown as a dashed red line in Figure 5. The vehicle then starts moving in mode M 1 to track the planned (reference) path. However, due to the highly dynamic nature of the environment, this path becomes unsafe whenever there are obstacles approaching the vehicle, as shown in Figure 6, at different time instants during the motion. Each time a threatening obstacle is detected, the vehicle switches to navigation in the reactive mode M 2 according to the switching rule R1. This can be seen in Figures 5 and 6, which show that the vehicle's actual executed path deviates from the planned path at some locations to avoid the dynamic obstacles. This is verified in Figure 7, where the distance to the closest obstacle remain above the safety threshold. In addition, the UAV's linear speed is shown in Figure 8. It is evident from these results that the proposed strategy also works well in dynamic environments. It should be mentioned that the UAV's maximum velocity must be larger than obstacles' velocities to guarantee safety. However, the reactive control law can still handle some cases where the obstacles are moving faster than the vehicle but with no safety guarantees in some aggressive scenarios.

Conclusions
A hybrid three-dimensional (3D) navigation strategy for unmanned aerial vehicles (UAVs) was presented in this paper. The problem formulation considered a general 3D nonholonomic kinematic model which is applicable to different UAV types and autonomous underwater vehicles. A global planner based on the RRT-Connect algorithm was used which works well in three-dimensional spaces. A reactive control law based on sliding mode control was used to avoid unknown and dynamic obstacles. Simulation results confirm that the proposed hybrid navigation method works well in 3D environments with unknown static and dynamic obstacles.