Fast Obstacle Detection System for UAS Based on Complementary Use of Radar and Stereoscopic Camera

: Autonomous unmanned aerial systems (UAS) are having an increasing impact in the scientiﬁc community. One of the most challenging problems in this research area is the design of robust real-time obstacle detection and avoidance systems. In the automotive ﬁeld, applications of obstacle detection systems combining radar and vision sensors are common and widely documented. However, these technologies are not currently employed in the UAS ﬁeld due to the major complexity of the ﬂight scenario, especially in urban environments. In this paper, a real-time obstacle-detection system based on the use of a 77 GHz radar and a stereoscopic camera is proposed for use in small UASs. The resulting system is capable of detecting obstacles in a broad spectrum of environmental conditions. In particular, the vision system guarantees a high resolution for short distances, while the radar has a lower resolution but can cover greater distances, being insensitive to poor lighting conditions. The developed hardware and software architecture and the related obstacle-detection algorithm are illustrated within the European project AURORA. Experimental results carried out employing a small UAS show the effectiveness of the obstacle detection system and of a simple avoidance strategy during several autonomous missions on a test site.


Introduction
In recent years, there has been growing interest in autonomous UAS, and this is also reflected in the increase in scientific publications on the topic [1][2][3][4][5].The importance of this research area is fundamental since the development of autonomous technologies in drones can guarantee significant benefits for society in the near future.For instance, implementations for the search of missing people and the exploration of post-disaster or inaccessible environments to humans where the use of drones with autonomous decisionmaking capabilities could play a decisive factor in terms of the number of lives saved [6,7].
Nowadays, the topic of human transport in urban contexts through the use of autonomous drones is also becoming more and more interesting, precisely because it would allow to limit ground traffic by moving part of it to the sky.In this regard, the work presented in this article was developed within the European project AURORA (Safe Urban Air Mobility for European Citizens), which has as its ultimate goal the development of autonomous technologies aimed at this type of use.In this regard, a fundamental element that an autonomous driving system must implement on board is the obstacle-detection and avoidance part [8,9].In fact, without it, the aircraft cannot modify the global trajectory to avoid unexpected obstacles and therefore would not be able to be used for applications that instead require high safety and flexibility in terms of automatic recalculation of the desired trajectory.In the literature, there are papers that deal with the development of such systems; many of these are based on the use of stereoscopic cameras capable of depth perception, allowing to carry out the detection of obstacles [10,11].
Recently, the scientific community has begun to experiment with the use of radar systems, which have multiple advantages, such as longer detection ranges and being insensitive to light and visibility conditions [12,13].Compared to optical systems, it is important to note that radar technology also has disadvantages, such as a lower spatial resolution and the almost total absence of resolution in height [12,14].However, the two technologies have complementary characteristics and therefore lend themselves well to being used together; in fact, they are able to compensate for each other's disadvantages.Automotive applications of simultaneous radar and vision systems for obstacle detection [15][16][17] are widely documented in the literature; however, this is not common in the UAS field.In fact, except for some implementations where the 3D perception is obtained by merging together the radar data with the monocular vision [14], until now and to the authors' knowledge, there are no documented techniques that use radar and stereoscopic vision in a complementary way.
In this article, an approach for UAS applications that exploits the complementary features of an automotive-derived radar and a stereoscopic optical sensor to increase the reliability of the detection algorithm is presented.
The two systems have been kept completely independent in such a way that a malfunction of one cannot affect the functioning of the other.As it will be detailed in the following sections, the avoidance strategy comes into operation as soon as one of the two systems detects a dangerous obstacle according to the detection policy implemented.
The paper is organized as follows: Section 2 describes the hardware and software architecture, explains the method of creating environmental maps and defines the algorithm that has the task of detecting dangerous obstacles inside them.Furthermore, Section 2 also discusses the implemented avoidance strategy which comes into operation only when obstacles considered dangerous are highlighted.Section 3 shows the results obtained from field tests carried out in an open environment that simulates an urban context.Finally, Section 4 reports the discussion on results and future developments.

UAS Architecture
The platform used for the development of the obstacle detection and avoidance system consists of a DJI Matrice 300 RTK drone to which a ZED 2 stereoscopic camera, an AWR1843BOOST radar and an Nvidia Jetson Xavier AGX board have been added as a payload.The obstacle detection algorithm runs on the Jetson board, which is directly interconnected with the camera and the radar.Moreover, through a UART connection, the board is also able to pilot the UAS autonomously.
Figure 1 shows the DJI Matrice 300 RTK equipped with the additional components.In particular, at the top, it is possible to observe the Nvidia Jetson Xavier AGX board, while at the bottom, there are the radar and the optical sensors used for the implementation of the obstacle detection algorithm.

Stereoscopic Vision
To obtain three-dimensional information on the surrounding environment through the use of computer vision, there are basically two technologies that can be used.The first is based on RGB-D technology, where an RGB optical sensor is used alongside a TOF (time of flight) depth sensor.The second one uses the optical flow coming from two RGB cameras that needs to be processed on board the companion computer to provide 3D data.The TOF sensor mounted on RGB-D cameras uses a laser beam matrix, very sensitive to ambient light conditions, and does not guarantee high operating ranges.To comply with the specifications imposed by the AURORA project, classical stereoscopy based on two RGB optical streams was chosen, which provides more stable performance, especially in outdoor environments.In particular, the ZED 2 stereoscopic camera by StereoLabs was used, which in addition to providing the point cloud of the surrounding environment defined in the fixed frame, is able to generate an estimate of the trajectory traveled by the camera in 3D space.This estimate was obtained from the SLAM algorithm implemented in the SDK (Software Development Kit) supplied with the camera.Unlike other optical systems (see Intel t-265), the ZED 2 camera does not perform calculations on board, leaving all the computational load to the Nvidia Jetson module.This is a significant limitation since a considerable amount of the hardware resources is already reserved for the camera SDK, leaving less space for user applications.For these limitations, it is important to pay particular attention to the optimization of the detection algorithm in order to minimize the computational cost necessary for the detection of dangerous obstacles.Despite the presence of these problems, currently the ZED 2 camera represents the state of the art for stereoscopic vision systems.In fact, it guarantees an operating range of up to 40 m and performance superior to those offered by other products.Figure 2 shows a functional example of this camera, where the RGB optical flow (a) and the depth flow, processed on board the Jetson module (b), are visible.

Radar
The radar used for this work is an AWR1843BOOST by Texas Instruments [12].A radar detects the distance of the target by sending and receiving an electromagnetic signal through at least a couple of antennas.Using a multiple input multiple output array, it is also able to retrieve the direction of arrival.The sensor used is equipped with 3 TX and 4 RX antennas, which correspond to 12 virtual antennas, disposed as shown in Figure 3.In Figure 3, the z axis represents the altitude, while the x axis is left to right, and λ is the wavelength of the electromagnetic signal.This arrangement of virtual antennas achieves a good azimuth resolution and a poor elevation resolution, the elevation resolution being related to the inverse of the z-distance between the antennas.
For the current application, the elevation resolution is used only as an angular cut-off.In other words, the value of elevation measured by the radar is not used for mapping, and it is set equal to zero for each target, but it is used as a spatial filter for rejecting the target outside of a selected angular area.We set the angular field as ±45 deg in azimuth and [0,20] deg in elevation.Therefore, all targets outside this interval are not used for mapping.
Since the radar is not able to provide the elevation of the target, we decided to always consider zero as the elevation of the object.This is equivalent to assuming that each target is on a horizontal plane at the same height of the drone.This hypothesis is not as restrictive as it seems since usually a target at z = 0 m (at the same height of the UAS) is the most reflective.Under this hypothesis, the map generated using the radar is bi-dimensional.It becomes three-dimensional by changing the drone altitude [12].In order to reduce the number of false alarms, the radar signal was filtered using two constant false alarm rate (CFAR) algorithms [12].The first CFAR algorithm discriminates the physical targets from the thermal noise and from possible clutter using a moving threshold on range direction.The second CFAR algorithm was applied in the Doppler direction (which means on the speed) to further discriminate the possible targets from false alarm.The radar provides the coordinates of each target that exceed the CFAR thresholds.
In the current application, the radar range resolution was 0.5 m with a maximum range of 120 m and 10 Hz frame periodicity.

Framework
The whole framework runs on board the Jetson card, and it was developed through the ROS (robot operating system) environment.The main advantage offered by ROS is to guarantee extreme flexibility and modularity: it is possible to interconnect multiple software packages, called nodes, through a publisher-subscriber scheme.The framework developed is shown in Figure 4, where the nodes are visible in blue, the topics in green, and the functions implemented within the nodes in black.The DJI OSDK node provides a communication interface to the Matrice 300 RTK.In particular, through the telemetry function, it is possible to obtain all the data of interest, while through the control function, it is possible to autonomously fly the drone sending a set-point velocity vector.In the current implementation, the telemetry data that are actually used in the framework are the GNSS (global navigation satellite system) coordinates and the attitude vector of the drone.This data are then used to build the radar maps, which will be explained in Section 2.5.Regarding the radar node, it provides the topic containing the 2D coordinates of the obstacles detected in the radar frame.In fact, as already explained in Section 2.3, the AWR1843BOOST radar does not reliably provide the relative elevation coordinate.The ZED 2 node creates the interface toward the stereoscopic camera: this node provides the fixed-frame position estimation by processing the vision data, the camera attitude and the topic point cloud that contains the 3D image of the environment.As will be explained in Section 2.6, by suitably processing this topic, it is possible to create an obstacle-detection system.The ODS node (obstacle-detection system) implements various functions: in addition to the obstacle-detection task, it is responsible for the creation of the maps [12] explained in Section 2.5, exploiting the absolute paths generated from the GNSS data [12] via the DJI OSDK node and from the SLAM (simultaneous localization and mapping) estimated by the vision process [18][19][20][21].
Since through the point cloud topic, it is not possible to create vision maps directly, a specific node called "point cloud to voxel map" was developed.This node converts the point cloud topic into a usable data structure [22] and the processed data are used for the creation of the vision maps.
The last node implemented in the framework is the control node, through which it is possible to define the generating policy (planner) of the desired set points for the autonomous driving algorithm.In addition, the control node implements the avoidance strategy, which is enabled in the event that a potentially dangerous obstacle is detected.In this case, the function that realizes the avoidance strategy bypasses the planner function, which is normally enabled during the autonomous mission.

Mapping Algorithm
Both vision and radar maps exploit the Octomap library [23,24] to create and manage volume maps (defined by sets of voxels) in an efficient and fast way, which is a requirement when performed on embedded devices, such as Jetson boards.Two different type of maps can be employed: local and global maps, both using the same Earth reference system (ENU in this work).In general, we define a map as local when it only contains the obstacles detected at the current time, whereas a global map includes all the objects detected during the entire mission.In either cases, the task for updating the maps is computationally expensive since it requires a transformation of each detected point from the sensor frame to Earth frame, given the UAS attitude and position (pose).As said, for updating a local map, it is necessary to destroy the previous information at each new frame.Conversely, for the creation of the global map, it is sufficient to define a global container, where at each frame, all detected targets are inserted, without, however, deleting previously detected objects, even if they may no longer be present.Figure 5 shows the differences between the global map built using the radar sensor and the global map built starting from the optical one.In order to meet project AURORA specifications, which require independent and complementary sources for the obstacle detection subsystem, the proposed solution exploits two specific maps: a Vision local map and a Radar global map, as shown in the diagram of Figure 4.In particular, the vision maps are computed using the position estimate obtained from the vision process as a reference.Due to the camera operating range, it is quite common to have no visible objects during a mission for certain time intervals: in these conditions the SLAM process can diverge, creating inconsistencies between the new coordinates entered at the current time and the old ones, giving origin to distortions within the global map (see Figure 6).For this reason, the global map is to be considered unreliable, while the local map is the best choice for the task since in the latter, there are only the current obstacles, and these are always consistent with the position estimate provided by the SLAM process.

GNNS Path SLAM Path
Absence of visible fixed obstacles -divergence begins Figure 6.Path divergence issues: estimated trajectories starting from data provided by the RTK system (green) and by the SLAM algorithm that uses the vision data (red).The latter diverges if there are no fixed objects in the operating range of the camera.This situation is shown in the image on the top right, where, after the avoidance of the building, all the fixed obstacles are out of the range of the optical sensor.
As for the radar obstacles detection, the reasoning for choosing a global map is different: unlike the stereoscopic vision system, the radar has a much lower resolution, so some obstacles may not always be reflective and consequently they could be detected only in certain time intervals.Therefore, a global map reporting all the targets detected during the mission allows for greater safety.It should be also noted that the radar maps, unlike the vision ones, are created using the path received by the RTK system as a reference [12], which has no particular divergence problems, being based on GNSS localization.

Obstacle Detection Methodology
Obstacle detection refers to the task of searching for dangerous obstacles in specific regions of the maps introduced in Section 2.5 in order to guarantee safe UAS autonomous navigation.As already mentioned, the vision system guarantees high resolution for distances within 40 m, while the radar system provides a lower resolution but an operating range up to 120 m [12].Therefore, the visual detection algorithm is typically more effective in urban/indoor environments or in lower UAS speed ranges, whereas in larger open environments, and with few obstacles, the radar detection system guarantees safer operations, allowing for faster flight.In addition to this, it must be considered that in urban or indoor environments, there may be poor or incomplete GNSS coverage.In these conditions, the visual detection system is more reliable, not only in terms of resolution, but also because the vision maps are computed using the position provided by the SLAM process, which does not require geolocation data.
In order to be able to detect the presence of dangerous obstacles starting from the two maps used (vision and radar), it is necessary to define a region of interest (ROI) [25] as a subset of the maps where the obstacles are searched.The approach followed in this article considers any object (static or dynamic) found inside the ROI to be potentially dangerous.The shape of the ROI is defined to ensure that the global trajectory produced a priori by the planner can be considered safe under the condition that no obstacles are inside the ROI.In order to simplify the computational cost of the obstacle search, we make some assumptions about the UAS navigation which are quite reasonable: (i) The flight occurs with the sensors (radar and vision) facing the direction of motion; (ii) The altitude is kept constant during the flight.
Under these operating conditions, we define the ROI as a cylindrical sector with radius d m (maximum search depth along the flight direction), a FOV angle ψ R , and vertical height h m .This region can be first computed with respect to a reference frame B centered in the UAS position P and with Euler angles (0,0,ψ), i.e., with the y b axis aligned with the drone heading angle ψ, as shown in Figure 7.The figure shows a schematic of the detection process applied to the local map: the sensor FOV where obstacles are detected is shown in gray.Instead, the yellow area defines the ROI such that the presence of obstacles in this sub-region is considered dangerous.As the drone moves along the planned trajectory, the coordinates of the ROI in the fixed reference frame must be updated at each iteration of the algorithm, taking into account the UAS current position and yaw angle through the relation where F i and F b i are the coordinates of the i-th point of the ROI in ENU and B frames, respectively, and is the corresponding rotation matrix.This operation can be expensive, and its burden depends on the number N of points representing the ROI.For this reason, in this article, an efficient algorithm to generate the minimum number of points belonging to the ROI volume, given the map resolution m r , is reported in Appendix A.
Once all the iterations necessary to update the entire ROI are performed, the result shown in Figure 8 is obtained, where the relative voxel can be seen for each calculated point.Finally, the last step of the obstacle detection task is to query the radar and vision maps (via the Octomap libraries) using the coordinates of the ROI vectors F i in Equation ( 1): for each point of the ROI, an occupation probability is returned such that for probabilities greater than a given threshold (0.5 in our experiments), the relative voxel is considered occupied, and an obstacle-avoiding strategy needs to be activated.It is important to underline that the ROI update algorithm should start calculating the coordinates of the points from those closest to the drone.This is preferable to make sure that the search for dangerous obstacles starts from the most critical points, i.e., those that are closest to the drone.

Simplified Avoidance Strategy
To carry out the experiments required by the AURORA project, it was useful to design and implement an obstacle avoidance strategy on board the drone.Although in the literature, there are many refined techniques [26][27][28][29] that could have been used, the main objective of the work package assigned to the University of Florence unit was the development and testing of a fast complementary obstacle detection system exploiting radar and optical sensors.In this context, it was chosen to implement a very simple avoidance strategy able to correctly operate in an open environment.
In Figure 9, a flowchart shows the workflow of the algorithm: during the autonomous mission, the planner and the obstacle detection (OD) blocks are enabled.As long as no obstacles are detected inside the ROI, the OD block keeps the planner generating a trajectory through a velocity reference V 1 .When the OD task detects the presence of one or more obstacles, it disables the planner by interrupting the loop connection and activates the obstacle avoidance (OA) block, which remains active as long as there are obstacles inside the ROI.As already mentioned, a simplified avoidance strategy was implemented by switching to a velocity reference vector V 2 , which corresponds to an increase in altitude.This state persists until no more obstacles are found in the ROI such that the standard mission planner can be re-enabled.Notice that this strategy, once enabled, does not allow the subsequent lowering of the altitude for realignment with the global path.This means that the mission will be completed at a higher altitude than that of the path defined a priori.However, this is not a problem since, as already mentioned, the avoidance strategy has been implemented only for testing the detection system.

Results
The tests were carried out at the Celle castle at Incisa Valdarno (Italy) visible in Figure 11.To test the correct functioning of the obstacle detection and avoidance system in the planner, an autonomous mission was preloaded inside the control node, where the global trajectory consists of a straight path that collides with one of the buildings present on the site.Considering the type of environment in which the tests were carried out, to prevent the avoidance strategy from immediately starting to work, the desired depth of the ROI was set at 7 m for the vision and 10 m for the radar.
The trajectory that the drone needs to follow is shown in Figure 12.This mission foresees an automatic take-off from point 1, an arrival in point 2 and a return to the take-off point with consequent landing.All phases of the mission were managed on board the Jetson, which triggered the avoidance strategy explained in the previous section as soon as the ODS node detected the presence of a building.Since the two detection systems (radar and optical) are able to work independently, several missions were carried out in order to separately test each sensor.Figure 13 presents the results obtained by enabling first the radar detection system and then the optical one.On the top row of Figure 13a, three snapshots of a mission for the radar detection system are shown together with the voxels relating to the global map and the voxels that make up the ROI.In the central image, the ROI turns red since the algorithm detects an obstacle (the edge of a building).Instead, the bottom row shows the text area, where the ODS node starts publishing the warning messages with the coordinates of the detected obstacles (central image) and the function that implements the avoidance strategy (right image).This leads the drone to climb in altitude and subsequently to pass the building, avoiding the collision.A similar behavior can also be observed for the optical detection system, visible in Figure 13b.
As already explained, the resolutions that the two sensors can provide are very different: in fact, for the global radar map, only the voxels relating to the edges of the building are displayed, being more reflective, while in the optical local map, the entire visible wall of the building is filled with voxels.In this regard, the resolution of the radar maps is set at 1 meter, while for the vision maps, it is equal to 0.5 m.Since the radar has less resolution, it is safer to increase the volume associated with the single obstacles detected.Table 1 shows the average update times for the coordinates of the ROI as a function of different desired depths.Obviously, at greater depths, the algorithm must calculate a larger number of voxels, and consequently, the complexity increases.These timings were achieved by setting a map resolution of 0.5 m and a FOV of 24 degrees.As can be seen, even at the maximum search depth (120 m), the average update time is only 46.2 ms.This allows for the effective real-time detection of dangerous obstacles, thus maximizing the time available for the implementation of the avoidance strategy.

Discussion
The experimental tests presented in this paper have shown how in an urban scenario, the complementary obstacle detection and avoidance system is always able to detect and avoid dangerous static obstacles along the UAS path.Having adopted a redundant architecture consisting of a radar sensor and an optical sensor, which possess complementary operational features, it is possible to guarantee the detection of dangerous obstacles in different operating conditions, significantly increasing the safety of the entire system.In fact, as already discussed in the previous sections, in addition to having different operating ranges and resolutions between the radar and the optical sensor, in low light conditions, the radar can effectively replace the vision, even for short ranges.In conditions of acceptable brightness, vision can guarantee higher resolutions and therefore greater safety in detecting short-range obstacles.Moreover, redundancy can also be found in the different map reference systems employed: radar maps are based on GNSS geolocation data, while the vision maps use an independent position reference obtained from the optical SLAM process.In this way, if the satellite coverage fails, the vision maps will continue to function correctly and vice versa.
It is important to note that research associated with radar maps can involve larger regions of interest than optical ones, as the radar sensor is able to reach longer ranges.This implies that by appropriately scaling the resolutions of the vision and radar maps as a function of the specific sensor range, it is possible to obtain equivalent calculation times for the related regions of interest.This approach allows to add greater complementarity for the two detection systems.
Future research work will focus on the implementation of more complex avoidance strategies, which, eventually with the help of GPUs, can simultaneously compute multiple alternative trajectories in order to find the optimal one, also avoiding moving obstacles.Another interesting aspect that deserves additional work concerns the study of possible different strategies for the creation and management of radar and vision maps, where, for example, the RTK and SLAM references can be merged or replaced if necessary.where the indexes n i , n j , n k are integers depending both on the ROI geometry and map resolution: In brief, the computation of the N points of the ROI proceeds along the radius (using the n i index) from the origin up to the maximum depth, with a step resolution equal to m r .For any generic depth, the n j index spans the FOV angle ψ R to compute all the points along the corresponding arc of circle.Finally, the third index n k computes a copy of the above planar geometry along the vertical dimension.The number N of the points computed by the algorithm can be approximately estimated by the ratio between the ROI geometrical volume and the voxel volume, such that Figure A1 shows two examples of ROI with a common FOV angle of 24 degrees but different depths.

Figure 1 .
Figure 1.DJI Matrice 300 RTK drone equipped with the additional hardware components.

Figure 2 .
Figure 2. Functional example of a ZED 2 stereoscopic camera.Image (a) shows an RGB frame coming from the camera, while image (b) shows a depth frame processed by the ZED SDK on board the Jetson module.

Figure 4 .
Figure 4. Block diagram of the framework that implements the obstacle detection and avoidance algorithm.In particular, the framework is able to communicate with the drone and to receive data from the sensors to search for obstacles in the surrounding environment.

Figure 5 .
Figure 5. Differences between vision and radar global maps: left (a) is the radar map where detected obstacles are reported on the same plane as the drone; top right (b) is the optical point cloud, while on the bottom right (c), there is the corresponding conversion into a set of voxels used for the detection of dangerous obstacles.In this latter case, the map obtained also allows one to know the altitude of the detected obstacles.

Figure 7 .
Figure 7. Mapping region of the local map (gray); searching region where the presence of an obstacle is considered dangerous (yellow); occupied voxels by obstacles (blue).

Figure 8 .
Figure 8.The figure shows the voxels calculated by the detection algorithm within the search region.In particular, the number of points computed by the algorithm depends both on the map resolution and ROI volume.

Figure 10 1 WP2Figure 10 .
Figure 10 illustrates the different steps of the simplified avoidance strategy.

Figure 11 .
Figure 11.Celle castle site used to carry out the tests on the obstacle detection and avoidance system.

Figure 12 .
Figure 12.Example of mission carried out to test the correct functioning of the obstacle detection and avoidance system.This mission involves a trajectory that takes the drone on a collision course with a building on the site.

Figure 13 .
Figure 13.Screenshots of the building detection and avoidance phases through the separate use of the two sensors: radar detection system (a), and optical detection system (b).

•
m r : map resolution; • d m : maximum depth to investigate (ROI radius); • ψ R : ROI field of view [rad]; • h m : ROI height.The ROI is computed as the following set of N coordinates F b in the reference frame B ROI =    F b ∈ R 3 s.t.F b =  n i m r sin (n j /n i ) n i m r cos (n j /n i )

Figure A1 .
Figure A1.On the left (a) is the ROI relative to a desired depth of 7 m; on the right (b) is the ROI relative to a desired depth of 120 m.FOV angle is equal to 24 degrees.

Table 1 .
Number of voxels automatically calculated by the update algorithm of the ROI, the resolution and the computation time as a function of various desired depths.