3D Distance Filter for the Autonomous Navigation of UAVs in Agricultural Scenarios

Abstract: In precision agriculture, remote sensing is an essential phase in assessing crop status and variability when considering both the spatial and the temporal dimensions. To this aim, the use of unmanned aerial vehicles (UAVs) is growing in popularity, allowing for the autonomous performance of a variety of in-field tasks which are not limited to scouting or monitoring. To enable autonomous navigation, however, a crucial capability lies in accurately locating the vehicle within the surrounding environment. This task becomes challenging in agricultural scenarios where the crops and/or the adopted trellis systems can negatively affect GPS signal reception and localisation reliability. A viable solution to this problem can be the exploitation of high-accuracy 3D maps, which provide important data regarding crop morphology, as an additional input of the UAVs' localisation system. However, the management of such big data may be difficult in real-time applications. In this paper, an innovative 3D sensor fusion approach is proposed, which combines the data provided by onboard proprioceptive (i.e., GPS and IMU) and exteroceptive (i.e., ultrasound) sensors with the information provided by a georeferenced 3D low-complexity map. In particular, the parallel-cuts ellipsoid method is used to merge the data from the distance sensors and the 3D map. Then, the improved estimation of the UAV location is fused with the data provided by the GPS and IMU sensors, using a Kalman-based filtering scheme. The simulation results prove the efficacy of the proposed navigation approach when applied to a quadrotor that autonomously navigates between vine rows.


Introduction
Autonomous vehicles such as unmanned aerial vehicles (UAVs) are systems that are designed to operate in an unpredictable and partially unknown environment. This requires the UAVs to be able to navigate without disruption and to avoid any obstacle placed within their movement boundaries. In this framework, two main concepts are of particular relevance: (i) perception, i.e., the ability to acquire information from the vehicle environment and to sense objects around it or their relative position; and (ii) navigation, i.e., the ability to locate the vehicle in the surrounding environment by the merging of the sensors' data, combined with the vehicle dynamics [1]. To achieve these tasks, information is perceived by the use of sensors in order to autonomously perform the UAV localisation and navigation [2].
Sensors are devices that map the detected events or changes in the surroundings to a quantitative measurement for further processing; such measurements are useful for UAVs' path planning and decision making, and are essential precursors for controlling the motion of the drone.
The sensors used for data collection and autonomous navigation are typically grouped into two major categories, (i) proprioceptive and (ii) exteroceptive sensors. Proprioceptive sensors measure values internal to the system (e.g., UAV), whereas exteroceptive sensors perceive and acquire information such as the distance measurements or the light intensity from the surroundings of the system. According to these definitions, inertial measurement unit (IMU) and global positioning system (GPS) devices are classified as proprioceptive sensors, while sonar, ultrasonic distance sensors, and infrared (IR) sensors are some examples of exteroceptive sensors [3]. Therefore, the sensing capability of a UAV that employs a diverse set of sensors is an essential element as the cooperation and performance of these sensors can directly determine the viability and safety of an autonomous vehicle. UAVs primarily utilise ultrasonic sensors to perceive the operative environment [4]. Additionally, other sensors, including GPS and IMU, are used to determine the relative and absolute positions of the vehicle. Therefore, the merging and integration of the data from several sources becomes crucial.
This task is typically identified as sensor fusion, whose definition is provided by the Joint Directors of Laboratories Workshop [5] as "a multi-level process dealing with the association, correlation, combination of data and information from single and multiple sources to achieve refined position, identify estimates and complete and timely assessments of situations, threats and their significance". Therefore, sensor fusion techniques combine data from multiple sources to achieve better accuracy than the one obtained by the use of a single sensor. The sensor fusion framework also involves the development of consistent models that can accurately perceive the surroundings in various environmental conditions. The goal of using data fusion in multi-sensor frameworks is to reduce the detection error probability and to increase the information reliability by using data gathered from multiple distributed sources.
The available data fusion techniques can be classified into three main categories: (i) data association, (ii) decision fusion, and (iii) state estimation. Other classifications have been proposed over the years, as reported in [6]. The goal of the data association techniques is to establish the set of observations and/or measurements that are generated by the same target over time. On the other hand, for the decision fusion techniques, a decision is taken based on the knowledge of the perceived situation, provided by many sources. These techniques aim at making a high-level inference about the events and activities that are produced by the detected targets. Finally, the term state estimation identifies the techniques that determine the state of the moving target, given its observation or measurements. The final goal is to obtain a global target state from the observations coming from different sensors or sources.
The estimation problem can be defined as follows: given a set of redundant observations of the vector state (e.g., position, velocity, orientation), the goal is to find the set of parameters that provides the best fit to the observed data. Indeed, data acquisition is typically affected by errors and measurement noise. In the last fifty years, numerous state estimation techniques have been proposed, starting from the well-known Kalman filter [7]. Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. The most common estimation methods, in addition to the Kalman-like filters [8,9], include the maximum likelihood [10] and maximum posterior approach [11], the particle filter [12][13][14], the distributed particle filter [15,16], and the covariance consistency methods [17].
UAV navigation systems significantly rely on knowledge of the environment, which needs to be reliably described to allow an accurate determination of both drone position and orientation [18]. In this sense, 3D high-accuracy maps can represent a valuable resource to improve the UAV's capability of autonomously navigating in the field [19]. This ability to correctly assess the location of UAVs becomes fundamental in modern applications requiring the autonomous and coordinated navigation of agricultural unmanned vehicles, both aerial and ground, in the agriculture 4.0 framework [20]. Such a 3D map can be derived from several types of sources, such as LiDAR sensors [21], structured light cameras [22], or even standard cameras, by photogrammetric techniques [23] or stereo vision [24,25]. The raw data is usually a point cloud, which is a set of 3D coordinates of unlabelled points in the space representing the external surface of the detected objects [26,27].
On the other hand, retrieving 3D maps first and then exploiting them in real time for guidance and navigation tasks is challenging. This is mainly due to the required high data storage and computational demand with respect to the commonly available UAV hardware. Moreover, as highlighted in [3], to accurately determine the position and orientation of the UAV in its operative environment, it is better to exploit simple and understandable structures. The main technique for representing the environment is the geometric one, which is used to describe the environment by parameterising primitive geometric objects such as curves, lines, and points. The geometric representation of the environment is closer to the sensor and actuator world and it is the best one to perform local navigation.
Several triangulation techniques have been proposed to properly approximate surfaces or volumes in space by generating 2D or 3D meshes from sets of points [28]. Even if the most widely adopted techniques rely on Delaunay triangulations [29] and Voronoi diagrams [30], many other effective approaches have been developed for triangulated mesh generation, such as the Powell-Sabin-based methods (Barrera 2021) or the moving parabolic approximation approach [31]. The refinement of triangulated meshes is also a crucial task, essential for reducing the number of instances used to spatially describe an environment [32,33]. In addition, the semantic interpretation of a uniform dataset is essential to fully exploit the information potential of the 3D models of real scenarios, by assigning proper labels to each different element's category [34]. An innovative approach was proposed in [35] to generate low-complexity 3D mesh models of vine rows starting from a point cloud-processing pipeline. The main output of this processing flow was a 3D mesh, still properly describing the spatial layout and shape of the vine, but with a significant reduction of the required data storage demands. The efficacy of the so-called low-complexity georeferenced (LCG) maps was already validated in [36]. In that work, the LCG maps were exploited to automatically build the obstacle/occupancy grid map used by the guidance and control schemes for allowing autonomous vehicles to navigate between vine rows. On the other hand, the authors assumed the vehicle's exact location to be known, neglecting the sensor fusion task for autonomous navigation.
In agricultural scenarios, when relying on low-cost/low-accuracy sensors, the autonomous navigation of vehicles into narrow and complex environments could encounter several issues. Indeed, when operated within the vine rows, UAVs could be affected by the GPS signal being compromised or lost, which can jeopardise their navigation capabilities and lead to collisions. For this reason, in [37], we first proposed an innovative sensor fusion approach in which the data provided by GPS, IMU, and ultrasound sensors were fused together with the information provided by the LCG maps. In that paper, the ellipsoid method was applied to fuse the data collected by the ultrasonic sensors and the LCG map to improve the location estimation of an unmanned ground vehicle (UGV) navigating within the vine rows.
In this paper, we extend the aforementioned approach to the 3D case, thus allowing for the use of a similar filtering scheme for the autonomous navigation of UAVs into narrow spaces, such as vine rows. The main differences can be summarised as follows: (i) the real-time selection of the reference plane with respect to which the distance between the UAV center-of-mass (CoM) and the projected LCG map is assessed; (ii) the use of a Kalman-based filter to fuse the data from the combined ultrasound sensors-LCG maps and the data of the GPS and IMU with their correlated uncertainty; and (iii) the use of a moving-window approach to select the section of the 3D mesh of interest, complying with the estimated UAV location, thus allowing for a significant reduction of the algorithm's computational complexity. The efficacy of the proposed sensor fusion approach was validated in simulation by considering a quadrotor equipped with low-cost sensors and navigating within the rows of a vineyard in the Piedmont region of Italy.
The remainder of the paper is structured as follows. In Section 2, we present the selected framework and the modelling setup used to describe the main elements involved in the selected scenario. The proposed navigation scheme is detailed in Section 3, where the main features of the 3D distance filter are outlined together with some hints for the classic Kalman filtering scheme. The preliminary numerical results are presented in Section 4 and discussed in Section 5, while the main conclusions are drawn in Section 6.

Selected Framework
In this section, we provide an overview of the models exploited to describe the main elements involved in the selected navigation framework: (i) the quadrotor UAV; (ii) the LCG maps; and (iii) the ultrasound sensors.

Rotary-Wing UAV Modelling
The kinematics and dynamics of a rotary-wing UAV can be defined with respect to two main reference frames, as depicted in Figure 1: (i) an inertial frame I = (î_I, ĵ_I, k̂_I), with the z-axis pointing upward; and (ii) a moving (body) frame B = (î_B, ĵ_B, k̂_B), attached to the UAV, in which we envision the so-called "+" configuration, with the body axes aligned with the quadrotor arms. In this configuration, we have a pair of rotors (1 and 3) spinning counter-clockwise, whereas the other pair (2 and 4) spins clockwise, with the angular speed of the i-th rotor denoted as ω_i, with i ∈ {1, . . . , 4}.
The Euler equations are used to describe the quadrotor rotational dynamics, where the torques generated by the rotors are defined in the body frame as τ = [τ_φ, τ_θ, τ_ψ]^T, and the contribution of the rotating blades is defined by the rotor moment of inertia I_r and the hover rotational speed Ω_r. Thus, neglecting the aerodynamic effects, the rotational dynamic model is given by:

I ω̇_B = τ − ω_B × (I ω_B) + I_r (ω_B × k̂_B) Ω_r,

where ω_B = [p, q, r]^T is the UAV angular velocity expressed in the body frame and I its inertia tensor. For the kinematic model, we exploited the quaternion-based formulation, thus avoiding singularity issues, i.e.:

q̇_v = (1/2)(q_s ω_B + q_v × ω_B),   q̇_s = −(1/2) q_v^T ω_B,

where q_v = [q_1, q_2, q_3]^T is the vectorial component of the quaternion q while q_s is the scalar one. Then, to model the translational motion of the UAV in the inertial frame, p_I = [x_I, y_I, z_I]^T, we used the classical Newton's equations, expressed in the inertial frame as:

m p̈_I = Ξ^T F_B + m g,

where Ξ is the 3−1−2 inertial-to-body rotation matrix, m is the UAV mass, F_B = [0, 0, F_z]^T is the thrust vector expressed in the body frame, and g = [0, 0, −g]^T is the gravitational acceleration defined in the inertial frame.
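As an illustration of the quaternion-based kinematics above, the following Python sketch (not part of the original simulator, which was developed in MATLAB/Simulink) propagates the attitude quaternion by one explicit-Euler step; the step size, integration scheme, and renormalisation are illustrative choices:

```python
import numpy as np

def quat_kinematics_step(q, omega_B, dt):
    """One explicit-Euler step of the quaternion kinematics with
    q = [q1, q2, q3, qs] (vector part first, scalar last), driven by the
    body angular velocity omega_B = [p, q, r]. A minimal sketch of the
    singularity-free attitude propagation used in the model."""
    qv, qs = np.asarray(q[:3], float), float(q[3])
    w = np.asarray(omega_B, float)
    qv_dot = 0.5 * (qs * w + np.cross(qv, w))   # vector-part derivative
    qs_dot = -0.5 * qv @ w                      # scalar-part derivative
    q_new = np.concatenate([qv + dt * qv_dot, [qs + dt * qs_dot]])
    return q_new / np.linalg.norm(q_new)        # renormalise to unit length
```

Renormalising after each step keeps the quaternion on the unit sphere despite the first-order integration error.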
In terms of the generated thrust and torque, each rotor is able to generate a vertical force F_i and a moment M_i as:

F_i = k_F ω_i²,   M_i = k_M ω_i²,

where the proportional constants k_F and k_M can be determined by experimentation or by simulation (see, e.g., [38]). Then, it is possible to relate the control inputs, i.e., F_z, τ_φ, τ_θ, τ_ψ, to the quadrotor characteristic length L, the rotors' angular velocities ω_i, and the thrust F_i, knowing that:
• the thrust command F_z is the sum of the thrust contributions F_i generated by each rotor, i.e., F_z = F_1 + F_2 + F_3 + F_4;
• the rolling torque τ_φ is produced by a different angular velocity variation of the rotors 2 and 4, i.e., τ_φ = L(F_4 − F_2), while the pitching torque τ_θ is produced instead by a different angular velocity variation of the rotors 1 and 3, i.e., τ_θ = L(F_3 − F_1);
• the yawing torque τ_ψ derives from the drag generated by the propellers on the quadrotor itself, with a torque direction opposite to the one of the rotors' motion, such that τ_ψ = M_1 − M_2 + M_3 − M_4.
Further details can be found in [39].
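The mixing rules above can be sketched as follows; the numerical values of k_F, k_M, and L are placeholders, and the torque sign conventions are assumptions consistent with the "+" configuration described earlier:

```python
import numpy as np

def rotor_mixing(omega, k_F=6.11e-8, k_M=1.5e-9, L=0.2):
    """Map the four rotor speeds omega = [w1, w2, w3, w4] (rad/s) to the
    collective thrust F_z and the body torques for a "+" configuration
    quadrotor (rotors 1/3 on the pitch axis, 2/4 on the roll axis).
    k_F, k_M, and L are placeholder values, not the M-100 constants."""
    omega = np.asarray(omega, float)
    F = k_F * omega**2                 # per-rotor thrust F_i = k_F * w_i^2
    M = k_M * omega**2                 # per-rotor drag moment M_i = k_M * w_i^2
    F_z = F.sum()                      # total thrust command
    tau_phi = L * (F[3] - F[1])        # roll: rotors 4 vs 2
    tau_theta = L * (F[2] - F[0])      # pitch: rotors 3 vs 1
    tau_psi = M[0] - M[1] + M[2] - M[3]  # yaw: drag imbalance CCW (1,3) vs CW (2,4)
    return F_z, tau_phi, tau_theta, tau_psi
```

In hover, all rotors spin at the same speed and the three torques cancel, leaving only the collective thrust.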

Vine Row Modelling from LCG Maps
Contrary to the 2D case in [37], in which maps were processed offline to derive the single cutting plane, in this paper, we use the LCG maps in real time within the navigation loop of a quadrotor. We first assume that we can retrieve a 3D plane, defined as a_T x + b_T y + c_T z + d_T = 0, that approximates the terrain directly from the LCG map defined with respect to the body frame. Then, at each time step k, we need to properly roto-translate this plane to intersect the 3D mesh. In this way, we can obtain the 2D reference slice that will be exploited in the ellipsoid method, as described in Section 3.1 (see also Figure 2). The required roto-translation depends on the UAV relative flight altitude with respect to the (approximated) terrain, i.e., z_k, as well as on its attitude. Assuming that, at time k, the UAV orientation with respect to the local/inertial frame is defined by the Euler angles (φ_k, θ_k, ψ_k), we can obtain the normal vector to the new plane by a rotation Ξ_k, as:

n_k = Ξ_k n_T = [Φ_k][Θ_k][Ψ_k] n_T,

where [Φ_k], [Θ_k], and [Ψ_k] are the rotation matrices, and n_T = [a_T, b_T, c_T]^T is the normal to the terrain-approximating plane. Then, we translate this new plane by the vector [x_k, y_k, z_k]^T, representing the UAV location at time k, in order to make it pass through the quadrotor CoM. In this way, we can identify the last parameter defining the new plane at time k as:

d_k = −n_k^T [x_k, y_k, z_k]^T.

By intersecting this approximating plane with the 3D mesh, the resulting 2D slice (the dark green section in Figure 2) for the i-th row turns out to be composed of N_i vertices and the N_i − 1 segments connecting them.
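A minimal sketch of the plane roto-translation is given below; the composition order of the elementary rotation matrices for the 3-1-2 sequence is an assumption, since the paper does not spell out the matrix product:

```python
import numpy as np

def slice_plane(n_T, p_k, phi, theta, psi):
    """Rotate the terrain-approximating plane normal n_T = [a_T, b_T, c_T]
    by the UAV attitude (phi, theta, psi) and translate the plane so that
    it passes through the UAV centre of mass p_k = [x_k, y_k, z_k].
    Returns (n_k, d_k) for the plane n_k . x + d_k = 0."""
    c, s = np.cos, np.sin
    Rz = np.array([[c(psi), -s(psi), 0], [s(psi), c(psi), 0], [0, 0, 1]])
    Rx = np.array([[1, 0, 0], [0, c(phi), -s(phi)], [0, s(phi), c(phi)]])
    Ry = np.array([[c(theta), 0, s(theta)], [0, 1, 0], [-s(theta), 0, c(theta)]])
    Xi_k = Ry @ Rx @ Rz                  # assumed composition of the 3-1-2 sequence
    n_k = Xi_k @ np.asarray(n_T, float)  # rotated plane normal
    d_k = -n_k @ np.asarray(p_k, float)  # plane passes through the CoM
    return n_k, d_k
```

With zero attitude, the plane keeps the terrain normal and is simply shifted to the UAV altitude, as expected.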
To model these slices, and to later use them in the ellipsoid method, we need to retrieve, for each pair of consecutive vertices V_i = (x_i, y_i) and V_{i+1} = (x_{i+1}, y_{i+1}) expressed in the body frame, the approximating line equation of the form a_i x + b_i y + c_i = 0. We retrieve the coefficients a_i, b_i, and c_i as:

a_i = y_{i+1} − y_i,   b_i = x_i − x_{i+1},   c_i = x_{i+1} y_i − x_i y_{i+1}.

Once the straight line passing through the vertices V_i and V_{i+1} is identified, we compute its characteristic slope β_i, which will later be used in the ellipsoid method, as:

β_i = arctan( (y_{i+1} − y_i) / (x_{i+1} − x_i) ).
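The per-segment line fit can be sketched as follows; the coefficients use the standard two-point form (their normalisation in the paper may differ), and the use of arctan2 for the slope is an implementation choice that avoids division by zero for vertical segments:

```python
import numpy as np

def segment_line(v1, v2):
    """Line a*x + b*y + c = 0 through two consecutive slice vertices
    v1 = (x1, y1) and v2 = (x2, y2), plus its slope angle beta (rad)."""
    (x1, y1), (x2, y2) = v1, v2
    a = y2 - y1                           # two-point form coefficients
    b = x1 - x2
    c = x2 * y1 - x1 * y2
    beta = np.arctan2(y2 - y1, x2 - x1)   # characteristic slope of the segment
    return a, b, c, beta
```

Both vertices satisfy the resulting line equation by construction.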

Ultrasound Sensor Modelling
Ultrasonic sensors are based on the measured propagation time of an ultrasonic signal. They emit high-frequency sound waves that are reflected by objects, including transparent and other demanding surfaces where optical technologies fail. In particular, the distance measurement is based on a time-of-flight measurement: the sensor calculates the time between the sending and the receiving of the reflected sound signal. In this work, we assume that the UAV is equipped with five ultrasonic sensors: one on each quadrotor arm, to measure the distance in the x-y body plane with respect to the vine rows, and one at the bottom of the UAV, for measuring the relative flight altitude with respect to the terrain.
In particular, we assume that the US sensor returns a distance measurement d given by the sum of two contributions, i.e.:

d = d_⊥ + e_d,

where e_d is an unknown-but-bounded error defined in the range e_d ∈ [e_d,min, e_d,max], whereas d_⊥ is the distance measured from the UAV CoM to the LCG map on the intercepted 2D slice, as depicted in Figure 3. It is important to remark that e_d is defined as e_d = e_s + e_m, where e_s is the ultrasound sensor accuracy error, i.e., e_s ∈ [e_s,min, e_s,max], and e_m is the LCG approximation error, i.e., e_m ∈ [e_m,min, e_m,max]. Consequently, we have that d ∈ [d_min, d_max], with

d_min = d_r + e_d,min,   d_max = d_r + e_d,max,

where d_r is the real distance from the quadrotor CoM to the vine row.
The approximation error e_m takes into account the error introduced by the use of the map, which provides only an approximation of the crop location and shape. On the other hand, the UAV-LCG distance d_⊥ represents the input to the row selection sub-phase, together with the information arising from the map and the position p_{k|k−1} and attitude θ_{k|k−1} predicted at time k, given the measurements up to time k − 1, as depicted in Figure 4 and described in detail in Section 3.1.

3D Navigation Filtering Scheme
The working logic of the proposed sensor fusion approach is depicted in Figure 4, from which it is possible to recognise the classic two-phase approach of a Kalman-like filtering scheme (prediction and update), with the so-called 3D distance filter in between, enclosed in the green box. Given the estimate of the state x̂_{k−1} and the error covariance matrix P_{k−1} at time k − 1, we can obtain the state and covariance projection, i.e., x̃_k and P̃_k, at time k during the Kalman filter prediction phase. In the specific case, the state vector includes the UAV position p̃_k = [x̃_k, ỹ_k, z̃_k]^T and attitude (φ̃_k, θ̃_k, ψ̃_k). The heading angle θ̃_k and the flight altitude z̃_k represent the inputs to the 2D map update block, allowing for the definition of the intersecting plane and the 2D slice, as described in Section 2.2. Together with the UAV position p̃_k, the heading θ̃_k, the measured distance d_k^⊥, and the estimation error e_d, the 2D slice represents the input to the row selection phase. This block leads to the identification of the reference segment for the calculation of the parallel cuts, defined by the parameters a_k, β_min,k, and β_max,k. The last step of the 3D distance filter allows for the propagation of the uncertain ellipsoid defined by x̃_k and P̃_k, obtaining the minimum-volume ellipsoid E_k that contains the feasible point set, defined by its center x_k and covariance matrix P_k. These data, together with those provided at time k by the GPS and IMU, i.e., (x_k^GPS, P_k^GPS) and (x_k^IMU, P_k^IMU), respectively, represent the inputs to the last phase of the Kalman filter, i.e., the update phase. The main outputs are the estimate of the state x̂_k and the error covariance matrix P_k at time k. Each block is detailed in the following sections, starting from the innovative part of this approach, i.e., the 3D distance filter.

3D Distance Filter
As represented in Figure 4, the 3D distance filter can be split into three main sub-phases: (i) the 2D map update, which includes the modelling of the vine row starting from the LCG maps, as described in Section 2.2; (ii) the row selection, a moving-window approach that allows for the identification of the row section of interest, given the perceived UAV location (in this way, the number of segments of the 2D slice, among which the one complying with the measured distance is identified, is reduced, together with the computational burden of the algorithm); and (iii) the final distance filter update, which is based on the parallel-cuts ellipsoid method [40], first proposed in [37] as a filtering scheme.
At each time step k, we first determine the reference plane a_k x + b_k y + c_k z + d_k = 0 following the approach described in Section 2.2; this plane coincides with the UAV body xy-plane. In this framework, the only relevant attitude angle is represented by the heading (pitch) angle θ_k. Then, we assume that, at time k, the quadrotor location (x_k, y_k) is affected by uncertainty, represented by an ellipsoid E_k defined as

E_k = { x : (x − x_k)^T P_k^{−1} (x − x_k) ≤ 1 },

where P_k is the ellipsoid shape matrix. Then, we collect the distance measurement d_k^⊥ between the UAV and the row, and we select the subset of segments composing the 2D slice that lie in the proximity of the UAV location. Among this subset, we identify the i-th segment whose distance from the UAV best matches the measured one d_k^⊥, and we retrieve the corresponding approximating line a_i x + b_i y + c_i = 0. Because all these measures are affected by errors, multiple points inside the uncertainty ellipsoid comply with the ultrasound measure. Hence, it is possible to define a lower and an upper bound on the offset from the i-th approximating line to limit the points satisfying this constraint. In particular, given the UAV orientation θ_k, the line slope β_i, and the bounds on the sensor measurement, i.e., d_min and d_max, we can define the lower and upper offsets, β_min and β_max. These offsets identify two straight lines, parallel to the one representing the i-th segment. The area of the uncertainty ellipsoid E_k constrained between these two lines is the so-called feasible point set F, represented by the yellow area in Figure 5. The parallel-cuts ellipsoid method, described in [40], allows the simultaneous use of the constraints imposed by the pair of parallel cuts to generate the new ellipsoid E_k′ having minimum volume and containing all the points between the two half-spaces (see Figure 6).
Hence, let us compute the algebraic distance of each half-plane from the center of the ellipsoid (x_k, y_k) as

α_max = (β_max − a^T x_k) / √(a^T P_k a),   α_min = (a^T x_k − β_min) / √(a^T P_k a).   (14)

Figure 6. Propagation of the uncertainty ellipsoid E_k to time k + 1, i.e., E_k′, using the parallel-cuts method. E_k′ represents the minimum-volume ellipsoid containing the feasible point set (yellow area).
Then, we can compute the propagated ellipsoid E_k′ (see Figure 6) through its center x_k′ and shape matrix P_k′, where n is the space dimension, and σ, τ, and δ are the dilation, step, and expansion parameters, respectively. For the parallel-cuts approach, these parameters can be computed as reported in [40].
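Since the parallel-cuts parameter formulas of [40] are not reproduced here, the following sketch implements the closely related classical single deep-cut ellipsoid update instead; applying it sequentially to the two half-spaces of the slab yields an enclosing ellipsoid, although generally slightly larger than the true parallel-cuts result:

```python
import numpy as np

def deep_cut(x, P, a, beta):
    """Minimum-volume ellipsoid containing the part of the ellipsoid
    {z: (z - x)' P^{-1} (z - x) <= 1} that satisfies a' z <= beta
    (a single deep cut). The parallel-cuts variant of [40] tightens
    this by using both half-spaces of the slab at once."""
    n = len(x)
    g = np.sqrt(a @ P @ a)
    alpha = (a @ x - beta) / g          # normalised cut depth
    if alpha >= 1:                      # cut excludes the whole ellipsoid
        raise ValueError("empty intersection")
    if alpha < -1.0 / n:                # cut does not shrink the ellipsoid
        return x, P
    Pa = P @ a / g
    x_new = x - (1 + n * alpha) / (n + 1) * Pa
    P_new = (n**2 * (1 - alpha**2) / (n**2 - 1)) * (
        P - (2 * (1 + n * alpha)) / ((n + 1) * (1 + alpha)) * np.outer(Pa, Pa)
    )
    return x_new, P_new
```

For a central cut (alpha = 0) in 2D, the center moves a third of the way towards the cut and the determinant of the shape matrix shrinks, i.e., the uncertainty volume decreases.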

Kalman Filter for Sensor Fusion
Since the early 1960s, much effort has been devoted to modelling processes {y_k} in the state-space form, i.e.:

y_k = H x_k + v_k,

where the state x_k obeys the recursion:

x_{k+1} = Φ x_k + w_k.

The measurement noise v_k and the process noise w_k are assumed to be vector-valued zero-mean white-noise processes with known covariances, i.e., R and Q, respectively. On the other hand, the initial state x_0 is assumed to have zero mean and covariance matrix Π_0, and to be uncorrelated with {v_k} and {w_k}. We also assume that the matrices Φ, H, Q, R, S, and Π_0 are known a priori, following the classical assumptions of the Kalman filter (see also [41]). Then, we can define the estimation error e_k as the difference between x_k and its estimate x̂_k, and the error covariance matrix can be defined as:

P_k = E[e_k e_k^T].

Assuming that x̂_k⁻ is the prior estimate of x̂_k, the new estimate can be updated by combining the old estimate with the measured data as:

x̂_k = x̂_k⁻ + K_k (y_k − H x̂_k⁻),   (19)

where K_k is the Kalman gain defined by:

K_k = P_k⁻ H^T (H P_k⁻ H^T + R)^{−1},

and P_k⁻ is the prior estimate of P_k, which can be updated as:

P_k = (I − K_k H) P_k⁻.

Then, the state and covariance projection can be achieved as follows:

x̂_{k+1}⁻ = Φ x̂_k,   P_{k+1}⁻ = Φ P_k Φ^T + Q.

This classical filtering approach was used in the proposed navigation scheme to fuse the data coming from the sensors mounted on the UAV, as depicted in Figure 4.
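The update and projection steps of the classical filter can be sketched as a single function, assuming uncorrelated process and measurement noise (S = 0):

```python
import numpy as np

def kalman_step(x_prior, P_prior, y, Phi, H, Q, R):
    """One update-then-predict cycle of the linear Kalman filter, as used
    for the GPS/IMU/distance-filter fusion (cross-covariance S assumed 0)."""
    # Update: correct the prior with the measurement y
    S_k = H @ P_prior @ H.T + R                   # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S_k)        # Kalman gain
    x_post = x_prior + K @ (y - H @ x_prior)      # corrected state estimate
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    # Predict: project the estimate to the next time step
    x_next = Phi @ x_post
    P_next = Phi @ P_post @ Phi.T + Q
    return x_post, P_post, x_next, P_next
```

With equal prior and measurement variances, the update splits the difference between the prediction and the measurement and halves the error covariance, as expected.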

Results
The proposed case study involves a DJI Matrice 100 (M-100) quadrotor that shall autonomously navigate between the vine rows of a vineyard located in Barolo (Piedmont, Italy). The selected field extends over a surface of about 0.7 hectares, with an elevation that ranges from 460 m to 490 m above sea level. The space between the vine plants and the inter-row space are about 0.9 m and 2.5 m, respectively. As is typical for precision agricultural applications involving drones [42,43], the UAV is equipped with one GPS module (CP.TP.000025 [44]), one IMU sensor, and ultrasound sensors (HC-SR04 [46]). The main features of the drone and of the equipped sensors are reported in Tables 1 and 2, respectively. The 3D LCG map was obtained from a remote sensing mission, as detailed in [35]. The validation of the proposed navigation filtering scheme was carried out by exploiting a simulator developed with MATLAB/Simulink 2019b, running on an Intel Core i7-6500U CPU @ 2.50 GHz, with 12 GB of RAM and a 1-TB hard-disk drive. The presence of several sensors required the development of a multi-rate simulator, where a characteristic sample rate was defined for each sensor according to the classical settings (see, e.g., [47][48][49]). In particular, we have: (i) 5 Hz for the GPS; (ii) 100 Hz for the IMU; and (iii) 20 Hz for the ultrasound sensors. The different output data rates made it necessary to implement an ad hoc strategy for data synchronisation. In particular, we followed the approach proposed in [50], where the data provided by the slower sensors (i.e., GPS and ultrasonic) are fused with the data from the faster sensor (IMU) only when available. This means that the global sample time coincides with the fastest one, i.e., 0.01 s. The opposite approach, proposed in [51], where the faster measurement data are adapted to the slower ones, was discarded because it might cause a lower accuracy of the final estimation.
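The adopted synchronisation strategy, in which the slower sensors are fused only on the steps where a new sample is available, can be sketched as follows (the original simulator is in MATLAB/Simulink; this Python sketch only illustrates the scheduling logic):

```python
def sensor_schedule(t, dt=0.01, rates=(("imu", 100), ("us", 20), ("gps", 5))):
    """Return which sensors provide a new sample at simulation time t,
    given the multi-rate setup of the paper (IMU 100 Hz, ultrasound 20 Hz,
    GPS 5 Hz) and a global step equal to the fastest period (0.01 s)."""
    k = round(t / dt)                     # global step index
    return [name for name, hz in rates
            if k % round(1.0 / (hz * dt)) == 0]
```

At t = 0 all three sensors fire; on most steps only the IMU does, so the filter runs at 100 Hz and the GPS and ultrasound corrections are applied every 20 and 5 steps, respectively.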
The aforementioned simulator is based on the linearised dynamics of the UAV, obtained by linearising the nonlinear dynamics model presented in Section 2.1 with respect to the hovering condition. For the results presented below, we set the UAV initial position at [36, 376, 37.5]^T m with respect to the local frame, with an airspeed of [10, 2.9, 0.3]^T m/s, null angular velocity, and an attitude of [0, π/90, −π/6]^T rad, placing the UAV at the beginning of the first inter-row, at non-zero speed, and pointing toward its end. Figure 7 provides an overview of the simulation framework, where it is possible to observe the LCG map and the evolution of the UAV location (black circles) and the corresponding 2D intersecting planes (yellow rectangles) and 2D slices (green polyhedrons). The motion of the UAV was perturbed by a uniform, zero-mean random noise affecting both the position and attitude, with a standard deviation of 0.1 m and 10 degrees, respectively. Before starting the validation of the navigation scheme, it was crucial to determine the optimal sensor configuration and to identify the minimum number of ultrasound sensors to be mounted onboard the UAV in order to obtain the desired estimation accuracy level. Figure 8 reports the results in terms of the lateral (E_lat^i) and longitudinal (E_long^i) estimation errors obtained by comparing three different configurations: (i) with i = 1 distance sensor, mounted at 45 degrees with respect to the main axis and pointing towards the right vine row; (ii) with i = 2 sensors, one on each side of the UAV; and (iii) with i = 4 ultrasound sensors, each one mounted at the end of one of the UAV arms. In Figure 8, we also report the performance obtained without the use of the LCG map (black lines) to highlight the benefits achievable with the proposed navigation scheme.
Once the optimal sensor configuration had been assessed, we proceeded with the navigation filter validation in the aforementioned simulation environment, whose results are depicted in Figure 9. In particular, we compared the evolution of the UAV's true position (yellow circles) and the corresponding confidence ellipses, i.e., representing all the possible positions where the vehicle could be at a given time instant according to the covariance matrix of the filter estimation, when the GPS and IMU data were fused (red ellipses) or not fused (black ellipses) with the information provided by the distance filter (which includes the LCG map in the estimation process).
(a) (b) Figure 9. In (a), the evolution of the UAV's real position (yellow circles) within the vine rows and the corresponding uncertainty ellipsoids when the LCG maps and ultrasounds are included (red lines) or not included (black lines) in the navigation filter. In (b), it is possible to observe that the true position is always inside the uncertainty ellipsoids, but it is almost aligned with the red ellipses' centers.
A statistical analysis of the estimation error was performed. As shown in Figure 10, we retrieved the normal distribution of the lateral and longitudinal position errors achievable when the LCG maps are exploited (red bars) or not exploited (blue bars). The obtained standard deviations are reported in Table 3.  Last, we checked the computational compatibility of the proposed navigation scheme, measuring the computational demand for each phase of the filter and for the entire navigation block, as represented in Figure 11.

Discussion
Regarding the study on the identification of the optimal number of sensors to be installed onboard the quadrotor, the obtained results confirmed that better performance can be achieved by increasing the number of equipped ultrasound sensors, up to a certain level. Above a given number of sensors, the achievable benefits do not balance the additional weight to be carried onboard. In detail, when only one distance sensor is employed, the accuracy of the new estimated position is generally improved with respect to the case when only the GPS and IMU data are used for position estimation. In particular, an average reduction close to 47% was obtained for the lateral estimation error, versus 35% for the longitudinal one. Adding a second ultrasound sensor to gather data from both vine rows allowed for the further improvement of those percentages, up to 64% for the lateral estimation and 38% for the longitudinal one. Better results were obtained with the four-sensor configuration, with a reduction of the average estimation error of 72% for the lateral position and 49% for the longitudinal one. Further analyses were made by doubling the number of ultrasonic sensors on each arm of the UAV. The results showed that the obtained improvements, i.e., an average reduction of 74% for the lateral estimation error and 55% for the longitudinal one, did not balance the undesired additional computational effort and system complexity. In addition, as shown in Figure 9, the comparison between the prediction and the uncertainty propagation, obtained by including (red ellipses) or not including (black ellipses) the information provided by the ultrasonic sensors and the LCG map, highlighted the benefits achievable with the proposed filtering approach. The algorithm's effectiveness was also supported by the obtained value of the standard deviation reduction when exploiting the proposed distance filter.
Indeed, the standard deviation reduction achieved was close to 76% for the lateral error estimation and to 42% for the longitudinal one.
According to the presented results, the proposed 3D distance filter, in which the LCG maps were used to improve the quality of the location estimation, yields considerable improvements over a classical filtering scheme involving only the GPS and IMU sensors. In particular, the average estimation error and the standard deviation of the UAV lateral and longitudinal locations were significantly reduced (see Figure 10). The lateral estimation is more accurate than the longitudinal one, since the 3D distance filter and the maps provide more accurate information in the lateral direction. This behaviour is consistent with the need to avoid collisions with the vine rows during autonomous UAV navigation.
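The anisotropy of this improvement can be illustrated with a minimal Kalman measurement update in which the distance-based measurement is assumed to be more accurate laterally (across the row) than longitudinally (along the row). The numerical values below (prior covariance, measurement noise) are purely illustrative assumptions, not the parameters used in the paper:

```python
import numpy as np

# Prior estimate of the UAV planar position [lateral, longitudinal] (m),
# e.g., from the GPS/IMU prediction step; identical uncertainty on both axes.
x_prior = np.array([0.3, 12.0])
P_prior = np.diag([0.25, 0.25])   # prior covariance (m^2)

# Hypothetical measurement from the distance/LCG-map fusion: the ultrasound
# readings constrain the lateral position much better than the longitudinal one.
z = np.array([0.1, 11.8])
R = np.diag([0.01, 0.09])         # lateral noise << longitudinal noise (m^2)
H = np.eye(2)                     # direct position measurement

# Standard Kalman measurement update
S = H @ P_prior @ H.T + R                    # innovation covariance
K = P_prior @ H.T @ np.linalg.inv(S)         # Kalman gain
x_post = x_prior + K @ (z - H @ x_prior)     # updated state
P_post = (np.eye(2) - K @ H) @ P_prior       # updated covariance

std_post = np.sqrt(np.diag(P_post))
print(std_post)  # posterior std: markedly smaller on the lateral axis
```

With these assumed noise levels, both posterior standard deviations shrink with respect to the prior, but the lateral one shrinks far more, mirroring the behaviour observed in the simulations.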
Last, the analysis of the computational costs of each phase of the filtering scheme shows that the additional computational cost introduced by the 3D distance filter is compatible with the navigation time constraint. To guarantee that all the data provided by the sensors are properly processed and filtered so as to estimate, at each time step, the UAV location and attitude, the computation time required to solve the navigation task must be lower than the navigation sample time, i.e., lower than 0.01 s. When this condition is not satisfied, some data are lost, compromising the navigation task. The results presented in Figure 11 show that the most computationally demanding phase is the 3D/2D map conversion, whereas the distance filter update and the row selection phases are the fastest, with computation times below 10⁻⁴ s. Overall, the total computation time measured during the validation, with an average value of about 0.005 s, demonstrates that the proposed filtering scheme is suitable for real-time operation.
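The real-time feasibility check described above can be sketched as a per-phase timing budget: each stage of the navigation step is timed, and the total must stay below the sample time. The phase names follow the text; the placeholder functions (and the inclusion of a generic Kalman update stage) are assumptions standing in for the actual implementations:

```python
import time

SAMPLE_TIME = 0.01  # navigation sample time (s): the hard real-time budget

# Placeholder phase functions; the sleeps only mimic plausible relative costs.
def map_conversion_3d_to_2d(): time.sleep(0.002)  # most demanding phase
def row_selection(): pass                          # fastest phases
def distance_filter_update(): pass
def kalman_update(): time.sleep(0.001)             # assumed extra stage

phases = {
    "3D/2D map conversion": map_conversion_3d_to_2d,
    "row selection": row_selection,
    "distance filter update": distance_filter_update,
    "Kalman update": kalman_update,
}

total = 0.0
for name, phase in phases.items():
    t0 = time.perf_counter()
    phase()
    dt = time.perf_counter() - t0
    total += dt
    print(f"{name}: {dt * 1e3:.3f} ms")

# If the whole step does not fit in one sample, sensor data would be lost.
assert total < SAMPLE_TIME, "navigation step exceeds the sample time"
print(f"total: {total * 1e3:.3f} ms < {SAMPLE_TIME * 1e3:.0f} ms budget")
```

The same pattern, applied to the real filter stages, is what underlies the measurements reported in Figure 11.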

Conclusions
In this paper, we presented an improved UAV localisation system enabling the autonomous navigation of rotary-wing UAVs within vine rows, in order to perform scouting and crop-monitoring operations. The proposed method extends the ellipsoid method-based filtering approach to the 3D case, and its efficacy was proven in simulation. The improvement in the estimation of the UAV location when operating within the crops, despite the adoption of COTS sensors, is promising to the point that experimental validation campaigns have already been planned as the next research step. The first phase involves the realisation of the 3D LCG maps of the experimental testbed, i.e., the vineyard located inside the campus of the University of Turin (DiSAFA facilities). Then, once the testing UAV has been customised and calibrated, mounting the selected sensor suite and uploading the proposed navigation algorithm, an extensive experimental campaign will be carried out to assess the autonomous navigation capabilities achievable in a real scenario. Last, additional flight tests will be carried out in a vineyard in Barolo (Piedmont, Italy) to compare the simulation and experimental results.