1. Introduction
Although pilotless aircraft have a long history of use, only recent technological advances have made drones more prevalent, accessible, and affordable [1]. These advances have made Remotely Piloted Aircraft Systems (RPAS) available to hobbyists and professional users, from agriculture and surveying to mining and search and rescue, at a size, price, and ease of use appropriate to each task. RPAS of much larger size and sophistication are used for security and defence purposes along with burgeoning commercial applications. These larger Remotely Piloted Aircraft (RPA) are ready for use in National Airspace Systems (NAS), with NASA conducting research during the second decade of this century into the integration of RPAS into the NAS [2]. In Australian aviation, the ever-increasing use of RPAS is evidenced by the surging number of Remote Pilot Licenses (RePLs) issued by the Civil Aviation Safety Authority, and drone activity is projected to grow by 20% per year. It is currently estimated that there are 1.5 million drone flights annually in Australia, rising to a predicted 60.4 million flights in 2043, of which over 70% are expected to be conducted by goods delivery operations [3]. This growth may be increasingly constrained by regulatory considerations rather than technological limitations. With such a rapid increase in activity in and around the urban environments where most recipients of delivered goods reside, an increase in accidents and incidents can be expected. There is an obvious and growing need to recognize and regulate specific RPA pilot skills for specific flying tasks [4]. Without the skills needed to navigate obstacle-strewn environments such as urban settings, the growth of RPA activity will also be constrained by the human element, which remains vital to RPAS.
Although the sector is driven by technological innovation, most RPAS accidents have been identified as arising from failures of that same technology [5]. Notwithstanding these technology failures, researchers have become concerned about the role of the human pilot in RPAS accidents [6,7], and RPA operator skills have been identified as a leading cause of safety occurrences. Lack of depth perception is a greater contributor to RPA accidents than it is in general aviation and across all aviation operations in Australia [7]. The predicted increase in urban drone activity for the delivery of goods will increase the importance of obstacle avoidance. Moving safely through an environment is summarized by Lester et al. [8] as “A particularly complex behavior relying on a range of perceptual, mnemonic, and executive computations. It requires the integration of different types of spatial information, the selection of the appropriate navigation strategy, and, if circumstances change, switching between strategies.” (p. 1021).
Exercising accurate depth perception can be problematic for RPA pilots owing to the unique challenges they face. Some challenges are generic to most flights, such as not being co-located with the drone and not always being aligned with its direction of travel. Compared with flying a conventionally crewed aircraft, the RPA pilot is deprived of four of their five senses; only vision remains available to them [9]. Not being co-located with the aircraft also makes perceptual tasks, including establishing accurate depth perception, more difficult through the loss of supporting cues such as motion feedback [10]. There can also be challenges specific to the sortie being undertaken, such as the issues unique to flying around forests described by Hartley et al. [11]. Distance estimation difficulties also appear to increase with viewing distance: Lappin et al. [12] identified studies indicating a foreshortening of estimates at greater viewing distances. All of these factors challenge the depth perception abilities of RPA pilots.
When operating in third-person view, it is difficult to maintain depth perception and to judge distances to objects in order to avoid them [13], and these difficulties only increase with increasing distance between the drone and the pilot [14,15]. The difficulty is compounded when the drone pilot remains stationary while conducting operations, as is often the case, because motion parallax is then unavailable to support depth perception. Further difficulties in perceptual awareness arise when the aircraft is not oriented in the same direction as the pilot, as the principle of motion compatibility is eroded [16]. When this happens, the pilot’s control inputs may not move the aircraft in the intended direction, making it easier to fly the aircraft into an object.
An important skill in depth perception is the ability of the drone pilot to envisage themselves in a different position, that is, a change in egocentric perspective [17]. This has been identified as “essential for navigation in large-scale environments” [18] (p. 414). This egocentric perspective is one component of depth perception; establishing it requires estimating the distances between the RPAS pilot and objects in the viewed environment, including the drone itself. The exocentric distance estimation between the drone and obstacles is also part of depth perception (Figure 1). Absent or inaccurate distance estimation is a causal factor in incidents and accidents in RPAS operations.
Solutions can come from both technology and improved depth perception skills among RPA operators. In response, drone manufacturers are developing technical solutions to enhance pilots’ depth perception. These include long-standing technologies such as primary radar, which can be used to detect other aircraft and objects on the ground that could pose a risk, and LIDAR, which is also available for obstacle detection [19]. With the more recent development of machine learning, Electro-Optical (EO) sensors are becoming a viable option for detecting collision threats, although the authors concede that some issues need to be overcome before this option is truly practical. Along with the emergence of fully automated flights, these solutions would seem to reduce the need for the remote pilot to perceive the distances between the drone and obstacles in the flying field. However, such solutions can be cumbersome for light drones, expensive to install and operate, and unavailable to pilots of smaller RPA in the micro, very small, and small categories. One technical solution offered by drone manufacturers for identifying potential obstacles is the onboard camera. While its primary purpose is videography, it can also be used for safety. Cameras are ubiquitous, fitted to even the most basic off-the-shelf prosumer aircraft from sub-250 g models upwards, and offer high-definition views of the flight path and surrounding environment as a monoscopic image projected onto a 2D screen at the ground control station.
While cameras are readily available, the difficulty of implementing other supporting aids means there is likely to remain a need for Visual Line of Sight (VLOS) pilots to establish and maintain depth perception skills without the support of aids.
Given the ease of use and availability of onboard cameras, this research examines whether the image from such a camera can help ab initio RPA VLOS pilots establish better depth perception, as measured by exocentric distance estimation. The specific research objective of this study was therefore to evaluate whether the use of onboard camera imagery can improve the accuracy of exocentric distance estimation among ab initio RPA pilots operating under VLOS conditions at varying operational distances.
2. Literature Review
While photography would seem to be the primary purpose of an onboard camera, it can be put to other uses, including using the images transmitted to the Ground Control Station (GCS) for navigation and obstacle avoidance. Alvarez et al. [20] used a small multi-copter to demonstrate the efficacy of a camera-based obstacle avoidance system by flying it through narrow corridors littered with obstacles. The ability to use a camera for this task has come about through greater computing power and better algorithms [21]. The authors of [21] also identified camera vision as more advantageous than LIDAR for VLOS pilots.
The cameras used for obstacle avoidance can be monocular, stereo, or depth cameras [22], although Mohta et al. [21] do not believe the latter are suitable for use in outdoor, sunny settings. Stereo cameras were preferred over monocular ones [21] because of their ease of initialization prior to flight. The benefits of camera vision have been identified as the range the camera can provide, low power requirements, and low weight [23]. The image from the camera can be used to enhance the depth perception skills of the RPA pilot.
Although humans appear to have an innate ability to perceive depth, depth perception and the ability to perceive in three dimensions have been described as something of a paradox for humans [24]. Walk and Gibson [25] experimented with infants placed on a centre board with a shallow drop-off on one side and a much larger fall on the other, an apparatus described as a “visual cliff”. The infants consistently refused to crawl towards the side of the centre board with the large fall, even with the inducement of their mother standing on that side. The researchers concluded that the “average human infant discriminates depth as soon as it can crawl.” [25] (p. 23). This innate ability to perceive depth develops as children grow older, aided by environmental experiences [24].
Non-visual information can also be used to detect depth in the environment; perception of environmental elements may be established through physical stimuli such as heat, electrical fields, or audible cues [26]. When audio cues were presented to subjects, they were more readily able to visually detect objects if the audio and visual stimuli were presented in unison; if the stimuli were presented at different times, there was no improvement in detection [27]. This characteristic has also been noted in RPAS pilots: Dunn [28] found that audio stimuli improved students’ horizontal distance accuracy when they flew in simulated wind conditions, i.e., a more challenging flight condition.
Hing et al. [10] identified problems with using the camera for flying in cluttered environments, including a reduced field of view, which can limit the pilot’s ability to know where the extremities of the aircraft are located. As the camera angles changed, more mental effort was required from the drone pilot to build a picture of the flying environment, and this effort can induce vertigo. While it is difficult to gain depth perception in VLOS flying, Beyond Visual Line of Sight (BVLOS) operations further increase the difficulty because the aircraft is out of the pilot’s sight. As the aircraft flies further away, the accuracy of depth judgement decreases [29]. Depth perception must then be gained through the image transmitted from the onboard camera to the ground control screen, and objects outside the view captured by the camera remain unseen by the pilot.
To help overcome the limitations of a single camera, pilots were tested wearing a VR head-mounted display presenting monoscopic and stereoscopic views of a drone flight [14]. The subjects were asked to estimate the height above ground level of the camera carried by the drone. With the monoscopic view, the subjects consistently overestimated the distance. With the stereoscopic view, the participants’ height estimates were closer to the actual height of the camera on the drone but were still not considered accurate.
Hing et al. [10] examined the use of what they described as a “chase viewpoint”, where the drone pilot was provided with perceptual assistance by being presented with a 3D map of the flying environment, “as they were following a fixed distance behind the vehicle” (p. 5642). Results indicated that chase view-assisted flights were conducted in a more stable manner, with fewer large angular accelerations, allowing for a smoother, safer flight through the flying environment compared with flights conducted using the onboard camera only.
These technological developments improve RPAS safety elements such as collision avoidance and will be useful for future developments such as urban air mobility. However, for RPAS tasks such as photography conducted by a real estate agent, these technologies have been described as “overkill” [15], because they add expense, increase the difficulty of operating the drone, and require larger, heavier equipment, costs which can outweigh the intended benefits. Simpler solutions would support broad industry adoption.
Augmented reality (AR), which provides pilots with computer-generated graphical cues and other stimuli overlaid on images of the physical world, has also been suggested for enhancing pilots’ ability to accurately judge distance and height [30]. The efficacy of an AR overlay in enhancing the depth perception of drone pilots has been tested [31]. Subjects were asked to estimate the position of a stationary drone by marking its position on a map; half of the subjects used AR while the other half merely observed the aircraft. Using AR provided more accurate estimates of the aircraft’s location than observation alone, and the subjects overall found the AR easy to use. For BVLOS flights, the use of AR was tested when operating at mid-level altitudes; the drone pilots found it useful in enhancing their situational awareness during long-range flights, especially for route planning and target identification [32].
A further enhancement of AR is the presentation of information in goggles worn by the pilot, who can then interact with the ground control station while still viewing the aircraft and its position. This reduces the amount of time the drone pilot redirects their view away from the aircraft to the ground control station, which in turn can be expected to help pilots establish and maintain depth perception. An observational study comparing pilots with and without AR showed over twice as much viewing of the aircraft when using the AR goggles during an automated flight, and the researchers concluded that this was beneficial for the situational awareness of drone pilots [33].
Dunn [28] conducted an accuracy and timeliness experiment in which student groups tracked a drone, flying VLOS using a monitor and BVLOS using goggles. The goggles slowed the students down, as they took much longer to complete the flying course. Dunn attributed this to “redundant cues such as two visual forms of feedback presented together, may become burdensome if oppositely oriented” (p. 105).
Inoue et al. [30] proposed an enhanced third-person view (TPV) to provide an AR overlay that aids depth perception when flying BVLOS, achieved by having a follower drone fly above and behind the camera ship, with both aircraft spatially coupled through being controlled by the same stick inputs from the pilot. The researchers found that enhanced TPV with its associated AR overlay allowed subjects to achieve better depth perception than a first-person view (FPV), with subjects also commenting on the difficulty of establishing depth perception with unaided TPV.
Another technical solution to aid pilots with depth perception is an assisted landing system to improve accuracy when landing the drone [13]. Problems arising from assisted landing systems include excluding the pilot from the control loop, so the aircraft cannot adapt to changing pilot goals or to changes in the physical environment that may make the planned landing unsuitable. A suggested solution was to utilize shared autonomy. Using a simulated environment, subjects were required to land the drone at specific landing sites in either a shared-autonomy assisted mode or an unassisted mode. Novice pilots were often found to attempt to land between landing sites, which the researchers attributed to a lack of depth perception; this in turn hindered the shared autonomy’s ability to correctly gauge the pilot’s intentions. Overall, the assisted landings were more accurate, and landings on the site closest to the drone pilot were more accurate than those on sites further away [13]. The researchers performed further work on shared autonomy in a real-world environment, where subjects were required to land the drone on pre-set landing sites either assisted by shared autonomy or without assistance. The assisted landings were more accurate, helping the pilots to gauge depth, while unassisted landings produced evidence of undershooting the target landing site [34].
Haptic control of a drone has also been shown to help pilots in visually challenging environments. These systems provide tactile feedback to inform the pilot about conditions within the flying environment, including obstacles to be avoided. When subjects were provided with haptic control, it helped compensate for reduced or absent visual information, likely reducing the collision risk [35]. Macchini et al. [36] found that visual systems augmented with haptic control allowed subjects to hover the drone closer to objects without collisions. Haptic control can also reduce the “head-down” behaviour of the drone pilot, as they need to view the ground control station less, which can improve their perception of the relationship between the drone and obstacles [37]. Ramachandran et al. [38] trialled a haptic control sleeve that locked the pilot’s elbow and wrist joints when the drone was in the vicinity of an obstacle. In simulated exercises where the subjects’ visual feedback was compromised, the use of this sleeve led to fewer collisions when attempting to fly the drone through a hole in a wall.
Cacan et al. [39] demonstrated that a human operator can outperform automation when conducting airdrops of a parafoil onto designated landing spots. Using an unmanned parafoil system to drop a payload onto a designated spot, a fully autonomous system placed 50% of the impact points within 17.7 m of the desired landing spot (90% of the impact points were within 38 m). While the researchers were pleased with this accuracy, they noted that the autonomous system “was blind to objects around the impact point” [39] (p. 1145) and that human intervention was sometimes required to stop the parafoil from hitting objects surrounding the landing site. When a human operator with direct sight of the desired landing spot, the parafoil, and the surrounding environment was inserted into the system for the landing process using conventional remote control, landing accuracy improved to 10.4 m for 50% of the landing attempts (90% of the impact points were within 21 m of the landing spot). The researchers noted that depth perception with this system was more difficult when the operator was not aligned with the impact point. Overall, they observed an improvement in landing performance, including depth perception, when there was a human in the control loop [36].
The limitations of technology in assisting drone pilots with depth perception were also noted by Eiris et al. [40] in the context of drones employed for indoor building inspections, a setting that obviously demands accurate depth perception and collision avoidance. The technologies suggested for safe flying in this environment are expensive and not easy to implement, leaving the human operator to be relied upon to operate the drone in a very challenging environment. At this stage, technological devices appear to support and enhance the pilot’s depth perception abilities rather than provide a complete replacement. Thus, while technical solutions are available to assist drone pilots with depth perception, the pilot remains an important component of accurate flying.
3. Materials and Methods
With cameras so readily available on drones of every size and type, from multi-rotor to vertical take-off, they are a potentially useful tool for enhancing the depth perception abilities of RPA pilots. This research examines the use of the image transmitted from the camera to a screen on the GCS for improved distance estimation. One group of ab initio RPA students flew a sub-250 g drone with no assistance from a camera, while a second group was assisted with distance estimation by having access to the camera image on the GCS. The null hypothesis was that the use of camera vision results in no improvement in distance estimation for ab initio drone pilots.
The methodology is a pre-experimental, double static group comparison design, illustrated in Figure 2. Two independent groups of ab initio drone pilots were tasked with flying the drone towards two obstacles, one placed 20 m from the pilot and the other at 50 m. Relative to the pilot, the 20 m obstacle was positioned at 11 o’clock and the 50 m obstacle at 1 o’clock.
As the drone was flown towards each obstacle, the researcher called for it to be stopped and hovered at approximately random points, varied for each participant. Stopping distances ranged from 1.7 m to 6.2 m from the 20 m obstacle and from 1.6 m to 18.9 m from the 50 m obstacle. The student was then asked to estimate the exocentric distance between the drone and the obstacle.
The drone used was a DJI Mini (2 or 3). The actual distance of the drone from the obstacle was ascertained using a laser range finder. The flight path had the drone flown first to the 20 m obstacle and then on to the 50 m obstacle. An illustrative example of the flight path is given in Figure 3.
The subjects were first-year undergraduate students enrolled in various degree courses who, to gain familiarity with aviation operations and technologies, were required to participate in a tutorial on the operation and control of a very light (sub-250 g) drone. The unassisted group (n = 38) used a ground control station with no supporting aids to assist in estimating the distance between the drone and the respective obstacle. The assisted group (n = 55) was provided with a GCS whose screen showed the image of the flight surroundings transmitted from the onboard camera; students in this group were able to view both the wider environment directly and the screen of the GCS. The activity took each participant approximately 5 min to complete. The unassisted group spent the entire time looking at the drone in flight. The assisted group were required to spend much of their time looking at the drone and applying visual flight protocols, and used the camera image for approximately 60–90 s per measurement, looking at the screen to support their exocentric distance estimation. This procedure was intended to prevent visual overload or conflicting perceptual cues from direct vision and the GCS screen from interfering with performance.
No distance information was available to the students on the ground control station screen. All flights were flown in clear visibility conditions.
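For concreteness, the record kept for each hover point can be sketched as below. This is an illustrative layout only; the field and variable names are hypothetical and not drawn from the study’s data files. The signed and absolute errors defined here are the quantities examined in the Results.

```python
from dataclasses import dataclass

@dataclass
class HoverTrial:
    """One hover point for one participant (illustrative, hypothetical fields)."""
    participant_id: int
    group: str          # "unassisted" or "assisted" (camera image shown on the GCS)
    obstacle_m: int     # nominal obstacle distance from the pilot: 20 or 50
    actual_m: float     # drone-to-obstacle distance from the laser range finder
    estimate_m: float   # participant's exocentric distance estimate

    @property
    def signed_error_m(self) -> float:
        # Negative values indicate underestimation of the drone-obstacle distance.
        return self.estimate_m - self.actual_m

    @property
    def absolute_error_m(self) -> float:
        return abs(self.signed_error_m)

# Example record: an assisted-group participant stopped 4.0 m short of the
# 20 m obstacle and estimated 3.5 m (an underestimate of 0.5 m).
trial = HoverTrial(participant_id=1, group="assisted",
                   obstacle_m=20, actual_m=4.0, estimate_m=3.5)
print(trial.signed_error_m, trial.absolute_error_m)  # -0.5 0.5
```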
4. Results
The unassisted group (n = 38) were given no assistance in their task of estimating distances. The assisted group (n = 55) used a ground control station with a screen showing the onboard camera view. The absolute estimation error was calculated as the difference between the actual distance of the drone from each obstacle and the student’s estimate, and is presented as a box-and-whisker plot in Figure 4a. The whiskers are based on the quartiles. Outliers, shown in the figure as ‘o’, were identified in MATLAB R2024a as values exceeding the box limits by more than 1.5 interquartile ranges, referred to as the quartile method.
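As a minimal illustration of the error and outlier calculations described above (a sketch, not the authors’ MATLAB code), the absolute estimation error and the quartile-method outlier flags could be computed as follows, using hypothetical distances:

```python
import numpy as np

def absolute_errors(actual_m, estimate_m):
    """Absolute estimation error for each hover point."""
    return np.abs(np.asarray(estimate_m) - np.asarray(actual_m))

def iqr_outliers(errors):
    """Flag values lying more than 1.5 interquartile ranges beyond the
    quartiles, mirroring the box-plot (quartile) rule described in the text."""
    q1, q3 = np.percentile(errors, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return (errors < lower) | (errors > upper)

# Hypothetical values for one group at one obstacle distance.
actual = [4.0, 2.5, 5.0, 3.0, 6.0]
estimate = [3.5, 2.0, 4.0, 9.0, 5.5]
err = absolute_errors(actual, estimate)
print(err)                # [0.5 0.5 1.  6.  0.5]
print(iqr_outliers(err))  # the 6.0 m error is flagged as an outlier
```

The sign of (estimate minus actual), retained before taking the absolute value, is what distinguishes overestimation from underestimation in the proportions reported later.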
A Kolmogorov–Smirnov test for normality was conducted to confirm the appropriateness of this form of presentation and to determine the appropriate statistical tests of difference. At 20 m, the absolute estimation error did not depart significantly from normality for either the Unassisted Group, D (38) = 0.1434, p > 0.05, or the Assisted Group, D (55) = 0.1486, p > 0.05. At 50 m, however, the absolute estimation error departed significantly from normality for both the Unassisted Group, D (38) = 0.1908, p < 0.05, and the Assisted Group, D (55) = 0.1527, p < 0.05. This can be seen in the violin plots given in Figure 4b.
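A normality screen of this kind can be sketched as follows, using hypothetical error samples. This is illustrative only: SciPy’s one-sample Kolmogorov–Smirnov test with parameters estimated from the same sample is anti-conservative, and a Lilliefors-corrected variant (available in statsmodels) is closer to a dedicated normality test.

```python
import numpy as np
from scipy import stats

def ks_normality(errors, alpha=0.05):
    """One-sample Kolmogorov-Smirnov test against a normal distribution fitted
    to the sample (illustrative; estimating the parameters from the same data
    makes the nominal p-value anti-conservative)."""
    errors = np.asarray(errors, dtype=float)
    d, p = stats.kstest(errors, "norm", args=(errors.mean(), errors.std(ddof=1)))
    return d, p, p > alpha  # True -> normality not rejected at alpha

# Hypothetical error samples: one roughly symmetric, one skewed.
rng = np.random.default_rng(0)
errors_20m = rng.normal(1.5, 1.0, size=38).clip(min=0.0)
errors_50m = rng.exponential(3.0, size=38)
print(ks_normality(errors_20m))
print(ks_normality(errors_50m))
```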
The difference in mean performance between the groups was much larger at the 50 m obstacle. The same was true of the standard deviation, although the Assisted Group had a marginally higher standard deviation than the Unassisted Group at the 20 m obstacle. These results are summarized in Table 1.
The proportion of students overestimating the distance varied between the two groups at the different obstacle distances. At the 20 m obstacle, 34.2% of the Unassisted Group and 25.5% of the Assisted Group overestimated the distance between the drone and the obstacle. At the 50 m obstacle, 7.9% of the Unassisted Group and 25.5% of the Assisted Group overestimated the distance. This can be seen in the dumbbell plot in Figure 5, which contrasts the lengths of the bars: the two Assisted Group bars are balanced, whereas, relative to the Assisted Group, the Unassisted Group is skewed towards underestimation at 50 m and towards overestimation at 20 m.
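The overestimation proportions reported above can be derived from the signed error alone; a minimal sketch with hypothetical values is:

```python
import numpy as np

def percent_overestimating(actual_m, estimate_m):
    """Percentage of estimates exceeding the measured distance, i.e. cases
    where the drone was perceived as further from the obstacle than it was."""
    signed = np.asarray(estimate_m) - np.asarray(actual_m)
    return 100.0 * np.mean(signed > 0)

# Hypothetical inputs: most estimates fall short of the measured distance.
print(percent_overestimating([4.0, 2.5, 5.0, 3.0], [3.5, 2.0, 6.0, 2.5]))  # 25.0
```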
Levene’s test for the equality of variances across the two groups was conducted, and the results are summarized in Table 2. At the 20 m obstacle, the requirement for homogeneity of variance was met, with no significant difference in variances at the p < 0.05 level. At the 50 m obstacle, the requirement for homogeneity of variance was not met, with a significant difference at the p < 0.05 level. Beyond indicating the correct statistical test for the difference in means, Levene’s test shows that, at the 50 m obstacle, the Assisted Group was statistically significantly more consistent in its estimation error than the Unassisted Group. Owing to these homogeneity of variance outcomes, Welch’s t-test was conducted for the two groups at the 50 m obstacle, while a t-test assuming equal variances was conducted at the 20 m obstacle.
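As an illustrative sketch (in SciPy, not the software used for the study’s analysis), the variance check and the consequent choice between the pooled and Welch’s t-tests could be implemented as follows; the sample values are hypothetical.

```python
from scipy import stats

def compare_groups(unassisted, assisted, alpha=0.05):
    """Levene's test for equality of variances, then an independent-samples
    t-test: pooled variance if Levene's test is non-significant, otherwise
    Welch's unequal-variance t-test."""
    lev_stat, lev_p = stats.levene(unassisted, assisted)
    equal_var = lev_p >= alpha
    t_stat, t_p = stats.ttest_ind(unassisted, assisted, equal_var=equal_var)
    return {"levene_p": lev_p, "equal_var": equal_var, "t": t_stat, "p": t_p}

# Hypothetical absolute-error samples (metres) at the 50 m obstacle.
unassisted_50 = [9.1, 2.3, 7.8, 1.2, 6.5, 12.0, 4.4, 0.9]
assisted_50 = [2.1, 3.0, 1.8, 2.6, 2.2, 3.4, 1.5, 2.9]
print(compare_groups(unassisted_50, assisted_50))
```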
An independent-samples t-test (Table 3; equal variance assumed) showed that the difference between the two groups at the 20 m obstacle was not statistically significant at the 0.05 level, with t (91) = 0.2798, p = 0.7803, and 95% confidence intervals for the mean error of (1.072, 1.775) for the Unassisted Group and (1.334, 1.675) for the Assisted Group. Thus, the null hypothesis cannot be rejected at this distance.
An independent-samples t-test (Table 4; equal variance not assumed) showed that the difference between the two groups at the 50 m obstacle was statistically significant at the 0.05 level, with t (44) = 3.0823, p < 0.004, and 95% confidence intervals for the mean error of (3.594, 7.206) for the Unassisted Group and (1.903, 2.973) for the Assisted Group. Thus, the null hypothesis can be rejected at this 50 m distance.
Hedges’ g at 20 m was 0.034, indicating a small effect size; at 50 m, g = 0.751, indicating a medium effect size.
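For reference, Hedges’ g is the difference in group means divided by the pooled standard deviation, multiplied by a small-sample bias correction. The sketch below is illustrative only and uses hypothetical samples rather than the study data; only the group sizes (38 and 55) are taken from this study.

```python
import numpy as np

def hedges_g(x, y):
    """Hedges' g: standardized mean difference using the pooled standard
    deviation, with the small-sample correction J = 1 - 3 / (4*df - 1)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_sd = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / df)
    d = (x.mean() - y.mean()) / pooled_sd
    return (1 - 3 / (4 * df - 1)) * d

# Hypothetical absolute-error samples using the study's group sizes.
rng = np.random.default_rng(7)
unassisted = rng.normal(5.4, 5.0, size=38)
assisted = rng.normal(2.4, 2.0, size=55)
print(round(hedges_g(unassisted, assisted), 3))
```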
5. Discussion
5.1. Findings
While depth perception is a challenge for pilots of all aircraft types, it is especially so for RPA pilots [7]. As RPAS is a technology-driven sector, various technological solutions have been offered in attempts to address this issue; these solutions, however, can come at a financial and operational cost that places them beyond many operators. A simple aid to safe flying can be the onboard camera, which is readily available on most RPA from sub-250 g drones to larger craft. The utility of the camera in aiding the depth perception of ab initio students was tested, measured by exocentric distance estimation. At the shorter distance of 20 m, there was no significant difference between the means or variances of the two groups; the use of the camera image did not produce more accurate estimations. At the greater distance of 50 m, however, the students in the Assisted Group who had the use of the camera image were significantly more accurate and consistent in their estimations. The difference in estimation accuracy at the two distances could reflect operation within the egocentric action space: within this space of 20–30 m, human distance judgement is very accurate [41]. The difficulty of accurate distance estimation at greater distances is consistent with that identified in the literature [12].
While the use of technology to aid the accurate flying of RPA has been questioned [14,40], this new finding suggests that even straightforward technology such as an onboard camera, and the image it provides, may be advantageous for safety. As cameras are ubiquitous on even the smallest drones, using a GCS with an appropriate screen during ab initio training could be beneficial for enhancing depth perception skills.
The greater accuracy and consistency of the students who had access to the camera image have implications for the training of ab initio drone pilots (discussed below). At the shorter tested distance of 20 m, there appears to be little if any advantage in training students to use the camera image. It is at this distance that initial training takes place, as ab initio students are introduced to handling exercises such as horizontal and vertical rectangles, simulated bridge inspections, and figures of 8. Using a GCS with a transmitted camera image would not appear to increase effectiveness at this distance; as humans are accurate with depth perception over this range [41], providing an aid such as a camera to help distance estimation may be redundant.
At longer distances, adding items to the training curriculum on the use of a camera could improve operational safety through better distance estimation and a lower likelihood of collisions. A possible negative outcome of such camera use is an inappropriate level of confidence among inexperienced RPA pilots, who may operate at distances greater than their abilities support; that is, the pilot may believe the camera image provides an extra layer of safety and security that is not warranted. It could be tempting for VLOS operations to become quasi-BVLOS flights.
A second metric examined the proportions of underestimation and overestimation of the distance between the drone and the obstacle. In making exocentric distance estimations, the students in both groups markedly underestimated the distance from the drone to the obstacle, consistent with the literature for these distances. For the Unassisted Group, the proportion of underestimation was greater at the 50 m obstacle; for the Assisted Group, the proportion of underestimation was more constant across the two obstacles.
The high proportion of underestimation of these exocentric distances is somewhat surprising. Research into RPA accidents [7] indicates many collisions between a drone and an obstacle, suggesting that pilots did not accurately perceive the distance between them. In underestimating the distance between the drone and the obstacle, the pilot perceives the drone as closer to the obstacle than it is; such a perception could be thought to protect the flight, as there would be a reduced chance of the drone colliding with obstacles. More analysis of the incident database is required to determine at what distances each misjudgement was made, related as these are to each drone’s speed and purpose.
5.2. Limitations
While the Assisted Group students had the use of the camera image on the GCS to aid their distance estimations, they were also able to look up from the GCS and view the actual environment and the drone’s location within it. No measurement was made of how, if at all, the students divided their viewing between the environment and the GCS. For future studies, identifying whether students spent longer looking at the GCS or at the actual environment would be useful. Endsley [42] noted that the allocation of visual attention across multiple sources of information is a critical component of situation awareness in dynamic environments.
This study used ab initio students from an elite university program. The extrapolation of the results to students of different backgrounds, as well as to more experienced operators, can be queried; that is, the results may have limited generalizability beyond the demographics involved. It may be that as pilots gain more flying experience, their depth perception skills improve and they rely less on supporting aids such as camera vision. Wickens and McCarley [43] have shown that the ways in which individuals integrate and prioritize perceptual cues can change significantly with increased experience, suggesting that the observed benefits of camera use may be specific to novice operators. Further work with experienced RPA pilots would help establish how much of the camera’s benefit was due to the ab initio pilots’ inexperience in depth perception. Similarly, if such perception practice does improve the skill, to what extent does it atrophy and therefore require currency training? Importantly, the potentially limited generalizability does not discount the statistically significant findings, which show that for new operators learning to use drones for the first time, significant safety gains can be achieved by utilizing readily available technology in the correct way to prevent controlled flight into terrain and other associated aviation accidents [7]. This has the potential to reduce operating costs for the industry.
A limitation of using a monocular image is that it does not always provide enough visual information for accurate distance estimation [44], which could explain this result. However, research with people having monocular vision (loss of vision in one eye) found them to be just as accurate in large exocentric distance perception as those with binocular vision [45]. Critically, the monocular limitation of the camera at 50 m is not a significant concern here, because the onboard camera was applied to an exocentric distance estimation; that is, while the drone was 50 m from the student, it was only a few meters from the obstacle.
Stationary obstacles were used for this study, as they are a large part of the operational reality faced by small drone users such as photographers. Further, previous research indicated a number of accidents involving drones colliding with stationary objects such as buildings and trees [7].
5.3. Operational Implications
The results of this study indicate that use of the onboard camera may provide increased accuracy and consistency in exocentric distance estimation by ab initio RPA pilots, particularly at greater distances. These findings have implications for the design of RPA training programs and standard operating procedures. While traditional VLOS training focuses on visual awareness and spatial judgement without assistance, incorporating the use of a ground control station with a live video feed from the onboard camera could support skill development, especially in operations extending beyond 20 m. The improved consistency observed in the assisted group suggests that camera use may reduce variability in pilot performance, a desirable outcome for safety-critical tasks. This aligns with the work of Backman et al. [13] in the context of landing assistance.
Consideration can therefore be given to including camera-based perceptual assistance during the training period, not as a substitute for visual judgement but as a tool to reinforce spatial understanding. This may be particularly relevant for scenarios involving obstacle-rich environments, such as infrastructure inspection or low-level urban operations. However, this should not lead to an erosion of the regulatory boundary between VLOS and BVLOS operations; rather, the camera can be used as a supplementary aid in controlled training settings to enhance perception without substituting core visual competencies. Training could also incorporate practical guidance on adherence to regulatory standards to prevent inappropriate overreliance on visual aids in VLOS operations.
5.4. Future Research
While the findings of this study point to improved accuracy and consistency in distance estimation from camera assistance, further work is needed to explore how these findings extend beyond the specific training context used. Future studies could investigate whether the improved performance results from enhanced perceptual support or whether it reflects a reliance on the camera image that does not translate into improved unassisted performance. Identifying whether the benefit leads to a transferable perceptual skill or is task-dependent will inform how camera use should be integrated into training curricula.
Longitudinal research could also examine whether repeated exposure to camera-supported tasks contributes to more lasting improvements in depth perception and spatial judgement, even when the camera is removed. This would help determine whether camera-based interventions have a training effect or simply function as a situational aid.
Additionally, expanding the participant pool beyond ab initio students to include experienced RPA operators would provide insight into whether the observed benefits are specific to novice users, and whether the usefulness of the camera is confined to ab initio students or extends to experienced pilots. The current working hypothesis is that the camera would have less impact on exocentric distance estimation for experienced pilots; however, this warrants experimental confirmation.
Studies incorporating more ecologically valid scenarios, including dynamic flight paths, cluttered environments, and stress factors, would assist in understanding how these perceptual skills are deployed in more operationally realistic conditions. Although stationary obstacles were used for this study, future research should examine the potential of camera-assisted distance estimation with dynamic obstacles, as has been done with unmanned ground vehicles [44].
A crossover design utilizing future students (the course from which the students were drawn runs annually) will be implemented. The research can be further improved by using eye-tracking devices to measure how much time participants devote to watching the camera image compared with gazing at the natural environment.
6. Conclusions
Depth perception is a challenge for RPA pilots, who do not experience the same level of feedback as pilots of conventionally controlled aircraft. Examination of the accident database indicated a higher proportion of accidents arising from depth perception failings in RPAS operations than in other sectors of aviation [7]. With a predicted 60.4 million RPAS flights by 2043 [3], there will be ever greater opportunity for depth perception-related occurrences to increase.
To help improve depth perception skills, this study evaluated whether the use of the image from a camera onboard an RPA would improve the accuracy of exocentric distance estimates made by ab initio pilots. While different technologies are available to assist with this skill, they can be difficult or expensive to install and use. Cameras are ubiquitous pieces of equipment included on most light prosumer drones and are thus readily available to ab initio VLOS pilots.
Two groups of ab initio students flew a drone towards obstacles at different distances from themselves. One group used only direct viewing of the drone and its relationship with the obstacle it was approaching, while the second group had both a direct view of the drone and the obstacle and assistance from the camera image. The findings identified benefits of using a camera to enhance distance estimation: for the camera-assisted group, both the overall mean estimation error and the consistency of estimation improved. However, the improvements were noted only at the further tested distance of 50 m, not at the closer tested distance of 20 m. The closer distance lies within a sphere in which humans are remarkably accurate in their depth perception [41], and using a camera at shorter distances of 20–30 m may therefore not provide improvements in depth perception. It was also noted that a high proportion of pilots in both groups underestimated the distance between the drone and the obstacles. How this finding pertains to drone collision occurrences is uncertain until further incident analysis is undertaken.
Further study is required to identify whether these findings extrapolate from a cohort of ab initio pilots to more experienced RPAS operators. Does reliance on the camera to assist with depth perception decrease as experience is gained? If benefits of camera use are identified for both ab initio and more experienced RPA pilots, its inclusion in RPAS training syllabi can be considered.