1. Introduction
A detect-and-avoid (DAA) system supports a remote pilot (RP) of a remotely piloted aircraft (RPA) to observe and avoid nearby air traffic or other hazards using sensor and guidance technology [
1]. In general, a DAA system can have a remain well clear (RWC) and a collision avoidance (CA) function. The RWC function supports detection and analysis of potential conflicting traffic and provides flight path guidance to the RP to prevent the conflict from developing into a collision hazard [
2]. Considering the three protection layers of air traffic management (ATM), being (1) strategic conflict management, (2) separation provision, and (3) collision avoidance, the RWC function supports the second layer together with air traffic control (ATC) [
1]. In support of the third layer, the CA function provides last-resort resolution advisories (RAs) to the RP to avoid physical contact between the aircraft. The RWC and CA functions may use multiple degrees of freedom (DOF) in their guidance and advisories, particularly for manoeuvring horizontally, vertically, or changing speed. It is up to the RP, often in coordination with ATC, to decide on and implement an appropriate control strategy given the DAA output. The focus in this paper is on the RWC function for horizontal manoeuvring.
A number of DAA standards have been published, including minimum operational performance standards (MOPS) [
2] and minimum aviation system performance standards (MASPS) [
3]. ACAS Xu is a fully specified DAA system [
4], including surveillance filtering, conflict detection, manoeuvre guidance and alerting. It is part of the ACAS X programme, which also includes the new airborne collision avoidance system (ACAS) for manned aircraft ACAS Xa [
5]. The threat resolution module (TRM) of ACAS Xu is based on partially observable Markov decision process (POMDP) models and objective functions, which have been solved by a dynamic programming (DP) approach to calculate a Q-function representing the value gained for taking a particular action in the current state. The TRM of ACAS Xu is based on two independent POMDP models for advisories in the vertical plane and in the horizontal plane. The optimisation for the vertical and horizontal dimensions is separated, since the combined problem was considered intractable to solve due to its large (discretized) state space [
6]. This implies that no overall 3D optimised advice is achieved. The RWC guidance provided by ACAS Xu is based on a rollout approach [
7], which uses the POMDP-based cost tables to infer an increase in collision risk in relation to DAA alert timing requirements. The RWC guidance does not use coordination between nearby aircraft. The advisories and guidance provided by ACAS Xu are updated every second, and they depend on the relative positions and speeds of the aircraft. This implies that they may change frequently.
A DAA system resides on board an RPA, and its advisories and guidance are downlinked to a remote pilot station (RPS) and shown on a traffic display of the RP controlling the ownship RPA via command and control (C2) links. An intruding aircraft that poses a threat for remaining well clear and that requires action by the RP is displayed by a dedicated symbol. Furthermore, the RWC guidance is displayed as bands of vertical rates and relative track angles that have to be avoided to remain well clear of other traffic. In the horizontal plane, ACAS Xu uses an array of 13 bands, each 15 degrees wide, spanning −97.5 to 97.5 degrees relative to the ownship track. Given the guidance, the RP judges whether action is needed and what RWC manoeuvre can be performed. If so, the RP should request a DAA manoeuvre clearance from ATC, except when the pilot judges that such a request is not needed given the criticality of the conflict or the absence of ATC [
2]. Human-in-the-loop (HITL) simulations for an early version of ACAS Xu [
8] showed a mean response time to RWC guidance of about 17 s, including time for coordination with ATC. Several losses of DAA well clear (LDWC) were observed in these HITL simulations, which were attributed to pilot performance, including (1) a pilot attempting to return to the route too soon following an avoidance manoeuvre, (2) a poor manoeuvre choice by the pilot, and (3) an overly long coordination time with ATC. These cases illustrate the potential complexity for RPs of dealing with the DAA advisories and guidance.
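The horizontal band layout described above (13 bands of 15 degrees covering −97.5 to 97.5 degrees relative to the ownship track) can be illustrated with a short sketch. The function names `band_edges` and `band_index` are hypothetical and serve only to show the geometry; they are not taken from the ACAS Xu specification.

```python
def band_edges(n_bands=13, width_deg=15.0):
    """Return (lower, upper) edges of each band, centred on the ownship track."""
    half_span = n_bands * width_deg / 2.0  # 97.5 deg for 13 bands of 15 deg
    return [(-half_span + i * width_deg, -half_span + (i + 1) * width_deg)
            for i in range(n_bands)]


def band_index(relative_track_deg, n_bands=13, width_deg=15.0):
    """Index of the band containing a relative track angle, or None if outside."""
    half_span = n_bands * width_deg / 2.0
    if not -half_span <= relative_track_deg < half_span:
        return None
    return int((relative_track_deg + half_span) // width_deg)


bands = band_edges()
centre_band = band_index(0.0)  # band that contains the current ownship track
```

With these assumptions, the current track (0 degrees relative) falls in the middle band, which spans −7.5 to 7.5 degrees.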
In addition to HITL simulation, the prime approach for design and validation of ACAS and DAA systems has been using fast-time simulation of sets of aircraft encounters and the evaluation of performance metrics like numbers of alerts and probabilities of near mid-air collision (NMAC) and LDWC [
6,
9]. Such performance metrics provide feedback to designers for tuning of an optimisation-based system like ACAS Xu, including the surveillance filtering and tracking, the reward function for optimising the look-up tables, and parameters of the roll-out approach. The fast-time simulations are based on encounter models, aircraft and avionics models, and (remote) pilot models [
10]. Encounter models specify the probabilities and characteristics of ways that aircraft can come close to each other [
11,
12]. Such encounter modelling has had a predominant impact on ACAS validation studies, e.g., [
13,
14]. Aircraft and avionics models describe the aircraft manoeuvring capabilities and the characteristics of the input data of the ACAS or DAA systems, like the accuracy of altitude and range measurements. While ACAS validation studies have largely used deterministic avionics models, more recently, stochastic modelling of sensor errors was applied, providing insight into their contribution to ACAS and DAA performance metrics [
15,
16,
17]. (Remote) pilot models describe the situation awareness concerning the ACAS/DAA advisories and guidance, the decision-making process, and the dynamics of the response (delay, strength) of the human operator. While pilot models for manned aircraft are relatively straightforward, representing the manner (response probability, delay, strength) by which the pilot reacts to a (vertical) RA [
16,
18,
19], remote pilot models for unmanned aircraft systems (UAS) are considerably more complex, as they represent the decision-making, ATC coordination, and manoeuvring in both the horizontal and vertical planes in response to RAs and RWC guidance [
17,
20]. Both in manned and unmanned aircraft operations, the pilot models have a large impact on the overall performance. Agent-based simulations of manned aircraft operations [
15] show that differences in pilot performance are more important than differences between legacy (TCAS II) and new (ACAS Xa) ACASs. Agent-based simulations of UAS with the ACAS Xu DAA system [
17] show that RP performance can contribute to livelock conditions in particular encounter scenarios, where the aircraft attain continuing fluctuations away from an intruder and back to their course without reaching their destination.
It follows from the above that the RP plays a key role in the DAA sociotechnical system and that their decisions have a large impact on the overall effectiveness of the DAA system. A good understanding of the RP’s performance and of the considerations in decision-making in encounter scenarios is therefore crucial in the design and evaluation of DAA systems. This is especially the case for the optimisation-based design of a DAA system [
10], like ACAS Xu. However, a limitation of current standards for DAA systems [
1,
2,
3,
4] is that they do not include the intended/planned routes of the involved aircraft, as known by the RP, for the provision of RWC guidance. These standards strictly use state data as input of the DAA system, such as horizontal position and speed, altitude and vertical rate, slant range, and relative bearing angle. Such state data and future state projections are used to attain the RWC bands that are advised to be avoided, shown on displays to the RP [
4]. An RP model that manoeuvres away from these bands as quickly as possible without considering the intent of the aircraft in the scenario, and that moves back to course if the RWC bands do not intervene, can lead to the livelock conditions observed in the agent-based simulations [
17].
As a way forward, an intent-based DAA system was proposed [
21], which uses the sharing of intended routes as the basis of the conflict resolution. An A* path planning approach was employed to maintain a sufficient distance between the aircraft in an encounter and to achieve the destination by each aircraft. In Monte Carlo (MC) simulation runs including sensor errors and a variety of closed-loop delays, it was shown that by using the RWC route guidance of this intent-based A* DAA system, all scenarios in a set of horizontal encounters could be handled without attaining livelock or LDWC conditions. In contrast, in MC simulation of the same encounter scenarios with ACAS Xu, LDWC was attained in 25% of the runs, and the mean additional flight distance was nine times as large as for the A* DAA system. It was concluded that the intent-based A* DAA system is a promising approach for a more effective DAA.
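To give a feel for the path planning idea, the following is a minimal, generic A* sketch on a small grid. It is not the algorithm of the intent-based DAA system: the grid, the static circular keep-out region standing in for an intruder, the unit step costs, and all names (`a_star`, `keep_out_centre`, `keep_out_radius`) are simplifying assumptions made here for illustration only.

```python
import heapq
import math


def a_star(start, goal, blocked, size):
    """Shortest 8-connected grid path from start to goal that avoids blocked cells."""
    def h(cell):
        # Straight-line distance: never overestimates the remaining path cost.
        return math.hypot(goal[0] - cell[0], goal[1] - cell[1])

    open_heap = [(h(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue
                nxt = (cell[0] + dx, cell[1] + dy)
                if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                    continue
                if nxt in blocked:
                    continue
                new_cost = cost + math.hypot(dx, dy)
                if new_cost < best_cost.get(nxt, math.inf):
                    best_cost[nxt] = new_cost
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None  # no route exists


# Hypothetical example: cross an 11 x 11 grid while staying out of a circular region.
size = 11
keep_out_centre, keep_out_radius = (5, 5), 2.0
blocked = {(x, y) for x in range(size) for y in range(size)
           if math.hypot(x - keep_out_centre[0], y - keep_out_centre[1]) <= keep_out_radius}
path = a_star((0, 5), (10, 5), blocked, size)
```

The resulting path bends around the keep-out region and then continues to the destination, which mirrors, in a very reduced form, the idea of a route that both maintains distance and reaches the goal.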
Notwithstanding the promising results in [
21] of the intent-based A* DAA system in comparison with the ACAS Xu RWC function, the evaluation is completely based on computer simulation, and it lacks views by human operators that would need to work with the RWC guidance of the DAA systems. Given the levels of automation (LOA) in air traffic management as defined in [
22], ACAS Xu is a decision support system (level 1), which supports the RP in action selection by providing a solution space, while the A* DAA system is a resolution support system (level 2), which proposes an optimal solution in the solution space. Both the ACAS Xu and the A* DAA systems provide recommendations to the human operator that are based on optimisation strategies (POMDP solution and roll-out approach, and A* path planning, respectively). In line with the Artificial Intelligence (AI) Roadmap [
23] of the European Union Aviation Safety Agency (EASA) and the EU AI Act [
24], such DAA systems are AI-based systems and their trustworthiness should be assured. Key ethical requirements for trustworthy AI in human-centric operations include human agency and oversight, transparency, and technical robustness and safety [
25].
The objective of the study presented in this paper is to obtain structured feedback from professional remote pilots on the RWC guidance of the ACAS Xu and A* DAA systems as a basis for an initial human factors evaluation. The key topics concern trust, transparency, risk perception and competence of the systems, as well as the manoeuvring, situation awareness and display preferences in drone encounter scenarios. This study was designed as a comparative within-subjects evaluation study, where participants experienced both DAA systems via video clips of encounter scenarios and expressed their preferences and opinions about them, collected as questionnaire data.
Next,
Section 2 presents the materials and methods, including the displays of both DAA systems, the encounter scenarios shown to the participants, the questions asked, and the scoring and statistical tests.
Section 3 presents the results for the answers by the participants for the two systems, regarding transparency, pilot behaviour, situation awareness, display orientation, risk perception, competence, trust, and overall system preference. The results and their implications for use and development of the DAA systems are discussed in
Section 4. Conclusions are provided in
Section 5.
2. Materials and Methods
2.1. DAA Displays
In the experiments, two types of displays were used for communicating the DAA RWC guidance to the participants. The ACAS Xu case was called “RWC avoidance bands guidance”, while the A* DAA case was called “RWC route guidance”.
Figure 1 shows the RWC avoidance bands guidance display of the remote pilot. The display design was inspired by the DAA cockpit display of [
26]. At the two sides of the display there are tape indicators for the airspeed in knots (left) and altitude in feet (right). There is no vertical speed indicator, since only manoeuvring in the horizontal plane is considered in this study. The ownship is shown in the middle of a compass that is displayed over a moving map. The track angle of the ownship is shown by a dashed line connecting the ownship symbol with the compass and by its value above the compass. The position of the intruder is shown by an aircraft symbol with an identity code above and the relative altitude below. If there are no active RWC bands, the aircraft symbols are coloured white, while they are coloured yellow if there are active RWC bands. The RWC bands as generated by ACAS Xu are shown by a yellow region at the edge of the compass. To support the remote pilot in understanding the horizontal distance with the intruder, distance circles are shown at 1.5, 3, 4.5, and 6 nautical miles (NM). The display can be used in two configurations: (1) track-up, where the ownship track is always pointing to the top of the display and where the compass and map rotate when the ownship turns, and (2) north-up, where compass and map remain in a stable north-up position, while the ownship symbol rotates when turning. The left case in
Figure 1 shows a track-up display without active bands, while the right case shows a track-up display with an active band, which indicates that the aircraft should not maintain its current course in order to remain well clear.
Figure 2 shows the RWC route guidance display. Its main elements are equal to those of the RWC avoidance bands display of
Figure 1. A unique feature is the planned route of ownship that is shown as blue line segments between waypoints (blue dots). At the start of an encounter scenario, this is shown as one line segment towards a destination (
Figure 2 left). After a conflict has been detected by the A* DAA system, an advised route is shown by additional line segments and waypoints, so as to remain well clear with respect to the intruder (
Figure 2 right). Then, an advised track angle is also shown by a green block of −5 to +5 degrees, and the symbols of both aircraft are coloured yellow. Furthermore, the route of the intruder is then shown by a grey line. Lastly, an orange line segment shows the planned horizontal miss distance between the two aircraft in the proposed resolution.
2.2. RWC Guidance in Encounter Scenarios
In the experiments, the same two encounter scenarios were used for both DAA systems. The scenarios consider encounters between a pair of drones that are both flying at a constant altitude of 8000 ft and both have a constant speed of 120 kt. There is no wind, implying that the heading and track angle of each aircraft are equal. One of the remotely piloted aircraft systems is the ownship (AC2) of which the DAA display is shown to the participant. The other drone is its intruder (AC1). In both encounter scenarios the aircraft fly along straight lines, and these original trajectories intersect with a distance of 0 m, meaning that without evasive action there would be a collision. In the scenarios, AC1 always has a course of 0 deg (flying north), while AC2 has a course of 45 deg (flying north-east) or 90 deg (flying east).
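The encounter geometry described above can be reproduced with a few lines of kinematics. The sketch below is illustrative only: it assumes a flat east/north frame with the crossing point at the origin, and the time to crossing `T_CROSS` is a hypothetical value, not one taken from the experiment.

```python
import math

KT_TO_NM_PER_S = 1.0 / 3600.0  # 1 kt equals 1 NM per hour


def velocity(course_deg, speed_kt):
    """East/north velocity in NM/s for a course measured clockwise from north."""
    c = math.radians(course_deg)
    v = speed_kt * KT_TO_NM_PER_S
    return (v * math.sin(c), v * math.cos(c))


def start_position(course_deg, speed_kt, time_to_crossing_s):
    """Start position such that the aircraft reaches the origin at the crossing time."""
    vx, vy = velocity(course_deg, speed_kt)
    return (-vx * time_to_crossing_s, -vy * time_to_crossing_s)


def separation(t, p1, v1, p2, v2):
    """Horizontal distance in NM between two straight-line trajectories at time t."""
    dx = (p1[0] + v1[0] * t) - (p2[0] + v2[0] * t)
    dy = (p1[1] + v1[1] * t) - (p2[1] + v2[1] * t)
    return math.hypot(dx, dy)


T_CROSS = 200.0  # hypothetical time to crossing in seconds
v_ac1, v_ac2 = velocity(0.0, 120.0), velocity(90.0, 120.0)  # AC1 north, AC2 east
p_ac1 = start_position(0.0, 120.0, T_CROSS)
p_ac2 = start_position(90.0, 120.0, T_CROSS)
initial_sep = separation(0.0, p_ac1, v_ac1, p_ac2, v_ac2)
miss = separation(T_CROSS, p_ac1, v_ac1, p_ac2, v_ac2)
```

Without evasive action the separation at the crossing time is zero (up to rounding), which corresponds to the intersecting original trajectories of the scenarios.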
For these encounter scenarios, the RWC guidance of the DAA systems and the response of the remote pilot to the guidance were simulated. In the simulated encounter scenarios, the intruder maintains a constant course, while the ownship manoeuvres in line with the RWC guidance by the DAA system. This is in line with the right-of-way rule which states that for aircraft of the same category converging at approximately the same altitude (except head-on, or nearly so), the aircraft to the other’s right has the right of way [
2]. When responding to the RWC guidance the remote pilot does so in one second, implying that the aircraft starts to turn (almost) instantaneously following the RWC guidance. While more delayed responses (e.g., representing time for communication with ATC) were used in earlier simulation studies [
17,
21], these were not used in the current experiment to support cause–effect transparency for the guidance–manoeuvre link to the participants. Turns in response to the RWC guidance are always made with a turn rate of 2 deg/s. This value is based on a nominal turn rate of 3 deg/s for horizontal resolution advisories [
4] and the assumption that a reduced turn rate can be applied for less alarming RWC guidance.
The ACAS Xu-supported encounter scenarios were simulated using the Collision Avoidance Validation and Evaluation Tool (CAVEAT) version 3.12.0.1. It is based on a stochastic dynamic agent-based modelling and simulation approach [
16,
27] and supports retrospective and prospective analysis of encounter scenarios with DAA and ACAS, including TCAS II, ACAS Xa, and ACAS Xu. The C++ based CAVEAT software was developed for EUROCONTROL by NLR and everis/NTT-Data. It incorporates the validated Julia libraries for ACAS Xu as provided in the MOPS [
28]. The remote pilot model [
17] in CAVEAT can be tuned in a variety of ways. For the simulations used in the experiment, no biases in the decision-making were assumed and the pilot closely follows the RWC guidance. This means that the pilot turns towards the closest edge of RWC bands. For instance, if the aircraft is on a course of 90 deg and the RWC bands are active from 75 to 120 deg, then the pilot would turn left to achieve a course of 75 deg. Furthermore, the pilot is assumed to turn back towards the original course if there are no RWC bands that advise differently.
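The closest-edge rule in the example above can be written down as a small sketch. This is a simplified illustration, not the CAVEAT pilot model: it assumes a single active band given in absolute course degrees, ignores wrap-around at 0/360 degrees, and breaks ties towards the lower edge. The function names are hypothetical, and the turn rate default is the 2 deg/s used in the simulated scenarios.

```python
def closest_edge_course(course_deg, band_lo_deg, band_hi_deg):
    """Target course: nearest edge of the active band, or the current course if outside it."""
    if not band_lo_deg <= course_deg <= band_hi_deg:
        return course_deg  # current course is already clear of the band
    if course_deg - band_lo_deg <= band_hi_deg - course_deg:
        return band_lo_deg
    return band_hi_deg


def turn_duration_s(course_deg, target_deg, turn_rate_deg_s=2.0):
    """Time needed to reach the target course at a constant turn rate."""
    return abs(target_deg - course_deg) / turn_rate_deg_s


target = closest_edge_course(90.0, 75.0, 120.0)  # example from the text
duration = turn_duration_s(90.0, target)
```

For the example in the text, the lower edge at 75 deg is 15 deg away and the upper edge at 120 deg is 30 deg away, so the sketch selects a left turn to 75 deg, which takes 7.5 s at 2 deg/s.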
Figure 3 shows the trajectories of the aircraft in a deterministic simulation (without sensor errors or other stochastic factors) using the above pilot performance model for the two encounter geometries used in the experiment. In both encounters, AC2 initially turns left away from AC1, following the ACAS Xu RWC bands at about 100 s before the aircraft would collide. After the initial turn, the RWC bands allow AC2 to turn right again, and the pilot model does so to move towards the original course. Next, this kind of back-and-forth manoeuvring according to the closest edge of the RWC bands is maintained. In the 90 deg encounter (
Figure 3 left), such manoeuvring is ended when AC2 manages to pass in front of AC1 with a horizontal miss distance of 1.0 NM. In the 45 deg encounter, AC2 does not manage to pass AC1 and attains a course that is parallel to AC1. In both encounters there is no loss of the DAA well clear (as defined for en-route traffic in [
2]).
The A* DAA-supported encounter scenarios were simulated in dedicated Java software that was developed by NLR for the A* DAA algorithms, as well as the representation of the encounter scenarios. The A* DAA system uses a new approach that employs the intended routes of the involved aircraft and an A* path planning approach to minimise conflict risk costs [
21]. It results in routes advised to the RP so as to remain well clear with other traffic, following conflict detection and resolution coordination. As long as the involved aircraft sufficiently adhere to the coordinated routes, the routes do not change.
Figure 4 shows the trajectories of the aircraft in a deterministic simulation for the two encounter geometries. In both encounters, AC2 initially turns right towards AC1, following the A* DAA route guidance at 160 s before collision for the 90 deg encounter and at 169 s before collision for the 45 deg encounter. In both encounters, AC2 passes behind AC1 at a smallest distance of 1.3 NM and effectively sets course towards its destination.
2.3. Videos of the RWC Guidance in the Encounter Scenarios
Using the various combinations of DAA systems, simulated encounter scenarios, and display orientations, videos were produced as described in
Table 1. These videos are available as
Supplementary Materials.
Fragments from these videos were taken for the experiments. The following phases were used (see also
Table 1):
Initial: Start of encounter from 23 s before the first RWC guidance (bands or route) until the end of the first turn (duration 40 to 60 s);
Receding band: Receding of the RWC band and turn back towards course (duration 5 to 10 s);
Back & forth: Next three times turning away from RWC bands and two times turning back towards course (duration: 35 s);
Pass front: Continuation of RWC bands until there are no more active RWC bands and AC2 passes AC1 in front (duration 103 s);
Livelock: Several times turning left and right until AC2 flies about parallel to AC1. RWC bands are active until the end (duration 84 s);
RWC route: Manoeuvring along RWC route from 40 s before second turn until second to last turn (duration 55 to 62 s);
Pass behind: Last turn back to course until AC2 passes AC1 from behind (duration 20 to 25 s).
2.4. Participants
Recruiting participants for this study proved, as expected, to be a challenge, as the target population of interest, professional remote pilots, was a small group with very diverse backgrounds and, as such, not the ideal population from which to recruit a large sample. Consequently, conducting a traditional a priori power analysis to determine the required sample size was not feasible. Instead, an alternative approach was adopted, guided by the realistic upper bound of the achievable sample size. Assuming a maximum of 8 to 10 participants, a sensitivity analysis [
29] was performed to estimate the minimum detectable effect size (MDE) at the conventional significance level of α = 0.05 (95% confidence) and a statistical power of 80%. This analysis showed that the study would be sufficiently powered to detect large effect sizes even with only 8–10 participants, consistent with established practices in human–computer interaction and usability research, where small expert samples are widely recognised as both practical and appropriate to detect the majority of meaningful usability issues [
30].
For this study, participants were recruited via contacts at NLR, by a call distributed in LinkedIn UAS groups and at conferences, and by referral. Participants were required to have experience as remote pilots of beyond-visual-line-of-sight drone operations, implying that they should have professional experience. Nine expert remote pilots participated in the study, eight male and one female, with ages ranging from 31 to 63 (M = 46.7, Mdn = 50, SD = 11.2) and countries of residence being Austria, Australia, Greece, the Netherlands, Norway and the United Kingdom. Their experience consisted of flying a variety of fixed wing and multirotor drones for military and civil operations, including surveillance and powerline inspection. Their flight hours in these drone operations ranged from 30 to 2000 h (M = 700.8, Mdn = 640, SD = 681). In addition, most participants had other professional aviation experience, including as fixed wing and helicopter pilot, instructor, and safety manager.
Prior to the experiment, each participant received research subject information, describing the purpose and conduct of the study, and their rights as participants, and they signed an informed consent form. Participation was voluntary, without compensation being paid. This study received ethical approval by the NLR Committee on Human Research (NLR Commissie Mensgebonden Onderzoek) with certificate number CMO2025012.
2.5. Questions and Conduct of the Experiment
The experiments were conducted via an on-line video conferencing system. Each experiment was joined by a participant, an experiment leader, who explained the purpose of the study and the DAA displays and asked questions, and a note keeper, who noted the answers by the participant. In support of the data gathering, the sessions were recorded. The introductions, videos and questions were all displayed on the screen. The experiments lasted about 70 to 90 min, depending on the detail of explanations provided by the participant. The overall structure of each session was as follows:
Introduction. Explanation of the purpose of the study, being to evaluate two types of DAA systems, namely RWC avoidance bands guidance and RWC route guidance. No information on the background of the systems was given. Explanation of the encounter scenarios in the videos, such as horizontal manoeuvring only and pilot response in 1 s. Explanation of the types of questions during several stages in each encounter.
DAA system 1. Explanation of the display. Questions as explained below along Videos V1–V4 for ACAS Xu, or Videos V5–V8 for A* DAA.
DAA system 2. Explanation of the display. Questions as explained below along Videos V1–V4 for ACAS Xu, or Videos V5–V8 for A* DAA.
Conclusion. Question about overall preference of DAA system and remaining remarks.
The order of the DAA systems in the experiment was randomly distributed, such that DAA system 1 was ACAS Xu for five participants and A* DAA for four participants. Furthermore, it can be noted that the 90 deg encounter was always presented before the 45 deg encounter (which leads to a livelock for ACAS Xu), while the order of the track-up versus north-up displays was alternated.
The questions in the experiments were organised using the following levels:
Overall preference of DAA system;
DAA systems: RWC avoidance bands guidance, or RWC route guidance;
Display orientations: track-up, or north-up;
Encounter geometries: 90 degrees, or 45 degrees;
Phases: three or four phases per encounter scenario.
The questions asked for the topics at the various levels of the experiment are listed in
Table 2. The answering scales are defined in
Table 3. The situation awareness question requested the participant to choose between six figures that represent the tracks of both aircraft for the case that was shown (see example in
Figure 5). Here, one of the figures correctly represented the tracks of both aircraft, while the others were false variations. For all questions the participants could explain their choice, either by their own initiative or on request of the facilitator.
2.6. Data Processing
The answers for the topics in
Table 2 with a 5-point Likert scale (e.g., Transparency, Competence) were coded as 1 to 5 (1 = Strongly disagree/Very unclear, 5 = Strongly agree/Very clear). For the Trust topic, participants could select multiple positive statements (items 1, 2, 4, and 5) or negative statements (items 3 and 6), and each of them was coded as 1 or −1, respectively. A value of 0 was attached to option 7 (“I don’t agree with any of these”). For each participant, a total Trust score was obtained by summation of the item valuations (scale −2 to 4), and total scores for Risk perception and Competence were obtained as the average of the valuations for their three items. For the Transparency and Pilot Manoeuvring questions that were asked after each phase (Level 5), averages of the ratings over all phases were determined to obtain a score per video for each participant.
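The coding rules above can be summarised in a brief sketch. The item numbers follow the description in the text, whereas the function names and the example selections and ratings are hypothetical.

```python
import statistics

POSITIVE_ITEMS = {1, 2, 4, 5}  # positive Trust statements, coded as +1
NEGATIVE_ITEMS = {3, 6}        # negative Trust statements, coded as -1


def trust_score(selected_items):
    """Sum of item valuations; option 7 (no agreement) contributes 0."""
    score = 0
    for item in selected_items:
        if item in POSITIVE_ITEMS:
            score += 1
        elif item in NEGATIVE_ITEMS:
            score -= 1
    return score


def per_video_score(phase_ratings):
    """Average of the 1-to-5 ratings given after each phase of one video."""
    return statistics.mean(phase_ratings)


example_trust = trust_score({1, 2, 3})       # two positive items, one negative
example_video = per_video_score([4, 5, 3])   # hypothetical ratings for three phases
```

Selecting all positive items gives the maximum score of 4, and selecting only the two negative items gives the minimum of −2, matching the scale stated in the text.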
The presentation of the results is mainly done using the average and sample standard deviation (SD) of the ratings over the group of participants. These results are presented on a scale of 0% to 100%, such that, e.g., an average of 0% means that all participants strongly disagree and 100% means that all strongly agree.
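A minimal sketch of this rescaling is given below. It assumes a linear mapping from the 1-to-5 coding to 0% to 100%, which is consistent with the end points stated above but is not spelled out in the text; the ratings are hypothetical example data.

```python
import statistics


def likert_to_percent(rating, lo=1, hi=5):
    """Linearly map a rating on [lo, hi] to a percentage on [0, 100]."""
    return 100.0 * (rating - lo) / (hi - lo)


ratings = [5, 4, 3, 5, 4, 2, 5, 4, 4]  # hypothetical ratings of nine participants
percents = [likert_to_percent(r) for r in ratings]
mean_pct = statistics.mean(percents)
sd_pct = statistics.stdev(percents)  # sample standard deviation
```

Under this assumption a neutral rating of 3 maps to 50%.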
For all of the topics, we ran a series of confirmatory statistical tests for evaluating the differences in scores between the two DAA systems, as well as between the two groups based on the order of presentation. All data were tested for normality using a Shapiro–Wilk test, and for the variables which deviated from a normal distribution, non-parametric tests were chosen. Specifically, for the normally distributed data we used Student’s t-test for both the independent and paired-sample tests, with Cohen’s delta (reported below as d) as the effect size measure. For the non-parametric data, we used the Mann–Whitney U-test and Wilcoxon’s signed-rank test as the independent-sample and paired-sample options, respectively, together with the rank-biserial correlation (reported as rrb) as the effect size measure.
The effect size metrics were used to quantify the magnitude of the difference between the score means and indicate in practical terms how large the difference between the groups was. Here, for both Cohen’s delta and the rank-biserial correlation, threshold values are used to classify an effect as small, medium, or large. The sensitivity analysis for nine participants (with a conventional significance level of α = 0.05 and 95% confidence, and a statistical power of 80%) yielded a minimum detectable effect size (MDE) of 0.85 for paired-samples testing and 1.86 for independent-samples testing. All findings in the following section are interpreted through these values.
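For readers who want to see how such paired effect sizes can be computed, a minimal sketch follows. The exact formula variants used in this study are not stated in the text, so the sketch assumes two common definitions: Cohen's d for paired samples as the mean difference divided by the sample standard deviation of the differences, and the matched-pairs rank-biserial correlation as the difference between the positive and negative signed-rank sums divided by their total. The function names are hypothetical.

```python
import statistics


def paired_cohens_d(a, b):
    """Mean of the paired differences divided by their sample standard deviation."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


def matched_pairs_rank_biserial(a, b):
    """(Sum of positive ranks - sum of negative ranks) / total rank sum."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # tied values share their average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    positive = sum(r for r, d in zip(ranks, ordered) if d > 0)
    negative = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return (positive - negative) / (positive + negative)


d_example = paired_cohens_d([2, 4, 6], [1, 2, 3])
rrb_example = matched_pairs_rank_biserial([1, 2, 3], [2, 1, 5])
```

Both functions are undefined for degenerate input (identical differences for d, or no non-zero differences for the rank-biserial correlation), which a production analysis would need to handle explicitly.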
For each answer, participants could elaborate on their evaluations, sometimes prompted by the experimenter and sometimes providing a longer explanation on their own. A qualitative data analysis was performed for the free text explanations that the participants provided. In total, 233 free text entries were collected (137 for ACAS Xu and 96 for A* DAA). The free text entries were coded using Iterative Categorization (IC, also referred to as iterative coding) [
34], which resulted in 29 codes for generalised expressions concerning transparency, pilot behaviour, orientation preference, and risk perception, competence and trust.
3. Results
3.1. Pilot Manoeuvring
After each video clip, the participants’ agreement with the turns made by the pilot model given the provided guidance was evaluated (Level 5,
Table 2).
Figure 6 provides an overview of the mean and SD of the level of agreement for each of the phases per video (see
Table 1) of the encounter scenarios with 45 or 90 degrees relative heading and the track-up/north-up displays. The frequency of general explanations associated with pilot manoeuvring is provided by the free text coding results in
Table 4.
In the first phase of the ACAS Xu scenarios, the participants expressed mixed responses, with some being neutral or somewhat disagreeing, while others were in full agreement with the initiated left turn, leading to agreement levels of 58 to 75% with considerable SD. Participants disagreeing often indicated that the aircraft should rather turn right to pass behind the intruder. In the second phase, when the aircraft turned right again following the receding RWC band, there was, on average, slightly more agreement in the 90 deg encounter and similar agreement in the 45 deg encounter. In the last phases (3 and 4), when the aircraft makes multiple turns and either passes or does not pass the intruder, the mean agreement decreased to the lowest level of 42%. The large SD signifies that opinions differed widely, with some participants strongly disagreeing with the manoeuvring, stating that the aircraft should have turned right early on and that it should not keep going back and forth, while others found it quite acceptable and were fine with the aircraft ending up flying about parallel to the intruder.
The agreement with the turns made by the pilot for the A* DAA is considerably more positive as can be seen in
Figure 6. The lowest mean agreement level is at 75% for the 90 deg track-up case, which is the first video of the series. For later videos the mean agreement is at higher levels, which might be a learning effect. Some participants indicated that the initial turn could have been better done over a larger angle, while one participant advised a full 360-degree turn to let the intruder pass by. In the later phases 2 and 3, the agreement with the manoeuvring increased to high levels and the SD declined, since almost all participants completely agreed with the turns made.
The free text coding results in
Table 4 are in line with the above observations. For the ACAS Xu scenarios, participants were much more likely to disagree with the given DAA advice, to feel that the system was in an unsafe state, or to comment negatively on the action taken. Conversely, participants were twice as likely to express agreement with the chosen action for the A* DAA system.
We conducted two types of statistical analyses: paired-sample tests for comparing the ratings between the two systems, and independent-sample tests for checking the effect of the system order. From the paired-test comparison between ACAS Xu and A*, we found a very clear agreement preference for the A* DAA system, with the ACAS Xu system showing significantly lower scores of pilot model agreement across all four encounter scenarios, i.e., case 1: 90 deg track-up (p < 0.001, d = −1.6, 95% CI [−2.42, −0.72]), case 2: 90 deg north-up (p = 0.038, rrb = −0.786, 95% CI [−1.22, −0.01]), case 3: 45 deg north-up (p = 0.004, d = −1.194, 95% CI [−1.9, −0.44]), and case 4: 45 deg track-up (p = 0.003, d = −1.254, 95% CI [−1.97, −0.48]). Of these, only case 2 showed a large but insufficiently powered effect size; in all other cases the effects were large and sufficiently powered.
The order of presentation of the DAA systems had no effect on the agreement with the pilot model: the independent-sample tests showed no significant differences for either ACAS Xu or A*.
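To make the two reported effect-size measures concrete, the following minimal Python sketch computes Cohen's d for paired ratings and the matched-pairs rank-biserial correlation (rrb). It is an illustration only: the function names and the example ratings are hypothetical, the ratings are not the study data, and standardising d by the SD of the paired differences is an assumption, since the exact variant used in the analysis is not restated here.

```python
from statistics import mean, stdev


def cohens_d_paired(a, b):
    """Cohen's d for paired samples: mean difference over SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / stdev(diffs)


def rank_biserial_paired(a, b):
    """Matched-pairs rank-biserial correlation based on signed ranks."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    if not diffs:
        return 0.0
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (pos - neg) / (pos + neg)


# Hypothetical ratings of nine participants (NOT the study data).
acas_xu = [2, 3, 3, 4, 2, 3, 1, 4, 3]
a_star = [4, 5, 4, 5, 4, 4, 3, 5, 5]
print(cohens_d_paired(acas_xu, a_star))       # negative: ACAS Xu rated lower
print(rank_biserial_paired(acas_xu, a_star))  # -1.0: every difference is negative
```

In this sketch, rrb = −1 arises when every non-zero paired difference points in the same direction, which is how such a value can be read in the comparisons between the two systems.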
3.2. Transparency
The evaluation of the systems’ perceived transparency was done at two levels. After every single video clip, participants were asked whether they found it clear how the guidance of the system was achieved (Level 5 in Table 2), and after each complete video, they were asked whether the RWC guidance was clearly communicated and how easy it was to understand how it supports safe passage of the intruder (Level 4 in Table 2).
The results for Level 5 are shown in Figure 7. It follows that the participants mostly expressed that they found it very clear how the guidance was achieved. The lowest overall score was attained for ACAS Xu case 1, which is the only case with a significantly lower rating, albeit with an insufficiently powered effect (Wilcoxon’s signed-rank test: p = 0.029, rrb = −0.778, 95% CI [−1.33, −0.08]). In later cases higher ratings were achieved, which may reflect a learning effect.
The results for Level 4 are shown in Figure 8. The participants mostly found that the RWC guidance was communicated clearly. Figure 8a shows that the ratings were systematically higher for the A* DAA system, but a significant difference between the DAA systems was seen again only in case 1 with a very large effect size (Wilcoxon’s signed-rank test: p = 0.047, rrb = −1, 95% CI [−1.38, −0.12]). Concerning the understanding of how the DAA system supports safe passage, the results in Figure 8b show that there are large differences. While the participants mostly found this clear for the route guidance by A* DAA, the mean ratings and SD for ACAS Xu indicate that it was overall not clear and that opinions differed. The lowest ratings were achieved for cases 3 and 4, where the aircraft ended up flying about parallel to the intruder. Wilcoxon’s signed-rank tests found significant differences with large effect sizes for case 1 (p = 0.009, rrb = −1, 95% CI [−2.18, −0.59]) and case 3 (p = 0.024, rrb = −1, 95% CI [−1.68, −0.31]), while case 4 showed a borderline but ultimately insignificant difference (p = 0.052, rrb = −0.867, 95% CI [−1.29, −0.06]).
We observed a small effect of the DAA systems’ order on the ratings. We measured the differences with a series of independent-sample t-tests comparing the participants that experienced ACAS Xu first with those that experienced the A* DAA system first. Participants that first experienced ACAS Xu gave higher ratings for the ACAS Xu system on a few questions than participants that first saw A* DAA. For participants that experienced A* DAA first, almost no significant differences were found.
An overview of the types of explanations related to the transparency questions is provided in Table 5. For ACAS Xu, the most frequent types of comments concern issues with the display, the provided advice, and the partial understanding of the system. In particular, participants indicated that the frequent updating of the RWC bands gives a busy and distracting view, where it is not always possible to maintain a good mental picture of the traffic situation. Also, participants indicated that they did not always understand or disagreed with the location of the RWC bands or thought that they should have stayed in a particular direction for a longer time. Positively, it was indicated that the RWC bands as such are displayed clearly.
For A* DAA, the most frequent type of comments concerns the good display. Participants appreciated seeing the predicted route and closest distance to the intruder, and they found the advised heading in green rather more comfortable than the yellow RWC bands. Critical comments included the information load, with a lot of lines and guidance at the same time, and the lack of an indication of the time to reach the closest point of approach.
3.3. Situation Awareness
To evaluate the situation awareness of participants, after each of the eight videos (Level 4) we presented participants with six figures of drone movements and asked them to select the correct one for the scenario they just encountered (see example in Figure 5). The percentages of correct selections for each of these cases are shown in Table 6. Overall, for both systems the participants selected the correct graph in about 80% of the cases.
We did not find any statistically significant differences on any of the levels of comparison. Specifically, we conducted a series of Wilcoxon paired-sample tests to compare the correctness scores on three levels:
Display orientation (system-specific). We compared between the scores in the track-up versus north-up cases separately for the two systems, both at the individual video level and at average system level. No significant differences were found.
Encounter geometry (system-specific). We compared the scores in the 90-degree versus 45-degree cases separately for the two systems, both at the individual video level and at average system level. No significant differences were found.
DAA systems. This evaluation looked at the differences between the DAA systems on multiple levels. We compared both across the encounter geometry and display orientation, as well as the average score for both systems. No significant differences were found.
3.4. Display Orientation
Following two videos each with the same encounter angle and DAA system, the participants were asked about their preference for the display orientation (Level 3 in
Table 2). The mean scores for this evaluation are listed in
Table 7. Overall, participants’ preferences were evenly split, and we found no statistically significant differences. In the ACAS Xu scenarios, there was a slight preference for the north-up orientation, and the distribution was the same between the first (90-degree encounter) and second (45-degree encounter) evaluation of the orientations. In the A* DAA scenarios, no preference was observed, but there were some changes in preference between the 90 and 45-degree cases. In particular, two participants changed their evaluation after the 45-degree case, switching from north-up or no preference to track-up.
An overview of the explanations by the participants for the preferences on display orientation is provided in Table 8. All comments were relatively balanced across the two DAA systems, with the exception of the comment “More consistent/Easier to use for orientation”, which was used more frequently for the ACAS Xu system, albeit not with a clear display preference. Of the five mentions in the ACAS Xu case, three came with a north-up preference and two with a track-up preference. It appears that the display orientation, while overall not statistically significant, made more of a difference and was more relevant for the ACAS Xu bands.
3.5. Risk Perception, Competence and Trust
The evaluation of the participants’ rating of the systems’ perceived risk, competence, and trust was done once per DAA system after all videos had been shown (Level 2), using the questions listed in
Table 2 and the scoring explained in
Section 2.6.
Figure 9 shows the mean scores and SD for these three topics, while
Table 9 shows associated general explanations as put forward by the participants.
The risk perception results show that for both DAA systems, overall, the participants tend to realise that there is always some risk and that pilots must be cautious when applying the RWC guidance, as they are ultimately responsible for flight safety. The large SD indicates that opinions differed, though. The perceived risk showed a borderline higher value for ACAS Xu with a medium-sized but underpowered effect (Student’s paired t-test: p = 0.049, d = 0.626, 95% CI [0.005, 1.21]).
The participants expressed varying opinions about the competence of the ACAS Xu RWC functionality. Some answered negatively, indicating poor advice that did not lead to lasting resolution. Others answered more positively, indicating that relevant information for collision avoidance is provided. The competence score for A* DAA was significantly higher with a very large effect size (Wilcoxon’s signed-rank test: p = 0.011, rrb = −1, 95% CI [−1.98, −0.48]), and the participants were more in agreement. Participants indicated that they found it to be a very clear guidance system. It was also indicated that it could be useful to show the speed of both aircraft and a frame for the time until the closest distance.
The large SD for the trust score of ACAS Xu likewise indicates that the participants differed in opinion, with some of them stating that they did not feel safe and did not trust its recommendations, while others found it a safe, reliable system that can be trusted. Overall, a score slightly above medium is attained. For the A* DAA system, the participants were much more aligned in their finding that it is a safe and reliable system that provides suitable guidance. This resulted in a high overall score, which is significantly higher than for ACAS Xu with a large but insufficiently powered effect size (Wilcoxon’s signed-rank test: p = 0.033, rrb = −0.786, 95% CI [−1.25, −0.03]).
The order of presentation of the DAA systems had no effect on the evaluation of competence or trust: the independent-sample tests showed no significant differences for either ACAS Xu or A* DAA.
However, for the risk perception ratings we observed a clear effect of the order of the DAA systems. We measured the differences with an independent-sample t-test, and we compared the scores between the participants that experienced ACAS Xu first vs. the ones that had the A* DAA system first.
Participants that first experienced ACAS Xu had a lower risk evaluation of ACAS Xu compared to the other group (Student’s independent-sample t-test: p = 0.036, d = −1.416, 95% CI [−2.71, 0.01]).
For A* DAA the result was the inverse: the participants that saw A* DAA as the second system had a lower risk perception of it than those that saw it first (Student’s independent-sample t-test: p = 0.019, d = −1.706, 95% CI [−3.09, −0.17]).
While both comparisons showed a large effect size, as the two groups consisted of five and four participants, respectively, the Cohen’s d statistic did not meet the MDE limit for sufficient power, making the nature of the findings more explorative.
As the risk evaluation was always more favourable in the A* case, we believe that this order effect reflects a different frame of reference. Participants who saw ACAS Xu after A* DAA evaluated ACAS Xu as riskier than participants who evaluated ACAS Xu as the first system. Conversely, participants who saw A* DAA first rated its risk higher (despite it being overall significantly lower than that of ACAS Xu) than participants who saw it after ACAS Xu: the latter group had seen the “riskier” system first and therefore rated the A* DAA system as even less risky.
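The relation between group size and the smallest effect that can be reliably detected can be illustrated with a short Python sketch. It computes Cohen's d for two independent groups with a pooled SD and a normal-approximation estimate of the minimum detectable effect (MDE). The function names, the significance level of 0.05, and the target power of 0.80 are assumptions made for illustration; the MDE procedure actually used in the analysis is not specified here and may differ.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cohens_d_independent(g1, g2):
    """Cohen's d for two independent groups, using the pooled SD."""
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / sqrt(pooled_var)


def mde_two_groups(n1, n2, alpha=0.05, power=0.80):
    """Approximate smallest |d| detectable by a two-sided two-sample test.

    Normal approximation only; an exact calculation based on the noncentral
    t distribution gives a somewhat larger value for very small groups.
    """
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(1 / n1 + 1 / n2)


# Group sizes of five and four, as in the order-effect comparison.
threshold = mde_two_groups(5, 4)
print(round(threshold, 2))  # about 1.88 under the assumed alpha and power

# Reported effect sizes for the two order comparisons of risk perception.
for d in (-1.416, -1.706):
    print(d, "meets threshold" if abs(d) >= threshold else "below threshold")
```

Under these assumptions both reported effect sizes fall below the approximate threshold, which is consistent with treating the order effect on risk perception as explorative.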
3.6. DAA System Preference and General Comments
At the end of the experiment (Level 1), we asked participants whether they had an explicit preference for one DAA system over the other. We found (Table 10) that 8 out of 9 had an explicit preference for the A* DAA system, and only one participant had no preference, stating that it depends on the type of operation. The overall preference is in line with the more positive scores for the A* DAA presented earlier.
Finally, the following remaining remarks and advice were provided:
Advice to display headings in 3-digit format, e.g., 090 instead of 90;
Advice to apply larger separation distances using larger turns;
Advice to display the compass in a less heavy way on screen, as it requires too much attention;
Advice to include an RWC boundary around the intruder aircraft;
Advice to display a vector rather than a long line in the route guidance;
Advice to display the ownship moving across the map instead of being fixed in the middle;
Advice to consider wake turbulence in the situation when ownship is passing behind the intruder without changing altitude, especially if the type of the aircraft is not known;
Advice to show the intruder’s air speed to help the pilot make more informed decisions;
Remark about the preference for the route guidance system, discussing that it can be an added layer on top of the bands guidance to support transparency;
Remark about the importance of DAA systems, and that the ability to see how the system can calculate a trajectory or predict near collision is helpful;
Advice to look at procedures for manned aircraft overtaking;
Advice to display Protected Track Lines (PTL), showing protected areas of each aircraft.
4. Discussion
4.1. Human Factors in DAA Systems
In this study, two approaches for DAA horizontal RWC guidance were evaluated using structured feedback by professional remote pilots: ACAS Xu bands guidance versus A* DAA route guidance. The RWC bands guidance can be regarded as a decision support system (LOA-1), which supports the RP by advising against particular headings so as to remain well clear. It is based on the ACAS Xu standard [4], which uses state-based estimates, a look-up table and a roll-out approach. The route guidance system is a resolution support system (LOA-2), which supports the RP by advising a specific route to remain well clear. It is based on recent research applying intent and coordination for conflict resolution using A* path planning [21]. The results show that significantly higher ratings were achieved for the route guidance approach in the competence and trust indicators, as well as for the agreement with the provided guidance. Overall, 8 out of 9 participants expressed a preference for the route guidance approach, while one participant had no preference.
Regarding the key topics for AI in human-centric operations [25], the evaluation results for competence and trust are significantly higher (p < 0.05) for the A* DAA route guidance, whereas its risk perception is significantly lower (p < 0.05). So, while the A* DAA is an LOA-2 resolution support system where the level of human agency may be considered lower than for the LOA-1 decision support ACAS Xu system, these human perspectives are more positive for the higher LOA. A key advantage of A* DAA, as recognised by the participants, is that it provides a stable overview of the predicted routes and closest distance of the aircraft, whereas the RWC bands of ACAS Xu change frequently and do not convey how the conflict will be resolved.
These findings can be linked with the transparency results. For both systems the participants found it mostly clear how the guidance was achieved and communicated. The RWC bands and RWC route guidance often made sense to them for the traffic conditions, and the displays showed the guidance clearly. In spite of this transparency, it was often not easy to understand how the RWC bands supported safe passage of the intruder. This is a direct consequence of the design choice to only show the headings to momentarily avoid the nearby traffic, rather than to structurally resolve the conflict. In contrast, the participants found that the RWC route guidance provides transparency for the conflict resolution. The displayed RWC routes and the CPA line supported the RPs in building a stable mental model of the traffic situation.
4.2. Pilot Manoeuvring and Implications of Pilot Modelling for DAA Optimisation
The participants mostly agreed with the aircraft manoeuvring for the route guidance system, but the agreement with the manoeuvring for the bands guidance system was more restricted. For the latter, some participants agreed with the turns by the pilot, as the ownship moved consistently away from the intruder and remained at a sufficient distance, but other participants recognised that the turns adopted by the pilot were not supporting effective passing of the intruder. In other words, there was only limited support for the pilot model closely following the nearest edge of the RWC bands as used in the applied encounters. The critical participants used knowledge of the intended routes and indicated that the ownship should rather turn right towards the intruder at an early stage, thereby turning over a larger angle in the active bands. Whereas such turning to the farther edge of the bands is allowed, it is not the most intuitive response, and it requires additional knowledge on the encounter. As such, the provided bands may lead to a poor manoeuvre choice by the pilot, which does not support efficient passing of an intruder.
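The difference between turning to the nearest and to the farther edge of an active band can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions and not the pilot model or the ACAS Xu logic used in the study: a single band is represented by hypothetical start and end headings in degrees, it is assumed to contain the current heading, and the function names and example values are invented for this sketch.

```python
def turn_options(heading, band_start, band_end):
    """Return (left_turn, right_turn) in degrees needed to leave a band.

    The band covers headings clockwise from band_start to band_end and is
    assumed to contain the current heading.
    """
    left_turn = (heading - band_start) % 360  # counter-clockwise to the left edge
    right_turn = (band_end - heading) % 360   # clockwise to the right edge
    return left_turn, right_turn


def nearest_edge_heading(heading, band_start, band_end):
    """Pilot model that always takes the smaller turn out of the band."""
    left_turn, right_turn = turn_options(heading, band_start, band_end)
    return band_start if left_turn <= right_turn else band_end


def farther_edge_heading(heading, band_start, band_end):
    """Alternative response that takes the larger turn out of the band."""
    left_turn, right_turn = turn_options(heading, band_start, band_end)
    return band_end if left_turn <= right_turn else band_start


# Hypothetical example: ownship heading 000 with a band from 340 to 060.
print(nearest_edge_heading(0, 340, 60))  # 340, a 20-degree left turn
print(farther_edge_heading(0, 340, 60))  # 60, a 60-degree right turn
```

The two functions see the same band but return opposite turn directions, which illustrates why band guidance alone leaves the pilot response, and hence the achieved trajectory, uncertain.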
Such possible suboptimal performance given the RWC bands of ACAS Xu leads to the question of how the system was optimised and validated. ACAS Xu was developed in the ACAS X programme, and as a basis it uses the POMDP and dynamic programming approach that was developed for the design of ACAS Xa [
5]. ACAS Xa provides resolution advisories (RAs) in the vertical plane to pilots of manned aircraft, which are similar to the RAs of TCAS II v7.1 (the existing ACAS in commercial air transport). These RAs present singular solutions to the pilots of the aircraft in an encounter, and they have been coordinated between the systems of the aircraft, e.g., Climb for one and Descend for the other. It means that both TCAS II and ACAS Xa can be regarded as resolution support systems (LOA-2). Such specific resolution support implies that the pilot model used in optimisation and validation can be relatively straightforward, basically describing whether and how the pilot lets the aircraft climb/descend at particular rates and following some delay [
16,
18,
19]. It also means that the level of uncertainty induced by the pilot model in the feedback to design and the overall optimised ACAS Xa is restricted. This is very different for the design and validation of ACAS Xu. As it provides RAs as well as RWC guidance, which are both in the horizontal and vertical planes, the manoeuvring response by the remote pilot requires various decisions and is thus much more uncertain. For instance, the results achieved in this paper make clear that a turn towards the closest edge of the RWC bands may not be the best choice. The maintained size of the solution space in the guidance of the decision support system (LOA-1) and the associated variability in remote pilot responses imply that there is considerable uncertainty in the remote pilot models that are used in the feedback to design and the validation of ACAS Xu. This dependence on model-based evaluation also means that the performance of the resulting system can be suboptimal in various situations and that achieved validation metrics like probability of LDWC and NMAC incorporate high uncertainty levels.
For the A* DAA route guidance system, the participants largely agreed with turns made by the pilot, reflecting little uncertainty in the pilot model for the considered encounters. This agreement was supported by the effectiveness of the advised routes, which provided transparent ways to effectively pass the intruder and which were in line with the reasoning of some of the participants to turn right towards the intruder sufficiently early. Furthermore, since the A* DAA system is a resolution support system (LOA-2) that provides one specific solution, there remains little uncertainty in the impact of the guidance on the aircraft manoeuvring. Given that the pilot agrees with the suggested route, there could be some delay and slight variation with respect to the advised route, but overall, the uncertainty in the achieved trajectory would be limited. This means that a pilot model for responding to the route guidance can be relatively straightforward, describing some variation with respect to a specific solution. Also, an automated response (possibly following authorisation by the remote pilot) might be modelled straightforwardly, implying a very small difference between the achieved and advised routes. Overall, the restricted uncertainty in the pilot response model that is possible for an effective resolution support system means that valid evaluation results are easier to obtain, such that the optimisation of the system can be managed more effectively than for a decision support system with more uncertainty in the pilot response model.
4.3. Interaction with Air Traffic Control
It is considered that the second protection layer in ATM regarding separation provision can be implemented by ATC and/or by RWC functionality of the DAA sociotechnical system, typically requiring coordination between ATC and RP [
1,
2]. It can be argued that a DAA RWC functionality that uses route guidance rather than avoidance bands is more in line with the way that ATC is implemented. In particular, given a potential conflict, an air traffic controller (ATCO) would decide on a specific resolution and instruct the involved aircraft accordingly. The A* DAA approach works similarly by determining a coordinated resolution that involves all aircraft in the encounter and advising the RPs accordingly. If there were datalink communication between the DAA systems and the ATC centre, the found resolution could be shared with an ATCO, either for information or for authorisation. In this way, a combination of ATC and DAA RWC may be effectively supported.
4.4. Limitations
The results in this study were achieved using structured feedback by nine professional RPs on videos of two types of DAA system displays for two encounter geometries. The results of the statistical analyses, including confirmatory statistical tests and effect size evaluation, showed significant differences between evaluation of the two systems for various constructs. Nevertheless, the approach of this study has some limitations which temper the universality of the findings.
A limited number of nine RPs with a heterogeneous background (e.g., 30–2000 flight hours, civil vs. military, various mission domains) participated in this study. While our analysis showed that this number provided sufficient statistical power to detect meaningful usability issues in most of the metrics we used, it was not sufficient to analyse possible differences between subgroups of RPs.
This study used only two encounter geometries. These conflicts were effectively resolved by following the A* DAA guidance, but to a lesser extent by following the ACAS Xu guidance. However, in other encounter scenarios, like larger encounter angles or sufficient differences in speed, following the ACAS Xu guidance can be more effective, and the difference in appreciation between the systems may be more restricted.
The participants were not able to interact with the DAA systems in the experiments. This lack of interaction may have been a larger disadvantage for the appreciation of ACAS Xu than A* DAA, since in the latter the pilot only has to accept and manoeuvre according to the suggested route, while the decision support guidance of ACAS Xu offers more room for interaction. Specifically, as discussed above, some participants indicated that they would manoeuvre to the farther edge of the RWC bands, which might lead to a more efficient resolution (provided that the turn is made in time).
The encounters only considered straight initial routes of two aircraft to a waypoint. Scenarios with more complex routes, without fixed waypoints, or involving more than two aircraft were not considered. While simulations in [
21] showed three-aircraft encounters that were effectively resolved by A* DAA but not by ACAS Xu, there is a lack of knowledge on the performance of the DAA systems in more complex and dynamic environments.
4.5. Recommendations for Future Studies
Building on the above considerations, several recommendations for future research are provided.
To extend the design and analysis of the A* DAA system to more complex traffic scenarios.
To apply HITL simulation where RPs can interact with the DAA systems by manoeuvring their ownship in a variety of encounter scenarios with either of the systems. The choice of the encounter scenarios should be sufficiently extensive, including potentially critical cases. Agent-based simulations of the UAS sociotechnical system can support the identification of such a set of encounter scenarios [
17]. As part of such HITL simulations, the trust, competence, and risk perception questions can be asked and provide a broader perspective for comparing RWC band versus RWC route guidance.
To evaluate the robustness of conflict resolution by DAA systems guidance for a large variety of pilot response strategies and behaviour, using agent-based simulation and HITL simulation.
To investigate information sharing between DAA and ATC systems to support effective separation provision and to conduct broad-scope HITL simulation involving both RPs and ATCOs in the conduct of encounter scenarios.