1. Introduction
Despite the increasing availability and affordability of technologies that enable general aviation (GA) pilots to receive weather information in the cockpit [1,2], weather remains a primary factor contributing to many fatal GA accidents in the US and abroad [3]. Weather conditions can change unexpectedly, and most GA weather-related accidents occur when pilots inadvertently transition from visual flight rules (VFR) into instrument meteorological conditions (IMC) [3], which can result in spatial disorientation and increased risk of losing aircraft control [4]. There are a number of potential reasons why increased access to weather information fails to improve weather-related flight decision-making.
One potential reason for the continued problems in processing weather may be that these technologies introduce too much information, or at least too much in a particular format [2]. “Data overload” problems such as these are often the target of human factors research [5,6], describing situations in which information processing demands exceed the available cognitive resources. Additionally, pilots may not be aware of the availability of new weather information, having failed to be alerted via the technologies; they could receive but misinterpret the information, or they could process the information fully yet not apply it appropriately to decisions regarding the flight plan [1,3,7]. Whether the systems fail because there is “too much information” displayed and the most immediately relevant information is hidden amongst less-relevant clutter [2] because of human attentional limitations, or because of factors more relevant to human decision making, there is a need to understand how pilots currently use these technologies, as well as other weather information sources in the cockpit, and how they integrate weather information seeking and decision making into their workflow under various flight contexts.
Per regulations, GA pilots should obtain and process weather information during preflight planning before take-off [8]. However, because of the proliferation of in-cockpit technologies that can provide weather information in-flight, some pilots may instead wait until after taking off to thoroughly assess weather conditions and, if necessary, adapt their flight plan accordingly [9].
While official weather reports have historically been communicated auditorily over radio (broadcast on a particular radio channel or relayed via air traffic control (ATC) communication), the ability to display this information in a graphical interface offers new advantages. Regions of relevance can be more easily identified when linked to a dynamic map, and the weather messages can also be relayed in a persistent text format, which allows them to be read (and reviewed) whenever it is convenient for the pilot’s workflow.
Current weather technologies include integrated cockpit systems and, increasingly, portable devices that can be carried into the cockpit. Common equipment such as Automatic Dependent Surveillance-Broadcast (ADS-B) receivers, GPS systems, and commercial smartphone or tablet apps such as Foreflight or Garmin Pilot are capable of receiving and displaying weather-related information to pilots in-flight [10]. These devices compile official aviation weather service reports from the National Weather Service (NWS), Federal Aviation Administration (FAA), and the Department of Defense (DoD) [11]—see examples of these reports in Table 1—and some can also integrate other less formal sources of weather information (such as Doppler Weather Radar).
Given that current cockpit technologies support communicating weather messages in multiple media formats (e.g., visual and/or auditory), there is an opportunity to further support pilots’ interruption management by spreading some aspects of the messaging to information processing channels, such as the tactile channel, that are more “available” to the pilot given the demands of concurrent flight tasks [12]. For example, pilots can listen to auditory representations of the messages when the visual demands of flight tasks are high, and/or can display a message as visual text, which can often be faster to process and allows the message to be persistent (rather than transient, as in auditory communication). Persistent visual displays can be sampled as part of pilots’ scan patterns, reducing the demand on working memory resources by providing an easy means of referencing specific details in the message.
While in flight, pilots can at times be challenged to fit the tasks of monitoring routine weather reports (e.g., Aerodrome Routine Meteorological Reports (METARs)) and processing any unscheduled weather messages (e.g., Significant Meteorological Information (SIGMET)) into their workflow while concurrently performing other flight-related tasks. Under normal flight conditions, there is flexibility in when and how pilots attend to these incoming messages. For example, pilots can choose to listen to continuously looped recordings of routine reports by tuning to a particular radio channel and can call ATC to read out or clarify any unscheduled messages. This flexibility supports some degree of pilot “interruption management”, an important concept in aviation human factors [7,13,14] and a topic of research in a variety of work domains that require humans to divide attention and information-processing resources among multiple tasks that overlap in time [15,16,17,18].
Especially when adverse weather develops, a pilot’s cognitive resources will be in high demand for other flight tasks (e.g., aviating, navigating, and communicating). Therefore, it is critical that weather messages be announced with salient cues that reliably capture attention even when visual and auditory resources are engaged in other flight tasks [19]. However, capturing attention does not always support effective interruption management, as it can result in an automatic reorientation of pilots’ cognitive resources when reorientation may not be warranted. This can be detrimental to overall flight safety if other, more safety-critical ongoing tasks are interrupted. In this way, a bright and colorful (thus highly salient) graphical display of weather information in a prominent cockpit location can inappropriately draw attention and even lead to attentional tunneling, in which pilots attend increasingly to the graphical representation at the cost of normal instrument scanning and outside-the-cockpit visual references, reducing flight situational awareness [20]. Under the higher levels of cognitive workload that pilots can experience in adverse weather, scan patterns are often more rapid and irregular [21,22], which increases the chances for pilot error [23]. Thus, it is important to consider non-visual means of announcing the availability of new weather information so that pilots are aware of the message and can process it fully when flight-related workload and workflow allow.
Aviation, like a number of visually, auditorily, and cognitively demanding domains, can benefit from offloading some of the more heavily demanded resources by redistributing messages so they can be processed by relatively available perceptual and cognitive channels [12,24]. The sense of touch represents an underutilized display channel in the cockpit, and according to human information processing theory [12], engaging this channel can improve multitasking performance when visual and/or auditory resources are in high demand for concurrent tasks [5,24]. Previous research has investigated how these so-called “tactile” displays might be introduced into the cockpit to improve spatial awareness [25,26], improve awareness of cockpit automation behavior [27], and guide attention to visual cockpit displays [28]. An example of a haptic display already integrated into the cockpit is the “stick shaker” that warns pilots of impending stall conditions. Tippey et al. [29] were the first to evaluate how vibratory cues presented via smartwatches could be used to improve GA pilots’ reception of new weather messages, showing better detection of and faster responses to these messages on a visual display when they were announced by a vibratory cue.
One key limitation of the Tippey et al. [29] study was that the vibratory cues were salient—thus effectively capturing attention—but the vibration patterns that mapped to the different types of weather messages were not easily distinguishable, due in part to the hardware limitations of the smartwatch chosen for that study. An interesting finding was that the vibration patterns were less distinguishable when pilots experienced higher cognitive workload related to flight tasks [29]. Pilots thus reliably received weather messages, but because interruption management was not well supported (pilots were not able to make informed decisions about whether and when to shift attention to the incoming message), concurrent task processing could be disrupted and overall flight safety could be negatively impacted [29].
To improve upon the Tippey et al. [29] study, efforts were made in the current study to meaningfully encode the “severity” of an incoming weather message—partial information that can be used to inform whether and when to reorient attention to process the full message—into dimensions of a vibrotactile cue that is highly distinguishable and intuitively interpretable. Previous research has shown how vibrotactile displays can support interruption management by conveying task-relevant information via vibrotactile patterns that require minimal cognitive engagement to interpret [5,16].
Following the design requirements of creating a set of vibratory signals that are maximally distinguishable and identifiable under varied workload, Roady [30] conducted a series of studies with vibratory cues that varied according to signal intensity (low, medium, and high gain), frequency, rhythmicity (straight cadence vs. syncopation), and dynamism (“melodic” vibration patterns that changed constantly with time vs. those with relatively static levels in vibratory display dimensions). A very large set of generated patterns was evaluated in a controlled experiment that manipulated the imposed workload in an aviation-like task environment (NASA’s Multi-Attribute Task Battery [31]), ultimately resulting in the final selection of three patterns that maximized “perceptual distance” and were identifiable as low, moderate, or high severity with high accuracy [30].
The current study applied the vibratory patterns designed in Roady [30] to a considerably more complex flight environment, testing the effectiveness of cues presented via a smartwatch for supporting pilot interruption management in the reception and processing of weather messages. The evaluation context was a flight scenario that imposed a range of workloads from very low to very high as weather and visibility degraded over the course of the flight. Pilots who received the vibratory cues paired with incoming weather messages had the opportunity to infer the severity of the message from the encoded vibratory pattern and to use that information in their decisions about whether and when to reorient attention and information processing resources to process the full message. It was expected that the pilots who received these vibratory cues would be more likely to receive, and faster to acknowledge, the arrival of a message, but that the time to process the full message would depend on the scenario-imposed task load relative to the interpreted severity. For example, a message announced with a vibratory cue conveying “moderate” severity may be processed immediately (or faster) when other flight-related demands are low but may appropriately show longer response times when flight task demands are high.
The findings of this study provide further evidence of the benefits of integrating vibrotactile cues to support multitasking performance and safety in visually and/or auditorily demanding work contexts. Aviation is a domain that has historically welcomed haptic and tactile displays (the “stick shaker” stall warning is a prime example), and the introduction of vibrotactile cues that can be reliably interpreted and differentiated can lead to further improvements in flight safety by better supporting flight management when encountering adverse weather.
2. Experiment
Thirty-six general aviation pilots participated in the study (TAMU IRB approval # 2014-0154D), which took place in a flight training device at the Federal Aviation Administration (FAA) William J. Hughes Technical Center (WJHTC) in Atlantic City, NJ. Participants were at least 18 years old, held an active Private Pilot License (PPL), and had flown in the previous 6 months. The reported mean age for 32 of the participants (4 participants’ biographical data were missing) was 54.2 years old (min = 19 years, max = 80 years, standard deviation = 16.9 years). The pilots had varying levels of flight experience, with a mean of 1102.56 flight hours (min = 100, max = 5500, median = 600, standard deviation = 1253.68 h). They also had a mean of 87.7 instrument flight hours (min = 0, max = 500, median = 20, standard deviation = 136.6 h).
This research investigated the effectiveness of severity-mapped vibratory cues delivered via a smartwatch to improve pilots’ acknowledgement and response to weather messages in a simulated flight scenario. Additionally, situation awareness was assessed via periodic question probes to determine the extent to which cues may have distracted or disrupted concurrent flight-related activities.
2.1. Experimental Variables
The primary independent variable investigated was whether or not vibratory cues accompanied the incoming weather messages. This variable was handled as a between-subjects factor, with participants divided into a “Vibration” group, which received coded vibratory cues with each weather message (WM) arrival (see Section 2.3), and a “No Vibration” group, each with 18 participants.
In this study, pilots’ performance data were collected with regard to reception of weather messages, decision making, and situation awareness. Dependent measures were associated with the presentation of coded weather messages (WMs) (see Section 2.3 for more information on these messages) and Situation Awareness Probes (SAPs).
The variable Acknowledgment Rate (AR) represented the proportion of presented cues that participants “acknowledged”, which was evidenced either by verbal response (e.g., “I see that I have a new weather message”) or by another observed action that followed directly from that message (such as pressing a button to read the message text or calling air traffic control for clarification). The AR variable was calculated separately for WMs and SAPs, coded as Weather Message Acknowledgment Rate (WM.AR) and Situation Awareness Probe Acknowledgment Rate (SAP.AR), respectively. In the small number of cases in which flight-related decisions led to the scenarios ending early (e.g., calling air traffic control (ATC) and requesting changes to the flight plan, such as turning around or diverting to another destination), some late-scenario WMs and SAPs were never issued to the pilots and thus not considered in the AR calculations.
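As an illustration of this exclusion rule, the following minimal sketch (not the study’s actual analysis code; the event-log fields are assumptions) computes an acknowledgment rate only over messages that were actually issued:

```python
# Minimal sketch of an acknowledgment rate (AR) computed from a
# per-participant event log. Messages never issued (e.g., because the
# scenario ended early) are excluded from the denominator, as described
# above. Field names are illustrative assumptions.
from typing import List, Optional

def acknowledgment_rate(events: List[dict]) -> Optional[float]:
    """events: one dict per scripted WM or SAP, e.g.,
    {"issued": True, "acknowledged": True}."""
    issued = [e for e in events if e["issued"]]
    if not issued:
        return None  # nothing to score: scenario ended before any message
    acknowledged = sum(1 for e in issued if e["acknowledged"])
    return acknowledged / len(issued)

# Example: WM4 was never issued because the pilot diverted early.
wm_events = [
    {"issued": True,  "acknowledged": True},   # WM1
    {"issued": True,  "acknowledged": False},  # WM2 (missed)
    {"issued": True,  "acknowledged": True},   # WM3
    {"issued": False, "acknowledged": False},  # WM4 (never issued)
]
print(acknowledgment_rate(wm_events))  # 2/3 ≈ 0.667
```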
Acknowledgment Times (ATs) were also collected for both WMs and SAPs, coded as Weather Message Acknowledgment Time (WM.AT) and Situation Awareness Probe Acknowledgment Time (SAP.AT), respectively. WM.AT was measured as the time between the arrival of the message (whether cued or not) and the first verbal or physical indication of the pilot’s awareness of the message. SAP.AT was measured as the time between the complete delivery of an SAP query (e.g., the final utterance in the request from ATC that represented the probe) and the first verbal or physical indication of the pilot’s acknowledgment of that SAP query. The AT measure is indicative of pilots’ attentional state and of the salience and informativeness of the visual and vibratory cues associated with WMs, as well as of the auditory (radio-based) cues associated with SAPs.
Response Times (RTs) for both WMs and SAPs, coded as Weather Message Response Time (WM.RT) and Situation Awareness Probe Response Time (SAP.RT), respectively, were measured as the time between the point of acknowledgment and the delivery of the pilot’s full response. The point of “full response” was determined via consensus coding by multiple experimenters and represented when pilots had verbally (via think-aloud protocol; see Section 2.4) or demonstrably (through aircraft interaction) responded to the message. This measure is indicative of pilots’ interruption management abilities, balancing the task load between activities for maintaining safe flight and dedicating resources to processing WMs and SAP queries.
In some cases, pilots never acknowledged one or more weather messages that were presented to them, as indicated in the WM.AR measure. These instances were treated as missing data points and did not factor into the mean calculations of WM.AT and WM.RT. The impact that WM.AR has on the mean WM.AT and mean WM.RT should be kept in mind when interpreting these latter measures.
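To make the handling of missing acknowledgments concrete, here is an illustrative sketch (the timestamp fields are assumptions, not the study’s data format) that derives per-message AT and RT and averages only over acknowledged messages:

```python
# Illustrative sketch of deriving acknowledgment time (AT) and response
# time (RT) from timestamped events, treating unacknowledged messages as
# missing data rather than penalizing them.
import math
from statistics import mean
from typing import List, Tuple

def mean_at_rt(records: List[dict]) -> Tuple[float, float]:
    """records: per-message dicts with keys 't_arrival', 't_ack', and
    't_response' (seconds); 't_ack'/'t_response' are None if the message
    was never acknowledged/responded to."""
    ats = [r["t_ack"] - r["t_arrival"]
           for r in records if r["t_ack"] is not None]
    rts = [r["t_response"] - r["t_ack"]
           for r in records
           if r["t_ack"] is not None and r["t_response"] is not None]
    # Unacknowledged messages simply drop out of both means (missing data),
    # which is why WM.AR should be kept in mind when interpreting WM.AT/WM.RT.
    return (mean(ats) if ats else math.nan,
            mean(rts) if rts else math.nan)
```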
2.2. Flight Environment and Scenario
An FAA WJHTC Flight Training Device (FTD) (see Figure 1a) was configured to perform similarly to a Mooney aircraft, with out-the-window visuals generated using Active Sky Next [32] for PREPAR3D [33]. The simulated scenario was a flight from Santa Fe, New Mexico (KSAF), to Albuquerque, New Mexico (KABQ), developed based on historical National Transportation Safety Board (NTSB) reports of weather-related accidents. The scenario involved mountainous terrain and weather patterns (mountain turbulence and convective activity) that progressively worsened as the pilots approached Albuquerque. Members of the experimental team who were Certified Flight Instructors (CFIs) role-played as ATC (see Figure 1b) and followed a script that precisely timed some communications (such as SAPs) but allowed for improvised responses to any queries from the pilots.
Figure 2 illustrates the intended route as well as terrain and other aeronautical information, and Table 2 summarizes the key scenario events. The flight was cleared for take-off from KSAF under Visual Flight Rules (VFR) with 12 statute miles of visibility, and the weather conditions and visibility progressively worsened during the flight. As the aircraft progressed south, the visibility gradually worsened and Instrument Meteorological Conditions (IMC) were encountered shortly after making a turn westward for the approach to KABQ. This final turn (into IMC) also crossed over the Sandia mountain range, which introduced rising terrain and mountain obscuration that made it extremely challenging to navigate safely, while also making it virtually impossible to turn the aircraft around to escape the hazardous flight environment.
Participants did not have any prior experience with this particular scenario, but they were familiarized with the flight environment and displays through a training scenario set in the eastern United States.
During the flight, pilots received four scripted weather messages (WMs) that varied in severity, delivered at key points in the scenario that imposed different levels of workload on the pilots (see Table 2). For example, WM1 was delivered at a point with good visibility, relatively little weather development, and the autopilot engaged. WM2, WM3, and WM4 were delivered in increasingly higher-workload contexts, with additional workload imposed by autopilot failure, degrading weather and visibility, increased frequency of ATC communications, and the addition of turbulence. Pilots were told that their responses to these messages—including their time to acknowledge, fully process, and act on the messages—would be measures of interest in this study, but that they should keep flight safety as their top priority.
To assess whether the additional weather messages might positively or negatively impact overall flight situation awareness, three SAPs were distributed to occur in low-, moderate-, and high-workload contexts of the flight. These probes inquired about the pilot’s flight plans and intentions, as well as weather, altitude, and position information. Following the Situation Present Assessment Method (SPAM) [34], these probes were relevant to and embedded in the task itself, so that both the accuracy and the timing of the response provide insight into the pilot’s situation awareness at that point.
2.3. Weather Message Displays
Inside the cockpit, terrain and weather information was available on a tablet computer with a proprietary experimental interface developed by AeroTech Research (ATR) [35] to look and function similarly to existing commercial applications, such as Foreflight [36]. This display included an active graphical map with “layers” of information that could be toggled to be displayed or hidden using touchscreen soft buttons in the menu bar at the top of the map (see Figure 3). The map also supported zooming in and out, with concentric lines indicating the map scale and the aircraft’s proximity to various scenario areas of interest.
The tablet display was the primary means by which new weather messages were delivered to the pilots in-flight. Incoming weather messages included those listed in Table 1 but also other communications that may be relevant to weather-related decisions, such as pilot reports (PIREPs). The arrival of a new message was announced by a color change in the associated soft button on the menu bar (see Figure 3b, highlighting an incoming PIREP). The full text of each incoming message (in encoded and/or verbose text formats) was then accessed by pressing the associated button, which opened a pop-up text overlay on top of (and obscuring most of) the map. The message text could be toggled to be hidden or brought back into focus as often as pilots desired for the remainder of the scenario.
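As a rough illustration of this interaction (a hypothetical sketch, not the actual ATR interface code), the arrival announcement and toggle behavior could be modeled as follows:

```python
# Hypothetical sketch of the soft-button behavior described above (not the
# ATR interface code): a new message highlights its button via a color
# change, and each press toggles the pop-up text overlay while clearing
# the highlight.
class WeatherMessageButton:
    def __init__(self, label: str):
        self.label = label            # e.g., "PIREP", "SIGMET"
        self.highlighted = False      # color change announcing a new message
        self.overlay_visible = False  # pop-up text overlay state
        self.message_text = ""

    def on_new_message(self, text: str) -> None:
        self.message_text = text
        self.highlighted = True       # announce arrival via color change

    def on_press(self) -> None:
        self.highlighted = False                         # message accessed
        self.overlay_visible = not self.overlay_visible  # show/hide full text
```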
Each incoming weather message was characterized with “summary” information intended to convey the severity of the weather developments or, alternatively, the urgency with which pilots should process the full message (by accessing and reading the displayed message text). The summary statement was modeled after those found to be beneficial for supporting pilot workload and task management in previous weather technology interaction research [1,9]. These statements typically included the type of weather message and a severity reference (“low”, “moderate”, or “severe”), which pilots could take into account when deciding whether and when to devote attentional resources to accessing and reading the full message while concurrently maintaining safe flight parameters. In addition to the highlighting of the soft buttons on the tablet display, the summary messages were displayed visually on a Samsung Gear S3 smartwatch, which all participants wore on their left wrist (see Figure 4).
For the Vibration (V) participant group, vibratory cues from the smartwatch were also presented to coincide with the arrival of the WM and the summary statement; the No Vibration (NV) participant group had all other display aspects except the vibratory cues. The vibratory patterns lasted 1 s and were encoded to communicate the severity of the WM (“low”, “moderate”, or “severe”) through properties of syncopation, intensity, and duration that were found to be maximally distinguishable and intuitively identifiable under varied workload conditions [30].
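For illustration only—the exact patterns selected in Roady [30] are not reproduced here—a severity-to-pattern mapping of this kind could be expressed as timed (duration, amplitude) segments summing to the 1 s cue length:

```python
# Illustrative sketch only: these are NOT the patterns from Roady [30].
# Each 1 s severity-coded cue is expressed as (duration_ms, amplitude)
# segments, with higher severity using higher intensity and more
# syncopated on/off timing; amplitude 0.0 denotes a pause.
SEVERITY_PATTERNS = {
    "low":      [(1000, 0.3)],                       # steady, gentle
    "moderate": [(250, 0.6), (250, 0.0),
                 (250, 0.6), (250, 0.0)],            # even cadence
    "severe":   [(100, 1.0), (50, 0.0), (300, 1.0),
                 (150, 0.0), (100, 1.0), (300, 0.0)],  # syncopated, intense
}

def play_on_watch(severity: str, vibrate) -> None:
    """vibrate(duration_ms, amplitude) is a stand-in for a platform
    vibration call on a wearable; it is an assumed interface, not a
    specific smartwatch API."""
    for duration_ms, amplitude in SEVERITY_PATTERNS[severity]:
        vibrate(duration_ms, amplitude)
```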
2.4. Procedure
After reviewing and signing the consent form, participants completed a demographics questionnaire covering flight qualifications and experience with mobile and wearable technologies and were given a formal flight briefing by CFIs from the experimental team. The briefing included a modified version of the aeronautical map illustrated in Figure 2 as well as the current visibility and weather conditions (which were supportive of flying under VFR). Participants were then trained in the FTD in a 10-min simplified training scenario set in the eastern United States, which allowed them to practice manually controlling the aircraft, interacting over the radio with ATC, and accessing route and weather information via the tablet display. Participants in the “Vibration” group were also given several example presentations of the vibratory cues, which were repeated until participants demonstrated an ability to determine the severity of incoming messages by correctly interpreting the vibratory cue pattern. Participants demonstrated their understanding and ability to perform the tasks to the experimenters prior to the completion of the FTD training session.
Participants were also trained to provide think-aloud verbal protocol data and practiced this during the training session while piloting the aircraft. This technique provides insight into pilots’ decision-making thought processes and has been used in previous aviation studies [9,37,38,39]. The think-aloud protocol provided the experimenters with insight into when pilots noticed weather message cues and how they used the summary information to determine when to access the full message while concurrently managing other flight demands.
In all cases, participants were instructed to interact in the FTD and make flight-related decisions as if they were in an actual aircraft in a real flight context. In this sense, the pilots’ primary task was always to safely fly the aircraft. Participants were told that performing the think-aloud protocol, as well as attending to scenario events such as communicating with ATC and receiving and reviewing weather messages, were all secondary to flight safety and should only be performed when safety was minimally compromised.
After the training session, participants completed the experimental flight from KSAF to KABQ. The flight scenario lasted about 20–25 min and ended when one of the following conditions was met: (a) pilots requested an alternative flight plan from ATC; (b) via the think-aloud protocol, pilots expressed their clear intent to change the flight plan; (c) pilots flew into the IMC and attempted to land at KABQ; or (d) pilots crashed the aircraft.
4. Discussion
This study builds on previous work that used vibratory notifications to support pilot situation awareness and performance by effectively guiding attention in the cockpit [13,27,29]. The current study investigated the extent to which pilots’ awareness of weather dynamics and management of concurrent flight tasks could be supported when the availability of new weather information is announced via vibratory cues. Furthermore, as a follow-up to Tippey et al. [29], the current study took special steps to design vibrotactile cues featuring patterns that could reliably be distinguished and intuitively associated with the concept of “severity” [30]. Thirty-six general aviation pilots completed the study in a flight training device. The experimental scenario gradually added workload through an autopilot failure, decreasing visibility until reaching IMC, turbulence, rising terrain, and increasing proximity to weather cells, as listed in Table 2. Weather messages were delivered to the participants at specific points in time, and half of the participants also received a severity-coded vibratory alert.
The results indicate that the participant group receiving the severity-mapped vibrations through a smartwatch showed a significantly higher likelihood of acknowledging the arrival of weather messages compared to the group that did not receive the vibratory cues. Particularly for WM4, which represented the highest flight-related workload context, the highly salient “high severity” vibratory cue led to much higher reception of the message compared to the No Vibration group. Furthermore, those in the Vibration group acknowledged the messages sooner than those in the No Vibration group, indicating that attention is effectively drawn when there is new information worth processing. After this acknowledgment, both groups took similar amounts of time to fully respond to the messages, indicating that the display configurations (i.e., including vibratory cues or not) had no unforeseen adverse effects on the ability to visually process and act on the full message.
There was no statistical difference between the Vibration and No Vibration groups in terms of acknowledging and responding to SAPs. While the probes used in this study were quite simple and all of them were responded to correctly, the lack of impact on the timing of responses suggests that SA was relatively consistent between these groups [34]. Situation awareness probes were not the primary measure of interest in this study, and for future work it is recommended that probes include queries requiring more complex responses and evaluate pilots’ awareness of critical flight variables over longer timescales (i.e., asking about current, trending, and predicted near-future levels of various safety-critical types of flight data).
As with any research involving the complexities of aviation, there are a number of limitations in interpreting the results of this study and scaling its findings to practice. First, the study was conducted in a flight training device with a controlled and scripted scenario, and while efforts were made to add realism to the experiment, the artificiality of this context likely led pilots to make decisions under considerably different stress and time pressure than those imposed by an aircraft in real flight during adverse weather. Additionally, the lack of key environmental stimuli such as motion cues (the FTD did not include a motion base) means that some aspects of pilot workload (such as the physical reaction to the forces of aerodynamic maneuvering and the cognitive load involved in processing information in a moving frame) were not well represented. Finally, a key factor in the FTD-based study was the absence of the substantial vibratory “noise” that originates from engine operation as well as turbulence and other external sources in real flight. This vibratory noise propagates through the airframe to the pilot and has the potential to mask encoded vibratory signals presented to the wrist [42,43], suggesting that the smartwatch-based vibrotactile cues may be less effective in real flight.
To investigate the concern of vibrotactile masking in an aircraft, the experimenters conducted in-flight evaluations of the perceptibility and identifiability of vibratory signals [44]. With a smartwatch worn on each surface (palmar and dorsal sides) of each wrist, under several common postures (e.g., resting the hand/wrist/arm on the flight yoke, seat armrest, and airframe itself), in several aircraft of varied engine number and size (from a 150 HP single-engine to a 600 HP twin-engine), and during several phases of flight, it was found that signals with maximum intensity, higher dynamism, and moderate or high syncopation best supported perception and identification of the signals [44]. The characteristics of intensity, dynamism, and syncopation that were most effective in real flight also adequately describe the vibratory cues used in the current FTD study.