1. Introduction
Meteorological conditions that are hazardous for general aviation (GA) include phenomena such as turbulence, lightning, wind shear, and icing. Clouds reduce visibility and make navigational awareness difficult. This can lead to misinterpretation of aircraft positions [1] and possible controlled flight into terrain (CFIT) accidents, as well as disorientation and possible departure from controlled flight. Pilots perform numerous weather-related tasks both before and during flight to avoid weather hazards. Flight planning involves interpreting weather forecasts and weather information and integrating them into flight decisions [2]. However, despite the fundamental nature of these activities, weather-related GA accidents remain significant, with high fatality rates.
Flight into instrument meteorological conditions (IMCs) by GA pilots operating under visual flight rules (VFRs), who are restricted to visual meteorological conditions (VMCs), remains a leading and continuing cause of accidents [3]. According to the Aircraft Owners and Pilots Association [4], this is “one of the most consistently lethal mistakes in all of aviation”, with 86% of occurrences resulting in fatalities. The United States National Transportation Safety Board (NTSB) estimates that of the GA accidents that occur in IMCs, two-thirds will be fatal. This is far higher than the overall rate for GA accidents [5], including fatalities from mid-air collisions, wire strikes, or pilot incapacitation [4].
When pilots continue with a VFR flight into IMCs (VFR2IMC), they can experience spatial disorientation and may lose control of the aircraft, which can lead to an unrecoverable flight attitude or in-flight structural failure [6]. Although estimates in the literature vary, most researchers attribute 60–80% of GA accidents, at least in part, to human error [7]. In NASA’s Aviation Safety Reporting System (ASRS), a voluntary reporting system, 70% of reported incidents are said to have been caused by “human error” and “pilot error” [8]. The role human factors play in safety occurrences can be analyzed using the Human Factors Analysis and Classification System (HFACS). HFACS was adopted by the Federal Aviation Administration (FAA) in 1998 as a human error taxonomy for identifying accident and incident causal factors attributed to humans in aviation and other high-risk industries [9]. HFACS, originally developed for military aviation, has 19 causal categories; however, most GA accidents are analyzed using only the lower two tiers of HFACS (Unsafe Acts of Operators and Preconditions for Unsafe Acts), which include ten causal categories. These are skill-based errors, decision errors, perceptual errors, violations, adverse mental states, physical/mental limitations, crew resource management, personal readiness, and technological environment. Of these, the unsafe acts are the most relevant to VFR2IMC safety occurrences [9]: skill-based errors (clearance, altitude/clearance, aircraft control), decision errors (VFR2IMC, in-flight planning, weather decision), perceptual errors (aircraft control, altitude, descent), and violations (VFR2IMC).
This research utilized a total of 196 safety occurrences for the quantitative study, with 26 of these having official reports for use in the qualitative study. The safety occurrences were drawn from the Australian Transport Safety Bureau (ATSB) database, covering the period from 2003 to 2019, to examine the factors associated with VFR2IMC occurrences. The reports were subjected to qualitative and ex post facto quantitative analysis to identify the causal factors behind occurrences, and the HFACS framework was applied to isolate the human factors. For the quantitative study, the features of the 196 occurrences were analyzed to find statistically significant features in the dataset. The primary research question for this work is, “how do weather, aircraft, pilot, and operational factors contribute to Australian VFR2IMC occurrences?”. The secondary research question for this work is, “how do the distributions of contributing factors in Australian VFR2IMC occurrences differ from those expected?”. The key hypotheses posed initially are (1) that pilot experience should reduce the likelihood of experiencing VFR2IMC, (2) that student pilots involved in solo training exercises are more likely to encounter VFR2IMC, and (3) that the type of GA aircraft should have no influence on the likelihood of experiencing VFR2IMC.
3. Methodology
3.1. Research Design
The research utilized an explanatory design. This is a mixed-method approach that commences with a qualitative study and is followed by a quantitative study [34]. In this type of research design, the findings from the qualitative study are “explained” with the quantitative data and analysis. The qualitative study was a collective case study, with 26 cases. The quantitative study was an ex post facto study, with a sample size of 196.
3.2. Qualitative Study
The qualitative study examined aviation accident databases and reports to understand the human factors involved in VFR2IMC aircraft accidents in general aviation. The Australian Transport Safety Bureau (ATSB) database was used to identify reports associated with VFR2IMC. The period spanned from January 2003 to December 2019, with an initial screening of the ATSB database for all accidents with the Occurrence Type “Operational-Flight preparation/Navigation-VFR into IMC”. The screening yielded 196 occurrences. The scope of the qualitative study was limited to those cases with accident reports from the ATSB. The review isolated 26 occurrences, 23 of which had final reports and 3 of which had preliminary reports. Each of the occurrences was coded with “demographic” details: the date of accident, location (address, latitude, longitude, state), involved aircraft (manufacturer, model, type), pilot details (licenses, ratings, hours, and medical), number of fatalities, and coded causal factors.
To generate a deeper understanding of the safety occurrences, additional coding was undertaken using the HFACS framework. This analysis was performed to identify and evaluate the human factors associated with all the occurrences [7]. HFACS was chosen as an almost ubiquitous accident investigation tool used across the aviation industry [35], and thus facilitates industry-wide understanding of any results and comparisons to other investigations [36]. The coding was undertaken by the second author, an aviation human factors academic for 14 years and a former commercial pilot whose experience includes single-pilot IFR (instrument flight rules) operations as well as flight instruction. The HFACS framework has four levels of failure, with categories and subcategories, as shown in Figure 1.
All of the categorized qualitative data were analyzed using Fisher’s exact test. This involved testing the distribution of categories for fatal occurrences against the distribution of categories for non-fatal occurrences. The details to calculate the p-value for Fisher’s exact test are complex [37]. However, software packages are capable of providing the resultant value; in this work, MATLAB was utilized to calculate the p-values of each Fisher’s exact test.
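As a rough illustration of what the software computes, the following is a minimal pure-Python sketch of a two-sided Fisher’s exact test for a 2 × 2 table. It is not the MATLAB routine used in this work, the function name `fisher_exact_two_sided` is a hypothetical helper, and the cell counts in the example are invented purely for illustration (they are not taken from the ATSB data).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against rounding ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (illustration only): rows are the category present or
# absent, columns are fatal and non-fatal occurrences.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}")  # 400/11440, roughly 0.035
```

For tables larger than 2 × 2, as arise when a category has more than two levels, the same idea applies but the enumeration of tables with fixed margins is more involved, which is why a statistical package is normally used.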
3.3. Quantitative Study
In the quantitative ex post facto study, each of the 196 safety occurrences was categorized based on occurrence type, fatalness, type of operation, aircraft manufacturer, airspace classification, year, and month of occurrence. The occurrence types used were accident, serious incident, and incident. Fatalness refers to whether or not the occurrence resulted in fatalities. The types of operations included:
The aircraft manufacturers included:
Cessna;
Piper;
Beech;
Amateur;
Cirrus;
Bell;
Mooney;
Robinson;
Eurocopter;
Air Tractor; and
Other.
The “other” category includes 31 other manufacturers, with a single manufacturer having at most 4 safety occurrences and an average of 2 occurrences per manufacturer in this category. These were not included individually as the BITRE (Bureau of Infrastructure, Transport, and Regional Economics) annual report did not specify the hours associated with these manufacturers, and their associated hours were grouped in an “other” category. The airspace classifications used in Australia include A, C, D, E, and G. An additional airspace classification of PRD includes prohibited, restricted, and danger areas.
The quantitative data analysis followed that of previous work investigating safety occurrences involving remotely piloted aircraft [38]. The data analysis involved Pearson’s chi-squared tests for goodness of fit, calculated in Excel. The VFR2IMC data provided the observed data (O), while the expected data (E) came from various sources. Specifically:
Occurrence type: ATSB population (all safety occurrences), over the same time period;
Fatalness: ATSB population, over the same time period;
Operation type: hours reported for each operation type by BITRE;
Aircraft manufacturer: hours reported for each manufacturer by BITRE;
Airspace classification: ATSB population, over the same time period; and
Month: average monthly rainfall reported by the Bureau of Meteorology.
All data are freely available from the corresponding government database. The statistical hypotheses are given as
$$H_0: P_{\mathrm{VFR2IMC},1} = P_{E,1},\; P_{\mathrm{VFR2IMC},2} = P_{E,2},\; \ldots,\; P_{\mathrm{VFR2IMC},n} = P_{E,n}$$
$$H_A: P_{\mathrm{VFR2IMC},i} \neq P_{E,i} \text{ for at least one category } i$$
where P refers to the proportion of the n-th category for the VFR2IMC and the expected (E) distributions. The null hypothesis ($H_0$) can therefore be expressed as “the proportions of VFR2IMC safety occurrences are equal to the proportions expected, for the different categories”. Conversely, the alternative hypothesis ($H_A$) is that “the proportions are not equal”.
For each of the factors of interest, a convenient way to show the difference between the observed data and expected data is to calculate the relative percentage differences, deltas (Δ), using [38]
$$\Delta = \frac{O - E}{E} \times 100\%$$
Using the delta values facilitates a direct comparison between what is observed for VFR2IMC for each of the categories and what would be expected if the data were a random sample of the expected data. Specifically, a positive delta implies that a VFR2IMC occurrence is more likely than expected, while a negative delta implies that a VFR2IMC occurrence is less likely than expected.
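To make the goodness-of-fit procedure concrete, the sketch below computes Pearson’s chi-squared statistic and the deltas (taken here as the relative percentage difference between observed and expected counts) for occurrence type. It rests on several assumptions: the three occurrence types are collapsed into “accident” versus “all other occurrences”; the observed count of 20 accidents is inferred from the 10.2% of 196 occurrences reported in this work; and the expected proportion of 3.5% is the ATSB population figure reported in this work. The function name `chi_squared_gof` is a hypothetical helper, and this is an illustration rather than the Excel calculation actually used.

```python
from math import erfc, sqrt


def chi_squared_gof(observed, expected_props):
    """Return (chi-squared statistic, list of percentage deltas)."""
    total = sum(observed)
    # Expected counts if the sample followed the reference proportions.
    expected = [p * total for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    deltas = [100.0 * (o - e) / e for o, e in zip(observed, expected)]
    return chi2, deltas


# Assumed two-category split: [accidents, all other occurrences].
observed = [20, 176]             # 20 is inferred from 10.2% of 196
expected_props = [0.035, 0.965]  # 3.5% of all ATSB occurrences are accidents

chi2, deltas = chi_squared_gof(observed, expected_props)

# With two categories there is 1 degree of freedom, for which the
# chi-squared tail probability has the closed form erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi2 = {chi2:.2f}")                  # about 26.1
print(f"delta accidents = {deltas[0]:.1f}%")  # about +191.5%
print(f"p = {p_value:.2e}")
```

The closed-form p-value only holds for one degree of freedom; with more categories, the tail probability of the chi-squared distribution with (number of categories − 1) degrees of freedom would be needed, as provided by a spreadsheet or statistics package.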
Since year is an interval variable, a suitable parametric test was needed to assess the trend and its statistical significance. Specifically, correlation was used to measure whether the number of VFR2IMC occurrences has reduced over the 17-year span of the study.
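The trend test can be sketched as a Pearson correlation between year and yearly count. The example below is illustrative only: the yearly counts are placeholder values invented for the sketch (chosen merely so that they sum to 196 over 17 years) and are not the ATSB yearly figures, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


years = list(range(2003, 2020))  # 17 years, 2003 to 2019 inclusive
# Placeholder yearly counts (NOT the real data), summing to 196.
counts = [12, 10, 13, 11, 12, 9, 14, 11, 12, 10, 13, 12, 11, 12, 10, 13, 11]

r = pearson_r(years, counts)
# t statistic for testing r against zero, with n - 2 degrees of freedom.
n = len(years)
t_stat = r * sqrt((n - 2) / (1 - r * r))

print(f"r = {r:.3f}, t = {t_stat:.3f} on {n - 2} degrees of freedom")
```

A correlation coefficient near zero, with a t statistic well inside the critical values for 15 degrees of freedom, would indicate no detectable linear trend in the yearly counts; a significantly negative coefficient would indicate a reduction over time.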
7. Discussion
7.1. Findings
A key difference between the results found in this study and previous literature is that pilot experience did not positively influence VFR2IMC occurrences. Specifically, the proportion of occurrences associated with pilots with over 500 h of experience was much higher than all the other experience categories. This contradicted the proposed research hypothesis that flight experience should have a positive influence. Previous research [3] reported that the expected proportion for 2000–5000 h was 11.7%, while this dataset had 26% of occurrences at this level of experience. CASA notes that Australia has less risk due to weather and low values for lowest safe altitudes. As such, greater overconfidence could be present in Australian pilots.
Previous work also showed that pilots with less than 250 h accounted for 23.4% of occurrences. Based on this, it was hypothesized that student pilots and training operations would specifically be more likely to result in inadvertent flight into IMCs. However, all commercial operations had negative delta values, indicating that, relative to the number of hours flown, the number of occurrences was lower than expected. Only private operations showed an observed value greater than expected, indicating that VFR2IMC is a greater concern for private operations and is associated with poor planning and preflight preparation.
The final research hypothesis posed was that the type of aircraft would not influence the likelihood of VFR2IMC. This was not found to be the case, with Cessna and Piper aircraft being more likely to be involved in these occurrences, and Robinson helicopters less likely to be involved. It is not expected that aircraft type is in some way directly correlated with VFR2IMC and, rather, that there is a confounding variable responsible for the observed association. This is discussed further below in the assumptions and limitations.
There are a number of novel findings that have not previously been presented in the literature. The first of these is the association between VFR2IMC occurrences and months with less average rainfall. This result, supported by the lack of proper preflight planning using NAIPS by many pilots, is related to familiarity and overconfidence. The assumption that pilots make is that the weather now will be the weather later, and that the time of year is associated with good weather. As such, flights are undertaken with no expectation of encountering inclement weather, meaning the pilots are unprepared for the situation. This overconfidence and familiarity would also make it more likely for pilots to assume the weather system or cloud is localized and, therefore, the extent of the threat is not appreciated.
The most crucial finding concerns the actions taken by pilots. While in Section 4.3 (Table 6) it is noted that continuing on is more likely to result in a fatal outcome, the key question to ask is what happened in the cases that were not fatal; more specifically, what happened in the occurrences that did not result in an accident (crash and/or fatality)? These 9 occurrences fall into two broad categories. The first (3 cases) is that the pilot held an IFR rating. The more interesting cases are those that are coded as “support”, which occurred in 7 of the 9 occurrences. Here, support was sought and given from either other pilots in the area or ATC, to help talk the pilot through the situation. This will be discussed further below in recommendations.
7.2. Assumptions and Limitations
There are a number of other factors that would have been interesting to code from the accident reports. Two demographics, gender and age, would have been interesting to determine if male pilots were more likely to engage in the risky behavior of continuing on into IMCs, and how age moderated the choice. Unfortunately, the reports did not provide sufficient data to test either of these hypotheses. Similarly, many of the factors investigated in the qualitative study could not be explained with a quantitative study as the data were not sufficient due to a lack of detailed information.
A crucial assumption made is that the data from the ATSB database are complete and correct. It should be noted that, according to the European Spreadsheet Risks Interest Group, over 90% of spreadsheets contain an error [45]. Most of these errors are associated with mathematical operations, resulting in computational errors. The ATSB data are presented as a spreadsheet extracted from their database. Assuming the data are entered faithfully, they should then be accurately reproduced.
As previously mentioned, it is assumed that the sample of investigated occurrences is random. The ATSB sets out priority guidelines, and it is interesting to note that VFR2IMC, an ongoing issue that primarily involves private operations, does not fall under the aviation broad hierarchy which reflects the priorities for investigation. As such, it is reasonable to assume that no systematic bias exists to investigate one type of VFR2IMC occurrence over another.
There is a limitation with regard to the aircraft manufacturer. The current dataset does not account for the confounding effect between aircraft manufacturer and type of operation. The prevalence of Cessna and Piper aircraft means they are far more likely to be involved in private operations. It is assumed that if this confounding influence were factored into a measure of association, then the differences observed in aircraft manufacturer would be accounted for by differences in the type of operation.
Potentially the most interesting limitation to note in this work is the HFACS framework. While HFACS is an “industry standard” in aviation, for VFR2IMC, which is a significant issue in private operations, there are few to no failures that can be attributed to “unsafe supervision” and “organizational influences”, as these are not applicable in private operations. There are many directions in which this discussion could proceed, for example, the view of the regulator as a supervisory organization, the need for private pilots to peer supervise, and many more.
7.3. Recommendations and Future Work
Although education efforts continue to try to reduce the number of VFR2IMC occurrences, the numbers suggest that they are not decreasing. Standard recommendations for inadvertent flight into IMCs already make it clear what the preventative and corrective actions should be. Pilots should always carry out a detailed review of weather and establish suitable minima, and weather conditions should be monitored throughout the flight. As conditions deteriorate, turning back should always be the first action. When stuck in IMCs, a mayday radio call should be made, implementing the three “C’s” of contact, confess, and comply. The best way for these findings to be used is for the safety authority to utilize case reports in safety publications where pilots share their safe recovery accounts. The new finding from this work that pilots need to be mindful of is that occurrences are more likely in months where rain is not expected.
Future work is planned to understand the fatalness of all safety occurrences in aviation and how VFR2IMC fits in and compares with all types of occurrences. Following this work, the key question posed is which types of aviation safety occurrences in Australia are more fatal than VFR2IMC.
8. Conclusions
Prior studies have reported quantitative data from before 2013 [46], and almost exclusively in the US context [3,9,17,46,47]. In this work, we have reviewed Australian aviation safety occurrences over a 17-year timeframe, which facilitates a continuation of the work previously reported in Australia from 1976 to 2003 [20].
In terms of both the type of occurrence (accident or incident) and the fatalness of those occurrences, first and foremost, we note similar trends to all of the previous studies. If VFR2IMC occurs, then it is disproportionately more likely to result in an accident and to end fatally. For Australia from 2003 to 2019, 10.2% of VFR2IMC occurrences were accidents, compared to 3.5% of all occurrences in the ATSB database for the same period. Even more disproportionately, 75% of VFR2IMC accidents were fatal, compared to 12.4% of all accidents in the ATSB database for the same period.
When looking at the type of operation, private aviation activities are more likely to be involved in VFR2IMC occurrences. The number of reported occurrences was 3 times more than expected relative to the number of hours flown in each type of activity. By contrast, aerial work operations had significantly fewer occurrences of VFR2IMC relative to the number of hours flown, at over 4 times fewer than expected. Both of these results influenced the aircraft that are associated with VFR2IMC occurrences. That is, those aircraft more typically owned and operated in private aviation activities were more likely to be involved in VFR2IMC occurrences, specifically, Cessna and Piper aircraft, accounting for almost 1.5 times more occurrences than expected relative to the number of hours flown. Similarly, the Robinson helicopter, which is used extensively in aerial work in Australia, had almost 8 times fewer occurrences than expected relative to the number of hours flown. It is important to note here that no direct influence of the aircraft is expected to account for the likelihood of VFR2IMC occurrences.
Even though the type of license had no significant link with fatalness, the type of rating held was associated with fatalness; specifically, holding only a night VFR rating was associated with an increased likelihood of a fatal occurrence. That is, of the 5 accidents where the “highest” rating held by the pilot was a night VFR rating, 4 (80%) ended fatally; by contrast, for pilots with neither a night VFR nor an instrument rating, approximately 53% of these accidents ended fatally. The most interesting finding, which is in contrast to some previously published results, is that there was a clear association between the likelihood of a VFR2IMC occurrence and greater flight experience. That is, there were more accidents where the pilot had in excess of 5000 h of flight experience than where the pilot had only a double-digit number of hours of flight experience.
Environmental hazards (clouds, terrain, and cloud with rain) were a strong indicator not only of the frequency of accidents, but also of fatalness. The action a pilot takes when encountering IMCs, particularly the action to continue into IMCs, led to more accidents and fatalities. This finding is associated with the HFACS analysis, which showed that errors (clearance, altitude/clearance, aircraft control) and violations occur with slightly greater frequency in fatal occurrences than in non-fatal occurrences. A final contributing factor relates to preflight planning. Pilots who used NAIPS to access weather data had greatly reduced chances of experiencing a VFR2IMC occurrence, while those who did not use NAIPS were more likely to experience a fatal occurrence. Combining all the conclusions, the primary combination of factors likely to result in a fatal VFR2IMC occurrence is encountering cloud with rain, having undertaken no correct and thorough preflight weather assessment, on a flight over elevated rough terrain with trees, then not immediately turning around (and potentially climbing), and not making a mayday call to support this action.
The yearly count is significant to consider. In the study by Batt and O’Hare [20], there were 491 total occurrences in the ATSB database from September 1976 to March 2003. This gives 18.5 occurrences a year during a period in which aviation was growing (7041 registered aircraft in 1979, growing to 16,900 registered aircraft in 2000). However, looking at recent traffic data, we note the aviation industry has stagnated, if not declined, in terms of hours over the last several years. Hence, while there is a reduction from 18.5 occurrences per year between 1976 and 2003 to 11.5 occurrences per year between 2003 and 2019, there has been no noticeable reduction in occurrences per year since 2003. Looking at the BITRE traffic data, total flying hours increased from 1985 to 1999, spiked in 2005, and have remained roughly constant at around 3.5 million hours since then. As such, even when factoring in the total number of hours flown over these periods, there has been no reduction in VFR2IMC occurrences in the last 15 years. Although the old ATSB data are not available, it would still be reasonable to assume that Batt and O’Hare would have noted that, per flying hour, the number of occurrences decreased from 1976 to 2003.
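The yearly rates quoted above follow from simple division, taking September 1976 to March 2003 as approximately 26.5 years (an assumption made here for the worked arithmetic) and 2003 to 2019 as 17 years:

```latex
\frac{491\ \text{occurrences}}{26.5\ \text{years}} \approx 18.5\ \text{per year},
\qquad
\frac{196\ \text{occurrences}}{17\ \text{years}} \approx 11.5\ \text{per year}
```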
Pilots should have the requisite experience and qualifications to meet the minimum standards in Australia. However, the study shows that pilots with night VFR ratings were more likely to experience fatal occurrences, implying that even though pilots may make an accurate assessment of the hazard situation, their decision to continue flight into adverse weather conditions is a result of overconfidence in their ability and unrealistic optimism about being able to control the aircraft and avoid personal harm.