Event Detection and Spatio-temporal Analysis of Low-Altitude Unstable Approach

Low-altitude unstable approach (UA) is one of the crucial risks that threaten flight safety. In this study, we propose a technical program for detecting low-altitude UA events. The detection logic was built by optimizing a step-wise regression model through iterative surveys of more than 20 experienced pilots. Accordingly, the frequencies of UA events occurring around each airport in January 2018 were calculated for all the airports within mainland China. The spatial distribution characteristics of UA events were then analyzed via exploratory spatial data analysis. In addition, Pearson’s correlation coefficient and the geographically weighted correlation coefficient were used to explore the correlations between UA frequency and airport elevation, wind level, and bad weather. The experimental results revealed that the proposed method can accurately detect the occurrence of low-altitude UA and quantitatively characterize risks. It was found that UA exhibits obvious differences in spatial distribution. Moreover, significantly strong correlations were found between UA and airport elevation, wind level, and bad weather, and these correlations differed across regions of China.


Introduction
A stable approach is a flight phase during which the pilot establishes and maintains a constant-angle glidepath towards a predetermined point on the landing runway [1]. For a safe flight, the aircraft must be stabilized by 1000 feet above airport elevation in instrument meteorological conditions and by 500 feet in visual meteorological conditions [2]. An unstable approach (UA) can happen when the pilot fails to plan, prepare, and conduct a stable approach. Although the approach process only accounts for about 4% of the entire flight process, unsafe incidents occurring during this process account for up to 49.1% of total flight accidents [3]. In particular, UA was a causal factor in 66% of 76 approach-and-landing accidents or serious incidents all over the world from 1984 through 1997 [4]. Therefore, it is of great practical significance to explore and analyze the spatio-temporal distribution of UA events and their influencing factors.
A number of studies using physical methods have been conducted to explore the criteria of a stable approach and identify UA events. Wang et al. [5] proposed a method for analyzing the criteria of a stable approach with surveillance track data. Moreover, they also explored aborted approaches and their underlying factors [6]. Rao and Puranik [7] presented a retrospective approach to explore the causes of UA events with historical accident data. Yan and Lv [8] presented a possibility measure-based model for the detection of UA via flight data analysis. Li et al. [9] proposed a Gaussian Mixture Model-based cluster analysis approach to detect abnormal flights with elevated risks. Campbell et al. [10] developed go-around decision making criteria for avoiding UA risks via simulation studies. These studies largely concerned mechanical factors of UA events, but ignored the flying experience of pilots, particularly for evaluating their potential risks. In this paper, a detection model based on extensive questionnaires of more than 20 experienced pilots is proposed.
Moreover, previous studies on UA primarily focused on artificial [11], meteorological, and mechanical [12] influencing factors, such as the impact of sudden low-altitude wind shear [13]. Risks potentially caused by UA have also been widely evaluated via the gray clustering method [14], approaching angle and trajectory analysis [15], K-means clustering [13], and neural networks [16]. However, UA events and their related factors have rarely been studied from spatial and temporal perspectives [17,18]. In this study, the risk of low-altitude UA events is detected and evaluated, and the spatio-temporal patterns and related factors of these events are explored.
In this study, with data collected from quick access recorders (QARs) [19][20][21], a program for detecting low-altitude UA events with their risks evaluated was developed. The spatio-temporal analyses of detected UA events were conducted via exploratory spatial data analysis (ESDA) techniques [22]. Moreover, relationships between the influencing factors of UA events were further explored to develop models for prediction and explanation of UA events. The exploratory analysis of the UA events presented in this article provides an important research basis and guidance for future quantitative causal analysis, and also shows practical significance for predicting and avoiding UA events.
The remainder of this paper is organized as follows. Section 2 presents the methodology and data used in this study. In Section 3, results of UA detection and exploratory factors are presented with a case study. Finally, this study is summarized and future work on this topic is also anticipated in Section 4.

Data
A database station was constructed by the China Academy of Civil Aviation Science and Technology to collect all the QAR data of domestic flights for flight operations quality assurance (FOQA). Conducting a wide range of flight risk studies with these QAR data is important for promoting civil aviation safety. In total, QAR data of over 3500 aircraft or airlines are aggregated daily into the base station, and up to 2000 parameters are recorded in the data records, including the time, latitude and longitude, flight altitude, flight speed, wind speed, temperature, ground proximity warnings, and flight attitude. These parameters comprehensively reflect the characteristics of a flight by recording its time, space, pilot control, and engine performance. Most of the parameters are recorded every second, while some sensitive parameters are recorded as often as every 1/8 second. QAR data can be widely applied in condition monitoring, fault diagnosis, flight quality assessment, flight process visualization, mechanical maintenance, and fuel consumption control.
In this work, QAR data collected from more than 10,000 flights from 1 January to 30 June 2018 were utilized. The UA events were explored with the proposed method, and it was found that approximately 5000 flights seemed to be affected by UA. Moreover, the spatio-temporal patterns of UA events during January 2018 were explored via ESDA techniques, as well.

Methods and Models
The trigger frequency was calculated as the ratio of the number of UA events to the total air traffic at each airport, and each airport was treated as a spatial analysis unit in this paper. The spatio-temporal distribution patterns of UA events were explored with temporal heatmaps, spatial distribution maps, and spatio-temporal cubes. Correlation analysis was used to make a preliminary analysis of the potential influencing factors of UA events, such as airport terrain and meteorological factors.

Unstable Approach Detection
More than 20 experienced pilots were surveyed, and flight parameters such as the airspeed, descent rate, configuration, heading, glide path, and slope were considered for the detection of UA events; these parameters are interpreted and presented in Tables 1 and 2, respectively. All the parameter thresholds were iteratively verified with empirical data collected from 10,000 flights via the analysis ground station (AGS) platform. When the data satisfied one of the conditions, it was empirically recognized as a suspicious UA event. In particular, a radio height under 200 feet was chosen in this study, as UA events occurring under this condition could be highly risky and cause a real accident. According to the rules in Table 2, all the UA events could be preliminarily detected via the empirical rules. In this study, QAR data from 10,000 flights were utilized, of which UA events were found to occur in 5000. We then distributed them randomly in the questionnaire for the pilots to mark their risks. As an output, a score ranging from 0 to 1 was provided for each suspicious UA event. A score close to 1 means a high risk potentially caused by UA, and a score equal to 1 indicates a UA accident.
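The rule-based screening described above can be sketched as follows. This is only an illustration of the "any one condition triggers a flag" logic below 200 feet of radio height. The parameter names and every numeric limit in `HYPOTHETICAL_LIMITS` are placeholders invented for the example, since the thresholds of Table 2 are not reproduced in this text.

```python
from typing import Dict, List, Tuple

# Only samples below this radio height are screened (value taken from the text).
RADIO_HEIGHT_LIMIT_FT = 200.0

# HYPOTHETICAL limits: placeholders only, NOT the thresholds of Table 2.
HYPOTHETICAL_LIMITS: Dict[str, Tuple[float, float]] = {
    "pitch_deg": (-2.0, 8.0),
    "airspeed_dev_kt": (-5.0, 15.0),
    "descent_rate_fpm": (0.0, 1000.0),
}


def flag_suspicious(sample: Dict[str, float]) -> List[str]:
    """Return the parameters that break their limit; an empty list means no flag."""
    if sample["radio_height_ft"] >= RADIO_HEIGHT_LIMIT_FT:
        return []  # above the screening height, so the sample is not evaluated
    violated = []
    for name, (low, high) in HYPOTHETICAL_LIMITS.items():
        value = sample[name]
        if value < low or value > high:
            violated.append(name)
    return violated
```

A sample is treated as a suspicious UA event as soon as the returned list is non-empty, and the listed names indicate the triggering reason that would be passed on to the pilots for scoring.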

Quantitative Risk Evaluation Model
In this study, an empirical data set was produced via questionnaire survey with experienced pilots. To quantitatively evaluate the risk of UA events, we adopted the multiple linear regression technique to explore the relationships between the risk scores of UA events and relative flight parameters. In the first place, step-wise regression was adopted to specify a proper regression model [23]. Its workflow was designed as follows:
STEP 1: Select one of the available variables that produces the smallest Akaike information criterion (AIC) value to form a one-way regression equation;
STEP 2: Select one of the remaining independent variables, along with the regression equation in STEP 1, to form a binary regression equation. The binary regression equation should have a smaller AIC value;
STEP 3: Repeat while selecting one variable each iteration until all the variables are introduced. If the first variable is no longer important due to the introduction of some variable, then remove the first variable;
STEP 4: The model is optimized when no variables need to be introduced or eliminated.
Appl. Sci. 2020, 10, 4934
STEP 1 to STEP 4 were repeated until an overrun model had been obtained for each of the 5 parameters. Note that this procedure was run several times, with scores amended and outliers excluded, so that the models could be developed optimally.
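The workflow above can be illustrated with a minimal sketch of AIC-driven stepwise selection for an ordinary least-squares model. It is not the authors' implementation. It assumes the Gaussian AIC form n ln(RSS/n) + 2k, starts from an intercept-only model, and uses hypothetical function names (`ols_aic`, `stepwise_aic`).

```python
import numpy as np


def ols_aic(X, y):
    """AIC of an OLS fit with intercept: n * ln(RSS / n) + 2 * k."""
    n = len(y)
    design = np.column_stack([np.ones(n), X]) if X.shape[1] else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def stepwise_aic(X, y, names):
    """Add or drop one variable at a time while the AIC keeps decreasing."""
    selected = []
    best = ols_aic(X[:, []], y)  # intercept-only baseline
    improved = True
    while improved:
        improved = False
        # Forward step: try each unused variable and keep the best one if it helps.
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = [(ols_aic(X[:, selected + [j]], y), j) for j in candidates]
        if scores:
            aic, j = min(scores)
            if aic < best:
                selected.append(j)
                best, improved = aic, True
        # Backward step: drop any variable whose removal lowers the AIC.
        for j in list(selected):
            rest = [k for k in selected if k != j]
            aic = ols_aic(X[:, rest], y)
            if aic < best:
                selected, best, improved = rest, aic, True
    return [names[j] for j in selected], best
```

In the setting of this study, `X` would hold the flight parameters of the scored events and `y` the pilots' risk scores, with one call per overrun type. Every accepted change strictly lowers the AIC, so the loop stops once no variable needs to be introduced or eliminated.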
The stepwise regression equation of pitch overrun modeling is as follows, where SCORE is the risk score marked optimally for each UA event.
The results of the stepwise regression demonstrate that the track deviation and airspeed were removed in the process of stepwise regression, which indicates that neither is a statistically significant factor in the modeling of pitch overrun.
The stepwise regression equation of track deviation modeling is as follows, where Flight_speed means the flight speed expressed as the reference speed minus the indicated airspeed (VREF − IAS), and Alignment_TRD means the alignment of track and runway direction.
The stepwise regression equation of excessive airspeed modeling is as follows:
The results of the stepwise regression demonstrate that the excessive airspeed limit is only related to airspeed and track deviation.
The stepwise regression equation of slope overrun modeling is as follows:
The stepwise regression equation of the rate reduction overrun modeling is as follows:
With the above models, the UA events happening at each airport could be chosen carefully and evaluated quantitatively. These models could play important roles in evaluating risk scores of detected suspicious UA events. We incorporated them in a toolkit for UA risk evaluation with the parameters in Table 1 as input, which will allow users to select the triggering reason and relevant parameters to obtain the UA risk score with QAR records. In this sense, the frequencies of UA events occurring around each airport can be calculated accordingly. In this study, all the UA events triggered by Chinese civil aircraft in January 2018 concern airports within mainland China.

Temporal Heatmap Analysis
In this study, the spatio-temporal statistics of all the UA events triggered by Chinese civil aircraft in January 2018 were analyzed. A temporal heatmap is presented in Figure 1 to show the temporal distribution of UA events. The results indicate that UA was highly likely to occur from the 8th to the 10th or in late January, and the occurrence time was mainly between 11 am and 5 pm. Note that this period was just one week before Chinese New Year, known as "Spring Festival travel season", so the number of flights could be the highest of the year. Furthermore, the period (11 am to 5 pm) corresponds to the busiest time of day, with the most flights landing. The results indicate that UA events deserve attention at busy airports, but they fail to provide detailed information for better risk management. Thus, spatial analyses with long time series data are needed in future studies.
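The counts behind such a temporal heatmap can be sketched as a simple day-by-hour binning. The function name `day_hour_grid` and the sample timestamps in the usage note are hypothetical and serve only to show the data layout, not the actual event data.

```python
from collections import Counter
from datetime import datetime


def day_hour_grid(timestamps, days_in_month=31):
    """Return a grid[day - 1][hour] of event counts for one calendar month."""
    counts = Counter((t.day, t.hour) for t in timestamps)
    return [
        [counts.get((day, hour), 0) for hour in range(24)]
        for day in range(1, days_in_month + 1)
    ]
```

For example, two events at 11:05 and 11:40 on 8 January would both fall into the cell `grid[7][11]`. Each row can then be drawn as one line of the heatmap.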

Spatial Distribution Analysis
The spatial distribution processing steps were as follows:
(1) Count the number of UA events occurring at each airport;
(2) Count the monthly throughput of the airport;
(3) Calculate the frequency of UA using the number of UA events and the throughput;
(4) Conduct spatial correlation analysis to match the UA segments with the airports.
The frequency of UA events was visually analyzed, and the results are presented in Figure 2. As shown in Figure 2, UA events were found to occur frequently in the eastern part of China, where airports are typically busy. This pattern is consistent with the result presented in Figure 1. In particular, dramatic changes in terrain seem to be highly correlated with UA event frequencies, as in the northeastern and midwestern areas. Note that high frequencies appear in high-altitude areas, such as Yunnan. This observation indicates that the occurrence of UA is correlated with high-altitude terrain or its changes. In Xinjiang Province and most of the eastern regions, UA events were found to occur with different frequencies in different time periods, which could be related to the local meteorological conditions in January.
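Steps (1) to (3) amount to a per-airport ratio, which can be sketched as below. The function name `ua_frequency` is hypothetical, and the airport codes and counts in the usage note are made-up values for illustration only.

```python
def ua_frequency(event_counts, throughput):
    """Frequency of UA per airport: number of UA events / monthly throughput."""
    return {
        airport: event_counts.get(airport, 0) / flights
        for airport, flights in throughput.items()
        if flights > 0  # skip airports without traffic to avoid division by zero
    }
```

For instance, `ua_frequency({"AAA": 3}, {"AAA": 100, "BBB": 50})` gives a frequency of 0.03 for the hypothetical airport "AAA" and 0.0 for "BBB". Normalizing by throughput keeps busy airports from dominating the map merely because they handle more landings.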

From the distribution of the triggering causes of UA presented in Figure 3, the UA incidents that happened in the northeastern and southern regions of China were mainly caused by track deviations, while pitch overruns were found to be more prominent causes of UA in the eastern and northwestern regions of China.

Pearson's Correlation Coefficient
Pearson's correlation coefficient is used to analyze the correlation between two parameters. The calculation formula is as follows:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the $n$ paired observations of the two parameters, and $\bar{x}$ and $\bar{y}$ are their means.

Figure 4 illustrates the correlations between pitch angle of flight (PITCH) and other factors. The upper-right triangle of the matrix presented in Figure 4 presents the correlation coefficients corresponding to the row and column numbers; the greater the absolute value of the correlation coefficient, the stronger the correlation. The lower-left triangle corresponds to the scatter plot between parameters. For example, the correlation coefficient between instantaneous vertical velocity (IVV) and PITCH is 0.88. PITCH was found to be highly correlated with IVV, and it tended to be the most significant correlated factor, as the correlation coefficient is always large when pitch overrun occurs.
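As a small illustration, Pearson's coefficient for two parameters can be computed directly from its definition. The helper name `pearson_r` is hypothetical. NumPy's `corrcoef` returns the same value and is used here only as a cross-check.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


# Cross-check against NumPy's built-in correlation matrix on arbitrary values.
x = [1.0, 2.0, 4.0, 7.0]
y = [1.5, 1.0, 3.5, 6.0]
assert abs(pearson_r(x, y) - np.corrcoef(x, y)[0, 1]) < 1e-12
```

Applying the function to every pair of recorded parameters yields the coefficients shown in the upper-right triangle of a matrix such as the one in Figure 4.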

Geographically-weighted Correlation Coefficient
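The geographically-weighted correlation coefficient (GWCC) [24][25][26] is an indicator used to explore the spatial heterogeneity of correlations between variables. The GWCC is calculated using the following formula:

$$\rho_{x,y}(u_i, v_i) = \frac{\mathrm{Cov}_{(x,y)}(u_i, v_i)}{\mathrm{SD}_x(u_i, v_i)\,\mathrm{SD}_y(u_i, v_i)}$$

where $(u_i, v_i)$ is the spatial coordinate of calculation point $i$, and $\mathrm{SD}_x(u_i, v_i)$ and $\mathrm{Cov}_{(x,y)}(u_i, v_i)$ are the geographically-weighted standard deviation and covariance, respectively. $\mathrm{Cov}_{(x,y)}(u_i, v_i)$ is calculated using the following equation:

$$\mathrm{Cov}_{(x,y)}(u_i, v_i) = \frac{\sum_{j} w_{ij}\,\bigl(x_j - \bar{x}(u_i, v_i)\bigr)\bigl(y_j - \bar{y}(u_i, v_i)\bigr)}{\sum_{j} w_{ij}}$$

where $w_{ij}$ is the weighting factor of observation point $j$ at calculation point $i$ and can be calculated using the kernel function, $\bar{x}(u_i, v_i) = \sum_{j} w_{ij} x_j / \sum_{j} w_{ij}$ is the geographically-weighted mean, and $\mathrm{SD}_x(u_i, v_i) = \sqrt{\mathrm{Cov}_{(x,x)}(u_i, v_i)}$. In Table 3, we present four common kernel functions.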

Table 3. Four common kernel functions.

Function Name | Function
Box car | $w_{ij} = 1$ if $d_{ij} < b$, and $w_{ij} = 0$ otherwise (9)
Bi-square | $w_{ij} = \bigl(1 - (d_{ij}/b)^2\bigr)^2$ if $d_{ij} < b$, and $w_{ij} = 0$ otherwise (10)
Gaussian | $w_{ij} = \exp\bigl(-\tfrac{1}{2}(d_{ij}/b)^2\bigr)$ (11)
Exponential | $w_{ij} = \exp\bigl(-d_{ij}/b\bigr)$ (12)

In Equations (9)-(12), $d_{ij}$ is the distance between the location points $i$ and $j$, and $b$ is the bandwidth, which can be either fixed-type or adaptive-type. Fixed bandwidth takes a constant value as the distance threshold, while the adaptive bandwidth takes an integer N, and uses the distance value to the N-th nearest neighbor as the specific bandwidth value for each location-wise solution. In this work, the Gaussian kernel function and adaptive bandwidth were used to calculate the GWCC. Choosing the appropriate bandwidth is very important for GWCC calculation, as the size of the bandwidth will determine the spatial variation scale of the coefficient, which can be optimized using the cross-validation (CV) method.
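A minimal sketch of a GWCC computation with a Gaussian kernel and an adaptive bandwidth is given below. It is an illustration under stated assumptions rather than the implementation used in this work. It assumes planar (projected) coordinates with Euclidean distances, the standard Gaussian kernel form exp(-(d/b)^2 / 2), and distinct point locations so that no bandwidth is zero. The function name `gwcc` and the argument `n_neighbors` are hypothetical.

```python
import numpy as np


def gwcc(coords, x, y, n_neighbors):
    """Local correlation of x and y at every point (Gaussian kernel, adaptive bandwidth)."""
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise Euclidean distances d[i, j] between calculation point i and observation j.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Adaptive bandwidth: distance to the N-th nearest neighbour (column 0 is the point itself).
    b = np.sort(d, axis=1)[:, n_neighbors]
    w = np.exp(-0.5 * (d / b[:, None]) ** 2)
    w /= w.sum(axis=1, keepdims=True)  # normalise so each row of weights sums to 1
    mean_x = w @ x  # geographically-weighted means
    mean_y = w @ y
    dx = x[None, :] - mean_x[:, None]
    dy = y[None, :] - mean_y[:, None]
    cov = np.sum(w * dx * dy, axis=1)
    sd_x = np.sqrt(np.sum(w * dx ** 2, axis=1))
    sd_y = np.sqrt(np.sum(w * dy ** 2, axis=1))
    return cov / (sd_x * sd_y)
```

Here `x` could be the UA trigger frequency of each airport and `y` a factor such as the number of days at a given wind level. The output holds one local coefficient per airport, which is what a map such as Figure 6 or Figure 7 displays. A small `n_neighbors` gives strongly local estimates, a large one approaches the global Pearson coefficient, and the value could be tuned by cross-validation as noted above.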
The local correlations between UA events and meteorological and topographical factors were analyzed with GWCCs, which embody typical regional differences, especially for factors such as wind level and weather parameters.
Pearson's correlation coefficient was first calculated between the number of days at each wind level at each airport in January and the triggering frequency of UA during the same period, as shown in Table 4. The correlation coefficients reveal that the triggering frequency of UA is positively correlated with the number of days at each wind level, especially for wind of levels 4-5. The correlations between the number of days of bad weather conditions (rain, clouds, and snow) and the triggering frequency of the corresponding UA events in January were also analyzed. The results are likewise exhibited in Table 4, where a positive correlation was also found between the triggering frequency of UA events and bad weather. Note that the strongest correlation appears for wind of levels 4-5, instead of levels 5-6 as anticipated empirically. We presume that very bad weather conditions (meaning strong winds of levels 5-6 or even higher) might lead to strict air traffic control or even flight cancellations, which would reduce the number of approaches flown in such conditions. The correlation coefficient of up to 0.69 between UA triggering frequency and wind of levels 4-5 demonstrates that windy weather plays a vital role in UA risks.
In addition, digital elevation model (DEM) data were used to analyze the correlation between the triggering frequency of UA and the elevations of the corresponding airports. As shown in Figure 5, Boeing aircraft were found to exhibit an anomalous but statistically insignificant negative correlation. This is mainly attributable to the performance of Boeing aircraft and the limitation of the oxygen supply, and to the fact that very few Boeing aircraft are employed to execute flights in high-altitude areas, such as the western part of China. In this sense, this insignificant finding does not necessarily mean that elevation is irrelevant to UA events; rather, a more detailed analysis should be done from a spatial perspective. In general, the analysis of global correlation provides a macroscopic view of the relationships between the triggering frequency of UA events and meteorological and topographical factors. However, a global coefficient summarizes each relationship as a single value for the whole study area, and therefore offers limited guidance for the refined analysis of UA events.
It was found that the mean value of the GWCC is positively correlated with the wind level. This is in accordance with the empirical causes of UA events, and indicates that the local correlation coefficients, which take spatial heterogeneity into account, can better reflect the correlation. Figure 6 presents the estimates of the GWCCs between the triggering frequency of UA and wind levels 5-6. The triggering frequency of UA was found to be correlated with wind levels 5-6 in the Bohai coastal area. In the case of high winds, such conditions would demand an immediate go-around, and a pilot experienced in manual flying can effectively avoid UA events. Figure 7 presents the distribution of the GWCCs between the trigger frequency of UA events and elevation. It was found that the trigger frequency of UA events that affect Airbus aircraft tended to be more correlated with the airport elevation; the higher the elevation, the higher the trigger frequency of UA events, especially in the western part of China. This phenomenon is in accordance with previous inferences.
Compared to the traditional global correlation coefficient, the GWCC can reflect the local correlations between factors and can be used to analyze the relationships between UA and its objective factors more precisely and objectively from a local perspective. However, only ESDA methods were utilized in the present work. More detailed phenomena and causes must be further studied with geographically-weighted models, e.g., geographically-weighted regression [21,22], for the diversified analysis of related factors.

Conclusions
In this study, a technical program for the detection of UA events was proposed via step-wise regression with pilot survey data, and the related factors of these events were quantitatively explored. Moreover, ESDA methods, namely a temporal heatmap and spatial distribution map, were utilized for spatio-temporal distribution analysis. The relationships between the triggering frequency of UA events and the related meteorological and topographical factors were studied with both global and local correlation analysis methods. The results demonstrate that the correlations between UA events and related factors, including wind level, bad weather, and elevation, vary across space and time.
In addition, as compared with the traditional Pearson's correlation coefficient, the GWCC can reflect the correlative relationships between UA events and related factors from a local perspective. The results yielded by this study provide an important research basis and guidance for future quantitative causal analysis, as well as for the effective avoidance of such risks and the improvement of prediction accuracy.
Notably, this study makes a preliminary attempt at understanding the underlying factors of UA events, and much more work is needed in the future. For instance, the period of data collection is limited to January, which corresponds to weather conditions in winter time, but complex conditions may also appear in summer time. In this sense, long time series analysis should be conducted. Moreover, some other potential factors, such as pilot workload [27], level of automation (LOA) [28], and airmanship [29], could also be incorporated.