#### 3.1. Turbulence

The conventional wind turbulence model in aviation is described using the PSD function of the SR method. For an aircraft flying at speed

V through a frozen turbulence field, PSD functions

$\mathsf{\Phi}\left(\mathsf{\Omega}\right)$ of the von Karman model [

10] in three directions are shown as below:

where

u,

v and

w refer to the longitudinal, lateral and vertical turbulence, respectively.

${\sigma}_{u}$,

${\sigma}_{v}$, and

${\sigma}_{w}$ are the intensity measures of the turbulence.

${L}_{u}$,

${L}_{v}$, and

${L}_{w}$ (in feet) are the length scales of the turbulence.

$\mathsf{\Omega}$ is the spatial frequency, which is equal to the angular frequency

$\omega $ divided by

V. In general,

$\omega $ is in the range of

$[0,100\pi ]$ [

10]. With respect to

$\omega $, the PSD

$S\left(\omega \right)$ is computed as follows:

When the altitude of aircraft

h (in feet) is less than 1000 feet, the

L and

$\sigma $ can be calculated as below:

where

${V}_{W20}$ is the wind speed measured at 20 feet (

$6.10$ m). Given a certain flight condition, for example,

h = 600 feet (

$182.88$ m),

${V}_{W20}$ = 15 knots (

$7.72$ m/s), and

V = 140 knots (

$72.02$ m/s), the PSD of the turbulence model in three directions is shown in

Figure 1. 2000 time series of 256 s in 16 Hz are generated for each direction using the Equation (

23) in the same flight condition. One example of the generated time series is shown using the blue dash line in

Figure 2. The KL expansion method is implemented based on the generated 2000 turbulence series. Compared with the PSD, eigenvalues with the corresponding index in KL expansion are also shown in

Figure 1.

Figure 1 shows that the shape of PSD and eigenvalues in the KL expansion are similar. As mentioned in

Section 2.3, the index

k of the KL expansion terms corresponds to a certain angular frequency

${\omega}_{k}$ and the eigenvalue

${\lambda}_{k}$ in the KL expansion is equivalent to the PSD

$P\left(\omega \right)$. The PSD of the turbulence model decreases with an increase of

$\omega $. Meanwhile, the eigenvalues

${\lambda}_{k}$ in the KL expansion are sorted. Thereby, the shape of PSD and eigenvalues is almost similar. This result validates that the KL expansion method coincides with the SR method for the stationary stochastic process.

Two truncated orders

$K=30$ and

$K=100$ are selected in the KL expansion of turbulence. One example of KL reconstructed results is shown in

Figure 2. It shows that the reconstructed process and the series generated from the PSD match very well. The reconstructed accuracy is improved with the increase of the truncated order. As the high order term is corresponding to the high frequency in this case, higher frequencies are neglected in the lower order truncated KL reconstruction. To quantify the influence of the KL truncated order

K, the cumulative variance ratio

$1-\u03f5$ of the KL expansions in the case of longitudinal turbulence is calculated using Equation (

8) and also shown in

Figure 3a. It shows that the reconstructed turbulence with

$K=100$ can achieve

$84.6\%$ of variance. When

K increases to 200,

$92.6\%$ of variance is achieved. Furthermore, the PSDs of the 2000 reconstructed longitudinal turbulence series using KL expansions are computed and shown in

Figure 3b. The ’theoretical’ black line is computed directly based on the von Karman model using Equation (

23). The ’raw data’ indicates the PSD are cumputed based on the turbulence data generated from the von Karman model. The ’

K = 200, 100, 30’ shows the mean of PSD computed based on the KL reconstructed turbulence data with difference truncated orders. It also shows that the accuracy of PSD obtained from reconstructed data will be improved with the increasing order

K. This property is also indicated by the cumulative variance ratio. Nevertheless, the KL expansion with sufficient orders based on the turbulence data is consistent with the conventional von Karman model. This approach allows us to construct the turbulence wind process.

#### 3.2. Headwind

Headwind has a significant influence on landing performance. A case of headwind process construction is demonstrated in this section. The headwind from 1000 feet (304.80 m) to 50 feet (15.24 m) above ground level (AGL) is estimated from the operational flight data in the final approach phase. 849 series of headwind (in knots) are resampled from the smoothed headwind based on the altitude (in feet) shown in

Figure 4a. All wind series are homogenized after resampling.

The KL expansion method is implemented based on the headwind series of 849 flights.

Figure 4b shows the relationship between the cumulative variance ratio and the truncated order

K.

$K=20$ is selected in this case, and more than 99.9% variance of the process is obtained. Eigenfunctions

${\varphi}_{k}$ of the autocorrelation matrix are shown in

Figure 4c. According to the Equation (

9), three arbitrary reconstructed wind series using the calculated

${\xi}_{k}$ are illustrated and compared with raw wind series data in

Figure 4d. Although only 20 terms KL functions are used in the construction, the reconstructed wind series and raw wind series match very well, which indicates that truncated KL expansion can be used to reconstruct the headwind series.

New samples of the

${\xi}_{k}$ are required to generate the new wind series. The marginal distribution of each

${\xi}_{k}$ is studied. Box plot of

${\xi}_{k}$ in

Figure 5a shows the computed

${\xi}_{k}$ are distributed close to zero mean and unit variance, which is also derived in Equation (

10). However,

Figure 5a also reveals that the median of

${\xi}_{k}$ is not always zero. The empirical CDF of

${\xi}_{k}$ and standard Gaussian distribution are compared in

Figure 5b. All blue dash lines are related to the empirical CDF of

${\xi}_{k}$. It indicates that some of

${\xi}_{k}$ are almost following the standard Gaussian distribution. However, some deviations of CDF do exist in this case. Furthermore, marginal distributions for all

${\xi}_{k}$ are estimated using maximum likelihood methods from the distribution candidates, such as ’Gaussian’, ’Student-t’, ’Generalized Extreme Value’, ’t Location-Scale’, ’Logistic’ distributions, and so on. Estimated probability densities of

${\xi}_{k}$ are compared in

Figure 5c, and the corresponding distribution parameters are shown in

Table A1 of

Appendix B. Deviations in PDF between the estimated distributions and Gaussian distribution are obvious. Instead of Gaussian distribution, the estimated marginal distributions of

${\xi}_{k}$ are used to generate the realizations of

${\xi}_{k}$, which are subsequently used for the new wind series production based on the Equation (

9).

As mentioned before, dependence among

${\xi}_{k}$ might exist for non-Gaussian distributed

${\xi}_{k}$. A vine copula approach is applied to capture the dependence of the obtained

${\xi}_{k}$. To analyze the pairwise dependence between two

${\xi}_{k}$, the empirical contour plots using the copula data are shown in

Figure A1 of

Appendix C. The copula data has three scales:

X scale denoted by

${\xi}_{k}^{X}$,

U scale denoted by

${\xi}_{k}^{U}$, and

Z scale denoted by

${\xi}_{k}^{Z}$.

${\xi}_{k}^{X}$ is the same with the raw data

${\xi}_{k}$.

${\xi}_{k}^{U}$ is obtained via transforming all samples of

${\xi}_{k}$ into the uniform space as follows:

where

${F}_{k}$ is the CDF of the marginal distribution of

${\xi}_{k}$.

${\xi}_{k}^{Z}$ is used in the normalized contour plots, and it is obtained via transforming all samples of

${\xi}_{k}$ into the standard Gaussian space using:

where

$\mathsf{\Psi}$ is the CDF of the standard Gaussian distribution.

Figure A1 shows that most dependence between two

${\xi}_{k}$ is small, but some pair dependence exist, for example, the dependence between

${\xi}_{10}$ and

${\xi}_{16}$. A vine copula with the estimated marginal distributions is used in this case to capture the dependence. The parametric and nonparametric bivariate copula are used separately and normalized contour plots on the Z scale are shown in

Figure A2a,b, respectively. The parametric bivariate copula is using the defined copula function to fit the data, and the unknown parameters in the copula functions are estimated [

17]. In contrast, the nonparametric bivariate copula is using the empirical function for two variables, which is the same as the empirical CDF for one variable. New realizations of

${\xi}_{k}$ can be generated via independent sampling, parametric vine copula sampling, and nonparametric vine copula sampling. To visualize the difference of generated samples, sample plots between

${\xi}_{10}$ and

${\xi}_{16}$ are shown in

Figure 6. The raw data (

Figure 6a), independent samples (

Figure 6b), parametric (

Figure 6c) and nonparametric copula samples (

Figure 6d) in

Z scale are illustrated, respectively. The dependence between

${\xi}_{10}$ and

${\xi}_{16}$ is evident in this case. Compared with the individually sampling from the marginal distributions, samples generated using the vine copula capture the dependence of the raw data and thus give a better representation of the raw data.

The statistical characteristics of the generated wind series are analyzed to validate the construction results. 5000 samples of

${\xi}_{k}$ are generated using the independent sampling and the vine copula sampling. Then the corresponding 5000 wind series are generated for independent samples and dependent samples, respectively. The empirical CDFs and the kernel fitted PDFs of headwind speed at four different altitudes: 1000 feet (304.80 m), 600 feet (182.88 m), 300 feet (91.44 m), and 50 feet (15.24 m) are shown in

Figure 7 and

Figure 8. It indicates that the generated wind series obtain a similar statistical distribution at a certain altitude compared with the raw data. Moreover, there is no significant difference in the headwind distribution between the independent and dependent sampling in 1000 feet, 600 feet, 300 feet. However, headwind distribution densities at 50 feet using the parametric and nonparametric copula sampling are closer to the raw data than the independent sampling.

To further analyze the results, the first four order sample moments of the observed wind data along with the altitude are calculated and shown in

Figure 9. It shows that the constructed series have a good match at the mean and standard deviation with the raw data. For the independent sampling, the skewness is close to 0, and the kurtosis is always close to 3 for all altitudes. It is close to the moments of Gaussian distribution. In terms of the skewness and kurtosis, results show that the generated series can not exactly capture the skewness and kurtosis. However, the statistical moments using the parametric and nonparametric copula are closer to the raw statistical moments than the independent sampling, especially at the low altitude. Since there is no significant deviation in the numerical value of the skewness and kurtosis, the generated headwind series can be used in simulations for the quantitative assessment.

#### 3.3. Windshear

A safety-critical wind series, low-level wind shear, has been recognized as a severe hazard for flight safety. To analyze and quantify the occurrence probability and severity of the wind shear, the significant change of headwind

${V}_{W}$ with the change of the altitude

h, so-called horizontal wind shear ramps, are extracted from the flight data. The algorithm is to detect the extreme incremental values of

$\Delta {V}_{W}$ within a certain interval

$\mathsf{\Gamma}$, which is defined as a certain height deviation in this case. A headwind increment for a moving window with a given width of

$\Delta H=\mathsf{\Gamma}/2$ is defined as below:

A headwind series of one flight is illustrated in

Figure 10. The headwind speed incremental threshold

$\Delta {V}_{W}^{*}$ and the detected hight window

$\Delta {H}^{*}$ are predefined. As shown in

Figure 10, the incremental threshold

$\Delta {V}_{W}^{*}$ is set to 5 knots (

$2.57$ m/s) and the width of the rolling window

$\Delta H$ is set to 100 feet. Two wind shear ramps with

$\Delta {V}_{W}\ge 5$ knots and

$\mathsf{\Gamma}=2\Delta H=200$ feet are extracted from this flight.

The detection algorithm is applied to the 849 flights. The incremental threshold of

$\Delta {V}_{W}$ is set to 5 knots, and the search window is set to 50 feet, 100 feet, and 200 feet, respectively. The mean and standard deviation of the extracted wind shear ramps series and the corresponding number of extracted series are visualized in

Figure 11. It shows that the features of the mean and standard deviation depend on the width of the search window.

Given the detected series of wind shear ramps, the KL expansion method allows us to construct new series and enrich the database of the wind shear ramps. The KL truncated order

$K=10$ is used in the construction of wind shear ramps. 5000 new series of

$\Delta {V}_{W}\ge 5$ knots are generated via the independent sampling, parametric and nonparametric copula sampling, respectively. As shown in

Figure 11, the mean value of the raw data and generated series match very well, and there are only small biases in the standard deviation. In the case of the

$\mathsf{\Gamma}=200$ feet, the parametric copula sampling has a better match with the raw data than others. The comparison results show that KL constructed model can be used to produce the new wind shear ramps series with similar statistical characteristics.

To further validate if the generated series can be used for the analysis of the wind shear intensity, a metric of

$max\Delta {V}_{W}/\Delta h$ related to the wind shear intensity is defined using the maximum change of headwind with the altitude as below:

The metric

$max\Delta {V}_{W}/\Delta h$ is computed using the raw data and generated series data, respectively. The probability densities of the metric in the three cases are compared. The PDFs of the metrics are estimated using the kernel density functions. As illustrated in

Figure 12, the estimated PDFs based on the raw data and generated new series have a good match for the case

$\mathsf{\Gamma}=200$ feet. However, the small deviation exists for the case

$\mathsf{\Gamma}=100$ feet, and

$\mathsf{\Gamma}=400$ feet. Since the numbers of the detected wind shear ramps are limited, increasing the raw data set might improve the results. Nevertheless, the larger difference in PDFs among the three cases can be captured using the generated series. The constructed results also show the same statistical characteristics in wind shear intensity with the raw data. The KL expansion model of wind shear ramps can be used for the analysis of the wind shear intensity.