Explainable Learning Framework for the Assessment and Prediction of Wind Shear-Induced Aviation Turbulence

Khattak, Afaq; Chan, Pak-wai; Chen, Feng; Elhassan, Adil A. M.; Alsulami, Badr T.

doi:10.3390/atmos16121318


by Afaq Khattak ¹,*, Pak-wai Chan ², Feng Chen ³,*, Adil A. M. Elhassan ⁴ and Badr T. Alsulami ⁵

¹ Department of Civil, Structural and Environmental Engineering, Trinity College Dublin, D02 PN40 Dublin, Ireland

² Hong Kong Observatory, 134A Nathan Road, Kowloon, Hong Kong, China

³ Key Laboratory of Infrastructure Durability and Operation Safety in Airfield of CAAC, Tongji University, 4800 Cao’an Road, Jiading, Shanghai 201804, China

⁴ Department of Civil Engineering, College of Engineering, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia

⁵ Civil Engineering Department, College of Engineering and Architecture, Umm Al-Qura University, Makkah 24382, Saudi Arabia

* Authors to whom correspondence should be addressed.

Atmosphere 2025, 16(12), 1318; https://doi.org/10.3390/atmos16121318

Submission received: 8 October 2025 / Revised: 13 November 2025 / Accepted: 19 November 2025 / Published: 22 November 2025

(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)


Abstract

Wind shear-induced aviation turbulence (WSAT) remains a major safety concern during approach and takeoff phases at complex terrain airports. This study develops an interpretable Explainable Boosting Machine (EBM) framework to classify WSAT events at Hong Kong International Airport (HKIA). The framework integrates Differential Evolution with HyperBand (DEHB) for hyperparameter tuning and applies multiple data balance methods such as SMOTE, Borderline SMOTE, Safe-Level SMOTE, and G-SMOTE. The dataset consists of Pilot Reports (PIREPs) collected between 1 January 2007 and 31 July 2023, with 6838 wind shear events that include variables that relate to wind shear magnitude, altitude, runway distance, rainfall condition, and causal factors. Among all configurations, the EBM tuned via DEHB and trained with SMOTE-treated data achieved the highest predictive performance with BA = 0.710, MCC = 0.321, and G-Mean = 0.708, higher than untreated and other balance variants. EBM-based interpretation showed that wind shear altitude and wind shear magnitude were key predictors, and their interaction reflected a nonlinear pattern where WSAT probability rose under moderate-to-high shear conditions (wind shear altitude ≈ 0.5–2.5 and magnitude ≈ 30–35 knots). The DEHB-optimized EBM–SMOTE framework provides a transparent interpretive foundation for WSAT risk assessment and advances quantitative evaluation in aviation meteorology.

Keywords: wind shear; aviation turbulence; HKIA; DEHB

1. Introduction

Wind shear-induced aviation turbulence (WSAT) is one of the most critical aerodynamic hazards in flight operations. It results from abrupt spatial or temporal variations in the wind vector, which cause rapid changes in lift, thrust, and aircraft attitude, thereby reducing flight path stability. WSAT is most hazardous during low-altitude flight phases such as approach and takeoff, when aircraft operate with limited energy margins and minimal recovery altitude. The International Civil Aviation Organization [1] defines low-level wind shear as that occurring below 1600 feet above ground level or within 3 nautical miles of a runway threshold. In such zones, vertical and horizontal wind gradients can generate severe turbulence that often forces pilots to conduct go-around or missed-approach maneuvers to maintain safety. Aviation Turbulence Intensity (ATI) is expressed through the cube root of the Eddy Dissipation Rate (EDR), which represents the rate at which turbulent kinetic energy cascades and dissipates into smaller eddies within the atmospheric boundary layer [2,3]. The EDR metric provides a standardized, aircraft-independent measure of turbulence intensity that allows uniform reporting across different aircraft and flight conditions [4]. Low-to-moderate or insignificant aviation turbulence corresponds to $\mathrm{EDR}^{1/3}$ values between 0.3 and 0.5 $\mathrm{m^{2/3}\,s^{-1}}$, while values above 0.5 $\mathrm{m^{2/3}\,s^{-1}}$ indicate significant turbulence [5]. Such conditions can lead to rapid altitude fluctuations, increased control input requirements, and high structural loads on the aircraft.

Among major airports vulnerable to wind shear and low-level aviation turbulence, Hong Kong International Airport (HKIA) is one of the most affected [6,7,8]. Due to frequent events, HKIA has deployed the Wind Shear and Turbulence Warning System (WTWS), Doppler Light Detection and Ranging (Doppler LiDAR), and dense anemometer networks to monitor, identify, and warn of hazardous wind phenomena in real time [9,10]. These systems form one of the most comprehensive aviation meteorological infrastructures worldwide. In addition to these automated detection systems, Pilot Reports (PIREPs) act as an important operational tool for identifying and verifying wind shear and turbulence events [11]. Each PIREP provides a direct in-flight observation from flight crews on conditions such as turbulence, wind shear, or convective activity that may not be captured by ground-based sensors. At HKIA, where terrain-induced flow distortion occurs frequently, Pilot Reports (PIREPs) furnish real-time validation for WTWS and LiDAR-based alerts. Since operations began in July 1998, statistical assessments show that roughly one in every 500 arrival or departure operations has encountered significant wind shear, whereas about 1 in 3500 flights has reported significant aviation turbulence, on average [12]. However, despite their operational value in situational awareness, these systems lack inherent predictive capability, which establishes the necessity for advanced data-driven forecast strategies.

In recent years, the use of Artificial Intelligence (AI) has expanded rapidly across multiple domains, including healthcare and medicine [13,14,15], finance and economics [16,17], and engineering [18,19]. This growing prominence of AI is primarily due to the exponential increase in data availability, diversity, and computational capability, which together enable the development of advanced analytical frameworks capable of processing complex datasets with high precision and automation [20,21]. The use of AI within aviation has likewise grown rapidly, with research now centered on meteorological forecasting, turbulence detection, and improvements in operational safety. For instance, Stacked Temporal Convolutional Networks and Extreme Gradient Boosting Framework (TCNs + XGBoost) have been used for low-level wind shear prediction [22], Artificial Neural Networks (ANNs) for short-term wind gust forecasting [23], and Random Forest (RF) and Long Short-Term Memory (LSTM) [24], as well as hybrid Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM) architectures, for visibility forecasting and classification [25]. In addition, Principal Component Analysis (PCA) combined with the K-Means clustering method has been applied for the prediction of aviation turbulence risk [26]. Similarly, AI methods have been employed to improve airport efficiency and flight risk management. The decision tree (DT) approach has been used for estimating runway occupancy time [27], while Support Vector Machines (SVMs) [28], unsupervised clustering algorithms, including Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [29], and CatBoost [30] have been utilized for flight delay prediction. Furthermore, Bayesian Neural Networks (BNNs) have been applied for the prediction of flight trajectory [31]. For Low-Visibility Conditions (LVCs), a decision support system based on the Analog Ensemble (AnEn) strategy was developed, which predicts LVCs up to 24 h ahead [32].

Despite the growing use of AI in aviation analytics, two major challenges remain in predicting WSAT from PIREP data. The first challenge lies in the imbalanced distribution of aviation turbulence classes, where reports of low/moderate or insignificant turbulence greatly exceed those of severe or significant turbulence. This imbalance biases the learning process and limits the predictive accuracy of models for the most safety-critical events. The second challenge concerns the lack of interpretability in conventional AI algorithms. Many existing machine learning models operate as opaque “black-box” systems [33,34] that provide predictions without revealing how different variables affect the final output, which restricts their practical value. To address these issues, this study employs the Explainable Boosting Machine (EBM), a transparent glass-box model within the Generalized Additive Model (GAM) structure, which combines interpretability with stronger predictive capacity [35,36,37,38]. To correct class imbalance in the PIREP dataset, SMOTE and its variants, including Borderline SMOTE, Safe-Level SMOTE, and G-SMOTE, are used to generate synthetic minority samples so the classifier forms a more balanced decision boundary [39,40,41,42]. Hyperparameter tuning is carried out using Differential Evolution with HyperBand (DEHB), which refines model configuration and improves prediction while reducing overfit risk [43]. The major contributions of this study are summarized as follows:

Development of an interpretable EBM-based framework for predicting WSAT in the vicinity of airport runways.
Application of data balancing techniques, including SMOTE and its advanced variants, to address class imbalance within the PIREP dataset and improve EBM model performance.
Provision of transparent feature-level interpretation of WSAT prediction outcomes through the inherently explainable structure of the EBM.

The remainder of this paper is structured as follows. Section 2 provides a case study description as well as a theoretical overview of the DEHB–EBM framework with data balancing strategies. Section 3 discusses the EBM analysis results and its interpretation, while Section 4 concludes with recommendations for future research.

2. Materials and Methods

2.1. Case Study Description

In this study, HKIA is taken as the case location as it is situated within one of the most aerodynamically complex operational environments in global civil aviation due to the strong terrain–flow interaction around Lantau Island [44]. It is directly influenced by complex terrain features, including Tai Tung Shan (871 m), Yi Tung Shan (747 m), and Lantau Peak (934 m), which induce strong orographic flow disturbances, as shown in Figure 1. Under prevailing southerly or southeasterly synoptic flows, the surrounding terrain acts as a major perturbation source, which forces abrupt displacement of the approaching synoptic flow. This displacement then produces mechanically induced turbulence, rotor-type circulations, and localized wind shear within the approach and departure corridor [7]. The HKIA-based PIREPs also revealed that most wind shear events cluster very near ground level below 200 m, which shows the highest hazard density, as shown in Figure 2 [45]. The frequency drops rapidly with altitude, which shows far fewer reports beyond 600 m. This pattern reflects the aerodynamic instability that is strongest close to the runway environment during the takeoff and landing phases. Furthermore, studies also revealed that based on HKIA-based PIREPs, runway corridor 07LA at HKIA experienced the highest number of wind shear events, while some corridors such as 07LD and 25LD had very few, as shown in Figure 3 [46]. Similarly, wind shear occurrence shows clear seasonal variation across months at HKIA. Occurrence reaches higher levels in spring months, with April and March at the top range, as shown in Figure 4. Winter months remain low, while late summer and early autumn months show moderate levels.

2.2. Theoretical Overview of the DEHB–EBM Framework

The proposed framework integrates the EBM with DEHB and SMOTE and its variants to improve interpretability and prediction reliability in WSAT classification using PIREP data. The EBM provides a transparent additive structure that captures both nonlinear and interaction effects among features [48,49,50], while DEHB determines the best model hyperparameters [43]. SMOTE and its variants address class imbalance by creating synthetic minority samples, which allows the model to better detect significant WSAT events [51]. A 10-fold cross-validation (CV) procedure is applied to validate generalization capability and avoid bias from single-split evaluation. Together, they form an interpretable and data-driven analytical framework suitable for WSAT classification. The conceptual workflow is shown in Figure 5, and the theoretical formulation is presented below.

2.2.1. Explainable Boosting Machine Model (EBM)

The EBM is a GAM with pairwise interaction terms, designed to provide interpretability while preserving predictive flexibility. Given a set of predictor variables $X = [x_{1}, x_{2}, \dots, x_{n}]$ and a target variable $y$, EBM models the prediction function as Equation (1).

$$\hat{y} = f(X) = \beta_{0} + \sum_{i=1}^{n} f_{i}(x_{i}) + \sum_{i<j} f_{ij}(x_{i}, x_{j}) \tag{1}$$

where $\beta_{0}$ is the global intercept, $f_{i}(x_{i})$ represents the univariate shape function that captures the nonlinear contribution of feature $x_{i}$, and $f_{ij}$ denotes pairwise interaction terms between features $x_{i}$ and $x_{j}$.

Each univariate and interaction function is learned through gradient boosting of shallow decision trees trained in a cyclic additive manner. This structure allows EBM to approximate complex relationships while retaining interpretability. In the case of WSAT, the probability of significant aviation turbulence $(\psi_{\text{sig}})$ is modeled through the logistic link function, given as Equation (2).

$$P(y = 1 \mid X) = \frac{1}{1 + e^{-\hat{y}}} \tag{2}$$

The predicted turbulence class $\tilde{y}$ is then determined by a threshold applied to the probability, as shown by Equation (3).

$$\tilde{y} = \begin{cases} 1, & \text{if } P(y = 1 \mid X) \geq \tau, \\ 0, & \text{otherwise,} \end{cases} \tag{3}$$

where $\tau$ is the decision threshold, generally set to 0.5 for binary classification.
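Equations (1)–(3) can be traced end to end with a minimal Python sketch. It is only an illustration under stated assumptions: the step-shaped functions, the intercept, and the input values are made up, and the helper names (`ebm_score`, `predict_proba`, `predict_class`) are hypothetical rather than part of any library or of the authors' implementation.

```python
import math

def ebm_score(x, intercept, shape_fns, pair_fns):
    """Additive score of Equation (1): intercept + univariate + pairwise terms."""
    score = intercept
    for i, f_i in shape_fns.items():           # f_i(x_i)
        score += f_i(x[i])
    for (i, j), f_ij in pair_fns.items():      # f_ij(x_i, x_j) with i < j
        score += f_ij(x[i], x[j])
    return score

def predict_proba(score):
    """Logistic link of Equation (2)."""
    return 1.0 / (1.0 + math.exp(-score))

def predict_class(prob, tau=0.5):
    """Threshold rule of Equation (3)."""
    return 1 if prob >= tau else 0

# Hypothetical step-shaped terms, similar in form to what boosted shallow trees yield.
shape_fns = {
    0: lambda v: 0.8 if v >= 1.0 else -0.4,    # made-up contribution of feature 0
    1: lambda v: 0.5 if v >= 20.0 else -0.2,   # made-up contribution of feature 1
}
pair_fns = {(0, 1): lambda a, b: 0.3 if (a >= 1.0 and b >= 20.0) else 0.0}

x = [1.5, 25.0]                                # made-up input
score = ebm_score(x, intercept=-1.0, shape_fns=shape_fns, pair_fns=pair_fns)
prob = predict_proba(score)
label = predict_class(prob)
```

Because the score is a plain sum, each term's contribution to the final probability can be read off directly, which is the property the interpretation in Section 3.2 relies on.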

2.2.2. Data Balancing Through SMOTE and Its Variants

Due to the class imbalance in PIREP data, where instances of low or insignificant aviation turbulence $(\psi_{\text{in-sig}})$ dominate over significant turbulence $(\psi_{\text{sig}})$ events, data balancing is applied before model training. SMOTE generates artificial instances for the minority class by interpolating between existing minority instances. Given a minority class instance $x_{i}$ and one of its $k$ nearest neighbors $x_{i,NN}$, the synthetic instance $x_{new}$ is created, as shown in Equation (4).

$$x_{new} = x_{i} + \lambda (x_{i,NN} - x_{i}), \quad \lambda \sim U(0, 1) \tag{4}$$

where $\lambda$ is a random number drawn from a uniform distribution $U(0, 1)$. This formulation ensures that the new instances lie on the line segment connecting $x_{i}$ and its nearest neighbor, which enriches the minority class without duplication.

To enhance the minority class representation, four oversampling techniques are applied, including SMOTE, Borderline SMOTE, Safe-Level SMOTE, and G-SMOTE. SMOTE generates synthetic instances by interpolating between minority instances and their nearest neighbors to create a balanced dataset. Borderline SMOTE concentrates on minority instances near the decision boundary to strengthen class distinction and reduce misclassification risk. Safe-Level SMOTE adjusts the number of synthetic instances according to the safety level of each instance, minimizing the risk of noise amplification. G-SMOTE employs geometric constraints around minority instances to generate adaptive synthetic instances within a safe feature space, improving diversity and boundary learning. The degree of oversampling $(\alpha)$ determines the number of synthetic instances added, as given in Equation (5).

$$N_{new} = \alpha \times N_{minority} \tag{5}$$

where $N_{minority}$ is the number of original minority class instances.
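The interpolation of Equation (4) and the sample count of Equation (5) can be sketched in a few lines of plain Python. This covers basic SMOTE only, not the Borderline, Safe-Level, or geometric variants. The function name `smote`, the default `k=5`, the Euclidean distance, and the fixed seed are illustrative assumptions, not details taken from the study.

```python
import random

def smote(minority, alpha, k=5, seed=0):
    """Generate round(alpha * N_minority) synthetic points (Equation (5))
    by interpolating towards a random one of the k nearest minority neighbours."""
    rng = random.Random(seed)
    n_new = int(round(alpha * len(minority)))
    synthetic = []
    for _ in range(n_new):
        x_i = rng.choice(minority)
        # Other minority points, sorted by squared Euclidean distance to x_i.
        others = [p for p in minority if p is not x_i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, x_i)))
        x_nn = rng.choice(others[:k])
        lam = rng.random()                     # lambda drawn uniformly from [0, 1)
        # Equation (4): a point on the segment between x_i and x_nn.
        synthetic.append([a + lam * (b - a) for a, b in zip(x_i, x_nn)])
    return synthetic

# Made-up minority points on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, alpha=2, k=3)
```

Every synthetic point lies between two existing minority points, so in this toy case all coordinates stay within the unit square.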

2.2.3. DEHB for Hyperparameter Tuning of EBM

Hyperparameter tuning plays an important role in achieving high predictive performance for EBM. Traditional methods such as grid search or random search often require extensive computational effort and fail to efficiently explore complex, high-dimensional search spaces [52]. To address this limitation, the DEHB algorithm provides a reliable and resource-efficient approach for hyperparameter optimization by combining the global exploration capability of Differential Evolution (DE) with the adaptive resource allocation mechanism of Hyperband. Because tuning changes only the hyperparameters and not the additive form of the model, this hybrid approach optimizes the EBM while preserving interpretability and enhancing generalization capability.

Differential Evolution Component

The DEHB algorithm maintains a population of candidate hyperparameter vectors, denoted as $\Theta$, given as Equation (6).

$$\Theta = \{\theta_{1}, \theta_{2}, \dots, \theta_{N}\} \tag{6}$$

where $N$ is the population size and each $\theta_{i}$ represents a vector of hyperparameters for the EBM.

At each generation $g$, new candidate solutions are generated through mutation and crossover operations. For a target vector $\theta_{i}^{g}$, a donor vector $v_{i}^{g}$ is created, as shown by Equation (7).

$$v_{i}^{g} = \theta_{r1}^{g} + F (\theta_{r2}^{g} - \theta_{r3}^{g}) \tag{7}$$

where $\theta_{r1}^{g}, \theta_{r2}^{g}, \theta_{r3}^{g}$ are distinct randomly selected population members, and $F \in [0, 1]$ is the scaling factor controlling mutation strength.

The crossover step creates a trial vector $u_{i}^{g}$, as illustrated by Equation (8).

$$u_{i,j}^{g} = \begin{cases} v_{i,j}^{g}, & \text{if } rand_{j}(0, 1) \leq CR \text{ or } j = j_{rand}, \\ \theta_{i,j}^{g}, & \text{otherwise,} \end{cases} \tag{8}$$

where $CR \in [0, 1]$ is the crossover probability, and $j_{rand}$ ensures that at least one parameter is inherited from the donor.

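The selection step determines whether the trial vector replaces the target vector, as shown by Equation (9).

$$\theta_{i}^{g+1} = \begin{cases} u_{i}^{g}, & \text{if } f(u_{i}^{g}) \leq f(\theta_{i}^{g}), \\ \theta_{i}^{g}, & \text{otherwise,} \end{cases} \tag{9}$$

where $f(\cdot)$ here denotes the objective being minimized (for example, a validation loss), not the EBM prediction function of Equation (1).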
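One generation of the mutation, crossover, and selection steps in Equations (7)–(9) can be sketched as follows. The function name `de_generation`, the defaults `F=0.5` and `CR=0.9`, and the sphere objective are illustrative assumptions. Excluding the target index when drawing the three random members is a common convention that the text does not state, and no bound handling is included.

```python
import random

def de_generation(population, objective, F=0.5, CR=0.9, rng=None):
    """Apply one DE generation (mutation, crossover, greedy selection)."""
    rng = rng or random.Random(0)
    dim = len(population[0])
    next_population = []
    for i, target in enumerate(population):
        # Three distinct members, all different from the target (assumed convention).
        candidates = [k for k in range(len(population)) if k != i]
        r1, r2, r3 = rng.sample(candidates, 3)
        # Equation (7): donor vector.
        donor = [population[r1][j] + F * (population[r2][j] - population[r3][j])
                 for j in range(dim)]
        # Equation (8): binomial crossover; j_rand forces one donor component.
        j_rand = rng.randrange(dim)
        trial = [donor[j] if (rng.random() <= CR or j == j_rand) else target[j]
                 for j in range(dim)]
        # Equation (9): keep the trial only if it is at least as good.
        keep_trial = objective(trial) <= objective(target)
        next_population.append(trial if keep_trial else target)
    return next_population

def sphere(theta):
    """Toy objective standing in for a validation loss."""
    return sum(v * v for v in theta)

rng = random.Random(42)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(8)]
new_population = de_generation(population, sphere, rng=rng)
```

Because selection is greedy per member, no individual's objective value can get worse from one generation to the next.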

Hyperband Component

The Hyperband mechanism allocates computational resources efficiently across candidate configurations by performing successive halving. Each configuration is initially assigned a small training budget, and only the best-performing fraction proceeds to higher budgets.

For a given budget $b$ and reduction factor $\Gamma > 1$, Hyperband is defined as Equation (10).

$$n_{i} = \frac{n_{\max}}{\Gamma^{i}}, \quad b_{i} = b_{\max} \Gamma^{i}, \quad i = 0, 1, \dots, s, \tag{10}$$

where $n_{i}$ is the number of configurations at stage $i$, and $b_{i}$ is the corresponding computational budget.
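The schedule in Equation (10) is easy to tabulate. The sketch below follows its form directly: the number of configurations shrinks by $\Gamma$ per stage while the budget per configuration grows by $\Gamma$. The name `halving_schedule`, the parameter `base_budget` (the constant multiplying $\Gamma^{i}$), the integer flooring of $n_{i}$, the floor of one configuration, and the example values are all assumptions for illustration.

```python
def halving_schedule(n_max, base_budget, gamma, s):
    """Return (stage, n_i, b_i) for i = 0..s following the form of Equation (10)."""
    stages = []
    for i in range(s + 1):
        n_i = max(1, int(n_max / gamma ** i))   # configurations kept at stage i
        b_i = base_budget * gamma ** i          # budget per configuration at stage i
        stages.append((i, n_i, b_i))
    return stages

# Made-up example: 27 configurations, reduction factor 3, four stages.
schedule = halving_schedule(n_max=27, base_budget=1, gamma=3, s=3)
for stage, n_i, b_i in schedule:
    print(f"stage {stage}: {n_i} configurations, budget {b_i} each")
```

In this example each stage spends the same total budget (27 units), which is why early elimination lets many configurations be screened cheaply.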

Combined DEHB Procedure

The DEHB algorithm integrates the evolutionary update rule of DE with the early-stopping mechanism of Hyperband. Each population member’s fitness evaluation corresponds to a model trained on a given resource budget $b_{i}$. Configurations with lower performance are eliminated early, while promising configurations are further improved through DE mutation and crossover operations. Formally, DEHB can be represented as Equation (11).

$$\theta^{*} = \underset{\theta \in \Theta}{\arg\min} \; f(\theta, b_{i}) \tag{11}$$

The search in Equation (11) proceeds under adaptive budget allocation and evolutionary search updates until convergence or maximum resource consumption.

Model Output and Interpretability

Once optimized, the DEHB–EBM framework produces both predicted classifications and feature contribution profiles. The effect of each factor can be visualized through partial dependence plots of $f_{i}(x_{i})$ and pairwise surfaces of $f_{ij}(x_{i}, x_{j})$. These representations quantitatively describe how each meteorological and operational factor influences aviation turbulence probability. The WSAT class is then assigned from the thresholded probability of Equations (2) and (3), as given in Equation (12).

$$\mathrm{WSAT} = \begin{cases} \psi_{\text{sig}}, & \text{if } P(y = 1 \mid X) \geq \tau, \\ \psi_{\text{in-sig}}, & \text{otherwise.} \end{cases} \tag{12}$$

2.3. Performance Measures

The predictive and classification performance of the EBM was assessed using six metrics, including precision $(Pn)$, recall $(Rl)$, balanced accuracy $(BA)$, geometric mean $(GM)$, Matthews Correlation Coefficient (MCC), and the Receiver Operating Characteristic (ROC) curve. These metrics were computed from the confusion matrix, which compares predicted and actual class outcomes. The corresponding performance formulas are presented in Table 1.
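For reference, the usual confusion-matrix definitions of these measures are written out below. They are assumed, not confirmed, to match Table 1. Here $\psi_{\text{sig}}$ is taken as the positive class, $TP$, $TN$, $FP$, and $FN$ are the true/false positive/negative counts, and the specificity symbol $Sp$ is introduced only for this sketch.

```latex
\begin{aligned}
Pn &= \frac{TP}{TP + FP}, \qquad
Rl = \frac{TP}{TP + FN}, \qquad
Sp = \frac{TN}{TN + FP}, \\[4pt]
BA &= \frac{Rl + Sp}{2}, \qquad
GM = \sqrt{Rl \times Sp}, \\[4pt]
MCC &= \frac{TP \cdot TN - FP \cdot FN}
            {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{aligned}
```

$BA$ and $GM$ weight both classes equally, and $MCC$ uses all four cells of the confusion matrix, so none of the three is inflated by the dominance of the majority class in the way plain accuracy would be.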

3. Analysis and Results

This section presents the analytical outcomes derived from HKIA-based PIREPs collected between 1 January 2007 and 31 July 2023. During this period, PIREPs reported a total of 6838 WSAT events from both outbound and inbound flight operations. Among these, 1169 WSAT events were classified as $\psi_{\text{sig}}$ and the remaining 5668 were classified as $\psi_{\text{in-sig}}$. A binary classification framework was established in which each observation was assigned a categorical label, i.e., “1” represents $\psi_{\text{sig}}$ (minority class), while “0” represents $\psi_{\text{in-sig}}$ (majority class), as defined in Equation (12). Table 2 provides the detailed description and coding scheme of the factors extracted from HKIA-based PIREPs that were used as input factors in the classification model.

Figure 6a–e present the frequency distributions of wind shear-related factors categorized by aviation turbulence severity $(\psi_{\text{sig}} / \psi_{\text{in-sig}})$. Figure 6a shows that most turbulence events occurred at altitudes below 400 ft, with 3952 $\psi_{\text{in-sig}}$ and 503 $\psi_{\text{sig}}$ cases, which indicates that near-surface wind shear contributes most to turbulence generation. Figure 6b shows that the majority of $\mu_{WS}$ values fall within the 15–20 knots range (4806 $\psi_{\text{in-sig}}$ and 971 $\psi_{\text{sig}}$). Figure 6c compares $\rho_{WS}$ and shows that most $\psi_{\text{in-sig}}$ events occurred under no-rain conditions. Furthermore, Figure 6d,e illustrate the influence of $\phi_{WS}$ and $\delta_{WS}$, respectively. Terrain-induced wind shear dominated the dataset, with 3671 $\psi_{\text{in-sig}}$ and 746 $\psi_{\text{sig}}$ cases, followed by sea breeze (1272 $\psi_{\text{in-sig}}$ and 360 $\psi_{\text{sig}}$) and gust front (725 $\psi_{\text{in-sig}}$ and 63 $\psi_{\text{sig}}$). Regarding $\delta_{WS}$, most reports occurred at or near the runway, with 1815 $\psi_{\text{in-sig}}$ and 415 $\psi_{\text{sig}}$ cases “at runway” and 1857 $\psi_{\text{in-sig}}$ and 317 $\psi_{\text{sig}}$ cases within 0–1 NM from the runway. Table 3 summarizes the class distribution before and after data balancing.

3.1. EBM Training and Testing

The EBM model was applied to classify WSAT severity using the extracted predictor variables. The full dataset contained 6837 samples, with 5668 $\psi_{\text{in-sig}}$ cases (82.9%) and 1169 $\psi_{\text{sig}}$ cases (17.1%), as shown in Figure 6. The data were divided into a 70% training subset (3967 $\psi_{\text{in-sig}}$ and 818 $\psi_{\text{sig}}$) and a 30% testing subset (1701 $\psi_{\text{in-sig}}$ and 351 $\psi_{\text{sig}}$), while retaining the original class proportions. Since the imbalance existed within the training subset, oversampling was applied only to this portion. SMOTE and its three variants (Borderline SMOTE, Safe-Level SMOTE, and Geometric SMOTE) were used, resulting in a balanced training distribution of 3967 $\psi_{\text{sig}}$ and 3967 $\psi_{\text{in-sig}}$ cases, as shown in Table 3.
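The split sizes and the amount of oversampling implied by these counts can be checked with simple arithmetic. The sketch below assumes each class is split separately and the training count is rounded down, which reproduces the reported numbers. The actual splitting tool and its rounding rule are not stated in the text. The oversampling degree follows the form of Equation (5).

```python
import math

n_insig, n_sig = 5668, 1169            # class counts reported in the text
train_frac = 0.70

# Stratified split: apply the same fraction to each class (rounding down is assumed).
train_insig = math.floor(train_frac * n_insig)
train_sig = math.floor(train_frac * n_sig)
test_insig = n_insig - train_insig
test_sig = n_sig - train_sig

# Synthetic minority samples needed to match the majority class in training.
n_synthetic = train_insig - train_sig
alpha = n_synthetic / train_sig        # oversampling degree, N_new = alpha * N_minority

print(train_insig, train_sig, test_insig, test_sig)
print(n_synthetic, round(alpha, 2))
```

Under these assumptions, 3149 synthetic minority instances are needed, which corresponds to an oversampling degree of roughly 3.85 relative to the 818 original minority training instances. The test subset is left untouched, so reported test metrics reflect the original class proportions.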

Furthermore, in order to improve the predictive capability of the EBM, the DEHB strategy was adopted for hyperparameter optimization across four data balancing techniques, including SMOTE, Borderline SMOTE, Safe-Level SMOTE, and G-SMOTE, as shown in Figure 7a–d. Each subplot in Figure 7 presents the search range explored for six principal hyperparameters and highlights the optimal configuration (green marker) identified through the DEHB process.

The tuned hyperparameters include Max Bins, Learning Rate, Max Leaves, Min Samples Leaf, Outer Bags, and Inner Bags, which govern the interpretability, convergence behavior, and generalization capability of the EBM model. The Max Bins parameter determines the discretization level of continuous features, which influences how the model captures feature interactions. The Learning Rate controls the contribution of each iteration during boosting, balancing convergence speed and stability. Max Leaves and Min Samples Leaf define the complexity and depth of individual trees in the additive model, while Outer Bags and Inner Bags relate to the bootstrapping mechanism that improves ensemble stability and reduces overfitting.

For the SMOTE-balanced dataset, DEHB identified the optimal configuration as Max Bins = 90.021, Learning Rate = 0.043, Max Leaves = 16.225, Min Samples Leaf = 35.987, Outer Bags = 2.371, and Inner Bags = 9.099. For the Borderline SMOTE dataset, the optimal parameters were Max Bins = 164.444, Learning Rate = 0.014, Max Leaves = 24.502, Min Samples Leaf = 49.661, Outer Bags = 19.910, and Inner Bags = 5.715. Under the Safe-Level SMOTE configuration, DEHB achieved the best performance with Max Bins = 363.634, Learning Rate = 0.018, Max Leaves = 22.648, Min Samples Leaf = 15.346, Outer Bags = 2.945, and Inner Bags = 9.179. Finally, for the G-SMOTE case, the optimal settings obtained were Max Bins = 170.796, Learning Rate = 0.039, Max Leaves = 15.362, Min Samples Leaf = 27.018, Outer Bags = 17.111, and Inner Bags = 6.720.
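Several of the reported optima are fractional even though quantities such as a bin count or a leaf count are integers in practice, which suggests DEHB searched a continuous space. How the values were converted before model fitting is not stated. The sketch below shows one plausible convention, rounding to the nearest integer, for the SMOTE configuration. The snake_case keys mirror the hyperparameter names in the text and resemble the constructor arguments of InterpretML's `ExplainableBoostingClassifier`, but both the library choice and the rounding rule are assumptions.

```python
# Continuous optimum reported for the SMOTE-balanced dataset.
dehb_best_smote = {
    "max_bins": 90.021,
    "learning_rate": 0.043,
    "max_leaves": 16.225,
    "min_samples_leaf": 35.987,
    "outer_bags": 2.371,
    "inner_bags": 9.099,
}

# Hyperparameters assumed to be integer-valued in the fitted model.
INTEGER_PARAMS = {"max_bins", "max_leaves", "min_samples_leaf",
                  "outer_bags", "inner_bags"}

def to_model_kwargs(config):
    """Round integer-valued hyperparameters; leave continuous ones unchanged."""
    return {name: int(round(value)) if name in INTEGER_PARAMS else value
            for name, value in config.items()}

ebm_kwargs = to_model_kwargs(dehb_best_smote)
print(ebm_kwargs)
```

A different convention, such as truncation, would give slightly different settings (for example, a minimum leaf size of 35 instead of 36), so the choice is worth reporting alongside the tuned values.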

The performance assessment of the EBM using untreated data (without any class balancing) is illustrated in Figure 8a,b. The results reflect the baseline predictive behavior of the EBM model when trained on the original, imbalanced dataset, where the majority of cases correspond to $\psi_{\text{in-sig}}$ and the minority to $\psi_{\text{sig}}$. The confusion matrix in Figure 8a shows that the model correctly classified 1675 instances (81.6%) as $\psi_{\text{in-sig}}$, while 52 instances (2.5%) were correctly identified as $\psi_{\text{sig}}$. However, 299 $\psi_{\text{sig}}$ cases (14.6%) were misclassified as $\psi_{\text{in-sig}}$, which indicates a tendency of the model to favor the majority class. Only 26 $\psi_{\text{in-sig}}$ cases (1.3%) were incorrectly predicted as $\psi_{\text{sig}}$. This imbalance in predictions shows that the untreated EBM model struggles to identify minority-class $\psi_{\text{sig}}$ events effectively. The ROC curve in Figure 8b shows an Area under the Curve (AUC) of 0.788, which indicates reasonable discrimination between $\psi_{\text{sig}}$ and $\psi_{\text{in-sig}}$ despite class imbalance. For EBM on the untreated dataset, BA (0.566), MCC (0.262), and G-Mean (0.382) collectively reveal moderate predictive performance, with limited sensitivity toward $\psi_{\text{sig}}$ cases.
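The reported BA, G-Mean, and MCC for the untreated model can be recomputed from the four confusion-matrix counts quoted above. The sketch assumes the standard definitions of these measures, with $\psi_{\text{sig}}$ as the positive class. The function name `confusion_metrics` is hypothetical.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix measures for a binary classifier."""
    recall = tp / (tp + fn)                    # sensitivity for the positive class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    ba = (recall + specificity) / 2            # balanced accuracy
    gmean = math.sqrt(recall * specificity)    # geometric mean
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "ba": ba, "gmean": gmean, "mcc": mcc}

# Counts from Figure 8a as quoted in the text (positive class = significant turbulence).
untreated = confusion_metrics(tp=52, tn=1675, fp=26, fn=299)
for name, value in untreated.items():
    print(f"{name}: {value:.3f}")
```

The recall of about 0.148 alongside a specificity of about 0.985 makes the majority-class bias explicit: the model is rarely wrong when it predicts significant turbulence, but it misses most such events.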

The performance of the EBM after applying the SMOTE technique is illustrated in Figure 9a,b. The confusion matrix in Figure 9a shows an improved ability to detect significant turbulence events, with 266 (13.0%) correctly identified $\psi_{\text{sig}}$ cases and a reduction in missed detection compared with the untreated model. The ROC curve in Figure 9b achieved an AUC of 0.785, indicating comparable overall discriminative ability but improved class balance. The BA (0.710) and G-Mean (0.708) increased markedly relative to the untreated model (0.566 and 0.382), and the MCC (0.321) also improved (from 0.262), which indicates that SMOTE allowed the EBM to handle the minority class more effectively without reducing performance across the full dataset. The EBM with Borderline SMOTE achieved balanced performance, as shown in Figure 10a,b. The model correctly identified 261 (12.7%) $\psi_{\text{sig}}$ cases with an AUC of 0.765, BA of 0.701, G-Mean of 0.699, and MCC of 0.307, which indicated stable but slightly lower performance than standard SMOTE due to borderline sample complexity.

The Safe-Level SMOTE and G-SMOTE results are presented in Figure 11a,b and Figure 12a,b, respectively. For Safe-Level SMOTE, the EBM achieved improved minority class recognition, correctly classifying 243 (11.8%) $\psi_{\text{sig}}$ cases. The model yielded an AUC of 0.772, BA of 0.690, G-Mean of 0.690, and MCC of 0.295, which reflected balanced predictive behavior with steady sensitivity to $\psi_{\text{sig}}$ events. For G-SMOTE, the EBM maintained steady discrimination performance, correctly classifying 196 (9.6%) $\psi_{\text{sig}}$ cases with an AUC of 0.775, BA of 0.667, G-Mean of 0.658, and MCC of 0.279, as shown in Figure 12a,b.

Based on the above results, the EBM model trained with SMOTE-treated data achieved the best overall performance among all data balancing strategies and can be interpreted both globally and locally.

3.2. EBM-Based Interpretation

The EBM model trained with SMOTE-treated data identified wind shear altitude $(\eta_{WS})$ and magnitude $(\mu_{WS})$ as the two most influential predictors in distinguishing between $\psi_{\text{sig}}$ and $\psi_{\text{in-sig}}$ categories, as shown in Figure 13a. The global feature importance plot reveals that these variables, along with their interaction term $(\eta_{WS} \times \mu_{WS})$, dominate model explainability and indicate that both the strength and vertical position of wind shear critically affect aviation turbulence severity. Similar conclusions were reported in prior research that examined atmospheric turbulence dynamics, which identified vertical wind shear instabilities as one of the primary sources of aviation turbulence, along with convective and mountain wave activities [53]. A previous case study at HKIA further substantiates this relationship. The mountainous terrain of Lantau Island, located south of the airport, often disrupts airflow under specific wind directions, generating marked low-level wind shear and turbulence during aircraft approach and departure [54].

Other factors, including $\rho_{WS}$, $\delta_{WS}$, and $\phi_{WS}$, also contributed meaningfully, though to a lesser extent. The clear separation between single-feature and interaction terms implies that the model effectively captures nonlinear and interdependent aerodynamic effects. In the $\eta_{WS}$ plot (Figure 13b), the model score transitions from negative to strongly positive when $\eta_{WS}$ exceeds approximately 0.5 and maintains a stable high contribution up to 2.5 before slightly declining. This pattern indicates that low altitude between 0.5 and 2.5 considerably increases the likelihood of $\psi_{\text{sig}}$. Similarly, for $\mu_{WS}$, the model shows a nonlinear response with alternating positive and negative regions, as shown in Figure 13c. The probability of $\psi_{\text{sig}}$ begins to rise sharply once $\mu_{WS}$ exceeds 18 knots. However, at around 28–30 knots, the score drops, which implies that intermediate $\mu_{WS}$ may not always produce $\psi_{\text{sig}}$. Beyond 35 knots, a distinct positive peak appears, marking the range where strong shear coincides with maximum turbulence amplification. This pattern aligns with empirical evidence from HKIA, where recurrent low-level wind shear events of comparable magnitude result from terrain-induced flow distortion near Lantau Island [55,56,57].

Furthermore, the interaction plot between η_{WS} and μ_{WS} reveals a strong interdependence in shaping turbulence likelihood, as shown in Figure 13d. The highest positive scores, shown in bright yellow, appear where both η_{WS} and μ_{WS} reach moderate-to-high levels, typically around η_{WS} ≈ 1.5–2.5 and μ_{WS} ≈ 30–35 knots. This indicates that when elevated wind shear occurs concurrently with a strong magnitude, the conditions are most conducive to ψ_{sig} formation. On the other hand, the darker regions at the lower-left (low η_{WS} and μ_{WS}) and upper-left corners (high μ_{WS} but low η_{WS}) correspond to negative scores, which means that such combinations are likely to produce ψ_{in-sig}.
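To make the additive reading of Figure 13 concrete, the following minimal sketch shows how an EBM-style classifier combines per-feature scores and a pairwise interaction score on the log-odds scale and then maps the sum to a probability through the logistic link. The bin edges, score values, and intercept are hypothetical placeholders chosen only to mimic the qualitative shapes described above; they are not the fitted values from this study.

```python
import math
from bisect import bisect_right

# Hypothetical piecewise-constant shape functions (NOT fitted values from the study).
# Each term has bin edges and one score per bin, expressed in log-odds units.
ETA_EDGES = [0.5, 2.5]                 # coded wind shear altitude
ETA_SCORES = [-0.4, 0.6, 0.3]
MU_EDGES = [18.0, 28.0, 30.0, 35.0]    # wind shear magnitude in knots
MU_SCORES = [-0.5, 0.4, -0.2, 0.3, 0.8]
INTERCEPT = -1.5                       # hypothetical baseline log-odds


def lookup(edges, scores, x):
    """Return the score of the bin that x falls into."""
    return scores[bisect_right(edges, x)]


def interaction(eta, mu):
    """Hypothetical pairwise term: positive only when both features are elevated."""
    return 0.5 if (1.5 <= eta <= 2.5 and 30.0 <= mu <= 35.0) else -0.1


def ebm_probability(eta, mu):
    """Sum the additive terms and convert the log-odds to a probability."""
    logit = (INTERCEPT
             + lookup(ETA_EDGES, ETA_SCORES, eta)
             + lookup(MU_EDGES, MU_SCORES, mu)
             + interaction(eta, mu))
    return 1.0 / (1.0 + math.exp(-logit))


p_elevated = ebm_probability(eta=2.0, mu=32.0)  # both features elevated
p_low = ebm_probability(eta=0.0, mu=10.0)       # both features low
```

Because every term is a lookup table, each contribution to a single prediction can be read off directly, which is what makes the plots in Figure 13 a complete description of the model rather than an approximation of it.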

4. Conclusions and Recommendations

This study proposed a framework that integrates the EBM model with several data balancing strategies, such as SMOTE, Borderline SMOTE, Safe-Level SMOTE, and G-SMOTE, for WSAT classification and interpretation. Results indicate that an imbalance in the WSAT dataset affects model performance and that controlled synthetic enrichment of the minority class leads to higher accuracy and clearer interpretability outcomes. The EBM model trained with SMOTE reached the highest performance with a BA of 0.710, an MCC of 0.321, and a G-Mean of 0.708, with Borderline SMOTE, Safe-Level SMOTE, and G-SMOTE producing lower performance values. This confirms that the minority WSAT class required augmentation to reveal its underlying structure in a reliable manner.

Interpretability results from the EBM framework revealed that wind shear altitude and wind shear magnitude are the dominant variables driving WSAT classification. Their joint effect forms a nonlinear response surface, where the highest WSAT probability occurs in the moderate-to-high coded altitude range (0.5–2.5) together with wind shear magnitude between 30 and 35 knots. This interaction gives physical insight into the aerodynamic mechanisms that support WSAT generation close to the ground within runway operational zones.

4.1. Limitations of This Study

This study is based on HKIA only, which is characterized by strong terrain–flow interaction unique to Lantau Island. This domain specificity limits direct transferability of the learned WSAT behavior to airports with weaker or different topographic influence, different turbulence climatology, or distinct operational runway layouts. The modeling was also based on PIREPs, which depend on subjective pilot reporting behavior and do not provide quantitative turbulence magnitude distribution information beyond occurrence classification.

4.2. Future Recommendations

Future studies should expand the framework across multiple airports with different climates and topographic regimes so the generalization capability can be assessed beyond a single location. The integration of higher-resolution turbulence measurements from LiDAR, the ADS-B-derived Eddy Dissipation Rate, and aircraft-mounted sensor-based measurements can strengthen objective signal representation and reduce dependence on subjective pilot-based turbulence reporting. In addition, other atmospheric state variables such as vertical shear profile, temperature gradient, and humidity stratification can raise classification reliability further. Probabilistic uncertainty quantification should be incorporated in future model development because aviation safety applications require calibrated risk estimation rather than a single deterministic outcome.

Author Contributions

Conceptualization, A.K.; Formal analysis, A.K. and B.T.A.; Funding acquisition, P.-w.C.; Investigation, P.-w.C.; Methodology, A.K.; Resources, F.C.; Software, F.C.; Supervision, F.C.; Validation, P.-w.C. and A.A.M.E.; Visualization, A.A.M.E.; Writing—review and editing, B.T.A. All authors have read and agreed to the published version of the manuscript.

Funding

The present study received financial support from the National Natural Science Foundation of China (Grant No. 52250410351), the National Foreign Expert Project (Grant No. QN2022133001L), and the Xiaomi Young Talent Program.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors acknowledge with gratitude the colleagues at the Hong Kong Observatory, Hong Kong International Airport, for supplying the PIREPs data that formed an essential component of this research. The authors would also like to acknowledge the support of the Deanship of Scientific Research at Taif University, Taif, Saudi Arabia. In addition, the authors acknowledge the use of the Grammarly AI tool, which was employed solely for grammar correction in the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. ICAO. Manual on Low-Level Wind Shear; ICAO: Montreal, QC, Canada, 2005.
2. Di Vito, V.; Zollo, A.L.; Cerasuolo, G.; Montesarchio, M.; Bucchignani, E. Clear-Air Turbulence and Aviation Operations: A Literature Review. Sustainability 2025, 17, 4065.
3. Hon, K.; Chan, P. Application of LIDAR-derived eddy dissipation rate profiles in low-level wind shear and turbulence alerts at Hong Kong International Airport. Meteorol. Appl. 2014, 21, 74–85.
4. Wang, D.; Gao, Z.; Gu, H.; Guan, X. The optimization of aircraft acceleration response and EDR estimation based on linear turbulence field approximation. Atmosphere 2021, 12, 799.
5. Chan, P. LIDAR-based turbulence intensity for aviation applications. In Aviation Turbulence: Processes, Detection, Prediction; Springer: Berlin/Heidelberg, Germany, 2016; pp. 193–209.
6. Hon, K.-K. Predicting low-level wind shear using 200-m-resolution NWP at the Hong Kong International Airport. J. Appl. Meteorol. Climatol. 2020, 59, 193–206.
7. Louis, K.; Guan, Y.; Li, L.K. RANS simulations of terrain-disrupted turbulent airflow at Hong Kong International Airport. Comput. Math. Appl. 2021, 81, 737–758.
8. Chan, P.W.; Lai, K.K.; Li, Q.S. High-resolution simulation of a severe case of low-level windshear at the Hong Kong International Airport: Turbulence intensity and sensitivity to turbulence parameterization scheme. Atmos. Sci. Lett. 2022, 23, e1090.
9. Kwong, K.-M. Chaotic Oscillator Based Artificial Neural Network with LiDAR Data for Wind Shear and Turbulence Forecasting and Alerting. Master’s Thesis, The Hong Kong Polytechnic University, Hong Kong, China, 2011.
10. Wu, T.-C.; Hon, K.-K. Application of spectral decomposition of LIDAR-based headwind profiles in windshear detection at the Hong Kong International Airport. Meteorol. Z. 2018, 27, 33–42.
11. Krozel, J.A.; Sharman, R. Remote Detection of Turbulence via ADS-B. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, Kissimmee, FL, USA, 5–9 January 2015; p. 1547.
12. HKO. Windshear and Turbulence in Hong Kong: Information for Pilots; Hong Kong Observatory; Hong Kong Special Administrative Region Government: Hong Kong, China, 2019.
13. Rahman, A.; Debnath, T.; Kundu, D.; Khan, M.S.I.; Aishi, A.A.; Sazzad, S.; Sayduzzaman, M.; Band, S.S. Machine learning and deep learning-based approach in smart healthcare: Recent advances, applications, challenges and opportunities. AIMS Public Health 2024, 11, 58.
14. Das, S.; Nayak, S.P.; Sahoo, B.; Nayak, S.C. Machine learning in healthcare analytics: A state-of-the-art review. Arch. Comput. Methods Eng. 2024, 31, 3923–3962.
15. Chakraborty, C.; Bhattacharya, M.; Pal, S.; Lee, S.-S. From machine learning to deep learning: Advances of the recent data-driven paradigm shift in medicine and healthcare. Curr. Res. Biotechnol. 2024, 7, 100164.
16. Wan, Y.; Tao, H.; Zhao, Y. Artificial intelligence in economics and finance: Applications and prospects of machine learning methods. Artif. Intell. 2025, 18, 2025.
17. Hao, J.; He, F.; Ma, F.; Zhang, S.; Zhang, X. Machine learning vs deep learning in stock market investment: An international evidence. Ann. Oper. Res. 2025, 348, 93–115.
18. Yaghoubi, E.; Yaghoubi, E.; Khamees, A.; Vakili, A.H. A systematic review and meta-analysis of artificial neural network, machine learning, deep learning, and ensemble learning approaches in field of geotechnical engineering. Neural Comput. Appl. 2024, 36, 12655–12699.
19. Cao, S.; Sun, X.; Widyasari, R.; Lo, D.; Wu, X.; Bo, L.; Zhang, J.; Li, B.; Liu, W.; Wu, D. A systematic literature review on explainability for machine/deep learning-based software engineering research. arXiv 2024, arXiv:2401.14617.
20. Pilz, K.F.; Heim, L.; Brown, N. Increased compute efficiency and the diffusion of AI capabilities. In Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 27 February–4 March 2025; Volume 39, pp. 27582–27590.
21. Sharma, K.; Salagrama, S.; Parashar, D.; Chugh, R.S. AI-Driven Decision Making in the Age of Data Abundance: Navigating Scalability Challenges in Big Data Processing. Rev. D’Intelligence Artif. 2024, 38, 1335.
22. Khattak, A.; Zhang, J.; Chan, P.-w.; Chen, F.; Almaliki, A.H. A New Frontier in Wind Shear Intensity Forecasting: Stacked Temporal Convolutional Networks and Tree-Based Models Framework. Atmosphere 2024, 15, 1369.
23. Coburn, J.; Arnheim, J.; Pryor, S.C. Short-term forecasting of wind gusts at airports across CONUS using machine learning. Earth Space Sci. 2022, 9, e2022EA002486.
24. Penov, N.; Guerova, G. Sofia airport visibility estimation with two machine-learning techniques. Remote Sens. 2023, 15, 4799.
25. Chen, C.-J.; Huang, C.-N.; Yang, S.-M. Aviation visibility forecasting by integrating Convolutional Neural Network and long short-term memory network. J. Intell. Fuzzy Syst. 2023, 45, 5007–5020.
26. Mizuno, S.; Ohba, H.; Ito, K. Machine learning-based turbulence-risk prediction method for the safe operation of aircrafts. J. Big Data 2022, 9, 29.
27. Chow, H.W.; Lim, Z.J.; Alam, S. Data-driven runway occupancy time prediction using decision trees. In Proceedings of the 2021 IEEE/AIAA 40th Digital Avionics Systems Conference (DASC), San Antonio, TX, USA, 3–7 October 2021; pp. 1–9.
28. Wu, W.; Cai, K.; Yan, Y.; Li, Y. An improved svm model for flight delay prediction. In Proceedings of the 2019 IEEE/AIAA 38th Digital Avionics Systems Conference (DASC), San Diego, CA, USA, 8–12 September 2019; pp. 1–6.
29. Dai, M. A hybrid machine learning-based model for predicting flight delay through aviation big data. Sci. Rep. 2024, 14, 4603.
30. Alfarhood, M.; Alotaibi, R.; Abdulrahim, B.; Einieh, A.; Almousa, M.; Alkhanifer, A. Predicting flight delays with machine learning: A case study from Saudi Arabian airlines. Int. J. Aerosp. Eng. 2024, 2024, 3385463.
31. Zhang, X.; Mahadevan, S. Bayesian neural networks for flight trajectory prediction and safety assessment. Decis. Support Syst. 2020, 131, 113246.
32. Alaoui, B.; Bari, D.; Bergot, T.; Ghabbar, Y. Analog ensemble forecasting system for low-visibility conditions over the main airports of morocco. Atmosphere 2022, 13, 1704.
33. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215.
34. Hakkoum, H.; Idri, A.; Abnane, I. Global and local interpretability techniques of supervised machine learning black box models for numerical medical data. Eng. Appl. Artif. Intell. 2024, 131, 107829.
35. Körner, A.; Sailer, B.; Sari-Yavuz, S.; Haeberle, H.A.; Mirakaj, V.; Bernard, A.; Rosenberger, P.; Koeppen, M. Explainable Boosting Machine approach identifies risk factors for acute renal failure. Intensive Care Med. Exp. 2024, 12, 55.
36. Mahmoudian, A.; Bypour, M.; Kioumarsi, M. Explainable boosting machine learning for predicting bond strength of FRP rebars in ultra high-performance concrete. Computation 2024, 12, 202.
37. Colantonio, L.; Equeter, L.; Dehombreux, P.; Ducobu, F. Explainable AI for tool condition monitoring using Explainable Boosting Machine. Procedia CIRP 2025, 133, 138–143.
38. Wahab, S.; Salami, B.A.; AlAteah, A.H.; Al-Tholaia, M.M.; Alahmari, T.S. Exploring the interrelationships between composition, rheology, and compressive strength of self-compacting concrete: An exploration of explainable boosting algorithms. Case Stud. Constr. Mater. 2024, 20, e03084.
39. Kabir, M.A.; Ahmed, M.U.; Begum, S.; Barua, S.; Islam, M.R. Balancing fairness: Unveiling the potential of smote-driven oversampling in ai model enhancement. In Proceedings of the 2024 9th International Conference on Machine Learning Technologies, Oslo, Norway, 24–26 May 2024; pp. 21–29.
40. Hu, C.; Deng, R.; Hu, X.; He, M.; Zhao, H.; Jiang, X. An automatic methodology for lithology identification in a tight sandstone reservoir using a bidirectional long short-term memory network combined with Borderline-SMOTE. Acta Geophys. 2025, 73, 2319–2335.
41. Yilmaz Eroglu, D.; Pir, M.S. Hybrid oversampling and undersampling method (houm) via safe-level smote and support vector machine. Appl. Sci. 2024, 14, 10438.
42. Lu, S.; Ye, J. Imbalanced data classification scheme based on G-SMOTE. Procedia Comput. Sci. 2024, 247, 1295–1303.
43. Acosta, N.O.; Klein, S.; Starmans, M.P. Automated method design for cancer image classification by Differential Evolution and Ensembling. In Proceedings of the MICCAI Student Board EMERGE Workshop: Empowering Medical Information Computing and Research Through Early-Career Guidance and Expertise, Daejeon, Republic of Korea, 23 September 2025.
44. Shi, G.; Jiang, Z.; Wong, C.M.S.; Ding, X.; Wu, S.; Zhao, C. Multimodal land subsidence of the new reclaimed HKIA 3rd Runway from InSAR and independent component analysis. In Proceedings of the IGARSS 2024–2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; pp. 1624–1627.
45. Chen, F.; Peng, H.; Chan, P.w.; Ma, X.; Zeng, X. Assessing the risk of windshear occurrence at HKIA using rare-event logistic regression. Meteorol. Appl. 2020, 27, e1962.
46. Hon, K.K.; Chan, P.w. Historical analysis (2001–2019) of low-level wind shear at the Hong Kong International Airport. Meteorol. Appl. 2022, 29, e2063.
47. Chen, F.; Peng, H.; Chan, P.-w.; Huang, Y.; Hon, K.-K. Identification and analysis of terrain-induced low-level windshear at Hong Kong International Airport based on WRF–LES combining method. Meteorol. Atmos. Phys. 2022, 134, 60.
48. Greenwell, B.M. Explainable Boosting Machines in R with the ebm Package. R J. 2023, 15, 100–115.
49. Yousif, Y.; Müller, J. Efficient and Interpretable Traffic Destination Prediction using Explainable Boosting Machines. arXiv 2024, arXiv:2402.03457.
50. Liu, G.; Sun, B. Concrete compressive strength prediction using an explainable boosting machine model. Case Stud. Constr. Mater. 2023, 18, e01845.
51. Hairani, H.; Widiyaningtyas, T.; Prasetya, D.D. Addressing class imbalance of health data: A systematic literature review on modified synthetic minority oversampling technique (SMOTE) strategies. JOIV Int. J. Inform. Vis. 2024, 8, 1310–1318.
52. Wilson, A.; Anwar, M.R. The future of adaptive machine learning algorithms in high-dimensional data processing. Int. Trans. Artif. Intell. 2024, 3, 97–107.
53. Storer, L.N.; Williams, P.D.; Gill, P.G. Aviation turbulence: Dynamics, forecasting, and response to climate change. Pure Appl. Geophys. 2019, 176, 2081–2095.
54. Chan, P. Case study of a special event of low-level windshear and turbulence at the Hong Kong International Airport. Atmos. Sci. Lett. 2023, 24, e1143.
55. Chan, P. Severe wind shear at Hong Kong International airport: Climatology and case studies. Meteorol. Appl. 2017, 24, 397–403.
56. Chan, P. A significant wind shear event leading to aircraft diversion at the Hong Kong international airport. Meteorol. Appl. 2012, 19, 10–16.
57. Khattak, A.; Chan, P.-W.; Chen, F.; Peng, H. Time-series prediction of intense wind shear using machine learning algorithms: A case study of Hong Kong International Airport. Atmosphere 2023, 14, 268.

Figure 1. Layout of HKIA and its surroundings (source: [47]).

Figure 2. Wind shear occurrence by height band at HKIA.

Figure 3. Wind shear occurrence by runway corridor at HKIA.

Figure 4. Wind shear occurrence by month at HKIA.

Figure 5. Conceptual workflow of the proposed EBM-based framework for WSAT prediction.

Figure 6. Frequency distribution of ψ_{sig}/ψ_{non-sig}: (a) η_{WS}; (b) μ_{WS}; (c) ρ_{WS}; (d) ϕ_{WS}; and (e) δ_{WS}.

Figure 7. Best values obtained via DEHB for EBM hyperparameters under different resampling strategies: (a) SMOTE; (b) Borderline SMOTE; (c) Safe-Level SMOTE; (d) G-SMOTE.

Figure 8. Performance measures of the EBM model using untreated data: (a) confusion matrix; (b) ROC curve.

Figure 9. Performance measures of the EBM model using SMOTE-treated data: (a) confusion matrix; (b) ROC curve.

Figure 10. Performance measures of the EBM model using Borderline SMOTE-treated data: (a) confusion matrix; (b) ROC curve.

Figure 11. Performance measures of the EBM model using Safe-Level SMOTE-treated data: (a) confusion matrix; (b) ROC curve.

Figure 12. Performance measures of the EBM model using G-SMOTE-treated data: (a) confusion matrix; (b) ROC curve.

Figure 13. EBM interpretation: (a) global feature importance ranking for the EBM model trained with SMOTE-treated data; (b) impact of η_{WS} on WSAT; (c) impact of μ_{WS} on WSAT; (d) impact of the interaction (η_{WS} × μ_{WS}) on WSAT.

Table 1. Summary of classification performance metrics used for model evaluation.

Metric	Description	Expression
Precision ($Pn$)	Measures the proportion of correctly predicted positive instances out of all predicted positives.	$Pn = \frac{P_{C}}{P_{C} + P_{E}}$
Recall ($Rl$)	Quantifies how effectively the model identifies actual positive instances within the dataset.	$Rl = \frac{P_{C}}{P_{C} + N_{E}}$
Balanced Accuracy ($BA$)	Measures the average recall for each class, which provides an unbiased estimate of overall classification.	$BA = \frac{1}{2} \left( \frac{P_{C}}{P_{C} + N_{E}} + \frac{N_{C}}{N_{C} + P_{E}} \right)$
Geometric Mean ($GM$)	Represents the balance between precision and recall by taking their geometric mean.	$GM = \sqrt{Pn \times Rl}$
Matthews Correlation Coefficient ($MCC$)	Evaluates the overall quality of binary classifications by considering true and false positives and negatives.	$MCC = \frac{(P_{C} \times N_{C}) - (P_{E} \times N_{E})}{\sqrt{(P_{C} + P_{E}) (P_{C} + N_{E}) (N_{C} + P_{E}) (N_{C} + N_{E})}}$
Receiver Operating Characteristic (ROC) Curve	Plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings.	—

Note: P_{C}: True Positives, P_{E}: False Positives, N_{C}: True Negatives, and N_{E}: False Negatives.
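As an illustration of how the Table 1 expressions can be evaluated from a confusion matrix, the short sketch below computes them from the four counts. The function name and the example counts are hypothetical and are not results from this study. The geometric mean follows the Table 1 definition (precision and recall), and the MCC denominator uses the standard product of the four marginal sums.

```python
import math


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(p_c, p_e, n_c, n_e):
    """Compute the Table 1 metrics from confusion-matrix counts.

    p_c: true positives, p_e: false positives,
    n_c: true negatives, n_e: false negatives.
    """
    precision = _ratio(p_c, p_c + p_e)
    recall = _ratio(p_c, p_c + n_e)
    specificity = _ratio(n_c, n_c + p_e)
    balanced_accuracy = 0.5 * (recall + specificity)
    # Geometric mean as defined in Table 1 (precision x recall); some works
    # use sqrt(recall x specificity) instead.
    g_mean = math.sqrt(precision * recall)
    mcc = _ratio(
        p_c * n_c - p_e * n_e,
        math.sqrt((p_c + p_e) * (p_c + n_e) * (n_c + p_e) * (n_c + n_e)),
    )
    return {
        "precision": precision,
        "recall": recall,
        "balanced_accuracy": balanced_accuracy,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not results from this study).
example = classification_metrics(p_c=60, p_e=40, n_c=160, n_e=20)
```

With these illustrative counts, precision is 0.6, recall is 0.75, and balanced accuracy is 0.775, which shows how BA weights both classes equally even though the negative class is twice as large.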

Table 2. Description and coding details of the factors used in this study.

Factors	Symbol	Description and Coding Details
Wind Shear Magnitude	$μ_{WS}$	Represents the magnitude of wind shear in knots, measured as a continuous value.
Wind Shear Distance from Runway	$δ_{WS}$	Indicates the distance of wind shear occurrence from the runway. If coded 0, it represents at runway; if coded 1, it represents 0–1 NM from runway; if coded 2, it represents 1–2 NM from runway; if coded 3, it represents 2–3 NM from runway; and if coded 4, it represents beyond 3 NM.
Wind Shear Altitude	$η_{WS}$	Indicates the altitude range at which wind shear occurred. If coded 0, it represents 0–399 ft; if coded 1, it represents 400–799 ft; if coded 2, it represents 800–1199 ft; and if coded 3, it represents 1200–1600 ft.
Wind Shear Causes	$ϕ_{WS}$	Represents the primary cause of wind shear. If coded 0, it corresponds to terrain; if coded 1, it corresponds to sea breeze; and if coded 2, it corresponds to gust front.
Rainfall Condition	$ρ_{WS}$	Indicates the rainfall condition at the time of the event. If coded 0, it represents no rain; if coded 1, it represents rain.
Aviation Turbulence Category	$ψ_{sig}$ / $ψ_{in-sig}$	If coded 0, it denotes $ψ_{in-sig}$; if coded 1, it denotes $ψ_{sig}$.
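The coding scheme in Table 2 can be expressed as a small encoder, sketched below. The input field names (`magnitude_kt`, `distance_nm`, `altitude_ft`, `cause`, `rain`) are hypothetical, since the layout of the raw PIREP records is not given here, and the treatment of values that fall exactly on a distance band boundary is an assumption.

```python
import math
from bisect import bisect_right

# Lower bounds (ft) of the altitude bands coded 1, 2 and 3 in Table 2.
ALTITUDE_LOWER_BOUNDS_FT = [400, 800, 1200]
CAUSE_CODES = {"terrain": 0, "sea breeze": 1, "gust front": 2}


def encode_altitude(feet):
    """Map altitude in feet to the Table 2 code (0: 0-399 ft ... 3: 1200-1600 ft)."""
    if not 0 <= feet <= 1600:
        raise ValueError("altitude outside the 0-1600 ft range covered by Table 2")
    return bisect_right(ALTITUDE_LOWER_BOUNDS_FT, feet)


def encode_distance(nautical_miles):
    """Map distance from the runway to the Table 2 code (0: at runway ... 4: beyond 3 NM).

    Assumption: upper band edges are inclusive (exactly 1 NM -> code 1);
    Table 2 does not state how boundary values are assigned.
    """
    if nautical_miles < 0:
        raise ValueError("distance must be non-negative")
    return min(4, math.ceil(nautical_miles))


def encode_report(report):
    """Encode one report given as a dict with hypothetical field names."""
    return {
        "mu_ws": float(report["magnitude_kt"]),
        "delta_ws": encode_distance(report["distance_nm"]),
        "eta_ws": encode_altitude(report["altitude_ft"]),
        "phi_ws": CAUSE_CODES[report["cause"]],
        "rho_ws": 1 if report["rain"] else 0,
    }


# Illustrative record, not taken from the dataset.
encoded = encode_report({
    "magnitude_kt": 25, "distance_nm": 1.5,
    "altitude_ft": 600, "cause": "terrain", "rain": False,
})
```

Note that the altitude and distance codes are ordinal, so fractional values on the η_{WS} axis of an EBM plot lie between coded bands rather than corresponding to a measured height.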

Table 3. Class distribution before and after data balancing.

Method	$ψ_{sig}$ Before Balance	$ψ_{sig}$ After Balance	$ψ_{in-sig}$ Before Balance	$ψ_{in-sig}$ After Balance
SMOTE	17.1% (818)	50.0% (3967)	82.9% (3967)	50.0% (3967)
Borderline SMOTE	17.1% (818)	50.0% (3967)	82.9% (3967)	50.0% (3967)
Safe-Level SMOTE	17.1% (818)	50.0% (3967)	82.9% (3967)	50.0% (3967)
G-SMOTE	17.1% (818)	50.0% (3967)	82.9% (3967)	50.0% (3967)
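The counts in Table 3 imply that each balancing method adds 3967 − 818 = 3149 synthetic ψ_{sig} samples to reach a 50/50 split. As a rough illustration of how basic SMOTE creates such samples, the sketch below interpolates between a minority sample and one of its k nearest minority neighbours. It is a minimal pure-Python version on toy data and is not the implementation used in this study; the value of k, the random seed, and the toy points are assumptions, and every feature is treated as continuous, which is a simplification for coded variables.

```python
import math
import random


def synthetic_needed(n_minority, n_majority):
    """Number of synthetic minority samples required for a 50/50 split."""
    return n_majority - n_minority


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating towards nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Class counts from Table 3.
n_to_generate = synthetic_needed(818, 3967)

# Toy minority points as (magnitude in knots, coded altitude); illustrative only.
toy_minority = [(30.0, 1.0), (32.0, 2.0), (35.0, 2.0), (20.0, 0.0)]
toy_synthetic = smote(toy_minority, n_new=10, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so the method densifies the minority region without introducing values outside the observed range.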

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
