2.1. Data Preprocessing
The data utilized in this study consists of comprehensive well logging records collected from a representative drilling block, covering 45 candidate monitoring parameters. The dataset comprises records from 10 wells, with a total of 42,814 time-series samples. Among these, 829 samples are labeled as gas kick events, while 41,985 samples correspond to normal drilling conditions, resulting in a highly imbalanced dataset with approximately 1.94% positive samples.
To evaluate model generalization under realistic conditions, the dataset was divided at the well level, with data from 7 wells used for training and the remaining 3 wells reserved for testing. This well-based split strategy prevents data leakage across wells and better reflects real-world deployment scenarios. To address the class imbalance issue during model training, a down-sampling strategy was applied to the training set, where an equal number of normal samples were randomly selected to match the number of gas kick samples, while the test set retained the original data distribution.
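The well-level split and class-balancing strategy described above can be sketched as follows (a minimal illustration in pure Python; the field names `well` and `label` and the dict-based sample layout are assumptions, not the study's actual data schema):

```python
import random

def well_based_split_and_downsample(samples, train_wells, seed=0):
    """Split samples by well, then balance the training set by randomly
    down-sampling the normal class to match the number of kick samples.
    samples: list of dicts with keys 'well' and 'label' (1 = gas kick).
    The test set keeps its original, imbalanced distribution."""
    train = [s for s in samples if s["well"] in train_wells]
    test = [s for s in samples if s["well"] not in train_wells]

    kicks = [s for s in train if s["label"] == 1]
    normals = [s for s in train if s["label"] == 0]

    # Randomly select as many normal samples as there are kick samples.
    rng = random.Random(seed)
    normals_ds = rng.sample(normals, k=min(len(kicks), len(normals)))
    return kicks + normals_ds, test
```

Because the split happens before down-sampling, no record from a test well can influence training, which is the leakage guarantee the well-based strategy provides.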
To ensure data reliability and modeling robustness, a systematic data preprocessing pipeline was established, including outlier detection, data cleaning and imputation, feature correlation analysis, and gas kick labeling. Specifically, abnormal values were first identified and removed using statistical criteria, followed by feature-level preprocessing to mitigate noise and missing information. Subsequently, correlation-based analysis combined with drilling engineering knowledge was employed to select key monitoring parameters.
Finally, gas kick labels were determined based on multiple sources of operational evidence to ensure reliability. Confirmed gas kick intervals were identified through the integration of drilling operation logs, field engineer reports, and mud logging records that documented abnormal hydrocarbon readings. These operational records were further verified using engineering diagnostic indicators, including unexplained increases in outlet flow rate, pit volume gain, and decreases in standpipe pressure. Only intervals consistently confirmed across these cross-validated sources were labeled as gas kick events. This strategy ensures that the dataset reflects real operational occurrences of gas kicks rather than purely threshold-based automatic labeling. For the purpose of time-series modeling, the identified gas kick intervals were mapped onto the corresponding time stamps of the drilling sensor data, and samples within these intervals were labeled as kick events, while the remaining samples were labeled as normal drilling states.
The overall data preprocessing workflow is illustrated in Figure 2.
The box plot method, also known as Tukey’s boxplot, is a widely used statistical tool for detecting outliers in univariate data. It visualizes the data distribution through five key summary statistics: the minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. Outliers are typically defined as data points that fall outside the range of 1.5 times the interquartile range (IQR) from either the upper or lower quartile. Specifically, any value less than Q1 − 1.5 × IQR or greater than Q3 + 1.5 × IQR is considered an outlier. This method provides an intuitive and efficient approach for identifying extreme deviations in the dataset, thereby ensuring data quality and reliability for subsequent modeling and analysis.
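Tukey's rule as described above can be implemented directly (a self-contained sketch; the linear-interpolation quantile used here is one common convention, and other quartile definitions shift the fences slightly):

```python
def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's boxplot rule).
    Returns a boolean mask aligned with the input list."""
    xs = sorted(values)

    def quantile(p):
        # Linear interpolation between the two closest order statistics.
        idx = p * (len(xs) - 1)
        lo, hi = int(idx), min(int(idx) + 1, len(xs) - 1)
        frac = idx - lo
        return xs[lo] * (1 - frac) + xs[hi] * frac

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [not (lower <= v <= upper) for v in values]
```

For a sensor channel such as outlet flow rate, the flagged points would be removed or imputed in the subsequent cleaning step.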
Correlation analysis determines the degree, direction, and strength of linear relationships between variables. It helps to understand the relationships among variables, predict trends in variable changes, select appropriate variables, and assess variable interactions, and it carries significant theoretical and practical importance. Feature selection reduces the dimensionality of the input data by retaining the most useful features from the original data, thereby eliminating redundant and irrelevant ones. Dimensionality reduction lowers the computational complexity and storage requirements of algorithms while simultaneously improving training speed and prediction accuracy.
This study conducted a correlation analysis of the data to identify the relationships between feature variables, using the Pearson correlation coefficient. The coefficient ranges from −1 to +1, with 0 indicating no correlation between two variables; a positive value indicates a positive correlation, a negative value indicates a negative correlation, and the magnitude represents the strength of the correlation [13]. The formula for calculating the correlation is shown as Equation (1).
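For reference, the standard form of the Pearson coefficient between variables $X$ and $Y$ over $n$ paired samples $(x_i, y_i)$, with sample means $\bar{x}$ and $\bar{y}$, is:

```latex
r_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```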
To provide a more intuitive understanding of the relationships among variables, a Pearson correlation heatmap of all numerical features is presented in Figure 3. The heatmap visually illustrates the pairwise correlations, enabling identification of strongly correlated parameter groups.
As shown in Figure 3, strong correlations can be observed among flow-related parameters (e.g., inlet and outlet flow rates), pressure-related variables (e.g., standpipe pressure and casing pressure), and pit volume measurements. In addition, hydrocarbon-related features (e.g., methane, ethane, and total hydrocarbon content) exhibit high inter-correlations, reflecting their shared physical origin during gas influx events.
In this study, four key parameters were selected for gas kick risk monitoring based on both correlation analysis results and domain knowledge: outlet flow rate, total pool volume, standpipe pressure (SPP), and total hydrocarbon content. These selections were not only guided by correlation analysis but also grounded in a detailed understanding of downhole physical mechanisms and operational experiences from drilling sites. Specifically, standpipe pressure serves as a critical indicator of pressure dynamics within the wellbore. During a gas kick event, the static fluid column is disrupted due to uncontrolled fluid influx, resulting in a decrease in bottom hole pressure (BHP), which is often accompanied by a noticeable decline in SPP. The outlet flow rate is another crucial diagnostic feature, as an unexplained increase—without a corresponding rise in the inlet flow—may signify formation fluid influx, thereby providing an early warning of a potential kick. Similarly, the total pool volume reflects surface fluid storage changes. As the influx progresses, whether from oil, water, or gas, the increase in return volume leads to a measurable rise in surface tank volume, especially during gas expansion. Lastly, the total hydrocarbon content directly captures the composition of returning fluids. During a gas invasion, gas entrained in the returning mud causes a spike in total hydrocarbon readings at the surface, which serves as an immediate and reliable signal of gas influx.
Collectively, these parameters provide complementary perspectives on wellbore stability, enabling timely and accurate identification of gas kick risks. Furthermore, the observed interdependencies among these variables provide additional justification for the use of graph-based models, which are capable of explicitly capturing such relational structures.
2.2. Graph-Based Monitoring Framework
Drilling operations are governed by strongly coupled hydraulic and thermodynamic processes, where multiple surface and downhole parameters interact dynamically. Gas kick events rarely manifest as isolated deviations in a single signal; instead, they emerge through subtle disturbance propagation across multiple correlated parameters. Conventional data-driven models generally treat input variables independently, which limits their capability to capture interaction-driven anomaly evolution. To address this limitation, this study constructs a graph-based monitoring network that explicitly models inter-parameter dependencies and their deviations under gas influx conditions.
Inspired by the idea of learning latent dependency structures among heterogeneous sensors, the proposed framework represents drilling parameters as nodes in a graph and models their relationships as learnable edges. Let the multivariate time-series input be denoted as X ∈ ℝ ^ (T × N), where T is the sliding window length and N is the number of selected parameters. Each parameter corresponds to a graph node whose feature representation is constructed from its temporal sequence within the window. This design enables the network to capture temporal patterns preceding gas kick events while preserving parameter-level structural relationships.
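The window-to-node mapping can be sketched as follows (pure Python, illustrative only; the dict-of-lists input layout and function name are assumptions, and the actual pipeline may normalize or embed the windows further):

```python
def sliding_window_node_features(series, window):
    """series: dict {param_name: [v_0, v_1, ...]} of equal-length signals.
    For each time step, build a node-feature matrix with one row per
    parameter (graph node) holding its last `window` values, i.e. the
    transpose of the X in R^(T x N) described in the text."""
    names = sorted(series)
    length = len(series[names[0]])
    windows = []
    for t in range(window, length + 1):
        # One row per node: that parameter's temporal window.
        x = [series[name][t - window:t] for name in names]
        windows.append(x)
    return names, windows
```

Each matrix in `windows` is one graph input: N nodes, each carrying a length-T temporal feature vector, so patterns preceding a kick are preserved per parameter.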
Unlike conventional graph neural networks that rely on predefined adjacency matrices, the interaction structure in this work is learned dynamically through an attention mechanism. This is particularly suitable for drilling environments where parameter dependencies vary with formation characteristics, operational conditions, and drilling stages. The adaptive graph therefore reflects data-driven interaction patterns rather than static assumptions derived solely from prior physical knowledge.
Information propagation across the graph is achieved through attention-based message passing. For each target node, representations from neighboring nodes are aggregated according to learned attention coefficients that quantify interaction importance. Specifically, the representation of node i at layer l + 1 is obtained by applying a nonlinear transformation to the weighted combination of neighboring node representations at layer l, where the weights are normalized attention scores derived from node embeddings. This mechanism allows the network to capture heterogeneous coupling relationships among drilling parameters and dynamically adjust interaction strengths across operating conditions.
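One common instantiation of this update is the GAT-style rule below; the exact scoring function is an assumption here, since the text specifies only "normalized attention scores derived from node embeddings":

```latex
h_i^{(l+1)} = \sigma\!\left( \sum_{j \in \mathcal{N}(i) \cup \{i\}}
  \alpha_{ij}^{(l)} \, W^{(l)} h_j^{(l)} \right),
\qquad
\alpha_{ij}^{(l)} =
\frac{\exp\!\left( \mathrm{LeakyReLU}\!\left( a^{\top}
  \big[\, W^{(l)} h_i^{(l)} \,\|\, W^{(l)} h_j^{(l)} \,\big] \right) \right)}
     {\sum_{k \in \mathcal{N}(i) \cup \{i\}}
  \exp\!\left( \mathrm{LeakyReLU}\!\left( a^{\top}
  \big[\, W^{(l)} h_i^{(l)} \,\|\, W^{(l)} h_k^{(l)} \,\big] \right) \right)}
```

Here $h_i^{(l)}$ is the embedding of node $i$ at layer $l$, $W^{(l)}$ and $a$ are learned parameters, $\|$ denotes concatenation, and $\sigma$ is a nonlinearity; the coefficients $\alpha_{ij}$ are the learned interaction strengths discussed above.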
To tailor the model for gas kick monitoring, several task-oriented adaptations are introduced. First, temporal sliding-window representations are incorporated at the node level to emphasize gradual deviation evolution rather than instantaneous fluctuations. Second, multiple graph neural layers are stacked to capture higher-order dependency propagation, reflecting cascading disturbance transmission among flow rate, pressure, density, and pit volume signals during gas influx development. Third, learned node embeddings are projected into a compact latent space and processed through fully connected layers to produce gas kick risk predictions. By modeling deviations from learned interaction patterns, the framework effectively distinguishes early-stage gas kick behavior from normal drilling variability.
An additional advantage of the proposed graph attention mechanism lies in its interpretability potential. The learned attention coefficients indicate the relative influence among parameters during prediction, providing insights into dominant interaction pathways associated with abnormal events. This interaction-level interpretability complements feature attribution analysis conducted separately using SHAP, enabling multi-perspective understanding of model decision behavior.
Figure 4 illustrates the overall modeling framework, including parameter-to-node mapping, adaptive graph learning, attention-based information propagation, and prediction generation.
Drilling time-series measurements are encoded as node features through sliding-window processing and mapped into a learnable dependency graph where parameters are treated as interconnected nodes. Attention-driven graph propagation captures heterogeneous interactions and deviation patterns among parameters, enabling robust representation learning under varying operational conditions. The resulting embeddings are processed by fully connected layers to produce risk predictions, while the attention weights offer insights into influential parameter interactions, supporting interpretability analysis.
2.3. Representation Enhancement
2.3.1. Generative Adversarial Network
Generative Adversarial Networks (GANs), first proposed by Goodfellow in 2014, are a class of neural network models based on adversarial learning mechanisms [14]. They are widely used for data generation tasks, including images, speech, and text. As shown in Figure 5, a GAN consists of two core components: a generator and a discriminator. The generator G maps random noise into synthetic samples, while the discriminator D evaluates whether the input data are real or generated, and provides feedback to guide the training of G.
Through this adversarial interaction, the generator progressively improves its ability to approximate the underlying data distribution, whereas the discriminator simultaneously enhances its capability to distinguish real samples from generated ones. With iterative training, this competitive process drives the generated data to become increasingly similar to real data, enabling high-quality sample generation [15].
To address common challenges in GAN training, such as instability, gradient explosion, and convergence issues, a variety of improved variants have been developed, including DCGAN [16], Wasserstein-GAN [17], and LSGAN [18].
In this study, the Conditional Tabular GAN (CTGAN) proposed by Xu et al. [19] is adopted as the data generation method. CTGAN is specifically developed for tabular data synthesis under conditional constraints. While it retains the generator–discriminator framework of conventional GANs, it incorporates conditional information to guide the data generation process. Owing to its flexibility, controllability, and capability of producing high-quality samples, CTGAN is well suited for handling diverse tabular data generation scenarios.
The generator in CTGAN is implemented as a feedforward neural network composed of multiple fully connected layers, where conditional normalization is applied to incorporate the influence of given conditions into intermediate feature representations. The output layer adopts a sigmoid activation to constrain generated values within a valid range, thereby producing samples consistent with the underlying data distribution. In this framework, the generator learns to map latent noise vectors, together with conditional inputs, into structured tabular samples that satisfy specified constraints.
The discriminator is designed as a binary classification network, also constructed with stacked fully connected layers. LeakyReLU activation functions are employed to improve nonlinearity and training stability. Its primary function is to distinguish real samples from those synthesized by the generator. In addition, residual connections are introduced to facilitate gradient propagation and accelerate convergence.
From an architectural perspective, CTGAN is parameterized by a generator–discriminator pair tailored for tabular data synthesis. The generator leverages residual-style blocks with fully connected transformations and normalization to enhance feature representation capability. In contrast, the discriminator integrates linear layers, LeakyReLU activations, and dropout regularization to mitigate overfitting. Furthermore, a gradient penalty (GP) strategy is incorporated to stabilize training. This is achieved by interpolating between real and generated samples, computing gradients on these interpolations, and enforcing a Lipschitz constraint through a penalty term.
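The gradient penalty described above follows the WGAN-GP formulation; in standard notation, with $\lambda$ the penalty coefficient and $\hat{x}$ a random interpolation between a real and a generated sample:

```latex
\mathcal{L}_{\mathrm{GP}} = \lambda \,
  \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
  \left[ \left( \left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon \, x_{\mathrm{real}} + (1 - \epsilon)\, x_{\mathrm{fake}},
\quad \epsilon \sim U[0, 1]
```

Penalizing deviations of the gradient norm from 1 enforces the Lipschitz constraint softly, which is what stabilizes discriminator training.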
In this study, the conditional variable corresponds to the gas kick label (i.e., normal vs. gas kick), which guides the generation of class-specific samples. The learning rate for both the generator and discriminator is set to 1 × 10−5, with 500 training iterations; these hyperparameters can be adjusted through experimentation.
To avoid potential information leakage, the dataset was first divided into training and testing sets before any data augmentation was performed. The CTGAN model was trained exclusively using the training data, and all synthetic samples were generated solely from the training set distribution. These generated samples were used only to augment the training dataset. The test set remained completely unseen during the augmentation process, ensuring a fair and unbiased evaluation.
Visualizations of the original and generated data are shown in Figure 6, for the total pool volume under normal drilling conditions and for the total hydrocarbon parameter during gas kick events. The generated samples were qualitatively compared with the real data distributions and show consistent trends in key parameters, indicating that the synthetic data reasonably preserves the underlying data characteristics.
Risk-related time series data are often limited in volume, which can lead to model overfitting or an inability to capture complex data patterns. By generating a substantial amount of synthetic data, CTGAN can expand small-sample time series datasets, helping the model better learn the underlying data distribution and patterns, improving its transfer and generalization abilities, and effectively reducing errors on the test set.
2.3.2. Shapelet Transformation
Shapelet transformation is a powerful and interpretable feature extraction method specifically designed for time series data. Unlike traditional feature engineering, which often relies on global statistics or domain knowledge, shapelet transformation focuses on identifying discriminative subsequences, known as shapelets, that are most representative of differences between classes or states in the data [20,21]. These shapelets capture localized temporal patterns that are critical for classification, anomaly detection, and pattern recognition tasks.
The core idea is to transform raw time series into a feature space where each feature represents the minimum distance between a time series instance and a specific shapelet. This enables traditional machine learning classifiers (e.g., decision trees, SVM, random forests) to operate effectively on time series data.
The transformation process involves the following steps:
Shapelet Discovery: The algorithm initially performs an exhaustive or heuristic scan across all possible subsequences of the time series to identify candidate shapelets. A scoring function—such as information gain, F-statistic, or accuracy improvement—is used to evaluate the discriminative power of each candidate. The top-k shapelets are selected and stored based on their performance.
Shapelet Selection: Since storing all candidate shapelets may introduce redundancy and increase computational burden, a cross-validation strategy is applied to determine the optimal subset of shapelets to be used in the transformed dataset. This ensures both efficiency and generalization of the resulting features.
Feature Transformation: Each time series in the dataset is transformed into a new feature vector, where each dimension corresponds to the shortest distance between the series and one of the selected shapelets. As a result, the original sequential data is projected into a static and fixed-length feature space.
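The feature transformation step above reduces to computing minimum subsequence distances, which can be sketched in a few lines (pure Python, illustrative; shapelet discovery and selection are assumed to have already produced the `shapelets` list):

```python
def min_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and every
    same-length subsequence of the series."""
    m = len(shapelet)
    best = float("inf")
    for start in range(len(series) - m + 1):
        d = sum((series[start + k] - shapelet[k]) ** 2 for k in range(m)) ** 0.5
        best = min(best, d)
    return best

def shapelet_transform(dataset, shapelets):
    """Project each time series onto a fixed-length feature vector:
    one min-distance per selected shapelet."""
    return [[min_distance(s, sh) for sh in shapelets] for s in dataset]
```

The output is a static feature matrix (rows = series, columns = shapelets), which any off-the-shelf classifier can consume, as noted below.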
The major advantage of this transformation is that it decouples the process of shapelet learning from model training, offering flexibility to use any off-the-shelf classifier. Moreover, the resulting shapelet-based features are often highly interpretable, as they highlight meaningful time series segments that are strongly associated with specific outcomes or labels.
As illustrated in Figure 7, two example shapelets extracted from a specific dataset demonstrate how localized time series segments can capture critical discriminative features. These shapelets are not only informative for classification tasks but also help reveal underlying physical or behavioral patterns embedded in the temporal data.
2.4. Interpretability Analysis
Accurate prediction alone is insufficient for intelligent monitoring systems deployed in safety-critical drilling operations. Engineers must understand the reasoning behind model outputs in order to validate alarms against physical mechanisms and build operational trust. Gas kick detection involves complex multi-parameter interactions, and purely black-box predictions hinder diagnosis, model validation, and adoption in field environments. Therefore, interpretability is incorporated as an integral component of the proposed framework.
To achieve transparent decision inspection, a multi-level interpretability strategy is developed that analyzes both individual feature contributions and structural interaction patterns. Feature-level attribution is conducted using SHAP, quantifying how each parameter influences prediction outcomes. Structural-level interpretation is obtained by examining attention weights learned within the graph neural network, revealing interaction pathways that govern anomaly propagation. A unified visualization summarizing both perspectives is presented in Figure 8.
As illustrated in Figure 8, the interpretability module links model predictions to both measurable physical indicators and learned parameter dependencies, enabling engineers to evaluate decision consistency from complementary viewpoints.
The left panel presents SHAP-based feature attribution, illustrating the global contribution distribution of drilling parameters influencing gas kick prediction. The right panel visualizes attention-derived interaction dependencies learned by the graph neural model, highlighting dominant parameter coupling pathways during anomaly characterization. Together, these perspectives provide complementary transparency at both feature and structural levels, supporting engineering validation and trustworthy deployment.
2.4.1. Feature Contribution Analysis Using SHAP
SHapley Additive exPlanations (SHAP) are employed to quantify the influence of individual drilling parameters on gas kick prediction. Rooted in cooperative game theory, SHAP evaluates feature contributions by computing their marginal effect on the model output across all possible feature coalitions. This framework provides consistent and theoretically grounded attribution while allowing direct comparison across heterogeneous measurements.
In this study, SHAP values are calculated for the trained monitoring model to perform both global and local interpretability analysis. Global attribution identifies parameters that most strongly affect prediction behavior across the dataset, enabling comparison with engineering knowledge regarding pressure dynamics, flow imbalance, fluid return variation, and hydrocarbon indicators. Local attribution examines individual alarm events, highlighting which features drive model responses in specific operational scenarios.
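The coalition-based attribution underlying SHAP can be made concrete with an exact Shapley computation for a small model (an illustrative sketch, not the study's implementation; practical SHAP libraries approximate this for high-dimensional models):

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values: for each feature i, average its marginal
    contribution over every coalition S of the remaining features,
    substituting baseline values for absent features."""
    n = len(x)

    def predict(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(len(rest) + 1):
            for S in combinations(rest, size):
                S = set(S)
                # Shapley kernel weight: |S|! (n-|S|-1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (predict(S | {i}) - predict(S))
    return phi
```

A useful sanity check is the efficiency property: the attributions sum to the difference between the prediction at `x` and at the baseline, which is what makes SHAP values directly comparable across heterogeneous drilling measurements.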
As shown in the left panel of Figure 8, the SHAP summary visualization reveals both the magnitude and direction of feature contributions. This representation enables validation of whether model attention aligns with physically meaningful indicators and assists in diagnosing potential bias or over-reliance on spurious correlations. Through feature-level attribution, the proposed framework improves transparency and strengthens confidence in automated monitoring outputs.
2.4.2. Interaction Interpretability via Graph Attention Mechanisms
While feature attribution reveals individual parameter effects, gas kick evolution is inherently governed by interactions among hydraulic, mechanical, and thermodynamic variables. To capture interpretability at the system level, the proposed graph neural model leverages attention mechanisms that assign adaptive importance weights to edges representing parameter dependencies.
During training, attention coefficients regulate message passing among nodes and reflect the relative influence of neighboring parameters on representation updates. Analysis of these learned coefficients enables identification of dominant interaction pathways that contribute to anomaly detection. For example, pressure–flow coupling or fluid–gas response relationships can be revealed through elevated attention intensity.
The right panel of Figure 8 visualizes the learned attention distribution as an interaction heatmap, illustrating structural dependencies discovered by the model. Unlike post hoc explanations, this interpretability is embedded within the learning process and reflects the internal reasoning of the graph representation itself. Consequently, attention-based inspection provides insights into disturbance propagation patterns and system-level deviation formation that precede observable anomalies.
By integrating SHAP attribution and attention-based structural inspection, the proposed framework establishes multi-granular interpretability across feature and interaction domains. This dual-perspective transparency facilitates engineering validation, supports trustworthy deployment, and enhances the practical applicability of graph-based gas kick monitoring in real-world drilling operations.