Prediction of Coalbed Methane Production Using a Modified Machine Learning Methodology

Zhang, Hongyang; Li, Kewen; Shi, Shuaihang; He, Jifu

doi:10.3390/en18061341

¹ School of Energy Resources, China University of Geosciences (Beijing), 29 Xueyuan Road, Beijing 100083, China

² Key Laboratory of Marine Reservoir Evolution and Hydrocarbon Enrichment Mechanism, Ministry of Education, Beijing 100083, China

* Author to whom correspondence should be addressed.

Energies 2025, 18(6), 1341; https://doi.org/10.3390/en18061341

Submission received: 15 February 2025 / Revised: 7 March 2025 / Accepted: 7 March 2025 / Published: 9 March 2025

(This article belongs to the Section H: Geo-Energy)

Abstract

Compared to natural and shale gas, studies on predicting production specific to coalbed methane (CBM) are still relatively limited and mainly use decline curve methods such as Arps, the Stretched Exponential Decline Model, and Duong’s model. In recent years, machine learning (ML) methods applied to CBM production prediction have exploited the data characteristics of production to achieve more accurate predictions. However, these models require a large amount of training data and achieve accurate forecasts only over short periods, such as 30 days. This study constructs a hybrid ML model by integrating a long short-term memory (LSTM) network with the Transformer architecture. The model is trained with the mean absolute error (MAE) loss function, optimized with the Adam optimizer, and evaluated using MAE, root mean square error (RMSE), and the coefficient of determination (R²). The results show that the LSTM-Attention (LSTM-A) hybrid model, trained on small datasets, can accurately capture the CBM production trend and is superior to traditional methods and the LSTM model in both prediction accuracy and effective prediction interval. The methodology and results of this study support more accurate prediction of CBM production and a better understanding of CBM production mechanisms.

Keywords: deep learning; CBM production forecasting; dynamic attention mechanism; long short-term memory; decline curve analysis

1. Introduction

Coalbed methane (CBM) is gaining global attention as a clean and unconventional natural gas resource [1,2]. Exploiting CBM can effectively alleviate the problem of natural gas shortages [3]. It provides a relatively environmentally friendly energy option, helps reduce mine safety accidents [4], and improves the mining area’s environment [5,6,7]. Production prediction is of great significance to CBM development [8], economic evaluation [9,10], the optimization of fracturing parameters, and the control of drainage and production systems [11].

The prediction of CBM production can be derived from methods used to forecast other oil and gas resources. By combining traditional approaches with modern data science techniques, the accuracy of these predictions can be significantly improved [12,13]. Decline Curve Analysis (DCA) [14,15,16,17,18,19,20], numerical simulation, and artificial intelligence are the primary methods for forecasting CBM production. The DCA method is relatively simple, with low computational resources and time costs, and is supported by extensive empirical research validating its effectiveness [21]. However, it is unsuitable for new wells, and its accuracy in long-term forecasting may diminish due to changes in geological and engineering conditions [22]. The advantage of numerical simulation is that it can solve the production prediction problem without data samples [23]. However, its drawbacks include requiring numerous model parameters and high-precision input data. In field production, obtaining accurate dynamic values for parameters such as permeability and saturation is challenging, raising concerns about simulation results’ reliability.

CBM production forecasting can be modeled as a time-series problem to capture the trends and patterns over time. However, there are still several challenges in applying machine learning to oil and gas production, especially concerning key geological factors such as porosity, permeability, and pore connectivity [24]. These challenges include the following: (1) high measurement difficulty and significant error due to reservoir heterogeneity; (2) limited changes in these values during production, making them less effective as features for time-series models; and (3) variations in the geological factors emphasized during different development stages, making it difficult to collect and standardize these factors into a unified dataset. These challenges are similar to those faced by numerical simulation methods.

Despite these challenges, machine learning has significantly advanced production forecasting [23,25,26] and has been shown to enhance DCA models [27,28]. Initially, ML was mainly used to process DCA input and output data. Yehia et al. [29] optimized anomaly detection in shale gas production using ML but did not improve prediction accuracy. Han et al. [30] integrated clustering and artificial neural networks (ANNs) into DCA and stable production stages. The research results can be categorized into several areas. Firstly, feature extraction methods are optimized, such as filtering outliers and clustering historical data [31,32,33]. Secondly, some studies use specific ML methods for production forecasting with a particular sample set, comparing the results with DCA and numerical simulation results [34]. Thirdly, combining temporal and attribute features allows for the analysis and prediction of data with different attributes. These studies have effectively optimized existing prediction methods and improved accuracy [35,36].

Based on the above, deep learning and neural network methods have been applied to CBM well production forecasting. However, these methods often face vanishing or exploding gradients when dealing with long-term dependencies, making it difficult to achieve accurate daily production forecasts. The LSTM model alleviates the problem but cannot be considered a complete solution. The Transformer model is an effective method to address these issues. Its self-attention mechanism can capture dependencies between distant positions within a sequence and adaptively assign different weights to different time steps based on the content of the input sequence, thereby better capturing essential patterns in the sequence [37,38,39]. This paper proposes an improved LSTM-A network method based on the principles of the Transformer model to predict CBM production.

The rest of this paper is organized as follows. Section 2 explains various application scenarios of production forecasting in conjunction with actual production change curves, summarizes traditional production forecasting models, including mathematical and analytical models, and introduces the principles and implementation methods of applying the attention mechanism to the LSTM model. Section 3 presents the research results, compares them with traditional forecasting methods in different application scenarios, and discusses the model’s specific feature selection and parameter optimization. Section 4 provides the conclusion and offers suggestions for further research.

2. Methodology

2.1. Model Construction

The model combines the sequential processing capabilities of the LSTM network with the Transformer’s attention mechanism to capture time dependencies and complex relationships in CBM production data. The prediction results of the model are compared with those of DCA.

2.1.1. Long Short-Term Memory Network

The LSTM network is a type of recurrent neural network (RNN). Its gated structure mitigates the vanishing and exploding gradient problems that standard RNNs encounter on long sequences [40,41]. Each LSTM unit consists of an input gate (i_{t}), a forget gate (f_{t}), and an output gate (o_{t}). The input gate determines how much of the current information is written into the cell state:

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(1)

where i_{t} is the input gate at time step t, σ is the sigmoid function, W_{i} is the weight matrix for the input gate, h_{t - 1} is the hidden state from the previous time step, x_{t} is the input at the current time step, and b_{i} is the bias term for the input gate.

The forget gate determines what information to discard from the cell state. Through a sigmoid layer, it outputs a value between 0 and 1 that is multiplied by the cell state to determine how much information is retained:

f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(2)

where f_{t} is the forget gate at time step t, W_{f} is the weight matrix for the forget gate, and b_{f} is the bias term for the forget gate.

The output gate determines which part of the cell state is output. Through the combination of a sigmoid layer and a tanh layer, the cell state is turned into the output:

o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})

(3)

where o_{t} is the output gate at time step t, W_{o} is the weight matrix for the output gate, and b_{o} is the bias term for the output gate.

In this model, the LSTM layers capture the temporal dependencies in the CBM production data. The hidden states produced by the LSTM layers are then passed through a fully connected layer to match the input dimensions required by the subsequent Transformer layers.
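The three gate equations (1)–(3) share one form, a sigmoid applied to an affine map of the concatenated vector [h_{t - 1}, x_{t}]. The following minimal Python sketch evaluates that form for a single gate. The function names and the toy weights are illustrative assumptions, not values from this study, and the cell-state update, which is not written out above, is deliberately left out.

```python
import math


def sigmoid(z):
    """Logistic function used by all three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(W, b, h_prev, x_t):
    """Evaluate sigma(W . [h_{t-1}, x_t] + b) for one gate.

    W is a list of rows (one row per hidden unit), b is a list of biases,
    h_prev and x_t are plain Python lists.
    """
    concat = h_prev + x_t  # list concatenation builds [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b_k)
        for row, b_k in zip(W, b)
    ]


# Toy example (hypothetical numbers): one hidden unit, two input features.
h_prev = [0.3]
x_t = [1.0, 2.0]
neutral = gate([[0.0, 0.0, 0.0]], [0.0], h_prev, x_t)   # zero weights
open_gate = gate([[0.0, 0.0, 0.0]], [10.0], h_prev, x_t)  # large positive bias
```

With zero weights and bias the gate sits at 0.5, and a large positive bias drives it towards 1, which is the "fully open" state in which information passes through almost unchanged.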

2.1.2. Transformer Architecture

The Transformer architecture was originally designed for natural language processing tasks. It has demonstrated exceptional performance in capturing dependencies across long sequences through its self-attention mechanism [42], which allows the model to weigh the importance of different parts of the input sequence when making predictions:

Attention (Q, K, V) = softmax (\frac{Q K^{T}}{\sqrt{d_{k}}}) V

(4)

where Q, K, and V are the query, key, and value matrices, respectively, and d_{k} is the dimension of the keys. The Transformer architecture consists of an encoder–decoder structure. This study utilizes only the encoder part, comprising multiple self-attention and feed-forward neural network layers.
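Equation (4) can be illustrated with a small pure-Python sketch of single-head scaled dot-product attention. Matrices are represented as lists of rows, and the example inputs are hypothetical. It is meant only to make the computation concrete and is not the implementation used in this study.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention, Eq. (4): softmax(Q K^T / sqrt(d_k)) V.

    Returns the attended output and the attention weights.
    """
    d_k = len(K[0])
    scores = [
        [sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in K]
        for q_row in Q
    ]
    weights = [softmax(row) for row in scores]
    output = [
        [sum(w * v_row[j] for w, v_row in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights


# Hypothetical example: a zero query scores every key equally,
# so the output is the plain average of the value rows.
out, w = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```

Each row of the weight matrix sums to one, so every output row is a weighted average of the value rows; this is the sense in which the mechanism assigns different weights to different time steps.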

2.2. Comparison Model

Traditional CBM production forecasting methods include the Decline Curve Analysis (DCA) and numerical simulation methods, which are based on geological and engineering parameters. The DCA method has been widely applied in stable production reservoirs for many years, and many attempts have been made to apply it to unstable production reservoirs. As a result, various DCA models are now available. Most of these models use analytical equations that describe the physical characteristics of production over time, with the coefficients of these equations calculated by fitting them to the production history curve [14].

The most commonly used curve-fitting-based DCA model for CBM reservoirs is the modified Arps model [43].

When D > D_{i},

q = \frac{q_{i}}{(1 + b D_{i} t)^{\frac{1}{b}}}

(5)

When D = D_{i},

q = \frac{q_{i}}{1 + D_{i} t}

(6)

When D < D_{i},

q = q_{i} e^{- D_{i} t}

(7)

where q is the production rate, q_{i} is the initial production rate at the beginning of the boundary-dominated flow period, D is the standard decline rate, D_{i} is the initial decline rate according to Arps, and b is the hyperbolic exponent.
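The three rate expressions in Equations (5)–(7) can be evaluated with one small function. The sketch below is a minimal illustration under one stated assumption: it selects the form through the exponent b, following the conventional Arps parametrization (b = 0 for the exponential form of Equation (7), b = 1 for the harmonic form of Equation (6), and other positive b for the hyperbolic form of Equation (5)). The function name and the numbers in the example are hypothetical.

```python
import math


def arps_rate(t, q_i, d_i, b):
    """Arps decline rate q(t).

    b == 0 -> exponential form, Eq. (7)
    b == 1 -> harmonic form, Eq. (6)
    else   -> hyperbolic form, Eq. (5)
    Selecting the form by b is an assumption of this sketch.
    """
    if b == 0:
        return q_i * math.exp(-d_i * t)
    return q_i / (1.0 + b * d_i * t) ** (1.0 / b)


# Hypothetical well: q_i = 1000 m3/d, D_i = 0.01 per day, evaluated at day 100.
q_exponential = arps_rate(100, 1000.0, 0.01, 0.0)
q_harmonic = arps_rate(100, 1000.0, 0.01, 1.0)
q_hyperbolic = arps_rate(100, 1000.0, 0.01, 0.5)
```

For the same q_{i} and D_{i}, the exponential form declines fastest and the harmonic form slowest, with the hyperbolic form in between. In practice q_{i}, D_{i}, and b would be obtained by fitting the production history rather than set by hand.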

In addition to decline formulas expressed as functions of time, other easily applied models describe the production decline of oil and gas reservoirs in different ways. Li and Horne proposed a model with a linear relationship between the oil production rate and the reciprocal of the cumulative oil production [44]. The model can be expressed as follows:

q (t) = a_{0} \frac{1}{R (t)} - b_{0}

(8)

where R(t) is the recovery at time t, in pore volume units (R = N_p/V_p, where N_p is the cumulative oil production and V_p is the pore volume), and a_{0} and b_{0} are two constants associated with capillary and gravity forces, respectively. Explicit expressions for a_{0} and b_{0} are given in [44].
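Because Equation (8) is linear in 1/R(t), the two constants can be estimated from production history by ordinary least squares. The sketch below shows one way to do this in plain Python. The function name and the synthetic data are assumptions made for illustration, and no physical interpretation of the fitted values is implied.

```python
def fit_li_horne(q, R):
    """Least-squares estimate of a_0 and b_0 in q = a_0 * (1 / R) - b_0, Eq. (8).

    q: production rates, R: recoveries at the same times (both lists).
    """
    x = [1.0 / r for r in R]
    n = len(x)
    mean_x = sum(x) / n
    mean_q = sum(q) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xq = sum((xi - mean_x) * (qi - mean_q) for xi, qi in zip(x, q))
    a_0 = s_xq / s_xx             # slope of q against 1/R
    b_0 = a_0 * mean_x - mean_q   # intercept of the line is -b_0
    return a_0, b_0


# Synthetic, noise-free data generated with a_0 = 50 and b_0 = 20 (hypothetical).
recovery = [0.1, 0.2, 0.4, 0.5]
rate = [50.0 / r - 20.0 for r in recovery]
a_0, b_0 = fit_li_horne(rate, recovery)
```

On noise-free synthetic data the fit recovers the generating constants, which is a convenient sanity check before applying the same routine to field data.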

2.3. Model Inference

The specific implementation process of the model is depicted in Figure 1. The process begins with the input of CBM production data, which include variables such as bottomhole pressure, water production rate, and dynamic fluid level. Then, the data from different wells are clustered based on their similarities, enabling the model to group wells with similar behaviors.

Once clustered, the data are transformed into sequential feature sets for time-series analysis. The model’s architecture begins with an LSTM network, which processes the input sequences to capture long-term temporal dependencies that are critical for accurate prediction. The output from the LSTM layers passes through one or more fully connected layers, adjusting feature dimensions and preparing the data for further processing by the Transformer network.

In the next phase, the data enter the Transformer encoder layers. These layers apply self-attention mechanisms, enabling the model to weigh the importance of different features and capture intricate dependencies within the sequence. This attention-based approach allows the model to focus on critical patterns, such as production peaks or declines, that span the entire time series. To mitigate overfitting, the output from the Transformer layers passes through a Dropout layer before a final fully connected layer generates the prediction. This architecture is designed to model both short-term and long-term dependencies, with the aim of producing robust predictions of coalbed methane production.

The model structure that integrates LSTM and Transformer combines the strengths of both to more effectively capture short-term dependencies, long-term dependencies, and global features in time-series data. The following paragraphs describe the model structure and its computation step by step.

The input feature sequence X \in \mathbb{R}^{B \times T \times D_{in}} is first passed through the LSTM model, where B represents the batch size, T is the number of time steps, and D_{in} is the dimension of the input features. The LSTM uses its gating mechanisms to extract short-term and long-term dependency features from the input sequence. After processing by the LSTM, we obtain the hidden states at each time step, H_{LSTM} \in \mathbb{R}^{B \times T \times D_{LSTM}}, and the cell state C:

H_{LSTM}, C = LSTM (X)

(9)

The hidden states H_{LSTM} contain the feature representations of the input sequence at each time step, where D_{LSTM} represents the size of the LSTM hidden layer. These representations will be further processed.

To transform the output features of the LSTM into a feature dimension suitable for processing by the Transformer, we use a fully connected layer (linear layer). This layer maps the feature dimension D_{LSTM} of the LSTM hidden states H_{LSTM} to the feature dimension D_{Transformer} required by the Transformer:

H_{FC1} = H_{LSTM} W_{FC1} + b_{FC1}

(10)

where W_{FC1} \in \mathbb{R}^{D_{LSTM} \times D_{Transformer}} is the weight matrix and b_{FC1} \in \mathbb{R}^{D_{Transformer}} is the bias vector. After passing through the fully connected layer, the output tensor shape becomes H_{FC1} \in \mathbb{R}^{B \times T \times D_{Transformer}}.

Next, the tensor shape is adjusted to match the input format expected by the Transformer layer. In PyTorch’s Transformer implementation, the time step is the first dimension of the input by default. Therefore, we permute the first two dimensions of H_{FC1}, from B \times T \times D_{Transformer} to T \times B \times D_{Transformer}:

H_{Transformer input} = permute (H_{FC1})

(11)

The tensor with adjusted dimensions is then fed into the Transformer layer. Through its attention mechanism, the Transformer layer captures global feature relationships within the input sequence, in particular long-distance dependencies. Encoding yields the Transformer output H_{Transformer output} \in \mathbb{R}^{T \times B \times D_{Transformer}}, which keeps the time-first layout of its input:

H_{Transformer output} = Transformer (H_{Transformer input})

(12)

This output retains the global features of each time step. Through multi-layer encoding and the attention mechanism, the Transformer can effectively learn the global context information of the sequence.

From the output of the Transformer layer, we typically select the hidden state of the last time step, H_{Transformer output}[-1], as the input to the final fully connected layer:

H_{final} = H_{Transformer output}[-1]

(13)

Then, through a fully connected layer, the features are mapped to the output space to obtain the final prediction:

\hat{y} = H_{final} W_{FC2} + b_{FC2}

(14)

The final output \hat{y} contains the predicted value for each sample in the batch. Here, W_{FC2} is the weight matrix, and b_{FC2} is the bias vector.

By combining LSTM and Transformer, we leverage LSTM to extract short-term and long-term dependency features of the time series. These features are then mapped to a suitable dimension for the Transformer using a fully connected layer. The Transformer layer captures global feature relationships, and finally, the output passes through a fully connected layer to generate the final prediction. This combined model approach fully utilizes the advantages of both LSTM and Transformer, achieving a more profound modeling of time-series data.
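The pipeline of Equations (9)–(14) can be sketched in PyTorch as follows. This is a minimal illustration, not the authors’ implementation: the class name, the layer sizes (64 LSTM units, 32 Transformer features, 4 attention heads, 1 encoder layer), and the learning rate are assumptions chosen only to make the example run, while the four input features, the 20 time steps, the Dropout rate of 0.1, the MAE loss, and the Adam optimizer follow the text.

```python
import torch
import torch.nn as nn


class LSTMAttention(nn.Module):
    """LSTM -> linear -> Transformer encoder -> Dropout -> linear (Eqs. 9-14)."""

    def __init__(self, d_in=4, d_lstm=64, d_model=32, n_heads=4,
                 n_layers=1, dropout=0.1):
        # d_lstm, d_model, n_heads, and n_layers are assumed values.
        super().__init__()
        self.lstm = nn.LSTM(d_in, d_lstm, batch_first=True)
        self.fc1 = nn.Linear(d_lstm, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dropout=dropout)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.dropout = nn.Dropout(dropout)
        self.fc2 = nn.Linear(d_model, 1)

    def forward(self, x):                 # x: (B, T, D_in)
        h_lstm, _ = self.lstm(x)          # Eq. (9):  (B, T, D_LSTM)
        h_fc1 = self.fc1(h_lstm)          # Eq. (10): (B, T, D_Transformer)
        h_in = h_fc1.permute(1, 0, 2)     # Eq. (11): (T, B, D_Transformer)
        h_out = self.encoder(h_in)        # Eq. (12): (T, B, D_Transformer)
        h_final = h_out[-1]               # Eq. (13): (B, D_Transformer)
        y_hat = self.fc2(self.dropout(h_final))  # Eq. (14): (B, 1)
        return y_hat.squeeze(-1)          # (B,)


def train_step(model, optimizer, x, y):
    """One Adam update with the MAE (L1) loss; returns the loss value."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.l1_loss(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()


# Random stand-in data: 8 windows of 20 time steps with 4 features each.
torch.manual_seed(0)
model = LSTMAttention()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is assumed
x = torch.randn(8, 20, 4)
y = torch.randn(8)
loss_value = train_step(model, optimizer, x, y)
y_hat = model(x)
```

The permutation in `forward` is needed because the encoder layer is created with its default time-first layout; constructing it with `batch_first=True` would make the permutation unnecessary, in which case the last time step would be selected with `h_out[:, -1]` instead.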

2.4. Data Description

This study examines a CBM block in Yangquan, Shanxi. The specific location is shown in Figure 2. The mining area is located on the northeastern edge of the Qinshui coalfield, on the western side of the middle section of the Taihang Uplift. To the north of the mining area lies the Wutai block. It is situated on the eastern wing of the Qilu Arc and a series of graben-like polyphase structures in central Shanxi. The area forms a composite zone with the Taihang Uplift of the Neo-Cathaysian system and the latitudinal structural zone in central Shanxi, specifically the Yangqu–Yuxian east–west fold–fault zone. Additionally, it is positioned between the Shouyang–Xiluo meridional structure and the Taihang meridional structure. Production data from 21 representative wells were selected for analysis. The coal seam burial depths in the dataset vary significantly, ranging from 467 to 823 m. The reservoir is characterized by low permeability and pressure. The data cover the period from post-fracturing operations to the end of production.

The sample set is divided into two subsets based on the different fracturing operation times and production dates, comprising 9 and 12 wells, respectively. The interval between fracturing operations within each subset does not exceed three months. Due to varying well conditions, the production periods can differ by up to 110 days.

Due to the influence of geological conditions and fracturing operations, the final production and production variations of CBM wells with similar production dates and gas-bearing layers can differ significantly. As Figure 3 illustrates, CBM wells can be categorized into three types based on their production volume and variation: declining, stable, and fluctuating. Well 1 is a typical example of a declining CBM well. It has a gas-bearing layer with a high gas content, effective fracturing results, high permeability, and low water content. During the extraction process, no equipment failures occurred, and there were no artificial restrictions from the production stage onward. Its production curve can be divided into three stages: rapid increase in production, noticeable decline, and stable production. This type of CBM well is ideal but rare.

The primary reasons for the significant difference between actual production curves and ideal curves can be attributed to two main factors. One is long-term factors such as reservoir geology. For example, if the coal seam has poor permeability, gas will have difficulty migrating from within the coal seam to the wellbore, even after fracturing operations. Similarly, high water content in the coal seam, complex geological structures, low gas content, and formation stress causing fracture closure can also lead to variations in gas production. The surrounding rock’s poor sealing properties can result in wells like Well 2, which, despite overall stability, have low gas production. In severe cases, gas production may stop or not occur shortly after starting.

Another type of factor is short-term influences, which include inter-well interference caused by fracturing operations in adjacent wells, human interventions in production through extraction plans, and sudden issues with extraction equipment. These influences can immediately and significantly impact production. For instance, the production curve of Well 3 shows high overall output with frequent sharp increases or decreases in daily production. Identifying the reasons for these abrupt changes in production and preventing and addressing similar issues has been a critical focus in CBM development. Wells affected by these short-term influences constitute the majority of cases.

2.5. Data Preprocessing

Data preprocessing plays a pivotal role in enhancing model performance and prediction accuracy. To ensure high data quality and minimize interference, this study implemented a comprehensive series of preprocessing steps, including normalization, noise management, and missing data imputation. Data points where production remained below 100 m³ for more than 20 consecutive days were removed, ensuring completeness while filtering abnormal values.
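The low-production filter described above can be sketched as follows. The sketch assumes one daily production value per list entry and interprets the rule as removing every run of more than 20 consecutive days below 100 m³; the function name and the example series are hypothetical.

```python
def drop_low_runs(production, threshold=100.0, max_run=20):
    """Remove runs of more than `max_run` consecutive values below `threshold`.

    Shorter low-production runs are kept unchanged.
    """
    n = len(production)
    keep = [True] * n
    start = None  # index where the current low run began, if any
    for i in range(n + 1):
        low = i < n and production[i] < threshold
        if low and start is None:
            start = i
        elif not low and start is not None:
            if i - start > max_run:
                keep[start:i] = [False] * (i - start)
            start = None
    return [v for v, k in zip(production, keep) if k]


# Hypothetical series: a 21-day low run (removed) and a 20-day low run (kept).
series = [500.0] * 5 + [50.0] * 21 + [400.0] * 3 + [50.0] * 20 + [300.0] * 2
filtered = drop_low_runs(series)
```

In this example only the 21-day run is dropped, because the 20-day run does not exceed the limit.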

To address the variability in scales and ranges among feature values, the Min-Max normalization method was utilized, mapping all features to the range [0, 1]. The normalization formula used is

x^{'} = \frac{x - \min (x)}{\max (x) - \min (x)}

(15)

Here, x represents the original feature value, while \min (x) and \max (x) denote the minimum and maximum values of the feature, respectively. This method reduces discrepancies in feature scales, preventing any single feature from disproportionately influencing the model during training. Additionally, normalization helps accelerate model convergence by standardizing input values.
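Equation (15) is applied to each feature separately. A minimal sketch is given below; the function name, the optional `lo` and `hi` arguments, and the handling of a constant feature are assumptions of this example.

```python
def min_max_scale(values, lo=None, hi=None):
    """Min-Max normalization, Eq. (15): (x - min) / (max - min).

    If lo and hi are omitted they are taken from `values`.
    A constant feature is mapped to 0.0 (assumed convention).
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical casing-pressure readings.
scaled = min_max_scale([2.0, 4.0, 6.0])
# Reusing the training minimum and maximum on unseen data.
scaled_new = min_max_scale([8.0], lo=2.0, hi=6.0)
```

Passing the training minimum and maximum explicitly keeps the scaling of test data consistent with the training data; values outside the training range then fall outside [0, 1], as in the second call.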

Noise in the data, particularly zero-point data and outliers, was also addressed. Zero-point data, often caused by equipment failures or transmission errors, can harm model performance. Data with excessive zero values or prolonged consecutive zeros were removed using thresholds. After removing noise, GPD shows a relatively weak positive correlation with other production parameters (Figure 4), and the correlations between other features are also not particularly pronounced. Considering the correlation between the features and the significance of the features themselves, the model selected bottomhole pressure (BHP), casing pressure (CP), water production rate (WPR), and dynamic liquid level (DLL) as input features.

The data were derived from two datasets, including 3078 samples from primary fracturing and 3545 samples from secondary fracturing. After removing outliers, 4512 data points were retained (Figure 5). The production data exhibit a clear right-skewed distribution (positive skewness), with most samples concentrated in the lower production range, while high-production samples are relatively rare. This distribution characteristic is common in coalbed methane production and reflects the resource limitations and technological impacts on well productivity.

Missing data were handled based on their proportion. For features with less than 5% missing values, linear interpolation or time-series mean methods were used. For higher proportions, the affected time periods were excluded to avoid errors. Inconsistent sensor sampling frequencies were aligned and resampled to a daily interval, and interpolation filled gaps to ensure temporal consistency.
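The missing-data rule can be sketched as a check of the missing fraction followed by linear interpolation. In the sketch below, missing values are represented by `None`, the 5% threshold is taken from the text, and the function names and the example series are hypothetical.

```python
def missing_fraction(series):
    """Share of entries that are missing (None)."""
    return sum(v is None for v in series) / len(series)


def interpolate_gaps(series):
    """Fill interior gaps by linear interpolation between known neighbors.

    Leading and trailing gaps have only one neighbor and are left as None.
    """
    out = list(series)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


# Hypothetical daily water production rate with a two-day gap.
wpr = [1.0, None, None, 4.0] + [4.0] * 46
if missing_fraction(wpr) < 0.05:
    wpr_filled = interpolate_gaps(wpr)
else:
    wpr_filled = None  # in the text, such periods are excluded instead
```

Here 2 of 50 values are missing (4%), so the gap is interpolated; a feature above the threshold would instead have the affected period excluded.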

3. Results

This section compares the performance of the LSTM-A, LSTM, exponential, and harmonic models in predicting the production of different types of CBM wells. The evaluation metrics used include MAE, RMSE, and R². After determining the strengths and weaknesses of the LSTM-A model relative to the other models, the model optimization process regarding feature selection and parameter tuning is analyzed.

The MAE is calculated as

MAE = \frac{1}{n} \sum_{i = 1}^{n} | X_{i} - Y_{i} |

(16)

where X_{i} is the actual value, Y_{i} is the predicted value, and n is the number of predictions. The RMSE can be expressed as

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (X_{i} - Y_{i})^{2}}

(17)

The R² score is computed as

R^{2} = 1 - \frac{\sum_{i = 1}^{n} (X_{i} - Y_{i})^{2}}{\sum_{i = 1}^{n} (X_{i} - \bar{X})^{2}}

(18)

where \bar{X} is the mean of the actual values. Lower MAE and RMSE values indicate higher accuracy, and a higher R² score indicates a better model fit.
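The three metrics in Equations (16)–(18) translate directly into code. The following plain-Python sketch uses hypothetical function names and toy numbers.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, Eq. (16)."""
    return sum(abs(x - y) for x, y in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, Eq. (17)."""
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(actual, predicted)) / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination, Eq. (18)."""
    mean_x = sum(actual) / len(actual)
    ss_res = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    ss_tot = sum((x - mean_x) ** 2 for x in actual)
    return 1.0 - ss_res / ss_tot


# Toy values: a close forecast and a forecast with the wrong trend.
actual = [1.0, 2.0, 3.0]
r2_close = r2_score(actual, [1.0, 2.0, 4.0])
r2_wrong_trend = r2_score(actual, [3.0, 2.0, 1.0])
```

Equation (18) is not bounded below by zero: a forecast that fits worse than the constant mean \bar{X}, such as the reversed trend in the example, produces a negative R².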

3.1. Model Optimization

The hyperparameter settings are key to model tuning [45]. To improve prediction accuracy, reduce training time, and prevent overfitting, we examine the effects of feature selection, Dropout rate, time step, and number of layers on prediction performance and use the findings to optimize the prediction model. The remaining hyperparameters are detailed in Table 1. The dataset contains a total of 4512 data points. Considering the varying production cycles of individual wells, the test set was not allocated strictly by proportion. Overall, approximately 90% of the data were allocated to the training set, and about 10% were allocated to the test set. Well 3 exhibits a complex production pattern that is consistent with the behavior of most CBM wells, making it a suitable candidate for validation.

3.1.1. Feature Selection

The choice of features significantly affects the prediction results (Figure 6). When casing pressure and water production are used as features, the main trends and variations in production are reflected accurately, resulting in the most accurate predictions.

The MAE obtained using CP as a feature is the lowest among all prediction results (Table 2). The RMSE obtained using WPR is the lowest among all prediction results, and WPR also yields the highest R² value. The prediction results using BHP as a feature are average. Finally, the prediction results obtained using all four parameters as features have an MAE of 71.03, an RMSE of 115.17, and an R² of 0.56, reflecting actual production conditions well. When all four features are used, the negative impact of the higher-dimensional input on the model is minimal, and the model exhibits good stability.

3.1.2. Dropout Rate

Dropout is a regularization technique that randomly drops a certain percentage of neurons during training to prevent the model from overfitting (Figure 7). The model shows the best prediction performance at a Dropout rate of 0.1. When the Dropout rate increases to 0.2, MAE and RMSE increase slightly (Table 3).

However, a further increase in the Dropout rate to 0.3 results in a marked deterioration in model performance. At a Dropout rate of 0.4, the predictive performance is still worse than at Dropout rates of 0.1 and 0.2. A lower Dropout rate (0.1 or 0.2) is more favorable.

3.1.3. Time Step

The time step determines the number of past time points the model can consider when predicting the current output. As shown in Figure 8, the model’s fit to the actual data is poor at a time step of 5, especially in the early stages of prediction, where the model fails to capture the fluctuations in production accurately. With a time step of 10, the model’s performance improves but still exhibits significant deviations. The performance at a time step of 15 is better than that of the previous two. Beyond the 100-day mark, the model can better track the production trend. The curve at a time step of 20 shows a trend more closely aligned with actual production, making this time step the best-performing one. Overall, the longer the time step, the more accurate the prediction becomes. However, to prevent issues such as over-smoothing of the model and loss of detailed information, as well as to reduce the risk of overfitting, the time step will no longer be increased. The model uses 20 time steps to make predictions.

As the time step increases, the R² value gradually rises, reaching 0.39 at a time step of 20, indicating an enhancement in the model’s explanatory power of the data (Table 4). The gradual decrease in MAE and RMSE also indicates that the longer the time step, the stronger the model’s prediction ability.
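The role of the time step can be made concrete with a sliding-window sketch: each training sample consists of the previous 20 days of features. The sketch assumes that the target is the gas production on the day immediately after the window, which the text does not state explicitly; the function name and the toy data are also hypothetical.

```python
def make_windows(features, target, time_step=20):
    """Build supervised samples from a daily series.

    X[k] holds `time_step` consecutive feature rows and y[k] is the target
    on the following day (assumed one-step-ahead setup).
    """
    X, y = [], []
    for end in range(time_step, len(features)):
        X.append(features[end - time_step:end])
        y.append(target[end])
    return X, y


# Toy data: 25 days, 4 features per day (stand-ins for BHP, CP, WPR, DLL).
days = 25
features = [[float(d), d + 0.1, d + 0.2, d + 0.3] for d in range(days)]
gas = [float(d) for d in range(days)]
X, y = make_windows(features, gas, time_step=20)
```

A series of N days yields N − 20 samples, so a longer time step gives the model more context per sample but fewer samples, which matters for the short production histories considered here.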

3.1.4. Layer

The number of layers determines the complexity and expressiveness of the model. More layers typically capture more complex patterns in the data, but they may also lead to overfitting. The overall trend in the one-layer model aligns well with the actual data (Figure 9). The two-layer model significantly underestimates daily production at the beginning of the forecast. The three-layer model performs slightly better than the two-layer model, but some areas still have significant errors. The four-layer model shows the closest trend to the actual data in the figure. During significant fluctuations, the model almost wholly follows the changes in actual production. It has the lowest MAE (64.49) and the highest R² (0.60) (Table 5).

In summary, casing pressure and water production rate were the most informative individual features. The model performed best at Dropout rates of 0.1 and 0.2. Longer time steps improved performance and reduced the MAE and RMSE, while the one- and four-layer configurations provided the best accuracy. These settings are applied to data from different wells in the next section.

3.2. Field Application

3.2.1. Rapid Decline Prediction

The actual daily production data for the CBM well show a clear downward trend (Figure 10). Of the two conventional production decline curves, the exponential model fits the actual data poorly, mainly underestimating the production in the later stages. The harmonic model, while less accurate than the exponential model in predicting the early trend, performs better in the later stages, only slightly underestimating the production. The Li–Horne model grossly overestimates future production.

The LSTM model exhibits noticeable volatility in its predictions from the beginning, with the primary source of error being significant fluctuations. However, it remains relatively accurate over the long term (Figure 11). All models use a certain period of prior data as the training set. The LSTM-A model effectively corrects the volatility error of the LSTM model, allowing for long-term tracking of the actual data trend (Table 6).

3.2.2. Stable Production Prediction

Figure 12 shows the predictive performance of the LSTM-A model compared to the LSTM, exponential, harmonic, and Li–Horne models during a stable production phase with gradual decline. The CBM production data remain stable and gradually decline after a certain period. Because both the exponential and harmonic Arps models performed poorly, the Li–Horne model was added for comparison. The exponential model performed well during the stable production phase but significantly overestimated production in the later stages (Table 7). The harmonic model, which predicted an initial decline followed by stable production, completely contradicted the actual production trend. The Li–Horne model outperformed the first two but still overestimated production.

The LSTM model’s overall predictions are close to the actual values, slightly underestimating production in the early phase and overestimating it in the later phase, with the actual decline being more pronounced than the model predicts (Figure 13). The LSTM-A model’s overall predictive trend is very close to the actual values, with an R² value reaching 0.92, indicating a solid alignment with the actual data.

Overall, the LSTM-A model demonstrates a higher prediction accuracy than the LSTM model, especially in tracking the decline phase of the actual production data. The LSTM model, while following the trend, shows more significant prediction errors and fluctuations.

3.2.3. Fluctuating Production Prediction

Figure 14 illustrates the prediction results of two machine learning models for fluctuating CBM production. The actual data exhibit considerable variability and multiple peaks during the production period. It is evident that conventional production decline curve models, including the exponential and harmonic models, cannot predict this type of production decline curve.

The LSTM model performs poorly in this prediction. Although the final predicted production amount is close to the actual value, the trend significantly deviates from the actual data (Figure 15). In comparison, the LSTM-A model demonstrates better consistency with the peaks and troughs of actual production data, with the R² improving from −1.04 (LSTM) to 0.56.
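The negative R² values reported for some models follow directly from the definition of the metric. The sketch below assumes the standard definitions of MAE, RMSE, and R² (the article does not list its evaluation code), and the sample values are invented for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented values: `good` follows the peaks and troughs, `bad` misses them.
actual = [10.0, 12.0, 8.0, 11.0, 9.0]
good = [10.5, 11.5, 8.5, 10.5, 9.5]
bad = [8.0, 8.5, 12.0, 8.0, 12.5]

print(r_squared(actual, good), r_squared(actual, bad))
```

An R² below zero means the forecast has a larger squared error than simply predicting the mean of the actual data, which is what happens when the predicted trend runs against the observed peaks and troughs.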

4. Discussion

The LSTM-A model proposed in this study demonstrates significant advantages in forecasting coalbed methane (CBM) production, particularly across different production stages, including rapid decline, stable production, and fluctuating production. By combining long short-term memory (LSTM) networks with the attention mechanism of the Transformer architecture, the LSTM-A model effectively captures both short- and long-term dependencies in time-series data, leading to a marked improvement in prediction accuracy. When compared with traditional Decline Curve Analysis (DCA) models, the LSTM-A model outperforms in all production phases, especially in stages with greater production fluctuations.

The results of this study demonstrate the effectiveness of the LSTM-A model in predicting CBM production, particularly in comparison with traditional decline curve models and the LSTM model alone [46]. The LSTM-A model showed superior performance in short-term and long-term production predictions across various production stages—declining, stable, and fluctuating.

During the rapid decline phase, the LSTM-A significantly outperformed traditional exponential and harmonic models, which tended to underestimate production, especially in later stages. While the LSTM model also provided accurate predictions, it displayed some volatility that was effectively smoothed by incorporating the attention mechanism in the LSTM-A.

Compared to traditional prediction methods, such as DCA and other numerical simulation methods, the approach proposed in this study demonstrates clear advantages. Traditional methods often rely on empirical formulas or simplified physical models, which struggle to accurately capture the complexity of CBM reservoirs, particularly in dynamic and unconventional environments. While DCA models perform well under steady-state conditions, they cannot adapt to rapidly changing production data, making them unsuitable for real-time predictions. In contrast, the method proposed in this study effectively handles time-series data and dynamically adjusts to evolving reservoir conditions, providing more accurate predictions.

Although the DNN model demonstrates a high coefficient of determination (R² = 0.923) [47], it mainly relies on static data, limiting its ability to model the dynamic features of CBM production. By integrating the attention mechanism, the method proposed here can prioritize important features, enhancing its ability to capture long-term dependencies and abrupt changes in production, a capability that is often overlooked by other machine learning models.

The hybrid method that combines convolutional neural networks (CNNs) and LSTM shares similarities with the approach presented in this study, particularly in feature extraction and time-series modeling [36]. However, the attention mechanism in this study further optimizes the feature weighting process by allowing the model to focus on the most important time steps, improving prediction accuracy. This enables the method to be more adaptable to different well conditions and production scenarios, while the hybrid models in the literature lack this optimization layer.
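The idea of weighting time steps can be illustrated with single-query scaled dot-product attention. This is a minimal sketch, not the authors' implementation: the hidden states are toy values, and using the latest state as the query is an assumption made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Context vector: attention-weighted sum of the value vectors.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy hidden states for four time steps (made-up values, dimension 2).
hidden = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1], [1.0, 0.9]]
query = hidden[-1]  # assumption: the latest state acts as the query
weights, context = attend(query, hidden, hidden)
print([round(w, 3) for w in weights])
```

Time steps whose hidden states resemble the query receive larger weights, so they contribute more to the context vector that feeds the prediction.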

Wei et al. successfully applied the LSTM model to predict CBM production based on geological data [48], but they typically used only static factors such as coal seam depth and permeability. In contrast, the method presented here integrates both geological data and dynamic production data (such as pressure and gas production rates), offering a more comprehensive approach to CBM production forecasting. This integrated approach allows for more accurate and timely predictions, as it considers not only geological features but also the continuously changing production conditions.

The method proposed in this study, by integrating dynamic and static data and providing stronger adaptability through the attention mechanism, exhibits better computational stability and prediction accuracy, demonstrating superior performance compared to existing CBM prediction models. It presents a promising tool for real-time, large-scale applications in CBM production forecasting.

Despite its advantages, the LSTM-A does have some limitations, particularly when predicting highly volatile production data. Future research could explore incorporating additional features or developing hybrid models that combine strengths from different machine learning approaches.

Although the model proposed in this study demonstrates promising performance in coalbed methane (CBM) production forecasting, there is still room for improvement. Future research can explore several directions. First, while the model effectively captures dynamic changes in production data, it may still face challenges in extreme conditions, such as in areas with complex geological environments. To further enhance the model’s diversity and stability, more types of deep learning models, such as graph neural networks or generative adversarial networks, could be integrated to improve its predictive ability in high-variance or unknown production patterns. Secondly, although this study combined geological data and dynamic production data, future research could incorporate additional data sources, such as climate change, socio-economic factors, or multi-scale geological models, to more comprehensively assess the production potential of CBM.

Moreover, as production environments evolve rapidly, real-time data stream-based forecasting systems will be increasingly important. Future studies could explore online learning methods that enable the model to adjust its predictions in real time, better addressing uncertainties in actual production, especially over long production cycles. Furthermore, the method presented here could be expanded to other types of unconventional energy, such as shale gas and tight gas. Given the similarities in geological characteristics and production data between these energy sources and CBM, the proposed model could have broader applicability. Finally, although deep learning models have significantly improved prediction accuracy, their “black box” nature remains a challenge. Future research could focus on enhancing model interpretability, for example, through visualization techniques or attention mechanisms within the model, to help users better understand which factors contribute the most to the prediction results.

5. Conclusions

The LSTM-A model proposed in this study effectively integrates geological data and well cluster features. The model can accurately predict future trends by utilizing time-series attributes closely related to production. The following conclusions were drawn:

(1): The LSTM-A model demonstrates superior performance in predicting CBM production, particularly during the rapid decline and stable production phases, outperforming the traditional Arps, Li–Horne, and LSTM models and proving its effectiveness across various production curve types.
(2): The combination of Transformer and LSTM architectures enables the LSTM-A model to more effectively analyze the temporal complexity of CBM extraction, significantly improving the R² score for Well 1 from 0.17 (LSTM) to 0.79 (LSTM-A) (Table 6) and enhancing the model’s explanatory power.
(3): By optimizing the key parameters, such as feature selection, Dropout rate, time step, and the number of layers, the LSTM-A model effectively balances prediction accuracy and training efficiency, making it well suited for diverse CBM production scenarios.
(4): The incorporation of certain engineering parameters (CP, BHP, WPR, DLL) as features has significantly improved the predictive capability of the model. These engineering parameters offer advantages over geological parameters, such as being easier to obtain, lower in cost, higher in accuracy, easier to standardize, and more intuitively impactful.
(5): The LSTM-A model successfully captures the periodicity and regression characteristics of fluctuating production data. It is capable of handling short-term production fluctuations and long-term trend changes.

Author Contributions

Conceptualization, H.Z. and K.L.; methodology, H.Z. and K.L.; software, H.Z. and J.H.; validation, K.L.; formal analysis, S.S.; investigation, H.Z. and K.L.; resources, K.L. and H.Z.; writing—original draft preparation, H.Z.; writing—review and editing, S.S. and K.L.; visualization, S.S.; supervision, K.L.; project administration, K.L.; funding acquisition, K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by Huayang New Material Technology Group Co., Ltd.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CBM	Coalbed methane
ML	Machine learning
LSTM	Long short-term memory
MAE	Mean absolute error
RMSE	Root mean square error
R²	R squared
LSTM-A	LSTM-attention
DCA	Decline curve analysis
ANN	Artificial neural network
LSSVM	Least squares support vector machine
EUR	Estimated ultimate recovery
GPR	Gas production rate
SR	Seam roof
PBD	Pump bottom depth
GPD	Gas production day
BHP	Bottomhole pressure
CP	Casing pressure
SP	System pressure
DLL	Dynamic liquid level
LCH	Liquid column height
SR	Stroke rate
WPR	Water production rate
CGP	Cumulative gas production
CWP	Cumulative water production

References

1. Xu, F.; Yan, X.; Lin, Z.; Li, S.; Xiong, X.; Yan, D.; Wang, H.; Zhang, S.; Xu, B.; Ma, X.; et al. Research progress and development direction of key technologies for efficient coalbed methane development in China. Coal Geol. Explor. 2022, 50, 1–14.
2. Mohamed, T.; Mehana, M. Coalbed methane characterization and modeling: Review and outlook. Energy Sources Part A Recovery Util. Environ. Eff. 2020, 47, 2874–2896.
3. Suárez, A.A. The expansion of unconventional production of natural gas (tight gas, gas shale and coal bed methane). Adv. Nat. Gas Technol. 2012, 7, 123–146.
4. Zheng, C.; Jiang, B.; Xue, S.; Chen, Z.; Li, H. Coalbed methane emissions and drainage methods in underground mining for mining safety and environmental benefits: A review. Process Saf. Environ. Prot. 2019, 127, 103–124.
5. Qin, Y.; Moore, T.A.; Shen, J.; Yang, Z.; Shen, Y.; Wang, G. Resources and geology of coalbed methane in China: A review. Int. Geol. Rev. 2017, 60, 777–812.
6. Song, Y.; Liu, S.; Zhang, Q.; Tao, M.; Zhao, M.; Hong, F. Coalbed methane genesis, occurrence and accumulation in China. Pet. Sci. 2012, 9, 269–280.
7. Moore, T.A. Coalbed methane: A review. Int. J. Coal Geol. 2012, 101, 36–81.
8. Zeng, B.; Li, H. Prediction of Coalbed Methane Production in China Based on an Optimized Grey System Model. Energy Fuels 2021, 35, 4333–4344.
9. Sarhosis, V.; Jaya, A.A.; Thomas, H.R. Economic modelling for coal bed methane production and electricity generation from deep virgin coal seams. Energy 2016, 107, 580–594.
10. Mastalerz, M.; Drobniak, A. Coalbed methane: Reserves, production, and future outlook. In Future Energy; Elsevier: Amsterdam, The Netherlands, 2020; pp. 97–109.
11. Liang, W.; Yan, J.; Zhang, B.; Hou, D. Review on coal bed methane recovery theory and technology: Recent progress and perspectives. Energy Fuels 2021, 35, 4633–4643.
12. Tariq, Z.; Aljawad, M.S.; Hasan, A.; Murtaza, M.; Mohammed, E.; El-Husseiny, A.; Alarifi, S.A.; Mahmoud, M.; Abdulraheem, A. A systematic review of data science and machine learning applications to the oil and gas industry. J. Pet. Explor. Prod. Technol. 2021, 11, 4339–4374.
13. He, J.; Li, K.; Wang, X.; Gao, N.; Mao, X.; Jia, L. A Machine Learning Methodology for Predicting Geothermal Heat Flow in the Bohai Bay Basin, China. Nat. Resour. Res. 2022, 31, 237–260.
14. Aminian, K.; Ameri, S.; Bhavsar, A.; Sanchez, M.; Garcia, A. Type Curves for Coalbed Methane Production Prediction. In SPE Eastern Regional Meeting; SPE: Calgary, AB, Canada, 2004.
15. Valkó, P.P.; Lee, W.J. A Better Way to Forecast Production from Unconventional Gas Wells. In SPE Annual Technical Conference and Exhibition; SPE: Calgary, AB, Canada, 2010.
16. Valkó, P.P. Assigning value to stimulation in the Barnett Shale: A simultaneous analysis of 7000 plus production hystories and well completion records. In SPE Hydraulic Fracturing Technology Conference and Exhibition; SPE: Calgary, AB, Canada, 2009.
17. Duong, A.N. Rate-Decline Analysis for Fracture-Dominated Shale Reservoirs. SPE Reserv. Eval. Eng. 2011, 14, 377–387.
18. Li, K.; Horne, R.N. Characterization of spontaneous water imbibition into gas-saturated rocks. In SPE Western Regional Meeting; SPE: Calgary, AB, Canada, 2000; pp. 267–278.
19. Li, K.; Horne, R.N. An analytical model for production decline-curve analysis in naturally fractured reservoirs. SPE Reserv. Eval. Eng. 2005, 8, 197–204.
20. Li, K.; Horne, R.N.; Stanford, U. Generalized Scaling Approach for Spontaneous Imbibition: An Analytical Model. SPE Reserv. Eval. Eng. 2006, 9, 251–258.
21. Jongkittinarukorn, K.; Last, N.; Escobar, F.H.; Srisuriyachai, F. A straight-line DCA for a gas reservoir. J. Pet. Sci. Eng. 2021, 201, 108452.
22. Tugan, M.F.; Weijermars, R. Variation in b-sigmoids with flow regime transitions in support of a new 3-segment DCA method: Improved production forecasting for tight oil and gas wells. J. Pet. Sci. Eng. 2020, 192, 107243.
23. Jaber, A.K.; Al-Jawad, S.N.; Alhuraishawy, A.K. A review of proxy modeling applications in numerical reservoir simulation. Arab. J. Geosci. 2019, 12, 701.
24. Zhang, Y.; Li, Z.; Wu, H. Interactive machine learning for segmenting pores of sandstone in computed tomography images. Gas Sci. Eng. 2024, 126, 205343.
25. Guo, Y.-S. Selection of machine learning algorithms in coalbed methane content predictions. Appl. Geophys. 2023, 20, 518–533.
26. Zhang, Q.; Tang, S.; Zhang, S.; Xi, Z.; Jia, T.; Yang, X.; Lin, D.; Yang, W. Data-Driven Approach for the Prediction of In Situ Gas Content of Deep Coalbed Methane Reservoirs Using Machine Learning: Insights from Well Logging Data. ACS Omega 2025, 10, 2871–2886.
27. Aung, Z.; Mikhaylov, I.S.; Aung, Y.T. Artificial Intelligence Methods Application in Oil Industry. In Proceedings of the 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), St. Petersburg and Moscow, Russia, 27–30 January 2020; pp. 563–567.
28. Mask, G.M.; Wu, X.; Nicholson, C. Enhanced hydrocarbon production forecasting combining machine learning, transfer learning, and decline curve analysis. Gas Sci. Eng. 2025, 134, 205522.
29. Yehia, T.; Khattab, H.; Tantawy, M.; Mahgoub, I. Removing the Outlier from the Production Data for the Decline Curve Analysis of Shale Gas Reservoirs: A Comparative Study Using Machine Learning. ACS Omega 2022, 7, 32046–32061.
30. Han, D.; Kwon, S.; Son, H.; Lee, J. Production forecasting for shale gas well in transient flow using machine learning and decline curve analysis. In Proceedings of the Asia Pacific Unconventional Resources Technology Conference, Brisbane, Australia, 18–19 November 2019.
31. Du, S.; Wang, J.; Wang, M.; Yang, J.; Zhang, C.; Zhao, Y.; Song, H. A systematic data-driven approach for production forecasting of coalbed methane incorporating deep learning and ensemble learning adapted to complex production patterns. Energy 2023, 263, 126121.
32. Xu, X.; Rui, X.; Fan, Y.; Yu, T.; Ju, Y. A multivariate long short-term memory neural network for coalbed methane production forecasting. Symmetry 2020, 12, 2045.
33. Yu, J.; Zhu, L.; Qin, R.; Zhang, Z.; Li, L.; Huang, T. Combining K-Means Clustering and Random Forest to Evaluate the Gas Content of Coalbed Bed Methane Reservoirs. Geofluids 2021, 2021, 9321565.
34. Jang, H.; Kim, Y.; Park, J.; Lee, J. Prediction of production performance by comprehensive methodology for hydraulically fractured well in coalbed methane reservoirs. Int. J. Oil Gas Coal Technol. 2019, 20, 143–168.
35. Meng, X.; Chang, H.; Wang, X. Methane concentration prediction method based on deep learning and classical time series analysis. Energies 2022, 15, 2262.
36. Li, X.; Li, X.; Xie, H.; Feng, C.; Cai, J.; He, Y. Enhanced coalbed methane well production prediction framework utilizing the CNN-BL-MHA approach. Sci. Rep. 2024, 14, 14689.
37. Hu, C.; Zhong, Y.; Lu, Y.; Luo, X.; Wang, S. A Prediction Model for Time Series of Dissolved Gas Content in Transformer Oil Based on LSTM. J. Phys. Conf. Ser. 2020, 1659, 012030.
38. Tang, B.; Matteson, D.S. Probabilistic Transformer for Time Series Analysis. Adv. Neural Inf. Process Syst. 2021, 28, 23592–23608.
39. Kim, D.K.; Kim, K. A Convolutional Transformer Model for Multivariate Time Series Prediction. IEEE Access 2022, 10, 101319–101329.
40. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270.
41. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
42. Han, K.; Xiao, A.; Wu, E.; Guo, J.; Xu, C.; Wang, Y. Transformer in transformer. Adv. Neural Inf. Process Syst. 2021, 34, 15908–15919.
43. Arps, J.J. Analysis of decline curves. Trans. AIME 1945, 160, 228–247.
44. Li, K.; Horne, R.N. Fractal modeling of capillary pressure curves for The Geysers rocks. Geothermics 2006, 35, 198–207.
45. Bradshaw, C.R.; Schmitt, J.; Langebach, R. Predicting Vapor Injected Compressor Performance Using Artificial Neural Networks. 2024. Available online: https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=3911&context=icec (accessed on 8 November 2024).
46. Xu, X.; Rui, X.; Fan, Y.; Yu, T.; Ju, Y. Forecasting of coalbed methane daily production based on T-LSTM neural networks. Symmetry 2020, 12, 861.
47. Song, H.; Du, S.; Yang, J.; Wang, M.; Zhao, Y.; Zhang, J.; Zhu, J. Forecasting and influencing factor analysis of coalbed methane productivity utilizing intelligent algorithms. Gongcheng Kexue Xuebao/Chin. J. Eng. 2024, 46, 614–626.
48. Wei, X.; Huang, W.; Liu, L.; Wang, J.; Cui, Z.; Xue, L. Low-rank coalbed methane production capacity prediction method based on time-series deep learning. Energy 2024, 311, 133247.

Figure 1. LSTM-A model implementation flow chart.

Figure 2. Location distribution of the well dataset.

Figure 3. Comparison of production curves of different types of CBM wells.

Figure 4. Pearson correlation matrix of coalbed methane production and production parameters. (gas production rate (GPR), seam roof (SR), pump bottom depth (PBD), gas production day (GPD), bottomhole pressure (BHP), casing pressure (CP), system pressure (SP), dynamic liquid level (DLL), liquid column height (LCH), stroke rate (SR), water production rate (WPR), cumulative gas production (CGP), cumulative water production (CWP)).

Figure 5. Preprocessed CBM production distribution for ML in Yangquan, Shanxi.

Figure 6. Effect of different features on prediction results of Well 3.

Figure 7. Effect of different Dropout rates on prediction results of Well 3.

Figure 8. Effect of different time steps on prediction results of Well 3.

Figure 9. Effect of different layers on prediction results of Well 3.

Figure 10. Comparison of daily production analysis models for Well 1.

Figure 11. Comparison of cumulative production analysis models for Well 1.

Figure 12. Comparison of daily production analysis models for Well 2.

Figure 13. Comparison of cumulative production analysis models for Well 2.

Figure 14. Comparison of daily production analysis models for Well 3.

Figure 15. Comparison of cumulative production analysis models for Well 3.

Table 1. LSTM-A model architecture, developed in Python 3.12.

Parameters	Values
LSTM Layers	2
LSTM Hidden Size	128
Transformer Layers (Encoder)	6
Transformer Layers (Decoder)	6
Transformer Hidden Size	128
Number of Attention Heads	8
Dropout Rate	0.3
Final Output Size	1

Table 2. Prediction accuracy evaluation of different features.

Model	Four Features	CP	BHP	WPR	DLL
MAE	71.03	63.64	93.53	63.66	142.85
RMSE	115.17	111.50	129.09	110.20	171.74
R²	0.56	0.58	0.44	0.59	0.01

Table 3. Prediction accuracy evaluation of different Dropout rates.

Dropout Rate	0.1	0.2	0.3	0.4
MAE	68.02	70.08	101.47	79.09
RMSE	113.86	110.19	135.13	119.40
R²	0.57	0.59	0.39	0.52

Table 4. Prediction accuracy evaluation of different time steps.

Time Step	5	10	15	20
MAE	137.73	144.14	115.21	101.47
RMSE	181.27	167.50	141.91	135.13
R²	−0.10	0.06	0.32	0.39

Table 5. Prediction accuracy evaluation of different layers.

Model	1 Layer	2 Layers	3 Layers	4 Layers
MAE	65.86	101.47	83.91	64.49
RMSE	109.45	135.13	122.36	109.57
R²	0.60	0.39	0.50	0.60

Table 6. Prediction accuracy evaluation of LSTM-A model with LSTM, Li–Horne model, exponential model, and harmonic model in production decline stage for Well 1.

Model	LSTM	LSTM-A	Exponential Model	Harmonic Model	Li–Horne Model
MAE	57.51	18.27	229.93	76.88	258.17
RMSE	74.70	37.98	242.30	84.97	268.80
R²	0.17	0.79	−7.70	−0.07	−11.51

Table 7. Prediction accuracy evaluation of LSTM-A model with LSTM, Li–Horne model, exponential model, and harmonic model in production decline stage for Well 2.

Model	LSTM	LSTM-A	Exponential Model	Harmonic Model	Li–Horne Model
MAE	21.63	7.75	67.17	100.34	33.02
RMSE	25.74	9.72	71.40	100.54	39.01
R²	0.41	0.92	−3.52	−7.96	−0.349

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
