1. Introduction
Some blocks of the Sulige gas field in China have entered a late stage of production and development. As development progresses, formation pressure gradually decreases, leading to significant declines in gas well production. Currently, the main gas-producing layers in this field have been extensively explored and developed, with a high degree of control over the well network in these areas, resulting in high reserve utilization. However, in the eastern part of the field, some non-dominant gas layers are affected by well control, resulting in scattered, point-like distributions. In certain localized areas, these layers show signs of relative enrichment, high well-encounter rates, and favorable physical properties, indicating potential for production enhancement. Therefore, it is necessary to predict the production capacity of these non-dominant gas wells to help further evaluate the potential for gas reservoir development.
Productivity prediction is an important step in evaluating gas well production capacity and in planning gas reservoir development. Currently, there are various methods for capacity prediction of gas wells in multi-layer reservoirs, such as the capacity test-well method [1,2,3] and analytical methods [4,5,6,7,8,9,10,11]. The capacity test-well method is widely used in gas field development, but it requires a long testing time, and only a few wells can be tested because of its cost. The analytical method derives a capacity equation, but this equation rests on many assumptions and cannot sufficiently account for the influence of the physical parameters of each gas-bearing formation on production, which leads to a large deviation between the predicted and actual values. Multi-layer tight sandstone gas reservoirs can provide high production after fracturing, but a large difference between the layers of the reservoir will cause gas well production to decrease rapidly. Further, as production time extends, the formation pressure decreases and the contribution of each gas-bearing formation to the total production varies, so conventional production prediction methods struggle to predict well production rates quickly and accurately. Therefore, it is necessary to find a suitable method for predicting the production capacity of non-dominant gas wells in tight gas reservoirs.
With the rapid development of artificial intelligence in recent years, many machine learning algorithms have been applied in oil and gas well production prediction [12,13,14,15,16,17,18,19]. Neural networks are among the most widely used machine learning algorithms, with flexible structures that adapt to different gas well productivity prediction tasks. From simple self-organizing maps and backpropagation networks to fuzzy, deep, and hybrid neural networks, various models have been applied to predict gas well capacity. For example, Christian Oberwinkler et al. [20] used a three-layer self-organizing map to predict one-year gas production from 200 tight gas wells, considering reservoir thickness, fluid volume, proppant type, and other features. Shelley et al. [21] established a neural network model based on geological and engineering parameters for oil production prediction from 301 fractured wells. Liu Hong et al. [22] developed a fuzzy neural network for reservoir properties and sand addition, while Guofan Luo et al. [23] created a deep neural network with four hidden layers to predict shale oil output, considering primarily numerical features. Shuhua Wang et al. [24] built deep neural networks with 11 numerical and 7 classification features for a cumulative oil forecast in the Bakken Formation. Besides neural networks, algorithms like support vector machines [25,26], random forests [27,28], and gradient boosting [29,30] have also been used. Support vector machines optimized via algorithms like grey wolf [31] and particle swarm [32] have also been employed to improve model accuracy, especially under limited data conditions.
In summary, gas well productivity prediction studies vary in models, variables, and data volume, but most focus on model training and optimization based on structured data. While sophisticated models can enhance accuracy, data limitations—especially with smaller datasets—restrict the predictive performance of transfer learning approaches.
To address these issues, this study introduces a multi-task approach to mitigate the problem of limited data samples. Based on reservoir properties of multi-layer tight sandstone gas reservoirs, fracturing data from gas wells, and production history data, a feature selection process for gas wells in non-dominant gas layers of tight sandstone gas reservoirs is established. A gas well production prediction method for multi-layer tight sandstone gas reservoirs (MTPLE), based on Progressive Layered Extraction (PLE), is proposed, enabling multi-task learning-based production capacity prediction for multi-layer tight sandstone gas reservoirs.
3. Productivity Prediction Model for Multi-Task Progressive Hierarchical Extraction
3.1. Feature Selection Method
Gas well productivity is affected by many factors, some of which may have a nonlinear relationship with production. In the data collection stage, as many relevant files as possible should be collected from the database of the gas field enterprise, and the required parameters extracted from them to establish a gas production database that provides sufficient, high-quality training data for modeling. In this paper, a feature selection process for the factors influencing gas well productivity in tight sandstone reservoirs is proposed, as shown in Figure 1.
3.1.1. Mutation Method to Obtain Daily Production of Each Gas-Bearing Layer
The productivity of multi-layer gas wells is affected by the reservoir properties, fracturing parameters, and production dynamic parameters of the wells. The non-dominant layer of multi-layer tight sandstone gas wells is often combined with the dominant layer for production, and there are differences in gas-bearing layers of each well, so if the reservoir properties and fracturing parameters of each layer are input into the production prediction model, it means that it is necessary to realize “one prediction model for one well”, which greatly increases the prediction difficulty and workload. Additionally, when the reservoir properties of two gas-bearing layers differ significantly, the reservoir properties of the layer with poorer properties are easily excluded when predicting gas well production capacity, leading to missing input data. For example, well A is a gas well with a three-layer tight sandstone reservoir, where the reservoir properties of secondary layer 3 are significantly inferior to those of secondary layers 1 and 2. Between 2009 and 2014, the well underwent four profile tests, which showed that secondary layer 3 contributed 23.2% to 30% of the gas well’s production, as shown in
Figure 2.
However, when conducting a Spearman correlation analysis between the commonly used reservoir properties of the well (including porosity, permeability, formation thickness, and gas saturation of each gas-bearing layer) and the cumulative gas production of the well, the Spearman correlation coefficient between the characteristics of layer 3 and the total production was significantly lower than that of the other two gas-bearing layers due to the poor reservoir properties of layer 3. Therefore, the reservoir properties of layer 3 cannot be used as feature inputs for the predictive model. Consequently, in the production capacity prediction of multi-layer tight sandstone gas reservoirs, it is necessary to segment the gas well’s production. By studying the correlation between the production capacity and features of each gas-bearing layer after segmentation, the input features for the production capacity prediction model of multi-layer tight sandstone gas wells can be determined.
Mutation theory (more widely known in English as catastrophe theory), founded by the French mathematician René Thom in the 1970s, draws on topology, singularity theory, and structural stability to study abrupt changes in the state of a system. The theory can also be applied to ranking or preferentially selecting different entities based on the same influencing factors. The most widely used models are the cusp, swallowtail, and butterfly mutation models [36], whose potential functions and divergence point set equations are shown in Table 1. In oil and gas research, mutation theory has been applied to reservoir evaluation, production splitting, and target optimization, and the production splitting of gas-bearing layers of multi-layered combined extraction wells has also achieved good results [37,38,39].
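The normalization formula can be derived from the divergence point set equation, and the total mutation affiliation function value of the system can then be found; for this, the state variables and control variables in the normalization formula must first be scaled to between 0 and 1. In their standard form, the normalization formulas for the three commonly used mutation models given above are as follows:

$$\text{Cusp:}\quad x_a = a^{1/2},\; x_b = b^{1/3}$$

$$\text{Swallowtail:}\quad x_a = a^{1/2},\; x_b = b^{1/3},\; x_c = c^{1/4}$$

$$\text{Butterfly:}\quad x_a = a^{1/2},\; x_b = b^{1/3},\; x_c = c^{1/4},\; x_d = d^{1/5}$$

where $a$, $b$, $c$, and $d$ are the control variables in decreasing order of influence, and $x_a$, $x_b$, $x_c$, and $x_d$ are the corresponding values of the state variable.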
The calculation proceeds in four steps. First, determine the control variables affecting the mutation model and rank them by the size of their influence. Second, establish the mutation model architecture from bottom to top, i.e., from the indicator layer to the criterion layer and then to the target layer. Third, after normalizing each control variable, calculate the mutation affiliation function value of each layer according to the mutation model that the layer satisfies. Finally, determine the system target value for the different evaluation objects.
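The bottom-up calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the standard normalization exponents of the catastrophe progression method (1/2, 1/3, 1/4, 1/5) and assuming that normalized values are averaged within a subsystem; the function name `membership`, the `complementary` switch, and all indicator values are hypothetical and not taken from the paper.

```python
# Sketch of the bottom-up mutation (catastrophe) progression calculation.
# Assumption: the i-th control variable (i = 1, 2, ...) is raised to 1/(i + 1),
# i.e. x_a = a^(1/2), x_b = b^(1/3), x_c = c^(1/4), x_d = d^(1/5).

def membership(controls, complementary=True):
    """Mutation membership value of one subsystem.

    controls: control variables already scaled to [0, 1], ordered from most
        to least influential (2 = cusp, 3 = swallowtail, 4 = butterfly).
    complementary: if True the normalized values are averaged; if False the
        smallest one is taken.
    """
    if not 2 <= len(controls) <= 4:
        raise ValueError("expected 2 to 4 control variables")
    if any(not 0.0 <= u <= 1.0 for u in controls):
        raise ValueError("control variables must lie in [0, 1]")
    # enumerate() is 0-based, so the exponent is 1 / (k + 2).
    normalized = [u ** (1.0 / (k + 2)) for k, u in enumerate(controls)]
    if complementary:
        return sum(normalized) / len(normalized)
    return min(normalized)


# Hypothetical, already normalized indicator values for one gas-bearing layer.
A = membership([0.8, 0.6, 0.7])        # three indicators: swallowtail
B = membership([0.5, 0.9, 0.4])        # three indicators: swallowtail
C = membership([0.7, 0.6, 0.8, 0.5])   # four indicators: butterfly
M = membership([A, B, C])              # three subsystems: swallowtail
print(f"A={A:.4f}, B={B:.4f}, C={C:.4f}, M={M:.4f}")
```

Because every exponent is below 1, each normalized value is at least as large as its input, so target values tend to cluster near 1; this is why the later splitting step compares layers by their relative target values rather than by absolute ones.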
3.1.2. De-Multicollinearity
Gas well productivity prediction often involves many input features, and high inter-feature similarity can lead to multicollinearity, distorting model estimates or reducing stability and thus causing overfitting. To address this, we apply two sequential screening steps:
(1) Spearman Correlation
We compute the Spearman rank correlation coefficient ($\rho$) between each feature and the target variable. Features whose coefficient exceeds the retention threshold (0.2 in Section 4.3.1) are retained to form feature subset A.
(2) Hampel Distance
We then evaluate the full-order nonlinear dependency of each feature on the target using the Hampel identifier. For feature $j$, the Hampel distance $H_j$ is defined as

$$H_j = \frac{\left| I_j - \operatorname{med}(I) \right|}{1.4826 \cdot \operatorname{MAD}(I)}$$

where $I_j$ is the biased mutual information estimate for feature $j$, $\operatorname{med}(I)$ is the median of all $I_j$ values, $\operatorname{MAD}(I)$ is the median absolute deviation of these $I_j$ values, and 1.4826 is the normalization factor that makes $1.4826 \cdot \operatorname{MAD}(I)$ a consistent estimator of the standard deviation under normality.
By the 3-sigma rule, which states that approximately 99.7% of values in a normal distribution lie within three standard deviations of the mean, features with $H_j > 3$ are deemed to have significant dependency on the target and are retained to form feature subset B.
Finally, we merge subsets A and B to create the candidate set C. To remove redundancy, we compute pairwise Spearman correlations within C; whenever the coefficient between two features exceeds the redundancy threshold, we discard the feature with lower task relevance. The remaining features constitute the final subset D.
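The two screening steps and the redundancy filter can be sketched with NumPy alone. This is an illustrative sketch under several assumptions: the mutual information estimates `mi` are supplied by whatever estimator is in use, the Hampel distance uses the absolute deviation from the median, task relevance is measured by the absolute Spearman coefficient with the target, and the redundancy threshold of 0.9 is a placeholder, since its value is not given here. All function names are hypothetical.

```python
import numpy as np


def rank(v):
    """Ranks 0..n-1. Ties are broken by position rather than averaged,
    which is adequate for continuous measurements."""
    order = np.argsort(v, kind="stable")
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(len(v))
    return r


def spearman(x, y):
    """Spearman coefficient computed as the Pearson correlation of ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])


def hampel_distance(scores):
    """|score - median| / (1.4826 * MAD) for every feature score."""
    scores = np.asarray(scores, dtype=float)
    med = np.median(scores)
    mad = np.median(np.abs(scores - med))
    if mad == 0:  # all scores (nearly) identical: nothing stands out
        return np.zeros_like(scores)
    return np.abs(scores - med) / (1.4826 * mad)


def screen(X, y, mi, rho_min=0.2, h_min=3.0, redundancy=0.9):
    """Return the column indices kept after the three screening steps."""
    rho = np.array([spearman(X[:, j], y) for j in range(X.shape[1])])
    subset_a = set(np.flatnonzero(np.abs(rho) > rho_min).tolist())
    subset_b = set(np.flatnonzero(hampel_distance(mi) > h_min).tolist())
    # Visit candidates from most to least relevant, so that the weaker
    # member of a redundant pair is the one discarded.
    candidates = sorted(subset_a | subset_b, key=lambda j: (-abs(rho[j]), j))
    kept = []
    for j in candidates:
        if all(abs(spearman(X[:, j], X[:, k])) <= redundancy for k in kept):
            kept.append(j)
    return sorted(kept)


# Hypothetical data: column 1 duplicates column 0, column 2 is unrelated.
x0 = np.arange(100.0)
X_demo = np.column_stack([x0, 2 * x0 + 1, np.abs(x0 - 49.5)])
y_demo = x0 ** 3
print(screen(X_demo, y_demo, mi=np.ones(3)))
```

Taking the union of A and B before removing redundancy means a feature needs to pass only one of the two relevance tests, which keeps features whose dependence on the target is strong but non-monotonic.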
3.1.3. Feature Importance Evaluation
In this study, the importance scores for each feature were evaluated using three different methods. First, the sequential backward selection method assessed feature importance based on the change in model performance after iteratively removing each feature, combined with importance scores derived from a random forest model. Features with higher importance scores contributed more significantly to the model’s accuracy. Second, recursive feature elimination was used to evaluate feature importance based on the absolute values of coefficients or importance metrics, where features with larger absolute values were considered more influential. Third, the SHAP algorithm quantified the contribution of each feature to individual predictions, and the average of the absolute SHAP values across all samples was used as a global importance measure. The scores from these three methods were normalized using min–max scaling and then combined through a weighted average to generate a comprehensive feature importance ranking. Based on this ranking, the top features were selected to form the feature subset, denoted as E, for subsequent modeling.
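The score fusion at the end of this procedure can be sketched as follows. The sketch assumes that the three raw score vectors are already available and uses equal weights, since the weights are not stated here; `min_max` and `combined_importance` are hypothetical names.

```python
import numpy as np


def min_max(scores):
    """Scale a score vector to [0, 1]; a constant vector maps to zeros."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def combined_importance(sbs, rfe, shap_mean_abs, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three min-max-normalized score vectors."""
    stacked = np.vstack([min_max(sbs), min_max(rfe), min_max(shap_mean_abs)])
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ stacked


# Hypothetical raw scores for four features on three different scales.
scores = combined_importance(
    sbs=[0.02, 0.10, 0.05, 0.01],
    rfe=[3.0, 9.0, 4.0, 1.0],
    shap_mean_abs=[0.4, 1.2, 0.9, 0.1],
)
ranking = np.argsort(-scores)  # feature indices, most important first
print(scores.round(3), ranking)
```

Normalizing each method separately before averaging prevents the method with the largest raw scale from dominating the combined ranking.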
3.1.4. Feature Number Optimization
To ensure the robustness of the feature selection process and prevent overfitting, all feature importance rankings and optimal feature subset selections were performed exclusively on the training dataset. For each candidate feature subset, a model was trained solely on training data, and its performance was evaluated within the training set using cross-validation. The feature subset yielding the best validation performance was identified as the optimal set. Importantly, the test dataset was kept completely independent and unused during the feature selection process, thereby avoiding any data leakage. The final model was trained with this selected feature set and directly tested on the test data.
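A leakage-free search over the number of features might look like the sketch below. It is an illustration only: an ordinary least-squares fit stands in for the actual model to keep the example dependency-free, the folds are contiguous blocks of the training rows, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def cv_r2(X, y, n_folds=5):
    """Mean validation R2 of a least-squares fit over contiguous folds."""
    indices = np.arange(len(y))
    scores = []
    for val_idx in np.array_split(indices, n_folds):
        train_idx = np.setdiff1d(indices, val_idx)
        design = np.column_stack([X[train_idx], np.ones(len(train_idx))])
        coef, *_ = np.linalg.lstsq(design, y[train_idx], rcond=None)
        pred = np.column_stack([X[val_idx], np.ones(len(val_idx))]) @ coef
        scores.append(r2_score(y[val_idx], pred))
    return float(np.mean(scores))


def best_feature_count(X_train, y_train, ranking):
    """Score the top-1, top-2, ... ranked features using training data only."""
    scores = [
        cv_r2(X_train[:, ranking[:k]], y_train)
        for k in range(1, len(ranking) + 1)
    ]
    return int(np.argmax(scores)) + 1, scores


# Synthetic training data: only the first two columns drive the target.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 4))
y_train = 2 * X_train[:, 0] + 2 * X_train[:, 1] + 0.01 * rng.normal(size=60)
k_best, cv_scores = best_feature_count(X_train, y_train, ranking=[0, 1, 2, 3])
print(k_best, [round(s, 4) for s in cv_scores])
```

The test set never appears in any of these functions, which is the property the paragraph above relies on: the chosen feature count is fixed before the held-out wells are touched.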
3.2. Progressive Hierarchical Extraction Methods
Multi-task learning aims to enhance generalization performance by sharing information among multiple tasks, and choosing an appropriate parameter-sharing method is essential. Existing parameter-sharing methods in multi-task learning mostly use hard parameter sharing, soft parameter sharing, or the Multi-gate Mixture-of-Experts (MMoE) and its variants; however, these methods have certain limitations. Hard parameter sharing does not take into account the sensitive relationships and essential differences among tasks: the model shares the same features while training for different prediction tasks, which can degrade overall task performance and lead to negative transfer. The soft-sharing approach requires training a model for each task, which is not parameter-efficient, and building and implementing it requires the operator to have rich prior knowledge. MMoE does not address the co-use of features by different tasks and places high requirements on task relevance.
To effectively harness relevant features and facilitate efficient feature sharing among multiple tasks, this study adopts the Progressive Layered Extraction (PLE) method [
40]. PLE aims to progressively extract and fuse features across multiple levels, enabling the model to preserve and leverage task-specific information while enhancing its generalization capability. The structure of the PLE model is shown in
Figure 3.
The PLE model typically consists of multiple hierarchical layers, each containing shared experts and task-specific experts. Shared experts are responsible for learning common features across all tasks, whereas task-specific experts focus on capturing unique information for individual tasks. In each layer, gating mechanisms dynamically fuse the outputs from shared and task-specific experts, producing task-specific feature representations. This layered extraction process captures multi-scale, hierarchical task-correlated features.
The primary advantage of this method lies in its ability to integrate multi-level features effectively, avoiding premature feature interference or loss. It enables the model to adaptively share features at different hierarchical levels, boosting both accuracy and robustness in multi-task scenarios. Additionally, the flexible gating mechanism allows tasks to dynamically adjust the contribution of different feature sources, improving overall model adaptability and generalization.
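To make the gating mechanism concrete, the following NumPy sketch runs a forward pass through one extraction layer for two tasks. It is a simplified illustration, not the paper's implementation: the experts are single linear maps, the weights are random rather than trained, only the task gates are shown, and all sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def make_experts(n, d_in, d_out):
    """n linear experts, each a (d_in, d_out) weight matrix."""
    return [rng.normal(scale=0.1, size=(d_in, d_out)) for _ in range(n)]


def gate_mix(x, gate_w, expert_outputs):
    """Fuse expert outputs with softmax weights computed from the input."""
    weights = softmax(x @ gate_w)                     # (batch, n_experts)
    stacked = np.stack(expert_outputs, axis=1)        # (batch, n_experts, d_out)
    return np.einsum("be,bed->bd", weights, stacked)  # (batch, d_out)


def extraction_layer(x, params):
    """Each task gate sees its own experts plus the shared experts only."""
    shared = [x @ w for w in params["shared"]]
    fused = {}
    for task in ("task1", "task2"):
        own = [x @ w for w in params[task]]
        fused[task] = gate_mix(x, params["gate_" + task], own + shared)
    return fused


d_in, d_hidden, n_own, n_shared = 7, 16, 2, 2
params = {
    "shared": make_experts(n_shared, d_in, d_hidden),
    "task1": make_experts(n_own, d_in, d_hidden),
    "task2": make_experts(n_own, d_in, d_hidden),
    "gate_task1": rng.normal(size=(d_in, n_own + n_shared)),
    "gate_task2": rng.normal(size=(d_in, n_own + n_shared)),
}
x = rng.normal(size=(4, d_in))  # four samples, seven input features
features = extraction_layer(x, params)
print(features["task1"].shape, features["task2"].shape)
```

The key design point is visible in `extraction_layer`: the gate for task 1 never receives the outputs of task 2's experts, so the only route for cross-task information is through the shared experts, which is how the method limits interference between tasks.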
3.3. MTPLE Capacity Forecasting Model
To improve the accuracy of capacity prediction, this study introduces the Multi-Task Progressive Layered Extraction (MTPLE) model. As shown in
Figure 4, the MTPLE model leverages multi-task learning and progressive hierarchical feature extraction to build a robust, generalizable forecasting framework.
The model consists of multiple hierarchical layers. Input features are first preprocessed and then passed into expert networks at each layer, which include both shared experts and task-specific experts. A gating mechanism dynamically fuses the outputs of these experts, generating hierarchical features tailored to each task. These features are subsequently processed by task-specific tower networks to produce the final predictions.
The model is designed to address two main prediction tasks: Task 1 involves predicting the daily average production of each gas-bearing layer during the first month after well commencement. Task 2 focuses on predicting the cumulative production of each gas-bearing layer over the first year. The selected features are fed into the MTPLE structure, where different expert networks and shared networks are constructed for the two tasks. The outputs from these networks are fused via gating units and further processed by tower networks to generate task-specific predictions.
To ensure a balance between computational efficiency and predictive accuracy, the extraction network is configured with two layers, and linear activation functions are applied to the expert and shared network layers. The tower networks are implemented as Deep Neural Networks (DNNs) with 64 units each. The initial learning rate is set to 0.001, decaying by 0.2 every 10 iterations, with a maximum of 100 iterations. The performance metric used for network optimization is the coefficient of determination (R²).
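The learning-rate schedule can be written as a step decay. The sketch below assumes that "decaying by 0.2" means the rate is multiplied by 0.2 at each step; if it instead means a 20% reduction, the factor would be 0.8. The function name and its parameters are hypothetical.

```python
def step_decay(iteration, initial_lr=0.001, factor=0.2, step=10):
    """Learning rate at a given iteration under step decay.

    Assumption: the rate is multiplied by `factor` once every `step`
    iterations; pass factor=0.8 for the alternative reading.
    """
    return initial_lr * factor ** (iteration // step)


# Rates at the start of each 10-iteration block, up to the 100-iteration cap.
for it in range(0, 100, 10):
    print(it, f"{step_decay(it):.3e}")
```

Under the multiplicative reading the rate falls by roughly six orders of magnitude over 100 iterations, so most of the learning happens in the first few blocks.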
4. Example Analysis of Gas Well Productivity Prediction in Multi-Layer Tight Sandstone Gas Reservoirs
4.1. Data Source and Preprocessing
This study analyzes a three-layer tight sandstone gas reservoir. The reservoir comprises 66 production wells, with non-main gas-producing layers including the He 8–1 section, Shan 2–3 section, and Benxi Formation. Based on the collected data, we extracted a categorized set of 43 feature parameters covering key geological, fracturing, and operational factors.
Since there are no missing values, no imputation was performed. Categorical features (such as the production layer location) were encoded using one-hot encoding. Continuous features (such as permeability and pressure) were standardized. Considering that all engineering parameters (such as orifice size, casing pressure, tubing pressure) are at well level, we can use the layer capacity results obtained from the mutation method to perform a “virtual correction” and estimate each layer’s parameters accordingly. Specifically, the method assumes
This approach allows us to derive layer-specific estimates for these well-level parameters based on the production capacity split, providing more realistic features for subsequent modeling. Taking the features for task 1 as an example, they include geological parameters and fracturing parameters for each layer, as well as the virtual layer engineering parameters derived from the mutation method. These features allow for a more precise representation of each layer’s capacity in the first month, providing targeted input variables for the model.
Table 2 shows the range of values for some features in task 1.
4.2. Calculation of Gas Well Production by the Mutation Method
The process of producing natural gas is regarded as a mutation phenomenon under the joint influence of reservoir physical properties, fracturing effect, and production dynamic parameters. Utilizing the mutation theory, the gas-bearing layer target state value is obtained to evaluate the production state of each gas-bearing layer. According to the experience of gas reservoir development, the production splitting coefficient of each layer in multi-layer combined gas wells is considered a separate system, controlled by ten internal factors categorized into three subsystems, as shown in
Figure 5. The reserve characteristics subsystem (A) includes three factors: effective thickness (A1), porosity (A2), and gas saturation (A3), and is modeled as a swallowtail mutation. The development characteristic subsystem (B) comprises permeability (B1), interlayer interference coefficient (B2), and mid-layer pressure (B3), also forming a swallowtail mutation model. The geological features subsystem (C) consists of sedimentary microfacies (C1), sandstone content (C2), reservoir density (C3), and gas layer depth (C4), which constitute a butterfly mutation model. The three subsystems—reserve characteristics, development characteristics, and geological features—together form the overall mutation system.
Because not all non-dominant gas wells in the target block have undergone gas production profile testing, to verify the rationality of the mutation method, eight gas wells with gas production profile detection records were selected for calculation. Taking Well Y as an example, the calculation process of the mutation method is described.
Table 3 shows the values of the influence factor statistics for Well Y.
Taking one relative mutation surface as an example, the reserve characteristics subsystem, which includes effective thickness, porosity, and gas saturation, forms a swallowtail mutation model. Based on the obtained target value of each layer, $M_i$, and the target value of the relative mutation surface, $M'$, the production splitting coefficient of each layer is calculated using Equation (3).
The development feature subsystem of this layer constitutes a swallowtail mutation; as calculated above, B = 0.7014. The geological feature subsystem constitutes a butterfly mutation; as calculated above, C = 0.9033. The three subsystems constitute a swallowtail mutation; therefore, the target value of the relative mutation surface system is M = 0.9070. Using the same method, we obtain the system target values for each production layer, as shown in
Table 4. Next, we calculate the yield-splitting coefficient through Equation (5), and then obtain the production capacity for each sublayer.
The contribution of each sublayer to gas well productivity was also analyzed using mutation theory for the remaining seven wells. The calculation results show that the error between the layered production contributions calculated by the mutation method and those measured in the gas production profile tests is less than 15%, which meets the requirements of engineering calculations, as shown in Figure 5. Therefore, the mutation method was used to split the production of the 66 combined-production wells in the non-dominant layers of the multi-layer tight sandstone gas field and obtain the daily production of each gas-bearing layer.
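The splitting and checking steps can be illustrated as below. The sketch assumes that each layer's splitting coefficient is its system target value divided by the sum of the target values of all layers in the well; the splitting equation itself is not reproduced in this section, so this proportional form is an assumption. Layer names, target values, and the well rate are hypothetical.

```python
def split_production(well_rate, layer_targets):
    """Allocate a well's production to its layers.

    Assumption: the splitting coefficient of a layer is its system target
    value divided by the sum of target values over all layers of the well.
    """
    total = sum(layer_targets.values())
    if total <= 0:
        raise ValueError("target values must sum to a positive number")
    return {layer: well_rate * m / total for layer, m in layer_targets.items()}


def relative_error(calculated, measured):
    """Relative deviation of a calculated share from a profile-test share."""
    return abs(calculated - measured) / measured


# Hypothetical three-layer well producing 10.0 (arbitrary rate units).
targets = {"layer_1": 0.9, "layer_2": 0.6, "layer_3": 0.5}
rates = split_production(10.0, targets)
print({k: round(v, 3) for k, v in rates.items()})
print(round(relative_error(rates["layer_3"], measured=2.7), 3))
```

Applying `relative_error` layer by layer against a gas production profile test is the kind of check summarized by the 15% figure above.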
4.3. Feature Selection Results
According to the feature selection process, the reservoir properties and fracturing parameters described in Section 3.1 were used as independent variables. The per-layer gas productivity ratios were obtained after layer-wise splitting using the mutation method. These ratios were then used as target variables to derive open-flow capacities and first-year cumulative production for each gas-bearing layer, based on which the indicators for tasks 1 and 2 were selected. Since the feature selection process is the same for different prediction tasks, we only describe in detail the feature selection for task 1.
4.3.1. De-Multicollinearity Results
The Spearman correlation coefficient method was used to evaluate the 43 features against the task 1 target, retaining those with a correlation greater than 0.2 to form feature subset A (22 features). Due to the large number of features, the Spearman correlation plot is not shown in this paper. In parallel, the Hampel distance was used for feature evaluation: features with a Hampel distance greater than 3 were selected to form feature subset B (20 features). The Hampel distance calculation results are shown in Figure 6. The union of A and B was taken to form C (20 features). The Spearman correlation coefficient method was then used to remove redundant features, reducing the number of input features to 16 and forming feature set D.
4.3.2. Results of the Feature Importance Calculation
Sequential backward selection, recursive feature elimination, and SHAP were used to calculate the importance scores of the features in D. The weighted average of the three sets of scores after 0–1 normalization was used to obtain the final feature importance scores for the 16 features in task 1, as shown in Figure 7.
4.3.3. Determine the Number of Task Features
To determine the number of task-specific features, we ranked the features by their combined importance score and then evaluated model performance by sequentially using the top 1, top 2, …, top 16 features as inputs. For each feature subset, we trained a Random Forest (max_depth = 3, n_estimators = 1000) and recorded the R² on the validation set (see Figure 8). The R² curve peaked at seven features, which therefore constituted the optimal feature set E for task 1. The same procedure produced feature set F for task 2. The final feature set U was formed by taking the union of E and F (duplicates removed) and used as input to the MTPLE model. The resulting input features were casing pressure, gas layer thickness, permeability, fracturing sand intensity, production time, gas nipple size, and total fracturing fluid volume.
4.4. Validity Analysis of MTPLE Model
To validate the effectiveness of the proposed Multi-Task Progressive Layered Extraction (MTPLE) model, we designed a comprehensive experiment to compare its performance with that of single-task neural network models. The single-task models are trained individually at the end of the PLE structure for each task (task 1 and task 2), with identical hyperparameters to ensure a fair comparison.
For data partitioning, the first 56 wells were used as the test set, and the remaining 10 wells as the training set. All neural network models were trained for 1000 epochs to allow sufficient convergence. Given the inherent randomness in neural network training—such as weight initialization and mini-batch sampling—each model configuration was trained 100 times to improve robustness and reduce stochastic variability. This resulted in repeated training of the same model architecture, each time starting from different initial conditions, and generating slightly different models.
Considering the use of 5-fold cross-validation to better evaluate model stability and generalization, the entire training process was conducted across five different data splits, with each fold serving as the validation set once, while the remaining four folds served as training data. Each fold was trained 100 times with different initializations, leading to a total of 5 folds × 100 repetitions = 500 trained models for each task and each model type (MTPLE and single-task).
The performance of the models was assessed by predicting across these 500 models for each sample, and then averaging the results to obtain the final predicted values. This approach minimizes the influence of randomness and provides a more reliable estimate of the models’ predictive capabilities.
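The averaging step, together with a simple measure of spread across the repeated runs, can be sketched as follows. The sketch assumes that the predictions of all trained models are stacked into one array with one row per model; the function name and the synthetic numbers are hypothetical.

```python
import numpy as np


def summarize_ensemble(predictions, lower=10, upper=90):
    """Mean prediction and width of the lower-to-upper percentile band.

    predictions: array of shape (n_models, n_samples), one row per model.
    """
    predictions = np.asarray(predictions, dtype=float)
    mean = predictions.mean(axis=0)
    lo, hi = np.percentile(predictions, [lower, upper], axis=0)
    return mean, hi - lo


# Synthetic example: 5 folds x 100 repetitions = 500 models, 3 samples.
rng = np.random.default_rng(0)
true_values = np.array([1.0, 2.0, 3.0])
preds = true_values + rng.normal(scale=0.1, size=(500, 3))
mean_pred, band_width = summarize_ensemble(preds)
print(mean_pred.round(3), band_width.round(3))
```

A narrower band for the same sample indicates that the model is less sensitive to initialization and data split, which is a convenient way to compare the stability of two model types.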
The results in
Table 5 show that for task 1, the prediction errors were generally less than 15%, while for task 2, errors remained below 20%. These levels of accuracy satisfy the typical requirements for capacity prediction in this geological block.
Figure 9 and Figure 10 compare the 500 predicted values generated by the MTPLE model and the single-task DNN models for tasks 1 and 2, respectively, under 5-fold cross-validation. The dashed black box marks the range between the 10% and 90% quantiles of the predicted values; the narrower this range, the more stable the model predictions. In both tasks, the 10% to 90% quantile interval of the MTPLE predictions is narrower than that of the DNN model, and the mean value is closer to the actual yield, which indicates that information sharing between tasks acts as a form of data enhancement in the MTPLE model and improves the accuracy and stability of productivity prediction.
Table 6 compares the prediction errors of the MTPLE and DNN models. Compared with the DNN model, the RMSEs of the two tasks of MTPLE are reduced by 40% and 25.35%, respectively, indicating that the prediction performance of MTPLE is significantly better than that of the single-task model.
4.5. Performance Comparison Between MTPLE and Classical Machine Learning Models
To further evaluate the performance of the MTPLE model, four classical machine learning algorithms—K-Nearest Neighbor (KNN), Random Forest (RF), Support Vector Machine (SVM), and XGBoost—were also implemented. All models utilized the feature selection results to independently predict the two main tasks: task 1, predicting the capacity of each gas-bearing layer in the first month, and task 2, predicting the capacity of each layer during the first year. Since these traditional algorithms generally support single-task prediction, separate models were trained for each task. The hyperparameters were optimized via Bayesian optimization, which constructs a probabilistic surrogate model of the objective function to efficiently explore the hyperparameter space. The search ranges for each model’s hyperparameters were set as follows: for KNN, the number of neighbors (K) varied from 1 to 20; for SVM, the kernel included ‘linear’, ‘rbf’, and ‘poly’, with the regularization parameter C set between 0.1 and 1000, and gamma between 0.01 and 10; for Random Forest, the number of trees (n_estimators) ranged from 100 to 1000, and the maximum depth (max_depth) between 5 and 30; for XGBoost, the learning rate (learning_rate) was between 0.01 and 0.2, the maximum depth (max_depth) between 3 and 10, and the number of estimators (n_estimators) between 100 and 1000. Each model was run 500 times with different initializations, and the prediction results were averaged for stability. All models were evaluated through 5-fold cross-validation to ensure robustness and reduce bias.
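The search ranges listed above can be recorded as a plain configuration table. The sketch below only encodes the ranges and draws random configurations from them, as one would to seed an optimizer; it does not implement the Bayesian surrogate model. The tuple format, the `sample_config` helper, and the `n_neighbors` name used for K are assumptions.

```python
import random

# Search ranges as stated in the text; ("int" | "float", low, high) or
# ("choice", options). Both bounds are treated as inclusive.
SEARCH_SPACE = {
    "KNN": {"n_neighbors": ("int", 1, 20)},
    "SVM": {
        "kernel": ("choice", ["linear", "rbf", "poly"]),
        "C": ("float", 0.1, 1000.0),
        "gamma": ("float", 0.01, 10.0),
    },
    "RF": {
        "n_estimators": ("int", 100, 1000),
        "max_depth": ("int", 5, 30),
    },
    "XGBoost": {
        "learning_rate": ("float", 0.01, 0.2),
        "max_depth": ("int", 3, 10),
        "n_estimators": ("int", 100, 1000),
    },
}


def sample_config(model, rng):
    """Draw one random configuration from the model's search space."""
    config = {}
    for name, spec in SEARCH_SPACE[model].items():
        kind = spec[0]
        if kind == "int":
            config[name] = rng.randint(spec[1], spec[2])
        elif kind == "float":
            config[name] = rng.uniform(spec[1], spec[2])
        else:  # "choice"
            config[name] = rng.choice(spec[1])
    return config


rng = random.Random(0)
for model in SEARCH_SPACE:
    print(model, sample_config(model, rng))
```

Keeping the ranges in one table makes it straightforward to hand the same bounds to whichever optimization library is used and to check that every tuned model stayed inside them.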
Figure 11 and Figure 12 show the prediction results of MTPLE and of the Support Vector Machine, K-Nearest Neighbor, Random Forest, and XGBoost models for tasks 1 and 2, respectively. The horizontal axes are the actual initial production and the actual first-year cumulative production, respectively, the vertical axis is the model prediction, and the black dashed line is the 45° diagonal; the smaller the angle between the line fitted to the predictions and the diagonal, the higher the prediction accuracy. The MTPLE predictions lie closer to the 45° diagonal for both tasks and are superior to those of the classical machine learning algorithms.
Table 7 compares the MSE and R² of the Support Vector Machine, K-Nearest Neighbor, Random Forest, XGBoost, and MTPLE models for tasks 1 and 2; the MTPLE model achieves the minimum error in both tasks. With limited training data, the traditional machine learning algorithms are restricted to single-task prediction, which makes it difficult for them to fully learn how production varies; thus, their prediction accuracy is significantly lower than that of the MTPLE model.
The prediction model proposed in this paper can be used not only for predicting the productivity of gas wells in non-dominant multi-layer tight sandstone gas reservoirs, but also for predicting the productivity of other multi-layer gas wells and multi-layer oil wells after changing the appropriate input characteristics.
5. Influence of Different Characterization Categories on the Results of Production Capacity Prediction
Reservoir physical properties represent the geological characteristics of gas wells; fracturing parameters reflect the formation and propagation of fractures, which in turn affect production capacity; and engineering parameters represent the impact of changes in the working system on production capacity. To clarify the influence of the different feature categories and to reveal the relationship between features and production capacity, the feature categories listed above were input into the MTPLE prediction model and their influence on gas well production capacity was assessed.
Figure 13 compares the prediction performance of the MTPLE model with different feature categories as input. The production dynamic parameters contribute the least to the performance of task 1, which is caused by the relatively consistent pressure drop in each well at the beginning of production and the small change in the gas nipple size. The production dynamic parameters make the largest contribution to task 2, which may be due to the good correlation between casing pressure and gas well productivity in long-term production; in addition, operational changes, such as opening and shutting in wells and replacing the gas nipple, provide dynamic information related to production fluctuations, which constrains the overall trend in long-term production. Geological parameters have a small effect on both tasks 1 and 2, which reflects the strong heterogeneity and strong interval variability of tight sandstone gas reservoirs. The large difference between the mean and median values of the reservoir physical properties in Table 1 indicates that these properties are scattered widely between their maximum and minimum values, whereas the difference between the mean and median of the open-flow capacity and of the first-year cumulative production is very small. As a result, these properties correlate poorly with the open-flow capacity and the first-year cumulative production; human error in the initial data entry may also contribute. The influence of fracturing parameters on tasks 1 and 2 is larger, which indicates that the fracturing parameters reflect the production status of gas wells during both the initial and long-term production periods, and that reasonable fracturing parameters have a greater influence on the productivity of gas wells in multi-layered tight sandstone reservoirs.