1. Introduction
Seismic exploration is a key technique for reservoir identification and prediction and is widely used in oil and gas exploration and development worldwide. By analyzing how seismic waves propagate through subsurface strata, it provides critical geological data for reservoir identification, prediction, and development [1,2,3]. In hydrocarbon exploration, predicting sandstone reservoirs poses multiple challenges. A study of deep tight sandstones in the Qigu Formation of the Junggar Basin by Tian Lei et al. (2024) suggests that deposition controls the distribution and composition of sandstone bodies and is therefore the basic contributor to reservoir heterogeneity, whereas diagenesis is a key factor that affects the heterogeneity of deep tight sandstones by modifying the structure and type of sandstone pores [4]. A study by Guo Song et al. (2013) of the beach-bar sandstones of the Boxing Oilfield in the Bohai Bay Basin shows that these reservoirs are highly heterogeneous because of thinly interbedded sandstone and mudstone, that their hydrocarbon-bearing properties vary significantly among well areas and sandstone formations, that reservoir quality is closely correlated with hydrocarbon-bearing properties, and that the degree of hydrocarbon filling and the hydrocarbon distribution are directly controlled by reservoir heterogeneity [5]. Sandstone reservoirs are typically highly heterogeneous, with complex porosity and permeability distributions, and are affected by multiple factors such as depositional environment and tectonic deformation. These factors make sandstone reservoirs very difficult to predict.
Favorable seismic attributes have traditionally been selected using techniques such as correlation coefficient analysis, principal component analysis, and multiple regression, which are standardized, easy to implement, and computationally efficient. These techniques suit intuitive interpretation based on geological knowledge and can effectively support reservoir prediction when the number of seismic attributes is small and the seismic response is relatively simple. However, as the number of attribute types and the data dimensionality have grown, these methods rely too heavily on linear correlation and human experience and struggle to characterize complex inter-attribute relationships. Furthermore, because of collinearity and redundancy among seismic attributes, the selection results are easily affected by changes in samples and parameters, which limits the stability and applicability of these methods [6,7]. Traditional attribute selection can therefore barely meet the needs of accurate reservoir prediction, and more intelligent, data-driven attribute selection and fusion techniques are needed. In addition, statistics on frequency attributes show that the dominant frequency of seismic data is usually in the range of 20–40 Hz (peak frequency of about 25–30 Hz), which weakens the relationship between traditional seismic attributes and sandstone reservoirs and limits the accuracy of reservoir prediction based on a single seismic attribute [8,9]. The relatively low contribution of frequency-related attributes reported in Section 4.1 is likely influenced by this limited seismic bandwidth (dominant frequency of ~25–30 Hz) and the corresponding vertical resolution, which may make frequency attributes less sensitive to subtle thickness variations than amplitude-related attributes. The lower weights therefore reflect data-dependent sensitivity rather than inherent geological irrelevance. More importantly, the seismic responses of sandstone reservoirs are often highly ambiguous: different seismic attributes may lead to different interpretations of the same reservoir. For this reason, traditional seismic attribute analysis methods have significant limitations in accuracy and reliability and struggle to meet the needs of accurate reservoir prediction [10,11,12].
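To make the bandwidth argument concrete, the following minimal sketch applies the common quarter-wavelength rule of thumb for vertical resolution, λ/4 = v/(4f), to the dominant frequencies quoted above. The interval velocity of 3000 m/s and the function name `tuning_thickness` are illustrative assumptions only; they are not values or code from this study.

```python
def tuning_thickness(velocity_m_s: float, dominant_freq_hz: float) -> float:
    """Quarter-wavelength (tuning) thickness in meters: v / (4 * f)."""
    if velocity_m_s <= 0 or dominant_freq_hz <= 0:
        raise ValueError("velocity and frequency must be positive")
    return velocity_m_s / (4.0 * dominant_freq_hz)


# Hypothetical interval velocity; NOT taken from the study.
ASSUMED_VELOCITY_M_S = 3000.0

for freq_hz in (25.0, 30.0):
    thickness_m = tuning_thickness(ASSUMED_VELOCITY_M_S, freq_hz)
    print(f"{freq_hz:.0f} Hz -> quarter wavelength of about {thickness_m:.1f} m")
```

Under this assumed velocity, beds thinner than a few tens of meters fall below the quarter-wavelength limit, which is one way to see why frequency-derived attributes may respond weakly to fine-scale thickness changes.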
Accurately capturing and interpreting reservoir features with more advanced seismic exploration techniques has therefore become key to improving the efficiency of hydrocarbon exploration and development. In recent years, rapidly developing deep learning techniques have provided new solutions; in particular, convolutional neural networks (CNNs) and attention mechanisms have achieved breakthroughs in many fields. According to Zhang Chenjia et al. (2021), CNNs integrated with attention mechanisms can adaptively weight features along dimensions such as channel and space, and their enhanced feature representation enables better results than traditional methods in complex tasks such as image recognition, demonstrating the advantages of deep learning in automatic feature extraction and nonlinear modeling [13]. CNNs can automatically extract multiscale spatial features from raw seismic data [14], while attention mechanisms highlight key features and suppress irrelevant information by dynamically weighting input features, significantly improving the performance of CNN models in complex tasks [15]. These deep learning techniques have shown great potential for image processing in seismic exploration.
In this paper, a seismic attribute fusion method based on a multiscale CNN and a self-attention mechanism is proposed to address the accuracy and reliability issues in sandstone reservoir prediction. This method can improve the accuracy of reservoir prediction by extracting multiscale features from seismic data through the multiscale CNN and by performing weighted fusion of features through the attention mechanism. It enables more effective extraction of key features from seismic data, providing more reliable technical support for accurately predicting sandstone reservoirs.
In summary, existing works on the prediction of complex sandstone reservoirs still have several deficiencies. First, they often use a single seismic attribute or a small number of seismic attributes for linear modeling, making it difficult to fully explore the nonlinear relationships between multi-source seismic attributes and resulting in limited accuracy of reservoir prediction. Second, they rely primarily on manual experience or simplified statistical indicators for the selection of favorable seismic attributes, lacking a unified framework for quantitative characterization of contributions from seismic attributes. Third, most of the studies on the sandstones in the Lower Talang Akar Formation (LTAF) of the B field in the South Sumatra Basin focus on geological understanding and conventional seismic interpretation, lacking deep learning-driven practices in sandstone thickness prediction. Therefore, an intelligent seismic attribute characterization method integrating a multiscale CNN and a self-attention mechanism was proposed to perform sandstone thickness prediction and seismic attribute selection for the LTAF sandstone reservoir in the B field, and its effectiveness was verified using actual production data.
The main contributions of this study are threefold. (1) We design a multiscale CNN backbone with parallel receptive fields to capture seismic–geological features at multiple spatial scales, which is critical for heterogeneous channelized sandstone reservoirs. (2) We incorporate a self-attention module to adaptively fuse multi-attribute features and mitigate redundancy and noise among seismic attributes. (3) We propose a dual-output strategy that produces both a thickness prediction map and quantitative attribute importance scores, enabling joint prediction and interpretation within a unified framework. Compared with existing research, our method improves both prediction accuracy and interpretability. Recent studies have demonstrated the advantages of integrating seismic attributes with machine learning for reservoir prediction, yet limitations remain. For example, multi-attribute regression frameworks have been successfully applied to predict porosity and shale volume in complex deltaic settings, and feature importance analysis has been used to interpret influential seismic attributes [16,17]. However, these approaches do not fully exploit multiscale seismic information or attention mechanisms. More recently, deep learning-driven multi-frequency seismic inversion methods that combine multiscale convolutional modules with self-attention have improved the characterization of thin-layer stratigraphy, illustrating the benefits of capturing contextual relationships in seismic data [18,19,20]. Building on these advances, our framework incorporates multiscale learning and self-attention to fuse diverse seismic attributes and enhances interpretability through a dual-output design that provides both thickness prediction maps and attribute importance scores, features that conventional methods often lack.
2. Geological Setting and Overview of the Study Area
2.1. Regional Geological/Tectonic Setting and Location of the Study Area
Located in southeastern Sumatra, the South Sumatra Basin is one of the most important hydrocarbon-bearing basins in Indonesia and one of the regions with the greatest hydrocarbon potential in the world [21]. The formation of this basin is closely related to its unique tectonic setting, and it has undergone multiple stages of tectonic evolution. Its oil and gas resources are mainly concentrated in medium-deep sandstone reservoirs, with great potential and value for exploration and development [22]. The abundant oil and gas resources in the basin are closely related to its complex geological structure, depositional environment, and petrophysical properties. Therefore, systematically studying the geological characteristics of the basin is of great significance for hydrocarbon exploration and development.
The B field is situated in the core area of the South Sumatra Basin. The basin’s tectonic evolution can be roughly divided into three stages, namely, the rift stage, the post-rift and back-arc subsidence stage, and the inversion and uplift stage. During the rift stage, the subduction of the West Sumatra Trench caused the basin to extend, creating a series of half-grabens. During the post-rift stage, tectonic activity weakened, the subsidence of the basin intensified, the sea level rose, and the basin entered the marine transgression stage. During the inversion and uplift stage, the basin experienced multistage structural inversion accompanied by the uplift of the Barisan Mountains and the formation of compressional-torsional folds. These structural inversion events facilitated the formation and adjustment of various types of reservoirs, providing favorable conditions for hydrocarbon accumulation and preservation [23].
2.2. Stratigraphic and Sedimentary Characteristics
The South Sumatra Basin has a diverse and complex geological framework, and the evolution of its depositional environment and tectonic activity have profoundly influenced the formation and distribution of reservoirs. The depositional environment of the basin has evolved from a lacustrine environment to a shallow sea environment and then to a deep-sea environment. In particular, multistage marine transgressions have significantly affected the spatiotemporal distribution of the basin’s strata and the development of reservoirs in the region.
The main sedimentary units of the basin include the pre-rift basement, the rift-stage Lahat, Lemat, and Talang Akar formations, and post-rift formations such as Batu Raja, Gumai, Air Benakat, and Muara Enim. The significant differences in sediment type and tectonic setting among these formations directly control the degree of source rock development and the spatiotemporal distribution of reservoirs. The Talang Akar Formation and the Gumai Formation differ significantly in sediment characteristics and lithology. The Gumai Formation is dominated by marine sediments and volcaniclastic rocks, with well-developed secondary pores providing ample space for hydrocarbon storage. In contrast, the depositional system of the Talang Akar Formation (the main stratigraphic unit studied herein) is dominated by deltaic and fluvial facies, with favorable facies zones such as delta fronts and channel deposits. The sandstones of the Talang Akar Formation generally have high porosity and high permeability, forming a series of high-quality sandstone reservoirs in the delta-front and channel-deposit zones [24]. Because the two formations differ in sedimentary facies and lithology, the study area hosts multiple reservoir types and various hydrocarbon accumulation patterns, giving it great resource potential and broad prospects for subsequent hydrocarbon exploration and development.
In this study, the target interval is the lower sandstone interval of the Talang Akar Formation in the B gas field (hereafter abbreviated as the LTAF interval). This abbreviation is used for convenience within the study area. The interval is mainly distributed in the rift zone and on the slopes on both sides of the rift zone. In terms of lithology, it is dominated by medium- to fine-grained sandstone interbedded with a small number of thin conglomerate and mudstone layers, and channel sandstone is the most prevalent type of reservoir rock. In terms of depositional environment, the LTAF interval was formed in a lacustrine-deltaic depositional system with multiple sedimentary facies, including braided channels, meandering channels, crevasse splays, and floodplains. The delta-front sandstones and channel-filling sandstones generally have high porosity and high permeability, making them favorable facies zones for hydrocarbon accumulation. The thickness of the LTAF interval reflects the coupled effects of tectonic and depositional processes. It is relatively thick in the central depression zone, reaching tens of meters to nearly 100 m, and it gradually thins towards structural highs and basin margins until it pinches out. Controlled by both sedimentary facies distribution and faulting, the LTAF sandstone reservoir is highly heterogeneous both horizontally and vertically, with significant variations in the degree of hydrocarbon charge and in gas-bearing properties among sections. Therefore, it is necessary to accurately predict the distribution and thickness of the LTAF sandstone body from seismic attributes using intelligent algorithms in order to support adjustment of the mid- to late-stage development strategy of the B field (Figure 1).
2.3. Research Data Overview
Reservoir prediction in this study was performed using 3D seismic data and well logs from the study area. The 3D seismic volume covers approximately 80 km², with a trace spacing of 20 m × 20 m and a sampling interval of 2 ms, which satisfies the resolution requirements for reservoir-scale geological analysis. The seismic data contain stable and effective frequency information within the target interval. Sandstone thickness labels interpreted from logs in 113 wells (out of 237 wells in the study area) were used as supervised-learning targets.
The 2D input slices were formed as follows. For each seismic attribute volume, 2D slices were extracted within the target interval bounded by the picked top and base horizons. The 18 selected seismic attributes were stacked along the channel dimension to form multi-channel inputs. Each full 2D slice is defined on a regular grid with a spatial sampling of 20 m × 20 m and a size of H × W = 876 × 712 grid nodes, where H = 876 and W = 712 are the numbers of nodes in the J- and I-directions, respectively. For model training, we constructed well-centered 2D patches cropped from these slices and stacked the same 18 seismic attributes to form patch tensors of shape h × w × 18 (where h and w denote the patch dimensions). Each attribute channel was normalized using z-score normalization based on training-set statistics. The label for each patch was the log-interpreted sandstone thickness at the corresponding well location, which was used to supervise thickness prediction.
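The well-centered patch construction described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `crop_patch`, the zero-padding at grid edges, the patch half-width, and the toy 5 × 6 grid with two attributes are all assumptions made for the example. The study itself uses 18 attributes on an 876 × 712 grid and does not state the patch size or the edge handling.

```python
from typing import List

Grid = List[List[float]]  # one attribute slice, indexed [row][col]


def crop_patch(slices: List[Grid], row: int, col: int, half: int) -> List[List[List[float]]]:
    """Crop a (2*half+1) x (2*half+1) window centered on the well node (row, col).

    Returns a channel-last patch indexed [r][c][attribute]. Cells that fall
    outside the grid are zero-padded (an assumption of this sketch).
    """
    n_rows, n_cols = len(slices[0]), len(slices[0][0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            patch_row.append([grid[r][c] if inside else 0.0 for grid in slices])
        patch.append(patch_row)
    return patch


# Toy data: 2 attribute slices on a 5 x 6 grid (the study uses 18 attributes).
toy_slices = [
    [[k * 100.0 + r * 10.0 + c for c in range(6)] for r in range(5)]
    for k in range(2)
]
toy_patch = crop_patch(toy_slices, row=0, col=0, half=1)  # hypothetical well at a grid corner
print(len(toy_patch), len(toy_patch[0]), len(toy_patch[0][0]))  # h, w, number of attributes
```

Each patch would then be paired with the log-interpreted sandstone thickness at the same well as its regression label.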
During data processing, the seismic volume was first preprocessed and calibrated using conventional workflows. Multiple seismic attributes (e.g., average negative amplitude, average positive amplitude, RMS amplitude) were extracted within the time window of the target interval.
To construct training and validation samples, well markers were tied to the seismic time domain, and multi-attribute seismic responses were matched with the log-derived sandstone thickness at the corresponding well locations. In addition, to ensure comparable magnitudes across different attributes and improve training stability, each attribute channel was normalized using statistics computed from the training set. The dataset was divided into training and validation sets at a ratio of 70%:30% for model training and evaluation.
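A minimal sketch of the 70%:30% split and of z-score normalization with training-set statistics is given below. The helper names, the fixed random seed, and the toy attribute values are assumptions for illustration; the text does not state how samples were shuffled or whether the split was stratified. The sample count of 113 mirrors the number of labeled wells.

```python
import random
import statistics
from typing import List, Tuple


def split_indices(n_samples: int, train_fraction: float = 0.7, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and split them into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def fit_zscore(train_values: List[float]) -> Tuple[float, float]:
    """Return (mean, std) computed from training samples only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, (std if std > 0 else 1.0)  # guard against constant channels


def apply_zscore(values: List[float], mean: float, std: float) -> List[float]:
    """Normalize any split with the statistics fitted on the training set."""
    return [(v - mean) / std for v in values]


train_idx, val_idx = split_indices(113)
print(len(train_idx), len(val_idx))

# Toy values for one attribute channel (made up for the example).
channel = [float(i % 7) for i in range(113)]
mean, std = fit_zscore([channel[i] for i in train_idx])
val_normalized = apply_zscore([channel[i] for i in val_idx], mean, std)
```

Fitting the statistics on the training split and reusing them for validation keeps information from the validation wells out of the normalization step.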
4. Results of Seismic Attribute Fusion and Model Validation
4.1. Seismic Attribute Contribution Matrix and Sandstone Thickness Prediction Results
Based on the trained model, we obtain (i) an attribute importance matrix (Figure 3) by summarizing attention-related statistics and (ii) a fused sandstone thickness prediction map (Figure 4b).
The matrix in Figure 3 shows the relative contributions of the different seismic attributes. Amplitude-related attributes (e.g., RMS amplitude and maximum amplitude) contribute more strongly to sandstone thickness prediction, whereas frequency- and energy-related attributes contribute less. This provides a quantitative basis for attribute selection in the LTAF interval. The observation is consistent with the bandwidth-limited resolution of the available seismic data, which may weaken the response of some frequency-derived measures to fine-scale thickness variations. Hence, the lower weights should be interpreted as reduced sensitivity under the present data conditions rather than inherent geological irrelevance.
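One simple way to turn attention outputs into per-attribute importance scores is to average softmax-normalized attention weights over samples. The sketch below illustrates that idea only; the text states that the matrix summarizes "attention-related statistics" without specifying which ones, so the averaging rule, the function names, the attribute labels, and the raw scores here are all hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one sample's attribute scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_attention_importance(per_sample_scores: List[List[float]]) -> List[float]:
    """Average the normalized attention weights over samples, one value per attribute."""
    weights = [softmax(scores) for scores in per_sample_scores]
    n_samples = len(weights)
    n_attributes = len(weights[0])
    return [sum(w[k] for w in weights) / n_samples for k in range(n_attributes)]


# Hypothetical attribute labels and made-up raw attention scores for two samples.
names = ["rms_amplitude", "max_amplitude", "dominant_frequency"]
raw_scores = [[2.0, 1.5, 0.2], [1.8, 1.6, 0.1]]

importance = mean_attention_importance(raw_scores)
ranking = sorted(zip(names, importance), key=lambda pair: pair[1], reverse=True)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

Because each sample's weights sum to one, the averaged scores also sum to one and can be read as relative contributions.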
Compared with the original map (Figure 4a), the fused prediction (Figure 4b) exhibits clearer geological patterns and improved continuity, suggesting that multiscale feature extraction helps capture channel morphology and lateral connectivity, while adaptive fusion suppresses scattered noise and enhances spatially consistent sand-body signals.
The importance of each seismic attribute was quantitatively evaluated using the seismic attribute contribution matrix, and weights were dynamically assigned to the attributes during sandstone thickness prediction. The multi-attribute information was then mined and fused by the CNN in combination with the self-attention mechanism, which improved the continuity and clarity of the spatial changes in amplitude, frequency, and energy features. This improves the accuracy of sandstone thickness prediction and provides a reliable reference for subsequent geological modeling and fine characterization of sandstone reservoirs.
In Figure 4, compared with the original seismic data map that has not been processed by deep learning, the fused map shows key geological features more clearly. These features are enhanced by multiscale convolutional feature extraction through the CNN and by weighted fusion of the extracted features through the self-attention mechanism, which together improve the model’s ability to identify sandstone bodies. After fusion, the amplitude-related attributes (taking the RMS amplitude as an example) and the sandstone thickness predictions show a clearer spatial accumulation trend that is difficult to observe in the original seismic data map.
Before attribute fusion, the seismic attributes in the original map are often sparsely distributed and noisy, making it difficult to reveal reservoir complexity and detail. The CNN model with self-attention can effectively suppress noise and irrelevant information while enhancing the representation of key features. In this process, the self-attention mechanism dynamically adjusts the weights of the different seismic attributes to further highlight areas where sandstone bodies are predicted to exist.
Compared with the original map, the fused thickness prediction map shows the relationship between reservoir features and seismic attributes more clearly, which facilitates interpretation.
4.2. Model Validation
To evaluate the predictive accuracy of the proposed multiscale CNN with self-attention, we performed well-constrained validation and compared the results with baseline methods. The coefficient of determination (R²) was used as the primary metric, and additional residual-based analysis was conducted to inspect deviation patterns.
4.2.1. Validation Metrics
In this study, the coefficient of determination (R²) is used as the main metric to verify the model’s accuracy in sandstone thickness prediction. R² measures the goodness of fit between the model’s predictions and the actual observations, representing the proportion of the variation in actual sandstone thickness that the model can explain. R² is calculated as follows:

$$R^2 = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_i \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2}$$

where $y_i$ denotes the measured (actual) sandstone thickness, $\hat{y}_i$ denotes the corresponding model prediction, and $\bar{y}$ is the mean of $y_i$. A larger R² indicates better agreement between predictions and measurements.
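The coefficient of determination defined in this subsection can be computed with a few lines of code. The sketch below is a plain-Python illustration using the standard definition (one minus the ratio of the residual sum of squares to the total sum of squares); the function name `r_squared` and the toy thickness values are assumptions and are not data from the study.

```python
from typing import Sequence


def r_squared(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_measured) ** 2 for y in measured)
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


# Toy thickness values in meters (made up for the example).
measured_m = [5.0, 8.0, 12.0, 20.0]
predicted_m = [6.0, 7.5, 11.0, 19.0]
print(f"R^2 = {r_squared(measured_m, predicted_m):.4f}")
```

A perfect prediction gives a value of 1, while predicting the mean thickness everywhere gives 0.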
4.2.2. Benchmark Comparison with Baseline Methods
Figure 5 shows the cross-plot between the sandstone thickness predicted by the proposed model and the measured thickness at wells. The validation result indicates a strong correlation, with R² = 0.8954, suggesting that the proposed model can explain most of the variation in sandstone thickness in the study area.
For comparison, we further evaluated two commonly used baselines. First, a linear regression model was built using a representative single seismic attribute (RMS amplitude) as the predictor. The corresponding cross-plot is shown in Figure 6, where the coefficient of determination is R² = 0.8281, which is notably lower than that of the proposed model. Second, a random forest regression model was used as a nonlinear baseline (Figure 7). It yields R² ≈ 0.8453, which also remains lower than that of the proposed multiscale CNN with self-attention.
Overall, the proposed model achieves the highest R² among the tested methods (0.8954, versus 0.8281 for single-attribute linear regression and about 0.8453 for random forest), suggesting improved nonlinear regression capability and better generalization for thickness prediction in complex heterogeneous sandstone reservoirs.
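For reference, the single-attribute linear baseline amounts to an ordinary least-squares fit of thickness against one attribute. The following minimal sketch shows that fit in closed form; the function names and the toy attribute and thickness values are hypothetical, and the study's actual baseline implementation is not described in the text.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("attribute values have zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(x: Sequence[float], slope: float, intercept: float) -> list:
    """Apply the fitted line to attribute values."""
    return [intercept + slope * xi for xi in x]


# Toy data: normalized RMS amplitude at wells versus thickness in meters (made up).
rms_at_wells = [0.0, 1.0, 2.0, 3.0]
thickness_m = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(rms_at_wells, thickness_m)
print(slope, intercept, predict_line([4.0], slope, intercept))
```

The fitted line can then be scored on the validation wells with the same R² metric used for the proposed model, which keeps the comparison like for like.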
4.2.3. Ablation Study
To quantify the contribution of the self-attention mechanism to attribute fusion, we conducted an ablation study in which the self-attention module was removed while the multiscale CNN backbone, input attributes, and training strategy were kept unchanged. In addition, we included a single-scale CNN baseline (without multiscale kernels and without attention) to evaluate the benefit of multiscale feature extraction. All models were trained and evaluated using the same dataset split and the same evaluation metrics for a fair comparison. The quantitative results for the ablated models will be reported in Table 1 after the additional runs are completed.
This ablation setup enables a controlled evaluation of how multiscale convolutions and self-attention contribute to thickness prediction performance and interpretability, and the finalized statistics will be included in the revised version. Once available, the ablation results are expected to show directly whether, and by how much, the self-attention module improves performance by explicitly modeling inter-attribute correlations and enabling adaptive weighting during fusion.
4.2.4. Deviations and Uncertainty Analysis
Residuals between predicted and measured sandstone thickness are analyzed using a residual scatter plot (Figure 8) and a residual histogram (Figure 9), where the residual is defined as Predicted − Actual (m). As shown in Figure 8, residuals are generally centered around the zero line across the prediction range with a slight negative bias (mean = −0.41 m; median = −0.21 m), indicating mild underestimation on average. Quantitatively, the overall error level is MAE = 1.01 m and RMSE = 1.37 m.
Figure 9 shows that most residuals concentrate near zero, while the tails and a few outliers indicate localized uncertainty. These larger deviations are likely related to (i) low signal-to-noise ratio and local acquisition/processing artifacts, (ii) thin-bed tuning/interference effects, (iii) rapid lateral facies transitions, and/or (iv) structural complexity (faulting or steep dips). Therefore, zones with larger absolute residuals should be treated as higher-risk targets and prioritized for additional verification, such as updated well ties or local seismic conditioning. Overall, the residual analysis suggests stable predictions with meter-level typical errors and limited localized uncertainty.
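The residual statistics quoted in this subsection (mean, median, MAE, and RMSE of Predicted − Actual) can be reproduced from paired predictions and measurements as sketched below. The function name `residual_summary` and the toy values are assumptions for illustration and are not the study's data.

```python
import math
import statistics
from typing import Dict, Sequence


def residual_summary(predicted: Sequence[float], measured: Sequence[float]) -> Dict[str, float]:
    """Summarize residuals defined as predicted minus measured (same units as the inputs)."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [p - m for p, m in zip(predicted, measured)]
    return {
        "mean": statistics.fmean(residuals),      # negative -> underestimation on average
        "median": statistics.median(residuals),
        "mae": statistics.fmean([abs(r) for r in residuals]),
        "rmse": math.sqrt(statistics.fmean([r * r for r in residuals])),
    }


# Toy thickness values in meters (made up for the example).
summary = residual_summary(predicted=[10.0, 12.0, 9.0], measured=[11.0, 12.0, 11.0])
for key, value in summary.items():
    print(f"{key}: {value:.2f} m")
```

Wells whose absolute residual is well above the RMSE are natural candidates for the additional verification mentioned above.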
4.3. Geological and Engineering Implications
The predicted sandstone bodies in the LTAF interval are mainly distributed along braided channels and delta-front fairways in the central depression zone of the B field, which aligns with the established sedimentary facies understanding. Several favorable zones extend toward structural lows and slope areas, indicating potential targets that warrant further evaluation.
From an engineering perspective, the thickness prediction map can be translated into an actionable decision-support workflow (Figure 10). First, it can be used to screen and rank candidate locations for new wells by applying thickness thresholds and connectivity criteria within the predicted sand-body fairways, while avoiding isolated anomalies likely caused by noise. Second, for existing wells located near predicted thickness gradients, the results provide a quantitative basis to prioritize intervention options (e.g., sidetracking or infill drilling) by jointly considering predicted thickness, sand-body continuity, and production dynamics. Third, the attribute importance output offers an interpretable link between predictions and seismic responses, supporting risk management by identifying which attribute groups drive model decisions in different blocks. Overall, the proposed framework provides a feasible technical route for mid- to late-stage development optimization in complex sandstone reservoirs.
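The first screening step (thickness thresholds plus a connectivity criterion) can be illustrated with a simple connected-component search on the predicted thickness grid. This is a sketch under stated assumptions: the 4-neighbour connectivity rule, the function name `connected_sand_bodies`, the threshold of 10 m, the minimum size of 2 cells, and the toy grid are all hypothetical choices, not parameters from the study.

```python
from collections import deque
from typing import List, Tuple

Cell = Tuple[int, int]


def connected_sand_bodies(thickness: List[List[float]], min_thickness: float, min_cells: int) -> List[List[Cell]]:
    """Group cells with thickness >= min_thickness into 4-connected bodies.

    Bodies smaller than min_cells are dropped as isolated anomalies.
    The result is sorted from the largest body to the smallest.
    """
    n_rows, n_cols = len(thickness), len(thickness[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    bodies = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or thickness[r0][c0] < min_thickness:
                continue
            seen[r0][c0] = True
            queue, cells = deque([(r0, c0)]), []
            while queue:  # breadth-first flood fill of one body
                r, c = queue.popleft()
                cells.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and not seen[nr][nc]
                            and thickness[nr][nc] >= min_thickness):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(cells) >= min_cells:
                bodies.append(cells)
    return sorted(bodies, key=len, reverse=True)


# Toy predicted thickness map in meters (made up for the example).
toy_map = [
    [12.0, 11.0, 2.0, 0.0],
    [10.0, 13.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 15.0],
]
bodies = connected_sand_bodies(toy_map, min_thickness=10.0, min_cells=2)
print(len(bodies), [len(b) for b in bodies])
```

In the toy map, the single thick cell in the lower right corner is discarded as an isolated anomaly, while the four connected cells in the upper left are kept as one candidate fairway.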