3.3.1. Performance Evaluation of VA-HDW Feature Matrix Construction
The VA-HDW method relies on the construction of features between stratigraphic spatial points within the input region and individual boreholes. These features are critical to both the classification accuracy and computational efficiency of the VA-HDW method, and the performance of feature matrix construction directly affects the overall effectiveness and efficiency of the proposed approach.
In this experiment, the performance of feature matrix construction was systematically evaluated using a geological borehole dataset from a newly developed urban area. The objective was to assess the computational time required for feature matrix construction under different dataset sizes and upsampling rates. Feature matrix construction involves first analyzing the distance relationships between stratigraphic spatial points and boreholes, computing the feature vector for each spatial point, and then assembling these vectors into a feature matrix. Parallel computing strategies were employed to accelerate this process.
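The assembly step described above can be sketched as follows. This is a minimal illustration only: it assumes that each feature is the horizontal distance from a spatial point to one borehole, whereas the actual VA-HDW features are those defined earlier in the paper and may differ. The function names and sample coordinates are hypothetical.

```python
import math

def feature_vector(point, boreholes):
    """Distances from one spatial point (x, y, z) to every borehole (x, y).

    Assumption: plain horizontal distance stands in for the VA-HDW feature.
    """
    px, py, _ = point
    return [math.hypot(px - bx, py - by) for bx, by in boreholes]

def build_feature_matrix(points, boreholes):
    """Stack one feature vector per spatial point (rows) over boreholes (columns)."""
    return [feature_vector(p, boreholes) for p in points]

# Hypothetical toy data: two boreholes and two stratigraphic spatial points.
boreholes = [(0.0, 0.0), (3.0, 4.0)]
points = [(0.0, 0.0, -1.0), (3.0, 0.0, -2.0)]
matrix = build_feature_matrix(points, boreholes)
```

Because each row depends only on its own spatial point, the row loop is the natural place to apply a parallel strategy, for example by distributing chunks of points over worker processes.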
The size of the borehole set, determined by the number of boreholes, directly affects the efficiency of feature matrix construction. When the number of boreholes is fixed, the upsampling rate (defined as the ratio between the number of points after upsampling and the original number of data points) influences the number of stratigraphic spatial points and consequently impacts construction efficiency. Therefore, experiments were conducted to evaluate the time consumption of feature matrix construction under different dataset sizes and upsampling rate settings. For each experimental configuration, the reported results correspond to the average of ten repeated runs. The experimental results are presented in
Table 3.
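The measurement protocol (averaging ten repeated runs for each combination of borehole count and upsampling rate) can be sketched as below. The workload `dummy_build` is a hypothetical stand-in for the real feature matrix construction, and the upsampling rates and the number of points per borehole are assumed values chosen only for illustration.

```python
import time

def mean_runtime(task, repeats=10):
    """Average wall-clock time of task() over `repeats` runs, in seconds."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        total += time.perf_counter() - start
    return total / repeats

def dummy_build(n_boreholes, points_per_borehole, rate):
    """Stand-in workload: one value per (spatial point, borehole) pair."""
    n_points = n_boreholes * points_per_borehole * rate
    return [[abs(p - b) for b in range(n_boreholes)] for p in range(n_points)]

# Borehole scales follow the experiment; the rates are assumed for illustration.
results = {}
for n_boreholes in (20, 40, 64):
    for rate in (1, 5, 10):
        results[(n_boreholes, rate)] = mean_runtime(
            lambda: dummy_build(n_boreholes, 5, rate))
```

In this stand-in, the number of point-borehole pairs is proportional to the square of the borehole count and to the upsampling rate, so the measured times can be compared against those two growth patterns.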
The experiments were conducted using datasets with borehole scales of 20, 40, and 64. Based on the feature construction method proposed in Chapter 2, feature vectors were constructed for the borehole spatial points obtained after upsampling, and a feature matrix was subsequently formed. The experimental results show that the average computation time increases progressively with both the borehole scale and the upsampling rate. This is because an increase in the number of boreholes leads to a quadratic growth in computational operations, thereby significantly extending the time required for feature construction and model computation. In addition, a higher upsampling rate results in a linear increase in the number of stratigraphic spatial points, which also contributes to the overall computational cost.
Empirically, the feature matrices generated by the VA-HDW method perform as expected, and construction time grows approximately linearly with the upsampling rate. An appropriate upsampling rate can improve the quality of the feature matrix; however, excessively high upsampling rates may lead to overfitting. Therefore, in this study, the upsampling rate is set to 10 to balance feature matrix quality against the risk of overfitting.
3.3.2. Comparative Analysis of VA-HDW and Other Stratum Classification Methods
To verify the advantages of the proposed VA-HDW stratum classification method, a GBARNN neural network was first designed based on the idea of VA-HDW, and the method was implemented by training this network. The method was then evaluated on the classification task using accuracy, F1-score, and the confusion matrix of the classification results. A comparative analysis was conducted between the VA-HDW method and other stratum spatial point classification methods, including KNN, SVM, and GeoPDNN [19]. The input features of KNN and SVM are normalized 3D spatial coordinates (x, y, z), while the input features of GeoPDNN follow the settings of its original study. Because the parameters of each classifier also affect the results, the relevant classifiers were tuned, and each method is compared only at the configuration that achieved its best accuracy.
This experiment uses the dataset of a new urban area to compare the stratum classification performance of the proposed VA-HDW method with that of the other classification methods.
(1) Parameter Settings and Network Training
In this experiment, the network structure settings of GBARNN and GBDNN are shown in
Table 4, including information such as the number of hidden layers, the number of neural units in hidden layers, the loss function, and the optimizer type.
The hyperparameter settings in this experiment are shown in
Table 5.
The changes in the loss function value and accuracy over the training iterations are shown in
Figure 12.
As the figure shows, the GBARNN converged well during training: the loss value decreased while the accuracy increased, and after 500 training iterations the loss value stabilized. This indicates that the network gradually learned the correct stratum classification patterns and can effectively learn the features of the stratum classification task.
(2) Analysis of Comparative Experimental Results
In this section, the GBARNN, KNN, SVM, and GeoPDNN models are compared to systematically evaluate their performance in the stratum classification task. Three indicators, namely classification accuracy, F1-score, and Kappa coefficient, are used for comprehensive assessment. Because KNN and SVM are sensitive to their parameter settings, grid search is adopted for parameter optimization, and the parameter ranges searched and the configuration corresponding to the best classification accuracy are recorded in the table. The experimental results are presented in
Table 6.
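The grid search used for the baseline classifiers can be illustrated with a small KNN example on normalized (x, y, z) coordinates. This is a sketch under stated assumptions: the toy samples, the candidate values of k, and the single hold-out validation split are all hypothetical, and the paper's actual parameter ranges are those recorded in the table.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, valid, k):
    """Fraction of validation points whose predicted label is correct."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)

def grid_search(train, valid, k_values):
    """Return (best_k, best_accuracy) over the candidate grid."""
    scores = {k: accuracy(train, valid, k) for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores[best_k]

# Hypothetical normalized coordinates: stratum 0 is shallow, stratum 1 is deep.
train = [((0.1, 0.1, 0.10), 0), ((0.2, 0.3, 0.20), 0), ((0.3, 0.2, 0.15), 0),
         ((0.1, 0.2, 0.80), 1), ((0.3, 0.1, 0.90), 1), ((0.2, 0.2, 0.85), 1)]
valid = [((0.2, 0.2, 0.10), 0), ((0.2, 0.1, 0.90), 1)]
best_k, best_acc = grid_search(train, valid, k_values=(1, 3, 5))
```

The same loop extends to several parameters (for an SVM, for instance, the penalty coefficient and kernel settings) by iterating over the Cartesian product of the candidate values.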
From the experimental results in
Table 7, it can be observed that, in terms of classification accuracy, the traditional machine learning methods KNN and SVM perform relatively poorly in classifying stratum spatial points, whereas the deep learning models GeoPDNN and the proposed GBARNN both achieve high classification accuracy, with only a small gap between them. In terms of F1-score, GeoPDNN and GBARNN again demonstrate strong classification capabilities, showing a good balance between precision and recall. In contrast, the F1-scores of KNN and SVM remain relatively low, indicating deficiencies in their predictive performance for certain categories.
The Kappa coefficients show that the two deep learning models, GeoPDNN and GBARNN, are highly stable. Their Kappa values mostly remain within a high range, indicating that, after accounting for chance agreement, these two methods are more reliable and stable in classification accuracy. In contrast, traditional machine learning algorithms such as KNN and SVM are less reliable: their Kappa values are much lower and tend to fluctuate significantly in multi-category recognition tasks when subject to external interference.
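For reference, the three indicators can all be derived from a confusion matrix, as in the sketch below. The macro-averaged F1-score is an assumption (the averaging scheme is not stated here), and the 2 × 2 matrix is made-up data rather than a result from the experiments.

```python
def metrics_from_confusion(cm):
    """Return (accuracy, macro F1, Cohen's kappa).

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[r][j] for r in range(n)) for j in range(n)]
    accuracy = sum(cm[i][i] for i in range(n)) / total

    f1_scores = []
    for i in range(n):
        tp = cm[i][i]
        precision = tp / col_sums[i] if col_sums[i] else 0.0
        recall = tp / row_sums[i] if row_sums[i] else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / n  # assumed averaging scheme

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0
    return accuracy, macro_f1, kappa

# Made-up two-class confusion matrix for illustration only.
acc, f1, kappa = metrics_from_confusion([[8, 2], [1, 9]])
```

Kappa subtracts the agreement expected by chance from the observed accuracy, which is why it can separate methods whose raw accuracies look similar.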
Compared with GeoPDNN, GBARNN greatly enhances the deep feature representation capability through the concept of generalized distance between boreholes and stratum spatial points, thereby improving the analytical accuracy of geological attributes of stratum spatial points. This model achieves deep integration of spatial data and borehole information through the generalized distance mechanism, effectively enhancing its adaptability and predictive accuracy in stratum classification tasks. When processing complex stratum data, GBARNN provides more accurate and comprehensive predictive results due to its refined feature extraction characteristics, with slightly improved classification accuracy and Kappa coefficient compared to GeoPDNN. However, despite improvements in feature extraction, GBARNN’s over-reliance on certain features may reduce the model’s generalization ability, resulting in slightly inferior coordination between precision and recall and a relatively lower overall F1 score.
The experimental results in
Figure 13 show that, in terms of classification accuracy, the GBARNN neural network outperforms the traditional algorithms KNN and SVM as well as GeoPDNN. It also performs best in terms of the Kappa coefficient. Although GBARNN trails GeoPDNN in F1-score, it has clear advantages in the classification process. To better explore the application value of GBARNN and identify room for optimization, independent sample testing is adopted: specific target strata are selected, and confusion matrices are used to analyze the recognition performance of the different models, and their room for improvement, under different stratum attributes. The classification performance of the four algorithms on specific strata is examined next.
Figure 14 presents the stratum classification results based on the KNN algorithm, with a comprehensive evaluation using a confusion matrix. Statistics show that the classification accuracy of this method for each stratum ranges from 80% to 91%, indicating a certain level of recognition capability. However, although the overall performance is relatively stable, obvious misjudgments occur in the 5th and 6th strata. This suggests that due to its over-reliance on the assumption of local similarity, the KNN model struggles to accurately capture the nonlinear relationships between strata when dealing with complex geological structures.
As shown in
Figure 15, the average accuracy of the SVM algorithm in classifying various geological strata ranges from 86% to 91%, indicating that the SVM algorithm has strong stability. Compared with the KNN algorithm, the SVM algorithm is more reliable in classification performance, with a smaller fluctuation range in prediction results and stronger anti-interference ability, which suggests that the SVM algorithm can more accurately identify differences between samples in geological stratum classification tasks. Although the SVM algorithm has stronger classification consistency than the KNN algorithm, it still cannot avoid misjudgments for some datasets with large category differences. This may be related to the insufficient ability of the SVM algorithm to explain the internal complexity of some geological stratum data.
Figure 16 shows the confusion matrix of the GeoPDNN algorithm. According to the classification performance evaluation results of the GeoPDNN model, its accuracy for samples of various strata is mostly between 89% and 93%, indicating that the GeoPDNN algorithm has obvious advantages in stratum classification and excellent ability in balanced prediction across categories. Compared with traditional methods, the GeoPDNN algorithm achieves higher overall accuracy in multi-stratum recognition, with a more balanced error distribution. This demonstrates that the GeoPDNN algorithm has stronger generalization ability and adaptability to complex geological environments, thereby reducing the risk of misjudgment.
As shown in
Figure 17, the confusion matrix of the GBARNN algorithm indicates good performance across the rock stratum classification tasks, with the classification accuracy basically stable between 91% and 94%. This suggests that the algorithm has advantages in stratum classification and can maintain high recognition accuracy in complex geological data environments. The distribution of classification errors across the different rock strata also fluctuates relatively little, which further highlights its generalization ability and stability on multi-source stratum samples.
Even on the strata where classification accuracy is relatively low, the GBARNN algorithm remains stable, achieving a classification accuracy of 91.2%. This demonstrates the model's robustness in complex category determination and its efficient feature representation capability. By adopting the generalized distance measurement method and improving the feature formation mechanism, GBARNN can accurately identify core features that carry abundant spatial distribution information, thereby improving the overall classification performance.
Compared with traditional machine learning algorithms, GBARNN has significantly improved classification accuracy; its optimized design scheme has greatly reduced the misclassification probability. When dealing with complex correlation structures and high-dimensional features in geological data, it exhibits excellent robustness and anti-interference ability. In contrast to other classic methods (KNN, SVM), GBARNN, relying on the deep learning framework, can more accurately mine the nonlinear features in the data, thereby improving the classification effect.
3.3.3. Geological Body Modeling Experiments Based on GBARNN Classification Results
This subsection mainly conducts geological body modeling experiments based on the classification results of the GBARNN model. The specific procedures are as follows:
(1) The stratigraphic information of the study area is analyzed to train a GBARNN model for stratum classification.
(2) A spatial bounding box covering the study area is defined, within which several detection boreholes are arranged. Stratigraphic sampling points to be classified are generated along these boreholes at fixed intervals through resampling.
(3) The trained classification model is applied to classify the stratigraphic attributes of the sampling points along each detection borehole.
(4) The boundary point coordinates of each stratigraphic surface are aggregated, and kriging interpolation is employed to fit constrained stratigraphic surfaces. The volume between two adjacent stratigraphic surfaces is regarded as a single stratigraphic unit.
(5) The spatial bounding box is discretized into a set of three-dimensional grids. Based on the marching cubes algorithm, the surface models are visualized as isosurfaces. Subsequently, stratigraphic attributes are assigned to all grid cells according to the spatial relationships between the grid cells and the isosurfaces, thereby constructing the geological body model.
To better represent geological body boundaries, the outermost boreholes are extracted as boundary constraints to ensure that the constructed model more closely matches real geological conditions.
Figure 18 illustrates the bounding box constrained by geological body boundaries.
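The attribute-assignment step of the procedure above can be illustrated for a single vertical grid column. The sketch assumes that each fitted stratigraphic surface is single-valued, so that within one column it reduces to a single depth; the function name and the depth values are hypothetical, and the marching cubes visualization itself is not reproduced.

```python
from bisect import bisect_right

def label_column(surface_depths, cell_depths):
    """Assign a stratigraphic unit index to each cell of one grid column.

    surface_depths: depths of the stratigraphic surfaces in this column,
                    sorted from shallow to deep.
    cell_depths:    depths of the grid cell centres.
    A cell above the first surface gets index 0; a cell between surface
    k-1 and surface k gets index k.
    """
    return [bisect_right(surface_depths, d) for d in cell_depths]

# Hypothetical column: three surfaces and five cell centres (depths in metres).
surfaces = [2.0, 5.0, 9.0]
cells = [0.5, 2.5, 4.5, 6.0, 10.0]
labels = label_column(surfaces, cells)
```

Repeating this lookup for every column of the grid yields a fully labeled volume, in which the cells between two adjacent surfaces share one stratigraphic unit.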
To ensure the spatial accuracy of the model, the grid resolution is set to 256 × 256 × 256. Within the bounding box of the target area, 500 detection boreholes are uniformly arranged, and each detection borehole is discretized into 256 sampling intervals along the depth direction to accurately capture stratigraphic variations. Based on these settings, the modeling workflow for a single stratigraphic unit is illustrated in
Figure 19.
Based on the above procedures, a three-dimensional geological body model of a newly developed urban area is constructed, and smoothing processing is applied to the model. The resulting models are shown in
Figure 20 and
Figure 21.
To evaluate the consistency between the constructed geological body and the actual borehole data, a comparative analysis is conducted between the geological cross-sections extracted from the model and the real borehole information. This analysis focuses on whether the stratigraphic distributions along the cross-sections are consistent with the strata revealed by the boreholes, as well as whether key characteristics such as stratigraphic thickness are in agreement. Through this comparison, the accuracy and reliability of the geological body model in representing real geological conditions can be assessed.
As shown in
Figure 22, two cross-sections, S1 and S2, are established within the study area to validate the accuracy of the proposed modeling method.
Figure 23 further presents comparisons between the S1 and S2 cross-sections and the surrounding actual borehole data. Since the geological body model is constructed based on detection (virtual) boreholes rather than being directly fitted to the real boreholes, minor discrepancies with the actual borehole information are observed. Nevertheless, from an overall perspective, the modeling results are generally consistent with the actual geological characteristics.
The proposed modeling approach integrates the VA-HDW method and therefore possesses a certain capability for characterizing stratigraphic pinch-out phenomena. To evaluate the overall performance of the method in modeling pinch-out strata, the pinch-out feature highlighted by the red box on cross-section S1 is selected for detailed analysis. Since the present study focuses only on the first five stratigraphic units, the remaining portions of the corresponding boreholes are masked in the analysis. As shown in
Figure 24, the selected area exhibits a high potential for pinch-out occurrence. The modeling results along cross-section S1 are consistent with this observation, thereby confirming the presence of pinch-out in the constructed geological model. These results indicate that the proposed modeling method demonstrates a certain capability in modeling stratigraphic pinch-out features.
In this study, the geological body model is constructed based on detection (virtual) boreholes, while the actual boreholes are used only as input training data. As a result, the spatial coupling between detection boreholes and actual boreholes is relatively weak. Therefore, analyzing the relationship between the average stratigraphic thickness derived from the stratigraphic surfaces generated by detection boreholes and the average stratigraphic thickness reflected by actual boreholes provides an objective means to evaluate the performance of the proposed modeling method. In this study, a borehole fitting degree index is adopted to quantitatively assess the fitting accuracy of stratigraphic thickness. This index effectively reflects the capability of the model to represent geological structures. The calculation formula is given as follows:
In the above formulation, $i$ denotes the stratigraphic layer index, and the index value is the borehole fitting degree of the $i$-th stratum. $H_i^m$ denotes the average thickness of the $i$-th stratum at the actual borehole locations, derived from the geological body generated by the classification model, and it is compared with the corresponding actual average stratigraphic thickness obtained from real borehole data. A smaller borehole fitting degree indicates a larger discrepancy between the model and the real geological conditions, whereas a value closer to 1 implies better agreement between the modeling results and the actual geological situation. In this study, the borehole fitting degree of the VA-HDW-based method is calculated, and the fitting degree results for each stratigraphic layer are summarized in
Table 8.
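As a rough illustration of how such an index behaves, the sketch below assumes a relative-error form, 1 - |H_m - H_r| / H_r, where H_m is the model-derived average thickness and H_r is the real average thickness. This form, the function name, and the sample thicknesses are assumptions chosen only to match the properties described above (a value of 1 for perfect agreement, smaller values for larger discrepancies); they are not the paper's definition or data.

```python
def fitting_degree(model_thickness, real_thickness):
    """Assumed index: 1 - |H_m - H_r| / H_r (hypothetical form)."""
    if real_thickness <= 0:
        raise ValueError("real_thickness must be positive")
    return 1.0 - abs(model_thickness - real_thickness) / real_thickness

# Made-up average thicknesses in metres for three strata: (model, real).
samples = [(2.0, 2.0), (4.5, 5.0), (1.0, 2.0)]
degrees = [fitting_degree(m, r) for m, r in samples]
```

Under this assumed form, a stratum whose modeled thickness is off by 10% of the real thickness scores 0.9, and one that is off by half scores 0.5.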