Article

Lithological Mapping Based on Multi-Source Fusion Data and Convolutional Neural Networks: A Case Study of the Guyang Area, Inner Mongolia, China

1
Geomathematics Key Laboratory of Sichuan Province, School of Mathematical Sciences, Chengdu University of Technology, Chengdu 610059, China
2
SinoProbe Laboratory, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China
3
College of Big Data and Artificial Intelligence, Chengdu Technological University, Chengdu 611730, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(8), 4003; https://doi.org/10.3390/app16084003
Submission received: 11 March 2026 / Revised: 13 April 2026 / Accepted: 15 April 2026 / Published: 20 April 2026
(This article belongs to the Special Issue Emerging Trends in Geological and Mineral Exploration)

Abstract

Remote sensing offers distinct advantages for lithological mapping, but its ability to detect underlying bedrock is limited in covered areas, whereas geochemical data are constrained by sparse sampling and low spatial resolution. To address these challenges, this study proposes a texture-guided adaptive data fusion framework combined with a Multi-scale Convolutional Neural Network (MCNN) for lithological mapping, using the Guyang area in Inner Mongolia as a case study. First, the non-linear relationships between geochemical components and remote sensing spatial textures are modeled to achieve complementary integration of heterogeneous multi-source data. Second, an MCNN model is constructed to extract multi-scale geological features, enabling improved discrimination of lithological units and more effective inference of concealed bedrock beneath Quaternary cover. Experimental results show that the proposed method overcomes the limitations of single data sources and achieves an overall accuracy (OA) of 0.95 on the fused dataset. Ablation experiments further demonstrate that the texture-guided fusion strategy significantly improves lithological identification performance. This study provides an effective framework for intelligent geological mapping and confirms the feasibility of inferring underlying bedrock in covered areas using multi-source surface information.

1. Introduction

Lithological mapping is a core task in regional geological surveys and a crucial component of mineral exploration, environmental assessment, and geological hazard prevention and mitigation [1]. Although traditional geological mapping offers high accuracy, it is time-consuming, labor-intensive, and difficult to implement in areas with complex topography, dense vegetation cover, or poor accessibility. With the rapid development of earth observation technologies, large-scale and rapid lithological identification using multi-source remote sensing (RS) has become a core method in modern geological work [2]. Owing to its macroscopic perspective and abundant spectral information, remote sensing technology plays an essential role in lithological identification [3,4,5]. Multispectral data (e.g., Landsat-8, ASTER) are widely used to extract the spectral absorption and reflection features of rocks [6,7,8]. Meanwhile, hyperspectral satellite data, such as PRISMA and GF-5, have been extensively applied in remote sensing geological surveys across typical covered areas [9,10,11]. Furthermore, multi-source remote sensing data are increasingly adopted for lithological mapping [2,12,13,14]. However, optical RS can only reflect the spectral properties of surface cover. Single-source RS data may be subject to interference from surface cover, thereby compromising the detection of lithological information [15].
Quaternary-covered areas have consistently posed a challenge for geological mapping due to poor bedrock exposure. To address this challenge, some scholars have attempted to use borehole data to reveal vertical lithological assemblages to assist mapping; nevertheless, drilling is costly and difficult to implement on a large scale. Other scholars have proposed a “Quaternary lithological mapping method” which infers the underlying bedrock by analyzing the characteristics of surface clastics, providing a new perspective for mapping covered areas [16]. Advances in the mechanisms of geochemical migration in covered areas have led to the observation of nanoscale metal particles (e.g., Cu, Au) in soil and plant cells above ore bodies, confirming the objective existence of deep-seated substance migration to the surface. According to the theories of element migration and pedogenesis, surface soils and stream sediments largely inherit the material compositional characteristics of the underlying bedrock [17,18]. Therefore, geochemical exploration data not only record the spatial distribution characteristics of the crustal surface but also chemically compensate for the insufficient penetrability of optical RS, revealing information about deep or concealed geological bodies. This provides a scientific approach for lithological mapping [19].
In recent years, with the rise of artificial intelligence and big data technologies, multi-source geoscientific data fusion has emerged as a new paradigm for geological mapping research [20]. Early fusion studies mostly utilized traditional machine learning algorithms, such as Support Vector Machines (SVM) or Random Forests (RF) [21,22]. For instance, some studies identified leucogranite plutons by integrating geochemical data and ASTER RS imagery using a random forest metric learning method [12]. Although traditional machine learning methods have improved mapping accuracy to some extent, they are fundamentally pixel-based shallow classification models. In contrast, Convolutional Neural Networks (CNNs), by virtue of their robust multi-level feature extraction capabilities, can more accurately capture the spatial patterns of complex structures [23]. Some scholars have found that CNNs outperform RFs in lithological mapping using fused data [24]. Furthermore, a deep neural network architecture has been proposed based on geological route data fused with multimodal data for 1:50,000 geological mapping, achieving favorable application results [25].
In parallel, voxel-based and 3D geological modeling have become important paradigms for reconstructing subsurface structures and deep mineral systems [26]. These approaches generally integrate borehole data, geophysical observations, field constraints, and geological rules to build volumetric lithological models [27,28]. Recent developments have further expanded this framework through unified voxel modeling, dynamic model updating, GIS-based geostatistical interpolation, and graph-based learning methods [29,30,31,32]. Nevertheless, their practical application often depends on dense subsurface observations and computationally intensive workflows, which limit their scalability in large regional studies with sparse exploration data [33]. Therefore, cost-effective surface-data-driven methods are still needed to provide prior constraints for lithological interpretation in data-limited areas.
Although traditional machine learning methods and standard CNNs have improved mapping accuracy, most existing fusion approaches rely on simple layer stacking (early fusion) or decision-level voting [12]. These conventional methods often fail to reconcile the fundamental physical and resolution disparities inherent in multi-source data: RS provides high-frequency spatial structural boundaries, while geochemical data manifests as low-frequency, diffuse compositional halos. Without an adaptive mechanism to organically bridge this semantic gap, existing models frequently produce blurred lithological boundaries or lose vital localized geochemical anomalies [23,24].
To address the aforementioned issues, this study takes the Guyang area in Inner Mongolia as a case study and proposes a texture-guided adaptive data fusion framework combined with an improved Multi-scale Convolutional Neural Network (MCNN) for lithological mapping. This framework enables the deep and organic integration of Landsat-8 RS imagery, geochemical data, and basic geological information. By extracting multi-scale features, the MCNN model synergistically captures the material compositional signals of geochemical elements and the spatial texture details of RS imagery, thereby achieving the fine recognition of lithology and the effective inference of underlying bedrock in Quaternary covered areas. Through ablation experiments, the feasibility and robustness of lithological identification utilizing multi-source fused surface information were further validated.
The main contributions of this study can be summarized as follows:
(1) A texture-guided adaptive data fusion framework is proposed to establish a physically meaningful coupling between geochemical composition and remote sensing spatial texture;
(2) A multi-scale convolutional neural network (MCNN) is developed to capture geological features at different spatial scales and improve lithological discrimination;
(3) A unified multi-source feature representation is constructed, integrating spectral, textural, and geochemical information for enhanced mapping accuracy;
(4) The proposed method is validated in a Quaternary-covered area, demonstrating its effectiveness in inferring concealed bedrock through ablation experiments.

2. Geological Overview

The study area is located in central Inner Mongolia, situated at the northern foot of the Yinshan Mountains in the Baotou-Bayan Obo belt. Geographically, the study area encompasses multiple surrounding counties and cities, including Guyang and Wuchuan. The stratigraphy and lithology of this region are highly complex. The outcropping strata range in age from the Archean to the Mesozoic, with the Jurassic, Cretaceous, and Proterozoic strata being the most prominent [34]. Tectonically, the study area is located in the middle segment of the northern margin of the North China Plate, adjacent to the southern margin of the Tianshan-Xing’an orogenic belt to the north (Figure 1). This region hosts multiple metallogenic belts, such as the Wulashan-Daqingshan metallogenic sub-belt. The geotectonic unit belongs to the Yinshan Fault Uplift of the Langshan-Bayan Obo Margin Depression within the Inner Mongolia Anteclise of the North China Platform, specifically the Wulashan-Daqingshan east–west tectonic belt on the southern margin of the Guyang Fault Basin. This tectonic belt comprises a series of nearly east–west trending folds and compressive fracture zones. The area has experienced frequent intermediate-acid magmatic activity, resulting in a wide distribution of intrusive rock bodies. Its unique evolutionary history and complex geological structure have provided favorable conditions for the formation of various mineral resources. Consequently, conducting relevant research in this region is of significant importance for understanding the geological environment, evolutionary history, and ore genesis of the study area.
Although RS and geochemical data directly reflect material compositions, the tectono-magmatic-sedimentary events in this study area exhibit distinct stages, with specific geological ages corresponding to unique rock assemblages. As shown in the lithological assemblage labels (Table 1), the lithostratigraphic units are composed as follows: (1) Jurassic (J): Mainly consists of purplish-red conglomerate and tuffaceous sandstone. (2) Early Yanshanian (Y) intrusive rocks manifest as inequigranular K-feldspar granite. (3) Proterozoic (P): Includes the Shinagan Group limestone and sandstone formation, as well as migmatized slate and quartzite formations, along with siliceous banded limestone intercalated with thin-bedded calcareous slate, and grey quartzite intercalated with magnetite quartzite. (4) Cretaceous (C): Dominated by the Guyang Formation, which consists of black shale intercalated with freshwater limestone, oil shale, coal seams, grey-green sandstone, purplish-red sandstone, and conglomerate intercalated with sandy shale and grey-white tuff. Furthermore, the study area has undergone multiple periods of magmatic intrusion, with Late Caledonian and Early Yanshanian rock bodies being widely distributed. (5) Variscan (V) rocks are primarily intrusive; the early, middle, and late-stage rock assemblages are represented by pyroxenite and peridotite, biotite quartzite and biotite granite, as well as granodiorite and albitized/muscovitized granite. (6) Late Caledonian (K) intrusive rocks are characterized by grey-green gneissose granodiorite and grey-white gneissose plagiogranite. (7) Early Luliangian (L) intrusive rocks are dominated by grey-green migmatized schistose quartz diorite. (8) Archaeozoic (A): Dominated by deeply metamorphosed biotite plagioclase gneiss, as well as marble intercalated with quartzite, and plagioclase gneiss intercalated with magnetite quartzite.
Although these various lithological assemblage labels are named according to geological ages, these units represent lithostratigraphic and geomorphological features on the geological map with unique mineral assemblages and geochemical characteristics. Essentially, the model is trained to learn the differences and characteristics of the material compositions of these specific rocks.

3. Data Introduction

3.1. 1:200,000 Scale Data Source and Analytical Methods

The 1:200,000 scale geochemical data for the region come from the regional exploration scanning plan, with stream sediments as the sampling medium. The sampling density was one sample per 4 km², and a total of 39 elements and oxides were analyzed (Bi, Cu, P, La, Li, Ag, Sn, Au, Mo, Th, U, W, Sb, Hg, Mn, Cr, Sr, Nb, Pb, Ni, Ti, Y, Cd, Co, Ba, Be, V, Zn, B, As, Zr, F, Fe2O3, K2O, CaO, MgO, Na2O, Al2O3, and SiO2). The analytical methods and detection limits for each element are listed in Table 2.
The basic statistical characteristics of the geochemical elements are summarized in Table 3. The large discrepancies between the maximum values and the arithmetic means (or medians) for all elements indicate high dispersion. Furthermore, based on the skewness and kurtosis of the 39 elements and oxides, all variables except CaO, Na2O, SiO2, and Fe2O3 exhibited right-skewed and leptokurtic distributions and did not follow a normal distribution. This suggests that performing statistical analysis directly on the raw data could lead to erroneous conclusions. Since geochemical data possess typical compositional properties, the raw data must be preprocessed to eliminate the closure effect; the isometric log-ratio (ILR) transformation [35] was therefore applied to the compositional data. Samples with missing measurements or invalid records were excluded prior to interpolation. To reduce the influence of extreme noise, obvious abnormal values were checked against the analytical records and retained only when consistent with the original assay results.
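To make the ILR step concrete, the following is a minimal sketch that maps a D-part composition to D − 1 coordinates using pivot (Helmert-type) balances. The basis choice and the function name `ilr` are assumptions made for illustration; the text does not state which ILR basis was used.

```python
import numpy as np

def ilr(composition):
    """Pivot-coordinate ILR of a strictly positive composition (last axis = parts)."""
    x = np.asarray(composition, dtype=float)
    if np.any(x <= 0):
        raise ValueError("ILR requires strictly positive parts")
    log_x = np.log(x)
    d = x.shape[-1]
    coords = []
    for i in range(1, d):
        # Balance between the geometric mean of the first i parts and part i + 1.
        mean_log_head = log_x[..., :i].mean(axis=-1)
        scale = np.sqrt(i / (i + 1.0))
        coords.append(scale * (mean_log_head - log_x[..., i]))
    return np.stack(coords, axis=-1)

sample = np.array([0.6, 0.3, 0.1])   # hypothetical three-part composition
z = ilr(sample)
z_scaled = ilr(sample * 100.0)       # same composition expressed in other units
```

Because the coordinates depend only on ratios between parts, `z` and `z_scaled` coincide, which is the property that removes the closure effect before interpolation and modeling.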
Finally, the Inverse Distance Weighting (IDW) interpolation method was used to generate geochemical maps with a spatial resolution of 500 m × 500 m, overlaid with sampling points (Figure 2). The 500 m interpolation grid was selected as a compromise between the original sampling density (approximately one sample per 4 km²) and the need to preserve regional geochemical gradients without introducing excessive artificial detail. While Kriging is a widely used geostatistical standard, IDW was explicitly selected for this study due to the statistical nature of the dataset. As demonstrated in Table 3, the geochemical data exhibit extreme right-skewness and non-stationarity. In such highly dispersed datasets, standard Kriging can sometimes produce anomalous negative values or over-smooth critical local anomalies. IDW effectively preserves these localized, high-value geochemical halos, which are essential for identifying distinct lithological boundaries. Although comparative interpolation experiments (e.g., Kriging variants) were not the primary focus of this study, the selection of IDW is justified by its robustness in preserving local geochemical anomalies under highly skewed and non-stationary data distributions.
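The interpolation step can be sketched as follows. This is an illustrative IDW implementation, not the authors' code: the distance exponent (`power=2.0`), the use of all samples rather than a search neighbourhood, and the toy sample values are assumptions, since none of these settings are reported.

```python
import numpy as np

def idw_grid(sample_xy, sample_values, grid_xy, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate at each grid point."""
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    # Pairwise distances between grid points and samples, shape (n_grid, n_samples).
    diff = grid_xy[:, None, :] - sample_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Clamp tiny distances so a grid node on a sample takes (almost) its value.
    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights * sample_values).sum(axis=1) / weights.sum(axis=1)

# Hypothetical samples on a 2 km lattice (one per 4 km^2), coordinates in metres.
samples = np.array([[0.0, 0.0], [2000.0, 0.0], [0.0, 2000.0], [2000.0, 2000.0]])
values = np.array([10.0, 20.0, 30.0, 40.0])
xs, ys = np.meshgrid(np.arange(0.0, 2001.0, 500.0), np.arange(0.0, 2001.0, 500.0))
grid = np.column_stack([xs.ravel(), ys.ravel()])
surface = idw_grid(samples, values, grid).reshape(xs.shape)   # 500 m grid
```

Every interpolated value is a convex combination of the sample values, so the surface stays within the observed range and cannot go negative for positive concentrations, which is the behaviour the text contrasts with Kriging.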

3.2. RS Data

The Landsat-8 satellite, successfully launched by the United States on February 11, 2013, carries the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS); its primary parameters are listed in Table 4. The satellite achieves global coverage every 16 days. The Landsat-8 data used in this study (Product ID: LC08_L2SP_127031_20231027_20231101_02_T1) were retrieved from the USGS official website. This dataset meets rigorous standards for geometric and radiometric accuracy. Additionally, the image quality over the study area is excellent, with cloud and snow cover of less than 2%. Given that the raw Landsat-8 imagery has already undergone strict radiometric calibration and other systematic preprocessing, the subsequent RS data processing focused on image registration and cropping based on the study area (Figure 3).

4. Methods

4.1. Multi-Source Data Fusion Method

To overcome the limitations of a single data source in lithological mapping within complex geological terrains, this study integrates RS data with geochemical data based on multi-source data fusion techniques [12,23,36]. The core principle of this fusion is to ensure data complementarity and feature space unification. Multi-source data exhibit significant complementarity in expressing geological features: RS imagery provides high-frequency spatial information, accurately capturing surface textures and the geometric boundaries of lithological units; conversely, geochemical data provide low-frequency compositional information, which can penetrate surface cover interference to reflect the chemical abundance of underlying bedrock. To achieve this fusion, discrete geochemical sampling points were first converted into continuous surfaces using Inverse Distance Weighting (IDW) interpolation and then resampled to a 30 m grid to ensure spatial alignment with the Landsat pixels during fusion; this resampling does not increase the intrinsic spatial resolution of the geochemical data, but provides a common computational grid for subsequent integration and classification. This process constructs a unified high-dimensional feature space, ensuring each pixel unit simultaneously possesses spectral-texture feature vectors and regional geochemical background values. The fusion framework utilizes a collaborative constraint classification mechanism: high-frequency components from high-resolution RS data are used to extract classification boundaries, compensating for the insufficient spatial resolution of geochemical data, while low-frequency components from geochemical data serve as a prior background field to correct misclassifications in RS data caused by “same object with different spectra” or “different objects with the same spectrum” phenomena.
Specifically, the proposed framework performs pre-classification fusion at the spatially aligned pixel-grid level, where geochemical and RS information are integrated into unified inputs before deep feature extraction by the MCNN.
The fusion process is as follows:
(1) Image decomposition
The Laplacian pyramid algorithm was employed to decompose the RS imagery into high and low-frequency components. Gaussian filtering was subsequently applied to further divide the high-frequency component into mid- and high-scale texture features, thereby capturing surface structural information across multiple spatial scales.
(2) Scaling transformation
The geochemical element layer ($\mathrm{geo\_L}$) and the low-frequency component ($\mathrm{MS\_L}$) were resampled using cubic convolution interpolation to match the spatial resolution of the high-frequency component ($\mathrm{MS\_H}$).
(3) Establish the correlation function
For the resampled geochemical element layer $\mathrm{geo\_L}$, considering the possible non-linear relationship between geochemical element concentration and RS spectral reflectance $\mathrm{MS\_L}_{m}$, the equation is established as:
$$\mathrm{geo\_L} = t_{1} \cdot \mathrm{MS\_L}_{m} + t_{2} \cdot (\mathrm{MS\_L}_{m})^{2} + \cdots + t_{m} \cdot (\mathrm{MS\_L}_{m})^{m} + b$$
where $m$ is the band number ($m = 1, 2, \ldots, a$) and $a$ is the total number of bands in the RS image. The coefficients $t_{m}$ and the constant term $b$ can be calculated using the least squares method, so as to avoid textures being discarded due to zero or excessively small correlation coefficients.
This mathematical correlation is fundamentally grounded in physicochemical geology and geomorphology. Spatial texture, such as terrain ruggedness or weathering resistance, is physically linked to rock competence and underlying mineral composition. For instance, silica-rich intrusive rocks or quartzites typically form weathering-resistant ridges, yielding high-frequency textures in RS imagery. These specific textures are spatially and genetically correlated with positive geochemical anomalies (e.g., SiO2 enrichment) in the overlying residual soil. By modeling this non-linear relationship, the framework physically anchors diffuse geochemical halos to their corresponding geomorphological boundaries.
(4) Image Reconstruction
The fused high-frequency image $\mathrm{geo\_H}$ is reconstructed using the coefficients $t_{m}$ and the high-frequency components $\mathrm{MS\_H}_{m}$, which can be expressed as:
$$\mathrm{geo\_H} = \sum_{m=1}^{a} t_{m} \cdot \mathrm{MS\_H}_{m} + b$$
(5) Image fusion
A high-resolution geo-fusion image $\mathrm{geo\_f}$ is generated by fusing the resampled geochemical element layer $\mathrm{geo\_L}_{n}$ with the reconstructed high-frequency image layer $\mathrm{geo\_H}_{n}$, as described below:
$$\mathrm{geo\_f} = \mathrm{geo\_L} + \mathrm{geo\_H}$$
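Steps (1) to (5) can be sketched end to end as below. This is a simplified illustration under several assumptions that are not taken from the paper: a single Gaussian low-pass split stands in for the Laplacian pyramid, only first-order terms per band are fitted (rather than the polynomial form), inputs are already on a common grid, and the names `gaussian_blur` and `fuse` are hypothetical.

```python
import numpy as np

def gaussian_blur(image, sigma=2.0):
    """Separable Gaussian low-pass filter with edge padding."""
    radius = int(3 * sigma)
    taps = np.exp(-0.5 * (np.arange(-radius, radius + 1) / sigma) ** 2)
    taps /= taps.sum()
    out = np.asarray(image, dtype=float)
    for axis in (0, 1):
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, taps, mode="valid"), axis, padded)
    return out

def fuse(geo_l, ms_bands, sigma=2.0):
    """geo_l: (H, W) geochemical layer; ms_bands: (a, H, W) RS bands on the same grid."""
    geo_l = np.asarray(geo_l, dtype=float)
    ms_bands = np.asarray(ms_bands, dtype=float)
    # (1) Decomposition: low-frequency part and high-frequency residual per band.
    ms_l = np.stack([gaussian_blur(band, sigma) for band in ms_bands])
    ms_h = ms_bands - ms_l
    # (3) Least-squares fit geo_L ~ sum_m t_m * MS_L_m + b (first-order terms only).
    design = np.column_stack([ms_l.reshape(len(ms_bands), -1).T,
                              np.ones(geo_l.size)])
    coef, *_ = np.linalg.lstsq(design, geo_l.ravel(), rcond=None)
    t, b = coef[:-1], coef[-1]
    # (4) Reconstruction: geo_H = sum_m t_m * MS_H_m + b, as in the text.
    geo_h = np.tensordot(t, ms_h, axes=1) + b
    # (5) Fusion: geo_f = geo_L + geo_H.
    return geo_l + geo_h, t, b

# Synthetic check: a geochemical layer built from known coefficients.
rng = np.random.default_rng(0)
ms_bands = rng.normal(size=(2, 32, 32))
geo_l = 3.0 * gaussian_blur(ms_bands[0]) - 2.0 * gaussian_blur(ms_bands[1]) + 5.0
geo_f, t, b = fuse(geo_l, ms_bands)
```

On this synthetic input the fit recovers the coefficients used to build `geo_l`, and the fused layer adds RS texture weighted by those coefficients on top of the smooth geochemical field.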

4.2. Convolutional Neural Network and Proposed MCNN Architecture

Convolutional Neural Network (CNN) algorithms have been widely applied across numerous fields, particularly in image classification and recognition tasks [25]. Evolving from Multilayer Perceptrons (MLPs) [37], the fundamental advantage of CNNs lies in their introduction of local connectivity and weight-sharing mechanisms, which have yielded remarkable success. These mechanisms not only drastically reduce the number of trainable parameters, simplify the network optimization process, and decrease overall model complexity, but they also significantly mitigate the risk of model overfitting. A standard CNN typically comprises several distinct types of layers: an input layer, convolutional layers, pooling layers, activation function layers, fully connected layers, and an output layer. To construct a robust and complete CNN architecture, convolutional and pooling layers are frequently alternated multiple times. Compared to traditional machine learning methods, CNNs possess a highly efficient capability for capturing sequential patterns alongside their inherent parameter-sharing properties. To address the inherent limitation of single convolutional kernels—which struggle to simultaneously capture localized mineralization details and broad regional tectonic backgrounds—this study proposes a Multi-scale Convolutional Neural Network (MCNN) architecture.
Inspired by the parallel Inception module introduced in GoogleNet [38], the stacked 1D-Inception modules within our MCNN are designed with four carefully engineered parallel branches: a 1 × 1 convolution, a 3 × 3 convolution, a 5 × 5 convolution, and a pooling layer. This specific configuration enables the dense extraction and fusion of features across multiple receptive fields. Furthermore, 1 × 1 convolutions are strategically introduced for dimensionality reduction and regularization.
In terms of implementation, the MCNN architecture consists of multiple stacked 1D-Inception blocks. Each block contains four parallel branches: a 1 × 1 convolution, a 3 × 3 convolution, a 5 × 5 convolution, and a max-pooling branch followed by a 1 × 1 projection layer. The number of filters in each branch was empirically set to ensure a balance between model capacity and computational efficiency. Batch normalization is applied after each convolutional layer to stabilize training, and the ReLU activation function is used throughout the network. After feature extraction, global average pooling is performed, followed by fully connected layers for classification.
The added 1 × 1 convolutions enhance the model’s representational capacity while substantially reducing the parameter count and improving its generalization ability, thereby mitigating the overfitting and computational redundancy issues often associated with simple parallel structures (Figure 4).
(1) Extract local features from the input feature sequence $X_{L}$ ($L$ denotes length). One-dimensional convolution: $F_{k}$, where $k$ denotes the kernel size. Assume the input feature is $x(t)$; the convolution operation is defined as:
$$s(t) = (x * \omega)(t) = \sum_{i=0}^{k-1} \omega(i)\, x(t - i)$$
where $\omega(i)$ denotes the kernel (filter) parameters. Feature extraction at different scales is determined by kernels of different sizes.
(2) Global Pooling and Feature Fusion
Features obtained from varying scales are concatenated to construct a high-dimensional representation enriched with multi-scale information. Global pooling is employed to enhance the model’s capability to recognize global patterns, thereby facilitating seamless integration with subsequent network layers. Specifically, a max-pooling operation is introduced within each parallel branch to effectively suppress the high-frequency noise inherently present in RS data.
Following the multi-scale extraction process, the resulting feature vectors undergo Global Average Pooling (GAP) and are subsequently concatenated along the channel dimension. The operation can be expressed as follows:
$$F_{\mathrm{combined}} = \mathrm{Concat}\big(\mathrm{GAP}(f_{k=3}),\ \mathrm{GAP}(f_{k=5}),\ \mathrm{GAP}(f_{k=7})\big)$$
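The convolution and concatenation formulas above can be illustrated with a small NumPy sketch. It uses one fixed averaging kernel per scale, whereas a trained network learns many filters per branch; the kernel values, the toy input, and the function names are assumptions made for illustration only.

```python
import numpy as np

def conv1d(x, w):
    """s(t) = sum_i w(i) * x(t - i), evaluated where the kernel fully overlaps x."""
    return np.convolve(x, w, mode="valid")

def multiscale_features(x, kernels):
    """Convolve with each kernel, apply ReLU, global-average-pool, then concatenate."""
    pooled = [np.maximum(conv1d(x, w), 0.0).mean() for w in kernels]
    return np.array(pooled)

x = np.arange(10, dtype=float)                   # hypothetical feature sequence
kernels = [np.ones(k) / k for k in (3, 5, 7)]    # one fixed kernel per scale
f_combined = multiscale_features(x, kernels)     # one pooled value per scale
```

Each entry of `f_combined` summarizes the response at one receptive-field size; with several filters per branch the same pattern yields a longer concatenated vector that feeds the fully connected layers.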
(3) Model Training and Optimization
To address the inherent uneven distribution of natural geological samples, the Synthetic Minority Over-sampling Technique (SMOTE) algorithm [39] was employed. This approach generates synthetic samples via linear interpolation within the feature space of minority classes, thereby effectively balancing the training weights across all categories.
The mathematical generation of a new synthetic sample can be expressed as follows:
$$x_{\mathrm{new}} = x_{i} + \lambda\,(x_{j} - x_{i}), \quad \text{s.t. } \lambda \sim \mathrm{Uniform}(0, 1)$$
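A minimal sketch of this interpolation rule is given below. It is not the reference SMOTE implementation: the neighbour count `k`, the random seed, and the toy minority samples are assumptions, and the function name `smote_sample` is hypothetical.

```python
import numpy as np

def smote_sample(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating towards nearest neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        dist = np.linalg.norm(minority - minority[i], axis=1)
        neighbours = np.argsort(dist)[1:k + 1]    # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.uniform(0.0, 1.0)               # lambda ~ Uniform(0, 1)
        synthetic.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.array(synthetic)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_points = smote_sample(minority, n_new=10, k=3)
```

Each synthetic point lies on the segment between a minority sample and one of its neighbours, so the generated samples stay inside the region already occupied by that class.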
Furthermore, to mitigate the extreme class imbalance among various lithological samples during training, the Focal Loss function was adopted as a replacement for the traditional Cross-Entropy Loss [40].
The formula for Focal Loss is defined as:
$$L_{FL}(p_{t}) = -\alpha_{t}\,(1 - p_{t})^{\gamma} \log(p_{t})$$
In this study, the focusing parameter was set to $\gamma = 2$ and the balancing parameter to $\alpha = 1.5$. By down-weighting easily classified samples, this loss function forces the model to concentrate on hard-to-distinguish and frequently confused lithologies. Additionally, regularization techniques were applied throughout training to further reduce the risk of overfitting and improve the model’s generalization capability.
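The effect of these settings can be checked with a short sketch of the per-sample loss. The function names are illustrative, and a single scalar `alpha_t` is assumed for all classes, matching the single value reported in the text.

```python
import math

def focal_loss(p_t, alpha_t=1.5, gamma=2.0):
    """Focal loss for one sample, where p_t is the predicted probability of the true class."""
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def cross_entropy(p_t):
    """Standard cross-entropy for one sample, for comparison."""
    return -math.log(p_t)

easy = focal_loss(0.9)   # confidently correct sample
hard = focal_loss(0.1)   # poorly classified sample
```

Relative to cross-entropy, the loss is multiplied by α_t(1 − p_t)^γ: for p_t = 0.9 this factor is 1.5 × 0.01 = 0.015, whereas for p_t = 0.1 it is 1.5 × 0.81 = 1.215, so hard samples dominate the gradient.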

5. Results and Discussions

This section presents the fusion results, lithological classification performance, ablation experiments, and geological interpretation of the proposed framework.

5.1. Results of Data Fusion

Figure 5 shows that the fused image preserves both the spatial texture and boundary information from Landsat-8 imagery and the regional compositional gradients derived from geochemical interpolation. Compared with the original geochemical layers, the fused result provides a spatially continuous representation on the Landsat grid, while retaining geochemical enrichment patterns relevant to lithological differentiation. This combined representation is therefore better suited for the subsequent classification of lithological units than either single data source alone [19].

5.2. Lithological Classification Experiment

5.2.1. Data Label and Experimental Setup

Based on the 1:200,000 regional geological map, this study established a lithological labeling system comprising eight bedrock categories (Table 1). The Quaternary System (Q), representing unknown samples, was excluded from training and gradient updates (N = 2200). To evaluate the model’s generalization capability, a stratified sampling strategy was employed to divide the labeled bedrock samples into training and independent test sets at an 8:2 ratio, ensuring a uniform distribution of lithological classes. Three-fold cross-validation was then performed within the training set for model selection and hyperparameter tuning. The dataset contains a total of 17,185 valid sampling points. To construct the training set, we specifically selected known lithological samples from areas where bedrock is exposed. This approach not only ensures label accuracy and avoids interference from regolith contamination during training but also enables the model to establish unique characteristic descriptions for each lithological unit by learning “pure” geochemistry-texture features.
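The partitioning scheme can be sketched in plain Python as follows. The class counts, seeds, and helper names are hypothetical, and the folds here are simple random folds; the text does not say whether the three folds were themselves stratified.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps the same train/test proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

def k_folds(indices, k=3, seed=0):
    """Partition training indices into k disjoint validation folds."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

labels = ["J"] * 50 + ["P"] * 30 + ["A"] * 20   # toy label vector
train_idx, test_idx = stratified_split(labels)  # 8:2 split per class
folds = k_folds(train_idx, k=3)                 # folds drawn from the training set only
```

The test indices are never passed to `k_folds`, so model selection does not touch the held-out set. A random split of this kind does not address spatial autocorrelation between neighbouring samples.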
The experiment aimed to perform lithological mapping from “multi-source fused features” to “lithological categories.” The workflow was implemented using the PyTorch 2.8.0 deep learning framework and accelerated by NVIDIA GPUs. The hyperparameters were optimized through multiple iterations: we utilized the AdamW optimizer [41] with a weight decay of 0.01 to mitigate overfitting. The initial learning rate was set to 0.001, the batch size to 64, and the maximum number of epochs to 300. A Dropout rate of 0.4 was introduced in the fully connected layers to further enhance model robustness. A learning rate decay strategy was adopted to improve convergence stability during training.
We acknowledge that the labeled samples may exhibit spatial autocorrelation, which is common in geological datasets. In this study, stratified random splitting was adopted to preserve class balance, but this strategy may not fully eliminate potential spatial leakage between training and test samples. Therefore, the reported accuracy should be interpreted as the performance under the current sampling framework. In future work, spatial block cross-validation or distance-constrained partitioning will be introduced to provide a more rigorous evaluation of generalization.

5.2.2. Lithological Classification

We utilized the MCNN architecture to classify the multi-source fused data (Figure 6). Parallel multi-scale convolutional kernels were used to extract local textures, medium-scale structures, and global background features, while a global adaptive pooling mechanism achieved deep feature fusion. The training process employed three-fold cross-validation. Although the overall classification performance is satisfactory, a small amount of “salt and pepper noise” still exists in the classification map (Figure 6). This noise is analyzed in combination with the confusion matrix (Figure 7) and the relevant geological background. The evaluation metrics for the training and test sets (loss and accuracy curves) are shown in Figure 8. The model’s accuracy on the validation set stabilized between 85% and 95%, with the loss value rapidly converging to 0.11 and the maximum training accuracy reaching 0.95. The smooth, continuous decline in training loss across epochs indicates that the model effectively learned the data features and that the AdamW optimizer functioned correctly. These results indicate that the MCNN algorithm captured the non-linear features of the fused data without overfitting, exhibiting strong generalization performance.

5.3. Comparison Results of Various Methods

To quantitatively evaluate the effectiveness of the multi-source fusion strategy and the MCNN architecture, we designed an ablation study comprising four configurations: (a) CNN + Geochemical Data; (b) MCNN + Geochemical Data; (c) CNN + Multi-source Fused Data; (d) MCNN + Multi-source Fused Data (Proposed). Metrics included Overall Accuracy (OA), F1-score, Precision, and Recall. Additionally, the Area Under the ROC Curve (AUC) and F1-score bar charts were analyzed.
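For reference, the per-class Precision, Recall, and F1-score and the OA can all be derived from a confusion matrix. The sketch below shows the standard definitions; the 3-class matrix is hypothetical and is not taken from Figure 7, and the function name is chosen for illustration.

```python
def classification_metrics(confusion):
    """Per-class precision, recall, F1 and overall accuracy.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    per_class = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    return per_class, correct / total


# Hypothetical 3-class confusion matrix, not taken from Figure 7.
confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
per_class, overall_accuracy = classification_metrics(confusion)
```

In this toy matrix 24 of 30 samples lie on the diagonal, so the OA is 0.8, while the first class has a recall of 8/10 and a precision of 8/9.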

5.3.1. Overall Performance Comparison

To examine model performance across specific geological units, ROC curves (Figure 9) and per-category F1-score bar charts (Figure 10) were plotted and, together with the detailed comparison in Table 5, used to analyze the overall performance of the four models on the test set. The results indicate a stepwise improvement in AUC as the data sources are enriched and the network architecture is enhanced. The CNN model relying solely on geochemical data (Model a) was the weakest baseline, yielding an AUC of 0.8814. Although it performed adequately on certain lithologies with distinct geochemical features, its overall classification accuracy was limited by the single data source. With the same data input, the MCNN model (Model b) improved the AUC to 0.9126. This demonstrates the advantage of multi-scale convolutional kernels in capturing lithological features, as they can simultaneously extract local anomalies and regional background variations in geochemical elements. When RS imagery was introduced for multi-source fusion, the AUC of the CNN model (Model c) rose to 0.9594, exceeding that of the model with only the improved network architecture (Model b), although its OA (0.75) remained below that of Model b (0.88; Table 5). This suggests that the complementarity of data sources is crucial for resolving lithological identification in complex geological contexts.
The proposed MCNN model combined with multi-source fused data (Model d) achieved the best overall performance, with an AUC of 0.9744 and an OA of 0.95 (Table 5). The ROC curve of Model (d) is closest to the top-left corner, indicating strong classification capability and robustness. Notably, the improvement in OA of approximately 7 percentage points from Model (b) to Model (d) (0.88 to 0.95) highlights the substantial contribution of the texture-guided fusion strategy. When relying solely on geochemical data, interpolation effects tend to produce overly smoothed spatial patterns, leading to ambiguous lithological boundaries. In contrast, the incorporation of high-frequency spatial information from RS data (such as geomorphic ridges, fault lineaments, and lithological contacts) provides additional structural constraints that help refine these diffuse geochemical patterns. This complementary integration enhances the delineation of lithological boundaries and results in more geologically consistent mapping outcomes. While explicit post hoc interpretability techniques (e.g., attention visualization or feature attribution methods) were not implemented in this study, the role of the texture-guided fusion mechanism is systematically evaluated through controlled ablation experiments. The consistent performance gains observed in Model (d) suggest that the model effectively leverages the synergy between high-frequency structural features and low-frequency geochemical backgrounds. This behavior is also consistent with established geological understanding of the coupling between surface morphology and material composition, providing an indirect yet meaningful form of interpretability. Future work will further incorporate interpretable deep learning techniques to explicitly analyze feature importance and enhance the transparency of the model.
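Since AUC is the main quantity compared here, a short sketch of how it can be computed may help. The binary AUC below uses the rank interpretation (the probability that a positive sample receives a higher score than a negative one). The text does not state how the multi-class AUC was aggregated, so the unweighted one-vs-rest mean is an assumption, as are the function names and the toy probabilities.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_ovr_auc(prob_rows, true_classes, class_names):
    """Unweighted mean of one-vs-rest AUCs (an assumed aggregation scheme)."""
    aucs = []
    for k, name in enumerate(class_names):
        scores = [row[k] for row in prob_rows]
        labels = [1 if c == name else 0 for c in true_classes]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / len(aucs)


# Hypothetical class probabilities for five samples and three classes.
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
]
true_classes = ["J", "J", "P", "P", "C"]
macro_auc = macro_ovr_auc(prob_rows, true_classes, ["J", "P", "C"])
```

Because AUC depends only on how the scores rank the samples, a model can have a high AUC and still a modest OA once a single class is assigned per sample, so the two metrics need not move together.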

5.3.2. Analysis of Lithological Identification Effectiveness

As illustrated in Figure 11a,b, the CNN algorithm utilizing geochemical data demonstrated baseline effectiveness in lithological identification, generally aligning with the macroscopic lithological categories of the geological map. However, significant misclassifications and omissions were present, yielding an overall classification accuracy of only 0.70. In contrast, the proposed MCNN algorithm achieved an accuracy of 0.88 using the same data, reflecting a markedly stronger feature recognition capability.
Figure 11c,d show that when a standard CNN processes multi-source fused data, the marked increase in sample complexity exacerbates the “salt-and-pepper” noise effect, limiting the classification accuracy to 0.75. This issue is effectively mitigated by the MCNN algorithm. It is worth noting, however, that the classification results for Quaternary strata generally deviated from actual geological conditions, indicating that the model struggles to accurately predict bedrock geology concealed beneath thick overburden. Geologically, this limitation occurs because Quaternary cover in this region is often transported (e.g., distal alluvial deposits or aeolian loess) rather than residual (weathered in situ). In transported cover, the surface geochemical signature is physically detached from the underlying bedrock, fundamentally disrupting the spatial and genetic correlations between surface textures and deep lithology that the multi-source fusion model relies upon.

5.4. Discussion

In this study, the effectiveness of the proposed texture-guided multi-source fusion framework combined with the MCNN architecture is evaluated through both quantitative metrics and geological consistency. The results demonstrate that the integration of geochemical and RS data significantly improves lithological classification performance, particularly in complex geological settings and Quaternary-covered areas.

5.4.1. Lithology-Dependent Performance and Geological Controls

The performance of the model varies across different lithological units, reflecting the intrinsic differences in material composition, texture, and structural characteristics. The Proterozoic (P) units, composed mainly of limestone, quartzite, and migmatized slate, show relatively low classification accuracy when only geochemical data are used. This is primarily due to sparse sampling and interpolation-induced smoothing, which obscure distinctive geochemical signatures. After incorporating multi-source data, classification performance improves significantly. This improvement is closely related to the strong geomorphological expression of quartzite and banded structures, which are effectively captured in RS imagery and compensate for geochemical discontinuities.
The Cretaceous (C) Guyang Formation, characterized by mixed sedimentary and volcanic lithologies, presents high intra-class variability. Conventional CNN models struggle to distinguish such heterogeneous assemblages. In contrast, the MCNN architecture, with its multi-scale feature extraction capability, can simultaneously capture local textures (e.g., tuff porphyritic structures) and broader sedimentary layering, resulting in improved classification performance. Similarly, the Jurassic (J) strata, dominated by conglomerates and tuffaceous sandstone, demonstrate the importance of texture information. The coarse-grained structure of conglomerates produces distinctive spatial patterns that are effectively recognized by the MCNN model. Intrusive rocks, including the Late Caledonian (K) and Early Lüliang (L) units, achieve consistently high classification accuracy. These lithologies exhibit both diagnostic geochemical signatures and well-developed structural fabrics such as gneissic and schistose textures. The MCNN model effectively captures these anisotropic features, leading to improved discrimination compared with conventional CNN approaches.

5.4.2. Contribution of Multi-Source Fusion and Model Architecture

The ablation experiments indicate that multi-source data fusion and model architecture play complementary roles in improving classification performance. Geochemical data provide essential compositional information but are limited by spatial discontinuity and interpolation effects, which often result in overly smooth patterns. In contrast, RS data offer continuous spatial coverage and high-frequency structural information, including lithological boundaries and geomorphological features. The proposed fusion strategy integrates these complementary data sources, allowing high-frequency spatial textures to constrain diffuse geochemical patterns. This integration reduces classification ambiguity and enhances the delineation of lithological boundaries. The MCNN architecture further strengthens this process by extracting features across multiple spatial scales, enabling the model to capture both local anomalies and regional geological structures. Importantly, while explicit interpretability techniques (e.g., attention visualization) were not implemented, the effectiveness of the fusion mechanism is supported by ablation experiments and its consistency with geological processes, suggesting a meaningful coupling between surface morphology and material composition.
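The two factors in the ablation design can be separated with simple arithmetic on the OA values reported in Table 5. The sketch below holds one factor fixed and varies the other, expressing each difference in percentage points; the dictionary layout and the helper name `gain_in_points` are illustrative choices.

```python
# Overall accuracy (OA) per configuration, as reported in Table 5.
oa = {
    ("CNN", "geochemical"): 0.70,
    ("MCNN", "geochemical"): 0.88,
    ("CNN", "fused"): 0.75,
    ("MCNN", "fused"): 0.95,
}


def gain_in_points(before, after):
    """OA difference between two configurations, in percentage points."""
    return round((oa[after] - oa[before]) * 100, 1)


# Effect of data fusion with the architecture held fixed.
fusion_gain = {
    model: gain_in_points((model, "geochemical"), (model, "fused"))
    for model in ("CNN", "MCNN")
}
# Effect of the architecture with the data source held fixed.
architecture_gain = {
    data: gain_in_points(("CNN", data), ("MCNN", data))
    for data in ("geochemical", "fused")
}
```

With the Table 5 values, fusion adds 5 points for the CNN and 7 points for the MCNN, while the switch from CNN to MCNN adds 18 points on geochemical data and 20 points on fused data, so in this design both gains are larger when the other factor is also present.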

5.4.3. Implications for Geological Mapping in Covered Areas

The results highlight the potential of multi-source data fusion for lithological mapping in Quaternary-covered regions, where direct bedrock exposure is limited. In such environments, geochemical signals may be partially decoupled from the underlying bedrock due to transported sediments. However, the integration of RS-derived structural information helps restore spatial constraints, improving the prediction of concealed lithological units. The model demonstrates the ability to infer bedrock distribution and structural features, although limitations remain in areas dominated by transported cover, where geochemical signals may not reliably reflect the parent material. Because this study was conducted in a single geological setting, the transferability of the learned relationships to other tectonic, climatic, or pedogenic environments remains to be verified.

5.4.4. Methodological Context: Relationship to 3D Geological Modeling

Recent advances in 3D and voxel-based geological modeling have significantly enhanced the reconstruction of subsurface structures. However, these approaches typically require dense and high-cost datasets, such as seismic surveys, borehole logs, and geophysical measurements, which are often unavailable for large-scale regional studies. In contrast, the 2D multi-source fusion framework proposed in this study provides a cost-effective and scalable alternative based on widely available surface data. Rather than replacing 3D modeling, this approach can be regarded as a complementary and preliminary step. It enables the identification of potential concealed lithological boundaries and structurally significant zones, thereby providing valuable prior constraints for targeted drilling and subsequent 3D modeling. Therefore, the proposed framework contributes not only to surface lithological mapping but also to the broader workflow of multi-scale geological investigation. Compared with 3D voxel-based modeling, the proposed 2D framework has lower data requirements and better scalability for regional screening, but it cannot explicitly reconstruct vertical lithological architecture or subsurface geometry. Therefore, its main value lies in providing laterally continuous prior information for areas where 3D constraints are insufficient.

6. Conclusions

To address the challenges of data heterogeneity and the interpretation of complex stratigraphic structures in exploration blind zones, this study proposes a texture-guided adaptive data fusion framework and combines it with an improved Multi-scale Convolutional Neural Network (MCNN) for intelligent geological mapping. The integration of multi-source data and the optimization of the deep learning model have demonstrated significant advantages in the recognition and interpretation of geological features.
The main conclusions and innovative insights are as follows:
  • Texture-Guided Fusion Strategy: A texture-guided fusion strategy based on multi-scale filtering and high-frequency texture reconstruction was proposed. Multi-source data fusion fully leverages the inherent characteristics and advantages of distinct data types, yielding excellent performance in rock mass identification while enhancing model interpretability. Ablation experiments demonstrate that this texture-guided fusion strategy improved the Overall Accuracy (OA) of the mapping by 7 percentage points (from 0.88 to 0.95).
  • Parallel Multi-Scale CNN Architecture: A parallel multi-scale CNN architecture was successfully constructed. This model overcomes the limitations of single-scale convolution, enabling the simultaneous extraction of microscopic vein textures and macroscopic rock mass backgrounds. Furthermore, by effectively addressing the issue of sample imbalance, the proposed MCNN exhibits superior robustness compared to traditional CNNs.
  • Prediction in Covered Areas: The model demonstrates the potential to infer bedrock distribution and concealed structural features beneath Quaternary cover, particularly in areas where geochemical signals remain genetically linked to the underlying lithology.
  • Limitations and Future Work: While the proposed framework achieves high accuracy in the Guyang area, its generalizability requires further validation. Future research will test the model in diverse climatic and pedogenic regimes (e.g., heavily lateritized terrains or glaciated regions) to evaluate the robustness of the texture-guided fusion strategy across different global geological settings and transported cover types.

Author Contributions

Methodology, Y.W. and R.T.; software, Y.W. and R.T.; writing—original draft preparation, Y.W.; writing—review and editing, Y.W. and Q.Z.; visualization, Y.W. and Q.Z.; supervision, K.X. and R.T.; funding acquisition, K.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (grant numbers 2023YFC2906400 and 2023YFC2906403) and the Major Demonstration Project of Science and Technology Innovation in Inner Mongolia Autonomous Region (grant number 2025KJTW0020).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study. Requests to access the datasets should be directed to the corresponding author.

Acknowledgments

The authors thank the anonymous reviewers and the editors for their hard work on this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hartmann, J.; Moosdorf, N. The New Global Lithological Map Database GLiM: A Representation of Rock Properties at the Earth Surface. Geochem. Geophys. Geosyst. 2012, 13, Q12004.
  2. Zhang, T.; Zhao, Z.; Dong, P.; Tang, B.-H.; Zhang, G.; Feng, L.; Zhang, X. Rapid Lithological Mapping Using Multi-Source Remote Sensing Data Fusion and Automatic Sample Generation Strategy. Int. J. Digit. Earth 2024, 17, 2420824.
  3. Bachri, I.; Hakdaoui, M.; Raji, M.; Teodoro, A.C.; Benbouziane, A. Machine Learning Algorithms for Automatic Lithological Mapping Using Remote Sensing Data: A Case Study from Souk Arbaa Sahel, Sidi Ifni Inlier, Western Anti-Atlas, Morocco. ISPRS Int. J. Geo-Inf. 2019, 8, 248.
  4. Peyghambari, S.; Zhang, Y. Hyperspectral Remote Sensing in Lithological Mapping, Mineral Exploration, and Environmental Geology: An Updated Review. J. Appl. Rem. Sens. 2021, 15, 031501.
  5. Ouyang, S.; Chen, W.; Qin, X.; Yang, J. Geological Background Prototype Learning-Enhanced Network for Remote-Sensing-Based Engineering Geological Lithology Interpretation in Highly Vegetated Areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 8794–8809.
  6. Bahrami, H.; Esmaeili, P.; Homayouni, S.; Pour, A.B.; Chokmani, K.; Bahroudi, A. Machine Learning-Based Lithological Mapping from ASTER Remote-Sensing Imagery. Minerals 2024, 14, 202.
  7. Ali, S.; Li, H.; Ali, A.; Hassan, J.I. Lithological Discrimination of Khyber Range Using Remote Sensing and Machine Learning Algorithms. Appl. Sci. 2024, 14, 5064.
  8. Xi, J.; Jiang, Q.; Liu, H.; Gao, X. Lithological Mapping Research Based on Feature Selection Model of ReliefF-RF. Appl. Sci. 2023, 13, 11225.
  9. Sekandari, M.; Masoumi, I.; Beiranvand Pour, A.; M Muslim, A.; Rahmani, O.; Hashim, M.; Zoheir, B.; Pradhan, B.; Misra, A.; Aminpour, S.M. Application of Landsat-8, Sentinel-2, ASTER and WorldView-3 Spectral Imagery for Exploration of Carbonate-Hosted Pb-Zn Deposits in the Central Iranian Terrane (CIT). Remote Sens. 2020, 12, 1239.
  10. Chen, Q.; Cai, D.; Zhao, Z.; Yang, X.; Wang, Y.; Jiang, X.; Xu, L.; Duan, H.; He, Y.; Zhang, X.; et al. Alteration Information Extraction and Mineral Prospectivity Mapping in the Laozhaiwan Area Using Multisource Remote Sensing Data. Remote Sens. 2025, 17, 2178.
  11. Hajaj, S.; El Harti, A.; Pour, A.B.; Khandouch, Y.; Fels, A.E.A.E.; Elhag, A.B.; Ghazouani, N.; Ustuner, M.; Laamrani, A. Evaluation of Heterogeneous Ensemble Learning Algorithms for Lithological Mapping Using EnMAP Hyperspectral Data: Implications for Mineral Exploration in Mountainous Region. Minerals 2025, 15, 833.
  12. Wang, Z.; Zuo, R.; Jing, L. Fusion of Geochemical and Remote-Sensing Data for Lithological Mapping Using Random Forest Metric Learning. Math. Geosci. 2021, 53, 1125–1145.
  13. Dong, Y.; Yang, Z.; Liu, Q.; Zuo, R.; Wang, Z. Fusion of GaoFen-5 and Sentinel-2B Data for Lithological Mapping Using Vision Transformer Dynamic Graph Convolutional Network. Int. J. Appl. Earth Obs. Geoinf. 2024, 129, 103780.
  14. Guo, D.; Wang, Z.; Zuo, R. Large-Scale Himalayan Leucogranite Mapping Based on Multi-Source Remote-Sensing Data and U-Net Convolutional Network. Nat. Resour. Res. 2025, 34, 2403–2421.
  15. Cracknell, M.J.; Reading, A.M. Geological Mapping Using Remote Sensing Data: A Comparison of Five Machine Learning Algorithms, Their Response to Variations in the Spatial Distribution of Training Data and the Use of Explicit Spatial Information. Comput. Geosci. 2014, 63, 22–33.
  16. Yao, C.; Xia, Q.; Zhang, X.; Fan, X.; Qin, Y.; Tan, J. Quaternary Petrography-based Mapping Method: A Modified Mapping Technique for Mineral Exploration in the Partially Covered Area. Acta Geosci. Sin. 2017, 38, 549–559.
  17. Govett, G.J.S. Rock Geochemistry in Mineral Exploration. Earth Sci. Rev. 1983, 17, 298–299.
  18. Taylor, S.R.; McLennan, S.M. The Continental Crust: Its Composition and Evolution. J. Geol. 1985, 94, 57–72.
  19. Xie, X.; Wang, X.; Zhang, Q.; Zhou, G.; Cheng, H.; Liu, D.; Cheng, Z.; Xu, S. Multi-Scale Geochemical Mapping in China. Geochem. Explor. Environ. Anal. 2008, 8, 333–341.
  20. Appiah-Twum, M.; Xu, W.; Sunkari, E.D. Mapping Lithology with Hybrid Attention Mechanism–Long Short-Term Memory: A Hybrid Neural Network Approach Using Remote Sensing and Geophysical Data. Remote Sens. 2024, 16, 4613.
  21. Harris, J.R.; Grunsky, E.C. Predictive Lithological Mapping of Canada’s North Using Random Forest Classification Applied to Geophysical and Geochemical Data. Comput. Geosci. 2015, 80, 9–25.
  22. Wang, Z.; Zuo, R.; Liu, H. Lithological Mapping Based on Fully Convolutional Network and Multi-Source Geological Data. Remote Sens. 2021, 13, 4860.
  23. Bai, S.; Zhao, J.; Yu, T.; Shao, Y. Fusion of Geochemical Data and Remote Sensing Data Based on Convolutional Neural Network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 1212–1225.
  24. Pan, T.; Zuo, R.; Wang, Z. Geological Mapping via Convolutional Neural Network Based on Remote Sensing and Geochemical Survey Data in Vegetation Coverage Areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 3485–3494.
  25. Li, C.; Xiao, K.; Sun, L.; Tang, R.; Dong, X.; Qiao, B.; Xu, D. CNN-Transformers for Mineral Prospectivity Mapping in the Maodeng–Baiyinchagan Area, Southern Great Xing’an Range. Ore Geol. Rev. 2024, 167, 106007.
  26. Lindsay, M.D.; Jessell, M.W.; Ailleres, L.; Perrouty, S.; de Kemp, E.; Betts, P.G. Geodiversity: Exploration of 3D Geological Model Space. Tectonophysics 2013, 594, 27–37.
  27. Calcagno, P.; Chilès, J.P.; Courrioux, G.; Guillen, A. Geological Modelling from Field Data and Geological Knowledge: Part I. Modelling Method Coupling 3D Potential-Field Interpolation and Geological Rules. Phys. Earth Planet. Inter. 2008, 171, 147–157.
  28. Caumon, G.; Collon-Drouaillet, P.; Le Carlier de Veslud, C.; Viseur, S.; Sausse, J. Surface-Based 3D Modeling of Geological Structures. Math. Geosci. 2009, 41, 927–945.
  29. Pusacker, K.; Coors, V.; Eckhardt, J.-D.; Rupf, I. A Concept for 3D Geological and Urban Subsurface Modeling with a Unified Voxel Model Examined by a Case Study for the City Center of Stuttgart (Baden-Württemberg), Germany. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2024, 10, 193–200.
  30. Tian, Y.; Xiao, S.; Zhang, R.; Weng, Z.; Wu, X.; Wu, Y. Local Dynamic Update Methods for 3D Geological Body Structure Model and Voxel Model. Earth Sci. Inform. 2024, 17, 841–851.
  31. Abdelsattar, A.; Hemdan, E.E.-D. A Geostatistical Predictive Framework for 3D Lithological Modeling of Heterogeneous Subsurface Systems Using Empirical Bayesian Kriging 3D (EBK3D) and GIS. Geomatics 2025, 5, 60.
  32. Wang, L.; Pan, Q.; Su, D.; Huang, S. Three-Dimensional Voxel Geological Modelling for Subsurface Stratigraphy: A Graph Convolutional Network Approach. Can. Geotech. J. 2025, 62, 1–15.
  33. Wellmann, F.; Caumon, G. Chapter One: 3-D Structural Geological Models: Concepts, Methods, and Uncertainties. In Advances in Geophysics; Elsevier: Amsterdam, The Netherlands, 2018; Volume 59, pp. 1–121.
  34. Duan, R.H.; Liu, C.H.; Shi, J.R. Late Neoarchean magmatic arc extends westward in the southern of Yinshan Block: Evidence from geochronology and geochemistry of the Wulatezhongqi and Wulatehouqi area. Acta Petrol. Sin. 2021, 37, 1372–1404.
  35. Egozcue, J.J.; Pawlowsky-Glahn, V.; Mateu-Figueras, G.; Barceló-Vidal, C. Isometric Logratio Transformations for Compositional Data Analysis. Math. Geol. 2003, 35, 279–300.
  36. Ding, H.; Jing, L.; Xi, M.; Bai, S.; Yao, C.; Li, L. Research on Scale Improvement of Geochemical Exploration Based on Remote Sensing Image Fusion. Remote Sens. 2023, 15, 1993.
  37. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
  38. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2015.
  39. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
  40. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2017; pp. 2999–3007.
  41. Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. arXiv 2017, arXiv:1711.05101.
Figure 1. Location and geological setting of the research area. (a) Administrative map of Inner Mongolia Autonomous Region; (b) Administrative map of the corresponding city; (c) Landsat-8 satellite image covering the region; (d) Regional geological map.
Figure 2. Fe2O3 elemental interpolation diagram.
Figure 3. Landsat-8 RS image.
Figure 4. Structure of the MCNN network based on the 1D-Inception module adapted from [38].
Figure 5. Multi-source data fusion image.
Figure 6. Lithological classification map based on multi-source fusion data and MCNN.
Figure 7. Confusion Matrix of Fusion Data.
Figure 8. Loss function and classification accuracy. (a) Loss function curves of the training set and validation set; (b) Accuracy curves of the training set and validation set.
Figure 9. ROC Curve.
Figure 10. F1-score Bar Chart.
Figure 11. Lithological Classification Maps: (a) CNN + Geochemical Data; (b) MCNN + Geochemical Data; (c) CNN + Multi-source Fusion Data; (d) MCNN + Multi-source Fusion Data.
Table 1. Lithologic Association Label Table.
| Label | Stratum and Geological Age | Lithology Association | Lithology Description |
|---|---|---|---|
| J | Jurassic | Jurassic | Sedimentary rock |
| Y | Early Yanshanian | Early Yanshanian | Intrusive rock |
| P | Proterozoic | Proterozoic | Sedimentary rock |
| C | Cretaceous | Cretaceous | Sedimentary rock |
| Q | Quaternary System | Quaternary System | Loose sediment |
| V | Variscan | Variscan | Intrusive rock |
| K | Late Caledonian | Late Caledonian | Intrusive rock |
| L | Early Luliangian | Early Luliangian | Intrusive rock |
| A | Archaeozoic | Archaeozoic | Metamorphic rock |
Table 2. The Elements, Analytical Methods, and Detection Limits of the RGNR.
| Element | Unit | Detection Limit | Analysis Method | Element | Unit | Detection Limit | Analysis Method |
|---|---|---|---|---|---|---|---|
| Ag | μg/g | 0.02 | AAS/AES | Pb | μg/g | 2 | XRF |
| As | μg/g | 1 | AFS | Sb | μg/g | 0.1 | AFS |
| Au | μg/g | 0.0003 | AAS/GF-AAS | Sn | μg/g | 1 | AES |
| B | μg/g | 5 | AES | Sr | μg/g | 5 | XRF |
| Ba | μg/g | 50 | XRF | Th | μg/g | 4 | XRF |
| Be | μg/g | 0.5 | AES | Ti | μg/g | 100 | XRF |
| Bi | μg/g | 0.1 | AFS | U | μg/g | 0.5 | COL/LCF |
| Cd | μg/g | 0.05 | AAS | | | | |
| Co | μg/g | 1 | XRF | V | μg/g | 20 | XRF |
| Cr | μg/g | 15 | XRF | W | μg/g | 0.5 | POL |
| Cu | μg/g | 1 | XRF | Y | μg/g | 5 | XRF |
| F | μg/g | 100 | ISE | Zn | μg/g | 10 | XRF |
| Hg | μg/g | 0.0005 | AFS | Zr | μg/g | 10 | XRF |
| La | μg/g | 30 | XRF | Al2O3 | % | 0.05 | XRF |
| Li | μg/g | 5 | AAS | CaO | % | 0.05 | XRF |
| Mn | μg/g | 30 | XRF | Fe2O3 | % | 0.05 | XRF |
| Mo | μg/g | 0.4 | POL | K2O | % | 0.05 | XRF |
| Nb | μg/g | 5 | XRF | MgO | % | 0.05 | XRF |
| Ni | μg/g | 2 | XRF | Na2O | % | 0.05 | XRF |
| P | μg/g | 100 | XRF | SiO2 | % | 0.1 | XRF |
Note: XRF: X-ray fluorescence spectroscopy; AFS: Atomic fluorescence spectroscopy; AAS: Atomic absorption spectroscopy; AES: Atomic emission spectroscopy; POL: Polarography; ISE: Ion selective electrode method; GF-AAS: Graphite furnace atomic absorption spectroscopy; COL/LCF: Colorimetry or laser catalytic fluorescence.
Table 3. Statistical Characteristics of the Raw Data.
| Element | Mean | StdDev | Skewness | Kurtosis | National Average Sediment Concentration | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|---|---|---|---|
| Ag | 83.46 | 22.49 | 4.49 | 54.78 | 81.00 | 25.12 | 71.10 | 81.00 | 91.61 | 466.87 |
| As | 3.33 | 2.58 | 5.22 | 34.82 | 2.76 | 0.12 | 2.15 | 2.76 | 3.64 | 27.66 |
| Au | 1.31 | 6.84 | 29.18 | 912.91 | 0.80 | 0.10 | 0.50 | 0.80 | 1.30 | 233.81 |
| Mo | 0.54 | 0.49 | 5.30 | 42.57 | 0.42 | 0.06 | 0.29 | 0.42 | 0.62 | 6.84 |
| Mn | 638.05 | 284.02 | 1.17 | 8.45 | 612.55 | 107.02 | 412.42 | 612.55 | 840.57 | 3909.94 |
| Li | 14.34 | 6.44 | 4.26 | 31.99 | 13.10 | 2.47 | 10.92 | 13.10 | 15.73 | 88.79 |
| La | 32.25 | 16.25 | 0.60 | (0.26) | 29.53 | 5.02 | 18.52 | 29.53 | 44.37 | 103.07 |
| Hg | 18.16 | 12.71 | 18.65 | 608.27 | 16.38 | 3.90 | 11.98 | 16.38 | 21.59 | 444.02 |
| F | 400.16 | 96.28 | 0.24 | 0.96 | 396.92 | 77.96 | 340.89 | 396.92 | 457.26 | 866.74 |
| Cu | 18.45 | 8.26 | 0.50 | (0.22) | 17.63 | 3.27 | 11.75 | 17.63 | 23.89 | 50.72 |
| Cr | 80.76 | 56.50 | 1.68 | 3.52 | 61.96 | 3.70 | 42.11 | 61.96 | 105.03 | 444.35 |
| Co | 13.67 | 6.65 | 0.54 | (0.18) | 12.85 | 2.13 | 8.34 | 12.85 | 18.23 | 42.82 |
| Cd | 60.16 | 24.45 | 3.23 | 37.14 | 57.91 | 10.16 | 42.71 | 57.91 | 74.13 | 403.13 |
| Bi | 0.13 | 0.07 | 2.71 | 15.74 | 0.11 | 0.01 | 0.08 | 0.11 | 0.16 | 0.90 |
| Be | 1.76 | 0.66 | 0.54 | 0.58 | 1.69 | 0.35 | 1.27 | 1.69 | 2.23 | 5.78 |
| Ba | 919.92 | 392.38 | 2.18 | 8.16 | 834.06 | 174.07 | 663.17 | 834.06 | 1063.76 | 4108.46 |
| B | 14.85 | 10.63 | 3.41 | 15.62 | 12.14 | 2.26 | 8.96 | 12.14 | 16.69 | 103.57 |
| SiO2 | 66.15 | 6.05 | 0.12 | 0.28 | 65.68 | 30.77 | 61.73 | 65.68 | 70.33 | 85.00 |
| Na2O | 3.27 | 0.67 | (0.49) | 0.45 | 3.38 | 0.94 | 2.86 | 3.38 | 3.73 | 5.70 |
| MgO | 1.95 | 1.14 | 1.27 | 3.97 | 1.82 | 0.15 | 1.09 | 1.82 | 2.59 | 10.97 |
| K2O | 2.50 | 0.70 | 0.12 | 0.34 | 2.52 | 0.64 | 2.05 | 2.52 | 2.93 | 5.03 |
| Fe2O3 | 4.91 | 2.40 | 0.55 | 0.22 | 4.87 | 0.70 | 2.93 | 4.87 | 6.48 | 18.37 |
| CaO | 3.51 | 1.42 | 1.51 | 8.28 | 3.47 | 0.74 | 2.60 | 3.47 | 4.24 | 17.55 |
| Al2O3 | 13.33 | 1.63 | (0.60) | 0.40 | 13.60 | 6.29 | 12.39 | 13.60 | 14.45 | 18.27 |
| Zr | 208.88 | 110.20 | 2.50 | 10.33 | 178.48 | 49.89 | 141.09 | 178.48 | 242.85 | 1125.91 |
| Zn | 53.96 | 22.47 | 0.21 | (0.87) | 53.52 | 10.61 | 34.02 | 53.52 | 71.17 | 130.75 |
| W | 0.54 | 0.55 | 15.98 | 434.33 | 0.47 | 0.05 | 0.31 | 0.47 | 0.67 | 17.16 |
| V | 83.25 | 41.34 | 0.68 | 0.71 | 79.98 | 6.59 | 50.95 | 79.98 | 109.37 | 297.71 |
| Y | 16.04 | 6.26 | 0.65 | 0.32 | 15.4 | 3.21 | 11.07 | 15.41 | 19.93 | 45.42 |
| U | 1.09 | 0.47 | 1.74 | 5.39 | 0.99 | 0.19 | 0.78 | 0.99 | 1.30 | 4.03 |
| Ti | 3828.38 | 2405.43 | 3.64 | 32.73 | 3579.42 | 229.08 | 2049.89 | 3579.42 | 4960.06 | 36,496.50 |
| Th | 6.71 | 3.59 | 1.30 | 5.18 | 5.97 | 0.88 | 3.84 | 5.97 | 8.94 | 42.48 |
| Sr | 527.08 | 256.38 | 2.84 | 17.55 | 499.96 | 75.62 | 376.75 | 499.96 | 631.71 | 3162.26 |
| Sn | 1.47 | 0.65 | 8.10 | 141.00 | 1.35 | 0.41 | 1.12 | 1.35 | 1.66 | 15.72 |
| Sb | 0.21 | 0.13 | 2.67 | 12.55 | 0.18 | 0.04 | 0.13 | 0.18 | 0.25 | 1.49 |
| Pb | 16.26 | 3.62 | 0.89 | 3.33 | 15.93 | 7.88 | 13.67 | 15.93 | 18.51 | 43.93 |
| P | 622.18 | 354.93 | 1.33 | 2.59 | 546.20 | 105.99 | 363.90 | 546.20 | 800.85 | 2765.38 |
| Ni | 23.02 | 14.04 | 1.13 | 1.09 | 19.54 | 2.47 | 12.42 | 19.54 | 30.61 | 87.64 |
| Nb | 15.19 | 6.11 | 1.51 | 7.58 | 14.53 | 2.24 | 11.16 | 14.53 | 18.34 | 71.64 |
Note: Ag, Au, Cd, and Hg are in units of 10⁻⁹; oxides are in %; other elements are in units of 10⁻⁶.
Table 4. Main Parameters of Landsat-8 Multispectral Data.
| Data Type | Band | Spectral Range/µm | Spatial Resolution/m |
|---|---|---|---|
| VNIR | 1 | 0.433–0.453 | 30 |
| | 2 | 0.450–0.515 | |
| | 3 | 0.525–0.600 | |
| | 4 | 0.630–0.680 | |
| | 5 | 0.845–0.885 | |
| SWIR | 6 | 1.560–1.660 | |
| | 7 | 2.100–2.300 | |
| | 8 | 0.500–0.680 | 15 |
| | 9 | 1.360–1.390 | 30 |
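Table 5. Ablation Study of Different Models and Data Sources.
| Experiment | Metric | Jurassic (J) | Early Yanshanian (Y) | Proterozoic (P) | Cretaceous (C) | Variscan (V) | Late Caledonian (K) | Early Luliangian (L) | Archaeozoic (A) | OA | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CNN (Geochemical data) | Precision | 0.74 | 0.61 | 0.74 | 0.64 | 0.84 | 0.69 | 0.69 | 0.68 | 0.70 | 0.88 |
| | Recall | 0.89 | 0.52 | 0.89 | 0.51 | 0.61 | 0.76 | 0.86 | 0.57 | | |
| | F1-score | 0.81 | 0.56 | 0.81 | 0.57 | 0.71 | 0.73 | 0.76 | 0.62 | | |
| MCNN (Geochemical data) | Precision | 0.94 | 0.95 | 0.94 | 0.68 | 0.96 | 0.93 | 0.87 | 0.75 | 0.88 | 0.91 |
| | Recall | 0.99 | 0.96 | 0.99 | 0.55 | 0.97 | 0.91 | 0.79 | 0.86 | | |
| | F1-score | 0.96 | 0.97 | 0.97 | 0.60 | 0.98 | 0.92 | 0.82 | 0.80 | | |
| CNN (Fusion data) | Precision | 0.94 | 0.60 | 0.80 | 0.63 | 0.85 | 0.83 | 0.72 | 0.72 | 0.75 | 0.96 |
| | Recall | 0.92 | 0.80 | 0.91 | 0.59 | 0.70 | 0.75 | 0.83 | 0.53 | | |
| | F1-score | 0.93 | 0.69 | 0.85 | 0.61 | 0.77 | 0.79 | 0.77 | 0.61 | | |
| MCNN (Fusion data) | Precision | 0.99 | 0.98 | 0.99 | 0.93 | 0.96 | 0.97 | 0.99 | 0.88 | 0.95 | 0.97 |
| | Recall | 1.00 | 0.99 | 1.00 | 0.96 | 0.97 | 0.98 | 1.00 | 0.91 | | |
| | F1-score | 1.00 | 0.97 | 0.99 | 0.96 | 0.98 | 0.99 | 0.98 | 0.95 | | |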
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Wang, Y.; Xiao, K.; Tang, R.; Zhang, Q. Lithological Mapping Based on Multi-Source Fusion Data and Convolutional Neural Networks: A Case Study of the Guyang Area, Inner Mongolia, China. Appl. Sci. 2026, 16, 4003. https://doi.org/10.3390/app16084003