Article

Data-Driven Decoupling of Metallogenic Patterns: A Case Study of Skarn-Type vs. Hydrothermal Vein-Type Pb-Zn Deposits in the Shanghulin Area, Inner Mongolia, China

1 Harbin Center for Integrated Natural Resources Survey, China Geological Survey, Harbin 150086, China
2 Observation and Research Station of Earth Critical Zone in Black Soil, Ministry of Natural Resources, Harbin 150086, China
3 Heilongjiang Institute of Natural Resources Survey, Harbin 150036, China
4 Geomathematics Key Laboratory of Sichuan Province, Chengdu University of Technology, Chengdu 610059, China
5 SinoProbe Laboratory, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China
* Author to whom correspondence should be addressed.
Minerals 2026, 16(1), 6; https://doi.org/10.3390/min16010006
Submission received: 19 November 2025 / Revised: 10 December 2025 / Accepted: 18 December 2025 / Published: 20 December 2025
(This article belongs to the Special Issue Geochemical Exploration for Critical Mineral Resources, 2nd Edition)

Abstract

The close spatial and genetic coexistence of Skarn-type and Hydrothermal Vein-type Pb-Zn deposits in the Shanghulin area, Inner Mongolia, poses a significant challenge to conventional “undifferentiated” prediction models. This study aims to decouple these distinct metallogenic patterns using a data-driven, “type-specific modeling” strategy, establishing separate prediction models for Skarn-type and Hydrothermal Vein-type mineralization. Our workflow first employs Lasso–RFECV for rigorous pre-screening of over 60 geoscience features to identify the optimal predictive subset. Subsequently, an XGBoost model is trained on these selected features, and the SHAP framework is applied to interpret the geological significance of its decision logic. The results confirm two distinct indicator systems. (1) The Skarn-type model is controlled by spatial proximity to a heat source, relying heavily on Distance_to_Volcano and high-temperature indicators (CLR_Mo, CLR_W, CLR_Mn). (2) The Hydrothermal Vein-type model is “chemical fingerprint-driven”, prioritizing CLR_Y and identifying a complex “leaching-enrichment” pattern: mineralization requires simultaneous wall-rock leaching (low CLR_Al2O3, low CLR_Y) and specific metal enrichment (high CLR_Co, high CLR_Zn). The controlling factors therefore differ: Skarn-type deposits are governed by magmatic proximity, whereas Hydrothermal Vein-type deposits are defined by specific alteration geochemical signatures. The proposed “Lasso–RFECV → XGBoost → SHAP” workflow successfully decouples these independent, geologically meaningful prospectivity models from complex data, offering a new paradigm for precise exploration.

1. Introduction

Mineral resources are fundamental to national economic development and security. In China, lead (Pb) and zinc (Zn) are critical strategic resources essential for the electrical and military industries [1,2,3,4]. The Daxing’anling (Greater Khingan Range), located in the eastern segment of the Central Asian Orogenic Belt, is a major Ag-Pb-Zn province. Its western margin, the Derbugan Metallogenic Belt, hosts numerous Pb-Zn deposits [5,6,7,8,9]. However, the region’s complex tectonic and magmatic history has resulted in the spatial coexistence of Hydrothermal Vein-type and Skarn-type deposits. This superposition leads to mixed geochemical signatures, posing a significant challenge to differentiating deposit types using traditional exploration methods.
Machine learning (ML) has become pivotal for integrating heterogeneous geoscience data. While algorithms like Random Forest (RF) have served as benchmarks in Mineral Prospectivity Mapping (MPM) [10,11,12,13], XGBoost has emerged as a state-of-the-art implementation of Gradient Boosted Decision Trees due to its superior efficiency and capability in handling high-dimensional data [14,15]. Despite these advancements, most existing models treat regional potential uniformly, failing to differentiate between genetic types. This often results in predictive patterns that are merely a fuzzy superposition of geological processes. Furthermore, utilizing Explainable AI (XAI) techniques, such as SHAP (SHapley Additive exPlanations), is essential to transform these “black box” models into transparent tools that quantify the contribution of specific geological features [14,16,17].
To address the challenges of insufficient model specificity and low signal-to-noise ratios in feature sets, this study proposes a “two-stage” intelligent prediction framework. We employ Lasso–RFECV to perform rigorous feature pre-screening, followed by a specific XGBoost model for final prediction. Subsequently, SHAP analysis is applied to deconstruct the predictive logic. We execute this workflow independently for Skarn-type and Hydrothermal Vein-type datasets, establishing separate, data-driven models to resolve the “classification difficulty” and precisely identify the dominant prospecting indicators for each genetic type.

2. Geological Setting and Data

2.1. Tectonic Background

The study area is situated in the eastern segment of the Central Asian Orogenic Belt (CAOB), on the western margin of the Daxing’anling (Greater Khingan) Range. It lies primarily to the north of the Derbugan Fault Belt. Since the Paleozoic, the region has undergone a prolonged and complex evolutionary history marked by the superposition of three major tectonic domains: the Paleo-Asian Ocean, the Mongol-Okhotsk Ocean, and the Paleo-Pacific Ocean [6,9]. This evolution was accompanied by multiple episodes of intense magmatic activity [6,9]. This unique tectonic-magmatic setting provided exceptional conditions for large-scale polymetallic mineralization [18,19], establishing the area as a critical component of the nationally renowned “III-47 Xinbaerhu Youqi—Genhe Polymetallic Metallogenic Belt” (Figure 1). Consequently, the study area, serving as a representative part of this metallogenic belt, is recognized as one of Northern China’s key Pb-Zn polymetallic ore clusters.

2.2. Regional Geological Setting

The study area is characterized by a complex geological structure with widespread Mesozoic volcanic-magmatic activity. The surface is extensively covered by Upper Jurassic volcanic series and Quaternary unconsolidated sediments. The region exhibits a clear structural framework, providing a favorable geological setting for the formation of polymetallic deposits.
The exposed strata in the study area are dominated by Mesozoic volcanic-sedimentary series and Cenozoic unconsolidated sediments. The Upper Jurassic (J3) strata are the most widely distributed, constituting the dominant rock series of the region, and consist mainly of the Damaoguaihe Formation (J3d), Baiyingaolao Formation (J3b), and Longjiang Formation (J3l). Lithologically, these formations consist of an assemblage of andesite, pyroclastic rock, rhyolite, and tuffaceous sandstone. Additionally, sporadic outcrops of the Lower Cretaceous (K1) Tamulangou Formation (K1t) are exposed in the southeastern part of the region. Quaternary (Q) unconsolidated sediments are also widespread, primarily distributed in strips along river valleys and drainage systems. These units, which overlie the bedrock, are typically alluvial-proluvial gravel layers (Figure 2).
Magmatic activity in the area was intense, with intrusive rocks widely distributed, especially concentrated in the northeastern part of the study area, where they form a large-scale composite batholith. As shown in Figure 2, this intrusive activity is closely related to the Upper Jurassic volcanism and is also predominantly Late Jurassic in age, making it a product of Yanshanian magmatism. The intrusive rock types are diverse but mainly intermediate to felsic, including biotite granite (γ₅²), granodiorite (γδ₅¹), and quartz monzonite (ξ₅²). These intrusions are mostly emplaced into the Upper Jurassic volcanic strata and are considered among the most important ore-forming parent rocks in the region. The contact zones between these intrusions and the surrounding country rock are favorable locations for prospecting endogenetic metallic deposits [6,22].
Tectonic structures are sparsely exposed in the area. As shown in Figure 1C, the main faults primarily strike NE to NEE, which is consistent with the orientation of the Derbugan Fault Belt; the distribution of Quaternary sediments is also clearly controlled by this trend. Furthermore, some NWW-striking faults are relatively evenly distributed throughout the region.

2.3. Characteristics of Pb-Zn Deposits

2.3.1. Skarn-Type Pb-Zn Deposits

The formation of the Xiahulin Skarn-type Pb-Zn deposit is closely related to Mesozoic magmatic activity, and its occurrence is strictly controlled by contact zones. Ore bodies are primarily hosted at or near the contact between Early Yanshanian granitic intrusions and the carbonate-bearing strata (especially marble) of the Upper Proterozoic (or Sinian) Ergunarhe Group. Mineralization is characterized by a typical high-temperature contact metasomatic process. Crucially, the deposit exhibits distinct alteration zonation patterns: proximal zones are dominated by high-temperature anhydrous prograde skarn minerals (garnet, diopside), which grade outwards into retrograde hydrous skarn assemblages (epidote, chlorite, actinolite) superimposed by sulfide mineralization. This alteration is the primary proximal indicator for this deposit type. The ore minerals are predominantly galena and sphalerite, often associated with magnetite, chalcopyrite, pyrite, pyrrhotite, and molybdenite. Geochemically, the deposit displays a classic thermal halo zoning sequence centered on the intrusion: proximal high-temperature anomalies (Mo-W-Sn) transition outwards to medium-temperature polymetallic zones (Cu-Fe-Zn-Pb), and finally to distal low-temperature halos (Ag-Mn). These characteristics indicate that the key to prospecting for Skarn-type Pb-Zn deposits lies in identifying favorable intrusion-carbonate contact interfaces and their associated specific alteration and medium-to-high temperature geochemical anomalies [1,23,24].

2.3.2. Hydrothermal Vein-Type Pb-Zn Deposits

The Erdaohe Hydrothermal Vein-type Pb-Zn deposit represents a distinct metallogenic mechanism, with its formation being primarily controlled by tectonic activity. Ore bodies are strictly hosted within regional NE-trending fault zones and their derivative NW- and NNE-trending secondary fracture zones. They occur as veins, lenses, stockworks, or disseminations, with fault intersections, flexures, or dilational jogs being the primary loci for ore enrichment. The influence of host rock lithology (e.g., phyllite, tuff) on ore localization is relatively minor. Unlike the high-temperature alteration of the Skarn-type, the wall rocks adjacent to the veins exhibit a medium-to-low temperature hydrothermal alteration assemblage. The alteration typically manifests as symmetrical zoning around the ore veins, with intense silicification and sericitization in the inner zone grading outward into chloritization and propylitization. The ore minerals are also predominantly galena and sphalerite, but often contain pyrite and arsenopyrite, with gangue minerals dominated by quartz and carbonates. The geochemical mineralization model involves the precipitation of metals from cooling ore-forming fluids circulating within fracture zones. Consequently, the elemental assemblage shows marked vertical and lateral zoning. It is typically a Pb-Zn-Ag-Sb-As association, with Ag, Sb, and As anomalies often occurring in the distal or shallow parts of the metallogenic system relative to the Pb-Zn ore bodies. Therefore, the indicator system for prospecting Hydrothermal Vein-type Pb-Zn deposits should focus more on identifying favorable ore-controlling structures, specific medium-to-low temperature alteration assemblages, and their corresponding geochemical anomalies [1,2,25,26].

2.4. Data Introduction

The datasets employed in this study primarily consist of geological map data and geochemical data. The geochemical data were sourced from the Regional Geochemistry National Reconnaissance (RGNR) program. The sampling was conducted using a regular grid system with a density of 1–2 samples per 4 km². Stream sediment samples were collected as the sampling medium to effectively capture the regional geochemical anomalies. A total of 9732 samples were collected covering the entire study area (see Figure 3 for sample distribution). The samples were analyzed using Inductively Coupled Plasma Mass Spectrometry (ICP-MS) and X-ray Fluorescence (XRF) for 39 chemical elements, and strict quality control procedures were implemented (detailed analytical protocols are described in references [27,28,29]).
Given that geochemical data are typical Compositional Data (CoDA), in which the sum of all components is a constant, they suffer from the “closure effect”. Direct statistical analysis of such data can lead to spurious correlations [30]. To effectively address this issue, this study applied the Centered Log-Ratio (CLR) transformation to the raw geochemical data prior to any subsequent processing. This method is a standard preprocessing step within the CoDA framework. Its theoretical basis and application have been extensively discussed in special issues of journals such as the Journal of Geochemical Exploration, and thus will not be elaborated upon here.
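As an illustration of the CLR step described above, the following minimal sketch transforms a single composition by subtracting the log of its geometric mean from the log of each part. The function name `clr` and the three-part sample are assumptions made for this example only; the study applies the transform to 39 elements, and any zero or below-detection values would need to be handled before taking logarithms.

```python
import math

def clr(composition):
    """Centered log-ratio transform of one strictly positive composition."""
    if any(v <= 0 for v in composition):
        raise ValueError("CLR requires strictly positive parts; treat zeros first")
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [lv - mean_log for lv in logs]

# Hypothetical three-part sample (illustrative values, not study data).
sample = [10.0, 100.0, 1000.0]
clr_values = clr(sample)

# Rescaling every part by the same factor leaves the CLR values unchanged,
# which is why the transform removes the closure effect.
rescaled_values = clr([v / 10.0 for v in sample])
```

The transformed values always sum to zero, so each CLR value expresses an element relative to the whole composition rather than as an absolute concentration.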

3. Methods

3.1. RFECV (Recursive Feature Elimination with Cross-Validation)

Recursive Feature Elimination (RFE) is an efficient “wrapper” feature selection method that iteratively trains a model to identify and eliminate features with the lowest contribution. This study employs RFECV, or “Recursive Feature Elimination with Cross-Validation,” which is designed to identify the optimal feature subset that yields the maximum predictive performance in a data-driven manner [31,32,33].
The RFECV workflow proceeds as follows: First, a base “estimator” model (in this study, a Lasso Logistic Regression) is trained on the complete feature set. The model then ranks all features based on their importance (e.g., the absolute Lasso coefficients) and eliminates the least important one (a step size of 1 removes a single feature per iteration). This process is then repeated recursively on the remaining N-1 features until no features remain.
The most critical component of this method is the “CV” (Cross-Validation). At each step of feature elimination, RFECV performs a complete k-fold cross-validation (k = 5) and records the mean score for a pre-defined metric (accuracy in this study). This process ultimately generates a “Cross-Validation Accuracy vs. Number of Features” performance curve, enabling us to objectively identify the “optimal number of features” that achieves the highest accuracy. This study adopts this method to eliminate redundant and noisy features, aiming to achieve the most robust predictive performance with the most parsimonious feature combination, thereby identifying the truly critical ore-controlling variables.
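The elimination-plus-cross-validation loop can be sketched in plain Python as follows. This is a simplified illustration, not the study's implementation: to keep the example dependency-free, the Lasso logistic regression is replaced by a nearest-centroid classifier, and the absolute Lasso coefficient is replaced by the absolute difference of class means. All function names and the synthetic data (one informative feature plus four noise features) are assumptions.

```python
import random

def importance(rows, f):
    """Absolute difference of class means for feature f (stand-in for |Lasso coefficient|)."""
    pos = [x[f] for x, y in rows if y == 1]
    neg = [x[f] for x, y in rows if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def rfe_path(rows, n_features):
    """Feature subsets from all features down to one, dropping the weakest each step."""
    remaining, path = list(range(n_features)), []
    while remaining:
        path.append(list(remaining))
        remaining.remove(min(remaining, key=lambda f: importance(rows, f)))
    return path

def centroid_accuracy(train, test, feats):
    """Accuracy of a nearest-centroid classifier restricted to the given features."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in train if y == label]
        centroids[label] = [sum(m[f] for m in members) / len(members) for f in feats]
    hits = 0
    for x, y in test:
        dist = {lab: sum((x[f] - c) ** 2 for f, c in zip(feats, cen))
                for lab, cen in centroids.items()}
        hits += min(dist, key=dist.get) == y
    return hits / len(test)

def rfecv(rows, n_features, k=5):
    """Return the best feature count, the selected features, and the CV accuracy curve."""
    folds = [rows[i::k] for i in range(k)]
    curve = {n: 0.0 for n in range(1, n_features + 1)}
    for i in range(k):
        train = [r for j in range(k) if j != i for r in folds[j]]
        for feats in rfe_path(train, n_features):  # elimination is redone inside each fold
            curve[len(feats)] += centroid_accuracy(train, folds[i], feats) / k
    best_n = max(curve, key=lambda n: (curve[n], -n))  # ties go to the smaller subset
    selected = rfe_path(rows, n_features)[n_features - best_n]
    return best_n, selected, curve

# Synthetic data: feature 0 separates the classes, features 1-4 are pure noise.
rng = random.Random(0)
data = []
for i in range(200):
    label = i % 2
    features = [rng.gauss(3.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)]
    data.append((features, label))

best_n, selected, curve = rfecv(data, n_features=5)
```

Here `curve` plays the role of the “Cross-Validation Accuracy vs. Number of Features” curve, and `best_n` is the feature count at its maximum.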

3.2. XGBoost Model

XGBoost (Extreme Gradient Boosting), one of the benchmark algorithms in machine learning, is a highly efficient, flexible, and scalable algorithm based on Gradient Boosting Decision Trees (GBDT) proposed by Chen and Guestrin in 2016 [34]. Through deep optimization of the traditional GBDT algorithm, it addresses bottlenecks in computational speed and overfitting control, and is characterized by its state-of-the-art predictive accuracy and efficiency in data science competitions and real-world applications.
XGBoost is an ensemble learning technique that sequentially builds multiple decision trees, where each new tree fits the predictive residuals (errors) of the preceding trees to continuously improve the model’s overall performance. This method introduces several key innovations: First, it incorporates L1 and L2 regularization terms into its objective function, effectively preventing overfitting by penalizing model complexity—a significant advantage over traditional GBDT. Second, the algorithm’s implementation is highly optimized, supporting parallelized feature processing, cache-aware computation, and built-in handling of sparse data, which greatly accelerates training speed. This strategy of “iterative error-correction” and “strict overfitting control” allows XGBoost to efficiently learn highly precise patterns from high-dimensional, complex, and non-linear data.
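The “iterative error-correction” idea can be illustrated with a deliberately small sketch: plain gradient boosting with squared-error loss, where each round fits a one-split tree (a stump) to the current residuals. This is not XGBoost itself; it omits the L1/L2 regularization, second-order gradient information, and system-level optimizations described above. The function names, the learning rate, and the toy data are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the last value would leave the right leaf empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Sequentially add stumps, each fitted to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps, mse_history = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        mse_history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    predict = lambda x: base + learning_rate * sum(s(x) for s in stumps)
    return predict, mse_history

# Toy one-dimensional regression problem (illustrative only).
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
predict, mse_history = boost(xs, ys)
```

The training error in `mse_history` shrinks round by round because every new stump corrects part of what the previous ensemble got wrong.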
Owing to its superior performance, XGBoost has been widely and successfully applied in the field of mineral prospectivity mapping. For example, Yu, addressing the limitations of traditional modeling, proposed an explainable ensemble learning prediction method using Random Forest, XGBoost, and AdaBoost as learners [34]. In a 3D prediction study of the Lannigou gold mine in China, Zhang compared models such as Weights of Evidence and XGBoost, concluding that the XGBoost model yielded the best performance, and subsequently used its results for target delineation [15]. These studies generally agree that when processing high-dimensional, non-linear geoscience data, XGBoost’s powerful learning capability, built-in regularization, and high computational performance make it one of the most advanced and effective tools for data-driven mineral prediction today.
The XGBoost algorithm, as an ensemble model, offers valuable insights into feature relevance. Unlike opaque “black-box” models, XGBoost can provide variable importance rankings and reveal partial dependence relationships. This capability enables the identification and explanation of important predictors, which is highly applicable in mineral prospectivity mapping.

3.3. SHAP (SHapley Additive exPlanations)

The core idea of SHAP (SHapley Additive exPlanations) originates from the Shapley value in cooperative game theory. In game theory, the Shapley value is used to measure the contribution of each participant within a coalition to ensure a fair distribution of the payoff. SHAP cleverly adapts this concept to machine learning model explanation, quantifying the contribution of each feature to an individual prediction [35,36,37,38,39]. The SHAP value calculation is based on the marginal contribution of a feature averaged across all possible feature permutations (orderings), reflecting its precise impact on the model output. The formula is as follows:
$$V_i = V_i^{(\mathrm{base})} + \sum_{j=1}^{|S|} \mathrm{shap}(x_{i,j})$$

$$\mathrm{shap}(x_{i,j}) = \sum_{M \subseteq S \setminus \{j\}} \frac{|M|!\,\left(|S| - |M| - 1\right)!}{|S|!} \left( V\left(M \cup \{x_{i,j}\}\right) - V(M) \right)$$

where $V_i$ is the model prediction for sample $i$, $V_i^{(\mathrm{base})}$ represents the expected output value for sample $i$, $\mathrm{shap}(x_{i,j})$ represents the marginal contribution of feature $j$ to the prediction value for sample $i$, $S$ is the set of all input feature indices, $M$ is a subset of features not containing feature $j$, and $V(M \cup \{x_{i,j}\})$ and $V(M)$ represent the model’s prediction output with and without feature $j$, respectively.
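To make the Shapley weighting concrete, the sketch below computes exact Shapley values for a tiny three-feature value function by enumerating every subset. It illustrates the definition only; practical SHAP implementations for tree models use much faster algorithms. The function names and the toy value function are assumptions for this example.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values; value_fn maps a frozenset of feature indices to an output."""
    features = range(n_features)
    phi = []
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |M|! (|S| - |M| - 1)! / |S|! for subsets M of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                m = frozenset(subset)
                total += weight * (value_fn(m | {j}) - value_fn(m))
        phi.append(total)
    return phi

def toy_model(present):
    """Hypothetical model output when only the features in `present` are known."""
    out = 1.0                      # base value
    if 0 in present:
        out += 2.0
    if 1 in present:
        out += 1.0
    if 0 in present and 1 in present:
        out += 1.0                 # interaction shared equally by features 0 and 1
    return out                     # feature 2 never changes the output

phi = shapley_values(toy_model, 3)
base_value = toy_model(frozenset())
full_prediction = toy_model(frozenset({0, 1, 2}))
```

In this toy case the contributions are 2.5, 1.5, and 0, and adding them to the base value of 1 recovers the full prediction of 5, which is the additivity property expressed by the first equation.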

4. Results

4.1. Construction of Evidence Layers

To construct a multivariate evidence system capable of comprehensively reflecting the metallogenic geological processes in the study area, this study utilized the ArcGIS 10.8 platform. We interpolated the CLR-transformed geochemical data using Inverse Distance Weighting (IDW) and processed geological data (e.g., faults, volcanic craters) using Euclidean distance. This process generated the foundational evidence layers from which the tabular data required for the XGBoost model were subsequently extracted.
First, to convert the discrete geochemical sample point information into continuous spatial variables, we applied the Inverse Distance Weighting (IDW) method to spatially interpolate all 39 pre-processed (CLR-transformed) chemical elements. This step ultimately generated 39 geochemical raster maps that effectively depict the spatial distribution patterns and concentration gradients of each element. Given that 39 elements are too numerous to display, this paper presents only the geochemical maps for Pb and Zn, which are highly relevant to Pb-Zn mineralization (Figure 4 and Figure 5).
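For readers unfamiliar with IDW, a minimal sketch of the estimator at a single location is given below. The study used the ArcGIS implementation; the power of 2, the use of all samples rather than a search neighbourhood, and the sample coordinates here are assumptions for illustration.

```python
import math

def idw(x, y, samples, power=2.0):
    """Inverse Distance Weighting estimate at (x, y) from (sx, sy, value) samples."""
    numerator = denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value           # exactly on a sample point: return its value
        w = 1.0 / d ** power       # closer samples receive larger weights
        numerator += w * value
        denominator += w
    return numerator / denominator

# Two hypothetical CLR-transformed sample values at known coordinates (km).
samples = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0)]
midpoint_estimate = idw(1.0, 0.0, samples)
near_first_estimate = idw(0.5, 0.0, samples)
```

Evaluating `idw` at the centre of every raster cell would produce one continuous geochemical layer per element.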
Second, for structural geological elements, considering that Hydrothermal Vein-type deposits are strictly controlled by structures and Skarn-type Pb-Zn deposits are controlled by volcanic craters, we performed a Euclidean distance analysis on the volcanic craters and faults within the study area. This method calculates the straight-line distance from each pixel in the study area to the nearest volcanic crater or fault, generating continuous distance raster maps. This effectively transformed the important qualitative ore-control rule of “proximity to structures” into a quantitative predictor variable. Figure 6 shows the volcanic crater map and Figure 7 the fault distance map.
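A brute-force version of the distance calculation is sketched below for a small grid. The grid size, the 100 m cell size, and the target cell are hypothetical; GIS tools compute the same quantity far more efficiently on real rasters.

```python
import math

def distance_raster(n_rows, n_cols, cell_size, targets):
    """Straight-line distance from each cell to the nearest target cell.

    targets: list of (row, col) cells occupied by a fault or volcanic crater.
    """
    raster = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            row.append(nearest * cell_size)  # convert cell units to map units
        raster.append(row)
    return raster

# Hypothetical 3 x 3 grid with 100 m cells and one crater in the top-left cell.
dist_to_volcano = distance_raster(3, 3, 100.0, targets=[(0, 0)])
```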
Finally, for lithological and stratigraphic units, we rasterized all stratigraphic units and intrusive bodies of different periods (polygonal vector data) in the study area to convert this qualitative information into binary features. Specifically, an independent binary “mask” layer was created for each geological unit deemed relevant to mineralization (e.g., a specific stratum or a particular granite phase). In these layers, pixels where the geological unit is present were assigned a value of ‘1’, while areas where it is absent were assigned ‘0’.
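The binary encoding can be sketched as follows, assuming the polygons have already been rasterized into a grid of unit codes. The tiny grid is invented for illustration; the unit codes are taken from the formations named in Section 2.2.

```python
def binary_masks(unit_raster, units_of_interest):
    """Build one 0/1 layer per geological unit from a raster of unit codes."""
    return {
        unit: [[1 if cell == unit else 0 for cell in row] for row in unit_raster]
        for unit in units_of_interest
    }

# Hypothetical rasterized geological map.
unit_raster = [
    ["J3d", "J3d", "Q"],
    ["J3b", "J3d", "Q"],
    ["J3b", "J3b", "Q"],
]
masks = binary_masks(unit_raster, ["J3d", "J3b"])
```

Each mask becomes one column of the tabular feature dataset once the raster stack is sampled cell by cell.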
Ultimately, all geochemical maps, distance maps, and binary geological unit maps generated through the above processes were unified under the same projection coordinate system, sharing an identical spatial extent and pixel size. Together, they formed a multi-layer evidence-layer stack, which served as the primary data source for the grid-based sampling used to generate the final tabular feature dataset for the XGBoost model.

4.2. Results of RFECV

Among the 60 total features, which include geochemical, geological, and structural attributes, some are positively correlated with mineralization, some are negatively correlated, and others are essentially irrelevant. Excessive redundant information can impair the model’s ability to recognize patterns. Therefore, this study used Recursive Feature Elimination with Cross-Validation (RFECV) to pre-screen the features.

4.2.1. Skarn-Type Features

In Figure 8a, the x-axis represents the number of features selected by the model, and the y-axis shows the prediction accuracy under 5-fold cross-validation. The model’s performance improves as the number of features increases, reaching a peak accuracy of 0.875 when 12 features are selected, as marked by the red dashed line. Beyond 12 features, the accuracy shows no significant improvement and plateaus, indicating that this data-driven set of 12 features is the optimal combination for identifying Skarn-type deposits.
The matching bar chart (Figure 8b) further reveals the importance ranking of these 12 optimal features, which is based on the “Absolute Coefficient” assigned to each feature by the Lasso model. The results show that the CLR-transformed elements Mo, Sb, W, and Mn are the most important geochemical indicators for differentiating Skarn-type deposits. The geological feature Distance_to_Volcano was also retained in this optimal combination as an important geological factor.
Geochemically, CLR_Mo, CLR_W, and CLR_Sn represent typical high-temperature lithophile elements. Their inclusion as top-ranking features, combined with the geological feature Distance_to_Volcano, collectively points to a magmatic-hydrothermal source for the mineralization. Concurrently, CLR_Zn and CLR_Cd are direct mineralization indicators of the deposit, while the high importance (ranked 4th) of CLR_Mn is a direct response to the skarn alteration zone itself. Finally, the presence of medium-to-low temperature volatile elements such as CLR_Sb and CLR_Hg further confirms a complex, hydrothermally zoned metallogenic system. In summary, the model not only identified the mineralization itself but also successfully extracted, in a data-driven manner, the key geochemical fingerprints indicating the heat source, alteration, and hydrothermal zoning.

4.2.2. Hydrothermal Vein-Type Features

Figure 9a displays the Lasso–RFECV feature selection results for the Hydrothermal Vein-type deposits. The results clearly indicate that the model’s performance improves significantly as the number of features increases, reaching a peak accuracy of 0.946 when 15 features are selected, as marked by the red dashed line. When the feature count exceeds 15, the model’s accuracy shows no significant improvement and subsequently plateaus, demonstrating that this data-driven set of 15 features is the optimal combination for identifying Hydrothermal Vein-type deposits.
The matching bar chart (Figure 9b) further reveals the importance ranking of these 15 optimal features, based on the “Absolute Coefficient” assigned by the Lasso model. The results show that a feature set distinct from that of the Skarn-type was selected, wherein CLR_Y, CLR_Ni, CLR_Al2O3, CLR_MgO, CLR_Au, and CLR_Co are the most important geochemical indicators for differentiating Hydrothermal Vein-type deposits. Additionally, the model automatically retained the key geological features Distance_to_Fault and the stratigraphic unit J3mn within this optimal combination.
Interpreted from a geochemical perspective, this 15-feature optimal combination genetically points to a structurally controlled hydrothermal system. First, the selection of Distance_to_Fault as a high-importance feature provides data-driven confirmation that fault structures are a critical factor controlling the spatial localization of this deposit type, which is highly consistent with the genetic understanding of hydrothermal “vein-type” deposits. Second, the inclusion of CLR_Au, CLR_Zn, and CLR_Cu directly reflects a polymetallic mineralization system. Most importantly, the selection of CLR_Y and CLR_La, along with CLR_Ni and CLR_Co, collectively reveals a complex fluid source, possibly related to deep-seated magma (or mantle-derived materials). Concurrently, major element oxides like CLR_Al2O3 and CLR_MgO likely represent the geochemical response of specific alterations (e.g., silicification, chloritization) resulting from fluid-wallrock reactions. Therefore, the model has successfully identified a set of prospecting indicators for Hydrothermal Vein-type deposits defined by the combination of “Structure (Fault) + Host Rock (J3mn) + Specific Geochemical Fingerprint (CLR_Y-CLR_Ni-CLR_Au-CLR_Co)”.

4.3. Mineral Prospectivity Mapping (MPM) Results

Based on the recursive feature elimination, we selected only the most predictive features to conduct prospectivity mapping. We constructed two separate indicator systems, one for Skarn-type Pb-Zn deposits and one for Hydrothermal Vein-type Pb-Zn deposits.

4.3.1. Skarn-Type Pb-Zn Prospectivity Mapping

For the Skarn-type model, positive samples were defined by applying a buffer zone around known mineral occurrences. The buffer distance was determined based on the spatial extent of the deposit to better reflect the actual mineralized area. Negative samples were then randomly selected from regions outside the positive sample buffer (e.g., at least 5 km away). This random selection of negative samples ensures that they are sufficiently distant from the positive samples, which helps maintain the model’s performance and accuracy. This approach is consistent with methods used in previous studies, such as those by Zuo et al. [40] and Carranza et al. [11], where random negative sampling was also employed. The entire set of positive and negative samples was then randomly divided into a training set and a validation set using an 80:20 ratio. The model was trained using the following optimized hyperparameters: ‘colsample_bytree’: 0.7, ‘gamma’: 0.1, ‘learning_rate’: 0.05, ‘max_depth’: 4, ‘n_estimators’: 500, and ‘subsample’: 0.8.
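The negative sampling and the 80:20 split can be sketched as below. This is an illustration under assumptions: the candidate grid, the two positive locations, the sample counts, and the fixed random seed are invented, and the 5 km exclusion distance follows the example value quoted in the text.

```python
import math
import random

def sample_negatives(candidates, positives, min_dist_km, n, rng):
    """Randomly pick n candidate points lying at least min_dist_km from every positive."""
    eligible = [p for p in candidates
                if all(math.dist(p, q) >= min_dist_km for q in positives)]
    return rng.sample(eligible, n)

def train_val_split(items, train_fraction, rng):
    """Shuffle a copy of the items and split it into training and validation sets."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

rng = random.Random(42)
# Hypothetical 30 km x 30 km area on a 1 km grid, with two known occurrences.
candidates = [(float(x), float(y)) for x in range(30) for y in range(30)]
positives = [(10.0, 10.0), (20.0, 15.0)]
negatives = sample_negatives(candidates, positives, min_dist_km=5.0, n=18, rng=rng)

labelled = [(p, 1) for p in positives] + [(p, 0) for p in negatives]
train_set, val_set = train_val_split(labelled, train_fraction=0.8, rng=rng)
```

With so few positives, a stratified split that keeps the class ratio equal in both sets would usually be preferable; the plain random split above simply mirrors the description in the text.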
The model’s training process was effectively monitored via the accuracy and loss curves (Figure 10). The curves show that the model converged rapidly within the first 50 rounds and stabilized after approximately 100 rounds, indicating an efficient training process without significant overfitting. The final trained model achieved an AUC (Area Under the Curve) of 0.98 on the validation set, demonstrating excellent predictive performance. Based on this optimized model, we calculated the mineralization potential for the entire study area, generating the final prospectivity map (Figure 11). The high-probability zones in this map show a strong spatial correlation with known mineral occurrences (represented as points) and exhibit a predominant NE-trending distribution.
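For reference, the AUC reported above equals the probability that a randomly chosen positive sample receives a higher predicted score than a randomly chosen negative one. A minimal pairwise sketch follows; the labels and scores are invented for illustration and are not the study's validation data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as one half)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation labels and predicted probabilities.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]
auc = roc_auc(labels, scores)
```

In this toy case eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9.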
The spatial distribution of Skarn-type Pb-Zn mineralization probability shown in Figure 11 is highly consistent with the conclusions drawn from the RFECV analysis (Section 4.2) and the SHAP analysis (Section 4.4). As the SHAP analysis reveals, Distance_to_Volcano is a dominant feature guiding the model’s predictions, with proximity (low feature value) acting as a strong positive driver for mineralization. Therefore, the primary “hotspot” areas (deep red, high probability) depicted in Figure 11 represent the optimal exploration targets identified by the data-driven model. These areas are not only closest to volcanic craters but also enriched in key high-temperature geochemical indicators, such as CLR_Mo, CLR_W, and CLR_Mn.
It is noteworthy, however, that these independent high-probability “hotspots” also exhibit a significant NE (northeast) trend at a macro-scale. This orientation is highly consistent with the strike of the regional deep-seated fault (the Derbugan Fault Belt). This strongly suggests that this NE-trending deep fault acted as the primary conduit for regional magmatism, controlling the emplacement and distribution of the mineralization-related granite bodies and volcanic craters. Therefore, although the data-driven model “learned” to use Distance_to_Volcano as its strongest predictor, the spatial pattern of the final probability map indirectly reflects the regional tectonic framework controlled by this deep fault.

4.3.2. Hydrothermal Vein-Type Pb-Zn Prospectivity Mapping

For the Hydrothermal Vein-type Pb-Zn deposits, positive and negative samples were defined using the same sampling strategy. The optimal hyperparameters were set to: ‘colsample_bytree’: 0.7, ‘gamma’: 0.1, ‘learning_rate’: 0.05, ‘max_depth’: 3, ‘n_estimators’: 500, and ‘subsample’: 0.7. Ultimately, the trained model achieved an AUC (Area Under the Curve) of 0.92 on the validation set, demonstrating excellent predictive performance (Figure 12). Similarly, the prediction results for the Hydrothermal Vein-type Pb-Zn deposits were used to generate a final mineral prospectivity map (Figure 13). The map shows that the high-probability zones correlate well with the locations of known mineral occurrences (points), with the majority appearing in the southeastern part of the study area.
The high-potential prediction zones (deep red areas) for the Hydrothermal Vein-type Pb-Zn deposits exhibit distinct spatial distribution characteristics. As shown in Figure 13, these high-probability zones are spatially concentrated, mainly distributed in the southeastern part of the study area, and show a significant NE (northeast) trend. This distribution trend is highly consistent with the strike of the NE-trending Derbugan Fault Belt, indicating the macro-scale control of this primary fault system as a regional ore-fluid conduit; however, the final mineralization may be more closely related to the distribution of secondary faults. Furthermore, a particularly critical distinction is that the high-probability zones for the Hydrothermal Vein-type do not show a strong spatial coupling with volcanic craters. This stands in sharp contrast to the Skarn-type prediction model—which identified Distance_to_Volcano as its most important ore-controlling feature—and fundamentally differentiates the spatial localization patterns of the two deposit types from a data-driven perspective.

4.4. SHAP-Based Feature Importance Analysis

To deconstruct the internal decision-making logic of the XGBoost model and identify key data-driven prospecting indicators, this study introduced the SHAP (SHapley Additive exPlanations) framework. SHAP analysis not only quantifies the global importance of each feature but also reveals the local contribution of each feature to individual predictions, thereby “opening” the machine learning black box.
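The distinction between local and global importance can be illustrated with a brute-force Shapley computation on a toy scoring function. This is only a sketch: the function `toy_model`, its coefficients, the feature names, and the zero baseline are hypothetical, and absent features are simply replaced by baseline values, which is a simplification of the tree-specific algorithm that SHAP libraries typically apply to gradient-boosted models.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features not yet added to the coalition keep their baseline value.
    All feature orderings are enumerated, so this only scales to a few features.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous  # marginal contribution
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(f):
    # Hypothetical score: W raises it, distance lowers it, W and Mn interact.
    return 0.6 * f["clr_w"] - 0.4 * f["dist_volcano"] + 0.2 * f["clr_w"] * f["clr_mn"]


baseline = {"clr_w": 0.0, "dist_volcano": 0.0, "clr_mn": 0.0}
samples = [
    {"clr_w": 1.0, "dist_volcano": 0.5, "clr_mn": 1.0},
    {"clr_w": -1.0, "dist_volcano": 2.0, "clr_mn": 0.5},
]

# Local explanation: one set of Shapley values per sample.
local = [shapley_values(toy_model, s, baseline) for s in samples]

# Global importance: mean absolute Shapley value per feature.
global_importance = {
    name: sum(abs(phi[name]) for phi in local) / len(local) for name in baseline
}
```

For each sample the local values sum to the difference between the model output and the baseline output, which is the additivity property that makes per-sample contributions comparable across features.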

4.4.1. Skarn-Type

The SHAP feature importance bar plot (Figure 14b) reveals the global feature rankings utilized by the model to identify Skarn-type deposits. In sharp contrast to the Hydrothermal Vein-type model, this model demonstrates the balanced importance of both geochemical indicators and spatial geological features. CLR_W and Distance_to_Volcano are the two most important predictors, with their mean (|SHAP value|) far exceeding all other features. These are followed by another set of high-temperature magmatic-hydrothermal indicators: CLR_Sn, CLR_Sr, and the primary mineralization element CLR_Zn. This ‘high-temperature element + volcanic structure + mineralization’ assemblage, ranking within the top 5, clearly points the model toward a magmatic-hydrothermal origin for the Skarn-type deposit. The high importance of Distance_to_Volcano is well-supported by the geological context, where Skarn-type Pb-Zn deposits are closely associated with volcanic activity. The proximity to volcanic centers indicates areas that have experienced intense magmatic-hydrothermal alteration, which is a key factor in the formation of Skarn-type deposits. As shown in the geological map, the locations of these deposits (e.g., Xiahulin, Zixingtun, and Shanghulin North) are closely related to volcanic centers, with mineralization occurring along the contact zones between volcanic rocks and carbonate rocks.
The SHAP beeswarm plot (Figure 14a) further reveals how these features drive the model’s decisions. On this plot, the X-axis represents the SHAP value (positive values > 0 push the prediction toward ‘mineralization,’ while negative values < 0 push toward ‘non-mineralization’), and the color represents the feature’s original value (red = high, blue = low). The plot displays a highly complex geochemical pattern of coexisting ‘enrichment’ and ‘leaching,’ where the model has learned to identify not only high anomalies but also specific low backgrounds.
From a geochemical perspective (Figure 14a), a coherent metallogenic process emerges. First, the model identifies the magmatic heat source: Distance_to_Volcano exhibits a strong negative relationship with the SHAP value (low distance values [blue dots] correspond to high positive SHAP values), indicating that spatial “proximity to the volcano” is a prerequisite for mineralization. Concurrently, high values (red dots) of the two typical high-temperature elements, CLR_W and CLR_Sn, also show a strong positive contribution (SHAP > 0), forming the magmatic-hydrothermal “fingerprint” itself. Second, the model captures complex signals of wall-rock alteration. On the one hand, the enrichment (red dots) of CLR_Mn correlates positively with mineralization (SHAP > 0), consistent with the formation of manganese-bearing garnets or pyroxenes during skarnification. On the other hand, high values (red dots) of CLR_Sr and CLR_K2O show a strong negative contribution (SHAP < 0). This is a critical “leaching” signal, implying that as the metallogenic fluids metasomatized the wall rocks (such as Sr-rich carbonates or K-rich feldspars), they leached Sr and K while precipitating and enriching Zn, W, and Mn.
A particularly insightful observation is the complex behavior of CLR_Zn in the beeswarm plot (Figure 14a), where high values (red dots) appear at both the positive and negative ends of the SHAP axis. This is not a contradiction; rather, it supports our “type-specific modeling” approach. It is reasonable to infer that the high-Zn anomalies with negative SHAP values geochemically represent the coexisting Hydrothermal Vein-type mineralization in the study area. By learning other key “contextual” features (e.g., high CLR_W, high CLR_Sn, low CLR_Sr, low CLR_K2O), the Skarn-type model has learned to discriminate Zn anomalies of different origins: only when high Zn co-occurs with the high-temperature magmatic-hydrothermal fingerprint (high W, high Sn) and the specific alteration signature (low Sr, low K2O) does it push the prediction toward “Skarn-type” (SHAP > 0). Conversely, when high Zn lacks these associated features (e.g., if it is instead accompanied by the low CLR_Al2O3 or low CLR_Y signals that the Vein-type model associates with mineralization), the model “rejects” the anomaly and pushes the prediction toward “non-Skarn” (SHAP < 0).
In summary, the model successfully identified, in a data-driven manner, that the optimal indicator system for Skarn-type deposits is a complex, highly discriminative geochemical pattern. This pattern first requires spatial proximity to a volcanic structure, accompanied by a specific “fingerprint” of wall-rock leaching of Sr and K2O (high values corresponding to negative SHAP values), and simultaneous strong enrichment of high-temperature indicators W, Sn, and Mn (high values corresponding to positive SHAP values). Critically, the model also learned how to “contextually discriminate” the CLR_Zn signal: only when Zn enrichment coexists with the aforementioned Skarn fingerprint (high W, high Sn, low Sr) is it treated as a strong positive indicator (SHAP > 0). When Zn enrichment is detached from this specific combination, it is identified by the model as a negative indicator (SHAP < 0).
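The context dependence of the CLR_Zn contribution described above can be reproduced with a two-feature toy example. In this sketch the score `toy_skarn_score` is a hypothetical interaction term (Zn counts toward the Skarn class only when W is also elevated), the input values are invented, and the zero baseline is an assumption; none of these come from the trained model.

```python
def shapley_two_features(f, x, base):
    """Exact Shapley values for a two-feature function f(a, b).

    Each feature's marginal contribution is averaged over both orderings,
    with the absent feature held at its baseline value.
    """
    a, b = x
    a0, b0 = base
    phi_a = 0.5 * (f(a, b0) - f(a0, b0)) + 0.5 * (f(a, b) - f(a0, b))
    phi_b = 0.5 * (f(a0, b) - f(a0, b0)) + 0.5 * (f(a, b) - f(a, b0))
    return phi_a, phi_b


def toy_skarn_score(zn, w):
    # Hypothetical interaction: high Zn supports "Skarn" only alongside high W.
    return zn * w


base = (0.0, 0.0)

# Same high Zn value, evaluated in two different geochemical contexts.
zn_with_high_w, _ = shapley_two_features(toy_skarn_score, (2.0, 1.5), base)
zn_with_low_w, _ = shapley_two_features(toy_skarn_score, (2.0, -1.0), base)
```

The identical Zn value receives a contribution of 1.5 when W is high and −1.0 when W is low, which mirrors the red dots appearing on both sides of the SHAP axis.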

4.4.2. Hydrothermal Vein-Type

The SHAP feature importance bar plot (Figure 15b) reveals the decision-making logic of the Hydrothermal Vein-type model. In sharp contrast to the Skarn-type model, this model’s predictions are dominated almost entirely by geochemical features. The top seven features in global importance (mean (|SHAP value|)) are all geochemical variables: CLR_Y, CLR_B, CLR_Al2O3, CLR_Co, CLR_Au, CLR_Ni, and CLR_Zr. Conversely, the geological/structural features traditionally considered critical for vein-type deposits, Distance_to_Fault and the J3mn stratum, rank very low in importance in this model. This indicates that for the Hydrothermal Vein-type deposits in this region, a specific geochemical “fingerprint” is a more powerful and discriminative predictor than the macro-scale geological setting.
The SHAP beeswarm plot (Figure 15a) further elucidates how these features drive the model’s decisions, revealing a complex pattern highly consistent with hydrothermal alteration and mineralization. First, the model identifies a strong “enrichment” signal (i.e., high values [red dots] corresponding to positive SHAP values): this includes the mineralization element CLR_Zn, as well as CLR_Co, CLR_Zr, CLR_Sr, CLR_F, CLR_La, and CLR_MgO. The enrichment of CLR_F is a typical indicator of hydrothermal activity (especially magmatism-related). The enrichment of CLR_Zr, CLR_La, and CLR_Sr may indicate an affinity of the ore-forming fluids with felsic or alkaline magma evolution, while the enrichment of CLR_Co is a key associated metal indicator. In sharp contrast to this “enrichment” group, the model identified a set of “leaching” signals that are equally important but contribute in the opposite direction: high values (red dots) of CLR_Y, CLR_B, CLR_Al2O3, CLR_Au, and CLR_Ni all correspond to strong negative SHAP values (<0). This is geologically significant: the strong leaching of CLR_Al2O3 (i.e., low CLR_Al2O3 values [blue dots] corresponding to high positive SHAP values) is direct evidence of the destruction of aluminosilicate minerals such as feldspar during hydrothermal alteration (e.g., intense silicification, sericitization). The leaching of CLR_Y and CLR_Ni is also often associated with this type of acidic hydrothermal alteration. The “negative correlation” pattern for CLR_Au is more unusual, possibly indicating that the favorable host rock (e.g., J3mn) is itself Au-poor, and that the model has “learned” that mineralization is more likely when this background value is low.
Therefore, the model has discovered, through data-driven analysis, that the optimal indicator system for Hydrothermal Vein-type deposits is a complex geochemical fingerprint: the simultaneous occurrence of wall-rock alteration leaching (low CLR_Al2O3, low CLR_Y, low CLR_Au) and specific element enrichment (high CLR_Zn, high CLR_F, high CLR_Co) in structurally favorable (Distance_to_Fault) areas.

4.5. Delineation of Prospectivity Target Areas

To facilitate the classification of the target areas, we employed the Concentration-Area (C-A) fractal technique to threshold the probability map into inner (high anomaly), middle (intermediate anomaly), and outer (background) zones. Breakpoints were calculated from the log-log plot, ensuring that each fitted line segment had R² > 0.8 for a robust fit.
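The C-A procedure rests on the power-law relation between a threshold s and the area A(≥ s) whose value meets or exceeds it, so that populations appear as straight segments on a log-log plot. A minimal sketch of the breakpoint search is given below, assuming base-10 logarithms, ordinary least squares per segment, and a brute-force search over split positions. The function names, the `min_pts` setting, and the synthetic curve are illustrative assumptions and do not reproduce the study's data or fitting software.

```python
import math
from itertools import combinations


def concentration_area(values, thresholds, cell_area=1.0):
    """Return (log10 threshold, log10 area with value >= threshold) pairs."""
    points = []
    for s in thresholds:
        area = cell_area * sum(1 for v in values if v >= s)
        if s > 0 and area > 0:
            points.append((math.log10(s), math.log10(area)))
    return points


def line_fit(points):
    """Ordinary least squares line; returns slope, intercept, SSE and R^2."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    sst = sum((y - my) ** 2 for _, y in points)
    r2 = 1.0 - sse / sst if sst > 0 else 1.0
    return slope, intercept, sse, r2


def three_segment_fit(points, min_pts=3):
    """Try every pair of split positions and keep the lowest total SSE.

    Breakpoints are reported as the x-value where each new segment starts.
    """
    points = sorted(points)
    best = None
    for i, j in combinations(range(min_pts, len(points) - min_pts + 1), 2):
        if j - i < min_pts:
            continue
        fits = [line_fit(seg) for seg in (points[:i], points[i:j], points[j:])]
        total_sse = sum(fit[2] for fit in fits)
        if best is None or total_sse < best[0]:
            best = (total_sse, (points[i][0], points[j][0]), fits)
    return best[1], [fit[3] for fit in best[2]]


def demo_curve(x):
    # Synthetic log-log curve with slope changes at x = -1.9 and x = -0.6.
    if x < -1.9:
        return 3.0 - 0.2 * (x + 1.9)
    if x < -0.6:
        return 3.0 - 1.0 * (x + 1.9)
    return 1.7 - 4.0 * (x + 0.6)


demo_points = [(-3.0 + 0.25 * k, demo_curve(-3.0 + 0.25 * k)) for k in range(13)]
breaks, r2_values = three_segment_fit(demo_points)
```

On real data, `concentration_area` would supply the points from the probability raster, and the per-segment R² values returned by `three_segment_fit` could be checked against the 0.8 criterion stated above.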

4.5.1. Delineation of Skarn-Type Pb-Zn Prospectivity Areas

The C-A fractal analysis for the Skarn-type probability map is presented in Figure 16. The log-log plot clearly reveals three distinct linear populations, which were fitted with high R² values (0.80, 1.00, and 0.97). Based on these results, we identified two key breakpoints (thresholds) at S1 = −2.00 and S2 = −0.40, expressed as Log (Probability). We defined the population corresponding to the innermost, high-anomaly zone as the “high-potential” zone for targeting. After applying this C-A fractal threshold (Log (Probability) > −0.40), the resulting high-probability (deep red) zones became more geographically concentrated, providing a clear focus for subsequent exploration. Based on the spatial distribution of these high-potential zones and their correlation with known mineral occurrences, we delineated three primary prospectivity areas (Prospects) within the study region:
Prospect Area I is located in the southwestern part of the study area. On the Skarn-type mineralization probability map (Figure 17), it manifests as an extensive, deep-red, high-value anomaly zone trending northeast (NE). This high-probability zone not only correlates closely with the known Xiahulin deposit and two other Skarn-type Pb-Zn occurrences, but its distribution pattern is also highly consistent with key ore-controlling geological bodies. The high-value zone is situated almost entirely upon granitic intrusions and is in close spatial association with the numerous volcanic structures (volcanic craters) shown in the figure. This spatial coupling is not a coincidence but follows directly from the data-driven model: as revealed by our RFECV and SHAP analyses, Distance_to_Volcano is the most important predictive feature in the Skarn-type model, and proximity (a low distance value) is a strong positive driver for mineralization. Therefore, the dual favorable conditions exhibited by Prospect I (“granitic basement + proximity to volcanic structures”) make it the most promising Skarn-type Pb-Zn exploration target in this area.
Prospect Area II is located in the northeastern (NE) part of the study area. At a macro-scale, its probability distribution also presents as a significant NE-trending high-value anomaly belt. This trend, echoing that of Prospect I, once again confirms the control of the regional tectonic framework: the NE-trending deep fault belt (the Derbugan Fault Belt) dominates the emplacement direction of the granite bodies and volcanic structures in this area. The model’s data-driven result—Distance_to_Volcano as the most important predictive feature—is validated again here, as the high-probability zones remain in close spatial relationship with the volcanic structures. It is noteworthy that this high-probability anomaly belt does not terminate at the northeastern corner of the study area but rather shows a trend of continuing beyond the boundary. This strongly implies that the favorable metallogenic bodies (such as intrusions or volcanic structures) and the hydrothermal system controlling the Skarn-type deposits likely extend into the adjacent region, indicating that the area peripheral to the northeast of the study region is also a high-potential exploration target warranting investigation.
Prospect Area III is located on the easternmost side of the study area. This high-probability zone is likewise driven by the model’s most important feature, Distance_to_Volcano, as known volcanic structures (craters) are indeed present in this area. However, our comprehensive geological interpretation reveals critical differences between this area’s conditions and the typical Skarn-type model: first, regional geological maps show no granite outcrops, meaning it lacks the key magmatic heat source and metasomatic parent rock; second, the known mineral occurrences near this target are of the Hydrothermal Vein-type, not the Skarn-type. Therefore, we infer that this high-probability anomaly may represent a false positive or an overestimation by the model based on the Distance_to_Volcano feature, or perhaps a response to another (non-Skarn) type of hydrothermal activity. Given the absence of key geological conditions (such as granite), its mineralization potential is considered insufficient, and it is thus classified as a lower-priority target.
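The zoning rule applied to the Skarn-type map can be sketched as follows, using the two breakpoints reported above. The snippet assumes that the breakpoints are base-10 logarithms of the predicted probability, which the text does not state explicitly; `classify_zone` and the sample probabilities are illustrative.

```python
import math

# Breakpoints reported for the Skarn-type C-A plot, on the log scale.
S1_LOG, S2_LOG = -2.00, -0.40


def classify_zone(probability, s1_log=S1_LOG, s2_log=S2_LOG):
    """Assign a cell to the outer, middle, or inner zone.

    Assumption: breakpoints are base-10 logarithms of the probability.
    """
    if probability <= 0:
        return "outer"  # log undefined; treat as background
    log_p = math.log10(probability)
    if log_p > s2_log:
        return "inner"   # high anomaly, used for targeting
    if log_p > s1_log:
        return "middle"  # intermediate anomaly
    return "outer"       # background


# Equivalent probability cut-offs under the base-10 assumption.
p1, p2 = 10 ** S1_LOG, 10 ** S2_LOG  # 0.01 and roughly 0.398

zones = [classify_zone(p) for p in (0.005, 0.05, 0.39, 0.45, 0.95)]
```

Under this assumption the high-potential zone corresponds to predicted probabilities above roughly 0.40, so a cell at 0.39 falls in the middle zone while one at 0.45 falls in the inner zone.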

4.5.2. Delineation of Hydrothermal Vein-Type Pb-Zn Prospectivity Areas

The C-A fractal analysis was likewise applied to the prediction results for the Hydrothermal Vein-type Pb-Zn deposits, as shown in Figure 18. Using the S2 breakpoint (Log (Value) = −0.29) as the threshold, we delineated two primary prospectivity areas.
Prospect Area I is located in the southeastern part of the study area and is the largest and most concentrated high-probability anomaly on the Hydrothermal Vein-type prospectivity map (Figure 19). On a macro-scale, its spatial distribution exhibits a clear NE (northeast) linear trend, which is visually consistent with the strike of the regional Derbugan Fault Belt. This spatial correspondence corroborates the macro-scale background control of this primary fault system as a regional magmatic-hydrothermal conduit. However, as our SHAP analysis (Figure 15) has already revealed, Distance_to_Fault (the primary fault distance) was not, by itself, a key driver for the model’s predictions. Instead, the model delineated this target area by identifying its unique geochemical fingerprint—specifically, the strong leaching of CLR_Al2O3 and CLR_Y co-occurring with the enrichment of CLR_Co, CLR_Zn, and others. Prospect Area I thus represents a prime favorable zone where the macro-scale tectonic setting (NE-trending fault) and the micro-scale geochemical anomaly (the model’s recognized fingerprint) are successfully superimposed, with mineralization likely controlled by unmapped, secondary fracture networks associated with the primary fault.
Prospect Area II is located in the central part of the study area (Figure 19). Unlike Prospect I, this high-probability anomaly is elongated in a NW (northwest) direction. This trend shows good consistency with the strike of a group of NW-trending secondary faults in the area, which further corroborates the implicit control of tectonics, particularly secondary structures, on spatial localization within the Hydrothermal Vein-type model. However, this anomaly is considerably smaller in area than that of Prospect I.

5. Discussion

Skarn-type and Hydrothermal Vein-type Pb-Zn deposits are both important mineral resources, but they differ significantly in their formation processes, geological settings, and geochemical characteristics. Skarn-type deposits are typically formed through metasomatic replacement, where ore-bearing magmatic fluids interact with carbonate rocks at their contact zones. These deposits are spatially controlled by the intrusion-wallrock interface, and their geochemical signature is characterized by high-temperature anomalies, with key elements such as CLR_Sn, CLR_W, CLR_Mo, and CLR_Zn. In contrast, Hydrothermal Vein-type deposits form when metallogenic fluids migrate along fractures and precipitate as veins, with mineralization primarily controlled by faults and shear zones. These deposits are associated with medium-to-low temperature elements, such as Ag, As, and Sb, and their geochemical signature reflects this difference in temperature regime.

5.1. Interpretation of the Skarn-Type Model

For Skarn-type deposits, the model emphasizes Distance_to_Volcano as the most significant predictor. Proximity to volcanic centers indicates areas that have undergone magmatic-hydrothermal alteration, a key process in the formation of Skarn-type deposits. The SHAP analysis further supports this, revealing that CLR_Zn, along with high-temperature geochemical indicators such as CLR_W and CLR_Sn, plays a critical role in predicting Skarn-type mineralization. However, CLR_Zn shows both positive and negative contributions to the model’s predictions, which may appear contradictory. This behavior reflects the coexistence of two mineralization styles in the region: high Zn values contribute positively only when they occur together with high values of W and Sn, whereas high Zn values on their own do not directly indicate Skarn-type mineralization and contribute negatively. This highlights the importance of considering Zn’s interaction with other geochemical indicators to accurately define Skarn-type deposits.

5.2. Interpretation of the Hydrothermal Vein-Type Model

Hydrothermal Vein-type mineralization, in contrast, is conventionally regarded as controlled by structural features such as faults and shear zones. Interestingly, Distance_to_Fault, which is traditionally a key feature for fault-controlled mineralization, ranked unexpectedly low in the SHAP analysis. This apparent discrepancy can be attributed to a scale mismatch between the input geological data and the actual ore-hosting structures. The geological map used in this study, at a scale of 1:200,000, depicts regional faults, whereas the true ore-hosting fractures are typically smaller-scale secondary or tertiary fractures. As a result, the regional-scale Distance_to_Fault could not effectively predict mineralization in the model, leading to its low importance.
Instead, the model relied more heavily on geochemical indicators. Crucially, the dominant role of Rare Earth Elements (REEs) and High Field Strength Elements (HFSEs) in the model—specifically the leaching of Y and the enrichment of La and Zr—provides critical insights into the nature of the ore-forming fluids and wall-rock alteration mechanisms. First, the model identifies low CLR_Y values as a strong positive predictor for mineralization. Geologically, Yttrium (Y) is typically hosted in rock-forming minerals such as plagioclase. In hydrothermal vein systems, the interaction between acidic fluids and the wall rock triggers intense alteration (e.g., sericitization), decomposing primary minerals and causing the leaching of Y. Therefore, the “low Y” anomaly serves as a robust proxy for intense fluid-rock interaction. Second, the specific enrichment of CLR_La and CLR_Zr (along with CLR_F) points to the magmatic heritage of the fluids. The mobilization of these typically immobile elements suggests the involvement of F-rich magmatic fluids capable of transporting HFSEs. In summary, the model successfully captured a “Source-Process” fingerprint: the input of magmatic fluids coupled with the destructive leaching of wall rocks.

5.3. Data-Driven Validation of Classical Metallogenic Models

Although this study employed machine learning to strictly classify Skarn-type and Hydrothermal Vein-type deposits as distinct exploration targets, classical metallogenic theories emphasize the genetic continuity between these systems. As noted by Dill [41], skarn and hydrothermal vein deposits often constitute different spatial and temporal components of a single, evolving magmatic-hydrothermal system. The Skarn-type deposits typically represent the proximal, high-temperature interaction phase at the intrusion contact, while the Hydrothermal Vein-type deposits represent the distal, lower-temperature phase controlled by cooling fluids migrating along structures.
Our data-driven results provide a robust, quantitative validation of these classical geological models. The RFECV and SHAP analyses independently identified Distance_to_Volcano and high-temperature elements (W, Sn) as the core predictors for the Skarn model (proximal), while selecting geochemical alteration fingerprints (leaching of Y and Al2O3, enrichment of Co and Zn) for the Vein model (distal). This consistency demonstrates that the “black-box” machine learning model has successfully “learned” the intrinsic zonation laws of the metallogenic system described in classical literature. Thus, the proposed method is not a replacement for, but a powerful complement to, routine geological analysis, offering a means to quantify and visualize these classical geological patterns in complex exploration areas.

5.4. Limitations and Future Work

Finally, while this study successfully established high-precision probability maps for deposit targeting, future research could further benefit from uncertainty quantification techniques, such as Bootstrapping or MC-Dropout, to generate spatial uncertainty maps. This would provide additional layers of risk assessment for decision-making in exploration drilling.

6. Conclusions

This study successfully constructed and validated a “type-specific modeling” machine learning paradigm to address the geological challenge of coexisting and difficult-to-distinguish Skarn-type and Hydrothermal Vein-type Pb-Zn deposits in the Shanghulin ore concentration area of Inner Mongolia. Through the integrated workflow of “Lasso–RFECV screening → XGBoost prediction → SHAP explanation → C-A Fractal delineation”, we reached the following conclusions:
1. Skarn-type: This is a “spatial + heat source” controlled model. Its predictions are governed by proximity to magmatic centers (Distance_to_Volcano), accompanied by a specific fingerprint of high-temperature hydrothermal alteration elements (CLR_W, CLR_Sn, CLR_Mn). Hydrothermal Vein-type: This is a “chemical fingerprint” driven model. Its predictions rely on identifying intense wall-rock alteration, characterized by the strong leaching of CLR_Al2O3 and CLR_Y, and the relative enrichment of magmatic-affinity elements (CLR_La, CLR_Zr) and metals (CLR_Co, CLR_Zn). Additionally, the study revealed that regional-scale Distance_to_Fault is a weak predictor for this model due to scale mismatch.
2. Fractal Characteristics of Prospectivity: The Concentration-Area (C-A) fractal analysis revealed that the predicted probabilities for both Skarn-type and Hydrothermal Vein-type deposits exhibit distinct multifractal power-law distributions. By identifying the inflection points (thresholds) on the log-log plots, we objectively separated high-potential anomalies from the background. This fractal-based delineation successfully defined three Skarn-type and two Hydrothermal Vein-type prospectivity areas, which show strong spatial consistency with known mineral occurrences and structural frameworks.
3. Methodological Implication: The proposed “Type-Specific” workflow successfully decoupled the two genetic metallogenic models from complex, mixed data. By combining machine learning with fractal analysis, this study provides a targeted and scientific basis for precise exploration deployment in the Derbugan Metallogenic Belt.

Author Contributions

Methodology, L.F. and R.T.; software, L.F.; writing—original draft preparation, L.F.; writing—review and editing, L.F., G.C. and R.T.; visualization, Q.S., T.X., X.L., H.Y. and Y.S.; supervision, K.X.; funding acquisition, L.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the project of the China Geological Survey “Gold Resource Investigation and Evaluation in the Derbur-Moerdaoga Area, Inner Mongolia” (Project No. DD20242939) and Liaoning Provincial Natural Science Foundation of China (2024-MSLH-484).

Data Availability Statement

All data and materials are available on request from the corresponding author. The data are not publicly available due to ongoing research using part of the data.

Acknowledgments

The authors thank the anonymous reviewers and the editors for their hard work on this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Han, R.; Qin, K.; Xu, F.; Lyu, J.; Yang, X.; Zhang, J.; Wang, Y.; Hui, K. The Evolution of Ore-Forming Fluids of the Halasheng Ag-Pb-Zn Deposit, Inner Mongolia: Evidence from Fluid Inclusions and Mineral Constitute. Minerals 2024, 14, 1278.
  2. Huang, T.; Chen, C.; Lv, X.; Wang, S.; Liu, H. Evolution and Origin of the Bairendaba Ag-Pb-Zn Deposit in Inner Mongolia, China: Constraints from Infrared Micro-Thermometry, Mineral Composition, Thermodynamic Calculations, and in Situ Pb Isotope. Ore Geol. Rev. 2023, 154, 105316.
  3. Zhou, Z.; Yang, Z.; Li, X.; Xu, Q. Pre-Metallogenic Wall-Rock Alterations and Element Migration Features of Bairendaba Ag-Pb-Zn Deposit, Inner Mongolia. Miner. Depos. 2024, 43, 548–564.
  4. Song, T.; Wang, C.; Liang, X.; Liang, X. Metallogenic Age and Geological Setting of the Dongjun Ag-Pb-Zn Deposit, Inner Mongolia: Constraints from Geochemistry, Zircon U-Pb and Sphalerite Rb-Sr Chronology of the Alkali-Rich Granite Porphyry. Geotecton. Metallog. 2024, 48, 1040–1059.
  5. Cao, Y.; Liu, Y. Zircon U-Pb Age, Geochemical Characteristics and Metallogenic Significance of Ore-Bearing Porphyry of the Jiawula Ag-Pb-Zn Deposit in Inner Mongolia. Geol. Bull. China 2020, 39, 353–364.
  6. Cheng, L.; Li, H.; Yin, L.; Qin, W.; Tian, H. Research on Geological Characteristics and Prospecting Direction of Erdaohezi Ag-Pb-Zn Deposit in Inner Mongolia. Gold Sci. Technol. 2016, 24, 58–63.
  7. Nie, F.; Sun, Z.; Liu, Y.; Lv, K.; Zhao, Y.; Cao, Y. Mesozoic Multiple Magmatic Activities and Molybdenum Mineralization in the Chalukou Ore District, Da Hinggan Mountains. Geol. China 2013, 40, 273–286.
  8. Pei, S.; Yuan, J.; Huang, M. Soil Geochemical Anomly Characteristics of Xinbaerhuzuoqi, Inner Mongolia and the Ore Prospecting Direction. Contrib. Geol. Mineral Resour. Res. 2018, 33, 449–457.
  9. Yang, Y. Rb-Sr Dating of Sphalerites from Dongjun Pb-Zn-Ag Deposit, Inner Mongolia and Its Geological Significance. Earth Sci. Front. 2015, 22, 348–356.
  10. Carranza, E.J.M.; Laborte, A.G. Data-Driven Predictive Mapping of Gold Prospectivity, Baguio District, Philippines: Application of Random Forests Algorithm. Ore Geol. Rev. 2015, 71, 777–787.
  11. Carranza, E.J.M.; Laborte, A.G. Random Forest Predictive Modeling of Mineral Prospectivity with Small Number of Prospects and Data with Missing Values in Abra (Philippines). Comput. Geosci. 2015, 74, 60–70.
  12. Rodriguez-Galiano, V.; Sanchez-Castillo, M.; Chica-Olmo, M.; Chica-Rivas, M. Machine Learning Predictive Models for Mineral Prospectivity: An Evaluation of Neural Networks, Random Forest, Regression Trees and Support Vector Machines. Ore Geol. Rev. 2015, 71, 804–818.
  13. Zheng, C.; Yuan, F.; Luo, X.; Li, X.; Liu, P.; Wen, M.; Chen, Z.; Albanese, S. Mineral Prospectivity Mapping Based on Support Vector Machine and Random Forest Algorithm—A Case Study from Ashele Copper-Zinc Deposit, Xinjiang, NW China. Ore Geol. Rev. 2023, 159, 105567.
  14. Bigdeli, A.; Maghsoudi, A.; Ghezelbash, R. A Comparative Study of the XGBoost Ensemble Learning and Multilayer Perceptron in Mineral Prospectivity Modeling: A Case Study of the Torud-Chahshirin Belt, NE Iran. Earth Sci. Inform. 2024, 17, 483–499.
  15. Zhang, Q.; Chen, J.; Xu, H.; Jia, Y.; Chen, X.; Jia, Z.; Liu, H. Three-Dimensional Mineral Prospectivity Mapping by XGBoost Modeling: A Case Study of the Lannigou Gold Deposit, China. Nat. Resour. Res. 2022, 31, 1135–1156.
  16. Parsa, M. A Data Augmentation Approach to XGboost-Based Mineral Potential Mapping: An Example of Carbonate-Hosted Zn Pb Mineral Systems of Western Iran. J. Geochem. Explor. 2021, 228, 106811.
  17. Xu, Y.; Zuo, R. Geochemical Survey Data Cube: A Useful Tool for Lithological Classification and Geochemical Anomaly Identification. Geochemistry 2024, 84, 125959.
  18. Liu, Y.; Jiang, S.-H.; Bagas, L.; Han, N.; Chen, C.-L.; Kang, H. Isotopic (C-O-S) Geochemistry and Re-Os Geochronology of the Haobugao Zn-Fe Deposit in Inner Mongolia, NE China. Ore Geol. Rev. 2017, 82, 130–147.
  19. Wu, C.; Wang, B.; Zhou, Z.; Wang, G.; Zuza, A.V.; Liu, C.; Jiang, T.; Liu, W.; Ma, S. The Relationship between Magma and Mineralization in Chaobuleng Iron Polymetallic Deposit, Inner Mongolia. Gondwana Res. 2017, 45, 228–253.
  20. Chen, Y.-J.; Zhang, C.; Wang, P.; Pirajno, F.; Li, N. The Mo Deposits of Northeast China: A Powerful Indicator of Tectonic Settings and Associated Evolutionary Trends. Ore Geol. Rev. 2017, 81, 602–640.
  21. Lu, S.; Deng, C.; Wang, K.; Feng, Y.; Li, C.; Chen, J.; Liu, Y. Crustal Contribution for the Formation of the Walali Au Deposit and Implications on the Early Cretaceous Au Mineralization in the Northern Great Xing’an Range. Ore Geol. Rev. 2022, 147, 105000.
  22. Jiao, T.; Li, J.; Guo, X.; She, H.; Ren, C.; Li, C. Discussion on the Ore-Forming Fluids, Materials Sources and Genesis of Erdaohe Pb-Zn-Ag Deposit, Inner Mongolia. Geol. China 2024, 51, 426–442.
  23. Cai, W.; Wang, K.; Li, J.; Fu, L.; Li, S.; Yang, H.; Konare, Y. Genesis of the Bagenheigeqier Pb-Zn Skarn Deposit in Inner Mongolia, NE China: Constraints from Fluid Inclusions, Isotope Systematics and Geochronology. Geol. Mag. 2021, 158, 271–294.
  24. Liu, Y. Element Geochemistry; Science Press: Beijing, China, 1984.
  25. Zhai, D.; Liu, J.; Zhang, H.; Tombros, S.; Zhang, A. A Magmatic-Hydrothermal Origin for Ag-Pb-Zn Vein Formation at the Bianjiadayuan Deposit, Inner Mongolia, NE China: Evidences from Fluid Inclusion, Stable (C-H-O) and Noble Gas Isotope Studies. Ore Geol. Rev. 2018, 101, 1–16.
  26. Li, S.; Wang, Y.; Gao, L.; Xia, F.; Chen, C.; Ruan, D. Magma-Related Origin for Pb-Zn-Ag Vein Formation at the Aerhada Deposit, Inner Mongolia, NE China: Constraints from Fluid Inclusion, C-H-O-S-Pb Isotopic Compositions, and Geochronological Studies. Ore Geol. Rev. 2023, 163, 105793.
  27. Xuejing, X.; Xuzhan, M.; Tianxiang, R. Geochemical Mapping in China. J. Geochem. Explor. 1997, 60, 99–113.
  28. Wang, X.; Zhang, Q.; Zhou, G. National-Scale Geochemical Mapping Projects in China. Geostand. Geoanal. Res. 2007, 31, 311–320.
  29. Xie, X.; Wang, X.; Zhang, Q.; Zhou, G.; Cheng, H.; Liu, D.; Cheng, Z.; Xu, S. Multi-Scale Geochemical Mapping in China. Geochem.-Explor. Environ. Anal. 2008, 8, 333–341.
  30. Aitchison, J. The Statistical Analysis of Compositional Data; Springer: Dordrecht, The Netherlands, 1986.
  31. Kim, C. Discrete Space Deep Reinforcement Learning Algorithm Based on Support Vector Machine Recursive Feature Elimination. Symmetry 2024, 16, 940.
  32. Barzani, A.R.; Pahlavani, P.; Ghorbanzadeh, O.; Gholamnia, K.; Ghamisi, P. Evaluating the Impact of Recursive Feature Elimination on Machine Learning Models for Predicting Forest Fire-Prone Zones. Fire 2024, 7, 440.
  33. Anozie, L.; Fink, B.; Friedrich, C.M.; Engels, C. Monitoring Flow-Forming Processes Using Design of Experiments and a Machine Learning Approach Based on Randomized-Supervised Time Series Forest and Recursive Feature Elimination. Sensors 2024, 24, 1527.
  34. Yu, Z.; Li, B.; Wang, X. Mineral Prospectivity Mapping Susceptibility Evaluation Based on Interpretable Ensemble Learning. Ore Geol. Rev. 2024, 173, 106248.
  35. Yan, W.; Shen, Y.; Chen, S.; Wang, Y. Viscosity and Melting Temperature Prediction of Mold Fluxes Based on Explainable Machine Learning and SHapley Additive exPlanations. J. Non-Cryst. Solids 2024, 636, 123037.
  36. Wang, Z.; Liu, H.; Amin, M.N.; Khan, K.; Qadir, M.T.; Khan, S.A. Optimizing Machine Learning Techniques and SHapley Additive exPlanations (SHAP) Analysis for the Compressive Property of Self-Compacting Concrete. Mater. Today Commun. 2024, 39, 108804.
  37. Song, Z.; Cao, S.; Yang, H. An Interpretable Framework for Modeling Global Solar Radiation Using Tree-Based Ensemble Machine Learning and Shapley Additive Explanations Methods. Appl. Energy 2024, 364, 123238. [Google Scholar] [CrossRef]
  38. Feretzakis, G.; Sakagianni, A.; Anastasiou, A.; Kapogianni, I.; Bazakidou, E.; Koufopoulos, P.; Koumpouros, Y.; Koufopoulou, C.; Kaldis, V.; Verykios, V.S. Integrating Shapley Values into Machine Learning Techniques for Enhanced Predictions of Hospital Admissions. Appl. Sci. 2024, 14, 5925. [Google Scholar] [CrossRef]
  39. Ben Seghier, M.E.A.; Mohamed, O.A.; Ouaer, H. Machine Learning-Based Shapley Additive Explanations Approach for Corroded Pipeline Failure Mode Identification. Structures 2024, 65, 106653. [Google Scholar] [CrossRef]
  40. Zuo, R.; Xiong, Y. Big Data Analytics of Identifying Geochemical Anomalies Supported by Machine Learning Methods. Nat. Resour. Res. 2018, 27, 5–13. [Google Scholar] [CrossRef]
  41. Dill, H.G. The “Chessboard” Classification Scheme of Mineral Deposits: Mineralogy and Geology from Aluminum to Zirconium. Earth-Sci. Rev. 2010, 100, 1–420. [Google Scholar] [CrossRef]
Figure 1. Tectonic location maps: (A) geotectonic location map of the study region; (B) tectonic sketch map of Northeast China and the eastern segment of the Central Asian Orogenic Belt; (C) regional geological map of the Erguna Block and surrounding areas (based on references [20,21]).
Figure 2. Simplified geological map of the study area (data from the 1:200,000 Digital Geological Map Spatial Database of the People’s Republic of China).
Figure 3. Geochemical sample location map.
Figure 4. Geochemical maps of Zn.
Figure 5. Geochemical maps of Pb.
Figure 6. Buffer map of volcanic craters.
Figure 7. Buffer map of faults.
Figure 8. Data-driven feature selection results for the skarn-type model using Lasso–RFECV: (a) RFECV curve; (b) feature importance ranking.
Figure 9. Data-driven feature selection results for the hydrothermal vein-type model using Lasso–RFECV: (a) RFECV curve; (b) feature importance ranking.
Figure 10. XGBoost model performance for skarn-type Pb-Zn mineralization: (a) log loss curve; (b) accuracy curve.
Figure 11. XGBoost prediction result map for skarn-type Pb-Zn mineralization.
Figure 12. XGBoost model performance for hydrothermal vein-type Pb-Zn mineralization: (a) log loss curve; (b) accuracy curve.
Figure 13. XGBoost prediction result map for hydrothermal vein-type Pb-Zn mineralization.
Figure 14. SHAP explanation plots for the skarn-type model: (a) SHAP value beeswarm plot; (b) SHAP feature importance bar plot.
Figure 15. SHAP explanation plots for the hydrothermal vein-type model: (a) SHAP value beeswarm plot; (b) SHAP feature importance bar plot.
Figure 16. C-A fractal log-log plot for the skarn-type model.
Figure 17. Skarn-type Pb-Zn prospectivity map delineated by C-A fractal analysis.
Figure 18. C-A fractal log-log plot for the hydrothermal vein-type model.
Figure 19. Hydrothermal vein-type Pb-Zn prospectivity map delineated by C-A fractal analysis.

Fu, L.; Chen, G.; Song, Q.; Xie, T.; Yuan, H.; Li, X.; Su, Y.; Xiao, K.; Tang, R. Data-Driven Decoupling of Metallogenic Patterns: A Case Study of Skarn-Type vs. Hydrothermal Vein-Type Pb-Zn Deposits in the Shanghulin Area, Inner Mongolia, China. Minerals 2026, 16, 6. https://doi.org/10.3390/min16010006