The Evolution of Machine Learning in Large-Scale Mineral Prospectivity Prediction: A Decade of Innovation (2016–2025)

Fu, Zekang; Zheng, Xiaojun; Yan, Yongfeng; Xu, Xiaofei; Zhou, Fanchao; Li, Xiao; Zhou, Quantong; Mai, Weikun

doi:10.3390/min15101042


by Zekang Fu ¹, Xiaojun Zheng ¹,²,*, Yongfeng Yan ¹,*, Xiaofei Xu ¹, Fanchao Zhou ³, Xiao Li ¹,⁴, Quantong Zhou ¹,⁵ and Weikun Mai ¹

¹ Faculty of Land Resources Engineering, Kunming University of Science and Technology, Kunming 650093, China

² Innovation Base for Tin-Polymetallic Ore Mineralization Research and Exploration Technology, Geological Society of China, Kunming 650093, China

³ Tianjin North China Geological Exploration General Institute, Tianjin 300181, China

⁴ Shandong Gold Group, Jinan 250102, China

⁵ Jiangxi Provincial Geological Bureau, No.10 Geological Team, Nanchang 335000, China

* Authors to whom correspondence should be addressed.

Minerals 2025, 15(10), 1042; https://doi.org/10.3390/min15101042

Submission received: 27 July 2025 / Revised: 22 September 2025 / Accepted: 24 September 2025 / Published: 30 September 2025

(This article belongs to the Special Issue Application of Big Data Mining, Machine Learning and Artificial Intelligence in Geoscience, 2nd Edition)


Abstract

The continuous growth in global demand for mineral resources and the increasing difficulty of mineral exploration have created bottlenecks for traditional mineral prediction methods in handling complex geological information and large amounts of data. This review aims to explore the latest research progress in machine learning technology in the field of large-scale mineral prediction from 2016 to 2025. By systematically searching the Web of Science core database, we have screened and analyzed 255 high-quality scientific studies. These studies cover key areas such as mineral information extraction, target area selection, mineral regularity modeling, and resource potential evaluation. The applied machine learning technologies include Random Forests, Support Vector Machines, Convolutional Neural Networks, Recurrent Neural Networks, etc., and have been widely used in the exploration and prediction of various mineral deposits such as porphyry copper, sandstone uranium, and tin. The findings indicate a substantial shift within the discipline towards the utilization of deep learning methodologies and the integration of multi-source geological data. There is a notable rise in the deployment of cutting-edge techniques, including automatic feature extraction, transfer learning, and few-shot learning. This review endeavors to synthesize the prevailing state and prospective developmental trajectory of machine learning within the domain of large-scale mineral prediction. It seeks to delineate the field’s progression, spotlight pivotal research dilemmas, and pinpoint innovative breakthroughs.

Keywords:

mineralization prediction; machine learning; deep learning; multi-source data fusion; geological exploration

1. Introduction

In recent years, machine learning (ML) technologies have demonstrated growing value and potential in earth sciences, particularly in large-scale mineral prediction [1]. This trend arises from the global increase in mineral resource demand and the challenges faced by traditional exploration methods when handling complex, multi-source geological data [2]. Although conventional mineral prediction approaches, like geostatistics and weights of evidence, have achieved remarkable results in the past, their efficiency and accuracy may decline when confronted with massive, high-dimensional, and nonlinear geological data [3]. Against this backdrop, machine learning emerges as a powerful data-driven tool. It can automatically identify complex patterns, optimize target area selection, and predict mineral resource potential, excelling even when dealing with intricate, multidimensional datasets [4].

Traditional large-scale mineral prediction methods, such as geological analogy and empirical qualitative analysis, encounter technical bottlenecks when facing multi-source heterogeneous geological data [5]. First, data from geology, geochemistry, geophysics, and remote sensing vary in format and quality and contain significant missing values and noise. Traditional methods struggle to integrate these data effectively, which leaves information isolated [6]. Second, mineralization is a highly nonlinear process involving multiple factors, and the relationships between ore-controlling factors are intricate and hidden [7]. Linear regression models have difficulty capturing such nonlinear relationships. Other traditional approaches, such as geostatistical methods (for example, Kriging) and simulation methods (for example, Monte Carlo simulation), do not rely on linear regression but still have limitations when dealing with complex geological data [8,9]. For instance, predicting iron ore geological bodies requires complex processing of magnetic anomaly data using nonlinear inversion models, with initial model accuracy highly dependent on prior geological knowledge [10]. Moreover, traditional methods are inefficient in processing large datasets and often lack objective quantification and uncertainty assessment of predictions, restricting their application in large-scale, high-throughput exploration [11]. While some traditional methods, such as Sequential Gaussian Simulation (SGS), can provide uncertainty assessments for predictions, they often struggle with the complexity and scale of modern geological datasets. In contrast, machine learning methods, especially deep learning techniques, have shown significant potential in handling these challenges and providing more robust uncertainty assessments.

The application of machine learning technology in earth science has gradually become a research hotspot, especially in mineral prediction. Zhou Yongzhang et al. (2018) systematically expounded the theoretical basis of machine learning methods and their applications in geology in “Big Data Mining and Machine Learning in Earth Sciences”, laying a solid foundation for subsequent research [12].

The advancement of multi-source data fusion technology has also brought about a revolutionary change in mineral prediction. The integration of multi-source geological data such as remote sensing, geophysical, and geochemical data has significantly improved the success rate of mineral prediction. Meanwhile, intelligent feature selection algorithms have greatly enhanced the efficiency of data processing. In the aspect of mineral regularity modeling, Long Short-Term Memory (LSTM) models have been used to handle time-series geochemical data, and deep learning models also show good adaptability in dealing with complex geological conditions [13].

Despite significant progress in mineral prediction using machine learning, many challenges remain:

(1) The lack of standardized geological datasets makes it difficult to compare research results effectively.

(2) Acquiring labeled data under real geological conditions is costly and time-consuming.

(3) There is a need to develop models that can adapt to mineral prediction in different geological environments.

To tackle these difficulties, researchers are increasingly turning to unsupervised learning techniques (for example, autoencoders and clustering), transfer learning, and few-shot learning [5,14]. The interaction between machine learning and geological exploration is growing in significance and represents one of the most promising research avenues in mineral resource exploration. Numerous studies have validated the efficacy of machine learning approaches in both laboratory settings and real-world applications, demonstrating robust performance in regional mineral potential evaluation and the delineation of specific target areas [15].

This review aims to answer the following key research questions:

Which machine learning algorithm families currently dominate large-scale mineral prediction?

What geological tasks, such as mineral information extraction, target area selection, mineral regularity modeling, or resource potential evaluation, are most commonly addressed in these studies?

Have there been observable shifts in research methods between 2016 and 2025, including in terms of article types, methods used, or application fields?

The objective of this review is to comprehensively outline the current state of machine learning applications in large-scale mineral prediction analysis, with a particular focus on modern solutions developed between 2016 and 2025. This encompasses traditional approaches like support vector machines, k-nearest neighbors, and decision trees, along with innovative deep learning methods such as convolutional neural networks, long short-term memory networks, and autoencoders. It is worth noting that while autoencoders have been recently applied as a deep learning technique in mineral exploration and modelling, they are not new—this method was introduced approximately forty years ago.

This review examines 255 scientific publications indexed in the Web of Science database from 2016 to 2025, which include keywords related to “mineral prediction” and “machine learning.” Over this period, the number of publications in this domain has risen substantially, reflecting growing interest among the scientific and industrial sectors in leveraging artificial intelligence for mineral exploration and forecasting. This review also reveals research gaps and potential future directions, including developing mineral prediction methods adaptable to different geological environments, applying few-shot and transfer learning to small-sample data, and creating open and diverse benchmark datasets. An important direction is the comprehensive integration of machine learning methods with geological exploration, mineral resource development, and management systems to align with the concept of smart mines and a digital Earth.

This paper is structured in the following manner. Section 2 sets out the principles and procedures used to identify and screen the literature. Section 3 offers a thematic integration of the examination of machine-learning techniques and relevant research. The outcomes of quantitative and bibliometric analyses are presented in Section 4. Section 5 delineates the future research directions and the limitations of the study. Finally, Section 6 sums up the ultimate conclusions of this review.

2. Materials and Methods

This paper conducts a comprehensive literature review regarding the utilization of machine-learning approaches in large-scale mineralization forecasting. The main goal is to gather, arrange, and assess the research outcomes from the last ten years (2016–2025). It encompasses the practical applications of both traditional machine-learning algorithms and cutting-edge deep-learning techniques in the realm of mineral exploration. In terms of data handling, a literature database was established with PostgreSQL 16.2 (PostgreSQL Global Development Group, Berkeley, CA, USA). All data manipulation, model categorization, and trend assessment were carried out within the Python 3.11.2 (Python Software Foundation, Wilmington, DE, USA) programming environment.

The chosen period reflects significant advancements in machine learning technology, particularly in handling complex geological data. Around 2016, machine learning techniques began to gain widespread attention and initial application in mineral exploration and prediction, marking a turning point. For instance, deep learning demonstrated significant potential in processing geophysical data, bringing innovations to ore body localization and mineralization information extraction. Deep learning algorithms enabled efficient processing of multi-source data such as remote sensing and geochemical data, mining the hidden ore-indicating anomalies within them. Furthermore, deep learning models could segment and classify complex geological images, helping geologists gain a clearer understanding of subsurface geological structures and thereby predict ore body locations more accurately [16]. Since then, deep learning algorithms have been applied more widely in geoscience to process sequence data and complex graph-structured data, enhancing the ability of models to capture multi-dimensional and nonlinear geological features. Particularly after 2020, with advances in Graphics Processing Unit (GPU) hardware and the improvement of open-source frameworks, the application of deep learning models in geoscience entered an explosive phase, enabling the processing of larger-scale, higher-dimensional data and more complex pattern recognition and prediction [17]. By 2025, new intelligent algorithms such as Graph Neural Networks (GNNs) and causal inference models have shown great potential in the geological field, indicating that the application of machine learning in the geological and mineral sector will enter a more mature and diversified stage.

Meanwhile, substantial progress has been made in geological data acquisition and processing technologies, providing high-quality data support for the training and application of machine learning models. The popularization of high-resolution remote sensing technologies (such as multispectral, hyperspectral, and synthetic aperture radar data) has enabled more precise capture of key information like surface biodiversity monitoring and vegetation structure, laying the foundation for broader geoscience applications [18]. Three-dimensional geological modeling technology has also become increasingly mature, capable of presenting complex subsurface geological bodies in a 3D visualized form and providing 3D spatial training samples for machine learning [19]. Geophysical exploration technologies (such as high-density resistivity method and transient electromagnetic method) and geochemical detection methods have also achieved explosive growth in data volume and improved precision, offering abundant subsurface information for machine learning models [20,21]. For example, 3D digital outcrop modeling technology based on UAV oblique photography can convert massive geological data into intuitive 3D geological models, and when combined with deep learning algorithms, it has significantly improved the accuracy of lithology identification [22].

These multi-source, heterogeneous, high-resolution geological data meet the demand of deep learning models for large quantities of data, which in turn facilitates the in-depth application of machine learning methods in the geological domain. Therefore, the period 2016–2025 is not only a golden decade for the rapid development of machine learning technology itself but also a critical period for the qualitative leap in geological data acquisition and processing capabilities. Their synergistic development has jointly driven the paradigm shift of large-scale mineralization prediction from traditional methods to intelligent algorithms, making this period an ideal window to review the evolutionary process of machine learning in mineralization prediction.

2.1. Search and Select Documents

This study utilizes a standardized literature search approach to methodically identify research regarding the utilization of machine learning in large-scale mineralization forecasting. The process involves formulating a search strategy in the Web of Science database, conducting structured processing of bibliographic data, and implementing content screening based on the thematic characteristics of mineralization prediction and methodological criteria. The selection of literature is based on multi-dimensional query logic, aiming to accurately capture application cases of machine learning techniques in the geological field. The core search formula includes:

Title (“mineral prospectivity” OR “ore prediction” OR “mineralization mapping”) AND Title (“machine learning” OR “deep learning”)

The research results are limited to final publications issued between 2016 and 2025. To guarantee a thematic concentration, only the literature that employs particular machine-learning methods is incorporated. These methods encompass Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory networks, Autoencoders, Support Vector Machines, Decision Trees, k-Nearest Neighbors, Random Forests, and K-means clustering. These nine algorithms form the core classification framework of this review, ensuring consistency between literature screening and subsequent analysis.

In addition, to ensure the focus on geological and mineral-related applications, publications from non-geological and mineral-related fields (e.g., medical diagnosis, mechanical engineering, social sciences, pure mathematical theory, etc.) were excluded. Based on the conducted query, a total of 430 relevant documents were obtained and subjected to systematic screening.

To ensure comprehensive geographical representativeness and relevance of the included studies, we conducted a preliminary bibliometric analysis of the initial query results. This analysis revealed that fifteen countries—China, Canada, Australia, the United States, Russia, South Africa, Brazil, Finland, India, Germany, France, the United Kingdom, Chile, Peru, and Iran—account for a significant proportion of all English-language publications in the field of machine learning applications for mineral prediction. These countries were selected for inclusion in the review based on their substantial contributions to the field, as evidenced by the high volume of relevant publications.

By focusing on these fifteen countries, the review aims to capture the majority of influential research while maintaining broad geographical diversity. This approach ensures that the study remains relevant to the primary research areas in machine learning applications for mineral prediction. However, we acknowledge that this decision may exclude some important contributions from other regions. To mitigate this potential limitation, we have also included a discussion on the potential for future research to expand the geographical scope and incorporate additional international contributions.

A preliminary search retrieved 430 documents. After screening for thematic relevance, a further 175 documents, comprising duplicates and papers from low-impact-factor journals, were excluded. A total of 255 documents were finally included in the analysis. The workflow of data collection and screening is presented in Figure 1.

The metadata of the screened literature were imported into a PostgreSQL relational database. Fields included title, author, affiliation, publication year, document type, keywords, DOI, abstract, citations, and publication stage. These structured fields support SQL query construction for aggregating and classifying literature features. The data analysis was carried out utilizing Python 3.12.2 along with libraries like pandas for structuring the data, and matplotlib and seaborn for visualizing it. All the data from the literature were coded and arranged in tables in accordance with pre-established classification standards.
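As a hedged illustration of the kind of SQL aggregation described above, the sketch below counts publications per year with a `GROUP BY` query and reads the result into Python. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server. The table name `papers`, its columns, and the three sample rows are hypothetical and are not taken from the review's database.

```python
import sqlite3

# Hypothetical rows (title, publication year, document type, DOI).
# These are placeholders, not records from the reviewed literature.
SAMPLE_ROWS = [
    ("Paper A", 2016, "article", "10.0000/a"),
    ("Paper B", 2021, "article", "10.0000/b"),
    ("Paper C", 2021, "review", "10.0000/c"),
]


def publications_per_year(rows):
    """Return {year: count} using a SQL GROUP BY over an in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE papers "
            "(title TEXT, pub_year INTEGER, doc_type TEXT, doi TEXT)"
        )
        conn.executemany("INSERT INTO papers VALUES (?, ?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT pub_year, COUNT(*) FROM papers "
            "GROUP BY pub_year ORDER BY pub_year"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


counts = publications_per_year(SAMPLE_ROWS)
print(counts)
```

The same query text should carry over to PostgreSQL largely unchanged; only the connection layer would differ.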

While machine learning techniques are usually classified according to the learning paradigm (such as supervised learning, unsupervised learning, and deep learning), this review adopts a unified categorization framework founded on application scenarios. The grouping is not strictly based on the theoretical affiliation or architectural depth of the algorithms, but rather on their practical application in mineral prediction.

This method of classification stems from the finding that, in geological applications, the boundaries between traditional techniques and deep learning methodologies frequently overlap. For instance, in study [23], a convolutional neural network (CNN) was used to extract mineralization alteration features from remote sensing images, while the final mineral potential classification was performed using a traditional support vector machine (SVM) model, forming a hybrid method that combines deep learning feature extraction with a classical classifier. Similarly, in [24], random forests and SVM models were applied to data preprocessed by principal component analysis (PCA) in the fusion analysis of multi-source geological data (geochemical + geophysical), with traditional statistical methods playing a key role in feature engineering. In study [25], an autoencoder, an unsupervised deep learning model, was employed to reconstruct missing geochemical data, and the output was then analyzed further to predict the probability of mineralization. Likewise, [26] applied SVM and k-nearest neighbors (kNN) to manually interpreted data on tectonic mineral-controlling elements; the same data also served as input features for a CNN, forming a multi-method comparative study. Even within a single study, deep learning and classical methods often complement each other. For example, [27] used both a hybrid deep learning model combining a convolutional neural network and long short-term memory (DCNN-LSTM) and traditional algorithms such as decision trees and kNN to predict mineral potential in the same area, comparing the effectiveness of the methods via receiver operating characteristic (ROC) curves.
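Comparisons via ROC curves are often summarized by the area under the curve (AUC). The sketch below is a minimal, dependency-free illustration of that summary, not the procedure used in [27]. It relies on the standard identity that the AUC equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The labels and the two sets of model scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = cell with a known deposit, 0 = barren cell.
labels = [1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.8, 0.3, 0.4, 0.7, 0.2]  # scores every deposit above every barren cell
model_b = [0.9, 0.3, 0.4, 0.8, 0.6, 0.2]  # ranks some barren cells above deposits

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would scale better on large prospectivity grids.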

Given this methodological hybridization, classification based solely on algorithm architecture would fail to reflect the practical application scenarios of mineral prediction. The classification system of this review focuses on the combination of methods in geological practice, aiming to reveal the diversity of techniques and their contribution mechanisms to the accuracy of mineral prediction.

To guarantee the trustworthiness and solidity of the outcomes, our review adopted a rigorously structured workflow: we crafted a comprehensive search plan, imposed stringent date and language constraints, and applied a stepwise literature screening protocol. These combined measures guarantee that the 255 papers ultimately retained are both highly pertinent and broadly representative, offering a clear panorama of how the technology evolved between 2016 and 2025. Moreover, the application of a classification framework based on usage scenarios, which acknowledges the integration of classical and deep learning techniques in geological practice, lays a methodological foundation for subsequent comparative analyses.

To ensure the transparency and replicability of this review, it is crucial to offer a comprehensive account of the literature search approach. A structured query carried out in the Web of Science database, which is a fundamental step in the search procedure, is described as follows:

Basic search formula: TITLE ((“mineral prospectivity mapping” OR “ore deposit prediction”) AND (“machine learning” OR “deep learning”))

Extended search formula: Built upon the basic search, the extended search is refined by specifying particular algorithms, as illustrated below:

(TS = (machine learn* OR “machine learning” OR ml OR “neural network*” OR “deep learn*” OR “support vector machine*” OR “random forest*” OR “decision tree*” OR “bayesian network*” OR “ensemble method*” OR “reinforcement learn*” OR “gradient boost*” OR “clustering algorithm*”)) AND (TS = (mineral deposit* OR ore deposit* OR mineralization OR “deposit formation” OR “mineral prediction” OR “ore prediction” OR “mineral prospect*” OR “geological prediction” OR “mineral exploration” OR “ore genesis” OR “mineral resource*”)) AND (PY = 2016–2025).

In this context, the TS field tag denotes a topic search, which in Web of Science covers the title, abstract, and keyword fields, while the PY field specifies the range of publication years.
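To make a query of this shape easier to reproduce or adapt, it can be assembled from term lists rather than typed by hand. The sketch below is one hypothetical way to do so. The helper names `or_group` and `build_query` are illustrative, the term lists are a shortened subset of the terms above, and quoting every multi-word term is a design choice of this sketch rather than a property of the original query. The exact syntax accepted by the database should be checked against its own documentation.

```python
def or_group(terms):
    """Join terms with OR, quoting multi-word terms so they are treated as phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(method_terms, subject_terms, first_year, last_year):
    """Assemble a TS/PY search string from two term lists and a year range."""
    return (
        f"TS={or_group(method_terms)} AND TS={or_group(subject_terms)} "
        f"AND PY={first_year}-{last_year}"
    )


# Shortened, illustrative subset of the search terms.
methods = ["machine learning", "deep learn*", "random forest*"]
subjects = ["mineral prospect*", "ore prediction"]

query = build_query(methods, subjects, 2016, 2025)
print(query)
```

Keeping the term lists as data makes it straightforward to record exactly which terms were searched and to rerun the search with a different year range.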

2.2. Classification Criteria

All included literature has undergone an in-depth content analysis relevant to the research tasks and methodology of mineral prediction. The classification process is based on the metadata and full-text content of the literature, including:

The title, the abstract, the key terms, the author identification details, and, if required, an assessment grounded in the complete text.

Document type differentiation: journal articles, conference papers, review articles, etc.

The core classification criterion is to identify the machine learning algorithms used in each document, specifically including Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM), Autoencoders, Decision Trees (DT), Support Vector Machines (SVM), Neural Network Systems (NNS), K-means Clustering, and Random Forests (RF). In addition, the classification system also takes into account the types of input data, including geological mapping data, geochemical data, geophysical data, and remote sensing image data. Based on the above characteristics, this review classifies machine learning techniques into three major categories:

Traditional machine learning techniques encompass Support Vector Machines (SVM), Decision Trees (DT), k-Nearest Neighbors (kNN), and Random Forests (RF). These approaches depend on manually crafted geological characteristics (for example, anomaly cut-off values and structural buffer areas). The primary benefits of these methods are the interpretability of the models and computational effectiveness.

Advanced deep learning techniques, including Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM), and Autoencoders, are capable of directly extracting hierarchical feature representations from unprocessed geological data. This characteristic renders them highly appropriate for the interpretation of remote sensing images and the integration of multi-source geological data.

Hybrid approaches: combining the strengths of the above two categories of methods. For example, using CNN to automatically extract alteration mineral features from hyperspectral data, and then performing mineralization potential classification through SVM or RF. This aims to balance feature learning capabilities and model simplicity.

Key classification criteria also include specific application scenarios of machine learning in mineral prediction. Based on content analysis, the literature is categorized into the following application categories: identification of mineralization anomalies (such as geochemical anomaly extraction), assessment of mineral potential (probabilistic mineral resource prediction), exploration target selection (multi-criteria decision analysis), extraction of structural mineral-controlling elements, and estimation of mineral resource quantities. Most studies have cross-category characteristics, reflecting the multidisciplinary nature of mineral prediction. Types of research methods are also included in the classification system, encompassing experimental studies (including field data validation), case studies (specific types of deposits), conceptual models (innovations in theoretical methods), and literature reviews. Figure 2 illustrates the association network between machine learning algorithms and mineral prediction application scenarios.

This network diagram, with machine learning techniques at its core and various application scenarios distributed peripherally, illustrates the connections between typical algorithms—such as CNN, SVM, and RNN—and application contexts like mineralization anomaly identification and target area selection. For instance, CNN is predominantly used for alteration zone recognition in remote sensing images and structural interpretation, whereas RF and XGBoost are extensively applied in mineral potential assessment through multi-source data fusion. This visual representation of associations not only reveals the multidimensional nature of mineral prediction methods but also reflects the differences in the applicability of various algorithms.

In addition to the systematic literature selection, a key strength of the methodology in this review lies in its practical classification framework for machine learning techniques. Unlike classifications based on theoretical paradigms, this framework is grounded in the actual application and combination patterns of algorithms in mineral prediction. This approach stems from the observation that, in geological practice, the boundary between classical and deep learning methods in mineral exploration is often blurred, prompting the emergence of hybrid methods as an effective means of addressing complex geological problems. This review aligns the categorization criteria with the technical procedures of mineral prediction, such as anomaly detection and target zone selection. By doing so, it ensures that the analytical outcomes directly address the practical requirements of mineral exploration, helping to bridge the gap between theoretical progress in machine learning and geological applications.

2.3. Data Processing and Analysis

All literature data were processed with SQL queries to produce statistical reports covering the annual distribution of publications, country contributions, the major machine learning methods applied, and mineral prediction application scenarios. Meanwhile, qualitative analyses were conducted through abstract reviews and, when necessary, full-text readings to ensure accurate classification of the literature into the preset research categories. During the analysis, studies spanning multiple research scenarios were identified, such as those combining anomaly identification with target area selection. A multi-label classification method was adopted for these cases to support cross-tabulation analysis and methodological comparisons. Section 3 presents the visualization of the analysis results and the association patterns between the categories in detail.
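The multi-label cross-tabulation mentioned above can be sketched as follows. This is an illustrative reading of the idea, not the review's actual code: each paper carries a set of method labels and a set of task labels, and every (method, task) combination contributes one count. The three coded records are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical coded records; a paper may carry several method and task labels.
papers = [
    {"methods": {"CNN", "SVM"}, "tasks": {"anomaly identification"}},
    {"methods": {"RF"}, "tasks": {"potential assessment", "target selection"}},
    {"methods": {"CNN"}, "tasks": {"anomaly identification", "target selection"}},
]


def cross_tabulate(records):
    """Count every (method, task) pair across all records."""
    table = Counter()
    for record in records:
        pairs = product(sorted(record["methods"]), sorted(record["tasks"]))
        for method, task in pairs:
            table[(method, task)] += 1
    return table


table = cross_tabulate(papers)
for (method, task), n in sorted(table.items()):
    print(f"{method:4s} {task:24s} {n}")
```

Because a multi-label paper contributes to several cells, the cell counts sum to more than the number of papers (six counts for three papers here), which is worth keeping in mind when reading percentages from such a table.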

2.4. Review Protocol and Quality Assessment

To ensure transparency and reproducibility of the study, this review was designed in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, which include four key stages: literature identification, initial screening, eligibility assessment, and final inclusion (Figure 3).

In the identification stage, a grand total of 430 research papers were obtained from the Web of Science database by utilizing the pre-defined search terms. The search scope was restricted to English-language final manuscripts that were published within the time frame from 2016 to 2025. Subsequently, in the screening phase, redundant entries were eliminated. After that, the remaining records were evaluated for their thematic relevance, and this assessment was carried out by examining their titles and abstracts. Studies that clearly did not address the application of machine learning in mineral prediction were excluded.

During the eligibility evaluation stage, a comprehensive full-text examination was conducted. Articles were incorporated into the study if they satisfied the subsequent criteria:

Explicitly applied machine learning techniques to large-scale (≥1:100,000) mineral prediction;

Provided a complete methodological description, including data sources, model parameters, and validation methods;

Peer-reviewed journal or conference papers;

The authors are from one of the fifteen selected countries active in this research area.

A total of 255 articles were ultimately included for in-depth analysis. Exclusion criteria include non-geological mineral field research, purely theoretical method discussion (without actual geological data verification), small scale mineral point prediction (<1:100,000) and studies that do not provide model performance evaluation. It is especially important to point out that the term “absence of experimental verification” pertained specifically to research that failed to showcase the application outcomes of machine learning models using actual geological data. Conceptual studies proposing innovative methodological frameworks or performance evaluation metrics directly related to mineral prediction were still retained.
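The inclusion and exclusion rules above can be read as a single predicate applied to each candidate paper. The sketch below is one possible encoding under stated assumptions, not the authors' screening tool. The field names are hypothetical, the scale threshold is interpreted as a map-scale denominator of at most 100,000, conceptual framework papers are treated as exempt from the real-data and performance requirements, and the country criterion is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical coded fields for one candidate paper."""
    applies_ml: bool
    scale_denominator: int          # 50000 stands for a 1:50,000 map scale
    full_method_described: bool
    peer_reviewed: bool
    validated_on_real_data: bool
    reports_performance: bool
    conceptual_framework: bool = False


def is_eligible(c: Candidate) -> bool:
    # Assumption: "large-scale (>= 1:100,000)" means a denominator <= 100,000.
    large_scale = c.scale_denominator <= 100_000
    # Assumption: conceptual framework papers are retained even without
    # real-data validation or performance figures.
    evidence = (
        c.validated_on_real_data and c.reports_performance
    ) or c.conceptual_framework
    return (
        c.applies_ml
        and large_scale
        and c.full_method_described
        and c.peer_reviewed
        and evidence
    )


candidates = [
    Candidate(True, 50_000, True, True, True, True),           # meets every criterion
    Candidate(True, 250_000, True, True, True, True),          # map scale too small
    Candidate(True, 100_000, True, True, False, False),        # no real-data validation
    Candidate(True, 100_000, True, True, False, False, True),  # conceptual, retained
]
decisions = [is_eligible(c) for c in candidates]
print(decisions)
```

Writing the criteria as an explicit function makes each interpretation visible and lets the screening be rerun if a criterion changes.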

To evaluate the academic quality of the included literature, this study analyzed the following indicators: journal impact factor quartile (JCR Q1/Q2/Q3), number of citations per article, use of independent test set validation, and public availability of geological data. Recently published studies (2023–2025) with significant methodological innovation or large-scale data were prioritized for inclusion even when their citation counts were low. This structured review process ensured methodological rigor and provided a reliable data foundation for the subsequent analysis of technical trends.

3. Analysis of the State of the Art

Over the past few years, machine learning has become a fundamental tool for large-scale mineral prediction. By integrating multi-source geological data, it enables the identification of mineralization anomalies, the assessment of mineral potential, and the selection of exploration target areas, which has substantially improved both the efficiency and the accuracy of mineral resource prospecting [25]. This chapter reviews the technical evolution of the field from 2016 to 2025. It compares the effectiveness of deep learning techniques, such as convolutional neural networks, recurrent neural networks, and long short-term memory networks, with that of traditional methods such as support vector machines and random forests. It pays special attention to how these methods handle uncertainty in geological data, spatial heterogeneity, and multi-scale features. The chapter also examines key challenges such as sample scarcity and the complexity of feature engineering. Furthermore, it summarizes emerging trends in technological development, identifies current research gaps, and outlines potential breakthrough directions for the next decade. The discussion is organized around methodological evolution, application scenarios, and algorithm performance comparisons.

3.1. Review from 2016 to 2025

Classic Methodology Phase (2016–2020): From 2016 to 2020, classical methods such as support vector machines (SVMs), decision trees, and k-nearest neighbor (kNN) algorithms excelled in scenarios with limited data [28]. Between 2016 and 2018, shallow learning algorithms dominated mineral prediction research. Random forests (RF) and SVMs were widely used for their robustness with small sample sizes and their interpretability [29]. For instance, the authors of [30] used the RF algorithm to predict mineral potential on Catanduanes Island, Philippines, demonstrating its effectiveness. SVMs were widely used for delineating ore boundaries and identifying mineralization anomalies because of their pattern recognition capability in high-dimensional spaces. From 2018 to 2020, with enhanced computing power and increased data availability, convolutional neural networks (CNNs) began to emerge [31]. CNNs were introduced into mineral prediction on two-dimensional grid data because they capture spatial patterns efficiently. For example, in 2018, ref. [32] applied a CNN to the Tongling mineral belt in eastern China, successfully predicting mineral potential from multi-source geological, geophysical, and geochemical data. Liu Yanpeng et al. (2018) first applied a CNN to mineral exploration prediction at the Zhaojikou lead-zinc deposit in Anhui Province, ushering in a new era of machine learning application in the mineral field [33]. In addition, the random forest algorithm achieved an area under the curve (AUC) of 0.93 in multi-element anomaly identification at the Olympic Dam copper-gold mine in Australia [34].
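Because AUC is used here as the headline performance figure, it may help to recall how it is obtained from model scores. The sketch below computes AUC as the probability that a randomly chosen deposit cell is ranked above a randomly chosen barren cell. The labels and scores are hypothetical and are not taken from any cited study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prospectivity scores: 3 deposit cells and 5 barren cells.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.91, 0.78, 0.40, 0.55, 0.30, 0.22, 0.15, 0.05]
print(round(roc_auc(labels, scores), 3))  # prints 0.933
```

In this toy case 14 of the 15 deposit/barren pairs are ordered correctly, so the AUC is 14/15. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.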

The Rise of Deep Learning (2020–2025): Between 2020 and 2022, attention mechanisms and Graph Convolutional Networks (GCNs) brought new breakthroughs to mineral prediction [35]. Attention mechanisms automatically learn which data features matter most, while GCNs are suited to graph-structured data. For instance, in 2021, reference [36] proposed an attention-based 3D-CNN model for processing 3D geological block data, which significantly improved the identification accuracy of deep concealed ore bodies. Deep learning not only performs well on macroscopic geological data but has also made breakthroughs in the identification of minerals at the microscopic scale. Xu Shuteng et al. (2018) achieved intelligent recognition of ore minerals under the microscope using deep learning algorithms, providing a new technical means for mineralogical research [37].

From 2023 to 2025, deep generative models such as Generative Adversarial Networks (GANs) and diffusion models became research hotspots. These models can generate high-fidelity mineralization training samples, effectively addressing sample scarcity and imbalance. For example, in 2024, reference [38] augmented mineral data using conditional GANs, which significantly boosted model performance. Meanwhile, hybrid methods (e.g., CNN-SVM, DCNN-LSTM), leveraging complementary advantages, increased prediction accuracy in complex mineral systems by 15%–20%.

From the perspective of data fusion, the evolution from single-source to multimodal data is evident. Between 2016 and 2018, mineral prediction mainly relied on single-source geochemical data. For example, in 2017, reference [39] used geochemical data and random forest algorithms for mineral anomaly identification. However, this method overlooked other crucial geological information. From 2018 to 2020, multi-source data fusion became a trend. Researchers began combining geophysical, remote sensing, and geological map data to build more comprehensive mineral prediction models. For instance, in 2019, reference [40] successfully predicted ore distribution using multiple data sources and machine learning algorithms. After 2022, the application of multimodal Transformer frameworks elevated multi-source data fusion. In 2024, reference [41] used a multimodal Transformer framework to integrate diverse geological data, producing high-precision mineral prediction maps.

The evolution from 2D to 3D/4D data fusion is also significant. Between 2016 and 2018, mineral prediction focused on 2D spaces, analyzing geological maps and geochemical raster data. For example, in 2016, reference [42] predicted ore bodies in the Tongling mineralization zone using 2D-CNN. From 2019 to 2021, 3D modeling technologies emerged. In 2022, reference [43] proposed a 3D-CNN model with attention mechanisms for 3D geological block data processing. Using DeepLIFT visualization, this model revealed the spatial control of faults on mineralization.

Between 2022 and 2024, 4D modeling technologies gradually emerged. For instance, one 2024 study incorporated time-series geophysical data into a 4D-CNN model, achieving “evolution-mineralization” coupled prediction. The data characteristics of different mineral types also significantly affect model adaptability. For example, the distribution of rare earth elements (REEs) is controlled by complex geochemical and mineralogical processes and is often linked to specific rock types and alteration. Machine learning, especially deep neural networks, also shows great potential in ligand screening for REE separation: by combining molecular physicochemical descriptors and atomic extended-connectivity fingerprints, these models greatly improve the prediction accuracy of distribution coefficients for the solvent extraction of REE ions. For REEs with complex geochemical behavior, deep learning models capable of handling high-dimensional, nonlinear features are advantageous. In contrast, for structurally controlled massive sulfide copper or porphyry copper deposits, Graph Neural Networks (GNNs) or other models that handle structural information may more effectively capture the relationship between ore-controlling structures (such as faults and shear zones) and ore body distribution [44,45,46,47,48].

In terms of model interpretability, interpretability tools are increasingly used with deep learning models [49]. Despite their excellent performance in mineral prediction, deep learning models remain “black boxes”. From 2016 to 2018, mineral prediction models had poor interpretability, making it hard for geologists to understand how the models reached their decisions [50]. From 2019 to 2021, researchers began to explore ways to enhance model interpretability. For instance, Zhou et al. (2021) discussed the construction and application prospects of a knowledge graph for porphyry copper deposits in the Qin-Hang Metallogenic Belt [51]. Between 2022 and 2024, interpretability tools such as SHapley Additive exPlanations (SHAP) were widely used with deep learning models. For example, in 2023, article [52] used SHAP for post hoc interpretation of a CNN model, confirming the decisive contribution of the Au-As-Sb element association to orogenic gold deposits.

3.2. Technical Evaluation and Thoughtful Analysis

Despite the significant progress in large-scale mineral prediction with machine learning over the past decade, many problems persist. The following is a detailed analysis.

The scarcity of mineral deposit samples is a common issue in mineral prediction [53]. Typically, known mineral deposit samples account for less than 1% of the available samples, which leads to overfitting during model training. To address this challenge, researchers have proposed various data augmentation techniques. The Synthetic Minority Over-Sampling Technique (SMOTE) alleviates sample imbalance by interpolating between existing minority class samples in feature space to generate new ones [54]. Generative Adversarial Networks (GANs) synthesize high-fidelity mineralized training samples through adversarial training [55]. Positive-Unlabeled (PU) learning methods reduce reliance on known mineral deposit samples by leveraging large amounts of unlabeled data. More broadly, the rapid development of mathematical geoscience has injected new vitality into geological research. Zhou Yongzhang et al. (2021) pointed out in the Bulletin of Mineralogy, Petrology and Geochemistry that big data and artificial intelligence algorithms are changing the traditional research paradigm of geology [56].
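The interpolation step at the heart of SMOTE can be illustrated in a few lines. The sketch below is a simplified, standard-library version of the idea, not the reference implementation and not the code used in [54]: each synthetic sample lies on the segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the four deposit samples are hypothetical.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical deposit samples described by two standardized features.
deposits = [(1.0, 2.0), (1.2, 2.4), (0.8, 1.9), (1.5, 2.2)]
new_points = smote(deposits, n_new=6)
print(len(new_points))
```

Since every synthetic point is a convex combination of two real deposit samples, the augmented class stays inside the region already occupied by known deposits. This also means SMOTE cannot supply information about deposit styles that are absent from the original samples.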

Uncertainty in mineralization factors is a long-standing problem in mineral prediction. It arises mainly because researchers differ significantly in how they understand the ore-controlling factors of the same deposit type. To resolve this issue, researchers have proposed frameworks based on conceptual models: by clarifying the definitions and quantification methods of ore-controlling factors, these frameworks reduce the uncertainty caused by conceptual differences [57]. Modeling the uncertainty in ore-controlling factors with Bayesian networks makes it possible to represent the ambiguity and uncertainty of geological knowledge explicitly [58].

Regarding model interpretability, the “black-box” nature of deep learning models poses a challenge for their application in mineral prediction. To enhance interpretability, multiple approaches have been proposed. Rule-based machine learning methods, such as decision trees and rule-based systems, provide explicit decision rules [59]. Feature importance analysis methods like SHAP and Local Interpretable Model-agnostic Explanations (LIME) identify key features and their contributions in models [60]. Deep learning models with geological constraints ensure predictions align with geological patterns by integrating geological knowledge into model structures [61].
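SHAP and LIME are not reproduced here. As a simpler model-agnostic stand-in for feature importance analysis, the sketch below uses permutation importance: it shuffles one feature column at a time and records the mean drop in accuracy. The toy model, feature meanings, and data are hypothetical and serve only to show the mechanics.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model reproduces the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy model: a cell is prospective when the first feature
# (say, a standardized Au anomaly) exceeds 0.5; the second is ignored.
def model(row):
    return int(row[0] > 0.5)


X = [[0.9, 0.1], [0.8, 0.7], [0.7, 0.3], [0.6, 0.9],
     [0.4, 0.2], [0.3, 0.8], [0.2, 0.5], [0.1, 0.6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
print(permutation_importance(model, X, y))
```

The second feature receives an importance of exactly zero because the toy model never reads it. This is the kind of check a geologist can use to see whether a model relies on geologically meaningful variables.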

Despite these challenges, machine learning’s advantages grow with increasing geological data complexity. While traditional methods such as trend surface analysis and factor analysis can be extended to certain nonlinear cases through polynomial terms or nonlinear transformations, they often require explicit model specification and struggle to automatically capture high-dimensional, multiscale, or highly complex nonlinear patterns that are common in large-scale mineral prediction. In contrast, machine learning models learn these patterns directly from data without the need for a priori functional forms, offering greater flexibility and scalability.

3.3. Comparative Technical Analysis of Machine Learning Methods

Regarding computational complexity, traditional techniques such as Support Vector Machines (SVMs) and Decision Trees (DTs) train quickly, approximately 10 to 100 times faster than deep learning models, and consume minimal resources. As a result, they are well suited to embedded devices and situations requiring real-time prediction, such as portable field analysis systems. However, their performance depends heavily on manual feature engineering, and their accuracy is limited when dealing with raw data that has not been geologically interpreted [62].

In terms of feature learning ability, deep learning methods show significant advantages. CNNs can automatically extract spatial features from remote sensing images through convolutional kernels, achieving an accuracy of over 95% in alteration mineral mapping. This claim is supported by quantitative metrics: precision ranges from 94.8% to 96.5%, recall from 95.1% to 97.2%, and F1-score from 94.9% to 96.8% [7,29]. LSTM networks are well suited to capturing long-term patterns in geological records, such as sedimentary cycles or episodes of hydrothermal activity, and can date minerals to within half a time unit on average [14,37]. However, these models typically require thousands to tens of thousands of samples and training times ranging from hours to days, which limits their application to small datasets [63,64,65].
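For readers less familiar with the three metrics quoted above, the following sketch shows how they are derived from confusion-matrix counts. The counts are hypothetical and were chosen only so that the results fall inside the quoted ranges; they do not come from [7,29].

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    if tp == 0:
        raise ValueError("metrics are undefined without true positives")
    precision = tp / (tp + fp)   # share of predicted positives that are real
    recall = tp / (tp + fn)      # share of real positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical pixel counts for one alteration-mineral class.
p, r, f1 = precision_recall_f1(tp=955, fp=45, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is consistent with the ranges reported above.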

When labeled data are scarce, unsupervised learning techniques (for example, autoencoders and K-means clustering) are particularly valuable. Autoencoders can extract implicit mineralization information from unlabeled geochemical data, achieving an anomaly detection accuracy of 85%–90%. K-means clustering can automatically classify mineralization types, but it requires manual intervention to determine the optimal number of clusters, and its results are difficult to interpret.
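The need to choose the number of clusters by hand can be made concrete with a small experiment. The sketch below runs a plain Lloyd's-algorithm K-means for several values of k and reports the within-cluster sum of squares (inertia), the quantity usually inspected in an "elbow" plot. The deterministic farthest-first seeding and the eight samples are assumptions made for illustration, not a method from the reviewed studies.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def init_farthest_first(points, k):
    """Deterministic seeding: take the first point, then repeatedly add
    the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, iters=20):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = init_farthest_first(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return sum(min(dist2(p, c) for c in centroids) for p in points)


# Hypothetical two-element geochemical samples forming two obvious groups.
samples = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
for k in (1, 2, 3):
    print(k, kmeans_inertia(samples, k))
```

Inertia falls sharply from k = 1 to k = 2 and only slightly afterwards, which is the elbow an analyst would look for. On real geochemical data the elbow is rarely this clear, hence the manual intervention noted above.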

Robustness tests show that deep learning models with multi-sensor fusion (such as CNNs with attention mechanisms) achieve a 15%–20% improvement in accuracy over SVM classifiers that use geochemical data alone; the reported advantage is 16.3%–18.8% against an SVM baseline accuracy of 72.5%–75.8% [6,11,28]. Ensemble methods, such as RF and eXtreme Gradient Boosting (XGBoost), reduce overfitting risks through multi-model voting mechanisms and provide feature importance rankings, which help identify key ore-controlling factors (such as element anomalies and specific structural styles) [66,67,68,69,70,71].
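The voting mechanism mentioned above applies most directly to bagging-style ensembles such as random forests, whereas boosting methods such as XGBoost combine their learners additively. A minimal sketch of majority voting is shown below; the three prediction lists are hypothetical outputs of three base classifiers on the same four cells.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists (all of equal length) by majority vote."""
    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three base classifiers (1 = prospective).
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]
print(majority_vote([model_a, model_b, model_c]))  # prints [1, 1, 1, 0]
```

Each base classifier makes one error relative to the voted result on the first three cells, yet the ensemble output is unaffected. This is the intuition behind the reduced overfitting risk. An odd number of voters avoids ties in the binary case.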

Method selection strategies need to consider application scenarios comprehensively: (1) for rapid field assessment with high real-time requirements → prioritize kNN and Decision Trees; (2) for high-resolution remote sensing image processing → use CNN or Transformer models; (3) for geological time series analysis → use LSTM or GRU networks; (4) for mineral prediction in areas with small samples → use semi-supervised SVM or transfer learning models; (5) for mechanism studies with high interpretability requirements → use Decision Trees or Random Forests.
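The five selection rules above amount to a lookup from application scenario to candidate methods. The sketch below encodes them as a dictionary. The scenario keys and the `recommend` helper are hypothetical names introduced for illustration, while the method lists are taken directly from the rules.

```python
RECOMMENDED = {
    "rapid_field_assessment": ["kNN", "Decision Tree"],
    "remote_sensing_imagery": ["CNN", "Transformer"],
    "geological_time_series": ["LSTM", "GRU"],
    "small_sample_prediction": ["semi-supervised SVM", "transfer learning"],
    "interpretable_mechanism_study": ["Decision Tree", "Random Forest"],
}


def recommend(scenario: str) -> list[str]:
    """Return the candidate methods listed for a scenario."""
    try:
        return RECOMMENDED[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None


print(recommend("geological_time_series"))  # prints ['LSTM', 'GRU']
```

In practice a project often matches more than one scenario (for example, small samples and high interpretability requirements), in which case the candidate lists would need to be intersected or weighed against each other.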

Table 1 summarizes the essential features of the selected machine learning techniques for large-scale mineral prospectivity prediction. Applying these methods successfully in geological settings requires a clear understanding of their technical characteristics, because choosing among them usually involves trade-offs among interpretability, computational speed, and overall performance. Among the traditional algorithms, support vector machines (SVMs) have relatively low computational complexity, decision trees (DTs) very low complexity, and k-nearest neighbors (kNN) medium complexity. They are suited to geochemical anomaly classification, identification of structural ore-controlling factors, and rapid mineralization type classification, respectively, and are therefore preferred in resource-constrained scenarios. However, SVMs and decision trees place high demands on feature engineering, as does kNN, which may limit their performance on complex or noisy datasets. Conversely, deep learning frameworks such as convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) can extract hierarchical data representations directly. CNNs attain an accuracy of 90% to 98% in remote sensing alteration mineral mapping, while LSTMs achieve an accuracy of 88% to 96% in mineral time series prediction. Both are characterized by strong robustness. However, CNNs have high computational complexity and LSTMs very high complexity, which poses challenges for deployment. These competing requirements mean that the choice of method must be adapted to the needs of the specific application, such as real-time response and model interpretability.

Given the differences between classical and deep learning methods, hybrid and ensemble approaches have become practical strategies. Hybrid models combine deep learning networks (such as CNNs, which extract features automatically and need little feature engineering, as in remote sensing alteration mineral mapping) with classical algorithms (such as SVMs and kNN, which are efficient and interpretable classifiers). Integrating multi-source geological data, as in random forest multi-source data fusion prediction, further enhances performance. Ensemble methods such as random forests aggregate the predictions of base learners to improve robustness, reduce overfitting, and provide insight through feature importance metrics. These architectures reflect the increasing complexity of model design, which aims for optimal performance, resilience, and practicality in complex geological application scenarios such as those listed in Table 1. Autoencoders can also be used to reconstruct missing geological data, given their low feature engineering requirements, moderate computational complexity, and moderate to high robustness.

3.4. Key Trends in Technology Evolution

The application of machine learning in mineral prospecting shows distinct stages. From 2016 to 2019, classical algorithms dominated, with support vector machines (SVMs), decision trees, and random forests widely used in medium- and small-scale deposit prediction. The core breakthroughs lay in the standardized processing of geological data and innovations in feature engineering methods. After 2020, deep learning technologies rose rapidly. Convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and autoencoders achieved significant advances in areas such as remote sensing interpretation and three-dimensional modeling. This marks a shift in technical emphasis from a “feature-based” approach to a “data-centered” one.

In recent years, Large Language Models (LLMs) have gained significant traction in geoscientific research, becoming an emerging trend in the integration of machine learning and mineral prospecting [72]. Their applications in the field mainly focus on three key areas: first, document mining—LLMs can efficiently process massive geoscientific literature (such as geological survey reports, mineral deposit case studies, and exploration project records), automatically extract key information (e.g., mineralization age, ore-controlling structures, and typical geochemical anomalies) that is critical for prospecting [73]; second, knowledge extraction—by analyzing unstructured text data (including field investigation notes and laboratory test summaries), LLMs help construct structured geological knowledge graphs, which provide systematic knowledge support for mineral prediction model design; third, geoscientific mapping—combined with spatial analysis tools, LLMs can assist in interpreting text descriptions of geological outcrops and converting them into standardized spatial mapping elements, improving the efficiency and consistency of geological mapping. These applications are supported by recent literature focusing on the intersection of LLMs and geoscience, further verifying their potential in promoting intelligent mineral exploration [74].

Multi-source data fusion has become a core strategy to improve prediction performance. By amalgamating geological, geophysical, geochemical, and remote-sensing data, models are capable of encompassing more extensive mineralization details. For example, studies [75,76,77,78,79,80] show that multi-source data models have an accuracy improvement of 15%–25% compared to single data sources. Machine learning is deeply integrating with “digital geology” platforms, forming a full-process solution from data collection, intelligent interpretation to target area optimization, and promoting the transformation of mineral exploration to intelligent decision-making.

4. Results and Discussion

Table 2 summarizes the scientific literature on machine learning techniques for large-scale mineral prediction for the periods 2016–2020 and 2021–2025. There were 96 publications in the first five-year period and 159 in the second. Of the 255 studies analyzed, 62.35% were published in the latter period. This disparity demonstrates the growing research interest in this domain and the increasing attention the academic community pays to machine learning in large-scale mineral prediction. In terms of document types, journal articles dominate, while conference papers are relatively few. This distribution reflects the maturity of research in this field, the active academic exchange, and the preferred channels for knowledge dissemination. Journal articles, as the primary vehicle for publishing academic research findings, far exceed other types in number. This indicates that researchers in this field tend to publish rigorously peer-reviewed and innovative results in professional journals, which helps ensure the quality and reliability of the research and promotes the systematic accumulation and wide dissemination of knowledge. The high proportion of journal articles also suggests that the research methods, theoretical frameworks, and application practices in this field are gradually converging on shared standards, providing a solid foundation for future research.

In terms of specific machine learning methods, decision trees were the most frequently used, appearing in 114 publications (44.71% of all studies). Random forests and SVMs were the next most common, appearing in 107 and 60 studies, respectively. Notably, all 43 CNN-related publications in this review were from the 2021–2025 period, indicating a marked increase in interest in this technology [23,34]. Overall, the popularity of individual machine learning methods has changed significantly over time, reflecting shifting technical trends. In particular, deep learning models, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), have become more widespread in recent years and have increasingly displaced earlier conventional methods such as Support Vector Machines (SVMs) and decision trees. This shift may result from the broader adoption of neural architectures, the accessibility of computational resources, and the growing complexity of geological datasets [5,25].

In terms of application fields, the most frequently discussed topic was mineral potential assessment, which appeared in as many as 175 publications (68.63% of all publications). Exploration target selection appeared in over half (53.73%) of the studies. No statistically significant association was found between application fields and time. This means that key areas such as mineralization anomaly identification, mineral potential assessment, and exploration target selection have remained consistent over the decade, regardless of the machine learning methods used. This indicates a stable research interest in key mineral exploration challenges.
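The shares quoted in this section follow directly from the publication counts and the total of 255 studies. The short check below recomputes them from the counts given in the text; the dictionary labels are descriptive names added for readability.

```python
total = 255
first_period, second_period = 96, 159
assert first_period + second_period == total  # the two periods cover all studies

counts = {
    "published 2021-2025": second_period,
    "decision trees": 114,
    "mineral potential assessment": 175,
}
# Percentage share of each count, rounded to two decimals as in the text.
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The recomputed values are 62.35%, 44.71%, and 68.63%, matching the figures reported above. Because a single study can use several methods or address several application fields, these shares are not expected to sum to 100%.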

In terms of research methods, conceptual research dominated, appearing in 161 publications (63.33%). Case studies accounted for 32.75% of the research, while literature analysis was relatively rare at 3.92%. The structure of research methods showed no significant change between the two periods (2016–2020 and 2021–2025): literature analysis, conceptual research, and case studies were used in similar proportions in both, indicating a sustained methodological balance despite the increase in research volume [81].

An annual statistical analysis of the 255 papers published between 2016 and 2025 clearly revealed the flourishing trend of machine learning techniques in large-scale mineral prediction research. Figure 4 indicates that there has been a remarkable increase in the quantity of publications within this domain. Notably, after the year 2020, there was a substantial upsurge in the output of research in this area. This reflects the growing academic interest in this research direction and indicates that the potential of machine learning techniques in solving complex geological problems is gradually being recognized and put into practice.

As depicted in Figure 4, research activity in this field has shown significant growth. From the relatively low publication volume in 2016, it reached a peak in 2021 and grew again in 2024, indicating the continuous rise of machine learning in mineral prediction. The following is a specific data analysis:

Early Exploration Phase (2016–2017): During this period, the number of publications was small, with an annual average of about 10–15 papers. This reflected the initial introduction and exploration of machine learning techniques in the Earth sciences. Research mainly focused on proof-of-concept and the application of basic methods, with researchers beginning to apply classic machine learning algorithms (such as SVM and RF) to simple geological classification and prediction tasks.

Rapid Growth Phase (2018–2021): Starting in 2018, the number of publications grew explosively, with an annual average of 20–40 papers. It reached its highest point in 2021, nearing 50 papers. This growth was primarily driven by the maturity and popularization of deep learning technologies and the rise of the geological big data concept. Research began to focus on more complex nonlinear problems and attempted to handle multi-source heterogeneous data. Additionally, enhanced computing capabilities and the user-friendly nature of open-source machine learning frameworks accelerated research progress.

Stable Development and Deepening Stage (2022–2025): Despite slight fluctuations in the number of studies in 2022 and 2023, the overall level remained high, with an annual publication rate between 30 and 45 papers. In 2024, growth occurred again, indicating that the field will continue to maintain strong momentum. During this stage, research focused more on model robustness, interpretability, generalization ability, and deep integration with geological expertise. Advanced technologies like transfer learning, few-shot learning, and reinforcement learning were also introduced to address practical application challenges.

This growth reflects the global surge of interest in artificial intelligence and the expanding availability of high-quality geoscience data. It mirrors the increasing enthusiasm and capacity of the research community to adopt and test machine-learning approaches. The continuous increase in literature is not just a simple accumulation of research output but also represents an expansion in research depth and breadth, indicating that machine learning will play a more central role in future mineral prediction.

Specifically, the number of studies in this field was relatively small in 2016 and 2017, which may be related to the limited popularity and depth of application of machine learning technology in earth science at that time. However, with the rapid development of cutting-edge technologies like deep learning and their breakthroughs in image recognition and natural language processing, researchers began to introduce these technologies into geological exploration. From 2018 onwards, the number of publications showed a steady upward trend, indicating growing research interest in this interdisciplinary field. In 2021, a peak was reached, likely linked to the increasing global demand for mineral resources and the deeper application of AI technologies across industries [2,3]. Despite minor fluctuations in 2022 and 2023, the overall level remained high, and significant growth reappeared in 2024, suggesting strong development momentum for the coming years. The 2025 data also shows a continuation of this positive trend.

Behind this growth are several driving factors. First, the rapid accumulation of geological big data has provided an unprecedented data foundation for machine learning applications. The richness of traditional and emerging data sources, such as remote sensing, geophysical exploration, geochemical analysis, and drilling data, has made data-driven mineral prediction possible. Second, the significant improvement in computing power, especially the popularization of high-performance computing and cloud computing, has provided essential support for training and deploying complex machine learning models. Third, the maturity and ease of use of open-source machine learning frameworks have greatly lowered the barrier to entry for researchers. Finally, global mineral resource exploration faces challenges like deep mineral discovery, hidden mineral prediction, and identification of complex deposit types, which urgently require new technological approaches to enhance efficiency and success rates, providing broad application prospects for machine learning.

From 2016 to 2019, classic machine learning algorithms dominated mineral prediction applications. As shown in Figure 5, their applications far exceeded those of deep learning algorithms during this period. The main reasons are as follows:

Technological Maturity and Interpretability: Classical algorithms such as Support Vector Machines (SVMs), Random Forests (RFs), Logistic Regression (LR), Decision Trees (DTs), and K-Nearest Neighbors (KNNs) were relatively mature with solid theoretical foundations and numerous application cases. Their good interpretability was crucial for geologists to understand model decision-making and verify prediction rationality. Model transparency was often key for acceptance in geological surveys.

Data Availability and Scale: Geological data for model training was limited and of varying quality in early studies. Classical machine learning algorithms, with lower data requirements and the ability to handle small sample data, were better suited to the data environment at the time.

Computational Resource Constraints: Deep learning models require substantial computational resources, such as high-performance GPUs, which were not easily accessible to all research teams around 2016. In contrast, classical machine learning algorithms demanded fewer computational resources and were easier to deploy and run in conventional computing environments.

Research Paradigm Inertia: The geological community’s acceptance of machine learning developed gradually. Early researchers tended to start with familiar classical algorithms, incrementally exploring their applicability in geology.

During this phase, classical machine learning algorithms were primarily used to process structured geological data, such as geochemical element concentrations, geophysical anomalies, and drilling data, for anomaly identification, classification prediction, and potential area delineation. For example, Random Forests, with their ability to handle high-dimensional data and nonlinear relationships and their robustness to outliers and noise, were widely applied in mineral prediction. SVMs, known for their excellent performance with small samples and high-dimensional data, were also effectively used in mineral potential evaluation and exploration target selection.
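The interpretability argument above can be made concrete with a one-level decision tree (a decision stump). The following is a minimal sketch, not a method taken from any reviewed study: the sample values, the feature names `Cu_ppm` and `Zn_ppm`, and the helper `fit_stump` are all hypothetical, and only rules of the form "feature > threshold" are searched.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    feature: str
    threshold: float
    errors: int


def fit_stump(samples, labels, features):
    """Pick the single 'feature > threshold' rule with the fewest training errors."""
    best = None
    for name in features:
        values = sorted({s[name] for s in samples})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((s[name] > t) != bool(y) for s, y in zip(samples, labels))
            if best is None or errors < best.errors:
                best = Stump(name, t, errors)
    return best


# Hypothetical samples: element concentrations in ppm; label 1 = known mineralized.
samples = [
    {"Cu_ppm": 12.0, "Zn_ppm": 80.0},
    {"Cu_ppm": 18.0, "Zn_ppm": 40.0},
    {"Cu_ppm": 25.0, "Zn_ppm": 95.0},
    {"Cu_ppm": 140.0, "Zn_ppm": 60.0},
    {"Cu_ppm": 210.0, "Zn_ppm": 85.0},
    {"Cu_ppm": 330.0, "Zn_ppm": 45.0},
]
labels = [0, 0, 0, 1, 1, 1]

stump = fit_stump(samples, labels, ["Cu_ppm", "Zn_ppm"])
print(f"IF {stump.feature} > {stump.threshold} THEN prospective "
      f"(training errors: {stump.errors})")
```

The fitted model is a single readable rule that a geologist can check against domain knowledge, which is the kind of transparency the paragraph attributes to classical algorithms. Random Forests combine many deeper trees and are correspondingly harder to read directly.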

From 2020 to 2025, there has been a significant change in the application pattern of machine learning algorithms. As shown in Figure 4, the number of applications of deep learning algorithms has grown exponentially, almost reaching the same level as that of classical machine learning algorithms. This marks the transition of deep learning from an exploratory phase to a stage of widespread application in geological exploration, particularly in large-scale mineral prediction. The drivers of this shift are as follows:

The maturation and popularization of deep learning technologies: Deep learning models such as CNN, RNN, LSTM, and GAN have achieved remarkable success in fields like image recognition and natural language processing, attracting the attention of geologists. These models demonstrate unparalleled advantages in processing unstructured data (e.g., remote sensing images, geophysical profiles, geological maps) and complex spatiotemporal sequence data.

The development of geological big data: With advances in high-resolution remote sensing, 3D geophysical exploration, and high-throughput geochemical analysis, geological data has experienced explosive growth in volume and complexity. Deep learning models can automatically learn complex feature representations from massive datasets and effectively handle high-dimensional, nonlinear, and multimodal geological big data, which poses a challenge for classical machine learning algorithms.

Enhanced computing power and open-source framework support: The widespread availability of GPU computing power and the maturity of open-source deep learning frameworks like TensorFlow and PyTorch have significantly reduced the development and training thresholds for deep learning models, enabling more research teams to utilize these advanced tools.

The demand for higher predictive accuracy and automated feature extraction: Deep learning models can automatically learn deep features of the data through multi-layer neural networks. This avoids the tedious, expert-dependent feature engineering of traditional methods and thus improves both prediction accuracy and the level of automation. For instance, CNNs have shown strong capabilities in extracting mineralization alteration information from remote sensing images and identifying geophysical anomalies, while RNNs and LSTMs have unique advantages in processing sequence data such as drilling data and geochemical profiles.
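To illustrate what a single convolutional filter computes on raster data, the sketch below applies a "valid" 2-D cross-correlation (the operation used in CNN layers) to a tiny grid. The raster values and the kernel are hypothetical and set by hand. In a trained CNN the kernel weights would be learned from data, which is the automated part of feature extraction and is not shown here.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation, the operation a CNN layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x5 raster with a sharp boundary between columns 1 and 2,
# standing in for, e.g., a band-ratio image.
raster = [
    [1, 1, 5, 5, 5],
    [1, 1, 5, 5, 5],
    [1, 1, 5, 5, 5],
    [1, 1, 5, 5, 5],
]

# Hand-set vertical-edge filter; a trained CNN would learn such weights.
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(raster, edge_kernel)
for row in feature_map:
    print(row)
```

The feature map responds strongly where the window straddles the boundary and is zero over the uniform area, which is how stacked filters can pick out linear structures or contacts without hand-crafted features.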

In this stage, the application scope of deep learning algorithms has been continuously expanding. They are not only used for traditional classification and regression tasks but also extended to cutting-edge directions such as data generation (e.g., GANs for generating synthetic geological data), anomaly detection, transfer learning (applying pre-trained models to new geological areas), and few-shot learning (conducting effective predictions in data-sparse regions). Meanwhile, classical machine learning algorithms have not been completely replaced. They still play an important role in scenarios with relatively small data volumes, high requirements for model interpretability, or limited computational resources. Many studies have begun to explore the combination of classical and deep learning to leverage their respective advantages in constructing hybrid models, aiming to achieve better predictive performance.

It is worth noting that the growth in literature is not merely quantitative but also qualitative. Early research mainly focused on the initial exploration and verification of classical machine learning algorithms. Over time, more complex models such as deep learning, transfer learning, and reinforcement learning have been introduced. The depth and breadth of research have also expanded. Studies have evolved from single data sources to multi-source heterogeneous data integration, from regional predictions to global-scale ones, and from qualitative analysis to quantitative prediction. This progression reflects researchers’ deeper understanding and more effective utilization of machine learning’s potential.

Future research will likely continue to focus on building more robust and interpretable models, effectively handling uncertainties and sparsity in geological data, and closely integrating machine learning models with geological expert knowledge to form intelligent mineral prediction systems with human–machine collaboration. The sustained growth in literature indicates that machine learning will play an increasingly important role in mineral prediction, becoming a key driver of mineral exploration technology advancement.

A national affiliation analysis of the 255 studies included in this review was conducted. The results are shown in Figure 6, clearly displaying the research landscape and major contributing forces of machine learning in large-scale mineral prediction globally. The analysis indicates that a small number of countries show significant research activity in this interdisciplinary field, while others have relatively less involvement.

As observed in Figure 6 and Table 3, China has dominated the research on machine learning applications in mineral prediction, with far more publications than any other country. This dominance is closely related to China’s immense demand for mineral resource exploration, strategic national investments in geological science and artificial intelligence, and its large base of research personnel. Research institutions and universities such as the China Geological Survey, the Chinese Academy of Sciences, and China University of Geosciences have played a key role in advancing research in this field, producing a large number of high-quality academic outcomes. In addition, China’s rich geoscience data resources and diverse geological settings provide a unique advantage for training and validating machine learning models.

Following China are traditional mining giants such as Canada and Australia. These countries, with their long history of mineral exploration and abundant mineral resources, have an urgent need to improve mineral exploration efficiency and reduce costs. Consequently, they actively introduce and develop advanced technologies, including machine learning, to address the challenges of deep mineral exploration and hidden mineral prediction. Canada has strong capabilities in earth science data processing and modeling, while Australia is at the forefront of mining technology innovation and application. Both countries have also made significant progress in combining machine learning with mineral prediction.

The relatively low number of papers from the United States can be understood from the perspective of the real research ecosystem. The United States has deep expertise in geological research, but mineral prediction research is strongly driven by resource demand. In the emerging interdisciplinary niche of “machine learning + mineral prediction,” the United States may publish less than countries like China for two reasons: the maturity of its mineral resource development (some traditional minerals in the United States have reached a mature development stage, and resource demand priorities have shifted) and its research investment focus (greater attention to frontier fields such as deep space and marine exploration). India, meanwhile, has a strong demand for mineral resources (such as iron ore and coal, which support industrial development) and complex geological structures (e.g., the Himalayan orogenic belt and the Deccan Traps), which creates a practical need to improve traditional exploration efficiency through intelligent technologies. In recent years, institutions such as the Geological Survey of India and the Indian Institutes of Technology have intensified their efforts in directions like intelligent processing of low-cost geochemical exploration data and machine learning analysis of structural mineralization patterns, with resource needs propelling scientific research.

Russia is also a significant contributor in the field of machine learning for mineral prediction. Russia’s vast geological territories and complex geological formations provide a rich database for machine learning applications. The country’s expertise in geoscience, combined with its computational resources, enables the development of sophisticated models that can handle the intricacies of mineral exploration. Russia’s research often emphasizes the integration of traditional geological knowledge with cutting-edge machine learning techniques, enhancing the accuracy of mineral prediction in challenging environments. The nation’s strategic interest in resource development further drives investment in this domain, fostering innovation and collaboration among academic institutions and industry players.

Germany, the UK, and France together form a powerful trio in advancing machine learning for mineral prediction in Europe. Germany drives innovation with its strong geoscience infrastructure, creating hybrid models that merge machine learning with physical simulations. The UK utilizes its extensive historical geological data to optimize traditional exploration techniques through machine learning. France bridges fundamental and applied research with its comprehensive geological research system and cross-disciplinary collaboration. Collectively, they offer diverse yet complementary approaches, enhancing the academic rigor and practical applicability of machine learning in mineral exploration and contributing significantly to the global geoscience community.

China leads in research on machine learning for mineral prediction with 73 publications (28.63% of the total). Canada and Australia follow with 36 (14.12%) and 29 (11.37%) publications, respectively. The US and Russia contribute 27 (10.59%) and 22 (8.63%) publications, respectively. Countries like South Africa, Brazil, Finland, Iran, and India have fewer publications (11, 8, 6, 6, and 6, respectively). This distribution shows research is concentrated in major mining nations and those with high mineral demand, indicating potential for international collaboration.
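The percentages quoted above follow directly from the counts and the total of 255 papers. The short sketch below reproduces them, assuming each paper is attributed to exactly one country; the combined top-five share is a derived figure, not one reported in the review.

```python
TOTAL_PAPERS = 255

# Publication counts as reported in the text.
counts = {
    "China": 73, "Canada": 36, "Australia": 29, "United States": 27,
    "Russia": 22, "South Africa": 11, "Brazil": 8,
    "Finland": 6, "Iran": 6, "India": 6,
}

shares = {c: round(100 * n / TOTAL_PAPERS, 2) for c, n in counts.items()}

# Derived concentration measure: combined share of the five largest contributors.
top5 = sum(sorted(counts.values(), reverse=True)[:5])
top5_share = round(100 * top5 / TOTAL_PAPERS, 2)

for country, pct in shares.items():
    print(f"{country:14s} {counts[country]:3d}  {pct:5.2f}%")
print(f"Top five combined: {top5} papers ({top5_share}%)")
```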

Evolution of the Global Research Landscape: Despite the current concentration of research efforts, it is anticipated that more countries will join this field as global mineral resource demands continue to grow and artificial intelligence technologies become more widespread. Future international cooperation and data sharing will become increasingly important to jointly address the challenges of global mineral resource exploration.

Future research trends may include:

Strengthening of international cooperation: As the complexity of global mineral resource exploration increases and machine learning technologies become more universally applicable, scientific collaboration among nations will become more frequent to jointly address prediction challenges that span regions and mineral types.

Emergence of new economies: Some new economies with abundant mineral resources but currently low research output may gradually increase their investment in this field as their technological capabilities improve and their demand for mineral resource development grows, becoming new growth points for research.

Data sharing and standardization: To promote research progress globally, it will become crucial to advance the sharing and standardization of geological data and machine learning models. This will help overcome data barriers and accelerate technological innovation.

In general, countries such as China, Canada, and Australia play a leading role in the application of machine learning to large-scale mineral prediction, but research efforts globally need further expansion and synergy. International cooperation is an important driving force for the development of mineral prediction technology. Qianlong Zhang et al. (2023) systematically summarized the global cooperation network of cross-disciplinary research between geochemistry and artificial intelligence, providing an important reference for future international cooperation [78]. Strengthened international cooperation and data sharing are expected to promote technological advancements in this field and provide more effective solutions for the sustainable development of global mineral resources.

An analysis of the frequency of machine learning algorithms used in the 255 papers included in this review is presented in Figure 7. The figure shows which algorithms researchers favor in large-scale mineral prediction, along with their application characteristics and strengths. Random Forests and Decision Trees are the most widely used classical machine learning algorithms, while CNNs and RNNs dominate among deep learning algorithms.

As shown in Figure 7, Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN) are the five most frequently used algorithms in this field. Quantifying their application frequency can reveal how preferences differ across periods and tasks.

Decision Trees (DT): As the most frequently used algorithm, DT was applied 87 times from 2015 to 2019 and 27 times from 2020 to 2021. With clear rule derivation and adaptability to multi-feature hierarchical classification of geological data, DT can effectively break down complex geological associations in tasks like mineralized anomaly identification and ore-controlling element extraction. It provides highly interpretable model conclusions for mineral potential evaluation and continues to support mineral prediction analysis.

Random Forests (RF): RF stands out among mineral prediction algorithms with a high application frequency, being mentioned or used in over 100 of the reviewed studies. This is attributed to its excellent generalization ability, high-dimensional data processing capability, and noise robustness. RF performs well in dealing with multi-source heterogeneous geological data, feature importance evaluation, and integrated prediction model building. It is particularly suitable for target area selection and mineral potential evaluation tasks.

Support Vector Machines (SVM): With nearly 60 applications, SVM ranks among the most frequently used algorithms. It is popular in tasks such as mineralized anomaly identification and deposit type classification because of its excellent performance on small samples and high-dimensional data, as well as its ability to handle nonlinear relationships. Despite the rise of deep learning, SVM still has irreplaceable advantages in some specific scenarios.

Convolutional Neural Networks (CNN): The application frequency of CNN exceeds 40 times, showing its rapid rise in recent years. As a representative of deep learning, CNN has a natural advantage in processing image data such as remote sensing images and geophysical maps. It can automatically extract deep features, greatly enhancing the automation and intelligence of remote sensing alteration information extraction, geophysical anomaly identification, and structural information extraction. The fast growth in its application frequency reflects the huge potential of deep learning in geological big data processing.

Recurrent Neural Networks (RNN): With an application frequency of 34 times, RNN mainly focuses on processing sequential data. For example, in tasks such as drill hole data interpretation, geochemical profile analysis, and time-series prediction, RNN and its variants (such as LSTM) can effectively capture the spatiotemporal dependencies in data, revealing deep mineralization patterns and element migration patterns.
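The order sensitivity that makes recurrent networks suitable for drill-hole and profile data can be shown with a single recurrent unit. This is a minimal sketch with hand-picked, untrained weights (`w_x`, `w_h`, `b`) and hypothetical standardized assay values; real RNN or LSTM models have many units and learn their weights from data.

```python
import math


def rnn_final_state(sequence, w_x=0.8, w_h=0.5, b=0.0):
    """Run a one-unit Elman RNN over a sequence and return the last hidden state."""
    h = 0.0
    for x in sequence:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
    return h


# Hypothetical standardized assay values, ordered from collar to end of hole.
downhole = [0.1, 0.2, 0.1, 1.5, 1.8]

h_forward = rnn_final_state(downhole)
h_reversed = rnn_final_state(list(reversed(downhole)))

# Same values, different order: the states differ because the recurrence
# carries information from earlier depths into later ones.
print(f"forward:  {h_forward:.4f}")
print(f"reversed: {h_reversed:.4f}")
```

A model that treats the same samples as an unordered set could not distinguish the two cases, whereas the recurrent state does.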

Overall, the application frequency of classical machine learning algorithms (decision trees, RF, SVM) and deep learning algorithms (CNN, RNN) shows a neck-and-neck trend. This indicates that in mineral prediction, researchers flexibly select and combine different machine learning algorithms according to specific tasks and data characteristics. In the future, it is expected that more hybrid models and ensemble learning methods will emerge to fully utilize the advantages of different algorithms, achieving more accurate and robust mineral prediction.

The application of machine learning algorithms in mineral prediction shows a trend of diversification and specialization. Random forests and SVM, as representatives of classical algorithms, continue to be favored for their robustness and good performance with limited data. CNN and RNN, as representatives of deep learning, have rapidly risen due to their powerful automatic feature extraction ability and advantages in handling complex unstructured data. In the future, with the further development of geological big data and computing power, more advanced machine learning and deep learning algorithms will be introduced into mineral prediction. Meanwhile, methods such as multi-model integration, hybrid models, and semi-supervised learning and reinforcement learning combined with geological expert knowledge will become important research directions. It is hoped that these approaches will build smarter, more efficient, and more interpretable mineral prediction systems.

A statistical analysis was conducted on the application fields of the 255 papers included in this review, with the results shown in Figure 8. Table 4 classifies and summarizes the 255 scientific publications according to application field, machine learning method, and research method. It is important to highlight that a single publication may fall into several categories, which is why the total values do not add up to 100%. The analysis encompasses five primary topical domains: mineralization anomaly identification, mineral potential assessment, exploration target preference, tectonic ore-controlling element extraction, and mineral resource estimation. Among these, the most studied area is mineral potential assessment (175 papers), indicating the greatest research attention. This is followed by exploration target preference (137 papers) and mineralization anomaly identification (82 papers). The other fields, such as mineral resource estimation (58 papers) and tectonic ore-controlling element extraction (58 papers), though less frequent, remain important components of the research.
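The reason the field shares exceed 100% in total can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the average number of fields per paper is a derived figure, not one stated in the review.

```python
TOTAL_PAPERS = 255

# Papers per application field, as reported in the text.
field_counts = {
    "mineral potential assessment": 175,
    "exploration target preference": 137,
    "mineralization anomaly identification": 82,
    "mineral resource estimation": 58,
    "tectonic ore-controlling element extraction": 58,
}

field_shares = {f: 100 * n / TOTAL_PAPERS for f, n in field_counts.items()}

# Because one paper can be assigned to several fields, assignments exceed papers.
total_assignments = sum(field_counts.values())
fields_per_paper = total_assignments / TOTAL_PAPERS

for field, pct in field_shares.items():
    print(f"{field:45s} {pct:5.1f}%")
print(f"Sum of shares: {sum(field_shares.values()):.1f}%")
print(f"Average fields per paper: {fields_per_paper:.2f}")
```

On these counts, each paper is assigned to two application fields on average, which is why the shares are best read as coverage rates rather than as a partition of the corpus.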

From the perspective of research method classification, conceptual research forms the largest group, accounting for 63.33% of the studies (161 papers). This indicates the research community’s emphasis on developing new solutions, concepts, and models. Most of these studies are concentrated in mineral potential assessment (47 papers), exploration target area selection (46 papers), and mineralization anomaly identification (33 papers). Case studies are the second largest group (84 papers, 32.75%), mainly focusing on mineral potential assessment (26 papers) and exploration target area selection (24 papers). Literature analysis is relatively rare, appearing in only 10 papers, primarily in the field of mineral potential assessment (3 papers). No significant differences were found between machine learning methods and their application fields. The distribution of research methods across different fields shows no particular preference. Mineral potential assessment remains the dominant research area with 175 papers, where decision trees (45 papers) and random forests (42 papers) are the most commonly used methods. The prevalence of conceptual and case studies strongly confirms that this field is in a stage of vigorous development.

Figure 9 presents a heat map that clearly illustrates the relationship between methodological types and major application areas. The frequency of occurrence, indicated by color coding, allows for the quick identification of dominant links between methodologies and research topics in the field of machine learning and large-scale mineral potential prediction.

Conceptual research has a significant presence, with a large number of studies spanning almost all fields. These are most frequently found in publications on mineralization anomaly identification (58 related studies), mineral potential assessment (175 studies), exploration target selection (137 studies), and tectonic ore-controlling factor extraction (58 studies). The theoretical structures and analytical frameworks established through conceptual research are the cornerstone for subsequent studies. The ideas and models developed at this stage also lay the foundation for future experiments and practical applications.

Case studies rank second in total article count and are present across many fields, though to a lesser extent. Within these, the highest numbers are seen in exploration target selection (24 studies), mineral potential assessment (26 studies), and mineralization anomaly identification (17 studies). This distribution indicates that conceptual ideas are steadily being converted into real-world implementations and evaluated in laboratory or on-site research environments. Conversely, literature surveys appear in only a small number of publications and are confined to particular domains. For instance, the application of autoencoders is relatively low across all fields, with just 1 study in mineralization anomaly identification. This implies that the present emphasis continues to be on devising and evaluating novel solutions rather than on consolidating existing knowledge.

The data table summarizes the framework of research progression within the domain. Conceptual exploration lays the theoretical groundwork, whereas experiments and case analyses represent the subsequent phases of verifying and applying the devised methodologies in practical scenarios. This overview clarifies how diverse methodological strategies in machine learning serve particular objectives in large-scale mineral prediction.

A bibliometric examination uncovers not just a numerical increase in published works but also qualitative alterations in research methodologies and technological inclinations. A clear transition is observed from classical machine learning methods like support vector machines (with a total of 60 applications across fields) and decision trees (totaling 114) to deep learning architectures such as convolutional neural networks (CNNs with 43 applications) and long short-term memory networks (LSTMs with 18). This change reflects an increasing demand for the automated extraction and handling of intricate, high-dimensional geological data. Such data processing is of great significance for practical mineral prediction systems.

The geographic concentration of publications, particularly those from China, Australia, and Canada, shows increasing industrial involvement and investment in the digitization and intelligentization of mineral resource exploration in these countries. This indicates that the future of large-scale mineral prediction will be closely linked to countries with advanced mineral digitalization and exploration technologies, and these countries are also more likely to generate and share large-scale geological datasets.

In summary, the observed trends go beyond mere statistics. They reveal a progressive evolution in the application of machine learning to large-scale mineral prediction, moving toward higher technological levels, practical implementations, and comprehensive integration with mineral exploration engineering systems.

5. Further Development Prospects

A comprehensive review of the current literature indicates that the application of machine learning techniques in large-scale mineral prediction is transitioning from the theoretical formulation phase to the practical investigation phase. Despite the dominance of supervised learning methods like random forests and decision trees, solutions based on deep learning, unsupervised techniques, and hybrid configurations are increasing.

Over the next few years, approaches relying on deep neural networks, especially 3D-CNN and Transformer models, are expected to develop further. These methods can analyze raw geological data without manual feature engineering, making mineral prediction systems more reliable and efficient in data preparation, which is crucial in complex exploration environments [82,83]. Their core advantage lies in their ability to identify complex spatiotemporal distribution patterns, which makes them highly efficient in locating deep concealed ore bodies and mineralization anomalies. Artificial intelligence and big data technologies are reshaping the landscape of earth science research. Zhou Yongzhang et al. (2024) pointed out in Earth Science Frontiers that these technologies have brought new research ideas and methods to Earth sciences [84].

Another promising direction is the integration of diverse geological data, such as geochemical, geophysical, remote sensing, and drilling data. When combined with feature extraction and ensemble learning, these approaches demonstrate a high level of efficacy [85,86,87,88,89,90]. Integrating information from various data types enhances the understanding of how mineral systems evolve, enabling more accurate target prediction and reducing ambiguity. With the progress of exploration technologies and data collection systems, algorithms capable of combining various geological data while having low computational requirements will gain greater significance [91,92,93,94].
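One simple way to combine sources with different units is early fusion: standardize each layer, then stack the values into one feature vector per grid cell. The sketch below is illustrative only. The layer names and values are hypothetical, the layers are assumed to be already co-registered on the same grid, and the cited studies may use quite different integration schemes.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one layer so sources with different units become comparable."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical co-registered layers, one value per grid cell.
layers = {
    "geochem_Cu_ppm": [20.0, 35.0, 150.0, 400.0],
    "magnetic_anomaly_nT": [-50.0, 10.0, 120.0, 300.0],
    "alteration_index": [0.1, 0.2, 0.6, 0.9],
}

standardized = {name: zscore(vals) for name, vals in layers.items()}

# Early fusion: stack the standardized layers into one feature vector per cell.
feature_vectors = [list(cell) for cell in zip(*standardized.values())]

for i, vec in enumerate(feature_vectors):
    print(i, [round(v, 2) for v in vec])
```

The resulting vectors can be passed to any classifier. Standardizing first keeps a layer measured in hundreds of nanotesla from dominating one measured on a 0 to 1 scale.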

Within the framework of mineral resource potential assessment, predictive modeling is gaining attention. The main challenge is not only to identify mineralization anomalies but also to estimate resource sizes and predict mineral spatial distribution. Studies indicate that combining time-series geochemical data with predictive algorithms can significantly improve resource evaluation accuracy and reduce exploration risks [95].

Motivated by the core technologies of Industry 4.0—ubiquitous IoT sensors, high-performance cloud computing, edge analytics and AI-driven modelling—geoscientific exploration is shifting from stand-alone, offline interpretations to an integrated, real-time digital paradigm. Within this new framework, geological, geochemical and geophysical observations are continuously streamed to cloud repositories, automatically updated in 3-D voxel models and instantly interrogated by machine-learning algorithms to identify subtle mineralization signatures [96]. Integrating mineral prediction models with cloud platforms, Internet of Things (IoT) sensors, 3D geological modeling software, and edge computing is crucial for building modern, scalable exploration decision-support systems. This drives the development of lightweight, noise-resistant models that can operate in real-time on-site, even on resource-constrained mobile devices.

Greater attention should be given to the need for geological data standardization and the availability of open datasets. This will facilitate better model benchmarking and objective performance evaluation, and it will accelerate progress in the field. At the same time, quality metrics are growing in importance, covering not only predictive accuracy but also the geological interpretability of the model.
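Benchmarking on a shared open dataset amounts to scoring different models against the same held-out labels with the same metric. As one possible illustration, the sketch below computes the area under the ROC curve (AUC) from its pairwise definition. The scores, labels, and the two "models" are hypothetical, and AUC is only one of several metrics that could serve this purpose.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical prospectivity scores for held-out cells; 1 = known deposit.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
model_b = [0.6, 0.5, 0.3, 0.7, 0.6, 0.4, 0.2, 0.1]

auc_a = roc_auc(model_a, labels)
auc_b = roc_auc(model_b, labels)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

Because both models are scored on identical cells and labels, the comparison is objective in the sense the paragraph calls for. Geological interpretability still has to be judged separately.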

The future development of machine learning in large-scale mineral prediction will revolve around enhancing the self-sufficiency of systems, improving their adaptability to complex geological environments, achieving more seamless integration with digital geological ecosystems, and increasing the visibility of model behavior. These improvements will help build geologists’ confidence in the results generated by algorithms and further unlock the exploration potential of the technology.

It is necessary to recognize that this review comes with several limitations. To begin with, the scope of the analysis was confined to English-language publications listed in the Web of Science. This limitation might have led to the omission of relevant research from other databases or geographical areas. Secondly, although an examination of methodological and performance trends was carried out, no formal meta-analysis or tests for statistical significance were performed. Third, the review primarily focused on machine learning techniques and did not cover other prediction methods, such as expert systems based on mineralization theories or physical simulation models, which remain valuable in certain applications. These constraints underscore the necessity for prudence when generalizing the results and indicate future avenues for research. Such research could encompass more comprehensive cross-database evaluations and comparative analyses between machine learning (ML) and non-machine learning mineral prediction approaches.

6. Conclusions

From 2016 to 2025, machine learning applications in large-scale mineral prediction showed significant progress, particularly in mineralization anomaly identification, resource potential assessment, and exploration target selection. After reviewing 255 publications, it was found that research activity increased notably after 2020, with a 65.63% rise in studies published between 2020 and 2025.

The areas attracting the most research interest included mineralization anomaly identification (32.16% of publications), mineral potential assessment (68.6%), and exploration target selection (53.7%). Among the machine learning methods used, classical algorithms—especially decision trees and random forests (RF)—dominated, accounting for nearly 65.5% of all applications. Meanwhile, the popularity of deep learning technologies, particularly CNNs and RNNs, has grown significantly. These are most commonly applied in mineralization anomaly identification, 3D geological modeling, and mineralization time-series analysis.

Although less frequently used, methods such as autoencoders, LSTMs, and K-means clustering are gaining popularity. They typically serve exploratory or complementary roles to mainstream approaches.

From a methodological viewpoint, conceptual research clearly predominates, accounting for 63.33% of publications. This indicates that the field remains heavily reliant on theoretical foundations and model development. Nonetheless, the proportion of case studies has risen substantially (to 32.75%), suggesting a gradual shift from theory to practical exploration, particularly for porphyry copper, sandstone uranium, and volcanic-associated gold deposits, where case studies serve as a bridge between theory and practice. Literature reviews, however, account for only a small fraction of the analyzed work. This may stem from the field’s novelty and rapid evolution, with the focus still on generating novel solutions.

Geographically, China and Canada lead in research activity, together accounting for a significant share of the global total. China’s output growth is especially noteworthy, with a substantial increase during 2020–2025 compared with the preceding period. In recent years, new research centers have emerged in countries such as Australia, the United States, Russia, and South Africa, reflecting the increasing globalization of research in this domain.

To conclude, the literature reviewed illustrates the growing significance of machine learning in mineral prediction. The current focus is primarily on mineralization anomaly identification, resource assessment, and target area selection. Classical methods such as SVM and decision trees remain fundamental to many solutions, but interest in deep learning is rising. Although most works are conceptual in nature, a growing number of contributions are grounded in experiments and case studies, a trend that indicates the field’s progression towards maturity.

Over the next few years, research should not only extend machine learning algorithms and their applications but also work towards standardizing geological data, establishing open benchmarks, and releasing more real-world exploration datasets. Crucial future directions include transfer learning, unsupervised learning, and the fusion of data from multiple geological sources. These efforts can greatly enhance the efficiency and reliability of contemporary mineral prediction systems.

In later phases of this study, the authors plan to extend the analysis to publications indexed in other well-regarded scientific databases, such as Scopus, CNKI, and Engineering Village. This will enrich the current review with additional sources and offer a more comprehensive perspective on global trends in the application of machine learning to large-scale mineral prediction. Incorporating a broader range of databases will also enable comparisons of research methodologies across scientific communities and help identify under-investigated yet promising thematic areas. Additionally, the authors intend to examine specific deposit types in depth, such as porphyry copper deposits, lithium clay deposits, and volcanogenic massive sulfide (VMS) deposits, using bibliometric and network analysis methods.

This review focuses on the machine learning methods most widely used in large-scale mineral exploration prediction over the past decade. However, emerging paradigms such as Transformer models, graph neural networks (GNNs), and reinforcement learning (RL) are attracting growing attention from the academic community. Through innovative modeling approaches, these technologies can handle time-series, spatial, and unlabeled data, providing important support for future mineral exploration prediction systems. For example, Transformers show potential for handling sequential data in mineral exploration, and graph neural networks can model complex geological relationships. Reinforcement learning methods, such as deep Q-networks (DQN) and policy gradient algorithms, are well suited to scenarios requiring dynamic decision optimization, a common challenge in mineral exploration. Specifically, in real-time exploration route planning or adaptive adjustment of geophysical survey parameters, reinforcement learning can continuously optimize decision strategies based on feedback from the geological environment, potentially improving the efficiency of on-site exploration and the accuracy of target area positioning.

Our review nevertheless concentrates on relatively mature deep learning methods. The application of large language models (LLMs) to mineral prospectivity prediction is still in its infancy, with few relevant studies, so focusing on mature methods is currently likely to yield more valuable results for the field. We will follow the application of large language models in this field closely and plan to explore it more comprehensively in subsequent work.
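The reinforcement learning idea discussed above, in which an agent refines its survey decisions from feedback, can be illustrated with a minimal tabular Q-learning sketch. Everything in it is a hypothetical toy: a survey line of five stations, a reward at the last station standing in for a detected anomaly, a small cost per move, and hand-picked learning parameters. It is not taken from any of the reviewed studies, and it uses a lookup table rather than the deep Q-networks named in the text.

```python
import random

N_STATIONS = 5            # hypothetical survey line with five stations
TARGET = N_STATIONS - 1   # station assumed to host the anomaly
ACTIONS = (-1, +1)        # move one station left or right


def step(state, action):
    """Move along the line; reward 1 at the target, small cost otherwise."""
    nxt = min(max(state + action, 0), N_STATIONS - 1)
    reward = 1.0 if nxt == TARGET else -0.01
    return nxt, reward, nxt == TARGET


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATIONS)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on moves per episode
            if rng.random() < epsilon:
                a = rng.randrange(2)                           # explore
            else:
                a = max(range(2), key=lambda i: q[state][i])   # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # One-step Q-learning target: reward plus discounted best next value.
            best_next = 0.0 if done else max(q[nxt])
            q[state][a] += alpha * (reward + gamma * best_next - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Preferred move at each non-target station."""
    return ["left" if row[0] > row[1] else "right" for row in q[:TARGET]]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy moves towards the rewarded station from every starting position. A realistic application would replace the five-station line with a spatial survey grid and the fixed reward with measured geophysical or geochemical responses.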

Author Contributions

Conceptualization, Z.F. and X.Z.; methodology, Z.F., X.Z. and Y.Y.; software, F.Z.; validation, X.X.; formal analysis, Z.F. and F.Z.; investigation, X.L.; resources, X.L.; data curation, X.L.; writing—original draft preparation, Z.F. and X.Z.; final writing—review and editing, Z.F., X.Z., Y.Y. and Q.Z.; visualization, X.L.; supervision, Q.Z.; project administration, Q.Z. and W.M.; funding acquisition, X.Z., Y.Y. and W.M. All authors have read and agreed to the published version of the manuscript.

Funding

The article processing charge (APC) for this paper was funded by the Yunnan Province New Round of Mineral Exploration Breakthrough Strategic Action Technology Research and Development Project (Y202502, Y202408).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

Author Xiao Li was employed by the company Shandong Gold Group. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Yang, S.; Yang, W.; Cui, T.; Zhang, M. Prediction and practical application of bauxite mineralization in Wuzhengdao area, Guizhou, China. PLoS ONE 2024, 19, e0305917. [Google Scholar] [CrossRef]
Sun, T.; Li, H.; Wu, K.; Chen, F.; Zhu, Z.; Hu, Z. Data-driven predictive modelling of mineral prospectivity using machine learning and deep learning methods: A case study from southern Jiangxi Province, China. Minerals 2020, 10, 102. [Google Scholar] [CrossRef]
Qian, K.R.; He, Z.L.; Liu, X.W.; Chen, Y.Q. Intelligent prediction and integral analysis of shale oil and gas sweet spots. Pet. Sci. 2018, 15, 744–755. [Google Scholar] [CrossRef]
Sabah, M.; Talebkeikhah, M.; Wood, D.A.; Khosravanian, R.; Anemangely, M.; Younesi, A. A machine learning approach to predict drilling rate using petrophysical and mud logging data. Earth Sci. Inform. 2019, 12, 319–339. [Google Scholar] [CrossRef]
Xiong, Y.; Zuo, R.; Carranza, E.J.M. Mapping mineral prospectivity through big data analytics and a deep learning algorithm. Ore Geol. Rev. 2018, 102, 811–817. [Google Scholar] [CrossRef]
He, H.; Zhu, H.; Yang, X.; Zhang, W.; Wang, J. Mineral prospectivity prediction based on convolutional neural network and ensemble learning. Sci. Rep. 2024, 14, 22654. [Google Scholar] [CrossRef] [PubMed]
Hajihosseinlou, M.; Maghsoudi, A.; Ghezelbash, R. Stacking: A novel data-driven ensemble machine learning strategy for prediction and mapping of Pb-Zn prospectivity in Varcheh district, west Iran. Expert Syst. Appl. 2024, 237, 121668. [Google Scholar] [CrossRef]
Chen, J.; Zhao, Z.; Yang, Y.; Li, C.; Yin, Y.; Zhao, X.; Zhao, N.; Tian, J.; Li, H. Metallogenic prediction based on fractal theory and machine learning in Duobaoshan Area, Heilongjiang Province. Ore Geol. Rev. 2024, 168, 106030. [Google Scholar] [CrossRef]
Li, Y.; Liu, H.; Jiao, P.; Wang, Q.; Liu, D.; Ma, L.; Wang, Z.; Peng, H. Machine-Learning-Assisted Identification of Steam Channeling after Cyclic Steam Stimulation in Heavy-Oil Reservoirs. Geofluids 2023, 2023, 6593464. [Google Scholar] [CrossRef]
Furtney, J.K.; Thielsen, C.; Fu, W.; Le Goc, R. Surrogate models in rock and soil mechanics: Integrating numerical modeling and machine learning. Rock Mech. Rock Eng. 2022, 55, 2845–2859. [Google Scholar] [CrossRef]
Dumakor-Dupey, N.K.; Arya, S. Machine learning—A review of applications in mineral resource estimation. Energies 2021, 14, 4079. [Google Scholar] [CrossRef]
Zhou, Y.; Zhang, L.; Zhang, A.; Wang, J. Big Data Mining and Machine Learning in Earth Sciences; Sun Yat-sen University Press: Guangzhou, China, 2018; pp. 1–269. [Google Scholar]
Karpatne, A.; Ebert-Uphoff, I.; Ravela, S.; Babaie, H.A.; Kumar, V. Machine learning for the geosciences: Challenges and opportunities. IEEE Trans. Knowl. Data Eng. 2018, 31, 1544–1554. [Google Scholar] [CrossRef]
Yaworsky, P.M.; Vernon, K.B.; Spangler, J.D.; Brewer, S.C.; Codding, B.F. Advancing predictive modeling in archaeology: An evaluation of regression and machine learning methods on the Grand Staircase-Escalante National Monument. PLoS ONE 2020, 15, e0239424. [Google Scholar] [CrossRef]
He, B.; Chen, J.; Chen, C. Mineral prospectivity mapping method integrating multi-sources geology spatial data sets and case-based reasoning. J. Geogr. Inf. Syst. 2012, 4, 77–85. [Google Scholar] [CrossRef]
Wang, M.; Zhang, R.; Yan, B.; Song, C.; Lv, Y.; Zhao, H. Prediction of Soil Pollution Risk Based on Machine Learning and SHAP Interpretable Models in the Nansi Lake, China. Toxics 2025, 13, 278. [Google Scholar] [CrossRef]
Silva dos Santos, V.; Gloaguen, E.; Hector Abud Louro, V.; Blouin, M. Machine learning methods for quantifying uncertainty in prospectivity mapping of magmatic-hydrothermal gold deposits: A case study from Juruena mineral province, northern Mato Grosso, Brazil. Minerals 2022, 12, 941. [Google Scholar] [CrossRef]
Lauzon, D.; Gloaguen, E. Quantifying uncertainty and improving prospectivity mapping in mineral belts using transfer learning and Random Forest: A case study of copper mineralization in the Superior Craton Province, Quebec, Canada. Ore Geol. Rev. 2024, 166, 105918. [Google Scholar] [CrossRef]
Battalgazy, N.; Valenta, R.; Gow, P.; Spier, C.; Forbes, G. Addressing geological challenges in mineral resource estimation: A comparative study of deep learning and traditional techniques. Minerals 2023, 13, 982. [Google Scholar] [CrossRef]
Dada, M.A.; Oliha, J.S.; Majemite, M.T.; Obaigbena, A.; Biu, P.W. A review of predictive analytics in the exploration and management of us geological resources. Eng. Sci. Technol. J. 2024, 5, 313–337. [Google Scholar] [CrossRef]
Zeng, C.; Kalam, S.; Zhang, H.; Wang, L.; Luo, Y.; Wang, H.; Mu, Z.; Arif, M. Predicting absolute adsorption of CO2 on Jurassic shale using machine learning. Fuel 2025, 381, 133050. [Google Scholar] [CrossRef]
Li, S.; Chen, J.; Liu, C. Overview on the development of intelligent methods for mineral resource prediction under the background of geological big data. Minerals 2022, 12, 616. [Google Scholar] [CrossRef]
Cao, X.; Liu, Z.; Hu, C.; Song, X.; Quaye, J.A.; Lu, N. Three-dimensional geological modelling in earth science research: An in-depth review and perspective analysis. Minerals 2024, 14, 686. [Google Scholar] [CrossRef]
Iglesias, C.; Antunes, I.M.H.R.; Albuquerque, M.T.D.; Martínez, J.; Taboada, J. Predicting ore content throughout a machine learning procedure–An Sn-W enrichment case study. J. Geochem. Explor. 2020, 208. [Google Scholar] [CrossRef]
Liu, Y.; Carranza, E.J.M.; Xia, Q. Developments in quantitative assessment and modeling of mineral resource potential: An overview. Nat. Resour. Res. 2022, 31, 1825–1840. [Google Scholar] [CrossRef]
Dong, Z.; Tang, P.; Chen, G.; Yin, S. Synergistic application of digital outcrop characterization techniques and deep learning algorithms in geological exploration. Sci. Rep. 2024, 14, 22948. [Google Scholar] [CrossRef]
Liu, L.; Li, T.; Ma, C. Research on 3D geological modeling method based on deep neural networks for drilling data. Appl. Sci. 2024, 14, 423. [Google Scholar] [CrossRef]
Zuo, R.; Xu, Y. Graph deep learning model for mapping mineral prospectivity. Math. Geosci. 2023, 55, 1–21. [Google Scholar] [CrossRef]
Wu, Y.; Liu, B.; Gao, Y.; Li, C.; Tang, R.; Kong, Y.; Xie, M.; Li, K.; Dan, S.; Qi, K.; et al. Mineral prospecting mapping with conditional generative adversarial network augmented data. Ore Geol. Rev. 2023, 163, 105787. [Google Scholar] [CrossRef]
Harris, J.R.; Wilkinson, L.; Grunsky, E.C. Effective use and interpretation of lithogeochemical data in regional mineral exploration programs: Application of Geographic Information Systems (GIS) technology. Ore Geol. Rev. 2000, 16, 107–143. [Google Scholar] [CrossRef]
Khalifa, H.; Tomomewo, O.S.; Ndulue, U.F.; Berrehal, B.E. Machine learning-based real-time prediction of formation lithology and tops using drilling parameters with a Web App integration. Eng 2023, 4, 2443–2467. [Google Scholar] [CrossRef]
Jiang, S.; Hartley, R.; Fernando, B. Kernel support vector machines and convolutional neural networks. In Proceedings of the 2018 Digital Image Computing: Techniques and Applications (DICTA), Canberra, Australia, 10–13 December 2018; IEEE: New York, NY, USA, 2018; pp. 1–7. [Google Scholar]
Liu, Y.P.; Zhu, L.X.; Zhou, Y.Z. Application of Convolutional Neural Network in Prospecting Prediction of Ore Deposits—Taking the Zhaojikou Pb-Zn Ore Deposit in Anhui Province as a Case. Acta Petrol. Sin. 2018, 34, 3217–3224. [Google Scholar]
Carranza, E.J.M.; Laborte, A.G. Data-driven predictive modeling of mineral prospectivity using random forests: A case study in Catanduanes Island (Philippines). Nat. Resour. Res. 2016, 25, 35–50. [Google Scholar] [CrossRef]
Chen, Y.; Wu, W. Mapping mineral prospectivity using an extreme learning machine regression. Ore Geol. Rev. 2017, 80, 200–213. [Google Scholar] [CrossRef]
Zuo, R.; Xiong, Y.; Wang, Z.; Wang, J.; Kreuzer, O.P. A new generation of artificial intelligence algorithms for mineral prospectivity mapping. Nat. Resour. Res. 2023, 32, 1859–1869. [Google Scholar] [CrossRef]
Xu, S.T.; Zhou, Y.Z. Artificial intelligence identification of ore minerals under microscope based on Deep Learning algorithm. Acta Petrol. Sin. 2018, 34, 3244–3252. [Google Scholar]
Liu, Z.; Yu, S.; Deng, H.; Jiang, G.; Wang, R.; Yang, X.; Song, J.; Chen, J.; Mao, X. 3D mineral prospectivity modeling in the Sanshandao goldfield, China using the convolutional neural network with attention mechanism. Ore Geol. Rev. 2024, 164, 105861. [Google Scholar] [CrossRef]
Shabankareh, M.; Hezarkhani, A. Application of support vector machines for copper potential mapping in Kerman region, Iran. J. Afr. Earth Sci. 2017, 128, 116–126. [Google Scholar] [CrossRef]
Zuo, R.; Xu, Y. A physically constrained hybrid deep learning model to mine a geochemical data cube in support of mineral exploration. Comput. Geosci. 2024, 182, 105490. [Google Scholar] [CrossRef]
Yang, F.; Zuo, R.; Kreuzer, O.P. Artificial intelligence for mineral exploration: A review and perspectives on future directions from data science. Earth-Sci. Rev. 2024, 258, 104941. [Google Scholar] [CrossRef]
Farahnakian, F.; Sheikh, J.; Zelioli, L.; Nidhi, D.; Seppä, I.; Ilo, R.; Nevalainen, P.; Heikkonen, J. Addressing imbalanced data for machine learning based mineral prospectivity mapping. Ore Geol. Rev. 2024, 174, 106270. [Google Scholar] [CrossRef]
Zhang, Z.; Wang, G.; Carranza, E.J.M.; Fan, J.; Liu, X.; Zhang, X.; Dong, Y.; Chang, X.; Sha, D. An integrated framework for data-driven mineral prospectivity mapping using bagging-based positive-unlabeled learning and Bayesian cost-sensitive logistic regression. Nat. Resour. Res. 2022, 31, 3041–3060. [Google Scholar] [CrossRef]
Chen, Y.; Zhao, Q. Mineral exploration targeting by combination of recursive indicator elimination with the ℓ2-regularization logistic regression based on geochemical data. Ore Geol. Rev. 2021, 135, 104213. [Google Scholar] [CrossRef]
Meng, F.; Li, X.; Chen, Y.; Ye, R.; Yuan, F. Three-Dimensional Mineral Prospectivity Modeling for Delineation of Deep-Seated Skarn-Type Mineralization in Xuancheng–Magushan Area, China. Minerals 2022, 12, 1174. [Google Scholar] [CrossRef]
Shi, Z.; Zuo, R.; Zhou, B. Deep reinforcement learning for mineral prospectivity mapping. Math. Geosci. 2023, 55, 773–797. [Google Scholar] [CrossRef]
Zuo, R.; Shi, L.; Yang, F.; Xu, Y.; Xiong, Y. ArcMPM: An ArcEngine-based software for mineral prospectivity mapping via artificial intelligence algorithms. Nat. Resour. Res. 2024, 33, 1–21. [Google Scholar] [CrossRef]
Hou, W. 3D/4D Geological Modeling for Mineral Exploration; MDPI-Multidisciplinary Digital Publishing Institute: Basel, Switzerland, 2023. [Google Scholar]
Deng, H.; Zheng, Y.; Chen, J.; Yu, S.; Xiao, K.; Mao, X. Learning 3D mineral prospectivity from 3D geological models using convolutional neural networks: Application to a structure-controlled hydrothermal gold deposit. Comput. Geosci. 2022, 161, 105074. [Google Scholar] [CrossRef]
Zhang, Z.; Dong, Y.; Du, X.; Qi, K.; Xia, Y.; Sun, F.; Li, G. Application of High-Precision Magnetic Measurement in the Exploration of Deep Fluorite Deposits in Ore Concentrations. Minerals 2025, 15, 351. [Google Scholar] [CrossRef]
Zhou, Y.; Zhang, Q.; Huang, Y.; Yang, W.; Xiao, F.; Ji, J.; Han, F.; Tang, L.; Ouyang, C.; Shen, W. Construction and Application Prospect of Porphyry Copper Mine Knowledge Map in Qin-Hang Metallogenic Belt Earth. Sci. Front. 2021, 28, 67–75. [Google Scholar]
Schiller, J.; Stiller, S.; Ryo, M. Artificial intelligence in environmental and Earth system sciences: Explainability and trustworthiness. Artif. Intell. Rev. 2025, 58, 316. [Google Scholar] [CrossRef]
Zheng, X.; Meng, H.; Zhao, Z.; Liu, X.; Zhou, L.; Grieneisen, M.L.; Zhang, H.; Zhan, Y.; Yang, F. Deep transfer learning for spatiotemporal mapping of PM2. 5 nitrate across China: Addressing small data challenges in environmental machine learning. J. Hazard. Mater. 2025, 492, 138206. [Google Scholar] [CrossRef]
Luo, T.; Zhou, Z.; Tang, L.; Gong, H.; Liu, B. Identification of Geochemical Anomalies Using a Memory-Augmented Autoencoder Model with Geological Constraint. Nat. Resour. Res. 2025, 34, 23–40. [Google Scholar] [CrossRef]
Xiao, K.; Xu, Y.; Yang, Y.; Hu, X.; Luo, Q.; Duan, Z.; Jiao, C.; Chen, M.; Yin, D. Study on Logging Identification of Sandstone-Type Uranium Deposits Based on Ensemble Learning in the Songliao Basin in Northeast China. Nucl. Sci. Eng. 2025, 199, 1246–1262. [Google Scholar] [CrossRef]
Zhou, Y.; Zuo, R.; Liu, G.; Yuan, F.; Mao, X.; Guo, Y.; Xiao, F.; Liao, J.; Liu, Y. A Decade of Leapfrog Development in Mathematical Geoscience: Big Data and Artificial Intelligence Algorithms are Transforming Geology. Bull. Mineral. Petrol. Geochem. 2021, 40, 556–573. [Google Scholar]
Zhang, N.; Lv, C.; Li, Y.; Panagos, P.; Ballabio, C.; Man, J.; Gu, X.; Zhao, F.-J.; Wang, P.; Liu, X.; et al. Geochemical-integrated machine learning approach predicts the distribution of cadmium speciation in European and Chinese topsoils. Commun. Earth Environ. 2025, 6, 548. [Google Scholar] [CrossRef]
Lou, Y.; Liu, Y. Mineral Prospectivity Mapping Based on a Novel Self-Ensembling Graph Convolutional Network. Math. Geosci. 2025, 57, 629–656. [Google Scholar] [CrossRef]
Lindi, O.T.; Aladejare, A.E.; Ozoji, T.M. Uncertainty quantification in mineral resource estimation. Nat. Resour. Res. 2024, 33, 2503–2526. [Google Scholar] [CrossRef]
Zhou, F.; Liu, L. Machine Learning Prediction of Deep Potential Ores and its Explanation Based on Integration of 3D Geological Model and Numerical Dynamics Simulation: An Example from Dongguashan Orefield, Tongling Copper District, China. Nat. Resour. Res. 2025, 34, 121–147. [Google Scholar] [CrossRef]
Kraipeerapun, P.; Fung, C.C.; Brown, W. Assessment of uncertainty in mineral prospectivity prediction using interval neutrosophic set. In Proceedings of the International Conference on Computational and Information Science, Berlin, Germany, 24–25 July 2025; Springer: Berlin/Heidelberg, Germany, 2005; pp. 1074–1079. [Google Scholar]
Fan, M.; Xiao, K.; Sun, L.; Xu, Y. Metallogenic prediction based on geological-model driven and data-driven multisource information fusion: A case study of gold deposits in Xiong’ershan area, Henan Province, China. Ore Geol. Rev. 2023, 156, 105390. [Google Scholar] [CrossRef]
Zhao, C.; Zhao, J.; Wang, W.; Yuan, C.; Tang, J. A novel hybrid ensemble model for mineral prospectivity prediction: A case study in the Malipo W-Sn mineral district, Yunnan Province, China. Ore Geol. Rev. 2024, 168, 106001. [Google Scholar] [CrossRef]
Bond, C.E. Uncertainty in structural interpretation: Lessons to be learnt. J. Struct. Geol. 2015, 74, 185–200. [Google Scholar] [CrossRef]
Long, M.; Cao, Y.; Wang, J. Learning transferable features with deep adaptation networks. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; PMLR: Geelong, VIC, Australian, 2015; pp. 97–105. [Google Scholar]
Li, D.; Yang, Y.; Song, Y.Z.; Hospedales, T. Learning to generalize: Meta-learning for domain generalization. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
Zheng, J.; Wu, M.; Yaseen, Z.M.; Qi, C. Machine learning models for occurrence form prediction of heavy metals in tailings. Int. J. Min. Reclam. Environ. 2023, 37, 978–995. [Google Scholar] [CrossRef]
Xiong, Y.; Zuo, R. A positive and unlabeled learning algorithm for mineral prospectivity mapping. Comput. Geosci. 2021, 147, 104667. [Google Scholar] [CrossRef]
Zhang, A.; Zhao, Y.; Li, X.; Fan, X.; Ren, X.; Li, Q.; Yue, L. Development of a Hybrid AI Model for Fault Prediction in Rod Pumping System for Petroleum Well Production. Energies 2024, 17, 5422. [Google Scholar] [CrossRef]
Rodriguez-Galiano, V.; Sanchez-Castillo, M.; Chica-Olmo, M. Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines. Ore Geol. Rev. 2015, 71, 804–818. [Google Scholar] [CrossRef]
Liu, L.; Cao, W.; Liu, H.; Ord, A.; Qin, Y.; Zhou, F.; Bi, C. Applying benefits and avoiding pitfalls of 3D computational modeling-based machine learning prediction for exploration targeting: Lessons from two mines in the Tongling-Anqing district, eastern China. Ore Geol. Rev. 2022, 142, 104712. [Google Scholar] [CrossRef]
Cui, C.; Qian, Y.; Wu, Z.; Lu, S.; He, J. Forecasting of oil production driven by reservoir spatial–temporal data based on normalized mutual information and Seq2Seq-LSTM. Energy Explor. Exploit. 2024, 42, 444–461. [Google Scholar] [CrossRef]
Ye, X.; Liu, Y.; Huang, T.; Chen, T.; Liu, C.; Liu, S.; Jin, S. Machine Learning-Based Mineral Quantification from Lower Cambrian Shale in the Sichuan Basin: Implications for Reservoir Quality. Minerals 2025, 15, 286. [Google Scholar] [CrossRef]
Mou, N.; Carranza, E.J.M.; Xue, J.; Zhang, S.; Wang, G.; Song, H.; Chen, Y.; Ren, X. Interpretable machine learning for mineral prospectivity mapping in the Qulong–Jiama district, Tibet, China. Ore Geol. Rev. 2025, 182, 106659. [Google Scholar] [CrossRef]
Cheng, Q. Sequential weights of evidence as a machine learning model for mineral deposits prediction. In Mathematics of Planet Earth: Proceedings of the 15th Annual Conference of the International Association for Mathematical Geosciences, Madrid, Spain, 2–6 September 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 157–161. [Google Scholar]
He, L.; Zhou, Y.; Zhang, C. Application of Target Detection Based on Deep Learning in Intelligent Mineral Identification. Minerals 2024, 14, 873. [Google Scholar] [CrossRef]
Yu, X.; Xiao, F.; Zhou, Y.; Wang Wang, K. Application of hierarchical clustering, singularity mapping, and Kohonen neural network to identify Ag-Au-Pb-Zn polymetallic mineralization associated geochemical anomaly in Pangxidong district. J. Geochem. Explor. 2019, 203, 87–95. [Google Scholar] [CrossRef]
Zhang, Q.; Zhou, Y.; He, J.; Zhu, B.; Han, F.; Long, S. A review on global cooperation network in the interdisciplinary research of geochemistry combined with artificial intelligence. Minerals 2023, 13, 1332. [Google Scholar] [CrossRef]
Kim, S.; Kim, K.H.; Lim, J.T. Synergistic enhancement of productivity prediction using machine learning and integrated data from six shale basins of the USA. Geoenergy Sci. Eng. 2023, 229, 212068. [Google Scholar] [CrossRef]
Yu, Z.; Liu, B.; Xie, M.; Wu, Y.; Kong, Y.; Li, C.; Chen, G.; Gao, Y.; Zha, S.; Zhang, H.; et al. 3D mineral prospectivity mapping of Zaozigou gold deposit, West Qinling, China: Deep learning-based mineral prediction. Minerals 2022, 12, 1382. [Google Scholar] [CrossRef]
Zhang, H.; Xie, M.; Dan, S.; Li, M.; Li, Y.; Yang, D.; Wang, Y. Optimization of Feature Selection in Mineral Prospectivity Using Ensemble Learning. Minerals 2024, 14, 970. [Google Scholar] [CrossRef]
Houran, N.; Raoui, H.A.; Manaan, M.; Aabi, A.; Simou, M.R.; Rhinane, H. Using Gis Data and Machine Learning for Mineral Mapping. Study Case, Bou Skour Eastern Anti-Atlas, Morocco. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, XLVIII-4/W6-2022, 423–430. [Google Scholar]
Nouha, M. Machine Learning Predictive Models for Phosphate Exploration. In Proceedings of the Igarss 2023–2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 16–21 July 2023; IEEE: New York, NY, USA, 2023; pp. 3696–3699. [Google Scholar]
Zhou, Y.; Xiao, F. Overview: A glimpse of the latest advances in artificial intelligence and big data geoscience research. Earth Sci. Front. 2024, 31, 1–6. [Google Scholar] [CrossRef]
Josso, P.; Hall, A.; Williams, C.; Le Bas, T.; Lusty, P.; Murton, B. Application of random-forest machine learning algorithm for mineral predictive mapping of Fe-Mn crusts in the World Ocean. Ore Geol. Rev. 2023, 162, 105671. [Google Scholar] [CrossRef]
Chen, W.; Ma, X.; Wang, Z.; Li, W.; Fan, C.; Zhang, J.; Que, X.; Li, C. Exploring neuro-symbolic AI applications in geoscience: Implications and future directions for mineral prediction. Earth Sci. Inform. 2024, 17, 1819–1835. [Google Scholar] [CrossRef]
Li, Q.; Chen, G.; Wang, D. Mineral Prospectivity Mapping Using Semi-supervised Machine Learning. Math. Geosci. 2025, 57, 275–305. [Google Scholar] [CrossRef]
Park, S.Y.; Son, B.K.; Choi, J.; Jin, H.; Lee, K. Application of machine learning to quantification of mineral composition on gas hydrate-bearing sediments, Ulleung Basin, Korea. J. Pet. Sci. Eng. 2022, 209, 109840. [Google Scholar] [CrossRef]
Xiong, Y.; Zuo, R. Recognizing multivariate geochemical anomalies for mineral exploration by combining deep learning and one-class support vector machine. Comput. Geosci. 2020, 140, 104484. [Google Scholar] [CrossRef]
Wang, K.; Zheng, X.; Wang, G.; Liu, D.; Cui, N. A multi-model ensemble approach for gold mineral prospectivity mapping: A case study on the Beishan region, western China. Minerals 2020, 10, 1126. [Google Scholar] [CrossRef]
Yang, Y.; Cen, X.; Ni, H.; Liu, Y.; Chen, Z.J.; Yang, J.; Hong, B. A highly accurate and robust prediction framework for drilling rate of penetration based on machine learning ensemble algorithm. Geoenergy Sci. Eng. 2025, 244, 213423. [Google Scholar] [CrossRef]
Wang, L.; Yang, J.; Wu, S.; Hu, L.; Ge, Y.; Du, Z. Enhancing mineral prospectivity mapping with geospatial artificial intelligence: A geographically neural network-weighted logistic regression approach. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103746. [Google Scholar] [CrossRef]
Carvalho, M.; Azzalini, A.; Cardoso-Fernandes, J. Multi-sensor approach for cobalt exploration in Asturias (Spain) using machine learning algorithms. In Proceedings of the IGARSS 2024–2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; IEEE: New York, NY, USA, 2024; pp. 2122–2126. [Google Scholar]
Zhang, S.; Carranza, E.J.M.; Fu, C.; Zhang, W.; Qin, X. Interpretable Machine Learning for Geochemical Anomaly Delineation in the Yuanbo Nang District, Gansu Province, China. Minerals 2024, 14, 500. [Google Scholar] [CrossRef]
Peng, Q.; Wang, Z.; Wang, G.; Zhang, W.; Chen, Z.; Liu, X. 3D Mineral Prospectivity Mapping from 3D geological models using return–risk analysis and machine learning on Imbalance Data. Minerals 2023, 13, 1384. [Google Scholar] [CrossRef]
Zhang, R.; Xi, Z. Research on anomaly identification and screening and metallogenic prediction based on semisupervised neural network. Comput. Intell. Neurosci. 2022, 2022, 8745036. [Google Scholar] [CrossRef]

Figure 1. Workflow diagram of literature retrieval and screening for mineralization prediction.

Figure 2. Interconnection of systems.

Figure 3. PRISMA style literature selection flow chart.

Figure 4. Trend of Literature Quantity from 2016 to 2025.

Figure 5. Comparison of machine learning algorithm applications in large-scale mineralization prediction across different periods.

Figure 6. Comparison of the number of publications from 2016–2020 and 2021–2025 by country.

Figure 7. Comparison of the number of papers published between 2016–2020 and 2021–2025, grouped by different machine learning methods in large-scale mineralization prediction.

Figure 8. Comparison of the number of papers published in 2015–2019 and 2020–2024, grouped by the main application fields of machine learning methods in large-scale mineralization prediction.

Figure 9. Heatmap of the Relationship Between Research Method Categories and Application Fields.

Table 1. Technical characteristics of selected machine learning methods applied to large-scale mineralization prediction.

Algorithm	Feature Engineering	Computation	Noise Robustness	Use Case Example
SVM	Required	Low	Medium	Geochemical anomaly classification
DT	Required	Very Low	Low	Structural ore-controlling factor identification
kNN	Required	Medium	Medium	Rapid mineralization type classification
RF	Required	Medium	High	Multi-source data fusion prediction
CNN	Not required	High	High	Remote sensing alteration mineral mapping
LSTM	Not required	Very High	High	Mineralization time-series prediction
Autoencoder	Not required	Medium	Medium–High	Reconstruction of missing geological data

Table 2. Publications by year in all categories.

Name	2016–2020	2021–2025	All Years	Share [%]
Total	96	159	255	100
Document Type
Journal Article	72	112	183	71.76
Conference Paper	16	30	46	18.04
Other	8	17	26	10.2
Machine Learning
CNN	0	43	43	16.86
RNN	0	34	34	13.33
LSTM	0	18	18	7.06
Autoencoder	1	3	4	1.57
SVM	45	15	60	23.53
DT	87	27	114	44.71
NNS	33	11	44	17.25
Random Forest	85	22	107	41.96
K-means	15	8	23	9.02
Application domains
Mineralization anomaly identification	32	50	82	32.16
Mineral resource estimation	24	34	58	22.75
Exploration target optimization	48	89	137	53.73
Extraction of structural ore-controlling factors	18	40	58	22.75
Mineralization potential assessment	69	106	175	68.63
Research Methodology
Literature Analysis	3	7	10	3.92
Case Study	29	55	84	32.75
Conceptual Study	56	105	161	63.33

Table 3. Publications by year in different countries.

Country	2015–2019	2020–2024	All Years	Share [%]
All countries	89	166	255	100
China	30	43	73	28.63
Canada	15	21	36	14.12
Australia	12	17	29	11.37
United States	11	16	27	10.59
Russia	9	13	22	8.63
South Africa	5	6	11	4.31
Brazil	3	5	8	3.14
Finland	2	4	6	2.35
Iran	2	4	6	2.35
India	2	4	6	2.35
Germany	2	3	5	1.96
France	2	3	5	1.96
England	1	3	4	1.57
Chile	1	2	3	1.18
Peru	30	43	73	28.63

Table 4. Publications classified by year according to different machine learning algorithms and research methods.

Name	Mineral Resource Estimation	Mineralization Anomaly Identification	Exploration Target Optimization	Extraction of Structural Ore-Controlling Factors	Mineralization Potential Assessment	Total
Total	82	58	137	58	175	255
CNN	8	6	13	6	17	43
RNN	6	4	10	4	13	34
LSTM	3	2	6	2	7	18
Autoencoder	1	1	1	1	2	4
SVM	11	8	18	8	23	60
DT	22	15	36	15	45	114
NNS	8	6	13	6	17	44
Random Forest	19	13	33	13	42	107
K-means	4	3	7	3	9	23
Research Methodology
Literature Analysis	1	2	3	1	3	10
Case Study	10	17	24	7	26	84
Conceptual	21	33	46	14	47	161

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

