Machine Learning in Surface Mining—A Systematic Review
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper conducts a systematic review of ML applications in surface mining from 2020 to 2025. It finds that 75% of the literature focuses on the blasting phase, where hybrid models like XGBoost demonstrate excellent predictive performance (R² > 0.97). Other operational phases rely more on technologies like reinforcement learning. However, key limitations such as strong site dependency, low model transparency, and a lack of unified evaluation metrics significantly hinder the practical deployment and scalability of ML in Mining 4.0. The article is well-structured and methodologically sound, though some analyses and discussions could be more in-depth. It is recommended that the article undergo major revision before publication.
1、The quality of English writing needs significant improvement. Grammatical errors, typos, and incorrect terminology seriously affect readability. For instance, on Page 1, Line 12, "unitarian operations" appears repeatedly, which is inaccurate; the standard mining engineering term is "unit operations". On Page 2, Line 20, "In the other hand" should be "On the other hand".
2、The authors used the ROBINS-I tool to assess bias risk in engineering-focused ML papers. While borrowing tools across disciplines is acceptable, the explanation in Section 2.4 (Page 5) for domains like "Bias of interventions" or "Bias for confusion" – linking them to algorithm transparency and geomechanical properties – seems slightly strained and lacks clear mapping criteria.
3、The quality of figures requires enhancement. For example, in Figure 1 (PRISMA flow diagram) around Line 244, there are typos, and the numbers in the diagram (e.g., n=5317, n=3011) need to be carefully cross-checked with the textual description in Section 3.1 for consistency.
4、In Line 299, it states "Load and haul was the second class with the most articles, totaling six articles", which is significantly fewer compared to the blasting phase (43 articles). It is recommended to discuss whether this uneven distribution reflects actual application needs or research gaps.
5、In line 573, it is mentioned that "A foundation for Digital Twin operations... can only be achieved if the industry moves toward open-access datasets", but the current obstacles to data sharing are not mentioned in the text. Suggest adding relevant discussion.
6、While the article discusses limitations to some extent, the guidance for future research remains somewhat general. It is recommended that the authors, based on the review findings, clearly pinpoint key research gaps in the current landscape and propose more specific, actionable suggestions for future work. This would enhance the paper's forward-looking perspective and practical guidance.
7、The overall paper structure is clear, and the language generally meets academic journal standards. However, the logical transitions between some sections are slightly abrupt. Suggest adding summarizing sentences at the beginning or end of sections to strengthen the overall logical coherence.
Author Response
Dear Reviewer 1,
First of all, we would like to thank you for your insightful comments and guidance on improving our systematic review. We believe that addressing your suggestions has significantly enhanced the quality of the manuscript.
Comment 1: The quality of English writing needs significant improvement. Grammatical errors, typos, and incorrect terminology seriously affect readability. For instance, on Page 1, Line 12, "unitarian operations" appears repeatedly, which is inaccurate; the standard mining engineering term is "unit operations". On Page 2, Line 20, "In the other hand" should be "On the other hand".
Response 1: We would like to thank Reviewer 1 for highlighting this issue. We apologize for the grammatical errors in the original submission. The entire manuscript has been thoroughly revised to improve English usage and readability. Specifically, we have corrected the terminology 'unitarian operations' to the standard 'unit operations' throughout the text and changed 'In the other hand' to 'On the other hand', along with other grammatical inconsistencies.
Comment 2: The authors used the ROBINS-I tool to assess bias risk in engineering-focused ML papers. While borrowing tools across disciplines is acceptable, the explanation in Section 2.4 (Page 5) for domains like "Bias of interventions" or "Bias for confusion" – linking them to algorithm transparency and geomechanical properties – seems slightly strained and lacks clear mapping criteria.
Response 2: We would like to thank Reviewer 1 for this insightful observation. We agree that applying ROBINS-I in an engineering and machine learning context led to terminology misalignments (e.g., "bias of interventions").
To address this, we have replaced the ROBINS-I tool with the PROBAST (Prediction Model Risk of Bias Assessment Tool), which is specifically designed for evaluating prediction model studies and aligns much more naturally with ML workflows.
As suggested, we have completely rewritten Section 2.4 (Bias Assessment) to provide clear mapping criteria. Specifically:
- Participants now focus on data sources and sampling locations.
- Predictors include input variables such as geomechanical properties and atmospheric conditions.
- Outcome refers to the target variables in the ML models.
- Analysis assesses algorithmic rigor and transparency.
We believe this adjustment provides a more robust and discipline-appropriate methodological framework. These changes can be seen in lines 226-258 and 486-536, and in the discussion of bias in lines 555-578.
Comment 3: The quality of figures requires enhancement. For example, in Figure 1 (PRISMA flow diagram) around Line 244, there are typos, and the numbers in the diagram (e.g., n=5317, n=3011) need to be carefully cross-checked with the textual description in Section 3.1 for consistency.
Response 3: Figure 1 (PRISMA flow diagram) has been completely revised to improve its visual quality and resolution. All typos within the diagram have been corrected. Furthermore, we have meticulously cross-checked the numerical data (e.g., n=5317, n=3011) against the textual description in Section 3.1 to ensure full consistency throughout the manuscript. See lines 260-282.
Comment 4: In Line 299, it states "Load and haul was the second class with the most articles, totaling six articles", which is significantly fewer compared to the blasting phase (43 articles). It is recommended to discuss whether this uneven distribution reflects actual application needs or research gaps.
Response 4: Dear Reviewer 1, we agree with your insightful comment. We have expanded the discussion in Section 4.1 (Analysis by category), lines 613-620, to address this disparity. We analyzed whether the lower number of articles on 'load and haul' reflects a research gap or current industry needs.
Comment 5: In line 573, it is mentioned that "A foundation for Digital Twin operations... can only be achieved if the industry moves toward open-access datasets", but the current obstacles to data sharing are not mentioned in the text. Suggest adding relevant discussion.
Response 5: We appreciate Reviewer 1's insightful comment regarding the need to detail the obstacles to data sharing in the extractive industry. We have revised the text to address this issue and added a relevant discussion. The added text is in lines 718-724.
Comment 6: While the article discusses limitations to some extent, the guidance for future research remains somewhat general. It is recommended that the authors, based on the review findings, clearly pinpoint key research gaps in the current landscape and propose more specific, actionable suggestions for future work. This would enhance the paper's forward-looking perspective and practical guidance.
Response 6: We have expanded the Discussion (Section 4.3) and Conclusions to provide concrete, actionable guidance. Specifically, we now recommend that future research adopt minimum reporting checklists for datasets and preprocessing, and that practical applications report operational KPIs (e.g., ton/h, fuel consumption) alongside statistical metrics to bridge the gap to deployment. See lines 698-740.
Comment 7: The overall paper structure is clear, and the language generally meets academic journal standards. However, the logical transitions between some sections are slightly abrupt. Suggest adding summarizing sentences at the beginning or end of sections to strengthen the overall logical coherence.
Response 7: We improved the flow of the manuscript by adding bridging sentences to contextualize new sections (e.g., Section 3.3 now explicitly links back to the previous categorization) and by restructuring the Discussion to directly address each Research Question (RQ1, RQ2, and RQ3) within their respective subsections, thereby strengthening the logical coherence.
Reviewer 2 Report
Comments and Suggestions for Authors
1) The inclusion of 2025 in the analysis period (2020-2025), when the work was obviously prepared earlier. Data for the current/next year does not have time to go through the full cycle of indexing in a database, such as Scopus, which leads to systematic underestimation of the sample and distortion of trends. It is necessary to limit the period 2020-2024 or explicitly indicate the incompleteness of data for 2025 as a serious limitation.
2) The Methodology section describes only search queries. There is no detailed description of the filtering procedure, which makes the process impossible to reproduce. How many publications relate to original articles and conference materials, how many are in the public domain, etc. There is no metadata selection. Presentable in the form of a table or flow chart.
3) The final sample of 57 publications over 5 years in such a vast field raises doubts about representativeness. I entered the query into the Scopus database and got 9,793 results.
4) Lack of basic scientometric and contextual analysis. Add the appropriate section. As a guideline on the methodology of deep analysis, we can consider the highly cited work in this area by Babyr N.V. Topical Themes and New Trends in the Mining Industry: Scientific Analysis and Research Visualization (2024), where a similar approach is fully and systematically implemented.
5) I do not see the point in Figure 3, it is enough to indicate in the text. Figure 4 is not readable at all, we need to correct the chart.
6) Careful language editing is required.
7) Some provisions are duplicated in the Results and Discussion sections.
Author Response
Dear Reviewer 2, first of all, we would like to thank you for your valuable comments and instructions on how to improve our systematic review. We believe that by following your comments and suggestions, the review has reached its full potential.
Comment 1: The inclusion of 2025 in the analysis period (2020-2025), when the work was obviously prepared earlier. Data for the current/next year does not have time to go through the full cycle of indexing in a database, such as Scopus, which leads to systematic underestimation of the sample and distortion of trends. It is necessary to limit the period 2020-2024 or explicitly indicate the incompleteness of data for 2025 as a serious limitation.
Response 1: Reviewer 2 makes an excellent point regarding the indexing cycle in databases like Scopus. While the search was conducted in December 2025 (as per PRISMA guidelines in Section 2.1), we acknowledge that the 2025 data is inherently incomplete. We have updated the manuscript to explicitly state this incompleteness as a limitation, ensuring the reader is aware that 2025 figures do not yet reflect the full year's academic output.
Comment 2: The Methodology section describes only search queries. There is no detailed description of the filtering procedure, which makes the process impossible to reproduce. How many publications relate to original articles and conference materials, how many are in the public domain, etc. There is no metadata selection. Presentable in the form of a table or flow chart
Response 2: We sincerely appreciate Reviewer 2's valuable feedback regarding the lack of detail in the Methodology section. We agree that transparency and reproducibility are essential. To address this, we have extensively rewritten Section 2 and included a PRISMA flow diagram (lines 109-257).
Comment 3: The final sample of 57 publications over 5 years in such a vast field raises doubts about representativeness. I entered the query into the Scopus database and got 9,793 results.
Response 3: We thank Reviewer 2 for raising this crucial issue regarding representativeness. We have re-run the search to verify our results. While a broad search in Scopus indeed yields a high number of results (capturing general terms), our systematic review applies specific inclusion and exclusion criteria (as detailed in the Methodology) to focus strictly on ML/AI applied directly to surface mining unit operations (such as drilling, blasting, loading, and hauling). Upon re-evaluation, we confirmed that the vast majority of the ~9,000 results fall outside the specific scope of this review (e.g., different industries, theoretical concepts without application, or insufficiently rigorous validation). Additionally, access to these databases depends on institutional subscriptions, which might also explain some of the discrepancies in the initial search results. Therefore, the 57 selected articles represent the core body of literature relevant to our specific research questions.
Comment 4: Lack of basic scientometric and contextual analysis. Add the appropriate section. As a guideline on the Methodology of deep analysis, we can consider the highly cited work in this area by Babyr N.V. Topical Themes and New Trends in the Mining Industry: Scientific Analysis and Research Visualization (2024), where a similar approach is fully and systematically implemented.
Response 4: We thank Reviewer 2 for this insightful suggestion to enhance the depth of the analysis. We have added a new Figure 4 in Section 3.2, along with a dedicated paragraph that conducts a scientometric analysis of keyword co-occurrence. This visualization maps the thematic clusters of the 57 included studies, highlighting the centrality of blasting/prediction research and the peripheral position of haulage studies. See lines 313-323.
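For readers unfamiliar with keyword co-occurrence analysis, the counting step behind such a map can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual workflow: the function name `keyword_cooccurrence` and the keyword lists are hypothetical and are not taken from the 57 included studies.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists):
    """Count how often each unordered keyword pair appears in the same article."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and drop duplicates within a single article.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        # Sorting makes each pair order-independent: ("a", "b") never ("b", "a").
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical keyword lists (assumption, for illustration only).
articles = [
    ["Blasting", "Prediction", "XGBoost"],
    ["blasting", "prediction", "ANN"],
    ["Haulage", "Reinforcement learning"],
]
counts = keyword_cooccurrence(articles)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Pairs with high counts would sit near the centre of a co-occurrence map, while keywords that co-occur with few others would appear at its periphery.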
Comment 5: I do not see the point in Figure 3, it is enough to indicate in the text. Figure 4 is not readable at all, we need to correct the chart.
Response 5: We agree with Reviewer 2's suggestion. Accordingly, Figure 3 has been removed, and the relevant information is now concisely described within the text. Additionally, Figure 4 has been completely redesigned to improve its resolution, font size, and overall readability, ensuring the data is now clearly presented.
Comment 6: Careful language editing is required.
Response 6: In accordance with Reviewer 2's suggestion, the entire manuscript has undergone a thorough language editing process. We have meticulously revised the text to improve grammar, syntax, and the overall flow of the discussion, ensuring it meets professional academic standards.
Comment 7: Some provisions are duplicated in the Results and Discussion sections.
Response 7: We appreciate Reviewer 2's keen observation regarding the overlap between sections. We have conducted a thorough revision of Sections 3 (Results) and 4 (Discussion) to eliminate redundancies and ensure a clear distinction between data presentation and interpretation.
The Results section is now strictly dedicated to the quantitative presentation of the systematic review findings, including the bibliometric distribution, model frequencies, and statistical performance metrics (e.g., reporting specific R² and RMSE values without extensive commentary).
Conversely, the Discussion section has been restructured to focus exclusively on the interpretation of these findings. It now centers on contextualizing the shift toward hybrid architectures, explaining the prevalence of blasting studies due to data availability, and comparing the validation maturity of the mining sector against other industries (such as construction and manufacturing).
Reviewer 3 Report
Comments and Suggestions for Authors
1--The search queries for Scopus, ScienceDirect, Dimensions, and Web of Science are presented in Section 2.1. However, the queries are not identical across databases, and the rationale for these differences is not explained.
2--The authors report counts, percentages, and model names, but do not critically synthesize the evidence. For example, the high predictive performance (R² > 0.97) is repeatedly highlighted, yet there is no discussion of potential overfitting, data leakage, or the appropriateness of R² for non‑linear models.
3--Figure 2 shows that China, Iran, and India contribute the largest number of studies. The authors state that this “highlights the importance of these technologies in the sector” (line 58). No attempt is made to explain why these countries dominate (e.g., national funding priorities, large mining sectors, journal publication trends) or what implications this geographic concentration has for the generalizability of findings.
4--The concepts of Mining 4.0 and digital twins are mentioned in the abstract, introduction, and conclusion, but they are not defined and their relationship to the reviewed ML studies is not critically examined.
5--The finding that 75% of articles focus on blasting is attributed to “the direct impact of blasting on operational safety and downstream costs” and the “nature of blasting data” (lines 493–499). While plausible, this explanation is not supported by evidence from the review itself.
Author Response
Dear Reviewer 3, first of all, we would like to thank you for your important comments and instructions on how to improve and enhance our systematic review. We believe that by following your comments and suggestions, the review has reached its full potential.
Comment 1: The search queries for Scopus, ScienceDirect, Dimensions, and Web of Science are presented in Section 2.1. However, the queries are not identical across databases, and the rationale for these differences is not explained.
Response 1: We thank Reviewer 3 for pointing out this crucial missing information. We agree that detailing the filtering procedure is essential for reproducibility. We have added the necessary specifications regarding article selection, article types (original vs conference), and metadata at Lines 119-123 and 134-136.
To further address this, we have ensured that despite the structural differences, the search remains anchored in the same four conceptual pillars identified in the methodology: (1) technological and geospatial modelling; (2) artificial intelligence and learning algorithms; (3) mining operations and application context; and (4) sustainability and environmental dimensions.
Comment 2: The authors report counts, percentages, and model names, but do not critically synthesize the evidence. For example, the high predictive performance (R² > 0.97) is repeatedly highlighted, yet there is no discussion of potential overfitting, data leakage, or the appropriateness of R² for non‑linear models.
Response 2: We thank Reviewer 3 for the comment. We addressed these problems in "4.3 Methodological Limitations and Evidence Gap", specifically in lines 707-713.
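The overfitting concern raised in this comment can be illustrated with a small synthetic example. The sketch below is an assumption-laden illustration and not an analysis from the manuscript: it uses a 1-nearest-neighbour model (chosen only because it memorises its training data) on a target that is pure noise, and compares in-sample R² with held-out R². All names and data are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def predict_1nn(x_train, y_train, x_query):
    """Return the target of the nearest training point (a memorising model)."""
    nearest = min(range(len(x_train)), key=lambda i: abs(x_train[i] - x_query))
    return y_train[nearest]


rng = random.Random(0)
# Synthetic data: the target is independent noise, so nothing can be learned.
x = [rng.random() for _ in range(200)]
y = [rng.gauss(0.0, 1.0) for _ in range(200)]
x_train, y_train = x[:100], y[:100]
x_test, y_test = x[100:], y[100:]

# Scoring on the training data itself: every point is its own nearest neighbour.
r2_train = r2_score(y_train, [predict_1nn(x_train, y_train, q) for q in x_train])
# Scoring on data the model has never seen.
r2_test = r2_score(y_test, [predict_1nn(x_train, y_train, q) for q in x_test])
print(f"in-sample R2 = {r2_train:.2f}, held-out R2 = {r2_test:.2f}")
```

The in-sample score is perfect even though the target carries no signal, which is why a reported R² is hard to interpret unless it is computed on data kept strictly separate from training and preprocessing.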
Comment 3: Figure 2 shows that China, Iran, and India contribute the largest number of studies. The authors state that this "highlights the importance of these technologies in the sector" (line 58). No attempt is made to explain why these countries dominate (e.g., national funding priorities, large mining sectors, journal publication trends) or what implications this geographic concentration has for the generalizability of findings.
Response 3: We thank Reviewer 3 for this valuable comment. We agree that discussing the geographical distribution is important for improving the quality of the article. We have addressed these concerns specifically in lines 582-591.
Comment 4: The concepts of Mining 4.0 and digital twins are mentioned in the abstract, introduction, and conclusion, but they are not defined and their relationship to the reviewed ML studies is not critically examined.
Response 4: We agree with Reviewer 3's observation. To address this, we have included the definitions of Mining 4.0 and Digital Twins in the Introduction (Lines 65–72). Furthermore, a critical examination of their relationship with the reviewed ML studies has been added to Section 4.3, specifically in Lines 649-655.
Comment 5: The finding that 75% of articles focus on blasting is attributed to "the direct impact of blasting on operational safety and downstream costs" and the "nature of blasting data" (lines 493–499). While plausible, this explanation is not supported by evidence from the review itself.
Response 5: We appreciate this insightful comment from Reviewer 3. We agree that the initial explanation relied too heavily on general assumptions. We have revised the Discussion section to ground this explanation in the evidence gathered by this systematic review.
Specifically, we expanded the analysis to show that our findings regarding data complexity and algorithm types support the predominance of blasting studies. The review reveals that blasting datasets allow for high-accuracy static regression (e.g., SVM, ANN). In contrast, other domains, such as 'Load and Haul', require more complex Reinforcement Learning or Computer Vision for dynamic environments. We argue that this 'complexity gap' identified in our results explains the volume of publications.
Furthermore, regarding the 'downstream costs' claim, we incorporated specific literature support regarding the 'Mine-to-Mill' concept to validate the operational impact argument.
Reviewer 4 Report
Comments and Suggestions for Authors
I would like to thank the authors for their great effort in preparing this manuscript, high-quality analysis and detailed report. This manuscript can be processed further. Minor notes are listed below:
- Line 310: "The articles used nine different software, with two of the articles not specifying what type of software was used []." Please add references to literature in brackets.
- 2. The presented manuscript is a review and bibliometric study. But it does not contain rigorous scientific results in mining. However, it characterizes the development vector of AI and machine learning in surface mining. Therefore, in the reviewer's opinion, it would be appropriate to more clearly define the challenges of implementing these technologies (besides the difficulties associated with differences in technology and geological conditions) and possible solutions (besides the implementation of unified data protocols) in the conclusions. It is possible to classify conditions by the type of extracted raw material, clustering, etc.
Author Response
Dear Reviewer 4, first of all, we would like to thank you for your valuable comments and instructions on how to improve our systematic review. We believe that by following your comments and suggestions, the review has reached its full potential.
Comment 1: Line 310 The articles used nine different software, with two of the articles not specifying what type of software was used []. Please add references to literature in brackets.
Response 1: We thank Reviewer 4 for bringing this omission to our attention. The missing references for the two articles that did not specify the software used have been added in brackets, as requested.
Comment 2: The presented manuscript is a review and bibliometric study. But it does not contain rigorous scientific results in mining. However, it characterizes the development vector of AI and machine learning in surface mining. Therefore, in the reviewer's opinion, it would be appropriate to more clearly define the challenges of implementing these technologies (besides the difficulties associated with differences in technology and geological conditions) and possible solutions (besides the implementation of unified data protocols) in the conclusions. It is possible to classify conditions by the type of extracted raw material, clustering, etc.
Response 2: We sincerely thank Reviewer 4 for this insightful comment. We agree that characterizing the development vector of AI requires a clearer definition of implementation challenges and potential solutions beyond geological constraints.
In response to this suggestion, we have substantially revised and expanded the Conclusions (Section 5, lines 741-787) to address these points:
- Challenges in Implementation: We have added a discussion on the disconnect between statistical model accuracy and practical utility. Specifically, we highlight that “operational efficiency and safety-relevant indicators are inconsistently incorporated, creating a gap between model accuracy and practical utility”. We also emphasized the limitations imposed by the reliance on single-site datasets, which constrain scalable deployment.
- Proposed Solutions: Beyond unified protocols, we have introduced specific recommendations for validation and technology integration. We now suggest that validation strategies must “expand beyond internal cross-validation to include multi-site testing” to ensure reliability. Furthermore, we highlighted the integration of Digital Twins and visualization software (e.g., Unity) as a concrete solution to “bridge the gap between digital simulation and physical reality” for real-time decision-making.
We believe these additions provide a more rigorous outlook on the practical application of these technologies in surface mining.
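The multi-site testing recommended above can be sketched as leave-one-site-out splitting, where each mine site is held out in turn and the model is trained on the remaining sites. This is a minimal sketch under stated assumptions, not a procedure described in the manuscript: the function name `leave_one_site_out` and the site labels are hypothetical.

```python
from collections import defaultdict


def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices), one fold per site."""
    by_site = defaultdict(list)
    for idx, site in enumerate(site_labels):
        by_site[site].append(idx)
    for site in sorted(by_site):
        test_idx = by_site[site]
        # Train on every record that does not come from the held-out site.
        train_idx = sorted(
            i for other, idxs in by_site.items() if other != site for i in idxs
        )
        yield site, train_idx, test_idx


# Hypothetical site labels for seven blast records from three mines.
sites = ["mine_A", "mine_A", "mine_B", "mine_B", "mine_B", "mine_C", "mine_C"]
folds = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because no record from the held-out site is seen during training, the score on each fold approximates performance at an unseen site, which ordinary internal cross-validation on a single-site dataset cannot show.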
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
No more comments.
Reviewer 3 Report
Comments and Suggestions for Authors
The author has made detailed revisions based on the reviewer's comments. The paper meets the requirements for publication and is recommended for acceptance.

