Comprehensive Comparison of Machine Learning Approaches—Deterministic and Stochastic—In Modeling the Production and Power of an SAG Mill: A Case Study of the Chilean Copper Mining Industry
Abstract
1. Introduction
- (1) To evaluate the predictive performance of multiple modeling approaches (RF, GBM, XGB, ANN, etc.).
- (2) To fit probabilistic models that have the potential to assess uncertainty and allow for estimating responses with partial knowledge of the input variables.
- (3) To analyze and compare the behavior of the models under different production regimes (e.g., dilute, optimum, and thick hydraulic regimes).
- (4) To evaluate the interpretability of predictions through Sensitivity Analysis and comparison with process engineering knowledge.
- (5) To analyze and discuss the implications of selecting approaches differentiated by explained variables and operating regimes in the development of digital twins or real-time control systems in industrial concentrator plants.
2. Background
2.1. Model Comparison Matrix
2.2. Reference Metrics and Validation Practices
- Regression Metrics: Root Mean Square Error (RMSE), normalized RMSE, Mean Absolute Percentage Error (MAPE) [12], Mean Absolute Error (MAE), and R² (widely used, although not all articles cite numerical values) [6,19]. Saldaña et al. [18] report simulated production increases and energy savings as operational performance metrics (+4.42% production; −7.62% energy) derived from their ML-driven scenarios.
- Probabilistic Classification/Metrics: The Receiver Operating Characteristic Area Under the Curve (ROC-AUC), the Matthews Correlation Coefficient (MCC), Cohen’s κ, the Brier score, the Expected Calibration Error (ECE), and reliability curves are conceptually relevant for miss/detect or binarized-regime problems but are rarely reported in the reviewed SAG regression articles; BN/RUL studies assess probabilistic accuracy and inference performance rather than ROC-specific details [14,15].
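
The regression metrics listed above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `regression_metrics`, the sample values, and the choice to normalize RMSE by the observed range (other conventions divide by the mean) are assumptions made for illustration, not details taken from the reviewed studies.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, range-normalized RMSE, MAE, MAPE (%) and R^2."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    value_range = max(y_true) - min(y_true)
    if ss_tot == 0:
        raise ValueError("R^2 and normalized RMSE need non-constant observations")

    return {
        "rmse": rmse,
        "nrmse": rmse / value_range,  # assumption: normalized by the observed range
        "mae": mae,
        "mape": mape,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical observed and predicted values, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MAPE divides by the observed value, it is unsuitable for targets that can reach zero, which is one reason studies often report it alongside RMSE and MAE rather than alone.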
2.3. Sensitivity and Uncertainty Analysis Methods
- Sobol-based Global Sensitivity Analysis (GSA): This is used to compute sensitivity indices for input–output pairing and control structure design in SAG mills; the Sobol–Jansen indices informed the pairing of variables (fresh ore flow with mill filling fraction; energy consumption with % critical speed) in a Multiple-Input–Multiple-Output (MIMO) control design study [20].
- Monte Carlo Analysis/Scenarios: Uncertainty is propagated by Monte Carlo sampling over parameter ranges and plant simulators to identify where accurate information or online measurement is needed in flowsheet design [21].
- Surrogate-Accelerated GSA: Supervised ML surrogates (batch and online Extreme Learning Machine, ELM) substantially reduce the computational cost of Sobol-based GSA and enable online sensitivity estimation for mineral processing equipment [16].
- Stochastic Population Balance: Stochastic formulations of breakage functions and probabilistic load/breakage models are used to capture the evolution of the Particle Size Distribution (PSD) under an uncertain load [40].
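
As an illustration of how Sobol indices can be estimated by Monte Carlo sampling, the following sketch applies the Jansen estimators to a toy additive function. Everything here is an assumption made for the example: the function `toy_model`, the independent uniform inputs on [0, 1], and the sample size. It is not the model or the setup used in the cited studies.

```python
import random


def sobol_jansen(model, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order and total Sobol indices with Jansen estimators.

    Assumes independent inputs, each uniform on [0, 1].
    """
    rng = random.Random(seed)
    # Two independent sample matrices A and B (rows are input vectors).
    A = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    f_A = [model(row) for row in A]
    f_B = [model(row) for row in B]

    outputs = f_A + f_B
    mean = sum(outputs) / len(outputs)
    variance = sum((v - mean) ** 2 for v in outputs) / len(outputs)

    first_order, total_order = [], []
    for i in range(n_inputs):
        # AB_i: matrix A with column i taken from B.
        f_AB = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # f_B and f_AB share only input i -> first-order effect.
        v_first = variance - sum(
            (fb - fab) ** 2 for fb, fab in zip(f_B, f_AB)
        ) / (2 * n_samples)
        # f_A and f_AB differ only in input i -> total effect.
        v_total = sum(
            (fa - fab) ** 2 for fa, fab in zip(f_A, f_AB)
        ) / (2 * n_samples)
        first_order.append(v_first / variance)
        total_order.append(v_total / variance)
    return first_order, total_order


def toy_model(x):
    # Hypothetical additive response, NOT a mill model.
    return 4.0 * x[0] + 2.0 * x[1] + 1.0 * x[2]


first_order, total_order = sobol_jansen(toy_model, n_inputs=3)
for i, (s, t) in enumerate(zip(first_order, total_order)):
    print(f"x{i + 1}: first-order = {s:.3f}, total = {t:.3f}")
```

For this additive toy function the analytic indices are 16/21, 4/21, and 1/21, and the first-order and total indices coincide because there are no interactions. The cost of `n_samples * (n_inputs + 2)` model evaluations is what motivates the surrogate-based acceleration mentioned above.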
2.4. Suitability Conditions and Operational Contexts
- Examples: Population balance simulators are used to predict the temporal evolution of product flow rate, mill load level, and energy consumption when redesigning flowsheets or circuits [17]; the Discrete Element Method (DEM) is used to quantify material transport, drag, and collision energy and to evaluate lifter/grate modifications (impacts on discharge flow and collision energy) [7].
- These are less suitable when real-time, plant-scale closed-loop control is required without online parameter estimation, because of their runtime.
- Handling drift/liner wear/pebble load: ML models can incorporate features such as liner age and capture changes if they are retrained or adapted online; Saldaña et al. [18] explicitly included liner age in the models used for scenario simulation.
- These are suitable when decision support, alarm threshold definition, Remaining Useful Life (RUL), and probabilistic reasoning about uncertain events (e.g., sump level regimes, risk of clustering/torque events) are the primary objectives; BNs encode causal relationships and support evidence updating under changing operating conditions [14,15].
- Limitations: Fully probabilistic MPC/GP-MPC and Dynamic Bayesian Networks (DBNs) for online control are not demonstrated within this corpus (see Gaps).
2.5. Implementation Notes and Implementation Considerations
2.6. Industrial Evidence, Performance Results, and Observed Robustness
- Throughput and energy results from ML-driven studies: Saldaña et al. [18] used statistical and ML models combined with simulations to estimate a potential production increase of 4.42% through fragmentation/operational adjustments and an energy reduction of 7.62% by decreasing rotational speed under specific liner age configurations. Additional studies report that RNN/LSTM architectures outperform classical regressors in energy and throughput forecasting; one LSTM study reported energy prediction errors below 4% RMSE [6,12].
- Real-plant datasets and model comparisons: A comparative study using an industrial time series of 20,161 records found that time-aware RNN models were the most accurate for throughput prediction, followed by genetic programming (GP) and SVR. Sensitivity Analysis identified rotational speed and inlet water as dominant factors [19,23]. Furthermore, GP-based approaches provide closed-form expressions valued by domain experts for their interpretability [23].
- Control and sensitivity evidence: Sobol-based GSA-driven decentralized controller pairing demonstrated performance comparable to MPC in case studies. The suggested pairings were fresh ore feed—fractional mill filling and power—percentage of critical speed for SAG control design [20]. Surrogate-based GSA reduced computational cost and enabled near-online sensitivity indices, supporting robustness under drift conditions [16].
- Design engineering and DEM evidence: DEM studies quantified how lifter geometry and pulp lifter design affect transport velocities, discharge flow, and collision energy. Significant increases in discharge flow and collision energy were reported when modifying lifter geometry, supporting liner/grate redesign strategies [7].
- Stochastic/probabilistic decision support: BN implementations demonstrated high accuracy in predicting the fresh feed rate and mill power in industrial proof-of-concept datasets and were proposed for decision support under feed uncertainty [14]. BN approaches have also been successfully applied to predict RUL/reliability for heavy mining motors [15].
3. Materials and Methods
3.1. General Design of the Study
3.2. Data, Operational Variables and Data Preprocessing
- P80: Mesh opening size through which 80% of the feed material passes.
- SAG water feeding (m3/h): Water flow feeding to the SAG mill.
- SAG rotational speed (RPM): Mill rotational speed.
- SAG pressure (PSI): Bearing pressure signal used as an operational proxy to estimate the internal mill load (mill fill level).
- Stockpile level (m): Stockpile level in the feeding stack.
- Sump level (m): Level in the discharge sump of the SAG mill.
- Hardness: Resistance offered by the mineral to abrasion or scraping.
- Solids in the feeding (%): Percentage of solids in the feed pulp.
- Pebbles (tph): Flow of pebbles, the hard rock fragments produced during grinding that are difficult to reduce to a smaller size in the SAG mill.
- Granulometry > 100 mm (%): Percentage of the ore feed whose granulometry is greater than 100 mm.
- Granulometry < 30 mm (%): Percentage of the ore feed whose granulometry is less than 30 mm.
- Liner age (months): Age of the mill liners. Liners are integral components of the mill and function as protective shells for the internal casing (SAG mill shell), which is subject to progressive wear due to the intense and continuous impact generated by interactions between the ore charge and the steel grinding media.
3.3. Comparative Models
3.3.1. Multiple Regressions—MRs
3.3.2. Random Forest—RF
3.3.3. eXtreme Gradient Boosting—XGBoost—XGB
3.3.4. Gradient Boosting Machine—GBM
3.3.5. Artificial Neural Networks—ANNs
3.3.6. Bayesian Linear Regression—BLR
3.3.7. Bayesian Additive Regression Tree—BART
3.3.8. Gaussian Bayesian Network—GBN
3.3.9. Gaussian Process Regression—GPR
3.3.10. Bayesian Neural Network—BNN
3.4. Comparative Strategy Between Models
3.5. Sensitivity Analysis—SA
3.6. Suitability Comparison by Operating Regime
4. Results
4.1. Exploratory Analysis
4.2. Analysis of the Comparative Experiment
4.2.1. Non-Stochastic Model Analysis
4.2.2. Stochastic Model Analysis
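
The stochastic models are compared in this paper using interval coverage and the mean prediction interval width (MPIW). The sketch below shows one common way to compute both quantities from predicted interval bounds; the function name `interval_metrics` and the sample values are hypothetical, and the exact definitions used by the authors are not shown in this excerpt.

```python
def interval_metrics(y_true, lower, upper):
    """Return (empirical coverage, mean prediction interval width)."""
    n = len(y_true)
    if n == 0 or not (n == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    # Coverage: share of observations that fall inside their interval.
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    # MPIW: average distance between the upper and lower bounds.
    mpiw = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return covered / n, mpiw


# Hypothetical observations and interval bounds, for illustration only.
observed = [10.0, 12.0, 15.0, 20.0]
lower_bounds = [8.0, 11.0, 16.0, 18.0]
upper_bounds = [12.0, 14.0, 19.0, 23.0]

coverage, mpiw = interval_metrics(observed, lower_bounds, upper_bounds)
print(f"coverage = {coverage:.2f}, MPIW = {mpiw:.2f}")
```

The two numbers are read together: for a nominal 90% interval, coverage well below 0.90 signals overconfident intervals, while coverage above 0.90 with a large MPIW signals conservative ones.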
4.3. Sensitivity and Elasticity Analysis
- Sensitivity and Elasticity Structure Across Models: Tornado diagrams reveal a clear and recurrent hierarchy in the influence of input variables on both SAG Power and SAG Production. Across nearly all the modeling approaches, the dominant drivers are SAG rotational speed, solids in the feeding, SAG pressure, and P80 in the feeding. These variables consistently exhibit the largest absolute sensitivities and the highest elasticities, indicating that marginal perturbations in these inputs produce the greatest impact on system response. For SAG Power, rotational speed emerges as the most structurally stable driver, displaying large positive sensitivities across deterministic and probabilistic models. This behavior is physically consistent with SAG dynamics, where increased rotational speed raises energy transfer and, consequently, SAG Power. The solids percentage in the feeding also shows strong positive elasticities, reflecting the increase in effective load and grinding resistance as pulp density rises. Mill pressure has a significant influence on the fitted models, which can be explained by its role as an indicator of the internal load status and mill filling dynamics. For SAG Production, rotational speed remains significant, but the response is considerably more sensitive to feed characteristics such as solids concentration and P80. This aligns with the fundamentals of SAG dynamics, where production rate is more controllable through feeding adjustments, while SAG Power reflects the internal energy dissipation dynamics.
- Physical Consistency and Model Behavior: Observed elasticities are physically coherent. Rotational speed maintains a positive sign in nearly all models, while solids concentration generally increases both power and production. SAG Pressure also exhibits positive elasticity in the Power models, consistent with higher internal loading conditions leading to higher energy demand. Granulometric variables (>100 mm and <30 mm fractions) display moderate sensitivities but lower elasticities, indicating a secondary yet non-negligible influence. Their impact appears model-dependent, which is expected due to interaction effects with P80 and hardness. Notably, the best-performing predictive models (XGB and ANN in SAG Power; XGB and BNN in SAG Production) exhibit smoother and more structurally coherent tornado patterns: the sensitivity rankings are stable, signs are consistent, and there are no erratic oscillations. In contrast, models with weaker predictive performance (BART in SAG Power or baseline MR and BLR) show either attenuated or exaggerated elasticities, suggesting reduced structural fidelity.
- Relationship with Predictive Performance and Robustness: A strong relationship emerges between structural sensitivity stability and predictive robustness. The models with the lowest Test RMSE (e.g., XGB for SAG Power, with RMSE ≈ 431 and R² ≈ 0.95) present well-ordered and physically interpretable tornado diagrams. Their OOF–Test gap remains moderate, and their sensitivity structure remains consistent between Train and Test, whereas models with larger OOF–Test discrepancies tend to display more variable or less hierarchically organized elasticities. This suggests that sensitivity coherence may serve as an indirect indicator of generalization capability and that such models extrapolate more reliably outside the calibration sample.
- Integration with Uncertainty and Calibration Metrics: Sensitivity Analysis indicates a direct relationship between the variables with the highest sensitivity and the regions with the greatest predictive dispersion. For SAG Power, GPR produces relatively sharp predictive intervals (MPIW ≈ 1352) but slightly undercovers (90% coverage ≈ 0.888), that is, its intervals are narrower and more informative at the cost of slight undercoverage. In contrast, BART and GBN achieve higher coverage (≈0.94–0.95) at the expense of substantially wider intervals, implying conservative uncertainty estimation. The structural ranking of dominant variables remains consistent across probabilistic and deterministic approaches. Notably, the inclusion of uncertainty does not alter the hierarchy of influence but quantifies the propagation of variability through the most sensitive drivers. This confirms that the uncertainty structure is aligned with process physics rather than model instability.
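
To make the notion of elasticity behind the tornado diagrams concrete, the sketch below computes a local elasticity, (∂y/∂x)·(x/y), by central differences and ranks inputs by its absolute value. The response function `toy_power`, the variable names, and the base operating point are hypothetical and chosen only so that the result can be checked by hand; the authors' exact sensitivity procedure is not shown in this excerpt.

```python
def elasticity(model, base, name, rel_step=0.01):
    """Central-difference elasticity (dy/dx) * (x / y) at the base point."""
    x0 = base[name]
    h = rel_step * abs(x0)
    if h == 0:
        raise ValueError("elasticity is undefined at a zero base value")
    up = dict(base, **{name: x0 + h})
    down = dict(base, **{name: x0 - h})
    slope = (model(up) - model(down)) / (2 * h)
    return slope * x0 / model(base)


def toy_power(x):
    # Hypothetical power-law response, NOT a fitted mill model.
    # For a power law, each elasticity equals the corresponding exponent.
    return 3.0 * x["speed_rpm"] ** 2 * x["solids_pct"] ** 0.5


# Hypothetical base operating point, for illustration only.
base_point = {"speed_rpm": 9.5, "solids_pct": 70.0}

elasticities = {
    name: elasticity(toy_power, base_point, name) for name in base_point
}
# Tornado-style ordering: largest absolute elasticity first.
ranking = sorted(elasticities, key=lambda k: abs(elasticities[k]), reverse=True)

for name in ranking:
    print(f"{name}: elasticity = {elasticities[name]:.3f}")
```

An elasticity of 2 means that a 1% increase in the input raises the response by roughly 2% near the base point. Because the measure is dimensionless, inputs with different units (RPM, %, PSI) can be ranked on a single tornado axis.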
4.4. Suitability Indicators by Hydraulic Regime
4.4.1. SAG Power—Suitability by Hydraulic Regime
4.4.2. SAG Production—Suitability by Hydraulic Regime
5. Discussions
5.1. Machine Learning Performance in Industrial Context
5.2. Regime-Based Suitability Interpretation
5.3. Sensitivity Analysis and Physical Consistency
5.4. Limitations and Future Research Directions
- The analysis is based on data from a single industrial SAG mill, which limits the generalizability of performance metrics and optimal model architectures to other sites with different ore types, mill geometries, or operational practices; it should therefore be interpreted as a case study. The comparative performance observed among the evaluated models reflects the data structure, instrumentation, ore variability, and operating conditions of the analyzed plant. Accordingly, the results are not intended to establish a universal hierarchy, but rather to provide operation-specific evidence and a comparative framework that can be further tested in other SAG milling contexts. The development of transfer learning approaches that adapt models trained on one site to new locations represents a promising research direction [86].
- The ML algorithms and hyperparameters evaluated represent only a subset of available methods. Emerging techniques, such as physics-informed ANNs and hybrid models that combine phenomenological and data-driven components, warrant investigation. Deep Learning architectures designed for time series prediction may offer advantages for capturing long-term dependencies and improving predictions during transient conditions [36].
- The regime classification employed was based on hydraulic characteristics derived from pulp density and discharge flow measurements. Alternative regime definitions based on charge motion patterns may provide different insights into model performance and operational suitability [9].
- This study focused on steady-state or quasi-steady-state prediction without explicitly addressing transient dynamics, startup/shutdown procedures, or fault detection and diagnosis. Extending the modeling framework to handle dynamic transitions, predict mill overload events, and detect abnormal conditions is therefore an interesting direction for future work [87].
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Model | Final Hyperparameters/Structural Settings |
|---|---|
| MR | OLS with a second-order response surface formulation, including main effects, quadratic terms, and first-order interactions. |
| RFR | n_estimators = 1000; max_depth = None; min_samples_split = 4; min_samples_leaf = 3; max_features = 0.6; bootstrap = True; max_samples = 0.8. |
| XGB | n_estimators = 4000 (the maximum number of trees was high only as a ceiling, while the effective adjustment was controlled by early stopping); learning_rate = 0.03; max_depth = 4; min_child_weight = 10; subsample = 0.75; colsample_bytree = 0.65; reg_lambda = 6.0; reg_alpha = 0.2; gamma = 0.2; objective = “reg:squarederror”; tree_method = “hist”; early stopping with es_rounds = 100 and es_val_size = 0.15. |
| GBM | n_estimators = 1200; learning_rate = 0.03; max_depth = 4; min_samples_split = 80; min_samples_leaf = 40; subsample = 0.7; max_features = 0.6; loss = “squared_error”. |
| ANN–MLP | SAG Power: hidden_layer_sizes = (128, 32); activation = “relu”; solver = “lbfgs”; alpha = 2 × 10⁻¹; early_stopping = False; tol = 1 × 10⁻⁶; max_iter = 2000. SAG Production: hidden_layer_sizes = (128, 64); activation = “relu”; solver = “adam”; learning_rate_init = 1 × 10⁻³; alpha = 5 × 10⁻³; batch_size = 128; early_stopping = True; validation_fraction = 0.15; n_iter_no_change = 30; tol = 1 × 10⁻⁴; max_iter = 3500. The final architectures were selected empirically for each response. The goal was to optimize stability and generalization for each target, not to impose a single architecture. |
| BLR | BayesianRidge with fit_intercept = False; second-order formulation with main effects, quadratic terms, and first-order interactions; max_iter = 4000; tol = 1 × 10⁻⁵; alpha_1 = alpha_2 = 1 × 10⁻⁶; lambda_1 = lambda_2 = 1 × 10⁻⁶; compute_score = True. Posterior pruning: PRUNE_ALPHA = 0.10; PRUNE_DRAWS = 2000; KEEP_HIERARCHY = True; MIN_ABS_MEAN = 1 × 10⁻³; PRUNE_SPLIT = 0.10. |
| BART | m_trees = 400; draws = 2000; tune = 2000; target_accept = 0.95; chains = 4; cores = 4; standardize_X = True; standardize_y = True; add_noise_ppc = True. Lighter CV setup: cv_draws = 200; cv_tune = 100. |
| GBN | Predefined network structure with BART-based conditional mean functions; m_trees = 400; draws = 1000; tune = 1000; target_accept = 0.95; chains = 4; standardize_X = True; k_params_mode = m_trees + 1. Lighter CV setup: cv_m_trees = 100; cv_draws = 150; cv_tune = 150; cv_target_accept = 0.90. |
| GPR | Sparse variational GP with FITC approximation; kernel = “matern52”; ard = True; sigma_prior = 2.0; jitter = 1 × 10⁻⁶; n_inducing = 400; advi_iters = 10000; advi_lr = 1 × 10⁻²; draws_post = 200. Lighter CV setup: cv_n_inducing = 60; cv_advi_iters = 800; cv_advi_lr = 5 × 10⁻³; cv_draws_post = 50. |
| BNN | Single hidden layer with hidden_layers = (8,); activation = “tanh”; advi_iters = 12000; advi_lr = 5 × 10⁻³; draws_post = 800; add_noise_test = True. Priors: sigma_w = sigma_b ~ HalfNormal(0.5); sigma ~ HalfNormal(1.0). Sensitivity settings: sensitivity_ci_alpha = 0.10; sensitivity_draws = 500. |
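
For readers who want to reuse the tree-ensemble settings, the table rows for RFR, GBM, and XGB can be transcribed as plain keyword dictionaries, as in the sketch below. The dictionary names are hypothetical. The parameter names match those of scikit-learn and XGBoost estimators, but the table does not name the libraries, so that mapping is an assumption. The keys `es_rounds` and `es_val_size` are kept exactly as written in the table and appear to belong to the authors' own training wrapper.

```python
# Settings transcribed from the appendix table (values unchanged).
RF_PARAMS = {
    "n_estimators": 1000,
    "max_depth": None,
    "min_samples_split": 4,
    "min_samples_leaf": 3,
    "max_features": 0.6,
    "bootstrap": True,
    "max_samples": 0.8,
}

GBM_PARAMS = {
    "n_estimators": 1200,
    "learning_rate": 0.03,
    "max_depth": 4,
    "min_samples_split": 80,
    "min_samples_leaf": 40,
    "subsample": 0.7,
    "max_features": 0.6,
    "loss": "squared_error",
}

XGB_PARAMS = {
    "n_estimators": 4000,  # ceiling only; early stopping sets the effective size
    "learning_rate": 0.03,
    "max_depth": 4,
    "min_child_weight": 10,
    "subsample": 0.75,
    "colsample_bytree": 0.65,
    "reg_lambda": 6.0,
    "reg_alpha": 0.2,
    "gamma": 0.2,
    "objective": "reg:squarederror",
    "tree_method": "hist",
}

# Early-stopping settings, named as in the table (assumed wrapper arguments).
XGB_EARLY_STOPPING = {"es_rounds": 100, "es_val_size": 0.15}

for label, params in [("RFR", RF_PARAMS), ("GBM", GBM_PARAMS), ("XGB", XGB_PARAMS)]:
    print(label, params)
```

Assuming scikit-learn and XGBoost are installed, these dictionaries could be unpacked as `RandomForestRegressor(**RF_PARAMS)`, `GradientBoostingRegressor(**GBM_PARAMS)`, and `XGBRegressor(**XGB_PARAMS)`, with the early-stopping values handled separately through a held-out validation split.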
References
- Baawuah, E.; Kelsey, C.; Addai-Mensah, J.; Skinner, W. Comparison of the Performance of Different Comminution Technologies in Terms of Energy Efficiency and Mineral Liberation. Miner. Eng. 2020, 156, 106454. [Google Scholar] [CrossRef]
- Salazar, J.L.; Magne, L.; Acuña, G.; Cubillos, F. Dynamic Modelling and Simulation of Semi-Autogenous Mills. Miner. Eng. 2009, 22, 70–77. [Google Scholar] [CrossRef]
- Rybinski, E.; Ghersi, J.; Davila, F.; Linares, J.; Valery, W.; Jankovic, A.; Valle, R.; Dikmen, S. Optimisation and Continuous Improvement of Antamina Comminution Circuit. In Proceedings of the SAG Conference, Vancouver, BC, Canada, 25 November 2011; pp. 1–19. [Google Scholar]
- Beloglazov, I.I.; Petrov, P.A.; Bazhin, V.Y. The Concept of Digital Twins for Tech Operator Training Simulator Design for Mining and Processing Industry. Eurasian Min. 2020, 2020, 50–54. [Google Scholar] [CrossRef]
- Hasidi, O.; Abdelwahed, E.H.; Qazdar, A.; Boulaamail, A.; Krafi, M.; Benzakour, I.; Bourzeix, F.; Baïna, S.; Baïna, K.; Cherkaoui, M.; et al. Digital Twins-Based Smart Monitoring and Optimisation of Mineral Processing Industry. Commun. Comput. Inf. Sci. CCIS 2022, 1677, 411–424. [Google Scholar] [CrossRef]
- Avalos, S.; Kracht, W.; Ortiz, J.M. Machine Learning and Deep Learning Methods in Mining Operations: A Data-Driven SAG Mill Energy Consumption Prediction Application. Min. Metall. Explor. 2020, 37, 1197–1212. [Google Scholar] [CrossRef]
- Gutiérrez, A.; Ahues, D.; González, F.; Merino, P. Simulation of Material Transport in a SAG Mill with Different Geometric Lifter and Pulp Lifter Attributes Using DEM. Min. Metall. Explor. 2019, 36, 431–440. [Google Scholar] [CrossRef]
- Weerasekara, N.S.; Powell, M.S. Performance Characterisation of AG/SAG Mill Pulp Lifters Using CFD Techniques. Miner. Eng. 2014, 63, 118–124. [Google Scholar] [CrossRef]
- Lopez, P.; Reyes, I.; Risso, N.; Momayez, M.; Zhang, J. Machine Learning Algorithms for Semi-Autogenous Grinding Mill Operational Regions’ Identification. Minerals 2023, 13, 1360. [Google Scholar] [CrossRef]
- Feng, Y.; Wang, X.; Zou, H.; Yan, L. A Composite Power Prediction Model for Semi-au-togenous Grinding Mill Based on Mechanistic Approach and XGBoost. In Proceedings of the 37th Chinese Control and Decision Conference, CCDC, Xiamen, China, 16–19 May 2025; pp. 557–564. [Google Scholar] [CrossRef]
- Ghasemi, Z.; Neshat, M.; Aldrich, C.; Karageorgos, J.; Zanin, M.; Neumann, F.; Chen, L. An Integrated Intelligent Framework for Maximising SAG Mill Throughput: Incorporating Expert Knowledge, Machine Learning and Evolutionary Algorithms for Parameter Optimisation. Miner. Eng. 2024, 212, 108733. [Google Scholar] [CrossRef]
- Lopez, P.; Reyes, I.; Risso, N.; Aguilera, C.; Campos, P.G.; Momayez, M.; Contreras, D. Assessing Machine Learning and Deep Learning-Based Approaches for SAG Mill Energy Consumption. In 2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2021; IEEE: Piscataway, NJ, USA, 2021. [Google Scholar] [CrossRef]
- Liao, Z.; Xu, C.; Chen, W.; Chen, Q.; Wang, F.; She, J. Effective Throughput Optimization of SAG Milling Process Based on BPNN and Genetic Algorithm. In Proceedings—2023 IEEE 6th International Conference on Industrial Cyber-Physical Systems, ICPS 2023; IEEE: Piscataway, NJ, USA, 2023. [Google Scholar] [CrossRef]
- Valencia, J.V.; Vargas, F. A Probabilistic Graphical Model for Semi-Autogenous Grinding Processes. In Proceedings—IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, ChileCon; IEEE: Piscataway, NJ, USA, 2023. [Google Scholar] [CrossRef]
- Jana, D.; Kumar, D.; Gupta, S.; Pal, S.; Ghosh, S. Bayesian Network Approach for Studying the Operational Reliability and Remaining Useful Life. J. Reliab. Stat. Stud. 2023, 16, 373–392. [Google Scholar] [CrossRef]
- Lucay, F.A. Accelerating Global Sensitivity Analysis via Supervised Machine Learning Tools: Case Studies for Mineral Processing Models. Minerals 2022, 12, 750. [Google Scholar] [CrossRef]
- Salazar, J.L.; Valdés-González, H.; Cubillos, F. Advanced Simulation for Semi-Autogenous Mill Systems: A Simplified Models Approach. In Dynamic Modelling; InTech: London, UK, 2010. [Google Scholar] [CrossRef]
- Saldaña, M.; Gálvez, E.; Navarra, A.; Toro, N.; Cisternas, L.A. Optimization of the SAG Grinding Process Using Statistical Analysis and Machine Learning: A Case Study of the Chilean Copper Mining Industry. Materials 2023, 16, 3220. [Google Scholar] [CrossRef] [PubMed]
- Ghasemi, Z.; Neumann, F.; Zanin, M.; Karageorgos, J.; Chen, L. A Comparative Study of Prediction Methods for Semi-Autogenous Grinding Mill Throughput. Miner. Eng. 2024, 205, 108458. [Google Scholar] [CrossRef]
- Mamani-quiñonez, O.; Cisternas, L.A.; Lopez-arenas, T.; Lucay, F.A. Control Structure Design Using Global Sensitivity Analysis for Mineral Processes under Uncertainties. Minerals 2022, 12, 736. [Google Scholar] [CrossRef]
- Välikangas, H.; Ohenoja, M.; Brochot, S.; Fernández, M.G.; Ruuska, J.; Ruusunen, M. Evaluation of Model Uncertainty Propagation in Mineral Process Flowsheet Designs. Scand. Simul. Soc. 2025, 211, 456–463. [Google Scholar] [CrossRef]
- Yuwen, C.; Sun, B.; Liu, S. A Dynamic Model for a Class of Semi-Autogenous Mill Systems. IEEE Access 2020, 8, 98460–98470. [Google Scholar] [CrossRef]
- Ghasemi, Z.; Neshat, M.; Aldrich, C.; Karageorgos, J.; Zanin, M.; Neumann, F.; Chen, L. Enhanced Genetic Programming Models with Multiple Equations for Accurate Semi-Autogenous Grinding Mill Throughput Prediction. In 2024 IEEE Congress on Evolutionary Computation, CEC 2024—Proceedings; IEEE: Piscataway, NJ, USA, 2024. [Google Scholar] [CrossRef]
- Jayasundara, C.T.; Zhu, H.P. Predicting Liner Wear of Ball Mills Using Discrete Element Method and Artificial Neural Network. Chem. Eng. Res. Des. 2022, 182, 438–447. [Google Scholar] [CrossRef]
- Gupta, A.; Mishra, B.K. Multi-Head Neural Networks for Simulating Particle Breakage Dynamics. Theor. Appl. Mech. Lett. 2024, 14, 100515. [Google Scholar] [CrossRef]
- Lu, S.; Zhou, P.; Chai, T.; Dai, W. Modeling and Simulation of Whole Ball Mill Grinding Plant for Integrated Control. IEEE Trans. Autom. Sci. Eng. 2014, 11, 1004–1019. [Google Scholar] [CrossRef]
- Dai, W.; Liu, Q.; Chai, T. Particle Size Estimate of Grinding Processes Using Random Vector Functional Link Networks with Improved Robustness. Neurocomputing 2015, 169, 361–372. [Google Scholar] [CrossRef]
- Li, Y. Modelling Tumbling Ball Milling Based on DEM Simulation and Machine Learning. Ph.D. Thesis, The University of New South Wales, Sydney, Australia, 2023. [Google Scholar]
- Jayasundara, C.T.; Zhu, H.P. Impact Energy of Particles in Ball Mills Based on DEM Simulations and Data-Driven Approach. Powder Technol. 2022, 395, 226–234. [Google Scholar] [CrossRef]
- Doroszuk, B. Data-Driven Insight into Ball Mill Scaling Unveiling Differences Across Scales Through Computer Vision, Numerical Simulations, and Design of Experiments. Ph.D. Thesis, Wroclaw University of Science and Technology, Wroclaw, Poland, 2024. [Google Scholar]
- Rhein, F.; Hibbe, L.; Nirschl, H. Hybrid Modeling of Hetero-Agglomeration Processes: A Framework for Model Selection and Arrangement. Eng. Comput. 2023, 40, 583–604. [Google Scholar] [CrossRef]
- Lu, M.; Xia, Y.; Bhattacharjee, T.; Klinger, J.; Li, Z. Predicting Biomass Comminution: Physical Experiment, Population Balance Model, and Deep Learning. Powder Technol. 2024, 441, 119830. [Google Scholar] [CrossRef]
- Yang, J.; Zou, G.; Zhou, J.; Wang, Q.; Song, T.; Li, K. Hybrid Modeling and Simulation of the Grinding and Classification Process Driven by Multi-Source Compensation. Minerals 2024, 14, 1019. [Google Scholar] [CrossRef]
- Metta, N.; Ramachandran, R.; Ierapetritou, M. A Computationally Efficient Surrogate-Based Reduction of a Multiscale Comill Process Model. J. Pharm. Innov. 2020, 15, 424–444. [Google Scholar] [CrossRef]
- Ruiz, M.A.V.; Gonzales, J.A.V.; Villalba, F.J.B. Multivariable Predictive Models for the Estimation of Power Consumption (KW) of a Semi-Autogenous Mill Applying Machine Learning Algorithms. J. Energy Environ. Sci. 2024, 8, 14–31. [Google Scholar] [CrossRef]
- Zhang, D.; Xiong, X.; Shao, C.; Zeng, Y.; Ma, J. Semi-Autogenous Mill Power Consumption Prediction Based on CACN-LSTM. Appl. Sci. 2024, 15, 2. [Google Scholar] [CrossRef]
- Pural, Y.E.; Ledezma, T.; Hilden, M.; Forbes, G.; Boylu, F.; Yahyaei, M. Application of Machine Learning for Generic Mill Liner Wear Prediction in Semi-Autogenous Grinding (SAG) Mills. Minerals 2024, 14, 1200. [Google Scholar] [CrossRef]
- Harenberg, D.; Marelli, S.; Sudret, B.; Winschel, V. Uncertainty Quantification and Global Sensitivity Analysis for Economic Models. Quant. Econom. 2019, 10, 1–41. [Google Scholar] [CrossRef]
- Gao, H.; Huo, X.; Zhu, C. An Input Module of Deep Learning for the Analysis of Time Series with Unequal Length. In Proceedings—2023 IEEE 6th International Conference on Industrial Cyber-Physical Systems, ICPS 2023; IEEE: Piscataway, NJ, USA, 2023. [Google Scholar] [CrossRef]
- Guillermo, A. Mineral Movement Simulation through the Grates and Pulp Lifter in a SAG Mill and Evaluation for a New Grate Design Using DEM. Physicochem. Probl. Miner. Process. 2019, 55, 617–630. [Google Scholar] [CrossRef]
- Kuk, M.; Bobek, S.; Veloso, B.; Rajaoarisoa, L.; Nalepa, G.J. Feature Importances as a Tool for Root Cause Analysis in Time-Series Events. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), LNCS; Springer: Cham, Switzerland, 2023; Volume 14077, pp. 408–416. [Google Scholar] [CrossRef]
- James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer Texts in Statistics; Springer US: New York, NY, USA, 2021. [Google Scholar]
- Kuhn, M.; Johnson, K. Feature Engineering and Selection: A Practical Approach for Predictive Models; Chapman and Hall/CRC: New York, NY, USA, 2019. [Google Scholar] [CrossRef]
- Biau, G.; Scornet, E. A Random Forest Guided Tour. TEST 2016, 25, 197–227. [Google Scholar] [CrossRef]
- Scornet, E.; Biau, G.; Vert, J.-P. Consistency of Random Forests. Ann. Stat. 2015, 43, 1716–1741. [Google Scholar] [CrossRef]
- Belgiu, M.; Drăguţ, L. Random Forest in Remote Sensing: A Review of Applications and Future Directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [Google Scholar] [CrossRef]
- Probst, P.; Wright, M.N.; Boulesteix, A.L. Hyperparameters and Tuning Strategies for Random Forest. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1301. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. XGBoost. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; von Luxburg, U., Guyon, I., Bengio, S., Wallach, H., Fergus, R., Eds.; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 3149–3157. [Google Scholar]
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased Boosting with Categorical Features. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, QC, Canada, 3–8 December 2018; Bengio, S., Wallach, H.M., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Eds.; Curran Associates Inc.: Red Hook, NY, USA, 2018; pp. 6639–6649. [Google Scholar]
- Natekin, A.; Knoll, A. Gradient Boosting Machines, a Tutorial. Front. Neurorobot. 2013, 7, 63623. [Google Scholar] [CrossRef]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
- Hornik, K.; Stinchcombe, M.; White, H. Multilayer Feedforward Networks Are Universal Approximators. Neural Netw. 1989, 2, 359–366. [Google Scholar] [CrossRef]
- Wu, Y.C.; Feng, J.W. Development and Application of Artificial Neural Network. Wirel. Pers. Commun. 2018, 102, 1645–1656. [Google Scholar] [CrossRef]
- Murphy, K.P. Machine Learning: A Probabilistic Perspective; MIT Press: Cambridge, MA, USA, 2012. [Google Scholar]
- Gelman, A.; Carlin, J.B.; Stern, H.S.; Dunson, D.B.; Vehtari, A.; Rubin, D.B. Bayesian Data Analysis; Chapman and Hall/CRC: New York, NY, USA, 2013. [Google Scholar] [CrossRef]
- Chipman, H.A.; George, E.I.; McCulloch, R.E. BART: Bayesian Additive Regression Trees. Ann. Appl. Stat. 2010, 4, 266–298. [Google Scholar] [CrossRef]
- Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis, 6th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2021. [Google Scholar]
- Hill, J.; Linero, A.; Murray, J. Bayesian Additive Regression Trees: A Review and Look Forward. Annu. Rev. Stat. Appl. 2020, 7, 251–278. [Google Scholar] [CrossRef]
- Kapelner, A.; Bleich, J. BartMachine: Machine Learning with Bayesian Additive Regression Trees. J. Stat. Softw. 2016, 70, 1–40. [Google Scholar] [CrossRef]
- Scutari, M.; Denis, J.-B. Bayesian Networks; Chapman and Hall/CRC: Boca Raton, FL, USA, 2021. [Google Scholar]
- Koller, D.; Friedman, N. Probabilistic Graphical Models: Principles and Techniques; MIT Press: Cambridge, MA, USA, 2009. [Google Scholar]
- Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; The MIT Press: Cambridge, MA, USA, 2005. [Google Scholar]
- Lloyd, C.; Gunter, T.; Osborne, M.; Roberts, S. Variational Inference for Gaussian Process Modulated Poisson Processes. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015; Bach, F., Blei, D., Eds.; JMLR: Norfolk, MA, USA, 2015; pp. 1814–1822. [Google Scholar]
- Hensman, J.; Matthews, A.G.; Ghahramani, Z. Scalable Variational Gaussian Process Classification. In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, San Francisco, CA, USA, 9–12 May 2015; JMLR: Norfolk, MA, USA, 2015; Volume 38, pp. 351–360. [Google Scholar]
- Burt, D.R.; Rasmussen, C.E.; Van Der Wilk, M. Convergence of Sparse Variational Inference in Gaussian Processes Regression. J. Mach. Learn. Res. 2020, 21, 1–63. [Google Scholar]
- Zhang, C.; Butepage, J.; Kjellstrom, H.; Mandt, S. Advances in Variational Inference. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 2008–2026. [Google Scholar] [CrossRef] [PubMed]
- Neal, R.M. Bayesian Learning for Neural Networks; Lecture Notes in Statistics; Springer New York: New York, NY, USA, 1996; Volume 118. [Google Scholar]
- Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2013, arXiv:1312.6114. [Google Scholar] [CrossRef]
- Blundell, C.; Cornebise, J.; Kavukcuoglu, K.; Wierstra, D. Weight Uncertainty in Neural Networks. arXiv 2015, arXiv:1505.05424. [Google Scholar] [CrossRef]
- Gal, Y.; Ghahramani, Z. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. arXiv 2016, arXiv:1506.02142. [Google Scholar] [CrossRef]
- Zagayevskiy, Y.; Deutsch, C.V. A Methodology for Sensitivity Analysis Based on Regression: Applications to Handle Uncertainty in Natural Resources Characterization. Nat. Resour. Res. 2015, 24, 239–274. [Google Scholar] [CrossRef]
- Pandey, B.P.; Mishra, D.P. Developing an Alternate Mineral Transportation System by Evaluating Risk of Truck Accidents in the Mining Industry—A Critical Fuzzy DEMATEL Approach. Sustainability 2023, 15, 6409. [Google Scholar] [CrossRef]
- Lu, H.; Peng, Y.; Cao, S.; Zhu, Z. Parameter Sensitivity Analysis and Probabilistic Optimal Design for the Main-Shaft Device of a Mine Hoist. Arab. J. Sci. Eng. 2019, 44, 971–979. [Google Scholar] [CrossRef]
- Hou, J.; Nie, G.; Li, G.; Zhao, W.; Sheng, B. Optimization of Branch Airflow Volume for Mine Ventilation Network Based on Sensitivity Matrix. Sustainability 2023, 15, 12427. [Google Scholar] [CrossRef]
- Bakhtavar, E.; Saberi, S.; Hu, G.; Sadiq, R.; Hewage, K. Fuzzy Cognitive-Based Goal Programming for Waste Rock Management with in-Pit Dumping Priority: Towards Sustainable Mining. Resour. Policy 2023, 86, 104095. [Google Scholar] [CrossRef]
- Lucay, F.; Cisternas, L.A.; Gálvez, E.D. Global Sensitivity Analysis for Identifying Critical Process Design Decisions. Chem. Eng. Res. Des. 2015, 103, 74–83. [Google Scholar] [CrossRef]
- Ghaffari, A.; Hayati, M.; Shekholeslami, A. Probability and Sensitivity Analysis in Flotation Circuit of Bama Lead and Zinc Processing Plant Using Monte Carlo Simulation Method. Miner. Process. Extr. Metall. Rev. 2012, 33, 416–426. [Google Scholar] [CrossRef]
- Montgomery, D.C.; Runger, G.C. Applied Statistics and Probability for Engineers, 6th ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2014. [Google Scholar]
- Kazemi, P.; Khalid, M.H.; Szlek, J.; Mirtič, A.; Reynolds, G.K.; Jachowicz, R.; Mendyk, A. Computational Intelligence Modeling of Granule Size Distribution for Oscillating Milling. Powder Technol. 2016, 301, 1252–1258. [Google Scholar] [CrossRef]
- Silva, M.; Casali, A. Modelling SAG Milling Power and Specific Energy Consumption Including the Feed Percentage of Intermediate Size Particles. Miner. Eng. 2015, 70, 156–161. [Google Scholar] [CrossRef]
- Li, L.; Chang, J.; Vakanski, A.; Wang, Y.; Yao, T.; Xian, M. Uncertainty Quantification in Multivariable Regression for Material Property Prediction with Bayesian Neural Networks. Sci. Rep. 2024, 14, 10543. [Google Scholar] [CrossRef]
- Wani, O.; Beckers, J.V.L.; Weerts, A.H.; Solomatine, D.P. Residual Uncertainty Estimation Using Instance-Based Learning with Applications to Hydrologic Forecasting. Hydrol. Earth Syst. Sci. 2017, 21, 4021–4036. [Google Scholar] [CrossRef]
- Garcia-Cardona, C.; Scheinker, A. Machine Learning Surrogate for Charged Particle Beam Dynamics with Space Charge Based on a Recurrent Neural Network with Aleatoric Uncertainty. Phys. Rev. Accel. Beams 2024, 27, 024601. [Google Scholar] [CrossRef]
- Collis, J.; Connor, A.J.; Paczkowski, M.; Kannan, P.; Pitt-Francis, J.; Byrne, H.M.; Hubbard, M.E. Bayesian Calibration, Validation and Uncertainty Quantification for Predictive Modelling of Tumour Growth: A Tutorial. Bull. Math. Biol. 2017, 79, 939–974. [Google Scholar] [CrossRef]
- Li, Y.; Bao, J.; Chen, T.; Yu, A.; Yang, R. Prediction of Ball Milling Performance by a Convolutional Neural Network Model and Transfer Learning. Powder Technol. 2022, 403, 117409. [Google Scholar] [CrossRef]
- Hermosilla, R.; Valle, C.; Allende, H.; Aguilar, C.; Lucic, E. SAG’s Overload Forecasting Using a CNN Physical Informed Approach. Appl. Sci. 2024, 14, 11686. [Google Scholar] [CrossRef]
- Coetzee, L.C.; Craig, I.K.; Kerrigan, E.C. Robust Nonlinear Model Predictive Control of a Run-of-Mine Ore Milling Circuit. IEEE Trans. Control Syst. Technol. 2010, 18, 222–229. [Google Scholar] [CrossRef]
- Saldana, M.; Gálvez, E.; Sales-Cruz, M.; Salinas-Rodríguez, E.; Castillo, J.; Navarra, A.; Toro, N.; Arias, D.; Cisternas, L.A. A Stochastic Model Approach for Modeling SAG Mill Production and Power Through Bayesian Networks: A Case Study of the Chilean Copper Mining Industry. Minerals 2026, 16, 60. [Google Scholar] [CrossRef]

| Dim. | Deterministic/ Phenomenological | ML Models | Stochastic/Probabilistic Models |
|---|---|---|---|
| Typical data needs | Mechanistic parameters, breakage/selection kernels, PSDs, feed fractions, mill geometry; calibration with sampling campaigns and NorBal/MODSIM style data is common [17] | Large, labeled SCADA/plant time series (feed tonnage, power, speed, inlet water, bearing pressure, PSD); studies used 20k+ records and explicit feature selection/delays [9,12,18,19] | Moderate data + expert priors; can combine sparse SCADA data and domain priors to estimate conditional probabilities [14,15] |
| Uncertainty handling | Uncertainty via Monte Carlo on parameters and UA/GSA using Sobol, with surrogates for speed up [16,20,21] | Usually point predictions; uncertainty via ensembles, quantile models or probabilistic wrappers (not widespread in reviewed SAG ML papers) [12,19] | Native probabilistic outputs, causal queries and RUL/reliability estimation; supports decision thresholds and scenario evidence updates [14,15] |
| Interpretability | High: physics-based parameters, PSD evolution and mechanistic insight; useful for design and “what-if” simulations [17,22] | Variable: GP and linear models give formulas and feature importances; deep nets require XAI (SHAP/LIME) for explanations [9,23] | High for structure (graph edges) and conditional relationships; enables causal reasoning if graph is well specified [14,15] |
| Runtime (relative) | Slow to very slow: DEM/DEM–CFD high computational cost; population balance dynamic simulators moderate but heavier than ML in closed-loop use [7,17] | Low-to-moderate inference times: LSTM/RNN suitable for real-time energy/throughput prediction; training costs higher but amortized [6,12,19] | Low for BN inference once learned; Monte Carlo propagation with detailed plant simulators is expensive but surrogates speed it up [14,16,21] |
| Deployment considerations | Best for control design, digital twins, “offline” optimization and engineering studies; needs online parameter estimation or periodic re-calibration for drift [17,22] | Proven deployable in real plants for energy/throughput forecasting (RNN/LSTM); require robust feature pipelines, delay handling and drift detection [12,18,19] | Well-suited for decision support, alarms and maintenance (RUL), and probabilistic supervisory controllers; BN requires whitelist/blacklist and priors during construction [14,15] |
| Typical Applications | Design studies, what-if simulations, engineering analysis | ML static: real-time energy/throughput prediction, process optimization; ML temporal: time series forecasting, dynamic process control | Decision support, reliability analysis, probabilistic control |

| Metric Cat. | Metric Name | Typical Range (SAG) | Model Types Used | Interpretation | Ref. |
|---|---|---|---|---|---|
| Regression Accuracy | RMSE | 5%–15% of mean throughput | All model types | Lower is better; measures average prediction error magnitude | [6,12,19] |
| | MAE | 3%–12% of mean throughput | All model types | Lower is better; measures average absolute error | [18,23,35] |
| | R² | 0.75–0.95 for good models | All model types | Higher is better; proportion of variance explained | [6,35,36] |
| | MAPE | 5%–20% typical range | ML and stochastic | Lower is better; percentage error measure | [18,19,37] |
| Classification Performance | ROC-AUC | 0.8–0.95 for operational states | ML classification | Higher is better; discrimination ability | [9,36] |
| | MCC | −1 to +1 range | Classification models | Higher is better; balanced classification measure | [9,23] |
| | Cohen Kappa | 0.6–0.9 for good agreement | Classification models | Higher is better; agreement beyond chance | [9,35] |
| Probabilistic Calibration | Brier Score | 0.1–0.3 typical | Probabilistic models | Lower is better; measures calibration quality | [14,15] |
| | ECE | <0.1 for well-calibrated | Probabilistic models | Lower is better; expected calibration error | [14,21] |
| | Reliability Curves | Visual assessment | Probabilistic models | Diagonal line indicates perfect calibration | [14,21] |
| UQ | Coverage Probability | 0.9–0.95 for 95% intervals | Stochastic models | Should match nominal level | [20,21,38] |
| | Interval Width | Context dependent | Stochastic models | Narrower intervals preferred if coverage maintained | [16,20,21] |
| Sensitivity Analysis | Sobol First Order | 0–1 for each variable | All with SA | Higher indicates more important variable | [16,20,21] |
| | Sobol Total Order | 0–1 for each variable | All with SA | Includes interaction effects | [16,20,21] |
| | Morris Mu Star | Variable dependent | All with SA | Screening measure for variable importance | [16,20] |
| | Morris Sigma | Variable dependent | All with SA | Measures interaction/nonlinearity effects | [16,20] |
| Operational Performance | Energy Savings | 5%–15% typical improvements | All model types | Higher is better; kWh/t reduction | [12,18,39] |
| | Throughput Uplift | 3%–10% typical improvements | All model types | Higher is better; increased production | [18,39] |
| | Availability Improvement | 2%–8% typical | Predictive models | Higher is better; reduced downtime | [14,15] |
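
As a hedged illustration of the regression metrics listed in the table above, the following minimal Python sketch computes RMSE, MAE, MAPE, and R² from paired observations and predictions. The function name `regression_metrics` and the toy values are assumptions introduced here for illustration; they are not taken from the study.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in %) and R2 (as a fraction) for two 1-D arrays.

    Hypothetical helper for illustration; assumes y_true contains no zeros,
    since MAPE divides by the observed values.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = float(np.sqrt(np.mean(err ** 2)))            # root mean square error
    mae = float(np.mean(np.abs(err)))                   # mean absolute error
    mape = float(100.0 * np.mean(np.abs(err / y_true)))  # mean absolute % error
    ss_res = np.sum(err ** 2)                           # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)      # total sum of squares
    r2 = float(1.0 - ss_res / ss_tot)

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


if __name__ == "__main__":
    # Toy values only (not plant data).
    observed = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    print(regression_metrics(observed, predicted))
```

Because RMSE weights large errors more heavily than MAE, RMSE is never smaller than MAE on the same data, which gives a quick sanity check when reading tabulated results.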

| Variable Category | Variable Name | Data Treatment | Model Usage | Sensitivity Ranking |
|---|---|---|---|---|
| Feed Characteristics | Ore Hardness | Laboratory correlation with geology | All | High: primary driver |
| | Feed Rate | Moving averages to smooth | All | High: direct throughput relationship |
| | P80 Feed | Size analysis correlation | All | High: affects grinding efficiency |
| | PSD Fractions | Discrete bins for modeling | Det/Stoch | Medium: detailed breakage modeling |
| Operational Parameters | Mill Speed | Critical speed percentage | All | High: affects grinding action |
| | Water Addition | Flow control validation | All | Medium: affects pulp density |
| | Percent Solids | Density meter correlation | All | High: critical for transport |
| | Mill Power | Power meter readings | All | High: energy efficiency target |
| Mill Condition | Liner Age | Maintenance tracking | ML/Stoch | Medium: wear progression |
| | Liner Profile | Survey measurements | Det | Medium: affects charge motion |
| | Bearing Pressure | Pressure transducers | ML/Stoch | Low: condition monitoring |
| | Vibration Level | Accelerometer data | ML/Stoch | Low: condition monitoring |
| Product Characteristics | Product P80 | Cyclone overflow sizing | All | High: product quality target |
| | Cyclone Pressure | Pressure measurement | All | Medium: classification efficiency |
| | Sump Level | Level transmitter | All | Medium: inventory control |
| Pebble Handling | Pebble Rate | Conveyor scales | All | Medium: circuit balance |
| | Pebble Size | Size analysis | Det | Low: detailed modeling |
| Environmental | Ambient T | Weather station | ML | Low: seasonal effects |
| | Ore Moisture | Laboratory analysis | All | Low: feed preparation |
| Derived Variables | Specific Energy | Power/throughput ratio | All | High: efficiency metric |
| | Mill Filling | Load cell or model-based | Det/ML | Medium: charge dynamics |
| | Charge Motion | DEM/video analysis | Det | Medium: grinding mechanism |
| Uncertainty Factors | Feed Variability | Statistical process control | Stoch | High: uncertainty source |
| | Equipment Drift | Condition monitoring | ML/Stoch | Medium: model degradation |
| | Measurement Error | Calibration records | All | Low: data quality |

| Model | Class | Advantages | Limitations | Data/Requirements | Additional Relevant Information |
|---|---|---|---|---|---|
| OLS (Ordinary Least Squares) | Multivariate ML | Handles collinearity; useful with many predictors and few responses | Moderate interpretability; linear by default | Process signal matrix X; standardization recommended | Useful for inferring latent variables (e.g., pulp density and cut size) |
| Random Forest | ML (tree ensemble) | Robust to noise/outliers; provides variable importance; limited tuning required | Point predictions (not natively probabilistic) | Large, labeled dataset (TpH/MW); basic missing data handling | Good balance between Acc and interpretability; SHAP enhances explainability |
| XGBoost/GBM | ML (boosting) | High accuracy; strong handling of interactions and nonlinearity; fast inference | Risk of overfitting without validation; sensitive hyperparameter tuning | Large datasets; well-defined features; stratified/temporal validation | Often state-of-the-art for tabular data; combine with calibration if probabilities are used |
| ANN (MLP) | ML (neural network) | Approximates highly nonlinear functions; strong predictive performance | Black box behavior; requires more data; tuning and regularization needed | Abundant, normalized data; temporal splits; early stopping | Achieved highest R2; complement with SHAP/permutation importance for interpretability |
| Bayesian Linear Regression (BLR) | Probabilistic parametric ML | Incorporates uncertainty and regularization via priors; interpretable | Assumes linearity; requires appropriate prior specification | Standardized matrix X; defined priors and likelihood | Provides credibility intervals and probabilistic metrics |
| Bayesian Additive Regression Tree (BART) | Probabilistic nonparametric ML | Captures nonlinearities and interactions; posterior uncertainty | Lower global interpretability; higher computational cost | Sufficient data; hyperparameter tuning; optional standardization | Delivers credible intervals and strong predictive robustness |
| Gaussian Bayesian Network (GBN) | Continuous probabilistic graphical model | Represents causal dependencies; quantifies joint uncertainty | Assumes Gaussian and linear relationships; structure sensitive to data | Continuous variables; structure learning or expert-defined graph; approximate normality | Enables conditional inference and probabilistic causal analysis |
| Gaussian Process Regression (GPR) | Probabilistic (nonparametric) | Prediction with uncertainty bands; suitable for probabilistic MPC | O(n³) training cost; challenging with large datasets | Subsampling or approximations; relevant feature selection | Promising for predictive control with confidence intervals |
| Bayesian Neural Network (BNN) | Stochastic (probabilistic graphical model) | Handles incomplete evidence; quantifies uncertainty; interpretable DAG | Requires discretization; may blur intermediate classes; needs priors/whitelists | Discretized variables using physically meaningful thresholds; ESS/prior specification | Well suited for “what-if” diagnostics and decision support under uncertainty |

| Model | Comparison Strategies (What to Evaluate) | Key Metrics/Statistics | Analysis Outputs |
|---|---|---|---|
| MR | Baseline linear performance; bias–variance trade-off; multicollinearity effects | RMSE, MAE, and R2, Adjusted R2; AIC/BIC; VIF; residual diagnostics | Parity plot; residual plots; coefficient table; influence diagnostics |
| RF | Accuracy vs. boosting; robustness to noise; variance reduction | RMSE, MAE, R2; OOB error; permutation importance; SHAP | SHAP |
| XGBoost; GBM | State-of-the-art tabular accuracy; overfitting risk; learning dynamics | RMSE, MAE, and R2; cross-val error; SHAP; learning curves; early stopping metrics | Parity plot; SHAP importance; learning curves; residual distribution |
| ANN (MLP) | Nonlinearity gain vs. interpretability; regularization effectiveness | RMSE, MAE, and R2; validation loss; early stopping; SHAP (Kernel/Deep) | Train/validation loss curves; SHAP summary; parity plot |
| BLR | Parametric uncertainty quantification; comparison vs. OLS | RMSE, MAE, and R2; posterior intervals; coverage probability; MPIW; NLPD; WAIC/LOO | Predictive intervals; coverage vs. nominal plot; posterior coefficient distributions |
| BART | Nonlinear probabilistic performance; uncertainty calibration | RMSE, MAE, and R2; Coverage; MPIW; NLPD; posterior inclusion proportions | Prediction with credible bands; calibration plot; variable inclusion summary |
| GBN | Conditional dependency modeling; joint uncertainty propagation | RMSE, MAE, R2 (continuous nodes); log-likelihood; entropy; KL divergence | Graph structure; conditional probability flows; sensitivity maps |
| GPR | Predictive uncertainty vs. computational cost; kernel sensitivity | RMSE, MAE, and R2; Coverage probability; average band width; log-marginal likelihood; NLPD | Prediction with confidence bands; calibration curve; kernel diagnostics |
| BNN | Deep nonlinear uncertainty modeling; robustness under regime shifts | RMSE, MAE, and R2; Coverage; MPIW; predictive entropy; ELBO; NLPD | Predictive distribution plots; uncertainty vs. error analysis; calibration curves |

| Scope | What to Evaluate | Metrics | Outputs |
|---|---|---|---|
| Global robustness and stability | Sensitivity to data splits, noise, and hydraulic regimes | , , and ; bootstrap stability; Sobol/Morris (ML); MC simulations (Bayesian models) | Stability heatmap; radar chart (accuracy–robustness–calibration–interpretability); suitability matrix by operating regime |

| Factor | Variable(s) | Suggested Range | Cut-Off Criterion |
|---|---|---|---|
| Mill Hydraulics | % solids, Water flow rate, Pressure | Diluted | % solids < P33, with water flow rate > P67 and/or bearing pressure < P33 |
| | | Optimal | P33 ≤ % solids ≤ P67, with P33 ≤ water flow rate ≤ P67 and P33 ≤ bearing pressure ≤ P67 |
| | | Thick | % solids > P67, with water flow rate < P33 and/or bearing pressure > P67 |
| Ore Hardness | Hardness (A × b or proxy) | Low | <P33 |
| | | Medium | P33–P67 |
| | | High | >P67 |
| Mechanical Condition | Liner age (months) | New | [0, 3) |
| | | Intermediate | [3, 6) |
| | | Worn | [6, +∞) |
| Internal Load/Transport | Pebbles, Sump level | Low | both variables < P33 |
| | | Medium | intermediate combinations/at least one variable between P33 and P67 |
| | | High | both variables > P67 |
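
The percentile cut-offs for the mill hydraulics factor in the table above can be sketched in code. The following is a minimal Python illustration, assuming that P33 and P67 are the empirical 33rd and 67th percentiles of each variable over the data set; the function name `hydraulic_regime`, the `unclassified` label for records that meet none of the three criteria, and the toy inputs are assumptions introduced here, not part of the source.

```python
import numpy as np


def hydraulic_regime(solids, water, pressure):
    """Label each record 'diluted', 'optimal', 'thick' or 'unclassified'.

    Hypothetical helper: thresholds are the empirical P33/P67 of each input.
    """
    solids = np.asarray(solids, dtype=float)      # % solids
    water = np.asarray(water, dtype=float)        # water flow rate
    pressure = np.asarray(pressure, dtype=float)  # bearing pressure

    s33, s67 = np.percentile(solids, [33, 67])
    w33, w67 = np.percentile(water, [33, 67])
    p33, p67 = np.percentile(pressure, [33, 67])

    # Diluted: low % solids with high water flow and/or low bearing pressure.
    diluted = (solids < s33) & ((water > w67) | (pressure < p33))
    # Thick: high % solids with low water flow and/or high bearing pressure.
    thick = (solids > s67) & ((water < w33) | (pressure > p67))
    # Optimal: all three variables inside their central band.
    optimal = (
        (solids >= s33) & (solids <= s67)
        & (water >= w33) & (water <= w67)
        & (pressure >= p33) & (pressure <= p67)
    )

    # Records matching none of the criteria keep the assumed fallback label.
    labels = np.full(solids.shape, "unclassified", dtype=object)
    labels[diluted] = "diluted"
    labels[optimal] = "optimal"
    labels[thick] = "thick"
    return labels


if __name__ == "__main__":
    # Toy inputs only: solids and pressure rise while water flow falls.
    solids = np.arange(1, 10)
    water = solids[::-1]
    pressure = np.arange(1, 10)
    print(hydraulic_regime(solids, water, pressure))
```

The three masks are mutually exclusive because their % solids conditions do not overlap, so the order of assignment does not affect the result.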

| Set | Resp. | Model | RMSE | MAE | R² | MAPE | * |
|---|---|---|---|---|---|---|---|
| Train | SAG Production | MR | 108.87 | 68.67 | 95.85 | 2.18 | 0.08 |
| | | RF | 58.94 | 30.31 | 98.78 | 1.01 | 0.06 |
| | | XGBoost | 41.39 | 24.132 | 99.40 | 0.76 | 0.08 |
| | | GBM | 59.35 | 33.92 | 98.77 | 1.10 | 0.07 |
| | | ANN | 75.56 | 37.55 | 98.00 | 1.21 | 0.05 |
| | | BLR | 111.10 | 70.47 | 95.68 | 2.24 | 0.08 |
| | | BART | 62.42 | 37.01 | 98.64 | 1.18 | 0.07 |
| | | GBN | 94.93 | 50.20 | 96.85 | 1.68 | 0.01 |
| | | GPR | 78.11 | 37.72 | 97.87 | 1.25 | 0.14 |
| | | BNN | 89.13 | 45.65 | 97.25 | 1.48 | 0.12 |
| | SAG Power | MR | 473.73 | 369.19 | 94.12 | 1.82 | 0.04 |
| | | RF | 309.03 | 226.66 | 97.50 | 1.13 | 0.07 |
| | | XGBoost | 226.44 | 162.73 | 98.66 | 0.80 | 0.05 |
| | | GBM | 327.59 | 250.78 | 97.19 | 1.24 | 0.05 |
| | | ANN | 311.84 | 242.33 | 97.45 | 1.18 | 0.04 |
| | | BLR | 487.08 | 375.85 | 93.78 | 1.85 | 0.05 |
| | | BART | 540.61 | 408.48 | 92.34 | 2.08 | 0.05 |
| | | GBN | 511.46 | 385.10 | 93.14 | 1.96 | 0.02 |
| | | GPR | 381.77 | 279.60 | 96.18 | 1.40 | 0.08 |
| | | BNN | 464.29 | 360.76 | 94.33 | 1.78 | 0.05 |
| Test | SAG Production | MR | 122.97 | 74.04 | 94.83 | 2.35 | 0.08 |
| | | RF | 104.02 | 54.20 | 96.30 | 1.82 | 0.05 |
| | | XGBoost | 87.98 | 47.38 | 97.35 | 1.55 | 0.06 |
| | | GBM | 91.73 | 48.14 | 97.12 | 1.57 | 0.05 |
| | | ANN | 93.09 | 43.88 | 97.04 | 1.43 | 0.06 |
| | | BLR | 124.09 | 76.54 | 94.73 | 2.24 | 0.04 |
| | | BART | 89.78 | 48.03 | 97.24 | 1.55 | 0.05 |
| | | GBN | 107.35 | 54.18 | 96.06 | 1.80 | 0.05 |
| | | GPR | 92.14 | 44.08 | 97.10 | 1.47 | 0.06 |
| | | BNN | 88.50 | 46.89 | 97.22 | 1.53 | 0.06 |
| | SAG Power | MR | 528.29 | 392.27 | 92.66 | 1.95 | 0.06 |
| | | RF | 553.51 | 401.57 | 91.94 | 2.01 | 0.05 |
| | | XGBoost | 431.11 | 298.48 | 95.11 | 1.49 | 0.08 |
| | | GBM | 470.39 | 335.29 | 94.18 | 1.68 | 0.07 |
| | | ANN | 474.10 | 325.25 | 94.09 | 1.62 | 0.07 |
| | | BLR | 534.91 | 398.49 | 92.48 | 1.97 | 0.05 |
| | | BART | 590.52 | 432.44 | 90.83 | 2.18 | 0.05 |
| | | GBN | 568.83 | 412.45 | 91.49 | 2.08 | 0.05 |
| | | GPR | 473.83 | 326.25 | 94.10 | 1.64 | 0.07 |
| | | BNN | 526.97 | 379.95 | 92.79 | 1.89 | 0.08 |

| Resp. | Model | RMSE OOF Train (interval) | RMSE OOF Train | RMSE Test | MAE OOF Train (interval) | MAE OOF Train | MAE Test |
|---|---|---|---|---|---|---|---|
| SAG Power | MR | [473.90, 504.23] | 487.81 | 528.29 | [367.24, 383.85] | 375.23 | 392.27 |
| | RF | [505.43, 532.48] | 518.69 | 553.51 | [383.01, 401.14] | 391.55 | 401.57 |
| | XGB | [378.65, 404.38] | 391.89 | 431.11 | [285.80, 296.08] | 290.94 | 298.48 |
| | GBM | [414.71, 445.36] | 430.52 | 470.39 | [316.83, 329.60] | 323.21 | 335.29 |
| | ANN | [417.67, 466.32] | 442.34 | 474.10 | [313.37, 325.96] | 319.66 | 325.25 |
| | BLR | [484.80, 518.65] | 502.22 | 534.91 | [371.91, 392.05] | 381.98 | 398.49 |
| | BART | [545.90, 589.76] | 568.58 | 590.52 | [414.30, 431.63] | 422.97 | 432.44 |
| | GBN | [642.69, 700.96] | 671.89 | 568.83 | [487.44, 499.10] | 493.27 | 412.45 |
| | GPR | [503.74, 564.63] | 535.70 | 473.83 | [387.75, 411.36] | 399.55 | 326.25 |
| | BNN | [470.27, 502.82] | 487.03 | 526.97 | [362.91, 392.23] | 377.58 | 379.95 |
| SAG Production | MR | [105.18, 120.63] | 112.25 | 122.97 | [67.51, 72.24] | 69.67 | 74.04 |
| | RF | [90.69, 101.44] | 95.75 | 104.02 | [50.09, 54.36] | 52.18 | 54.20 |
| | XGB | [77.53, 87.36] | 82.70 | 87.98 | [45.41, 49.14] | 47.28 | 47.38 |
| | GBM | [80.22, 88.80] | 84.70 | 91.73 | [45.69, 48.47] | 47.08 | 48.14 |
| | ANN | [82.55, 93.27] | 87.99 | 93.09 | [42.85, 46.20] | 44.53 | 43.88 |
| | BLR | [108.23, 121.47] | 115.19 | 124.09 | [69.45, 73.91] | 71.67 | 76.54 |
| | BART | [82.04, 91.11] | 86.78 | 89.78 | [48.34, 52.83] | 50.58 | 48.03 |
| | GBN | [123.94, 131.71] | 127.83 | 107.35 | [76.59, 79.14] | 77.87 | 54.18 |
| | GPR | [120.44, 147.89] | 135.39 | 92.14 | [62.81, 70.63] | 66.72 | 44.08 |
| | BNN | [83.06, 99.34] | 91.83 | 88.50 | [42.68, 48.43] | 45.56 | 46.89 |

| Resp. | Model | 90% Coverage (Test) | MPIW (Test) | NLPD (Test) | | | Differential Entropy (Test) |
|---|---|---|---|---|---|---|---|
| SAG Power | BLR | 0.91 | 1617.74 | 7.70 | 491.74 | 491.76 | 7.62 |
| | BART | 0.94 | 2025.94 | 7.79 | 614.81 | 615.97 | 7.84 |
| | GBN | 0.95 | 1993.87 | 7.75 | 604.86 | 606.33 | 7.83 |
| | GPR | 0.89 | 1352.40 | 7.58 | 404.59 | 411.10 | 7.40 |
| | BNN | 0.91 | 1537.32 | 7.70 | 467.30 | 467.31 | 7.57 |
| SAG Production | BLR | 0.94 | 368.19 | 6.24 | 111.93 | 111.92 | 6.14 |
| | BART | 0.94 | 295.56 | 5.88 | 86.77 | 89.93 | 5.91 |
| | GBN | 0.95 | 383.24 | 6.09 | 116.24 | 116.58 | 6.17 |
| | GPR | 0.94 | 253.43 | 5.87 | 75.46 | 77.04 | 5.68 |
| | BNN | 0.93 | 297.19 | 5.90 | 90.33 | 90.34 | 5.92 |
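
To make the uncertainty metrics in the table above concrete, the following minimal Python sketch computes empirical coverage, mean prediction interval width (MPIW), negative log predictive density (NLPD), and differential entropy. It assumes a Gaussian predictive distribution with mean `mu` and standard deviation `sigma` per observation and a central 90% interval; that assumption, the function name `gaussian_uq_metrics`, and the toy inputs are introduced here for illustration and are not confirmed by the source. Under this assumption, a near-constant standard deviation of about 491.76 would give an MPIW of about 1617.7 and a differential entropy of about 7.62, which is consistent with the BLR row for SAG Power; reading the tabulated values this way is an inference only.

```python
import numpy as np
from statistics import NormalDist


def gaussian_uq_metrics(y, mu, sigma, level=0.90):
    """Coverage, MPIW, NLPD and differential entropy for Gaussian predictions.

    Hypothetical helper: assumes each prediction is N(mu_i, sigma_i^2).
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Two-sided standard normal quantile (about 1.645 for a 90% interval).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower, upper = mu - z * sigma, mu + z * sigma

    coverage = float(np.mean((y >= lower) & (y <= upper)))  # fraction inside
    mpiw = float(np.mean(upper - lower))                    # mean interval width
    # Average negative log density of y under N(mu, sigma^2).
    nlpd = float(np.mean(0.5 * np.log(2.0 * np.pi * sigma ** 2)
                         + (y - mu) ** 2 / (2.0 * sigma ** 2)))
    # Average differential entropy of N(mu, sigma^2): 0.5 * ln(2*pi*e*sigma^2).
    entropy = float(np.mean(0.5 * np.log(2.0 * np.pi * np.e * sigma ** 2)))

    return {"coverage": coverage, "mpiw": mpiw, "nlpd": nlpd, "entropy": entropy}


if __name__ == "__main__":
    # Toy values only (not plant data): a constant predictive sigma.
    rng = np.random.default_rng(0)
    mu = np.full(1000, 20000.0)
    sigma = np.full(1000, 491.76)
    y = rng.normal(mu, sigma)
    print(gaussian_uq_metrics(y, mu, sigma))
```

A well-calibrated model should show coverage close to the nominal level; among models with adequate coverage, a narrower MPIW and a lower NLPD indicate sharper predictions.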

| Set | Model | Reg. | RMSE | MAE | MAPE | R² | Cov90 | MPIW90 | NLPD | |
|---|---|---|---|---|---|---|---|---|---|---|
| Train | XGB | D | 229.17 | 158.65 | 0.78 | 99.00 | 0.98 | 1451.56 | 7.14 | 441.24 |
| | | O | 232.63 | 168.08 | 0.81 | 98.09 | 0.98 | 1231.49 | 7.04 | 374.35 |
| | | T | 217.14 | 161.57 | 0.80 | 98.47 | 0.99 | 1161.48 | 6.97 | 353.07 |
| | BNN | D | 560.90 | 442.13 | 2.21 | 94.05 | 0.90 | 1717.10 | 7.76 | 521.96 |
| | | O | 489.80 | 387.90 | 1.93 | 92.14 | 0.93 | 1715.94 | 7.62 | 521.61 |
| | | T | 510.49 | 398.57 | 1.93 | 90.68 | 0.91 | 1715.44 | 7.66 | 521.46 |
| Test | XGB | D | 425.93 | 312.63 | 1.57 | 96.50 | 0.90 | 1451.56 | 7.48 | 441.24 |
| | | O | 495.20 | 306.70 | 1.52 | 91.23 | 0.91 | 1231.49 | 7.72 | 374.35 |
| | | T | 360.06 | 275.10 | 1.37 | 95.94 | 0.93 | 1161.48 | 7.30 | 353.07 |
| | BNN | D | 583.94 | 450.76 | 2.24 | 93.21 | 0.92 | 1719.93 | 7.80 | 522.82 |
| | | O | 530.64 | 422.83 | 2.14 | 91.67 | 0.89 | 1712.98 | 7.70 | 520.71 |
| | | T | 646.43 | 432.00 | 2.13 | 85.54 | 0.89 | 1716.44 | 7.95 | 521.76 |

| Set | Model | Reg. | RMSE | MAE | MAPE | R² | Cov90 | MPIW90 | NLPD | |
|---|---|---|---|---|---|---|---|---|---|---|
| Train | XGB | D | 51.68 | 30.86 | 0.99 | 99.30 | 0.99 | 373.72 | 5.76 | 113.60 |
| | | O | 41.21 | 23.95 | 0.74 | 99.30 | 0.98 | 235.87 | 5.36 | 71.70 |
| | | T | 27.23 | 17.34 | 0.54 | 99.68 | 0.99 | 157.39 | 4.95 | 47.84 |
| | BNN | D | 125.93 | 66.94 | 2.27 | 95.86 | 0.88 | 300.22 | 6.38 | 91.26 |
| | | O | 47.66 | 27.24 | 0.88 | 99.03 | 0.98 | 299.89 | 5.57 | 91.16 |
| | | T | 73.91 | 41.06 | 1.29 | 97.78 | 0.95 | 299.93 | 5.76 | 91.17 |
| Test | XGB | D | 21.88 | 69.14 | 2.32 | 96.19 | 0.92 | 373.72 | 6.21 | 113.60 |
| | | O | 68.44 | 40.86 | 1.30 | 98.17 | 0.93 | 235.87 | 5.66 | 71.70 |
| | | T | 57.95 | 31.07 | 1.00 | 98.50 | 0.93 | 157.39 | 5.51 | 47.84 |
| | BNN | D | 24.83 | 69.29 | 2.38 | 95.91 | 0.87 | 300.13 | 6.36 | 91.23 |
| | | O | 52.43 | 29.32 | 0.91 | 98.72 | 0.97 | 299.75 | 5.60 | 91.12 |
| | | T | 65.30 | 40.18 | 1.28 | 98.25 | 0.96 | 300.58 | 5.69 | 91.37 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Saldana, M.; Gálvez, E.; Sales-Cruz, M.; Salinas-Rodríguez, E.; Salinas-Maldonado, R.G.; Castillo, J.; Toro, N.; Arias, D.; Cisternas, L.A. Comprehensive Comparison of Machine Learning Approaches—Deterministic and Stochastic—In Modeling the Production and Power of an SAG Mill: A Case Study of the Chilean Copper Mining Industry. Minerals 2026, 16, 412. https://doi.org/10.3390/min16040412

