Automated Machine Learning for Nitrogen Content Prediction in Steel Production: A Comprehensive Multi-Stage Process Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Collection
2.1.1. Desulphurization of Pig Iron (Stage 1)
2.1.2. Basic Oxygen Furnace Before Tapping (Stage 2)
2.1.3. Beginning of Secondary Steelmaking Process (Stage 3)
2.1.4. End of Secondary Steelmaking Process (Stage 4)
2.2. Data Preprocessing and Quality Control
2.3. AutoML Model Development
- Linear Models: Elastic Net is a linear regression technique that combines the penalties of L1 (Lasso) and L2 (Ridge) regularization into a single loss function to improve model performance [24];
- Tree-Based Methods: Random Forest is an ensemble learning method for classification and regression that uses decision trees as base models, leveraging bootstrap aggregating (bagging) to improve stability and accuracy [25];
- Boosting Algorithms: LightGBM and XGBoostRegressor are high-performance, open-source gradient boosting frameworks designed for speed, efficiency, and scalability, particularly with large datasets. LightGBM’s speed and memory efficiency are due to its advanced optimizations and leaf-wise growth strategy [26];
- Instance-Based Methods: K-Nearest Neighbors (KNN) with local interpolation is a method used for regression and spatial data analysis, where the target value for a query point is predicted by interpolating the values of its k nearest neighbors in the training set [27];
- Regularized Methods: LassoLars is a method that combines the Least Angle Regression (LARS) algorithm with L1 penalization, effectively merging the efficiency of forward feature selection with the sparsity-inducing properties of L1 regularization [28].
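As a sketch of the candidate model families listed above (not the authors' exact AutoML pipeline), the five families can be instantiated and compared with scikit-learn; for portability, the LightGBM/XGBoost boosting family is stood in for here by scikit-learn's GradientBoostingRegressor, and all hyperparameters are illustrative defaults:

```python
# Illustrative comparison of the five candidate model families.
# GradientBoostingRegressor stands in for LightGBM/XGBoost so the
# example needs only scikit-learn; hyperparameters are placeholders.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import ElasticNet, LassoLars
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a process dataset (15 features, as in Stage 1).
X, y = make_regression(n_samples=200, n_features=15, noise=0.1, random_state=0)

candidates = {
    "ElasticNet": ElasticNet(alpha=0.1, l1_ratio=0.5),              # L1 + L2 penalty
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
    "GradientBoosting": GradientBoostingRegressor(random_state=0),  # boosting family
    "KNN": KNeighborsRegressor(n_neighbors=5),                      # local interpolation
    "LassoLars": LassoLars(alpha=0.01),                             # LARS + L1
}

# Score every family with 5-fold cross-validated R2.
scores = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
          for name, m in candidates.items()}
for name, r2 in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} mean CV R2 = {r2:.3f}")
```

On this synthetic linear dataset the linear models dominate; on the real multi-stage process data, the stage-specific rankings in Section 3 apply instead.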
2.4. Model Configuration Settings
2.5. Model Evaluation Metrics
- Primary metrics
- Normalized Root Mean Squared Error (NRMSE): a scale-independent metric used to assess model performance, allowing fair comparisons across datasets with different scales, calculated by Equation (2), where the Root Mean Squared Error (RMSE) is calculated by Equation (3) [29].
- Coefficient of Determination (R²): measures the proportion of variance explained by the model [30] and is calculated by Equation (4).
- Mean Absolute Error (MAE): a widely used metric for evaluating the accuracy of regression models, calculated by Equation (5). It computes the average of the absolute differences between predicted and actual values, providing a straightforward measure of the average magnitude of errors [31].
- Mean Absolute Percentage Error (MAPE): a measure of prediction accuracy for forecasting methods, commonly used in statistics and regression analysis [32]. MAPE is widely used as a loss function in regression problems because it provides a relative measure of error; it is expressed by Equation (6).
- Secondary metrics
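The primary metrics above can be sketched in NumPy using their standard definitions; note that NRMSE is assumed here to be normalized by the observed range (y_max − y_min), one common convention consistent with the nomenclature:

```python
# Standard definitions of the primary evaluation metrics (Eqs. (2)-(6)).
# NRMSE is assumed range-normalized, which is what makes it scale-independent.
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def nrmse(y_true, y_pred):
    # Normalized by the observed range, so rescaling the data leaves it unchanged.
    return rmse(y_true, y_pred) / float(y_true.max() - y_true.min())

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    # Relative error in percent; requires nonzero actual values.
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def r2(y_true, y_pred):
    ss_res = float(np.sum((y_true - y_pred) ** 2))   # sum of squares of residuals
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Tiny example on nitrogen-like concentrations [%] (values are illustrative).
y_true = np.array([0.0021, 0.0034, 0.0040, 0.0028])
y_pred = np.array([0.0024, 0.0031, 0.0043, 0.0027])
print(f"NRMSE={nrmse(y_true, y_pred):.4f}  MAE={mae(y_true, y_pred):.5f}  "
      f"MAPE={mape(y_true, y_pred):.2f}%  R2={r2(y_true, y_pred):.3f}")
```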
2.6. Model Interpretability and Explainability
2.7. Industrial Validation and Deployment Considerations
3. Results
3.1. Overall Model Performance Comparison
3.2. Stage-Specific Model Performance Analysis
3.2.1. NitroML-DeS Model Performance (Stage 1)
3.2.2. NitroML-BOF Model Performance (Stage 2)
3.2.3. NitroML-SMB Model Performance (Stage 3)
3.2.4. NitroML-SME Model Performance (Stage 4)
3.3. Feature Importance and Model Architecture Analysis
3.3.1. Algorithm Selection Across Models
3.3.2. Feature Engineering Impact
3.3.3. Ensemble Diversity Analysis
3.4. Cross-Validation Stability and Generalization
3.5. Computational Performance and Scalability
4. Discussion
4.1. Interpretation of Performance Variations Across Process Stages
4.2. Algorithm Selection and Ensemble Architecture Effectiveness
4.3. Feature Engineering and Temporal Modeling Insights
4.4. Industrial Implementation Considerations
4.5. Limitations and Areas for Improvement
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| SM | Secondary Metallurgy |
| DeS | Desulphurization |
| BOF | Basic Oxygen Furnace |
| SMB | Secondary Metallurgy Beginning |
| SME | Secondary Metallurgy End |
| SHAP | SHapley Additive exPlanations |
| VM | Virtual Machine |
| TPOT | Tree-based Pipeline Optimization Tool |
Nomenclature
| f1, f2, …, fn | Base learners |
| g(·) | Meta-learner trained on the predictions of the base learners |
| ŷi | Predicted values |
| yi | Observed/actual values |
| n | Number of observations |
| ymax | Maximal observed value |
| ymin | Minimal observed value |
| SSres | Sum of squares of residuals |
| SStot | Total sum of squares |
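The stacking structure implied by this nomenclature — base learners f1…fn whose predictions feed a meta-learner g(·) — can be illustrated with scikit-learn's StackingRegressor; the specific base models and the Ridge meta-learner below are placeholders, not the paper's exact ensemble:

```python
# Stacking sketch matching the nomenclature: base learners f1..fn,
# meta-learner g(.) fit on cross-validated base-learner predictions.
# The chosen models are illustrative, not the paper's actual ensemble.
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Ridge
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=150, n_features=12, noise=0.2, random_state=1)

base_learners = [                       # f1, f2, ..., fn
    ("en", ElasticNet(alpha=0.1)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=1)),
    ("knn", KNeighborsRegressor(n_neighbors=5)),
]
meta = Ridge()                          # g(.), trained on base-learner outputs

stack = StackingRegressor(estimators=base_learners, final_estimator=meta, cv=5)
stack.fit(X, y)
print("stacked R2 on training data:", round(stack.score(X, y), 3))
```

StackingRegressor's `cv` argument ensures the meta-learner is fit on out-of-fold base predictions, which reduces the leakage that naive stacking would introduce.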
Appendix A
| vCPU Cores | RAM | Temporary Storage | Storage Type | Processor Type | Architecture | Network |
|---|---|---|---|---|---|---|
| 4 | 14 GB | 28 GB | Premium SSD drives | Intel Xeon E5-2673 v3 | x64-based | throughput up to 3500 Mbps |
| Parameter | Minimum | Maximum | Mean |
|---|---|---|---|
| Nitrogen in pig iron (DeS) [%] | 0.0023 | 0.0066 | 0.00399 |
| C in pig iron [%] | 4.278 | 4.6670 | 4.45383 |
| Mn in pig iron [%] | 0.320 | 0.625 | 0.42606 |
| Si in pig iron [%] | 0.426 | 1.147 | 0.69642 |
| P in pig iron [%] | 0.046 | 0.067 | 0.05543 |
| Sulphur before DeS [%] | 0.028 | 0.086 | 0.05177 |
| Sulphur after DeS [%] | 0.001 | 0.013 | 0.00834 |
| Temperature before DeS [°C] | 1330 | 1409 | 1366.53247 |
| Temperature after DeS [°C] | 1306 | 1401 | 1352.11688 |
| Pig iron weight before DeS [kg] | 142,230 | 148,260 | 144,541.2987 |
| Pig iron weight after DeS [kg] | 139,200 | 145,500 | 141,940.25974 |
| Weight of DeS slag [kg] | 1260 | 4600 | 2601.03896 |
| Period of N2 blowing [s] | 219 | 1153 | 582.68831 |
| Weight of DeS mixture [kg] | 140 | 625 | 290.28571 |
| Blowing rate of N2 [l] | 373 | 14,586 | 3735.1039 |
| Parameter | Minimum | Maximum | Mean |
|---|---|---|---|
| Nitrogen in crude steel (BOF) [%] | 0.0012 | 0.0064 | 0.00217 |
| C in crude steel [%] | 0.026 | 0.105 | 0.05359 |
| Mn in crude steel [%] | 0.06 | 0.244 | 0.12983 |
| P in crude steel [%] | 0.005 | 0.015 | 0.00992 |
| S in crude steel [%] | 0.005 | 0.022 | 0.01292 |
| Fe in slag [%] | 12.63 | 26.14 | 17.53985 |
| MnO in slag [%] | 3.12 | 5.62 | 4.13894 |
| SiO2 in slag [%] | 9.14 | 15.83 | 12.66273 |
| Al2O3 in slag [%] | 0.7 | 2.09 | 1.26652 |
| CaO in slag [%] | 39.44 | 49.36 | 45.62197 |
| MgO in slag [%] | 6.78 | 12.32 | 9.01515 |
| P2O5 in slag [%] | 0.75 | 1.08 | 0.89061 |
| S in slag [%] | 0.051 | 0.086 | 0.06559 |
| Slag basicity [%] | 3.4 | 5.8 | 4.07273 |
| Pig iron charging time [s] | 11.0 | 732 | 308.56061 |
| Pure oxygen blowing time [s] | 1601 | 2322 | 1756.34848 |
| Pure oxygen reblowing time [s] | 0 | 100 | 15.15152 |
| Heat time [s] | 2458 | 8854 | 3863.81818 |
| Tapping time [s] | 409 | 1123 | 682.13636 |
| Crude steel tapping temp. [°C] | 1620 | 1685 | 1647.74242 |
| Overall oxygen for heat [l] | 8961 | 11,018 | 9605.18182 |
| Oxygen for reblow [l] | 0 | 764 | 64.93939 |
| Oxygen activity [-] | 449.9 | 1445 | 823.13485 |
| Pig iron weight [kg] | 139,400 | 145,500 | 142,069.69697 |
| Scrap weight [kg] | 43,800 | 50,000 | 47,409.09091 |
| Lime weight [kg] | 4655 | 11,985 | 7391.75758 |
| Dolomitic lime weight [kg] | 3065 | 6180 | 3283.48485 |
| Magnesia weight [kg] | 0 | 4805 | 1791.89394 |
| Pellets weight [kg] | 0 | 2055 | 164.09091 |
| Briquettes weight [kg] | 0 | 3260 | 708.48485 |
| Covering slag #6 [kg] | 0 | 2000 | 536.9697 |
| Yield of crude steel [%] | 84.892 | 94.169 | 89.4093 |
| Parameter | Minimum | Maximum | Mean |
|---|---|---|---|
| Nitrogen in steel (SMB) [%] | 0.0016 | 0.0082 | 0.0033 |
| C before Ar stirring [%] | 0.026 | 0.171 | 0.07305 |
| Mn before Ar stirring [%] | 0.174 | 1.29 | 0.42295 |
| Si before Ar stirring [%] | 0 | 0.403 | 0.05626 |
| P before Ar stirring [%] | 0.006 | 0.018 | 0.01092 |
| S before Ar stirring [%] | 0.005 | 0.022 | 0.01174 |
| Al (overall) before Ar stirring [%] | 0.006 | 0.056 | 0.02378 |
| Steel temperature (first on SM) [°C] | 1591 | 1629 | 1608.36923 |
| Tapping time [s] | 249 | 683 | 425.58462 |
| Weight of crude steel [kg] | 154,900 | 183,000 | 172,414.9351 |
| Weight of slag in ladle [kg] | 800 | 6480 | 3403.8961 |
| Tapping angle [°] | 98 | 115 | 106.12308 |
| Parameter | Minimum | Maximum | Mean |
|---|---|---|---|
| Nitrogen in steel (SME) [%] | 0.0014 | 0.0083 | 0.003404 |
| Al (blocks) [kg] | 200 | 350 | 287.5325 |
| Al (feeding wire) [kg] | 0 | 211 | 98.2857 |
| C after Ar stirring [%] | 0.026 | 0.201 | 0.08802 |
| Mn after Ar stirring [%] | 0.217 | 1.403 | 0.46378 |
| Si after Ar stirring [%] | 0.003 | 0.483 | 0.06692 |
| P after Ar stirring [%] | 0.007 | 0.016 | 0.01052 |
| S after Ar stirring [%] | 0.004 | 0.019 | 0.01103 |
| Al (overall) after Ar stirring [%] | 0.034 | 0.054 | 0.04408 |
| C after alloy adding [%] | 0.035 | 0.195 | 0.08791 |
| Mn after alloy adding [%] | 0.218 | 1.38 | 0.45617 |
| Si after alloy adding [%] | 0.012 | 0.411 | 0.06638 |
| P after alloy adding [%] | 0.005 | 0.016 | 0.01 |
| S after alloy adding [%] | 0.004 | 0.018 | 0.0106 |
| Al (overall) at the end of SM [%] | 0.04 | 0.066 | 0.05295 |
| Tapping time [s] | 249 | 683 | 422.3896 |
| Overall heat stay at SM [min] | 17 | 45 | 28.50769 |
| Ar stirring time [min] | 11.4 | 57.4 | 28.51385 |
| Ar stirring flow rate [l·min⁻¹] | 482 | 2382 | 1055.81538 |
| Overall amount of stirring Ar [m3] | 3352 | 22,234 | 7312.7013 |
| Ar soft-bubbling flow rate [l·min⁻¹] | 0 | 194 | 54.89231 |
| Ar soft-bubbling time [min] | 0 | 18.117 | 5.36771 |
| Steel weight [kg] | 154,900 | 183,000 | 172,379.23077 |
| FeMn during SM [kg] | 0 | 388 | 66.87692 |
| FeMn aff. during SM [kg] | 0 | 184 | 48.67692 |
| FeSi during SM [kg] | 0 | 442 | 26.30769 |
| Settling time [min] | 0 | 42 | 3.26154 |
| Weight of slag in ladle [kg] | 800 | 6480 | 3300.76923 |
| C at the end of SM [%] | 0.026 | 0.201 | 0.08795 |
| Mn at the end of SM [%] | 0.217 | 1.403 | 0.46362 |
| Si at the end of SM [%] | 0.003 | 0.483 | 0.06688 |
| P at the end of SM [%] | 0.007 | 0.016 | 0.01069 |
| S at the end of SM [%] | 0.004 | 0.019 | 0.01102 |
| Steel temperature (last on SM) [°C] | 1558 | 1597 | 1578.46154 |
| Tapping angle [°] | 98 | 117 | 106.3636 |




References
- Chen, J.; Gu, Y.; Zhu, Q.; Gu, Y.; Liang, X.; Ma, J. Automated Machine Learning of Interfacial Interaction Descriptors and Energies in Metal-Catalyzed N2 and CO2 Reduction Reactions. Langmuir ACS J. Surf. Colloids 2025, 41, 3490–3502. [Google Scholar] [CrossRef]
- Zhang, R.; Yang, J. State of the Art in Applications of Machine Learning in Steelmaking Process Modeling. Int. J. Miner. Metall. Mater. 2023, 30, 2055–2075. [Google Scholar] [CrossRef]
- Deng, Z.; Sarkisov, L. Engineering Machine Learning Features to Predict Adsorption of Carbon Dioxide and Nitrogen in Metal–Organic Frameworks. J. Phys. Chem. C 2024, 128, 10202–10215. [Google Scholar] [CrossRef]
- Yoon, C.; Eom, C.; Jeon, Y.; Kim, K. Development of a Nitrogen Prediction Model for 320 Tonne Converter. In Proceedings of the 12th International Conference of Molten Slags, Fluxes and Salts (MOLTEN 2024), Brisbane, Australia, 17–19 June 2024. [Google Scholar] [CrossRef]
- Wu, Y.; Zhang, H.; Jian, L.; Lv, Z. A Quantitative Causal Analysis and Optimization Framework for Inclusions of Steel Products. Adv. Eng. Inform. 2024, 62, 102629. [Google Scholar] [CrossRef]
- Patra, S.; Nayak, J.; Singhal, L.; Pal, S. Prediction of Nitrogen Content of Steel Melt during Stainless Steel Making Using AOD Converter. Steel Res. Int. 2017, 88, 1600271. [Google Scholar] [CrossRef]
- Liu, C.; Tang, L.; Liu, J. A Stacked Autoencoder With Sparse Bayesian Regression for End-Point Prediction Problems in Steelmaking Process. IEEE Trans. Autom. Sci. Eng. 2020, 17, 550–561. [Google Scholar] [CrossRef]
- Conrad, F.; Mälzer, M.; Schwarzenberger, M.; Wiemer, H.; Ihlenfeldt, S. Benchmarking AutoML for Regression Tasks on Small Tabular Data in Materials Design. Sci. Rep. 2022, 12, 19350. [Google Scholar] [CrossRef] [PubMed]
- Zhang, T.; Zhang, J.; Peng, G.; Wang, H. Automated Machine Learning for Steel Production: A Case Study of TPOT for Material Mechanical Property Prediction. In Proceedings of the 2022 IEEE International Conference on e-Business Engineering (ICEBE), Bournemouth, UK, 14–16 October 2022; pp. 94–99. [Google Scholar] [CrossRef]
- Feng, L.; Zhao, C.; Li, Y.; Zhou, M.; Qiao, H.; Fu, C. Multichannel Diffusion Graph Convolutional Network for the Prediction of Endpoint Composition in the Converter Steelmaking Process. IEEE Trans. Instrum. Meas. 2020, 70, 1–13. [Google Scholar] [CrossRef]
- Chumanov, I.; Sedukhin, V. Analysis of methods for predicting the limiting nitrogen concentration in duplex steels. Ferr. Metall. Bull. Sci. Tech. Econ. Inf. 2022, 78, 598–604. [Google Scholar] [CrossRef]
- Laha, D.; Ye, R.; Suganthan, P. Modeling of Steelmaking Process with Effective Machine Learning Techniques. Expert Syst. Appl. 2015, 42, 4687–4696. [Google Scholar] [CrossRef]
- Xiao, X.; Trinh, T.; Gerelkhuu, Z.; Ha, E.; Yoon, T. Automated Machine Learning in Nanotoxicity Assessment: A Comparative Study of Predictive Model Performance. Comput. Struct. Biotechnol. J. 2024, 25, 9–19. [Google Scholar] [CrossRef]
- Pitkälä, J.; Holappa, L.; Jokilaakso, A. Production of Nitrogen-Alloyed Stainless Steels in Argon Oxygen Decarburization Converter: Kinetics and Modeling of Nitrogenation and Denitrogenation. Steel Res. Int. 2023, 95, 2300597. [Google Scholar] [CrossRef]
- Tsamardinos, I.; Fanourgakis, G.; Greasidou, E.; Klontzas, E.; Gkagkas, K.; Froudakis, G. An Automated Machine Learning Architecture for the Accelerated Prediction of Metal-Organic Frameworks Performance in Energy and Environmental Applications. Microporous Mesoporous Mater. 2020, 300, 110160. [Google Scholar] [CrossRef]
- Luo, J.; Luo, Y.; Cheng, X.; Liu, X.; Wang, F.; Fang, F.; Cao, J.; Liu, W.; Xu, R.-Z. Prediction of Biological Nutrients Removal in Full-Scale Wastewater Treatment Plants Using H2O Automated Machine Learning and Back Propagation Artificial Neural Network Model: Optimization and Comparison. Bioresour. Technol. 2023, 390, 129842. [Google Scholar] [CrossRef]
- Sheik, S.; Mohammed, R.; Teeparthi, K.; Raghuvamsi, Y. Machine Learning-Based Prediction of Intergranular Corrosion Resistance in Austenitic Stainless Steels Exposed to Various Heat Treatments. J. Inst. Eng. India Ser. D 2025, 106, 491–504. [Google Scholar] [CrossRef]
- Kateb, M.; Safarian, S. Machine Learning-Driven Predictive Modeling of Mechanical Properties in Diverse Steels. Mach. Learn. Appl. 2025, 20, 100634. [Google Scholar] [CrossRef]
- Shinde, P.P.; Shah, S. A Review of Machine Learning and Deep Learning Applications. In Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India, 16–18 August 2018; pp. 1–6. [Google Scholar]
- Ghalati, M.K.; Zhang, J.; El-Fallah, G.M.M.M.; Nenchev, B.; Dong, H. Toward Learning Steelmaking—A Review on Machine Learning for Basic Oxygen Furnace Process. Mater. Genome Eng. Adv. 2023, 1, e6. [Google Scholar] [CrossRef]
- ELTRA GmbH. Basic Application Information. Available online: https://www.eltra.com/files/446383/expert-guide-application-information.pdf (accessed on 8 December 2025).
- ELTRA GmbH. Effective Quality Control of Steel and Iron Products with Combustion Analysis. Available online: https://www.eltra.com/files/14146/effective-quality-control-of-steel-and-iron-products-with-combustion-analysis.pdf (accessed on 28 December 2025).
- ASTM E1019-18; Standard Test Methods for Determination of Carbon, Sulfur, Nitrogen, and Oxygen in Steel, Iron, Nickel, and Cobalt Alloys by Various Combustion and Inert Gas Fusion Techniques. ASTM International: West Conshohocken, PA, USA, 2018. Available online: https://store.astm.org/e1019-18.html (accessed on 8 December 2025).
- Elastic Net Regularization. Available online: https://web.archive.org/web/20250814115617/https://questdb.com/glossary/elastic-net-regularization/ (accessed on 14 August 2025).
- Tanner, G. Random Forest. Available online: https://ml-explained.com/blog/random-forest-explained (accessed on 22 October 2025).
- Hossain, M.M. Mastering LightGBM: An In-Depth Guide to Efficient Gradient Boosting. Available online: https://medium.com/@mohtasim.hossain2000/mastering-lightgbm-an-in-depth-guide-to-efficient-gradient-boosting-8bfeff15ee17 (accessed on 22 October 2025).
- Ni, K.S.; Nguyen, T.Q. Adaptable K-Nearest Neighbor for Image Interpolation. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 1297–1300. [Google Scholar]
- Hesterberg, T.; Choi, N.H.; Meier, L.; Fraley, C. Least Angle and L1 Penalized Regression: A Review. Stat. Surv. 2008, 2, 61–93. [Google Scholar] [CrossRef]
- Lifesight Normalized Root Mean Square Error (NRMSE). Available online: https://lifesight.io/glossary/normalized-root-mean-square-error/ (accessed on 22 October 2025).
- Draper, N.R.; Smith, H. Applied Regression Analysis; John Wiley & Sons: New York, NY, USA, 1998; ISBN 978-0-471-17082-2. [Google Scholar]
- Lee, S. MAE Mastery: Your Guide to Mean Absolute Error. Available online: https://www.numberanalytics.com/blog/mae-mastery-guide-mean-absolute-error (accessed on 22 October 2025).
- Mean Absolute Percentage Error (MAPE): What You Need To Know. Available online: https://arize.com/blog-course/mean-absolute-percentage-error-mape-what-you-need-to-know/ (accessed on 22 October 2025).
- Bhandari, P. Correlation Coefficient|Types, Formulas & Examples. Available online: https://www.scribbr.com/statistics/correlation-coefficient/ (accessed on 24 March 2025).
- Lane, D. Proportion of Variance Explained. Available online: https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Introductory_Statistics_(Lane)/19%3A_Effect_Size/19.04%3A_Proportion_of_Variance_Explained (accessed on 22 October 2025).
- Borgogno Mondino, E.; Farbo, A.; Novello, V.; Palma, L. A Fast Regression-Based Approach to Map Water Status of Pomegranate Orchards with Sentinel 2 Data. Horticulturae 2022, 8, 759. [Google Scholar] [CrossRef]
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (NIPS 2017); Curran Associates, Inc.: Red Hook, NY, USA, 2017. [Google Scholar]
- Van den Broeck, G.; Lykov, A.; Schleich, M.; Suciu, D. Tractability SHAP Explanations. arXiv 2021, arXiv:2009.08634. [Google Scholar] [CrossRef]
- Nohara, Y.; Matsumoto, K.; Soejima, H.; Nakashima, N. Explanation of Machine Learning Models Using Shapley Additive Explanation and Application for Real Data in Hospital. Comput. Methods Programs Biomed. 2022, 214, 106584. [Google Scholar] [CrossRef]
- Lundberg, S.M.; Lee, S.-I. Consistent Feature Attribution for Tree Ensembles. arXiv 2018, arXiv:1802.03888. [Google Scholar] [CrossRef]
- Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat. Mach. Intell. 2020, 2, 56–67. [Google Scholar] [CrossRef]
- Sudjianto, A.; Knauth, W.; Singh, R.; Yang, Z.; Zhang, A. Unwrapping The Black Box of Deep ReLU Networks: Interpretability, Diagnostics, and Simplification. arXiv 2020. [Google Scholar] [CrossRef]
- Covert, I.; Lundberg, S.; Lee, S.-I. Understanding Global Feature Contributions With Additive Importance Measures. arXiv 2020. [Google Scholar] [CrossRef]
- Ponce-Bobadilla, A.V.; Schmitt, V.; Maier, C.S.; Mensing, S.; Stodtmann, S. Practical Guide to SHAP Analysis: Explaining Supervised Machine Learning Model Predictions in Drug Development. Clin. Transl. Sci. 2024, 17, e70056. [Google Scholar] [CrossRef]
- Mohsin, M.T.; Nasim, N.B. Explaining the Unexplainable: A Systematic Review of Explainable AI in Finance. arXiv 2025. [Google Scholar] [CrossRef]
- Truong, A.; Walters, A.; Goodsitt, J.; Hines, K.; Bruss, B.; Farivar, R. Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools. In Proceedings of the 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA, 4–6 November 2019; pp. 1471–1479. [Google Scholar] [CrossRef]
- Goldstein, D.A.; Fruehan, R.J. Mathematical Model for Nitrogen Control in Oxygen Steelmaking. Met. Mater. Trans. B 1999, 30, 945–956. [Google Scholar] [CrossRef]
- Han, H.; Shaker, B.; Lee, J.H.; Choi, S.; Yoon, S.; Singh, M.; Basith, S.; Cui, M.; Ahn, S.; An, J.; et al. Employing Automated Machine Learning (AutoML) Methods to Facilitate the In Silico ADMET Properties Prediction. J. Chem. Inf. Model. 2025, 65, 3215–3225. [Google Scholar] [CrossRef]
- Shamsuddin, M. Secondary Steelmaking. In Physical Chemistry of Metallurgical Processes, 2nd ed.; Shamsuddin, M., Ed.; Springer International Publishing: Cham, Switzerland, 2021; pp. 293–351. ISBN 978-3-030-58069-8. [Google Scholar]
- Wu, H.; Zhang, B.; Li, Z. Small Sample-Oriented Prediction Method of Mechanical Properties for Hot Rolled Strip Steel Based on Model Independent Element Learning. IEEE Access 2024, 12, 197300–197311. [Google Scholar] [CrossRef]
- Zhang, C.-J.; Zhang, Y.-C.; Han, Y. Industrial Cyber-Physical System Driven Intelligent Prediction Model for Converter End Carbon Content in Steelmaking Plants. J. Ind. Inf. Integr. 2022, 28, 100356. [Google Scholar] [CrossRef]
- Yang, Q.; Fan, Y.; Rong, D.; Bao, R.; Zhang, D. An Auto-configurable Machine Learning Framework to Optimize and Predict Catalysts for CO2 to Light Olefins Process. AIChE J. 2024, 70, e18437. [Google Scholar] [CrossRef]
- De Oliveira, V.; Komati, K.; Andrade, J. Implementing Neuroevolution for Gas Consumption Forecasting in the Steel Industry. In Proceedings of the 2024 L Latin American Computer Conference (CLEI), Buenos Aires, Argentina, 12–16 August 2024; pp. 1–10. [Google Scholar] [CrossRef]
- Bender, J.; Trat, M.; Ovtcharova, J. Benchmarking AutoML-Supported Lead Time Prediction. Procedia Comput. Sci. 2022, 200, 482–494. [Google Scholar] [CrossRef]
- Hariri-Ardebili, M.; Mahdavi, P.; Pourkamali-Anaraki, F. Benchmarking AutoML Solutions for Concrete Strength Prediction: Reliability, Uncertainty, and Dilemma. Constr. Build. Mater. 2024, 423, 135782. [Google Scholar] [CrossRef]
- Liu, G.; Lu, D.; Lu, J. Pharm-AutoML: An Open-source, End-to-end Automated Machine Learning Package for Clinical Outcome Prediction. CPT Pharmacomet. Syst. Pharmacol. 2021, 10, 478–488. [Google Scholar] [CrossRef]
- Kwon, N.; Comuzzi, M. Genetic Algorithms for AutoML in Process Predictive Monitoring. In Process Mining Workshops; Springer: Berlin/Heidelberg, Germany, 2023; pp. 242–254. [Google Scholar] [CrossRef]
- Luo, C.; Zhang, Z.; Qiao, D.; Lai, X.; Li, Y.; Wang, S. Life Prediction under Charging Process of Lithium-Ion Batteries Based on AutoML. Energies 2022, 15, 4594. [Google Scholar] [CrossRef]
- Denkena, B.; Dittrich, M.; Lindauer, M.; Mainka, J.; Stürenburg, L. Using AutoML to Optimize Shape Error Prediction in Milling Processes. SSRN 2020. [Google Scholar] [CrossRef]
- Hadi, R.; Hady, H.; Hasan, A.; Al-Jodah, A.; Humaidi, A. Improved Fault Classification for Predictive Maintenance in Industrial IoT Based on AutoML: A Case Study of Ball-Bearing Faults. Processes 2023, 11, 1507. [Google Scholar] [CrossRef]
- Musigmann, M.; Akkurt, B.; Krähling, H.; Nacul, N.G.; Remonda, L.; Sartoretti, T.; Henssen, D.; Brokinkel, B.; Stummer, W.; Heindel, W.; et al. Testing the Applicability and Performance of Auto ML for Potential Applications in Diagnostic Neuroradiology. Sci. Rep. 2022, 12, 13648. [Google Scholar] [CrossRef]
- Li, P.; Yang, Y.; Chen, C. Research on Fatigue Crack Propagation Prediction for Marine Structures Based on Automated Machine Learning. J. Mar. Sci. Eng. 2024, 12, 1492. [Google Scholar] [CrossRef]
- de Sá, A.; Ascher, D. Auto-ADMET: An Effective and Interpretable AutoML Method for Chemical ADMET Property Prediction. arXiv 2025. [Google Scholar] [CrossRef]
- Mubarak, Y.; Koeshidayatullah, A. Hierarchical Automated Machine Learning (AutoML) for Advanced Unconventional Reservoir Characterization. Sci. Rep. 2023, 13, 13812. [Google Scholar] [CrossRef] [PubMed]
- Kaftantzis, S.; Bousdekis, A.; Theodoropoulou, G.; Miaoulis, G. Predictive Business Process Monitoring with AutoML for next Activity Prediction. Intell. Decis. Technol. 2024, 18, 1965–1980. [Google Scholar] [CrossRef]
- Sousa, A.; Ferreira, L.; Ribeiro, R.; Xavier, J.; Pilastri, A.; Cortez, P. Production Time Prediction for Contract Manufacturing Industries Using Automated Machine Learning. In Artificial Intelligence Applications and Innovations; Springer: Berlin/Heidelberg, Germany, 2022; pp. 262–273. [Google Scholar] [CrossRef]
- Kim, G.; Steller, M.; Olson, S. Modeling Watershed Nutrient Concentrations with AutoML. In Proceedings of the 10th International Conference on Climate Informatics; ACM: New York, NY, USA, 2020. [Google Scholar] [CrossRef]
- Wang, J.; Xue, Q.; Zhang, C.; Wong, K.K.L.; Liu, Z. Explainable Coronary Artery Disease Prediction Model Based on AutoGluon from AutoML Framework. Front. Cardiovasc. Med. 2024, 11, 1360548. [Google Scholar] [CrossRef]
- Kulkarni, G.; Ambesange, S.; Vijayalaxmi, A.; Sahoo, A. Comparision of Diabetic Prediction AutoML Model with Customized Model. In Proceedings of the 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS), Coimbatore, India, 25–27 March 2021; pp. 842–847. [Google Scholar] [CrossRef]
- Kačur, J.; Flegner, P.; Durdán, M.; Laciak, M. Prediction of Temperature and Carbon Concentration in Oxygen Steelmaking by Machine Learning: A Comparative Study. Appl. Sci. 2022, 12, 7757. [Google Scholar] [CrossRef]
- Conrad, F.; Mälzer, M.; Lange, F.; Wiemer, H.; Ihlenfeldt, S. AutoML Applied to Time Series Analysis Tasks in Production Engineering. Procedia Comput. Sci. 2024, 232, 849–860. [Google Scholar] [CrossRef]
- Demeter, J.; Buľko, B.; Demeter, P.; Hrubovčáková, M. Prediction Models for Nitrogen Content in Metal at Various Stages of the Basic Oxygen Furnace Steelmaking Process. Appl. Sci. 2025, 15, 9561. [Google Scholar] [CrossRef]





| Steel Grade | C (%) | Mn (%) | Si (%) | Al (%) | P (%) | S (%) | Nb (%) |
|---|---|---|---|---|---|---|---|
| Grade 1 | 0.07–0.21 | 0.8–1.6 | 0.03–0.6 | min 0.02 | max. 0.025 | max. 0.020 | - |
| Grade 2 | 0.02–0.1 | 0.1–0.55 | max. 0.08 | 0.02–0.07 | 0.01–0.07 | max. 0.020 | 0.004–0.0075 |
| Model Stage | Input Features | Allowed Models | Metric Score Threshold | Cross-Validation Folds | Validation Data (%) | Test Data (%) |
|---|---|---|---|---|---|---|
| Stage 1 | 15 | LightGBM, XGBoostRegressor, RandomForest, ElasticNet | 0.1 | 10 | 20 | 20 |
| Stage 2 | 32 | GradientBoosting, ElasticNet, DecisionTree, KNN, LassoLars, RandomForest, LightGBM | 0.08 | 10 | 20 | 20 |
| Stage 3 | 12 | ElasticNet, GradientBoosting, DecisionTree, KNN, LassoLars, RandomForest | 0.08 | 10 | 20 | 20 |
| Stage 4 | 35 | ElasticNet, GradientBoosting, DecisionTree, KNN, LassoLars, RandomForest, LightGBM | 0.1 | 10 | 20 | 20 |
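The per-stage configuration in the table above can be captured as plain data; the structure below is an illustrative sketch only (not the actual AutoML framework's configuration API), and it assumes the metric score threshold is an upper bound on acceptable NRMSE:

```python
# Sketch: two of the stage configurations from the table as plain data.
# Illustrative structure only; field names are not a real framework API.
STAGE_CONFIG = {
    "Stage 1": {
        "input_features": 15,
        "allowed_models": ["LightGBM", "XGBoostRegressor",
                           "RandomForest", "ElasticNet"],
        "metric_threshold": 0.10,   # assumed: max acceptable NRMSE
        "cv_folds": 10, "validation_pct": 20, "test_pct": 20,
    },
    "Stage 2": {
        "input_features": 32,
        "allowed_models": ["GradientBoosting", "ElasticNet", "DecisionTree",
                           "KNN", "LassoLars", "RandomForest", "LightGBM"],
        "metric_threshold": 0.08,
        "cv_folds": 10, "validation_pct": 20, "test_pct": 20,
    },
}

def passes_threshold(stage: str, nrmse: float) -> bool:
    """Accept a candidate model only if its NRMSE is at or below the stage threshold."""
    return nrmse <= STAGE_CONFIG[stage]["metric_threshold"]

print(passes_threshold("Stage 1", 0.09))  # True: 0.09 <= 0.10
```

Stages 3 and 4 follow the same pattern with their own feature counts and model lists.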
| Model Stage | Model Name | NRMSE | MAE | MAPE (%) | Spearman Correlation |
|---|---|---|---|---|---|
| Stage 1 | NitroML-DeS | 0.14878 | 0.00053011 | 14.608 | 0.31017 |
| Stage 2 | NitroML-BOF | 0.12735 | 0.00046536 | 20.635 | 0.48132 |
| Stage 3 | NitroML-SMB | 0.12699 | 0.00063443 | 21.081 | 0.58554 |
| Stage 4 | NitroML-SME | 0.11239 | 0.00063450 | 20.283 | 0.58692 |
| Model Name | NitroML-DeS | NitroML-BOF | NitroML-SMB | NitroML-SME |
|---|---|---|---|---|
| Number of algorithms | 4 | 7 | 6 | 7 |
| Model Name | NitroML-DeS | NitroML-BOF | NitroML-SMB | NitroML-SME |
|---|---|---|---|---|
| R² score across folds | 0.021 | 0.015 | 0.018 | 0.022 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Demeter, J.; Buľko, B.; Demeter, P.; Hrubovčáková, M.; Hubatka, S.; Fogaraš, L. Automated Machine Learning for Nitrogen Content Prediction in Steel Production: A Comprehensive Multi-Stage Process Analysis. Appl. Sci. 2026, 16, 441. https://doi.org/10.3390/app16010441

