Characterization of High-Speed Steels—Experimental Data and Their Evaluation Supported by Machine Learning Algorithms
Abstract
1. Introduction
2. Experimental
2.1. Materials
2.2. Dilatometry and Heat Treatment
2.3. X-Ray Diffraction Measurements
2.4. Quantification of Phases
3. Machine Learning Algorithms
- Unsupervised learning algorithms do not require labeled data for training. Instead, their objective is to discover patterns and structures within the data by clustering similar data points and identifying relationships between them; details can be found, e.g., in [13]. The t-SNE algorithm used here is described in Appendix A. Details of the agglomerative hierarchical clustering algorithm are presented in Appendix B.
- Supervised learning algorithms utilize labeled datasets, where each data point is assigned a label representing its category. These labeled examples consist of input data along with their corresponding desired outputs. Through regression techniques, supervised learning algorithms analyze the labeled data to find the best-fitting model. Once the model is trained, it can be used to classify new data and assign them to their categories [13]. Detailed information about the support vector machine can be found in Appendix C. The robustness of the support vector machine results is analyzed through cross-validation; the method is presented in Appendix D. A minimal code sketch contrasting the two learning modes is given after this list.
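To make the distinction concrete, the following sketch contrasts an unsupervised clustering/embedding step with a supervised SVM classification step. This is our illustration only, assuming NumPy and scikit-learn are available; the array shapes, random data, and label assignment are placeholders, not the measured diffractograms of this work.

```python
# Minimal sketch: unsupervised (no labels) vs. supervised (labels) learning on
# placeholder "diffractogram" data with 10 patterns and 3000 2-theta steps.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.manifold import TSNE
from sklearn.svm import SVC

rng = np.random.default_rng(0)
diffractograms = rng.random((10, 3000))             # placeholder intensity data
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2])   # 0 = martensite, 1 = bainite, 2 = pearlite

# Unsupervised: no labels are used; structure is inferred from the data alone.
clusters = AgglomerativeClustering(n_clusters=3).fit_predict(diffractograms)
embedding = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(diffractograms)

# Supervised: the labeled examples train a classifier that can categorize new data.
clf = SVC(kernel="linear").fit(diffractograms, labels)
prediction = clf.predict(diffractograms[:1])        # classify a "new" pattern
```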
3.1. Data Preparation and Preprocessing
- Min-max scaling [16]: This method rescales the data to a specific range, typically between 0 and 1, by subtracting the minimum value and dividing by the range.
- Z-score standardization [17]: This method transforms the data to have a mean of 0 and a standard deviation of 1. This is achieved by subtracting the mean from the original data and then dividing by the standard deviation of the data.
- Power transformation [18]: This method adjusts the data distribution to conform more closely to a desired distribution or meet certain statistical assumptions.
- An alternative method for normalizing diffractograms involves dividing each diffractogram by its average value. If a Principal Component Analysis (PCA) is performed subsequently, the mean value is additionally subtracted from the normalized data. This newly introduced area-normalized method is related to the Rietveld method, where the quantitative results are normalized by the overall peak areas [19]. The area-normalized method is particularly recommended when the diffractograms undergo PCA for dimensionality reduction, as explained in the next subchapter, because it reduces the impact of the problems mentioned above, such as different irradiated sample areas, variations in the tube intensity, and different measuring times per step for different diffractograms. A code sketch of these preprocessing options is given after this list.
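The sketch below applies the four preprocessing options to synthetic diffractogram-like data. It is our illustration only, assuming scikit-learn is available; whether min-max, z-score, or power scaling is applied per pattern or per 2θ channel depends on the workflow, and it is applied per pattern here purely for simplicity.

```python
# Sketch of the scaling options on placeholder diffractograms
# (rows = patterns, columns = intensity at each 2-theta step).
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler, PowerTransformer

X = np.random.default_rng(1).random((10, 3000)) * 1e4   # illustrative intensities

x_minmax = MinMaxScaler().fit_transform(X.T).T           # each pattern rescaled to [0, 1]
x_zscore = StandardScaler().fit_transform(X.T).T          # each pattern to mean 0, std 1
x_power = PowerTransformer(method="yeo-johnson").fit_transform(X.T).T

# Area normalization as described above: divide each diffractogram by its mean,
# compensating for different irradiated areas, tube intensities, and counting
# times; the mean pattern is subtracted later only when PCA is applied.
x_area = X / X.mean(axis=1, keepdims=True)
```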
3.2. Dimensionality Reduction by PCA
- The orthogonal coefficient matrix W, also known as the loading matrix, represents the weights or coefficients that define the principal components in terms of the original variables.
- The orthogonal scores matrix U represents the original data expressed in the new principal component space (the rotated coordinate system).
- The variables become uncorrelated.
- The first few principal components, consisting of data in the rotated coordinate system (scores) and their weights (coefficients or loadings), capture the most crucial information, while the remaining components represent less significant variations or noise. The importance of individual components can be determined by the absolute values of their corresponding eigenvalues. Higher-order principal components can often be disregarded, leading to reduced dimensionality. The lost information corresponds to the sum of the ignored absolute eigenvalues in relation to the sum of all absolute eigenvalues.
- The original dataset can be reconstructed by multiplying the scores matrix (the data in the rotated coordinate system) with the inverse of the orthogonal loading matrix W, which, because W is orthogonal, equals its transpose. A minimal PCA sketch including this reconstruction is given below.
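As an illustration of these points, the sketch below performs a PCA via singular value decomposition on area-normalized synthetic data, extracts scores and loadings, estimates the relative importance of the components, and reconstructs the data from the first three components. This is our NumPy sketch under the stated assumptions, not the authors' implementation.

```python
# Minimal PCA sketch following the notation above: loadings W, scores U,
# reconstruction via the transpose of W; synthetic placeholder data.
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((10, 3000))
X = X / X.mean(axis=1, keepdims=True)        # area normalization
mean_pattern = X.mean(axis=0)
Xc = X - mean_pattern                        # subtract the mean pattern before PCA

# SVD-based PCA: the columns of W are the principal components (loadings).
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt.T                                     # loading matrix (3000 x 10)
U = Xc @ W                                   # scores: data in the rotated coordinate system

explained = s**2 / np.sum(s**2)              # relative importance of each component

k = 3                                        # keep the first three components
X_reduced = U[:, :k]                         # reduced-dimensionality representation
X_reconstructed = X_reduced @ W[:, :k].T + mean_pattern   # approximate reconstruction
```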
4. Results and Discussion
4.1. Preprocessing of Diffractogram Data
4.2. Machine Learning Algorithm Applied to XRD Data
4.2.1. Agglomerative Hierarchical Clustering of XRD Data
4.2.2. t-SNE Analysis of XRD Data
4.2.3. Support Vector Machines to Separate Phases by Hyperplanes
5. Conclusions and Outlook
- After “area-normalization” of the diffraction data, the different material classes “martensite”, “bainite”, and “pearlite” are successfully grouped by agglomerative hierarchical clustering as well as by the t-SNE method.
- PCA is an effective method to eliminate irrelevant information and serves as a prerequisite for the setup of SVMs. Problems occurred using PCA-transformed data in hierarchical clustering because the algorithm is sensitive to noise.
- By means of support vector machines (SVM), different microstructures in steels can be distinguished on the basis of diffractograms only. To maximize SVM effectiveness, the data must be transformed using principal component analysis (PCA).
- LOOCV serves as an appropriate method for assessing the accuracy of the hyperplanes generated by SVM.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
LIMI | Light Microscopy |
LOOCV | Leave-One-Out Cross-Validation |
PCA | Principal Component Analysis |
SVM | Support Vector Machine |
t-SNE | t-distributed stochastic neighbor embedding |
TTT | Time-Temperature-Transformation |
XRD | X-Ray Diffraction |
Appendix A. Dimensionality Reduction with t-Distributed Stochastic Neighbor Embedding (t-SNE)
Appendix B. Unsupervised Learning Using Hierarchical Clustering
- First, the similarity matrix is computed, which quantifies the similarity between all pairs of data points. This is achieved using a distance metric, such as the Euclidean distance, which measures the dissimilarity between two data points based on their feature values. A known problem of the Euclidean metric is its sensitivity to outliers due to the quadratic norm.
- Initially, singleton clusters are created, i.e., treating each data point as a separate cluster.
- In the next step, a similarity matrix is computed between all singleton clusters using the Euclidean metric.
- The different clusters are grouped in a cluster tree by an iterative linkage process. The linkage function accesses the similarity matrix to determine the ordering in the hierarchical cluster tree. The similarity between clusters can be defined in different ways, such as single linkage. Alternative linkage methods include complete linkage, average linkage, and centroid linkage, each offering a different approach to assessing the similarity between clusters. Detailed information on these linkage methods can be found in [11,12,13]. A minimal sketch of this clustering procedure is given after this list.
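A minimal sketch of the procedure described above, using SciPy's hierarchical clustering routines on placeholder data, could look as follows; it is our illustration under these assumptions, not the authors' code.

```python
# Agglomerative hierarchical clustering sketch:
# distance matrix -> linkage -> cluster tree -> flat clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

X = np.random.default_rng(3).random((10, 3000))     # placeholder diffractograms

distances = pdist(X, metric="euclidean")            # pairwise (dis)similarity matrix, condensed form
tree = linkage(distances, method="single")          # single linkage; "complete", "average",
                                                    # and "centroid" are the alternatives named above
clusters = fcluster(tree, t=3, criterion="maxclust")  # cut the tree into three clusters
```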
Appendix C. Supervised Learning with a Support Vector Machine (SVM)
- Hard-margin SVM [30]: When two classes of examples can be separated by a hyperplane (i.e., they are linearly separable), there are typically infinitely many hyperplanes that separate the two classes. The hard-margin SVM selects the hyperplane with the maximum margin among all possible separating hyperplanes. The formulae and definitions used here follow the textbook [11]; slightly different definitions may appear in other literature. In this reference, it is shown that the margin can be expressed as
$$\frac{2}{\lVert \mathbf{w} \rVert}.$$
At the same time, all data points must satisfy the condition for correct classification, Equation (A3):
$$y_i \left( \mathbf{w}^{\mathrm{T}} \mathbf{x}_i + b \right) \geq 1.$$
The objective of this optimization problem is to minimize the Euclidean norm of the weight vector $\mathbf{w}$, which maximizes the margin between the hyperplane and the support vectors. The inequality is the constraint that ensures that the training samples are correctly classified: each training sample must satisfy the condition that the product of its class label and its signed distance from the hyperplane is greater than or equal to 1. Since the hard-margin SVM requires all data points to be separable without violating this margin, it performs poorly in the following instances:
- (a) Data are not perfectly separable: In real-world datasets, perfect linear separability is rare, especially in the presence of overlapping classes or mislabeled data. The hard-margin SVM fails in such scenarios, as it cannot accommodate errors.
- (b) Outliers significantly affect the decision boundary: Even a single outlier can drastically influence the hyperplane, leading to a poorly generalized model. The hard-margin SVM does not tolerate violations of the margin, which makes it unsuitable for noisy datasets.
Therefore, in the case of not perfectly separable data, a better approach is the soft-margin SVM.
- Soft-margin SVM [31]: The soft-margin SVM allows for a certain degree of misclassification. In this scenario, additional slack variables $\xi_i$ are introduced into the optimization problem, and a penalty parameter C controls the trade-off between maximizing the margin and tolerating misclassifications. The mathematical formulation of the soft-margin SVM, as indicated by Raschka (2019) [11] and Frochte (2020) [12], is
$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i} \xi_i \quad \text{subject to} \quad y_i \left( \mathbf{w}^{\mathrm{T}} \mathbf{x}_i + b \right) \geq 1 - \xi_i, \qquad \xi_i \geq 0.$$
Here, $\xi_i$ denotes the slack variables accommodating misclassifications, while C serves as the penalty parameter that balances achieving a wider margin with permitting misclassifications. In practice, hard-margin and soft-margin SVMs are commonly employed choices. However, the central finding of the analysis by Rosasco (2004) [32] is that, particularly for classification purposes, the hinge loss emerges as the preferred loss function. This is attributed to the hinge loss function's ability to yield better convergence compared with the conventional squared loss mentioned for the hard-margin and soft-margin SVM.
- Maximum-margin SVM using the hinge loss function [32]: In supervised learning, the hinge loss function is a frequently chosen objective function. The hinge loss, denoted as $\ell_i$, for the data point i is defined by Equation (A7):
$$\ell_i = \max\left( 0,\; 1 - y_i\, f(\mathbf{x}_i) \right). \qquad \text{(A7)}$$
In the hinge loss function, $y_i$ represents the labeled outcome, taking +1 if the data point belongs to the class and −1 if it does not. Further, $f(\mathbf{x}_i)$ is the output of the classifier, computed as
$$f(\mathbf{x}_i) = \mathbf{w}^{\mathrm{T}} \mathbf{x}_i + b.$$
The hinge loss function measures the difference between the predicted score $f(\mathbf{x}_i)$ and the expected output $y_i$, taking the maximum between 0 and this difference. In case the class is correctly predicted, $f(\mathbf{x}_i)$ and $y_i$ have the same sign, and $\ell_i$ becomes zero due to the constraint, Equation (A3). Incorrect predictions lead to a non-zero loss, since the predicted score and the expected output then have opposite signs; see Equation (A7). The gradient of the hinge loss function is crucial for optimization. For optimizers, including those based on derivatives, the gradient with respect to the weight vector $\mathbf{w}$ and the bias b must be calculated. The (sub)gradients are as follows:
$$\frac{\partial \ell_i}{\partial \mathbf{w}} = \begin{cases} -\,y_i\, \mathbf{x}_i & \text{if } y_i\, f(\mathbf{x}_i) < 1, \\ \mathbf{0} & \text{otherwise}, \end{cases} \qquad \frac{\partial \ell_i}{\partial b} = \begin{cases} -\,y_i & \text{if } y_i\, f(\mathbf{x}_i) < 1, \\ 0 & \text{otherwise}. \end{cases}$$
For multiple data points, the total loss function results in the sum of the individual hinge losses,
$$L = \sum_{i} \max\left( 0,\; 1 - y_i\, f(\mathbf{x}_i) \right).$$
A minimal code sketch of training a linear classifier with this loss is given below.
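The sketch below trains a linear classifier by subgradient descent on the hinge loss, directly using the gradients stated above. It is our illustration on toy two-dimensional data; the regularization term of the soft-margin formulation is omitted for brevity, and library implementations such as scikit-learn's SVC would normally be preferred over such a hand-written loop.

```python
# Hinge-loss subgradient descent for a linear classifier on toy 2-D data.
import numpy as np

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)            # labels must be -1 / +1 for the hinge loss

w, b, lr, epochs = np.zeros(2), 0.0, 0.01, 200
for _ in range(epochs):
    for xi, yi in zip(X, y):
        margin = yi * (w @ xi + b)
        if margin < 1:                        # point inside the margin or misclassified
            w -= lr * (-yi * xi)              # subgradient of the hinge term w.r.t. w
            b -= lr * (-yi)                   # subgradient w.r.t. b
        # otherwise the hinge loss is zero and contributes no gradient

predicted = np.sign(X @ w + b)                # classification by the learned hyperplane
accuracy = (predicted == y).mean()
```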
Appendix D. Testing the Robustness of Support Vector Machines by Cross Validation
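As a hedged sketch of the leave-one-out procedure (our illustration, assuming scikit-learn; the score matrix and labels below are placeholders for the PCA-transformed diffractograms and their phase labels), each sample is held out once, the SVM is refit on the remaining samples, and the held-out sample is then classified.

```python
# Leave-one-out cross-validation (LOOCV) sketch for an SVM classifier.
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

scores = np.random.default_rng(5).random((10, 3))     # e.g., the first three PCA scores
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2])     # martensite / bainite / pearlite

correct = []
for train_idx, test_idx in LeaveOneOut().split(scores):
    clf = SVC(kernel="linear").fit(scores[train_idx], labels[train_idx])
    correct.append(clf.predict(scores[test_idx])[0] == labels[test_idx][0])

print(f"LOOCV accuracy: {np.mean(correct):.2f}")
```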
References
- Grzesik, W. Advanced Machining Processes of Metallic Materials: Theory, Modelling and Applications; Elsevier: Amsterdam, The Netherlands, 2008. [Google Scholar]
- Wießner, M. Hochtemperatur-Phasenanalyse am Beispiel Martensitischer Edelstähle; epubli: Berlin, Germany, 2010. [Google Scholar]
- Li, Y.; Colnaghi, T.; Gong, Y.; Zhang, H.; Yu, Y.; Wei, Y.; Gan, B.; Song, M.; Marek, A.; Rampp, M.; et al. Machine learning-enabled tomographic imaging of chemical short-range atomic ordering. Adv. Mater. 2024, 36, 2407564. [Google Scholar] [CrossRef]
- Hinton, G.E.; Roweis, S. Stochastic neighbor embedding. Adv. Neural Inf. Process. Syst. 2002, 15. Available online: https://papers.nips.cc/paper_files/paper/2002/hash/6150ccc6069bea6b5716254057a194ef-Abstract.html (accessed on 1 July 2024).
- Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
- Wiessner, M.; Gamsjäger, E.; van der Zwaag, S.; Angerer, P. Effect of reverted austenite on tensile and impact strength in a martensitic stainless steel- An in-situ X-ray diffraction study. Mater. Sci. Eng. A 2017, 682, 117–125. [Google Scholar] [CrossRef]
- Diffrac Plus, TOPAS/TOPAS R/TOPAS P, Version 3.0, User’s Manual; Bruker AXS GmbH: Karlsruhe, Germany, 2005.
- Balzar, D. Voigt-function model in diffraction line-broadening analysis. Int. Union Crystallogr. Monogr. Crystallogr. 1999, 10, 94–126. [Google Scholar]
- Baldinger, P.; Posch, G.; Kneissl, A. Pikrinsäureätzung zur Austenitkorncharakterisierung mikrolegierter Stähle/Revealing Austenitic Grains in Micro-Alloyed Steels by Picric Acid Etching. Pract. Metallogr. 1994, 31, 252–261. [Google Scholar] [CrossRef]
- Murphy, K.P. Machine Learning: A Probabilistic Perspective; MIT Press: Cambridge, MA, USA, 2012. [Google Scholar]
- Raschka, S.; Mirjalili, V. Python Machine Learning: Machine Learning and Deep Learning with Python, Scikit-Learn and TensorFlow 2, 3rd ed.; Packt Publishing: Birmingham, UK, 2019. [Google Scholar]
- Frochte, J. Maschinelles Lernen—Grundlagen und Algorithmen in Python; Hanser Verlag: Munich, Germany, 2020. [Google Scholar]
- Brunton, S.L.; Kutz, J.N. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control; Cambridge University Press: Cambridge, UK, 2022. [Google Scholar]
- James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2013; Volume 112. [Google Scholar]
- García, S.; Luengo, J.; Herrera, F. Data Preprocessing in Data Mining; Springer: Berlin/Heidelberg, Germany, 2015; Volume 72. [Google Scholar]
- Han, J.; Pei, J.; Tong, H. Data Mining: Concepts and Techniques; Morgan Kaufmann: Burlington, MA, USA, 2022. [Google Scholar]
- Iglewicz, B.; Hoaglin, D.C. Volume 16: How to Detect and Handle Outliers; Quality Press: Welshpool, Australia, 1993. [Google Scholar]
- Hossain, M.Z. The use of Box-Cox transformation technique in economic and statistical analyses. J. Emerg. Trends Econ. Manag. Sci. 2011, 2, 32–39. [Google Scholar]
- Bish, D.L.; Howard, S. Quantitative phase analysis using the Rietveld method. J. Appl. Crystallogr. 1988, 21, 86–91. [Google Scholar] [CrossRef]
- Jolliffe, I.T. Principal Component Analysis for Special Types of Data; Springer: Berlin/Heidelberg, Germany, 2002. [Google Scholar]
- Lohninger, H. Fundamentals of Statistics. Available online: http://www.statistics4u.info/fundstat_eng/copyright.html (accessed on 1 December 2024).
- Wiessner, M.; Angerer, P.; Prevedel, P.; Skalnik, K.; Marsoner, S.; Ebner, R. Advanced X-ray diffraction techniques for quantitative phase content and lattice defect characterization during heat treatment of high speed steels. BHM Berg- und Hüttenmännische Monatshefte 2014, 9, 390–393. [Google Scholar] [CrossRef]
- Novák, P.; Bellezze, T.; Cabibbo, M.; Gamsjäger, E.; Wiessner, M.; Rajnovic, D.; Jaworska, L.; Hanus, P.; Shishkin, A.; Goel, G.; et al. Solutions of Critical Raw Materials Issues Regarding Iron-Based Alloys. Materials 2021, 14, 899. [Google Scholar] [CrossRef] [PubMed]
- Ward, J.H., Jr. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 1963, 58, 236–244. [Google Scholar] [CrossRef]
- Lance, G.N.; Williams, W.T. A general theory of classificatory sorting strategies: 1. Hierarchical systems. Comput. J. 1967, 9, 373–380. [Google Scholar] [CrossRef]
- Barr, G.; Dong, W.; Gilmore, C.J. High-throughput powder diffraction. II. Applications of clustering methods and multivariate data analysis. J. Appl. Crystallogr. 2004, 37, 243–252. [Google Scholar] [CrossRef]
- Iwasaki, Y.; Kusne, A.G.; Takeuchi, I. Comparison of dissimilarity measures for cluster analysis of X-ray diffraction data from combinatorial libraries. npj Comput. Mater. 2017, 3, 4. [Google Scholar] [CrossRef]
- Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 1996; pp. 226–231. [Google Scholar]
- Huber, P.J. Robust estimation of a location parameter. Ann. Math. Statist. 1964, 35, 73–101. [Google Scholar] [CrossRef]
- Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
- Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Rosasco, L.; De Vito, E.; Caponnetto, A.; Piana, M.; Verri, A. Are loss functions all the same? Neural Comput. 2004, 16, 1063–1076. [Google Scholar] [CrossRef] [PubMed]
- Brownlee, J. Machine Learning Mastery with Python: Understand Your Data, Create Accurate Models, and Work Projects End-to-End; Machine Learning Mastery: San Juan, PR, USA, 2021. [Google Scholar]
Component | C | Cr | Mo | W | V | Co | Si | Mn | Ni | Fe |
---|---|---|---|---|---|---|---|---|---|---|
Mass fraction in % | | | | | | | | | | bal. |
Number | Time [s] | Mass Fraction of Austenite · 100 |
---|---|---|
1 | 40 | |
2 | 110 | |
3 | 310 | |
4 | 500 | |
5 | 800 | |
6 | 2300 | |
7 | 6500 | |
8 | 9000 | |
9 | 18,000 | 0 |
10 | 40,000 | 0 |
Number | Time [s] | Volume Fraction of Martensite · 100 (Label) | Volume Fraction of Bainite · 100 (Label) | Volume Fraction of Pearlite · 100 (Label) |
---|---|---|---|---|
1 (martensite) | 40 | |||
2 (martensite) | 110 | |||
3 (martensite) | 310 | |||
4 (martensite) | 500 | |||
5 (bainite) | 800 | |||
6 (bainite) | 2300 | |||
7 (bainite) | 6500 | |||
8 (bainite) | 9000 | |||
9 (pearlite) | 18,000 | |||
10 (pearlite) | 40,000 |
Number | Time [s] | PCA1 | PCA2 | PCA3 |
---|---|---|---|---|
1 (martensite) | 40 | 0.1505 | 0.4366 | −0.3775
2 (martensite) | 110 | 0.1740 | 0.4370 | −0.2785
3 (martensite) | 310 | 0.1487 | 0.3880 | −0.1940
4 (martensite) | 500 | 0.2022 | 0.3698 | 0.0199
5 (bainite) | 800 | 0.1785 | 0.1939 | 0.2282
6 (bainite) | 2300 | 0.2029 | 0.1201 | 0.4385
7 (bainite) | 6500 | 0.2820 | 0.1762 | 0.4772
8 (bainite) | 9000 | 0.2785 | 0.0470 | 0.3152
9 (pearlite) | 18,000 | 0.5609 | −0.2833 | 0.0319
10 (pearlite) | 40,000 | 0.5825 | −0.4070 | −0.4118
Number | Time [s] | Distance from Hyperplane Calculated from All Points | Distance from Hyperplane Leave-One-Out Cross-Validation |
---|---|---|---|
1 (martensite) | 40 | 0.4644 | 0.4634 |
2 (martensite) | 110 | 0.3929 | 0.3930 |
3 (martensite) | 310 | 0.3018 | 0.3012 |
4 (martensite) | 500 | 0.1337 | 0.0384 |
5 (bainite) | 800 | −0.1340 | −0.0321 |
6 (bainite) | 2300 | −0.3359 | −0.3362 |
7 (bainite) | 6500 | −0.3314 | −0.3314 |
8 (bainite) | 9000 | −0.3090 | −0.3094 |
9 (pearlite) | 18,000 | −0.3731 | −0.3730
10 (pearlite) | 40,000 | −0.1521 | −0.0195
Number | Time [s] | Distance from Hyperplane Calculated from All Points | Distance from Hyperplane Leave-One-Out Cross-Validation |
---|---|---|---|
1 (martensite) | 40 | −0.4829 | −0.4831 |
2 (martensite) | 110 | −0.3969 | −0.3969 |
3 (martensite) | 310 | −0.3025 | −0.3014 |
4 (martensite) | 500 | −0.1147 | 0.0032 (misclassified)
5 (bainite) | 800 | 0.1146 | 0.0740 |
6 (bainite) | 2300 | 0.3161 | 0.3167 |
7 (bainite) | 6500 | 0.3202 | 0.3172 |
8 (bainite) | 9000 | 0.1880 | 0.1875 |
9 (pearlite) | 18,000 | −0.1153 | 0.1719 (misclassified)
10 (pearlite) | 40,000 | −0.5211 | −0.5198
Number | Time [s] | Distance from Hyperplane Calculated from All Points | Distance from Hyperplane Leave-One-Out Cross-Validation |
---|---|---|---|
1 (martensite) | 40 | −0.2955 | −0.2729 |
2 (martensite) | 110 | −0.3299 | −0.3300 |
3 (martensite) | 310 | −0.3431 | −0.3424 |
4 (martensite) | 500 | −0.4018 | −0.4018 |
5 (bainite) | 800 | −0.3744 | −0.3734 |
6 (bainite) | 2300 | −0.4028 | −0.4015 |
7 (bainite) | 6500 | −0.4262 | −0.4256 |
8 (bainite) | 9000 | −0.2560 | −0.1816 |
9 (pearlite) | 18,000 | 0.2552 | 0.0868
10 (pearlite) | 40,000 | 0.5612 | 0.5750