Review

Applications of Machine Learning in High-Entropy Alloys: Phase Prediction, Performance Optimization, and Compositional Space Exploration

School of Mechanical Engineering, Chengdu University, Chengdu 610106, China
*
Author to whom correspondence should be addressed.
Metals 2025, 15(12), 1349; https://doi.org/10.3390/met15121349
Submission received: 5 November 2025 / Revised: 1 December 2025 / Accepted: 2 December 2025 / Published: 8 December 2025

Abstract

The rapid advancement of machine learning (ML) has ushered in a new era for materials science, particularly in the design and understanding of high-entropy alloys (HEAs). As a class of compositionally complex materials, HEAs have greatly benefited from the predictive power and computational efficiency of ML techniques. Recent years have witnessed remarkable expansion in the scope and sophistication of ML applications to HEAs, spanning from phase formation prediction to property and microstructure modeling. These developments have significantly accelerated the discovery and optimization of novel HEA systems. This review provides a comprehensive overview of the current progress and emerging trends in applying ML to HEA research. We first discuss phase prediction methodologies, encompassing both pure ML frameworks and hybrid physics-informed models. Subsequently, we summarize advances in ML-driven prediction of HEA properties and microstructural features. Further sections highlight the role of ML in exploring vast compositional spaces, guiding the design of high-performance HEAs, and optimizing existing alloys through data-driven algorithms. Finally, the challenges and limitations of current approaches are critically examined, and future directions are proposed toward interpretable models, mechanistic understanding, and efficient exploration of the HEA design space.

1. Introduction

Throughout the history of human productivity development, metallic materials have always played a crucial role. From the early Bronze Age to the Iron Age, and into modern times with steel, aluminum alloys, titanium alloys, and magnesium alloys, traditional metallic systems have been predominantly based on one principal element supplemented by minor additions of others. In contrast, high-entropy alloys (HEAs), first proposed in 2004, represent a novel class of alloys fundamentally distinct from conventional systems. They typically consist of five or more elements in near-equimolar ratios [1,2,3,4,5]. Compared with conventional alloys, HEAs possess higher configurational entropy, which reaches its maximum when the constituent elements are present in equal proportions. The configurational entropy further increases with the number of elemental species [3]. The formation of a single solid-solution phase in HEAs arises from the combined influence of multiple factors, including the high-entropy effect, elemental mutual solubility, crystal structure selection, thermodynamic stability, and phase diagram design. Together, these factors underpin the exceptional properties and stability of HEAs. The vast compositional design space afforded by multiple principal elements, along with their remarkable performance, has attracted increasing attention from researchers in the field of metallic materials science.
Traditional alloy design relies on well-established methodologies, including trial-and-error approaches, thermodynamic simulations, density functional theory (DFT), and molecular dynamics (MD) calculations [6,7,8,9,10]. However, these conventional methods face significant limitations when applied to HEAs due to their vast compositional space, which demands considerable time and computational cost. For instance, the trial-and-error approach becomes prohibitively time-consuming for HEAs with enormous compositional possibilities. Thermodynamic models are confined to predicting thermodynamic properties and equilibrium phases but cannot directly evaluate other performance metrics. DFT, constrained by dataset size and computational complexity, exhibits low efficiency and high cost when processing large-scale HEA data [11]. MD simulations, which model HEA deformation mechanisms at the atomic scale [12,13,14,15], can handle larger datasets; however, their accuracy depends on appropriate force field selection, precise modeling of complex elemental interactions, sufficient computational resources and time scales, and reliable experimental validation. Despite their inefficiency and high cost, these traditional approaches have achieved substantial success in previous HEA studies—particularly in parameter-selection-based predictions—and continue to play an essential role in supporting subsequent research by providing fundamental parameters and alloy data for machine learning (ML)-driven investigations.
In recent years, one of the most rapidly evolving areas in materials science has been the application of ML to accelerate material discovery and reduce computational and experimental costs [16,17,18,19]. Compared with traditional modeling approaches, ML offers inherent advantages owing to its flexibility in handling new data and its ability to rapidly establish input–output relationships. The intersection between the vast design space of HEAs and the rapid advancement of ML has reinvigorated HEA research. Over the past decade, ML and its subfields—particularly deep learning—have experienced explosive growth [20]. ML excels at identifying complex patterns and systematically optimizing models through data-driven learning. It encompasses a diverse range of computational methods, such as random forests (RF), neural networks (NN), support vector machines (SVM), and decision trees (DT), all of which are widely employed in HEA design.
Unlike conventional monobasic alloys, the compositional space of HEAs is extremely large, making it impractical to experimentally explore all possible compositional combinations. The strength of ML in this area lies in its ability to efficiently process and analyze large amounts of high-dimensional data, enabling the rapid identification of potentially promising alloy compositions. With ML, researchers can predict the properties of unexplored alloy systems based on limited experimental data, thereby significantly accelerating alloy design and development. There often exists a highly complex and nonlinear relationship between the properties of HEAs (e.g., mechanical strength, thermal stability, etc.) and their compositions and microstructures, which is difficult for traditional physical models to accurately describe. ML, particularly nonlinear models such as deep learning [21,22] and SVMs, can better capture these intricate mapping relationships. Although experimental data on HEAs are often scarce, ML can effectively address this challenge through data augmentation techniques [23,24,25,26] and transfer learning [27,28]. For example, data from other material systems (e.g., conventional alloys or related metallic systems) can be leveraged to enhance the training of HEA models and improve their generalization to untested compositions. Furthermore, ML optimization algorithms—such as Bayesian optimization and genetic algorithms—enable automatic searches for optimal alloy combinations within the multidimensional compositional space, thereby facilitating the discovery of top-performing HEAs. ML has demonstrated unique advantages in this field, including its ability to handle multivariate compositional complexity, capture nonlinear property–structure relationships, accelerate material exploration, and mitigate data scarcity. 
In addition, intelligent optimization strategies that integrate physical principles with experimental design allow ML not only to excel in HEA research but also to drive broader innovation across materials science. The exponential growth in related publications reflects this ongoing trend (see Figure 1).
In principle, given sufficient high-quality data, ML can reliably predict the structures and properties of HEAs through systematic training, validation, and testing [29]. ML models typically operate with high computational efficiency and achieve accuracy comparable to empirical or theoretical methods, which is crucial for understanding the exceptional mechanical properties of HEAs. Moreover, ML can autonomously identify critical physical descriptors and latent variables by establishing predictive models, thereby enabling efficient exploration of the vast HEA design space through the recognition of complex structure–property relationships.
In recent years, machine learning (ML) has found extensive applications in materials science, with high-entropy alloy (HEA) research emerging as a major hotspot. Hu et al. reviewed the role of ML in forward property prediction and inverse alloy design, highlighting key trends and future directions [30]. Yan et al. summarized recent ML applications in HEA design from three aspects: phase formation, structural properties, and the prediction of interatomic potentials [31]. Liu et al. provided a comprehensive overview of ML models, thermodynamics, and atomic-scale simulations for HEAs, and further proposed future directions in uncertainty quantification and ML-guided inverse design [29]. Hu et al. also constructed a cross-scale, data-driven research paradigm for HEAs. Their hybrid framework, which integrates multimodal data with physical constraints, is expected to enhance interpretability and enable breakthroughs in HEA design and performance, thereby providing theoretical support for industrial applications [32]. In another study, Liu et al. systematically reviewed recent advances in understanding chemical short-range order (SRO) in HEAs, with particular emphasis on the use of computational and simulation methods—such as DFT, MD, MC, and ML-based interatomic potentials—to reveal its formation mechanisms (including thermodynamic, electronic, mechanical, and magnetic factors) and its influence on key properties such as stacking fault energy, dislocation behavior, and phase transformations. The review also outlined future research perspectives [33]. A comprehensive guide to machine learning is offered by Zhao et al., intended to facilitate its application and accelerate the advancement of high-entropy alloys [34]. The monograph edited by Liaw & Brechtl is the definitive source for a panorama of the field and its data-driven methodologies [35].
The aim of this paper is to provide a comprehensive overview of current ML research on HEAs and to outline prospective research directions. The discussion centers on ML applications in HEA prediction, optimization, and novel alloy discovery. First, it introduces ML-driven approaches for phase structure prediction in HEAs. Next, it reviews the role of ML in predicting HEA properties and guiding the development of high-performance alloys. Subsequently, it summarizes recent advances in ML-enabled exploration of novel HEAs, including compositional space investigation, structure–property relationship analysis, and design optimization. Finally, the paper discusses the current challenges and future prospects of applying ML in HEA research.

2. General Workflow for Machine Learning in HEA Design

2.1. Key Components of Machine Learning

Recent advancements in ML within the HEA domain have facilitated the establishment of increasingly standardized research workflows. When applying ML to HEA design, the key steps typically include data collection, data preprocessing, algorithm selection, model training, model validation, and model evaluation. These steps constitute a standard ML workflow aimed at developing accurate and reliable predictive models. However, the process is not entirely rigid and can be flexibly adapted to specific experimental or computational requirements. A schematic illustration of the fundamental workflow is shown in Figure 2.
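As a minimal illustration of this workflow, the following Python sketch walks through data partitioning, model training, and evaluation using scikit-learn. The data, feature meanings, and phase labels are entirely synthetic and invented for illustration; a real study would substitute curated HEA descriptors and measured phase labels.

```python
# Minimal sketch of the standard ML workflow: data -> split -> train -> evaluate.
# Synthetic "composition -> phase" data; all names and labels are illustrative.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.random((200, 5))                      # e.g., 5 compositional/thermodynamic descriptors
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)     # toy "phase" label (0 = FCC, 1 = BCC)

# 1) data partitioning (80/20 split, as described above)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# 2) model training
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# 3) model evaluation on unseen data
acc = accuracy_score(y_te, model.predict(X_te))
```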

2.1.1. Data Preparation

Data collection is a crucial step in machine learning (ML), as high-quality datasets are essential for training reliable models. Experimental data, which provide direct insights into the properties and microstructures of high-entropy alloys (HEAs), are typically derived from literature, patents, and laboratory studies. However, obtaining experimental data is often costly and time-consuming, especially under extreme environmental conditions, and the data are usually limited and unevenly distributed. Text mining and automated database construction using natural language processing (NLP) [36,37] techniques have facilitated the extraction of key performance data from literature, contributing to publicly available databases that support ML-based property predictions. Computational simulation data, particularly from density functional theory (DFT) calculations, offer complementary information, such as electronic structures and formation energies, and are increasingly used to supplement experimental datasets.
Once raw data is collected, preprocessing is essential to ensure the robustness of ML models. Data cleaning addresses issues like missing values and outliers, which can be handled through imputation, interpolation, or statistical methods. Normalization or standardization [38] is often applied to scale data and improve model convergence. To overcome data scarcity, particularly in the diverse compositional space of HEAs, data augmentation techniques such as Generative Adversarial Networks (GANs) can generate synthetic data, enhancing model training.
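The cleaning and scaling steps described above can be sketched as follows; this is a minimal example with scikit-learn, and the numeric values are purely illustrative.

```python
# Sketch of typical preprocessing: impute missing values, then standardize.
# Assumes purely numeric feature columns; the values below are illustrative.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

X_raw = np.array([[1.0, 2.0],
                  [np.nan, 3.0],    # missing entry to be imputed
                  [3.0, 6.0]])

X_imp = SimpleImputer(strategy="mean").fit_transform(X_raw)   # NaN -> column mean
X_std = StandardScaler().fit_transform(X_imp)                 # zero mean, unit variance
```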
Feature engineering [39,40] plays a key role in capturing the relationships between composition, structure, and properties. Compositional features, such as atomic ratios and thermodynamic parameters, alongside structural features like crystal type and phase fraction, influence phase stability and property predictions. Advanced techniques, such as deep learning for high-order feature extraction, enable more accurate predictions by capturing complex inter-element interactions. Thus, effective feature selection and engineering are critical for improving the performance and interpretability of ML models in HEA research.
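As one concrete example of a compositional feature, the ideal configurational (mixing) entropy mentioned in the Introduction, ΔS_mix = −R Σ cᵢ ln cᵢ, can be computed directly from the molar fractions; the sketch below is a straightforward implementation of that standard formula.

```python
# Ideal configurational (mixing) entropy, a common compositional ML descriptor:
# dS_mix = -R * sum(c_i * ln c_i) over the molar fractions c_i.
import math

R = 8.314  # gas constant, J/(mol·K)

def mixing_entropy(fractions):
    """Ideal mixing entropy for molar fractions summing to 1."""
    return -R * sum(c * math.log(c) for c in fractions if c > 0)

# Equiatomic five-component alloy: dS_mix = R ln 5 (maximal for 5 elements)
s5 = mixing_entropy([0.2] * 5)
```

For an equiatomic five-component alloy this evaluates to R ln 5 ≈ 13.4 J/(mol·K), consistent with the observation in the Introduction that configurational entropy is maximized at equal proportions.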

2.1.2. Algorithm Selection

In materials science, particularly in the design and performance prediction of HEAs, the selection of ML algorithms is crucial for optimizing predictive accuracy, computational efficiency, and practical applicability. ML algorithms are primarily used for classification, regression, clustering, and dimensionality reduction tasks. Classification models predict discrete outcomes, such as phase formation in alloys (e.g., BCC, FCC), with common algorithms including SVM, RF, NN, and gradient-boosted trees (e.g., XGBoost). Regression models predict continuous variables, such as hardness or tensile strength, with common algorithms including linear regression (LR), support vector regression (SVR), and RF.
There is often a trade-off between model accuracy and efficiency. Complex models like NNs and gradient-boosted decision trees tend to provide high accuracy but require significant computational resources, slower training times, and reduced interpretability. Simpler models like decision trees and naïve Bayes classifiers are faster and more efficient but may struggle with complex relationships. Clustering algorithms, such as K-means and hierarchical clustering, help identify patterns in alloy compositions and properties, though K-means can be sensitive to the number of clusters, while hierarchical clustering is more computationally intensive. Dimensionality reduction techniques, like principal component analysis (PCA), simplify high-dimensional data for visualization and model acceleration, while more advanced methods like variational autoencoders (VAEs) capture nonlinear relationships and generate new alloy compositions for exploration.
Choosing the appropriate ML algorithm depends on the specific task, data characteristics, and the trade-offs between accuracy, efficiency, and interpretability.
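As a brief sketch of the dimensionality reduction step mentioned above, PCA can project a high-dimensional descriptor set onto a few principal directions; the data here are synthetic and the choice of 16 descriptors is arbitrary, for illustration only.

```python
# Sketch of dimensionality reduction with PCA for visualization/model acceleration.
# Synthetic data stands in for a table of thermophysical descriptors.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.random((100, 16))        # e.g., 16 descriptors for 100 alloys (illustrative)

pca = PCA(n_components=2)
X2 = pca.fit_transform(X)        # project onto the two largest-variance directions
```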

2.1.3. Model Training and Evaluation

ML models are trained to perform effectively on new, unseen datasets. To achieve this, the dataset must be divided into subsets. Typically, the data are split into two parts: a training set containing the majority of the data (e.g., 80% of the total) and a test set comprising the remainder (e.g., 20%). Alternatively, the dataset can be divided into three subsets—training, validation, and test sets—where the training set occupies the largest portion, while the validation and test sets each contain smaller portions of the data. When the dataset is small, cross-validation (CV) is often employed: the dataset is partitioned into N folds (commonly 5- or 10-fold CV), one fold is held out for evaluation while the remaining folds are used for training, and the process is repeated until each fold has served exactly once as the held-out set. After data partitioning, the model is trained using the designated training set.
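The k-fold cross-validation procedure just described can be sketched in a few lines with scikit-learn; the dataset and model below are synthetic and illustrative.

```python
# Sketch of 5-fold cross-validation: each fold serves once as the held-out set.
# Synthetic data; a decision tree stands in for any candidate model.
import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.random((60, 4))                      # small dataset, where CV is most useful
y = (X.sum(axis=1) > 2.0).astype(int)        # toy binary label

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)
mean_acc = scores.mean()                     # average accuracy over the 5 folds
```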
The most standard approach to model selection is to train the model using the training set and then evaluate its prediction error on the validation set. This error provides an accurate measure of the model’s predictive performance on unseen data. The model corresponding to the smallest prediction error is subsequently chosen as the optimal model. Identifying the best model among trained candidates constitutes the core of the model training process. The primary objective of model selection is to prevent overfitting. Common optimization algorithms used for ML models in HEA research include Bayesian optimization, genetic algorithms, and particle swarm optimization.
Model evaluation serves as a measure of a model’s performance. For classification problems, commonly used evaluation metrics include accuracy, precision, and recall. A confusion matrix is frequently employed to visualize classification results, and a typical 2 × 2 confusion matrix is illustrated in Figure 3. Accuracy represents the proportion of correctly predicted samples among all samples and is calculated according to Equation (1). Precision measures the proportion of correctly predicted positive samples among all samples predicted as positive, as defined in Equation (2). Recall (also known as the true positive rate) represents the proportion of actual positive samples that are correctly identified by the model, as given in Equation (3). The abbreviations used are defined in the confusion matrix of Figure 3. For regression problems, common evaluation metrics include the root mean square error (RMSE) and the coefficient of determination (R²), as shown in Equations (4) and (5), respectively. A higher RMSE indicates a larger prediction error, whereas a higher R² value signifies a better model fit. For additional details, refer to [41].
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (1)
Precision = TP / (TP + FP)    (2)
Recall = TP / (TP + FN)    (3)
RMSE = √[(1/n) Σᵢ₌₁ⁿ (yᵢᵃ − yᵢᵖ)²]    (4)
R² = 1 − [Σᵢ₌₁ⁿ (yᵢᵃ − yᵢᵖ)²] / [Σᵢ₌₁ⁿ (yᵢᵃ − ȳ)²]    (5)
where n is the total number of test samples, i denotes the ith sample, yᵢᵃ and yᵢᵖ are the actual and predicted outputs of sample i, respectively, and ȳ is the mean of the target attribute over the dataset.
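The metrics in Equations (1)–(5) can be computed directly; the confusion-matrix counts and regression values below are toy numbers chosen only to illustrate the formulas.

```python
# Equations (1)-(5) implemented directly; all inputs are toy values.
import numpy as np

# classification counts from a hypothetical 2x2 confusion matrix
TP, TN, FP, FN = 40, 30, 10, 20
accuracy  = (TP + TN) / (TP + TN + FP + FN)   # Eq. (1)
precision = TP / (TP + FP)                    # Eq. (2)
recall    = TP / (TP + FN)                    # Eq. (3)

# regression metrics on toy actual/predicted values
y_a = np.array([1.0, 2.0, 3.0])               # actual outputs y_i^a
y_p = np.array([1.1, 1.9, 3.2])               # predicted outputs y_i^p
rmse = np.sqrt(np.mean((y_a - y_p) ** 2))                             # Eq. (4)
r2 = 1 - np.sum((y_a - y_p) ** 2) / np.sum((y_a - y_a.mean()) ** 2)   # Eq. (5)
```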

2.2. Example Diagram of the Application of Machine Learning to Phase Prediction, Performance and Composition Design

This section presents the specific applications of the aforementioned ML methods in HEAs, primarily through schematic diagrams illustrating various ML architectures and their corresponding functionalities [42,43,44]. Figure 4 illustrates a machine learning-based framework for the design of high-entropy alloys, structured into four phases: data preparation, model development and optimization, performance prediction and design output, and experimental validation feedback. This approach facilitates the transition from an experience-driven to a data- and intelligence-driven paradigm in materials development. The following three sections will further elaborate on recent advancements in this field.

3. Phase Prediction of HEA

The microscopic organization and structure of a material determine its properties; therefore, accurate phase structure prediction has a significant impact on materials research. Figure 5 illustrates the phase classification of HEAs at this stage, which can be divided into single-phase and multiphase categories. Thermodynamic properties govern phase formation in alloys and play a crucial role in the design of HEAs. However, due to the lack of fundamental thermodynamic data, achieving high-precision phase structure prediction through conventional thermodynamic calculations remains challenging. ML provides an important alternative approach for studying HEAs by leveraging existing thermodynamic datasets. The multi-component nature of HEAs leads to complex phase structures that are difficult to predict accurately using traditional methods. ML can rapidly predict alloy phase diagrams and microstructures by analyzing the relationships between composition and phase structure, thereby aiding in understanding phase stability and distribution. This section summarizes recent research progress on the application of ML in phase classification and phase prediction of HEAs, and further discusses predictive models and hybrid approaches that combine ML techniques with traditional computational methods.

3.1. ML Model Prediction

At present, ML is widely applied in the phase classification and phase structure prediction of HEAs [45,46,47], effectively guiding their design and development. Various ML algorithms play important roles in HEA research, although the functions of different algorithmic models vary. This section reviews the prediction of phase structures using various ML algorithms. The integration of multiple ML models can reduce the risk of overfitting and improve prediction accuracy.

3.1.1. RF

RF is an ensemble learning method that combines multiple DTs for classification or regression tasks. It integrates the concepts of bagging and random feature selection, achieving high accuracy, robustness, and strong generalization capability, and has been widely applied in ML-based HEA prediction. In HEA phase prediction, RF serves as a powerful tool due to its ability to handle high-dimensional data, assess feature importance, and maintain robustness. However, it also has certain limitations, including high computational cost, limited model interpretability, and potential overfitting risks. Therefore, researchers should balance the advantages and disadvantages of RF according to specific application requirements and data characteristics, and consider combining it with other methods (e.g., deep learning or SVM) to enhance both predictive performance and interpretability.
The DT classification mechanism within RF endows the algorithm with high accuracy and the ability to handle multi-featured datasets, providing certain advantages over other ML algorithms in HEA phase prediction. Mishra et al. cross-validated a training dataset comprising 601 cast alloys and tested their models using integrated approaches (RF and stacked integration) and SVM methods. They predicted Solid Solution (SS), Intermetallic (IM), Amorphous (AM), SS + IM, and IM + AM phases. The results demonstrated that stacked integration—combining weak learners with meta-models—achieved accuracy comparable to that of neural network models [48]. Han et al. applied ensemble learning methods, represented by RF and Extreme Gradient Boosting (XGBoost), to over 800 HEAs with 16 input features for phase composition prediction. Their model achieved higher prediction accuracy than other traditional ML models. Furthermore, the effectiveness of the feature training model was validated, and dimensionality reduction using Principal Component Analysis (PCA) was performed without any loss in accuracy [49]. Peivaste et al. developed an integrated dataset containing 5692 experimental records covering 50 elements and 11 phase categories. By comparing the performance of various ML models, they found that RF and XGBoost consistently outperformed others, achieving an accuracy of 86% in predicting all phases [50]. Oñate et al. evaluated the phase prediction performance of HEAs using four supervised ML models: K-Nearest Neighbors (KNN), multinomial regression, XGBoost, and RF. They addressed the challenge of predicting multicomponent alloys by accounting for overlaps among multiparametric stability parameters, employing eight prediction classes (FCC, BCC, FCC + BCC, FCC + IM, BCC + IM, FCC + BCC + IM, IM, and AM). The model predictions were compared with two newly fabricated alloys prepared via induction melting under a controlled atmosphere and analyzed by X-ray diffraction (XRD). 
The study revealed that, with a robust database and appropriate preprocessing, conventional ML methods based on SS, SS + IM, IM, and AM classifications achieved satisfactory and competitive predictive metrics. Among the four evaluated models, RF exhibited the best performance, with an accuracy of 72.8% and an ROC-AUC of 93.1% [51].
Although RF exhibits high accuracy and strong capability in handling high-dimensional data, it functions as a “black box,” meaning that its internal decision processes cannot be easily interpreted. This results in limited model explainability and relatively poor performance in regression and low-dimensional problems. When HEA datasets are insufficient and the number of features is small, the prediction accuracy is significantly reduced, making it difficult to extract meaningful phase formation rules to guide alloy design.
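The feature-importance capability noted above, one of RF's main attractions for phase prediction, can be sketched as follows. The descriptors, labels, and data are synthetic; the feature names merely echo common HEA parameters (atomic size difference, mixing enthalpy/entropy, valence electron concentration, electronegativity difference) and are not taken from any of the cited studies.

```python
# Sketch of RF phase classification with feature-importance inspection.
# Synthetic data; feature names are illustrative HEA-style descriptors.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
features = ["delta_r", "dH_mix", "dS_mix", "VEC", "delta_chi"]
X = rng.random((300, 5))
y = (X[:, 3] > 0.5).astype(int)     # toy label driven entirely by "VEC"

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = dict(zip(features, rf.feature_importances_))
top = max(importances, key=importances.get)   # recovers the dominant descriptor
```

Because the toy label depends only on the "VEC" column, the fitted forest assigns it by far the largest importance, illustrating how RF can surface dominant phase-formation descriptors.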

3.1.2. NN

NNs are computational models that mimic the structure and function of the human brain. They consist of numerous artificial neurons that perform learning and recognition tasks by simulating the connections and information transfer between neurons. The training process of an NN generally includes two main steps—forward propagation and backpropagation—through which the connection weights between neurons are continuously adjusted to improve network performance. An NN typically comprises an input layer, one or more hidden layers, and an output layer. The input layer receives data from the external environment, the hidden layers process the signals and pass them to the next layer, and the output layer delivers the final results to the outside world [33]. NNs encompass various architectures, including Artificial Neural Networks (ANNs), Deep Neural Networks (DNNs), and Convolutional Neural Networks (CNNs). NNs demonstrate significant advantages in HEA phase prediction owing to their powerful feature-learning capability, ability to handle large-scale data, and high predictive performance. However, they also face challenges such as high computational cost, limited model interpretability, and large data requirements. Researchers should comprehensively consider these advantages and limitations, select and design NN models according to practical application needs, and combine them with other techniques (e.g., feature engineering or model fusion) to optimize the prediction performance for HEAs.
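A minimal sketch of such a feedforward network, trained by backpropagation via scikit-learn's MLPClassifier, is given below. The five input descriptors and the binary phase label are synthetic placeholders, not data from the studies discussed in this section.

```python
# Sketch of a small feedforward NN (input -> two hidden layers -> output),
# trained by backpropagation; synthetic descriptors and labels.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
X = rng.random((200, 5))                      # e.g., five compositional descriptors
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)     # toy binary phase label

mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(X, y)                                 # forward pass + backpropagation internally
train_acc = mlp.score(X, y)
```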
ANNs are self-learning systems that progressively recognize inherent patterns through exposure to labeled training samples, iteratively refining their predictive capabilities. Brown et al. trained a classical ANN using both a quantum computer simulator and a quantum processor to predict phase selection in HEAs. The alloy compositions served as inputs, while the corresponding phases were used as outputs. The quantum simulator was subsequently employed to implement a hybrid quantum–classical ML algorithm for the same supervised learning task. The resulting test accuracy was comparable to that achieved by the classical ANN. Finally, a quantum processor was used to perform hybrid quantum–classical ML computations, yielding slightly lower accuracy due to the instability and vulnerability of qubits within the quantum device [52]. Huang et al. employed three different ML algorithms—KNN, SVM, and ANN—using 401 experimental datasets, which included 174 SS, 54 IM, and 173 SS + IM phases. These datasets were characterized by five key parameters: atomic size difference, mixing entropy, mixing enthalpy, valence electron concentration, and electronegativity difference. The dataset was divided into four approximately equal subsets for cross-validation, and all three phase types were classified simultaneously. The test accuracies achieved were 68.6%, 64.3%, and 74.3% for KNN, SVM, and ANN, respectively. Subsequently, the classification of two-phase combinations using SVM and ANN was further refined. Using ANN, the test accuracies for the SS–IM, SS&IM–IM, and SS–SS&IM classifications reached 86.7%, 94.3%, and 78.9%, respectively—significantly higher than those obtained with SVM. The trained NN model demonstrated the best overall performance among the three algorithms, effectively predicting the phases of new HEAs. 
Moreover, a mathematical expression was proposed to estimate the probability of forming FCC and BCC phases in HEAs, derived through statistical analysis of experimental data and ML model predictions [53]. Krishna et al. investigated multiphase alloy systems composed of SS + IM mixtures using six ML algorithms—logistic regression, DT, SVM, RF, gradient boosting classifiers, and ANN—on a dataset of 636 alloys. Their analysis revealed overlapping boundaries among design parameters, which hindered accurate phase prediction [54]. Zhou et al. employed an artificial neural network to construct a machine learning model, from which a sensitivity matrix was derived to quantitatively evaluate the regulatory effects of various design parameters on the formation of specific phase structures, including solid solutions, intermetallic compounds, and amorphous phases, as shown in Figure 6. Furthermore, the study introduced a set of extended parameters previously unconsidered in the design of high-entropy alloys or complex multi-phase alloys, systematically exploring their potential influence on phase formation. Ultimately, systematic experimental validation confirmed the feasibility and effectiveness of the design rules derived from the machine learning framework [55].
Compared with shallow models, DNN modeling can represent complex nonlinear problems more accurately and efficiently. Zhu et al. proposed a DNN architecture based on a Residual Network (ResNet) to predict the phase formation of HEAs. The model achieved a high overall accuracy of 81.9%, and its Micro-F1 score outperformed those of other ML models such as ANN and conventional DNNs in HEA phase prediction. The residual connections in ResNet effectively mitigated network degradation and improved the algorithm’s accuracy [56]. Most HEA phase prediction models rely on empirical thermophysical parameters, making the process complex and heavily dependent on the accuracy of descriptors, which significantly affects the final prediction results. To address this issue, Zhou et al. proposed a Deep Learning (DL) algorithm that incorporates additive manufacturing (AM) process parameters and data augmentation, and developed a more refined HEA phase classification strategy capable of predicting complex multiphase systems. By integrating AM process parameters into the model features, the prediction accuracy of HEA phases was substantially improved, while data augmentation alleviated the problem of data scarcity and further enhanced the performance of the DL model. After incorporating AM parameters and applying data augmentation, the model achieved an accuracy of 91.23%, and four new HEAs were fabricated through AM experiments to validate the robustness and practicality of the algorithm [57].
A CNN is a type of feedforward neural network with a convolutional structure. It reduces the memory requirements of deep networks, effectively extracts regional features, and decreases the number of parameters to mitigate overfitting. Guo et al. simplified the prediction process and improved prediction accuracy by automatically extracting features using CNNs for classification. They mapped HEA compositions onto a pseudo–two-dimensional periodic table, achieving prediction accuracies of over 89% for intermetallics, and over 98% for solid-solution (SS) and amorphous phases. The phase compositions of AlxFeCrNi (x = 0, 0.5, 1.0) HEAs predicted by the model were consistent with experimental results [42].

3.1.3. SVM

SVM is a binary classification model that separates samples of different classes by finding an optimal hyperplane. The basic idea of SVM is to identify a hyperplane that divides two classes of samples while maximizing the distance between the hyperplane and the nearest sample points, thereby achieving effective data classification. In HEA phase prediction, SVM offers strong classification capability and can efficiently handle high-dimensional data, showing good adaptability to nonlinear problems and robustness to noise. However, its high computational complexity, sensitivity to parameter selection, and limited interpretability when applied to large-scale datasets constrain its performance in certain applications. The decision to employ SVM should be based on the specific data characteristics and application requirements, and it may be beneficial to combine it with other algorithms or techniques (e.g., dimensionality reduction or model ensemble) to further enhance predictive performance.
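To make the margin-maximization idea concrete, the following minimal Python sketch (not taken from any cited study; the descriptor ranges and the VEC-based labeling rule are illustrative assumptions) trains an RBF-kernel SVM on synthetic alloy descriptors:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical descriptors [VEC, atomic-size difference] for 200 alloys;
# labels follow the empirical rule of thumb that high VEC favours FCC (1)
# over BCC (0) -- purely for illustration.
X = np.column_stack([rng.uniform(4.0, 9.5, 200), rng.uniform(0.0, 8.0, 200)])
y = (X[:, 0] > 7.5).astype(int)

# Standardise features, then fit an RBF-kernel SVM; C and gamma set the
# margin-softness/complexity trade-off and would normally be tuned.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"5-fold CV accuracy: {scores.mean():.2f}")
```

Because the features are scaled before fitting, the kernel treats both descriptors on an equal footing, which is exactly the preprocessing sensitivity noted above.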
Vishwakarma et al. employed a Support Vector Machine (SVM) model to classify HEAs into stable body-centered cubic (BCC), face-centered cubic (FCC), and other mixed phases using cross-validation. The model achieved training and testing accuracies exceeding 86%. This study demonstrated that ML methods can effectively classify and identify phases in HEAs with high and comparable predictive accuracy [58]. Chang et al. applied ML to investigate the formation and stability of solid solution (SS) phases using a dataset of 656 HEAs. The independence of nine physical parameters was verified through the Self-Organizing Mapping (SOM) algorithm, and their relative importance was ranked using feature importance analysis. The results indicated that the root-mean-square residual strain is the most critical parameter, enabling quantitative prediction of SS phase stability. SVM, GBDT, Multilayer Perceptron (MLP), and Logistic Regression (LR) algorithms were applied to predict SS phase formation, achieving test accuracies of 95.22%, 94.78%, 90.87%, and 89.57%, respectively. This study provides a new perspective on SS phase stability from the viewpoint of intrinsic residual strain and suggests that SVM may serve as a particularly effective algorithm for predicting HEA phase formation [59].

3.1.4. KNN

The KNN algorithm was proposed by Cover and Hart in 1967. As a basic and intuitive classification algorithm, KNN is a type of supervised learning that requires labeled training data. The class of a new sample is determined based on the k training samples closest to it, according to a specific classification rule. The basic approach involves three key steps: (1) defining the distance metric; (2) selecting the value of k (i.e., identifying the k nearest instances in the training set to the test sample); and (3) applying the classification decision rule. In HEA phase prediction, the advantages of KNN include its simplicity, lack of a dedicated training phase, adaptability to nonlinear relationships, and robustness to local outliers. However, it also suffers from limitations such as high computational complexity, sensitivity to data size and feature scaling, difficulty in handling high-dimensional data, and challenges in selecting optimal hyperparameters. Researchers should carefully consider these factors when applying KNN and may need to integrate techniques such as data preprocessing, feature selection, and model ensemble methods to further optimize model performance.
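The three steps above can be sketched in a few lines of Python; the two-descriptor toy data and cluster centers are purely illustrative assumptions, not data from the cited studies:

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    """Classify x_new by majority vote among its k nearest training points.
    Step 1: distance metric (Euclidean; features assumed pre-scaled).
    Step 2: select the k closest training samples.
    Step 3: apply the decision rule (majority vote)."""
    d = np.linalg.norm(X_train - x_new, axis=1)      # step 1
    nearest = np.argsort(d)[:k]                      # step 2
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                 # step 3

# Toy two-descriptor data: class 0 clustered near (0, 0), class 1 near (3, 3).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(knn_predict(X, y, np.array([2.8, 3.1])))  # → 1 (nearest the class-1 cluster)
```

Note that there is no training phase: the whole training set is scanned at prediction time, which is the source of the computational-cost limitation mentioned above.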
Qu et al. constructed an HEA phase selection strategy based on a large as-cast dataset comprising 2043 alloys, including HEAs as well as binary and ternary alloys. By combining multiple KNN models with ensemble learning methods, their approach achieved remarkably high predictive performance, with a test accuracy of 93% for the two proposed new thermodynamic parameters, and accuracy values exceeding 97% for each individual phase [60]. Risal et al. prepared nine different datasets extracted from experimental data and applied four different ML algorithms—KNN, SVM, RF, and MLP classifiers—for phase prediction. The best results were obtained using KNN and RF on the Oversampled-PCA-6 dataset, achieving test accuracies of 92.31% and 91.21%, respectively. Precision–recall curves were plotted for the best estimators, yielding an average accuracy of 92% and a micro-mean ROC accuracy of 98% for the optimal classifier [61].

3.1.5. Boosting

The boosting algorithm employs a sequential approach to integrate multiple weak classifiers into a single strong classifier, thereby enhancing the overall predictive accuracy of the model. Its basic principle is to iteratively stack weak classifiers, with each subsequent layer trained to correct the errors of the previous one, until the entire training dataset is accurately predicted or the maximum number of classifiers is reached. There are three main types of boosting algorithms: Adaptive Boosting (AdaBoost), Gradient Boosting (GB), and XGBoost. The application of boosting methods in HEA phase prediction offers notable advantages, including improved predictive accuracy, the ability to handle complex data, reduced overfitting, feature importance evaluation, and algorithmic flexibility. However, boosting also presents certain limitations, such as high computational cost, challenges in hyperparameter tuning, limited interpretability, and sensitivity to noise. When applying boosting algorithms, it is essential to balance these advantages and drawbacks, consider the characteristics of the dataset and the practical requirements, and appropriately select and tune the boosting method to achieve optimal predictive performance.
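As a minimal illustration of this sequential error-correction principle (on synthetic data, not the HEA datasets discussed below), the following sketch compares a single decision stump with an AdaBoost ensemble of stumps; AdaBoost re-weights the training samples each round so later stumps concentrate on previously misclassified examples:

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Synthetic stand-in for a descriptor table: 10 features, 3 informative.
X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A single weak learner (depth-1 tree) versus 100 boosted weak learners;
# AdaBoost's default base estimator is exactly this kind of stump.
stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)
boost = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
stump_acc = stump.score(X_te, y_te)
boost_acc = boost.score(X_te, y_te)
print(f"single stump: {stump_acc:.2f}, boosted ensemble: {boost_acc:.2f}")
```

The same stacking of weak learners underlies Gradient Boosting and XGBoost, which fit each new tree to the residual errors of the current ensemble rather than re-weighting samples.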
Bobbili et al. employed multiple models—XGBoost, RF, AdaBoost, DT, Logistic Regression (LR), SVM, and KNN—to predict phases. Among them, XGBoost achieved the highest accuracy (90%) and identified key variables—mean atomic radius, atomic size difference, mixing enthalpy, ideal mixing entropy, and valence electron concentration—in agreement with experimental observations [51]. Hareharen et al. used five ML algorithms (DT, KNN, RF, GB, and XGBoost) to predict the phase and crystal structure of HEAs. The input features, trained on a large experimental dataset, included atomic size difference, mixing enthalpy, mixing entropy, valence electron concentration, electronegativity difference, Ω, and average melting temperature. XGBoost attained 80% accuracy for phase prediction. When trained on the experimental dataset, RF delivered the highest accuracy (94%) for crystal-structure prediction. To further enhance performance, the authors combined the ML models with the Synthetic Minority Oversampling Technique (SMOTE); under this setting, XGBoost reached 93% accuracy for crystal-structure identification and 84% for phase prediction [45].
ML has been widely applied to HEA phase prediction. However, relying on a single (or even two) algorithms can increase the risk of overfitting and yield biased predictions. Leveraging multiple ML algorithms—either for comparative model selection or via ensemble strategies—can mitigate overfitting and enhance model robustness.

3.2. Mixed Model Phase Prediction

ML methods have been widely and successfully applied to predict the phases of HEAs. However, the performance of a single algorithm is often suboptimal, and different ML models may yield discrepant predictions, which hinders their use in materials design. Integrating traditional approaches with ML improves predictive accuracy and offers a promising direction for phase prediction. This section reviews phase-structure prediction using hybrid models. In particular, combining ML with thermodynamics-based methods provides a new paradigm for phase-structure prediction.

3.2.1. Combined with Thermodynamic Calculations

The CALPHAD method has been shown to be a powerful tool in alloy design [62,63,64,65] and has attracted significant interest for HEA design. However, CALPHAD is generally reliable only within well-defined composition–temperature spaces [66,67,68]. Integrating CALPHAD with ML has broadened the use of phase-diagram calculations for HEAs: CALPHAD-generated datasets can be used to train and select ML models, effectively addressing data scarcity.
Qu et al. used SVM to construct phase-selection models from datasets comprising composition features and thermodynamic parameters. Both SVM models achieved accuracies above 85%, and a dataset with >1000 entries covering 18 elements and a broad range of HEAs was established. The comparable performance of the composition-based and thermodynamics-based models indicates that phase formation in HEAs can be predicted directly from composition, with thermodynamic parameters effectively serving as a feature transformation of the compositional space [69]. Zeng et al. combined CALPHAD calculations with ML to generate >300,000 equilibrium-phase data points from 20 quinary systems built from eight elements (Al, Co, Cr, Cu, Fe, Mn, Ni, Ti) using Thermo-Calc and the TCHEA3 database. They initially selected 15 material/physical descriptors and then used XGBoost to identify the five most important. Based on these five descriptors, they established new single-phase selection rules for FCC and BCC, achieving a success rate exceeding 90%, significantly outperforming existing rules. This provides a powerful tool for single-phase screening across wide temperature ranges and reveals high-fidelity new phase-selection rules for HEAs [70].
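A descriptor-screening step of the kind Zeng et al. performed can be sketched as follows; the sketch substitutes scikit-learn's gradient-boosted trees for XGBoost and uses synthetic data, so the descriptor indices are illustrative only:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for CALPHAD-generated training data: 15 candidate
# descriptors, of which only the first 5 actually drive the phase label
# (shuffle=False keeps the informative columns at indices 0-4).
X, y = make_classification(n_samples=1000, n_features=15, n_informative=5,
                           n_redundant=0, shuffle=False, random_state=0)

# Boosted trees expose impurity-based importances, which can be used to
# keep only the top-k descriptors for subsequent rule construction.
gb = GradientBoostingClassifier(random_state=0).fit(X, y)
top5 = np.argsort(gb.feature_importances_)[::-1][:5]
print("five most important descriptors:", sorted(top5.tolist()))
```

In practice the retained descriptors would then seed a simpler, physically interpretable selection rule, as in the FCC/BCC single-phase rules described above.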
He et al. successfully designed BCC/FCC dual-phase refractory high-entropy alloys (RHEAs) by combining CALPHAD with NN modeling. Based on binary phase-formation relationships among the alloying elements, 13 BCC/FCC dual-phase RHEAs exhibiting liquid-phase separation were selected as training data. Two binary NN classifiers (“multiphase” vs. “solid solution”) were then trained, achieving accuracies of 89.52% and 89.83%, respectively. Applying these models to 504 candidate RHEA compositions identified 51 BCC/FCC dual-phase RHEAs, enabling the first reported compositional design of metastable BCC/FCC dual-phase RHEAs. Experimental validation showed that the arc-melted alloys exhibited a reinforced dendritic microstructure. This work provides an effective pathway for designing multiphase RHEAs tailored to specific performance targets [71].

3.2.2. Combined with DS Evidence Theory

Dempster–Shafer (DS) evidence theory was proposed by A. P. Dempster at Harvard University to address multi-valued mapping problems using upper and lower probabilities [72], and was subsequently refined by his students. DS theory provides a general framework for reasoning under uncertainty that accommodates both objective evidence and subjective judgment. Its hallmark is the use of interval estimates—typically expressed as belief and plausibility—rather than single point estimates, offering greater flexibility to distinguish uncertainty from ignorance and to represent evidence aggregation faithfully.
Hou et al. proposed a hybrid framework for HEA phase prediction that resolves inter-model conflicts by combining DS evidence theory with ML algorithms, using an improved DS scheme. A dataset of 426 HEA samples—180 quinary, 189 senary, and 57 septenary compositions—was compiled. Predictions from different algorithms were fused via DS evidence theory to produce phase labels with the highest confidence (belief/plausibility), yielding a high-precision conflict-resolution model. The framework further integrates an empirical-knowledge criterion with the DS-based module to enhance efficiency and accuracy. Validation showed that the hybrid framework achieved higher accuracy and overall performance than any single ML algorithm [73].
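The fusion step can be illustrated with Dempster's combination rule; the mass assignments below are hypothetical classifier outputs, not values from Hou et al.:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Fuse two basic mass assignments (dicts: frozenset -> mass) with
    Dempster's rule: m(A) is proportional to the sum of m1(B)*m2(C) over
    B & C == A, renormalised by the conflict K = sum over B & C == empty."""
    combined, conflict = {}, 0.0
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + mB * mC
        else:
            conflict += mB * mC
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

# Two hypothetical classifiers' beliefs over phase labels; each keeps some
# mass on the full frame {FCC, BCC} to express ignorance rather than
# forcing a point estimate.
FCC, BCC = frozenset({"FCC"}), frozenset({"BCC"})
ALL = FCC | BCC
m_svm = {FCC: 0.6, BCC: 0.1, ALL: 0.3}
m_rf = {FCC: 0.5, BCC: 0.2, ALL: 0.3}
fused = dempster_combine(m_svm, m_rf)
print(max(fused, key=fused.get))  # → frozenset({'FCC'})
```

The residual mass left on the full frame after fusion quantifies remaining ignorance, which is exactly the interval-estimate character that distinguishes DS theory from point-probability fusion.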

3.3. Incorporating Physical Constraints into Machine Learning and the Applicability Boundaries of Purely Data-Driven Methods

The phase selection and steady-state structure of high-entropy alloys are fundamentally constrained by both thermodynamic potential landscapes and the kinetics of evolutionary pathways. While purely data-driven models can achieve high prediction accuracy within the training data distribution, they are prone to significant deviations once extended beyond the statistical support domain in composition, processing, or temperature, because nothing forces their predictions to remain physically consistent. Therefore, systematically incorporating physical constraints at both the model-structure and training-objective levels is a key strategy for mitigating unreasonable extrapolation behavior.
At the model structure level, physical constraints can be embedded by incorporating conservation laws and symmetries (such as element substitution invariance and scale invariance), introducing phase diagram priors and phase fraction constraints, explicitly constructing parameterized relationships between free energy, chemical potential, and state variables, and integrating multi-scale physical representations from CALPHAD, phase-field simulations, or first-principles calculations [74,75,76]. This ensures that the representation space of the network is naturally confined to the “physically feasible domain”. At the loss function level, a multi-objective optimization form with both soft and hard constraints can be constructed:
$$\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda_{\text{thermo}}\,\mathcal{L}_{G} + \lambda_{\text{consv}}\,\mathcal{L}_{\text{mass/charge}} + \lambda_{\text{eq}}\,\mathcal{L}_{\text{phase-eq}} + \lambda_{\text{kin}}\,\mathcal{L}_{\text{kinetics}} + \lambda_{\text{mono}}\,\mathcal{L}_{\text{monotonicity}} + \lambda_{\text{cal}}\,\mathcal{L}_{\text{UQ-calibration}},$$
where $\mathcal{L}_{G}$ applies thermodynamic constraints on the free-energy surface morphology and phase-stability regions, $\mathcal{L}_{\text{mass/charge}}$ enforces mass or charge conservation, $\mathcal{L}_{\text{phase-eq}}$ ensures that phase fractions and their boundary conditions satisfy the Gibbs phase rule, and $\mathcal{L}_{\text{kinetics}}$ introduces differential constraints from diffusion processes or phase-transition kinetics (which can be realized using Physics-Informed Neural Networks, PINNs, or Neural Ordinary Differential Equations). To support these constraint mechanisms, uncertainty quantification and confidence-domain analysis are required to express the model's prediction risk on out-of-distribution samples explicitly as confidence intervals, thus triggering constrained active-learning or supplementary-sampling loops [77,78].
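A minimal sketch of such a composite loss, assuming a mean-squared data term and simple soft penalties for phase-fraction conservation and non-negativity (the thermodynamic and kinetic terms would require an external physical model and are omitted here), might look like:

```python
import numpy as np

def composite_loss(y_pred, y_true, phase_frac, lam_consv=1.0, lam_eq=1.0):
    """Data term plus soft physical penalties. The 'conservation' term
    forces predicted phase fractions (rows of phase_frac) to sum to one;
    the 'equilibrium' term penalises negative fractions. Both vanish on
    physically admissible predictions."""
    l_data = np.mean((y_pred - y_true) ** 2)
    l_consv = np.mean((phase_frac.sum(axis=1) - 1.0) ** 2)
    l_eq = np.mean(np.clip(-phase_frac, 0.0, None) ** 2)
    return l_data + lam_consv * l_consv + lam_eq * l_eq

frac_ok = np.array([[0.7, 0.3], [0.5, 0.5]])    # admissible fractions
frac_bad = np.array([[0.9, 0.4], [-0.1, 1.0]])  # violates both constraints
y = np.zeros(2)
print(composite_loss(y, y, frac_ok))   # physical penalties vanish: 0.0
print(composite_loss(y, y, frac_bad))  # violations add to the loss
```

In a PINN-style training loop these penalties would be differentiated together with the data term, so gradient descent is steered back toward the physically feasible domain.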
Existing studies have demonstrated that hybrid paradigms combining CALPHAD and machine learning can simultaneously use high-throughput thermodynamic calculations as both physical priors and data augmentation sources, significantly improving the stability and interpretability of single-phase/multi-phase classification tasks. Multiple studies covered in this review have repeatedly verified the effectiveness of this approach, such as adaptive screening of novel single-phase criteria based on XGBoost, and deep multi-phase classification achieved by combining additive manufacturing process parameters with data augmentation strategies.
Typical failure scenarios show that purely data-driven models tend to exhibit a "cliff-like" decline in accuracy when the training set is sparse in non-equiatomic regions, cross-family element combinations, or extreme temperature conditions. When the labels contain systematic errors (such as EDS compositional biases or DFT pseudopotential approximations), the model readily learns spurious correlations. Furthermore, in the absence of phase-fraction or microstructural information, networks tend to overfit statistical noise in multi-phase overlapping regions, thereby violating phase-equilibrium conditions [78,79].
As pointed out in Section 6.2 of this paper, the combination of small datasets, distribution drift, and extrapolation instability is a common challenge in current data-driven materials modeling. The "physical constraints—uncertainty calibration—active learning" triad framework proposed in this paper aims to transform extrapolation risks from implicit traps into explicitly diagnosable and repairable quantities. This not only clarifies the applicability boundaries of purely data-driven methods but also provides feasible corrective paths for crossing those boundaries. Figure 7 presents a two-dimensional UMAP projection of the material representations, which was used for domain identification.

3.4. Multi-Model Integration and Structural Fusion: From Statistical Complementarity to Mechanism-Data Coupling

In the design of high-entropy alloys—materials characterized by high-dimensional compositional spaces and strongly nonlinear couplings—any single learner rarely strikes a satisfactory balance between bias and variance. Ensemble learning, stacked generalization, and evidence theory together offer a route from mere statistical complementarity to genuine structural integration. Bagging and boosting curb variance and enhance the robustness of a cohort of weak learners, whereas stacking further introduces a meta-learning mechanism across the triad comprising models, features, and physical priors, thereby alleviating the systematic biases inherent to any isolated model. Empirical studies consistently show that parallel ensemble methods such as RFs and XGBoost outperform stand-alone predictors in phase-prediction tasks; feature-integration strategies based on principal component analysis or kernel mappings achieve substantial dimensionality compression while maintaining accuracy. When algorithmic predictions conflict, Dempster–Shafer evidence theory provides a unified representation of belief functions and plausibility bounds, bringing conflict reconciliation and confidence assessment into a single, interpretable framework [46,80]. Within physics–data co-modeling, large phase-equilibrium datasets spanning multiple temperature regimes and generated by CALPHAD can serve simultaneously as training samples and as structural priors—two facets of a common, physically grounded source. Integrating a compact set of key descriptors distilled from these data with machine-learning classifiers yields high-fidelity single-phase selection rules that retain strong transferability across wide temperature ranges [79].
The common advantage of these approaches is to recast inter-model discrepancies as evidence amenable to principled weighting rather than as binary, either–or verdicts, thereby leveraging structured priors to stabilize the boundaries of extrapolation.
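A stacked ensemble of the kind described above can be sketched with scikit-learn; the synthetic descriptor table and the choice of base learners are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a phase-prediction descriptor table.
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           random_state=0)

# Level-0 learners with complementary inductive biases (bagged trees and a
# kernel method); a level-1 logistic-regression meta-learner combines their
# out-of-fold predictions, which is the stacked-generalization mechanism.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),
    cv=5)
acc = cross_val_score(stack, X, y, cv=5).mean()
print(f"stacked 5-fold CV accuracy: {acc:.2f}")
```

Because the meta-learner sees only out-of-fold predictions, stacking weights each base model by where it is actually reliable rather than issuing an either-or verdict, in the same spirit as the evidence-theoretic fusion above.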

4. Prediction of HEA Performance

HEAs exhibit excellent properties owing to their multi-element compositions and unique microscopic crystal structures. For instance, the presence of multiple elements in solid solution enhances strength and hardness [81,82] through atomic lattice distortion; HEAs also maintain a single solid-solution phase over a wide temperature range, reducing phase transitions and ensuring high-temperature stability [83]. Moreover, HEAs possess good wear resistance [84,85] due to solid-solution strengthening and their distinctive multi-component microstructure, while their high chemical complexity contributes to excellent corrosion resistance [86,87] and oxidation resistance [88] at elevated temperatures. ML modeling enables the prediction of mechanical properties, corrosion resistance, and thermal stability of HEAs under diverse environments. These predictions are typically based on extensive experimental datasets, computational simulations, and microstructural descriptors. Advanced techniques such as deep learning can capture the complex relationships between microstructure and macroscopic properties, thereby assisting researchers in understanding and optimizing material performance. Consequently, the development of HEAs for industrial applications has attracted significant attention, and ML-driven performance prediction plays a particularly important role in this progress.
The properties of HEAs—such as hardness, tensile strength, and corrosion resistance—are typically expressed as continuous numerical variables, making regression modeling a central tool in data-driven studies of material performance prediction. By establishing nonlinear mapping relationships among elemental composition (e.g., atomic radius, electronegativity, valence electron concentration), processing parameters (e.g., annealing temperature, cooling rate), and target properties, researchers can employ ensemble-learning algorithms such as RFs and GBDT, or DNN–based cross-scale modeling approaches (e.g., combining CNNs with physical constraints) to achieve quantitative prediction of HEA properties. It is worth noting that such models must address the overfitting challenge associated with high-dimensional, small-sample datasets through feature-engineering optimization (e.g., principal component analysis or material-descriptor screening), while improving generalization performance via cross-validation, Bayesian optimization, and related techniques. In recent years, attention mechanism–based Transformer architectures and symbolic regression methods have also been introduced to enhance model interpretability for complex composition–structure–property relationships. Nevertheless, the deep integration of physical mechanisms with data-driven models remains a key challenge in this field, underscoring the urgent need to develop hybrid prediction frameworks that couple multi-scale methods, such as phase-field simulations and first-principles computations.
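The scale–compress–regress workflow described above can be sketched as follows, with a synthetic nonlinear "hardness" target and correlated descriptor columns standing in for experimental data (all names and coefficients are illustrative assumptions):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
f = rng.normal(size=(300, 2))           # two latent "physical" factors
noise = lambda: rng.normal(0, 0.3, 300)
# Six hypothetical descriptors: two correlated pairs driven by the latent
# factors, plus two pure-noise columns (mimicking redundant descriptors).
X = np.column_stack([f[:, 0] + noise(), f[:, 0] + noise(),
                     f[:, 1] + noise(), f[:, 1] + noise(),
                     rng.normal(size=300), rng.normal(size=300)])
y = 500 + 80 * f[:, 0] - 40 * f[:, 1] ** 2 + rng.normal(0, 10, 300)

# Scale -> compress with PCA (a guard against overfitting on small,
# high-dimensional datasets) -> nonlinear GBDT regressor, scored by CV R2.
model = make_pipeline(StandardScaler(), PCA(n_components=3),
                      GradientBoostingRegressor(random_state=0))
r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"mean cross-validated R2: {r2.mean():.2f}")
```

Cross-validating the entire pipeline, rather than fitting PCA on the full dataset first, keeps the dimensionality reduction from leaking test information into the reported score.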

4.1. Hardness Model Prediction

As an important mechanical property index, hardness is not merely a simple physical concept but rather a comprehensive indicator reflecting the combined mechanical characteristics of elasticity, plasticity, strength, and toughness of materials [89]. The complex multi-element structure of HEAs, along with solid-solution strengthening and chemical–physical interactions among constituent elements, contributes to their exceptionally high hardness. In HEA design, identifying and predicting alloys with high hardness has become a key research focus. This section introduces ML-based hardness prediction in HEAs from three perspectives: (1) the use of SHapley Additive exPlanations (SHAP) for model interpretation, (2) neural network–based prediction approaches, and (3) novel solid-solution hardening (SSH) modeling.

4.1.1. Add SHAP Interpretation

ML is commonly employed to construct implicit structure–property relationships based on compositional and descriptor data for performance prediction; however, these complex internal relationships are often difficult to interpret. Introducing interpretability into ML models can greatly enhance the understanding of HEAs.
SHAP is a game-theoretic framework for interpreting the output of any ML model. It leverages the classical Shapley values from cooperative game theory and their extensions to associate optimal credit allocation with local feature explanations. When applying SHAP to analyze ML models, one can visually assess not only the global importance of individual features and the overall influence of samples on target values, but also the local contributions of specific elemental features—thereby revealing how each element affects the hardness of HEAs.
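The credit-allocation idea behind SHAP can be demonstrated exactly on a small model; the linear "hardness surrogate" and its coefficients below are hypothetical, chosen so the exact Shapley values are easy to verify by hand:

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for one sample x: the weighted average marginal
    contribution of each feature over all coalitions, with 'absent'
    features held at a baseline. This is the same credit-allocation scheme
    that SHAP approximates efficiently for large models."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in combinations(others, r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                z_with, z_without = baseline.copy(), baseline.copy()
                z_with[list(S) + [i]] = x[list(S) + [i]]
                z_without[list(S)] = x[list(S)]
                phi[i] += w * (f(z_with) - f(z_without))
    return phi

# Hypothetical hardness surrogate: f = 3*Al + 2*Ti - Ni (contents in at.%).
f = lambda z: 3 * z[0] + 2 * z[1] - z[2]
phi = shapley_values(f, x=np.array([10.0, 5.0, 20.0]), baseline=np.zeros(3))
print(phi)  # → [30, 10, -20]: Al and Ti raise the output, Ni lowers it
```

For this linear model with a zero baseline the Shapley value of each feature reduces to coefficient times feature value, which is why positive and negative contributions read off directly; for nonlinear models the same coalition average disentangles interaction effects.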
Guo et al. proposed a composition-based strategy for predicting the hardness of HEAs. A RF model was constructed using alloy composition as input, achieving Pearson correlation coefficients of 0.956 and 0.954 on the training and test sets, respectively. The predicted hardness of Al1.2Cr17.42Fe25.42Ni28.32Ti27.62 HEA reached 869.88 HV, which is 21.15% higher than the maximum hardness in the original Al–Cr–Fe–Ni–Ti dataset. By introducing SHAP to enhance model interpretability, it was shown that Al, Ti, Mo, Cr, and V positively contribute to HEA hardness, whereas Ni, Cu, Co, Mn, and Hf tend to weaken it [90].
Yang et al. developed a SVM hardness prediction model using five descriptors—atomic weight mean deviation (ADAW), column mean deviation (ADC), specific volume mean deviation (ADSV), valence electron concentration (VEC), and mean melting point (Tm)—as inputs. The Pearson correlation coefficients for both the test set and leave-one-out cross-validation (LOOCV) reached 0.94. Several optimized compositions, identified through inverse projection and high-throughput screening, were experimentally synthesized, and the best-performing alloy exhibited a 24.8% improvement over the highest hardness in the original dataset. The SHAP framework was also employed to improve interpretability, and the analysis revealed that VEC plays a crucial role in hardness prediction: when VEC < 7.5, it positively affects hardness [91].
Zhang et al. established an ML model based on hardness data from the Al–Co–Cr–Cu–Fe–Ni HEA system. Multiple modeling features were selected using a three-step parallel screening method, and a stacking ensemble algorithm integrating RF, XGBoost, LightGBM, and CatBoost was employed. After 10-fold cross-validation, the model achieved a coefficient of determination (R2) of 0.93. The ensemble predictions of HEA hardness were stable, accurate, and experimentally validated. Moreover, the selected features and model were transferable to other HEA systems as well as low-hardness Cr–Fe–Ni medium-entropy alloys (MEAs). The model further explained the large prediction bias observed for MEAs in the high-hardness region, and qualitatively analyzed the effects of composition and phase formation on HEA hardness using interpretable tools such as SHAP values and partial dependence (PDP)/individual conditional expectation (ICE) plots [43].
Gao et al. proposed an alloy design strategy integrating machine learning with multi-objective optimization, applied to the rational design of lightweight high-entropy refractory alloys in the Al–Nb–Ti–V–Zr–Cr–Mo–Hf system. The study first established quantitative structure–property relationships between composition, structure, and performance via a machine learning model. Further employing SHAP feature importance analysis, it revealed that a chromium content exceeding 12 at.% serves as a critical descriptor for achieving high corrosion resistance, as shown in Figure 8. Building on this insight, the team performed systematic multi-objective screening of phase composition, density, melting point, hardness, and corrosion resistance, ultimately designing three lightweight high-entropy refractory alloys that exhibit a remarkable combination of superior hardness and excellent corrosion resistance.

4.1.2. Neural Network Prediction

The application of NNs in predicting the phase structures of HEAs, discussed above, has also proven highly effective for performance prediction. NNs process complex, nonlinear data efficiently, and their inherent parallelism allows them to handle large-scale datasets and demanding computational tasks. Because HEA performance prediction is a highly nonlinear fitting problem, NNs are particularly well suited to modeling such relationships, which often gives them an edge over other algorithms in predicting the hardness of HEAs.
Bundela et al. employed eight different ML algorithms to identify fundamental material descriptors for microhardness prediction. Using a stability selection algorithm, the optimal set of descriptors was obtained for the dataset. Principal component analysis (PCA) was further applied to reduce the dimensionality of the data, thereby improving prediction accuracy. The test R2 scores of XGBoost, RF, and labeled regression models all exceeded 0.89. Experimental validation confirmed the general applicability of these algorithms, with artificial neural networks (ANNs) showing superior performance when tested against new experimental data [93]. Dewangan et al. developed an ANN-based hardness prediction model using data from 36 HEAs. The model was trained, validated, and tested through simulation, achieving an overall regression coefficient (R) of 97.1%. A backpropagation ANN model was used to predict hardness values based on elemental composition and sintering temperature, reaching an accuracy of 95.9% [94]. Li et al. further extended this approach by employing a convolutional neural network (CNN) combined with the periodic table representation (PTR) of composition, processing, structure, and physical parameters for hardness prediction, hardness classification, and alloy system extrapolation. The CNN–PTR model demonstrated strong predictive capability, and when phase classification (five non–mutually exclusive phases) was incorporated, the accuracy of hardness prediction was further enhanced. A superposition strategy was finally introduced to improve both prediction accuracy and stability [95].

4.1.3. Modeling of Novel Solid-Solution Hardening (SSH)

The solid-solution hardening (SSH) effect of HEAs is one of the main contributors to their excellent mechanical properties [96]. According to the Gibbs phase rule, the number of possible equilibrium phases generally increases with the number of independent components; nevertheless, the typical structure of HEAs is a simple multicomponent solid solution, because their high mixing entropy (Smix) suppresses the formation of intermetallic compounds [97,98]. The SSH effect in HEAs is particularly strong due to the presence of severe lattice distortion [99,100].
Huang et al. applied ML to SSH modeling in HEAs and established an ML-based framework for designing solid-solution–hardened HEAs. The ML–SSH model outperformed traditional physical SSH models in hardness prediction by incorporating key factors from SSH theory along with atomic environmental parameters and interaction descriptors as input features. Feature engineering analysis confirmed that charge transfer, short-range order (SRO), and local compositional fluctuations have significant effects on SSH behavior in HEAs. The inclusion of charge transfer improved the performance and accuracy of both physical and ML-based SSH models. By integrating the ML–SSH model with an ML-based alloy design system, the authors achieved accurate single–solid-solution phase prediction, consistent with experimental observations for FeNiCuCo and CrMoNbTi alloy families. Notably, non-equiatomic counterparts exhibited hardness values 28.8% and 8.8% higher than those of the FeNiCuCo and CrMoNbTi groups, respectively [96].

4.2. Other Performance Predictions

The high mixing entropy of HEAs not only contributes to their high hardness, but also endows them with a range of other excellent properties. In practical material applications, multiple properties are often required simultaneously; therefore, the accurate prediction of these additional properties is particularly important. The superior mechanical performance of HEAs makes them promising candidates for advanced structural materials in future industrial applications.
This section discusses the prediction of key properties, including elastic properties, tensile behavior, yield strength, oxidation resistance, and corrosion resistance.

4.2.1. Elastic Performance Prediction

The modulus of elasticity is an important index that reflects a material's resistance to deformation, and it can effectively indicate its hardness, ductility, and other mechanical properties. The prediction of HEA elastic performance can be combined with first-principles calculations or molecular dynamics (MD) simulations, and optimal predictive models can be trained using the Sure Independence Screening and Sparsifying Operator (SISSO) method to achieve improved accuracy.
The modulus of elasticity can be expressed in terms of shear modulus, bulk modulus, and Young’s modulus, and it can also be predicted using ML. Kim et al. implemented ML modeling to predict the elastic modulus of HEAs. Their results demonstrated that an ML model trained on a large inorganic structural dataset can accurately predict the elastic properties of HEAs, and also revealed the dependence of the bulk and shear moduli on various material properties. These insights provide valuable guidance for tuning the elastic behavior of HEAs. ML models can effectively predict the bulk modulus and optimize HEA compositions to achieve higher bulk modulus values [101].
Kandavalli et al. trained twelve ML algorithms to classify elemental compositions as HEA or non-HEA. The Gradient Boosting Classifier (GBC) achieved the highest test accuracy of 78%. Additionally, six regression models were trained to predict the bulk modulus of HEAs, among which the LASSO regression model achieved the best performance, with an R2 value of 0.98 and an adjusted R2 of 0.97 on the test dataset. This approach accelerates material discovery by providing an alternative route for designing virtual alloy compositions with desirable bulk moduli tailored to specific applications, thereby opening new avenues for HEA design [102].
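The evaluation protocol used in such studies — fitting a sparse linear model and reporting both R2 and adjusted R2 on a held-out test set — can be sketched as follows. The data, descriptor count, and regularization strength below are synthetic illustrations, not those of Kandavalli et al.:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 8))                     # 8 hypothetical compositional descriptors
w = rng.random(8)
y = 100 * (X @ w) + rng.normal(0, 2, 200)    # synthetic "bulk modulus" target (GPa)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = Lasso(alpha=0.01).fit(X_tr, y_tr)

r2 = r2_score(y_te, model.predict(X_te))
n, p = X_te.shape
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # adjusted R^2 penalizes feature count
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

Reporting the adjusted R2 alongside R2 guards against inflated scores from simply adding more descriptors, which matters when descriptor sets for HEAs grow large.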
First-principles computation [103] is an effective method for screening promising structural HEAs with target characteristics, serving as a basis for building ML databases. Gao et al. constructed high-precision predictive models through a hybrid framework combining design strategies, first-principles calculations, and ML. The model was used to predict the elastic properties and Poisson’s ratio of non-equiatomic Mo-Nb-Ta-Ti-V HEAs, yielding results in excellent agreement with experimental data. It also identified the optimal compositional ranges for elastic properties and Poisson’s ratio. Feature importance analysis revealed that the titanium content contributed most significantly to these properties. This model enables the rapid and accurate generation of large datasets and helps establish quantitative relationships between elemental content and mechanical properties in refractory high-entropy alloys (RHEAs), providing theoretical guidance for experimental validation [104].
The relationship between microstructure and elastic modulus can also be derived from MD simulations via the ensemble-averaging method and compared with ML predictions to verify model accuracy [105]. Jiang et al. optimized the composition of AlFeCuSiMg HEAs based on elastic modulus as the target property using a combination of ML and MD simulations. Training and test datasets were generated through MD simulations, and comparisons of R2 and RMSE values among different ML models identified SVR and RF as the most suitable for predicting the elastic modulus of AlFeCuSiMg HEAs. The predicted results were consistent with MD outcomes. By combining descriptor-based stiffness matrices and elastic modulus analysis models with mean-field methods, the evaluation of key mechanical properties such as strength and ductility of HEAs was significantly accelerated [106].
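A minimal sketch of this kind of model comparison — training SVR and RF surrogates on the same split and ranking them by test-set R2 and RMSE — might look like the following. The data here are synthetic stand-ins for MD-generated samples, and the hyperparameters are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.random((300, 5))                     # 5 hypothetical composition fractions
y = 70 + 40 * X[:, 0] - 25 * X[:, 1] ** 2 + rng.normal(0, 1.5, 300)  # modulus (GPa)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
models = {
    "SVR": make_pipeline(StandardScaler(), SVR(C=100.0)),  # SVR needs scaled inputs
    "RF": RandomForestRegressor(n_estimators=300, random_state=1),
}
scores = {}
for name, m in models.items():
    pred = m.fit(X_tr, y_tr).predict(X_te)
    scores[name] = r2_score(y_te, pred)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    print(f"{name}: R^2 = {scores[name]:.3f}, RMSE = {rmse:.2f} GPa")
```

Whichever surrogate wins this comparison can then be queried across untested compositions far more cheaply than running additional MD simulations.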
In addition, Vazquez et al. employed the Sure Independence Screening and Sparsifying Operator (SISSO) framework to train models that yielded the most accurate elasticity analyses. The model uses physically meaningful atomic descriptors to predict target properties, with computationally inexpensive analytical features derived from DFT datasets of binary and ternary subsets of NbMoTaWV refractory alloys. The best Elastic-SISSO models extracted from exponentially large feature spaces produced highly accurate predictions, comparable to, or even superior to, other models. Several of these models were experimentally validated and revealed that electronegativity variance and elastic modulus can directly predict trends in ductility and yield strength of refractory HEAs, highlighting promising compositional regions for alloy design [107]. Zhang et al. employed the EMTO-CPA approach to generate a comprehensive HEA dataset covering a 14-element compositional space. The dataset includes 7086 cubic HEA structures characterized by their structural properties, with complete elastic tensors computed for a subset of 1911 structures, as shown in Figure 9. This dataset of elastic properties was subsequently used to train a machine learning model incorporating the Deep Sets architecture. The resulting Deep Sets model exhibited enhanced predictive performance and robustness, outperforming other machine learning benchmarks [108].

4.2.2. Tensile Performance Prediction

Exploring the combination of high strength and good ductility in advanced metallic materials has long been a central challenge in the development of structural metals. Recent studies on high-strength alloys have provided effective approaches to addressing this issue [109,110,111]. By adjusting the elemental distribution and composition of HEAs, it is possible to achieve higher yield strength without compromising their tensile strength and ductility [112].
The prediction of tensile properties provides valuable guidance for sample preparation during the experimental phase and accelerates the development of new high-performance alloys. Elgack et al. applied MD and ML algorithms to predict the tensile behavior of FeNiCrCoCu HEAs. A total of 918 polycrystalline thermoelectric field datasets were generated through MD simulations, from which representative samples were selected for isotropic property testing and evaluation. The results showed that the data generated by MD simulations were reasonably consistent with previously reported findings. All generated datasets were then used to train ANN, SVM, and Gaussian Process Regression (GPR) models. Among these, the proposed ANN model exhibited the highest prediction accuracy. The models were further evaluated on a new dataset with distinct predictor values that were not included during model construction, and the ANN model was found to be most sensitive to strain rate predictors. This framework provides useful guidance for experimental optimization in the search for HEAs with targeted tensile properties [113].
Zhang et al. combined MD and ML to investigate the mechanical properties of non-equiatomic FeCrNiCoMn HEAs. Using tensile tests of 300 HEA single-crystal samples, an MD simulation database was established to describe the relationship between alloy composition and mechanical properties. Three ML models—SVM, Kernel-based Extreme Learning Machine (KELM), and Deep Neural Network (DNN)—were developed and compared for yield stress prediction. The results showed that the DNN model outperformed the others in binary classification of yield stress. The composition-based design strategy was further used to guide the fabrication of polycrystalline FeCrNiCoMn samples, and the DNN predictions were experimentally validated, confirming its robustness. These findings indicate that combining computational modeling with ML provides valuable guidance for designing high-strength HEAs and accelerates new alloy development [105].
Tan et al. developed optimization models to predict the tensile properties—specifically, ultimate tensile strength (UTS) and yield strength (YS)—of SLM-fabricated HEAs, achieving mean absolute percentage errors (MAPE) of 20.43% and 20.25%, respectively. Several HEAs were subsequently fabricated via SLM, and the experimental results were found to be in good agreement with the model predictions [114].

4.2.3. Yield Strength Prediction

The yield strength is a key factor influencing the design of HEAs [115]. Due to their wide compositional space, obtaining the desired yield strength of HEAs through experimental methods is often difficult and time-consuming. By applying machine learning (ML) to optimize thermomechanical processing parameters, HEAs with suitable yield strength can be efficiently predicted, thereby reducing manufacturing costs and time—a strategy that is increasingly adopted in industry.
Steingrimsson et al. proposed a comprehensive and up-to-date bilinear logarithmic model for predicting the temperature-dependent yield strength (YS) of medium-entropy alloys (MEAs) and high-entropy alloys (HEAs). The model introduces a fracture temperature parameter, which serves as a useful guideline for designing MEAs and HEAs with improved high-temperature mechanical properties. It is grounded in fundamental physical principles, incorporated as prior information, and employs unconstrained global optimization techniques. Parallel optimization of the model parameters under low- and high-temperature conditions revealed that the fracture temperatures of different HEA systems were consistent in both YS and ultimate tensile strength. A high-level comparison of the YS of MEAs/HEAs with that of nickel-based superalloys demonstrated that the selected refractory HEAs (RHEAs) possess superior strength characteristics [116].
Veeresham et al. employed a linear regression–based ML model to predict the yield strength of nitrogen-doped (CoCrFeMnNi)100−xNx HEAs under specific thermomechanical processing conditions. The appropriate selection of models and material parameters using ML offers a promising approach for designing HEAs with tailored compositions and enhanced mechanical performance [117].
Ding et al. developed a novel prediction framework integrating multiple ML algorithms to estimate the yield strength of RHEAs at various temperatures. The framework achieved excellent predictive performance, with a coefficient of determination (R2) of 0.9605 and a root mean square error (RMSE) of 111.99 MPa on the test set. Furthermore, two new RHEAs exhibiting high yield strengths were synthesized and experimentally characterized, validating the reliability and feasibility of the proposed framework [118].
Bhandari et al. utilized a Random Forest (RF) regression model to predict the yield strength of HEAs at different temperatures. The RF model was applied to MoNbTaTiW and HfMoNbTaTiZr alloys at 800 °C and 1200 °C, and the predicted results were consistent with experimental observations. These findings demonstrate that the RF regression model can accurately predict the temperature-dependent yield strength of HEAs [119].
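The underlying idea — treating temperature as an explicit input feature of an RF regressor, so that a single model covers the whole temperature range — can be illustrated with a toy sketch. All compositions, coefficients, and temperatures below are hypothetical, not taken from Bhandari et al.:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 400
comp = rng.random((n, 4))                    # 4 hypothetical element fractions
T = rng.uniform(300, 1500, n)                # temperature (K)
# synthetic yield strength that softens with temperature
ys = 800 + 1200 * comp[:, 0] + 600 * comp[:, 1] - 0.5 * T + rng.normal(0, 20, n)
X = np.column_stack([comp, T])

rf = RandomForestRegressor(n_estimators=400, random_state=2).fit(X, ys)

# query one fixed composition at two temperatures
q = np.array([[0.3, 0.3, 0.2, 0.2, 800.0],
              [0.3, 0.3, 0.2, 0.2, 1473.0]])
p = rf.predict(q)
print(p)  # the model should predict a lower strength at the higher temperature
```

Because temperature enters as a feature, predictions at intermediate temperatures (e.g., between the 800 °C and 1200 °C points reported in the study) come from the same trained model rather than separate fits.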
Sohail et al. designed the Fe35Ni29Co21Al12Ta3 multi-principal-element alloy via domain-knowledge-driven machine learning; as shown in Figure 10, it achieves an unprecedented combination of high strength and ductility, with a yield strength of 1.8 GPa and a true uniform elongation of 25% [120].

4.2.4. Oxidation Resistance Prediction

Oxidation resistance is one of the key factors to consider when materials are used in high-temperature applications. The oxidation reaction between the alloy and atmospheric oxygen at elevated temperatures can significantly degrade the mechanical properties of the alloy. If oxidation is not effectively inhibited, the alloy may fail rapidly during service [121,122]. Oxidation-resistant alloying elements tend to form a protective oxide layer on the surface, which prevents oxidation of the main constituent elements in the deeper regions at high temperatures [123].
It is particularly important to use ML to predict the oxidation resistance of HEAs, as higher oxidation resistance can improve the service life of materials and reduce operational risks. Dong et al. employed an ML-integrated workflow to guide the design of five-element HEAs containing Cr and Al to achieve enhanced high-temperature oxidation resistance. The ML model utilized the chemical compositions of Fe, Cr, Al, Ni, and Cu as design parameters to optimize oxidation resistance. The oxidation behavior of AlxCrCuFeNi (x = 0, 0.25, 0.5, 1) HEAs in air at 1100 °C was systematically investigated, and the oxidation mechanism was elucidated [123].
For high-temperature oxidation-resistant RHEAs, superior oxidation resistance is often achieved by combining various refractory and oxidation-resistant elements [124,125,126]. Yan et al. proposed a strategic algorithm for designing single-phase RHEAs using ML methods based on a dataset of 1807 entries, and applied multiple ML algorithms to train the dataset. Blind testing demonstrated that the GB (gradient boosting) model can effectively distinguish between single-phase and non-single-phase solid-solution alloys, achieving an accuracy of 96.41%. According to the GB model, more than 100 equiatomic oxidation-resistant RHEAs were predicted from the compositional space of eight metallic elements. Ten predicted single-phase RHEAs were synthesized via mechanical alloying, and their XRD patterns confirmed single-phase body-centered cubic (BCC) solid-solution structures. The experimental results were in excellent agreement with the ML predictions, confirming that the model exhibits strong predictive capability for single-phase RHEAs and that single-phase oxidation-resistant RHEAs were successfully designed [127].
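The classification step of such a workflow — training a gradient-boosting model to separate single-phase from non-single-phase alloys and reporting blind-test accuracy — can be sketched as follows. The descriptors and labels here are synthetic, not the 1807-entry dataset used by Yan et al.:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.random((1000, 6))                        # hypothetical descriptors (e.g., delta, VEC)
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)  # synthetic "single-phase" labeling rule

# hold out a blind test set that the model never sees during training
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=3)
clf = GradientBoostingClassifier(random_state=3).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"blind-test accuracy: {acc:.3f}")
```

Once validated this way, the classifier can be swept over a candidate compositional grid to shortlist probable single-phase alloys for synthesis.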

4.2.5. Corrosion Performance Prediction

Corrosion can jeopardize the service life and safety of engineering materials, making it a major concern across various fields. However, since traditional materials often cannot meet the performance requirements of advancing technologies, many industries have an increasing demand for new materials with excellent corrosion resistance [128]. In general, HEAs exhibit good overall corrosion resistance but are still susceptible to localized corrosion in harsh environments [129,130]. ML has been employed to predict the corrosion behavior of HEAs, and the design and development of corrosion-resistant HEAs are increasingly being emphasized.
Composition is a key factor influencing the microstructure of HEAs, including elemental distribution, grains and grain boundaries, and precipitation behavior. The correlations between HEA composition, microstructure, and corrosion resistance derived from ML can help researchers design HEAs with superior corrosion resistance [129]. Ozdemir et al. employed ML to explore the entire compositional space of the HfNbTaTiZr system. K-fold cross-validation and guided modeling methods were applied to quantify model uncertainty and identify the most robust model. Potentiodynamic polarization experiments were then conducted on the ML-predicted compositions in simulated body fluids (SBF) at 37 ± 1 °C, demonstrating that the newly developed alloy exhibited excellent corrosion resistance, superior to that of alloys with uniform dendritic microstructures. This approach enables the efficient development of new biomedical alloys with enhanced corrosion resistance [131].
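The k-fold cross-validation step used to quantify model uncertainty before committing to one surrogate can be sketched in a few lines. The data are synthetic, and the choice of RF surrogate and R2 scoring is an assumption for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(4)
X = rng.random((250, 5))                                   # hypothetical descriptors
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 1.5]) + rng.normal(0, 0.1, 250)

cv = KFold(n_splits=5, shuffle=True, random_state=4)
scores = cross_val_score(RandomForestRegressor(random_state=4), X, y,
                         cv=cv, scoring="r2")
# the spread across folds is a simple proxy for model uncertainty
print(f"mean R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```

A model whose fold-to-fold spread is small is more trustworthy when extrapolating to unexplored compositions, which is the situation faced in the HfNbTaTiZr screening described above.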
Electrochemical measurements are widely used as rapid and effective techniques to study the corrosion behavior of metals and alloys [132]. Among them, electrochemical impedance spectroscopy (EIS) can characterize corrosion processes while significantly minimizing system damage, making it a powerful method in corrosion research [133]. Wei et al. applied several ML models to predict the impedance spectra obtained from EIS data. It was found that [SO42−] and pH were positively correlated with –ZIm, while [Cl−] was negatively correlated with –ZIm. Moreover, pH was identified as a key parameter affecting the corrosion behavior of HEAs. The GBDT algorithm exhibited the best performance, with corresponding R2, MAE, and RMSE values of 0.9491, 2930.24, and 13,138.96, respectively [134].

4.2.6. Prediction of Parameters Related to Mechanical Properties

ML can be applied to many aspects of HEA prediction research. Among these, phase prediction and mechanical property prediction are key components in HEA design, while studies on microstructure evolution also play an important supporting role. Research on thermal deformation behavior can significantly reduce the need for extensive experimental work in determining the mechanical response of HEAs, and ML methods can provide substantial solutions in this regard. Predicting the thermal deformation behavior and adsorption energy of HEAs also holds great potential for HEA design.
Jain et al. combined finite element modeling (FEM) with electron backscatter diffraction (EBSD) analysis to correlate the inhomogeneity of deformed samples with their strain field distribution. EBSD characterization revealed the presence of deformation bands and annealed twins at low and high temperatures, respectively. An ANN model was proposed to predict the thermal deformation behavior of CoCrFeNiV HEA, achieving a correlation coefficient (R) of 0.9983 and an average absolute relative error (AARE) of 2.71% [135].
Dewangan et al. employed traditional constitutive models, including the modified Johnson–Cook (JC), modified Zerilli–Armstrong (ZA), and Arrhenius-type equations, along with ML-based approaches to predict flow stress under various thermal conditions. The performance of traditional and ML models was evaluated using R2, MAE, and RMSE metrics. The gradient boosting ML model demonstrated the highest predictive accuracy (R2 = 0.994, MAE = 7.77%, RMSE = 9.7%) [136].
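The three metrics used in such comparisons are straightforward to compute with standard tooling; the sketch below applies them to two hypothetical flow-stress predictors (all numbers are invented for illustration, and the labels "constitutive" and "ML" are placeholders):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([120.0, 150.0, 180.0, 210.0, 240.0])  # measured flow stress (MPa)
pred_a = np.array([118.0, 153.0, 176.0, 214.0, 238.0])  # e.g., a constitutive model
pred_b = np.array([121.0, 149.5, 180.5, 209.0, 241.0])  # e.g., an ML model

for name, p in {"constitutive": pred_a, "ML": pred_b}.items():
    rmse = mean_squared_error(y_true, p) ** 0.5
    print(f"{name}: R^2 = {r2_score(y_true, p):.4f}, "
          f"MAE = {mean_absolute_error(y_true, p):.2f} MPa, "
          f"RMSE = {rmse:.2f} MPa")
```

Evaluating both model families on identical held-out data with identical metrics, as above, is what makes the reported superiority of the gradient-boosting model a fair comparison rather than an artifact of different test conditions.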
In another study, Dewangan et al. developed ANN models trained with experimental creep displacement data using the Levenberg–Marquardt algorithm, and the predicted results were found to be in excellent agreement with the experimental data [137].
Jain et al. further applied ML techniques—including RF, KNN, XGBoost, DT, and SVR—to predict the flow stress behavior of a dual FCC-phase CoCrCu1.2FeNi HEA under new temperature and strain rate conditions, thereby reducing experimental dependence. The RF and KNN models achieved the best predictive performance, with R2 values of 0.987 and 0.986, respectively. New flow stress–strain curves were generated at 1123 K, and both RF and KNN models showed strong predictive capability (R2 = 0.97 and 0.958, respectively) [138].
In a subsequent study, Jain et al. employed five ML models to predict the flow stress–strain response, where the RF model demonstrated excellent performance, particularly at a strain rate of 0.1 s−1, with R2 = 0.97, RMSE = 10.1%, and MAE = 8.9%. Experimental validation confirmed that the alloy could be safely deformed within the temperature range of 1173–1273 K and the strain rate range of 10−0.8 to 10−2 s−1 [139]. Table 1 summarizes representative and recent ML-based prediction studies covering various aspects of HEA research.
Overall, the application of ML in predicting the thermal deformation and flow behavior of HEAs demonstrates remarkable efficiency and accuracy compared to traditional constitutive models. By integrating experimental data with computational approaches, ML enables precise modeling of complex deformation mechanisms and reduces the reliance on extensive experimental testing. These advances not only accelerate the design and optimization of HEAs with superior mechanical performance but also provide valuable insights into the underlying deformation mechanisms, laying a foundation for data-driven alloy development.

5. Design, Exploration and Optimization of New High-Entropy Alloys

Despite the promise of HEAs, it is not feasible to explore their vast compositional space solely through exhaustive experimental methods. The speed, scalability, and predictive accuracy of ML make it a powerful tool to accelerate the discovery and design of new materials. ML can help optimize experimental design by simulating the screening of a large number of alloys with different compositional combinations through high-throughput calculations, thereby reducing the number of required experiments. By integrating ML with high-throughput screening techniques, researchers can explore the high-dimensional compositional space more efficiently and rapidly identify HEAs with outstanding properties.
The development of ML techniques provides new perspectives for more efficient and effective exploration of HEA systems. ML can also provide guidance for the rational design of HEAs, enabling the optimization of alloy performance through efficient compositional design strategies. The design of HEAs using ML should focus on high interpretability, spatial exploration, composition optimization, and model refinement to identify superior HEAs.
This chapter presents an attempt to address these aspects in three ways:
  • Exploring explainable relationships between HEA properties, elemental compositions, and alloy characteristics;
  • Searching for high-performance HEAs across a broad compositional space;
  • Employing ML algorithms to optimize both alloy composition and model parameters to identify superior HEAs.

5.1. Exploration of Structure–Property Relationships

Due to the vastness of the compositional space, the phase formation in HEAs is highly diverse. Most studies on phase formation have focused on the characteristics of solid-solution phases. Although HEAs exhibit outstanding and unique properties, their structure–property relationships have not yet been well established. Establishing a clear link between composition, phase formation, and performance is crucial for accelerating the development of new HEAs. This section discusses the exploration of the relationships between material characteristic parameters and their corresponding properties.

5.1.1. Feature-Correlated Structure–Property Relationships

With the increasing application of ML in HEA prediction, further performance-oriented research needs to establish effective structure–property relationships. Features (descriptors) play a crucial role in the data-driven design of HEAs, and building reliable correlations between these features and the desired performance can simplify the development of high-performance HEAs. Utilizing ML—either independently or in combination with traditional modeling methods and algorithms—to identify key influencing factors and to explore the relationships between features, structures, and performance can significantly accelerate HEA research [142]. Figure 11 illustrates feature selection based on the feature combinations produced by different machine learning algorithms.
Establishing the relationship between target attributes and feature descriptors, and identifying the main influencing features, can greatly accelerate the design of HEAs with desired performance. Zhao et al. proposed a closed-loop inverse design framework guided by symbolic regression optimization for the efficient development of materials with target properties, validating its effectiveness using RHEAs as a model system. Through symbolic regression analysis, a concise mathematical relationship was established between a fundamental physical descriptor (enthalpy of fusion) and the target property (yield strength at 1000 °C). Based on this relationship, a novel VTiMoNbZr alloy system was successfully designed. By integrating heuristic algorithms with an uncertainty-aware utility function for compositional optimization, only four experimental iterations were required to produce 21 new alloys. Among these, 12 alloys exhibited significantly enhanced specific yield strength, and two optimal alloys achieved specific yield strengths exceeding 110 MPa/(g/cm3). The performance improvement was attributed to the synergistic effects of increased density and enhanced lattice distortion. This study demonstrates the strong potential of symbolic regression–guided optimization strategies for achieving precise and efficient design in complex material systems [143].
Identifying the key factors influencing phase structure formation is critical for the design of high-performance HEAs. Dimensionality reduction techniques can help select and reduce redundant features, improving the efficiency of structure–property relationship exploration. Zhang et al. proposed a feature selection and eigenvariable transformation method based on kernel principal component analysis (KPCA) to optimize nine parameters, including the enthalpy of formation and mixing entropy determined by the extended Miedema theory. Phase distinction was performed using an SVM model, which revealed that elastic energy and atomic size difference had significant effects on phase formation. The prediction accuracy of the SVM test set based on four characteristic variables and KPCA (4V-KPCA) was 0.9743, while the F1 scores for the detailed prediction of solid solutions, amorphous states, mixtures, and intermetallic phases were 0.9787, 0.9463, 0.9863, and 0.8103, respectively. The extended Miedema theory provides accurate thermodynamic properties for HEA design, demonstrating that ML methods—specifically SVM combined with KPCA—are highly effective for alloy phase prediction [144].
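A schematic version of such a KPCA-plus-SVM pipeline — nonlinear dimensionality reduction followed by an SVM phase classifier — is shown below on synthetic two-class data. The descriptor values, class structure, and component count are invented for illustration:

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(5)
n = 300
# two synthetic "phase" classes in a 9-dimensional descriptor space
X = np.vstack([rng.normal(0.0, 1.0, (n, 9)),
               rng.normal(1.5, 1.0, (n, 9))])
y = np.array([0] * n + [1] * n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=5)
# scale -> condense 9 descriptors into 4 kernel principal components -> classify
clf = make_pipeline(StandardScaler(),
                    KernelPCA(n_components=4, kernel="rbf"),
                    SVC())
acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
print(f"test accuracy: {acc:.3f}")
```

Condensing the descriptor set before classification, as in the 4V-KPCA scheme, reduces redundancy among correlated thermodynamic parameters while keeping the nonlinear structure the SVM needs.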
Jaiswal et al. trained and tested their dataset using 664 labeled samples, including 267 BCC alloys, 199 FCC alloys, and 198 FCC + BCC alloys, to minimize prediction bias. They identified strong correlations among empirical design parameters. As the alloy system transitioned from the medium-entropy domain to the high-entropy domain, the correlation coefficient values varied, and the parameters VEC and Tm were found to be highly influential. The transition from a BCC + FCC phase to an FCC phase with increasing Ni content in the CoCuFeNix system was experimentally validated [145].
Based on the elemental concentrations of the alloys, Syarif et al. developed simple primitive prediction rules using pruning tree models and linear correlations, in conjunction with self-organizing maps (SOMs), constructing Euclidean spaces to formulate phase formation as an optimization problem. Genetic algorithm optimization revealed that phase formation was primarily influenced by the electron affinity, molar volume, and resistivity of the constituent elements. One of the primitive prediction rules achieved an accuracy of 87% in predicting the FCC phase formation of the AlCoCrFeNiTiCu HEA family, based solely on the concentrations of Al and Cu [146].
He et al. utilized five ML algorithms to predict HEA phases, including solid solution (SS) and amorphous (AM) states. The random forest (RF) model effectively distinguished BCC, FCC, mixed (BCC + FCC), and AM phases with an accuracy of 0.87. The CoCrFeNiAlx (x = 0, 0.5, 1) alloys were experimentally characterized using XRD and SEM–EDS. The experimental results confirmed that the phase structure of the CoCrFeNiAlx alloy evolved from FCC to BCC + FCC, and then to BCC with increasing Al content, consistent with the ML predictions [140].
The slow diffusion property is one of the fundamental reasons behind the excellent functional and structural performance of HEAs, governed by the rough potential energy landscape (PEL) that results from intrinsic chemical disorder. The highly complex, multidimensional nature of the PEL makes it challenging to describe how it controls diffusion in HEAs. Xu et al. developed ML models to accurately represent the dependence of the PEL on the local atomic environment in HEAs. By combining the ML model with kinetic Monte Carlo (KMC) simulations, they found that self-diffusion in HEAs is mainly controlled by PEL roughness, characterized by element-specific potential energies and migration barriers. Comparison with simplified diffusion models showed that models based on species-average migration barriers can serve as effective alternatives for rapid evaluation of diffusion properties. Although correlation effects may be underestimated, theoretical analysis revealed that differences in the atomic concentration of fast-diffusing elements and in the average migration barriers among species are the main factors influencing slow diffusion in HEAs [147].
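The species-average simplification can be made concrete with a toy residence-time KMC on a 1D lattice, where each hop rate follows an Arrhenius form with a species-averaged barrier. All barriers, the temperature, and the lattice itself are hypothetical, and this sketch omits the ML-predicted environment-dependent barriers of the full model:

```python
import math
import random

random.seed(0)
kT = 0.0862  # eV, roughly 1000 K
barriers = {"A": 0.70, "B": 0.90, "C": 1.10}     # hypothetical species-average barriers (eV)
lattice = [random.choice("ABC") for _ in range(50)]
vac = 25                                          # vacancy site index
t = 0.0
for _ in range(10000):
    # neighbours that can hop into the vacancy, with Arrhenius rates
    nbrs = [i for i in (vac - 1, vac + 1) if 0 <= i < len(lattice)]
    rates = [math.exp(-barriers[lattice[i]] / kT) for i in nbrs]
    total = sum(rates)
    # choose a hop proportionally to its rate
    r = random.random() * total
    j = nbrs[0] if r < rates[0] else nbrs[-1]
    lattice[vac], lattice[j] = lattice[j], lattice[vac]
    vac = j
    # advance the clock by an exponentially distributed residence time
    t += -math.log(random.random()) / total
print(f"simulated time: {t:.3e} (arbitrary units)")
```

In the full approach the per-hop barrier would come from an ML model of the local atomic environment rather than a per-species constant; the point of the species-average variant is that this lookup-table version is orders of magnitude cheaper while still capturing the dominant trend.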
Huang et al. proposed an ML model based on an innovative artificial neural network (ANN) capable of predicting vacancy migration barriers for arbitrary local atomic configurations in the FeNiCrCoCu HEA system, using only training data from equiatomic compositions. The model accurately predicted migration barriers in non-equiatomic, quaternary, ternary, and binary subsystems as well. The ANN model was implemented as a dynamic barrier calculator for KMC simulations, achieving diffusivity nearly identical to molecular dynamics (MD) simulations but with much higher computational efficiency. Using a high-throughput ANN–KMC approach, diffusion behavior in 1500 non-equiatomic HEA compositions was analyzed. The study found that while slow diffusion was not apparent in equiatomic HEAs, it was evident in many non-equiatomic compositions. The composition of the fastest diffuser (Cu), the complexity of the PEL, and the percolation effects were analyzed, providing valuable insights for experimental HEA design [148].
HEAs also have great potential in the field of catalysis. ML studies of the oxygen reduction reaction (ORR) catalytic activity on HEA surfaces are important for identifying efficient HEA catalysts and revealing the origin of their ORR activity. Wan et al. proposed a machine learning model based on the Gradient Boosted Regression (GBR) algorithm, which demonstrated high precision, versatility, and simplicity. Their analysis showed that the adsorption energy is a combination of the respective contributions from the coordinated metal atoms near the reaction site. An effective strategy was proposed to further enhance the ORR catalytic activity of promising HEA catalysts by optimizing their surface structures. Among these, the high-efficiency HEA catalyst Ir48Pt74Ru30Rh30Ag74 was recommended. This work provides valuable guidance for the rational design and nanostructured synthesis of HEA catalysts [149].

5.1.2. Elemental Component Relationships

The relationship between the structural characteristics and properties of HEAs is a key direction in HEA research and design. However, the influence of elemental composition on HEA performance is equally crucial. The amount of each element determines the resulting HEA properties, as different components exhibit distinct characteristics. To design HEAs with desired performances, it is essential to understand the impact of each elemental component. Predictions obtained through ML simulations can greatly reduce experimental workload and accelerate discovery.
Elemental composition affects microstructure formation, which in turn alters the material properties. Establishing the relationship between composition and performance can effectively improve HEA design. Liu et al. proposed a method combining ML and thermodynamic calculations to rapidly identify the eutectic composition of the NiCoCrAl system, successfully discovering two FCC + BCC eutectic high-entropy alloys (EHEAs). The layered eutectic structures and mechanical properties of Ni68−wCo16Cr16Alw HEAs were further investigated. Results showed that increasing the aluminum content promoted the formation of the body-centered cubic (BCC) phase. As Al content increased, the microhardness of Ni68−wCo16Cr16Alw HEAs decreased, while fracture ductility improved. Among them, Ni49Co16Cr16Al19 EHEA exhibited excellent compressive strength (3006.9 MPa) and good fracture ductility (45.5%), with a microhardness of approximately 326.3 HV [150].
Zhao et al. established the relationship between HEA composition and microhardness. Composition–microhardness data pairs from various alloy systems were collected and expanded using generative adversarial networks (GANs), which were then converted into empirical parameter–microhardness pairs. Active learning was applied to screen the AlCoCrCuFeNi system, and Extreme Gradient Boosting (XGBoost) was identified as the best-performing ML model. After millions of training iterations with the XGBoost sub-model and optimization via the Expected Improvement (EI) algorithm, four aluminum-rich compositions were identified that exhibited ultra-high microhardness (>740 HV, maximum ~780.3 HV) and low density (<5.9 g/cm3) in the as-cast bulk state, outperforming dilute B2 AlCo intermetallics. The strengthening effect was attributed to the precipitation of disordered BCC nanoparticles within ordered AlCo-rich B2 matrices [151].
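The Expected Improvement acquisition at the core of such active-learning loops can be sketched as follows. A Gaussian-process surrogate is used here purely to obtain predictive uncertainty on synthetic composition–microhardness data; the paper's surrogate was XGBoost, and all compositions and hardness values below are invented:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(6)
X = rng.random((30, 2))                        # compositions already tested (2 fractions)
y = 500 + 250 * X[:, 0] - 150 * (X[:, 1] - 0.4) ** 2  # synthetic microhardness (HV)

gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
cand = rng.random((1000, 2))                   # untested candidate compositions
mu, sigma = gp.predict(cand, return_std=True)

best = y.max()
z = (mu - best) / np.maximum(sigma, 1e-12)
ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement over best
pick = cand[np.argmax(ei)]
print("next composition to test:", pick)
```

EI balances exploitation (high predicted hardness) against exploration (high predictive uncertainty), which is why such loops can converge on aluminum-rich optima in a handful of experimental iterations instead of an exhaustive grid.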
The chemical disorder and stacking fault energy (SFE) of HEAs are strongly influenced by the complex local atomic environment, rendering traditional SFE calculation methods inadequate. Liu et al. proposed a new computational strategy for local stacking fault energy (LSFE). Using statistical methods, the quantitative probability distributions of SFE in FCC and BCC HEAs were obtained, and amplification modeling of dislocation motion in HEAs was further analyzed. The intrinsic correlation between LSFE and local compositional inhomogeneity was established using ML. Feature classification revealed that compositional inhomogeneity is the primary factor affecting LSFE, providing critical guidance for HEA composition optimization. This strategy not only accounts for local chemical fluctuations in HEAs but also enables high-throughput SFE calculations [152].
Zhang et al. developed an evolutionary algorithm-based strategy to generate new numerical descriptors for elements in high-entropy alloys (HEAs), moving beyond the conventional reliance on elemental physical/chemical properties for ML-assisted alloy design. These new descriptors dramatically increased phase classification accuracy from 77% to ~97% for FCC, BCC, and dual-phase structures compared with traditional empirical features, as shown in Figure 12. Experimentally, a model using these descriptors correctly predicted phases in 8 out of 9 random alloys, doubling the success rate of the same model using traditional features (4/9). When integrated via a simple logistic regression model, these descriptors enhanced the performance of various classifiers by at least 15% [153].

5.1.3. Explanatory Formulas/Parameters

Science, technology, and engineering are in great need of mathematical formulas with high physical interpretability, accurate predictability, and strong generalization capability [140]. ML has achieved remarkable success in the field of materials science; however, most ML algorithm models behave as “black box” systems [154,155]. To address this, interpretable formulas, expressions, or representative parameters are being developed to enhance model transparency and provide guidance for both forward and inverse design of high-performance HEAs.
Wei et al. guided the selection of pre-factors, time exponents, and activation energies based on domain knowledge and the chemical compositions of eight elements in FeCrAlCoNi-based HEAs. The Tree-Classifier for Linear Regression (TCLR) algorithm was employed with two experimental variables, exposure time and temperature. The spectra of activation energies and time exponents were extracted from complex, high-dimensional data, and the eigenspace automatically provided the spectra of pre-factors. Elemental features were then used to combine these three spectra into a formula that is both universal and physically interpretable. The resulting model achieved a high coefficient of determination (R2 = 0.971), and the role of each chemical element in high-temperature oxidation behavior was illustrated through the three spectra. This interpretable formula provides valuable guidance for the inverse design of HEAs with resistance to high-temperature oxidation [156].
Huang et al. developed material descriptors using mathematical formulations and analyzed the effectiveness of these low-cost descriptors in incorporating materials knowledge into ML modeling. Key features were screened through feature engineering and used to derive explicit formulas that facilitate the prediction of BCC HEA plasticity. Furthermore, the interpretability of the established ML model and the influence of key features were elucidated [157].
Phase transition is one of the most fundamental phenomena in nature, and all material designs are inherently related to it; HEAs are no exception. All phase transitions can be characterized by appropriate order parameters, including order–disorder transitions. However, identifying representative order parameters for HEAs is challenging. Based on neural networks, VAEs [158,159] can map high-dimensional data into low-dimensional latent spaces. Yin et al. introduced the concept of “VAE-order parameters” by leveraging the ability of VAEs to reduce high-dimensional data into a few principal components. They proposed that the Manhattan distance in the latent space of a VAE can serve as a general order parameter for order–disorder transitions. The physical properties of these order parameters were quantitatively explained and validated using multiple RHEAs. Building on this, a universally applicable alloy design concept was proposed by mimicking the natural mixing of elements. These physically interpretable “VAE-order parameters” provide a foundation for understanding chemical ordering and alloy design [160].
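The proposed order parameter is simply an L1 (Manhattan) distance in the learned latent space. The sketch below assumes a trained encoder is already available and uses invented 3-D latent codes purely for illustration:

```python
def manhattan(z1, z2):
    """L1 (Manhattan) distance between two latent vectors."""
    return sum(abs(a - b) for a, b in zip(z1, z2))

# Hypothetical 3-D latent codes from a trained VAE encoder (values are illustrative).
z_disordered = [0.0, 0.0, 0.0]          # reference: fully disordered configuration
z_partial    = [0.4, -0.2, 0.1]
z_ordered    = [1.3, -0.9, 0.8]

# Distance to the disordered reference grows with the degree of chemical order,
# so it behaves like an order parameter: approximately [0.0, 0.7, 3.0] here.
order_params = [manhattan(z, z_disordered) for z in (z_disordered, z_partial, z_ordered)]
```

The appeal of this construction is that the VAE compresses whatever structural signal distinguishes ordered from disordered configurations, so a single scalar distance can track the transition without hand-picking a physical descriptor.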

5.2. Explore the HEA Space

Although the superior performance of HEAs has attracted significant research interest, the vast compositional space makes exhaustive experimental exploration impractical. Methods such as thermodynamic modeling and atomic-scale simulations are useful for investigating the fundamental mechanisms of various HEA properties [6,7,8,9,10,68,161]; however, they are limited by spatial and temporal constraints as well as computational cost. To overcome these limitations, researchers have turned to ML techniques to make HEA exploration more efficient. Based on ML models, alloy combinations with specific properties can be rapidly screened, thereby reducing the number and cost of experimental trials. Moreover, active learning strategies can accelerate the design of new high-entropy alloys in data-sparse compositional spaces [162]. An analysis of the spatial complexity in the HEA microstructure, presented in Figure 13, reveals a high degree of compositional fluctuation.
Kaufmann et al. recently coupled thermodynamic and chemical signatures with RF models to propose a novel high-throughput method, termed “ML-HEA,” for predicting solid-solution formation capacity. The ML-HEA method was validated against reliable experimental data from binary, ternary, quaternary, and quinary systems. Comparisons with other modeling approaches, including CALPHAD and LTVC models, were made to evaluate the performance of the ML framework on both labeled and unlabeled data. The outputs from each prediction tree were further analyzed to assess the uncertainty associated with the final phase prediction of each composition. The developed model can be directly applied to explore physical compositional spaces in an unconstrained manner and can be easily updated to reflect new experimental results [163].
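The per-composition uncertainty estimate follows directly from the ensemble's individual tree outputs. A minimal sketch (the tree predictions are hypothetical) aggregates them into a phase label plus a vote-fraction confidence:

```python
from collections import Counter

def forest_vote(tree_predictions):
    """Aggregate per-tree phase labels; the vote fraction doubles as a confidence score."""
    votes = Counter(tree_predictions)
    label, count = votes.most_common(1)[0]
    return label, count / len(tree_predictions)

# Hypothetical outputs of ten decision trees for one candidate composition
# (SS = solid solution, IM = intermetallic).
trees = ["SS", "SS", "SS", "IM", "SS", "SS", "SS", "IM", "SS", "SS"]
phase, confidence = forest_vote(trees)   # ("SS", 0.8): 8 of 10 trees agree
```

Compositions where the vote is split close to 50/50 are exactly the ones where the final phase prediction should be treated with caution or flagged for experimental verification.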

5.2.1. DNN Global Search

In complex real-world applications [164] and across materials science [165,166], DNNs have proven highly effective: with sufficiently high-quality data they can accurately predict phase compositions and mechanical properties. DNNs can represent complex, high-dimensional, and nonlinear constitutive and phase spaces, and thereby help elucidate feature importance in HEA design to accelerate materials discovery.
Wang et al. developed an ML framework that combines thermodynamic modeling with DNN and CNN architectures to search the vast compositional space of HEAs. The model outperformed several baselines in predicting mechanical properties, and its architectural design enabled learning of element-level characteristics in HEAs. Thermodynamic descriptors were used as inputs to improve predictive accuracy, and a conditional random search—effective at locating local optima—was adopted as the inverse design predictor. Using this framework, two HEAs with optimal strength–ductility combinations were designed, demonstrating the effectiveness of both the model and the alloy-design approach [167].
Including phase fractions as features can improve the efficiency of ML models in mapping the potential-energy landscape of competing phases in HEAs [168,169,170,171,172]. Using a dataset generated via Thermo-Calc high-throughput thermodynamic calculations, Vazquez et al. developed a DNN for the Mn–Ni–Fe–Al–Cr–Nb–Co system. Feature-importance analysis reproduced and extended reported trends relating elemental properties to HEA phase formation. When the final regressor achieved R2 > 0.96, the dominant phase could be reliably identified. By solving inverse optimization tasks to emulate alloy design, the DNN “simulator” enabled rapid, real-time materials discovery and design workflows [173].
Recently, Lee et al. proposed a deeply explainable material attribution analysis scheme (DISMAA) to systematically reveal composition–structure–property (CSP) relationships in HEAs and to generate novel compositions. A deep generative model trained with limited data was used to construct a continuous latent space for AlCoCrFeMnNi HEAs, and an additional deep model trained with larger datasets predicted phases and properties. Attribution and sensitivity analyses quantified each element’s contributions to different properties and phases. Visualizing composition, phase, mechanical properties, and elemental attributions in the latent space offered an intuitive route to analyze CSP relationships, and the approach was validated. The work demonstrates interpretable, rational DNN-based inverse design capable of customizing multiple properties and processing parameters across materials systems, providing a comprehensive framework for HEA design [174].

5.2.2. Active Learning Loop Iteration

Recently, active learning has emerged as a powerful approach for exploring the HEA compositional space and accelerating materials discovery. It guides experiments in data-scarce regimes by leveraging model uncertainty, enabling selective experiments that efficiently generate informative data to refine the model. Within ML [175,176,177,178], surrogate models in active-learning loops iteratively identify the most informative candidates, query them, update the model, and thereby improve predictive performance. In doing so, active learning reduces alloy-design costs and integrates with—and guides—experimental workflows.
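A minimal active-learning loop of this kind can be sketched in a few lines. The example below is a toy illustration, not any of the cited workflows: a bootstrap "committee" of nearest-neighbour predictors supplies the uncertainty, the most uncertain candidate is selected, and the toy `true_property` function stands in for running the experiment:

```python
import random
import statistics

random.seed(1)

def true_property(x):
    """Toy stand-in for an expensive experiment or simulation."""
    return 4 * x * (1 - x)

def committee_predict(x, data, k=3):
    """Crude query-by-committee: nearest-neighbour predictors on bootstrap resamples."""
    preds = []
    for _ in range(5):
        sample = [random.choice(data) for _ in data]          # bootstrap resample
        nearest = sorted(sample, key=lambda p: abs(p[0] - x))[:k]
        preds.append(sum(p[1] for p in nearest) / len(nearest))
    return statistics.mean(preds), statistics.pstdev(preds)   # (prediction, disagreement)

labelled = [(x, true_property(x)) for x in (0.1, 0.9)]        # initial sparse data
pool = [i / 20 for i in range(21)]                            # unlabelled candidates
for _ in range(5):                                            # active-learning iterations
    x_star = max(pool, key=lambda x: committee_predict(x, labelled)[1])  # most uncertain
    labelled.append((x_star, true_property(x_star)))          # "run the experiment"
```

Each iteration spends the experimental budget where the committee disagrees most, which is what makes the approach efficient in data-sparse compositional regions.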
Active learning often requires multiple iterations to improve performance in HEA research. Li et al. proposed an active-learning loop constrained by domain knowledge, narrowing the unexplored space using VEC criteria to design HEAs with optimized strength and ductility. After six iterations, they synthesized an alloy with a UTS of 1258 MPa and an elongation to failure of 17.3%. The phase constitution and eutectic microstructure were characterized, and the potential origins of strength–ductility co-optimization were discussed in terms of strain hardening and crack initiation. This framework, which couples domain knowledge with ML, facilitates targeted materials design by harmonizing competing attributes [179].
Rao et al. proposed an active-learning strategy (see Figure 14) to accelerate the design of HEAs in virtually infinite compositional spaces. By combining ML with DFT, thermodynamic calculations, and experiments, they identified two HEAs. At 300 K, the coefficient of thermal expansion was ~2 × 10−6 K−1, illustrating a rapid, automated route to discover HEAs with favorable thermal, magnetic, and electrical properties [44].
Using a domain-knowledge-based ML approach, Sohail et al. designed a multi-principal-element alloy with the composition Fe35Ni29Co21Al12Ta3. After processing, the alloy achieved a yield strength of 1.8 GPa and a true uniform elongation of 25%. The exceptional performance stems from deliberately engineered microstructural heterogeneity: coherent L12 nanoprecipitates together with an unusually high volume fraction of incoherent, multicomponent B2 particles. Owing to its low chemical-ordering energy, the B2 phase acts as a deformable phase that accommodates dislocations, maintaining a high strain-hardening rate and extending uniform deformation [120].
However, many active-learning methods that guide experimental discovery still rely on simple surrogate models and Bayesian optimization, which are often constrained to low-dimensional design spaces and therefore tend to deliver substantial performance gains only after multiple iterations [44].

5.2.3. Exploration of Eutectic High-Entropy Alloys

In the early stages of HEA research, HEAs attracted considerable attention mainly due to the unexpected discovery of single-phase solid-solution (SS) alloys. Two decades after the initial discovery, the field has advanced substantially: research now extends beyond homogeneous solid-solution, intermetallic (IM), and amorphous (AM) phases to the deliberate exploration of eutectic architectures. Eutectic HEAs have demonstrated excellent performance and promising application potential. Guided by empirical design rules, various eutectic high-entropy alloys (EHEAs) have been proposed [83,180].
Current design strategies can indicate eutectic (co-crystallized) formation in HEAs, but they often cannot quantitatively account for multiple variables when selecting a specific system. Wu et al. revealed eutectic formation in multi-principal-element systems via ML-based data mining. Based on large-scale data analytics, EHEA design can be summarized as: (i) identifying key elements; (ii) selecting elements strongly associated with the key elements and miscible with them; (iii) confirming viable combinations of key and closely related elements; and (iv) determining the proportions of the miscible elements (see Figure 15). Within the designed EHEA space, attributes can then be evaluated to optimize application-oriented performance [181]. Zeng et al. trained an XGBoost classifier on a comprehensive dataset combining CALPHAD calculations and bibliographic data, achieving high accuracy in classifying eutectic vs. non-eutectic compositions. Because XGBoost is a black-box model and its class separation is not easily visualized, an explicit mathematical expression derived from an ANN was further used to establish phase-selection rules, which were validated by experiments and the literature [182].

5.3. Optimize the Design

In recent years, substantial progress has been made in applying ML to HEA research; however, predictive accuracy remains limited by the quantity and quality of available data. Refining model architectures and training strategies can further enhance performance and robustness. The objective of model optimization is to attain strong predictive performance on a given dataset while maintaining high generalization capability. This section reviews ML-based strategies for model optimization.

5.3.1. Optimize the Model Using Genetic Algorithms

Genetic algorithms (GAs) are a class of search and optimization algorithms inspired by natural evolution [183]. By mimicking selection, recombination (crossover), and mutation [184], GAs can yield high-quality solutions to a wide range of search, optimization, and learning problems. Because they operate on populations and do not require gradient information, GAs can navigate high-dimensional, nonconvex, and discontinuous design spaces, helping overcome limitations of many traditional methods—especially for problems with numerous parameters and complex mathematical representations.
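The selection–crossover–mutation cycle can be sketched as follows. The bitstring encoding and the one-max fitness are placeholders for a real encoding (e.g., which descriptors or models to include); everything else mirrors the standard GA loop:

```python
import random

random.seed(42)

GENES, POP, GENERATIONS = 12, 20, 40

def fitness(bits):
    """Toy objective: maximise the number of 1-bits (stand-in for a real score)."""
    return sum(bits)

def select(pop):
    """Tournament selection: best of three random individuals."""
    return max(random.sample(pop, 3), key=fitness)

def crossover(a, b):
    """Single-point recombination of two parents."""
    cut = random.randrange(1, GENES)
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.05):
    """Flip each gene with a small probability."""
    return [1 - g if random.random() < rate else g for g in bits]

pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP)]
best = max(pop, key=fitness)
```

Because the loop only ever calls `fitness` as a black box, the same skeleton works whether the genome encodes descriptor subsets, model choices, or alloy compositions.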
Materials informatics uses ML models to learn relationships between target properties and material descriptors, thereby accelerating the discovery of new materials [185,186]. Developing efficient strategies to select ML models and descriptors can further accelerate HEA research. Zhang et al. employed a GA to jointly select models and descriptors from many candidates and demonstrated its efficacy on a binary phase-formation task (solid-solution vs. non–solid-solution) in HEAs. The optimized classifier achieved 88.7% accuracy for SS vs. non-SS; for the multi-class task, it reached 91.3% accuracy in distinguishing BCC, FCC, and dual-phase (BCC + FCC) HEAs. Using active learning, high-uncertainty candidates were selected for experimental synthesis and phase identification; the resulting labels were appended to the initial dataset to iteratively improve the ML model [187]. This approach provides a general framework for selecting descriptors and ML models to address diverse materials problems, including classification and property-optimization tasks.
Li et al. incorporated feature-importance cues and enhanced genetic operators into an improved GA, yielding greater interpretability and significantly better precision, stability, and efficiency than a standard GA; comparisons were also made with other common feature-selection methods. They further discussed combining compositional features with physics-inspired descriptors for ML model selection within the improved GA. For hardness prediction in AlCoCrCuFeNi HEAs, a stacking ensemble was proposed to enhance predictive performance and reduce error [188].

5.3.2. GAN

The performance of ML models is often limited by the amount of available materials data. To address the scarcity of HEA hardness data, GANs—a class of DNNs—can be used to generate plausible synthetic data [189]. GANs augment datasets by producing high-quality samples via the adversarial training of a generator and a discriminator [190,191], thereby facilitating the application of ML algorithms in HEA research.
A schematic of the GAN architecture is shown in Figure 16. The generator maps a latent random noise vector (e.g., sampled from a simple prior) to the data space, synthesizing samples that resemble the original data. During training, it is optimized to produce increasingly realistic samples that the discriminator cannot distinguish from real ones. The discriminator outputs a probability indicating whether an input is real (close to 1) or generated (close to 0). Through alternating updates of the generator and discriminator within each iteration, both networks improve iteratively, enhancing the fidelity of the generated data and the robustness of the classifier [151].
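The alternating update can be illustrated on a deliberately trivial one-dimensional problem: real data drawn from N(3, 1), a one-parameter generator g(z) = a + z, and a logistic discriminator, with the gradients written out by hand. This is purely a didactic sketch of the adversarial dynamics, not a practical GAN:

```python
import math
import random

random.seed(0)

def sigmoid(u):
    u = max(min(u, 60.0), -60.0)   # guard against overflow
    return 1.0 / (1.0 + math.exp(-u))

# Real data ~ N(3, 1); generator g(z) = a + z with a single parameter a;
# discriminator d(x) = sigmoid(w*x + b) scores "real" near 1, "fake" near 0.
a, w, b, lr = 0.0, 0.1, 0.0, 0.05

for _ in range(3000):
    x_real = random.gauss(3.0, 1.0)
    x_fake = a + random.gauss(0.0, 1.0)
    # Discriminator step: gradient ascent on log d(x_real) + log(1 - d(x_fake))
    s_r, s_f = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
    w += lr * ((1.0 - s_r) * x_real - s_f * x_fake)
    b += lr * ((1.0 - s_r) - s_f)
    # Generator step: gradient ascent on log d(g(z)) (the non-saturating loss)
    x_fake = a + random.gauss(0.0, 1.0)
    a += lr * (1.0 - sigmoid(w * x_fake + b)) * w
```

As training proceeds, the generator parameter `a` drifts toward the real-data mean of 3, at which point the discriminator can no longer separate real from generated samples; the same push–pull dynamic, scaled up to deep networks, is what produces realistic synthetic composition–property data.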
Lee et al. developed conditional GANs (CGANs) to generate additional HEA samples, addressing data scarcity in high-entropy alloys. They employed deep-learning-based optimization, generation, and interpretation methods to improve performance and identify key design parameters for HEA phase prediction. First, a regularized DNN was established and its hyperparameters (architecture, training, and regularization) were optimized. With CGAN-augmented data, the model’s performance improved markedly, achieving an accuracy of 93.17%. This work provides guidance not only for building reliable deep-learning phase-prediction models but also for interpreting important design parameters to aid the design of novel HEAs [192].
Yang proposed a new two-step GAN data-augmentation method that maintains consistency between the feature distributions of generated and original data while preserving label quality. The approach was evaluated on HEA hardness prediction (205 samples) and photocatalyst formation energy prediction (3099 samples). Using the Diebold–Mariano test for statistical comparison, the two-step method significantly outperformed prior GAN approaches. The resulting ML pipeline showed strong augmentation performance, reducing the photocatalyst prediction error by 6.1% relative to previous work [193].
To address the challenge of data scarcity in materials science, Sun et al. presented a novel generative framework: the Elemental Feature Transfer and Augmentation Generative Adversarial Network (EFTGAN), as shown in Figure 15. EFTGAN seamlessly integrates elemental convolution techniques with a generative adversarial network (GAN) to produce data imbued with elemental and structural knowledge. This capability allows for effective data augmentation to enhance model accuracy and facilitates reliable predictions for scenarios with unknown structures. In their study of the FeNiCoCrMn/P high-entropy system, EFTGAN not only improved predictive performance from a small dataset but also successfully mapped the concentration-dependent evolution of key properties, including formation energy, lattice parameters, and magnetic moments, across the quinary composition space [194].
Figure 15. The green block is the ECNet model; the elemental convolution operation extracts the features of each element in the material, and principal component analysis reduces the feature dimensionality to improve generator performance. The purple block shows the elemental features after dimensionality reduction. The blue block is the InfoGAN model; at each iteration, a multi-layer perceptron predicts the features generated by InfoGAN, and the results are fed back into InfoGAN training. Adapted from Ref. [194].

5.3.3. Optimize the Model and Optimized Component Design

ML has been widely applied in materials research, yet the predictive accuracy of models and the design of high-performance HEAs still require continual improvement. Refining model architectures, feature representations, training strategies, and hyperparameters can further enhance predictive performance and robustness. Moreover, algorithmic advances and more effective (near-)global optimization can improve accuracy, thereby enabling the discovery and design of HEAs that meet or even surpass target performance criteria.
Model optimization refers to adjusting and improving ML models to enhance their performance and effectiveness. Its objective is to achieve the best predictive performance on a given dataset while maintaining strong generalization capability.
Nassar et al. compared a NN trained on composition-only inputs with one trained on Hume–Rothery (HR) [195] parameters plus composition. The average test accuracy of the composition-only model (NN1) was 92%, whereas the HR + composition model (NN2) achieved 90%. NN1 outperformed most contemporary approaches, suggesting that eschewing hand-crafted HR descriptors can be not only feasible but advantageous. Both networks were validated by predicting the single-phase solid-solution window in the AlxCrCuFeNi system. The observed strengthening was attributed to interactions between moving dislocations and solute-atom–induced lattice distortions [196].
Solid-solution strengthening (SSS) underpins the excellent mechanical properties of single-phase HEAs. However, HEA compositions often lie beyond the assumptions of traditional theories, motivating new SSS models [197,198,199]. Wen et al. proposed a model that predicts the strength/hardness of HEA solid solutions more accurately than existing formulations. Leveraging ML-assisted feature construction and selection to capture salient descriptors, they developed a simpler SSS model that outperforms prior solid-solution models and identified alloys with potentially high SSS in AlCoCrFeNi, CoCrFeNiMn, HfNbTaTiZr, and MoNbTaWV systems [200].
Xu et al. introduced an intelligent optimization algorithm (OA) to refine feature selection in ML models for predicting UTS and fracture elongation (FE) in multi-principal-element alloys (MPEAs). Compared with a GA, the OA achieved higher computational efficiency, better predictive accuracy, faster convergence, and stronger feature-recognition capability [141].
Bayesian optimization (BO) is a hyperparameter (and design-variable) search strategy grounded in Bayesian inference [201]. It seeks optimal hyperparameter combinations by constructing a probabilistic surrogate of the objective function and querying points expected to improve the objective. Vela et al. developed robust, rapid models using Bayesian updating concepts that combine ML surrogates, easy-to-implement physics-based models, and inexpensive proxy experimental data. Their approach was cross-validated, informed by physics for extrapolation, and rigorously benchmarked against a standard Gaussian process (GP) regressor on BO tasks. The framework can serve within the ICME workflow to screen RHEAs with excellent high-temperature performance [202]. Khatamsaz et al. proposed a multi-objective BO framework under unknown constraints, which actively learns feasibility boundaries and iteratively shrinks the design space by distinguishing feasible from infeasible regions. Targeting refractory MPEAs, they linked structure to properties by optimizing ductility indicators—the Pugh ratio and Cauchy pressure—and benchmarked the method by designing a ductile refractory MPEA under two constraints (density and solidus temperature) relevant to gas-turbine applications. Detailed DFT calculations on the predicted alloy elucidated the mechanisms governing ductility. Although demonstrated for ductility, the approach readily extends to additional objectives and constraints [203]. Zhou et al. built a Bayesian neural network (BNN) energy model to explore configuration space in the CoNiRhRu HEA system. The BNN used six independent pairwise features (Co–Ni, Co–Rh, Co–Ru, Ni–Rh, Ni–Ru, Rh–Ru), with coordination-shell/structural energy terms as targets, achieving an energy RMSE of 1.37 meV atom−1.
They analyzed the effect of feature periodicity on HEA energies and showed that once the network is well-optimized, RMSE alone is insufficient to assess performance; uncertainty quantification is essential for predicting new HEA structures with calibrated confidence in a BNN-based workflow [204]. Sulley et al. applied BO in an active-learning loop with neural-network surrogates to efficiently traverse the vast HEA compositional space, focusing on predicting stable phases [205]. Halpren et al. used multi-objective BO assisted by DFT to optimize hydrogen-absorption thermodynamics, identifying VNbCrMoMn as a high-performance composition; explainable-ML analysis revealed that the first- and second-stage absorption thermodynamics depend largely on the bulk modulus and the number of d-band states, respectively [206]. To address the challenge of constrained multi-objective design in expansive compositional spaces, a framework based on Bayesian optimization was proposed by Khatamsaz et al., with a focus on Mo-Nb-Ti-V-W MPEAs for advanced gas turbine blades. A distinct capability of this approach is its adaptation to unknown constraints in the design space, which allows it to iteratively decide the best subsequent action at every stage of the optimization [207]. Figure 16 shows the application of the proposed framework to solve the problem.
Figure 16. Overall results of the 5-constraint 3-objective material design problem. The process begins with learning the constraint boundaries by querying the constraints, effectively reducing the entropy associated with each classifier that represents a specific constraint. Once the entropy curves for all classifiers are flattened, Bayesian optimization begins to learn the non-dominated design region. As the estimations of the Pareto front improve, the hypervolume increases accordingly. The figure also includes an illustration of the objective space, showing all the queries to the ground truth model and the final estimation of the Pareto front. Adapted from Ref. [207].
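The surrogate-plus-acquisition loop at the core of BO can be sketched end to end on a one-dimensional toy objective (a stand-in for an expensive property measurement): fit a Gaussian-process surrogate, maximize an acquisition function (here a simple upper confidence bound rather than EI), evaluate the chosen point, and repeat. Everything below is a self-contained illustration, not any of the cited frameworks:

```python
import math

def objective(x):
    """Toy stand-in for an expensive property measurement."""
    return math.exp(-(x - 0.6) ** 2 / 0.05) + 0.3 * math.sin(8 * x)

def rbf(a, b, length=0.15):
    """Squared-exponential kernel."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, rhs):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(X, y, xq, noise=1e-4):
    """GP posterior mean and standard deviation at a query point xq."""
    K = [[rbf(xi, xj) + (noise if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    alpha = solve(K, y)
    k_star = [rbf(xi, xq) for xi in X]
    mu = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = max(rbf(xq, xq) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mu, math.sqrt(var)

X = [0.05, 0.5, 0.95]                       # initial designs
y = [objective(x) for x in X]
grid = [i / 200 for i in range(201)]        # candidate pool
for _ in range(10):                         # BO loop, upper-confidence-bound acquisition
    def ucb(xq):
        mu, sd = gp_predict(X, y, xq)
        return mu + 2.0 * sd
    x_next = max(grid, key=ucb)
    X.append(x_next)
    y.append(objective(x_next))
```

Early iterations are dominated by the uncertainty term (exploration of unsampled regions); later iterations concentrate queries where the posterior mean is high, which is why BO typically locates a near-optimal design with an order of magnitude fewer evaluations than grid or random search.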
By adjusting the ratios (proportions) of the constituent elements in an alloy, hardness and other properties can be improved. Ren et al. proposed two data-driven ML models for (i) hardness prediction of HEAs and (ii) composition optimization for high-hardness HEAs. The prediction model combines explainable ML with solid-solution strengthening theory, achieving R2 = 0.9716 and RMSE = 39.2525 (under leave-one-out validation). The optimization model employs an intelligent optimization algorithm to design molar ratios for high-hardness HEAs, followed by experimental validation. A general design framework for predicting HEA performance and optimizing compositions was also summarized [208].
Integrated phase–property design enables the rapid discovery of target HEAs with desired phases and properties. Li et al. developed two optimized classification models (accuracy > 85%) to identify single-phase BCC solid-solution HEAs, and a regression model with R > 0.9 to predict HEA hardness. An integrated strategy combined the phase and hardness predictors. From 284,634 candidates, low-activation HEAs with the desired phase and performance were identified. This led to the design and fabrication (in two experimental iterations) of a new single-phase BCC Fe35Cr30V20Mn10Ti5 low-activation HEA with 555.9 ± 15.3 HV. In addition, two phase-selection rules (accuracy > 95%) were provided to effectively screen SS and BCC HEAs, respectively [209].
In another study, Li et al. simplified the design pipeline by using only elemental compositions as ML inputs to increase hardness while decreasing density. Using a database of 544 multi-principal-element alloy compositions, they developed a robust surrogate model and coupled it with principal component analysis (PCA) to aid candidate selection. Through three iterations involving only 14 new samples, they identified an alloy whose effective specific hardness exceeded the maximum in the training set by 8.6% [210]. Table 2 summarizes representative ML-driven explorations of HEA compositional space.

5.4. ML-Driven Research in HEAs: Emerging Trends and Applications

Confronted with ever-expanding compositional dimensionality and increasingly stringent experimental budgets in high-entropy-alloy design, a spectrum of machine-learning paradigms shows substantial promise. Reinforcement learning reframes the end-to-end pipeline of compositional design, heat treatment, and property evaluation as a sequential decision problem that optimizes experimental policies within thermodynamically and kinetically constrained environments, thereby driving systematic performance gains in a closed loop of control–characterization–learning; recent studies indicate that offline RL and safe, constraint-aware RL furnish viable pathways toward autonomous process optimization. Transfer learning proves particularly effective in small-sample regimes by transporting shared representations that link elements, phase constitution, and properties across alloy families, distilling source-domain priors into the target compositional space to mitigate data scarcity. In quantum machine learning, hybrid quantum–classical neural networks have achieved phase-prediction accuracies for HEAs comparable to those of classical counterparts; with steady advances in quantum hardware, these models are poised to offer distinctive advantages for rapid exploration of high-dimensional compositional spaces and for approximating basins of the potential-energy landscape. Given the pervasive multi-objective trade-offs demanded by engineering applications, HEA design typically seeks Pareto-optimal compromises among strength, ductility, oxidation/corrosion resistance, density, and cost; here, multi-objective Bayesian optimization coupled with feasible-region learning systematically identifies optimal solutions while honoring manufacturability and service-safety constraints. 
Notably, recent practice in the Ni–Co–Cr–Al–Fe system demonstrates that an optimize–screen–validate workflow combining machine learning with high-throughput thermodynamic calculations has led to alloys exhibiting oxidation resistance surpassing conventional MCrAlY coatings, providing a compelling exemplar of materials design under multi-objective, multiphysics constraints [211,212].
In aerospace applications, the engineering suitability of refractory high-entropy alloys is jointly delimited by three critical boundaries—phase stability, protective scale-forming capability, and interfacial adhesion. Here, a multi-objective, multi-fidelity Bayesian optimization framework (qEHVI coupled with co-kriging) is employed to construct Pareto fronts in the objective space defined by specific strength, oxidation mass gain, and density, while incorporating as functional regularizers the thermodynamic tendencies for forming Al2O3, Cr2O3, and SiO2 scales driven by the activities of Al, Cr, and Si. The resulting, reproducible performance metrics are reported for 1100–1200 °C [213]. In energy systems, for solid-oxide fuel-cell interconnects and high-temperature heat-exchange components, thermodynamic activities and oxide-scale formation energies are imposed as constraints, augmented by deep-potential-based predictions of diffusion and interfacial migration trends, thereby strengthening model robustness under extrapolated operating conditions. For high-entropy alloy hydrides, prior knowledge from Mg-/Ti-based hydrogen-storage materials is transferred into multi-principal compositional spaces via transfer learning, and a budget-constrained multi-objective Bayesian optimization conducts compositional screening over equilibrium pressure, storage capacity, and cycling stability, with conformal intervals providing quantitative calibration of extrapolation risk [214].
In electrocatalysis, using the oxygen-reduction reaction as an exemplar, an equivariant graph neural network is trained to learn the joint distribution of ΔG_OH and ΔG_O; GFlowNets together with diffusion-based policies enable distributed sampling of structure–activity relationships. A high-throughput electrochemical platform closes the loop with a qEHVI-driven design cycle that systematically evaluates hypervolume and sample-efficiency metrics, establishing a continuous evidentiary chain from computational models through material specimens to device-level performance [215]. In biomedical materials, targeting the multi-objective requirements of low elastic modulus, high wear resistance, and excellent corrosion resistance, element embeddings and pre-trained potential functions are used to predict trends in stacking-fault energy and interfacial energy; coupled electrochemical-corrosion surrogate models and wear-volume thresholds define engineering acceptance lines, and potentiodynamic polarization together with wear tests in simulated body fluid delivers experimental validation, thereby realizing a closed loop from mechanistic interpretation and compositional design to empirical verification [216].

6. Conclusions, Challenges, and Outlooks

6.1. Conclusions

The trial-and-error paradigm has guided alloy design for millennia. However, as the number of constituent elements increases and microstructures become increasingly hierarchical, this approach becomes impractical. Fortunately, decades of prior trial-and-error research on HEAs have yielded extensive experimental datasets that ML can leverage to design complex alloys. He et al. integrated ML models for yield strength and strain-at-fracture into a unified system to predict and design high-performance RHEAs, achieving compressive yield strength and strain-at-fracture surpassing those of NbTaTiVW [217]. In parallel, new experimental workflows and algorithms executed on modern supercomputers have produced large theoretical databases, alleviating common data-scarcity issues for ML models.
The combination of ML and materials science has spawned a rapidly growing research area. The exceptional performance, vast compositional space, and complex chemical interactions of HEAs make ML a powerful route to explore and design high-performance alloys. Broadly, ML studies on HEAs fall into two categories: (i) property-prediction models trained on high-throughput experimental or computational data to predict bulk properties (e.g., phase formation, crystal structure, elastic constants, yield strength); and (ii) ML-guided design frameworks that propose HEAs whose superior mechanical properties are subsequently validated experimentally.

6.2. Challenges

Although ML provides a new paradigm for high-throughput screening and performance prediction in the materials genomics of HEAs, it still faces three technical bottlenecks before broad engineering adoption. From a research-workflow perspective, the first challenge is the dataset: the difficulty of constructing multiscale data systems limits model generalization. Although DFT and GANs can expand existing HEA databases, they still presuppose non-trivial dataset sizes; in underexplored compositional domains these methods remain insufficient. Moreover, errors from first-principles pseudopotential approximations (often >0.5 eV·atom−1) compound with experimental EDS composition biases (typically >2 at.%), producing stacked uncertainties. Consequently, small-data methods become crucial in HEA design [218]. Qiao et al. proposed a cuckoo-search artificial neural network (CS-ANN) tailored to small samples, demonstrated strong learning and generalization capabilities, and designed and fabricated Al1.1CrCoFe0.9Ni0.9 HEAs for experimental validation [219]. Wen et al. developed a closed-loop framework—combining ML, genetic search, cluster analysis, and design of experiments—using only two independent datasets (54 and 145 alloys), and discovered a Zr0.13Nb0.27Mo0.26Hf0.13Ta0.21 RHEA with excellent strength and ductility [220].
Secondly, feature selection and its interpretability with respect to the target remain major challenges. The analysis of high-dimensional data is nontrivial in ML; removing irrelevant or redundant variables via feature selection can reduce computation time, improve predictive accuracy, and enhance understanding of both the model and the data [221]. Yang et al. proposed a machine-learning-based alloy design system (MADS) that filters key features governing HEA hardness through a four-step feature-selection pipeline. SVM-based hardness predictors achieved a Pearson correlation of 0.94 under leave-one-out cross-validation (LOOCV), and the models were experimentally validated [91]. To further improve feature selection, Liu et al. designed a three-stage screening scheme that identified seven highly representative features from an initial set of 64; these were then used for model training. A comparative analysis against feature sets reported in six peer-reviewed studies corroborated the validity of the selected features [222].
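The redundancy-filtering idea behind such pipelines (not the MADS procedure itself, which is a four-step scheme) can be sketched as ranking hypothetical features by Pearson correlation with the target and greedily dropping near-duplicates; all feature values below are invented:

```python
import math

# Sketch of a correlation-based feature-selection filter in the spirit of the
# pipelines discussed above. Feature and target values are invented.

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_features(features, target, redundancy_cut=0.95):
    """Rank features by |r| with the target, then greedily drop any feature
    correlating above `redundancy_cut` with an already-kept feature."""
    ranked = sorted(features, reverse=True,
                    key=lambda f: abs(pearson(features[f], target)))
    kept = []
    for f in ranked:
        if all(abs(pearson(features[f], features[k])) < redundancy_cut
               for k in kept):
            kept.append(f)
    return kept

features = {
    "VEC": [1, 2, 3, 4, 5],
    "VEC_dup": [1.1, 2.0, 3.1, 4.0, 5.1],  # near-duplicate of VEC
    "noise": [2, 1, 4, 3, 5],
}
target = [2, 4, 6, 8, 10]
kept = select_features(features, target)  # "VEC_dup" is filtered out
```

Wrapper methods (e.g. exhaustive search with cross-validation, as in the three-stage scheme of Liu et al.) refine such filter-based shortlists further.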
A lack of physical interpretability continues to hinder the discovery of general materials-design principles. Current SHAP-based attribution methods often provide limited physical insight and may fail to establish quantitative links with classical theories (e.g., Hume–Rothery rules). Establishing explicit mappings between ML features and thermodynamic parameters (e.g., the mixing entropy ΔS_mix) remains a pressing fundamental problem. Notably, hybrid modeling strategies that integrate symbolic regression and physics-informed neural networks (PINNs) have shown promise for deriving explicit composition–property relations (see Section 4.1.3 for related studies).
Finally, the absence of a robust model-confidence assessment can introduce serious technical risks. In phase-structure prediction, modern DNNs often exhibit overconfidence; misclassification of phases can, in turn, compromise thermodynamic-stability analyses of alloys. Confidence scores from conventional CNNs are frequently mis-calibrated, showing a nonlinear relation to the true error rate. To address this, a dynamic assessment framework integrating uncertainty quantification (UQ)—e.g., Monte Carlo dropout and Bayesian neural networks (BNNs)—with confidence-domain analysis over the materials-feature space is needed to build a self-diagnostic prediction system. Demonstrating this direction, Wen et al. formulated an ML design strategy that incorporates uncertainty estimation and cluster analysis, enabling the discovery and synthesis of four non-equimolar alloys with excellent high-temperature strength, room-temperature ductility, and high-temperature specific yield strength [223].
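A minimal stand-in for ensemble-based UQ with a reject option is sketched below; the per-member yield-strength predictions are invented, and in practice a deep ensemble or MC-dropout forward passes would supply them:

```python
from statistics import mean, stdev

# Stand-in for ensemble-based uncertainty quantification with a reject option.
# Each list holds yield-strength predictions (MPa) from ensemble members; the
# values are invented for illustration.

def predict_with_uncertainty(member_predictions, reject_sigma=50.0):
    """Aggregate ensemble predictions into mean and std, and abstain
    (reject) when member disagreement exceeds a threshold."""
    mu = mean(member_predictions)
    sigma = stdev(member_predictions)
    return {"mean": mu, "std": sigma, "accepted": sigma <= reject_sigma}

in_dist = predict_with_uncertainty([1010, 1020, 995, 1005])  # low disagreement
out_dist = predict_with_uncertainty([800, 1200, 600, 1400])  # likely OOD input
```

Rejected predictions would then be routed to experiment or higher-fidelity simulation rather than trusted for thermodynamic-stability analysis.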

6.3. Outlooks

The scarcity of data is not merely a matter of limited sample size; it arises from a conjunction of missing priors and suboptimal sampling strategies. To address this, we treat CALPHAD, first-principles calculations, and phase-field simulations as low- to mid-fidelity physical priors and embed them natively into surrogate construction via multi-fidelity co-kriging, while maximizing information gain under budget constraints through an active-learning, multi-objective acquisition scheme based on qEHVI. On the representation axis, we introduce knowledge distillation and transfer learning to transplant chemical–structural semantic information from related materials systems into the target compositional space, thereby prioritizing optimization within chemical subregions that exhibit a higher density of prior value. For interpretability, we integrate SHAP analysis, counterfactual explanations, and sparse identification of nonlinear dynamics (SINDy) to build a three-tier “local–global–symbolic” explanatory stack that converts black-box outputs into auditable physical rules. We further vet the resulting rules through dimensional-consistency checks and extreme-condition validation, elevating interpretability from a post hoc add-on to a first-class design criterion [216]. With respect to uncertainty governance, classification models undergo confidence calibration via temperature scaling and reliability (calibration) curves, whereas regression models combine deep ensembles, Bayesian posterior sampling, and conformal prediction to construct trustworthy uncertainty intervals; a systematic reject-option mechanism is incorporated to handle distribution-shifted and out-of-distribution instances effectively [224]. 
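The split-conformal construction mentioned above admits a compact sketch: the empirical (1 − α) quantile of absolute residuals on a held-out calibration set yields a symmetric band around new predictions. The residual values below are invented:

```python
import math

# Sketch of a split-conformal prediction interval. The quantile of absolute
# calibration residuals gives a distribution-free band (under exchangeability)
# around new point predictions. Residual values are invented.

def conformal_interval(calib_residuals, y_pred, alpha=0.1):
    n = len(calib_residuals)
    scores = sorted(abs(r) for r in calib_residuals)
    # Conformal quantile index: ceil((n + 1) * (1 - alpha)), clipped to n
    k = min(n, math.ceil((n + 1) * (1 - alpha))) - 1
    q = scores[k]
    return (y_pred - q, y_pred + q)

residuals = [1, -2, 3, -1, 2, 4, -3, 1, 2, -2]  # calibration-set errors
lo, hi = conformal_interval(residuals, y_pred=100.0)  # 90% interval
```

The coverage guarantee holds only in-distribution; for distribution-shifted inputs the reject-option mechanism described above remains necessary.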
In the experimental validation loop, we adopt cost-aware batch experimental design coupled with a max–min diversity policy to avoid “near-duplicate composition clustering” in the search space, and we complete closed-loop verification—from computational prediction to laboratory synthesis—across several representative alloy systems. All studies adhere to scriptable, reproducible evaluation pipelines aligned with standard benchmarks such as Matbench and JARVIS [225].
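The max–min diversity policy can be sketched as a greedy farthest-point selection over composition vectors; the candidate compositions (at.%) below are invented:

```python
import math

# Greedy max-min (farthest-point) batch selection over composition vectors,
# which avoids near-duplicate composition clustering in a proposed batch.
# Candidate compositions (at.%) are invented for illustration.

def maxmin_batch(candidates, k, seed_index=0):
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [seed_index]
    while len(selected) < k:
        # Add the candidate whose nearest already-selected neighbour is farthest
        best = max((i for i in range(len(candidates)) if i not in selected),
                   key=lambda i: min(dist(candidates[i], candidates[j])
                                     for j in selected))
        selected.append(best)
    return selected

compositions = [
    (20, 20, 20, 20, 20),  # seed
    (21, 19, 20, 20, 20),  # near-duplicate of the seed
    (50, 20, 10, 10, 10),
    (12, 48, 20, 10, 10),
]
batch = maxmin_batch(compositions, k=3)  # skips the near-duplicate
```

Cost-aware variants would weight the distance criterion by per-candidate synthesis cost before selection.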
HEAs represent a promising frontier for data-driven materials discovery, contingent on the seamless integration of modeling, data, and experimentation. The most immediate opportunity lies in the fusion of ML with mechanistic insights across multiple length and time scales. By embedding thermodynamic and kinetic constraints into models—using physics-informed neural networks (PINNs), symbolic regression, and loss functions enforcing mass and charge balance—unphysical extrapolations can be minimized, enhancing sample efficiency in data-limited regimes. The integration of DFT and MD simulations with phase-field and crystal-plasticity finite element models, coupled with neural operators and graph-based approaches, can facilitate coherent information transfer from atomic configurations to macroscopic properties.
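A toy example of such a constraint-augmented loss is given below, with a penalty term enforcing that (hypothetical) predicted atomic fractions satisfy mass balance; the weighting and values are illustrative only:

```python
# Toy constraint-augmented loss: a squared-error data term plus a penalty that
# drives hypothetical predicted atomic fractions toward mass balance (sum = 1).
# In a PINN this penalty would be added to the training objective.

def physics_informed_loss(pred, target, fractions, penalty_weight=10.0):
    data_term = (pred - target) ** 2
    balance_term = (sum(fractions) - 1.0) ** 2  # mass-balance violation
    return data_term + penalty_weight * balance_term

balanced = physics_informed_loss(1.0, 1.0, [0.25, 0.25, 0.25, 0.25])  # no penalty
violated = physics_informed_loss(1.0, 1.0, [0.3, 0.3, 0.3, 0.2])      # penalized
```

Analogous penalty terms can encode charge balance or thermodynamic consistency, steering the model away from unphysical extrapolations in data-limited regimes.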
A critical factor in advancing HEA research is ensuring data quality and interoperability. A unified data schema encompassing composition, processing histories, test conditions, and microstructural annotations, along with provenance tracking and uncertainty quantification, is vital for reproducibility and reliable generalization. Evaluating models with standardized metrics and robust calibration measures will further improve reliability and reduce barriers to data reuse.
Advances in algorithmic expressivity and interpretability are equally important. Hybrid approaches combining symbolic regression with physics-based models can yield interpretable composition-property relationships, aiding inverse design and enhancing model transparency. Autonomous, uncertainty-aware experimentation, leveraging active learning and high-throughput synthesis, can dramatically accelerate discovery timelines. Ultimately, the integration of ML with sustainability goals—such as recyclability, supply-chain risks, and environmental impact—will guide the future of HEA development and extend these paradigms to other materials systems.
The critical breakthroughs in future HEA research will not arise from marginal refinements to isolated models, but from the construction of an institutionalized, cross-disciplinary workflow. A reproducible HEA intelligent-design paradigm should treat multi-fidelity physical models as prior constraints, adopt active learning and multi-objective optimization as the decision-making core, employ conformal prediction intervals and evidence-fusion mechanisms for risk control, and use standardized data lineage together with model cards as auditable artifacts. Within this framework, three professional roles have complementary responsibilities: materials scientists articulate testable causal hypotheses and codify structure–property characterization protocols; computational modelers encode the underlying physics into computable, orchestratable sources of synthetic data; and machine-learning experts design sampling strategies with quantified uncertainty and choreograph the cadence of exploration. When these actors collaborate around unified objective functions, constraint sets, and a traceable data ecosystem, long-standing bottlenecks—extrapolation failure, limited interpretability, and engineering transfer bias—are mitigated in a systematic manner.
Realizing this vision requires a governance architecture and infrastructural backbone that enable collaboration across teams, including harmonized metadata standards and materials ontologies, a versioned data lake that natively accommodates multi-fidelity records, end-to-end ML-Ops pipelines covering the full model lifecycle, and a tightly coupled human-in-the-loop decision loop integrated with experimental platforms. Only by embedding these institutional mechanisms as integral elements of the research methodology can we translate technically validated approaches from controlled studies to real-world materials R&D settings characterized by complex constraints and high-cost sensitivity in a robust and reliable way.
In aggregate, the maturation of the field hinges on the synthesis of high-fidelity, uncertainty-annotated data; interpretable, physics-aware learning; and autonomous, closed-loop experimentation. When these elements are integrated under rigorous calibration and reproducibility standards, HEA discovery can transition from bespoke exploration to a repeatable, scalable, and sustainable design practice—shortening the pathway from computation to qualified components.

Author Contributions

X.X.: investigation, methodology, formal analysis, visualization, and writing – original draft. Z.H.: conceptualization, supervision, and writing – review and editing. K.Z.: investigation, methodology. L.C.: investigation, methodology. W.F.: investigation, review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Natural Science Research Foundation of China (no. 51801015).

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no competing interests.

References

  1. Yeh, J.-W.; Lin, S.-J.; Chin, T.-S.; Gan, J.-Y.; Chen, S.-K.; Shun, T.-T.; Tsau, C.-H.; Chou, S.-Y. Formation of simple crystal structures in Cu-Co-Ni-Cr-Al-Fe-Ti-V alloys with multiprincipal metallic elements. Metall. Mater. Trans. A 2004, 35, 2533–2536. [Google Scholar] [CrossRef]
  2. Cantor, B.; Chang, I.T.H.; Knight, P.; Vincent, A.J.B. Microstructural development in equiatomic multicomponent alloys. Mater. Sci. Eng. A 2004, 375–377, 213–218. [Google Scholar] [CrossRef]
  3. Yeh, J.-W.; Chen, S.K.; Lin, S.-J.; Gan, J.-Y.; Chin, T.-S.; Shun, T.-T.; Tsau, C.-H.; Chang, S.-Y. Nanostructured high-entropy alloys with multiple principal elements: Novel alloy design concepts and outcomes. Adv. Eng. Mater. 2004, 6, 299–303. [Google Scholar] [CrossRef]
  4. Huang, P.-K.; Yeh, J.-W.; Shun, T.-T.; Chen, S.-K. Multi-principal-element alloys with improved oxidation and wear resistance for thermal spray coating. Adv. Eng. Mater. 2004, 6, 74–78. [Google Scholar] [CrossRef]
  5. Kunce, I.; Polanski, M.; Bystrzycki, J. Structure and hydrogen storage properties of a high entropy ZrTiVCrFeNi alloy synthesized using Laser Engineered Net Shaping (LENS). Int. J. Hydrogen Energy 2013, 38, 12180–12189. [Google Scholar] [CrossRef]
  6. Choi, W.-M.; Jo, Y.H.; Sohn, S.S.; Lee, S.; Lee, B.-J. Understanding the physical metallurgy of the CoCrFeMnNi high-entropy alloy: An atomistic simulation study. npj Comput. Mater. 2018, 4, 1. [Google Scholar] [CrossRef]
  7. Jarlöv, A.; Ji, W.; Zhu, Z.; Tian, Y.; Babicheva, R.; An, R.; Seet, H.L.; Nai, M.L.S.; Zhou, K. Molecular dynamics study on the strengthening mechanisms of Cr–Fe–Co–Ni high-entropy alloys based on the generalized stacking fault energy. J. Alloys Compd. 2022, 905, 164137. [Google Scholar] [CrossRef]
  8. Ma, D.; Grabowski, B.; Körmann, F.; Neugebauer, J.; Raabe, D. Ab initio thermodynamics of the CoCrFeMnNi high entropy alloy: Importance of entropy contributions beyond the configurational one. Acta. Mater. 2015, 100, 90–97. [Google Scholar] [CrossRef]
  9. Lederer, Y.; Toher, C.; Vecchio, K.S.; Curtarolo, S. The search for high entropy alloys: A high-throughput ab-initio approach. Acta Mater. 2018, 159, 364–383. [Google Scholar] [CrossRef]
  10. Jo, Y.H.; Choi, W.M.; Kim, D.G.; Zargaran, A.; Sohn, S.S.; Kim, H.S.; Lee, B.J.; Kim, N.J.; Lee, S. FCC to BCC transformation-induced plasticity based on thermodynamic phase stability in novel V10Cr10Fe45CoxNi35−x medium-entropy alloys. Sci. Rep. 2019, 9, 2948. [Google Scholar] [CrossRef]
  11. Feng, R.; Liaw, P.K.; Gao, M.C.; Widom, M. First-principles prediction of high-entropy-alloy stability. npj Comput. Mater. 2017, 3, 50. [Google Scholar] [CrossRef]
  12. Yin, S.; Zuo, Y.; Abu-Odeh, A.; Zheng, H.; Li, X.-G.; Ding, J.; Ong, S.P.; Asta, M.; Ritchie, R.O. Atomistic simulations of dislocation mobility in refractory high-entropy alloys and the effect of chemical short-range order. Nat. Commun. 2021, 12, 4873. [Google Scholar] [CrossRef] [PubMed]
  13. Wang, Y.; Wang, Y. Disentangling diffusion heterogeneity in high-entropy alloys. Acta Mater. 2022, 224, 117527. [Google Scholar] [CrossRef]
  14. Zhao, L.; Zong, H.; Ding, X.; Lookman, T. Anomalous dislocation core structure in shock compressed bcc high-entropy alloys. Acta Mater. 2021, 209, 116801. [Google Scholar] [CrossRef]
  15. Xiong, J.; Shi, S.Q.; Zhang, T.Y. A machine-learning approach to predicting and understanding the properties of amorphous metallic alloys. Mater. Des. 2020, 187, 108378. [Google Scholar] [CrossRef]
  16. Tancret, F.; Toda-Caraballo, I.; Menou, E.; Rivera Díaz-Del-Castillo, P.E.J. Designing high entropy alloys employing thermodynamics and Gaussian process statistical analysis. Mater. Des. 2017, 115, 486–497. [Google Scholar] [CrossRef]
  17. Menou, E.; Toda-Caraballo, I.; Rivera-Díaz-del-Castillo, P.E.J.; Pineau, C.; Bertrand, E.; Ramstein, G.; Tancret, F. Evolutionary design of strong and stable high entropy alloys using multi-objective optimisation based on physical models, statistics and thermodynamics. Mater. Des. 2018, 143, 185–195. [Google Scholar] [CrossRef]
  18. Zheng, K.; He, Z.; Che, L.; Cheng, H.; Ge, M.; Si, T.; Xu, X. Deep alloys: Metal materials empowered by deep learning. Mater. Sci. Semicond. Process 2024, 179, 108514. [Google Scholar] [CrossRef]
  19. Che, L.; He, Z.; Zheng, K.; Xu, X.; Zhao, F. An automatic segmentation and quantification method for austenite and ferrite phases in duplex stainless steel based on deep learning. J. Mater. Chem. A 2024, 13, 772–785. [Google Scholar] [CrossRef]
  20. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  21. Hinton, G.E.; Osindero, S.; Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef]
  22. Dong, S.; Wang, P.; Abbas, K. A survey on deep learning and its applications. Comput. Sci. Rev. 2021, 40, 100379. [Google Scholar] [CrossRef]
  23. Jaderberg, M.; Simonyan, K.; Zisserman, A.; Kavukcuoglu, K. Spatial transformer networks. Adv. Neural Inf. Process Syst. 2015, 28. [Google Scholar]
  24. Tarasiuk, P.; Pryczek, M. Geometric transformations embedded into convolutional neural networks. J. Appl. Comput. Sci. 2016, 24, 33–48. [Google Scholar]
  25. Mounsaveng, S.; Laradji, I.; Ben Ayed, I.; Vazquez, D.; Pedersoli, M. Learning Data Augmentation with Online Bilevel Optimization for Image Classification. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 1691–1700. [Google Scholar]
  26. Luo, H.; Jiang, W.; Fan, X.; Zhang, C. Stnreid: Deep convolutional networks with pairwise spatial transformer networks for partial person re-identification. IEEE Trans. Multimed. 2020, 22, 2905–2913. [Google Scholar] [CrossRef]
  27. Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 9. [Google Scholar] [CrossRef]
  28. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2009, 22, 1345–1359. [Google Scholar] [CrossRef]
  29. Liu, X.; Zhang, J.; Pei, Z. Machine learning for high-entropy alloys: Progress, challenges and opportunities. Prog. Mater. Sci. 2023, 131, 101018. [Google Scholar] [CrossRef]
  30. Hu, M.; Tan, Q.; Knibbe, R.; Xu, M.; Jiang, B.; Wang, S.; Li, X.; Zhang, M.-X. Recent applications of machine learning in alloy design: A review. Mater. Sci. Eng. R. Rep. 2023, 155, 100746. [Google Scholar] [CrossRef]
  31. Yan, Y.; Hu, X.; Liao, Y.; Zhou, Y.; He, W.; Zhou, T. Recent machine learning-driven investigations into high entropy alloys: A comprehensive review. J. Alloys Compd. 2024, 60, 177823. [Google Scholar] [CrossRef]
  32. Hu, X. Review: Machine learning in high-entropy alloys-transformative potential and innovative application. J. Mater. Sci. 2025, 60, 12385–12408. [Google Scholar] [CrossRef]
  33. Liu, H.; Chen, B.; Chen, R.; He, J.; Kang, D.; Dai, J. Computational simulation of short-range order structures in high-entropy alloys: A review on formation patterns, multiscale characterization, and performance modulation mechanisms. Adv. Phys. X 2025, 10, 2527417. [Google Scholar] [CrossRef]
  34. Zhao, Y.M.; Zhang, J.Y.; Liaw, P.K.; Yang, T. Machine Learning-Based Computational Design Methods for High-Entropy Alloys. High Entropy Alloys Mater. 2025, 3, 41–100. [Google Scholar] [CrossRef]
  35. Brechtl, J.; Liaw, P.K. (Eds.) High-Entropy Materials: Theory, Experiments, and Applications; Springer Nature: Cham, Switzerland, 2021. [Google Scholar] [CrossRef]
  36. Zeroual, I.; Lakhouaja, A. Data science in light of natural language processing: An overview. Procedia Comput. Sci. 2018, 127, 82–91. [Google Scholar] [CrossRef]
  37. Souili, A.; Cavallucci, D.; Rousselot, F. Natural Language Processing (NLP)–A Solution for Knowledge Extraction from Patent Unstructured Data. Procedia Eng. 2015, 131, 635–643. [Google Scholar] [CrossRef]
  38. Cerda, P.; Varoquaux, G.; K’egl, B. Similarity encoding for learning with dirty categorical variables. Mach. Learn. 2018, 107, 1477–1494. [Google Scholar] [CrossRef]
  39. Zhao, D.Q.; Pan, S.P.; Zhang, Y.; Liaw, P.K.; Qiao, J.W. Structure prediction in high-entropy alloys with machine learning. Appl. Phys. Lett. 2021, 118, 231904. [Google Scholar] [CrossRef]
  40. Dai, D.; Xu, T.; Wei, X.; Ding, G.; Xu, Y.; Zhang, J.; Zhang, H. Using machine learning and feature engineering to characterize limited material datasets of high-entropy alloys. Comput. Mater. Sci. 2020, 175, 109618. [Google Scholar] [CrossRef]
  41. Bzdok, D.; Krzywinski, M.; Altman, N. Machine learning: A primer. Nat. Methods 2017, 14, 1119–1120. [Google Scholar] [CrossRef]
  42. Guo, Q.; Xu, X.; Pei, X.; Duan, Z.; Liaw, P.K.; Hou, H.; Zhao, Y. Predict the phase formation of high-entropy alloys by compositions. J. Mater. Res. Technol. 2023, 22, 3331–3339. [Google Scholar] [CrossRef]
  43. Zhang, Y.-F.; Ren, W.; Wang, W.-L.; Li, N.; Zhang, Y.-X.; Li, X.-M.; Li, W.-H. Interpretable hardness prediction of high-entropy alloys through ensemble learning. J. Alloys Compd. 2023, 945, 169329. [Google Scholar] [CrossRef]
  44. Rao, Z.; Tung, P.-Y.; Xie, R.; Wei, Y.; Zhang, H.; Ferrari, A.; Klaver, T.P.C.; Körmann, F.; Sukumar, P.T.; Kwiatkowski da Silva, A.; et al. Machine learning–enabled high-entropy alloy discovery. Science 2022, 378, 78–85. [Google Scholar] [CrossRef]
  45. Hareharen, K.; Panneerselvam, T.; Raj Mohan, R. Improving the performance of machine learning model predicting phase and crystal structure of high entropy alloys by the synthetic minority oversampling technique. J. Alloys Compd. 2024, 991, 174494. [Google Scholar] [CrossRef]
  46. Chen, C.; Han, X.; Zhang, Y.; Liaw, P.K.; Ren, J. Phase prediction of high-entropy alloys based on machine learning and an improved information fusion approach. Comput. Mater. Sci. 2024, 239, 112976. [Google Scholar] [CrossRef]
  47. Veeresham, M.; Sake, N.; Lee, U.; Park, N. Unraveling phase prediction in high entropy alloys: A synergy of machine learning, deep learning, and ThermoCalc, validation by experimental analysis. J. Mater. Res. Technol. 2024, 29, 1744–1755. [Google Scholar] [CrossRef]
  48. Mishra, A.; Kompella, L.; Sanagavarapu, L.M.; Varam, S. Ensemble-based machine learning models for phase prediction in high entropy alloys. Comput. Mater. Sci. 2022, 210, 111025. [Google Scholar] [CrossRef]
  49. Han, Q.; Lu, Z.; Zhao, S.; Su, Y.; Cui, H. Data-driven based phase constitution prediction in high entropy alloys. Comput. Mater. Sci. 2022, 215, 111774. [Google Scholar] [CrossRef]
  50. Peivaste, I.; Jossou, E.; Tiamiyu, A.A. Data-driven analysis and prediction of stable phases for high-entropy alloy design. Sci. Rep. 2023, 13, 22556. [Google Scholar] [CrossRef]
  51. Oñate, A.; Sanhueza, J.P.; Zegpi, D.; Tuninetti, V.; Ramirez, J.; Medina, C.; Melendrez, M.; Rojas, D. Supervised machine learning-based multi-class phase prediction in high-entropy alloys using robust databases. J. Alloys Compd. 2023, 962, 171224. [Google Scholar] [CrossRef]
  52. Brown, P.; Zhuang, H. Quantum machine-learning phase prediction of high-entropy alloys. Mater. Today 2023, 63, 18–31. [Google Scholar] [CrossRef]
  53. Huang, W.; Martin, P.; Zhuang, H.L. Machine-learning phase prediction of high-entropy alloys. Acta Mater. 2019, 169, 225–236. [Google Scholar] [CrossRef]
  54. Krishna, Y.V.; Jaiswal, U.K.; Rahul, R.M. Machine learning approach to predict new multiphase high entropy alloys. Scr. Mater. 2021, 197, 113804. [Google Scholar] [CrossRef]
  55. Zhou, Z.; Zhou, Y.; He, Q.; Ding, Z.; Li, F.; Yang, Y. Machine learning guided appraisal and exploration of phase design for high entropy alloys. npj Comput. Mater. 2019, 5, 128. [Google Scholar] [CrossRef]
  56. Zhu, W.; Huo, W.; Wang, S.; Wang, X.; Ren, K.; Tan, S.; Fang, F.; Xie, Z.; Jiang, J. Phase formation prediction of high-entropy alloys: A deep learning study. J. Mater. Res. Technol. 2022, 18, 800–809. [Google Scholar] [CrossRef]
  57. Zhou, C.; Zhang, Y.; Xin, H.; Li, X.; Chen, X. Complex multiphase predicting of additive manufactured high entropy alloys based on data augmentation deep learning. J. Mater. Res. Technol. 2024, 28, 2388–2401. [Google Scholar] [CrossRef]
  58. Vishwakarma, D.; Neigapula, V.S.N. Prediction of phase via machine learning in high entropy alloys. Mater. Today Proc. 2023, 112. [Google Scholar] [CrossRef]
  59. Chang, H.; Tao, Y.; Liaw, P.K.; Ren, J. Phase prediction and effect of intrinsic residual strain on phase stability in high-entropy alloys with machine learning. J. Alloys Compd. 2022, 921, 166149. [Google Scholar] [CrossRef]
  60. Qu, N.; Liu, Y.; Zhang, Y.; Yang, D.; Han, T.; Liao, M.; Lai, Z.; Zhu, J.; Zhang, L. Machine learning guided phase formation prediction of high entropy alloys. Mater. Today Commun. 2022, 32, 104146. [Google Scholar] [CrossRef]
  61. Risal, S.; Zhu, W.; Guillen, P.; Sun, L. Improving phase prediction accuracy for high entropy alloys with Machine learning. Comput. Mater. Sci. 2021, 192, 110389. [Google Scholar] [CrossRef]
  62. Bobbili, R.; Ramakrishna, B. Prediction of phases in high entropy alloys using machine learning. Mater. Today Commun. 2023, 36, 106674. [Google Scholar] [CrossRef]
  63. Zhang, C.; Zhang, F.; Diao, H.; Gao, M.C.; Tang, Z.; Poplawsky, J.D.; Liaw, P.K. Understanding phase stability of Al-Co-Cr-Fe-Ni high entropy alloys. Mater. Des. 2016, 109, 425–433. [Google Scholar]
  64. George, E.P.; Raabe, D.; Ritchie, R.O. High-entropy alloys. Nat. Rev. Mater. 2019, 4, 515–534. [Google Scholar] [CrossRef]
  65. Andersson, J.-O.; Helander, T.; Höglund, L.; Shi, P.; Sundman, B. Thermo-Calc & DICTRA, computational tools for materials science. Calphad 2002, 26, 273–312. [Google Scholar] [CrossRef]
  66. Raturi, A.; Aditya, J.; Gurao, N.P.; Biswas, K. ICME approach to explore equiatomic and non-equiatomic single phase BCC refractory high entropy alloys. J. Alloys Compd. 2019, 806, 587–595. [Google Scholar] [CrossRef]
  67. Ng, C.; Guo, S.; Luan, J.; Shi, S.; Liu, C. Entropy-driven phase stability and slow diffusion kinetics in an Al0.5CoCrCuFeNi high entropy alloy. Intermetallics 2012, 31, 165–172. [Google Scholar] [CrossRef]
  68. Ma, D.; Yao, M.; Pradeep, K.; Tasan, C.C.; Springer, H.; Raabe, D. Phase stability of non-equiatomic CoCrFeMnNi high entropy alloys. Acta Mater. 2015, 98, 288–296. [Google Scholar] [CrossRef]
  69. Qu, N.; Chen, Y.; Lai, Z.; Liu, Y.; Zhu, J. The phase selection via machine learning in high entropy alloys. Procedia Manuf. 2019, 37, 299–305. [Google Scholar] [CrossRef]
  70. Zeng, Y.; Man, M.; Bai, K.; Zhang, Y.-W. Revealing high-fidelity phase selection rules for high entropy alloys: A combined CALPHAD and machine learning study. Mater. Des. 2021, 202, 109532. [Google Scholar] [CrossRef]
  71. He, L.; Wang, C.; Zhang, M.; Li, J.; Chen, T.; Zhou, X. Design of BCC/FCC dual-solid solution refractory high-entropy alloys through CALPHAD, machine learning and experimental methods. npj Comput. Mater. 2025, 11, 105. [Google Scholar] [CrossRef]
  72. Qian, J.; Guo, X.; Deng, Y. A novel method for combining conflicting evidences based on information entropy. Appl. Intell. 2016, 46, 876–888. [Google Scholar] [CrossRef]
  73. Hou, S.; Sun, M.; Bai, M.; Lin, D.; Li, Y.; Liu, W. A hybrid prediction frame for HEAs based on empirical knowledge and machine learning. Acta Mater. 2022, 228, 117742. [Google Scholar] [CrossRef]
  74. Dhamankar, S.; Jiang, S.; Webb, M.A. Accelerating multicomponent phase-coexistence calculations with physics-informed neural networks. Mol. Syst. Des. Eng. 2025, 10, 89–101. [Google Scholar] [CrossRef]
  75. Hammad, R.; Mondal, S. Advancements in thermochemical predictions: A multi-output thermodynamics-informed neural network approach. J. Cheminform. 2025, 17, 95. [Google Scholar] [CrossRef]
  76. Rittig, J.G.; Mitsos, A. Thermodynamics-consistent graph neural networks. Chem. Sci. 2024, 15, 18504–18512. [Google Scholar] [CrossRef]
  77. Palmer, G.; Du, S.; Politowicz, A.; Emory, J.P.; Yang, X.; Gautam, A.; Gupta, G.; Li, Z.; Jacobs, R.; Morgan, D. Calibration after bootstrap for accurate uncertainty quantification in regression models. npj Comput. Mater. 2022, 8, 115. [Google Scholar] [CrossRef]
  78. Li, K.; Rubungo, A.N.; Lei, X.; Persaud, D.; Choudhary, K.; DeCost, B.; Dieng, A.B.; Hattrick-Simpers, J. Probing out-of-distribution generalization in machine learning for materials. Commun. Mater. 2025, 6, 9. [Google Scholar] [CrossRef]
  79. Zhu, S.; Sarıtürk, D.; Arróyave, R. Accelerating CALPHAD-based phase diagram predictions in complex alloys using universal machine learning potentials: Opportunities and challenges. Acta Mater. 2025, 286, 120747. [Google Scholar] [CrossRef]
  80. Chen, Q.; He, Z.; Zhao, Y.; Liu, X.; Wang, D.; Zhong, Y.; Hu, C.; Hao, C.; Lu, K.; Wang, Z. Stacking ensemble learning assisted design of Al-Nb-Ti-V-Zr lightweight high-entropy alloys with high hardness. Mater. Des. 2024, 246, 113363. [Google Scholar] [CrossRef]
  81. Jain, S.; Jain, R.; Dewangan, S.; Bhowmik, A.A. Machine learning perspective on hardness prediction in multicomponent Al-Mg based lightweight alloys. Mater. Lett. 2024, 365, 136473. [Google Scholar] [CrossRef]
  82. Chuang, M.H.; Tsai, M.H.; Wang, W.R.; Lin, S.J.; Yeh, J.W. Microstructure and wear behavior of AlxCo1.5CrFeNi1.5Tiy high-entropy alloys. Acta Mater. 2011, 59, 6308–6317. [Google Scholar] [CrossRef]
  83. Lu, Y.P.; Dong, Y.; Guo, S.; Jiang, L.; Kang, H.J.; Wang, T.M.; Wen, B.; Wang, Z.J.; Jie, J.C.; Cao, Z.Q.; et al. A promising new class of high-temperature alloys: Eutectic high-entropy alloys. Sci. Rep. 2014, 4, 6200. [Google Scholar] [CrossRef]
  84. Ye, Y.; Liu, C.; Wang, H.; Nieh, T. Friction and wear behavior of a single-phase equiatomic TiZrHfNb high-entropy alloy studied using a nanoscratch technique. Acta Mater. 2018, 147, 78–89. [Google Scholar] [CrossRef]
  85. Jones, M.R.; Nation, B.L.; Wellington-Johnson, J.A.; Curry, J.F.; Kustas, A.B.; Lu, P.; Chandross, M.; Argibay, N. Evidence of Inverse Hall-Petch Behavior and Low Friction and Wear in High Entropy Alloys. Sci. Rep. 2020, 10, 10151. [Google Scholar] [CrossRef]
  86. Lv, Y.; Lang, X.; Zhang, Q.; Liu, W.; Liu, Y. Study on corrosion behavior of (CuZnMnNi)100−xSnx high-entropy brass alloy in 5 wt% NaCl solution. J. Alloys Compd. 2022, 921, 166051. [Google Scholar] [CrossRef]
  87. Luo, H.; Sohn, S.S.; Lu, W.J.; Li, L.L.; Li, X.G.; Soundararajan, C.K.; Krieger, W.; Li, Z.M.; Raabe, D. A strong and ductile medium-entropy alloy resists hydrogen embrittlement and corrosion. Nat. Commun. 2020, 11, 3081. [Google Scholar] [CrossRef]
  88. Gawel, R.; Rogal, Ł.; Dąbek, J.; Wójcik-Bania, M.; Przybylski, K. High temperature oxidation behaviour of non-equimolar AlCoCrFeNi high entropy alloys. Vacuum 2021, 184, 109969. [Google Scholar] [CrossRef]
  89. Naik, S.N.; Walley, S.M. The Hall–Petch and inverse Hall–Petch relations and the hardness of nanocrystalline metals. J. Mater. Sci. 2019, 55, 2661–2681. [Google Scholar] [CrossRef]
  90. Guo, Q.; Pan, Y.; Hou, H.; Zhao, Y. Predicting the hardness of high-entropy alloys based on compositions. Int. J. Refract. Met. Hard Mater. 2023, 112, 106116. [Google Scholar] [CrossRef]
  91. Yang, C.; Ren, C.; Jia, Y.; Wang, G.; Li, M.; Lu, W. A machine learning-based alloy design system to facilitate the rational design of high entropy alloys with enhanced hardness. Acta Mater. 2022, 222, 117431. [Google Scholar] [CrossRef]
  92. Gao, T.; Gao, J.; Yang, S.; Zhang, L. Data-driven design of novel lightweight refractory high-entropy alloys with superb hardness and corrosion resistance. npj Comput. Mater. 2024, 10, 80. [Google Scholar] [CrossRef]
  93. Bundela, A.S.; Rahul, M.R. Machine learning-enabled framework for the prediction of mechanical properties in new high entropy alloys. J. Alloys Compd. 2022, 908, 164578. [Google Scholar] [CrossRef]
  94. Dewangan, S.K.; Samal, S.; Kumar, V. Development of an ANN-based generalized model for hardness prediction of SPSed AlCoCrCuFeMnNiW containing high entropy alloys. Mater. Today Commun. 2021, 27, 102356. [Google Scholar] [CrossRef]
  95. Li, S.; Li, S.; Liu, D.; Yang, J.; Zhang, M. Hardness prediction of high entropy alloys with periodic table representation of composition, processing, structure and physical parameters. J. Alloys Compd. 2023, 967, 171735. [Google Scholar] [CrossRef]
  96. Huang, X.; Jin, C.; Zhang, C.; Zhang, H.; Fu, H. Machine learning assisted modelling and design of solid solution hardened high entropy alloys. Mater. Des. 2021, 211, 110177. [Google Scholar] [CrossRef]
  97. Cantor, B. Multicomponent high-entropy cantor alloys. Prog. Mater Sci. 2020, 120, 100754. [Google Scholar] [CrossRef]
  98. Senkov, O.N.; Miller, J.D.; Miracle, D.B.; Woodward, C. Accelerated exploration of multi-principal element alloys with solid solution phases. Nat. Commun. 2015, 6, 6529. [Google Scholar] [CrossRef]
  99. Toda-Caraballo, I.; Rivera-Díaz-del Castillo, P.E.J. Modelling solid solution hardening in high entropy alloys. Acta Mater. 2015, 85, 14–23. [Google Scholar] [CrossRef]
  100. Toda-Caraballo, I. A general formulation for solid solution hardening effect in multicomponent alloys. Scr. Mater. 2017, 127, 113–117. [Google Scholar] [CrossRef]
  101. Kim, G.; Diao, H.; Lee, C.; Samaei, A.T.; Phan, T.; de Jong, M.; An, K.; Ma, D.; Liaw, P.K.; Chen, W. First-principles and machine learning predictions of elasticity in severely lattice-distorted high-entropy alloys with experimental validation. Acta Mater. 2019, 181, 124–138. [Google Scholar] [CrossRef]
  102. Kandavalli, M.; Agarwal, A.; Poonia, A.; Kishor, M.; Ayyagari, K.P.R. Design of high bulk moduli high entropy alloys using machine learning. Sci. Rep. 2023, 13, 20504. [Google Scholar] [CrossRef]
  103. Xiao, J.; Yan, B. First-principles calculations for topological quantum materials. Nat. Rev. Phys. 2021, 3, 283–297. [Google Scholar] [CrossRef]
  104. Gao, Y.; Bai, S.; Chong, K.; Liu, C.; Cao, Y.; Zou, Y. Machine learning accelerated design of non-equiatomic refractory high entropy alloys based on first principles calculation. Vacuum 2023, 207, 111608. [Google Scholar] [CrossRef]
  105. Zhang, L.; Qian, K.; Huang, J.; Liu, M.; Shibuta, Y. Molecular dynamics simulation and machine learning of mechanical response in non-equiatomic FeCrNiCoMn high-entropy alloy. J. Mater. Res. Technol. 2021, 13, 2043–2054. [Google Scholar] [CrossRef]
  106. Jiang, L.; Yang, F.; Zhang, M.; Yang, Z. Composition optimization of AlFeCuSiMg alloys based on elastic modules: A combination method of machine learning and molecular dynamics simulation. Mater. Today Commun. 2023, 37, 107584. [Google Scholar] [CrossRef]
  107. Vazquez, G.; Singh, P.; Sauceda, D.; Couperthwaite, R.; Britt, N.; Youssef, K.; Johnson, D.D.; Arróyave, R. Efficient machine-learning model for fast assessment of elastic properties of high-entropy alloys. Acta Mater. 2022, 232, 117924. [Google Scholar] [CrossRef]
  108. Zhang, J.; Cai, C.; Kim, G.; Wang, Y.; Chen, W. Composition design of high-entropy alloys with deep sets learning. npj Comput. Mater. 2022, 8, 89. [Google Scholar] [CrossRef]
  109. Li, Z.; Pradeep, K.G.; Deng, Y.; Raabe, D.; Tasan, C.C. Metastable high-entropy dual-phase alloys overcome the strength–ductility trade-off. Nature 2016, 534, 227–230. [Google Scholar] [CrossRef] [PubMed]
  110. Lei, Z.; Liu, X.; Wu, Y.; Wang, H.; Jiang, S.; Wang, S.; Hui, X.; Wu, Y.; Gault, B.; Kontis, P.; et al. Enhanced strength and ductility in a high-entropy alloy via ordered oxygen complexes. Nature 2018, 563, 546–550. [Google Scholar] [CrossRef] [PubMed]
  111. Cheng, H.; He, Z.; Ge, M.; Che, L.; Zheng, K.; Si, T.; Zhao, F. Composition design and optimization of Fe–C–Mn–Al steel based on machine learning. Phys. Chem. Chem. Phys. 2024, 26, 8219–8227. [Google Scholar] [CrossRef]
  112. Ding, Q.; Zhang, Y.; Chen, X.; Fu, X.; Chen, D.; Chen, S.; Gu, L.; Wei, F.; Bei, H.; Gao, Y.; et al. Tuning element distribution, structure and properties by composition in high-entropy alloys. Nature 2019, 574, 223–227. [Google Scholar] [CrossRef]
  113. Elgack, O.; Almomani, B.; Syarif, J.; Elazab, M.; Irshaid, M.; Al-Shabi, M. Molecular dynamics simulation and machine learning-based analysis for predicting tensile properties of high-entropy FeNiCrCoCu alloys. J. Mater. Res. Technol. 2023, 25, 5575–5585. [Google Scholar] [CrossRef]
  114. Tan, X.; Chen, D.; Xiao, H.; Lu, Q.; Wang, Z.; Chen, H.; Peng, X.; Zhang, W.; Liu, Z.; Guo, L.; et al. Prediction of phase and tensile properties of selective laser melting manufactured high entropy alloys by machine learning. Mater. Today Commun. 2024, 41, 110209. [Google Scholar] [CrossRef]
  115. Li, L.; Fang, Q.; Li, J.; Liu, B.; Liu, Y.; Liaw, P.K. Lattice-distortion dependent yield strength in high entropy alloys. Mater Sci. Eng. A. 2020, 784, 139323. [Google Scholar] [CrossRef]
  116. Steingrimsson, B.; Fan, X.; Feng, R.; Liaw, P. A physics-based machine-learning approach for modeling the temperature-dependent yield strengths of medium- or high-entropy alloys. Appl. Mater. Today 2023, 31, 101747. [Google Scholar] [CrossRef]
  117. Veeresham, M.; Jain, R.; Lee, U.; Park, N. Machine learning approach for predicting yield strength of nitrogen-doped CoCrFeMnNi high entropy alloys at selective thermomechanical processing conditions. J. Mater. Res. Technol. 2023, 24, 2621–2628. [Google Scholar] [CrossRef]
  118. Ding, S.; Wang, W.; Zhang, Y.; Ren, W.; Weng, X.; Chen, J. A yield strength prediction framework for refractory high-entropy alloys based on machine learning. Int. J. Refract. Met. Hard Mater. 2024, 125, 106884. [Google Scholar] [CrossRef]
  119. Bhandari, U.; Rafi Md, R.; Zhang, C.; Yang, S. Yield strength prediction of high-entropy alloys using machine learning. Mater. Today Commun. 2020, 26, 101871. [Google Scholar] [CrossRef]
  120. Sohail, Y.; Zhang, C.; Xue, D.; Zhang, J.; Zhang, D.; Gao, S.; Yang, Y.; Fan, X.; Zhang, H.; Liu, G.; et al. Machine-learning design of ductile FeNiCoAlTa alloys with high strength. Nature 2025, 643, 119–124. [Google Scholar] [CrossRef]
  121. Kutz, M. Handbook of Environmental Degradation of Materials; William Andrew: Norwich, NY, USA, 2018. [Google Scholar]
  122. Li, K.; Huang, T.; Gao, Y.; Zhou, C. Enhancing antioxidant properties of hydrogen storage alloys using PMMA coating. Int. J. Hydrogen Energy 2023, 48, 4339–4348. [Google Scholar] [CrossRef]
  123. Dong, Z.; Sun, A.; Yang, S.; Yu, X.; Yuan, H.; Wang, Z.; Deng, L.; Song, J.; Wang, D.; Kang, Y. Machine learning-assisted discovery of Cr, Al-containing high-entropy alloys for high oxidation resistance. Corros. Sci. 2023, 220, 111222. [Google Scholar] [CrossRef]
  124. Li, R.; Song, X.; Duan, Z.; Hao, Z.; Yang, Y.; Han, Y.; Ran, X.; Liu, Y. Improving the high-temperature ductility of γ-TiAl matrix composites by incorporation of AlCoCrFeNi high entropy alloy particles. J. Alloys Compd. 2025, 1012, 178515. [Google Scholar] [CrossRef]
  125. Yang, M.L.; Xu, J.L.; Huang, J.; Zhang, L.W.; Luo, J.M. Wear Resistance of N-Doped CoCrFeNiMn High Entropy Alloy Coating on the Ti-6Al-4V Alloy. J. Therm. Spray Technol. 2024, 33, 2408–2418. [Google Scholar] [CrossRef]
  126. Jain, R.; Jain, S.; Nagarjuna, C.; Samal, S.; Rananavare, A.P.; Dewangan, S.K.; Ahn, B. A Comprehensive Review on Hot Deformation Behavior of High-Entropy Alloys for High Temperature Applications. Met. Mater. Int. 2025, 31, 2181–2213. [Google Scholar] [CrossRef]
  127. Yan, Y.; Lu, D.; Wang, K. Accelerated discovery of single-phase refractory high entropy alloys assisted by machine learning. Comput. Mater. Sci. 2021, 199, 110723. [Google Scholar] [CrossRef]
  128. Birbilis, N.; Choudhary, S.; Scully, J.R.; Taheri, M.L. A perspective on corrosion of multi-principal element alloys. npj Mater. Degrad. 2021, 5, 14. [Google Scholar] [CrossRef]
  129. Fu, Y.; Li, J.; Luo, H.; Du, C.W.; Li, X.G. Recent advances on environmental corrosion behavior and mechanism of high-entropy alloys. J. Mater. Sci. Technol. 2021, 80, 217–233. [Google Scholar] [CrossRef]
  130. Qiu, Y.; Thomas, S.; Gibson, M.A.; Fraser, H.L.; Birbilis, N. Corrosion of high entropy alloys. npj Mater. Degrad. 2017, 1, 15. [Google Scholar] [CrossRef]
  131. Ozdemir, H.; Nazarahari, A.; Yilmaz, B.; Canadinc, D.; Bedir, E.; Yilmaz, R.; Unal, U.; Maier, H. Machine learning–Informed development of high entropy alloys with enhanced corrosion resistance. Electrochimica Acta 2023, 476, 143722. [Google Scholar] [CrossRef]
  132. Slepski, P.; Szocinski, M.; Lentka, G.; Darowicki, K. Novel fast non-linear electrochemical impedance method for corrosion investigations. Measurement 2021, 173, 108667. [Google Scholar] [CrossRef]
  133. Wang, C.; Li, W.; Wang, Y.; Yang, X.; Xu, S. Study of electrochemical corrosion on Q235A steel under stray current excitation using combined analysis by electrochemical impedance spectroscopy and artificial neural network. Constr. Build. Mater. 2020, 247, 118562. [Google Scholar] [CrossRef]
  134. Wei, B.; Xu, J.; Pang, J.; Huang, Z.; Wu, J.; Cai, Z.; Yan, M.; Sun, C. Prediction of electrochemical impedance spectroscopy of high-entropy alloys corrosion by using gradient boosting decision tree. Mater. Today Commun. 2022, 32, 104047. [Google Scholar] [CrossRef]
  135. Jain, R.; Rahul, M.R.; Chakraborty, P.; Sabat, R.K.; Samal, S.; Park, N.; Phanikumar, G.; Tewari, R. Integrated experimental and modeling approach for hot deformation behavior of Co–Cr–Fe–Ni–V high entropy alloy. J. Mater. Res. Technol. 2023, 25, 840–854. [Google Scholar] [CrossRef]
  136. Dewangan, S.K.; Jain, R.; Bhattacharjee, S.; Jain, S.; Paswan, M.; Samal, S.; Ahn, B. Enhancing flow stress predictions in CoCrFeNiV high entropy alloy with conventional and machine learning techniques. J. Mater. Res. Technol. 2024, 30, 2377–2387. [Google Scholar] [CrossRef]
  137. Dewangan, S.K.; Sharma, A.; Lee, H.; Kumar, V.; Ahn, B. Prediction of nanoindentation creep behavior of tungsten-containing high entropy alloys using artificial neural network trained with Levenberg–Marquardt algorithm. J. Alloys Compd. 2023, 958, 170359. [Google Scholar] [CrossRef]
  138. Jain, S.; Jain, R.; Rao, K.R.; Bhowmik, A. Leveraging machine learning to minimize experimental trials and predict hot deformation behaviour in dual phase high entropy alloys. Mater. Today Commun. 2024, 41, 110813. [Google Scholar] [CrossRef]
  139. Jain, R.; Jain, S.; Dewangan, S.K.; Rahul, M.R.; Samal, S.; Song, E.; Lee, Y.; Jeon, Y.; Biswas, K.; Phanikumar, G.; et al. Machine-learning-driven prediction of flow curves and development of processing maps for hot-deformed Ni–Cu–Co–Ti–Ta alloy. J. Mater. Res. Technol. 2025, 36, 7447–7456. [Google Scholar] [CrossRef]
  140. He, Z.; Zhang, H.; Cheng, H.; Ge, M.; Si, T.; Che, L.; Zheng, K.; Zeng, L.; Wang, Q. Machine learning guided BCC or FCC phase prediction in high entropy alloys. J. Mater. Res. Technol. 2024, 29, 3477–3486. [Google Scholar] [CrossRef]
  141. Xu, K.; Sun, Z.; Tu, J.; Wu, W.; Yang, H. Intelligent design of Fe–Cr–Ni–Al/Ti multi-principal element alloys based on machine learning. J. Mater. Res. Technol. 2025, 35, 6864–6873. [Google Scholar] [CrossRef]
  142. Lu, Z.; Ma, D.; Liu, X.; Lu, Z. High-throughput and data-driven machine learning techniques for discovering high-entropy alloys. Commun. Mater. 2024, 5, 76. [Google Scholar] [CrossRef]
  143. Zhao, S.; Li, J.; Wang, J.; Lookman, T.; Yuan, R. Closed-loop inverse design of high entropy alloys using symbolic regression-oriented optimization. Mater. Today 2025, 88, 263–271. [Google Scholar] [CrossRef]
  144. Zhang, L.; Chen, H.; Tao, X.; Cai, H.; Liu, J.; Ouyang, Y.; Peng, Q.; Du, Y. Machine learning reveals the importance of the formation enthalpy and atom-size difference in forming phases of high entropy alloys. Mater. Des. 2020, 193, 108835. [Google Scholar] [CrossRef]
  145. Jaiswal, U.K.; Vamsi Krishna, Y.; Rahul, M.R.; Phanikumar, G. Machine learning-enabled identification of new medium to high entropy alloys with solid solution phases. Comput. Mater. Sci. 2021, 197, 110623. [Google Scholar] [CrossRef]
  146. Syarif, J.; Elbeltagy, M.B.; Nassif, A.B. A machine learning framework for discovering high entropy alloys phase formation drivers. Heliyon 2023, 9, e12859. [Google Scholar] [CrossRef]
  147. Xu, B.; Zhang, J.; Ma, S.; Xiong, Y.; Huang, S.; Kai, J.J.; Zhao, S. Revealing the crucial role of rough energy landscape on self-diffusion in high-entropy alloys based on machine learning and kinetic Monte Carlo. Acta Mater. 2022, 234, 118051. [Google Scholar] [CrossRef]
  148. Huang, W.; Farkas, D.; Bai, X.-M. High-throughput machine learning-Kinetic Monte Carlo framework for diffusion studies in Equiatomic and Non-equiatomic FeNiCrCoCu high-entropy alloys. Materialia 2023, 32, 101966. [Google Scholar] [CrossRef]
  149. Wan, X.; Zhang, Z.; Yu, W.; Niu, H.; Wang, X.; Guo, Y. Machine-learning-assisted discovery of highly efficient high-entropy alloy catalysts for the oxygen reduction reaction. Patterns 2022, 3, 100553. [Google Scholar] [CrossRef]
  150. Liu, F.; Xiao, X.; Huang, L.; Tan, L.; Liu, Y. Design of NiCoCrAl eutectic high entropy alloys by combining machine learning with CALPHAD method. Mater. Today Commun. 2022, 30, 103172. [Google Scholar] [CrossRef]
  151. Zhao, S.; Jiang, B.; Song, K.; Liu, X.; Wang, W.; Si, D.; Zhang, J.; Chen, X.; Zhou, C.; Liu, P.; et al. Machine learning assisted design of high-entropy alloys with ultra-high microhardness and unexpected low density. Mater. Des. 2024, 238, 112634. [Google Scholar] [CrossRef]
  152. Liu, X.; Zhu, Y.; Wang, C.; Han, K.; Zhao, L.; Liang, S.; Huang, M.; Li, Z. A statistics-based study and machine-learning of stacking fault energies in HEAs. J. Alloys Compd. 2023, 966, 171547. [Google Scholar] [CrossRef]
  153. Zhang, Y.; Wen, C.; Dang, P.; Jiang, X.; Xue, D.; Su, Y. Elemental numerical descriptions to enhance classification and regression model performance for high-entropy alloys. npj Comput. Mater. 2025, 11, 75. [Google Scholar] [CrossRef]
  154. Ramakrishna, S.; Zhang, T.; Lu, W.; Qian, Q.; Low, J.S.; Yune, J.H.; Tan, D.Z.; Bressan, S.; Sanvito, S.; Kalidindi, S.R. Materials informatics. J. Intell. Manuf. 2018, 30, 2307–2326. [Google Scholar] [CrossRef]
  155. Lookman, T.; Balachandran, P.V.; Xue, D.; Yuan, R. Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design. npj Comput. Mater. 2019, 5, 21. [Google Scholar] [CrossRef]
  156. Wei, Q.; Cao, B.; Deng, L.; Sun, A.; Dong, Z.; Zhang, T.-Y. Discovering a formula for the high temperature oxidation behavior of FeCrAlCoNi based high entropy alloys by domain knowledge-guided machine learning. J. Mater. Sci. Technol. 2023, 149, 237–246. [Google Scholar] [CrossRef]
  157. Huang, X.; Zheng, L.; Xu, H.; Fu, H. Predicting and understanding the ductility of BCC high entropy alloys via knowledge-integrated machine learning. Mater. Des. 2024, 239, 112797. [Google Scholar] [CrossRef]
  158. Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. In Proceedings of the 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, 14–16 April 2014. Conference Track Proceedings. [Google Scholar] [CrossRef]
  159. Rocchetto, A.; Grant, E.; Strelchuk, S.; Carleo, G.; Severini, S. Learning hard quantum distributions with variational autoencoders. npj Quantum Inf. 2018, 4, 28. [Google Scholar] [CrossRef]
  160. Yin, J.; Pei, Z.; Gao, M.C. Neural network-based order parameter for phase transitions and its applications in high-entropy alloys. Nat. Comput. Sci. 2021, 1, 686–693. [Google Scholar] [CrossRef]
  161. Saal, J.E.; Berglund, I.S.; Sebastian, J.T.; Liaw, P.K.; Olson, G.B. Equilibrium high entropy alloy phase stability from experiments and thermodynamic modeling. Scr. Mater. 2018, 146, 5–8. [Google Scholar] [CrossRef]
  162. Wang, Q.; Yao, Y. Harnessing machine learning for high-entropy alloy catalysis: A focus on adsorption energy prediction. npj Comput. Mater. 2025, 11, 91. [Google Scholar] [CrossRef]
  163. Kaufmann, K.; Vecchio, K.S. Searching for high entropy alloys: A machine learning approach. Acta Mater. 2020, 198, 178–222. [Google Scholar] [CrossRef]
  164. Nosratabadi, S.; Mosavi, A.; Duan, P.; Ghamisi, P.; Filip, F.; Band, S.S.; Reuter, U.; Gama, J.; Gandomi, A.H. Data science in economics: Comprehensive review of advanced machine learning and deep learning methods. Mathematics 2020, 8, 1799. [Google Scholar] [CrossRef]
  165. Hong, Y.; Hou, B.; Jiang, H.; Zhang, J. Machine learning and artificial neural network accelerated computational discoveries in materials science. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2020, 10, e1450. [Google Scholar] [CrossRef]
  166. Bhadeshia, H. Neural networks and information in materials science. Stat. Anal. Data Min. ASA Data Sci. J. 2009, 1, 296–305. [Google Scholar] [CrossRef]
  167. Wang, J.; Kwon, H.; Kim, H.S.; Lee, B.-J. A neural network model for high entropy alloy design. npj Comput. Mater. 2023, 9, 60. [Google Scholar] [CrossRef]
  168. He, Q.; Ye, Y.; Yang, Y. The configurational entropy of mixing of metastable random solid solution in complex multicomponent alloys. J. Appl. Phys. 2016, 120, 154902. [Google Scholar] [CrossRef]
  169. An, S.; Su, R.; Hu, Y.-C.; Liu, J.; Yang, Y.; Liu, B.; Guan, P. Common mechanism for controlling polymorph selection during crystallization in supercooled metallic liquids. Acta Mater. 2018, 161, 367–373. [Google Scholar] [CrossRef]
  170. He, Q.; Ding, Z.; Ye, Y.; Yang, Y. Design of high-entropy alloy: A perspective from nonideal mixing. JOM 2017, 69, 2092–2098. [Google Scholar] [CrossRef]
  171. Stillinger, F.H. A topographic view of supercooled liquids and glass formation. Science 1995, 267, 1935–1939. [Google Scholar] [CrossRef] [PubMed]
  172. Debenedetti, P.G.; Stillinger, F.H. Supercooled liquids and the glass transition. Nature 2001, 410, 259–267. [Google Scholar] [CrossRef]
  173. Vazquez, G.; Chakravarty, S.; Gurrola, R.; Arróyave, R. A deep neural network regressor for phase constitution estimation in the high entropy alloy system Al-Co-Cr-Fe-Mn-Nb-Ni. npj Comput. Mater. 2023, 9, 68. [Google Scholar] [CrossRef]
  174. Lee, C.-Y.; Jui, C.-Y.; Yeh, A.-C.; Chang, Y.-J.; Lee, W.-J. Inverse design of high entropy alloys using a deep interpretable scheme for materials attribution analysis. J. Alloys Compd. 2024, 976, 173144. [Google Scholar] [CrossRef]
  175. Balachandran, P.V.; Kowalski, B.; Sehirlioglu, A.; Lookman, T. Experimental search for high-temperature ferroelectric perovskites guided by two-step machine learning. Nat. Commun. 2018, 9, 1668. [Google Scholar] [CrossRef]
  176. Gubernatis, J.E.; Lookman, T. Machine learning in materials design and discovery: Examples from the present and suggestions for the future. Phys. Rev. Mater. 2018, 2, 120301. [Google Scholar] [CrossRef]
  177. Xue, D.; Balachandran, P.V.; Hogden, J.; Theiler, J.; Xue, D.; Lookman, T. Accelerated search for materials with targeted properties by adaptive design. Nat. Commun. 2016, 7, 11241. [Google Scholar] [CrossRef] [PubMed]
  178. Yuan, R.; Liu, Z.; Balachandran, P.V.; Xue, D.; Zhou, Y.; Ding, X.; Sun, J.; Xue, D.; Lookman, T. Accelerated Discovery of Large Electrostrains in BaTiO3 -Based Piezoelectrics Using Active Learning. Adv. Mater. 2018, 30, 1702884. [Google Scholar] [CrossRef] [PubMed]
  179. Li, H.; Yuan, R.; Liang, H.; Wang, W.Y.; Li, J.; Wang, J. Towards high entropy alloy with enhanced strength and ductility using domain knowledge constrained active learning. Mater. Des. 2022, 223, 111186. [Google Scholar] [CrossRef]
  180. Lu, Y.; Dong, Y.; Jiang, H.; Wang, Z.; Cao, Z.; Guo, S.; Wang, T.; Li, T.; Liaw, P.K. Promising properties and future trend of eutectic high entropy alloys. Scr. Mater. 2020, 187, 202–209. [Google Scholar] [CrossRef]
  181. Wu, Q.; Wang, Z.; Hu, X.; Zheng, T.; Yang, Z.; He, F.; Li, J.; Wang, J. Uncovering the eutectics design by machine learning in the Al–Co–Cr–Fe–Ni high entropy system. Acta Mater. 2020, 182, 278–286. [Google Scholar] [CrossRef]
  182. Zeng, Y.; Man, M.; Koon Ng, C.; Aitken, Z.; Bai, K.; Wuu, D.; Jun Lee, J.; Rong Ng, S.; Wei, F.; Wang, P.; et al. Search for eutectic high entropy alloys by integrating high-throughput CALPHAD, machine learning and experiments. Mater. Des. 2024, 241, 112929. [Google Scholar] [CrossRef]
  183. Li, S.; Chen, W.; Jain, S.; Jung, D.; Lee, J. Optimization of flow behavior models by genetic algorithm: A case study of aluminum alloy. J. Mater. Res. Technol. 2024, 31, 3349–3363. [Google Scholar] [CrossRef]
  184. Michalewicz, Z.; Schoenauer, M. Evolutionary Algorithms for Constrained Parameter Optimization Problems. Evol. Comput. 1996, 4, 1–32. [Google Scholar] [CrossRef]
  185. Butler, K.T.; Davies, D.W.; Cartwright, H.; Isayev, O.; Walsh, A. Machine learning for molecular and materials science. Nature 2018, 559, 547–555. [Google Scholar] [CrossRef] [PubMed]
  186. Raccuglia, P.; Elbert, K.; Adler, P.D.F.; Falk, C.; Wenny, M.; Mollo, A.; Zeller, M.; Friedler, S.A.; Schrier, J.; Norquist, A. Machine-learning-assisted materials discovery using failed experiments. Nature 2016, 533, 73–76. [Google Scholar] [CrossRef]
  187. Zhang, Y.; Wen, C.; Wang, C.; Antonov, S.; Xue, D.; Bai, Y.; Su, Y. Phase prediction in high entropy alloys with a rational selection of materials descriptors and machine learning models. Acta Mater. 2020, 185, 528–539. [Google Scholar] [CrossRef]
  188. Li, S.; Li, S.; Liu, D.; Zou, R.; Yang, Z. Hardness prediction of high entropy alloys with machine learning and material descriptors selection by improved genetic algorithm. Comput. Mater. Sci. 2022, 205, 111185. [Google Scholar] [CrossRef]
  189. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
  190. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative Adversarial Networks: An Overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
  191. Bousmalis, K.; Silberman, N.; Dohan, D.; Erhan, D.; Krishnan, D. Unsupervised pixel-level domain adaptation with generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
  192. Lee, S.Y.; Byeon, S.; Kim, H.S.; Jin, H.; Lee, S. Deep learning-based phase prediction of high-entropy alloys: Optimization, generation, and explanation. Mater. Des. 2021, 197, 109260. [Google Scholar] [CrossRef]
  193. Yang, Z.; Li, S.; Li, S.; Yang, J.; Liu, D. A two-step data augmentation method based on generative adversarial network for hardness prediction of high entropy alloy. Comput. Mater. Sci. 2023, 220, 112064. [Google Scholar] [CrossRef]
  194. Sun, Y.; Hou, C.; Tran, N.-D.; Lu, Y.; Li, Z.; Chen, Y.; Ni, J. EFTGAN: Elemental features and transferring corrected data augmentation for the study of high-entropy alloys. npj Comput. Mater. 2025, 11, 48. [Google Scholar] [CrossRef]
  195. Yu, W.; Qu, Y.; Li, C.; Li, Z.; Zhang, Y.; Guo, Y.; You, J.; Su, R. Phase selection and mechanical properties of (Al21.7Cr15.8Fe28.6Ni33.9)x(Al9.4Cr19.7Fe41.4Ni29.5)100–x high entropy alloys. Mater. Sci. Eng. A 2019, 751, 154–159. [Google Scholar] [CrossRef]
  196. Nassar, A.E.; Mullis, A.M. Rapid screening of high-entropy alloys using neural networks and constituent elements. Comput. Mater. Sci. 2021, 199, 110755. [Google Scholar] [CrossRef]
  197. Chen, H.; Kauffmann, A.; Laube, S.; Choi, I.-C.; Schwaiger, R.; Huang, Y.; Lichtenberg, K.; Müller, F.; Gorr, B.; Christ, H.-J.; et al. Contribution of lattice distortion to solid solution strengthening in a series of refractory high entropy alloys. Metall. Mater. Trans. A. 2017, 49, 772–781. [Google Scholar] [CrossRef]
  198. Fleischer, R.L. Substitutional solid solution hardening of titanium. Scr. Metall. 1987, 21, 1083–1085. [Google Scholar] [CrossRef]
  199. Labusch, R. A statistical theory of solid solution hardening. Phys. Status Solidi 1970, 41, 659–669. [Google Scholar] [CrossRef]
  200. Wen, C.; Wang, C.; Zhang, Y.; Antonov, S.; Xue, D.; Lookman, T.; Su, Y. Modeling solid solution strengthening in high entropy alloys using machine learning. Acta Mater. 2021, 212, 116917. [Google Scholar] [CrossRef]
  201. Wang, H.; Yang, K. Bayesian Optimization. In Many-Criteria Optimization and Decision Analysis; Brockhoff, D., Emmerich, M., Naujoks, B., Purshouse, R., Eds.; Natural Computing Series; Springer: Cham, Switzerland, 2023. [Google Scholar] [CrossRef]
  202. Vela, B.; Khatamsaz, D.; Acemi, C.; Karaman, I.; Arróyave, R. Data-augmented modeling for yield strength of refractory high entropy alloys: A Bayesian approach. Acta Mater. 2023, 261, 119351. [Google Scholar] [CrossRef]
  203. Khatamsaz, D.; Vela, B.; Singh, P.; Johnson, D.D.; Allaire, D.; Arróyave, R. Multi-objective materials bayesian optimization with active learning of design constraints: Design of ductile refractory multi-principal-element alloys. Acta Mater. 2022, 236, 118133. [Google Scholar] [CrossRef]
  204. Zhou, Y.; Yang, B. Uncertainty quantification of predicting stable structures for high-entropy alloys using Bayesian neural networks. J. Energy Chem. 2023, 81, 118–124. [Google Scholar] [CrossRef]
  205. Sulley, G.A.; Raush, J.; Montemore, M.M.; Hamm, J. Accelerating high-entropy alloy discovery: Efficient exploration via active learning. Scr. Mater. 2024, 249, 116180. [Google Scholar] [CrossRef]
  206. Halpren, E.; Yao, X.; Chen, Z.W.; Singh, C.V. Machine learning assisted design of BCC high entropy alloys for room temperature hydrogen storage. Acta Mater. 2024, 270, 119841. [Google Scholar] [CrossRef]
  207. Sui, Y.; Gotovos, A.; Burdick, J.W.; Krause, A. Bayesian optimization with active learning of design constraints using an entropy-based approach. In Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, 7–9 July 2015; pp. 1539–1547. [Google Scholar]
  208. Ren, W.; Zhang, Y.-F.; Wang, W.-L.; Ding, S.-J.; Li, N. Prediction and design of high hardness high entropy alloy through machine learning. Mater. Des. 2023, 235, 112454. [Google Scholar] [CrossRef]
  209. Li, X.; Zheng, M.; Li, C.; Pan, H.; Ding, W.; Yu, J. Accelerated design of low-activation high entropy alloys with desired phase and property by machine learning. Appl. Mater. Today 2024, 36, 102000. [Google Scholar] [CrossRef]
  210. Li, M.; Quek, X.K.; Suo, H.; Wuu, D.; Lee, J.J.; Teh, W.H.; Wei, F.; Made, R.I.; Tan, D.C.C.; Ng, S.R.; et al. Composition driven machine learning for unearthing high-strength lightweight multi-principal element alloys. J. Alloy. Compd. 2024, 1008, 176517. [Google Scholar] [CrossRef]
  211. Li, J.; Zhang, X.; Shang, C.; Ran, X.; Wang, Z.; Tang, C.; Zhang, X.; Nie, M.; Xu, W.; Lu, X. Reinforcement Learning in Materials Science: Recent Advances, Methodologies and Applications. Acta Metall. Sin. (Engl. Lett.) 2025, 11, 2077–2101. [Google Scholar] [CrossRef]
  212. Tan, X.; Trehern, W.; Sundar, A.; Wang, Y.; San, S.; Lu, T.; Zhou, F.; Sun, T.; Zhang, Y.; Wen, Y.; et al. Machine learning and high-throughput computational guided development of high temperature oxidation-resisting Ni-Co-Cr-Al-Fe based high-entropy alloys. npj Comput. Mater. 2025, 11, 93. [Google Scholar] [CrossRef]
  213. Daulton, S.; Eriksson, D.; Balandat, M.; Bakshy, E. Multi-objective Bayesian optimization over high-dimensional search spaces. In Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, Eindhoven, The Netherlands, 1–5 August 2022; PMLR: Breckenridge, CO, USA; Volume 180, pp. 507–517. [Google Scholar]
  214. Balandat, M.; Karrer, B.; Jiang, D.R.; Daulton, S.; Letham, B.; Wilson, A.G.; Bakshy, E. BoTorch: A framework for efficient Monte-Carlo Bayesian optimization. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Virtual, 6–12 December 2020. [Google Scholar]
  215. Bengio, Y.; Hinton, G.; Yao, A.; Song, D.; Abbeel, P.; Darrell, T.; Harari, Y.N.; Zhang, Y.-Q.; Xue, L.; Shalev-Shwartz, S.; et al. Managing extreme AI risks amid rapid progress. Science 2024, 384, 842–845. [Google Scholar] [CrossRef]
  216. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215. [Google Scholar] [CrossRef]
  217. He, J.; Li, Z.; Lin, J.; Zhao, P.; Zhang, H.; Zhang, F.; Wang, L.; Cheng, X. Machine learning-assisted design of refractory high-entropy alloys with targeted yield strength and fracture strain. Mater. Des. 2024, 246, 113326. [Google Scholar] [CrossRef]
  218. Pei, X.; Pei, J.; Hou, H.; Zhao, Y. Optimizing casting process using a combination of small data machine learning and phase-field simulations. npj Comput. Mater. 2025, 11, 27. [Google Scholar] [CrossRef]
  219. Qiao, L.; Zhu, J. Cuckoo search-artificial neural network aided the composition design in Al–Cr–Co–Fe–Ni high entropy alloys. Appl. Surf. Sci. 2024, 669, 160539. [Google Scholar] [CrossRef]
  220. Wen, C.; Zhang, Y.; Wang, C.; Huang, H.; Wu, Y.; Lookman, T.; Su, Y. Machine-Learning-Assisted Compositional Design of Refractory High-Entropy Alloys with Optimal Strength and Ductility. Engineering 2024, 46, 214–223. [Google Scholar] [CrossRef]
  221. Cai, J.; Luo, J.; Wang, S.; Yang, S. Feature selection in machine learning: A new perspective. Neurocomputing 2018, 300, 70–79. [Google Scholar] [CrossRef]
  222. Liu, G.; Wu, Q.; Ma, Y.; Huang, J.; Xie, Q.; Xiao, Q.; Gao, T. Machine learning-based phase prediction in high-entropy alloys: Further optimization of feature engineering. J. Mater. Sci. 2025, 60, 3999–4019. [Google Scholar] [CrossRef]
  223. Wen, C.; Shen, H.; Tian, Y.; Lou, G.; Wang, N.; Su, Y. Accelerated discovery of refractory high-entropy alloys for strength-ductility co-optimization: An exploration in NbTaZrHfMo system by machine learning. Scr. Mater. 2024, 252, 116240. [Google Scholar] [CrossRef]
  224. Angelopoulos, A.N.; Bates, S.; Fannjiang, C.; Jordan, M.I.; Zrnic, T. Prediction-powered inference. Science 2023, 382, 669–674. [Google Scholar] [CrossRef] [PubMed]
  225. Dunn, A.; Wang, Q.; Ganose, A.; Dopp, D.; Jain, A. Benchmarking materials property prediction methods: The Matbench test set and Automatminer reference algorithm. npj Comput. Mater. 2020, 6, 138. [Google Scholar] [CrossRef]
Figure 1. Advances in machine learning for high-entropy alloys. Publications per year retrieved from Web of Science with the keyword “machine learning high entropy alloys”. Search date: 9 September 2025.
Figure 2. Flowchart of the application of machine learning in HEA design.
Figure 3. Confusion matrix for binary classification. True Positive (TP): the number of samples that are actually positive and predicted positive. False Positive (FP): the number of samples that are actually negative but predicted positive. True Negative (TN): the number of samples that are actually negative and predicted negative. False Negative (FN): the number of samples that are actually positive but predicted negative.
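The four counts in Figure 3 are what the accuracy figures quoted throughout this review are built from. A minimal sketch, computing them from scratch; the labels below are invented for illustration and are not taken from any cited study:

```python
# Hedged sketch: Figure 3 counts and the derived metrics for a binary
# phase classifier. The labels are hypothetical, not from the paper.

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

# 1 = "solid solution forms", 0 = "no solid solution" (hypothetical labels)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)   # (3, 1, 3, 1)
accuracy  = (tp + tn) / (tp + fp + tn + fn)         # 0.75
precision = tp / (tp + fp)                          # 0.75
recall    = tp / (tp + fn)                          # 0.75
```

Accuracy alone can mislead on imbalanced phase datasets, which is why precision and recall are also reported in many of the studies cited here.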
Figure 4. A four-phase, closed-loop workflow for applying machine learning to new-materials R&D, such as alloy design. (a) Data preparation: multi-source information from experiments, simulations, and the literature is integrated, followed by feature engineering and dimensionality reduction (e.g., PCA, VAE) to obtain abstract, representative descriptors that support robust model building. (b) Target selection and model optimization: the research task is defined as classification (e.g., SS, FCC) or regression for property prediction, and the best model is identified through comparative evaluation, hyperparameter tuning, and strict training–validation protocols. (c) Prediction and model-guided design: the optimized model is applied to property prediction and design exploration, generating phase diagrams and property maps and recommending specific candidate compositions (e.g., Fe 20%, Ni 10%, Cr 30%). (d) Experimental validation and feedback: the proposed candidates are synthesized and tested, and the new data are looped back into stage (a) via active learning, refining the next iteration and enabling rapid, intelligent discovery of high-performance materials.
Figure 5. Classification of multiphase HEAs. Combinations of solid solution (SS), intermetallic (IM), and amorphous (AP) phases yield the categories FCC, BCC, HCP, FCC + BCC, FCC + IM, FCC + AP, BCC + IM, BCC + AP, FCC + BCC + IM, and FCC + BCC + AP.
Figure 6. Comparison of the sensitivity measures of the 13 design parameters based on the result of the ANN model. The sensitivity measures of the 13 design parameters for (a) amorphous phase (AM), (b) intermetallics (IM) and (c) solid solution (SS). Adapted from Ref. [55].
Figure 7. UMAP two-dimensional projection of materials representations for domain identification. (a,b) UMAP plots of ALIGNN embeddings learned from the leave-Mg-out and leave-O-out tasks in JARVIS. (c) UMAP plot of ALIGNN embeddings learned from the leave-H-out task in OQMD. (d) UMAP plot of ALIGNN embeddings learned by leaving out structures with 5 or more elements in MP. (e) UMAP plot of the XGB descriptors for the leave-period-5-out task in MP. (f) Absolute errors (left Y axis) of test data as functions of kernel-density estimates of training data for the UMAP plot in (e); the solid line denotes the MAEs (right Y axis) for different density intervals. In all UMAP plots, the training data, in-domain test data, and out-of-domain test data are marked in gray, blue, and red, respectively; clusters of out-of-domain test data are circled in (a–c); the R² scores are indicated for the in-domain, out-of-domain, and all-domain test data. Adapted from Ref. [78].
Figure 8. Feature analysis by SHAP. (a) SHAP analysis results for the ML hardness model. (b) SHAP analysis results for the corrosion current model. Adapted from Ref. [92].
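SHAP attributions like those in Figure 8 approximate Shapley values from cooperative game theory: each feature's fair share of the gap between a prediction and a baseline. For a tiny model the exact quantity can be enumerated directly. The three-input "hardness" surrogate below is a made-up toy, not the model of Ref. [92]:

```python
# Hedged sketch: exact Shapley values, the quantity SHAP (Figure 8)
# approximates. The three-input surrogate model is an assumption made
# purely for illustration.
from itertools import combinations
from math import factorial

def model(x):
    # toy surrogate: property rises with feature 0 plus a feature-1/2 interaction
    return 2.0 * x[0] + x[1] * x[2]

def shapley_values(f, x, baseline):
    """Exactly attribute f(x) - f(baseline) across the features of x."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                def v(coalition):
                    # features outside the coalition revert to the baseline
                    z = [x[j] if j in coalition else baseline[j] for j in range(n)]
                    return f(z)
                phi[i] += weight * (v(set(S) | {i}) - v(set(S)))
    return phi

phi = shapley_values(model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
# phi ≈ [2.0, 3.0, 3.0]; efficiency: sum(phi) == f(x) - f(baseline) == 8.0
```

Exact enumeration scales as 2ⁿ in the number of features, which is why practical HEA studies rely on sampling- or tree-based approximations such as the SHAP library.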
Figure 9. Graph representations of association rules between elements and elastic properties of HEAs. Results for (a) bulk modulus, (b) Young’s modulus, (c) shear modulus, (d) Pugh’s ratio, (e) Poisson’s ratio, and (f) Zener ratio. Node colors and sizes represent different elements (see legend) and their fractions. The redder (bluer) the node outlines and connections, the higher (lower) the predicted value of the elastic property; the thicker the node outline or connection, the higher the lift value of the rule. The node outlines and connections for the Zener ratio are mapped to a separate color bar to emphasize rules predicting Zener ratios close to 1.0, the isotropic case. Adapted from Ref. [108].
Figure 10. (a) Plot of predicted σy versus experimental σy of the eight newly synthesized alloys from the four iterations, as well as the training data, showing the high reliability and robustness of the present ML model. (b) Experimental σy versus the number of iterations, showing the optimized σy of the HEA05 alloy at the third iteration. Adapted from Ref. [120].
Figure 11. Feature selection based on combinations of features from different ML algorithms. The prediction error is shown for each model trained on a subset of the eight features in the data set. Adapted from Ref. [142].
Figure 12. Results of 50 GA runs and the classification performance for phase prediction of HEAs, using elemental numerical descriptions to enhance classification and regression model performance. (a,b) Classification accuracy of the logistic regression (LR) model as a function of the number of iterations within 50 GA runs for classification I and classification II, respectively; the red solid line indicates the best performer. (c,e) Margin of the LR model for classifying SS and NSS HEAs based on material features defined from the numerical descriptions of elements and from traditional empirical features, respectively. (d,f) Margin of the LR model for classifying FCC, BCC, and DP HEAs based on material features defined from the numerical descriptions of elements and from traditional empirical features, respectively. The larger symbols represent the 15 newly synthesized HEAs. Adapted from Ref. [153].
Figure 13. The cumulative complexities in HEAs. (a) Composition, structure, and site complexities progressively build upon each other, resulting in an increasingly vast space for HEA exploration. (b) Composition complexity. Phase separation and chemical ordering. (c) Structure complexity. The segregated phase(s) and chemical order. (d) Site complexity. Adapted from Ref. [162].
Figure 14. (a) First and last (sixth) iterations of the HEA-GAD generation. (b) Summary of the properties of the ML-designed HEAs. Reprinted with permission from Ref. [42].
Table 1. Examples of structure and performance prediction based on machine learning.

| Author | Target | ML Algorithms | Dataset & Performance | Refs | Year |
|---|---|---|---|---|---|
| Hareharen et al. | Phase | DT, KNN, RF, GB, XGBoost | 84.0% accuracy | [45] | 2024 |
| Veeresham et al. | Phase | KNN, bagging, AdaBoost, DT, extra trees, ANN | ANN 90.62% accuracy; extra trees 89.73% accuracy | [47] | 2024 |
| Jain et al. | Thermal deformation behavior | ANN | R = 0.9983 | [135] | 2023 |
| Dewangan et al. | Flow stress | BR, EN, LR, RF, GBoosting, SV, RR, PR | R² = 0.994, MAE = 7.77%, RMSE = 9.7% | [136] | 2024 |
| Dewangan et al. | Room-temperature creep behavior | ANN | The ANN model accurately forecasts the room-temperature creep behavior of HEAs | [137] | 2023 |
| Wu et al. | Thermal deformation behavior | RF, KNN, XGBoost, DT, SVR | Predicted flow stress of the dual-FCC-phase CoCrCu1.2FeNi HEA at new temperatures and strain rates | [138] | 2024 |
| Jain et al. | Flow curves | RF, XGBoost, DT, KNN, GB | R² = 0.97, RMSE = 10.1%, MAE = 8.9% | [139] | 2025 |
| He et al. | Phase | KNN, SVM, DT, RF, LR | 399 data points, 87.0% accuracy | [140] | 2024 |
| Zhou et al. | Structural energy | NN | RMSE of the predicted energy is 1.37 meV/atom | [141] | 2023 |

Abbreviations used in this table: Bayesian Ridge (BR), Elastic Net (EN), Linear Regression (LR), Gradient Boosting (GBoosting), Support Vector (SV), Ridge Regression (RR), Polynomial Regression (PR).
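The regression studies in Table 1 report R², RMSE, and MAE. For reference, these metrics can be computed from first principles as below; the flow-stress values are invented purely for illustration:

```python
# Hedged sketch: the metrics quoted in Table 1, computed from scratch.
# The y_true / y_pred values are hypothetical, not from any cited study.
from math import sqrt

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                                  # coefficient of determination
    rmse = sqrt(ss_res / n)                                     # root-mean-square error
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n   # mean absolute error
    return r2, rmse, mae

r2, rmse, mae = regression_metrics([100.0, 150.0, 200.0, 250.0],
                                   [110.0, 140.0, 205.0, 245.0])
# r2 ≈ 0.98, rmse ≈ 7.91, mae = 7.5
```

Note that R² compares the model against a mean-only baseline, so a high R² on a narrow composition range does not by itself guarantee extrapolation to unseen alloy systems.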
Table 2. Exploration and design of ML-based HEA space.

| Author | Target | Algorithms | Results | Refs | Year |
|---|---|---|---|---|---|
| Chen et al. | Find HEAs with high hardness | RF, PSO | Obtained an HEA with an average hardness of 966 HV, higher than existing alloys in the AlCoCrCuFeNi system | [143] | 2023 |
| Zhao et al. | Design HEAs with ultra-high microhardness and unexpectedly low density | GAN, AL, XGBoost | Four Al-rich compositions exhibit ultra-high microhardness (>740 HV, up to ~780.3 HV) and low density (<5.9 g/cm³) in the as-cast bulk state | [151] | 2024 |
| Xu et al. | Represent the local-atomic-environment dependence of the PEL in HEAs | NN, KMC | TixZr2−xCrMnFeNi (x = 0.5, 1.0, 1.5) with a hydride formation enthalpy of −25 to −39 kJ/mol designed for room-temperature hydrogen storage | [147] | 2022 |
| Rao et al. | Accelerate the design of high-entropy Invar alloys | DFT, AL, WAE | Identified two high-entropy Invar alloys with extremely low thermal expansion coefficients, around 2 × 10⁻⁶ K⁻¹ at 300 K | [44] | 2022 |
| Wei et al. | Discover a mathematical formula | XGBoost, SHAP | Domain-knowledge-guided ML yielded a highly interpretable formula describing the high-temperature oxidation behavior of FeCrAlCoNi-based HEAs | [156] | 2023 |
| Sulley et al. | Explore the complex composition space | NN, AL | Active learning iteratively narrows the search, reducing the expense of exploring the entire design space | [205] | 2024 |
| Yin et al. | Find a representative order parameter | VAE, CNN | Coined the new concept of a “VAE order parameter” | [160] | 2021 |
| Wang et al. | Search the vast compositional space of HEAs with a neural network | DNN, CNN | Two HEAs designed with the model were experimentally verified to combine high strength and ductility | [167] | 2023 |
| Halpren et al. | Design BCC HEAs for room-temperature hydrogen storage | MOBO | Discovered 8 new HEA candidates, including VNbCrMoMn, which stores 2.83 wt% hydrogen at room temperature and atmospheric pressure | [206] | 2024 |

Abbreviations used in this table: particle swarm optimization (PSO), active learning (AL), kinetic Monte Carlo (KMC), multi-objective Bayesian optimization (MOBO).
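Several Table 2 entries (notably the active-learning studies) share the same iterative loop: train a surrogate on the labelled data, pick the candidate that balances predicted property against model uncertainty, run the "experiment", and retrain. A deliberately simplified sketch; the one-dimensional composition axis, the hidden hardness() oracle, and the nearest-neighbour surrogate are all toy assumptions, not any cited study's method:

```python
# Hedged sketch of the surrogate -> acquisition -> experiment -> retrain
# loop behind active-learning alloy design. All components are toy
# stand-ins chosen for illustration.

def hardness(x):
    # hidden oracle standing in for a synthesis-and-test experiment
    return -(x - 0.63) ** 2          # toy property peaking at x = 0.63

def predict(train, x):
    # 1-nearest-neighbour surrogate fitted to the labelled data
    return min(train, key=lambda t: abs(t[0] - x))[1]

def uncertainty(train, x):
    # distance to the nearest labelled point as an exploration signal
    return min(abs(t[0] - x) for t in train)

pool = [i / 100 for i in range(101)]             # candidate "compositions"
train = [(x, hardness(x)) for x in (0.0, 1.0)]   # initial experiments
for _ in range(10):
    # acquisition: predicted property plus an exploration bonus
    x_next = max(pool, key=lambda x: predict(train, x) + uncertainty(train, x))
    train.append((x_next, hardness(x_next)))     # run the "experiment"
best = max(train, key=lambda t: t[1])            # converges toward x = 0.63
```

Real studies replace the 1-NN surrogate with Gaussian processes or neural ensembles and the additive bonus with principled acquisition functions (expected improvement, upper confidence bound), but the budget-saving logic is the same: only a handful of the 101 candidates are ever "synthesized".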

Share and Cite

MDPI and ACS Style

Xu, X.; He, Z.; Zheng, K.; Che, L.; Feng, W. Applications of Machine Learning in High-Entropy Alloys: Phase Prediction, Performance Optimization, and Compositional Space Exploration. Metals 2025, 15, 1349. https://doi.org/10.3390/met15121349
