Prediction on Mechanical Properties of Non-Equiatomic High-Entropy Alloy by Atomistic Simulation and Machine Learning

Abstract: High-entropy alloys (HEAs) with multiple constituent elements have been extensively studied in the past 20 years, due to their promising engineering application. Previous experimental and computational studies of HEAs focused mainly on equiatomic or near-equiatomic HEAs. However, there is probably far more treasure in non-equiatomic HEAs with carefully designed composition. In this study, molecular dynamics (MD) simulation combined with machine learning (ML) methods was used to predict the mechanical properties of non-equiatomic CuFeNiCrCo HEAs. A database was established based on tensile tests of 900 HEA single-crystal samples by MD simulation. Eight ML models were investigated and compared for the binary classification learning tasks, ranging from shallow models to deep models. It was found that the kernel-based extreme learning machine (KELM) model outperformed the others for the prediction of yield stress and Young’s modulus. The accuracy of the KELM model was further verified on large-size polycrystal HEA samples. The results show that computational simulation combined with ML methods is an efficient way to predict the mechanical performance of HEAs, which provides new ideas for accelerating the development of novel alloy materials for engineering applications.


Introduction
The concept of "high-entropy alloys" (HEAs), in which five or more elements are mixed in equiatomic or near-equiatomic proportions, was proposed by Yeh and co-workers [1], and independently by Cantor and co-workers [2], in 2004. Since then, HEAs have been extensively studied by researchers due to their advanced properties in many aspects [3,4]. Compared to conventional alloys, which contain one and rarely two base elements, HEAs have been shown to possess many superior properties, including excellent strength, exceptional ductility and fracture toughness at cryogenic temperatures, superior mechanical performance at high temperatures, super-paramagnetism and superconductivity [5]. Some recent high-profile studies have shown that well-designed HEAs exhibit superior mechanical properties that can overcome the longstanding conundrum of the strength-ductility trade-off in materials [6][7][8]. The excellent mechanical properties of HEAs make them advanced structural materials with promising engineering applications.
HEA is a large field with a countless number of new alloy systems. If five elements were randomly selected from a pool of 118 elements to form an alloy, the number of

Simulation Sample
For the HEA single crystal, 900 small-size samples were generated without initial structural defects. Each cubic sample has a side length of approximately 10 nm and contains about 100,000 atoms. This sample size was chosen to keep the high-throughput MD simulations computationally efficient while still allowing the primary nucleation of dislocations in the grain to be observed [17]. The large-size HEA polycrystal samples were constructed using the Voronoi construction method and contain 16 grains with random crystallographic orientations. The mean grain size is 27.4 nm, and the total number of atoms is over 10 million. Atoms of the five elements are uniformly distributed in the single-crystal and polycrystal HEA samples [18]. Both single-crystal and polycrystal samples were constructed with an fcc phase. HEAs with the fcc phase have been widely researched in previous experimental studies, and fcc has been found to be a stable phase of alloy systems similar to CuFeNiCrCo [5].
The phase stability of the simulation sample was further examined by gradually raising the system temperature. Before heating, the HEA sample was first relaxed to an equilibrium configuration at 300 K in the canonical ensemble (NVT: constant atom number, constant box volume and constant temperature). The temperature was then gradually increased from ambient temperature (300 K) to high temperature (3000 K) in the isothermal-isobaric ensemble (NPT). As shown in Figure 2, HEA samples with different element compositions were tested, including an equiatomic sample and four non-equiatomic samples. The total volume of the system was monitored during the heating process. The inserted snapshots show the atomic configurations of the Cu30Fe18Ni9Cr14Co29 sample at different temperature levels. Atoms are coloured by the common neighbour analysis (CNA) method, where green atoms are in the fcc structure and blue atoms represent a disordered structure. The fcc phase of the HEA sample remained unchanged below 1000 K. From 1500 to 2000 K, the number of disordered atoms (blue atoms) increased considerably. The first-order phase transition (solid to liquid) was observed at about 2500 K, where the sample volume shows a sudden jump. The other four simulated samples in Figure 2 show a similar process, although the melting temperature varies with the element composition.

MD Simulation
MD simulations were performed using the parallel molecular dynamics package LAMMPS [19], with the embedded atom method (EAM) interatomic potentials developed for the CuFeNiCrCo HEA system [20]. The interatomic potential accurately reflects the basic structural and physical characteristics of the five components, including the lattice constant, cohesive energy and elastic modulus [20]. In particular, the interatomic potential reproduces the large variation in stacking fault energies of the constituent elements, both stable and unstable, which is significant for the study of dislocation behaviour, and therefore of the mechanical properties of the alloy system.
The equilibrium structure of the HEA sample was obtained by an initial energy minimisation with a standard conjugate gradient algorithm, followed by system annealing at 300 K for 100 ps [21,22]. Periodic boundary conditions were applied in all directions, and atomic vibration and changes in sample dimensions were allowed during the simulation. The fcc lattice structure of the samples was maintained after system annealing in an isobaric-isothermal (NPT) environment. To test the mechanical properties of HEA samples with different compositions, uniaxial tension was applied in one direction, while the pressure along the other two directions was kept at zero in the NPT ensemble (T = 300 K). In particular, when conducting the dynamic tension simulation, we not only randomly changed the composition of the elements in the alloy system, but also considered an important physical index of materials, namely the anisotropy of the mechanical properties. Here, nine crystallographic orientations were deliberately selected to test the mechanical properties of HEA samples. These [...] [876]) of the inverse pole figure for material textures. The strain was applied at a constant rate of 5 × 10⁸ s⁻¹, and the timestep was set to 1 fs throughout the work. The short time duration to which MD simulations are inherently limited is particularly relevant to the simulation of plastic deformation. As a consequence, MD simulations always involve extremely high strain rates, typically 10⁷ to 10⁹ s⁻¹. The strain rate and timestep set in this study are frequently used parameters in MD simulations. System strain was derived from the positions of the periodic boundaries along the loading direction, and the system stress was obtained by calculating the pressure of the entire system of atoms. All the atomic figures in this study are illustrated with the visualisation tool OVITO [23].
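As a quick check of what these loading parameters imply, the following sketch converts the stated strain rate and timestep into a number of MD steps and a simulated time. It is only illustrative arithmetic: the function names are hypothetical, and the 5% target strain is chosen here as an example, not taken from the simulation setup.

```python
# Values stated in the text: strain rate 5e8 1/s and a 1 fs timestep.
STRAIN_RATE = 5e8   # 1/s
TIMESTEP = 1e-15    # s (1 fs)


def steps_to_strain(target_strain, strain_rate=STRAIN_RATE, timestep=TIMESTEP):
    """Number of MD timesteps needed to reach a given engineering strain."""
    strain_per_step = strain_rate * timestep  # strain increment per timestep
    return round(target_strain / strain_per_step)


def simulated_time_ps(n_steps, timestep=TIMESTEP):
    """Simulated physical time, in picoseconds, covered by n_steps timesteps."""
    return n_steps * timestep * 1e12


# Example (hypothetical target): steps and time needed to reach 5% strain.
n_steps = steps_to_strain(0.05)
print(n_steps, "steps,", simulated_time_ps(n_steps), "ps")
```

Each timestep therefore adds a strain of 5 × 10⁻⁷, which illustrates why such high strain rates are needed to reach yielding within an accessible number of steps.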

ML Method
For metallic materials, the nucleation of the first set of dislocations corresponds to the theoretical yield strength, and represents the beginning of plastic deformation [24]. It also plays a significant role in material-hardening mechanisms, and hence can be used to guide the design and optimisation of alloy composition. Therefore, yield stress (Ys) was chosen as the first learning target for ML. On the other hand, Young's modulus is an intrinsic mechanical property, mainly determined by the constituent elements and the crystal structure. For conventional alloys with one principal element, Young's modulus is mainly controlled by the dominant element. In contrast, for HEAs, Young's modulus can be very different from that of any of the constituent elements in the alloy [25]. Here, Young's modulus (E) was chosen as the second learning target for ML.
In the ML task, the composition of the five elements of CuFeNiCrCo was used as the input feature to predict the mechanical properties. The principal concept of HEAs is based on designing alloys with multiple principal elements, each ranging from 5 to 35 atomic percent, with the aim of forming single-phase solid solutions arising from the high entropy of the system [1]. Therefore, when randomly assigning the proportions of the five elements, the upper limit of each element was set to 35% and the lower limit to 5%. The mechanical properties (Ys and E) were set as output features, and the binary classification of the output values is the learning target.
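The composition constraint described above can be illustrated with a small sampler. This is a minimal sketch under stated assumptions: the text does not say how the random compositions were drawn, so the rejection-sampling scheme, the restriction to integer atomic percentages and the name `random_composition` are all choices made here for illustration.

```python
import random

ELEMENTS = ("Cu", "Fe", "Ni", "Cr", "Co")


def random_composition(rng, low=5, high=35, total=100):
    """Draw integer atomic percentages within [low, high] that sum to total.

    Assumption: integer percentages and rejection sampling; the text only
    states the 5% and 35% limits, not the sampling procedure.
    """
    while True:
        # Draw the first four percentages freely within the bounds ...
        parts = [rng.randint(low, high) for _ in range(len(ELEMENTS) - 1)]
        # ... and let the last element take the remainder.
        last = total - sum(parts)
        if low <= last <= high:  # reject draws that violate the bounds
            return dict(zip(ELEMENTS, parts + [last]))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [random_composition(rng) for _ in range(100)]
print(samples[0])
```

Because every accepted draw is determined by its first four values, this scheme samples uniformly among the integer compositions that satisfy the bounds.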
In particular, traditional material design by ML is based only on the design of a purely mathematical algorithm, while the intrinsic structural and physical properties of the material are not considered. In this way, ML usually requires a large amount of data, and may result in uneven learning samples with respect to the special properties of materials. In this study, when establishing the database, the anisotropy of the mechanical properties of materials was considered and nine representative crystallographic orientations were selected to test the mechanical properties of HEA samples. The MD simulations for each orientation comprise one result for the equiatomic HEA sample and 100 results for non-equiatomic HEA samples with different element compositions.
There are three main paradigms in ML: supervised learning, unsupervised learning and reinforcement learning. Our task is to predict the mechanical properties (output features) of the HEA samples from the element composition (input features); hence, the supervised learning paradigm was selected. Eight ML models were investigated and compared, ranging from classic (shallow) models to deep models, including Naïve Bayes (NB) [26], linear discriminant analysis (LDA) [27], k-Nearest Neighbour (k-NN) [28], support vector machine (SVM) [29], extreme learning machine (ELM) [30], kernel-based extreme learning machine (KELM) [31], deep neural network (DNN) [32] and stacked auto-encoders (SAEs) [33].
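To make the KELM model more concrete, the sketch below implements the commonly used closed-form KELM solution, in which the output weights are obtained by solving (K + I/C) β = t for a kernel matrix K and targets t. It is not the implementation used in this study: the Gaussian kernel, the values of `C` and `gamma`, the ±1 target encoding and the class name `KELM` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix with K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Minimal binary kernel-based ELM (hypothetical hyper-parameter defaults)."""

    def __init__(self, C=10.0, gamma=1.0):
        self.C = C          # regularisation coefficient (assumed value)
        self.gamma = gamma  # kernel width parameter (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        t = 2.0 * np.asarray(y, dtype=float) - 1.0  # map labels 0/1 to -1/+1
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        # Closed-form output weights: solve (K + I/C) beta = t.
        self.beta = np.linalg.solve(K + np.eye(K.shape[0]) / self.C, t)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return (K @ self.beta > 0).astype(int)  # sign of the output gives the class


# Toy usage on two well-separated clusters (synthetic data, not alloy data).
X_toy = np.array([[-2.0, -2.0], [-1.5, -2.5], [2.0, 2.0], [2.5, 1.5]])
y_toy = np.array([0, 0, 1, 1])
model = KELM().fit(X_toy, y_toy)
print(model.predict(np.array([[-2.2, -1.8], [2.1, 2.2]])))
```

Training reduces to a single linear solve with no iterative tuning of hidden-node parameters, which is the property that makes ELM-type models fast to train on small datasets.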

Mechanical Response of HEA Single-Crystal
The mechanical responses of the equiatomic CuFeNiCrCo samples with different crystallographic orientations under uniaxial tension at 300 K are shown in Figure 3a. Young's modulus for each orientation is obtained from the linear portion of the corresponding stress-strain curve between 0 and 0.5% strain. Theoretically, the nucleation of the first set of dislocations corresponds to the yield strength of materials [24]. Figure 3c shows snapshots of the nucleation of the first set of dislocations in the single-crystal samples with different orientations. The dislocations are extracted by the dislocation extraction algorithm (DXA) [34], where different colours represent different types of dislocations. By comparing the tensile strain, it was found that the dislocation nucleation event occurs near the peak of the stress-strain curve. Therefore, in line with previous MD studies, the peak stress on the stress-strain curve is defined as the yield stress [35][36][37]. The mechanical response shows the elastic anisotropy of the HEA samples with different crystal orientations. Tension along the [111] orientation gives the maximum Young's modulus of 268.96 GPa, while the [210] orientation has the minimum value of 169.07 GPa. Additionally, there are remarkable differences in yield stress between orientations; the maximum yield stress is 15.14 GPa for the [111] orientation, and the minimum value is 7.66 GPa for the [110] orientation. Therefore, a database that accounts for the anisotropy of the mechanical properties contains a wider range of information than one based on element composition alone, which is conducive to improving the reliability and applicability of an ML model in dealing with complex structural materials.
Two conclusions can be drawn from the above results. On the one hand, changing the proportions of elements can improve or decrease the mechanical properties of HEA samples, which provides an opportunity for achieving the required performance of HEA by optimising its element composition. On the other hand, a high Young's modulus does not guarantee a high yield stress. For the equiatomic HEA sample, the yield stress of the [100] orientation is 18.26 GPa, which is more than twice that of the [110] orientation (7.66 GPa). However, Young's modulus of the [100] orientation (118.7 GPa) is much lower than that of the [110] orientation (192.27 GPa). For further illustration, Figure 5 plots the yield stress (Ys) as a function of Young's modulus (E) of different CuFeNiCrCo samples under tension along different orientations. The fitting lines indicate the general trend of the point sets. The scattered points indicate that there is no direct correlation between E and Ys of the HEA samples.
The weak correspondence between Young's modulus and yield stress is mainly ascribed to the nonlinear elastic behaviour of materials under strain, either 'elastic hardening' or 'elastic softening', during dynamic tension in the elastic stage [38]. The yield stress is usually small when elastic softening occurs, while a high yield stress can be achieved when elastic hardening occurs. Although the yield stress generally increases with Young's modulus for most orientations, the correspondence between the two values is not prominent. The independence of Young's modulus and yield stress makes it necessary to predict the two mechanical properties by ML separately.
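The two definitions used in this section (Young's modulus from the linear portion up to 0.5% strain, and yield stress as the peak stress) can be sketched as follows. This is an illustrative sketch rather than the post-processing actually used: the least-squares line fit, the function names and the synthetic stress-strain curve are assumptions.

```python
import numpy as np


def youngs_modulus(strain, stress, max_strain=0.005):
    """Slope of a least-squares line through the points with strain <= max_strain.

    Strain is dimensionless and stress is in GPa, so the slope is in GPa.
    """
    strain = np.asarray(strain, dtype=float)
    stress = np.asarray(stress, dtype=float)
    mask = strain <= max_strain
    slope, _intercept = np.polyfit(strain[mask], stress[mask], 1)
    return float(slope)


def yield_stress(stress):
    """Peak stress of the stress-strain curve, taken here as the yield stress."""
    return float(np.max(np.asarray(stress, dtype=float)))


# Synthetic curve for illustration only: linear loading, then a stress drop.
strain = np.linspace(0.0, 0.10, 201)
stress = np.where(strain <= 0.06, 200.0 * strain, 12.0 - 100.0 * (strain - 0.06))
print(youngs_modulus(strain, stress), yield_stress(stress))
```

Because the modulus is fitted only on the first 0.5% of strain while the peak occurs much later, any elastic softening or hardening in between changes the yield stress without changing the fitted modulus.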

ML Models Training and Evaluation
Based on the MD simulations of the tensile responses of CuFeNiCrCo single-crystal samples, the quantitative relationship between element composition (input features) and the mechanical properties (Ys and E, output features) of HEA samples was obtained, and the database used for ML was constructed (see Supplementary Materials). In this section, eight ML models were investigated for the given tasks, ranging from shallow models to deep models. The overall dataset was split into train, development (dev) and test sets, which account for 60, 20 and 20% of all instances, respectively. In order to reduce the effects of outliers, all the input features (the element composition) were standardised using a z-score transformation before being fed into the ML models. The feature vectors were scaled to a distribution with an arithmetic mean of zero and a variance of one. Let x(n) be the composition vector of the n-th element in the high-entropy alloy data; the values of x(n) were standardised as

$$x'(n) = \frac{x(n) - \mu_x}{\sigma_x}$$

where μx and σx are the mean and the standard deviation of the vector x(n), respectively. These statistics (μx and σx) were measured on the train set and then applied to the dev and test sets.
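The split and the standardisation described above can be sketched as follows. This is a minimal illustration under assumptions: the random shuffling, the population standard deviation (NumPy's default `ddof=0`) and the helper names are choices made here, and the feature matrix is synthetic rather than the database of this study.

```python
import numpy as np


def split_indices(n, rng, fractions=(0.6, 0.2, 0.2)):
    """Shuffle n instance indices and split them into train/dev/test parts."""
    idx = rng.permutation(n)
    n_train = round(fractions[0] * n)
    n_dev = round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_dev], idx[n_train + n_dev:]


def fit_zscore(X_train):
    """Per-feature mean and standard deviation, measured on the train set only."""
    return X_train.mean(axis=0), X_train.std(axis=0)


def apply_zscore(X, mu, sigma):
    """Standardise X with statistics that were fitted elsewhere."""
    return (X - mu) / sigma


rng = np.random.default_rng(0)
X = rng.uniform(5.0, 35.0, size=(900, 5))  # synthetic stand-in for compositions
train, dev, test = split_indices(len(X), rng)
mu, sigma = fit_zscore(X[train])
X_train = apply_zscore(X[train], mu, sigma)
X_dev = apply_zscore(X[dev], mu, sigma)    # dev and test reuse the train statistics
X_test = apply_zscore(X[test], mu, sigma)
print(len(train), len(dev), len(test))
```

Fitting μx and σx on the train set alone keeps information from the dev and test sets out of the preprocessing step.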
Considering the imbalanced nature of our proposed database (i.e., the number of instances belonging to each class is unequal), the unweighted average recall (UAR) was used as the main metric to evaluate model performance (Section S1). The UAR metric is regarded as more suitable and rigorous than the weighted average recall (WAR) for evaluating a model's performance on an imbalanced database [39]. In addition, the WAR, the sensitivity (Sens.), the specificity (Spec.), the precision (Prec.) and the F1-score are calculated as complementary evaluation metrics. The results of the complementary evaluation metrics are listed in Table 1. The hyper-parameters of all ML models are tuned and optimised on the dev set based on the performance (UAR). A grid-search strategy was used to decide the optimal hyper-parameters for each ML model. The details of the grid-search strategy and the procedure of hyper-parameter optimisation are introduced in Section S2 [40][41][42][43]. For the final evaluation, the train and dev sets are merged to train the ML model with the optimised hyper-parameters, and the trained model is used to predict the output features (Ys and E) of the HEA samples in the test set.
The achieved results (UARs in [%]) of the eight ML models for the learning tasks of yield stress (Ys) and Young's modulus (E) are presented in Figure 6a,b. Most of the results are above 85.0% UAR, indicating that the ML models perform efficiently for both tasks. For the task of yield stress, the best UAR achieved on the test set is reached by the NB model (86.4%), while for the task of Young's modulus, the KELM model takes first place with a UAR of 92.2%. Some simple ML models (e.g., NB, k-NN, LDA) also show a good capacity in predicting the results; they perform well for both tasks. It is reasonable to think that the relationship between the element composition (input features) and the mechanical properties (output features) of the HEA is sufficient to build ML models. Compared to the DNN model, which in this study is a simple multi-layer perceptron without a pre-training process, the SAE model can learn more information inherent in the data itself in an unsupervised paradigm through its pre-training process. Therefore, the SAE outperforms the general DNN for the two tasks on the test set. However, both models are constrained in this study, mainly because the small database offers a limited number of instances for deep learning models. In contrast, ELM and its variant, KELM, are demonstrated to be more efficient in this study. Unlike the two deep models, the ELM and KELM models do not require tuning of the hidden-node parameters, which makes their training process much faster. Moreover, for the small dataset in this study, ELM and KELM were able to execute efficiently while maintaining a fast training scheme. SVM is a popular and stable ML model that has been successfully applied to many tasks in the past decades. In this study, although SVM has shown some robustness and effectiveness, its results are not as good as those of ELM and KELM. It has been indicated that, when using kernels, SVM is more likely to reach sub-optimal solutions than KELM, which may lead to poorer results [44]. By comparing the UAR results, as well as other complementary evaluation metrics of the above models, it is believed that the [...]
To further demonstrate the effectiveness of the KELM model, Figure 6c,d plot the 180 predicted results on the test set by the KELM model. Firstly, the results of yield stress and Young's modulus from MD simulations are each divided into the two categories of 'Good' and 'Weak'. The red line is the benchmark line defined by the values of the equiatomic sample, and its value is set as one. The 180 results of the non-equiatomic samples, distributed along the horizontal axis, are normalised by the benchmark value and plotted along the vertical axis. According to the MD simulation, the points above the red line are classified into the 'Good' zone, while the points below the red line are classified into the 'Weak' zone. If the predicted result ('Good' or 'Weak') from the KELM model matches the MD result, the point is shown in blue; otherwise it is shown in red. Most of the red points are located near the red line, implying that the KELM model gives an incorrect prediction only when the value is very close to the benchmark value. This is more evident in the task of Young's modulus, as shown in Figure 6d. Based on the above results, it is assumed that if an upper and a lower threshold were set around the benchmark value, the prediction accuracy of the ML model could be further improved, which supports the efficiency of the KELM model for the learning tasks.
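The labelling against the equiatomic benchmark and the two recall metrics can be illustrated with a short sketch. It rests on assumptions: values exactly on the benchmark are assigned to 'Weak' here (the text does not say how ties are handled), the function names are hypothetical, and the example labels are synthetic.

```python
import numpy as np


def label_against_benchmark(values, benchmark):
    """Return 1 ('Good') where value / benchmark > 1, else 0 ('Weak')."""
    return (np.asarray(values, dtype=float) / benchmark > 1.0).astype(int)


def recall_per_class(y_true, y_pred):
    """Recall of each class that occurs in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return {int(c): float(np.mean(y_pred[y_true == c] == c)) for c in np.unique(y_true)}


def uar(y_true, y_pred):
    """Unweighted average recall: plain mean of the per-class recalls."""
    return float(np.mean(list(recall_per_class(y_true, y_pred).values())))


def war(y_true, y_pred):
    """Weighted average recall: recalls weighted by class size (overall accuracy)."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


# Imbalanced toy case: a model that always predicts 'Good'.
y_true = np.array([1] * 8 + [0] * 2)
y_pred = np.ones(10, dtype=int)
print("WAR:", war(y_true, y_pred), "UAR:", uar(y_true, y_pred))
```

In this toy case the WAR looks high while the UAR stays at chance level, which is why UAR is the stricter choice for an imbalanced database.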

Prediction on HEA Polycrystal
The large-size polycrystal HEA samples were constructed to further verify the reliability and applicability of the well-trained KELM model. A polycrystal sample is a combination of a number of single crystals with random orientations, which can better characterise the texture of real materials. As shown in Figure 7a, the remarkable difference from the single-crystal sample is that the polycrystal sample contains grain boundary networks. The sample is coloured according to the CNA method, in which green atoms are in the grain interior with an fcc structure, and blue atoms are in the grain boundary region with a disordered structure. The polycrystal samples contain 16 grains with a mean grain size of 27.4 nm, and the total number of atoms is 10,976,192. Given the computational cost of the large-size samples, an equiatomic sample and ten non-equiatomic samples with random element compositions were tested. For simplicity, the tested polycrystal HEA samples are marked as P0 to P10, where P0 is the sample with equiatomic composition, and P1 to P10 are the ten non-equiatomic samples. The element compositions of the samples are shown in Table 2.
Figure 7b plots the stress-strain response of the polycrystal samples with different compositions under uniaxial tension by MD simulation. The maximum stresses of the samples are found between 4% and 5% strain, depending on their element compositions. The yield stress and Young's modulus of the equiatomic sample are at a medium level compared to the non-equiatomic counterparts. This result is consistent with the case of single-crystal samples: changing the element composition can either improve or degrade the mechanical properties of the polycrystal HEA samples. Based on the MD simulation, the results of yield stress and Young's modulus are classified into two groups. Meanwhile, the prediction results ('Good' or 'Weak') are given by the ML model. The comparison of MD simulation and ML prediction is listed in Table 2. For yield stress, the ML model gives a correct prediction for nine of the ten tested samples, with P4 as the only exception. This accuracy is consistent with the prediction for single-crystal samples (~90%). However, the ML model failed to predict Young's modulus for four (P2, P5, P9, and P10) of the ten tested samples; the predictive accuracy is much lower than in the single-crystal case. This deviation is mainly attributed to the grain boundaries in the polycrystal sample. Experiments and simulations have shown that the existence of grain boundaries has an important effect on the mechanical properties of materials [45][46][47][48]. As mentioned, the yield stress depends mainly on the nucleation of the first set of dislocations [24]. The stress required for dislocation nucleation at a grain boundary is much lower than that required for nucleation in a defect-free single-crystal sample [35,49], so the yield stress of polycrystal samples shows an overall decreasing trend compared with single-crystal samples. In addition, as in the single-crystal sample, dislocation nucleation is still the main reason for polycrystal samples to yield at the current grain size. Taking Cu28Fe13Ni23Cr8Co28 as an example, Figure 7c shows the configuration of the sample at different deformation stages under tension. These snapshots are cross-sectional views of the sample along the [110] direction. It can be seen that the sample yields at a tensile strain between 4% and 5%, which corresponds to the initial dislocation nucleation at the grain boundaries. After that, the system stress decreases rapidly as a higher density of dislocations nucleates from the grain boundaries. Meanwhile, the configuration of the grain boundary network does not change substantially near the yield point, indicating that the presence of grain boundaries does not play an important role in the yield process of the polycrystal sample at the current grain size, and thus has almost no impact on the prediction accuracy of ML on the yield stress task.
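The polycrystal prediction accuracies discussed above follow directly from the lists of misclassified samples. A minimal sketch of that arithmetic, using only the sample labels reported in the text (the helper name `accuracy` is an illustrative choice):

```python
# Ten non-equiatomic polycrystal samples, P1..P10.
SAMPLES = [f"P{i}" for i in range(1, 11)]

# Samples for which the ML label disagreed with the MD label, as reported in the text.
MISSES = {
    "yield stress": {"P4"},
    "Young's modulus": {"P2", "P5", "P9", "P10"},
}

def accuracy(samples, misses):
    """Fraction of samples whose ML prediction matched the MD result."""
    hits = sum(1 for s in samples if s not in misses)
    return hits / len(samples)

ACCURACY = {task: accuracy(SAMPLES, missed) for task, missed in MISSES.items()}

for task, acc in ACCURACY.items():
    print(f"{task}: {acc:.0%}")
```

With one miss out of ten the yield stress task stays at the single-crystal level, while four misses out of ten for Young's modulus correspond to six correct predictions.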
On the other hand, Young's modulus is sensitive to the chemical composition and the intrinsic structure of materials. Simulations have shown that Young's modulus decreases as the grain boundary volume fraction increases, especially when the grain size drops into the nanocrystalline regime (<100 nm) [47,48]. Whereas the single-crystal sample has a single fcc structure (green atoms), in the polycrystal samples the atoms in the grain boundary regions contribute an additional disordered structure (blue atoms). Therefore, grain boundaries can have a more prominent influence on Young's modulus than on yield stress. Since the ML model was trained and optimised based on single-

Figure 1. Schematic diagram of MD simulation and ML methods. (a) Atomic configuration of a CuFeNiCrCo single-crystal sample; atoms are coloured according to element type. (b) Atoms coloured by the common neighbour analysis (CNA) method; the green atoms denote the face-centered cubic structure. (c) Schematic of the working principles of some ML models, including deep neural network (DNN), extreme learning machine (ELM), and support vector machine (SVM). (d) Stress-strain response of single-crystal HEA samples with various element compositions along the [110] orientation by MD simulation. The inverse pole figure indicates the nine crystallographic orientations tested in this study. (e) Prediction of the mechanical properties (Ys and E) by the ML method.


Figure 2. The HEA sample volume as a function of temperature from 300 K to 3000 K. The inset snapshots show the atomic configurations of the Cu30Fe18Ni9Cr14Co29 sample at different temperature levels. Atoms are coloured by the CNA method; the green atoms represent the face-centered cubic structure, and the blue atoms are disordered.



Figure 3. Mechanical responses of single-crystal HEA samples. (a) Stress-strain response of the equiatomic single-crystal HEA samples with different crystallographic orientations. (b) Stress-strain response of the single-crystal HEA samples with different element compositions along the [110] orientation. (c) Snapshots of the nucleation of the first set of dislocations in the single-crystal samples with different orientations near the yield point. ε indicates the tensile strain.

The stress-strain responses of the non-equiatomic HEA samples with different orientations are presented in Figure 4. For each orientation, the figure contains 100 results of the non-equiatomic HEA samples based on random combinations of the constituent elements. The results of the equiatomic HEA sample are also plotted for comparison. It is found that the yield stress and Young's modulus of the equiatomic HEA sample are at a medium level for all tested orientations. For example, Figure 3b plots the stress-strain responses of five HEA samples with different element compositions along the [110] orientation. The maximum Young's modulus is 199.5 GPa for the Cu21Fe18Ni27Cr23Co11 sample, and the minimum value is 183.5 GPa for the Cu31Fe8Ni10Cr32Co19 sample. The maximum yield stress of 9.78 GPa is observed for the Cu20Fe33Ni32Cr10Co5 sample, and the Cu26Fe9Ni9Cr22Co34 sample has the minimum value of 6 GPa. For the equiatomic HEA

Metals 2021 ,Figure 6 .
Figure 6. Results of ML model training and evaluation. The results (UARs in [%]) of eight ML models on the task of (a) yield stress, and (b) Young's modulus. The UARs shown for the dev set are the best UARs achieved with the optimal hyper-parameters tuned for the corresponding model. The UARs shown for the test set are the final performance achieved by the model trained on the train plus dev sets with the optimal hyper-parameters. Prediction results for the 180 non-equiatomic samples in the test set by the KELM model for (c) yield stress, and (d) Young's modulus. The red line is the benchmark line based on the value of the equiatomic sample. The results of the non-equiatomic samples are normalised to the benchmark value: values above the line are classified into the 'Good' zone, and those below the line into the 'Weak' zone. If the ML prediction matches the MD simulation, the point is coloured blue; otherwise it is coloured red.
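The labelling rule and the UAR metric named in this caption can be sketched in a few lines. This is an illustration only: UAR is taken here in its usual sense of unweighted average recall (the mean of the per-class recalls), the treatment of a value exactly on the benchmark line is an assumption, and the toy labels below are hypothetical rather than data from the paper.

```python
def classify(value, benchmark):
    """Label a sample by its value normalised to the equiatomic benchmark.

    Assumption: a value exactly equal to the benchmark is labelled 'Weak'.
    """
    return "Good" if value / benchmark > 1.0 else "Weak"

def uar(y_true, y_pred, classes=("Good", "Weak")):
    """Unweighted average recall: mean of the recalls of each class present."""
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        if not idx:
            continue  # class absent from the reference labels
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical toy labels: MD-derived reference vs. ML prediction.
md_labels = ["Good", "Good", "Good", "Weak"]
ml_labels = ["Good", "Good", "Weak", "Weak"]
toy_uar = uar(md_labels, ml_labels)
```

Unlike plain accuracy, UAR weights both classes equally, so it is not inflated when one class dominates the samples.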


Figure 7. (a) Atomic configuration of the polycrystal sample. The sample contains 16 randomly orientated grains, and the mean grain size is 27.4 nm. (b) Stress-strain response of the polycrystal HEA samples with different element compositions. (c) Snapshots of the polycrystal sample Cu28Fe13Ni23Cr8Co28 at different deformation stages (cross-sectional view along the [110] direction). Atoms are coloured by the CNA method. The green atoms are located in the grains with fcc structure, the blue atoms are located in the grain boundary region, and the red atoms represent the stacking faults created by the slip of dislocations nucleated from the grain boundaries.
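As an illustration of how the two target properties could be read off a stress-strain curve such as those in panel (b), the sketch below makes two assumptions that this section does not confirm: Young's modulus is taken as the least-squares slope through the origin over a small-strain window (the 2% limit is hypothetical), and the yield stress is taken as the peak stress. The curve itself is synthetic, not MD data.

```python
def youngs_modulus(strain, stress, elastic_limit=0.02):
    """Least-squares slope through the origin for strain <= elastic_limit."""
    pairs = [(e, s) for e, s in zip(strain, stress) if e <= elastic_limit]
    num = sum(e * s for e, s in pairs)
    den = sum(e * e for e, _ in pairs)
    return num / den

def yield_point(strain, stress):
    """Return (strain, stress) at the maximum stress of the curve."""
    i = max(range(len(stress)), key=stress.__getitem__)
    return strain[i], stress[i]

# Synthetic curve (assumption): linear loading at 150 GPa up to 4.5% strain,
# followed by a linear stress drop.
E_TRUE, PEAK_STEP, STEP = 150.0, 45, 0.001
strain = [i * STEP for i in range(81)]
stress = [
    E_TRUE * i * STEP if i <= PEAK_STEP
    else E_TRUE * PEAK_STEP * STEP - 100.0 * (i - PEAK_STEP) * STEP
    for i in range(81)
]

modulus = youngs_modulus(strain, stress)       # in GPa for stress in GPa
y_strain, y_stress = yield_point(strain, stress)
```

On real MD output the small-strain window would need to be chosen so that it stays inside the linear elastic regime of every sample.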


Table 1. The results of the ML models on the test set by complementary evaluation metrics (in [%]).

Table 2. Comparison of MD simulation and ML prediction for yield stress and Young's modulus of ten polycrystal HEA samples. x% represents the deviation of the values between the non-equiatomic samples (P1-P10) and the equiatomic sample (P0). '✓' indicates that the ML prediction is consistent with the MD simulation; otherwise, the ML result is marked as '×'.
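The caption does not give a formula for the deviation x%. One plausible reading, stated here as an assumption, is the signed relative difference of a non-equiatomic sample's value from the equiatomic value of P0. The function name and the numbers in the example are hypothetical.

```python
def deviation_percent(value, reference):
    """Signed relative deviation of `value` from `reference`, in percent.

    Assumption: x% = (value - reference) / reference * 100, with P0 as reference.
    """
    return (value - reference) / reference * 100.0

# Hypothetical values in GPa, for illustration only.
p0_value = 10.0
higher = deviation_percent(11.0, p0_value)  # positive: above the equiatomic value
lower = deviation_percent(9.0, p0_value)    # negative: below the equiatomic value
```

Under this reading, the sign of x% decides whether the MD result falls on the 'Good' or the 'Weak' side of the equiatomic benchmark.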