Machine Learning-Based Strength Prediction for Refractory High-Entropy Alloys of the Al-Cr-Nb-Ti-V-Zr System

The aim of this work was to provide guidance for the prediction and design of high-entropy alloys with good performance. New promising compositions of refractory high-entropy alloys with the desired phase composition and mechanical properties (yield strength) have been predicted using a combination of machine learning, phenomenological rules and CALPHAD modeling. The yield strength prediction in a wide range of temperatures (20–800 °C) was made using a surrogate model based on a support-vector machine algorithm. The yield strength at 20 °C and 600 °C was predicted quite precisely (the average prediction error was 11% and 13.5%, respectively), while at 800 °C the prediction error rose to slightly above 20%. An Al13Cr12Nb20Ti20V35 alloy with an excellent combination of ductility and yield strength at 20 °C (16.6% and 1295 MPa, respectively) and at 800 °C (more than 50% and 898 MPa, respectively) was produced based on the prediction.


Introduction
High-entropy alloys (HEAs), which are sometimes also called multi-principal element alloys, were originally discovered by Yeh [1] and Cantor [2]. In contrast to traditional alloys, which are based on one principal element, HEAs are defined as alloys with five or more principal elements in equal or near-equal atomic percentage (5-35 at.%). HEAs have attracted great research interest [3][4][5][6] due to their high strength (including high-temperature strength), structural stability, hardness, and wear resistance, as well as good corrosion and oxidation resistance [3][4][5][6][7][8][9]. Their superior properties enable their application in a wide range of modern industries, for example, as high-temperature materials for future aerospace vehicles.
Promising candidates for a new generation of high-temperature materials are HEAs based on refractory elements (RHEAs). The first RHEAs, which consisted of several refractory elements (Mo, Nb, Ta, V and W), showed high strength up to 1600 °C but had a high density (>12 g/cm³) [10,11]. At such high densities, the applicability of these alloys is significantly limited. Therefore, attention was focused on the development of lighter alloys. The use of lighter refractory elements made it possible to reduce the density of the alloys considerably. For example, the specific strength of a CrNbTiVZr alloy with a density of 6.57 g/cm³ was found to be higher than that of commercial nickel-based alloys [12,13]. Further development of this approach can be associated with including low-density non-refractory elements (such as Al or Si) in RHEAs. Thus, modern RHEAs can contain a wider range of elements (Ti, Zr, Hf, V, Nb, Ta, Cr, Mo, W, Al, Co, Ni, Si) [3,14,15].
Basically, high-entropy alloys provide a vast compositional space for the design of new alloys. On the one hand, a huge compositional space provides an ampler opportunity to obtain alloys with improved properties. On the other hand, the development of new

Computational Predictions
The algorithm for the model alloy selection is shown in Figure 1. At the first stage, the region of the composition space of the Al-Cr-Nb-Ti-V-Zr system was selected. The size of this region is defined by the maximum and minimum concentrations of the components. The concentration range was not limited to the equiatomic composition and was extended to the interval of 0-50 at.% for Nb, Ti, V and Zr and 0-15 at.% for Al and Cr. The lower concentrations of Al and Cr were used to avoid the formation of intermetallic phases and/or ordering of the matrix phase [24,55]. Since Al and Cr had a narrower concentration range, a 1% step of concentration change was used; for the other elements the step was 5%. The total number of potential alloys was therefore 29,269.
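A minimal sketch of how such a composition grid could be enumerated is given below. It assumes only the ranges and steps stated above plus the closure condition that the concentrations sum to 100 at.%; the function name is hypothetical. Any further filters applied in the original selection are not stated here, so the number of grid points produced by this sketch need not equal the 29,269 reported.

```python
from itertools import product


def enumerate_compositions():
    """Yield (Al, Cr, Nb, Ti, V, Zr) tuples in at.% on the grid described in the text."""
    minor = range(0, 16)       # Al, Cr: 0-15 at.% in 1% steps
    major = range(0, 51, 5)    # Nb, Ti, V, Zr: 0-50 at.% in 5% steps
    for al, cr, nb, ti, v in product(minor, minor, major, major, major):
        zr = 100 - (al + cr + nb + ti + v)   # Zr closes the balance to 100 at.%
        if 0 <= zr <= 50 and zr % 5 == 0:    # keep only points where Zr is on its grid
            yield (al, cr, nb, ti, v, zr)


grid = list(enumerate_compositions())
print(len(grid), "grid points")
```

Fixing Zr by difference rather than looping over it keeps the enumeration to a few hundred thousand iterations.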

Machine Learning
The yield strength of metallic materials can be either measured directly or predicted; physically based prediction, however, usually relies on rather complicated and lengthy calculations [63,64]. When thousands of alloys are considered, the use of a surrogate (approximate) model is more reasonable. This model is trained on a dataset that includes known values of the target characteristic and a set of corresponding features. A trained surrogate model can predict the values of the characteristic for alloys which were not used for training. In comparison with the rigorous calculation, the accuracy of this approach is usually lower, but the procedure is significantly easier. In this work, the machine learning approach was used to create a surrogate model for the prediction of the yield strength.
The accuracy of the surrogate model strongly depends on the dataset, the set of features, and the machine learning algorithm. Since our model focuses on the Al-Cr-Nb-Ti-V-Zr system, the dataset included only those alloys which consisted of these elements [12,51,55,62]. The dataset sizes for room temperature, 600 °C and 800 °C were 30, 35 and 33 alloys, respectively. The datasets did not include data for alloys which fractured in the elastic strain range, which is why the dataset for room temperature was the smallest. The set of features (δ, VEC, ΔHmix, etc.; Table 1) was chosen based on an analysis of the literature. These features are related to the intrinsic properties of the elements which influence the formation of a solid solution, amorphous phase and/or intermetallic compound in HEAs, and affect the final yield strength [16,39,41,[65][66][67].

Table 1. List of input features for the surrogate prediction model.

Feature Equation for Feature Calculation
The difference in atomic radii between elements (δ)

In order to reduce the computing time and to improve the efficiency of the surrogate prediction model, a correlation analysis was used to remove unnecessary features. A map of Pearson correlation coefficients between the features was constructed (Figure 2). The correlation coefficient is calculated as follows:

$$r_{xy} = \frac{\sum_i (x_i - m_x)(y_i - m_y)}{\sqrt{\sum_i (x_i - m_x)^2 \, \sum_i (y_i - m_y)^2}}$$

where m_x is the mean of the vector x and m_y is the mean of the vector y. Each pair of features with a correlation coefficient greater than 0.95 was considered highly correlated, and one feature of the pair was excluded from the model. The correlations of δ-Λ and µ-∆µ were found to exceed 0.95; therefore, δ and µ were omitted.
The choice of the optimal machine learning algorithm included a few stages. Firstly, seven well-known machine learning algorithms [68], such as a ridge regression algorithm (rid), support vector regressions with a linear kernel (svr.lin), a polynomial kernel (svr.poly), and a radial basis function kernel (svr.rbf), a regression tree algorithm (tree) and a k-nearest neighbor algorithm (knn) were compared. In addition to the training data, each algorithm has its own parameters (hyperparameters), so the prediction accuracy also depends on the values of the hyperparameters. A grid search with root mean square error estimation was used to select the optimal values of the hyperparameters for each algorithm.
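The correlation-based feature pruning described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the function names and the toy feature values are hypothetical, and the choice of which member of a correlated pair to drop (here, the one encountered later) is arbitrary, since the text does not state the rule applied.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den  # undefined (division by zero) for a constant feature


def drop_correlated(features, threshold=0.95):
    """Keep a feature only if its correlation with every kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(pearson(values, features[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values for illustration only; they are not data from this work.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.9],   # nearly proportional to f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(features)
print(kept)
```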
To calculate the prediction accuracy, the initial dataset was split into a training dataset and a testing dataset. Since the initial dataset was rather small (30-35 alloys, depending on temperature), it was important to choose the optimal ratio between the training and testing datasets for all the algorithms in order to attain the best prediction accuracy. To this end, the size of the training dataset was varied in an interval of 0.3-0.9 of the full dataset. The surrogate models were trained using the training dataset, and then the models were used to predict the yield strengths of the alloys in the testing set and to calculate the root mean square error. The root mean square error as a function of the training dataset size for all algorithms used for the yield strength prediction is shown in Figure 3. The error depends only slightly on the training dataset size for two algorithms, tree and knn; for the other algorithms the error decreases with an increase in the size of the training set. The optimal size of the training dataset was set to 0.7 of the whole initial dataset.
After that, the most efficient algorithm was determined using bootstrap with replacement, a well-known statistical approach [68]. A total of 50 bootstrap datasets, each with a size of 0.7 of the initial dataset, were created by choosing random alloys from the initial dataset; an alloy can be used more than once, even within one bootstrap dataset. These bootstrap datasets were used to train the algorithms, which then predicted all the data points in the initial dataset, and the root mean square error was calculated for each machine learning algorithm. Of the two algorithms showing the minimum prediction error (svr.rbf and rid; Figure 4), svr.rbf was trained and used for the prediction of the yield strength of HEAs at 20 °C, 600 °C and 800 °C.
A cross-validation approach was also used for pretesting of the svr.rbf algorithm.
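The bootstrap comparison of algorithms can be illustrated with a small self-contained sketch. It assumes the scheme described above (50 resamples of 0.7 of the dataset drawn with replacement, with the error evaluated on all data points), but substitutes two toy regressors, a k-nearest-neighbour model and a mean-value baseline, for the algorithms actually compared; the data are synthetic, and all function names are hypothetical.

```python
import random
from math import sqrt


def rmse(y_true, y_pred):
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def knn_model(train_X, train_y, X, k=3):
    """Toy k-nearest-neighbour regressor: mean target of the k closest training points."""
    preds = []
    for x in X:
        order = sorted(range(len(train_X)),
                       key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
        nearest = order[:k]
        preds.append(sum(train_y[i] for i in nearest) / len(nearest))
    return preds


def mean_model(train_X, train_y, X):
    """Baseline that always predicts the mean of the training targets."""
    m = sum(train_y) / len(train_y)
    return [m] * len(X)


def bootstrap_rmse(model, X, y, n_boot=50, frac=0.7, seed=0):
    """Mean RMSE over bootstrap resamples; each model is scored on the full dataset."""
    rng = random.Random(seed)
    n = round(frac * len(X))
    errors = []
    for _ in range(n_boot):
        idx = [rng.randrange(len(X)) for _ in range(n)]   # sampling with replacement
        preds = model([X[i] for i in idx], [y[i] for i in idx], X)
        errors.append(rmse(y, preds))
    return sum(errors) / n_boot


# Synthetic data for illustration only (two features, 30 "alloys").
rng = random.Random(1)
X = [(rng.random(), rng.random()) for _ in range(30)]
y = [1000 + 400 * a - 200 * b for a, b in X]

scores = {name: bootstrap_rmse(m, X, y)
          for name, m in [("knn", knn_model), ("mean", mean_model)]}
best = min(scores, key=scores.get)
print(scores, best)
```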
Figure 5 shows the comparison of the predicted and experimental values of the yield strength at 20 °C, 600 °C and 800 °C; the predicted values were obtained using the svr.rbf surrogate model. One can see that the prediction for room temperature was more accurate than that for high temperatures. The bagging (bootstrap aggregating) approach [68] was used to improve the prediction accuracy. To this end, 1000 datasets, each with a size of 0.7 of the initial dataset, were randomly selected for (i) training the surrogate model, (ii) predicting the yield strength and (iii) calculating the average yield strength. The alloys with a yield strength below the average value were excluded from consideration, which reduced the number of alloys to 11,770. Then, the phenomenological phase formation criteria and the CALPHAD approach were used to select predominantly single-phase alloys.
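The bagging and screening step can be sketched as follows. The sketch assumes that the 1000 datasets are drawn with replacement and that the "average value" used for screening is the mean of the bagged predictions over all candidates; neither detail is stated explicitly above. A one-variable least-squares line stands in for the svr.rbf model, and the data and names are placeholders.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def bagged_predictions(xs, ys, candidates, n_models=1000, frac=0.7, seed=0):
    """Average the predictions of n_models models, each trained on a random resample."""
    rng = random.Random(seed)
    n = round(frac * len(xs))
    totals = [0.0] * len(candidates)
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in range(n)]   # resample with replacement
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        for j, c in enumerate(candidates):
            totals[j] += a + b * c
    return [t / n_models for t in totals]


# Placeholder training data: one descriptor and a yield strength in MPa.
xs = [0, 2, 4, 6, 8, 10, 12, 14]
ys = [900 + 30 * x for x in xs]
candidates = [1, 5, 9, 13]

preds = bagged_predictions(xs, ys, candidates)
mean_pred = sum(preds) / len(preds)
# Keep only candidates whose bagged prediction is not below the average.
shortlist = [c for c, p in zip(candidates, preds) if p >= mean_pred]
print(shortlist)
```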

Phenomenological Rules
First, phenomenological models of phase formation in HEAs were used to select single-phase alloys. The advantages and limitations of these models have been thoroughly discussed elsewhere [25]. The general purpose of this step was to reduce the computing time, since CALPHAD calculations take much longer than calculations with the phenomenological models. The alloys were considered single-phase if δ < 5.4%, VEC < 6.87, −16.25 kJ/mol ≤ ∆H_mix ≤ 5 kJ/mol, Ω > 1.1, ϕ > 7 and η > 0.19 [69]. The equations used for the calculations are presented in Table 1.
After the calculations, 1250 candidate alloys with a presumably single-phase structure were selected for further consideration.
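The thresholds quoted above can be applied as a simple filter, sketched below. The threshold values are those given in the text; the function names are hypothetical, and the VEC is computed with the commonly used definition (composition-weighted average of the valence electron counts of the elements), which is assumed here because the equations of Table 1 are not reproduced. The remaining descriptors are taken as precomputed inputs.

```python
# Valence electron counts of the elements in the Al-Cr-Nb-Ti-V-Zr system.
VALENCE_ELECTRONS = {"Al": 3, "Ti": 4, "Zr": 4, "V": 5, "Nb": 5, "Cr": 6}


def vec(composition):
    """Valence electron concentration; composition maps element -> at.%."""
    total = sum(composition.values())
    return sum(c * VALENCE_ELECTRONS[el] for el, c in composition.items()) / total


def passes_phase_criteria(delta, vec_value, dh_mix, omega, phi, eta):
    """Single-phase criteria from the text; delta in %, dh_mix in kJ/mol."""
    return (delta < 5.4
            and vec_value < 6.87
            and -16.25 <= dh_mix <= 5
            and omega > 1.1
            and phi > 7
            and eta > 0.19)


# Composition of the alloy named in the abstract (Al13Cr12Nb20Ti20V35).
alloy = {"Al": 13, "Cr": 12, "Nb": 20, "Ti": 20, "V": 35}
print(round(vec(alloy), 2))  # 4.66
```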

CALPHAD Calculations
At the last stage, the densities of the 1250 potential alloys were calculated using the rule of mixtures. Then, the specific yield strength of each potential alloy was calculated as the ratio of the predicted yield strength to the calculated density. A total of 80 alloys with the highest sum of the specific yield strengths at 20 °C, 600 °C and 800 °C were chosen. Their phase compositions were calculated using Thermo-Calc (version 2020a) and the TCHEA3 database (High Entropy Alloys version 3.1) for 1200 °C, since this temperature is usually used for homogenization annealing of Al-Cr-Nb-Ti-V-Zr RHEAs [51][52][53][54][55][56][57][58][59][60][61][62]. As a result, six model alloys were selected (Table 2). Four of them possessed the greatest sum of specific yield strengths and were either single-phase or contained less than 10% of a second phase(s). Another two alloys (A3 and A4) were chosen based only on high specific yield strength values, irrespective of the phase composition. The experimental values of the yield strength of the model alloys were used to assess the accuracy of the computational predictions. Additionally, the obtained experimental values of strength were included in the dataset for the development of the next generation of new alloys.
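A sketch of the density and specific yield strength calculation is given below. It assumes one common form of the rule of mixtures for atomic fractions, ρ = Σ c_i A_i / Σ (c_i A_i / ρ_i), since the exact form used is not stated; the atomic masses and elemental densities are approximate handbook values, and the function names are hypothetical. The example uses the measured room-temperature yield strength of the alloy from the abstract rather than a predicted value.

```python
# element -> (atomic mass in g/mol, density in g/cm^3); approximate handbook values
ELEMENTS = {
    "Al": (26.98, 2.70), "Cr": (52.00, 7.19), "Nb": (92.91, 8.57),
    "Ti": (47.87, 4.51), "V": (50.94, 6.11), "Zr": (91.22, 6.51),
}


def mixture_density(composition):
    """Density in g/cm^3 for a composition given as element -> at.%."""
    mass = sum(c * ELEMENTS[el][0] for el, c in composition.items())
    volume = sum(c * ELEMENTS[el][0] / ELEMENTS[el][1] for el, c in composition.items())
    return mass / volume


def specific_yield_strength(ys_mpa, composition):
    """Yield strength divided by density, in MPa cm^3/g."""
    return ys_mpa / mixture_density(composition)


alloy = {"Al": 13, "Cr": 12, "Nb": 20, "Ti": 20, "V": 35}   # Al13Cr12Nb20Ti20V35
rho = mixture_density(alloy)
sys_20 = specific_yield_strength(1295, alloy)               # 1295 MPa at 20 °C (abstract)
print(round(rho, 2), round(sys_20, 1))
```

Ranking candidates by the sum of this quantity at the three temperatures would then reduce to sorting a list of such values.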

Experiment
The model alloys were produced by vacuum arc melting of appropriate mixtures of pure metals (purity better than 99.9 wt.%) in a Ti-gettered argon atmosphere. The measured compositions of the model alloys are listed in Table 3. The alloys were remelted five times to improve their homogeneity. The obtained ingots were then sealed in evacuated (10⁻² torr) quartz tubes and soaked at 1200 °C for 10 h. After the homogenization annealing, samples for compression tests and microstructure investigation were cut using an electric discharge machine.
The phase composition was studied using X-ray diffraction (XRD) on a RIGAKU diffractometer with CuKα radiation. SEM investigations were carried out using either FEI Quanta 600 FEG or Nova NanoSEM microscopes; both instruments were equipped with back-scattered electron (BSE) and energy-dispersive X-ray spectroscopy (EDS) detectors. Specimens for structural investigations were finished with OP-S suspension (abrasive particle size of 0.04 µm). The chemical composition of the alloys was measured using SEM-EDS with a scanning area of 2 × 2 mm².
The microhardness was measured using a Wolpert Group 402MVD microhardness tester. The load and dwell time were 300 g and 10 s, respectively. The microhardness value was averaged over five measurements. Rectangular specimens measuring 8 × 5 × 5 mm³ were compressed using an Instron 300LX testing machine equipped with a radial heating furnace. The tests were carried out at 20 °C, 600 °C or 800 °C with an initial strain rate of 10⁻⁴ s⁻¹ to a height reduction of 50% (or to fracture).

Machine Learning Prediction of Composition-Properties Relationships in Alloys of the Al-Cr-Nb-Ti-V-Zr System
The number of alloys with a fixed percentage of any given element (while the others vary in the intervals 0-15% for Al and Cr or 0-45% for Nb, Ti, V and Zr) can reach several thousand. To evaluate the main trends in the influence of each element on strength, the maximum, minimum and average values (upper, lower, and middle points of each error bar in Figure 6) of the specific yield strength of alloys at fixed contents of each element (Al, Cr, Nb, Ti, V or Zr) were analyzed. For titanium, vanadium, zirconium, and niobium, the dependence of the specific yield strength on the element content followed a parabolic law. An increase in the concentration of titanium (Figure 6d
For all elements, the dependence of the specific yield strength on concentration (below 15%) can be approximated linearly with a Pearson's coefficient of more than 0.85. For further insight into the effect of the elements (and their content) on the specific yield strength of the Al-Cr-Nb-Ti-V-Zr system alloys, the slope of the specific yield strength-concentration curves in the interval 0-15% was analyzed. Figure 7 shows that the slopes for Al, Cr and Zr are higher than those for the other elements at all investigated temperatures. Thus, the contents of Al, Cr and Zr have the greatest influence on the specific yield strength, while the effect of V and Ti is lower; the addition of Nb has a negative effect on the specific yield strength.
Chromium and zirconium have the smallest and largest atomic radii (129 and 160 pm, respectively), and it can be assumed that these elements make the main contribution to hardening through solid solution hardening. The atomic radius of aluminium (143 pm) is close to those of the remaining elements, so its contribution to solid solution hardening is not as great as that of chromium and zirconium.
Figure 7. Slope of the curves (see Figure 6) for all elements at concentrations below 15% at 20 °C, 600 °C or 800 °C.
Aluminium can, however, contribute to the ordering of alloys and the formation of intermetallic compounds, and it can be assumed that its main contribution to strengthening is associated with the formation of secondary phases. For Al and Cr, the maximum yield strength was observed at a concentration of 14% in each case. At the same time, the maximum values of specific yield strength at room temperature corresponded to 40% of Nb, 20% of V and 35% of Zr. Based on these concentrations, promising regions of the composition space of Al-Cr-Nb-Ti-V-Zr high-entropy alloys were selected. For all the selected model alloys, the concentration of at least one element corresponded to the maximum strength.
In the A1 (

Comparison between the Predicted and Actual Structure of Al-Cr-Nb-Ti-V-Zr Alloys
Phase diagrams for the model alloys obtained using the CALPHAD approach (Thermo-Calc software) are shown in Figure 8. All the model alloys crystallize through a single bcc phase field. However, at 1200 °C (the heat treatment temperature used in the current study) only the A1 and A5 alloys have a single bcc phase structure. In the rest of the alloys, a secondary hexagonal (C14) Laves phase was expected to appear between the solidus temperature and 1200 °C. The fraction of the Laves phase was evaluated to be <0.1 in A6, and ~0.13 and 0.21 in the A4 and A3 alloys, respectively. The fraction of the Laves phase gradually increased with a decrease in temperature for most of the alloys, with the exception of the A5 alloy, which retains the single bcc phase structure down to 700 °C, and the A2 alloy, where some amount of the Zr3Al2 phase formed at T < 840 °C.
XRD analysis (Figure 9) suggests that the A4 alloy consists of the bcc phase only. The A5 and A2 alloys have additional tiny peaks due to the presence of the Laves phases. The other model alloys contain, in addition to the bcc phase, more than one phase (Laves and hcp phases in A1; Laves and Zr5Al3 in both A3 and A6). The presence of the C14 Laves phase in the A1, A3 and A6 alloys agrees with the CALPHAD calculations (Figure 9).
However, CALPHAD predicted neither the single-phase condition of the A4 alloy nor the appearance of the Zr5Al3 phase in A3. Some discrepancies between CALPHAD-based predictions and experimental results are well documented [38,70] and are therefore not surprising. Nevertheless, the CALPHAD approach gives quite reliable qualitative data, particularly in combination with other prediction methods, and can therefore be used for the assessment of expected phase compositions in developed alloys.
Microstructures of the model alloys are shown in Figure 10. SEM images of all alloys demonstrate multiphase structures. Microstructures of the alloys A2 and A4 consist of grains ~100-200 µm in size with second-phase precipitates located mainly along grain boundaries. In the A4 and A2 alloys the second phase particles form a continuous intergranular layer with a thickness from 0.6 µm (in A4) to 4.8 µm (in A2). The volume fraction of the secondary phases was ~1% in A4 and ~27% in A2. The alloys A1, A3, A5 and A6 instead have a dendritic microstructure with second phase(s) located in the interdendritic areas.
In the A1, A3, and A5 alloys, the second phase(s) are mostly present as separate particles, while in A6 the second phase forms a continuous network. The volume fraction of the second phase(s) was ~6% in A1, 41% in A3, 10% in A5 and 7% in A6. Chemical compositions of the phases in the model alloys are shown in Table 3; a more detailed investigation of the microstructures was beyond the scope of the present work.
For four of the six model alloys, the amount of the second phase(s) was less than 10%. However, differences between the measured phase compositions (Table 3) and the CALPHAD calculations were more pronounced. For the alloys A4 and A5, the content of the second phase was small and therefore the peaks of the second phase were not observed in the X-ray diffraction pattern. For A2, the calculated phase composition corresponded to the actual one only qualitatively. Some discrepancies between the calculated phase composition and the actual one were observed in A3. Meanwhile, for the A1 alloy, one can notice the presence of a phase that was not predicted by the CALPHAD calculation.

Comparison of Predicted and Measured Mechanical Properties of the Al-Cr-Nb-Ti-V-Zr Alloys
The maximum and minimum microhardness values were observed in the A3 alloy (650 HV) and the A6 alloy (489 HV), respectively (Table 4). Four alloys (A1, A2, A4 and A5) have microhardness values in a narrow interval of 540-556 HV.
Compression stress-strain curves of the model alloys at 20 °C, 600 °C and 800 °C are shown in Figure 11. The measured and predicted values of the mechanical properties are listed in Table 4. The yield strengths of the model alloys at 20 °C range from 1049 MPa for the A2 alloy to 1608 MPa for the A3 alloy; the latter showed the maximum microhardness as well. However, most of the model alloys have a room-temperature yield strength of around 1300 MPa. Ductility over 1% was observed in the A5 and A6 alloys (16.6% and 13.8%, respectively); A4 showed ~1% ductility. The A1 and A2 alloys fractured in the elastic region; for these specimens the (yield) strength values were estimated from microhardness tests. The ratio of yield strength to microhardness was estimated using the corresponding values for the more ductile alloys (i.e., A3, A4, A5 and A6). This ratio was found to be 2.38; therefore, the estimated strength values for the A1 and A2 alloys can be taken as 1316 and 1323 MPa, respectively.
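The hardness-based estimate described above amounts to a single multiplication, which can be sketched as follows. The ratio of 2.38 is the value reported in the text. The microhardness values for A1 and A2 (553 and 556 HV) are assumptions back-calculated from the quoted strengths and the 540-556 HV interval, since Table 4 is not reproduced here. The function name is illustrative.

```python
# Ratio of yield strength (MPa) to microhardness (HV), as reported in the text.
YIELD_TO_HARDNESS_RATIO = 2.38


def estimate_yield_strength(hardness_hv, ratio=YIELD_TO_HARDNESS_RATIO):
    """Estimate the yield strength (MPa) of a brittle alloy from its microhardness (HV)."""
    return ratio * hardness_hv


# ASSUMED microhardness values (HV): back-calculated, not taken from Table 4.
assumed_hardness = {"A1": 553, "A2": 556}

estimated_strength = {
    alloy: round(estimate_yield_strength(hv)) for alloy, hv in assumed_hardness.items()
}

for alloy, strength in estimated_strength.items():
    print(f"{alloy}: ~{strength} MPa")
```

With these assumed hardness values the sketch reproduces the 1316 and 1323 MPa estimates quoted for A1 and A2.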
At 600 °C, the highest and the lowest yield strengths were shown by the A3 and A6 alloys (1385 and 1048 MPa, respectively), similar to 20 °C. The yield strengths of the other model alloys were around 1100 MPa. All alloys showed some ductility (i.e., did not fracture in the elastic region), yet only the A1 and A5 alloys had a ductility over 5% (17.2% and 5.5%, respectively). At 800 °C, none of the model alloys fractured up to 50% height reduction. Only three model alloys (A3, A5 and A6) showed a yield strength above 300 MPa (556, 898 and 509 MPa, respectively). For the other alloys, the yield strengths ranged between 152 and 287 MPa. The best combination of strength and ductility at all tested temperatures was demonstrated by the A5 alloy. A comparison of this alloy with 47 various RHEAs of the Al-Cr-Nb-Ti-V-Zr system and equiatomic four-, five- and six-component alloys of the Nb-Ti-V-Zr-Mo-Ta-Hf-W system collected in [14] has shown that the yield strength/density ratio of the A5 alloy at 800 °C is one of the highest. Only two alloys (AlCr2NbTiV and Al0.5CrNbTiVZr) have a comparable density (5.95 and 6.23 g/cm³, respectively) and a higher yield strength (970 MPa in both cases). The other alloys have either lower strength or higher density (or both).
The experimental yield strength values and those predicted by the machine learning method are shown in Table 4 and Figure 12. The surrogate model showed good prediction accuracy at 20 °C and 600 °C; at 800 °C the prediction error was more pronounced. The mean prediction error is 7% at 20 °C and 12% at 600 °C, which is comparable to the accuracy of similar predictive systems. For example, the accuracy of hardness prediction for the Al-Co-Cr-Cu-Fe-Ni system is near 80% [47,71]. Li et al. [49] reported a mean error of less than 2% between the strength obtained from molecular dynamics tensile simulations and that predicted by machine learning. At 800 °C, the surrogate model showed a prediction error of less than 20% for only two model alloys, whereas in [50] a prediction accuracy of 95% was reported at 800 °C for the MoNbTaTiW and HfMoNbTaTiZr high-entropy alloys. Thus, the proposed model for predicting the yield strength has good accuracy at room temperature and 600 °C, but its accuracy is insufficient at higher temperatures.
In this work, only alloys of the Al-Cr-Nb-Ti-V-Zr system were used to train the surrogate model. Enlarging the training set with alloys of the Al-Cr-Nb-Ti-V-Zr-Mo-Ta-Hf-W system does not increase the accuracy but slightly decreases it (by 0.5% at 20 °C and 3% at 600 °C; at 800 °C the accuracy decreases by a factor of 1.5). However, expanding the training dataset with newly obtained alloys can improve the prediction accuracy. When the six model alloys obtained in this work are included in the dataset, the standard deviation at 20 °C decreases by 20% (from 145 to 116) and at 600 °C by 6% (from 161 to 151); at 800 °C, the enlarged sample did not affect the standard deviation.
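The percentages quoted for the change in standard deviation can be reproduced with a short calculation. The sketch below also includes a helper for the relative prediction error of a single alloy. Its definition (absolute deviation divided by the measured value) is an assumption, since the exact error metric is not spelled out in this section, and all function names are illustrative.

```python
def relative_error_percent(predicted, measured):
    """Absolute prediction error as a percentage of the measured value (ASSUMED metric)."""
    return abs(predicted - measured) / measured * 100.0


def mean_error_percent(pairs):
    """Mean relative error over (predicted, measured) pairs."""
    return sum(relative_error_percent(p, m) for p, m in pairs) / len(pairs)


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Standard deviations before and after adding the six model alloys (from the text).
std_change = {
    "20 C": percent_reduction(145, 116),
    "600 C": percent_reduction(161, 151),
}

for temperature, change in std_change.items():
    print(f"{temperature}: standard deviation reduced by {change:.1f}%")
```

The two reductions evaluate to about 20% and 6%, in line with the values given in the text.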
transition from an athermal plateau to strong temperature dependence in bcc metals, seems more important. This transition was observed in high-entropy alloys with a bcc lattice, as well as in conventional bcc metals and alloys, at temperatures of about (0.4-0.5) Tm. This means that for some alloys from the training dataset the athermal plateau can be observed at 800 °C, while other alloys demonstrate a strong temperature dependence. This heterogeneity in the training dataset may result in a severe spread in the yield strength values, thereby decreasing the prediction accuracy at 800 °C.
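The argument above can be illustrated by converting the test temperature to a homologous temperature T/Tm. The sketch below is a simplification under stated assumptions: it treats 0.4 Tm and 0.5 Tm as sharp bounds of the transition, ignores the low-temperature regime below the athermal plateau, and uses hypothetical melting temperatures that are not taken from the training dataset. Under these assumptions, 800 °C (1073.15 K) falls inside the transition window for alloys with Tm between roughly 2150 K and 2680 K.

```python
def homologous_temperature(test_temp_c, melting_temp_k):
    """Return T/Tm, with the test temperature converted from Celsius to kelvin."""
    return (test_temp_c + 273.15) / melting_temp_k


def regime_at(test_temp_c, melting_temp_k, lower=0.4, upper=0.5):
    """Classify the deformation regime using the (0.4-0.5) Tm window from the text."""
    ratio = homologous_temperature(test_temp_c, melting_temp_k)
    if ratio < lower:
        return "athermal plateau"
    if ratio > upper:
        return "strong temperature dependence"
    return "transition range"


# HYPOTHETICAL melting temperatures (K), for illustration only.
for melting_temp_k in (1900.0, 2400.0, 2900.0):
    ratio = homologous_temperature(800.0, melting_temp_k)
    print(f"Tm = {melting_temp_k:.0f} K: T/Tm = {ratio:.2f}, {regime_at(800.0, melting_temp_k)}")
```

Alloys with different melting temperatures therefore land in different regimes at the same test temperature, which is the heterogeneity the text refers to.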

Conclusions
A combined approach, including phenomenological rules, CALPHAD and machine learning, was used in the search for alloys with desirable properties (phase composition and yield strength). As a result, the following conclusions were made:

1. The combination of CALPHAD and phenomenological rules does not give an accurate prediction of the phase composition of the alloys; only one of the model alloys had the desired single-phase structure. However, in four model alloys the amount of the second phase(s) did not exceed 10%, which suggests that the approach has good potential for selecting alloys with a desirable phase composition.
2. The surrogate model based on a support-vector machine algorithm showed good accuracy in predicting the yield strength at 20 °C and 600 °C (the prediction error was less than 20% for all alloys except one). At 800 °C, however, the prediction error was below 20% for only two model alloys. The relatively low prediction accuracy at 800 °C can be associated with the proximity of this temperature to the transition point between the athermal plateau and the strong temperature dependence in bcc alloys, which in turn causes a severe spread in the yield strength of the training dataset alloys.
3. For the Al-Cr-Nb-Ti-V-Zr system, the contents of aluminum, chromium and zirconium have the greatest influence on the specific yield strength. The effect of vanadium and titanium is smaller, and the addition of niobium has a negative effect on the specific yield strength.
4.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available on request from the corresponding author.