Inverse Design of Aluminium Alloys Using Genetic Algorithm: A Class-Based Workflow

Bhat, Ninad; Barnard, Amanda S.; Birbilis, Nick

doi:10.3390/met14020239


by Ninad Bhat ¹,*, Amanda S. Barnard ² and Nick Birbilis ¹,³,*

¹ School of Engineering, College of Engineering, Computing & Cybernetics, The Australian National University, Canberra, ACT 2601, Australia

² School of Computing, College of Engineering, Computing & Cybernetics, The Australian National University, Canberra, ACT 2601, Australia

³ Faculty of Science, Engineering and the Built Environment, Deakin University, Melbourne, VIC 3216, Australia

* Authors to whom correspondence should be addressed.

Metals 2024, 14(2), 239; https://doi.org/10.3390/met14020239

Submission received: 26 December 2023 / Revised: 30 January 2024 / Accepted: 13 February 2024 / Published: 16 February 2024

(This article belongs to the Special Issue Advanced Applications of Artificial Intelligence in Metallic Materials Processing)


Abstract:

The design of aluminium alloys often encounters a trade-off between strength and ductility, making it challenging to achieve desired properties. Adding to this challenge is the broad range of alloying elements, their varying concentrations, and the different processing conditions (features) available for alloy production. Traditionally, the inverse design of alloys using machine learning involves combining a trained regression model for the prediction of properties with a multi-objective genetic algorithm to search for optimal features. This paper presents an enhancement in this approach by integrating data-driven classes to train class-specific regressors. These models are then used individually with genetic algorithms to search for alloys with high strength and elongation. The results demonstrate that this improved workflow can surpass traditional class-agnostic optimisation in predicting alloys with higher tensile strength and elongation.

Keywords:

aluminium alloys; alloy design; machine learning; genetic algorithm; optimisation; inverse design

1. Introduction

Aluminium (Al) alloys remain integral to numerous industries, including aerospace, construction, electronics, transportation, and marine [1,2,3]. Pure Al is not used as a structural material owing to its low strength, with alloying being critical to enhancing strength and physical properties [4]. The use of Al alloys is essential in many applications for minimising weight due to their high strength-to-weight ratio; however, high strength often comes at the cost of lower ductility. Structure–property relationships in Al alloys permit the development of alloys with diverse properties that can be tailored for specific applications. For instance, 5xxx series Al alloys (based on the Al-Mg system) are chosen for marine applications owing to their corrosion resistance [4], while 7xxx series alloys (based on the Al-Zn-Mg system) are preferred in applications that require high strength and damage tolerance [5].

The experimental design of Al alloys has traditionally relied on a trial-and-error approach, which faces inherent challenges due to the number of alloying elements and different processing conditions (features) available for the production and fabrication of alloys [5]. The trial-and-error approach primarily focuses on the isolated examination of a single feature [6,7,8,9], which, in the case of Al alloys, may include the effect of the concentration of an alloying element or the impact of varying processing parameters. While readily interpretable, such approaches are ineffective for efficiently investigating simultaneous changes in multiple features in alloys, particularly in Al alloys that may include more than 15 alloying elements at different concentrations [5]. The relationship between alloying elements and processing conditions and mechanical properties (targets) is non-linear, posing significant challenges in the design of Al alloys.

Machine learning has emerged as a powerful tool for identifying non-linear relationships in metallic alloys (including Al alloys), successfully predicting mechanical properties based on alloy compositions and processing conditions [10,11,12,13,14,15,16]. Random forest models have predicted tensile strength and elongation in wrought Al alloys with 11% and 14% error rates, respectively [15]. In Al-Mg-Si alloys, random forest models have also been reported to perform better than neural network and support vector regression in tensile strength prediction, achieving an error rate of 2.87% on a test set [12]. Despite these advancements in forward-predictive models, inverse design [17], which involves creating alloys based on target properties, remains a complex task. The exhaustive exploration of all possible alloy combinations for inverse design is infeasible due to the vast combinatorial space, which includes multiple alloying elements and various processing conditions [18,19].

Multi-objective optimisation algorithms, particularly genetic algorithms (a type of evolutionary computing), have been extensively used for inverse design, thus addressing the complexities of exploring vast combinatorial spaces [12,20,21,22,23,24]. For instance, Feng et al. [12] combined a random forest model with the Non-dominated Sorting Genetic Algorithm (NSGA-II) to optimise strength and ductility in Al-Mg-Si alloys. This led to the successful prediction of an Al alloy with alloying concentrations of 0.74% Mg, 0.78% Si, and 0.37% Cu. Experimentally, this alloy composition demonstrated superior performance to the commonly used AA6013 alloy, with a tensile strength of 410 MPa and an elongation of 15.2%. Genetic programming, in combination with NSGA-II, has been utilised to enhance the strength and ductility of age-hardenable Al alloys. This approach led to the formulation of an alloy with a high Zn concentration of 5 wt% and moderate levels of Cu and Mg, each at 2 wt%. The resultant alloy showed a tensile strength of 356 MPa and an elongation of 13% when peak-aged. Similarly, rough fuzzy models have been integrated with genetic algorithms to design Al alloys that exhibit high yield strength and elongation at cryogenic temperatures [21,25]. Cu and Mg emerged as critical alloying elements, with the optimal composition identified as Al-Cu-Mg-Si alloys, with Cu concentrations varying from 0.82 to 2.03 wt% and Mg concentrations between 0.72 and 1.48 wt%.

The design of Al alloys using machine learning has predominantly focused on specific subsets or classes of aluminium alloys, such as the Al-Mg-Si series [12] or age-hardenable alloys [20,21,24]. However, it remains unclear whether optimising properties within these specific classes yields more beneficial properties compared to a broader optimisation strategy that uses the entire dataset. For mechanical property prediction, data-driven class-based models have been reported to have higher accuracy [13]. It is unclear whether class-based optimisation offers any advantages in the context of alloy design using genetic design.

This study presents a workflow to improve the performance of traditional multi-objective optimisation-based design. The proposed workflow utilises data-driven classes by training class-specific regressors. These regressors are used to optimise the two objectives, tensile strength and elongation, for each class. Additionally, this study uses recursive feature elimination to refine and reduce each class’s feature space. The performance of class-based optimisation is compared with class-agnostic models by comparing the optimal objectives predicted on the Pareto front. Finally, to assess the utility of the class-based genetic design, the alloys predicted are compared with those already reported in the literature.

2. Dataset and Methods

2.1. Dataset

In this study, we use a publicly accessible dataset of Al alloys, which was curated by the authors [26]. This dataset encompasses cast and wrought alloys and includes age-hardened and strain-hardened alloys. The dataset includes three mechanical properties (targets): tensile strength (44.81–820 MPa), yield strength (151.26–790 MPa), and elongation (0.5–50%). It also includes information on the concentration of 25 different alloying elements and the manufacturing processes, which are grouped into ten distinct processing conditions (features). The dataset contains 1154 instances, with 933 alloys having complete data on all three mechanical properties. Figure 1 illustrates the distribution of tensile strength and elongation within the dataset, with each alloy class denoted by different colours. This visual representation also highlights the inherent trade-off between strength and ductility in these alloys.

In a previous study, iterative label spreading was used [27] to identify eight distinct clusters in the data determined by feature similarity [28]. Further, a decision tree classifier showed that these clusters are separable classes, and information on these classes is also included in the dataset [28]. Class 1 is characterised by “as cast” or “solution heat-treated” alloys. Class 2 includes alloys with a high Cu content, while Class 3 consists of cold-worked and artificially aged alloys. Class 4 includes over-aged alloys, followed by Class 5, which features strain-hardened alloys with Mg additions. Class 6 consists of naturally aged alloys, and Classes 7 and 8 are differentiated by their high Mg and Fe concentrations, respectively.

2.2. Methods

2.2.1. Multi-Target Random Forest Models

This study uses random forest regressors for forward prediction due to their higher accuracy in predicting the mechanical properties of aluminium alloys [12,15]. Tree-based methods partition the feature space into distinct regions through successive splits, beginning at the root and continuing until a stop criterion is reached [29]. Each split is determined by a greedy algorithm aiming to reduce a loss, with mean squared error [30] commonly used for regression. Each node in the tree represents these splits. Random forest is an ensemble machine learning method that uses a decision tree as the base model [31]. A random forest is constructed by generating multiple decision trees, each trained using bootstrap aggregation and random feature selection. This approach ensures that each tree in the forest uses a different subset of features, minimising the influence of any single feature on the overall partitioning process and reducing correlation amongst individual trees. For regression tasks, the final output is the average predicted value from all the individual trees in the forest. Random forest regressors also expose feature importance profiles using variance reduction. These profiles are valuable for identifying how the features are used in the model architecture for property prediction. The random forest regressor was implemented using the scikit-learn library [32].
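The bagging-and-averaging idea described above can be illustrated with a deliberately small, pure-Python sketch. It is not the scikit-learn regressor used in the study: the trees are depth-one "stumps", there is a single target rather than three, and all names and data (`fit_stump`, `fit_forest`, the toy rows) are hypothetical.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Greedy search for the single split that minimises the summed squared error."""
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    mean_all = sum(targets) / len(targets)
    best = (sse(targets), None, None, mean_all, mean_all)
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            if not left or not right:
                continue
            loss = sse(left) + sse(right)
            if loss < best[0]:
                best = (loss, f, threshold,
                        sum(left) / len(left), sum(right) / len(right))
    return best[1:]  # (feature, threshold, left_mean, right_mean)

def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    if feature is None:  # no useful split was found
        return left_mean
    return left_mean if row[feature] <= threshold else right_mean

def fit_forest(rows, targets, n_trees=25, max_features=1, seed=0):
    """Each stump sees a bootstrap sample and a random subset of features."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature_ids = rng.sample(range(n_features), max_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature_ids))
    return forest

def predict_forest(forest, row):
    """Regression output is the average over all trees."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)

# Toy data: feature 0 drives the target, feature 1 is uninformative.
rows = [[float(i), float((i * 7) % 3)] for i in range(20)]
targets = [100.0 + 20.0 * i for i in range(20)]
forest = fit_forest(rows, targets)
low = predict_forest(forest, [2.0, 1.0])
high = predict_forest(forest, [18.0, 1.0])
```

Because every stump predicts a mean of training targets, the averaged prediction always stays within the range of the training data, which is one reason tree ensembles cannot extrapolate beyond observed property values.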

2.2.2. Multi-Objective Optimisation Method

This study utilised the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to identify the concentration of alloying elements leading to optimal tensile strength and ductility [33]. Here, “optimal” refers to the best possible trade-off between these two mechanical properties. The NSGA-II identifies a set of optimal solutions known as Pareto-optimal solutions, representing the best trade-offs between conflicting objectives such as strength and ductility. An initial population of potential solutions is randomly generated. Each solution in the population is encoded as a continuous-valued chromosome, representing a possible alloying concentration. The random forest regressor is then used to predict the tensile strength and elongation of the potential solutions in the fitness function. The NSGA-II algorithm uses a non-dominated sorting method, which ranks individuals based on a dominance criterion. In this study, one solution dominates another if it is superior in at least one objective (either tensile strength or elongation) without being inferior in the other, which identifies the most effective trade-offs between these two properties. Solutions are classified into different fronts, the first front being entirely non-dominated, with subsequent fronts representing solutions dominated by those in the preceding fronts. Within each front, solutions are assigned a crowding distance value, measuring the density of solutions surrounding a particular alloy.
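As an illustration of the ranking step, the sketch below implements the dominance test, a simple (quadratic, not the "fast") non-dominated sort, and the crowding distance for two maximised objectives. The candidate values are hypothetical (tensile strength in MPa, elongation in %); in practice the study relied on the pymoo implementation rather than code like this.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and better in at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def non_dominated_sort(points):
    """Return fronts as lists of indices; front 0 is the non-dominated set."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

def crowding_distance(points, front):
    """Larger values mean a solution sits in a sparser part of its front."""
    distance = {i: 0.0 for i in front}
    for m in range(len(points[0])):
        ordered = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[ordered[0]][m], points[ordered[-1]][m]
        # Boundary solutions are always kept.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (points[nxt][m] - points[prev][m]) / (hi - lo)
    return distance

# Hypothetical (tensile strength, elongation) predictions for six candidates.
candidates = [(600, 8), (450, 15), (300, 25), (400, 12), (280, 20), (250, 10)]
fronts = non_dominated_sort(candidates)
crowding = crowding_distance(candidates, fronts[0])
```

Here the first three candidates form the first front, since none of them is beaten in both strength and elongation by any other candidate.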

In this study, the selection process was carried out using a binary tournament selection based on rank and crowding distance [34,35]. Crossover and mutation were applied to create a new offspring population. The algorithm iterates through generations until a termination condition is met, which could be a predefined number of generations or a convergence criterion based on the diversity of the Pareto front. A hypervolume measure is often used to measure the convergence of a multi-objective optimisation algorithm [36]. The NSGA-II algorithm implementation discussed in this paper used the pymoo package, the details of which are available in [37]. In the implementation of the genetic algorithm, the following parameters were manually determined to ensure that the algorithm converged: a population size of 500 individuals, 125 offspring per generation, a mutation probability of 0.2, and a crossover probability of 0.9. The optimisation was carried out for 500 generations. The hypervolume plot demonstrating convergence can be found in the supplementary information.
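For two maximised objectives, the hypervolume reduces to the area dominated by the front relative to a reference point, so convergence can be monitored by checking that this area stops growing between generations. The following is a minimal sketch under the assumption of a reference point at the origin; the function name and the example fronts are hypothetical and do not reproduce the study's hypervolume plot.

```python
def hypervolume_2d(front, reference=(0.0, 0.0)):
    """Area dominated by a two-objective maximisation front above a reference point."""
    # Ignore points that do not improve on the reference in both objectives.
    pts = sorted(
        (p for p in front if p[0] > reference[0] and p[1] > reference[1]),
        key=lambda p: p[0],
        reverse=True,
    )
    area = 0.0
    best_second = reference[1]
    for first, second in pts:
        # Sweep from the highest first objective; add only the new strip of area.
        if second > best_second:
            area += (first - reference[0]) * (second - best_second)
            best_second = second
    return area

# Hypothetical fronts (tensile strength in MPa, elongation in %).
early_front = [(500, 5), (300, 15)]
late_front = [(600, 8), (450, 15), (300, 25)]
hv_early = hypervolume_2d(early_front)
hv_late = hypervolume_2d(late_front)
```

A larger value for the later front indicates that the population has moved towards better trade-offs; a plateau over many generations is the usual sign of convergence.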

2.2.3. An Alloy Design Workflow

As illustrated in Figure 2, the workflow starts with partitioning the dataset into data-driven classes, as identified in a previous study. For each class, distinct random forest models ($R_{i}$) are trained. Prior to the training of regressors, feature engineering is conducted. This involves the removal of features exhibiting a linear correlation greater than 95% to reduce bias. The processing condition feature is one-hot encoded [38]. Subsequently, the remaining features and targets undergo normalisation using MinMaxScaler. Each dataset is then partitioned into 80% train and 20% test sets. The hyperparameters of all regressors are optimised using a random grid search with 1000 iterations and 5-fold cross-validation at each iteration.
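The correlation filter and min-max normalisation can be sketched as follows. This is an illustrative pure-Python version, not the study's code: the rule for which member of a correlated pair is dropped (the later column) is an assumption, the scaling mimics what a min-max scaler does on one column, and the column names and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear correlation
    return cov / sqrt(var_x * var_y)

def drop_correlated(columns, threshold=0.95):
    """Keep a column only if |r| with every already-kept column is within the threshold."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept

def min_max_scale(values):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature columns (wt%).
columns = {
    "Mg": [0.5, 1.0, 2.0, 4.0],
    "Mg_copy": [1.0, 2.0, 4.0, 8.0],  # perfectly correlated with Mg
    "Si": [0.9, 0.2, 0.6, 0.1],
}
filtered = drop_correlated(columns)
scaled = {name: min_max_scale(values) for name, values in filtered.items()}
```

Scaling both features and targets to [0, 1] also explains why the errors reported later are dimensionless values well below one.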

Recursive feature elimination is used to determine the optimal number of features required to maintain similar accuracy. Subsequently, optimised random forest models ($R_{i}'$) are trained for each class using this curated set of features. The optimised models are used in the fitness function of the NSGA-II algorithm to predict tensile strength and elongation. Then, the NSGA-II algorithm is used to identify the Pareto front ($P_{i}$), which includes optimal alloying features that lead to dominating mechanical properties. The size of the chromosome is equal to the number of features selected for each class. The objective functions for the NSGA-II algorithm are the tensile strength and elongation predicted using $R_{i}'$. Further, the concentration of each alloying element is constrained between zero and the maximum concentration of that element in the class.
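The concentration constraint amounts to a per-class box around each selected feature. A minimal sketch is given below; the helper names are hypothetical, and clipping an out-of-range candidate is only one possible way of enforcing the bounds (the text does not state how the optimiser handles violations).

```python
def class_bounds(class_rows, features):
    """Lower bound 0 and upper bound equal to the largest value seen in the class."""
    return {f: (0.0, max(row[f] for row in class_rows)) for f in features}

def is_feasible(candidate, bounds):
    """True if every feature of the candidate lies inside its bounds."""
    return all(lo <= candidate[f] <= hi for f, (lo, hi) in bounds.items())

def clip_to_bounds(candidate, bounds):
    """Assumed repair step: move out-of-range values to the nearest bound."""
    return {f: min(max(candidate[f], lo), hi) for f, (lo, hi) in bounds.items()}

# Hypothetical Class 1 compositions (wt%).
class_rows = [
    {"Mg": 0.5, "Sc": 0.2},
    {"Mg": 4.5, "Sc": 0.9},
    {"Mg": 2.0, "Sc": 0.0},
]
bounds = class_bounds(class_rows, ["Mg", "Sc"])
candidate = {"Mg": 3.0, "Sc": 1.4}  # Sc exceeds the class maximum
repaired = clip_to_bounds(candidate, bounds)
```

Restricting the search to the range covered by the training data keeps the regressor from being queried far outside the compositions it has seen.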

Parallel to the class-based approach, a random forest regressor ($R_{all}$) is trained using the entire dataset. Similar to the class-based workflow, an optimised model ($R_{all}'$) is developed using features selected through recursive feature elimination and feature importance profiles. This optimised model is then utilised in the NSGA-II algorithm’s fitness function to compute the Pareto front for the entire dataset ($P_{all}$).

The class-based Pareto fronts are then compared to $P_{all}$ to evaluate the effectiveness of the class-based approach against a more generalised model, identifying each method’s benefits and potential drawbacks. The comparison is conducted manually with the use of a scatter plot for tensile strength against elongation for each respective Pareto front.
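Although the comparison in this study is visual, the same question can be posed numerically by counting how many baseline points are dominated by at least one class-based point. The sketch below is a hypothetical complement to the scatter plot, not part of the reported workflow; the fronts shown are invented (tensile strength in MPa, elongation in %).

```python
def dominates(a, b):
    """a dominates b when maximising every objective."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def dominated_fraction(baseline, challengers):
    """Fraction of baseline points dominated by at least one challenger point."""
    beaten = [p for p in baseline if any(dominates(c, p) for c in challengers)]
    return len(beaten) / len(baseline)

# Hypothetical Pareto fronts.
p_all = [(550, 9), (420, 16), (310, 22)]
p_classes = {
    "P1": [(300, 30), (320, 24)],
    "P2": [(700, 10)],
    "P6": [(430, 18)],
}
combined = [point for front in p_classes.values() for point in front]

share_of_baseline_beaten = dominated_fraction(p_all, combined)
share_of_classes_beaten = dominated_fraction(combined, p_all)
```

A value of 1.0 in one direction and 0.0 in the other would correspond to the class-based fronts collectively dominating the baseline.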

3. Results

3.1. Model Training and Feature Selection

Multi-targeted random forest regressors were trained to predict the three mechanical properties. Learning curves were used to assess model overfitting and underfitting. The test set accuracies of the trained models $R_{i}$, where $i \in \{1, 2, 3, 4, 5, 6, 7, 8, \mathrm{all}\}$, are provided in Table 1 using the mean squared error (MSE) and mean absolute error (MAE) [39].
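For reference, the two error measures can be written in a few lines. The values below are hypothetical normalised targets and predictions, chosen only to make the arithmetic easy to follow.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals; penalises large errors more strongly."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of absolute residuals, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical min-max normalised values.
y_true = [0.25, 0.5, 0.75, 1.0]
y_pred = [0.5, 0.5, 0.5, 0.5]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
```

Because the targets are normalised to [0, 1] before training, an MAE of 0.03 corresponds to roughly 3% of the observed range of that property.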

The relative importance of each feature to the property prediction is presented using feature importance (FI) profiles in Figure 3, which shows that some features have negligible importance in the prediction of mechanical properties. Eliminating such features before training might not affect the accuracy of the model. Further, the significance of processing conditions is negligible in some class-based regressors, particularly in comparison with $R_{all}$. This is attributed to alloys within such a class having the same processing condition. In classes with multiple processing conditions, for example, Class 8, the importance of processing conditions can be seen. This can be further validated using the recursive feature elimination (RFE) profiles in Figure 4. Recursive feature elimination is carried out by iteratively identifying and removing the least significant features. The optimal number of features is determined using recursive feature elimination, as noted in Table 1. The number of features required for optimal performance is lower for all class-based regressors than for $R_{all}$. This means that class-based regressors need fewer features to predict the mechanical properties.
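The elimination loop itself is simple and can be sketched independently of the regressor. In the sketch below, `fit_and_score` is a hypothetical callback standing in for "retrain the model on this subset and return its error and feature importances"; the toy version uses made-up numbers so that the loop can be run without any data.

```python
def recursive_feature_elimination(features, fit_and_score):
    """fit_and_score(subset) must return (error, {feature: importance})."""
    current = list(features)
    history = []
    while current:
        error, importances = fit_and_score(current)
        history.append((tuple(current), error))
        if len(current) == 1:
            break
        # Drop the feature the current model relies on least, then refit.
        weakest = min(current, key=lambda f: importances[f])
        current.remove(weakest)
    # Prefer the lowest error; break ties in favour of fewer features.
    best_subset, best_error = min(history, key=lambda item: (item[1], len(item[0])))
    return list(best_subset), best_error, history

# Hypothetical importances and error model (not taken from the study).
TOY_IMPORTANCE = {"Mg": 0.70, "Cu": 0.20, "Fe": 0.06, "Ti": 0.04}

def toy_fit_and_score(subset):
    error = 200                      # error in thousandths, kept as an integer
    if "Mg" in subset:
        error -= 120
    if "Cu" in subset:
        error -= 30
    error += 5 * sum(1 for f in subset if f in ("Fe", "Ti"))  # noise features hurt
    return error / 1000, {f: TOY_IMPORTANCE[f] for f in subset}

selected, best_error, history = recursive_feature_elimination(
    ["Mg", "Cu", "Fe", "Ti"], toy_fit_and_score
)
```

Plotting the error recorded in `history` against the number of features gives the kind of RFE profile from which the optimal feature count is read.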

Notably, there is a disparity between the FI and RFE profiles for Class 4. The FI profile highlights several important features, whereas the RFE profile indicates that optimal accuracy is attainable using merely two features. The RFE profile for Class 4 suggests that as the number of features increases, the regressor begins to overfit the data. Consequently, variance reduction from additional features, which might not be needed for the optimal model, is incorporated when computing the FI profiles. This observation indicates that some features, though appearing significant in an overfitted model, may not be needed to achieve the best model performance. Furthermore, Class 7’s RFE profile reveals an initial minimum error with a single feature, which subsequently rises upon integrating additional features, indicative of overfitting. The low error with a single feature suggests that Mg alone is a strong predictor in the model.

The random forest models were retrained using the selected features. The test set accuracy is also reported in Table 1, which shows that there is no loss of accuracy compared to the respective regressors trained on all the features. The MAE of individual targets, along with the standard deviation error, is provided in the supplementary information. For the majority of the optimised class-based regressors ($R_{1}'$, $R_{2}'$, $R_{4}'$, $R_{5}'$ and $R_{6}'$), the optimised model exhibited superior performance, as indicated by the lower MSE and MAE values compared with the $R_{all}'$ model accuracy. The MAEs for $R_{4}'$ (0.0315) and $R_{6}'$ (0.0290) were significantly lower than those of the other class-based regressors. However, the MAE for $R_{7}'$ and $R_{8}'$ was higher than for $R_{all}'$. The learning curves of the models are presented in the supplementary information. For $R_{4}'$ and $R_{6}'$, the learning curves displayed a consistent decrease in MSE with increasing training sizes, showing low bias and variance. In contrast, the learning curve for $R_{7}'$ demonstrated a high MSE even as the training size increased. $R_{8}'$ also exhibited higher MSE values than $R_{4}'$ and $R_{6}'$, although the increase in MSE was less than for $R_{7}'$. The learning curves of Classes 7 and 8 show greater underfitting compared with the other classes. $R_{all}'$ showed an overall higher MSE compared with the other classes. The learning curve for $R_{all}'$ indicates that the model underfits the data, suggesting that it is not complex enough to generalise across the dataset. This underperformance could also be attributed to the current feature set not fully capturing the relationships with the targets.

3.2. Pareto Front

The trained regressors were used to identify the Pareto front maximising tensile strength and elongation using the NSGA-II algorithm. $P_{all}$, which is calculated using $R_{all}$, serves as a baseline for comparison with the class-based Pareto fronts. The optimal tensile strength and elongation predicted using selected regressors are presented in Figure 5. $P_{1}$, $P_{2}$, and $P_{6}$ were selected because they collectively dominate $P_{all}$. Other Pareto fronts can be found in the supplementary information provided. $P_{all}$ spans a more extensive range of tensile strength and elongation than the class-based regressors. However, the class-based Pareto fronts ($P_{1}$, $P_{2}$ and $P_{6}$) dominate $P_{all}$ within their respective regions.

The baseline Pareto front $P_{all}$ exhibits comparable performance to the reported literature. Specifically, in the tensile strength range of 350–450 MPa and elongation between 15% and 20%, $P_{all}$ resembles the Pareto front reported by Feng et al. [12], which focused on optimising the properties of Al-Mg-Si alloys. Further, the Pareto front presented by Sekhar et al. [40] for the AA6063 alloy outperforms $P_{all}$ in the 20–30% elongation region and has similar characteristics in regions below 20% elongation. However, its performance in the 20–30% elongation region is comparable to that of the Pareto front $P_{1}$.

The enhanced performance of class-based Pareto fronts over $P_{all}$ can be attributed to their focused optimisation within smaller regions because they are trained in specific regions of the feature space. This facilitates a more efficient optimisation process, enhancing the predicted tensile strength and elongation in the Pareto front.

3.3. Predicted Compositions

To inspect the alloys predicted to be on the Pareto fronts, the concentrations of the predicted alloys were assessed against tensile strength. Only a selected number of predicted alloy concentrations are reported below, with additional predictions detailed in the supplementary information.

3.3.1. Alloys Predicted within Class 1

The optimal alloy concentrations predicted for the Pareto front $P_{1}$ are presented in Figure 6. The predictions show two distinct ranges of tensile strength for the alloys: one for strengths between 75 and 150 MPa and another for strengths between 275 and 300 MPa. An example alloy on this Pareto front, exhibiting a tensile strength of 150 MPa and an elongation of 29%, is Al-0.9%Sc, which has undergone solutionising as its processing condition.

A pronounced discontinuity is apparent in the Pareto front $P_{1}$, particularly noticeable at elongations exceeding 30%. In this region, the model predicts alloys with strengths under 200 MPa, whereas for elongations under 30%, the predicted strength jumps to around 300 MPa. This abrupt transition is likely due to the imposed constraint that limits the alloying elements’ concentration to their maximum concentration in the dataset for the given class. Specifically, when the scandium concentration reaches a maximum of 0.9%, the model shifts to predicting the characteristics of Al-Mg-Mn alloys. The difference in optimal properties is consistent with the predictions of two distinct types of alloys within the respective regions. Further, discontinuities were also seen in the tensile strength and elongation Pareto front reported by Dey et al. [20] during optimisation for age-hardenable Al alloys. Another possible explanation for the discontinuity in the Pareto front might be the presence of two distinct clusters within the Class 1 dataset. However, the elbow plot [44] for KMeans clustering included in the supplementary information indicates a consistent reduction in distortion as the number of clusters increases. This suggests that there are no additional subclusters within Class 1.

The alloy with a tensile strength of 275 MPa is characterised by a higher Mg concentration, aligning with the composition of AA5xxx series alloys. Mg enhances strength via solid solution hardening mechanisms [45,46]. However, a rise in Mg content can result in the formation of Mg₅Al₈ precipitates, which are known to reduce ductility, explaining the observed trend in $P_{1}$ where higher strength alloys exhibit lower elongation [5]. The precipitation of Mg₅Al₈ can be inhibited by the minor addition of elements such as Mn and Cr, which is also predicted, as seen in Figure 6b,c.

Conversely, alloys with tensile strengths below 150 MPa show low levels of alloying elements. These predicted alloys contain minor additions of Sc, which is known to increase alloy strength through the formation of Al₃Sc precipitates [47]. Besides Sc, these aluminium alloys contain very low quantities of other alloying elements.

3.3.2. Alloys Predicted within Class 2

The alloys predicted on Pareto front $P_{2}$ show high strength and low elongation. The solutions within this front surpass $P_{all}$ for strengths exceeding 600 MPa; hence, the discussions below are limited to predictions with strengths greater than 600 MPa. The concentration of alloying elements in $P_{2}$ is reported in Figure 7. An example alloy on this Pareto front, with a tensile strength of 750 MPa and an elongation of 10%, is Al-12Zn-4Mg-1.5Cu, which is artificially peak-aged.

In the tensile strength range of 650–750 MPa, the alloys are predicted to have high concentrations of Zn, accompanied by some Mg and Cu. These compositions mirror the overaged Al-Zn-Mg-Cu alloys documented in the literature, which are known for their high strength, primarily due to MgZn₂ precipitates [42].

For the even higher tensile strength ranges (greater than 750 MPa), the predicted alloys contain higher concentrations of Zn and Mg than those reported in the existing literature, representing unexplored compositions. A 7xxx alloy reported in the literature with a Zn concentration of 8.67 wt% and a Mg concentration of 2.50 wt% exhibited a tensile strength of 641 MPa [48]. The enhanced strength of the alloy was attributed to precipitate strengthening due to the formation of MgZn₂ precipitates, a mechanism that may also apply to the newly predicted alloys. The predicted alloys present promising candidates that require experimental validation to confirm their tensile properties and any specific performance criteria relevant to target applications.

3.3.3. Alloys Predicted within Class 6

Alloys at the Pareto front $P_{6}$ are defined by their moderate strength and ductility, including naturally aged alloys, and their alloying concentrations are reported in Figure 8. The alloys predicted in this class include moderate-strength 7xxx series alloys and the higher-strength 6xxx series alloys. Further, at higher strengths, this also includes 2xxx Al-Cu-Li alloys. Similar to Class 1, the Pareto front exhibits a discontinuous region between the two distinct predicted alloys (Al-Cu-Li and Al-Zn-Mg-Cu alloys). An example alloy on the Pareto front, with a tensile strength of 375 MPa and an elongation of 24%, is Al-6Zn-3Cu-0.5Li, which has been naturally aged.

Alloys with tensile strengths surpassing 500 MPa are Al-Cu-Li-based alloys, where strength is primarily attributed to Al₂CuLi precipitates, enhancing the mechanical properties, as reported in [49,50]. Further, it can be seen that these alloys also have a minor addition of Mg (0.5 wt%), which leads to faster precipitation of the Al₂CuLi phase, leading to higher strengthening during natural ageing [51].

In the tensile strength range of 350 to 500 MPa, the predicted alloys are Al-Zn-Mg-Cu alloys, which belong to the 7xxx series. Within this range, two alloy types can be observed based on Mg content: high Mg (2 wt%) and low Mg (0.5 wt%). At high Mg concentrations, natural ageing results in the formation of Guinier–Preston (GP) zones [52], with alloy strengthening attributed to coherency and modulus mismatch strain [43]. Conversely, at lower Mg concentrations, the strength is primarily due to the formation of η′ phases during natural ageing [53,54].

For alloys with tensile strengths below 350 MPa, the model predicts Al-Mg-Si alloys with trace amounts of Cu [55]. The introduction of copper modifies the precipitation sequence in Al-Mg-Si alloys, leading to metastable precipitates that influence the alloy’s microstructure and hardness. This results in the growth of phases like semi-coherent β″ and Q′ along specific crystallographic directions [56]. Notably, these alloys exhibit excellent formability, leading to their use in the automotive industry [57].

4. Discussion

The study presented here introduces a methodology for enhancing the design of alloys using class-based optimisation. By using a class-based genetic algorithm, this study demonstrated the prediction of a Pareto front that may surpass those generated by class-agnostic optimisation techniques. The combination of Pareto fronts $P_{1}$, $P_{2}$, and $P_{6}$ emerges as a dominant front, outperforming the Pareto front $P_{all}$. Further, the forward prediction multi-target regressors achieved low error in line with the literature, and some class-based regressors outperformed the multi-target regressor trained on the entire dataset, which aligns with previous findings.

Multi-target random forest regressors expose feature importance profiles, which were used to identify the impact of features on the simultaneous prediction of the three mechanical properties. This revealed that only a specific subset of elements substantially affects these properties, directing the alloy design strategy towards optimising the concentrations of these elements while disregarding the less impactful ones. In the prediction of mechanical properties for $R_{1}$, Mg and Cu emerged as the most significant, likely due to their role in solid solution hardening mechanisms that enhance strength while reducing ductility [45,46]. For $R_{2}$, Zn, Cu, Mg, and Si were identified as crucial, with the presence of Zn and Mg linked to the formation of MgZn₂ precipitates, contributing to precipitate strengthening [46]. The combination of Mg and Si in this context suggests strengthening due to the formation of the β″ phase in the 2xxx and 6xxx alloy series [58]. Cu, Si, and Mg are most important for prediction in $R_{6}$, which might denote strength due to similar mechanisms as in Class 2. Notably, the high importance of Cu in Class 6 could be attributed to the addition of Cu in 6xxx alloys, leading to the formation of a finer, needle-shaped β″ phase, which further increases the strength of these alloys [59,60].

The combination of Pareto fronts that dominated $P_{all}$ included alloys with the following processing conditions: solutionised and peak-aged, naturally aged, as-cast, and solutionised. This implies that alloys with the most favourable properties can be fabricated by using only these processes. Notably, these processes are commonly used in manufacturing Al alloys [5,46], indicating that the alloys predicted may be readily manufactured using existing processes.

Comparative analysis with alloys documented in the literature validated the efficacy of the class-based genetic design predictions. The predicted concentration and processing conditions aligned with those reported in the literature. The predictions indicated that as-cast Al-Sc and Al-Mg alloys would exhibit low tensile strengths due to their low alloying element concentrations, which is also responsible for their high ductility [5]. The model predicted naturally aged Al-Mg-Si, Al-Zn-Mg-Cu, and Al-Cu-Li alloys for moderate strength. The strength of these alloys is commonly attributed to precipitation hardening, as noted in the existing studies [51,52,56]. At the high-strength end, the model anticipated peak-aged Al-Zn-Mg-Cu alloys, aligning with similar high-strength alloy reports in the literature [61].

Certain alloys that had not been previously documented in the literature were identified among these predictions. These include 7xxx series alloys with high Zn (12 wt%) and Mg (4 wt%) concentrations predicted to have tensile strengths greater than 700 MPa. High-Zn aluminium alloys that also contain Mg raise additional considerations that are not captured by the current model or its training data, notably the risk of hot cracking during solidification in production and casting, and susceptibility to stress corrosion cracking [62]. The former has been studied in the context of both wrought Al-Zn-Mg alloys [63] and additively manufactured Al-Zn-Mg alloys [64,65]. While further experimental validation through the fabrication and testing of these alloys is necessary to confirm the predictions, such investigations fall beyond the scope of this paper and represent a promising direction for subsequent research efforts.

The composition and processing parameters of an alloy significantly influence its resultant microstructure. This microstructure plays a vital role in determining the mechanical properties of the alloy. In this study, the regression models are trained to predict the mechanical properties of alloys by directly using the concentrations of alloying elements and processing conditions. This design methodology, however, does not eliminate the need for experimentation. Key processing parameters, notably ageing time and temperature, remain to be optimised through experimental methods as they are not included within the current dataset. Incorporating data on these processing parameters could significantly enhance the model’s applicability. A notable limitation inherent in optimisation-based design is that the forward model only provides an estimate of the error in predicted properties. Experimental validation is essential to estimate the error in predicting alloy concentrations.

Despite the limitations in machine learning-based computational approaches for alloy design, the model provides guidance for alloy development with predictive capabilities and Pareto front calculations. When additional properties like conductivity or hardness are part of the alloy design requirements, the workflow provides initial predictions of potential alloys that meet the strength and ductility requirements. These alloys require subsequent experimental validation to confirm their suitability in meeting additional criteria. Furthermore, the model’s utility could be improved by including additional properties in the dataset and, where relevant, augmenting the dataset with calculated parameters, such as phase concentration. Such an approach was recently presented in the context of multi-principal element alloys by Li and co-workers [66].

5. Conclusions

This study presented a design methodology using class-based optimisation to predict optimal Al alloy compositions. This method surpassed traditional class-agnostic optimisation techniques in predicting alloys with enhanced tensile strength and elongation, identifying key alloying elements for targeted optimisation. Recursive feature elimination was also used to reduce the feature space, particularly benefiting class-based regressors, which require fewer features for optimal accuracy.

This study has shown, using data-driven approaches, that class-based optimisation further improves the prediction of Al alloys. The predicted alloys are consistent with the current literature, supporting the utility of the data-driven approach and model framework. Moreover, this study identified previously unreported 7xxx series alloys with high Zn and Mg concentrations predicted to have tensile strengths over 700 MPa, suggesting a promising area for future experimental validation. The method cannot substitute for experimental processes, particularly in optimising critical parameters like ageing time (3 h to 48 h) and temperature (100 °C to 200 °C), which vary with processing conditions and are typically determined through domain knowledge. The results herein, however, provide an interpretable framework for guiding future Al alloy design.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/met14020239/s1, Figure S1: Learning curves for the random forest regressor predicting three mechanical properties. (a) R₁, (b) R₂, (c) R₃, (d) R₄, (e) R₅, (f) R₆, (g) R₇, (h) R₈, and (i) R_all; Figure S2: Mean absolute error of prediction of each target for optimised models. (a) Tensile strength, (b) yield strength, and (c) elongation; Figure S3: Pareto front for individual classes optimising for tensile strength and elongation; Figure S4: Hypervolume plot showing the convergence of NSGA-II algorithm for (a) all, (b) Class 1, (c) Class 2, and (d) Class 6; Figure S5: Predicted alloy concentrations on Pareto front 1. (a) Si concentration, (b) Cu concentration, (c) Fe concentration, (d) Zn concentration, and (e) Zr concentration; Figure S6: Predicted alloy concentrations on Pareto front 2. (a) Ag concentration, (b) Co concentration, (c) Cr concentration, (d) Fe concentration, (e) Sc concentration, (f) Si Concentration, (g) V concentration, (h) Zr concentration, and (i) Ti concentration; Figure S7: Predicted alloy concentrations on Pareto front 6. (a) Cr concentration, (b) Fe concentration, and (c) Mn concentration; Figure S8: Elbow method analysis for the optimal K value in Class 1.

Author Contributions

Conceptualisation, A.S.B. and N.B. (Nick Birbilis); methodology, N.B. (Ninad Bhat) and A.S.B.; software, N.B. (Ninad Bhat); formal analysis, N.B. (Ninad Bhat); investigation, N.B. (Ninad Bhat); data curation, N.B. (Ninad Bhat) and N.B. (Nick Birbilis); writing—original draft preparation, N.B. (Ninad Bhat); writing—review and editing, A.S.B. and N.B. (Nick Birbilis); supervision, A.S.B. and N.B. (Nick Birbilis); project administration, N.B. (Nick Birbilis). All authors have read and agreed to the published version of this manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study are openly available in Mendeley Data [26].

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Dorward, R.C.; Pritchett, T.R. Advanced Aluminium Alloys for Aircraft and Aerospace Applications. Mater. Des. 1988, 9, 63–69.
2. Hirsch, J. Automotive Trends in Aluminium—The European Perspective. In Proceedings of the 9th International Conference on Aluminium Alloys, Brisbane, Australia, 2–5 August 2004.
3. Verma, R.P.; Kumar Lila, M. A Short Review on Aluminium Alloys and Welding in Structural Applications. Mater. Today Proc. 2021, 46, 10687–10691.
4. Davis, J.R. Corrosion of Aluminum and Aluminum Alloys; ASM International: Geauga County, OH, USA, 1999; ISBN 1-61503-238-X.
5. Polmear, I.; St John, D.; Nie, J.-F.; Qian, M. Light Alloys: Metallurgy of the Light Metals; Butterworth-Heinemann: Oxford, UK, 2017; ISBN 0-08-099430-X.
6. Kong, Y.; Jia, Z.; Liu, Z.; Liu, M.; Roven, H.J.; Liu, Q. Effect of Zr and Er on the Microstructure, Mechanical and Electrical Properties of Al-0.4 Fe Alloy. J. Alloys Compd. 2021, 857, 157611.
7. Li, H.; Wang, H.; Liang, X.; Wang, Y.; Liu, H. Effect of Sc and Nd on the Microstructure and Mechanical Properties of Al-Mg-Mn Alloy. J. Mater. Eng. Perform. 2012, 21, 83–88.
8. Macchi, C.; Somoza, A.; Ferragut, R.; Dupasquier, A.; Polmear, I.J. Ageing Processes in Al-Cu-Mg Alloys with Different Cu/Mg Ratios. Phys. Status Solidi C 2009, 6, 2322–2325.
9. Pogatscher, S.; Antrekowitsch, H.; Leitner, H.; Sologubenko, A.S.; Uggowitzer, P.J. Influence of the Thermal Route on the Peak-Aged Microstructures in an Al–Mg–Si Aluminum Alloy. Scr. Mater. 2013, 68, 158–161.
10. Dorbane, A.; Harrou, F.; Sun, Y. Machine Learning Methods for Predicting Mechanical Behavior of Aluminum Alloys. WSEAS Trans. Electron. 2022, 13, 84–88.
11. Merayo Fernández, D.; Rodríguez-Prieto, A.; Camacho, A.M. Prediction of the Bilinear Stress-Strain Curve of Aluminum Alloys Using Artificial Intelligence and Big Data. Metals 2020, 10, 904.
12. Feng, X.; Wang, Z.; Jiang, L.; Zhao, F.; Zhang, Z. Simultaneous Enhancement in Mechanical and Corrosion Properties of Al-Mg-Si Alloys Using Machine Learning. J. Mater. Sci. Technol. 2023, 167, 1–13.
13. Bhat, N.; Barnard, A.S.; Birbilis, N. Improving the Prediction of Mechanical Properties of Aluminium Alloy Using Data-Driven Class-Based Regression. Comput. Mater. Sci. 2023, 228, 112270.
14. Hu, M.; Tan, Q.; Knibbe, R.; Wang, S.; Li, X.; Wu, T.; Jarin, S.; Zhang, M.-X. Prediction of Mechanical Properties of Wrought Aluminium Alloys Using Feature Engineering Assisted Machine Learning Approach. Metall. Mater. Trans. A 2021, 52, 2873–2884.
15. Soofi, Y.J.; Rahman, M.A.; Gu, Y.; Liu, J. A Feasibility Study of Machine Learning-Assisted Alloy Design Using Wrought Aluminum Alloys as an Example. Comput. Mater. Sci. 2022, 215, 111783.
16. Bhat, N.; Barnard, A.S.; Birbilis, N. Inverse Design of Aluminium Alloys Using Multi-Targeted Regression. J. Mater. Sci. 2024, 59, 1448–1463.
17. Zunger, A. Inverse Design in Search of Materials with Target Functionalities. Nat. Rev. Chem. 2018, 2, 0121.
18. Deschamps, A.; Tancret, F.; Benrabah, I.-E.; De Geuser, F.; Van Landeghem, H.P. Combinatorial Approaches for the Design of Metallic Alloys. Comptes Rendus Phys. 2018, 19, 737–754.
19. Pollock, T.M.; Van der Ven, A. The Evolving Landscape for Alloy Design. MRS Bull. 2019, 44, 238–246.
20. Dey, S.; Dey, P.; Datta, S. Design of Novel Age-Hardenable Aluminium Alloy Using Evolutionary Computation. J. Alloys Compd. 2017, 704, 373–381.
21. Dey, S.; Dey, P.; Datta, S. Rough-Fuzzy-GA-Based Design of Al Alloys Having Superior Cryogenic Performance. Mater. Manuf. Process. 2017, 32, 1075–1081.
22. Lee, K.W.; Song, Y.; Kim, S.-H.; Kim, M.-S.; Seol, J.B.; Cho, K.-S.; Choi, H. Genetic Design of New Aluminum Alloys to Overcome Strength-Ductility Trade-off Dilemma. J. Alloys Compd. 2023, 947, 169546.
23. Dey, S.; Ganguly, S.; Datta, S. In Silico Design of High Strength Aluminium Alloy Using Multi-Objective GA; Springer: Berlin/Heidelberg, Germany, 2014; pp. 316–327.
24. Dey, S.; Sultana, N.; Dey, P.; Pradhan, S.K.; Datta, S. Intelligent Design Optimization of Age-Hardenable Al Alloys. Comput. Mater. Sci. 2018, 153, 315–325.
25. Pawlak, Z. Rough Sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356.
26. Bhat, N.; Barnard, A.S.; Birbilis, N. Aluminium Alloy Dataset for Supervised Learning. Mendeley Data 2023, V1.
27. Parker, A.J.; Barnard, A.S. Selecting Appropriate Clustering Methods for Materials Science Applications of Machine Learning. Adv. Theory Simul. 2019, 2, 1900145.
28. Bhat, N.; Barnard, A.S.; Birbilis, N. Unsupervised Machine Learning Discovers Classes in Aluminium Alloys. R. Soc. Open Sci. 2023, 10, 220360.
29. Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Classification and Regression Trees; Wadsworth: Belmont, CA, USA, 1984; ISBN 978-0412048418.
30. Allen, D.M. Mean Square Error of Prediction as a Criterion for Selecting Variables. Technometrics 1971, 13, 469–475.
31. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
32. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
33. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197. Available online: https://ieeexplore-ieee-org.virtual.anu.edu.au/document/996017 (accessed on 3 November 2023).
34. Blickle, T. Tournament Selection. Evol. Comput. 2000, 1, 181–186.
35. Fortin, F.-A.; Parizeau, M. Revisiting the NSGA-II Crowding-Distance Computation. In Proceedings of the GECCO '13: Genetic and Evolutionary Computation Conference, Amsterdam, The Netherlands, 6–10 July 2013; pp. 623–630.
36. Zitzler, E.; Thiele, L. Multiobjective Optimization Using Evolutionary Algorithms—A Comparative Case Study. In Proceedings of the Parallel Problem Solving from Nature—PPSN V, Amsterdam, The Netherlands, 27–30 September 1998; Eiben, A.E., Bäck, T., Schoenauer, M., Schwefel, H.-P., Eds.; Springer: Berlin/Heidelberg, Germany, 1998; pp. 292–301.
37. Blank, J.; Deb, K. Pymoo: Multi-Objective Optimization in Python. IEEE Access 2020, 8, 89497–89509. Available online: https://ieeexplore-ieee-org.virtual.anu.edu.au/document/9078759 (accessed on 3 November 2023).
38. Hussein, A.Y.; Falcarin, P.; Sadiq, A.T. Enhancement Performance of Random Forest Algorithm via One Hot Encoding for IoT IDS. Period. Eng. Nat. Sci. 2021, 9, 579.
39. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009; Volume 2.
40. Sekhar, A.P.; Nandy, S.; Dey, S.; Datta, S.; Das, D. Multi-Objective Genetic Algorithm Based Optimization of Age Hardening for AA6063 Alloy. IOP Conf. Ser. Mater. Sci. Eng. 2020, 912, 052019.
41. Sun, Y.; Song, M.; He, Y. Effects of Sc Content on the Mechanical Properties of Al-Sc Alloys. Rare Metals 2010, 29, 451–455.
42. Chen, Z.; Mo, Y.; Nie, Z. Effect of Zn Content on the Microstructure and Properties of Super-High Strength Al-Zn-Mg-Cu Alloys. Metall. Mater. Trans. A 2013, 44, 3910–3920.
43. Lee, S.-H.; Jung, J.-G.; Baik, S.-I.; Seidman, D.N.; Kim, M.-S.; Lee, Y.-K.; Euh, K. Precipitation Strengthening in Naturally Aged Al–Zn–Mg–Cu Alloy. Mater. Sci. Eng. A 2021, 803, 140719.
44. Syakur, M.A.; Khotimah, B.K.; Rochman, E.M.S.; Satoto, B.D. Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster. IOP Conf. Ser. Mater. Sci. Eng. 2018, 336, 012017.
45. Lee, B.-H.; Kim, S.-H.; Park, J.-H.; Kim, H.-W.; Lee, J.-C. Role of Mg in Simultaneously Improving the Strength and Ductility of Al–Mg Alloys. Mater. Sci. Eng. A 2016, 657, 115–122.
46. Mondolfo, L.F. Aluminum Alloys: Structure and Properties; Elsevier: Amsterdam, The Netherlands, 2013; ISBN 1-4831-4482-8.
47. Venkateswarlu, K.; Pathak, L.C.; Ray, A.K.; Das, G.; Verma, P.K.; Kumar, M.; Ghosh, R.N. Microstructure, Tensile Strength and Wear Behaviour of Al–Sc Alloy. Mater. Sci. Eng. A 2004, 383, 374–380.
48. Shu, W.X.; Hou, L.G.; Zhang, C.; Zhang, F.; Liu, J.C.; Liu, J.T.; Zhuang, L.Z.; Zhang, J.S. Tailored Mg and Cu Contents Affecting the Microstructures and Mechanical Properties of High-Strength Al–Zn–Mg–Cu Alloys. Mater. Sci. Eng. A 2016, 657, 269–283.
49. Gayle, F.W.; Heubaum, F.H.; Pickens, J.R. Structure and Properties during Aging of an Ultra-High Strength Al-Cu-Li-Ag-Mg Alloy. Scr. Metall. Et Mater. 1990, 24, 79–84.
50. Decreus, B.; Deschamps, A.; De Geuser, F.; Donnadieu, P.; Sigli, C.; Weyland, M. The Influence of Cu/Li Ratio on Precipitation in Al–Cu–Li–x Alloys. Acta Mater. 2013, 61, 2207–2218.
51. Gumbmann, E.; De Geuser, F.; Sigli, C.; Deschamps, A. Influence of Mg, Ag and Zn Minor Solute Additions on the Precipitation Kinetics and Strengthening of an Al-Cu-Li Alloy. Acta Mater. 2017, 133, 172–185.
52. Liu, J.; Hu, R.; Zheng, J.; Zhang, Y.; Ding, Z.; Liu, W.; Zhu, Y.; Sha, G. Formation of Solute Nanostructures in an Al–Zn–Mg Alloy during Long-Term Natural Aging. J. Alloys Compd. 2020, 821, 153572.
53. Chen, Y.; Liu, C.Y.; Zhang, B.; Qin, F.C.; Hou, Y.F. Precipitation Behavior and Mechanical Properties of Al–Zn–Mg Alloy with High Zn Concentration. J. Alloys Compd. 2020, 825, 154005.
54. Wan, L.; Deng, Y.-L.; Ye, L.-Y.; Zhang, Y. The Natural Ageing Effect on Pre-Ageing Kinetics of Al-Zn-Mg Alloy. J. Alloys Compd. 2019, 776, 469–474.
55. Kaufman, J.G. Introduction to Aluminum Alloys and Tempers; ASM International: Geauga County, OH, USA, 2000; ISBN 978-1-61503-066-8.
56. Ding, L.; Jia, Z.; Zhang, Z.; Sanders, R.E.; Liu, Q.; Yang, G. The Natural Aging and Precipitation Hardening Behaviour of Al-Mg-Si-Cu Alloys with Different Mg/Si Ratios and Cu Additions. Mater. Sci. Eng. A 2015, 627, 119–126.
57. Miller, W.S.; Zhuang, L.; Bottema, J.; Wittebrood, A.J.; De Smet, P.; Haszler, A.; Vieregge, A. Recent Development in Aluminium Alloys for the Automotive Industry. Mater. Sci. Eng. A 2000, 280, 37–49.
58. Chakrabarti, D.J.; Laughlin, D.E. Phase Relations and Precipitation in Al–Mg–Si Alloys with Cu Additions. Prog. Mater. Sci. 2004, 49, 389–410.
59. Marioara, C.D.; Andersen, S.J.; Stene, T.N.; Hasting, H.; Walmsley, J.; Van Helvoort, A.T.J.; Holmestad, R. The Effect of Cu on Precipitation in Al–Mg–Si Alloys. Philos. Mag. 2007, 87, 3385–3413. Available online: https://www.tandfonline.com/doi/epdf/10.1080/14786430701287377?src=getftr (accessed on 15 December 2023).
60. Buchanan, K.; Colas, K.; Ribis, J.; Lopez, A.; Garnier, J. Analysis of the Metastable Precipitates in Peak-Hardness Aged Al-Mg-Si(-Cu) Alloys with Differing Si Contents. Acta Mater. 2017, 132, 209–221.
61. Zou, Y.; Wu, X.; Tang, S.; Zhu, Q.; Song, H.; Guo, M.; Cao, L. Investigation on Microstructure and Mechanical Properties of Al-Zn-Mg-Cu Alloys with Various Zn/Mg Ratios. J. Mater. Sci. Technol. 2021, 85, 106–117.
62. Holroyd, N.J.; Scamans, G.M. Stress Corrosion Cracking in Al-Zn-Mg-Cu Aluminum Alloys in Saline Environments. Metall. Mater. Trans. A 2013, 44, 1230–1253. Available online: https://link.springer.com/article/10.1007/s11661-012-1528-3 (accessed on 27 November 2023).
63. Janaki Ram, G.D.; Mitra, T.K.; Shankar, V.; Sundaresan, S. Microstructural Refinement through Inoculation of Type 7020 Al–Zn–Mg Alloy Welds and Its Effect on Hot Cracking and Tensile Properties. J. Mater. Process. Technol. 2003, 142, 174–181.
64. Babu, A.P.; Choudhary, S.; Griffith, J.C.; Huang, A.; Birbilis, N. On the Corrosion of a High Solute Al-Zn-Mg Alloy Produced by Laser Powder Bed Fusion. Corros. Sci. 2021, 189, 109626.
65. Babu, A.P.; Kairy, S.K.; Huang, A.; Birbilis, N. Laser Powder Bed Fusion of High Solute Al-Zn-Mg Alloys: Processing, Characterisation and Properties. Mater. Des. 2020, 196, 109183.
66. Li, Z.; Li, S.; Birbilis, N. A Machine Learning-Driven Framework for the Property Prediction and Generative Design of Multiple Principal Element Alloys. Mater. Today Commun. 2023, 107940.

Figure 1. Distribution of tensile strength and elongation in the dataset utilised for the present study.

Figure 2. Workflow for the design of aluminium alloys as explored herein. The workflow commences with data-driven partitions using unsupervised machine learning and ends with class-based optimisation.

Figure 3. Feature importance profiles for (a) $R_{1}$, (b) $R_{2}$, (c) $R_{3}$, (d) $R_{4}$, (e) $R_{5}$, (f) $R_{6}$, (g) $R_{7}$, (h) $R_{8}$, and (i) $R_{all}$. The panels illustrate the relative significance of each feature in predicting mechanical properties. They highlight that certain features have negligible importance, suggesting that the exclusion of some features prior to model training may not adversely impact the model's accuracy. The ideal number of features was calculated using recursive feature elimination.

Figure 4. Recursive feature elimination profile for each class-based regressor and regressor trained on the entire dataset for (a) $R_{1}$, (b) $R_{2}$, (c) $R_{3}$, (d) $R_{4}$, (e) $R_{5}$, (f) $R_{6}$, (g) $R_{7}$, (h) $R_{8}$, and (i) $R_{all}$. This figure demonstrates that a subset of features can achieve the same level of accuracy as using the entire feature set.

Figure 5. Comparison of Pareto fronts $P_{1}$, $P_{2}$, and $P_{6}$, each exhibiting superior tensile strength and elongation compared with the Pareto front $P_{all}$. The Pareto fronts are compared with the Pareto front reported by Sekhar et al. [40] and with experimentally tested alloys similar to the predicted alloys [41,42,43]. A detailed presentation of all other Pareto fronts can be found in this paper's supplementary information.

Figure 6. Optimal alloy concentrations on Pareto front 1. This figure illustrates two distinct regions of tensile strength: one below 150 MPa and another above 275 MPa. (a) Mg concentration, (b) Mn concentration, (c) Cr concentration, and (d) Sc concentration.

Figure 7. Optimal alloy concentrations on Pareto front 2. (a) Cu concentration, (b) Mg concentration, (c) Mn concentration, (d) Zn concentration, (e) Ni concentration, and (f) Li concentration.

Figure 8. Optimal alloy concentrations on Pareto front 6. (a) Cu concentration, (b) Mg concentration, (c) Cr concentration, (d) Zn concentration, (e) Li concentration, and (f) Si concentration.

Table 1. The optimal number of features for each regressor and test set accuracy of the model trained with all features and selected features. $R_{i}$ denotes a regressor trained on all the features. $R_{i}^{'}$ denotes a regressor trained on select features.

Class ($i$)	Optimal Number of Features	Test MAE for $R_{i}$	Test MSE for $R_{i}$	Test MSE for $R_{i}^{'}$	Test MAE for $R_{i}^{'}$
1	9	0.0311	0.0027	0.0028	0.0320
2	15	0.0351	0.0026	0.0026	0.0350
3	8	0.0438	0.0039	0.0038	0.0433
4	2	0.0364	0.0023	0.0018	0.0315
5	2	0.0423	0.0027	0.0024	0.0397
6	9	0.0290	0.0019	0.0020	0.0290
7	1	0.0720	0.0094	0.0092	0.0703
8	2	0.0491	0.0048	0.0069	0.0586
all	25	0.0403	0.0034	0.0034	0.0401
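
As a reading aid for Table 1, the effect of feature selection on each regressor can be expressed as a relative change in test MAE. The short sketch below uses the MAE values transcribed from the table. The helper name `relative_change` and the choice of a percentage metric are assumptions made here for illustration and are not part of the original analysis.

```python
# Test MAE values transcribed from Table 1:
# class label -> (MAE for R_i with all features, MAE for R_i' with selected features).
table_1_mae = {
    "1": (0.0311, 0.0320),
    "2": (0.0351, 0.0350),
    "3": (0.0438, 0.0433),
    "4": (0.0364, 0.0315),
    "5": (0.0423, 0.0397),
    "6": (0.0290, 0.0290),
    "7": (0.0720, 0.0703),
    "8": (0.0491, 0.0586),
    "all": (0.0403, 0.0401),
}


def relative_change(full: float, selected: float) -> float:
    """Percentage change in MAE after feature selection (negative means lower error)."""
    return 100.0 * (selected - full) / full


changes = {label: relative_change(*mae) for label, mae in table_1_mae.items()}

for label, change in changes.items():
    print(f"Class {label}: {change:+.1f}% change in test MAE")

# Classes with the largest reduction and the largest increase in error.
most_improved = min(changes, key=changes.get)
most_degraded = max(changes, key=changes.get)
print(most_improved, most_degraded)
```

On these values, Class 4 shows the largest reduction in test MAE (roughly 13%) and Class 8 the largest increase (roughly 19%), while most other classes change by only a few percent.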

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Bhat, N.; Barnard, A.S.; Birbilis, N. Inverse Design of Aluminium Alloys Using Genetic Algorithm: A Class-Based Workflow. Metals 2024, 14, 239. https://doi.org/10.3390/met14020239
