Prediction of the Ultimate Tensile Strength (UTS) of Asymmetric Friction Stir Welding Using Ensemble Machine Learning Methods

Abstract: This research aims to develop ensemble machine-learning methods for forecasting the ultimate tensile strength (UTS) of friction stir welding (FSW). The materials used in the experiment were the dissimilar aluminum alloys AA5083 and AA6061. An ensemble machine learning model was created to predict the UTS of the friction stir-welded seam, utilizing 11 FSW parameters as input factors and the UTS as the response variable. The proposed approach used the Gaussian process regression (GPR) and support vector machine (SVM) models of machine learning to build the ensemble machine learning model. In addition, an efficient technique using a differential evolution algorithm to optimize the weights for the decision fusion was incorporated into the proposed model. The effectiveness of the model was evaluated using three datasets. The first and second datasets were divided into two groups, with 80% for the training dataset and 20% for the testing dataset, while the third dataset comprised the test data used to validate the model's accuracy. The computational results indicated that the proposed model provides more accurate forecasts than existing methods, such as random forest, gradient boosting, Ada boosting, and the original SVM and GPR, by 30.67%, 49.18%, 16.50%, 48.87%, and 49.33%, respectively. In terms of prediction accuracy, the suggested technique for decision fusion surpasses unweighted average ensemble learning (UWE) by 10.32%.


Introduction
Materials made of aluminum alloys are regularly used in a variety of industrial fields, including the subsea, petroleum, aerospace, shipbuilding, rail transportation, and automotive industries, among many others [1]. In particular, aluminum alloy grades AA5083 and AA6061 demonstrate several desirable qualities, including low weight, high specific strength, resistance to corrosion, a high degree of flexibility, and good weldability [2][3][4]. For this reason, they are used as components in the assembly of ships and other large products [5]. Friction stir welding (FSW) is the standard solid-state welding process for assembly because FSW weld seams possess excellent mechanical properties, reduce weld cracking, and are homogenous [6][7][8], leading to a stronger weld. The quality of the mechanical properties of the weld depends on the selected welding parameters [9,10]. According to the authors of [11,12], the controlled parameters of the FSW include the rotation speed (RS), the welding speed (WS), the shoulder diameter (SD), the pin type (PT), the pin length (PL), the pin diameter (PD), the penetration (PN), the tool traveling method (TT), the tilt angle (TA), the type of additive (TD), and the techniques used to add additives (TAD).
These variables determine the essential mechanical properties of the weld, including the ultimate tensile strength (UTS) [13] and the peak temperature (PT) [14], as well as the type of welding defect [15]. The UTS has received the most attention as one of the most important parameters determining the quality of the FSW weld seam [16,17]. Consequently, it is vital to control the mechanical property parameters [13] while welding. The authors of [18] stated that the UTS could be predicted using the values of controllable parameters as inputs in the forecasting model. Forecasting methodologies, such as response surface methodology (RSM) [19,20], Box-Behnken design [1], and grey relational analysis (GRA) [21,22], have been proposed in the literature. In addition, the authors of [23] utilized an experimental design in conjunction with a variable neighborhood strategy adaptive search (VaNSAS) to find the optimal solution for both single and multiple FSW objectives.
Artificial intelligence (AI) approaches have recently been widely used in manufacturing processes [24][25][26][27] and optimization problems [28][29][30]. Okuyucu et al. [31] applied an artificial neural network (ANN) model to evaluate the effect of the welded joint on mechanical characteristics in order to achieve optimum efficiency in FSW settings. Shojaeefard et al. [32] used ANN modeling to assess whether there was a correlation between the input and output aspects of a process. Tansel et al. [33] modeled the FSW process using genetically optimized neural network systems known as GONNS. Machine learning (ML) has been successfully used to predict the UTS of FSW.
The most recent articles advocating the use of machine learning (ML) and artificial intelligence (AI) to predict the UTS have highlighted the following three research gaps: (1) the gap regarding the parameters utilized to predict the UTS; (2) the gap regarding the methodology or procedure used to predict the UTS; (3) the gap regarding the type of material joining. The rest of this section discusses the specifics of the most recent publications that inspired our study.

(1) The lack of research on the overall number and types of parameters in the previous literature.
The input process parameters that the authors of [34] included in their model were the rotating speed (rpm) and the feed rate (mm/min). The authors of [35][36][37][38] included the tool material, shoulder diameter, pin diameter, traverse speed, and axial force as input parameters for forecasting the UTS of the FSW. As mentioned above, the authors of [11,12] suggested 11 parameters that can influence the UTS of the FSW. To the best of our knowledge, no previous method has used all 11 as input parameters to predict the UTS, which could have improved the prediction quality of the UTS using FSW parameters.
(2) Ensemble machine learning has not yet been utilized to predict the UTS of FSW on the basis of a predefined set of parameters.
Building machine learning models is the second gap in the literature. The authors of [34,39] suggested machine learning models for the ultimate tensile strength (UTS) that incorporated Gaussian process regression (GPR), the support vector machine (SVM), and multiple linear regression (MLR). These strategies have been utilized independently (as homogenous models) but have not been integrated into a heterogeneous model. However, heterogeneous models have been confirmed as generally outperforming homogeneous AI models, for instance, by the authors of [40,41], who employed heterogeneous ensemble models to classify drug-resistant types of tuberculosis patients and for pornographic image categorization and detection. Heterogeneous ensemble AI should therefore be created for the UTS prediction model to increase the quality of the model's solutions.
(3) The heterogeneous ensemble machine learning architecture has not yet been applied to predict the UTS of FSW.
In [34,35,37,39], symmetric FSW was presented to join identical material types. These studies utilized aluminum alloys AA1100, AA6082, AA6061, and AA2050-T8. The research described in [14] predicts the UTS on the basis of the parameters of asymmetric friction stir welding used to combine two distinct types of aluminum alloy. The materials utilized in that study were AA7050 and AA2014A. As previously noted in [2][3][4], AA5083 and AA6061 are commonly used in the assembly of ships and other large products. Studies [34,35,37,39] make no mention of these two materials.
To address these research gaps, a model of ensemble machine learning (EML) architectures will be used. Such approaches combine the predictions of several models to improve overall performance [42]. Currently, ensemble deep learning (EDL) and ensemble machine learning are widely used to solve various types of problems. For example, the authors of [43] proposed a method for classifying the type of drug resistance in tuberculosis patients, and [22] utilized EDL to identify patterns radiologists may have overlooked when examining anomalies in medical images. EDL and EML give excellent prediction results compared with the original versions of the deep learning and machine learning models. Since, to the best of our knowledge, EML has never been used to predict the UTS from various types of input parameters, the contributions of our research are as follows:

1. It is the first time that EML has been used to predict the UTS from eleven controlled parameters.

2. We present an effective decision fusion strategy integrating the unweighted average model and a well-known metaheuristic, the differential evolution (DE) algorithm.
The remainder of this study is organized as follows. Section 2 presents the relevant literature, Section 3 explains the materials and methods used in our research, and Section 4 presents the results. In Sections 5 and 6, we discuss those results within the context of previous studies and give our conclusions and outlook on future studies.

Related Literature
Tansel et al. [33] modeled the FSW process using genetically optimized neural network systems known as GONNS. Ghetiya et al. [44] optimized the input parameters for FS-welded AA8011 using a Taguchi-based grey relational technique. It was discovered that the highest possible tensile strength was attained with a tool diameter of 14 mm, a transverse speed of 80 mm per minute, and a spindle speed of 1400 revolutions per minute. The support vector regression (SVR) model was utilized by Na et al. [45] to forecast the residual stresses of dissimilar metals in tungsten arc welding. The conclusion was that these models accurately predicted the outcomes of experiments. Wang et al. [46] discovered that a support vector machine (SVM) precisely classifies the defective and non-defective aspects of the weld. The Gaussian process regression (GPR) method was applied by Pal and Deswal [47] to water engineering problems. In addition to the GPR and SVM models, ANN models have frequently been used by researchers to forecast the performance of industrial processes [48][49][50].
The two-stage SVM-ANN technique was suggested in [51] to forecast the weld bead geometry of the welding process. The authors of [52] presented a model for predicting the backside bead width, which quantitatively evaluates the penetration of the weld joint from the weld pool surface. The proposed model was constructed via neural networks. Support vector regression (SVR) was utilized by the model to reduce the number of trained datasets. For a relatively small quantity of available training data, the enhanced SVR model delivered a more accurate prediction of the backside bead width, according to the model's findings. A shear and tensile strength prediction model for diffusion-bonded AA5083 and AA7075 is presented in [53]. The proposed model used an ANN to forecast the welding's response. The authors of [54] utilized an ANN to estimate the Vickers microhardness of friction stir-welded AA6061 sheets.
The authors of [55] discuss the use of machine learning techniques to analyze and forecast the tensile performance of friction stir-welded AA6082. Rotational speed and feed rate were utilized as input variables, and ultimate tensile strength (UTS) was measured as the response parameter. The experiment was conducted using a full factorial design. Random forest regression, M5P tree regression, and an artificial neural network (ANN) were used to validate the experimental outcomes. A support vector machine (SVM) was used to estimate the mechanical properties of 6 mm thick friction stir-welded (FSW) AA6061 joints [56]. Experiments were also conducted to determine the attributes of the welded structures in terms of the factors affecting the tensile strength of the welded joints. The tensile strength data from the experiments using various combinations of governing parameters were classified into two classes, namely high tensile strength and low tensile strength. These two data classes were used as inputs for SVM classifier training and testing to classify data patterns and construct models. Table 1 displays the conclusions of our literature review regarding the controlled input parameters, outputs, methodologies, and types of material.
Table 1 indicates that our proposed ensemble machine-learning model includes nine controlled welding parameters. We aimed to use these parameters to predict the UTS. The proposed model is the first to use EML to predict the UTS of FSW from these nine parameters.

Materials and Methods
This section covers the research methodology used to construct the machine learning model for UTS prediction from the defined parameters. Figure 1 presents the research framework used in our study.

Dataset Preparation
The data utilized for this study consisted of three datasets. The first dataset was used by [34]. The other two datasets were generated in this study. These datasets contain the nine defined FSW parameters used to predict the UTS. Detailed descriptions of each dataset are provided below.

Dataset 2PI-V1
This dataset is taken from [34] and contains 25 sets of data. The defined input parameters are the rotation speed and welding speed, and the output of the dataset is the UTS.

Dataset 11PI-V1
This dataset was obtained via a lab experiment using AA6061 and AA5083. In the FSW experiment, worksheet plates 75 mm wide, 175 mm long, and 6 mm thick were used; the mechanical properties of the two aluminum alloys are listed in Table 2. AA5083 was placed on the retreating side (RS), and AA6061 was placed on the advancing side (AS), because AA5083 has a lower strength than AA6061. This placement facilitates material movement and smooth mixing from the RS to the AS [65,66] and reduces or eliminates defects in the weld structure [67]. Figure 2 depicts the FSW process.

A review of the literature on the mechanical properties of FSW of aluminum alloys shows that the parameters of interest fall into two groups: continuous and categorical variables. There were seven continuous variables: (1) pin length, (2) shoulder diameter, (3) pin bottom diameter, (4) tilt angle, (5) rotational speed, (6) welding speed, and (7) penetration. The four categorical variables were set as follows: (1) pin type: hexagonal cylindrical (HC), straight cylindrical (SC), and threaded cylindrical (TC); (2) additive: aluminum oxide (Al2O3) and silicon carbide (SiC); (3) additive technique: drill and groove methods; (4) traveling method: circles (CC), straight (ST), and zigzag (ZZ). They are shown in Table 3.

The experiments were conducted over the range of parameters specified in Table 2, and the selected dataset contained complete friction-welded joints free from visible external defects [67][68][69]. In total, 60 experiments were performed. Figure 3 shows the experimental procedure of the friction stir welding process. When all specimens had been welded, the tensile test specimens were prepared using a water jet cutting machine to cut the transverse specimens with reference to ASTM E8M-04. The UTS of each experiment was determined using three tensile specimens. The tensile tests were carried out at room temperature with a crosshead speed of 0.5 mm/min using an LLOYD LS100-Plus universal testing machine. Because of the high cost of the experiment, we generated the dataset using a full factorial design. Dataset 11PI-V1 therefore has three subgroups, named 11PI-V1-40, 11PI-V1-50, and 11PI-V1-60, which randomly select 40, 50, and 60 sets of controlled parameters, respectively, from the full factorial design set. The detailed results of the 60 sets of parameters, including the 40 and 50 subsets, are shown in Table 3.
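Before model training, the four categorical FSW variables described above must be converted into a numerical representation. A minimal one-hot encoding sketch is shown below; the category labels are taken from the text, while the example row is hypothetical.

```python
# One-hot encoding of the four categorical FSW variables (pin type,
# additive, additive technique, traveling method). The example row is
# hypothetical, not an actual run from dataset 11PI-V1.
PIN_TYPES = ["HC", "SC", "TC"]
ADDITIVES = ["Al2O3", "SiC"]
TECHNIQUES = ["drill", "groove"]
TRAVELING = ["CC", "ST", "ZZ"]

def one_hot(value, categories):
    """Return a 0/1 indicator vector for `value` over `categories`."""
    return [1 if value == c else 0 for c in categories]

row = {"pin": "TC", "additive": "SiC", "technique": "drill", "travel": "ZZ"}
encoded = (one_hot(row["pin"], PIN_TYPES)
           + one_hot(row["additive"], ADDITIVES)
           + one_hot(row["technique"], TECHNIQUES)
           + one_hot(row["travel"], TRAVELING))
print(encoded)  # [0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
```

Concatenated with the seven continuous variables, this yields the full input vector for one welding run.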

Experiment 11PI-V2 Dataset
The parameter ranges are given in Table 4. The experiments were designed using the Taguchi method [22], and the welding experiment and tensile test were performed. The results of 36 experiments were used as the dataset for testing and confirming the models' predictions made using datasets from related work and the dataset from experiment 11PI-V1. Each dataset is separated into two groups: the training dataset (80%) and the testing dataset (20%). Table 5 displays the makeup of the datasets in detail.
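The 80/20 partitioning described above can be sketched as a shuffled index split; the sample count and random seed below are illustrative, not the paper's actual settings.

```python
import numpy as np

# 80/20 train/test split of the experimental runs, as applied to each
# dataset. The run count (60) and the seed are illustrative.
rng = np.random.default_rng(42)
n_runs = 60
indices = rng.permutation(n_runs)
n_train = int(0.8 * n_runs)              # 80% for training
train_idx, test_idx = indices[:n_train], indices[n_train:]
print(len(train_idx), len(test_idx))     # 48 12
```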

Machine Learning Methods
Machine learning methods were developed to predict the UTS of FS-welded seams using all defined parameters as input variables, with the ultimate tensile strength (UTS) observed as the response parameter. Consequently, in this study, ensemble learning techniques were applied to improve the efficacy of the UTS prediction. The GPR and SVM machine learning models were employed to form the proposed ensemble machine learning model. In addition, an efficient decision fusion approach was implemented in the proposed model. The decision fusion methodologies are described in greater detail below.

Gaussian Process Regression
In the Gaussian process model, the conditional predictive distribution is used to handle the input (x) and output (y) of a process; these models create a conditional distribution p(y|x). A graphical representation of GPR is presented in Figure 4. In this figure, the boxes, circles, and lines represent the observed variables, unknown values, and connection nodes across the observed values, respectively. As seen in the figure, the information of the adjacent observation values can be assessed from each observation in the GPR models, which calls for a cluster of random variables [70]. However, because of the GPR models' marginalization properties, the observed values may be independent of either the values from the other nodes or the corresponding latent value f(x).

The GPR process mainly assumes that the observation is given by

y = f(x) + ε,  ε ~ N(0, σ_n²),   (1)

so GPR has an additional latent value f(x) for each input value. The ε value in this work represents the observational error; it is identically distributed with noise variance σ_n² and is independent across observations. The relation of one observation to another is described by the covariance function k(x, x′). We propose a particular function called the radial basis function (RBF) kernel:

k(x, x′) = σ_f² exp(−(x − x′)² / (2l²)),   (2)

where σ_f² is defined as the maximum covariance and l is the length-scale parameter. If x ≈ x′, then k(x, x′) approaches this maximum, and f(x) is closely correlated with f(x′). In the regression of Equation (2), the noise is folded into k(x, x′) as

k(x, x′) = σ_f² exp(−(x − x′)² / (2l²)) + σ_n² δ(x, x′),   (3)

where δ(x, x′) is the Kronecker delta function. The covariance function is used, similarly to the SVM's use of kernel functions, to generate the covariance matrix. In this manner, for the case of a known kernel k and noise σ_n², Equations (2) and (3) are sufficient for inference via the negative log-posterior [71].
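A minimal numerical sketch of GPR with the RBF kernel and additive noise follows. The hyperparameter values (σ_f, l, σ_n) and the toy data are illustrative, not the values fitted in this study; only the posterior mean is computed.

```python
import numpy as np

# Minimal GPR sketch with an RBF kernel
# k(x, x') = sigma_f^2 * exp(-(x - x')^2 / (2 l^2)),
# with noise variance sigma_n^2 folded into the training covariance.
# Hyperparameters and data are illustrative.
def rbf(a, b, sigma_f=1.0, length=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return sigma_f ** 2 * np.exp(-d2 / (2 * length ** 2))

def gpr_predict(x_train, y_train, x_test, sigma_n=0.1):
    K = rbf(x_train, x_train) + sigma_n ** 2 * np.eye(len(x_train))
    K_star = rbf(x_test, x_train)
    # Posterior mean: K_* (K + sigma_n^2 I)^(-1) y
    return K_star @ np.linalg.solve(K, y_train)

x = np.linspace(0, 5, 20)
y = np.sin(x)                       # smooth toy target
mean = gpr_predict(x, y, np.array([2.5]))
print(float(mean[0]))               # close to sin(2.5)
```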

Support Vector Machines
Support vector machines (SVMs) are among the supervised learning algorithms used for regression problems. The algorithm handles not only regression tasks but also classification problems.
In the case of regression, the goal of an SVM is to find a function that can accurately predict the constant output value for a given input.
SVM regression works by finding the function that keeps the training points within a margin of tolerance around the prediction while remaining as flat as possible; the margin serves as a measure of confidence in the prediction.
To find this function, the SVM uses a set of input-output pairs called the training set to learn the relationship between the input and output variables. The search is formulated as an optimization problem whose solution is the hyperplane that best fits the input-output pairs within the margin. Once the hyperplane is found, the SVM can use it to predict the output value of a new input: the new input is mapped onto the hyperplane, and the function defined by the hyperplane yields the predicted output value.
SVM regression is useful for tasks where the relationship between the input and output variables is complex and cannot be easily modeled using a linear function.It is also effective in handling high-dimensional data and can handle noise and outliers in the training set.
The radial basis function (RBF) kernel is a nonlinear kernel that is widely used in SVMs. It is defined as

k(x, x′) = exp(−γ‖x − x′‖²),   (4)

where x and x′ are the input vectors and γ (gamma) is a scaling factor.
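The RBF kernel above can be written directly; the gamma value below is illustrative, not the one tuned in this study.

```python
import numpy as np

# RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2), as used by the SVM.
# gamma here is an illustrative value.
def rbf_kernel(x, x_prime, gamma=7.0):
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical inputs -> 1.0
```

Identical inputs give the kernel's maximum of 1.0, and the value decays toward 0 as the inputs move apart, which is what makes the kernel a similarity measure.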

Ensemble Strategy
Both homogenous and heterogenous models were used in our study, and the model that gave the best solution was compared with the methods proposed in the literature. The homogenous structure uses the same model type to conduct the ensemble strategy. The number of ensembled machine learning methods is set to 50. The heterogeneous structure uses 50% GPR and 50% SVM. The generic frameworks of the heterogenous and homogenous ensemble structures are shown in Figure 5.
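The structure of the heterogeneous ensemble (50 base learners, half of each type) can be sketched as follows. The base learners below are trivial stand-ins trained on bootstrap resamples, used only to illustrate the 50/50 composition; the paper's actual learners are GPR and SVM models.

```python
import numpy as np

# Heterogeneous ensemble structure: 50 base learners, 50% of "type A"
# and 50% of "type B" (GPR and SVM in the paper). The learners here are
# trivial stand-ins, each memorizing a statistic of a bootstrap resample.
rng = np.random.default_rng(1)
y_train = rng.uniform(150, 250, size=48)   # synthetic UTS values (MPa)

def make_learner(kind, sample):
    stat = np.mean(sample) if kind == "A" else np.median(sample)
    return lambda: stat

learners = []
for i in range(50):
    kind = "A" if i < 25 else "B"          # 50% / 50% split
    sample = rng.choice(y_train, size=len(y_train), replace=True)
    learners.append(make_learner(kind, sample))

predictions = np.array([f() for f in learners])
print(predictions.shape)                   # (50,)
```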

Decision Fusion Strategy
Each individual machine learning method reports its own prediction. The decision fusion strategy combines the different solutions from these methods into the single final prediction of the ensemble model. This study uses two decision fusion strategies, which are explained below.

Unweighted Average Ensemble (UAE)
The decision fusion strategy (DFS) combines the solutions from the many base learners into a single one representing the proposed model's solution. In this research, the base learner outputs within an ensemble underwent unweighted average (UA) analysis in order to determine the fused decision of the model [72].
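Unweighted average fusion reduces to a plain column-wise mean over the base learners' predictions, as sketched below with hypothetical prediction values.

```python
import numpy as np

# Unweighted average fusion: the ensemble prediction for each test row
# is the plain mean of the base learners' predictions.
# base_preds has shape (n_learners, n_samples); values are hypothetical.
base_preds = np.array([[210.0, 180.0],
                       [214.0, 176.0],
                       [206.0, 184.0]])
fused = base_preds.mean(axis=0)
print(fused)  # [210. 180.]
```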

Weighted Ensemble Optimization Using Differential Evolution (WEDE)
The ensemble machine learning approach develops a model to predict UTS values close to the actual values. It was developed from a group of learning techniques called weighted ensemble optimization using differential evolution (WEDE) to determine the appropriate fusion weights. The decision fusion strategy (DFS) is responsible for combining the solutions of the many base learners into a single solution of the proposed model. First, the unweighted average (UA) is used to combine the solutions, and then WEDE is used to determine the best weights for the UAE to refine the quality of the final solution.
Every prediction task uses a predictive model to estimate the true underlying function [72]. A mathematical optimization that can find the ensemble's optimal weights is used to determine the optimal method of combining the base learners. For regression problems with a continuous target, the prediction bias and the mean square error (MSE), as the expected prediction error, need to be considered. The objective function chosen in the mathematical model for optimizing the ensemble weights is the MSE.
An objective of the optimization model is to determine how to combine the base learners' predictions using optimal weights that result in an ensemble with a minimal total expected root mean square error (RMSE). Equation (5) is used to calculate the RMSE of a single learner, where I is defined as the number of observations, y_i is the true value of observation i, and ŷ_i is the prediction of observation i:

RMSE = sqrt( (1/I) Σ_{i=1}^{I} (y_i − ŷ_i)² ).   (5)

Equation (6) is used to calculate ŷ_i of the ensemble learners, where Ŷ_{j,i} is the prediction of observation i when using model j as the learner, ω_j is the fusion weight of learner j, and J is the total number of learners used in the model:

ŷ_i = Σ_{j=1}^{J} ω_j Ŷ_{j,i}.   (6)

A differential evolution algorithm (DE) is used to find the optimal ω_j in Equation (6). In general, the differential evolution algorithm (DE) consists of the following five steps: (1) generate the initial solution; (2) execute the mutation process; (3) execute the recombination process; (4) execute the selection process; (5) repeat steps (2)-(4) until the termination condition is met. The DE method that was incorporated into the suggested model is described in detail below.
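The objective evaluated by DE, the RMSE of the weighted ensemble prediction, can be sketched as follows. The weight normalization shown here is an assumption for the sketch (raw weights rescaled to sum to one), and the prediction values are hypothetical.

```python
import numpy as np

# Fitness of a weight vector: the ensemble prediction is the weighted
# sum of base-learner predictions (Eq. (6)), scored by RMSE (Eq. (5)).
# Normalizing the raw weights to sum to 1 is an assumption of this
# sketch; base_preds has shape (n_learners, n_samples).
def ensemble_rmse(weights, base_preds, y_true):
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()        # assumed normalization
    y_hat = weights @ base_preds             # weighted fusion (Eq. (6))
    return float(np.sqrt(np.mean((y_true - y_hat) ** 2)))  # Eq. (5)

base_preds = np.array([[1.0, 2.0, 3.0],
                       [3.0, 2.0, 1.0]])
y_true = np.array([2.0, 2.0, 2.0])
print(ensemble_rmse([0.5, 0.5], base_preds, y_true))  # 0.0
```

With equal weights, the two learners' opposite errors cancel exactly, which is why a well-chosen weighting can beat any single learner.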
Step 1: Generating the Initial Solution

To start the problem-solving process with the DE, first, a total of NP sets of random single vectors is generated, where NP is a fixed number set equal to the number of SVMs and GPRs. In our experiment, the total number of SVMs and GPRs was set to 100, as suggested by [40,55]. The values in positions 1-100 were randomly selected. An example with two random vectors (NP = 2) is provided in Table 6. In Table 6, NP1 and NP2 are called target vectors. The value in position j of vector n at iteration G is defined as X_{n,j,G}. Target vector 1 (NP1) has values in positions 1, 2, and 3 of 0.48, 0.76, and 0.32, respectively. When we define µ_{n,j,G} as the ω_j of vector n at iteration G, µ_{n,j,G} can be calculated using Equation (7).
The X_{n,j,G} value can be iteratively updated using Steps (2)-(4), as shown below.
Step 2: Performing the Mutation Process

To perform the mutation process, the mutant vector is generated from three random target vectors, as shown in Equation (8):

V_{n,j,G+1} = X_{r1,j,G} + F (X_{r2,j,G} − X_{r3,j,G}),   (8)

where F is the predefined scaling factor, V_{n,j,G+1} is the mutant vector, and X_{r1,j,G}, X_{r2,j,G}, and X_{r3,j,G} are random target vectors.
Step 3: Performing the Recombination Process

The recombination process is the step that produces the trial vector. Given that j is the position in vector n and the current iteration is defined as G, we used Equation (9) to generate the trial vector (U_{n,j,G+1}):

U_{n,j,G+1} = V_{n,j,G+1} if rand_{n,j} ≤ CR; otherwise, U_{n,j,G+1} = X_{n,j,G},   (9)

where rand_{n,j} is a uniform random number in [0, 1] and CR is the predefined crossover rate.

Step 4: Performing the Selection Process

The selection process finds the best vector to be used as the new target vector for the next iteration. The selection is made between the trial and target vectors so that the better vector is chosen. However, becoming trapped in a local optimum is avoided by employing Equation (10), which involves the DE acceptance mechanism. This mechanism is likely to accept inferior solutions depending on the solution quality and the number of iterations, where f(X_{n,j,G}) is defined as the objective function (RMSE) of vector n at iteration G.
Step 5: Repeating Steps (2)-(4) Until the Termination Condition Is Met

Steps (2)-(4) are iteratively repeated; the stopping criterion established here is the number of iterations, which is set to 100 [73].
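Steps 1-5 can be sketched as a minimal DE loop. The population size, F, CR, and the toy objective below are illustrative choices, and plain greedy selection is shown in Step 4; the paper's Equation (10) additionally allows some inferior solutions to be accepted.

```python
import numpy as np

# Minimal DE sketch of Steps 1-5: mutation V = X_r1 + F*(X_r2 - X_r3),
# binomial recombination with rate CR, and greedy selection. F, CR,
# n_pop, and the toy objective are illustrative.
def de_minimize(objective, dim, n_pop=20, F=0.8, CR=0.9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.random((n_pop, dim))                    # Step 1: init targets
    fit = np.array([objective(x) for x in pop])
    for _ in range(iters):                            # Step 5: repeat
        for n in range(n_pop):
            r1, r2, r3 = rng.choice(n_pop, size=3, replace=False)
            v = pop[r1] + F * (pop[r2] - pop[r3])     # Step 2: mutation
            cross = rng.random(dim) < CR              # Step 3: recombination
            u = np.where(cross, v, pop[n])
            fu = objective(u)                         # Step 4: selection
            if fu < fit[n]:
                pop[n], fit[n] = u, fu
    best = int(np.argmin(fit))
    return pop[best], float(fit[best])

# Toy objective: squared distance of the weight vector from a known optimum.
target = np.array([0.2, 0.8])
best_x, best_f = de_minimize(lambda w: float(np.sum((w - target) ** 2)), dim=2)
print(best_f < 1e-2)
```

In the proposed model, the toy objective would be replaced by the ensemble RMSE of Equation (5) evaluated with the candidate weight vector.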

Results
The model was implemented in Python and run on a PC with an Intel Core i7 2.1 GHz (eight-core) CPU, 32 GB of RAM, and a Tesla V100 GPU (16 GB of GPU RAM). The model was evaluated on the datasets mentioned above using RMSE and the correlation coefficient (CC). The framework of the experiments is shown in Figure 6. The user-defined parameters of the Gaussian process regression (GPR), support vector machine (SVM), random forest (RF), Ada boosting (AB), gradient boosting (GB), and proposed (WEDE) models are summarized in Table 7.

Regressor | User-Defined Parameters
Gaussian process regression [34] | kernel = 'rbf', gamma = 7, noise = 0.2
Support vector machine [34] | kernel = 'rbf', gamma = 7, C = 0.2
Random forest [55] | learners = 100, max leaf = 1
Ada boosting [63] | learners = 100, max leaf = 5
Gradient boosting [37] | learners = 100, max leaf = 5, learning rate = 0.001
Our proposed ensemble learning (HO-UWE, HE-UWE, HO-WEDE, and HE-WEDE) | learners = 100

Testing the Proposed Model with the Existing Dataset (Dataset 2PI-V1)
The dataset from [34] was utilized to verify the proposed model. The compared models are referred to as GPR [34], SVM [34], RF [55], AB [63], and GB [37], whereas the proposed models are referred to as GPR-Ho-UWE, GPR-Ho-WEDE, SVM-Ho-UWE, and SVM-Ho-WEDE. The models with a heterogeneous ensemble structure are referred to as HE-UWE and HE-WEDE when UWE and WEDE, respectively, are used as the decision fusion approach. The computational results are shown in Table 8. Table 8 reveals that the best existing approach for predicting the UTS based on the two controlled parameters listed in dataset 2PI-V1 produced a 46.46% less accurate solution than the best proposed model (HE-WEDE) (the RMSE decreased from 5.94 to 3.18). Using WEDE as the decision fusion approach in the model resulted in a 25.00% more accurate solution than UWE's. The results of the cross-validation experiment using two-, three-, and fivefold cross-validation are shown in Table 9. The computational results presented in Table 9 demonstrate that HE-WEDE provides the best solution among all approaches, as evidenced by its lower RMSE and variance.

Testing the Proposed Model with the Newly Built Dataset (Dataset 11PI-V1)
This dataset was gathered via an experiment, as previously noted. The details of the newly built dataset are shown in Table 10. The computational results of the proposed model compared with those of the other (re-programmed) methods are shown in Table 11. K-fold validation was also used to verify the results; in this experiment, five-fold validation was used. From Table 11, we can see that HE-WEDE is the best proposed method. It gives solutions better than those of the methods proposed in [34] by 23.45% and better than those of RF, AB, and GB by 25.91%. Using DE to optimize the weights for the decision fusion strategy (DFS) improved the solution quality over UWE by 1.98%.
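The five-fold validation used here can be sketched as a plain index partition; the function name is illustrative, and a 36-row dataset is assumed only to match the dataset sizes reported later:

```python
import numpy as np

def kfold_indices(n, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation:
    the data are shuffled once, split into k folds, and each fold in
    turn serves as the test set while the rest train the model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Five folds over a 36-sample dataset; every row appears in exactly one test fold
for train, test in kfold_indices(36, k=5):
    print(len(train), len(test))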

Sensitivity Analysis on the Changing of Problem Size (Number of Sets of Data in a Dataset)
In this section, data from dataset 11PI-V2 were selected randomly to generate new collections of at most 60 sets of data. Consequently, the dataset has three subsets, 11PI-V2-40, 11PI-V2-50, and 11PI-V2-60, with 40, 50, and 60 sets of data, respectively. We also used five-fold cross-validation (5-cv) to validate the experiment's results. Eighty percent of the data in each subset were selected as the training data, while the remaining twenty percent served as the test data. Table 12 displays the results of this experiment.
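The subset construction and 80/20 split described above can be sketched as follows; the function name is illustrative, and the 60-row source size is an assumption matching 11PI-V2-60:

```python
import numpy as np

def subset_split(n_total, subset_size, seed=0, train_frac=0.8):
    """Randomly draw `subset_size` rows from a dataset of `n_total` rows,
    then split them 80/20 into train and test index arrays."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(n_total, size=subset_size, replace=False)
    n_train = int(round(train_frac * subset_size))
    return idx[:n_train], idx[n_train:]

# The three 11PI-V2 subsets: 40, 50, and 60 sets of data
for size in (40, 50, 60):
    tr, te = subset_split(60, size)
    print(size, len(tr), len(te))
```

This yields 32/8, 40/10, and 48/12 train/test rows for the three subsets, matching the 80%/20% proportions stated above.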
The results in Table 12 indicate that, as the number of sets in the dataset increases, the forecast accuracy also increases. The number of data sets increased by 50% and 20% from 40 to 60 and from 50 to 60, respectively. The accuracy of the model increased by 5.72% when 60 data sets were utilized, as opposed to 4.68% when 40 or 50 were used. The test times for 11PI-V2-40, 11PI-V2-50, and 11PI-V2-60 were 0.42, 0.50, and 0.54 min, respectively, which are significantly different.

The proposed model was also tested with the newly built unseen dataset, which comprises 36 sets of data. The details of the dataset are shown in Table 13, and the results of the model testing are shown in Table 14. According to the computational results in Table 14, HE-WEDE provides a better result than all of the existing and compared approaches. HE-WEDE improves the solution quality relative to GPR [34], SVM [34], RF [55], AB [63], GB [37], GPR-HO-UWE, GPR-HO-WEDE, SVM-HO-UWE, SVM-HO-WEDE, and HE-UWE by 49.33%, 48.87%, 30.67%, 16.50%, 49.18%, 40.73%, 17.11%, 15.25%, 13.08%, and 10.32%, respectively. Figure 7a-c illustrate these results.

Discussion
The robustness and accuracy of the proposed models need to be discussed and contrasted with those in the literature. We performed a comparative analysis between the results found in [34,55] and the results produced by our model for the same dataset. The authors of [34,55] modeled simple machine-learning techniques. We used the model parameters the authors determined, given in Table 5, to predict the UTS of FSW using varying levels of the controlled parameters, namely (1) the tool traverse speed (mm/min) and (2) the tool rotational speed (RPM). The computational results of testing our proposed method against the existing methods show that the proposed model HE-WEDE gives 46.46%, 48.62%, 52.68%, 77.45%, and 79.47% better solutions than GPR [34], SVM [34], RF [55], AB [63], and GB [37], respectively.
The reason why the proposed model outperforms the existing methods proposed in the literature is that the existing models use a homogeneous ensemble architecture (GPR, SVM) or a single-model architecture (RF, AB, GB), which is more likely to produce a weaker solution than a model using a heterogeneous ensemble architecture. This finding is supported by the conclusions of [40,41]. These two studies compared the results obtained from homogeneous and heterogeneous ensemble architectures used to classify the type of drug resistance from chest X-ray images. The results demonstrated that the heterogeneous ensemble architecture provides a superior solution to the homogeneous one. In addition, the decision fusion strategy described in this study is more effective than current methods, such as the unweighted average ensemble (UWE). This is one reason why the suggested model significantly outperforms the existing heuristics, a result also supported by [40,41] and [74,75]. The suggested decision fusion approach (DE) outperforms UWE by an average of 17.80% for the existing dataset and 17.97% for the new experimental dataset. Consequently, the proposed decision fusion strategy can enhance the quality of the existing methods' solutions.
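As an illustration of the weight-optimized fusion idea, the sketch below runs a basic DE/rand/1 loop over normalized fusion weights, minimizing the ensemble RMSE on a validation set. All names and hyperparameters (population size, F, CR) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def wede_weights(preds, y_true, pop=20, gens=100, F=0.8, CR=0.9, seed=0):
    """Differential-evolution search for fusion weights (one per base
    learner, kept positive and normalized to sum to 1) that minimize
    the RMSE of the weighted-average ensemble prediction."""
    rng = np.random.default_rng(seed)
    preds = np.asarray(preds, float)      # shape: (n_models, n_samples)
    y_true = np.asarray(y_true, float)
    n = preds.shape[0]

    def rmse_of(w):
        w = np.abs(w)
        w = w / w.sum()                   # positive, normalized weights
        fused = w @ preds
        return np.sqrt(np.mean((fused - y_true) ** 2))

    X = rng.random((pop, n))              # initial population of weight vectors
    fit = np.array([rmse_of(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            others = [j for j in range(pop) if j != i]
            a, b, c = X[rng.choice(others, 3, replace=False)]
            mutant = a + F * (b - c)      # DE/rand/1 mutation
            cross = rng.random(n) < CR    # binomial crossover mask
            if not cross.any():
                cross[rng.integers(n)] = True
            trial = np.where(cross, mutant, X[i])
            f_trial = rmse_of(trial)
            if f_trial < fit[i]:          # greedy selection
                X[i], fit[i] = trial, f_trial
    best = np.abs(X[fit.argmin()])
    return best / best.sum()
```

For example, if one base learner predicts the validation targets exactly and the other is biased, the search drives nearly all of the weight onto the accurate learner, whereas UWE would always split the weight 50/50.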
Using the existing dataset, the authors of [34] reported a 56.13% difference in error between the training and testing datasets, whereas with our approach the difference in the error rate between the training and testing datasets was only 15.78%. Using our dataset, our method also yielded a smaller difference between the training and testing datasets: the existing approaches [37,55,63] yield an average difference of 55.36% between the training and test datasets, whereas our method yields a difference of 12.71%. This indicates that our approach is robust to alterations of the dataset and will continue to perform well even if the dataset is altered. In addition, from these findings, we can deduce that as the number of parameters used to predict the UTS increases, the performance of the proposed model remains both good and superior to that of existing methods, such as SVM, GPR, RF, AB, and GB.
The computational results are presented in Table 13. We used the controlled parameters at the different levels given in Table 2, and the proposed model predicted the UTS most accurately, consistent with the results from [63], which modeled machine learning and used a cognitive learning technique to reduce the RMSE value and improve the accuracy of the outcome prediction. To confirm the results of our generated model, we used the model from Table 13 to predict the UTS for another set of parameters, shown in Table 6. It achieved the highest accuracy, with an RMSE and CC of 3.39 MPa and 0.9965, respectively. This is a better result than those found in [34,55].
In this research, we were interested in the accuracy of UTS prediction based on an increased number of input parameters and in whether our proposed model predicts the UTS more accurately than the models with fewer input parameters presented in [24,34,35,37]. We found that our model's results were more accurate. Therefore, the suggested model was successfully employed to predict the UTS of FS welds using all input parameters and can provide answers without destroying the sample.

Conclusions
In this study, a machine learning ensemble model based on 11 types of regulated FSW parameters was developed to predict the UTS. Alloy grades AA5083 and AA6061 were joined using the FSW method performed in the experiment. The proposed method integrates Gaussian process regression (GPR) and support vector machine (SVM) models. When combining different types of machine learning models into one, it was necessary to build an efficient decision fusion approach. In this study, two distinct decision fusion approaches were used, namely, the unweighted average ensemble (UWE) and the differential evolution (DE) algorithm, which optimizes the weights used to fuse the GPR and SVM solutions.
Three distinct datasets were used to assess the proposed model, and the model was compared with the existing approaches found in prior studies. The first dataset (2PI-V1) was obtained from the literature, whereas the other two were acquired experimentally using random parameter selection (11PI-V1) and Taguchi methods (11PI-V2). Testing the new model on the existing dataset (2PI-V1), which contains 25 sets of data, and comparing the outcome with the existing prediction methods of GPR, SVM, AB, RF, and GB demonstrated that the proposed model outperformed the current strategies. The model improved the quality of the solution by 54.94% on average, with maximum and minimum improvements of 77.45% and 46.46%, respectively. The result was validated using two-, three-, and five-fold cross-validation and was consistent across all validation folds. Two ensemble structures, homogeneous and heterogeneous, were evaluated during model testing. The heterogeneous structure, which employs at least two forms of machine learning in the same model, provided a solution superior to that of the homogeneous structure by 17.83%. Moreover, according to the computational results, DE, which is used to optimize the decision fusion weights, outperformed UWE by 25.00%.
The proposed model outperformed all other current approaches when tested with the 11PI-V1 dataset, with an average improvement of 26.49%. The heterogeneous structure also outperformed the homogeneous structure by 13.93%, while DE outperformed UWE by 1.98%. For both datasets, the experimental results confirm that the suggested model outperformed the existing techniques; additionally, the heterogeneous ensemble structure and the application of DE to optimize the fusion weights improved upon prior techniques. We did not use a full factorial design to conduct the experiment and create the dataset in this work, as we believe machine learning should be robust enough to forecast non-pattern data. The models employing 40, 50, and 60 different sets of data surpassed all previously described methods, with an average improvement of 20.79%. This indicates that the machine learning model is more robust to the number of accessible data sets than earlier methods. Nevertheless, one conclusion we gleaned from the experiment was that using more sets of data results in a more accurate prediction model. The suggested model was further evaluated with an unseen dataset (11PI-V2), and the results matched those of the first two experiments. We can confirm that the proposed method is an effective strategy that can be utilized to predict the UTS from several dataset sources. It can also forecast wider sets of parameters than the existing models reported in the scientific literature.
There are four alternative ways to expand our study in the future to achieve a greater level of research quality: (1) effective decision fusion methodologies, such as artificial multiple intelligence systems (AMISs) or other weight optimization approaches, can be investigated in depth, as can the best decision-making strategy for fusing various ML methods into a single answer; (2) the efficiency of the ML methods can be increased using different approaches to find the optimal ML parameters, such as discovering a more efficient layer of the SVM and GPR; (3) an online application based on the proposed model can be developed to make it easier for a welder to select the appropriate parameters to achieve the desired UTS;

Figure 1. The framework of the proposed model used to predict the UTS.

Figure 5. Generic frameworks of the homogeneous and heterogeneous structures of the EML.


Table 1. Parameters, outputs, methods, and materials used in the previous literature and in our study.

Table 2. Mechanical properties of the aluminum alloy.


Table 3. Input parameters of FSW.


Table 4. Input parameters of FSW for testing and confirmation.

Table 5. Details of the datasets.

Table 6. An example with two random vectors.

Table 8. Performance of different machine learning models on the FSW-2PI-V1 dataset.

Table 9. K-fold cross-validation of the models with the training dataset.

Table 11. Performance of different machine learning models on the FSW-11PI-V1 dataset.

Table 12. Sensitivity test with different numbers of sets of data.

Testing the Effectiveness of the Proposed Method with the Unseen Dataset (11PI-V2)

Table 14. Performance of different machine learning methods on the FSW-11PI-V2 dataset.