A Novel Application of League Championship Optimization (LCA): Hybridizing Fuzzy Logic for Soil Compression Coefficient Analysis

Featured Application: The present work proposes a capable hybrid intelligent approach that can promisingly be used for estimating the compression coefficient of soil in civil and geotechnical engineering projects.

Abstract: Employing the league championship optimization (LCA) technique for adjusting the membership function parameters of the adaptive neuro-fuzzy inference system (ANFIS) is the focal objective of the present study. The mentioned optimization is carried out for better estimation of the soil compression coefficient (SCC) using twelve key factors of soil, namely the depth of sample, percentage of sand, percentage of loam, percentage of clay, percentage of moisture content, wet density, dry density, void ratio, liquid limit, plastic limit, plasticity index, and liquidity index. This information is widely usable in designing high-rise buildings located in smart cities. Notably, the used data were collected from a real-world construction project in Vietnam. The hybrid ensemble of LCA-ANFIS is developed, and the best structure is determined by a three-step sensitivity analysis process. The prediction accuracy of the proposed hybrid model is compared with that of a typical ANFIS to examine the efficiency of the combined LCA. Based on the results, applying the LCA algorithm led to a 4.88% and 6.19% decrease in prediction error, in terms of the root mean square error and mean absolute error, respectively. Moreover, the correlation index rose from 0.7351 to 0.7539, which indicates the higher consistency of the hybrid model results. Due to the acceptable accuracy of the proposed LCA-ANFIS model, it can be a promising alternative to common empirical and laboratory methods.


Introduction
Determination of the physio-mechanical parameters of soil is a significant task for the economical and safe design of civil engineering structures. The soil compression coefficient (SCC) is one of these parameters, reflecting the potential for volume decrease in the soil [1]. The importance of this parameter lies in the changes in the structure of soil that follow the mentioned compression [2,3]. More clearly, once a load is applied to saturated soils, the pore water pressure increases, and the soil mass experiences a gradual volume reduction (consolidation) as the water drains.

Figure 1 depicts an overall view of the steps carried out to meet the purpose of the current study. Each task is detailed in the next sections. Three well-known accuracy criteria are used for evaluating the quality of the performance of the models: the root mean square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R^2), which are expressed as follows:

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( Y_i^{observed} - Y_i^{predicted} \right)^2}

MAE = \frac{1}{N} \sum_{i=1}^{N} \left| Y_i^{observed} - Y_i^{predicted} \right|

R^2 = 1 - \frac{\sum_{i=1}^{N} \left( Y_i^{observed} - Y_i^{predicted} \right)^2}{\sum_{i=1}^{N} \left( Y_i^{observed} - \overline{Y}^{observed} \right)^2}

where Y_i^{predicted} and Y_i^{observed} symbolize the predicted and observed SCCs, respectively. Also, \overline{Y}^{observed} is the average of the observed SCCs, and N denotes the number of samples.
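These three criteria can be sketched as follows; the study implemented its models in MATLAB, so this NumPy version is only an illustrative restatement of the standard formulas, not the authors' code:

```python
import numpy as np

def rmse(y_obs, y_pred):
    """Root mean square error between observed and predicted SCC values."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))

def mae(y_obs, y_pred):
    """Mean absolute error."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_obs - y_pred)))

def r2(y_obs, y_pred):
    """Coefficient of determination (R^2)."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

For a perfect prediction, rmse and mae return 0 and r2 returns 1.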


The Adaptive Neuro-fuzzy Inference System
The name ANFIS indicates an ANN-based hybrid of fuzzy logic, inspired by decision-making in everyday life. This model was first proposed by Jang [20]. As an efficient intelligent model, ANFIS has shown good robustness in solving non-linear problems [21]. However, it might not yield preferable results in unfamiliar circumstances [22]. As mentioned, it is an integration of an ANN (based on back-propagation gradient descent and least-squares learning methods) and an FIS, capturing the advantages of both [23-26]. As a result, ANFIS has outperformed the unreinforced FIS for non-linear issues [27]. Up to now, many scholars have satisfactorily used this tool for various scientific aims [28-30].
Among various fuzzy inference methods, Mamdani, Takagi-Sugeno, and Tsukamoto are three notable ones that are commonly used to develop an ANFIS [31]. Figure 2 shows the structure of a Takagi-Sugeno FIS consisting of two inputs flowing through five layers. The membership function (MF) values of each input variable are calculated in the first-layer nodes:

O_{1,i} = \mu_{F_i}(x), \quad O_{1,i} = \mu_{G_i}(y), \quad i = 1, 2

where F and G are the linguistic variables, \mu_{F_i}(x) and \mu_{G_i}(y) represent the MFs, and x and y are the inputs. Based on Equations (6) and (7), the nodes in the two following layers calculate and normalize the rule firing strengths:

w_i = \mu_{F_i}(x) \, \mu_{G_i}(y), \quad \overline{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2

In the fourth layer, a node function is associated with each node, where j_i, k_i, and l_i denote the result parameters:

O_{4,i} = \overline{w}_i f_i = \overline{w}_i (j_i x + k_i y + l_i)

Finally, the network gives the response as the summation of the signals received from the previous layer:

O_5 = \sum_i \overline{w}_i f_i
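The five-layer flow described above can be illustrated with a minimal two-input, two-rule first-order Sugeno forward pass; the function and parameter names below are hypothetical, chosen only to mirror the notation of the text (Gaussian MFs and consequent parameters j_i, k_i, l_i):

```python
import numpy as np

def gauss_mf(v, c, s):
    """Gaussian membership function with center c and width s."""
    return np.exp(-((v - c) ** 2) / (2.0 * s ** 2))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise:    [(cF1, sF1), (cF2, sF2), (cG1, sG1), (cG2, sG2)]  Gaussian MF params
    consequent: [(j1, k1, l1), (j2, k2, l2)]                      linear rule params
    """
    (cF1, sF1), (cF2, sF2), (cG1, sG1), (cG2, sG2) = premise
    # Layer 1: fuzzification of each input
    muF = [gauss_mf(x, cF1, sF1), gauss_mf(x, cF2, sF2)]
    muG = [gauss_mf(y, cG1, sG1), gauss_mf(y, cG2, sG2)]
    # Layer 2: rule firing strengths w_i = muF_i(x) * muG_i(y)
    w = [muF[i] * muG[i] for i in range(2)]
    # Layer 3: normalized firing strengths
    wbar = [wi / sum(w) for wi in w]
    # Layer 4: rule consequents f_i = j_i*x + k_i*y + l_i
    f = [j * x + k * y + l for (j, k, l) in consequent]
    # Layer 5: weighted sum of rule outputs
    return sum(wb * fi for wb, fi in zip(wbar, f))
```

With identical premise MFs, both rules fire equally and the output is the plain average of the rule consequents, which makes the normalization in layer 3 easy to check by hand.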


League Championship Optimization
Inspired by competitions in sports leagues, Kashan [32] introduced the LCA algorithm as a new evolutionary algorithm attempting to find the optimum solution of problems over a continuous search space. The initial population of the algorithm (the league) consists of a group of L randomly created solutions. Each solution is attributed to a team, indicating the team's current formation. In this algorithm, the fitness value is represented as the "playing strength" associated with the corresponding team formation. Due to the greedy selection of the LCA, the present formation is aimed to be replaced by a more potent one [33].
The number of seasons (S) is a termination factor, each season comprising L−1 weeks (repetitions), which yields S × (L−1) contest weeks (note that L is an even value). Considering the league schedule, the existing teams play in pairs every week. The team's playing strength, determined by its formation, decides the outcome of the match. Each team aims to update its formation during the recovery time, when it reviews the events of the previous weeks [34]. The flowchart of the LCA is schematically presented in Figure 3. Some examples of the governing rules of the LCA are: the higher the value of playing strength, the higher the likelihood of winning the game; the outcome of a match cannot be predicted; and a match can only result in a win or a loss. Based on Figure 3, a brief explanation of the main modules is presented below:

League Schedule Development
A nonrandom order is generated for a season to enable teams to have a match against each other. The LCA does this task by making a single round-robin program in which only one match is held between any two teams during a season. Consequently, when L teams are involved, L(L−1)/2 games are required. Figure 4 shows an example of a single round-robin scheduling algorithm. As is seen, it contains eight teams, where in week 1 the matches are between teams 1 and 8, teams 2 and 7, and so on. In the next week, the position of team 1 is fixed and the other teams rotate clockwise. This process is repeated in the following weeks until the initial status is met. Remarkably, a dummy team is considered when L is odd, so that it donates a rest to its opponent team. This process is carried out for all S seasons.

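The single round-robin scheme described above (fixed first slot, clockwise rotation, dummy team for odd L) can be sketched as follows; this is an illustrative Python version, not the authors' implementation:

```python
def round_robin_schedule(L):
    """Single round-robin schedule (circle method): L teams, L-1 weeks,
    L/2 matches per week, each pair meeting exactly once.
    A dummy team (0) is added when L is odd; its opponent rests that week."""
    teams = list(range(1, L + 1))
    if L % 2 == 1:
        teams.append(0)  # dummy team granting a bye
    n = len(teams)
    weeks = []
    for _ in range(n - 1):
        # Pair slot i with slot n-1-i in the current arrangement
        weeks.append([(teams[i], teams[n - 1 - i]) for i in range(n // 2)])
        # Keep the first team fixed; rotate the rest one position clockwise
        teams = [teams[0]] + [teams[-1]] + teams[1:-1]
    return weeks
```

For L = 8, the first week pairs teams 1-8, 2-7, 3-6, and 4-5, matching Figure 4, and the full schedule contains 8 × 7 / 2 = 28 distinct matches.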


Winner/Loser Determination
Based on idealized rule 1 (the higher the playing strength of a team, the higher the likelihood of its winning [33]), and assuming X_i^t and X_j^t as the formations, and f(X_i^t) and f(X_j^t) as the playing strengths of teams i and j, respectively, then we can write:

\frac{P_i^t}{P_j^t} = \frac{f(X_j^t) - \hat{f}}{f(X_i^t) - \hat{f}}

where P_j^t is the chance of team j beating its opponent at week t (P_i^t can be defined accordingly), \hat{f} denotes the playing strength of an ideal team, and f[X = (x_1, x_2, ..., x_D)] is a function of D variables to be minimized over the search space.
The above formula denotes that the likelihood of a win for team i (or j) is proportional to the difference between f(X_j^t) (or f(X_i^t)) and the strength of an ideal team. In this relationship, it is assumed that a better team complies with more factors of the ideal team. Here, the distances from a common reference point are the basis for evaluating the teams; hence, the ratio of these distances gives the winning portion of each team.
Considering idealized rule 3 (the probability that team i beats team j is assumed to be equal from the viewpoint of both teams [33]), we have:

P_i^t + P_j^t = 1

From Equations (10) and (11), P_i^t can be obtained as follows:

P_i^t = \frac{f(X_j^t) - \hat{f}}{f(X_j^t) + f(X_i^t) - 2\hat{f}}

A number (from 0 to 1) is then randomly generated and compared with P_i^t to determine the winning and losing teams. Accordingly, team i wins the game if P_i^t is greater than or equal to this number; otherwise, team j wins. More information about the LCA can be found in previous publications [33,35].
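The winner determination step can be sketched as follows, assuming a minimization problem with the ideal strength as a common reference value; the function names are illustrative:

```python
import random

def win_probability(f_i, f_j, f_ideal):
    """Probability that team i beats team j: the distances of both strengths
    from the ideal value f_ideal determine the winning portions.
    Assumes a minimization problem, so f_ideal <= f_i and f_ideal <= f_j."""
    return (f_j - f_ideal) / (f_i + f_j - 2.0 * f_ideal)

def play_match(f_i, f_j, f_ideal, rng=random):
    """Return True if team i wins: draw a uniform number in [0, 1) and
    declare team i the winner when the number does not exceed p_i."""
    p_i = win_probability(f_i, f_j, f_ideal)
    return rng.random() <= p_i
```

Note that the two probabilities sum to one (rule 3), and the team whose strength lies closer to the ideal value receives the larger winning portion.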

Data Collection and Statistical Analysis
Through an extensive field survey in a real-world project in Hai Phong city, Vietnam, namely the Vinhomes Imperia housing project, various soil data were gathered [9]. The construction area was around 79 ha. The geological situation of the site was investigated by drilling 31 boreholes. Notably, the laboratory tests, as well as the geological survey, were carried out with respect to the Vietnamese standards TCVN 9155:2012 and TCVN 9140:2012, covering the technical requirements for drilling machines and sample preservation, respectively. More details about the data provision process and the studied site are presented in ref. [9].
Eventually, the coefficient of compression is considered the target variable, affected by twelve key factors of soil as independent variables: depth of sample (DOP), percentage of sand, percentage of loam, percentage of clay, percentage of moisture content (MC), wet density (WD), dry density (DD), void ratio (VR), liquid limit (LL), plastic limit (PL), plasticity index (PI), and liquidity index (LI). Figure 5 depicts the SCC versus the mentioned factors. Moreover, descriptive statistics are available in Table 1.
Consisting of 496 samples, the whole dataset is divided into two parts by a random selection procedure. The proposed AI tools were fed with 80% of the data (i.e., 397 instances) to analyze the relationship between the mentioned parameters. Then, the generalization power was evaluated by applying them to the remaining 20% of the data (i.e., 99 instances). In other words, the input parameters of the second dataset were treated as unseen soil conditions for which the networks predicted the SCC.
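The 80/20 random split can be sketched as follows; the seed and function name are illustrative assumptions, since the paper does not report its exact selection procedure:

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=42):
    """Randomly split the dataset into training and testing subsets.
    With 496 samples and an 80/20 split this yields 397 and 99 instances
    (397 = round(0.8 * 496))."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in idx[:n_train]]
    test = [samples[i] for i in idx[n_train:]]
    return train, test
```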

Results and Discussion
This paper evaluates the application of league championship optimization for hybridizing ANFIS in order to enhance its prediction capability. The results are presented in this part of the study. First, the optimization procedure is explained; second, the accuracy enhancement is evaluated by comparing the results of the improved and typical ANFIS.

Hybridizing ANFIS Using the LCA Technique
Utilizing MATLAB 2014 (MathWorks, Natick, MA, USA), the LCA algorithm was coupled with ANFIS. Here, the main role of the metaheuristic algorithm is to adjust the parameters of the MFs [36]. This process is depicted in Figure 6. The FIS is reconstructed each time using the new parameters. Note that a Gaussian membership function was used for the ANFIS.
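One plausible way to wire the two components together is to let each LCA "formation" be a flat vector of Gaussian MF parameters, decoded and scored by the training RMSE. Everything below (the encoding, the `evaluate_anfis` placeholder, and the two-MFs-per-input choice) is a hypothetical sketch, since the paper only states that the coupling was implemented in MATLAB:

```python
import numpy as np

def decode_mf_params(vector, n_inputs, n_mfs):
    """Reshape a flat LCA solution vector into (center, sigma) pairs:
    n_mfs Gaussian membership functions per input variable."""
    return np.asarray(vector, float).reshape(n_inputs, n_mfs, 2)

def objective(vector, evaluate_anfis, X_train, y_train, n_inputs=12, n_mfs=2):
    """Playing strength of a candidate formation: the training RMSE of an
    ANFIS rebuilt with the decoded Gaussian MF parameters (to be minimized).
    `evaluate_anfis` is a placeholder for the model-evaluation routine."""
    params = decode_mf_params(vector, n_inputs, n_mfs)
    y_pred = evaluate_anfis(params, X_train)
    err = np.asarray(y_train, float) - np.asarray(y_pred, float)
    return float(np.sqrt(np.mean(err ** 2)))
```

With twelve inputs and two MFs each, a formation is a vector of 12 × 2 × 2 = 48 real values, and the LCA search reduces to minimizing `objective` over that space.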

Like other optimization techniques, a number of LCA parameters need to be tuned to make sure the best ensemble is being used [37]. Note that one thousand iterations were set for each proposed network. Considering the population size, probability of success (POS), and type of formation (TOF) as the variables, three steps were carried out to achieve the most suitable architecture. Also, the RMSE was defined as the objective function, measuring the performance error in each iteration.
According to Figure 7a, nine different population sizes of 10, 25, 50, 75, 100, 200, 300, 400, and 500 were tested, where the POS and TOF were set to 0.99 and 2, respectively. As is seen, the majority of the error reduction occurred in the first 400 iterations. Finally, the best performance (i.e., the lowest RMSE = 0.012741942) was obtained for the LCA-ANFIS with a population size of 200.
Next, five POSs of 0.2, 0.4, 0.6, 0.8, and 0.99 were tested, where the population size and TOF were 200 and 2, respectively. The convergence curves are presented in Figure 7b. The behavior of the models was similar to Figure 7a, and the lowest RMSE was equal to the previous value, as the best structure was created by POS = 0.99.
Lastly, the TOF is another influential factor in the LCA algorithm, reflecting the basis of the new formation. In this sense, TOF values of 1 and 2 indicate that the new formation is based on the previous week's events and on the best formations, respectively. As Figure 7c shows, the ensemble performs more efficiently when TOF = 2.

Performance Assessment
As explained above, the effect of the applied metaheuristic algorithm is represented by the changes in the results of the typical ANFIS when it is coupled with the LCA. In this regard, the performance error is measured by the RMSE and MAE criteria, and the correlation between the observed and modeled SCC is calculated by the R^2 index. As is known, the quality of the training results represents the capability of the model in discerning the relationship between the SCC and the influential soil factors, while the testing results indicate the generalization potential of the model, i.e., predicting the SCC for unseen soil conditions.
Figure 8 demonstrates the results for the training data. The observed values of the SCC vary from 0.0090 to 0.1820, and the products of ANFIS and LCA-ANFIS range in [0.0057, 0.1123] and [0.0109, 0.1105], respectively. This shows that both predictive models have grasped an acceptable pattern of the SCC. In this phase, the RMSE of ANFIS was reduced by 2.31% (i.e., from 0.0130 to 0.0127) as the impact of incorporating the LCA technique. This improvement indicates that the MF parameters suggested by the LCA performed more promisingly than the regular ANFIS learning method. This claim is also supported by the respective MAEs of 0.0094 and 0.0091 calculated for ANFIS and LCA-ANFIS.
The results for the testing data are presented in Figure 9. In addition to the graphical comparisons of the targets and outputs, the error (= target − output) calculated for each sample is depicted, along with a histogram showing the frequency of each error value. The observed testing SCCs vary from 0.0100 to 0.1060, and the products of ANFIS and LCA-ANFIS range in [0.0045, 0.1066] and [0.0107, 0.1045], respectively.
Referring to the obtained RMSEs of 0.0123 and 0.0117, as well as the MAEs of 0.0097 and 0.0091 (respectively for the ANFIS and LCA-ANFIS models), it can be seen that applying the LCA resulted in a 4.88% decrease in the RMSE and, more considerably, a 6.19% reduction in the MAE criterion. Moreover, standard errors of 0.0123 and 0.0117 indicate a higher consistency of the hybrid ensemble prediction. This increase in the prediction accuracy demonstrates that the LCA technique enables ANFIS to gain a better generalization capability for unseen conditions of the problem. Furthermore, the consistency of both the training and testing results is graphically evaluated by the correlation charts presented in Figure 10. As is seen, the coefficient of determination rose from 0.7224 to 0.7342 in the training phase, and from 0.7351 to 0.7539 in the testing phase. Therefore, in agreement with the RMSE and MAE, the SCCs estimated by the LCA-ANFIS were better correlated with the actual values in both phases.
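The reported improvements can be reproduced directly from the error values given above:

```python
def percent_improvement(baseline, improved):
    """Relative error reduction of the hybrid model over the baseline, in percent."""
    return 100.0 * (baseline - improved) / baseline

# Testing-phase errors reported for ANFIS vs. LCA-ANFIS
rmse_gain = percent_improvement(0.0123, 0.0117)  # ~4.88 %
mae_gain = percent_improvement(0.0097, 0.0091)   # ~6.19 %
```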
Another appreciable point is that the performance of both models on the testing data was more accurate than on the training data, as can also be derived from the calculated values of the RMSE and MAE. A possible reason is the wider range of data in the training set (the difference between the maximum and minimum values of the SCC is 0.173 and 0.096, respectively, for the training and testing data); as Figure 10a,c illustrate, the two maximum values of the training dataset lie well beyond the range covered by the testing data.
Also, Figure 11 displays four examples of scatter-based comparison between the predicted and actual SCCs belonging to the whole dataset. Each panel plots the SCC (on the y-axis) versus the LI (on the x-axis) for a constant value of another influential parameter. In Figure 11a, the data are obtained from a borehole with a depth of 5.8 m. Likewise, in Figure 11b the percentage of loam is 46.1, and Figure 11c addresses soil data whose percentage of clay is 26.5. Last but not least, a wet density of 1.68 g/cm^3 is the common variable for the data shown in Figure 11d.
Among different existing intelligent models, ANFIS is considered a leading predictive tool that simultaneously enjoys the learning capability of ANNs and the expert knowledge of FISs [38]. Due to the complexity and non-linearity of engineering issues like the SCC estimation problem, utilizing ANFIS for this aim is a logical choice. Similarly, this model has been successfully used for analyzing the relationships of natural phenomena with complicated conditions (e.g., landslide [39] and flood [40]). The findings of the current paper are in good agreement with these studies, given the high capability of ANFIS in inferring the non-linear relationship between the compression coefficient and the related soil factors for a real-world project. The SCC estimation investigated in this study is classified as a high-dimensional problem due to the presence of several effective factors. This is more pronounced since this parameter is not in regular proportion with some effective factors, such as the clay content and plasticity index (Figure 5).
On the other hand, many studies (e.g., [41]) have shown that conducting feature validity analysis is a reasonable way to reduce the complexity of such problems. Thus, we believe that optimizing the input configuration could help achieve a more accurate prediction of the SCC. This idea, along with comparing the LCA with other metaheuristic algorithms, is a potent suggestion for future studies in this field.

Conclusions
This study outlined a new application of league championship optimization, namely optimizing ANFIS to overcome its computational weaknesses. The studied subject was predicting the soil compression coefficient by taking several soil parameters into consideration. The LCA was coupled with ANFIS, and the optimization procedure revealed that the best values for the population size and probability of success were 200 and 0.99, respectively. Besides, the elite formation was found to be a more suitable basis for new formations than the events of the previous week. The increase of R^2, as well as the reduction of the RMSE and MAE, showed the performance improvement of ANFIS as a result of the LCA. In line with many previous studies that have applied metaheuristic optimization to predictive tools like ANFIS, it was concluded that the LCA is an efficient optimization technique in the field of SCC analysis. Consequently, along with ANFIS, it can be used as a robust predictive model for exploring the relationship between the SCC and soil parameters.