Predicting California Bearing Ratio of Lateritic Soils Using Hybrid Machine Learning Technique

The increase in population has driven demand for better, more cost-effective vehicular services, which warrants good roadways. The sub-base, which serves as a stress-transmitting medium and distributes vehicle weight to resist shear and radial deformation, is a critical component of pavement structures. Developing novel techniques that can assess the geotechnical characteristics and performance of the sub-base soil is an urgent need. Laterite soil, abundantly available in the West Godavari area of India, was employed for this research. Road and highway construction accounts for a large share of geotechnical investigation, particularly the determination of the California bearing ratio (CBR) of subgrade soils. Therefore, there is a need for an intelligent tool to predict or analyze the CBR value without time-consuming and cumbersome laboratory tests. An integrated extreme learning machine-cooperation search optimizer (ELM-CSO) approach is used herein to predict the CBR values. The correlation coefficient is utilized as the cost function of the CSO to identify the optimal activation weights of the ELM. The statistical measures are considered separately, and the best solutions are reported in this article. Comparisons are provided with the standard ELM to show the superiority of the proposed integrated approach in predicting the CBR values. Further, the impact of each input variable is studied separately, and reduced models are proposed for limited and inadequate input data without loss of prediction accuracy. When 70% training and 30% testing data are applied, the ELM-CSO outperforms the CSO with Pearson correlation coefficient (R), coefficient of determination (R²), and root mean square error (RMSE) values of 0.98, 0.97, and 0.84, respectively. Therefore, based on the prediction findings, the newly built ELM-CSO can be considered an alternative method for predicting real-time engineering issues, including lateritic soil properties.


Introduction
Transportation facilities are key requirements for national development, which includes industrialization, rural/urban development, and socio-economic development [1]. Nowadays, in developing nations, highway and transportation schemes are gaining importance as a means of improving living standards and comfort. In India, from 2014 to 2022, the national highway network grew by 55% (from 91,287 km to 141,190 km). In most highway projects, one of the most challenging tasks is providing quality subgrade soil. In this context, many projects depend on locally available soils or treated soils [2][3][4].
Lateritic soils and expansive clays are among the most common soils found globally. Road projects on expansive clays often require a strong subbase material because the subgrade soil contains the mineral montmorillonite, which exhibits high swelling and shrinkage and makes the soil unsuitable for highway or building projects [5,6]. Under wet conditions, such soils can swell and lose their strength. To counter the volume change behavior of black cotton soils, many modification techniques are in practice, such as mechanical stabilization and chemical stabilization [7][8][9]. These techniques enhance the index and engineering properties of clays through the chemical reaction between the clays and chemical additives and through the gradation improvement obtained by adding non-cohesive granular materials [7,8]. Expansive clays blended with additives such as cement, lime, CaCl₂, fly ash, rice husk ash, and other pozzolanic additives show improved grain size and plasticity behavior at certain dosages [10][11][12]. The chemically altered blends exhibit larger particle sizes than the montmorillonite clay, probably conforming to kaolinite and illite sizes. An interesting observation was that montmorillonite could behave like kaolinite and illite.
Lateritic soils, on the other hand, are created by the in-situ weathering and disintegration of rocks in tropical and subtropical locations with hot and humid climates. They are extensively weathered and altered residual soils. These soil formations are frequently used as building materials for civil engineering projects. The characteristics of lateritic residual soils vary from location to location owing to variations in the prevalent geological settings, climatic circumstances, and mineral types. In India, most highway projects rely on lateritic soils despite the large variation in their geotechnical characteristics.
The California bearing ratio (CBR) test plays a vital role in the assessment of soil subgrades and the design of pavements [13]. However, determining the CBR values of subgrade soil samples is time-consuming, especially for large-scale highway projects [14]. Moreover, the CBR of subgrades is affected by many factors, such as grain size distribution, compaction effort, moisture content, and plasticity characteristics of the soils [15][16][17]. In this context, many authors have developed correlations between the CBR values and the gradation distribution and plasticity characteristics [18][19][20]. In the case of chemically altered clays, the volume change of the clays is markedly reduced, which can lead to denser phases of the blends and improved densities [21,22]. The densities and moisture content of the blended clays are described as the most important parameters for evaluating the CBR value [12]. Sharma and Sivapullaiah [23] carried out an experimental investigation to evaluate the CBR values with varying densities and moisture content and to describe the relationships between the compaction characteristics and the CBR. The correlation between the CBR values and the densities shows a significant correlation coefficient of 0.879. Vinod et al. [24] reported that compaction effort affects the CBR value of soils, and that the correlations developed between the CBR value and the energy ratio were dependent on the compaction energy and marginally dependent on the soil type. Wetting and drying of the soils also influence the CBR value; the rate of change of the CBR value on the dry side of optimum (before the optimum moisture content (OMC)) is 3 to 7 times that on the wet side of optimum [25,26].
In recent years, correlations made by traditional approaches such as statistical correlation and regression analysis have been surpassed by emerging artificial intelligence (AI) techniques [27,28]. The main asset of the AI techniques is their ability to learn from datasets without prior assumptions, which improves the accuracy of the estimation model. For predicting the CBR values of soils, a few AI techniques, including both supervised and unsupervised learning techniques, are emerging to reduce rigorous and time-consuming testing. Notably, researchers have employed artificial neural networks (ANN) to predict the CBR value of soils [15][16][17]29,30]. The CBR value of soils was successfully predicted using the ANN, and the sensitivity analysis of the input variables revealed that the dry density was the most influential parameter in the prediction model [15]. Backpropagation neural networks (BP-ANN) are a hybrid tool that efficiently predicts the CBR values of chemically treated soils [17]. In the case of blended pond ash clays, the curing period of the blended clays is the most influential parameter on the CBR value predicted using the ANN [29]. Meanwhile, the genetic algorithm, support vector machine, and particle swarm optimization (PSO) algorithm have the generalization capability and rational structure needed to solve non-linear problems with convergent results [31][32][33]. The genetic algorithm successfully predicted the CBR value and the amount of additive (fly ash) required to attain a fixed CBR value [34].
In previous studies in which the CBR was predicted from the index properties of soils, different parameters were considered [32,33]. The parameters influencing the CBR and the corresponding prediction equations are provided in Table 1. These parameters include grain size proportions, liquid limit (LL), plastic limit (PL), OMC, and maximum dry density (MDD), all of which were shown to affect the penetration resistance of the subgrade soil. The AI tools described above, including genetic programming (GP), PSO, radial basis network (RBN), and ANN, demonstrated better performance, with R² values ranging from 0.842 to 0.918 depending on the nature of the dataset. The CBR has been estimated in numerous earlier research works utilizing soil factors such as OMC, MDD, LL, PL, plasticity index (PI), gravel (G), sand (S), coarse sand (CS), fine sand (FS), fines (F), specific gravity, lime sludge content (LS), and lime content (L). The three groups of variables most frequently employed as inputs for predicting the CBR value of soils are the grain size distribution, plasticity characteristics, and compaction characteristics, as summarized in Table 1. In this study, the CBR of soils is predicted using the ELM with a reliable field dataset that includes information about the gradation distribution, plasticity characteristics, and compaction characteristics of the soils. By integrating machine learning and optimization, an enhanced tool can be developed to obtain acceptable prediction results compared with earlier methods. Therefore, an integrated extreme learning machine-cooperation search optimizer (ELM-CSO) model is proposed in this research to predict the CBR value of subgrade soils. This article also compares the efficacy of the proposed model with that of the standard ELM. The CSO algorithm is used during the training of the ELM to find its optimal parameters for estimating the CBR from known input variables.
This integrated method can also be adapted to predict the CBR when some input data are missing. The comparisons with the ELM in terms of R, R², and RMSE values illustrate the improvements of the proposed scheme for prediction studies. Further, the optimal parameters obtained at a specific training rate, activation function, and other selected parameters of the ELM also produce significant improvements in the estimation of the CBR under other choices of these settings.

Integrated ELM-CSO Model for CBR Prediction
The CBR values depend on the gradation characteristics, plasticity characteristics, and compaction characteristics, but no theoretical function is available to calculate the CBR from these parameters. Therefore, an empirical expression is required to estimate the CBR value from the independent variables. Machine learning models are suitable in such conditions for predicting the values of the output variables from the raw input data. However, the number of hidden layer neurons and the initial weights of the input and output activation links are randomly generated, which may not provide acceptable results in all trials. Optimal values of these weights enhance the accuracy of the predicted outputs and reduce the errors between the actual and estimated CBR values. For this purpose, an integrated ELM-CSO model is proposed in this study to predict the CBR values with better statistical measures. The weights of the ELM model are tuned with the CSO algorithm. This section presents an overview of the ELM, the CSO, and the integrated ELM-CSO approach.

ELM
ELMs are a special category of neural network with a feedforward architecture. The learning mechanism of the ELM is different from that of the ANN. The input weights and hidden layer biases of the ELM are randomly generated in the initial epoch of the training process, similar to other standard ANN models. To find the weights of the output layer, the least squares method is employed. The architecture of the ELM is displayed in Figure 1 with input, hidden, and output layers.
For the data with the input and output information $(x_j, y_j)$, $j = 1, 2, \dots, N$, the mathematical model of the ELM is presented by the following:
$$\sum_{i=1}^{M} \beta_i \, f\left(w_i \cdot x_j + b_i\right) = y_j, \quad j = 1, 2, \dots, N \tag{1}$$
In Equation (1), the number of arbitrary distinct data points is represented by $N$ and the number of nodes in the hidden layer is designated by $M$. The inputs ($x_j$) are multiplied by the randomly generated weights $w_i$ and added to the bias values $b_i$ to provide the input to the hidden layer. This input is then processed through an activation function $f(x)$, which is an important feature of the ELM. These functional values are multiplied by the output layer weights ($\beta_i$).
Based on the number of neurons in the input, hidden, and output layers, Equation (1) can also be expressed in compact matrix form as the following:
$$H \beta = Y \tag{2}$$
Equation (3) solves Equation (2) for the values of $\beta$ using the least squares method and can be written as the following:
$$\beta = H^{\dagger} Y \tag{3}$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$. The detailed version of the ELM is well documented by Huang et al. [38] and Huang et al. [39].
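The training procedure described above (random input weights and biases, followed by a least squares solution for the output weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `elm_fit` and `elm_predict`, the uniform range of the random weights, the seeds, and the toy data are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def elm_fit(X, y, n_hidden=7, activation=np.tanh, seed=0):
    """Fit an ELM: random hidden layer, least squares output weights."""
    rng = np.random.default_rng(seed)
    # Input weights and biases are drawn at random and never trained
    # (the range [-1, 1] is an assumption).
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = activation(X @ W + b)          # hidden-layer output matrix (N x M)
    beta = np.linalg.pinv(H) @ y       # Moore-Penrose least squares solution
    return W, b, beta

def elm_predict(X, W, b, beta, activation=np.tanh):
    """Evaluate the fitted ELM on new inputs."""
    return activation(X @ W + b) @ beta

# Toy data standing in for seven soil inputs and one target (not real CBR data).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(50, 7))
y = X @ np.arange(1.0, 8.0)

W, b, beta = elm_fit(X, y)
y_hat = elm_predict(X, W, b, beta)
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

Because the output weights come from a least squares solution, the training error can never exceed the error of the trivial choice of all-zero output weights; how small it actually becomes depends on the random hidden layer, which is the motivation for tuning the input weights with an optimizer.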

CSO
CSO is a new population-based search optimization algorithm for finding the optimal solutions of complex engineering problems. Optimization algorithms are essential to find the solutions corresponding to the minimum or maximum values of the objective functions derived from engineering systems. The CSO is based on team cooperation behaviors in modern enterprises and uses communication, learning, and competition strategies [39]. A set of staff is called an enterprise, and each staff member represents one feasible solution of the optimization problem. In the initialization of the CSO, the solutions are randomly generated within the limits of the decision variables. The CSO creates multiple solutions in the search space with Equation (4):
$$x_{i,j}^{k} = \phi\left(\underline{x}_j, \overline{x}_j\right), \quad i \in [1, I], \; j \in [1, J], \; k = 1 \tag{4}$$
In Equation (4), $k$ equals 1 for the initial iteration, and $\phi(\underline{x}_j, \overline{x}_j)$ denotes a random number uniformly distributed between the lower and upper limits of the $j$th decision variable. Each swarm exhibits $I$ solutions indexed by $i$, $j$ indexes the decision variables, and $M \in [1, I]$ is the number of leading solutions retained. After calculating the fitness values of each staff member utilizing the solutions provided in Equation (4), information sharing takes place between the staff knowledge ($x_{i,j}^{k}$), the chairman's knowledge ($A_{i,j}^{k}$), the collective knowledge of the board of directors ($B_{i,j}^{k}$), and the collective knowledge of the board of supervisors ($C_{i,j}^{k}$). The team solution ($u_{i,j}^{k+1}$) is calculated with all member knowledge and is given by the following:
$$u_{i,j}^{k+1} = x_{i,j}^{k} + A_{i,j}^{k} + B_{i,j}^{k} + C_{i,j}^{k} \tag{5}$$
In Equation (5), the individual components are expressed by the following:
$$A_{i,j}^{k} = \log\left(\frac{1}{\phi(0,1)}\right) \left(gbest_{ind,j}^{k} - x_{i,j}^{k}\right) \tag{6}$$
$$B_{i,j}^{k} = \alpha \, \phi(0,1) \left[\frac{1}{M} \sum_{m=1}^{M} gbest_{m,j}^{k} - x_{i,j}^{k}\right] \tag{7}$$
$$C_{i,j}^{k} = \beta \, \phi(0,1) \left[\frac{1}{I} \sum_{l=1}^{I} pbest_{l,j}^{k} - x_{i,j}^{k}\right] \tag{8}$$
The knowledge matrices defined from Equation (6) to Equation (8) consist of $gbest$, the best solutions found so far by the team (with $ind$ a randomly selected index among them), and $pbest$, which is the personal staff best, while $\alpha$ and $\beta$ are the learning coefficients of the board of directors and supervisors, respectively. Apart from the team solution $u_{i,j}^{k+1}$, the individually updated staff solution is $v_{i,j}^{k+1}$. This solution is obtained by having each staff member reflect its own experience in the opposite direction.
Using the team and individual solutions ($u_{i,j}^{k+1}$ and $v_{i,j}^{k+1}$), the next iteration solution is as follows:
$$x_{i,j}^{k+1} = \begin{cases} u_{i,j}^{k+1}, & F\left(u_{i}^{k+1}\right) \le F\left(v_{i}^{k+1}\right) \\ v_{i,j}^{k+1}, & \text{otherwise} \end{cases} \tag{9}$$
where $F(\cdot)$ is the fitness value for a minimization problem. If any solution in the group exceeds its upper or lower limit, the staff position is restricted to the corresponding extreme value of the decision variable. The new staff solutions are then updated iteratively using Equation (5) to Equation (9). The best solution at the end of the final iteration is considered the optimal solution of the problem [40].
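The initialization, knowledge-sharing, and selection steps described above can be sketched as a short loop. This is a simplified illustration rather than the published algorithm or the authors' code: the function name `cso_minimize`, the default population size, the number of leaders, and the coefficient values are assumptions, and the individual (reflective) solution is reduced here to mirroring the team solution about the midpoint of each variable's range.

```python
import math
import random

def cso_minimize(f, lower, upper, n_staff=20, n_leaders=3,
                 alpha=0.10, beta=0.15, n_iter=100, seed=0):
    """Simplified cooperation-search loop minimizing f within box limits."""
    rng = random.Random(seed)
    dim = len(lower)

    # Initialization: random staff positions within the variable limits.
    x = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
         for _ in range(n_staff)]
    pbest = [s[:] for s in x]          # personal best of each staff member
    fp = [f(s) for s in x]             # fitness of each personal best

    for _ in range(n_iter):
        # Board of directors: the n_leaders best personal bests so far.
        order = sorted(range(n_staff), key=lambda i: fp[i])[:n_leaders]
        gbest = [pbest[i] for i in order]
        for i in range(n_staff):
            chairman = gbest[rng.randrange(n_leaders)]
            u = []
            for j in range(dim):
                r = 1.0 - rng.random()  # in (0, 1], so log(1/r) is finite
                a = math.log(1.0 / r) * (chairman[j] - x[i][j])
                b = alpha * rng.random() * (
                    sum(g[j] for g in gbest) / n_leaders - x[i][j])
                c = beta * rng.random() * (
                    sum(p[j] for p in pbest) / n_staff - x[i][j])
                # Team solution, restricted to the variable limits.
                u.append(min(max(x[i][j] + a + b + c, lower[j]), upper[j]))
            # Simplified individual solution: mirror u about the midpoint.
            v = [lower[j] + upper[j] - u[j] for j in range(dim)]
            fu, fv = f(u), f(v)
            # Competition: keep whichever candidate has the lower fitness.
            x[i], fx = (u, fu) if fu <= fv else (v, fv)
            if fx < fp[i]:
                pbest[i], fp[i] = x[i][:], fx

    best = min(range(n_staff), key=lambda i: fp[i])
    return pbest[best], fp[best]

# Usage on a simple test function (sum of squares), not on CBR data.
sphere = lambda s: sum(t * t for t in s)
best_x, best_f = cso_minimize(sphere, [-5.0] * 3, [5.0] * 3)
```

Personal bests only ever improve, so the returned fitness is never worse than the best member of the initial random population; the quality of the final solution depends on the assumed settings above.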

Study Area
India has the second-largest road system in the world, with a total length of 5.7 million kilometers. The rapid depletion of high-quality natural materials due to rising requirements for road projects is a complex issue. Additionally, the lack of natural resources and the location of mines, far from most road projects, are driving up the total project costs. The application of unconventional road construction methods is widespread worldwide. However, such projects are still used extremely infrequently in India owing to a lack of legislation and explicit guidelines, uncertainties in the outcomes, and the functioning of roads throughout their life span [41,42].
The Upland region of the West Godavari district of Andhra Pradesh has the highest concentration of lateritic soil deposits. From a geotechnical perspective, lateritic gravels provide desirable properties for roads in tropical climate zones. They are employed in the base, sub-base, or subgrade layers of roads because they are readily available. Despite being a marginal material, laterite soil is frequently utilized as a good material in pavement layers when appropriately amended to meet the necessary strength parameters. Laterite's effectiveness as a base or subbase material depends on several variables, including grading, plasticity, chemical and mineralogical composition, and the field circumstances in which it is used [2]. Although laterite has been applied successfully in highway construction, poor quality management often results in failure to achieve the design strength [2,43].
On the other hand, the delta region of Andhra Pradesh, located along the seacoast, shows abundant black cotton soil deposits (Figure 2). Moreover, owing to the good network of canals, most of the state highways and major district roads are constructed along canal embankments. In recent years, lateritic cushion layers have been utilized over expansive soil subgrades to strengthen pavement performance. Pavements constructed on expansive clay subgrades are vulnerable to distress, unevenness, and instability owing to their propensity for volumetric changes brought on by changes in moisture regime. The situation can be alleviated by adding a non-swelling soil cushion over the expansive clay subgrade, so that the pavement does not lie directly on the swelling clay. As a result, the settlement of the layers can be prevented. Furthermore, choosing suitable lateritic soils for the cushion layer can improve the stiffness and performance of the pavement layer [6,44].
Due to the aquaculture industry's rapid expansion, which is causing dynamic changes in land use and land cover in the study region, the built-up area rose from 172.9 to 179.4 km² (Figure 3) from 2020 to 2021 [45]. New highways were additionally needed to connect the newly created urban land.
This research aimed to predict the CBR of lateritic soils collected from the Upland area of Andhra Pradesh's West Godavari district. It employed a database with numerous samples of lateritic soils applied to pavement design. A total of 149 samples were gathered from the study region (Figure 2), and all of them were examined. The purpose of the study was to evaluate lateritic soil characteristics, including particle size distribution, plasticity behavior, and compaction characteristics, under ASTM criteria.
This research was structured according to a systematic approach involving three steps to establish a practical relation between the CBR and the index properties of the lateritic soils used for pavement construction in the Upland region.
Initially, samples were collected in the field, and the variation of lateritic soils in the study area was identified. Later, a laboratory investigation of lateritic soil characteristics was carried out to create a database. Finally, a machine learning approach was utilized to predict the soaked CBR value of lateritic soils.

Datasets Preparation
Two major tests used to determine the strength of the soil samples are the unconfined compressive strength (UCS) and CBR tests. However, because these two tests imply separate failure mechanisms, they cannot be substituted with one another. In this context, the UCS tests can be utilized as a criterion for the chemically altered soil strength and are helpful for the design of engineering projects. In contrast, the CBR tests can only be employed as an indication for engineers to decide the performance of stabilized pavement subgrade layers and to classify the sub-grade conditions for road projects. Additionally, it is crucial to note that the CBR value can be assessed based on samples' index properties and compaction characteristics (such as the LL, PL, gradation distribution, OMC, and MDD).
The CBR testing was performed in accordance with ASTM D1883-16. The experiments were carried out using a cylindrical mold 150 mm in diameter and 175 mm in height. For three days, the unsoaked curing condition of each sample was examined. For this purpose, the samples were compacted in five layers using a rammer weighing 2.6 kg with a drop of 310 mm. The soaked samples were kept under a surcharge weight of 2.5 kg during this time. During the saturation period, strain gauges were also attached to the samples, and the swelling of the samples was measured. The samples were soaked before being loaded into the CBR loading apparatus. The load on the cylindrical penetration plunger was determined at a penetration of 2.5 mm.

Experimental Data
The dataset contains 149 samples collected from the highway projects between Vijayawada and Tadepalligudem, India, as can be seen in Appendix A. The experimental investigation of the samples was performed in accordance with the codes mentioned in Table 2. The soil samples were collected 0.5 m below the soil surface at the investigation sites. The particle size distribution, plasticity characteristics, and compaction characteristics of the 149 samples were examined in depth. The variables derived from the collected dataset are G (%), S (%), F (%), LL, PL, OMC, MDD, and soaked CBR. Table 2 also lists the statistical data of the input variables considered for predicting the CBR value.

Simulation Results
The CSO algorithm was employed to find the input weights of the ELM that either minimize the mean absolute error (MAE) and RMSE or maximize R and R². Fitting the model with respect to any one of these metrics produced acceptable values for the rest of the metrics. Therefore, a single objective function was considered in identifying the optimal input weights. In addition, only the input weights of the ELM, rather than both the input weights and the biases, were taken as decision variables in order to reduce the dimension of the CSO problem. With the proposed integrated ELM-CSO approach, investigations were conducted on the data by varying the input variables, patterns, activation functions, and parameters of the ELM. Comparisons were provided with the standard ELM to demonstrate the merits of the proposed algorithm.
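The four statistical measures named above can be computed as in the following sketch. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's dataset; the coefficient of determination is assumed here to be defined as one minus the ratio of residual to total sum of squares, which the text does not state explicitly.

```python
import math

def regression_metrics(actual, predicted):
    """Return R, R2, MAE, and RMSE for paired actual/predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    errors = [a - p for a, p in zip(actual, predicted)]

    # Pearson correlation coefficient R.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)

    # Coefficient of determination (assumed definition: 1 - SS_res / SS_tot).
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / var_a

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    return {"R": r, "R2": r2, "MAE": mae, "RMSE": rmse}

# Hypothetical values for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(actual, predicted)
```

Since a good fit drives all four measures in the same direction (errors toward zero, R and R² toward one), optimizing any one of them tends to yield acceptable values of the others, which is consistent with the use of a single objective.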

Input Optimal Weights of ELM Using CSO Algorithm
To find the optimal weights of the input activation links, the inverse of the correlation coefficient was applied as the objective function to be minimized. The number of hidden layer neurons was seven, and since the data possessed seven inputs, the input weights matrix was 7 × 7. This resulted in 49 decision variables. If the biases were included as decision variables of the CSO, the dimension of the optimization problem would increase from 49 to 56. However, their effect on the performance of the ELM was negligible, and the dimension was therefore limited to 49. The optimal input weights matrix is given by Equation (10). The size of this matrix changes with the number of neurons in the hidden layer, and the provided optimized matrix is sufficient to obtain acceptable and improved results compared with the ELM in all the investigations carried out in this section.
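A minimal sketch of such an objective is given below, assuming NumPy is available. It reshapes a vector of 49 decision variables into the 7 × 7 input weights matrix, solves for the output weights by least squares, and returns the inverse of the correlation coefficient. The function name `inverse_r_objective`, the fixed random biases, the hyperbolic tangent activation, the penalty value for non-positive correlations, and the placeholder data are all assumptions; the text does not state on which data split the correlation was evaluated, so the sketch simply uses the data passed in.

```python
import numpy as np

N_INPUTS, N_HIDDEN = 7, 7   # seven inputs and seven hidden neurons (from the text)

def inverse_r_objective(w_flat, X, y, biases, penalty=1e6):
    """Return 1/R for an ELM whose input weights are given by w_flat."""
    W = np.asarray(w_flat, dtype=float).reshape(N_INPUTS, N_HIDDEN)
    H = np.tanh(X @ W + biases)            # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y           # least squares output weights
    r = np.corrcoef(y, H @ beta)[0, 1]     # Pearson correlation coefficient
    # Guard (an assumption): 1/R is only a sensible cost for positive R.
    if not np.isfinite(r) or r <= 0.0:
        return penalty
    return float(1.0 / r)

# Placeholder data for illustration; not the laboratory dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, N_INPUTS))
y = rng.uniform(5.0, 40.0, size=40)
biases = rng.uniform(-1.0, 1.0, size=N_HIDDEN)   # fixed, not optimized
w0 = rng.uniform(-1.0, 1.0, size=N_INPUTS * N_HIDDEN)
cost = inverse_r_objective(w0, X, y, biases)
```

Because R cannot exceed one, the cost is bounded below by one, and an optimizer that lowers the cost toward one is raising the correlation between actual and predicted values.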

Prediction Enhancement with ELM-CSO with Different Activation Functions
The improvement of the proposed integrated ELM-CSO approach was investigated using different activation functions, and comparisons are provided with the standard ELM.
The weighted sum of the inputs is transformed into an output value via the activation functions, also known as transfer functions. Different activation functions are sometimes adopted for different networks to produce improved performance. Activation or transfer functions are required for the hidden nodes in order to add nonlinearity to the network. Linear and nonlinear activation functions are the two types most frequently utilized in neural networks. The performance of the training procedure depends on the selection of the activation function. An ELM network activation function needs to possess several crucial qualities: it should be monotonically non-decreasing, continuous, and differentiable [46,47].
When utilizing the ELM-CSO learning algorithm, the activation function must be differentiable and confined within specific bounds. For this purpose, the hyperbolic tangent activation function was selected first. With this activation function, both the ELM and ELM-CSO methods were trained and tested on the CBR data, and the results are provided in Figure 4a,b. For the sigmoid activation function, comparative results between the actual and predicted CBR values are depicted in Figure 4c,d for the standard ELM and the proposed integrated approach. This function is infinitely differentiable and supports non-binary activation. Figure 4e,f illustrates the responses of the ELM and ELM-CSO methods when the activation function is a sinusoidal transfer function. Because it takes both positive and negative values within its range, the greatest asset of the sinusoidal function is its ability to learn [48].
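For reference, the three activation functions compared above can be written as follows. This is only an illustrative sketch: the dictionary name `ACTIVATIONS` and the sample inputs are assumptions, and the sigmoid is written in a numerically stable form that is not taken from the source.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# The three hidden-layer activation functions compared in this section.
ACTIVATIONS = {
    "tanh": math.tanh,     # bounded in (-1, 1)
    "sigmoid": sigmoid,    # bounded in (0, 1)
    "sine": math.sin,      # bounded in [-1, 1], non-monotonic
}

z_values = [-2.0, 0.0, 2.0]
table = {name: [fn(z) for z in z_values] for name, fn in ACTIVATIONS.items()}
```

All three functions are bounded and differentiable; the hyperbolic tangent and the sine take both signs, whereas the sigmoid is strictly positive.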
Furthermore, the percentage errors between the actual and predicted CBR values are displayed in Figure 5. The proposed integrated ELM-CSO method improved the accuracy of prediction and reduced the errors. The average errors of both the ELM and ELM-CSO are provided in Table 3 for all three activation function cases to illustrate the prediction enhancement of the proposed scheme. In all cases, the ELM-CSO approach used the optimal weights presented in Equation (10), which were obtained in the initial run. These weights nevertheless remain suitable for the other activation function cases and give acceptable results.
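A minimal sketch of how percentage errors and their average might be computed is given below. The exact error definition used for Figure 5 and Table 3 is not stated in this section, so the absolute relative error is assumed here, and the sample values are illustrative rather than taken from the study.

```python
def percentage_errors(actual, predicted):
    """Absolute percentage error of each prediction (assumed definition)."""
    return [100.0 * abs(a - p) / abs(a) for a, p in zip(actual, predicted)]

def average_error(actual, predicted):
    """Mean of the absolute percentage errors."""
    errors = percentage_errors(actual, predicted)
    return sum(errors) / len(errors)

# Illustrative CBR values only; not data from the study.
actual = [10.0, 20.0, 40.0]
predicted = [9.0, 21.0, 40.0]
errors = percentage_errors(actual, predicted)   # 10%, 5%, and 0%
mean_error = average_error(actual, predicted)   # 5%
```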

Effect of Uncertainties in Training and Testing Data
In the prediction studies, the ELM-CSO approach was investigated for different training and testing proportions, using the optimal weights identified at one specific proportion, to check the adaptivity of the algorithm. The input weights provided in Equation (10) were found by the CSO with 80% training and 20% testing data. With these weights, the ELM was tested for other training and testing proportions, and comparisons are provided with the standard ELM. First, 90% of the data was employed for training, and the corresponding predictions for the 10% test data are presented in Figure 6 (Figure 6a for the ELM and Figure 6b for the ELM-CSO). Then, 70% was utilized for training, and the corresponding predictions for the 30% test data are shown in Figure 6 (Figure 6c for the ELM and Figure 6d for the ELM-CSO). The statistical metrics indicate the capacity of the algorithm to handle uncertainty in the proportions of the training and testing data sets.
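The three partitions discussed above can be illustrated with a short sketch. The splitting procedure used in the study is not described in this section, so a seeded random shuffle is assumed, and the helper name `split_indices` is hypothetical. The sample count of 149 is the one stated for the input data matrix.

```python
import random

def split_indices(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # seeded for reproducibility
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

# Partitions considered in the text: 90/10, 80/20, and 70/30.
splits = {frac: split_indices(149, frac) for frac in (0.9, 0.8, 0.7)}
sizes = {frac: (len(train), len(test)) for frac, (train, test) in splits.items()}
```

Under these assumptions the 149 samples divide into 134/15, 119/30, and 104/45 training/testing samples, respectively, while the input weights found at the 80/20 partition are reused unchanged.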

Performance Investigation with Missing Data
The performance of the method was evaluated on data in which each of the variables G, S, F, LL, PL, OMC, and MDD was considered missing separately. In the case reported here, MDD was removed from the input data matrix, reducing its size from 7 × 149 to 6 × 149. The input weight matrix was therefore reduced to 6 × 6, and the optimization problem had 36 decision variables. The solution achieved with the CSO is given by Equation (11). The predictions of the CBR test data using Equation (11) as the input weight matrix are provided in Figure 7a. For comparison, the standard ELM was also tested on the same scenario, and the results are illustrated in Figure 7b. Table 4 compares the statistical measures, showing the additional benefits of the proposed algorithm.
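The bookkeeping behind the reduced model can be sketched as follows. The helper names are hypothetical, the data are placeholders, and the number of hidden neurons is assumed to equal the number of inputs, as in the 7 × 7 and 6 × 6 weight matrices mentioned in the text.

```python
FEATURES = ["G", "S", "F", "LL", "PL", "OMC", "MDD"]

def drop_feature(data, names, missing):
    """Remove one input variable (one row) from a features-by-samples matrix."""
    k = names.index(missing)
    return data[:k] + data[k + 1:], names[:k] + names[k + 1:]

def n_decision_variables(n_inputs, n_hidden=None, include_bias=False):
    """Count the input weights, and optionally the biases, to be optimized."""
    n_hidden = n_inputs if n_hidden is None else n_hidden
    return n_hidden * n_inputs + (n_hidden if include_bias else 0)

# Placeholder 7 x 149 input matrix (zeros, not the soil data).
data = [[0.0] * 149 for _ in FEATURES]
reduced, reduced_names = drop_feature(data, FEATURES, "MDD")

full_count = n_decision_variables(len(FEATURES))          # 7 x 7 weights
reduced_count = n_decision_variables(len(reduced_names))  # 6 x 6 weights
```

Removing a different variable only changes the `missing` argument, so the same sketch covers each single-variable case listed above.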

Comparisons
In this section, the proposed integrated ELM-CSO method is compared with the standard ELM with random input weights. The results show that the statistical metrics of the proposed approach are better than those of the standard ELM, as summarized in Tables 5 and 6. Table 5 provides the RMSE, correlation coefficient, and coefficient of determination for the different activation functions. All the measures were improved significantly by the proposed approach, which produced more accurate predictions of the CBR values. The approach is adaptive for all the types of activation functions of the ELM. Furthermore, the performance of the proposed method was not influenced by the proportion of training and testing data, which is an added advantage (Table 6). Table 6 demonstrates that the ELM-CSO outperformed the unoptimized ELM model in terms of generalization in both training and prediction accuracy. A previous study by Shariati et al. [48] reported similar trends: the performance of the ELM model can be significantly enhanced by hybridizing it with the grey wolf optimizer (GWO) for predicting concrete compressive strength. However, the amount of time needed to train the model increased considerably. Since each evaluation takes time to complete, the use of an evolutionary algorithm in the design of the ELM was primarily responsible for the increase in training time [49].
In the current study, the prediction times of both models were roughly the same. Even if the ELM-CSO model requires more time in the classification stage than the ELM model, accuracy and stability are more crucial in highlighting the prediction effect, provided that there is reasonable assurance that the result lies within the set interval. To allow a fair comparison between evolutionary algorithms, any hybrid ELM could have an architecture comparable with that of the ELM. One of the main advantages of the evolutionary algorithm is that it relies on fewer parameters. Using a hybrid ELM with PSO and GWO, researchers could predict the behavior of steel-concrete floors [49]. The benefits of the prediction models suggested in this study are more obvious, particularly the small generalization error, strong generalization ability, and high predictive accuracy, which can be extremely important in determining the CBR of lateritic soils and minimizing time-consuming testing.
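For reference, the three statistical measures compared in Tables 5 and 6 can be computed as in the sketch below. The formulas are the standard definitions and are assumed to match those used in the study, in particular the coefficient of determination taken as one minus the ratio of residual to total sum of squares. The sample values are illustrative only.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def pearson_r(actual, predicted):
    """Pearson correlation coefficient R."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Illustrative values: every prediction is offset by exactly one unit.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
scores = (rmse(actual, predicted), pearson_r(actual, predicted), r_squared(actual, predicted))
```

In this example the predictions are perfectly correlated with the targets (R = 1) but systematically biased, so the RMSE is 1 and the coefficient of determination drops to 0.2. This is why the three measures are reported together rather than relying on R alone.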

Conclusions
CBR is a crucial parameter in highway construction projects for determining how thick the pavement layers should be. Typically, subgrade soil samples are tested in laboratories under wet conditions for three days, which is both time intensive and costly. This study develops effective AI models for predicting the CBR of lateritic soils based on an experimental dataset, in place of the time-consuming task of performing actual laboratory tests. It is important to note that wet CBR estimation can eliminate the need for costly and time-consuming laboratory testing. For this purpose, experimental CBR data and fine-grained soils were acquired from an ongoing highway project in India that ran from Kovvuru to Gundugolanu (NH-216 (A)) and utilized to create an effective prediction solution.
The current article used soft computing to predict the CBR indices of lateritic soils with considerable variability. A standalone ELM and an ELM paired with the CSO (ELM-CSO) were suggested for this purpose. The predictability and performance of the proposed models were evaluated by minimizing the MAE, MSE, and RMSE and maximizing R and R².
The findings indicated that both suggested models could predict the CBR of lateritic soils, hence avoiding the need for extensive experimentation and saving time. According to the experimental findings, the ELM-CSO model had the best prediction ability, with R² = 0.996, RMSE = 0.479, and average error (%) = 2.622. These results outperform those from the ELM model by a wide margin; combining the ELM and CSO can therefore improve the performance of the ELM model and is recommended.
The primary benefits of the current models are significantly reduced computational costs and improved prediction accuracy. The prediction model created is useful for calculating the CBR of lateritic soils under wet conditions. Additionally, choosing or evaluating the CBR of lateritic soils will be simple for academics and practitioners. The model appears to be quite successful in estimating the saturated CBR from lateritic soil properties such as gradation distribution, plasticity characteristics, and compaction characteristics. Based on the results, the ELM-CSO model can be suggested as a viable option for predicting the CBR and is also helpful in assessing a suitable lateritic cushion over expansive clays.