1. Introduction
Soil stabilization techniques play an important role in geotechnical engineering to improve the engineering properties of weak or problematic soils [1]. Traditional methods, such as soil replacement or compaction, have limitations in terms of cost, implementation, and environmental impact [2,3]. As a sustainable and economical alternative, soil stabilization using supplementary materials/additives has attracted considerable attention. One of these approaches includes adding hydrated lime and rice husk ash to the soil [4].
Because of its unique properties, hydrated lime is a widely used additive in soil stabilization [4,5]. It is a fine, white powder obtained from the hydration of quicklime (calcium oxide) and consists mainly of calcium hydroxide [4,5]. When hydrated lime is added to soil, it undergoes chemical reactions with clay minerals, which lead to reduced soil compressibility, improved shear strength, and increased durability [4,5,6]. These reactions, known as cation exchange and pozzolanic reactions, lead to the formation of stable compounds that bind soil particles together [4,5,6].
Rice husk ash (RHA), a byproduct of the rice milling industry, is also promising as an additive in soil stabilization [7]. RHA is obtained by burning rice husks at high temperatures, which converts the organic matter into amorphous silica and other inorganic components [8]. When added to soil, RHA acts as a pozzolanic agent and reacts with calcium hydroxide from hydrated lime to form additional cementitious compounds [4,5]. The inclusion of RHA in the soil stabilization process further improves the strength, stiffness, and durability of the stabilized soil [9]. Utilizing rice husk ash in cement and concrete offers numerous advantages, including reduced heat of hydration, improved strength, decreased permeability at higher dosages, enhanced resistance to chloride and sulfate, cost savings through reduced cement usage, and environmental benefits by mitigating waste disposal and lowering carbon dioxide emissions [10,11,12,13,14,15,16,17]. Jafer et al. [18] focused on developing a sustainable ternary blended cementitious binder (TBCB) for soil stabilization. TBCB incorporates waste materials and improves the engineering properties of the stabilized soil. The results showed a reduced plasticity index and increased compressive strength. XRD and SEM analyses confirmed the formation of cementitious products, leading to a solid structure. TBCB offers a promising solution for soil stabilization with a reduced environmental impact [18].
The combination of hydrated lime and rice husk ash has synergistic effects in soil stabilization [19,20]. The pozzolanic reactions between these materials and the soil matrix contribute to the development of cementitious compounds, thereby increasing strength and reducing permeability [21]. In addition, the incorporation of rice husk ash contributes to the utilization of an agricultural waste product and increases sustainability in construction practices [22,23].
The success of soil stabilization using hydrated lime and rice husk ash depends on various factors such as the type and properties of the soil, the dosage and ratio of supplementary materials/additives, curing conditions, and the testing methods used [24,25]. Accurate predictive models that consider these parameters can help optimize the stabilization process and ensure the desired engineering performance of treated soils [25,26]. The accurate prediction of soil strength performance is therefore crucial for geotechnical engineering. Artificial intelligence (AI) methods, including artificial neural networks (ANN), support vector machines (SVM), adaptive neuro-fuzzy inference systems (ANFIS), convolutional neural networks (CNN), long short-term memory networks (LSTM), decision trees, and Gaussian process regression (GPR), have been applied in various geotechnical applications [27,28,29,30]. ANN is the most widely used AI technique, contributing to improved predictions and optimizations in geotechnical engineering [27]. These AI methods enhance understanding and decision making in areas such as frozen soils [31,32,33], rock mechanics [34,35,36], slope stability [37,38,39], soil dynamics [40,41,42,43,44], tunnels [45,46,47], dams [48,49,50], and unsaturated soils [51,52,53]. Onyelowe et al. [25] used ANN algorithms to predict the strength parameters of expansive soil treated with hydrated-lime activated rice husk ash. The algorithms performed well, with the Levenberg–Marquardt backpropagation (LMBP) algorithm showing the most accurate results. The predicted models had a strong correlation coefficient and a high performance index.
AI algorithms aimed at material characterization and design have faced skepticism because of concerns about the opacity and reliability of their intricate models [54]. The primary obstacle is the absence of transparency and of methods to extract knowledge from these models [54]. Mathematical modeling techniques fall into three categories that vary in their level of transparency: white-box, black-box, and grey-box [54,55]. White-box models are grounded in fundamental principles and can elucidate the underlying physical relationships within a system [54,55,56]. On the other hand, black-box models lack a clear structure, making it difficult to understand their inner workings [54,55,56]. Grey-box models fall in between, as they identify patterns in data while offering a mathematical structure for the model [54,55,56]. Artificial neural networks (ANN) are well-known examples of black-box models in engineering [57]. Although they are widely used, they provide little information about the relationships they establish. In contrast, genetic programming (GP) and classification and regression trees (CART) are newer grey-box modeling techniques that develop explicit prediction functions (through an evolutionary process) and prediction trees, respectively, making them more transparent than black-box methods such as ANN. GP and CART models provide valuable insights into system performance because their mathematical and tree structures, respectively, aid in understanding the underlying processes, and they have shown promise in terms of accuracy and efficiency across various applications. The benefits of AI-based grey-box models are as follows:
- Transparency with Structure: Grey-box models, such as genetic programming (GP) and classification and regression trees (CART), strike a balance between white-box and black-box models. They offer transparency by identifying data patterns while providing a clear mathematical or tree-based structure, making it easier to understand the model’s operations [58].
- Enhanced Insights: The structured nature of grey-box models allows for a deeper understanding of the underlying processes. Unlike black-box models such as artificial neural networks (ANN), GP and CART provide insights into system performance through their explicit prediction functions and tree structures.
- Transparency in Evolution: Techniques such as genetic programming (GP) use an evolutionary process to develop prediction functions, making the model’s development and evolution more transparent. This transparency aids in tracking the model’s progress and understanding its decision making [59].
- Accuracy and Efficiency: Grey-box models, including GP and CART, have demonstrated promise across various applications, offering a combination of accuracy and efficiency. Their transparency, coupled with the ability to capture complex relationships in data, makes them valuable tools for mathematical modeling in engineering and other fields.
Based on the literature review provided, this research is groundbreaking in its systematic application of two distinct grey-box artificial intelligence models: genetic programming (GP) and classification and regression tree (CART). These models are utilized for the first time to predict three critical parameters of expansive soil treated with recycled and activated rice husk ash composites: the California bearing ratio (CBR), the unconfined compressive strength (UCS), and the resistance value (Rvalue) determined through in-situ cone penetrometer tests. The study further evaluates the significance of the input parameters, performing a sensitivity analysis on 121 datasets, each consisting of seven inputs: hydrated-lime activated rice husk ash (HARHA), liquid limit (LL), plastic limit (PL), plasticity index (PI), optimum moisture content (wOMC), clay activity (AC), and maximum dry density (MDD). HARHA is produced by blending rice husk ash, obtained from the controlled combustion of rice husk waste, with 5% hydrated lime, which acts as the activator. Different proportions of HARHA (ranging from 0.1% to 12% by weight, in increments of 0.1%) were employed to treat the clayey soil, and the resulting effects on the soil properties were meticulously examined and documented within the study.
2. Database Processing
This study uses a database of 121 laboratory records originally compiled by Onyelowe et al. [25], who examined expansive clays by testing both untreated and treated soils and observing the effects of stabilization on the predictor parameters. Onyelowe et al. [25] conducted a series of laboratory tests to gather the data, and the database is represented as three-dimensional diagrams in Figure 1. Experiments were conducted on expansive clay soil, both untreated and treated with hydrated-lime activated rice husk ash (HARHA). The HARHA, a binder developed by blending rice husk ash with hydrated lime, was tested at varying proportions on the clayey soil.
Figure 2 depicts the effect of adding HARHA on three parameters: CBR, UCS, and resistance values. The results indicate that all three parameters increased with the percentage of additive until they reached a peak. Afterward, they showed slight decreases. An approximate value of 11.5% could be considered the optimal amount of HARHA additive.
The existing database contained seven inputs: hydrated-lime activated rice husk ash (HARHA), liquid limit (LL), plastic limit (PL), plasticity index (PI), optimum moisture content (wOMC), clay activity (AC), and maximum dry density (MDD). The plasticity index was derived by subtracting the plastic limit from the liquid limit (PI = LL − PL), while the remaining parameters were independent and lacked a direct correlation.
This database is noteworthy for being one of the largest sources of laboratory-derived UCS, CBR, and Rvalue measurements documented in the literature, with 121 entries each characterized by the seven input parameters described above.
Table 1 shows the statistical characteristics of the database utilized in this study, presenting the minimum, maximum, and mean values for both inputs and outputs. These descriptive statistics offer valuable insights into the distribution and properties of the data, providing crucial information for model selection and optimization in subsequent analyses.
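Such a summary can be reproduced with a few lines of pandas; the file and column names below are hypothetical placeholders standing in for the database of Onyelowe et al. [25], not the authors' actual files.

```python
import pandas as pd

# Hypothetical file and column names standing in for the published database.
df = pd.read_csv("harha_soil_database.csv")
columns = ["HARHA", "LL", "PL", "PI", "wOMC", "AC", "MDD", "CBR", "UCS", "Rvalue"]

# Minimum, maximum, and mean of every input and output, as reported in Table 1.
summary = df[columns].agg(["min", "max", "mean"]).T
print(summary)
```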
2.1. Outliers
Within the realm of database preparation, a pivotal concern is the detection of outliers. In a database context, an outlier denotes a data point that deviates notably from the majority of the data points [60,61]. Such points must be identified and excluded from the modeling process, as they have the potential to lead the model astray. Recognizing these data instances is a pivotal step in statistical analysis, commencing with the description of normative observations [62,63,64]. This entails an overall evaluation of the shape of the graphed data, identifying extraordinary observations that diverge significantly from the data’s central mass; these are termed outliers.
Two graphical techniques commonly used to identify outliers are scatter plots and box plots [54,65]. The latter utilizes the median, lower quartile, and upper quartile to display the data’s behavior in the middle and at the distribution’s ends. Furthermore, box plots employ fences to identify extreme values in the tails of the distribution. Points beyond an inner fence are considered mild outliers, while those beyond an outer fence are regarded as extreme outliers [54,66,67].
For this study, a dataset consisting of 121 observations was analyzed using a box plot to detect outliers. The process involved computing the median, lower quartile, upper quartile, and interquartile range, followed by calculating the lower and upper fences.
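The fence computation described above can be sketched as follows; the 1.5 and 3.0 multipliers are the values conventionally used for the inner and outer box-plot fences, since the paper does not state its multipliers explicitly.

```python
import numpy as np

def box_plot_fences(values, k_inner=1.5, k_outer=3.0):
    """Return the inner/outer fences and the mild/extreme outliers of a sample."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])          # lower and upper quartiles
    iqr = q3 - q1                                     # interquartile range
    inner = (q1 - k_inner * iqr, q3 + k_inner * iqr)  # mild-outlier fence
    outer = (q1 - k_outer * iqr, q3 + k_outer * iqr)  # extreme-outlier fence
    mild = values[(values < inner[0]) | (values > inner[1])]
    extreme = values[(values < outer[0]) | (values > outer[1])]
    return inner, outer, mild, extreme

# With the CBR quartiles reported in Figure 3 (Q1 = 14, Q3 = 34), IQR = 20,
# so the inner fence is (-16, 64) and the outer fence is (-46, 94).
```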
Figure 3 indicates that the first-quartile values of CBR, UCS, and Rvalue were 14, 141, and 17, respectively, and the third-quartile values were 34, 194, and 24, respectively. The results demonstrate a favorable distribution across the parameter range, from the minimum to the maximum values, with appropriately sized boxes. Additionally, the average of each parameter falls towards the center of its box, a positive indicator of data dispersion. According to Figure 3, no points were found to exceed the extreme values defined by the computed fences. Furthermore, Figure 4 shows histograms of the different outputs.
2.2. Testing and Training Databases
The dataset employed in this study was divided into two distinct subsets: a training database and a testing database. A random selection process assigned 80% of the data to the training database and the remaining 20% to the testing database. This split is crucial in machine learning for several reasons. The larger training set allows the model to learn patterns and generalize from the data, while the separate testing set provides an unbiased evaluation of the model’s real-world predictive capabilities, guarding against overfitting and ensuring robustness. The division also permits validation, enabling fine-tuning and parameter optimization and resulting in a more realistic assessment of the model’s practical utility, which is essential for guiding decisions on real-world deployment.
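A minimal sketch of such a random 80/20 split with scikit-learn is shown below; the file and column names are the hypothetical placeholders used earlier, and the authors' exact selection procedure and random seed are not reported, so the arguments here are illustrative.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file and column names; the data come from Onyelowe et al. [25].
df = pd.read_csv("harha_soil_database.csv")
inputs = ["HARHA", "LL", "PL", "PI", "wOMC", "AC", "MDD"]
X, y = df[inputs], df["CBR"]  # repeat for "UCS" and "Rvalue"

# Random 80/20 split; random_state is an arbitrary seed used for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```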
Table 2 and Table 3 provide a comprehensive overview of the statistical characteristics, including the minimum, maximum, and average values, of the parameters in the training and testing databases, respectively. The statistical analysis reveals that the two databases share similar characteristics, indicating that the data used for training the artificial intelligence models are representative of the data used for testing them. This similarity in statistical properties between the training and testing databases is likely to enhance the accuracy and robustness of the developed models. The findings underscore the importance of using representative and well-characterized data for the development of effective and reliable artificial intelligence models. In Table 2 and Table 3, U95 denotes the 95% confidence interval.
2.3. Linear Normalizations
In a database context, each input or output variable is linked to specific units of measurement. To mitigate the impact of units and improve the efficiency of artificial intelligence training, a common approach is data normalization, which rescales the data to a standardized range, often between zero and one. The normalization is achieved through the linear transformation function below:
Xnorm = (X − Xmin)/(Xmax − Xmin)(1)
The four terms in this equation, Xmax, Xmin, X, and Xnorm, correspond to the maximum, minimum, actual, and normalized values, respectively.
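As a minimal illustration of this rescaling, the sketch below applies scikit-learn's MinMaxScaler to a few hypothetical liquid-limit values; the numbers are purely illustrative and are not taken from the database.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical liquid-limit values (%), purely for illustration.
LL = np.array([[38.0], [45.0], [52.0], [60.0]])

# MinMaxScaler implements Equation (1): (X - Xmin) / (Xmax - Xmin).
scaler = MinMaxScaler()
LL_norm = scaler.fit_transform(LL)
print(LL_norm.ravel())  # [0.0, 0.3181..., 0.6363..., 1.0]
```

In practice, the scaler would be fitted on the training database only and then applied unchanged to the testing database, so that no information from the test set leaks into training.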
4. Results
In the development of artificial intelligence (AI) systems, the evaluation of various models is of paramount importance. This assessment process relies heavily on statistical parameters, which are instrumental in gauging the performance of AI models. The essential parameters are outlined by Equations (2)–(7), encompassing the mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), mean squared logarithmic error (MSLE), root mean squared logarithmic error (RMSLE), and coefficient of determination (R2). Leveraging these metrics, both researchers and developers can aptly gauge and compare the efficacy of distinct AI models [54].
where N, Xm, and Xp are the number of datasets, actual values, and predicted values, respectively. In addition, X̄m and X̄p are the averages of the actual and predicted values, respectively. For the best model, R2 should equal 1, and MAE, MSE, RMSE, MSLE, and RMSLE should equal 0.
4.1. Classification and Regression Tree (CART) Results
In this study, the development of the CART model was carried out using the MATLAB 2020 software package. The validity of a CART model hinges on selecting suitable distance ranges and maximum tree depth. Several strategies were tested through trial and error to determine the optimal values for these key parameters. Typically, setting a high value for the maximum tree depth may lead to an excessively complex model. Conversely, opting for a small value for the tree depth may result in the removal of certain input parameters, as the algorithm strives to minimize prediction errors. By iterating through trial and error, the most favorable CART model with well-tuned key parameters can be identified.
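The CART models here were built in MATLAB; as an indication of the trial-and-error tuning described above, a sketch using scikit-learn's DecisionTreeRegressor is shown below. The depth grid and variable names are illustrative, and X_train, X_test, y_train, y_test refer to the split sketched in Section 2.2.

```python
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Trial-and-error sweep over the maximum tree depth: a very deep tree becomes
# overly complex, while a very shallow one can effectively drop input parameters.
best_depth, best_rmse = None, float("inf")
for depth in range(2, 11):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, tree.predict(X_test)) ** 0.5
    if rmse < best_rmse:
        best_depth, best_rmse, best_tree = depth, rmse, tree

print(f"selected max_depth = {best_depth}, testing RMSE = {best_rmse:.3f}")
```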
Figure 7, Figure 8 and Figure 9 present the performance of the optimal CART model, showcasing the predicted values versus the actual values obtained from experiments for the CBR, UCS, and Rvalue testing, respectively. The results indicate that the CART method demonstrated satisfactory predictive capabilities in accurately determining the CBR, UCS, and Rvalue parameters.
Table 4, Table 5 and Table 6 present a detailed evaluation of the best CART model’s overall performance in predicting the CBR, UCS, and Rvalue parameters. These tables aim to provide comprehensive insights into the accuracy and generalization capabilities of the model on both the training and testing databases.
Table 4 focuses on the CART model’s performance in predicting the CBR parameter. The metrics utilized for the evaluation include MAE, MSE, RMSE, MSLE, RMSLE, and R2. The model demonstrates excellent accuracy, with low MAE (0.976 for training and 1.141 for testing) and RMSE (1.195 for training and 1.363 for testing) values. The minimal MSLE and RMSLE values (approximately 0.004) indicate that the model’s predictions were closely aligned with the actual CBR values. Additionally, the high R2 values (0.990 for training and 0.982 for testing) suggest that the model captures a substantial portion of variance in the data, leading to reliable predictions of the CBR parameter.
Table 5 evaluates the CART model’s performance in predicting the UCS. The model performed admirably, with reasonably low MAE (3.262 for training and 3.742 for testing) and RMSE (3.868 for training and 4.300 for testing) values, showcasing its accuracy in predicting the UCS. The MSLE and RMSLE values (both approximately 0.001) further reinforce the strong correlation between predicted and actual values. The high R2 values (0.985 for training and 0.980 for testing) indicate that the model captured a significant portion of variation in the data, resulting in reliable UCS predictions.
Table 6 highlights the CART model’s performance in predicting the Rvalue parameter. Once again, the model delivered impressive results, as indicated by the low MAE (0.539 for training and 0.548 for testing) and RMSE (0.672 for training and 0.663 for testing) values. The MSLE and RMSLE values (both approximately 0.001) reinforce the close correspondence between the predicted and actual values. Moreover, the high R2 values (0.975 for training and 0.983 for testing) demonstrate a strong correlation between the model’s predictions and the actual resistance values.
The consistency in the model’s performance between the training and testing databases across all three tables indicates its ability to generalize well to unseen data. Generalization is a critical aspect of machine learning models, as it ensures that the model can make accurate predictions on new data, not just the data it was trained on. The fact that the model maintains its accuracy on unseen data suggests that it successfully learned meaningful patterns and relationships from the training database without overfitting. Furthermore, the MSLE and RMSLE values close to zero in all three tables indicate that the model’s predictions were robust and stable, even for cases where the actual values exhibited significant fluctuations. This property is particularly useful in engineering, where data may vary widely, and the logarithmic metrics provide a more balanced evaluation of the model’s performance. Moreover, the low MAE, MSE, and RMSE values across all three tables indicate that the CART model minimized the discrepancy between predicted and actual values, meaning its predictions were relatively close to the true values. Small differences between predicted and actual values are crucial in engineering applications, where precise estimations are vital for design, construction, and safety considerations.
Figure 10, Figure 11 and Figure 12 illustrate the optimal CART models for predicting the CBR, UCS, and Rvalue, respectively. Each model consists of 15 nodes, with the LL, PL, and PI serving as the respective root nodes. The number of nodes is an outcome of the modeling process rather than a preset value and can vary with the initial tree depth: a higher initial tree depth can produce more nodes, leading to increased model complexity and longer processing times. In this study, each optimal model consisted of 15 nodes.
To obtain the predicted number using any of these models, one must follow the branch rule from node 1 to the end of the tree. These CART models offer valuable tools for engineers and researchers to make accurate predictions based on specific input conditions, facilitating informed decision making in various engineering applications. Further analysis and examination of these models can provide deeper insights into their performance and effectiveness in real-world scenarios.
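To trace a prediction from node 1 down to a leaf programmatically, the branch rules of a fitted tree can be printed; the sketch below continues the hedged scikit-learn example given earlier (the paper's MATLAB trees themselves are not reproduced here).

```python
from sklearn.tree import export_text

# Print the if/else branch rules of the tuned tree so that a prediction can be
# followed by hand from the root node down to a leaf.
rules = export_text(best_tree, feature_names=["HARHA", "LL", "PL", "PI", "wOMC", "AC", "MDD"])
print(rules)
```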
4.2. Genetic Programming (GP) Results
In this study, the second grey-box model under investigation was genetic programming (GP). To attain the highest performance, several iterations of this model were tested, and the most optimal configuration was identified.
Table 7 presents a summary of the crucial parameters used in the GP model, encompassing various properties that were adjusted to achieve the best possible predictive performance for the target variable.
The parameters listed in Table 7 were derived from a series of deliberate attempts to refine the model, with the goal of finding the combination of values that produces the best balance of accuracy and performance. This iterative process involves making adjustments, experimenting with different settings, and analyzing the results to reach the most effective configuration. The table serves as a summary of these endeavors, showcasing the parameter values that have been identified as the most optimal based on the criteria of prediction accuracy and overall model quality.
Table 7 provides a detailed summary of the properties of the optimum GP model utilized to predict the CBR, UCS, and Rvalue parameters. Each parameter has its unique configuration in the GP model, which plays a crucial role in determining its predictive performance. The table presents key properties such as the population size, probabilities of GP operations (crossover, mutation, and reproduction), tree structure level, random constants, selection method, tour size, maximum initial and operation tree depth, range for initial values, count of node replacements during crossover, and brood size. By meticulously adjusting these properties for each parameter, the authors aimed to optimize the GP model’s accuracy in providing reliable CBR, UCS, and Rvalue predictions.
For each parameter, the GP model followed the HalfHalf method for generating individuals, leading to linear trees with a depth of 1. The selection method utilized was ‘rank selection’ with a tour size of 2, and the probabilities for the GP operations were set to 0.99 for crossover, 0.99 for mutation, and 0.2 for reproduction. The GP model incorporated two random constants, and the maximum initial and operation tree depth was set to 7 for the CBR, 5 for the Rvalue, and 6 for the UCS. The range for initial values varied from 0 to 1, and during crossover, 3 nodes were replaced. Furthermore, the brood size was set to 7 for the CBR, 5 for the Rvalue, and 6 for the UCS. The properties listed in Table 7 offer crucial insights into the GP model’s tailored configuration for predicting the CBR, UCS, and Rvalue.
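For readers who want to experiment with a comparable grey-box GP model in code, a sketch using the gplearn library is given below. gplearn cannot mirror Table 7 exactly (its operator probabilities must sum to at most one, and it has no brood-size or reproduction-probability setting), so the configuration is illustrative rather than the authors' actual setup; X_train and y_train come from the 80/20 split sketched in Section 2.2.

```python
from gplearn.genetic import SymbolicRegressor

# Illustrative GP configuration loosely inspired by Table 7.
gp = SymbolicRegressor(
    population_size=500,
    generations=50,
    tournament_size=2,            # comparable to the tour size of 2
    init_method="half and half",  # comparable to the HalfHalf initialization
    init_depth=(1, 6),
    const_range=(0.0, 1.0),       # range for initial constants, as in Table 7
    p_crossover=0.7,              # gplearn requires these probabilities to sum to <= 1
    p_subtree_mutation=0.1,
    p_hoist_mutation=0.05,
    p_point_mutation=0.1,
    random_state=0,
)
gp.fit(X_train, y_train)
print(gp._program)                # the evolved symbolic expression
```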
Figure 13, Figure 14 and Figure 15 show comparisons between the predicted CBR, UCS, and Rvalue generated by the GP model and the actual values based on results obtained from the testing and training databases. The GP model exhibited a highly satisfactory performance for predicting the CBR, UCS, and Rvalue parameters.
Table 8, Table 9 and Table 10 present the overall performance of the best GP model in predicting the CBR, UCS, and Rvalue parameters for both the training and testing databases. Each table consists of various metrics that evaluate the model’s predictive capability.
Table 8 focuses on predicting the CBR parameter and contains six evaluation metrics, namely MAE, MSE, RMSE, MSLE, RMSLE, and R2. These metrics provide an assessment of the model’s accuracy and goodness of fit. For the training database, the MAE was 1.508, indicating an average absolute error of 1.508 between the predicted and actual CBR values; for the testing database, the MAE was 1.096. Lower MAE and RMSE values are desirable as they indicate a better model performance. The high R2 values (0.978 for training and 0.981 for testing) suggest that the GP model explained a significant portion of the variance in the CBR data.
Table 9 focuses on predicting the UCS and shares the same evaluation metrics as Table 8. The GP model exhibited an impressive performance for predicting the UCS, with lower MAE, MSE, and RMSE values for both the training and testing datasets compared with the CBR predictions reported in Table 8. The R2 values were also higher (0.990 for training and 0.993 for testing), indicating a stronger fit to the UCS data.
Table 10 presents various evaluation metrics associated with the Rvalue predictions. The GP model’s performance for Rvalue predictions was outstanding, with very low MAE, MSE, and RMSE values for both the training and testing datasets. The R2 values were exceptionally high (0.998 for training and 0.998 for testing), demonstrating an excellent fit of the model to the Rvalue data.
Below is the recommended equation obtained from the GP model:
where X1, X2, X3, X4, X5, X6, and X7 are HARHA, LL, PL, PI, wopt, Ac, and MDD, respectively. In addition, the values of R1, R2, and R3 are constants equal to 0.32, 0.92, and 0.09, respectively.
6. Conclusions
This paper focused on developing predictive equations and trees using data-driven approaches for three crucial geotechnical properties of hydrated-lime activated rice husk ash-treated soil, namely the California bearing ratio (CBR), unconfined compressive strength (UCS), and resistance value (i.e., Rvalue from an in-situ cone penetrometer test). Two models, namely classification and regression trees (CART) and genetic programming (GP), were employed to predict these properties based on seven input parameters, consisting of hydrated-lime activated rice husk ash (HARHA) content, liquid limit (LL), plastic limit (PL), plasticity index (PI), optimum moisture content (wOMC), clay activity (AC), and maximum dry density (MDD).
The proposed CART and GP models both displayed commendable predictive capability for the CBR, UCS, and Rvalue parameters. The models were evaluated using various performance metrics, including the MAE, MSE, RMSE, MSLE, RMSLE, and R2. Their performance was consistent across both the training and testing datasets, indicating their ability to generalize well to unseen data.
Comparing the two models, CART generally outperformed GP in terms of predicting the UCS and Rvalue, particularly on the testing database. However, for the CBR predictions, CART demonstrated a slightly better performance on the training database, while GP performed slightly better on the testing database. Overall, both models proved effective in predicting the geotechnical properties under investigation. Additionally, the study assessed the importance of individual input parameters in the predictive models. This analysis provided valuable insights into the sensitivity of each model to specific parameters, helping identify critical factors influencing the models’ performance. The findings of this study contribute to the understanding of using data-driven approaches in predicting engineering properties of soils, which can have significant applications in geotechnical engineering and construction. The developed models offer valuable tools for engineers and researchers to make accurate predictions based on specific input conditions, facilitating informed decision making in various geoengineering applications.