Article

A PSO-XGBoost Model for Predicting the Compressive Strength of Cement–Soil Mixing Pile Considering Field Environment Simulation

1 Jiangxi Academy of Water Science and Engineering, Nanchang 330029, China
2 Jiangxi Provincial Dam Safety Management Center, Nanchang 330029, China
3 Jiangxi Provincial Technology Innovation Center for Ecological Water Engineering in Poyang Lake Basin, Nanchang 330029, China
* Author to whom correspondence should be addressed.
Buildings 2025, 15(15), 2740; https://doi.org/10.3390/buildings15152740
Submission received: 9 July 2025 / Revised: 29 July 2025 / Accepted: 30 July 2025 / Published: 4 August 2025

Abstract

Cement–Soil Mixing (CSM) Pile is an important technology for soft ground reinforcement, and its as-formed compressive strength directly affects engineering design and construction quality. To address the significant discrepancy between laboratory-tested strength and field as-formed strength arising from differing environmental conditions, this study conducted modified laboratory experiments simulating key field formation characteristics. A cement–soil preparation system considering actual immersion conditions was established, based on controlling the initial water content state of the foundation soil before pile formation and applying submerged conditions post-formation. Utilizing data mining on 84 sets of experimental data with various preparation parameter combinations, a prediction model for the as-formed strength of CSM Pile was developed based on the Particle Swarm Optimization-Extreme Gradient Boosting (PSO-XGBoost) algorithm. Engineering validation demonstrated that the model achieved an RMSE of 0.138, an MAE of 0.112, and an R² of 0.961. It effectively addresses the issue of large prediction deviations caused by insufficient environmental simulation in traditional mix proportion tests. The research findings establish a quantitative relationship between as-formed strength and preparation parameters, providing an effective experimental improvement and strength prediction method for the engineering design of CSM Pile.

1. Introduction

Soft ground foundations, characterized by low bearing capacity, large deformation, and long consolidation times, pose significant challenges to engineering projects [1,2,3,4]. Because of land use planning requirements, building on soft ground is often unavoidable, so engineers seek safe, economical, and efficient methods for soft ground reinforcement. The Cement–Soil Mixing (CSM) Pile is a common in situ reinforcement technology: the soft soil is mixed with cement by deep mixing to form a reinforced body, which significantly improves the foundation’s bearing performance [5,6,7,8,9,10,11,12,13]. The as-formed compressive strength of these piles is a critical parameter in design and construction, directly influencing project safety and stability [14,15]. To meet design specifications, extensive testing is typically required under site-specific geological conditions to determine the optimal mix proportion. However, conventional field mix proportion tests rely on a “trial-and-error” approach, often yielding multiple solutions and making the selection of the optimal scheme time consuming and labor intensive. Furthermore, significant discrepancies exist between the curing conditions in laboratory tests and the actual pile formation environment in the field, leading to considerable deviations between laboratory-predicted and field-observed performance. Therefore, it is crucial to build accurate prediction models for the as-formed strength of CSM Pile. Such models can not only reduce testing costs and environmental impact but also optimize mix proportion design and improve construction efficiency.
CSM Pile is one application form of cement–soil, so research on predicting pile strength often draws on results for the compressive strength of cement–soil. Currently, scholars primarily investigate the influencing factors and development patterns of cement–soil strength through unconfined compressive strength (UCS) tests, subsequently establishing prediction models based on traditional empirical calculations. Consoli et al. [16] experimentally studied the effects of water content, porosity, and cement content on the UCS of artificially cemented silty soil and proposed a unique relationship where these three factors synergistically control strength. Sukmak et al. [17] identified clay mineralogy, water content, and cement content as key factors influencing clay–cement strength and established a prediction model. Horpibulsuk et al. [18] proposed a strength formula for Ariake clay centered on the water–cement ratio. Zhang et al. [19] treated different clays with Ordinary Portland Cement (OPC) and a DF stabilizer and established a relationship between the cone penetration index and stabilizer content, while Bi et al. [20] developed a strength development prediction method for clays improved with OPC and Pulverized Blast Furnace Slag Cement (PBFC) based on a log-normal cumulative distribution function. Kang et al. [21] proposed a cement–soil strength prediction model for clay and silt based on total water–cement ratio and soil–cement ratio. Consoli et al. [22] constructed a strength development model based on the porosity/cement content ratio (η/Cv). Do et al. [23] evaluated the influence of multiple factors on cement-stabilized sand strength using Bayesian Model Averaging (BMA) and Principal Component Analysis (PCA) and established a multivariate regression model. Zhang et al. [24] proposed an empirical formula for the strength of marine clay stabilized with low cement content PBFC, whereas Wu et al. [25] employed Pearson correlation and multiple linear regression to establish a strength prediction model for kaolin-based cement–soil. However, these cement–soil strength prediction models mainly rely on specific conditions, and their generalizability is limited.
In recent years, with the advancement of computer technology, intelligent algorithms have been widely applied across various scientific disciplines [26,27,28,29,30]. Tinoco et al. [31] constructed cement–soil strength prediction models using data mining techniques, such as Support Vector Machines (SVM) and Artificial Neural Networks (ANN). Zhang et al. [32] analyzed eight machine learning algorithms (including XGBoost) and identified cement content and water content as key variables. Yao et al. [33] and Khan et al. [34] proposed hybrid optimization algorithms and Explainable Artificial Intelligence (XAI) frameworks to enhance model accuracy and interpretability. However, existing models often utilize default parameters and lack validation through engineering practice.
Although considerable progress has been made in predicting the compressive strength of cement–soils, as a specific application the as-formed strength of the CSM Pile is complexly affected by multiple factors including foundation soil properties and the construction environment. Existing models often fail to directly guide construction mix design. Wang et al. [35] found significant differences in the optimal water–cement ratio between clay and fine sand. Roshan et al. [36] further highlighted that the physical properties of the foundation soil, field environment, and construction process are key influencing factors. Xi et al. [37], using their SA-IRMO-BPNN model, indicated that pile length, diameter, and spacing have the greatest impact on the Ultimate Bearing Capacity (UBC) of composite foundations. Mojtahedi et al. [38] achieved good prediction of UCS using an ANN model, but they did not fully consider the coupled effects of the initial water content state of the foundation soil and the surrounding immersion environment of the pile.
Given the complex and variable construction conditions of CSM Pile, existing research on predicting their as-formed strength suffers from limitations such as insufficient parameter refinement, lack of environmental factor analysis, and inadequate engineering validation. Therefore, this study improves the experimental protocol by introducing the sand and gravel content of the parent soil to quantify soil type, decomposing the water–cement ratio of the cement slurry into two independent variables (cement dosage and water dosage), and incorporating simulation of submerged conditions representative of field scenarios, thereby establishing a more realistic experimental system. Simultaneously, Particle Swarm Optimization (PSO) is employed to tune the hyperparameters of the XGBoost model, developing a PSO-XGBoost prediction model that effectively enhances the accuracy and reliability of as-formed strength prediction. This method quantifies the interactive effects of parent soil characteristics and construction parameters, providing an intelligent prediction tool for CSM Pile mix design. It aims to mitigate the “laboratory–field performance gap” and promote the transition of CSM Pile design from being experience-driven to data-driven, ultimately improving engineering quality and construction efficiency.

2. Cement–Soil Mix Proportion Design Experiments

2.1. Experimental Materials and Properties

The experiments in this study utilized three types of parent foundation soils obtained from the site of a sluice gate project under construction: silty clay, silty soil, and sandy soil (Jiangxi, China). These soils were targeted for reinforcement in the project, as shown in Figure 1. Several basic parameters were determined according to standard geotechnical testing methods [39]. The basic physical properties of the three soil samples are summarized in Table 1.
Particle size distribution curves for the three soil types were plotted based on the content percentages of different particle sizes, as illustrated in Figure 2, facilitating visual comparison.

2.2. Cement–Soil Mix Proportion Design

Foundation soils at engineering sites are often located below the groundwater table, existing in a near-saturated state. To approximate the in situ moisture condition of the foundation soil, this study prepared the test soil samples at their optimum water content (OWC) rather than at full saturation. Although saturation ensures that water fills the voids between soil particles, excess water can impair the hardening of the cement–soil mixture and fail to reflect the mechanical properties arising from cement–soil particle interaction under actual field conditions. At the OWC, by contrast, cement hydration proceeds efficiently without adverse phenomena such as cracking or segregation caused by excess moisture during hardening, which could ultimately affect strength and stability. The test soils were therefore first conditioned to their respective OWC before the mix proportion experiments. This approach aimed both to ensure that the soil samples exhibited optimal mechanical performance at their OWC and to maintain a consistent initial moisture state across the three soil types (silty clay, silty soil, and sandy soil).
A total of 84 mix proportions were designed for the experiments, combining 3 soil types (sandy soil, silty soil, and silty clay), 2 water–cement ratios (w/c ratios), and 14 cement dosages (3 × 2 × 14 = 84). The sand and gravel contents of the sandy soil, silty soil, and silty clay were 100%, 43.7%, and 3.4%, respectively. The experimental water content for each soil type was set to its OWC, which was 2.2%, 18.7%, and 22.6%, respectively. The cement–soil mix proportion design scheme is detailed in Table 2.
Based on the cement–soil mix proportions, the required quantities of each raw material for preparation were determined. The test soils, obtained from the construction site, were air-dried, ground, and sieved through a 5 mm mesh, then conditioned to their optimum water content. Soils in this state were used for the experiments. The cement dosage was calculated as a percentage of the mass of the test soil (at OWC). The added water quantity was calculated as a percentage of the mass of the cement dosage.
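The proportioning rule above can be illustrated with a short calculation. This is a minimal sketch: the function name `batch_quantities` and the 10 kg batch size are illustrative assumptions, not values from the study; only the two rules (cement as a percentage of the OWC-conditioned soil mass, added water as a fraction of the cement mass) come from the text.

```python
def batch_quantities(soil_mass_kg, cement_pct, wc_ratio):
    """Return (cement_kg, water_kg) for one batch.

    soil_mass_kg: mass of test soil already conditioned to its OWC
    cement_pct:   cement dosage as a percentage of that soil mass
    wc_ratio:     water-cement ratio of the slurry (added water / cement, by mass)
    """
    cement_kg = soil_mass_kg * cement_pct / 100.0
    water_kg = cement_kg * wc_ratio
    return cement_kg, water_kg


# Illustrative batch (assumed values): 10 kg of soil, 15% cement, w/c = 0.5
cement, water = batch_quantities(soil_mass_kg=10.0, cement_pct=15.0, wc_ratio=0.5)
print(f"cement = {cement:.2f} kg, added water = {water:.2f} kg")
# cement = 1.50 kg, added water = 0.75 kg
```

Because the added water scales with the cement mass, raising the w/c ratio from 0.5 to 1.0 at a fixed cement dosage doubles the added water.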

2.3. Unconfined Compressive Strength Test

2.3.1. Cement–Soil Specimen Preparation and Strength Testing

When reinforcing soft ground foundations with cement, the treated soil is often located below the groundwater table and thus exists in a submerged or saturated state. Considering that conventional laboratory tests for cement–soil strength often neglect the actual formation environment of CSM Pile, this study cured the prepared cement–soil specimens underwater to simulate the continuous immersion environment provided by groundwater to the solidified soil body under actual field conditions. Specifically, once the prepared cement–soil specimens reached the condition suitable for demolding, they were individually immersed in white plastic containers filled with water. The water-cured specimens and their containers were placed orderly in a curing room maintained at 95% relative humidity for a sealed curing period of 28 days. The preparation of cement–soil specimens involved several key steps: proportioning raw materials, mixing, molding, and demolding/curing, after which the final formed specimens were placed in a compression testing machine for compressive strength testing [40], as illustrated in Figure 3.

2.3.2. Experimental Results and Analysis of Trends

To quantify the influence of parent soil type and cement slurry water–cement ratio on the performance of CSM Pile in the field, this study converted the parent soil category into a tangible and easily comparable indicator: the parent soil’s sand and gravel content. Furthermore, it decomposed the cement slurry water–cement ratio into two independent variables relevant to practical application: cement dosage and water dosage. The analysis of cement–soil strength test results was then conducted using combinations of these different preparation parameters.
The variation pattern of the compressive strength of cement–soil under multiple preparation parameter combinations with respect to cement dosage is shown in Figure 4. Figure 4 presents the strength values for all 84 combinations, encompassing three influencing factors: 3 levels of sand and gravel content, 14 levels of cement dosage, and 2 levels of water–cement ratio (w/c ratio).
As observed from Figure 4, when the cement dosage and water–cement ratio are held constant, the parent soil type significantly influences the cement–soil strength. Specifically, a higher content of sand or fine gravel in the parent soil leads to greater cement–soil strength. The strength exhibits a positive correlation with the parent soil’s sand and gravel content; the higher the sand and gravel content, the higher the cement–soil strength.
Secondly, under conditions of constant water–cement ratio and sand and gravel content, the cement–soil strength increases with increasing cement dosage. Furthermore, the rate of strength increase varies across different ranges of cement dosage. Within an approximate cement dosage range of 6% to 25%, the strength increases rapidly with cement dosage, indicating an accelerating rate of strength gain. However, as the cement dosage exceeds 25%, the strength increases more slowly with further additions of cement, indicating a decelerating rate of strength gain. At this stage, relying solely on increasing the cement dosage to enhance strength becomes less cost effective.
Finally, the water–cement ratio (w/c ratio) is an important control variable in cement–soil mix proportion design. As a parameter reflecting the amount of water added during the experiment, adjusting the water–cement ratio, given a fixed cement dosage, effectively corresponds to changing the added water quantity. In this study’s experiments, the three soil types (silty clay, silty soil, and sandy soil) were initially conditioned to their optimum water content (OWC), and this OWC was used as the initial water content for each soil, ensuring a consistent starting state. The primary focus here is to investigate the effect of varying the added water quantity (implicitly via the w/c ratio) during mixing on the resulting strength. The figure indicates that when the parent soil is at its optimum water content and the cement dosage is constant, increasing the water–cement ratio (which means increasing the added water quantity) leads to a decrease in cement–soil strength. In other words, the cement–soil strength decreases as the quantity of added water increases.

3. Cement–Soil Mixing Pile As-Formed Strength Prediction Method

3.1. Data Description and Preprocessing

The improved cement–soil mix design tests, which incorporate the actual in situ environmental conditions during pile formation, were used as the training dataset for the prediction model. The dataset comprises 84 sets of experimental data obtained from tests with 14 types of cement dosages, 3 levels of sand and gravel content, and 2 water–cement ratios. In this study, cement dosage, sand and gravel content, and water–cement ratio are treated as the three input features, while the strength is used as the output variable.
According to the specifications, the selected cement dosage, water–cement ratio, and sand and gravel content are representative of the factors affecting cement–soil strength, and the chosen values hold practical engineering significance. Specifically, the cement dosage ranges from 6% to 41% in increments of 2% or 3%, yielding 14 values that cover the dosages commonly used in engineering applications; the water–cement ratio is set at 0.5 and 1.0, both frequently used in practice; and three foundation soils are chosen (sandy soil, silt, and silty clay), with sand and gravel contents of 100.0%, 43.7%, and 3.4%, respectively. To support development of the machine learning model in Python (version 3.10.3), marginal histograms were used to visualize and analyze the distribution of the dataset, as shown in Figure 5.
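The full-factorial structure of the dataset can be sketched as follows. The sand and gravel contents and w/c ratios are taken from the text; the individual cement dosage levels are not listed in this section, so the sequence below is a hypothetical placement of the steps. What the text does fix is the arithmetic: 14 values spanning 6% to 41% require 13 increments summing to 35, which is only possible with four 2% steps and nine 3% steps.

```python
from itertools import product

# Values stated in the text
sand_gravel_pct = [100.0, 43.7, 3.4]   # sandy soil, silt, silty clay
wc_ratios = [0.5, 1.0]

# HYPOTHETICAL ordering of the steps: the text only fixes the range (6% to 41%),
# the number of levels (14), and the step sizes (2% or 3%).
steps = [2, 3, 3, 2, 3, 3, 2, 3, 3, 2, 3, 3, 3]
cement_pct = [6]
for step in steps:
    cement_pct.append(cement_pct[-1] + step)

# Every combination of the three input features: 3 x 2 x 14 rows
design = list(product(sand_gravel_pct, wc_ratios, cement_pct))
print(len(cement_pct), cement_pct[0], cement_pct[-1], len(design))
# 14 6 41 84
```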

3.2. Machine Learning Methods

In this study, a PSO-XGBoost model is constructed to predict the as-formed strength of cement–soil mixing piles. To objectively evaluate the performance of this model, five other widely used methods are also incorporated: Multiple Linear Regression (ML), Random Forest (RF), K-Nearest Neighbors (KNN), Back Propagation Neural Network (BP), and XGBoost (prior to PSO optimization) as a comparative baseline. Prediction models for cement–soil strength were developed using the preliminary experimental data for evaluation and comparison.
(1) Multiple Linear Regression (ML)
ML [41] is a basic statistical method used to establish a linear relationship model between independent and dependent variables by minimizing the sum of squared errors between predicted and actual values.
(2) Random Forest (RF)
RF [42] is an ensemble learning method that constructs multiple decision trees and aggregates their outcomes for prediction. Each tree is trained using bootstrap sampling and random feature selection, which enhances the model’s resistance to overfitting, robustness, and its ability to handle high-dimensional data and missing values—making it suitable for modeling complex relationships.
(3) K-Nearest Neighbor Regression (KNN)
KNN [43] is a non-parametric method that performs regression by computing the distance between a test sample and its k nearest neighbors and then taking the average of these neighbors.
(4) Back Propagation Neural Network Regression (BP)
The BP network [44] is a multi-layer feedforward neural network that utilizes the backpropagation algorithm to adjust weights and reduce output error. It is capable of approximating any nonlinear function and is applicable to various pattern recognition problems.
(5) eXtreme Gradient Boosting (XGBoost)
XGBoost [45] is an optimized gradient boosting library that provides an efficient and accurate decision tree boosting algorithm. It emphasizes computational speed and model performance, supports various objective functions, and is widely used in data analysis competitions and industrial applications due to its excellent predictive ability and efficiency.
(6) Particle Swarm Optimization (PSO)
PSO is a population-based stochastic optimization technique that simulates the movement of individuals in a swarm searching for the optimal solution in the search space. It features strong global search capabilities and fast convergence [46,47]. PSO is simple to implement, does not require gradient information, supports parallel computing, and is suitable for a variety of optimization problems.
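For reference, the widely used inertia-weight form of the PSO update is written below. The source does not state which PSO variant was used, so this formulation and its notation are assumptions: $x_i^t$ and $v_i^t$ are the position and velocity of particle $i$ at iteration $t$, $p_i$ is the best position that particle has visited, $g$ is the best position found by the whole swarm, $\omega$ is the inertia weight, $\phi_p$ and $\phi_g$ are the cognitive and social coefficients, and $r_p, r_g$ are random numbers drawn uniformly from $[0, 1]$.

$$
v_i^{t+1} = \omega\, v_i^{t} + \phi_p\, r_p \left(p_i - x_i^{t}\right) + \phi_g\, r_g \left(g - x_i^{t}\right), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1}
$$

The first term keeps part of the previous motion, the second pulls the particle toward its own best position, and the third pulls it toward the swarm's best position; no gradient of the objective is required.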

3.3. Construction of the Cement–Soil Mixing Pile Strength Prediction Model

3.3.1. Machine Learning Modeling

(1) Python Libraries for the Model
In this study, the Python programming language was used with the PyCharm (Community Edition 2022.3) Integrated Development Environment (IDE) to write the code. To accomplish the tasks of building and evaluating the prediction model, several machine learning libraries were installed, which include the necessary classes and functions, as shown in Table 3.
(2) Model Evaluation Metrics
For evaluating the regression prediction models, three commonly used metrics were employed: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R²). Below are the descriptions and formulas for these metrics, where $n$ represents the number of samples, $y_i$ represents the actual strength values, $\hat{y}_i$ represents the predicted strength values, and $\bar{y}$ is the mean of the actual values.
The RMSE, also known as the root mean square deviation, indicates the extent of deviation between the predicted and actual values. It quantifies the magnitude of prediction errors and intuitively demonstrates the prediction performance; however, RMSE is sensitive to outliers, as a large residual can significantly affect its value. RMSE is calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
The MAE is the average of the absolute differences between the predicted and actual values. Compared with RMSE, MAE is less sensitive to outliers since each error contributes equally. MAE is calculated as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
R² is a measure of the model’s goodness-of-fit, reflecting the model’s ability to explain the variability in the dataset. Its value typically ranges from 0 to 1; the closer it is to 1, the better the model fits the data, indicating that the model can more effectively explain the data’s variability. R² is calculated as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
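The three metrics can be computed directly from their definitions. The sketch below uses only the Python standard library; the function names and the four sample values are illustrative assumptions, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up strengths, used only to exercise the functions
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(f"RMSE={rmse(y_true, y_pred):.3f}, MAE={mae(y_true, y_pred):.3f}, "
      f"R2={r2(y_true, y_pred):.3f}")
# RMSE=0.158, MAE=0.150, R2=0.980
```

RMSE exceeds MAE here because squaring weights the two larger residuals (0.2) more heavily than the smaller ones (0.1).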
(3) Modeling and Prediction Process
Using the 84 sets of experimental data as the dataset, the data were randomly split into 65% for training and 35% for testing. Multiple machine learning algorithms were employed to perform model calculations based on the 84 sets of cemented soil strength data. The XGBoost model with the best evaluation metrics was selected, followed by using PSO to tune the hyperparameters of the XGBoost model. A PSO-XGBoost prediction model was then constructed, with the optimized key hyperparameters output and interpreted using the SHAP method to assess the reliability and applicability of the model predictions. The flowchart of the construction process for the cement–soil mixing pile strength prediction model is shown in Figure 6.
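A random 65%/35% split of 84 samples can be sketched as follows. The seed, the function name `random_split`, and the decision to round the test share up are assumptions; the source does not state how the non-integer share (0.35 × 84 = 29.4) was rounded.

```python
import math
import random


def random_split(n_samples, test_fraction=0.35, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Assumption: the test share is rounded up to a whole number of samples.
    n_test = math.ceil(test_fraction * n_samples)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = random_split(84)
print(len(train_idx), len(test_idx))
# 54 30
```

With only 84 samples, the test-set metrics depend noticeably on which 30 samples are drawn, so fixing the seed keeps a comparison between models reproducible.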

3.3.2. Construction of the PSO-XGBoost Model

Starting from the XGBoost model, the Particle Swarm Optimization (PSO) algorithm is applied to tune its hyperparameters. Eight key XGBoost hyperparameters are tuned; their empirical ranges, which define the PSO search space, are listed in Table 4. The PSO parameters are set as follows: swarmsize = 10, omega = 0.5, phip = 0.5, phig = 0.5, and maxiter = 50. In the PSO-XGBoost model, the objective function is set to reg:squarederror to minimize the squared error, and the evaluation metric is set to rmse, with lower values indicating that the model’s predictions are closer to the actual values. To balance model fitting and computational efficiency, num_boost_round is set to 1000, representing the maximum number of training rounds, and early_stopping_rounds is set to 20, meaning that training is halted if the evaluation metric on the validation set does not improve for 20 consecutive rounds. This approach helps limit overfitting and conserves computational resources. By combining the global search capability of PSO with the efficient learning ability of XGBoost, the tuned hyperparameter configuration shown in Table 4 is obtained.
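The tuning loop can be sketched in plain Python. This is not the authors' code: it is a minimal inertia-weight PSO whose argument names mirror the settings quoted above (swarmsize, omega, phip, phig, maxiter). Table 4 is not reproduced in this section, so the three-parameter search space is hypothetical, and `toy_objective` is a stand-in for the real objective, which would train XGBoost with the decoded parameters and return the validation RMSE.

```python
import random

# HYPOTHETICAL search space; the study tunes eight hyperparameters (Table 4).
SPACE = {
    "max_depth": (2, 10),
    "learning_rate": (0.01, 0.3),
    "subsample": (0.5, 1.0),
}
lower = [lo for lo, _ in SPACE.values()]
upper = [hi for _, hi in SPACE.values()]


def decode(position):
    """Map a particle position to a hyperparameter dict (integers rounded)."""
    params = dict(zip(SPACE, position))
    params["max_depth"] = int(round(params["max_depth"]))
    return params


def toy_objective(position):
    """Stand-in for validation RMSE, smallest near depth 5, rate 0.1, subsample 0.8."""
    p = decode(position)
    return (0.01 * (p["max_depth"] - 5) ** 2
            + (p["learning_rate"] - 0.1) ** 2
            + (p["subsample"] - 0.8) ** 2)


def pso(objective, lower, upper, swarmsize=10, omega=0.5, phip=0.5,
        phig=0.5, maxiter=50, seed=0):
    """Minimize `objective` over the box [lower, upper] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(lower)
    # Random initial positions inside the bounds, zero initial velocities
    x = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
         for _ in range(swarmsize)]
    v = [[0.0] * dim for _ in range(swarmsize)]
    pbest = [xi[:] for xi in x]
    pbest_val = [objective(xi) for xi in x]
    best = min(range(swarmsize), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[best][:], pbest_val[best]

    for _ in range(maxiter):
        for i in range(swarmsize):
            for d in range(dim):
                rp, rg = rng.random(), rng.random()
                v[i][d] = (omega * v[i][d]
                           + phip * rp * (pbest[i][d] - x[i][d])
                           + phig * rg * (gbest[d] - x[i][d]))
                # Move the particle and clip it to the search box
                x[i][d] = min(max(x[i][d] + v[i][d], lower[d]), upper[d])
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


best_pos, best_val = pso(toy_objective, lower, upper)
print(decode(best_pos), round(best_val, 6))
```

With swarmsize = 10 and maxiter = 50 the objective is evaluated 10 + 10 × 50 = 510 times, which is why capping each XGBoost fit with early stopping matters for the total tuning cost.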

3.3.3. Model Evaluation

Table 5 presents the evaluation metrics of the six models on both the test set and the entire dataset. The PSO-XGBoost model exhibits RMSE and MAE values closest to 0, indicating a minimal difference between predicted and actual values, and an R² value nearest to 1, demonstrating an excellent fit and strong ability to explain the variability in the experimental data. Overall, the PSO-XGBoost based cement–soil strength prediction model shows the highest accuracy and best performance. In comparison, the XGBoost model before PSO optimization ranks second, the ML model performs the worst, while the RF, KNN, and BP models display intermediate performance. Figure 7 presents a radar chart of the performance differences among the six models across six evaluation metrics. Based on the 84 sets of cement–soil strength data, the overall ranking from best to worst is PSO-XGBoost > XGBoost > RF > KNN > BP > ML.
The scatter plot of predicted versus actual values is shown in Figure 8, where the horizontal axis represents the actual strength values of the samples and the vertical axis represents the model’s predicted strength values. Each blue dot corresponds to one sample, represented as a coordinate pair of the actual and predicted values. The red diagonal line, labeled the Ideal Line, marks perfect agreement between predicted and actual values (y = x) and serves as a reference baseline. The higher the prediction accuracy, the more closely the sample points align along this line; conversely, larger perpendicular deviations and a greater number of points away from the line indicate lower prediction accuracy.
Analysis of the scatter plots for the six models (ML, RF, KNN, BP, XGBoost, PSO-XGBoost) shows that the ML model has the largest deviations between predicted and actual values, performing the worst; the RF and BP models, although their points are close to the Ideal Line, still show some error; the KNN model performs moderately with most points near the Ideal Line but with slight deviations. In contrast, the XGBoost model demonstrates significantly improved prediction accuracy with points almost entirely on the Ideal Line. The PSO-XGBoost model further refines the hyperparameter settings, resulting in a scatter plot that closely adheres to the Ideal Line with minimal deviation, thereby exhibiting the best predictive performance and generalization capability. Consequently, for predicting the as-formed strength of cement–soil mixing piles, the PSO-XGBoost model, with its accuracy and stability, is the preferred choice among the six. However, the predictive performance of the PSO-XGBoost model depends on the range and representativeness of the cement–soil parameter-strength dataset; for inputs outside the range covered by these 84 parameter combinations, the model’s performance requires further evaluation and validation.

3.3.4. Visualization and Explanation of the Optimal Model

To further elucidate the performance of the PSO-XGBoost model in predicting the strength of cement–soil mixing piles and to analyze the relative importance of the feature variables, the SHAP method was applied. Figure 9 shows the distribution of SHAP values for each feature variable in the test data. The vertical axis ranks the feature variables from most to least important. In the usual SHAP summary-plot convention, the color of each point encodes the feature value (red for high, blue for low), while its SHAP value measures that feature’s contribution to the predicted strength. Cement dosage and sand and gravel content emerge as the key factors influencing the prediction of compressive strength for cement–soil mixing piles. High cement dosage and high sand and gravel content contribute significantly and positively to the predicted strength, whereas the effect of the water–cement ratio is relatively minor. This implies that in optimizing the design and material selection for cement–soil mixing piles, priority should be given to adjusting the proportions of cement and aggregate to enhance structural strength and stability.
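The idea behind SHAP values can be illustrated by computing exact Shapley values for a small model by brute force. This is a swapped-in technique for illustration, not the SHAP library's tree algorithm: it enumerates all feature coalitions and uses a single baseline point in place of a background dataset. The model `toy_strength`, its coefficients, and the input and baseline points are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a single baseline point."""
    n = len(x)
    features = range(n)

    def value(coalition):
        # Features in the coalition take their value from x, the rest from baseline
        point = [x[j] if j in coalition else baseline[j] for j in features]
        return f(point)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_strength(point):
    """HYPOTHETICAL strength model with a cement x sand-gravel interaction."""
    cement, sand_gravel, wc = point
    return 0.1 * cement + 0.02 * sand_gravel - 1.0 * wc + 0.001 * cement * sand_gravel


x = [20.0, 43.7, 0.5]        # cement dosage (%), sand and gravel (%), w/c ratio
baseline = [6.0, 3.4, 1.0]   # reference mix (assumed)
phi = shapley_values(toy_strength, x, baseline)
print([round(p, 3) for p in phi])
print(round(sum(phi), 6), round(toy_strength(x) - toy_strength(baseline), 6))
```

The two numbers on the last line agree: the contributions add up to the difference between the prediction at `x` and the prediction at the baseline, which is the property that makes per-feature attributions comparable.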

4. Engineering Case Validation

4.1. Cement–Soil Mixing Pile Construction Process

A large-scale hydraulic project is located at the downstream confluence of rivers and is primarily designed to regulate river water levels during low-flow periods. The project mainly comprises the left-bank connecting section, spillway gate section, intermediate connecting section, ship lock section, and the right-bank relocation embankment and its connecting section. During the first phase of construction, to enhance the stability of the foundation in certain areas, a composite foundation system utilizing cement–soil mixing piles was implemented to reinforce the existing foundation. The layout of the cement–soil mixing piles at the hydraulic project is shown in Figure 10. They are primarily arranged in the foundations of the following three sections—in order from the right bank to the left bank: ‘right-bank relocation embankment’, ‘embankments of the upstream and downstream approach channels of the ship lock’, and ‘energy dissipater pool of the spillway gate in Area Three’.
The construction procedure for the cement–soil mixing piles is illustrated in Figure 11, and mainly includes the following processes: positioning of the mixing pile machine, slurry spraying and mixing during sinking, slurry spraying and mixing during lifting, cleaning of the pile machine, and relocation of the pile machine. During both the sinking and lifting processes, the mixing drill rod repeatedly mixes the foundation to form the pile.

4.2. Prediction of the As-Formed Strength of Cement–Soil Mixing Piles

4.2.1. PSO-XGBoost Model Prediction

The PSO-XGBoost model was directly applied to predict the as-formed strength of cement–soil mixing piles on site. A comparative analysis was performed between the model’s predicted values and the actual measured values, and the model’s accuracy was evaluated using three metrics to explore its practical value. Based on the geological conditions at the project site, the foundation exhibited issues such as compressive deformation and sliding stability, necessitating reinforcement measures. Cement–soil mixing piles were used to solidify the original soil mass. Three cement–soil mixing pile projects were carried out on site, and the optimal PSO-XGBoost model was used to predict the as-formed strength of cement–soil mixing piles at different project locations. The input combinations of preparation parameters at the various project locations and the corresponding predicted strengths are shown in Table 6.

4.2.2. Cement–Soil Mixing Pile Strength Detection Based on Core Drilling Method

For the cement–soil mixing pile projects involved in this study, systematic sampling tests were conducted on the piles after reaching the predetermined 28-day strength age, to accurately determine their as-formed strength. On-site operations involved the use of professional large-scale drilling equipment to drill and core the cement–soil mixing piles, thereby obtaining core samples that reflect the actual internal state of the piles. The core samples were extracted from the drilling machine’s rock core tube, as shown in Figure 12.

4.3. Results Analysis

Figure 13 shows the scatter of the as-formed strength predicted by the PSO-XGBoost model against the measured values of the cement–soil mixing piles. The evaluation metrics are RMSE = 0.138 and MAE = 0.112, both close to zero and lower than those obtained on the indoor experimental test set (RMSE = 0.5404, MAE = 0.3709; Table 5). This indicates that the difference between the predicted and measured cement–soil strength is relatively small. In addition, the coefficient of determination R² is 0.961, which is close to 1, demonstrating an excellent fit and the model's strong ability to capture and explain the variability of cement–soil strength in actual engineering.
When the predicted values are plotted against the measured values, the data points follow the diagonal (y = x) closely. However, the measured values are generally lower than the predicted values. This discrepancy arises because the PSO-XGBoost model was trained on indoor experimental data, so its predictions reflect ideal conditions, i.e., strictly controlled construction processes and full curing. Under field conditions, limitations in construction processes, curing conditions, and other uncontrollable factors make the actual strength of the cement–soil mixing piles slightly lower than the strength predicted under laboratory conditions. Although the as-formed strength in practice shows some dispersion and uncertainty, the model remains accurate and reliable, maintaining good predictive performance even in the complex environment of actual projects.
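The three metrics quoted above follow their standard definitions. The sketch below shows how they could be computed from paired measured and predicted strengths; the function name and the sample values are hypothetical and are not the core-sample data of this study.

```python
import math

def regression_metrics(measured, predicted):
    """Return (rmse, mae, r2) for paired measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    residuals = [m - p for m, p in zip(measured, predicted)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    mean_measured = sum(measured) / n
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    rmse = math.sqrt(ss_res / n)                      # root mean square error
    mae = sum(abs(r) for r in residuals) / n          # mean absolute error
    r2 = 1.0 - ss_res / ss_tot                        # coefficient of determination
    return rmse, mae, r2

# Hypothetical values for illustration only (not data from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 2.1, 2.9, 4.1]
rmse, mae, r2 = regression_metrics(measured, predicted)
print(f"RMSE={rmse:.3f}, MAE={mae:.3f}, R2={r2:.3f}")
```

RMSE and MAE carry the units of the strength values, so they are only comparable between data sets measured on the same scale, whereas R² is dimensionless.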
This study addresses the shortcomings of existing strength prediction for cement–soil mixing piles by improving the model's engineering applicability in four respects. First, in the experimental design, attention was paid to the water content of the foundation soil and to the underwater immersion during pile formation; soil samples at optimum moisture content and a water-curing scheme for the cement–soil were used to simulate the conditions before and after pile formation. Second, in parameter processing, the parent soil type was quantified by the sand and gravel content index, and the water–cement ratio was decoupled into two independent variables, cement dosage and water dosage, to allow an intuitive analysis of the influence of preparation parameters. Third, in model construction, the PSO algorithm iteratively searched for and determined a combination of nine key hyperparameters for XGBoost, improving the test-set R² by 0.28% to 28.55% relative to the other models (Table 5). Finally, the PSO-XGBoost model's predictions were validated in an engineering application and proved reliable. Although Tinoco et al. [31] and Zhang et al. [32] have also used algorithms for prediction, their studies were confined to indoor experiments and were not validated in actual projects.
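The quoted improvement range of 0.28% to 28.55% can be checked against the test-set R² values in Table 5. The short calculation below is a sketch of that arithmetic, assuming the improvement is expressed relative to each baseline model's R²; the helper name is hypothetical.

```python
# Test-set R2 values of the five baseline models, copied from Table 5.
r2_test = {
    "ML": 0.7721,
    "RF": 0.9676,
    "KNN": 0.8985,
    "BP": 0.878,
    "XGBoost": 0.9897,
}
r2_pso_xgboost = 0.9925  # PSO-XGBoost test-set R2 from Table 5

def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0

gains = {name: relative_gain_percent(r2, r2_pso_xgboost)
         for name, r2 in r2_test.items()}

for name, gain in sorted(gains.items(), key=lambda item: item[1]):
    print(f"{name}: {gain:.2f}%")
```

Under this assumption, the smallest gain is over plain XGBoost and the largest is over the linear regression model (ML), which matches the two ends of the quoted range.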

5. Conclusions

This study established a new method for predicting the strength of cement–soil mixing piles through experimental improvement and algorithm optimization. The main conclusions are as follows:
(1) An experimental system using soil samples at optimum moisture content and water curing of cement–soil specimens was used to simulate the environmental characteristics of cement–soil mixing pile formation. The average difference between the values predicted by the model trained on indoor test data and the measured field values is only 2.18%, which effectively strengthens the correlation between indoor tests and on-site as-formed strength.
(2) Through hyperparameter optimization, the PSO-XGBoost model achieves higher prediction accuracy than the conventional models, reaching an R² of 0.961 in engineering validation. This provides a reliable basis for mix proportion optimization in cement–soil mixing pile projects.
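The text does not state how the average difference in conclusion (1) was calculated. One plausible reading, offered here only as an assumption, is the mean absolute difference between predicted and measured strength expressed as a percentage of the measured value. The sketch below implements that reading with hypothetical numbers.

```python
def mean_relative_difference(predicted, measured):
    """Mean of |predicted - measured| / measured, in percent.

    This definition is an assumption made for illustration; the paper
    does not spell out its formula.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - m) / m for p, m in zip(predicted, measured)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical values for illustration only (not data from the paper).
predicted = [1.02, 2.04, 3.06]
measured = [1.0, 2.0, 3.0]
difference = mean_relative_difference(predicted, measured)
print(f"mean relative difference: {difference:.2f}%")
```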

Author Contributions

J.X.: conceptualization, methodology, software, formal analysis, investigation, data curation, writing—original draft; Y.G.: conceptualization, writing—review and editing; X.L.: conceptualization, writing—review and editing, funding acquisition; Y.L.: conceptualization, methodology, resources, writing—review and editing, supervision, project administration; L.C.: methodology, validation, visualization; C.L.: methodology, writing—review and editing; C.Z.: formal analysis, data curation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Jiangxi Provincial Natural Science Foundation (Grant No. 20232BAB214090), the Jiangxi Provincial Science and Technology Department’s “Science and Technology + Water Resources” Joint Program (2023KSG01008, 2022KSG01003), and the Science and Technology Project of the Water Resources Department of Jiangxi Province (Grant Nos. 202325ZDKT16, 202526YBKT03, 202425YBKT05, 202425YBKT04).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Baroni, M.; de Souza, S.; Almeida, M. Compressibility and stress history of very soft organic clays. Proc. Inst. Civ. Eng.-Geotech. Eng. 2017, 170, 148–160.
  2. Madhira, M.; Sakleshpur, V.A. Geotechnics of Soft Ground. In Geotechnics for Natural and Engineered Sustainable Technologies; Krishna, A., Dey, A., Sreedeep, S., Eds.; Developments in Geotechnical Engineering; Springer: Singapore, 2018.
  3. Li, W.; Ma, H.; Wei, M.; Xiang, P.; Tang, F.; Gao, B.; Zhou, Q. Dynamic Responses of Train-Symmetry-Bridge System Considering Concrete Creep and the Creep-Induced Track Irregularity. Symmetry 2023, 15, 1846.
  4. Zhu, F.; Zhang, W. Scale effect on bearing capacity of shallow foundations on strain-softening clays. Comput. Geotech. 2021, 135, 104182.
  5. Denies, N.; Huybrechts, N. Chapter 11—Deep Mixing Method: Equipment and Field of Applications. In Ground Improvement Case Histories; Elsevier: Amsterdam, The Netherlands, 2015; pp. 311–350.
  6. Wonglert, A.; Jongpradist, P. Impact of reinforced core on performance and failure behavior of stiffened deep cement mixing piles. Comput. Geotech. 2015, 69, 93–104.
  7. Lu, X.; Mengen, S.; Wang, P. Numerical simulation of the composite foundation of cement soil mixing piles using FLAC3D. Clust. Comput. 2018, 22 (Suppl. S4), 7965–7974.
  8. Fan, J.; Wang, D.; Qian, D. Soil-cement mixture properties and design considerations for reinforced excavation. J. Rock Mech. Geotech. Eng. 2018, 10, 791–797.
  9. Tabarsa, A.; Latifi, N.; Osouli, A.; Bagheri, Y. Unconfined compressive strength prediction of soils stabilized using artificial neural networks and support vector machines. Front. Struct. Civ. Eng. 2021, 15, 520–536.
  10. Pourebrahim, F.; Zolfegharifar, S.Y. Stabilizers effects comprehensive assessment on the physical and chemical properties of soft clays. Shock. Vib. 2022, 5991132.
  11. Mahesh, B. Machine Learning Algorithms—A Review. Int. J. Sci. Res. (IJSR) 2020, 9, 381–386.
  12. Ghanizadeh, A.R.; Ghanizadeh, A.; Asteris, P.G.; Fakharian, P.; Armaghani, D.J. Developing bearing capacity model for geogrid-reinforced stone columns improved soft clay utilizing MARS-EBS hybrid method. Transp. Geotech. 2023, 38, 100906.
  13. Karami, H.; Pooni, J.; Robert, D.; Costa, S.; Li, J.; Setunge, S. Use of secondary additives in fly ash based soil stabilization for soft subgrades. Transp. Geotech. 2021, 29, 100585.
  14. Liu, J. Research on the Application of Cement-Soil Mixing Pile Technology in Soft Soil Foundation Reinforcement. Authorea 2025.
  15. Shen, K.; Zhang, H.; Liu, J.; Zhao, X.; Zhang, Y. Study of cement-soil mixed piles reinforcement method for offshore wind turbine pile foundation. Ocean. Eng. 2024, 313, 119423.
  16. Consoli, N.C.; Rosa, D.A.; Cruz, R.C.; Dalla Rosa, A. Water content, porosity and cement content as parameters controlling strength of artificially cemented silty soil. Eng. Geol. 2011, 122, 328–333.
  17. Sukmak, G.; Sukmak, P.; Horpibulsuk, S.; Arulrajah, A.; Horpibulsuk, J. Generalized strength prediction equation for cement stabilized clayey soils. Appl. Clay Sci. 2023, 231, 106761.
  18. Horpibulsuk, S.; Miura, N.; Nagaraj, T.S. Clay–Water/Cement Ratio Identity for Cement Admixed Soft Clays. J. Geotech. Geoenviron. Eng. 2005, 131, 187–192.
  19. Zhang, Z.; Omine, K.; Flemmy, S.O. Evaluation of the improvement effect of cement-stabilized clays with different solidifying agent addition and water content. J. Mater. Cycles Waste Manag. 2022, 24, 2291–2302.
  20. Bi, J.; Chian, S.C. Modelling of three-phase strength development of ordinary Portland cement- and Portland blast-furnace cement-stabilised clay. Géotechnique 2020, 70, 80–89.
  21. Kang, G.; Kim, Y.; Kang, J. Predictive strength model of cement-treated fine-grained soils using key parameters: Consideration of the total water/cement and soil/cement ratios. Case Stud. Constr. Mater. 2023, 18, e02069.
  22. Consoli, N.C.; Cruz, R.C.; Floss, M.F.; Festugato, L. Parameters Controlling Tensile and Compressive Strength of Artificially Cemented Sand. J. Geotech. Geoenviron. Eng. 2010, 136, 759–763.
  23. Do, H.-D.; Pham, V.-N.; Nguyen, H.-H.; Huynh, P.-N.; Han, J. Prediction of Unconfined Compressive Strength and Flexural Strength of Cement-Stabilized Sandy Soils: A Case Study in Vietnam. Geotech. Geol. Eng. 2021, 39, 4947–4962.
  24. Zhang, R.J.; Santoso, A.M.; Tan, T.S.; Phoon, K.K. Strength of High Water-Content Marine Clay Stabilized by Low Amount of Cement. J. Geotech. Geoenviron. Eng. 2013, 139, 2170–2181.
  25. Wu, Y.; Yang, J.; Liu, X.; Lu, Y.; Lu, R. Combined Correlation Analysis and Multilinear Regression for Strength Model of Cement-Stabilized Clayey Soils. Int. J. Geomech. 2024, 24, 04024190.
  26. He, H.; Shuang, E.; Ai, L.; Wang, X.; Yao, J.; He, C.; Cheng, B. Exploiting machine learning for controlled synthesis of carbon dots-based corrosion inhibitors. J. Clean. Prod. 2023, 419, 138210.
  27. Baghbani, A.; Choudhury, T.; Costa, S.; Reiner, J. Application of artificial intelligence in geotechnical engineering: A state-of-the-art review. Earth-Sci. Rev. 2022, 228, 103991.
  28. Liu, H.; Su, H.; Sun, L.; Dias-da-Costa, D. State-of-the-art review on the use of AI-enhanced computational mechanics in geotechnical engineering. Artif. Intell. Rev. 2024, 57, 196.
  29. Ho, L.S.; Tran, V.Q. Machine learning approach for predicting and evaluating California bearing ratio of stabilized soil containing industrial waste. J. Clean. Prod. 2022, 370, 133587.
  30. Shan, H.; Ai, L.; He, C.; Li, K. Enhancing multi-objective prediction of settlement around foundation pit using explainable machine learning. J. Civ. Struct. Health Monit. 2025, 15, 1–22.
  31. Tinoco, J.; Alberto, A.; da Venda, P.; Gomes Correia, A.; Lemos, L. A novel approach based on soft computing techniques for unconfined compression strength prediction of soil cement mixtures. Neural Comput. Appl. 2019, 32, 8985–8991.
  32. Zhang, C.; Zhu, Z.; Liu, F.; Yang, Y.; Wan, Y.; Huo, W.; Yang, L. Efficient machine learning method for evaluating compressive strength of cement stabilized soft soil. Constr. Build. Mater. 2023, 392, 131887.
  33. Yao, Q.; Tu, Y.; Yang, J. Predicting the compressive strength of solid waste-cement stabilized compacted soil using machine learning model. Mater. Today Commun. 2025, 44, 111882.
  34. Khan, M.H.A.; Abdallah, A.; Cuisinier, O. Insights into the strength development in cement-treated soils: An explainable AI-based approach for optimized mix design. Comput. Geotech. 2025, 180, 107103.
  35. Wang, S.; Guo, S.; Gao, X.; Zhang, P.; Li, G. Effects of cement content and soil texture on strength, hydraulic, and microstructural characteristics of cement-stabilized composite soils. Bull. Eng. Geol. Environ. 2022, 81, 264.
  36. Roshan, M.J.; Rashid, A.S.B.A. Geotechnical characteristics of cement stabilized soils from various aspects: A comprehensive review. Arab. J. Geosci. 2023, 17, 1.
  37. Xi, L.; Jin, L.; Ji, Y.; Liu, P.; Wei, J. Prediction of Ultimate Bearing Capacity of Soil–Cement Mixed Pile Composite Foundation Using SA-IRMO-BPNN Model. Mathematics 2024, 12, 1701.
  38. Mojtahedi, S.F.F.; Ahmadihosseini, A.; Sadeghi, H. An Artificial Intelligence Based Data-Driven Method for Forecasting Unconfined Compressive Strength of Cement Stabilized Soil by Deep Mixing Technique. Geotech. Geol. Eng. 2023, 41, 491–514.
  39. Ministry of Housing and Urban-Rural Development of the People’s Republic of China. GB/T 50123-2019; Standard for Geotechnical Testing Method. China Planning Press: Beijing, China, 2019.
  40. Ministry of Housing and Urban-Rural Development of the People’s Republic of China. JGJ/T 233-2011; Specification for Mix Proportion Design of Cement Soil. China Architecture & Building Press: Beijing, China, 2011.
  41. Galton, F. Regression towards mediocrity in hereditary stature. J. Anthropol. Inst. Great Br. Irel. 1886, 15, 246–263.
  42. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  43. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
  44. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
  45. Chen, T.; Guestrin, C. XGBoost. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
  46. Poli, R.; Kennedy, J.; Blackwell, T. Particle swarm optimization. Swarm Intell. 2007, 1, 33–57.
  47. Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
Figure 1. Test soil samples and the determination of their parameters.
Figure 2. The particle size distribution curves of the test soil samples.
Figure 3. Preparation and strength testing process of cement–soil test specimens.
Figure 4. Variation in compression strength of cement–soil with multiple preparation parameters.
Figure 5. Statistical distribution graph of the cement–soil strength data set.
Figure 6. Flowchart of modeling for cement–soil mixing pile strength prediction model.
Figure 7. Radar chart of evaluation metric errors for six models on the test set and the entire dataset.
Figure 8. Experimental and predictive values of models.
Figure 9. SHAP values of different feature variables.
Figure 10. Layout of construction locations for cement–soil mixing piles.
Figure 11. Construction procedure of cement–soil mixing piles.
Figure 12. Field sampling of cement–soil core from mixing pile.
Figure 13. Detected and predicted values of cement–soil core samples based on the PSO-XGBoost model.
Table 1. Basic physical parameters of test soil samples.
| Soil Type | Plastic Limit (%) | Liquid Limit (%) | Plasticity Index | Fine Grain Content (%) | Sand Content (%) | Silt Content (%) | Optimum Moisture Content (%) | Maximum Dry Density (g/cm³) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Silty clay | 31.2 | 45.8 | 14.6 | 96.6 | 3.4 | 0 | 22.6 | 1.59 |
| Silty soil | 16 | 25 | 9 | 56.3 | 43.8 | 2.6 | 18.7 | 1.71 |
| Sandy soil | / | / | / | 0 | 80.6 | 19.4 | / | / |
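Table 2. Design plan of cement–soil mix proportion.
| Soil Type | Test Moisture Content (%) | Sand and Gravel Content (%) | Water–Cement Ratio | Cement Content (%) | Curing Conditions |
| --- | --- | --- | --- | --- | --- |
| Silty clay | 22.6 | 3.4 | 1.0, 0.5 | 6, 9, 12, 15, 18, 21, 23, 25, 27, 29, 32, 35, 38, 41 | Water |
| Silty soil | 18.7 | 43.7 | 1.0, 0.5 | 6, 9, 12, 15, 18, 21, 23, 25, 27, 29, 32, 35, 38, 41 | Water |
| Sandy soil | 2.2 | 100 | 1.0, 0.5 | 6, 9, 12, 15, 18, 21, 23, 25, 27, 29, 32, 35, 38, 41 | Water |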
Table 3. Python libraries for the model.
| Model | Library | Class | Function |
| --- | --- | --- | --- |
| ML | Scikit-learn | LinearRegression | train_test_split |
| RF | Scikit-learn | RandomForestRegressor | train_test_split |
| KNN | Scikit-learn | KNeighborsRegressor | train_test_split |
| BP | Scikit-learn, keras | StandardScaler, sequential | train_test_split |
| XGBoost | Scikit-learn, xgboost | XGBRegressor | train_test_split |
| PSO | Pyswarms | / | pso |
Table 4. Hyperparameters of XGBoost.
| Hyperparameter | max_depth | learning_rate | min_child_weight | subsample | gamma | colsample_bytree | reg_alpha | reg_lambda |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Empirical range | 3–10 | 0–0.5 | 1–10 | 0.3–1.0 | 0–10 | 0–1 | 0–4 | 0–10 |
| Optimal value | 9.98 | 0.03 | 4.51 | 1.00 | 0.00 | 1.00 | 0.00 | 3.78 |
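To illustrate how a particle swarm can search the ranges in Table 4, the following is a minimal pure-Python sketch of global-best PSO. It is not the authors' implementation (Table 3 indicates that Pyswarms was used). The objective `toy_objective` is a hypothetical stand-in for the validation error of an XGBoost model, and the swarm settings (`n_particles`, `n_iter`, `w`, `c1`, `c2`) are assumed values chosen only for illustration.

```python
import random

# Search bounds copied from the "Empirical range" row of Table 4.
BOUNDS = {
    "max_depth": (3.0, 10.0),
    "learning_rate": (0.0, 0.5),
    "min_child_weight": (1.0, 10.0),
    "subsample": (0.3, 1.0),
    "gamma": (0.0, 10.0),
    "colsample_bytree": (0.0, 1.0),
    "reg_alpha": (0.0, 4.0),
    "reg_lambda": (0.0, 10.0),
}

def toy_objective(params):
    """Hypothetical cost: squared normalised distance from each range midpoint.

    In a real workflow this would train XGBoost with `params` and return
    a validation error such as RMSE.
    """
    return sum(((params[k] - (lo + hi) / 2) / (hi - lo)) ** 2
               for k, (lo, hi) in BOUNDS.items())

def pso(objective, bounds, n_particles=20, n_iter=50,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over box `bounds` with global-best PSO."""
    rng = random.Random(seed)
    names = list(bounds)
    lows = [bounds[k][0] for k in names]
    highs = [bounds[k][1] for k in names]

    def as_dict(x):
        return dict(zip(names, x))

    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
           for _ in range(n_particles)]
    vel = [[0.0] * len(names) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [objective(as_dict(p)) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(len(names)):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the coordinate back into its range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lows[d]), highs[d])
            cost = objective(as_dict(pos[i]))
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return as_dict(gbest), gbest_cost

best_params, best_cost = pso(toy_objective, BOUNDS)
print(best_cost)
```

Because the swarm works in continuous space, an integer-valued setting such as `max_depth` would typically be rounded or cast before being passed to the model; the non-integer optimum of 9.98 reported in Table 4 is consistent with a continuous search of this kind.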
Table 5. Evaluation metrics of six models on the test set (T) and all datasets (A).
| Model | RMSE (T) | MAE (T) | R² (T) | RMSE (A) | MAE (A) | R² (A) |
| --- | --- | --- | --- | --- | --- | --- |
| ML | 2.9747 | 2.5688 | 0.7721 | 2.6971 | 2.2436 | 0.8137 |
| RF | 1.1209 | 0.8554 | 0.9676 | 0.8207 | 0.5686 | 0.9827 |
| KNN | 1.9848 | 1.5998 | 0.8985 | 1.9698 | 1.4843 | 0.9006 |
| BP | 2.1766 | 1.6151 | 0.878 | 1.6907 | 1.221 | 0.9268 |
| XGBoost | 0.631 | 0.4251 | 0.9897 | 0.3771 | 0.1537 | 0.9964 |
| PSO-XGBoost | 0.5404 | 0.3709 | 0.9925 | 0.341 | 0.1982 | 0.997 |
Table 6. Prediction values of the PSO-XGBoost model.
| No. | Engineering Location | Sand and Gravel Content | Cement Content | Water–Cement Ratio | Predicted Value |
| --- | --- | --- | --- | --- | --- |
| 1 | Right-bank relocation embankment | 3.4% | 12% | 0.5 | 1.4159 |
| 2 | Embankments of the upstream and downstream approach channels of the ship lock | 43.7% | 12% | 0.5 | 2.6851 |
| 3 | Energy dissipater pool of the spillway gate in Area Three | 100.0% | 12% | 1.5 | 3.6632 |
Xiong, J.; Gong, Y.; Liu, X.; Li, Y.; Chen, L.; Liao, C.; Zhang, C. A PSO-XGBoost Model for Predicting the Compressive Strength of Cement–Soil Mixing Pile Considering Field Environment Simulation. Buildings 2025, 15, 2740. https://doi.org/10.3390/buildings15152740
