1. Introduction
Expansive soil is a moisture-sensitive geomaterial widely used in the core and foundation zones of earth–rockfill dams due to its low permeability, high cohesion, and good saturated stability [1,2,3]. However, its susceptibility to volumetric changes under wetting–drying cycles poses serious risks, including crest deformation, longitudinal cracking, slope instability, and internal erosion [4,5]. These risks are further exacerbated in fissured expansive soils, where dual-structure heterogeneity and interconnected crack networks contribute to increased mechanical instability [6,7,8]. Desiccation cracking dynamically alters hydraulic conductivity and moisture migration patterns [9], while cyclic fracture connectivity accelerates infiltration and internal erosion [10]. Lan et al. [11] emphasized that fractured geomaterials exhibit distinct reinforcement responses compared to homogeneous soils, reinforcing the necessity of accurate deformation prediction in dam engineering.
To address the nonlinear and multifactorial behavior of expansive soils, artificial neural networks (ANNs) have been increasingly applied in geotechnical modeling. As data-driven tools, ANNs learn complex input–output relationships without requiring explicit constitutive formulations [12,13,14]. Recent applications include hybrid ANN–GA models for subgrade modulus prediction [15] and ANN-driven strength forecasting of expansive soils [16]. However, most existing models focus on homogeneous or treated soils, with limited attention to fissured expansive soils exhibiting dual-structure effects. Moreover, few studies evaluate how different back-propagation (BP) training algorithms influence prediction accuracy and computational efficiency, leaving a methodological gap in intelligent soil modeling.
Artificial intelligence is also increasingly used in dam safety systems. For example, Beiranvand and Rajaee [17] reviewed AI models for seepage and pore pressure forecasting, while Wen et al. [18] applied stacked GRUs for concrete-dam deformation prediction. Nonetheless, swelling behavior modeling of fissured expansive soils remains underexplored in dam-related AI applications.
To bridge these gaps, this study develops a BP neural network model tailored for predicting the swelling deformation of fissured expansive soils under dam engineering conditions. Four input parameters—fissure ratio, dry density, initial moisture content, and overburden pressure—are incorporated based on laboratory data. A three-layer feedforward network is trained using two optimization strategies: gradient descent with momentum (traingdm) and the Fletcher–Reeves conjugate gradient method (traincgf). Prediction accuracy and convergence speed are systematically compared.
The proposed model demonstrates robust generalization and fast convergence, offering a lightweight and interpretable solution for swelling prediction in expansive soils. Its scalability makes it suitable for integration into dam safety management systems, digital twin frameworks, and intelligent deformation control platforms in swelling-prone environments.
2. Methodology
2.1. BP Neural Network Algorithm
The back-propagation (BP) neural network is a supervised learning algorithm based on a multilayer feedforward architecture, widely employed for modeling complex nonlinear relationships between input and output variables. In this study, a BP neural network was constructed to capture the nonlinear interactions among critical geotechnical parameters and to predict the swelling behavior of fissured expansive soils.
A standard three-layer feedforward structure was employed, comprising an input layer, a single hidden layer, and an output layer (Figure 1). The network was configured with four input neurons and one output neuron, reflecting the dimensionality of the prediction problem. To enhance training stability and eliminate magnitude disparities among features, the input layer receives four normalized variables (fissure ratio, dry density, initial moisture content, and overburden pressure), all scaled to the range [−1, 1].
The input vector is defined as
$$X = [K_r,\ \rho_d,\ w_0,\ \sigma]^{T}$$
In this formulation, $K_r$ represents the fissure ratio, expressed as the volumetric percentage of internal cracks within the soil matrix. The dry density $\rho_d$ (g/cm³) characterizes the degree of compaction of the specimen, while the initial moisture content $w_0$ (%) reflects the pre-swelling water condition that governs hydration and suction potential. The overburden pressure $\sigma$ (kPa) describes the vertical stress applied to the specimen during testing.
The output layer generates a single value, representing the predicted swelling ratio ($\delta_{ep}$, %), defined as the percentage increase in specimen height resulting from water-induced expansion under a specified overburden pressure. The output vector is expressed as
$$Y = [\delta_{ep}]$$
During training, the input vectors were propagated forward through the network to generate output predictions. The error between predicted and actual values was calculated using the mean squared error (MSE) function, defined as
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ denotes the total number of samples. The resulting error is then propagated backward to iteratively adjust the network’s weights and biases according to the selected optimization algorithm.
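As a minimal illustration of the error function described above, the following Python sketch computes the MSE for a handful of values. The function name `mean_squared_error` and the numbers are hypothetical and are not taken from the study's dataset.

```python
def mean_squared_error(actual, predicted):
    """Return the mean of the squared differences between paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared) / len(actual)


# Hypothetical swelling ratios (%), for illustration only.
measured = [2.0, 4.0, 6.0]
predicted = [2.5, 3.5, 6.0]
mse = mean_squared_error(measured, predicted)
```

Because the errors are squared before averaging, a single large deviation contributes more to the MSE than several small ones of the same total magnitude.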
The activation function used in the hidden layer is the hyperbolic tangent sigmoid function (tansig), which introduces nonlinearity and maps values to the open interval (−1, 1). For the output layer, a linear transfer function (purelin) is used, so the output is unbounded and the model can represent both positive and negative swelling ratios.
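The forward pass through such a network can be sketched in a few lines of Python. This is an illustrative sketch rather than the study's MATLAB implementation: the function `forward` and all weight values are hypothetical, and `math.tanh` is used because the tansig function is mathematically equivalent to the hyperbolic tangent.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer network: tanh hidden units, linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical weights for a 4-5-1 network (NOT the trained values from the study).
w_hidden = [[0.1 * (i + 1) * (-1) ** j for j in range(4)] for i in range(5)]
b_hidden = [0.0, 0.1, -0.1, 0.2, -0.2]
w_out = [0.5, -0.4, 0.3, -0.2, 0.1]
b_out = 0.05

# Normalized inputs in [-1, 1]: fissure ratio, dry density, moisture content, pressure.
x = [0.0, -1.0, 1.0, 0.0]
y_norm = forward(x, w_hidden, b_hidden, w_out, b_out)
```

The returned value is on the normalized scale and would need to be mapped back to a swelling ratio in percent before comparison with measurements.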
The BP neural network’s universal approximation capability, combined with its ability to learn from empirical data, makes it a robust tool for modeling the coupled effects of soil structure and environmental conditions in swelling deformation prediction—particularly in fissured expansive soil environments relevant to dam engineering applications.
2.2. Dataset Description and Preprocessing
The dataset used to train the BP neural network was derived from a series of laboratory-based one-dimensional swelling tests conducted on compacted expansive soil specimens. A total of 81 samples were constructed based on a full factorial combination of four input parameters: fissure ratio $K_r$ (35%, 50%, 65%), dry density $\rho_d$ (1.45, 1.50, 1.55 g/cm³), initial moisture content $w_0$ (20%, 25%, 30%), and overburden pressure $\sigma$ (0, 25, 50 kPa). This design was intended to comprehensively represent the geotechnical conditions influencing the swelling behavior of fissured expansive soils. The swelling ratio served as the output variable for each sample and was predicted by the single output neuron of the BP neural network. A subset of representative training samples is shown in Table 1.
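The full factorial design described above can be enumerated directly, which also confirms the sample count of 3 × 3 × 3 × 3 = 81. The sketch below uses the factor levels listed in the text; the variable names are illustrative, and the measured swelling ratios are not reproduced here.

```python
from itertools import product

# Factor levels as listed in the text.
fissure_ratio = [35, 50, 65]          # K_r, %
dry_density = [1.45, 1.50, 1.55]      # rho_d, g/cm^3
moisture_content = [20, 25, 30]       # w_0, %
overburden_pressure = [0, 25, 50]     # sigma, kPa

# Every combination of the four factors, one tuple per test condition.
design = list(
    product(fissure_ratio, dry_density, moisture_content, overburden_pressure)
)
```

Each tuple in `design` corresponds to one input row of the form (fissure ratio, dry density, moisture content, overburden pressure).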
To address the issue of differing units and scales among the input variables, normalization was applied prior to training. This process transforms raw input and output data into a dimensionless scale, constraining values within a predefined interval suitable for numerical computation. It helps to eliminate magnitude disparities across features and significantly improves the convergence speed and stability of the neural network.
In this study, MATLAB’s (Matlab 2017b) premnmx function was employed to normalize all input and output variables to the range [−1, 1]. This normalization ensures consistency across sample dimensions and prevents dominant features from skewing the learning process.
Let the normalized value of a data sample $d$ be denoted as $y$, which is computed using the following equation:
$$y = \frac{(y_{\max} - y_{\min})(d - d_{\min})}{d_{\max} - d_{\min}} + y_{\min}$$
In the above equation, $y_{\max} = 1$, $y_{\min} = -1$, and $d_{\max}$ and $d_{\min}$ represent the maximum and minimum values of the data sample, respectively. In this study, the target output values were normalized using the same strategy, in order to maintain consistency. The resulting normalized dataset provided the foundation for training and evaluating the BP neural network.
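A minimal Python sketch of this min–max scaling, together with its inverse for converting predictions back to physical units, is given below. The helper names `minmax_scale` and `minmax_unscale` are hypothetical; they are intended to mirror the linear mapping described above, not to reproduce MATLAB's implementation.

```python
def minmax_scale(values, y_min=-1.0, y_max=1.0):
    """Linearly map values so that min(values) -> y_min and max(values) -> y_max."""
    d_min, d_max = min(values), max(values)
    if d_max == d_min:
        raise ValueError("cannot scale a constant feature")
    scale = (y_max - y_min) / (d_max - d_min)
    return [scale * (d - d_min) + y_min for d in values]


def minmax_unscale(scaled, d_min, d_max, y_min=-1.0, y_max=1.0):
    """Invert minmax_scale to recover values in their original units."""
    scale = (d_max - d_min) / (y_max - y_min)
    return [scale * (y - y_min) + d_min for y in scaled]


# Overburden pressure levels from the test programme (kPa).
pressure = [0, 25, 50]
pressure_scaled = minmax_scale(pressure)
pressure_restored = minmax_unscale(pressure_scaled, min(pressure), max(pressure))
```

For the three pressure levels, the lowest, middle, and highest values map to −1, 0, and 1, respectively. The same minimum and maximum values used for training must be reused when scaling new inputs or unscaling predictions.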
2.3. Network Architecture and Parameter Settings
The BP neural network was implemented in MATLAB using the newff function, which allows for flexible specification of network architecture, activation functions, and training algorithms. A three-layer network structure was adopted, comprising an input layer with four neurons (corresponding to the four normalized input variables), a hidden layer, and an output layer with one neuron.
The number of neurons in the hidden layer significantly influences the model’s fitting accuracy and generalization capability. To determine an appropriate configuration, a trial-and-error approach was employed, testing hidden layer sizes from 3 to 10. The evaluation was based on the mean squared error (MSE) between the predicted and measured swelling ratios. The results indicated that a hidden layer with five neurons yielded the lowest MSE and the most stable convergence across multiple runs; therefore, the final network configuration (4 input neurons, 5 hidden neurons, and 1 output neuron) was adopted based on the performance evaluation.
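One way to put the candidate hidden layer sizes in context is to count the trainable parameters of each configuration and compare that number with the 81 available samples. The sketch below assumes a fully connected network with one bias per neuron; the function name `count_parameters` is illustrative.

```python
def count_parameters(n_inputs, n_hidden, n_outputs):
    """Number of weights and biases in a fully connected one-hidden-layer network."""
    hidden_params = n_hidden * (n_inputs + 1)    # weights plus one bias per hidden neuron
    output_params = n_outputs * (n_hidden + 1)   # weights plus one bias per output neuron
    return hidden_params + output_params


# Candidate hidden layer sizes tested in the text (3 to 10), with 4 inputs and 1 output.
param_counts = {h: count_parameters(4, h, 1) for h in range(3, 11)}
```

Under these assumptions the adopted 4-5-1 configuration has 31 parameters, whereas ten hidden neurons would give 61, which approaches the size of the dataset.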
The hidden layer adopted the hyperbolic tangent sigmoid activation function (tansig), and the output layer adopted the linear transfer function (purelin), as described in Section 2.1.
The training algorithms and hyperparameter configurations are discussed in detail in the next section.
2.4. Training Algorithms and Implementation
To investigate the impacts of training strategies on convergence behavior and predictive performance, two optimization algorithms were implemented and compared under identical conditions. The first algorithm, gradient descent with momentum (traingdm), is a first-order optimization technique that updates network weights along the negative gradient direction, incorporating a momentum term to accelerate convergence and reduce oscillations. The second algorithm, the Fletcher–Reeves conjugate gradient method (traincgf), builds each search direction from the current negative gradient and the previous search direction and sets the step length by line search. It uses only gradient information but, unlike plain gradient descent, accounts for curvature implicitly through the conjugacy of successive directions, and it is widely recognized for its efficiency in training nonlinear neural networks.
Both algorithms were applied under identical training settings: the learning rate was set to 0.05, the maximum number of epochs was 6000, and the performance goal was defined as achieving a mean squared error (MSE) below 0.01. Network weights and biases were initialized using MATLAB’s default settings, and training was performed in batch mode.
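The difference between the two update rules can be illustrated on a toy problem. The sketch below minimizes a two-variable quadratic function, not a neural network, so its iteration counts are not comparable with those reported in this study. It assumes the classical momentum update with a momentum coefficient of 0.9 (an assumption; MATLAB's traingdm may scale the terms differently) and an exact line search for the Fletcher–Reeves method, which is only possible because the objective is quadratic. All function names are hypothetical.

```python
def mat_vec(a, v):
    """Matrix-vector product for nested lists."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def gradient(a, b, w):
    """Gradient of f(w) = 0.5 * w^T A w - b^T w."""
    return [g - b_i for g, b_i in zip(mat_vec(a, w), b)]


def momentum_descent(a, b, w, lr=0.05, mc=0.9, tol=1e-8, max_iter=6000):
    """Gradient descent with classical momentum; returns (solution, iterations)."""
    velocity = [0.0] * len(w)
    for k in range(max_iter):
        g = gradient(a, b, w)
        if dot(g, g) ** 0.5 < tol:
            return w, k
        velocity = [mc * v - lr * g_i for v, g_i in zip(velocity, g)]
        w = [w_i + v for w_i, v in zip(w, velocity)]
    return w, max_iter


def fletcher_reeves(a, b, w, tol=1e-8, max_iter=6000):
    """Fletcher-Reeves conjugate gradient; returns (solution, iterations)."""
    g = gradient(a, b, w)
    d = [-g_i for g_i in g]
    for k in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            return w, k
        alpha = -dot(g, d) / dot(d, mat_vec(a, d))   # exact line search (quadratic only)
        w = [w_i + alpha * d_i for w_i, d_i in zip(w, d)]
        g_new = gradient(a, b, w)
        beta = dot(g_new, g_new) / dot(g, g)         # Fletcher-Reeves ratio
        d = [-gn + beta * d_i for gn, d_i in zip(g_new, d)]
        g = g_new
    return w, max_iter


# Toy symmetric positive-definite problem (illustrative values only).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
w0 = [2.0, 1.0]
w_gdm, it_gdm = momentum_descent(A, b, w0)
w_cg, it_cg = fletcher_reeves(A, b, w0)
```

Both routines approach the same minimizer, but the conjugate gradient variant chooses its step length adaptively at every iteration, whereas the momentum variant relies on a fixed learning rate.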
3. Results and Discussion
3.1. Training Efficiency and Convergence Analysis
To optimize the network architecture and evaluate training efficiency, a series of controlled experiments were conducted to investigate the impacts of hidden layer size and training algorithms on model performance. First, the number of neurons in the hidden layer was varied from three to ten. For each configuration, the network was trained using both the gradient descent with momentum algorithm (traingdm) and the Fletcher–Reeves conjugate gradient method (traincgf). Prediction performance was quantitatively assessed using the mean squared error (MSE) between the predicted and experimentally measured swelling ratios. The MSE variations are presented in Figure 2.
As illustrated in Figure 2, both optimization methods achieved their minimum MSE when the hidden layer contained five neurons. The traingdm algorithm reached a minimum MSE of 0.723, while traincgf achieved a slightly lower MSE of 0.612. Based on these findings, a hidden layer with five neurons was adopted for subsequent modeling.
To further assess training performance, both algorithms were applied under identical settings: learning rate of 0.05, maximum of 6000 epochs, and a performance goal of MSE = 0.01. The convergence behaviors are illustrated in Figure 3.
As shown in Figure 3a, the traingdm algorithm required 953 iterations to reach the target error, with a final MSE of 0.009996. In contrast, Figure 3b shows that traincgf reached the same performance level in only 32 iterations, indicating a nearly 30-fold improvement in convergence speed.
To gain further insight into the internal mapping behavior of the trained models, the finalized weights and biases for both training algorithms are summarized in Table 2 and Table 3. These parameters characterize how each input contributes to the network’s nonlinear representation of the swelling response. The magnitude and sign of each weight reflect the influence and directionality of the individual geotechnical parameters (fissure ratio, dry density, initial moisture content, and overburden pressure) relative to the predicted swelling ratio, enabling a preliminary interpretation of parameter sensitivity and a comparison of how each input contributes under the two optimization strategies.
Both training algorithms successfully calibrated the BP neural network, enabling it to capture the complex nonlinear relationships between geotechnical parameters and swelling behavior. Analyses of the learned weights show that all four input variables—fissure ratio, dry density, initial moisture content, and overburden pressure—contributed substantially to the prediction output, with no single input parameter consistently dominating the weight distribution across all neurons. This distribution suggests that swelling deformation is governed by a coupled interaction among structural (fissure ratio, dry density) and environmental (moisture, pressure) factors, which the trained network effectively captured.
Additionally, differences in bias values and weight signs among hidden neurons reflect the network’s ability to model nonlinear boundaries in the input space, thereby enhancing generalization. While both algorithms achieved comparable prediction accuracy, traincgf demonstrated superior convergence speed. This efficiency gain is attributed to the conjugate gradient method’s use of conjugate search directions and line search, which exploit curvature information implicitly (without computing second derivatives) and typically converge faster than fixed-step gradient descent on non-convex optimization landscapes, such as those arising in expansive soil modeling with coupled physical variables.
From an engineering-application perspective, faster convergence not only reduces computational burden but also improves model adaptability for real-time or iterative design scenarios, such as staged embankment construction or reservoir operation forecasting. The ability to achieve rapid convergence while maintaining high prediction accuracy makes the conjugate gradient–based network particularly suitable for expansive soil environments, in which timely and resource-efficient predictions are crucial. These findings suggest that traincgf-based networks not only demonstrated effectiveness under laboratory-scale conditions but are also readily deployable in field-scale prediction frameworks for geotechnical systems involving expansive soils.
3.2. Prediction Accuracy and Generalization Performance
To assess the predictive accuracy and generalization capability of the trained BP neural network, a two-stage validation procedure was conducted. First, the model’s fitting performance was evaluated using the original training dataset. Second, an independent test set consisting of 18 samples—distinct from the training data—was employed to verify the model’s extrapolation ability under unseen conditions.
Figure 4 illustrates the comparison between the predicted swelling ratios and the corresponding experimentally measured values for the training dataset. Both the traingdm and traincgf-based models exhibit strong consistency with the observed values, indicating that the trained networks achieved satisfactory fitting performance. The error margins remained within acceptable bounds, validating the model’s ability to capture the nonlinear swelling behavior induced by variations in fissure ratio, dry density, initial moisture content, and overburden pressure.
To evaluate the generalization capability, the trained models were applied to predict swelling ratios for the independent test set. The input conditions and corresponding prediction results are listed in Table 4, and the prediction accuracy is visualized in Figure 5.
Although Table 4 omits explicit error metrics, a supplementary error analysis was conducted to evaluate the robustness of the predictions. For samples with very small measured swelling ratios (e.g., Samples 10–13), even minor numerical differences led to inflated relative errors exceeding 100%, which is a known artifact when the denominator approaches zero. However, the corresponding absolute errors in these cases remained generally within ±0.5%, indicating acceptable engineering accuracy. In contrast, Samples 3 and 6 exhibited larger absolute deviations (greater than 1.0%) and are acknowledged as individual outliers. Nonetheless, the overall prediction performance across all 18 test samples remained within an acceptable error range, supporting the generalization capability of the model.
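The inflation of relative error at small measured values can be reproduced with a short calculation. The numbers below are hypothetical and were chosen only to show the effect; they are not entries from Table 4, and the function name `error_report` is illustrative.

```python
def error_report(measured, predicted):
    """Return (absolute error, relative error in %) for each measured/predicted pair."""
    report = []
    for m, p in zip(measured, predicted):
        abs_err = p - m
        rel_err = abs(abs_err) / abs(m) * 100.0 if m != 0 else float("inf")
        report.append((abs_err, rel_err))
    return report


# Hypothetical swelling ratios (%): one small and one large measured value.
measured = [0.2, 5.0]
predicted = [0.5, 5.3]
report = error_report(measured, predicted)
```

Both pairs differ by about 0.3 in absolute terms, yet the relative error is roughly 150% for the small measured value and roughly 6% for the large one, which is why absolute error is the more informative metric near zero swelling.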
Figure 5 further visualizes the prediction performance of both algorithms on the test set. The predicted values align closely with the experimental measurements in most cases, demonstrating the network’s ability to generalize beyond the training data.
The validation results demonstrate that the trained BP neural network can effectively predict the swelling behavior of fissured expansive soils under diverse conditions. Notably, the traincgf-based model not only maintained accuracy comparable to traingdm, but also offered substantially improved computational efficiency—achieving convergence nearly 30 times faster.
In practical engineering contexts, such rapid convergence translates to significant savings in computation time, which is crucial for real-time analysis, iterative design updates, and decision support in projects involving staged construction or dynamic loading. These results confirm that the BP neural network—particularly when trained using the conjugate gradient method—constitutes a reliable and efficient tool for expansive soil behavior prediction in dam engineering and other geotechnical applications. These findings reinforce the model’s reliability not only in terms of mean prediction accuracy, but also in its ability to transparently account for and explain localized outliers in real-world applications.
4. Conclusions
In this study, a BP neural network-based prediction model was developed to evaluate the swelling behavior of fissured strong expansive soil, with direct application to geotechnical infrastructure such as dam cores and earth embankments. By incorporating four key influencing factors—fissure ratio, dry density, initial moisture content, and overburden pressure—as input variables, the model effectively captured the coupled nonlinear interactions governing soil swelling.
A standard three-layer feedforward BP neural network structure was employed, with the optimal hidden layer configuration (five neurons) selected based on mean squared error (MSE) minimization. Comparative analysis of two training algorithms showed that the conjugate gradient method achieved comparable prediction accuracy while significantly improving convergence efficiency—reducing training iterations by approximately 30-fold compared to gradient descent with momentum.
Testing on an independent dataset supported both the generalization capability and the potential field applicability of the proposed model. The predicted swelling ratios closely matched the experimentally measured values, with most absolute errors falling within acceptable engineering thresholds. This supports the model’s suitability for practical geotechnical uses, such as deformation risk assessment in dam cores and slope systems composed of fissured expansive soils.
Overall, the conjugate gradient-based BP neural network model exhibits strong potential for deployment in geotechnical engineering contexts requiring rapid, data-driven prediction of expansive soil swelling. Its balance of high accuracy and fast convergence makes it particularly suitable for engineering applications such as dam core zone assessment, embankment deformation forecasting, and real-time decision support in swelling-prone environments. Compared to conventional empirical or numerical methods, the proposed model offers a lightweight, generalizable alternative that can be embedded into intelligent infrastructure systems and smart monitoring frameworks.
From a geotechnical engineering perspective, the developed BP neural network model provides a reliable and intelligent tool for the prediction of swelling deformation in expansive soils. Its predictive capability is particularly beneficial in practical engineering scenarios such as dam core material selection, expansive subgrade assessment, and deformation control in swelling clay layers. These applications require rapid and accurate evaluations of soil behavior under variable loading and environmental conditions—capabilities that the proposed model supports efficiently. By integrating the model into early design evaluation or post-construction monitoring, engineers can enhance decision-making in expansive soil management and ensure structural safety and performance longevity.
Future work will focus on expanding the model’s generalization capability by incorporating additional field-derived datasets and accounting for broader ranges of spatial heterogeneity and environmental variability. Furthermore, the integration of advanced learning frameworks—such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks—is expected to further improve the model’s capacity to capture coupled spatial–temporal dynamics. Coupling the BP neural network with real-time monitoring systems and digital twin frameworks is also envisioned, with the aim of supporting intelligent decision-making in expansive soil-related geotechnical engineering projects.
Author Contributions
Conceptualization, S.L. and Z.L.; methodology, S.L.; software, S.L. and H.Z.; validation, H.T.; formal analysis, X.Z.; investigation, S.L.; resources, B.Z.; data curation, L.G.; writing—original draft preparation, S.L.; writing—review and editing, H.T.; visualization, S.L. and H.T.; supervision, Z.L. and J.Z.; project administration, S.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors on request.
Conflicts of Interest
Author Junxing Zheng is the Editor-in-Chief of Intelligent Infrastructure and Construction. To ensure a fair and unbiased review process, he was completely excluded from the peer-review and editorial decision-making with respect to this manuscript. Authors Shuangping Li, Bin Zhang, Zuqiang Liu, Xin Zhang, and Linjie Guan were employed by the company Changjiang Spatial Information Technology Engineering Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Yu, K. Modelling Thermo-Hydro-Mechanical Behaviour of Expansive Soil Subject to Different Scenarios. Ph.D. Dissertation, Queensland University of Technology, Brisbane City, Australia, 2025.
2. Sharo, A.; Bani Baker, M.; Tarawneh, D.A.; Khasawneh, M.; Ghuzlan, K. Stabilising highly expansive soil by using Nano-Clay additive. Int. J. Pavement Eng. 2025, 26, 2460077.
3. Alsabhan, A.H.; Hamid, W. Innovative Thermal Stabilization Methods for Expansive Soils: Mechanisms, Applications, and Sustainable Solutions. Processes 2025, 13, 775.
4. Oppong, F.; Kolawole, O. Reassessment of natural expansive materials and their impact on freeze-thaw cycles in geotechnical engineering: A review. Front. Built Environ. 2024, 10, 1396542.
5. Almuaythir, S.; Zaini, M.S.I.; Hasan, M.; Hoque, M.I. Sustainable soil stabilization using industrial waste ash: Enhancing expansive clay properties. Heliyon 2024, 10, e39124.
6. Tang, C.S.; Zhu, C.; Cheng, Q.; Zeng, H.; Xu, J.J.; Tian, B.G.; Shi, B. Desiccation cracking of soils: A review of investigation approaches, underlying mechanisms, and influencing factors. Earth-Sci. Rev. 2021, 216, 103586.
7. Yang, R.; Xiao, P.; Qi, S. Analysis of slope stability in unsaturated expansive soil: A case study. Front. Earth Sci. 2019, 7, 292.
8. Lucian, C. Geotechnical Aspects of Buildings on Expansive Soils in Kibaha, Tanzania: Preliminary Study. Ph.D. Dissertation, KTH Royal Institute of Technology, Stockholm, Sweden, 2006.
9. Gao, H.; An, R.; Zhang, X.; Wang, G.; Liu, X.; Xu, Y. Dynamic evolution of desiccation cracks and their relationship with the hydraulic properties of expansive soil. Int. J. Geomech. 2024, 24, 04023299.
10. Zhao, Y.; Zhang, H.; Wang, G.; Yang, Y. The infiltration characteristics of expansive soil considering fracture under wet-dry cycle conditions. Heliyon 2024, 10, e36840.
11. Lan, X.; Zhang, X.; Li, X.; Zhang, J.; Zhou, Z. Experimental study on grouting reinforcement mechanism of heterogeneous fractured rock and soil mass. Geotech. Geol. Eng. 2020, 38, 4949–4967.
12. Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Umar, A.M.; Linus, O.U.; Arshad, H.; Kazaure, A.A.; Gana, U.; Kiru, M.U. Comprehensive review of artificial neural network applications to pattern recognition. IEEE Access 2019, 7, 158820–158846.
13. Yang, K.T. Artificial neural networks (ANNs): A new paradigm for thermal science and engineering. J. Heat Transfer. 2008, 130, 093001.
14. Poddar, H. From neurons to networks: Unravelling the secrets of artificial neural networks and perceptrons. In Deep Learning in Engineering, Energy and Finance; CRC Press: Boca Raton, FL, USA, 2024; pp. 25–79.
15. Khawaja, L.; Asif, U.; Onyelowe, K.; Al Asmari, A.F.; Khan, D.; Javed, M.F.; Alabduljabbar, H. Development of machine learning models for forecasting the strength of resilient modulus of subgrade soil: Genetic and artificial neural network approaches. Sci. Rep. 2024, 14, 18244.
16. Alnmr, A.; Hosamo, H.H.; Lyu, C.; Ray, R.P.; Alzawi, M.O. Novel insights in soil mechanics: Integrating experimental investigation with machine learning for unconfined compression parameter prediction of expansive soil. Appl. Sci. 2024, 14, 4819.
17. Beiranvand, B.; Rajaee, T. Application of artificial intelligence-based single and hybrid models in predicting seepage and pore water pressure of dams: A state-of-the-art review. Adv. Eng. Softw. 2022, 173, 103268.
18. Wen, Z.; Zhou, R.; Su, H. MR and stacked GRUs neural network combined model and its application for deformation prediction of concrete dam. Expert Syst. Appl. 2022, 201, 117272.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).