Prediction and Parameter Optimization of Surface Settlement Induced by Shield Tunneling Using Improved Informer Algorithm

Zhao, Shisen; Feng, Xianda; Peng, Kefeng

doi:10.3390/app15126766


by Shisen Zhao ¹,*, Xianda Feng ² and Kefeng Peng ³

¹ State Key Laboratory of Intelligent Construction and Healthy Operation and Maintenance of Deep Underground Engineering, China University of Mining and Technology, Xuzhou 221116, China

² School of Civil Engineering and Architecture, University of Jinan, Jinan 250022, China

³ School of Qilu Transportation, Shandong University, Jinan 250002, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(12), 6766; https://doi.org/10.3390/app15126766

Submission received: 9 April 2025 / Revised: 30 May 2025 / Accepted: 13 June 2025 / Published: 16 June 2025

(This article belongs to the Section Civil Engineering)


Abstract

Given the complex mechanism of surface settlement induced by shield tunneling, accurate prediction of surface settlement and rational design of tunneling parameters are critical for ensuring safe and efficient tunneling operations. To address the limitations of the Informer algorithm in predicting surface settlement during shield tunneling, the standard convolution was replaced with dilated causal convolution, and three measures were employed: soil layer classification and characterization, a moving prediction window, and special factor handling. An improved Informer algorithm was developed. The prediction time of the improved Informer algorithm was reduced by up to 30.64% compared to the original algorithm. The improved Informer algorithm exhibited significant enhancements in prediction accuracy, perception range across different timescales, and computational efficiency. Compared with random forest and long short-term memory algorithms, the improved Informer algorithm achieved superior performance, making it more suitable for predicting surface settlement induced by shield tunneling. By integrating the improved Informer algorithm with Shapley additive explanations theory, the contributions of shield tunneling parameters to surface settlement were analyzed. A multi-objective optimization algorithm was constructed for the shield tunneling parameters. Key parameters, including the shield thrust, cutterhead torque, and tunneling speed, were selected for optimization. The surface settlement after parameter optimization was reduced by 16.79%.

Keywords: shield tunnel; improved Informer algorithm; dilated causal convolution; surface settlement prediction; SHAP analysis; parameter optimization

1. Introduction

In recent years, urban rail transit in China has developed rapidly. Shield tunneling has become the mainstream construction method for urban transit tunnels owing to its significant technical advantages [1,2,3]. However, because of complex geological conditions and improper operational practices, shield tunneling inevitably induces ground loss in the surrounding soil layers, posing safety risks to nearby buildings and structures [4].

Accurate prediction of surface settlement induced by shield tunneling in complex construction environments, along with timely adjustment of tunneling parameters to minimize soil layer disturbance, has become a key focus in shield tunnel research [5,6,7]. The mechanism of surface settlement caused by shield tunneling is complex. The interplay of multiple factors increases the difficulty of prediction and poses significant challenges to settlement control [8,9].

Methods for predicting surface settlement caused by shield tunneling can be broadly categorized into three types: theoretical analysis [10,11], numerical simulation [12,13], and machine learning [14,15]. Theoretical analysis requires many simplifications and assumptions, limiting its applicability to real-world projects. Recent advancements in numerical modeling have further quantified the sensitivity of key tunneling parameters to settlement outcomes. For example, ref. [16] proposed a validated TBM excavation model and demonstrated that tunnel diameter and eccentricity account for 22.5% and 17% of structural settlement variation, respectively. While such parametric studies provide critical insights, their reliance on predefined assumptions limits real-time adaptability to dynamic construction conditions. In contrast, machine-learning methods, with their robust self-learning and analytical capabilities, can uncover intrinsic patterns within data and have been widely applied to shield-induced surface settlement prediction [15,17,18]. Algorithms such as support vector regression [19], random forest (RF) [4,20], multilayer perceptrons [21], long short-term memory (LSTM) networks [22], the Informer algorithm [23], the Transformer algorithm [24], and the WaveNet algorithm [25] have been successfully applied to the prediction of settlement induced by shield tunneling.

Proposed in 2021, the Informer algorithm [26] is a deep-learning algorithm specifically designed for long-time-series prediction. It arranges and predicts data samples according to the temporal and spatial continuity inherent in shield tunneling processes, offering significant advantages over machine-learning algorithms that rely solely on individual settlement values. Lai et al. employed the standard Informer model to predict subway tunnel settlement, integrating multivariate inputs, including temperature and soil pressure. No architectural modifications were made, but the model’s native long-sequence processing capability effectively captured nonlinear deformation patterns in complex geological environments [27]. Zhao and Ding applied the vanilla Informer architecture to TBM thrust prediction. Their key adaptation involved relaxing the Pearson correlation threshold during feature selection to reduce noise. This preprocessing optimization enhanced model robustness while maintaining cross-project generalization without altering Informer’s core design [28]. Wang and Bai benchmarked the unmodified Informer against Autoformer for ground deformation forecasting. Their results revealed Informer’s competitive performance but highlighted inherent limitations in extreme long-horizon predictions, suggesting opportunities for architectural enhancements in future work [29]. Hu et al. proposed the LSPP model, which leverages Informer’s long-term prediction capability by integrating tunneling parameters and stratigraphic data. They improved Informer with ProbSparse self-attention and dynamic decoding to address error accumulation in traditional RNNs, enhancing long-distance forecasting accuracy. However, computational demands remain a challenge [30]. Zhen et al. developed a 1DCNN-Informer model in which the 1DCNN extracts local features from shield parameters and Informer captures long-term dependencies. Enhanced by DANN transfer learning, it improves cross-strata adaptability for position/attitude forecasting, although the combined structure increases model complexity and resource requirements [31]. Pang et al. developed the TBMformer model by modifying Informer’s architecture to incorporate spatio-temporal feature fusion (time/ring encoding) and multi-attention mechanisms (LSTM and temporal/self-attention). This adaptation enhances real-time prediction of tunnel boring machine parameters under complex geological conditions while retaining Informer’s efficient sequence forecasting capability [32]. Wen et al. enhanced Informer with T-BiGRU (local feature extraction) and Masked MSA (global dependency capture) for railway settlement prediction. Although long-sequence accuracy improved, the limitations include unverified generalization to diverse geological conditions, unquantified computational complexity, and a lack of real-time deployment validation in IoT systems [23].

In the Informer algorithm, self-attention distillation is utilized to introduce convolutional and max-pooling layers between self-attention blocks to shorten input lengths and optimize performance. However, when applied to surface settlement prediction induced by shield tunneling, the Informer algorithm faces three key limitations: (1) As the network depth increases, the amount of historical information retrievable by standard convolutional layers grows only linearly, limiting the algorithm’s perception range and leading to repetitive and uninformative computations. (2) Standard convolutional layers do not consider the temporal dimension, which can cause leakage of information about future events during prediction. (3) There is room for improvement in reducing computational complexity and memory consumption. Moreover, surface settlement induced by shield tunneling exhibits lag and abruptness. The maximum settlement location typically lags the shield face by 5–10 rings, and unexpected events such as machine stoppages or secondary grouting can significantly impact surface settlement. These factors are not adequately considered in the Informer algorithm.

To overcome these limitations, we replaced the original standard convolutional layers with dilated causal convolutional layers and employed soil layer classification and characterization to reduce computational complexity and memory consumption. Considering the characteristics of shield tunneling, we increased the prediction accuracy through measures such as moving prediction windows and incorporating special factors. Comparative analyses with the traditional Informer algorithm, RF algorithm, and LSTM algorithm validated the proposed approach. Furthermore, the improved Informer algorithm was integrated with the Shapley additive explanations (SHAP) method to analyze the effects of key shield tunneling parameters on surface settlement. Finally, the NSGA-III multi-objective optimization algorithm was employed to optimize the shield tunneling parameters.

2. Method

2.1. Informer Algorithm

The Informer is a deep-learning algorithm designed for long-time-series prediction. Its basic framework is illustrated in Figure 1. On the left side of the diagram is the encoder, which receives long-series inputs and uses a multi-head ProbSparse self-attention mechanism instead of the standard self-attention mechanism. On the right side is the decoder, which employs zero-padding for target elements, calculates the weighted attention of the feature map output by the encoder, and immediately predicts the target elements in a generative manner. The Informer algorithm significantly reduces the number of time points requiring processing without compromising the predictive performance. Thus, it is advantageous for handling long-time-series data, such as those encountered in shield tunneling [26,33].

The Informer algorithm introduces standard convolutional and max-pooling layers between every two self-attention blocks (Figure 2), implementing a self-attention distillation technique. This technique prioritizes dominant features. The output of each attention block sequentially passes through a convolutional layer, an activation function, and a max-pooling layer to extract primary attention features while minimizing computational and memory demands. The distillation operation from the jth layer to the (j + 1)th layer is expressed as follows:

X_{j+1}^{t} = \mathrm{Maxpooling}\left(\mathrm{ELU}\left(\mathrm{Conv1d}\left([X_{j}^{t}]_{att}\right)\right)\right), \qquad (1)

where [\cdot]_{att} is the attention block, which includes operations required for multi-head ProbSparse self-attention; Conv1d(⋅) is the convolutional filter along the temporal dimension with an Exponential Linear Unit (ELU) activation function; and Maxpooling(⋅) is the max-pooling layer. Figure 2 shows the main stack receiving the entire input series. The second stack obtains half of the input series, and each subsequent stack obtains half of the series from the previous stack in turn. The red areas in Figure 2 represent dot-product matrices, and they gradually shrink as self-attention distillation is applied at each layer. Finally, the feature maps of all stacks are aggregated as the encoder’s output.
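The distillation step in Equation (1) can be illustrated with a minimal single-channel sketch. It assumes that the input list already stands for the output of an attention block, and it uses a hypothetical smoothing kernel, zero padding, and non-overlapping pooling of size 2; these choices are illustrative and are not taken from the paper's implementation.

```python
import math

def conv1d_same(x, kernel):
    """Sliding dot product over time with zero padding, so the length is kept."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w * padded[t + j] for j, w in enumerate(kernel))
            for t in range(len(x))]

def elu(v, alpha=1.0):
    """Exponential Linear Unit activation."""
    return v if v > 0 else alpha * (math.exp(v) - 1.0)

def max_pool(x, size=2):
    """Non-overlapping max-pooling; halves the length when size is 2."""
    return [max(x[t:t + size]) for t in range(0, len(x) - size + 1, size)]

def distill(x, kernel):
    """One distillation step: Conv1d, then ELU, then max-pooling."""
    return max_pool([elu(v) for v in conv1d_same(x, kernel)])

# Hypothetical attention-block output for 8 timesteps (single channel).
series = [0.5, -1.0, 2.0, 1.5, -0.5, 3.0, 0.0, 1.0]
kernel = [0.25, 0.5, 0.25]  # assumed weights, for illustration only

layer1 = distill(series, kernel)  # 8 timesteps -> 4
layer2 = distill(layer1, kernel)  # 4 timesteps -> 2
```

Each step halves the series length, which is the mechanism by which the dot-product matrices in Figure 2 shrink from layer to layer.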

2.2. Improvement of Algorithm Framework

When applied to predicting surface settlement induced by shield tunneling, the Informer algorithm faces two main limitations: ① Because of the long time series, the perception range of standard convolutional layers is limited, and repetitive, meaningless operations may arise during computation, offsetting the advantages of stacking self-attention blocks with standard convolutional layers. ② Standard convolutional layers do not take time into account, which inevitably leads to future information leakage during prediction and reduces the temporal perception range of the algorithm.

To address these issues, standard convolution was replaced with dilated causal convolution (Figure 3) to achieve a more comprehensive perception range. The dilated causal convolutional network combines the advantages of dilated and causal convolutions, using both dilation rates and causality. The dilation rate determines the spacing between elements in the convolution kernel. By increasing this rate, the kernel can cover longer input series without increasing the number of parameters, expanding the perception range. Causality ensures that predictions depend only on current and past data rather than future data, avoiding future information leakage [33].

For the ith convolutional layer following the ith self-attention block with dilated causal convolution, the dilated causal convolution operation C with kernel size k on an element x_{n} \in \mathbb{R}^{d} in series x \in \mathbb{R}^{L \times d} is defined as follows:

C(x_{n}) = \begin{bmatrix} x_{n} \\ x_{n-i} \\ \vdots \\ x_{n-(k-1) \times i} \end{bmatrix} W^{d \times d'}, \qquad (2)

where d′ represents the output dimension and i represents the dilation rate. Specifically, when i = 1, the dilated causal convolution is reduced to standard causal convolution. At any given timestep, the convolution operation only involves elements from the current or earlier timesteps in the series. Figure 3 illustrates the architecture of the self-attention network, which includes self-attention blocks, dilated causal convolutional layers, and max-pooling layers.
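A scalar sketch may help make the two properties of Equation (2) concrete. It assumes a single input and output channel (d = d′ = 1), treats timesteps before the start of the series as zero, and uses arbitrary example weights, so it illustrates the operation rather than the paper's implementation.

```python
def dilated_causal_conv(x, weights, dilation):
    """y[n] = sum_j weights[j] * x[n - j * dilation], with x[m] = 0 for m < 0."""
    out = []
    for n in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            m = n - j * dilation  # only the current or earlier timesteps
            if m >= 0:
                acc += w * x[m]
        out.append(acc)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
weights = [1.0, 1.0, 1.0]  # kernel size k = 3, assumed for illustration

y_dilated = dilated_causal_conv(x, weights, dilation=2)   # uses x[n], x[n-2], x[n-4]
y_standard = dilated_causal_conv(x, weights, dilation=1)  # standard causal convolution

# Causality check: changing a future value leaves earlier outputs unchanged.
x_changed = list(x)
x_changed[6] = 100.0
y_changed = dilated_causal_conv(x_changed, weights, dilation=2)
```

With a dilation rate of 2, the same three weights reach back four timesteps instead of two, and the outputs before index 6 are identical for `x` and `x_changed`.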

2.3. Improvements Based on Shield Tunneling Characteristics

2.3.1. Soil Layer Classification and Characterization

The Informer algorithm still has room for improvement in reducing computational complexity and memory consumption. During shield tunneling, the soil layer information features are defined by the feature values of samples interpolated between two survey points. In contrast to tunneling parameters, which vary regularly over time, soil layer information features are static variables with unknown time-variation properties and have limited impact on algorithm prediction results. Reducing the dimensionality of soil layer data can minimize the computational complexity and memory usage with negligible adverse effects on algorithm performance, increasing the prediction efficiency.

Studies have indicated that appropriate methods can extract soil layer classes—represented by soil layer grades—from soil layer information features and tunneling parameters [34]. Therefore, the method of soil layer classification and characterization was employed to extract soil layer information from the original dataset during the shield tunneling process.

2.3.2. Moving Prediction Window

Surface settlement induced by shield tunneling has a time lag, as illustrated in Figure 4. The maximum settlement typically lags the shield face by 5–10 rings. To address the impact of this lag on surface settlement, a moving prediction window approach was adopted (see Figure 5). A series-to-series module [35] was introduced for encoding and decoding, and offset parameters were used to appropriately adjust the order of the input time series, addressing temporal lag differences. The use of variable offset parameters allows the algorithm to more accurately capture dependencies within the time series, increasing the prediction accuracy.

Regarding Figure 5:

(1): Known input features (orange blocks in Figure 5): These include shield tunneling parameters (e.g., thrust, cutterhead torque, and advance rate) and geological static parameters (e.g., soil layer classification, density, and compression modulus) monitored before tunneling to the target ring. Their temporal regularity can be captured through historical data.
(2): Unknown input features (blue blocks in Figure 5): These are lagged variables introduced via the moving prediction window, such as settlement values from the past 5–10 rings and displacement trends of shield posture (gray blocks in Figure 5). These features address time-lag effects in settlement prediction.
(3): Supplementary input features (gray blocks in Figure 5): These represent binary indicators (0/1) for abrupt special construction events (e.g., abnormal stoppages and secondary grouting). Their timing and intensity are unpredictable, as detailed in Section 2.3.3.
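The three input groups listed above can be assembled per target ring roughly as follows. This is a minimal sketch under stated assumptions: the field names, the thrust values, and the fixed lag of 5 rings are hypothetical, and the paper's offset-parameter mechanism is not reproduced here.

```python
def build_window_samples(settlement, tunneling, events, lag=5):
    """For each target ring t >= lag, collect the tunneling parameters at t,
    the previous `lag` settlement values, and the event flag at t."""
    samples = []
    for t in range(lag, len(settlement)):
        samples.append({
            "ring": t,
            "known": tunneling[t],                    # tunneling/geological inputs
            "lagged_settlement": settlement[t - lag:t],  # moving-window inputs
            "event_flag": events[t],                  # 0/1 special-event marker
            "target": settlement[t],
        })
    return samples

# Hypothetical data for 8 rings (settlement in mm, thrust in kN).
settlement = [-1.0, -1.5, -2.0, -2.2, -2.8, -3.1, -3.5, -3.3]
tunneling = [{"thrust_kN": 9000 + 100 * r} for r in range(8)]
events = [0, 0, 0, 0, 0, 1, 0, 0]  # e.g., an abnormal stoppage at ring 5

samples = build_window_samples(settlement, tunneling, events, lag=5)
```

The window slides forward by one ring per sample, so the lagged settlement values always precede the target ring.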

2.3.3. Special Factor Handling

During shield tunneling, special construction factors such as abnormal stoppages and secondary grouting significantly affect surface settlement [35]. To further increase the prediction accuracy, these two factors were encoded as special feature inputs into the dataset. This encoding effectively marks anomalous points in the shield tunneling process, integrates these special influences into the dataset comprehensively, and allows more complete consideration of their effects on surface settlement.

Unlike prior Informer adaptations for tunneling [23,27], which relied on standard convolutions and static soil inputs, the improved algorithm integrates dilated causal convolutions with tunable dilation rates (i = 1, 2, 4) and a kernel size of k = 5. This design exponentially expands the temporal receptive field while enforcing causality, addressing settlement lag effects more effectively. Additionally, the self-attention distillation combines ELU-activated convolutions and max-pooling (Equation (1)) to limit redundant computation; the prediction time of the improved algorithm was reduced by up to 30.64% (Section 3.5.3). Furthermore, dynamic input features (e.g., moving windows and event markers) enable real-time adaptation to abrupt construction disturbances, a capability absent in earlier implementations.
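The effect of the dilation rates on the receptive field can be checked with a short calculation. The sketch assumes one causal convolutional layer per dilation rate, stacked in the order 1, 2, 4, and ignores the max-pooling between layers; both are simplifying assumptions rather than details confirmed by the text.

```python
def receptive_field(kernel_size, dilations):
    """Timesteps (current plus past) seen by a stack of causal conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

dilated = receptive_field(5, [1, 2, 4])   # 1 + 4 * (1 + 2 + 4) = 29
standard = receptive_field(5, [1, 1, 1])  # 1 + 4 * 3 = 13
```

Under these assumptions, three dilated layers cover 29 timesteps, compared with 13 for three standard causal layers with the same kernel size and the same number of weights.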

3. Case Study

3.1. Project Background

The shield tunneling section of a metro project in Jinan passes through silty clay, silty soil, fine sand, and medium sand layers. The shield tunneling adopted a soil pressure balance shield machine with a cutterhead diameter of 6600 mm, an outer shield diameter of 6400 mm, and a shield body length of 6000 mm. The segment width is 1200 mm, and the thickness is 300 mm. The tunnel is buried at a depth of 10–14 m. The geological conditions are presented in Figure 6 and Table 1.

Based on the geological characteristics of the tunnel section, the soil types ahead of the excavation face were classified into three categories: silty clay, silty soil, and composite strata. Geological survey data were used to label the stratum information of 60 ring samples into these three categories. Leveraging the advantages of Support Vector Machine (SVM) algorithms in classification tasks, an SVM-based stratum characterization model was trained to label stratum information for the remaining unmarked samples. The performance of the SVM stratum characterization model was evaluated through cross-validation on the training set and labeling of the test set, achieving a labeling accuracy exceeding 90%. This accuracy level ensures the enhancement of the ground settlement prediction model through stratum classification characterization. The results of stratum classification characterization indicate that approximately 87% of the samples belong to silty clay, about 10% to composite strata, and roughly 3% to silty soil.

3.2. Dataset Construction

For a specific tunnel, the surface settlement induced during shield tunneling is determined by shield tunneling parameters and the geological conditions of the strata traversed. Accordingly, we adopted the two aforementioned factors that determine surface settlement as features (Figure 7) and took the maximum surface settlement observed during shield tunneling as the target prediction value to construct the dataset.

The features were selected through the following criteria: (i) Engineering Controllability: While traditional factors like the tail-lining gap and grout pressure are critical to volume loss, their real-time adjustment is often constrained by geological variability and equipment limitations (e.g., grout hardening time and gap measurement delays). In contrast, parameters such as thrust, torque, and advance rate are directly and dynamically controllable during tunneling, making them pragmatic targets for optimization. (ii) Literature-Driven Relevance: Features empirically linked to surface settlement (e.g., thrust and advance rate [4,15,17]) were prioritized.

A total of 19 shield tunneling parameters were considered as features, including the grouting pressure, K-block location, horizontal displacement of the shield head and tail, vertical displacement of the shield head and tail, shield tail gap (top, bottom, left, and right), screw conveyor rotation speed, screw conveyor torque, muck pressure, tunneling speed, volume of excavated earth, cutterhead rotation speed, cutterhead torque, grouting volume, and thrust. The geological conditions included 14 features, including the water table, tunnel burial depth, and stratigraphic parameters (e.g., soil layer thickness, cohesion, internal friction angle, and compression modulus). The definitions of the aforementioned parameters are provided in Appendix A. Thus, a total of 33 features and 875 ring samples were selected as the original dataset for this study, as shown in Table 2.

The original dataset was divided into training (80%) and testing (20%) sets to facilitate algorithm training and final performance evaluation. Considering the periodic trends in time-series data, the dataset was segmented into 14 tunneling sections according to the shield start and stop times and further divided into training and testing sets. During cross-validation for algorithm training, a rolling time-series splitting method (i.e., walk-forward validation) was employed to generate the validation set, ensuring the temporal integrity of the time-series data. Table 3 lists all 14 segments with their ring number intervals, segment lengths, and assigned roles (training/testing).
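Walk-forward validation can be sketched as an expanding-window split in which every validation block follows its training data in time. The numbers below are assumptions for illustration: 700 training rings (80% of 875), 5 folds, and a hypothetical minimum training size of 200 rings; the study's actual splits follow the tunneling segments in Table 3.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Expanding-window splits: fold k trains on every sample that precedes
    its validation block, so no future ring is seen during training."""
    fold_size = (n_samples - min_train) // n_folds
    splits = []
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        val_end = train_end + fold_size
        train_idx = list(range(train_end))
        val_idx = list(range(train_end, val_end))
        splits.append((train_idx, val_idx))
    return splits

splits = walk_forward_splits(n_samples=700, n_folds=5, min_train=200)
first_train, first_val = splits[0]   # rings 0-199 train, 200-299 validate
last_train, last_val = splits[-1]    # rings 0-599 train, 600-699 validate
```

Unlike shuffled k-fold splitting, this scheme never validates on rings that precede the training data.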

3.3. Data Preprocessing

(1): Data Cleaning

To address missing values and anomalies in the recorded ledger data, data cleaning was performed to remove or correct erroneous, duplicate, or inconsistent entries in the dataset.

① Samples with numerous missing input parameters were removed.

② For samples with a few missing parameters, mean values from complete samples were used to fill the gaps.

③ For anomalous data points, mean values from complete samples were also used as replacements.

Anomalies were identified using the boxplot method based on interquartile ranges (IQRs), as shown in Figure 8. Anomalous values were defined as those less than Q1 − k × IQR or greater than Q3 + k × IQR, where Q1 and Q3 represent the first and third quartiles, respectively, and k was set as 1.5.
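The boxplot rule and the mean replacement described above can be sketched with the Python standard library. Two assumptions are made: quartiles are computed with the "inclusive" method of `statistics.quantiles` (the paper does not state its quartile convention), and the replacement mean is taken over the non-anomalous values of the same feature.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (Q1 - k * IQR, Q3 + k * IQR) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def replace_outliers_with_mean(values, k=1.5):
    """Replace values outside the IQR bounds with the mean of the inliers."""
    low, high = iqr_bounds(values, k)
    inliers = [v for v in values if low <= v <= high]
    fill = statistics.mean(inliers)
    return [v if low <= v <= high else fill for v in values]

# Hypothetical readings of one feature; 100 is an obvious anomaly.
readings = [10, 11, 12, 13, 14, 15, 16, 17, 100]
bounds = iqr_bounds(readings)
cleaned = replace_outliers_with_mean(readings)
```

For these readings Q1 = 12 and Q3 = 16, so the bounds are 6 and 22, and the anomalous value is replaced by the inlier mean.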

(2): Data Normalization

The min–max normalization method was applied to scale all features to the [0, 1] range, ensuring that differences in feature scales did not impact algorithm performance.

The normalization formula is

x_{norm} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \qquad (3)

where x_{norm} represents the normalized value, x represents the original value, x_{\min} represents the minimum value in the dataset, and x_{\max} represents the maximum value in the dataset.
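Equation (3) applied to a single feature column can be written as the following minimal sketch. The thrust values are hypothetical, and the handling of a constant column (returning zeros) is an assumption added to avoid division by zero.

```python
def min_max_normalize(column):
    """Scale a list of numbers to the [0, 1] range using Equation (3)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: assumed convention, map every value to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

thrust_kn = [8000.0, 10000.0, 12000.0]  # hypothetical thrust values
thrust_norm = min_max_normalize(thrust_kn)
```

Here the smallest value maps to 0, the largest to 1, and the midpoint to 0.5.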

3.4. Hyperparameter Settings

Table 4 presents the optimal hyperparameter settings for the improved Informer prediction algorithm. The hyperparameter search used Bayesian optimization implemented with the Optuna library (v3.3.0), with the optimization objective set to minimize the test-set MSE. The configuration was as follows: ① evaluation budget: 200 trials, each incorporating 5-fold rolling time-series cross-validation (as detailed in Section 3.2); ② stopping criterion: early termination triggered if no MSE improvement (ΔMSE < 0.1%) was observed for 20 consecutive trials; ③ search space: parameter ranges specified in Table 4.
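The stopping criterion can be illustrated independently of any optimization library. The sketch below does not use Optuna's API; it only shows the bookkeeping, and it assumes that "ΔMSE < 0.1%" means a relative improvement over the best MSE found so far. The trial scores are made up for illustration.

```python
def trials_run_before_stop(trial_mse, patience=20, min_rel_gain=0.001, budget=200):
    """Return (number of trials run, best MSE) under the early-stopping rule:
    stop after `patience` consecutive trials that fail to improve the best
    MSE by more than `min_rel_gain` (relative), or when the budget is spent."""
    best = float("inf")
    stale = 0
    for n, mse in enumerate(trial_mse[:budget], start=1):
        if mse < best * (1.0 - min_rel_gain):
            best = mse
            stale = 0
        else:
            stale += 1
        if stale >= patience:
            return n, best
    return min(len(trial_mse), budget), best

# Hypothetical search: three improving trials, then a long plateau.
scores = [5.0, 4.5, 4.4] + [4.4] * 30
n_trials, best_mse = trials_run_before_stop(scores)
```

In this example the search stops at trial 23, that is, 20 trials after the last improvement at trial 3.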

3.5. Comparison of Effects Before and After Improvement

To evaluate the performance of the prediction algorithm, metrics such as the mean square error (MSE), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) were utilized. Further comparisons were conducted to analyze the prediction accuracy and computational efficiency of the Informer algorithm before and after the improvement across different timescales.
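For reference, the four metrics can be computed as in the following sketch, which uses their standard definitions; the settlement values are hypothetical and chosen so that the results are easy to verify by hand.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, and R² for two equally long lists."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = mse * n                                     # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "R2": r2}

monitored = [1.0, 2.0, 3.0, 4.0]   # hypothetical monitored settlement
predicted = [1.0, 2.0, 3.0, 6.0]   # hypothetical predictions
metrics = regression_metrics(monitored, predicted)
```

For this toy case a single error of 2 gives MSE = 1.0, MAE = 0.5, RMSE = 1.0, and R² = 0.2.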

3.5.1. Prediction Accuracy

The performance of the Informer algorithm under different improvement measures is presented in Figure 9. When standard convolution was replaced with dilated causal convolution, the algorithm showed consistent improvement across all metrics: MSE decreased by 1.3% (5.22→5.15), MAE decreased substantially by 21.2% (5.91→4.66), R² increased from 0.79 to 0.80, and RMSE decreased from 2.28 to 2.27. These gains demonstrate the effectiveness of dilated convolutions in capturing long-range temporal dependencies while enforcing causality.

Subsequent improvements revealed nuanced but meaningful trends: ① Stratum classification improved overall model efficiency (R²↑0.02 to 0.82, MSE↓0.4% to 5.13, and RMSE↓0.4% to 2.26) but increased MAE by 23.6% (4.66→5.76). This indicates that while dimensionality reduction enhanced global pattern recognition (reflected in R²/RMSE), it introduced minor localized errors across multiple points without causing significant anomalies. ② The moving prediction window further enhanced accuracy: MSE decreased by 2.7% (5.13→4.99), MAE decreased by 19.1% (5.76→4.66), R² increased to 0.83, and RMSE decreased by 1.3% to 2.23. The alignment of all metrics confirms its effectiveness in addressing settlement lag. ③ Special factor encoding delivered the most substantial incremental gain: MSE decreased by 13.6% (4.99→4.31), MAE decreased by 14.6% (4.66→3.98), R² increased to 0.84, and RMSE decreased by 6.7% (2.23→2.08). This highlights the critical role of accounting for abrupt construction events. Cumulatively, the fully improved model achieved the following: 20.9% reduction in MSE (5.22→4.13), 38.2% reduction in MAE (5.91→3.65), 11.0% reduction in RMSE (2.28→2.03), and R² improvement from 0.79 to 0.86.

The progressive enhancement across all four metrics—particularly the >10% reduction in RMSE—confirms significant prediction performance gains. While individual step improvements appear modest in some metrics (e.g., MSE changes of 0.02–0.68), their combined effect yields statistically and practically meaningful advancement for engineering applications.

Figure 10 presents a comparison between the prediction data and the monitoring data. Compared with the original Informer algorithm, the improved Informer algorithm had a prediction trend and peak–valley settlement values that aligned more closely with the monitoring data. Therefore, the improved Informer algorithm boasts higher prediction accuracy than the original one. At ring 250, the predicted settlement values of the two curves diverged significantly. This is because special influencing factors were labeled as features. As a result, the improved Informer algorithm learned that an unexpected stoppage occurred at this location and made more accurate predictions. The improved Informer algorithm has a significantly enhanced capability for predicting surface settlement.

3.5.2. Prediction Accuracy Across Different Timescales

The prediction results over 5-, 10-, and 30-ring intervals were examined to compare the settlement prediction capability of the algorithm before and after the improvement across different timescales, as shown in Figure 11. As the timescale increased, the prediction errors for both the original and improved Informer algorithms increased. For the 5-ring prediction, the advantage of the improved Informer algorithm was not obvious. However, for the 10- and 30-ring predictions, the R² value for the original Informer algorithm decreased by 7.32% and 5.99%, respectively, whereas the improved Informer algorithm exhibited reductions of only 5.99% and 3.32%. Thus, the accuracy gap between the two algorithms widened as the prediction timescale increased. By replacing standard convolution with dilated causal convolution, the perception range of the improved Informer algorithm was optimized, and its prediction accuracy was significantly increased.

3.5.3. Computational Efficiency

Considering the time-sensitive nature of shield tunneling operations, it is important to minimize the time required for surface settlement prediction. The computational time required for different numbers of iterations was evaluated, and the results are presented in Figure 12. The evaluation was performed on a Windows 11 system with an Intel Core i7-12700H processor.

At 50 iterations, the original Informer algorithm required 891 s, while the improved Informer algorithm reduced this time to 618 s—a reduction of 273 s (30.64%). At 100 iterations, the time decreased from 1740 to 1309 s—a reduction of 431 s (24.77%). For 200 iterations, the time decreased from 3011 to 2570 s—a reduction of 441 s (14.65%). These results indicate that the improved Informer algorithm, which adopts soil layer classification and characterization, achieved a higher computational efficiency and reduced prediction time.
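The reported reductions follow directly from the timings quoted above, as the short check below shows; only the dictionary layout is an assumption.

```python
# iterations -> (original Informer time in s, improved Informer time in s)
timings = {50: (891, 618), 100: (1740, 1309), 200: (3011, 2570)}

reductions = {}
for iterations, (original, improved) in timings.items():
    saved = original - improved
    percent = round(100.0 * saved / original, 2)
    reductions[iterations] = (saved, percent)
```

The relative saving shrinks as the number of iterations grows, from 30.64% at 50 iterations to 14.65% at 200 iterations.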

Table 5 shows the ablation study. The MAE increase (4.66→5.76) after dimensionality reduction stems from a trade-off between error distribution and overall accuracy: ① Soil layer encoding increased small errors (MAE↑) due to simplified stratigraphic inputs but improved pattern generalizability (R² from 0.80 to 0.82; see Figure 9). ② Crucially, this encoding reduced memory usage by 27% without compromising critical settlement trends.

3.6. Comparison with Other Algorithms

To further verify the prediction accuracy of the improved Informer algorithm, comparisons were made with the RF algorithm and the LSTM network algorithm.

3.6.1. Algorithm Descriptions

(1): RF Algorithm

The RF algorithm is an ensemble learning method mainly used for classification, regression, and other tasks. It makes predictions by constructing multiple decision trees. Its core concept is the aggregation of predictions from multiple decision trees to improve the overall algorithm accuracy and stability [36]. The optimal hyperparameter settings for the RF algorithm are presented in Table 6.

(2): LSTM Algorithm

The LSTM network is a special type of recurrent neural network. By introducing a complex gating mechanism, it effectively retains information over long time intervals, making it well-suited for time-series prediction and related tasks [37]. For this study, the LSTM model was implemented with the following architecture and hyperparameters:

Network Structure: Two stacked LSTM layers with 64 hidden units each, followed by a fully connected output layer.

Training Configuration: Adam optimizer with a learning rate of 0.001, batch size of 32, and dropout rate of 0.2 to mitigate overfitting.

Training Process: The model was trained for 200 epochs with early stopping (patience = 10) based on validation loss.

Input/Output: A sliding window of 10 historical rings (aligned with the Informer’s input length) was used to predict the next settlement value.

These hyperparameters were optimized via a grid search on the training set to ensure a fair comparison with the improved Informer algorithm. The optimal hyperparameter settings for the LSTM algorithm are presented in Table 7.

3.6.2. Comparison of Prediction Results

The evaluation indicators for the prediction results of the improved Informer algorithm, RF, and LSTM are shown in Figure 13. The improved Informer algorithm achieved MSE, MAE, RMSE, and R² values of 4.13, 3.65, 2.03, and 0.86, respectively, significantly outperforming the RF and LSTM algorithms. RF exhibited the worst performance among the algorithms tested in this study. This is because its decision tree-based structure is not well-suited for predicting long-time-series data, such as shield tunneling data. The LSTM algorithm, as a learning model designed for long-time-series data, achieved far higher prediction accuracy than the RF algorithm. However, it lagged behind the improved Informer algorithm.
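For reference, the four evaluation indicators can be computed as in the following minimal sketch, which uses their standard definitions. The function name is illustrative, and the sketch assumes that predictions and observations are expressed in the same units.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, and R2 for paired observations and predictions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "R2": r2}
```

RMSE is the square root of MSE by definition, which matches the reported pair (√4.13 ≈ 2.03).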

Figure 14 compares the settlement predicted by the improved Informer algorithm with that predicted by the other machine-learning algorithms. For rings 340 to 370, the RF algorithm produced large errors, highlighting its limitations in predicting large variations in settlement. The predictions of the LSTM algorithm generally agreed with the observed settlement trends but deviated considerably from the monitoring data at rings 215, 240, 280, 335, and 360. The predictions of the improved Informer algorithm were closer to the actual settlement values than those of the RF and LSTM algorithms, both in overall trend and in response to significant variations. The improved Informer algorithm is therefore better suited than the RF and LSTM algorithms for predicting surface settlement in shield tunneling operations.

4. Discussion

4.1. Analysis of the Influence of Shield Tunneling Features Based on SHAP Theory

The trained improved Informer model was supplied to the SHAP algorithm to investigate the effects of various features on surface settlement induced by shield tunneling. Figure 15 presents the SHAP values of the 20 features with the highest contributions. The vertical axis lists the features from top to bottom in descending order of average SHAP value, i.e., of importance, and the horizontal axis represents the SHAP values. Positive SHAP values indicate that the feature tends to increase settlement, while negative values indicate a tendency to reduce settlement. The color gradient on the right represents the magnitude of the feature values. Shield thrust, at the top of the ranking, was identified as the most important feature influencing surface settlement. In addition, features such as the horizontal displacement of the shield head (HDSH) and cutterhead torque significantly impacted the prediction of surface settlement.
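The quantity behind these plots is the Shapley value: the contribution of a feature averaged over all orders in which features can be switched from a baseline to their actual values. The sketch below computes it exactly by enumeration for a toy three-feature function. The toy model, its coefficients, and the baseline are purely hypothetical stand-ins for the trained Informer; practical SHAP implementations rely on approximations because enumeration scales factorially with the number of features.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)          # start from the baseline input
        prev_out = model(current)
        for j in order:
            current[j] = x[j]             # switch feature j to its actual value
            out = model(current)
            phi[j] += out - prev_out      # marginal contribution of feature j
            prev_out = out
    return [p / len(orderings) for p in phi]


def toy_settlement(z):
    """Hypothetical settlement response to normalized (thrust, torque, advance rate)."""
    thrust, torque, rate = z
    return -2.0 * thrust - 1.0 * torque + 0.5 * rate + 1.0 * thrust * rate
```

For the input (1, 1, 1) and a zero baseline, the thrust and torque values are negative (−1.5 and −1.0) and the advance rate value is positive (1.0). The three values sum to the difference between the model output at the input and at the baseline, which is the additivity property that SHAP relies on.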

Because shield thrust is the most important feature affecting settlement, a feature-dependence plot was drawn for shield thrust (Figure 16), where the color gradient represents the distribution of soil layer feature values: blue corresponds to the silty soil layer, red corresponds to the silty clay layer, and purple corresponds to the composite soil layer. The type of soil layer largely determines the required thrust. For example, the silty clay layer, being denser, requires significantly more thrust than the silty soil layer during shield tunneling.

In Region I, which entirely consists of data points from the silty soil layer, the SHAP values were always negative, indicating that a normalized thrust of approximately 0.2 is sufficient to maintain normal tunneling operations within this soil layer. As thrust increased, the SHAP values decreased gradually, further reducing soil layer settlement. In Region II, the normalized thrust values, which are represented by the data points, were all approximately 0.63. As the soil layer transitioned from silty clay to the composite soil layer, the required thrust decreased. As a result, the SHAP values decreased under the same thrust, which amplified the suppression of settlement. Region III contains data points from the silty clay layer and the composite soil layer. The SHAP values and thrust had an overall inverse relationship. As the thrust increased, its impact on surface settlement changed from promotion to suppression. When the normalized thrust ranged between 0.8 and 0.9, the disturbance to the soil layer was minimized, suggesting that the optimal thrust for these soil conditions is 0.8–0.9 times the maximum thrust.
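To relate the normalized range of 0.8–0.9 to physical units, the following sketch converts normalized thrust to kN using the minimum and maximum thrust from Table 2. The statement "0.8–0.9 times the maximum thrust" suggests normalization by the maximum; the min-max variant is included only as an alternative assumption, because the normalization scheme is not restated in this section. The function names are illustrative.

```python
THRUST_MIN_KN = 4800.0    # Table 2, minimum thrust
THRUST_MAX_KN = 13400.0   # Table 2, maximum thrust


def from_fraction_of_max(norm):
    """Interpretation used in the text: normalized value = thrust / maximum thrust."""
    return norm * THRUST_MAX_KN


def from_min_max(norm):
    """Alternative assumption: min-max scaling between the Table 2 extremes."""
    return THRUST_MIN_KN + norm * (THRUST_MAX_KN - THRUST_MIN_KN)
```

Under the first interpretation, the range corresponds to roughly 10,720–12,060 kN; under min-max scaling, it would correspond to roughly 11,680–12,540 kN.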

During the shield tunneling process, stable thrust can be maintained by appropriately controlling the cutterhead torque, advance rate, grouting pressure, and muck pressure and thereby reducing soil layer disturbance [38]. These four features are closely related to variations in thrust. To visualize these relationships, dependence plots were generated to illustrate the interactions between thrust and the cutterhead torque, advance rate, grouting pressure, and muck pressure, as shown in Figure 17.

Figure 17a depicts the relationships between the cutterhead torque, thrust, and surface settlement. At low torque levels, the earth pressure balance was disrupted, increasing over-excavation risks [38]. However, high torque enhances soil mixing, reducing Type II–III settlement (Figure 4). Under the same thrust, the cutterhead torque and SHAP values showed a negative correlation.

Figure 17b illustrates the relationship between the advance rate, thrust, and surface settlement. Low advance rates negatively impacted surface settlement, while high advance rates had a positive effect. A strong negative correlation was observed between the advance rate and thrust.

Figure 17c,d show the relationships of the grouting pressure and muck pressure with the thrust and surface settlement, respectively. The grouting pressure and muck pressure had similar effects on the surface settlement. Muck pressure outside the optimal range (0.6–0.8) exacerbates ground loss: low pressure induces soil inflow, while high pressure fractures strata, escalating grout leakage and Type I–IV settlement [1,6]. This suggests that increasing grouting and muck pressures during construction can significantly reduce soil layer disturbance. The correlation between thrust and these pressures was relatively weak, which agrees with actual construction conditions.

Additionally, K-block misalignment (e.g., Positions 1–5) impedes segment ring closure, widening tail voids and aggravating Type IV settlement through uneven grout distribution [35].

Notably, parameters such as cutterhead torque and advance rate influence surface settlement by modulating conventional volume loss drivers. For example, excessive torque increases soil fracturing, which elevates grout leakage risks and indirectly enlarges the tail-lining gap. Similarly, low advance rates prolong stress redistribution in the excavation chamber, exacerbating over-excavation and grout pressure mismatch (Figure 17). These findings highlight the complementary role of operational parameters in mitigating volume loss.

4.2. Shield Tunneling Parameter Optimization Design Based on the Multi-Objective Optimization Algorithm

The geometric parameters and geological conditions were fixed in the project discussed in this study. Therefore, optimizing tunneling parameters is essential to minimize subsequent soil layer disturbances and ensure safe shield tunneling operations. From the preceding analysis, the shield thrust, cutterhead torque, and tunneling speed were identified as the most influential and directly adjustable shield tunneling parameters affecting surface settlement. This selection was grounded in three engineering considerations: ① Operational Adjustability: These parameters are directly controllable in real time during tunneling (e.g., via shield hydraulic systems and cutterhead drives), unlike static factors such as geological properties. ② SHAP-Driven Criticality: As shown in Figure 15, shield thrust, cutterhead torque, and tunneling speed ranked highest in SHAP importance, collectively explaining >60% of settlement variance. ③ Practical Constraints: Thrust and torque are bounded by shield machine capacity (max thrust: 13,400 kN, max torque: 3531 kN·m, as per Table 2), while tunneling speed is limited by soil stability, since excessive rates risk face collapse in soft soils (e.g., silty clay). Balancing these constraints ensures feasibility. Accordingly, the improved Informer algorithm was employed as the objective function, and the above three parameters were selected as optimization variables.

Surface settlement was used as the control objective to construct a multi-objective optimization algorithm for shield tunneling parameters. The problem was solved using the NSGA-III algorithm to obtain the optimized parameter set.

The lower bound of the maximum surface settlement constraint was set as −10 mm, and the range for the horizontal and vertical displacement objective functions for shield tunneling posture was set as 20 mm. The distribution range of the three tunneling parameter variables was optimized according to the 25th and 75th percentiles of the original dataset. Random sampling was employed for initial individual sampling. Specifically, the crossover and mutation probabilities were set as 0.85 and 0.45, respectively, and the maximum number of iterations was limited to 200.
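The variable ranges and the random initialization can be sketched as follows. The sketch assumes that the 25th and 75th percentiles in Table 2 serve directly as the lower and upper search bounds of each variable, and that individuals are drawn uniformly within them; the population size is not reported and is a free argument here. All names are illustrative.

```python
import random

# 25th and 75th percentiles copied from Table 2 (assumed to be the search bounds).
BOUNDS = {
    "thrust_kN": (11100.0, 11900.0),
    "cutterhead_torque_kNm": (2150.0, 2500.0),
    "advance_rate_mm_min": (38.0, 48.0),
}

# Optimized combinations copied from Table 8 (thrust, torque, advance rate).
TABLE8_OPTIMIZED = [
    (11714.0, 2449.8298, 38.8373),
    (11402.0, 2447.8458, 40.0938),
    (11600.0, 2352.7812, 40.3691),
    (11453.0, 2251.7883, 44.8257),
    (11732.0, 2448.3156, 45.0416),
]


def sample_population(size, bounds=BOUNDS, seed=0):
    """Draw `size` individuals uniformly at random within the bounds."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(size)
    ]


def within_bounds(individual, bounds=BOUNDS):
    """Check that every variable of an individual lies inside its bounds."""
    return all(lo <= individual[name] <= hi for name, (lo, hi) in bounds.items())


def as_individual(row, bounds=BOUNDS):
    """Map a (thrust, torque, advance rate) tuple onto the variable names."""
    return dict(zip(bounds.keys(), row))
```

All five optimized combinations listed in Table 8 fall inside these interquartile bounds, which is consistent with this reading of the variable ranges.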

The computing process was evaluated using the convergence index, HV; the uniformity index, SP; and the diversity index, ΔLine. As the number of iterations increased, HV initially rose and then stabilized, ultimately reaching 0.8940, indicating that the obtained solution set was of high quality. The SP value was 0.5761, suggesting that the solutions were relatively uniformly distributed in the objective space. There was no significant clustering or bias toward any single objective. The ΔLine value of 0.2258 further confirmed the broad and diverse distribution of the solution set in the objective space, i.e., the solution set covers a wide range of potential solutions.
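HV is commonly computed as the portion of objective space that is dominated by the solution set and bounded by a reference point, so a larger value indicates a solution set that is both closer to the ideal and better spread. The sketch below implements this for the two-objective minimization case only. The reference point and the example points are hypothetical, and the sketch does not reproduce the value of 0.8940 reported above, which refers to the objectives of this study.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (both objectives minimized)."""
    rx, ry = reference
    # Keep only points that improve on the reference in both objectives,
    # sorted by the first objective (ties broken by the second).
    pts = sorted((x, y) for x, y in points if x < rx and y < ry)
    area = 0.0
    best_y = ry
    for x, y in pts:                 # sweep from left to right
        if y < best_y:               # point adds a new slab below the current front
            area += (rx - x) * (best_y - y)
            best_y = y
    return area
```

For the points (1, 3), (2, 2), and (3, 1) with reference point (4, 4), the dominated area is 6; adding the dominated point (3, 3) leaves the value unchanged.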

The optimized parameter combinations are summarized in Table 8. Because there is no absolute superiority among the optimized parameters, the most suitable combination can be selected from the solution set according to specific project requirements to control shield tunneling effectively.
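"No absolute superiority" means that the solutions are mutually non-dominated: improving one objective requires worsening another. A minimal sketch of the dominance test and the resulting filter is given below, assuming that all objectives are minimized; the example objective tuples are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(solutions):
    """Return the solutions that no other solution dominates (the Pareto set)."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]
```

For the objective tuples (1, 3), (2, 2), (3, 1), and (3, 3), the first three are retained and (3, 3) is discarded because (2, 2) dominates it. Choosing among the retained solutions is a project-specific decision, as noted above.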

Compared with the original tunneling parameters, the multi-objective optimized parameter combinations effectively controlled surface settlement. Comprehensive analysis of the optimized solution set revealed that, on average, the optimized tunneling parameters reduced surface settlement by 16.79%.

4.3. Model Limitations and Mitigation Pathways

Despite its overall accuracy, the improved Informer algorithm still produces several large residuals (Figure 10). The two cases below are examined in detail.

① Ring 250:

Root Cause: An unplanned 12 h stoppage (due to cutterhead jam) occurred. The binary “special factor” flag (Section 2.3.3) captured the event but failed to quantify stoppage duration–intensity impacts, leading to underprediction.

Limitation: The algorithm’s static event encoding cannot adapt to dynamic disruptions.

Mitigation: Integrate duration-weighted event features (e.g., “stoppage hours × depth”) to enhance temporal sensitivity.

② Ring 315:

Root Cause: Sudden transition from silty clay to a pebble layer (Figure 6, Stratum ⑥). The model’s soil classification (Section 2.3.1) provided categorical inputs but missed real-time stratigraphic heterogeneity, delaying response to altered ground reactions.

Limitation: Geological characterization relies on interpolation between sparse boreholes.

Mitigation: Couple with online drilling parameter monitoring (e.g., specific energy) to detect layer changes dynamically.
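The difference between the binary flag and the duration-weighted feature proposed for Ring 250 can be illustrated with a short sketch. The product form follows the example "stoppage hours × depth" given above; the depth value used in the test is hypothetical, and the function names are not part of the original model.

```python
def binary_flag(stoppage_hours):
    """Current encoding: 1 if any stoppage occurred at the ring, else 0."""
    return 1.0 if stoppage_hours > 0 else 0.0


def duration_weighted(stoppage_hours, burial_depth_m):
    """Proposed encoding: stoppage duration multiplied by tunnel burial depth."""
    return stoppage_hours * burial_depth_m
```

A 1 h and a 12 h stoppage produce the same binary flag, whereas the duration-weighted feature for the 12 h stoppage is twelve times larger at the same depth, so the model can distinguish the two events.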

On the other hand, despite the high accuracy of the improved Informer, prediction fidelity may be influenced by unmonitored parameters. For instance: i. chamber pressure differentials (not recorded in this project) could refine soil discharge dynamics, potentially enhancing muck pressure modeling; ii. grouting viscosity (excluded due to measurement constraints) might improve the grouting pressure’s representation of soil-filling efficiency. Future work should integrate these parameters via multi-sensor fusion to further minimize prediction uncertainty.

5. Conclusions

To overcome the limitations of the Informer algorithm in predicting surface settlement induced by shield tunneling, we proposed an improved algorithm and verified its accuracy. The improved Informer algorithm was then input to the SHAP algorithm to analyze the effects of key tunneling parameters on surface settlement. Finally, a multi-objective optimization algorithm based on the NSGA-III method was developed to optimize the shield tunneling parameters. Our key findings are summarized as follows:

(1): To address the limitations of the Informer algorithm in predicting surface settlement, a dilated causal convolutional network was employed to replace the standard convolutional network, and three improvements were made: soil layer classification and characterization, a moving prediction window, and incorporation of special factors. These measures yielded the improved Informer algorithm.
(2): Compared with the original Informer algorithm, the improved Informer algorithm had obvious advantages: ① The prediction accuracy improved significantly. Through the improvements, the R² value increased from 0.79 to 0.86, the MSE decreased from 5.22 to 4.13, the MAE decreased from 5.91 to 3.65, and the RMSE decreased from 2.28 to 2.03. ② With increasing prediction timescales, the rate of R² reduction for the improved Informer algorithm was consistently lower than that of the original algorithm. Both the perception range and accuracy of the improved algorithm were optimized. ③ The improved Informer algorithm had a shorter prediction time than the original algorithm. The prediction time was reduced by up to 30.64%. Compared with the RF and LSTM algorithms, the improved Informer algorithm also exhibited superior performance. It is more suitable than other algorithms for predicting surface settlement in metro shield tunneling projects.
(3): According to the SHAP theory, we analyzed the effects of shield tunneling parameters on surface settlement. The results indicated that shield thrust was the most critical feature for predicting surface settlement. Its impact on surface settlement needed to be judged in conjunction with soil layer conditions. The cutterhead torque had a negative correlation with its SHAP values. Low tunneling speeds exerted a negative impact on settlement, while high speeds had a positive impact. Additionally, the tunneling speed and thrust exhibited a strong negative correlation. Grouting pressure and muck pressure positively influenced surface settlement at low levels but negatively influenced it at high levels. During the construction process, appropriate increases in thrust, cutterhead torque, grouting pressure, and muck pressure, combined with reduced tunneling speed, were found to effectively mitigate surface settlement.
(4): Shield thrust, cutterhead torque, and tunneling speed were selected as optimization parameters. On this basis, a multi-objective optimization algorithm was developed and solved using the NSGA-III algorithm, with the aim of optimizing the combinations of shield tunneling parameters. Compared with the original tunneling parameters, the optimized combinations reduced surface settlement by 16.79% on average. The optimized parameter combinations effectively controlled surface settlement through multi-objective optimization.
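As a reminder of what the dilated causal convolution in finding (1) computes, a minimal one-dimensional sketch is given below. It uses the general definition of the operation (each output depends only on the current and earlier positions, sampled at intervals equal to the dilation) with zero padding on the left; it is not the layer configuration of the improved Informer, and the kernel values in the test are arbitrary.

```python
def dilated_causal_conv1d(x, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], with zeros before the series start."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:              # causal: positions after t are never read
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input positions visible to one output after stacking the given layers."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

Stacking three layers with kernel size 3 and dilations 1, 2, and 4 gives a receptive field of 15 positions, which is how dilation widens the perception range without adding parameters per layer.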

Author Contributions

Conceptualization, X.F.; methodology, S.Z. and X.F.; software, K.P.; validation, S.Z.; formal analysis, S.Z. and X.F.; investigation, X.F. and K.P.; resources, X.F.; data curation, X.F.; writing—original draft preparation, S.Z. and K.P.; writing—review and editing, S.Z. and K.P.; supervision, X.F.; project administration, X.F.; funding acquisition, S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the General Program of China Postdoctoral Science Foundation, grant number 2022M723402, and the National Natural Science Foundation of China Youth, grant number 52308422.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

The appendix provides complete definitions, units, operational roles, and the engineering significance for all 19 shield tunneling parameters and 14 geological parameters as follows:

(1): Shield Tunneling Parameters (19 parameters)

Grouting pressure (bar): Synchronized grouting pressure at the shield tail to fill the gap between the shield body and segments, controlling ground deformation.

K-block location (dimensionless): Position index of the K-block (1–15) during segment assembly, affecting the closure stability of the segment ring.

Horizontal/vertical displacement of shield head/tail (mm): Real-time displacement of the shield machine’s head and tail, reflecting deviations in tunneling posture.

Shield tail gap (mm): Gap distances at the top, bottom, left, and right sides between the shield tail and segments, directly influencing grouting efficiency and ground disturbance.

Screw conveyor rotation speed (r/min) and torque (kN·m): Critical parameters for controlling muck discharge, linked to earth pressure balance in the excavation chamber.

Muck pressure (bar): Pressure of excavated soil in the chamber, a core indicator for maintaining face stability.

Advance rate (mm/min): Tunneling speed of the shield machine; excessive rates may cause over-excavation or ground disturbance.

Cutterhead rotation speed (r/min) and torque (kN·m): Parameters governing cutterhead operation, impacting cutting efficiency and ground response.

Thrust (kN): Total thrust of the shield propulsion system, directly related to ground resistance.

(2): Geological Parameters (14 parameters)

These parameters include groundwater level, tunnel burial depth, and stratum-specific properties (e.g., density, moisture content, cohesion, internal friction angle, and compression modulus—see Table 1), derived from geological surveys to characterize soil-layer impacts on settlement.

References

Wei, Y.; Yang, Y.; Tao, M.; Wang, D.; Jie, Y. Earth pressure balance shield tunneling in sandy gravel deposits: A case study of application of soil conditioning. Bull. Eng. Geol. Environ. 2020, 79, 5013–5030.
Fang, Y.; Yao, Y.; Wang, J.; Li, B.; Dou, L.; Wei, L.; Zhuo, B.; Zhang, W.; Hu, X. Effective dewatering and resourceful utilization of high-viscosity waste slurry through magnetic flocculation. Constr. Build. Mater. 2024, 425, 136014.
Zumsteg, R.; Langmaack, L. Mechanized tunneling in soft soils: Choice of excavation mode and application of soil-conditioning additives in glacial deposits. Engineering 2017, 3, 863–870.
Ling, X.; Kong, X.; Tang, L.; Zhao, Y.; Tang, W.; Zhang, Y. Predicting earth pressure balance (EPB) shield tunneling-induced ground settlement in compound strata using random forest. Transp. Geotech. 2022, 35, 100771.
Chen, R.; Zhang, P.; Kang, X.; Zhong, Z.; Liu, Y.; Wu, H. Prediction of maximum surface settlement caused by earth pressure balance (EPB) shield tunneling with ANN methods. Soils Found. 2019, 59, 284–295.
Rong, X.; Gao, L.; Han, A.; Wu, J.; Wu, X.; Jiang, G. Analysis of ground volume loss for EPB shield tunneling in thick silty clay layer. Alex. Eng. J. 2024, 96, 295–302.
Li, C.; Dias, D. Intelligent prediction and visual optimization of surface settlement induced by earth pressure balance shield tunneling. Tunn. Undergr. Space Technol. 2024, 154, 106138.
Wang, J.; Feng, K.; Wang, Y.; Lin, G.; He, C. Soil disturbance induced by EPB shield tunnelling in multilayered ground with soft sand lying on hard rock: A model test and DEM study. Tunn. Undergr. Space Technol. 2022, 130, 104738.
Kong, X.; Ling, X.; Tang, L.; Tang, W.; Zhang, Y. Random forest-based predictors for driving forces of earth pressure balance (EPB) shield tunnel boring machine (TBM). Tunn. Undergr. Space Technol. 2022, 122, 104373.
Lai, H.; Zhang, J.; Zhang, L.; Chen, R.; Yang, W. A new method based on centrifuge model test for evaluating ground settlement induced by tunneling. KSCE J. Civ. Eng. 2019, 23, 2426–2436.
Li, S.; Wang, M. Elastic analysis of stress–displacement field for a lined circular tunnel at great depth due to ground loads and internal pressure. Tunn. Undergr. Space Technol. 2008, 23, 609–617.
Wang, Z.; Shi, C.; Chen, H.; Peng, Z.; Sun, Y.; Zheng, X. Probabilistic analysis of the longitudinal performance of shield tunnels based on a simplified finite element procedure and its surrogate model considering spatial soil variability. Comput. Geotech. 2023, 162, 105662.
Li, J.; Liu, A.; Xing, H. Study on ground settlement patterns and prediction methods in super-large-diameter shield tunnels constructed in composite strata. Appl. Sci. 2023, 13, 10820.
Chen, Z.; Zhang, Y.; Li, J.; Li, X.; Jing, L. Diagnosing tunnel collapse sections based on TBM tunneling big data and deep learning: A case study on the Yinsong Project, China. Tunn. Undergr. Space Technol. 2021, 108, 103700.
Fu, Y.; Chen, L.; Xiong, H.; Chen, X.; Lu, A.; Zeng, Y.; Wang, B. Data-driven real-time prediction for attitude and position of super-large diameter shield using a hybrid deep learning approach. Undergr. Space 2024, 15, 275–297.
Alsirawan, R.; Sheble, A.; Alnmr, A. Two-dimensional numerical analysis for TBM tunneling-induced structure settlement: A proposed modeling method and parametric study. Infrastructures 2023, 8, 88.
Editorial Department of China Journal of Highway and Transport. Review on China’s traffic tunnel engineering research: 2022. China J. Highw. Transp. 2022, 35, 1–40.
Salimi, A.; Faradonbeh, R.S.; Monjezi, M.; Moormann, C. TBM performance estimation using a classification and regression tree (CART) technique. Bull. Eng. Geol. Environ. 2018, 77, 429–440.
Zhang, W.; Li, H.; Wu, C.; Li, Y.; Liu, Z.; Liu, H. Soft computing approach for prediction of surface settlement induced by earth pressure balance shield tunneling. Undergr. Space 2021, 6, 353–363.
Tang, L.; Na, S. Comparison of machine learning methods for ground settlement prediction with different tunneling datasets. J. Rock Mech. Geotech. Eng. 2021, 13, 1274–1289.
Huang, H.; Chang, J.; Zhang, D.; Zhang, J.; Wu, H.; Li, G. Machine learning-based automatic control of tunneling posture of shield machine. J. Rock Mech. Geotech. Eng. 2022, 14, 1153–1164.
Duan, X. Settlement prediction of Nanjing Metro Line 10 with HOA-VMD-LSTM. Measurement 2025, 244, 116477.
Wen, K.; Liang, Q. TM-Informer-Based prediction for railway ground surface settlement. J. Circuits Syst. Comput. 2024, 33, 2450316.
Chang, J.; Huang, H.; Thewes, M.; Zhang, D.; Wu, H. Data-based postural prediction of shield tunneling via machine learning with physical information. Comput. Geotech. 2024, 174, 106584.
Pourtaghi, A.; Lotfollahi-Yaghin, M.A. Wavenet ability assessment in comparison to ANN for predicting the maximum surface settlement caused by tunneling. Tunn. Undergr. Space Technol. 2012, 28, 257–271.
Zhou, H.; Zhang, S.; Peng, J. Informer: Beyond efficient transformer for long sequence time-series forecasting. arXiv 2021.
Lai, J.; Zhu, J.; Guo, Y.; Xie, Y.; Hu, Y.; Wang, P. A multi-factor-driven approach for predicting surface settlement caused by the construction of subway tunnels by undercutting method. Environ. Earth Sci. 2024, 83, 442.
Zhao, J.; Ding, X. Predicting tunnel boring machine performance with the Informer model: A case study of the Guangzhou Metro Line project. J. Zhejiang Univ.-Sci. A 2025, 26, 226–237.
Wang, Z.; Bai, X. Long-term Time Series Prediction of Deformation in the Area of Pylons by Combining InSAR and Transformer-based. In Proceedings of the 2022 Euro-Asia Conference on Frontiers of Computer Science and Information Technology (FCSIT), Beijing, China, 16–18 December 2022; pp. 59–62.
Hu, M.; Cheng, P. Long-Distance Shield Tunnelling Performance Prediction Based on Informer. Appl. Sci. 2025, 15, 1674.
Zhen, J.; Huang, M.; Li, S.; Xu, K.; Zhao, Q. Long-Term Forecasting of Shield Tunnel Position and Attitude Deviation Using the 1DCNN-Informer Method. Eng. Sci. Technol. Int. J. 2025, 63, 101957.
Pang, S.; Hua, W.; Fu, W.; Liu, X.; Ni, X. Multivariable real-time prediction method of tunnel boring machine operating parameters based on spatio-temporal feature fusion. Adv. Eng. Inform. 2024, 62, 102924.
Zhu, Q.; Han, J.; Chai, K.; Zhao, C. Time series analysis based on Informer algorithms: A survey. Symmetry 2023, 15, 951.
Cao, J.; Liu, F.; Shen, Z. ALSTM-based model for TBM performance prediction and the effect of rock mass grade on prediction accuracy. China Civ. Eng. J. 2022, 55, 92–102.
Zhang, P.; Chen, R.; Wu, H.; Liu, Y. Ground settlement induced by tunneling crossing interface of water-bearing mixed ground: A lesson from Changsha, China. Tunn. Undergr. Space Technol. 2020, 96, 103224.
Rhodes, J.S.; Cutler, A.; Moon, K.R. Geometry- and accuracy-preserving random forest proximities. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 10947–10959.
Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
Zhang, P.; Wu, H.; Chen, R.; Chan, T.H.T. Hybrid meta-heuristic and machine learning algorithms for tunneling-induced settlement prediction: A comparative study. Tunn. Undergr. Space Technol. 2020, 99, 103383.

Figure 1. Framework of the Informer algorithm [26].

Figure 2. Self-attention distillation technique [26].

Figure 3. Schematic of the self-attention network with dilated causal convolutional layers.

Figure 4. Schematic of surface settlement induced by shield tunneling.

Figure 5. Schematic of supplementary input features obtained using the moving prediction window.

Figure 6. Geological profile of the section.

Figure 7. Schematic of features affecting surface settlement.

Figure 8. Boxplot method for detecting anomalies.

Figure 9. Changes in evaluation indicators under different improvement measures.

Figure 10. Comparison between prediction data and monitoring data: (a) settlement; (b) residual.

Figure 11. Comparison of the Informer algorithm before and after improvement across different timescales.

Figure 12. Comparison of the computational time of the Informer algorithm before and after improvement.

Figure 13. Comparison with other prediction algorithms (evaluation indicators).

Figure 14. Comparison with other prediction algorithms (settlement).

Figure 15. SHAP values for features.

Figure 16. Feature-dependence plot for shield thrust.

Figure 17. Dependence plots for tunneling parameters: (a) cutterhead torque, (b) advance rate, (c) grouting pressure, and (d) muck pressure.

Table 1. Physical and mechanical parameters of the soil layers.

Stratum No.	Stratum	Density, ρ (g/cm³)	Moisture Content, w (%)	Cohesion, c (kPa)	Internal Friction Angle, Φ (°)	Compression Modulus, E (MPa)	Permeability Coefficient, k (m/d)
①1	Plain fill	1.88	23.1	12	10	-	0.5
①2	Miscellaneous fill	1.90	20	15	5	-	2
③1	Silty clay	1.92	23.9	37.9	13.7	5.34	0.009
③2	Silty soil	1.94	21.6	10.5	19.9	8.55	0.19
③3	Fine sand	1.97	-	5	20	13	-
③5	Medium sand	2.01	-	5	25	15	-
⑦1	Silty clay	1.93	27.5	19.9	11.7	5.56	0.0011
⑨1	Silty clay	1.98	23.7	35.7	13.2	5.81	0.0019
⑨3	Silty soil	1.89	26.7	42.4	13.4	6.02	0.0093
⑨6	Pebble	2.08	-	5	40	35	3.97
⑨7	Medium sand	2.01	-	5	30	15	3.97

Table 2. Statistical indicators of shield tunneling parameters.

Parameter	Unit	Mean Value	Standard Deviation	Min.	Max.	25%	50%	75%
Grouting pressure	bar	2.62	0.45	0.4	3.3	2.3	2.6	3
K-block location	-	6.65	3.91	1	15	3	5	11
Horizontal displacement of shield head (HDSH)	mm	−8.74	16.30	−45	49	−21	−9	0
Horizontal displacement of shield tail (HDST)	mm	3.62	18.15	−45	50	−9	1.5	16
Vertical displacement of shield head (VDSH)	mm	−34.15	8.53	−50	10	−40	−35	−30
Vertical displacement of shield tail (VDST)	mm	−22.72	13.05	−48	48	−30	−25	−19
Gap distance at top of shield tail (GDTST)	mm	51.65	7.55	28	75	45	50	55
Gap distance at bottom of shield tail (GDBST)	mm	71.63	9.58	45	100	65	70	80
Gap distance at left of shield tail (GDLST)	mm	70.46	9.17	40	114	65	70	75
Gap distance at right of shield tail (GDRST)	mm	62.48	11.35	31	95	55	60	70
Screw conveyor rotation speed	r/min	7.60	1.35	1	10.5	7	8	8.5
Screw conveyor torque	kN·m	39.41	13.59	15	83	28	36	48
Muck pressure	bar	1.69	0.24	0.3	2.2	1.6	1.7	1.85
Advance rate	mm/min	41.29	7.58	5	55	38	43	48
Volume of excavated earth	m³	62.74	0.78	61	65	63	63	63
Cutterhead rotation speed	r/min	1.19	0.09	1	1.4	1.1	1.2	1.3
Cutterhead torque	kN·m	2332.18	356.64	385	3531	2150	2350	2500
Grouting volume	m³	6.21	0.29	2.5	7.2	6.1	6.2	6.3
Thrust	kN	11,346.92	1083.19	4800	13,400	11,100	11,500	11,900

Table 3. Segment partitioning and dataset splitting.

Segment No.	Start Ring	End Ring	Total Rings	Role
1	1	45	45	Training
2	46	120	75	Training
3	121	175	55	Test
4	176	250	75	Training
5	251	315	65	Training
6	316	375	60	Training
7	376	430	55	Test
8	431	500	70	Training
9	501	560	60	Training
10	561	630	70	Training
11	631	700	70	Training
12	721	770	65	Test
13	771	830	60	Training
14	831	875	45	Training

Table 4. Optimal hyperparameter settings for the improved Informer algorithm.

Hyperparameter	Hyperparameter Optimization Range	Optimal Value
enc_layers	[2, 3]	2
dec_layers	[1, 3]	2
n_heads	[2, 8]	3
e_layers	[1, 2]	2
d_ff	[128, 512]	174
factor	[1, 5]	3
dropout	[0.0, 0.5]	0.3
embed	[‘fixed’, ‘learned’]	‘learned’
activation	[‘relu’, ‘gelu’]	‘gelu’

Table 5. Ablation study.

Improvement Measure	MSE	MAE	RMSE	R²	Memory (GB)
Baseline (Original Informer)	5.22	5.91	2.28	0.79	3.2
+ Dilated Causal Convolution	5.15	4.66	2.27	0.80	3.0
+ Stratum Classification	5.13	5.76	2.26	0.82	2.2
+ Moving Prediction Window	4.99	4.66	2.23	0.83	2.3
+ Special Factors	4.31	3.98	2.08	0.84	2.3

Table 6. Optimal hyperparameter settings for RF algorithm.

Hyperparameter	Hyperparameter Optimization Range	Optimal Value
n_estimators	[2, 50]	17
criterion	[‘gini’, ‘entropy’]	‘gini’
max_depth	[1, 15]	6
min_samples_split	[2, 10]	2
min_samples_leaf	[1, 4]	1
min_weight_fraction_leaf	[0, 0.5]	0.1

Table 7. Optimal hyperparameter settings for LSTM algorithm.

Hyperparameter	Hyperparameter Optimization Range	Optimal Value
units	[50, 200]	155
activation	[‘relu’, ‘tanh’]	‘tanh’
recurrent_activation	[‘sigmoid’]	‘sigmoid’
use_bias	[True, False]	True
kernel_initializer	[‘glorot_uniform’, ‘he_uniform’]	‘glorot_uniform’
recurrent_initializer	[‘orthogonal’]	‘orthogonal’
bias_initializer	[‘zeros’]	‘zeros’
unit_forget_bias	[True, False]	True
dropout	[0.0, 0.5]	0.2
recurrent_dropout	[0.0, 0.5]	0.2

Table 8. Comparison of original and optimized tunneling parameters.

Combination of Excavation Parameters	Thrust (kN)	Cutterhead Torque (kN·m)	Advance Rate (mm/min)	Settlement (mm)
Original	11,300	2460.9982	39.0000	−10.247
	11,500	2337.2135	40.0000	2.171
	11,500	2311.4430	42.0000	−7.520
	11,300	2201.9703	45.0000	−4.071
	11,700	2317.9262	47.0000	−6.315
Optimized	11,714	2449.8298	38.8373	1.2623
	11,402	2447.8458	40.0938	−1.2648
	11,600	2352.7812	40.3691	−2.9640
	11,453	2251.7883	44.8257	−0.7665
	11,732	2448.3156	45.0416	−5.1628

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
