
TBM Surrounding Rock Grade Prediction Method Based on Multi-Source Feature Fusion

1 Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 611756, China
2 School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China
3 China Railway First Survey and Design Institute Group Co., Ltd., Xi’an 710043, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(12), 6684; https://doi.org/10.3390/app15126684
Submission received: 12 May 2025 / Revised: 28 May 2025 / Accepted: 3 June 2025 / Published: 13 June 2025
(This article belongs to the Special Issue Tunnel and Underground Engineering: Recent Advances and Challenges)

Abstract

Aiming to mitigate engineering risks such as tunnel face collapse and equipment jamming caused by poor geological conditions during tunnel boring machine (TBM) construction, this study proposes a TBM surrounding rock grade prediction method based on multi-source feature fusion. First, a multi-source dataset is established by systematically integrating TBM tunnelling parameters, horizontal seismic profiling (HSP) detection data, and three-dimensional geological spatial information. In the data preprocessing stage, the TBM data is cleaned and divided by mileage section, the statistical characteristics of key tunnelling parameters (thrust, torque, penetration, etc.) are extracted, and the rock fragmentation indices (TPI, FPI, WR) are fused to construct a composite feature vector. The Direct-LiNGAM causal discovery algorithm is introduced to analyse the nonlinear correlation mechanism between multi-source features, and a hybrid model, TRNet, is then constructed, combining the local feature extraction ability of convolutional neural networks with the nonlinear approximation advantages of Kolmogorov–Arnold networks. Verified on a real tunnel project in western Sichuan, China, TRNet achieves an average prediction accuracy of 92.15% for the surrounding rock grade on the test set, higher than other data-driven methods. The results show that the proposed method can effectively predict the surrounding rock grade at the tunnel face during TBM tunnelling and provide decision support for the dynamic regulation of tunnelling parameters.

1. Introduction

In the context of China’s rapid advancements in water conservancy, railway infrastructure, and other engineering projects, the number of tunnels under construction has increased substantially and continuously [1,2,3]. Tunnel boring machines (TBMs) are the most widely used excavation method in this context due to their efficiency and safety. However, the complexity and variability of the subterranean environment pose significant challenges to TBM operation, which frequently encounters unfavourable rock formations characterised by high fragmentation and diminished stability [4,5]. If TBM parameters are not adapted to these rock conditions in a timely manner, equipment failure and even construction accidents such as boring instability, collapse, and jamming can result [6,7]. Consequently, enhancing the efficiency of TBM tunnelling hinges on accurately assessing the current surrounding rock grade at the tunnel face, a crucial yet challenging task [8,9]. Existing methods for predicting the surrounding rock grade at the tunnel face of TBM tunnels can be categorised into three approaches: theoretical analysis, numerical calculation, and machine learning.
In terms of theoretical analysis, Jancsecz and Steiner [10] were the first to utilise the limit equilibrium method to construct a set of theoretical analytical models, from which they derived calculation formulas applied to tunnel face support and the related support mechanisms. Davis et al. [11] also employed the limit equilibrium method to analyse the stability of the tunnel face, taking into account the potential for local collapses independent of the conditions of the overburden above the tunnel. Broere [12] proposed an excess pore water pressure stability model for analysing the stability of tunnel faces in soft soil excavations. Theoretical analysis plays an important role in predicting surrounding rock stability; however, its simplifying assumptions and the uncertainty of model parameter selection still limit its accuracy and universality. In terms of numerical analysis, Jia et al. [13] investigated the damage mechanisms of tunnels in jointed rock masses, emphasising the importance of rock joint behaviour in tunnel stability analysis. Zhang et al. [14] proposed an index, FAI, for estimating the stability of brittle surrounding rock based on the state of stress and the geometry of the strength envelope. Xia et al. [15] investigated the formation mechanism and variation law of the lateral force on the TBM central cutter, focusing on predicting the average lateral force and analysing the degree of rock fragmentation. Numerical analysis methods can accurately simulate complex geological conditions and construction impacts and provide detailed stress–strain distributions; however, the calculation process is complicated, with a low degree of automation and high sensitivity to parameters.
In the domain of machine learning, the rapid accumulation of TBM construction data and the extensive application of artificial intelligence algorithms have led to novel approaches for TBM rock grade prediction [16,17]. Liu et al. [18] proposed an enhanced support vector regression (SVR) model that improves the precision of rock mass parameter prediction by employing the stacked single target (SST) technique. Liu et al. [19] proposed an integrated learning model based on Classification and Regression Trees (CART) with the AdaBoost algorithm for predicting the classification of TBM surrounding rock. Hou et al. [20] proposed a real-time surrounding rock classification prediction method based on big data from TBM operation and an integrated learning stacking technique. Deep learning technology offers advantages in processing complex data and mining deep features, providing new ideas and methods for surrounding rock prediction [21,22,23]. Feng et al. [24] predicted the performance parameters of a TBM using a deep belief network (DBN) during the excavation of a water diversion tunnel in northeast China. Liu et al. [25] proposed a BP neural network model combined with a simulated annealing algorithm (SA-BPNN) for predicting rock mass parameters, including uniaxial compressive strength (UCS), in TBM tunnels. Shi et al. [4] developed a deep neural network model based on measured data from a water diversion project that uses TBM excavation performance parameters to identify the surrounding rock in real time. Qiao et al. [26] investigated a deep learning-based method for real-time identification of rock fragments during TBM excavation using an instance segmentation model. He et al. [27] developed a model based on a Long Short-Term Memory (LSTM) network for analysing and predicting tunnel surrounding rock deformation. The majority of these studies utilise deep learning algorithms to perceive the intrinsic patterns of TBM data, thereby transforming surrounding rock grade prediction into a classification problem or a rock mass parameter prediction problem. This approach is more automated and universal. However, most of the data utilised in these studies originated from a single source, with little consideration of the joint influence of multi-source data (e.g., HSP data, TBM data, 3D spatial data) on rock mass state perception. Moreover, most of these studies overlooked the causal relationships between the data; incorporating them would enhance the ability to discern intrinsic patterns in the data and the interpretation of these patterns, thereby improving the prediction effect [28].
The following key technical issues are addressed in this study: (1) how to fully extract the feature information of multi-source data and use it for surrounding rock grade prediction; (2) how to consider the causal relationships between multi-source data to improve the prediction accuracy of the TBM surrounding rock grade; and (3) how to verify the accuracy and robustness of the proposed prediction model and improve its interpretability. The main contributions of this study are as follows: (1) based on the horizontal seismic profiling (HSP) method, the reflected wave signal is detected and analysed as additional feature information, and the calculation of several rock fragmentation indices is introduced; (2) a hybrid neural network model (TRNet) based on multi-source feature fusion is proposed, which combines causal discovery to achieve interpretable TBM surrounding rock grade prediction while maintaining high accuracy; and (3) the superiority of the model structure is demonstrated by ablation and comparative experiments, and the contribution of each input feature is explained by SHapley Additive exPlanations (SHAP). This work integrates tunnel engineering with deep learning technology: numerical simulation and intelligent algorithms are combined with the characteristics of multi-source data to achieve real-time prediction of the surrounding rock grade under complex and changeable rock mass conditions during tunnel excavation. The method is expected to provide auxiliary decision-making for engineering rock mass technology and rock disaster prediction technology in the field of rock mechanics. The overall flowchart of this study is shown in Figure 1.
The remaining sections are organised as follows: Section 2 presents the construction process of the novel hybrid neural network (TRNet) proposed in this study and the various algorithms it relies on, and describes how the model is evaluated and interpreted. Section 3 presents the data preprocessing methods and data analysis results applied to a tunnel project in western Sichuan, China. Section 4 details the training process of the model and the prediction results on the test set of the tunnel project in western Sichuan, China, and analyses the interpretability of the model. In addition, ablation and comparison experiments are conducted to discuss the advantages of the model over other deep learning algorithms. Section 5 offers a conclusion and recommendations for future research.

2. Methodologies

2.1. Convolutional Neural Network (CNN)

Given that the multi-source data features originate from three distinct data sources, this paper utilises convolutional neural networks (CNNs) to extract local features from the data, facilitating the subsequent TBM rock grade prediction model in discerning the inherent patterns present within the data. CNNs are a category of feed-forward neural networks. As illustrated in Figure 2, the model’s primary structure comprises an input layer, a convolutional layer, a pooling layer, a fully connected layer, and an output layer [29]. The convolutional layer applies a convolution kernel to the input information to extract features and generate a new feature map. The convolution process is described as follows:
$y_{i,j}(n) = w_{i-1,j} \cdot x_{i,j}(n) + b_{i-1,j}$
In this equation, $w_{i-1,j}$ and $b_{i-1,j}$ are the weights and bias term of the jth convolution kernel in the $(i-1)$th layer, respectively, $x_{i,j}(n)$ is the input, and $y_{i,j}(n)$ represents the nth feature map output by the convolution of the jth convolution kernel in the ith layer.
A batch normalisation layer is incorporated after the convolutional layer with the objective of accelerating the training process of the neural network and enhancing the generalisation ability of the model. Additionally, an activation function layer is included to preserve the effect of each layer following the convolutional operation and to acquire nonlinear features. The nonlinear activation function selected in this paper is the ReLU function, which has been shown to be computationally faster than other nonlinear functions and to perform better in practical applications [30]. The expression for the ReLU function is as follows:
$a_{i,j}(n) = \delta\big(y_{i,j}(n)\big) = \max\{0,\ y_{i,j}(n)\}$
where a i , j ( n ) represents the value after nonlinear activation, and δ denotes the activation function.
Redundant information is ultimately eliminated through the pooling layer, thus reducing the size of the feature map and preventing overfitting. The following procedure is employed to achieve this objective:
$p_{i,j}(n) = \mathrm{pool}\big(a_{i,j}(n)\big)$
where p i , j ( n ) signifies the value corresponding to the feature map in layer i after the pooling operation, and p o o l is the pooling function.
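The three operations above (convolution, ReLU activation, pooling) can be illustrated with a minimal NumPy sketch; this is a toy single-kernel example for exposition, not the paper's network:

```python
import numpy as np

def conv1d(x, w, b):
    """Valid 1D convolution of signal x with kernel w, plus bias b."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) + b for i in range(len(x) - k + 1)])

def relu(y):
    """Nonlinear activation: max(0, y), applied element-wise."""
    return np.maximum(0.0, y)

def max_pool(a, size=2):
    """Non-overlapping max pooling; any remainder is truncated."""
    n = len(a) // size
    return a[:n * size].reshape(n, size).max(axis=1)

# Toy input signal and one hand-set convolution kernel (illustrative values)
x = np.array([1.0, 2.0, 0.5, -1.0, 3.0, 2.5])
w = np.array([0.2, 0.5, -0.1])
feat = max_pool(relu(conv1d(x, w, b=0.1)))   # -> array([1.25, 1.15])
```

Stacking more kernels simply produces more such feature maps, which is what the multi-channel layers in Section 2.4 do.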

2.2. Kolmogorov–Arnold Network (KAN)

This paper enhances the nonlinear mapping capability of the model by introducing a Kolmogorov–Arnold network (KAN), designed to assist the subsequent TBM surrounding rock grade prediction model in performing effective feature fusion and transformation. In contrast to traditional multilayer perceptrons (MLPs), which perform a linear transformation followed by a fixed nonlinearity at each layer, KANs apply a learnable univariate nonlinear transformation to each input dimension individually and then combine the results. By the Kolmogorov–Arnold representation theorem, a high-dimensional function can thereby be reduced to learning a polynomial number of one-dimensional functions. Represented as a neural network graph, this is equivalent to a two-layer neural network, except that there is no separate linear combination: the input features are activated nonlinearly on the edges directly. Importantly, these activation functions are not predetermined; they are learned from the data.
The multi-layer KAN network architecture is illustrated in Figure 3. In the “activation layer”, a number of B-spline functions with different control points are used to fit functions of arbitrary shapes.
To construct deep KANs, one simply needs to define for each KAN layer a function matrix:
$\Phi = \{\phi_{q,p}\},\quad p = 1, 2, \ldots, n_{\mathrm{in}},\quad q = 1, 2, \ldots, n_{\mathrm{out}}$
and stack multiple such layers. In a manner akin to traditional multilayer perceptrons, the network is built from input-to-output mapping relationships, that is, transformations (matrices of univariate functions) from the input layer to the output layer:
$x_{l+1} = \underbrace{\begin{pmatrix} \phi_{l,1,1}(\cdot) & \phi_{l,1,2}(\cdot) & \cdots & \phi_{l,1,n_l}(\cdot) \\ \phi_{l,2,1}(\cdot) & \phi_{l,2,2}(\cdot) & \cdots & \phi_{l,2,n_l}(\cdot) \\ \vdots & \vdots & & \vdots \\ \phi_{l,n_{l+1},1}(\cdot) & \phi_{l,n_{l+1},2}(\cdot) & \cdots & \phi_{l,n_{l+1},n_l}(\cdot) \end{pmatrix}}_{\Phi_l} x_l$
The function matrix $\Phi_l$ corresponds to the lth layer of the KAN. In general, a KAN network can be regarded as a composition of such layers. For a given input $x_0 \in \mathbb{R}^{n_0}$, the output of the KAN is computed as
$\mathrm{KAN}(x) = (\Phi_{L-1} \circ \Phi_{L-2} \circ \cdots \circ \Phi_1 \circ \Phi_0)\, x$
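A two-layer KAN forward pass can be sketched in NumPy as follows. This is a hedged illustration, not the paper's implementation: piecewise-linear edge functions on a fixed grid stand in for the learnable B-splines, and the coefficients are random rather than trained:

```python
import numpy as np

def edge_fn(x, coeffs, grid):
    """Univariate edge function phi: piecewise-linear interpolation on a
    fixed grid — a simple stand-in for a learnable B-spline."""
    return np.interp(x, grid, coeffs)

def kan_layer(x, C, grid):
    """One KAN layer. x: (n_in,) input; C: (n_out, n_in, G) edge coefficients.
    Output q is the sum over inputs p of phi_{q,p}(x_p) — no separate
    linear combination, as in the function-matrix form above."""
    n_out, n_in, _ = C.shape
    return np.array([sum(edge_fn(x[p], C[q, p], grid) for p in range(n_in))
                     for q in range(n_out)])

rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 5)        # G = 5 control points per edge
C0 = rng.normal(size=(3, 2, 5))          # layer 0: 2 inputs -> 3 outputs
C1 = rng.normal(size=(1, 3, 5))          # layer 1: 3 inputs -> 1 output
y = kan_layer(kan_layer(np.array([0.3, -0.7]), C0, grid), C1, grid)
```

In a real KAN the coefficients `C0`, `C1` are fitted by gradient descent and the grid can be refined, which is the property exploited in Section 2.4.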

2.3. Local Causality Matrix Construction Method for Multi-Source Data Based on Direct-LiNGAM

In this paper, we introduce the Direct-LiNGAM method to account for causal relationships among multi-source data, thereby enhancing the model’s ability to perceive the intrinsic connections of multi-source data and facilitating more effective multi-source feature fusion. Direct-LiNGAM is a causal discovery algorithm improved from LiNGAM that is applicable to a larger number of variables [31,32]. The method has also been extended to allow for latent variables, time series data, and feedback loops [33]. It requires no a priori knowledge to obtain causal relationships between variables and has a stable computational complexity with reliable convergence [28].
The fundamental concept underpinning the Direct-LiNGAM algorithm can be summarised as follows: Initially, the exogenous variables are selected and incorporated into the causal order. Subsequently, the dataset is updated and the subsequent stage of variable selection is initiated. This process is repeated until the final causal order is obtained. The specific process and algorithm flow of selecting exogenous variables in this paper are as follows, for the case of multi-source data.
Assuming a set of variables x, the specific procedure is as follows: first, for each variable $x_j \in U$, where U denotes the set of remaining variables, compute the residuals $r_i^{(j)}$ from regressing each $x_i \in U \setminus \{x_j\}$ on $x_j$. Then, using a kernel-based estimator, compute the mutual independence measure $\widehat{MI}_{\mathrm{kernel}}(x_j, r_i^{(j)})$ between $x_j$ and each residual. From this, obtain the total kernel-based independence measure $T_{\mathrm{kernel}}(x_j; U)$, as defined by Equation (8), and identify the variable that minimises this quantity as the exogenous variable:
$T_{\mathrm{kernel}}(x_j; U) = \sum_{i \in U,\ i \neq j} \widehat{MI}_{\mathrm{kernel}}\big(x_j,\ r_i^{(j)}\big)$
$x_m = \operatorname*{arg\,min}_{j \in U \setminus K} T_{\mathrm{kernel}}(x_j; U)$
Step 1.
Given an m × n observation matrix X, centre each variable (column-wise).
Step 2.
For each $x_j \in U$, regress all other $x_i \in U \setminus \{x_j\}$ on $x_j$ and obtain the residuals $r_i^{(j)}$. Then compute the kernel-based independence measure $\widehat{MI}_{\mathrm{kernel}}(x_j, r_i^{(j)})$ and derive $T_{\mathrm{kernel}}(x_j; U)$. Select the variable $x_m$ that minimises this measure as the exogenous variable.
Step 3.
Remove x m from the set U and record it in the causal ordering set K.
Step 4.
Set $x = r^{(m)}$ and $X = R^{(m)}$, i.e., replace each remaining variable with its residual from regression on $x_m$.
Step 5.
If the number of variables in U is greater than 1, return to Step 2. Otherwise, proceed to Step 6.
Step 6.
Append the final variable in U to the end of K; this yields the full causal order.
Step 7.
Estimate the final adjacency matrix B using the identified causal order.
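To make Steps 1–7 concrete, the following minimal NumPy sketch runs the iterative exogenous-variable selection on a synthetic three-variable chain. It is not the authors' implementation: the kernel-based mutual information is replaced by a crude dependence proxy (the absolute correlation of the squared, centred variables), which suffices for this non-Gaussian toy example:

```python
import numpy as np

def dep(u, v):
    """Crude dependence proxy standing in for the kernel-based mutual
    information: |correlation| of the squared, centred variables."""
    u = u - u.mean()
    v = v - v.mean()
    return abs(np.corrcoef(u**2, v**2)[0, 1])

def causal_order(X):
    X = X - X.mean(axis=0)                        # Step 1: centre columns
    U, K = list(range(X.shape[1])), []
    while len(U) > 1:
        scores = {}
        for j in U:                               # Step 2: score candidates
            t = 0.0
            for i in U:
                if i == j:
                    continue
                beta = X[:, i] @ X[:, j] / (X[:, j] @ X[:, j])
                t += dep(X[:, j], X[:, i] - beta * X[:, j])  # r_i^(j)
            scores[j] = t
        m = min(scores, key=scores.get)           # most exogenous variable
        K.append(m)                               # Step 3
        U.remove(m)
        for i in U:                               # Step 4: residualise data
            beta = X[:, i] @ X[:, m] / (X[:, m] @ X[:, m])
            X[:, i] = X[:, i] - beta * X[:, m]
    K.append(U[0])                                # Steps 5-6
    return K                                      # causal ordering set K

rng = np.random.default_rng(7)
n = 5000
x0 = rng.uniform(-1, 1, n)                        # non-Gaussian source
x1 = 2.0 * x0 + rng.uniform(-1, 1, n)             # x0 -> x1
x2 = -1.5 * x1 + rng.uniform(-1, 1, n)            # x1 -> x2
order = causal_order(np.column_stack([x0, x1, x2]))
```

On this chain the sketch recovers the true order [0, 1, 2]; Step 7 (estimating the adjacency matrix B) would follow by regressing each variable on its predecessors in that order.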
Eventually, as demonstrated in Figure 4, the causal adjacency matrix of the input features is computed based on the Direct-LiNGAM method, which is subsequently multiplied by the original input vectors to finally obtain the local causality matrix. This matrix employs colour shades to indicate the strength of causal relationships between variables; the darker the shade, the stronger the relationship. This is carried out to help the model capture causal relationships and correlations between multiple sources of data. The local causality matrix is then spliced with the original input vectors, and the resultant matrix is used as inputs to the model.

2.4. Neural Network (TRNet) for TBM Surrounding Rock Grade Prediction Based on Multi-Source Feature Fusion

In order to achieve real-time prediction of the surrounding rock state in front of the TBM tunnel face, the problem was transformed into a classification problem based on multi-source data. To account for the causal relationships between multi-source data, we construct a novel hybrid neural network (TRNet) based on the algorithms described in the previous sections.
As illustrated in Figure 5, the TRNet model is a hybrid deep learning architecture founded on a CNN and a KAN. It comprises an input layer, a feature encoding module (CNN), a feature fusion module (KAN), a regularisation module (Dropout), and a classifier module.
Initially, the input data consists of multi-source feature vectors (HSP-coded data, rock fragmentation indices, TBM boring parameter data, and 3D spatial coordinate information) arranged as a matrix in which each row represents a mileage point and each column represents a feature. The mileage span of the input data is denoted by T and the number of features by X, so the input matrix $X \in \mathbb{R}^{T \times X}$ can be expressed as follows:
$X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,X} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,X} \\ \vdots & \vdots & & \vdots \\ x_{T,1} & x_{T,2} & \cdots & x_{T,X} \end{pmatrix}$
where x t , i denotes the ith eigenvalue of the tth mileage point.
The Direct-LiNGAM method is utilised to compute the causal adjacency matrix A between the input features. The size of matrix A is $X \times X$, and its element $a_{i,j}$ denotes the causal strength of the ith feature on the jth feature. The matrix A can be expressed as follows:
$A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,X} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,X} \\ \vdots & \vdots & & \vdots \\ a_{X,1} & a_{X,2} & \cdots & a_{X,X} \end{pmatrix}$
where a larger $a_{i,j}$ indicates a stronger causal relationship.
The local causality matrix C is then generated by multiplying the input matrix X with the causal adjacency matrix A. The size of matrix C is $T \times X$, and its element $c_{t,j}$ denotes the local causality strength of the jth feature at the tth mileage point. The expression for the matrix C is as follows:
$C = X \cdot A$
where each row of C represents the local causal features at one mileage point.
The original input matrix X is concatenated with the local causality matrix C to form the final input matrix $X_{\mathrm{input}}$. The size of $X_{\mathrm{input}}$ is $T \times 2X$, and it can be expressed as
$X_{\mathrm{input}} = \mathrm{Concat}(X, C)$
where the Concat function denotes concatenation along the feature dimension.
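The construction of C and the concatenated model input can be sketched in a few lines of NumPy. The adjacency matrix A is hand-set here purely for illustration; in the paper it comes from Direct-LiNGAM:

```python
import numpy as np

T, Xf = 4, 3                       # toy sizes: mileage points, features
rng = np.random.default_rng(0)
X = rng.normal(size=(T, Xf))       # multi-source feature matrix (T x X)

# Hand-set causal adjacency matrix: a_ij = strength of feature i -> feature j
A = np.array([[0.0, 0.8, 0.0],
              [0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0]])

C = X @ A                                   # local causality matrix (T x X)
X_input = np.concatenate([X, C], axis=1)    # Concat(X, C): (T x 2X) input
```

Each row of `X_input` pairs the raw features of one mileage point with their causally weighted counterparts, which is exactly the input the CNN layers receive next.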
Subsequently, a two-layer one-dimensional convolution (1D Conv) is used to achieve feature space mapping and local pattern extraction, with ReLU as the activation function throughout:
$H^{(1)} = \mathrm{ReLU}(W_1 * X_{\mathrm{input}} + b_1)$
The first layer comprises 32 filters (kernel size 3) with output dimension $d_{\mathrm{model}}/2$. ReLU enhances feature sparsity and reduces ineffective signal transmission through its unilateral suppression property.
$H^{(2)} = \mathrm{ReLU}(W_2 * H^{(1)} + b_2)$
The second layer comprises 64 filters (kernel size 3) with output dimension $d_{\mathrm{model}}/2$. The dual-ReLU design avoids introducing additional hyperparameter complexity while preserving feature expressiveness.
The output of the activation layer is then connected to a stacked KAN network. The KAN decomposition, as outlined by the Kolmogorov–Arnold theorem, transforms multivariate continuous functions into combinations of univariate functions and additions, enabling feature fusion and dimensionality reduction through learnable B-spline functions. The input features $H^{(2)}$ are passed through two stacked KAN layers with hidden layer dimensions in the order $d_{\mathrm{model}} \to d_{\mathrm{model}}/2 \to d_{\mathrm{model}}/4$, and the edge activation function of each layer is a B-spline:
$\Phi(h_i) = \sum_{j=1}^{G} c_{ij} B_j(h_i)$
where $B_j$ denotes the B-spline basis function, $c_{ij}$ represents the learnable coefficients, and G is the grid resolution, optimised by Bayesian grid search. The KAN supports grid refinement (e.g., adjusting G = 5 to G = 10) without re-training from scratch, enhancing the model’s adaptability to complex geological conditions.
Finally, the model is prevented from overfitting by a Dropout layer, and the dimensions of the predicted vectors are mapped to the individual classes by a fully connected layer:
$y_{\mathrm{pred}} = \mathrm{Softmax}(W_c \cdot H_{\mathrm{dropout}} + b_c)$
where $W_c \in \mathbb{R}^{(d_{\mathrm{model}}/4) \times \mathrm{num\_classes}}$.
This approach to constructing the network architecture has the capacity to effectively integrate multi-source feature information, thereby achieving an enhancement in classification accuracy. Table 1 provides a comprehensive overview of the components constituting the TRNet network.
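The flow of shapes through this pipeline can be sketched as follows. This NumPy stand-in is not TRNet itself: the weights are random and untrained, mean-pooling over the mileage axis replaces the real pooling strategy, and plain dense ReLU layers stand in for the KAN layers; it only illustrates how a (mileage, feature) input is mapped to a class distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, W, b):
    """x: (L, C_in); W: (k, C_in, C_out). Valid 1D convolution + ReLU."""
    k, _, c_out = W.shape
    L = x.shape[0] - k + 1
    y = np.stack([np.tensordot(x[i:i + k], W, axes=([0, 1], [0, 1]))
                  for i in range(L)]) + b
    return np.maximum(0.0, y)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

L_in, n_feat, n_cls = 20, 8, 5           # toy sequence/feature/class sizes
x = rng.normal(size=(L_in, n_feat))       # one sample: Concat(X, C) features

h1 = conv1d(x, 0.1 * rng.normal(size=(3, n_feat, 32)), np.zeros(32))  # 32 filters
h2 = conv1d(h1, 0.1 * rng.normal(size=(3, 32, 64)), np.zeros(64))     # 64 filters
h = h2.mean(axis=0)                       # pool over the mileage axis -> (64,)

# Dense ReLU stand-ins for the two KAN layers (64 -> 32 -> 16)
h = np.maximum(0.0, h @ (0.1 * rng.normal(size=(64, 32))))
h = np.maximum(0.0, h @ (0.1 * rng.normal(size=(32, 16))))

# Classifier head (Dropout is the identity at inference time)
y_pred = softmax(h @ (0.1 * rng.normal(size=(16, n_cls))))
```

`y_pred` is a probability vector over the surrounding rock grades; training would fit all the random weight tensors above by cross-entropy minimisation.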

2.5. Indicators for Model Evaluation

In order to facilitate a more robust evaluation of the model’s prediction performance, this study employs the metrics of accuracy, AUC (area under curve), and the Kappa coefficient to assess the model’s efficacy. Specifically, accuracy signifies the proportion of accurately predicted instances among all instances; AUC is defined as the area under the ROC curve enclosed with the coordinate axis, with values approaching 1 indicating higher accuracy; and the Kappa coefficient is a statistical metric employed for the performance evaluation of multiclassification models, with its value denoting the degree of consistency between the actual results and the model prediction results. The formulae for these indicators are shown in Equations (17)–(20).
$\mathrm{Accuracy} = \frac{1}{m} \sum_{i=1}^{m} H\big(f(x_i) = y_i\big) = \frac{TP + TN}{TP + TN + FP + FN}$
$\mathrm{AUC} = \int_0^1 \frac{TP}{TP + FN} \, d\!\left(\frac{FP}{FP + TN}\right)$
$P_e = \frac{\sum_{i=1}^{n} L_i \times L'_i}{N^2}$
$K = \frac{\mathrm{Accuracy} - P_e}{1 - P_e}$
where $TP$ denotes a true positive, i.e., the real label and the prediction are both positive; $TN$ denotes a true negative, i.e., the real label and the prediction are both negative; $FP$ denotes a false positive, i.e., the real label is negative but the prediction is positive; and $FN$ denotes a false negative, i.e., the real label is positive but the prediction is negative. $H(\cdot)$ is the condition discriminant, returning 1 if the condition is satisfied and 0 otherwise. $L_i$ denotes the actual number of samples of class i, $L'_i$ denotes the number of samples predicted as class i, and N denotes the total number of samples.
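As a worked example of the accuracy and Kappa formulas (a minimal NumPy sketch; the labels are invented for illustration):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of correctly predicted instances."""
    return float(np.mean(y_true == y_pred))

def kappa(y_true, y_pred, n_classes):
    """Cohen's Kappa: (observed - expected agreement) / (1 - expected)."""
    N = len(y_true)
    p_o = accuracy(y_true, y_pred)
    # Expected agreement P_e: sum over classes of
    # (actual count of class i) * (predicted count of class i) / N^2
    p_e = sum((y_true == c).sum() * (y_pred == c).sum()
              for c in range(n_classes)) / N**2
    return (p_o - p_e) / (1.0 - p_e)

y_true = np.array([0, 0, 1, 1, 2, 2])   # toy rock-grade labels
y_pred = np.array([0, 0, 1, 2, 2, 2])   # toy model predictions

acc = accuracy(y_true, y_pred)           # 5/6 ≈ 0.833
k = kappa(y_true, y_pred, 3)             # (5/6 - 1/3) / (1 - 1/3) = 0.75
```

Kappa discounts the agreement expected by chance, so it is more informative than raw accuracy when the grade distribution is imbalanced, as is common along a tunnel alignment.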

3. Case Study

3.1. Project Overview

This paper examines a tunnel project located in western Sichuan Province, China. The tunnel follows a northeast–southwest alignment with a total length of approximately 37.965 km and a maximum overburden depth of 1696 m, characteristic of a deep-buried, extra-long tunnel. Substantial deposits of Quaternary alluvial–proluvial cohesive soils and gravelly soils are present near the portal areas (i.e., the entrances and exits) and in the adjacent valleys. The bedrock along the primary tunnel alignment is predominantly Himalayan granite ($\gamma_6$) and diorite ($\delta_6$). The strata cut by the exit section include Meso-Neoproterozoic gneisses ($Pt_{2\text{-}3}Gn$) of the Nyenchen Tanglha Group; hard rock accounts for over 90% of the alignment. Figure 6 shows the distribution of the primary lithologies along the designated tunnel route. The project entails the construction of two parallel single-track tunnels, and the section examined in this paper is part of the left line. The statistical distribution of the surrounding rock grades along the tunnel is given in Table 2. The segment from chainage DK1223+195 to DK1226+770 was selected for detailed investigation; within it, the tunnel passes through slightly weathered granite, with the surrounding rock predominantly classified as Grade III and IV.

3.2. Data Preprocessing

The tunnel boring machine (TBM) data for this project was collected at approximately 85,000 data points per day, sampled at 1 s intervals from the TBM time series data. During the TBM construction process, the automatic acquisition system records two types of data: those pertaining to normal working states and those pertaining to non-working states. In addition to these non-working state data, the TBM may contain abnormal values such as zero digging parameters during the digging process due to the influence of equipment failure, among other factors. This data can interfere with the subsequent model training. Consequently, the outliers are eliminated through the implementation of a binary state discriminant function [34].
$h(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}$
$H = h(V) \cdot h(F) \cdot h(P) \cdot h(T)$
where h(x) is the binary state discriminant function, equal to 0 when its argument is not positive. V, F, P, and T are the time series of cutterhead speed, total thrust, cutterhead penetration, and cutterhead torque recorded during tunnelling. If the binary state discriminant H = 0, the data record is considered an outlier and is eliminated.
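A minimal NumPy sketch of this filter (the parameter values are invented for illustration):

```python
import numpy as np

def working_state_mask(V, F, P, T):
    """H = h(V) * h(F) * h(P) * h(T): a record survives only if all four
    tunnelling parameters are strictly positive."""
    h = lambda x: (x > 0).astype(int)
    return (h(V) * h(F) * h(P) * h(T)).astype(bool)

V = np.array([1.2, 0.0, 1.1, 1.3])    # cutterhead speed
F = np.array([10.0, 9.0, 0.0, 12.0])  # total thrust
P = np.array([5.0, 5.0, 5.0, 6.0])    # penetration
T = np.array([2.0, 2.0, 2.0, 2.5])    # torque

mask = working_state_mask(V, F, P, T)  # records 1 and 2 are outliers
```

Applying `mask` to every recorded channel removes non-working-state and zero-parameter records in one pass.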
However, TBM operational data is by nature a time record rather than a uniform spatial record: the data points are evenly distributed in time but not in space [35]. The raw time data was therefore grouped according to the actual mileage (spatial location) of the TBM at each moment. The time points are denoted by $t_i$ and the corresponding spatial locations (mileages) by $s_i$; the data is grouped by spatial location with a grouping interval of 0.1 m, i.e., $\Delta s = 0.1$:
$[s_k, s_{k+1}) = [k \cdot \Delta s,\ (k+1) \cdot \Delta s)$
where k is an integer indicating the kth mileage segment.
For each mileage segment [ s k , s k + 1 ) , the average of all time data within the segment is calculated. Let N k denote the number of data points within the segment and t k , j be the jth time point within the segment, then the average value, denoted as t k ¯ , is
$\bar{t}_k = \frac{1}{N_k} \sum_{j=1}^{N_k} t_{k,j}$
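The binning and per-segment averaging can be sketched as follows (toy mileages and parameter values, with the paper's $\Delta s$ = 0.1 m):

```python
import numpy as np

ds = 0.1                                       # bin width in metres
s = np.array([0.01, 0.05, 0.12, 0.18, 0.23])   # mileage of each record
v = np.array([2.0, 4.0, 6.0, 8.0, 10.0])       # some tunnelling parameter

k = np.floor(s / ds).astype(int)               # mileage-segment index per record
bins = np.unique(k)
v_bar = np.array([v[k == b].mean() for b in bins])  # per-segment averages
```

Here the first two records fall into segment [0.0, 0.1), the next two into [0.1, 0.2), and the last into [0.2, 0.3), giving segment averages 3.0, 7.0, and 10.0.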
Lastly, TBM data with a total length of 2.291 km was obtained from the western Sichuan tunnel project dataset. This section extends from DK1223+959 on the left line to DK1226+250 on the same line [36].
In addition, horizontal seismic profiling (HSP) was utilised to investigate the geological conditions ahead of the tunnel face. HSP deploys low-frequency seismic sources and receivers in a single horizontal plane along the tunnel walls, thereby acquiring seismic reflection data. Subsequent data analysis is primarily based on travel time analysis, where travel times are used directly to locate reflective interfaces, providing relatively intuitive imaging results [37]. The primary advantage of the HSP method lies in its capacity to analyse reflected wave signals using both time and frequency domain techniques, enabling indirect inference of lithological variations in front of the tunnel face.
The HSP system utilised in this study is the HSP217 Advanced Geological Prediction Instrument. The instrument’s design incorporates a novel spatial array observation layout, adding calculations for parameters such as forward wave velocity, and expanding output formats to include 2D and 3D spectral maps of reflected energy. This results in higher prediction resolution and more intuitive and accurate detection results. A comprehensive list of key parameters is provided in Table 3.
Subsequent to processing the data in accordance with the workflow depicted in Figure 7, prediction result images (Figure 8) are obtained. These images are presented using 3D visualisation. In these visualisations, the XOY and ZOY slices represent horizontal and vertical geological cross sections of the surrounding rock mass in front of the tunnel face. The analysis of anomalous zones within these slices facilitates the approximate determination of size and distribution of unfavourable geological bodies.
In this study, geophysical experts interpreted the inversion imaging results in order to classify the characteristics of the reflected energy. These categories were then represented using one-hot encoding (see Table 4). This process involves converting categorical variables into a format that is readily interpretable by machine learning algorithms. This method involves the representation of m categories through m binary bits, with each reflected energy category assigned a unique numerical label [38].
Additionally, due to the inherent subjectivity in interpreting inversion imaging results, a degree of random noise was introduced to mitigate the influence of human factors. This random noise was introduced based on two indicators: length and category. It was primarily concentrated near the start and end points of the reflected wave chainage range.
Specifically, the maximum (η_max) and minimum (η_min) noise proportions were first determined, where the noise proportion η is defined as the ratio of noise data points to the total number of data points within a specified chainage range. The Monte Carlo simulation method, a numerical technique that approximates solutions to complex problems through extensive random sampling, was then employed to randomly generate the lengths of the misclassified segments. Specifically, a random number r is generated, where r ∈ [0, 1], and the length of the misclassified segment, L, is determined using Equation (23):
L = L_min + r × (L_max − L_min)
where L_min and L_max denote the minimum and maximum misclassification lengths, respectively.
Subsequently, the segment of generated length L is reassigned to the nearest neighbouring category. The distance of the current data point from each category is calculated as follows:
d_i = √((x − x_i)² + (y − y_i)²)
where x and y represent the chainage and reflected energy intensity of the current data point, respectively, and x_i and y_i are those of the ith category. The category with the smallest distance is then selected as the nearest neighbour category.
The noise data is then generated and added to the original data. The formula for generating noise data is as follows:
y_noise = y_original + η × ε
where y_noise is the reflected energy intensity class after adding noise, y_original is the original reflected energy intensity class, η is the noise weight, and ε is the random noise value, which typically follows a normal distribution ε ~ N(0, σ²).
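The noise-injection procedure of the equations above can be sketched as follows (a minimal numpy sketch; the segment bounds L_min and L_max, the noise proportions, and σ are hypothetical values, not those used in the study):

```python
import numpy as np

rng = np.random.default_rng(42)

def inject_noise(y, eta_min=0.05, eta_max=0.15, L_min=3, L_max=10, sigma=1.0):
    """Perturb a 1-D array of HSP class values near the start and end of the
    chainage range: a Monte Carlo segment length L = L_min + r*(L_max - L_min)
    is drawn for each end, then y_noise = y_original + eta*eps with
    eps ~ N(0, sigma^2). All default parameter values are illustrative."""
    y = y.astype(float)                              # work on a copy
    n = len(y)
    eta = rng.uniform(eta_min, eta_max)              # noise weight
    for at_start in (True, False):
        r = rng.uniform()                            # r in [0, 1], Eq. (23)
        L = int(round(L_min + r * (L_max - L_min)))  # misclassified segment length
        segment = slice(0, L) if at_start else slice(n - L, n)
        y[segment] += eta * rng.normal(0.0, sigma, L)
    return y

y_raw = np.full(100, 3.0)      # e.g. a constant HSP class over 100 chainage points
y_noisy = inject_noise(y_raw)
```

Only the two end segments are perturbed, mirroring the observation that interpretation uncertainty concentrates near the start and end points of each reflected-wave chainage range.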
Figure 9 compares the raw data with the data after noise addition. Figure 9a illustrates the raw reflected energy data, with the distance travelled (m) on the horizontal axis and the HSP grade on the vertical axis; the data points are distinguished by colour to indicate different HSP classes, and each point corresponds to a specific mileage and HSP class. Figure 9b shows the data after the introduction of noise, which is predominantly concentrated at the beginning and end of the mileage and deviates from the original data distribution. A comparison of Figure 9a,b affords a clear visualisation of the impact of noise on the data distribution: the introduction of noise results in a more dispersed distribution of data points. However, through the judicious application of noise weighting and classification methods, the overall trend and characteristics of the data are preserved.
From the 116 recorded data variables for the Chuanxi TBM left-line tunnel, this study selected five excavation parameters as the initial TBM input parameters: cutterhead rotational speed (RPM), advance rate (v), penetration rate (P), total thrust (F), and cutterhead torque (T). Utilising these parameters, rock fragmentation indices, namely the field penetration index (FPI), the torque penetration index (TPI), and the weight ratio (WR), were calculated using Equations (28)–(30) and incorporated as additional inputs. Furthermore, real-time coordinate data (X, Y, Z) from the shield tail guidance system were added as three-dimensional spatial data. The amalgamation of multi-source data from three distinct origins—TBM operational data, 3D spatial data, and HSP data—culminated in a total of 12 input features.
FPI = F / (N · P)
TPI = T / (0.3 · D · N · P)
WR = (2π × 10³ · T · RPM) / (F · v)
where N denotes the number of cutters installed on the cutterhead, and D represents the cutterhead diameter.
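Under the formulas above, the three indices can be computed directly from the tunnelling parameters (illustrative Python; the unit conventions and sample operating point are assumptions for demonstration only):

```python
import math

def fragmentation_indices(F, T, P, rpm, v, N, D):
    """Rock fragmentation indices from TBM parameters.
    F: total thrust, T: cutterhead torque, P: penetration per revolution,
    rpm: cutterhead rotational speed, v: advance rate,
    N: number of cutters, D: cutterhead diameter."""
    FPI = F / (N * P)                            # thrust per cutter per unit penetration
    TPI = T / (0.3 * D * N * P)                  # torque-based analogue
    WR = 2 * math.pi * 1e3 * T * rpm / (F * v)   # ratio of torque work to thrust work
    return FPI, TPI, WR

# Hypothetical operating point.
fpi, tpi, wr = fragmentation_indices(F=10000, T=2000, P=8, rpm=6, v=48, N=50, D=8)
```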

3.3. Data Analysis

In order to further understand the multi-source data characteristics of the input models, the following data analyses were conducted: (1) basic statistical analysis and data distribution visualisation for each TBM data input variable; (2) correlation analysis for each multi-source data input variable; and (3) causal analysis for each multi-source data input variable.
As illustrated in Figure 10, the violin plots for each TBM parameter input variable were used to demonstrate the shape of the data distribution and concentration trends. Table 5 provides basic statistical information for each TBM parameter input variable.
We assess the correlation between paired variables using the Pearson linear correlation coefficient (PLCC) [39] and the maximum information coefficient (MIC) [40] in order to further examine the association between the various tunnelling characteristics. PLCC measures the degree of linear association, while MIC can also capture nonlinear correlations. The two coefficients are calculated as follows:
PLCC = cov(X, Y) / (σ_X · σ_Y)
MIC(X, Y) = max I(X, Y) / log₂ min{n_X, n_Y}
where cov(X, Y) denotes the covariance between the paired variables X and Y, while σ_X and σ_Y represent their respective standard deviations. For the MIC calculation (see Equation (32)), I(X, Y) denotes the mutual information between X and Y, calculated based on the partitioning of the variables into n_X and n_Y bins, respectively.
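Both coefficients can be estimated numerically. The sketch below computes the exact PLCC and a simplified MIC-like score on a single fixed grid; the true MIC maximises mutual information over many grid resolutions (e.g. via the minepy package), so this single-grid version is an approximation for illustration:

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    return np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))

def mic_like(x, y, n_bins=8):
    """Mutual information on one fixed n_bins x n_bins grid, normalised by
    log2(min(n_x, n_y)) -- a single-grid stand-in for the full MIC search."""
    pxy, _, _ = np.histogram2d(x, y, bins=n_bins)
    pxy /= pxy.sum()                       # joint bin probabilities
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    mask = pxy > 0
    mi = np.sum(pxy[mask] * np.log2(pxy[mask] / np.outer(px, py)[mask]))
    return mi / np.log2(n_bins)
```

On a symmetric quadratic relation y = x², PLCC is near zero while the MIC-like score stays high, which is exactly the nonlinear sensitivity that motivates reporting both coefficients.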
PLCC and MIC heat map matrices were plotted for each of the multi-source data input variables (see Figure 11). For the raw TBM data, the linear correlation between T, RPM, and F is higher than their nonlinear correlation, while v is not highly correlated with any of the three. This can be attributed to the fact that v, as an actively driven parameter, is significantly influenced by human factors. For the rock fragmentation indices, the linear and nonlinear correlations between FPI, TPI, and WR are notably higher, attributable to the theoretically derived relationships between their calculation formulas. For the 3D spatial data, the correlation patterns are more complex, exhibiting significant linear and nonlinear dependencies not only among the three coordinates but also with the TBM data, the rock fragmentation indices, and the HSP data. This observation signifies that the spatial distribution exerts a notable influence on the TBM data. It can be attributed to the specific spatial distribution patterns of diverse rock masses, which are shaped by geological factors such as rock lithology and the presence of fissures; these spatial patterns, in turn, affect the variations in boring parameters and acoustic wave reflections. The correlation between the HSP data and the various TBM data, as well as the rock fragmentation indices, is not significant, which may be attributed to the distinct modes of the two data sources. However, this does not diminish the significance of their role in the prediction process.
In order to investigate the causal relationships between the multi-source features, they were analysed using the Direct-LiNGAM method (see Figure 12 for the causal network). Each node in the figure corresponds to a distinct data feature, with arrows denoting the direction of causality: the arrows originate from the cause and terminate at the effect. The value assigned to each arrow signifies the strength of the causal relationship, with positive values indicating positive effects and negative values indicating negative effects. It is important to note that causality is assessed against a threshold of 0.1; values less than 0.1 are generally considered to lack a significant causal relationship. Figure 13 illustrates the strength of the causal relationships between individual features, with the horizontal-axis variable acting as the cause and the vertical-axis variable denoting the effect. Blank intersections in Figure 13 indicate a non-significant causal relationship between two variables, while solid circles indicate that the horizontal-axis variable is a cause of the corresponding vertical-axis variable. The findings from Figure 12 and Figure 13 demonstrate that the features exhibiting the strongest causal relationships with surrounding rock grades among the three data sources are HSP, F, and Y. This indicates that, in terms of causality, these three features are the most significant contributors to surrounding rock grades within their respective data sources. Concurrently, the causal relationships between these three features and the other features are also relatively strong, and they correlate with each other as an intrinsic link between the multi-source data. Nevertheless, this does not imply that these three features have the greatest impact on the prediction of surrounding rock grades, a topic examined in Section 4.2.
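To make the Direct-LiNGAM idea concrete: in the true causal direction, the regression residual is independent of the cause, whereas in the reverse direction it is not. The sketch below scores that independence with a simple higher-order statistic; the full Direct-LiNGAM algorithm (available in the `lingam` Python package) relies on proper mutual-information estimates and iterates over all variables, so this two-variable version is only illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def causal_direction(x, y):
    """Pairwise causal-direction heuristic in the spirit of LiNGAM: fit both
    regression directions and prefer the one whose residual looks independent
    of the regressor, scored by |corr(cause^2, residual^2)| (lower = more
    independent)."""
    def dependence(cause, effect):
        b = np.cov(cause, effect)[0, 1] / np.var(cause, ddof=1)
        resid = effect - b * cause
        return abs(np.corrcoef(cause**2, resid**2)[0, 1])
    return "x->y" if dependence(x, y) < dependence(y, x) else "y->x"

# Synthetic check: x causes y, with non-Gaussian (uniform) disturbances,
# which is the setting LiNGAM-type methods require.
x = rng.uniform(-1, 1, 5000)
y = 2 * x + rng.uniform(-0.5, 0.5, 5000)
```

With the fixed seed above, the heuristic recovers x→y; on Gaussian data the two directions are indistinguishable, which is why LiNGAM assumes non-Gaussian disturbances.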

4. Results and Discussion

4.1. Training Process and Results

In this study, the TRNet model was utilised to predict the surrounding rock grades at the current tunnel face of the TBM. The model's input is a 12-dimensional feature vector derived from three distinct data sources (five columns of raw TBM data, three columns of rock fragmentation indices, three columns of spatial data, and one column of HSP data). The model's output is a two-dimensional vector representing the probabilities of the two rock classes (III and IV). The final dataset size was 12 × 7751. A three-segment split was applied to this dataset in order to prevent overfitting of the model and to remain consistent with actual tunnel excavation: the initial segment of the tunnel was designated as the training set, the middle segment as the validation set, and the final segment as the test set, in a ratio of 8:1:1. The batch size for each training iteration was set to 32. Training was regularised by early stopping, and the loss function was CrossEntropyLoss. In addition, a Bayesian optimisation algorithm was used to optimise the hyperparameters of the model [41]. The search range and optimal value of each hyperparameter are presented in Table 6.
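The three-segment split can be sketched as follows (plain numpy; the zero-filled placeholder stands in for the actual 7751 × 12 feature matrix):

```python
import numpy as np

def chainage_split(data, ratios=(0.8, 0.1, 0.1)):
    """Split sequentially along the tunnel alignment: initial segment for
    training, middle for validation, final segment for testing. Unlike a
    random split, this mirrors excavation, where the model predicts ground
    it has not yet reached."""
    n = len(data)
    i = int(n * ratios[0])
    j = i + int(n * ratios[1])
    return data[:i], data[i:j], data[j:]

X = np.zeros((7751, 12))          # placeholder for the 12-feature dataset
train_set, val_set, test_set = chainage_split(X)
```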
In order to verify the prediction performance of the TRNet model, the present study conducted 10 independent training runs and recorded, for each run, the accuracy, AUC, Kappa, number of training rounds, and average training duration on the test set. As demonstrated in Table 7, the mean prediction accuracy, AUC and Kappa of TRNet reached 92.15%, 95.90% and 74.56%, respectively. It is noteworthy that the average number of training rounds is only 13, with each run requiring an average of 6 s, which is highly efficient for a deep learning model. This suggests that TRNet achieves both computational efficiency and high prediction accuracy.
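For reference, the three reported metrics can be computed as follows (a self-contained numpy sketch; the tiny arrays are toy values, and tied scores in the AUC rank formula are not handled):

```python
import numpy as np

def accuracy(y_true, y_pred):
    return np.mean(y_true == y_pred)

def cohen_kappa(y_true, y_pred):
    """Kappa = (p_o - p_e) / (1 - p_e): agreement corrected for chance."""
    classes = np.unique(np.concatenate([y_true, y_pred]))
    p_o = np.mean(y_true == y_pred)
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    return (p_o - p_e) / (1 - p_e)

def binary_auc(y_true, scores):
    """AUC via the Mann-Whitney rank-sum identity (no tie handling)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example with two classes.
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
```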

4.2. Discussion

In order to provide further validation of the superiority of the TRNet model, the prediction performance of several common machine learning and deep learning models was compared on the test set. The models included in the comparison were XGBoost, KNN, LSTM, CNN and KAN. A uniform training strategy and architecture were used for all models, and the hyperparameters of each model were optimised by the Bayesian optimisation algorithm. The results, presented in Table 8, demonstrate that TRNet consistently achieves the highest scores across all three evaluation metrics. With regard to classification accuracy, TRNet attains 0.9214, representing an enhancement of 0.98%, 0.03%, and 1.84% over CNN, LSTM, and KAN, respectively. In terms of AUC, TRNet exhibits a substantial superiority over the comparison models with 0.9590, an advancement of 1.67 percentage points over the nearest competitor, CNN. TRNet also enhances the Kappa coefficient, a metric of classification consistency, by 4.13% and 2.49% in comparison to CNN and KAN, respectively. This suggests that TRNet can enhance the prediction accuracy of multi-source data, which may be because identifying the causal relationships between multi-source data improves the model's ability to perceive, and to explain, their intrinsic patterns, in turn improving the prediction effect. It is noteworthy that, among the deep learning models, CNN exhibits the prediction accuracy most similar to that of TRNet. However, CNN requires more than twice as many training rounds from the start of training to convergence, and its aggregate training time is considerably longer than that of TRNet, although its per-round training time is marginally lower.
In contrast, the two common machine learning models, KNN and XGBoost, do not necessitate a substantial number of training rounds and exhibit the shortest training time. However, their prediction accuracies are lower compared to the deep learning models, which may be due to the difficulty of traditional machine learning models in coping with noisy data, and their weak ability to perceive the intrinsic patterns of multimodal data due to the lack of mapping relationships in high-dimensional hidden layers.
To facilitate a more visual comparison of the model predictions, Figure 14 plots the scores of the various evaluation metrics. As shown in Figure 14a, TRNet and LSTM exhibit higher accuracy and Kappa scores, while the ROC curves in Figure 14b demonstrate TRNet's superior prediction capability. The confusion matrices in Figure 15 reveal discrepancies in classification outcomes, particularly for Class III and Class IV surrounding rock. Most misclassifications (excluding XGBoost) involve labelling Class IV rocks as Class III, highlighting variations in model tendencies. Detailed analysis shows that TRNet not only achieves the highest overall accuracy but also maintains balanced performance across rock classes. It significantly reduces the critical misclassification of Class IV rocks as Class III—minimising safety risks from insufficient support—while also exhibiting fewer instances of mislabelling Class III rocks as Class IV, thereby avoiding unnecessary construction costs. These findings underscore TRNet's effectiveness in handling asymmetric error costs, making it the most reliable choice for real-world applications where misclassification consequences are substantial. Its lowest total misclassifications further demonstrate its superiority in practical engineering scenarios.
To further demonstrate the superiority of the fusion strategy combining the various network modules, we conducted ablation experiments on TRNet. With TRNet as the benchmark, the prediction performance of the model was compared after removing each network module in turn. The findings, presented in Table 9, show that TRNet attains the best prediction performance, while the accuracy of the model declines after the removal of each network module. The relative importance of the network modules to the prediction ability of TRNet is as follows: Direct-LiNGAM > CNN > KAN. Each network module incorporated into TRNet therefore contributes positively to its prediction ability, validating the efficacy of integrating multi-source feature information and of considering the intrinsic connections between the data. Furthermore, the incorporation of Direct-LiNGAM to construct the local causality matrix is the most significant contributor to the model's prediction capability, signifying that considering causality between multi-source data enhances both the perception of the intrinsic patterns within the data and the capacity to explain the model's behaviour.
In order to verify the effectiveness of the causal fusion mechanism of the TRNet model, this study visually analyses the dynamic changes of the causal structure vector during training. As shown in Figure 16, the mean causal weight of each feature dimension stabilises after the first round of training and gradually converges to a stable interval. This shows that TRNet can effectively learn the latent causal relationships between multi-source features through iterative optimisation, ultimately achieving optimal weight regulation through the gradient descent algorithm. Figure 17 further shows the evolution, over training rounds, of the standard deviation of each feature after regulation by the local causality matrix; a similar stabilisation occurs after the first round of training. TRNet therefore not only achieves balanced regulation at the level of weight distribution but also maintains variance consistency at the level of feature representation.
In order to further verify whether possible multicollinearity of the multi-source data affects model prediction, we tested the robustness of the TRNet model to multicollinearity among the input features. We calculated the variance inflation factor (VIF) of each input feature and tested the model prediction results after removing these features one by one. The left panel of Figure 18 shows each high-VIF feature, and the right panel shows the influence of eliminating these features on the prediction accuracy of the model (averaged over 10 training runs). We find that removing any feature reduces the prediction accuracy of the model. For X, Y and Z, whose VIF exceeds 10, the accuracy of the model decreases sharply after removal. The reason may be that the spatial coordinate data carries spatial information related to the distribution of the surrounding rock state. At the same time, this shows that TRNet handles input features with multicollinearity well. Figure 19 shows the changes in accuracy, AUC and Kappa after removing each input feature.
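The VIF computation itself is a per-feature auxiliary regression; a minimal numpy version (not the study's implementation, and with toy data) is:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on all remaining columns plus an intercept. VIF > 10 is the usual flag
    for serious multicollinearity."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ beta
        out[j] = 1.0 / (resid.var() / X[:, j].var())   # = 1 / (1 - R^2)
    return out

# Toy data: columns 0 and 2 nearly collinear, column 1 independent.
rng = np.random.default_rng(3)
a, b = rng.normal(size=500), rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)
v = vif(np.column_stack([a, b, c]))
```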
In order to further evaluate the effectiveness of feature selection and the necessity of dimensionality reduction, principal component analysis (PCA) was performed on the multi-source heterogeneous data to address the high-variance-inflation-factor features, and comparative experiments were conducted. Figure 20 illustrates the PCA results and their impact on model performance. The bar chart in the top left shows that the first two principal components explain approximately 35% and 25% of the total variance, respectively, indicating that they capture most of the data variability. The top-right curve demonstrates that the cumulative explained variance rapidly approaches 1, suggesting that only a few principal components are sufficient to retain the major information. The bottom-left bar chart compares the performance of the original-feature model and the PCA-based model in terms of accuracy, AUC, and Kappa. The bottom-right bar chart shows the reduction in feature dimensions from 12 to 7. The findings indicate that the performance of the model after PCA dimensionality reduction is comparable to that of the baseline model in accuracy and AUC, though Kappa exhibits a slight decrease, which may be attributable to the loss of classification-related information during dimensionality reduction. Given the current multi-source feature set, dimensionality reduction is therefore not necessary, which further confirms the validity and effectiveness of the features selected in this study.
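The explained-variance analysis underlying Figure 20 amounts to an SVD of the centred feature matrix; a compact sketch (with synthetic data standing in for the real 12 features) is:

```python
import numpy as np

def explained_variance_ratio(X):
    """Per-component explained-variance ratio via SVD of centred data."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, full_matrices=False, compute_uv=False)
    return s**2 / np.sum(s**2)

# Synthetic 12-feature data driven by 3 latent factors plus small noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 12)) + 0.05 * rng.normal(size=(500, 12))
ratio = explained_variance_ratio(X)
k = int(np.searchsorted(np.cumsum(ratio), 0.95)) + 1   # components for 95% variance
```

Retaining the first k components that reach a chosen variance threshold is the criterion behind reducing 12 features to 7 in the comparative experiment.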
In order to further enhance the interpretability of the TRNet model, the SHAP method is employed to investigate the influence of the selected features on the prediction results and to quantify the degree of influence. SHAP is a technique for interpreting the output of machine learning models [42]: SHAP values assess the importance of a feature relative to the other features in the model and can be used to quantify the contribution of each feature to the predicted output. As illustrated in Figure 21, the HSP data contributed the most to the prediction of surrounding rock grades, followed by the 3D spatial data, and finally the rock fragmentation indices and TBM data. This finding broadly aligns with the results of the causal analysis, though some discrepancies remain. Firstly, it demonstrates that horizontal seismic profiling (HSP) can effectively identify the fracturing and integrity of the rock mass. Secondly, it shows that the rock condition is also affected by the spatial distribution to a certain extent, because the lithology and occurrence of different rocks follow certain spatial distribution patterns. Thirdly, for TBM data, it is usually necessary to reflect rock fragmentation indirectly through the calculation of rock fragmentation indices (such as TPI and FPI). Among the various boring parameters, the variation of cutterhead speed (RPM) is frequently more significant for predicting surrounding rock grades. Consequently, integrating the outcomes of the causal analysis and the SHAP method demonstrates that all three data sources contribute to the prediction of surrounding rock grades to a certain extent; particular attention should be directed towards the HSP data, the 3D spatial data and the rock fragmentation indicators (FPI, TPI).
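For intuition about what a SHAP value is, the linear-model case has a closed form: with independent features and f(x) = w·x + b, the SHAP value of feature j for sample i is φ_ij = w_j (x_ij − x̄_j). The sketch below (a toy model, not TRNet) verifies the defining additivity property and ranks features by mean |φ|, which is how summary plots such as Figure 21 order features; for a real network one would use the shap package instead:

```python
import numpy as np

def linear_shap(w, X):
    """Exact SHAP values for a linear model with independent features:
    phi_ij = w_j * (x_ij - mean_j)."""
    return w * (X - X.mean(axis=0))

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
w = np.array([3.0, 0.0, 0.5])            # toy weights: feature 0 dominates
phi = linear_shap(w, X)
importance = np.abs(phi).mean(axis=0)    # mean |phi| per feature
ranking = importance.argsort()[::-1]     # most important feature first
```

By construction, the SHAP values of each sample sum to the model output minus its mean output, which is the additivity property that makes per-feature contributions comparable.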

5. Conclusions

In this paper, a novel hybrid neural network model, TRNet, based on multi-source feature fusion, is proposed. The model is built on real engineering data from a tunnel in western Sichuan, China, with the aim of predicting, in real time, the surrounding rock grades in front of the current tunnel face of a TBM. The model incorporates the joint influence of multimodal data on the rock state and the causal relationships between multi-source data, and consists of multiple network modules, enabling efficient and accurate surrounding rock grade prediction. The model and other prevalent algorithms are evaluated by a series of statistical metrics, including accuracy, AUC (area under the curve) and the Kappa coefficient, and the model is then analysed for interpretability using SHAP. The following conclusions were drawn:
(1) On the test set, the mean accuracy, AUC and Kappa coefficient of TRNet are 92.15%, 95.90% and 74.56%, respectively. These three evaluation metrics are on average 2.26% higher than those of CNN, 2.66% higher than those of LSTM, 2.50% higher than those of KAN, 16.34% higher than those of KNN and 7.85% higher than those of XGBoost. Moreover, the TRNet model demonstrates the lowest number of misclassifications, substantiating its efficacy in predicting surrounding rock grades. (2) The ablation experiments demonstrate that the multi-network-module fusion strategy adopted by TRNet is an effective way to enhance the prediction ability of the model. The relative importance of the network modules to the prediction ability of TRNet is as follows: Direct-LiNGAM > CNN > KAN. (3) The interpretability analysis of TRNet using the SHAP method shows that the HSP data contributes the most to surrounding rock grade prediction, followed by the 3D spatial data, and lastly the rock fragmentation indicators (TPI, FPI) and TBM data. Among several common boring parameters, the variation of cutterhead torque (T) tends to be more important for surrounding rock grade prediction.
Although TRNet has demonstrated certain advantages in the prediction of TBM surrounding rock grades, there remains room for improvement. Firstly, in addition to the three data sources already employed, incorporating geological data and other geotechnical parameters is recommended to enhance the accuracy of surrounding rock grade prediction under various geological conditions. Secondly, there is potential to raise the level of automation in the interpretation of geophysical exploration data, with attempts to perform automatic feature extraction and modelling analysis through intelligent methods, which could further reduce labour costs. The exploration of more intelligent fusion analysis methods for multimodal data will also be undertaken.

Author Contributions

Conceptualisation: Y.H., X.H., S.P. and W.H.; methodology: Y.H., X.H. and S.P.; validation: Y.H., X.H., S.P. and W.F.; formal analysis: W.F., S.C. and B.G.; investigation: S.C. and B.G.; resources: Y.H. and W.F.; data curation: S.C. and B.G.; writing—original draft preparation: Y.H. and S.P.; writing—review and editing: Y.H., S.P. and W.H.; visualisation: S.P.; supervision: W.F. and W.H.; project administration: S.C. and B.G.; funding acquisition: Y.H. and W.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Tibet Autonomous Region Science and Technology Plan Project—Research and Application of Key Technologies for Dynamic Prevention and Control of Tunnel Disasters in Complex Geological Conditions on the Plateau (XZ202501ZY0108), the Major Special Project for Deep Underground Space Development and Utilization of China Railway Construction Corporation Limited, “Long-distance Hole Surrounding Detection Technology and Equipment Based on Deep Holes” (2024-SD02), and the China Railway Construction Corporation Major Project “Research and Development of Key Technologies and Equipment for Intelligent Survey of Railway Engineering” (2023-Z03).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in this article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to express their gratitude to all the staff who participated in the project.

Conflicts of Interest

Authors Wei Fu, Shuaipeng Chang and Bin Gao were employed by the company China Railway First Survey and Design Institute Group Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Li, X.; Wu, L.J.; Wang, Y.J.; Li, J.H. Rock fragmentation indexes reflecting rock mass quality based on real-time data of TBM tunnelling. Sci. Rep. 2023, 13, 10420. [Google Scholar] [CrossRef] [PubMed]
  2. Liu, J.; Liang, F.; Wei, K.; Zuo, C. Prediction Model for Cutterhead Rotation Speed Based on Dimensional Analysis and Elastic Net Regression. Appl. Sci. 2025, 15, 1298. [Google Scholar] [CrossRef]
  3. Wu, Z.; Huo, D.; Chu, Z.; Liu, X.; Weng, L.; Xu, X.; Li, Z. Advanced Identification Method for Adverse Geological Conditions in TBM Tunnels Based on Stacking Ensemble Algorithm and Bayesian Theory. Tunn. Undergr. Space Technol. 2025, 163, 106741. [Google Scholar] [CrossRef]
  4. Shi, K.; Shi, R.; Fu, T.; Lu, Z.; Zhang, J. A Novel Identification Approach Using RFECV–Optuna–XGBoost for Assessing Surrounding Rock Grade of Tunnel Boring Machine Based on Tunneling Parameters. Appl. Sci. 2024, 14, 2347. [Google Scholar] [CrossRef]
  5. Landar, S.; Velychkovych, A.; Ropyak, L.; Andrusyak, A. A Method for Applying the Use of a Smart 4 Controller for the Assessment of Drill String Bottom-Part Vibrations and Shock Loads. Vibration 2024, 7, 802–828. [Google Scholar] [CrossRef]
  6. Gong, Q.M.; Liu, Q.S.; Zhang, Q.B. Tunnel boring machines (TBMs) in difficult grounds. Tunn. Undergr. Space Technol. 2016, 57, 1–3. [Google Scholar] [CrossRef]
  7. Gong, Q.M.; Yin, L.J.; Ma, H.S.; Zhao, J. TBM tunnelling under adverse geological conditions: An overview. Tunn. Undergr. Space Technol. 2016, 57, 4–17. [Google Scholar] [CrossRef]
  8. Rostami, J. Performance prediction of hard rock Tunnel Boring Machines (TBMs) in difficult ground. Tunn. Undergr. Space Technol. 2016, 57, 173–182. [Google Scholar] [CrossRef]
  9. Zheng, Y.L.; Zhang, Q.B.; Zhao, J. Challenges and opportunities of using tunnel boring machines in mining. Tunn. Undergr. Space Technol. 2016, 57, 287–299. [Google Scholar] [CrossRef]
  10. Jancsecz, S.; Steiner, W. Face support for a large Mix-Shield in heterogeneous ground conditions. In Tunnelling and Underground Space Technology; Springer: Boston, MA, USA, 1994; pp. 531–550. [Google Scholar]
  11. Davis, E.H.; Gunn, M.J.; Mair, R.J.; Seneviratine, H.N. The Stability of Shallow Tunnels and Underground Openings in Cohesive Material. Géotechnique 1980, 30, 397–416. [Google Scholar] [CrossRef]
  12. Broere, W. Tunnel Face Stability and New CPT Applications. Ph.D. Thesis, Delft University of Technology, Delft, The Netherlands, 2001. [Google Scholar]
  13. Jia, P.; Tang, C.A. Numerical Study on Failure Mechanism of Tunnel in Jointed Rock Mass. Tunn. Undergr. Space Technol. 2008, 23, 500–507. [Google Scholar] [CrossRef]
  14. Zhang, C.; Zhou, H.; Feng, X.-T. An Index for Estimating The Stability of Brittle Surrounding Rock Mass: FAI and Its Engineering Application. Rock Mech. Rock Eng. 2011, 44, 401–411. [Google Scholar] [CrossRef]
  15. Xia, Y.; Tian, Y.-C.; Tan, Q.; Hou, Y.-M. Side Force Formation Mechanism and Change Law of TBM Center Cutter. J. Cent. South Univ. 2016, 23, 1115–1122. [Google Scholar] [CrossRef]
  16. Li, J.-B.; Chen, Z.-Y.; Li, X.; Jing, L.-J.; Zhang, Y.-P.; Xiao, H.-H.; Wang, S.-J.; Yang, W.-K.; Wu, L.-J.; Li, P.-Y.; et al. Feedback on a Shared Big Dataset for Intelligent TBM Part II: Application and Forward Look. Undergr. Space 2023, 11, 26–45. [Google Scholar] [CrossRef]
  17. Sioutas, K.N.; Benardos, A. Boosting Model Interpretability for Transparent ML in TBM Tunneling. Appl. Sci. 2024, 14, 11394. [Google Scholar] [CrossRef]
  18. Liu, B.; Wang, R.; Guan, Z.; Li, J.; Xu, Z.; Guo, X.; Wang, Y. Improved support vector regression models for predicting rock mass parameters using tunnel boring machine driving data. Tunn. Undergr. Space Technol. 2019, 91, 102958. [Google Scholar] [CrossRef]
  19. Liu, Q.; Xinyu, W.; Huang, X.; Yin, X. Prediction model of rock mass class using classification and regression tree integrated AdaBoost algorithm based on TBM driving data. Tunn. Undergr. Space Technol. 2020, 106, 103595. [Google Scholar] [CrossRef]
  20. Hou, S.; Liu, Y.; Yang, Q. Real-time Prediction of Rock Mass Classification Based on TBM Operation Big Data and Stacking Technique of Ensemble Learning. J. Rock Mech. Geotech. Eng. 2022, 14, 123–143. [Google Scholar] [CrossRef]
  21. Cheng, X.; Tang, H.; Wu, Z.; Liang, D.; Xie, Y. BILSTM-Based Deep Neural Network for Rock-Mass Classification Prediction Using Depth-Sequence MWD Data: A Case Study of a Tunnel in Yunnan, China. Appl. Sci. 2023, 13, 6050. [Google Scholar] [CrossRef]
  22. Yu, H.; Tao, J.; Qin, C.; Xiao, D.; Sun, H.; Liu, C. Rock mass type prediction for tunnel boring machine using a novel semi-supervised method. Measurement 2021, 179, 109545. [Google Scholar] [CrossRef]
  23. Song, Y.; Feng, Y.; Wang, W.; Fan, Y.; Wu, Y.; Lv, Z. Data-driven prediction framework of surrounding rock pressure in a fully mechanized coal face with temporal-spatial correlation. Sci. Rep. 2024, 14, 28476. [Google Scholar] [CrossRef] [PubMed]
  24. Feng, S.; Chen, Z.; Luo, H.; Wang, S.; Zhao, Y.; Liu, L.; Ling, D.; Jing, L. Tunnel boring machines (TBM) performance prediction: A case study using big data and deep learning. Tunn. Undergr. Space Technol. 2021, 110, 103636. [Google Scholar] [CrossRef]
25. Liu, B.; Wang, R.; Zhao, G.; Guo, X.; Wang, Y.; Li, J.; Wang, S. Prediction of rock mass parameters in the TBM tunnel based on BP neural network integrated simulated annealing algorithm. Tunn. Undergr. Space Technol. 2020, 95, 103103. [Google Scholar] [CrossRef]
  26. Qiao, W.; Zhao, Y.; Xu, Y.; Lei, Y.; Wang, Y.; Yu, S.; Li, H. Deep learning-based pixel-level rock fragment recognition during tunnel excavation using instance segmentation model. Tunn. Undergr. Space Technol. 2021, 115, 104072. [Google Scholar] [CrossRef]
  27. He, Y.; Chen, Q. Construction and Application of LSTM-Based Prediction Model for Tunnel Surrounding Rock Deformation. Sustainability 2023, 15, 6877. [Google Scholar] [CrossRef]
  28. Wang, K.; Zhang, L.; Fu, X. Time series prediction of tunnel boring machine (TBM) performance during excavation using causal explainable artificial intelligence (CX-AI). Autom. Constr. 2023, 147, 104730. [Google Scholar] [CrossRef]
  29. Tianjiao, Z. Flow Measurement of Natural Gas in Pipeline Based on 1D-Convolutional Neural Network. Int. J. Comput. Intell. Syst. 2020, 13, 1198–1206. [Google Scholar]
  30. Tan, D.G.; Yuan, Y.P.; Fan, P.P. CNN-GRU Based Health Assessment of Mining Electric Motor Using Adaptive Multi-Scale Attention Mechanism. Ind. Mine Autom. 2024, 50, 138–146. [Google Scholar]
  31. Shimizu, S.; Inazumi, T.; Sogawa, Y.; Hyvarinen, A.; Kawahara, Y.; Washio, T.; Hoyer, P.O.; Bollen, K. DirectLiNGAM: A Direct Method for Learning a Linear Non-Gaussian Structural Equation Model. J. Mach. Learn. Res. 2011, 12, 1225–1248. [Google Scholar]
  32. Ahola, J. The Self-Organizing Map as a Tool in Knowledge Engineering. In Pattern Recognition in Soft Computing Paradigm; World Scientific Publishing: Singapore, 2001. [Google Scholar]
  33. Drton, M.; Maathuis, M.H. Structure learning in graphical modeling. Annu. Rev. Stat. Appl. 2017, 4, 365–393. [Google Scholar] [CrossRef]
  34. Huang, X.; Zhang, Q.; Liu, Q.; Liu, X.; Liu, B.; Wang, J.; Yin, X. A real-time prediction method for tunnel boring machine cutter-head torque using bidirectional long short-term memory networks optimized by multi-algorithm. J. Rock Mech. Geotech. Eng. 2022, 14, 798–812. [Google Scholar] [CrossRef]
  35. Erharter, G.H.; Marcher, T. On the pointlessness of machine learning based time delayed prediction of TBM operational data. Autom. Constr. 2021, 121, 103443. [Google Scholar] [CrossRef]
  36. Pang, S.; Hua, W.; Fu, W.; Liu, X.; Ni, X. Multivariable real-time prediction method of tunnel boring machine operating parameters based on spatio-temporal feature fusion. Adv. Eng. Inform. 2024, 62, 102924. [Google Scholar] [CrossRef]
  37. Godio, A.; Strobbia, C.; De Bacco, G. Geophysical characterisation of a rockslide in an alpine region. Eng. Geol. 2006, 83, 273–286. [Google Scholar] [CrossRef]
  38. Seger, C. An Investigation of Categorical Variable Encoding Techniques in Machine Learning: Binary Versus One-Hot and Feature Hashing. Ph.D. Thesis, KTH, School of Electrical Engineering and Computer Science (EECS), Stockholm, Sweden, 2018. [Google Scholar]
  39. Cohen, I.; Huang, Y.; Chen, J.; Benesty, J. Pearson correlation coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4. [Google Scholar]
  40. Kinney, J.B.; Atwal, G.S. Equitability, mutual information, and the maximal information coefficient. Proc. Natl. Acad. Sci. USA 2014, 111, 3354–3359. [Google Scholar] [CrossRef]
  41. Joy, T.T.; Rana, S.; Gupta, S.; Venkatesh, S. Hyperparameter tuning for big data using Bayesian optimisation. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 2574–2579. [Google Scholar]
  42. Li, L.; Liu, Z.; Shen, J.; Wang, F.; Qi, W.; Jeon, S. A LightGBM-based strategy to predict tunnel rockmass class from TBM construction data for building control. Adv. Eng. Inform. 2023, 58, 102130. [Google Scholar] [CrossRef]
Figure 1. Overall flowchart.
Figure 2. Schematic diagram of the basic CNN architecture.
Figure 3. Schematic diagram of KAN network architecture.
Figure 4. Local causality embedding strategy.
Figure 5. Schematic diagram of TRNet model architecture.
Figure 6. Distribution of major lithologies in the project’s tunnels.
Figure 7. Description based on the flowchart components.
Figure 8. 3D slice map from advance geological prediction.
Figure 9. Comparison of the distribution of reflection intensity classes after the introduction of random noise. (a) Original data; (b) noisy data.
Figure 10. Violin diagram of TBM parameter input variables.
Figure 11. PLCC and MIC heat map matrix for multi-source data.
Figure 12. Multi-source feature causal network diagram.
Figure 13. Multi-source feature causal relationship adjacency matrix.
Figure 14. Evaluation scores and ROC curves. (a) Evaluation scores for various classifiers; (b) ROC curves for various classifiers.
Figure 15. Confusion matrix of six algorithmic classifiers on the test set. (a) TRNet; (b) CNN; (c) LSTM; (d) KAN; (e) KNN; (f) XGBoost.
Figure 16. Training dynamics of causal weight means.
Figure 17. Training dynamics of feature representation consistency.
Figure 18. The influence of high-VIF feature recognition and elimination on prediction accuracy.
Figure 19. Index changes after eliminating different input features.
Figure 20. The PCA results and impact on model performance.
Figure 21. SHAP-based ranking of importance of multi-source features.
Table 1. TRNet model components with their parameters and descriptions.
Component | Parameters | Description
Input Layer | N/A | Concatenation of the input vector and local causal features along the feature dimension
1D Conv | in_channels = input_size, out_channels = d_model/2, kernel_size = 3, padding = 1 | 1D convolutional layer for encoding, mapping from input_size to d_model/2
ReLU | N/A | Activation layer
1D Conv | in_channels = d_model/2, out_channels = d_model, kernel_size = 3, padding = 1 | 1D convolutional layer for encoding, mapping from d_model/2 to d_model
ReLU | N/A | Activation layer
KAN | [d_model, d_model/2, d_model/2, d_model/4] | KAN module with layer widths d_model, d_model/2, d_model/2 and d_model/4
Dropout | rate = 0.15 | Dropout layer
Output layer | d_model/4, num_classes | Fully connected layer
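The component stack in Table 1 can be read as a single forward pass. The following NumPy sketch mirrors only the layer shapes under stated assumptions: the KAN module is replaced by a plain MLP with the widths listed in Table 1 (the spline parametrization is not specified there), mean pooling over the mileage axis is a guess at how the sequence is reduced, and dropout is omitted at inference. It is not the authors' implementation.

```python
import numpy as np

def conv1d(x, w, b, padding=1):
    """Minimal 1D convolution: x (C_in, L), w (C_out, C_in, K) -> (C_out, L)."""
    c_out, c_in, k = w.shape
    xp = np.pad(x, ((0, 0), (padding, padding)))
    out = np.empty((c_out, x.shape[1]))
    for i in range(x.shape[1]):
        # dot each kernel with the padded window centred at position i
        out[:, i] = np.tensordot(w, xp[:, i:i + k], axes=([1, 2], [0, 1])) + b
    return out

def relu(x):
    return np.maximum(x, 0.0)

def trnet_forward(x, params):
    """Shape-level sketch of Table 1: Conv1d -> ReLU -> Conv1d -> ReLU ->
    MLP stand-in for the KAN block -> linear output (one logit per grade)."""
    h = relu(conv1d(x, params["w1"], params["b1"]))   # input_size -> d_model/2
    h = relu(conv1d(h, params["w2"], params["b2"]))   # d_model/2 -> d_model
    h = h.mean(axis=1)                                # pool over the mileage axis
    for w, b in params["kan_mlp"]:                    # d_model -> d/2 -> d/2 -> d/4
        h = relu(w @ h + b)
    return params["w_out"] @ h + params["b_out"]      # dropout omitted at inference
```

With `input_size = 5`, `d_model = 8` and four surrounding-rock classes, the output is a length-4 logit vector; the parameter shapes follow directly from the table rows.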
Table 2. Tunnel surrounding rock grades statistics.
Surrounding Rock Rank | Length/m | Proportion/%
II | 4100 | 10.8
III | 24,070 | 63.4
IV | 7973 | 21.0
V | 1822 | 4.8
Table 3. Main parameters of the HSP217 advance geological prediction instrument.
Name | Unit | Range
A/D Conversion | bit | 24
Channel | N/A | 8
Sampling Speed | μs | 0.1–1 × 10^6
Record Length | K | 1–23
Dynamic Scope | V | ±10
Table 4. One-hot encoding of reflected energy categories.
Reflection Anomaly | Rank | Code
None | I | [1, 0, 0, 0, 0]
Weak | II | [0, 1, 0, 0, 0]
Obvious | III | [0, 0, 1, 0, 0]
Relatively Strong | IV | [0, 0, 0, 1, 0]
Strong | V | [0, 0, 0, 0, 1]
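As a minimal illustration of the encoding in Table 4 (assuming the five reflection-energy categories graded I–V, so each code is a five-element unit vector), a rank maps to its one-hot code as follows:

```python
RANKS = ["I", "II", "III", "IV", "V"]  # reflection-energy grades from Table 4

def one_hot(rank):
    """Encode a reflection-energy rank as a one-hot vector."""
    vec = [0] * len(RANKS)
    vec[RANKS.index(rank)] = 1
    return vec
```

For example, `one_hot("III")` yields `[0, 0, 1, 0, 0]`, matching the "Obvious" row.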
Table 5. Basic statistical information on TBM parameters.
Parameters | Min | Median | Max | Mean | Std
P | 2.607108 | 6.442665 | 5.94638 | 7.635982 | 5.634877
T | 1536.277 | 3680.723 | 5613.279 | 3656.688 | 701.2206
RPM | 2.239389 | 4.054824 | 5.258368 | 4.060949 | 0.7243764
F | 14,476.57 | 23,237.92 | 31,188.26 | 23,230.56 | 4178.51
V | 7.954941 | 26.01329 | 290.4988 | 30.78468 | 25.78844
FPI | 5.44542 | 52.55251 | 130.6987 | 56.02042 | 23.09044
TPI | 0.073619 | 0.81829 | 1.506045 | 0.82678 | 0.274542
WR | 12.9168 | 158.0774 | 352.7145 | 153.7523 | 41.84008
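Each row of Table 5 is a standard descriptive summary of one tunnelling parameter over the cleaned mileage sections. A sketch of how such a row could be computed (the function name and sample-standard-deviation choice are illustrative assumptions, not the authors' code):

```python
import numpy as np

def summarize(values):
    """Min/median/max/mean/std row for one tunnelling parameter."""
    v = np.asarray(values, dtype=float)
    return {
        "min": v.min(),
        "median": np.median(v),
        "max": v.max(),
        "mean": v.mean(),
        "std": v.std(ddof=1),  # sample standard deviation
    }
```

Applying `summarize` to, e.g., the thrust readings of one mileage section yields the statistics that are then stacked into the composite feature vector.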
Table 6. Optimal hyperparameters and search ranges.
Hyperparameters | Search Area | Optimal Hyperparameters
The dimension of hidden layer | 32, 64, 128, 512 | 64
The dimension of KAN | 16, 32, 64, 128, 512 | [64, 32, 16]
Batch size | 16, 32, 64, 128, 256 | 32
Dropout | 0.01, 0.05, 0.1, 0.2 | 0.05
Patience | 5, 10, 20 | 10
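The search areas in Table 6 define a small discrete space. The paper tunes it with Bayesian optimisation [41]; the sketch below substitutes a simpler random search over the same ranges so the mechanics are visible (`evaluate` is a hypothetical objective returning a validation score — not part of the original work):

```python
import random

SEARCH_SPACE = {
    "hidden_dim": [32, 64, 128, 512],
    "kan_dim": [16, 32, 64, 128, 512],
    "batch_size": [16, 32, 64, 128, 256],
    "dropout": [0.01, 0.05, 0.1, 0.2],
    "patience": [5, 10, 20],
}

def random_search(evaluate, n_trials=20, seed=0):
    """Sample configurations from SEARCH_SPACE and keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        score = evaluate(cfg)   # e.g. validation accuracy of a trained TRNet
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

Bayesian optimisation differs only in how the next `cfg` is proposed (from a surrogate model rather than uniformly), which matters when each evaluation is an expensive training run.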
Table 7. Predictive performance of TRNet on the test set.
Serial | Accuracy | AUC | Kappa | Epoch | Train Duration/s
1 | 0.92029 | 0.956131 | 0.741277 | 12 | 5.960255
2 | 0.92029 | 0.944145 | 0.741277 | 11 | 5.961756
3 | 0.921739 | 0.959131 | 0.746631 | 13 | 5.96497
4 | 0.921739 | 0.968798 | 0.746631 | 12 | 6.123185
5 | 0.921739 | 0.948734 | 0.746631 | 13 | 6.173523
6 | 0.921739 | 0.941898 | 0.746631 | 14 | 6.383748
7 | 0.921739 | 0.966647 | 0.746631 | 15 | 5.926478
8 | 0.921739 | 0.969336 | 0.746631 | 14 | 5.991457
9 | 0.921739 | 0.963146 | 0.746631 | 13 | 6.140293
10 | 0.921739 | 0.971977 | 0.746631 | 13 | 5.935721
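As a quick arithmetic check, the ten repetitions in Table 7 (two runs at 0.92029, eight at 0.921739) average to the aggregate TRNet accuracy of roughly 92.14–92.15% reported elsewhere in the paper:

```python
# Accuracy values copied from Table 7: runs 1-2, then runs 3-10.
accuracies = [0.92029, 0.92029] + [0.921739] * 8
mean_acc = sum(accuracies) / len(accuracies)
print(f"mean accuracy = {mean_acc:.2%}")  # prints "mean accuracy = 92.14%"
```

The small spread across runs (all within 0.15 percentage points) is what makes the single mean figure a fair summary.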
Table 8. Comparison of prediction performance of different algorithms on the test set.
Method | Accuracy | AUC | Kappa | Epoch | Train Duration/s
TRNet | 0.92144275 | 0.95899427 | 0.74555987 | 12 | 6.0561
CNN | 0.91159420 | 0.94224364 | 0.70426115 | 28 | 5.3334
LSTM | 0.92113913 | 0.88156809 | 0.74363058 | 16 | 5.3498
KAN | 0.90304347 | 0.92731085 | 0.72062828 | 12 | 5.8756
KNN | 0.8636 | 0.7298 | 0.5424 | N/A | 0.01
XGBoost | 0.8563 | 0.8823 | 0.6518 | N/A | 0.15
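The accuracy and Cohen's kappa columns of Table 8 both derive from the classifiers' confusion matrices (Figure 15). A self-contained sketch of the two metrics, with kappa written as the usual chance-corrected agreement (p_o − p_e)/(1 − p_e):

```python
import numpy as np

def accuracy(cm):
    """Fraction of correct predictions from a confusion matrix (rows = true class)."""
    return np.trace(cm) / cm.sum()

def cohens_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_o = np.trace(cm) / n                          # observed agreement
    p_e = (cm.sum(axis=0) @ cm.sum(axis=1)) / n**2  # chance agreement from marginals
    return (p_o - p_e) / (1 - p_e)
```

This explains why KNN's kappa (0.5424) drops far more than its accuracy (0.8636): with the grade distribution of Table 2 dominated by class III, chance agreement p_e is high, so kappa penalises majority-class guessing.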
Table 9. Ablation study results of the TRNet model on the test set.
Method | Accuracy | AUC | Kappa
TRNet | 0.9215 | 0.9590 | 0.7456
TRNet-LiNGAM 1 | 0.9072 | 0.95753 | 0.6748
TRNet-CNN 2 | 0.9174 | 0.8198 | 0.7346
TRNet-KAN 3 | 0.8957 | 0.9461 | 0.6459
1 ‘-LiNGAM’ represents the removal of the local causality matrix, which is replaced with the original feature vector. 2 ‘-CNN’ removes the CNN layer of TRNet. 3 ‘-KAN’ removes the KAN layer of TRNet.
Share and Cite

MDPI and ACS Style

Huang, Y.; Hu, X.; Pang, S.; Fu, W.; Chang, S.; Gao, B.; Hua, W. TBM Enclosure Rock Grade Prediction Method Based on Multi-Source Feature Fusion. Appl. Sci. 2025, 15, 6684. https://doi.org/10.3390/app15126684

