Stability Prediction of Soil Slopes Based on Digital Twinning and Deep Learning

: This paper proposes a slope stability prediction model based on deep learning and digital twinning methods. To establish a reliable slope database, 30 actual slopes were collected, and 100 digital twin (DT) models were generated for each actual slope by ﬁne-tuning the slope proﬁles. The safety factors of all slope samples were calculated using the Limit Equilibrium Methods (LEMs)


Introduction
Landslide disasters are becoming more frequent with the continued growth of the world's population, the gradual expansion of human activities, and the continuing increase in the scale of engineering activities, coupled with other influencing factors such as global climate change.Landslides often cause traffic jams, river blockages, and disruption of civil engineering works, posing a serious threat to the safety of people's lives and causing significant property losses.Slope stability analysis plays a key role in the prevention of slope disasters, which is an important basis for decision-making in slope design, reinforcement, and treatment as well as a long-term research topic in engineering and science.
Many methods for slope stability analysis have been developed successively in recent decades, such as engineering geological analysis methods [1], analogy methods [2], LEMs [3], numerical analysis methods [4], genetic algorithms [5][6][7], machine learning (ML) methods, etc.Among them, numerical analysis methods have been widely studied and applied [8][9][10][11][12][13][14][15][16], but they still face many problems.Especially for slopes with complex soil layers, it is usually necessary to simplify the finite element modeling.In addition, because soil layers are geometrically complex, it is difficult to mesh the slopes along the soil layers, and the modeling workload is very large.Therefore, LEMs are still the basic method of slope stability analysis, which mainly analyzes the static equilibrium of unstable soil blocks and evaluates the stability of a slope according to the Mohr-Coulomb strength criterion.The LEMs mainly include the Spencer method, the Janbu method, the Bishop method, and the unbalanced thrust method.The LEMs are easy to implement and have been tested in engineering, so they are still the most widely used methods.Chen et al. [17] proposed a simplified three-dimensional slope stability analysis method based on LEMs.Zhou et al. [18] established a rigorous limit equilibrium column method considering the intercolumn force based on six equilibrium conditions.Tang et al. [19] established the stability diagram of the homogeneous isotropic slope by using the LEMs and the strength reduction method.Song et al. [20] proposed a method to evaluate the stability of reinforced slopes based on LEMs.
However, the traditional LEMs of calculating the slope safety factor require engineers to draw slope profiles (including the geometric parameters of each soil layer) according to field exploration data and then input the slope profiles and the material parameters of the soil layers into engineering software to calculate the slope safety factor.Or, in the finite element software, the strength reduction method can be used to calculate the slope safety factor.However, the above methods require in-depth professional skills and knowledge for slope inspectors.With the development of artificial intelligence (AI), ML is becoming popular in slope stability analysis.Das et al. [21] established various artificial neural networks (ANNs) to predict the slope safety factor.Compared with support vector machines (SVM) and genetic programming (GP), ANNs are more efficient.Erzin et al. [22] used ANN and multiple regression (MR) models to predict the critical safety factor of homogeneous slopes subjected to earthquake forces.Rahul et al. [23] used ANNs to predict the safety factor of dump slopes in a coal mine.Suman et al. [24] used functional networks (FNs), multivariate adaptive regression splines (MARS), and multigene genetic programming (MGGP) to predict the safety factor of slopes and give the prediction model equation.Chakraborty et al. [25] used multiple linear regression (MLR) and ANNs to predict the stability of slopes.Bui et al. [26] intended to use optimized ANNs to design pure, cohesive slopes.Mahmoodzadeh et al. [27] predicted the safety factor of slopes by using six ML techniques.
However, most of the above studies take the internal friction angle ϕ, the cohesion c of the slope soil, and the slope height and angle as the input parameters of each slope but do not consider the slope shape or the geometric and material parameters of each soil layer, so they are only applicable to homogeneous slopes with simple shapes.However, actual slopes are more complex and cannot be expressed simply by slope height and angle, and slopes are always composed of more soil layers.Therefore, this paper proposes a digital model that can truly reflect the authentic profiles of an actual slope and accurately predict its stability.As shown in Figure 1, it included: (1) A large number of actual slopes were collected, the DT method was used to establish the digital slope models with similar characteristics to the actual slopes, and the slope sample database was built by fine-tuning the slope profiles; (2) the LEMs were used to calculate the slope safety factor, and then the CNN model was built and trained with the slope samples.The coordinates of the slope profiles and the material parameters of the soil layers were taken as the CNN input, and the corresponding safety factor was taken as the output to train the CNN.To predict the stability of a slope, the inspectors only need to input their field exploration data into the proposed model.

Slope Sample Collection
In this paper, 30 actual slopes were collected from the existing literature, and the horizontal projection length of the slopes ranged from 50 m to 3000 m.The stability state of the slopes was classified into four levels: stable, basically stable, metastable, and unstable according to the safety factor.The slope number of different types was listed in Table 1, the profiles of three slopes were shown in Figure 2, and the corresponding material parameters of their soil layers were listed in Table 2.

Slope Sample Collection
In this paper, 30 actual slopes were collected from the existing literature, and the horizontal projection length of the slopes ranged from 50 m to 3000 m.The stability state of the slopes was classified into four levels: stable, basically stable, metastable, and unstable according to the safety factor.The slope number of different types was listed in Table 1, the profiles of three slopes were shown in Figure 2, and the corresponding material parameters of their soil layers were listed in Table 2.

Slope Sample Collection
In this paper, 30 actual slopes were collected from the existing literature, and the horizontal projection length of the slopes ranged from 50 m to 3000 m.The stability state of the slopes was classified into four levels: stable, basically stable, metastable, and unstable according to the safety factor.The slope number of different types was listed in Table 1, the profiles of three slopes were shown in Figure 2, and the corresponding material parameters of their soil layers were listed in Table 2.

Digital Twins of Slopes
Since the number of actual slopes collected was insufficient for CNN training, as shown in Figure 3, in this paper, the DT method was used to create digital models with similar characteristics to the actual slopes, and a group of multi-size models was created by inserting some random points to fine-tune the slope profiles.Each actual slope was extended to create 100 samples.This method was able to provide sufficient CNN training samples for slope stability prediction.

Digital Twins of Slopes
Since the number of actual slopes collected was insufficient for CNN training, as shown in Figure 3, in this paper, the DT method was used to create digital models with similar characteristics to the actual slopes, and a group of multi-size models was created by inserting some random points to fine-tune the slope profiles.Each actual slope was extended to create 100 samples.This method was able to provide sufficient CNN training samples for slope stability prediction.First, the outer profile line of each soil layer was represented by a polyline connected by several points, and these points that formed the original slope were called control points.The existing control points were kept unchanged, and some random points were inserted between them to fine-tune the outer shape of the slope and its soil layers and increase the number of slope samples.It should be explained here that the positive x-axis of the slope coordinate system points horizontally in the direction of the landslide, and the positive y-axis points vertically downward.
As shown in Figure 4, the number of randomly inserted points was determined by the quotient of the horizontal projection length between two connected control points divided by the corresponding threshold.According to the total length L of the horizontal projection of the slope, the following threshold was set: First, the outer profile line of each soil layer was represented by a polyline connected by several points, and these points that formed the original slope were called control points.The existing control points were kept unchanged, and some random points were inserted between them to fine-tune the outer shape of the slope and its soil layers and increase the number of slope samples.It should be explained here that the positive x-axis of the slope coordinate system points horizontally in the direction of the landslide, and the positive y-axis points vertically downward.
As shown in Figure 4, the number of randomly inserted points was determined by the quotient of the horizontal projection length between two connected control points divided This method was used to fine-tune the outer contour of the slope and soil layer generate a large number of slope models with profiles similar to the actual slope.To g erate diverse samples, the vertical and horizontal coordinates of the insertion points w randomly generated within the range of the control points.In addition, it was neces to ensure that the insertion point did not penetrate the soil layer and prevent reverse sl (Figure 5).For this reason, the following work was performed:  Between each pair of control points, the x-coordinates of the insertion points w randomly generated according to their number.
The number of insertion points = the horizontal projection length between the control points the corresponding threshold  ( )  coordinates of the insertion points  ( ) As shown in Figure 6, random points were inserted from the higher control poin + 1) to the lower control point P(i) of the slope, and the y-coordinates of the random po were generated: This method was used to fine-tune the outer contour of the slope and soil layers to generate a large number of slope models with profiles similar to the actual slope.To generate diverse samples, the vertical and horizontal coordinates of the insertion points were randomly generated within the range of the control points.In addition, it was necessary to ensure that the insertion point did not penetrate the soil layer and prevent reverse slope (Figure 5).For this reason, the following work was performed: This method was used to fine-tune the outer contour of the slope and soil laye generate a large number of slope models with profiles similar to the actual slope.To erate diverse samples, the vertical and horizontal coordinates of the insertion points w randomly generated within the range of the control points.In addition, it was neces to ensure that the insertion point did not penetrate the soil layer and prevent reverse s (Figure 5).For this reason, the following work was performed:  Between each pair of control points, the x-coordinates of the insertion points randomly generated according to their number.
The number of insertion points = the horizontal projection length between the control points the corresponding threshold  ( )  coordinates of the insertion points  ( ) As shown in Figure 6, random points were inserted from the higher control poin + 1) to the lower control point P(i) of the slope, and the y-coordinates of the random po were generated: Between each pair of control points, the x-coordinates of the insertion points were randomly generated according to their number.
The number of insertion points = the horizontal projection length between the control points the corresponding threshold x P(i) ≥ x coordinates of the insertion points ≥ x P(i+1) As shown in Figure 6, random points were inserted from the higher control point P(i + 1) to the lower control point P(i) of the slope, and the y-coordinates of the random points were generated: (a) When traversing each of the slope soil layers, if there was only one insertion p within a segment, as shown in Figure 7a, the higher y-coordinate of the control poin + 1) and the corresponding y-coordinate of the upper soil layer of the insertion point w compared, and the maximum value (i.e., the corresponding y-coordinate of the lower layer of the insertion point and the y-coordinate of the control point P(i) were compa and the minimum value (i.e., the higher point) was taken as the upper limit of the inser point y-coordinate.If there were multiple insertion points between two control point in Figure 7b, the y-coordinate for the first insertion point was obtained using the ab steps.For subsequent insertion points, the higher y-coordinate of the already inse points and the y-coordinate of the upper soil layer corresponding to the insertion p were compared, and the maximum value was taken as the lower limit of the inser point y-coordinate.The y-coordinate of the lower soil layer corresponding to the inser point and the y-coordinate of the lower control point P(i) were also compared, and minimum value was taken as the upper limit of the insertion point y-coordinate.(b) However, in the above random point insertion process, only the upper and lo ordinates of the insertion points were considered to prevent the local soil layer from vading the lower soil layer, but the connecting line between the insertion points inva the upper layer was not considered, resulting in the situation shown in Figure 8.In paper, self-correction methods were considered to solve such invasions.(a) When traversing each of the slope soil layers, if there was only one insertion point within a segment, as shown in Figure 7a, the higher y-coordinate of the control point P(i + 1) and the corresponding y-coordinate of the upper soil layer of the insertion point were compared, and the maximum value (i.e., the corresponding y-coordinate of the lower soil layer of the insertion point and the y-coordinate of the control point P(i) were compared, and the minimum value (i.e., the higher point) was taken as the upper limit of the insertion point y-coordinate.If there were multiple insertion points between two control points, as in Figure 7b, the y-coordinate for the first insertion point was obtained using the above steps.For subsequent insertion points, the higher y-coordinate of the already inserted points and the y-coordinate of the upper soil layer corresponding to the insertion point were compared, and the maximum value was taken as the lower limit of the insertion point y-coordinate.The y-coordinate of the lower soil layer corresponding to the insertion point and the y-coordinate of the lower control point P(i) were also compared, and the minimum value was taken as the upper limit of the insertion point y-coordinate.(a) When traversing each of the slope soil layers, if there was only one insertion within a segment, as shown in Figure 7a, the higher y-coordinate of the control po + 1) and the corresponding y-coordinate of the upper soil layer of the insertion poin compared, and the maximum value (i.e., the corresponding y-coordinate of the low layer of the insertion point and the y-coordinate of the control point P(i) were com and the minimum value (i.e., the higher point) was taken as the upper limit of the ins point y-coordinate.If there were multiple insertion points between two control poi in Figure 7b, the y-coordinate for the first insertion point was obtained using the steps.For subsequent insertion points, the higher y-coordinate of the already in points and the y-coordinate of the upper soil layer corresponding to the insertion were compared, and the maximum value was taken as the lower limit of the ins point y-coordinate.The y-coordinate of the lower soil layer corresponding to the ins point and the y-coordinate of the lower control point P(i) were also compared, an minimum value was taken as the upper limit of the insertion point y-coordinate.(b) However, in the above random point insertion process, only the upper and ordinates of the insertion points were considered to prevent the local soil layer fro vading the lower soil layer, but the connecting line between the insertion points inv the upper layer was not considered, resulting in the situation shown in Figure 8. paper, self-correction methods were considered to solve such invasions.(b) However, in the above random point insertion process, only the upper and lower ordinates of the insertion points were considered to prevent the local soil layer from invading the lower soil layer, but the connecting line between the insertion points invading the upper layer was not considered, resulting in the situation shown in Figure 8.In this paper, self-correction methods were considered to solve such invasions.For the generated soil layer, each point of the upper soil layer was checked; i ordinate of the point of the upper soil layer was greater than that of the correspon local layer line, it proved that there was soil layer intrusion, and the ordinate of the p was regenerated and then checked again for upper soil layer intrusion.If there was soil layer intrusion, the above process was repeated until an insertion point withou layer intrusion was found.If the above process was repeated more than 30 times and was still soil intrusion, the insertion point could be moved down 5 m and then che for soil intrusion.If there was still soil intrusion, the insertion point could be moved d another 5 m until there was no soil intrusion.
In this paper, 30 typical slopes were collected, and 100 samples were generate each slope.A total of 3000 slope samples were generated as experimental data.This ple generation method not only maintains the rationality of the generated slopes but greatly increases the number of samples, which meets the requirements of deep lear for big data.The partially generated slopes are shown in Table 3.For the generated soil layer, each point of the upper soil layer was checked; if the ordinate of the point of the upper soil layer was greater than that of the corresponding local layer line, it proved that there was soil layer intrusion, and the ordinate of the point was regenerated and then checked again for upper soil layer intrusion.If there was still soil layer intrusion, the above process was repeated until an insertion point without soil layer intrusion was found.If the above process was repeated more than 30 times and there was still soil intrusion, the insertion point could be moved down 5 m and then checked for soil intrusion.If there was still soil intrusion, the insertion point could be moved down another 5 m until there was no soil intrusion.
In this paper, 30 typical slopes were collected, and 100 samples were generated for each slope.A total of 3000 slope samples were generated as experimental data.This sample generation method not only maintains the rationality of the generated slopes but also greatly increases the number of samples, which meets the requirements of deep learning for big data.The partially generated slopes are shown in Table 3.

Slope Name Actual Slope Generated Slope
Tianshengqiao Secondary Gate Head Yanglu Expressway K52 Slope

Calculation of Safety Factor
In this paper, the Spencer slice method was used to calculate the safety factor.assumed that the soil slope slides along a certain slip plane, and the soil reaches equ rium along this slip plane.A certain vertical soil slice was considered in the sliding and its forces were shown in Figure 9: On the considered soil slice, ΔW is the dead weight, ΔT and ΔN are the tangen forces and the normal force at its bottom, and G is the interaction force between the sli The differential form of the equilibrium equation was obtained by making Δ x→0, and integral of the whole sliding surface was obtained.The force and moment balance eq tions of the whole sliding body [28] are shown in Equations ( 2

Slope Name Actual Slope Generated Slope
Tianshengqiao Secondary Gate Head Yanglu Expressway K52 Slope

Calculation of Safety Factor
In this paper, the Spencer slice method was used to calculate the safety factor.It is assumed that the soil slope slides along a certain slip plane, and the soil reaches equilibrium along this slip plane.A certain vertical soil slice was considered in the sliding soil, and its forces were shown in Figure 9: On the considered soil slice, ΔW is the dead weight, ΔT and ΔN are the tangential forces and the normal force at its bottom, and G is the interaction force between the slices.The differential form of the equilibrium equation was obtained by making Δ x→0, and the integral of the whole sliding surface was obtained.The force and moment balance equations of the whole sliding body [28] are shown in Equations ( 2) and (3), respectively: (, ) = ()()() = 0 (3) Table 3. Generated slope.

Slope Name Actual Slope Generated Slope
Tianshengqiao Secondary Gate Head Yanglu Expressway K52 Slope

Calculation of Safety Factor
In this paper, the Spencer slice method was used to calculate the safety factor.assumed that the soil slope slides along a certain slip plane, and the soil reaches equ rium along this slip plane.A certain vertical soil slice was considered in the sliding and its forces were shown in Figure 9: On the considered soil slice, ΔW is the dead weight, ΔT and ΔN are the tangen forces and the normal force at its bottom, and G is the interaction force between the sl The differential form of the equilibrium equation was obtained by making Δ x→0, and integral of the whole sliding surface was obtained.The force and moment balance eq tions of the whole sliding body [28]

Slope Name Actual Slope Generated Slope
Tianshengqiao Secondary Gate Head Yanglu Expressway K52 Slope

Calculation of Safety Factor
In this paper, the Spencer slice method was used to calculate the safety factor.It is assumed that the soil slope slides along a certain slip plane, and the soil reaches equilibrium along this slip plane.A certain vertical soil slice was considered in the sliding soil, and its forces were shown in Figure 9: On the considered soil slice, ΔW is the dead weight, ΔT and ΔN are the tangential forces and the normal force at its bottom, and G is the interaction force between the slices.The differential form of the equilibrium equation was obtained by making Δ x→0, and the integral of the whole sliding surface was obtained.The force and moment balance equations of the whole sliding body [28] are shown in Equations ( 2) and (3), respectively: (, ) = ()()() = 0 (3)

Calculation of Safety Factor
In this paper, the Spencer slice method was used to calculate the safety factor.It is assumed that the soil slope slides along a certain slip plane, and the soil reaches equilibrium along this slip plane.A certain vertical soil slice was considered in the sliding soil, and its forces were shown in Figure 9: Table 3. Generated slope.

Slope Name
Actual Slope Generated Slope Tianshengqiao Secondary Gate Head Yanglu Expressway K52 Slope

Calculation of Safety Factor
In this paper, the Spencer slice method was used to calculate the safety assumed that the soil slope slides along a certain slip plane, and the soil reac rium along this slip plane.A certain vertical soil slice was considered in the and its forces were shown in Figure 9: On the considered soil slice, ΔW is the dead weight, ΔT and ΔN are the forces and the normal force at its bottom, and G is the interaction force betwee The differential form of the equilibrium equation was obtained by making Δ x→ integral of the whole sliding surface was obtained.The force and moment ba tions of the whole sliding body [28] are shown in Equations ( 2 On the considered soil slice, ∆W is the dead weight, ∆T and ∆N are the tangential forces and the normal force at its bottom, and G is the interaction force between the slices.The differential form of the equilibrium equation was obtained by making ∆ x→0, and the integral of the whole sliding surface was obtained.The force and moment balance equations of the whole sliding body [28] are shown in Equations ( 2) and (3), respectively: where dW dx is the bulk density of the soil, α is the angle of inclination of the bottom of the soil slice, c e = c F , tan φ e = tanφ F , c is the cohesion of the soil mass, φ is the internal friction angle of the soil, F is the safety factor, and is the unknown quantity to be solved, and β is the angle between the total force between the soil slices and the horizontal line, tanβ = λ.Among them, F and λ are the unknowns to be solved.
The Newton-Raphson iterative method is used to solve the equilibrium equation of the above integral form to obtain F and λ.

Optimization Method for Non-Circular Slip Surface
The slope stability analysis consists of the following two steps: (1) The safety factor for a slip surface in the landslide was calculated by LEMs; (2) For all possible slip surfaces, the above step was repeated to find the minimum safety factor and the corresponding critical slip surface.
The optimization method for the non-circular slip surface was as follows: where F is the safety factor of the non-circular slip surface calculated by LEMs, n is the number of control points of the non-circular slip surface, and x i , y i are the coordinates of the i-th control point of the non-circular slip surface.
Since n control points correspond to 2n degrees of freedom, the non-circular slip surface optimization problem contains many degrees of freedom, and there may be many local extremum points in its independent variable space, making the task of finding the global extremum very difficult.If the slope section is complex, there may be many local extrema of the safety factor.To solve this problem, a simple and effective idea was to attempt an initial value close to the global extremum.Obviously, the closer the initial value was to the global extremum, the less likely it was to miss the global extremum.As shown in Equation (8), when searching for the minimum safety factor, the optimization of the circular slip surface involves only 3 degrees of freedom.Therefore, in this paper, the minimum safety factor of the circular sliding surface corresponding to the most dangerous sliding surface was calculated first, and the non-circular sliding surface was optimized based on it.
where (x 0 , y 0 ) are the coordinates of the center of the circle and R is the radius.In this paper, the fmincon function of the Matlab R2021a Optimization Toolbox was used to optimize the safety factor of a non-circular sliding surface with a slope.The specific steps are as follows: (1) As shown in Figure 10, the enumeration method was used to continuously change the center coordinates (x 0 , y 0 ) and radius, and the Spencer method was used to calculate the safety factor of the circular slip surface of the slope.The minimum safety factor and the corresponding most dangerous slip surface were obtained by comparing the safety factors one by one.(2) As shown in Figure 11, according to the number of control poin circular slip surface, the coordinates of discrete control points were uniform the most dangerous circular slip surface, and the line segments connecting points were taken as the initial slip surface in the optimization of the no surface.The slip surface was approximately composed of (n − 1) line segme n control points {( ,  ), ( ,  ),…,( ,  )}.Points A and B are the upper a sections of the slip surface and the slope, respectively; their coordinates and coordinates of the intermediate control points are fixed, and the search varia circular slip surface composed of n control points was S =  ,  , …  slip surface is generally concave upward, which means that the slip surface to satisfy the following constraints [29]: In this paper, the safety factor of 3000 slope samples was calculated ac above method.Figure 12 shows the optimization result of the non-circular 2 slope samples, where the blue is the circular slip surface, the green is t circular slip surface, and the red is the optimal non-circular slip surface.(2) As shown in Figure 11, according to the number of control points on the non-circular slip surface, the coordinates of discrete control points were uniformly selected on the most dangerous circular slip surface, and the line segments connecting these discrete points were taken as the initial slip surface in the optimization of the non-circular slip surface.The slip surface was approximately composed of (n − 1) line segments connecting n control points {(x 1 , y 1 ), (x 2 , y 2 ), . . .,(x n , y n )}.Points A and B are the upper and lower intersections of the slip surface and the slope, respectively; their coordinates and the horizontal coordinates of the intermediate control points are fixed, and the search variable of the non-circular slip surface composed of n control points was S = {y 2 , y 3 , . . .y n−1 }.A reasonable slip surface is generally concave upward, which means that the slip surface usually needs to satisfy the following constraints [29]: (2) As shown in Figure 11, according to the number of control poin circular slip surface, the coordinates of discrete control points were uniform the most dangerous circular slip surface, and the line segments connecting points were taken as the initial slip surface in the optimization of the no surface.The slip surface was approximately composed of (n − 1) line segme n control points {( ,  ), ( ,  ),…,( ,  )}.Points A and B are the upper an sections of the slip surface and the slope, respectively; their coordinates and coordinates of the intermediate control points are fixed, and the search varia circular slip surface composed of n control points was S =  ,  , …  slip surface is generally concave upward, which means that the slip surface to satisfy the following constraints [29]: In this paper, the safety factor of 3000 slope samples was calculated ac above method.Figure 12 shows the optimization result of the non-circular 2 slope samples, where the blue is the circular slip surface, the green is t circular slip surface, and the red is the optimal non-circular slip surface.In this paper, the safety factor of 3000 slope samples was calculated according to the above method.Figure 12 shows the optimization result of the non-circular slip surface of 2 slope samples, where the blue is the circular slip surface, the green is the initial non-circular slip surface, and the red is the optimal non-circular slip surface.

Convolutional Neural Network 3.2.1. Data Preprocessing
In this experiment, the internal friction angle ϕ, cohesion c, natural bulk density γ, saturated bulk density γ sat , and x and y coordinates of each soil layer of the slope were taken as the CNN input, and the corresponding safety factor was taken as the output.A 1D CNN and a 2D CNN were constructed to compare the prediction effects of the two models.
For the 2D model, as shown in Table 4, the mechanical parameters of each soil layer of the slope and the coordinates of the control points were arranged from top to bottom.Since the number of coordinate points in each soil layer was not equal and the input pattern of the neural network was a matrix, the soil layer with the largest number of line points in the slope soil layer was taken as the reference, and zeros were added at the end of the line with an insufficient number of line points.The input size of the neural network was fixed, but the size of the input matrix was not the same for different slope samples.As shown in Figure 13, the slope sample with the largest input size was taken as the reference, and zeros were added after each row and column for the samples with insufficient size.

Data Preprocessing
In this experiment, the internal friction angle φ, cohesion c, natural bulk d saturated bulk density   , and x and y coordinates of each soil layer of the sl taken as the CNN input, and the corresponding safety factor was taken as the 1D CNN and a 2D CNN were constructed to compare the prediction effects o models.
For the 2D model, as shown in Table 4, the mechanical parameters of each of the slope and the coordinates of the control points were arranged from top t Since the number of coordinate points in each soil layer was not equal and the i tern of the neural network was a matrix, the soil layer with the largest numb points in the slope soil layer was taken as the reference, and zeros were added a of the line with an insufficient number of line points.The input size of the neura was fixed, but the size of the input matrix was not the same for different slope As shown in Figure 13, the slope sample with the largest input size was taken a erence, and zeros were added after each row and column for the samples with in size.For the 1D CNN, as shown in Figure 14, to keep the input size the same, data were first processed as a 2D input sample.Then, from top to bottom, th angle, cohesion, natural bulk density, saturated bulk density, and x and y coor each soil layer of the slope were arranged in a column, and the above operat performed on all slopes.For the 1D CNN, as shown in Figure 14, to keep the input size the same, the input data were first processed as a 2D input sample.Then, from top to bottom, the friction angle, cohesion, natural bulk density, saturated bulk density, and x and y coordinates of each soil layer of the slope were arranged in a column, and the above operations were performed on all slopes.

CNN Regression Model
The CNN was a type of feed-forward convolutional neural network consisting of an input layer, convolutional layers, activation layers, pooling layers, normalization layers, and a fully connected layer.The convolution layer was used to extract local features from the data and was usually followed by an activation layer.Commonly used nonlinear activation functions include Sigmod, Tanh, ReLU, and Leaky ReLU.The pooling layer was used to greatly reduce the number of parameters (dimensionality reduction) and alleviate the problem of overfitting.The fully connected layer performed a nonlinear combination of the extracted features to obtain the output, which was located in the last part of the hidden layers of the CNN.The pooling layer was mainly used to reduce the dimension of the input data.Since the data volume of a single slope was much smaller than that of an image, the pooling layer was not added to the CNN in this study to prevent the loss of effective information with increasing network depth.
In this experiment, the shallow CNN model was built based on Matlab R2021a (Math Works, Inc., Natick, MA, USA).The network structure is shown in Figure 15, including an input layer and three convolutional layers with the Leaky ReLU activation function, followed by a fully connected layer and a regression layer.The parameters of the 1D CNN and 2D CNN models are shown in Table 5.

CNN Regression Model
The CNN was a type of feed-forward convolutional neural network consisting of an input layer, convolutional layers, activation layers, pooling layers, normalization layers, and a fully connected layer.The convolution layer was used to extract local features from the data and was usually followed by an activation layer.Commonly used nonlinear activation functions include Sigmod, Tanh, ReLU, and Leaky ReLU.The pooling layer was used to greatly reduce the number of parameters (dimensionality reduction) and alleviate the problem of overfitting.The fully connected layer performed a nonlinear combination of the extracted features to obtain the output, which was located in the last part of the hidden layers of the CNN.The pooling layer was mainly used to reduce the dimension of the input data.Since the data volume of a single slope was much smaller than that of an image, the pooling layer was not added to the CNN in this study to prevent the loss of effective information with increasing network depth.
In this experiment, the shallow CNN model was built based on Matlab R2021a (Math Works, Inc., Natick, MA, USA).The network structure is shown in Figure 15, including an input layer and three convolutional layers with the Leaky ReLU activation function, followed by a fully connected layer and a regression layer.The parameters of the 1D CNN and 2D CNN models are shown in Table 5.The 3000 soil slope samples from Section 2.2 were divided into a training set (80%), a validation set (10%), and a test set (10%).The physical and mechanical parameters of each slope in the training set were used as input data, along with the coordinates of each soil layer, and the corresponding safety factor was used as output.The CNN-based soil slope stability prediction model was trained using this dataset.Finally, the accuracy of the CNN model was tested on the test set (10%), and an actual slope was input into the neural network to validate the universality of the CNN model.

K-Fold Cross Validation
To evaluate the generalization ability of the model, this paper adopts the K-fold crossvalidation method.The commonly used K value ranges from 5 to 12, and relevant studies have confirmed that 10 is widely used [30].This paper adopts 10-fold cross-validation, and the principle is shown in Figure 16.

K-Fold Cross Validation
To evaluate the generalization ability of the model, this paper adopts the K-fold cro validation method.The commonly used K value ranges from 5 to 12, and relevant stud have confirmed that 10 is widely used [30].This paper adopts 10-fold cross-validati and the principle is shown in Figure 16.

Optimization of Non-Circular Slip Surface
The non-circular slip surfaces consisted of line segments connected by multiple control points.The safety factors were different for different numbers of control points.Based on 30 original slopes, the safety factors and optimization times of circular slip surfaces and non-circular slip surfaces with different numbers of control points are compared in this section.The comparison results are shown in Tables 6 and 7.In comparison, it was found that for most slopes, the safety factor of a non-circular slip surface gradually decreases with the increase of the number of control points, and for very few slopes, the safety factor of a non-circular slip surface fluctuates with the increase of the number of control points.The change in the safety factor with each increase in the number of control points does not exceed 0.1.When the control points of the non-circular slip surface of the 12th slope were 6 and 7, the maximum difference in the safety factor was 0.0816.The difference between the maximum value and the minimum value of the safety factor of the noncircular slip surface was 0.1047 and 0.1052 for the 12th and 30th slopes, respectively, and the difference for other slopes was not larger than 0.1, but the optimization time greatly increases with the increase of the control points.Since the six control points can roughly reflect the shape of the slip surface of the soil slope, in this paper, the non-circular slip surface with six control points was used to calculate the safety factor of the three-thousand generated slope samples.
By comparing the minimum safety factor of the non-circular sliding surface with that of the circular sliding surface, it was found that the safety factor of the non-circular sliding surface is smaller than that of the circular sliding surface, as shown in Figure 17.Among them, the difference in the safety factor of the 2nd, 12th, 18th, 20th, 24th, and 30th slopes is 0.202, 0.155, 0.161, 0.160, 0.100, and 0.118, respectively, and the difference of the other slopes is not larger than 0.1.The slope stability states were classified according to Table 1.Except for the 9th and 17th slopes, the stability states of the slopes determined based on the safety factors of the circular slip surface and the non-circular slip surface were the same, but the calculation time of the safety factors of the non-circular slip surface was 4-50 times that of the circular slip surface.For the 9th and 17th slopes, the safety factors of the circular slip surface were 1.1080 and 1.0527, respectively, which were basically stable slopes, but the minimum safety factors of the non-circular slip surface were 1.0217 and 1.0445, respectively, which were metastable slopes.Therefore, when the safety factor was around the critical value of slope stability state division, the slope stability state divided based on the safety factor of the non-circular slip surface and the safety factor of the circular slip surface could be different, and the slope stability state divided based on the safety factor of the non-circular slip surface was safer and more conservative.
of the circular sliding surface, it was found that the safety factor of the non-circular sliding surface is smaller than that of the circular sliding surface, as shown in Figure 17.Among them, the difference in the safety factor of the 2nd, 12th, 18th, 20th, 24th, and 30th slopes is 0.202, 0.155, 0.161, 0.160, 0.100, and 0.118, respectively, and the difference of the other slopes is not larger than 0.1.The slope stability states were classified according to Table 1.Except for the 9th and 17th slopes, the stability states of the slopes determined based on the safety factors of the circular slip surface and the non-circular slip surface were the same, but the calculation time of the safety factors of the non-circular slip surface was 4-50 times that of the circular slip surface.For the 9th and 17th slopes, the safety factors of the circular slip surface were 1.1080 and 1.0527, respectively, which were basically stable slopes, but the minimum safety factors of the non-circular slip surface were 1.0217 and 1.0445, respectively, which were metastable slopes.Therefore, when the safety factor was around the critical value of slope stability state division, the slope stability state divided based on the safety factor of the non-circular slip surface and the safety factor of the circular slip surface could be different, and the slope stability state divided based on the safety factor of the non-circular slip surface was safer and more conservative.In this section, the slope sample data were pre-processed, of which 80% (2400) were used as the training set, 10% (300) as the validation set, and 10% (300) as the test set.The prediction effects of 1D CNN and 2D CNN based on the safety factors of the non-circular slip surface and circular slip surface of the slope, respectively, were compared.The same hyperparameters were used in both network models, and the specific parameters are shown in Table 8.The RMSE was used as the evaluation indicator.The results of the 10-fold cross-validation of the two CNN models are shown in Table 9.By comparing the average value of the RMSE of the 10-fold cross-validation of each model, it was found that for the 1D CNN model and the 2D CNN model, the RMSE of the prediction model based on the non-circular slip surfaces was slightly lower than that based on the circular slip surface, while for the two slip surfaces, the RMSE of the 2D CNN was slightly lower than that of the 1D CNN, which may be due to the input of the 2D CNN in the form of a matrix.Every two rows of data from the 2D input samples correspond to the mechanical parameters and coordinates of one soil layer.The size of the 2D CNN convolution kernel was 2 × 2. In the convolution process, the independence of information for each soil layer and the correlation between soil layers are maintained.Therefore, the following tests adopt the 2D CNN model based on a non-circular slip surface.

Comparison of CNN Models with Different Depths
The CNN was composed of multiple convolutional layers, and the predictive performance of the network model composed of different numbers of convolutional layers was different.To find the most suitable network model for the experimental dataset, six different depth network models were compared.As shown in Table 10, the network structure of Model 1 ~Model 5 was similar, and only one convolutional layer was increased in turn.As shown in Figure 18, Model 6 was a depth residual network.The hyperparameters of all the network models are shown in Table 8.The RMSE of network models with different depths is shown in Figure 19.As the number of network convolutional layers increased from three to six, the RMSE of the network model became smaller and smaller, and the prediction effect became better and better.However, when the number of convolutional layers was seven, the RMSE of the network model was significantly larger than that of the network model (Model 4) with six convolutional layers.The RMSE of the residual depth network model was the largest at 0.0696.Obviously, it was not the case that the deeper the number of network layers, the better the network prediction effect.Especially when the sample data volume is small, as the number of network layers increases, less effective information can be extracted for the deeper convolutional layer.Therefore, for this experimental dataset, the 2D CNN model with six convolutional layers has the best performance with an RMSE of 0.0666.The comparative results of the network hyperparameter optimization are shown in Appendix A.  The RMSE of network models with different depths is shown in Figure 19.As the number of network convolutional layers increased from three to six, the RMSE of the network model became smaller and smaller, and the prediction effect became better and better.However, when the number of convolutional layers was seven, the RMSE of the network model was significantly larger than that of the network model (Model 4) with six convolutional layers.The RMSE of the residual depth network model was the largest at 0.0696.Obviously, it was not the case that the deeper the number of network layers, the better the network prediction effect.Especially when the sample data volume is small, as the number of network layers increases, less effective information can be extracted for the deeper convolutional layer.Therefore, for this experimental dataset, the 2D CNN model with six convolutional layers has the best performance with an RMSE of 0.0666.The comparative results of the network hyperparameter optimization are shown in Appendix A.  The RMSE of network models with different depths is shown in Figure 19.As the number of network convolutional layers increased from three to six, the RMSE of the network model became smaller and smaller, and the prediction effect became better and better.However, when the number of convolutional layers was seven, the RMSE of the network model was significantly larger than that of the network model (Model 4) with six convolutional layers.The RMSE of the residual depth network model was the largest at 0.0696.Obviously, it was not the case that the deeper the number of network layers, the better the network prediction effect.Especially when the sample data volume is small, as the number of network layers increases, less effective information can be extracted for the deeper convolutional layer.Therefore, for this experimental dataset, the 2D CNN model with six convolutional layers has the best performance with an RMSE of 0.0666.The comparative results of the network hyperparameter optimization are shown in Appendix A.  7.The network solver adopts Adam, the initial learning rate is 0.001, and 300 (10%) verification sets are taken as the test set.The prediction performance of the model is shown in Figure 20.In this paper, a six-layer 2D CNN model is built.The network structure is shown in Model 4 in Table 7.The network solver adopts Adam, the initial learning rate is 0.001, and 300 (10%) verification sets are taken as the test set.The prediction performance of the model is shown in Figure 20.In order to verify the generalization ability of the model, this paper collects a typical landslide case from the existing literature.The landslide is located in Qingning Township, Daxian County, Sichuan Province.The slope shape is shown in the figure, and the physical parameters of the soil are listed in Table 11.The coordinate information and the physical and mechanical parameters of the soil layer are arranged as shown in Table 3 and input into the trained network model.The actual slope safety factor is 1.0197, the model prediction result is 1.0229, and the absolute error is 0.0032, which verifies the applicability of the model.In order to verify the generalization ability of the model, this paper collects a typical landslide case from the existing literature.The landslide is located in Qingning Township, Daxian County, Sichuan Province.The slope shape is shown in the figure, and the physical parameters of the soil are listed in Table 11.The coordinate information and the physical and mechanical parameters of the soil layer are arranged as shown in Table 3 and input into the trained network model.The actual slope safety factor is 1.0197, the model prediction result is 1.0229, and the absolute error is 0.0032, which verifies the applicability of the model.Weakly weathered sandstone 350.5 52 25.5 28.8

Conclusions
In this paper, the DT method was used to build a slope stability prediction model based on big data and deep learning.Initially, the sample database of the slope was established by using the sample generation method, and the optimized safety factor of the noncircular slip surface was compared with the safety factor of the circular slip surface.Secondly, the CNN regression model is established, and the predictive effects of 1D and 2D

Conclusions
In this paper, the DT method was used to build a slope stability prediction model based on big data and deep learning.Initially, the sample database of the slope was established by using the sample generation method, and the optimized safety factor of the non-circular slip surface was compared with the safety factor of the circular slip surface.Secondly, the CNN regression model is established, and the predictive effects of 1D and 2D network models and different depth network models are compared, respectively.The following conclusions are drawn: 1.
The DT slope models generated based on the LEMs greatly expand the slope sample dataset while maintaining the rationality of the slope samples, thus providing robust training samples for the big data-based deep learning method to predict slope stability.

2.
Based on the circular slip surface, the safety factor of the non-circular slip surface is optimized, and the safety factor is smaller than that of the circular slip surface.
The minimum safety factor of the non-circular slip surface decreases slightly as the number of control points on the slip surface increases, but the computation time increases significantly.The RMSE of the prediction model based on the non-circular slip surface is slightly lower than that based on the circular slip surface.

3.
For this experimental dataset, the RMSE of 2D CNN was slightly lower than that of 1D CNN.The RSME of CNN first decreases and then increases as the number of network layers increases.

4.
The prediction model of slope safety factor based on CNN had strong generalization ability, and the prediction error was within an acceptable range, providing a method for rapid detection of slope stability.
In this paper, the DT slope stability prediction model based on CNN is built.The inspector only needs to input the exploration data into the slope stability prediction model to obtain the safety factor of the slope.The procedure is simple, does not require strong professional knowledge, and has great engineering application value.However, the input data may be redundant.In future studies, sensitivity analysis will be carried out, and pruning and other methods will be considered for clipping input data from slopes.In addition, this study only considers 2D slopes, and in a later stage, a 3D slope sample database can be constructed by combining finite element analysis to achieve 3D slope stability prediction.Finally, this study only focuses on the study of slope stability prediction methods based on CNNs.In the future, a user-oriented intelligent detection platform can be developed to promote the real implementation of this research result in slope engineering practice.Figure A2 shows that, when the initial learning rate is 0.01, the network optimization process based on the gradient descent method fluctuates due to the large learning rate, which makes it difficult for the training to converge to the minimum value, and the RMSE of the network fluctuates greatly.As shown in Figure A3, when the initial learning rate is 0.0001 and the solver is Adam and SGDM, the step in the gradient descent process is small due to the low learning rate.When the maximum epoch is the same, the convergence speed is slower and the RMSE is larger than when the initial learning rate is 0.001.When the initial learning rate is 0.0001 and the solver is RMSProp, the network convergence is better and the RMSE is larger than when the initial learning rate is 0.001.As shown in Figure A4, when the initial learning rate is 0.001, the network convergence is the best, and when the solver is Adam, the RMSE is 0.0666, and the network prediction effect is the best.Figure A2 shows that, when the initial learning rate is 0.01, the network optimization process based on the gradient descent method fluctuates due to the large learning rate, which makes it difficult for the training to converge to the minimum value, and the RMSE of the network fluctuates greatly.As shown in Figure A3, when the initial learning rate is 0.0001 and the solver is Adam and SGDM, the step in the gradient descent process is small due to the low learning rate.When the maximum epoch is the same, the convergence speed is slower and the RMSE is larger than when the initial learning rate is 0.001.When the initial learning rate is 0.0001 and the solver is RMSProp, the network convergence is better and the RMSE is larger than when the initial learning rate is 0.001.As shown in Figure A4, when the initial learning rate is 0.001, the network convergence is the best, and when the solver is Adam, the RMSE is 0.0666, and the network prediction effect is the best.0.0001 and the solver is Adam and SGDM, the step in the gradient descent process is small due to the low learning rate.When the maximum epoch is the same, the convergence speed is slower and the RMSE is larger than when the initial learning rate is 0.001.When the initial learning rate is 0.0001 and the solver is RMSProp, the network convergence is better and the RMSE is larger than when the initial learning rate is 0.001.As shown in Figure A4, when the initial learning rate is 0.001, the network convergence is the best, and when the solver is Adam, the RMSE is 0.0666, and the network prediction effect is the best.

Figure 3 .
Figure 3. DT of an actual slope.

Figure 3 .
Figure 3. DT of an actual slope.

1 )
threshold.According to the total length L of the horizontal projection of the slope, the following threshold was set: > 500 m 20 m 500 m ≥ L > 150 m 10 m 150 m ≥ L > 50 m 3 m L ≤ 50 m (Appl.Sci.2023, 13, x FOR PEER REVIEW 5

Figure 4 .
Figure 4. Insert random points to fine-tune the shape of the slope.

Figure 4 .
Figure 4. Insert random points to fine-tune the shape of the slope.

Figure 4 .
Figure 4. Insert random points to fine-tune the shape of the slope.

Figure 6 .
Figure 6.Insert random points between two control points.

Figure 7 .
Figure 7.The insertion points of the soil layer line: (a) Only one insertion point; (b) More than insertion point.

Figure 6 .
Figure 6.Insert random points between two control points.

Figure 6 .
Figure 6.Insert random points between two control points.

Figure 7 .
Figure 7.The insertion points of the soil layer line: (a) Only one insertion point; (b) More th insertion point.

Figure 7 .
Figure 7.The insertion points of the soil layer line: (a) Only one insertion point; (b) More than one insertion point.

Figure 8 .
Figure 8.A self-correcting method for soil penetration.(a) The penetration point is located between the insertion point and the control point; (b) The penetration point is locate between two insertion points.

Figure 8 .
Figure 8.A self-correcting method for soil penetration.(a) The penetration point is located between the insertion point and the control point; (b) The penetration point is located between two insertion points.

Figure 10 .
Figure 10.Enumeration method: arrange slip arcs according to different center pos

Figure 10 .
Figure 10.Enumeration method: arrange slip arcs according to different center positions and radii.

Figure 10 .
Figure 10.Enumeration method: arrange slip arcs according to different center pos

Figure 12 .
Figure 12.Optimization of non-circular slip surface of slop.

Figure 13 .
Figure 13.Normalize the dimensions of the input data.

( 2 )
Step 2: (K − 1) parts are used for model training, and the remaining 1 part is used for model testing.(3)Step 3: Repeat Step 2 K times to obtain the evaluation results of the K models.(4) Step 4: Calculate the average value of the K-fold cross-validation results as the performance evaluation of the model.

Figure 17 .
Figure 17.The safety factor of circular slip surface vs. the minimum value of safety factor of noncircular slip surface.

Figure 17 .
Figure 17.The safety factor of circular slip surface vs. the minimum value of safety factor of non-circular slip surface.

Figure 20 .Figure 20 .
Figure 20.Model prediction results.In order to verify the generalization ability of the model, this paper collects a typical landslide case from the existing literature.The landslide is located in Qingning Township, Daxian County, Sichuan Province.The slope shape is shown in the figure, and the physical parameters of the soil are listed in Table11.The coordinate information and the physical

Figure A1 .
Figure A1.RMSE of CNN model with different hyperparameters.

Figure A1 .
Figure A1.RMSE of CNN model with different hyperparameters.

Table 1 .
Classification of slope stability state.

Table 1 .
Classification of slope stability state.

Table 2 .
Physical and mechanical parameters of slope.

Table 2 .
Physical and mechanical parameters of slope.

Table 4 .
Preprocessing of slope input data.

Table 4 .
Preprocessing of slope input data.
Figure 13.Normalize the dimensions of the input data.

Table 5 .
Parameters of 1D CNN and 2D CNN.

Table 5 .
Parameters of 1D CNN and 2D CNN.

Table 6 .
Comparisons of the safety factor of circular slip surface and non-circular slip surface with different numbers of control points.

Table 7 .
Comparisons of optimization time for different numbers of control points.

Table 9 .
RMSE of the model under 10-fold cross-validation.