A Hybrid PSO–SVM Model Based on Safety Risk Prediction for the Design Process in Metro Station Construction

Incorporating safety risk into the design process is one of the most effective design sciences to enhance the safety of metro station construction. In such a case, the concept of Design for Safety (DFS) has attracted much attention. However, most of the current research overlooks the risk-prediction process in the application of DFS. Therefore, this paper proposes a hybrid risk-prediction framework to enhance the effectiveness of DFS in practice. Firstly, 12 influencing factors related to the safety risk of metro construction are identified by adopting the literature review method and code of construction safety management analysis. Then, a structured interview is used to collect safety risk cases of metro construction projects. Next, a developed support vector machine (SVM) model based on particle swarm optimization (PSO) is presented to predict the safety risk in metro construction, in which the multi-class SVM prediction model with an improved binary tree is designed. The results show that the average accuracy of the test sets is 85.26%, and the PSO–SVM model has a high predictive accuracy for non-linear relationships and small samples. Finally, the proposed framework is applied to a case study of metro station construction. The prediction results show the PSO–SVM model is applicable and reasonable for safety risk prediction. This research also identifies the most important influencing factors to reduce the safety risk of metro station construction, which provides a guideline for the safety risk prediction of metro construction in the design process.


Introduction
Urban rail transit construction is very significant in promoting urban economic development. Urban metros are developing rapidly around the world since they are a fast, efficient, safe and comfortable transportation mode [1]. At the end of 2018, 35 cities in mainland China had constructed 185 urban rail operation lines with a total length of 5761.4 km according to the Annual Urban Rail Transit Statistical and Analysis Report [2]. The scale of lines planned and under construction has been growing steadily. In addition, the annual completed construction investment has reached a new record. However, with the rapid development of the metro, construction safety accidents occur frequently, which cause a large number of casualties and economic losses [3]. According to statistics, during the
The risk prediction of metro station construction can provide a guideline for the implementation of DFS. On one hand, the identification of safety risk influencing factors is one of the most significant procedures in the risk-prediction process [44]. In such a case, the identification of safety risk influencing factors for the metro construction process has attracted much attention. For example, Yu et al. [45] analyzed the influencing factors of safety management in metro construction, and pointed out that safety attitude, construction site safety, government supervision, market restrictions and the unpredictability of tasks were the most important factors. Zhang et al. [46] classified the causes of construction accidents in the Beijing metro and found that inadequate management was the biggest cause of accidents, but the most serious accidents were caused by leaks or fractures in pipes and poor geological conditions. Ghosh and Jintanapakanont [47] separated and evaluated the key risk factors in the Thailand metro project, obtaining nine key factors and 35 sub-key factors by using the key factor analysis method.
On the other hand, the construction of a prediction model is another key procedure in the risk-prediction process. Thus, many scholars have focused on the construction of prediction models. For instance, Wu et al. [48] presented a systemic Bayesian network method for the dynamic risk analysis of adjacent buildings in tunneling environments. Li et al. [49] proposed a safety risk identification system and an early warning system for China's metro construction based on BIM. Zheng et al. [50] used the fuzzy analytic hierarchy process (AHP) and the comprehensive evaluation method to assess the metro construction risk of Changchun No.1 in China.
However, as mentioned above, there are some limitations in the existing literature: (1) few studies have focused on the leading role of design in metro construction, and few researchers have incorporated risk prediction into the DFS to enhance the safety of metro station construction. The design result is the most important work basis for the field operator. If there are construction safety risks in the preliminary design documents and these are not effectively dealt with, these defects will lead to an unsafe state of objects (including the environment) and the unsafe behavior of field operators in the construction process [51]; (2) some traditional risk-prediction methods, such as the neural network method, fuzzy comprehensive evaluation method, Bayesian network, and so on, have some shortcomings, such as low accuracy and low prediction efficiency [52]. In recent years, the support vector machine (SVM) model based on particle swarm optimization (PSO-SVM) has been widely used in many fields, which can overcome these problems [53,54]. For example, Zhou et al. [55] built the prediction model of PSO-SVM to predict the landslide displacement, and demonstrated that the proposed PSO-SVM model can better represent the response relationship between the factors and the periodic displacement. Chen et al. [56] used the evaluation model of short-term atmospheric pollutant concentration forecasting based on PSO-SVM, which demonstrated the superior performance of the proposed hybrid model. However, little attention has been paid to developing a hybrid risk-prediction framework for metro station construction by using the PSO-SVM model.
Design science is considered as practical knowledge used to support design activities, which seeks various approaches to a real-world problem of interest to practice [57,58]. Improving the ability of identifying safety risks is an available method to promote the implementation of DFS at the design phase of engineering project [59]. Therefore, this paper aims to construct a safety risk-prediction model to improve the performance of DFS in practice based on PSO-SVM. The PSO-SVM intelligent prediction model is used to predict the safety risks of a specific metro station construction project. The remainder of this paper is presented as follows. Section 2 introduces the basic principle and analysis method of PSO-SVM. Section 3 constructs the framework for the safety risk prediction of metro station construction. Section 4 applies the hybrid model to predict the safety risk of a case study of metro station construction. Finally, conclusions are presented in Section 5.

Support Vector Machine
The SVM [60] is a machine learning method based on the extended development of statistical learning theory [61,62]. The basic idea of the theory is to use non-linear mapping to project data points from a low-dimensional space into a high-dimensional space and linearly regress them in the high-dimensional feature space [53].
Machine learning is mainly divided into supervised learning and unsupervised learning [63]. Supervised learning usually requires a fully annotated training set to train a model which can be generalized to other unseen data. Unsupervised learning applies to datasets which only have input features but are missing annotations. In this research, feature attributes and annotations are both provided, and a machine learning model of metro construction safety risk is trained by analyzing the relationship between influencing factors and safety risks. Therefore, this paper addresses the problem of prediction of safety risk of metro construction engineering from a novel machine-learning perspective. Scholars have proposed some machine-learning methods, including neural network, artificial neural network (ANN), back propagation (BP), decision tree and SVM, etc. Traditional neural networks (NN) face some problems related to convergence and local optimization [56]. Moreover, the defects of faulty theory foundation, local minimum and over fitting weakened the ability of prediction [64]. Meanwhile, ANN shows a promising performance in fitting non-linear variables, but the complex relationships amongst some variables can affect its performance [65]. Back propagation (BP) is widely used in neural network models, which are trained by the error back propagation algorithm. The convergence speed of the back propagation neural network is slow, and it cannot guarantee the convergence to the global optimum. To address these problems, SVM is used for machine learning given its excellent performance in dealing with the statistical learning theory for small sample and in addressing global optimization and the principle of structural risk minimization [56].
A successful supervised learning algorithm usually contains two stages: the training stage and the practical application stage. The purpose of supervised learning training is to analyze the dependency $x_i \to y_i$ between the input and the target according to the given training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Assuming the evaluation function is $f: x \to f(x)$, $y = f(x)$, the output $y$ is the target classification based on the evaluation function, as shown in Figure 1. Suppose there are $N$ samples in the dataset space, with $(x_i, y_i)_{1 \le i \le N}$ as training samples. The input variables are mapped into a high-dimensional linear feature space through a non-linear transformation.
Then the optimal decision function is constructed. The dot product operation in the higher dimensional feature space is replaced by the kernel function in original space, and the global optimal solution is obtained by the training of the finite sample. Equation (1) represents the classification hyperplane.
In Equation (1), ω is the weight vector; b is the bias; "·" is the inner product; $x_i$ is a $D$-dimensional real input vector; and $y_i = \pm 1$ is the corresponding annotation of $x_i$, which indicates whether the sample belongs to the positive or the negative class [66]. In order to maximize the interval, one only needs to minimize $\frac{1}{2}\omega^{T}\omega$. The basic type of SVM is shown in Equation (2), where $i = 1, 2, \cdots, N$. This basic type is a constrained convex quadratic programming problem. In order to solve its dual problem, the Lagrange function is used to fuse the constraints into the objective function, as defined in Equation (4), where $\alpha_i$ is the Lagrange multiplier for each training sample. The samples for which $\alpha_i \neq 0$ are the support vectors, lying on one of the two hyperplanes $(\omega \cdot x_{+}) + b = +1$ and $(\omega \cdot x_{-}) + b = -1$.
According to the Karush-Kuhn-Tucker condition, the optimization problem must satisfy the last condition in Equation (4) [67]. When dealing with non-linear problems, SVM introduces a kernel function to map the training samples from the original space to a high-dimensional space, where $K(x_i, x_j)$ is the kernel function of SVM that represents the inner product of two vectors [68].
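The equations referred to as (1), (2) and (4) do not appear in this text. As a reading aid, and assuming the standard hard-margin SVM formulation (which agrees with the symbols $\omega$, $b$, $\alpha_i$ and $K(x_i, x_j)$ defined above), they would take roughly the following form; the exact layout and numbering of the original equations may differ.

```latex
\begin{align}
  % Separating hyperplane
  \omega \cdot x + b &= 0 \\
  % Primal problem: maximize the margin
  \min_{\omega,\, b}\ \frac{1}{2}\,\omega^{T}\omega
    \quad &\text{s.t.}\quad y_i\,(\omega \cdot x_i + b) \ge 1,
    \qquad i = 1, 2, \cdots, N \\
  % Dual problem obtained from the Lagrange function
  \max_{\alpha}\ \sum_{i=1}^{N} \alpha_i
    - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
      \alpha_i \alpha_j\, y_i y_j\, (x_i \cdot x_j)
    \quad &\text{s.t.}\quad \sum_{i=1}^{N} \alpha_i y_i = 0,
    \qquad \alpha_i \ge 0 \\
  % Karush-Kuhn-Tucker complementary slackness
  \alpha_i \left[\, y_i\,(\omega \cdot x_i + b) - 1 \,\right] &= 0 \\
  % Kernel substitution for the non-linear case, with feature map \phi
  K(x_i, x_j) &= \phi(x_i) \cdot \phi(x_j)
\end{align}
```

Under this formulation, the complementary slackness condition is what forces $\alpha_i = 0$ for every sample that does not lie on one of the two margin hyperplanes, so only the support vectors contribute to the decision function.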
The radial basis function (RBF) is one of the most popular kernel functions, and the RBF kernel is adopted for the non-linear problem in this paper, where g is the kernel parameter that measures the width of the RBF kernel. Two important parameters therefore affect the learning performance and classification accuracy of the SVM: the penalty parameter c and the kernel function parameter g. It is necessary to use an optimization algorithm to tune these parameters.
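To make the role of g concrete, the following minimal sketch evaluates an RBF kernel for two feature vectors. It assumes the LIBSVM-style parameterization $K(x, z) = \exp(-g\,\lVert x - z \rVert^2)$, which is not stated explicitly in this text; the function name `rbf_kernel` is illustrative only.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], g: float) -> float:
    """Return exp(-g * ||x - z||^2) for two feature vectors.

    Assumption: LIBSVM-style RBF form, where g multiplies the squared distance.
    """
    if len(x) != len(z):
        raise ValueError("x and z must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * squared_distance)
```

Under this assumed form, identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart; a larger g makes that decay faster.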

The Multi-Classification Support Vector Machine (SVM) Prediction Model Based on Binary Tree
The classic SVM is mainly designed for binary classification problems and cannot be directly used for multiclass classification problems. However, there are several levels of safety risk in metro construction, and a binary classifier cannot meet the requirements of prediction. Therefore, this study uses a binary tree to solve the multiclass classification problem. The principle is to divide all categories into two sub-classes by constructing a binary tree and using a single SVM for each binary classification; each sub-class is then divided into two further sub-classes, and this continues until every node contains only a single category. For an $N$-class classification problem, the method needs $N - 1$ binary classifiers, applied in turn starting from the SVM at the root node. According to the specific classification problem, binary tree algorithms can be divided into complete binary trees and partial binary trees [69], as shown in Figure 2. In Figure 2b, the sorting order of the classifiers in the partial binary tree is 1, 2, 3, and 4, corresponding to the 4 categories respectively. In the partial binary tree structure, we assume that the correct rate of each layer is $p_1, p_2, \ldots, p_k$. If the correct rate of class division is 1, the classification accuracy of all categories is given by Equation (7), from which Equation (8) follows. It can be found from Equation (8) that the deeper the binary tree classifier is, the lower the recognition accuracy. Only when the shallow SVMs classify correctly can the performance of the deep SVMs be improved. Therefore, in the binary tree structure, the upper classification nodes usually play a more important role. Accordingly, a safety risk classification and prediction model based on a binary tree SVM is designed, as shown in Figure 3. It can be seen from Figure 3 that the higher safety risk levels of metro construction are placed at the upper nodes.
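The effect of depth on accuracy can be illustrated numerically. The sketch below assumes that a class resolved at layer k is classified correctly only if every layer from 1 to k decides correctly, so its accuracy is the running product $p_1 p_2 \cdots p_k$; this is the usual reading of such a cascade, but Equations (7) and (8) are not reproduced in this text, so treat it as an assumption. The function name is hypothetical.

```python
from itertools import accumulate
from operator import mul
from typing import List, Sequence


def class_accuracies(layer_accuracies: Sequence[float]) -> List[float]:
    """Accuracy of the class resolved at each depth of a partial binary tree.

    Assumption: the class decided at layer k is correct only if layers 1..k
    are all correct, giving the running product p_1 * p_2 * ... * p_k.
    """
    return list(accumulate(layer_accuracies, mul))
```

For example, with three layers that are each 90% accurate, the running products decrease with depth, which is why the class with the most severe consequences is best placed at the root.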
The structural design of this binary tree classifier is mainly based on two aspects: (1) the higher the safety risk level of the metro construction, the greater the loss caused by an accident; (2) the recognition accuracy of the binary tree classifier decreases as the depth increases. Therefore, priority is given to identifying the level I safety risk, which is the most critical and causes the largest loss, to ensure the accuracy of identification.
The binary tree SVM prediction model for the safety risk classification of metro construction includes three binary SVM classifiers, each of which determines one category. The predictive model is trained as follows: (1) the collected construction safety risk samples of Class I are taken as the positive class of the first sub-classifier and labeled 1; the remaining three classes of samples are combined as the negative class of the first sub-classifier and labeled -1, and the trained classifier is SVM 1; (2) the Class II construction safety risk samples are selected from the remaining three classes as the positive class of the second sub-classifier and labeled 1; the remaining Class III and Class IV samples are combined as the negative class of the second sub-classifier and labeled -1, and the trained classifier is SVM 2; (3) the third sub-classifier, SVM 3, is established to complete the construction of the binary tree SVM.
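The cascade described above can be sketched independently of any particular SVM library. In the following minimal example, each node is represented by a plain callable that returns +1 or -1, standing in for a trained binary SVM; the names `node_training_sets` and `predict_risk_level` are hypothetical, and the assumption that SVM 3 separates Class III (positive) from Class IV (negative) is inferred from the tree structure rather than stated in the text.

```python
from typing import Callable, List, Sequence, Tuple

LEVELS = ["I", "II", "III", "IV"]  # risk levels, most severe first

Sample = Tuple[Sequence[float], str]            # (feature vector, risk level)
BinaryClassifier = Callable[[Sequence[float]], int]  # returns +1 or -1


def node_training_sets(samples: List[Sample]) -> List[List[Tuple[Sequence[float], int]]]:
    """Build the relabeled training set for each of the N - 1 tree nodes.

    Node k takes level k as the positive class (+1) and all remaining, less
    severe levels as the negative class (-1); samples of levels already
    separated by earlier nodes are dropped.
    """
    node_sets = []
    remaining = list(samples)
    for level in LEVELS[:-1]:
        node_sets.append([(x, 1 if y == level else -1) for x, y in remaining])
        remaining = [(x, y) for x, y in remaining if y != level]
    return node_sets


def predict_risk_level(x: Sequence[float], classifiers: Sequence[BinaryClassifier]) -> str:
    """Walk the tree from the root: the first node answering +1 decides the level."""
    for level, classifier in zip(LEVELS, classifiers):
        if classifier(x) == 1:
            return level
    return LEVELS[-1]  # no node claimed the sample, so it falls to the last level
```

In practice each callable would wrap the decision function of a trained SVM; the tree logic itself stays the same.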

Parameter Optimization of SVM Model Based on Particle Swarm Optimization (PSO)
The SVM model has a good ability to solve small-sample, high-dimensional and non-linear problems. However, the choice of the kernel function parameter g and the penalty parameter c has an important influence on the accuracy of the SVM model. Different types of kernel functions determine different properties of the SVM model. At present, kernel functions such as the linear kernel, polynomial kernel, sigmoid function and RBF are commonly used for SVM modeling [70]. With its wide convergence domain, the RBF can approximate arbitrary non-linear and high-dimensional functions, so it is the most widely used kernel function. In addition, the RBF is a preferred choice, since it effectively reduces model complexity by requiring only c and g to be adjusted [71].
In the SVM-RBF model, appropriate parameter settings have a strong impact on the classification accuracy of the SVM model [72]. The penalty parameter c represents a "degree of punishment" that controls the sampling error. If the value of c is large, the model may suffer from overfitting; that is, the model fits the training data well but performs poorly on unseen data. If the value of c is small, the complexity of the model is reduced and its generalization ability may improve, but the model may also suffer from underfitting. The parameter g of the kernel function represents the width of the RBF kernel function, and the larger the value of g, the higher the correlation between the support vectors. Therefore, c and g jointly affect the performance of the SVM. Only by adjusting the model parameters until the best combination is found can the learning ability and prediction effect of the SVM be improved. Therefore, the penalty parameter c and the kernel parameter g should be optimized.
The traditional methods of SVM parameter selection are generally the cross-validation method and the grid search method. These two methods suffer from low efficiency and low precision, and the searched parameters are not guaranteed to be optimal. PSO is an algorithm developed in recent years for solving optimization problems, inspired by the feeding behavior of bird flocks. PSO was first proposed by Kennedy and Eberhart [73]. In PSO, each particle represents a potential solution to the problem and is characterized by its position, speed and fitness value. Each particle updates its position and speed in the next iteration by tracking the fitness extreme values, which mainly include the individual extreme value $P_{best}$ and the global extreme value $g_{best}$. The position of the particle's previous best performance is stored in a vector called $P_{best}$, and the $g_{best}$ value is tracked by the particle swarm optimizer.
The fitness value can be calculated through the fitness function, which estimates the merit of the particles. After discovering $P_{best}$ and $g_{best}$, PSO updates the speed and position of each particle [56].
PSO has the advantages of a simple algorithm structure, high precision, fast convergence speed and a strong global search ability. It is widely used in data optimization and data mining. This research adopts PSO to search for and optimize the kernel parameter and penalty factor of the SVM model instead of the traditional parameter optimization methods [74]. By constructing the PSO-SVM prediction model, the learning ability and prediction effect of the safety risk-prediction model of metro construction are improved.
In the $D$-dimensional space, the vector $X_i = (X_{i1}, X_{i2}, \cdots, X_{iD})^{T}$ represents the $i$-th particle, where $i = 1, 2, \cdots, n$; $X_i$ is the position of the $i$-th particle and a possible solution.
The velocity and position of the particles are updated iteratively according to Equation (9) [75], where $V_i$ is the velocity of the $i$-th particle, and $P_i = (P_{i1}, P_{i2}, \cdots, P_{iD})$ is the optimal position of this particle. The optimal swarm position is $P_g = (P_{g1}, P_{g2}, \cdots, P_{gD})$. For the $i$-th particle at the $k$-th iteration, $X_{id}^{k+1}$ and $V_{id}^{k+1}$ are the $d$-th position and velocity components. Parameters $c_1$ and $c_2$ are the learning factors, $r_1$ and $r_2$ are random numbers in the range 0 to 1, and ω is the inertia weight of the PSO algorithm.
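A single update step can be written out as follows. This is a minimal sketch that assumes the standard inertia-weight form of PSO, $V_{id}^{k+1} = \omega V_{id}^{k} + c_1 r_1 (P_{id} - X_{id}^{k}) + c_2 r_2 (P_{gd} - X_{id}^{k})$ and $X_{id}^{k+1} = X_{id}^{k} + V_{id}^{k+1}$, which matches the symbols defined above but is not reproduced in this text. The function name `pso_step` is hypothetical, and $r_1$, $r_2$ are passed in explicitly so that the step is deterministic.

```python
from typing import List, Sequence, Tuple


def pso_step(
    x: Sequence[float],
    v: Sequence[float],
    p_best: Sequence[float],
    g_best: Sequence[float],
    w: float,
    c1: float,
    c2: float,
    r1: float,
    r2: float,
) -> Tuple[List[float], List[float]]:
    """Apply one velocity and position update to a single particle.

    Assumption: standard inertia-weight PSO. The new velocity combines the
    previous velocity (scaled by w), a pull towards the particle's own best
    position, and a pull towards the swarm's best position.
    """
    new_v = [
        w * vd + c1 * r1 * (pb - xd) + c2 * r2 * (gb - xd)
        for xd, vd, pb, gb in zip(x, v, p_best, g_best)
    ]
    new_x = [xd + nvd for xd, nvd in zip(x, new_v)]
    return new_x, new_v
```

A particle that already sits at both its personal best and the global best with zero velocity stays where it is, which is the fixed point the swarm converges towards.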
The process of using PSO for parameter optimization is as follows:
(i) The population is initialized. The population size, the maximum number of iterations, and the optimization ranges of the penalty factor c and the kernel parameter g are set. The learning factors $c_1$ and $c_2$ follow a linear learning strategy, and the inertia weight ω follows a linear decreasing strategy.
(ii) The initial position $x_i^0$ and velocity $v_i^0$ of each particle are generated randomly within the allowed range.
(iii) Fitness calculation: the fitness value is the mean squared error (MSE) obtained by cross-validation on the training set.
(iv) The fitness value $f_i$ of the current position of each particle is compared with its individual extreme value $P_{best}$; if $f_i < P_{best}$, then $P_{best} = f_i$; otherwise $P_{best}$ remains unchanged.
(v) The individual optimal value $P_{best}$ of each particle is compared with the population global extremum $g_{best}$; if $P_{best} < g_{best}$, then $g_{best} = P_{best}$; otherwise $g_{best}$ remains unchanged.
(vi) If the termination condition is satisfied, the iteration stops and the position of the optimal particle is output, that is, the optimal penalty coefficient c and kernel function parameter g of the SVM; otherwise, steps (iii)-(vi) are repeated.
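Steps (i) to (vi) can be sketched as a compact loop. The example below is not the MATLAB/LIBSVM implementation used in this study: the fitness function `toy_fitness` is a hypothetical stand-in for the cross-validation MSE of an SVM, the inertia weight endpoints 0.9 and 0.4 and the velocity clamp are assumptions, fixed learning factors are used for simplicity, and termination is simply a fixed number of iterations.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    fitness: Callable[[List[float]], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 20,
    max_iter: int = 100,
    c1: float = 1.5,
    c2: float = 1.7,
    w_start: float = 0.9,  # assumed start of the linearly decreasing inertia weight
    w_end: float = 0.4,    # assumed end of the linearly decreasing inertia weight
    seed: int = 0,
) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    # (i)-(ii): random initial positions and velocities within the allowed range.
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[rng.uniform(lo - hi, hi - lo) for lo, hi in bounds] for _ in range(n_particles)]
    # (iii): initial fitness; each start position is that particle's personal best.
    p_best = [x[:] for x in xs]
    p_best_fit = [fitness(x) for x in xs]
    best = min(range(n_particles), key=lambda i: p_best_fit[i])
    g_best, g_best_fit = p_best[best][:], p_best_fit[best]

    for k in range(max_iter):
        w = w_start - (w_start - w_end) * k / max(1, max_iter - 1)
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[i][d]
                     + c1 * r1 * (p_best[i][d] - xs[i][d])
                     + c2 * r2 * (g_best[d] - xs[i][d]))
                v_max = hi - lo  # assumption: clamp velocity to the range width
                vs[i][d] = max(-v_max, min(v_max, v))
                xs[i][d] = max(lo, min(hi, xs[i][d] + vs[i][d]))
            f = fitness(xs[i])
            if f < p_best_fit[i]:           # (iv): update the individual extreme value
                p_best[i], p_best_fit[i] = xs[i][:], f
            if p_best_fit[i] < g_best_fit:  # (v): update the global extreme value
                g_best, g_best_fit = p_best[i][:], p_best_fit[i]
    return g_best, g_best_fit               # (vi): output the best (c, g) found


def toy_fitness(params: List[float]) -> float:
    """Hypothetical stand-in for the cross-validation MSE, minimal at c=10, g=1."""
    c, g = params
    return (c - 10.0) ** 2 + (g - 1.0) ** 2
```

To use it for SVM tuning, `toy_fitness` would be replaced by a function that trains an SVM with the candidate (c, g) and returns its cross-validation error on the training set; the search loop itself would not need to change.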

Framework for Safety Risk Prediction of Metro Station Construction
In this section, a hybrid risk-prediction framework for metro station construction is presented by using the PSO-SVM prediction model. The flowchart of this framework is presented in Figure 4. First, the literature review and code of construction safety management analysis methods are used to identify the influencing factors of safety risk in metro station construction. Second, a structured interview is used to collect safety risk cases of metro construction projects. Then, the PSO-SVM model is constructed to predict the safety risk. Finally, the proposed framework is applied to a case study of metro station construction.

Stage 1: Identify the Influencing Factors of Safety Risk in Metro Construction
In order to obtain the influencing factors of metro construction safety risk more comprehensively, the relevant core influencing factors are selected from the existing construction safety management standards, specifications and literature analysis.

Stage 2: Collection of Safety Risk Cases in Metro Construction
Due to the complexity of the metro construction process, the related safety risk influencing factors frequently cannot be measured directly; the research data can only be obtained indirectly. Therefore, an expert interview method is adopted to collect construction safety risk cases. The expert interview is mainly divided into structured interviews and semi-structured interviews. The difference mainly lies in the degree of the researcher's control over the interview process. Structured interviews usually use questionnaires which are uniformly designed and structured, while semi-structured interviews can be adjusted in time according to the actual situation of the interview, and there is no strict interview outline. The main purpose of the investigation is to sort out the evaluation of risk events by experts in the process of metro station construction to form expert experience samples, to analyze the relationship between risk events and identified influencing factors, and to provide case samples for safety risk prediction of metro construction. Therefore, the structured interview method of experts filling in the questionnaire is adopted in this research to carry out the investigation. In order to ensure the reliability of the collection of safety risk cases in metro construction, experts (project employers, contractors, and supervisors) who were engaged in long-term safety management works in metro construction were invited to fill out the questionnaire. All of experts had over 10 years of working experience and participated in more than five metro station construction projects. The above research provides sample data for subsequent construction safety risk-prediction research.

Stage 3: Construction of the PSO-SVM Prediction Model
The establishment of the safety risk-prediction model of metro construction mainly includes training and testing. Firstly, safety risk cases are collected as sample data; then, the processed data samples are selected randomly as training sets, and the remaining samples are used as test sets. Secondly, the PSO algorithm is used to optimize the model parameters, which can be derived according to Equation (9), and the training set is used to learn the SVM model. Finally, the accuracy of the prediction ability of the model is tested by comparing the test results with the original data. The flow chart of the safety risk intelligent prediction model of metro construction is shown in Figure 5.
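The split-and-evaluate procedure can be sketched in a few lines. This is an illustrative example only: the helper names `split_samples` and `accuracy` are hypothetical, and the random seed is an assumption added for reproducibility.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], n_train: int, seed: Optional[int] = None
) -> Tuple[List[T], List[T]]:
    """Randomly select n_train samples for training; the rest form the test set."""
    if not 0 <= n_train <= len(samples):
        raise ValueError("n_train must be between 0 and the number of samples")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of samples whose predicted risk level equals the actual level."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)
```

Splitting 68 samples with `n_train=57` reproduces the 57/11 partition reported for the case study, and `accuracy` corresponds to the comparison of test results with the original data.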

Stage 4: The Safety Risk Prediction of Metro Station Construction
If the classification accuracy of the constructed PSO-SVM prediction model is relatively high for both the training set and the testing set, the PSO-SVM model can make a more scientific prediction of the safety risk of metro construction. According to the sample classification accuracy obtained in Stage 3, it is determined whether to use the PSO-SVM prediction model constructed in this paper to predict the safety risk of the specific metro station construction project.

Determination of Influencing Factors of Safety Risk in Metro Station Construction
The code of construction safety management is a summary of construction safety management after years' experience. Therefore, code reading and review is an effective way to acquire factors (see Table 1). The main code of construction safety management issued by mainland China were mainly used in this research, and the scope was appropriately expanded; meanwhile, the relevant laws and regulations of Hong Kong, Singapore, Japan, and other regions or countries were referenced as shown in Table 1. Then, based on the in-depth analysis of the relevant literature on the influencing factors of safety risk in construction (especially metro engineering), the influencing factors that were generally considered to be more important in most studies were extracted, as shown in Table 2. Hydrogeology, engineering geology and surrounding environment are important basic data in the process of metro construction. The uncertain factors such as complex surrounding environment, hydrogeological conditions, and engineering geological conditions constitute the dangerous source environment of the construction safety risk, which is the original factor impact of the safety risk of metro construction [76]. Because the dangerous source environment provides technical parameters for design scheme, the uncertainty of environment of dangerous sources will have an impact on the engineering design and the construction scheme design. In addition, if the dangerous source environment is handled improperly in the construction process, it will directly cause safety accidents. The main basis of the metro construction process is the design scheme. It can be said that the safety hazard is also designed [77]. Therefore, engineering design defects or errors will directly cause safety risks. According to the process of engineering design → construction scheme design → implementation scheme, engineering design is the basis of construction scheme design. 
Therefore, the engineering design will have an impact on construction scheme design. The relationship between the factors is shown in Figure 6. Accordingly, the influencing factors of safety risk were identified in this research from three dimensions: the dangerous source environment, the project design scheme and the construction scheme design. The resulting influencing factors of metro construction safety risks are shown in Table 2.
The design of the envelope structure is the temporary or permanent structure to resist the unfavorable external environment in the process of developing underground space. GB 50652, GB 50656 [83,89]
C8 Support system design: The support system is the temporary structure which resists the internal or external deformation of the enclosure during the excavation of the foundation pit, which is one of the main causes of safety accidents.

Collection of Safety Risk Cases in Metro Construction
During the interview process, in order to ensure that the experts could make more accurate judgments on the content of the interviews, the options were described in the questionnaire, and an expert judgment reference was designed. For example, the respondent can make a judgment on the construction safety risk level according to the risk-level standard in the "Guidelines for Risk Management of Urban Rail Transit Underground Engineering Construction" (GB50652-2011), as shown in Table 3.

Table 3. Risk-level standard of metro construction project.

| Probability | Likelihood | (A) | (B) | (C) | (D) | Ignorable (E) |
|---|---|---|---|---|---|---|
| >0.1 | Frequent | I | I | I | II | III |
| 0.01-0.1 | Possible | I | I | II | III | III |
| 0.001-0.01 | Unmeant | I | II | III | III | IV |
| 0.0001-0.001 | Infrequent | II | III | III | IV | IV |
| <0.0001 | Impossible | III | III | IV | IV | IV |

A Likert five-point system was used to measure safety risk influencing factors, and a corresponding judgment basis was designed. For the influencing factors that can be quantitatively measured, the corresponding judgment basis was referenced. According to the rules on the relationship between the safety level of the foundation pit and the thickness of the soft soil layer [91,92], the measurement basis of influencing factors on the soft soil layer thickness is shown in Table 4. For qualitative influencing factor indicators, the judgment basis can be formulated according to the meaning of the indicators. For example, if the design scheme of monitoring and measurement fully meets the monitoring layout, monitoring accuracy and design requirements, a score of 5 will be given; otherwise, a score of 1 will be given, as shown in Table 5.
Basically consistent; 4: More consistent; 5: Fully consistent
Because the interviewees directly affect the reliability, comprehensiveness and effectiveness of the safety risk cases obtained for metro construction, only interviewees with at least 10 years of safety management experience in metro construction projects were selected.
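The risk-level standard of Table 3 is effectively a lookup from an accident probability and a consequence grade to a risk level, which can be sketched as below. The matrix values are taken from Table 3; the grade letters A to D are assumed labels (only "Ignorable (E)" is named in this text), the treatment of values that fall exactly on a band boundary is an assumption, and the function names are hypothetical.

```python
# Rows: likelihood bands from Table 3; columns: consequence grades A..E.
RISK_MATRIX = {
    "Frequent":   ["I", "I", "I", "II", "III"],
    "Possible":   ["I", "I", "II", "III", "III"],
    "Unmeant":    ["I", "II", "III", "III", "IV"],
    "Infrequent": ["II", "III", "III", "IV", "IV"],
    "Impossible": ["III", "III", "IV", "IV", "IV"],
}
# Assumption: grades A-D are inferred labels; only "Ignorable (E)" appears in the text.
CONSEQUENCE_GRADES = ("A", "B", "C", "D", "E")


def likelihood_band(probability: float) -> str:
    """Map an accident probability to its likelihood band.

    Assumption: a value exactly on a boundary is assigned to the higher band
    below 0.1, and 0.1 itself is treated as "Possible".
    """
    if probability > 0.1:
        return "Frequent"
    if probability >= 0.01:
        return "Possible"
    if probability >= 0.001:
        return "Unmeant"
    if probability >= 0.0001:
        return "Infrequent"
    return "Impossible"


def risk_level(probability: float, consequence: str) -> str:
    """Return the risk level (I-IV) for a probability and a consequence grade."""
    column = CONSEQUENCE_GRADES.index(consequence)  # raises ValueError if unknown
    return RISK_MATRIX[likelihood_band(probability)][column]
```

A lookup of this kind mirrors how a respondent would read the table when assigning a risk level to a reported event.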
The interview questionnaire was divided into three parts: the first part was the basic information of the interviewees, including the work unit, present assignment, professional title, educational background, years engaged in metro construction and project location, etc.; the second part was that interviewees who had participated or interviewees who were participating made a judgment on safety risk events of the metro station construction, such as risk type and risk level; and the third part was that interviewees made a judgment on the related safety factors aimed at construction safety risk events-for example, whether the hydrogeological condition and the selection of a support scheme were reasonable.
After the structured questionnaires of 70 experts were sorted, two questionnaires were found in which the experts believed that the safety risk events had nothing to do with the influencing factors identified. Excluding these two questionnaires, a total of 68 case samples were obtained. The safety risk categories and grades in metro station construction are shown in Table 6.

Table 6. Classifications and grade statistics of safety risks in metro station construction.

| Risk Category | Risk Level I | Risk Level II | Risk Level III | Risk Level IV |
|---|---|---|---|---|
| Instability and failure of foundation pit | 10 | 21 | 29 | 8 |

Construction of the PSO-SVM Model
As mentioned above, a total of 68 case samples were collected in the construction of the safety risk-prediction model of the foundation pit instability damage of metro station engineering. In this case study, 57 group samples were selected randomly as training samples and 11 group samples were used as testing samples to train and test the PSO-SVM model. The sample data are shown in Table 7. Table 7. The sample data for the instability failure of the foundation pit.

Sample Influence Factor Risk Level
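The random division of the 68 cases into 57 training samples and 11 testing samples can be sketched as follows. This is a minimal Python illustration, not the MATLAB procedure used in the study; the function name and the seed argument are assumptions introduced here so that a split can be reproduced.

```python
import random


def split_cases(n_cases=68, n_train=57, seed=None):
    """Randomly split case indices into a training set and a testing set.

    Returns two sorted lists of indices in the range 0..n_cases-1.
    """
    if not 0 < n_train < n_cases:
        raise ValueError("n_train must be between 1 and n_cases - 1")
    rng = random.Random(seed)          # local generator, reproducible per seed
    indices = list(range(n_cases))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_cases(seed=0)
print(len(train_idx), len(test_idx))   # 57 11
```

Calling the function with different seeds gives different random partitions of the same 68 cases.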
According to the sample data for the foundation pit instability safety risk-prediction model in Table 7, MATLAB (2014b) (MathWorks, Natick, USA) and the LIBSVM (Version 3.22) toolbox were used to implement the SVM model according to the construction process. When the PSO-SVM program was written, the initial parameters of the model were set. The 2011 Standard PSO with 20 particles and 50 iterations was used as the basis [93]. In this paper, the feasible range of values was extended: the number of particles was set to 20 and the maximum iteration number to k_max = 100. Because reference values for the optimal g and c are lacking, the value ranges should be enlarged [94]. The value range of g was therefore g ∈ [2^-8, 2^8] and that of c was [0.1, 100]. In PSO, the learning factors are generally chosen between 0 and 2; in this paper, they were set to c1 = 1.5 and c2 = 1.7 [95]. In addition, the criterion for parameter evaluation was the minimum root mean square error (MSE). The machine environment was an i5-2430M central processing unit (CPU) at 2.40 GHz with 4.0 GB of memory, running the Windows 7 operating system.
After initialization, the program read the sample data of the training set. The sample data of the foundation pit instability failure category were put into the LIBSVM toolbox in MATLAB, the optimal (c, g) parameter combination was searched by the PSO, and the 57 randomly selected samples were trained to obtain a training model. The penalty parameter c = 26.70 and the kernel parameter g = 0.039 of the SVM were obtained. The comparison between the actual and predicted values in the training set is shown in Figure 7, where the x-axis represents the samples of the training set and the y-axis is the class value of the safety risk level. Then, the regression prediction was made for the 11 randomly selected test samples. The comparison between the actual value and the predicted value is shown in Figure 8.
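The parameter search described above can be illustrated with a short Python sketch. It uses the settings stated in the text (20 particles, k_max = 100, c1 = 1.5, c2 = 1.7, c in [0.1, 100], g in [2^-8, 2^8]), but several points are assumptions: it is a plain global-best PSO rather than the 2011 Standard PSO, the inertia weight of 0.7 and the zero initial velocities are not given in the text, and toy_error is a hypothetical stand-in for the SVM training error, which in the study would be computed with LIBSVM.

```python
import math
import random


def pso_search(objective, bounds, n_particles=20, k_max=100,
               c1=1.5, c2=1.7, inertia=0.7, seed=0):
    """Minimise objective(position) inside box bounds with global-best PSO.

    Returns (best_position, best_value, history); history[k] is the best
    value found after k iterations. inertia is an assumed value.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    best_i = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[best_i][:], pbest_val[best_i]
    history = [gbest_val]
    for _ in range(k_max):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the particle to the feasible range of the parameter.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_error(params):
    """Hypothetical smooth error surface, used only so the sketch runs."""
    c, g = params
    return (math.log10(c) - 1.0) ** 2 + (math.log2(g) + 4.0) ** 2


BOUNDS = [(0.1, 100.0), (2.0 ** -8, 2.0 ** 8)]   # ranges of c and g from the text
best, best_val, history = pso_search(toy_error, BOUNDS)
print("c = %.3f, g = %.4f, error = %.6f" % (best[0], best[1], best_val))
```

To reproduce the workflow of the study, toy_error would be replaced by a function that trains an SVM with the candidate (c, g) and returns its error on the training data.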
From the results in Figure 7, it can be found that, among the 57 training samples, the predicted values of three samples do not coincide with the actual values. Therefore, the classification accuracy of the training samples of the SVM prediction model after parameter optimization is 94.74% (54/57). From the results in Figure 8, it can be found that, among the 11 test samples, only the predicted value of sample 8 does not coincide with the actual value. Therefore, the classification accuracy of the test samples is 90.91% (10/11). This shows that the model has high accuracy on both the training set and the test set, and that the PSO-SVM model can make a sound prediction of the safety risk level of foundation pit instability failure. In addition, the influencing factors in the model all come from the statistical analysis of expert knowledge, which also supports the rationality of integrating expert knowledge into the SVM prediction model.
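The two accuracy figures follow directly from the counts of correctly classified samples. The short Python sketch below reproduces the arithmetic; the helper name classification_accuracy is an assumption introduced for illustration.

```python
def classification_accuracy(actual, predicted):
    """Percentage of samples whose predicted level equals the actual level."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)


# Counts reported in the text: 54 of 57 training samples and
# 10 of 11 test samples were classified correctly.
print(round(100.0 * 54 / 57, 2))  # 94.74
print(round(100.0 * 10 / 11, 2))  # 90.91
```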

Safety Risk Prediction of the Metro Station Construction
Station D was the transfer station of lines 1 and 2 of urban rail transit in a Chinese city. The main length of the station was 683.1 m, the width of the standard section was 41.30 m, the maximum width was 52 m, the buried depth of the structural floor was about 17.02 m, and the thickness of the roof was about 3.0 m. It was a two-story double island station on the ground floor. The total construction area of the station was 58,150 m².
This study predicted the safety risk of the project in the construction preparation process of station D. Five safety managers, namely the project leader and the safety supervisor of the construction unit, the chief project manager, the safety supervisor of the construction enterprise and the safety director of the supervision enterprise, referred to the quantitative standard of the influencing factors to quantify the two influencing factors of the metro station (the stability of the foundation pit and the safety risk of the construction). Based on the combined expert opinions, the quantitative results of the safety risk influencing factors are shown in Table 8.
From the comparison of Figures 7 and 8, it can be found that the prediction accuracy of the PSO-SVM model is high. To reduce the influence of random sample selection on the SVM model, random sampling of the original case samples was repeated. Different random training sets were constructed, and the SVM model was used to observe the differences in test accuracy. Each time, 57 case samples were selected randomly as training data and the remaining 11 case samples were used as verification data. The accuracy of the test sets based on 10 random samplings is shown in Table 9. It can be found that the PSO-SVM model has high predictive accuracy on both the training and test sets for non-linear relationships and small samples.
Meanwhile, the relevant safety risk influencing factors (Table 8) were loaded into the prediction model, and the calculation results are shown in Table 9. From Table 9, it can be found that the instability and failure of the foundation pit of station D is predicted to be a class II safety risk, which is relatively high. From the results for the influencing factors of foundation pit instability and failure, it can be found that the quantitative results of poor geological distribution (C2) and construction precipitation design (C10) are 2, and the quantitative result of the excavation depth of the foundation pit (C6) is 1, which are relatively low. With the results of the other influencing factors held unchanged, these three low-scoring factors were adjusted step by step, and the PSO-SVM prediction model was used to calculate the change in the predicted safety risk level. In this way, the complex coupling relationship between influencing factors and construction safety risk was uncovered and the factors with the most important influence on construction safety risk were identified, as shown in Table 10.
From Table 10, it can be found that, with the quantitative results of the other influencing factors unchanged, the safety risk level of instability and failure of the foundation pit of the station project does not change when the quantitative result of poor geological distribution (C2) is adjusted from the original value of 2 to 3. When it is adjusted from the original value of 2 to 4, the level is reduced from level II to level III. Adjusting the quantitative result of the excavation depth of the foundation pit (C6) from the original value of 2 to 3 and 4 does not affect the safety risk of foundation pit instability and failure. When the quantitative result of the construction precipitation design (C10) is adjusted from 2 to 3, the safety risk level of foundation pit instability and failure is reduced from II to III.
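The single-factor adjustment described above amounts to changing one factor score at a time while holding the others fixed and recording the predicted level. A minimal Python sketch is given below. The trained PSO-SVM model is not available here, so placeholder_predict is a hypothetical rule chosen only so that the example runs; it is not the model of this study. The baseline contains only the three scores quoted in the text (C2 = 2, C6 = 1, C10 = 2), and levels are coded as integers (2 for level II, 3 for level III).

```python
def single_factor_sweep(predict_level, baseline, factor, new_values):
    """Return {new_value: predicted level} with all other factors held fixed."""
    results = {}
    for value in new_values:
        trial = dict(baseline)      # copy, so the baseline is never modified
        trial[factor] = value
        results[value] = predict_level(trial)
    return results


def placeholder_predict(factors):
    """Hypothetical stand-in for the trained model (assumption, not the SVM)."""
    return 3 if factors["C2"] >= 4 or factors["C10"] >= 3 else 2


baseline = {"C2": 2, "C6": 1, "C10": 2}
for name, values in (("C2", [3, 4]), ("C6", [2, 3, 4]), ("C10", [3])):
    print(name, single_factor_sweep(placeholder_predict, baseline, name, values))
```

In practice, predict_level would wrap the trained classifier and the baseline would hold all 12 factor scores from Table 8.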

Multi-Factor Dynamic Adjustment Analysis
In this study, the three influencing factors of poor geological distribution (C2), excavation depth of foundation pit (C6) and construction precipitation design (C10) are combined to study the change of the predicted construction safety risk level under different combinations. The calculation results are shown in Table 11.
Table 11. Level changes of safety risk prediction with multiple factors.

Influence Factor Quantitative Result Adjustment Foundation Pit Instability and Failure Risk Level
It can be seen from Table 11 that, in the C2 + C6 combination, adjusting C2 from the original value of 2 to 3 and C6 from the original value of 1 to 2 does not change the safety risk level of foundation pit instability and failure of the station project. Adjusting the two quantitative results to 4 and 2, respectively, reduces the safety risk level from level II to level III. In the C2 + C10 combination, adjusting both C2 and C10 from the original value of 2 to 3 reduces the safety risk level from level II to level III. In the C6 + C10 combination, adjusting C6 from the original value of 1 to 2 and C10 from the original value of 2 to 3 reduces the safety risk level from level II to level III; adjusting the two quantitative results to 3 and 4, respectively, likewise reduces the level from II to III. Therefore, the combination of poor geological distribution (C2) and construction precipitation design (C10) is the most favorable for reducing the construction safety risk level of the station, while the combination of poor geological distribution (C2) and excavation depth of foundation pit (C6) is the least favorable. The most effective way to reduce the construction safety risk is to design a reasonable construction precipitation scheme.
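The multi-factor analysis extends the same idea to several factors adjusted together. The Python sketch below enumerates every combination of candidate scores for a chosen set of factors. As before, placeholder_predict is a hypothetical rule standing in for the trained PSO-SVM model, and the function name combination_sweep and the integer coding of levels (2 for level II, 3 for level III) are assumptions made for illustration.

```python
from itertools import product


def combination_sweep(predict_level, baseline, grid):
    """Evaluate every combination of candidate scores.

    grid maps factor name -> list of candidate scores. The result maps each
    tuple of scores (in the order of grid's keys) to the predicted level.
    """
    names = list(grid)
    results = {}
    for combo in product(*(grid[name] for name in names)):
        trial = dict(baseline)              # other factors stay at baseline
        trial.update(zip(names, combo))
        results[combo] = predict_level(trial)
    return results


def placeholder_predict(factors):
    """Hypothetical stand-in for the trained model (assumption, not the SVM)."""
    return 3 if factors["C2"] >= 4 or factors["C10"] >= 3 else 2


baseline = {"C2": 2, "C6": 1, "C10": 2}
print(combination_sweep(placeholder_predict, baseline, {"C2": [3, 4], "C6": [2]}))
print(combination_sweep(placeholder_predict, baseline, {"C2": [3], "C10": [3]}))
```

Because the grid is passed as data, the same function covers the C2 + C6, C2 + C10 and C6 + C10 pairs without modification.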

Conclusions
Design for Safety (DFS) is one of the most effective ways to consider safety risks in the design process, and it is regarded as a risk-prevention technique for metro station construction. To improve the effectiveness of DFS in metro station construction, it is useful to incorporate a risk-prediction procedure into DFS. Therefore, this study proposes a comprehensive framework that uses the PSO-SVM model to predict the safety risk of metro station construction, which provides a valuable guideline for safety risk prediction in metro station construction and a useful reference for engineers and managers in the design process. Firstly, 12 influencing factors related to the safety risk of metro construction are identified by using the literature review and code of construction safety management analysis. Then, the structured interview method is used to collect the safety risk cases of metro construction. Next, the PSO-SVM model is presented to predict safety risk in metro construction, in which the multi-class SVM prediction model with an improved binary tree is designed. Finally, an illustrative example is used to demonstrate the efficiency of the proposed PSO-SVM approach.
In this study, the classification accuracy of the PSO-SVM prediction model is 94.74% (54/57) on the training samples and 90.91% (10/11) on the test samples, which shows that the model has high accuracy on both the training set and the test set. To reduce the influence of random sample selection on the SVM model, random sampling of the original case samples was repeated; different random training sets were constructed, and the SVM model was used to observe the differences in test accuracy. The test-set accuracies over the 10 random samplings were 81.82%, 90.91%, 71.73%, 90.91%, 100.00%, 81.82%, 90.91%, 71.73%, 81.82% and 90.91%, respectively. It can be found that the PSO-SVM model has high predictive accuracy on the test sets for non-linear relationships and small samples. In addition, the relevant safety risk-influencing factors were loaded into the PSO-SVM model. The result shows that the foundation pit of station D is predicted to be a class II safety risk, which is relatively high. Meanwhile, through the single-factor and multi-factor analyses, the complex coupling relationship between influencing factors and construction safety risk was uncovered. According to the prediction results, the most important influencing factors for reducing the safety risk of metro station construction were identified, which provides a guideline for the safety risk prediction of metro construction in the design process.
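The average test accuracy of 85.26% reported for this study can be checked directly from the ten values listed above. The following Python lines are a small arithmetic check only; the variable names are illustrative.

```python
from statistics import mean

# Test-set accuracies (%) of the 10 random samplings, as listed in the text.
test_accuracies = [81.82, 90.91, 71.73, 90.91, 100.00,
                   81.82, 90.91, 71.73, 81.82, 90.91]

average = round(mean(test_accuracies), 2)
print(average)                                      # 85.26
print(min(test_accuracies), max(test_accuracies))   # 71.73 100.0
```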
Further study will focus on the following directions. (1) This study mainly focused on the instability of foundation pits in metro station projects, which cannot cover all possible accident types in the process of metro construction. Other types of safety accidents should be investigated and analyzed in future research. (2) Intelligent safety risk prediction of metro construction in the design process is a relatively new area of research. A more in-depth analysis of the influencing factors, for example the technical parameters in the design process, is necessary.