A Global Optimization-Based Method for the Prediction of Water Inrush Hazard from Mining Floor

Abstract: Water inrush hazards can be effectively reduced by a reasonable and accurate soft-measuring method for the water inrush quantity from the mine floor, which is important for safe mining. However, there is a highly nonlinear relationship between water outburst from coal seam floors and geological structure, hydrogeology, aquifer, water pressure, water-resisting strata, mining damage, faults and other factors. It is therefore difficult to establish a suitable model for forecasting the water inrush quantity from the mine floor by traditional methods. Modeling methods developed in other fields can provide adequate models of rock behavior during water inrush. In this study, a new forecast system based on a hybrid of the genetic algorithm (GA) and the support vector machine (SVM) algorithm is proposed, in which the model structure and the related parameters are determined simultaneously for water inrush prediction. Using the powerful global optimization capability, implicit parallelism and high stability of the GA, the penalty coefficient, insensitivity coefficient and kernel function parameter of the SVM model are automatically determined as approximately optimal values in the parameter space. All of these characteristics greatly improve the accuracy and usable range of the SVM model. Testing results show that the GA is effective in finding optimal parameters of an SVM model. The performance of the GA-optimized SVM (GA-SVM) is superior to that of the SVM model alone. The GA-SVM enables the prediction of water inrush and provides a promising solution to this predictive problem for relevant industries.


Introduction
Water seepage causes constant difficulties in underground mining and creates a range of operational stability problems. Handling, pumping, treatment and disposal of mine water are serious problems [1]. Inundation in mining disasters is sudden and violent. Sudden water inrush has resulted in hundreds of fatalities in various countries [2]. In recent years, the safety of coal mining has improved with the development of science and technology. In developed countries, such as the United States, very few water inrush accidents have occurred since the application of dewatering techniques in mining [3]. However, more than 90% of the water inrush accidents in China were caused by water inflow from karst aquifers through coal seam floors [4]. According to incomplete statistics, from 2002 to 2012 there were 1110 inrush accidents involving various kinds of mines in China, leaving 4444 people dead or missing. Among [...] to be further studied in comprehensive aspects; additionally, the existing water inrush prediction methods need to be further improved in terms of ease of use. Therefore, prediction methods for water inrush from the coal seam floor still need to be explored. The support vector machine (SVM) has become a research hotspot in the machine learning community due to its excellent learning performance [25,26]. To automatically determine the optimal or approximately optimal parameters in the parameter space, a genetic algorithm (GA) is employed, as it offers powerful global optimization [27], implicit parallelism and high stability. The GA is used to determine the penalty coefficient, insensitivity coefficient and kernel function parameter of the SVM model. All of these characteristics greatly improve the accuracy and usability of the SVM model.
Motivated by the importance of predicting water inrush hazards, this study first analyzes the influencing factors and provides the corresponding data of water inrush cases in Section 2. Section 3 describes the GA-based method used to estimate the SVM model, and Section 4 presents the detailed calculation results. Finally, the implications of these findings are summarized in Section 5.

Analysis of Influencing Factors
Water inrush from the coal seam floor is affected by various factors, such as geological structure, hydrogeology and mining conditions [28]. As shown in Figure 1, the existence of a confined aquifer under the coal seam is the material basis of water inrush, the hydraulic pressure and the mine pressure are the force sources, and the water-resisting strata are the inhibiting condition. Their suppression ability depends on the thickness, strength and combination of the water-resisting strata [29]. When the hydraulic pressure, the mine pressure and the stability of the water-resisting strata are in a relatively balanced state, faults have the controlling effect. From in-situ field observation combined with a comprehensive analysis of the relevant data, it was concluded that five main factors affect water inrush from the floor [9,30,31].
Water 2018, 10, x FOR PEER REVIEW 3 of 17


1. Aquifer. The aquifer's water-richness is the material basis for the magnitude of water inrush. It determines the scale of the water hazard and the degree of threat to the mine [31]. Therefore, the aquifer is one of the important factors for water inrush from the coal floor. Water-richness is related to the development of karst fissures, runoff conditions, structural development and burial depth.
2. Hydraulic pressure. The hydraulic pressure in the aquifer is the driving force that pushes water out at the working face. Before outflow it acts as hydrostatic pressure, which has an expanding effect on the aquifer fissures [27]. The higher the water pressure, the more significant the effect. After outflow begins, the energy of the aquifer water is converted into kinetic energy: the fractures are scoured and expanded, the filling material is continuously carried away, the channels open further and the water inflow increases.

3. Thickness of the water-resisting strata. Water-resisting strata act as a barrier to water inrush from the floor. The barrier capacity mainly depends on the thickness of the water-resisting strata, the mechanical strength of the rocks [32,33] and the integrity of the water-resisting rock layer. Under given conditions, the thicker and stronger the water-resisting strata, the lower the probability of water inrush, and vice versa.

4. Depth of mining-induced failure zone. The depth of the mining-induced failure zone determines the degree of failure of the rock floor. Practice and theory prove that reducing the failure depth of the mined floor and increasing the thickness of the water-resisting strata are important measures for safe compensated mining under certain conditions [2]. The smaller the failure depth of the mined floor, the lower the probability of water inrush, and vice versa.
5. Fault fall. The damage caused by a fault to the coal and rock is mainly manifested as an increase in cracks and pores in the coal and rock layers near the fault [22] and a sharp decrease in strength. Different sizes of the fault gap result in different contact relationships between the coal seam and the aquifer on the two walls of the fault. The analysis shows that the larger the fault fall, the greater the impact of the fault and the more likely floor water inrush is to occur.

Data Collection of Water Inrush Cases
Karst water inrush is one of the major mine disasters in North China. It has many distinguishing features. The water source mainly lies in the fractures of Ordovician karst limestone aquifers, followed by those of the Carboniferous and Cambrian aquifers. The karst formation of the water inrush strata is dominated by fractures, followed by caves, pores and underground river conduits. The water inrush deposits are generally located below the local erosion reference surface, and the water storage structures are large in scale, most of them with abundant water resources for storage and supply [34]. Water inrush mainly occurs as sudden inrush from the coal floor, causing serious damage to coal mines. In particular, the five influencing factors listed above all exhibit obvious characteristics [35].
Therefore, this paper collected typical water inrush data from coal mines in northern China (Figure 2) as samples from a large number of cases. As described in the previous section, Table 1 lists the corresponding influencing factors for each case.



Methodology
In order to accurately predict water inrush from the coal seam floor, this paper first initialized the raw data so that they satisfied the input requirements of SVM training samples. At present, the SVM shows unique advantages and good application prospects in solving practical problems involving small samples, nonlinearity, high dimensionality and local minima [36]. It has been applied successfully in pattern recognition, density estimation, data mining, two-dimensional object recognition, remote sensing image analysis, nonlinear system control, function approximation, function fitting and regression estimation. However, there have been few studies on SVM prediction under strong colored noise [37]. Moreover, for the SVM, as for the radial basis function (RBF) network, the problem remains of how to select the most suitable kernel function for a specific problem. The prediction performance of the SVM is sensitive to the choice of parameters [38].
Figure 3 shows the established prediction system. First, the original data were defined and initialized as the input training data set of the SVM model; this is the data processing stage of the prediction model. Then, by coding the training data set, initial population parameters were randomly generated to produce a population. Through SVM training, the fitness of each individual in the population was calculated, and the best SVM parameters in the population were identified according to the fitness evaluation function. If the termination condition of SVM training was then met, i.e., the value of the loss function was smaller than the learning error rate, the optimal SVM model was output. Otherwise, the GA performed the genetic operations of copying, crossover and mutation on the current best parameters. New parameter populations were then generated, training continued and fitness was re-evaluated until the optimal SVM parameters were found and the optimal SVM model was output. This process is the GA-optimized SVM platform (GA-SVM) of this article.
Finally, according to the existing SVM model, the test data set was input into the trained SVM model. The actual data were compared and analyzed on the basis of the output results. According to the error analysis results, the accuracy of the prediction was judged on whether the trained SVM reached a predetermined precision requirement. If the requirements were not satisfied, i.e., the value of the loss function was larger than that of the learning error rate, the SVM model needed to be improved. If the requirements were met (the termination conditions of the SVM training model were met), the actual coal seam floor was predicted based on the SVM model optimized by the GA algorithm and applied to practice. This was the forecast and recall stage.

Data Preprocessing
Because a classification model was generated, the selection of training samples had a certain influence on the classification results [39-41]. With too few samples, under-learning occurs easily; with too many samples, the training cost increases and over-learning occurs easily. The key was to select representative samples for training.
Because the research data were field measurements from individual coal mines, they contained many unpredictable qualitative factors. Therefore, in order to meet the input data requirements of the SVM, i.e., the feature vectors of the classification model, the existing sample data were reprocessed in this paper. First, following Jin's research [42], the aquifer and the maximum water inrush volume were defined as shown in Table 2. Through these variable values, the qualitative concepts of aquifer type and water inrush grade are quantified so that they can be input into the SVM training data set. Based on the maximum water inrush of the cases in Table 1, the input information for the water inrush grade is listed in Table 3.

Table 2. Definition of the aquifer and the maximum water inrush volume.

| Variable Name | Variable Type | Variable Value |
|---|---|---|
| Aquifer | Thin layer limestone | 1 |
| Aquifer | Thick layer limestone | 0 |
| Maximum water inrush | Q < 600 (small water inrush) | 1000 |
| Maximum water inrush | 600 ≤ Q < 1200 (medium water inrush) | 0100 |
| Maximum water inrush | 1200 ≤ Q < 3000 (large water inrush) | 0010 |
| Maximum water inrush | Q ≥ 3000 (super-large water inrush) | 0001 |

The SVM [43-45] is well suited to practical problems such as small samples, nonlinearity, high dimensionality and local minima. The SVM is based on optimization methods and statistical learning theory. It follows the principle of structural risk minimization (SRM) and can handle small samples, high dimensionality and nonlinearity. It is widely used for pattern recognition and function fitting [46].
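The variable definitions in Table 2 can be illustrated with a short sketch. The function names, the string labels "thin" and "thick", and the order of the feature vector are assumptions made for illustration; only the thresholds and code values come from Table 2.

```python
def encode_aquifer(aquifer_type: str) -> int:
    """Return 1 for thin-layer limestone and 0 for thick-layer limestone (Table 2)."""
    codes = {"thin": 1, "thick": 0}  # the string keys are illustrative labels
    return codes[aquifer_type]


def encode_inrush_grade(q: float) -> str:
    """Map the maximum water inrush Q to the four-digit grade code of Table 2."""
    if q < 0:
        raise ValueError("Q must be non-negative")
    if q < 600:
        return "1000"  # small water inrush
    if q < 1200:
        return "0100"  # medium water inrush
    if q < 3000:
        return "0010"  # large water inrush
    return "0001"      # super-large water inrush


def build_sample(aquifer_type, pressure, thickness, failure_depth, fault_fall, q_max):
    """Assemble one (feature vector, grade code) pair; the feature order is assumed."""
    features = [encode_aquifer(aquifer_type), pressure, thickness, failure_depth, fault_fall]
    return features, encode_inrush_grade(q_max)
```

Each grade code has exactly one digit set, so the four grades can be treated as separate class labels when the training set is assembled.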
The kernel parameter and penalty factor of the SVM have a great influence on prediction performance; however, the theory itself does not provide a best method for obtaining them. The main idea of the SVM is as follows. Given $l$ samples $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l) \in \mathbb{R}^n \times \mathbb{R}$, where $x_k$ is the sample input and $y_k$ is the sample output, the input vector is first mapped from the original space $\mathbb{R}^n$ to a high-dimensional feature space (Hilbert space) by the nonlinear mapping $\varphi(\cdot)$. The optimal decision function is then constructed in this high-dimensional feature space using the structural risk minimization principle, and the kernel function of the original space replaces the dot product in the high-dimensional feature space to avoid complex operations. The nonlinear function estimation problem is thus transformed into a linear function problem in the high-dimensional feature space [47].
The optimal decision function has the following form:

$$f(x) = \omega^T \varphi(x) + b$$

Therefore, the goal is to use the structural risk minimization principle to find the parameters $\omega$ and $b$ such that $|y - \omega^T \varphi(x) - b| \le \varepsilon$ for inputs $x$ outside the sample, which is equivalent to solving the following problem:

$$\min J = \frac{1}{2}\|\omega\|^2 + C \cdot R_{emp}$$

where $\|\omega\|^2$ controls the model complexity (the confidence interval term); $C > 0$ is the error penalty parameter, which represents the compromise between the smoothness of the function and the tolerance of errors larger than $\varepsilon$; and $R_{emp}$ is the empirical risk, namely the $\varepsilon$-insensitive loss function. Lagrange multipliers are introduced to construct the Lagrangian function, from which the dual of the original problem is obtained. Applying the optimality conditions to the Lagrangian reduces the optimization problem to solving a system of linear equations.
As previously mentioned, there is a complex and nonlinear mapping relationship between water inrush from a coal seam floor and its influencing factors [48-50]. The support vector machine, a machine learning tool based on statistical learning theory, can express this nonlinear relationship and can therefore be used for water inrush prediction. At the same time, the prediction of coal floor water inrush is essentially a typical two-category classification problem: no matter how many influencing factors there are, or how complex the relationship between the factors and the classification result, there are only two possible outcomes, namely water inrush and no water inrush. In theory, therefore, water inrush can be predicted with a suitable classifier. The SVM is specially designed for finite samples. It has a strict theoretical basis and can solve practical problems such as small samples, nonlinearity, high dimensionality and local minima.
Specifically, water inrush prediction with the support vector machine maps samples of the input space to a high-dimensional feature space through a nonlinear function. The prediction result for a sample is then obtained from the classifier, and the resulting model is expressed through the kernel function $K(x_i, x)$ evaluated between each training sample $x_i$ ($i = 1, \ldots, l$) and the input $x$. The model inputs are the actual field measurement data, and the output is the corresponding prediction result. The SVM prediction model is shown in Figure 4.
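The two ingredients named in this section, the ε-insensitive loss and a prediction built from kernel evaluations against the training samples, can be sketched as follows. This is a minimal illustration, assuming the standard kernel-expansion form f(x) = Σ a_i K(x_i, x) + b; the coefficient names `coeffs` and `b` and the function names are hypothetical and are not taken from the paper.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def kernel_expansion(x, samples, coeffs, b, kernel):
    """Evaluate f(x) = sum_i coeffs[i] * kernel(samples[i], x) + b.

    `coeffs` and `b` stand in for the values a trained SVM would supply
    (assumed form, for illustration only).
    """
    return sum(c * kernel(x_i, x) for x_i, c in zip(samples, coeffs)) + b


def dot_kernel(u, v):
    """Plain dot product, used here as the simplest possible kernel."""
    return sum(a * b for a, b in zip(u, v))
```

Errors smaller than ε do not contribute to the empirical risk, which is why the insensitivity coefficient directly controls how closely the model follows the training data.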

GA-SVM Prediction
According to the basic principle of the SVM model, the SVM model parameters mainly include the penalty parameter C, the insensitivity coefficient ε, and the kernel function with its corresponding parameters. The determination of these three parameters greatly affects the accuracy of the SVM model. The parameters of the SVM have a great influence on the efficiency of the algorithm and on its generalization and prediction ability, so their choice is an important part of building an SVM model. Genetic algorithms make it possible to select them automatically.

The principle of the genetic algorithm (GA) is derived from Darwin's theory of evolution and Mendel's theory of genetics [51,52]. A genetic algorithm expresses candidate solutions of the problem as "chromosomes" (coded strings). The algorithm starts with a set of "chromosome" strings. According to the principle of survival of the fittest, highly adapted "chromosomes" are selected for replication, and a new generation of better-adapted "chromosome" populations is generated through the genetic operations of crossover and mutation. The algorithm proceeds from generation to generation. Individuals with high fitness grow exponentially in later generations, and the chromosome with the highest fitness is finally obtained, that is, the optimal solution to the optimization problem.
Genetic algorithms imitate the evolutionary process of living organisms. As a modern heuristic algorithm [53] built on Darwin's principle of "survival of the fittest", a genetic algorithm explores the solution space by means of simulated genetic operations (crossover and mutation) and converges toward the optimum through selection operations. On the basis of the SPL method, the genetic algorithm is used to obtain the best or a satisfactory layout.
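The replication, crossover and mutation operations described above can be sketched on bit-string chromosomes. This is a minimal sketch, not the paper's implementation: the roulette-wheel replication rule, one-point crossover, bit-flip mutation, all parameter values and the toy fitness (the number of 1 bits) are assumptions chosen for illustration.

```python
import random


def select(population, fitness, rng):
    """Fitness-proportional (roulette-wheel) replication; fitness must be non-negative."""
    weights = [fitness(c) + 1e-9 for c in population]  # small offset avoids all-zero weights
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, rng):
    """One-point crossover: the two parents swap their tails."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chrom, p_m, rng):
    """Flip each bit independently with probability p_m."""
    return "".join(("1" if bit == "0" else "0") if rng.random() < p_m else bit for bit in chrom)


def evolve(fitness, length=16, pop_size=20, generations=40, p_c=0.8, p_m=0.02, seed=0):
    """Run the generational loop and return the best chromosome seen so far."""
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(length)) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            a, b = select(population, fitness, rng), select(population, fitness, rng)
            if rng.random() < p_c:  # pairs that are not crossed are copied directly
                a, b = crossover(a, b, rng)
            offspring += [mutate(a, p_m, rng), mutate(b, p_m, rng)]
        population = offspring[:pop_size]
        best = max(population + [best], key=fitness)
    return best
```

For example, `evolve(lambda c: c.count("1"))` searches for the 16-bit string with the most 1 bits; in the GA-SVM setting the fitness would instead score a candidate SVM parameter set.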
The basic principle of the genetic algorithm is as follows. Consider the global optimization problem (P): $\max\{F(x) : x \in \mathbb{R}^n\}$, $F : \Omega \subset \mathbb{R}^n \to \mathbb{R}^1$. A genetic algorithm for solving (P) with a constant population size $N$ is described as follows:
Step 1: Initialize the population $X(0) = \{X_1(0), X_2(0), \ldots, X_N(0)\}$ and set $k = 0$;
Step 2: Calculate the fitness value of each individual $X_i(k)$ ($i = 1, 2, \ldots, N$) of the current population;
Step 3: Assign each individual a replication probability according to its fitness;
Step 4: According to the assigned replication probabilities, produce a candidate population for the new generation through the genetic mechanisms of crossover and mutation;
Step 5: According to some selection rules, determine the new generation of the population $X(k + 1)$ from the candidate population;
Step 6: Test whether the current population contains a satisfactory solution or whether the preset evolutionary time limit has been reached. If so, stop; otherwise, let $k = k + 1$ and go to Step 2.
The GA has global optimization capability, while the SVM seeks the best compromise between model complexity and learning ability based on limited sample information in order to obtain the best generalization capability. Compared with the traditional neural network, SVM training can be transformed into a quadratic optimization problem. Theoretically, the solution obtained is the global optimum, which avoids the local extremum problem that is unavoidable in neural networks. The SVM topology is determined by the support vectors, which avoids the empirical trial and error required to choose a traditional neural network topology. The SVM can also approximate any function with arbitrary accuracy. This paper combines the two and proposes a genetic algorithm-based support vector machine (GA-SVM). The basic idea is as follows: before running the SVM algorithm, the GA is first used to optimize over a random point set, quickly determine the approximate range of the global optimal solution and calculate the initial weights of the SVM; the improved SVM algorithm is then used to train the network. The specific algorithm is as follows:
1. Initialize the population P, including the determination of the crossover scale, the crossover probability $P_c$, the mutation probability $P_m$ and the initialization of the connection weights. Real-number coding is used.
2. Calculate the evaluation function of each individual, sort the individuals, and select them according to their selection probability values.
3. Cross individuals $G_i$ and $G_{i+1}$ with probability $P_c$ to generate new individuals $G'_i$ and $G'_{i+1}$; individuals that are not crossed are copied directly.
4. Mutate individual $G_j$ with probability $P_m$ to generate the new individual $G'_j$.
5. Insert the new individuals into population P and calculate their evaluation functions.
6. If a satisfactory individual is found, the search ends; otherwise, the genetic operations are repeated. Once the required performance index is reached, the optimal individual in the final population is decoded to obtain the optimized network connection weight coefficients.
The connection weight coefficient optimized by GA is used as the initial weight; the Least Squares-Support Vector Machine (LS-SVM) algorithm is used to train the network, and the sum of square error is calculated until the specified precision is satisfied. The specific framework of GA-SVM model is shown in Figure 5.
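The GA stage of this procedure can be sketched with real-number coding. The sketch assumes that each chromosome encodes the three SVM parameters named in the abstract (penalty coefficient, insensitivity coefficient and kernel parameter); the search ranges, the rank-based selection rule, arithmetic crossover, elitism and `toy_fitness` are all illustrative assumptions. In practice the fitness would be computed by training the SVM with the candidate parameters and scoring its prediction error; a hypothetical stand-in is used here so that the example is self-contained.

```python
import random

# Illustrative search ranges for the three SVM parameters (assumed, not from the paper).
BOUNDS = {"C": (0.1, 100.0), "epsilon": (0.001, 1.0), "gamma": (0.01, 10.0)}
KEYS = list(BOUNDS)


def random_individual(rng):
    """Draw one real-coded chromosome [C, epsilon, gamma] uniformly within BOUNDS."""
    return [rng.uniform(*BOUNDS[k]) for k in KEYS]


def clip(ind):
    """Keep every gene inside its search range."""
    return [min(max(v, BOUNDS[k][0]), BOUNDS[k][1]) for v, k in zip(ind, KEYS)]


def ga_search(fitness, pop_size=30, generations=50, p_c=0.8, p_m=0.1, seed=0):
    """Rank the population, select by rank, cross with P_c, mutate with P_m."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    weights = list(range(pop_size, 0, -1))  # rank-based selection probabilities
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        children = [ranked[0]]  # elitism: the current best is copied unchanged
        while len(children) < pop_size:
            g_i, g_j = rng.choices(ranked, weights=weights, k=2)
            if rng.random() < p_c:  # arithmetic crossover of the two parents
                w = rng.random()
                child = [w * a + (1 - w) * b for a, b in zip(g_i, g_j)]
            else:                   # no crossover: copy the parent directly
                child = list(g_i)
            if rng.random() < p_m:  # mutation: resample one gene within its range
                j = rng.randrange(len(KEYS))
                child[j] = rng.uniform(*BOUNDS[KEYS[j]])
            children.append(clip(child))
        population = children
    return dict(zip(KEYS, max(population, key=fitness)))


def toy_fitness(ind):
    """Hypothetical stand-in for SVM validation performance, peaking at C=10, eps=0.1, gamma=1."""
    c, eps, gamma = ind
    return -((c - 10.0) / 100.0) ** 2 - (eps - 0.1) ** 2 - ((gamma - 1.0) / 10.0) ** 2
```

Calling `ga_search(toy_fitness)` returns a dictionary with keys `C`, `epsilon` and `gamma`. Because the best individual is carried over in every generation, the best fitness in the population never decreases.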



Testing Designs and Results
Based on the SVM, the key problems of the coal seam floor water inrush prediction model were the determination of the input mode, the selection of training samples and the selection of model structure parameters. GA was applied to SVM optimization in two ways: to optimize the structure of the SVM and to optimize the weights of the SVM kernel functions. The steps for predicting water inrush from the coal seam floor with the GA-optimized SVM are as follows:

Selection of Kernel Function and Parameters
The common kernel functions are the linear kernel, the polynomial kernel, the radial basis function (RBF) kernel and the sigmoid kernel. Statistical analysis of a large number of floor water inrush cases showed that the five main factors affecting water inrush are: hydraulic pressure, aquifer, thickness of the water-resisting strata (floor thickness), depth of the water flowing fractured zone in the floor, and fault fall (the fault depth is zero in the case of floor failure water inrush).
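For reference, the four kernels named above can be written down directly. The sketch below uses their textbook forms; the parameter names `gamma`, `degree` and `coef0` and their default values are assumptions, and the two five-element vectors are made-up normalized inputs, not rows of Table 1.

```python
import math

def linear_kernel(x, z):
    """k(x, z) = <x, z>"""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree"""
    return (gamma * linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """k(x, z) = tanh(gamma * <x, z> + coef0)"""
    return math.tanh(gamma * linear_kernel(x, z) + coef0)

# Two hypothetical samples with five normalized factors each
# (pressure, aquifer, floor thickness, fractured-zone depth, fault fall).
u = [0.6, 0.3, 0.5, 0.2, 0.0]
v = [0.4, 0.7, 0.5, 0.1, 0.3]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(u, v))
```

The RBF kernel depends only on the distance between samples, so it has a single width parameter to tune alongside the penalty parameter.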

Determination of SVM Structure
The SVM prediction model with optimized parameters was trained on the training samples to obtain the support vectors, and the structure of the SVM was thereby determined.
In this study, 18 typical records of water inrush from coal face floors were selected as samples from a large number of water inrush cases (Table 1). The first 15 samples were used as the training samples of the network, and the network was optimized and trained with the GA toolbox and the SVM toolbox of the MATLAB software (version: R2014a).
The significance of kernel function selection and parameter determination can be explained by the principle of structural risk minimization (SRM). The SRM principle seeks the minimum empirical risk in every subset, where a subset is a set of functions with a given VC dimension and each function in it has a different empirical risk owing to its parameters and form. SVM can find the function with the minimum empirical risk. Furthermore, for a given empirical error, the decision function obtained by the SVM is the simplest function that can achieve this error. In the SRM principle, "compromising between the empirical risk and the confidence range among subsets" means selecting, by adjusting the parameters, the SVM with the smallest actual risk among SVMs of different VC dimensions. The SRM principle applies to a fixed data subspace; the data distribution differs between data subspaces, and the empirical risk changes with the VC dimension. Optimizing the SVM kernel parameter (kp) and the penalty parameter (C) at the same time is therefore significant: in addition to optimizing C within a given data subspace, kp is optimized as well, so that the optimal SVM is obtained over both parameters. The initial values of some parameters obtained through random SVM generation, and the resulting parameter ranges, are shown in Table 4. The curve relating the classification precision to the penalty parameter C, obtained during training, is shown in Figure 6.
It can be seen from the above that the maxima of the test accuracy and the training accuracy did not coincide, which further indicates that the traditional principle of minimizing the empirical risk cannot guarantee good generalization. When kp reached a certain value, the number of support vectors started to increase again. At the same time, the corresponding training accuracy and test accuracy began to decrease, indicating that both the ability to discriminate the training samples and the generalization performance deteriorated. That is, as the kernel parameter increased gradually from zero, the learning and generalization ability of the optimal support vector machine went from low to high and back to low. Table 5 shows the statistics of the SVM classifier parameter comparison.


Prediction of the Test Data
After the network had been trained, the test data were input for simulation; the output was then de-normalized (anti-normalization), and the trained support vector predictor was used to predict the test samples. The predicted values of the last three samples in Table 1 were obtained, and the results are shown in Table 6. The prediction error (PE) is calculated from $x_a$, the actual water inrush value, and $x_p$, the predicted value. Table 6 shows that the SVM model has some predictive ability; however, the prediction error remained large: the average error over the three test samples was 10.76%, which is not accurate enough for practical water inrush prediction and control. With its parameters optimized by GA, the SVM model had an average prediction error of 3.01%, so the prediction accuracy was high. The error in predicting the inrush level was also very small, and the actual water inrush situation was well predicted. Therefore, the GA-optimized SVM model can be used to predict water inrush from the coal seam floor with high accuracy.
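The PE formula itself did not survive in this text. Because the errors are reported as percentages, a natural reading is the absolute relative error, PE = |x_a - x_p| / x_a × 100%; the sketch below assumes that form, along with min-max scaling for the anti-normalization step. Both are assumptions, the function names are hypothetical, and the numbers are made up for illustration and are not taken from Table 6.

```python
def denormalize(value, lo, hi):
    """Undo min-max scaling: map a value in [0, 1] back to [lo, hi]."""
    return lo + value * (hi - lo)

def prediction_error(actual, predicted):
    """Absolute relative error in percent (assumed form of PE)."""
    return abs(actual - predicted) / abs(actual) * 100.0

def mean_prediction_error(actuals, predictions):
    """Average PE over a set of test samples."""
    errors = [prediction_error(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)

# Illustrative values only: normalized model outputs mapped back to the
# original scale, then compared with hypothetical measured values.
outputs = [0.25, 0.5, 0.75]
predicted = [denormalize(v, 100.0, 500.0) for v in outputs]
actual = [210.0, 290.0, 410.0]
print(mean_prediction_error(actual, predicted))
```

Averaging the per-sample PE over the three test samples gives a single figure comparable to the reported average errors.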


Comparison and Analysis of Predictive Results
The GA-optimized SVM prediction model had a small error and good stability. The execution time of the program (0.0154 s) was shorter than that of the standard SVM algorithm (0.0358 s). This indicates that applying GA to the SVM network reduced network oscillation and significantly reduced the number of iterations. With GA, the support vector machine can also find the best compromise between model complexity and learning ability based on limited sample information, so as to obtain the best generalization ability. By combining GA with the improved SVM, water inrush from the coal seam floor could be predicted accurately and quickly. Figure 7 compares the predicted values of the SVM model with those of the GA-optimized SVM model. As Figure 7 shows, the SVM model can be used to predict water inrush cases; however, its average error over the three test samples was 10.76%, which is not accurate enough for practical water inrush prediction and control. With its parameters optimized during training, the GA-optimized SVM model achieved a higher prediction accuracy and a small error in the predicted inrush, and it reflected the actual water inrush situation well. Therefore, the GA-optimized SVM model can be used to predict water inrush from the coal seam floor with high accuracy.

Conclusions
There is a highly nonlinear relationship between water outburst from the coal seam floor and the geological structure, hydrogeology, aquifer, water pressure, water-resisting strata, mining damage, faults and other factors. Therefore, it is difficult to establish a suitable model with traditional methods to forecast the water inrush quantity from the mine floor. Modeling methods developed in other fields can provide adequate models for rock behavior during water inrush.
The prediction model of water inrush from a coal seam floor established in this paper can quickly and accurately predict coal seam floor water inrush under different environments when fitting nonlinear multivariable relationships. The prediction of water inrush from the coal seam floor had a high accuracy, and the prediction results were more reliable. Since the SVM has a self-learning function, it can continuously improve its prediction accuracy in application; thus, this method made a breakthrough in the prediction of coal mine water inrush.

To automatically determine the optimal or approximately optimal parameters in the parameter space, this paper exploited the powerful global optimization capability, implicit parallelism and high stability of the genetic algorithm to determine the penalty coefficient, insensitivity coefficient and kernel function parameter of the SVM model. These characteristics greatly improved the accuracy and usable range of the SVM model.
The coal seam floor water inrush prediction model established in this paper has significant advantages in fitting nonlinear multivariable relationships and was able to quickly and accurately predict coal floor water inrush in different environments. The prediction of water inrush from the coal seam floor had high accuracy, and the prediction results were more reliable. Since the support vector machine has a self-learning function, it can continuously improve its prediction accuracy in application. Therefore, this method has broad application prospects in coal mine water inrush prediction.