Predicting the Water-Conducting Fracture Zone (WCFZ) Height Using an MPGA-SVR Approach

: Mine water that inrushes from coal-roof strata has always posed a substantial threat to mining activities every year. Therefore, an accurate prediction of the water-conducting fracture zone (WCFZ) height in the mining overburden strata is of great signiﬁcance for the prevention and control of mine water accidents. The support vector regression (SVR) is proposed to predict the height of the WCFZ based on the mining depth, hard rock proportional coe ﬃ cient, mining thickness and length of the working face. Simultaneously, the multi-population genetic algorithm (MPGA) is employed to search for the optimal SVR parameters. The MPGA-SVR model is trained and tested with a total of 69 collected data samples, and it is also applied to a ﬁeld test. The accuracy and stability of the model were measured by the mean squared error and correlation coe ﬃ cients. The obtained results show that the MPGA-SVR model achieves a higher accuracy and stability than the traditional empirical formula and genetic algorithm (GA)-SVR model. In terms of the process for optimizing the SVR parameters, the MPGA can ﬁnd the optimal parameters more quickly and accurately, and it can e ﬀ ectively overcome the problem of premature and slow convergence of the genetic algorithm (GA). The proposed model improves the prediction accuracy and stability, which will help to avoid accidents caused by the inrush of water inrush in mining overburden strata and protect the ecological environment of the mining area.


Introduction
As an important fossil energy source, coal has always played a dominant role in China's primary energy consumption structure [1,2]. Although the safety situation of coal mines has been improved in recent years, mine water has always been a substantial threat to mining safety [3]. In the process of mining activities, the equilibrium state of the original rock stress in the overburden strata is destroyed, which leads to collapse, fracture and bending in the mining overburden strata. Once these fractures are interconnected with a water-bearing body (surface water, goaf water or aquifer water) in the mining overburden strata, the mine water will flow into the working face and bring huge economic losses and casualties, as illustrated in Figure 1 [4]. With the increase of mining depths, mining intensity and mining speed, the problem of mine water is becoming more prominent [5]. Therefore, accurately predicting the height of the water-conducting fracture zone (WCFZ) in the mining overburden strata is of great significance for the safe production of coal mining [6][7][8]. In the process of coal mining, the development of a WCFZ in mining overburden strata is an extremely complicated mechanical problem, which is characterized by the fuzzy randomness of the rock stratum structure, the complexity of the mining influence stress change and the nonlinear deformation and failure of the mining overburden. To predict the height of the WCFZ, scholars have proposed many methods, such as the field measurement method [9,10], theoretical calculations, numerical simulations [11][12][13], empirical formulas [14] and intelligence algorithms [15]. Among them, the field measurement method has the highest accuracy, but it is time consuming, laborious and costly [16]. The theoretical calculation method is too idealistic and has a large deviation from the actual complex geological conditions. The accuracy of the numerical simulation method is closely related to the geological condition parameters of the model, and the accurate acquisition of these parameters is difficult. The empirical formula considers a single influencing factor that is insufficient to reflect the combined effects of multiple influencing factors [3,17]. Increasingly more methods have been proposed with the deepening of research, and the prediction results of each method often have large differences since each method has a different level of adaptability and constraints.
In recent years, with the development and promotion of artificial intelligence technology, some machine learning methods, such as artificial neural networks (ANNs), decision trees (DTs) and support vector regression (SVR), have been developed. These methods have the advantages of comprehensive consideration, simple operation, low cost and good prediction results, therefore they are introduced to predict the height of the WCFZ. Ma et al. (2008) [18] established a three-layer Back Propagation (BP) neural network for predicting the height of the water-conducting fracture zone, and the prediction results are more reasonable and accurate than those of the empirical formula. Q. Wu et al. (2017) [19] presented a radial basis function neural network (RBFNN) model to predict the height of the WCFZ in a fully mechanized longwall mining operation with sublevel caving. However, the ANN is not stable enough to make a prediction with a small data sample, and the final prediction result easily falls into a local optimal solution since the parameters are often solved by a gradient technique. Zhang et al. (2017) [10] proposed a random forest regression model that is applied to the Hongliu Coal Mine in Northwest China with a high prediction accuracy. However, this model takes much time and has a high calculation cost, and the model is prone to over-fitting when the sample set is noisy.
The SVR is a machine learning method based on statistical learning theory and structural risk minimization criteria, which can be applied to small data sample for learning and predicting [20]. The SVR has the advantages of high accuracy, fast convergence speed and strong generalization ability, and it is widely used for predictions [21][22][23]. On the other hand, the setting of the SVR model parameters directly affects the performance of the model, and the problem of setting the model parameters is still not well solved [17]. Sun et al. (2009) [24] and Roushangar et al. (2015) [25] proposed a hybrid calculation system in which genetic algorithm (GA) was adopted to search for the SVR Figure 1. The development of a water-conducting fracture zone (WCFZ) in mining overburden strata after mining.
In the process of coal mining, the development of a WCFZ in mining overburden strata is an extremely complicated mechanical problem, which is characterized by the fuzzy randomness of the rock stratum structure, the complexity of the mining influence stress change and the nonlinear deformation and failure of the mining overburden. To predict the height of the WCFZ, scholars have proposed many methods, such as the field measurement method [9,10], theoretical calculations, numerical simulations [11][12][13], empirical formulas [14] and intelligence algorithms [15]. Among them, the field measurement method has the highest accuracy, but it is time consuming, laborious and costly [16]. The theoretical calculation method is too idealistic and has a large deviation from the actual complex geological conditions. The accuracy of the numerical simulation method is closely related to the geological condition parameters of the model, and the accurate acquisition of these parameters is difficult. The empirical formula considers a single influencing factor that is insufficient to reflect the combined effects of multiple influencing factors [3,17]. Increasingly more methods have been proposed with the deepening of research, and the prediction results of each method often have large differences since each method has a different level of adaptability and constraints.
In recent years, with the development and promotion of artificial intelligence technology, some machine learning methods, such as artificial neural networks (ANNs), decision trees (DTs) and support vector regression (SVR), have been developed. These methods have the advantages of comprehensive consideration, simple operation, low cost and good prediction results, therefore they are introduced to predict the height of the WCFZ. Ma et al. (2008) [18] established a three-layer Back Propagation (BP) neural network for predicting the height of the water-conducting fracture zone, and the prediction results are more reasonable and accurate than those of the empirical formula. Q. Wu et al. (2017) [19] presented a radial basis function neural network (RBFNN) model to predict the height of the WCFZ in a fully mechanized longwall mining operation with sublevel caving. However, the ANN is not stable enough to make a prediction with a small data sample, and the final prediction result easily falls into a local optimal solution since the parameters are often solved by a gradient technique. Zhang et al. (2017) [10] proposed a random forest regression model that is applied to the Hongliu Coal Mine in Northwest China with a high prediction accuracy. However, this model takes much time and has a high calculation cost, and the model is prone to over-fitting when the sample set is noisy.
The SVR is a machine learning method based on statistical learning theory and structural risk minimization criteria, which can be applied to small data sample for learning and predicting [20]. The SVR has the advantages of high accuracy, fast convergence speed and strong generalization ability, and it is widely used for predictions [21][22][23]. On the other hand, the setting of the SVR model parameters directly affects the performance of the model, and the problem of setting the model parameters is still not well solved [17]. Sun et al. (2009) [24] and Roushangar et al. (2015) [25] proposed a hybrid calculation system in which genetic algorithm (GA) was adopted to search for the SVR parameters, Sustainability 2020, 12, 1809 3 of 15 and the results were encouraging. As a method to search for the optimal solution by simulating the natural evolution process, GA was first proposed by Holland (1975) [26]. Although the GA has inherent implicit parallelism and better global optimization ability, many shortcomings have also been exposed with the wide application of GA and the deepening of research. In the GA evolution process, the choice of the crossover and mutation probability often determines the global search performance of the algorithm and the balance with the local search ability. In the actual application process, the value of the crossover and mutation probability are often fixed. There is a risk of premature convergence. The individuals in the group prematurely move towards a unified state and gradually stop evolving, therefore the result is a local optimum rather than a global optimum. On the other hand, the GA also has the disadvantage of slow convergence; that is, it fluctuates as it approaches the optimal solution but does not converge quickly. To deal with this problem, the concept of information theory has been introduced to preventing from premature convergence. Information-guided mutation was performed on multiple variables, and selection was made based on the obtained information entropy [27]. In addition, there are also optimization designs based on single genetic algorithm to reduce the probability of premature convergence. The main reason for the premature and slow convergence of the GA is that the population loses diversity before the optimal solution (or satisfactory solution) is obtained during the population evolution. To make full use of the global evolutionary characteristics of the GA and avoid its shortcomings, the multi-population genetic algorithm (MPGA) is first adopted to establish an MPGA-SVR model for predicting the height of the WCFZ. For the proposed model, this paper analyses its prediction performance in terms of accuracy and stability. The empirical formula and GA-SVR were also adopted to predict the height for comparison.

Support Vector Regression (SVR)
The SVR is an application model of a support vector machine (SVM), which was proposed by Vapnik (2000) [28]. The SVR model transforms complex low-dimensional non-linear regression problems into linear regression problems in high-dimensional feature space by applying a mapping function, Φ(x). The regression function is defined as follows [29]: where ω is the weight, is the threshold, Φ(x) is the inner product. The Equation (1) can be transformed into the following functional minimum problem: where x i and y i is respectively the input and output values of the training samples, ε is the insensitive loss function parameter, ξ and ξ * are the two sets of non-negative relaxation variables, C is the penalty factor. Equation (2) can be transformed into a dual problem by the Lagrangian multiplier method: Sustainability 2020, 12, 1809

of 15
By solving Equation (3), the SVR regression function can be obtained as shown in Equation (4): where the K(x i , x j ) is the kernel function of SVR. As different kernel functions have different kernel function parameters, and the most commonly used Gaussian Radial Basis (RBF) kernel function [30] as shown in Equation (5) is selected in this paper: where σ is the only adjustable kernel width parameter in the RBF function. If g = 1/2σ 2 , then there are three parameters (C, ε, g) in the SVR regression function that need to be determined. The insensitive loss coefficient ε controls the width of the insensitive region of the regression function to the sample data and affects the number of support vectors. If the value of ε is too large, the number of support vectors will be small, which may cause the model to be too simple and the learning accuracy is not enough. If the value of ε is too small, the regression accuracy is high, but it may cause the model to be too complicated and the generalization ability is poor. The penalty coefficient C reflects the penalty degree of the algorithm on sample data beyond ε, and its value affects the complexity and stability of the model. If the value of C is too small, the penalty for the sample data exceeding ε is small, and the training error becomes large. If the value of C is too large, the learning accuracy is high, but the generalization ability of the model is poor. The kernel width parameter σ reflects the degree of correlation between support vectors. If σ is too small, the relationship between support vectors is loose, the learning machine is relatively complicated and the promotion ability cannot be guaranteed. If σ is too large, the influence between support vectors is too strong, and it is difficult for the regression model to achieve sufficient accuracy.
From the above analysis, the complexity and generalization ability of the SVR model depends on these three parameters. It is unreasonable and time-consuming to optimize and select each parameter individually in the parameter selection. These three parameters should be considered simultaneously. Therefore, it is very important to find an accurate, stable and fast parameter selection method. It is difficult to obtain satisfactory results by directly using the default parameters or simply using the Cross Validation (CV) method provided by the LibSVM toolbox in Matlab to optimize the parameters in the SVR model. In this paper, the MPGA is employed to optimize the parameters in the SVR model.

The Multi-Population Genetic Algorithm (MPGA)
The MPGA is an improvement based on the GA, which can well solve the premature convergence and slow convergence of the GA. In the process of GA evolution, the selection of crossover probability (Pc) and mutation probability (Pm) often determines the global search of the algorithm and the balance of local search ability. The crossover operator is the main operator for generating new individuals, which determines the ability of genetic algorithm to search globally. The mutation operator is only the auxiliary operator that generates the new individual, which determines the local search ability of the genetic algorithm. Many scholars [31][32][33] recommend choosing a larger Pc (0.7-0.9) and a smaller Pm (0.001-0.05). However, there are still many values for Pc and Pm. For different choices, the optimization results are quite different. The MPGA compensates for this shortcoming of GA by co-evolution of several populations with different control parameters, taking into account both global search and local search of the algorithm.
In the evolution of MPGA, the various independent populations are connected by immigration operators. The immigration operator introduces the optimal individuals that appear in the evolution process of various populations periodically (this article sets every other generation) into other populations, and realizes the information exchange between the populations [34]. The specific operational rule is to replace the worst individual in the target population with the best individual in the source population. After each generation of evolution is completed, the best individuals in each population are selected by the artificial selection operator and placed in the elite population for preservation. The elite population does not perform genetic operations such as crossover and mutation to ensure that the optimal individuals produced by various groups in the evolution process are not destroyed or lost [35]. Based on GA, the MPGA co-evolve by introducing multiple populations with different control parameters to shorten the generation number needed to find nearly optimal solutions. MPGA uses multiple sets of different genetic parameters to search at the same time, so it has low dependence on genetic parameters and has strong applicability. Remarkably, no matter single population or multi-population cases, the key to solve the problem of premature convergence is to formulate the implementation rules of selection, crossover and mutation operations. The goal of MPGA in this paper is searching for the optimal parameter of SVR for predicting.

MPGA-SVR Model
Good accuracy and generalizability of the SVR model depend on the proper selection of optimal parameters. This paper uses the MPGA to optimize the parameters (the penalty factor C and the kernel function parameter g) in the SVR model, and the insensitive loss function parameter takes the default value (ε = 1) provided in the LibSVM toolbox.
The flowchart shows the procedures followed during the parameter optimization of the adopted MPGA, as shown in Figure 2. (f) In this paper, the p-th population's crossover probability Pc (p) is set between 0.5 and 0.9, and the p-th population's mutation probability P m (p) is set between 0.001 and 0.05. The parameters are defined as Equation (6): where δ is an number randomly generated between 0 and 1, p∈ [1,MP].
(2) Binary coding and generate the initial populations.
(3) Calculate the fitness function as shown in Equation (7): where F(p, q) is the fitness value of the q-th individual in p-th population,H f (p, q, j) is the predicted height of the j-th input sample in the q-th individual in the p-th population.
(4) Selection operation. According to the fitness value of each individual and the generation gap GGAP, some excellent individuals are selected from the previous generation and inherited to the next generation.
(5) Crossover operation. A new generation of individuals is produced by crossover operation. The individuals within each population are selected randomly to exchange part of their chromosomes according to the P c (p).
(6) Mutation operation. For each individual in each population, the gene value at one or some loci is changed to other alleles according to the P m (p).
(7) Immigration operation. Different populations are relatively independent, which are linked by immigration operators. The specific operation rule is to replace the worst individual (the maximum value of F(p, q) in the target population with the best individual (the minimum value of F(p, q)) in the source population.
(8) Artificial selection operation. After the end of each generation, the optimal individuals of each population are selected by artificial operators to store them in elite populations.
(9) Select the best individual from the elite population and gen= gen +1.
(10) If gen =MAXGEN, the process of the evolution will stop, and the result of optimal parameters is acquired. Otherwise, go to (3).
(11) The obtained parameters are used to predict the height of WCFZ. Genetic code and generate the initial populations (gen=0)

Elitist population
The optimal Parameters C , g

Engineering background
The No. 8101 working face of the No. 2 coal seam, located in the center of Selian Coal Mine is in the Ordos, Inner Mongolia, China. The ground level corresponding to the mining area is +1382-+1441 m, and the average mining depth of the working face is 170 m. Fully-mechanized mining method is adopted with an average coal thickness of 4.0 m, and the length of working face is 280 m. The strata directly overlying the coal seam mainly consist of the siltstone and sandy mudstone in the Jurassic formation, which were considered as aquitards. Mechanical properties of the 8101 working face's roof under the gully were tested by the lithology, and the lithology is defined as medium hard.
The surface corresponding to the 8108 working face is undulating and has a gully development with a depth of about 38 m. A short flood will be formed during the rainy season. If the mining crevice penetrates the gully, it will cause surface water to flood into the well, causing the mine to burst. Therefore, it is necessary to study the height of the WCFZ during the mining of the 8101 working face through the gully.

Model Sample Data
The selection of the sample data is of great significance for the predicting results. There are many factors affecting the height of WCFZ in mining overburden. Based on the previous studies [17,19,36], four major influencing factors affecting the height of WCFZ (H f ) have been selected: mining depth (H), hard rock proportional coefficient (c), mining thickness (d), length of working face (L). Among them, the hard rock proportional coefficient (c) is the ratio between the accumulated thickness of hard rock strata within the range of the estimated height of WCFZ ( h) and the estimated height of WCFZ in mining overburden (H p ). The calculation formulas of c and H p are shown in Equation (8) and Equation (9) respectively.
where the H p can be valued 15-20 times of mining thickness (d) according to local conditions. On the other hand, the sample data's gradient distribution for each attribute is directly related to the prediction performance of the model. The wider the gradient distribution, the stronger the representativeness of the established model and the stronger the applicability of the prediction. Based on the above considerations and the characteristics of SVR for processing small sample data, 69 sets of sample data [18,24] were searched to verify the effect of the MPGA-SVR model, as shown in Table 1.
The dimension and magnitude of each attribute are different, therefore it is necessary to normalize the sample data, as shown in Equation (10): where X ij is the j-th input sample value in the i-th attribute after normalization, x ij is the j-th input sample value in the i-th attribute. Y j is the j-th output sample value after normalization, y j is the j-th output sample value.

Result and Discussion
To test the performance of the MPGA-SVR model, 48 sets of sample data (70%) are selected randomly as training samples, and the other 21 sets of sample data (30%) is selected as test samples. During the development of the SVR model, the MPGA was employed to search for the SVR parameter. The prediction performance of the model was evaluated from two aspects of accuracy and stability. Finally, the total of 69 sets of data were employed to predicting the WCFZ's height of 8101 working face. All computational works in this model were implemented in MATLAB 2017b programming environment based on the CPU of Xeon(R) E3-1230 3.30GHz processor.

The Parametric Optimization Process of the SVR Model
After 100 generations of evolution, the target parameters were obtained based on the randomly selected 48 sets of training sample data. The parametric optimization process of GA and MPGA are shown in Figure 3.  In order to reflect the performance advantages of MPGA, both of the GA and MPGA were adopted to search for the parameters respectively. Each algorithm's performance was assessed by the mean squared error (MSE) between the standardized predicted results ( ' sf H ) and actual results (H sf ), and it was expressed as Equation (11): In the process of parameters optimization, the GA tends to converge after 32 generations of evolution, and the value of MSE is reduced from 0.018 to 0.007. The MPGA converges after only seven generations of evolution, and the value of MSE is reduced from 0.014 to 0.005. On the other hand, the average of the MSE in GA has a larger range of change than MPGA. Therefore, the MPGA has higher search accuracy, and significantly improves the slow convergence of GA. Finally, the optimal fitness parameter values selected are shown in Table 2. To further verify the stability of the algorithms during the parameters search process, five times of repeated searches were performed, as shown in Figure 4. Comparing with the result we can see that the GA has a large MSE of 0.014 in the parameter optimization process of the second time, which is caused by the premature convergence. For the result of the MPGA, the MSE error results of five iterations were basically consistent with an average value of 0.0055. In order to reflect the performance advantages of MPGA, both of the GA and MPGA were adopted to search for the parameters respectively. Each algorithm's performance was assessed by the mean squared error (MSE) between the standardized predicted results (H s f ) and actual results (H sf ), and it was expressed as Equation (11): In the process of parameters optimization, the GA tends to converge after 32 generations of evolution, and the value of MSE is reduced from 0.018 to 0.007. The MPGA converges after only seven generations of evolution, and the value of MSE is reduced from 0.014 to 0.005. On the other hand, the average of the MSE in GA has a larger range of change than MPGA. Therefore, the MPGA has higher search accuracy, and significantly improves the slow convergence of GA. Finally, the optimal fitness parameter values selected are shown in Table 2. To further verify the stability of the algorithms during the parameters search process, five times of repeated searches were performed, as shown in Figure 4. Comparing with the result we can see that the GA has a large MSE of 0.014 in the parameter optimization process of the second time, which is caused by the premature convergence. For the result of the MPGA, the MSE error results of five iterations were basically consistent with an average value of 0.0055. Therefore, MPGA has higher accuracy and stability in the process of parameter optimization for SVR.
To further verify the stability of the algorithms during the parameters search process, five times of repeated searches were performed, as shown in Figure 4. Comparing with the result we can see that the GA has a large MSE of 0.014 in the parameter optimization process of the second time, which is caused by the premature convergence. For the result of the MPGA, the MSE error results of five iterations were basically consistent with an average value of 0.0055.

The Test of the MPGA-SVR Model
In order to accurately and objectively map the performance of MPGA-SVR model, the correlation coefficient (r) between the predicted results (H f ) and experimental results (H f ) were used: where  Figures 5 and 6. Based on the Equations (11) and (12), the r and MSE of the two models were calculated as Table 3.  Therefore, MPGA has higher accuracy and stability in the process of parameter optimization for SVR.

The Test of the MPGA-SVR Model
In order to accurately and objectively map the performance of MPGA-SVR model, the correlation coefficient (r) between the predicted results (

Sample data Prediction Model MSE (m 2 ) Correlation Coefficient (r)
Training sample GA-SVR 11.13 0.96 MPGA-SVR 7.29 0.97 Comparing with the results we can see that the MPGA-SVR model has the lower MSE and higher r for both of the training sample and test sample. Therefore, the MPGA-SVR model has a better performance on accuracy compared with the GA-SVM.
In addition to accuracy, stability is also an important index to test the predictive performance of the model. In this paper, 10 repeated trainings and predictions were performed through the two models respectively for the same grouped data, and the prediction results were shown in Figure 7. As it can be seen from Figure 7, the GA-SVR model has a poor prediction effect in the third time with a value of 0.6. However, the correlation coefficient of the MPGA-SVR model is higher than 0.95 every time.
Therefore, the MPGA-SVR has a better prediction performance from the results of training sample and test sample. In addition to accuracy, stability is also an important index to test the predictive performance of the model. In this paper, 10 repeated trainings and predictions were performed through the two models respectively for the same grouped data, and the prediction results were shown in Figure 7. As it can be seen from Figure 7, the GA-SVR model has a poor prediction effect in the third time with a value of 0.6. However, the correlation coefficient of the MPGA-SVR model is higher than 0.95 every time.
Therefore, the MPGA-SVR has a better prediction performance from the results of training sample and test sample.

Application of the MPGA-SVR Model
In order to further verify the validity of the MPGA-SVR model for the prediction of watercrushing zone and its engineering application value, the prediction of the WCFZ's height for the 8101 working face is carried out.
In this section, 69 sets of data were used as training samples to predict the height of the WCFZ during the mining of the 8101 working face through the gully. For comparison with traditional methods, the empirical formula method, GA-SVR and MPGA-SVR model were applied, respectively.
The traditional empirical formula was proposed by Liu Tianquan in the early 1980s based on

Application of the MPGA-SVR Model
In order to further verify the validity of the MPGA-SVR model for the prediction of water-crushing zone and its engineering application value, the prediction of the WCFZ's height for the 8101 working face is carried out. In order to compare the accuracy and stability of the GA-SVR and MPGA-SVR models at the same time, the prediction results were obtained by repeated calculation 15 times, the prediction height of the WCFZ is shown in Figure 8. After the actual measurement on site, the height of the WCFZ is approximately 50 m. Aiming to show the error of different prediction results, the absolute relative error φ between the prediction result and measured result was adopted, which is defined as Equation (14). The result was shown in Figure 9. After the actual measurement on site, the height of the WCFZ is approximately 50 m. Aiming to show the error of different prediction results, the absolute relative error ϕ between the prediction result and measured result was adopted, which is defined as Equation (14). The result was shown in Figure 9.
Sustainability 2019, 11, x FOR PEER REVIEW 13 of 15 Figure 9. The absolute relative error of the prediction results for different methods.
From the Figure 8 and Figure 9, we can see that the empirical formula's prediction height of the WCFZ is from 34.4 m to 45.6 m, and the absolute relative error is from 0.088 to 0.312. As the empirical formula method considers a single influencing factor only, so the error is the largest. Although the average prediction height of the WCFZ by the GA-SVR and MPGA-SVR models are both about 47.5 m, but the GA-SVR model's prediction result varies greatly each time with a biggest error of 0.11. That is because the GA has premature convergence when the individuals in the population tend to be in the same state prematurely and stop evolution. It is obvious that a consistent result can always be obtained for each calculation by the MPGA-SVR model with an absolute relative error of 0.005, while the result obtained by the GA has greater uncertainty. Based on the comprehensive consideration of various factors that affect the development of the water-conducting fracture zone, the MPGA was used to select more appropriate model parameters for the SVR prediction model. Hence, the MPGA-SVR model exhibits a better accuracy and stability performance comparing with the other methods in this case. In addition, the model will be continuously enriched and updated in practice to make it more accurate and generalizable.

Conclusions
In this paper, the MPGA-SVR model is proposed as a novel approach to predict the height of the WCFZ. The prediction accuracy and stability of the WCFZ's height have been greatly improved by the MPGA-SVR model with the parameters of mining depth, hard rock proportional coefficient, mining thickness, length of working face.
In the process of parameter optimization, the MPGA properly resolves the premature convergence and slow convergence of GA by co-evolving multiple populations with different genetic parameters, and the local and global search ability of the model is further improved. Finally, a more suitable parameter is obtained to predict the height of WCFZ prediction, and the prediction results have a high correlation coefficient, low mean square errors and good stability, which is very close to the experimental result. Comparing with the traditional method, since the model of MPGA-SVR considers many influencing factors at the same time, unlike the traditional empirical formula, only the unilateral factors are considered. Therefore, the model proposed in this paper provides a more reasonable solution to obtain the height of the WCFZ, which is of great significance for the safe and From the Figures 8 and 9, we can see that the empirical formula's prediction height of the WCFZ is from 34.4 m to 45.6 m, and the absolute relative error is from 0.088 to 0.312. As the empirical formula method considers a single influencing factor only, so the error is the largest. Although the average prediction height of the WCFZ by the GA-SVR and MPGA-SVR models are both about 47.5 m, but the GA-SVR model's prediction result varies greatly each time with a biggest error of 0.11. That is because the GA has premature convergence when the individuals in the population tend to be in the same state prematurely and stop evolution. It is obvious that a consistent result can always be obtained for each calculation by the MPGA-SVR model with an absolute relative error of 0.005, while the result obtained by the GA has greater uncertainty. Based on the comprehensive consideration of various factors that affect the development of the water-conducting fracture zone, the MPGA was used to select more appropriate model parameters for the SVR prediction model. Hence, the MPGA-SVR model exhibits a better accuracy and stability performance comparing with the other methods in this case. In addition, the model will be continuously enriched and updated in practice to make it more accurate and generalizable.

Conclusions
In this paper, the MPGA-SVR model is proposed as a novel approach to predict the height of the WCFZ. The prediction accuracy and stability of the WCFZ's height have been greatly improved by the MPGA-SVR model with the parameters of mining depth, hard rock proportional coefficient, mining thickness, length of working face.
In the process of parameter optimization, the MPGA properly resolves the premature convergence and slow convergence of GA by co-evolving multiple populations with different genetic parameters, and the local and global search ability of the model is further improved. Finally, a more suitable parameter is obtained to predict the height of WCFZ prediction, and the prediction results have a high correlation coefficient, low mean square errors and good stability, which is very close to the experimental result. Comparing with the traditional method, since the model of MPGA-SVR considers many influencing factors at the same time, unlike the traditional empirical formula, only the unilateral factors are considered. Therefore, the model proposed in this paper provides a more reasonable solution to obtain the height of the WCFZ, which is of great significance for the safe and efficient mining of coal mines. Based on the algorithm proposed in this paper, we can further study and design different migration mechanisms and several sub-population hierarchical execution architectures of GA in multi-population mechanisms to obtain more accurate results.

Conflicts of Interest:
The authors declare no conflict of interest.