Markov Chain Simulation of Coal Ash Melting Point and Stochastic Optimization of Operation Temperature for Entrained Flow Coal Gasiﬁcation

: In the entrained ﬂow coal gasiﬁcation process, the gas production is critically a ﬀ ected by the operating temperature (OT) and coal ash melting point (AMP), and the AMP is one of key factors for the determinations of OT. Considering the fact that coal is a typical nonhomogeneous substance and the coal ash composition varies from batch to batch, this paper proposes the application of the Markov Chain (MC) method in simulation of the random AMP series and the stochastic optimization of OT based on MC simulation for entrained ﬂow coal gasiﬁcation. The purpose of this paper is to provide a more accurate optimal OT decision method for entrained ﬂow coal gasiﬁcation practice. In this paper, the AMP was regarded as a random variable, and the random process method, Markov Chain, was used to describe the random AMP series of feed coal. Firstly, the MC simulation model about AMP was founded according to an actual sample data, 200 sets of AMP data from an industrial gasiﬁcation plant under three simulation schemes (the sample data were individually divided into 16, eight and four state groups,). The comparisons between the simulation results and the actual values show that the founded MC simulation model descries the AMP series very well. Then, a stochastic programming model based on MC simulation for OT optimization was developed. Finally, this stochastic programming optimization model was optimized by genetic algorithm (GA). Comparing with the conventional OT optimization method, the proposed stochastic OT optimization model integrated MC simulation can ascertain a more accurate OT for guiding the coal gasiﬁcation practice.


Introduction
Coal is the most wildly used natural energy resource with the largest reserves. Compared with other energy sources, the utilization of coal has attracted wider attention in view of the advantages of wide distribution and easy exploitation of coal. In recent years, the increasing environmental and health concerns in utilization of coal promote the emergence of clean coal technologies [1][2][3][4]. Among the clean coal technologies, coal gasification is regard as a promising utilization of coal because of its low technology cost and high conversion rate [5]. In the coal gasification practice, the basic equipment can be grouped into three main categories: fixed bed gasifiers, fluidized bed gasifiers and entrained flow gasifiers [6,7]. Differ from fixed bed and fluidized bed gasification, entrained flow gasification operates at a higher temperature (above the melting temperature of ash) with smaller particles and produces high quality syngas with low methane and free tar content. In addition, this type of coal gasification has some advantages such as high carbon conversion, large production capacity, high thermal efficiency and a low environmental pollution [8,9]. On the contrary, there are also some disadvantages to be improved in entrained flow gasification such as slagging and fouling.
Because the ash discharging mode for entrained-flow bed gasifiers is liquid slagging, the melting characteristics of coal ash is a key factor influencing the gasifier operation [10]. Thus, the ash melting point (AMP) of the feed coal is an important reference index for the determination of operational temperature (OT), which is a critical parameter for the stable and efficient operation of the gasifier. When the temperature in a gasifier is lower than the AMP of the feed coal, the gasification reaction as well as coal conversion rate will be greatly reduced [11]. On the contrary, when the gasification temperature is far higher than the AMP, the ash in the coal may agglomerate, resulting in a reduction of the reaction contact between the gasifying agent and coal and an uneven air flow in the gasifier, which is not conducive to good gasification reactions [12]. Thus, the OT of the gasifier should usually be moderately higher than the AMP of the feed coal so as to facilitate the melting and flow of coal ash and maintain the gasification reaction [13]. Therefore, the AMP of feed coal is always paid special attention to in practice due to the important role it plays in the determination of the optimum OT for the gasification process.
Over the past years, in order to study the melting behavior of coal ash in gasifiers, many studies about the AMP of coal have been reported. Current studies on AMP of coal can be roughly divided into two categories: (i) the mechanism models that are based on the mechanism of chemical reactions; (ii) the empirical models that are based on data statistics. A large number of researchers have developed mechanism models in their studies on AMP of coal, for example, Chakravarty et al. [14], Dai et al. [15], Weber et al. [16], Kim et al. [17]. However, the mechanism models are very theoretical, with many assumptions, so the reaction process expressed by these formulas may deviate from the actual reaction process due to these assumptions. As for the empirical models for coal AMP, studies on this subject have developed rapidly and a great deal of literature on this subject has been published in recent years. Some statistic method such as linear and non-linear regression methods, as well as some advanced intelligent algorithms [18] such as artificial neural network(ANN), support vector machine (SVM), etc. were employed in these models. For instance, Ozbayoglu et al. [19] applied linear regression and non-linear regression, respectively, to the chemical composition of coal ash and the temperature of AMP for Turkish lignite. In [10,[20][21][22], the authors took a large number of parameters of coal ash as training samples and used ANN to predict the AMP of different coals. In [23][24][25], the authors all took ash composition as input and AMP as output to build prediction models of AMP using SVM. Compared with the mechanism model, the advantage of the empirical models is that they avoid complex mechanism process analysis, and they can just extract information from historical data to predict the AMP of coal. Both the mechanism models and the empirical models have contributed to the study of ash problems (such as slagging and fouling), however, these models are static models, that is, the foundations for modelling are based on a point in time rather than a time series perspective. More importantly, these studies focus on the prediction of AMP based on the characteristics of coal rather than OT optimization based on dynamic AMP.
As one of the critical factors influencing the stable and efficient operation of a gasifier as well as the quality of gasification product, OT has always been one of the key parameters discussed in many entrained flow coal gasification studies, and its optimization is usually an important topic in a large number of published studies. For instance, Chen et al. [26] carried out some orthogonal designs to optimize some operating parameters of an entrained flow gasifier for coal-biomass co-gasification. As an important operating parameter, OT was optimized according to Taguchi's philosophy. Vejahati et al. [27] optimized three operating variables, including OT, for a coal-coke gasification in an entrained flow gasifier using response surface methodology (RSM). In addition to the experiment-based studies, many researchers also discussed the temperature optimization in their simulation-based researches. Emun et al. [28] used Aspen Plus to study the effect of different operating variables on thermal efficiency. As a result, the optimal OT was 1550 • C with the highest thermal efficiency. Biagini et al. [29] developed an entrained-flow gasifier model in Aspen Plus, and obtained the optimal operating conditions including temperature through parametric analysis. Chen et al. [30] applied a computational fluid dynamics (CFD) simulation method to develop an optimization analysis procedure of the gasification process in an entrained-flow gasifier, and obtained a best temperature of 1227 • C. Shastri et al. [31] developed a CFD model for a single-stage coal gasifier. Eleven parameters including OT were stochastically optimized by using the parameter space investigation method. Wei et al. [32] used the iterative adaptive dynamic programming (IADP) method to establish a nonlinear optimization control scheme for the coal gasification system. They concluded that the optimal OT should be controlled at around 1320 • C. Actually, the composition of pulverized feed coal varies from batch to batch, and even in the same batch of pulverized coal, there may be fluctuations in composition depending on the feeding time. In turn, the AMP of feed coal usually manifests as a dynamic series over a period of time. Thus the target of OT optimization is actually to seek an optimal OT that is best adapted to the dynamic changes in coal composition from batch to batch and extending to the dynamic changes in AMP. However, in reviewing the current studies on OT optimization, a lot of the reported studies involve static optimization because they are mainly based on static small-scale experiments or simulation experiments and the dynamic changes in coal composition as well as in AMP series of feeding coal are not taken into account.
In this paper, the AMP of feed coal over a period of time is treated as a group of random data series based on random perspective. A stochastic process theory, Markov Chain (MC), is used to simulate and describe the AMP series. A Markov process is a stochastic process model describing a sequence of possible events that satisfies the Markov property. In the philosophy of MC, the probability of each event depends solely on the state attained in the previous event, and this dependence obeys a certain probability distribution [33]. MC models can express the dynamic changes of a random series and predict the future changes of the system through the transition probability of state change among different variables. Because of the clear mathematical meaning of the MC model, it is widely used in modelling or simulation for different applications. For example, Afzal et al. [34] established a hidden Markov model that solves the problem of information prediction in industrial process control. Reference [35] provided a general approximation framework for (path-dependent) option evaluation under time-varying Markov processes. Reference [36] applied the Markov decision process to the control of the hybrid energy storage system, increasing the self-consumption of photovoltaics by 5%. In [37], a semi-Markov process was used to describe the machine degradation process, and a fault detection scheme based on Bayesian estimation was proposed to achieve more effective early fault detection. Others similar reports can be found in [38][39][40][41]. Recently, there have been a few reports about the application of MC in coal gasification. References [42,43] introduced a residence time distribution model of the granular phase in a gasifier based on MC. The results showed that the simulation results are basically in agreement with the measured data, which can be used to predict the residence time distribution of granular phase in the gasifier. However, to our best knowledge, there no studies on the utilization of MC in the simulation of AMP of coal, especially studies on OT optimization based on MC simulation for entrained flow coal gasification process have been reported so far.
The purpose of this paper is to describe our study on: (i) using MC to simulate the AMP series of feed coal for an entrained flow gasification process, (ii) stochastic program modelling based on MC simulation of AMP for OT optimization, and (iii) solving OT optimization with a genetic algorithm (GA). In this study, the AMP series of a feed coal was regarded as a random series, and the simulation for the AMP series with MC was carried out based on 200 sets of actual AMP data collected from an industrial gasification plant. Then a stochastic programming model with the target of optimizing OT based on MC simulation was proposed. Finally, the optimal OT was ascertained by solving the stochastic programing model using GA.

Marckov Chain
The Markov chain is a modeling and prediction method that works based on a stochastic process in order to describe a series of uncertain events [33]. In this type of stochastic process, the Markov property that the future prediction of a process variable could be made independent of the process history should be satisfied. For a stochastic process {X t , t = 0, 1, 2, . . .} with a finite number of possible values, the statement that X t being equal to i expresses that the process is in state i when the time is at t. Figure 1 shows the state transition of a MC process. For a given past state X 0 , X 1 , . . . , X t−1 and the present state X t , the condition distribution of the future state X t+1 is independent of the past state and depends only on the present state. As shown in Figure 1, state set S = 1, 2, . . . , i, j, . . . n , and the state transition with non-aftereffect property can be achieved by a fixed possibility of P ij [44]: where P ij indicates the probability of a process being transferred to state X t+1 at the time of state X t .

=
where indicates the probability of a process being transferred to state at the time of state .
should satisfy the following conditions: where is the total number of states. Then the Markov characteristic of a data series could be expressed by a matrix of transition probability : (3)) is the first order one-step transition matrix. In matrix P, the i th row, vector V, represents the one-step transition probability of state . Then the n-step transition probability vector of state could be expressed by the following formula: As the embedded Markov chain is assumed ergodic, the stationary distribution, described by an k-element vector [46], can be obtained solving the following system of linear equations: where the last equation specifies the normalization of the probability distribution.

MC Simulation Procedure for AMP
The state transition probability matrix P is the core of Markov theory. The n-dimensional state transition probability matrix can be used to describe the change rule of any state in the stochastic system, which greatly simplifies the complexity of the stochastic model. Thus, Markov theory has been widely used in stochastic modelling. P ij should satisfy the following conditions: where n is the total number of states. Then the Markov characteristic of a data series could be expressed by a matrix of transition probability P: Matrix P (Equation (3)) is the first order one-step transition matrix. In matrix P, the ith row, vector V, represents the one-step transition probability of state i. Then the n-step transition probability vector of state i could be expressed by the following formula: Energies 2019, 12, 4245

of 23
As the embedded Markov chain is assumed ergodic, the stationary distribution, described by an k-element vector π [45], can be obtained solving the following system of linear equations: where the last equation specifies the normalization of the probability distribution.

MC Simulation Procedure for AMP
The state transition probability matrix P is the core of Markov theory. The n-dimensional state transition probability matrix can be used to describe the change rule of any state in the stochastic system, which greatly simplifies the complexity of the stochastic model. Thus, Markov theory has been widely used in stochastic modelling.
In this paper, a MC simulation for 200 sets of AMP data (see Appendix A Table A1) collected from a fertilizer plant in Xinxiang, Henan Province, China was carried out. Among them, 160 sets of data were used as training samples, and the remaining 40 sets were used as test samples. The dynamic changes in the AMP series of feeding coal were regarded as state changes, and then the MC process was used to stochastically simulate the AMP series. The specific steps [44] are as shown in Figure 2. In this paper, a MC simulation for 200 sets of AMP data (see Appendix Table A1) collected from a fertilizer plant in Xinxiang, Henan Province, China was carried out. Among them, 160 sets of data were used as training samples, and the remaining 40 sets were used as test samples. The dynamic changes in the AMP series of feeding coal were regarded as state changes, and then the MC process was used to stochastically simulate the AMP series. The specific steps [44] are as shown in Figure 2.   Step 1: Partitioning the state. In order to convert the AMP data into state variables for the Markov process, the data should be divided into several regions and each region could be regarded as states , , , … , . The number of states depends on the capacity of the original AMP data. The interval length of each state depends on the upper and lower bounds of the original AMP data and the error range of the final simulation data. Step 1: Partitioning the state. In order to convert the AMP data into state variables for the Markov process, the data should be divided into several regions and each region could be regarded as states S 1 , S 2 , S 3 , . . . , S n . The number of states depends on the capacity of the original AMP data. The interval length of each state depends on the upper and lower bounds of the original AMP data and the error range of the final simulation data.
Step 2: Constructing the probability matrix of state transition. The AMP sample data is regarded as time series, and the change of data in time series is regarded as state transition. By counting the frequency of each state transition, the probability matrix of state transition could be constructed.
Step 3: Simulating the Markov sequences. It is assumed that the state transition probability vector of the initial state is V 1 = P i 1 1 , P i 1 2 , . . . , P i 1 m , . . . , P i 1 n and r is a random number of [0, 1]. If m is satisfied with: The state S m would be the second state of the sequence. By analogy, the remaining part of the state sequence could be simulated. Then, according to the rules of state partition, the state sequence is transformed into the sequence of AMP. That is, the simulation data are obtained. In this study, we carried out the MC simulation programming (see Appendix B) in the Matlab2015 platform.

MC Simulation for AMP
In a MC simulation, the partition of the state intervals determines the simulation result. The common state partition methods include empirical-based partition, sample mean-variance-based partition and ordered clustering. Among these three methods, the empirical-based partition method is the simplest one and thus is mostly used. For instance, in [44], a wind speed series was divided into 28 states with the length of each state being 0.5 m/s using the empirical-based division method when it was simulated by MC method. References [44,46,47] also present the application of this method in dividing some data series for MC simulation in some practical problems. Similarly, in this paper, the empirical method was adopted to divide the states of the collected AMP series. In addition, in order to evaluate the simulation results, we proposed three portioning schemes and compared the simulation accurate in these three schemes so as to achieve a more precise simulation model for further OT optimization. The three portioning schemes are as follows:    Figure 3 shows a graphical representation of the state transition probability matrix P CaseI , which highlights the peculiarities of its structure. As can be seen from Figure 3, the greater probabilities are almost all distributed near the diagonal, and their values identify the probabilities that the AMP data from one moment to the next remains in the initial state. Furthermore, the non-zero transition probabilities are concentrated around the diagonal, making transitions from an initial state to another far away much less probable.  Then the simulation of AMP series was carried out. In simulation, was set as the initial state, which was the same as the initial state in the sample data, and let be a random number with uniform distribution ~[0,1]. For a state to simulated , if satisfies: could be simulated according to random transformation principle. Similar to Equation (8), Figure 4 shows the schematic principle of the state transition chain in the simulation. As shown in Figure 4, the random number r is 0.04 and it falls into the third interval, so the second state is . Then, taking as the starting state, the next state could be predicted according to the corresponding state transition probability vector. By analogy, a random state sequence with Markov property can be obtained. Then the simulation of AMP series was carried out. In simulation, S 8 was set as the initial state, which was the same as the initial state in the sample data, and let r be a random number with uniform distribution X ∼ [0, 1]. For a state to simulated S m , if m satisfies: S m could be simulated according to random transformation principle. Similar to Equation (8), Figure 4 shows the schematic principle of the state transition chain in the simulation. As shown in Figure 4, the random number r is 0.04 and it falls into the third interval, so the second state is S 3 . Then, taking S 3 as the starting state, the next state could be predicted according to the corresponding state transition probability vector. By analogy, a random state sequence with Markov property can be obtained.
Then the simulation of AMP series was carried out. In simulation, was set as the initial state, which was the same as the initial state in the sample data, and let be a random number with uniform distribution ~[0,1]. For a state to simulated , if satisfies: could be simulated according to random transformation principle. Similar to Equation (8), Figure 4 shows the schematic principle of the state transition chain in the simulation. As shown in Figure 4, the random number r is 0.04 and it falls into the third interval, so the second state is . Then, taking as the starting state, the next state could be predicted according to the corresponding state transition probability vector. By analogy, a random state sequence with Markov property can be obtained.  Furthermore, data reconstruction was carried out so as to get the simulated AMP series. In this step, the obtained random state sequence were transformed into the concrete simulation data of AMP according to the following equation: where, ( ) is the final simulation data of AMP, r is a random number with uniform distribution After completing the steps described above, the simulated AMP series could been obtained by using Equation (9) (as shown in Figure 5 and provided in Appendix Table A1). Furthermore, data reconstruction was carried out so as to get the simulated AMP series. In this step, the obtained random state sequence were transformed into the concrete simulation data of AMP according to the following equation: where, X(t) is the final simulation data of AMP, r is a random number with uniform distribution is the state at t. After completing the steps described above, the simulated AMP series could been obtained by using Equation (9) (as shown in Figure 5 and provided in Appendix A Table A1).
Similar to Case I, the state transition probability matrices of the original AMP series in Case II and Case III, P CaseII and P CaseIII , can be ascertained, individually: The simulated AMP series in Case II and Case III are also shown in Figure 5 and provided in Appendix A Table A1.
The simulated AMP series in Case II and Case III are also shown in Figure 5 and provided in Appendix Table A1.

Accuracy Test
In order to evaluate the simulation accuracy by the MC approach, the goodness of fit between the simulated and the original AMP series was measured. Figure 6 shows the probability distribution of the original and simulated data in three cases, which correspond to the stationary distribution probability vector . It is obvious that the probability distribution of Case I is the most similar to that of the sample data. Its main characteristics are being high in the middle and low on both sides, which indicates that the AMP data are concentrated at about 1350 ℃. In the other two cases, the probability distribution curves are relatively flat and the data are scattered.

Accuracy Test
In order to evaluate the simulation accuracy by the MC approach, the goodness of fit between the simulated and the original AMP series was measured. Figure 6 shows the probability distribution of the original and simulated data in three cases, which correspond to the stationary distribution probability vector π. It is obvious that the probability distribution of Case I is the most similar to that of the sample data. Its main characteristics are being high in the middle and low on both sides, which indicates that the AMP data are concentrated at about 1350 • C. In the other two cases, the probability distribution curves are relatively flat and the data are scattered.  Figure 7 shows the comparisons of the simulated and the original AMP series in three cases. As can be seen from this figure, the data points in Case I are closest to the diagonal. In addition, the error between simulation data and original data in Case I is within 13℃, while the error in Case II and Case III are 18℃ and 26℃, respectively. It can also be seen that there no simulated data locates on the diagonal line for Case III. This means that the simulated data is always inaccurate in this case. The reason may lie in the fact that the state range of Case III is too large. Although each state is simulated accurately, the simulation data of AMP that is randomly generated in a large state range will deviate the sample data greatly. Therefore, the range of the state interval determines the accuracy of the data simulation. Some further statistical indexes i.e., Mean Absolute Deviation (MAD), Root Mean Square Error (RMSE) and Absolute Average Relative Error (AARE) were employed to measure the deviation between the simulated and the actual AMP values: ∑ Figure 6. Comparison of probability density of the simulated and original AMP series in three cases. Figure 7 shows the comparisons of the simulated and the original AMP series in three cases. As can be seen from this figure, the data points in Case I are closest to the diagonal. In addition, the error between simulation data and original data in Case I is within 13 • C, while the error in Case II and Case III are 18 • C and 26 • C, respectively. It can also be seen that there no simulated data locates on the diagonal line for Case III. This means that the simulated data is always inaccurate in this case. The reason may lie in the fact that the state range of Case III is too large. Although each state is simulated accurately, the simulation data of AMP that is randomly generated in a large state range will deviate the sample data greatly. Therefore, the range of the state interval determines the accuracy of the data simulation. Some further statistical indexes i.e., Mean Absolute Deviation (MAD), Root Mean Square Error (RMSE) and Absolute Average Relative Error (AARE) were employed to measure the deviation between the simulated and the actual AMP values: where, y i is the actual value of AMP andŷ i is the simulated value of AMP by the MC simulation approach.
Energies 2019, 12, x FOR PEER REVIEW 10 of 25 Figure 7. Comparisons of the simulated data and the sample data of AMP in three cases.

Further Discussion on the Selection of State Number
Based on the above discussion, it seems to be that the more states there are, the more accurate the simulation will be. However, there are still problems. According to [48], if the number of states is small, the probability the state changing will be significantly reduced, and searching for this other state is, after all, a goal of prediction. If on the other hand the AMP data is divided into a large number of states, the number of events per state will be smaller. This may be insufficient to estimate the actual distribution of the probability of state changes. By comparing equations (7), (10) and (11), it could be found that the probability of transition between states is decreasing with the increase of the number of states. If the number of states is too large, a large number of state transition probability will be small, and the simulated data will be irregular. Such a Markov model with a large number of states  Table 1 presents the deviation statistics between the simulated and the original AMP series in three cases. As shown in Table 1, AARE in Case I and Case II are 0.42% and 0.52%, respectively, which are smaller than 1%; while the AARE in Case III is 1.04%, which is also a small deviation although it is larger than 1%. This indicates the simulation accuracy of the three cases are all acceptable. However, in comparing the three cases, the deviation indexes of MAD, RMSE and AARE in Case I are obviously smaller than those in Case II and Case III. This indicates that the simulation scheme of Case I that dividing the original AMP data into 16 states can provide better prediction accuracy.

Further Discussion on the Selection of State Number
Based on the above discussion, it seems to be that the more states there are, the more accurate the simulation will be. However, there are still problems. According to [48], if the number of states is small, the probability the state changing will be significantly reduced, and searching for this other state is, after all, a goal of prediction. If on the other hand the AMP data is divided into a large number of states, the number of events per state will be smaller. This may be insufficient to estimate the actual distribution of the probability of state changes. By comparing Equations (7), (10) and (11), it could be found that the probability of transition between states is decreasing with the increase of the number of states. If the number of states is too large, a large number of state transition probability will be small, and the simulated data will be irregular. Such a Markov model with a large number of states obviously cannot reflect the characteristics of AMP data. Therefore, it is necessary to discuss the selection of a suitable state number.
In order to determine the optimal states number of Markov model, the MC models with state number from 4 to 16 were compared. The simulation results are shown in Table 2. As shown in Table 2, the simulation model with 16 states is not the best one, although the state number of this model is the largest. However, the most accuracy model with lower evaluation indexes among these simulation models is the one with 13 states. Then the simulation accuracy gradually decreases with the increase of the number of states when the state number is larger than 13. This is because too many states lead the probability of state transition to be very small, and further lead the simulation data to lose the distribution characteristics of the original data. Thus, the simulation scheme with 13 states would be employed in the latter optimization modelling for gasifier OT.

Stochastic Programing Modelling Based on MC Simulation
In coal gasification practice, the gas production will generally reach a maximum when the OT is 50~100 • C higher than the AMP of the feed coal [49,50]. Traditional determination of OT is just done according to a fixed AMP value and simply adding a temperature increment to this fixed AMP value. This fixed AMP value is usually the AMP mean of the feed coal over a period of time or several successive batches. This traditional method is simple, but often results in inaccuracy in OT determination because the possible great changes in AMP of feed coal from batch to batch are not taken into account. This may greatly affect the gasification reaction and the stability of gasification production. In order to ascertain a more feasible OT for gasifier operation, it is necessary to take the fluctuations in the AMP of the pulverized feed coal into account when determining the OT. For these reasons, we proposed a stochastic programing model based on MC simulation for OT optimization in this paper.
Considering the random fluctuation in the AMP series of feed coal and treating the AMP series as a dynamic random series, the principle of this stochastic model is to gain an optimal OT that compromises the ideal OT series according to each data of the AMP series. In this model, a random increment (denotes as ∆(t)) within the interval [50,100] was added to each data of the simulated AMP series (denotes as X(t)) to represent the ideal value of OT (denotes as Y(t)), then the optimization target is to minimize the gap between the optimal OT (denotes as T) and each ideal OT Y(t). The stochastic programing model based on MC simulation for OT optimization was proposed as follows: where T is the decision variable; Z is the objective function, represented by the sum of squares of the difference between T and Y(t); ∆(t) is obeyed the uniform distribution ∆(t) ∼ U(50, 100); rand(t) is a random number of [0, 1]; S(t) is the state of AMP at t, and its initial value is set as S(1) = 8.
The most prominent feature of this stochastic programing optimization model is the integration of the MC simulation in it. In order to gain a more accurate optimal OT to provide more valuable guidance for gasifier operation practice, the simulated AMP data series were expanded to 10,000 data sets according to the change law of the original AMP data series using the found simulation program. Then the stochastic programing would be optimized based on the newly acquired large data sets. The found mathematical model for OT optimization is a stochastic programming model due to the hybridization of MC random simulation in the constraints. These random, non-linear and non-quadratic constraints present multiple challenges in regard to solution methodology using traditional mathematical programming methods. However, these shortcomings can be easily handled using some artificial intelligence algorithms such as genetic algorithm (GA), simulated annealing algorithm (SAA), particle swarm optimization (PSO) and so on. Different from the traditional mathematical optimization methods, these artificial intelligence algorithms are evolutionary heuristics that are population-based search methods [50][51][52]. In an optimization procedure with these algorithms, the iterations move from a set of points (population) to another set of points with likely improvement using a combination of deterministic and probabilistic rules. Among these evolutionary algorithms, the GA employs the principal of "survival of the fittest" in its search process, and achieves evolution through crossover and mutation operations to explore new areas of solution space using best characteristics of previous population generations. Thus, it is well suited to and has been extensively applied to solve complex optimization problems such as nonlinear objectives and constrained functions without requiring gradient information. Therefore, we used GA to optimize the found stochastic optimization model of OT in this study.

Parameter Optimization Using GA
GA is an intelligent algorithm based on artificial population evolution [51]. The main idea is to screen individuals according to a series of genetic operations such as fitness function and cross-mutation, so that individuals with high fitness can be retained to form a new population. In this way, the fitness of each individual in the population continues to improve until certain limit conditions are met. At this time, the individual with the highest fitness in the population is the optimal solution of the parameters to be optimized.
In order to realize the programming of GA, the Sheffield GA optimizer coded in MATLAB is preferable due the operational convenience of this platform. Many optimization problems in various fields have been solved with it [53][54][55]. Similar to ordinary genetic algorithm programs, the Sheffield GA optimizer also contains standard crossover and mutation operators. Crossover is achieved by cloning pairs of parent solutions drawn randomly from the population and performing gene-swap between two gene positions selected at random. Similarly, mutation is achieved by cloning a parent solution drawn randomly from the population and then switching select genes in the cloned solution to new states randomly. Gene selection for mutation is done randomly with a 1% probability [56]. Figure 8 illustrates examples for crossover and mutation operators on solution candidates. In this paper, the value of OT is regarded as different individuals. After substituting it into the objective function, the individual is evaluated according to the size of the objective value, and the individuals with good quality are selected for cross and variation to produce better individuals. Finally, after a certain number of iteration, we can get the best individual, that is, the best operation temperature. The GA procedure is shown in Figure 9. In this paper, the value of OT is regarded as different individuals. After substituting it into the objective function, the individual is evaluated according to the size of the objective value, and the individuals with good quality are selected for cross and variation to produce better individuals. Finally, after a certain number of iteration, we can get the best individual, that is, the best operation temperature. The GA procedure is shown in Figure 9.
In solving our stochastic optimization model (Equation (15)) by GA, the population size is set as N = 30 [57,58]. Besides, the roulette method [50] is used for selection, which is obey the rule that the higher the fitness, the greater the probability of being selected. Moreover, the crossover probability and the variation probability are usually set as P c = 0.8 and P m = 0.01 [58]. The iteration process of optimal solution is presented in Figure 10. It could be seen from Figure 10 that the convergence speed of the optimal solution is very fast, and the optimal solution has gradually stabilized at T = 1424.80 • C.
In order to verify this optimization result, we also optimized the found stochastic optimization model using SAA and PSO. Figure 11 shows the iterations of the optimal solution by SAA and PSO, and the optimal solutions by the these two algorithms are 1424.77 • C and 1424.75 • C, respectively. These two optimal results are almost consistent with that by GA algorithm. This indicates that the optimization result using GA is valid. However, the convergence speed of GA is the fastest comparing with SAA and PSO, which can be confirmed by their iterations. In this paper, the value of OT is regarded as different individuals. After substituting it into the objective function, the individual is evaluated according to the size of the objective value, and the individuals with good quality are selected for cross and variation to produce better individuals. Finally, after a certain number of iteration, we can get the best individual, that is, the best operation temperature. The GA procedure is shown in Figure 9. In solving our stochastic optimization model (Equation (15)) by GA, the population size is set as N = 30 [57,58]. Besides, the roulette method [50] is used for selection, which is obey the rule that the higher the fitness, the greater the probability of being selected. Moreover, the crossover probability and the variation probability are usually set as = 0.8 and = 0.01 [58]. The iteration process of optimal solution is presented in Figure 10. It could be seen from Figure 10 that the convergence speed of the optimal solution is very fast, and the optimal solution has gradually stabilized at = 1424.80 ℃.  In order to verify this optimization result, we also optimized the found stochastic optimization model using SAA and PSO. Figure 11 shows the iterations of the optimal solution by SAA and PSO, and the optimal solutions by the these two algorithms are 1424.77 ℃ and 1424.75 ℃, respectively. These two optimal results are almost consistent with that by GA algorithm. This indicates that the optimization result using GA is valid. However, the convergence speed of GA is the fastest comparing with SAA and PSO, which can be confirmed by their iterations.

Optimization Results Comparison between Stochastic Model and Conventional Model
As stated in Section 4.1, the traditional determination of OT is just according to the AMP mean of feed coal over a period of time or several successive batches. The determined OT with this method is usually inaccurate because the dynamic changes in AMP series are neglected. To verify whether our proposed stochastic programing model based on MC simulation can obtain a more accurate optimal OT or not for gasification practice, we compared the optimization results obtained from our

Optimization Results Comparison between Stochastic Model and Conventional Model
As stated in Section 4.1, the traditional determination of OT is just according to the AMP mean of feed coal over a period of time or several successive batches. The determined OT with this method is usually inaccurate because the dynamic changes in AMP series are neglected. To verify whether our proposed stochastic programing model based on MC simulation can obtain a more accurate optimal OT or not for gasification practice, we compared the optimization results obtained from our model and those of the conventional method.
According to the conventional method, a random temperature increment with uniform distribution ∆(t) ∼ U(50, 100) was added to the mean of the 200 AMP series (X = 1348 • C), then the optimal OT obtained is 1398 ∼ 1448 • C. Figure 12 shows the range of optimal solutions obtained by the conventional method and by our stochastic optimization model. Compared the optimal result ascertained by the conventional model as shown in Figure 12, the optimal solution obtained by our proposed stochastic optimization model based on MC simulation converges to a narrower interval, which indicates that our proposed model can provide a more accurate optimal OT than that obtained from the conventional method.
In order to verify this optimization result, we also optimized the found stochastic optimization model using SAA and PSO. Figure 11 shows the iterations of the optimal solution by SAA and PSO, and the optimal solutions by the these two algorithms are 1424.77 ℃ and 1424.75 ℃, respectively. These two optimal results are almost consistent with that by GA algorithm. This indicates that the optimization result using GA is valid. However, the convergence speed of GA is the fastest comparing with SAA and PSO, which can be confirmed by their iterations.

Optimization Results Comparison between Stochastic Model and Conventional Model
As stated in Section 4.1, the traditional determination of OT is just according to the AMP mean of feed coal over a period of time or several successive batches. The determined OT with this method is usually inaccurate because the dynamic changes in AMP series are neglected. To verify whether our proposed stochastic programing model based on MC simulation can obtain a more accurate optimal OT or not for gasification practice, we compared the optimization results obtained from our model and those of the conventional method.  These two optimal results are almost consistent with that by GA algorithm. This indicates that the optimization result using GA is valid. However, the convergence speed of GA is the fastest comparing with SAA and PSO, which can be confirmed by their iterations.

Optimization Results Comparison between Stochastic Model and Conventional Model
As stated in Section 4.1, the traditional determination of OT is just according to the AMP mean of feed coal over a period of time or several successive batches. The determined OT with this method is usually inaccurate because the dynamic changes in AMP series are neglected. To verify whether our proposed stochastic programing model based on MC simulation can obtain a more accurate optimal OT or not for gasification practice, we compared the optimization results obtained from our model and those of the conventional method.

Conclusions
Based on a stochastic process perspective, this paper focuses on the random simulation of AMP of feed coal for an entrained flow coal gasification process and the stochastic optimization of OT so as to provide more precise operational guidance for gasification operations. In this study, the AMP of coal was regarded as a random variable, and the MC method was used to simulate the dynamic changes in AMP. Then based on the MC simulation, a stochastic programing model for OT optimization was proposed. After a series of simulation with MC and optimization with GA, the optimal OT was gained. Some conclusions are as follows: (1) MC is an effective simulation method to describe the dynamic changes in the AMP series when considering the characteristic variations of the feed coal from batch to batch. The MC simulation method can be further used in OT optimization. (2) In the application of MC to simulate the collected original AMP data, the simulation result is best when the original AMP series is divided into 13 states. Under this partition scheme, the average relative deviation between the simulated and the original AMP is only 0.35%, which is very small. This indicates that founded MC simulation model under this scheme can accurately describe the dynamic change of AMP. (3) Compared to the conventional OT determination method that just according to the AMP mean of the feed coal over a period of time or several successive batches, the proposed stochastic programing model that integrating MC simulation for OT optimization has obvious advantages, because the dynamic changes in AMP series are taken into account. Moreover, the final optimal OT value is T = 1424.80 • C, which is more accurate than the result T ∈ [1398, 1448] obtained with the conventional method. (4) The proposed stochastic programing model for OT optimization is a co-integration dynamic optimization problem, which can be optimized by some intelligent algorithms. The final optimal OT ascertained from the proposed stochastic programing model is more accurate than that obtained using the conventional method, which has been verified by comparing the results. Thesr show that the proposed OT optimization model based on MC simulation can provide more accurate and reliable references for actual production.

Conflicts of Interest:
The authors declare no conflicts of interest.

Nomenclature
The following nomenclature is used in this manuscript: The i-th simulated data y i The i-th sample data T Actual operating temperature Y(t) Ideal operating temperature at time t X(t) Ash melting point at time t ∆(t) The uniform distribution ∆(t) ∼ U(50, 100) rand(t) A random number of [0, 1] at time t S(t) State of ash melting point at time t N Population size P c Crossover probability P m Variation probability Appendix A Table A1. Simulation data and sample data deviation in three cases (the sample data of coal is from Hebi lean coal).