An Unsupervised Classification Method for Flame Image of Pulverized Coal Combustion Based on Convolutional Auto-Encoder and Hidden Markov Model

Combustion condition monitoring is a fundamental and critical issue in the wide-load operation of coal-fired boilers. In this paper, an unsupervised classification framework based on the convolutional auto-encoder (CAE), principal component analysis (PCA), and the hidden Markov model (HMM) is proposed to monitor the combustion condition using uniformly spaced flame images collected from the furnace combustion monitoring system. First, the CAE is adopted to extract features from the flame images, yielding sparse representations of the images. Then, PCA is applied to project the feature vectors into an orthogonal space for robustness and computational efficiency. Finally, an HMM is built to calculate the corresponding optimal states by learning the temporal behavior of the compressed representations. A coal combustion adjustment experiment was conducted in a 660 MW opposed-firing boiler, and 14,400 sequential flame images covering three different combustion states were obtained to evaluate the effectiveness of the proposed approach. We tested six different compression dimensions of the latent variable z in the CAE model and determined that the appropriate compression dimension was 1024. The proposed framework is compared with five other methods: CAE + Gaussian mixture model (GMM), CAE + k-means, CAE + fuzzy c-means, CAE + HMM, and the traditional handcrafted feature extraction method (TH) + HMM. The results show that the proposed framework has the highest classification accuracy (95.25% for the training samples and 97.36% for the testing samples) and the best performance in recognizing the semi-stable state (85.67% for the training samples and 77.60% for the testing samples), indicating that the proposed framework is capable of identifying changes in the combustion condition when the combustion deteriorates as the coal feed rate falls.


Introduction
Fossil power plants in China are facing more peak-shaving requests owing to the growth of renewable energies. Reducing the minimum technical output of a unit is one of the goals of flexibility transformation, which means that the boilers in power plants must be operated below the designed minimum output. When the boiler runs at low load, the variable quality of the coal used in practice makes the combustion unstable, which directly affects the safety and economics of boiler operation. Hence, the identification of the combustion condition has received extensive attention from researchers. Flame visualization and characterization techniques are among the research tools for understanding the combustion process. Current research mainly includes feature-based machine learning, statistical process monitoring, and deep learning methods.
The main steps of feature-based machine learning methods are feature extraction and state classification. However, whether the extracted characteristic parameters can represent the combustion condition depends on the image processing technology, such as the segmentation algorithm. The machine learning methods mainly include artificial neural networks, linear classifiers, and clustering analysis, and tuning their parameters usually takes a lot of time. A back propagation (BP) neural network (NN) [1], a wavelet NN [2], and a method combining Kohonen's self-organizing NN with BP [3] were proposed to predict the combustion status from digital images. To improve the convergence speed and recognition accuracy, Han et al. [4] proposed an interactive flame image recognition method in which manual evaluation was introduced into the NN. Considering the uncertainty in flame detection, Li et al. [5] proposed a two-stage fusion structure combining BP and the Dempster-Shafer (D-S) evidence theory. To analyze the dynamic characteristics of the flame images, Liu et al. [6] selected the relative changes of the ignition position as the inputs of a fuzzy neural network, Zhang et al. [7] applied fuzzy theory to fuse the fire detection value and the flame characteristics, and Xu et al. [8] proposed an online fuzzy clustering algorithm to monitor the in-furnace combustion states automatically. In terms of linear classifiers, the support vector machine (SVM) [9] and the robust support vector regression machine [10] have also been proposed for their robustness to outliers. Wu et al. [11] introduced the Krawtchouk moment and Wang et al. [12] used Fisher discriminant analysis to extract the features, then combined a wavelet SVM and k-nearest neighbor (KNN) for combustion state classification. For the sintering process of a rotary kiln, Li et al. [13] applied ensemble learner models with the probabilistic neural network (PNN), NN, SVM, and extreme learning machine (ELM) classifiers for combustion state recognition, and Chen et al. [14] used SVM and ELM to recognize the temperature condition of a rotary kiln.
On combustion process monitoring, Bai et al. [15] used principal component analysis (PCA) to extract the characteristics of flame images. Considering the stochastic behavior of time series in the combustion process, Chen et al. [16] proposed a framework based on multiway principal component analysis (MPCA) and the hidden Markov model (HMM) that establishes a probability monitoring chart of oil flame images under a normal combustion state in order to track the sequence of combustion transition states. Bai et al. [17] proposed a multi-condition combustion process monitoring method based on principal component analysis and a random weight network (PCA-RWN).
In recent years, deep learning has received unprecedented attention and development. Its greatest advantage is that it can learn the representative characteristics of the data automatically and avoid being misled by hand-crafted features [18]. It is widely applied in research fields related to feature extraction: Zhou et al. [19] combined independent subspace analysis (ISA) with a convolutional network to extract, layer by layer, the local morphology of burning images from the rotary kiln sintering process, and built a bag-of-words model to learn their global features. Yuting et al. [20] used a deep belief network (DBN) to obtain features from furnace flame images. In combustion state recognition, Wang et al. [21] proposed a deep learning method based on a convolutional neural network (CNN) and a deep neural network (DNN) to monitor combustion states and predict the heat release rate.
The studies above were mainly based on supervised learning; that is, the models learn and extract the corresponding features directly and intentionally through feedback from labeled images. These labels are often assigned by researchers on the basis of specific experimental conditions or expertise. Labeling the images manually is a time-consuming and laborious task, and any mistakes made during labeling affect the performance of the models. The Kohonen network proposed by Wei [2] and the fuzzy immune network algorithm proposed by Guo et al. [22] were attempts to realize unsupervised learning in the flame monitoring process. However, because the features are hand-crafted, it is hard to determine whether the selected features yield the best recognition performance for combustion state classification.
In this paper, we propose an unsupervised classification framework based on the convolutional auto-encoder (CAE), principal component analysis (PCA), and the hidden Markov model (HMM) for pulverized coal combustion status recognition using uniformly spaced flame images. The effective characteristics of the flame images are retrieved with the CAE and then further compressed with PCA to obtain a set of orthogonal data. The hidden Markov model is built on the orthogonal data and applied to new flame images for combustion status recognition. In the framework, the CAE extracts features from the flame images directly, which simplifies the whole feature extraction process compared with the hand-crafted method; PCA is introduced to disentangle the latent space, in which the data exist as correlated variables in a good representation subspace, so that the different features become independent of each other; and the HMM captures the dynamic temporal behavior, which is effective for combustion condition recognition. Six different compression dimensions of the latent variable z in the CAE model are compared to select the appropriate compression parameter. The framework is then tested with flame images from a 660 MW thermal power plant and compared with other methods, including CAE + Gaussian mixture model (GMM), CAE + k-means, CAE + fuzzy c-means, CAE + HMM, and the traditional handcrafted feature extraction method (TH) + HMM, to verify the effectiveness of the proposed framework.

Convolutional Auto-Encoder
CAEs differ from conventional AEs in that their weights are shared among all locations in the input, which preserves spatial locality. The details of CAEs can be found in [23,24]. For a mono-channel input x, the latent representation of the k-th feature map $h^k$ is given by [23]

$$h^k = \sigma\left(x * W^k + b^k\right),$$

where the bias $b^k$ and the weight $W^k$ are broadcast to the whole map, $\sigma$ is a nonlinear transformation (we use the rectified linear unit (ReLU) [25] function here), and "*" denotes the 2-dimensional convolution.
The reconstruction process can be calculated as [23]

$$y = \sigma\left(\sum_{k \in H} h^k * \tilde{W}^k + c\right),$$

where H identifies the group of latent feature maps, $\tilde{W}^k$ identifies the flip operation over both dimensions of the weights, and c is the bias for each input channel. The cost function to minimize the reconstruction error between the input x and the output y is the binary cross entropy, which can be written as [26]

$$L(x, y) = -\frac{1}{n}\sum_{i=1}^{n}\left[x_i \log y_i + (1 - x_i)\log(1 - y_i)\right],$$

where n is the number of samples.
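As a rough illustration of the binary cross-entropy cost described above, the following sketch evaluates it for two short lists of values in [0, 1]. It is not the authors' implementation: the function name `binary_cross_entropy`, the clipping constant `eps`, and the choice to average over individual elements are assumptions made for this example.

```python
import math


def binary_cross_entropy(x, y, eps=1e-12):
    """Mean binary cross entropy between targets x and reconstructions y.

    Both inputs are flat sequences with values in [0, 1]; eps is an
    assumed clipping constant that keeps log() away from zero.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0.0
    for xi, yi in zip(x, y):
        yi = min(max(yi, eps), 1.0 - eps)  # clip to avoid log(0)
        total += xi * math.log(yi) + (1.0 - xi) * math.log(1.0 - yi)
    return -total / len(x)


# A closer reconstruction gives a smaller loss than a poorer one.
good = binary_cross_entropy([1.0, 0.0], [0.9, 0.1])
poor = binary_cross_entropy([1.0, 0.0], [0.6, 0.4])
```

In a real CAE the same quantity would be computed over all pixels of a batch by the deep learning framework in use.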

Max-Pooling
For hierarchical networks in general, and CNNs in particular, a max-pooling layer is often introduced to obtain translation-invariant representations by taking the maximum value over non-overlapping sub-regions [23]. Figure 1b shows the down-sampling process of the input matrix A, which is divided into several sub-regions. If the sub-regions do not overlap and the size of a sub-region is λ × τ, the (i, j)-th sub-region can be expressed as [27]

$$G^{A}_{\lambda,\tau}(i, j) = (a_{st})_{\lambda \times \tau}, \quad (i-1)\lambda + 1 \le s \le i\lambda, \quad (j-1)\tau + 1 \le t \le j\tau,$$

where $a_{st}$ is the (s, t)-th element of the matrix A.
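The down-sampling over non-overlapping λ × τ sub-regions can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the function name `max_pool` and the example matrix are hypothetical, and the matrix size is required to be an exact multiple of the sub-region size.

```python
def max_pool(a, lam, tau):
    """Maximum over non-overlapping lam x tau sub-regions of matrix a.

    a is a list of equally long rows; indices are 0-based here, whereas
    the text uses 1-based indices.
    """
    rows, cols = len(a), len(a[0])
    if rows % lam or cols % tau:
        raise ValueError("matrix size must be a multiple of the sub-region size")
    pooled = []
    for i in range(rows // lam):
        pooled_row = []
        for j in range(cols // tau):
            # Collect every element of sub-region (i, j).
            block = [a[s][t]
                     for s in range(i * lam, (i + 1) * lam)
                     for t in range(j * tau, (j + 1) * tau)]
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled


A = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 7, 2],
     [6, 2, 3, 3]]
pooled = max_pool(A, 2, 2)
```

With 2 × 2 sub-regions, a 4 × 4 input is reduced to a 2 × 2 output that keeps only the largest value of each block.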

Activation
The activation layer performs a nonlinear transformation on the input data so that the network can fit nonlinear mappings. The commonly used activation functions are the sigmoid [26] and the rectified linear unit (ReLU). The sigmoid is an S-shaped function built from the exponential function that imitates the response of a biological neuron; here it is placed in the final layer to produce outputs between 0 and 1 that can be read as probabilities. It is defined as [26]

$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}.$$

The ReLU function is a piecewise linear function that sets all negative values to 0 and is faster to compute than the sigmoid function. It can be written as [25]

$$\mathrm{ReLU}(x) = \max(0, x).$$
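The two activation functions can be written directly from their definitions. The sketch below is a scalar illustration only; the function names `sigmoid` and `relu` are chosen for this example, and the branch on the sign of x is an assumed, commonly used way to avoid overflow in the exponential.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return z / (1.0 + z)


def relu(x):
    """Rectified linear unit: negative inputs are set to 0."""
    return max(0.0, x)


values = [-2.0, 0.0, 2.0]
sigmoid_values = [sigmoid(v) for v in values]
relu_values = [relu(v) for v in values]
```

The ReLU needs only a comparison, while the sigmoid needs an exponential, which is consistent with the speed advantage mentioned above.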

Principal Component Analysis (PCA)
As described in the previous sections, we can obtain the feature matrix X from the output of the encoder. If the i-th feature vector is

$$x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^{T}, \quad i = 1, \ldots, m,$$

then $X = [x_1, x_2, \ldots, x_m]$ is an n × m matrix, where n is the number of features and m is the number of samples. The auto-encoder compresses the original data into a more compact representation, which retains the most important features by reconstructing the input. PCA projects the input into an orthogonal space, ensuring that the components of the resulting lower-dimensional vectors are independent of each other. The goal of PCA [28] is to find the orthonormal matrix P in Y = PX, where the rows of P are the principal components of X. The singular value decomposition of the correlation matrix of X, i.e., $C_X$, is given by [17]

$$C_X = U^{T} \Lambda U,$$

where $U = [u_1, u_2, \ldots, u_n]^{T}$ represents an n × n unitary matrix and Λ is the diagonal matrix of the eigenvalues. If r is the number of principal components, then the loading matrix P can be written as

$$P = [u_1, u_2, \ldots, u_r]^{T}, \qquad Y = PX,$$

where Y is an r × m matrix transformed from $X_{n \times m}$ by reducing the n-dimensional feature vectors to r-dimensional vectors.
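A minimal NumPy sketch of this projection is given below. It is an illustration rather than the authors' code: the function name `pca_loading_matrix` and the synthetic data are hypothetical, and the sketch assumes that each feature is centred and that the covariance matrix of the centred data plays the role of $C_X$.

```python
import numpy as np


def pca_loading_matrix(X, r):
    """Return the r x n loading matrix P and all eigenvalues (descending).

    X has shape (n, m): n features in rows, m samples in columns.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # centre each feature
    C = Xc @ Xc.T / (X.shape[1] - 1)         # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    P = eigvecs[:, order[:r]].T              # rows of P are the top-r eigenvectors
    return P, eigvals[order]


# Synthetic feature matrix standing in for the encoder output.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))
P, eigvals = pca_loading_matrix(X, r=2)
Y = P @ (X - X.mean(axis=1, keepdims=True))  # r x m compressed features
```

The rows of Y are mutually uncorrelated, and their variances equal the leading eigenvalues, which is what makes the compressed vectors convenient observation inputs for a Gaussian HMM.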

The Hidden Markov Model (HMM)
The HMM is a doubly stochastic process, in which both the transition probabilities between states and the observations of each state are uncertain [16-18,29]. The states in the model can only be perceived through the observation vectors and cannot be observed directly. We use the notation λ = (A, B, π) for the parameter set of the HMM. $A = \{a_{i,j}\}_{N \times N}$ represents the state transition probability distribution among the hidden states, and N is the number of states. We denote the individual states as $S = \{S_1, S_2, \ldots, S_N\}$ and the state at time t as $q_t$. The transition probability $a_{i,j}$ is defined as [29]

$$a_{i,j} = P\left(q_{t+1} = S_j \mid q_t = S_i\right), \quad 1 \le i, j \le N, \tag{12}$$

and $\pi = \{\pi_i\}$ is the initial state distribution, where

$$\pi_i = P\left(q_1 = S_i\right), \quad 1 \le i \le N.$$

The observation distribution $B = \{b_j(O_t)\}$, with $b_j(O_t) = P(O_t \mid S_j)$, is generalized by a Gaussian mixture density [29]

$$b_j(O_t) = \sum_{m=1}^{M} c_{jm}\, N\left[O_t, \mu_{jm}, U_{jm}\right], \quad 1 \le j \le N,$$

where O is the set of the observation vectors collected for modeling, M is the number of mixture components, $c_{jm}$ is the mixture coefficient for the m-th mixture in state $S_j$ (the coefficients sum to one for each state), and N[·] is the Gaussian density with mean vector $\mu_{jm}$ and covariance matrix $U_{jm}$ for the m-th mixture component in state $S_j$. Given some observation sequences as training data, we can use an iterative procedure, such as the Baum-Welch procedure, to estimate the model parameters; the details are explained in [29]. The goal is to adjust the parameters of the model λ to maximize P(O|λ).
If $Q = \{q_1, q_2, \ldots, q_T\}$ is the single best state sequence for a given O, then the quantity $\delta_t(i)$ can be defined as [29]

$$\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P\left[q_1 q_2 \cdots q_{t-1},\, q_t = S_i,\, O_1 O_2 \cdots O_t \mid \lambda\right],$$

where $\delta_t(i)$ is the highest probability along a single path that ends in state $S_i$ at time t.

By induction we have [29]

$$\delta_{t+1}(j) = \left[\max_{1 \le i \le N} \delta_t(i)\, a_{i,j}\right] b_j(O_{t+1}).$$

Finally, we can obtain the formula of $q_t^*$, i.e., the combustion state corresponding to each observation [29]

$$q_t^* = \arg\max_{1 \le i \le N}\left[\delta_t(i)\right].$$
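The recursion for δ and the state assignment above can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `viterbi_delta` and the toy two-state model are hypothetical, the computation is carried out with log-probabilities (an assumption made to avoid numerical underflow on long sequences), and the observation densities are taken as precomputed inputs rather than evaluated from Gaussian mixtures.

```python
import math


def viterbi_delta(log_pi, log_A, log_B):
    """Compute log delta_t(i) and the per-step state q*_t = argmax_i delta_t(i).

    log_pi[i]   : log initial probability of state i
    log_A[i][j] : log transition probability from state i to state j
    log_B[t][j] : log observation density of O_t in state j
    """
    n_states, n_steps = len(log_pi), len(log_B)
    delta = [[log_pi[i] + log_B[0][i] for i in range(n_states)]]
    for t in range(1, n_steps):
        prev = delta[-1]
        delta.append([
            max(prev[i] + log_A[i][j] for i in range(n_states)) + log_B[t][j]
            for j in range(n_states)
        ])
    # As in the text, the state at time t is the argmax of delta_t(i).
    q_star = [max(range(n_states), key=lambda i: row[i]) for row in delta]
    return delta, q_star


# Toy two-state model: "sticky" transitions, observations that favour
# state 0 for three steps and state 1 for the following three steps.
log = math.log
log_pi = [log(0.5), log(0.5)]
log_A = [[log(0.9), log(0.1)],
         [log(0.1), log(0.9)]]
log_B = [[log(0.8), log(0.2)]] * 3 + [[log(0.2), log(0.8)]] * 3
delta, q_star = viterbi_delta(log_pi, log_A, log_B)
```

Because the transition matrix penalizes switching, the assigned state changes only after the evidence for the new state has accumulated, which is the temporal smoothing that frame-by-frame clustering lacks.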

CAE and PCA-HMM-Based Combustion Classification
Image processing technology can retrieve rich and reliable information from flame images, and it has been widely studied and applied in various fields. In the sintering process of the rotary kiln, a flame image-based burning state recognition system is used to ensure that the sintered clinkers are qualified [19]. In basic oxygen furnace blowing, it is used to make an accurate, real-time judgment of the furnace endpoint [30]. In aero-engines, it has been proposed to determine the combustor ignition/flameout status and retrieve the corresponding features directly [31]. In this paper, we propose a deep learning method that uses pulverized coal flame images to recognize the burner state, which provides a useful reference for image-based combustion status recognition in other fields.
Figure 2a shows the schematic diagram of the proposed method and the distribution map of the swirl burners in the opposed-firing boiler. The labels A, B, C, D, E, F, and G in Figure 2a represent the seven layers of burners, which were distributed on the front and back walls of the furnace. The C, D, E, and F layer burners were on the front wall, and the A, B, and G layer burners were on the back wall. Each burner layer was equipped with a medium-speed coal mill, and the pulverized coal pipelines of each coal mill were connected to the corresponding layer of burners. Each mill and its corresponding layer of burners could be shut down according to changes in the boiler load. The flame image monitoring system, including a probe and a protective sleeve for each burner, was mounted on the F layer burners, as shown in Figure 2b. Cooling air pipelines were installed to cool the probe, which extended into the furnace, ensuring that the probe's temperature did not exceed 70 °C. The images of coal combustion in the furnace were captured by the image probe in video format, transferred through composite video cables, and stored in the hard disk recorder in the electronics room. First, we transformed the flame videos into red, green, and blue (RGB) images with a size of 960 × 576 × 3 at a frame rate of 25 frames per second, using video processing technology, and calculated the average of the 25 images in each second to eliminate the influence of flicker. Second, the images were resized to 128 × 128 × 3 as the inputs of the CAE model. The CAE model learns to encode the input images into compact representations and then reconstructs the input images from these representations. By training the CAE model, the representations come to hold the key information about the input images. Subsequently, PCA was applied to project the representations, as feature vectors, into the orthogonal space, ensuring that the feature vector components were independent of each other. Finally, the HMM was built to generate the corresponding optimal state sequence from the feature variables, which can be applied for online combustion condition monitoring.
The framework was built with the following steps:
(1) A CAE model was constructed, in which the latent variables were considered as the features of the flame images $x_i$, i = 1, ..., m, where n is the number of nodes in the output of the encoding network and m is the number of flame images.
(2) The loading matrix P was calculated to transform the n-dimensional latent variables into r-dimensional vectors using PCA, and the transformed vectors were taken as the observation vectors in the HMM.
(3) A set of $O = [O_1, O_2, \ldots, O_m]$ was collected as the training data, and the parameters of the HMM, $\lambda^* = \arg\max_{\lambda} \prod_{i=1}^{m} P(O_i \mid \lambda)$, were estimated using the Baum-Welch algorithm.
(4) The output of the HMM was calculated as $q_t^* = \arg\max_{1 \le i \le N}[\delta_t(i)]$, t = 1, ..., m, where $\delta_t(i)$ is the highest probability along the path that ends in state $S_i$ at time t, and $q_t^*$ is the corresponding hidden state of $O_t$.
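The per-second averaging used in the preprocessing described above can be sketched with NumPy. The sketch is illustrative only: the function name `average_per_second` and the tiny synthetic clip are hypothetical, decoding the video and resizing to 128 × 128 × 3 are left out, and frames that do not fill a whole second are assumed to be dropped.

```python
import numpy as np


def average_per_second(frames, fps=25):
    """Average consecutive groups of `fps` frames to suppress flicker.

    frames has shape (num_frames, height, width, channels); trailing
    frames that do not fill a whole second are discarded.
    """
    frames = np.asarray(frames, dtype=np.float64)
    seconds = frames.shape[0] // fps
    grouped = frames[: seconds * fps].reshape(seconds, fps, *frames.shape[1:])
    return grouped.mean(axis=1)


# Tiny synthetic clip: 2 s of 4 x 4 RGB frames whose pixel value equals
# the frame index (the real frames are 960 x 576 x 3).
clip = np.arange(50, dtype=np.float64).reshape(50, 1, 1, 1) * np.ones((1, 4, 4, 3))
averaged = average_per_second(clip)
```

Each output image then represents one second of combustion, so a sequence of such images is uniformly spaced in time, as the HMM stage requires.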

Data Preparation
To evaluate the performance of the proposed framework, a coal combustion adjustment experiment was conducted in a 660 MW opposed-firing boiler. The lines in Figure 3a,b show the actual total load for the training samples and the testing samples during the period of 11:00-17:00. The lines in Figure 3c,d show the coal rate of the F mill. The coal rate was adjusted so that the combustion state of the F layer burners gradually changed between stable and unstable. According to combustion theory [32], the concentration of the pulverized coal is the most important factor directly influencing pulverized coal ignition and combustion in the airflow. For example, if the concentration of the pulverized coal is low and the amount of heat released is small, a continuous flame cannot be formed. The large external heat dissipation lowers the temperature level, leading to unstable combustion. In the coal combustion adjustment experiment, as shown in Figure 3c, the coal rate was relatively steady at 58 t/h during 11:00-12:00, which means that the combustion state was stable; the coal rate of the F mill then began to decrease around 14:35 until it reached 0 t/h at 14:45; finally, the coal rate began to increase at 15:02 until the stable combustion state was reached. The combustion had poor stability when the coal rate was around 0 t/h. We sampled 14,400 sequential flame images as the training set and 3600 images as the testing set. In particular, owing to the lack of unstable and semi-stable samples, we selected the flame images from the period of 14:00-15:00 as both training and testing data. To differentiate the testing data from the training data, we took the images averaged over 2 s during 13:00-15:00 as the testing data. The selected time periods are shown in Table 1. Figure 4 shows three typical images representing different combustion statuses; from left to right, the combustion status of the flame is stable, semi-stable, and unstable, respectively.

Table 1. The selected time periods in the coal combustion experiment.

Convolutional Auto-Encoder
We adjusted the image size from 960 × 576 × 3 to 128 × 128 × 3 as the input of the CAE network. The procedure was performed on a computer equipped with an Intel i7 CPU and an Nvidia GeForce GTX 1060 GPU (MI, Beijing, China). Figure 5 shows the architecture of the CAE network and the encoding and decoding process. The encoder consists of three convolutional layers, three max-pooling layers, and two fully connected layers. All convolutional layers use 3 × 3 kernels and are followed by ReLU activation functions and max-pooling layers with a 2 × 2 window and a stride of 2. We flattened the output of the last max-pooling layer into a one-dimensional vector and compressed it to a smaller dimension with the two fully connected layers. The decoding structure was completely symmetrical with the encoding structure and contained fully connected layers, convolutional layers, and un-pooling layers. The sigmoid activation function was applied in the last layer of the decoder to produce the output y. Here, we chose the cross entropy as the loss function to measure the deviation between the reconstructions y and the original images x. Table 2 shows the structure and parameters of the CAE network, where f represents the kernel size, s is the stride, d is the number of kernels, and p is the padding parameter.
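To see how the three convolution and pooling stages shrink a 128 × 128 input before the fully connected layers, the feature-map shapes can be traced with a short helper. This is a sketch under stated assumptions: the function name `encoder_feature_shapes` is hypothetical, the kernel counts in the example are placeholders rather than the values of Table 2, and the padding is assumed to keep the spatial size unchanged through each 3 × 3 convolution.

```python
def encoder_feature_shapes(height, width, kernels_per_layer, pool=2):
    """Trace (height, width, channels) after each conv + max-pool stage.

    Assumes padding that preserves the spatial size in the convolution,
    so only the pool x pool, stride-pool pooling reduces the size.
    """
    shapes = []
    for d in kernels_per_layer:
        height, width = height // pool, width // pool
        shapes.append((height, width, d))
    return shapes


# Hypothetical kernel counts; the actual values are those listed in Table 2.
shapes = encoder_feature_shapes(128, 128, kernels_per_layer=[16, 32, 64])
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

Under these assumptions the spatial size goes 128, 64, 32, 16, and the flattened vector passed to the fully connected layers has 16 × 16 × d elements for d kernels in the last convolutional layer.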

Determination of the Compression Dimensions of the Latent Variable z
In this section, we explore the compression limit of the latent representation z to ensure that it has few dimensions but still contains enough useful information to reconstruct the original images closely. According to the CAE's architecture, we tested six different dimensions. Figure 6 shows the comparison between the original images and the model's outputs. When the dimensions of the latent variables are 4096, 2048, and 1024, as shown in Figure 6b-d, the decoded images are varied and similar to the original images shown in Figure 6a, and there is little difference among the three rows of flame images. The decoded images of the CAEs with latent dimensions of 512 and 256, shown in Figure 6e,f, are blurred and hardly satisfactory in some parts of the images. When the latent dimension is set to 2, the latent representation contains too little information to reconstruct the input, and there is almost no difference among the right three decoded images in Figure 6g.
The objective of training the CAE network is to use as few dimensions as possible to reconstruct the input while keeping the reconstructed images as similar as possible to the original images. Figure 7 shows the mean square error (MSE) for the different dimensions, and Table 3 shows the corresponding values. The MSE changes only slightly when the dimensions are greater than 1000. We therefore chose 1024 as the appropriate dimension for the latent representation z.

Using PCA and HMM
In the previous section, we selected 1024 as the appropriate dimension for z and took the latent representation z as the feature vector of the flame image. PCA is applied here to reduce the dimensions of the feature vector, to improve the computational efficiency, and to avoid the overfitting caused by redundancy in the images. We used PCA to reduce the high-dimensional data to a 90-dimensional vector with a contribution rate of 81.75%. The compressed vectors were used as the observation sets to train the HMM, assuming that the number of hidden states is 3. Through the iterative procedure, we obtained the model parameters and the optimal state sequence corresponding to the observation sets.
Because the HMM is the method proposed in this paper, we chose three different unsupervised clustering algorithms for comparison: the Gaussian mixture model [33], k-means [34], and fuzzy c-means [35]. For the four methods that include the CAE, we took the latent variables obtained by the CAE as the inputs of the clustering model and then assigned the labels with each method in turn.
TH adopts the traditional feature extraction procedure to obtain the feature vector of a flame image. We preprocessed the flame images before the feature extraction: (1) the median filter was applied for image denoising; (2) the Otsu algorithm [36] was applied for image segmentation. A total of 18 features were extracted, including the first-order and second-order moments of the RGB image, the gray level co-occurrence matrix (angular second moment, entropy, inverse differential moment, and correlation) [17], the fractal dimension [36], the flame's area, and the first-order and second-order moments of the image in the HSI (hue, saturation, intensity) space [37]. The extracted vectors were taken as the observation sets to obtain the optimal hidden state distribution using the HMM.
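The contribution rate quoted above is the share of the total variance carried by the retained principal components. The sketch below shows one way to pick the number of components from a list of eigenvalues; it is an illustration only, the function name `components_for_contribution` is hypothetical, and the eigenvalues and the 0.80 target are made-up numbers, not values from the experiment.

```python
def components_for_contribution(eigenvalues, target):
    """Smallest r whose leading eigenvalues reach the target contribution rate.

    Returns (r, achieved_rate); the rate is the cumulative share of the
    total variance carried by the r largest eigenvalues.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for r, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return r, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalue spectrum, for illustration only.
eigenvalues = [5.0, 2.0, 1.5, 1.0, 0.5]
r, rate = components_for_contribution(eigenvalues, target=0.80)
```

Applied to the eigenvalues of the 1024-dimensional latent features, the same calculation relates a chosen number of components, such as 90, to its contribution rate.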

Experimental Results and Analysis
To evaluate the performance of the proposed framework effectively, we obtained the real labels of the flame images according to three intervals of the coal rate. Table 4 shows the specific ranges of the coal rate. In addition, the combustion stability classification should reflect not only the changes in the combustion condition but also the flicker frequency during the combustion process. We selected the following criteria to evaluate the performance: the flame image was considered to be in the stable state when the combustion status was changing between unstable and stable within 10 s; if the unstable (non-fire) state lasted for at least 10 s, the flame image was considered to be in the unstable state. Figure 8a shows the comparison of the different methods on the training data. Each point represents the combustion state corresponding to the flame image for one second. Based on the specific ranges of the coal rate (Table 4), the red dots represent the actual states: categories 0, 1, and 2, which represent unstable, semi-stable, and stable, respectively. From the distribution of the red dots, we can see that the unstable state persists during 14:45-15:02, when the coal rate was around 0 t/h; the semi-stable state persists during 14:35-14:45 and 15:02-15:09, when the coal rate was decreasing or increasing between 27 t/h and 53 t/h. As shown in Figure 8a, all the methods except the fuzzy c-means method (the pink dots) have high accuracy in recognizing the unstable states; the Gaussian mixture model (the violet dots) and the k-means algorithm (the yellow dots) perform poorly in identifying the semi-stable state. The methods that contain an HMM, namely CAE + HMM (the brown dots), TH + HMM (the green dots), and CAE + PCA-HMM (the blue dots), have relatively steady performance in recognizing the combustion states. In particular, the proposed framework has the output closest to the actual states. The results show that the HMM has an advantage in processing time series, because it can capture the temporal behavior of the characteristic parameters and detect changes in the combustion status more accurately. The conventional k-means is especially unsuitable for this classification task, since the data have temporal behavior. The classification accuracy could be enhanced using semi-supervised learning algorithms, such as the discriminative k-means method [38], which needs further research.
Figure 8b shows the results of the different methods on the testing data. In the figure, each point represents the combustion state corresponding to the flame image for every 2 s. As in Figure 8a, the red dots represent the actual states over the combustion process. The Gaussian mixture model, k-means, and fuzzy c-means methods could not identify the semi-stable state well. All the methods except the proposed framework have a higher error rate in mistaking the semi-stable state for the stable state.
Under the aforementioned criteria, the performance of the proposed framework, CAE + GMM, CAE + k-means, CAE + fuzzy c-means, CAE + HMM, and TH + HMM was evaluated with the same sets of flame images, and their confusion matrices are shown in Figure 9. It can be seen from Figure 9a-e that these methods have higher accuracy for categories 0 and 2, while they perform poorly for category 1. The classification accuracies of the GMM, k-means, and fuzzy c-means in the semi-stable state are 0.0026, 0.0123, and 0.059, respectively, which means that these clustering methods cannot capture the sequential correlation in the flame characteristics. The lower accuracy of CAE + HMM in this case implies that the observation set contains many correlated variables, which affects the performance of the combustion state classification. TH was only about half accurate in category 1 (0.4948). Since the feature extraction process was performed manually, we cannot guarantee that the acquired feature vectors are comprehensive and repeatable. Table 5 shows the classification accuracy of the different methods on the training data. It is worth noting that the total accuracy of the TH + HMM method is close to that of the proposed framework, indicating that the extracted features, such as the brightness and texture of the image, can reflect the change of the combustion state effectively. This suggests that the CAE can likewise extract useful features from the flame images effectively. The confusion matrices of these methods on the testing data are shown in Figure 10, and Table 6 shows the details of the classification accuracy on the testing data. The results show that all the methods have high accuracy in identifying the unstable and stable states; however, they perform worse in the semi-stable state, among which k-means has the lowest accuracy (0.0301) and the proposed method has the highest accuracy (0.7760). The results show that the proposed framework has excellent performance in recognizing the semi-stable state, which is of great significance for guiding the coal combustion adjustment.
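The overall and per-category accuracies discussed above can be read off a confusion matrix as sketched below. This is an illustration, not the authors' evaluation code: the function name `accuracy_from_confusion` and the counts are hypothetical, rows are assumed to be the true categories and columns the predicted ones, and the cluster indices produced by an unsupervised method are assumed to have already been matched to the categories 0, 1, and 2.

```python
def accuracy_from_confusion(matrix):
    """Return (overall accuracy, per-category accuracy) for a square matrix.

    matrix[i][j] counts samples whose true category is i and whose
    predicted category is j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    per_class = [row[i] / sum(row) if sum(row) else float("nan")
                 for i, row in enumerate(matrix)]
    return correct / total, per_class


# Hypothetical counts for categories 0 (unstable), 1 (semi-stable), 2 (stable).
cm = [[90, 10, 0],
      [5, 40, 5],
      [0, 10, 340]]
overall, per_class = accuracy_from_confusion(cm)
```

Because the stable category usually dominates the sample count, the overall accuracy can stay high even when the semi-stable category is recognized poorly, which is why the per-category values are reported separately.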

Conclusions
In this study, an unsupervised framework combining the CAE, PCA, and the HMM is proposed to classify the coal combustion status. First, we tested the influence of the dimensions of the latent representation variable z in the CAE on the results of the coal combustion adjustment experiment, and 1024 was selected as the suitable parameter, which allowed the CAE to retain sufficient useful features to reconstruct the input data closely. PCA was then applied to compress and de-noise the latent representation, ensuring that the vector components in the latent representation were independent and that the computation was efficient. Finally, we adopted the HMM to learn and capture the sequential correlation in the combustion process.
To verify the effectiveness of the model, the coal rate was selected as the classification criterion of the combustion state, according to combustion theory. The classification accuracies of the proposed framework on the training data and testing data were 95.25% and 97.36%, respectively. In particular, the proposed framework had better performance in recognizing the semi-stable state (85.67% for the training samples and 77.60% for the testing samples), which is important for adjusting the combustion state in advance to avoid unstable combustion. It can therefore be concluded that the proposed framework not only simplifies the feature extraction process compared with manual image processing, but also provides an effective means of classifying the flame combustion status in an unsupervised way.

Figure 1. The schematics of (a) the convolutional process and (b) the down-sampling process.

Figure 2. (a) The schematic diagram of the proposed convolutional auto-encoder (CAE) and principal component analysis and hidden Markov model (PCA-HMM) framework for combustion classification and the position of the seven layers of burners in the boiler, with the C, D, E, and F layers of burners on the front wall and the A, B, and G layers on the back wall; (b) the site installation of the flame image monitoring device.

Figure 3. The related variables in the boiler operation process: (a) the actual total load in the training data; (b) the actual total load in the testing data; (c) the coal rate of the F mill in the training data; (d) the coal rate of the F mill in the testing data.

Figure 5. The architecture of the CAE network.

Figure 7. The mean square error (MSE) of the CAE networks with different latent representations.

Figure 8. (a) The hidden states distribution of the training data via different methods; (b) the hidden states distribution of the testing data via different methods. For better observation, the results of some methods are shifted up (or down) a little around the actual values.

Table 2. Structure and parameters of the CAE network.

Table 3. The MSE of the CAE networks with different latent representations.

Table 4. The specific ranges of coal rate and combustion status.

Table 5. Accuracy of the training data via different methods.

Table 6. Accuracy of the testing data via different methods.