Time Series Prediction Method of Clean Coal Ash Content in Dense Medium Separation Based on the Improved EMD-LSTM Model

Cheng, Kai; Zhang, Xiaokang; Zhou, Keping; Zhou, Chenao; Li, Jielin; Yang, Chun; Guo, Yurong; Wang, Ranfeng

doi:10.3390/bdcc9060159


by Kai Cheng ¹, Xiaokang Zhang ², Keping Zhou ¹, Chenao Zhou ¹, Jielin Li ¹,*, Chun Yang ¹, Yurong Guo ¹ and Ranfeng Wang ³

¹ School of Resources and Safety Engineering, Central South University, Changsha 410083, China

² Tianhe Daoyun (Beijing) Technology Co., Ltd., Beijing 100176, China

³ School of Mining Engineering, Taiyuan University of Technology, Taiyuan 030600, China

* Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2025, 9(6), 159; https://doi.org/10.3390/bdcc9060159

Submission received: 28 March 2025 / Revised: 28 May 2025 / Accepted: 12 June 2025 / Published: 15 June 2025

(This article belongs to the Special Issue Application of Deep Neural Networks)


Abstract

Real-time ash content control in dense medium coal separation is challenged by the time lag between ash detection and density adjustment, and by nonlinear, noisy signals. This study proposes a hybrid model for predicting clean coal ash content in dense medium separation that integrates empirical mode decomposition (EMD), long short-term memory (LSTM) networks, and sparrow search algorithm (SSA) optimization. A key innovation lies in removing noise-containing intrinsic mode functions (IMFs) via EMD to ensure clean signal input to the LSTM model. Using production data from a Shanxi coal plant, EMD decomposes the ash content time series into IMFs and a residual. High-frequency, noise-containing IMFs are selectively removed, while LSTM predicts the retained components. SSA optimizes the LSTM parameters (learning rate, hidden layers, epochs) to minimize prediction errors. Results demonstrate that the EMD-IMF1-LSTM-SSA model achieves superior accuracy (RMSE: 0.0099, MAE: 0.0052, MAPE: 0.047%) and trend consistency (NSD: 12), outperforming baseline models. The study also proposes the novel “Vector Value of the Radial Difference (VVRD)” metric, which quantifies prediction trend accuracy. By resolving time-lag issues and mitigating noise interference, the model enables precise ash content prediction 16 min ahead, supporting automated density control, reduced energy waste, and eco-friendly coal processing. This research provides practical tools and new metrics for intelligent coal separation in the context of green mining.

Keywords:

dense medium separation; empirical mode decomposition (EMD); long short-term memory (LSTM) neural network; clean coal ash prediction; vector value of the radial difference (VVRD); trend consistency evaluation

1. Introduction

Dense medium separation is one of the most common methods for coal processing and utilization. In the context of green development, improving dense medium separation processes is an effective approach to enhance clean coal quality, save energy, and protect the environment [1,2,3,4]. Based on Archimedes’ principle, clean coal products that meet quality standards can be selected by adjusting the density of the dense medium to achieve dense medium coal separation. As the most important indicator of clean coal quality, coal ash content must be precisely controlled within a reasonable range [5,6,7,8]. Accurate control not only ensures coal quality but also reduces raw coal loss, which is critical for clean, efficient, and green development of coal preparation plants. The dense medium separation process is a typical flow process [9,10,11]. Suspension density is set at the front of the process, and clean coal ash content is measured at the end. Because of the lag between real-time ash content detection and the corresponding suspension density setting, it is difficult for the measured ash content to guide the automatic adjustment of suspension density in real time, and the ash content cannot be reliably kept within a stable and reasonable range over the long term. In addition, coal ash content is influenced by numerous factors during production [12,13], such as raw coal quality and suspension density, which makes the time series of clean coal ash content nonlinear, non-stationary, and stochastic. The time series also often contains “noise” due to uncertain working conditions in the field and human error.

This study is based on a coal preparation plant located in Shanxi Province, China. In the plant, a pressurized three-product dense medium cyclone separation process is employed. Raw coal is fed into the cyclone via a feeder, and a dense medium suspension with a specified density is injected into the cyclone under pressure. The raw coal is separated into clean coal, medium coal, and gangue by centrifugal force within the cyclone. Then the clean coal undergoes dewatering and de-mediuming through an arc sieve and a de-medium screen, ultimately producing clean coal products that meet quality standards. The process flow is illustrated in Figure 1.

Like most coal preparation plants, the plant faces the inherent drawbacks of the dense medium separation process, and its current technical measures lag behind production needs. Actual production conditions in the plant show a delay of approximately 16 min between the measurement of dense medium clean coal ash content and the corresponding dense medium suspension density setting. During this 16 min delay, operators rely solely on personal experience to adjust the dense medium suspension density. However, unavoidable and occasional factors in actual production, such as sudden changes in raw coal quality or abnormal signals from density meters, can prevent workers from responding promptly to unexpected situations or misleading indicators in the production process. This can result in resource wastage, excessive consumption of dense medium (magnetite powder), and subsequent environmental pollution and ecological damage. The prolonged delay also hampers the timely and effective control of clean coal ash content within the target range, causing the dense medium separation system to remain in a state of significant dynamic adjustment. The instability of the production system inevitably leads to non-compliant product quality, thereby affecting the economic benefits of the enterprise.

Therefore, addressing the time lag in the clean coal ash content time series has become the primary goal in achieving automatic control of dense medium clean coal ash content. Predicting the dense medium clean coal ash content time series is a practical and effective method to resolve this issue. This study aims to break through current technical bottlenecks by proposing a robust time series prediction model that not only mitigates signal noise through empirical mode decomposition but also accurately captures temporal dynamics through LSTM modeling. Furthermore, an original evaluation metric, VVRD, is introduced to assess trend accuracy, aligning the prediction output with control system needs in real production scenarios.

The remainder of this paper is organized as follows: Section 2 reviews the literature on prediction methods for clean coal ash content and EMD-LSTM. Section 3 introduces the research methodology and process for the proposed problem, with detailed explanations of data processing and model construction. Section 4 presents the evaluation methods, analyzes the model results and computational performance, and optimizes the parameters of the optimal model. Finally, Section 5 provides the conclusions of this paper.

2. Literature Review

The prediction methods for dense medium clean coal ash content can primarily be divided into two categories: traditional prediction methods and data-driven prediction methods.

Traditional prediction methods primarily include empirical formulas, statistical regression models, and physical models, among others. These methods typically use known relationships or historical data for prediction and are suitable for cases of simple or linear relationships. Zhang et al. [14] utilized a multiple linear regression analysis method to establish a mathematical model for the separation density of the dense medium cyclone and clean coal ash content, which can quickly guide coal preparation production and improve separation efficiency. However, the accuracy of the model is somewhat limited because of many factors affecting separation, requiring further in-depth research. Sun et al. [15] developed a mathematical relationship prediction model based on phase space reconstruction between raw coal ash content, dense medium density, and clean coal ash content, with the prediction accuracy being influenced by the precision of raw coal ash content detection. Qiu et al. [16] proposed an online prediction method for clean coal ash content based on image analysis. The method establishes a polynomial regression model by extracting seven features from the grayscale histogram and the ash content, but the model has certain limitations, as predicted values are often slightly higher than actual measurements. This is mainly due to the changes in image features coming from both the coal quality itself and errors in image acquisition process. Chen et al. [17] proposed an improved hyperbolic tangent model that directly calculates the separation coefficient during the separation process of dense medium and water-medium coal, relatively simple to apply and able to predict the yield and ash content of coal washing products, thereby improving the production level of coal preparation plants to some extent.

In recent years, with the rapid development of technologies such as artificial intelligence and deep learning, large amounts of production data have been utilized by data-driven prediction methods for complex nonlinear analysis and modeling. Common techniques mainly include artificial neural networks (ANN), support vector machines (SVM), and deep learning models such as long short-term memory (LSTM) and recurrent neural networks (RNN). These methods are particularly well-suited for handling multivariate, nonlinear, and dynamically changing industrial data, improving the precision of prediction results. Zhang et al. [18] developed an online coal ash prediction system based on machine vision, demonstrating that the normalized SVM model outperforms radial basis function (RBF) neural networks and multiple linear regression (MLR) models in terms of fitting and generalization capabilities. Ali et al. [19] evaluated five machine learning and AI models (random forest, artificial neural network, adaptive neuro-fuzzy inference system, Mamdani fuzzy logic, and hybrid neuro-fuzzy inference system) to predict the flotation behavior of high-ash clean coal, with results showing that the Mamdani fuzzy logic model had the highest prediction precision. However, these models are only applicable to specific raw material characteristics and laboratory scales, requiring retraining if there is a significant change in raw materials. Legnaioli et al. [20] explored the feasibility of using Laser-Induced Breakdown Spectroscopy (LIBS) to analyze coal ash content and proposed a prediction method based on artificial neural networks, which can accurately predict coal ash content, controlling the error range within ±4% to meet industrial application requirements. Yin et al. [21] proposed a deep learning-based semi-supervised soft sensor modeling method for quality prediction in the coal washing process. The method uses a stacked autoencoder network to extract features and a bidirectional LSTM network to capture temporal dependencies. However, without a reasonable determination of the ratio between labeled and unlabeled data, the method cannot continuously improve prediction performance with a large amount of unlabeled data. Zhou et al. [22] employed an RNN-based method to predict product ash content in the coal washing process, noting that this method better captures time series features, thereby improving prediction accuracy. However, they also pointed out that the prediction error gradually increased as the prediction time step increased, so an appropriate time step must be selected based on the time difference between raw coal entering the system and appearing on the belt. Wang et al. [23] proposed a multi-step prediction framework based on time series alignment and a dual GRU model for predicting dense medium clean coal ash content. However, the method also has some limitations, such as information loss during the time series alignment process and limited prediction capability for factors like changes in the raw coal ash content.

In summary, an ash content prediction method based on empirical mode decomposition (EMD) and long short-term memory (LSTM) neural networks is proposed in this study. The EMD-LSTM ash prediction method effectively handles nonlinear, non-stationary, and random signals while reducing the impact of noise in the raw clean coal ash signal on prediction accuracy. It is also well-suited for processing and predicting delayed events caused by process operations, such as heavy-medium coal preparation ash time series. Zhang et al. [24] applied the EMD-LSTM algorithm to predict water quality indicators in urban drainage networks. The integrated EMD-LSTM model outperformed other integrated models using traditional preprocessing and data-driven algorithms, achieving the highest R² and the lowest RMSE in predicting key water quality indicators. This study provided a more accurate and sustainable approach to water quality monitoring in model-based urban drainage systems. Similarly, Ali et al. [25] combined improved EMD with LSTM deep learning to predict complex stock market data. The experimental results showed that this hybrid model is an effective method for predicting non-stationary and nonlinear financial time series. Although the application scenario of the dense medium coal preparation process differs from those of Zhang et al. and Ali et al., all of them involve predicting relevant parameters from non-stationary and nonlinear data. Therefore, the EMD-LSTM approach can also be applied to the dense medium coal preparation process.

The principal significance of this study lies in the following aspects: (i) the innovative application of the EMD signal processing method to processing of clean coal ash data in dense medium separation has achieved effective “noise reduction”; (ii) the Vector Value of the Radial Difference (VVRD) evaluation metric is proposed, which accurately assesses the prediction trends of clean coal ash content, ensuring stable operation of the dense medium separation automatic control system; (iii) accurate prediction of clean coal ash content will be achieved, solving a series of issues caused by the lag in the dense medium process in coal preparation plants.

3. Research Methodology and Procedure

3.1. Research Methodology

3.1.1. EMD

The empirical mode decomposition (EMD) is a novel adaptive time-frequency signal decomposition method proposed by Chinese American scientist Norden E. Huang in 1998 during his tenure at NASA [26]. Unlike traditional Fourier and wavelet decompositions, which rely on predefined harmonic basis functions and wavelet basis functions, respectively, EMD overcomes the limitation of Fourier transforms as it does not require any predetermined basis functions. Instead, EMD relies solely on the intrinsic time scale characteristics of time series signal for its decomposition. Because of the theoretical flexibility, EMD is broadly applicable to any type of signal, including the decomposition of time series of clean coal ash content, breaking them down into components on different time scales [27]. As a result, EMD performs significantly better than traditional time-frequency processing methods when handling the time scale division of clean coal ash content time series data.

EMD is capable of decomposing a complex signal into a finite number of intrinsic mode functions (IMFs) and a residual r(t). Each IMF component encapsulates local features of the original time series on a different time scale, while the residual r(t) often reflects the overall long-term trend of the original time series [28]. By analyzing the components obtained from the decomposition on different time scales, it is possible to identify the components in which the “noise” predominantly resides and which cause the original time series to exhibit nonlinear, non-stationary, and stochastic characteristics. By reconstructing the signal from selected IMFs and discarding those that contain noise, the EMD method can effectively reduce noise and thereby improve prediction accuracy.

Norden E. Huang proposed that any signal is composed of a finite number of IMFs superimposed on different time scales. The working principle of EMD is to identify all the oscillatory modes of intrinsic mode functions within the signal based on characteristic time scales of original signal, thus decomposing original signal into a set of IMFs.

3.1.2. LSTM

Long short-term memory (LSTM) network was first proposed in 1997 [29,30] as a specialized type of recurrent neural network (RNN) designed to address the issue of long-term dependencies, which is a common limitation in standard RNNs [31]. LSTM is developed as an enhancement of the RNN structure, incorporating a unique processing unit known as the “cell”. It functions as a processor that determines whether the information is useful and should be retained or discarded [32].

The “cell” in LSTM is a single-input, dual-output information processing unit [33]. When raw information enters the “cell”, it is processed and validated through the LSTM algorithm. If the information is deemed relevant, it will be passed through one output to the next unit in the sequence. If the information does not pass the validation, it will be discarded through the other output.

Due to its unique gate structure within the neural cell unit, LSTM is particularly well-suited for handling and predicting time series data with long delays caused by process flows [34,35], such as ash content time series in dense medium coal separation.

3.2. Framework of Prediction Methodology

The framework for predicting ash content time series in dense medium clean coal based on the EMD-LSTM method is illustrated in Figure 2.

First of all, time series signal x(t) of original clean coal ash content is decomposed according to EMD to obtain a finite number of IMF components and a residual r(t) [36]. Next, each IMF component and the residual r(t) are predicted based on LSTM neural network, resulting in corresponding predicted time series P_i (i = 1, 2, 3, …, n, r). Moreover, a selective reconstruction is performed combined with analysis results of components decomposed by EMD. Components containing “noise” are discarded for noise reduction. Finally, the reconstructed model undergoes optimal selection and parameter optimization, leading to development of the most accurate prediction model.

3.3. Construction Process of Prediction Methodology

3.3.1. EMD of Time Series Signal

The EMD for time series signals of original clean coal ash content involves the following seven steps:

Step 1: Identify the local maximum and minimum points of time series signal x(t), and then draw the upper envelope e_max(t) and lower envelope e_min(t), as illustrated in Figure 3.

Step 2: Calculate the average of the upper envelope e_max(t) and the lower envelope e_min(t) to obtain the mean envelope m₁(t) of time series [37], as shown in Figure 4. The formula is as follows:

m_{1}(t) = [e_{max}(t) + e_{min}(t)] / 2

(1)

Step 3: Subtract the mean envelope m₁(t) from time series signal x(t) of original clean coal ash content to obtain a new time series x₁(t) which removes low-frequency components. This process yields a signal that contains only high-frequency components, as shown in Figure 5. The formula is as follows:

x_{1} (t) = x (t) - m_{1} (t)

(2)

Step 4: Determine whether x₁(t) satisfies two criteria for IMF components [38]:

(i): The number of extrema (sum of the number of maxima and minima) must be equal to or differ by at most one from the number of zero crossings.
(ii): The mean of envelopes defined by the local maxima and minima should be zero at any point of the IMF.

If x₁(t) does not meet these criteria, it should be used as raw data for the next iteration, and Step 1 through Step 3 should be repeated until the criteria are met. If x₁(t) satisfies the criteria, set c₁(t) = x₁(t), where c₁(t) is the first IMF component, as shown in Figure 6. Then, subtract c₁(t) from the original time series signal x(t) to obtain the residual component r₁(t), which is the original signal with the high-frequency component c₁(t) removed. The formula is as follows:

r_{1} (t) = x (t) - c_{1} (t)

(3)

Step 5: Use r₁(t) as the new time series signal of clean coal ash content and repeat the previous steps until the nth IMF component c_n(t) and the corresponding residual r_n(t) are obtained from x(t).

Step 6: Determine whether r_n(t) satisfies given termination condition for EMD (which usually requires r_n(t) to become a monotonic function). If the condition is met, the decomposition process ends. Otherwise, the process continues.

Step 7: Through series of decompositions described above, the process ultimately yields n IMF components and a residual r(t). The final representation of original signal x(t) is given by the following formula:

x (t) = \sum_{i = 1}^{n} c_{i} (t) + r (t)

(4)
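The seven steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: the envelopes are piecewise linear rather than spline-based, the signal end points are added as envelope knots, and a fixed number of sifting passes (`n_sift`, a hypothetical parameter) stands in for the IMF criteria check of Step 4. The test signal is synthetic.

```python
import math

def local_extrema(x):
    """Return the indices of strict local maxima and minima (Step 1)."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through x at the indices in idx.

    Assumption: linear interpolation with the signal end points as extra
    knots replaces the spline envelope commonly used in EMD.
    """
    knots = sorted(set([0] + idx + [len(x) - 1]))
    env, k = [], 0
    for t in range(len(x)):
        while k + 1 < len(knots) - 1 and t > knots[k + 1]:
            k += 1
        a, b = knots[k], knots[k + 1]
        w = (t - a) / (b - a)
        env.append((1 - w) * x[a] + w * x[b])
    return env

def sift_once(x):
    """Steps 1-3: subtract the mean envelope from the signal."""
    maxima, minima = local_extrema(x)
    upper, lower = envelope(x, maxima), envelope(x, minima)
    mean_env = [(u + l) / 2 for u, l in zip(upper, lower)]   # Eq. (1)
    return [xi - mi for xi, mi in zip(x, mean_env)]           # Eq. (2)

def emd(x, max_imfs=6, n_sift=10):
    """Return (imfs, residual) such that x = sum(imfs) + residual, Eq. (4)."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 2:   # Step 6: residual has almost no oscillation left
            break
        h = residual
        for _ in range(n_sift):             # assumption: fixed pass count replaces Step 4
            h = sift_once(h)
        imfs.append(h)
        residual = [r - c for r, c in zip(residual, h)]       # Eq. (3)
    return imfs, residual

# Synthetic signal: fast oscillation + slow oscillation + linear trend.
signal = [math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 40) + 0.01 * t
          for t in range(200)]
imfs, residual = emd(signal)
rebuilt = [sum(c[t] for c in imfs) + residual[t] for t in range(len(signal))]
```

Whatever envelope method is used, the reconstruction identity of Formula (4) holds by construction, because each residual is obtained by subtracting the extracted component from the previous residual.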

3.3.2. LSTM Prediction for Each Component

Each IMF component and the residual r(t) of time series signals are individually predicted by LSTM networks. Based on requirements for prediction accuracy and duration in actual coal washing ash content process, the absolute difference between actual ash content and predicted ash content is set to 0.5%, with a prediction duration of 16 min. Predicted values for each component are denoted as P₁, P₂, P₃, …, P_n, P_r.

As shown in Figure 7, an LSTM cell unit includes three gate structures used to protect and control the cell state. The three gates are forget gate, input gate, and output gate [39,40].

Forget Gate determines which information from the previous cell state should be discarded. Its inputs are x_t and h_{t−1}, and its output is f_t. For each value in the previous cell state C_{t−1}, f_t contains a corresponding value between 0 and 1, where 0 indicates complete forgetting and 1 indicates complete retention.

Input Gate decides the information to be updated in the cell state. It outputs the updated information i_t, with a new candidate value vector Ĉ_t created by the tanh layer.

The forget gate and input gate determine the deletion and addition of transmitted information, updating the cell state to obtain the new cell state C_t [41].

Output Gate determines which information from the cell state should be output; its result is denoted as o_t. The tanh layer maps the new cell state C_t to a value between −1 and 1, which is then multiplied by o_t to obtain the output h_t.

Formulas (5) to (10) outline the detailed process of cell state transition. Here, W_n (n = f, i, C, o) represents the weights of the corresponding gate, b_n (n = f, i, C, o) represents the biases, σ is the sigmoid function, and [h_{t−1}, x_t] is the concatenation of the previous output and the current input. The operator · denotes matrix multiplication, while * and + denote element-wise multiplication and addition, respectively [42].

Forget Gate : f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(5)

Input Gate : i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(6)

Candidate Information : \hat{C}_{t} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})

(7)

Cell State : C_{t} = f_{t} * C_{t - 1} + i_{t} * {\hat{C}}_{t}

(8)

Output Gate : o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})

(9)

Output : h_{t} = o_{t} * \tanh (C_{t})

(10)
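Formulas (5) to (10) translate almost line by line into code. The following is a minimal sketch of one cell update in plain Python, assuming that each gate's weight matrix acts on the concatenation [h_{t−1}, x_t]; the function names and the `params` dictionary layout are illustrative choices, not part of the source.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W · v + b for a matrix W (list of rows), vector v and bias b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, C_prev, params):
    """One LSTM cell update following Formulas (5)-(10).

    params maps each of "f", "i", "C", "o" to a (W, b) pair, where W has
    shape (hidden, hidden + input).
    """
    z = h_prev + x_t                                           # [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(*params["f"][:1], z, params["f"][1])]    # (5)
    i_t = [sigmoid(a) for a in affine(params["i"][0], z, params["i"][1])]      # (6)
    C_hat = [math.tanh(a) for a in affine(params["C"][0], z, params["C"][1])]  # (7)
    C_t = [f * c + i * g for f, c, i, g in zip(f_t, C_prev, i_t, C_hat)]       # (8)
    o_t = [sigmoid(a) for a in affine(params["o"][0], z, params["o"][1])]      # (9)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]                         # (10)
    return h_t, C_t

# Toy check with all-zero weights: every gate outputs 0.5 and the candidate is 0,
# so the new cell state is half of the previous one.
zero = ([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0])
params = {"f": zero, "i": zero, "C": zero, "o": zero}
h_t, C_t = lstm_step([0.3], [0.1, -0.2], [1.0, -1.0], params)
```

With zero weights the forget gate is exactly 0.5, which makes the role of Formula (8) easy to see: the previous cell state is scaled by f_t before the gated candidate is added.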

3.3.3. Selective Reconstruction of Each Prediction Component

To remove “noise” from time series signal x(t) of original clean coal ash content, it is necessary to reconstruct P₁, P₂, P₃, …, P_n, P_r. The specific steps are as follows.

Step 1: Divide P_i(i = 1, 2, 3, …, n, r) into high-frequency, middle-frequency, and low-frequency components based on their frequency distribution.

Step 2: Define

S_{k} (t) = \sum_{i = j}^{k} P_{i} (t) + P_{r} (t), j, k \in (1, 2, \dots, n), j \leq k .

(11)

Step 3: Compare prediction signals S_k(t) with time series signal x(t) and then determine the optimal prediction model.
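The reconstruction in Formula (11) amounts to summing the retained predicted components and the predicted residual. The sketch below is a hedged illustration: the `drop` argument is a hypothetical generalization that removes arbitrary IMF indices, whereas Formula (11) as written sums a contiguous range from j to k. The numbers are toy values, not study data.

```python
def reconstruct(pred_imfs, pred_residual, drop=()):
    """Sum predicted IMF components and the predicted residual.

    pred_imfs     : list of predicted series P_1 ... P_n
    pred_residual : predicted residual series P_r
    drop          : 1-based IMF indices to discard, e.g. (1,) removes IMF1
    """
    kept = [p for idx, p in enumerate(pred_imfs, start=1) if idx not in drop]
    return [sum(p[t] for p in kept) + pred_residual[t]
            for t in range(len(pred_residual))]

# Toy predicted components (illustrative values only).
pred_imfs = [[0.1, -0.1], [0.2, 0.0], [0.0, 0.3]]
pred_residual = [10.0, 10.5]

full = reconstruct(pred_imfs, pred_residual)                 # all components kept
without_imf1 = reconstruct(pred_imfs, pred_residual, (1,))   # j = 2, k = n in Formula (11)
```

Each candidate reconstruction can then be compared with x(t), as in Step 3, to choose which components to keep.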

3.4. Structural Diagram of Prediction Methodology

The prediction process for the time series of dense medium clean coal ash content based on EMD-LSTM is illustrated in Figure 8. The flowchart on the left side depicts the EMD process, while the flowchart on the right side shows the LSTM prediction process. For the LSTM prediction, a set of dense medium clean coal ash content data from a coal preparation plant is used, covering a continuous 220 min under automatic control (one sample per minute). After normalization, the data are divided into training and testing sets of 92.73% and 7.27% of the data (204 and 16 samples), respectively, so that the testing set matches the 16 min prediction period.
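The split percentages follow directly from the sample counts: 220 one-minute samples with the last 16 held out give 204/220 ≈ 92.73% and 16/220 ≈ 7.27%. The sketch below checks this arithmetic on a synthetic series. Min-max scaling is an assumption here, since the text does not state which normalization is used, and the helper names are illustrative.

```python
import math

def min_max_normalize(values):
    """Scale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def split_series(values, horizon):
    """Keep the last `horizon` samples as the held-out set."""
    return values[:-horizon], values[-horizon:]

# Synthetic stand-in for 220 one-minute ash readings (not plant data).
series = [10.0 + 0.3 * math.sin(t / 7.0) for t in range(220)]
normalized = min_max_normalize(series)
head, tail = split_series(normalized, horizon=16)

head_share = round(100 * len(head) / len(series), 2)   # share of the first 204 samples
tail_share = round(100 * len(tail) / len(series), 2)   # share of the last 16 samples
```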

4. Results and Analysis

4.1. Evaluation Metrics

To quantify prediction accuracy, the following metrics were employed:

(1) Error: The raw deviation between predicted (p_{i}) and observed (x_{i}) values:

E_{i} = p_{i} - x_{i}, i \in (1, 2, \dots, m)

(12)

where m is the sample size.

(2) Root Mean Squared Error (RMSE): Penalizes larger errors due to squaring, reflecting overall model fit:

RMSE = \sqrt{(\sum_{i = 1}^{m} E_{i}^{2}) / m}

(13)

(3) Absolute Error (AE): The absolute value of the error, used to measure the accuracy of measurement results without considering the direction of the error (positive or negative):

AE_{i} = |E_{i}|

(14)

(4) Mean Absolute Error (MAE): Robust to outliers, representing average absolute deviation:

MAE = (\sum_{i = 1}^{m} AE_{i}) / m

(15)

(5) Mean Absolute Percentage Error (MAPE): Normalizes errors by observed values for scale-independent assessment:

MAPE = \frac{1}{m} \sum_{i = 1}^{m} |\frac{E_{i}}{x_{i}}| \times 100\%

(16)

(6) Vector Value of the Radial Difference (VVRD): An evaluation metric defined specifically for this study, used to assess the consistency between the trend of predicted values and the trend of actual observed values. It has a directional component, where a positive direction indicates that the trend of predicted values is consistent with actual observed values, and a negative direction indicates that the trend of predicted values is opposite to actual observed values. The formula is as follows

VVRD = \frac{(\hat{p}_{i + 3} - \hat{p}_{i}) / (\hat{x}_{i + 3} - \hat{x}_{i})}{|(\hat{p}_{i + 3} - \hat{p}_{i}) / (\hat{x}_{i + 3} - \hat{x}_{i})|} \times (\arctan \frac{\hat{p}_{i + 3} - \hat{p}_{i}}{3} - \arctan \frac{\hat{x}_{i + 3} - \hat{x}_{i}}{3}), i \in (1, 2, \dots, m - 3)

(17)

where \hat{p}_{i} and \hat{x}_{i} are the normalized values of p_{i} and x_{i}, respectively.

(7) Number of the Same Direction (NSD): An evaluation metric defined specifically for this study, used to count the number of instances where the trend direction of predicted values matches the trend direction of actual observed values.
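The metrics above can be written compactly in Python. This is a sketch under stated assumptions: the inputs to `vvrd` and `nsd` are taken to be already normalized, windows in which either difference is zero (where the sign factor of Formula (17) is undefined) yield `None`, and NSD is counted as the number of windows in which the predicted and observed changes share a sign. The last point is one reading of the NSD definition, not a formula given in the text.

```python
import math

def errors(pred, obs):
    return [p - x for p, x in zip(pred, obs)]                          # Eq. (12)

def rmse(pred, obs):
    e = errors(pred, obs)
    return math.sqrt(sum(v * v for v in e) / len(e))                   # Eq. (13)

def mae(pred, obs):
    e = errors(pred, obs)
    return sum(abs(v) for v in e) / len(e)                             # Eqs. (14)-(15)

def mape(pred, obs):
    e = errors(pred, obs)
    return 100.0 * sum(abs(v / x) for v, x in zip(e, obs)) / len(e)    # Eq. (16)

def vvrd(pred_norm, obs_norm, step=3):
    """Eq. (17) for each window; None where the sign factor is undefined."""
    out = []
    for i in range(len(obs_norm) - step):
        dp = pred_norm[i + step] - pred_norm[i]
        dx = obs_norm[i + step] - obs_norm[i]
        if dp == 0 or dx == 0:
            out.append(None)
            continue
        sign = 1.0 if dp / dx > 0 else -1.0
        out.append(sign * (math.atan(dp / step) - math.atan(dx / step)))
    return out

def nsd(pred_norm, obs_norm, step=3):
    """Count windows where predicted and observed changes share a sign (assumed reading)."""
    return sum(
        1 for i in range(len(obs_norm) - step)
        if (pred_norm[i + step] - pred_norm[i]) * (obs_norm[i + step] - obs_norm[i]) > 0
    )

# Toy values (illustrative only).
pred = [1.0, 3.0, 1.0, 3.0]
obs = [2.0, 2.0, 2.0, 2.0]
toy_rmse, toy_mae, toy_mape = rmse(pred, obs), mae(pred, obs), mape(pred, obs)
```

For a test window of m = 16 points, Formula (17) produces m − 3 = 13 values, so NSD under this reading can be at most 13.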

4.2. EMD Results

The decomposition results of the time series signal are shown in Figure 9.

Original data is decomposed into six IMF components and one residual r(t). Among these, IMF1 and IMF2 are high-frequency components, IMF3 is a middle-frequency component, IMF4, IMF5, and IMF6 are low-frequency components, and the residual r(t) indicates overall trend of ash content variation. Analysis of components shows that the “noise” in original data mainly comes from middle- and high-frequency components, while the variation trend of ash content mainly depends on low-frequency components. Frequency thresholds are set as shown in Table 1.

4.3. LSTM Prediction of Time Series Component

4.3.1. Experimental Setup

Based on EMD results, a total of six reconstruction models are selected in this study:

LSTM: LSTM prediction directly on original data;

EMD-LSTM: LSTM prediction after EMD of original data;

EMD-imf1-LSTM: LSTM prediction after EMD with IMF1 component removed;

EMD-imf2-LSTM: LSTM prediction after EMD with IMF2 component removed;

EMD-imf3-LSTM: LSTM prediction after EMD with IMF3 component removed;

EMD-imf1+2-LSTM: LSTM prediction after EMD with IMF1 and IMF2 components removed.

Based on accuracy requirements for predicting coal ash content in actual production process control, a tolerance of −0.25 ≤ E ≤ 0.25 is set as acceptable error.

Three candidate values are predefined for each of the initial learning rate, the number of hidden layers, and the number of training cycles of the LSTM neural network, giving 3 × 3 × 3 = 27 parameter combinations and 27 corresponding RMSE values. The results are shown in Figure 10, where (a) to (f) correspond to LSTM, EMD-LSTM, EMD-imf1-LSTM, EMD-imf2-LSTM, EMD-imf3-LSTM, and EMD-imf1+2-LSTM, respectively. The darker the color of a line in the figure, the smaller the RMSE value. The three parameter values that the darkest line passes through form the parameter combination with the lowest RMSE for that model.
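Enumerating the 27 combinations is a plain grid search. The sketch below shows only the enumeration and selection logic. The candidate values are hypothetical, since the text does not list them, and `evaluate_rmse` is a placeholder objective standing in for training an LSTM and scoring it, so its outputs are not results from the study.

```python
from itertools import product

# Hypothetical candidate values; the source does not list the values tried.
learning_rates = [0.005, 0.010, 0.015]
hidden_sizes = [80, 100, 120]
epoch_counts = [100, 150, 200]

def evaluate_rmse(lr, hidden, epochs):
    """Placeholder objective; a real run would train an LSTM and return its RMSE."""
    return abs(lr - 0.015) + abs(hidden - 120) / 1000 + abs(epochs - 200) / 1000

grid = list(product(learning_rates, hidden_sizes, epoch_counts))   # 3 x 3 x 3 combinations
scores = {combo: evaluate_rmse(*combo) for combo in grid}
best = min(scores, key=scores.get)                                 # lowest score wins
```

An exhaustive grid like this is cheap for 27 combinations; a search algorithm becomes useful when the parameter ranges are treated as continuous.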

4.3.2. Prediction Results

For each reconstruction model, the combination of initial learning rate, number of hidden layers, and number of training cycles that yields the minimum RMSE value is selected, and the prediction is then performed again.

(1): Results of the error metric

Prediction results and their errors are shown in Figure 11. On the left is a visual comparison between the predicted values and the original values, and on the right is the error between the predicted values and the original values for each model. From the figure, it is evident that the “noise” in original data is mainly present in the IMF1 component. The prediction curve of the EMD-imf1-LSTM model is the smoothest and best fits the original curve, with the error also controlled within acceptable range of ±0.25.

In contrast, while the errors of four prediction models (LSTM, EMD-LSTM, EMD-imf2-LSTM, and EMD-imf3-LSTM) are within the acceptable range, they fluctuate more strongly, which is detrimental to their use as input for a closed-loop automatic control system of heavy-medium clean coal ash content. In addition, the prediction results of the EMD-imf1+2-LSTM model, although within the acceptable error range and relatively smooth, do not fit the original curve well and fail to reflect the variation trend effectively.

(2): Results of the VVRD metric

To determine whether a prediction is accurate, a comprehensive evaluation can be made from two aspects: trend development and the size of the error. If the direction of the trend development can be accurately predicted (i.e., in the same or opposite direction as the original curve) while ensuring that the error from the original value is as close to 0 as possible, then we can consider the prediction to be relatively accurate. As shown in Figure 12, the horizontal axis represents VVRD, and the vertical axis represents AE, with y = VVRD × AE. Blue indicates that trend directions are the same, and the darker the color, the stronger the consistency. Red indicates that trend directions are opposite, and the darker the color, the weaker the consistency.

As shown in Figure 13, the radial coordinate (r) represents the AE value, and the angular coordinate (θ) represents the VVRD. The closer the point is to “0π”, the smaller the angular difference between the predicted direction and the true direction (blue is the same trend, red is the reverse trend), and the closer it is to the center of the circle, the smaller the error. It is evident from Figure 13 that the EMD-LSTM and EMD-imf1-LSTM models exhibit better trend consistency in their prediction results, with each having eleven prediction points in the same trend direction.

(3): Comprehensive results

Figure 14 summarizes the four evaluation metrics for the six models. The EMD-imf1+2-LSTM model performs best on RMSE, followed by the EMD-imf1-LSTM model. The same ranking holds for MAE. On MAPE, the EMD-imf1-LSTM model performs best, followed by the EMD-imf1+2-LSTM model. On NSD, the EMD-imf1-LSTM and EMD-LSTM models share the best result.
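For reference, the three error metrics compared here can be computed as in the sketch below, which uses their standard textbook definitions (MAPE expressed in percent). The sample values are hypothetical and are not taken from the plant data.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def mae(actual, predicted):
    """Mean absolute error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(predicted - actual)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no zero actual values."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((predicted - actual) / actual)) * 100.0)

# Hypothetical values for illustration.
actual = [10.0, 11.0, 12.0]
predicted = [10.1, 10.9, 12.2]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason the rankings of two models can differ between the two metrics.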

4.4. Analysis of Model Selection

Once the error and RMSE metrics meet the required standards, the accuracy of the predicted trend largely determines whether the input to the closed-loop automatic control system for dense medium clean coal ash content is correct. Feeding in a trend with the opposite direction could trigger adjustments in the wrong direction, which would increase data fluctuations, reduce system stability and production efficiency, and waste resources.

The EMD-imf1-LSTM model ranks first or second on all four metrics (RMSE, MAE, MAPE, and NSD) and shares the best result on trend consistency, the most critical aspect for control. Therefore, based on this comprehensive analysis, the EMD-imf1-LSTM model is identified as the optimal model choice.

4.5. Parameter Optimization for the EMD-imf1-LSTM Model

In this study, the sparrow search algorithm (SSA) [43] is used to optimize three parameters of the EMD-imf1-LSTM model: the initial learning rate, the number of hidden layers, and the number of training epochs. The initial parameter combination, which corresponds to the minimum RMSE value of the EMD-imf1-LSTM model, is 0.015-120-200 (listed in the same order). After optimization, the parameter combination becomes 0.013-92-110. The prediction results of the optimized EMD-imf1-LSTM-SSA model are shown in Figure 15, its trend performance in Figure 16, and its evaluation metrics in Table 2.
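To make the optimization step more concrete, the sketch below shows a simplified sparrow-search loop over three hyperparameters. It is an illustration under stated assumptions, not the authors' implementation: the search ranges in `LOWER` and `UPPER`, the population settings, and the `stand_in_rmse` objective are all hypothetical. A real run would replace the objective with a function that trains the LSTM using `decode(unit)` and returns its validation RMSE.

```python
import numpy as np

# Hypothetical search ranges (not from the paper): learning rate, hidden size, epochs.
LOWER = np.array([0.001, 16.0, 50.0])
UPPER = np.array([0.05, 256.0, 300.0])
TARGET = np.array([0.3, 0.4, 0.25])  # arbitrary optimum of the placeholder objective

def decode(unit):
    """Map a point in the unit cube to (learning_rate, hidden_size, epochs)."""
    raw = LOWER + np.clip(unit, 0.0, 1.0) * (UPPER - LOWER)
    return float(raw[0]), int(round(raw[1])), int(round(raw[2]))

def stand_in_rmse(unit):
    """Placeholder objective: a smooth bowl that only exercises the search loop."""
    return float(np.sum((np.asarray(unit) - TARGET) ** 2))

def sparrow_search(objective, dim, n_sparrows=20, n_iter=40,
                   producer_frac=0.2, scout_frac=0.1, safety=0.8, seed=0):
    """Simplified sparrow search minimizing `objective` over the unit cube."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_sparrows, dim))
    fit = np.array([objective(p) for p in pos])
    best_pos, best_fit = pos[fit.argmin()].copy(), float(fit.min())
    history = [best_fit]
    n_prod = max(1, int(producer_frac * n_sparrows))
    n_scout = max(1, int(scout_frac * n_sparrows))

    for _ in range(n_iter):
        order = np.argsort(fit)                  # best sparrow first
        pos, fit = pos[order], fit[order]
        worst = pos[-1].copy()

        # Producers: contract while no alarm is raised, otherwise take a random step.
        alarm = rng.random()
        for i in range(n_prod):
            if alarm < safety:
                pos[i] = pos[i] * np.exp(-(i + 1) / (rng.random() * n_iter + 1e-12))
            else:
                pos[i] = pos[i] + rng.normal()
        lead = pos[0].copy()

        # Scroungers: the worse half relocates, the rest move around the lead producer.
        for i in range(n_prod, n_sparrows):
            if i + 1 > n_sparrows / 2:
                pos[i] = rng.normal() * np.exp((worst - pos[i]) / (i + 1) ** 2)
            else:
                signs = rng.choice([-1.0, 1.0], size=dim)
                pos[i] = lead + np.dot(np.abs(pos[i] - lead), signs) / dim

        # Scouts: a random subset reacts to danger.
        for i in rng.choice(n_sparrows, size=n_scout, replace=False):
            if fit[i] > best_fit:
                pos[i] = best_pos + rng.normal() * np.abs(pos[i] - best_pos)
            else:
                # fit is sorted ascending, so this gap is strictly negative (never zero).
                gap = fit[i] - fit[-1] - 1e-12
                pos[i] = pos[i] + rng.uniform(-1.0, 1.0) * np.abs(pos[i] - worst) / gap

        pos = np.clip(pos, 0.0, 1.0)             # keep every sparrow inside the bounds
        fit = np.array([objective(p) for p in pos])
        if fit.min() < best_fit:
            best_pos, best_fit = pos[fit.argmin()].copy(), float(fit.min())
        history.append(best_fit)                 # best-so-far value per iteration
    return best_pos, best_fit, history

best_unit, best_fit, history = sparrow_search(stand_in_rmse, dim=3)
print(decode(best_unit), best_fit)
```

Searching in the unit cube and decoding afterwards keeps the three parameters on a comparable scale, so a step that is reasonable for the epoch count does not overwhelm the learning rate. Because every objective evaluation in a real run means training a network, the population size and iteration count would need to be chosen with the training cost in mind.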

As Figure 15 and Figure 16 show, the prediction errors of the EMD-imf1-LSTM-SSA model stay strictly within the predefined range. In addition, the curve is smoother and the trend consistency is better than that of the EMD-imf1-LSTM model, which meets the expected outcome.

The data in Table 2 show that the EMD-imf1-LSTM-SSA model outperforms the EMD-imf1-LSTM model on all evaluation metrics.

Experimental results indicate that the EMD-imf1-LSTM-SSA model provides the best overall performance in terms of evaluation metrics, demonstrating strong predictive capability for dense medium clean coal ash content. It meets the requirements for prediction accuracy, prediction duration, and trend development in the process control of coal ash content in actual production.

5. Conclusions

In recent years, with the development and widespread adoption of artificial intelligence and growing awareness of resource and environmental protection, the need for clean, efficient, and green development in the coal processing sector has become increasingly urgent. Beyond addressing the root causes, innovative technological approaches are needed to make coal separation intelligent and sustainable. This research targets practical production-site applications: it predicts clean coal ash content accurately to address challenges in the dense medium separation process.

The major contributions of this study are threefold: (i) a novel integration of EMD and LSTM that effectively denoises and predicts nonlinear, non-stationary time series of coal ash content; (ii) the introduction of the VVRD metric to accurately evaluate trend consistency, which is more aligned with the control requirements in industrial systems; and (iii) application of the sparrow search algorithm (SSA) for hyperparameter optimization, further improving model prediction performance and robustness.

Comparative results of the various models indicate that EMD followed by LSTM prediction effectively removes the “noise” present in the dense medium clean coal ash content data without affecting the overall trend of the original data, which significantly improves prediction accuracy. The EMD-imf1-LSTM model is the best choice for the data in this study. For other datasets, the same procedure applies: identify which component contains the “noise”, remove it, and then optimize the parameters so that the model achieves optimal performance.

This research provides new insights into automatic control systems for ash content in dense medium separation and offers technical support for clean and efficient coal resource production. It also offers practical guidance for the development of intelligent separation and, more broadly, intelligent mining. This study focuses on constructing optimized models from limited data samples. Future work will incorporate larger, more detailed, and more diverse datasets to reflect the dynamic characteristics of coal preparation accurately and to support the realization of green mining.

Author Contributions

K.C.: Writing – original draft, Resources, Methodology, Conceptualization. X.Z.: Supervision, Conceptualization. K.Z.: Writing – review and editing, Supervision. C.Z.: Software. J.L.: Funding acquisition, Writing – review and editing, Formal analysis. C.Y.: Funding acquisition, Writing – review and editing, Supervision. Y.G.: Methodology. R.W.: Resources, Data curation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China (2023YFC2907305), the National Natural Science Foundation of China (NSFC) (No. 52204167), the Hunan Provincial Natural Science Foundation of China (2024JJ6504), and the Graduate University-Enterprise Joint Innovation Program of Central South University (2022XQLH106).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Zhang, B.; Zhu, G.Q.; Lv, B.; Yan, G.H. A novel and effective method for coal slime reduction of thermal coal processing. J. Clean. Prod. 2018, 198, 19–23.
2. Lv, W.; Wang, C. Intelligent control of heavy media separation. Int. J. Glob. Energy Issues 2023, 45, 86–100.
3. Tao, N.J.; Zheng, J.H.; Li, Z.Y. A General review of the course of development of China’s coal preparation technologies over the past five decades. Coal Prep. Technol. 2023, 51, 4047. (In Chinese)
4. Li, B.; Yu, X.; Zhang, K.; Yuan, Y.; Dou, N. Experimental study on ash reduction by classification flotation. Int. J. Coal Prep. Util. 2024, 4, 479–499.
5. Wang, G.H.; Kuang, Y.L.; Wang, Z.G.; Wang, Y.; Ji, L. A Real-Time Prediction Model for Production Index in Process of Dense-Medium Separation. Int. J. Coal Prep. Util. 2012, 32, 298–309.
6. Sriramoju, S.K.; Singh, R.; Sengupta, M.; Akhter, S.; Dash, P.S. Selective screening of coal to improve the washability characteristics at different levels of size reduction. Int. J. Coal Prep. Util. 2021, 42, 3070–3089.
7. Zhang, K.H.; Wang, W.D.; Lv, Z.Q.; Feng, J.D.; Li, H.X.; Zhang, C.L. LKDPNet: Large-Kernel Depthwise-Pointwise convolution neural network in estimating coal ash content via data augmentation. Appl. Soft Comput. 2023, 144, 110471.
8. Lu, F.C.; Liu, H.Z.; Lv, W.B. Deep correlation and precise prediction between static features of froth images and clean coal ash content in coal flotation: An investigation based on deep learning and maximum likelihood estimation. Measurement 2024, 224, 113843.
9. Cierpisz, S.; Heyduk, A. A simulation study of coal blending control using a fuzzy logic ash monitor. Control Eng. Pract. 2002, 10, 449–456.
10. Zhang, L.; Xia, X.; Zhu, B. A Dual-Loop Control System for Dense Medium Coal Washing Processes With Sampled and Delayed Measurements. IEEE Trans. Control Syst. Technol. 2017, 25, 2211–2218.
11. Theerayut, P.; Palot, S.; Chinawich, K.; Natatsawas, S.; Somthida, S.; Nutthakarn, P.; Onchanok, J.; Kreangkrai, M.; Apisit, N.; Ilhwan, P.; et al. Conventional and recent advances in gravity separation technologies for coal cleaning: A systematic and critical review. Heliyon 2023, 9, e13083.
12. Ma, L.C.; Wei, L.B.; Zhu, X.S.; Pei, X.Y.; Zhou, Q.F.; Liu, J.L. Response surface method for modeling of fine coal beneficiation by Knelson concentrator. Int. J. Coal Prep. Util. 2018, 41, 776–788.
13. Cui, Y.; Zhang, K.H.; Lv, Z.Q.; Li, H.X.; Song, S.; Zhang, C.L.; Wang, W.D.; Xu, Z.Q. Exploring the effect of various factors for ash content estimation via ensemble learning: Color-texture features, particle size, and magnification. Miner. Eng. 2023, 201, 108212.
14. Zhang, J.L.; Kang, W.Z.; Zhang, X.G.; Fan, C.J.; Lou, Y.Q. Discussion on ash prediction of cleaned coal in Taoshan Coal preparation Plant. Clean Coal Technol. 2007, 04, 15–17. (In Chinese)
15. Sun, X.L.; Cao, Z.G.; Yue, Y.H.; Kuang, Y.L.; Zhou, C.X. Online prediction of dense medium suspension density based on phase space reconstruction. Part. Sci. Technol. 2017, 36, 989–998.
16. Qiu, Z.Y.; Dou, D.Y.; Zhou, D.Y.; Yang, J.G. On-line prediction of clean coal ash content based on image analysis. Measurement 2021, 173, 108663.
17. Chen, P.; Wang, C.Y.; Wang, S.W.; Zhang, C.H.; Li, Z.W. Prediction of Cleaned Coal Yield and Partition Coefficient in Coal Gravity Separation Based on the Modified Hyperbolic Tangent Model. Min. Metall. Explor. 2022, 39, 2491–2502.
18. Zhang, Z.L.; Yang, J.G. Online Analysis of Coal Ash Content on a Moving Conveyor Belt by Machine Vision. Int. J. Coal Prep. Util. 2016, 37, 100–111.
19. Ali, D.; Hayat, M.B.; Lana, A.; Molatlhegi, O.K. An evaluation of machine learning and artificial intelligence models for predicting the flotation behavior of fine high-ash coal. Adv. Powder Technol. 2018, 29, 3493–3506.
20. Legnaioli, S.; Campanella, B.; Pagnotta, S.; Poggialini, F.; Palleschi, V. Determination of Ash Content of coal by Laser-Induced Breakdown Spectroscopy. Spectrochim. Acta Part B At. Spectrosc. 2019, 155, 23–126.
21. Yin, X.H.; Niu, Z.W.; He, Z.; Li, Z.J.; Lee, D.H. Ensemble deep learning based semi-supervised soft sensor modeling method and its application on quality prediction for coal preparation process. Adv. Eng. Inform. 2020, 46, 101136.
22. Zhou, C.X.; Sun, X.L.; Shen, Y.S.; Yue, Y.H.; Jing, M.Y.; Liang, W.N.; Zhang, H. Product quality prediction in dense medium coal preparation process based on recurrent neural network. Int. J. Coal Prep. Util. 2023, 44, 291–308.
23. Wang, J.; Wang, R.F.; Fu, X.; Wei, K.; Han, J.; Zhang, Q. Research on Dual GRU model driving by time series alignment for multi-step prediction of ash content of dense medium clean coal. Int. J. Coal Prep. Util. 2024, 44, 2200–2224.
24. Zhang, Y.T.; Li, C.L.; Jiang, Y.Q.; Sun, L.; Zhao, R.B.; Yan, K.F.; Wang, W.H. Accurate prediction of water quality in urban drainage network with integrated EMD-LSTM model. J. Clean. Prod. 2022, 354, 131724.
25. Ali, M.; Khan, D.M.; Alshanbari, H.M.; El-Bagoury, A.A. Prediction of Complex Stock Market Data Using an Improved Hybrid EMD-LSTM Model. Appl. Sci. 2023, 13, 1429.
26. Ghezaiel, W.; Ben Slimane, A.; Ben Braiek, E. Nonlinear multi-scale decomposition by EMD for Co-Channel speaker identification. Multimed. Tools Appl. 2017, 76, 20973–20988.
27. Xiong, Z.H.; Yao, J.J.; Huang, Y.M.; Yu, Z.X.; Liu, Y.L. A wind speed forecasting method based on EMD-MGM with switching QR loss function and novel subsequence superposition. Appl. Energy 2024, 353, 122248.
28. Rezaee, M.; Taraghi Osguei, A. Improving empirical mode decomposition for vibration signal analysis. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2017, 231, 2223–2234.
29. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
30. Pan, S.W.; Yang, B.; Wang, S.K.; Guo, Z.; Wang, L.; Liu, J.H.; Wu, S.Y. Oil well production prediction based on CNN-LSTM model with self-attention mechanism. Energy 2023, 284, 128701.
31. Yu, Y.; Si, X.S.; Hu, C.H.; Zhang, J.X. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270.
32. Wei, Q.; Tan, C.D.; Gao, X.Y.; Guan, X.; Shi, X. Research on early warning model of electric submersible pump wells failure based on the fusion of physical constraints and data-driven approach. Geoenergy Sci. Eng. 2024, 233, 212489.
33. Huynh, A.N.L.; Deo, R.C.; Ali, M.; Abdulla, S.; Raj, N. Novel short-term solar radiation hybrid model: Long short-term memory network integrated with robust local mean decomposition. Appl. Energy 2021, 298, 117193.
34. Yu, R.G.; Gao, J.; Yu, M.; Lu, W.H.; Xu, T.Y.; Zhao, M.K.; Zhang, J.; Zhang, R.X.; Zhang, Z. LSTM-EFG for wind power forecasting based on sequential correlation features. Future Gener. Comput. Syst. 2019, 93, 33–42.
35. Hu, Y.T.; Zhang, Q. A hybrid CNN-LSTM machine learning model for rock mechanical parameters evaluation. Geoenergy Sci. Eng. 2023, 225, 211720.
36. Colominas, M.A.; Schlotthauer, G.; Torres, M.E. Improved complete ensemble EMD: A suitable tool for biomedical signal processing. Biomed. Signal Process. Control 2014, 14, 19–29.
37. Rezaei, H.; Faaljou, H.; Mansourfar, G. Stock price prediction using deep learning and frequency decomposition. Expert Syst. Appl. 2021, 169, 114332.
38. Fang, T.H.; Zheng, C.L.; Wang, D.H. Forecasting the crude oil prices with an EMD-ISBM-FNN model. Energy 2023, 263, 125407.
39. Smagulova, K.; James, A.P. A survey on LSTM memristive neural network architectures and applications. Eur. Phys. J. Spec. Top. 2019, 228, 2313–2324.
40. Zhang, M.Z.; Jia, A.L.; Lei, Z.X. Inter-well reservoir parameter prediction based on LSTM-Attention network and sedimentary microfacies. Geoenergy Sci. Eng. 2024, 235, 212723.
41. Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
42. Zhou, F.T.; Huang, Z.H.; Zhang, C.H. Carbon price forecasting based on CEEMDAN and LSTM. Appl. Energy 2022, 311, 118601.
43. Gao, T.; Niu, D.X.; Ji, Z.S.; Sun, L.J. Mid-term electricity demand forecasting using improved variational mode decomposition and extreme learning machine optimized by sparrow search algorithm. Energy 2022, 261, 125328.

Figure 1. Schematic of the three-product dense medium cyclone separation process for clean coal production.

Figure 2. Framework diagram of the prediction methodology.

Figure 3. Upper and lower envelope lines.

Figure 4. Mean envelope line.

Figure 5. New time series x₁(t).

Figure 6. IMF1 component.

Figure 7. Cell state transmission process.

Figure 8. Structure schematic diagram of the EMD-LSTM prediction model.

Figure 9. EMD results of clean coal ash content time series.

Figure 10. RMSE values for each model.

Figure 11. Comparison of prediction errors across six models.

Figure 12. Heatmap.

Figure 13. Trend performance of prediction results in each model.

Figure 14. Evaluation metric values for each model.

Figure 15. Prediction results of the EMD-imf1-LSTM-SSA model.

Figure 16. Trend performance of prediction results in the EMD-imf1-LSTM-SSA model.

Table 1. Frequency thresholds for IMF component classification in EMD.

Frequency Band	Thresholds	Range
Low Frequency	10, 30	F ≤ 10
Middle Frequency	10, 30	10 < F ≤ 30
High Frequency	10, 30	30 < F

Table 2. Performance metrics of EMD-IMF1-LSTM-SSA vs. EMD-IMF1-LSTM.

Evaluation Metrics	EMD-imf1-LSTM	EMD-imf1-LSTM-SSA
RMSE	0.01014	0.0099389
MAE	0.00884	0.0051748
MAPE	0.0806%	0.04720%
NSD	11	12


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
